
New Compiler Technology in V12

Mathematica Asked on July 7, 2021

I am wondering whether the speed of code produced by the new compiler can be improved.

Here are the results from my test.

Consider the following function for generating a random draw from a gamma distribution

ablRanGamma[alpha_, beta_] :=
    Module[{d, c, x = 0.0, v = 0.0, u, cond},
        d = alpha - 1.0/3.0;
        c = 1.0/Sqrt[9.0*d];
        cond = 1;
        While[cond == 1,
            v = -1.0;
            While[v <= 0.0,
                x = RandomVariate[NormalDistribution[0, 1]];
                v = 1.0 + c*x;
            ];
            v = v^3;
            u = RandomReal[];
            If[u > 1. - 0.0331*x^4 && Log[u] > 0.5*x^2 + d*(1.0 - v + Log[v]),
                cond = 1, cond = 0];
        ];
        beta*d*v
    ]
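
As a quick sanity check: a Gamma(alpha, beta) variate has mean alpha*beta, so for the parameters used below the sample means should come out near 2.1*3.2 = 6.72.

Mean@Table[ablRanGamma[2.1, 3.2], {10^5}]  (* sample mean, close to 6.72 *)
Mean[GammaDistribution[2.1, 3.2]]          (* exact mean: 6.72 *)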

The old compiler technology can be used to compile the function to C, as follows

ablRanGammaC = Compile[{{alpha, _Real}, {beta, _Real}},
    Module[{d, c, x = 0.0, v = 0.0, u, cond},
        d = alpha - 1.0/3.0; c = 1.0/Sqrt[9.0*d]; cond = 1;
        While[cond == 1,
            v = -1.0;
            While[v <= 0.0,
                x = RandomVariate[NormalDistribution[0, 1]];
                v = 1.0 + c*x;
            ];
            v = v^3;
            u = RandomReal[];
            If[u > 1. - 0.0331*x^4 && Log[u] > 0.5*x^2 + d*(1.0 - v + Log[v]),
                cond = 1, cond = 0];
        ];
        beta*d*v
    ],
    CompilationTarget -> "C"
];

This results in a significant improvement in speed, as expected.

In[10]:= AbsoluteTiming[Mean@Table[ablRanGamma[2.1, 3.2], {500000}]]

Out[10]= {11.7353, 6.71994}

In[7]:= AbsoluteTiming[Mean@Table[ablRanGammaC[2.1, 3.2], {500000}]]

Out[7]= {0.142951, 6.72236}

Now moving to the new compiler. First, RandomVariate[NormalDistribution[0,1]] cannot be compiled directly, so we need to use a KernelFunction. We create two functions to get random normal draws, one compiled and the other not.

f1[m_, s_] := RandomVariate[NormalDistribution[m, s]]

f1C = Compile[{{m, _Real}, {s, _Real}}, 
   RandomVariate[NormalDistribution[m, s]],
   CompilationTarget -> "C"
   ];

Then we can use these as KernelFunctions with the new compiler. Consider first the version that uses f1.

ablRanGammaN1 = FunctionCompile[
    Function[{Typed[alpha, "Real64"], Typed[beta, "Real64"]},
        Module[{d, c, x = 0.0, v = 0.0, u, cond,
                f1 = Typed[KernelFunction[f1], {"Real64", "Real64"} -> "Real64"]},
            d = alpha - 1.0/3.0; c = 1.0/Sqrt[9.0*d]; cond = 1;
            While[cond == 1,
                v = -1.0;
                While[v <= 0.0,
                    x = f1[0.0, 1.0];
                    v = 1.0 + c*x;
                ];
                v = v^3;
                u = RandomReal[];
                If[u > 1. - 0.0331*x^4 && Log[u] > 0.5*x^2 + d*(1.0 - v + Log[v]),
                    cond = 1, cond = 0];
            ];
            beta*d*v
        ]
    ]
];

Its performance is

In[8]:= AbsoluteTiming[Mean@Table[ablRanGammaN1[2.1, 3.2], {500000}]]

Out[8]= {2.46856, 6.72463}

This can be improved by using f1C instead of f1 as a kernel function.

ablRanGammaN2 = FunctionCompile[
    Function[{Typed[alpha, "Real64"], Typed[beta, "Real64"]},
        Module[{d, c, x = 0.0, v = 0.0, u, cond,
                f1 = Typed[KernelFunction[f1C], {"Real64", "Real64"} -> "Real64"]},
            d = alpha - 1.0/3.0; c = 1.0/Sqrt[9.0*d]; cond = 1;
            While[cond == 1,
                v = -1.0;
                While[v <= 0.0,
                    x = f1[0.0, 1.0];
                    v = 1.0 + c*x;
                ];
                v = v^3;
                u = RandomReal[];
                If[u > 1. - 0.0331*x^4 && Log[u] > 0.5*x^2 + d*(1.0 - v + Log[v]),
                    cond = 1, cond = 0];
            ];
            beta*d*v
        ]
    ]
];

We now have

In[9]:= AbsoluteTiming[Mean@Table[ablRanGammaN2[2.1, 3.2], {500000}]]

Out[9]= {0.81675, 6.72306}

This is clearly better than the uncompiled function, but it is still significantly slower than compiling to C with the old technology. Can anybody explain why this is the case? In my experiments, the gap does not seem to come from the overhead of calling a kernel function; the old compiler's C code simply appears to be faster than the machine code produced by the new technology.
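
For example, the callback cost in isolation can be gauged with a trivial kernel function (idf below is a hypothetical stand-in that just returns its argument, following the same Typed[KernelFunction[...]] pattern as above):

idf[x_] := x;

overheadC = FunctionCompile[
    Function[{Typed[x, "Real64"]},
        Module[{g = Typed[KernelFunction[idf], {"Real64"} -> "Real64"]},
            g[x]]]];

(* time 500000 bare kernel callbacks, with no other work in the loop *)
AbsoluteTiming[Do[overheadC[1.0], {500000}]]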

One Answer

I wanted to look into this for a long time, and it took me only two months to come back to this question. To answer your main question: yes, it appears that the new compiler only supports uniformly distributed random numbers. There are three native functions: Native`Random, Native`RandomGeneratorBitFunction, and Native`RandomRuntime. A call to RandomReal is compiled down to Native`Random. I'll come back later to the question of why it's so hard to find information on how to use these functions.
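
For instance, Native`Random can be called directly inside FunctionCompile. A minimal sketch (assuming Table is supported by the compiler here, which it appears to be):

uniformListC = FunctionCompile[
    Function[{Typed[n, "MachineInteger"]},
        (* fill a list with n uniform draws from [0, 1) *)
        Table[Native`Random[0.0, 1.0], {n}]
    ]
];

uniformListC[5]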

To give you a solution to your problem first: your function basically uses random numbers from a normal and a uniform distribution to compute a random number that is Gamma-distributed. I believe this is Marsaglia and Tsang's simple transformation-rejection method. What you haven't considered is that you can also compute a normally distributed random number from uniformly distributed ones. An easy way is the Box-Muller method, which uses only elementary functions and two uniform random reals.
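
In formulas: if u1 and u2 are independent uniform variates on (0, 1), then Sqrt[-2.0*Log[u1]]*Cos[2.0*Pi*u2] is a standard normal variate. A minimal top-level sketch:

boxMuller[] := Sqrt[-2.0*Log[RandomReal[]]]*Cos[2.0*Pi*RandomReal[]]

(* sanity check: mean and standard deviation should be close to 0 and 1 *)
With[{sample = Table[boxMuller[], {10^5}]},
    {Mean[sample], StandardDeviation[sample]}]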

You can replace your f1 entirely by this simple calculation, assigning x a random number drawn from a (0, 1) normal distribution. The code would then look like this:

f = Function[{Typed[alpha, "Real64"], Typed[beta, "Real64"]},
   Module[
    {d, c, x = 0.0, v = 0.0, u, cond},
    d = alpha - 1.0/3.0;
    c = 1.0/Sqrt[9.0*d];
    cond = 1;
    While[cond == 1,
     v = -1.0;
     While[v <= 0.0,
      x = Sqrt[-2.0*Log[Native`Random[0.0, 1.0]]]*Cos[2.0*Pi*Native`Random[0.0, 1.0]];
      v = 1.0 + c*x
      ];
     v = v^3;
     u = Native`Random[0., 1.];
     If[
      u > 1. - 0.0331*x^4 && Log[u] > 0.5*x^2 + d*(1.0 - v + Log[v]),
      cond = 1,
      cond = 0
      ];
     ];
    beta*d*v]];

ablRanGammaN3 = FunctionCompile[f];

The timings on my machine are:

  • uncompiled: 10.6206
  • compiled C: 0.126193
  • ablRanGammaN1: 2.27945
  • ablRanGammaN2: 0.80765
  • ablRanGammaN3: 0.369398

You see, we stole another 50% of the runtime, but we still have about 3x the runtime of the old compiler. And the histograms give hope that my implementation is not incorrect:

(histogram comparison of the sampled distributions; image not preserved)
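
If you want to reproduce the comparison, something along these lines should work (a sketch; the sample size is arbitrary):

Histogram[
    {Table[ablRanGammaC[2.1, 3.2], {10^5}],
     Table[ablRanGammaN3[2.1, 3.2], {10^5}]},
    Automatic, "PDF",
    ChartLegends -> {"ablRanGammaC (old compiler)", "ablRanGammaN3 (new compiler)"}]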

Gain more insight

I have followed the rise of the new compiler pretty much from the start, but I admit that I haven't used it very often. The big disadvantage is that the most reliable documentation resources are still the videos on YouTube (video 1, video 2, video 3, video 4).

If you want to look a bit deeper into it, my first advice is to use the functions under

<<Compile`

and in particular CompileToCodeFunction, which seems to do the same as FunctionCompile but has many options and can be used as an entry point to inspect the underlying code with PrintDefinitions. Other than that, you have CompileToAST, CompileToIR, etc., which really help because you can call ["toString"] on their output.
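
For example, a small sketch of that workflow (assuming the Compile` functions behave as just described):

<< Compile`

(* compile a trivial function only as far as the intermediate representation *)
ir = CompileToIR[Function[{Typed[x, "Real64"]}, x^2 + 1.0]];
ir["toString"]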

Another thing to look into is Names["Native`*"], which lists all the functions that are used internally and which you can also use directly in your code. The downside is that I'm currently not aware of a way to find out how to use them. If you watch the videos I linked, you'll see that they are used heavily for performance tweaking and for expressing things the compiler cannot do at the moment. Pay close attention especially to the sort example by Tom Wickham-Jones at the end of his video.
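
For instance, the random-number natives mentioned at the top can be dug up like this:

(* should include Native`Random, Native`RandomGeneratorBitFunction,
   and Native`RandomRuntime among the results *)
Select[Names["Native`*"], StringContainsQ[#, "Random"] &]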

Finally, the two major advantages of the new technology for me personally are:

  • we can link functions written in C/C++, Rust, ... in an instant; there are a lot of examples in one of the videos
  • we can create code for different platforms like WebAssembly

It's just that the new compiler isn't there yet when it comes to documentation. At the moment it's more a playground for the adventurous heroes among us.

Answered by halirutan on July 7, 2021
