
Compile - Optimization of function calls - does it matter?

Mathematica Asked on January 24, 2021

I was surprised to see, in the CompilePrint output for a function made with Compile, that calls to a Function appear to really copy their arguments, and that the copies are perhaps not optimized away.

For example, compare the CompilePrint output of compiledFunction and compiledExpression below:

CompileSustitutable[vars_, body_] := 
 Hold[vars, body] /. Hold -> Compile

func = Function[x, Evaluate@Through@{# &, # + 1 &, #^2 &}@x]
expr = func[x]

With[
  {func = func},
  compiledFunction = Compile[{y}, func[y]];
  ];

compiledExpression = CompileSustitutable[{x}, expr];

Needs["CompiledFunctionTools`"] (* provides CompilePrint *)

CompilePrint@compiledFunction
CompilePrint@compiledExpression

The compiledFunction output is:

        R0 = A1
        I0 = 1
        Result = T(R1)0

1   R1 = R0
2   R2 = I0
3   R2 = R2 + R1
4   R3 = Square[ R1]
5   T(R1)0 = {R1, R2, R3}
6   Return

Whereas the compiledExpression output is:

        R0 = A1
        I0 = 1
        Result = T(R1)0

1   R1 = I0
2   R1 = R1 + R0
3   R2 = Square[ R0]
4   T(R1)0 = {R0, R1, R2}
5   Return

Although modern CPUs may perform their own optimization (which may make any Mathematica optimization a moot point, apart from edge cases), I don't like relying on unknown downstream processes. Does anyone know whether the lowest-level code that Mathematica produces is essentially what CompilePrint shows? I see that there are options like CompilationTarget -> "C", so does a mainstream C compiler still do things like inlining function calls?

Is my CompileSustitutable function, which ostensibly saves argument copying, even worth the bother?
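One way to check whether the extra copy matters in practice is simply to time both forms. The following is a minimal sketch, not a rigorous benchmark: the names cfWithCall and cfInlined and the iteration count are made up for illustration, and the timings will vary by machine and Mathematica version.

```mathematica
(* Hypothetical names; the bodies mirror compiledFunction and compiledExpression above *)
cfWithCall = Compile[{{y, _Real}}, Function[x, {x, x + 1, x^2}][y]];
cfInlined = Compile[{{x, _Real}}, {x, x + 1, x^2}];

(* Sanity check: both versions should agree before they are timed *)
sameResultQ = cfWithCall[2.] == cfInlined[2.];

(* Time many calls of each version; only the first element (seconds) is kept *)
iterations = 10^5; (* arbitrary choice for this sketch *)
timeWithCall = First@AbsoluteTiming[Do[cfWithCall[2.], {iterations}]];
timeInlined = First@AbsoluteTiming[Do[cfInlined[2.], {iterations}]];

{sameResultQ, timeWithCall, timeInlined}
```

For a body this small, the overhead of calling the compiled function from the top level may well dominate, so a difference of one register copy could be hard to see; a larger body makes the comparison more meaningful.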

FYI: A much less elegant but interesting first attempt, demonstrating the ability to bypass the "scoping" of Compile's arguments, is shown below. I forget who to give credit to for the idea of using Evaluate in this manner:

With[
  {
   directMethod = Unevaluated[Evaluate[x]*2]
   },
  C1 = Compile[
    {Evaluate[x]},
    directMethod
    ]
  ];
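For comparison, two commonly used idioms for getting an already-evaluated expression into the body of Compile are Evaluate on the body and injection through With. The sketch below assumes, as the code above does, that the global symbol x has no value; the names cfViaEvaluate and cfViaWith are made up for illustration.

```mathematica
(* The expression to compile, built symbolically beforehand *)
expr = {x, x + 1, x^2};

(* Idiom 1: Evaluate overrides Compile's HoldAll attribute for the body *)
cfViaEvaluate = Compile[{{x, _Real}}, Evaluate[expr]];

(* Idiom 2: With substitutes the expression into the held body *)
cfViaWith = With[{body = expr}, Compile[{{x, _Real}}, body]];

{cfViaEvaluate[2.], cfViaWith[2.]}
```

Both should behave like CompileSustitutable[{x}, expr] here, since in each case Compile receives the expanded expression rather than a call to a Function.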

One Answer

I decided to get some quantitative results for this question by brute force: I wrote my own version of BorderDimensions (which calculates the borders of a solid color around an image) using Compile. It's non-trivial enough to actually demonstrate the point, but (I hope) small enough to post here.

Note that I'm aware that my version could be "tricked" if an image contains horizontal/vertical lines of the same color, but where each line doesn't have the same color as its adjacent line. That's not the point for this question -- it's just a proof-of-concept. Also, I think there's a bug in the "function-based" version (I ran it through some test images), but the point is that you can see a ton of TensorCopy operations in CompilePrint, and the results of AbsoluteTiming speak for themselves.

The results confirmed that inlining functions yourself, manually, does actually result in performance boosts.

I used Henrik Schumacher's suggestion of using ExportString[cf, "C"] to view the actual C code generated, plus AbsoluteTiming. Results: the version with function calls generated about 1.7 times as much C code (15390 versus 8898 characters, as measured by StringLength) and took more than twice as long to execute (0.447 s versus 0.197 s), as measured by running 1001 iterations (i from 0 to 1000) over a small test image with AbsoluteTiming.
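As a small illustration of the same measurement on a toy function, the sketch below exports two compiled functions to C and compares the size of the generated code. The names are made up for illustration, and it assumes the "C" export format is available in your installation.

```mathematica
(* Hypothetical toy functions: one calls a Function, one has the body inlined *)
cfWithCall = Compile[{{y, _Real}}, Function[x, {x, x + 1, x^2}][y]];
cfInlined = Compile[{{x, _Real}}, {x, x + 1, x^2}];

(* Generate the C source for each compiled function as a string *)
cCodeWithCall = ExportString[cfWithCall, "C"];
cCodeInlined = ExportString[cfInlined, "C"];

(* Compare by character count and by number of lines *)
StringLength /@ {cCodeWithCall, cCodeInlined}
Length[StringSplit[#, "\n"]] & /@ {cCodeWithCall, cCodeInlined}
```

Character count is only a rough proxy for the amount of work done at run time, so it is best read alongside the timings.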

First, here is the test data:

testImgRaster = Rasterize@x
testImgData = 
  Rasterize[x] /. HoldPattern@Image[data__] -> List[data];
testImgDataArr = testImgData[[1]] // Normal;

Next, here is the benchmarking setup:

Style["PaddingCalculatorCompiledWithFunctions", Bold, Red]
codePaddingCalculatorCompiledWithFunctions = 
  ExportString[PaddingCalculatorCompiledWithFunctions, "C"];
StringLength[codePaddingCalculatorCompiledWithFunctions]
Do[PaddingCalculatorCompiledWithFunctions[testImgDataArr], {i, 0, 
   1000}] // AbsoluteTiming

Style["PaddingCalculatorCompiledWithExpressions", Bold, Red]
codePaddingCalculatorCompiledWithExpressions = 
  ExportString[PaddingCalculatorCompiledWithExpressions, "C"];
StringLength[codePaddingCalculatorCompiledWithExpressions]
Do[PaddingCalculatorCompiledWithExpressions[testImgDataArr], {i, 0, 
   1000}] // AbsoluteTiming

Style["Comparison - BorderDimensions", Bold, Red]
Do[BorderDimensions[testImgRaster], {i, 0, 1000}] // AbsoluteTiming

Here are the benchmarking results:

PaddingCalculatorCompiledWithFunctions
15390
{0.447371,Null}

PaddingCalculatorCompiledWithExpressions
8898
{0.196994,Null}

Comparison - BorderDimensions
{0.180333,Null}
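The ratios implied by these figures can be worked out directly. The variable names below are made up for illustration; the numbers are copied from the output above.

```mathematica
(* Figures copied from the benchmark output above *)
timeWithFunctions = 0.447371;
timeWithExpressions = 0.196994;
timeBuiltIn = 0.180333;
sizeWithFunctions = 15390;
sizeWithExpressions = 8898;

(* Slowdown of the function-based version relative to the expression-based one *)
timeRatio = timeWithFunctions/timeWithExpressions   (* about 2.27 *)

(* Relative size of the generated C code, in characters *)
sizeRatio = N[sizeWithFunctions/sizeWithExpressions]   (* about 1.73 *)

(* Expression-based version relative to the built-in BorderDimensions *)
builtInRatio = timeWithExpressions/timeBuiltIn   (* about 1.09 *)
```

So the expression-based version comes within roughly 10% of the built-in function on this test image, while the function-based version is more than twice as slow.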

Finally, if you've read to here and want to see the actual code, here it is...

Note: I normally go way out-of-my-way to make small, concise, purpose-built functions. But in addition to the performance benefits, I think I like the "look and feel" of the expression-based version better. The style takes some getting-used-to, though.

Function-based version:


PaddingCalculatorCompiledWithFunctions = 
 Module[{PaddingCalculatorGenerator, PaddingCalculatorParams, numRows,
    numCols, PaddingCalculator, PaddingCalculatorInner},
  
  PaddingCalculatorGenerator[primaryRange_, secondaryRange_, 
    comparePart1_, comparePart2_] =
   Function[imageDataArgToCalculator,
    Hold@Do[
      Function[innerResult,
        If[innerResult == -1, 
         Return@Abs@(primaryRange[[2]] - primaryRange[[1]])]
        ]@
       Do[
        If[
         comparePart1@imageDataArgToCalculator == 
          comparePart2@imageDataArgToCalculator,(*Null*)-1, Return@-1],
        secondaryRange
        ],
      primaryRange
      ]
    ];
  
  PaddingCalculatorParams[numRows_, numCols_] =
   {
    {{rowIdx, 1, numRows, +1}, {colIdx, 1, numCols}, 
     Part[#, rowIdx, colIdx] &, Part[#, rowIdx, 1] &},
    {{rowIdx, numRows, 1, -1}, {colIdx, 1, numCols}, 
     Part[#, rowIdx, colIdx] &, Part[#, rowIdx, 1] &},
    {{colIdx, 1, numCols, +1}, {rowIdx, 1, numRows}, 
     Part[#, rowIdx, colIdx] &, Part[#, 1, colIdx] &},
    {{colIdx, numCols, 1, -1}, {rowIdx, 1, numRows}, 
     Part[#, rowIdx, colIdx] &, Part[#, 1, colIdx] &}
    };
  
  PaddingCalculator =
   ReleaseHold@
    Function[{imageDataArgToCalculators, numRows, numCols},
     imageDataArgToCalculators // 
        Apply[PaddingCalculatorGenerator] /@ 
         PaddingCalculatorParams[numRows, numCols] // Through // 
      Evaluate
     ];
  
  PaddingCalculatorInner = 
   With[{PaddingCalculator = PaddingCalculator},
    ReleaseHold@Function[imageDataArgToMain,
      Module[{numRows, numCols},
       numRows = Hold@Length@imageDataArgToMain;
       numCols = Hold@Length@First@imageDataArgToMain;
       Hold@PaddingCalculator[imageDataArgToMain, numRows, numCols]
       ]
      ]
    ];
  
  With[{PaddingCalculatorInner = PaddingCalculatorInner},
   Compile[{{imageDataArgToCompile, _Integer, 3}}, 
    PaddingCalculatorInner[imageDataArgToCompile]]
   ]
  ]

Expression-based version:


ReleaseHoldUnevaluated[expr_] := 
 ReplaceRepeated[HoldComplete[Unevaluated[expr]], Hold[x__] -> x] // 
  ReleaseHold

CompileSustitutable[vars_, body_] := 
 Hold[vars, body] /. Hold -> Compile

Module[{PaddingCalculatorGenerator, PaddingCalculatorParams, 
  PaddingCalculator, PaddingCalculatorInner},
 
 PaddingCalculatorGenerator[primaryRange_, secondaryRange_, 
   comparePart1_, comparePart2_] :=
  
  With[{primaryIndex = primaryRange[[1]], 
    primaryStart = primaryRange[[2]]},
   Hold@Do[ 
     Function[innerResult,
       If[innerResult == -1, 
        Return@Abs@(primaryStart - primaryIndex)] ]@
      Do[
       If[comparePart1 == comparePart2, Null, Return@-1],
       secondaryRange
       ],
     primaryRange
     ]
   ];
 
 ImagePart[row_, col_] = Hold@Part[imageData, row, col];
 
 PaddingCalculatorParams =
  {
   {{rowIdx, 1, numRows, +1}, {colIdx, 1, numCols}, 
    ImagePart[rowIdx, colIdx], ImagePart[rowIdx, 1]},
   {{rowIdx, numRows, 1, -1}, {colIdx, 1, numCols}, 
    ImagePart[rowIdx, colIdx], ImagePart[rowIdx, 1]},
   {{colIdx, 1, numCols, +1}, {rowIdx, 1, numRows}, 
    ImagePart[rowIdx, colIdx], ImagePart[1, colIdx]},
   {{colIdx, numCols, 1, -1}, {rowIdx, 1, numRows}, 
    ImagePart[rowIdx, colIdx], ImagePart[1, colIdx]}
   };
 
 PaddingCalculator =
  Hold[Module]
   [{numRows, numCols},
   Hold[CompoundExpression][
    Hold[numRows = Length@imageData],
    Hold[numCols = Length@First@imageData],
    PaddingCalculatorGenerator @@@ PaddingCalculatorParams
    ]
   ];
 
 PaddingCalculatorCompiledWithExpressions = 
  CompileSustitutable[{{imageData, _Integer, 3}}, 
   ReleaseHoldUnevaluated[PaddingCalculator]]
 ]
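The Hold-based assembly used above can be seen in isolation on a much smaller body. This is a sketch under the assumption that the global symbols v and x have no values; heldBody and cfSum are made-up names, and the two helper definitions are copied from the listing above so that the example stands alone.

```mathematica
(* Helpers copied from the answer above *)
ReleaseHoldUnevaluated[expr_] := 
 ReplaceRepeated[HoldComplete[Unevaluated[expr]], Hold[x__] -> x] // 
  ReleaseHold

CompileSustitutable[vars_, body_] := 
 Hold[vars, body] /. Hold -> Compile

(* Each piece is wrapped in Hold so that it is not evaluated early:
   Part[v, 1] with a symbolic v would otherwise issue a Part::partd message *)
heldBody = Hold[Plus][Hold[Part[v, 1]], Hold[Part[v, 2]]];

(* Strip every Hold wrapper and hand the assembled body to Compile *)
cfSum = CompileSustitutable[{{v, _Real, 1}}, ReleaseHoldUnevaluated[heldBody]];

cfSum[{1., 2.}]
```

The design choice is that the code is built as data first, with Hold marking every piece that must not evaluate yet, and only at the very end are all the wrappers removed in one pass before Compile sees the result.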

Answered by Sean on January 24, 2021
