Parallelizing large sum to use all cpu cores

Mathematica Asked on March 12, 2021

I'm looking to compute the following expectation, where $x$ is a column vector sampled from a $d$-dimensional distribution.

$$E\left[(I-\alpha xx')\,\text{cov}\,(I-\alpha xx')\right]$$

A simple example is below. samples controls the number of $x$'s sampled. The issue is that when samples is large compared to $d$, Mathematica doesn't use all CPU cores; I only see utilization on a single core. How can I parallelize this computation better?

```mathematica
samples = 1000000;
alpha = 0.5;
d = 2;
ii = IdentityMatrix[d];
cov = RandomReal[{-1, 1}, {d, d}];
X = {{0, 3}, {2, 0}, {-1, -1}};
batchStepCov :=
  Mean[(ii - alpha Outer[Times, #, #]).cov.(ii - 
        alpha Outer[Times, #, #]) & /@ RandomChoice[X, samples]];
batchStepCov // Timing
```

One Answer

I think that there is no need for parallelization. The following piece of code computes the same quantity with considerably less computational effort. It is about 100 times faster than your implementation.

```mathematica
f = With[{A = ii - alpha Outer[Times, #, #]}, A.cov.A] &;
mats = f /@ X;
Dot[
  SortBy[Tally[RandomChoice[Range[Length[X]], samples]], First][[All, 2]],
  mats
  ]/samples
```

The main idea: there are only three different vectors in X, so there are only three possible matrices in f /@ X. Sampling from X and applying f has the same effect as sampling directly from mats = f /@ X, but without having to recompute f all the time. Even better: you do not need to evaluate RandomChoice[mats, samples] at all; it suffices to know how often each matrix from mats is drawn. The random vector of frequencies can be obtained with

```mathematica
SortBy[Tally[RandomChoice[Range[Length[X]], samples]], First][[All, 2]]
```

and from there you compute the mean by taking the Dot product of this vector with mats and dividing by the number of samples.
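To spell out why counting is enough, here is the identity behind that step. The notation is introduced only for this note and is not from the original post: $n$ is the number of samples, $m$ is the number of rows of X, $M_k = f(x_k)$ is the matrix for the $k$-th row, and $N_k$ is how many times that row was drawn.

$$\frac{1}{n}\sum_{i=1}^{n} f\big(x^{(i)}\big) \;=\; \frac{1}{n}\sum_{k=1}^{m} N_k\, M_k, \qquad \sum_{k=1}^{m} N_k = n.$$

The left-hand side needs $n$ matrix products, while the right-hand side needs only $m$ of them plus the count vector $(N_1,\dots,N_m)$.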

In fact, you can also avoid computing

```mathematica
SortBy[Tally[RandomChoice[Range[Length[X]], samples]], First][[All, 2]]
```

(which is the only really time-consuming part) by observing that this vector follows a MultinomialDistribution. The following takes only 0.4 milliseconds and should have the same effect:

```mathematica
mats = With[{A = ii - alpha Outer[Times, #, #]}, A.cov.A] & /@ X;
Dot[
  RandomVariate[
    MultinomialDistribution[
      samples,
      ConstantArray[1/Length[X], {Length[X]}]
      ]
    ],
  mats
  ]/samples
```
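As a possible sanity check (not part of the original answer): RandomChoice draws each row of X with equal probability, so the expectation being estimated should be the plain average of mats. The sketch below compares the multinomial estimate against that average. The names exact, counts and estimate are introduced here for illustration only, and the parameter values are copied from the question.

```mathematica
(* Setup copied from the question *)
samples = 1000000;
alpha = 0.5;
d = 2;
ii = IdentityMatrix[d];
cov = RandomReal[{-1, 1}, {d, d}];
X = {{0, 3}, {2, 0}, {-1, -1}};

(* One matrix per distinct row of X *)
mats = With[{A = ii - alpha Outer[Times, #, #]}, A.cov.A] & /@ X;

(* Assumption: rows are drawn uniformly, so the exact expectation is the average of mats *)
exact = Mean[mats];

(* Sampled estimate via multinomial counts, as in the answer *)
counts = RandomVariate[
   MultinomialDistribution[samples, ConstantArray[1/Length[X], Length[X]]]];
estimate = counts.mats/samples;

(* Largest entrywise deviation between the estimate and the exact average *)
Max[Abs[estimate - exact]]
```

If the deviation is small and shrinks as samples grows, that suggests the shortcut is estimating the intended quantity. Whether the exact average can replace sampling altogether depends on whether the random fluctuation itself is needed downstream.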

Correct answer by Henrik Schumacher on March 12, 2021
