TransWikia.com

How to parallelize this nested table efficiently?

Mathematica Asked on August 5, 2021

I have an 8-core CPU and want to evaluate the following nested Table in parallel:

Table[Table[expr[i,j], {i,1,10}], {j,1,4}]

But there is a problem: the time needed to evaluate expr[i,j] increases with the value of i. If expr[1,j] takes 5 minutes, expr[2,j] takes 10 minutes and expr[10,j] takes 3 hours. So no matter where I put the parallelization, on the outer Table or on the inner Table, the efficiency does not change.

The best way would be to first evaluate the most time-consuming terms expr[10,1], expr[10,2], expr[10,3], expr[10,4], and then hand the cheaper expressions one by one to whichever cores are free. I naively tried several parallel orderings, for example

ParallelTable[expr[i,j], {i,10,1,-1}, {j,1,4}]

but this leaves 4 of my 8 cores unused. What is the best way to parallelize this nested table evaluation?

2 Answers

This is a similar question to "Efficient way to utilise Parallel features to make use of many cores".

However, in addition to the answers there you need to know:

If each evaluation of expr takes a long time, even if not nearly as long as you describe, you will not benefit from queuing multiple operations per kernel. Instead, an algorithm that merely waits for a free kernel is appropriate. As the documentation for Parallelize states:

Method -> "FinestGrained" is suitable for computations involving few subunits whose evaluations take different amounts of time. It leads to higher overhead, but maximizes load balancing.

Combining this with a variation of Szabolcs's Tuples and ParallelMap method:

ParallelMap[Labeled[Pause[RandomReal[{0, 0.1}]]; {#[[2]], #[[1]]}, $KernelID] &, 
  Tuples@Range@{4, 10}, Method -> "FinestGrained"] ~Partition~ 10
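
To connect this back to the question, here is a minimal sketch of the same idea with the most expensive terms queued first. The definition of expr is a hypothetical stand-in (a short Pause that grows with i), and pairs, results, lookup and nested are illustrative names, not part of the original answer. It also assumes that "FinestGrained" hands out tasks in list order as kernels become free.

```mathematica
(* Hypothetical stand-in for the expensive expr[i, j]; its cost grows with i *)
expr[i_, j_] := (Pause[0.01 i]; {i, j});
DistributeDefinitions[expr];

(* Flat list of {i, j} pairs, largest (slowest) i first *)
pairs = Flatten[Table[{i, j}, {i, 10, 1, -1}, {j, 1, 4}], 1];

(* One task per pair, so a kernel that finishes early picks up the next pair *)
results = ParallelMap[expr @@ # &, pairs, Method -> "FinestGrained"];

(* Reassemble into the layout of Table[Table[expr[i, j], {i, 1, 10}], {j, 1, 4}] *)
lookup = AssociationThread[pairs -> results];
nested = Table[lookup[{i, j}], {j, 1, 4}, {i, 1, 10}];
```

Flattening to a single list matters here because a multi-iterator ParallelTable parallelizes only along its outermost index, whereas a flat list gives the scheduler all 40 evaluations as separate tasks.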


Correct answer by Mr.Wizard on August 5, 2021

The solution that I use is:

ParallelMap[(#/.ft->f)&, Flatten[Table[ft[a,b],{a,1,10},{b,1,10}],1]]

where f is the function I want to evaluate in parallel on my 80 kernels and ft is some previously undefined symbol. I find this solution a bit ugly but practical.
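
For comparison, a small sketch of a variant that avoids the placeholder symbol by sending argument pairs and applying f on the subkernel. Here f is a hypothetical cheap stand-in, and viaPlaceholder and viaApply are illustrative names; this is one possible alternative, not the answerer's method.

```mathematica
(* Hypothetical cheap stand-in for the real function f *)
f[a_, b_] := a + 10 b;
DistributeDefinitions[f];

(* Placeholder approach: ft[a, b] stays inert until ft is replaced by f on a subkernel *)
viaPlaceholder = ParallelMap[(# /. ft -> f) &,
   Flatten[Table[ft[a, b], {a, 1, 10}, {b, 1, 10}], 1]];

(* Variant without the placeholder: map over {a, b} pairs and apply f to each *)
viaApply = ParallelMap[f @@ # &, Tuples[Range[10], 2]];

(* Compare the two flat lists, then restore the 10 x 10 shape with Partition *)
viaPlaceholder === viaApply
Partition[viaApply, 10] === Table[f[a, b], {a, 1, 10}, {b, 1, 10}]
```

Either way the result is a flat list in the order of the flattened table, so Partition is enough to recover the nested structure.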

Answered by Sergey Slizovskiy on August 5, 2021
