TransWikia.com

How to parallelize metaheuristics algorithms (Island Model)?

Operations Research Asked by Antarctica on August 19, 2021

I have different metaheuristics (Tabu search, Simulated annealing, Iterated local search) solving an optimization problem (a variant of the resource-constrained project scheduling problem). Each metaheuristic is implemented separately in Python.

I want to make an "island model" i.e to run them in parallel with the possibility of handing high-quality solutions to one another i.e keeping track of the best-found solution among all the metaheuristics. If a new higher quality solution using some metaheuristic, every other metaheuristic will then restart from that solution. I am not sure it’s the right term

I would be grateful if you could guide me on how to do it. I hope I will not need to re-implement everything from scratch.

Another question: Is there a risk of implementing a "fancy thing" that could be useless? If so, how to figure out in advance if the parallelization will be useful.

P.S: I don’t have any knowledge about parallel methods, except for knowing the term "island model" =).

One Answer

The easiest way is to use the Python multiprocessing module (or similar). You can create a pool of parallel workers, each of which would run a different heuristic. The multiprocessing toolbox also allows you to pass messages between processes, which you can use to communicate information among them (e.g., solution vectors).

In order to do that you would need synchronization points inside your heuristics, i.e., "checkpoints" where each worker requests information from a manager process. If no new information is available in the message it receives, it keeps calculating as normal, and if there's new information it does stuff.

The cleanest design is to have a dedicated process as the manager, and that process will be responsible for all communication between workers. Workers should only be able to communicate with the manager, and not with each other. This design pattern is crucial for good scaling and avoiding deadlocks & bottlenecks.

Be warned though that parallel computing is hard and full of quirks and subtle edge cases, even in Python. Always implement the bare minimum of the functionality you need. Test, test, test, and if it works for you, you're done. Complexity is your greatest enemy here because debugging becomes rapidly intractable in complicated parallel code.

My advice would be to consistently test everything, even syntax that "surely" does what you think it should - in parallel it rarely does, especially in Python.

W.r.t. your second question, yes, definitely. Parallel code is very time consuming, so unless there is a clear need for it, you probably shouldn't do it. Some indicators that it might work for you are:

  • There are numerous independent steps in your algorithm that take roughly the same time.
  • If your tasks differ by too much in execution time you need asynchronous parallel code which is much more complicated.
  • If you have one expensive chunk of calculations and everything else is cheap by comparison, you will not see benefits.
  • The application actually "needs" more speed. If it doesn't you need to make a business case to your boss about why this quality of life feature is something they should be spending money on.

Finally, keep in mind that you can simply run the same Python script multiple times with different inputs as background processes, and the OS will run them in parallel if you have enough cores.

Answered by Nikos Kazazakis on August 19, 2021

Add your own answers!

Ask a Question

Get help from others!

© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP