I have different metaheuristics (Tabu search, Simulated annealing, Iterated local search) solving an optimization problem (a variant of the resource-constrained project scheduling problem). Each metaheuristic is implemented separately in Python.
I want to build an "island model", i.e., run them in parallel with the possibility of handing high-quality solutions to one another, i.e., keeping track of the best-found solution across all the metaheuristics. If one metaheuristic finds a new, higher-quality solution, every other metaheuristic should then restart from that solution. I am not sure "island model" is the right term.
I would be grateful if you could guide me on how to do it. I hope I will not need to re-implement everything from scratch.
Another question: is there a risk of implementing a "fancy thing" that turns out to be useless? If so, how can I figure out in advance whether the parallelization will be useful?
P.S: I don’t have any knowledge about parallel methods, except for knowing the term "island model" =).
The easiest way is to use the Python multiprocessing module (or similar). You can create a pool of parallel workers, each of which would run a different heuristic. The multiprocessing toolbox also allows you to pass messages between processes, which you can use to communicate information among them (e.g., solution vectors).
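As a minimal sketch of that idea: the three functions below are hypothetical stand-ins for your Tabu search, simulated annealing, and iterated local search (here they just sample random numbers), and a `Pool` runs each one in its own process.

```python
import multiprocessing as mp
import random

# Hypothetical stand-ins for the three metaheuristics: each takes a seed
# and returns its best (lowest) cost. Replace with your implementations.
def tabu_search(seed):
    rng = random.Random(seed)
    return min(rng.random() for _ in range(1000))

def simulated_annealing(seed):
    rng = random.Random(seed + 1)
    return min(rng.random() for _ in range(1000))

def iterated_local_search(seed):
    rng = random.Random(seed + 2)
    return min(rng.random() for _ in range(1000))

def run_heuristic(args):
    """Unpack one (heuristic, seed) job and run it in a worker process."""
    heuristic, seed = args
    return heuristic.__name__, heuristic(seed)

if __name__ == "__main__":
    jobs = [(tabu_search, 0), (simulated_annealing, 0), (iterated_local_search, 0)]
    with mp.Pool(processes=len(jobs)) as pool:
        for name, cost in pool.map(run_heuristic, jobs):
            print(name, cost)
```

This version exchanges no information while running; each heuristic simply returns its best solution at the end. The communication part is what the checkpoints below add.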
In order to do that you would need synchronization points inside your heuristics, i.e., "checkpoints" where each worker requests information from a manager process. If no new information is available in the message it receives, it keeps calculating as normal; if there is new information (e.g., a better incumbent solution), it acts on it, for example by restarting its search from that solution.
The cleanest design is to have a dedicated process as the manager, and that process will be responsible for all communication between workers. Workers should only be able to communicate with the manager, and not with each other. This design pattern is crucial for good scaling and avoiding deadlocks & bottlenecks.
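A sketch of that star topology, under the assumption that a solution can be summarized by its cost (a single float standing in for your real solution objects): each worker runs its search loop, reports improvements to the manager through one queue, and at each checkpoint adopts any better incumbent the manager has broadcast to its private inbox queue. The `worker` and `island_model` names are made up for this example.

```python
import multiprocessing as mp
import queue
import random

def worker(name, seed, inbox, outbox, iterations):
    # Skeleton of one metaheuristic worker. The random draws stand in for
    # your actual Tabu/SA/ILS moves; the queue logic is the part that matters.
    rng = random.Random(seed)
    best = float("inf")
    for _ in range(iterations):
        candidate = rng.random()  # stand-in for one search step
        # Checkpoint: adopt any better incumbent the manager has sent.
        while True:
            try:
                best = min(best, inbox.get_nowait())
            except queue.Empty:
                break
        if candidate < best:
            best = candidate
            outbox.put(("improved", name, best))  # report upward
    outbox.put(("done", name, best))  # final report

def island_model(n_workers=3, iterations=500):
    outbox = mp.Queue()                               # workers -> manager
    inboxes = [mp.Queue() for _ in range(n_workers)]  # manager -> worker i
    procs = [mp.Process(target=worker,
                        args=(i, 100 + i, inboxes[i], outbox, iterations))
             for i in range(n_workers)]
    for p in procs:
        p.start()
    global_best, done = float("inf"), 0
    while done < n_workers:
        tag, name, cost = outbox.get()  # blocking receive from any worker
        if tag == "done":
            done += 1
        if cost < global_best:
            global_best = cost
            for i, q in enumerate(inboxes):
                if i != name:
                    q.put(global_best)  # broadcast the new incumbent
    for p in procs:
        p.join()
    return global_best

if __name__ == "__main__":
    print("best cost found:", island_model())
```

Note that workers never touch each other's queues: every message goes through the manager, which is the property that keeps the design easy to reason about.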
Be warned though that parallel computing is hard and full of quirks and subtle edge cases, even in Python. Always implement the bare minimum of the functionality you need. Test, test, test, and if it works for you, you're done. Complexity is your greatest enemy here because debugging becomes rapidly intractable in complicated parallel code.
My advice would be to consistently test everything, even syntax that "surely" does what you think it should - in parallel it rarely does, especially in Python.
W.r.t. your second question, yes, definitely. Parallel code is very time consuming to develop and debug, so unless there is a clear need for it, you probably shouldn't do it. Indicators that it might work for you are, for example, that your serial runs are slow enough to justify the effort, and that your heuristics would genuinely benefit from restarting off each other's best solutions rather than just running longer on their own.
Finally, keep in mind that you can simply run the same Python script multiple times with different inputs as background processes, and the OS will run them in parallel if you have enough cores.
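Staying in Python, that "just launch several runs" approach can be sketched with `subprocess`; here a tiny inline script stands in for your real solver so the example is self-contained (with a real solver you would pass its filename and arguments instead).

```python
import subprocess
import sys

# Stand-in for a solver script: prints which run it is. With a real solver
# you would launch e.g. [sys.executable, "solve.py", "--seed", str(seed)].
script = "import sys; print('run', sys.argv[1])"

# Launch the runs without waiting; the OS schedules them on separate cores.
procs = [subprocess.Popen([sys.executable, "-c", script, str(seed)],
                          stdout=subprocess.PIPE, text=True)
         for seed in range(3)]

# Collect each run's output once it finishes.
outputs = [p.communicate()[0].strip() for p in procs]
print(outputs)
```

This gets you coarse parallelism for free, at the price of no communication between runs: you simply keep the best result among the finished processes.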
Answered by Nikos Kazazakis on August 19, 2021