Steve Maughan
Posted July 16, 2019

I'm considering ways of speeding up an algorithm. One option is to add parallel "for loops". I can certainly add a single parallel "for loop" and get a speedup. The algorithm effectively has nested "for loops", and I could also make the inner "for loops" parallel. Is this advisable, or are nested parallel "for loops" a big no-no? Or does it depend on the number of levels of nesting, e.g. one inner parallel loop may be OK, but more would be bad?

Thanks,
Steve
PeterBelow
Posted July 17, 2019

Using threads to do things in parallel obviously only works efficiently if the "things" do not depend on each other. As soon as a thread has to wait on the results of another, the processing slows down and the chance of deadlocks goes up. Also, using more threads than the hardware has cores/CPUs quickly gets you to a point of diminishing returns, since the OS has some administrative overhead to manage the threads.

So don't overdo it. Especially in a scenario like a parallel for loop, where the code can only continue after the loop has finished completely, it may be more efficient to split the loop into several and run each in its own thread. If the loop does not have that many iterations and executes each round fairly quickly, setting up the individual tasks in a parallel for may consume more time than you gain by doing stuff in parallel. There is no substitute for thorough testing and timing in such a scenario.
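As a rough illustration of the "split the loop into several and run each in its own thread" idea, here is a minimal Python sketch. The thread does not name a language or library, so everything here is an assumption made for illustration: the helper names `chunk_ranges`, `sum_squares` and `parallel_sum_squares` are hypothetical, and the sum of squares merely stands in for the real loop body. The point is the structure: one task per chunk rather than one task per iteration, so the per-task setup cost is paid only a handful of times.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_ranges(n, parts):
    """Split range(n) into at most `parts` contiguous (start, stop) chunks."""
    parts = max(1, min(parts, n))
    base, extra = divmod(n, parts)
    ranges = []
    start = 0
    for k in range(parts):
        # The first `extra` chunks take one additional iteration each.
        stop = start + base + (1 if k < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def sum_squares(start, stop):
    """Stand-in for the real loop body: an ordinary serial loop over one chunk."""
    total = 0
    for i in range(start, stop):
        total += i * i
    return total


def parallel_sum_squares(n, workers):
    """Run one task per chunk instead of one task per iteration."""
    ranges = chunk_ranges(n, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(lambda r: sum_squares(*r), ranges)
        return sum(partials)
```

Note that in standard CPython the global interpreter lock prevents threads from running CPU-bound Python code in parallel, so this sketch shows only the chunking structure; an actual speedup would need a process pool or a runtime with truly parallel threads. Either way, timing the chunked version against the plain serial loop is the only reliable way to tell whether it pays off.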
Stefan Glienke
Posted July 17, 2019

It's not about the level of nesting but about the number of cores you have and whether the outer loop already produces enough threads to utilize all of them. If that's the case, it does not make sense to parallelize even further because your CPU is already saturated.
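One way to turn that rule of thumb into code is to decide the worker counts up front from the core count and the number of outer iterations. The following Python sketch is only an illustration under assumed names: `plan_workers`, `row_total` and `grid_total` are hypothetical, and squaring the elements of a nested list stands in for the real nested loops. The inner loop is run in parallel only when the outer loop alone cannot occupy every core.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def plan_workers(outer_iterations, cores=None):
    """Return (outer_workers, inner_workers) for a two-level nested loop."""
    cores = cores or os.cpu_count() or 1
    if outer_iterations <= 0:
        return 1, 1
    if outer_iterations >= cores:
        # The outer loop alone saturates the CPU: keep the inner loop serial.
        return cores, 1
    # Too few outer iterations: share the remaining cores among inner loops.
    return outer_iterations, max(1, cores // outer_iterations)


def row_total(row, inner_workers):
    """Inner loop: serial unless the plan left spare cores for it."""
    if inner_workers == 1:
        return sum(x * x for x in row)
    with ThreadPoolExecutor(max_workers=inner_workers) as pool:
        return sum(pool.map(lambda x: x * x, row))


def grid_total(grid, cores=None):
    """Outer loop: always parallel, sized by plan_workers."""
    outer_workers, inner_workers = plan_workers(len(grid), cores)
    with ThreadPoolExecutor(max_workers=outer_workers) as pool:
        return sum(pool.map(lambda row: row_total(row, inner_workers), grid))
```

With 100 outer iterations on a four-core machine the plan keeps the inner loop serial; with only two outer iterations on 64 cores it gives each inner loop 32 workers. The integer division is a deliberately crude heuristic, and the right split for a real workload should be confirmed by measurement.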
Steve Maughan
Posted July 17, 2019

Hi Stefan & Peter,

Thanks! I'm assuming most users will have a two- or four-core system. But power users may have 64 cores; it was really them I was thinking about. Truth is, I need to stop optimizing at this stage and focus on the actual algorithm.

Thanks again,
Steve