Showing results for tags 'affinity'.
Found 2 results
-
How do I set the affinity for TOmniThread/TOmniTaskControl? It keeps resetting to the default affinity value (utilizing all CPUs). I tried setting Environment.Process.Affinity.AsString, but this value also keeps reverting to the default (255). It happens when the OmniThread starts. I see it has many properties for setting the affinity, but I'm not sure how to use them.
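For readers following along, a small sketch of how an affinity bitmask maps to CPU numbers may help interpret the 255 above. This assumes the value is a plain bitmask in which bit n stands for logical CPU n; the helper names `mask_from_cpus` and `cpus_from_mask` are hypothetical and are not part of OmniThreadLibrary.

```python
def mask_from_cpus(cpus):
    """Build an affinity bitmask: bit n is set when logical CPU n is allowed."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask


def cpus_from_mask(mask):
    """List the logical CPU numbers whose bits are set in the mask."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


# Under the bitmask assumption, 255 (0xFF) means "CPUs 0 through 7 allowed".
print(cpus_from_mask(255))      # [0, 1, 2, 3, 4, 5, 6, 7]
# Restricting to CPUs 0 and 1 would be mask 3 (binary 11).
print(mask_from_cpus([0, 1]))   # 3
```

If that reading is right, 255 is simply "all eight CPUs" on an 8-CPU machine, which matches the description of the value falling back to the default.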
-
Here is the problem (a bit of a long message): I'm writing a threaded program to run on workstations with dual Xeons. Each of the Xeons has eight hyperthreaded cores. (I've read https://www.delphipraxis.net/113427-beginthreadaffinity-setthreadaffinity.html but I don't think it helps me.)
Dual-Xeon systems have Non-Uniform Memory Access (NUMA): each Xeon has direct access to its own memory and can also access memory attached to the other Xeon, but it takes a long time for a thread running on one Xeon to access memory on the other. The bottleneck in the program is memory access.
I have the program set up so that I can adjust the number of threads it runs. As I test from 1 up to 16 threads, performance improves. Task Manager (Performance / Resource Monitor) shows that all 16 threads are running on one Xeon. But if I go to 17 threads or more, performance gets worse (compared to 16 threads on one Xeon). What must be happening is that threads 17 and above execute on the second Xeon while their memory is on the first Xeon. That's the problem.
I've tried to set the affinity of each thread to a particular CPU, hoping it would then use that CPU for its memory, but setting the affinity inside a thread isn't working at all (even with only one thread, it runs on all CPUs).
What I have right now: I limit an instance of the program to 16 threads and set its affinity to run on the first Xeon. Then I start a second instance of the program. It checks whether the first instance is running, and if so, it sets its affinity to run on the second Xeon. Then I merge the results later (which isn't ideal). This works, with double the performance of running on one Xeon, but it is no different from running on two computers, each with one Xeon. So is there a way to get one instance to use all 32 available CPUs with full performance?
(If I could assign each thread and its memory to a particular CPU, that should fix the problem, but I haven't been able to do that.)
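As a rough illustration of the "each thread on a particular Xeon" idea, here is a sketch that splits worker threads across two NUMA nodes and computes the affinity mask for each. It assumes logical CPUs 0-15 belong to the first Xeon and 16-31 to the second, which is only an assumption; the real layout should be queried from the OS (on Windows, for example, with GetNumaNodeProcessorMask). The names `node_mask` and `plan_threads` are hypothetical. The resulting mask is what one would pass to a call such as the Win32 SetThreadAffinityMask for that thread.

```python
CPUS_PER_NODE = 16  # 8 hyperthreaded cores per Xeon, as described in the post
NODES = 2           # dual-Xeon workstation


def node_mask(node, cpus_per_node=CPUS_PER_NODE):
    """Affinity mask covering every logical CPU of one node.

    Assumes the CPUs of a node are numbered contiguously.
    """
    return ((1 << cpus_per_node) - 1) << (node * cpus_per_node)


def plan_threads(thread_count, nodes=NODES):
    """Alternate threads between nodes; return (thread, node, mask) tuples."""
    return [(t, t % nodes, node_mask(t % nodes)) for t in range(thread_count)]


for thread, node, mask in plan_threads(4):
    print(f"thread {thread}: node {node}, mask 0x{mask:08X}")
# thread 0: node 0, mask 0x0000FFFF
# thread 1: node 1, mask 0xFFFF0000
# thread 2: node 0, mask 0x0000FFFF
# thread 3: node 1, mask 0xFFFF0000
```

Pinning alone may not move the data. Operating systems commonly place a memory page on the node of the thread that first touches it, so each worker would likely need to allocate and initialize its own buffers after its affinity has been set, or request node-local memory explicitly (Windows offers VirtualAllocExNuma for this).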