Showing results for tags 'affinity'.



Found 1 result

  1. Here is the problem (a bit of a long message). I'm writing a threaded program to run on workstations with dual Xeons. Each Xeon has eight hyperthreaded cores, so there are 16 logical CPUs per Xeon and 32 in total. (I've read https://www.delphipraxis.net/113427-beginthreadaffinity-setthreadaffinity.html but I don't think it helps me.)

     Dual-Xeon systems have Non-Uniform Memory Access (NUMA): each Xeon has direct access to its own memory and can also reach the memory attached to the other Xeon, but a thread running on one Xeon takes much longer to access memory on the other. The bottleneck in the program is memory access.

     The program lets me adjust the number of threads it runs. As I test from 1 up to 16 threads, performance improves. Task Manager (Performance tab, Resource Monitor) shows that all 16 threads are running on one Xeon. If I go to 17 threads or more, performance gets worse than with 16 threads on one Xeon. Presumably threads 17 and above execute on the second Xeon while their memory sits on the first Xeon. That's the problem.

     I've tried setting the affinity of each thread to a particular CPU, hoping the thread would then use that CPU's memory, but setting the affinity from inside a thread isn't working at all (even with only one thread, it runs on all CPUs).

     My current workaround is to limit one instance of the program to 16 threads with its affinity set to the first Xeon. Then I start a second instance, which checks whether the first instance is running and, if so, sets its affinity to the second Xeon. I merge the results later, which isn't ideal. This works and gives double the performance of running on one Xeon, but it is no different from running on two computers with one Xeon each.

     So is there a way to get one instance to use all 32 available CPUs with full performance? (If I could assign each thread and its memory to a particular CPU, that should fix the problem, but I haven't been able to do that.)
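     The idea in the last sentence (tie each thread and its memory to one Xeon) could look roughly like the sketch below. It is written in C++ against the plain Win32 API rather than Delphi, so treat it as an outline: `SetThreadAffinityMask`, `GetNumaNodeProcessorMask` and `VirtualAllocExNuma` are real kernel32 functions, but the helper names (`NodeForWorker`, `AssumedNodeMask`, `PinAndAllocate`), the round-robin assignment and the 2 x 16 topology constants are assumptions made for illustration. It also assumes each worker calls the helper from inside its own thread function, before it first touches its data.

```cpp
// Sketch: pin each worker to one NUMA node and allocate its buffer on that node.
// The Windows calls are guarded so the node/mask arithmetic compiles anywhere.
#include <cassert>
#include <cstddef>
#include <cstdint>

#ifdef _WIN32
#include <windows.h>
#endif

// Assumed topology from the post: 2 Xeons with 16 logical CPUs each.
// On real hardware, query GetNumaHighestNodeNumber instead of hard-coding.
constexpr unsigned kNodes = 2;
constexpr unsigned kLogicalPerNode = 16;

// Round-robin (an illustrative choice): worker 0 -> node 0, worker 1 -> node 1,
// worker 2 -> node 0, ... so any thread count is split across both Xeons.
unsigned NodeForWorker(unsigned workerIndex) {
    return workerIndex % kNodes;
}

// Mask covering all logical CPUs of one node, ASSUMING the CPUs are numbered
// contiguously per node (0..15 on node 0, 16..31 on node 1).
std::uint64_t AssumedNodeMask(unsigned node) {
    assert(node < kNodes);
    const std::uint64_t oneNode = (std::uint64_t{1} << kLogicalPerNode) - 1;
    return oneNode << (node * kLogicalPerNode);
}

#ifdef _WIN32
// Hypothetical helper, meant to run inside the worker thread itself.
// Returns a buffer whose preferred NUMA node is the worker's node, or nullptr.
void* PinAndAllocate(unsigned workerIndex, std::size_t bytes) {
    const unsigned node = NodeForWorker(workerIndex);

    // Prefer the mask reported by Windows; fall back to the assumed layout.
    ULONGLONG mask = 0;
    if (!GetNumaNodeProcessorMask(static_cast<UCHAR>(node), &mask) || mask == 0) {
        mask = AssumedNodeMask(node);
    }

    // Restrict the calling thread to that node's CPUs (returns 0 on failure).
    if (SetThreadAffinityMask(GetCurrentThread(),
                              static_cast<DWORD_PTR>(mask)) == 0) {
        return nullptr;
    }

    // Ask for memory with this node as the preferred node for its pages.
    return VirtualAllocExNuma(GetCurrentProcess(), nullptr, bytes,
                              MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE, node);
}
#endif
```

     With this layout, workers 0, 2, 4, ... land on the first Xeon and workers 1, 3, 5, ... on the second, and each worker's buffer is requested from its own node. A buffer obtained this way would be released with `VirtualFree(buffer, 0, MEM_RELEASE)`. In Delphi, `SetThreadAffinityMask` is available from the Windows unit; the NUMA functions may need to be declared as external kernel32 imports if the unit in use does not provide them.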