RDP1974

Members
  • Content Count

    231
  • Days Won

    1

RDP1974 last won the day on September 29 2021

RDP1974 had the most liked content!

Community Reputation

40 Excellent

  1. I feel dumb: I want to edit the first post and erase one post, but I can't find how. It seems impossible! 😕
  2. I can't find where to rename the topic 😕
  3. Lowering the size of the output to 2.5 kB (JSON blob) instead of 46 kB, the ISAPI app has this throughput (the CPU is a fairly old 9th-gen, 14 nm i9-9900KF):

        Concurrency Level:      100
        Time taken for tests:   0.398 seconds
        Complete requests:      10000
        Failed requests:        0
        Keep-Alive requests:    10000
        Total transferred:      27800000 bytes
        HTML transferred:       26100000 bytes
        Requests per second:    25106.45 [#/sec] (mean)
        Time per request:       3.983 [ms] (mean)
        Time per request:       0.040 [ms] (mean, across all concurrent requests)
        Transfer rate:          68160.09 [Kbytes/sec] received
  4. With keep-alive, ISAPI app:

        Concurrency Level:      100
        Time taken for tests:   4.763 seconds
        Complete requests:      10000
        Failed requests:        0
        Keep-Alive requests:    10000
        Total transferred:      431790000 bytes
        HTML transferred:       430080000 bytes
        Requests per second:    2099.45 [#/sec] (mean)
        Time per request:       47.631 [ms] (mean)
        Time per request:       0.476 [ms] (mean, across all concurrent requests)
        Transfer rate:          88527.55 [Kbytes/sec] received

    In the project source:

        Application.MaxConnections := 1000;
        Application.CacheConnections := True;
  5. Please tell me if these tests are disturbing or inappropriate; if so, I will delete them.
  6. Single request, latest oneAPI v.2022 (see first page):

        Concurrency Level:      1
        Time taken for tests:   0.004 seconds
        Complete requests:      1
        Failed requests:        0
        Keep-Alive requests:    0
        Total transferred:      43146 bytes
        HTML transferred:       43008 bytes
        Requests per second:    285.06 [#/sec] (mean)
        Time per request:       3.508 [ms] (mean)
        Time per request:       3.508 [ms] (mean, across all concurrent requests)
        Transfer rate:          12011.05 [Kbytes/sec] received
  7. Latest oneAPI v.2022 Intel tbbmalloc with zlib deflate ac: https://github.com/RDP1974/Delphi64RTL

        Server Software:        Microsoft-IIS/10.0
        Server Hostname:        192.168.1.110
        Server Port:            80

        Document Path:          /isapi/testisapi.dll
        Document Length:        8416 bytes

        Concurrency Level:      100
        Time taken for tests:   5.478 seconds
        Complete requests:      10000
        Failed requests:        0
        Total transferred:      86170000 bytes
        HTML transferred:       84160000 bytes
        Requests per second:    1825.64 [#/sec] (mean)
        Time per request:       54.775 [ms] (mean)
        Time per request:       0.548 [ms] (mean, across all concurrent requests)
        Transfer rate:          15362.84 [Kbytes/sec] received

    Without zlib deflate:

        Document Path:          /isapi/testisapi.dll
        Document Length:        43008 bytes

        Concurrency Level:      100
        Time taken for tests:   5.236 seconds
        Complete requests:      10000
        Failed requests:        0
        Total transferred:      431740000 bytes
        HTML transferred:       430080000 bytes
        Requests per second:    1909.82 [#/sec] (mean)
        Time per request:       52.361 [ms] (mean)
        Time per request:       0.524 [ms] (mean, across all concurrent requests)
        Transfer rate:          80522.17 [Kbytes/sec] received
  8. By the way, I'll also provide an Intel oneAPI accelerated zlib.so library for Linux.
  9. Look at this: using zlib accelerated deflate (RDPMM64 repo zip.dll), msheap, ISAPI DLL:

        Document Path:          /isapi/testisapi.dll
        Document Length:        8416 bytes

        Concurrency Level:      100
        Time taken for tests:   0.609 seconds
        Complete requests:      1000
        Failed requests:        0
        Total transferred:      8617000 bytes
        HTML transferred:       8416000 bytes
        Requests per second:    1641.52 [#/sec] (mean)
        Time per request:       60.919 [ms] (mean)
        Time per request:       0.609 [ms] (mean, across all concurrent requests)
        Transfer rate:          13813.49 [Kbytes/sec] received

    Added:

        uses JsonDataObjects, DataSet.Serialize, RDPWebbroker64;

    In the response method:

        Response.ContentType:='application/json; charset="UTF-8"';
        Response.ZlibDeflate;
        end;

    The response is reduced in real time from 43 kB to 8 kB (saving server-to-browser transfer time and cloud bandwidth cost) while keeping the throughput in requests per second unchanged.
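The size reduction quoted in the post above (Document Length 43008 bytes without deflate, 8416 bytes with it) can be put in perspective with a small sketch. It uses Python's standard `zlib` module, not the accelerated zip.dll or the `Response.ZlibDeflate` helper from the post, and the JSON payload is invented for illustration, so the ratio it measures on its own data will differ from the reported one.

```python
import json
import zlib


def deflate_ratio(raw: bytes, level: int = 6) -> float:
    """Return compressed size divided by original size (smaller is better)."""
    return len(zlib.compress(raw, level)) / len(raw)


# Hypothetical payload: repetitive JSON rows, loosely resembling a serialized dataset.
rows = [{"id": i, "name": f"customer {i}", "active": i % 2 == 0} for i in range(500)]
payload = json.dumps(rows).encode("utf-8")

compressed = zlib.compress(payload, 6)
# Deflate is lossless: decompressing must give back the exact payload.
assert zlib.decompress(compressed) == payload

print(f"sketch payload: {len(payload)} -> {len(compressed)} bytes "
      f"(ratio {deflate_ratio(payload):.3f})")

# Figures quoted in the post: 43008 bytes uncompressed, 8416 bytes deflated.
reported_ratio = 8416 / 43008
print(f"reported ratio: {reported_ratio:.3f}")
```

The reported figures work out to a ratio of roughly 0.196, that is, about 80% fewer bytes on the wire per response.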
  10. Hi, I have done a more extended sample, with a WebBroker Indy HTTP server, JSON parsing, data serialization to a FireDAC memtable, and populating a response, using threadvar blobs.

    Single request, 1 request 1 thread: ab -n 1 -c 1 -k http://192.168.1.110:8080/

    default:

        Concurrency Level:      1
        Time taken for tests:   0.003 seconds
        Complete requests:      1
        Failed requests:        0
        Keep-Alive requests:    0
        Total transferred:      43146 bytes
        HTML transferred:       43008 bytes
        Requests per second:    287.27 [#/sec] (mean)
        Time per request:       3.481 [ms] (mean)
        Time per request:       3.481 [ms] (mean, across all concurrent requests)
        Transfer rate:          12104.21 [Kbytes/sec] received

    rdp64 intel tbb:

        Concurrency Level:      1
        Time taken for tests:   0.003 seconds
        Complete requests:      1
        Failed requests:        0
        Keep-Alive requests:    0
        Total transferred:      43146 bytes
        HTML transferred:       43008 bytes
        Requests per second:    287.44 [#/sec] (mean)
        Time per request:       3.479 [ms] (mean)
        Time per request:       3.479 [ms] (mean, across all concurrent requests)
        Transfer rate:          12111.17 [Kbytes/sec] received

    msheap:

        Concurrency Level:      1
        Time taken for tests:   0.005 seconds
        Complete requests:      1
        Failed requests:        0
        Keep-Alive requests:    0
        Total transferred:      43146 bytes
        HTML transferred:       43008 bytes
        Requests per second:    191.57 [#/sec] (mean)
        Time per request:       5.220 [ms] (mean)
        Time per request:       5.220 [ms] (mean, across all concurrent requests)
        Transfer rate:          8071.79 [Kbytes/sec] received

    fastmm5:

        Concurrency Level:      1
        Time taken for tests:   0.005 seconds
        Complete requests:      1
        Failed requests:        0
        Keep-Alive requests:    0
        Total transferred:      43146 bytes
        HTML transferred:       43008 bytes
        Requests per second:    191.64 [#/sec] (mean)
        Time per request:       5.218 [ms] (mean)
        Time per request:       5.218 [ms] (mean, across all concurrent requests)
        Transfer rate:          8074.89 [Kbytes/sec] received

    Multi-thread test, 100 requests 100 threads: ab -n 100 -c 100 -k http://192.168.1.110:8080/

    default:

        Concurrency Level:      100
        Time taken for tests:   1.549 seconds
        Complete requests:      100
        Failed requests:        0
        Keep-Alive requests:    0
        Total transferred:      4314600 bytes
        HTML transferred:       4300800 bytes
        Requests per second:    64.56 [#/sec] (mean)
        Time per request:       1548.967 [ms] (mean)
        Time per request:       15.490 [ms] (mean, across all concurrent requests)
        Transfer rate:          2720.18 [Kbytes/sec] received

    rdp64 intel tbb:

        Concurrency Level:      100
        Time taken for tests:   0.063 seconds
        Complete requests:      100
        Failed requests:        0
        Keep-Alive requests:    0
        Total transferred:      4314600 bytes
        HTML transferred:       4300800 bytes
        Requests per second:    1596.37 [#/sec] (mean)
        Time per request:       62.642 [ms] (mean)
        Time per request:       0.626 [ms] (mean, across all concurrent requests)
        Transfer rate:          67262.80 [Kbytes/sec] received

    msheap:

        Concurrency Level:      100
        Time taken for tests:   0.070 seconds
        Complete requests:      100
        Failed requests:        0
        Keep-Alive requests:    0
        Total transferred:      4314600 bytes
        HTML transferred:       4300800 bytes
        Requests per second:    1431.02 [#/sec] (mean)
        Time per request:       69.880 [ms] (mean)
        Time per request:       0.699 [ms] (mean, across all concurrent requests)
        Transfer rate:          60295.89 [Kbytes/sec] received

    fastmm5:

        Concurrency Level:      100
        Time taken for tests:   0.110 seconds
        Complete requests:      100
        Failed requests:        0
        Keep-Alive requests:    0
        Total transferred:      4314600 bytes
        HTML transferred:       4300800 bytes
        Requests per second:    909.90 [#/sec] (mean)
        Time per request:       109.902 [ms] (mean)
        Time per request:       1.099 [ms] (mean, across all concurrent requests)
        Transfer rate:          38338.49 [Kbytes/sec] received

    The ab bench is not very granular, so I assume that in a single thread all allocators should be close together. I also did a test on Linux Ubuntu 22 on the same host, and the results are similar to Windows with msheap or tbb.

    Kind regards.

    By the way, I used the JsonDataObjects and DataSet.Serialize libraries.

    TestWebbroker.zip
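To compare the four allocators in the 100-thread run above at a glance, the "Requests per second" lines can be pulled out of the ab output and divided by the default memory manager's figure. The following is a minimal sketch in Python; the helper names (`requests_per_second`, `speedups`) are made up for this example, and the input strings are the lines quoted in the post, not a fresh measurement.

```python
import re

# "Requests per second" lines as quoted in the post (ab -n 100 -c 100).
AB_LINES = {
    "default": "Requests per second: 64.56 [#/sec] (mean)",
    "rdp64 intel tbb": "Requests per second: 1596.37 [#/sec] (mean)",
    "msheap": "Requests per second: 1431.02 [#/sec] (mean)",
    "fastmm5": "Requests per second: 909.90 [#/sec] (mean)",
}

RPS_PATTERN = re.compile(r"Requests per second:\s+([\d.]+)")


def requests_per_second(ab_line: str) -> float:
    """Extract the numeric requests-per-second value from one ab output line."""
    match = RPS_PATTERN.search(ab_line)
    if match is None:
        raise ValueError(f"no requests-per-second figure in: {ab_line!r}")
    return float(match.group(1))


def speedups(lines: dict, baseline: str) -> dict:
    """Return each entry's throughput relative to the named baseline entry."""
    base = requests_per_second(lines[baseline])
    return {name: requests_per_second(line) / base for name, line in lines.items()}


for name, factor in speedups(AB_LINES, "default").items():
    print(f"{name:16s} {factor:6.2f}x")
```

With only 100 requests per run, each measurement lasts a fraction of a second, so these ratios are best read as indicative rather than precise.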
  11. RDP1974

    win11 24h2 msheap fastest

    https://learn.microsoft.com/en-us/windows/win32/memory/low-fragmentation-heap
    https://illmatics.com/Understanding_the_LFH.pdf
    https://www.softwareverify.com/blog/memory-fragmentation-your-worst-nightmare/
    https://users.rust-lang.org/t/why-dont-windows-targets-use-malloc-instead-of-heapalloc/57936

    I don't know if we need an intermediate allocator or if we can use the Win API heap directly.
  12. RDP1974

    win11 24h2 msheap fastest

    This is on a Windows Server 2022 21H2 Hyper-V guest:

        C:\Exes>cmd3_def
        Total Time: 1,062 Seconds
        Hands Per Second 2447231,63841808
        Hand Evaluation     Expected    Actual
        Royal Flushes:      4           4
        Straight Flushes:   36          36
        Four of a Kinds:    624         624
        Full Houses:        3744        3744
        Flushes:            5108        5108
        Straights:          10200       10200
        Three of a Kinds:   54912       54912
        Two Pairs:          123552      123552
        One Pairs:          1098240     1098240
        Other:              1302540     1302540
        Total Hands:        2598960     2598960

        C:\Exes>cmd3_fm5 (fastmm5)
        Total Time: 1 Seconds
        Hands Per Second 2598960
        Hand Evaluation     Expected    Actual
        Royal Flushes:      4           4
        Straight Flushes:   36          36
        Four of a Kinds:    624         624
        Full Houses:        3744        3744
        Flushes:            5108        5108
        Straights:          10200       10200
        Three of a Kinds:   54912       54912
        Two Pairs:          123552      123552
        One Pairs:          1098240     1098240
        Other:              1302540     1302540
        Total Hands:        2598960     2598960

        C:\Exes>cmd3_rdp64
        Total Time: 1,344 Seconds
        Hands Per Second 1933750
        Hand Evaluation     Expected    Actual
        Royal Flushes:      4           4
        Straight Flushes:   36          36
        Four of a Kinds:    624         624
        Full Houses:        3744        3744
        Flushes:            5108        5108
        Straights:          10200       10200
        Three of a Kinds:   54912       54912
        Two Pairs:          123552      123552
        One Pairs:          1098240     1098240
        Other:              1302540     1302540
        Total Hands:        2598960     2598960

        C:\Exes>cmd3_laz
        Total Time: 2,266 Seconds
        Hands Per Second 1146937,33451015
        Hand Evaluation     Expected    Actual
        Royal Flushes:      4           4
        Straight Flushes:   36          36
        Four of a Kinds:    624         624
        Full Houses:        3744        3744
        Flushes:            5108        5108
        Straights:          10200       10200
        Three of a Kinds:   54912       54912
        Two Pairs:          123552      123552
        One Pairs:          1098240     1098240
        Other:              1302540     1302540
        Total Hands:        2598960     2598960

        C:\Exes>cmd3_msheap
        Total Time: 0,984 Seconds
        Hands Per Second 2641219,51219512
        Hand Evaluation     Expected    Actual
        Royal Flushes:      4           4
        Straight Flushes:   36          36
        Four of a Kinds:    624         624
        Full Houses:        3744        3744
        Flushes:            5108        5108
        Straights:          10200       10200
        Three of a Kinds:   54912       54912
        Two Pairs:          123552      123552
        One Pairs:          1098240     1098240
        Other:              1302540     1302540
        Total Hands:        2598960     2598960
  13. Hi, I did a quick benchmark for a single-threaded application, using the attached poker game (of course it is not exhaustive; it covers only a small subset of things). The Windows heap manager seems to have been enhanced: the direct heap is now faster than the default MM or Intel TBB (host: i9-9900KF, Win11 24H2, Delphi 12.2.1, FPC 3.2.2, release mode).

    Single thread, x64, console mode:

    D12 default:

        Total Time: 1,031 Seconds
        Hands Per Second 2520814,74296799
        Hand Evaluation     Expected    Actual
        Royal Flushes:      4           4
        Straight Flushes:   36          36
        Four of a Kinds:    624         624
        Full Houses:        3744        3744
        Flushes:            5108        5108
        Straights:          10200       10200
        Three of a Kinds:   54912       54912
        Two Pairs:          123552      123552
        One Pairs:          1098240     1098240
        Other:              1302540     1302540
        Total Hands:        2598960     2598960

    D12 intel tbb (rdpmm64):

        Total Time: 1,281 Seconds
        Hands Per Second 2028852,45901639
        Hand Evaluation     Expected    Actual
        Royal Flushes:      4           4
        Straight Flushes:   36          36
        Four of a Kinds:    624         624
        Full Houses:        3744        3744
        Flushes:            5108        5108
        Straights:          10200       10200
        Three of a Kinds:   54912       54912
        Two Pairs:          123552      123552
        One Pairs:          1098240     1098240
        Other:              1302540     1302540
        Total Hands:        2598960     2598960

    D12 msheap:

        Total Time: 0,984 Seconds
        Hands Per Second 2641219,51219512
        Hand Evaluation     Expected    Actual
        Royal Flushes:      4           4
        Straight Flushes:   36          36
        Four of a Kinds:    624         624
        Full Houses:        3744        3744
        Flushes:            5108        5108
        Straights:          10200       10200
        Three of a Kinds:   54912       54912
        Two Pairs:          123552      123552
        One Pairs:          1098240     1098240
        Other:              1302540     1302540
        Total Hands:        2598960     2598960

    Latest FPC Lazarus:

        Total Time: 2,25 Seconds
        Hands Per Second 1155093,33333333
        Hand Evaluation     Expected    Actual
        Royal Flushes:      4           4
        Straight Flushes:   36          36
        Four of a Kinds:    624         624
        Full Houses:        3744        3744
        Flushes:            5108        5108
        Straights:          10200       10200
        Three of a Kinds:   54912       54912
        Two Pairs:          123552      123552
        One Pairs:          1098240     1098240
        Other:              1302540     1302540
        Total Hands:        2598960     2598960

    Many RTLs use the Windows heap directly, such as Rust, Clang and others, and it resists fragmentation, so I suppose it is okay to use it directly. It also performs very well in multithreaded scenarios such as WebBroker apps.

    Look here if you wish: https://github.com/RDP1974/

    Sorry to bore you with these things; this is just out of curiosity, to squeeze out whatever performance is possible.

    By the way, do you know a more complete real-world test scenario than this?

    wldd49.zip
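The figures in the poker benchmark above can be cross-checked with a few lines of arithmetic. The sketch below is Python rather than the Delphi/FPC code of the benchmark itself, and it only restates numbers from the post: the total of 2598960 hands is the number of 5-card hands from a 52-card deck, and "Hands Per Second" is that total divided by the run time. The dictionary and function names are chosen for this example.

```python
from math import comb

# Number of distinct 5-card hands that can be dealt from a 52-card deck.
TOTAL_HANDS = comb(52, 5)

# Expected counts per category, as printed by the benchmark.
EXPECTED = {
    "Royal Flushes": 4,
    "Straight Flushes": 36,
    "Four of a Kinds": 624,
    "Full Houses": 3744,
    "Flushes": 5108,
    "Straights": 10200,
    "Three of a Kinds": 54912,
    "Two Pairs": 123552,
    "One Pairs": 1098240,
    "Other": 1302540,
}

# Total time in seconds for the Win11 24H2 runs quoted in the post.
TIMES = {
    "D12 default": 1.031,
    "D12 intel tbb (rdpmm64)": 1.281,
    "D12 msheap": 0.984,
    "FPC lazarus": 2.25,
}


def hands_per_second(seconds: float) -> float:
    """Throughput as the benchmark reports it: all hands divided by run time."""
    return TOTAL_HANDS / seconds


# The category counts must add up to the full set of hands.
assert sum(EXPECTED.values()) == TOTAL_HANDS

fastest = min(TIMES.values())
for name, seconds in TIMES.items():
    print(f"{name:26s} {hands_per_second(seconds):14.2f} hands/s "
          f"({seconds / fastest:.2f}x the fastest time)")
```

Since every run evaluates the same 2598960 hands, comparing total times and comparing hands per second are equivalent; the run times differ by only a few hundredths of a second between msheap and the default manager, so repeated runs would be needed before treating that gap as significant.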