RDP1974

Posts posted by RDP1974


  1. Look at this: using zlib accelerated deflate (zip.dll from the RDPMM64 repo)

     

    msheap, isapi dll

     

    Document Path:          /isapi/testisapi.dll
    Document Length:        8416 bytes

    Concurrency Level:      100
    Time taken for tests:   0.609 seconds
    Complete requests:      1000
    Failed requests:        0
    Total transferred:      8617000 bytes
    HTML transferred:       8416000 bytes
    Requests per second:    1641.52 [#/sec] (mean)
    Time per request:       60.919 [ms] (mean)
    Time per request:       0.609 [ms] (mean, across all concurrent requests)
    Transfer rate:          13813.49 [Kbytes/sec] received

     

    added:

     

    uses JsonDataObjects, DataSet.Serialize, RDPWebbroker64;
     

    in response method:

    Response.ContentType:='application/json; charset="UTF-8"';
    Response.ZlibDeflate;

    end;

     

    The response shrank in real time from 43 KB to 8 KB (saving server-to-browser transfer time and cloud bandwidth cost) while throughput in requests per second stayed unchanged.
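As a rough illustration of why a JSON response shrinks so much under deflate, here is a minimal sketch using Python's standard zlib module. It is not the RDPWebbroker64 code: the payload below is made up, and the exact ratio depends entirely on how repetitive the real data is.

```python
import json
import zlib

# Hypothetical payload: 300 similar rows standing in for a serialized dataset.
rows = [
    {"id": i, "name": f"customer {i}", "city": "Sample City", "active": True}
    for i in range(300)
]
body = json.dumps(rows).encode("utf-8")

# Deflate the body at zlib's default-like level 6.
deflated = zlib.compress(body, 6)

# Round trip to confirm the original bytes come back unchanged.
restored = zlib.decompress(deflated)

print(len(body), "bytes before,", len(deflated), "bytes after")
```

Repeated field names and similar values are what deflate exploits, which is typical of a serialized dataset.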


  2. hi,

    I have built a more extensive sample: a WebBroker Indy HTTP server that parses JSON, serializes the data into a FireDAC memtable and populates a response, using threadvar blobs.

     

    single request >
    1 request 1 thread: ab -n 1 -c 1 -k http://192.168.1.110:8080/

     

    default
    Concurrency Level:      1
    Time taken for tests:   0.003 seconds
    Complete requests:      1
    Failed requests:        0
    Keep-Alive requests:    0
    Total transferred:      43146 bytes
    HTML transferred:       43008 bytes
    Requests per second:    287.27 [#/sec] (mean)
    Time per request:       3.481 [ms] (mean)
    Time per request:       3.481 [ms] (mean, across all concurrent requests)
    Transfer rate:          12104.21 [Kbytes/sec] received

     

    rdp64 intel tbb
    Concurrency Level:      1
    Time taken for tests:   0.003 seconds
    Complete requests:      1
    Failed requests:        0
    Keep-Alive requests:    0
    Total transferred:      43146 bytes
    HTML transferred:       43008 bytes
    Requests per second:    287.44 [#/sec] (mean)
    Time per request:       3.479 [ms] (mean)
    Time per request:       3.479 [ms] (mean, across all concurrent requests)
    Transfer rate:          12111.17 [Kbytes/sec] received

     

    msheap
    Concurrency Level:      1
    Time taken for tests:   0.005 seconds
    Complete requests:      1
    Failed requests:        0
    Keep-Alive requests:    0
    Total transferred:      43146 bytes
    HTML transferred:       43008 bytes
    Requests per second:    191.57 [#/sec] (mean)
    Time per request:       5.220 [ms] (mean)
    Time per request:       5.220 [ms] (mean, across all concurrent requests)
    Transfer rate:          8071.79 [Kbytes/sec] received

     

    fastmm5
    Concurrency Level:      1
    Time taken for tests:   0.005 seconds
    Complete requests:      1
    Failed requests:        0
    Keep-Alive requests:    0
    Total transferred:      43146 bytes
    HTML transferred:       43008 bytes
    Requests per second:    191.64 [#/sec] (mean)
    Time per request:       5.218 [ms] (mean)
    Time per request:       5.218 [ms] (mean, across all concurrent requests)
    Transfer rate:          8074.89 [Kbytes/sec] received

     

    > multi thread test >
    100 requests 100 threads: ab -n 100 -c 100 -k http://192.168.1.110:8080/ 

     

    default:
    Concurrency Level:      100
    Time taken for tests:   1.549 seconds
    Complete requests:      100
    Failed requests:        0
    Keep-Alive requests:    0
    Total transferred:      4314600 bytes
    HTML transferred:       4300800 bytes
    Requests per second:    64.56 [#/sec] (mean)
    Time per request:       1548.967 [ms] (mean)
    Time per request:       15.490 [ms] (mean, across all concurrent requests)
    Transfer rate:          2720.18 [Kbytes/sec] received

     

    rdp64 intel tbb
    Concurrency Level:      100
    Time taken for tests:   0.063 seconds
    Complete requests:      100
    Failed requests:        0
    Keep-Alive requests:    0
    Total transferred:      4314600 bytes
    HTML transferred:       4300800 bytes
    Requests per second:    1596.37 [#/sec] (mean)
    Time per request:       62.642 [ms] (mean)
    Time per request:       0.626 [ms] (mean, across all concurrent requests)
    Transfer rate:          67262.80 [Kbytes/sec] received

     

    msheap
    Concurrency Level:      100
    Time taken for tests:   0.070 seconds
    Complete requests:      100
    Failed requests:        0
    Keep-Alive requests:    0
    Total transferred:      4314600 bytes
    HTML transferred:       4300800 bytes
    Requests per second:    1431.02 [#/sec] (mean)
    Time per request:       69.880 [ms] (mean)
    Time per request:       0.699 [ms] (mean, across all concurrent requests)
    Transfer rate:          60295.89 [Kbytes/sec] received

     

    fastmm5
    Concurrency Level:      100
    Time taken for tests:   0.110 seconds
    Complete requests:      100
    Failed requests:        0
    Keep-Alive requests:    0
    Total transferred:      4314600 bytes
    HTML transferred:       4300800 bytes
    Requests per second:    909.90 [#/sec] (mean)
    Time per request:       109.902 [ms] (mean)
    Time per request:       1.099 [ms] (mean, across all concurrent requests)
    Transfer rate:          38338.49 [Kbytes/sec] received

     

    The ab bench is not very granular, so I assume that in a single thread all allocators should be close together.

    I also ran the test on Ubuntu 22 Linux on the same host, and the results are similar to Windows with msheap or TBB.
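For reading the ab figures above, this small sketch shows how the three derived numbers relate to the request count, concurrency level and elapsed time, and how the multi-thread runs compare with the default memory manager. The function name is my own, and because ab prints the elapsed time rounded to milliseconds, recomputed values only match the report approximately.

```python
def ab_metrics(requests, concurrency, seconds):
    """Recompute ab's derived figures from the raw counters."""
    requests_per_second = requests / seconds
    # "Time per request (mean)": latency seen by one simulated client.
    per_request_ms = concurrency * seconds * 1000 / requests
    # "Time per request (mean, across all concurrent requests)".
    across_all_ms = seconds * 1000 / requests
    return requests_per_second, per_request_ms, across_all_ms


# Requests per second reported above for the 100-thread run.
reported_rps = {
    "default": 64.56,
    "rdp64 intel tbb": 1596.37,
    "msheap": 1431.02,
    "fastmm5": 909.90,
}

# Throughput of each allocator relative to the default memory manager.
speedup = {name: rps / reported_rps["default"] for name, rps in reported_rps.items()}

for name, factor in speedup.items():
    print(f"{name}: {factor:.1f}x")
```

In the single-request runs the differences are a couple of milliseconds, which is close to the resolution of the tool, so only the multi-thread ratios say much.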

     

    kind regards

    By the way, I used the JsonDataObjects and DataSet.Serialize libraries.

     


     

    TestWebbroker.zip


  3. This is on a Windows Server 2022 21H2 Hyper-V guest:

     

    C:\Exes>cmd3_def
    Total Time: 1,062 Seconds
    Hands Per Second  2447231,63841808
    Hand Evaluation  Expected Actual
    Royal Flushes:          4 4
    Straight Flushes:      36 36
    Four of a Kinds:      624 624
    Full Houses:         3744 3744
    Flushes:             5108 5108
    Straights:          10200 10200
    Three of a Kinds:   54912 54912
    Two Pairs:         123552 123552
    One Pairs:        1098240 1098240
    Other:            1302540 1302540
    Total Hands:      2598960 2598960


    C:\Exes>cmd3_fm5 (fastmm5)
    Total Time: 1 Seconds
    Hands Per Second  2598960
    Hand Evaluation  Expected Actual
    Royal Flushes:          4 4
    Straight Flushes:      36 36
    Four of a Kinds:      624 624
    Full Houses:         3744 3744
    Flushes:             5108 5108
    Straights:          10200 10200
    Three of a Kinds:   54912 54912
    Two Pairs:         123552 123552
    One Pairs:        1098240 1098240
    Other:            1302540 1302540
    Total Hands:      2598960 2598960


    C:\Exes>cmd3_rdp64
    Total Time: 1,344 Seconds
    Hands Per Second  1933750
    Hand Evaluation  Expected Actual
    Royal Flushes:          4 4
    Straight Flushes:      36 36
    Four of a Kinds:      624 624
    Full Houses:         3744 3744
    Flushes:             5108 5108
    Straights:          10200 10200
    Three of a Kinds:   54912 54912
    Two Pairs:         123552 123552
    One Pairs:        1098240 1098240
    Other:            1302540 1302540
    Total Hands:      2598960 2598960


    C:\Exes>cmd3_laz
    Total Time: 2,266 Seconds
    Hands Per Second  1146937,33451015
    Hand Evaluation  Expected Actual
    Royal Flushes:          4 4
    Straight Flushes:      36 36
    Four of a Kinds:      624 624
    Full Houses:         3744 3744
    Flushes:             5108 5108
    Straights:          10200 10200
    Three of a Kinds:   54912 54912
    Two Pairs:         123552 123552
    One Pairs:        1098240 1098240
    Other:            1302540 1302540
    Total Hands:      2598960 2598960


    C:\Exes>cmd3_msheap
    Total Time: 0,984 Seconds
    Hands Per Second  2641219,51219512
    Hand Evaluation  Expected Actual
    Royal Flushes:          4 4
    Straight Flushes:      36 36
    Four of a Kinds:      624 624
    Full Houses:         3744 3744
    Flushes:             5108 5108
    Straights:          10200 10200
    Three of a Kinds:   54912 54912
    Two Pairs:         123552 123552
    One Pairs:        1098240 1098240
    Other:            1302540 1302540
    Total Hands:      2598960 2598960
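The "Expected" column and the "Hands Per Second" figures in these listings can be cross-checked with a little combinatorics. The sketch below is independent of the attached poker program: it derives each category count from the standard 52-card formulas and divides the total number of hands by the reported time.

```python
from math import comb

TOTAL_HANDS = comb(52, 5)  # every 5-card hand from a 52-card deck

expected = {
    "Royal Flushes": 4,
    "Straight Flushes": 10 * 4 - 4,          # excludes the royal flushes
    "Four of a Kinds": 13 * 48,
    "Full Houses": 13 * comb(4, 3) * 12 * comb(4, 2),
    "Flushes": 4 * comb(13, 5) - 40,         # excludes all straight flushes
    "Straights": 10 * 4**5 - 40,             # excludes all straight flushes
    "Three of a Kinds": 13 * comb(4, 3) * comb(12, 2) * 4**2,
    "Two Pairs": comb(13, 2) * comb(4, 2) ** 2 * 11 * 4,
    "One Pairs": 13 * comb(4, 2) * comb(12, 3) * 4**3,
}
# Whatever is left over is a high-card hand.
expected["Other"] = TOTAL_HANDS - sum(expected.values())


def hands_per_second(seconds):
    """Throughput as the benchmark reports it: all hands over elapsed time."""
    return TOTAL_HANDS / seconds


for name, count in expected.items():
    print(f"{name}: {count}")
print("msheap run:", hands_per_second(0.984), "hands per second")
```

Since every run evaluates the same 2598960 hands, the throughput figure is just the reciprocal of the elapsed time scaled by a constant, so comparing the times directly gives the same ranking.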


  4. hi,

    I ran a quick benchmark for a single-threaded application (using the attached poker game; of course it is not exhaustive and covers only a small subset of things).

    The Windows heap manager seems to have been improved: using the heap directly is now faster than the default MM or Intel TBB (host: i9-9900KF, Windows 11 24H2, Delphi 12.2.1, FPC 3.2.2, release mode).

     

    single thread x64 console mode->

     

    D12 default:

    Total Time: 1,031 Seconds
    Hands Per Second  2520814,74296799
    Hand Evaluation  Expected Actual
    Royal Flushes:          4 4
    Straight Flushes:      36 36
    Four of a Kinds:      624 624
    Full Houses:         3744 3744
    Flushes:             5108 5108
    Straights:          10200 10200
    Three of a Kinds:   54912 54912
    Two Pairs:         123552 123552
    One Pairs:        1098240 1098240
    Other:            1302540 1302540
    Total Hands:      2598960 2598960

     

    D12 intel tbb (rdpmm64):

    Total Time: 1,281 Seconds
    Hands Per Second  2028852,45901639
    Hand Evaluation  Expected Actual
    Royal Flushes:          4 4
    Straight Flushes:      36 36
    Four of a Kinds:      624 624
    Full Houses:         3744 3744
    Flushes:             5108 5108
    Straights:          10200 10200
    Three of a Kinds:   54912 54912
    Two Pairs:         123552 123552
    One Pairs:        1098240 1098240
    Other:            1302540 1302540
    Total Hands:      2598960 2598960

     

    D12 msheap:

    Total Time: 0,984 Seconds
    Hands Per Second  2641219,51219512
    Hand Evaluation  Expected Actual
    Royal Flushes:          4 4
    Straight Flushes:      36 36
    Four of a Kinds:      624 624
    Full Houses:         3744 3744
    Flushes:             5108 5108
    Straights:          10200 10200
    Three of a Kinds:   54912 54912
    Two Pairs:         123552 123552
    One Pairs:        1098240 1098240
    Other:            1302540 1302540
    Total Hands:      2598960 2598960

     

    latest FPC lazarus:

    Total Time: 2,25 Seconds
    Hands Per Second  1155093,33333333
    Hand Evaluation  Expected Actual
    Royal Flushes:          4 4
    Straight Flushes:      36 36
    Four of a Kinds:      624 624
    Full Houses:         3744 3744
    Flushes:             5108 5108
    Straights:          10200 10200
    Three of a Kinds:   54912 54912
    Two Pairs:         123552 123552
    One Pairs:        1098240 1098240
    Other:            1302540 1302540
    Total Hands:      2598960 2598960

     

    Many RTLs use the Windows heap directly, such as those of Rust, Clang and others. It resists fragmentation, so I suppose it is okay to use it directly. It also behaves very well in multithreaded code such as WebBroker apps.

     

    look here if you wish

    https://github.com/RDP1974/

     

    Sorry to bore you with these things; I am just curious to squeeze out whatever performance is possible.
    By the way, do you know a more complete real-world test scenario than this?

    wldd49.zip


  5. On 10/12/2024 at 10:07 PM, Arnaud Bouchez said:

    Personal bias: the mORMot 2 Open Source framework has a very efficient JSON library, and several ways to use it:

    - from RTTI, using classes, records, collections, dynamic arrays, mORMot generics...
    - from variants, and a custom "document" variant type to store JSON objects or arrays...
    - from high-level IDocList / IDocDict holders.

     

    See https://blog.synopse.info/?post/2024/02/01/Easy-JSON-with-Delphi-and-FPC

     

    It is perhaps the fastest library available, working on Delphi and FPC, with unique features, like:

    
      list := DocList('[{"a":10,"b":20},{"a":1,"b":21},{"a":11,"b":20}]');
      // sort a list/array by the nested objects field(s)
      list.SortByKeyValue(['b', 'a']);
      assert(list.Json = '[{"a":10,"b":20},{"a":11,"b":20},{"a":1,"b":21}]');
    
      // create a dictionary from key:value pairs supplied from code
      dict := DocDict(['one', 1, 'two', 2, 'three', _Arr([5, 6, 7, 'huit'])]);
      assert(dict.Len = 3); // one dictionary with 3 elements
      assert(dict.Json = '{"one":1,"two":2,"three":[5,6,7,"huit"]}');
      // convert to JSON with nice formatting (line feeds and spaces)
      Memo1.Caption := dict.ToString(jsonHumanReadable);
    
      // integrated search / filter
      assert(DocList('[{ab:1,cd:{ef:"two"}}]').First('ab<>0').cd.ef = 'two');

     

    Hi, is it possible to use a path-based retrieval method, for example on a list that contains multiple nested arrays?
    Many libraries offer a subset of https://goessner.net/articles/JsonPath/ for simple retrievals.
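To make the question concrete, this is the kind of path retrieval meant here, sketched in Python rather than with any mORMot call. The `get_path` helper is hypothetical and supports only `$`, `.name` and `[index]` steps, which is a very small subset of the JSONPath article linked above.

```python
import json
import re

# One path step: either ".name" or "[index]".
_STEP = re.compile(r"\.([A-Za-z_]\w*)|\[(\d+)\]")


def get_path(document, path):
    """Resolve a tiny JSONPath subset: '$' followed by .name / [index] steps."""
    if not path.startswith("$"):
        raise ValueError("path must start with '$'")
    node = document
    pos = 1
    while pos < len(path):
        match = _STEP.match(path, pos)
        if match is None:
            raise ValueError(f"unsupported path syntax at offset {pos}")
        key, index = match.groups()
        node = node[key] if key is not None else node[int(index)]
        pos = match.end()
    return node


# A list whose items hold several arrays, including arrays of arrays.
data = json.loads('[{"a": [1, 2, 3]}, {"a": [10, 20], "b": {"c": [[7, 8], [9]]}}]')

print(get_path(data, "$[1].a[0]"))
print(get_path(data, "$[1].b.c[0][1]"))
```

Wildcards, filters and recursive descent are left out on purpose; the question is only whether simple indexed paths like these are available.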

     


  6. hi,
    May I ask your opinion on adopting this JSON library?

    Although there are good repositories for Delphi,
    lately I have been comfortable with JsonTools (it correctly handled all the blobs from many devices where other libraries gave me parsing errors).

    https://www.getlazarus.org/json/
    https://www.getlazarus.org/json/tests/
    (all tests passed)
    btw. I am a loyal Delphi customer, I don't want to advertise Lazarus 🙂

     


  7. On 8/17/2024 at 6:19 PM, RDP1974 said:

    From version 8 the DLLs are inside the c:\program files\mysql\bin folder of the server.

    The latest version also needs a pair of crypto DLLs (ssleay or something similar) for the new SSL protocol authentication.

    Note: these DLLs are only for the latest 64-bit MySQL (copy them from the MySQL bin folder, or use that folder in FDPhysMySQLDriverLink).



  8. 55 minutes ago, Stefan Glienke said:

    You *must not* remove that line - Move in Delphi is allowed to be called with a negative Count (a possible result of some size calculation), resulting in a no-op in the System implementation. Passing a negative number to most C++ implementations will result in passing a value >2mio because their size parameter is unsigned.

    Also, the performance difference is hardly about that little check but the ippsMove_8u implementation.

    Thanks for the hint.

    I will correct it.
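The point about negative counts can be illustrated outside Delphi. The sketch below uses hypothetical helper names and is not the wrapper code from the repo: it shows what a negative signed length turns into when it is reinterpreted as an unsigned size, and a guarded copy that treats non-positive counts as a no-op, as the quote says System.Move does.

```python
def as_unsigned(value, bits):
    """Reinterpret a signed integer as an unsigned one of the given width."""
    return value & ((1 << bits) - 1)


def guarded_move(source, dest, count):
    """Copy count bytes from source into dest; non-positive counts do nothing."""
    if count <= 0:
        return 0  # same outcome as the check that must stay in the wrapper
    dest[:count] = source[:count]
    return count


# A size calculation that went negative, seen through an unsigned parameter.
print(as_unsigned(-1, 32))
print(as_unsigned(-1, 64))

buffer = bytearray(b"......")
guarded_move(b"abcdef", buffer, -5)   # ignored
guarded_move(b"abcdef", buffer, 3)    # copies "abc"
print(buffer)
```

Without the guard, the callee would be asked to move billions of bytes, which is why the check cannot be dropped for speed.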


  9. Look, I only wrote the Pascal wrappers; here is how to obtain the DLLs:

     

    install visual studio c++ (I use the 2019 version)

    install intel ipp https://www.intel.com/content/www/us/en/developer/tools/oneapi/ipp.html

    install python

     

    the allocator is here:

    https://www.oneapi.io/

    https://github.com/oneapi-src/oneTBB/tree/master

    See the cmake directory: run CMake to generate a Visual Studio project, then locate tbbmalloc and compile it (you should select MD multithreading static library).

     

    the simd rtl replacement:

    Install the Qt5 Python library; if I remember correctly, "pip install pyqt5".
    python "C:\Program Files (x86)\Intel\oneAPI\ipp\latest\opt\ipp\tools\custom_library_tool_python\main.py"

    This tool will build a VC script to create the custom DLL.
    Locate ippsZero_8u, ippsCopy_8u, ippsMove_8u, ippsSet_8u, ippsFind_8u, ippsCompare_8u, ippsUppercaseLatin_8u_I, ippsReplaceC_8u under signal processing, and select any others you want.

    (Indeed, many more functions could be added with this tool, especially for image processing, but I didn't have time to do it.)

    run the script to obtain the dll

     

    the zlib accelerated with ipp:

    C:\Program Files (x86)\Intel\oneAPI\ipp\2021.12\share\doc\ipp\components_and_examples_win\components\interfaces

    here you can see common libraries enhanced, check the folder zlib and open the readme, follow the instructions

     

    Tell me if you have trouble building this.

    kind regards

     


  10. hi,

    I have updated the libraries and units of the RTL patches based on the Intel IPP and oneTBB performance suites.

    these are well suited for web server application scalability on windows architecture

     

    https://github.com/RDP1974

    (tests on concurrent HTTP calls show a 13x speedup)

     

    There are enhancements to the zlib 1.3.1 options, and some random AVs (access violations) seem to be fixed.

     

    I have tested it, but if you find any problems please notify me at roberto.dellapasqua dot live.com

    kind regards

    Roberto


  11. hi,

    I want to do this with FMX:

    1- a TVertScrollBox

    2- a TLayout

    3- create a TRectangle at runtime, then fill it with a bitmap, dynamically, when needed (like adding tiles)

    I did a small test without success.

    How can I do this?

    I would also like to add event callbacks, such as OnClick.

    Also, what is the best way to draw things, call update, etc.?


  12. Oracle MySQL from version 8 is 64-bit only.

    The C connector DLLs are inside the \lib and \bin folders

    (there is no 32-bit version).

    The solution from Lajos seems a good workaround (32-bit ODBC at design time and 64-bit native at runtime).

    Thanks a lot.
