RDP1974

Members
  • Content Count: 235
  • Days Won: 1

Everything posted by RDP1974

  1. Hello, I'm updating my GitHub wrapper of the Intel Performance Libraries: the superscalar memory manager (they have fixed the memory consumption, and I will add a thread to release the cache every n minutes), the RTL SIMD foundation, and the enhanced zlib. This is one of the best industry-proven foundations; the game industry and servers rely heavily on these libs, so multithreaded Delphi apps should get an exceptional performance boost. I need your help. I built the DLLs without touching the sources: 0 errors, 0 warnings, everything optimized with O2.
     • TBB (memory manager)
     • IPP (RTL patches)
     • WebBroker zlib (64-bit only for now; I need time to patch the 32-bit zlib)
     Why doesn't the memory manager work in 32-bit? I get an exception. Also, can you help me with the 32-bit asm in the IPP file for patching the 32-bit RTL functions? In 64-bit it works perfectly, up to the stars. Thank you! IntelMM2.zip (BTW, given time and one or two volunteers, we could patch a whole set of low-level string RTL routines with the Intel SIMD libs... also VCL imaging such as JPEG, PNG, Bilinear StretchAPI, etc.)
  2. I have deleted the repositories because somebody pointed out possible problems with the Intel IPP license and redistributables (it seems Intel permits redistributing the complete 30 MB of DLLs for free with the compiled exe, but this does not cover a tiny custom DLL like the one I used). BTW, with TBBMalloc there are no redistribution problems. The sources are here: https://github.com/oneapi-src/oneTBB and a Delphi wrapper for tbbmalloc is here: https://sites.google.com/site/scalable68/intel-tbbmalloc-interfaces-for-delphi-and-delphi-xe-versions-and-freepascal
  3. https://github.com/RDP1974/Delphi-64-bit-compiler-RTL-speed-up_2
     Win32 and Win64, Intel TBB: 100x speed-up on a Core i9, Windows Server 2016.
     Wizard-generated web server app (Indy/WebBroker):

     Response.Content :=
       '<html>' +
       '<head><title>Web Server Application</title></head>' +
       '<body>Web Server Application ' + FormatDateTime('YYYY-MM-DD hh:mm:ss', Now) + ' </body>' +
       '</html>';

     win32:             60 ops/s
     win64:             45 ops/s
     win32 replacement: 6400 ops/s
     win64 replacement: 6400 ops/s

     (BTW, you will get this speed-up only in heavily multithreaded apps that rely deeply on heap allocations, strings, etc.)

     C:\ApacheBench>ab -n 100 -c 100 -k -r http://192.168.1.160:8080/
     This is ApacheBench, Version 2.3 <$Revision: 1843412 $>
     Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
     Licensed to The Apache Software Foundation, http://www.apache.org/

     Benchmarking 192.168.1.160 (be patient).....done

     Server Software:
     Server Hostname:        192.168.1.160
     Server Port:            8080
     Document Path:          /
     Document Length:        119 bytes
     Concurrency Level:      100
     Time taken for tests:   1.625 seconds
     Complete requests:      100
     Failed requests:        0
     Keep-Alive requests:    0
     Total transferred:      25500 bytes
     HTML transferred:       11900 bytes
     Requests per second:    61.54 [#/sec] (mean)
     Time per request:       1624.997 [ms] (mean)
     Time per request:       16.250 [ms] (mean, across all concurrent requests)
     Transfer rate:          15.32 [Kbytes/sec] received

     Connection Times (ms)
                   min  mean[+/-sd] median   max
     Connect:        0   15   88.4      0    516
     Processing:     0  833  579.8   1031   1547
     Waiting:        0  817  582.5   1031   1547
     Total:          0  848  578.3   1031   1547

     Percentage of the requests served within a certain time (ms)
       50%   1031
       66%   1031
       75%   1547
       80%   1547
       90%   1547
       95%   1547
       98%   1547
       99%   1547
      100%   1547 (longest request)

     C:\ApacheBench>ab -n 100 -c 100 -k -r http://192.168.1.160:8080/
     (same ApacheBench 2.3 banner as above)

     Server Software:
     Server Hostname:        192.168.1.160
     Server Port:            8080
     Document Path:          /
     Document Length:        119 bytes
     Concurrency Level:      100
     Time taken for tests:   2.156 seconds
     Complete requests:      100
     Failed requests:        0
     Keep-Alive requests:    0
     Total transferred:      25500 bytes
     HTML transferred:       11900 bytes
     Requests per second:    46.38 [#/sec] (mean)
     Time per request:       2156.256 [ms] (mean)
     Time per request:       21.563 [ms] (mean, across all concurrent requests)
     Transfer rate:          11.55 [Kbytes/sec] received

     Connection Times (ms)
                   min  mean[+/-sd] median   max
     Connect:        0   21  101.6      0    516
     Processing:     0 1186  667.9   1031   2062
     Waiting:        0 1165  672.7   1031   2062
     Total:          0 1207  662.9   1547   2062

     Percentage of the requests served within a certain time (ms)
       50%   1547
       66%   1547
       75%   1547
       80%   2062
       90%   2062
       95%   2062
       98%   2062
       99%   2062
      100%   2062 (longest request)

     C:\ApacheBench>ab -n 100 -c 100 -k -r http://192.168.1.160:8080/
     (same ApacheBench 2.3 banner as above)

     Server Software:
     Server Hostname:        192.168.1.160
     Server Port:            8080
     Document Path:          /
     Document Length:        119 bytes
     Concurrency Level:      100
     Time taken for tests:   0.016 seconds
     Complete requests:      100
     Failed requests:        0
     Keep-Alive requests:    0
     Total transferred:      25500 bytes
     HTML transferred:       11900 bytes
     Requests per second:    6396.72 [#/sec] (mean)
     Time per request:       15.633 [ms] (mean)
     Time per request:       0.156 [ms] (mean, across all concurrent requests)
     Transfer rate:          1592.93 [Kbytes/sec] received

     Connection Times (ms)
                   min  mean[+/-sd] median   max
     Connect:        0    0    0.0      0      0
     Processing:     0   15    2.2     16     16
     Waiting:        0    1    4.5      0     16
     Total:          0   15    2.2     16     16

     Percentage of the requests served within a certain time (ms)
       50%     16
       66%     16
       75%     16
       80%     16
       90%     16
       95%     16
       98%     16
       99%     16
      100%     16 (longest request)

     C:\ApacheBench>ab -n 100 -c 100 -k -r http://192.168.1.160:8080/
     (same ApacheBench 2.3 banner as above)

     Server Software:
     Server Hostname:        192.168.1.160
     Server Port:            8080
     Document Path:          /
     Document Length:        119 bytes
     Concurrency Level:      100
     Time taken for tests:   0.016 seconds
     Complete requests:      100
     Failed requests:        0
     Keep-Alive requests:    0
     Total transferred:      25500 bytes
     HTML transferred:       11900 bytes
     Requests per second:    6405.33 [#/sec] (mean)
     Time per request:       15.612 [ms] (mean)
     Time per request:       0.156 [ms] (mean, across all concurrent requests)
     Transfer rate:          1595.08 [Kbytes/sec] received

     Connection Times (ms)
                   min  mean[+/-sd] median   max
     Connect:        0    0    1.6      0     16
     Processing:     0   14    4.7     16     16
     Waiting:        0   14    4.9     16     16
     Total:          0   14    4.5     16     16

     Percentage of the requests served within a certain time (ms)
       50%     16
       66%     16
       75%     16
       80%     16
       90%     16
       95%     16
       98%     16
       99%     16
      100%     16 (longest request)

     Cheers. Bob
  4. https://github.com/RDP1974/Delphi-64-bit-compiler-RTL-speed-up_2 Give me a couple of hours to fix the 32-bit memory manager wrapper.
  5. Hello dear community, I'm converting a C library, producing static .o files. However, I'm now blocked by this linker error: [dcc64 Error] E2216 Can't handle section '.tls$' in object file. Is it possible to solve this problem with C compiler options when building the files? I'm using Clang 11. Thank you.
  6. https://quality.embarcadero.com/browse/RSP-33463
  7. This E2216 should be fixed in the Delphi compiler/linker. For example, with Clang 11 a large ecosystem of C libraries can be compiled without relying on the Visual C runtime (besides, Windows 10 ships with the VC 2015 CRT by default). So we could embed object files without redistributing any DLLs or dependencies!
  8. Hello dear Delphinius. Delphi 64 bit compiler RTL speedup: a strong performance speedup for multithreaded server apps, and Deflate compression 5x faster than gzlib for WebBroker apps. It brings your client-server experience up to the stars. I'll update it sometime. Regards.
  9. RDP1974

    new frameworks

    Do you know the Fano or Brook web frameworks? How is their quality? https://github.com/fanoframework/fano https://github.com/risoflora/brookframework This is probably the best MVC framework: https://github.com/danieleteti/delphimvcframework
  10. Hello, using Delphi 10.4.1 I make a SOAP call using XML.OmniXMLDom for the Linux target and I get EEncodingError "No mapping for the Unicode character exists in the target multi-byte code page". Looking at the raw stream, the remote server is sending bad codes for some French characters (outside the map). Do you know how to set HTTPRIO to request UTF-8 Unicode XML instead of a multi-byte code page? Thank you.
  11. RDP1974

    SOAP Client EEncodingError

    It could not be solved on the client, even after trying different XML parsers; the problem was solved server-side.
  12. The thread pool TLS cache model of TBB fits the Windows NT kernel particularly well (scheduler, quantum fibers, KI* exposed API over HAL), but sure, it consumes a lot of memory. Anyway, off-topic: I'm using, with great satisfaction, the Delphi Linux compiler with FireDAC pooling and Indy-based SOAP web services with custom SSL -> very small and very fast. Is nobody else using the same toolchain?
  13. Maybe in an old version? They are making "giant" steps forward.
  14. I'm sure the license permits distributing it for free.
  15. The license permits it. There is a tool to make a custom DLL.
  16. Indeed, TBB is open source: https://github.com/oneapi-src/oneTBB while IPP is closed source: https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/ipp.html (but with a utility to extract a custom DLL)
  17. Hi Arnaud, I admire your talent, but why do you say TBB is unusable? It's used in mainstream server and workstation products worldwide without problems.
      The DLLs? They are extracted from the royalty-free Intel TBB and IPP packages. I only wrote the Pascal wrappers; no custom source code changes were made. You can compile them yourself; I put them in the repository because many people cannot build them or don't have the time to.
      For the memory allocator: https://github.com/oneapi-src/oneTBB/releases https://github.com/oneapi-src/oneTBB/archive/v2020.3.zip -> see the folder TBBMalloc
      For the RTL SIMD patches: https://software.seek.intel.com/performance-libraries -> see IPP; run the utility to build a custom DLL and export: 'ippsZero_8u'; 'ippsCopy_8u'; 'ippsMove_8u'; 'ippsSet_8u'; 'ippsFind_8u'; 'ippsCompare_8u'; 'ippsUppercaseLatin_8u_I'; 'ippsReplaceC_8u';
      For the web deflate acceleration (5x quicker than Windows gzip, WebBroker helper provided) -> extract IPP under Linux, see the readme for how to patch the original zlib sources, then take the changed sources and compile them with MS VC++.
      Kind regards, R.
      Delphi (VCL) is still the best framework for Windows apps!
  18. Yes, it's written in the title and in the license. Custom DLL from Intel Performance libraries. Kind regards
  19. RDP1974

    SOAP Client EEncodingError

    Solved. It was not a problem with Delphi, which works perfectly; it was a problem on a customer's remote Swiss server.
  20. Just to share: I'm benchmarking a single-threaded app (poker app) and a multithreaded app (WebBroker HTTP) with different MMs. I tested FastMM5 and TBB+IPP https://github.com/RDP1974
      • FastMM5 is as fast as TBB under WebBroker with ApacheBench at 100 concurrent users (finally overcoming the FM4 problems), but TBB is 5x faster than FM5 under the TParallel class.
      • TBB is as fast as FM4/FM5 in single thread with // RedirectCode(@System.Move, @Move2); (i.e. commented out) in RDPSimd64, because small moves are faster than under TBB SIMD (condition penalty).
      So: wait for FM5 to fix the TParallel contention? Or for Delphi AVX support for the Synopse MM?
  21. RDP1974

    borderless with aero shadow

    Hello, perhaps somebody knows how to cast the Windows 10 Aero drop shadow under a borderless VCL form? I'm using this from MSDN: const MARGINS shadow_on = { 0, 0, 0, 0 }; DwmExtendFrameIntoClientArea(hwnd, &shadow_on); But the DWM compositor wants at least a 1px border in the frame, otherwise it doesn't cast the shadow. So a line in the color of the global theme will always be visible, and it is not accessible because it is part of the border. BTW, I don't want to use a composited WS_EX_LAYERED window, just a normal Delphi VCL form. Thanks, R.
  22. I had trouble calling a SOAP web service; it was solved with the 10.4.1 update.
  23. RDP1974

    a pair of MM test

    OK, I did a test with FastMM5: with 16 threads the results are identical to BigbrainMM, and single-threaded it is a little better (2501 vs 2727), about 8% quicker.
  24. RDP1974

    a pair of MM test

    Hi, look here: https://blog.digitaltundra.com/?p=902 Another MM written in Pascal, free. In my test (i9, 16 threads) it is the fastest among all the MMs tested (Tundra vs default). It uses threadvar TLS for each thread's cache.
  25. RDP1974

    a pair of MM test

    Sorry, it's because I'm in a hurry; I'll try to improve my syntax 🙂