RDP1974

Members
  • Content Count: 247
  • Days Won: 1

Everything posted by RDP1974

  1. RDP1974

    best component for web media player

    Hello. Suppose I am building a WhatsApp client application through the API, and I want to play and record the camera audio/video. Does the WebRTC component offer this capability, that is, showing video with the VP8 and H264 codecs and recording video from the camera to a file with the VP8 and H264 codecs? Or, in your opinion, is it better to use an FFmpeg wrapper to encode and to play? Does your component play video on an accelerated VCL surface (OpenGL or DirectX)? When WebRTC is installed, are the default system codecs updated, so that the VCL TMediaPlayer is enough to play VP8 and H264 through DirectShow? When FFmpeg is installed, are the default system codecs updated, so that the VCL TMediaPlayer is enough to play VP8 and H264 through DirectShow? I can install https://www.webmproject.org/ie/ so as to play all the VP8 and H264 codecs through accelerated DirectShow with the media player.
  2. Hello, I want to use GPL sources in a closed-source Delphi app. I cannot release the Delphi app under the GPL because I use commercial components. Now: 1) If I link object files ($L) compiled from GPL C code, am I obliged to open all the Delphi code and make it GPL? 2) If I link a DLL built from GPL C code, do I again need to open my code under the GPL? 3) If I build an EXE from the GPL code, keep my closed code in Delphi, and use that EXE as a bridge through RPC or shared-memory IPC, does that solve the problem of using GPL code inside a commercial Delphi application? Thanks
  3. Meanwhile I did a real test of FastMM5. Under a WebBroker standalone web server, running apachebench with ab -n 10000 -c 100 -k -r http://192.168.1.150:8080/, the FastMM5 executable performs identically to Intel TBB (both 11k reqs/sec). But with a TParallel.For benchmark I get:
      - with FastMM5: Parallel For used : 1030549 ticks / BlockThreadContentionCounts: 0/2327401/0 ThreadContention testing reported critical areas - consider to increse the corresponding CFastMM_XXXXXBlockArenaCount constant
      - with TBB: Parallel For used : 202024 ticks (5 times quicker on an I9 8/16 core)
      In a single-thread poker game benchmark: 2M with TBB, 2650k with FMM4, 2750k with FMM5. So for single-thread apps FMM5 currently wins by a small margin over FMM4, but for multithreaded use should it be tuned? I am forwarding this post to the author of FMM.
  4. (I cannot understand why they don't invest in the compiler and the RTL, which are the core value of the whole toolchain ... )
  5. For now I am going with a pair of DLLs from Intel TBB and IPP for the memory manager and the AVX-enhanced fillchar, move and pos. If I make an error in the asm conversion it will be hard to catch! Too risky 🙂
  6. Better to wait for Embarcadero to support AVX in Delphi asm ... doing it by hand needs too much time and it is easy to make errors.
  7. To translate AVX2 opcodes to Delphi, should we use an object disassembler and put the hexadecimal DB values in directly? Or translate by hand using the Intel manual and a calculator :-) Do you know if a tool similar to https://torry.net/files/vcl/experts/other/mmxasm.zip exists for AVX?
  8. Hi Arnaud, can I port your FPC memory manager to Delphi (Windows)? I'll translate the AVX code to Delphi ASM. But I need your permission, of course. Let me know. Roberto
  9. What happens when GPL code violates a patent?
  10. A silly question: IMHO this forum layout should be optimized to leave more space for the topics.
  11. Hello, there is a bug in the Indy-based WebBroker under Linux: Request.ContentFields is not filled correctly from a client Ajax POST, whereas it works correctly on Windows. Delphi 10.3.3
  12. RDP1974

    Experience/opinions on FastMM5

    Your talent is fantastic, and so is your code, but let me say a word about "TBB unusable": it is the default optimization option across the whole Visual Studio C compiler and in the main game engines... TBB and IPP are also used in Oracle Database, Adobe, Autodesk...
  13. RDP1974

    Experience/opinions on FastMM5

    I'm studying and implementing Elixir/PhoenixWeb/Erlang on FreeBSD/Linux. It is simply incredible! From HTTP MVC with routes/controller/ORM to websocket channels, linear scalability up to millions of sockets per single server with µsec latency and fault tolerance (it's a VM with a user-level scheduler and signaling)... all within a handful of lines. (A benchmark shows 100,000 reqs/sec from an MVC/Postgres ORM JSON render on a single server; furthermore, you can change the code inside the VM while it is running, so you can update pieces of the running app without closing it.) https://www.phoenixframework.org/ https://elixir-lang.org/
  14. RDP1974

    Experience/opinions on FastMM5

    "They" should move if they want to jump on the bandwagon of parallel computing (IMHO, within 5 years it will be the de facto standard, with dozens or hundreds of CPU cores as the norm). It is hard to beat Elixir, Erlang, Go or those functional programming languages that offer built-in horizontal and vertical scalability (userland scheduler with lightweight fibers, kernel threads, multiprocessing over CPU hardware cores, machine clustering... without modifying a line of code) 🙂
  15. RDP1974

    Experience/opinions on FastMM5

    Hi, look at https://github.com/RDP1974/Delphi64: there I have patched "key" RTL functions (move, fillchar, pos) with the SIMD-enhanced versions from the Intel libraries: https://github.com/RDP1974/Delphi64/blob/master/RDPSimd64.pas So I did a TBB allocator wrapper, a SIMD RTL patch, and an Intel version of Zlib for HTTP deflate (5x faster than gzip). The results are outstanding, tested by coders at "famous" companies:
      - A test with Indy, the built-in Delphi TCP library, on an I7 CPU, shows an improvement from 6934.29 ops/sec to 23097.68 ops/sec.
      - Another test with WebBroker HTTP compression, on an I7 CPU, shows an improvement from 147 pages/sec to 722 pages/sec.
      - Another test with a DMVC web API, on an I9 CPU and Windows 2016, simulating 10000 requests and 100 users with apachebench, shows an improvement from 111 reqs/sec to 6448 reqs/sec.
      - Another test, an ISAPI on an I9 CPU and Windows 2016, doing in sequence DB query -> dataset of 1500 lines x 10 rows -> serialize to JSON string -> shrink it with deflate, serves 2000 HTTP reqs/sec, correctly loading all the CPU cores.
    As far as I have read in the TBB code, it seems that the speed is obtained by using per-thread TLS (threadvar): when an app thread asks for memory, the allocator provides an already prepared zone that acts as a cache (I'm not sure of this). If you wish, feel free to test my lib and see whether the behavior can be reproduced. As far as I have seen, a fast move, fillchar and pos (used in a lot of classes) plus a lock-free allocator (without branch jumps etc.) should be enough to get a Win64 speedup. (Anyway I agree with you, we should benchmark real cases.) Thank you.
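    The per-thread cache idea mentioned in this post can be illustrated with a minimal sketch. This is Python rather than Delphi, it is not taken from TBB or from the Delphi64 library, and the class and method names are hypothetical; it only shows the general pattern of serving repeated allocations from a thread-local free list without taking a lock.

```python
import threading


class ThreadCachedAllocator:
    """Toy allocator: each thread recycles blocks from its own free list."""

    def __init__(self, block_size):
        self.block_size = block_size
        self._tls = threading.local()    # one free list per thread
        self._lock = threading.Lock()    # only taken on the slow path
        self.global_allocs = 0           # how often the slow path ran

    def _cache(self):
        # Lazily create the calling thread's private free list.
        cache = getattr(self._tls, "free_blocks", None)
        if cache is None:
            cache = self._tls.free_blocks = []
        return cache

    def alloc(self):
        cache = self._cache()
        if cache:
            return cache.pop()           # fast path: no lock, no contention
        with self._lock:                 # slow path: shared state is touched
            self.global_allocs += 1
        return bytearray(self.block_size)

    def free(self, block):
        # The block goes back to the freeing thread's cache.
        self._cache().append(block)


allocator = ThreadCachedAllocator(block_size=64)
first = allocator.alloc()
allocator.free(first)
second = allocator.alloc()               # served from the thread's cache
print(second is first, allocator.global_allocs)
```

    In this sketch only the first allocation on a thread reaches the locked path; whether a real allocator behaves this way is exactly the part the post says it is unsure about.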
  16. RDP1974

    Experience/opinions on FastMM5

    Hello @Pierre le Riche, thank you for this great piece of code (FastMM5). I have a suggestion to make it quicker: in my TBB wrapper I replaced FillChar (which is very slow under Delphi 64-bit) with a SIMD version (Intel IPP, AVX-512 etc.). Further, you are pre-allocating pieces of virtual memory. Perhaps you could build a quick hash- or binary-tree-based cache of blocks that are already filled with zeroes, maybe maintained by a background thread with minimal priority. Then, when the MM calls Alloc, the fillchar is not needed, because the block is already filled with zeroes. IMHO in a multithreaded stress test this will boost the performance! I don't mind the virtually allocated RAM being bigger, since the Windows kernel uses only what is "really used" (hard for me to explain :-)). Further, as far as I have read about those new allocators, they pre-allocate RAM in a TLS cache and dispatch a thread pool (of course with a big RAM allocation, but it is virtual, so who cares?) to avoid race conditions and global locking. (Sorry if this info is useless.) Kind regards, Roberto
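    A minimal sketch of the pre-zeroed block cache suggested in this post, written in Python only to show the shape of the idea. It is not FastMM5 code; the class name, the queue-based cache and the 0.05 s polling interval are assumptions made for illustration, and Python cannot set the low thread priority the post mentions.

```python
import queue
import threading


class ZeroedBlockPool:
    """Keeps a small stock of zero-filled blocks ready for alloc()."""

    def __init__(self, block_size, capacity):
        self.block_size = block_size
        self._ready = queue.Queue(maxsize=capacity)
        self._stop = threading.Event()
        self._worker = threading.Thread(target=self._refill, daemon=True)
        self._worker.start()

    def _refill(self):
        # Background thread: zero a block ahead of time, then wait for room.
        while not self._stop.is_set():
            block = bytearray(self.block_size)   # bytearray(n) is zero-filled
            while not self._stop.is_set():
                try:
                    self._ready.put(block, timeout=0.05)
                    break
                except queue.Full:
                    pass                          # cache is full, retry later

    def alloc(self):
        try:
            return self._ready.get_nowait()       # block was zeroed in advance
        except queue.Empty:
            return bytearray(self.block_size)     # fallback: zero it now

    def close(self):
        self._stop.set()
        self._worker.join()


pool = ZeroedBlockPool(block_size=4096, capacity=8)
block = pool.alloc()
print(len(block), any(block))
pool.close()
```

    The caller gets a zeroed block either way; the cache only moves the zeroing cost off the allocating thread when a prepared block happens to be available.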
  17. The work scenario can differ; a thread pool using the heap will benefit a lot from TBB+IPP. But, memory apart, I wish Embarcadero would update Delphi and the linker to accommodate modern C libraries ($TLS). Kind regards
  18. RDP1974

    Experience/opinions on FastMM5

    See this post FastMM5 still 5x slower than the best C allocators
  19. I did a test of your console bench using FastMM4, FastMM5, and the optimized Intel Delphi64 TBB (feel free to use it). The result on VMware, 8 vCPU, I9 5 GHz, Windows 2016 Server: FastMM5 is 4x faster than FastMM4; Intel TBB is 5x faster than FastMM5 and 18x faster than FastMM4. These new-generation allocators based on a TLS cache are faster and are used in production (I see game engines such as Unreal using TBB by default). Visual Studio C and C++ have an option to optimize using TBB and IPP. Further, they are better suited for memory error discovery and are tested for 24/7/365 use. In my humble opinion Delphi should license TBB from Intel (it has a free OSS license) and port it to CLANG, rewriting the missing $TLS API runtime. The WINAPI headers dependency of msvcrt should be avoided by using the C++Builder winapi 7.0 repository. This should be used on Win32, Win64, Android, Linux, iOS and OSX. Another cool free C allocator is Microsoft's mimalloc. (IMHO Delphi 64-bit can have a nice place in cloud and distributed web apps; with a modern allocator it can compete with Rust, Erlang, Go.)

      C:\Exes>FastMM5ConsoleTest_F4
      Parallel For used : 1479456 ticks
      Parallel For used : 1593960 ticks
      Parallel For used : 1492162 ticks
      Parallel For used : 1516575 ticks
      Parallel For used : 1504889 ticks
      Parallel For used : 1616684 ticks
      Parallel For used : 1694674 ticks
      Parallel For used : 1659002 ticks
      Parallel For used : 1509797 ticks
      Parallel For used : 1623232 ticks
      Parallel For used : 1549025 ticks
      Parallel For used : 1768947 ticks
      Parallel For used : 1860454 ticks
      Parallel For used : 1813156 ticks
      Parallel For used : 2014587 ticks
      Parallel For used : 1896651 ticks
      Parallel For used : 1918023 ticks
      Parallel For used : 1869937 ticks
      Parallel For used : 1832852 ticks
      Parallel For used : 1855156 ticks
      Done. Press ENTER to exit

      C:\Exes>FastMM5ConsoleTest_F5 (FastMM_SetOptimizationStrategy(mmosOptimizeForSpeed))
      Parallel For used : 429409 ticks
      Parallel For used : 428977 ticks
      Parallel For used : 439715 ticks
      Parallel For used : 431561 ticks
      Parallel For used : 441682 ticks
      Parallel For used : 448713 ticks
      Parallel For used : 457904 ticks
      Parallel For used : 451374 ticks
      Parallel For used : 420869 ticks
      Parallel For used : 433840 ticks
      Parallel For used : 428119 ticks
      Parallel For used : 426678 ticks
      Parallel For used : 431399 ticks
      Parallel For used : 432025 ticks
      Parallel For used : 429793 ticks
      Parallel For used : 420178 ticks
      Parallel For used : 422983 ticks
      Parallel For used : 433726 ticks
      Parallel For used : 426557 ticks
      Parallel For used : 418806 ticks
      Done. Press ENTER to exit

      C:\Exes>FastMM5ConsoleTest_Intel
      Parallel For used : 85910 ticks
      Parallel For used : 82550 ticks
      Parallel For used : 84917 ticks
      Parallel For used : 81707 ticks
      Parallel For used : 81077 ticks
      Parallel For used : 80789 ticks
      Parallel For used : 81069 ticks
      Parallel For used : 81506 ticks
      Parallel For used : 85098 ticks
      Parallel For used : 84156 ticks
      Parallel For used : 84978 ticks
      Parallel For used : 81699 ticks
      Parallel For used : 84017 ticks
      Parallel For used : 79480 ticks
      Parallel For used : 80324 ticks
      Parallel For used : 80736 ticks
      Parallel For used : 83380 ticks
      Parallel For used : 84887 ticks
      Parallel For used : 78052 ticks
      Parallel For used : 82792 ticks
      Done. Press ENTER to exit
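    To re-check the ratios quoted in this post, here is a minimal Python sketch (not part of the original Delphi console test) that parses lines of the form "Parallel For used : N ticks" and compares mean tick counts. The function names are hypothetical, and the sample uses only the first three runs of each executable copied from the output above; paste the full output to compare all twenty runs.

```python
import re
from statistics import mean

# Matches one benchmark line, e.g. "Parallel For used : 1479456 ticks".
TICK_LINE = re.compile(r"Parallel For used\s*:\s*(\d+)\s*ticks")


def parse_ticks(console_output):
    """Return every tick count found in the benchmark console output."""
    return [int(value) for value in TICK_LINE.findall(console_output)]


def speedup(baseline, candidate):
    """How many times faster the candidate is, comparing mean ticks."""
    return mean(baseline) / mean(candidate)


# First three runs of each executable, copied from the post.
fastmm4 = parse_ticks(
    "Parallel For used : 1479456 ticks Parallel For used : 1593960 ticks "
    "Parallel For used : 1492162 ticks"
)
fastmm5 = parse_ticks(
    "Parallel For used : 429409 ticks Parallel For used : 428977 ticks "
    "Parallel For used : 439715 ticks"
)
intel_tbb = parse_ticks(
    "Parallel For used : 85910 ticks Parallel For used : 82550 ticks "
    "Parallel For used : 84917 ticks"
)

print(f"FastMM5 vs FastMM4: {speedup(fastmm4, fastmm5):.1f}x")
print(f"TBB vs FastMM5:     {speedup(fastmm5, intel_tbb):.1f}x")
print(f"TBB vs FastMM4:     {speedup(fastmm4, intel_tbb):.1f}x")
```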
  20. RDP1974

    borderless with aero shadow

    I know, but I need VCL 🙂
  21. RDP1974

    borderless with aero shadow

    No, the canvas is inside the external frame. The solution in plain API is here: https://stackoverflow.com/questions/22165258/how-to-create-window-without-border-and-with-shadow-like-github-app/44489430#44489430
      - Create the window with the WS_CAPTION style.
      - Call the DwmExtendFrameIntoClientArea DWM API, passing a 1 pixel top margin.
      - Handle the WM_NCCALCSIZE message: do not forward the call to DefWindowProc while processing this message, just return 0.
    (https://stackoverflow.com/questions/43818022/borderless-window-with-drop-shadow)
  22. RDP1974

    borderless with aero shadow

    Thank you, but the problem is the 1px frame in the color of the theme title. I have read a C++ example that I will try in Delphi; it needs a return value from the paint API, where the VCL uses a procedure without a return value 😕
  23. Hello, I did a good benchmark to test the Delphi Linux compiler. Summarizing the setup:
      - server I9 8core with Debian 10 and MySQL8
      - server I9 8core with ClearLinux and Apache
      - server I9 8core with Windows 2016 and IIS10
      - I7 client with apachebench
      The WebBroker app on Windows or ClearLinux connects to the Debian MySQL server using pooled FireDAC connections. I don't want to benchmark different RDBMSs, only the IIS-ISAPI_webbroker and Apache-mod_webbroker layers. Multiple queries are run against a set returning thousands of lines x ten columns; then the dataset is serialized to a REST string using the DMVC serializers. The results, using apachebench ab -n 1000 -c 10 -k -r:
      - Default Delphi 64bit IIS 10: 143 reqs/sec
      - Default Delphi 64bit Linux Apache: 554 reqs/sec
      - Delphi 64bit IIS 10 with RDP Intel TBB and Intel IPP libs: 567 reqs/sec (you can download those libs from my site)
      With a small text output instead of the DB, both Windows and Linux sustain 10000 reqs/sec. So the Linux compiler performs great and is very reliable. Under Apache I had errors when raising the number of concurrent users; this needs manual tuning in the Apache config files (IIS is autotuning). Congratulations Emba!

      ------- WINDOWS IIS ISAPI Default
      Server Software: Microsoft-IIS/10.0
      Server Hostname: /
      Server Port: 80
      Document Path: /
      Document Length: 162716 bytes
      Concurrency Level: 100
      Time taken for tests: 6.974 seconds
      Complete requests: 1000
      Failed requests: 0
      Keep-Alive requests: 1000
      Total transferred: 162952000 bytes
      HTML transferred: 162716000 bytes
      Requests per second: 143.38 [#/sec] (mean)
      Time per request: 697.430 [ms] (mean)
      Time per request: 6.974 [ms] (mean, across all concurrent requests)
      Transfer rate: 22817.02 [Kbytes/sec] received
      Connection Times (ms)
                   min  mean[+/-sd]  median   max
      Connect:       0     0    0.5       0    12
      Processing:   47   657  330.7     511  2396
      Waiting:       8   655  331.0     508  2396
      Total:        47   657  330.7     511  2396
      Percentage of the requests served within a certain time (ms)
        50%    511
        66%    578
        75%    825
        80%    969
        90%   1203
        95%   1291
        98%   1500
        99%   1732
       100%   2396 (longest request)

      ------- WINDOWS IIS ISAPI with Intel TBB IPP
      Server Software: Microsoft-IIS/10.0
      Server Hostname: /
      Server Port: 80
      Document Path: /
      Document Length: 162716 bytes
      Concurrency Level: 100
      Time taken for tests: 1.762 seconds
      Complete requests: 1000
      Failed requests: 0
      Keep-Alive requests: 1000
      Total transferred: 162952000 bytes
      HTML transferred: 162716000 bytes
      Requests per second: 567.56 [#/sec] (mean)
      Time per request: 176.192 [ms] (mean)
      Time per request: 1.762 [ms] (mean, across all concurrent requests)
      Transfer rate: 90317.94 [Kbytes/sec] received
      Connection Times (ms)
                   min  mean[+/-sd]  median   max
      Connect:       0     0    0.8       0     8
      Processing:   23   159   64.5     148   387
      Waiting:       8   157   64.7     145   383
      Total:        23   159   64.3     148   387
      Percentage of the requests served within a certain time (ms)
        50%    148
        66%    153
        75%    160
        80%    160
        90%    266
        95%    312
        98%    355
        99%    363
       100%    387 (longest request)

      ------ APACHE MOD CLEARLINUX
      Server Software: Apache/2.4.41
      Server Hostname: /
      Server Port: 80
      Document Path: /
      Document Length: 162778 bytes
      Concurrency Level: 10
      Time taken for tests: 1.804 seconds
      Complete requests: 1000
      Failed requests: 0
      Keep-Alive requests: 996
      Total transferred: 162992068 bytes
      HTML transferred: 162778000 bytes
      Requests per second: 554.27 [#/sec] (mean)
      Time per request: 18.042 [ms] (mean)
      Time per request: 1.804 [ms] (mean, across all concurrent requests)
      Transfer rate: 88224.32 [Kbytes/sec] received
      Connection Times (ms)
                   min  mean[+/-sd]  median   max
      Connect:       0     0    0.1       0     2
      Processing:   11    18    3.0      17    35
      Waiting:       9    16    2.9      16    33
      Total:        11    18    3.0      17    35
      Percentage of the requests served within a certain time (ms)
        50%     17
        66%     19
        75%     20
        80%     20
        90%     22
        95%     24
        98%     26
        99%     27
       100%     35 (longest request)
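    The derived figures in these apachebench reports follow from three inputs: completed requests, total time and concurrency level. The short Python sketch below recomputes them; the function name is hypothetical, and small differences in the last digit are expected because ab works from the unrounded elapsed time while the report prints it to three decimals.

```python
def ab_summary(complete_requests, seconds, concurrency):
    """Recompute ab's headline numbers from the raw counters.

    Returns (requests per second,
             mean time per request in ms,
             mean time per request across all concurrent requests in ms).
    """
    requests_per_second = complete_requests / seconds
    per_request_ms = concurrency * seconds * 1000 / complete_requests
    per_request_all_ms = seconds * 1000 / complete_requests
    return requests_per_second, per_request_ms, per_request_all_ms


# Inputs copied from the three reports in the post.
runs = {
    "IIS ISAPI default": (1000, 6.974, 100),
    "IIS ISAPI with TBB/IPP": (1000, 1.762, 100),
    "Apache mod on ClearLinux": (1000, 1.804, 10),
}
for name, (requests, seconds, concurrency) in runs.items():
    rps, per_req, per_req_all = ab_summary(requests, seconds, concurrency)
    print(f"{name}: {rps:.2f} req/s, {per_req:.3f} ms, {per_req_all:.3f} ms")
```

    Note that the mean time per request scales with the concurrency level, so the 697 ms and 18 ms figures are not directly comparable: the IIS runs report a concurrency level of 100 and the Apache run a level of 10.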
  24. When you have DoS/DDoS protection in Apache, for example through the qos_module, you will see a lot of failed requests in the output of the command. This happens because the protection is indeed working: as mentioned, the ab tool basically floods your server with requests, so a lot of requests from the same IP are automatically blocked by the Apache module. Indeed, I see that the performance of a Delphi Apache module or Indy web application, with FireDAC and data middleware manipulation, is brilliant under Linux. I am waiting for the compiler optimization before redoing the benchmark.
  25. On the other hand, for the SciMark benchmark the LLVM compiler needs a complete optimization overhaul: https://quality.embarcadero.com/browse/RSP-28006