
Arnaud Bouchez

Members
  • Content Count: 315
  • Days Won: 22

Everything posted by Arnaud Bouchez

  1. Cross-platform support was prepared: there are OS-specific units, but the POSIX versions were never finished nor tested IIRC, since Eric (the maintainer) didn't need anything outside Windows. As stated by Eric in his blog https://www.delphitools.info/2018/04/20/dwscript-transition-to-delphi-10-2-3/#more-3949 : Darwin/Linux support may be feasible, but Mobile platforms would require some ARM low-level stuff, which may not be easy to do.
  2. DWScript is my favorite. Its syntax is modern, and its implementation is very clean. It even has a JIT! The problem is that it is not cross-platform yet. The veteran PascalScript is my favorite if cross-platform is needed. It is stable, and has been widely used for years.
  3. Not that I have seen. You can run Delphi 7 with non-admin rights, as long as you do not install it in "c:\program files". This has not changed since Vista.
  4. I don't see Delphi 7 being slow on Windows 10, with the built-in antivirus/antimalware. I installed it in a c:\Progs\Delphi7 folder, not in the default "c:\program files" sub-folder. Make sure you have installed the https://www.idefixpack.de/blog/ide-tools/delphispeedup/ tool.
  5. Arnaud Bouchez

    a pair of MM test

    On Windows, we use the http.sys kernel-mode server, which scales better than anything else on this platform. It is faster than IOCP since it runs in the kernel. On Linux, we use our own thread pool of socket servers, with an nginx frontend as reverse proxy over the unix socket loopback, handling HTTPS and HTTP/2. This is very safe and scalable.

    And don't trust micro benchmarks. Even worse, don't write your own benchmark: it won't be as good as measuring a real application. As I wrote, Intel TBB is a no-go for real server work due to its huge memory consumption. If you have to run some specific API calls to release the memory, this is a big design flaw - it may be considered a bug (we don't want the application to stall as it would with a GC) - and we would never do it.

    To be more precise, we use long-living threads from thread pools. So in practice, the threads are never released, and memory allocation and memory release are done in different threads: one thread pool handles the socket communication, then another thread pool consumes the data and releases the memory. This scenario is typical of most event-driven servers running on multi-core CPUs, with a proven ring-oriented architecture. Perhaps Intel TBB is not very good at releasing memory with such a pattern - whereas our SynFPCx64MM is very efficient in this case. And we almost never realloc - we just alloc/free, using the stack as a working buffer if necessary.
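The allocate-in-one-pool, release-in-another pattern described above can be sketched in a few lines. This is only an illustration in Python, not the mORMot implementation: the function `run_pipeline`, its parameters, and the queue sizes are all hypothetical names chosen for the example.

```python
import queue
import threading


def run_pipeline(n_blocks=100, block_size=4096, n_receivers=2, n_workers=2):
    """Allocate buffers in one pool of threads and release them in another."""
    handoff = queue.Queue(maxsize=64)  # bounded queue between the two pools
    todo = queue.Queue()
    for i in range(n_blocks):
        todo.put(i)
    total = 0
    total_lock = threading.Lock()

    def receiver():
        # Stands in for the socket pool: allocate one buffer per incoming block.
        while True:
            try:
                todo.get_nowait()
            except queue.Empty:
                return
            handoff.put(bytearray(block_size))

    def worker():
        # Stands in for the processing pool: consume the buffer, then drop it.
        nonlocal total
        while True:
            buf = handoff.get()
            if buf is None:  # sentinel: no more data
                return
            with total_lock:
                total += len(buf)
            del buf  # the last reference is dropped on this thread

    receivers = [threading.Thread(target=receiver) for _ in range(n_receivers)]
    workers = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in receivers + workers:
        t.start()
    for t in receivers:
        t.join()
    for _ in workers:
        handoff.put(None)  # one sentinel per worker
    for t in workers:
        t.join()
    return total
```

Calling `run_pipeline()` returns the number of bytes that crossed from the first pool to the second. In a native server the interesting part is what the memory manager does when the freeing thread is never the allocating thread, which is exactly the case a real workload exercises and a micro benchmark usually does not.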
  6. Arnaud Bouchez

    a pair of MM test

    TBB is fast in benchmarks, but in our experience it is not usable in production on a server. TBB consumes A LOT of memory, much more than FM4/FM5 and alternatives. Numbers for a real multi-threaded Linux server are a show stopper for using TBB. In production on a huge multi-Xeon server, RAM consumption after a few hours of stabilisation is glibc=2.6GB vs TBB=170GB - more than 60 times more memory! With almost no actual performance boost. This mORMot service handles TB of incoming data, sent in blocks every second, with thousands of simultaneous HTTPS connections. See https://github.com/synopse/mORMot/blob/master/SynFPCCMemAligned.pas#L55

    So never trust any benchmark. Try with your real workload.

    What we found out with https://github.com/synopse/mORMot/blob/master/SynFPCx64MM.pas may be interesting for the discussion. Using AVX for medium blocks moves/realloc makes no difference in practice compared to an inlined SSE2 move (tiny/small/medium blocks), or a non-temporal move (using the movntdq opcode instead of plain mov - for large blocks). For large blocks, using mremap/VirtualAlloc in-place reallocation is a better approach: relying on the OS and performing no move is faster than AVX/AVX2/AVX512.

    SynFPCx64MM is currently only for FPC. It is used in production on heavily loaded servers. It is based on the FastMM4 design, fully optimized in x86_64 asm, but with a lockless round-robin algorithm for tiny blocks (<=256 bytes), and an optional lockless list for FreeMem - which are the bottleneck for most actual servers. It has several spinning alternatives in case of contention. And it is really Open Source - not like FastMM5. We may publish a Delphi-compatible version in the coming weeks.
  7. Arnaud Bouchez

    Zeos 7.3 entered the beta phase.

    It is in records per second, so the higher the better. And it includes the ORM layer - whose overhead is very low in practice. You can see for instance that if you use TDataSet (and the units depending on DB.pas) then reading one record/object has a noticeable overhead, compared to direct DB access, as ZDBC does or our direct SynDB classes. For a reference documentation, with some old numbers, you may check https://synopse.info/files/html/Synopse mORMot Framework SAD 1.18.html#TITL_59 Edit: you will see that the ORM also supports MongoDB as backend. Pretty unique.
  8. Arnaud Bouchez

    Zeos 7.3 entered the beta phase.

    Some discussion with numbers comparing ZDBC/Zeos 7.3 beta to alternatives is available at https://synopse.info/forum/viewtopic.php?pid=32916#p32916
  9. Arnaud Bouchez

    Zeos 7.3 entered the beta phase.

    Great News! Zeos is IMHO the best data access library for Delphi and FPC. And it is Open Source. The direct ZDBC layer has tremendous performance, and a lot of work has been done for this 7.3 upcoming branch.
  10. Arnaud Bouchez

    Free Resource Builder Utility?

    I don't see why - but I never used 24-bit RGB icons... 16-bit ones are good enough...
  11. I would try to disable 3rd party packages first, and re-enable them one by one.
  12. Arnaud Bouchez

    System.GetMemory returning NIL

    If the memory pages are swapped to disk, then indeed it will be slow to dereference the pointer. But in this case, the application is very badly designed: paging to disk should be avoided in all cases, and direct disk API calls should be made instead to flush the data. The problem is not the use of the MM. The problem is the whole memory allocation design in the application. Less memory should be allocated. This is what @David Heffernan wrote, and you didn't get his point about swapping.

    If the memory page is not on disk - then you may have a cache miss when the pointer is dereferenced. For big memory blocks, it won't hurt. Calling VirtualFree will definitely take much more CPU than a cache miss. So I still don't see the relevance of your argument.

    Last but not least, the article you quoted (without any benchmark or code to prove the point) is very specific to the memory use of a database engine, which claims to be the fastest on the embedded market. I have doubts every time I read such a claim and don't see actual code. It reads more like technical marketing arguments than real data. Raima DB features "needing 350KB of RAM" and "optimized to run on resource-constrained IoT edge devices that require real-time response". So what is the point of benchmarking the handling of GB of RAM? The whole https://raima.com/sqlite-vs-rdm/ is full of FUD. The graphs are a joke. Since they don't even show the benchmark code, I guess they didn't even use a fair comparison and used SQLite in default mode - whereas with exclusive mode and in-memory journal, SQLite3 can be really fast. We have benchmarks and code to show that with mORMot - https://synopse.info/files/html/Synopse mORMot Framework SAD 1.18.html#TITL_60 and https://synopse.info/files/html/Synopse mORMot Framework SAD 1.18.html#TITLE_140 (current numbers are even higher). You may have to find better references.
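For readers unfamiliar with the two SQLite settings mentioned above, here is a minimal sketch using Python's standard `sqlite3` module. The `PRAGMA locking_mode` and `PRAGMA journal_mode` statements are standard SQLite; the helper names `open_fast` and `demo`, the file name, and the table layout are made up for the example, and this is not the mORMot benchmark code.

```python
import os
import sqlite3
import tempfile


def open_fast(path):
    """Open a database file in exclusive locking mode with an in-memory journal."""
    conn = sqlite3.connect(path)
    # Each PRAGMA returns the mode that is now in effect.
    locking = conn.execute("PRAGMA locking_mode=EXCLUSIVE").fetchone()[0]
    journal = conn.execute("PRAGMA journal_mode=MEMORY").fetchone()[0]
    return conn, locking, journal


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        conn, locking, journal = open_fast(os.path.join(tmp, "bench.db"))
        conn.execute("CREATE TABLE t(id INTEGER PRIMARY KEY, v TEXT)")
        with conn:  # one transaction for the whole batch
            conn.executemany(
                "INSERT INTO t(v) VALUES (?)", (("x",) for _ in range(1000))
            )
        count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
        conn.close()
        return locking, journal, count
```

Both settings trade safety for speed: an in-memory journal can leave the file corrupted if the process crashes in the middle of a transaction, and exclusive locking prevents other processes from opening the database. A fair comparison should state which modes were used on each side.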
  13. Arnaud Bouchez

    System.GetMemory returning NIL

    @Mahdi Safsafi Your article refers to the C malloc on Windows - which is known to be far from optimized - much less optimized than the Delphi MM. For instance, the conclusion of the article doesn't apply to the Delphi MM: "If you have an application that uses a lot of memory allocation in relatively small chunks, you may want to consider using alternatives to malloc/free on Windows-based systems. While VirtualAlloc/VirtualFree are not appropriate for allocating less than a memory page they can greatly improve database performance and predictability when allocating memory in multiples of a single page.". This is exactly what FastMM4 does. When I wrote that fragmentation won't increase for HUGE blocks, I meant blocks larger than a few MB. With such a size, I would probably reuse the very same buffer per thread if performance is needed. @Kas Ob. You are just proving my point: if you use very specific OS calls, you may need a buffer aligned on a memory page.
  14. Arnaud Bouchez

    System.GetMemory returning NIL

    Allocating 4KB more for huge blocks is not an issue. If you want the buffer aligned with system page granularity, then it is a very specific case, only needed by other OS calls, like changing the memory protection flags. It is theoretically possible, but very rare. This is the only case where the internal MM should not be used. If you expect to see any performance benefit from using page-aligned memory, you are pretty wrong for huge blocks - it doesn't change anything in practice. The only way to increase performance with huge blocks of memory is by using non-volatile/non-temporal asm opcodes (movnti e.g.), which won't populate the CPU cache. But this is only possible with raw asm, not Delphi code, and will clearly be MM independent.
  15. Arnaud Bouchez

    System.GetMemory returning NIL

    Delphi MM (FastMM4) is just a wrapper around the OS API for big blocks. There is no benefit in calling the OS function directly, which is system-specific and unsafe. Just use getmem/freemem/reallocmem everywhere.
  16. RTC is a great set of communication classes. kbmMW and mORMot are not only communication layers, but full toolboxes, with plenty of other features. They have very different philosophies/designs. You may also take a look at the TMS components. One big difference is that mORMot is fully Open Source, is used in several critical projects so is highly maintained, and works fine with FPC on a variety of platforms - we use it on production Linux x86_64 servers to handle TB of data from thousands of clients. The fact that mORMot SOA can use interfaces to define the services, and WebSockets for real-time callbacks from server to client, makes it unique. There is a full refactoring called mORMot2 which should be finished next October (I hope).
  17. Fair remark. This was the point of the FAQ: https://synopse.info/files/html/Synopse mORMot Framework SAD 1.18.html#TITL_123 Also check https://tamingthemormot.wordpress.com/ blog entries.
  18. I am confused by this description. What do you call a "user"? Is it a client app? Then why should this client logic be compiled within the main module? Please refine your system description. If you meant having your main service call some sub-functions for custom processing, then you should rather call separate services. An embedded dll (or com server, whatever) is a sure way to make your system unstable. If the dll has some memory leaks or writes somewhere in memory, you may corrupt the main process.... So I would isolate the custom logic into one or several dedicated services - so point 3. Perhaps you need to refine the design of your SOA solution. You don't need a monolithic REST server. The best practice today for several services is to create MicroServices. When a Windows Service "does a lot of stuff", from my point of view it does not sound like a very maintainable design. The first step would be to split the main service into smaller services, then put an orchestrator/application service as frontend, calling third-party services if necessary. Perhaps some of the design parts of our framework documentation may help. Check http://mormot.net
  19. Try to push a pull request to the original project, if it is Open Source. If it is useful to you, it may be useful to others. The pull request may not be directly merged, since the project owners may have some requirements (testing, code format, comments, documentation, cross-platform...). It is a great way to enhance your abilities, and give back to the community. For 3rd party non-free components, it is more difficult. You may use the branch feature of a SCM (git or fossil e.g.) to backport the original 3rd party code updates to your patched branch.
  20. Arnaud Bouchez

    ANN: TMS Web Core for Visual Studio Code - Public Beta

    This is a huge step forward, allowing RAD development of client applications using Visual Studio Code. Even young developers, unwilling to use Delphi, could use the tool and develop the frontend. Then the server side can still use Lazarus or Delphi, and existing code for the business logic and the data access.
  21. Discussion about FPC / Delphi CMR is available here: https://forum.lazarus.freepascal.org/index.php?topic=43143.0 So not compatible yet. But I guess that FPC - in {$mode Delphi} will eventually be Delphi compatible. See https://wiki.freepascal.org/management_operators as reference.
  22. Arnaud Bouchez

    Have you seen CompilerExplorer?

    Old versions of FPC, sadly. Gareth did some optimizations included in 3.2 and trunk... worth seeing them live in the generated asm! I have seen the generated asm being improved in FPC for years. I sadly can't say the same for Delphi - especially on cross-platform, e.g. about inlining floating point operations. I have used godbolt for years to check the asm generated by the latest gcc with a high level of optimization and opcodes. It is useful to have some reference code when writing some SSE2 or AVX/AVX2 asm.
  23. Arnaud Bouchez

    question about GPL and Delphi

    @RDP1974 Yes, we may make it Delphi compatible. The only big difference with FastMM4 is the lock-less round-robin of tiny blocks, and a different repartition. Also the locking strategies are not the same as in FastMM4. Then some micro-optimizations during the refactoring. We will see how it works with Delphi Win64.
  24. Nice! Thanks for sharing.
  25. Side note: see for instance the ProtectMethod use of https://synopse.info/files/html/Synopse mORMot Framework SAD 1.18.html#TITLE_57 which is similar. What I miss with records is inheritance. Static inheritance, I mean, not virtual inheritance. This is why I always liked the `object` type, and find it weird that it has been deprecated and buggy since Delphi 2010... Check https://blog.synopse.info/?post/2013/10/09/Good-old-object-is-not-to-be-deprecated-it-is-the-future almost seven years ago... already... I tend to search for alternatives not in C++, where alternate memory models were added later on - as with Delphi. But in new languages like Rust, which has a new built-in memory model. Rust memory management is perhaps the most interesting/tricky/difficult/powerful/promising (pick your own set) feature of this language. Both C++ and Delphi suffer from the complexity of all their memory models. COW, TComponent, interface, variant, new/dispose, getmem/freemem, create/free... it may be confusing for newcomers. My guess is that 80% of Delphi RAD users seldom allocate memory manually (just for a few TStringList), and rely on the TComponent ownership and visual design. At least they removed ARC from the landscape! 🙂 CMR is a nice addition - which FPC has featured for some time, by the way... We still need to check the performance impact of this initial release - writing an efficient RTL has not been the main focus of Embarcadero in recent years. I hope it won't reduce regular record/class performance.