Arnaud Bouchez


  1. Is Move the fastest way to copy memory?

    L1 cache access time makes a huge difference. http://blog.skoups.com/?p=592 You could retrieve the L1 cache size, then work on buffers of about 90% of that size (always keeping some room for the stack, lookup tables and such). Then, if you work directly in the API buffer, a non-temporal move to the result buffer may help a little. During your processing, if you use lookup tables, ensure they don't pollute the cache. But profiling is the key, for sure. Guesses are wrong most of the time...
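A minimal sketch of the chunking idea above, in Delphi-style pascal. The L1 size is hard-coded here as an assumption, and ProcessChunk is a hypothetical worker; real code would query the cache size (e.g. via CPUID or GetLogicalProcessorInformation) and plug in the actual processing:

```pascal
const
  L1CacheSize = 32 * 1024;                 // assumed L1D size
  ChunkSize   = (L1CacheSize * 9) div 10;  // keep ~10% headroom

// hypothetical per-chunk worker: here it just sums the bytes
procedure ProcessChunk(P: PByte; Len: NativeInt; var Sum: NativeUInt);
var
  i: NativeInt;
begin
  for i := 0 to Len - 1 do
    Inc(Sum, P[i]);
end;

procedure ProcessInChunks(Src: PByte; Len: NativeInt; var Sum: NativeUInt);
var
  n: NativeInt;
begin
  while Len > 0 do
  begin
    n := Len;
    if n > ChunkSize then
      n := ChunkSize;
    ProcessChunk(Src, n, Sum); // work on the data while it is hot in L1
    Inc(Src, n);
    Dec(Len, n);
  end;
end;
```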
  2. Is Move the fastest way to copy memory?

    That's what I wrote: it is unlikely that an alternate Move() would make a huge difference. When working on buffers, cache locality is key to performance. Working on smaller buffers which fit in the CPU cache (a few MB usually) could be faster than two big Move / Process passes. But perhaps your CPU already has a good enough cache (bigger than your picture), so it won't help. About the buffers: couldn't you use a ring of them, so that you don't move data at all?
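A hedged sketch of the ring-of-buffers idea, assuming fixed buffer count and size (both hypothetical here): acquisition and processing exchange pointers into pre-allocated buffers instead of copying data.

```pascal
const
  BufCount = 4;
  BufSize  = 64 * 1024; // assumed buffer size

type
  TBufferRing = record
    Buffers: array[0..BufCount - 1] of array of Byte;
    Index: Integer;
  end;

procedure InitRing(var Ring: TBufferRing);
var
  i: Integer;
begin
  for i := 0 to BufCount - 1 do
    SetLength(Ring.Buffers[i], BufSize); // allocate once, up front
  Ring.Index := 0;
end;

// hand out the next buffer in circular order - no data is moved
function NextBuffer(var Ring: TBufferRing): PByte;
begin
  Result := @Ring.Buffers[Ring.Index][0];
  Ring.Index := (Ring.Index + 1) mod BufCount;
end;
```

In a real pipeline you would add some synchronization (e.g. a semaphore per buffer) so the producer never overwrites a buffer the consumer is still processing.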
  3. How make benchmark?

    It will depend on the database used behind FireDAC or Zeos, and on the standard used (ODBC/OleDB/Direct...). I would say that both are tuned - just ensure you get the latest version of Zeos, which has been much more maintained and refined than FireDAC in recent years. Note that FireDAC has some aggressive settings, e.g. for SQLite3 it changes the default safe write settings into faster access. The main interest of Zeos is that the ZDBC low-level layer does not use a TDataSet, so it is (much) faster if you retrieve a single object. You can see those two behaviors in Michal's numbers above, for instance. Also note that mORMot has a direct DB layer, not based on TDataSet, which may be used with FireDAC or Zeos, or with its own direct ODBC/OleDB/Oracle/PostgreSQL/SQLite3 data access. See https://synopse.info/files/html/Synopse mORMot Framework SAD 1.18.html#TITL_27 Note that its ORM is built on top of this unique DB layer, and adds some unique features like multi-insert SQL generation, so a mORMot TRestBatch is usually much faster than direct naive INSERTs within a transaction. You can reach 1 million inserts per second with SQLite3 with mORMot 2 - https://blog.synopse.info/?post/2022/02/15/mORMot-2-ORM-Performance
  4. Update framework question

    I would stick with a static JSON resource, if it is 20KB of data once zipped. Don't use HEAD for it. With a simple GET and proper E-Tag caching, the HTTP server can return 304 Not Modified when the resource has not changed: just a single request, only returning the data when it changed. Everything stays at the HTTP server level, so it would be simple and effective.
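To illustrate the flow (the path and header values below are made up): the client re-sends the E-Tag it last received, and the server answers 304 with no body while the resource is unchanged.

```
GET /updates/manifest.json HTTP/1.1
Host: example.com
If-None-Match: "5e1f-18c2"

HTTP/1.1 304 Not Modified
ETag: "5e1f-18c2"
```

Only when the file actually changes does the server answer 200 with the new content and a new ETag value.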
  5. Is Move the fastest way to copy memory?

    Don't expect anything magic from using mORMot's MoveFast(). Perhaps a few percent more or less. On Win32 - which is your target, IIRC - the Delphi RTL uses x87 registers. On this platform, MoveFast() uses SSE2 registers for small sizes, so it is likely to be slightly faster, and it will leverage ERMSB moves (i.e. rep movsb) on newer CPUs which support them. To be fair, mORMot's asm is more optimized for x86_64 than for i386 - because that is the target platform for the server side, which is the one needing more optimization. But I would just try all the FastCode variants - some can be very verbose, but "may" be better. What I would do in your case is try not to move any data at all. Isn't it possible to pre-allocate a set of buffers, then just consume them in a circular way, passing them from the acquisition to the processing methods as pointers, with no copy? The fastest move() is... when there is no move... 🙂
  6. Locked SQlite

    See https://www.sqlite.org/lockingv3.html By default, FireDAC opens SQLite3 databases in "exclusive" mode, meaning that only a single connection is allowed. It is much better for performance, but it "locks" the file against being opened outside this main connection. So, as @joaodanet2018 wrote, change the LockingMode in FDConn, or just close the application currently using it.
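A hedged sketch of the FireDAC side, assuming a TFDConnection and a hypothetical database file name; the LockingMode connection parameter is documented for FireDAC's SQLite driver, whose default is Exclusive:

```pascal
uses
  FireDAC.Comp.Client;

procedure OpenShared(Conn: TFDConnection);
begin
  Conn.Params.DriverID := 'SQLite';
  Conn.Params.Database := 'data.db3';      // hypothetical file name
  Conn.Params.Add('LockingMode=Normal');   // allow other connections
  Conn.Connected := True;
end;
```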
  7. Docx (RTF) to PDF convert

    I really recommend https://www.trichview.com/
  8. Where are you located? (it makes a difference for your potential work status, even remotely) Do you have some code to show? (e.g. on GitHub or anywhere else)
  9. Note: if you read a file from start to end, memory-mapped files are not faster than reading the file into memory. The page faults make them slower than a single regular FileRead() call. For huge files on Win32 which cannot be loaded entirely in memory, you may use temporary chunks (e.g. 128MB). And if you really load it once and don't want to pollute the OS disk cache, consider using the FILE_FLAG_SEQUENTIAL_SCAN flag under Windows. This is what we do with mORMot's FileOpenSequentialRead(). https://devblogs.microsoft.com/oldnewthing/20120120-00/?p=8493
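A minimal sketch of opening a file with that flag through the Win32 API (this mirrors the intent of mORMot's FileOpenSequentialRead(), not its exact code):

```pascal
uses
  Winapi.Windows;

// open a file for sequential reading, hinting the Windows cache
// manager via FILE_FLAG_SEQUENTIAL_SCAN so read-ahead is favored
// and already-consumed pages are evicted sooner
function OpenSequentialRead(const FileName: string): THandle;
begin
  Result := CreateFile(PChar(FileName), GENERIC_READ,
    FILE_SHARE_READ, nil, OPEN_EXISTING,
    FILE_FLAG_SEQUENTIAL_SCAN, 0);
end;
```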
  10. Yes, delete() is as bad as copy(), David is right! The idea is to keep the input string untouched, then append the output to a new output string, preallocated once with the maximum potential size. Then call SetLength() once at the end, which is likely to shrink the string in place without reallocating its content, thanks to the heap manager.
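A hedged sketch of that pattern (the SQL-quoting transform is just an illustration I picked, not from the thread): the output can at most double the input, so one up-front allocation at twice the length covers the worst case, and SetLength() runs once at the end.

```pascal
// quote a string for SQL by doubling each ' - one allocation total
function QuoteSQL(const Input: string): string;
var
  i, n: Integer;
begin
  SetLength(Result, Length(Input) * 2); // worst case: every char doubled
  n := 0;
  for i := 1 to Length(Input) do
  begin
    Inc(n);
    Result[n] := Input[i];
    if Input[i] = '''' then
    begin
      Inc(n);
      Result[n] := ''''; // double the quote
    end;
  end;
  SetLength(Result, n); // final size - likely shrinks in place
end;
```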
  11. You could do it with NO copy() call at all. Just write a small state machine and read the input one char at a time.
  12. The main trick is to avoid memory allocation, i.e. temporary string allocations. For instance, the fewer copy() calls, the better. Try to rewrite your code to allocate one single output string per input string. Just parse the input string from left to right, applying the quotes or dates processing on the fly. Then you could also avoid any per-line allocation, and parse the whole input buffer at once. Parsing 100,000 lines could be done much quicker, if properly written. I guess around 500 MB/s is easy to reach. For instance, within mORMot, we parse and convert 900 MB/s of JSON in pure pascal, including string unquoting.
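A hedged sketch of the state-machine approach suggested above, under the assumption that the "quotes processing" means stripping quote characters: a single left-to-right scan, one output string allocated up front, no copy() at all.

```pascal
// scan once, tracking whether we are inside a quoted region,
// and emit everything except the quote characters themselves
function StripQuotes(const Input: string): string;
type
  TState = (sOutside, sInQuotes);
var
  s: TState;
  i, n: Integer;
begin
  SetLength(Result, Length(Input)); // worst case, single allocation
  n := 0;
  s := sOutside;
  for i := 1 to Length(Input) do
    if Input[i] = '"' then
    begin
      // flip the state instead of copying a substring
      if s = sOutside then
        s := sInQuotes
      else
        s := sOutside;
    end
    else
    begin
      Inc(n);
      Result[n] := Input[i];
    end;
  SetLength(Result, n); // one final adjustment
end;
```

The same loop shape extends naturally to the dates processing: add states (and a small pending buffer) for the date pattern, still reading one char at a time.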
  13. This is a bit tricky (a COM object) but it works fine on Win32. You have the source code at https://github.com/synopse/SynProject We embedded the COM object as resource with the main exe, and it is uncompressed and registered for the current user.
  14. ShortStrings have a big advantage: they can be allocated on the stack. So they are perfect for small ASCII text, e.g. number-to-text conversion, or when logging information in plain English. Using str() on a shortstring is for instance faster than using IntToStr() and a temporary string (or AnsiString) in a multi-threaded program, because the heap is not involved. Of course, we could use a static array of AnsiChar, but ShortStrings have an advantage because their length is encoded, so they are safer and faster than #0-terminated strings. So on mobile platforms, we could end up creating a new record type, re-inventing the wheel, whereas the ShortString type is in fact still supported and generated by the compiler, and even used by the RTL at its lowest system level. ShortStrings have been deprecated... and hidden. They could even be restored/unhidden by some tricks like https://www.idefixpack.de/blog/2016/05/system-bytestrings-for-10-1-berlin Why? Because some people at Embarcadero thought it was confusing, and that the language should be "cleaned up" - which I translate as "made more C# / Java like", with a single string type. This was the very same reason they hid RawByteString and AnsiString... More a marketing strategy than a technical decision IMHO. I prefer FPC's more "conservative" way of preserving backward compatibility. It is worth noting that the FPC compiler source code itself uses a lot of shortstrings internally, so it will never be deprecated on FPC for sure. 😉
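A minimal sketch of the str()-into-shortstring pattern described above (LogValue and the log-append step are hypothetical placeholders):

```pascal
// convert an integer to text with zero heap allocation:
// the shortstring lives entirely on the stack (256 bytes)
procedure LogValue(Value: Integer);
var
  tmp: shortstring;
begin
  str(Value, tmp); // compiler intrinsic, no heap involved
  // ... append tmp to a log buffer here ...
end;
```

In a tight multi-threaded loop this avoids any contention on the memory manager, which is exactly where IntToStr() and its temporary heap string lose time.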
  15. About TGUID type...

    Using a local TGUID constant seems the best solution. It is clean and fast (no string/hex conversion involved, just a copy of the TGUID record bytes). No need for anything else, or for a feature request I guess, because it would be purely cosmetic.
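For reference, Delphi lets you initialize a typed TGUID constant from its string literal at compile time (the GUID value below is a made-up sample), so using it at runtime is a plain 16-byte record assignment:

```pascal
const
  GUID_SAMPLE: TGUID = '{A1B2C3D4-0000-0000-0000-000000000001}';

procedure UseIt(out Target: TGUID);
begin
  Target := GUID_SAMPLE; // record copy, no string/hex parsing at runtime
end;
```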