Everything posted by Arnaud Bouchez

  1. Arnaud Bouchez

    language updates in 10.4?

    What do you mean? That you can put a string into a TNullableInteger? At least under FPC they are type-safe: you are required to call NullableInteger(1234) to fill a TNullableInteger variable - you can't write aNullableInteger := 'toto', or even aNullableInteger := 1234. Delphi is more relaxed about implicit variant conversions, so under Delphi it is indeed not type-safe, since you can write aNullableInteger := 'toto'.
  2. Arnaud Bouchez

    language updates in 10.4?

    The main change to the language would be the full ARC removal. The memory model is really part of the language, to my understanding - it is just as vital as the "class" keyword itself. Pure unfair FUD trolling remark: managed records have been available in FPC trunk for a few months now, and I guess EMB doesn't like to be behind an Open Source compiler. 😉 We implemented Nullable types using variants, with integrated support in our ORM - see the usage sketch below. They have the advantage of working since Delphi 6, with low overhead and good integration with the pascal language. See http://blog.synopse.info/post/2015/09/25/ORM-TNullable*-fields-for-NULL-storage - back from 2015!
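    A minimal usage sketch, assuming the variant-based TNullableInteger type and its helper functions from mORMot.pas (exact helper names may differ in your revision):

    var
      v: TNullableInteger; // = type variant under the hood
    begin
      v := NullableInteger(1234); // explicit, type-safe assignment
      if not NullableIntegerIsEmptyOrNull(v) then
        writeln(NullableIntegerToValue(v)); // extract the plain Int64 value
    end;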
  3. We use our https://github.com/synopse/mORMot/blob/master/SynLog.pas Open Source logging framework on the server side, with sometimes dozens of threads writing to the same log file. It has very high performance, so logging doesn't slow down the process, and you can run very useful forensic analysis if needed. Having all threads logging into the same log file is at the same time a nightmare and a blessing. It may be a nightmare, since all operations are interleaved and difficult to identify. It is a blessing with the right tool, able to filter for one or several threads, then find out what is really occurring: in this case, a single log file is better than several. For instance, one of our https://livemon.com servers generates TB of logs - see https://leela1.livemon.net/metrics/counters - and we can still handle it. This instance has been running for more than 8 months without being restarted, with hundreds of simultaneous connections, logging incoming data every second... 🙂 We wrote our own simple and very usable log viewer tool - see http://blog.synopse.info/post/2011/08/20/Enhanced-Log-viewer and https://synopse.info/files/html/Synopse mORMot Framework SAD 1.18.html#TITL_103 - which is the key to getting something usable out of heavily multithreaded logs. So if you can, rather use a single log file per process, with proper thread identification and filtering.
  4. @David Heffernan Yes, the fastest heap is the one not used - I tend to allocate a lot of small temporary buffers (e.g. for number-to-text conversion) on the stack instead of using a temporary string, as sketched below. See http://blog.synopse.info/post/2011/05/20/How-to-write-fast-multi-thread-Delphi-applications
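     A minimal sketch of the idea - a shortstring lives on the stack, so the conversion involves no heap allocation at all:

     procedure LogValue(Value: Int64);
     var
       tmp: string[31]; // 32 bytes on the stack - no heap involved
     begin
       Str(Value, tmp); // RTL number-to-text conversion into the stack buffer
       writeln(tmp);    // or use @tmp[1] / ord(tmp[0]) as a raw buffer
     end;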
  5. @Tommi Prami I guess you found out about Lemire from our latest commits - see e.g. https://github.com/synopse/mORMot/commit/a91dfbe2e63761d724adef0703140e717f5b2f00 🙂 @Stefan Glienke It is to be used with a prime table size - as in our code - which also reduces memory consumption, since power-of-2 tables are far from optimal in this regard (doubling the slot count can become problematic). With a prime, it actually enhances the distribution, even with a weak hash function, especially compared to AND-masking with a power of 2. Lemire's reduction is as fast as AND-masking, since a multiplication is done in 1 cycle on modern CPUs - see the sketch below. Note that Delphi Win32 is not so good at compiling the 64-bit multiplication involved in Lemire's reduction, whereas FPC has no problem using the i386 mul opcode - which already gives a 64-bit result.
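     A minimal sketch of Lemire's multiply-then-shift reduction (ReduceRange is a hypothetical name - see the commit above for the actual mORMot code):

     // map a 32-bit hash into [0, N) using the high half of a 64-bit product,
     // instead of a much slower "mod N" - N being e.g. a prime table size
     function ReduceRange(Hash, N: cardinal): cardinal; inline;
     begin
       result := (UInt64(Hash) * UInt64(N)) shr 32;
     end;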
  6. @David Heffernan You just store the ThreadID within the memory block information, or you use a per-convention identification of the memory buffer. Cross-thread deallocations also usually require a ThreadEnded-like event handler, which doesn't exist in Delphi IIRC - but does exist in FPC - so you need to hack TThread.

     @RDP1974 Last time I checked, FastMM4 (trunk or the AVX2 fork) didn't work well on Linux (at least under FPC). Under Delphi + Linux, FastMM4 is not used at all - it just calls libc free/malloc IIRC. I am not convinced the slowness comes from the libc heap - which is very good in our tests - but rather from how Delphi/Linux is not optimized (yet). Other MMs like BrainMM or our ScaleMM are not designed for Linux.

     We also tried a lot of allocators on Linux - see https://github.com/synopse/mORMot/blob/master/SynFPCCMemAligned.pas - in the context of highly multi-threaded servers. In a nutshell, see https://github.com/synopse/mORMot/blob/master/SynFPCCMemAligned.pas#L57 for some numbers. The big issue with those C-based allocators, which is not listed in those comments, apart from using a lot of RAM, is that they abort the executable as soon as some GPF occurs: e.g. a double free will trigger a SIGABRT! So they are NOT usable in production, unless you run them under Valgrind with proper debugging. We fell back to using the FPC default heap, which is a bit slower and consumes a lot of RAM (since it has a per-thread heap for smaller blocks), but is very stable. It is written in plain pascal.

     The main idea about performance is to avoid as much memory allocation as possible - which is what we tried with mORMot from the ground up: for instance, we define most of the temp strings on the stack, not on the heap. I don't think that rewriting a C allocator in pascal would be much faster - it is very likely to be slower. Only a pure asm version may have some noticeable benefit - just like FastMM4. And, personally, I wouldn't invest in Delphi on Linux for server processes: FPC is so much more stable, faster and better maintained... for free!
  7. Arnaud Bouchez

    Reading large UTF8 encoded file in chunks

    When reading several MB of buffers, there is no need to read backwards. Just read the buffer line by line, from the beginning. Use a fast function like our BufferLineLength() - quoted in the next post - to compute the line length, then search within the line buffer (see the sketch after this paragraph). If you can keep the buffer smaller than your CPU L3 cache, it may have some benefit. Going that way, the CPU will give you its best performance, for several reasons: 1. the whole line is very likely to remain in L1 cache, so searching the line feed, then searching any pattern, will run at full core speed; 2. there will be automatic prefetching from main RAM into L1/L2 cache when reading ahead in a single direction. If your disk is fast enough (NVMe), you can fill the buffers in separate threads (use the number of CPU cores - 1), then search several files in parallel (one core per file - it would be more difficult to properly search the same file on multiple cores). If you don't allocate any memory during the process (do not use string), the parallel search will scale linearly. Always do proper timings of your search speed - also taking into account the OS disk cache, which is likely to be used during testing, but not with real "cold" files.
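    A minimal sketch of such a forward-only scan - ProcessLine() is a hypothetical callback, and BufferLineLength() is the SynCommons.pas function quoted in the next post:

    procedure ScanBuffer(P, PEnd: PUTF8Char);
    var
      len: PtrInt;
    begin
      while P < PEnd do
      begin
        len := BufferLineLength(P, PEnd); // bytes before the next #10/#13
        ProcessLine(P, len);              // search your pattern in this line
        inc(P, len);
        while (P < PEnd) and (P^ in [#10, #13]) do
          inc(P);                         // skip the CR/LF bytes
      end;
    end;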
  8. Arnaud Bouchez

    Reading large UTF8 encoded file in chunks

    There is a very fast line feed search, using proper x86_64 SSE assembly, checking 16 bytes per loop iteration, in our SynCommons.pas:

    function BufferLineLength(Text, TextEnd: PUTF8Char): PtrInt;
    {$ifdef CPUX64}
    {$ifdef FPC} nostackframe; assembler; asm {$else} asm .noframe {$endif}
            {$ifdef MSWINDOWS} // Win64 ABI to System-V ABI
            push    rsi
            push    rdi
            mov     rdi, rcx
            mov     rsi, rdx
            {$endif}
            mov     r8, rsi
            sub     r8, rdi            // rdi=Text, rsi=TextEnd, r8=TextLen
            jz      @fail
            mov     ecx, edi
            movdqa  xmm0, [rip + @for10]
            movdqa  xmm1, [rip + @for13]
            and     rdi, -16           // check first aligned 16 bytes
            and     ecx, 15            // lower 4 bits indicate misalignment
            movdqa  xmm2, [rdi]
            movdqa  xmm3, xmm2
            pcmpeqb xmm2, xmm0
            pcmpeqb xmm3, xmm1
            por     xmm3, xmm2
            pmovmskb eax, xmm3
            shr     eax, cl            // shift out unaligned bytes
            test    eax, eax
            jz      @main
            bsf     eax, eax
            add     rax, rcx
            add     rax, rdi
            sub     rax, rsi
            jae     @fail              // don't exceed TextEnd
            add     rax, r8            // rax = TextFound - TextEnd + (TextEnd - Text) = offset
            {$ifdef MSWINDOWS}
            pop     rdi
            pop     rsi
            {$endif}
            ret
    @main:  add     rdi, 16
            sub     rdi, rsi
            jae     @fail
            jmp     @by16
    {$ifdef FPC} align 16 {$else} .align 16 {$endif}
    @for10: dq      $0a0a0a0a0a0a0a0a
            dq      $0a0a0a0a0a0a0a0a
    @for13: dq      $0d0d0d0d0d0d0d0d
            dq      $0d0d0d0d0d0d0d0d
    @by16:  movdqa  xmm2, [rdi + rsi]  // check 16 bytes per loop
            movdqa  xmm3, xmm2
            pcmpeqb xmm2, xmm0
            pcmpeqb xmm3, xmm1
            por     xmm3, xmm2
            pmovmskb eax, xmm3
            test    eax, eax
            jnz     @found
            add     rdi, 16
            jnc     @by16
    @fail:  mov     rax, r8            // returns TextLen if no CR/LF found
            {$ifdef MSWINDOWS}
            pop     rdi
            pop     rsi
            {$endif}
            ret
    @found: bsf     eax, eax
            add     rax, rdi
            jc      @fail
            add     rax, r8
            {$ifdef MSWINDOWS}
            pop     rdi
            pop     rsi
            {$endif}
    end;
    {$else}
    {$ifdef FPC}inline;{$endif}
    var c: cardinal;
    begin
      result := 0;
      dec(PtrInt(TextEnd), PtrInt(Text)); // compute TextLen
      if TextEnd <> nil then
        repeat
          c := ord(Text[result]);
          if c > 13 then
          begin
            inc(result);
            if result >= PtrInt(PtrUInt(TextEnd)) then
              break;
            continue;
          end;
          if (c = 10) or (c = 13) then
            break;
          inc(result);
          if result >= PtrInt(PtrUInt(TextEnd)) then
            break;
        until false;
    end;
    {$endif CPUX64}

    It will be faster than any UTF-8 decoding for sure. I already hear some people say: "hey, this is premature optimization! the disk is the bottleneck!". But in 2020, my 1TB SSD reads at more than 3GB/s - https://www.sabrent.com/rocket - these are real numbers on my laptop. So searching at GB/s speed does make sense. We use similar techniques at https://www.livemon.com/features/log-management - with optimized compression and distributed search, we reach TB/s brute-force speed.
  9. Arnaud Bouchez

    Reading large UTF8 encoded file in chunks

    We use similar techniques in our SynCommons.pas unit. See for instance lines 17380 and following:

    // some constants used for UTF-8 conversion, including surrogates
    const
      UTF16_HISURROGATE_MIN = $d800;
      UTF16_HISURROGATE_MAX = $dbff;
      UTF16_LOSURROGATE_MIN = $dc00;
      UTF16_LOSURROGATE_MAX = $dfff;
      UTF8_EXTRABYTES: array[$80..$ff] of byte = (
        0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
        0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
        0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
        0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
        1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
        1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
        2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,
        3,3,3,3,3,3,3,3,4,4,4,4,5,5,0,0);
      UTF8_EXTRA: array[0..6] of record offset, minimum: cardinal; end = (
        // http://floodyberry.wordpress.com/2007/04/14/utf-8-conversion-tricks
        (offset: $00000000; minimum: $00010000),
        (offset: $00003080; minimum: $00000080),
        (offset: $000e2080; minimum: $00000800),
        (offset: $03c82080; minimum: $00010000),
        (offset: $fa082080; minimum: $00200000),
        (offset: $82082080; minimum: $04000000),
        (offset: $00000000; minimum: $04000000));
      UTF8_EXTRA_SURROGATE = 3;
      UTF8_FIRSTBYTE: array[2..6] of byte = ($c0,$e0,$f0,$f8,$fc);

    In fact, the state machine I talked about was just about line feeds, not UTF-8. My guess was that UTF-8 decoding could be avoided during the process: as long as the lines are not truncated, UTF-8 and Ansi bytes remain valid sequences. Since lines are the processing unit for logs, a first scan would decode the line feeds, then process the line bytes directly, with no string/UnicodeString conversion at all. For fast searching within the UTF-8/Ansi memory buffer, we have some enhanced techniques, e.g. the SBNDM2 algorithm: see TMatch.PrepareContains in our SynTable.pas unit. It is much faster than Pos() or Boyer-Moore for small patterns, with branchless case-insensitivity, and reaches several GB/s of searching speed inside memory buffers. There is even a very fast expression search engine (e.g. search for '404 & mydomain.com') in TExprParserMatch. More convenient than a RegEx to me - for a fast RegEx engine, check https://github.com/BeRo1985/flre/ Any memory allocation would reduce the process performance a lot.
  10. Arnaud Bouchez

    Reading large UTF8 encoded file in chunks

    I would just cut a line bigger than this size - which is very unlikely with a 2MB buffer. Or don't cut anything at all: just read the buffer and use a proper simple state machine to decode the content, without allocating any string.
  11. Arnaud Bouchez

    Reading large UTF8 encoded file in chunks

    @Vandrovnik I guess you didn't understand what I wrote. I proposed to read the file into a buffer (typically 2MB-32MB), chunk by chunk, searching for the line feeds in it. It will work, very efficiently, for any size of input file - even TB. Last trick: under Windows, use the FILE_FLAG_SEQUENTIAL_SCAN option when you open such a huge file. It hints the OS cache to optimize for sequential read-ahead, making it more efficient in your case. See the corresponding function in SynCommons.pas:

    /// overloaded function optimized for one pass file reading
    // - will use e.g. the FILE_FLAG_SEQUENTIAL_SCAN flag under Windows, as stated
    // by http://blogs.msdn.com/b/oldnewthing/archive/2012/01/20/10258690.aspx
    // - under XP, we observed ERROR_NO_SYSTEM_RESOURCES problems with FileRead()
    // bigger than 32MB
    // - under POSIX, calls plain FileOpen(FileName,fmOpenRead or fmShareDenyNone)
    // - is used e.g. by StringFromFile() and TSynMemoryStreamMapped.Create()
    function FileOpenSequentialRead(const FileName: string): Integer;
    begin
      {$ifdef MSWINDOWS}
      result := CreateFile(pointer(FileName), GENERIC_READ,
        FILE_SHARE_READ or FILE_SHARE_WRITE, nil, // same as fmShareDenyNone
        OPEN_EXISTING, FILE_FLAG_SEQUENTIAL_SCAN, 0);
      {$else}
      result := FileOpen(FileName, fmOpenRead or fmShareDenyNone);
      {$endif MSWINDOWS}
    end;
  12. Arnaud Bouchez

    Reading large UTF8 encoded file in chunks

    For decoding such log lines, I would not bother with UTF-8 decoding, just with line feed decoding, during file reading. Just read your data into a buffer (bigger than you expect, e.g. 2MB, not 32KB), search for #13#10 or #10, then decode the UTF-8 or Ansi text in between - only if really needed. If you don't find a line feed before the end of the buffer, copy the remaining bytes of the last line to the beginning of the buffer, then refill it from disk - as sketched below. Last but not least, to efficiently process huge log files which are UTF-8 or Ansi encoded, I wouldn't make any conversion to string (UnicodeString), but use raw PAnsiChar or PByteArray pointers, with no memory allocation. We have plenty of low-level search / decoding functions working directly on memory buffers (using pointers) in our Open Source libraries: https://github.com/synopse/mORMot/blob/master/SynCommons.pas
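    A minimal sketch of this buffering scheme - ProcessLine() is a hypothetical callback, and FileOpenSequentialRead() is the SynCommons.pas function quoted in the previous post:

    procedure ReadInChunks(const FileName: string);
    const
      CHUNKSIZE = 2 shl 20; // 2MB buffer
    var
      f: THandle;
      buf: array of AnsiChar;
      remaining, bytesRead: PtrInt;
      p, lineStart, bufEnd: PAnsiChar;
    begin
      SetLength(buf, CHUNKSIZE);
      f := FileOpenSequentialRead(FileName);
      try
        remaining := 0;
        repeat
          if remaining = CHUNKSIZE then
            break; // paranoid: a single line longer than the whole buffer
          bytesRead := FileRead(f, buf[remaining], CHUNKSIZE - remaining);
          if bytesRead <= 0 then
            break; // end of file
          lineStart := pointer(buf);
          bufEnd := lineStart + remaining + bytesRead;
          p := lineStart;
          while p < bufEnd do
            if p^ in [#10, #13] then
            begin
              ProcessLine(lineStart, p - lineStart); // decode only if needed
              while (p < bufEnd) and (p^ in [#10, #13]) do
                inc(p); // skip #13#10 or #10
              lineStart := p;
            end
            else
              inc(p);
          remaining := bufEnd - lineStart;
          if remaining > 0 then // keep the truncated line for the next chunk
            Move(lineStart^, buf[0], remaining);
        until false;
        if remaining > 0 then
          ProcessLine(pointer(buf), remaining); // last line without a line feed
      finally
        FileClose(f);
      end;
    end;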
  13. Arnaud Bouchez

    Why can't I install this monospaced font in Delphi ?

    When I switched to Ubuntu as my main OS, I put Ubuntu Mono in my Lazarus IDE, and I like the result very much.
  14. Arnaud Bouchez

    Unit uDGVMUtils and 64 bit...

    They could be changed, of course. To be honest, if you expect to defeat hackers, you will most probably waste your time - especially if you don't know how to convert i386 asm to x64. Even the asm trick used in this code could be disabled - see https://www.gta.ufrj.br/ensino/CPE758/artigos-basicos/carpenter07.pdf and http://www.trapkit.de/tools/scoopyng/SecureVMX.txt found after a quick Google search. And BTW, Wine is not a virtual machine - I don't understand why it is part of this detection.
  15. Arnaud Bouchez

    Unit uDGVMUtils and 64 bit...

    Why not get the BIOS description string? The kind of virtual machine is clearly available there. Just read the registry key HKEY_LOCAL_MACHINE\Hardware\Description\System\BIOS. This is how we do it in our Open Source mORMot framework:

    with TRegistry.Create do
    try
      RootKey := HKEY_LOCAL_MACHINE;
      if OpenKeyReadOnly('\Hardware\Description\System\CentralProcessor\0') then
      begin
        cpu := ReadString('ProcessorNameString');
        if cpu = '' then
          cpu := ReadString('Identifier');
      end;
      if OpenKeyReadOnly('\Hardware\Description\System\BIOS') then
      begin
        manuf := SysUtils.Trim(ReadString('SystemManufacturer'));
        if manuf <> '' then
          manuf := manuf + ' ';
        prod := SysUtils.Trim(ReadString('SystemProductName'));
        prodver := SysUtils.Trim(ReadString('SystemVersion'));
        if prodver = '' then
          prodver := SysUtils.Trim(ReadString('BIOSVersion'));
        if OpenKeyReadOnly('\Hardware\Description\System') then
        begin
          if prod = '' then
            prod := SysUtils.Trim(ReadString('SystemBiosVersion'));
          if prodver = '' then
          begin
            prodver := SysUtils.Trim(ReadString('VideoBiosVersion'));
            i := Pos(#13, prodver);
            if i > 0 then // e.g. multiline 'Oracle VM VirtualBox Version 5.2.33'
              SetLength(prodver, i - 1);
          end;
        end;
        if prodver <> '' then
          FormatUTF8('%% %', [manuf, prod, prodver], BiosInfoText)
        else
          FormatUTF8('%%', [manuf, prod], BiosInfoText);
      end;
    finally
      Free;
    end;

    See https://synopse.info/fossil/finfo?name=SynCommons.pas
  16. Isn't it also used for some kind of vodka? - ok, I am out (like the variables)
  17. Arnaud Bouchez

    HEIC library

    Brilliant! Do you know if it is compatible with the built-in Windows 10 support announced at https://blogs.windows.com/windowsexperience/2018/03/16/announcing-windows-10-insider-preview-build-17123-for-fast/#hL2gI3IBkfsGuK6d.97 ? Do you know the WIC identifiers involved?
  18. Arnaud Bouchez

    Unit testing cross platform code

    We don't use Delphi for other platforms... only FPC. There is no plan yet, due to how incompatible Delphi's cross-platform support was. And even with ARC disabled, I am not sure it would be worth it. There is a very small test framework in our cross-platform client units - see https://github.com/synopse/mORMot/blob/master/CrossPlatform/SynCrossPlatformTests.pas It is very lightweight, and should work on all platforms...
  19. Arnaud Bouchez

    SChannel TLS - perform TLS communication with WinAPI

    For information, our Open Source https://github.com/synopse/mORMot/blob/master/SynCrtSock.pas has supported SChannel as a TLS layer for some time now, for its raw socket layer. Of course, there is also the WinInet/WinHTTP API for HTTP requests, which supports proper proxy detection. Its SChannel implementation is more concise than what you propose, and it works from Delphi 6 and up, and also with FPC. See https://github.com/synopse/mORMot/blob/5777f00d17fcbe0378522ceddceb0abece1dd0e3/SynWinSock.pas#L847
  20. Arnaud Bouchez

    Unit testing cross platform code

    We use our Open Source https://github.com/synopse/mORMot/blob/master/SynTests.pas unit. It is cross-compiler (FPC and Delphi), but mostly about server-side processes: Win32/Win64 for Delphi, and truly cross-platform (Win, BSD, Linux, i386/x86-64/arm32/aarch64) for FPC.
  21. It would make sense only if your data consists of text files, and you want to keep versions of the information. A regular SQL database would replace the old data, so you would need to log the old values in a dedicated table. You can use the git command line for all the features you need: just call it with the proper arguments from your Delphi application. But I would rather take a look at https://www.fossil-scm.org/home/doc/trunk/www/index.wiki It is an efficient Distributed Version Control system, very similar to git, but with a big difference: "Self-Contained - Fossil is a single self-contained stand-alone executable. To install, simply download a precompiled binary for Linux, Mac, or Windows and put it on your $PATH. Easy-to-compile source code is also available." So it would be much easier to integrate into your own software. It has some other nice features - like an integrated Web server - which could be handy for you. Here also, you would call the fossil command line from your Delphi application - see the sketch below.
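     Calling the command line from Delphi can be as simple as this minimal Windows-only sketch (the fossil arguments are of course hypothetical):

     uses
       Windows, SysUtils;

     procedure RunCommand(const CmdLine: string);
     var
       si: TStartupInfo;
       pi: TProcessInformation;
       cmd: string;
     begin
       FillChar(si, SizeOf(si), 0);
       si.cb := SizeOf(si);
       cmd := CmdLine;
       UniqueString(cmd); // CreateProcessW may modify the command line buffer
       if not CreateProcess(nil, PChar(cmd), nil, nil, False, 0, nil, nil, si, pi) then
         RaiseLastOSError;
       try
         WaitForSingleObject(pi.hProcess, INFINITE); // wait for completion
       finally
         CloseHandle(pi.hThread);
         CloseHandle(pi.hProcess);
       end;
     end;

     // e.g. RunCommand('fossil commit -m "new data revision"');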
  22. Arnaud Bouchez

    JSON as a way to save information in Firebird

    @aehimself I do not have the same negative experience in practice. In fact, your last point sheds light on the previous ones: a DB engine with built-in JSON support is needed. I have stored JSON in a DB with great success in several projects - especially in a DDD project, where we use our ORM to store the "Aggregate" object, with all its associated data objects and arrays serialized as JSON, plus some indexed fields for quick queries.

    If the architecture starts from the DB, which is so 80s, using JSON doesn't make sense. But if the architecture starts from the code - which is what is recommended in this century, with new possibilities like NoSQL and ORM running on high-end hardware - then you may have to revisit your design choices. Normalizing data and starting from the DB is perfectly fine, for instance if you do RAD or have a well-known data layout (e.g. accounting), but for a more exploratory project, it will become more difficult to maintain and evolve.

    "Some database engines can not search in BLOB fields. Now it might not be an issue in the beginning but it makes investigating data corruption / debugging processing issues a living nightmare" -> this is why I propose to duplicate the searched values in their own indexed fields, and only search the JSON once the rows are refined; and do not use binary BLOB fields, but CLOB/TEXT fields.

    "Downloading blob fields are slow, no matter what" -> true if you download it; but if the JSON processing is done on the server itself, using the JSON functions offered e.g. by PostgreSQL or SQLite3, it is efficient (once you pre-query your rows using some dedicated indexed fields).

    "they have no issues handling a larger number of columns" -> the main idea of using JSON is not to have a fixed set of columns, but to be able to store anything, with no pre-known schema, and with several levels of nested objects or arrays.
  23. Arnaud Bouchez

    Embedded MongoDB

    MongoDB's benefit is to be installed on its own server(s), with proper replication. Using it stand-alone is pointless: use SQLite3 as an embedded database instead - it supports JSON efficiently: https://www.sqlite.org/json1.html In practice, on some production projects we used:
    - SQLite3 for almost everything (up to 1TB databases) - and we usually create several databases, e.g. a per-user DB;
    - MongoDB for storing huge content (more than 1TB databases) with replication: typically used as an archive service for "cold" data, which can still be queried efficiently if you define the proper indexes.
    To make all this work from Delphi, we use our Open Source http://mormot.net ORM, which is able to run efficiently on both storages, with the exact same code - the WHERE clauses of SQL queries are even translated on the fly into MongoDB pipelines by the ORM. One main trick is to put the raw data as JSON in RawJSON or TDocVariant ORM fields, then duplicate the fields to be queried as stand-alone ORM properties, with a proper index - see the sketch below. Then you can query it very efficiently.
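    A minimal sketch of this trick as a mORMot 1.18 ORM class - the class and field names are illustrative; "stored AS_UNIQUE" is the mORMot idiom to flag a unique indexed property:

    uses
      SynCommons, mORMot;

    type
      TSQLDocument = class(TSQLRecord)
      private
        fName: RawUTF8;
        fJson: RawJSON;
      published
        // duplicated field, indexed for fast WHERE Name=? queries
        property Name: RawUTF8 read fName write fName stored AS_UNIQUE;
        // raw document, stored as JSON text
        property Json: RawJSON read fJson write fJson;
      end;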
  24. Arnaud Bouchez

    AES Encryption - FMX

    You are right, SynCrypto is fine with FMX running on Windows - and it uses RawByteString or TBytes as required since Delphi 2009. But it is not an OS compatibility problem - it is a compiler issue. To be fair, it's the Delphi cross-platform compiler/RTL which was designed poorly, especially all the backward compatibility breaks they made around strings and heap management. Their recent step back is a clear hint of those bad design choices. There is no problem using SynCrypto on Windows, Linux, BSD or Darwin, on Intel/AMD or ARM 32-bit or 64-bit CPUs, if you use FPC. But we didn't lose time with targets that break too much of the existing code base. I am happy I didn't spend weeks making mORMot ARC-compatible - ARC is now deprecated! - and focused instead on FPC compatibility and tuning. Which was very rewarding.
  25. Arnaud Bouchez

    JSON as a way to save information in Firebird

    For a regular SQL DB, I would use a BLOB SUB_TYPE TEXT field to store the JSON content, then duplicate some JSON fields into dedicated stand-alone SQL fields (e.g. an "id" or "name" JSON field into regular table.id or table.name fields, in addition to the table.json main field), for easy querying - with an index if speed is mandatory. So, start with JSON only (plus a key field), then maybe add some dedicated SQL fields for easier queries, if necessary. Side note: I would rather use SQLite3 or PostgreSQL for efficiently storing JSON content - both have efficient built-in JSON support. Or a "JSON native" database like MongoDB.