PeterPanettone 168 Posted Saturday at 08:28 PM
One significant enhancement to the Delphi language would be support for strings as case selectors, similar to Free Pascal. Currently, the case statement is limited to ordinal types (such as integers, characters, or enumerations), which lets the compiler apply optimizations like jump tables that are more efficient than chained if-then-else statements. However, Embarcadero could extend the compiler to keep this performance for ordinal types while also supporting strings, perhaps by internally compiling string cases to equivalent if-then-else logic that is evaluated at runtime, as Free Pascal demonstrates. This would give developers greater flexibility without compromising existing code.
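For readers who have not seen the Free Pascal feature mentioned above, here is a minimal sketch of a string case selector. It compiles with Free Pascal only, not with Delphi; the function name Describe and the sample strings are made up for illustration.

```pascal
program CaseOfString;
{$mode objfpc}{$H+}

// Free Pascal only: Delphi rejects a string selector in a case statement.
// "Describe" is a hypothetical helper used just for this demonstration.
function Describe(const S: string): string;
begin
  case S of
    'one', 'two': Result := 'small';
    'three':      Result := 'medium';
  else
    Result := 'unknown';
  end;
end;

begin
  WriteLn(Describe('two'));    // matches the first branch
  WriteLn(Describe('seven'));  // falls through to the else branch
end.
```

The comparison is case-sensitive, so callers that want case-insensitive matching would normalize the selector first, for example with LowerCase.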
pyscripter 829 Posted Saturday at 08:55 PM (edited)
You could even take it a few steps further, as in Python's structural pattern matching (What's New In Python 3.10, Python 3.10.18 documentation) or C# pattern matching (Patterns: pattern matching using the is and switch expressions, C# reference, Microsoft Learn).
Of course I am day-dreaming. Who knows. Maybe one day...
Edited Saturday at 09:00 PM by pyscripter
PeterPanettone 168 Posted Saturday at 09:12 PM
IMO, converting even a very large sequence of strings to a collision-free hash table at compile-time using a dictionary would be the most efficient way.
pyscripter 829 Posted Saturday at 09:29 PM (edited)
18 minutes ago, PeterPanettone said:
IMO, converting even a very large sequence of strings to a collision-free hash table at compile-time using a dictionary would be the most efficient way.

And writing everything in assembly may save you a few milliseconds... This is not about efficiency, but about writing better and more readable code. It is the compiler's job to convert source code to the most efficient machine code.
Edited Saturday at 09:31 PM by pyscripter
PeterPanettone 168 Posted Saturday at 09:33 PM
3 minutes ago, pyscripter said:
It is the compiler's job to convert the code to the most efficient machine code.

So true.
PeterPanettone 168 Posted Saturday at 09:55 PM
In quantum computers, pattern matching can be significantly more efficient due to quantum parallelism and superposition, resulting in speedups over classical counterparts in certain scenarios. For instance, quantum algorithms for string matching leverage Grover's search or similar techniques to achieve quadratic speedups, reducing time complexity from O(n) to O(√n) in unstructured searches that underpin many pattern-matching tasks. Recent advancements, such as those applied to genomics and text processing, demonstrate faster approximate pattern matching, potentially accelerating applications in cybersecurity and bioinformatics by handling massive datasets more effectively. Studies also show that quantum-enhanced pattern recognition can improve accuracy by up to 100% while boosting efficiency, particularly in machine learning integrations.
Anders Melander 2071 Posted Saturday at 10:44 PM
38 minutes ago, pyscripter said:
It is the compiler's job to convert source code to the most efficient machine code.

And how is that going for us?

1 hour ago, PeterPanettone said:
IMO, converting even a very large sequence of strings to a collision-free hash table at compile-time using a dictionary would be the most efficient way.

There are many, many ways of searching for an occurrence of a string in a static set of strings, but hashing certainly isn't the fastest. Hashing is a generic algorithm that doesn't take strings' unique properties into account.

14 minutes ago, PeterPanettone said:
In quantum computers, pattern matching can be significantly more efficient due to quantum parallelism and superposition, resulting in speedups over classical counterparts in certain scenarios. For instance, quantum algorithms for string matching leverage Grover's search or similar techniques to achieve quadratic speedups, reducing time complexity from O(n) to O(√n) in unstructured searches that underpin many pattern-matching tasks. Recent advancements, such as those applied to genomics and text processing, demonstrate faster approximate pattern matching, potentially accelerating applications in cybersecurity and bioinformatics by handling massive datasets more effectively. Studies also show that quantum-enhanced pattern recognition can improve accuracy by up to 100% while boosting efficiency, particularly in machine learning integrations.

A chatbot couldn't have expressed it better - whatever it is.
PeterPanettone 168 Posted Saturday at 10:53 PM
4 minutes ago, Anders Melander said:
whatever it is

It is a direct response to pyscripter's mentioning pattern matching in Python and C#.
PeterPanettone 168 Posted Saturday at 11:02 PM
16 minutes ago, Anders Melander said:
Hashing is a generic algorithm that doesn't take strings' unique properties into account.

Here are the CRC-32 hashes of some words (via System.Hash.THashCRC32):
"one": 2053932785
"two": 298486374
"three": 1187371253
"four": 2428593789
"five": 1018350795
These hashes are stored as keys in a TDictionary<Cardinal, TProc>, mapping each to an anonymous procedure that represents the "case" action (e.g., handling that specific string). At runtime, the input string is hashed, and the dictionary is used for a quick lookup to invoke the corresponding action, mimicking a case statement efficiently. Since the hashes are collision-free for this set, no additional equality checks are needed between the registered keys, and the lookup is O(1).
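The dictionary dispatch described above could be sketched roughly as follows. This is not the poster's code: to keep the example independent of any particular hash routine, it uses the strings themselves as keys in a TDictionary<string, TProc>, so the dictionary does the hashing and the equality check internally. The procedure name RunDemo and the output strings are assumptions made for illustration.

```pascal
program DictDispatch;
{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Generics.Collections;

// Hypothetical demo routine: maps strings to anonymous procedures
// and invokes the one registered for the given key.
procedure RunDemo(const Key: string);
var
  Actions: TDictionary<string, TProc>;
  Action: TProc;
begin
  // Real code would build this table once and reuse it.
  Actions := TDictionary<string, TProc>.Create;
  try
    Actions.Add('one', procedure begin Writeln('handling one'); end);
    Actions.Add('two', procedure begin Writeln('handling two'); end);
    Actions.Add('three', procedure begin Writeln('handling three'); end);

    // TryGetValue hashes the key and compares it with the stored key,
    // so unknown inputs cannot be mistaken for a registered string.
    if Actions.TryGetValue(Key, Action) then
      Action()
    else
      Writeln('no match for ', Key);
  finally
    Actions.Free;
  end;
end;

begin
  RunDemo('two');
  RunDemo('seven');
end.
```

Keying on the string rather than on a bare hash value matters for inputs outside the registered set: an arbitrary string can share a 32-bit hash with a registered word even when the registered words do not collide with each other.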
David Heffernan 2463 Posted Saturday at 11:05 PM
1 minute ago, PeterPanettone said:
Here are the CRC-32 hashes of some words (via System.Hash.THashCRC32): "one": 2053932785 "two": 298486374 "three": 1187371253 "four": 2428593789 "five": 1018350795 These hashes are stored as keys in a TDictionary<Cardinal, TProc>, mapping each to an anonymous procedure that represents the "case" action (e.g., handling that specific string). At runtime, the input string is hashed, and the dictionary is used for quick lookup to invoke the corresponding action, mimicking a case statement efficiently. Since the hashes are collision-free for this set, no additional equality checks are needed, and the lookup is O(1).

Sure, you can write a dictionary that works with just these keys. But what is the point of a dictionary that works with these specific five keys? Have you got a real-world example to hand?
Anders Melander 2071 Posted yesterday at 12:03 AM
39 minutes ago, PeterPanettone said:
Since the hashes are collision-free for this set, no additional equality checks are needed, and the lookup is O(1).

The problem isn't the lookup into the hash table. The problem is that you would need to scan the whole input string in order to compute the hash key for the lookup. Again: There are many different algorithms specifically designed to efficiently search a static set of "strings" and hashing isn't one of them.

53 minutes ago, PeterPanettone said:
It is a direct response to pyscripter's mentioning pattern matching in Python and C#.

Yes, of course it is.
Dave Novo 57 Posted 17 hours ago
Hi Anders,
Out of curiosity, what are the most effective ones? Based on your comments I did a quick Google/AI search, and both seemed to indicate that hashing is an effective algorithm if you expect a high level of matching across your static set of strings (i.e. when you test a string against your set, you expect a match most of the time, which is what I would guess for a case statement of strings). If you expect a lot of misses, then a prefix tree is a better choice, because you don't have to read the entire string every time. But a prefix tree is slow when you match a lot, since you have to read the entire string anyhow.
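The prefix-tree behaviour mentioned above (stopping at the first character that cannot lead to a match) can be illustrated with a small sketch. Everything here is hypothetical and simplified: the TTrieNode class and the AddWord/FindWord helpers are invented names, and the code assumes keys consist only of the lowercase letters a..z.

```pascal
program TrieLookup;
{$APPTYPE CONSOLE}

type
  TLetter = 'a'..'z';

  // Hypothetical trie node; Index is -1 unless a registered word ends here.
  TTrieNode = class
  public
    Children: array[TLetter] of TTrieNode;
    Index: Integer;
    constructor Create;
    destructor Destroy; override;
  end;

constructor TTrieNode.Create;
begin
  inherited Create;
  Index := -1; // child pointers start out as nil (class fields are zeroed)
end;

destructor TTrieNode.Destroy;
var
  C: TLetter;
begin
  for C := Low(TLetter) to High(TLetter) do
    Children[C].Free; // Free is safe on nil
  inherited;
end;

// Assumes S contains only 'a'..'z'.
procedure AddWord(Root: TTrieNode; const S: string; AIndex: Integer);
var
  Node: TTrieNode;
  C: Char;
begin
  Node := Root;
  for C in S do
  begin
    if Node.Children[C] = nil then
      Node.Children[C] := TTrieNode.Create;
    Node := Node.Children[C];
  end;
  Node.Index := AIndex;
end;

// Returns the registered index, or -1. Exits at the first character
// that cannot belong to any registered word.
function FindWord(Root: TTrieNode; const S: string): Integer;
var
  Node: TTrieNode;
  C: Char;
begin
  Result := -1;
  Node := Root;
  for C in S do
  begin
    if (C < 'a') or (C > 'z') then Exit;
    Node := Node.Children[C];
    if Node = nil then Exit;
  end;
  Result := Node.Index;
end;

var
  Root: TTrieNode;
begin
  Root := TTrieNode.Create;
  try
    AddWord(Root, 'zero', 0);
    AddWord(Root, 'one', 1);
    AddWord(Root, 'two', 2);
    case FindWord(Root, 'two') of
      0: Writeln('zero');
      1: Writeln('one');
      2: Writeln('two');
    else
      Writeln('no match');
    end;
  finally
    Root.Free;
  end;
end.
```

A miss such as 'xyz' is rejected after the first character, while a hit still walks the full string, which is the trade-off described in the post.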
Stefan Glienke 2150 Posted 11 hours ago
With a fixed set of strings to look up, hashing is never the fastest way; a handcrafted approach is. Let's, for example, imagine a case statement that checks for all keywords in the Pascal language. Arrange the strings in the lookup table by length and first letter.
You can see how the C# compiler does it - I wrote a simple demo with just a few keywords: https://sharplab.io/#gist:731036823c89363962d7e79f9dc9ed28
There is a lot of material you can read on that subject: it is called "switch lowering".
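As a rough illustration of the "length first, then first letter" arrangement described above, the following hand-written sketch classifies a handful of Pascal keywords. It is not what any compiler actually generates; the TKeyword enumeration, the ClassifyKeyword function and the keyword subset are assumptions chosen for the example, and the code assumes 1-based string indexing.

```pascal
program KeywordSwitch;
{$APPTYPE CONSOLE}

type
  // Hypothetical result type covering only a few keywords.
  TKeyword = (kwNone, kwDo, kwIf, kwIn, kwOf, kwEnd, kwFor, kwBegin);

function ClassifyKeyword(const S: string): TKeyword;
begin
  Result := kwNone;
  // First discriminate on length (an ordinal, so a jump table is possible),
  // then on the first character, and only then compare full strings.
  case Length(S) of
    2:
      case S[1] of
        'd': if S = 'do' then Result := kwDo;
        'i': if S = 'if' then Result := kwIf
             else if S = 'in' then Result := kwIn;
        'o': if S = 'of' then Result := kwOf;
      end;
    3:
      case S[1] of
        'e': if S = 'end' then Result := kwEnd;
        'f': if S = 'for' then Result := kwFor;
      end;
    5:
      if S = 'begin' then Result := kwBegin;
  end;
end;

begin
  if ClassifyKeyword('in') = kwIn then
    Writeln('"in" recognized');
  if ClassifyKeyword('int') = kwNone then
    Writeln('"int" is not a keyword in this table');
end.
```

Most non-matching inputs are rejected by the length or first-letter test without a single full string comparison, and a match costs at most two comparisons in this small table.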
A.M. Hoornweg 159 Posted 10 hours ago
A case statement for strings can be created using generics. Not the fastest (it does a linear probe) but quite practical sometimes.

    var
      s: string;
    begin
      s := 'three';
      case TArray.IndexOf<string>(TArray<string>.Create('zero', 'one', 'two', 'three', 'four'), s) of
        0: Writeln('zero');
        1: Writeln('one');
        2: Writeln('two');
        3: Writeln('three');
        4: Writeln('four');
      end;
    end;
Cristian Peța 122 Posted 8 hours ago
1 hour ago, A.M. Hoornweg said:
A case statement for strings can be created using generics.

And what is the benefit? if-then-else is more readable than this.
Lars Fosdal 1877 Posted 8 hours ago
Many years ago, I suggested compile-time optimized lookup structures for the Delphi compiler. Allen Bauer was not sure it was worth the effort, as it would be hard to define something in code that could be applicable to many types of data and structures.
So, instead I create custom lookup classes and fill them once.
mvanrijnen 128 Posted 7 hours ago
There's also the old IndexText or IndexStr function: System.StrUtils.IndexText - RAD Studio API Documentation
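A minimal sketch of how IndexStr from System.StrUtils can drive an ordinary case statement; the sample strings are made up. IndexStr compares case-sensitively and returns -1 when nothing matches, while IndexText is the case-insensitive counterpart.

```pascal
program IndexStrCase;
{$APPTYPE CONSOLE}

uses
  System.StrUtils;

var
  S: string;
begin
  S := 'three';
  // IndexStr returns the zero-based position of S in the list, or -1.
  case IndexStr(S, ['zero', 'one', 'two', 'three', 'four']) of
    0: Writeln('zero');
    1: Writeln('one');
    2: Writeln('two');
    3: Writeln('three');
    4: Writeln('four');
  else
    Writeln('no match');
  end;
end.
```

Like the generic TArray approach, this is a linear scan over the list, so it suits short lists where readability matters more than lookup speed.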
A.M. Hoornweg 159 Posted 7 hours ago
50 minutes ago, Cristian Peța said:
And what is the benefit? if-then-else is more readable than this.

When large numbers of fields are accessed by name, the legibility of a CASE is much better than if/then. I used this technique recently to evaluate a recordset of roughly 150 fields where I had to go by name rather than ordinal number (the routine had to be able to handle multiple versions of the database, so the SQL query had to be rather non-specific).
Cristian Peța 122 Posted 7 hours ago
35 minutes ago, A.M. Hoornweg said:
When large numbers of fields are accessed by name, the legibility of a CASE is much better than if/then.

I prefer something like this:

    type
      TMyStrings = (zero, one, two, three, four);

    function StrToMyString(const AString: string): TMyStrings;
    begin
      if AString = 'zero' then Result := TMyStrings.zero
      else if AString = 'one' then Result := TMyStrings.one
      else if AString = 'two' then Result := TMyStrings.two
      else if AString = 'three' then Result := TMyStrings.three
      else if AString = 'four' then Result := TMyStrings.four
      else
        // Without this branch Result would be undefined for unknown input
        // (EArgumentException is declared in System.SysUtils).
        raise EArgumentException.Create('Unknown string: ' + AString);
    end;

    var
      s: string;
    begin
      s := 'three';
      case StrToMyString(s) of
        zero: Writeln('zero');
        one: Writeln('one');
        two: Writeln('two');
        three: Writeln('three');
        four: Writeln('four');
      end;

      // or, if you want integer
      case Integer(StrToMyString(s)) of
        0: Writeln('zero');
        1: Writeln('one');
        2: Writeln('two');
        3: Writeln('three');
        4: Writeln('four');
      end;
    end;
Uwe Raabe 2169 Posted 6 hours ago
You don't even have to provide that function as long as the strings match the enum identifiers:

    // TRttiEnumerationType is declared in System.Rtti
    s := 'three';
    case TRttiEnumerationType.GetValue<TMyStrings>(s) of
      zero: Writeln('zero');
      one: Writeln('one');
      two: Writeln('two');
      three: Writeln('three');
      four: Writeln('four');
    end;

Or if you prefer a more readable approach:

    type
      TMyStrings = (zero, one, two, three, four);

      TMyStringsHelper = record helper for TMyStrings
      public
        class function FromString(const AString: string): TMyStrings; static;
      end;

    class function TMyStringsHelper.FromString(const AString: string): TMyStrings;
    begin
      Result := TRttiEnumerationType.GetValue<TMyStrings>(AString);
    end;

    ...

    var s := 'three';
    case TMyStrings.FromString(s) of
      zero: Writeln('zero');
      one: Writeln('one');
      two: Writeln('two');
      three: Writeln('three');
      four: Writeln('four');
    else
      Writeln('What?');
    end;
Rollo62 602 Posted 5 hours ago
Good question. This had been on my mind for quite some time, so I tried to come up with a universal solution that would be as painless as possible. Here is the first draft. I welcome feedback and suggestions for improvement.
A real string case is hardly possible except inside the Delphi compiler itself; I had been racking my brains over this for a long time. Neither attributes nor other tricks will get you very far. I found the most painless way for me (see the attachment), which unfortunately consists of two phases:
1. Definition of the cases
2. The actual match case, but with integer values
This preserves the case feeling and adds RegEx and wildcard matches as a bonus.
In short, a simple example with direct pattern matching:

    var LCaser := TS4Caser_Factory.New( True, -1 );
    LCaser
      .Register( 1, 'aaa' )
      .Register( 2, 'bbb' )
      .Register( 3, 'ccc' )
      ;

    var LTest := 'ccc';
    case LCaser.Match( LTest ) of
      1: Log_Memo( 'aaa matched' );
      2: Log_Memo( 'bbb matched' );
      3: Log_Memo( 'ccc matched' );
    else
      Log_Memo( 'nothing matched' );
    end;

But it can do much more, in a similar, easy-to-handle RegEx style:

    LCaser := TS4Caser_Factory.New( True, -1 );

    // Register RegEx patterns - much more powerful than simple strings!
    LCaser
      .RegisterRegex( 1, '^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$')  // E-mail
      .RegisterRegex( 2, '^\d{3}-\d{2}-\d{4}$')                               // SSN format
      .RegisterRegex( 3, '^(\+49|0)[1-9]\d{1,4}\s?\d{1,12}$')                 // German phone number
      .RegisterRegex( 4, '^https?://(?:[-\w.])+(?::\d+)?(?:/(?:[\w/_.])*)?(?:\?(?:[\w&=%.])*)?(?:#(?:\w*))?$');  // URL

    EmailInput := 'test@example.com';
    case LCaser.Match( EmailInput ) of
      1: Log_Memo('✓ Valid E-Mail-: ', EmailInput);
      2: Log_Memo('✓ Valid SSN: ', EmailInput);
      3: Log_Memo('✓ Valid German Phone no.: ', EmailInput);
      4: Log_Memo('✓ Valid URL: ', EmailInput);
    else
      Log_Memo('✗ Unknown Format: ', EmailInput);
    end;

    EmailInput := '+491234567890';
    case LCaser.Match( EmailInput ) of
      1: Log_Memo('✓ Valid E-Mail-: ', EmailInput);
      2: Log_Memo('✓ Valid SSN: ', EmailInput);
      3: Log_Memo('✓ Valid German Phone no.: ', EmailInput);
      4: Log_Memo('✓ Valid URL: ', EmailInput);
    else
      Log_Memo('✗ Unknown Format: ', EmailInput);
    end;

    Log_Memo;

    // RegEx with capture groups
    LCaser.Clear;
    LCaser.RegisterRegex( 10, '^(\d{4})-(\d{2})-(\d{2})$' );  // Date YYYY-MM-DD

    LIndex := LCaser.MatchWithCaptures( '2024-03-15', LCaptures );
    if LIndex = 10 then
    begin
      Log_Memo('✓ Date recognized:');
      Log_Memo(' Year: ', LCaptures[0]);
      Log_Memo(' Month: ', LCaptures[1]);
      Log_Memo(' Day: ', LCaptures[2]);
    end;

Or with wildcards too:

    LCaser := TS4Caser_Factory.New( False, 0 );  // Case insensitive

    // Register wildcard patterns (* and ? are supported)
    LCaser
      .RegisterWildcard( 1, '*.txt')        // All .txt files
      .RegisterWildcard( 2, '*.doc*')       // .doc, .docx, etc.
      .RegisterWildcard( 3, 'temp_?.log')   // temp_1.log, temp_a.log, etc.
      .RegisterWildcard( 4, 'backup_*.*')   // backup_xyz.abc
      .RegisterWildcard( 5, '*.jp*g');      // .jpg, .jpeg

    FileName := 'document.DOCX';
    case LCaser.Match( FileName ) of
      1: Log_Memo('📄 Text-File: ', FileName);
      2: Log_Memo('📝 Word-Document: ', FileName);
      3: Log_Memo('📋 Temporary Log-File: ', FileName);
      4: Log_Memo('💾 Backup-File: ', FileName);
      5: Log_Memo('🖼️JPEG-Image: ', FileName);
      0: Log_Memo('❓ Unknown file: ', FileName);
    else
      Log_Memo('❌ Error at file recognition: ', FileName);
    end;

    FileName := 'backup_123.456';
    case LCaser.Match( FileName ) of
      1: Log_Memo('📄 Text-File: ', FileName);
      2: Log_Memo('📝 Word-Document: ', FileName);
      3: Log_Memo('📋 Temporary Log-File: ', FileName);
      4: Log_Memo('💾 Backup-File: ', FileName);
      5: Log_Memo('🖼️JPEG-Image: ', FileName);
      0: Log_Memo('❓ Unknown file: ', FileName);
    else
      Log_Memo('❌ Error at file recognition: ', FileName);
    end;

    FileName := 'image/folder/pic.jpg';
    case LCaser.Match( FileName ) of
      1: Log_Memo('📄 Text-File: ', FileName);
      2: Log_Memo('📝 Word-Document: ', FileName);
      3: Log_Memo('📋 Temporary Log-File: ', FileName);
      4: Log_Memo('💾 Backup-File: ', FileName);
      5: Log_Memo('🖼️JPEG-Image: ', FileName);
      0: Log_Memo('❓ Unknown file: ', FileName);
    else
      Log_Memo('❌ Error at file recognition: ', FileName);
    end;

Is this closer to the function you were looking for? Let me know if you can find improvements or an easier, better solution.
CaserTest.zip
Rollo62 602 Posted 1 hour ago (edited)
Here is another update, with registration of group patterns and guard options. Usable like this:

    procedure TMainFrm.SimpleWhenExample;
    begin
      Log_Memo('=== RegEx Pattern Matching Example ===');

      var LCaser := TS4Caser_Factory.New( True, -1 );
      LCaser
        .Register( 1, 'aaa' )
        .Register( 2, 'bbbb' )
        .Register( 3, 'ccc' )
        .Register( 4, 'aa' )
        ;

      var LTest := 'ccc';
      case LCaser.Match( LTest ) of
        1: Log_Memo( 'aaa matched' );
        2: Log_Memo( 'bbbb matched' );
        3: Log_Memo( 'ccc matched' );
        4: Log_Memo( 'aa matched' );
      else
        Log_Memo( 'nothing matched to ' + LTest );
      end;

      LTest := 'bbb';
      case LCaser.Match( LTest ) of
        1: Log_Memo( 'aaa matched' );
        2: Log_Memo( 'bbbb matched' );
        3: Log_Memo( 'ccc matched' );
        4: Log_Memo( 'aa matched' );
      else
        Log_Memo( 'nothing matched to ' + LTest );
      end;

      LTest := 'bbbb';
      case LCaser.Match( LTest ) of
        1: Log_Memo( 'aaa matched' );
        2: Log_Memo( 'bbbb matched' );
        3: Log_Memo( 'ccc matched' );
        4: Log_Memo( 'aa matched' );
      else
        Log_Memo( 'nothing matched to ' + LTest );
      end;

      LTest := 'aa';
      case LCaser.Match( LTest ) of
        1: Log_Memo( 'aaa matched' );
        2: Log_Memo( 'bbbb matched' );
        3: Log_Memo( 'ccc matched' );
        4: Log_Memo( 'aa matched' );
      else
        Log_Memo( 'nothing matched to ' + LTest );
      end;

      //
      // GUARD: Ignore <= 2 char lengths
      //
      LCaser
        .WhenAny(
          function ( const AValue : string ) : Boolean
          begin
            Result := AValue.Length > 2;
          end
        )
        ;

      LTest := 'aaa';
      case LCaser.Match( LTest ) of
        1: Log_Memo( 'aaa matched' );
        2: Log_Memo( 'bbbb matched' );
        3: Log_Memo( 'ccc matched' );
        4: Log_Memo( 'aa matched' );
      else
        Log_Memo( 'nothing matched to ' + LTest );
      end;

      LTest := 'aa';  // This is guarded ( len <= 2 )
      case LCaser.Match( LTest ) of
        1: Log_Memo( 'aaa matched' );
        2: Log_Memo( 'bbbb matched' );
        3: Log_Memo( 'ccc matched' );
        4: Log_Memo( 'aa matched' );
      else
        Log_Memo( 'nothing matched to ' + LTest );
      end;
    end;

    // Example 2: OR-Pattern for groups
    procedure TMainFrm.SimpleOrExample;
    var
      Command : string;
      LCaser : IS4Caser;
    begin
      Log_Memo( '=== OR Patterns Example ===');

      LCaser := TS4Caser_Factory.New( False, -1 );

      // RegisterOr: several exact alternatives for the same index
      LCaser
        .RegisterOr( 1, ['start', 'run', 'begin', 'launch', 'execute'])
        .RegisterOr( 2, ['stop', 'halt', 'end', 'terminate', 'kill', 'abort'])
        .RegisterOr( 3, ['pause', 'suspend', 'hold', 'freeze'])
        .RegisterOr( 4, ['resume', 'continue', 'unpause', 'thaw']);

      // RegisterWildcardOr: several wildcard patterns
      LCaser
        .RegisterWildcardOr(10, ['config_*', 'cfg_*', 'setting_*'])
        .RegisterWildcardOr(11, ['log_*', 'trace_*', 'debug_*']);

      Command := 'execute';
      case LCaser.Match(Command) of
        1: Log_Memo( 'Service started: ', Command);
        2: Log_Memo( 'Service stopped: ', Command);
        3: Log_Memo( '⏸️Service paused: ', Command);
        4: Log_Memo( '▶️ Service resumed: ', Command);
        10: Log_Memo( '⚙️Config-Command: ', Command);
        11: Log_Memo( 'Logging-Command: ', Command);
      else
        Log_Memo('❓ Unknown command: ', Command);
      end;

      Command := 'halt';
      case LCaser.Match(Command) of
        1: Log_Memo( 'Service started: ', Command);
        2: Log_Memo( 'Service stopped: ', Command);
        3: Log_Memo( '⏸️Service paused: ', Command);
        4: Log_Memo( '▶️ Service resumed: ', Command);
        10: Log_Memo( '⚙️Config-Command: ', Command);
        11: Log_Memo( 'Logging-Command: ', Command);
      else
        Log_Memo('❓ Unknown command: ', Command);
      end;

      Command := 'suspend';
      case LCaser.Match(Command) of
        1: Log_Memo( 'Service started: ', Command);
        2: Log_Memo( 'Service stopped: ', Command);
        3: Log_Memo( '⏸️Service paused: ', Command);
        4: Log_Memo( '▶️ Service resumed: ', Command);
        10: Log_Memo( '⚙️Config-Command: ', Command);
        11: Log_Memo( 'Logging-Command: ', Command);
      else
        Log_Memo('❓ Unknown command: ', Command);
      end;

      Command := 'go';
      case LCaser.Match(Command) of
        1: Log_Memo( 'Service started: ', Command);
        2: Log_Memo( 'Service stopped: ', Command);
        3: Log_Memo( '⏸️Service paused: ', Command);
        4: Log_Memo( '▶️ Service resumed: ', Command);
        10: Log_Memo( '⚙️Config-Command: ', Command);
        11: Log_Memo( 'Logging-Command: ', Command);
      else
        Log_Memo('❓ Unknown command: ', Command);
      end;

      Command := 'config_database';
      case LCaser.Match(Command) of
        1: Log_Memo('Start-Command: ', Command);
        10: Log_Memo('⚙️Configuration in progress: ', Command);
        11: Log_Memo('Log-Command: ', Command);
      else
        Log_Memo('❓ Unknown Command: ', Command);
      end;
    end;

Edited 1 hour ago by Rollo62
Rollo62 602 Posted 1 hour ago (edited)
Here is the file that belongs to the previous post: CaserTest.zip
Edited 1 hour ago by Rollo62