Leaderboard
Popular Content
Showing content with the highest reputation on 02/06/21 in all areas
-
Micro optimization: String starts with substring
Attila Kovacs replied to Mike Torrettinni's topic in Algorithms, Data Structures and Class Design
@David Heffernan I admire your patience. -
WolfePak Software is looking (Texas & maybe remote)
Mark- replied to Kyle Miller's topic in Job Opportunities / Coder for Hire
HA, not to be harsh, obviously, you have little knowledge of the petrochemical industry. The device you utilized to type in/speak your post includes products from the "Oil" industry. -
Micro optimization: String starts with substring
David Heffernan replied to Mike Torrettinni's topic in Algorithms, Data Structures and Class Design
I don't see any reason to check that. We've long since solved this. -
Micro optimization: String starts with substring
Remy Lebeau replied to Mike Torrettinni's topic in Algorithms, Data Structures and Class Design
Really? Ouch 😨 It used to be much simpler: function TStringHelper.StartsWith(const Value: string): Boolean; begin Result := StartsWith(Value, False); end; function TStringHelper.StartsWith(const Value: string; IgnoreCase: Boolean): Boolean; begin if not IgnoreCase then Result := System.SysUtils.StrLComp(PChar(Self), PChar(Value), Value.Length) = 0 else Result := System.SysUtils.StrLIComp(PChar(Self), PChar(Value), Value.Length) = 0; end; function StrLComp(const Str1, Str2: PWideChar; MaxLen: Cardinal): Integer; var I: Cardinal; P1, P2: PWideChar; begin P1 := Str1; P2 := Str2; I := 0; while I < MaxLen do begin if (P1^ <> P2^) or (P1^ = #0) then Exit(Ord(P1^) - Ord(P2^)); Inc(P1); Inc(P2); Inc(I); end; Result := 0; end; function StrLIComp(const Str1, Str2: PWideChar; MaxLen: Cardinal): Integer; var P1, P2: PWideChar; I: Cardinal; C1, C2: WideChar; begin P1 := Str1; P2 := Str2; I := 0; while I < MaxLen do begin if P1^ in ['a'..'z'] then C1 := WideChar(Word(P1^) xor $20) else C1 := P1^; if P2^ in ['a'..'z'] then C2 := WideChar(Word(P2^) xor $20) else C2 := P2^; if (C1 <> C2) or (C1 = #0) then Exit(Ord(C1) - Ord(C2)); Inc(P1); Inc(P2); Inc(I); end; Result := 0; end; Compared to StartsStr(), which (eventually) calls CompareString(): function StartsStr(const ASubText, AText: string): Boolean; begin Result := AnsiStartsStr(ASubText, AText); end; function AnsiStartsStr(const ASubText, AText: string): Boolean; begin Result := AnsiSameStr(ASubText, Copy(AText, 1, Length(ASubText))); // WHY Copy()? AnsiStrLComp() could have been used instead! end; function AnsiSameStr(const S1, S2: string): Boolean; begin Result := AnsiCompareStr(S1, S2) = 0; end; function AnsiCompareStr(const S1, S2: string): Integer; {$IFDEF MSWINDOWS} begin Result := CompareString(LOCALE_USER_DEFAULT, 0, PChar(S1), Length(S1), PChar(S2), Length(S2)) - CSTR_EQUAL; end; {$ENDIF MSWINDOWS} {$IFDEF POSIX} begin Result := UCS4CompareStr(UnicodeStringToUCS4String(S1), UnicodeStringToUCS4String(S2)); end; {$ENDIF POSIX} I don't have RTL/VCL source code past XE3, but I keep seeing people mention how bad it's getting in recent years. This is not good. -
Micro optimization: String starts with substring
David Heffernan replied to Mike Torrettinni's topic in Algorithms, Data Structures and Class Design
Not if the search string is longer than the other string. Then it's a buffer overrun. Use the version from my post. -
Micro optimization: String starts with substring
David Heffernan replied to Mike Torrettinni's topic in Algorithms, Data Structures and Class Design
{$APPTYPE CONSOLE} uses System.SysUtils, System.Diagnostics, System.StrUtils; function MyStartsWith(const SearchText, Text: string): Boolean; var Index, SearchTextLen: Integer; begin SearchTextLen := Length(SearchText); if SearchTextLen>Length(Text) then begin Result := False; Exit; end; for Index := 1 to SearchTextLen do if Text[Index]<>SearchText[Index] then begin Result := False; Exit; end; Result := True; end; const cMaxLoop = 10000000; cText = 'Error: Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.'; cTextNoHits = 'Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.'; cStartsWithShort = 'E'; cStartsWithLong = 'Error:'; var vSW: TStopWatch; i, vLen: integer; hitCount: Integer; vResult: boolean; begin Writeln('Substring exists at the start:'); // Short string Writeln('Short string:'); hitCount := 0; vSW := TStopWatch.StartNew; for i := 1 to cMaxLoop do if Copy(cText, 1, Length(cStartsWithShort)) = cStartsWithShort then Inc(hitCount); Writeln(' Copy: ' + vSW.ElapsedMilliseconds.ToString, ' ', hitCount); hitCount := 0; vSW := TStopWatch.StartNew; for i := 1 to cMaxLoop do if MidStr(cText, 1, Length(cStartsWithShort)) = cStartsWithShort then Inc(hitCount); Writeln(' MidStr: ' + vSW.ElapsedMilliseconds.ToString, ' ', hitCount); hitCount := 0; vSW := TStopWatch.StartNew; for i := 1 to cMaxLoop do if cText.StartsWith(cStartsWithShort) then Inc(hitCount); Writeln(' StartsWith: ' + vSW.ElapsedMilliseconds.ToString, ' ', hitCount); hitCount := 0; vSW := TStopWatch.StartNew; for i := 1 to cMaxLoop do if StartsStr(cStartsWithShort, cText) then Inc(hitCount); Writeln(' StartsStr: ' + vSW.ElapsedMilliseconds.ToString, ' ', hitCount); hitCount := 0; vSW := TStopWatch.StartNew; for i := 1 to cMaxLoop do if MyStartsWith(cStartsWithShort, cText) then Inc(hitCount); Writeln(' MyStartsWith: ' + vSW.ElapsedMilliseconds.ToString, ' ', hitCount); // Long string Writeln('Long string:'); hitCount := 0; vSW := TStopWatch.StartNew; for i := 1 to cMaxLoop do if Copy(cText, 1, Length(cStartsWithLong)) = cStartsWithLong then Inc(hitCount); Writeln(' Copy: ' + vSW.ElapsedMilliseconds.ToString, ' ', hitCount); hitCount := 0; vSW := TStopWatch.StartNew; for i := 1 to cMaxLoop do if MidStr(cText, 1, Length(cStartsWithLong)) = cStartsWithLong then Inc(hitCount); Writeln(' MidStr: ' + vSW.ElapsedMilliseconds.ToString, ' ', hitCount); hitCount := 0; vSW := TStopWatch.StartNew; for i := 1 to cMaxLoop do if cText.StartsWith(cStartsWithLong) then Inc(hitCount); Writeln(' StartsWith: ' + vSW.ElapsedMilliseconds.ToString, ' ', hitCount); hitCount := 0; vSW := TStopWatch.StartNew; for i := 1 to cMaxLoop do if StartsStr(cStartsWithLong, cText) then Inc(hitCount); Writeln(' StartsStr: ' + vSW.ElapsedMilliseconds.ToString, ' ', hitCount); hitCount := 0; vSW := TStopWatch.StartNew; for i := 1 to cMaxLoop do if MyStartsWith(cStartsWithLong, cText) then Inc(hitCount); Writeln(' MyStartsWith: ' + vSW.ElapsedMilliseconds.ToString, ' ', hitCount); Writeln; // Text DEOS NOT start with selected string Writeln('Substring NOT exists at the start:'); // Short string Writeln('Short string:'); hitCount := 0; vSW := TStopWatch.StartNew; for i := 1 to cMaxLoop do if Copy(cTextNoHits, 1, Length(cStartsWithShort)) = cStartsWithShort then Inc(hitCount); Writeln(' Copy: ' + vSW.ElapsedMilliseconds.ToString, ' ', hitCount); hitCount := 0; vSW := TStopWatch.StartNew; for i := 1 to cMaxLoop do if MidStr(cTextNoHits, 1, Length(cStartsWithShort)) = cStartsWithShort then Inc(hitCount); Writeln(' MidStr: ' + vSW.ElapsedMilliseconds.ToString, ' ', hitCount); hitCount := 0; vSW := TStopWatch.StartNew; for i := 1 to cMaxLoop do if cTextNoHits.StartsWith(cStartsWithShort) then Inc(hitCount); Writeln(' StartsWith: ' + vSW.ElapsedMilliseconds.ToString, ' ', hitCount); hitCount := 0; vSW := TStopWatch.StartNew; for i := 1 to cMaxLoop do if StartsStr(cStartsWithShort, cTextNoHits) then Inc(hitCount); Writeln(' StartsStr: ' + vSW.ElapsedMilliseconds.ToString, ' ', hitCount); hitCount := 0; vSW := TStopWatch.StartNew; for i := 1 to cMaxLoop do if MyStartsWith(cStartsWithShort, cTextNoHits) then Inc(hitCount); Writeln(' MyStartsWith: ' + vSW.ElapsedMilliseconds.ToString, ' ', hitCount); // Long string Writeln('Long string:'); hitCount := 0; vSW := TStopWatch.StartNew; for i := 1 to cMaxLoop do if Copy(cTextNoHits, 1, Length(cStartsWithLong)) = cStartsWithLong then Inc(hitCount); Writeln(' Copy: ' + vSW.ElapsedMilliseconds.ToString, ' ', hitCount); hitCount := 0; vSW := TStopWatch.StartNew; for i := 1 to cMaxLoop do if MidStr(cTextNoHits, 1, Length(cStartsWithLong)) = cStartsWithLong then Inc(hitCount); Writeln(' MidStr: ' + vSW.ElapsedMilliseconds.ToString, ' ', hitCount); hitCount := 0; vSW := TStopWatch.StartNew; for i := 1 to cMaxLoop do if cTextNoHits.StartsWith(cStartsWithLong) then Inc(hitCount); Writeln(' StartsWith: ' + vSW.ElapsedMilliseconds.ToString, ' ', hitCount); hitCount := 0; vSW := TStopWatch.StartNew; for i := 1 to cMaxLoop do if StartsStr(cStartsWithLong, cTextNoHits) then Inc(hitCount); Writeln(' StartsStr: ' + vSW.ElapsedMilliseconds.ToString, ' ', hitCount); hitCount := 0; vSW := TStopWatch.StartNew; for i := 1 to cMaxLoop do if MyStartsWith(cStartsWithLong, cTextNoHits) then Inc(hitCount); Writeln(' MyStartsWith: ' + vSW.ElapsedMilliseconds.ToString, ' ', hitCount); readln; end. Copy is clearly a shocker of an idea for this use. heap allocation!! Really?!!! And the other functions seem really slow. A simple for loop is around 10 times faster. I didn't capture any timings, but it's easy to do it on your machine. I also addressed a couple of issues with your benchmark code. I don't believe this is a bottleneck in your program, but you love premature optimisation with a rarely seen passion. -
Micro optimization: String starts with substring
Remy Lebeau replied to Mike Torrettinni's topic in Algorithms, Data Structures and Class Design
I find that hard to believe. But I have no way to test that myself right now. When Copy() or MidStr() are used to return a partial substring of a larger string, they dynamically allocate a new string and copy characters into it. Your tests are requesting smaller substrings. So there should be allocations being performed before any comparisons can be made. That would take more time. The only way those allocations should not be done is when the entire input string is being returned as-is, in which case the reference count for a non-constant gets incremented, and a constant gets returned as-is (no reference counting). This is even worse for StartsStr(), because it also uses Copy() internally, to chop the input string down to the size of the comparison string, and then compares the result to the comparison string. That is really unnecessary, as TStringHelper.StartsWith() proves. TStringHelper.StartsWith() simply calls SysUtils.StrL(I)Comp(), passing it pointers to the input strings, and the max length to compare. It compares the characters directly in the original memory without making any allocations at all. Which is the way it should be done. So I would have expected TStringHelper.StartsWith() to be much faster than the others, not in the middle. But then, I was looking at an older version (XE3). Maybe things have changed in recent years? Any comparison that avoids having to allocate new strings should be the fastest. -
Fast stable sorting routines
Tom de Neef posted a topic in Algorithms, Data Structures and Class Design
(This is a re-start of another thread) I have published Delphi sorting routines that ought to be similar in speed to TIMsort, for arrays of simple types, strings and objects. The focus has been on large (10M+) real-world situations, where there is already some order in the data. The documentation explains how to use in case you do not want generics. The software is free: https://sourceforge.net/projects/fast-stable-sorting-in-delphi/ It includes a test suite. Stable sorting of objects with QuickSort is possible by adding to the object a tag that can serve as stability key. In that way I compare sorting speeds of my procedures with Tarray.sort<Tobject>. For (semi-)random data the difference is 5-10 times. When there is already order in the data, the improvement may go up to a factor of 20. I chose some optimization parameters on the basis of the test arrays that I generated. It would be nice to know if attractive results are obtained when applied to real-world data. -
https://github.com/DelphiPraxis/DDevExtensions/releases/tag/v2.87 compiled pre-release, not fully tested yet