Mike Torrettinni
Members-
Content Count
1509 -
Joined
-
Last visited
-
Days Won
3
Everything posted by Mike Torrettinni
-
QueryPerformanceCounter precision
Mike Torrettinni replied to Mike Torrettinni's topic in Windows API
Thanks! I guess measuring single execution of short operation has just so many variables dependent on OS, that doesn't make sense. The only reason for Sleep(1) was because this is the only operation I know that you can define the number - I was sure it is a simple: start now, wait 1ms, continue. I guess not. OK, lets see what today brings with more testing. I'm going to try to find the measurement that has the least, but constant, variance(?), probably <5%. While the timing is still below 1ms. -
Multiple string replace - avoid calling StringReplace multiple times
Mike Torrettinni posted a topic in Algorithms, Data Structures and Class Design
I finally have a working version that is faster than calling StringReplace numerous times, but I hope there is a way to make it even faster. So, the purpose of this ReplaceMultiStrings is avoid calling StringRepace multiple times, when we need to replace multiple sub strings, like: s := StringReplace(s, '<', '<'); s := StringReplace(s, '>', '>'); s := StringReplace(s, '&', '&'); Here is description of how my function works: 1. Get all Positions of all substrings 2. Sort Positions, so we make replacements from 1..n one by one (if we don't sort and make replacements randomly within a string, it's a mess...) 3. Loop Positions and make replacements (copy each replacement and parts between replacements into single Result string) I like the performance, but I was hoping that a single point of constructing final string will give much better performance. But of course the finding all Positions and sorting takes time. Here are the example timings: With short strings, the performance is close to 50% faster, but with longer strings the difference is much smaller. Of course there are some limitations if substrings are not unique or substrings are within other substrings. No Case options...and perhaps others. Source code: Any advice is appreciated! -
Multiple string replace - avoid calling StringReplace multiple times
Mike Torrettinni replied to Mike Torrettinni's topic in Algorithms, Data Structures and Class Design
No, I need more general purpose function, normal string is just fine. Here is a real multi pattern where I need performance in my code: ['
', #13#10, '"'] -> ['', #8, '\_/'] [#8, '\_/'] -> [#13#10, '"'] ['XLTT', 'XLA', 'CDR', 'TXE'] -> ['_', '_', '_', '_'] ['Start Event', 'Stop Event', 'Load', 'Call', 'Open'...] <-> ['SE:', 'STE:', 'L:', 'C:', 'O:'...] But even when performance is not really needed and multiple calls to standard StringReplace are good enough, why wouldn't i use one-pass replacement function for examples like these: // escaping characters ['?', !', '[', ']', '(', ')'] -> ['\?', '\!', '\[', '\]', '\(', '\)'] // some html tags ['&', '<', '>', '"', ' ', '<br>', '<br />', '<br/>'] -> ['&', '<', '>', '"', ' ', #13#10, #13#10, #13#10] ['&', '<', '>', '"', ' ', #13#10] -> ['&', '<', '>', '"', ' ', '<br>'] -
Multiple string replace - avoid calling StringReplace multiple times
Mike Torrettinni replied to Mike Torrettinni's topic in Algorithms, Data Structures and Class Design
Pretty impressive, seems to be very fast, Still looking into it! Amazing how you guys can come up with such solutions so fast, while I spent days working on mine 😉 -
Multiple string replace - avoid calling StringReplace multiple times
Mike Torrettinni replied to Mike Torrettinni's topic in Algorithms, Data Structures and Class Design
That is expected, different implementation could have different results. I looked into some open source projects and I couldn't find any example where StringReplace(s, 'somestring', '', [rfReplaceAll]) would be used in while Pos('somestring', s) > 0. So this example would fail on single StringReplace: s := StringReplace('s, 'delphi', '', []),if the purpose is to remove 'delphi' from string: 'This is deldeldelphiphiphi example'. You can also call MultiStringReplace mutliple consecutive times, if needed. The purpose is to have fast single pass replacements, this means some limitation apply. In most cases you know the type of input you work on. If it's possible to have 'nested/hidden' substrings that will show up after this function, than you should use different approach. -
Multiple string replace - avoid calling StringReplace multiple times
Mike Torrettinni replied to Mike Torrettinni's topic in Algorithms, Data Structures and Class Design
Ok, yes I know that. But I'm looking for performant example. If you have a complete example, I would like to try it out. -
Multiple string replace - avoid calling StringReplace multiple times
Mike Torrettinni replied to Mike Torrettinni's topic in Algorithms, Data Structures and Class Design
Hm, I probably don't understand correctly. Yes, CopyTo is like Move, but how do you delete chars? -
Multiple string replace - avoid calling StringReplace multiple times
Mike Torrettinni replied to Mike Torrettinni's topic in Algorithms, Data Structures and Class Design
Correct, but this is only for methods where you do one-pass over string first and count substrings, otherwise you don't know how many substrings there are, right? -
Multiple string replace - avoid calling StringReplace multiple times
Mike Torrettinni replied to Mike Torrettinni's topic in Algorithms, Data Structures and Class Design
Thank you, but it doesn't pass the RunTestCases. Just include the check for new functions, like: var sHim: string; ... sHim := Him_StringReplace(aTestCases[i].Str, aTestCases[i].OldPatterns, aTestCases[i].NewPatterns); ... if s <> sHim then raise Exception.Create('Not equal results: ' + sLineBreak + s + sLineBreak + sHim); It seems it only replaces first strings. -
Multiple string replace - avoid calling StringReplace multiple times
Mike Torrettinni replied to Mike Torrettinni's topic in Algorithms, Data Structures and Class Design
I didn't initially test TStringBuilder because I don't have any experience with it and also I remember reading that it was very good solution in early versions of Delphi, then they optimized everything else and it became not so performant in latest versions. But if it would be faster or the fastest, I would use it! I guess I forgot the point I wanted to make: because of limited experience with TStringBuilder, there's a high chance I don't use it correctly or most efficiently. -
Multiple string replace - avoid calling StringReplace multiple times
Mike Torrettinni replied to Mike Torrettinni's topic in Algorithms, Data Structures and Class Design
Seems that when code is in SPOILER tags could be copied wrong. Here is latest source: program Project1; {$APPTYPE CONSOLE} {$R *.res} uses System.SysUtils, System.Generics.Collections, System.Generics.Defaults, System.Diagnostics; type // Positions TSubStrPos = record Pos : integer; StrIdx: integer; end; // Test cases TTestCase = record Str: string; OldPatterns, NewPatterns: TArray<string>; end; TTestCases = TArray<TTestCase>; // Custom comparer initialized 1x only TCustomComparer = class(TInterfacedObject, IComparer<TSubStrPos>) public function Compare(const Left, Right: TSubStrPos): Integer; end; const cMaxLoop = 1000000; cShortA : string = 'abc < 1 > 2'; cMediumA : string = 'medium < text medium > text medium <> text medium text >'; cLongA : string = '<Lorem ipsum sit >amet, <consectetur adipiscing. Integer sagittis dolor a aliquam. >< Vestibulum et purus in dolor consectetur tempor a eget est. Curabitur sit amet erat imperdiet, <scelerisque >suscipit nisl. <Vestibulum velit.'; cShortB : string = 'testing and testing'; cMediumB : string = 'testing and testing and another testing and yet another testing of the test'; cLongB : string = 'Lorem te ipsum ingdolor sit stamet, consectetur adipiingiscing elitesting. Morbi bibendum, mauris Vestibulum vulputate, mauris neque pretium terisus, maximus odio ipsum vel est. Integer sollicitudin massa sit amet semper condimentum. rutrumingte'; cHTML: string = '<html><body>This is sample text for StringReplace test.' + 'Lorem <ipsum dolor sit amet,& consectetur adipiscing elit. Integer feugiat ullamcorper &ligula, at &vehicula turpis volutpat et. Praesent in sapien vel tortor pulvinar ullamcorper vel et felis. >Praesent mattis efficitur ligula, sit amet' + 'venenatis felis. Mauris dui mauris, tempus in turpis at, eleifend tincidunt arcu. Morbi nec tincidunt felis, non viverra nisi. Nulla facilisi.< Donec iaculis, elit >non hendrerit vehicula, urna dui efficitur &quolectus, in tempus metus est' + 'sed sem. <Vivamus eleifend euismod orci a volutpat. Ut sed ligula sed sem bibendum facilisis. Morbi gravida neque <nec odio lacinia, at tincidunt &nulla dignissim. Vestibulum vitae >nisi accumsan, dapibus lorem non, &quoconvallis turpis.' + 'Pellentesque imp&erdiet malesuada ipsum vel commodo. Nam dignissim< luctus augue, nec maximus felis vehicula eget. Phasellus ultrices lacus <ac elit dapibus &quopellentesque. Morbi vitae eros in turpis> hendrerit volutpat. Cras euismod' + ' dolor in scelerisque. Donec &id tincidunt nisi. Praesent ut nunc luctus, pulvinar <mauris ut, hend>rerit nulla. Praesent quis varius lorem.&' + '</body></html>'; cSpecialChars : string = 'Testing special char replacemenets: [Lorem te (ipsum) ing(((d)))olo/r -s+it st\\amet, co_!?nse?ctetur adipi-i!ng?isc/ing +(el_)i[_!?]testing.]'; var xComparer: IComparer<TSubStrPos>; function TCustomComparer.Compare(const Left, Right: TSubStrPos): Integer; begin Result := Left.Pos - Right.Pos; end; procedure GetSubStrPositions7(const aStr: string; const aOldPatterns, aNewPatterns: array of string; var aPositions: TArray<TSubStrPos>; var aFinalLength: integer); var vPos, i, vDiff, vIdx: integer; begin aFinalLength := aStr.Length; SetLength(aPositions, 32); vIdx := 0; for i := Low(aOldPatterns) to High(aOldPatterns) do begin vPos := Pos(aOldPatterns[i], aStr); vDiff := aNewPatterns[i].Length - aOldPatterns[i].Length; // calculate diff of replacements, to adjust aFinalLength while vPos <> 0 do begin if vIdx > High(aPositions) then SetLength(aPositions, Length(aPositions) * 2); aPositions[vIdx].Pos := vPos; aPositions[vIdx].StrIdx := i; Inc(vIdx); aFinalLength := aFinalLength + vDiff; vPos := Pos(aOldPatterns[i], aStr, vPos + 1); end; end; SetLength(aPositions, vIdx); end; function ReplaceMultiStringsByPositions7(const aStr: string; const aPositions: TArray<TSubStrPos>; const aOldPatterns, aNewPatterns: array of string; aResultLen: integer): string; var i, vResPos, vStrPos: Integer; begin SetLength(Result, aResultLen); // move substrings by index + between indexes, if any vResPos := 1; vStrPos := 1; for i := Low(aPositions) to High(aPositions) do begin // Copy free text between last Pos and current Pos if aPositions[i].Pos > vStrPos then begin Move(aStr[vStrPos], Result[vResPos], (aPositions[i].Pos - vStrPos)*SizeOf(Char)); Inc(vResPos, aPositions[i].Pos - vStrPos); end; // copy New str Move(aNewPatterns[aPositions[i].StrIdx][1], Result[vResPos], aNewPatterns[aPositions[i].StrIdx].Length * SizeOf(Char)); // move Pos by New str len Inc(vResPos, aNewPatterns[aPositions[i].StrIdx].Length); // move aStr Pos vStrPos := aPositions[i].Pos + aOldPatterns[aPositions[i].StrIdx].Length; end; // copy any left after last replacement if vStrPos < aStr.Length then Move(aStr[vStrPos], Result[vResPos], (aStr.Length - vStrPos + 1) * SizeOf(Char)); end; procedure SortPositions(var aPositions: TArray<TSubStrPos>); begin if xComparer = nil then xComparer := TCustomComparer.Create; // initialize Comparer TArray.Sort<TSubStrPos>(aPositions, xComparer); end; procedure InsertionSort(var A: TArray<TSubStrPos>); var i, j: Integer; tmp: TSubStrPos; begin for i:= 1 to high(A) do begin j:= i; tmp := A[i]; while (j > 0) and (A[j-1].Pos > tmp.Pos) do begin A[j]:= A[j-1]; Dec(j); end; A[j]:= tmp; end; end; function ReplaceMultiStrings7(const aStr: string; const aOldPatterns, aNewPatterns: array of string): string; var vPositions: TArray<TSubStrPos>; vResultLen: integer; // vPositions2: TArray<TSubStrPos>; begin // Get Positions for all string replacements GetSubStrPositions7(aStr, aOldPatterns, aNewPatterns, vPositions, vResultLen); //GetSubStrPositions7(aStr, aOldPatterns, aNewPatterns, vPositions2, vResultLen); // Sort Positions so the replacements go from first to last pos //SortPositions(vPositions); InsertionSort(vPositions); // Insertion sort is faster in these situation because of small number of items to sort // Replace by indexes Result := ReplaceMultiStringsByPositions7(aStr, vPositions, aOldPatterns, aNewPatterns, vResultLen); end; { Standard StringReplace used multiple times } function StringReplaceAll(const aStr: string; const aOldPatterns, aNewPatterns: array of string): string; var i: Integer; begin Result := aStr; for i := Low(aOldPatterns) to High(aOldPatterns) do Result := StringReplace(Result, aOldPatterns[i], aNewPatterns[i], [rfReplaceAll]); end; function MultiStrReplace3(const aStr: string; const aOldPatterns, aNewPatterns: array of string): string; var vCharPos: integer; vStrLen: integer; i, j, vStrPos: integer; vMatched: boolean; vResPos, a, b: integer; begin vStrLen := Length(aStr); if vStrLen > 0 then begin SetLength(Result, 2 * vStrLen); // pre-allocate Result vStrPos := 1; vCharPos := 1; vResPos := 1; a := Low(aOldPatterns); b := High(aOldPatterns); while vCharPos <= vStrLen do begin // find next match for i := a to b do begin vMatched := false; if (aStr[vCharPos] = aOldPatterns[i][1]) then begin vMatched := True; for j := 1 to aOldPatterns[i].Length - 1 do if aStr[vCharPos + j] <> aOldPatterns[i][j+1] then begin vMatched := false; Break; end; end; // replace matched string if vMatched then begin // add original string up to this point, if needed Move(aStr[vStrPos], Result[vResPos], (vCharPos - vStrPos)*SizeOf(Char)); Inc(vResPos, vCharPos - vStrPos); // now add new string Move(aNewPatterns[i][1], Result[vResPos], (aNewPatterns[i].Length)*SizeOf(Char)); Inc(vResPos, aNewPatterns[i].Length); // move pos past the matched string Inc(vCharPos, aOldPatterns[i].Length); vStrPos := vCharPos; Break; end; end; if not vMatched then inc(vCharPos) else vMatched := false; end; // add end of string Move(aStr[vStrPos], Result[vResPos], (aStr.Length - vStrPos +1)*SizeOf(Char)); SetLength(Result, vResPos + (aStr.Length - vStrPos)); end; end; function MultiStringReplaceSB(const aStr: string; const aOldPatterns, aNewPatterns: array of string): string; var vSB: TStringBuilder; vCharPos: integer; vStrLen: integer; i, j, vStrPos: integer; vMatched: boolean; a, b: integer; begin vStrLen := Length(aStr); if vStrLen > 0 then begin vSB := TStringBuilder.Create( 2 * Length(aStr)); try vStrPos := 1; vCharPos := 1; a := Low(aOldPatterns); b := High(aOldPatterns); while vCharPos <= vStrLen do begin // find next match for i := a to b do begin vMatched := false; if (aStr[vCharPos] = aOldPatterns[i][1]) then begin vMatched := True; for j := 1 to aOldPatterns[i].Length - 1 do if aStr[vCharPos + j] <> aOldPatterns[i][j+1] then begin vMatched := false; Break; end; end; // replace matched string if vMatched then begin // add original string up to this point, if needed vSB.Append(Copy(aStr, vStrPos, vCharPos - vStrPos)); // now add new string vSB.Append(aNewPatterns[i]); // move pos past the matched string Inc(vCharPos, aOldPatterns[i].Length); vStrPos := vCharPos; Break; end; end; if not vMatched then inc(vCharPos) else vMatched := false; end; // add end of string vSB.Append(Copy(aStr, vStrPos, aStr.Length - vStrPos +1)); Result := vSB.ToString; finally vSB.Free; end; end; end; { Test cases } procedure AddTestCase(var aTestCases: TTestCases; const aStr: string; const aOldPatterns, aNewPatterns: TArray<string>); var vTestCase: TTestCase; begin vTestCase.Str := aStr; vTestCase.OldPatterns := aOldPatterns; vTestCase.NewPatterns := aNewPatterns; aTestCases := aTestCases + [vTestCase]; end; procedure RunTestCases(const aTestCases: TTestCases); var i: Integer; s, s7, s3, sSB: string; begin for i := Low(aTestCases) to High(aTestCases) do begin s := StringReplaceAll(aTestCases[i].Str, aTestCases[i].OldPatterns, aTestCases[i].NewPatterns); s7 := ReplaceMultiStrings7(aTestCases[i].Str, aTestCases[i].OldPatterns, aTestCases[i].NewPatterns); s3 := MultiStrReplace3(aTestCases[i].Str, aTestCases[i].OldPatterns, aTestCases[i].NewPatterns); sSB := MultiStringReplaceSB(aTestCases[i].Str, aTestCases[i].OldPatterns, aTestCases[i].NewPatterns); if (s <> s7) then raise Exception.Create('Not equal results: ' + sLineBreak + s + sLineBreak + s7); if (s <> s3) or (s <> sSB) then raise Exception.Create('Not equal results: ' + sLineBreak + s + sLineBreak + s3 + sLineBreak + sSB ); end; end; procedure DoTheTiming(const aCaption, aStr: string; const aOldPatterns, aNewPatterns: array of string); var vSW:TStopWatch; i: integer; str: string; begin Writeln(aCaption); // Standard StringReplace vSW := TStopWatch.StartNew; for i := 1 to cMaxLoop do begin str := ''; str := StringReplaceAll(aStr, aOldPatterns, aNewPatterns); end; Writeln('StringReplaceAll: ' + vSW.ElapsedMilliseconds.ToString); // ReplaceMultiStrings7 vSW := TStopWatch.StartNew; for i := 1 to cMaxLoop do begin str := ''; str := ReplaceMultiStrings7(aStr, aOldPatterns, aNewPatterns); end; Writeln('ReplaceMultiStrings7: ' + vSW.ElapsedMilliseconds.ToString); // MultiStrReplace3 vSW := TStopWatch.StartNew; for i := 1 to cMaxLoop do begin str := ''; str := MultiStrReplace3(aStr, aOldPatterns, aNewPatterns); end; Writeln('MultiStrReplace3: ' + vSW.ElapsedMilliseconds.ToString); // MultiStringReplaceSB vSW := TStopWatch.StartNew; for i := 1 to cMaxLoop do begin str := ''; str := MultiStringReplaceSB(aStr, aOldPatterns, aNewPatterns); end; Writeln('MultiStringReplaceSB: ' + vSW.ElapsedMilliseconds.ToString); Writeln; end; var vTestCases: TTestCases; begin // make sure string replacement is correct! AddTestCase(vTestCases, cShortA, ['<', '>'], ['<', '>']); AddTestCase(vTestCases, cMediumA, ['<', '>'], ['<', '>']); AddTestCase(vTestCases, cLongA, ['<', '>'], ['<', '>']); AddTestCase(vTestCases, cShortB, ['te', 'st', 'ing', 'and'], ['ab', 'cd', 'ef', 'AND']); AddTestCase(vTestCases, cShortB, ['te'], ['AND']); AddTestCase(vTestCases, cShortB, ['and'], ['ab']); AddTestCase(vTestCases, cMediumB, ['te', 'st', 'ing', 'and'], ['ab', 'cd', 'ef', 'AND']); AddTestCase(vTestCases, cMediumB, ['te'], ['AND']); AddTestCase(vTestCases, cMediumB, ['and'], ['ab']); AddTestCase(vTestCases, cLongB, ['te', 'st', 'ing', 'and'], ['ab', 'cd', 'ef', 'AND']); AddTestCase(vTestCases, cLongB, ['te'], ['AND']); AddTestCase(vTestCases, cLongB, ['and'], ['ab']); AddTestCase(vTestCases, cHTML, ['<', '>', '&', '"'], ['<', '>', '&', '"']); AddTestCase(vTestCases, cSpecialChars, ['\', '/', '!', '?', '[', ']', '_', '(', ')', '-', '+'], ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k']); // Examples of different outcome between calling StringReplace multiple times vs one-time replace all substrings //AddTestCase(vTestCases, 'ReplaceReplaceString', ['Replace', 'FooString'], ['Foo', 'Bar']); RunTestCases(vTestCases); // DoTheTiming('Short str A:', cShortA, ['<', '>'], ['<', '>']); // DoTheTiming('Medium str A:', cMediumA, ['<', '>'], ['<', '>']); // DoTheTiming('Long str A:', cLongA, ['<', '>'], ['<', '>']); DoTheTiming('Short str B:', cShortB, ['te', 'st', 'ing', 'and'], ['ab', 'cd', 'ef', 'AND']); DoTheTiming('Medium str B:', cMediumB, ['te', 'st', 'ing', 'and'], ['ab', 'cd', 'ef', 'AND']); DoTheTiming('Long str B:', cLongB, ['te', 'st', 'ing', 'and'], ['ab', 'cd', 'ef', 'AND']); DoTheTiming('HTML:', cHTML, ['<', '>', '&', '"'], ['<', '>', '&', '"']); DoTheTiming('Special Chars:', cSpecialChars, ['\', '/', '!', '?', '[', ']', '_', '(', ')', '-', '+'], ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k']); Writeln('done...'); Readln; end. -
Multiple string replace - avoid calling StringReplace multiple times
Mike Torrettinni replied to Mike Torrettinni's topic in Algorithms, Data Structures and Class Design
Delete. -
Multiple string replace - avoid calling StringReplace multiple times
Mike Torrettinni replied to Mike Torrettinni's topic in Algorithms, Data Structures and Class Design
I use 10.2.3, so I can't use anything 'newer' than 10.3. Where does it fail? -
Multiple string replace - avoid calling StringReplace multiple times
Mike Torrettinni replied to Mike Torrettinni's topic in Algorithms, Data Structures and Class Design
I didn't know about CopyTo, thanks! Good example, but fails when old and new substrings are of different length, right? -
Multiple string replace - avoid calling StringReplace multiple times
Mike Torrettinni replied to Mike Torrettinni's topic in Algorithms, Data Structures and Class Design
@Dalija Prasnikar I have 2 additional versions: MultiStrReplace3 - which loops string char by char and compares each substring. The problem here is that I don't precalculate final lenght of Result, so it's a guess. From results we can see is fastest for really short strings, but way slower with medium or long examples. MultiStringReplaceSB - example using TStringBuilder and is slowest in all examples. Of course there is quite high possibility I have made a mistake and one of the new versions could be best. @Stefan Glienke I implemented InsertionSort and you are correct, it is faster. If you have more details on how you implemented static array that would make it even faster, I would like to know since my implementation of static array didn't prove successful. Updated source: -
Multiple string replace - avoid calling StringReplace multiple times
Mike Torrettinni replied to Mike Torrettinni's topic in Algorithms, Data Structures and Class Design
I think Delphi's standard StringReplace is based on implementation from FastStrings, already, but there's a lot of tricks that don't apply to unicode anymore. But if you have specific idea I should look at, please let me know. -
Multiple string replace - avoid calling StringReplace multiple times
Mike Torrettinni replied to Mike Torrettinni's topic in Algorithms, Data Structures and Class Design
Not a bug, but a fact that StringReplace replaces substrings sequentially not all in one step, like my function. Multiple calls in StringReplace go like this: 1. s := StringReplace('ReplaceReplaceString', 'Replace', 'Foo'); -> s = 'FooFooString' 2. s := StringReplace(s, 'FooString', 'Bar'); -> s = 'FooBar' my function: s := ReplaceMultiStrings7('ReplaceReplaceString', ['Replace', 'FooString'], ['Foo', 'Bar']) ->no 'Foo' found in original string, so no replacement, except 'Replace' is replaced with 'Foo' -> s = 'FooFooString' This test raises error because they are different results, but they are both correct results, just this is rare cases where both functions works correct and produce different results. I mean, you can't expect to replace 'Foo' if there is no 'Foo' in original string. Am I wrong in this thinking, does the explanation make sense or am I just trying to justify that my function works correctly? -
Multiple string replace - avoid calling StringReplace multiple times
Mike Torrettinni replied to Mike Torrettinni's topic in Algorithms, Data Structures and Class Design
I tried that, for each char in string loop all substrings if any match. It was slower, perhaps because it was Pos is using asm. Will have another look at this example and will provide a code. Initially I had static array [0..1000] and stored all positions of substrings. But looping and finding if that position is > 0 proved to be bottleneck, even when I had MaxIndex (length of string) < 1000 and was looping from 0 to MaxIndex. I tried many options and I could have made a mistake with previous versions, but this one proved to be the only one faster, the rest were really fast on short and much slower on longer strings. Will try InsertionSort. -
Why is TArray.BinarySearch slower than normal binary search function?
Mike Torrettinni posted a topic in Algorithms, Data Structures and Class Design
My simple test shows TArray.BinarySearch is about 6x slower than custom binary search function: GetName_TArrayBinarySearch: 858 GetName_CustomBinarySearch: 131 I was expecting similar timings. Is this just a fact or did I set up this example wrongly: program Project1; {$APPTYPE CONSOLE} {$R *.res} uses System.SysUtils, System.Generics.Collections, System.Generics.Defaults, System.Math, System.Diagnostics; type TDataLine = record SeqNo: integer; CustomName: string; end; const cNOfORecords = 1000000; var xData: TArray<TDataLine>; procedure PrepData; var i: Integer; begin SetLength(xData, cNOfORecords); for i := 0 to cNofORecords do begin xData[i].SeqNo := i + 1; xData[i].CustomName := i.ToString; end; end; function GetName_TArrayBinarySearch(const aData: TArray<TDataLine>; const aItem: TDataLine): string; var vIdx: integer; begin if TArray.BinarySearch<TDataLine>(aData, aItem, vIdx, TComparer<TDataLine>.Construct( function (const Left, Right: TDataLine): Integer begin Result := CompareValue(Left.SeqNo, Right.SeqNo); end )) then Result := aData[vIdx].CustomName; end; function GetName_CustomBinarySearch(const aData: TArray<TDataLine>; const aItem: TDataLine): string; var L, H, i, c: Integer; begin Result := ''; L := 0; H := High(aData); while L <= H do begin i := L + (H - L) shr 1; c := CompareValue(aData[i].SeqNo, aItem.SeqNo); if c < 0 then L := i + 1 else begin if c = 0 then Exit(aData[i].CustomName); H := i - 1; end; end; end; var vDataLine: TDataLine; vName: string; i: integer; vSW: TStopWatch; begin PrepData; // prepares data, sorted by integer vSW := TStopwatch.StartNew; for i := 1 to cNOfORecords do begin vDataLine.SeqNo := i; vName := GetName_TArrayBinarySearch(xData, vDataLine); end; writeln('GetName_TArrayBinarySearch: ' + vSW.ElapsedMilliseconds.ToString); vSW := TStopwatch.StartNew; for i := 1 to cNOfORecords do begin vDataLine.SeqNo := i; vName := GetName_CustomBinarySearch(xData, vDataLine); end; writeln('GetName_CustomBinarySearch: ' + vSW.ElapsedMilliseconds.ToString); readln; end. Any advice appreciated, thanks! -
Why is TArray.BinarySearch slower than normal binary search function?
Mike Torrettinni replied to Mike Torrettinni's topic in Algorithms, Data Structures and Class Design
Solution: Here is example of code where TArray.BinarySearch uses custom Comparer, initialized once only! Interesting that 64bit results are much faster w/o custom comparer, but of course initializing custom comparer once and using it is better: 32bit: 64bit: Source: Thanks @Stefan Glienke and @balabuev -
Fast lookup tables - TArray.BinarySearch vs Dictionary vs binary search
Mike Torrettinni replied to Mike Torrettinni's topic in Algorithms, Data Structures and Class Design
@balabuev is this something you came up with on the spot, or you are using something similar or is from another library? This is not licensed, right, I can use in commercial software? -
Fast lookup tables - TArray.BinarySearch vs Dictionary vs binary search
Mike Torrettinni replied to Mike Torrettinni's topic in Algorithms, Data Structures and Class Design
Some very impressive numbers, x32 and x64! -
Micro optimization: use Pos before StringReplace
Mike Torrettinni posted a topic in Algorithms, Data Structures and Class Design
I usually use Pos before StringReplace to test if StringReplace needs to be called at all, like this: if Pos(vFind, vTest) > 0 then vNew := StringReplace(vTest, vFind, vReplace, [rfReplaceAll]); And I found interesting comment in one of my old questions, 7th comment to question: I believe the fist step would be to remove all the If Pos(...) > 0 then - because StringReplace already does it and thus those checks give nothing but a redundant extra scanning of the string. (https://stackoverflow.com/questions/34769953/how-to-improve-multiple-stringreplace-calls) So, since this is comment from an expert, of course I take it as a valid comment, but wanted to run a few tests to confirm his suggestion (or ignore it). Quick test shows these timings in ms: Short string Find + Pos: 1362 Short string Find - Pos: 1265 Short string NO Find + Pos: 98 Short string NO Find - Pos: 714 Long string Find + Pos: 2547 Long string Find - Pos: 2291 Long string NO Find + Pos: 647 Long string NO Find - Pos: 1457 It shows that Pos does add a little time when substring is found, but it is significantly slower not using Pos when substring is not found. Is my test correct or am I missing something obvious? program Project1; {$APPTYPE CONSOLE} {$R *.res} uses System.SysUtils, System.Diagnostics; const cMaxLoop = 10000000; cShortStr = 'Testing string magna.'; cLongStr = 'Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.'; cFind = 'mag'; cNoFind = 'mga'; cReplace = 'X'; var vNewStr: string; vSW: TStopWatch; i: Integer; procedure TestShort_Find; begin vSW := TStopWatch.StartNew; for i := 1 to cMaxLoop do if Pos(cFind, cShortStr) > 0 then vNewStr := StringReplace(cShortStr, cFind, cReplace, [rfReplaceAll]); Writeln('Short string Find + Pos: ' + vSW.ElapsedMilliseconds.ToString); vSW := TStopWatch.StartNew; for i := 1 to cMaxLoop do vNewStr := StringReplace(cShortStr, cFind, cReplace, [rfReplaceAll]); Writeln('Short string Find - Pos: ' + vSW.ElapsedMilliseconds.ToString); end; procedure TestShort_NoFind; begin vSW := TStopWatch.StartNew; for i := 1 to cMaxLoop do if Pos(cNoFind, cShortStr) > 0 then vNewStr := StringReplace(cShortStr, cNoFind, cReplace, [rfReplaceAll]); Writeln('Short string NO Find + Pos: ' + vSW.ElapsedMilliseconds.ToString); vSW := TStopWatch.StartNew; for i := 1 to cMaxLoop do vNewStr := StringReplace(cShortStr, cNoFind, cReplace, [rfReplaceAll]); Writeln('Short string NO Find - Pos: ' + vSW.ElapsedMilliseconds.ToString); end; procedure TestLong_Find; begin vSW := TStopWatch.StartNew; for i := 1 to cMaxLoop do if Pos(cFind, cLongStr) > 0 then vNewStr := StringReplace(cLongStr, cFind, cReplace, [rfReplaceAll]); Writeln('Long string Find + Pos: ' + vSW.ElapsedMilliseconds.ToString); vSW := TStopWatch.StartNew; for i := 1 to cMaxLoop do vNewStr := StringReplace(cLongStr, cFind, cReplace, [rfReplaceAll]); Writeln('Long string Find - Pos: ' + vSW.ElapsedMilliseconds.ToString); end; procedure TestLong_NoFind; begin vSW := TStopWatch.StartNew; for i := 1 to cMaxLoop do if Pos(cNoFind, cLongStr) > 0 then vNewStr := StringReplace(cLongStr, cNoFind, cReplace, [rfReplaceAll]); Writeln('Long string NO Find + Pos: ' + vSW.ElapsedMilliseconds.ToString); vSW := TStopWatch.StartNew; for i := 1 to cMaxLoop do vNewStr := StringReplace(cLongStr, cNoFind, cReplace, [rfReplaceAll]); Writeln('Long string NO Find - Pos: ' + vSW.ElapsedMilliseconds.ToString); end; begin TestShort_Find; TestShort_NoFind; TestLong_Find; TestLong_NoFind; Readln; end. -
Micro optimization: use Pos before StringReplace
Mike Torrettinni replied to Mike Torrettinni's topic in Algorithms, Data Structures and Class Design
That's why is good to benchmark a version you are migrating to, in case the important project performance gets better or worse. Will do the same when moving from 10.2.3 to 10.5. -
Fast lookup tables - TArray.BinarySearch vs Dictionary vs binary search
Mike Torrettinni replied to Mike Torrettinni's topic in Algorithms, Data Structures and Class Design
Thanks, very good results! If you wanna share the code, please do.