Mike Torrettinni

Micro optimization: String starts with substring


I use Copy and MidStr a lot, and I wanted to test which one is the better way to check whether a string starts with a given substring, so I can refactor my code to use just one function.

There are also the String.StartsWith helper and the StartsStr function. So I ran some simple tests, and it seems that Copy is the fastest option:

 

I noticed this question on SO: Difference between System.copy and StrUtils.MidStr https://stackoverflow.com/q/13411139, which explains that MidStr uses Copy (I checked, and in Delphi 10.2.3 it does), so it makes sense that Copy is slightly faster.

 

Substring exists at the start:
Short string:
  Copy: 166 // WINNER
  MidStr: 184
  StartsWith: 511
  StartsStr: 792
Long string:
  Copy: 179  // WINNER
  MidStr: 190
  StartsWith: 524
  StartsStr: 840

Substring NOT exists at the start:
Short string:
  Copy: 169  // WINNER
  MidStr: 180
  StartsWith: 514
  StartsStr: 886
Long string:
  Copy: 170  // WINNER
  MidStr: 182
  StartsWith: 517
  StartsStr: 897

 

Does anybody know of any faster option to check if string starts with substring?

 

 

 

program Project1;

{$APPTYPE CONSOLE}

{$R *.res}

uses
  System.SysUtils,
  System.Diagnostics,
  System.StrUtils;

const
  cMaxLoop = 10000000;
  cText = 'Error: Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.';
  cTextNoHits = 'Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.';
  cStartsWithShort = 'E';
  cStartsWithLong = 'Error:';
var vSW: TStopWatch;
    i, vLen: integer;
    vResult: boolean;
begin
  Writeln('Substring exists at the start:');

  // Short string
  Writeln('Short string:');

  vSW := TStopWatch.StartNew;
  for i := 1 to cMaxLoop do
    vResult := Copy(cText, 1, 1) = cStartsWithShort;
  Writeln('  Copy: ' + vSW.ElapsedMilliseconds.ToString);

  vSW := TStopWatch.StartNew;
  for i := 1 to cMaxLoop do
    vResult := MidStr(cText, 1, 1) = cStartsWithShort;
  Writeln('  MidStr: ' + vSW.ElapsedMilliseconds.ToString);

  vSW := TStopWatch.StartNew;
  for i := 1 to cMaxLoop do
    vResult := cText.StartsWith(cStartsWithShort);
  Writeln('  StartsWith: ' + vSW.ElapsedMilliseconds.ToString);

  vSW := TStopWatch.StartNew;
  for i := 1 to cMaxLoop do
    vResult :=  StartsStr(cStartsWithShort, cText);
  Writeln('  StartsStr: ' + vSW.ElapsedMilliseconds.ToString);

  // Long string
  Writeln('Long string:');

  vLen := Length(cStartsWithLong);
  vSW := TStopWatch.StartNew;
  for i := 1 to cMaxLoop do
    vResult := Copy(cText, 1, vLen) = cStartsWithLong;
  Writeln('  Copy: ' + vSW.ElapsedMilliseconds.ToString);

  vSW := TStopWatch.StartNew;
  for i := 1 to cMaxLoop do
    vResult := MidStr(cText, 1, vLen) = cStartsWithLong;
  Writeln('  MidStr: ' + vSW.ElapsedMilliseconds.ToString);

  vSW := TStopWatch.StartNew;
  for i := 1 to cMaxLoop do
    vResult := cText.StartsWith(cStartsWithLong);
  Writeln('  StartsWith: ' + vSW.ElapsedMilliseconds.ToString);

  vSW := TStopWatch.StartNew;
  for i := 1 to cMaxLoop do
    vResult :=  StartsStr(cStartsWithLong, cText);
  Writeln('  StartsStr: ' + vSW.ElapsedMilliseconds.ToString);

  Writeln;
  // Text DOES NOT start with selected string
  Writeln('Substring NOT exists at the start:');
  // Short string
  Writeln('Short string:');

  vSW := TStopWatch.StartNew;
  for i := 1 to cMaxLoop do
    vResult := Copy(cTextNoHits, 1, 1) = cStartsWithShort;
  Writeln('  Copy: ' + vSW.ElapsedMilliseconds.ToString);

  vSW := TStopWatch.StartNew;
  for i := 1 to cMaxLoop do
    vResult := MidStr(cTextNoHits, 1, 1) = cStartsWithShort;
  Writeln('  MidStr: ' + vSW.ElapsedMilliseconds.ToString);

  vSW := TStopWatch.StartNew;
  for i := 1 to cMaxLoop do
    vResult := cTextNoHits.StartsWith(cStartsWithShort);
  Writeln('  StartsWith: ' + vSW.ElapsedMilliseconds.ToString);

  vSW := TStopWatch.StartNew;
  for i := 1 to cMaxLoop do
    vResult :=  StartsStr(cStartsWithShort, cTextNoHits);
  Writeln('  StartsStr: ' + vSW.ElapsedMilliseconds.ToString);

  // Long string
  Writeln('Long string:');

  vLen := Length(cStartsWithLong);
  vSW := TStopWatch.StartNew;
  for i := 1 to cMaxLoop do
    vResult := Copy(cTextNoHits, 1, vLen) = cStartsWithLong;
  Writeln('  Copy: ' + vSW.ElapsedMilliseconds.ToString);

  vSW := TStopWatch.StartNew;
  for i := 1 to cMaxLoop do
    vResult := MidStr(cTextNoHits, 1, vLen) = cStartsWithLong;
  Writeln('  MidStr: ' + vSW.ElapsedMilliseconds.ToString);

  vSW := TStopWatch.StartNew;
  for i := 1 to cMaxLoop do
    vResult := cTextNoHits.StartsWith(cStartsWithLong);
  Writeln('  StartsWith: ' + vSW.ElapsedMilliseconds.ToString);

  vSW := TStopWatch.StartNew;
  for i := 1 to cMaxLoop do
    vResult :=  StartsStr(cStartsWithLong, cTextNoHits);
  Writeln('  StartsStr: ' + vSW.ElapsedMilliseconds.ToString);


  readln;


end.

 

1 hour ago, Mike Torrettinni said:

So I ran some simple tests, and it seems that Copy is the fastest option:

I find that hard to believe.  But I have no way to test that myself right now.

 

When Copy() or MidStr() are used to return a partial substring of a larger string, they dynamically allocate a new string and copy characters into it.  Your tests are requesting smaller substrings.  So there should be allocations being performed before any comparisons can be made.  That would take more time.  The only way those allocations should not be done is when the entire input string is being returned as-is, in which case the reference count for a non-constant gets incremented, and a constant gets returned as-is (no reference counting).

 

This is even worse for StartsStr(), because it also uses Copy() internally, to chop the input string down to the size of the comparison string, and then compares the result to the comparison string.  That is really unnecessary, as TStringHelper.StartsWith() proves.

 

TStringHelper.StartsWith() simply calls SysUtils.StrL(I)Comp(), passing it pointers to the input strings, and the max length to compare.  It compares the characters directly in the original memory without making any allocations at all.  Which is the way it should be done.

 

So I would have expected TStringHelper.StartsWith() to be much faster than the others, not in the middle.

 

But then, I was looking at an older version (XE3).  Maybe things have changed in recent years?  Any comparison that avoids having to allocate new strings should be the fastest.
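To illustrate the allocation-free approach described above, here is a minimal sketch of a prefix check built on SysUtils.StrLComp. The helper name HasPrefix is made up for this example and is not part of the RTL; the explicit length guard is an added assumption to make the intent obvious, not something taken from the RTL source.

```pascal
program PrefixCompareSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// HasPrefix is a hypothetical helper name, not an RTL function.
// It compares the characters in place through StrLComp, so no
// temporary substring is created before the comparison.
function HasPrefix(const AText, APrefix: string): Boolean;
begin
  Result := (Length(APrefix) <= Length(AText)) and
    (StrLComp(PChar(AText), PChar(APrefix), Cardinal(Length(APrefix))) = 0);
end;

begin
  Assert(HasPrefix('Error: disk full', 'Error:'));
  Assert(not HasPrefix('Warning: disk full', 'Error:'));
  Assert(not HasPrefix('Err', 'Error:'));  // prefix longer than the text
  Assert(HasPrefix('Error:', ''));         // an empty prefix always matches
  Writeln('all checks passed');
end.
```

Whether this is actually faster than Copy on a given Delphi version still has to be measured; the sketch only shows the shape of a comparison that does not allocate.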

Edited by Remy Lebeau
29 minutes ago, Remy Lebeau said:

I find that hard to believe.  But I have no way to test that myself right now.

 


 

I'm not so experienced in analyzing Delphi sources to be able to assess what is optimized and what is wasting resources.

In my code I mostly look for short strings at the beginning, so perhaps my test cases are just right for Copy to be fastest and that will not be the case with 'better' examples.

 

I use Delphi 10.2.3.

Edited by Mike Torrettinni

30 minutes ago, Remy Lebeau said:

But then, I was looking at an older version (XE3).  Maybe things have changed in recent years?  Any comparison that avoids having to allocate new strings should be the fastest.

Things changed since then... for some reason unknown to me StartsWith has been changed to the point it involves 10+ calls to other procedures, and eventually calls Windows API CompareString function. 

 

Since the final call handles the case-insensitive variant, it makes sense in that case (still too many indirect calls for my taste), but the code path for the case-sensitive variant is just WHY, OH, WHY????

 

No wonder it is slower...

 

I think the plain for loop comparing chars would do better...

Edited by Dalija Prasnikar
{$APPTYPE CONSOLE}

uses
  System.SysUtils,
  System.Diagnostics,
  System.StrUtils;

function MyStartsWith(const SearchText, Text: string): Boolean;
var
  Index, SearchTextLen: Integer;
begin
  SearchTextLen := Length(SearchText);
  if SearchTextLen>Length(Text) then
  begin
    Result := False;
    Exit;
  end;
  for Index := 1 to SearchTextLen do
    if Text[Index]<>SearchText[Index] then
    begin
      Result := False;
      Exit;
    end;
  Result := True;
end;

const
  cMaxLoop = 10000000;
  cText = 'Error: Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.';
  cTextNoHits = 'Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.';
  cStartsWithShort = 'E';
  cStartsWithLong = 'Error:';
var vSW: TStopWatch;
    i, vLen: integer;
    hitCount: Integer;
    vResult: boolean;
begin
  Writeln('Substring exists at the start:');

  // Short string
  Writeln('Short string:');

  hitCount := 0;
  vSW := TStopWatch.StartNew;
  for i := 1 to cMaxLoop do
    if Copy(cText, 1, Length(cStartsWithShort)) = cStartsWithShort then
      Inc(hitCount);
  Writeln('  Copy: ' + vSW.ElapsedMilliseconds.ToString, ' ', hitCount);

  hitCount := 0;
  vSW := TStopWatch.StartNew;
  for i := 1 to cMaxLoop do
    if MidStr(cText, 1, Length(cStartsWithShort)) = cStartsWithShort then
      Inc(hitCount);
  Writeln('  MidStr: ' + vSW.ElapsedMilliseconds.ToString, ' ', hitCount);

  hitCount := 0;
  vSW := TStopWatch.StartNew;
  for i := 1 to cMaxLoop do
    if cText.StartsWith(cStartsWithShort) then
      Inc(hitCount);
  Writeln('  StartsWith: ' + vSW.ElapsedMilliseconds.ToString, ' ', hitCount);

  hitCount := 0;
  vSW := TStopWatch.StartNew;
  for i := 1 to cMaxLoop do
    if StartsStr(cStartsWithShort, cText) then
      Inc(hitCount);
  Writeln('  StartsStr: ' + vSW.ElapsedMilliseconds.ToString, ' ', hitCount);

  hitCount := 0;
  vSW := TStopWatch.StartNew;
  for i := 1 to cMaxLoop do
    if MyStartsWith(cStartsWithShort, cText) then
      Inc(hitCount);
  Writeln('  MyStartsWith: ' + vSW.ElapsedMilliseconds.ToString, ' ', hitCount);

  // Long string
  Writeln('Long string:');

  hitCount := 0;
  vSW := TStopWatch.StartNew;
  for i := 1 to cMaxLoop do
    if Copy(cText, 1, Length(cStartsWithLong)) = cStartsWithLong then
      Inc(hitCount);
  Writeln('  Copy: ' + vSW.ElapsedMilliseconds.ToString, ' ', hitCount);

  hitCount := 0;
  vSW := TStopWatch.StartNew;
  for i := 1 to cMaxLoop do
    if MidStr(cText, 1, Length(cStartsWithLong)) = cStartsWithLong then
      Inc(hitCount);
  Writeln('  MidStr: ' + vSW.ElapsedMilliseconds.ToString, ' ', hitCount);

  hitCount := 0;
  vSW := TStopWatch.StartNew;
  for i := 1 to cMaxLoop do
    if cText.StartsWith(cStartsWithLong) then
      Inc(hitCount);
  Writeln('  StartsWith: ' + vSW.ElapsedMilliseconds.ToString, ' ', hitCount);

  hitCount := 0;
  vSW := TStopWatch.StartNew;
  for i := 1 to cMaxLoop do
    if StartsStr(cStartsWithLong, cText) then
      Inc(hitCount);
  Writeln('  StartsStr: ' + vSW.ElapsedMilliseconds.ToString, ' ', hitCount);

  hitCount := 0;
  vSW := TStopWatch.StartNew;
  for i := 1 to cMaxLoop do
    if MyStartsWith(cStartsWithLong, cText) then
      Inc(hitCount);
  Writeln('  MyStartsWith: ' + vSW.ElapsedMilliseconds.ToString, ' ', hitCount);

  Writeln;
  // Text DOES NOT start with selected string
  Writeln('Substring NOT exists at the start:');
  // Short string
  Writeln('Short string:');

  hitCount := 0;
  vSW := TStopWatch.StartNew;
  for i := 1 to cMaxLoop do
    if Copy(cTextNoHits, 1, Length(cStartsWithShort)) = cStartsWithShort then
      Inc(hitCount);
  Writeln('  Copy: ' + vSW.ElapsedMilliseconds.ToString, ' ', hitCount);

  hitCount := 0;
  vSW := TStopWatch.StartNew;
  for i := 1 to cMaxLoop do
    if MidStr(cTextNoHits, 1, Length(cStartsWithShort)) = cStartsWithShort then
      Inc(hitCount);
  Writeln('  MidStr: ' + vSW.ElapsedMilliseconds.ToString, ' ', hitCount);

  hitCount := 0;
  vSW := TStopWatch.StartNew;
  for i := 1 to cMaxLoop do
    if cTextNoHits.StartsWith(cStartsWithShort) then
      Inc(hitCount);
  Writeln('  StartsWith: ' + vSW.ElapsedMilliseconds.ToString, ' ', hitCount);

  hitCount := 0;
  vSW := TStopWatch.StartNew;
  for i := 1 to cMaxLoop do
    if StartsStr(cStartsWithShort, cTextNoHits) then
      Inc(hitCount);
  Writeln('  StartsStr: ' + vSW.ElapsedMilliseconds.ToString, ' ', hitCount);

  hitCount := 0;
  vSW := TStopWatch.StartNew;
  for i := 1 to cMaxLoop do
    if MyStartsWith(cStartsWithShort, cTextNoHits) then
      Inc(hitCount);
  Writeln('  MyStartsWith: ' + vSW.ElapsedMilliseconds.ToString, ' ', hitCount);

  // Long string
  Writeln('Long string:');

  hitCount := 0;
  vSW := TStopWatch.StartNew;
  for i := 1 to cMaxLoop do
    if Copy(cTextNoHits, 1, Length(cStartsWithLong)) = cStartsWithLong then
      Inc(hitCount);
  Writeln('  Copy: ' + vSW.ElapsedMilliseconds.ToString, ' ', hitCount);

  hitCount := 0;
  vSW := TStopWatch.StartNew;
  for i := 1 to cMaxLoop do
    if MidStr(cTextNoHits, 1, Length(cStartsWithLong)) = cStartsWithLong then
      Inc(hitCount);
  Writeln('  MidStr: ' + vSW.ElapsedMilliseconds.ToString, ' ', hitCount);

  hitCount := 0;
  vSW := TStopWatch.StartNew;
  for i := 1 to cMaxLoop do
    if cTextNoHits.StartsWith(cStartsWithLong) then
      Inc(hitCount);
  Writeln('  StartsWith: ' + vSW.ElapsedMilliseconds.ToString, ' ', hitCount);

  hitCount := 0;
  vSW := TStopWatch.StartNew;
  for i := 1 to cMaxLoop do
    if StartsStr(cStartsWithLong, cTextNoHits) then
      Inc(hitCount);
  Writeln('  StartsStr: ' + vSW.ElapsedMilliseconds.ToString, ' ', hitCount);

  hitCount := 0;
  vSW := TStopWatch.StartNew;
  for i := 1 to cMaxLoop do
    if MyStartsWith(cStartsWithLong, cTextNoHits) then
      Inc(hitCount);
  Writeln('  MyStartsWith: ' + vSW.ElapsedMilliseconds.ToString, ' ', hitCount);


  readln;

end.

Copy is clearly a shocker of an idea for this use. Heap allocation!! Really?!!!

 

And the other functions seem really slow.  A simple for loop is around 10 times faster.

 

I didn't capture any timings, but it's easy to do it on your machine. I also addressed a couple of issues with your benchmark code. 

 

I don't believe this is a bottleneck in your program, but you love premature optimisation with a rarely seen passion.

Edited by David Heffernan
27 minutes ago, Dalija Prasnikar said:

I think the plain for loop comparing chars would do better...

I set up this function:

 

function StartCharByChar(const aSubString, aString: string): boolean;
var c: integer;
begin
  Result := True;

  for c := 1 to Length(aSubString) do
    if aString[c] <> aSubString[c] then
      Exit(false);
end;

And it wins! 😉

 

Substring exists at the start:
Short string:
  Copy: 169
  MidStr: 182
  StartsWith: 515
  StartsStr: 784
  StartCharByChar: 41 // :)
Long string:
  Copy: 176
  MidStr: 191
  StartsWith: 518
  StartsStr: 846
  StartCharByChar: 115 // :)

Substring NOT exists at the start:
Short string:
  Copy: 166
  MidStr: 183
  StartsWith: 515
  StartsStr: 876
  StartCharByChar: 33 // :)
Long string:
  Copy: 171
  MidStr: 184
  StartsWith: 507
  StartsStr: 885
  StartCharByChar: 33 // :)

 

17 minutes ago, Mike Torrettinni said:

And it wins!

Not if the search string is longer than the other string. Then it's a buffer overrun.

 

Use the version from my post. 
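
The overrun follows from the loop bounds: StartCharByChar indexes aString up to Length(aSubString) no matter how short aString is. A minimal sketch of the same loop with a length guard added is below. The name StartCharByCharSafe is made up for illustration; the idea is the same as in MyStartsWith above.

```pascal
program GuardedPrefixSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Hypothetical guarded variant of StartCharByChar.
function StartCharByCharSafe(const aSubString, aString: string): Boolean;
var
  c: Integer;
begin
  // Guard: never index past the end of aString.
  if Length(aSubString) > Length(aString) then
    Exit(False);

  for c := 1 to Length(aSubString) do
    if aString[c] <> aSubString[c] then
      Exit(False);

  Result := True;
end;

begin
  Assert(StartCharByCharSafe('Error:', 'Error: x'));
  Assert(not StartCharByCharSafe('Error:', 'Err'));  // search text is longer
  Assert(not StartCharByCharSafe('E', ''));          // empty input string
  Writeln('all checks passed');
end.
```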

Edited by David Heffernan
3 hours ago, Dalija Prasnikar said:

Things changed since then... for some reason unknown to me StartsWith has been changed to the point it involves 10+ calls to other procedures, and eventually calls Windows API CompareString function. 

Really? Ouch 😨  It used to be much simpler:

function TStringHelper.StartsWith(const Value: string): Boolean;
begin
  Result := StartsWith(Value, False);
end;

function TStringHelper.StartsWith(const Value: string; IgnoreCase: Boolean): Boolean;
begin
  if not IgnoreCase then
    Result := System.SysUtils.StrLComp(PChar(Self), PChar(Value), Value.Length) = 0
  else
    Result := System.SysUtils.StrLIComp(PChar(Self), PChar(Value), Value.Length) = 0;
end;

function StrLComp(const Str1, Str2: PWideChar; MaxLen: Cardinal): Integer;
var
  I: Cardinal;
  P1, P2: PWideChar;
begin
  P1 := Str1;
  P2 := Str2;
  I := 0;
  while I < MaxLen do
  begin
    if (P1^ <> P2^) or (P1^ = #0) then
      Exit(Ord(P1^) - Ord(P2^));

    Inc(P1);
    Inc(P2);
    Inc(I);
  end;
  Result := 0;
end;

function StrLIComp(const Str1, Str2: PWideChar; MaxLen: Cardinal): Integer;
var
  P1, P2: PWideChar;
  I: Cardinal;
  C1, C2: WideChar;
begin
  P1 := Str1;
  P2 := Str2;
  I := 0;
  while I < MaxLen do
  begin
    if P1^ in ['a'..'z'] then
      C1 := WideChar(Word(P1^) xor $20)
    else
      C1 := P1^;

    if P2^ in ['a'..'z'] then
      C2 := WideChar(Word(P2^) xor $20)
    else
      C2 := P2^;

    if (C1 <> C2) or (C1 = #0) then
      Exit(Ord(C1) - Ord(C2));

    Inc(P1);
    Inc(P2);
    Inc(I);
  end;
  Result := 0;
end;

Compared to StartsStr(), which (eventually) calls CompareString():

function StartsStr(const ASubText, AText: string): Boolean;
begin
  Result := AnsiStartsStr(ASubText, AText);
end;

function AnsiStartsStr(const ASubText, AText: string): Boolean;
begin
  Result := AnsiSameStr(ASubText, Copy(AText, 1, Length(ASubText))); // WHY Copy()? AnsiStrLComp() could have been used instead!
end;

function AnsiSameStr(const S1, S2: string): Boolean;
begin
  Result := AnsiCompareStr(S1, S2) = 0;
end;

function AnsiCompareStr(const S1, S2: string): Integer;
{$IFDEF MSWINDOWS}
begin
  Result := CompareString(LOCALE_USER_DEFAULT, 0, PChar(S1), Length(S1),
      PChar(S2), Length(S2)) - CSTR_EQUAL;
end;
{$ENDIF MSWINDOWS}
{$IFDEF POSIX}
begin
  Result := UCS4CompareStr(UnicodeStringToUCS4String(S1),
    UnicodeStringToUCS4String(S2));
end;
{$ENDIF POSIX}

I don't have RTL/VCL source code past XE3, but I keep seeing people mention how bad it's getting in recent years.  This is not good.


Interesting inefficiency of TStringHelper.StartsWith and StartsStr when string and substring are the same:

 

Substring is same as string:
Short string (20 chars):
  StartsWith: 206
  StartsStr: 1113 // !
  StartsWithStr: 46
  StartsWithStr2: 42
Long string (238 chars):
  StartsWith: 1358  // !
  StartsStr: 4948 // !
  StartsWithStr: 45
  StartsWithStr2: 40

 

I guess they don't check for len(str) = len(substr) or str=substr.

5 minutes ago, Mike Torrettinni said:

they don't check for len(str) = len(substr)

I don't see any reason to check that. We've long since solved this. 


OK, there is a problem with my test case: if I declare the consts with an explicit string type, then TStringHelper.StartsWith is much faster!

 


 

 

Substring exists at the start:
Short string:
const only:
  StartsWith: 505 // slow when consts are not of defined type
  StartsStr: 795
  StartsWithStr: 53
const string:
  StartsWith: 65 // much faster when consts defined as string!
  StartsStr: 785
  StartsWithStr: 64
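
For readers unfamiliar with the distinction, here is a minimal sketch of the two declaration forms being compared. The constant names are made up for this example, and the sketch only shows the syntax; it does not attempt to explain or reproduce the timing difference reported above.

```pascal
program ConstSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

const
  // Untyped ("true") constant: no explicit type in the declaration.
  cPrefixUntyped = 'Error:';
  // Typed constants: declared explicitly as string.
  cPrefixTyped: string = 'Error:';
  cTextTyped: string = 'Error: Lorem ipsum';

begin
  // Both forms can be passed to the StartsWith helper.
  Assert(cTextTyped.StartsWith(cPrefixTyped));
  Assert(cTextTyped.StartsWith(cPrefixUntyped));
  Writeln('all checks passed');
end.
```

When benchmarking, it is worth timing both forms, since the results above suggest the declaration style alone can change the numbers.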

 

 

 

 

 

 

 

 

 


Ha ha, I even use LeftStr in a few places to compare the beginning of a string. But it uses Copy, so the performance is the same.

Talk about the need for a refactoring weekend 😉


I guess when comparing just 1 char at the beginning, it is better to do it with Char:

 

function StrStartsWith(const aStr, aSubStr: string): boolean; inline; overload;
function StrStartsWith(const aStr:string; const aChar: Char): boolean; inline; overload;

function StrStartsWith(const aStr, aSubStr: string): boolean; inline; overload;
var c, vLenStr, vLenSubStr: integer;
begin
  vLenStr    := Length(aStr);
  vLenSubStr := Length(aSubStr);

  if vLenStr < vLenSubStr then
    Exit(false);

  if vLenStr = vLenSubStr then
    Exit(aStr = aSubStr);

  for c := 1 to Length(aSubStr) do
    if aStr[c] <> aSubStr[c] then
      Exit(false);

  Result := True;
end;

function StrStartsWith(const aStr:string; const aChar: Char): boolean; inline; overload;
begin
  if aStr = '' then
    Exit(false);

  Result := aStr[1] = aChar;
end;

 

28 minutes ago, Mike Torrettinni said:

if vLenStr < vLenSubStr then Exit(false);

This seems pointless to me. 

 

It's important to test this in the real setting. What timings do you have? 


This was a very productive weekend! A lot of refactoring done. Even though only a few instances need to be performant, it was a good exercise, and now I have one common function that handles all these checks of how strings start. And if a newer Delphi version eventually implements a better (faster) option, I can easily make the change.

Clean, refactored and optimized piece of code.

 

Thanks!

On 2/5/2021 at 7:08 PM, Mike Torrettinni said:

Does anybody know of any faster option to check if string starts with substring?

// Size of the string's character data in bytes.
// "overload" is required because both variants share one name.
function StrSize(const Str: UnicodeString): Int64; overload; inline;
begin
  Result := Length(Str)*SizeOf(WideChar);
end;

function StrSize(const Str: RawByteString): Int64; overload; inline;
begin
  Result := Length(Str)*SizeOf(AnsiChar);
end;

// Compares the raw memory of SubStr against the start of Str.
// Returns False if either string is empty or SubStr is longer than Str.
function StrIsStartingFrom(const Str, SubStr: string): Boolean; overload;
begin
  Result := False;
  if ((Str = '') or (SubStr = '')) or (Length(SubStr) > Length(Str)) then Exit;
  Result := CompareMem(Pointer(Str), Pointer(SubStr), StrSize(SubStr));
end;

function StrIsStartingFrom(const Str, SubStr: RawByteString): Boolean; overload;
begin
  Result := False;
  if ((Str = '') or (SubStr = '')) or (Length(SubStr) > Length(Str)) then Exit;
  Result := CompareMem(Pointer(Str), Pointer(SubStr), StrSize(SubStr));
end;

 

 

ADD:

If you just check 1st char of a string, then simple `if Str[1] = SomeChar` will be faster; moreover it could be optimized further as `if Pointer(Str)^ = SomeChar` but you'll have to ensure Str is not empty
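
A minimal sketch of the first-character check with the empty-string guard built in. The function name FirstCharIs is made up for this example, and the untyped pointer is cast to PChar before dereferencing, which is an assumption about how the shorthand above would be written out in compilable form.

```pascal
program FirstCharSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Hypothetical helper: True if S is non-empty and its first char is C.
function FirstCharIs(const S: string; C: Char): Boolean; inline;
begin
  // Pointer(S) is nil for an empty string, so test it before dereferencing.
  // With the default short-circuit evaluation the second operand is
  // skipped when the first one is False.
  Result := (Pointer(S) <> nil) and (PChar(Pointer(S))^ = C);
end;

begin
  Assert(FirstCharIs('Error: x', 'E'));
  Assert(not FirstCharIs('Warning', 'E'));
  Assert(not FirstCharIs('', 'E'));  // empty string is handled safely
  Writeln('all checks passed');
end.
```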

Edited by Fr0sT.Brutal
add
2 hours ago, Fr0sT.Brutal said:

If you just check 1st char of a string, then simple `if Str[1] = SomeChar` will be faster; moreover it could be optimized further as `if Pointer(Str)^ = SomeChar` but you'll have to ensure Str is not empty

Yes. I have numerous checks for what the first char is, in different parsers, so I'm wrapping them into a single method where I check for an empty string first.

On 2/8/2021 at 3:45 AM, Fr0sT.Brutal said:

function StrIsStartingFrom(const Str, SubStr: string): Boolean;
begin
  Result := False;
  if ((Str = '') or (SubStr = '')) or (Length(SubStr) > Length(Str)) then Exit;
  Result := CompareMem(Pointer(Str), Pointer(SubStr), StrSize(SubStr));
end;

@Fr0sT.Brutal Your StrIsStartingFrom wins for longer strings, and if I use Exit(false) and remove the initial Result := false, it improves performance by an additional 2%. Thanks!

When checking for a shorter string, like 1 char (when we don't know the length in advance, so we can't compare by Char), StrStartsWith wins by about 20%.
