Mike Torrettinni

Is this a good method to delete the last character?


I export to CSV a lot, and many of the methods that build CSV text lines add an extra comma at the end, which then needs to be deleted.

In many cases the loop makes it hard to know which string is the last one in the line, so I can't always avoid adding this extra comma.

 

So, I set up two utility methods:

 

procedure DeleteLastChar(var aText: string; const aLastChar: string);
begin
  If aText <> '' then
  if aText[Length(aText)] = aLastChar then
    SetLength(aText, Length(aText) - 1); // old: aText := Copy(aText, 1, Length(aText) - 1)
end;

procedure DeleteLastComma(var aText: string);
begin
  DeleteLastChar(aText, ',');
end;

And I call DeleteLastComma(TextLine) in each method, just to make sure I don't leave a trailing comma in the text.

The reports can be long, so this could be called 10,000+ times, with small or large lines of text.

 

Since I use var, the string parameter is not copied, and SetLength just reduces the length of the string without doing the actual string copying that Copy would do, right?

 


This is simple to answer. Measure the time of the program running. Everything else is pointless. Don't guess performance questions. Measure. 
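A minimal sketch of such a measurement, assuming a plain console program and the coarse but portable Now/MilliSecondsBetween timing from SysUtils and DateUtils. The routine names, the iteration count and the sample line are made up for illustration, and the code assumes classic 1-based strings. The numbers will differ per machine and memory manager, so treat it as a starting point rather than a verdict.

```pascal
program BenchDeleteLastChar;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, DateUtils;

// SetLength-based variant from the question (taking a Char here).
procedure DeleteLastCharSetLength(var aText: string; aLastChar: Char);
begin
  if (aText <> '') and (aText[Length(aText)] = aLastChar) then
    SetLength(aText, Length(aText) - 1);
end;

// Older Copy-based variant, kept only for comparison.
procedure DeleteLastCharCopy(var aText: string; aLastChar: Char);
begin
  if (aText <> '') and (aText[Length(aText)] = aLastChar) then
    aText := Copy(aText, 1, Length(aText) - 1);
end;

const
  Iterations = 1000000; // hypothetical workload size

var
  i: Integer;
  Base, Line: string;
  Started: TDateTime;
begin
  Base := 'Item1,Item2,Item3';

  Started := Now;
  for i := 1 to Iterations do
  begin
    Line := Base + ','; // fresh heap string with a trailing comma
    DeleteLastCharSetLength(Line, ',');
  end;
  Writeln('SetLength: ', MilliSecondsBetween(Now, Started), ' ms');

  Started := Now;
  for i := 1 to Iterations do
  begin
    Line := Base + ',';
    DeleteLastCharCopy(Line, ',');
  end;
  Writeln('Copy     : ', MilliSecondsBetween(Now, Started), ' ms');
end.
```

Both loops pay the same cost for building the line, so any difference comes from the trimming itself. Running it with short and with very long values of Base covers the "small or large lines" case from the question.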

Guest

Another option is to avoid writing useless commas at all.

 

Put all row values in a string array and use string.Join(',', rowValues)

 

But whenever you want to know which horse is faster, measuring is the answer.
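As a rough illustration of the Join suggestion, the sketch below collects the row values in a dynamic array and lets string.Join place the separators. The program name and the sample values are invented for the example; it assumes a compiler with the string helper from SysUtils (recent Delphi versions, or Free Pascal in Delphi mode).

```pascal
program JoinRow;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

var
  RowValues: array of string;
  Line: string;
begin
  // Collect the values first; no separators are written at this stage.
  SetLength(RowValues, 3);
  RowValues[0] := 'Item1';
  RowValues[1] := 'Item2';
  RowValues[2] := 'Item3';

  // Join inserts the comma only between elements, never after the last one.
  Line := string.Join(',', RowValues);
  Writeln(Line); // Item1,Item2,Item3
end.
```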

Edited by Guest


I don't have a compiler at hand, so I can't check for errors, but this should work:

var
  s: string;
begin
  s := 'Hello World';
  Writeln(s);
  Delete(s, Length(s), 1);
  Writeln(s);
  Readln;
end;

 


The difference is allocating memory for a new string.

Calling SetLength or Delete will try to reuse the existing memory for the string if possible (i.e. if it is not shared with another reference).

I think string.Join or TrimEnd will always create a new string, allocating memory.
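The "not shared" condition can be observed directly. The sketch below is only an illustration with made-up variable names: it compares the data pointers of two string variables before and after SetLength. While the string is shared, SetLength has to give the modified variable its own buffer, so the other reference keeps the old content.

```pascal
program SharedStringDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

var
  Original, Shared: string;
  SameBefore, SameAfter: Boolean;
begin
  // Build the string at run time so it is a normal heap-allocated string.
  Original := 'Item1,Item2' + StringOfChar(',', 1);

  // Assignment copies only the pointer and increments the reference count.
  Shared := Original;
  SameBefore := Pointer(Original) = Pointer(Shared);

  // Original is shared now, so SetLength must make it unique before shrinking.
  SetLength(Original, Length(Original) - 1);
  SameAfter := Pointer(Original) = Pointer(Shared);

  Writeln('Original: ', Original);
  Writeln('Shared  : ', Shared);
  Writeln('Same buffer before SetLength: ', SameBefore);
  Writeln('Same buffer after SetLength : ', SameAfter);
end.
```

When the string is not shared, whether the buffer address stays the same is up to the memory manager, which is why the sketch makes no claim about that case.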


That's why I switched from using Copy to SetLength: it gets called many, many times. I didn't notice any significant performance difference, but I hope there will be less memory fragmentation using SetLength.

5 hours ago, Martin Wienold said:

The difference is allocating memory for a new string.

Calling SetLength or Delete will try to reuse the existing memory for the string if possible (i.e. if it is not shared with another reference).

I think string.Join or TrimEnd will always create a new string, allocating memory.

Yes, but Join should be used before you start trimming at the end, i.e. to avoid having to trim at all.


You can write the new length of the string into AText[0].

I have never tried that, and IMHO it isn't an elegant solution, but it should work... Fast enough.

11 hours ago, Davide Visconti said:

You can write the new length of the string into AText[0].

I have never tried that, and IMHO it isn't an elegant solution, but it should work... Fast enough.

No you can't. 


If you are SURE that the string has a reference count of 1, e.g. if it just has been allocated and not assigned/copied anywhere else, you can simply try:

 

procedure DeleteLastChar(var aText: string; aLastChar: char); inline;
begin
  if (aText <> '') and (aText[length(aText)] = aLastChar) then
    dec(PInteger(PAnsiChar(pointer(aText)) - SizeOf(integer))^);
end;

This will avoid a call to the heap manager.

 

Then do a proper benchmark.


or even try 

procedure DeleteLastChar(var aText: string; aLastChar: char); inline;
var len: {$ifdef FPC}PSizeInt{$else}PInteger{$endif};
begin
  len := pointer(aText);
  if len <> nil then begin
    dec(len);
    if PChar(pointer(aText))[len^ - 1] = aLastChar then begin
      PChar(pointer(aText))[len^ - 1] := #0;   
      dec(len^);
    end;
  end;
end;

which also has the advantage of working with both Delphi and FPC.

 

But all of this sounds a bit like premature optimization to me... 😞

Edited by Arnaud Bouchez

Hi,

 

The fastest way to delete the last character is to not add it.
Compare the two following routines:
 

procedure TForm61.Button1Click(Sender: TObject);
var
  i: Integer;
  lList: TStringList;
  lResult: string;
begin
  lList := TStringList.Create;
  try
    lList.Add('Item1');
    lList.Add('Item2');
    lList.Add('Item3');
    lList.Add('Item4');

    // Append every item followed by a comma...
    lResult := '';
    for i := 0 to lList.Count - 1 do
      lResult := lResult + lList[i] + ',';

    // ...so here I must delete the last char (guarded in case the list is empty)
    if lResult <> '' then
      SetLength(lResult, Length(lResult) - 1);
  finally
    lList.Free;
  end;
end;

procedure TForm61.Button2Click(Sender: TObject);
var
  i: Integer;
  lList: TStringList;
  lResult: string;
begin
  lList := TStringList.Create;
  try
    lList.Add('Item1');
    lList.Add('Item2');
    lList.Add('Item3');
    lList.Add('Item4');

    // Start with the first item, then prepend the comma to each following item,
    // so no trailing comma is ever produced
    if lList.Count > 0 then
    begin
      lResult := lList[0];
      for i := 1 to lList.Count - 1 do
        lResult := lResult + ',' + lList[i];
    end;
  finally
    lList.Free;
  end;
end;

Every time I get the chance I use the second routine. It's faster and requires less memory reallocation.
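When the loop cannot tell which value will be the last one (the situation described in the first post), the same idea can be expressed with a field counter: the comma goes in front of every field except the first. The sketch below is a hypothetical helper, not something from this thread; AppendCsvField, the counter parameter and the 'skip-me' marker are invented for illustration. Using a counter rather than testing for an empty line keeps empty fields intact.

```pascal
program AppendFieldDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

// Hypothetical helper: writes the separator BEFORE a field, except for the
// first one, so a trailing comma never appears.
procedure AppendCsvField(var aLine: string; var aFieldCount: Integer;
  const aValue: string);
begin
  if aFieldCount > 0 then
    aLine := aLine + ',';
  aLine := aLine + aValue;
  Inc(aFieldCount);
end;

const
  // The last entry is filtered out below, so the loop cannot know in advance
  // which written field is the final one.
  Values: array[0..3] of string = ('Item1', '', 'Item3', 'skip-me');

var
  i, FieldCount: Integer;
  Line: string;
begin
  Line := '';
  FieldCount := 0;
  for i := Low(Values) to High(Values) do
  begin
    if Values[i] = 'skip-me' then
      Continue;
    AppendCsvField(Line, FieldCount, Values[i]);
  end;
  Writeln(Line); // Item1,,Item3
end.
```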

On 12/21/2018 at 5:51 PM, Mike Torrettinni said:


Since I use var, the string parameter is not copied, and SetLength just reduces the length of the string without doing the actual string copying that Copy would do, right?

 

There is no difference between var AText: String and const AText: String with the value returned as a function result.

AText is reference counted, so assigning a value to this variable inside the function is the same as assigning the result of a function to a variable.

SetLength would allocate a new string and assign it to AText, so the reference count would be decremented inside the function instead of outside.

String functions should work whether or not the ZEROBASEDSTRINGS compiler flag is enabled, to be safe on any platform. Use the string helper functions from System.SysUtils instead of the 1-based string functions.

 

Some optimizations are: declare DeleteLastComma as inline, and change "const ALastChar:String" to "ALastChar:Char".

 

function DeleteLastChar(const AText:String;AChar:Char):String;
begin
  if AText.IsEmpty then
    exit(AText);

  if AText.Chars[AText.Length-1]=AChar then
    exit(AText.Substring(0,AText.Length-1));
  Result:=AText;
end;

 

The DeleteLastComma method can be declared inline.

 

procedure DeleteLastComma(var AText:String); inline;
begin
   AText:=DeleteLastChar(AText,',');
end;


 


Are you aware of TStringList's DelimitedText property?

  StrList := TStringList.Create;
  StrList.Delimiter := ',';
  StrList.StrictDelimiter := True;
  AddValuesToList(StrList);
  CSVString := StrList.DelimitedText;  // No trailing comma, no hassle, easy to read, not hard to explain..even 10 years from now

Have not measured time on this yet, but unless you are processing strings in the millions, no worries.
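A self-contained version of that snippet might look like the sketch below. AddValuesToList is the poster's own routine and is not shown in the thread, so plain Add calls stand in for it here, and try/finally cleanup is added. As far as I know, values that contain the delimiter are wrapped in QuoteChar by DelimitedText, which is usually what a CSV consumer expects, but that part is worth checking against your own data.

```pascal
program DelimitedTextDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

var
  StrList: TStringList;
  CSVString: string;
begin
  StrList := TStringList.Create;
  try
    StrList.Delimiter := ',';
    StrList.StrictDelimiter := True; // only ',' separates items, not spaces

    // Stand-in for the poster's AddValuesToList(StrList)
    StrList.Add('Item1');
    StrList.Add('Item2');
    StrList.Add('Item3');

    // The list writes the delimiter between items only
    CSVString := StrList.DelimitedText;
  finally
    StrList.Free;
  end;

  Writeln(CSVString); // Item1,Item2,Item3
end.
```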

5 hours ago, Alexander Brazda said:

SetLength would allocate a new string and assign it to AText, so the reference count would be decremented inside the function instead of outside.

This is wrong if the refcount is 1: in that case (which is what happens when you have just appended some text to a string variable), SetLength() won't copy the string. It just calls ReallocMem(), which is a no-op with FastMM4 for a buffer shrinking by one byte, and puts a #0 at the end.

See how _UStrSetLength() is implemented in System.pas.


Check https://www.delphitools.info/2013/10/30/efficient-string-building-in-delphi/

 

TStringBuilder has been enhanced in the latest Delphi 10.3, IIRC.

 

Our OpenSource `TTextWriter` class, which works on UTF-8 (which is very likely your eventual CSV file encoding), is very fast (it avoids most memory allocation), and has a `TTextWriter.CancelLastComma` method, as you expect.

See https://synopse.info/files/html/api-1.18/SynCommons.html#TTEXTWRITER_CANCELLASTCOMMA


I agree: use TStringBuilder to copy the strings into a buffer, and avoid the extra comma by checking whether the buffer is empty before appending.

You can reduce memory reallocations by giving the StringBuilder buffer an initial capacity, either by calling EnsureCapacity or by setting Capacity. Perhaps you can estimate it from the maximum or average record size, or just pick a value like 4 MB as a starting point.
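A small sketch of that approach, assuming the TStringBuilder class from SysUtils. The capacity value and the sample values are placeholders, and the initial capacity is passed to the constructor here instead of calling EnsureCapacity. Note that the "is the buffer empty" test assumes the first field is never an empty string; if it can be, a separate flag or field counter is the safer check.

```pascal
program BuilderDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

const
  InitialCapacity = 4096; // placeholder estimate, tune it to the record size
  Values: array[0..2] of string = ('Item1', 'Item2', 'Item3');

var
  Builder: TStringBuilder;
  i: Integer;
  Line: string;
begin
  // Reserving the buffer up front avoids repeated growth while appending.
  Builder := TStringBuilder.Create(InitialCapacity);
  try
    for i := Low(Values) to High(Values) do
    begin
      // Write the comma before a value, and only if something is already there.
      if Builder.Length > 0 then
        Builder.Append(',');
      Builder.Append(Values[i]);
    end;
    Line := Builder.ToString;
  finally
    Builder.Free;
  end;

  Writeln(Line); // Item1,Item2,Item3
end.
```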

 

For CSV files with ANSI or UTF-8 encoding, and with RAM limitations, perhaps you can use a buffered file stream with string encoding.

 

The attached file FileBuffer.pas is a buffered file with string encoding and in-memory concatenation. It supports multiple code pages for input and output, and it's optimized for the case where the input and output code pages are the same.

The encoding preamble is written at the start of the file; comment that out if you don't need it.

UTF-8 is the default encoding.

 

Usage sample

    Buffer:=TFileBuffer.Create('test.txt',CP_UTF8);

    Buffer.WriteLn('FieldName'#9'Value');

    Buffer
      .Write('Name')
      .Write(#9)
      .Write('Alex');

    Buffer.Free;
 

 

FileBuffer.pas
