Mike Torrettinni, posted December 21, 2018

I export to CSV a lot, and many of the methods that build CSV text lines add an extra comma at the end, which then needs to be deleted. In a lot of cases the loop makes it hard to know which string is the last one in the line, so I can't always avoid adding this extra comma. So I set up two utility methods:

    procedure DeleteLastChar(var aText: string; const aLastChar: string);
    begin
      if aText <> '' then
        if aText[Length(aText)] = aLastChar then
          SetLength(aText, Length(aText) - 1); // old: aText := Copy(aText, 1, Length(aText) - 1)
    end;

    procedure DeleteLastComma(var aText: string);
    begin
      DeleteLastChar(aText, ',');
    end;

I call DeleteLastComma(TextLine) in each method, just to make sure I don't leave a trailing comma in the text. The reports can be long, so this could be called 10,000s of times, with small or large lines of text.

Since I use var, it avoids copying the string parameter, and SetLength just reduces the length of the string without doing the actual string copy that Copy would do, right?

Rollo62, posted December 21, 2018 (edited December 21, 2018)

Maybe this is helpful: http://docwiki.embarcadero.com/Libraries/XE7/en/System.SysUtils.TStringHelper.TrimEnd

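A minimal sketch of the TrimEnd suggestion, assuming a Delphi version that ships TStringHelper (XE3 or later). The sample lines are made up for illustration. One thing worth noting for CSV: TrimEnd strips every trailing comma, whereas the SetLength approach from the first post removes exactly one, so the two differ when the last fields are empty.

```pascal
program TrimEndDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

var
  Line: string;
begin
  // One trailing comma: TrimEnd removes it.
  Line := 'a,b,c,';
  Writeln(Line.TrimEnd([',']));   // a,b,c

  // Empty trailing fields plus the extra comma: TrimEnd removes ALL of them.
  Line := 'a,b,,,';
  Writeln(Line.TrimEnd([',']));   // a,b

  // Removing exactly one character keeps the empty fields.
  if Line.EndsWith(',') then
    SetLength(Line, Length(Line) - 1);
  Writeln(Line);                  // a,b,,
end.
```

If trailing empty fields can occur in the export, the single-character removal is the safer of the two.
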
David Heffernan, posted December 22, 2018

This is simple to answer. Measure the time of the program running. Everything else is pointless. Don't guess performance questions. Measure.

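One possible way to measure, sketched with TStopwatch from System.Diagnostics. The iteration count, the 200-character line, and the two procedure names are assumptions made for this example, not anything from the thread. UniqueString is called before each deletion so that both variants start from a string with a single reference and pay the same setup cost.

```pascal
program MeasureDeleteLastChar;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Diagnostics;

const
  Iterations = 1000000; // hypothetical workload size

// Variant from the first post: shrink the string with SetLength.
procedure DeleteLastCharSetLength(var aText: string; aLastChar: Char);
begin
  if (aText <> '') and (aText[Length(aText)] = aLastChar) then
    SetLength(aText, Length(aText) - 1);
end;

// The older variant mentioned in the first post: build a new string with Copy.
procedure DeleteLastCharCopy(var aText: string; aLastChar: Char);
begin
  if (aText <> '') and (aText[Length(aText)] = aLastChar) then
    aText := Copy(aText, 1, Length(aText) - 1);
end;

var
  Timer: TStopwatch;
  Base, Line: string;
  i: Integer;
begin
  Base := StringOfChar('x', 200) + ',';

  Timer := TStopwatch.StartNew;
  for i := 1 to Iterations do
  begin
    Line := Base;
    UniqueString(Line); // give Line its own buffer, as after a concatenation
    DeleteLastCharSetLength(Line, ',');
  end;
  Writeln('SetLength: ', Timer.ElapsedMilliseconds, ' ms');

  Timer := TStopwatch.StartNew;
  for i := 1 to Iterations do
  begin
    Line := Base;
    UniqueString(Line);
    DeleteLastCharCopy(Line, ',');
  end;
  Writeln('Copy:      ', Timer.ElapsedMilliseconds, ' ms');
end.
```

A synthetic loop like this only gives a rough indication; timing the real export with real data is what the advice above is about.
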
Guest, posted December 22, 2018 (edited December 22, 2018)

Another option is to avoid writing the useless commas at all. Put all row values in a string array and use string.Join(',', rowValues).

But whenever you want to know which horse is faster, measuring is the answer.

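A short sketch of the string.Join approach, assuming a Delphi version with TStringHelper (XE3 or later). The row values are invented for the example. Join only places separators between elements, so there is never a trailing comma to remove, and empty fields are preserved.

```pascal
program JoinDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

var
  RowValues: TArray<string>;
begin
  // Sample row with one empty field (illustrative data only).
  RowValues := TArray<string>.Create('Item1', 'Item2', '', 'Item4');

  // Separators go between elements only.
  Writeln(string.Join(',', RowValues)); // Item1,Item2,,Item4
end.
```
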
Martin Wienold, posted December 22, 2018

I don't have a compiler at hand, so I can't check for errors, but this should work:

    var
      s: string;
    begin
      s := 'Hello World';
      Writeln(s);
      Delete(s, Length(s), 1); // remove the last character in place
      Writeln(s);
      Readln;
    end.

Mike Torrettinni, posted December 23, 2018

Thanks for the feedback. The performance is good enough for me; I could not see any significant difference with TrimEnd. I will look into String.Join.

Martin Wienold, posted December 23, 2018

The difference is in allocating memory for a new string.

Calling SetLength or Delete will try to reuse the existing memory for the string if possible (i.e. if it is not shared with another place).

I think string.Join or TrimEnd will always create a new string, allocating memory.

Mike Torrettinni, posted December 23, 2018

That's why I switched from Copy to SetLength: it gets called many, many times. I didn't notice any significant performance difference, but I hope there will be less memory fragmentation with SetLength.

Rudy Velthuis, posted December 23, 2018

5 hours ago, Martin Wienold said:
> The difference is in allocating memory for a new string.
> Calling SetLength or Delete will try to reuse the existing memory for the string if possible (i.e. if it is not shared with another place).
> I think string.Join or TrimEnd will always create a new string, allocating memory.

Yes, but Join should be used before you start trimming at the end, i.e. to avoid having to trim at all.

Davide Visconti, posted December 25, 2018

You can write the new length of the string into AText[0]. I have never tried that, and IMHO it isn't an elegant solution, but it should work... and be fast enough.

David Heffernan, posted December 26, 2018

11 hours ago, Davide Visconti said:
> You can write the new length of the string into AText[0]. I have never tried that, and IMHO it isn't an elegant solution, but it should work... and be fast enough.

No you can't.

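For context, the AText[0] idea presumably comes from ShortString, where element 0 really is the length byte. The sketch below shows that case only; it is an illustration added here, not something posted in the thread, and it assumes a desktop compiler where ShortString is available. The string type used in the original question is a long string, whose length lives in a header in front of the character data rather than in an element 0.

```pascal
program ShortStringLengthByte;

{$APPTYPE CONSOLE}

var
  S: ShortString;
begin
  S := 'a,b,c,';

  // For ShortString only: S[0] holds the current length as an AnsiChar.
  if (Length(S) > 0) and (S[Length(S)] = ',') then
    S[0] := AnsiChar(Ord(S[0]) - 1);

  Writeln(S); // a,b,c
end.
```

ShortString is limited to 255 single-byte characters, so it is rarely a good fit for CSV lines.
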
Arnaud Bouchez, posted December 26, 2018

If you are SURE that the string has a reference count of 1, e.g. if it has just been allocated and not assigned/copied anywhere else, you can simply try:

    procedure DeleteLastChar(var aText: string; aLastChar: char); inline;
    begin
      if (aText <> '') and (aText[length(aText)] = aLastChar) then
        // decrement the length field stored just before the character data
        dec(PInteger(PAnsiChar(pointer(aText)) - SizeOf(integer))^);
    end;

This will avoid a call to the heap manager. Then do a proper benchmark.

Arnaud Bouchez, posted December 26, 2018 (edited December 27, 2018)

or even try

    procedure DeleteLastChar(var aText: string; aLastChar: char); inline;
    var
      len: {$ifdef FPC}PSizeInt{$else}PInteger{$endif};
    begin
      len := pointer(aText);
      if len <> nil then
      begin
        dec(len); // now points to the length field in the string header
        if PChar(pointer(aText))[len^ - 1] = aLastChar then
        begin
          PChar(pointer(aText))[len^ - 1] := #0; // keep the string #0-terminated
          dec(len^);
        end;
      end;
    end;

which also has the advantage of working with both Delphi and FPC.

But all this sounds a bit like premature optimization to me... 😞

Clément, posted December 26, 2018

Hi,

The fastest way to delete the last character is to not add it. Compare the two following routines:

    procedure TForm61.Button1Click(Sender: TObject);
    var
      i: integer;
      lList: TStringlist;
      lResult: String;
    begin
      lList := TStringList.Create;
      try
        lList.Add('Item1');
        lList.Add('Item2');
        lList.Add('Item3');
        lList.Add('Item4');
        lResult := '';
        for i := 0 to lList.Count - 1 do
          lResult := lResult + lList[i] + ',';
        SetLength(lResult, Length(lResult) - 1); // Here I must delete the last char
      finally
        lList.Free;
      end;
    end;

    procedure TForm61.Button2Click(Sender: TObject);
    var
      i: integer;
      lList: TStringlist;
      lResult: String;
    begin
      lList := TStringList.Create;
      try
        lList.Add('Item1');
        lList.Add('Item2');
        lList.Add('Item3');
        lList.Add('Item4');
        if lList.Count > 0 then
        begin
          lResult := lList[0];
          for i := 1 to lList.Count - 1 do
            lResult := lResult + ',' + lList[i];
        end;
      finally
        lList.Free;
      end;
    end;

Every time I get the chance I use the second routine. It's faster and requires less memory reallocation.

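When the values are not in an indexable list, which is the situation described in the first post where the loop does not know which value is last, the same "separator first" idea can be carried by a flag. The record below is a hypothetical helper written for this illustration (TCsvLine, Add, and Clear are made-up names). It tracks whether any field has been added instead of testing for an empty string, so an empty first field still gets its separator. It does no CSV quoting or escaping.

```pascal
program CsvLineBuilderDemo;

{$APPTYPE CONSOLE}

type
  // Hypothetical helper, not from the thread.
  TCsvLine = record
  private
    FText: string;
    FHasFields: Boolean;
  public
    procedure Clear;
    procedure Add(const Value: string);
    property Text: string read FText;
  end;

procedure TCsvLine.Clear;
begin
  FText := '';
  FHasFields := False;
end;

procedure TCsvLine.Add(const Value: string);
begin
  // Write the separator BEFORE every field except the first one.
  if FHasFields then
    FText := FText + ',';
  FText := FText + Value;
  FHasFields := True;
end;

var
  Line: TCsvLine;
begin
  Line.Clear;
  Line.Add('');      // empty first field
  Line.Add('Item2');
  Line.Add('Item3');
  Writeln(Line.Text); // ,Item2,Item3
end.
```
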
Mike Torrettinni, posted December 26, 2018

@Clément Thanks, another good suggestion for how to do it without the extra comma.

Alexander Brazda, posted December 27, 2018

On 12/21/2018 at 5:51 PM, Mike Torrettinni said:
> I export to CSV a lot, and many of the methods that build CSV text lines add an extra comma at the end, which then needs to be deleted. In a lot of cases the loop makes it hard to know which string is the last one in the line, so I can't always avoid adding this extra comma. So I set up two utility methods:
>
>     procedure DeleteLastChar(var aText: string; const aLastChar: string);
>     begin
>       if aText <> '' then
>         if aText[Length(aText)] = aLastChar then
>           SetLength(aText, Length(aText) - 1); // old: aText := Copy(aText, 1, Length(aText) - 1)
>     end;
>
>     procedure DeleteLastComma(var aText: string);
>     begin
>       DeleteLastChar(aText, ',');
>     end;
>
> I call DeleteLastComma(TextLine) in each method, just to make sure I don't leave a trailing comma in the text. The reports can be long, so this could be called 10,000s of times, with small or large lines of text.
> Since I use var, it avoids copying the string parameter, and SetLength just reduces the length of the string without doing the actual string copy that Copy would do, right?

There is no difference between using var AText: String and using const AText: String with the value returned as a function result. AText is reference counted, so assigning a value to this variable inside the function is the same as assigning the result of a function to a variable. SetLength would allocate a new string and assign this to AText, so reference counting would decrease inside the function instead of outside.

String functions should work whether the ZEROBASEDSTRINGS compiler flag is enabled or not, to be safe on any platform. Use the StringHelper functions from System.SysUtils instead of the 1-based string functions.

Some optimizations are: declare DeleteLastComma as inline, and change "const ALastChar: String" to "ALastChar: Char".

    function DeleteLastChar(const AText: String; AChar: Char): String;
    begin
      if AText.IsEmpty then
        exit(AText);
      // Chars[] and Substring are 0-based regardless of ZEROBASEDSTRINGS
      if AText.Chars[AText.Length - 1] = AChar then
        exit(AText.Substring(0, AText.Length - 1));
      Result := AText;
    end;

The DeleteLastComma method can be an inline procedure:

    procedure DeleteLastComma(var AText: String); inline;
    begin
      AText := DeleteLastChar(AText, ',');
    end;

Sherlock, posted December 27, 2018

Are you aware of TStringList's DelimitedText property?

    StrList := TStringList.Create;
    try
      StrList.Delimiter := ',';
      StrList.StrictDelimiter := True;
      AddValuesToList(StrList);
      CSVString := StrList.DelimitedText; // No trailing comma, no hassle, easy to read, not hard to explain... even 10 years from now
    finally
      StrList.Free;
    end;

I have not measured the time on this yet, but unless you are processing strings in the millions, no worries.

Arnaud Bouchez, posted December 27, 2018

5 hours ago, Alexander Brazda said:
> SetLength would allocate a new string and assign this to AText, so reference counting would decrease inside the function instead of outside.

This is wrong if the refcount is 1: in that case (which happens when you have just appended some text to a string variable), SetLength() won't copy the string. It just calls ReallocMem(), which is a no-op with FastMM4 for a buffer shrunk by one byte, and puts a #0 at the end.

See how _UStrSetLength() is implemented in System.pas.

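The sharing rule being discussed can be observed with a small experiment. This is a sketch added for illustration, assuming a Unicode Delphi (2009 or later) where System.StringRefCount is available; the variable names and sample text are made up. The reference counts are printed rather than asserted, since the exact values depend on how the compiler handles temporaries. What is guaranteed by copy-on-write semantics is that shrinking B leaves A untouched.

```pascal
program SetLengthSharing;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

procedure Demo;
var
  A, B: string;
begin
  // Built at run time so the string lives on the heap with a real refcount.
  A := 'Item1,' + IntToStr(2) + ',';
  Writeln('refcount of A alone:      ', StringRefCount(A));

  B := A; // no character data is copied, both variables share one buffer
  Writeln('refcount after B := A:    ', StringRefCount(A));

  // B is shared, so SetLength must first give B its own copy.
  SetLength(B, Length(B) - 1);
  Writeln('A = ', A); // A = Item1,2,
  Writeln('B = ', B); // B = Item1,2
  Writeln('refcount of A afterwards: ', StringRefCount(A));
end;

begin
  Demo;
end.
```

In the CSV case the line has normally just been built by concatenation and is referenced by a single variable, which is the situation where no copy is needed.
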
Arnaud Bouchez, posted December 27, 2018

Check https://www.delphitools.info/2013/10/30/efficient-string-building-in-delphi/

TStringBuilder has been enhanced in the latest Delphi 10.3, IIRC.

Our open source `TTextWriter` class, which works on UTF-8 (which is very likely your eventual CSV file encoding), is very fast (it avoids most memory allocation), and has a `TTextWriter.CancelLastComma` method, as you expect.

See https://synopse.info/files/html/api-1.18/SynCommons.html#TTEXTWRITER_CANCELLASTCOMMA

Alexander Brazda, posted December 27, 2018

I agree: use TStringBuilder for the string-to-buffer copy, and prevent the extra comma by checking whether the buffer is empty before adding a new value.

You can reduce memory reallocations if the StringBuilder buffer is given an initial capacity, by calling EnsureCapacity or by setting Capacity. Perhaps you can do some estimation based on the maximum or average record size, or just use a value like 4 MB as a starting point.

For CSV files with ANSI or UTF-8 encoding, and with RAM limitations, perhaps you can use a buffered file stream with string encoding.

The attached file FileBuffer.pas is a buffered file with string encoding and in-memory concatenation, supporting multiple code pages for input and output. It is optimized for the case where the input and output code pages are the same. The encoding preamble is written at the start of the file; comment that out if you don't need it. UTF-8 is the default encoding.

Usage sample:

    Buffer := TFileBuffer.Create('test.txt', CP_UTF8);
    Buffer.WriteLn('FieldName'#9'Value');
    Buffer
      .Write('Name')
      .Write(#9)
      .Write('Alex');
    Buffer.Free;

Attachment: FileBuffer.pas

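A minimal sketch of the TStringBuilder suggestion, using only the RTL class from System.SysUtils. The field values and the initial capacity of 4096 characters are placeholders chosen for the example. A Boolean flag is used here instead of testing whether the buffer is empty, because an empty first field would leave the buffer empty and the next separator would be skipped.

```pascal
program StringBuilderCsv;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

const
  // Illustrative row values.
  Fields: array[0..3] of string = ('Item1', 'Item2', 'Item3', 'Item4');

var
  SB: TStringBuilder;
  FieldValue: string;
  First: Boolean;
begin
  // Initial capacity is an assumed estimate; tune it to the real record size.
  SB := TStringBuilder.Create(4096);
  try
    First := True;
    for FieldValue in Fields do
    begin
      // Separator goes before every field except the first.
      if not First then
        SB.Append(',');
      SB.Append(FieldValue);
      First := False;
    end;
    Writeln(SB.ToString); // Item1,Item2,Item3,Item4
  finally
    SB.Free;
  end;
end.
```

If a trailing separator does end up in the builder, assigning `SB.Length := SB.Length - 1` drops the last character without creating a new string.
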