Guest Posted June 22, 2020 Thank you Remy, But still if there is assembly code to show it will be better, as the questions intriguing me still there 1) It is understandable for the compiler to add try..finally when there Finalize, but will this try..finally be added if there no Finalize and only Initialize ? will be added even if the record have only unmanged types fields too ? or the compiler is smart enough to skip that for unmanaged fields and enforce it when there is managed type fields ? 2) For inline records/variables goes the same questions above, but with addition, is there are stack allocated at the entry point of the procedure? or will behave like many others languages, the stack will be used in the middle of procedure and released in end of that block instead of the end of the procedure ? in that case try..finally will be faster if it used around the block of begin..end been declared, If some one would just compile some examples and paste the assembly code, it will be nice to see that. Share this post Link to post
Mahdi Safsafi 225 Posted June 22, 2020 Quote I would suggest try to remove one conditional jump more from your last version ... is cheaper that comparing and jumping @Kas Ob Performance is not always related to conditional jump. I made a quick optimization on SetTextStr, bellow is my implementation and it uses jcc heavy but outperforms original/Attila algo. You can use jcc to make your own prediction (cheat and step): // this algo can be optimized further // requires length(LineBreak) > 1 procedure TStringList2.SetTextStr(const Value: string); label Again; var P, PStart, PEnd: PChar; s: string; FirstLineBreakChar: Char; LastLineBreakChar: Char; LineBreakLength: integer; TextLength: integer; PMax: PChar; function IsBreak(PC: PChar): Boolean; var P: PChar; begin P := PChar(LineBreak); while (P^ <> #00) do begin if P^ <> PC^ then Exit(False); inc(P); inc(PC); end; Result := True; end; begin LineBreakLength := Length(LineBreak); if (LineBreakLength = 1) or (LineBreak = sLineBreak) then begin // this algo does not handle LineBreakLength = 1 inherited end else begin BeginUpdate(); try Clear(); FirstLineBreakChar := LineBreak[1]; LastLineBreakChar := LineBreak[LineBreakLength]; TextLength := Length(Value); P := PChar(Value); PStart := P; PEnd := P + TextLength; if P^ <> #00 then begin Again: // skip all non important char: while (P^ <> FirstLineBreakChar) and (P^ <> #00) do inc(P); PMax := P + LineBreakLength; // either null or FirstLineBreakChar. if (PMax < PEnd) then begin // current char = FirstLineBreakChar ... could be a line break ! if (P[LineBreakLength - 1] = LastLineBreakChar) then begin // there is a hight chance that its a linebreak ! if IsBreak(P) then begin // this is a linebreak. SetString(s, PStart, P - PStart); Add(s); PStart := P + LineBreakLength; end; end; inc(P, LineBreakLength); // eat all processed chars... goto Again; end else begin if (P > PStart) then begin SetString(s, PStart, P - PStart); Add(s); end else if (P = PStart) then begin // didn't move. if (PMax <> PEnd) then begin SetString(s, PStart, PEnd - PStart); Add(s); end; end; end; end; finally EndUpdate(); end; end; end; @Attila Kovacs Note that the original implementation and yours do not handle correctly trailing. Share this post Link to post
Attila Kovacs 629 Posted June 22, 2020 @Mahdi Safsafi Can you give me a test case for the incorrect handling of the trailing? This would be important to know. In the meanwhile I made a non-recursive version as Stefan pointed out it should be faster, I knew it, but for some reason I decided to make it with a recursive nested procedure. And as the RTL version was unable to handle the mentioned file, with a faulty LineBreak (#10#10#13), and my version worked like a charm I did not go for any further optimization. And by the way, thus did it became a good example for OoOE. This is the new last one: Stefan means, I could achieve even better results without the "for", but I could not. It's beyond me. I rolled out the first contact to the LineBreak and therefor the "for" is not reached too often. I think I reached the max effort I can put into this (time/pay off). It's also faster than yours by the way, and your implementation fails on 7 test-cases of mine. procedure TStringList.SetTextStr(const Value: string); var P, fc, Start: PChar; b: boolean; i: integer; s: string; LineBreakLen: integer; LLineBreak: string; begin if LineBreak = #13#10 then inherited else begin BeginUpdate; try Clear; P := Pointer(Value); if P <> nil then begin LLineBreak := LineBreak; LineBreakLen := Length(LLineBreak); if LineBreakLen > 0 then begin fc := Pointer(LineBreak); Start := P; while P^ <> #0 do begin while (P^ <> fc^) and (P^ <> #0) do Inc(P); if P^ <> #0 then begin b := True; for i := 1 to LineBreakLen - 1 do if (P + i)^ <> (fc + i)^ then begin b := False; Break; end; if b then begin SetString(s, Start, P - Start); Add(s); Inc(P, LineBreakLen); Start := P; end else Inc(P); end; end; if P > Start then begin SetString(s, Start, P - Start); Add(s); end; end else Add(Value); end; finally EndUpdate; end; end; end; Share this post Link to post
Mahdi Safsafi 225 Posted June 23, 2020 Quote Can you give me a test case for the incorrect handling of the trailing? This would be important to know. Your new version rocks ! it passed all the cases, except this: LineBreak = '--'; Text = LineBreak; LIST.COUNT= EXPECT(0) GOT(1) Quote It's also faster than yours by the way, and your implementation fails on 7 test-cases of mine. Great job 🙂 Perhaps you can use xor operator to handle LinebreakLength when its in [1,2,3,4]: case length (linebreak) of 1: b := (PWord(P + i)^ xor PWord(fc + i)^) = 0; 2,3,4:... else for ... Share this post Link to post
Attila Kovacs 629 Posted June 23, 2020 (edited) Ok, no "for" loop, no "xor" etc.. which hides implementation details on Char, and reverted back to text=LineBreak delivers 1 empty string for compatibility reasons (original TString / .Split(), etc...) procedure TStringList.SetTextStr(const Value: string); var P, P2, fc, Start, LineBreakEnd: PChar; s: string; LineBreakLen: integer; LLineBreak: PChar; begin BeginUpdate; try Clear; P := Pointer(Value); if P <> nil then begin LLineBreak := PChar(LineBreak); LineBreakLen := Length(LLineBreak); if LineBreakLen > 0 then begin LineBreakEnd := LLineBreak + LineBreakLen; fc := LLineBreak; Start := P; while P^ <> #0 do begin while (P^ <> fc^) and (P^ <> #0) do Inc(P); if P^ <> #0 then begin P2 := P + 1; Inc(fc); while fc^ <> #0 do begin if P2^ <> fc^ then Break; Inc(P2); Inc(fc) end; if fc = LineBreakEnd then begin SetString(s, Start, P - Start); Add(s); P := P2; Start := P; end else Inc(P); fc := LLineBreak; end; end; if P > Start then begin SetString(s, Start, P - Start); Add(s); end; end else Add(Value); end; finally EndUpdate; end; end; Edited June 23, 2020 by Attila Kovacs 1 Share this post Link to post
Mahdi Safsafi 225 Posted June 23, 2020 Quote Nice catch, I was comparing against the RTL code 😉 Seems to be even faster as the RTL #13#10 code. -- command line Quote That did not work out well. I'll let the case out, less implementation detail on PChar. It should work after all its just a two-byte type. Share this post Link to post
Mahdi Safsafi 225 Posted June 23, 2020 @Attila Kovacs do it like this "(PWord(P + i)^ xor PWord(fc + i)^) = 0" this is better than (P + i)^ = (fc + i)^ Share this post Link to post