pmcgee 10 Posted November 29, 2020 (edited) A bit of a challenge ... to produce Delphi code equaling the performance of C++'s next_permutation. Hopefully more readable in Delphi ... unit D_Next_perm_string; interface procedure swapCh(var a:Ansichar; var b:Ansichar); procedure reverse(var s:AnsiString; const a,x:word); function D_next_permutation(var s:AnsiString; const first, end_:word) : boolean; implementation //uses System.Diagnostics; procedure swapCh( var a : Ansichar; var b : Ansichar ); inline; var c : Ansichar; begin if a<>b then begin c := a; a := b; b := c; end; end; procedure reverse(var s:AnsiString; const a,x:word); inline; var i,j : word; //t : string; begin // x is one past the end of string if a = x-1 then exit; j := ( x-a ) shr 1; //trunc((x-a)/2); for i := 1 to j do swapCh( s[a-1+i] , s[x-i] ); end; function D_next_permutation(var s : AnsiString; const first, end_ : word) : boolean; inline; var i, ii, j : word; begin if first = end_ then begin result := false; exit; end; i := first + 1; if i = end_ then begin result := false; exit; end; i := end_ - 1; while true do begin ii := i; dec(i); if s[i] < s[ii] then begin j := end_; repeat dec(j); until s[i] < s[j]; swapCh( s[i] , s[j] ); reverse(s, ii, end_); result := true; exit; end; if i = first then begin reverse(s, first, end_); result := false; exit; end; end; end; end. Ballpark, we're looking at about 6s vs 2s for s = "abcdefghijkl" (12 bytes). Increases by about a factor of 10 per character beyond that. Note the c++ end() iterator is **one past the end of string**, so ... D_next_permutation(s,1,13) Edited November 29, 2020 by pmcgee Share this post Link to post
pmcgee 10 Posted November 29, 2020 I didn't recall correctly. This was a significant speed up : procedure swapByte( a:Pbyte ; b:Pbyte ); inline; begin if a <> b then begin a^ := a^ + b^; b^ := a^ - b^; a^ := a^ - b^ ; end; end; swapByte( @s[a-1+i] , @s[x_-i] ); Share this post Link to post
pmcgee 10 Posted November 29, 2020 @Kas Ob. posted this in the original thread : procedure reverse2(var s: AnsiString; const a, x: word); inline; var i, j: word; tmpChar: Byte; SBytes: pByte absolute s; begin if a = x - 1 then exit; j := (x - a) shr 1; // trunc((x-a)/2); for i := 1 to j do begin tmpChar := SBytes[a - 1 + i]; SBytes[a - 1 + i] := SBytes[x - i]; SBytes[x - i] := tmpChar; end; end; var st: AnsiString; i: Integer; D: Uint64; begin st := '1234567890ab'; Writeln(st); d := GetTickCount; for i := 1 to 479001600 do // 12! reverse(st, 1, 12); D := GetTickCount - D; Writeln(d); st := '1234567890ab'; Writeln(st); d := GetTickCount; for i := 1 to 479001600 do reverse2(st, 1, 12); D := GetTickCount - D; Writeln(d); Readln; end. With these results ... (32 bit? 64 bit? compile) 1234567890ab 14125 1234567890ab 4891 Share this post Link to post
pmcgee 10 Posted November 30, 2020 The calling code from c++ : int main() { const std::string Test = "abcdefghijkl"; int i = 0; std::string v = Test; Timer timer; timer.start(); do { ++i; } while ( std::next_permutation( v.begin(), v.end() ) ); timer.stop(); std::cout << "C++ " << i << "\n"; std::cout << "Seconds: " << timer.elapsedSeconds() << std::endl; AnsiString s = Test.c_str(); int len = s.Length()+1; i = 0; Timer timer2; timer2.start(); do { ++i; } while ( D_next_permutation(s, 1, len ) ); timer2.stop(); std::cout << "Delphi " << i << "\n"; std::cout << "Seconds: " << timer2.elapsedSeconds() << std::endl; getchar(); return 0; Share this post Link to post
Guest Posted November 30, 2020 13 hours ago, pmcgee said: With these results ... (32 bit? 64 bit? compile) 1234567890ab 14125 1234567890ab 4891 It is Win32, Built with XE8 The interesting thing is the result on Delphi 2009 Quote 1234567890ab 17390 1234567890ab 3641 And from the assembly comparison it is clearly that Delphi string handling got improved over the years, but the plain simple code and algorithm got slower. One more thing is the size of the binary, i left only Windows.pas in the uses clauses XE8 produced 877KB, and it did loaded 51 modules. 2009 produced 22KB , loaded only 22 Now to your question that you called a bit of "A bit of a challenge", let me say, this is not a bit, this is real challenge and it is doomed to fail, you can't challenge a force of nature with compiler like Delphi in your hand, you will not even get close, specially that code should be generics. So if you can use it from C/C++ then you are way better with C compilers, you can optimize parts here and there, but the code will lose its value once it converted to assembly, because it is huge. Suggestions, Stick to C++ implementation, or build it into obj/o files for different platforms and use it. You want a Delphi version, fine, and it might interest people to try, but it doesn't interest me to go and translate from C++ template, if you already have Delphi code, then at least share the implementation, along with tests for correctness, and one or two benchmarks, many here can spot inefficient code or just suggest things that we all can benefit from. Share this post Link to post
pmcgee 10 Posted November 30, 2020 8 hours ago, Kas Ob. said: if you already have Delphi code, then at least share the implementation, along with tests for correctness, and one or two benchmarks, many here can spot inefficient code or just suggest things that we all can benefit from. I'm a little confused. First post *is* the Delphi code. I've been learning about C++ for a couple years, alongside 20 with Delphi, and the ideas from both are interesting to me. Naively, it seems like a Delphi version should be able to produce about the same performance. Teasing that out should be educational. Share this post Link to post
Guest Posted November 30, 2020 34 minutes ago, pmcgee said: I'm a little confused. First post *is* the Delphi code. Yes it is, but it was and should be generics to be comparable with C++ part, and the fact you are using it on only AnsiStrings will reduce its usability a lot. So i think it should be String and Char instead of AnsiString and AnsiChar, and if you can replace that single unit for now with project for console, along with two or more benchmarks that you will perform and paste the result, form both the Delphi (doesn't matter what version as long you included the code) and result with the same input from C++ build, so anyone interested in giving it a go, can simply paste the code then run and use the timing as checkpoint provided by you and your C++ version. 12! and waiting 15second is long time, shorter test bed will be nice, the code you added for testing is for C++, while result and the difference will be more appreciated here, for me the C++ part of my RAD IDE is disabled long time ago. Sorry, if i wasn't clear. Share this post Link to post
pmcgee 10 Posted December 6, 2020 (edited) On 12/1/2020 at 1:52 AM, Kas Ob. said: the fact you are using it on only AnsiStrings will reduce its usability a lot. ... 12! and waiting 15second is long time, shorter test bed will be nice Sure ... byte / ansichar is a simple case ... but you need to start with simple and solve it the best you can before tackling a bigger / harder problem. The testing is made faster by just using a shorter string. The increase in time is about x10 at the 12 mark ... so 11 is quicker, and 10 is almost immediate. Edited December 6, 2020 by pmcgee Share this post Link to post