mvanrijnen 123 Posted December 16, 2020 35 minutes ago, David Heffernan said: I'm just saying that it's very unlikely that there will be real world code that suffers. That said, I don't know about weak refs so that could be significant. You got this from Rudy J.? Share this post Link to post
Mahdi Safsafi 225 Posted December 16, 2020 24 minutes ago, David Heffernan said: I'm just saying that it's very unlikely that there will be real world code that suffers. That said, I don't know about weak refs so that could be significant. Maybe but its not just related to jagged array or weak refs. Here is a simple example where a copy happens too: procedure TForm1.Button1Click(Sender: TObject); var // array of simple type LArray, LArray2: array of Integer; begin SetLength(LArray, 100); LArray2 := LArray; SetLength(LArray2, 100); // copy happens here too end; With all that being said, I think that a fix at the RTL level is necessary. Share this post Link to post
David Heffernan 2345 Posted December 16, 2020 (edited) 50 minutes ago, Mahdi Safsafi said: Here is a simple example where a copy happens too Well, yes. That's part of the contract. SetLength is contracted to return a unique object, that is with ref count 1. It's even documented! http://docwiki.embarcadero.com/Libraries/en/System.SetLength "Following a call to SetLength, S is guaranteed to reference a unique string or array -- that is, a string or array with a reference count of one.“ The only scenario you've raised that I could see being troublesome is a copy for weak refs. Everything else isn't going to have discernible real world impact. Edited December 17, 2020 by David Heffernan 2 Share this post Link to post
Guest Posted December 17, 2020 @Mahdi Safsafi Would you please show me/us how to use DDetours to intercept DynArraySetLength ? I believe in statistics and numbers, also i know for sure that SetLength is used very extensively in various 3rd-party not only in RTL/VCL/FMX, but others like SecureBlackBox, almost all HTTP/REST libraries, image handling, compressing/dcompressing .... so i want to imagine the scale of its usage and how much been called with the same length, in an empty project or real production application, also seeing how much the IDE itself is using it would be insightful. I am not sure how to use the recursive feature, i am missing something obvious or DDetours library is using it internally causing a conflict. Share this post Link to post
Guest Posted December 17, 2020 @Mahdi Safsafi I figured it out, i was doing it wrong, now coffee is working, an empty project the usage of it negligible, but with one REST call using SBB DynArraySetLength had been called 14034 times, 608 time were using the same length, and that with no certificate handling or validation, tens of thing need to be counted now 🙂 Recursive was throwing me on the wrong direction, but here how to do it and it can't be any simpler, brilliant library Mahdi ! type TDynArraySetLength = procedure(var a: Pointer; typeInfo: Pointer; dimCnt: NativeInt; lengthVec: PNativeint); var TrampolineDynArraySetLength: TDynArraySetLength = nil; SetLengthCalls, SetLengthCallsSameLength: Integer; implementation uses DDetours; {$R *.dfm} procedure InterceptDynArraySetLength(var a: Pointer; typeInfo: Pointer; dimCnt: NativeInt; lengthVec: PNativeint); type PDynArrayRec = ^TDynArrayRec; TDynArrayRec = packed record {$IFDEF CPU64BITS} Padding: Integer; {$ENDIF} RefCnt: Integer; Length: NativeInt; end; var oldLength: NativeInt; p: Pointer; begin p := a; oldLength := 0; if p <> nil then begin Dec(PByte(p), SizeOf(TDynArrayRec)); oldLength := PDynArrayRec(p).Length; end; if oldLength = lengthVec^ then Inc(SetLengthCallsSameLength); Inc(SetLengthCalls); TrampolineDynArraySetLength(a, typeInfo, dimCnt, lengthVec); end; initialization TrampolineDynArraySetLength := InterceptCreate(@DynArraySetLength, @InterceptDynArraySetLength, nil, [ioRecursive]); finalization if Assigned(TrampolineDynArraySetLength) then InterceptRemove(@TrampolineDynArraySetLength); end. Share this post Link to post
Guest Posted December 17, 2020 I counted the calls on server in simulation, the server is using SBB and RTC with Unidac, for short run the counts were 117m SetLength call with 6.3m calls with the same length, skipping the calls with the same length added around 30 connections extra above my 380 full HTTPS with TLS1.2 requests per second, that is not much, i am running Apache Meter and my server on the same device, so my numbers should be around 3 times, and i liked the plateau in throughput didn't get effected but looks even smoother. Share this post Link to post
Mahdi Safsafi 225 Posted December 17, 2020 5 hours ago, Kas Ob. said: I am not sure how to use the recursive feature, i am missing something obvious or DDetours library is using it internally causing a conflict. Please see wiki : https://github.com/MahdiSafsafi/DDetours/wiki#enterrecursivesection Share this post Link to post
Mike Torrettinni 198 Posted December 22, 2020 On 12/17/2020 at 2:59 AM, Kas Ob. said: I counted the calls on server in simulation, the server is using SBB and RTC with Unidac, for short run the counts were 117m SetLength call with 6.3m calls with the same length, skipping the calls with the same length added around 30 connections extra above my 380 full HTTPS with TLS1.2 requests per second, that is not much, i am running Apache Meter and my server on the same device, so my numbers should be around 3 times, and i liked the plateau in throughput didn't get effected but looks even smoother. @Kas Ob. did you make any decision based on this 5+% wasted calls? You write it adds almost 8% extra connections, right? Do these increases make a deciding factors to make changes for you, or not? Will you implement the change in release version? Share this post Link to post
Pat Foley 51 Posted December 22, 2020 type ToldschoolArray = TPointerList; // array of pointer TusesSetlengthinternallyArray = TList; TtrianglePts = TList<TPoint>; ... procedure TForm1.btnmakeArrayClick(Sender: TObject); var oldSchoolArray: ToldschoolArray; things: TusesSetlengthinternallyArray; I, X, Getby: Integer; viewI, viewM: Integer; pI: PInteger; const bestguess = 10; begin x := random(10)-4; things := TusesSetlengthinternallyArray.Create; things.Capacity := bestguess; // ^setting may increase speed when 10000 + coming Try for I := 0 to pred(bestguess+x) do begin new(pI); pI^ := I; things.Add(pI); //oldSchoolArray.add(pI); can't do this :( end; oldSchoolArray := Things.List; // I was wondering why they added .list ViewI := PInteger(Things.Last)^; // guessing generics helped add this goody:) Getby := bestguess + X - 1; ViewM := PInteger(oldSchoolArray[Getby])^; caption := 'Easy:' + viewI.tostring + ' Hard:' + viewM.tostring; Finally for I := 0 to things.Count - 1 do dispose(things[I]); things.Destroy; End; Older School would be slice with oversized fixed array. To avoid DynaArray use D3. Share this post Link to post
Pat Foley 51 Posted December 22, 2020 procedure TForm1.btnuseListUXClick(Sender: TObject); var trianglePts: TtrianglePts; // Tlist of <Points> trianglePtsExtended: TtrianglePts; begin trianglePts := TtrianglePts.Create; trianglePts.Add(point(20,20)); trianglePts.Add(point(200,100)); trianglePts.Add(point(200,20)); trianglePts.Add(point(20,20)); Canvas.Pen.Color := clgreen; Canvas.Polyline(trianglePts.List); trianglePtsExtended := TtrianglePts.Create; trianglePtsExtended.AddRange(trianglePts.ToArray); trianglePtsExtended.insert(1,point(20,100)); Canvas.Pen.Color := clRed; Canvas.Polyline(trianglePtsExtended.toarray); trianglePtsExtended.Free; trianglePts.Free; end; forgot generic example. TtrianglePts = TList<TPoint>; Share this post Link to post
Guest Posted December 23, 2020 9 hours ago, Mike Torrettinni said: did you make any decision based on this 5+% wasted calls? I don't think it is easy to answer with yes and no here, the most important thing is i know now the impact of this, that was one, the second thing is SBB was very slow, mine is the expired v15 which i spend hundreds of hours tweaking and optimizing to reach 50-100 times faster PKI operation, which is essential to achieve hundreds of TLS connection to begin with, i don't know of you are familiar with SBB or not, it does goes beyond OpenSLL in security operation coverage in something one can say 10 times, but the speed is almost is a show stopper, so i am happy with the current state, as SBB and RTC aren't the bottleneck anymore. 10 hours ago, Mike Torrettinni said: You write it adds almost 8% extra connections, right? Do these increases make a deciding factors to make changes for you, or not? My servers now runs at peak usage at 6% of the CPU with 250-300 concurrent connections and this after splitting the servers and load balancing not for the sake of the speed but for security and availability to behave like a cloud delivering connection in quite few separated datacenters, so no, i am not using this for now, because the gain from this specific case was/is in the handshake itself only, which is the main factor. 10 hours ago, Mike Torrettinni said: Will you implement the change in release version? Most likely no, as i always prefer security over speed, see with DDetours i will lose DEP or the ability to use an EXE packers/protector, so weighing my options i will not with this, not now at least, but.. for this exact function in other applications it can have huge performance impact specially on DB operations ... And this bring us to another talk, and here i will call on @Mahdi Safsafi to join with his insight and opinion. There is many more important functions do have more speed impact on application written by Delphi, two of them are disguised and looks innocents but i hate them the most as they are essential everywhere even more then SetLength, they are outdated and will have impact on the IDE as well any almost any application from DB (operation specially), HTTP, HTML, logging .... they are everywhere. IntToStr and StrToInt these also affect the IDE from scrolling to disassembler view, now to Mahdi question and suggestion How about a spin-off or a new feature to DDetours, where a hooks will be applied to the file as a patch then remove itself, lets say PatchSelf is procedure will be called from initialization and it will apply any number of hooks/patchs for a copy of the EXE then patch PatchSelf itself to make it clean, this way we can safely patch the RTL/VCL without losing the DEP or the protector, i think it is doable and very useful and will not break RTL compatibility, yet any enhancement/replacement of these functions will be useful for everyone ! (the impact on debug and log error will be minimum but known and predictable), the tricky one will be handling DLLs, away from that i suggest to make this cold patching form itself to shorten the process, although a tool that look for specific signature for a constant record to replace one by one also doable, but patching itself at first run or with a command line to self is cool and less prone to bugs, so what do you think ? The idea is simple to put the frame on how to enhance Delphi built applications using unified library, and i know Andreas Hausladen can add this feature to his IDEFixPack and it will be more precise as the patch will be applied to the address's in calling code instead of the called, on compiling time, break nothing with debug information and save the extra jump, and it could be using a string constant as small script to build these hooks/patchs to make this patching process a concrete and project specific process. (don't know Andreas user name to add him) Share this post Link to post
Mahdi Safsafi 225 Posted December 23, 2020 2 hours ago, Kas Ob. said: Most likely no, as i always prefer security over speed, see with DDetours i will lose DEP or the ability to use an EXE packers/protector, so weighing my options i will not with this, not now at least, but.. for this exact function in other applications it can have huge performance impact specially on DB operations ... Why ? the packer encrypts/compresses the exe in disk but soon the exe is loaded in memory, packer had to decrypt/decompress the exe and DDetours applies hook at runtime (in memory). So I don't think that using DDetours prevents anyone from using a Packer. Please if you encountered such a case send me some details 🙂 Quote There is many more important functions do have more speed impact on application written by Delphi, two of them are disguised and looks innocents but i hate them the most as they are essential everywhere even more then SetLength, they are outdated and will have impact on the IDE as well any almost any application from DB (operation specially), HTTP, HTML, logging .... they are everywhere. IntToStr and StrToInt these also affect the IDE from scrolling to disassembler view, now to Mahdi question and suggestion Indeed ! Personally I spotted many place where core RTL functions were doing crappy things. Quote How about a spin-off or a new feature to DDetours, where a hooks will be applied to the file as a patch then remove itself, lets say PatchSelf is procedure will be called from initialization and it will apply any number of hooks/patchs for a copy of the EXE then patch PatchSelf itself to make it clean, this way we can safely patch the RTL/VCL without losing the DEP or the protector, i think it is doable and very useful and will not break RTL compatibility, yet any enhancement/replacement of these functions will be useful for everyone ! (the impact on debug and log error will be minimum but known and predictable), the tricky one will be handling DLLs, away from that i suggest to make this cold patching form itself to shorten the process, although a tool that look for specific signature for a constant record to replace one by one also doable, but patching itself at first run or with a command line to self is cool and less prone to bugs, so what do you think ? Here is what I understand from your idea (please correct me if I'm wrong): I'm I right to think that you want DDetours to hook functions on the first run and then generates a patched exe file that eventually will be saved on disk ? If so, then this can't be done because DDetours can't pack/compress exe ... In other word, it will fail the day you use a packer. Thanks for your ideas 🙂 Share this post Link to post
Guest Posted December 23, 2020 31 minutes ago, Mahdi Safsafi said: Why ? the packer encrypts/compresses the exe in disk but soon the exe is loaded in memory, packer had to decrypt/decompress the exe and DDetours applies hook at runtime (in memory). So I don't think that using DDetours prevents anyone from using a Packer. Please if you encountered such a case send me some details 🙂 In case of simple executable packer, runtime hooking will fail with DEP, for real protector (anti-patch..antidump..) DEP will be disabled by the protection software layer itself, but most of them ( the real protector) will watch and check the runtime code is not patched, the protection software will have his own checksum and will periodically check the code integrity, for me personally WinLicense is the best and it does have many specific technicality to protect against runtime patching/hooking, of course such functionality can be disabled but it will leave the EXE vulnerable, i remember when be consulted on an application that was written in Delphi but the developers used Andreas VCL FixPack, so they did disable the runtime code integrity checking, this allowed me to demonstrated for them how to extract the most valuable information used a simple dumper and cheat engine by hooking few string handling functions to check for specific size then dump the data to a file, the whole process took very short time i didn't even track or disassemble anything, all what i need to see is specific length in an edit box ! 40 minutes ago, Mahdi Safsafi said: Here is what I understand from your idea (please correct me if I'm wrong): I'm I right to think that you want DDetours to hook functions on the first run and then generates a patched exe file that eventually will be saved on disk ? If so, then this can't be done because DDetours can't pack/compress exe ... In other word, it will fail the day you use a packer. It is like you described, only the EXE will be run for the first time after building, DDetours will then patch a copy on the disk, then comes the packing then signing of the patched one in chained build process, simply put the release build process will add one step to run the EXE as soon as it built by Delphi compiler and will continue to use the patched one, this also will not be a problem with EurekaLog as it will be already done building its engine in the EXE before the patching. In such we don't have any sort of limitation on any tools to be used from simple packing/compressing ( UPX..) to protecting ( WinLicense, ASProtect ..) Share this post Link to post
Mahdi Safsafi 225 Posted December 23, 2020 Thanks I got clearly your idea ... but I have some hard time to digest this statement : Quote In such we don't have any sort of limitation on any tools to be used from simple packing/compressing ( UPX..) to protecting ( WinLicense, ASProtect ..) Technically this won't work ! The packer unpacks the exe in memory, then DDetours applies patch ... right ? But now we ended-up with an unpacked exe ... You can't simply dump this on disk ! Share this post Link to post
Guest Posted December 23, 2020 11 minutes ago, Mahdi Safsafi said: Technically this won't work ! The packer unpacks the exe in memory, then DDetours applies patch ... right ? But now we ended-up with an unpacked exe ... You can't simply dump this on disk ! The packer will not see or handle the original file, it will fed the patched one, the process of building usually like this in sequenced event or batch command 1) Build, this include compile/link/EL etc.. 2) Protect/Pack the generated EXE 3) Sign 4) Build the setup/installer and sing it again With the proposed Patching 1) Build, this include compile/link/EL etc.. 2) Run one time and generate a patched version 3) Protect/Pack the patched file 4) Sign 5) Build the setup/installer and sing it again Share this post Link to post
Mahdi Safsafi 225 Posted December 23, 2020 2 minutes ago, Kas Ob. said: The packer will not see or handle the original file, it will fed the patched one, the process of building usually like this in sequenced event or batch command 1) Build, this include compile/link/EL etc.. 2) Protect/Pack the generated EXE 3) Sign 4) Build the setup/installer and sing it again With the proposed Patching 1) Build, this include compile/link/EL etc.. 2) Run one time and generate a patched version 3) Protect/Pack the patched file 4) Sign 5) Build the setup/installer and sing it again Ah I see ... I thought you were referring to : - Build - Pack - Run one time and generate a patched version - ... Thanks ! I'll investigate further on your idea and if all works great, I'll release it as an IDE plugin 🙂 Share this post Link to post
Guest Posted December 23, 2020 4 minutes ago, Mahdi Safsafi said: an IDE plugin 🙂 Interesting indeed, I have failed project it was few years back i tried to do the same but using the tools with EL and altered its engine to search and replace, then you know, it failed and i got bored and found other things to do, the usual laziness. Anyway, an IDE plugin will be just fantastic, so good luck and as every Delphi application will benefit from this. Share this post Link to post
Jacek Laskowski 57 Posted December 23, 2020 8 hours ago, Kas Ob. said: (don't know Andreas user name to add him) @jbg Share this post Link to post
Guest Posted December 24, 2020 @Mahdi Safsafi Here another idea that will be very useful specially if the cold hooking worked. The real benefits of cold hooking is to replace slow untouchable function with faster one, so there is a chance the new function will be written in assembly, assembly most likely means few different versions and choosing between those will solemnly based on the CPU that can be known on site and at runtime only if one don't want to have different binaries, so the ability to switch between different versions of the InterceptProc at runtime is essential, this wasn't/isn't with current DDetours as you can switch them any time like remove and re-intercept but there is two reasons to add different of intercept mechanism 1) Chaining procs is possible without removing but will introduce overhead with checking, unless one removed and re-add, but this will require atomic protection which is an overhead itself. 2) With cold hooking aka patching this is not an option or will require similar overhead like (1) So i propose to introduce two new functions Quote function InterceptCreate(const TargetProc, InterceptProc: PPointer; const Param: Pointer = nil; const Options: TInterceptOptions = DefaultInterceptOptions) : Pointer; overload; procedure InterceptCreate(const TargetProc, InterceptProc: PPointer; var TrampoLine: Pointer; const Param: Pointer = nil; const Options: TInterceptOptions = DefaultInterceptOptions); overload; here the InteceptProc is PPointer means the target address will be global var and can be switch easily any time even with patched binary, one even check the CPU then decide which to use, also you can run (or profile) the same patched exe with few different versions of the proc at runtime, also you can choose the best fit based on the performance at runtime after deployment, as example, let it do small benchmark for these highly critical functions on the production CPU then use it, and this happen on highly secured and protected EXE with/without DEP, most AntiVirus software handle running EXE without DEP with suspicion sometimes in harsh way, with DEP enabled the AV is way more relaxed and will minimize its hooking to critical functions like reading/writing files. Share this post Link to post
Mahdi Safsafi 225 Posted December 24, 2020 @Kas Ob. Cool idea ... I love it ! Although, this could be somehow challenging on x64 where data (variable that holds the pointer) could be located so far (exceeding 32-bit range) ... But I have a couple of ideas how to handle it ... Again thanks for the great idea 🙂 1 Share this post Link to post
Guest Posted December 24, 2020 2 hours ago, Mahdi Safsafi said: this could be somehow challenging on x64 where data (variable that holds the pointer) could be located so far (exceeding 32-bit range) How about something like this as food for thought type TReservedSpace = array[0..1] of Pointer; // or array [0..the needed space] of Byte TInterceptProcRec = record TargetProc: Pointer; ReservedSpace: TReservedSpace; end; PInterceptProcRec = ^TInterceptProcRec; {const RESERVER_SAPCE: TReserverSpace = (nil, nil); // or for x86 (Pointer($90909090),Pointer($90909090)); } type TMove = procedure(const Source; var Dest; Count: NativeInt); var InterceptMove: TMove; TrampolineMove: TMove; MoveInterceptRec: TInterceptProcRec = (TargetProc: @Move); procedure IntstallHook(DDetoursProc: PInterceptProcRec; var InterceptProc, TrampoLine: Pointer); begin // simulating DDetours InterceptProc := DDetoursProc.TargetProc; TrampoLine := DDetoursProc.TargetProc; end; procedure Test; var A, B: Integer; begin IntstallHook(@MoveInterceptRec, @InterceptMove, @TrampolineMove); // these should succeed Move(A, B, 4); InterceptMove(A, B, 4); TrampolineMove(A, B, 4); end; begin Test; end. By using record you can reserve any space needed, also the global record var is located in either second (not really) but the third section ".data" in the PE structure in case with Delphi, this means no more memory allocation and you don't need to touch, check or alter the PE sections size, keeping the EXE signature for Delphi compiler EXE generated files untouched which i see it as advantage, also the size of executables sections ".text" and ".itext" along with the ".data" have almost zero chance to become beyond 32 bit range, i doubt Delphi compiler and linker will be able to generate a 2gb EXE and that include resources, one thing though it might be have virtual size's that goes beyond 2gb, but in this case a warning will do, by just make sure that the hooking record are processed earlier in building process, still i might be wrong here, on how global vars are sorted in the EXE, is it who is first always ? based on the uses clauses in the dpr ? Share this post Link to post
Mahdi Safsafi 225 Posted December 24, 2020 (edited) @Kas Ob. No ! its more complicated than what you're describing ! Its not about the ability of the compiler to generate more than 2GB ! its about the virtual memory ! On x64, the OS can load DLL at the highest address and your hook can be located at the lowest address ... you see ? Also what happens if the variable is created at runtime (through a call to MM)! Edited December 24, 2020 by Mahdi Safsafi Share this post Link to post
Guest Posted December 24, 2020 6 minutes ago, Mahdi Safsafi said: On x64, the OS can load DLL at the highest address and your hook can be located at the lowest address ... you see ? But it will be stay relatively close, just like all the original code, hence the need for relocation table in case with DLL's usage, and that is the tricky part with DLL that i missed. Share this post Link to post
Guest Posted December 24, 2020 This reading might help but it also does show the tricky part and changes to PE reloc section https://stackoverflow.com/questions/17436668/how-are-pe-base-relocations-build-up Share this post Link to post
Mahdi Safsafi 225 Posted December 24, 2020 @Kas Ob. As I told you ... its a problem related to VM ! // What happens when variable wasn't declared statically !!! MoveInterceptRec := GetMemory(xx); // the variable can be located far >2GB. Share this post Link to post