Jump to content
Mike Torrettinni

DynArraySetLength doesn't check for NewLength = OldLength

Recommended Posts

35 minutes ago, David Heffernan said:

I'm just saying that it's very unlikely that there will be real world code that suffers. That said, I don't know about weak refs so that could be significant. 

You got this from Rudy J.?

 

Share this post


Link to post
24 minutes ago, David Heffernan said:

I'm just saying that it's very unlikely that there will be real world code that suffers. That said, I don't know about weak refs so that could be significant. 

Maybe but its not just related to jagged array or weak refs. Here is a simple example where a copy happens too:

procedure TForm1.Button1Click(Sender: TObject);
var
  // array of simple type
  LArray, LArray2: array of Integer;
begin
  SetLength(LArray, 100);
  LArray2 := LArray;
  SetLength(LArray2, 100); // copy happens here too
end;

With all that being said, I think that a fix at the RTL level is necessary.

Share this post


Link to post
50 minutes ago, Mahdi Safsafi said:

Here is a simple example where a copy happens too

Well, yes. That's part of the contract. SetLength is contracted to return a unique object, that is with ref count 1.

 

It's even documented! 

 

http://docwiki.embarcadero.com/Libraries/en/System.SetLength

 

"Following a call to SetLength, S is guaranteed to reference a unique string or array -- that is, a string or array with a reference count of one.“

 

 

The only scenario you've raised that I could see being troublesome is a copy for weak refs. Everything else isn't going to have discernible real world impact. 

Edited by David Heffernan
  • Like 2

Share this post


Link to post

@Mahdi Safsafi Would you please show me/us how to use DDetours to intercept DynArraySetLength ?

I believe in statistics and numbers, also i know for sure that SetLength is used very extensively in various 3rd-party not only in RTL/VCL/FMX, but others like SecureBlackBox, almost all HTTP/REST libraries, image handling, compressing/dcompressing ....
so i want to imagine the scale of its usage and how much been called with the same length, in an empty project or real production application, also seeing how much the IDE itself is using it would be insightful.

I am not sure how to use the recursive feature, i am missing something obvious or DDetours library is using it internally causing a conflict.

Share this post


Link to post

@Mahdi Safsafi

I figured it out, i was doing it wrong, now coffee is working, an empty project the usage of it negligible, but with one REST call using SBB DynArraySetLength had been called 14034 times, 608 time were using the same length, and that with no certificate handling or validation, tens of thing need to be counted now 🙂

Recursive was throwing me on the wrong direction, but here how to do it and it can't be any simpler, brilliant library Mahdi !


type
  TDynArraySetLength = procedure(var a: Pointer; typeInfo: Pointer; dimCnt: NativeInt; lengthVec: PNativeint);

var
  TrampolineDynArraySetLength: TDynArraySetLength = nil;
  SetLengthCalls, SetLengthCallsSameLength: Integer;

implementation

uses
  DDetours;

{$R *.dfm}

procedure InterceptDynArraySetLength(var a: Pointer; typeInfo: Pointer; dimCnt: NativeInt; lengthVec: PNativeint);
type
  PDynArrayRec = ^TDynArrayRec;
  TDynArrayRec = packed record
  {$IFDEF CPU64BITS}
    Padding: Integer;
  {$ENDIF}
    RefCnt: Integer;
    Length: NativeInt;
  end;
var
  oldLength: NativeInt;
  p: Pointer;
begin
  p := a;
  oldLength := 0;
  if p <> nil then
  begin
    Dec(PByte(p), SizeOf(TDynArrayRec));
    oldLength := PDynArrayRec(p).Length;
  end;

  if oldLength = lengthVec^ then
    Inc(SetLengthCallsSameLength);

  Inc(SetLengthCalls);
  TrampolineDynArraySetLength(a, typeInfo, dimCnt, lengthVec);
end;

initialization
  TrampolineDynArraySetLength := InterceptCreate(@DynArraySetLength, @InterceptDynArraySetLength, nil, [ioRecursive]);

finalization
  if Assigned(TrampolineDynArraySetLength) then
    InterceptRemove(@TrampolineDynArraySetLength);
    
end.






  • Thanks 1

Share this post


Link to post

I counted the calls on server in simulation, the server is using SBB and RTC with Unidac, for short run the counts were 117m SetLength call with 6.3m calls with the same length, skipping the calls with the same length added around 30 connections extra above my 380 full HTTPS with TLS1.2 requests per second, that is not much, i am running Apache Meter and my server on the same device, so my numbers should be around 3 times, and i liked the plateau in throughput didn't get effected but looks even smoother.

  • Like 2

Share this post


Link to post
On 12/17/2020 at 2:59 AM, Kas Ob. said:

I counted the calls on server in simulation, the server is using SBB and RTC with Unidac, for short run the counts were 117m SetLength call with 6.3m calls with the same length, skipping the calls with the same length added around 30 connections extra above my 380 full HTTPS with TLS1.2 requests per second, that is not much, i am running Apache Meter and my server on the same device, so my numbers should be around 3 times, and i liked the plateau in throughput didn't get effected but looks even smoother.

@Kas Ob. did you make any decision based on this 5+% wasted calls? You write it adds almost 8% extra connections, right? Do these increases make a deciding factors to make changes for you, or not? Will you implement the change in release version?

Share this post


Link to post
type
  ToldschoolArray = TPointerList; // array of pointer
  TusesSetlengthinternallyArray = TList;
  TtrianglePts = TList<TPoint>;
...
    
procedure TForm1.btnmakeArrayClick(Sender: TObject);
var
  oldSchoolArray: ToldschoolArray;
  things: TusesSetlengthinternallyArray;
  I, X, Getby: Integer;
  viewI, viewM: Integer;
  pI: PInteger;
const
  bestguess = 10;
begin
  x := random(10)-4;
  things := TusesSetlengthinternallyArray.Create;
  things.Capacity := bestguess;
         // ^setting may increase speed when 10000 + coming
  Try
    for I := 0 to pred(bestguess+x) do
    begin
      new(pI);
      pI^ := I;
      things.Add(pI);
      //oldSchoolArray.add(pI); can't do this :(
    end;

    oldSchoolArray := Things.List; // I was wondering why they added .list
    ViewI := PInteger(Things.Last)^;  // guessing generics helped add this goody:)
    Getby := bestguess + X - 1;
    ViewM := PInteger(oldSchoolArray[Getby])^;
    caption := 'Easy:' + viewI.tostring + ' Hard:' + viewM.tostring;
  Finally
    for I := 0 to things.Count - 1 do dispose(things[I]);
    things.Destroy;
  End;

Older School would be slice with oversized fixed array.    To avoid DynaArray use D3.    

Share this post


Link to post
procedure TForm1.btnuseListUXClick(Sender: TObject);
var
  trianglePts: TtrianglePts;    //           Tlist of <Points>
  trianglePtsExtended: TtrianglePts;
begin
  trianglePts := TtrianglePts.Create;
  trianglePts.Add(point(20,20));
  trianglePts.Add(point(200,100));
  trianglePts.Add(point(200,20));
  trianglePts.Add(point(20,20));
  Canvas.Pen.Color  := clgreen;
  Canvas.Polyline(trianglePts.List);

  trianglePtsExtended := TtrianglePts.Create;
  trianglePtsExtended.AddRange(trianglePts.ToArray);
  trianglePtsExtended.insert(1,point(20,100));
  Canvas.Pen.Color  := clRed;
  Canvas.Polyline(trianglePtsExtended.toarray);

  trianglePtsExtended.Free;
  trianglePts.Free;
end;

forgot generic example.

  TtrianglePts = TList<TPoint>;

Share this post


Link to post
9 hours ago, Mike Torrettinni said:

did you make any decision based on this 5+% wasted calls?

I don't think it is easy to answer with yes and no here, the most important thing is i know now the impact of this, that was one, the second thing is SBB was very slow, mine is the expired v15 which i spend hundreds of hours tweaking and optimizing to reach 50-100 times faster PKI operation, which is essential to achieve hundreds of TLS connection to begin with, i don't know of you are familiar with SBB or not, it does goes beyond OpenSLL in security operation coverage in something one can say 10 times, but the speed is almost is a show stopper, so i am happy with the current state, as SBB and RTC aren't the bottleneck anymore.

10 hours ago, Mike Torrettinni said:

You write it adds almost 8% extra connections, right? Do these increases make a deciding factors to make changes for you, or not?

My servers now runs at peak usage at 6% of the CPU with 250-300 concurrent connections and this after splitting the servers and load balancing not for the sake of the speed but for security and availability to behave like a cloud delivering connection in quite few separated datacenters, so no, i am not using this for now, because the gain from this specific case was/is in the handshake itself only, which is the main factor.

 

10 hours ago, Mike Torrettinni said:

Will you implement the change in release version?

Most likely no, as i always prefer security over speed, see with DDetours i will lose DEP or the ability to use an EXE packers/protector, so weighing my options i will not with this, not now at least, but.. for this exact function in other applications it can have huge performance impact specially on DB operations ...

 

 

And this bring us to another talk, and here i will call on @Mahdi Safsafi to join with his insight and opinion.

 

There is many more important functions do have more speed impact on application written by Delphi, two of them are disguised and looks innocents but i hate them the most as they are essential everywhere even more then SetLength, they are outdated and will have impact on the IDE as well any almost any application from DB (operation specially), HTTP, HTML, logging .... they are everywhere.

IntToStr and StrToInt these also affect the IDE from scrolling to disassembler view, now to Mahdi question and suggestion

 

How about a spin-off or a new feature to DDetours, where a hooks will be applied to the file as a patch then remove itself, lets say PatchSelf is procedure will be called from initialization and it will apply any number of hooks/patchs for a copy of the EXE then patch PatchSelf itself to make it clean, this way we can safely patch the RTL/VCL without losing the DEP or the protector, i think it is doable and very useful and will not break RTL compatibility, yet any enhancement/replacement of these functions will be useful for everyone ! (the impact on debug and log error will be minimum but known and predictable), the tricky one will be handling DLLs, away from that i suggest to make this cold patching form itself to shorten the process, although a tool that look for specific signature for a constant record to replace one by one also doable, but patching itself at first run or with a command line to self is cool and less prone to bugs,

so what do you think ?

 

The idea is simple to put the frame on how to enhance Delphi built applications using unified library, and i know Andreas Hausladen can add this feature to his IDEFixPack and it will be more precise as the patch will be applied to the address's in calling code instead of the called, on compiling time, break nothing with debug information and save the extra jump, and it could be using a string constant as small script to build these hooks/patchs to make this patching process a concrete and project specific process. (don't know Andreas user name to add him)

 

  • Like 1
  • Thanks 1

Share this post


Link to post
2 hours ago, Kas Ob. said:

Most likely no, as i always prefer security over speed, see with DDetours i will lose DEP or the ability to use an EXE packers/protector, so weighing my options i will not with this, not now at least, but.. for this exact function in other applications it can have huge performance impact specially on DB operations ...

Why ? the packer encrypts/compresses the exe in disk but soon the exe is loaded in memory, packer had to decrypt/decompress the exe and DDetours applies hook at runtime (in memory). So I don't think that using DDetours prevents anyone from using a Packer. Please if you encountered such a case send me some details 🙂 

Quote

There is many more important functions do have more speed impact on application written by Delphi, two of them are disguised and looks innocents but i hate them the most as they are essential everywhere even more then SetLength, they are outdated and will have impact on the IDE as well any almost any application from DB (operation specially), HTTP, HTML, logging .... they are everywhere.

IntToStr and StrToInt these also affect the IDE from scrolling to disassembler view, now to Mahdi question and suggestion

Indeed ! Personally I spotted many place where core RTL functions were doing crappy things. 

Quote

How about a spin-off or a new feature to DDetours, where a hooks will be applied to the file as a patch then remove itself, lets say PatchSelf is procedure will be called from initialization and it will apply any number of hooks/patchs for a copy of the EXE then patch PatchSelf itself to make it clean, this way we can safely patch the RTL/VCL without losing the DEP or the protector, i think it is doable and very useful and will not break RTL compatibility, yet any enhancement/replacement of these functions will be useful for everyone ! (the impact on debug and log error will be minimum but known and predictable), the tricky one will be handling DLLs, away from that i suggest to make this cold patching form itself to shorten the process, although a tool that look for specific signature for a constant record to replace one by one also doable, but patching itself at first run or with a command line to self is cool and less prone to bugs,

so what do you think ?

Here is what I understand from your idea (please correct me if I'm wrong): I'm I right to think that you want DDetours to hook functions on the first run and then generates a patched exe file that eventually will be saved on disk ? 

If so, then this can't be done because DDetours can't pack/compress exe ... In other word, it will fail the day you use a packer.

 

Thanks for your ideas 🙂  

  • Like 1

Share this post


Link to post
31 minutes ago, Mahdi Safsafi said:

Why ? the packer encrypts/compresses the exe in disk but soon the exe is loaded in memory, packer had to decrypt/decompress the exe and DDetours applies hook at runtime (in memory). So I don't think that using DDetours prevents anyone from using a Packer. Please if you encountered such a case send me some details 🙂 

In case of simple executable packer, runtime hooking will fail with DEP, for real protector (anti-patch..antidump..) DEP will be disabled by the protection software layer itself, but most of them ( the real protector) will watch and check the runtime code is not patched, the protection software will have his own checksum and will periodically check the code integrity, for me personally WinLicense is the best and it does have many specific technicality to protect against runtime patching/hooking, of course such functionality can be disabled but it will leave the EXE vulnerable, i remember when be consulted on an application that was written in Delphi but the developers used Andreas VCL FixPack, so they did disable the runtime code integrity checking, this allowed me to demonstrated for them how to extract the most valuable information used a simple dumper and cheat engine by hooking few string handling functions to check for specific size then dump the data to a file, the whole process took very short time i didn't even track or disassemble anything, all what i need to see is specific length in an edit box !

 

40 minutes ago, Mahdi Safsafi said:

Here is what I understand from your idea (please correct me if I'm wrong): I'm I right to think that you want DDetours to hook functions on the first run and then generates a patched exe file that eventually will be saved on disk ? 

If so, then this can't be done because DDetours can't pack/compress exe ... In other word, it will fail the day you use a packer.

It is like you described, only the EXE will be run for the first time after building, DDetours will then patch a copy on the disk, then comes the packing then signing of the patched one in chained build process, simply put the release build process will add one step to run the EXE as soon as it built by Delphi compiler and will continue to use the patched one, this also will not be a problem with EurekaLog as it will be already done building its engine in the EXE before the patching.

 

In such we don't have any sort of limitation on any tools to be used from simple packing/compressing ( UPX..) to protecting ( WinLicense, ASProtect ..)

  • Like 1

Share this post


Link to post

Thanks I got clearly your idea ... but I have some hard time to digest this statement

Quote

In such we don't have any sort of limitation on any tools to be used from simple packing/compressing ( UPX..) to protecting ( WinLicense, ASProtect ..)

Technically this won't work ! The packer unpacks the exe in memory, then DDetours applies patch ... right ? But now we ended-up with an unpacked exe ... You can't simply dump this on disk !

Share this post


Link to post
11 minutes ago, Mahdi Safsafi said:

Technically this won't work ! The packer unpacks the exe in memory, then DDetours applies patch ... right ? But now we ended-up with an unpacked exe ... You can't simply dump this on disk !

The packer will not see or handle the original file, it will fed the patched one, the process of building usually like this in sequenced event or batch command

1) Build, this include compile/link/EL etc..

2) Protect/Pack the generated EXE

3) Sign

4) Build the setup/installer and sing it again

 

With the proposed Patching

1) Build, this include compile/link/EL etc..

2) Run one time and generate a patched version

3) Protect/Pack the patched file

4) Sign

5) Build the setup/installer and sing it again

Share this post


Link to post
2 minutes ago, Kas Ob. said:

The packer will not see or handle the original file, it will fed the patched one, the process of building usually like this in sequenced event or batch command

1) Build, this include compile/link/EL etc..

2) Protect/Pack the generated EXE

3) Sign

4) Build the setup/installer and sing it again

 

With the proposed Patching

1) Build, this include compile/link/EL etc..

2) Run one time and generate a patched version

3) Protect/Pack the patched file

4) Sign

5) Build the setup/installer and sing it again

Ah I see ... I thought you were referring to :

- Build

- Pack

- Run one time and generate a patched version

- ...

 

Thanks ! I'll investigate further on your idea and if all works great, I'll release it as an IDE plugin 🙂 

  • Like 1

Share this post


Link to post
4 minutes ago, Mahdi Safsafi said:

an IDE plugin 🙂 

Interesting indeed, I have failed project it was few years back i tried to do the same but using the tools with EL and altered its engine to search and replace, then you know, it failed and i got bored and found other things to do, the usual laziness.

 

Anyway, an IDE plugin will be just fantastic, so good luck and as every Delphi application will benefit from this.

Share this post


Link to post

@Mahdi Safsafi Here another idea that will be very useful specially if the cold hooking worked.

The real benefits of cold hooking is to replace slow untouchable function with faster one, so there is a chance the new function will be written in assembly, assembly most likely means few different versions and choosing between those will solemnly based on the CPU that can be known on site and at runtime only if one don't want to have different binaries, so the ability to switch between different versions of the InterceptProc at runtime is essential, this wasn't/isn't with current DDetours as you can switch them any time like remove and re-intercept but there is two reasons to add different of intercept mechanism
1) Chaining procs is possible without removing but will introduce overhead with checking, unless one removed and re-add, but this will require atomic protection which is an overhead itself.
2) With cold hooking aka patching this is not an option or will require similar overhead like (1)

So i propose to introduce two new functions 

Quote

function InterceptCreate(const TargetProc, InterceptProc: PPointer; const Param: Pointer = nil; const Options: TInterceptOptions = DefaultInterceptOptions)
  : Pointer; overload;
procedure InterceptCreate(const TargetProc, InterceptProc: PPointer; var TrampoLine: Pointer; const Param: Pointer = nil;
  const Options: TInterceptOptions = DefaultInterceptOptions); overload;


here the InteceptProc is PPointer means the target address will be global var and can be switch easily any time even with patched binary, one even check the CPU then decide which to use, also you can run (or profile) the same patched exe with few different versions of the proc at runtime, also you can choose the best fit based on the performance at runtime after deployment, as example, let it do small benchmark for these highly critical functions on the production CPU then use it, and this happen on highly secured and protected EXE with/without DEP, most AntiVirus software handle running EXE without DEP with suspicion sometimes in harsh way, with DEP enabled the AV is way more relaxed and will minimize its hooking to critical functions like reading/writing files.

  • Like 1

Share this post


Link to post

@Kas Ob. Cool idea ... I love it ! Although, this could be somehow challenging on x64 where data (variable that holds the pointer) could be located so far (exceeding 32-bit range) ... But I have a couple of ideas how to handle it ...

Again thanks for the great idea 🙂 

  • Like 2

Share this post


Link to post
2 hours ago, Mahdi Safsafi said:

this could be somehow challenging on x64 where data (variable that holds the pointer) could be located so far (exceeding 32-bit range)

How about something like this as food for thought

type
  TReservedSpace = array[0..1] of Pointer; // or array [0..the needed space] of Byte
  TInterceptProcRec = record
    TargetProc: Pointer;
    ReservedSpace: TReservedSpace;
  end;
  PInterceptProcRec = ^TInterceptProcRec;

{const
  RESERVER_SAPCE: TReserverSpace = (nil, nil); // or for x86 (Pointer($90909090),Pointer($90909090));  }

type
  TMove = procedure(const Source; var Dest; Count: NativeInt);

var
  InterceptMove: TMove;
  TrampolineMove: TMove;
  MoveInterceptRec: TInterceptProcRec = (TargetProc: @Move);

procedure IntstallHook(DDetoursProc: PInterceptProcRec; var InterceptProc, TrampoLine: Pointer);
begin
  // simulating DDetours
  InterceptProc := DDetoursProc.TargetProc;
  TrampoLine := DDetoursProc.TargetProc;
end;

procedure Test;
var
  A, B: Integer;
begin
  IntstallHook(@MoveInterceptRec, @InterceptMove, @TrampolineMove);

  // these should succeed
  Move(A, B, 4);
  InterceptMove(A, B, 4);
  TrampolineMove(A, B, 4);
end;

begin
  Test;
end.

By using record you can reserve any space needed, also the global record var is located in either second (not really) but the third section ".data" in the PE structure in case with Delphi, this means no more memory allocation and you don't need to touch, check or alter the PE sections size, keeping the EXE signature for Delphi compiler EXE generated files untouched which i see it as advantage, also the size of executables sections ".text" and ".itext" along with the ".data" have almost zero chance to become beyond 32 bit range, i doubt Delphi compiler and linker will be able to generate a 2gb EXE and that include resources, one thing though it might be have virtual size's that goes beyond 2gb, but in this case a warning will do, by just make sure that the hooking record are processed earlier in building process, still i might be wrong here, on how global vars are sorted in the EXE, is it who is first always ? based on the uses clauses in the dpr ?

Share this post


Link to post

@Kas Ob. No ! its more complicated than what you're describing ! 

Its not about the ability of the compiler to generate more than 2GB ! its about the virtual memory ! On x64, the OS can load DLL at the highest address and your hook can be located at the lowest address ... you see ?

Also what happens if the variable is created at runtime (through a call to MM)!

Edited by Mahdi Safsafi

Share this post


Link to post
6 minutes ago, Mahdi Safsafi said:

On x64, the OS can load DLL at the highest address and your hook can be located at the lowest address ... you see ?

But it will be stay relatively close, just like all the original code, hence the need for relocation table in case with DLL's usage, and that is the tricky part with DLL that i missed.

Share this post


Link to post

@Kas Ob. As I told you ... its a problem related to VM ! 

// What happens when variable wasn't declared statically !!! 
MoveInterceptRec := GetMemory(xx); // the variable can be located far >2GB.

 

  • Like 1

Share this post


Link to post

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

×