Hafedh TRIMECHE 1 Posted April 15, 2021 Delphi: 10.3 Version 26.0.36039.7899 This code generates a memory at the first call to SetLength: A memory block has been leaked. The size is: 519 This block was allocated by thread 0x2010, and the stack trace (return addresses) at the time was: 00420A36 [FastMM5.pas][FastMM5][FastMM_DebugGetMem][7718] 006E43A7 [uMemory.pas][uMemory][NewAllocMem][88] 00407441 [System.pas][System][@ReallocMem][5022] 0040DB51 [System.pas][System][DynArraySetLength][36046] 0040B09C [System.pas][System][@LStrFromPWCharLen][26213] 0040DC92 [System.pas][System][@DynArraySetLength][36150] 00811B23 [uCommon.pas][uCommon][Mime64Decode][3729] 00928D49 [uXMLSec.pas][uXMLSec][NodeGetBase64Value][311] 0092B828 [uXMLSec.pas][uXMLSec][VerifyNode][893] 0092C0D4 [uXMLSec.pas][uXMLSec][TXMLSec.Verify][1006] 00D43D7A [Main.pas][Main][TMainForm.UpdateAttachmentsAndSignatures][298] 00D44784 [Main.pas][Main][TMainForm.LoadFile][427] 00D44886 [Main.pas][Main][TMainForm.BtnOpenClick][443] 00578075 [Vcl.Controls.pas][Vcl.Controls][TControl.Click][7536] 0059B84F [Vcl.StdCtrls.pas][Vcl.StdCtrls][TCustomButton.Click][5470] 0059C365 [Vcl.StdCtrls.pas][Vcl.StdCtrls][TCustomButton.CNCommand][5931] 00577B19 [Vcl.Controls.pas][Vcl.Controls][TControl.WndProc][7420] 75FD5E7A [Unknown function at GetClassLongW] 75FD60BF [Unknown function at GetClassLongW] 75FD5EBC [Unknown function at GetClassLongW] function Mime64Decode(const Encoded:string):TBytes; var InLen , OutLen : NativeUInt; Raw : RawByteString; begin Raw := RawByteString(Encoded); InLen := Length(Encoded); OutLen := 0; if InLen>0 then begin OutLen := MimeDecodedSize(InLen); SetLength(Result,OutLen); try OutLen := MimeDecode(Raw[1],InLen,Result[0]); except OutLen := 0; end; end; SetLength(Result,OutLen); end; Is there any alternative to allocate memory to TBytes? Best regards. Share this post Link to post
David Heffernan 2345 Posted April 15, 2021 The code that we can see can't leak. There is likely a bug in some other code, probably your code. If you show a minimal repro we'll find it easily. 1 Share this post Link to post
Hafedh TRIMECHE 1 Posted April 15, 2021 (edited) 2 hours ago, David Heffernan said: The code that we can see can't leak. There is likely a bug in some other code, probably your code. If you show a minimal repro we'll find it easily. I guess that is a Reallocation problem! The code changed to: SetLength(Result,0); SetLength(Result,OutLen); solved the problem. But it's at a higher risk: Result content would be lost when setting length to 0. Edited April 15, 2021 by Hafedh TRIMECHE Share this post Link to post
David Heffernan 2345 Posted April 15, 2021 5 minutes ago, Hafedh TRIMECHE said: I guess that is a Reallocation problem! The code changed to: SetLength(Result,0); SetLength(Result,OutLen); solved the problem. But it's at a higher risk: Result content would be lost when setting length to 0. That doesn't change anything. That is just wasteful. We know that dynamic arrays work correctly. If you really want to resolve the problem you need to clearly identify it first. Share this post Link to post
Stefan Glienke 2002 Posted April 15, 2021 Important: a memleak report just tells where the memory was allocated that got leaked not the code that actually caused the leak. You need to review the code in NodeGetBase64Value and check what it does with the result of Mime64Decode and where it passes that. 2 Share this post Link to post
Remy Lebeau 1396 Posted April 15, 2021 (edited) The call stack in that log also looks a little off. System._DynArraySetLength() does not call System._LStrFromPWCharLen(), and System._LStrFromPWCharLen() does not call System.DynArraySetLength(). System._DynArraySetLength() calls System.DynArraySetLength() directly. So, even though the code shown does call System._LStrFromPWCharLen() (on the RawByteString type-cast), that call should not be on the call stack anymore when System._ReallocMem() is called. Edited April 15, 2021 by Remy Lebeau Share this post Link to post
Stefan Glienke 2002 Posted April 15, 2021 (edited) @Remy Lebeau The LStrFromPWCharLen entry indeed looks wrong - but the lines before and after that exactly match with the 10.3.3 code - I just checked. Maybe it's just a glitch in the stack walking code done by FastMM_FullDebugMode.dll Edit: Yep - I can repro the wrong entry in the call stack with this code: uses FastMM5, System.SysUtils; var x: TBytes; s: string; r: RawByteString; begin s := 'test'; r := RawByteString(s); SetLength(x, 10); Pointer(x) := nil; ReportMemoryLeaksOnShutdown := True; end. and get this report: This block was allocated by thread 0x432C, and the stack trace (return addresses) at the time was: 004128BA [FastMM5.pas][FastMM5][FastMM_DebugGetMem$qqri][7717] 0040476D [System.pas][System][@ReallocMem$qqrrpvi][5022] 00408181 [System.pas][System][DynArraySetLength$qqrrpvpvipi][36046] 00406E08 [System.pas][System][@LStrFromPWCharLen$qqrr27System.%AnsiStringT$us$i0$%pbius][26213] <-- ?!?! 004082C2 [System.pas][System][@DynArraySetLength$qqrv][36150] 004265D9 7732FA29 [BaseThreadInitThunk] 77E87C7E [RtlGetAppContainerNamedObjectPath] 77E87C4E [RtlGetAppContainerNamedObjectPath] My guess would be that the: PUSH ESP ADD dword ptr [ESP],4 in _DynArraySetLength is not properly handled by the stack walking code. Obviously, a defect in JclDebug LogStackTrace in FastMM_FullDebugMode.dpr - compiling FastMM_FullDebugMode.dll with madExcept gives this call stack as it just passes to madStackTrace.FastMM_LogStackTrace which seems to do a better job: 004128b5 +015 Project53.exe FastMM5 7717 +4 FastMM_DebugGetMem 00404767 +03f Project53.exe System 5022 +91 @ReallocMem 0040817c +16c Project53.exe System 36046 +70 DynArraySetLength 004082bd +005 Project53.exe System 36150 +3 @DynArraySetLength 7732fa27 +017 KERNEL32.DLL BaseThreadInitThunk Probably the bug is in GetRawStackTrace and madExcept is just able to fix it - I am not entirely sure - as far as I can see there is no method in JclDebug that can build a call stack from the data that FastMM_FullDebugMode GetRawStackTrace collects - that is why LogStackTrace simply iterates the entries and calls GetLocationInfo on each of them and just concats them. @Pierre le Riche fyi Edited April 15, 2021 by Stefan Glienke 2 Share this post Link to post
Hafedh TRIMECHE 1 Posted April 16, 2021 8 hours ago, Stefan Glienke said: @Remy Lebeau The LStrFromPWCharLen entry indeed looks wrong - but the lines before and after that exactly match with the 10.3.3 code - I just checked. Maybe it's just a glitch in the stack walking code done by FastMM_FullDebugMode.dll Edit: Yep - I can repro the wrong entry in the call stack with this code: uses FastMM5, System.SysUtils; var x: TBytes; s: string; r: RawByteString; begin s := 'test'; r := RawByteString(s); SetLength(x, 10); Pointer(x) := nil; ReportMemoryLeaksOnShutdown := True; end. and get this report: This block was allocated by thread 0x432C, and the stack trace (return addresses) at the time was: 004128BA [FastMM5.pas][FastMM5][FastMM_DebugGetMem$qqri][7717] 0040476D [System.pas][System][@ReallocMem$qqrrpvi][5022] 00408181 [System.pas][System][DynArraySetLength$qqrrpvpvipi][36046] 00406E08 [System.pas][System][@LStrFromPWCharLen$qqrr27System.%AnsiStringT$us$i0$%pbius][26213] <-- ?!?! 004082C2 [System.pas][System][@DynArraySetLength$qqrv][36150] 004265D9 7732FA29 [BaseThreadInitThunk] 77E87C7E [RtlGetAppContainerNamedObjectPath] 77E87C4E [RtlGetAppContainerNamedObjectPath] My guess would be that the: PUSH ESP ADD dword ptr [ESP],4 in _DynArraySetLength is not properly handled by the stack walking code. Obviously, a defect in JclDebug LogStackTrace in FastMM_FullDebugMode.dpr - compiling FastMM_FullDebugMode.dll with madExcept gives this call stack as it just passes to madStackTrace.FastMM_LogStackTrace which seems to do a better job: 004128b5 +015 Project53.exe FastMM5 7717 +4 FastMM_DebugGetMem 00404767 +03f Project53.exe System 5022 +91 @ReallocMem 0040817c +16c Project53.exe System 36046 +70 DynArraySetLength 004082bd +005 Project53.exe System 36150 +3 @DynArraySetLength 7732fa27 +017 KERNEL32.DLL BaseThreadInitThunk Probably the bug is in GetRawStackTrace and madExcept is just able to fix it - I am not entirely sure - as far as I can see there is no method in JclDebug that can build a call stack from the data that FastMM_FullDebugMode GetRawStackTrace collects - that is why LogStackTrace simply iterates the entries and calls GetLocationInfo on each of them and just concats them. @Pierre le Riche fyi The leak is generated by a bad usage of SSL routine. The result of Base64Decode (TBytes=Blob) is passed to d2i_OCSP_RESPONSE as a Pointer not double Pointer PPointer. Bad: PBlob := Blob; resp := d2i_OCSP_RESPONSE(nil,@PBlob,Length(Blob)); Correct: PBlob := @Blob; resp := d2i_OCSP_RESPONSE(nil,@PBlob,Length(Blob)); Best regards. Share this post Link to post
Fr0sT.Brutal 900 Posted April 16, 2021 16 hours ago, Hafedh TRIMECHE said: But it's at a higher risk: Result content would be lost when setting length to 0. You already set length to 0 on decode error. BTW, there's a catch in dynamic Result regarding performance. They have their previous state when entering a function so the array could be allocated already. When you call SetLength, and if FastMM decides the array must be reallocated (as it reserves some space after the end), it copies full contents of an old array to a new one. Which is useless anyway. OTOH, disposing the old array by setting length to 0 could also slow thing down because then allocation will happen at every call. Depending on how intensive the function call is, you might want to re-use the array without truncating its length and just return how much decoded data it contains. Share this post Link to post
Hafedh TRIMECHE 1 Posted April 16, 2021 (edited) The leak is generated by a bad usage of SSL routine. The result of Base64Decode (TBytes=Blob) is passed to d2i_OCSP_RESPONSE as a Pointer not double Pointer PPointer. Bad: PBlob := Blob; resp := d2i_OCSP_RESPONSE(nil,@PBlob,Length(Blob)); Correct: PBlob := @Blob; resp := d2i_OCSP_RESPONSE(nil,@PBlob,Length(Blob)); Thus, the content of TBytes is modified by changing it's prefix (RefCnt, Length, ...) not by reallocating it. Sorry for the inconvenience. Edited April 16, 2021 by Hafedh TRIMECHE Share this post Link to post
David Heffernan 2345 Posted April 16, 2021 37 minutes ago, Hafedh TRIMECHE said: The leak is generated by a bad usage of SSL routine. Are you saying that it was in the end nothing to do with your SetLength calls and was just a defect somewhere else in your code that corrupted the heap? 1 Share this post Link to post
Pierre le Riche 21 Posted April 17, 2021 (edited) On 4/15/2021 at 5:55 PM, Stefan Glienke said: My guess would be that the: PUSH ESP ADD dword ptr [ESP],4 in _DynArraySetLength is not properly handled by the stack walking code. Obviously, a defect in JclDebug LogStackTrace in FastMM_FullDebugMode.dpr - compiling FastMM_FullDebugMode.dll with madExcept gives this call stack as it just passes to madStackTrace.FastMM_LogStackTrace which seems to do a better job: Hi Stefan, The code that validates stack trace entries in FastMM_FullDebugMode.dpr was taken from TJclStackInfoList.ValidCallSite, with some minor modifications. (Credit is given in the comments.) I just checked and it is still the same, functionally, as the latest code in JCLDebug.pas. By default FastMM does a "raw" stack trace, meaning it just walks the stack in reverse, testing every dword to see whether it could potentially be the return address for a call instruction. It does this by checking whether the bytes just prior to the potential return address match one of the many potential opcodes for a call instruction. This is not 100% accurate, so you do get some false positives (as you have seen), but it works fairly well in general. The alternative would be to do a frame based stack trace, but then routines that do not set up a stack frame would not be listed at all. There's room for optimization here: The stack tracer could perhaps look at the actual address of the call instruction and compare that to the return address of the next call in the call stack. If the variance in code addresses is too large then the prior call is likely a false positive. Perhaps MadExcept does something like this, or some other smart tricks I am not aware of. I'll look around to see if there are ways to cut down on the false positives without impacting performance too much. I ran your test case under Delphi 10.4.2 and I got this stack trace: 0041320A [FastMM5.pas][FastMM5][FastMM_DebugGetMem$qqri][7717] 00404799 [System.pas][System][@ReallocMem$qqrrpvi][5035] 00408B4F [System.pas][System][DynArraySetLength$qqrrpvpvipi][36568] 00408CB6 [System.pas][System][@DynArraySetLength$qqrv][36672] 00427619 76F5FA29 [BaseThreadInitThunk] 77247A4E [RtlGetAppContainerNamedObjectPath] 77247A1E [RtlGetAppContainerNamedObjectPath] As you can see the call to LStrFromPWCharLen isn't there, so I think this confirms that it was just "noise" in the stack trace. Best regards, Pierre Edited April 17, 2021 by Pierre le Riche Share this post Link to post
Stefan Glienke 2002 Posted April 17, 2021 @Pierre le Riche It might work many times but when I saw this mentioned I remembered cases where I had seen leak reports with some weird looking call stacks in them which were kinda confusing as they were impossible (like in this case). I think simply walking up the stack and taking all addresses as a potential return address is the issue. This is the callstack when the code arrives in FastMM_DebugGetMem (10.3.3): and this is how the stack looks like: Looking at 00406E08 will find this: which is why LStrFromPWCharLen is being reported. So you are right - there should be a check if that potential return address is a valid one. Another thing I just noticed with the addresses in your vs the addresses reported from madExcept: madExcept reports the address of the call - you report the return address. IMO reporting the address of the call instruction is the better option - what do you think? Share this post Link to post
Pierre le Riche 21 Posted April 18, 2021 11 hours ago, Stefan Glienke said: Another thing I just noticed with the addresses in your vs the addresses reported from madExcept: madExcept reports the address of the call - you report the return address. IMO reporting the address of the call instruction is the better option - what do you think? I agree with you, but calculating the call address is only included in the cost when you perform a raw stack trace: Frame based traces (including the CaptureStackBackTrace API call) yield return addresses, so in order to get the call addresses for those you would need to dereference the return pointers (as you do with raw traces) in order to find the start of the call instruction. This would negate much of the performance advantage over raw traces. Before a stack trace address is passed to the JCL for conversion to unit and line number information, 1 is subtracted in order to ensure the address falls inside the call instruction. Consequently the unit/line information corresponds to the call, and the listed address is the return address. Not ideal, but I find that I very rarely actually look at the address so it has never really bothered me. Share this post Link to post
Stefan Glienke 2002 Posted April 18, 2021 That work only needs to be done when the report is being generated and I think that is not a point where raw speed matters that much but accuracy - on collecting the stack trace I agree that should be as fast as possible. I think I have seen a Jcl function that returns the address of the call instruction which then could be used to correct the displayed one to match the call instruction. Anyhow if you are not able to correct this we still have the option to use madExcept for the FullDebugMode dll which I will consider using 🤷♂️ Share this post Link to post
Pierre le Riche 21 Posted April 18, 2021 3 minutes ago, Stefan Glienke said: That work only needs to be done when the report is being generated If the call was made by a DLL that was unloaded prior to the report being generated it won't be possible to convert the return address to call address after the fact. But I take your point - in most cases it'll work. If it's a big issue for you I can put it on my to-do list. 13 minutes ago, Stefan Glienke said: I think I have seen a Jcl function that returns the address of the call instruction which then could be used to correct the displayed one to match the call instruction. TJclStackInfoList.ValidCallSite does that - it returns the size of the call instruction given a return address. Share this post Link to post
Stefan Glienke 2002 Posted April 18, 2021 2 minutes ago, Pierre le Riche said: If the call was made by a DLL that was unloaded prior to the report being generated it won't be possible to convert the return address to call address after the fact. But I take your point - in most cases it'll work. If it's a big issue for you I can put it on my to-do list. Fair point - I've seen some glitches with unloaded DLL in the past - fortunately, we don't explicitly unload our own DLLs that were loaded with LoadLibrary. It's not a big issue, don't spend your valuable time on that - since we have a madExcept license anyway I will try out fulldebugmode.dll using that. Share this post Link to post
pyscripter 689 Posted April 18, 2021 FYI Added stCleanRawStack JclStackTrackingOptions option that eliminates … · project-jedi/jcl@4867c10 (github.com) Share this post Link to post