Jan Rysavy
Posted May 23, 2023

15 hours ago, Anders Melander said:
    StrmTbl::~StrmTbl
    Lots of calls to this.

Just one note regarding the 'wt' command output. There is only one call to StrmTbl::~StrmTbl (etc.); see the summary in the dump:

    Function Name                  Invocations MinInst MaxInst AvgInst
    msdia140!StrmTbl::~StrmTbl               1     597     597     597

'wt' output explained: https://learn.microsoft.com/en-us/windows-hardware/drivers/debugger/wt--trace-and-watch-data-

LocalFree is simply called in a loop:

    28     0 [ 7] msdia140!StrmTbl::~StrmTbl
     3     0 [ 8]   msdia140!operator delete
     1     0 [ 8]   KERNEL32!LocalFreeStub
    23     0 [ 8]   KERNELBASE!LocalFree
    40    27 [ 7] msdia140!StrmTbl::~StrmTbl
     3     0 [ 8]   msdia140!operator delete
     1     0 [ 8]   KERNEL32!LocalFreeStub
    23     0 [ 8]   KERNELBASE!LocalFree
    52    54 [ 7] msdia140!StrmTbl::~StrmTbl
     3     0 [ 8]   msdia140!operator delete
     1     0 [ 8]   KERNEL32!LocalFreeStub
    23     0 [ 8]   KERNELBASE!LocalFree
Anders Melander
Posted May 23, 2023

14 minutes ago, Jan Rysavy said:
    'wt' output explained: https://learn.microsoft.com/en-us/windows-hardware/drivers/debugger/wt--trace-and-watch-data-

Ah, I see. So 'wt' is a command you give to the debugger, and it traces all calls made in the call tree? I thought the trace was something that msdia140.dll produced on its own.
Jan Rysavy
Posted May 23, 2023

Yes, exactly. I'm using WinDbg Preview from https://apps.microsoft.com/store/detail/windbg-preview/9PGJGD53TN86

Set a breakpoint:

    bp msdia140!CDiaDataSource::loadDataFromPdb

Run the 'wt' trace command:

    wt -m msdia140 -oR

Btw, see the attached dumps. Both use msdia140.dll version 14.36.32532.0. msdia140_14_36_32532_error.txt is from loading a PDB produced by MAP2PDB, while msdia140_14_36_32532_success.txt is from loading the PDB of a simple MSVC application.

Look at the difference in the msdia140!MSF_HB::checkInvariants return value: rax = 0 vs rax = 1.

msdia140_diff.zip
Anders Melander
Posted May 23, 2023

25 minutes ago, Jan Rysavy said:
    Look at difference in msdia140!MSF_HB::checkInvariants return value rax = 0 vs rax = 1.

Yes, interesting; that is probably the place where it goes wrong.

Unfortunately, I have no idea what "Invariant" refers to. Assuming the problem lies in the MSF format (which is just a container format, a mini internal file system), I can't see anything in the MSF format that could be referred to as variant/invariant.

I have reviewed the code and as far as I can tell I'm completely blocksize agnostic; nowhere do I assume a blocksize of 4096. I have tried changing the blocksize to 8192 but that just makes the old VTune choke on the file, and the new one still can't read it.

I will now try to see if the new VTune can work with the PDB files that ship with the old VTune (they have a block size of 4096). If it can, then the problem is with map2pdb (i.e. I'm doing something wrong). If it can't, then the PDB format has changed and I'm f*cked because there's no way to know what changed. The first time around I reverse-engineered some of the format by examining the files in a hex editor and I'm not doing that again.
Stefan Glienke
Posted May 23, 2023

FWIW: https://github.com/microsoft/microsoft-pdb/blob/master/PDB/msf/msf.cpp#L1385
Anders Melander
Posted May 23, 2023

8 minutes ago, Anders Melander said:
    I will now try to see if the new VTune can work with the PDB files that ship with the old VTune

It can.
Anders Melander
Posted May 23, 2023 (edited)

11 minutes ago, Stefan Glienke said:
    FWIW: https://github.com/microsoft/microsoft-pdb/blob/master/PDB/msf/msf.cpp#L1385

I want a sex change operation so I can have your children.

(it means "thank you" in case you wondered)

Edited May 23, 2023 by Anders Melander
Attila Kovacs
Posted May 23, 2023

this is a different checkInvariants
Stefan Glienke
Posted May 23, 2023 (edited)

12 minutes ago, Attila Kovacs said:
    this is a different checkInvariants

No, it's not - it's the call from this line: https://github.com/microsoft/microsoft-pdb/blob/master/PDB/msf/msf.cpp#L1627

It might, however, look different today, given that the source on GitHub is from seven years ago, but it might still give a clue.

13 minutes ago, Anders Melander said:
    I want a sex change operation so I can have your children. (it means "thank you" in case you wondered)

The weirdest way I got a "thank you" ever ngl

Edited May 23, 2023 by Stefan Glienke
Anders Melander
Posted May 23, 2023 (edited)

So I'm looking at checkInvariants in msf.cpp and there's this comment:

Quote
    check that every page is either free, freed, or in use in exactly one stream

MSF is the container format for a PDB file. The MSF format is pretty much a mini FAT file system, the files being the different PDB tables: source file names, line numbers, symbols, etc.

Internally the MSF file system is divided into intervals. Each interval contains <blocksize> blocks and each block is <blocksize> bytes long. The blocksize used to be 4096. Now it can apparently also be 8192.

At the start of each interval, there are two blocks (the Free Page Map) that contain a bitmap of allocated blocks in the interval. Allocated as in "in use" by a stream.

A stream is just a collection of blocks. All streams are listed in the stream directory. The stream directory is stored in one or more blocks.

At the start of the file is the superblock. It contains various info about the file: blocksize, index of the first Free Page Map, the number of blocks in the file, a pointer to the stream directory, etc.

MSF Block and Interval layout

    Interval    |                          0                           |                    1
    ------------+------------+------+------+------+------+------+------+------+------+------+------+------+- - -
    Block index |          0 |    1 |    2 |    3 |    4 |  ... | 4095 | 4096 | 4097 | 4098 | 4099 | 4100 | ...
    ------------+------------+------+------+------+------+------+------+------+------+------+------+------+- - -
    Log block   |    0 (N/A) |  N/A |  N/A |    1 |    2 |  ... |  ... | 4094 |  N/A |  N/A | 4095 |  ... | ...
    ------------+------------+------+------+------+------+------+------+------+------+------+------+------+- - -
    Phys offset |          0 | 4096 | 8192 |12288 |16384 |  ... |  ... |4096^2|+4096 |+8192 |  ... |  ... | ...
    ------------+------------+------+------+------+------+------+------+------+------+------+------+------+- - -
    Content     | Superblock | FPM1 | FPM2 | Data | Data | Data | Data | Data | FPM1 | FPM2 | Data | Data | Data

So in theory some of the blocks in an interval can be in use and some of them can be free, and those that are in use should be referenced in the higher-level stream's index of MSF blocks - otherwise, they are "leaked". I believe this is what checkInvariants verifies.

Now, since I'm writing the PDB file in one go and never need to go back and free or modify an already allocated MSF block, I always mark all blocks as allocated when I start a new interval in the file. This means that I can, and most likely will, end up with blocks marked as allocated in the bitmap but not actually in use (or physically present in the file).

    procedure TBinaryBlockWriter.WriteBlockMap;
    const
      NoVacancies: Byte = $FF;
    begin
      BeginBlock;
      Assert((BlockIndex mod FBlockSize) in [1, 2]);

      // Mark all BlockSize*8 blocks occupied
      for var i := 0 to FBlockSize-1 do
        FStream.WriteBuffer(NoVacancies, 1);

      FStreamSize := Max(FStreamSize, FStream.Position);

      EndBlock(True);
    end;

So why wasn't this a problem with the old version? In the old Microsoft source, checkInvariants is only active in debug builds, so my guess is that the old version simply doesn't perform this validation.

Anyway, it's the best guess I have right now so it should be pursued. I'm not sure when I will get time to do so though.

Edited May 23, 2023 by Anders Melander
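A minimal sketch of the block arithmetic described in the post above, in Python for brevity. It assumes the layout exactly as described there (Free Page Map blocks at index 1 and 2 of every interval, one bit per block). The helper names are made up for illustration and are not taken from map2pdb or the Microsoft source.

```python
BLOCK_SIZE = 4096  # classic MSF block size; 8192 is the other value mentioned above


def interval_of(block_index, block_size=BLOCK_SIZE):
    """Interval that a physical block belongs to (block_size blocks per interval)."""
    return block_index // block_size


def is_fpm_block(block_index, block_size=BLOCK_SIZE):
    """True for the two Free Page Map blocks of an interval."""
    return block_index % block_size in (1, 2)


def physical_offset(block_index, block_size=BLOCK_SIZE):
    """Byte offset of a physical block in the file."""
    return block_index * block_size


def blocks_per_fpm_block(block_size=BLOCK_SIZE):
    """One bit per block, so a single FPM block can describe block_size * 8 blocks."""
    return block_size * 8


def content_of(block_index, block_size=BLOCK_SIZE):
    """Label a physical block the way the layout table does."""
    if block_index == 0:
        return "Superblock"
    position = block_index % block_size
    if position == 1:
        return "FPM1"
    if position == 2:
        return "FPM2"
    return "Data"
```

For example, `content_of(4097)` yields `"FPM1"` and `physical_offset(4096)` equals 4096 * 4096, which lines up with the "4096^2" and "+4096" cells of the table.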
Jan Rysavy
Posted May 23, 2023 (edited)

CONFIRMED! A patched version of amplxe_msdia140!MSF_HB::checkInvariants (version 14.34.31942.0 from VTune 2023.1) works fine with MAP2PDB PDBs.

amplxe_msdia140.zip

Edited May 23, 2023 by Jan Rysavy
Anders Melander
Posted May 23, 2023

40 minutes ago, Jan Rysavy said:
    CONFIRMED!

Wow. Thanks!
Anders Melander
Posted May 23, 2023 (edited)

4 hours ago, Anders Melander said:
    I'm not sure when I will get time to do so though.

Of course I couldn't help myself.

The Bugfix/LargePDB branch contains the changes so the Free Page Map now contains the correct values. Unfortunately, that didn't solve the problem.

Edited May 23, 2023 by Anders Melander
Jan Rysavy
Posted May 24, 2023 (edited)

It looks like the checkInvariants function is more complicated in the current msdia140.dll version (14.36.32532.0) than in the GitHub source code: https://github.com/microsoft/microsoft-pdb/blob/805655a28bd8198004be2ac27e6e0290121a5e89/PDB/msf/msf.cpp#L1385

I compiled MAP2PDB from your Bugfix/LargePDB branch, exported a PDB and debugged through checkInvariants. Green/red boxes are executed code. See checkInvariants.zip for details.

Neither loop is repeated in the case of my simple PDB. loc_1800FCE71 is where checkInvariants fails.

Edit: tested on an MSVC PDB where checkInvariants succeeds. Exactly the same code is executed; the first difference is in loc_1800FCE71, where the MSVC PDB has al = 0x00:

MAP2PDB PDB

    00007ffa`70fece71 8bc2            mov     eax, edx                     eax = 0x00000053
    00007ffa`70fece73 83e03f          and     eax, 3Fh                     eax = 0x00000013
    00007ffa`70fece76 0fb6c8          movzx   ecx, al                      ecx = 0x00000013
    00007ffa`70fece79 488bc2          mov     rax, rdx                     rax = 0x0000000000000053
    00007ffa`70fece7c 48c1e806        shr     rax, 6                       rax = 0x0000000000000001
    00007ffa`70fece80 498b04c6        mov     rax, qword ptr [r14+rax*8]   rax = 0x00000000001fffff
    00007ffa`70fece84 480fa3c8        bt      rax, rcx
    00007ffa`70fece88 0f92c0          setb    al                           al = 0x01
    00007ffa`70fece8b 84c0            test    al, al
    00007ffa`70fece8d 0f8507020000    jne     msdia140!MSF_HB::checkInvariants+0x42a (7ffa70fed09a)

MSVC PDB

    00007ffa`6769ce71 8bc2            mov     eax, edx                     eax = 0x00000420
    00007ffa`6769ce73 83e03f          and     eax, 3Fh                     eax = 0x00000020
    00007ffa`6769ce76 0fb6c8          movzx   ecx, al                      ecx = 0x00000020
    00007ffa`6769ce79 488bc2          mov     rax, rdx                     rax = 0x0000000000000420
    00007ffa`6769ce7c 48c1e806        shr     rax, 6                       rax = 0x0000000000000010
    00007ffa`6769ce80 498b04c6        mov     rax, qword ptr [r14+rax*8]   rax = 0xfffffff800000760
    00007ffa`6769ce84 480fa3c8        bt      rax, rcx
    00007ffa`6769ce88 0f92c0          setb    al                           al = 0x00
    00007ffa`6769ce8b 84c0            test    al, al
    00007ffa`6769ce8d 0f8507020000    jne     msdia140!MSF_HB::checkInvariants+0x42a (7ffa6769d09a)

checkInvariants.zip

Edited May 24, 2023 by Jan Rysavy
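The traced instructions in the post above amount to a single-bit lookup in an array of 64-bit words. Here is a rough Python sketch of that lookup, fed with the register values from the two traces. It assumes rdx holds a page (block) number and r14 points at a bitmap, which the trace suggests but does not prove. The function and variable names are made up.

```python
def bit_is_set(words, page):
    """Mirror of the trace: 'shr 6' selects the qword, 'and 3Fh' selects the bit."""
    word = words[page >> 6]              # mov rax, qword ptr [r14+rax*8]
    return (word >> (page & 0x3F)) & 1   # bt rax, rcx / setb al


# Only the single qword visible in each trace is known, so each bitmap is
# modelled as a sparse dict keyed by qword index.
map2pdb_words = {0x01: 0x00000000001FFFFF}
msvc_words = {0x10: 0xFFFFFFF800000760}

map2pdb_al = bit_is_set(map2pdb_words, 0x53)   # page number taken from edx
msvc_al = bit_is_set(msvc_words, 0x420)
```

With these inputs the sketch gives 1 for the MAP2PDB case and 0 for the MSVC case, the same as the al values in the traces. The jne that follows is taken only when the bit is set.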
Jan Rysavy
Posted May 24, 2023 (edited)

The latest llvm-pdbutil (16.0.4) "dump --all" returns the following error on a MAP2PDB PDB:

    Unexpected error processing modules: PDB does not contain the requested image section header type

Is that normal?

Edited May 24, 2023 by Jan Rysavy
Anders Melander
Posted May 24, 2023

6 minutes ago, Jan Rysavy said:
    Is that normal?

No, not as far as I remember. I don't have the map2pdb project on the system I'm on right now, but when I get home I will verify that the older version of llvm-pdbutil (the version I targeted with the YAML output) doesn't fail like that.
Anders Melander
Posted May 24, 2023

3 hours ago, Anders Melander said:
    I will verify that the older version of llvm-pdbutil (the version I targeted with the YAML output) doesn't fail like that.

llvm-pdbutil.exe version 9.0.0.0 doesn't fail with the new PDBs. However, with the large blocksize (map2pdb -blocksize:8192 ...) it fails as expected (it's documented to only support a blocksize of 4096):

Quote
    llvm-pdbutil: The data is in an unexpected format. Unsupported block size.

4 hours ago, Jan Rysavy said:
    Latest llvm-pdbutil (16.0.4) "dump --all" returns following error on MAP2PDB PDB: Unexpected error processing modules: PDB does not contain the requested image section header type

The source of that error message is here: https://github.com/llvm/llvm-project/blob/f74bb326949aec0cac8e54ff00fc081f746ff35d/llvm/tools/llvm-pdbutil/DumpOutputStyle.cpp#L395

It would have been nice if the message specified which header type it was looking for, but in any case it's a bit suspicious that it mentions "image section header", because that's a thing in the PE image and the PDB doesn't contain any PE-related stuff. It could just be a bad choice of words.

Anyway, moving along. The error message is produced by loadSectionHeaders(type). loadSectionHeaders(type) is called from dumpSectionHeaders(type), which is called from dumpSectionHeaders() with type=SectionHdr and type=SectionHdrOrig.

I have neither type implemented but I have a comment in my source about SectionHdr:

    function TDebugInfoPdbWriter.EmitDBISubstreamDebugHeader(Writer: TBinaryBlockWriter): Cardinal;
    begin
      Result := Writer.Position;

      // We don't provide any of the optional debug streams yet
      for var HeaderType := Low(PDBDbgHeaderType) to High(PDBDbgHeaderType) do
        Writer.Write(Word(TMSFStream.NullStreamIndex));

      // TODO : I believe the SectionHdr stream contains the segment names
      (*
      for var Segment in FDebugInfo.Segments do
      begin
        xxx.Add(Segment.SegClassName);
        xxx.Add(Segment.Name);
      end;
      *)

      Result := Writer.Position - Result;
    end;

Since these streams are optional, it's pretty stupid of llvm-pdbutil to bug out if they aren't present. The old llvm-pdbutil didn't read them (which is probably why I didn't know the format or content of the streams), but since the new one does, it should be possible to deduce the format from its source. I'm just concerned that this might be a completely wasted effort if msdia140.dll is failing on something else.
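As an illustration of what the Delphi routine above emits, here is a small Python sketch of a debug header that is just a list of 16-bit stream indices, with a "null" index for every stream that is not provided. Two things are assumptions and not confirmed by the thread: the null index value of 0xFFFF, and the list of header types, which follows LLVM's DbgHeaderType enum as I recall it (only SectionHdr and SectionHdrOrig are named in the post).

```python
import struct

NULL_STREAM_INDEX = 0xFFFF  # assumption: marker for "stream not present"

# Assumption: names and order as in LLVM's DbgHeaderType enum.
DBG_HEADER_TYPES = [
    "FPO", "Exception", "Fixup", "OmapToSrc", "OmapFromSrc", "SectionHdr",
    "TokenRidMap", "Xdata", "Pdata", "NewFPO", "SectionHdrOrig",
]


def emit_debug_header(streams=None):
    """Pack one little-endian uint16 per header type; missing streams get the null index."""
    streams = streams or {}
    indices = [streams.get(name, NULL_STREAM_INDEX) for name in DBG_HEADER_TYPES]
    return struct.pack("<%dH" % len(indices), *indices)


empty_header = emit_debug_header()                        # what the routine above writes
with_sections = emit_debug_header({"SectionHdr": 7})      # hypothetical stream index 7
```

A reader that treats the null index as "not present" can skip those entries. A reader that insists on SectionHdr would fail on `empty_header`, which is roughly the behaviour reported for llvm-pdbutil 16.0.4.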
Jan Rysavy
Posted May 24, 2023

OK, it looks like a different problem. Do you have any idea what they are testing in checkInvariants / loc_1800FCE71?
Anders Melander
Posted May 24, 2023

2 hours ago, Jan Rysavy said:
    Do you have any idea what they are testing in checkInvariants / loc_1800FCE71?

No. I can't figure it out.
Jan Rysavy
Posted May 24, 2023 (edited)

Another idea: I found which part of the PDB they are reading in loc_1800FCE71, instruction mov rax, qword ptr [r14+rax*8]. It is at offset 0x1008 of the attached PDB, see TestMAP2PDB.zip. Does it help?

Edit: Sorry, I marked two QWORDs in the memory dump.

TestMAP2PDB.zip

Edited May 24, 2023 by Jan Rysavy
Anders Melander
Posted May 24, 2023

7 minutes ago, Jan Rysavy said:
    It is on offset 0x1008 of attached PDB, see TestMAP2PDB.zip. Does it help?

Yes, it definitely should. I'm just too tired to dissect the file in a hex editor right now 🙂 Maybe later tonight.
Attila Kovacs
Posted May 24, 2023

if it's compiled in debug mode, did they ship their own pdb? 😉
Jan Rysavy
Posted May 24, 2023

21 minutes ago, Attila Kovacs said:
    if it's compiled in debug mode, did they ship the own pdb? 😉

I don't think msdia140.dll is compiled in debug mode. The PDB is available on the Microsoft symbol servers.
Anders Melander
Posted May 24, 2023

2 hours ago, Jan Rysavy said:
    Another idea: I found what part of PDB they are reading in loc_1800FCE71, instruction mov rax, qword ptr [r14+rax*8]. It is on offset 0x1008 of attached PDB, see TestMAP2PDB.zip. Does it help?

As far as I can tell it's the Free Page Map. This matches the fact that the asm is doing bit tests.

If it's blocking the bitmap into qwords then that might be the problem. I'm blocking it into bytes. To mark block 1 as allocated I would set bit 0 of the first byte, block 9 would be bit 0 of the second byte, etc. It could be that I should be using another bit layout. It could also be that I'm simply not marking all the blocks allocated that I should - or too many.
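One way to narrow down the qword-versus-byte question raised above is to compare the two addressing schemes directly. The Python sketch below assumes the bitmap bytes are read from the file unchanged and that each qword is loaded little-endian, as an x64 `mov rax, qword ptr [...]` does. The function names are illustrative only.

```python
import struct


def bit_bytewise(bitmap, n):
    """Bit n addressed as byte n // 8, bit n % 8."""
    return (bitmap[n >> 3] >> (n & 7)) & 1


def bit_qwordwise(bitmap, n):
    """Bit n addressed as little-endian qword n // 64, bit n % 64 (as in the trace)."""
    (word,) = struct.unpack_from("<Q", bitmap, (n >> 6) * 8)
    return (word >> (n & 0x3F)) & 1


# Set a few arbitrary bits with byte addressing, then read every bit both ways.
bitmap = bytearray(16)
for n in (0, 9, 0x53):
    bitmap[n >> 3] |= 1 << (n & 7)

same_everywhere = all(
    bit_bytewise(bitmap, n) == bit_qwordwise(bitmap, n)
    for n in range(len(bitmap) * 8)
)
```

Under those assumptions the two schemes agree for every bit, so grouping into qwords would not by itself change the layout. Which block number maps to which bit index, and whether a set bit means allocated or free, are separate questions that this sketch does not settle.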
Stefan Glienke
Posted May 24, 2023

8 minutes ago, Anders Melander said:
    To mark block 1 as allocated I would set bit 0 of the first byte, block 9 would be bit 0 of the second byte, etc. It could be that I should be using another bit-layout.

Could the documentation be of any help?