
Anders Melander

Members
  • Content Count: 2312
  • Joined
  • Last visited
  • Days Won: 119

Everything posted by Anders Melander

  1. Anders Melander

    MAP2PDB - Profiling with VTune

    I can't see how. Is there anything in particular you're hinting at? I've read that document many, many times. It's a good starting point but it's very incomplete and, in some cases, plain wrong.
  2. Anders Melander

    MAP2PDB - Profiling with VTune

    As far as I can tell it's the Free Page Map. This matches with the fact that the asm is doing bit tests. If it's blocking the bitmap into qwords then that might be the problem. I'm blocking it into bytes. To mark block 1 as allocated I would set bit 0 of the first byte, block 9 would be bit 0 of the second byte, etc. It could be that I should be using another bit-layout. It could also be that I'm simply not marking all the blocks allocated that I should - or too many.
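    The byte-oriented layout described above can be sketched like this (Python, purely for illustration - map2pdb itself is Delphi, and whether the reader expects this exact bit order is precisely the open question):

```python
def fpm_mark_allocated(fpm: bytearray, block: int) -> None:
    # Byte-oriented Free Page Map as described above:
    # block 1 -> bit 0 of byte 0, block 9 -> bit 0 of byte 1, etc.
    byte_index, bit_index = divmod(block - 1, 8)
    fpm[byte_index] |= 1 << bit_index

fpm = bytearray(2)
fpm_mark_allocated(fpm, 1)  # sets bit 0 of the first byte
fpm_mark_allocated(fpm, 9)  # sets bit 0 of the second byte
```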
  3. Anders Melander

    MAP2PDB - Profiling with VTune

    Yes, it definitely should. I'm just too tired to dissect the file in a hex editor right now 🙂 Maybe later tonight.
  4. Anders Melander

    MAP2PDB - Profiling with VTune

    No. I can't figure it out.
  5. Anders Melander

    MAP2PDB - Profiling with VTune

    llvm-pdbutil.exe version 9.0.0.0 doesn't fail with the new PDBs. However, with the large blocksize (map2pdb -blocksize:8192 ...) it fails as expected (it's documented to only support a blocksize of 4096):

    The source of that error message is here: https://github.com/llvm/llvm-project/blob/f74bb326949aec0cac8e54ff00fc081f746ff35d/llvm/tools/llvm-pdbutil/DumpOutputStyle.cpp#L395

    It would have been nice if the message specified what header type it was looking for. In any case, it's a bit suspicious that it mentions "image section header" because that's a thing in the PE image and the PDB doesn't contain any PE related stuff. It could just be a bad choice of words. Anyway, moving along.

    So the error message is produced by loadSectionHeaders(type). loadSectionHeaders(type) is called from dumpSectionHeaders(type), which is called from dumpSectionHeaders() with type=SectionHdr and type=SectionHdrOrig. I have neither type implemented, but I have a comment in my source about SectionHdr:

    function TDebugInfoPdbWriter.EmitDBISubstreamDebugHeader(Writer: TBinaryBlockWriter): Cardinal;
    begin
      Result := Writer.Position;

      // We don't provide any of the optional debug streams yet
      for var HeaderType := Low(PDBDbgHeaderType) to High(PDBDbgHeaderType) do
        Writer.Write(Word(TMSFStream.NullStreamIndex));

      // TODO : I believe the SectionHdr stream contains the segment names
      (*
      for var Segment in FDebugInfo.Segments do
      begin
        xxx.Add(Segment.SegClassName);
        xxx.Add(Segment.Name);
      end;
      *)

      Result := Writer.Position - Result;
    end;

    Since these streams are optional it's pretty stupid of llvm-pdbutil to bug out if they aren't present. The old llvm-pdbutil didn't read them (which is probably why I didn't know the format or content of the streams), but since the new one does, it should be possible to deduce the format from its source. I'm just concerned that this might be a completely wasted effort if msdia140.dll is failing on something else.
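    For reference, the byte layout the EmitDBISubstreamDebugHeader routine above emits can be sketched in Python. The entry count of 11 and the 0xFFFF "no stream" sentinel are taken from the LLVM PDB documentation for the DBI optional debug header; the Python names are otherwise my own:

```python
import struct

INVALID_STREAM_INDEX = 0xFFFF  # counterpart of TMSFStream.NullStreamIndex
NUM_DBG_HEADER_TYPES = 11      # FPO .. original section headers, per LLVM docs

def emit_dbi_optional_dbg_header() -> bytes:
    # One little-endian 16-bit stream index per optional debug stream,
    # all marked "not present" - the equivalent of the Delphi loop.
    indices = [INVALID_STREAM_INDEX] * NUM_DBG_HEADER_TYPES
    return struct.pack(f"<{NUM_DBG_HEADER_TYPES}H", *indices)

blob = emit_dbi_optional_dbg_header()
```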
  6. Anders Melander

    MAP2PDB - Profiling with VTune

    No, not as far as I remember. I don't have the map2pdb project on the system I'm on right now, but when I get home I will verify that the older version of llvm-pdbutil (the version I targeted with the YAML output) doesn't fail like that.
  7. Anders Melander

    MAP2PDB - Profiling with VTune

    Of course I couldn't help myself. The Bugfix/LargePDB branch contains the changes so the Free Page Map now contains the correct values. Unfortunately, that didn't solve the problem.
  8. Anders Melander

    MAP2PDB - Profiling with VTune

    Wow. Thanks!
  9. Anders Melander

    MAP2PDB - Profiling with VTune

    So I'm looking at checkInvariants in msf.cpp and there's this comment:

    MSF is the container format for a PDB file. The MSF format is pretty much a mini FAT file system, the files being the different PDB tables: source file names, line numbers, symbols, etc.

    Internally the MSF file system is divided into intervals. Each interval contains <blocksize> blocks and each block is <blocksize> bytes long. The blocksize used to be 4096; now it can apparently also be 8192. At the start of each interval there are two blocks (the Free Page Map) that contain a bitmap of allocated blocks in the interval - allocated as in "in use" by a stream. A stream is just a collection of blocks. All streams are listed in the stream directory, and the stream directory is itself stored in one or more blocks. At the start of the file is the superblock. It contains various info about the file: blocksize, index of the first Free Page Map, the number of blocks in the file, a pointer to the stream directory, etc.

    MSF Block and Interval layout (blocksize=4096):

    Interval    |                         0                        |                  1
    ------------+------------+------+------+------+------+------+------+------+------+------+------+- - -
    Block index |      0     |   1  |   2  |   3  |   4  |  ... | 4095 | 4096 | 4097 | 4098 | 4099 | ...
    ------------+------------+------+------+------+------+------+------+------+------+------+------+- - -
    Log block   |   0 (N/A)  |  N/A |  N/A |   1  |   2  |  ... | 4093 | 4094 |  N/A |  N/A | 4095 | ...
    ------------+------------+------+------+------+------+------+------+------+------+------+------+- - -
    Phys offset |      0     | 4096 | 8192 |12288 |16384 |  ... |  ... |4096^2|+4096 |+8192 |  ... | ...
    ------------+------------+------+------+------+------+------+------+------+------+------+------+- - -
    Content     | Superblock | FPM1 | FPM2 | Data | Data | Data | Data | Data | FPM1 | FPM2 | Data | ...

    So in theory some of the blocks in an interval can be in use and some of them can be free, and those that are in use should be referenced in the higher level stream's index of MSF blocks - otherwise they are "leaked". I believe this is what checkInvariants verifies.

    Now, since I'm writing the PDB file in one go and never have a need to go back and free or modify an already allocated MSF block, I always mark all blocks as allocated when I start a new interval in the file. This means that I can, and most likely will, end up with blocks marked as allocated in the bitmap but not actually in use (or even physically present in the file).

    procedure TBinaryBlockWriter.WriteBlockMap;
    const
      NoVacancies: Byte = $FF;
    begin
      BeginBlock;
      Assert((BlockIndex mod FBlockSize) in [1, 2]);

      // Mark all BlockSize*8 blocks occupied
      for var i := 0 to FBlockSize-1 do
        FStream.WriteBuffer(NoVacancies, 1);

      FStreamSize := Max(FStreamSize, FStream.Position);

      EndBlock(True);
    end;

    So why wasn't this a problem with the old version? In the old Microsoft source checkInvariants is only active in debug builds, so my guess is that the old version simply doesn't perform this validation. Anyway, it's the best guess I have right now, so it should be pursued. I'm not sure when I will get the time to do so, though.
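    To make the interval geometry concrete, here is a small Python sketch of the block math (illustrative only; it follows the Assert in WriteBlockMap, which places the FPM at blocks 1 and 2 of every interval):

```python
BLOCK_SIZE = 4096  # or 8192 for the new "big" MSF

def fpm_blocks(interval: int, block_size: int = BLOCK_SIZE):
    # The two Free Page Map blocks sit at indices 1 and 2 of each
    # interval, i.e. BlockIndex mod BlockSize is in [1, 2].
    base = interval * block_size
    return (base + 1, base + 2)

def phys_offset(block: int, block_size: int = BLOCK_SIZE) -> int:
    # Blocks are laid out linearly, so the physical file offset is
    # simply block index times block size.
    return block * block_size
```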
  10. Anders Melander

    MAP2PDB - Profiling with VTune

    I want a sex change operation so I can have your children. (it means "thank you" in case you wondered)
  11. Anders Melander

    MAP2PDB - Profiling with VTune

    It can.
  12. Anders Melander

    MAP2PDB - Profiling with VTune

    Yes, interesting; probably the place where it goes wrong. Unfortunately, I have no idea what "Invariant" refers to. Assuming the problem lies in the MSF format (which is just a container format - a mini internal file system), I can't see anything in the MSF format that could be referred to as variant/invariant. I have reviewed the code and as far as I can tell I'm completely blocksize agnostic; nowhere do I assume a blocksize of 4096. I have tried changing the blocksize to 8192 but that just makes the old VTune choke on the file, and the new one still can't read it. I will now try to see if the new VTune can work with the PDB files that ship with the old VTune (they have a blocksize of 4096). If it can, then the problem is with map2pdb (i.e. I'm doing something wrong). If it can't, then the PDB format has changed and I'm f*cked because there's no way to know what changed. The first time around I reverse-engineered some of the format by examining the files in a hex editor, and I'm not doing that again.
  13. Anders Melander

    MAP2PDB - Profiling with VTune

    Ah, I see. So the 'wt' is a command you give to the debugger and it traces all calls made in the call tree? I thought the trace was something that msdia140.dll produced on its own.
  14. Anders Melander

    set of object instances

    With a list of TEdits? Not likely. I would go for an encapsulated TList<T>:

    type
      TSetOfStuff<T> = class
      private
        FList: TList<T>; // kept sorted; required for BinarySearch
      public
        function Contains(const Value: T): boolean;
        function Add(const Value: T): integer;
        procedure Remove(const Value: T);
        function GetEnumerator: TEnumerator<T>;
      end;

    function TSetOfStuff<T>.Contains(const Value: T): boolean;
    begin
      var Index: integer;
      Result := FList.BinarySearch(Value, Index);
    end;

    function TSetOfStuff<T>.Add(const Value: T): integer;
    begin
      // When the value isn't found, BinarySearch returns the insertion
      // point in Result, so inserting there keeps the list sorted
      if (not FList.BinarySearch(Value, Result)) then
        FList.Insert(Result, Value);
    end;

    procedure TSetOfStuff<T>.Remove(const Value: T);
    begin
      var Index: integer;
      if (FList.BinarySearch(Value, Index)) then
        FList.Delete(Index);
    end;

    function TSetOfStuff<T>.GetEnumerator: TEnumerator<T>;
    begin
      Result := FList.GetEnumerator;
    end;

    etc...
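    For comparison, the same idea - a set stored as a sorted list with binary-search membership - looks like this in Python (illustrative only; it requires the values to be orderable, just as the TList<T>.BinarySearch calls above require a comparer):

```python
import bisect

class SetOfStuff:
    # A set kept as a sorted list; membership, insertion and removal
    # all use binary search, so lookups are O(log n).
    def __init__(self):
        self._items = []

    def contains(self, value) -> bool:
        i = bisect.bisect_left(self._items, value)
        return i < len(self._items) and self._items[i] == value

    def add(self, value) -> int:
        i = bisect.bisect_left(self._items, value)
        if i == len(self._items) or self._items[i] != value:
            self._items.insert(i, value)  # keeps the list sorted
        return i

    def remove(self, value) -> None:
        i = bisect.bisect_left(self._items, value)
        if i < len(self._items) and self._items[i] == value:
            del self._items[i]

    def __iter__(self):
        return iter(self._items)

s = SetOfStuff()
for v in (3, 1, 2, 2):  # the duplicate 2 is ignored
    s.add(v)
```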
  15. Anders Melander

    Fast Base64 encode/decode

    ...and no errors from the compiler if the asm is wrong.
  16. Anders Melander

    Unicode normalization

    I guess that's one way to remove the dependencies. I hadn't thought of that. But I would still have to clean up the remaining source and rewrite parts of it, so again: A fork. If I go that way I would probably prefer to simply start from the original, pre-JEDI, version of the source instead of trying to polish the turd it has become.
  17. Anders Melander

    MAP2PDB - Profiling with VTune

    Ew! It looks like they have done a complete rewrite. No wonder it's broken. So I guess this is an example of the main problem with the PDB format: Microsoft considers it their own internal format to do with what they like. They have their own (undocumented) writer, their own reader, and no documentation.
  18. Anders Melander

    MAP2PDB - Profiling with VTune

    Excellent! A few quick observations. First of all, it's strange that the error isn't logged; that would have made this a lot easier.

    StrmTbl::internalSerializeBigMsf
    Lots of calls to this. I'm guessing it's reading MSF blocks and deblocking them into linear memory streams. This is probably the new code that supports the 8192-byte "big" MSF block size.

    MSF_HB::load
    Probably the code that loads the PDB tables from the memory streams.

    StrmTbl::~StrmTbl
    Lots of calls to this. Probably clean-up after the load has been aborted.

    PortablePDB::PortablePDB
    Something is wrong here. "Portable PDB" is the .NET PDB format - a completely different file format. I'm guessing it's falling back to that format after failing to validate the file as PDB.
  19. Anders Melander

    MAP2PDB - Profiling with VTune

    If the block size being 4096 is the only problem (I somehow doubt that I'm that lucky) then this is the line that needs to be changed to write 8192-byte blocks: https://bitbucket.org/anders_melander/map2pdb/src/2341200827af24f7dd75cb695a668dfa9564bcf5/Source/debug.info.writer.pdb.pas#lines-225

    constructor TDebugInfoPdbWriter.Create;
    begin
      Create(4096);
    end;
  20. Anders Melander

    MAP2PDB - Profiling with VTune

    Neat. If you can spot where it gives up on the pdb file and returns an error that would be suuuuper nice. Does it produce any debug output while loading?
  21. Anders Melander

    MAP2PDB - Profiling with VTune

    Yeah... Not too keen on that as a first approach. The last time I tried using the llvm pdb support as a reference I wasted a lot of time before I found out that it was very incomplete, to the point of being unusable by VTune. It has probably improved since then, but it's hard to tell what state it's in. https://github.com/llvm/llvm-project/issues/37279 https://github.com/llvm/llvm-project/issues/28528 I will try to see if I can reproduce and spot the problem in the source before I go down that road. Thanks anyway.
  22. Anders Melander

    MAP2PDB - Profiling with VTune

    Looks like it. Ooooh, interesting. Maybe they've accidentally broken support for the older format and not noticed it because they're only testing the new format now. The article Stefan linked to makes me think that even though the PDB format supported large PDB files, the PDB reader (msdia140.dll) didn't. Otherwise, they would only have had to update their PDB writer to support large PDB files.
  23. Anders Melander

    Unicode normalization

    Doesn't really matter:
  24. Anders Melander

    MAP2PDB - Profiling with VTune

    That means the bug is most likely in map2pdb because that DLL is Microsoft's API for reading PDB files.
  25. Anders Melander

    MAP2PDB - Profiling with VTune

    I think that is a generic error message meaning "Something went wrong and our error handling sucks". As far as I remember you get a message like that regardless of what problem VTune encounters when resolving through the PDB file.