Anders Melander

MAP2PDB - Profiling with VTune


What other tools use pdb? I think pdb files can be used to get stack traces in Process Explorer and Process Hacker 2. Obviously there is Visual Studio and there must be many others. I wonder if anybody has tried using the pdb files generated by map2pdb with other tools such as these. 

Edited by David Heffernan

1 hour ago, David Heffernan said:

other tools such as these

Yes, OllyDbg, as I wrote before. Works like a dream. So basically every debugger which supports pdb will work.

8 hours ago, Edwin Yip said:

Maybe price? 😉

That's a significant factor, yes. But I have already bought it, and when a tool saves you time and/or lets you improve the quality of your code more easily, it pays for itself. So I was mostly talking about technical differences.

3 hours ago, Anders Melander said:

It's probably caused by a bug in map2pdb (e.g. some hash table is incorrect) but for now my plan is to add a white/black list option to include and exclude units from the pdb. So for example if I specify the switch -exclude:dx* then any unit that starts with "dx" (i.e. DevExpress) will be excluded from the pdb.

Ok. Please let me know if I can be of any help. I believe it's currently not very usable for you either, since it simply takes too long.

Your focus on black/white is in the hope that it will reduce the symbol table size and thus speed up the process?

3 hours ago, Anders Melander said:

my plan is to add a white/black list option to include and exclude units from the pdb

With an exclusion list that removed most of the VCL and RTL as well as DevExpress, TeeChart, Indy and FireDAC I managed to reduce the size of my pdb from 200Mb to 35Mb.

VTune now loads my project in "just" 5 minutes... It's still struggling though. Everything is incredibly slow. I get the impression that Intel has never tried profiling VTune with VTune... or maybe they tried and gave up because it was too slow.

 

Here's my command line:

map2pdb -v -bind "TurboFooProPlus.map" -exclude:dx*;cx*;system*;winapi*;vcl*;data*;firedac*;soap*;web*;id*
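
To illustrate how a wildcard exclusion list like the one above narrows down the set of units, here is a minimal Python sketch. It is not map2pdb's actual implementation: the helper name `is_excluded` and the sample unit names are made up for illustration, and case-insensitive matching is an assumption that the post does not confirm.

```python
from fnmatch import fnmatchcase

# The masks from the -exclude switch in the command line above.
EXCLUDE_MASKS = "dx*;cx*;system*;winapi*;vcl*;data*;firedac*;soap*;web*;id*".split(";")

def is_excluded(unit_name, masks=EXCLUDE_MASKS):
    """Return True if the unit name matches any mask (assumed case-insensitive)."""
    name = unit_name.lower()
    return any(fnmatchcase(name, mask.lower()) for mask in masks)

# Hypothetical unit names, only there to show the effect of the filter.
units = ["System.SysUtils", "Vcl.Forms", "dxCore", "IdHTTP", "MyApp.MainForm"]
kept = [u for u in units if not is_excluded(u)]
print(kept)
```

Only the application's own unit survives the filter here, which is the effect that shrinks the pdb.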

 

Edited by Anders Melander

15 minutes ago, Wagner Landgraf said:

Your focus on black/white is in the hope that it will reduce the symbol table size and thus speed up the process?

Yes.


I was just wondering... Does C++Builder also output .map files?

What if you had a C++ project that would build in both VS and RAD Studio and compared the VS .pdb with the map2pdb .pdb to spot any differences in structure and loading times?

7 minutes ago, Lars Fosdal said:

What if you had a C++ project that would build in both VS and RAD Studio and compared the VS .pdb with the map2pdb .pdb to spot any differences in structure and loading times?

I don't know about C++Builder and map files but in any case there would be too many differences, caused by all the other stuff that VS outputs, to make that feasible.

What I've done previously, when I had to figure out why something didn't work, was to use a hex editor to compare the pdb of VTune's matrix example with the output from "map2pdb -test". They are both sufficiently small. At this time I can pretty much parse pdb just by looking at the hex 🙂

 

I think the easiest way forward would be to just examine the key suspects (address table, hash tables) of matrix.pdb in a hex editor and verify that they're ordered and structured like we expect them to be. I'm using @mael's HxD editor so it would help if that supported structures. tap.tap... 🙂

 

Another way would be to write a drop-in replacement for the msdia140.dll in-process COM server, which is what VTune uses to access the PDB data (I considered doing that at one point since it would completely eliminate the need to write pdb). That would tell us exactly what API methods VTune is using and how.


I reduced the size of my project a lot, and now VTune "only" took one hour to load. I could then do some tests.

What I have noticed is that the "View source code" option sometimes opens the wrong file. @Anders Melander Do you want to check it? If yes, what information should I provide to you?

3 minutes ago, Wagner Landgraf said:

what information should I provide to you?

Let's start with the map file. Zip it and PM it to me.


I have tried to make the pdb of a simple console application to work with AMD uProf, but it fails, no matter what I try.

But the hotspot feature of VTune works on my AMD system, so I am happy right now.

 

Thanks!

 

Renate

2 minutes ago, Renate Schaaf said:

I have tried to make the pdb of a simple console application to work with AMD uProf, but it fails, no matter what I try.

Any clues as to what goes wrong?


AMD uProf failed every time I tried so I'm not wondering. Their forum is also full of complaints about that.

3 minutes ago, Attila Kovacs said:

AMD uProf failed every time I tried so I'm not wondering. Their forum is also full of complaints about that.

Too bad. It looks very nice.

I guess I'll take a look at it when my VTune trial expires.

16 minutes ago, Anders Melander said:

I guess I'll take a look at it when my VTune trial expires.

That would be great. Here are some relevant lines from the log file, though I doubt they are of any use:

 

2021.04.08	14:23:10.909	#DEBUG	#PeFile::Open	#PeFile.cpp(111)	#Executable (PE) C:\Windows\SysWOW64\ntdll.dll opened
2021.04.08	14:23:10.137	#DEBUG	#PeFile::InitializeSymbolEngine	#PeFile.cpp(699)	#Executable (PE) C:\Windows\SysWOW64\ntdll.dll started Symbol Engine initialization
2021.04.08	14:23:10.752	#DEBUG	#PeFile::InitializeSymbolEngine	#PeFile.cpp(751)	#Executable (PE) C:\Windows\SysWOW64\ntdll.dll initialized Symbol Engine: PDB
2021.04.08	14:23:10.662	#DEBUG	#PeFile::Open	#PeFile.cpp(111)	#Executable (PE) D:\DelphiSource\DelphiRio\mystuffR\uProfTest\Win32\Debug\uProfTest.exe opened
2021.04.08	14:23:10.945	#DEBUG	#PeFile::InitializeSymbolEngine	#PeFile.cpp(699)	#Executable (PE) D:\DelphiSource\DelphiRio\mystuffR\uProfTest\Win32\Debug\uProfTest.exe started Symbol Engine initialization
2021.04.08	14:23:10.113	#DEBUG	#PeFile::InitializeSymbolEngine	#PeFile.cpp(758)	#Executable (PE) D:\DelphiSource\DelphiRio\mystuffR\uProfTest\Win32\Debug\uProfTest.exe failed to initialize Symbol Engine: PDB
2021.04.08	14:23:10.740	#DEBUG	#PeFile::InitializeSymbolEngine	#PeFile.cpp(778)	#Executable (PE) D:\DelphiSource\DelphiRio\mystuffR\uProfTest\Win32\Debug\uProfTest.exe initialized Symbol Engine: COFF
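
For anyone who wants to scan a longer log for the same pattern, here is a small Python sketch that pulls the symbol engine outcome out of lines shaped like the ones above. The message format is inferred from this excerpt only, not from any uProf documentation, and the function name `symbol_engine_results` is made up for illustration.

```python
import re

# Three lines copied from the excerpt above (tabs replaced by spaces).
LOG_LINES = [
    r"2021.04.08 14:23:10.752 #DEBUG #PeFile::InitializeSymbolEngine #PeFile.cpp(751) "
    r"#Executable (PE) C:\Windows\SysWOW64\ntdll.dll initialized Symbol Engine: PDB",
    r"2021.04.08 14:23:10.113 #DEBUG #PeFile::InitializeSymbolEngine #PeFile.cpp(758) "
    r"#Executable (PE) D:\DelphiSource\DelphiRio\mystuffR\uProfTest\Win32\Debug\uProfTest.exe "
    r"failed to initialize Symbol Engine: PDB",
    r"2021.04.08 14:23:10.740 #DEBUG #PeFile::InitializeSymbolEngine #PeFile.cpp(778) "
    r"#Executable (PE) D:\DelphiSource\DelphiRio\mystuffR\uProfTest\Win32\Debug\uProfTest.exe "
    r"initialized Symbol Engine: COFF",
]

# Message shape as seen in the excerpt: "<path> initialized|failed to initialize Symbol Engine: <name>"
PATTERN = re.compile(
    r"#Executable \(PE\) (?P<path>.+?) (?P<status>failed to initialize|initialized) "
    r"Symbol Engine: (?P<engine>\w+)"
)

def symbol_engine_results(lines):
    """Return (path, engine, succeeded) for every line reporting a symbol engine outcome."""
    results = []
    for line in lines:
        m = PATTERN.search(line)
        if m:
            results.append((m["path"], m["engine"], m["status"] == "initialized"))
    return results

results = symbol_engine_results(LOG_LINES)
for path, engine, ok in results:
    print(("OK    " if ok else "FAILED"), engine, path)
```

In this excerpt the PDB engine fails for uProfTest.exe and uProf then falls back to COFF, while ntdll.dll loads its PDB without trouble.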

 


New version (2.5) uploaded. Changes since last upload:

  • Include/exclude modules/units from pdb.
    This helps keep the size of the pdb down and thus reduces the symbol resolve time in VTune.
  • You no longer need to link your projects with debug info.
    map2pdb will reuse the existing debug section in the exe/dll/bpl if there is one. Otherwise it will create a new one.

https://bitbucket.org/anders_melander/map2pdb/downloads/

 

What's next:

  • Refactoring of the logging code.
    The current logging is basically just some functions that call WriteLn. This should be replaced with a pluggable log framework so the whole logging mechanism can be replaced.
    The end goal is to enable integration of the map2pdb core into other projects.
  • A jdbg reader.
    Embarcadero does not supply map files for the RTL/VCL run-time packages. Instead they ship jdbg files that can be read with the JEDI debug functions.
    The jdbg files are built from map files, so supposedly they contain much, if not all, of the information we need. The task here is to write a reader for the jdbg file format so we can produce pdb files from them.
  • Figure out why VTune is so slow.
    A never ending task it seems.
4 hours ago, Anders Melander said:

Figure out why VTune is so slow.

I'd write them in the meantime, they should work too 😛

On 4/8/2021 at 3:11 PM, Renate Schaaf said:

I have tried to make the pdb of a simple console application to work with AMD uProf, but it fails, no matter what I try.

 

On 4/8/2021 at 3:16 PM, Attila Kovacs said:

AMD uProf failed every time I tried so I'm not wondering.

Works for me, so there was probably something wrong with the pdb at that time. I've tried with both a small and a very large application.


On the positive side, uProf resolved symbols a lot faster than VTune, but I'm a bit surprised at how basic the uProf feature set is and I can't really imagine what I would use it for. Also, it has pie charts... WTF?

 

On 4/10/2021 at 3:44 PM, Anders Melander said:

Figure out why VTune is so slow.

It turned out that the culprit was the version of msdia140.dll that came bundled with the version of VTune I'm using. There's a bug in it that causes exponential slowdown on large pdbs. Replacing the dll with a newer version fixed the problem. The symbol resolve time of my test project fell from hours/days to ~10 minutes. The old msdia140.dll was version 14.10.25017.0; the new one is 14.28.29913.0. Any version from VS2019 or later should do AFAIK.
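
If you want to check a version number against the two mentioned above, compare the parts numerically rather than as text. A minimal Python sketch follows; the helper names are made up for illustration, and reading the version out of the DLL itself (for example from the file properties dialog) is left out.

```python
def parse_version(text):
    """Turn '14.28.29913.0' into (14, 28, 29913, 0) so the parts compare numerically."""
    return tuple(int(part) for part in text.split("."))

OLD_BUNDLED = parse_version("14.10.25017.0")  # the slow one bundled with VTune
KNOWN_GOOD = parse_version("14.28.29913.0")   # the replacement that fixed it

def is_known_good_or_newer(version_text):
    # Versions between the two are not covered by the post, so only
    # the known-good version and anything newer count as safe here.
    return parse_version(version_text) >= KNOWN_GOOD

print(is_known_good_or_newer("14.10.25017.0"))
print(is_known_good_or_newer("14.28.29913.0"))
```

A plain string comparison would get this wrong for versions such as 14.9 versus 14.10, which is why the parts are converted to integers first.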

 

A side effect of trying to solve this performance problem was that I added segment/section filters. You can now specify what segments to include/exclude from the pdb.

For example, since almost all code is in segment 0001, you can exclude all modules and symbols that reside in other segments. This can cause a significant reduction in the size of the pdb.

Try this:

map2pdb -v -include:0001 foobar.map

or try with the -debug switch to get all the details.

I'm considering just adding this 0001 filter as a default.
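
To get a feel for how much a segment filter would remove, you can count the symbol lines per segment in the map file. The Python sketch below assumes symbol lines have the shape "SSSS:OOOOOOOO Name" (segment, offset, name); the sample lines and the function name are made up, and the sketch does not distinguish between the different sections of a real map file, so treat the counts as a rough indication only.

```python
import re
from collections import Counter

# Assumed shape of a symbol line: 4 hex digits, colon, 8 hex digits, then a name.
SYMBOL_LINE = re.compile(r"^\s*([0-9A-Fa-f]{4}):([0-9A-Fa-f]{8})\s+(\S+)")

def symbols_by_segment(lines):
    """Count lines that look like symbol entries, grouped by segment number."""
    counts = Counter()
    for line in lines:
        m = SYMBOL_LINE.match(line)
        if m:
            counts[m.group(1)] += 1
    return counts

# Hypothetical map file fragment.
sample = [
    "  Address             Publics by Value",
    " 0001:00001000       MyUnit.DoWork",
    " 0001:00001040       MyUnit.Helper",
    " 0002:00000010       MyUnit.GlobalVar",
]
counts = symbols_by_segment(sample)
print(dict(counts))
```

Everything counted outside 0001 is what -include:0001 would leave out of the pdb.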

 

I've uploaded a new version (2.6) with all the latest changes (there aren't that many): https://bitbucket.org/anders_melander/map2pdb/downloads/

Also the repository finally has a readme.md


Confirmed: Works with uProf now. Great Job.

I get the same info as with VTune: hotspot timings, processor use, stack graph.

6 hours ago, Anders Melander said:

It turned out that the culprit was the version of msdia140.dll that came bundled with the version of VTune I'm using.

@Anders Melander, unfortunately that didn't work for me in a specific project.

Actually, I never had the patience to wait for it to finish (I waited 1 hour at most). The weird thing is that this project is not that big and I use lots of excludes; the PDB file size is 388 KB. Would you be interested in checking this specific situation?

6 minutes ago, Wagner Landgraf said:

Would you be interested in checking this specific situation?

Sure. Send me the map file by PM.

5 hours ago, Renate Schaaf said:

Confirmed: Works with uProf now. Great Job.

I get the same info as with VTune: hotspot timings, processor use, stack graph.

Thanks for the info. Just downloaded AMD uProf and the installer supports Windows 7 (but haven't tried it yet) ;)

4 minutes ago, Edwin Yip said:

Just downloaded AMD uProf and the installer supports Windows 7 (but haven't tried it yet) ;)

Yes. I'm on Windows 7 too.


Since it seems that the most recent version of VTune also suffers from the performance problem I mentioned earlier, I've now added a note about the problem to the repository readme and uploaded the files that can be used to fix it.

https://bitbucket.org/anders_melander/map2pdb/src/master/#markdown-header-performance-problems-with-intel-vtune

 

Quote
Performance problems with Intel VTune

Due to a bug in the msdia140.dll file that comes bundled with VTune, you will likely find that VTune takes an extremely long time to resolve symbols on anything but the smallest projects.

msdia140.dll implements the Debug Interface Access SDK. The bug was introduced in VS2017 and supposedly fixed in VS2019, but apparently Intel hasn't caught up to that fact and new versions of VTune still come with the old VS2017 version of msdia140.dll.

To fix this problem all you have to do is replace VTune's msdia140.dll with a newer version. The file is located in the bin32 and bin64 folders under the VTune root folder. Note that the 32-bit and 64-bit files are not the same. You need to replace the file in the bin32 folder with the 32-bit version of msdia140.dll and the one in the bin64 folder with the 64-bit version.

Now here's the catch: the files you need to replace are not the ones that are actually named msdia140.dll. You need to replace the ones named amplxe_msdia140.dll. Remember to save the old ones first in case you mess this up.

If you have a newer version of VS installed you can probably find the required files somewhere on your system. I guess anything newer than version 14.10.x.x should do. You can also install the Visual Studio Redistributable and get the files from there or you can just get the two files from the repository download section.
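
The backup-then-replace step described in the readme can be sketched in Python like this. The function name `replace_msdia` and the `.bak` suffix are made up for illustration, and the demo runs against throwaway files in a temporary folder rather than a real VTune installation, whose root folder location is not given here.

```python
import shutil
import tempfile
from pathlib import Path

TARGET_NAME = "amplxe_msdia140.dll"  # the file name to replace, per the readme

def replace_msdia(bin_dir, new_dll):
    """Back up bin_dir/amplxe_msdia140.dll, then overwrite it with new_dll."""
    target = Path(bin_dir) / TARGET_NAME
    backup = target.with_name(target.name + ".bak")
    if not backup.exists():  # never clobber an earlier backup
        shutil.copy2(target, backup)
    shutil.copy2(new_dll, target)
    return backup

# Dry run with dummy files, so nothing in a real VTune folder is touched.
with tempfile.TemporaryDirectory() as tmp:
    bin64 = Path(tmp) / "bin64"
    bin64.mkdir()
    (bin64 / TARGET_NAME).write_bytes(b"old")
    new_dll = Path(tmp) / "msdia140.dll"
    new_dll.write_bytes(b"new")
    backup = replace_msdia(bin64, new_dll)
    backup_content = backup.read_bytes()
    target_content = (bin64 / TARGET_NAME).read_bytes()
print(backup_content, target_content)
```

For a real installation you would call it once for the bin32 folder with the 32-bit dll and once for the bin64 folder with the 64-bit dll, most likely from an elevated prompt.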

 
