I need to parse the XML files generated by Visual Studio's performance profiler, and get information about CPU usage of each function in my program (similar to what Visual Studio displays in the diagnostics tools and performance profiler).
I tried looking through all the XML files that I can generate using the tool (call tree summary, function summary, process summary, etc.), but I can't seem to find the information regarding CPU usage.
Which XML views should I export to get this information? Also, where can I find CPU usage? What are InclSamples and ExclSamples? Here is an example from my function summary XML:
<Function FunctionName="MatrixMultiply.Program.FunctionName" InclSamples="1,444" ExclSamples="881" InclSamplesPercent="97.30" ExclSamplesPercent="59.37" />
To generate the XML files correctly, choose a target and check the CPU Usage box in the Performance Profiler (Alt + F2), which is available under Debug > Performance Profiler in the menu. This opens the diagsession window, which asks you to choose the performance tools you need and start profiling:
- CPU Usage
- Memory Usage
- GPU Usage
Once you export the report into a VSPX file, VS2017 allows you to export this binary file into readable CSV/XML files. In the menu that shows up, you can choose the reports you'd like. Some of the reports that the profiler can generate for you are:
It looks like you're most interested in the CallTreeSummary. The exported XML file will be named <reportname>_CallTreeSummary.xml and contains a <PerformanceReport> element consisting of a <CallTreeSummary>. Each CallTree entry in the CallTreeSummary contains the FunctionName, the InclSamples, the ExclSamples, and their respective percentages.
Here's an example. Consider the following sample C++ code (a memory leak sample):
int main() {
    while (true) {
        int *p = new int;  // allocated and never freed, so every iteration leaks
    }
    return 0;  // never reached
}
Part of the resulting CallTree looks like this:
<CallTree Level="8" FunctionName="operator new" InclSamples="20,560" ExclSamples="78" InclSamplesPercent="97.46" ExclSamplesPercent="0.37" ModuleName="Sample.exe" />
<CallTree Level="9" FunctionName="[ucrtbased.dll]" InclSamples="20,482" ExclSamples="55" InclSamplesPercent="97.09" ExclSamplesPercent="0.26" ModuleName="ucrtbased.dll" />
<CallTree Level="10" FunctionName="[ucrtbased.dll]" InclSamples="20,427" ExclSamples="83" InclSamplesPercent="96.83" ExclSamplesPercent="0.39" ModuleName="ucrtbased.dll" />
<CallTree Level="11" FunctionName="[ucrtbased.dll]" InclSamples="20,344" ExclSamples="177" InclSamplesPercent="96.44" ExclSamplesPercent="0.84" ModuleName="ucrtbased.dll" />
<CallTree Level="12" FunctionName="[ucrtbased.dll]" InclSamples="20,119" ExclSamples="2,092" InclSamplesPercent="95.37" ExclSamplesPercent="9.92" ModuleName="ucrtbased.dll" />
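Since the goal is to parse these files, here is a minimal Python sketch that reads CallTree entries like the ones above. It assumes the layout described in this answer (a PerformanceReport containing a CallTreeSummary with CallTree elements) and that sample counts use a comma as the thousands separator. The names SAMPLE, to_int, and parse_call_tree are made up for illustration, and a real export may differ in details.

```python
import xml.etree.ElementTree as ET

# Hypothetical excerpt, shaped like the export described in this answer.
SAMPLE = """<PerformanceReport>
  <CallTreeSummary>
    <CallTree Level="8" FunctionName="operator new" InclSamples="20,560" ExclSamples="78" InclSamplesPercent="97.46" ExclSamplesPercent="0.37" ModuleName="Sample.exe" />
    <CallTree Level="9" FunctionName="[ucrtbased.dll]" InclSamples="20,482" ExclSamples="55" InclSamplesPercent="97.09" ExclSamplesPercent="0.26" ModuleName="ucrtbased.dll" />
  </CallTreeSummary>
</PerformanceReport>"""


def to_int(text):
    """Convert a sample count such as '20,560' to an int."""
    return int(text.replace(",", ""))


def parse_call_tree(xml_text):
    """Return one dict per <CallTree> element, in document order."""
    root = ET.fromstring(xml_text)
    rows = []
    # iter() walks the whole tree, so this works whether the CallTree
    # elements are flat or nested inside one another.
    for node in root.iter("CallTree"):
        rows.append({
            "level": int(node.get("Level")),
            "function": node.get("FunctionName"),
            "module": node.get("ModuleName"),
            "incl_samples": to_int(node.get("InclSamples")),
            "excl_samples": to_int(node.get("ExclSamples")),
            "incl_percent": float(node.get("InclSamplesPercent")),
            "excl_percent": float(node.get("ExclSamplesPercent")),
        })
    return rows


rows = parse_call_tree(SAMPLE)
for row in rows:
    print(row["level"], row["function"], row["incl_samples"], row["excl_percent"])
```

To read an exported file instead of the embedded string, `ET.parse(path).getroot()` can replace `ET.fromstring(xml_text)`.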
InclSamples (inclusive samples) is the number of samples (loosely, CPU ticks) collected while the function or any function it called was executing.
ExclSamples (exclusive samples) is the number of samples collected while executing code in the function itself, excluding the functions it called.
As an illustration, consider the following program:
int bar() {
    return 1;
}

int foo() {
    return bar();
}

int main() {
    int x = foo();
    return 0;
}
A sample execution could show the following data:
<Function FunctionName="main" InclSamples="100" ExclSamples="10" InclSamplesPercent="100.00" ... />
<Function FunctionName="foo" InclSamples="90" ExclSamples="40" InclSamplesPercent="90.00" ... />
<Function FunctionName="bar" InclSamples="50" ExclSamples="50" InclSamplesPercent="50.00" ... />
This is interpreted as follows:
- The main() function takes 100 CPU ticks in total, but only 10 of those ticks were spent executing main() itself, leaving 90 ticks that were spent in other functions called from main().
- The foo() function takes 90 CPU ticks (since this includes the run time of foo() + bar()), but foo() by itself takes only 40 ticks.
- The bar() function takes 50 ticks to run.

Using the reasoning above, you can interpret the CallTree sample provided earlier. The associated InclSamplesPercent is the function's inclusive samples as a percentage of all samples collected, that is, the share of CPU time spent in the function and everything it calls. For example, from the sample above we can say that 100% of the CPU usage was in main(), of which 90% was taken up by foo() and 50% by bar(). That makes the ExclSamplesPercent of foo() 90 - 50 = 40% and the ExclSamplesPercent of main() 100 - 90 = 10%.
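The subtraction above can be sketched in Python as a sanity check on the numbers. This assumes rows are listed in call-tree order with a depth value like the Level attribute, and that the root's inclusive count equals the total number of samples. The function name exclusive_from_inclusive and the tuple layout are illustrative, not part of the profiler's output.

```python
def exclusive_from_inclusive(rows):
    """rows: list of (level, name, inclusive_samples) in call-tree order.

    Returns a list of (name, exclusive_samples), where exclusive is the
    inclusive count minus the inclusive counts of the direct children.
    """
    exclusive = [incl for _, _, incl in rows]
    stack = []  # indices of the current chain of ancestors
    for i, (level, _, incl) in enumerate(rows):
        # Drop entries that are not ancestors of the current row.
        while stack and rows[stack[-1]][0] >= level:
            stack.pop()
        if stack:
            exclusive[stack[-1]] -= incl  # charge this row to its direct parent
        stack.append(i)
    return [(rows[i][1], exclusive[i]) for i in range(len(rows))]


# The main/foo/bar example from above.
rows = [(0, "main", 100), (1, "foo", 90), (2, "bar", 50)]
total = rows[0][2]  # assumption: the root's inclusive count is the total
result = exclusive_from_inclusive(rows)
for name, excl in result:
    print(f"{name}: ExclSamples={excl} ExclSamplesPercent={100 * excl / total:.2f}")
```

For the example data this yields 10, 40, and 50 exclusive samples for main, foo, and bar, matching the ExclSamples values shown above.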