I have a collection of legacy C code which I'm refactoring to split the C computational code from the GUI. This is complicated by the heavily recursive mathematical core code being K&R style declarations. I've already abandoned an attempt to convert these to ANSI declarations due to nested use of function parameters (just couldn't get those last 4 compiler errors to go).
I need to move some files into a pure DLL and determine the minimal interface to make public, which is going to require wrapper functions writing to publish a typed interface.
I've marked up the key source files with the Doxygen @callergraph markup so informative graphs are generated for individual functions. What I'd like to do beyond that is amalgamate these graphs so I can determine the narrowest boundary of functions exposed to the outside world.
The original header files are no use - they expose everything as untyped C functions.
There are hundreds of functions so simple inspection of the generated callergraphs is painful.
I'm considering writing some kind of DOT merge tool - setting DOT_CLEANUP=NO makes Doxygen leave the intermediate DOT files there rather then just retaining the png files they generated.
I'm not obsessed by this being a graphical solution - I'd be quite happy if someone could suggest an alternative analysis tool (free or relatively cheap) or technique using Doxygen's XML output to achieve the same goal.
A callergraph amalgamated at the file level does have a certain appeal for client documentation rather than a plain list :-)
In your Doxyfile, set
GENERATE_XML = YES
and then you can find your call graph in the XML files. For each function marked with the callergraph, you'll find <referencedby>
elements in the output that you can use. Here's a sample from one of my C files:
<memberdef kind="function" id="spfs_8c_1a3"
prot="public" static="yes" const="no" explicit="no"
inline="no" virt="non-virtual">
<type>svn_error_t *</type>
<definition>svn_error_t * init_apr</definition>
<argsstring>(apr_pool_t **ppool, int *argc, char const *const **argv)</argsstring>
<name>init_apr</name>
<!-- param and description elements clipped for brevity ... -->
<location file="src/spfs.c" line="26" bodystart="101" bodyend="120"/>
<referencedby refid="spfs_8c_1a13" compoundref="spfs_8c"
startline="45" endline="94">main</referencedby>
</memberdef>
So for every memberdef/referencedby pair, you have a caller-callee relationship, which you can grab via XSLT:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="/">
<xsl:apply-templates select="//memberdef[@kind eq 'function']"/>
</xsl:template>
<xsl:template match="memberdef">
<xsl:variable name="function-name"
select="concat(definition, argsstring)"/>
<xsl:for-each select="referencedby">
<xsl:value-of select="concat(./text(), ' calls ', $function-name, '
')"/>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
Which gives you a line per caller-callee like this:
main calls svn_error_t * init_apr(apr_pool_t **ppool, int *argc, char const *const **argv)
You'll want to tweak that XSLT and then partition that directed graph in the way that cuts across the fewest edges. Woo hoo, an NP-complete problem! Luckily, there are lots of papers on the subject, some are here: http://www.sandia.gov/~bahendr/partitioning.html