How do I get PMD reports & the details therein without using I/O?

I am trying to generate a PMD report by using a custom ruleset. The input is a bunch of Apex classes in String format. Without outputting to a file, I'd like to then parse the report contents directly (XML format) to create a summary of violations, begin/end lines, priority, the rule name, and the message attached. This is to be done on multiple bodies of Apex code, adding to a report bean every time there is a violation.

I was trying to do this with SourceCodeProcessor, but couldn't figure out what some of the required objects/arguments looked like, nor how they were built. I still don't know how to build a RuleContext object properly.

Any help is massively appreciated.

Solution

Based on what you intend to do, I'd take a slightly different approach (I'm a PMD maintainer).

SourceCodeProcessor is very low-level. It's the place where the whole analysis process is actually orchestrated, but it leaves out most of the setup needed to get there.

Also of note, SourceCodeProcessor deals with a single file. For Apex this may currently not make a difference, but PMD is increasingly moving towards sharing information across files during analysis (i.e., we have plans to extend the current Data Flow Analysis / Control Flow Graph code to inter-procedural calls), so letting PMD control the complete project analysis in a single run would be best.

Therefore, I'd take a look at PMD.doPmd. You should probably write your own version of that method, covering the same basic steps:

1. Create a PMDConfiguration object with your setup (threads, rulesets to be used, etc.).
2. Have the RuleSetFactory create a ruleset based on your configuration.
3. Obtain a List<DataSource> with the sources to analyze.
4. Create a RuleContext.
5. Set up a listener for the report. This allows you to obtain violations directly as POJOs (you can avoid generating a report file and parsing it).
6. Call PMD.processFiles to actually do the analysis.

The one point where you should diverge from what PMD currently does is step 3. Instead of pointing to files (FileDataSource), you should create a list of ReaderDataSource, using a StringReader around the source code string you retrieved from the db.

It's very little code on your end, just wiring together, in a different way, pieces that already exist in PMD.
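
To make the "violations as POJOs" part more concrete, here is a minimal sketch of the kind of report bean the listener could feed. It uses only the JDK and is not part of PMD: the class name ViolationSummary, its Entry type, and every method on it are hypothetical names chosen for illustration. The idea is that the listener's callback would copy values out of each RuleViolation it receives (for example via getBeginLine() and getEndLine()) and pass them to add(...).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical report bean (not a PMD class) that accumulates violations across many sources. */
final class ViolationSummary {

    /** One finding, holding the fields mentioned in the question. */
    static final class Entry {
        final String sourceName;
        final String ruleName;
        final int priority;
        final int beginLine;
        final int endLine;
        final String message;

        Entry(String sourceName, String ruleName, int priority,
              int beginLine, int endLine, String message) {
            this.sourceName = sourceName;
            this.ruleName = ruleName;
            this.priority = priority;
            this.beginLine = beginLine;
            this.endLine = endLine;
            this.message = message;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    /** Synchronized on the assumption that violations may arrive from several worker threads. */
    synchronized void add(String sourceName, String ruleName, int priority,
                          int beginLine, int endLine, String message) {
        entries.add(new Entry(sourceName, ruleName, priority, beginLine, endLine, message));
    }

    /** Total number of violations recorded so far. */
    synchronized int size() {
        return entries.size();
    }

    /** Read-only snapshot of everything recorded so far. */
    synchronized List<Entry> entries() {
        return Collections.unmodifiableList(new ArrayList<>(entries));
    }

    /** Number of violations per rule name, sorted by rule name. */
    synchronized Map<String, Integer> countByRule() {
        Map<String, Integer> counts = new TreeMap<>();
        for (Entry e : entries) {
            counts.merge(e.ruleName, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        ViolationSummary summary = new ViolationSummary();
        // Placeholder values; in real code they would come from the listener callback.
        summary.add("Example.cls", "ExampleRule", 3, 10, 12, "example message");
        System.out.println(summary.countByRule());
    }
}
```

A single instance of such a bean can be shared across the whole run, so every body of Apex code adds to the same summary.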

P.S. One extra advantage of this approach is that, since PMD is a very high-level class of PMD, it's among the least likely to undergo an API change in future releases. PMD has recently adopted semantic versioning, and the upcoming release (6.0.0) will introduce several API changes and remove deprecated methods and classes.

P.S. 2: This is probably not the best place to ask about this. It's mostly by chance that I came across it, and I don't think anyone unfamiliar with PMD internals would be able to help. You may want to reach out to the PMD development team more directly (development mailing list, GitHub, and so on).