I am working on a function that returns a list of code volumes, one per package, for a Java M3 model. The function looks like this:
public list[int] calculateSizePerComponent(M3 model){
    set[loc] packages = packages(model);
    list[int] componentSizes = [];
    for(package <- packages){
        list[loc] classFiles = [x | x <- package.ls, endsWith(x.file, ".java")];
        if(size(classFiles)>0){
            int sourceSize = 0;
            for(classFile <- classFiles){
                sourceSize+=getLinesOfCode(classFile).linesOfCode;
            }
            componentSizes += sourceSize;
        }
    }
    return componentSizes;
}
I use the following function to calculate the number of lines of code (volume) in a Java compilation unit (it works for other examples):
public tuple[int linesOfCode,int blankLines,int commentLines] getLinesOfCode(loc location) {
    int linesOfCode = 0;
    int blankLines = 0;
    int commentLines = 0;
    bool incomment = false;
    srcLines = readFileLines(location);
    for (line <- srcLines) {
        switch(line){
            case /^\s*\/\/\s*\w*/: commentLines += 1; // Line preceded by '//'
            case /((\s*\/\*[\w\s]+\*\/)+[\s\w]+(\/\/[\s\w]+$)*)+/: linesOfCode += 1; // Line containing Java code containing any amount of comments. Example: code /**comment*/ code /**comment*/ code
            case /^\s*\/\*?[\w\s\?\@]*\*\/$/: commentLines += 1; // Line containing single line comment: /*comment*/
            case /\s*\/\*[\w\s]*\*\/[\s\w]+/: linesOfCode += 1; // Line containing a comment, but also code. Example: /**comment*/ code
            case /^[\s\w]*\*\/\s*\w+[\s\w]*/: {incomment = false; linesOfCode += 1;} // Line closing a multi-line comment, but also containing code. Example: comment*/ code
            case /^\s*\/\*\*?[^\*\/]*$/: {incomment = true; commentLines += 1;} // Line opening a multi-line comment, Example: /**comment
            case /\s*\*\/\s*$/: {commentLines += 1; incomment = false;} // Line closing a multi-line comment, Example: comment*/
            case /^\s*$/: blankLines += 1; // Blank line
            default: if (incomment) commentLines += 1; else linesOfCode += 1;
        }
    }
    return <linesOfCode,blankLines,commentLines>;
}
However, package.ls seems to return results that have the wrong scheme. Because of this, I get the following error at the readFileLines call:
|std:///IO.rsc|(14565,775,<583,0>,<603,43>): IO("Unsupported scheme java+package")
at *** somewhere ***(|std:///IO.rsc|(14565,775,<583,0>,<603,43>))
at readFileLines(|project://Software_Evolution/src/metrics/volume.rsc|(1911,8,<49,26>,<49,34>))
at calculateSizePerComponent(|project://Software_Evolution/src/metrics/componentsize.rsc|(1996,38,<64,16>,<64,54>))
at getComponentSize(|project://Software_Evolution/src/metrics/componentsize.rsc|(267,1112,<15,0>,<42,1>))
at $root$(|prompt:///|(0,30,<1,0>,<1,30>))
When I println the location, I get the following:
|java+package:///smallsql/database/language/Language.java|
This is incorrect, because this is a Java compilation unit and not a package. How do I get the lines of code in this file?
Step by step analysis:
- package.ls works because first the logical URI is resolved by the registry "name server" to an actual physical folder on disk. If that is indeed a directory, then .ls has the right semantics and you get back a list of files in that folder.
- |java+package:///smallsql/database/language/Language.java| actually points to a file and not even a compilationUnit.
- resolve(package).ls should do much better.

PS: the regular expressions are rather error prone, and you might have to deal with a lot of corner cases. I'd use a real parser generated from a syntax definition for Java, or use the syntax trees which are already produced by M3 via the Eclipse compiler, to compute the SLOC.
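
The last step could look roughly like the sketch below. It assumes that the "resolve" mentioned above corresponds to resolveLocation from the IO module (check the name against your Rascal version), and that getLinesOfCode from the question is in scope. The idea is to turn the logical java+package location into a physical folder before listing it, so that the entries handed to readFileLines carry a scheme it supports.

```rascal
import IO;
import List;
import String;
import lang::java::m3::Core;
// Assumption: the module that defines getLinesOfCode (from the question) is imported as well.

public list[int] calculateSizePerComponent(M3 model) {
    list[int] componentSizes = [];
    for (loc package <- packages(model)) {
        // Assumption: resolveLocation maps the logical java+package URI
        // to the physical folder registered for it.
        loc folder = resolveLocation(package);
        list[loc] classFiles = [f | f <- folder.ls, endsWith(f.file, ".java")];
        if (size(classFiles) > 0) {
            // Sum the lines of code over all .java files in this package.
            componentSizes += (0 | it + getLinesOfCode(f).linesOfCode | f <- classFiles);
        }
    }
    return componentSizes;
}
```

If a package is registered against more than one physical folder, this sketch only sees the one the resolver returns, so the totals may need to be checked against the project layout.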