My program needs to read values from an XML file using XPath expressions. I have already started the project with JDOM2, so switching to another API is not an option. The difficulty is that the program does not know beforehand whether an expression points to an element or an attribute. Does the API provide a function that returns the content as a string when given just the XPath expression? As far as I know, JDOM2 uses objects of different types to evaluate XPath expressions that point to attributes or elements. I am only interested in the content of the attribute or element that the XPath expression points to.
Here is an example XML file:
<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
<book category="COOKING">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="CHILDREN">
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="WEB">
<title lang="en">XQuery Kick Start</title>
<author>James McGovern</author>
<author>Per Bothner</author>
<author>Kurt Cagle</author>
<author>James Linn</author>
<author>Vaidyanathan Nagarajan</author>
<year>2003</year>
<price>49.99</price>
</book>
<book category="WEB">
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<price>39.95</price>
</book>
</bookstore>
This is what my program looks like:
package exampleprojectgroup;
import java.io.IOException;
import java.util.LinkedList;
import java.util.List;
import org.jdom2.Attribute;
import org.jdom2.Document;
import org.jdom2.Element;
import org.jdom2.JDOMException;
import org.jdom2.filter.Filters;
import org.jdom2.input.SAXBuilder;
import org.jdom2.input.sax.XMLReaders;
import org.jdom2.xpath.XPathExpression;
import org.jdom2.xpath.XPathFactory;
public class ElementAttribute2String
{
ElementAttribute2String()
{
run();
}
public void run()
{
final String PATH_TO_FILE = "c:\\readme.xml";
/* The program must work with a variable number of XPath expressions. */
LinkedList<String> xPathExpressions = new LinkedList<>();
/* Simulate user input.
* First XPath expression points to attribute,
* second one points to element.
* Many more expressions follow in a real situation.
*/
xPathExpressions.add( "/bookstore/book/@category" );
xPathExpressions.add( "/bookstore/book/price" );
/* One list should be sufficient to store the result. */
List<Element> elementsResult = null;
List<Attribute> attributesResult = null;
List<Object> objectsResult = null;
try
{
SAXBuilder saxBuilder = new SAXBuilder( XMLReaders.NONVALIDATING );
Document document = saxBuilder.build( PATH_TO_FILE );
XPathFactory xPathFactory = XPathFactory.instance();
int i = 0;
for ( String string : xPathExpressions )
{
/* Works only for elements, uncomment to give it a try. */
// XPathExpression<Element> xPathToElement = xPathFactory.compile( xPathExpressions.get( i ), Filters.element() );
// elementsResult = xPathToElement.evaluate( document );
// for ( Element element : elementsResult )
// {
// System.out.println( "Content of " + string + ": " + element.getText() );
// }
/* Works only for attributes, uncomment to give it a try. */
// XPathExpression<Attribute> xPathToAttribute = xPathFactory.compile( xPathExpressions.get( i ), Filters.attribute() );
// attributesResult = xPathToAttribute.evaluate( document );
// for ( Attribute attribute : attributesResult )
// {
// System.out.println( "Content of " + string + ": " + attribute.getValue() );
// }
/* I want to receive the content of the XPath expression as a string
* without having to know if it is an attribute or element beforehand.
*/
XPathExpression<Object> xPathExpression = xPathFactory.compile( xPathExpressions.get( i ) );
objectsResult = xPathExpression.evaluate( document );
for ( Object object : objectsResult )
{
if ( object instanceof Attribute )
{
System.out.println( "Content of " + string + ": " + ((Attribute)object).getValue() );
}
else if ( object instanceof Element )
{
System.out.println( "Content of " + string + ": " + ((Element)object).getText() );
}
}
i++;
}
}
catch ( IOException ioException )
{
ioException.printStackTrace();
}
catch ( JDOMException jdomException )
{
jdomException.printStackTrace();
}
}
}
Another thought is to search for the '@' character in the XPath expression, to determine if it is pointing to an attribute or element. This gives me the desired result, though I wish there was a more elegant solution. Does the JDOM2 API provide anything useful for this problem? Could the code be redesigned to meet my requirements?
Thank you in advance!
XPath expressions are hard to type or cast because they are compiled by a system that is sensitive to the return type of the XPath functions and values in the expression. JDOM relies on third-party code to do that, and that third-party code has no mechanism to correlate those types at your JDOM code's compile time. Note that XPath expressions can return several different types of content, including String, boolean, Number, and Node-List-like content.
In most cases, the XPath expression return type is known before the expression is evaluated, and the programmer has the "right" casting/expectations for processing the results.
In your case, you don't, and the expression is more dynamic.
I recommend that you declare a helper function to process the content:
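// Requires org.jdom2.Attribute and org.jdom2.Content to be imported.
private static final String extractValue(Object source) {
    // Attribute is not a Content subclass in JDOM2, so it is checked separately.
    if (source instanceof Attribute) {
        return ((Attribute)source).getValue();
    }
    // Element, Text, CDATA, Comment, etc. all extend Content.
    if (source instanceof Content) {
        return ((Content)source).getValue();
    }
    // Non-node results (String, Boolean, Number) fall through to here.
    return String.valueOf(source);
}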
This will at least tidy up your code, and with Java 8 streams it can be quite compact:
List<String> values = xPathExpression.evaluate( document )
.stream()
.map(o -> extractValue(o))
.collect(Collectors.toList());
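Putting the helper and the stream pipeline together, a minimal self-contained sketch might look like the following. The class name XPathValues, the method evaluateToStrings, and the shortened inline XML are made up for illustration and are not part of JDOM2. It assumes the JDOM2 jar and its Jaxen dependency are on the classpath.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

import org.jdom2.Attribute;
import org.jdom2.Content;
import org.jdom2.Document;
import org.jdom2.JDOMException;
import org.jdom2.input.SAXBuilder;
import org.jdom2.xpath.XPathExpression;
import org.jdom2.xpath.XPathFactory;

// Hypothetical class name, used for illustration only.
final class XPathValues {

    // Same idea as the extractValue helper above: turn any XPath result into a String.
    static String extractValue(Object source) {
        if (source instanceof Attribute) {
            return ((Attribute) source).getValue();
        }
        if (source instanceof Content) {
            return ((Content) source).getValue();
        }
        return String.valueOf(source);
    }

    // Compiles the expression without a filter, so the result type is Object,
    // then maps every result to its string value.
    static List<String> evaluateToStrings(Document document, String expression) {
        XPathExpression<Object> xpath = XPathFactory.instance().compile(expression);
        return xpath.evaluate(document)
                .stream()
                .map(XPathValues::extractValue)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) throws JDOMException, IOException {
        // Shortened stand-in for the bookstore file from the question.
        String xml = "<bookstore>"
                + "<book category=\"COOKING\"><price>30.00</price></book>"
                + "<book category=\"WEB\"><price>39.95</price></book>"
                + "</bookstore>";
        Document document = new SAXBuilder().build(new StringReader(xml));

        // One expression points to attributes, the other to elements.
        for (String expression : Arrays.asList("/bookstore/book/@category", "/bookstore/book/price")) {
            System.out.println("Content of " + expression + ": "
                    + evaluateToStrings(document, expression));
        }
    }
}
```

The caller never needs to know whether an expression selects attributes or elements, so no check for the '@' character is required.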
Note that the XPath spec for Element nodes is that the string-value is the concatenation of the Element's text() content as well as all child elements' content. Thus, in the following XML snippet:
<a>bilbo <b>samwise</b> frodo</a>
the getValue() on the a element will return bilbo samwise frodo, but the getText() will return bilbo frodo. Choose which mechanism you use for the value extraction carefully.
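The difference can be checked with a small sketch like the one below. The class name ValueVersusText and the method parseRoot are hypothetical, and JDOM2 is assumed to be on the classpath. Note that getText() joins the element's direct text nodes verbatim, so the raw result contains the space after bilbo and the space before frodo; getTextNormalize() collapses that whitespace.

```java
import java.io.IOException;
import java.io.StringReader;

import org.jdom2.Element;
import org.jdom2.JDOMException;
import org.jdom2.input.SAXBuilder;

// Hypothetical class name, used for illustration only.
final class ValueVersusText {

    // Parses an XML string and returns its root element.
    static Element parseRoot(String xml) throws JDOMException, IOException {
        return new SAXBuilder().build(new StringReader(xml)).getRootElement();
    }

    public static void main(String[] args) throws JDOMException, IOException {
        Element a = parseRoot("<a>bilbo <b>samwise</b> frodo</a>");

        // XPath string-value: text of the element and of all its descendants.
        System.out.println("[" + a.getValue() + "]");         // prints [bilbo samwise frodo]

        // Only the text nodes that are direct children of <a>, joined as-is.
        System.out.println("[" + a.getText() + "]");          // prints [bilbo  frodo]

        // Same as getText(), with surrounding and internal whitespace normalized.
        System.out.println("[" + a.getTextNormalize() + "]"); // prints [bilbo frodo]
    }
}
```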