Parse UTF-8 XML file with XmlSlurper


I'm trying to parse a Google Atom entry with XmlSlurper. My use case is something like this:

1) Send an Atom XML document to the server with a REST client.

2) Handle the request and parse it on the server side.

I am developing my server in Groovy and use XmlSlurper as the parser, but parsing fails with a "Content is not allowed in prolog" exception. To find out why, I saved the Atom XML to a file encoded as UTF-8, then read the file and parsed it, and got the same exception. When I saved the same XML to a file encoded as ANSI instead, it parsed successfully. So I think the problem is related to XmlSlurper and UTF-8.

Do you have any idea about this limitation? My Atom XML has to be UTF-8, so how can I parse it? Thanks for your help.

XML:

<?xml version="1.0" encoding="UTF-8"?>
<entry xmlns:atom='http://www.w3.org/2005/Atom'
    xmlns:gd='http://schemas.google.com/g/2005'>
  <category scheme='http://schemas.google.com/g/2005#kind'
    term='http://schemas.google.com/contact/2008#contact' />
  <title type='text'>Elizabeth Bennet</title>
  <content type='text'>Notes</content>
  <gd:email rel='http://schemas.google.com/g/2005#work'
    address='liz@gmail.com' />
  <gd:email rel='http://schemas.google.com/g/2005#home'
    address='liz@example.org' />
  <gd:phoneNumber rel='http://schemas.google.com/g/2005#work'
    primary='true'>
    (206)555-1212
  </gd:phoneNumber>
  <gd:phoneNumber rel='http://schemas.google.com/g/2005#home'>
    (206)555-1213
  </gd:phoneNumber>
  <gd:im address='liz@gmail.com'
    protocol='http://schemas.google.com/g/2005#GOOGLE_TALK'
    rel='http://schemas.google.com/g/2005#home' />
  <gd:postalAddress rel='http://schemas.google.com/g/2005#work'
    primary='true'>
    1600 Amphitheatre Pkwy Mountain View
  </gd:postalAddress>
</entry>

Read file and parse:

 String file = "C:\\Documents and Settings\\user\\Desktop\\create.xml";
 String line = "";
 StringBuilder sb = new StringBuilder();
 BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream(file)));
 while ((line = br.readLine()) !=null) {
     sb.append(line);
 }
 System.out.println("sb.toString() = " + sb.toString());

 def xmlf = new XmlSlurper().parseText(sb.toString())
     .declareNamespace(gContact:'http://schemas.google.com/contact/2008',
         gd:'http://schemas.google.com/g/2005')

 println xmlf.title

Solution

  • Try:

    String file = "C:\\Documents and Settings\\user\\Desktop\\create.xml"
    
    def xmlf = new XmlSlurper().parse( new File( file ) ).declareNamespace( 
            gContact:'http://schemas.google.com/contact/2008',
            gd:'http://schemas.google.com/g/2005' )
    println xmlf.title  
    

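    You're going the long way round: XmlSlurper can parse the File directly. Reading the file through an InputStreamReader without naming a charset decodes it with the platform default encoding, so a UTF-8 byte order mark at the start of the file can end up as stray characters in front of <?xml, which is what typically triggers "Content is not allowed in prolog". Handing the parser the file itself lets it detect the encoding from the bytes.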
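
To see the difference without touching the real file, here is a minimal sketch. It assumes the UTF-8 file starts with a byte order mark (the question does not confirm this, but it is the usual cause of this error), and it uses a shortened stand-in document rather than create.xml. The variable names are made up for illustration.

```groovy
import groovy.xml.XmlSlurper   // Groovy 3+; on Groovy 2.x drop this line (groovy.util.XmlSlurper is auto-imported)

// Stand-in for create.xml (assumption): a tiny entry encoded as UTF-8 with a leading BOM.
String xml = '<?xml version="1.0" encoding="UTF-8"?><entry><title type="text">Elizabeth Bennet</title></entry>'
byte[] bytes = ('\uFEFF' + xml).getBytes('UTF-8')   // U+FEFF encodes to the bytes EF BB BF

// 1) Give the parser the raw bytes: it detects the encoding and skips the BOM itself.
def fromStream = new XmlSlurper().parse(new ByteArrayInputStream(bytes))
println fromStream.title.text()

// 2) Decode by hand with a single-byte charset, as a non-UTF-8 platform default would.
//    ISO-8859-1 is used here only because every JVM provides it.
String misdecoded = new String(bytes, 'ISO-8859-1')
boolean failed = false
try {
    new XmlSlurper().parseText(misdecoded)
} catch (org.xml.sax.SAXParseException e) {
    failed = true                                   // junk characters precede "<?xml"
    println "parseText failed: ${e.message}"
}

// 3) If only a String is available, decode as UTF-8 and strip a leading U+FEFF first.
String decoded = new String(bytes, 'UTF-8')
if (decoded.startsWith('\uFEFF')) {
    decoded = decoded.substring(1)
}
def fromText = new XmlSlurper().parseText(decoded)
println fromText.title.text()
```

On the server side the same idea would apply: if the request body is available as an InputStream, passing that stream to parse() avoids the manual decoding step entirely.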