I have a CAS that was serialized to an XMI file. When I deserialize the XMI file, the output is plain text as opposed to XML. Here is what I am doing:
// Load the type system descriptor from the classpath.
URL myURL = UIMAFramework.class.getResource("TypeSystem.xml");
TypeSystemDescription tsDesc = UIMAFramework.getXMLParser().parseTypeSystemDescription(new XMLInputSource(myURL));
// Create an empty CAS for that type system.
CAS cas = CasCreationUtils.createCas(tsDesc, null, null);
// Deserialize the XMI file into the CAS (third argument: lenient = false); the stream is closed automatically.
try (FileInputStream xmiInput = new FileInputStream(args[0])) {
    XmiCasDeserializer.deserialize(xmiInput, cas, false);
}
JCas jCas = cas.getJCas();
logger.info(jCas.getDocumentText());
Where am I going wrong?
If I understand correctly, you are wondering why jCas.getDocumentText() returns plain text instead of the XML used by the XMI format. That is exactly what the XmiCasDeserializer is for: it decodes the XML of the XMI format back into a CAS. The document text stored in the XML ends up in jCas.getDocumentText(). The rest is added to the CAS data structure as annotations.
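To make that split concrete, here is a minimal sketch that uses only the JDK's XML parser, not UIMA itself. It assumes a tiny hand-written XMI snippet (real files contain more elements and attributes) in which the document text sits in the sofaString attribute of the cas:Sofa element and each annotation is an element with begin and end offsets. The class name XmiPeek and its helper methods are hypothetical and only serve the illustration; in real code you would keep using XmiCasDeserializer.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class XmiPeek {
    // Hypothetical, hand-written sample; not the output of a real pipeline.
    static final String SAMPLE =
        "<xmi:XMI xmlns:xmi=\"http://www.omg.org/XMI\""
      + " xmlns:cas=\"http:///uima/cas.ecore\""
      + " xmlns:tcas=\"http:///uima/tcas.ecore\" xmi:version=\"2.0\">"
      + "<cas:NULL xmi:id=\"0\"/>"
      + "<tcas:DocumentAnnotation xmi:id=\"8\" sofa=\"1\" begin=\"0\" end=\"10\" language=\"en\"/>"
      + "<cas:Sofa xmi:id=\"1\" sofaNum=\"1\" sofaID=\"_InitialView\" mimeType=\"text\" sofaString=\"Hello UIMA\"/>"
      + "<cas:View sofa=\"1\" members=\"8\"/>"
      + "</xmi:XMI>";

    // Parse the XMI string and return its root element.
    private static Element parse(String xmi) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            byte[] bytes = xmi.getBytes(StandardCharsets.UTF_8);
            return factory.newDocumentBuilder()
                          .parse(new ByteArrayInputStream(bytes))
                          .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XMI", e);
        }
    }

    // The plain document text: the sofaString attribute of the first Sofa element.
    static String documentText(String xmi) {
        NodeList children = parse(xmi).getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node node = children.item(i);
            if (node instanceof Element && "Sofa".equals(node.getLocalName())) {
                return ((Element) node).getAttribute("sofaString");
            }
        }
        return null;
    }

    // Every element with begin/end offsets, rendered as "Type[begin,end]=covered text".
    static List<String> annotationSpans(String xmi) {
        String text = documentText(xmi);
        List<String> spans = new ArrayList<>();
        NodeList children = parse(xmi).getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node node = children.item(i);
            if (!(node instanceof Element)) {
                continue;
            }
            Element element = (Element) node;
            if (element.hasAttribute("begin") && element.hasAttribute("end")) {
                int begin = Integer.parseInt(element.getAttribute("begin"));
                int end = Integer.parseInt(element.getAttribute("end"));
                spans.add(element.getLocalName() + "[" + begin + "," + end + "]="
                        + text.substring(begin, end));
            }
        }
        return spans;
    }

    public static void main(String[] args) {
        System.out.println(documentText(SAMPLE));
        System.out.println(annotationSpans(SAMPLE));
    }
}
```

The first method corresponds roughly to what you see from jCas.getDocumentText(), the second to what ends up in the annotation index. The XML markup itself is only the transport format and is gone once the CAS has been populated.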
To access the annotations from the CAS, there are various ways, e.g.:
jCas.getAnnotationIndex().iterator()
to simply iterate over all annotations.
Alternative places to look for documentation
Disclosure: I am a developer on the UIMA and uimaFIT projects.