I am writing a custom Java annotator for our UIMA pipeline in Watson Explorer Content Analytics.
There are two places (that I know of) where I could try to get the URL or filename of the document that is currently being processed.
Initialize
    public class CustomAnnotator extends JCasAnnotator_ImplBase {
        @Override
        public void initialize(UimaContext aContext)
                throws ResourceInitializationException {
            super.initialize(aContext);
            .... HERE MAYBE ? ....
Or
Process
        @Override
        public void process(JCas jcas) throws AnalysisEngineProcessException {
            try {
                .... HERE ....
I have tried several options:
I also found SourceDocumentInformation, but it is only an example type, and although its getUri() method seems promising, I depend on IBM to implement the setUri(String) method...
So far I have not been successful. I hope I have overlooked something...
I asked the same question on IBM dwAnswers. In short, you can access multiple views when the pipeline runs in the Watson Explorer Content Analytics server. For metadata, you need to inspect the _InitialView and not the rlw-view, which is the view that holds all annotations created by the custom pipeline you build in Content Analytics Studio. More details can be found here (also look at the responses): https://www.ibm.com/developerworks/community/blogs/ibmandgoogle/entry/Exporting_annotations_from_Watson_Explorer_Content_Analytics?lang=en
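
To make that concrete, here is a minimal sketch of how process() could switch to the _InitialView and list what is indexed there. It uses only the standard Apache UIMA API (getViewIterator, getView, CAS.NAME_DEFAULT_SOFA, getAllIndexedFS). I am assuming the server really passes a CAS that contains both views, and I do not know which type or feature carries the URL or filename in your installation, so the sketch just prints every feature structure in that view so you can look for it. It needs the UIMA jars on the classpath and only does something useful inside a running pipeline.

```java
import java.util.Iterator;

import org.apache.uima.analysis_component.JCasAnnotator_ImplBase;
import org.apache.uima.analysis_engine.AnalysisEngineProcessException;
import org.apache.uima.cas.CAS;
import org.apache.uima.cas.CASException;
import org.apache.uima.cas.FSIterator;
import org.apache.uima.cas.FeatureStructure;
import org.apache.uima.cas.Type;
import org.apache.uima.jcas.JCas;

public class CustomAnnotator extends JCasAnnotator_ImplBase {

    @Override
    public void process(JCas jcas) throws AnalysisEngineProcessException {
        try {
            // List the view names, to confirm which views the server provides.
            Iterator<JCas> views = jcas.getViewIterator();
            while (views.hasNext()) {
                System.out.println("view: " + views.next().getViewName());
            }

            // CAS.NAME_DEFAULT_SOFA is the constant for "_InitialView".
            JCas initialView = jcas.getView(CAS.NAME_DEFAULT_SOFA);

            // Walk every indexed feature structure in that view. The top type
            // matches all types, so nothing in the indexes is filtered out.
            CAS cas = initialView.getCas();
            Type top = cas.getTypeSystem().getTopType();
            FSIterator<FeatureStructure> it =
                    cas.getIndexRepository().getAllIndexedFS(top);
            while (it.hasNext()) {
                FeatureStructure fs = it.next();
                // toString() prints the type name and the feature values.
                System.out.println(fs.getType().getName() + " -> " + fs);
            }
        } catch (CASException e) {
            throw new AnalysisEngineProcessException(e);
        }
    }
}
```

Once you have spotted the type and feature that hold the URL or filename in this output, you can replace the dump with a direct lookup. Note that initialize() is not a candidate for this: it runs once when the annotator is created, before any document is processed, so per-document metadata can only be read in process().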