I am using the Discovery service of IBM Watson. I would like to build a search engine using this service and I use the Java API in order to upload files to the collection where I will do my searches.
I would like to know whether it is possible to apply low-level customization to the service, such as to the content extraction, the term tokenization, or any filters applied while the content is processed. I searched through the Java API documentation and it seems that this is not possible, but I would like to be sure.
Thank you.
Please refer to this document for the customization options that are available: https://console.bluemix.net/docs/services/discovery/building.html#configuring-your-service
Tokenization depends on the language you specify, but no custom tokenization is provided.
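If you need your own tokenization or filtering, one possible workaround is to preprocess the text on the client before uploading it. This is not a Discovery feature. The sketch below uses only the JDK (Java 9+), and the class name, the stop-word list, and the splitting rule are all made up for illustration:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

public class ClientSidePreprocessor {
    // Hypothetical stop-word list, for illustration only.
    private static final Set<String> STOP_WORDS = Set.of("the", "a", "an", "of", "and");

    // Lowercase the text, split on runs of characters that are neither
    // letters nor digits, then drop empty tokens and stop words.
    public static List<String> tokenize(String text) {
        return Arrays.stream(text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+"))
                .filter(t -> !t.isEmpty())
                .filter(t -> !STOP_WORDS.contains(t))
                .collect(Collectors.toList());
    }

    // Join the remaining tokens back into a single string that could be
    // written to a file and uploaded in place of the raw text.
    public static String normalize(String text) {
        return String.join(" ", tokenize(text));
    }

    public static void main(String[] args) {
        System.out.println(normalize("The Discovery service of IBM Watson, and a search engine!"));
    }
}
```

Keep in mind that this changes the text that gets stored, and the service will still apply its own language-based tokenization on top of whatever you upload.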
The Data Crawler does provide some URL filtering, but I am not sure whether that is what you are looking for: https://console.bluemix.net/docs/services/discovery/data-crawler.html#adding-content-with-data-crawler