Thanks for reading this question.
I'm trying to index RTF files in Lucene. It looks like there are a few ways to do it, but all of them seem to just extract the body text and hand it to Lucene. I think this loses the separate fields. If I want to index the file path (for display) and the body text (for querying), how can I do that?
Thanks :)
You just add a literal.<fieldname> parameter for each additional field you want (in your case, the path), set to the value you want stored, and send it along with the file.
See here for the docs. In your case it would be:
curl "http://localhost:8983/solr/update/extract?literal.path=\path\to\tutorial&commit=true" -F "[email protected]"
If you need to URL-encode the backslash (\), it's %5C.
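As a rough sketch of that encoding step, the snippet below builds the request URL from the command above with the backslashes in the path replaced by %5C. The function names encode_backslashes and build_extract_url are made up for illustration, and only backslashes are handled; a path with spaces, & or other reserved characters would need fuller URL encoding.

```shell
#!/bin/sh
# Hypothetical helper: percent-encode backslashes only (\ -> %5C).
# Other reserved characters (spaces, &, ...) are NOT handled here.
encode_backslashes() {
  printf '%s' "$1" | sed 's/\\/%5C/g'
}

# Hypothetical helper: build the extract URL with the path passed as a
# literal field. Host, port, and handler path are copied from the
# command above; adjust them for your own Solr setup.
build_extract_url() {
  printf 'http://localhost:8983/solr/update/extract?literal.path=%s&commit=true' \
    "$(encode_backslashes "$1")"
}

# Print the URL for the example path.
build_extract_url '\path\to\tutorial'
```

The printed URL can then be used in place of the hand-written one in the curl command, with the -F file upload part left unchanged.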