php pdf elasticsearch

Rendering a PDF or other document from the query result returned by Elasticsearch


I am working on a project to index documents (mainly PDFs for now). I found out that Elasticsearch can index attached documents using Apache Tika.

I have set up Elasticsearch, indexed a few PDF documents, and am using PHP as the client to render the query results returned by Elasticsearch.

I would appreciate a link to a tutorial on how to locate the PDF attached to an Elasticsearch document from the query result that Elasticsearch returns.

I have searched online but couldn't find any tutorial covering what I want to achieve.


Solution

  • According to the documentation, Elasticsearch stores the content of the attachment as a base64-encoded string. So after you search, you can read that base64-encoded content back from the matching document and decode it into a PDF. For an example of how the decoding can be done, see this thread: PHP get pdf file from base64 encoded data string
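
The decoding step described above could be sketched roughly as follows. This is only an illustration under some assumptions: the helper name `pdfFromHit` is made up, the attachment field is assumed to be called `file` and to hold the base64 string directly in `_source`, and `$hit` is assumed to be one entry of the response's `hits.hits` array as a PHP array. Adjust these to match your own mapping and client.

```php
<?php

/**
 * Decode the base64-encoded attachment of one search hit into raw PDF bytes.
 *
 * $hit   one element of the response's hits.hits array (hypothetical shape)
 * $field name of the attachment field; "file" is an assumption, not a default
 */
function pdfFromHit(array $hit, string $field = 'file'): string
{
    $encoded = $hit['_source'][$field] ?? null;
    if (!is_string($encoded) || $encoded === '') {
        throw new InvalidArgumentException("Hit has no base64 string in _source.$field");
    }

    // Strict mode returns false on characters outside the base64 alphabet.
    $bytes = base64_decode($encoded, true);
    if ($bytes === false) {
        throw new UnexpectedValueException('Attachment is not valid base64');
    }

    // A PDF file starts with the "%PDF-" signature; reject anything else.
    if (strncmp($bytes, '%PDF-', 5) !== 0) {
        throw new UnexpectedValueException('Decoded attachment is not a PDF');
    }

    return $bytes;
}
```

To show the result in the browser, you would send `header('Content-Type: application/pdf')` and then `echo` the returned bytes, before any other output. Note that this only works if the original base64 string is still present in `_source` for the documents you indexed.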