html, search, indexing, search-engine, sphinx

How do I add HTML and text files to a Sphinx index?


From the Sphinx reference manual: «The data to be indexed can generally come from very different sources: SQL databases, plain text files, HTML files, mailboxes, and so on»

But I can't find out how to add text files and HTML files to the index. The quick Sphinx usage tour shows the setup for a MySQL database only.

How can I do this?


Solution

  • You should look at the xmlpipe2 data source.

    From the manual:

    xmlpipe2 lets you pass arbitrary full-text and attribute data to Sphinx in yet another custom XML format. It also allows to specify the schema (ie. the set of fields and attributes) either in the XML stream itself, or in the source settings.
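
    As a rough illustration, a small script can walk a directory of text and HTML files and print them as an xmlpipe2 stream. This is only a sketch under several assumptions: the field names `path` and `content`, the function names, and the script name `make_docset.py` are all made up for the example, document IDs are simply assigned in sorted file order, and HTML is reduced to text with Python's standard `html.parser` (so the contents of `<script>` and `<style>` elements would be indexed too).

```python
import sys
from html.parser import HTMLParser
from pathlib import Path
from xml.sax.saxutils import escape

# Hypothetical sphinx.conf source that would consume this script's output:
#
#   source files
#   {
#       type            = xmlpipe2
#       xmlpipe_command = python3 make_docset.py /path/to/docs
#   }

HTML_SUFFIXES = {".html", ".htm"}
TEXT_SUFFIXES = {".txt"}


class _TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML document and drops the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def html_to_text(markup):
    """Return the text of an HTML string with whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(markup)
    parser.close()
    # Join chunks with a space so words from adjacent elements do not merge.
    return " ".join(" ".join(parser.chunks).split())


def build_docset(root):
    """Return an xmlpipe2 document set for the text/HTML files under root."""
    lines = [
        '<?xml version="1.0" encoding="utf-8"?>',
        "<sphinx:docset>",
        "<sphinx:schema>",
        '<sphinx:field name="path"/>',
        '<sphinx:field name="content"/>',
        "</sphinx:schema>",
    ]
    wanted = HTML_SUFFIXES | TEXT_SUFFIXES
    paths = sorted(p for p in Path(root).rglob("*")
                   if p.is_file() and p.suffix.lower() in wanted)
    # Document IDs must be unique non-zero integers; here they follow file order.
    for doc_id, path in enumerate(paths, start=1):
        raw = path.read_text(encoding="utf-8", errors="replace")
        text = html_to_text(raw) if path.suffix.lower() in HTML_SUFFIXES else raw
        lines.append(f'<sphinx:document id="{doc_id}">')
        lines.append(f"<path>{escape(str(path))}</path>")
        lines.append(f"<content>{escape(text)}</content>")
        lines.append("</sphinx:document>")
    lines.append("</sphinx:docset>")
    return "\n".join(lines) + "\n"


if __name__ == "__main__" and len(sys.argv) == 2:
    # Write bytes directly so the output is UTF-8 regardless of the locale.
    sys.stdout.buffer.write(build_docset(sys.argv[1]).encode("utf-8"))
```

    Because the IDs here depend on file order, they shift whenever files are added or removed; a real setup would probably want stable IDs (for example, stored in a lookup table). Files containing control characters that are not legal in XML would also need extra cleaning before indexing.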