Search code examples
sphinx

How to select columns in sql query of source in sphinx


I'm using sphinx indexer to create a dictionary based on documents in my mysql db, but I can't limit the source's sql query to selected columns.

This is the command I use

indexer --buildstops dict.txt 1000 --verbose --print-queries --dump-rows listing_rows --buildfreqs listing_core -c config/development.sphinx.conf

With the following source in development.sphinx.conf, no documents are found and the dict.txt is empty

source listing_source {
  type = mysql
  sql_host = mysql
  sql_user = sharetribe
  sql_pass = secret
  sql_db = sharetribe_development
  sql_query = SELECT title AS title, description AS description FROM listings;
}

Output

Sphinx 2.2.10-id64-release (2c212e0)
Copyright (c) 2001-2015, Andrew Aksyonoff
Copyright (c) 2008-2015, Sphinx Technologies Inc (http://sphinxsearch.com)

using config file 'config/development.sphinx.conf'...
WARNING: key 'max_matches' was permanently removed from Sphinx configuration. Refer to documentation for details.
WARNING: key 'charset_type' was permanently removed from Sphinx configuration. Refer to documentation for details.
WARNING: key 'enable_star' was permanently removed from Sphinx configuration. Refer to documentation for details.
WARNING: key 'charset_type' was permanently removed from Sphinx configuration. Refer to documentation for details.
WARNING: key 'enable_star' was permanently removed from Sphinx configuration. Refer to documentation for details.
indexing index 'listing_core'...
building stopwords list...
SQL-CONNECT: ok
SQL-QUERY: SELECT title AS title, description AS description FROM listings;: ok
total 0 docs, 0 bytes
total 0.008 sec, 0 bytes/sec, 0.00 docs/sec
total 0 reads, 0.000 sec, 0.0 kb/call avg, 0.0 msec/call avg
total 0 writes, 0.000 sec, 0.0 kb/call avg, 0.0 msec/call avg

When I change the sql_query to return all columns, the indexer finds the expected number of documents (2) and adds them to the dictionary.

source listing_source {
  type = mysql
  sql_host = mysql
  sql_user = sharetribe
  sql_pass = secret
  sql_db = sharetribe_development
  sql_query = SELECT * FROM listings;
}

Output:

Sphinx 2.2.10-id64-release (2c212e0)
Copyright (c) 2001-2015, Andrew Aksyonoff
Copyright (c) 2008-2015, Sphinx Technologies Inc (http://sphinxsearch.com)

using config file 'config/development.sphinx.conf'...
WARNING: key 'max_matches' was permanently removed from Sphinx configuration. Refer to documentation for details.
WARNING: key 'charset_type' was permanently removed from Sphinx configuration. Refer to documentation for details.
WARNING: key 'enable_star' was permanently removed from Sphinx configuration. Refer to documentation for details.
WARNING: key 'charset_type' was permanently removed from Sphinx configuration. Refer to documentation for details.
WARNING: key 'enable_star' was permanently removed from Sphinx configuration. Refer to documentation for details.
indexing index 'listing_core'...
building stopwords list...
SQL-CONNECT: ok
SQL-QUERY: SELECT * FROM listings;: ok
total 2 docs, 485 bytes
total 0.008 sec, 56303 bytes/sec, 232.18 docs/sec
total 0 reads, 0.000 sec, 0.0 kb/call avg, 0.0 msec/call avg
total 0 writes, 0.000 sec, 0.0 kb/call avg, 0.0 msec/call avg

How can I limit the query to return selected columns only?


Solution

  • sql_query = SELECT title AS title, description AS description FROM listings;

    Doesn't work because it doesn't have a document_id. Add a id column (as the first!) and it should work. You also dont need 'AS' if using same name. (* probably worked, because a id like column happened to be first :)

    So just make sure include a id, and then name your columns want as fields ...

    sql_query = SELECT id, title, description FROM listings