Search code examples
mysqlsphinxthinking-sphinx

sphinx search indexer does not index all the content in a field


Something odd seems to be happening with a sql_query in the configuration for a MySQL database. The relevant part of the sql_query in the sphinx.conf is as follows:

sql_query = ... GROUP_CONCAT(DISTINCT IFNULL(`vendor_items`.`description`, '0') 
SEPARATOR ' ') 
AS `vendor_item_descriptions`,...

The problem is that in some cases of only part this description field is indexed. It seems like indexing goes up to a point, and then indexes the field content no further. I figured this out by searching for a bunch of words from that field for one particular item. It appears that about first 1/3rd of the description content for that item is indexed. I thought it might be character encoding issue so I stripped the description field of any non-printable UTF-8 characters & reindexed. But no luck.

Has anyone else experienced this kind of problem or know what the issue could be?


Solution

  • I would suggest you look at the group_concat_max_len MySQL variable.

    http://dev.mysql.com/doc/refman/5.0/en/group-by-functions.html#function_group-concat

    http://dev.mysql.com/doc/refman/5.0/en/server-system-variables.html#sysvar_group_concat_max_len

    In sphinx terms can use a sql_query_pre to update the value for the session