Search code examples
mysqlsearch-enginesphinx

Sphinx Search, compound key


After my previous question (http://stackoverflow.com/questions/8217522/best-way-to-search-for-partial-words-in-large-mysql-dataset), I've chosen Sphinx as the search engine above my MySQL database.

I've done some small tests with it, and it looks great. However, i'm at a point right now, where I need some help / opinions.

I have a table articles (structure isn't important), a table properties (structure isn't important either), and a table with values of each property per article (this is what it's all about). The table where these values are stored, has the following structure:

articleID   UNSIGNED INT
propertyID  UNSIGNED INT
value       VARCHAR(255)

The primary key is a compound key of articleID and propertyID.

I want Sphinx to search through the value column. However, to create an index in Sphinx, I need a unique id. I don't have right here. Also when searching, I want to be able to filter on the propertyID column (only search values for propertyID 2 for example, which I can do by defining it as attribute).

On the Sphinx forum, I found I could create a multi-value attribute, and set this as query for my Sphinx index:

SELECT articleID, value, GROUP_CONCAT(propertyID) FROM t1 GROUP BY articleID

articleID will be unique now, however, now I'm missing values. So I'm pretty sure this isn't the solution, right?

There are a few other options, like:

  • Add an extra column to the table, which is unique
  • Create a calculated unique value in the query (like articleID*100000+propertyID)

Are there any other options I could use, and what would you do?


Solution

  • In your suggestions

    • Add an extra column to the table, which is unique

    This can not be done for an existing table with large number of records as adding a new field to a large table take some time and during that time the database will not be responsive.

    • Create a calculated unique value in the query (like articleID*100000+propertyID)

    If you do this you have to find a way to get the articleID and propertyID from the calculated unique id.

    Another alternative way is that you can create a new table having a key field for sphinx and another two fields to hold articleID and propertyID.

    • new_sphinx_table with following fields

      id - UNSIGNED INT/ BIGINT

      articleID - UNSIGNED INT

      propertyID - UNSIGNED INT

    Then you can write an indexing query like below

    SELECT id, t1.articleID, t1.propertyID, value FROM t1 INNER JOIN new_sphinx_table nt ON t1.articleID  = nt.articleID AND t1.propertyID = nt.propertyID;
    

    This is a sample so you can modify it to fit to your requirements.

    What sphinx return is matched new_sphinx_table.id values with other attributed columns. You can get result by using new_sphinx_table.id values and joining your t1 named table and new_sphinx_table