I'm working through implementing Sphinx as a search server for my website. The website allows teachers to build assessment items -- which I'm looking to index -- that can include:
I'm new to both Sphinx and the direct use of MySQL, since my other searches have been through CakePHP's functions.
In building a view through which the questions in the database can be indexed, I've come up with the following SQL query (it's big, but I'm not going to ask you to debug it):
SELECT
questions.id,
CONCAT(
IFNULL(questions.context_template, ''),
IFNULL(context_answers_concat.context_answer_templates, ''),
IFNULL(parts_concat.full_parts, '')
) as full_question,
parts_concat.all_categories
FROM
questions
LEFT JOIN(
SELECT
question_answers.question_id,
GROUP_CONCAT(
question_answers.answer_template
) AS context_answer_templates
FROM
question_answers
GROUP BY
question_answers.question_id
) AS context_answers_concat
ON
context_answers_concat.question_id = questions.id
LEFT JOIN(
SELECT
question_parts.question_id,
GROUP_CONCAT(
CONCAT(
question_parts.question_template,
IFNULL(answers_concat.answer_templates, '')
)
) AS full_parts,
GROUP_CONCAT(
categories_concat.part_categories
) AS all_categories
FROM
question_parts
LEFT JOIN(
SELECT
question_answers.question_part_id,
GROUP_CONCAT(
question_answers.answer_template
) AS answer_templates
FROM
question_answers
GROUP BY
question_answers.question_part_id
) AS answers_concat
ON
answers_concat.question_part_id = question_parts.id
LEFT JOIN(
SELECT
question_categories.question_part_id,
GROUP_CONCAT(DISTINCT categories.type) AS part_categories
FROM
question_categories,
categories
WHERE
question_categories.category_id = categories.id
GROUP BY
question_categories.question_part_id
) AS categories_concat
ON
categories_concat.question_part_id = question_parts.id
GROUP BY
question_parts.question_id
) AS parts_concat
ON
parts_concat.question_id = questions.id
I've run the query in phpMyAdmin, and it generates the view I expected -- an id column, a column containing the text of the questions and answers, and a column for the categories.
My question is, because it's so big, is this a reasonable query to use for indexing something like this in Sphinx? If not, should I be using other parts of Sphinx to do some of this work? Or different SQL functions?
As long as you're good with the query's performance and the query returns what you need I don't see any problem. I saw queries even bigger and Sphinx / Manticore worked fine with them.
Alternatively you can use real-time indexes, in that case you would need to insert data to the index rather than taking it from mysql using indexer.
If you want to stick with plain indexes and using indexer think about I'd also recommend you think about splitting your index into main and delta parts so you can rebuild the delta part more frequently. Unless your index is small and takes insignificant time to rebuild, then you don't need to worry about that.