Search code examples
phpmysqlsortingsearch-engine

Building a Tag Search Engine in Mysql - Trouble with sorting


I would like to do a simple search Engine about videos (like a very simple Shutterstock) based on PHP/MySQL. So here's the deal. A video has multiple keywords (or tags). A user type in a search bar some keywords then he gets a list of the videos with these keywords in (ordered by relevance).

I've created 3 Mysql tables :

  • video : video_id, video_name, video_format, and so...
  • tag : tag_id, tag_name
  • video_tag : video_id, tag_id // A join table which link tag and video (as a video has 1 or multiple keywords and a keyword is used by 1 or multiple videos)

The relevance of the results is important, here is the logic:

If the user types in the search bar: water, wood and nature, the results would be like:

 (1) the videos which associate water, wood and nature
 (2) the videos which associate water and wood
 (3) the videos which associate water and nature
 (4) the videos which associate wood and nature
 (5) the videos which associate water
 (6) the videos which associate wood
 (7) the videos which associate nature

So i've a made an algorithm in php which builds a multidimensional array of this kind :

[0] [water, wood, nature]
[1] [water, wood]
[2] [water, nature]
[3] [nature, water]
[4] [wood, nature]
[5] [water]
[6] [wood]
[7] [nature]

Next, I send this array in the MySQL request and here is where I'm stuck ... What I have for my request:

    SELECT video.video_id,video.video_name, GROUP_CONCAT(tag.tag_name) AS tags

    FROM
        video_tag
        INNER JOIN video
          ON video_tag.video_id = video.video_id
        INNER JOIN tag
          ON tag.tag_id = video_tag.tag_id

    GROUP BY video_id

    HAVING tags LIKE '%water%' AND tags LIKE '%wood%' AND tags LIKE '%nature%.'
    OR tags LIKE '%water%' AND tags LIKE '%wood%'
    OR tags LIKE '%water%' AND tags LIKE '%nature%'
    OR tags LIKE '%wood%' AND tags LIKE '%nature%'
    OR tags LIKE '%water%'
    OR tags LIKE '%wood%'
    OR tags LIKE '%nature%'

But the order of the result is not right. Everything is sorted by video_id.

video_id            video_name            tags
1                   video 01            wood, keyword4, keyword5, keyword 6, keyword 7
2                   video 02            wood, nature, water, keyword 6
3                   video 03            nature, water, keyword 7

The order I want is the following:

video_id            video_name            tags
2                   video 02            wood, nature, water, keyword 6
3                   video 03            nature, water, keyword 7
1                   video 01            wood, keyword4, keyword5, keyword 6, keyword 7

Is there a way in Mysql to get the results ordered by the "natural" order of the request? Am I doing this the right way? Did someone already encountered this kind of trouble and can help me? :)


Solution

  • Thank you. With your proposition, I finally realised that I was overcomplicated myself. No need of a multidimensional array or combinatory logic ...

    Here is the solution I found :

    SELECT DISTINCT video.video_name, GROUP_CONCAT(tag.tag_name) AS tags
    
    FROM
        video_tag
        INNER JOIN video
          ON video_tag.video_id = video.video_id
        INNER JOIN tag
          ON tag.tag_id = video_tag.tag_id
    
    WHERE tag.tag_name= 'wood' OR tag.tag_name = 'nature' OR tag.tag_name = 'corporate'
    
    GROUP BY video_id
    ORDER BY COUNT(*) DESC;
    

    Like that, I select all the videos linked with the needed tags. Then I just count how many tags a video have... And sort with that number... Finally, my trouble was very easy.

    Sorry for the disturbance ;)