Search code examples
mysqllistcsvfind-in-set

Comparing lists in MySQL


Say I have a movie table and a genre table. linked through a many to many relationship.

TABLE movie
id
name

_

TABLE movie_genre
movie_fk
genre_fk

_

TABLE genre
id
name

So nice and normalised. But I'm also also importing a CSV file into a table, which is of the format:

TABLE csv
name, genres 
'Die Hard', 'action~drama'

Now I want to check for changes, whether the csv has genres listed I don't have or vice versa so I display the changes to the user and later sync them

I'm doing it like so :

SELECT * FROM movie 
JOIN movie_genre ON movie.id = movie_genre.movie_fk 
JOIN genre ON genre.id = movie_genre.genre_fk 
WHERE 
FIND_IN_SET(genre.name, REPLACE(csv.genres, '~', ',')) = 0 

Issue is this will only flag changes one way. E.g. If in my database I have Die Hard related to genres: action, drama

and the CSV contains action,drama, horror

because each of the genres in my database are included in the csv data it won't get marked as change.

Please note the genres listed in the csv may not be listed in any particular order.

Hopefully I've explained that thoroughly enough.

How can I achieve what I'm trying to do ? Can it be done using REGEX or a custom MySQL function ?


Solution

  • If anyone's interested, not the most elegant solution but I solved this by doing the following:

    • firstly I updated the import code to ensure that the imported genre list column gets sorted into alphabetical order.
    • then I updated my query to do a direct string comparison on the result of a group_concat

    _

    SELECT * FROM movie 
    LEFT JOIN 
    ( 
    SELECT GROUP_CONCAT(genre.name ORDER BY genre.name SEPERATOR "~") AS genres, movie_genre.movie_fk FROM genre 
    JOIN movie_genre ON genre.id = movie_genre.genre_fk 
    GROUP BY movie_genre.movie_fk
    ) AS sub_genres ON movie.id = sub_genres.movie_fk
    WHERE sub_genres.genres != csv.genres