Search code examples
mysqlsqljoingroup-concat

MySQL - group_concat pulling in additional incorrect data


I'm having trouble with a JOIN and a GROUP_CONCAT. The query is concatenating additional data that should not be associated with the join.

Here's my table structure:

linkages
ID       table_name     tag_id
1        subcategories  6
2        categories     9

music
ID       artwork
1        5
2        4

artwork
ID       url_path
1        /some/file/path
2        /some/file/path

And here's my query:

SELECT music.*,
       artwork.url_path AS artwork_url_path,
       GROUP_CONCAT( linkages.tag_id ) AS tag_ids,
       GROUP_CONCAT( linkages.table_name ) AS table_name 
FROM music
LEFT JOIN artwork ON artwork.id = music.artwork 
LEFT JOIN linkages ON music.id = linkages.track_id 
WHERE music.id IN( '1356',
                   '1357',
                   '719',
                   '169',
                   '170',
                   '171',
                   '805' )
ORDER BY FIELD( music.id,
                1356,
                1357,
                719,
                169,
                170,
                171,
                805 )

This is the result of the GROUP_CONCAT :

[tag_ids] => 3, 6, 9, 17, 19, 20, 26, 49, 63, 64, 53, 57, 63, 65, 67, 73, 79, 80, 85, 96, 98, 11, 53, 67, 3, 6, 15, 17, 26, 38, 50, 63, 74, 53, 56, 57, 62, 63, 65, 66, 67, 72, 85, 88, 98, 24, 69, 71, 3, 6, 15, 17, 26, 38, 50

The first portion of the result is correct:

[tag_ids] => 3, 6, 9, 17, 19, 20, 26, 49, 63, 64, 53, 57, 63, 65, 67, 73, 79, 80, 85, 96, 98, 11, 53, 67

Everything after the correct values seems random and most of the values don't exist in the result in the database, but it's still pulling it in. It seems to repeat a portion of the correct result (3, 6, 15, 17 - the 3, 6, 17 are correct, but 15 shouldn't be there, similar with a bunch of other numbers - 71, etc. I can't use DISTINCT because I need to match up the tag_ids and table_name results as a multidimensional array from the results.

Any thoughts as to why?

UPDATE: I ended up solving it with the initial push from Gordon. It needed a GROUP_BY clause, otherwise it was putting every results tag id's in each result. The final query ended up becoming this:

SET SESSION group_concat_max_len = 1000000;
SELECT 
music.*, 
artwork.url_path as artwork_url_path,
GROUP_CONCAT(linkages.tag_id, ':', linkages.table_name) as tags 
FROM music 
LEFT JOIN artwork ON artwork.id = music.artwork 
LEFT JOIN linkages ON music.id = linkages.track_id 
WHERE music.id IN('1356', '1357', '719', '169', '170', '171', '805') 
GROUP BY music.id
ORDER BY FIELD(music.id,1356,1357,719,169,170,171,805);

Solution

  • Your join is generating duplicate rows. I would suggest that you fix the root cause of the problem. But, a quick-and-dirty solution is to use group_concat(distinct):

    GROUP_CONCAT(DISTINCT linkages.tag_id) as tag_ids, 
    GROUP_CONCAT(DISTINCT linkages.table_name) as table_name 
    

    You can put the columns in a single field using GROUP_CONCAT():

    GROUP_CONCAT(DISTINCT linkages.tag_id, ':', linkages.table_name) as tags