Search code examples
mysqlsqlsql-order-byaggregate-functionsdistinct-values

How to echo all rows (related to DISTINCT)


This is my SQL Fiddle

As you can see here, If I use DISTINCT then, there are 2 problems

1.)Only 1st recommendations_vote_average coloum's number is correct. All other numbers are in wrong order

2.) Only 2 Number get's printed.

If i do not use DISTINCT, all numbers are 7.5 (i.e first vote_average)

How to show all the (10) numbers in correct order?

Expected output

movie_title       recommendations_vote_average                    recommendations_title
                                                                  The Dark Knight Rises,Batman Begins,Iron Man,The Lord of the Rings: The Return of the King,The Lord of the Rings: The The Fellowship of the Ring,The Lord of the Rings: The Two Towers,The Matrix,Inception,Iron Man 2,Captain America: The First Avenger
The Dark Knight   7.5,7.5,7.3,8.1,8,7.9,7.9,8,6.6,6.6

SQL Fiddle Code:

CREATE TABLE tmdb_movies (
  tmdb_id INTEGER NOT NULL PRIMARY KEY,
  movie_title TEXT NOT NULL
);

INSERT INTO tmdb_movies (tmdb_id, movie_title) VALUES
(1, 'The Dark Knight');


CREATE TABLE recommendations (
  recommendations_tmdb_id INTEGER NOT NULL,
  recommendations_title TEXT NOT NULL,
  recommendations_vote_average TEXT NOT NULL
);




INSERT INTO recommendations (recommendations_tmdb_id, recommendations_title, recommendations_vote_average) VALUES
(1, 'The Dark Knight Rises', '7.5'),
(1, 'Batman Begins', '7.5'),
(1, 'Iron Man', '7.3'),
(1, 'The Lord of the Rings: The Return of the King', '8.1'),
(1, 'The Lord of the Rings: The The Fellowship of the Ring', '8'),
(1, 'The Lord of the Rings: The Two Towers', '7.9'),
(1, 'The Matrix', '7.9'),
(1, 'Inception', '8'),
(1, 'Iron Man 2', '6.6'),
(1, 'Captain America: The First Avenger', '6.6');


SELECT tmdb_movies.movie_title
,GROUP_CONCAT(DISTINCT recommendations.recommendations_vote_average) as recommendations_vote_average
,GROUP_CONCAT(DISTINCT recommendations.recommendations_title) as recommendations_title
FROM tmdb_movies 

LEFT JOIN recommendations ON recommendations.recommendations_tmdb_id=tmdb_movies.tmdb_id

Where tmdb_movies.tmdb_id=1

GROUP BY tmdb_movies.movie_title

Solution

  • It's hard to guess what you want from your question. You mentioned "correct order" without defining it.

    You can use GROUP_CONCAT() in these ways.

    GROUP_CONCAT(a.b)  -- gets all the items in column b -- cardinality preserved
    
    GROUP_CONCAT(DISTINCT a.b) -- distinct values in column b -- cardinality reduced
    
    GROUP_CONCAT(a.b ORDER BY a.b) -- all items in b in order
    
    GROUP_CONCAT(DISTINCT a.b ORDER BY a.b) -- distinct items in b in order
    
    GROUP_CONCAT(a.b ORDER BY a.c) -- all items in b in the same order as c
    

    I'm not completely sure what it means to an application to add DISTINCT to the last one.

    If you're trying to get two concatenated columns in corresponding order, you can't use DISTINCT in either one; DISTINCT has the potential to remove duplicate values.

    Your result set column mentions _average. You get an actual average (arithmetic mean) with AVG(value). and that gives a single aggregate number.

    If you want a list of scores in one column and a corresponding list of titles in another, try this.

    GROUP_CONCAT(
                recommendations.recommendations_vote_average
       ORDER BY recommendations.recommendations_title
     ) AS recommendations_vote_average,
    GROUP_CONCAT(
                recommendations.recommendations_title
       ORDER BY recommendations.recommendations_title
     ) AS recommendations_title
    

    That presents the two concatenated lists in order by the title.

    You may not be aware of this: rows in DBMS tables have no inherent order. If you say SELECT * FROM table (without an ORDER BY clause) more than once, and the rows come out in the same order each time, it is an accident. There's nothing in your recommendations table -- no unique id values for example -- to give the order of those items except the scores and the titles. So you may not be able to get the exact order you want.

    Many tables contain an autoincrementing id column (but yours does not). Using such an id column in ORDER BY clauses is a way to get repeatable ordering.

    Pro tip: Denormalized data (comma separated data in columns, for example) is often considered harmful. GROUP_CONCAT() turns normalized data, like your input, into denormalized data. So use it sparingly and only when you need it.