Search code examples
mysqlsqldataframedata-extraction

What am I getting wrong in this SQL query?


Write a query that retrieves only a ranked list of the most prolific days in October 2020, prolific measured in number of posts per day. Your query should return those days in a single-column table (column name post_day) in the format YYYY-MM-DD.

This is my table:

CREATE TABLE posts(
            postid INT NOT NULL,
            posted_at DATETIME, 
            num_comments INT,
            score INT,
            selftext TEXT,
            title VARCHAR(10000),
            total_awards_received INT,
            upvote_ratio DOUBLE,
            id INT,
            PRIMARY KEY  (postid),
            FOREIGN KEY (id) REFERENCES users(id)

This is my query:

try:
    with connection.cursor() as cur:
        q = """
                SELECT CAST(posted_at AS DATE) AS post_day
                FROM posts p
                WHERE posted_at BETWEEN '2020-10-01' AND '2020-10-30'
                HAVING count(post_day)
                ORDER BY 1 DESC 
        """
        cur.execute(q)
        results = cur.fetchall()

finally:
    connection.close()
return results

The problem is that I'm only getting one result, not a descending order of dates:

[{'post_day': datetime.date(2020, 11, 9)}]

Solution

  • add GROUP BY CAST(posted_at AS DATE) after WHERE clause and before HAVING clause. Also, check your WHERE clause. You are likely missing the last couple of days.

    SELECT CAST(posted_at AS DATE) AS post_day
    FROM posts p
    WHERE posted_at > '2020-10-01' AND posted_at < '2020-11-01'
    GROUP BY CAST(posted_at AS DATE)
    HAVING COUNT(post_day)
    ORDER BY COUNT(post_day) DESC