I'm quite new to Python and I'm trying to get the following done:
1 and 2 I got through no problem but when I try to do 3, I ran into some trouble. The following is how the table looks like:
As you can see, I have duplicated Titles with different PubDate. What I've been trying to do is to follow the steps ilustrated from Microsoft Kb -> Support Microsoft.
I was able to get through steps 1 - 3 which does:
my code is slightly different from what was shown in the MSFT link:
import sqlite3
cur = db.cursor()
cur.execute('''create table holdkey as
select Title,count(*)
from bar group by Title
having count(*)>1''')
cur.execute('''create table holdup as
select distinct bar.*
from bar, holdkey
where bar.Title = holdkey.Title''')
What I need help on is that in my holdup table (as seen in the screenshot), I still get duplicate Titles because the pubDate is different and that it is known as distinct by SQLite. Is there a way to delete the duplicates based on the PubDate?
I would ideally just keep the latest one, but getting rid of the duplicate is more important.
Let me know if more clarification is required. Thanks
Try
SELECT b.symbol, b.title, b.link, b.pubdate
FROM bar b JOIN
(
SELECT title, MAX(pubdate) pubdate
FROM bar
GROUP BY title
) q ON b.title = q.title
AND b.pubdate = q.pubdate
Here is SQLFiddle demo