I'd like to compare the popularity of tags between two months, ordered by the biggest change.
I've worked out how to count the number of tags in a month, but not how to compare them. Here's what I have so far:
select TOP 10
tags.tagname, count(*) AS tagcount
from Posts
INNER JOIN PostTags ON PostTags.PostId = Posts.id
INNER JOIN Tags ON Tags.id = PostTags.TagId
where
datepart(year, Posts.CreationDate) = 2011 and
datepart(month, Posts.CreationDate) = 1
Group by tags.tagname
Order by tagcount DESC
https://data.stackexchange.com/stackoverflow/qe/924/query-count-tags-from-daterange
(note: you can clone, edit in place, and run it)
I'm new to SQL. It seems I just need to create a second query for the previous month (December 2010), and then combine these two queries, with a column that is prevMonth.count - nextMonth.count
, and order by that column (getting just the top X, so it doesn't take forever).
But I can't work out how to combine two queries in this way - I think you should be able to nest them, but I can't get it to work. Another way is to create a temporary table - it seems inefficient to me, but maybe that is the right way?
Many thanks for any help!
BTW: what I'd like to do next:
After this, I'd like to find the rate of growth (not just the absolute change in numbers). That's just (new-old)/old
. Like velocity, but normalized.
Then, the tags whose rate of growth is increasing the most - i.e. that have exponential growth. This is would require 3 months: calculate the rate of change between months 1 and 2, and between months 2 and 3. The difference between these is the rate of change of the rate of change. It's like acceleration.
[ This is as a signal for new technologies growing, which often start small in a very specific usage. The people in that small area talk to each other, and good ideas get passed on in a chain-reaction: one person tells two, they each tell two more and so on. The that niche might get converted fully after a while, and maybe it spreads to another, similar niche. See "Seeing What's Next", by the Innovator's Dilemma guy. ]
Here's a later version, using JNK's answer: https://data.stackexchange.com/stackoverflow/q/92869/query-tags-with-highest-increase-in-growth-over-3-months
And... the fastest growing tag is... facebook-c#-sdk. Dunno how useful this is, but it's an interesting way to browse SO.
Use #Temp Tables:
-- QUERY: count tags from daterange
-- TODO: compare from two different dateranges...
select TOP 10
tags.tagname, count(*) AS tagcount
INTO #TagCountTemp1
from Posts
INNER JOIN PostTags ON PostTags.PostId = Posts.id
INNER JOIN Tags ON Tags.id = PostTags.TagId
where
datepart(year, Posts.CreationDate) = 2011 and
datepart(month, Posts.CreationDate) = 1
Group by tags.tagname
Order by tagcount DESC
select TOP 10
tags.tagname, count(*) AS tagcount
INTO #TagCountTemp2
from Posts
INNER JOIN PostTags ON PostTags.PostId = Posts.id
INNER JOIN Tags ON Tags.id = PostTags.TagId
where
datepart(year, Posts.CreationDate) = 2010 and
datepart(month, Posts.CreationDate) = 12
Group by tags.tagname
Order by tagcount DESC
SELECT TOP 10
t2.tagname, t2.tagcount as 'Month 1', t1.tagcount as 'Month 2', (t1.tagcount-t2.tagcount) as 'Increase'
FROM #TagCountTemp1 as t1
LEFT JOIN #TagCountTemp2 as t2
ON T1.tagname = t2.tagname
ORDER BY (t1.tagcount-t2.tagcount) desc
This worked fine for me!