I am trying to find a situation using the Stack Exchange Data Explorer (SEDE) where two distinct users on Stack Overflow have accepted an answer from each other. So for example:
Post A { Id: 1, OwnerUserId: "user1", AcceptedAnswerId: "user2" }
and
Post B { Id: 2, OwnerUserId: "user2", AcceptedAnswerId: "user1" }
I currently have a query that can find two users that have collaborated on more than question as questioner-answerer but it does not determine if that relationship is reciprocal:
SELECT user1.Id AS User_1, user2.Id AS User_2
FROM Posts p
INNER JOIN Users user1 ON p.OwnerUserId = user1.Id
INNER JOIN Posts p2 ON p.AcceptedAnswerId = p2.Id
INNER JOIN Users user2 ON p2.OwnerUserId = user2.Id
WHERE p.OwnerUserId <> p2.OwnerUserId
AND p.OwnerUserId IS NOT NULL
AND p2.OwnerUserId IS NOT NULL
AND user1.Id <> user2.Id
GROUP BY user1.Id, user2.Id HAVING COUNT(*) > 1;
For anyone unfamiliar with the schema, there are two tables like so:
Posts
--------------------------------------
Id int
PostTypeId tinyint
AcceptedAnswerId int
ParentId int
CreationDate datetime
DeletionDate datetime
Score int
ViewCount int
Body nvarchar (max)
OwnerUserId int
OwnerDisplayName nvarchar (40)
LastEditorUserId int
LastEditorDisplayName nvarchar (40)
LastEditDate datetime
LastActivityDate datetime
Title nvarchar (250)
Tags nvarchar (250)
AnswerCount int
CommentCount int
FavoriteCount int
ClosedDate datetime
CommunityOwnedDate datetime
And
Users
--------------------------------------
Id int
Reputation int
CreationDate datetime
DisplayName nvarchar (40)
LastAccessDate datetime
WebsiteUrl nvarchar (200)
Location nvarchar (100)
AboutMe nvarchar (max)
Views int
UpVotes int
DownVotes int
ProfileImageUrl nvarchar (200)
EmailHash varchar (32)
AccountId int
One CTE
and simple inner joins
will do the job. There's no need for so much code as I've observed in other answers. Note a lot of comments in mine.
Link to StackExchange Data Explorer with sample result saved
with questions as ( -- this is needed so that we have ids of users asking and answering
select
p1.owneruserid as question_userid
, p2.owneruserid as answer_userid
--, p1.id -- to view sample ids
from posts p1
inner join posts p2 on -- to fetch answer post
p1.acceptedanswerid = p2.id
)
select distinct -- unique pairs
q1.question_userid as userid1
, q1.answer_userid as userid2
--, q1.id, q2.id -- to view sample ids
from questions q1
inner join questions q2 on
q1.question_userid = q2.answer_userid -- accepted answer from someone
and q1.answer_userid = q2.question_userid -- who also accepted our answer
and q1.question_userid <> q1.answer_userid -- and we aren't self-accepting
This brings as an example posts:
Though, StackExchange may throw you timeout because of large dataset and the distinct
part. If you wish to view some data, remove distinct
and add top N
at start:
with questions as (
...
)
select top 3 ...