I'm trying to write a nested query using sqldf. The dataset is 'contact_fb'. I am trying to only take rows without clubmahindra and distinct names which are in column 'from_name' and then left join with the 'contact_fb ' to get all the info in the other columns. This is giving not the result I want:
contact_fb =structure(list(X = 1:6, from_name = c("Club Mahindra", "Club Mahindra","pinto", "valencia", "valencia", "Club Mahindra"), type = structure(c(2L, 2L, 2L, 1L, 1L, 2L), .Label = c("link","photo", "status", "video"), class = "factor")), .Names = c("X","from_name", "type"), row.names = c(NA, 6L), class = "data.frame")
My attempt at this is:
names_cm=sqldf("select t1.from_name, t2.* from (select distinct from_name from contact_fb where from_name!='Club Mahindra') as t1 left join ( select * from contact_fb ) as t2 on t1.from_name=t2.t1.from_name")
I could get it finally through
sqldf("select distinct(t1.from_name),t2.* from df t1 left join df t2 on (t1.from_name=t2.from_name) where t1.from_name!='Club Mahindra' group by t1.from_name")
I don't understand where I'm going wrong. Could I still get it through my way?
The output is:
3 Pinto photo
4 valencia link
The output shown at the end of the question seems to be the set of distinct from_name
, type
pairs for which from_name
is not "Club Mahindra"
and for that we don't need a join:
sqldf("select distinct from_name, type
from contact_fb
where from_name != 'Club Mahindra'")
giving:
from_name type
1 pinto photo
2 valencia link