I have the following R data frames. I am trying to get summary statistics from the final "score" data frame, grouped by its logical columns.
#original df
type <- c("A", "B", "C","D","E")
user <- c('user1','user2','user3','user4','user5')
text <-c('this is a tweet','this is a fb post','tweeting is fun','other text','another fb post')
tweet.mention <- c('TRUE','FALSE','TRUE','FALSE','FALSE')
fb.mention <- c('FALSE','TRUE','FALSE','FALSE','TRUE')
df1 <- cbind.data.frame(type, user, text,tweet.mention,fb.mention)
df1
#Remove records that are all FALSE
tweet<-as.logical(tweet.mention)
fb<-as.logical(fb.mention)
test<-cbind(tweet,fb)
true<-rowSums(test)
all<-cbind(test,true)
#Create score df
score<-subset(df1,true>=1)
#score API return
sentiment<-c(1,.5,2,-2)
#scored text
score<-cbind(score,sentiment)
The score df drops record 4, as it should, and contains the scored numeric value. I would now like the average sentiment score grouped by tweet.mention (1.5) and fb.mention (-0.75). I tried summary from base R, but it summarizes all rows together, so I think a group-by or subset is needed. I then tried describeBy from the psych package, but that did not help either.
Making matters more complicated, I won't always know the number of logical vectors, so I can't subset them manually by naming each column and testing ==TRUE. I can create a list or vector of the column headers to lapply over, but I am unsure what code or function would do the grouping.
I have read the base R and psych vignettes and checked the R Cookbook, but cannot find an answer. I greatly appreciate the help.
Two methods using base R:
> with(score, tapply(sentiment, list(tweet.mention, fb.mention), mean))
      FALSE  TRUE
FALSE    NA -0.75
TRUE    1.5    NA
and:
> aggregate(sentiment~tweet.mention+fb.mention, data=score, mean)
  tweet.mention fb.mention sentiment
1          TRUE      FALSE      1.50
2         FALSE       TRUE     -0.75
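The question also mentions that the number of logical columns is not known in advance. One possible way to handle that is to collect the flag column names into a vector and loop over them with sapply. This is only a minimal sketch: it rebuilds a trimmed copy of the example score data frame, and it assumes the flag columns can be recognised by a ".mention" suffix. The names flag_cols, is_set and group_means are made up for illustration.

```r
# Trimmed copy of the example `score` data frame (record 4 already dropped).
score <- data.frame(
  type          = c("A", "B", "C", "E"),
  user          = c("user1", "user2", "user3", "user5"),
  tweet.mention = c("TRUE", "FALSE", "TRUE", "FALSE"),
  fb.mention    = c("FALSE", "TRUE", "FALSE", "TRUE"),
  sentiment     = c(1, 0.5, 2, -2),
  stringsAsFactors = FALSE
)

# Assumption: every flag column name ends in ".mention".
# Swap in whatever rule or explicit vector identifies your flag columns.
flag_cols <- grep("\\.mention$", names(score), value = TRUE)

# For each flag column, average sentiment over the rows where the flag is TRUE.
group_means <- sapply(flag_cols, function(col) {
  # The flags are stored as text ("TRUE"/"FALSE"), so convert them first.
  is_set <- as.logical(as.character(score[[col]]))
  mean(score$sentiment[is_set])
})

group_means
```

Unlike tapply or aggregate, which group by each combination of flags, this computes one mean per flag column, so a row that is TRUE in more than one column would count toward each of those means.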