I need to analyse survey data to get the frequencies of a multiple-answer question variable. I'm using the questionr R package.
I understand that I need to use the 'multi.split' function to create the variable that I will be working with, but I need to know how to make it include answers that are not in the dataset, meaning answers that were part of the original question but were never selected during the survey and should therefore be displayed with the value 0.
Example: I have the following possible answers:
"red", "blue", "green" and "yellow"
and my data is (like in the example):
v <- c("red/blue","green","red/green","blue/red")
when I run this command:
multi.table(multi.split(v))
I get the following result:
n %multi
v.blue 2 50
v.red 3 75
v.green 2 50
but I would like to get:
n %multi
v.blue 2 50
v.red 3 75
v.green 2 50
v.yellow 0 0
Any ideas on how I can do that?
I have never used this package before, but I'll give it a try.
The function multi.split() produces a data.frame, so if you want to add another column before getting the statistics, you could do something like the following:
library(questionr)
v <- c("red/blue","green","red/green","blue/red")
a <- multi.split(v)
a$v.yellow <- 0
multi.table(a)
## > multi.table(a)
## n %multi
## v.blue 2 50
## v.red 3 75
## v.green 2 50
## v.yellow 0 0
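As a cross-check that does not depend on the package, the same counts can be sketched in base R. This is only an illustration under my own assumptions: the names possible, answers, n and result are made up for the example, and the pct column merely mirrors what %multi appears to report (share of respondents who picked each answer).

```r
v <- c("red/blue", "green", "red/green", "blue/red")
possible <- c("red", "blue", "green", "yellow")  # every answer offered in the survey

# Split each response into the individual answers it contains.
answers <- strsplit(v, "/", fixed = TRUE)

# For every possible answer, count the respondents who selected it.
# Answers nobody selected naturally get a count of 0.
n <- vapply(
  possible,
  function(p) sum(vapply(answers, function(x) p %in% x, logical(1))),
  integer(1)
)

result <- data.frame(
  n = n,
  pct = 100 * n / length(v),
  row.names = paste0("v.", possible)
)
print(result)
```

Because the loop runs over the list of possible answers rather than over the answers found in the data, "yellow" shows up with 0 without any special handling.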
Update: A more generic version would go something like this.
1. wanted_data is a character vector of the column names that you always want in your output.
2. col.to.add holds the wanted columns that are not present in the data.frame a.
3. Then assign 0 to the columns that were not present.
4. Finally, order the columns so we always have them in the same order.
library(questionr)
v <- c("red/blue","green","red/green","blue/red")
wanted_data <- c("v.red","v.blue","v.green","v.yellow")
a <- multi.split(v)
col.to.add <- wanted_data[!(wanted_data %in% colnames(a))]
a[col.to.add] <- 0
a <- a[, order(colnames(a))]  # assign the result, otherwise the reordering is discarded
multi.table(a)
## > multi.table(a)
## n %multi
## v.blue 2 50
## v.green 2 50
## v.red 3 75
## v.yellow 0 0
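If you need this for several questions, the steps can be wrapped in a small helper. The sketch below is hypothetical: add_missing_columns is my own name, not a questionr function, and the data.frame a is built by hand as a stand-in for multi.split(v), assuming that function returns 0/1 indicator columns. Here the columns are returned in the order given by wanted rather than alphabetically.

```r
# Hypothetical helper (not part of questionr): make sure every wanted column
# exists, fill the absent ones with 0, and return the columns in the wanted order.
add_missing_columns <- function(df, wanted) {
  for (col in setdiff(wanted, colnames(df))) {
    df[[col]] <- 0
  }
  df[, wanted, drop = FALSE]
}

# Hand-built stand-in for multi.split(v) with
# v <- c("red/blue", "green", "red/green", "blue/red")
a <- data.frame(
  v.blue  = c(1, 0, 0, 1),
  v.red   = c(1, 0, 1, 1),
  v.green = c(0, 1, 1, 0)
)

wanted_data <- c("v.red", "v.blue", "v.green", "v.yellow")
a <- add_missing_columns(a, wanted_data)
print(colSums(a))
```

With questionr loaded, the returned data.frame could then be passed to multi.table() in the same way as above.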