I am doing some association rule mining in R and want to extract my results so I can build reports. My results look like this:
> inspect(rules[1:3])
  lhs         rhs            support confidence lift
1 {apples} => {oranges}      0.00029       0.24  4.4
2 {apples} => {pears}        0.00022       0.18 45.6
3 {apples} => {pineapples}   0.00014       0.12  1.8
How do I extract the "rhs" here, i.e. a vector of oranges, pears and pineapples?
Further, how do I extract information from the summary? The object is of type "S4". I have no problem extracting values when the output is a list etc., but how do you do the equivalent here?
> summary(rules)
set of 3 rules
rule length distribution (lhs + rhs):sizes
2
3
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      2       2       2       2       2       2
I want to extract the "3" from the "set of 3 rules".
I have gotten as far as using "@" (see "What does the @ symbol mean in R?").
But once I use that, how do I turn my results into a vector? That is,
inspect(rules@rhs)
1 {oranges}
2 {pears}
3 {pineapples}
should become a character vector of length 3.
inspect isn't returning anything; it just prints its output. When this happens, you can use the function capture.output if you want to save the output as a string. For example, to get the rhs:
library(arules)
data(Adult)
rules <- apriori(Adult, parameter = list(support = 0.4))
inspect(rules[1:3])
#   lhs    rhs                            support   confidence lift
# 1 {}  => {race=White}                   0.8550428 0.8550428  1
# 2 {}  => {native-country=United-States} 0.8974243 0.8974243  1
# 3 {}  => {capital-gain=None}            0.9173867 0.9173867  1
## Capture it, and extract rhs
out <- capture.output(inspect(rules[1:3]))
gsub("[^{]+\\{([^}]*)\\}[^{]+\\{([^}]*)\\}.*", "\\2", out)[-1]
# [1] "race=White" "native-country=United-States"
# [3] "capital-gain=None"
However, it looks like you can just access this information from the rules with the function rhs:
str(rhs(rules)@itemInfo)
# 'data.frame': 115 obs. of 3 variables:
# $ labels :Class 'AsIs' chr [1:115] "age=Young" "age=Middle-aged" "age=Senior" "age=Old" ...
# $ variables: Factor w/ 13 levels "age","capital-gain",..: 1 1 1 1 13 13 13 13 13 13 ...
# $ levels : Factor w/ 112 levels "10th","11th",..: 111 63 92 69 30 54 65 82 90 91 ...
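Note that itemInfo lists every item label in the data set, not the items on each rule's right-hand side. To get one label per rule as a character vector, a minimal sketch along these lines should work. It assumes the arules labels() method and the coercion of an itemMatrix to a list behave as documented; first3, rhs_braced and rhs_items are illustrative names only.

```r
library(arules)

data(Adult)
# Same call as in the answer; verbose = FALSE only silences the progress log
rules <- apriori(Adult,
                 parameter = list(support = 0.4),
                 control = list(verbose = FALSE))

first3 <- rules[1:3]

# One label per rule, with the surrounding braces kept, of the form "{item}"
rhs_braced <- labels(rhs(first3))

# Coerce the rhs itemMatrix to a list of character vectors, then flatten.
# apriori() mines rules with a single item on the rhs, so this gives
# one element per rule.
rhs_items <- unlist(as(rhs(first3), "list"), use.names = FALSE)
```

This avoids parsing printed output with a regular expression, so it should keep working if the layout printed by inspect changes.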
In general, use str to see what objects are made of so you can decide how to extract components.
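For the summary part of the question, the same idea applies: the count and the quality measures can be read from the rules object itself rather than from the printed summary. The following is a sketch, assuming the arules accessors length(), quality() and size(); slotNames() comes from the methods package, and first3, n_rules and q are illustrative names.

```r
library(arules)

data(Adult)
rules <- apriori(Adult,
                 parameter = list(support = 0.4),
                 control = list(verbose = FALSE))
first3 <- rules[1:3]

# List the S4 slots that "@" can reach on this object
slotNames(first3)

# Number of rules: the "3" in "set of 3 rules"
n_rules <- length(first3)

# Support, confidence, lift, ... as an ordinary data frame
q <- quality(first3)
q$confidence

# Rule lengths (lhs + rhs), the values behind the length distribution
size(first3)
```

Accessor functions are generally a safer choice than reaching into slots with "@", since slot layouts are an implementation detail of the package.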