I am coming back to R after a year and want to use rpart for a classification tree.
My data looks like:
Category, Shape, Color, Yes, No
A, Square, Blue, 3, 2
B, Triangle, Blue, 2, 4
etc.
Any recommendations on how to reshape it into the format below so I can use rpart? (I believe rpart needs one row per observation, like this.)
ID, Shape, Color, Result
A, Square, Blue, Yes
A, Square, Blue, Yes
A, Square, Blue, Yes
A, Square, Blue, No
A, Square, Blue, No
B, Triangle, Blue, Yes
etc...
Thank you!
You can use melt from reshape2, followed by rep:
library(reshape2)
s <- melt(df, id.vars = c('Category', 'Shape', 'Color'))
# repeat each row of s as many times as its count in value
s[rep(seq_len(nrow(s)), s$value), ]
    Category    Shape Color variable value
1          A   Square  Blue      Yes     3
1.1        A   Square  Blue      Yes     3
1.2        A   Square  Blue      Yes     3
2          B Triangle  Blue      Yes     2
2.1        B Triangle  Blue      Yes     2
3          A   Square  Blue       No     2
3.1        A   Square  Blue       No     2
4          B Triangle  Blue       No     4
4.1        B Triangle  Blue       No     4
4.2        B Triangle  Blue       No     4
4.3        B Triangle  Blue       No     4
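
If you want the exact layout from the question (one row per observation, with a single Result column) and then want to pass it to rpart, something like the following base R version might work. Treat it as a sketch under assumptions: df is rebuilt here from only the two rows shown in the question, the names id_cols, long and fit are mine, and I am guessing that Shape and Color are the predictors you want.

```r
# Example data, rebuilt from the two rows shown in the question (assumption)
df <- data.frame(
  Category = c("A", "B"),
  Shape    = c("Square", "Triangle"),
  Color    = c("Blue", "Blue"),
  Yes      = c(3, 2),
  No       = c(2, 4),
  stringsAsFactors = TRUE
)

id_cols <- c("Category", "Shape", "Color")

# Repeat each row once per "Yes" count, and separately once per "No" count
yes_rows <- df[rep(seq_len(nrow(df)), df$Yes), id_cols]
no_rows  <- df[rep(seq_len(nrow(df)), df$No),  id_cols]

# Stack the two pieces and label each row with its outcome
long <- rbind(
  cbind(yes_rows, Result = rep("Yes", nrow(yes_rows))),
  cbind(no_rows,  Result = rep("No",  nrow(no_rows)))
)
rownames(long) <- NULL
long$Result <- factor(long$Result, levels = c("Yes", "No"))

# Fit a classification tree on the expanded data
library(rpart)
fit <- rpart(Result ~ Shape + Color, data = long, method = "class")
```

With only 11 rows the default rpart settings will most likely not split at all, so try this on your full data before reading anything into the tree. rpart also has a weights argument, so fitting directly on the melted counts without expanding may be an option, but I have not checked that it gives the same tree.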