I have a data frame (labels) that I would like to use as a reference or lookup table of the form:
V1 V2
1 1 WALKING
2 2 WALKING_UPSTAIRS
3 3 WALKING_DOWNSTAIRS
4 4 SITTING
5 5 STANDING
6 6 LAYING
The data frame to use the reference table is (test, ncol = 564, nrow = 2947) where the first three colnames are (test_subject, test_label(num 1-6), data_set) where test_label(1-6) equal the strings referenced above.
Could someone help me figure out how I can use my lookup table to insert a new column called "activity_label" and each observation of that column would correspond to the string equivalent of the referenced number from the reference table.
E.g., if test_label row 1 equals 5 then activity_label row 1 would equal "Standing"
Thanks so much for all of your help!
After using the merge method:
> test2[1:10, 564: 565]
angle(Z,gravityMean) activity_label
1 0.04404283 walking
2 0.04134032 walking
3 0.04295217 walking
4 0.03611571 walking
5 -0.09080307 walking
6 -0.08602478 walking
7 -0.07997668 walking
8 0.04372663 walking
9 0.19900166 walking
10 0.20350821 walking
> str(test1)
'data.frame': 2947 obs. of 565 variables:
$ test_labels : int 1 1 1 1 1 1 1 1 1 1 ...
$ test_subject : int 12 12 12 12 4 4 4 12 9 9 ...
$ observ_set : Factor w/ 1 level "test": 1 1 1 1 1 1 1 1 1 1 ...
$ tBodyAcc-mean()-X : num 0.228 0.303 0.237 0.306 0.29 ...
> str(train1)
'data.frame': 7352 obs. of 565 variables:
$ train_labels : int 1 1 1 1 1 1 1 1 1 1 ...
$ V1 : int 27 7 7 26 7 26 6 6 6 7 ...
$ observ_set : Factor w/ 1 level "train": 1 1 1 1 1 1 1 1 1 1 ...
$ tBodyAcc-mean()-X : num 0.262 0.354 0.344 0.292 0.314 ...
One way is to use ifelse
:
if data frame = test and activity number column = activitynum,
test$activitylabel <- ifelse(test$activitynum == 1, "walking, ifelse(test$activitynum == 2, "walking_upstairs", ifelse(test$activitynum == 3, "walking_downstairs", ifelse(test$activitynum == 4, "sitting", ifelse(test$activitynum == 5, "standing", ifelse(test$activitynum == 6, "laying", NA))))))
another way is to create a look-up table and then do a merge
as suggested by @Jaehyeon:
lookup <- data.frame(activitynum = c(1,2,3,4,5,6), activity = c("walking", "walking_upstairs", "walking_downstairs", "standing", "sitting", "laying"))
survey <- data.frame(id = c(seq(1:10)), activitynum = floor(runif(10, 1, 7)), var1 = runif(10, 1, 100))
merge(survey, lookup, by = "activitynum", all.x = TRUE)
> str(lookup)
'data.frame': 6 obs. of 2 variables:
$ activitynum: num 1 2 3 4 5 6
$ activity : Factor w/ 6 levels "laying","sitting",..: 4 6 5 3 2 1
> str(survey)
'data.frame': 10 obs. of 3 variables:
$ id : int 1 2 3 4 5 6 7 8 9 10
$ activitynum: num 1 2 4 1 4 6 2 4 2 2
$ var1 : num 52.3 60.5 53.3 49.8 73.1 ...