I would like to convert a height variable from character type to numeric. For context, this is so I can use the values to calculate body mass index.
In the example data frame below, I would like to convert Height_1 into Height_2 (where Height_2 is in inches):
# Height_1 Height_2
# 5ft6in 66
# XftXin XXXX
# XftXin XXXX
# XftXin XXXX
# XftXin XXXX
I have tried a few things using the "tidyverse" and "measurements" packages but have not been able to create a variable like Height_2 above. For example:
df %>%
separate(Height_1,c('feet', 'inches'), sep = 'ft', convert = TRUE, remove = FALSE) %>%
mutate(Height_2 = 12*feet + inches)
I think this is because the above doesn't address the fact that there is "in" at the end of the values.
You can use a regex to extract the feet and inches values from Height_1 and then perform the calculation.
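library(dplyr)
library(tidyr)
df %>%
  # capture the digits before "ft" and before "in" into two numeric columns
  extract(Height_1, c('feet', 'inches'), '(\\d+)ft(\\d+)in', convert = TRUE, remove = FALSE) %>%
  transmute(Height_1, Height_2 = 12*feet + inches)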
# Height_1 Height_2
#1 5ft6in 66
#2 4ft9in 57
#3 5ft12in 72
#4 4ft9in 57
#5 6ft2in 74
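To see what the pattern captures, here is a minimal base R sketch on a single value. The variable names (x, m, height_in) are only illustrative and not part of the question's data; the idea is that the two parenthesised groups pick out the digits before "ft" and before "in", so the trailing "in" never reaches the arithmetic.

```r
# Hypothetical single value, used only to inspect the capture groups
x <- "5ft6in"

# regexec() returns the full match plus one entry per capture group
m <- regmatches(x, regexec("(\\d+)ft(\\d+)in", x))[[1]]

# m[1] is the full match; m[2] and m[3] are the feet and inches groups
feet <- as.numeric(m[2])
inches <- as.numeric(m[3])

# Convert to total inches
height_in <- 12 * feet + inches
height_in
```

For "5ft6in" this gives 66, matching the first row above.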
In base R:
transform(strcapture('(\\d+)ft(\\d+)in', df$Height_1,
proto = list(feet = numeric(), inches = numeric())),
Height_2 = 12*feet + inches)
df <- structure(list(Height_1 = c("5ft6in", "4ft9in", "5ft12in", "4ft9in", "6ft2in")), row.names = c(NA, -5L), class = "data.frame")