
Convert Mixed Unit Measurements


I have a file with a large range of non-standardised, mixed imperial and metric measurements, which I want to standardise and republish.

A sample of that range looks like this:

df  <- data.frame(Measurements =c("1.25m", "2 Feet", "3 Inches", "5.5 cm"))

|Measurements|
|1.25m       |
|2 Feet      |
|3 Inches    |
|5.5 cm      |

which I want to look like this:

|Measurements|MM_Conversion|
|1.25m       |1250mm       |
|2 Feet      |609.6mm      |
|3 Inches    |76.2mm       |
|5.5 cm      |55mm         |

I can't use measurements::conv_unit or units::set_unit because they both seem to require numeric input values. Is there a straightforward way of doing this which can parse both the value and the string, and convert accordingly?

EDIT 1: I'm having an issue whereby conv_unit can't convert NA values. If the initial vector were instead df <- data.frame(Measurements = c(NA, "1.25m", "2 Feet", "3 Inches", "5.5 cm")), how would you get around it?


Solution

  • We can use extract from tidyr to separate the value and unit and feed that into conv_unit using map2:

    df <- data.frame(Measurements =c(NA, "1.25m", "2 Feet", "3 Inches", "5.5 cm"))
    
    library(tidyverse)
    library(stringr)
    library(measurements)
    
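    df %>%
      # Split each entry into a numeric value and a unit string; NA rows stay NA
      extract(Measurements, c("value", "unit"),
              regex = "^([\\d.]+)\\s*([[:alpha:]]+)$",
              remove = FALSE, convert = TRUE) %>%
      # Map the spelled-out units to the abbreviations conv_unit expects
      mutate(unit = str_replace_all(unit, c(Feet="ft", Inches="inch")),
             # Convert row by row, skipping NA values that conv_unit cannot handle
             MM_Conversion = paste0(map2(value, unit, ~if(!is.na(.x)) conv_unit(.x, .y, "mm") else NA), "mm"))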
    

    Result:

      Measurements value unit MM_Conversion
    1         <NA>    NA <NA>          NAmm
    2        1.25m  1.25    m        1250mm
    3       2 Feet  2.00   ft       609.6mm
    4     3 Inches  3.00 inch        76.2mm
    5       5.5 cm  5.50   cm          55mm
    

    Or use filter if rows with NA should not appear in the final output:

    df %>%
      extract(Measurements, c("value", "unit"), 
              regex = "^([\\d.]+)\\s*([[:alpha:]]+)$", 
              remove = FALSE, convert = TRUE) %>%
      filter(!is.na(Measurements)) %>%
      mutate(unit = str_replace_all(unit, c(Feet="ft", Inches="inch")),
             MM_Conversion = paste0(map2(value, unit, ~conv_unit(.x, .y, "mm")), "mm"))
    

    Result:

      Measurements value unit MM_Conversion
    1        1.25m  1.25    m        1250mm
    2       2 Feet  2.00   ft       609.6mm
    3     3 Inches  3.00 inch        76.2mm
    4       5.5 cm  5.50   cm          55mm
    

    Notice how I manually abbreviated the original units to make conv_unit work. It would be one step less if the original units were already in abbreviated form.
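
If you would rather not depend on measurements at all, the same parse-then-convert idea can be sketched in base R with a lookup table of millimetres per unit. This is only a rough alternative, not what conv_unit does internally: the names mm_per_unit and to_mm are made up for illustration, and the table covers just the spellings in the question, so you would need to extend it for your real data.

```r
# Hypothetical lookup table: millimetres per unit, keyed by lower-case spelling.
# Extend it with whatever spellings appear in your data.
mm_per_unit <- c(mm = 1, cm = 10, m = 1000,
                 inch = 25.4, inches = 25.4,
                 ft = 304.8, foot = 304.8, feet = 304.8)

# to_mm() is an illustrative helper, not a function from any package.
to_mm <- function(x) {
  x <- as.character(x)
  pattern <- "^\\s*([0-9.]+)\\s*([A-Za-z]+)\\s*$"
  ok <- !is.na(x) & grepl(pattern, x)   # NA and unparseable strings stay NA

  value <- rep(NA_real_, length(x))
  unit  <- rep(NA_character_, length(x))
  value[ok] <- as.numeric(sub(pattern, "\\1", x[ok]))
  unit[ok]  <- tolower(sub(pattern, "\\2", x[ok]))

  # Unknown units are missing from the table, so they also come back as NA.
  unname(value * mm_per_unit[unit])
}

df <- data.frame(Measurements = c(NA, "1.25m", "2 Feet", "3 Inches", "5.5 cm"))
df$MM_Conversion <- to_mm(df$Measurements)
```

Here MM_Conversion is a numeric column, so NA stays a real NA rather than the string "NAmm". If you want the "mm" suffix back, something like ifelse(is.na(df$MM_Conversion), NA, paste0(df$MM_Conversion, "mm")) should do it.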