Coercion of numeric columns with decimals and points


How can I convert these values to numeric? They are stored as strings that use a point as the thousands separator and a comma as the decimal mark. Thank you for your help.

x <- c("1.139,0000", "1.160,0000", "1.160,0000", "1.160,0000", "1.160,0000", 
"1.194,0000", "1.533,3500", "1.550,0000", "1.550,0000", "1.602,0000", 
"1.825,0000", "1.825,0000", "1.825,0000", "1.825,0000", "1.825,0000", 
"1.825,0000", "1.825,0000", "1.825,4000", "1.825,0000", "1.825,0000", 
"2.042,1234", "2.200,0000", "2.200,0000", "2.200,0000", "2.200,0000", 
"2.200,0000", "2.200,0000", "2.200,0000", "2.200,0000", "2.200,0000", 
"2.200,0000", "2.200,0000", "2.200,0000", "2.200,0000", "2.200,0000"
)

Desired output:

c("1139.0000", "1160.0000", "1160.0000", "1160.0000", "1160.0000", 
    "1194.0000", "1533.3500", "1550.0000", "1550.0000", "1602.0000", 
    "1825.0000", "1825.0000", "1825.0000", "1825.0000", "1825.0000", 
    "1825.0000", "1825.0000", "1825.4000", "1825.0000", "1825.0000", 
    "2042.1234", "2200.0000", "2200.0000", "2200.0000", "2200.0000", 
    "2200.0000", "2200.0000", "2200.0000", "2200.0000", "2200.0000", 
    "2200.0000", "2200.0000", "2200.0000", "2200.0000", "2200.0000"
    )

Solution

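  • In base R, you can use sub to remove the thousands separator (.) and replace the decimal comma (,) with a point (.), then convert the result with as.numeric. With fixed = TRUE, the . is matched literally rather than as a regex wildcard.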

    as.numeric(sub(',', '.', sub('.', '', x, fixed = TRUE), fixed = TRUE))
    
    # [1] 1139.000 1160.000 1160.000 1160.000 1160.000 1194.000 1533.350 1550.000
    # [9] 1550.000 1602.000 1825.000 1825.000 1825.000 1825.000 1825.000 1825.000
    #[17] 1825.000 1825.400 1825.000 1825.000 2042.123 2200.000 2200.000 2200.000
    #[25] 2200.000 2200.000 2200.000 2200.000 2200.000 2200.000 2200.000 2200.000
    #[33] 2200.000 2200.000 2200.000
    

    Alternatively, parse_number from readr can do this in one step if the locale sets decimal_mark to a comma; the grouping mark then defaults to a point.

    library(readr)
    parse_number(x, locale = locale(decimal_mark = ','))
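A caveat on the base R approach: sub replaces only the first match, so a value with more than one thousands separator (for example "1.234.567,8900") would keep a stray point and come back as NA. The sketch below uses gsub instead. The helper name to_numeric and the three sample values are hypothetical and only serve as illustration. It also shows that the printed 2042.123 is only display rounding, and how to get four-decimal strings like those in the desired output.

```r
# Hypothetical helper (not part of the original answer): converts strings that
# use "." as the grouping mark and "," as the decimal mark to numeric.
to_numeric <- function(x) {
  no_groups <- gsub(".", "", x, fixed = TRUE)          # drop every grouping point
  as.numeric(sub(",", ".", no_groups, fixed = TRUE))   # decimal comma -> point
}

# Sample values chosen for illustration; the last one has two grouping points.
y <- to_numeric(c("1.139,0000", "2.042,1234", "1.234.567,8900"))

# print() shows 7 significant digits by default; asking for more digits
# reveals that the stored values were not truncated.
print(y, digits = 12)

# Fixed four-decimal strings, matching the format of the desired output.
sprintf("%.4f", y)
# [1] "1139.0000"    "2042.1234"    "1234567.8900"
```

For a data frame column, the same helper could be applied as df$col <- to_numeric(df$col), where df and col stand for your own object and column names.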