I'm curious if there's any sort of out of the box functions in R that can handle this.
I have a CSV file that I am reading into a data frame using read.csv. One of the columns in the CSV contains currency values in the format of
Currency
--------
$1.2M
$3.1B
N/A
I would like to convert those into more usable numbers that calculations can be performed against, so it would look like this:
Currency
----------
1200000
3100000000
NA
My initial thoughts were to somehow subset the dataframe into 3 parts based on rows that contain *M, *B, or N/A. Then use gsub to replace the $ and M/B, then multiply the remaining number by 1000000 or 1000000000, and finally rejoin the 3 subsets back into 1 data frame.
However I'm curious if there's a simpler way to handle this sort of conversion in R.
We could use gsubfn
to replace the 'B', 'M' with 'e+9', 'e+6' and convert to numeric
(as.numeric
).
is.na(v1) <- v1=='N/A'
options(scipen=999)
library(gsubfn)
as.numeric(gsubfn('([A-Z]|\\$)', list(B='e+9', M='e+6',"$"=""),v1))
#[1] 1200000 3100000000 NA
EDIT: Modified based on @nicola's suggestion
v1 <- c('$1.2M', '$3.1B', 'N/A')