
rxTextToXdf to read commas as decimals


I have a large text file that uses commas instead of periods to indicate decimals.

Is there a way to get the rxTextToXdf function in the RevoScaleR package to treat commas as decimal points?

I suspect I'm going to get so much flak for this post as it seems really simple

Edit:

I am currently using a workaround: import the numeric columns as character type, replace each comma with a period, and then convert the columns to numeric.

library(dplyrXdf)

imported_data %>%  # dataset whose numeric columns were imported as character
    mutate_if(is.character,
              funs(gsub(",", ".", .))) %>%  # replace decimal commas with periods
    mutate_if(is.character, as.numeric) %>%  # convert character to numeric
    persist(cleaned_file)  # cleaned_file is a file path

It feels like there must be a much cleaner way of doing this.


Solution

  • RxTextData has a decimalPoint argument for just this purpose.

    Assuming your text file is a European-style CSV (columns are separated by ; and , is the decimal point):

    txt <- RxTextData("your/file.txt", decimalPoint=",", delimiter=";")
    xdf <- rxDataStep(txt, "imported.xdf")
    
    # do stuff with xdf
    

    In general, it's a good idea to use data source objects to refer to files, rather than filenames. You can also use rxDataStep for just about everything.
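
For a quick sanity check of what "; as delimiter, , as decimal point" means, the same conventions can be tried on a small sample in base R. This is only a sketch under assumptions: the sample file and its column names (id, price, weight) are made up, and read.csv2 loads everything into memory, so it illustrates the format rather than replacing RxTextData for a large file.

```r
# Hypothetical sample data: ';' separates columns, ',' is the decimal mark.
sample_file <- tempfile(fileext = ".txt")
writeLines(c("id;price;weight",
             "1;3,14;10,5",
             "2;2,50;7,25"),
           sample_file)

# read.csv2() defaults to sep = ";" and dec = ",",
# so the decimal commas are parsed as numbers directly.
sample_df <- read.csv2(sample_file)

# Inspect the column types; price and weight should be numeric.
str(sample_df)
```

If the columns come back as character here, the file probably mixes conventions (for example a thousands separator), which is worth checking before running the full import.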