I have a large text file that uses commas instead of periods to indicate decimals.
Is there a way to get the rxTextToXdf function in the RevoScaleR package to treat commas as decimal points?
I suspect I'm going to get a lot of flak for this post, as it seems really simple.
Edit:
I am currently using a workaround: import the numeric columns as character type, replace the comma with a period, and then convert to numeric:
library(dplyrXdf)
imported_data %>%                               # dataset with character columns
    mutate_if(is.character,
              funs(gsub(",", ".", .))) %>%      # replace commas with periods
    mutate_if(is.character, as.numeric) %>%     # convert character to numeric
    persist(cleaned_file)                       # cleaned_file being a file path
It feels like there must be a much cleaner way of doing this.
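Stripped down to base R, the conversion step in that workaround amounts to roughly the following sketch. The sample values are made up for illustration, and `raw` and `cleaned` are hypothetical names, not part of my actual data:

```r
# Hypothetical sample values that use a comma as the decimal mark
raw <- c("3,14", "2,5", "10")

# Replace the comma with a period (fixed = TRUE treats "," as a literal
# string rather than a regular expression), then convert to numeric
cleaned <- as.numeric(gsub(",", ".", raw, fixed = TRUE))

print(cleaned)
```

This works, but it means every numeric column is first read as text and then converted in a second pass over the data.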
RxTextData has a decimalPoint argument for just this purpose.
Assuming your text file is a European CSV (columns are separated by ; and , is the decimal point):
txt <- RxTextData("your/file.txt", decimalPoint=",", delimiter=";")
xdf <- rxDataStep(txt, "imported.xdf")
# do stuff with xdf
In general, it's a good idea to use data source objects to refer to files, rather than filenames. You can also use rxDataStep for just about everything.
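As a rough sketch of what working with data source objects on both ends might look like: the example below assumes RevoScaleR is installed (it ships with Microsoft R / Machine Learning Server), and the temporary file, its contents, and the names `csv_path`, `xdf` and `df` are made up for illustration.

```r
library(RevoScaleR)  # assumption: RevoScaleR is available in this R installation

# Hypothetical sample file: ';' as delimiter, ',' as decimal mark
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id;price", "1;3,14", "2;2,50"), csv_path)

# Describe the input and the output as data source objects
txt <- RxTextData(csv_path, decimalPoint = ",", delimiter = ";")
xdf <- RxXdfData(tempfile(fileext = ".xdf"))

# Import the text file into the xdf file
rxDataStep(inData = txt, outFile = xdf, overwrite = TRUE)

# Inspect the variable types that were detected on import
rxGetInfo(xdf, getVarInfo = TRUE)

# With no outFile, rxDataStep returns the data as a data frame
df <- rxDataStep(xdf)
```

Because the import settings live in the `txt` object rather than in each function call, the same object can be reused wherever a data source is accepted.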