I came across this weird situation:
I need to save a data frame to a .csv file encoded in UTF-8 and with LF line endings. I'm using the latest versions of R and RStudio on a Windows 10 machine.
My first, naive attempt was:
write.csv(df, "filename.csv", fileEncoding="UTF-8", eol="\n")
Checking with Notepad++, the encoding appears to be UTF-8; the line ending, however, is CRLF, not LF. OK, let's double-check with Notepad: surprise, surprise, according to Notepad the encoding is ANSI. At this point I'm confused.
After looking at the docs for the function write.csv I read that:
CSV files do not record an encoding
I'm not an expert on the topic, so I decided to fall back to saving the file as plain .txt using write.table:
write.table(df, "filename.txt", fileEncoding="UTF-8", eol="\n")
Again, the same result as above: no change whatsoever. I also tried the combinations
write.csv(df)
write.table(df)
without an explicit encoding, but nothing changed. Then I set the default encoding in RStudio to UTF-8 with LF line endings (as in the picture below)
and ran the tests again. Still no change. What am I missing?
This is an odd one, at least for me. Nonetheless, reading the docs of write.table led me to the solution. Apparently, on Windows, to save files Unix-style you have to open a binary connection to the file and then write to it with the desired eol:
f <- file("filename.csv", "wb")   # binary mode: no CRLF translation on Windows
write.csv(df, file=f, eol="\n")
close(f)
As far as the UTF-8 encoding is concerned, the global settings should work fine.
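If you want to be explicit rather than rely on the global settings, note that per ?write.table the fileEncoding argument applies to a file name, not a connection, so it cannot simply be added to the call above. A sketch of one workaround, converting the strings to UTF-8 before writing (the data frame here is a stand-in, since the question's df is not shown):

```r
# Stand-in data frame with non-ASCII text (the real df is not shown)
df <- data.frame(name = c("café", "naïve"), stringsAsFactors = FALSE)

# Convert character columns to UTF-8 explicitly; a binary connection
# should then write the strings' bytes as-is, without re-encoding
df$name <- enc2utf8(df$name)

f <- file("filename.csv", "wb")   # binary mode: no CRLF translation
write.csv(df, file = f, eol = "\n")
close(f)
```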
Check that the eol is LF using Notepad++. UTF-8 is harder to verify: on Linux, isutf8 (from moreutils) reports that the files are indeed UTF-8, while Windows' Notepad labels them ANSI when saving. That disagreement is expected for files containing only ASCII characters: ASCII bytes are valid in both encodings, and without a BOM Notepad defaults to calling the file ANSI.
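Both checks can also be done from R itself, without Notepad++ or isutf8. A self-contained sketch (it writes a small stand-in file first so the snippet runs on its own):

```r
# Write a small file with the binary-connection trick so there is
# something to inspect (stand-in for the real filename.csv)
f <- file("filename.csv", "wb")
write.csv(data.frame(x = 1:2), file = f, eol = "\n")
close(f)

# Read the raw bytes back
raw_bytes <- readBin("filename.csv", what = "raw", n = file.size("filename.csv"))

# LF-only line endings: the CR byte 0x0D must not appear anywhere
has_cr <- any(raw_bytes == as.raw(0x0d))

# Valid UTF-8: iconv() returns NA when the input is not valid UTF-8
is_utf8 <- !is.na(iconv(rawToChar(raw_bytes), from = "UTF-8", to = "UTF-8"))
```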