My data set testdata has 2 variables named PWGTP and AGEP. The data are in a .csv file.
When I do:
> head(testdata)
The variables show up as
ï..PWGTP AGEP
23 55
26 56
24 45
22 51
25 54
23 35
So, for some reason, R is reading PWGTP as ï..PWGTP. No biggie.
HOWEVER, when I use some function to refer to the variable ï..PWGTP, I get the message:
Error: id variables not found in data: ï..PWGTP
Similarly, when I use some function to refer to the variable PWGTP, I get the message:
Error: id variables not found in data: PWGTP
2 Questions:
1. Is there anything I should be doing to the source file to prevent mangling of the variable name PWGTP?
2. It should be trivial to rename ï..PWGTP to something else -- but R is unable to find a variable named as such. Your thoughts on how one should try to repair the variable name?
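This is a UTF-8 BOM (Byte Order Mark) issue: the file starts with an invisible BOM, and R is reading those bytes as part of the first column name.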
To prevent this from happening, there are 2 options: save the source file as UTF-8 without a BOM, or pass fileEncoding = "UTF-8-BOM" when using read.table or read.csv.
Example:
mydata <- read.table(file = "myfile.txt", fileEncoding = "UTF-8-BOM")
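For anyone who wants to try this end to end, here is a minimal sketch of both fixes. It assumes a throwaway two-row CSV written to a temporary file (the file contents and the object names path, fixed and mangled are made up for illustration, not taken from the question). The first part reads the file with fileEncoding = "UTF-8-BOM"; the second part repairs an already-mangled header by renaming the column by position, which should avoid having to type the odd name at all.

```r
# Build a small CSV that starts with the UTF-8 byte order mark (EF BB BF).
# The path and the two data rows are illustrative only.
path <- tempfile(fileext = ".csv")
con <- file(path, open = "wb")
writeBin(as.raw(c(0xEF, 0xBB, 0xBF)), con)
writeBin(charToRaw("PWGTP,AGEP\n23,55\n26,56\n"), con)
close(con)

# Fix 1: tell R about the BOM when reading, so the first header stays clean.
fixed <- read.csv(path, fileEncoding = "UTF-8-BOM")
names(fixed)

# Fix 2: if the data were already read with a mangled first header,
# rename that column by position instead of by its (mangled) name.
mangled <- read.csv(path)
names(mangled)[1] <- "PWGTP"
names(mangled)

unlink(path)
```

Renaming by position is a deliberate choice here: the exact mangled name depends on the platform and locale, so matching on it as a string may fail even when it looks identical on screen.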