I got this message after I converted a few columns from "character" to "numeric": Warning message: Unknown or uninitialised column: 'df'.
I needed to load a csv file (from Qualtrics) into R.
filename <- "/Users/Study1.csv"
library(readr)
df <- read_csv(filename)
The first row contains the variable names, but the second and third rows are chunks of text that are of no use in R, so I needed to remove those two rows. However, because of those useless strings, R had already read columns 18 to the end as character, so I needed to convert these columns to numeric manually (which is necessary for my further analysis).
# The 2nd and 3rd rows of the csv file are useless (they are strings)
df <- df[3:nrow(df), ]
# Columns 18 to the end should be numeric, but the 2nd and 3rd rows of the file are strings, so R treats these columns as character
df[ ,18:ncol(df)] <- lapply(df[ ,18:ncol(df)], as.numeric)
After running the above code, this warning popped up:
Warning message:
Unknown or uninitialised column: 'df'.
Parsed with column specification:
cols(
.default = col_character()
)
See spec(...) for full column specifications.
NAs introduced by coercion
NAs introduced by coercion
The NAs are fine. But the error message is annoying. Is there a better way to convert my columns to numeric? Thank you all!
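One possible alternative to the manual as.numeric() call, sketched here on made-up data (the data frame and its column names are hypothetical, not taken from the real file): once the two descriptive rows are dropped, base R's type.convert() can re-guess the type of every column in one step.

```r
# Made-up stand-in for the imported data: every column was read as character
# because the first two rows hold descriptive text.
raw <- data.frame(
  id    = c("Response ID", "{ImportId}", "R_1", "R_2"),
  score = c("Rate this", "{QID1}", "4", "5"),
  stringsAsFactors = FALSE
)

# Drop the two descriptive rows first, so only real responses remain.
clean <- raw[3:nrow(raw), ]

# Re-guess the column types; as.is = TRUE keeps text columns as character
# instead of turning them into factors.
clean <- type.convert(clean, as.is = TRUE)

str(clean)
```

This avoids hard-coding the column range, at the cost of letting R decide each column's type rather than forcing numeric.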
EDITED
Thank you all for your advice. I tried the method of skipping the 2nd and the 3rd rows. However, one peculiar thing happened. Because one cell contains multiple lines, separated by empty lines, R parsed it incorrectly.
I blurred the original text in the picture. It happens whether or not I tick "First Row as Names". Can you suggest a fix for it? Thanks all again.
UPDATE on 2018-05-30: I've solved the problem. Please see my answer below or visit How to import Qualtrics data (in csv format) into R
Thank you all for your advice and comments. I heeded @alistaire's advice of using skip.
As for the newlines in the Qualtrics cell, I found that I could click on "More options" when exporting the data and select "remove line breaks".
Following the advice from Skip specific rows using read.csv in R, I used the following code to solve my problem.
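# Read only the first row to get the variable names
headers <- read.csv(filename, header = FALSE, nrows = 1, as.is = TRUE)
# Skip the header row and the two descriptive rows, then read the data
df <- read.csv(filename, skip = 3, header = FALSE)
# Apply the saved names to the data
colnames(df) <- unlist(headers, use.names = FALSE)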
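The same idea should also work with readr, which is what I used originally. This is only a sketch on a made-up file (the temporary file, column names and values are all hypothetical stand-ins for a Qualtrics export), assuming the export has exactly one header row followed by two descriptive rows.

```r
library(readr)

# Made-up stand-in for a Qualtrics export: a header row, two descriptive
# rows, then the responses. Column names and values are hypothetical.
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "ResponseId,Age,Rating",
  "Response ID,How old are you?,Rate this",
  "ImportId1,ImportId2,ImportId3",
  "R_1,23,4",
  "R_2,31,5"
), tmp)

# Read a single row as text, purely to recover the variable names.
headers <- names(read_csv(tmp, n_max = 1,
                          col_types = cols(.default = col_character())))

# Skip the header and the two descriptive rows, then supply the names.
# col_types = cols() lets readr guess the types without printing the spec.
df <- read_csv(tmp, skip = 3, col_names = headers, col_types = cols())

str(df)
```

Because the descriptive rows are never read into the data, the numeric columns are guessed as numeric and no manual conversion is needed afterwards.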