I have been trying to load a text file (each line is a chat log) into R so I can turn it into a data frame and tidy the data further.
I am using readLines so that each log ends up as a single line. Because readLines reads each log as one long character value, I then convert them to strings (I need to parse the logs), as shown below:
rawchat <- readLines("disc-W-App-avec-loy.txt")
rawchat <- c(lapply(rawchat, toString))
My problem comes when I try to turn this list into a data frame:
rawchat <- as.data.frame(rawchat)
It turns the list into a data frame of 1 observation of 42,000 variables. The intention was to turn it into 42,000 observations of one variable.
Any help please?
By the way, I am pretty new to tidying raw data in R.
I have since run into another problem:
I loaded the text file as a data frame, as shown below.
rawchat <- readLines("disc-W-App-avec-loy.txt")
rawchat <- as.data.frame(rawchat, stringsAsFactors=FALSE)
names(rawchat) <- "chat"
I am currently trying to identify every row (out of 42,000) that starts with the number 16. I can't seem to apply the startsWith() function correctly, nor the dplyr starts_with() helper, nor even grepl with regular expressions.
Could it be the format of the observations in the data frame (chr)?
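For the follow-up, here is a minimal sketch of how the row filter might look. The sample lines are made up for illustration (the real log format is not shown in the question), and it assumes "starts with the number 16" means the first two characters of the string are "16". A chr column is the right type for both startsWith() and grepl(). Note that dplyr's starts_with() is a column-selection helper that matches column names, so it does not test the values in a row.

```r
# Hypothetical sample data standing in for the real chat logs
rawchat <- data.frame(
  chat = c("16:02 alice: hello", "17:45 bob: hi", "16 carol: bye"),
  stringsAsFactors = FALSE
)

# Base R: TRUE for every string whose first characters are "16"
starts_16 <- startsWith(rawchat$chat, "16")

# Equivalent test with a regular expression anchored at the start of the string
starts_16_regex <- grepl("^16", rawchat$chat)

# Keep only the matching rows; drop = FALSE keeps the result a data frame
matches <- rawchat[starts_16, , drop = FALSE]
```

If the real lines carry leading whitespace, neither test will match until the strings are trimmed, for example with trimws(rawchat$chat).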
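The problem is the line rawchat <- c(lapply(rawchat, toString)). lapply returns a list with one element per line, and as.data.frame turns each element of a list into its own column, which is why you get 1 observation of 42,000 variables. readLines already returns a character vector, so no conversion is needed.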
Just use
rawchat <- readLines("disc-W-App-avec-loy.txt")
rawchat <- as.data.frame(rawchat, stringsAsFactors=FALSE)
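To see the difference in shape, here is a small sketch that uses three made-up lines in place of the file (the variable names lines, wide, and long are only for illustration):

```r
# Hypothetical sample lines standing in for the contents of the log file
lines <- c("16:02 alice: hello", "17:45 bob: hi", "16:30 carol: bye")

# A list of strings: as.data.frame() treats every list element as a column
as_list <- c(lapply(lines, toString))
wide <- as.data.frame(as_list)
dim(wide)  # 1 row, 3 columns

# A plain character vector: one column, one row per line
long <- as.data.frame(lines, stringsAsFactors = FALSE)
names(long) <- "chat"
dim(long)  # 3 rows, 1 column
```

With the real file, the same pattern should give 42,000 rows in a single chr column.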