First of all, I am a coding noob and only started coding in order to write my master's thesis at my university. I extracted YouTube comments using the tuber package in R in order to carry out a sentiment analysis of those comments. Everything worked fine and I received a data frame with all the comments (11314 observations and 13 variables). However, when I tried to write that data frame to a .csv file in order to look at the comments in Excel, I ran into a particular issue: for the comments that contain new paragraphs, the write.table function created a new row. I used the following call:
write.table(testneuohneduplikate, file = "Testneuohnedulikate.csv",sep = ";", row.names = FALSE, col.names = TRUE, quote = TRUE)
Is there a way to write each comment into a single row, instead of having it split across two or three rows because the comment contains paragraphs?
I hope I was able to explain my problem properly.
Thank you guys in advance and greetings from Germany to wherever you are from :)
Yes, write.table writes any newline character inside a comment as-is, so that comment continues on a new line of the file and shows up as an extra row. Here's an example of stripping newline characters out of the comment string:
> comment<-"I think this video \n is great"
> cat(comment)
I think this video
is great
> fixedcomment<-gsub("[\r\n]", "", comment)
> cat(fixedcomment)
I think this video  is great
>
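Removing the newline outright can leave doubled spaces, or glue two words together when there was no space around the line break. As a minimal sketch of one alternative (my own variation, not the only way to do it), you could replace each run of whitespace, line breaks included, with a single space:

```r
comment <- "I think this video \n is great"

# Collapse every run of whitespace (spaces, tabs, \r, \n) into one space
fixedcomment <- gsub("[[:space:]]+", " ", comment)

# Drop any leading or trailing space left over at the ends
fixedcomment <- trimws(fixedcomment)

print(fixedcomment)
```

This keeps the words separated, at the cost of also flattening any intentional double spaces or tabs inside the comment.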
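You can use 'apply' to run it over every cell in your table: MARGIN=c(1,2) applies the function cell by cell, while MARGIN=1 or MARGIN=2 works on whole rows or whole columns. Note that 'apply' returns a character matrix, so numeric columns come back as text.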
> ID<-1:4
> Names<-c('name1','name2','name3','name4')
> Comments<-c("I think this video \n is great", "No it stinks \n I mean it", "Use the Force", "It's time \n to get to work")
> table<-cbind(ID, Names, Comments)
> fixed_table<-apply(X=table,MARGIN=c(1,2),FUN = function(y) gsub("[\r\n]","",y))
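Since the data in the question is a data frame rather than a matrix, here is a rough sketch of how the same idea might look there. It assumes a made-up data frame called comments_df with the same three columns as the example above (the real tuber output has other column names), cleans only the character columns so that numeric columns keep their type, and writes to a temporary file that you would swap for your own file name:

```r
# Hypothetical stand-in for the data frame from the question
comments_df <- data.frame(
  ID = 1:4,
  Names = c("name1", "name2", "name3", "name4"),
  Comments = c("I think this video \n is great", "No it stinks \n I mean it",
               "Use the Force", "It's time \n to get to work"),
  stringsAsFactors = FALSE
)

# Find the character columns; only those can contain line breaks
is_text <- vapply(comments_df, is.character, logical(1))

# Replace each run of \r or \n with a single space in those columns
comments_df[is_text] <- lapply(
  comments_df[is_text],
  function(col) gsub("[\r\n]+", " ", col)
)

# Write with the same settings as in the question
out_file <- tempfile(fileext = ".csv")
write.table(comments_df, file = out_file, sep = ";",
            row.names = FALSE, col.names = TRUE, quote = TRUE)

# Count the lines in the file: one header line plus one per comment
length(readLines(out_file))
```

If the line count equals the number of rows plus one, every comment ended up on its own row.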