
Write each extracted comment to a single row with write.table (R data frame)


First of all, I am a coding novice and only started coding in order to write my master's thesis at my university. I extracted YouTube comments using the tuber package in R in order to carry out a sentiment analysis of those comments. Everything worked fine and I received a data frame with all the comments (11314 observations and 13 variables). However, when I tried to write that data frame to a .csv file in order to look at the comments in Excel, I ran into a problem: for comments that contain paragraph breaks, the write.table function created a new row. I used the following call:

write.table(testneuohneduplikate, file = "Testneuohnedulikate.csv",sep = ";", row.names = FALSE, col.names = TRUE, quote = TRUE)

Is there a way to write each comment to a single row, instead of two or three rows when the comment contains paragraph breaks?

I hope I was able to explain my problem properly.

Thank you guys in advance and greetings from Germany to wherever you are from :)


Solution

  • Yes. write.table writes the newline characters inside a comment verbatim, so a comment that contains them spans several lines in the file. Here's an example of stripping newline characters from a comment string:

    > comment<-"I think this video \n is great"
    > cat(comment)
    I think this video 
     is great
    
    > fixedcomment<-gsub("[\r\n]", "", comment)
    > cat(fixedcomment)
    I think this video  is great
    > 
    

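    You can use 'apply' to run this on every cell of your table. MARGIN = c(1, 2) means rows and columns, i.e. each cell; use MARGIN = 1 or MARGIN = 2 if you only want to work row by row or column by column. Note that 'apply' returns a character matrix rather than a data frame.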

    > ID<-1:4
    > Names<-c('name1','name2','name3','name4')
    > Comments<-c("I think this video \n is great", "No it stinks \n I mean it", "Use the Force", "It's time \n to get to work")
    > table<-cbind(ID, Names, Comments)
    
    > fixed_table<-apply(X=table,MARGIN=c(1,2),FUN = function(y) gsub("[\r\n]","",y))
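Since the question is about a data frame rather than a matrix, here is a minimal sketch of the same idea that keeps the data frame intact and only touches character columns. The data frame `comments_df` and the temporary output file are made-up stand-ins for the real tuber result, and replacing line breaks with a space (instead of deleting them) is my own choice so that words on either side don't run together.

```r
# Hypothetical stand-in for the data frame returned by tuber
comments_df <- data.frame(
  ID = 1:4,
  Names = c("name1", "name2", "name3", "name4"),
  Comments = c("I think this video \n is great",
               "No it stinks \r\n I mean it",
               "Use the Force",
               "It's time \n to get to work"),
  stringsAsFactors = FALSE
)

# Find the character columns; numeric columns are left untouched
is_text <- vapply(comments_df, is.character, logical(1))

# Replace each run of line breaks with a single space in those columns
comments_df[is_text] <- lapply(
  comments_df[is_text],
  function(col) gsub("[\r\n]+", " ", col)
)

# Same write.table call as in the question, written to a temporary file here
out_file <- tempfile(fileext = ".csv")
write.table(comments_df, file = out_file, sep = ";",
            row.names = FALSE, col.names = TRUE, quote = TRUE)

# Check: one header line plus one line per comment
length(readLines(out_file))
```

With the four example comments the file should contain 5 lines (header plus four rows). If the comment columns in your data are factors rather than character vectors, convert them with as.character first, otherwise the is.character check will skip them.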