r, csv, tidyverse, readr

How to parse this encoded TSV


I tried to parse this TSV file using readr::read_tsv, but I keep getting parsing failures. Then I realised that the file contains some unusual characters; when I read it with Python, it reported encoding='cp1252'.

I have tried using these:

writeLines(iconv(readLines("Evaluations (1).tab"), from = "cp1252", to = "UTF8"), file("test2.tab", encoding="UTF-8"))

read.delim("Evaluations (1).tab", sep = "\t", encoding = "Windows-1252")

read.table("Evaluations (1).tab", header=TRUE, sep="\t", fileEncoding="CP1252")

None of them worked.

Can someone take a look at this tab file and tell me how to parse it?

Thanks!!!


Solution

  • The file appears to be UCS-2LE encoded, so try:

    read.table(file = "Evaluations (1).tab", sep = "\t", header = TRUE, fileEncoding = "UCS-2LE")
    
    [1] Session.Date                 Date.Completed               Evaluator.Name               Evaluator.Status             Subject.Name                
     [6] Subject.Rotation             Overall.Comments             Subject.Comments             X.Question.1.ID.             X.Question.1.Tags.          
    [11] X.Question.1.Response.       X.Question.1.Comment.        X.Question.1.Drop.Down.List. X.Question.2.ID.             X.Question.2.Tags.          
    [16] X.Question.2.Response.       X.Question.2.Comment.        X.Question.2.Drop.Down.List. X.Question.3.ID.             X.Question.3.Tags.          
    [21] X.Question.3.Response.       X.Question.3.Comment.        X.Question.3.Drop.Down.List. X.Question.4.ID.             X.Question.4.Tags.          
    [26] X.Question.4.Response.       X.Question.4.Comment.        X.Question.4.Drop.Down.List. X.Question.5.ID.             X.Question.5.Tags.          
    [31] X.Question.5.Response.       X.Question.5.Comment.        X.Question.5.Drop.Down.List. X.Question.6.ID.             X.Question.6.Tags.          
    [36] X.Question.6.Response.       X.Question.6.Comment.        X.Question.6.Drop.Down.List. X.Question.7.ID.             X.Question.7.Tags.          
    [41] X.Question.7.Response.       X.Question.7.Comment.        X.Question.7.Drop.Down.List.
    <0 rows> (or 0-length row.names)
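
If you would rather check the encoding than guess it, one option is to inspect the first bytes of the file for a byte order mark (BOM). The following is only a sketch, not part of the original answer: `guess_bom_encoding()` is a hypothetical helper, the sample file is made up for illustration, and it assumes the file actually starts with a BOM (many Windows "Unicode" exports do, but yours may not). For text within the Basic Multilingual Plane, UCS-2LE and UTF-16LE are byte-for-byte the same.

```r
# Hypothetical helper: guess the encoding from a byte order mark (BOM).
# It does not try to tell UTF-32 apart from UTF-16.
guess_bom_encoding <- function(path) {
  bom <- readBin(path, what = "raw", n = 3L)
  starts_with <- function(sig) {
    length(bom) >= length(sig) && identical(bom[seq_along(sig)], as.raw(sig))
  }
  if (starts_with(c(0xef, 0xbb, 0xbf))) return("UTF-8-BOM")
  if (starts_with(c(0xff, 0xfe))) return("UTF-16LE")
  if (starts_with(c(0xfe, 0xff))) return("UTF-16BE")
  NA_character_  # no BOM found: the encoding has to be determined another way
}

# Build a small made-up UTF-16LE file with a BOM to try the helper on.
path <- tempfile(fileext = ".tab")
body <- iconv("Subject.Name\tScore\nAnn\t5\n",
              from = "UTF-8", to = "UTF-16LE", toRaw = TRUE)[[1]]
writeBin(c(as.raw(c(0xff, 0xfe)), body), path)

# Detect the encoding, then pass it on as fileEncoding.
enc <- guess_bom_encoding(path)
evals <- read.delim(path, fileEncoding = enc, stringsAsFactors = FALSE)
print(enc)
print(evals)
```

With your own file, replace `path` with "Evaluations (1).tab". If the helper returns NA, the file has no BOM and this check cannot tell you the encoding.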