
Importing txt file into RStudio includes unwanted BOM characters "ï»¿"


When I imported the following data, saved as a UTF-8 encoded txt file

1   test1
1   test2
2   test1
2   test3

into RStudio, I had issues with the BOM characters ("ï»¿") showing up in the resulting table. Below is the code that I used to import the data.

library(arules)
library(arulesViz)

txn <- read.transactions("r-test.txt", rm.duplicates = FALSE, format = "single",
                         sep = "\t", cols = c(1, 2))
inspect(txn)

The resulting import looked like the following:

  items         transactionID
1 {test2}       1            
2 {test1,test3} 2            
3 {test1}       1 
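
One way to check whether a BOM is behind the extra transaction is to look at the first three bytes of the file. The following is only a sketch: it builds a small temporary sample file (a hypothetical stand-in for r-test.txt) and compares its leading bytes with the UTF-8 BOM signature EF BB BF.

```r
# Hypothetical sample file that starts with the UTF-8 BOM (bytes EF BB BF).
path <- tempfile(fileext = ".txt")
utf8_bom <- as.raw(c(0xEF, 0xBB, 0xBF))
writeBin(c(utf8_bom, charToRaw("1\ttest1\n")), path)

# Read the first three bytes and compare them with the BOM signature.
first_bytes <- readBin(path, what = "raw", n = 3)
has_bom <- identical(first_bytes, utf8_bom)
print(has_bom)
```

If has_bom is TRUE for the real file, the first transaction ID carries the BOM as a prefix, which would explain why the two rows with ID 1 end up in separate transactions.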

Solution

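  • I found that saving the file as an ANSI-encoded txt file cleared the issue up. Without the BOM, the first transaction ID is read as a plain 1, so both of its rows end up in the same transaction: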

      items         transactionID
    1 {test1,test2} 1            
    2 {test1,test3} 2  
    

    You can use the following R code to convert your file to ANSI format (ANSI_X3.4-1986 is a name for US-ASCII):

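    # Re-encode each line from UTF-8 to ASCII, write it to a new file, then close the connection.
    out <- file("New File Name", encoding = "ANSI_X3.4-1986")
    writeLines(iconv(readLines("Old File Name"), from = "UTF-8", to = "ANSI_X3.4-1986"), out)
    close(out)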
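
As a possible alternative to re-encoding the whole file, base R connections accept the encoding "UTF-8-BOM", which removes a leading byte order mark on read. The sketch below is not the method described above. It uses temporary files as hypothetical stand-ins for the real input and output, and the sample data mirrors the question.

```r
# Hypothetical file names, used only for this sketch.
src <- tempfile(fileext = ".txt")
dst <- tempfile(fileext = ".txt")

# Write a tab-separated sample that starts with the UTF-8 BOM (EF BB BF).
bom <- as.raw(c(0xEF, 0xBB, 0xBF))
body <- charToRaw("1\ttest1\n1\ttest2\n2\ttest1\n2\ttest3\n")
writeBin(c(bom, body), src)

# Read through a connection declared as "UTF-8-BOM"; a leading BOM is dropped.
con <- file(src, encoding = "UTF-8-BOM")
lines <- readLines(con)
close(con)

# Write the cleaned lines to a new file, which no longer starts with a BOM.
writeLines(lines, dst)
```

The cleaned file (dst here) could then be passed to read.transactions in place of the original, with the same arguments as in the question.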
    

    Hope this helps someone else if they have the same issue.