
Read txt files that use a nul character, such as \001, as the separator?


When I use R to read the txt files, I set the sep argument of read.table to sep="\001" or sep="\\001", but neither worked.

                                                                        V1
1             886153044351\0010981623127\001\00113036806119\00113036806119
2           132693697611\0010\00118380389386\00113795105928\00113795105928
3             886134400554\0010981623127\001\00115033907649\00115033907649
4            550075776697\00115955516598\00115955516598\00113969121085\001
5             886156798054\0010918770552\001\00115977055775\00115977055775
6 132642200735\00118015668803\00118015668803\00118655109444\00118655109444

The above is what I get when I read the file into R with the read.table defaults. I also tried the split function, but it did not work with a separator like this either.

In Notepad++, I can replace \001 with a comma (","), and then I can read the data into R as a data frame.

If the data is too big for me to use Notepad++ to replace the \001 character, how can I do it?


Solution

  • Try using the read.delim function instead:

    read.delim(
    text = "V1
    1 886153044351\0010981623127\001\00113036806119\00113036806119
    2 132693697611\0010\00118380389386\00113795105928\00113795105928
    3 886134400554\0010981623127\001\00115033907649\00115033907649
    4 550075776697\00115955516598\00115955516598\00113969121085\001
    5 886156798054\0010918770552\001\00115977055775\00115977055775
    6 132642200735\00118015668803\00118015668803\00118655109444\00118655109444", 
    sep = "\001", header = FALSE )
    
    
                  V1          V2          V3          V4          V5
    1             V1          NA          NA          NA          NA
    2 1 886153044351   981623127          NA 13036806119 13036806119
    3 2 132693697611           0 18380389386 13795105928 13795105928
    4 3 886134400554   981623127          NA 15033907649 15033907649
    5 4 550075776697 15955516598 15955516598 13969121085          NA
    6 5 886156798054   918770552          NA 15977055775 15977055775
    7 6 132642200735 18015668803 18015668803 18655109444 18655109444
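
For a file on disk, the same call can be pointed at the file path instead of text, so no manual search-and-replace is needed. The sketch below is only an illustration under a few assumptions: the temporary file and its three records are made up from the sample rows, and colClasses = "character" and quote = "" are additions that were not part of the original answer. Note that in the output above the first row is the pasted "V1" label and the first column still carries the printed row numbers, because the example text was copied from a printed data frame; values such as 0981623127 were also converted to numbers and lost their leading zero. Reading every column as character avoids the second problem.

```r
# Hypothetical example file: three \001-separated records in a temp file.
# Inside an R string, "\001" is one control character (octal code 001),
# not the four characters backslash, 0, 0, 1.
path <- tempfile(fileext = ".txt")
writeLines(
  c("886153044351\0010981623127\001\00113036806119\00113036806119",
    "132693697611\0010\00118380389386\00113795105928\00113795105928",
    "550075776697\00115955516598\00115955516598\00113969121085\001"),
  path
)

# Read the file directly, using the control character as the separator.
df <- read.delim(
  path,
  sep = "\001",
  header = FALSE,            # the file has no header line
  colClasses = "character",  # assumption: keep IDs as text so leading zeros survive
  quote = ""                 # assumption: treat quote characters as ordinary data
)

print(df)
unlink(path)  # remove the temporary file
```

If the columns really are numeric and leading zeros do not matter, colClasses can be dropped so that read.delim guesses the types, as in the answer above.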