I recently received a .txt file in a very unusual format like this to process:
"Pony ID"/t"colour"/t"location"/t"age"
"Pony A"/t"white;brown;black"/t"stable1"/t12
"Pony B"/t"pink"/t"stable2"/t13
"Pony C"/t"white"/t"stable3"/t9
So if I try to import it with the classic reading functions from utils or readr (e.g. read_tsv, read.delim), I end up with 1 column, probably because sep="/t" is not interpreted as a literal separator. The following code resolves it:
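library(tidyverse)
# Split on "/" so every field after the first keeps a leading "t"
a <- read.delim("ponies.txt", sep = "/", header = FALSE)
# Drop that leading "t" from every column except the first
a <- data.frame(cbind(a[, 1], sapply(a[, -1], function(x) str_sub(x, 2))))
# Promote the first row to column names, then remove it from the data
colnames(a) <- a[1, ]
a <- a[-1, ]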
Pony ID colour location age
2 Pony A white;brown;black stable1 12
3 Pony B pink stable2 13
4 Pony C white stable3 9
I hope this question is not too obscure, but I'm very curious: does anyone know if there is a way to directly escape the literal "/t" delimiter during the import?
This could be made a bit more compact by reading the file with readLines, using gsub to change the delimiter, and then parsing the result with read.csv/read.table:
read.csv(text = gsub("/t", ",", gsub('"', '', readLines("ponies.txt"))),
check.names = FALSE)
-output
Pony ID colour location age
1 Pony A white;brown;black stable1 12
2 Pony B pink stable2 13
3 Pony C white stable3 9
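A variation on the same idea, offered only as a minimal sketch: base R's sep has to be a single character, so instead of converting to commas you could turn the literal "/t" into a real tab and let read.delim handle the quotes itself. That avoids stripping the quotes by hand and would still work if a field contained a comma. The temporary file and the names path, lines and ponies are made up here so the example runs on its own; with the real data you would pass "ponies.txt" to readLines.

```r
# Recreate the sample file from the question in a temporary location
# (assumption: the real file has exactly this layout)
path <- tempfile(fileext = ".txt")
writeLines(c(
  '"Pony ID"/t"colour"/t"location"/t"age"',
  '"Pony A"/t"white;brown;black"/t"stable1"/t12',
  '"Pony B"/t"pink"/t"stable2"/t13',
  '"Pony C"/t"white"/t"stable3"/t9'
), path)

# Replace the literal two-character "/t" with a real tab character;
# fixed = TRUE treats the pattern as plain text rather than a regex
lines <- gsub("/t", "\t", readLines(path), fixed = TRUE)

# read.delim splits on tabs and removes the surrounding quotes itself;
# check.names = FALSE keeps the space in "Pony ID"
ponies <- read.delim(text = lines, check.names = FALSE)
```

One caveat: this assumes "/t" never occurs inside a real value, since every occurrence is replaced before parsing.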