I have a series of txt files that are all formatted the same way. The first few rows contain file information, and there are no variable names. As you can see, the spacing between fields is inconsistent, but the columns are left-aligned or right-aligned. I know SAS can read data in this format directly, and I wonder whether R provides a similar function.
I tried the read.csv function to load the data, which I want to store in a data.frame with 3 columns, but it turns out that the sep argument is not interpreted as a regular expression, so sep = "\s" (meant to match multiple spaces) does not work.
So I first read each line into a single variable and then used the substr function to split it, as follows:
Factor<-data.frame(substr(Share$V1,1,9),substr(Share$V1,9,14),as.numeric(substr(Share$V1,15,30)))
But this is clumsy, since I need to count the spaces between the columns. I wonder if there is any way to load the data directly as three columns.
> Factor
F T S
1 +B2P A 1005757219
2 +BETA A 826083789
We can use read.table to read it as 3 columns:
read.table(text=as.character(Share$V1), sep="", header=FALSE,
stringsAsFactors=FALSE, col.names = c("FactorName", "Type", "Share"))
# FactorName Type Share
#1 +B2P A 1005757219
#2 +BETA A 826083789
#3 +E2P A 499237181
#4 +EF2P A 38647147
#5 +EFCHG A 866171133
#6 +IL1QNS A 945726018
#7 +INDMOM A 862690708
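For anyone who wants to try the call above without the original files, here is a minimal self-contained sketch. The Share data frame below is a hypothetical stand-in, assuming (as in the question) that each raw line sits as one string in column V1; the spacing inside the strings is made up for illustration.

```r
# Hypothetical stand-in for the asker's Share object: one raw line per row in V1
Share <- data.frame(V1 = c("+B2P      A    1005757219",
                           "+BETA     A     826083789",
                           "+E2P      A     499237181"),
                    stringsAsFactors = FALSE)

# sep = "" means "any run of whitespace", so the uneven spacing does not matter
Factor <- read.table(text = as.character(Share$V1), sep = "", header = FALSE,
                     stringsAsFactors = FALSE,
                     col.names = c("FactorName", "Type", "Share"))

str(Factor)  # inspect the three resulting columns and their types
```

Because whitespace runs are collapsed, this only works when no field contains an embedded space or is left blank.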
Another option would be to read it directly from the file, skipping the header line and changing the column names:
read.table("yourfile.txt", header=FALSE, skip=1, stringsAsFactors=FALSE,
col.names = c("FactorName", "Type", "Share"))
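If the files are truly fixed-width (for example, if a field can be blank or contain a space), base R's read.fwf may be closer to what SAS does with column input. The following is only a sketch: the widths 10, 5 and 12, the single header line, and the sample rows are all assumptions made for illustration, not taken from the real files.

```r
# Build a hypothetical fixed-width file: one information line, then rows laid
# out as a 10-char left-aligned name, a 5-char type, and a 12-char right-aligned number
rows <- sprintf("%-10s%-5s%12d",
                c("+B2P", "+BETA", "+E2P"),
                c("A", "A", "A"),
                c(1005757219L, 826083789L, 499237181L))
tmp <- tempfile(fileext = ".txt")
writeLines(c("File information line (hypothetical)", rows), tmp)

# widths gives the number of characters per field; skip = 1 drops the header line.
# strip.white = TRUE is passed on to read.table to trim padding from character fields.
Factor <- read.fwf(tmp, widths = c(10, 5, 12), skip = 1,
                   col.names = c("FactorName", "Type", "Share"),
                   stringsAsFactors = FALSE, strip.white = TRUE)

str(Factor)  # inspect the three resulting columns and their types
```

The widths still have to be counted once, but they are stated in one place instead of being repeated across several substr calls.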