
R - divide into smaller data frames based on information in a column


Let's say I have a tab-delimited file fileA.txt containing several types of information as follows:

X         123       78000    0        romeo 
X         78000     78004    56       juliet    
Y         78004     78005    12       mario
Y         78006     78008    21       mario   
Y         78008     78056    8        luigi 
Z         123       78000    1        peach 
Z         78000     78004    24       peach    
Z         78004     78005    4        peach
A         78006     78008    12       zelda   
A         78008     78056    14       zelda

I read this file into a data frame as follows:

df <- read.table("fileA.txt",sep="\t",colClasses=c("character","numeric","numeric","numeric","character"))
colnames(df) <- c("location","start","end","value","label")

Assume I don't know in advance how many distinct strings the first column df[,1] contains; call this number n. I would like to automatically generate n new data frames, each containing the rows for a single string. How do I go about writing a function for that?


Solution

  • One way to do this is with llply() from the plyr package:

    library(plyr)
    out <- llply(unique(df[,1]), function(x) subset(df, df[,1]==x))
    out
    

    This creates a list in which each element is a data frame containing the rows for one location.

    You can then access the individual data frames by position, for example out[[1]].

    To access them by location name instead, name the list elements:

    names(out) <- unique(df[,1])
    out$X # gives data.frame with location=='X'
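
As an alternative that needs no extra package, base R's split() should give a similar result in one call. This is a minimal sketch, not the method from the answer above: the data frame is rebuilt inline from the example rows so the snippet runs without fileA.txt, and the name pieces is only illustrative.

```r
# Rebuild the example data inline (values copied from the question's file)
df <- data.frame(
  location = c("X", "X", "Y", "Y", "Y", "Z", "Z", "Z", "A", "A"),
  start    = c(123, 78000, 78004, 78006, 78008, 123, 78000, 78004, 78006, 78008),
  end      = c(78000, 78004, 78005, 78008, 78056, 78000, 78004, 78005, 78008, 78056),
  value    = c(0, 56, 12, 21, 8, 1, 24, 4, 12, 14),
  label    = c("romeo", "juliet", "mario", "mario", "luigi",
               "peach", "peach", "peach", "zelda", "zelda"),
  stringsAsFactors = FALSE
)

# split() returns a named list with one data frame per distinct location
pieces <- split(df, df$location)

names(pieces)         # the distinct locations found in the data
pieces$Y              # rows where location == "Y"
sapply(pieces, nrow)  # number of rows in each piece
```

Because split() names the list elements after the values in the grouping column, the separate names(out) <- ... step is not needed here.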