Search code examples
rdataframeunique

Storing Unique Values from Excel file into separate Dataframes


I have a massive csv file I'm working with here. What I want is to be able to store all unique values in a certain column into a dataframe. For a small example, if I had:

1   a   
2   b
3   c
4   a
5   b

I would want my code to sort this into x amount of dataframes where x would be the amount of unique values from column two (in this example that would be 3).

What I did is find the name of every unique variable in my desired column and did vectorization. Which would be:

    DF1 <- Data[Data$Column2 == "a",]
    DF2 <- Data[Data$Column2 == "b",]
    DF3 <- Data[Data$Column2 == "c",]

But that was super tedious, and since I have way more than just 3 unique values it took me waaaaay too long. There has got to be an easier way to do it. Maybe using loops? Any help is greatly appreciated!


Solution

  • We can use split to split the data.frame into separate data.frame in a list and then use lapply to loop over the list and write the files as csv

    lst1 <- split(Data, Data$Column2)
    lapply(names(lst1), function(nm) write.csv(lst1[[nm]], paste0(nm, ".csv"),
         quote = FALSE, row.names = FALSE))
    

    By doing the split, we effectively reduce the burden of naming multiple objects in the global environment, reduce the clutter, and reduce the time in finding those objects and saving one by one with write.csv