r loops filter export-to-csv

Splitting large dataframe into smaller dataframes and then saving as csv


First time posting on here - apologies if I have not yet got the hang of how to ask questions.

I have one large csv file ("AllANT_z") which contains data from over 330 participants, with around 290 rows of data per participant (over 100k rows in total). I have loaded in the large csv and now I am using a simple filter command to individually pick out participants (each participant has a unique participant ID, listed under a column titled 'participant'), save these as data, and then use the write.csv function to manually save them. Is there a way to loop this to make it faster? The method I am currently using works - but it is very time consuming and prone to human error. This is what I am using:

boa001 <- AllANT_z %>%
  filter(participant == "boa001")
write.csv(boa001, "E:/CognitiveData_Processed/ANT_zscored/zscored_allANT/boa001.csv", row.names = TRUE)

Any way to speed this up--with a loop maybe? I am an R newbie and do not yet have the know-how of how to loop. Any help would be much appreciated - thank you!


Solution

  • Assuming you have 330 participants and they all follow the naming scheme 'boa{three-digit-number}'

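    library(dplyr)  # provides %>% and filter()

    # sprintf generates the numbers with leading zeros ("001", "002", ...)
    for(i in sprintf('%03d', 1:330)){

      # named id rather than participant, so that filter() compares the
      # participant column against this value and not against itself
      id <- paste0("boa", i)

      AllANT_z %>%
        filter(participant == id) %>%
        # row.names belongs to write.csv(), not to paste0()
        write.csv(paste0("E:/CognitiveData_Processed/ANT_zscored/zscored_allANT/", id, ".csv"), row.names = TRUE)
    }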
    

    The loop builds the ID string for each participant, selects the rows for that participant, and writes them to a CSV file named after the ID. If the IDs don't follow a naming scheme, you should define a vector with all of them like

    participants <- c("boa001", ...)
    

    and then iterate over it like

    for(i in participants){
      AllANT_z %>%
        filter(participant == i) %>%
        # row.names belongs to write.csv(), not to paste0()
        write.csv(paste0("E:/CognitiveData_Processed/ANT_zscored/zscored_allANT/", i, ".csv"), row.names = TRUE)
    }
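
If the IDs are already in the data, one option is to avoid typing them out at all and let R collect them from the participant column. The following is a minimal sketch in base R, assuming the ID column is named participant as in the question. The toy data frame, the rt_z column, and the temporary output folder are made up for illustration and are not from the original post.

```r
# Toy stand-in for AllANT_z (hypothetical values; the real data has ~290 rows per participant)
AllANT_z <- data.frame(
  participant = c("boa001", "boa001", "boa002", "xyz017"),
  rt_z        = c(0.12, -0.40, 1.05, -0.73),
  stringsAsFactors = FALSE
)

# Hypothetical output folder; point this at your own directory
out_dir <- file.path(tempdir(), "zscored_allANT")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

# split() returns a named list with one data frame per participant ID
by_participant <- split(AllANT_z, AllANT_z$participant)

# Write each piece to <out_dir>/<participant ID>.csv
for (id in names(by_participant)) {
  write.csv(by_participant[[id]],
            file.path(out_dir, paste0(id, ".csv")),
            row.names = TRUE)
}

# Names of the files that were written
written <- sort(list.files(out_dir))
```

With the real data, drop the toy data frame and set out_dir to the target folder. Because split() picks up every ID present in the participant column, no participant is skipped if the count is not exactly 330 or if some IDs break the naming scheme.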