I have a data set on which I used split:
x <- split(A, f = A$Col_1)
It works beautifully. But now I need to write each chunk of the split to an individual .csv file. There are 2100 chunks of 140 rows each. Let's call them "1:2100". I would like to create something that writes "1" to "~/full_path_name/A1.csv", then goes to "2" and writes it to "~/full_path_name/A2.csv", then "3" to "~/full_path_name/A3.csv", etc.
I included "~/full_path_name/" because down the road this path name will change for other data using the same code, and for my own understanding I need to see it in the code. I don't know how to write a small sample of what I am asking for so that someone can correct it, because I don't know how to write it at all.
Can someone make a suggestion on how to do this? Thank you.
I have only been coding for a month and am entirely self-taught. I do not have a background in other programming languages. I have no one to ask for help but here. I struggle with the terminology, so please understand if I am not asking in the proper way, and I will try to correct it if need be.
EDIT, AFTER DOING SOME FURTHER RESEARCH --
This is what I have found elsewhere on SO from @RichPaloo, and my adaptations below that:
#example data.frame
df <- data.frame(x = 1:4, y = c("a", "a", "b", "b"))
#split into a list by the y column
l <- split(df, df$y)
#the names of the list are the unique values of the y column
nam <- names(l)
#iterate over the list and the vector of list names and write csvs
#(write_csv() comes from the readr package, so load it first)
library(readr)
for(i in 1:length(l)) {
  write_csv(l[[i]], paste0(nam[i], ".csv"))
}
This is my version:
bcc4.5_WINTER <- split(bcc4.5_FinalWinterRO, f = bcc4.5_FinalWinterRO$HUC8)
nam <- names(bcc4.5_WINTER)
for(i in 1:length(bcc4.5_WINTER)) {
  write_csv(bcc4.5_WINTER[[i]], paste0(“~/Rprojects/BCC_CSM1_1_RCP_45/Winter/”, nam[i], “.csv”))
}
I appear to have a problem with the folder within my home folder "/BCC_CSM1_1_RCP_45/Winter/” It says "unexpected token" at both ends, but not at the "~Rprojects". Can I not send something to a folder within my home folder?
It also shows redlines under the quotes around ".csv" near the end. I don't know what to make of this because it's exactly what the person used successfully, apparently, in another post. Thank you.
Investigating the potential typo problem
Please see the two lines below:
write.csv(l[[1]], file = paste0("./a_folder/", names(l)[1], ".csv"))
write.csv(l[[1]], file = paste0(“./a_folder/”, names(l)[1], “.csv”))
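Line 1 will save the file. Note that "./a_folder/" and ".csv" are written with plain straight quotes, so they are seen as text.
In line 2, “./a_folder/” and “.csv” are wrapped in curly ("smart") quotes, which R does not treat as string delimiters, so they are not recognized as text. Line 2 produces an error: unexpected input in "write.csv(l[[1]], file = paste0(“"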
RStudio colors your code to help you with this problem.
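Applied to the loop in the question, the fix is to retype the quotes as plain straight quotes. Below is a minimal sketch under some assumptions: `dat` is a made-up stand-in for your data frame, and `out_dir` points to a temporary folder so the code runs anywhere (replace it with your own path, such as "~/Rprojects/BCC_CSM1_1_RCP_45/Winter"). It uses base R's write.csv() rather than readr's write_csv(), so no package is needed.

```r
# Stand-in data (hypothetical); replace with your own data frame
dat <- data.frame(HUC8 = c("101", "101", "102", "102"), value = 1:4)

# Output folder: a temporary directory here so the sketch runs anywhere.
# Assumption: in real use you would set this to your own path.
out_dir <- file.path(tempdir(), "Winter")
dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)

# Split by the grouping column; the list names are the unique HUC8 values
chunks <- split(dat, f = dat$HUC8)
nam <- names(chunks)

for (i in seq_along(chunks)) {
  # Straight quotes only around ".csv"; curly quotes would cause a parse error
  out_file <- file.path(out_dir, paste0(nam[i], ".csv"))
  write.csv(chunks[[i]], file = out_file, row.names = FALSE)
}

# List the files that were written
list.files(out_dir)
```

Keeping the folder in a single variable such as `out_dir` means only one line changes when the path changes for other data.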
Thoughts about not using a for loop.
I think a better way to go (especially when you have a large dataset) is to use lapply or mapply. These functions take each "chunk" of a list and apply a function to it.
lapply loses the name of each chunk while processing it, which can be annoying when you want to use the name of the chunk to name the file on your computer. mapply() comes in handy in this situation.
Here is an example using the provided example.
# example data.frame
df <- data.frame(x = 1:4, y = c("a", "a", "b", "b"))
# split df
l <- split(df, df$y)
# save each "chunk" of l as a .csv file on a hard drive
# 1st, create a function that takes a "chunk" of your list and its name as inputs
# (the folder ./a_folder/ must already exist, otherwise write.csv() fails)
save_fun <- function(l_i, name_i) {
  print(l_i) # print the chunk in the console
  write.csv(l_i, file = paste0("./a_folder/", name_i, ".csv")) # save the file on your computer
}
# 2nd, use mapply (and not lapply) to apply the previous function to each chunk/name pair
mapply(FUN = save_fun, l_i = l, name_i = names(l), SIMPLIFY = FALSE) # see ?mapply for how to use mapply()
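If the files should instead be numbered A1.csv, A2.csv, ... as described in the question, the same idea works by pairing each chunk with a file name built from its position. This is only a sketch: `out_dir` and `files` are hypothetical names, and the folder is a temporary one so the example runs as-is. Map() is base R's list-returning relative of mapply().

```r
# Example data.frame, split as above
df <- data.frame(x = 1:4, y = c("a", "a", "b", "b"))
l <- split(df, df$y)

# Hypothetical output folder; replace with your own path
out_dir <- file.path(tempdir(), "a_folder")
dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)

# Build all file names up front: A1.csv, A2.csv, ...
files <- file.path(out_dir, paste0("A", seq_along(l), ".csv"))

# Map() hands each chunk and its file name to the function in turn;
# invisible() keeps the returned list of NULLs out of the console
invisible(Map(function(chunk, file) {
  write.csv(chunk, file = file, row.names = FALSE)
}, l, files))

# Show the file names that were written
basename(files)
```

With 2100 chunks this writes 2100 files without any change to the code, since the file names are derived from seq_along(l).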