R: Reading a csv from within 2 zip folders


I am working under some unfortunate circumstances and need to read a csv file that sits inside two nested zip archives. What I mean by this is that the file path looks something like this:

//path/folder1.zip/folder2.zip/wanttoread.csv

I tried mimicking the slick work from the question "Extract certain files from .zip", but have had no luck so far. Specifically, when I ran something similar on my end, I got an error message reading

Error in fread(x, sep = ",", header = TRUE, stringsAsFactors = FALSE) : 
embedded nul in string:

followed by a bunch of encoded nonsense.

Any ideas on how to handle this problem? Thanks in advance!


Solution

  • Here's an approach using tempdir():

    temp <- tempdir(check = TRUE) # Get the session's temporary directory to extract into
    
    unzip("folder1.zip", exdir = temp) # Unzip the outer archive to the temp directory
    
    # Use file.path to build the path to the inner archive, and extract it
    # to a subfolder inside temp. This covers the case where the outer
    # archive might also contain a file named wanttoread.csv.
    unzip(file.path(temp, "folder2.zip"),
          exdir = file.path(temp, "temp2"))
    
    list.files(file.path(temp, "temp2")) # We can see the .csv file is now there
    #[1] "wanttoread.csv"
    
    read.csv(file.path(temp, "temp2", "wanttoread.csv")) # Read it in
    #   Var1         Var2
    #1 Hello obewanjacobi
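
If this comes up more than once, the steps above could be wrapped in a small helper. This is only a sketch: `read_nested_zip_csv` and its argument names are made up for illustration, and it assumes the inner archive sits at the top level of the outer one. Instead of a second `unzip()`, it reads the csv through an `unz()` connection, and it cleans up its scratch directory when it finishes.

```r
# Hypothetical helper (not from any package): read a csv stored in a zip
# archive that is itself stored inside another zip archive.
read_nested_zip_csv <- function(outer_zip, inner_zip, csv_name, ...) {
  # Private scratch directory, deleted again when the function exits
  work <- tempfile("nested_zip_")
  dir.create(work)
  on.exit(unlink(work, recursive = TRUE), add = TRUE)

  # Extract only the inner archive from the outer one
  unzip(outer_zip, files = inner_zip, exdir = work)
  inner_path <- file.path(work, inner_zip)
  if (!file.exists(inner_path)) {
    stop("'", inner_zip, "' was not found in '", outer_zip, "'")
  }

  # Read the csv directly from the inner archive via a connection;
  # read.csv opens the connection and closes it when done
  read.csv(unz(inner_path, csv_name), ...)
}
```

With the file names from the question, the call would look like `read_nested_zip_csv("folder1.zip", "folder2.zip", "wanttoread.csv")`. Any extra arguments, such as `stringsAsFactors = FALSE`, are passed on to `read.csv`.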