
Read in list of lists into R from .txt file and maintain structure


I previously created a list of 9 elements, where each element is itself a list of 1000 character vectors of gene names. Within each of the 9 top-level elements, all vectors have the same length, but that length differs between the 9 elements. Unfortunately, I saved the printed output as a .csv (and also a .txt) file, which looks like the following:

[[1]]
[[1]][[1]]
 [1] "FBgn0039358" "FBgn0033482" "FBgn0039026" "FBgn0035755" "FBgn0010830"
 [6] "FBgn0039727" "FBgn0037648" "FBgn0038244" "FBgn0030794" "FBgn0034989"
[11] "FBgn0035591" "FBgn0036018" "FBgn0032775" "FBgn0033519" "FBgn0026090"
[16] "FBgn0035690" "FBgn0262160" "FBgn0250871" "FBgn0010774" "FBgn0010342"
[21] "FBgn0266540" "FBgn0052103" "FBgn0035266" "FBgn0284257" "FBgn0037705"
[26] "FBgn0036838" "FBgn0029996" "FBgn0031758" "FBgn0031979" "FBgn0032472"
[31] "FBgn0015008" "FBgn0028866" "FBgn0037263" "FBgn0037429" "FBgn0261545"
[36] "FBgn0038016" "FBgn0031020" "FBgn0039890" "FBgn0035488" "FBgn0039634"
[41] "FBgn0034722" "FBgn0029893" "FBgn0034997" "FBgn0051019" "FBgn0038548"
[46] "FBgn0040022" "FBgn0038735" "FBgn0032873" "FBgn0037899" "FBgn0043577"
[51] "FBgn0036316" "FBgn0026189" "FBgn0031813" "FBgn0035283" "FBgn0041629"
[56] "FBgn0259959" "FBgn0053798"

And so on until

[[9]][[1000]]
  [1] "FBgn0029947" "FBgn0053490" "FBgn0052631" "FBgn0036676" "FBgn0031170"
  [6] "FBgn0027360" "FBgn0033778" "FBgn0052499" "FBgn0035199" "FBgn0038887"
 [11] "FBgn0015001" "FBgn0036706" "FBgn0028897" "FBgn0263511" "FBgn0050488"
 [16] "FBgn0032660" "FBgn0036551" "FBgn0002872" "FBgn0004876" "FBgn0037235"
 [21] "FBgn0030850" "FBgn0265194" "FBgn0030789" "FBgn0027085" "FBgn0030625"
 [26] "FBgn0037718" "FBgn0028539" "FBgn0039451" "FBgn0039727" "FBgn0031698"
 [31] "FBgn0032600" "FBgn0020236" "FBgn0038286" "FBgn0029914" "FBgn0039508"
 [36] "FBgn0023522" "FBgn0036702" "FBgn0036301" "FBgn0034118" "FBgn0028992"
 [41] "FBgn0026372" "FBgn0031143" "FBgn0039156" "FBgn0032717" "FBgn0032169"
 [46] "FBgn0030838" "FBgn0010497" "FBgn0085423" "FBgn0034261" "FBgn0036374"
 [51] "FBgn0032689" "FBgn0035842" "FBgn0022710" "FBgn0023511" "FBgn0038532"
 [56] "FBgn0035137" "FBgn0033473" "FBgn0037882" "FBgn0039507" "FBgn0031875"
 [61] "FBgn0030555" "FBgn0033609" "FBgn0030946" "FBgn0001330" "FBgn0038951"
 [66] "FBgn0035253" "FBgn0027602" "FBgn0035111" "FBgn0039117" "FBgn0262647"
 [71] "FBgn0085405" "FBgn0027574" "FBgn0029706" "FBgn0030061" "FBgn0030710"
 [76] "FBgn0036451" "FBgn0032701" "FBgn0028646" "FBgn0042111" "FBgn0040212"
 [81] "FBgn0050394" "FBgn0261618" "FBgn0035526" "FBgn0032728" "FBgn0036889"
 [86] "FBgn0035021" "FBgn0037470" "FBgn0259152" "FBgn0039773" "FBgn0039889"
 [91] "FBgn0037530" "FBgn0051739" "FBgn0263490" "FBgn0034611" "FBgn0032296"
 [96] "FBgn0283659" "FBgn0035203" "FBgn0037760" "FBgn0038659" "FBgn0039427"
[101] "FBgn0036624" "FBgn0038467" "FBgn0038304" "FBgn0037282" "FBgn0032005"
[106] "FBgn0283473" "FBgn0031897"

I would like to read this file back into R and keep the list-of-lists structure, as it is important for further analysis. I cannot regenerate the list, so that is not an option. I have tried the answer to this question, but I cannot figure out how to apply it to my own data. I will absolutely never again save objects like this as text files...

Any help would be greatly appreciated.

Thank you very much to @onyambu for solving my problem.


Solution

  • Try running this code:

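    # Read the printed list back in as raw lines
    a <- readLines('myfile.txt')
    # Split each line on the quotes, then drop the vector index markers such as " [1]"
    b <- grep("^ *\\[\\d+", unlist(strsplit(a, ' *" *"?')), value = TRUE, invert = TRUE)

    # Start a new group at every top-level header ("[[1]]", "[[2]]", ...), then split each
    # group at its sub-headers ("[[1]][[1]]", ...) and drop the header from each piece
    result <- lapply(unname(split(b, cumsum(grepl("^\\[\\[\\d+\\]\\]$", b)))), function(x)
        lapply(unname(split(x[-1], cumsum(grepl("^\\[", x[-1])))), `[`, -1))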
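
To check the approach before running it on the real file, it may help to try it on a small list with the same shape. The sketch below is only an illustration: the gene IDs, the `original` object and the temporary file are made up, and it assumes the file was produced by printing the list at the default console width.

```r
# Hypothetical miniature of the data: 2 top-level elements, each holding
# 2 character vectors (the real data has 9 elements of 1000 vectors each).
original <- list(
  list(c("FBgn0000001", "FBgn0000002"),
       c("FBgn0000003", "FBgn0000004")),
  list(c("FBgn0000005", "FBgn0000006", "FBgn0000007"),
       c("FBgn0000008", "FBgn0000009", "FBgn0000010"))
)

# Save the printed representation to a temporary text file,
# mimicking how the file in the question was created.
path <- tempfile(fileext = ".txt")
writeLines(capture.output(print(original)), path)

# Same parsing steps as in the solution, applied to the temporary file.
a <- readLines(path)
b <- grep("^ *\\[\\d+", unlist(strsplit(a, ' *" *"?')), value = TRUE, invert = TRUE)
result <- lapply(unname(split(b, cumsum(grepl("^\\[\\[\\d+\\]\\]$", b)))), function(x)
  lapply(unname(split(x[-1], cumsum(grepl("^\\[", x[-1])))), `[`, -1))

# Inspect the recovered structure and compare it with the list we started from.
str(result)
identical(result, original)
```

On the real data, `length(result)` and `lengths(result)` give a quick check that all 9 elements and their 1000 vectors were recovered.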