How do I load multiple CSV files into DataFrames in Julia?


I already know how to load a single CSV into a DataFrame:

using CSV
using DataFrames    
df = DataFrame(CSV.File("C:\\Users\\username\\Table_01.csv"))

How would I do this when I have several CSV files, e.g. Table_01.csv, Table_02.csv, Table_03.csv? Would I create a bunch of empty DataFrames and use a for loop to fill them? Or is there an easier way in Julia? Many thanks in advance!


Solution

  • If you want multiple data frames (not a single data frame holding the data from multiple files), there are several options.

    Let me start with the simplest approach using broadcasting:

    dfs = DataFrame.(CSV.File.(["Table_01.csv", "Table_02.csv", "Table_03.csv"]))
    

    or

    dfs = @. DataFrame(CSV.File(["Table_01.csv", "Table_02.csv", "Table_03.csv"]))
    

    or (slightly more advanced, using function composition):

    (DataFrame∘CSV.File).(["Table_01.csv", "Table_02.csv", "Table_03.csv"])
    

    or using chaining:

    CSV.File.(["Table_01.csv", "Table_02.csv", "Table_03.csv"]) .|> DataFrame
    

    Another option is map, as was suggested in a comment:

    map(DataFrame∘CSV.File, ["Table_01.csv", "Table_02.csv", "Table_03.csv"])
    

    or just use a comprehension:

    [DataFrame(CSV.File(f)) for f in ["Table_01.csv", "Table_02.csv", "Table_03.csv"]]
    
    

    (I am listing these options to show the different syntactic possibilities in Julia.)
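If the file names follow a pattern, you may not want to type the list by hand. Below is a minimal sketch of one way to collect the files from a directory and keep each data frame addressable by file name. The helper load_tables, its prefix keyword, and the temporary demo files are hypothetical names introduced here for illustration; they are not part of CSV.jl or DataFrames.jl. The sketch also assumes a CSV.jl version that supports CSV.read(path, DataFrame), which should be equivalent to DataFrame(CSV.File(path)).

```julia
using CSV
using DataFrames

# Hypothetical helper (not from any package): load every CSV file in `dir`
# whose name starts with `prefix` into a Dict keyed by file name.
function load_tables(dir::AbstractString; prefix::AbstractString = "Table_")
    files = filter(f -> startswith(f, prefix) && endswith(f, ".csv"), readdir(dir))
    return Dict(f => CSV.read(joinpath(dir, f), DataFrame) for f in files)
end

# Self-contained demo: write three small files to a temporary directory
# so the example runs without any pre-existing data.
dir = mktempdir()
for i in 1:3
    demo = DataFrame(id = [i, i + 10], value = [1.5 * i, 2.5 * i])
    CSV.write(joinpath(dir, "Table_0$(i).csv"), demo)
end

tables = load_tables(dir)
first_table = tables["Table_01.csv"]

# If a single data frame is wanted after all, the pieces can be stacked
# vertically (this assumes all files share the same columns).
ordered_names = sort(collect(keys(tables)))
combined = reduce(vcat, [tables[name] for name in ordered_names])
```

A Dict is used here only so that each table can be looked up by its file name; if the order of the files is all you need, any of the vector-producing one-liners above works just as well on the output of readdir.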