Tags: python, dataframe, sas, rbind

Import multiple SAS files in Python and then row bind


I have over 20 SAS (sas7bdat) files, all with the same columns, that I want to read into Python. I need an iterative process to read all the files and rbind them into one big DataFrame. This is what I have so far, but it throws an error saying there are no objects to concatenate.

import pandas as pd
import pyreadstat
import glob
import os

path = r'C:\Users\myfolder'  # or a unix / linux / mac path
all_files = glob.glob(os.path.join(path, "/*.sas7bdat"))

li = []

for filename in all_files:
    # cols is the list of column names to keep
    reader = pyreadstat.read_file_in_chunks(pyreadstat.read_sas7bdat, filename, chunksize=10000, usecols=cols)
    for df, meta in reader:
        li.append(df)
    frame = pd.concat(li, axis=0)

I found this answer about reading in CSV files helpful: Import multiple CSV files into pandas and concatenate into one DataFrame
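
For small files, the equivalent of that CSV pattern would presumably be something like the sketch below, using pandas.read_sas instead of read_csv (untested; it reuses all_files from the snippet above and assumes everything fits in memory at once):

import pandas as pd

# read each SAS file whole (no chunking) and stack the results;
# all_files is the list of paths built with glob above
frame = pd.concat(
    (pd.read_sas(f, format='sas7bdat') for f in all_files),
    ignore_index=True,
)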


Solution

  • If the SAS data files are too big to read into memory at once and you plan to append all of them into one DataFrame, read them in chunks (a fully self-contained sketch follows the snippet):

    # reading in chunks keeps memory usage from blowing up on large files
    li = []
    for filename in all_files:  # all_files and cols as defined in the question
        reader = pyreadstat.read_file_in_chunks(pyreadstat.read_sas7bdat, filename, chunksize=10000, usecols=cols)
        for df, meta in reader:
            li.append(df)
    frame = pd.concat(li, axis=0, ignore_index=True)
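
    For completeness, here is a self-contained version of the same idea (the folder path and column names are placeholders). One detail worth noting: the wildcard should be joined without a leading slash, because os.path.join(path, "/*.sas7bdat") drops the folder part, glob then matches nothing, and pd.concat raises the "No objects to concatenate" error mentioned in the question.

    import glob
    import os

    import pandas as pd
    import pyreadstat

    path = r'C:\Users\myfolder'                 # placeholder folder
    cols = ['col_a', 'col_b']                   # placeholder column names
    all_files = glob.glob(os.path.join(path, "*.sas7bdat"))  # no leading slash

    li = []
    for filename in all_files:
        # read each file in 10,000-row chunks to keep memory use bounded
        reader = pyreadstat.read_file_in_chunks(
            pyreadstat.read_sas7bdat, filename, chunksize=10000, usecols=cols
        )
        for df, meta in reader:
            li.append(df)

    frame = pd.concat(li, axis=0, ignore_index=True)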