I have over 20 SAS (sas7bdat) files, all with the same columns, that I want to read into Python. I need an iterative process to read all the files and rbind (concatenate) them into one big DataFrame. This is what I have so far, but it throws an error saying there are no objects to concatenate.
import pyreadstat
import glob
import os
import pandas as pd

path = r'C:\Users\myfolder'  # or a unix / linux / mac path
all_files = glob.glob(os.path.join(path, "/*.sas7bdat"))

li = []
for filename in all_files:
    # cols is the list of column names to keep, defined elsewhere
    reader = pyreadstat.read_file_in_chunks(pyreadstat.read_sas7bdat, filename,
                                            chunksize=10000, usecols=cols)
    for df, meta in reader:
        li.append(df)

frame = pd.concat(li, axis=0)
I found this answer about reading in multiple CSV files helpful: Import multiple CSV files into pandas and concatenate into one DataFrame.
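The error itself comes from the glob pattern, not from pyreadstat: because "/*.sas7bdat" starts with a slash, os.path.join treats it as rooted and discards path, so all_files ends up empty and pd.concat has nothing to concatenate. With the leading slash dropped, the CSV pattern carries over directly; here is a minimal sketch for the case where each file fits in memory (pyreadstat.read_sas7bdat returns a (DataFrame, metadata) pair per file):

import glob
import os
import pandas as pd
import pyreadstat

path = r'C:\Users\myfolder'
# no leading slash in the pattern, otherwise os.path.join discards `path`
all_files = glob.glob(os.path.join(path, "*.sas7bdat"))

# read each file whole; fine as long as every file fits in memory
dfs = []
for filename in all_files:
    df, meta = pyreadstat.read_sas7bdat(filename)
    dfs.append(df)
frame = pd.concat(dfs, axis=0, ignore_index=True)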
So if the SAS data files are too big to read in whole and one plans to append all of them into one df, read them in chunks instead:
# reading each file in chunks of 10,000 rows keeps RAM usage bounded
li = []
for filename in all_files:
    # cols: the list of columns to read, as in the question
    reader = pyreadstat.read_file_in_chunks(pyreadstat.read_sas7bdat, filename,
                                            chunksize=10000, usecols=cols)
    for df, meta in reader:
        li.append(df)

frame = pd.concat(li, axis=0, ignore_index=True)
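For reference, the join behavior that emptied the file list is easy to reproduce; the snippet below calls the ntpath module explicitly so the Windows result shows up on any platform:

import ntpath

# a second argument that starts with a separator is treated as rooted:
# the directory part of the first argument is dropped (only the drive survives)
print(ntpath.join(r'C:\Users\myfolder', '/*.sas7bdat'))  # C:/*.sas7bdat
print(ntpath.join(r'C:\Users\myfolder', '*.sas7bdat'))   # C:\Users\myfolder\*.sas7bdat

On POSIX systems os.path.join drops the first argument entirely, so the glob comes back empty either way.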