Tags: python, pandas, directory, subdirectory, file-search

copy paste files from different directories to one folder


I'm stuck again! My story is:

I need to find files named "tv.sas7bdat" that exist in different folders under a directory and save the content of all the files found into a single Excel file on my desktop. With my current code I can get all the paths for that file and transfer each file's content to a dataframe. But I can't append all the dataframes into one single Excel file.

In my Excel file I find only the last dataframe!

Here is my code:

import pandas as pd
from sas7bdat import SAS7BDAT
import os

path = "\\"
newpath = "\\"

files = []
# r=root, d=directories, f = files
for r, d, f in os.walk(path):
    for file in f:
        if 'tv.sas7bdat' in file:
            files.append(os.path.join(r,  file))

lenf = range(len(files))
for f in files:
    print(f)

for df in lenf:

    with SAS7BDAT(f) as file:
        df = file.to_data_frame()
        print(df)

    group =pd.concat([df], axis=0, sort=True, ignore_index = True)
    df.to_excel(newpath + 'dataframes_tv.xlsx',index=False)

Solution

  • If you don't want to change your code much, you can enumerate your list of files and split the process: read the first file in the list to create the initial dataframe as a placeholder, then loop over the remaining files and append each of their dataframes to that initial one.

    EDIT: a snippet of your code using enumerate and your files list

        # read the first dataframe from the 1st list element
        df = SAS7BDAT(files[0]).to_data_frame()

        # enumerate the list to access the remaining elements
        for k, f in enumerate(files):
            # from the 2nd element onward
            if k > 0:
                with SAS7BDAT(f) as file:
                    # append each dataframe to the 1st one
                    # (pandas >= 2.0 removed DataFrame.append, so use pd.concat)
                    df = pd.concat([df, file.to_data_frame()],
                                   axis=0, sort=True, ignore_index=True)

        # write the combined dataframe to a single Excel file
        df.to_excel('dataframes_tv.xlsx', index=False)
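
  • For reference, the reason your original code keeps only the last dataframe is that the loop "for df in lenf:" never iterates over files: it always re-reads the last f left over from the earlier print loop, and df.to_excel(...) overwrites the same output file on every pass. A cleaner pattern (a minimal sketch that assumes the same folder layout, the sas7bdat package, and an installed Excel writer such as openpyxl) is to collect every dataframe in a list and call pd.concat once at the end:

        import os

        import pandas as pd
        from sas7bdat import SAS7BDAT

        path = "\\"     # root folder to search (placeholder, as in the question)
        newpath = "\\"  # output folder (placeholder, as in the question)

        # collect the full path of every matching file
        files = []
        for r, d, f in os.walk(path):
            for file in f:
                if 'tv.sas7bdat' in file:
                    files.append(os.path.join(r, file))

        # read each file into its own dataframe
        frames = []
        for f in files:
            with SAS7BDAT(f) as sas_file:
                frames.append(sas_file.to_data_frame())

        # concatenate once and write a single Excel file
        group = pd.concat(frames, axis=0, sort=True, ignore_index=True)
        group.to_excel(os.path.join(newpath, 'dataframes_tv.xlsx'), index=False)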