Search code examples
pythonpandascsvglobxls

Concatenate a .xls file list into .csv


I'm trying to concatenate a .xls file list into .csv:

import glob
import pandas as pd

file_xls = glob.glob(location + "\*.xls")
print(file_xls)
read_file = pd.read_excel(location + file_xls)

but I have an error about the concatenation of the list:

TypeError: can only concatenate str (not "list") to str

Is there a specific way to concatenate this list? Thanks in advance!

Goal: Fuse the the .xls files into .csv with the help of a list


Solution

  • I think instead of concatenating strings, you should use os.path.join():

    file_xls = glob.glob(location + "\*.xls")
    

    Will return a list of all the file names in the specified root (defined by the user as location)

    files = [pd.read_excel(os.path.join(location,x)) for x in file_xls]
    

    Returns a list of all files as pandas DataFrames.

    You can concat them by using pd.concat() and output them as csv with df.to_csv()

    output = pd.concat(files)
    output.to_csv(desired_output_path)
    

    You can also single line absolutely everything with:

    pd.concat([pd.read_excel(os.path.join(location,x)) for x in glob.glob(location + "\*.xls")]).to_csv(desired_output_path)