Tags: python, pandas, csv, google-colaboratory, google-drive-shared-drive

How to loop through multiple .csv files in a folder in Google Colab (Python) and pass each file to a function


I'm currently using Google Colab and have already mounted my Google Drive. I have a folder inside the drive that contains multiple .csv files.

e.g. folder name: dataset

folder content: data1.csv, data2.csv, data3.csv, and so on

I want to iterate through every file in the folder and pass each file to a function as a parameter.

Here's my code, but it doesn't work:

from google.colab import drive
drive.mount('/content/drive/')

def myfunction(data):
  ###function detail here###

dir = '/content/drive/dataset'

for files in dir:
  myfunction(pd.read_csv('filename'))

Thank you


Solution

  • You have to iterate over the files with a function such as os.listdir; looping over the string dir itself only yields its individual characters, not the files in the folder. The example below uses os.listdir and defensively checks that each entry is a .csv file before handing it to myfunction. I've used Google Colab's sample_data folder so that the code is reproducible; you will need to change the dir variable to point to your Google Drive folder.

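    import os
    import pandas as pd
    
    def myfunction(data):
      print(data)
    
    dir = 'sample_data'
    
    for file in os.listdir(dir):
      if file.endswith(".csv"):
        # os.listdir returns bare file names, so join each with the folder
        # before reading it into a DataFrame and passing it to the function
        myfunction(pd.read_csv(os.path.join(dir, file)))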
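
As a variation on the loop above, here is a minimal sketch that collects the full paths of the .csv files with pathlib, sorted by name so that data1.csv, data2.csv, and so on are processed in a predictable order. The helper name csv_paths is hypothetical, and the Drive path in the usage comment is an assumption: a folder in My Drive usually appears under /content/drive/MyDrive after mounting, but check the file browser in Colab for your actual location, especially for a shared drive.

```python
from pathlib import Path


def csv_paths(folder):
    """Return the .csv files directly inside `folder`, sorted by name.

    Hypothetical helper for illustration: sub-folders are skipped and the
    extension check is case-insensitive.
    """
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".csv"
    )


# Possible usage in Colab (the path below is an assumption, adjust as needed):
#
#   import pandas as pd
#   for path in csv_paths('/content/drive/MyDrive/dataset'):
#       myfunction(pd.read_csv(path))
```

Because each item is a full path, it can be passed to pd.read_csv directly, without joining it to the folder name first.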