Search code examples
pythonpostgresqlcsvpsycopg2

Create Postgresql table from csv file using psycopg2


Hi I looking for solutions that let me create table from csv files. I find solution on another forum. This code looks like this:

import csv
import psycopg2
import os
import glob


conn = psycopg2.connect("host= localhost port=5433  dbname=testDB user= postgres password= ************")
print("Connecting to Database")

cur = conn.cursor()
csvPath = "W:\Maciej.Olech\structure_files"

# Loop through each CSV
for filename in glob.glob(csvPath+"*.csv"):
# Create a table name
    tablename = filename.replace("W:\Maciej.Olech\structure_files", "").replace(".csv", "")
    print(tablename)

    # Open file
    fileInput = open(filename, "r")

    # Extract first line of file
    firstLine = fileInput.readline().strip()


    # Split columns into an array [...]
    columns = firstLine.split(",")
     

    # Build SQL code to drop table if exists and create table
    sqlQueryCreate = 'DROP TABLE IF EXISTS '+ tablename + ";\n"
    sqlQueryCreate += 'CREATE TABLE'+ tablename + "("

        #some loop or function according to your requiremennt
        # Define columns for table
    for column in columns:
        sqlQueryCreate += column + " VARCHAR(64),\n"

        sqlQueryCreate = sqlQueryCreate[:-2]
        sqlQueryCreate += ");"

cur.execute(sqlQueryCreate)
conn.commit()
cur.close()

I try to run this code but i get this error:

C:\Users\MACIEJ~1.OLE\AppData\Local\Temp/ipykernel_5320/1273240169.py in <module>
     40         sqlQueryCreate += ");"
     41 
---> 42 cur.execute(sqlQueryCreate)
     43 conn.commit()
     44 cur.close()

NameError: name 'sqlQueryCreate' is not defined

I don't understand why i have this error becouse sqlQueryCreate is defined. Any one have idea what is wrong? Thanks for any help.


Solution

  • There are a few issues with your code.

    1. In Windows, paths need to have the \ escaped.
    2. your cur.execute(sqlQueryCreate) and conn.commit() are indented wrong. ditto with sqlQueryCreate = sqlQueryCreate[:-2] and sqlQueryCreate += ");"
    3. Edit: Realized that your glob.glob() parameter isn't correct. What you intend: W:\\Jan.Bree\\structure_files\\*.csv, what you actually had W:\\Jan.Bree\\structure_files*.csv
    import csv
    import psycopg2
    import os
    import glob
    
    
    conn = psycopg2.connect("host= localhost port=5433  dbname=testDB user= postgres password= ************")
    print("Connecting to Database")
    
    cur = conn.cursor()
    csvPath = "W:\\Jan.Bree\\structure_files"
    
    # Loop through each CSV
    for filename in glob.glob(os.path.join(csvPath,"*.csv")):
    # Create a table name
        tablename = filename.replace("W:\\Jan.Bree\\structure_files", "").replace(".csv", "")
        print(tablename)
    
        # Open file
        fileInput = open(filename, "r")
    
        # Extract first line of file
        firstLine = fileInput.readline().strip()
    
        # Split columns into an array [...]
        columns = firstLine.split(",")
    
        # Build SQL code to drop table if exists and create table
        sqlQueryCreate = 'DROP TABLE IF EXISTS '+ tablename + ";\n"
        sqlQueryCreate += 'CREATE TABLE'+ tablename + "("
    
        #some loop or function according to your requiremennt
        # Define columns for table
        for column in columns:
            sqlQueryCreate += column + " VARCHAR(64),\n"
    
        sqlQueryCreate = sqlQueryCreate[:-2]
        sqlQueryCreate += ");"
    
        cur.execute(sqlQueryCreate)
        conn.commit()
    
    cur.close()
    

    This should cover the issues; but I have no way of testing the code as I don't use psycopg2. I'm assuming that the connect() works.