I am trying to truncate a postgres destination table prior to insert, and in general, trying to fire external functions utilizing the connections already created in GLUE.
Has anyone been able to do so?
I've tried the DROP/ TRUNCATE
scenario, but have not been able to do it with connections already created in Glue, but with a pure Python PostgreSQL driver, pg8000.
__init__.py
in the root folderPython lib path
of the job --
). Tick the "Server-side encryption" box.Then you can simply create a connection and execute SQL.
import sys
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.dynamicframe import DynamicFrame
from awsglue.job import Job
import pg8000
args = getResolvedOptions(sys.argv, [
'JOB_NAME',
'PW',
'HOST',
'USER',
'DB'
])
# ...
# Create Spark & Glue context
job = Job(glueContext)
job.init(args['JOB_NAME'], args)
# ...
config_port = 5432
conn = pg8000.connect(
database=args['DB'],
user=args['USER'],
password=args['PW'],
host=args['HOST'],
port=config_port
)
query = "TRUNCATE TABLE {0};".format(".".join([schema, table]))
cur = conn.cursor()
cur.execute(query)
conn.commit()
cur.close()
conn.close()