Now I am using psycopg2 to connect to postgresql and use the copy function to load the local txt data into the database.
Currently my code is as follows:
#!/usr/bin/env python
import time
import psycopg2
import psycopg2.extensions
from psycopg2.extras import LoggingConnection, LoggingCursor
import logging
logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger(__name__)
class MyLoggingCursor(LoggingCursor):
def execute(self, query, vars=None):
self.timestamp = time.time()
return super(MyLoggingCursor, self).execute(query, vars)
def callproc(self, procname, vars=None):
self.timestamp = time.time()
return super(MyLoggingCursor, self).callproc(procname, vars)
class MyLoggingConnection(LoggingConnection):
def filter(self, msg, curs):
return msg + " %d ms" % int((time.time() - curs.timestamp) * 1000)
def cursor(self, *args, **kwargs):
kwargs.setdefault('cursor_factory', MyLoggingCursor)
return LoggingConnection.cursor(self, *args, **kwargs)
db_settings = {
"database": "gps_stat",
"user": "user",
"password": "test@123",
"host": "localhost",
"port": 26000
}
target_txt = "test.txt"
table_name = 'gps_base'
start_time=time.time()
conn = psycopg2.connect(connection_factory=MyLoggingConnection, **db_settings)
conn.initialize(logger)
cur = conn.cursor('cursor_unique_name', cursor_factory=psycopg2.extras.DictCursor)
f = open(target_txt, 'r')
cur.copy_from(f, table_name, sep=',')
conn.commit()
conn.close()
f.close()
print("Time:{}ms".format(time.time()-start_time))
But the MyLoggingConnection
does not meet my needs.
There are about 200 million pieces of data in the test_txt. After the above program runs, it prints Time: 2000ms
, but in fact I waited about 20 minutes to get the result of Time: 2000ms
.
What can I do to get the real execution time of the copy_from() function?
Initially I had same issue and then I have written small program to verify this and came to know this
start_time=time.time()
time.sleep(3)
print("Total Time Taken :: {} ms".format(time.time()-start_time))
and output is
Total Time Taken :: 3.010446071624756 ms
It means, time which you are getting is in seconds as I am putting time on sleep for 3 sec. If it is in ms then, result should have been 3000+ ms whereas result is 3.010 which is nothing but sec. I believe your execution is taking total 2000 sec.
happy coding!