Tags: python, cursor, vertica, python-datetime, strptime

Cursor's iterate() method sometimes fails when converting a string to datetime


My goal is to fetch rows from Vertica that contain a column of type Timestamp in the format YYYY-MM-DD HH:MM:SS and process them. My problem is that the cursor.iterate() function sometimes throws an error, and I do not know why. Note that I am using the vertica_python library.

Error Message:

`(<type 'exceptions.AttributeError'>, AttributeError('_strptime',), <traceback object at 0x7f9f2fc29ef0>)`

**Traceback:** 
`Traceback (most recent call last):
  File "/root/..../sync.py", line 129, in fetch_exception_table
    for row in cur.iterate():
  File "/usr/lib/python2.7/site-packages/vertica_python/vertica/cursor.py", line 363, in iterate
    row = self.fetchone()
  File "/usr/lib/python2.7/site-packages/vertica_python/vertica/cursor.py", line 267, in fetchone
    row = self.row_formatter(self._message)
  File "/usr/lib/python2.7/site-packages/vertica_python/vertica/cursor.py", line 446, in row_formatter
    return self.format_row_as_array(row_data)
  File "/usr/lib/python2.7/site-packages/vertica_python/vertica/cursor.py", line 462, in format_row_as_array
    for idx, value in enumerate(row_data.values)]
  File "/usr/lib/python2.7/site-packages/vertica_python/vertica/column.py", line 212, in convert
    return self.converter(s) if self.converter is not None else s
  File "/usr/lib/python2.7/site-packages/vertica_python/vertica/column.py", line 77, in timestamp_parse
    dt = _timestamp_parse(s)
  File "/usr/lib/python2.7/site-packages/vertica_python/vertica/column.py", line 94, in _timestamp_parse
    return datetime.strptime(s, '%Y-%m-%d %H:%M:%S.%f')
AttributeError: _strptime`

My code:

        csv_buffer1 = cStringIO.StringIO()

        connection = Connection(config)
        cur = connection.get_cursor()

        cur.execute("select {} from DB.Table order by column;"
                    .format(HEADER_COLUMNS))

        for row in cur.iterate():
            csv_buffer1.write(tabulate(row, lsp=True))

My Connection class:

from vertica_python import connect

class Connection:
    cur = None

    def __init__(self, config):
        self.config = config
        self.connObject = connect(**dict(config.items()))

    def get_cursor(self):
        self.cur = self.connObject.cursor()
        return self.cur
    #todo - return self.conn.... directly

    def close_connection(self):
        self.connObject.close()

In the code above, execution fails in the for loop.

The code does not always fail: sometimes it runs successfully, and other times it raises the error above.

The data in the Vertica DB looks like this:

[Figure: Vertica Timestamp column]

Furthermore, switching to cur.fetchall() or cur.fetchone() instead of cur.iterate() does not help with debugging, because all three go through the same functions in cursor.py and column.py.


Solution

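  • This problem was caused by the use of multi-threading in my code. datetime.strptime imports the _strptime module lazily, on its first call. When several threads make that first call at about the same time, the import appears to be locked or still in progress in one thread, so another thread requesting it fails with AttributeError: _strptime. That also explains why the failure is intermittent.

    This has not happened to me in Python 3 yet, so I assume it was fixed there.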

    I can think of two possible workarounds:

    1. In addition to importing datetime, also import _strptime, even though it is not used directly: `import datetime` followed by `import _strptime`.

    2. Use the function like so: `datetime.datetime.strptime('2019-12-11 08:58:01', '%Y-%m-%d %H:%M:%S')`
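
As a minimal sketch of the first workaround (not my actual sync code), the snippet below imports _strptime in the main thread before any worker thread exists, then parses the same timestamp from several threads. The names parse_all, parse_in_threads and TIMESTAMP_FORMAT are hypothetical and only there for illustration. On Python 3 the extra import should be harmless.

```python
import _strptime  # noqa: F401 -- unused on purpose; forces the import before threads start
import threading
from datetime import datetime

# Hypothetical format constant matching the YYYY-MM-DD HH:MM:SS column.
TIMESTAMP_FORMAT = '%Y-%m-%d %H:%M:%S'


def parse_all(values, results, index):
    """Parse every string in `values` and store the list in results[index]."""
    results[index] = [datetime.strptime(v, TIMESTAMP_FORMAT) for v in values]


def parse_in_threads(values, thread_count=4):
    """Run parse_all in `thread_count` threads and return one result list per thread."""
    results = [None] * thread_count
    threads = [
        threading.Thread(target=parse_all, args=(values, results, i))
        for i in range(thread_count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == '__main__':
    parsed = parse_in_threads(['2019-12-11 08:58:01'])
    print(parsed[0][0])
```

The point is only the ordering: the module-level import runs once, in the importing thread, so no worker thread has to trigger the lazy import itself.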