Search code examples
mysqlinnodbitunesmysql-python

MySQL > Not showing data even when size is increasing and rows can be seen on a database viewer


Inserting lots of data in a mysql database. Steps being followed are : https://affiliate.itunes.apple.com/resources/documentation/epfimporter/

The problem is everything seems to be going well apart from I can't see the data. I tried select count(*) on the table as well and it returns 0 rows. I tried on command line for the same.

Size of DB is increasing.

I am using Sequel Pro and when I press refresh it shows the row count. And with time as my script keeps inserting more data, the row count and size is increasing as well. But after the loader completes row count shows 0 but size is still a big number.

enter image description here

The ingester code snippet. Full code can be found in link given above.

def _populateTable(self, tableName, resumeNum=0, isIncremental=False, skipKeyViolators=False):
    """
    Populate tableName with data fetched by the parser, first advancing to resumePos.

    For Full imports, if skipKeyViolators is True, any insertions which would violate the primary key constraint
    will be skipped and won't log errors.
    """
    #REPLACE is a MySQL extension which inserts if the key is new, or deletes and inserts if the key is a duplicate
    commandString = ("REPLACE" if isIncremental else "INSERT")
    ignoreString = ("IGNORE" if (skipKeyViolators and not isIncremental) else "")
    exStrTemplate = """%s %s INTO %s %s VALUES %s"""
    colNamesStr = "(%s)" % (", ".join(self.parser.columnNames))

    self.parser.seekToRecord(resumeNum) #advance to resumeNum
    conn = self.connect()

    while (True):
        #By default, we concatenate 200 inserts into a single INSERT statement.
        #a large batch size per insert improves performance, until you start hitting max_packet_size issues.
        #If you increase MySQL server's max_packet_size, you may get increased performance by increasing maxNum
        records = self.parser.nextRecords(maxNum=300)
        if (not records):
            break

        escapedRecords = self._escapeRecords(records) #This will sanitize the records
        stringList = ["(%s)" % (", ".join(aRecord)) for aRecord in escapedRecords]

        cur = conn.cursor()
        colVals = unicode(", ".join(stringList), 'utf-8')
        exStr = exStrTemplate % (commandString, ignoreString, tableName, colNamesStr, colVals)
        #unquote NULLs
        exStr = exStr.replace("'NULL'", "NULL")
        exStr = exStr.replace("'null'", "NULL")

        try:
            cur.execute(exStr)
        except MySQLdb.Warning, e:
            LOGGER.warning(str(e))
        except MySQLdb.IntegrityError, e:
        #This is likely a primary key constraint violation; should only be hit if skipKeyViolators is False
            LOGGER.error("Error %d: %s", e.args[0], e.args[1])

        self.lastRecordIngested = self.parser.latestRecordNum
        recCheck = self._checkProgress()
        if recCheck:
            LOGGER.info("...at record %i...", recCheck)

    conn.close()

I even waited for script to complete but no good came out.

MySQl version 5.6.23

Any help is greatly appreciated.


Solution

  • It sounds like

    • you have autocommit=0 and
    • did not do START TRANSACTION and
    • have not yet done COMMIT.

    Building multi-row INSERTs is a good idea. 200 is good. But commit (or autocommit) each one.

    REPLACE is not as good as INSERT .. ON DUPLICATE KEY UPDATE; but the latter has a different syntax.

    Using character set conversion routines (unicode()) scares me. Often that leads to a mess. If everything is set correctly, no explicit conversions are needed.