I am getting hundreds of positional errors with the Python MySQL connector while trying to populate a database. I figure it's because I'm not able to escape some or all of the special characters. Is there a way to escape all the special characters that are causing this problem, or should I just go ahead and replace the special characters in each string with other neutral characters and create a dictionary to restructure the sentence back to the original when I pull the data back from the database?
for thing in every_thing:
    line_ = thing.split("|", 3)
    book_name = line_[0]
    chap_number = line_[1]
    verse_num = line_[2]
    verse = line_[3]
    # Test
    statement = "INSERT INTO " + new_dict[line_[0]]
    statement += " ( "
    chapters = book_key_chapt[line_[0]]
    h = 1
    fields = ""
    while h <= chapters:
        fields += "Chapter_" + str(h)
        h += 1
        if h != chapters + 1:
            fields += ","
        else:
            fields += " )"
    statement += fields
    # Value creation
    values = " VALUES ( "
    a = 1
    while a <= chapters:
        if a == int(chap_number):
            values += "'" + verse_num + verse.replace('~', " ") + "'"
        else:
            values += "'NA'"
        a += 1
        if a != chapters + 1:
            values += ","
    values += " )"
    statement += values
    try:
        cur.execute(*statement)
        print(cur)
    except Exception as er:
        print(er)
Here:
cur.execute(*statement)
The "*" means "unpack before passing". It's usually applied to lists or tuples but works on any iterable - and strings are iterable. So basically, what your doing here is:
cur.execute(statemement[0], statement[1], statement[2], # etc)
So we know your statement
string is 844 characters long ;-)
The solution is, of course, to remove the "*".
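As a quick standalone illustration of what the "*" does to a string (a minimal sketch with made-up names, not part of your code):
def show_args(*args):
    print(len(args), "positional argument(s) received")

stmt = "INSERT INTO some_table (a, b) VALUES ('x', 'y')"
show_args(stmt)   # 1 -- the whole string is a single argument
show_args(*stmt)  # one argument per character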
This being said, your code is totally unsafe (it's wide open to SQL injection), and even if no one uses this breach to attack you, it's still very brittle - properly escaping strings is not that simple. The proper way to use the db-api is to put placeholders in the SQL query and pass the values as a separate argument to cur.execute().
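For example, with a fixed query (the table and column names here are made up, just to show the pattern - MySQL Connector/Python uses %s as its placeholder marker):
query = (
    "INSERT INTO verses (book_name, chap_number, verse_num, verse) "
    "VALUES (%s, %s, %s, %s)"
)
cur.execute(query, (book_name, chap_number, verse_num, verse))
The connector then takes care of quoting/escaping the values for you.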
Note that this only applies to values, not to table and field names, so you still have to build that part of the SQL query programmatically, but you can make your life much easier (and your code a bit faster) by using string formatting and by building your fields list and placeholders list with str.join(), i.e.:
statement = "INSERT INTO {table_name} ({fields}) VALUES ({placeholders})"
# NB: you may want to read about the stdlib's `csv` package...
book_name, chap_number, verse_num, verse = thing.split("|", 3)
table_name = new_dict[book_name]
nb_chapters = book_key_chapt[book_name]
fields = ["Chapter_{}".format(i) for i in range(1, nb_chapters + 1)]
placeholders = ["?"] * nb_chapters
# XXX I let you rewrite your code here using a list comprehension as above --
# based on your original logic it would be something like:
values = [
    verse_num + verse.replace("~", " ") if i == int(chap_number) else "NA"
    for i in range(1, nb_chapters + 1)
]
sql = statement.format(
    table_name=table_name,
    fields=", ".join(fields),
    placeholders=", ".join(placeholders),
)
cur.execute(sql, values)
And as a last (but important!) note: it looks like you have one distinct table per book name, each with as many fields as the book has chapters? If so, that's a totally wrong design. If the point is to model books and chapters, you want one single "books" table (one record per book) and a related "chapters" table with a foreign key to the "books" table (one record per chapter).
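For instance, a rough sketch of that kind of schema (column names and types are just suggestions):
cur.execute("""
    CREATE TABLE books (
        id INT AUTO_INCREMENT PRIMARY KEY,
        name VARCHAR(255) NOT NULL UNIQUE
    )
""")
cur.execute("""
    CREATE TABLE chapters (
        id INT AUTO_INCREMENT PRIMARY KEY,
        book_id INT NOT NULL,
        chap_number INT NOT NULL,
        FOREIGN KEY (book_id) REFERENCES books (id)
    )
""")
# verses would most likely go in a third table referencing chapters the same way
Then every insert is a plain fixed-column INSERT with placeholders, no matter how many books or chapters you have.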