Version: Postgresql = 11.3 Python = 3.7.3
I have a python script that is collecting data from the web and storing it in a database for further analysis.
The below sql statement always returns false when it attempts to compare the variable that has been fetched from the web against what already exists in the database. I think this happens because of a mismatch in the encoding of the variable.
As an example, the variable = Σ1, and Σ1 already exists in the database, so the below sql statement should return TRUE but instead returns FALSE.
cur.execute("SELECT EXISTS(SELECT 1 FROM table_name WHERE column_name = %s)", (variable,))
When I run "cur.query" it shows the following query is being executed:
b"SELECT EXISTS(SELECT 1 FROM table_name WHERE column_name = '\xce\xa31')"
In pgadmin4, the variable is correctly stored as Σ1. If I run the sql query above directly in pgadmin4 with "Σ1" it returns true as expected. But when I query to check if that variable already exists in the database from the python script, it returns false because it's comparing "Σ1" in the database against "\xce\xa31" in the sql query.
When I run "show CLIENT_ENCODING;" in pgadmin4 it shows "UTF8" and when I print "print(conn.encoding)" immediately before the sql statement in my script it also shows "UTF8".
Where am I going wrong?
If I do this:
>>> b"SELECT EXISTS(SELECT 1 FROM table_name WHERE column_name = '\xce\xa31')".decode("utf-8")
"SELECT EXISTS(SELECT 1 FROM table_name WHERE column_name = 'Σ1')"
it shows that you're not doing anything wrong. The query that you are seeing displayed back is shown as UTF-8 bytes but it is exactly the same as the Unicode query you want. I suspect that the Greek symbol in the database isn't exactly what your select
expects to find. That may be because there are in fact several Unicode sigma signs: U+03A3, U+2211 and a few others besides.