
pgbouncer-rr is failing in query rewrite


I am using pgbouncer-rr to rewrite queries in a Redshift cluster. pgbouncer calls rewrite_query.py to do the rewrites; more information on the project is at https://github.com/awslabs/pgbouncer-rr-patch. pgbouncer-rr is based on pgbouncer: its code is merged into pgbouncer, which is then started. I have been using it successfully to rewrite queries, but I am running into an issue when trying to convert an INSERT statement into an UNLOAD to an external table.

The execution flow is pgbouncer -> rewrite.c -> pycall.c -> rewrite_query.py

In the Python module I log the original and the converted SQL. Here is how it looks in the log, so the Python module is able to do the conversion without any issues:

06:03:35AM on January 31, 2019 ---- INSERT INTO nm1(c1) select c1 from t
06:03:35AM on January 31, 2019 ---- CONVERTED TO:
06:03:35AM on January 31, 2019 ---- unload (' SELECT c1 FROM t' ) to 's3://mybucket/nm1/' iam_role 'arn:aws:iam::99999:role/RedshiftDefaultRole,arn:aws:iam::99999:role/RedshiftWriteAccess' ALLOWOVERWRITE ; insert into schema1.nm1_decoy select (1) from x.nm1;

But the pgbouncer log shows the query being returned unchanged:

2019-01-31 06:28:20.980 989 NOISE C-0x22ebe60: dev/[email protected]:57222 pkt='Q' len=42
2019-01-31 06:28:20.980 989 DEBUG C-0x22ebe60: dev/[email protected]:57222 rewrite_query: Username => dbuser
2019-01-31 06:28:20.980 989 DEBUG C-0x22ebe60: dev/[email protected]:57222 rewrite_query: Orig Query=> INSERT INTO nm1(c1) select c1 from t
2019-01-31 06:28:21.011 989 WARNING C-0x22ebe60: dev/[email protected]:57222 pValue right after the call PyString_AsString(pValue): unload (' SELECT c1 FROM t' ) to 's3://mybucket/nm1/' iam_role 'arn:aws:iam::99999:role/RedshiftDefaultRole,arn:aws:iam::99999:role/RedshiftWriteAccess' ALLOWOVERWRITE ; insert into schema1.nm1_decoy select (1) from crm_unload.nm1;
2019-01-31 06:28:21.011 989 WARNING C-0x22ebe60: dev/[email protected]:57222 Result after PyString_AsString(pValue) and in else NULL condition: (null)
2019-01-31 06:28:21.011 989 DEBUG C-0x22ebe60: dev/[email protected]:57222 query unchanged

Here is the code in pgbouncer/src/pycall.c that calls the rewrite_query.py module for the conversion. I don't understand C data structures or how they interact with Python; I just added the logging statements for debugging. It looks like, for some reason, execution is going into the else branch of the PyString_Check(pValue) check. Why does the check fail when pValue is a string? Instead of returning the UNLOAD query, pgbouncer returns the original INSERT statement after failing the PyString_Check(pValue) check.

        pValue = PyObject_CallObject(pFunc, pArgs);
        if (pValue == NULL) {
                slog_error(client, "Python Function <%s> failed to return a value",
                                py_function);
                goto finish;
        }
        /* debug: log only after the NULL check, since PyString_AsString(NULL) would crash */
        slog_warning(client,"pValue right after the call PyString_AsString(pValue): %s", PyString_AsString(pValue));
        if (PyString_Check(pValue)) {
                slog_warning(client,"PyStringCheck succeeded on rewrite query return value pValue.");
                res = strdup(PyString_AsString(pValue));
                slog_warning(client,"Result after PyString_AsString(pValue) and strdup() call: %s",res);
        } else {
                res = NULL;
                slog_warning(client,"Result after PyString_AsString(pValue) and in else NULL condition: %s",res);
        }

Solution

  • This was due to the Python object type changing from str to unicode. I am using sqlparse to do the parsing, and this module was converting the variable to unicode, which then failed the PyString_Check check. To fix the issue, after the call to sqlparse I encoded the variable to ASCII in Python (version 2.7.15): varname.encode("ascii")
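
The fix described above can be sketched as a small helper in rewrite_query.py. This is only an illustration, not code from pgbouncer-rr: the helper name to_native_str and the sample SQL are hypothetical, and the ASCII default simply mirrors the encode("ascii") call in the answer. The idea is to make sure the value handed back to pycall.c is a plain str, because on Python 2 a unicode object does not pass PyString_Check.

```python
import sys

PY2 = sys.version_info[0] == 2


def to_native_str(sql, encoding="ascii"):
    """Return `sql` as the interpreter's native `str` type (hypothetical helper)."""
    if PY2:
        # Python 2: sqlparse hands back `unicode`; encode it to a byte `str`
        # so that PyString_Check() in pycall.c accepts the return value.
        if isinstance(sql, unicode):  # noqa: F821 (name exists only on Python 2)
            return sql.encode(encoding)
        return sql
    # Python 3: `str` is already the text type; only bytes need decoding.
    if isinstance(sql, bytes):
        return sql.decode(encoding)
    return sql


# Example: a unicode value such as the one sqlparse produced in the question.
converted = u"unload (' SELECT c1 FROM t' ) to 's3://mybucket/nm1/'"
result = to_native_str(converted)
print(type(result).__name__)
```

Logging type(result).__name__ next to the converted SQL in the Python module is also a quick way to spot this kind of mismatch, since the text of the query looks identical in the log whether it is str or unicode. Note that encoding to ASCII raises UnicodeEncodeError if the query contains non-ASCII characters, so a different encoding may be needed in that case.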