
Weird behavior when doing POST by PyCurl


I have some simple code that posts JSON data to a remote server:

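import json
import pycurl
from StringIO import StringIO  # Python 2, matching the print statements below

def main():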
    headers = {}
    headers['Content-Type'] = 'application/json'

    target_url = r'the_url'

    data = {"bodyTextPlain": "O estimulante concorrente dos azulzinhos\r\nConhe\u00e7a a nova tend\u00eancia em estimulante masculino e feminino\r\n\r\nEste estimulante ficou conhecido por seus efeitos similares as p\u00edlulas\r\nazuis,\r\ndestacando-se por n\u00e3o possuir contraindica\u00e7\u00e3o ou efeito colateral.\r\n\r\nSucesso de vendas e principal concorrente natural dos azulzinhos,\r\nsua f\u00f3rmula \u00e9 totalmente natural e livre de qu\u00edmicos.\r\n\r\nPossuindo registro no Minist\u00e9rio da Sa\u00fade (ANVISA) e atestado de\r\nautenticidade.\r\n\r\nSaiba mais http://www5.somenteagora.com.br/maca\r\nAdquirindo 3 frascos voc\u00ea ganha +1 de brinde. Somente esta semana!\r\n\r\n\r\n\r\n\r\nPare de receber\r\nhttp://www5.somenteagora.com.br/app/sair/3056321/1\r\n\r\n"}

    buffer = StringIO()
    curl = pycurl.Curl()
    curl.setopt(curl.URL, target_url)
    curl.setopt(pycurl.HTTPHEADER, ['%s: %s' % (k, v) for k, v in headers.items()])

    # this line causes the problem
    curl.setopt(curl.POSTFIELDS, json.dumps(data))

    curl.setopt(pycurl.SSL_VERIFYPEER, False)
    curl.setopt(pycurl.SSL_VERIFYHOST, False)
    curl.setopt(pycurl.WRITEFUNCTION, buffer.write)
    curl.perform()

    response = buffer.getvalue()

    print curl.getinfo(pycurl.HTTP_CODE)
    print response

The remote server fails to parse the JSON string I send:

500 { "status" : "Error", "message" : "Unexpected IOException (of type java.io.CharConversionException): Invalid UTF-32 character 0x3081a901(above 10ffff) at char #7, byte #31)" }

However, if I first save the result of json.dumps to a variable and then post that:

    #curl.setopt(curl.POSTFIELDS, json.dumps(data))

    data_s = json.dumps(data)
    curl.setopt(curl.POSTFIELDS, data_s)

Then there is no error:

200

Is there any difference between these two cases?

Thanks.


Solution

  • This is a marvelously subtle question. The answer lies in this warning in the documentation for Curl.setopt_string(option, value):

    Warning: No checking is performed that option does, in fact, expect a string value. Using this method incorrectly can crash the program and may lead to a security vulnerability. Furthermore, it is on the application to ensure that the value object does not get garbage collected while libcurl is using it. libcurl copies most string options but not all; one option whose value is not copied by libcurl is CURLOPT_POSTFIELDS.

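    When you assign the result to a variable, the variable holds a reference to the string, so it is not garbage collected while libcurl still needs it. When you inline the expression, nothing on the Python side refers to the temporary string after the setopt call, so it can be deallocated before curl.perform() reads it, with unpredictable results.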
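The lifetime difference can be illustrated without libcurl at all. The following is a minimal sketch in pure Python, assuming CPython's reference counting; `Payload` and `NonCopyingHandle` are hypothetical stand-ins (not part of PycURL) for the POST body and for an option whose value is not copied.

```python
import gc
import weakref


class Payload:
    """Hypothetical stand-in for the POST body (plain str cannot be weakly referenced)."""

    def __init__(self, text):
        self.text = text


class NonCopyingHandle:
    """Hypothetical handle that remembers the value without keeping it alive,
    mimicking an option whose data is not copied."""

    def __init__(self):
        self._body_ref = None

    def set_body(self, payload):
        # A weak reference does not extend the payload's lifetime.
        self._body_ref = weakref.ref(payload)

    def body_still_alive(self):
        return self._body_ref is not None and self._body_ref() is not None


def inline_case():
    handle = NonCopyingHandle()
    handle.set_body(Payload('{"k": "v"}'))  # temporary: no name is bound to it
    gc.collect()
    return handle.body_still_alive()


def variable_case():
    handle = NonCopyingHandle()
    payload = Payload('{"k": "v"}')  # the local name keeps the object alive
    handle.set_body(payload)
    gc.collect()
    return handle.body_still_alive()
```

Under CPython, `inline_case()` returns `False` and `variable_case()` returns `True`: the only difference is whether a name still refers to the object when it is needed later.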

    To avoid having to worry about the lifetime of your objects, you can use CURLOPT_COPYPOSTFIELDS instead, which makes libcurl take its own copy of the data when the option is set.
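A minimal sketch of that approach follows, written for Python 3 and assuming the installed PycURL exposes the option as `pycurl.COPYPOSTFIELDS`. The helper names `encode_body` and `post_json` are made up for illustration, and the SSL options from the question are left at their defaults here.

```python
import json
from io import BytesIO


def encode_body(data):
    """Serialize to JSON bytes; json.dumps escapes non-ASCII characters by default."""
    return json.dumps(data).encode("utf-8")


def post_json(url, data):
    """POST data as JSON and return (status_code, response_bytes)."""
    import pycurl  # imported lazily so encode_body is usable without PycURL

    body = encode_body(data)  # named reference, alive until this function returns
    response = BytesIO()

    curl = pycurl.Curl()
    try:
        curl.setopt(pycurl.URL, url)
        curl.setopt(pycurl.HTTPHEADER, ["Content-Type: application/json"])
        # Assumption: COPYPOSTFIELDS is available; libcurl then copies the body.
        curl.setopt(pycurl.COPYPOSTFIELDS, body)
        curl.setopt(pycurl.WRITEDATA, response)
        curl.perform()
        status = curl.getinfo(pycurl.RESPONSE_CODE)
    finally:
        curl.close()
    return status, response.getvalue()
```

Binding the body to a local name as well as using the copying option is deliberately redundant: either measure alone should keep the data valid until `perform()` has sent it.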