Tags: python, image, base64, data-uri

How to parse a data URI in Python?


HTML image elements have this simplified format:

<img src='something'>

That something can be a data URI, for example:

...

Is there a standard way of parsing this with Python, so that I get content_type and the base64 data separately, or should I write my own parser for this?


Solution

  • Split the data URI on the "base64," marker to separate the header from the base64-encoded data. Call base64.b64decode to decode that data to bytes. Finally, write the bytes to a file.

    from base64 import b64decode
    
    data_uri = "..."
    
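    # Works on Python 2 and on Python 3 before 3.4 (and on later versions too).
    # header keeps everything before "base64,", e.g. the "data:<content type>;" part.
    header, encoded = data_uri.split("base64,", 1)
    data = b64decode(encoded)
    
    # Alternative for Python 3.4+: urlopen understands data: URIs directly.
    # from urllib import request
    # with request.urlopen(data_uri) as response:
    #     data = response.read()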
    
    with open("image.png", "wb") as f:
        f.write(data)
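
To get the content type and the decoded bytes separately, as the question asks, a small helper along these lines may be enough. This is a minimal sketch, not a standard-library API: `parse_data_uri` is a hypothetical name, and it assumes a URI of the form `data:[<media type>][;base64],<data>`. It drops extra parameters such as `charset`, and falls back to `text/plain` when the media type is omitted.

```python
from base64 import b64decode
from urllib.parse import unquote_to_bytes


def parse_data_uri(data_uri):
    """Return (content_type, data_bytes) for a data URI.

    Hypothetical helper for illustration; it ignores parameters
    such as charset and does no validation of the media type.
    """
    scheme, colon, rest = data_uri.partition(":")
    if not colon or scheme.lower() != "data":
        raise ValueError("not a data URI")

    # Everything before the first comma is the header, the rest is the payload.
    header, comma, payload = rest.partition(",")
    if not comma:
        raise ValueError("data URI has no comma between header and data")

    parts = header.split(";")
    is_base64 = parts[-1].lower() == "base64"
    if is_base64:
        parts = parts[:-1]

    # Assumed default when the media type is left out.
    content_type = parts[0] or "text/plain"

    if is_base64:
        data = b64decode(payload)
    else:
        # Non-base64 payloads are percent-encoded text.
        data = unquote_to_bytes(payload)
    return content_type, data


content_type, data = parse_data_uri("data:text/plain;base64,SGVsbG8=")
print(content_type, data)
```

The returned content type can then be used to pick a file extension instead of hard-coding "image.png".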