python xml

How do I save an XML file locally from a website using Python and the requests module?


I'm using Python for a web scraping project, and I came across this URL, which downloads an XML file to my PC.

Is there a way I can access the XML file that's downloaded when you click the link? I'm ok with saving the XML locally if that's the only way, but I have no idea how to do so.

I've tried using the requests module, but all I get back is a byte string.

import requests

r = requests.get(
    "https://fnet.bmfbovespa.com.br/fnet/publico/downloadDocumento?id=465601"
)

print(r.content)

Solution

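  • You need to send request headers (here, a browser-style Accept header) to download from that specific site. The byte string in r.content is the file itself, so you can write it to disk in binary mode.

    Here is how I did it: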

    import requests
    
    filename = "file.xml"
    url = "https://fnet.bmfbovespa.com.br/fnet/publico/downloadDocumento?id=465601"
    # Browser-style Accept header sent with the request
    headers = {
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8'
    }
    
    r = requests.get(url, headers=headers)
    r.raise_for_status()  # stop here on an HTTP error instead of saving an error page
    
    # r.content is bytes, so open the file in binary mode ("wb")
    with open(filename, "wb") as file:
        file.write(r.content)
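
If the goal is to read the XML rather than keep a copy on disk, the bytes can also be parsed in memory. The following is only a minimal sketch, assuming the response body is well-formed XML; `parse_xml_bytes` is a hypothetical helper name, and the sample document stands in for the real file, whose tag names are not shown in the question.

```python
import xml.etree.ElementTree as ET


def parse_xml_bytes(data: bytes) -> ET.Element:
    """Parse raw XML bytes (such as r.content) and return the root element."""
    # ET.fromstring accepts bytes and uses the encoding declared in the XML prolog.
    return ET.fromstring(data)


# Stand-in document for illustration only; the real file's structure is an assumption.
sample = b'<?xml version="1.0" encoding="UTF-8"?><doc><item id="1">hello</item></doc>'

root = parse_xml_bytes(sample)
print(root.tag)                # doc
print(root.find("item").text)  # hello
```

With the request from the answer above, this would be called as `root = parse_xml_bytes(r.content)`. If the server returns something other than plain XML, `ET.fromstring` raises `xml.etree.ElementTree.ParseError`, which is a useful signal to inspect `r.headers` and the first bytes of `r.content`.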