Tags: python, ajax, python-requests, request

How to download a file with Python from a portal that requires a login?


I am trying to download a file using the following request:

from io import BytesIO

import pandas as pd
import requests

url = 'https://totoro.banrep.gov.co/analytics/saw.dll?Go&ViewID=o%3ago%7er%3areport&Action=Download&SearchID=g8of6g3fbae0jp7va1ru23h9rm&ViewName=compoundView%211&fmapId=I5IMHw&ViewState=d8lnokl6jm050tk65k2v7mq8k6&ItemName=1.2.5.IPC_Serie_variaciones&path=%2fshared%2fSeries%20Estad%c3%adsticas_T%2f1.%20IPC%20base%202018%2f1.2.%20Por%20a%c3%b1o%2f1.2.5.IPC_Serie_variaciones&Format=excel2007&Extension=.xlsx&bNotSaveCommand=true'
username = 'publico'
password = 'publico123'
payload = {'NQUser': username, 'NQPassword': password}
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123.0.0.0 Safari/537.36'}

# Request the report and try to open the response body as an Excel workbook
respuesta = requests.get(url, headers=headers, data=payload)
xls = pd.ExcelFile(BytesIO(respuesta.content))

However, the requests come back as unauthorized, and I cannot understand why, since both the username and the password are correct. What could I do in this case? Are there any parameters that I am missing?

Note: Using Selenium to simulate a user is not an option for me, since the download needs to run inside an Azure Function.

I really appreciate the help!

I was hoping that the code above would return the data I need; instead, the requests are rejected as unauthorized.


Solution

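  • Try this: log in first with a POST through a requests.Session, which keeps the cookies set at login, then download the file with the same session. Here I saved the file to disk for testing, but you can adapt the code.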

    Side comment: never, EVER, post credentials on the web. Here the information does not look very sensitive, but if there is a password, however weak, it's for a reason, and I should not be able to download the file.

    import requests

    session_url = "https://totoro.banrep.gov.co/analytics/saw.dll?Logon"
    file_url = "https://totoro.banrep.gov.co/analytics/saw.dll?Go&ViewID=o%3ago%7er%3areport&Action=Download&SearchID=g8of6g3fbae0jp7va1ru23h9rm&ViewName=compoundView%211&fmapId=I5IMHw&ViewState=d8lnokl6jm050tk65k2v7mq8k6&ItemName=1.2.5.IPC_Serie_variaciones&path=%2fshared%2fSeries%20Estad%c3%adsticas_T%2f1.%20IPC%20base%202018%2f1.2.%20Por%20a%c3%b1o%2f1.2.5.IPC_Serie_variaciones&Format=excel2007&Extension=.xlsx&bNotSaveCommand=true"
    payload = {"NQUser": "publico", "NQPassword": "publico123"}

    headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123.0.0.0 Safari/537.36"}

    # A Session stores the cookies set at login and sends them with later requests.
    # The "with" block closes the session automatically.
    with requests.Session() as session:
        # Log in first: POST the credentials to the logon endpoint.
        login_resp = session.post(session_url, headers=headers, data=payload)
        # Then download the file with the same session.
        file_resp = session.get(file_url, headers=headers)
        # Stop on an HTTP error instead of saving an error page to disk.
        file_resp.raise_for_status()

    with open("file.xlsx", "wb") as f:
        f.write(file_resp.content)
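
To adapt this to the in-memory use from the question (for example inside an Azure Function), the same login-then-download flow can be wrapped in a function that returns a BytesIO. This is a minimal sketch, not a tested client for this portal: fetch_report, payload_from_env, and the PORTAL_USER / PORTAL_PASSWORD variable names are hypothetical, and the ZIP-signature check assumes that a failed login returns an HTML page rather than an error status. Reading the credentials from environment variables also keeps them out of the source code.

```python
import os
from io import BytesIO

# .xlsx files are ZIP archives, so they start with the ZIP local-file signature.
XLSX_MAGIC = b"PK\x03\x04"


def payload_from_env(env=os.environ):
    """Build the login form from environment variables (hypothetical names)."""
    return {"NQUser": env["PORTAL_USER"], "NQPassword": env["PORTAL_PASSWORD"]}


def fetch_report(session, login_url, file_url, payload, headers=None, timeout=30):
    """Log in with `session`, download `file_url`, and return the body as BytesIO.

    `session` is expected to behave like requests.Session (post/get methods
    returning objects with .content and .raise_for_status()).
    """
    # Step 1: POST the credentials; the session keeps any cookies it receives.
    login_resp = session.post(login_url, data=payload, headers=headers, timeout=timeout)
    login_resp.raise_for_status()

    # Step 2: GET the file with the same session.
    file_resp = session.get(file_url, headers=headers, timeout=timeout)
    file_resp.raise_for_status()

    # Step 3: make sure we got a workbook and not, say, a login page.
    if not file_resp.content.startswith(XLSX_MAGIC):
        raise ValueError("Response is not an .xlsx file; the login may have failed")
    return BytesIO(file_resp.content)


# Usage (needs requests and pandas; session_url, file_url, headers as above):
#
#   import pandas as pd
#   import requests
#
#   with requests.Session() as session:
#       buffer = fetch_report(session, session_url, file_url, payload_from_env(), headers)
#   xls = pd.ExcelFile(buffer)
```

Passing the session in as an argument keeps the function easy to exercise with a stand-in object, without touching the network.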