Tags: python, python-3.x, web-scraping, python-requests, .aspxauth

Logging into website and scraping data


The website I am trying to log in to is https://realitysportsonline.com/RSOLanding.aspx. I can't get the login to work, because the process differs from that of a typical site with a dedicated login page. I don't get any errors, but the login doesn't take effect, so the request for the main URL is redirected to the homepage.

import requests
url = "https://realitysportsonline.com/RSOLanding.aspx"
main = "https://realitysportsonline.com/SetLineup_Contracts.aspx?leagueId=3000&viewingTeam=1"
data = {"username": "", "password": "", "vc_btn3 vc_btn3-size-md vc_btn3-shape-rounded vc_btn3-style-3d vc_btn3-color-danger" : "Log In"}
header = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36',
          'Referer':  'https://realitysportsonline.com/RSOLanding.aspx', 
          'Host':  'realitysportsonline.com',
          'Connection':   'keep-alive',
          'Accept-Language':    'en-US,en;q=0.5',
          'Accept-Encoding':    'gzip, deflate, br',
          'Accept':  'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8'}

s = requests.session()
s.get(url)
r = s.post(url, data, headers=header)

page = requests.get(main)

Solution

  • First of all, you create a session and then, assuming your POST request worked, request an authorised page without using that session.

    You need to make the request with the s object you created, like so: page = s.get(main). A plain requests.get(main) is a fresh request that carries none of the session's cookies.

    There were also a couple of issues with your POST request: it was sent to the landing page instead of the /Services/AccountService.svc/Login route, and it was missing the Content-Type header.

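    import requests
    
    # Post the credentials as JSON to the Login service route, not the landing page
    url = "https://realitysportsonline.com/Services/AccountService.svc/Login"
    main = "https://realitysportsonline.com/LeagueSetup.aspx?create=true"
    payload = {"username": "", "password": ""}  # fill in your credentials
    headers = {
        'Content-Type': "text/json",
        'Cache-Control': "no-cache"
    }
    
    # Reuse one session so any cookies set by the login are sent with later requests
    s = requests.Session()
    response = s.post(url, json=payload, headers=headers)
    page = s.get(main)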
    

    PS: the main URL from your question redirects to the homepage even with a valid session (at least for me), which is why I request a different page above.
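
To tell whether a page request was actually served or bounced back to the homepage, one option is to inspect the final response. The following is only a minimal sketch, assuming the site signals an unauthorised request by redirecting to /RSOLanding.aspx, as the question describes. The helper name redirected_to_landing and the LANDING_PATH constant are hypothetical; the code relies only on the url and history attributes that requests responses expose.

```python
from urllib.parse import urlsplit

# Hypothetical helper, not part of requests. Assumes an unauthorised request
# is bounced to the landing page, as described in the question.
LANDING_PATH = "/RSOLanding.aspx"


def redirected_to_landing(response, landing_path=LANDING_PATH):
    """Return True if `response` followed a redirect and ended on the landing page."""
    final_path = urlsplit(response.url).path
    # response.history holds the intermediate redirect responses; it is empty
    # when the requested URL was served directly.
    return bool(response.history) and final_path.lower() == landing_path.lower()
```

With the session from the code above, redirected_to_landing(s.get(main)) returning True would suggest that either the login did not take effect or the page itself redirects, as noted in the PS.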