python linkedin-api python-requests

How can I use python-requests to fetch a LinkedIn page?


I am using the code below to fetch a LinkedIn page, but it does not seem to log me in: I only get the unauthenticated home page.

#!/usr/bin/env python3
import requests
from bs4 import BeautifulSoup


payload = {
'session-key': 'my account',
'session-password': 'my password'
}

URL = 'https://www.linkedin.com/uas/login'
s = requests.session()
s.post(URL, data=payload)

r = s.get('http://www.linkedin.com/nhome')
soup = BeautifulSoup(r.text)
print(soup)



Solution

  • This is much more complicated than what you've got so far.

    You will need to do something like:

    • Load https://www.linkedin.com/uas/login
    • Parse the response with BeautifulSoup to get the login form, including all of its hidden form fields. The CSRF fields are particularly important, as the server will reject a POST request without the correct values.
    • Build your POST data dictionary from the parsed login form data plus your username and password
    • POST that data to https://www.linkedin.com/uas/login-submit (you might have to fake some of the headers too, as it might only accept requests marked as AJAX)
    • Finally GET http://www.linkedin.com/nhome

    You can see this whole process by opening the developer tools in Chrome or Firefox and watching the Network tab while you log in.
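If the submit endpoint does turn out to require AJAX-style requests, the extra headers can be passed with the POST. The sketch below is an assumption rather than something confirmed for LinkedIn: `X-Requested-With: XMLHttpRequest` is only the conventional marker that many JavaScript libraries send, and `ajax_headers` is a hypothetical helper name. Compare it against the headers the Network tab shows for the real login request.

```python
def ajax_headers(referer):
    """Build headers that make a request look like a browser AJAX call.

    Assumption: the server inspects X-Requested-With. Check the real
    login request in the browser's Network tab to confirm which headers
    are actually sent.
    """
    return {
        'X-Requested-With': 'XMLHttpRequest',  # conventional AJAX marker
        'Referer': referer,                    # page the form was loaded from
    }


headers = ajax_headers('https://www.linkedin.com/uas/login')
print(headers)
```

With a `requests` session, the dictionary would be passed as `session.post(url, data=post, headers=headers)`.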

    Something like this should work:

    import requests
    from bs4 import BeautifulSoup

    # Get login form; the session keeps cookies between requests
    URL = 'https://www.linkedin.com/uas/login'
    session = requests.Session()
    login_response = session.get(URL)
    login = BeautifulSoup(login_response.text, 'html.parser')

    # Get hidden form inputs (these carry the CSRF values)
    inputs = login.find('form', {'name': 'login'}).find_all('input', {'type': ['hidden', 'submit']})

    # Create POST data, skipping inputs that have no name attribute
    post = {field.get('name'): field.get('value') for field in inputs if field.get('name')}
    post['session_key'] = 'username'
    post['session_password'] = 'password'

    # Post login
    post_response = session.post('https://www.linkedin.com/uas/login-submit', data=post)

    # Get home page
    home_response = session.get('http://www.linkedin.com/nhome')
    home = BeautifulSoup(home_response.text, 'html.parser')
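
To check the form-parsing step without hitting LinkedIn, the same idea can be run against a static HTML string. This sketch uses the standard library's `html.parser` in place of BeautifulSoup so that it has no dependencies. The sample form, its `csrf_token` and `signin` fields, and the `HiddenFieldParser` class are all made up for illustration and need not match LinkedIn's real markup.

```python
from html.parser import HTMLParser


class HiddenFieldParser(HTMLParser):
    """Collect name/value pairs of hidden and submit inputs in one named form."""

    def __init__(self, form_name):
        super().__init__()
        self.form_name = form_name
        self.in_form = False
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == 'form' and attrs.get('name') == self.form_name:
            self.in_form = True
        elif tag == 'input' and self.in_form:
            # Keep only hidden/submit inputs that have a name attribute
            if attrs.get('type') in ('hidden', 'submit') and attrs.get('name'):
                self.fields[attrs['name']] = attrs.get('value', '')

    def handle_endtag(self, tag):
        if tag == 'form':
            self.in_form = False


# Hypothetical markup standing in for the real login page
SAMPLE_HTML = """
<form name="login" action="/login-submit" method="post">
  <input type="hidden" name="csrf_token" value="abc123">
  <input type="text" name="session_key">
  <input type="password" name="session_password">
  <input type="submit" name="signin" value="Sign In">
</form>
<form name="other"><input type="hidden" name="ignored" value="x"></form>
"""

parser = HiddenFieldParser('login')
parser.feed(SAMPLE_HTML)

# Same construction as above: parsed fields plus the credentials
post = dict(parser.fields)
post['session_key'] = 'username'
post['session_password'] = 'password'
print(post)
```

Printing `post` before sending it makes it easy to compare the field names against the form data shown in the browser's Network tab.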