I need to scrape website that requires login. I'm trying to create a session
and login as I have to scrape different pages after logging in. But can't find out why it's not working.
import requests
from bs4 import BeautifulSoup
login_data = {
"log":"login",
"login":"my email",
"password":"my password"
}
session = requests.session()
session.post(login_url, data=login_data)
response = session.get(url)
html = response.text
soup = BeautifulSoup(html, "html.parser")
print(soup.title.get_text())
Title shows it's not working.
Here is the website form.
<form method="post" id="signin-form" class="form-horizontal">
<input type="hidden" name="referer" value="" />
<div class="form-group">
<label for="email_text" class="col-sm-4 control-label">Your login (email):</label>
<div class="col-sm-8">
<input type="email" class="form-control" id="email_text" value="" name="login" autofocus data-validation='{"parent":".form-group","events":["keyup","blur"],"rules":[{"name":"notblank"},{"name":"email"}]}' />
</div>
</div>
<div class="form-group">
<label for="password_text" class="col-sm-4 control-label">Password:</label>
<div class="col-sm-8">
<input type="password" class="form-control" id="password_text" name="password" data-validation='{"parent":".form-group","rules":[{"name":"min","min":5}]}' />
</div>
</div>
<div class="form-group">
<div class="col-sm-8 col-sm-offset-4">
<div class="checkbox">
<label>
<input type="checkbox" name="rememberme"> Remember me on this computer
</label>
</div>
</div>
</div>
<div class="form-group">
<div class="col-sm-offset-4 col-sm-8">
<button type="submit" class="btn btn-default btn-lg" name="log">Log into your account</button>
<a class="btn btn-default btn-lg mobile-show-inline-block" href="/account/create/">Create account</a>
<a href="/account/lostpassword" class="btn btn-link btn-lg">Forgot your password?</a>
</div>
</div>
</form>
N.B: Don't suggest me to use selenium
. I can do this with selenium
and I tested that but I have to stick to requests
because selenium
pops up console even if I use PhantomJS
.
Try doing a get on the login page first. Perhaps it's setting some cookies that it expects to be present on the post.