web-crawler scrapy scrapy-shell

Can't log in using Scrapy


<div class="col small-w100 tiny-w100 col1">
<div class="box_already_member">
    <h2 class="fs22 fwn foro black">Already member ?</h2>
    <p>Please enter your account details : </p>
    <div class="box_form">
    <label>Your email* </label>
    <br>
    <input id="txtUsernamelogin" type="text" data-parsley-group="glogin" 
    data-parsley-required="true" data-parsley-errors-container="#lblMessage" 
    data-parsley-type-message="Please check that your Professional Email is 
    in correct format" data-parsley-required-message="Please type your 
    Professional Email" data-parsley-type="email"><br>
    <label>Password* </label>
    <br>
    <input id="txtPasswordlogin" type="password" data-parsley-group="glogin" 
    data-parsley-required="true" data-parsley-errors-container="#lblMessage" 
    data-parsley-required-message="Please type your password"><br>
    <div class="row pt20 pb20">
        <div class="col "><a class="c19" href="/forgot-password" 
         rel="nofollow">Forgot password ?</a></div>
            <div class="col txtright">
               <div class="inbl">
                   <a href="#" id="loginbtnclick" class=" row  wauto  fs14 
                    c0 bgc18 rounded5 txtcenter h36p vam tdn mb20">
                   <span class="col vam fs16 pr40 pl40"> 
                   <strong>LOGIN</strong></span>
                   </a>
               </div>
            </div>
        </div>
  </div>
<span>*mandatory fields</span><br>
<span id="lblMessage" class="red"></span>
</div>


I tried using scrapy.FormRequest.from_response(), but it doesn't seem to work.

I need to log in to get full access to the product details. Login page: https://cosmetics.specialchem.com/login
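One possible reason from_response() fails, judging only from the HTML fragment above: neither input has a name attribute, and a form control without a name is not included in a form submission, so there is nothing for from_response() to fill in. The sketch below uses only the standard library to check a trimmed copy of the fragment. The InputCollector class and the LOGIN_HTML constant are hypothetical helpers written for this illustration, and the full page may still wrap the fragment in a form that is not shown here.

```python
from html.parser import HTMLParser


class InputCollector(HTMLParser):
    """Collect the attributes of every <input> tag and count <form> tags."""

    def __init__(self):
        super().__init__()
        self.inputs = []
        self.form_count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "input":
            self.inputs.append(dict(attrs))
        elif tag == "form":
            self.form_count += 1


# Trimmed copy of the two login inputs from the question's HTML
# (an assumption: the rest of the page is not available here).
LOGIN_HTML = """
<div class="box_form">
  <input id="txtUsernamelogin" type="text" data-parsley-type="email">
  <input id="txtPasswordlogin" type="password" data-parsley-required="true">
  <a href="#" id="loginbtnclick"><strong>LOGIN</strong></a>
</div>
"""

collector = InputCollector()
collector.feed(LOGIN_HTML)

# Inputs without a name attribute are not part of a submitted form.
unnamed = [i["id"] for i in collector.inputs if "name" not in i]
print(unnamed)               # ids of the inputs that have no name
print(collector.form_count)  # number of <form> tags in the fragment
```

If both ids are printed, the login is most likely submitted by JavaScript attached to the LOGIN link rather than by a regular form post.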


Solution

  • Here is an example of how to log in to the target site. First, open the browser's developer tools and inspect all the data that is sent to the server when you log in. Once you understand how the login works, you can write your own code.

    import scrapy
    from scrapy.exceptions import CloseSpider
    
    
    # A plain Spider is used because CrawlSpider reserves parse() for its rules.
    class SpecialchemSpider(scrapy.Spider):
        name = 'specialchem'
        allowed_domains = ['<DOMAIN>']
        start_urls = ['https://cosmetics.<DOMAIN>/login']
        custom_settings = {'ROBOTSTXT_OBEY': False}
    
        def parse(self, response):
            # Fields the login service expects, as seen in the browser's
            # network tab when submitting the login form.
            formdata = {'Iid': '',
                        'Password': 'SECRET',
                        'User': 'EMAIL',
                        'Popin': '1'}
    
            return scrapy.FormRequest(
                'https://cosmetics.<DOMAIN>/services/LoginService.ashx',
                formdata=formdata,
                callback=self.after_login
            )
    
        def after_login(self, response):
            if 'OK' not in response.text:
                raise CloseSpider('Wrong login or password, or you were blocked.')
    
            url = 'https://cosmetics.<DOMAIN>/product/i-eastman-chemical-company-eastman-aq-38s-polymer'
            return scrapy.Request(url, callback=self.product)
    
        def product(self, response):
            # Extract the product details here; this runs with the logged-in session.
            pass
    

    It will work once you replace <DOMAIN>, SECRET and EMAIL with the correct values.
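To check the payload against what the browser sends, it can help to print it in URL-encoded form. The sketch below uses only the standard library and the same placeholder values as the spider above. It assumes the login service accepts an ordinary URL-encoded POST body, which is what FormRequest produces from formdata; the exact fields the service requires can only be confirmed in the browser's network tab.

```python
from urllib.parse import urlencode

# Same placeholder payload as in the spider above; replace SECRET and EMAIL.
formdata = {
    "Iid": "",
    "Password": "SECRET",
    "User": "EMAIL",
    "Popin": "1",
}

# Build the URL-encoded string so it can be compared with the request
# body shown in the browser's developer tools.
body = urlencode(formdata)
print(body)  # Iid=&Password=SECRET&User=EMAIL&Popin=1
```

If the string differs from the browser's request (a missing field, a different field name), adjust formdata before debugging anything else.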