
How can I extract the amount written after the 'span' tag under the 'div' tag using css-selectors in Scrapy?


I want to get the amount (182.78) that appears after the span tag inside the div tag, but I am only getting the content of the "MRP" element as a string. I only want to extract the amount. By the way, I'm using IPython as my shell.

HTML code:

<div class="style__font-bold___1k9Dl style__font-14px___YZZrf style__flex-row___2AKyf style__space-between___2mbvn style__padding-bottom-5px___2NrDR">

<div>Augmentin 625 Duo Tablet</div>

<div>
<span class="style__font-normal___2gZqF style__margin-left-8px___3Sw1d">MRP</span>

₹<!-- -->
182.78

</div>
</div>

Code I've used:

med.css('div span ::text').get()

The output of my code is: 'MRP'


Solution

  • Your current selector 'div span ::text' asks for the text content of a span element that is a descendant of a div element. However, the text you are trying to extract is not inside the span element; it is a text node of the enclosing div, placed after the span.

    <div ...>
       <span ...>MRP</span>
       ...
       <!-- -->
       "182.78"
    </div>
    

    To extract this with a CSS selector, you can use the :has() pseudo-class to select the div that has a span as a direct child, and then take that div's own text nodes:

    response.css('div:has(> span)::text')
    

    Because the div's text is split into several text nodes by the span tag and the comment, you will want to use the getall() method and take the last item of the returned list.

    For example:

    >>> response.css('div:has(> span)::text').getall()[-1]
    '182.78'
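
Depending on how the page is served, the last text node may carry surrounding whitespace, or the amount may share a node with the currency symbol, so taking index -1 directly can be fragile. The following is a minimal sketch of a more tolerant post-processing step, not part of Scrapy itself: the helper name `last_amount` and the `sample_nodes` list are hypothetical, and the sample is only assumed to resemble what getall() returns for the HTML in the question.

```python
import re
from decimal import Decimal

# Matches a plain decimal amount such as "182.78" or "1,299.00".
_AMOUNT_RE = re.compile(r"\d[\d,]*(?:\.\d+)?")


def last_amount(text_nodes):
    """Return the last numeric amount found in a list of text nodes, or None."""
    # Walk backwards so the node closest to the end of the div wins.
    for node in reversed(text_nodes):
        match = _AMOUNT_RE.search(node)
        if match:
            # Drop thousands separators before converting.
            return Decimal(match.group().replace(",", ""))
    return None


# Hypothetical sample shaped like the div's text nodes in the question's HTML.
sample_nodes = ["\n", "\n\n₹", "\n182.78\n\n"]
print(last_amount(sample_nodes))
```

In the Scrapy shell this could be called as last_amount(response.css('div:has(> span)::text').getall()). Returning a Decimal rather than a float is a design choice that avoids rounding surprises with prices.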