python-3.x beautifulsoup urllib

Python: Page navigator maximum value scraper only outputs the last value


This is the program I wrote to extract the maximum page number from the page navigation of each category in a list. I am unable to fetch all the values; I only get the value for the last link in the list. What changes do I need to make to get the output for every link?

import bs4
from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

# List of link suffixes appended to the base URL

links = ['Link_1/','Link_2/','Link_3/']
# Function to find the biggest number present in the page navigation section.
# The element just before 'Next →' holds the upper limit.

def page_no():
    bs = soup(page_html, "html.parser")
    max_page = bs.find('a',{'class':'next page-numbers'}).findPrevious().text
    print(max_page)

#url loop
for url in links:
    my_urls ='http://example.com/category/{}/'.format(url)

# opening up connection,grabbing the page
uClient = uReq(my_urls)
page_html = uClient.read()
uClient.close()
page_no()

Page Navigator Example: 1 2 3 … 15 Next →

Thanks in Advance


Solution

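  • You need to pass page_html into the function as a parameter and indent the last four lines so that they run inside the for loop. As written, they run only once, after the loop has finished, when my_urls holds the last URL. It would also be better to return the max_page value so you can use it outside the function.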

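    from urllib.request import urlopen as uReq
    from bs4 import BeautifulSoup as soup

    # Same list as in the question
    links = ['Link_1/', 'Link_2/', 'Link_3/']

    def page_no(page_html):
        # Return the text of the element just before the 'Next →' link
        bs = soup(page_html, "html.parser")
        max_page = bs.find('a', {'class': 'next page-numbers'}).findPrevious().text
        return max_page

    # url loop
    for url in links:
        my_urls = 'http://example.com/category/{}/'.format(url)
        # opening up connection, grabbing the page
        uClient = uReq(my_urls)
        page_html = uClient.read()
        uClient.close()
        max_page = page_no(page_html)
        print(max_page)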
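
To see why the indentation matters, here is a minimal sketch that needs no network access. The function `fake_max_page` and the numbers in `FAKE_MAX_PAGES` are made up for illustration and stand in for "download the page and read its highest page number"; they are not part of the original program.

```python
# Hypothetical stand-in for fetching a page and reading its max page number.
# The values are invented purely for this illustration.
FAKE_MAX_PAGES = {'Link_1/': '15', 'Link_2/': '7', 'Link_3/': '22'}

def fake_max_page(link):
    return FAKE_MAX_PAGES[link]

links = ['Link_1/', 'Link_2/', 'Link_3/']

# Pattern from the question: only the assignment is inside the loop.
for link in links:
    current = link
# The loop has already finished here, so `current` is the last link only.
results_wrong = [fake_max_page(current)]

# Pattern from the solution: the work is indented into the loop body,
# so it runs once per link.
results_right = []
for link in links:
    results_right.append(fake_max_page(link))

print(results_wrong)  # ['22']
print(results_right)  # ['15', '7', '22']
```

Collecting the values in a list, as in the second loop, is one option once `page_no` returns its result instead of printing it.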