This is the program I wrote to extract the maximum page number from each category section in the list. I am unable to fetch all the values; I only get the value for the last item in the list. What changes do I need to make to get all the outputs?
import bs4
from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup
#List for extended links to the base url
links = ['Link_1/','Link_2/','Link_3/']
#Function to find the biggest number present in the page navigation
#section. The element just before 'Next →' holds the upper limit.
def page_no():
    bs = soup(page_html, "html.parser")
    max_page = bs.find('a',{'class':'next page-numbers'}).findPrevious().text
    print(max_page)
#url loop
for url in links:
    my_urls ='http://example.com/category/{}/'.format(url)
# opening up connection,grabbing the page
uClient = uReq(my_urls)
page_html = uClient.read()
uClient.close()
page_no()
Page Navigator Example:
1 2 3 … 15 Next →
Thanks in Advance
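You need to pass page_html into the function as an argument and indent the last 4 lines so that they run inside the for loop. As written, only the my_urls assignment is part of the loop body, so the page is fetched once, after the loop has finished, using the last link. It would also be better to return the max_page value so you can use it outside the function.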
def page_no(page_html):
    bs = soup(page_html, "html.parser")
    max_page = bs.find('a',{'class':'next page-numbers'}).findPrevious().text
    return max_page

#url loop
for url in links:
    my_urls='http://example.com/category/{}/'.format(url)
    # opening up connection,grabbing the page
    uClient = uReq(my_urls)
    page_html = uClient.read()
    uClient.close()
    max_page = page_no(page_html)
    print(max_page)
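One thing to watch for: if a category has only one page, there may be no 'Next →' link at all, and bs.find(...) would then return None and raise an AttributeError. Below is a minimal sketch of a more defensive version. The function name max_page_from_html, the SAMPLE_HTML markup, and the fallback value of 1 are my own assumptions for illustration, not taken from your site, so adjust them to the real page structure.

```python
from bs4 import BeautifulSoup

# Hypothetical markup modelled on the "1 2 3 … 15 Next →" navigator
# from the question; the real page may be structured differently.
SAMPLE_HTML = """
<div class="nav-links">
  <span class="page-numbers current">1</span>
  <a class="page-numbers" href="/page/2/">2</a>
  <a class="page-numbers" href="/page/3/">3</a>
  <span class="page-numbers dots">…</span>
  <a class="page-numbers" href="/page/15/">15</a>
  <a class="next page-numbers" href="/page/2/">Next →</a>
</div>
"""


def max_page_from_html(page_html):
    """Return the highest page number, or 1 if no usable 'Next' link exists."""
    bs = BeautifulSoup(page_html, "html.parser")
    next_link = bs.find("a", {"class": "next page-numbers"})
    if next_link is None:
        # Assumption: no 'Next' link means the category has a single page.
        return 1
    # find_previous is the current spelling of findPrevious; restricting it
    # to the page-numbers class skips any unrelated tags in between.
    prev = next_link.find_previous(class_="page-numbers")
    text = prev.get_text(strip=True) if prev is not None else ""
    # Fall back to 1 if the preceding element is not a plain number.
    return int(text) if text.isdigit() else 1


print(max_page_from_html(SAMPLE_HTML))  # 15
```

Because the function takes the HTML as an argument and returns an int, you can call it inside the url loop exactly like page_no(page_html) and collect the results in a list or dict instead of printing them.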