I get an error when I try to concatenate the base link with the part of the next link that I need to switch to. Here is the error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-172-c75cfd599dcf> in <module>
21 l.append(j['href'])
22
---> 23 url2 = 'https://krisha.kz/prodazha/kvartiry/petropavlovsk/' + ''.join(l[j])
24 driver.get(url2)
25
TypeError: list indices must be integers or slices, not Tag
The problem occurs in the following code:
response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')
links = soup.find_all('a', {'class': 'a-card__title'})
soup2 = BeautifulSoup(str(links), 'html.parser')
href = soup2.find_all('a', href=True)
l = []
for j in href:
    l.append(j['href'])

    url2 = 'https://krisha.kz/prodazha/kvartiry/petropavlovsk/' + ''.join(l[j])
    driver.get(url2)
My "l" is a list of hrefs (it was shown in a picture in the original post).
Because of this error I can't move on to the next page to scrape it. Which integer or slice do I need here?
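The error occurs because j is the Tag yielded by your loop, not an integer position, so l[j] is not a valid list index. Instead of indexing the list, use j from your loop directly: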
url2 = 'https://krisha.kz/prodazha/kvartiry/petropavlovsk/' +j['href'][1:]
I sliced off the leading slash of the href with [1:] to avoid a // in the URL.
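To see the error and the fix side by side without hitting the site, here is a minimal sketch. It assumes plain dicts as stand-ins for the BeautifulSoup Tag objects (both support j['href']), and the href values are made up for illustration.

```python
# Stand-in data: plain dicts play the role of the BeautifulSoup Tag objects.
# The href values below are hypothetical, not taken from the real page.
href = [{"href": "/a/show/1"}, {"href": "/a/show/2"}]
base = "https://krisha.kz/prodazha/kvartiry/petropavlovsk/"

l = []
errors = []
urls = []
for j in href:
    l.append(j["href"])
    try:
        l[j]  # j is the element itself, not a position in l
    except TypeError as exc:
        errors.append(str(exc))
    # Working version: read the href from j and drop its leading slash.
    urls.append(base + j["href"][1:])

print(errors[0])
print(urls)
```

The message printed first has the same shape as the one in the question, only with dict in place of Tag, because a list can only be indexed by an integer or a slice.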
You can also use the list, but then you have to enumerate in your loop so that you have an integer index:
for i, j in enumerate(href):
    l.append(j['href'])
    url2 = 'https://krisha.kz/prodazha/kvartiry/petropavlovsk/' + l[i][1:]
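An index is mostly useful when the position itself matters, for example to number the pages as you visit them. A small sketch of that idea, where build_urls is a hypothetical helper and the hrefs are made-up sample values:

```python
def build_urls(base, hrefs):
    """Return (position, url) pairs; positions start at 1."""
    # enumerate yields an integer counter alongside each href string.
    return [(i, base + h[1:]) for i, h in enumerate(hrefs, start=1)]


base = "https://krisha.kz/prodazha/kvartiry/petropavlovsk/"
sample_hrefs = ["/a/show/1", "/a/show/2", "/a/show/3"]  # hypothetical values

numbered = build_urls(base, sample_hrefs)
for position, url in numbered:
    print(position, url)
```

If the position is not needed, iterating over the items directly gives the same URLs with less bookkeeping.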
Example
from bs4 import BeautifulSoup
import requests

url = "https://krisha.kz/prodazha/kvartiry/petropavlovsk/"
response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')

# Collect the title links of the listing cards.
links = soup.find_all('a', {'class': 'a-card__title'})
soup2 = BeautifulSoup(str(links), 'html.parser')
href = soup2.find_all('a', href=True)

l = []
for j in href:
    l.append(j['href'])
    # j is the current Tag, so read its href directly and drop the leading slash.
    url2 = 'https://krisha.kz/prodazha/kvartiry/petropavlovsk/' + j['href'][1:]
    print(url2)
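One thing worth checking in the printed URLs: if the hrefs on the page are root-relative (they start with / and are meant to resolve against the site root), plain concatenation will produce a different address than a browser would. As an alternative sketch, urllib.parse.urljoin from the standard library resolves a link against a base the way a browser does. The hrefs below are made-up samples, not values taken from the site.

```python
from urllib.parse import urljoin

base = "https://krisha.kz/prodazha/kvartiry/petropavlovsk/"
# Hypothetical hrefs: one root-relative, one relative to the current path.
sample_hrefs = ["/a/show/1", "a/show/2"]

# urljoin replaces the whole path for a root-relative href and
# appends to the base path for a plain relative one.
joined = [urljoin(base, h) for h in sample_hrefs]
for u in joined:
    print(u)
```

Which of the two results is the right one depends on how the site builds its links, so compare the output with a link copied from the browser before scraping.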