I am faced with a real mystery:
VBA
.send "land_abk=sh&ger_name=Norderstedt&order_by=2&ger_id=X1526"
Python
headers = {'User-Agent': 'python-requests/2.24.0', 'Accept-Encoding':'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive','Referer': 'https://url'}
Clicking the link leads to the last child URL, which shows the details. I have tried really everything to get data from the third page (POST, GET, VBA, Python with a Referer header), without success. I only get a 200 status and the response headers, but not a single character of the page source, just an error without any description. The only way to open this third page without an error and with content is to click the link on the second page. This is a completely public website, so there is no reason to build in a Referer check or any other protection. So what is the problem, and how can I fix it?
Your headers should work fine, as long as you include the correct Referer. The problem may be in how you read the HTML from the response. This works for me:
Using urllib3
import urllib3
from bs4 import BeautifulSoup
URL = "https://www.zvg-portal.de/index.php?button=showZvg&zvg_id=755&land_abk=sh"
headers = {
"Referer": "https://www.zvg-portal.de/index.php?button=Suchen",
}
http = urllib3.PoolManager()
response = http.request("GET", URL, headers=headers)
html = response.data.decode("ISO-8859-1")
soup = BeautifulSoup(html, "lxml")
print(soup.select_one("tr td b").text)
# >> 0061 K 0012/ 2019
Using requests
import requests
URL = "https://www.zvg-portal.de/index.php?button=showZvg&zvg_id=755&land_abk=sh"
headers = {
"Referer": "https://www.zvg-portal.de/index.php?button=Suchen",
}
html = requests.get(URL, headers=headers).text
print("Versteigerung im Wege der Zwangsvollstreckung" in html)
# >> True
Using urllib2 (Python 2)
import urllib2
URL = "https://www.zvg-portal.de/index.php?button=showZvg&zvg_id=755&land_abk=sh"
req = urllib2.Request(URL)
req.add_header("Referer", "https://www.zvg-portal.de/index.php?button=Suchen")
html = urllib2.urlopen(req).read()
print("Versteigerung im Wege der Zwangsvollstreckung" in html)
# >> True
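For Python 3 without third-party packages, the urllib2 version above can be sketched with the standard library's urllib.request. This is only a sketch: the helper names build_request and fetch_html are made up for illustration, and the ISO-8859-1 decoding is an assumption carried over from the urllib3 example.

```python
import urllib.request

URL = "https://www.zvg-portal.de/index.php?button=showZvg&zvg_id=755&land_abk=sh"
REFERER = "https://www.zvg-portal.de/index.php?button=Suchen"


def build_request(url, referer):
    """Return a GET request object that carries the given Referer header.

    Hypothetical helper; nothing is sent over the network here.
    """
    return urllib.request.Request(url, headers={"Referer": referer})


def fetch_html(url, referer, encoding="ISO-8859-1", timeout=30):
    """Send the request and return the decoded response body.

    The encoding default is an assumption taken from the urllib3 example.
    """
    req = build_request(url, referer)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode(encoding)
```

Calling fetch_html(URL, REFERER) should then return the page source as a string, which can be checked the same way as in the examples above.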