python, web-scraping, beautifulsoup, findall, python-requests-html

soup.find_all returns an empty list regardless of what class I enter (Google Colab)


I'm aware this has been asked before, but I can't find any instance where it was done in Google Colab (rather than locally). I'm trying to scrape the area name and the associated latitude and longitude from an API's output using requests and BeautifulSoup. My code is below:

#Importing tools
import numpy as np
import pandas as pd

import requests
import string
from bs4 import BeautifulSoup

import os

#Getting the HTML elements from the URL
URL = "http://api.positionstack.com/v1/forward?access_key=4d197793636f1badcdc02c14da0f8da0&query=London&limit=1"
html = requests.get(URL)
soup = BeautifulSoup(html.content, 'html.parser')


#I went onto the website, inspected it and found that the latitudes, longitudes and place names are in the span.n elements
#I'm grabbing this from the website here and viewing it
soup_k = soup.find_all("span", class_="n")

soup_k

But it just outputs: []

I have also tried every other element I can find using inspect, and none of them return anything. I saw that the solutions to similar issues suggested that the elements were hidden behind JavaScript, but I don't think this is the case...

Any ideas on why it returns an empty list or help on scraping this page would be greatly appreciated! Thanks

Disclaimer: I'm new to coding, I've tried to make sure my terminology is correct and the question is asked in the right way but I'm still learning - any pointers in the right direction are always welcome


Solution

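  • This is not a website; it is an API that responds with JSON, not HTML. The span elements visible in the browser's inspector were most likely added by the browser's JSON viewer and are not part of the response itself, which is why find_all has nothing to match. BeautifulSoup is therefore not needed: just parse the JSON and pick out the attributes you want: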

    import requests
    URL = "http://api.positionstack.com/v1/forward?access_key=4d197793636f1badcdc02c14da0f8da0&query=London&limit=1"
    
    res = requests.get(URL).json()
    

    Output of res:

    {'data': [{'latitude': 51.509648, 'longitude': -0.099076, 'type': 'locality', 'name': 'London', 'number': None, 'postal_code': None, 'street': None, 'confidence': 1, 'region': 'Greater London', 'region_code': None, 'county': None, 'locality': 'London', 'administrative_area': None, 'neighbourhood': None, 'country': 'United Kingdom', 'country_code': 'GBR', 'continent': 'Europe', 'label': 'London, England, United Kingdom'}]}
    

    To access your attributes:

    lat = res['data'][0]['latitude']
    lng = res['data'][0]['longitude']
    region = res['data'][0]['region']
    
    print(lat,lng,region)
    

    Output:

    51.509648 -0.099076 Greater London
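
If you want to confirm what kind of response you are dealing with before choosing a parser, one option is to look at the Content-Type header. The following is a minimal sketch, not the only way to do it: the helper names is_json_response, extract_places and fetch_places are made up for illustration, and the sample dictionary is a trimmed copy of the output shown above. It assumes the API keeps returning its results as a list under the 'data' key.

```python
def is_json_response(content_type):
    """Return True if a Content-Type header value denotes JSON."""
    if not content_type:
        return False
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";")[0].strip().lower()
    return media_type == "application/json" or media_type.endswith("+json")


def extract_places(payload):
    """Collect (name, latitude, longitude) for each entry under 'data'."""
    places = []
    for entry in payload.get("data", []):
        places.append(
            (entry.get("name"), entry.get("latitude"), entry.get("longitude"))
        )
    return places


def fetch_places(url, timeout=10):
    """Request url and return the extracted places (hypothetical helper)."""
    # Imported here so the two helpers above can be used without requests.
    import requests

    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    content_type = response.headers.get("Content-Type")
    if not is_json_response(content_type):
        raise ValueError(f"Expected JSON but got {content_type!r}")
    return extract_places(response.json())


# Offline demonstration using a trimmed copy of the output shown above.
sample = {
    "data": [
        {
            "latitude": 51.509648,
            "longitude": -0.099076,
            "name": "London",
            "region": "Greater London",
        }
    ]
}
print(extract_places(sample))  # [('London', 51.509648, -0.099076)]
```

With a real request you would call fetch_places(URL) using the URL from the question. If the Content-Type check fails, the server sent something other than JSON (for example an HTML error page), and that is the case where an HTML parser such as BeautifulSoup would be worth trying.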