python web-scraping beautifulsoup python-requests google-colaboratory

Why am I getting neither output nor an error when web scraping?

I am doing a web scraping assignment on Google Colab with BeautifulSoup and requests. For now I am only scraping the headlines from Google News. Below is the code:

import requests
from bs4 import BeautifulSoup

def beautiful_soup(url):
    '''DEFINING THE FUNCTION HERE THAT SENDS A REQUEST AND PRETTIFIES THE TEXT 
    INTO SOMETHING THAT IS EASY TO READ'''

    request = requests.get(url)
    soup = BeautifulSoup(request.text, "lxml")
    print(soup.prettify())

beautiful_soup('https://news.google.com/?hl=en-IN&gl=IN&ceid=IN:en')

for headlines in soup.find_all('a', {'class': 'VDXfz'}):
   print(headlines.text)

The problem is that when I run the cell, it shows neither the output (a list of headlines) nor an error. Please help; this has been bugging me for two days.

Solution

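Your function prints the parsed page but never returns it, so it needs to return soup for the loop to use. The a elements with class VDXfz also probably contain no text of their own, which would explain the missing output, so you probably need to display the text from the next span element. This could be done as follows: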

import requests
from bs4 import BeautifulSoup

def beautiful_soup(url):
    '''DEFINING THE FUNCTION HERE THAT SENDS A REQUEST AND PRETTIFIES THE TEXT 
       INTO SOMETHING THAT IS EASY TO READ'''

    request = requests.get(url)
    soup = BeautifulSoup(request.text, "lxml")
    #print(soup.prettify())
    return soup

soup = beautiful_soup('https://news.google.com/?hl=en-IN&gl=IN&ceid=IN:en')

for headlines in soup.find_all('a', {'class': 'VDXfz'}):
    print(headlines.find_next('span').text)

This would give you output starting something like:

I Take Back My Comment, Says Ram Madhav After Omar Abdullah’s Dare to Prove Pakistan Charge
Ram Madhav Backpedals On "Instruction From Pak" After Omar Abdullah Dare
National Conference backed PDP to save J&K from uncertainty: Omar Abdullah
On Ram Madhav ‘instruction from Pak’ barb, Omar Abdullah’s stinging reply
Make public reports of horse-trading in govt formation in J-K: Omar Abdullah to Guv
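To see why find_next('span') helps, here is a minimal offline sketch. The HTML in SAMPLE_HTML is a made-up stand-in for the page structure (an assumption, not the real Google News markup), and extract_headlines is a hypothetical helper name. It uses the built-in "html.parser" so it runs without lxml installed. It shows that the matched a elements carry no text themselves, that the headline sits in the following span, and that a class which matches nothing gives an empty list rather than an error:

```python
from bs4 import BeautifulSoup

# Made-up markup (an assumption): an empty <a class="VDXfz"> overlay link,
# followed by a heading whose <span> holds the headline text.
SAMPLE_HTML = """
<article>
  <a class="VDXfz" href="./articles/1"></a>
  <h3><a href="./articles/1"><span>First headline</span></a></h3>
</article>
<article>
  <a class="VDXfz" href="./articles/2"></a>
  <h3><a href="./articles/2"><span>Second headline</span></a></h3>
</article>
"""


def extract_headlines(html, link_class="VDXfz"):
    """Return the text of the first <span> following each matching <a>."""
    soup = BeautifulSoup(html, "html.parser")
    headlines = []
    for link in soup.find_all("a", {"class": link_class}):
        span = link.find_next("span")
        if span is not None:  # skip links with no following <span>
            headlines.append(span.get_text(strip=True))
    return headlines


# Text of the matched <a> elements themselves: empty strings.
soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
empty_link_texts = [a.text for a in soup.find_all("a", {"class": "VDXfz"})]

# Text taken from the next <span> instead.
headlines = extract_headlines(SAMPLE_HTML)

# A class that matches nothing yields an empty list, so a for loop
# over it simply never runs and nothing is printed.
no_match = extract_headlines(SAMPLE_HTML, link_class="no-such-class")

print(empty_link_texts)
print(headlines)
print(no_match)
```

If the real page changes its class names, the last case is the one you would hit: no output and no error, so it is worth checking the length of the find_all result before looping.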

You could write the headlines to a CSV formatted file using the following approach:

import requests
from bs4 import BeautifulSoup
import csv

def beautiful_soup(url):
    '''DEFINING THE FUNCTION HERE THAT SENDS A REQUEST AND PRETTIFIES THE TEXT 
       INTO SOMETHING THAT IS EASY TO READ'''

    request = requests.get(url)
    soup = BeautifulSoup(request.text, "lxml")
    return soup

soup = beautiful_soup('https://news.google.com/?hl=en-IN&gl=IN&ceid=IN:en')

with open('output.csv', 'w', newline='', encoding='utf-8') as f_output:
    csv_output = csv.writer(f_output)
    csv_output.writerow(['Headline'])

    for headlines in soup.find_all('a', {'class': 'VDXfz'}):
        headline = headlines.find_next('span').text
        print(headline)
        csv_output.writerow([headline])

Currently this produces just a single column called Headline.
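If you later want more than one column, one possible extension is to store the link next to each headline. The sketch below is only an illustration under assumptions: the sample HTML is invented, the Link column and the helper names headline_rows and write_rows are hypothetical, and it writes to an in-memory buffer instead of output.csv so it can be run without network access:

```python
import csv
import io

from bs4 import BeautifulSoup

# Invented markup (an assumption), standing in for the downloaded page.
SAMPLE_HTML = """
<article>
  <a class="VDXfz" href="./articles/1"></a>
  <h3><a href="./articles/1"><span>First headline</span></a></h3>
</article>
<article>
  <a class="VDXfz" href="./articles/2"></a>
  <h3><a href="./articles/2"><span>Second headline</span></a></h3>
</article>
"""


def headline_rows(html):
    """Build [headline, link] rows from each <a class="VDXfz"> element."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for link in soup.find_all("a", {"class": "VDXfz"}):
        span = link.find_next("span")
        if span is None:
            continue  # no headline text found after this link
        rows.append([span.get_text(strip=True), link.get("href", "")])
    return rows


def write_rows(rows, f_output):
    """Write a header row plus the data rows to any file-like object."""
    csv_output = csv.writer(f_output)
    csv_output.writerow(["Headline", "Link"])
    csv_output.writerows(rows)


# In-memory buffer used here in place of open('output.csv', 'w', newline='', ...).
buffer = io.StringIO()
write_rows(headline_rows(SAMPLE_HTML), buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write a real file, pass the file object from open('output.csv', 'w', newline='', encoding='utf-8') to write_rows instead of the buffer.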