When I run my code from inside VS Code I get the error below. As I understand it, it is warning me about an &
sign, which does not exist in my code. Or is it something else?
& C:/Users/iwanh/AppData/Local/Programs/Python/Python310/python.exe "c:/Users/iwanh/AppData/Local/Programs/Giannis' Programs/bbc_news.py"
File "<stdin>", line 1
& C:/Users/iwanh/AppData/Local/Programs/Python/Python310/python.exe"c:/Users/iwanh/AppData/Local/Programs/Giannis' Programs/bbc_news.py"
^
SyntaxError: invalid syntax
But if I run the .py file from cmd using python.exe bbc_news.py
it runs as expected. I am currently running Python 3.10.8, and VS Code is using the same version.
For reference, this is my code:
import requests
from bs4 import BeautifulSoup
import pandas as pd


def scrape_bbc_news(url):
    try:
        # Send a GET request to the URL
        response = requests.get(url)
        response.raise_for_status()  # Raise an exception for bad response status

        # Parse the HTML content
        soup = BeautifulSoup(response.content, 'html.parser')

        # Initialize lists to store headlines, descriptions, and links
        headlines = []
        descriptions = []
        links = []

        # Find all containers that hold a headline, a description, and a link
        article_containers = []

        # First type of container
        containers_type1 = soup.find_all('div', class_='sc-f98732b0-3 ephYtw')
        article_containers.extend(containers_type1)

        for container in article_containers:
            # Check if the article contains a "live" icon
            live_icon = container.find('svg', class_='sc-3387039d-0 hgdstu sc-1097f7fe-0 jmthjj')
            if live_icon:
                continue  # Skip this article if it contains a live icon

            # Extract headline
            headline = container.find('h2', class_='sc-4fedabc7-3 dsoipF').text.strip()
            headlines.append(headline)

            # Extract description
            description = container.find('p', class_='sc-f98732b0-0 iQbkqW').text.strip()
            descriptions.append(description)

            # Extract link (test the same <a> element that the href is read from)
            link_tag = container.find('a', class_='sc-2e6baa30-0 gILusN')
            link = link_tag['href'] if link_tag else ''
            links.append(link)

        # Check if lengths of headlines, descriptions, and links match
        if len(headlines) != len(descriptions) or len(headlines) != len(links):
            raise ValueError("Number of headlines, descriptions, and links do not match")

        # Create a DataFrame from the extracted data
        df = pd.DataFrame({
            'headline': headlines,
            'description': descriptions,
            'link': links
        })
        return df

    except requests.exceptions.RequestException as e:
        print(f"Error fetching data: {e}")
        return None
    except ValueError as ve:
        print(f"ValueError: {ve}")
        return None


# Example usage:
url = "https://www.bbc.com/news"
df = scrape_bbc_news(url)
if df is not None:
    print("Headlines, descriptions, and links scraped successfully:")
    print(df.head())
    df.to_csv('bbc_news_headlines.csv', index=False, encoding='utf-8')
    print("Data saved to 'bbc_news_headlines.csv' successfully.")
else:
    print("Failed to scrape data.")
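The SyntaxError you're seeing comes from Python, not from PowerShell. The line starting with & is a PowerShell command, and the File "<stdin>" part of the message shows that it was fed to a Python interactive prompt instead of a shell.
If you're typing the command manually, do it in the terminal window and make sure the current prompt begins with PS
(PowerShell) rather than >>>. If the terminal is sitting at a >>> prompt, leave Python first with exit().
Running your code with the play button (top right) should open a PowerShell session and run that command there. For example, a simple hello program should show: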
PS C:\Pax> & C:/Pax/Python312/python.exe "c:/Pax/temp.py"
hello
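
To see for yourself that the complaint comes from Python's parser and has nothing to do with your script, here is a minimal sketch that hands a PowerShell-style command line to Python's built-in compile(). The helper name is_valid_python and the shortened paths are my own placeholders, not anything VS Code produces.

```python
def is_valid_python(line: str) -> bool:
    """Return True if `line` compiles as Python source, False on SyntaxError."""
    try:
        compile(line, "<stdin>", "exec")
    except SyntaxError:
        return False
    return True


# Hypothetical, shortened version of the command VS Code sends to the terminal.
powershell_command = '& C:/Python310/python.exe "c:/temp/bbc_news.py"'

print(is_valid_python(powershell_command))  # False: a leading & is not valid Python
print(is_valid_python('print("hello")'))    # True: ordinary Python compiles fine
```

The first check fails for the same reason your terminal shows the error: the text was parsed as Python code rather than run by PowerShell.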