I need to scrape data quickly (within 5-10 seconds), but my code takes 60-80 seconds.
If you have a better and faster idea, please let me know. It does not matter whether it uses Selenium or even another language.
Here is my code:
from selenium import webdriver
from selenium.webdriver.common.by import By
import pandas as pd
import time
start = time.time()
driver = webdriver.Chrome()
driver.get("https://www.investing.com/economic-calendar/")
# Scrape the data
events = []
# Locate the rows in the table
rows = driver.find_elements(By.XPATH, '/html/body/div[6]/section/div[6]/table/tbody/tr[14]')
for row in rows:
    try:
        actual = row.find_element(By.XPATH, './td[5]').text
        previous = row.find_element(By.XPATH, './td[7]').text
        events.append([actual, previous])
    except Exception as e:
        print(f"Error processing row: {e}")
driver.quit()
df = pd.DataFrame(events, columns=[ 'Actual', 'Previous'])
df.head()
df.to_csv('economic_calendar.csv', index=False)
end = time.time()
print(end - start)
To get the data quickly, don't use Selenium, which has to start a whole browser. Fetch the page with requests and parse it with BeautifulSoup instead:
import pandas as pd
import requests
from bs4 import BeautifulSoup
url = "https://www.investing.com/economic-calendar/"
soup = BeautifulSoup(requests.get(url).content, "html.parser")
table = soup.select_one("#economicCalendarData")
day = soup.select_one(".theDay").text
all_data = []
for row in table.select("tbody tr"):
    tds = row.select("td")
    if len(tds) != 8:
        continue
    all_data.append(
        {
            "Day": day,
            "Time": tds[0].get_text(strip=True),
            "Cur": tds[1].get_text(strip=True),
            "Imp.": "*" * len(tds[2].select(".grayFullBullishIcon")),
            "Event": tds[3].get_text(strip=True),
            "Actual": tds[4].get_text(strip=True),
            "Forecast": tds[5].get_text(strip=True),
            "Previous": tds[6].get_text(strip=True),
        }
    )
df = pd.DataFrame(all_data)
print(df)
Prints:
Day Time Cur Imp. Event Actual Forecast Previous
0 Tuesday, July 9, 2024 02:00 JPY * Machine Tool Orders (YoY) 9.7% 4.2%
1 Tuesday, July 9, 2024 04:20 EUR ** German Buba Mauderer Speaks
2 Tuesday, July 9, 2024 05:40 EUR * Spanish 3-Month Letras Auction 3.293% 3.374%
3 Tuesday, July 9, 2024 06:00 USD * NFIB Small Business Optimism (Jun) 91.5 90.3 90.5
4 Tuesday, July 9, 2024 06:00 EUR ** Eurogroup Meetings
5 Tuesday, July 9, 2024 07:00 GBP * BoE Quarterly Bulletin
6 Tuesday, July 9, 2024 08:55 USD * Redbook (YoY) 6.3% 5.8%
7 Tuesday, July 9, 2024 09:15 USD ** Fed Vice Chair for Supervision Barr Speaks
8 Tuesday, July 9, 2024 10:00 USD *** Fed Chair Powell Testifies
9 Tuesday, July 9, 2024 10:00 USD ** Treasury Secretary Yellen Speaks
10 Tuesday, July 9, 2024 11:30 USD * 52-Week Bill Auction 4.775% 4.915%
11 Tuesday, July 9, 2024 12:00 USD ** EIA Short-Term Energy Outlook
12 Tuesday, July 9, 2024 13:00 USD ** 3-Year Note Auction 4.399% 4.659%
13 Tuesday, July 9, 2024 13:30 USD ** FOMC Member Bowman Speaks
14 Tuesday, July 9, 2024 16:30 USD ** API Weekly Crude Oil Stock -1.923M -0.250M -9.163M
15 Tuesday, July 9, 2024 18:45 NZD * External Migration & Visitors (May) 12.10% 1.70%
16 Tuesday, July 9, 2024 18:45 NZD * Permanent/Long-Term Migration (May) 1,410 5,110
17 Tuesday, July 9, 2024 18:45 NZD * Visitor Arrivals (MoM) (May) 4.0% -9.4%
18 Tuesday, July 9, 2024 19:00 KRW * Unemployment Rate (Jun) 2.8% 2.8%
19 Tuesday, July 9, 2024 19:50 JPY * PPI (MoM) (Jun) 0.2% 0.4% 0.7%
20 Tuesday, July 9, 2024 19:50 JPY * PPI (YoY) (Jun) 2.9% 2.9% 2.6%
21 Tuesday, July 9, 2024 21:30 AUD ** Building Approvals (MoM) (May) 5.5% -0.3%
22 Tuesday, July 9, 2024 21:30 AUD * Private House Approvals (May) 2.1% -1.6%
23 Tuesday, July 9, 2024 21:30 CNY ** CPI (YoY) (Jun) 0.4% 0.3%
24 Tuesday, July 9, 2024 21:30 CNY ** CPI (MoM) (Jun) -0.1% -0.1%
25 Tuesday, July 9, 2024 21:30 CNY ** PPI (YoY) (Jun) -0.8% -1.4%
26 Tuesday, July 9, 2024 22:00 NZD *** RBNZ Interest Rate Decision 5.50% 5.50%
27 Tuesday, July 9, 2024 22:00 NZD ** RBNZ Rate Statement
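If you want to check the row filter without hitting the site (or without installing anything), here is a rough sketch of the same extraction using only the standard library's html.parser in place of BeautifulSoup. The names CalendarRows, parse_calendar, COLUMNS and SAMPLE are made up for this example, and the SAMPLE markup (including the grayEmptyBullishIcon class and the one-cell day row) is only my assumption of what the table roughly looks like, so treat it as an illustration rather than the site's real HTML. It keeps the same rule as the loop above: only rows with exactly 8 cells count as events, and each .grayFullBullishIcon becomes one star.

```python
from html.parser import HTMLParser

# Column names for the first 7 cells of an event row (the 8th cell is dropped).
COLUMNS = ["Time", "Cur", "Imp.", "Event", "Actual", "Forecast", "Previous"]


class CalendarRows(HTMLParser):
    """Collect <td> text per <tr>; each full importance icon becomes one '*'."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []
        elif self._cell is not None:
            # A tag nested inside a cell: count it if it is a full icon.
            classes = (dict(attrs).get("class") or "").split()
            if "grayFullBullishIcon" in classes:
                self._cell.append("*")

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data.strip())

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_calendar(html):
    """Return one dict per event row found in the given HTML string."""
    parser = CalendarRows()
    parser.feed(html)
    parser.close()
    # Same filter as above: event rows have exactly 8 cells.
    return [dict(zip(COLUMNS, cells)) for cells in parser.rows if len(cells) == 8]


# Hypothetical markup, only to exercise the parser offline.
SAMPLE = """
<table id="economicCalendarData"><tbody>
<tr><td colspan="9">Tuesday, July 9, 2024</td></tr>
<tr>
  <td>06:00</td><td> USD</td>
  <td><i class="grayFullBullishIcon"></i><i class="grayEmptyBullishIcon"></i></td>
  <td><a href="#">NFIB Small Business Optimism (Jun)</a></td>
  <td>91.5</td><td>90.3</td><td>90.5</td><td></td>
</tr>
</tbody></table>
"""

if __name__ == "__main__":
    for event in parse_calendar(SAMPLE):
        print(event)
```

To use it on the live page you would pass the downloaded HTML text to parse_calendar instead of SAMPLE. It does not handle nested tables and does not add the Day column, so for real use I would still stick with the BeautifulSoup version.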