I am trying to add the stock name value for each iteration to it's corresponding data bulk upload to a pandas dataframe:
This is what I've tried so far:
from pandas_datareader import data as pdr
import requests
from bs4 import BeautifulSoup
import json, requests
import pandas as pd
import re
import numpy as np
import pandas_datareader.data as web
import yfinance as yfin
from tqdm import tqdm
import numpy as np
import datetime
from datetime import timedelta
################# fetch series names for sic ######################
sic_emisoras_df = pd.json_normalize(
json.loads(
requests.get('https://www.bmv.com.mx/es/Grupo_BMV/BmvJsonGeneric?idSitioPagina=6&mercado=CGEN_SCSOP&tipoValor=CGEN_CASEO&random=5845')
.text
.split(';(', 1)[-1]
.split(')')[0]
)['response']['resultado']
).dropna(axis=1, how='all')
####################################################################
# define time range:
start=datetime.date.today()-datetime.timedelta(days=14)
end=datetime.date.today()
# fetch data
# get all SIC names as list
stock_names = sic_emisoras_df["cveCorta"].values.tolist()
# append information per stock name
sic_market_df = pd.DataFrame([])
sic_market_df["stock_name"] = np.nan
for i in tqdm(stock_names):
# fetch data per stock_name
try:
yfin.pdr_override()
# append stock name
sic_market_df["stock_name"]=i
# fetch information by stock name
data = web.DataReader(i,start,end)
# append rows to empty dataframe
sic_market_df = sic_market_df.append(data)
except KeyError:
pass
print("Fetched sic_market_df!")
And the output only fetches the name for the first iteration but gets NaN for every other bulk upload :
stock_name Open High Low Close Adj Close Volume
2024-02-20 ZS 14.500000 14.950000 14.490000 14.700000 14.700000 30253100.0
2024-02-21 ZS 14.590000 14.860000 14.570000 14.790000 14.790000 23032400.0
2024-02-22 ZS 14.940000 15.280000 14.890000 15.240000 15.240000 35702500.0
2024-02-23 ZS 15.150000 15.290000 14.950000 15.130000 15.130000 22914900.0
2024-02-26 ZS 15.130000 15.480000 15.130000 15.280000 15.280000 23675800.0
I would like to get a dataframe that identifies each iteration bulk upload with it's unique stock name, i.e, something like this:
stock_name Open High Low Close Adj Close Volume
2024-02-20 ZS 14.500000 14.950000 14.490000 14.700000 14.700000 30253100.0
2024-02-21 ZS 14.590000 14.860000 14.570000 14.790000 14.790000 23032400.0
2024-02-22 ZS 14.940000 15.280000 14.890000 15.240000 15.240000 35702500.0
2024-02-23 ZS 15.150000 15.290000 14.950000 15.130000 15.130000 22914900.0
2024-02-26 ZS 15.130000 15.480000 15.130000 15.280000 15.280000 23675800.0
... ... ... ... ... ... ... ...
2024-02-20 AAPL 14.500000 14.950000 14.490000 14.700000 14.700000 30253100.0
2024-02-21 AAPL 14.590000 14.860000 14.570000 14.790000 14.790000 23032400.0
2024-02-22 AAPL 14.940000 15.280000 14.890000 15.240000 15.240000 35702500.0
2024-02-23 AAPL 15.150000 15.290000 14.950000 15.130000 15.130000 22914900.0
2024-02-26 AAPL 15.130000 15.480000 15.130000 15.280000 15.280000 23675800.0
package versions:
!pip show pandas. #1.5.3
!pip show beautifulsoup4 #4.12.3
!pip show pandas-datareader #0.10.0
Could you please assist on how to accomplish this?
Seems like you're adding a stock name in the wrong place and at the wrong time. This is the DataFrame
returned by web.DataReader
where you have to add the stock name either like data["stock_name"] = i
or with the DataFrame.insert method to place the new column on the left:
...
# fetch information by stock name
data = web.DataReader(i,start,end)
# insert the stock name to the left of the data
data.insert(0, "stock_name", i)
# append rows to empty dataframe
sic_market_df = sic_market_df.append(data)
...
Also, it looks reasonable to use pandas.concat instead of the deprecated append method. For example:
def get_data(stock_names, start, end):
for stock in tqdm(stock_names):
try:
yfin.pdr_override()
df = web.DataReader(stock, start, end)
df.insert(0, "stock_name", stock)
yield df
except KeyError:
pass
sic_market_df = pd.concat(get_data(stock_names, start, end))