import requests
import pandas as pd
from bs4 import BeautifulSoup as bs
from io import StringIO
url = "https://www.tickertape.in/stocks/oil-and-natural-gas-corporation-ONGC"
r = requests.get(url=url)#,headers=headers)
soup = bs(r.content,'html5lib')
fin = soup.find_all(class_="financials-table-root")
for f in fin:
str_data = f.text
break
print(str_data)
df = pd.read_csv(StringIO(str_data))
print(df)
This is not giving me the desired result. I am not good with handing strings, please guide me a way to extract the values from the str_data, so that I can use it for further calculations.
Output of str_data
:
Financial YearFY 2016FY 2017FY 2018FY 2019FY 2020FY 2021FY 2022FY 2023FY 2024TTMTotal Revenue1,33,084.682,95,230.293,33,135.154,32,783.974,06,772.743,15,162.695,00,202.156,41,531.266,57,654.976,57,751.89Raw Materials36,666.931,75,376.732,02,298.512,75,523.612,69,975.372,05,913.122,25,616.932,66,120.045,41,837.995,42,659.83Power & Fuel Cost1,182.381,957.932,177.502,139.732,078.921,980.442,239.162,174.44Employee Cost9,230.1715,128.1614,970.7215,850.5015,531.2614,135.1215,235.7414,898.79Selling & Administrative Expenses35,735.3136,378.9034,331.5839,621.4429,357.6623,004.0836,347.6144,365.47Operating & Other expenses10,068.00860.7212,038.3915,262.6036,742.869,385.991,34,092.392,38,475.37EBITDA40,201.8965,527.8567,318.4584,386.0953,086.6760,743.9486,670.3275,497.151,15,816.981,15,092.06Depreciation/Amortization16,384.0620,219.2023,111.9123,703.7026,634.8825,538.4726,883.1624,557.0628,762.7428,750.79PBIT23,817.8345,308.6544,206.5460,682.3926,451.7935,205.4759,787.1650,940.0987,054.2486,341.27Interest & Other Items3,765.583,591.114,999.045,836.737,489.345,079.035,696.047,889.3610,194.1710,194.17PBT20,052.2541,717.5439,207.5054,845.6618,962.4530,126.4454,091.1243,050.7376,860.0776,147.10Taxes & Other Items7,177.0417,298.2917,101.5824,299.628,158.8513,822.058,568.997,610.2827,638.6927,459.24Net Income12,875.2124,419.2522,105.9230,546.0410,803.6016,304.3945,522.1335,440.4549,221.3848,687.86EPS10.0319.0317.2324.048.5912.9636.1928.1739.1338.70DPS5.677.556.607.005.003.6010.5011.2512.2510.25Payout ratio0.560.400.380.290.580.280.290.400.310.26
Output of print(df)
:
Empty DataFrame
Columns: [Financial YearFY 2016FY 2017FY 2018FY 2019FY 2020FY 2021FY 2022FY 2023FY 2024TTMTotal Revenue1, 33, 084.682, 95, 230.293, 33.1, 135.154, 32, 783.974, 06, 772.743, 15, 162.695, 00, 202.156, 41, 531.266, 57, 654.976, 57.1, 751.89Raw Materials36, 666.931, 75, 376.732, 02, 298.512, 75.1, 523.612, 69, 975.372, 05, 913.122, 25, 616.932, 66, 120.045, 41.1, 837.995, 42, 659.83Power & Fuel Cost1, 182.381, 957.932, 177.502, 139.732, 078.921, 980.442, 239.162, 174.44Employee Cost9, 230.1715, 128.1614, 970.7215, 850.5015, 531.2614, 135.1215, 235.7414, 898.79Selling & Administrative Expenses35, 735.3136, 378.9034, 331.5839, 621.4429, 357.6623, 004.0836, 347.6144, 365.47Operating & Other expenses10, 068.00860.7212, 038.3915, 262.6036, 742.869, 385.991, 34, 092.392, 38, 475.37EBITDA40, 201.8965, 527.8567, 318.4584, 386.0953, 086.6760, 743.9486, 670.3275, 497.151, 15.1, 816.981, 15.2, 092.06Depreciation/Amortization16, 384.0620, 219.2023, 111.9123, 703.7026, 634.8825, 538.4726, 883.1624, 557.0628, 762.7428, 750.79PBIT23, 817.8345, 308.6544, 206.5460, 682.3926, 451.7935, ...]
Index: []
Trying to make a dataframe to use the values.
You can use read_html
with fixing columns names by remove Unnamed
columns names and filter by length of them:
url = "https://www.tickertape.in/stocks/oil-and-natural-gas-corporation-ONGC"
df = pd.read_html(url)[2]
cols = [c for c in df.columns if not c.startswith('Unnamed')]
out = df.iloc[:, :len(cols)].set_axis(cols, axis=1)
print (out)
Financial Year FY 2016 FY 2017 FY 2018 FY 2019 FY 2020 FY 2021 FY 2022 FY 2023 FY 2024 TTM
0 Total Revenue 133084.68 295230.29 333135.15 432783.97 406772.74 315162.69 500202.15 641531.26 657654.97 657751.89
1 Raw Materials 36666.93 175376.73 202298.51 275523.61 269975.37 205913.12 225616.93 266120.04 541837.99 542659.83
2 Power & Fuel Cost 1182.38 1957.93 2177.50 2139.73 2078.92 1980.44 2239.16 2174.44 541837.99 542659.83
3 Employee Cost 9230.17 15128.16 14970.72 15850.50 15531.26 14135.12 15235.74 14898.79 541837.99 542659.83
4 Selling & Administrative Expenses 35735.31 36378.90 34331.58 39621.44 29357.66 23004.08 36347.61 44365.47 541837.99 542659.83
5 Operating & Other expenses 10068.00 860.72 12038.39 15262.60 36742.86 9385.99 134092.39 238475.37 541837.99 542659.83
6 EBITDA 40201.89 65527.85 67318.45 84386.09 53086.67 60743.94 86670.32 75497.15 115816.98 115092.06
7 Depreciation/Amortization 16384.06 20219.20 23111.91 23703.70 26634.88 25538.47 26883.16 24557.06 28762.74 28750.79
8 PBIT 23817.83 45308.65 44206.54 60682.39 26451.79 35205.47 59787.16 50940.09 87054.24 86341.27
9 Interest & Other Items 3765.58 3591.11 4999.04 5836.73 7489.34 5079.03 5696.04 7889.36 10194.17 10194.17
10 PBT 20052.25 41717.54 39207.50 54845.66 18962.45 30126.44 54091.12 43050.73 76860.07 76147.10
11 Taxes & Other Items 7177.04 17298.29 17101.58 24299.62 8158.85 13822.05 8568.99 7610.28 27638.69 27459.24
12 Net Income 12875.21 24419.25 22105.92 30546.04 10803.60 16304.39 45522.13 35440.45 49221.38 48687.86
13 EPS 10.03 19.03 17.23 24.04 8.59 12.96 36.19 28.17 39.13 38.70
14 DPS 5.67 7.55 6.60 7.00 5.00 3.60 10.50 11.25 12.25 10.25
15 Payout ratio 0.56 0.40 0.38 0.29 0.58 0.28 0.29 0.40 0.31 0.26
For select by indicators is possible set first column to index:
df = pd.read_html(url, index_col=0)[2]
cols = [c for c in df.columns if not c.startswith('Unnamed')]
out = df.iloc[:, :len(cols)].set_axis(cols, axis=1)
print (out)
FY 2016 FY 2017 FY 2018 FY 2019 FY 2020 FY 2021 FY 2022 FY 2023 FY 2024 TTM
Financial Year
Total Revenue 133084.68 295230.29 333135.15 432783.97 406772.74 315162.69 500202.15 641531.26 657654.97 657751.89
Raw Materials 36666.93 175376.73 202298.51 275523.61 269975.37 205913.12 225616.93 266120.04 541837.99 542659.83
Power & Fuel Cost 1182.38 1957.93 2177.50 2139.73 2078.92 1980.44 2239.16 2174.44 541837.99 542659.83
Employee Cost 9230.17 15128.16 14970.72 15850.50 15531.26 14135.12 15235.74 14898.79 541837.99 542659.83
Selling & Administrative Expenses 35735.31 36378.90 34331.58 39621.44 29357.66 23004.08 36347.61 44365.47 541837.99 542659.83
Operating & Other expenses 10068.00 860.72 12038.39 15262.60 36742.86 9385.99 134092.39 238475.37 541837.99 542659.83
EBITDA 40201.89 65527.85 67318.45 84386.09 53086.67 60743.94 86670.32 75497.15 115816.98 115092.06
Depreciation/Amortization 16384.06 20219.20 23111.91 23703.70 26634.88 25538.47 26883.16 24557.06 28762.74 28750.79
PBIT 23817.83 45308.65 44206.54 60682.39 26451.79 35205.47 59787.16 50940.09 87054.24 86341.27
Interest & Other Items 3765.58 3591.11 4999.04 5836.73 7489.34 5079.03 5696.04 7889.36 10194.17 10194.17
PBT 20052.25 41717.54 39207.50 54845.66 18962.45 30126.44 54091.12 43050.73 76860.07 76147.10
Taxes & Other Items 7177.04 17298.29 17101.58 24299.62 8158.85 13822.05 8568.99 7610.28 27638.69 27459.24
Net Income 12875.21 24419.25 22105.92 30546.04 10803.60 16304.39 45522.13 35440.45 49221.38 48687.86
EPS 10.03 19.03 17.23 24.04 8.59 12.96 36.19 28.17 39.13 38.70
DPS 5.67 7.55 6.60 7.00 5.00 3.60 10.50 11.25 12.25 10.25
Payout ratio 0.56 0.40 0.38 0.29 0.58 0.28 0.29 0.40 0.31 0.26