Good evening all who happen to find this post!
I'm having a slight issue when it comes to dividing a stocks netIncome by its totalRevenue within a multiindex.
The code is as follows:
import yahoo_fin.stock_info as si
import pandas as pd
company = ['AMZN', 'FB']
inc = {}
for ticker in company:
inc[ticker] = si.get_income_statement(ticker)
df = pd.concat(inc)
pM = df.loc[(slice(None), ['netIncome', 'totalRevenue']), :]
pM
Now, all of this code should return me the company ticker, along with the net incomes and total revenue of the previous 4 years. Which is exactly what I need, and is all good so far...
I then want to take each company (ticker) in the index and divide the netIncome by totalRevenue (of the same company) and times that by 100 to get me a profit margin percentage...
If I use the code below it returns a load of NaNs within the rows...
test = pM.loc[(slice(None), 'netIncome'), :] / pM.loc[(slice(None), 'totalRevenue'), :] * 100
test
But if I manually type in the companies ticker like so:
pM.loc[('AMZN', 'netIncome'), :] / pM.loc[('AMZN', 'totalRevenue'), :] * 100
endDate
2019-12-31 4.13087
2018-12-31 4.32527
2017-12-31 1.70522
2016-12-31 1.74355
dtype: object
It returns what I'm looking for, with that company...but obviously if I'm chopping and changing or have multiple stock tickers within the 'company' list then you can see it would be a bit long winded...
These are all questions I've asked myself and cant find an answer, so my final solution is here on this forum!
Didn't really want to bother anyone with my questions, but I don't know what else to do :(
Thank you in advance, and hopefully everything I've written is clear, any issues let me know.
The reason it failed was because the indices of the two operands were not aligned:
>>> pM.loc[(slice(None), 'netIncome'), :]
endDate 2019-12-31 2018-12-31 2017-12-31 2016-12-31
Breakdown
AMZN netIncome 11588000000 10073000000 3033000000 2371000000
FB netIncome 18485000000 22112000000 15934000000 10217000000
>>> pM.loc[(slice(None), 'totalRevenue'), :]
endDate 2019-12-31 2018-12-31 2017-12-31 2016-12-31
Breakdown
AMZN totalRevenue 280522000000 232887000000 177866000000 135987000000
FB totalRevenue 70697000000 55838000000 40653000000 27638000000
pandas was looking for the label (AMZN, netIncome)
in the second frame and there isn't one so the result of the revision is N/A. Same with the other rows.
What you wanted was xs
:
margin = (
pM.xs((slice(None), 'netIncome'))
.div(pM.xs((slice(None), 'totalRevenue')))
.mul(100)
.assign(Breakdown='profitMargin')
.set_index('Breakdown', append=True)
)
pM = pM.append(margin).sort_index()
You can see how much work it takes just to add a metric to the data frame. That was because your pM
frame was inefficiently designed. Try this instead:
company = ['AMZN', 'FB']
inc = {}
for ticker in company:
inc[ticker] = si.get_income_statement(ticker).T # ← added a `.T` here
df = pd.concat(inc)
df['profitMargin'] = df['netIncome'] / df['totalRevenue'] * 100