I have a pandas data frame as follows after attempting to flatten it:
df = pd.DataFrame(web.DataReader(stocks, 'yahoo', day, day).iloc[0]).unstack(level=0).droplevel(level=0, axis=1)
Attributes adjClose close ... volume date
Symbols ...
FB 261.399994 261.399994 ... 13587000.0 2020-10-19
AAPL 115.980003 115.980003 ... 120639300.0 2020-10-19
AMZN 3207.209961 3207.209961 ... 5223600.0 2020-10-19
GOOG 1534.609985 1534.609985 ... 1607100.0 2020-10-19
NFLX NaN NaN ... NaN 2020-10-19
I'm trying to persist this to the database; however, I don't see Symbols in the df.columns. In order to save the df to a following format:
Symbols adjClose close ... volume date
FB 261.399994 261.399994 ... 13587000.0 2020-10-19
AAPL 115.980003 115.980003 ... 120639300.0 2020-10-19
AMZN 3207.209961 3207.209961 ... 5223600.0 2020-10-19
GOOG 1534.609985 1534.609985 ... 1607100.0 2020-10-19
NFLX NaN NaN ... NaN 2020-10-19
Any suggestions on how to achieve this? My database has a composite key on Symbols, date columns. Thank You.
Symbols is your dataframes index, you'll need to use reset_index
to put it into the frame itself. Try this:
df = (pd.DataFrame(web.DataReader(stocks, 'yahoo', day, day)
.iloc[0])
.unstack(level=0)
.droplevel(level=0, axis=1)
.rename_axis(columns=None) # Gets rid of the "Attributes"
.reset_index() # Puts "Symbols" as an actual column, not as the index
)
My 2 addition:
rename_axis
This should get rid of your "Attributes" title. This is mainly for visual purposes when printing, but can throw off people who aren't used to working with multiindex data. Essentially your column labels are stored in an Index
object. This Index
object can have a name, so "Attributes" is the name of your columns (pretty strange concept, that isn't super useful for normal Indexes- but has a lot of usefulness when working with a MultiIndex
).reset_index()
It seems that your "Symbols" column isn't actually a column (which is why it doesn't appear in df.columns
but rather the index for the dataframe. Adding this method will insert the "Symbols" index as a column into the dataframe, and create a new index that is a simple RangeIndex
that spans from 0 to the length of your dataframe.