Search code examples
pythondictionarydataframexlwingsquandl

how do I populate rows of of a DataFrame with values from a dictionary?


I am downloading metadata from quandl.com for financial data sets. The data from quandl.com is already in a dictionary format. I want to take this data from quandl.com and organize it into a DataFrame and then import it into Excel.

This is the text file('Indicator_list.txt') with a list of financial datasets that I am downloading from quandl.com. I want metadata on each of these symbols organized into a DataFrame.

COM/OIL_WTI
BOE/XUDLADS
BOE/XUDLADD
BOE/XUDLB8KL
BOE/XUDLCDS
BOE/XUDLCDD

This is the code I am running

import quandl
import pandas as pd

#This adjusts the layout in the command
#promt to have columns displayed side by side
pd.set_option('expand_frame_repr', False)

#This "with open" statment opens a text file that 
#has the symbols I want to get the metadata on 
with open ('Indicator_list.txt') as file_object:
    Current_indicators = file_object.read()
    tickers = Current_indicators.split('\n')

#quandlmetadata is a blank dictionary that I am
#appending the meatadata to
quandlmetadata={}

#this loops through all the values in 
#Indicator_list.txt"
for i in tickers:

    #metadata represents one set of metadata
    metadata = quandl.Dataset(i).data().meta

This is the output of metadata coming from quandl.com

{'start_date': datetime.date(1975, 1, 2), 'column_names': ['Date', 'Value'], 'limit': None, 'collapse': None, 'order': 'asc', 'end_date': datetime.date(2016, 11, 3), 'transform': None, 'column_index': None, 'frequency': 'daily'}

Next I add this to quandlmetadata dictionary and use the current symbol from indicator_list.txt " i " to name my key for the dictionary.

quandlmetadata[i]=(metadata)

This is the output of quandlmetadata

{'BOE/XUDLADS': {'column_names': ['Date', 'Value'], 'end_date': datetime.date(2016, 11, 3), 'transform': None, 'collapse': None, 'order': 'asc', 'start_date': datetime.date(1975, 1, 2), 'limit': None, 'column_index': None, 'frequency': 'daily'}, 'BOE/XUDLCDD': {'column_names': ['Date', 'Value'], 'end_date': datetime.date(2016, 11, 3), 'transform': None, 'collapse': None, 'order': 'asc', 'start_date': datetime.date(1975, 1, 2), 'limit': None, 'column_index': None, 'frequency': 'daily'}, 'BOE/XUDLB8KL': {'column_names': ['Date', 'Value'], 'end_date': datetime.date(2016, 11, 3), 'transform': None, 'collapse': None, 'order': 'asc', 'start_date': datetime.date(2011, 8, 1), 'limit': None, 'column_index': None, 'frequency': 'daily'}, 'COM/OIL_WTI': {'column_names': ['date', 'value'], 'end_date': datetime.date(2016, 11, 4), 'transform': None, 'collapse': None, 'order': 'asc', 'start_date': datetime.date(1983, 3, 30), 'limit': None, 'column_index': None, 'frequency': 'daily'}, 'BOE/XUDLADD': {'column_names': ['Date', 'Value'], 'end_date': datetime.date(2016, 11, 3), 'transform': None, 'collapse': None, 'order': 'asc', 'start_date': datetime.date(1975, 1, 2), 'limit': None, 'column_index': None, 'frequency': 'daily'}, 'BOE/XUDLCDS': {'column_names': ['Date', 'Value'], 'end_date': datetime.date(2016, 11, 3), 'transform': None, 'collapse': None, 'order': 'asc', 'start_date': datetime.date(1975, 1, 2), 'limit': None, 'column_index': None, 'frequency': 'daily'}}

Finally I want to have the quandlmetadata dictionary turned into a data frame(or another way that would be better)

this is the final part of the code

df = pd.DataFrame(index = quandlmetadata.keys(),columns =['transform', 'frequency', 'limit', 'end_date', 'collapse', 'column_names','start_date', 'order', 'column_index']  )

The output of df

             transform frequency limit end_date collapse column_names start_date order column_index
BOE/XUDLB8KL       NaN       NaN   NaN      NaN      NaN          NaN        NaN   NaN          NaN
BOE/XUDLADS        NaN       NaN   NaN      NaN      NaN          NaN        NaN   NaN          NaN
BOE/XUDLADD        NaN       NaN   NaN      NaN      NaN          NaN        NaN   NaN          NaN
BOE/XUDLCDS        NaN       NaN   NaN      NaN      NaN          NaN        NaN   NaN          NaN
COM/OIL_WTI        NaN       NaN   NaN      NaN      NaN          NaN        NaN   NaN          NaN
BOE/XUDLCDD        NaN       NaN   NaN      NaN      NaN          NaN        NaN   NaN          NaN

The output of df is exactly what I want; the ticker from Indicator_list.txt is my index, and the columns are the metadata.keys(). The only thing I cannot get to work is populating the rows of the DataFrame with the quandlmetadata dictionary values. The end goal is to be able to import this list into excel so if there is a way to do this without using a dataframe I am defiantly open to that.


Solution

  • Maybe you could use DataFrame.from_dict?

    In [15]: pd.DataFrame.from_dict(quandlmetadata, orient='index')
    Out[15]: 
                 column_index    end_date order   column_names  start_date collapse transform limit frequency
    BOE/XUDLADD          None  2016-11-03   asc  [Date, Value]  1975-01-02     None      None  None     daily
    BOE/XUDLADS          None  2016-11-03   asc  [Date, Value]  1975-01-02     None      None  None     daily
    BOE/XUDLB8KL         None  2016-11-03   asc  [Date, Value]  2011-08-01     None      None  None     daily
    BOE/XUDLCDD          None  2016-11-03   asc  [Date, Value]  1975-01-02     None      None  None     daily
    BOE/XUDLCDS          None  2016-11-03   asc  [Date, Value]  1975-01-02     None      None  None     daily
    COM/OIL_WTI          None  2016-11-04   asc  [date, value]  1983-03-30     None      None  None     daily
    

    I don't think the column_names column will be very useful, though. You'd also want to manually call pd.to_datetime on the date columns, so that they're datetime64 columns and not string ones.