I am downloading metadata from quandl.com for financial data sets. The data from quandl.com is already in a dictionary format. I want to take this data from quandl.com and organize it into a DataFrame and then import it into Excel.
This is the text file('Indicator_list.txt') with a list of financial datasets that I am downloading from quandl.com. I want metadata on each of these symbols organized into a DataFrame.
COM/OIL_WTI
BOE/XUDLADS
BOE/XUDLADD
BOE/XUDLB8KL
BOE/XUDLCDS
BOE/XUDLCDD
This is the code I am running
import quandl
import pandas as pd
#This adjusts the layout in the command
#promt to have columns displayed side by side
pd.set_option('expand_frame_repr', False)
#This "with open" statment opens a text file that
#has the symbols I want to get the metadata on
with open ('Indicator_list.txt') as file_object:
Current_indicators = file_object.read()
tickers = Current_indicators.split('\n')
#quandlmetadata is a blank dictionary that I am
#appending the meatadata to
quandlmetadata={}
#this loops through all the values in
#Indicator_list.txt"
for i in tickers:
#metadata represents one set of metadata
metadata = quandl.Dataset(i).data().meta
This is the output of metadata coming from quandl.com
{'start_date': datetime.date(1975, 1, 2), 'column_names': ['Date', 'Value'], 'limit': None, 'collapse': None, 'order': 'asc', 'end_date': datetime.date(2016, 11, 3), 'transform': None, 'column_index': None, 'frequency': 'daily'}
Next I add this to quandlmetadata dictionary and use the current symbol from indicator_list.txt " i " to name my key for the dictionary.
quandlmetadata[i]=(metadata)
This is the output of quandlmetadata
{'BOE/XUDLADS': {'column_names': ['Date', 'Value'], 'end_date': datetime.date(2016, 11, 3), 'transform': None, 'collapse': None, 'order': 'asc', 'start_date': datetime.date(1975, 1, 2), 'limit': None, 'column_index': None, 'frequency': 'daily'}, 'BOE/XUDLCDD': {'column_names': ['Date', 'Value'], 'end_date': datetime.date(2016, 11, 3), 'transform': None, 'collapse': None, 'order': 'asc', 'start_date': datetime.date(1975, 1, 2), 'limit': None, 'column_index': None, 'frequency': 'daily'}, 'BOE/XUDLB8KL': {'column_names': ['Date', 'Value'], 'end_date': datetime.date(2016, 11, 3), 'transform': None, 'collapse': None, 'order': 'asc', 'start_date': datetime.date(2011, 8, 1), 'limit': None, 'column_index': None, 'frequency': 'daily'}, 'COM/OIL_WTI': {'column_names': ['date', 'value'], 'end_date': datetime.date(2016, 11, 4), 'transform': None, 'collapse': None, 'order': 'asc', 'start_date': datetime.date(1983, 3, 30), 'limit': None, 'column_index': None, 'frequency': 'daily'}, 'BOE/XUDLADD': {'column_names': ['Date', 'Value'], 'end_date': datetime.date(2016, 11, 3), 'transform': None, 'collapse': None, 'order': 'asc', 'start_date': datetime.date(1975, 1, 2), 'limit': None, 'column_index': None, 'frequency': 'daily'}, 'BOE/XUDLCDS': {'column_names': ['Date', 'Value'], 'end_date': datetime.date(2016, 11, 3), 'transform': None, 'collapse': None, 'order': 'asc', 'start_date': datetime.date(1975, 1, 2), 'limit': None, 'column_index': None, 'frequency': 'daily'}}
Finally I want to have the quandlmetadata dictionary turned into a data frame(or another way that would be better)
this is the final part of the code
df = pd.DataFrame(index = quandlmetadata.keys(),columns =['transform', 'frequency', 'limit', 'end_date', 'collapse', 'column_names','start_date', 'order', 'column_index'] )
The output of df
transform frequency limit end_date collapse column_names start_date order column_index
BOE/XUDLB8KL NaN NaN NaN NaN NaN NaN NaN NaN NaN
BOE/XUDLADS NaN NaN NaN NaN NaN NaN NaN NaN NaN
BOE/XUDLADD NaN NaN NaN NaN NaN NaN NaN NaN NaN
BOE/XUDLCDS NaN NaN NaN NaN NaN NaN NaN NaN NaN
COM/OIL_WTI NaN NaN NaN NaN NaN NaN NaN NaN NaN
BOE/XUDLCDD NaN NaN NaN NaN NaN NaN NaN NaN NaN
The output of df is exactly what I want; the ticker from Indicator_list.txt is my index, and the columns are the metadata.keys(). The only thing I cannot get to work is populating the rows of the DataFrame with the quandlmetadata dictionary values. The end goal is to be able to import this list into excel so if there is a way to do this without using a dataframe I am defiantly open to that.
Maybe you could use DataFrame.from_dict
?
In [15]: pd.DataFrame.from_dict(quandlmetadata, orient='index')
Out[15]:
column_index end_date order column_names start_date collapse transform limit frequency
BOE/XUDLADD None 2016-11-03 asc [Date, Value] 1975-01-02 None None None daily
BOE/XUDLADS None 2016-11-03 asc [Date, Value] 1975-01-02 None None None daily
BOE/XUDLB8KL None 2016-11-03 asc [Date, Value] 2011-08-01 None None None daily
BOE/XUDLCDD None 2016-11-03 asc [Date, Value] 1975-01-02 None None None daily
BOE/XUDLCDS None 2016-11-03 asc [Date, Value] 1975-01-02 None None None daily
COM/OIL_WTI None 2016-11-04 asc [date, value] 1983-03-30 None None None daily
I don't think the column_names
column will be very useful, though. You'd also want to manually call pd.to_datetime
on the date columns, so that they're datetime64 columns and not string ones.