Search code examples
pythonxmlsdmx

XML SDMX reading in python


Im strugling to read SDMX XML file with python from below links: https://www.newyorkfed.org/xml/fedfunds.html or direct

Ideally i would like to get fund rates into dataframe but i was trying to use pandasdmx which doesnt seem to work with this one

My Current code: f

rom urllib.request import urlopen
import xml.etree.ElementTree as ET

url = "https://websvcgatewayx2.frbny.org/autorates_fedfunds_external/services/v1_0/fedfunds/xml/retrieve?typ=RATE&f=03012016&t=04032020"

d2 = urlopen(url).read()
root  ET.fromstring(d2)

for elem in root.iter():
    k = elem.get('OBS_VALUE')
    if k is not None:
        print(k)

I would like to get something that will look like this:

             FUNDRATE_OBS_POINT='1%' FUNDRATE_OBS_POINT='25%'
2020-04-02   0.03                    0.05
2020-04-01   0.03                    0.05
2020-04-01   0.01                    0.05

I found this method to be pretty ugly, and for each "data" i need to check if its not None. Is there any better way to do that?


Solution

  • Try something along these lines:

    from lxml import etree
    import requests
    
    resp = requests.get(url)
    
    doc = etree.fromstring(resp.content)
    
    headers = []
    dates = []
    columns = []
    
    fop = doc.xpath('//Series[@FUNDRATE_OBS_POINT]')
    datpath = fop[0].xpath('//*[@*="ns13:ObsType"]')
    for dat in datpath:
        dates.append(dat.attrib.get('TIME_PERIOD'))
    for item in fop:
        headers.append(item.attrib.get('FUNDRATE_OBS_POINT'))
        entries = item.xpath('//*[@*="ns13:ObsType"]')
        column = []
        for entry in entries:
            column.append(entry.attrib.get('OBS_VALUE'))
        columns.append(column)
    
    
    df = pd.DataFrame(columns=headers,index=dates)
    
    for a, b in zip(headers,columns):
        df[a] = b
    df.head(3)
    

    Output:

                 1%     25%     50%     75%     99%  TARGET_HIGH  TARGET_LOW
    2020-04-02  0.03    0.03    0.03    0.03    0.03    0.03    0.03
    2020-04-01  0.03    0.03    0.03    0.03    0.03    0.03    0.03
    2020-03-31  0.01    0.01    0.01    0.01    0.01    0.01    0.01