python, json, configuration-files

Python - JSON parsing using .ini options


Good localtime Python people,

I have a bunch of JSON responses I will be dealing with, with the following format:

{
   "responseHeader":{
      "status":1,
      "params":{
         "indent":"true",
         "fq":"recordType:Vinyl",
         "wt":"json"
      }
   },
   "response":{
      "numFound":2,
      "albums":[
         {
            "name":"Some Crappy Album",
            "year":"1997",
            "artist":[
               "Bill's Polka Jamburri"
            ],
            "producer":[
               "Dope records"
            ]
         },
         {
            "name":"Best of Foreigner",
            "year":"2008",
            "artist":[
               "Foreigner"
            ],
            "producer":[
               "Rhino Entertainment"
            ]
         }
      ]
   }
}

And an .ini file that includes:

[Filters]
Exclude:somekey=somevalue
Include:somekey=somevalue

I already have code that uses urllib, urllib2, argparse and ConfigParser to read in a batch of these records and process the data. My question is: what would be the best way to implement filtering using my .ini file, so that I could explicitly retrieve albums based on fields (Include:artist=devo) or exclude albums based on fields (Exclude:year=1979)?

Below are my getOptionsFromConfigFile, loadJSON and getAlbums functions:

import ConfigParser
import logging

def getOptionsFromConfigFile( ):
    print "==========================================================================="
    print "Reading in config (.ini) file params ... "
    config = ConfigParser.ConfigParser()
    config.read("config.ini")
    # "Exclude:somekey=somevalue" is read as option "Exclude" with value "somekey=somevalue"
    ExcludeParams = config.get("Filters", "Exclude")
    logging.debug(' Exclude params pulled from ini file: ' + ExcludeParams)
    IncludeParams = config.get("Filters", "Include")
    logging.debug(' Include params pulled from ini file: ' + IncludeParams)
    return ExcludeParams, IncludeParams

import urllib2
import simplejson

def loadJSON( ):
    # JSONPath is a module-level URL defined elsewhere in the script
    print "Fetch Albums! ---> " + JSONPath
    print "==========================================================================="
    logging.debug('Loading ' + JSONPath)
    response = urllib2.urlopen(JSONPath)
    data = response.read()
    values = simplejson.loads(data)
    logging.debug('Dictionary pulled from ' + JSONPath)
    return values

def getAlbums( values, outputPath):
    logging.debug('Getting Albums ...')
    for album in values['response']['albums']:
        albumName = album['name']
        storeAlbum(outputPath)
    print "==========================================================================="

Solution

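  • Assuming you can load Exclude:year=1979 into a string, you would need to split it into a (key, value) tuple. Keep the value as a string, because the JSON stores the year as "1997" rather than 1997. For example

    ('year', '1979')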
    

    Then, while you iterate over the albums, you also need to check each one against a list of exclusion or inclusion tuples

    # TODO: parse the exclusions and pass them to this function
    def getAlbums(values, output_path, inclusions=None, exclusions=None):

        logging.debug('Getting Albums ...')
        exclusions = exclusions or []  # allow calls without any filters

        albums = []
        for album in values['response']['albums']:
            # skip the album if any exclusion matches it
            # (list-valued fields such as artist would need an `in` test instead)
            if any(album.get(ex_key) == ex_value for ex_key, ex_value in exclusions):
                continue
            albums.append(album['name'])

        for album in albums:
            store_album(album, output_path)
    

    This approach isn't perfect, though: what if the exclude and include filters overlap? Do you want to add everything that isn't excluded, or only the included values?

    You might be better off collecting all the albums in a list first, then filtering afterwards.
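
To make the "collect first, filter afterwards" idea concrete, here is a minimal sketch, not a definitive implementation. It is written for Python 3, where the module is called configparser (ConfigParser on Python 2). The helper names parse_filter, matches and filter_albums are hypothetical, as are the sample records. It also makes three assumptions the question does not settle: an exclusion wins when it overlaps with an inclusion, matching is case-insensitive, and list-valued fields such as artist match if any element equals the value.

```python
import configparser  # this module is named ConfigParser on Python 2

# Same layout as the .ini file in the question, with the example filters filled in.
INI_TEXT = """
[Filters]
Exclude:year=1979
Include:artist=devo
"""


def parse_filter(spec):
    """Turn 'year=1979' into ('year', '1979'); the value stays a string, as in the JSON."""
    key, _, value = spec.partition("=")
    return key.strip(), value.strip()


def matches(album, key, value):
    """True if album[key] equals value, or contains it when the field is a list."""
    field = album.get(key)
    if field is None:
        return False
    if isinstance(field, list):
        return any(str(item).lower() == value.lower() for item in field)
    return str(field).lower() == value.lower()


def filter_albums(albums, inclusions=(), exclusions=()):
    """Keep albums that match every inclusion and no exclusion (assumed policy)."""
    kept = []
    for album in albums:
        if any(matches(album, k, v) for k, v in exclusions):
            continue  # assumption: exclusion wins when the filters overlap
        if all(matches(album, k, v) for k, v in inclusions):
            kept.append(album)
    return kept


config = configparser.ConfigParser()
config.read_string(INI_TEXT)
exclusions = [parse_filter(config.get("Filters", "Exclude"))]
inclusions = [parse_filter(config.get("Filters", "Include"))]

# Hypothetical sample records shaped like values['response']['albums'].
albums = [
    {"name": "Freedom of Choice", "year": "1980", "artist": ["Devo"]},
    {"name": "Duty Now for the Future", "year": "1979", "artist": ["Devo"]},
    {"name": "Best of Foreigner", "year": "2008", "artist": ["Foreigner"]},
]

names = [album["name"] for album in filter_albums(albums, inclusions, exclusions)]
print(names)  # ['Freedom of Choice']
```

With no inclusions at all, every album that is not excluded is kept, which gives you the "everything that isn't excluded" behaviour; adding an Include narrows the result to matching albums only. If you would rather have inclusions override exclusions, swap the order of the two checks in filter_albums.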