Tags: python, json, pandas, data-analysis, keyword-argument

Query data using pandas with kwargs


I'm trying to query data using the Python pandas library. Here is an example of the JSON data:

[
{
"name": "Bob", 
"city": "NY", 
"status": "Active"
}, 
{
"name": "Jake", 
"city": "SF", 
"status": "Active" 
}, 
{
"name": "Jill", 
"city": "NY", 
"status": "Lazy" 
},
{
"name": "Steve", 
"city": "NY", 
"status": "Lazy" 
}]

My goal is to select the rows where city == "NY" and status == "Lazy". One way to do this with a pandas DataFrame is:

df = df[(df.status == "Lazy") & (df.city == "NY")]
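For anyone who wants to reproduce this without a file on disk, here is a minimal sketch that builds the DataFrame straight from the sample records above and applies the same boolean mask. The names `records`, `mask`, and `lazy_in_ny` are just illustrative and not part of my real script.

```python
import pandas as pd

# Sample records copied from the JSON above, inlined so no input file is needed.
records = [
    {"name": "Bob", "city": "NY", "status": "Active"},
    {"name": "Jake", "city": "SF", "status": "Active"},
    {"name": "Jill", "city": "NY", "status": "Lazy"},
    {"name": "Steve", "city": "NY", "status": "Lazy"},
]
df = pd.DataFrame(records)

# Each comparison yields a boolean Series; & combines them element-wise.
# The parentheses are required because & binds tighter than ==.
mask = (df["status"] == "Lazy") & (df["city"] == "NY")
lazy_in_ny = df[mask]

print(lazy_in_ny["name"].tolist())  # ['Jill', 'Steve']
```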

This works fine, but I wanted something more abstract.

Is there a way I can use **kwargs to filter the data? So far I've had trouble finding an answer in the pandas documentation.

Here is what I've done so far:

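import sys

import pandas as pd


def main(**kwargs):
    # Load the JSON file given as the first command-line argument.
    readJson = pd.read_json(sys.argv[1])

    # Narrow the frame one column/value pair at a time.
    for key, value in kwargs.items():
        print(key, value)
        readJson = readJson[readJson[key] == value]

    print(readJson)


if __name__ == '__main__':
    main(status="Lazy", city="NY")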

Again, this works just fine, but I wonder if there is a better way to do it.


Solution

  • I don't really see anything wrong with your approach. If you wanted to use df.query you could do something like this, although I'd argue it's less readable.

    # {v!r} quotes strings and leaves numbers bare, so non-string values work too
    expr = " and ".join(f"{k} == {v!r}" for k, v in kwargs.items())
    readJson = readJson.query(expr)
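    If you'd rather avoid building a query string at all, one possible variation (a sketch, not something from the question) is to AND the per-column masks together and index once. The helper name `filter_equal` and the inlined sample data are assumptions made for illustration.

```python
from functools import reduce
import operator

import pandas as pd


def filter_equal(df, **conditions):
    """Return the rows of df where every named column equals the given value."""
    # Start from an all-True mask so that passing no conditions keeps every row.
    everything = pd.Series(True, index=df.index)
    masks = (df[column] == value for column, value in conditions.items())
    # Combine all masks with element-wise AND, then index the frame once.
    return df[reduce(operator.and_, masks, everything)]


# Sample records from the question, inlined so the sketch runs on its own.
df = pd.DataFrame([
    {"name": "Bob", "city": "NY", "status": "Active"},
    {"name": "Jake", "city": "SF", "status": "Active"},
    {"name": "Jill", "city": "NY", "status": "Lazy"},
    {"name": "Steve", "city": "NY", "status": "Lazy"},
])

print(filter_equal(df, status="Lazy", city="NY"))
```

    This gives the same rows as the loop in the question, but it builds a single mask instead of re-slicing the frame on every iteration, and it doesn't depend on how values get quoted inside a query expression.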