I wrote an app that helps me search through a gigantic JSON file (a database dump). The JSON file is loaded as a list of dictionaries using:
import json

with open('myDB.json', 'r', encoding="utf-8") as file:
    myDB = json.load(file)
The current structure of myDB is like this:
[
    {
        "object": "myobject",
        "key1": "value1",
        "key2": "value2",
        "key3": "value3"
    },
    {
        "object": "myobject",
        "key1": "value1",
        "key2": "value2",
        "key3": "value3"
    }
]
Some values are lists, some values are other dictionaries, some are just regular values.
At the moment I'm outputting queried items by pretty-printing them with:
for i in queryResults:
    print(json.dumps(i, indent=3))
...but sadly there are so many keys in each item that the output takes up too much space on screen and becomes unreadable. Even worse, I don't need all of it. What I'd like to do is selectively remove certain key:value pairs from the printed result, so in my example let's say only object and key2 would be printed.
I'm not interested in manually printing (or making lists of) the key:value pairs that I need. There are far too many of those to do it this way, not to mention that the actual needs might change. In comparison, there's only a handful of key:value pairs I want to remove. What I'd prefer is to have a list of keys to remove, which would be applied when printing the result, thus filtering what is actually printed.
Pythonic one-liners are very welcome.
BONUS QUESTION: I'm looking primarily for a way to remove top-level key:value pairs in each item but for the sake of complete knowledge I'll be happy to also know how to remove key:value pairs from sub-dictionaries that are values of certain top-level keys.
First: use the pprint library, it's made for this.
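As a rough illustration of what pprint offers, the sketch below uses two of its real parameters, depth and sort_dicts (the latter needs Python 3.8 or newer). The sample item is made up for the example.

```python
import pprint

# Hypothetical item; only the shape matters here.
item = {
    "object": "myobject",
    "key2": {"nested": {"deep": 1}},
}

# depth=1 collapses every container below the top level to "...";
# sort_dicts=False keeps the original key order (Python 3.8+).
text = pprint.pformat(item, depth=1, sort_dicts=False)
print(text)
```

Note that depth only abbreviates nested containers. It does not remove keys, so filtering is still needed for that.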
Otherwise, the straightforward solution is to filter each dict and then pretty-print it. Something like this would work for top-level key removal:
filtered_results = [{k:v for k, v in elem.items() if k not in keys_to_remove} for elem in query_results]
though to go below the top level you would need something recursive, like:
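def filter_results(results, keys_to_remove):
    # Lists: filter every element, keeping the list structure.
    if isinstance(results, list):
        return [filter_results(item, keys_to_remove) for item in results]
    # Dicts: drop unwanted keys at this level, then recurse into the values.
    elif isinstance(results, dict):
        return {k: filter_results(v, keys_to_remove) for k, v in results.items() if k not in keys_to_remove}
    # Plain values are returned unchanged.
    else:
        return results
...
filtered_results = filter_results(query_results, keys_to_remove)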
After which you can just print it at your leisure
import pprint
...
pprint.pprint(filtered_results)
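For the bonus question, where sub-keys should be dropped only inside certain top-level keys rather than at every level, one possible sketch is below. The names strip_item and nested_keys_to_remove, the "details" key, and the sample data are all assumptions made up for illustration; it prints with json.dumps as in the question.

```python
import json

# Hypothetical sample item; "details" stands in for a dict-valued top-level key.
query_results = [
    {"object": "myobject", "key1": "value1", "key2": "value2",
     "details": {"id": 7, "secret": "x", "note": "keep me"}},
]

keys_to_remove = {"key1"}                        # top-level keys to hide
nested_keys_to_remove = {"details": {"secret"}}  # sub-keys to hide, per top-level key

def strip_item(item):
    """Return a filtered copy of one item; the original is left untouched."""
    result = {}
    for key, value in item.items():
        if key in keys_to_remove:
            continue  # drop the whole top-level pair
        unwanted = nested_keys_to_remove.get(key)
        if unwanted and isinstance(value, dict):
            # Rebuild the sub-dictionary without the unwanted sub-keys.
            value = {k: v for k, v in value.items() if k not in unwanted}
        result[key] = value
    return result

for item in query_results:
    print(json.dumps(strip_item(item), indent=3))
```

Because strip_item builds new dictionaries, the loaded data stays intact and the two removal lists can be changed between queries.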