I've stumbled upon a surprising observation while working with Python's standard json library, specifically when using its object_pairs_hook parameter.
Here's my data:
items.json:
--
{
  "menuitem": [
    {"value": "New", "onclick": "CreateNewDoc()"},
    {"value": "Open", "onclick": "OpenDoc()"},
    {"value": "Close", "onclick": "CloseDoc()"}
  ]
}
--
And here's my minimal working example:
Jupyter QtConsole 4.3.1
Python 3.6.5 |Anaconda, Inc.| (default, Mar 29 2018, 13:32:41) [MSC v.1900 64 bit (AMD64)]
Type 'copyright', 'credits' or 'license' for more information
IPython 6.4.0 -- An enhanced Interactive Python. Type '?' for help.
import json

def dummy_hook(input):
    print("INPUT:", input)

filename = r'items.json'
with open(filename, 'r') as f:
    data = json.load(f, object_pairs_hook=dummy_hook)
Surprisingly (to me), the outcome is this:
INPUT: [('value', 'New'), ('onclick', 'CreateNewDoc()')]
INPUT: [('value', 'Open'), ('onclick', 'OpenDoc()')]
INPUT: [('value', 'Close'), ('onclick', 'CloseDoc()')]
INPUT: [('menuitem', [None, None, None])]
In particular, note that the three dictionaries with the "value"/"onclick" pairs have been decoded to None. This is a problem for me, as I was hoping to perform some further operations on them.
Questions: Is this to be expected? Am I doing something incorrectly here?
EDIT: Changing the hook function to:
def dummy_hook(input):
    print("INPUT:", input)
    return 7
does indeed change the printed output to:
INPUT: [('value', 'New'), ('onclick', 'CreateNewDoc()')]
INPUT: [('value', 'Open'), ('onclick', 'OpenDoc()')]
INPUT: [('value', 'Close'), ('onclick', 'CloseDoc()')]
INPUT: [('menuitem', [7, 7, 7])]
How adding a return statement changes the decoding outcome, I still don't understand. But yes, in principle, this solves the problem.
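I think your function dummy_hook should return a value. A Python function without a return statement returns None, and the decoder puts whatever the hook returns in place of each decoded object, so the inner objects become None before the outer object is built. In your case, you could return the same input, for example as dict(input).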
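As a minimal sketch of that suggestion, the hook below prints the pairs and then returns them as a plain dict. The JSON is inlined as a string (the name TEXT is made up for this example) so the snippet runs without items.json, and the parameter is called pairs rather than input only to avoid shadowing the built-in.

```python
import json

# Inline copy of items.json so the sketch is self-contained (assumption:
# same content as the file in the question).
TEXT = """
{
  "menuitem": [
    {"value": "New", "onclick": "CreateNewDoc()"},
    {"value": "Open", "onclick": "OpenDoc()"},
    {"value": "Close", "onclick": "CloseDoc()"}
  ]
}
"""

def dummy_hook(pairs):
    # pairs is a list of (key, value) tuples for ONE JSON object.
    print("INPUT:", pairs)
    # Whatever is returned here replaces that object in the decoded result.
    return dict(pairs)

data = json.loads(TEXT, object_pairs_hook=dummy_hook)

# The inner objects are now real dicts, so further processing works.
for item in data["menuitem"]:
    print(item["value"], "->", item["onclick"])
```

The hook is called for the innermost objects first, which is why the last INPUT line in the question already shows the inner results (None, or 7) inside the "menuitem" list.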
object_pairs_hook is an optional function that will be called with the result of any object literal decoded with an ordered list of pairs. The return value of object_pairs_hook will be used instead of the dict. This feature can be used to implement custom decoders that rely on the order that the key and value pairs are decoded (for example, collections.OrderedDict() will remember the order of insertion). If object_hook is also defined, the object_pairs_hook takes priority.
from: https://docs.python.org/3.6/library/json.html#json.load
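To illustrate the quoted paragraph, here is a small sketch that passes collections.OrderedDict directly as the hook and also checks the stated priority over object_hook. The sample string TEXT and the helper never_called are hypothetical names used only for this demonstration.

```python
import json
from collections import OrderedDict

# Hypothetical sample input with keys deliberately out of alphabetical order.
TEXT = '{"b": 1, "a": {"d": 2, "c": 3}}'

# OrderedDict accepts a list of (key, value) pairs, so it can serve as the
# hook directly; every JSON object, nested ones included, becomes one.
ordered = json.loads(TEXT, object_pairs_hook=OrderedDict)

def never_called(obj):
    # Would only run if object_hook were used despite object_pairs_hook.
    raise AssertionError("object_hook should not be called")

# When both hooks are given, object_pairs_hook takes priority.
both = json.loads(TEXT, object_hook=never_called, object_pairs_hook=OrderedDict)

print(list(ordered))       # keys in document order
print(list(ordered["a"]))  # nested keys in document order
```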