I am working with a Rails database that contains serialized values in one column. These values should have been regular Hashes, but because the parameters were not sanitized properly they were stored as HashWithIndifferentAccess or Parameters. For example, one column entry looks like this:
--- !ruby/object:ActionController::Parameters
parameters: !ruby/hash:ActiveSupport::HashWithIndifferentAccess
  windowHeight: 946
  documentHeight: 3679
  scrollTop: 500
permitted: false
I want to read this with Python's yaml implementation, but when I try to do so, I get:
*** yaml.constructor.ConstructorError: could not determine a constructor for the tag '!ruby/object:ActionController::Parameters'
  in "<unicode string>", line 1, column 5:
    --- !ruby/object:ActionController::P ...
        ^
So, for some reason it expects a constructor. But quite obviously, the value itself is just a regular dictionary. How can I still read it?
You can use PyYAML's add_constructor(tag, constructor) function, which lets you register a custom constructor for a tag that the parser does not recognize. The constructor itself is a function that takes (loader, node).
In that constructor, loader.construct_pairs(node) can be called to obtain key-value tuples from the original node contents. Using a dictionary comprehension, we can create the original dictionary.
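To make the shape of those key-value tuples concrete, here is a minimal sketch. The tag `!demo` and the function name `show_pairs` are made up for this illustration, and the constructor is registered on `yaml.SafeLoader` so that `yaml.safe_load` can be used.

```python
import yaml

def show_pairs(loader, node):
    # construct_pairs returns a list of (key, value) tuples for a mapping node.
    return loader.construct_pairs(node)

# "!demo" is a hypothetical tag used only for this illustration.
yaml.add_constructor("!demo", show_pairs, Loader=yaml.SafeLoader)

pairs = yaml.safe_load("--- !demo\nwindowHeight: 946\nscrollTop: 500\n")
as_dict = dict(pairs)  # same result as the dictionary comprehension
```

Here `pairs` is a list of tuples in document order, and `dict(pairs)` turns it into a regular dictionary.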
Since the entry is nested, we have to register the constructor for both tags.
A complete example would be the following:
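import yaml

def convert_entry(loader, node):
    # Build a plain dict from the (key, value) pairs of the tagged mapping node.
    return {key: value for key, value in loader.construct_pairs(node)}

# Register the same constructor for both Ruby tags, since the entry is nested.
yaml.add_constructor('!ruby/hash:ActiveSupport::HashWithIndifferentAccess', convert_entry)
yaml.add_constructor('!ruby/object:ActionController::Parameters', convert_entry)

# Recent PyYAML versions require an explicit Loader argument.
data = yaml.load(input_string, Loader=yaml.FullLoader)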
This is documented, but examples are hard to find.
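If more Ruby tags turn up in other rows, registering each one by hand gets tedious. One possible variant, sketched here under the assumption that every `!ruby/...` node in your data is a mapping, uses `add_multi_constructor` to match the tag prefix. The names `RubyLoader` and `ruby_mapping_to_dict` are hypothetical; the subclass of `yaml.SafeLoader` only keeps the registration from affecting other code that uses `SafeLoader`.

```python
import yaml

def ruby_mapping_to_dict(loader, tag_suffix, node):
    # tag_suffix is the part of the tag after the "!ruby/" prefix; unused here.
    # Assumes the node is a mapping; construct_pairs raises ConstructorError otherwise.
    return dict(loader.construct_pairs(node, deep=True))

class RubyLoader(yaml.SafeLoader):
    """Hypothetical loader that turns any !ruby/... mapping into a plain dict."""

RubyLoader.add_multi_constructor("!ruby/", ruby_mapping_to_dict)

# The column entry from the question, with the nested keys indented.
doc = """\
--- !ruby/object:ActionController::Parameters
parameters: !ruby/hash:ActiveSupport::HashWithIndifferentAccess
  windowHeight: 946
  documentHeight: 3679
  scrollTop: 500
permitted: false
"""

data = yaml.load(doc, Loader=RubyLoader)
scroll_top = data["parameters"]["scrollTop"]
```

Because the loader derives from `SafeLoader`, tags outside the `!ruby/` prefix are still rejected rather than constructed as arbitrary Python objects.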