I have a huge third-party JSON feed with lots of data, e.g.
{
"data": [{
"name": "ABC",
"price": "2.50"
},
...
]
}
I need to strip the quotation marks from the price values, because the consumer of the JSON feed requires them that way.
To do this, I use a regex to find the prices, then iterate over them and replace each one with gsub. This is how I am doing it:
price_strings = json.scan(/(?:"price":")(.*?)(?:")/).uniq
price_strings.each do |price|
json.gsub!("\"#{price.reduce}\"", price.reduce)
end
json
The main bottleneck appears to be the each block. Is there a better way of doing this?
If this JSON string is going to be parsed into a Hash at some point, either in your application or in a third-party dependency of your code (i.e. something consumed by your colleagues or by other modules), I suggest negotiating with them to convert the price value from String to Numeric on demand, once the JSON is already a Hash. This is more efficient, and it lets them handle edge cases such as "price": "", where my code below will not work: it would remove the "" and leave a JSON syntax error.
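To illustrate the "convert once it is a Hash" idea, here is a minimal sketch. The helper name numeric_prices and the sample feed are made up for this example, and it assumes Ruby 2.6+ (for Float(..., exception: false)) and that a Float is an acceptable numeric type for the consumer.

```ruby
require 'json'

# Hypothetical helper: walks the parsed structure and converts every
# "price" string that Float() accepts; anything else is left untouched.
def numeric_prices(node)
  case node
  when Hash
    node.each_with_object({}) do |(key, value), out|
      out[key] = if key == 'price' && value.is_a?(String)
        # Float returns nil for "" or other non-numeric text, so keep the original.
        Float(value, exception: false) || value
      else
        numeric_prices(value)
      end
    end
  when Array
    node.map { |item| numeric_prices(item) }
  else
    node
  end
end

# Made-up sample feed, including the empty-price edge case.
feed = '{"data":[{"name":"ABC","price":"2.50"},{"name":"DEF","price":""}]}'
converted = numeric_prices(JSON.parse(feed))
puts JSON.generate(converted)
```

Note that going through Float drops trailing zeros ("2.50" is emitted as 2.5), so if the consumer needs the exact digits, this approach may not fit.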
However, if you do not have control over this, or you are doing a one-off mutation of the whole JSON data, you could try the following:
json = <<-eos
{
"data": [{
"name": "ABC",
"price": "2.50",
"somethingsomething": {
"data": [{
"name": "DEF",
"price": "3.25", "someprop1": "hello",
"someprop2": "world"
}]
},
"somethinggggg": {
"price": "123.45" },
"something2222": {
"price": 9.876, "heeeello": "world"
}
}]
}
eos
new_json = json.gsub(/("price":.*?)"(.*?)"(.*?,|})/, '\1\2\3')
puts new_json
# =>
# {
# "data": [{
# "name": "ABC",
# "price": 2.50,
# "somethingsomething": {
# "data": [{
# "name": "DEF",
# "price": 3.25, "someprop1": "hello",
# "someprop2": "world"
# }]
# },
# "somethinggggg": {
# "price": 123.45 },
# "something2222": {
# "price": 9.876, "heeeello": "world"
# }
# }]
# }
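A possibly safer variant, as a sketch only: match the quotes only when the quoted value is a plain decimal number, so that "price": "" and already-unquoted prices are left alone. The constant name PRICE_PATTERN and the sample string are made up for this example, and the pattern assumes prices never use exponent notation.

```ruby
require 'json'

# Strip the quotes only when the value is a plain JSON-style decimal
# (optional minus sign, no leading zeros, optional fraction).
PRICE_PATTERN = /("price"\s*:\s*)"(-?(?:0|[1-9]\d*)(?:\.\d+)?)"/

# Made-up sample covering a quoted price, an empty price and an unquoted price.
sample = '{"a":{"price": "2.50"},"b":{"price": ""},"c":{"price": 9.876, "x": "y"}}'
stripped = sample.gsub(PRICE_PATTERN, '\1\2')
puts stripped

# Raises JSON::ParserError if the result is no longer valid JSON.
JSON.parse(stripped)
```

Because this is a single gsub over the whole string, it also avoids the scan-then-each loop from the question, which runs one full gsub! pass per unique price.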
DISCLAIMER: I am not a Regexp expert.