I'm writing a client for a third-party API, and it returns data in a weird format. At first glance it looks like JSON, but it isn't, and I'm a bit confused about how I should handle it.
It's a key-value based format (much like JSON).
What could this format be? Should I use an existing gem to parse it, or should I build my own parser?
{ "anArray" = (
"aDictionary" = {
"aString" = "Something";
EDIT: This format seems to be an Apple property list, but it is neither the XML nor the binary variant. That makes sense, as the API is served by a WebObjects web service. I will try the CFPropertyList gem to parse it; if there is a better solution, please let me know.
EDIT 2: This is a NeXTSTEP property list.
Here's a robust answer using a custom StringScanner-based parser. It treats whitespace as optional, allows a trailing comma after the last item in a list, and allows omitting the semicolon after the last dictionary key/value pair. The outermost item may be a dictionary, an array, or a string. Strings may contain pretty much any legal content, including parens, curly braces, and escaped text like \n.
Seen in action:
p parse('{ "array" = ( "1", "2", ( "3", "4" ) ); "hash"={ "key"={ "more"="oh}]yes;!"; }; }; }')
#=> {"array"=>["1", "2", ["3", "4"]], "hash"=>{"key"=>{"more"=>"oh}]yes;!"}}}
puts parse('("Escaped \"Quotes\" Allowed", "And Unicode \u2623 OK")')
#=> Escaped "Quotes" Allowed
#=> And Unicode ☣ OK
The code:
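require 'strscan'

def parse(str)
  # Declare the lambda names up front so they can refer to one another.
  ss, getstr, getary, getdct = StringScanner.new(str)
  # A value is a dictionary, an array, or a string; a closing delimiter yields nil.
  getvalue = ->{
    if    ss.scan(/\s*\{\s*/)   then getdct[]
    elsif ss.scan(/\s*\(\s*/)   then getary[]
    elsif str = getstr[]        then str
    elsif ss.scan(/\s*[)}]\s*/) then nil end
  }
  # Scan a quoted string, escape "#" before {, @ or $ to block interpolation, then eval.
  getstr = ->{
    if str=ss.scan(/\s*"(?:[^"\\]|\\u\d+|\\.)*"\s*/i)
      eval str.gsub(/([^\\](?:\\\\)*)#(?=[{@$])/,'\1\#')
    end
  }
  getary = ->{
    [].tap do |a|
      while v=getvalue[]
        a << v
        ss.scan(/\s*,\s*/)   # the comma is optional after the last item
      end
    end
  }
  getdct = ->{
    {}.tap do |h|
      while key = getstr[]
        ss.scan(/\s*=\s*/)
        if value=getvalue[] then h[key]=value; ss.scan(/\s*;\s*/) end
      end
      ss.scan(/\s*\}\s*/)    # consume the closing brace so the enclosing container can continue
    end
  }
  getvalue[]
end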
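The parser leans on one StringScanner behaviour: `scan` returns the matched text and advances the scan position when the pattern matches at the current position, and returns nil without moving otherwise. A minimal sketch of that behaviour on its own (the sample input and variable names are only for illustration):

```ruby
require 'strscan'

ss = StringScanner.new('( "1", "2" )')

opened = ss.scan(/\s*\(\s*/)        # "( "   the scanner moves past the paren
first  = ss.scan(/\s*"[^"]*"\s*/)   # "\"1\"" the first quoted item
miss   = ss.scan(/\s*\{\s*/)        # nil     no "{" here, position unchanged
comma  = ss.scan(/\s*,\s*/)         # ", "
rest   = ss.rest                    # "\"2\" )" whatever has not been consumed yet

p opened, first, miss, comma, rest
```

Because a failed `scan` leaves the position alone, the parser can simply try each alternative in turn without any explicit backtracking.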
As an alternative to rolling your own parser from scratch in the future, you might also want to look into the Treetop Ruby library.
Edit: I've replaced the implementation of getstr above with one that should prevent running arbitrary Ruby code inside the eval. For more details, see "Eval a string without interpolation". Seen in action:
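@secret = "OH NO!"
$secret = "OH NO!"
@@secret = "OH NO!"   # note: a top-level class variable raises on Ruby 3.0+
puts parse('"\"#{:NOT&&:very}\" bad. \u262E\n#@secret \\#$secret \\\\#@@secret"')
#=> "#{:NOT&&:very}" bad. ☮
#=> #@secret #$secret \#@@secret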
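If you would rather not call eval at all, the escapes can be decoded with a lookup table instead. This is a different technique from the one used in getstr, sketched under assumptions: `unescape` and `ESCAPES` are hypothetical names, and only a small set of escapes (\n, \t, \r, \", \\ and four-digit \uXXXX) is handled.

```ruby
# Hypothetical helper, not part of the parser above: decodes the escapes in a
# double-quoted literal with a lookup table, so nothing is ever evaluated.
ESCAPES = { 'n' => "\n", 't' => "\t", 'r' => "\r", '"' => '"', '\\' => '\\' }.freeze

def unescape(literal)
  body = literal.strip[1..-2]                 # drop the surrounding quotes
  body.gsub(/\\u(\h{4})|\\(.)/m) do
    if $1
      [$1.hex].pack('U')                      # \uXXXX becomes a UTF-8 character
    else
      ESCAPES.fetch($2, $2)                   # unknown escapes keep the bare character
    end
  end
end

puts unescape('"\"#{:NOT&&:very}\" bad. \u262E"')
```

Since the text is never handed to the Ruby interpreter, sequences such as `#{...}` stay literal without any extra escaping step.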