I am stuck in an impossible situation. I have JSON from outer space (there is no way they are going to change it). Here is the JSON:
{
user:'180111',
title:'I\'m sure "E pluribus unum" means \'Out of Many, One.\' \n\nhttp://en.wikipedia.org/wiki/E_pluribus_unum.\n\n\'',
date:'2007/01/10 19:48:38',
"id":"3322121",
"previd":112211,
"body":"\'You\' can \"read\" more here [url=http:\/\/en.wikipedia.org\/?search=E_pluribus_unum]E pluribus unum[\/url]'s. Cheers \\*/ :\/",
"from":"112221",
"username":"mikethunder",
"creationdate":"2007\/01\/10 14:04:49"
}
"It is nowhere near valid JSON," I said. Their response was: "Emmm! But JavaScript can read it without complaint":
<html>
<script type="text/javascript">
var obj = {"PUT JSON FROM UP THERE HERE"};
document.write(obj.title);
document.write("<br />");
document.write(obj.creationdate + " " + obj.date);
document.write("<br />");
document.write(obj.body);
document.write("<br />");
</script>
<body>
</body>
</html>
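To make the disagreement concrete, here is a minimal sketch (an illustration, not part of the original exchange) that feeds a shortened copy of the payload to a strict JSON parser and to the JavaScript evaluator. The names payload, strictError, and obj are made up for this example, and new Function is only a stand-in for "JavaScript reading it": it executes the text as code, so it assumes the input is trusted.

```javascript
// Shortened copy of the payload above; String.raw keeps the backslashes as sent.
const payload = String.raw`{
user:'180111',
title:'I\'m sure "E pluribus unum" means \'Out of Many, One.\'',
"previd":112211,
"creationdate":"2007\/01\/10 14:04:49"
}`;

// A strict JSON parser rejects bare keys and single-quoted strings.
let strictError = null;
try {
  JSON.parse(payload);
} catch (err) {
  strictError = err;
}

// Evaluating the same text as a JavaScript object literal succeeds.
// Assumption: the text is trusted, because this executes it as code.
const obj = new Function("return (" + payload + ");")();

console.log(strictError !== null); // did the strict parser refuse the text?
console.log(obj.title);
console.log(obj.creationdate);
```

Both sides are right in their own terms: the text is a JavaScript object literal, not JSON, which is why the strict libraries fail on it.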
Problem
I am supposed to read and parse this string in .NET 4, and it broke 3 of the 14 libraries listed in the C# section of json.org (I didn't try the rest). To make the problem go away, I wrote the following function to fix the issues with single and double quotes.
// Requires: using System.Text; (for StringBuilder)
public static string JSONBeautify(string InStr){
    bool inSingleQuote = false;
    bool inDoubleQuote = false;
    bool escaped = false;
    StringBuilder sb = new StringBuilder(InStr);
    sb = sb.Replace("`", "<°)))><"); // Replace every grave accent with a "fish" so the grave accent can be used as a marker later.
    // Hopefully there is no "fish" in our JSON
    for (int i = 0; i < sb.Length; i++) {
        switch (sb[i]) {
            case '\\':
                if (!escaped)
                    escaped = true;
                else
                    escaped = false;
                break;
            case '\'':
                if (!inSingleQuote && !inDoubleQuote) {
                    sb[i] = '"'; // Change an opening single-quote string marker to a double quote
                    inSingleQuote = true;
                } else if (inSingleQuote && !escaped) {
                    sb[i] = '"'; // Change a closing single-quote string marker to a double quote
                    inSingleQuote = false;
                } else if (escaped) {
                    escaped = false;
                }
                break;
            case '"':
                if (!inSingleQuote && !inDoubleQuote) {
                    inDoubleQuote = true; // This is an opening double-quote string marker
                } else if (inSingleQuote && !escaped) {
                    sb[i] = '`'; // Change an unescaped double quote to a grave accent
                } else if (inDoubleQuote && !escaped) {
                    inDoubleQuote = false; // This is a closing double-quote string marker
                } else if (escaped) {
                    escaped = false;
                }
                break;
            default:
                escaped = false;
                break;
        }
    }
    return sb.ToString()
        .Replace("\\/", "/") // Remove all instances of escaped / (\/); hopefully there are no smileys in the string
        .Replace("`", "\\\"") // Change every grave accent to an escaped double quote \"
        .Replace("<°)))><", "`") // Change every fish back to a grave accent
        .Replace("\\'","'"); // Change every escaped single quote to a plain single quote
}
Now JSONLint only complains about the unquoted attribute names, and both the JSON.NET and SimpleJSON libraries can parse the JSON above.
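For the remaining complaint about attribute names, a string-aware pass could add the missing quotes. The following is a rough sketch in JavaScript, not a tested production routine: quoteBareKeys and beautified are hypothetical names, and the sample input is my shortened reading of what JSONBeautify would produce. It assumes every string is already double-quoted, copies strings verbatim, and quotes an identifier only when a colon follows it.

```javascript
// Hypothetical helper: wraps bare attribute names in double quotes.
// Assumption: all strings in the input are already double-quoted.
function quoteBareKeys(text) {
  let out = "";
  let i = 0;
  while (i < text.length) {
    const ch = text[i];
    if (ch === '"') {
      // Copy a double-quoted string verbatim, skipping backslash escapes.
      let j = i + 1;
      while (j < text.length && text[j] !== '"') {
        j += text[j] === "\\" ? 2 : 1;
      }
      out += text.slice(i, j + 1);
      i = j + 1;
    } else if (/[A-Za-z_$]/.test(ch)) {
      // Read an identifier; it is a key only if a colon follows it.
      let j = i;
      while (j < text.length && /[\w$]/.test(text[j])) j++;
      const word = text.slice(i, j);
      const isKey = /^\s*:/.test(text.slice(j));
      out += isKey ? `"${word}"` : word;
      i = j;
    } else {
      out += ch;
      i++;
    }
  }
  return out;
}

// Shortened sample in the shape JSONBeautify is expected to return.
const beautified = String.raw`{
user:"180111",
title:"I'm sure \"E pluribus unum\" means 'Out of Many, One.'",
date:"2007/01/10 19:48:38",
"id":"3322121",
"previd":112211
}`;

const quoted = quoteBareKeys(beautified);
console.log(JSON.parse(quoted).title);
```

Because true, false, and null are never followed by a colon, the pass leaves them alone. It does not handle comments or single-quoted strings.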
Question
I am sure my code is not the best way of fixing this JSON. Is there any scenario in which my code might break? Is there a better way of doing this?
You need to run this through JavaScript. Fire up a JavaScript engine in .NET, give it the string as input, and use JavaScript's native JSON.stringify to convert it:
obj = {
    "user":'180111',
    "title":'I\'m sure "E pluribus unum" means \'Out of Many, One.\' \n\nhttp://en.wikipedia.org/wiki/E_pluribus_unum.\n\n',
    "date":'2007/01/10 19:48:38',
    "id":"3322121",
    "previd":"112211",
    "body":"\'You\' can \"read\" more here [url=http:\/\/en.wikipedia.org\/?search=E_pluribus_unum]E pluribus unum[\/url]'s. Cheers \\*/ :\/",
    "from":"112221",
    "username":"mikethunder",
    "creationdate":"2007\/01\/10 14:04:49"
}
console.log(JSON.stringify(obj));
document.write(JSON.stringify(obj));
Please remember that the string (or rather, the object) you've got isn't valid JSON and can't be parsed with a strict JSON library; it needs to be converted to valid JSON first. It is, however, valid JavaScript.
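When the payload arrives as text rather than as source code, the conversion could look like the sketch below. toStrictJson is a hypothetical helper name, the sample string is a shortened stand-in for the real payload, and evaluating with new Function is an assumption of this sketch: it runs the input as code, so it is only acceptable for a source you trust.

```javascript
// Hypothetical helper: evaluates a JavaScript object literal and
// re-serializes the result as strict JSON.
// Assumption: the input is trusted; new Function executes it as code.
function toStrictJson(jsLiteral) {
  const value = new Function("return (" + jsLiteral + ");")();
  return JSON.stringify(value);
}

// Shortened stand-in for the payload; String.raw keeps the backslashes.
const loose = String.raw`{ user:'180111', "previd":112211, "body":"\'You\' can \"read\" more :\/" }`;

const strict = toStrictJson(loose);
console.log(strict);

// The output is now acceptable to any strict JSON parser.
const roundTripped = JSON.parse(strict);
console.log(roundTripped.body);
```

The resulting string can then be handed to whichever JSON library the .NET side already uses.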
To complete this answer: you can use JavaScriptSerializer in .NET. For this solution you'll need a reference to the System.Web.Extensions assembly and the following namespace:
System.Web.Script.Serialization
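// Requires: using System.Collections.Generic; using System.Net; using System.Web.Script.Serialization;
var webClient = new WebClient();
string readHtml = webClient.DownloadString("uri to your source (extraterrestrial)");
// Deserialize the downloaded text into a dictionary keyed by attribute name.
var serializer = new JavaScriptSerializer();
Dictionary<string, object> results = serializer.Deserialize<Dictionary<string, object>>(readHtml);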