c# json json.net formatting indentation

Is there a way to modify JSON without changing the formatting?


Using Newtonsoft.Json, you get to choose how to format JSON using the Formatting enum and properties of JsonTextWriter. But if I start with a JSON string that's already formatted in some way and I want to modify it, is there a way to ensure that it retains its formatting?

I can think of a few avenues to explore:

  1. Do any current JSON libraries have a way of modifying a JSON string in place without deserializing it?
  2. When deserializing a JSON string, is there a way to detect the string's formatting so that the same formatting can be applied when reserializing it?
  3. Is there a function that combines a deserialized JObject with the JSON string it came from so that the object's changes are applied on top of the old string rather than used to build a new string?

Example of the problem:

var obj = JObject.Parse(json);
obj["foo"] = "bar";

Console.WriteLine(obj.ToString(Formatting.Indented));
// {
//   "baz": "qux",
//   "foo": "bar"
// }

Console.WriteLine(obj.ToString(Formatting.None));
// {"baz":"qux","foo":"bar"}

// Not knowing how the input was formatted,
// how can I know what options to use?

What a solution might look like:

var format = JsonConvert.GetFormat(json); // No such method?
var obj = JObject.Parse(json);
obj["foo"] = "bar";

Console.WriteLine(obj.ToString(format));

(I'm aware that there are more ways to format JSON than just choosing Indented or None, but I've kept the example simple for clarity.)


Solution

  • I'd argue: It's probably a fruitless pursuit.

    Although JSON is meant to be human-readable, it is rarely read by humans. I'd in fact be much more worried about accidentally losing or altering important data while parsing it in and out of an object.

    However, for a little fun: in Newtonsoft, you can guess whether the original was indented by re-serializing the data without indentation and comparing the lengths.

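    // Assumes: using System.Collections.Generic; using Newtonsoft.Json;
    // and a Person class with Name (string) and Colors (List<string>) properties.
    var person = new Person
    {
        Name = "John",
        Colors = new List<string> { "Red", "Blue", "Green" }
    };

    // rawJson stands in for the JSON that was received (indented here).
    var rawJson = JsonConvert.SerializeObject(person, Formatting.Indented);
    // newJson is the same data re-serialized compactly.
    var newJson = JsonConvert.SerializeObject(person, Formatting.None);

    // If the compact form is shorter, the input carried extra whitespace,
    // so treat it as indented. Equal lengths mean it was already compact.
    var settings = new JsonSerializerSettings
    {
        Formatting = newJson.Length < rawJson.Length ? Formatting.Indented : Formatting.None
    };

    var finalJson = JsonConvert.SerializeObject(person, settings);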
    

    If we are concerned only with Newtonsoft, and only with indented versus not indented, I believe this would be good enough on the following premises:

    1. We control the changes of the format from the client
    2. If it shrinks when re-serialized with Newtonsoft's default (Formatting.None), the original was indented
    3. Hypothetically, it can't grow
    4. If the conversion keeps the strings the same length, nothing has changed
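
    Applied to the question's scenario, where you start from a JSON string rather than an object, the same length comparison might look like the sketch below. GuessFormatting is a hypothetical helper, not part of Newtonsoft.Json, and it only distinguishes Indented from None under the premises above.

    ```csharp
    using System;
    using Newtonsoft.Json;
    using Newtonsoft.Json.Linq;

    public static class JsonFormatGuess
    {
        // Hypothetical helper: guesses Indented vs. None by checking whether
        // a compact re-serialization is shorter than the (trimmed) input.
        public static Formatting GuessFormatting(string json)
        {
            var compact = JToken.Parse(json).ToString(Formatting.None);
            return compact.Length < json.Trim().Length
                ? Formatting.Indented
                : Formatting.None;
        }

        public static void Main()
        {
            var json = "{\n  \"baz\": \"qux\"\n}";

            // Guess the format before modifying anything.
            var format = GuessFormatting(json);

            var obj = JObject.Parse(json);
            obj["foo"] = "bar";

            // Re-serialize with the guessed format.
            Console.WriteLine(obj.ToString(format));
        }
    }
    ```

    This is only a heuristic. Any input that is longer than Newtonsoft's compact output is classified as Indented, including single-line JSON with spaces after colons or escaped characters that the parser normalizes, and the output uses Newtonsoft's own two-space indentation rather than whatever the input used.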

    Of course, problem cases come up like:

    1. The received JSON contains property names with extra characters, such as "First Name". This is valid in JSON but not as a class property name, so you may need an attribute to map it.
    2. JSON is still a string, and all sorts of weird things might come in, so validating it with any degree of certainty is likely to become a fool's errand (the world just builds a better fool)
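
    For the first case, Newtonsoft's JsonProperty attribute can map a JSON name that is not a valid C# identifier onto a property. A minimal sketch, where the Contact class is hypothetical and exists only for illustration:

    ```csharp
    using System;
    using Newtonsoft.Json;

    // Hypothetical class, used only to illustrate the name mapping.
    public class Contact
    {
        [JsonProperty("First Name")]
        public string FirstName { get; set; }
    }

    public static class JsonPropertyDemo
    {
        public static void Main()
        {
            // The JSON name "First Name" is bound to the FirstName property.
            var contact = JsonConvert.DeserializeObject<Contact>("{\"First Name\":\"John\"}");
            Console.WriteLine(contact.FirstName);

            // Serializing writes the mapped JSON name back out.
            Console.WriteLine(JsonConvert.SerializeObject(contact));
        }
    }
    ```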

    In fact, if the purpose of some tool is to present the data to humans, then select the most legible format (to me, indented) and always display that.

    If the consumers are usually machines, then use None for the smallest payloads. I'd argue pretty much all developers used to JSON would expect minified JSON as a payload.