
How can I parse a JSON array of either strings or objects?


An API I call returns poorly structured JSON. Someone decided it was a great idea to send back a list that looks like this:

features: [
  "First one",
  "second one",
  {
    "feature": "third one",
    "hasAdditionalImpact": true
  },
  "forth one"
]

I've figured out a way to get this data into a struct but that was effectively:

struct MyStruct {
    SensibleData: String,
    SensibleTruthy: bool,
    features: serde_json::Value,
}

This doesn't help me normalize and verify the data.

Is there a good way to turn that first list into something like this?

features: [
  {
    "feature": "First one",
    "hasAdditionalImpact": false
  },
  {
    "feature": "second one",
    "hasAdditionalImpact": false
  },
  {
    "feature": "third one",
    "hasAdditionalImpact": true
  },
  {
    "feature": "forth one",
    "hasAdditionalImpact": false
  }
]

I saw that type_name might be usable for checking the type and doing post-processing after the data has been parsed by serde_json, but I also saw that type_name is intended for diagnostic purposes, so I'd rather not use it for this.


Solution

  • It looks like the features in your JSON have two forms: an explicit object, and a simplified form where some fields are defaulted or unnamed. You can model that with an enum like this:

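    use serde::Deserialize;

    #[derive(Deserialize, Debug)]
    #[serde(untagged)]
    enum Feature {
        // A bare JSON string, e.g. "First one"
        Simple(String),
        // A JSON object with both fields spelled out
        Explicit {
            feature: String,
            #[serde(rename = "hasAdditionalImpact")]
            has_additional_impact: bool,
        },
    }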
    

    The #[serde(untagged)] attribute tells serde to try deserializing into each variant in declaration order and to use the first one that succeeds.


    If working with the enum directly is awkward, you can convert both forms into a single struct, filling in default values, by using #[serde(from = "FeatureSource")] and providing a From conversion:

    #[derive(Deserialize, Debug)]
    #[serde(untagged)]
    enum FeatureSource {
        Simple(String),
        Explicit {
            feature: String,
            #[serde(rename = "hasAdditionalImpact")]
            has_additional_impact: bool,
        },
    }
    
    #[derive(Deserialize, Debug)]
    #[serde(from = "FeatureSource")]
    struct Feature {
        feature: String,
        has_additional_impact: bool,
    }
    
    impl From<FeatureSource> for Feature {
        fn from(other: FeatureSource) -> Feature {
            match other {
                FeatureSource::Simple(feature) => Feature {
                    feature,
                    has_additional_impact: false,
                },
                FeatureSource::Explicit {
                    feature,
                    has_additional_impact,
                } => Feature {
                    feature,
                    has_additional_impact,
                },
            }
        }
    }
    

    FeatureSource is only used as an intermediate representation and converted to Feature before the rest of your code ever sees it.