C# YamlDotNet: deserialize a scalar or a sequence in YAML dynamically


I am using the YamlDotNet library, and I want to deserialize thousands of YAML files using a common data structure class. A sample class is given below. The class Jobs contains a field Pool which can be a scalar or a sequence, i.e. either a string or an object. How can I deserialize it dynamically, depending on the form pool takes?

using YamlDotNet.Serialization;

public class Jobs
{
    [YamlMember(Alias = "job", ApplyNamingConventions = false, Order = 0)]
    public string Job { get; set; }

    [YamlMember(Alias = "displayName", ApplyNamingConventions = false, Order = 1)]
    public string DisplayName { get; set; }

    [YamlMember(Alias = "pool", ApplyNamingConventions = false, Order = 3)]
    public Pool Pool { get; set; }
}

public class Pool
{
    [YamlMember(Alias = "name", ApplyNamingConventions = false, Order = 0)]
    public string Name { get; set; }
}

Many files look like this:

Jobs:
-  job: Job1
   displayName: DisplayName1
   pool:
    - name: firstPool   

while other files look like this:

Jobs:
-  job: Job2
   displayName: DisplayName2
   pool: secondPool   

How can I deserialize the YAML files dynamically, given that pool can be either a string or an object?


Solution

  • Based on your examples, it seems you want to allow the following two things:

    • Where a list of items is expected, the YAML document may contain either a sequence or just a single item.
    • Where a complex object is expected, the YAML document may contain either the object structure (as mappings), or just a scalar (if that is sufficient to create/define the object).

    You can solve this by providing your own implementations of IYamlTypeConverter. Once defined, the converters can be easily registered while constructing the deserializer:

    var deserializer = new DeserializerBuilder()
        .WithNamingConvention(CamelCaseNamingConvention.Instance)
        .WithTypeConverter(new MySpecialTypeConverter())
        .Build();
    

    I have recently written such adapters in one of my open source projects.


    The List Adapter

    For the list aspect, declare a YAML type converter as a generic class with item type T. In its Accepts method, indicate it can handle IEnumerable<T> or similar basic list types, based on your data model.

    You then have to check whether you are looking at the start of a sequence. If so, read the sequence, and collect the items one by one. Otherwise, just read the single element and act as if you had read a sequence with just one element.

    In both cases, you can deserialize each item by calling a YAML deserializer again. That deserializer should contain all of your type converters except the one you are in, which prevents an infinite recursion.

    public object? ReadYaml(IParser parser, Type type)
    {
        // deserializer with all converters except this one (construction elided)
        var deserializer = ...
    
        if (parser.TryConsume<SequenceStart>(out _))
        {
            var items = new List<T>();
            // read until end of sequence
            while (!parser.TryConsume<SequenceEnd>(out _))
            {
                // skip comments
                if (parser.TryConsume<Comment>(out _))
                {
                    continue;
                }
    
                var item = deserializer.Deserialize<T>(parser);
                items.Add(item);
            }
    
            return CreateReturnValue(type, items);
        }
    
        var singleValue = deserializer.Deserialize<T>(parser);
        if (singleValue == null)
        {
            return null;
        }
    
        return CreateReturnValue(type, new[] { singleValue });
    }
    

    Note that the results are passed to a private method called CreateReturnValue. Its implementation depends on which list types you want to support in your data model. If you only have arrays, return an array; if some properties are of type T[], others of type List<T>, and so on, you may have to instantiate a different collection depending on the target type.
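
    As a rough illustration, CreateReturnValue could look like the sketch below. It assumes the data model only uses T[], List<T>, IList<T>, IReadOnlyList<T> and IEnumerable<T>. The ListTargets<T> class and its IsSupported method are hypothetical names and not part of YamlDotNet. The same check could back the converter's Accepts method, so that both stay in sync.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper; the names are illustrative and not part of YamlDotNet.
public static class ListTargets<T>
{
    // Collection types this sketch supports. A converter's Accepts method
    // could delegate to this check.
    public static bool IsSupported(Type type) =>
        type == typeof(T[]) ||
        type == typeof(List<T>) ||
        type == typeof(IList<T>) ||
        type == typeof(IReadOnlyList<T>) ||
        type == typeof(IEnumerable<T>);

    // Builds a collection instance that fits the declared target type.
    public static object CreateReturnValue(Type type, IEnumerable<T> items)
    {
        if (type == typeof(T[]))
        {
            return items.ToArray();
        }

        if (IsSupported(type))
        {
            // A List<T> can be assigned to List<T>, IList<T>,
            // IReadOnlyList<T> and IEnumerable<T> alike.
            return items.ToList();
        }

        throw new NotSupportedException($"Unsupported list type: {type}");
    }
}
```

    Any other collection type in your model (for example a set) would need its own branch.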

    You'll have to register an instance of this converter type for each of the item types you wish to support when constructing your deserializers.
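
    One possible way to handle both the registration of several converters and the "all converters except the current one" rule is a small factory like the following. DeserializerFactory is a made-up helper, not a YamlDotNet type, and the camel-case naming convention is only carried over from the registration snippet above.

```csharp
using System.Collections.Generic;
using YamlDotNet.Serialization;
using YamlDotNet.Serialization.NamingConventions;

// Hypothetical helper; not part of YamlDotNet.
public static class DeserializerFactory
{
    // Builds a deserializer with every converter in `converters` except `excluded`.
    // Pass null for `excluded` to get the outer deserializer with all converters.
    public static IDeserializer Build(
        IEnumerable<IYamlTypeConverter> converters,
        IYamlTypeConverter? excluded = null)
    {
        var builder = new DeserializerBuilder()
            .WithNamingConvention(CamelCaseNamingConvention.Instance);

        foreach (var converter in converters)
        {
            // Skip the converter that is asking for an inner deserializer.
            if (!ReferenceEquals(converter, excluded))
            {
                builder = builder.WithTypeConverter(converter);
            }
        }

        return builder.Build();
    }
}
```

    A converter could then obtain its inner deserializer by passing itself as the excluded instance. Keep in mind that values of the excluded converter's own type nested deeper in the document will not pass through that converter either.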


    The Object Adapter

    The type converter for individual objects is simpler, but depends on the object type, as you have to specify how to create an object based on a scalar value.

    Once again, create a deserializer with all of your converters except the one you are in.

    public object? ReadYaml(IParser parser, Type type)
    {
        // deserializer with all converters except this one (construction elided)
        var deserializer = ...
    
        if (parser.TryConsume<Scalar>(out var scalar))
        {
            return new MyObject(scalar.Value);
        }
    
        return deserializer.Deserialize<MyObject>(parser);
    }
    

    The example above assumes you want to read objects of type MyObject, which can be instantiated by passing a string scalar to the constructor. The Accepts method has to indicate that the converter handles MyObject.
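
    Applied to the Pool class from the question, a complete converter might look roughly like this. It is a sketch under a few assumptions: PoolConverter is a hypothetical name, the two-parameter ReadYaml and WriteYaml signatures follow the snippets above (newer YamlDotNet releases may use different signatures), and writing always emits the short scalar form.

```csharp
using System;
using YamlDotNet.Core;
using YamlDotNet.Core.Events;
using YamlDotNet.Serialization;

public class Pool
{
    [YamlMember(Alias = "name", ApplyNamingConventions = false)]
    public string Name { get; set; } = "";
}

// Hypothetical converter: reads a Pool from a plain scalar or from a mapping.
public sealed class PoolConverter : IYamlTypeConverter
{
    // Inner deserializer WITHOUT this converter, to avoid infinite recursion.
    private static readonly IDeserializer Inner = new DeserializerBuilder().Build();

    public bool Accepts(Type type) => type == typeof(Pool);

    public object? ReadYaml(IParser parser, Type type)
    {
        // Short form, e.g. "pool: secondPool".
        if (parser.TryConsume<Scalar>(out var scalar))
        {
            return new Pool { Name = scalar.Value };
        }

        // Long form: a mapping with a "name" key.
        return Inner.Deserialize<Pool>(parser);
    }

    public void WriteYaml(IEmitter emitter, object? value, Type type)
    {
        // Assumption of this sketch: always write the short scalar form.
        var name = (value as Pool)?.Name ?? string.Empty;
        emitter.Emit(new Scalar(name));
    }
}
```

    Registered via .WithTypeConverter(new PoolConverter()), this should accept both pool: secondPool and a pool mapping with a name key. The sequence form from the first sample file (a list entry below pool) would additionally need the list adapter and a collection-typed property.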