go yaml marshalling

How should I write and unmarshal a base64-encoded value in YAML?


Part of my YAML file:

rules:
- action:
    count: {}
  name: rulenumbertwo
  priority: 123
  statement:
    bytematchstatement:
      fieldtomatch:
        singleheader:
          name: foobar
      positionalconstraint: CONTAINS
      searchstring: [102, 105, 122, 122, 98, 117, 122, 122]

The above comes out as the corresponding ASCII string ("fizzbuzz") for the searchstring value. But how can I write it in a human-readable format? The manual says that the searchstring should be "automatically" converted, but unmarshaling throws an error if I do not write it as an array of ASCII codes.

My code for unmarshaling it (this works if I write the array directly, but that is not human-readable):

d := &wafv2.UpdateRuleGroupInput{}
err = yaml.Unmarshal(buf, d)

I would like to write the YAML simply as follows:

rules:
- action:
    count: {}
  name: rulenumbertwo
  priority: 123
  statement:
    bytematchstatement:
      fieldtomatch:
        singleheader:
          name: foobar
      positionalconstraint: CONTAINS
      searchstring: fizzbuzz

Or possibly use a template (but why, when it should be easy?):

searchstring: {{ convertStringToByteArray "fizzbuzz" }}

Solution

  • The type ByteMatchStatement, which contains the SearchString field, declares nothing that customizes how it loads from YAML, so it can't load a []byte from a YAML scalar. The manual you link describes the semantics of the type, but it defines neither how the data is stored (depending on the use case, it does make sense to store something that is a string to the user as []byte) nor how it can be loaded from YAML. Generally, (de)serialization tends to expose facts about the data model that would otherwise be implementation details, and that is what hits you here.

    The simple fix would be to declare some type SearchStringType []byte, use it for the SearchString field and declare a custom UnmarshalYAML method there to handle the conversion to []byte. However, that is not possible in your code since you don't have control over the types.

    One possible, albeit somewhat crude, fix is to pre-process the YAML input:

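    // Requires "strconv" and "gopkg.in/yaml.v3", plus the wafv2 package
    // from the AWS SDK for Go.

    // LoadableUpdateRuleGroupInput is a local type with the same layout as the
    // SDK type, so that a custom UnmarshalYAML method can be attached to it.
    type LoadableUpdateRuleGroupInput wafv2.UpdateRuleGroupInput
    
    func (l *LoadableUpdateRuleGroupInput) UnmarshalYAML(n *yaml.Node) error {
        preprocSearchString(n)
        // Decode into the SDK type, which has no UnmarshalYAML method,
        // so this call does not recurse.
        return n.Decode((*wafv2.UpdateRuleGroupInput)(l))
    }
    
    // preprocSearchString walks the node tree and rewrites the value of every
    // "searchstring" key into a sequence of byte values.
    func preprocSearchString(n *yaml.Node) {
        switch n.Kind {
        case yaml.SequenceNode:
            for _, item := range n.Content {
                preprocSearchString(item)
            }
        case yaml.MappingNode:
            // Mapping content alternates between keys and values.
            for i := 0; i+1 < len(n.Content); i += 2 {
                if n.Content[i].Kind == yaml.ScalarNode &&
                    n.Content[i].Value == "searchstring" {
                    toByteSequence(n.Content[i+1])
                } else {
                    preprocSearchString(n.Content[i+1])
                }
            }
        }
    }
    
    // toByteSequence turns a scalar node into a sequence node holding one
    // integer scalar per byte of the original value.
    func toByteSequence(n *yaml.Node) {
        if n.Kind != yaml.ScalarNode {
            return // already written as a list of byte values; leave it alone
        }
        raw := []byte(n.Value)
        n.Kind = yaml.SequenceNode
        n.Tag = "!!seq"
        n.Value = ""
        n.Content = make([]*yaml.Node, len(raw))
        for i := range raw {
            n.Content[i] = &yaml.Node{Kind: yaml.ScalarNode, Value: strconv.Itoa(int(raw[i]))}
        }
    }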
    

    By using this new type when unmarshaling, a scalar searchstring value is rewritten into the YAML sequence node you would otherwise have written by hand. Decoding that node then loads properly into []byte:

    d := &LoadableUpdateRuleGroupInput{}
    err = yaml.Unmarshal(buf, d)
    

    This code requires gopkg.in/yaml.v3; v2 does not give you as much control over unmarshaling.