
Comments are not preserved when merging two YAML files with the deepmerge Python module and ruamel.yaml


Code:

from deepmerge import always_merger
import ruamel.yaml

fileA = "source.yaml"
fileB = "dest.yaml"

yaml = ruamel.yaml.YAML()

with open(fileA, 'r') as f:
    fileAdictionary = yaml.load(f)

with open(fileB, 'r') as f:
    fileBdictionary = yaml.load(f)

result = always_merger.merge(fileAdictionary, fileBdictionary)
with open('output.yaml', 'w') as f:
    yaml.dump(result, f)

source.yaml

element:
    connection:
        test: true

dest.yaml

element:
    connection:
        test: true
    networkPolicy:
        enabled: true
        # network policy has been enabled
    test_str_param: "abc"
    # comment for string parameter
    test_int_param: 10
    # comment for the integer parameter
    test_bool_param: true
    # comment for the boolean parameter

Actual Output

output.yaml

element:
  connection:
    test: true
  networkPolicy:
    enabled: true
        # network policy has been enabled
  test_str_param: abc
  test_int_param: 10
  test_bool_param: true

Problem Description

As you can see in output.yaml, the comments for the elements test_str_param, test_int_param and test_bool_param are not preserved or carried forward from dest.yaml.

Expectation

What needs to be done so that the comments for all parameters are preserved in the final output.yaml?

Expected Output

element:
    connection:
        test: true
    networkPolicy:
        enabled: true
        # network policy has been enabled
    test_str_param: "abc"
    # comment for string parameter
    test_int_param: 10
    # comment for the integer parameter
    test_bool_param: true
    # comment for the boolean parameter

Solution

  • What you load from your input are CommentedMap instances, which are subclasses of dict. deepmerge handles them as plain dicts and does nothing special for the comments. As a result, you lose the comments when keys are merged into an already existing CommentedMap (like fileAdictionary['element']), but not when the merged-in value is itself a CommentedMap that does not exist in fileAdictionary yet (i.e. there is no fileAdictionary['element']['networkPolicy']).

    deepmerge allows you to add your own strategies, but I am not sure what the best/recommended procedure for adding new types is:

    import sys
    from pathlib import Path
    # from deepmerge import always_merger
    import deepmerge
    import ruamel.yaml
    RYCM = ruamel.yaml.comments.CommentedMap
    
    class CommentedMapStrategies(deepmerge.strategy.core.StrategyList):
        NAME = 'CommentedMap'
    
        @staticmethod
        def strategy_merge(config, path, base, nxt):
            for k, v in nxt.items():
                if k not in base:
                    base[k] = v
                else:
                    base[k] = config.value_strategy(path + [k], base[k], v)
            try:
                for k, v in nxt.ca.items.items():
                    base.ca.items[k] = v
            except AttributeError:
                pass
        
            return base
    
        @staticmethod
        def strategy_override(config, path, base, nxt):
            """
            discard base and return nxt as-is, so nxt
            wins on any conflict.
            """
            return nxt
    
    # insert as it needs to be before 'dict'
    deepmerge.DEFAULT_TYPE_SPECIFIC_MERGE_STRATEGIES.insert(0, (RYCM, 'merge'))
    Merger = deepmerge.merger.Merger
    Merger.PROVIDED_TYPE_STRATEGIES[RYCM] = CommentedMapStrategies
    
    always_merger = Merger(deepmerge.DEFAULT_TYPE_SPECIFIC_MERGE_STRATEGIES, ['override'], ['override'])
    
    fileA = Path('source.yaml')
    fileB = Path('dest.yaml')
    
    yaml = ruamel.yaml.YAML()
    yaml.indent(mapping=4)
    result = always_merger.merge(yaml.load(fileA), yaml.load(fileB))
    yaml.dump(result, sys.stdout)
    

    which gives:

    element:
        connection:
            test: true
        networkPolicy:
            enabled: true
            # network policy has been enabled
        test_str_param: abc
        # comment for string parameter
        test_int_param: 10
        # comment for the integer parameter
        test_bool_param: true
        # comment for the boolean parameter
    

    Depending on where you have comments in your YAML documents, you might have to amend/complete the comment copying in strategy_merge.

    Please note that the above relies on CommentedMap internals that might change, so pin the ruamel.yaml version and test before upgrading it.
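
To see why a dict-oriented merge drops the comments in the first place, the mechanism can be imitated without either library. The sketch below is only an illustration: `AnnotatedDict` and its `notes` attribute are hypothetical stand-ins for CommentedMap and its `.ca.items`, not part of ruamel.yaml or deepmerge, and the two merge functions are simplified assumptions about how a key-by-key merge behaves.

```python
# Hypothetical stand-in for CommentedMap: a dict subclass that keeps
# per-key comments in a side attribute (here called `notes`).
class AnnotatedDict(dict):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.notes = {}  # key -> comment text


def plain_merge(base, nxt):
    """Merge the way a dict-only strategy does: keys and values only."""
    for key, value in nxt.items():
        if isinstance(base.get(key), dict) and isinstance(value, dict):
            plain_merge(base[key], value)
        else:
            base[key] = value
    return base


def comment_aware_merge(base, nxt):
    """Same merge, but also copy the per-key comments from nxt to base."""
    for key, value in nxt.items():
        if isinstance(base.get(key), dict) and isinstance(value, dict):
            comment_aware_merge(base[key], value)
        else:
            base[key] = value
    # Plain dicts carry no `notes`, so guard the lookup with getattr.
    base_notes = getattr(base, "notes", None)
    if base_notes is not None:
        base_notes.update(getattr(nxt, "notes", {}))
    return base


def make_pair():
    """Build a fresh (source, dest) pair; only dest carries a comment."""
    source = AnnotatedDict(test=True)
    dest = AnnotatedDict(test=True, test_int_param=10)
    dest.notes["test_int_param"] = "# comment for the integer parameter"
    return source, dest


source, dest = make_pair()
lossy = plain_merge(source, dest)          # values arrive, comments do not

source, dest = make_pair()
kept = comment_aware_merge(source, dest)   # values and comments arrive

print(lossy, lossy.notes)
print(kept, kept.notes)
```

The values end up identical in both results; only the side attribute differs. That matches what the custom strategy_merge above does when it copies the entries of nxt.ca.items after merging the keys.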