Search code examples
diffterminologyconceptual

Term for diff/delta on multiple files or data structures


I would like to know whether there is a proper term to describe "diffing" of / obtaining the delta between multiple files or data structures, such that the resulting "diff" contains first a description of the parts common to all files/structures, then descriptions of how this "base" file/structure must be modified to obtain the individual ones, ideally in a hierarchical fashion if some files/structures are more similar to each other than others.

There are some questions and answers about how to do this with certain tools (e.g. DIFF utility works for 2 files. How to compare more than 2 files at a time?), but as I want to do this for a specific type of data structure (namely JSON), I'm at a loss as to what I should even search for.

This type of problem seems to me like it should be common enough to have a name such as "hierarchical diff" (which however seems to be reserved for 2-way diffs on hierarchical data structures), "commonality finding", or something like that.

I guess a related concept about hierarchical ordering of commonalities and differences is formal concept analysis, but this operates on sets of properties rather than hierarchical data structures and won't help me much.


Solution

  • There are multiple valid denominations :

    • Data comparison (or Sequence comparison)
    • Delta encoding
    • Delta compression (or Differential compression)

    Algorithms:

    Good Wikipedia links

    Some implementations