function haskell hash lisp

Function hashing/fingerprinting as a built-in feature?


How can one get a stable hash of a function at runtime?

By "stable" I mean that the hash is the same across runs and changes only when the implementation changes. This applies recursively: if the function calls another function, the callee's hash feeds into the caller's hash.

I'm not sure which language has this feature. I'm looking for a practical programming language where this is trivial and performant.

I guess functional languages, perhaps Lisp or Haskell, are the usual suspects, but I'm unsure what this would look like.

function myFunction() {
    ... // Some code, possibly using names from other files/modules/libraries
}

// Prints the hash, which changes if anything in the implementation of `myFunction` changes and is stable across runs.
print(hash(myFunction))

Is there such a language? If so, I'd like an example of how this is written and an explanation of why it works.

Non-examples would be JS, Python, Java...


Solution

  • If you want a general-purpose programming language, Unison does exactly this pervasively:

    Each Unison definition is some syntax tree, and by hashing this tree in a way that incorporates the hashes of all that definition's dependencies, we obtain the Unison hash which uniquely identifies that definition.

    https://www.unisonweb.org/docs/tour

    Every Unison definition is identified by a 512-bit SHA3 hash and is immutable: you cannot modify a definition, only create a new one. Moreover, names are stored separately from definitions, so renaming is a trivial operation. If two people write structurally identical code that differs only in variable and function names, their code shares the same hash and is therefore identified as the same code.

    Among configuration languages, Dhall offers this as well:

    Use Dhall's support for semantic hashes to guarantee that many types of refactors are behavior-preserving

    https://dhall-lang.org/

    Because Dhall is not Turing-complete, every well-typed expression has a normal form, and the hash of that normal form identifies the expression. When you refactor Dhall code, an unchanged hash gives strong assurance that it still produces identical results.
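To make the Unison idea concrete, here is a minimal sketch in Python of hashing a definition together with its dependencies. It is not how Unison is implemented: the tuple-based term format, the `codebase` dictionary, and the `definition_hash` helper are all hypothetical, and recursion between definitions is simply rejected. It only illustrates the two properties described above: parameter and function names do not affect the hash, while a change in a callee propagates to every caller.

```python
import hashlib


def definition_hash(name, codebase, _seen=()):
    """Hash the definition bound to `name`, folding in the hashes of its dependencies.

    Hypothetical term format: a definition is (params, body); a body is an int,
    a string (parameter or reference to another definition), or a tuple
    (operator, arg1, arg2, ...).
    """
    if name in _seen:
        raise ValueError("recursive definitions are not handled in this sketch")
    params, body = codebase[name]
    # Parameters are identified by position, so their names never reach the hash.
    local = {p: i for i, p in enumerate(params)}

    def encode(node):
        if isinstance(node, int):
            return f"int:{node}"
        if isinstance(node, str):
            if node in local:
                return f"arg:#{local[node]}"
            # Reference to another definition: use its hash, not its name.
            return "ref:" + definition_hash(node, codebase, _seen + (name,))
        op, *args = node
        return "(" + op + " " + " ".join(encode(a) for a in args) + ")"

    text = f"lambda/{len(params)} " + encode(body)
    return hashlib.sha3_512(text.encode("utf-8")).hexdigest()


codebase = {
    "double": (["n"], ("*", "n", 2)),
    "twice": (["m"], ("*", "m", 2)),  # same structure, different names
    "quad": (["n"], ("call", "double", ("call", "double", "n"))),
}

# Same structure under different names gives the same hash.
print(definition_hash("double", codebase) == definition_hash("twice", codebase))  # True

# Changing the callee changes the caller's hash, although `quad` itself is untouched.
edited = {**codebase, "double": (["n"], ("+", "n", "n"))}
print(definition_hash("quad", codebase) == definition_hash("quad", edited))  # False
```

The sketch recomputes dependency hashes on every call; a real system would store each hash once, which is cheap precisely because definitions are immutable.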
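The Dhall approach, hashing the normal form rather than the source text, can be sketched the same way. This is a toy under stated assumptions, not Dhall's normalizer: the expression format, `normalize`, and `semantic_hash` are hypothetical, normalization here only inlines `let` bindings and folds constant arithmetic, and the normal form is serialized with `repr` purely for brevity.

```python
import hashlib


def normalize(expr, env=None):
    """Reduce a toy expression to normal form.

    Hypothetical format: an int, a variable name (str), ("let", name, value, body),
    or (op, left, right) with op in {"+", "*"}. There are no loops or recursion,
    so normalization always terminates.
    """
    env = env or {}
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):
        return env.get(expr, expr)  # unbound variables stay symbolic
    if expr[0] == "let":
        _, name, value, body = expr
        return normalize(body, {**env, name: normalize(value, env)})
    op, left, right = expr
    if op not in ("+", "*"):
        raise ValueError(f"unknown operator: {op}")
    left, right = normalize(left, env), normalize(right, env)
    if isinstance(left, int) and isinstance(right, int):
        return left + right if op == "+" else left * right
    return (op, left, right)


def semantic_hash(expr):
    """Hash the normal form, so behavior-preserving rewrites keep the same hash."""
    encoded = repr(normalize(expr)).encode("utf-8")
    return "sha256:" + hashlib.sha256(encoded).hexdigest()


original = ("*", ("+", "port", 1), ("*", 2, 3))
# Refactor: name the intermediate values; the normal form is unchanged.
refactored = ("let", "factor", 6,
              ("let", "next", ("+", "port", 1),
               ("*", "next", "factor")))
changed = ("*", ("+", "port", 2), 6)

print(semantic_hash(original) == semantic_hash(refactored))  # True
print(semantic_hash(original) == semantic_hash(changed))     # False
```

Because every expression in this toy language terminates, comparing hashes of normal forms is a decidable check. That guarantee is what a Turing-complete language like JS, Python, or Java cannot offer in general.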