Search code examples
rfunctional-programmingcomputer-science

Which programming languages implement the concept of a function inverse, such as names() in R


In the R programming language, some functions can either return a value, or they can set that value if an assignment is made to them. In the below example, we create a named list, and use the names() function to get a vector of those names:

> ll <- list(x = 1, y = 2, z = "whatever") # create a list
> names(ll)
[1] "x" "y" "z"

But I can use the same function to set those names, in a very interesting way. I assign a new vector to the exact same form as above:

> names(ll) <- c("a", "b", "c")
> names(ll)
[1] "a" "b" "c"

Is there some kind of quirky R magic going on here? Or is this a technique in computer science which one can see in other (esoteric?) languages? I'm interested in DSLs and this idea appears quite powerful and I'd like to research it further. It's as if you're saying "give me the input to the function such that the output is this".

This doesn't work but imagine it did:

> f <- function(x) x + 1
> f(2)
[1] 3
> z <- 3
> f(z) <- 2
Error in f(z) <- 2 : could not find function "f<-"
> z
[1] 3

I wanted z to be equal to 1, because f(1) is 2.

This idea maps closely to the concept of an inverse function in mathematics. Not all functions have an inverse, of course, but since programming often has mathematical underpinnings, I wonder if the concept is explored further in any other programming languages.


Solution

  • Generalized references - Common Lisp

    Your first example is more about the concpet of lvalues in C/C++, and in the case of Common Lisp, places. Generalized references are based on macroexpansion and can be extended by the programmer.

    Let's say you build a cons cell (cons 0 1), and let's say it is bound to a local variable named x. A cons cell is just a small structure with two slots, with accessors car and cdr. For example, (car x) is 0 and (cdr x) is 1. Typically, lists are built by chaining cons-cells where the cdr is a sublist.

    The historical way to mutate the slots is by calling RPLACA/RPLACD functions (replace car, replace cdr). The SETF exapansion mechanism is a way to talk about places and how to affect them. In the case of cons-cells, you have two writer functions whose names are (setf car) and (setf cdr); the names are literally lists of two elements (this is the only case where a function name is not a symbol).

    Then, you can write (setf (car x) 2) to mutate x so that it holds the value 2. This is macroexpanded as a call to RPLACA by setf-expansion.

    Other macros are built on top of setf, and generally are named with a -f suffix, like incf:

    (incf (cdr x))
    

    The above increments the value of the CDR of X. setf also trivially works for setting local variables.

    What is interesting is that the mechanism can be composed; the accessors for hash table is (gethash <key> <table> &optional <default-value>); the accessor for arrays is (aref <array> ... <subscripts>). You could write:

    (setf (aref (gethash key table) index)
          new-value)
    

    And the above would change the value at position index in the array associated with key in table.

    The composition is efficient, because the expansion only traverse the nested data-structures down to the point where the structure shall be modified; For example, if you mutate a tree tree equal to:

    (root-node (node-a 0 1) (node-b 2 3))
    

    Then the value 2 is the first child of the second child of the root node, which in terms of list positions is written as follows:

    (second (third tree)) 
    => 0
    

    If you want to increment that value, you write:

    (incf (second (third tree)))
    

    And INCF is clever enough to only traverse the list once; this is the result of macroexpansion:

    (LET* ((#:LIST (CDR (THIRD TREE))) (#:NEW (+ 1 (CAR #:LIST))))
      (SB-KERNEL:%RPLACA #:LIST #:NEW))
    

    This mechanism can be extended by calling define-setf-expander; For example, the Cells library implements a kind of constraint propagation mechanism, like spreadsheets formula (dataflow, reactive programming), where the change in an object's slot value is propagated down to users of that slot value (http://stefano.dissegna.me/cells-tutorial.html). But for users, it is only a matter of calling (setf (slot object) value), which abstracts the underlying magic.

    Constraint programming - Prolog

    Prolog and more generally constraint programming are notorious for allowing relationships to be called in mutliple directions (example with https://eclipseclp.org/ interpreter):

    lib(fd).
    f(X,R) :- R #= X + 1.
    

    Case where X is ground and R left variable:

    [eclipse 3]: f(3,R).
    
    R = 4
    Yes (0.00s cpu)
    

    Case where X is variable and R ground:

    [eclipse 4]: f(X,4).
    
    X = 3
    Yes (0.00s cpu)
    

    Case with both being left variables:

    [eclipse 5]: f(X,Y).
    
    X = X{[-10000000 .. 9999999]}
    Y = Y{[-9999999 .. 10000000]}
    
    Delayed goals:
        -1 - X{[-10000000 .. 9999999]} + Y{[-9999999 .. 10000000]} #= 0
    

    Case where both are ground:

    [eclipse 6]: f(5,10).
    
    No (0.00s cpu)