
Why does R use partial matching?


I know that for a list, partial matching is done when indexing with the basic operators $ and [[ (the latter only with exact = FALSE). For example:

ll <- list(yy=1)
ll$y
[1] 1

But I am still an R newbie, and partial matching of function arguments is new to me:

h <- function(xx = 2) xx
h(x = 2)
[1] 2

I want to understand how this works. What is the mechanism behind it? Does it have any side effects? And how can someone test whether the xx argument was given?

Edit after Andrie's comment:

Internally, R uses the pmatch algorithm to match arguments. Here is an example of how it works:

> pmatch("me", c("mean", "median", "mode")) # ambiguous: "me" matches both "mean" and "median", so NA is returned
[1] NA
> pmatch("mo", c("mean", "median", "mode")) # "mo" uniquely matches "mode"
[1] 3

But why does R have such a feature? What is the basic idea behind unique partial matching?


Solution

  • Partial matching exists to save you typing long argument names. The danger with it is that functions may gain additional arguments later on which conflict with your partial match. This means that it is only suitable for interactive use – if you are writing code that will stick around for a long time (to go in a package, for example) then you should always write the full argument name. The other problem is that by abbreviating an argument name, you can make your code less readable.
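    The risk described above can be sketched with a few versions of a made-up function. The names draw_v1, draw_v2a, draw_v2b, colour, columns and col are hypothetical, chosen only for illustration; they do not come from any real package.

```r
# Hypothetical "version 1": only one formal starts with "col".
draw_v1 <- function(colour = "black") colour

# Hypothetical "version 2a": a new formal, columns, also starts with "col".
draw_v2a <- function(colour = "black", columns = 1) colour

# Hypothetical "version 2b": a new formal named exactly "col".
draw_v2b <- function(colour = "black", col = 1) colour

# Works: "col" is a unique prefix of "colour".
res_v1 <- draw_v1(col = "red")

# Fails loudly: "col" is now a prefix of two formals.
res_v2a <- tryCatch(draw_v2a(col = "red"),
                    error = function(e) conditionMessage(e))

# Changes meaning silently: "col" now matches the new formal exactly,
# so colour falls back to its default.
res_v2b <- draw_v2b(col = "red")
```

    The first failure mode is at least visible as an error. The second is the more troublesome one, because the call keeps running and simply does something different.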

    Two common good uses are:

    1. len instead of length.out with the seq (or seq.int) function.

    2. all instead of all.names with the ls function.

    Compare:

    seq.int(0, 1, len = 11) 
    seq.int(0, 1, length.out = 11)
    
    ls(all = TRUE)
    ls(all.names = TRUE)
    

    In both of these cases, the code is just about as easy to read with the shortened argument names, and the functions are old and stable enough that another argument with a conflicting name is unlikely to be added.

    A better solution for saving on typing is, rather than using abbreviated names, to use auto-completion of variable and argument names. R GUI and RStudio support this using the TAB key, and Architect supports this using CTRL+Space.


    Some relevant sections of R Language Definition:

    3.4.1 Indexing by vectors

    ...assume that the expression is x[i]. Then the following possibilities exist according to the type of i

    Character. The strings in i are matched against the names attribute of x and the resulting integers are used. For [[ and $ partial matching is used if exact matching fails, so x$aa will match x$aabb if x does not contain a component named "aa" and "aabb" is the only name which has prefix "aa". For [[, partial matching can be controlled via the exact argument which defaults to NA indicating that partial matching is allowed, but should result in a warning when it occurs. Setting exact to TRUE prevents partial matching from occurring, a FALSE value allows it and does not issue any warnings. Note that [ always requires an exact match. The string "" is treated specially: it indicates ‘no name’ and matches no element (not even those without a name). Note that partial matching is only used when extracting and not when replacing.

    [see also ?Extract]
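    A small sketch of the indexing rules quoted above, using the list x and the names aabb and cc as illustrative assumptions. In current versions of R, ?Extract documents exact = TRUE as the default for [[, so the sketch passes exact explicitly rather than relying on any default.

```r
x <- list(aabb = 1, cc = 2)

by_dollar  <- x$aa                      # 1: "aa" is a unique prefix of "aabb"
by_exact   <- x[["aa", exact = TRUE]]   # NULL: no element is named exactly "aa"
by_partial <- x[["aa", exact = FALSE]]  # 1: partial matching allowed, no warning
by_single  <- x["aa"]                   # [ always requires an exact match

# Replacement never uses partial matching: this adds a new element "aa"
# and leaves "aabb" untouched.
x$aa <- 99
new_names <- names(x)
```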

    4.3.2 Argument matching

    The first thing that occurs in a function evaluation is the matching of formal to the actual or supplied arguments. This is done by a three-pass process:

    1. Exact matching on tags. For each named supplied argument the list of formal arguments is searched for an item whose name matches exactly. It is an error to have the same formal argument match several actuals or vice versa.

    2. Partial matching on tags. Each remaining named supplied argument is compared to the remaining formal arguments using partial matching. If the name of the supplied argument matches exactly with the first part of a formal argument then the two arguments are considered to be matched. It is an error to have multiple partial matches. Notice that if f <- function(fumble, fooey) fbody, then f(f = 1, fo = 2) is illegal, even though the 2nd actual argument only matches fooey. f(f = 1, fooey = 2) is legal though since the second argument matches exactly and is removed from consideration for partial matching. If the formal arguments contain ... then partial matching is only applied to arguments that precede it.

    3. Positional matching.
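    The three passes can be tried out directly. The sketch below reuses f from the quoted passage (with a body filled in purely for illustration) and adds three hypothetical helpers, g, h2 and k, to show how ... limits partial matching and how to check whether an argument was supplied, which is what the question asks about xx.

```r
# The function from the quoted passage, with a body that returns its arguments.
f <- function(fumble, fooey) c(fumble = fumble, fooey = fooey)

# Legal: fooey is matched exactly in pass 1, so "f" can only match fumble.
legal <- f(f = 1, fooey = 2)

# Illegal: in pass 2, "f" is a prefix of both fumble and fooey.
illegal <- tryCatch(f(f = 1, fo = 2),
                    error = function(e) conditionMessage(e))

# Formals after ... must be named in full; "sep" is absorbed by ... instead.
g <- function(..., separator = " ") names(list(...))
in_dots <- g(sep = "-")

# missing() reports whether a formal was supplied, however it was spelled.
h2 <- function(xx = 2) missing(xx)
given     <- h2(x = 2)   # FALSE: x partially matched xx
not_given <- h2()        # TRUE: the default is used

# match.call() returns the call with argument names expanded in full.
k <- function(xx = 2) match.call()
expanded <- k(x = 2)
```

    Because matching happens before the function body runs, missing(xx) and match.call() see the full formal name regardless of the abbreviation used in the call.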


    Note that tibbles behave differently when subsetting:

    Partial matching of column names with $ and [[ is not supported, and NULL is returned. For $, a warning is given.