Unusual Behaviour of colon operator : in R

2000:2017

The expected output is a vector of the sequence 2000 to 2017 with a step of 1.

Output: 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017

'2000':'2017'

However, when I type this command, it still gives me the same output.

Output: 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017

I am unable to understand how it generates a sequence from character strings.

Edit 1:

Ultimately, I am trying to understand why the code below works. How can X2007:X2011 possibly work? The select function is from the dplyr package.

[Image: R code]

My data has column names similar to those in the image above, but without the 'X'. I just have years like 2007, 2008, etc.

For me select(Division, State, 2007:2011) does not work.

Error: Can't subset columns that don't exist. x Locations 2007, 2008, 2009, 2010, and 2011 don't exist.

But this works: select(Division, State, '2007':'2011').

Solution

If we check the more general seq.default, it converts from and to from character to numeric:

...
if (!missing(from) && !is.finite(if (is.character(from)) from <- as.numeric(from) else from)) 
        stop("'from' must be a finite number")
    if (!missing(to) && !is.finite(if (is.character(to)) to <- as.numeric(to) else to)) 
...

Along the same lines, the documentation of ?`:` says so:

For other arguments from:to is equivalent to seq(from, to), and generates a sequence from from to to in steps of 1 or -1. Value to will be included if it differs from from by an integer up to a numeric fuzz of about 1e-7. Non-numeric arguments are coerced internally (hence without dispatching methods) to numeric—complex values will have their imaginary parts discarded with a warning.
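The coercion described in that passage can be checked directly. This is a minimal sketch in base R; the variable names are only illustrative, and the last step assumes that strings which cannot be parsed as numbers make the operator fail instead of producing a sequence.

```r
# Character endpoints are coerced to numeric before the sequence is built.
from_chars <- "2000":"2017"
from_nums  <- 2000:2017

# Compare length and values of the two results.
same_values <- length(from_chars) == length(from_nums) &&
  all(from_chars == from_nums)
print(same_values)

# Doing the coercion by hand gives the same sequence.
explicit <- seq(as.numeric("2000"), as.numeric("2017"))
print(all(explicit == from_nums))

# Strings that are not numbers become NA, and `:` refuses NA endpoints.
bad <- suppressWarnings(tryCatch("a":"b", error = function(e) "error"))
print(bad)
```

So quoting the years changes nothing at the console: `:` only ever sees the numbers 2000 and 2017.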

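Regarding the updated question with subset and select: inside select, a bare 2007:2011 is treated as column positions, which is why the error complains that locations 2007 to 2011 don't exist, whereas the quoted '2007':'2011' is matched against column names. A column name that starts with a digit is a non-standard name, and the usual way to refer to it is with backquotes: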

df1 <- data.frame(`2007` = 1:5, `2008` = 6:10, 
      `2012` =  11:15, v1 = rnorm(5), check.names = FALSE)
subset(df1, select = `2007`:`2012`)
#  2007 2008 2012
#1    1    6   11
#2    2    7   12
#3    3    8   13
#4    4    9   14
#5    5   10   15

Or with dplyr::select

library(dplyr)
select(df1, `2007`:`2012`)
#   2007 2008 2012
#1    1    6   11
#2    2    7   12
#3    3    8   13
#4    4    9   14
#5    5   10   15
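To see why a name range works in subset at all, it helps to reproduce what the select argument is evaluated against. The sketch below assumes that subset maps each column name to its position and evaluates the expression in that list, which is how base R's data frame method appears to be written; `positions` and `idx` are hypothetical names, and v1 uses fixed values instead of rnorm so the result is reproducible.

```r
df1 <- data.frame(`2007` = 1:5, `2008` = 6:10,
                  `2012` = 11:15, v1 = 5:1, check.names = FALSE)

# Map every column name to its position.
positions <- as.list(seq_along(df1))
names(positions) <- names(df1)

# The backquoted names resolve to positions 1 and 3, so `:` yields 1 2 3.
idx <- eval(quote(`2007`:`2012`), positions)
print(idx)

# Indexing by those positions selects the same columns as the subset call.
print(names(df1[idx]))
```

Without the backquotes, 2007 and 2012 are ordinary numbers, so the same expression would ask for positions 2007 to 2012 instead of looking up names.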

If the names have X at the beginning (this happens when the data is read without check.names = FALSE, since the default is TRUE, or when the dataset is created with data.frame, where check.names = TRUE is also the default), those names can be used directly:

df1 <- data.frame(`2007` = 1:5, `2008` = 6:10, `2012` =  11:15, v1 = rnorm(5))
subset(df1, select = X2007:X2012)
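The effect of check.names is easy to confirm by comparing the resulting names. A small sketch in base R; the object names `with_check` and `without_check` are illustrative only.

```r
# Default: names are passed through make.names(), which prefixes "X"
# to names that start with a digit.
with_check <- data.frame(`2007` = 1:5, `2008` = 6:10)
print(names(with_check))

# check.names = FALSE keeps the names exactly as given.
without_check <- data.frame(`2007` = 1:5, `2008` = 6:10, check.names = FALSE)
print(names(without_check))

# make.names() leaves syntactically valid names untouched.
fixed <- make.names(c("2007", "State"))
print(fixed)
```

This is why the code in the question's image can use X2007:X2011 without backquotes, while data read with the original year names needs `2007`:`2011`.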