
Obtain the publication date of a package's first version


The documentation of an R package only includes the date of its last update/publication. Version numbering does not follow a pattern common to all packages. It is therefore quite difficult to tell at a glance whether a package is old or new. Sometimes you need to choose between two packages with similar functions, and knowing the age of each could guide the decision.

My first approach was to plot downloads per year by tracking CRAN downloads. This method also shows the relative popularity/usage of a package. However, it requires a lot of memory and time to process. I would therefore prefer a faster way to look into the history of a single package.

Is there a quick way to find or visualize the release date of the first version of a specific package, or even to compare several packages at once?

The purpose is to make it easier to build a mental map of all the packages available in R, especially for newcomers. Getting to know packages and managing them is probably the main reason people give up on R.


Solution

  • Just for fun:

    ## not all repositories have the same archive structure!
    archinfo <- function(pkgname,repos="http://www.cran.r-project.org") {
        pkg.url <- paste(contrib.url(repos),"Archive",pkgname,sep="/")
        r <- readLines(pkg.url)
        ## lame scraping code
        r2 <- gsub("<[^>]+>"," ",r)   ## drop HTML tags
        r2 <- r2[-(1:grep("Parent Directory",r2))]  ## drop header
        r2 <- r2[grep(pkgname,r2)]                  ## drop footer
        strip.white <- function(x) gsub("(^ +|  +$)","",x)
        r2 <- strip.white(gsub("&nbsp;","",r2))     ## more cleaning
        r3 <- do.call(rbind,strsplit(r2," +"))      ## pull out data frame
        data.frame(
            pkgvec=gsub(paste0("(",pkgname,"_|\\.tar\\.gz)"),"",r3[,1]),
            pkgdate=as.Date(r3[,2],format="%d-%b-%Y"),
            ## assumes English locale for month abbreviations
            size=r3[,4])
    }
    AERinfo <- archinfo("AER")
    lme4info <- archinfo("lme4")
    comb <- rbind(data.frame(pkg="AER",AERinfo),
                  data.frame(pkg="lme4",lme4info))
    

    We can't compare version numbers directly across packages because everyone uses a different numbering scheme, so number each package's releases sequentially instead ...

    library(dplyr) ## overkill
    ## number each package's releases 1, 2, ... in archive-listing order
    comb2 <- comb %>% group_by(pkg) %>% mutate(numver=seq(n()))
    

    If you want to arrange by package date:

    comb2 <- arrange(comb2,pkg,pkgdate)
    

    Pretty pictures ...

    library(ggplot2); theme_set(theme_bw())
    ggplot(comb2,aes(x=pkgdate,y=numver,colour=pkg))+geom_line()
    

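To answer the original question directly, the first release of each package is just the row with the earliest `pkgdate`. Here is a minimal base-R sketch, assuming a data frame shaped like `comb` above (columns `pkg`, `pkgvec`, `pkgdate`). The `toy` data and the helper name `first_release` are made up for illustration, so that the example runs without network access:

```r
## Toy stand-in for `comb`: made-up packages, versions and dates,
## for illustration only (not real CRAN data)
toy <- data.frame(
    pkg     = c("pkgA", "pkgA", "pkgA", "pkgB", "pkgB"),
    pkgvec  = c("0.2-0", "0.1-0", "1.0-0", "1.1", "1.0"),
    pkgdate = as.Date(c("2010-06-01", "2009-01-15", "2012-03-20",
                        "2015-09-30", "2014-02-10")),
    stringsAsFactors = FALSE
)

## Hypothetical helper: one row per package, holding its earliest release
first_release <- function(d) {
    d <- d[order(d$pkg, d$pkgdate), ]   # oldest release first within each package
    d <- d[!duplicated(d$pkg), ]        # keep only the first row per package
    ## approximate age in years, counted from today
    d$age_years <- as.numeric(difftime(Sys.Date(), d$pkgdate,
                                       units = "days")) / 365.25
    rownames(d) <- NULL
    d
}

first_release(toy)
```

With real data this would be `first_release(comb)`, which gives one row per package with its first archived version, that version's date, and a rough age. Note that it only sees what the archive listing contains, so it relies on `archinfo()` having parsed the dates correctly.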