
Is there a variable listing in RStudio (or R) like in SPSS?


RStudio provides a handy function, View (with an uppercase V), for inspecting data, but in R it is still awkward to get oriented in a large data set. The most common options are:

  • names(df)
  • str(df)
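
For reference, here is a minimal sketch of what these two calls give you. The data frame `df` and its columns are made up purely for illustration:

```r
# Small made-up data frame (illustrative only)
df <- data.frame(
  id    = 1:3,
  sex   = factor(c("f", "m", "f")),
  score = c(2.5, NA, 4.0)
)

names(df)  # column names only, no type or value information
str(df)    # one line per variable: type plus the first few values
```

Both are compact, but neither shows variable descriptions or missing-value counts side by side.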

If you're coming from SPSS, R seems like a downgrade in this respect. Is there a more user-friendly option? I did not find a ready-made one, so I'd like to share my solution.


Solution

  • Using RStudio's built-in function View, it's quite simple to get a variable listing for a data.frame similar to the one in SPSS. This function creates a new data.frame with the variable information and displays it in the RStudio GUI via View.

    # Better variables view
    Varlist = function(sia) {
      # Init varlist output
      varlist = data.frame(row.names = names(sia))
      varlist[["comment"]] = NA
      varlist[["type"]] = NA
      varlist[["values"]] = NA
      varlist[["NAs"]] = NA
      # Fill with meta information
      for (var in names(sia)) {
        if (!is.null(comment(sia[[var]]))) {
          varlist[[var, "comment"]] = comment(sia[[var]])
        }
        varlist[[var, "NAs"]] = sum(is.na(sia[[var]]))
        if (is.factor(sia[[var]])) {
          varlist[[var, "type"]] = "factor"
          varlist[[var, "values"]] = paste(levels(sia[[var]]), collapse=", ")
        } else if (is.character(sia[[var]])) {
          varlist[[var, "type"]] = "character"
        } else if (is.logical(sia[[var]])) {
          varlist[[var, "type"]] = "logical"
          n = sum(!is.na(sia[[var]]))
          if (n > 0) {
            varlist[[var, "values"]] = paste(round(sum(sia[[var]], na.rm=TRUE) / n * 100), "% TRUE", sep="")
          }
        } else if (is.numeric(sia[[var]])) {
          varlist[[var, "type"]] = typeof(sia[[var]])
          n = sum(!is.na(sia[[var]]))
          if (n > 0) {
            varlist[[var, "values"]] = paste(min(sia[[var]], na.rm=TRUE), "...", max(sia[[var]], na.rm=TRUE))
          }
        } else {
          varlist[[var, "type"]] = typeof(sia[[var]])
        }
      }
      View(varlist)
    }
    

    My recommendation is to store this in a file (e.g., Varlist.R) and, whenever you need it, just type:

    source("Varlist.R")
    Varlist(df)
    

    Again, please note the uppercase V in the function name.
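
The comment column of the listing is only filled for variables that carry a comment attribute. As a rough sketch, such descriptions could be attached with base R's comment() before calling Varlist(). The data frame and the description texts here are made up for illustration:

```r
# Made-up example data (illustrative only)
df <- data.frame(
  id  = 1:3,
  age = c(23, 35, NA),
  sex = factor(c("f", "m", "f"))
)

# Attach descriptions; these are what the "comment" column would show
comment(df$age) <- "Age in years"
comment(df$sex) <- "Self-reported sex"

comment(df$age)  # returns the attached description
comment(df$id)   # NULL: no comment set, so the listing would show NA
```

This is roughly the counterpart of variable labels in SPSS.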

    Limitation: The listing is a static snapshot. If the data.frame changes, the listing is not updated until Varlist(df) is run again.

    Note: R has a built-in option to view data with print. If working in plain R, just replace View(varlist) with print(varlist). Depending on screen size, however, Hmisc::describe() could be a better option for the console.
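
As a sketch of the console route, a reduced variant could return the table instead of opening the viewer, so it can be printed or inspected further. The name `varlist_table` is hypothetical and not part of the function above, and this version only reports type and missing-value counts:

```r
# Hypothetical helper (not from the answer above): builds a compact
# variable table and returns it rather than calling View().
varlist_table <- function(data) {
  data.frame(
    type = vapply(data, function(x) class(x)[1], character(1)),
    NAs  = vapply(data, function(x) sum(is.na(x)), integer(1)),
    row.names = names(data),
    stringsAsFactors = FALSE
  )
}

# Made-up example data (illustrative only)
df <- data.frame(
  id    = 1:3,
  sex   = factor(c("f", "m", "f")),
  score = c(2.5, NA, 4.0)
)

vl <- varlist_table(df)
print(vl)  # works in plain R, no RStudio required
```

Returning the data frame also makes it possible to filter it, for example to list only the variables with missing values.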