Reordering Polynomial Column Names Based on Degree


I am trying to reorder column names after expanding using the poly function in R.

My data and the initial code I wrote look like this:

> sorted_data
      X0.1     X1.0     X0.2    X1.1     X2.0     X0.3     X1.2     X2.1     X3.0
1 1.701692 1.071616 2.895755 1.82356 1.148361 4.927683 3.103138 1.954156 1.230602
2 1.720578 1.035489 2.960388 1.78164 1.072238 5.093578 3.065450 1.844869 1.110291

s <- strsplit(substring(colnames(sorted_data), 2), "\\.")
> s
[[1]]
[1] "0" "1"

[[2]]
[1] "1" "0"

[[3]]
[1] "0" "2"

[[4]]
[1] "1" "1"

[[5]]
[1] "2" "0"

colnames(sorted_data) <- sapply(s, function(x) {
      vec <- c("x", "y", "z")[seq_along(x)]
      x <- as.integer(x)
      y <- rep(vec, rev(x))
      paste(y, collapse = "")
    })

colnames(sorted_data)
[1] "x"   "y"   "xx"  "xy"  "yy"  "xxx" "xxy" "xyy" "yyy"

I am now trying to change the variable names to x1, x2, and x3. I would also like to generalize the code to allow for more than three variables, and to use ^ for the powers. This is my attempt:

sorted_data_test <- sorted_data
colnames(sorted_data_test) <- sapply(s, function(powers) {
  terms <- mapply(function(power, index) {
    if (power == "0") {
      return(NULL)
    } else if (power == "1") {
      return(paste0("x", index))
    } else {
      return(paste0("x", index, "^", power))
    }
  }, powers, seq_along(powers), SIMPLIFY = FALSE)

  # Filter out any NULL values from the terms list
  terms <- Filter(Negate(is.null), terms)

  # Collapse the terms into one string
  paste(terms, collapse = "")
})

However, that gives:

print(colnames(sorted_data_test))
[1] "x2"     "x1"     "x2^2"   "x1x2"   "x1^2"   "x2^3"   "x1x2^2" "x1^2x2" "x1^3"

How can I edit my second sapply to order the columns in the same way as the first sapply?

Thanks in advance.


Solution

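  • I think you are almost there; you are just missing a rev() around seq_along(powers). Your first sapply reverses the exponents with rev(x), so the indices in the second one need to be reversed as well. Sorting the terms afterwards puts x1 before x2 within each name.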

    You can try

    s <- strsplit(substring(colnames(sorted_data), 2), "\\.")
    colnames(sorted_data_test) <- sapply(s, function(powers) {
        terms <- mapply(function(power, index) {
            if (power == "0") {
                return(NULL)
            } else if (power == "1") {
                return(paste0("x", index))
            } else {
                return(paste0("x", index, "^", power))
            }
        }, powers, rev(seq_along(powers)), SIMPLIFY = FALSE) # <------ here is the minor change
    
        # Filter out any NULL values from the terms list
        terms <- Filter(Negate(is.null), terms)
        # Sort terms alphabetically
        sorted_terms <- sort(unlist(terms))
        # Collapse the terms into one string
        paste(sorted_terms, collapse = "")
    })
    

    and you will obtain

    > sorted_data_test
            x1       x2     x1^2    x1x2     x2^2     x1^3   x1^2x2   x1x2^2
    1 1.701692 1.071616 2.895755 1.82356 1.148361 4.927683 3.103138 1.954156
    2 1.720578 1.035489 2.960388 1.78164 1.072238 5.093578 3.065450 1.844869
          x2^3
    1 1.230602
    2 1.110291
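
If you would rather avoid the sort step (it is alphabetical, so it would place x10 before x2 once there are ten or more variables), one possible variation is to build the terms directly in index order. The sketch below is only an illustration: poly_names and its prefix argument are hypothetical names that do not appear in the code above, and it assumes the exponent strings look like "1.2" (the column names with the leading X removed) and are reversed in the same way as in your first sapply.

```r
# Hypothetical helper (not from the original post): turn exponent strings
# such as "1.2" into names such as "x1^2x2".
poly_names <- function(exponents, prefix = "x") {
  parts <- strsplit(exponents, ".", fixed = TRUE)
  vapply(parts, function(p) {
    powers <- rev(as.integer(p))     # reversed, as in the first sapply
    idx <- which(powers > 0)         # variables that actually appear
    if (length(idx) == 0) {
      return("1")                    # assumption: label an all-zero term "1"
    }
    # Omit "^1"; append "^k" for higher powers
    suffix <- ifelse(powers[idx] == 1, "", paste0("^", powers[idx]))
    paste(paste0(prefix, idx, suffix), collapse = "")
  }, character(1))
}

new_names <- poly_names(c("0.1", "1.0", "0.2", "1.1", "2.0",
                          "0.3", "1.2", "2.1", "3.0"))
print(new_names)
# [1] "x1"     "x2"     "x1^2"   "x1x2"   "x2^2"   "x1^3"   "x1^2x2" "x1x2^2" "x2^3"

# With the data frame from the question this would be used as:
# colnames(sorted_data_test) <- poly_names(substring(colnames(sorted_data), 2))
```

Because which() returns the indices in increasing order, the terms come out as x1, x2, x3, ... without any sorting, and the same function handles three or more variables, for example poly_names("1.0.2") gives "x1^2x3".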