r dataframe filter dataset data-cleaning

How to combine row and column name of df when value in target cell of df is not NA

I have a df like this:

df1 <- data.frame(c(3,NA,5), c(NA, NA, 3), c(3, 2, NA))
names(df1) <- c('number1', 'number2', 'number3')
rownames(df1) <- c('siteA', 'siteB', 'siteC')

I want to make a new df where row and column name are combined from the values that are not NA. The resulting df should look like this:

df2 <- data.frame('number' = c('number1', 'number1', 'number2', 'number3', 'number3'), 
                  'site' = c('siteA', 'siteC', 'siteC', 'siteA', 'siteB'))

How do I do this as efficiently as possible? (my real df is massive)

Solution

In a matrix or table each axis can have a name which is a character string labelling that axis. Convert df1 to a matrix m and add those.

Then convert that to long form using as.data.frame.table and remove the rows with NA's. as.data.frame.table will add a third column named Freq which we don't need so select out the first 2 columns only and since the question put number first reverse the order of the two columns extracted.

 m <- as.matrix(df1)
 names(dimnames(m)) <- c("site", "number")

 m |>
  as.data.frame.table() |>
  na.omit() |>
  subset(select = 2:1)
##    number  site
## 1 number1 siteA
## 3 number1 siteC
## 6 number2 siteC
## 7 number3 siteA
## 8 number3 siteB

m looks like this. Note the dimnames.

m
##        number
## site    number1 number2 number3
##   siteA       3      NA       3
##   siteB      NA      NA       2
##   siteC       5       3      NA