Add values across dataframe columns

I have a dataframe where missingness in indicated by "Z" (there may also be some "z" and NA entries present in the data), and values are entered as characters ("0", "1", etc). I need to create scores ("updrs1", "updrs2", updrs3") that add up the non-missing values across columns selected by colname prefix ("NP1", "NP2", "NP3").

dummy data:

dummy_df <- data.frame(
  subject_id = seq(1,6,1), 
  OTHERV1 = c(1,1,0,0,1,1),
  NP1VAR1 = c("Z","0","Z","Z","Z","Z"),
  NP1VAR2 = c("Z","0","Z","Z","Z","Z"),
  NP1VAR3 = c("Z","3","Z","Z","Z","Z"),
  NP2VAR1 = c("Z","2","Z","Z","Z","Z"), 
  NP2VAR2 = c("Z","0","Z","Z","Z","Z"),
  NP2VAR3 = c("Z","0","Z","Z","Z","Z"),
  NP3VAR1 = c("Z","4","Z","Z","Z","Z"),
  NP3VAR2 = c("Z","0","Z","Z","z","Z"),
  NP3VAR3 = c("Z","0","Z","Z","Z",NA),
  OTHERV2 = c(NA,NA,NA,NA,NA,NA)
)

desired output:

	subject_id	updrs1	updrs2	updrs3
1	1	Z	Z	Z
2	2	3	2	4
3	3	Z	Z	Z
4	4	Z	Z	Z
5	5	Z	Z	Z
6	6	Z	Z	Z

NOTE: all values in the output are characters

NOTE: treating NA/Z/z as 0 (i.e., transforming all values with as.numeric()) is problematic.

I've tried variations on this answer, with no luck.

I have used a combination of tidyr::pivot_longer(), dplyr::group_by(), and summarize():

desired_output <- select(dummy_df, c(subject_id, starts_with("NP"))) %>%
  mutate(across(all_of(everything()), ~ifelse(. %in% c("Z", "z"), NA, .))) %>%
  pivot_longer(cols = starts_with("NP"), names_to = c(".value", "np_var"), names_sep = "VAR") %>%
  group_by(subject_id) %>%
  summarize(updrs1 = sum(as.numeric(NP1), na.rm = FALSE),
            updrs2 = sum(as.numeric(NP2), na.rm = FALSE), 
            updrs3 = sum(as.numeric(NP3), na.rm = FALSE), .groups = "drop") %>%
  mutate(across(all_of(everything()), as.character)) %>%
  replace(is.na(.), "Z")

This works. L Tyrone's answer also works, so I've accepted it. Thanks all.

Solution

If you don't care whether:

the missingness values are upper or lower case
Z/z are the only missingness values
existing NA values can be represented as Z

then this works. There is likely to be a more direct way to do this, but I find a step-wise approach easier to follow. Note that it will return a warning, which you can ignore:

library(dplyr)
library(tidyr)

dummy_df |>
  pivot_longer(
    cols = starts_with("NP"),
    names_to = c("grp", "var"),
    names_pattern = "^(NP\\d+)(VAR\\d+)$",
    values_to = "value") |>
  mutate(value = if_else(grepl("^[0-9]+$", value), as.integer(value), NA)) |>
  summarise(value = sum(value), .by = c(subject_id, grp)) |>
  pivot_wider(id_cols = subject_id,
              names_from = grp,
              values_from = value) |>
  rename(updrs1 = NP1, updrs2 = NP2, updrs3 = NP3) |>
  mutate(across(where(is.numeric), as.character),
         across(everything(), ~replace_na(.x, "Z")))
    
# # A tibble: 6 × 4
#   subject_id updrs1 updrs2 updrs3
#   <chr>      <chr>  <chr>  <chr> 
# 1 1          Z      Z      Z     
# 2 2          3      2      4     
# 3 3          Z      Z      Z     
# 4 4          Z      Z      Z     
# 5 5          Z      Z      Z     
# 6 6          Z      Z      Z