Sum up rows by partial match of name and rename based on a name database

after a long time I am working again with R and thus I am a bit rusty. Every bit of help is appreciated.

So I have a survey out there which contains many matrix questions, even dual matrix. In this example it is a dual matirx is the survey question come formatted in Question.AnswerCode..FirstOrSecondMatrix. I made a frequency data frame who looks like this:

subsetting from my original data for reproduction - columns in german, not sorted

dput(PIL)
structure(list(gering = c(12L, 9L, 12L, 3L, 12L, 8L, 3L, 10L, 
12L, 6L, 7L, 7L), hoch = c(32L, 45L, 28L, 43L, 39L, 44L, 20L, 
20L, 17L, 20L, 28L, 18L), `keine Angabe` = c(6L, 20L, 12L, 30L, 
9L, 24L, 9L, 16L, 10L, 17L, 10L, 17L), mittel = c(27L, 32L, 15L, 
27L, 23L, 27L, 19L, 18L, 20L, 25L, 10L, 24L), `sehr gering` = c(17L, 
10L, 28L, 1L, 14L, 2L, 1L, 1L, 3L, 2L, 15L, 4L), `sehr hoch` = c(22, 
0, 21, 12, 19, 11, 18, 5, 8, 0, 0, 0), Summe = c(116, 116, 116, 
116, 116, 116, 70, 70, 70, 70, 70, 70)), row.names = c("PIL.PILK1..1.", 
"PIL.PILK1..2.", "PIL.PILK2..1.", "PIL.PILK2..2.", "PIL.PILK3..1.", 
"PIL.PILK3..2.", "PIL2.PILK1..1.", "PIL2.PILK1..2.", "PIL2.PILK2..1.", 
"PIL2.PILK2..2.", "PIL2.PILK3..1.", "PIL2.PILK3..2."), class = "data.frame")

and as I result I try to achieve this:

I would like to join(sum) the rows by the middle part of the string "PILK1" = Prozess1 and rename it into "Prozess 1..1." or Prozess1..2.". Then renaming the last step "..1." with grepl to the desired output i am pretty sure to get this done since it has only two options. But all my searching has not helped me with the renaming a part of a string from a second data frame with the partial strings and the name of the process. The names list(data frame) contains 100+ names as abbrev strings and full names.

Where I am stuck

How to identify a row by partial match and sum those up
How to replace a string up to a certain point of a string using a list or data frame (picture 3)

Thanks a lot

PS: My workaround up to now is directly addressing it i.e. Prozess1_Intensity<- table(ifelse(!is.na(PIL.PILK1..1.),PIL.PILK1..1., PIL2.PILK1..1.), useNA = "always") and bind all items into a data frame. But this is a lot of copy and paste or in my case chaining everything together in Excel and paste it into R, which becomes tedious if applied to the whole survey

Solution

If the lookup table to match the process name with it's abbrevation is called as lookup.

library(dplyr)
library(tidyr)

lookup <- data.frame(prozess_name = c('Prozess1', 'Prozess2', 'Prozess3'), 
                     abbrev = c('PILK1', 'PILK2', 'PILK3'))

lookup

#  prozess_name abbrev
#1     Prozess1  PILK1
#2     Prozess2  PILK2
#3     Prozess3  PILK3

You can divide the rownames in 3 separate columns using extract, join with lookup and sum the column values using across.

PIL %>%
  rownames_to_column('abbrev') %>%
  extract(abbrev, c('PIL','abbrev', 'num'), '(PIL\\d?)\\.(PILK\\d+)\\.\\.(\\d+)\\.') %>%
  left_join(lookup, by = 'abbrev') %>%
  group_by(prozess_name, num) %>%
  summarise(across(gering:Summe, sum, na.rm = TRUE)) %>%
  ungroup

#  prozess_name num   gering  hoch `keine Angabe` mittel `sehr gering` `sehr hoch` Summe
#  <chr>        <chr>  <int> <int>          <int>  <int>         <int>       <dbl> <dbl>
#1 Prozess1     1         15    52             15     46            18          40   186
#2 Prozess1     2         19    65             36     50            11           5   186
#3 Prozess2     1         24    45             22     35            31          29   186
#4 Prozess2     2          9    63             47     52             3          12   186
#5 Prozess3     1         19    67             19     33            29          19   186
#6 Prozess3     2         15    62             41     51             6          11   186