Search code examples
rdataframedplyrdata-cleaning

Combining content of two columns in R


Good day

I have the following data set that basically serves as a key to another data set.

enter image description here

A short example section in code form

Code = c("KBP1369"," "," "," ","KBP1370"," "," ", " "," ")
Description = c("All monetary institutions Credit extended to the domestic ","private sector Total loans and advances","A","J","Monetary aggregates / Money supply: M1(A)", "A","J","M","X")


Data=data.frame(Code,Description) 

I need to somehow create a column that contains the code and the letter under the description column.

So basically I need something along these lines.

enter image description here

I just don't know how since in some cases the wordy description runs over 1 line and other times only over one. On top of that, the number of letter rows differ two, like for example KBP1369 has 4 letter descriptions but KBP1372 only has 2. How on earth do I do this??


Solution

  • An idea is to convert empty strings to NA, fill and paste with the single character strings, i.e.

    library(dplyr)
    
    Data %>% 
     mutate(Code1 = replace(Code, Code == ' ', NA)) %>% 
     tidyr::fill(Code1) %>% 
     mutate(Code1 = ifelse(nchar(Description) == 1, paste0(Code1, Description), ''))
    
         Code                                                Description    Code1
    1 KBP1369 All monetary institutions Credit extended to the domestic          
    2                            private sector Total loans and advances         
    3                                                                  A KBP1369A
    4                                                                  J KBP1369J
    5 KBP1370                  Monetary aggregates / Money supply: M1(A)         
    6                                                                  A KBP1370A
    7                                                                  J KBP1370J
    8                                                                  M KBP1370M
    9                                                                  X KBP1370X
    

    To include the new results in the original Code column then simply modify that in mutate, i.e.

    Data %>% 
         mutate(Code1 = replace(Code, Code == ' ', NA)) %>% 
         tidyr::fill(Code1) %>% 
         mutate(Code = ifelse(nchar(Description) == 1, paste0(Code1, Description), Code))