
Replacing numbers with letters in string


I have an ID column with names like "155AB3EA157A3466887D8F4B99BABC35". I want to replace the numbers in these strings with letters. I've tried using gsub, but it produces an "invalid text argument" error. My code looks like this:

as.character(df$ID)
gsub("1", "A", df$ID)

I should add that I'm working with the ff package, because the data is very large.


Solution

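  • If we are replacing the digits 1-9 with the LETTERS 'A' to 'I', then chartr is an option. It translates character by character, so the digit 0 and any existing letters are left unchanged.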

    chartr('123456789', 'ABCDEFGHI', v1)
    #[1] "AEEABCEAAEGACDFFHHGDHFDBIIBABCCE"
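
    As a rough check on what chartr is doing here, the sketch below compares it with a loop of single-digit gsub calls on the sample ID from the question. The names id, via_chartr and via_gsub are made up for illustration, and the example uses a plain in-memory character vector, not an ff object.

    ```r
    # Sample ID taken from the question (plain character vector, not ff)
    id <- "155AB3EA157A3466887D8F4B99BABC35"

    # chartr maps the i-th character of the first string to the
    # i-th character of the second string in a single pass
    via_chartr <- chartr("123456789", "ABCDEFGHI", id)

    # The same mapping written as nine literal gsub() replacements
    via_gsub <- id
    for (i in 1:9) {
      via_gsub <- gsub(as.character(i), LETTERS[i], via_gsub, fixed = TRUE)
    }

    # Both approaches should agree on this input
    identical(via_chartr, via_gsub)
    ```

    The loop is only safe because each replacement is a letter and each later pattern is a digit, so earlier substitutions cannot be matched again. chartr avoids that concern and is shorter to write.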
    

    Update

    I just noticed that the OP is probably working with an ffdf object, which needs the ff and ffbase packages:

    library(ff)
    library(ffbase)
    

    In that case, applying the function in the regular manner results in the error the OP mentioned:

    gsub("1", "A", d$v1) 
    

    Error in gsub("1", "A", d$v1) : invalid 'text' argument

    Instead, we can use the ffdf-aware wrappers with.ffdf or within.ffdf from ffbase, which evaluate the expression on the data held in the ffdf:

    with.ffdf(d, gsub("1", "A", v1))
    #ff (open) integer length=1 (1) levels: A55AB3EAA57A3466887D8F4B99BABC35
    #                         [1] 
    #A55AB3EAA57A3466887D8F4B99BABC35 
    

    To replace the digits 1-9, chartr can be applied in the same way and the result assigned back to the column:

    d$v1 <- with.ffdf(d, chartr('123456789', 'ABCDEFGHI', v1))
    d
    #ffdf (all open) dim=c(1,1), dimorder=c(1,2) row.names=NULL
    #ffdf virtual mapping
    #   PhysicalName VirtualVmode PhysicalVmode  AsIs VirtualIsMatrix PhysicalIsMatrix PhysicalElementNo PhysicalFirstCol PhysicalLastCol PhysicalIsOpen
    #v1           v1      integer       integer FALSE           FALSE            FALSE                 1                1               1           TRUE
    #ffdf data
    #                                v1
    #1 AEEABCEAAEGACDFFHHGDHFDBIIBABCCE
    

    data

    v1 <- "155AB3EA157A3466887D8F4B99BABC35"
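    # ff stores strings as factors (see the "levels" in the output above);
    # stringsAsFactors = TRUE is needed in R >= 4.0, where it is no longer the default
    d <- as.ffdf(data.frame(v1, stringsAsFactors = TRUE))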