
Replace R data.frame columns values based on value length


Suppose I have the following df in R:

df <- data.frame(A = c(1234, 56789, 159, 45950, 7844, 678, 17, 28794), B = c(657, 823, 465, 756, 328, 122, 490, 990))

I need to replace values in column A that are fewer than 4 characters long with "", but I don't want to filter out those rows. Values with 4 or more characters can remain as is. I've referenced the post "Replace values based on length" but have had no luck, as the code doesn't perform any action on the column.

file['A'][sapply(file['A'], nchar) < 4] <- ""

Any suggestions?


Solution

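  • If your data frame holds character values, you could use nchar() with sapply() to count the number of characters in each value, like this. Note that this checks every column, so B is blanked as well because all of its values are 3 characters long: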

    df <- data.frame(A = as.character(c(1234, 56789, 159, 45950, 7844, 678, 17, 28794)), B = as.character(c(657, 823, 465, 756, 328, 122, 490, 990)))
    df[sapply(df, \(x) nchar(x) < 4)] <- ""
    df
    #>       A B
    #> 1  1234  
    #> 2 56789  
    #> 3        
    #> 4 45950  
    #> 5  7844  
    #> 6        
    #> 7        
    #> 8 28794
    

    Created on 2023-01-15 with reprex v2.0.2


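    If your data frame has numeric values, you could instead replace all values below 1000 (for positive whole numbers, those with fewer than 4 digits) like this. Assigning "" turns the affected columns into character columns: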

    df <- data.frame(A = c(1234, 56789, 159, 45950, 7844, 678, 17, 28794), B = c(657, 823, 465, 756, 328, 122, 490, 990))
    df[sapply(df, \(x) x < 1000)] <- ""
    df
    #>       A B
    #> 1  1234  
    #> 2 56789  
    #> 3        
    #> 4 45950  
    #> 5  7844  
    #> 6        
    #> 7        
    #> 8 28794
    

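Both snippets above test every column of the data frame, while the question asks about column A only. A minimal sketch of restricting the replacement to that column could look like the following. It assumes A holds positive whole numbers, as in the example data, and it converts A to character so that nchar() counts digits; B is left untouched and stays numeric.

```r
# Same example data as in the question.
df <- data.frame(
  A = c(1234, 56789, 159, 45950, 7844, 678, 17, 28794),
  B = c(657, 823, 465, 756, 328, 122, 490, 990)
)

# Convert only column A to character so nchar() counts its digits.
df$A <- as.character(df$A)

# Blank out the entries of A that are shorter than 4 characters.
# The rows are kept and column B is not modified.
df$A[nchar(df$A) < 4] <- ""

df$A
# "1234" "56789" "" "45950" "7844" "" "" "28794"
```

Indexing with df$A[...] rather than df[...] is what limits the assignment to a single column, so the same pattern should work for any one column you name.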