R: How to convert a byte in a raw vector into an ASCII space


I am reading some very old files created by C code that consist of a header (ASCII) and then data. I use readBin() to get the header data. When I try to convert the header to a string it fails because there are 3 'bad' bytes. Two of them are binary 0 and the other binary 17 (IIRC).

How do I convert the bad bytes to an ASCII space? I've tried several versions of the code below, but they fail.

      hd[hd == as.raw(0) | hd == as.raw(0x17)] <- as.raw(32)

I'd like to replace each bad value with a space so I don't have to recompute all the fixed data locations in parsing the string derived from hd.


Solution

  • I normally just go through a conversion to integer.

    Suppose we have this raw vector:

    raw_with_null <- as.raw(c(0x48, 0x65, 0x6c, 0x6c, 0x6f, 0x00, 
                              0x57, 0x6f, 0x72, 0x6c, 0x64, 0x21))
    

    We get an error if we try to convert it to character because of the null byte:

    rawToChar(raw_with_null)
    #> Error in rawToChar(raw_with_null): embedded nul in string: 'Hello\0World!'
    

    It's easy to convert to integer and replace any 0s or 23s (23 is 0x17, the value used in the question's code) with 32s (the ASCII space):

    nums <- as.integer(raw_with_null)
    
    nums[nums == 0 | nums == 23] <- 32
    

    We can then convert nums back to raw and then to character:

    rawToChar(as.raw(nums))
    #> [1] "Hello World!"
    

    Created on 2022-03-05 by the reprex package (v2.0.1)
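
    The same steps can be wrapped in a small helper. This is only a sketch: the function name replace_bad_bytes, its arguments, and the sample header bytes are made up for illustration and are not part of the original answer. It assumes the bad bytes are 0 and 23 (0x17), as in the question's code. Because each bad byte is swapped for exactly one space, the length of the vector does not change, so fixed field positions in the header should stay valid.

    ```r
    # Hypothetical helper: replace unwanted byte values in a raw vector.
    # `bad` and `replacement` are integer byte values (0-255).
    replace_bad_bytes <- function(x, bad = c(0L, 23L), replacement = 32L) {
      stopifnot(is.raw(x))
      nums <- as.integer(x)              # raw -> integer
      nums[nums %in% bad] <- replacement # overwrite bad bytes in place
      as.raw(nums)                       # integer -> raw
    }

    # Made-up header bytes: "HDR", NUL, "1", 0x17, "2", NUL
    hd <- as.raw(c(0x48, 0x44, 0x52, 0x00, 0x31, 0x17, 0x32, 0x00))

    clean  <- replace_bad_bytes(hd)
    header <- rawToChar(clean)
    header
    #> [1] "HDR 1 2 "

    # The length is unchanged, so fixed offsets still line up
    length(clean) == length(hd)
    #> [1] TRUE
    substr(header, 5, 5)
    #> [1] "1"
    ```

    If the third bad byte turns out to be decimal 17 rather than 0x17, pass a different set, for example bad = c(0L, 17L).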