R and Excel: a space that is not a common space shows up as an underscore in R


I have an Excel sheet with columns in which some entries contain a space at the end of the string, for example "SS" and "SS " (the latter with a trailing space). This is only visible if I click into the cell. When I try to replace the space with "" via Ctrl+H, typing the space with the space bar, Excel does not find it, so it is evidently not a common space. When I insert a common space somewhere else (with the space bar), Ctrl+H does find that one, so I assume the trailing character is a special space. If I copy the space from that cell and paste it into the Ctrl+H dialog, I can replace it with "".

When I import the Excel sheet into R (using ess-emacs) before replacing I get the following entries:

[screenshot: the imported entries, with an underscore shown where the trailing space is]

The underscore is not a common underscore and can't be replaced with "" using sub. Now I wonder what this space is and how I can cope with it in R (that is, how to remove this space/underscore).


Solution

  • I can't guarantee that this will work (since Unicode etc. handling can vary from platform to platform), but ?trimws suggests that using whitespace = "[\\h\\v]" will work:

    > z <- data.frame(1:2,2:3)
    > names(z) <- c("a  ","b\u00a0")  ## column name with Unicode space
    > z
      a   b 
    1   1  2
    2   2  3
    > names(z)
    [1] "a  " "b " 
    > trimws(names(z))  ## default doesn't remove space after 'b'
    [1] "a"  "b "
    > trimws(names(z), whitespace="[\\h\\v]")
    [1] "a" "b"
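
The question concerns cell entries rather than column names, so the same idea can be sketched for the values themselves. This is only an illustration under assumptions: the data frame `df` and its columns `code` and `n` are made up to stand in for the imported sheet, and the special space is assumed to be a non-breaking space (U+00A0), which `utf8ToInt()` can help confirm or rule out. As far as I know, the `whitespace` argument of `trimws()` needs R 3.6.0 or later.

```r
# Hypothetical stand-in for the imported Excel sheet
df <- data.frame(
  code = c("SS", "SS\u00a0", "SS "),  # plain, non-breaking space, common space
  n = 1:3,
  stringsAsFactors = FALSE
)

# Inspect the code points of a suspicious entry; 160 (U+00A0) would
# indicate a non-breaking space, 32 a common space
code_points <- utf8ToInt(df$code[2])
print(code_points)

# Trim horizontal and vertical whitespace (including Unicode spaces)
# from every character column, leaving other columns untouched
is_chr <- vapply(df, is.character, logical(1))
df[is_chr] <- lapply(df[is_chr], trimws, whitespace = "[\\h\\v]")

print(unique(df$code))
```

If the code points show some other character entirely, it may be simpler to remove that exact character with `gsub()` and `fixed = TRUE`.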