
How can I use read.table in R properly with this database?


I'm trying to read this dummy database with read.table(file="clipboard") in R:

             Aspecto   Sexo      Ranking
1             Imagen   Hombre    7.50
2      Mantenimiento   Hombre    7.18
3               Otro   Hombre    7.05
4  Espacios de venta   Hombre    6.91
5         Vigilancia   Hombre    6.36
6             Tiempo   Hombre    6.51
7    Espacios libres   Hombre    6.40
8             Imagen   Mujer     7.21
9      Mantenimiento   Mujer     7.30
10              Otro   Mujer     6.90
11 Espacios de venta   Mujer     7.02
12        Vigilancia   Mujer     6.53
13            Tiempo   Mujer     6.40
14   Espacios libres   Mujer     5.78

This code seems to work:

pw <- read.table(file="clipboard", dec=".", sep=",", header=TRUE)

But the structure is clearly something I don't want:

str(pw)
'data.frame':   14 obs. of  1 variable:
 $ Aspecto...Sexo......Ranking: Factor w/ 14 levels "1  

I've tried many things, including fill=TRUE and other arguments, but I just can't get what I expect. For example:

pw <- read.table(file="clipboard", dec=".", sep="", header=TRUE)
Error in read.table(file = "clipboard", dec = ".", sep = "", header = TRUE) : 
  more columns than column names

Any advice will be much appreciated.


Solution

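  • You can use read.fwf, since the columns have fixed widths and the character strings are not quoted. Splitting on whitespace (sep="") fails because values such as "Espacios de venta" contain spaces, so some rows produce more fields than the header has names. Because the header row has only 3 names for 4 columns, skip it when reading the data and read the names separately with scan. This assumes the clipboard contents have been saved to clipboard.txt.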

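    # Read the data rows by fixed width, skipping the 3-name header line
    clipboard <- read.fwf("clipboard.txt", widths=c(2,18,9,8), skip=1, as.is=TRUE)
    # or add row.names=1 to use the first, unnamed column as row names
    
    # Read the 3 column names from the header line
    colnames(clipboard)[2:4] <- scan("clipboard.txt", what="character", nlines=1)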
    
    str(clipboard)
    
    'data.frame':   14 obs. of  4 variables:
     $ V1     : num  1 2 3 4 5 6 7 8 9 10 ...
     $ Aspecto: chr  "            Imagen" "     Mantenimiento" "              Otro" " Espacios de venta" ...
     $ Sexo   : chr  "   Hombre" "   Hombre" "   Hombre" "   Hombre" ...
     $ Ranking: num  7.5 7.18 7.05 6.91 6.36 6.51 6.4 7.21 7.3 6.9 ...
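
The character columns above still carry the padding from the fixed-width layout. As a minimal sketch, assuming the data look like the sample in the question, the following rebuilds the table in a temporary file (a stand-in for clipboard.txt), uses count.fields to show why whitespace splitting fails, and trims the padding with trimws after reading. The names `path`, `pw`, and the column name `id` are illustrative choices, not part of the original answer.

```r
# Rebuild the question's table with the same column widths (2, 18, 9, 8).
aspecto <- rep(c("Imagen", "Mantenimiento", "Otro", "Espacios de venta",
                 "Vigilancia", "Tiempo", "Espacios libres"), times = 2)
sexo    <- rep(c("Hombre", "Mujer"), each = 7)
ranking <- c(7.50, 7.18, 7.05, 6.91, 6.36, 6.51, 6.40,
             7.21, 7.30, 6.90, 7.02, 6.53, 6.40, 5.78)

header <- "             Aspecto   Sexo      Ranking"
rows   <- sprintf("%-2d%18s   %-6s%8.2f", seq_along(aspecto), aspecto, sexo, ranking)

# Temporary file standing in for "clipboard.txt" (assumption for this sketch).
path <- tempfile(fileext = ".txt")
writeLines(c(header, rows), path)

# Fields per line when splitting on whitespace: the header has 3,
# while rows have 4 to 6 because some Aspecto values contain spaces.
n_fields <- count.fields(path, sep = "")

# Read the data by fixed width, then take the names from the header line.
pw <- read.fwf(path, widths = c(2, 18, 9, 8), skip = 1, as.is = TRUE)
names(pw) <- c("id", scan(path, what = "character", nlines = 1, quiet = TRUE))

# Strip the leading and trailing padding from the character columns only.
is_chr <- vapply(pw, is.character, logical(1))
pw[is_chr] <- lapply(pw[is_chr], trimws)

str(pw)
```

Trimming after the read keeps the spaces inside values such as "Espacios de venta" intact, since trimws only touches the ends of each string.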