Subsetting columns from the data frame in R


I have a relatively easy question about subsetting columns in R.

I have two data frames, dat1 and dat2:

> dat1
      cities countries areakm2 populationk
1   Shanghai     China    2643       21766
2    Beijing     China    1368       21500
3        NYC       USA Unknown        8406
4         LA       USA    1302        3884
5     London        UK    1737     Unknown
6 Manchester        UK     116         255

> dat2
  Ozone Solar.R Wind Temp Month Day
1    41     190  7.4   67     5   1
2    36     118  8.0   72     5   2
3    12     149 12.6   74     5   3
4    18     313 11.5   62     5   4
5    NA      NA 14.3   56     5   5

If I subset the first column of dat1, I get the following:

> dat1[,1]
[1] Shanghai   Beijing    NYC        LA         London     Manchester
Levels: Beijing LA London Manchester NYC Shanghai
> class(dat1[,1])
[1] "factor"

However, if I do the same thing with dat2, I get a plain vector, not a factor.

> dat2[,1]
[1] 41 36 12 18 NA
> class(dat2[,1])
[1] "integer"

I can't understand what the difference is between these two cases. I assume it has something to do with data types (in dat1 the first column consists of characters, while in dat2 it consists of integers).

Thank you


Solution

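  • Actually, both are vectors: one is a factor and the other is an integer vector. A factor is stored as integer codes with a set of levels attached, which is why it prints with a Levels: line. If you want R to treat the column as character (another type, like factor or integer), you should use

    stringsAsFactors = FALSE

    when creating your data.frame.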
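As a minimal sketch of the point above, the snippet below builds a small stand-in for dat1 twice, once with stringsAsFactors = TRUE and once with FALSE. The names dat1_factor and dat1_char are hypothetical and only three of the rows from the question are reproduced. Note that since R 4.0.0 the default in data.frame() is stringsAsFactors = FALSE, so the factor behaviour in the question would only appear on older versions or when the argument is set explicitly.

```r
# Hypothetical stand-ins for dat1 (names and row subset are assumptions)
cities    <- c("Shanghai", "Beijing", "NYC")
countries <- c("China", "China", "USA")

# Strings converted to factors, as in the question's output
dat1_factor <- data.frame(cities, countries, stringsAsFactors = TRUE)

# Strings kept as plain character vectors
dat1_char <- data.frame(cities, countries, stringsAsFactors = FALSE)

class(dat1_factor[, 1])   # "factor"
typeof(dat1_factor[, 1])  # "integer": a factor is integer codes plus levels
class(dat1_char[, 1])     # "character"

# An existing factor column can also be converted after the fact
cities_chr <- as.character(dat1_factor[, 1])
class(cities_chr)         # "character"
```

If the data frame already exists, converting individual columns with as.character() is probably simpler than rebuilding it.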