Search code examples
rdataframelookupindexing

In R, What is the difference between df["x"] and df$x


Where can I find information on the differences between calling on a column within a data.frame via:

df <- data.frame(x=1:20,y=letters[1:20],z=20:1)

df$x
df["x"]

They both return the "same" results, but not necessarily in the same format. Another thing that I've noticed is that df$x returns a list. Whereas df["x"] returns a data.frame.

EDIT: However, knowing which one to use in which situation has become a challenge. Is there a best practice here or does it really come down to knowing what the command or function requires? So far I've just been cycling through them if my function doesn't work at first (trial and error).


Solution

  • If I'm not mistaken, df$x is the same as df[['x']]. [[ is used to select any single element, whereas [ returns a list of the selected elements. See also the language reference. I usually see that [[ is used for lists, [ for arrays and $ for getting a single column or element. If you need an expression (for example df[[name]] or df[,name]), then use the [ or [[ notation also. The [ notation is also used if multiple columns are selected. For example df[,c('name1', 'name2')]. I don't think there is a best-practices for this.