Search code examples
rreshape2dcast

listing sublevels of one column by the levels of other column


It is possible that this question is already discussed somewhere on this site but I am not able to find it after several searches. Let us say my dataframe has only 2 columns which can be generated with below code

df=data.frame(brand=c('Audi','BMW','Audi','Toyota','Audi','Toyota'),
Model=c('A3','x', 'A4','Corolla','A5','Camry'))

Now I am interested in seeing for each brand, what are the models available. I do not want to sum or mean anything. This should be a simple solution but I am not able to find it.

Best I could find was using dcast from reshape2 package. I do

dcast(df,brand+Model~.)

and I get this

   brand   Model       .
1   Audi      A3      A3
2   Audi      A4      A4
3   Audi      A5      A5
4    BMW       x       x
5 Toyota   Camry   Camry
6 Toyota Corolla Corolla  

The first two columns are exactly what I want but I am thinking I am unnecessarily generating a third column and then discarding a later. Also, there is a chance this can be done without any additional library.

So my question is, is there a better or more efficient way of handling this?

(The actual data I have gives the count in the last column under '.' I do not know why the simplified example I created does not give the count but instead just repeats the value in second column. I will just take that up as a separate question if I have to!)


Solution

  • You can simply order the data.frame by brand and Model. If you have duplicated values, you can use unique() to only get distinct combinations.

    unique(df[order(df$brand, df$Model), ])
    

    returns

       brand   Model
    1   Audi      A3
    3   Audi      A4
    5   Audi      A5
    2    BMW       x
    6 Toyota   Camry
    4 Toyota Corolla