r random-forest gini

R dataframe not creating properly


I used the following code to obtain the mean decrease in accuracy from a random forest:

 library(randomForest)
 # importance = TRUE is needed to get the %IncMSE column
 AAA <- randomForest(CPercentage ~ ., data = data, importance = TRUE)
 BBB <- as.data.frame(importance(AAA))

I have created the following dataframe by the above process

                   %IncMSE IncNodePurity
  Campaigntype   3.4815273   207.5336052
  Email         -1.1606079  2042.5660103
  get            4.9073550    35.1237017
  free           2.8777972    14.5362957
  new            8.4464445    93.3491610
  buy            5.9636483    23.9926669
  just           4.1262164    21.5611278
  month          4.0817729    16.6345631

I am able to obtain the second and third columns with BBB$`%IncMSE` and BBB$IncNodePurity. I want to subset this data frame based on the first column, which appears to be unnamed, but I am unable to do so. When I write the data frame to a CSV file, it works and all three columns are listed separately. In R, however, I am unable to separate the first two columns. Is there any way I can do this and rename the first column? I will be grateful to anyone who helps.


Solution

  • It seems that your "first column" is actually the data frame's row names (its index) rather than a real column, which is why it has no name. You can copy the row names into a new column, named whatever you like, and then reset the row names so they're row numbers instead of names.

    Try this:

    BBB$col_one <- rownames(BBB)
    rownames(BBB) <- 1:nrow(BBB)
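
    Here is a minimal sketch of the whole round trip, in case it helps. It does not fit a forest; it rebuilds the first three rows of the table from the question by hand so that it runs on its own. The column name `term` and the helper variables `by_name`, `email_row`, and `positive` are names I made up for illustration.

```r
# Stand-in for as.data.frame(importance(AAA)): values copied from the question.
# check.names = FALSE keeps the non-syntactic column name "%IncMSE" as-is.
BBB <- data.frame(
  `%IncMSE`     = c(3.4815273, -1.1606079, 4.9073550),
  IncNodePurity = c(207.5336052, 2042.5660103, 35.1237017),
  row.names     = c("Campaigntype", "Email", "get"),
  check.names   = FALSE
)

# Option 1: subset directly by row name, without adding a column.
by_name <- BBB["Email", ]

# Option 2: move the row names into a real, named column.
BBB$term <- rownames(BBB)   # "term" is an arbitrary name
rownames(BBB) <- NULL       # fall back to automatic row numbers

# Put the new column first, then subset on it like any other column.
BBB <- BBB[, c("term", "%IncMSE", "IncNodePurity")]
email_row <- BBB[BBB$term == "Email", ]

# Non-syntactic names such as %IncMSE need backticks after $.
positive <- BBB[BBB$`%IncMSE` > 0, ]
```

    If you only ever need to pick out a few variables, Option 1 is enough; Option 2 is handier when you want the names to survive as an ordinary column for filtering, merging, or plotting.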