Unable to set column names to a subset of a dataframe


I ran the following code, where p is a data frame that has already been loaded.

a <- sort(table(p$Title))
a1 <- as.data.frame(a)
tail(a1, 7)

                     a
Maths               732
Science             737
Physics             737
Chemistry           776
Social Science      905
null              57374
                  88117

I want to manipulate the resulting data frame further, so I first want to give it column names. I tried the colnames function:

colnames(a1) <- c("category", "count")

I get the following error:

Error in `colnames<-`(`*tmp*`, value = c("category", "count")) : 
    attempt to set 'colnames' on an object with less than two dimensions

Please suggest.


Solution

  • As I said in the comments to your question, the categories are stored as row names, not in a column. A reproducible example:

    # create dataframe p
    x <- c("Maths","Science","Physics","Chemistry","Social Science","Languages","Economics","History")
    set.seed(1)
    p <- data.frame(title=sample(x, 100, replace=TRUE), y="some arbitrary value")
    
    # create the data.frame as you did
    a <- sort(table(p$title))
    a1 <- as.data.frame(a)
    

    The resulting dataframe:

    > a1
                    a
    Social Science  6
    Maths           9
    History        10
    Science        11
    Physics        12
    Languages      15
    Economics      17
    Chemistry      20
    

    Looking at the dimensions of dataframe a1, you get this:

    > dim(a1)
    [1] 8 1
    

    which means that your data frame has 8 rows and only 1 column: the categories are row names, not a second column. Assigning two column names to a1 therefore results in an error.
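A minimal sketch of that check, assuming a small hypothetical one-column data frame that stands in for a1 (the values and the status variable are illustrative, not taken from the question's data). It compares the number of new names with ncol() before renaming, so the mismatch is reported instead of raising an error:

```r
# Hypothetical stand-in for a1: counts in column "a", categories as row names.
a1 <- data.frame(a = c(6L, 9L, 10L),
                 row.names = c("Social Science", "Maths", "History"))

dim(a1)        # number of rows, then number of columns
names(a1)      # the single existing column name
rownames(a1)   # the categories live here, not in a column

new_names <- c("category", "count")

# Rename only when the number of names matches the number of columns;
# otherwise record the mismatch instead of letting the assignment fail.
if (length(new_names) == ncol(a1)) {
  colnames(a1) <- new_names
  status <- "renamed"
} else {
  status <- sprintf("have %d column(s) but %d name(s)",
                    ncol(a1), length(new_names))
}
print(status)
```

Note that the exact wording of the error you get without such a check can depend on the object being renamed and on the R version.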

    You can solve your problem in two ways:

    1: assign just one column name with colnames(a1) <- "count"

    2: convert the row names to a category column and then assign the column names:

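    # copy the row names into a new column (it becomes the second column)
    a1$category <- row.names(a1)
    # name the columns in their actual order: counts first, categories second
    colnames(a1) <- c("count","category")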
    

    The resulting dataframe:

    > a1
                   count       category
    Social Science     6 Social Science
    Maths              9          Maths
    History           10        History
    Science           11        Science
    Physics           12        Physics
    Languages         15      Languages
    Economics         17      Economics
    Chemistry         20      Chemistry
    

    You can then drop the now-redundant row names with rownames(a1) <- NULL. This gives:

    > a1
      count       category
    1     6 Social Science
    2     9          Maths
    3    10        History
    4    11        Science
    5    12        Physics
    6    15      Languages
    7    17      Economics
    8    20      Chemistry
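
As an alternative sketch (not part of the answer above), you could build the two-column data frame directly from the names and values of the sorted table. This avoids relying on how as.data.frame() handles a sorted table, which, as far as I know, is not the same in every R version, and it gives the category, count column order asked for in the question. The small p below is made-up example data, and tab is just an illustrative variable name:

```r
# Made-up example data standing in for the real data frame p.
p <- data.frame(title = c("Maths", "Science", "Maths",
                          "Physics", "Maths", "Science"),
                stringsAsFactors = FALSE)

# Count the titles and sort the counts in increasing order.
tab <- sort(table(p$title))

# Build the result explicitly: names(tab) holds the categories,
# as.vector(tab) strips the table attributes and leaves plain counts.
a1 <- data.frame(category = names(tab),
                 count = as.vector(tab),
                 stringsAsFactors = FALSE)

print(a1)
```

Because the columns are named when the data frame is created, no separate colnames() call is needed, and the row names are the default row numbers.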