
R: aggregating a data frame fails with "'sum' not meaningful for factors"


I am getting the following error: Error in Summary.factor(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, : ‘sum’ not meaningful for factors

Here is my code:

    library(SportsAnalytics)
    nba1819 = fetch_NBAPlayerStatistics("18-19")
    nbadf = data.frame(nba1819)
    nbaagg = nbadf[c(5:25)]
    nbaagg = lapply(nbaagg, function(x) type.convert(as.numeric(x)))
    nbaagg$Team = as.character(nbadf$Team)
    nbaagg = aggregate(nbaagg,
                     by = list(nbaagg$Team),
                     FUN = sum)

I already tried to convert everything to vectors, so I don't understand why it still claims I have factors. Here is the output of str(nbaagg):

List of 22
 $ GamesPlayed        : int [1:530] 31 10 34 80 82 18 7 81 10 37 ...
 $ TotalMinutesPlayed : int [1:530] 588 121 425 2667 1908 196 23 2689 120 414 ...
 $ FieldGoalsMade     : int [1:530] 56 4 38 481 280 11 3 684 13 67 ...
 $ FieldGoalsAttempted: int [1:530] 157 18 110 809 487 36 10 1319 39 178 ...
 $ ThreesMade         : int [1:530] 41 2 25 0 3 6 0 10 3 32 ...
 $ ThreesAttempted    : int [1:530] 127 15 74 2 15 23 4 42 12 99 ...
 $ FreeThrowsMade     : int [1:530] 12 7 7 146 166 4 1 349 8 45 ...
 $ FreeThrowsAttempted: int [1:530] 13 10 9 292 226 4 2 412 12 60 ...
 $ OffensiveRebounds  : int [1:530] 5 3 11 392 166 3 1 252 11 3 ...
 $ TotalRebounds      : int [1:530] 48 25 61 760 598 19 4 744 26 24 ...
 $ Assists            : int [1:530] 20 8 66 124 184 5 6 194 13 25 ...
 $ Steals             : int [1:530] 17 1 13 119 72 1 2 43 1 5 ...
 $ Turnovers          : int [1:530] 14 4 28 138 121 6 2 144 8 33 ...
 $ Blocks             : int [1:530] 6 4 5 77 65 4 0 107 0 6 ...
 $ PersonalFouls      : int [1:530] 53 24 45 204 203 13 4 179 7 47 ...
 $ Disqualifications  : int [1:530] 0 0 0 3 0 0 0 0 0 0 ...
 $ TotalPoints        : int [1:530] 165 17 108 1108 729 32 7 1727 37 211 ...
 $ Technicals         : int [1:530] 1 1 0 2 3 0 0 1 0 0 ...
 $ Ejections          : int [1:530] 0 0 0 0 0 0 0 0 0 0 ...
 $ FlagrantFouls      : int [1:530] 0 0 0 0 0 0 0 0 0 0 ...
 $ GamesStarted       : int [1:530] 2 0 1 80 28 3 0 81 1 2 ...
 $ Team               : chr [1:530] "OKL" "PHO" "ATL" "OKL" ...

Solution

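  • Based on the output of str(nbaagg), nbaagg is a list of vectors and not a data.frame. It can be converted to a data.frame with as.data.frame (this works here because the list elements are all of equal length):

    nbaagg <- as.data.frame(nbaagg)
    

    Then we can use the formula interface of aggregate, which groups by Team and sums all remaining columns: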

    aggregate(.~ Team, nbaagg, FUN = sum, na.rm = TRUE, na.action = NULL)
    

    The object became a list in this step:

     nbaagg <- lapply(nbaagg, function(x) type.convert(as.numeric(x)))
    

    lapply always returns a list. To keep the attributes of the original dataset (so that it stays a data.frame), assign the result into nbaagg[] instead:

     nbaagg[] <- lapply(nbaagg, function(x) type.convert(as.numeric(x)))
    

    Alternatively, instead of looping with lapply, type.convert can be applied directly to the whole dataset, assuming the columns are all of class character:

    nbaagg <- type.convert(nbaagg, as.is = TRUE)
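
The points above can be tried on a small made-up dataset, without installing SportsAnalytics. This is only a sketch: the data frame nba_toy, its values, and the column names are hypothetical stand-ins for the real statistics. It shows that plain assignment of an lapply result gives a list, that assigning into [] keeps the data.frame, and that the formula interface of aggregate uses Team only for grouping, so sum is never applied to it.

```r
# Hypothetical toy data standing in for the NBA statistics (numbers stored as text)
nba_toy <- data.frame(
  Points  = c("10", "20", "5", "7"),
  Assists = c("1", "2", "3", "4"),
  Team    = c("OKL", "PHO", "OKL", "PHO"),
  stringsAsFactors = FALSE
)
stat_cols <- c("Points", "Assists")

# Plain assignment: lapply() returns a list, so the data.frame class is lost
as_list <- lapply(nba_toy[stat_cols], as.numeric)
class(as_list)   # "list"

# Assigning into [] replaces the columns but keeps the data.frame attributes
stats <- nba_toy[stat_cols]
stats[] <- lapply(stats, as.numeric)
class(stats)     # "data.frame"

# Add the grouping column back, as in the question
stats$Team <- nba_toy$Team

# Formula interface: Team is the grouping variable, all other columns are summed
team_totals <- aggregate(. ~ Team, data = stats, FUN = sum)
print(team_totals)
```

If the original call style with by = list(...) is preferred, the same idea applies: pass only the numeric columns as the first argument and keep Team out of the data being summed.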