
How do you find stats of a column chart in R?


My dataset has two variables:

1) Match_City (name of the city where a soccer match is played)
2) Home_score (number of goals scored by the home team)

The problem is that there are more than 1500 cities, so it's not possible to make out from the chart which cities have the highest total Home_score. I want to know if I can get the data behind that column chart, e.g. that City A has a total of 20 goals from the home team. I am currently using geom_col() to make the column chart. I need help!


Solution

  • It is a little difficult to understand your question, and as others have noted, you should try to give us a reproducible example with code.

    However, if I understand your question correctly...

    You have a large dataframe with two columns (Match_City and Home_score) and you have made a column chart to compare the totals of Home_score for each Match_City.

    Now you can see visually which Match_City has the highest total Home_score, but you would like R to calculate these totals in a form that you can work with. The base R aggregate() function is your best bet here: it computes a summary (such as the sum) of one column for each group in another.

    Some example code:

    # Let's create some example data (set.seed makes the random sample reproducible)
    set.seed(1)
    df <- data.frame(
      Match_City = sample(LETTERS[1:5], size = 100, replace = TRUE),
      Home_score = sample(1:6, size = 100, replace = TRUE)
    )

    # aggregate() finds the sum of Home_score for each Match_City
    score_summary <- aggregate(Home_score ~ Match_City, data = df, FUN = sum)

    # Sort score_summary so that the Home_score sums are in decreasing order
    score_summary <- score_summary[order(score_summary$Home_score, decreasing = TRUE), ]
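
With more than 1500 cities, you will probably want only the top few rows of the sorted summary, or the total for one particular city. The following is a minimal sketch of both, using base R only. The random data, the seed, the cutoff `top_n_cities` and the city name "A" are assumptions for illustration and stand in for your real data.

```r
# Hypothetical data standing in for the real match table
set.seed(42)
df <- data.frame(
  Match_City = sample(LETTERS[1:5], size = 100, replace = TRUE),
  Home_score = sample(1:6, size = 100, replace = TRUE)
)

# Total Home_score per city, sorted from highest to lowest
score_summary <- aggregate(Home_score ~ Match_City, data = df, FUN = sum)
score_summary <- score_summary[order(score_summary$Home_score, decreasing = TRUE), ]

# Keep only the highest-scoring cities (the cutoff is an arbitrary choice)
top_n_cities <- 3
top_cities <- head(score_summary, top_n_cities)
print(top_cities)

# Look up the total for a single city by name
city_a_total <- score_summary$Home_score[score_summary$Match_City == "A"]
print(city_a_total)
```

If you still want a chart, a smaller data frame such as `top_cities` can be passed to geom_col() in place of the full data, which should keep the plot readable.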