Search code examples
pythonexceldata-analysis

Sum by common ID and value in Excel or Python


I have experience with both python and excel and feel comfortable with both so whichever presents a better solution I will use. I have a CSV which I have pasted in this google sheet so you guys can see (https://docs.google.com/spreadsheets/d/104nstdSWUWZa22NbCcdeymcITntev346omPhttrL1oc/edit?usp=sharing). What I want to do is take the teams in column B and find the sum of their highest 5 person combination from column M not including any player with position "P" in column D. So essentially I would like something that could find the 5 highest values in column M associated with the same team in column B and sum them together. Thank you in advance for the help!


Solution

  • You can use some custom m-code for the group aggregation to return the sum of the top 5.

    Here is an example using your data:

    M Code

    let
        Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
        #"Changed Type" = Table.TransformColumnTypes(Source,{{"Name", type text}, {"Team", type text}, {"Opponent", type text}, {"Position", type text}, {"DFS ID", Int64.Type}, {"SS Projection", type number}, {"My Projection", type any}, {"Actual", type any}, {"Price", Int64.Type}, {"Value", type number}, {"SS Own", type number}, {"My Own", type any}, {"dk_points", type number}, {"dk_25_percentile", type number}, {"dk_50_percentile", type number}, {"dk_75_percentile", type number}, {"dk_85_percentile", type number}, {"dk_95_percentile", type number}, {"dk_99_percentile", type number}, {"fd_points", type number}, {"fd_25_percentile", Int64.Type}, {"fd_50_percentile", type number}, {"fd_75_percentile", type number}, {"fd_85_percentile", type number}, {"fd_95_percentile", type number}, {"fd_99_percentile", type number}, {"fdraft_points", type number}, {"fdraft_25_percentile", type number}, {"fdraft_50_percentile", type number}, {"fdraft_75_percentile", type number}, {"fdraft_85_percentile", type number}, {"fdraft_95_percentile", type number}, {"fdraft_99_percentile", type number}, {"yahoo_points", type number}, {"yahoo_25_percentile", type number}, {"yahoo_50_percentile", type number}, {"yahoo_75_percentile", type number}, {"yahoo_85_percentile", type number}, {"yahoo_95_percentile", type number}, {"yahoo_99_percentile", type number}, {"dk_std", type number}, {"fd_std", type number}, {"fdraft_std", type number}, {"yahoo_std", type number}, {"BatOrder", Int64.Type}, {"PA", type number}, {"Singles", type number}, {"Doubles", type number}, {"Triples", type number}, {"HR", type number}, {"R", type number}, {"RBI", type number}, {"BB", type number}, {"SB", type number}, {"CS", type number}, {"IP", type number}, {"Wins", type number}, {"QS", type number}, {"K", type number}, {"CG", type number}, {"CGSO", type number}, {"Pitches", type number}}),
    //remove position P
        #"Filtered Rows" = Table.SelectRows(#"Changed Type", each ([Position] <> "P")),
    
    //Group and use a custom aggregation
        #"Grouped Rows" = Table.Group(#"Filtered Rows", {"Team"}, {
            {"Sum of Top 5", each List.Sum(List.MaxN([dk_points],5)), type number}
            })
    in
        #"Grouped Rows"
    

    enter image description here

    Note: The above will return the SUM of the top 5 scores. So scores of {8,7,6,5,5,5,5,4,3} => 31. If you would want to return 45, that would be a different algorithm.