If I have a dataset that contains [City, Dealership, Total Cars Sold]. How would I get the top dealer in each city and the number of cars they sold?
The results should look like
City1 Dealership A 2000
City2 Dealership X 1000
etc.
I'm sure it's possible, but I'm not having any luck and it might because i'm approaching the problem the wrong way.
Currently i'm grouping by Dealership and City which creates a Frame<(string*string*int), int>
and that gets me
City1 Dealership A 1 -> 2000
City1 Dealership B 2 -> 1000
City2 Dealership X 3 -> 1000
City2 Dealership Y 4 -> 500
etc.
But trying to then get the dealership that does the most deals is where i'm stumped.
Thanks.
I adapted Tomas's answer and output the type as Series<string, (string * int)>
let data = series [
("City1", "Dealership A") => 2000
("City1", "Dealership B") => 1000
("City2", "Dealership X") => 1000
("City2", "Dealership Y") => 500 ]
data
|> Series.groupBy (fun k _ -> fst k)
|> Series.mapValues (fun sr ->
let sorted = sr |> Series.sortBy(fun x -> -x)
let key = sorted |> Series.firstKey |> snd
let value = sorted |> Series.firstValue
key, value )
The output looks like
City1 -> (Dealership A, 2000)
City2 -> (Dealership X, 1000)
EDITED
I assume you have a csv file like this
City,Dealership,TotalCarsSold
City1,Dealership A,2000
City1,Dealership B,1000
City2,Dealership X,1000
City2,Dealership Y,500
This is how I'll do it. Read it as Frame
and get the column as Series
and apply the same code above to get result.
let df =
Frame.ReadCsv("C:/Temp/dealership.csv")
|> Frame.indexRowsUsing(fun r -> r.GetAs<string>("City"), r.GetAs<string>("Dealership"))
df?TotalCarsSold
|> Series.groupBy (fun k _ -> fst k)
|> Series.mapValues (fun sr ->
let sorted = sr |> Series.sortBy(fun x -> -x)
let key = sorted |> Series.firstKey |> snd
let value = sorted |> Series.firstValue
key, value )