
How to find the most frequent integer in a slice of structs with Golang


*** Disclaimer: I'm not a professional developer. I've been tinkering with Golang for about 8 months (Udemy + YouTube) and still have no idea how to solve a simple problem like the one below.

Here's a summary of the problem:

  • I'm trying to find the most frequent "age" in a slice of structs decoded from a .json file (each entry contains a string "name" and an integer "age").

  • After that, I need to print the "name" of everyone who has that most frequent "age".

  • The printed names need to be sorted alphabetically.

Input (.json) :

[
{"name": "John","age": 15},
{"name": "Peter","age": 12},
{"name": "Roger","age": 12},
{"name": "Anne","age": 44},
{"name": "Mary","age": 15},
{"name": "Nancy","age": 15}
]

Output : ['John', 'Mary', 'Nancy'].

Explanation: Because the most frequent age in the data is 15 (it occurs 3 times), the output should be an array of strings with the names of those three people, in this case ['John', 'Mary', 'Nancy'].

Exception :

  • If several "age" values share the same maximum occurrence count, I need to split the data and print each group as a separate array (e.g. when Anne's age is 12, the result is: ['John', 'Mary', 'Nancy'], ['Anne','Peter','Roger']).

This is what I've tried (in Golang):

package main
import (
    "encoding/json"
    "fmt"
    "os"
    "sort"
)
// 1. preparing the empty struct for .json
type Passanger struct {
    Name string `json:"name"`
    Age  int    `json:"age"`
}
func main() {
    // 2. load the json file
    content, err := os.ReadFile("passanger.json")
    if err != nil {
        fmt.Println(err.Error())
    }
    // 3. parse json file to slice
    var passangers []Passanger
    err2 := json.Unmarshal(content, &passangers)
    if err2 != nil {
        fmt.Println("Error JSON Unmarshalling")
        fmt.Println(err2.Error())
    }
    // 4. find most frequent age numbers (?)
    for _, v := range passangers {
        // this code only show the Age on every line
        fmt.Println(v.Age)
    }
    // 5. print the names and sort them alphabetically (?)
       // use the sort.Slice function
       // group by "max-occurrence age"
}

I've been stuck since yesterday. I've also tried to adapt solutions from coding challenge questions, such as:

func majorityElement(arr []int) int {
    sort.Ints(arr)
    return arr[len(arr)/2]
}

but I'm still struggling to understand how to pass the "age" values from the Passanger slice as the integer input (arr []int) to the code above.

Another solution I've found online iterates through a map[int]int to find the maximum frequency:

func main(){
    arr := []int{90, 70, 30, 30, 10, 80, 40, 50, 40, 30}
    freq := make(map[int]int)
    for _ , num :=  range arr {
        freq[num] = freq[num]+1
    }
    fmt.Println("Frequency of the Array is : ", freq)
}

but then again, the .json file contains not only integers (age) but also strings (name), and I still don't know how to handle "name" and "age" separately.

I really need some proper guidance here.

*** Here's the source code (main.go) and the .json file that I mentioned above: https://github.com/ariejanuarb/golang-json


Solution

  • What to do before implementing a solution

    During my first years of college, my teachers would always repeat one thing to me and my classmates: don't write code first, especially if you are a beginner. Follow these steps instead:

    • Write what you want to happen
    • Break the problem down into small steps
    • Write down all scenarios and cases where they branch out
    • Write the input and output (method/function signature) of each step
    • Check that they fit together

    Let's follow these steps...

    Write what you want to happen

    You have defined your problem well, so I will skip this step.

    Let's detail this further

    1. You have a passenger list
    2. You want to group the passengers by their age
    3. You want to find which age groups are the most common, i.e. which have the most passengers
    4. You want to print the names in alphabetical order

    Branching out

    • Scenario one: one group is bigger than all the others.
    • Scenario two: two or more groups have the same size and are bigger than the others.

    There might be more scenarios, but those are yours to find.

    Input and output

    Now that we know what we must do, we are going to check the input and output of each step needed to achieve this goal.

    The steps:

    1. You have a passenger list
    • input => none or filename (string)
    • output => []Passenger
    2. You want to group the passengers by their age
    • input => []Passenger // passenger list
    • output => map[int][]int or map[int][]*Passenger // ageGroups

    The first type, the one inside the brackets, is the map key: the age shared by the whole group.

    The second type is a list that contains either:

    • the passenger's position within the list
    • the address of the passenger object in memory

    Which one you pick is not important, as long as we can easily retrieve the passenger from the list without iterating over it again.
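    The answer's code uses the position-based map. As a minimal sketch of the pointer-based alternative, the hypothetical helper GroupByAgePtr below (the name is an assumption, not part of the original answer) stores the address of each slice element. It indexes into the slice on purpose: taking the address of the range loop variable would point at a copy rather than at the element in the list.

```go
package main

import "fmt"

// Passenger mirrors the struct used in this answer.
type Passenger struct {
	Name string `json:"name"`
	Age  int    `json:"age"`
}

// GroupByAgePtr is a hypothetical pointer-based variant of the grouping step:
// age => pointers to the passengers of that age.
func GroupByAgePtr(passengers []Passenger) map[int][]*Passenger {
	groups := make(map[int][]*Passenger)
	for i := range passengers {
		p := &passengers[i] // address of the slice element itself
		groups[p.Age] = append(groups[p.Age], p)
	}
	return groups
}

func main() {
	passengers := []Passenger{{"John", 15}, {"Peter", 12}, {"Roger", 12}, {"Nancy", 15}}
	groups := GroupByAgePtr(passengers)
	// Prints John, then Nancy (the order in which they appear in the list).
	for _, p := range groups[15] {
		fmt.Println(p.Name)
	}
}
```

    One caveat with this variant: the pointers refer to the slice's backing array, so appending to the passenger list after grouping may leave them pointing at stale data.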

    3. You want to find which age groups are the most common, i.e. which have the most passengers
    • input => groups (ageGroups)

    Here scenarios 1 and 2 branch out, which means the output must either be valid for all scenarios or we must use a condition to handle them separately.

    • output for scenario 1 => most common age (int)
    • output for scenario 2 => most common ages ([]int)

    We can see that the output for scenario 1 can be merged into the output for scenario 2: a single age is just a slice with one element.
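    To illustrate why one return type covers both scenarios, here is a small sketch that works on a plain count per age (a map[int]int like the frequency map in the question). The helper name MostFrequentAges is hypothetical, and this two-pass approach is only one possible way to do it.

```go
package main

import (
	"fmt"
	"sort"
)

// MostFrequentAges is a hypothetical helper: it returns every age whose count
// equals the highest count, sorted in ascending order.
func MostFrequentAges(freq map[int]int) []int {
	// First pass: find the highest count.
	best := 0
	for _, count := range freq {
		if count > best {
			best = count
		}
	}
	// Second pass: collect every age that reaches it.
	var ages []int
	for age, count := range freq {
		if count == best {
			ages = append(ages, age)
		}
	}
	sort.Ints(ages) // map iteration order is random, so sort for a stable result
	return ages
}

func main() {
	// Scenario 1: a single winner is a slice of length one.
	fmt.Println(MostFrequentAges(map[int]int{15: 3, 12: 2, 44: 1})) // [15]
	// Scenario 2: a tie returns every winning age.
	fmt.Println(MostFrequentAges(map[int]int{15: 3, 12: 3})) // [12 15]
}
```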

    4. You want to print the names of all passengers in an ageGroup in alphabetical order

      • input => groups (ageGroups) + ages ([]int) + passenger list ([]Passenger).
      • output => string or []byte, or nothing if you just print it.

      To be honest, you can skip writing a signature for this one if you want to.
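    If you do want a function for this step, a minimal sketch could look like the following. It assumes the position-based groups (map[int][]int) described above, and the helper name NamesInGroup is hypothetical. It sorts only the names of one group, so the full passenger list does not need to be sorted.

```go
package main

import (
	"fmt"
	"sort"
)

// Passenger mirrors the struct used in this answer.
type Passenger struct {
	Name string `json:"name"`
	Age  int    `json:"age"`
}

// NamesInGroup is a hypothetical helper: it looks up the passengers at the
// given positions and returns their names in alphabetical order.
func NamesInGroup(passengers []Passenger, positions []int) []string {
	names := make([]string, 0, len(positions))
	for _, i := range positions {
		names = append(names, passengers[i].Name)
	}
	sort.Strings(names)
	return names
}

func main() {
	passengers := []Passenger{{"Nancy", 15}, {"Peter", 12}, {"John", 15}, {"Mary", 15}}
	// Positions 0, 2 and 3 are the passengers aged 15.
	fmt.Println(NamesInGroup(passengers, []int{0, 2, 3})) // [John Mary Nancy]
}
```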

    After checking, time to code

    Let's create functions that fit our signatures first:

    package main

    import (
        "encoding/json"
        "fmt"
        "os"
        "sort"
    )

    type Passenger struct {
        Name string `json:"name"`
        Age  int    `json:"age"`
    }
    
    // Load and decode the passenger list; returns nil if anything fails.
    func GetPassengerList() []Passenger {
        // 2. load the json file
        content, err := os.ReadFile("passanger.json")
        if err != nil {
            fmt.Println(err.Error())
            return nil
        }

        // 3. parse the json file into a slice
        var passengers []Passenger
        if err := json.Unmarshal(content, &passengers); err != nil {
            fmt.Println("Error JSON Unmarshalling")
            fmt.Println(err.Error())
            return nil
        }

        return passengers
    }
    
    // 4a. Group by age: age => positions of the passengers in the list
    func GroupByAge(passengers []Passenger) map[int][]int {
        group := make(map[int][]int)

        for index, passenger := range passengers {
            // appending to a missing key works: its zero value is a nil slice
            group[passenger.Age] = append(group[passenger.Age], index)
        }

        return group
    }
    
    // 4b. Find the most frequent ages (more than one if there is a tie)
    func FindMostCommonAge(ageGroups map[int][]int) []int {
        mostFrequentAges := make([]int, 0)
        biggestGroupSize := 0

        for age, ageGroup := range ageGroups {
            if biggestGroupSize < len(ageGroup) {
                // new biggest group: restart the list with this age
                biggestGroupSize = len(ageGroup)
                mostFrequentAges = []int{age}
            } else if biggestGroupSize == len(ageGroup) {
                // ties with the biggest group found so far
                mostFrequentAges = append(mostFrequentAges, age)
            }
            // smaller groups are ignored
        }

        // map iteration order is random, so sort for a stable result
        sort.Ints(mostFrequentAges)
        return mostFrequentAges
    }
    
    func main() {
        passengers := GetPassengerList()

        // Sorting the whole list up front keeps every age group in name order.
        // For better performance you could sort only the winning groups before printing.
        sort.Slice(passengers, func(i, j int) bool {
            if passengers[i].Age == passengers[j].Age {
                return passengers[i].Name < passengers[j].Name
            }
            return passengers[i].Age < passengers[j].Age
        })
    
        // age => []position
        // The length of each slice is the number of occurrences of that age
        ageGrouper := GroupByAge(passengers)
    
        mostFrequentAges := FindMostCommonAge(ageGrouper)
    
        // print the passenger
        for _, age := range mostFrequentAges {
            fmt.Println("{")
            for _, passengerIndex := range ageGrouper[age] {
                fmt.Println("\t", passengers[passengerIndex].Name)
            }
            fmt.Println("}")
        }
    }
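
    The code above prints each group between braces, one name per line. If you need the exact format from the question, a small formatting helper is one option. This is only a sketch, and the name FormatNames is hypothetical:

```go
package main

import (
	"fmt"
	"strings"
)

// FormatNames is a hypothetical helper that renders a list of names in the
// style shown in the question, e.g. ['John', 'Mary', 'Nancy'].
func FormatNames(names []string) string {
	quoted := make([]string, len(names))
	for i, name := range names {
		quoted[i] = "'" + name + "'"
	}
	return "[" + strings.Join(quoted, ", ") + "]"
}

func main() {
	fmt.Println(FormatNames([]string{"John", "Mary", "Nancy"})) // ['John', 'Mary', 'Nancy']
}
```

    In main, you would collect the names of each winning age group into a []string and pass it to this helper instead of printing the names one by one.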