java kotlin collections java-stream

Stream two collections and collect a map based on common property


I have looked around a bit but have not found an elegant solution. I am looking for a nice, streamlined (pun intended) way to create a map from two collections whose elements share one property (an id of sorts). Each map entry should pair the matching objects: the element from the first collection as key and the element from the second as value.

Currently I iterate over collection 1 with forEach and, inside it, over collection 2 with another forEach to find the matching model, and then perform an operation on the pair. I would like to build the map first and then, as a separate step, perform the operation on every pair in the map. I have tried to come up with a simple example to make this clearer.

data class FirstNameModel(val idNumber: String, val firstName: String)
data class LastNameModel(val idNumber: String, val lastName: String)

val randomFirstNameList = listOf(
   FirstNameModel("5631ab", "Bob"),
   FirstNameModel("ca790a", "George"),
   FirstNameModel("j8f1sa", "Alice")
)

val randomLastNameList = listOf(
   LastNameModel("j8f1sa", "Smith"),
   LastNameModel("5631ab", "Johnson"),
   LastNameModel("ca790a", "Takai")
)

// stream function to correctly create map (not just a null one like below).

val map: Map<FirstNameModel, LastNameModel>? = null

fun printIt() {
   map?.forEach {
       println("Name for id ${it.key.idNumber} is ${it.key.firstName} ${it.value.lastName}")
   }
//    should print something like:
//            Name for id 5631ab is Bob Johnson
//            Name for id ca790a is George Takai
//            Name for id j8f1sa is Alice Smith
}

I have been trying this in Kotlin for now, but it is a use case I sometimes also have in Java, so I am curious about solutions for both.


Solution

  • This works, including for the cases mentioned in @Sweeper's comment below the question (thanks!):

    val map: Map<FirstNameModel, LastNameModel> = randomFirstNameList.map { it.idNumber }
      .plus(randomLastNameList.map { it.idNumber })
      .distinct()
      .associate { idNumber ->
        (randomFirstNameList.firstOrNull { it.idNumber == idNumber } ?: FirstNameModel(idNumber, "")) to
        (randomLastNameList.firstOrNull { it.idNumber == idNumber } ?: LastNameModel(idNumber, ""))
      }
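
Since the question also asks about Java, here is a minimal sketch (assuming Java 16+ for records) of building the requested Map<FirstNameModel, LastNameModel> with streams. The names PairById and pairById are made up for illustration. Unlike the Kotlin snippet above, which fills in placeholder objects, this variant simply skips entries that have no counterpart in the other list, and it assumes ids are unique within the last-name list.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class PairById {
    // Java counterparts of the Kotlin data classes from the question.
    record FirstNameModel(String idNumber, String firstName) {}
    record LastNameModel(String idNumber, String lastName) {}

    // Pairs each first-name entry with the last-name entry that has the same id.
    // Entries without a counterpart are skipped.
    static Map<FirstNameModel, LastNameModel> pairById(
            List<FirstNameModel> firstNames, List<LastNameModel> lastNames) {
        // Index the last names by id once. Assumption: ids are unique in this list
        // (Collectors.toMap throws IllegalStateException on duplicate keys).
        Map<String, LastNameModel> lastById = lastNames.stream()
                .collect(Collectors.toMap(LastNameModel::idNumber, Function.identity()));

        return firstNames.stream()
                .filter(first -> lastById.containsKey(first.idNumber()))
                .collect(Collectors.toMap(
                        Function.identity(),
                        first -> lastById.get(first.idNumber()),
                        (a, b) -> a,            // keep the first pair if an equal key repeats
                        LinkedHashMap::new));   // preserve the order of firstNames
    }

    public static void main(String[] args) {
        List<FirstNameModel> firstNames = List.of(
                new FirstNameModel("5631ab", "Bob"),
                new FirstNameModel("ca790a", "George"),
                new FirstNameModel("j8f1sa", "Alice"));
        List<LastNameModel> lastNames = List.of(
                new LastNameModel("j8f1sa", "Smith"),
                new LastNameModel("5631ab", "Johnson"),
                new LastNameModel("ca790a", "Takai"));

        pairById(firstNames, lastNames).forEach((first, last) ->
                System.out.println("Name for id " + first.idNumber() + " is "
                        + first.firstName() + " " + last.lastName()));
    }
}
```

With the sample lists from the question, main prints the three lines expected by printIt, in the order of the first-name list.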
    

    But it would probably make sense to introduce a data class to hold the result:

    data class NameModel(val idNumber: String, val firstName: String, val lastName: String)
    
    val map: List<NameModel> = randomFirstNameList.map { it.idNumber }
      .plus(randomLastNameList.map { it.idNumber })
      .distinct()
      .map { idNumber ->
        NameModel(
          idNumber,
          randomFirstNameList.firstOrNull { it.idNumber == idNumber }?.firstName ?: "",
          randomLastNameList.firstOrNull { it.idNumber == idNumber }?.lastName ?: ""
        )
      }
    

    Output when the last-name list additionally contains an entry without a corresponding first-name entry, e.g. LastNameModel("999999", "NoFirstName"):

    NameModel(idNumber=5631ab, firstName=Bob, lastName=Johnson)
    NameModel(idNumber=ca790a, firstName=George, lastName=Takai)
    NameModel(idNumber=j8f1sa, firstName=Alice, lastName=Smith)
    NameModel(idNumber=999999, firstName=, lastName=NoFirstName)
    

    Additional remark:

    If the two lists are very large, the repeated firstOrNull calls become expensive: each call scans a list linearly, so the total work grows with the product of the list sizes. In that case it makes sense to build lookup maps for the first and last names up front:

    val firstNameMap = randomFirstNameList.associate { it.idNumber to it.firstName }
    val lastNameMap = randomLastNameList.associate { it.idNumber to it.lastName }
    
    val map: List<NameModel> = randomFirstNameList.map { it.idNumber }
      .plus(randomLastNameList.map { it.idNumber })
      .distinct()
      .map { idNumber ->
        NameModel(
          idNumber,
          firstNameMap.getOrElse(idNumber) { "" },
          lastNameMap.getOrElse(idNumber) { "" }
        )
      }
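
For Java, which the question also asks about, the lookup-map approach might look roughly like this. It is a sketch assuming Java 16+ (records). MergeNames and merge are illustrative names, ids are assumed to be unique within each list, and names are assumed to be non-null (Collectors.toMap rejects duplicate keys and null values).

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class MergeNames {
    // Java counterparts of the Kotlin data classes used above.
    record FirstNameModel(String idNumber, String firstName) {}
    record LastNameModel(String idNumber, String lastName) {}
    record NameModel(String idNumber, String firstName, String lastName) {}

    // Merges both lists by id; a missing first or last name becomes "".
    static List<NameModel> merge(List<FirstNameModel> firstNames, List<LastNameModel> lastNames) {
        // Build both lookup maps once instead of scanning the lists for every id.
        Map<String, String> firstNameById = firstNames.stream()
                .collect(Collectors.toMap(FirstNameModel::idNumber, FirstNameModel::firstName));
        Map<String, String> lastNameById = lastNames.stream()
                .collect(Collectors.toMap(LastNameModel::idNumber, LastNameModel::lastName));

        // Union of all ids: first-list ids first, then ids that only occur in the second list.
        return Stream.concat(
                        firstNames.stream().map(FirstNameModel::idNumber),
                        lastNames.stream().map(LastNameModel::idNumber))
                .distinct()
                .map(id -> new NameModel(
                        id,
                        firstNameById.getOrDefault(id, ""),
                        lastNameById.getOrDefault(id, "")))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<FirstNameModel> firstNames = List.of(
                new FirstNameModel("5631ab", "Bob"),
                new FirstNameModel("ca790a", "George"),
                new FirstNameModel("j8f1sa", "Alice"));
        List<LastNameModel> lastNames = List.of(
                new LastNameModel("j8f1sa", "Smith"),
                new LastNameModel("5631ab", "Johnson"),
                new LastNameModel("ca790a", "Takai"),
                new LastNameModel("999999", "NoFirstName")); // entry without a first name

        merge(firstNames, lastNames).forEach(System.out::println);
    }
}
```

Each lookup is then a constant-time hash access on average, so the merge is roughly linear in the combined size of the two lists.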