Tags: powershell, set-difference, compareobject

Find what is different in two very large lists


I have two lists of about 1,000 people each. I want to find the people who appear in one list but not in the other.

$BunchoEmail = Import-Csv C:\temp\Directory.csv | Select-Object primaryEmail -ExpandProperty primaryEmail

$GoogleUsers = gam print users fields suspended | ConvertFrom-Csv | Where-Object suspended -ne $true | Select-Object primaryEmail -ExpandProperty primaryEmail

$objects = @{
    ReferenceObject  = $GoogleUsers
    DifferenceObject = $BunchoEmail
}
Compare-Object @objects

The above didn't produce what I wanted.

What is the best way to find the differences between the two lists?


Solution

  • Load each list into a [hashtable]:

    $emailTable = @{}
    $BunchoEmail |ForEach-Object { $emailTable[$_] = $_ }
    
    $gsuiteTable = @{}
    $GoogleUsers |ForEach-Object { $gsuiteTable[$_] = $_ }
    

    Now you can iterate over one list with Where-Object and keep only the email addresses that are not keys in the other list's hashtable:

    $notInGSuite = $BunchoEmail |Where-Object { -not $gsuiteTable.ContainsKey($_) }
    
    $notInEmailList = $GoogleUsers |Where-Object { -not $emailTable.ContainsKey($_) }
    

    A ContainsKey() lookup on a hashtable takes O(1) time on average, so each comparison pass is linear in the size of the list and stays fast even with thousands of emails.
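
To see the approach end to end, here is a minimal sketch run on made-up data. The three-address sample lists are hypothetical stand-ins for the real CSV and gam output. The tables store $true as the value instead of the address itself, which is an arbitrary choice because only the keys matter for ContainsKey().

```powershell
# Hypothetical sample data standing in for the two real lists.
$BunchoEmail = 'ann@example.com', 'bob@example.com', 'cat@example.com'
$GoogleUsers = 'bob@example.com', 'CAT@example.com', 'dan@example.com'

# Build one lookup table per list. Only the keys matter, so the value is just $true.
# A hashtable created with @{} compares string keys case-insensitively.
$emailTable = @{}
$BunchoEmail | ForEach-Object { $emailTable[$_] = $true }

$gsuiteTable = @{}
$GoogleUsers | ForEach-Object { $gsuiteTable[$_] = $true }

# Keep only the addresses that are missing from the other table.
# @() forces an array even when there are zero or one results.
$notInGSuite    = @($BunchoEmail | Where-Object { -not $gsuiteTable.ContainsKey($_) })
$notInEmailList = @($GoogleUsers | Where-Object { -not $emailTable.ContainsKey($_) })

"Only in the CSV: $($notInGSuite -join ', ')"       # ann@example.com
"Only in Google:  $($notInEmailList -join ', ')"    # dan@example.com
```

Note that cat@example.com and CAT@example.com are treated as the same address here, which is usually what you want for email. If the real data may contain stray whitespace, trimming each value before using it as a key would be a sensible extra step.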