Search code examples
powershellcsvcompareobject

Compare multiple CSV columns and return complete data from both in PowerShell


I have two spreadsheets with data which I want to compare to highlight matching and non-matching data with a few "gotchas" - some columns I want to ignore for comparison (score, testname), others I want to include (Filename and hash), and I want to passthru both sides (so even if it doesn't match, I want to do a "full outer" query). Here's my example data:

CSV1 (Source):

FileName, ServerName, TestName, Hash,  Score
C:\temp,  ServerA,    TestA,    12345, 100
C:\temp1, ServerA,    TestA,    12345, 100
C:\temp2, ServerA,    TestA,    12345, 5
C:\temp3, ServerA,    TestA,    8888,  100
C:\temp5, serverA,    TestB,    9999,  100

CSV2 (ComparisonTarget):

FileName, ServerName, TestName, Hash,  Score
C:\temp,  ServerB,    TestA,    12345, 100
C:\temp1, ServerB,    TestA,    5555,  100
C:\temp3, ServerB,    TestA,    8888,  100
C:\temp5, ServerB,    TestB,    9999,  100
C:\temp7, ServerB,    TestB,    5678,  22

I want to see if Filenames and hashes match - e.g. C:\temp and C:\temp5 would match and return all columns with a side indicator - e.g.

Output CSV

SourceFileName, SourceServer,DestinationServer, ..., Match, Score
C:\temp,        ServerA,     ServerB,                True, 100
C:\temp3,       ServerA,     ServerB,                True, 8888
C:\temp1,       ServerA,     ServerB,                False, 100
C:\temp2,       ServerA,     N/A,                    False, 5
C:\temp7,       N/A,         ServerB,                False, 22

I have used a ForEach($record in $CSV1) and checked for matching Filenames, then Hashes, but it's slow with large amounts of data (I suspect because my full outer query requires me to extend my hashtable with new members for all columns).

I figured Compare-object would be a good fit but that doesn't seem to allow me to include both sides in a passthru statement unless I'm missing something?


Solution

  • Using this Join-Object:

    $CSV1 | FullJoin $CSV2 -On FileName -Property @{
        SourceFileName = {If ($Left.FileName) {$Left.FileName} Else {$Right.FileName}}
        SourceServer = {$Left.ServerName}
        DestinationServer = {$Right.ServerName}
        Match = {$Left.Hash -eq $Right.Hash}
        Score = {If ($Left.$_ -gt $Right.$_) {$Left.$_} Else {$Right.$_}}
    } |Format-Table
    
    Match DestinationServer Score SourceFileName SourceServer
    ----- ----------------- ----- -------------- ------------
     True ServerB           100   C:\temp        ServerA
    False ServerB           100   C:\temp1       ServerA
    False                   5     C:\temp2       ServerA
     True ServerB           100   C:\temp3       ServerA
     True ServerB           100   C:\temp5       serverA
    False ServerB           22    C:\temp7