Search code examples
powershellpowershell-4.0

Compare column values and merge


Iam trying to merge two csv files with common column name but one has 22 row and other just 16.

1st CSV                                2nd CSV

Name       Service_StatusA              Name         Service_StatusB 
IEClient     running                   IEClient          Manual
IE Nomad     running                   ​IE Nomad          running
Data Usage   running                   Print Spooler     Manual
Print Spooler running                  Server            running
Server        running

I want to merge this to a single csv

Name           Service_StatusA     Service_StatusB
IEClient          running             Manual
IE Nomad          running             running
Data Usage        running
Print Spooler     running             Manual
Server            running             running
$file1 = Import-Csv -Path .\PC1.csv
$file2 = Import-Csv -Path .\PC2.csv
$report = @()

    foreach ($line in $file1) 
    {
        $match = $file2 | Where-Object {$_.Name -eq $line.Name}
        if ($match)
        {
            $row = "" | Select-Object 'Name','Service_StatusA','Service_StatusA',
            $row.Name = $line.Name
            $row.'Service_StatusA' = $line.'Service_StatusA'
            $row.'Service_StatusB' = $match.'Service_StatusB'
            $report += $row
        }
    }
$report | export-csv .\mergetemp.csv -notype -force

how to compare the row values before merging


Solution

  • In SQL database terms you want a left join, and your code is doing an inner join. In set terms, you are doing an intersection of 1.csv and 2.csv (only the rows which appear in both) but you want to be doing a union of 1.csv + the intersection (all rows from 1.csv with only matching lines from 2.csv).

    You want every row in the first csv to be a row in the output csv. That should be the start - always output something in your loop. At the moment you output from the if() test. You want matching rows in the second csv to have their data added in if they exist, but not to change the amount of output.

    $file1 = Import-Csv -Path .\PC1.csv
    $file2 = Import-Csv -Path .\PC2.csv
    
    $report = foreach ($line in $file1) 
        {
            # always make an output line for each row in file1
            $row = "" | Select-Object 'Name','Service_StatusA','Service_StatusA',
            $row.Name = $line.Name
            $row.'Service_StatusA' = $line.'Service_StatusA'
    
            # if there is a matching line in file2, add its data in    
            $match = $file2 | Where-Object {$_.Name -eq $line.Name}
            if ($match)
            {
                $row.'Service_StatusB' = $match.'Service_StatusB'                
            }
    
            # always have output a row for a row in file1
            $row
        }
    $report | export-csv .\mergetemp.csv -notype -force
    

    (It is possible that what you want is a SQL outer join where rows in 2.csv that are not in 1.csv also create an output row, but your example does not show that).

    (I took out $report += because it's more code that runs slower, which is an annoying combination).