Search code examples
powershellazure-powershell

Powershell compare two datasets using arrays or hashtable


enter image description hereUsing PowerShell I have two compare two datasets one is from CSV file and other is from a sql query result:

Sql query result:

AppName Metadata1 Metadata2
App1 ab3245 1235
App1 ab4533 4645
App2 ab1234 4101
App3 ab4245 4101
App2 ab5245 6534
App4 ab5245 2312
App5 ab3245 1235

CSV data:

Metadata1 Metadata2
ab4245 7890
ab3245 3412
ab3245 1235
ab4245 7777
ab4533 4645
ab4533 2345
ab5245 1929
ab4533 4645
ab9988 3344
ab1234 4102
ab9988 2233
ab1234 4103

I need to compare for each AppName how the metadata1 & Metadata2 values are different with CSV file?

I tried storing data in two separate arrays and then tried comparing as:

Compare-Object -ReferenceObject $sqlarray -DifferenceObject $csvarray

But above code just returns the differences between two arrays.

Updating expected output:

This is to identify any changes available in CSV file but missing in DB. Expected output for every App’s name metadata1 & metadata2 when compared with CSV file metadata1 & metadata2 only difference should display like below table:

AppName Metadata1 Metadata2
App1 ab3245 3412
App1 ab3245 1235
App1 ab4533 2345
App2 ab1234 4102
App2 ab1234 4103
App2 ab5245 1929
App3 ab4245 7890
App3 ab4245 7777
App4 ab4245 7890
App4 ab4245 7777
App5 ab3245 3412

Solution

  • As per my comment, using the datasets you provided.

    'Metadata1  Metadata2
    ab4245  7890
    ab3245  3412
    ab3245  1235
    ab4245  7777
    ab4533  4645
    ab4533  2345
    ab5245  1929
    ab4533  4645
    ab9988  3344
    ab1234  4102
    ab9988  2233
    ab1234  4103' | 
    Out-File -FilePath 'D:\Temp\CsvDataList'
    
    
    'AppName    Metadata1   Metadata2
    App1    ab3245  1235
    App1    ab4533  4645
    App2    ab1234  4101
    App3    ab4245  4101
    App2    ab5245  6534
    App4    ab5245  2312
    App5    ab3245  1235'| 
    Out-File -FilePath 'D:\Temp\SqlDataList'
    
    
       
    'Metadata1  Metadata2
    ab3245  1235
    ab4533  4645
    ab1234  4101
    ab4245  4101
    ab5245  6534
    ab5245  2312
    ab3245  1235'| 
    Out-File -FilePath 'D:\Temp\SqlDataListNoAppName'
    
    $CsvMetaList = (Import-Csv -Path 'D:\Temp\CsvDataList' -Delimiter "`t")
    $SqlMetaList = (Import-Csv -Path 'D:\Temp\SqlDataList' -Delimiter "`t") 
    $SqlMetaList = (Import-Csv -Path 'D:\Temp\SqlDataListNoAppName' -Delimiter "`t") 
    

    Using Compare-Object of the arrays.

    Compare-Object -ReferenceObject $CsvMetaList -DifferenceObject $SqlMetaList -IncludeEqual
    # Results
    <#
    InputObject                         SideIndicator
    -----------                         -------------
    @{Metadata1=ab4245; Metadata2=7890} ==           
    @{Metadata1=ab3245; Metadata2=3412} ==           
    @{Metadata1=ab3245; Metadata2=1235} ==           
    @{Metadata1=ab4245; Metadata2=7777} ==           
    @{Metadata1=ab4533; Metadata2=4645} ==           
    @{Metadata1=ab4533; Metadata2=2345} ==           
    @{Metadata1=ab5245; Metadata2=1929} == 
         
    @{Metadata1=ab4533; Metadata2=4645} <=           
    @{Metadata1=ab9988; Metadata2=3344} <=           
    @{Metadata1=ab1234; Metadata2=4102} <=           
    @{Metadata1=ab9988; Metadata2=2233} <=           
    @{Metadata1=ab1234; Metadata2=4103} <= 
    #>
    
    Compare-Object -ReferenceObject $SQlMetaList -DifferenceObject $CsvMetaList -IncludeEqual
    # Results
    <#
    @{AppName=App1; Metadata1=ab3245; Metadata2=1235} ==           
    @{AppName=App1; Metadata1=ab4533; Metadata2=4645} ==           
    @{AppName=App2; Metadata1=ab1234; Metadata2=4101} ==           
    @{AppName=App3; Metadata1=ab4245; Metadata2=4101} ==           
    @{AppName=App2; Metadata1=ab5245; Metadata2=6534} ==           
    @{AppName=App4; Metadata1=ab5245; Metadata2=2312} ==           
    @{AppName=App5; Metadata1=ab3245; Metadata2=1235} ==
               
    @{Metadata1=ab4533; Metadata2=4645}               =>           
    @{Metadata1=ab9988; Metadata2=3344}               =>           
    @{Metadata1=ab1234; Metadata2=4102}               =>           
    @{Metadata1=ab9988; Metadata2=2233}               =>           
    @{Metadata1=ab1234; Metadata2=4103}               =>
    #>
    

    Point of note: As expected, removing the AppName attribute via the SqlDataListNoAppName results are the same as the above, without the AppName of course.

    Ignoring the AppName attribute.

    ForEach ($CsvRecord in $CsvMetaList)
    {
        If (($SqlMetaList.Metadata1 -notcontains $CsvRecord.Metadata1) -and ($SqlMetaList.Metadata2 -notcontains $CsvRecord.Metadata2))
        {$CsvRecord}
    }
    # Results
    <#
    Metadata1 Metadata2
    --------- ---------
    ab9988    3344     
    ab9988    2233      
    #>
    
    
    ForEach ($CsvRecord in $CsvMetaList)
    {
        If ($SqlMetaList.Metadata1 -notcontains $CsvRecord.Metadata1)
        {$CsvRecord}
    }
    # Results
    <#
    Metadata1 Metadata2
    --------- ---------
    ab9988    3344     
    ab9988    2233  
    #>
    
    
    ForEach ($CsvRecord in $CsvMetaList)
    {
        If ($SqlMetaList.Metadata2 -notcontains $CsvRecord.Metadata2)
        {$CsvRecord}
    }
    # Results
    <#
    
    Metadata1 Metadata2
    --------- ---------
    ab4245    7890     
    ab3245    3412     
    ab4245    7777     
    ab4533    2345     
    ab5245    1929     
    ab9988    3344     
    ab1234    4102     
    ab9988    2233     
    ab1234    4103 
    #>
    

    Combining the two array lists to map an Sql AppName to the Csv then evaluate attributes Metadata1 and Metadata2, then just Metadata1.

    Comparing both attributes.

    $CsvMetaList |
    ForEach-Object{
        $CsvRow = $PSItem
    
        if(
            $SqlRow = $SqlMetaList | 
                    Where-Object{($PSItem.Metadata1 -eq $CsvRow.Metadata1) -and ($PSItem.Metadata2 -eq $CsvRow.Metadata2)}
            )
        {
            (
                $out = [pscustomobject]@{ 
                            AppName   = $SqlRow.AppName
                            Metadata1 = $CsvRow.Metadata1 
                            Metadata2 = $CsvRow.Metadata2
                        }
            )
        }
    } | 
    Sort-Object -Property AppName, Metadata1
    # Results
    <#
    AppName      Metadata1 Metadata2
    -------      --------- ---------
    App1         ab4533    4645     
    App1         ab4533    4645     
    {App1, App5} ab3245    1235 
    #>
    

    Comparing one attribute.

    Clear-Host
    $CsvMetaList |
    ForEach-Object{
        $CsvRow = $PSItem
    
        if(
            $SqlRow = $SqlMetaList | 
                    Where-Object{$PSItem.Metadata1 -eq $CsvRow.Metadata1}
            )
        {
            (
                $out = [pscustomobject]@{ 
                            AppName   = $SqlRow.AppName
                            Metadata1 = $CsvRow.Metadata1 
                            Metadata2 = $CsvRow.Metadata2
                        }
            )
        }
    } | 
    Sort-Object -Property AppName, Metadata1
    # Results
    <#
    AppName      Metadata1 Metadata2
    -------      --------- ---------
    App1         ab4533    4645     
    App1         ab4533    2345     
    App1         ab4533    4645     
    {App1, App5} ab3245    3412     
    {App1, App5} ab3245    1235     
    App2         ab1234    4102     
    App2         ab1234    4103     
    {App2, App4} ab5245    1929     
    App3         ab4245    7890     
    App3         ab4245    7777  
    #>
    

    Note: In this scenario, one gets the Metadata1 mapping as expected, but that Metadata2, is not completely valid.

    Hopefully, this illustrates the points I and others are trying to make in the comments.