powershell, powershell-4.0

Compare within same result set of Compare-Object


I have a CSV file that contains the Name, Size, and Hash (file name, size in bytes, and MD5 hash) of every file on one of my storage appliances. Once this data has been moved, I will generate the name, size, and hash of each file at the new location, then compare those columns to the ones in the existing CSV. I know there are utilities that would do all of this work for me, but I am doing this more as a learning experience than anything else.

For the file names, sizes, and hashes that do not match exactly, I would like to export a log indicating whether the file does not exist in the new location or whether there is a hash mismatch.

As an example, using my current compare script:

$csv1 = Import-CSV "X:\Documents\Customer Projects\Destination.csv"
$csv2 = Import-CSV "X:\Documents\Customer Projects\Source.csv"
Compare-Object -ReferenceObject $csv2 -DifferenceObject $csv1 -Property Name,Size,Hash

I get:

Name                                            Size                                            Hash                                            SideIndicator                                 
----                                            ----                                            ----                                            -------------                                 
123456789.avi                                   4122896                                         D258518EDDE5F00579CE2F9D01129C11                =>                                            
123456789.avi                                   8635210                                         807666D37D0E1A75279E1AE837759674                <=                                            
qwertyuiop.avi                                  468246867                                       3F779E039B646D49D84F3D2C403F2EBD                <=

The first file, 123456789.avi, is found in both locations, but the size and hash do not match, which should log something along the lines of "Hash mis-match".

The second file, qwertyuiop.avi, is only in the source location and not in the destination, which should log something along the lines of "File missing from destination".

Is there a way to do this comparison directly with the output of Compare-Object? I can't seem to find a good way to compare between rows of the same output. Does the data need to be exported to two different CSV files, one for one side and another for the other side and then compare?


EDIT:

With the help of Robert, I am using the below code to group the output of my original Compare-Object statement and output a single message for files of the same name based on the count of the Group-Object statement.

$csv1 = Import-CSV "X:\Documents\Customer Projects\Destination.csv"
$csv2 = Import-CSV "X:\Documents\Customer Projects\Source.csv"
$test = Compare-Object -ReferenceObject $csv2 -DifferenceObject $csv1 -Property Name,Size,Hash
$group = $test | Group-Object -Property Name
foreach ($file in $group)
{
    # Two rows for one name: the file is on both sides but Size or Hash differs
    if ($file.Count -ge 2) {
        Write-Host "$($file.Name) - Hash mis-match"
    }
    # One row for one name: the file is on one side only
    if ($file.Count -eq 1) {
        Write-Host "$($file.Name) - File missing"
    }
}

Solution

  • You can store the Compare-Object output in a variable:

    $compare = compare-object ....
    

    Then you can loop over the names in that variable and check each one for duplicates:

    Foreach ($file in $compare.name) {
        # -eq compares the name exactly; -match would treat it as a regular expression
        If (@($compare.name -eq $file).count -ge 2) {
            "Perform action based on file"
        }
    }
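
    One caveat when comparing file names: -match performs an unanchored regular-expression test, so the dot in a name matches any character and longer names that contain the pattern also match. The following is a minimal sketch with hypothetical file names (not taken from the question) showing how -match, -eq, and an escaped -match differ:

    ```powershell
    # Hypothetical file names, chosen only to illustrate the difference.
    $names = 'clip.avi', 'clipXavi', 'clip.avi.bak'

    # -match is a regex test: '.' matches any character and the pattern is unanchored.
    $regexHits = @($names -match 'clip.avi')

    # -eq compares the whole string literally (case-insensitively by default).
    $exactHits = @($names -eq 'clip.avi')

    # [regex]::Escape makes the dot literal, but substrings still match.
    $escapedHits = @($names -match [regex]::Escape('clip.avi'))

    "{0} regex, {1} exact, {2} escaped" -f $regexHits.Count, $exactHits.Count, $escapedHits.Count
    # prints "3 regex, 1 exact, 2 escaped"
    ```

    For counting rows that belong to one specific file, the exact comparison is usually the safer choice.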
    

    I hope this helps.

    Another option for the if statement would be

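    (@($compare | where name -eq $file).count -ge 2)
    

    Wrapping the pipeline in @() ensures a count is available even when only one row matches. If you change the test to -eq 1, you can use it to log files that appear on only one side with a different message.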

    Another option is to pipe the Compare-Object output into Group-Object and use the groups with a count of 2 for one message and the groups with a count of 1 for another. Let me know if that is what you'd like to do. The advantage of Group-Object is that it won't give you the same message twice. It turns out a script I am building needed something similar. Here's how I did it (simplified to your needs, of course):

    $csv1 = Import-CSV "X:\Documents\Customer Projects\Destination.csv"
    $csv2 = Import-CSV "X:\Documents\Customer Projects\Source.csv"
    $compare = Compare-Object -ReferenceObject $csv2 -DifferenceObject $csv1 -Property Name,Size,Hash
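    # Group the differences by file name; a name that appears twice is on both sides with a different Size or Hash
    $findings = $compare | Group-Object -Property Name | where Count -ge 2
    foreach ($finding in $findings) {
        # The group's Name is the file name shared by both rows
        $expand = $finding.Name
        # -eq rather than -match, so the file name is not treated as a regular expression
        $compare | where Name -eq $expand | Add-Member -MemberType NoteProperty -Name Notes -Value "Hashes don't match" -Force
    }
    # Export only the rows that received a note
    $compare | where Notes -match ".." | select Name,Size,Hash,Notes | Export-Csv c:\compare.csv -NoTypeInformation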
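
    The script above only annotates mismatches. To also cover the "File missing from destination" case from the question, one possible approach is to check SideIndicator for names that appear only once. The following is a rough sketch, not a definitive implementation: the sample rows are copied from the question's output and stand in for the two Import-Csv calls, and the variable names ($source, $destination, $diff, $report) and the "File missing from source" message are assumptions made for illustration.

    ```powershell
    # Sample rows taken from the question's output; in practice these come from Import-Csv.
    $source = @(
        [pscustomobject]@{ Name = '123456789.avi';  Size = '8635210';   Hash = '807666D37D0E1A75279E1AE837759674' }
        [pscustomobject]@{ Name = 'qwertyuiop.avi'; Size = '468246867'; Hash = '3F779E039B646D49D84F3D2C403F2EBD' }
    )
    $destination = @(
        [pscustomobject]@{ Name = '123456789.avi';  Size = '4122896';   Hash = 'D258518EDDE5F00579CE2F9D01129C11' }
    )

    # '<=' marks rows found only in the reference (source), '=>' rows found only in the difference (destination).
    $diff = Compare-Object -ReferenceObject $source -DifferenceObject $destination -Property Name, Size, Hash

    $report = foreach ($g in ($diff | Group-Object -Property Name)) {
        $note = if ($g.Count -ge 2) {
            # The name appears on both sides, so Size and/or Hash must differ.
            'Hash mis-match'
        } elseif ($g.Group[0].SideIndicator -eq '<=') {
            # The only row for this name comes from the source.
            'File missing from destination'
        } else {
            # The only row for this name comes from the destination (assumed message).
            'File missing from source'
        }
        [pscustomobject]@{ Name = $g.Name; Notes = $note }
    }

    $report
    ```

    With the sample rows, 123456789.avi is reported as a hash mismatch and qwertyuiop.avi as missing from the destination. $report can be piped to Export-Csv in the same way as above. Note that grouping by Name alone assumes file names are unique within each CSV.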