Search code examples
powershellgroup-bycountexport-to-csvdifference

powershell export-csv unusual behavior


I'm having an unusual trouble with exporting output to CSV. Using code listed below I'm able to mostly accomplish what I need , which is a list of files where special characters are replaced with wildcards, grouped and listed where only the extension is different.

$path = gci "somepath" -File -Recurse -ErrorAction Continue

$path|
Select @{l="Algorithm";e={}}, @{l="Hash";e={}}, @{l='File';e={$_.PSChildName}}, @{l='Compare Filename';e={$_.BaseName.replace('_','*').replace(' ','*').replace('-','*')}},  @{l="Path";e={$path + $_.PSParentPath.Substring(40,$_.PSParentPath.Length-40)}}, @{l="Link";e={$path +$_.FullName.substring(2,$_.FullName.Length-2)}}, @{l="Extension";e={$_.Extension}}|
group -Property 'Compare Filename'|
Where {@($_.Group.Extension |Sort -Unique).Count -ge 2}|
%{$_.group}|
Export-Csv -Path $CSV_path -NoTypeInformation

However I've discovered while using .Count method that one of the groups is not being exported the way I want it to be.

If I wrap part of the code in parentheses and add count, like so

($path|
Select @{l="Algorithm";e={}}, @{l="Hash";e={}}, @{l='File';e={$_.PSChildName}}, @{l='Compare Filename';e={$_.BaseName.replace('_','*').replace(' ','*').replace('-','*')}},  @{l="Path";e={$path + $_.PSParentPath.Substring(40,$_.PSParentPath.Length-40)}}, @{l="Link";e={$path +$_.FullName.substring(2,$_.FullName.Length-2)}}, @{l="Extension";e={$_.Extension}}|
group -Property 'Compare Filename'|
Where {@($_.Group.Extension |Sort -Unique).Count -ge 2}).count

I get different value then after exporting each group separately.

Culprit mentioned below

File                                    Compare Filename                    Extension
MV_CHIK_202_ELISA_MEASLES_20180423.csv  MV*CHIK*202*ELISA*MEASLES*20180423  .csv
MV_CHIK_202_ELISA_MEASLES_20180423.xlsx MV*CHIK*202*ELISA*MEASLES*20180423  .xlsx
MV_CHIK_202_ELISA_MEASLES_20180423.csv  MV*CHIK*202*ELISA*MEASLES*20180423  .csv
MV_CHIK_202_ELISA_MEASLES_20180423.xlsx MV*CHIK*202*ELISA*MEASLES*20180423  .xlsx

As you can see the those 4 files should be in a single group (based on 'compare filename' attribute) but when the whole list gets piped to %{$_.group} the get split resulting in an incorrect number of entries. In this case .count returns value of '42 ' but export-csv yields 86 rows (basically meaning 43 groups). Any ideas on how to either increase count value or combine abovementioned group?


Solution

  • After some investigation of how the script was created, files that are fileterd and output of

    $path = gci "somepath" -File -Recurse -ErrorAction Continue
    
    $path|
    Select @{l="Algorithm";e={}}, @{l="Hash";e={}}, @{l='File';e={$_.PSChildName}}, @{l='Compare Filename';e={$_.BaseName.replace('_','*').replace(' ','*').replace('-','*')}},  @{l="Path";e={$path + $_.PSParentPath.Substring(40,$_.PSParentPath.Length-40)}}, @{l="Link";e={$path +$_.FullName.substring(2,$_.FullName.Length-2)}}, @{l="Extension";e={$_.Extension}}|
    group -Property 'Compare Filename'|
    Where {@($_.Group.Extension |Sort -Unique).Count -ge 2}
    

    I came to the realisation that the script does exactly what it is supposed to do and that in fact there is no discrepancy between csv output and grouped objects number. Simply a misunderstanding on my part.

     Where {@($_.Group.Extension |Sort -Unique).Count -ge 2}
    

    The above line is supposed to sift through groups and include those where al least two extensions are present. As lady luck would have it, there was only one group with moree than two files that fit the critera, hence my confusion.