Search code examples
powershellpowershell-2.0powershell-3.0powershell-4.0

PowerShell Manipulate grouped objects


I am trying to create an updated baseball schedule with the teams and starting pitchers records. The first game stats should be team records 0-0 and pitchers records 0-0. So need to insert the records at first date and push other stats to next date

  • Input file:
Date,Team,G,W,L,Pitcher,GP
4/13/2020,BAL,1,1,0,W.Johnson,1
4/16/2020,BAL,4,1,0,W.Johnson,2
4/14/2020,BAL,2,1,0,C.Mathewson,1
4/17/2020,BAL,5,0,1,C.Mathewson,2
4/15/2020,BAL,3,1,0,C.Young,1
4/13/2020,NYA,1,0,1,W.Ford,1
4/16/2020,NYA,4,0,1,W.Ford,2
4/14/2020,NYA,2,0,1,H.Pennock,1
4/17/2020,NYA,5,1,0,H.Pennock,2
4/15/2020,NYA,3,0,1,D.Dean,1
4/13/2020,KCM,1,1,0,S.Paige,1
4/18/2020,KCM,3,0,0,S.Paige,2
4/14/2020,KCM,2,0,1,J.Williams,1
4/13/2020,HOG,1,0,1,D.Redding,1
4/18/2020,HOG,3,0,0,D.Redding,2
4/14/2020,HOG,2,1,0,E.Rile,1
  • Output file (90% works):
Date,Team,G,W,L,Pitcher,GP
,,,0,1,,
,,,1,0,,
4/13/2020,BAL,1,0,0,W.Johnson,1
4/13/2020,HOG,1,0,0,D.Redding,1
4/13/2020,NYA,1,0,0,W.Ford,1
4/13/2020,KCM,1,0,0,S.Paige,1
4/14/2020,HOG,2,0,0,E.Rile,1
4/14/2020,BAL,2,1,0,C.Mathewson,1
4/14/2020,NYA,2,0,1,H.Pennock,1
4/14/2020,KCM,2,0,0,J.Williams,1
4/15/20,KCM,1,0,1,J.Williams,1
4/15/20,HOG,1,1,0,E.Rile,1
4/16/20,BAL,1,1,0,C.Young,1
4/16/20,NYA,1,0,1,D.Dean,1
4/16/2020,NYA,4,0,1,W.Ford,2
4/16/2020,BAL,4,1,0,W.Johnson,2
4/17/2020,NYA,5,0,1,H.Pennock,2
4/17/2020,BAL,5,1,0,C.Mathewson,2
4/18/2020,HOG,3,0,1,D.Redding,2
4/18/2020,KCM,3,1,0,S.Paige,2
  • Expected Output:
Date,Team,G,W,L,Pitcher,GP
4/13/2020,BAL,1,0,0,W.Johnson,1
4/13/2020,NYA,1,0,0,W.Ford,1
4/13/2020,KCM,1,0,0,S.Paige,1
4/13/2020,HOG,1,0,0,D.Redding,1
4/14/2020,BAL,2,1,0,C.Mathewson,1
4/14/2020,HOG,2,0,1,E.Rile,1
4/14/2020,NYA,2,0,1,H.Pennock,1
4/14/2020,KCM,2,1,0,J.Williams,1
4/15/2020,BAL,3,1,0,C.Young,1
4/15/2020,NYA,3,0,1,D.Dean,1
4/16/2020,BAL,4,1,0,W.Johnson,2
4/16/2020,NYA,4,0,1,W.Ford,2
4/18/2020,NYA,1,1,0,H.Pennock,2
4/18/2020,BAL,1,0,1,C.Mathewson,2
4/18/2020,KCM,3,0,1,S.Paige,2
4/18/2020,HOG,3,1,0,D.Redding,2
4/19/2020,KCM,1,0,0,S.Paige,2
4/19/2020,HOG,1,0,0,D.Redding,2
  • Code:
$csv = Import-Csv 'C:\deployments\New folder\schedule1.txt' |
    Select-Object Date,Team, G, W, L, Pitcher, GP |
    Group-Object Team

$csv | ForEach-Object {
    if($_.Group[0].G -eq 1) {
        [PSCustomObject]@{
            Date    = $_.Group[0].Date
            Team    = $_.Group[0].Team
            G       = $_.Group[0].G
            W       = 0
            L       = 0
            Pitcher = $_.Group[0].Pitcher
            GP      = $_.Group[0].GP
        }
    }
    else {
        [PSCustomObject]@{
            Date    = $_.Group[0].Date
            Team    = $_.Group[0].Team
            G       = $_.Group[0].G
            W       = $_.Group[0].W
            L       = $_.Group[0].L
            Pitcher = $_.Group[0].Pitcher
            GP      = $_.Group[0].GP
        }
    }

    for ($ii = 1; $ii -lt $csv.Count; $ii++) {
        [PSCustomObject]@{
            Date    = $_.Group[$ii].Date
            Team    = $_.Group[$ii].Team
            G       = $_.Group[$ii].G
            W       = $_.Group[$ii - 1].W
            L       = $_.Group[$ii - 1].L
            Pitcher = $_.Group[$ii].Pitcher
            GP      = $_.Group[$ii].GP
        }
    }
    [PSCustomObject]@{
        Date    = Get-Date -Format 'M/dd/yy' -Date ([datetime]$_.Group[-1].Date).AddDays(1)
        Team    = $_.Group[-1].Team
        G       = $_.Group[0].G
        W       = $_.Group[-1].W
        L       = $_.Group[-1].L
        Pitcher = $_.Group[-1].Pitcher
        GP      = $_.Group[-1].GP
    }

} | Sort-Object Date | Export-Csv 'C:\deployments\New folder\schedule6.txt' -NoTypeInformation

Solution

  • First off, your approach kind of feels like a bit of an abuse of the pipeline, but I can't really come up with a more elegant solution.

    In any case, if you make the following changes to your code it will give the required output:

    Sample data

    Rather than read from a file, I've used inline data so you can just cut & paste the code sample to play with it...

    # set up some test data
    $text = @"
    Date,Team,G,W,L,Pitcher,GP
    4/13/2020,BAL,1,1,0,W.Johnson,1
    4/16/2020,BAL,4,1,0,W.Johnson,2
    4/14/2020,BAL,2,1,0,C.Mathewson,1
    4/17/2020,BAL,5,0,1,C.Mathewson,2
    4/15/2020,BAL,3,1,0,C.Young,1
    4/13/2020,NYA,1,0,1,W.Ford,1
    4/16/2020,NYA,4,0,1,W.Ford,2
    4/14/2020,NYA,2,0,1,H.Pennock,1
    4/17/2020,NYA,5,1,0,H.Pennock,2
    4/15/2020,NYA,3,0,1,D.Dean,1
    4/13/2020,KCM,1,1,0,S.Paige,1
    4/18/2020,KCM,3,0,0,S.Paige,2
    4/14/2020,KCM,2,0,1,J.Williams,1
    4/13/2020,HOG,1,0,1,D.Redding,1
    4/18/2020,HOG,3,0,0,D.Redding,2
    4/14/2020,HOG,2,1,0,E.Rile,1
    "@
    

    Sort the data before processing

    You're bumping some of the data up or down a row for a given team, but you're not sorting it first so it's kind of random which date you're bumping it to.

    If you sort the data first you'll be bumping it to the next chronological record rather than just to whatever date is next in the file for that team.

    $csv = $text `
        | ConvertFrom-Csv `
        | Sort-Object Date ` # <-- sort the data
        | Select-Object Date, Team, G, W, L, Pitcher, GP `
        | Group-Object Team;
    

    Correct the iteration limit

    Change

        for ($ii = 1; $ii -lt $csv.Count; $ii++) {
            ... #             ^^^^^^^^^^
        }
    
        for ($ii = 1; $ii -lt $_.Group.Count; $ii++) {
            ... #             ^^^^^^^^^^^^^^
        }
    

    $csv.Count is the number of groups (i.e. Teams) in the dataset, not the number of records for the current Team, which is represented by $_.Group.Count.

    Correct the date format

            Date    = Get-Date -Format 'M/dd/yy' -Date ([datetime]$_.Group[-1].Date).AddDays(1)
            #                                ^^ - 2-digit year
    

    should be

            Date    = Get-Date -Format 'M/dd/yyyy' -Date ([datetime]$_.Group[-1].Date).AddDays(1)
            #                                ^^^^ - 4-digit year
    

    so you get the correct number of digits in the year.

    Sort the result set

    Not strictly necessary, but I added Team to the final Sort-Object to give this:

    } | Sort-Object Date, Team | ConvertTo-Csv
    

    According to some sources, the Sort-Object cmdlet isn't "stable" (see https://stackoverflow.com/a/11349063/3156906) so the one at the top of the script doesn't necessarily preserve the order of teams for the same date. Adding Team to the final Sort-Object ensures a predictable order of teams with the same date.

    Result

    If you make all these changes your output should be:

    "Date","Team","G","W","L","Pitcher","GP"
    "4/13/2020","BAL","1","0","0","W.Johnson","1"
    "4/13/2020","HOG","1","0","0","D.Redding","1"
    "4/13/2020","KCM","1","0","0","S.Paige","1"
    "4/13/2020","NYA","1","0","0","W.Ford","1"
    "4/14/2020","BAL","2","1","0","C.Mathewson","1"
    "4/14/2020","HOG","2","0","1","E.Rile","1"
    "4/14/2020","KCM","2","1","0","J.Williams","1"
    "4/14/2020","NYA","2","0","1","H.Pennock","1"
    "4/15/2020","BAL","3","1","0","C.Young","1"
    "4/15/2020","NYA","3","0","1","D.Dean","1"
    "4/16/2020","BAL","4","1","0","W.Johnson","2"
    "4/16/2020","NYA","4","0","1","W.Ford","2"
    "4/17/2020","BAL","5","1","0","C.Mathewson","2"
    "4/17/2020","NYA","5","0","1","H.Pennock","2"
    "4/18/2020","BAL","1","0","1","C.Mathewson","2"
    "4/18/2020","HOG","3","1","0","D.Redding","2"
    "4/18/2020","KCM","3","0","1","S.Paige","2"
    "4/18/2020","NYA","1","1","0","H.Pennock","2"
    "4/19/2020","HOG","1","0","0","D.Redding","2"
    "4/19/2020","KCM","1","0","0","S.Paige","2"
    

    which is a slightly different order to you "expected" results and has two additional rows which you confirmed are missing from your "expected".