Search code examples
powershellcsvamazon-s3splitgroup-object

Retrieve S3 filenames, group by 3 digit number with related images in series then split series into columns and new group into new row


I'm attempting to use a CSV to bulk upload listings however the image urls are in columns.

I'm using Amazon S3 to host the images and PowerShell to retrieve the keys of each file. However I am unsure how to group by their related files and then use something like text to columns to split?

The files have a consistent naming structure:

C2-123-1.JPG
C2-123-2.JPG
C2-123-3.JPG
C3-333-1.JPG
C3-333-2.JPG

In the example above C2-123 has three photos, C2-333 has only two so I'm looking to receive an outcome like below.

|Image Link 1|  Image Link 2|   Image Link 3|   Image Link 4|
|C2-123-1.JPG|  C2-123-2.JPG|   C2-123-3.JPG|               |
|C3-333-1.JPG|  C3-333-2.JPG|               |               |

Solution

  • This should work, you should replace $data for the output you get from AWS.

    • Using $data for testing:
    $data = @'
    C2-123-1.JPG
    C2-123-2.JPG
    C2-123-3.JPG
    C3-333-1.JPG
    C3-333-2.JPG
    C3-333-4.JPG
    C3-333-999.JPG
    C3-456-2.JPG
    C3-111-2.JPG
    C3-999-4.JPG
    '@ -split '\r?\n'
    
    • First, group by the numbers between the last - and the .jpg extension:
    Count Name    Group
    ----- ----    -----
        2 1       {C2-123-1.JPG, C3-333-1.JPG}
        4 2       {C2-123-2.JPG, C3-333-2.JPG, C3-456-2.JPG, C3-111-2.JPG}
        1 3       {C2-123-3.JPG}
        2 4       {C3-333-4.JPG, C3-999-4.JPG}
        1 999     {C3-333-999.JPG}
    
    • Then get the maximum number of elements of the Group arrays
    • Lastly, use a while loop with $max as reference to cast [pscustomobject]
    # Group the files
    $groups = $data | Group-Object {
    
        [regex]::Match(
            $_,
            '(?i)(?<=\d-)(?<imagenum>\d+)\.jpg$'
        ).Groups['imagenum'].Value
    
    }
    
    # Determine max number of elements
    $max = $groups.Count | Measure-Object -Maximum
    $index = 0
    
    # Construct the object
    $result = while($max.Maximum--)
    {
        $out = [ordered]@{}
        $groups.ForEach({
            $key = 'Image Link {0}' -f $_.Name
            $out[$key] = $_.Group[$index]
        })
    
        [pscustomobject]$out
        $index++
    }
    

    Result would be:

    PS /> $result | Format-Table
    
    
    Image Link 1 Image Link 2 Image Link 3 Image Link 4 Image Link 999
    ------------ ------------ ------------ ------------ --------------
    C2-123-1.JPG C2-123-2.JPG C2-123-3.JPG C3-333-4.JPG C3-333-999.JPG
    C3-333-1.JPG C3-333-2.JPG              C3-999-4.JPG 
                 C3-456-2.JPG                           
                 C3-111-2.JPG    
    

    To see the regex explanation you can use https://regex101.com/r/kARr39/1