Search code examples
phparrayscsvcol

(PHP) CSV Change column order


I am currently trying to manipulate a CSV file. I want to be able to change the order of columns.

So my initial thought was to read which column should be at which place and just put the columns in order into another array and overwrite the original array once finished.

Sadly while looping through the csv, the arrays are row based and not col based.

Is there a way in PHP to like flip the table and so be able to change the sequence?


Solution

  • There is no built in function to do this, you're going to have to roll your own. Doing this kind of operation on data from a CSV could potentially cause memory issues if you read the whole file into an array and then perform some sort of transform on it, so it's probably best to do this row by row. This way you could stream the rows out to a new file instead of storing them all in one big array.

    Here's a quick and dirty solution, where you can supply an array of column indices to define the new order. Simple and effective.

    Input:

    Header A,Header B,Header C,Header D
    a,b,c,d
    1,2,3,4
    alpha,bravo,charlie,delta
    
    $fh = fopen('reorder.csv', 'r');
    
    /*
     * Create an array with column indices in the order that they
     * should appear in the output.
     *
     * Each column that should appear in the output must be included.
     * This is both a feature and a potential gotcha.
     */
    $colSpec = [0,3,2,1];
    
    // Output buffer
    $output = [];
    
    while($currRow = fgetcsv($fh))
    {
        // Buffer for our output row
        $currOutput = [];
    
        /*
         * Loop through the spec array and populate the row output buffer
         * using the indices defined there
         */
        foreach($colSpec as $currColumnIndex)
        {
            $currOutput[] = $currRow[$currColumnIndex];
        }
    
        // Append the new reordered row to the output buffer
        $output[] = $currOutput;
    }
    
    fclose($fh);
    
    print_r($output);
    

    Output:

    Array
    (
        [0] => Array
            (
                [0] => Header A
                [1] => Header D
                [2] => Header C
                [3] => Header B
            )
    
        [1] => Array
            (
                [0] => a
                [1] => d
                [2] => c
                [3] => b
            )
    
        [2] => Array
            (
                [0] => 1
                [1] => 4
                [2] => 3
                [3] => 2
            )
    
        [3] => Array
            (
                [0] => alpha
                [1] => delta
                [2] => charlie
                [3] => bravo
            )
    
    )
    

    That's not very intuitive though, the numeric indices make it difficult to logically understand what column is going where. If your CSV has a header row, and the labels in that row are immutable, you can do something like this to make it more intuitive:

    /*
     * Create an array with header values in the order that they
     * should appear in the output.
     *
     * Each column that should appear in the output must be included.
     * This is both a feature and a potential gotcha.
     */
    $colSpec = ['Header C', 'Header A', 'Header B', 'Header D'];
    
    // Create a map for column name to actual index in the file
    $headerIndexMap = array_flip($colSpec);
    
    // Output buffer
    $output = [];
    
    while ($currRow = fgetcsv($fh))
    {
        // If this is our first row, set up the column mapping
        if(empty($output))
        {
            // Loop through the columns...
            foreach($currRow as $index => $currHeaderLabel)
            {
                /*
                 * Trim the header value, in case there it leading/trailing whitespace in the data
                 */
                $currHeaderLabel = trim($currHeaderLabel);
    
                // If this column is in our column spec, set the index in $headerIndexMap
                if(array_key_exists($currHeaderLabel, $headerIndexMap))
                {
                    $headerIndexMap[$currHeaderLabel] = $index;
                }
            }
        }
    
        // Buffer for our output row
        $currOutput = [];
    
        // Loop through the column spec...
        foreach ($colSpec as $currColumn)
        {
            // Get the actual index of the column from the index map
            $currIndex = $headerIndexMap[$currColumn];
    
            // Append the data in the appropriate column to the row output buffer
            $currOutput[] = $currRow[$currIndex];
        }
    
        // Append the new reordered row to the output buffer
        $output[] = $currOutput;
    }
    
    fclose($fh);
    
    print_r($output);
    

    Output:

    Array
    (
        [0] => Array
            (
                [0] => Header C
                [1] => Header A
                [2] => Header B
                [3] => Header D
            )
    
        [1] => Array
            (
                [0] => c
                [1] => a
                [2] => b
                [3] => d
            )
    
        [2] => Array
            (
                [0] => 3
                [1] => 1
                [2] => 2
                [3] => 4
            )
    
        [3] => Array
            (
                [0] => charlie
                [1] => alpha
                [2] => bravo
                [3] => delta
            )
    
    )
    
    

    I usually wrap this kind of thing up in a helper class to make it encapsulated and testable, and keep the code that uses it nice & clean.