Modify a .csv file in powershell automatically


I am trying to create a PowerShell script that performs a few steps:

  • I put a .xlsx file in a specific folder, and the script converts it to .csv. So far I have this:
$ErrorActionPreference = 'Stop'

Function Convert-CsvInBatch
{
    [CmdletBinding()]
    Param
    (
        [Parameter(Mandatory=$true)][String]$Folder
    )
    $ExcelFiles = Get-ChildItem -Path $Folder -Filter *.xlsx -Recurse

    $excelApp = New-Object -ComObject Excel.Application
    $excelApp.DisplayAlerts = $false

    $ExcelFiles | ForEach-Object {
        $workbook = $excelApp.Workbooks.Open($_.FullName)
        $csvFilePath = $_.FullName -replace "\.xlsx$", ".csv"
        $workbook.SaveAs($csvFilePath, [Microsoft.Office.Interop.Excel.XlFileFormat]::xlCSV)
        $workbook.Close()
    }

    # Release Excel Com Object resource
    $excelApp.Workbooks.Close()
    $excelApp.Visible = $true
    Start-Sleep 5
    $excelApp.Quit()
    [System.Runtime.Interopservices.Marshal]::ReleaseComObject($excelApp) | Out-Null
}

#
# 0. Prepare the folder path which contains all excel files
$FolderPath = "C:\exacthpath"

Convert-CsvInBatch -Folder $FolderPath
  • The columns are still there in the resulting file. I want to remove them and separate the values with a ';' instead, like: H;1;43;185;

At this point I'm stuck. I can import it into PowerShell like this:

Import-Csv -Path 'C:\folder\filename.csv' | ForEach-Object {
$_
}

This is what I get. The most important change is in the first row only:

H;1;43;185;

This should be modified into:

H;01;43;185

The rest should be left untouched. Afterwards I need to export it back to a CSV file, like:

Export-Csv -Path 'C:\folder\modified_filename.csv'

But this whole process should be in one single PowerShell script that performs the above steps on its own. In short, it should:

  • identify any .xlsx file, regardless of its name
  • convert it to .csv
  • change the layout of the document so that the columns are separated with a ";"
  • modify the first line to 'H;01;43;185' (this is a static line, it will always look like this)
  • save the result as the final .csv file

Can you help me combine and optimize the above scripts so that PowerShell performs the modification too? Here is an example of the final content of such a file (it usually has 1000+ lines):

H;01;43;185
D;111;3;1042;2
D;222;3;1055;3
D;333;3;1085;1
T;3;;;

Any help is highly appreciated.

Regards, Armin


Solution

  • If, as you say in your comment, Excel already creates a CSV file with the semicolon as delimiter, you can do this inside the loop, just below $workbook.Close():

    # read the file created by Excel as string array
    $data = Get-Content $csvFilePath
    # overwrite the file with just the new header
    Set-Content -Path $csvFilePath -Value 'H;01;43;185'
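    # add the rest of the data to the file (skipping the old first line;
    # Select-Object -Skip 1 also behaves correctly if the file has only one line)
    $data | Select-Object -Skip 1 | Add-Content -Path $csvFilePath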
    

    P.S. I would delete the lines

    $excelApp.Visible = $true
    Start-Sleep 5
    

    because I don't see the need to have Excel show itself and pause the function for 5 seconds. Instead, keep Excel hidden, so that it works a lot faster, by adding

    $excelApp.Visible = $false
    

    right after you have created $excelApp.
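
As a minimal sketch, the header replacement described above could also be wrapped in a small function. The name Set-CsvFirstLine and its parameters are hypothetical, not part of the original answer. It assumes the file fits comfortably in memory and leaves the encoding at the PowerShell defaults.

```powershell
# Hypothetical helper (not from the original answer): replaces the first
# line of a text file with a fixed header and leaves all other lines untouched.
function Set-CsvFirstLine {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory = $true)][string]$Path,
        [string]$Header = 'H;01;43;185'
    )

    # Read every line up front; @() keeps the result an array even when
    # the file contains zero or one line.
    $lines = @(Get-Content -LiteralPath $Path)

    # New header first, followed by every original line except the first.
    $newContent = @($Header) + @($lines | Select-Object -Skip 1)

    # Overwrite the file in one call.
    Set-Content -LiteralPath $Path -Value $newContent
}
```

Inside the loop it would be called as `Set-CsvFirstLine -Path $csvFilePath`, right after `$workbook.Close()`.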