Find lines starting with ※ in a text file and merge them correctly in PowerShell


This PowerShell script processes CSV files and their corresponding "_Inglese" files, updating translations in the CSVs based on the contents of the "_Inglese" files. It retrieves files, imports data, updates translations, and exports the updated CSV files.

However, it has a problem when the file it takes the data from is in this format:

# (Go in a newline)
Il mio corpo è un tempio.
※Parla con Jack.

It correctly gives me this result:

"0x00000011","Il mio corpo è un tempio.
※Parla con Jack.","My Body Is a Temple.
※Speak with Jack."

But the rows that follow come out wrong:

"0x000000AA","","Dress Like a Pirate."
"0x00000080","Vestiti come un pirata.","Obtain a complete."

Instead, the output should be:

"0x000000AA","Vestiti come un pirata.","Dress Like a Pirate."
"0x00000080","Ottieni un completo.","Obtain a complete."

*_Inglese.txt

Row 115 | Andiamo!
Row 116 | Il mio corpo è un tempio.
Row 117 | ※Parla con Jack.
Row 118 | Vestiti come un pirata.

Raw CSV:

Row 115 | "0x00000010","","Let's go!"
Row 116 | "0x00000011","","My Body Is a Temple.
Row 117 | ※Speak with Jack."
Row 118 | "0x000000AA","","Dress Like a Pirate."
Row 119 | "0x00000080","","Obtain a complete."

The part of the code where I tried to solve this starts at the comment:

# Update the second column (translation) in the primary CSV with data from the _Inglese file.

PowerShell code:

param(
    $SourceDir = $PWD,
    $OutDir = $PWD,
    $OutFileSuffix = "output" # Define the suffix for the output file.
)

# Get all primary CSV files in the source directory.
$csvFiles = Get-ChildItem -Path $SourceDir -Recurse -Filter "*.csv"

foreach ($csvFile in $csvFiles) {
    # Construct the name for the corresponding _Inglese file.
    $column3FileName = "{0}_inglese.txt" -f $csvFile.BaseName
    $column3FilePath = Join-Path -Path $csvFile.DirectoryName -ChildPath $column3FileName
    
    # Check if the _Inglese file exists.
    if (Test-Path $column3FilePath) {
        # Import the primary CSV file and the corresponding _Inglese file.
        $primaryCsv = Import-Csv -Encoding utf8 -Path $csvFile.FullName
        $column3Data = Get-Content -Encoding utf8 $column3FilePath
        
        # Assuming the first line in the _Inglese file is a header and we skip it.
        $column3Values = $column3Data | Select-Object -Skip 1

        # Update the second column (translation) in the primary CSV with data from the _Inglese file.
        $previousTranslation = $null
        for ($i = 0; $i -lt $primaryCsv.Count; $i++) {
            if ($column3Values[$i] -match "※") {
                # Found a line in _Inglese file with ※, append it to the previous translation if available.
                if ($i -gt 0 -and $previousTranslation -ne $null) {
                    $primaryCsv[$i - 1].translation += "`n$($column3Values[$i])"
                }
            } else {
                # Otherwise, update the current translation.
                $primaryCsv[$i].translation = $column3Values[$i]
                $previousTranslation = $column3Values[$i]
            }
        }

        # Construct the output file path.
        $outputFilePath = Join-Path -Path $csvFile.DirectoryName -ChildPath ("{0}{1}.csv" -f $csvFile.BaseName, $OutFileSuffix)
                    
        # Write the entire file with BOM (Byte Order Mark) in UTF-8
        $primaryCsv | Export-Csv -Path $outputFilePath -NoTypeInformation -Encoding UTF8
    }
    else {
        Write-Warning "Corresponding column3 file not found for $($csvFile.Name)"
    }
}
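
Reading the loop above, a likely cause of the shift is that `$i` indexes both the CSV rows and the text lines. A ※ line uses up a text line without using up a CSV row, so every row after it reads the line that belongs to the row before. The following is a minimal sketch of one way around that, with a separate row counter. The sample data is taken from the question; the `id` and `english` column names and the `$rows`, `$lines` and `$row` variables are assumptions made for illustration.

```powershell
# Rows roughly as Import-Csv would return them.
# Only 'translation' is a known column name; 'id' and 'english' are assumed.
$rows = @(
    [pscustomobject]@{ id = '0x00000011'; translation = ''; english = "My Body Is a Temple.`n※Speak with Jack." }
    [pscustomobject]@{ id = '0x000000AA'; translation = ''; english = 'Dress Like a Pirate.' }
    [pscustomobject]@{ id = '0x00000080'; translation = ''; english = 'Obtain a complete.' }
)

# Lines of the _Inglese file with the header already skipped.
$lines = @(
    'Il mio corpo è un tempio.'
    '※Parla con Jack.'
    'Vestiti come un pirata.'
    'Ottieni un completo.'
)

# $row advances only when a line starts a new translation.
$row = -1
foreach ($line in $lines) {
    if ($row -ge 0 -and $line -like '※*') {
        # Continuation line: append to the row that is currently being filled.
        $rows[$row].translation += "`n$line"
    } else {
        $row++
        if ($row -ge $rows.Count) { break }   # more text lines than CSV rows
        $rows[$row].translation = $line
    }
}

$rows | Format-Table id, translation
```

With this data the three rows should end up with the two-line translation, "Vestiti come un pirata." and "Ottieni un completo." respectively.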

Solution

  • I solved it this way:

    # Combine lines with ※ into single lines
    $mergedLines = @()
    $currentLine = ""
    foreach ($line in $column3Values) {
        if ($line -like "※*") {
            $currentLine += "`n$line"
        } else {
            if ($currentLine -ne "") {
                $mergedLines += $currentLine
            }
            $currentLine = $line
        }
    }
    if ($currentLine -ne "") {
        $mergedLines += $currentLine
    }
    
    # Update the second column (translation) in the primary CSV with data from the _Inglese file.
    # After merging, entry $i of $mergedLines corresponds to row $i of the CSV.
    for ($i = 0; $i -lt $primaryCsv.Count; $i++) {
        $primaryCsv[$i].translation = $mergedLines[$i]
    }
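
The merge step can also be pulled into a small function and tried out on its own. This is only a sketch: `Merge-ContinuationLines` is a hypothetical helper name, and `$sample` and `$rowCount` stand in for the real file contents and `$primaryCsv.Count`. Unlike the loop above, it keeps empty lines as separate entries, on the assumption that an empty line means an empty translation for that row.

```powershell
function Merge-ContinuationLines {
    # Hypothetical helper: joins every line that starts with $Marker
    # onto the line before it, separated by a newline.
    param(
        [string[]] $Lines,
        [string] $Marker = '※'
    )

    $merged = [System.Collections.Generic.List[string]]::new()
    foreach ($line in $Lines) {
        if ($merged.Count -gt 0 -and $line -like "$Marker*") {
            # Continuation line: extend the last entry.
            $merged[$merged.Count - 1] += "`n$line"
        } else {
            # Regular line (possibly empty): start a new entry.
            $merged.Add($line)
        }
    }

    # The leading comma keeps the result a single array on the pipeline.
    return ,$merged.ToArray()
}

# Sample lines taken from the question.
$sample = @(
    'Andiamo!'
    'Il mio corpo è un tempio.'
    '※Parla con Jack.'
    'Vestiti come un pirata.'
)
$merged = Merge-ContinuationLines -Lines $sample

# Stand-in for $primaryCsv.Count; a mismatch suggests the two files are out of step.
$rowCount = 3
if ($merged.Count -ne $rowCount) {
    Write-Warning "Expected $rowCount translations but found $($merged.Count)."
}
```

Comparing the counts before writing into the CSV makes a misaligned file show up as a warning rather than as silently shifted rows.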