Search code examples
powershellunicodedirectoryexport-to-csv

How make Excel recognize data columns on a Unicode CSV?


I have a script (please see Original Script below) which extracts a list of filepaths in a directory and outputs the list to a csv, along with various file metadata. When I open the csv, I put the data into the appropriate columns by first replacing "," and ," with a rare character (such as ), then running text-to-columns based on . I'd like to eliminate the find-replace/text-to-column steps - do you know how I can do so?

Here is what I have tried:

  1. Remove -Encoding Unicode. This solves my problem, but creates another. It solves the problem in the sense that when I open the csv in Excel, the data is automatically sorted into columns but it creates another in that it also replaces any unusual characters in the file path with question marks. I later use the filepath for a move command, so this solution is unacceptable.
  2. Modify the last line to be $a = Get-FolderItem "Z:\Folder" | Export-Csv -Path C:\Folder\2024-12-23Testing.csv -Encoding Unicode -Delimiter ','. This doesn't work.
  3. Put the "find and replace" step into Powershell by adding the following lines to the end of the script:
(Get-Content C:\Folder\2024-12-23Testing.csv).Replace('","', '¬') | Set-Content C:\Folder\2024-12-23Testing.csv 
(Get-Content C:\Folder\2024-12-23Testing.csv).Replace(',"', '¬') | Set-Content C:\Folder\2024-12-23Testing.csv

This works, but still requires a time-consuming "text-to-columns" step. I also tried various unicode delimiters, but they did not result in automatic columns.

Original Script

Function Get-FolderItem {
    
        [cmdletbinding(DefaultParameterSetName='Filter')]
        Param (
            [parameter(Position=0,ValueFromPipeline=$True,ValueFromPipelineByPropertyName=$True)]
            [Alias('FullName')]
            [string[]]$Path = $PWD,
            [parameter(ParameterSetName='Filter')]
            [string[]]$Filter = '*.*',    
            [parameter(ParameterSetName='Exclude')]
            [string[]]$ExcludeFile,              
            [parameter()]
            [int]$MaxAge,
            [parameter()]
            [int]$MinAge
        )
        Begin {
            $params = New-Object System.Collections.Arraylist
            $params.AddRange(@("/L","/E","/NJH","/BYTES","/FP","/NC","/XJ","/R:0","/W:0","T:W","/UNILOG:c:\temp\test.txt"))
            If ($PSBoundParameters['MaxAge']) {
                $params.Add("/MaxAge:$MaxAge") | Out-Null
            }
            If ($PSBoundParameters['MinAge']) {
                $params.Add("/MinAge:$MinAge") | Out-Null
            }
        }
        Process {
            ForEach ($item in $Path) {
                Try {
                    $item = (Resolve-Path -LiteralPath $item -ErrorAction Stop).ProviderPath
                    If (-Not (Test-Path -LiteralPath $item -Type Container -ErrorAction Stop)) {
                        Write-Warning ("{0} is not a directory and will be skipped" -f $item)
                        Return
                    }
                    If ($PSBoundParameters['ExcludeFile']) {
                        $Script = "robocopy `"$item`" NULL $Filter $params /XF $($ExcludeFile  -join ',')"
                    } Else {
                        $Script = "robocopy `"$item`" NULL $Filter $params"
                    }
                    Write-Verbose ("Scanning {0}" -f $item)
                    Invoke-Expression $Script | Out-Null
                    get-content "c:\temp\test.txt" | ForEach {
                        Try {
                            If ($_.Trim() -match "^(?<Children>\d+)\s(?<FullName>.*)") {
                               $object = New-Object PSObject -Property @{
                                    FullName = $matches.FullName
                                    #Extension = $matches.fullname -replace '.*\.(.*)','$1'
                                    #FullPathLength = [int] $matches.FullName.Length
                                    #FileHash = Get-FileHash -LiteralPath "\\?\$($matches.FullName)" |Select -Expand Hash
                                    #Created = ([System.IO.FileInfo] $matches.FullName).creationtime
                                    #Created = ([System.IO.FileInfo] "\\?\$($matches.FullName)").creationtime
                                    LastWriteTime = ([System.IO.FileInfo] "\\?\$($matches.FullName)").LastWriteTime
                                    #Characters = (Get-Content -LiteralPath "\\?\$($matches.FullName)" | Measure-Object -ignorewhitespace -Character).Characters
                                    #Size = ([System.IO.FileInfo] "\\?\$($matches.FullName)").length
                                    #Access = ((Get-Acl -LiteralPath "\\?\$($matches.FullName)") |Select -Expand AccessToString)-replace '[\r\n]',' '
                                    #$permission = (Get-Acl $Folder).Access | ?{$_.IdentityReference -match $User} | Select IdentityReference,FileSystemRights
                                    #Owner = (Get-ACL $matches.Fullname).Owner
                                } 
                                $object.pstypenames.insert(0,'System.IO.RobocopyDirectoryInfo')
                                Write-Output $object
                            } Else {
                                Write-Verbose ("Not matched: {0}" -f $_)
                            }
                        } Catch {
                            Write-Warning ("{0}" -f $_.Exception.Message)
                            Return
                        }
                    }
                } Catch {
                    Write-Warning ("{0}" -f $_.Exception.Message)
                    Return
                }
            }
        }
    }
    
 $a = Get-FolderItem "Z:\Folder" | Export-Csv -Path C:\Folder\2024-12-23Testing.csv -Encoding Unicode

Your help is appreciated.


Solution

  • How about importing the csv with utf8 (with bom) encoding? Excel is not as seamless with importing utf16 or unicode encoded text files.