How to get Get-ChildItem to handle path with non-breaking space


I have the following code that works for most files. The input file (FoundLinks.csv) is a UTF-8 file with one file path per line. Each line is the full path of a file on a particular drive that I need to process.

$inFiles = @()
$inFiles += @(Get-Content -Path "C:\Users\sw_admin\FoundLinks.csv")

foreach ($inFile in $inFiles) {
    Write-Host("Processing: " + $inFile)
    $objFile = Get-ChildItem -LiteralPath $inFile
    New-Object PSObject -Prop @{ 
        FullName = $objFile.FullName
        ModifyTime = $objFile.LastWriteTime
    }
} 

But even though I've used -LiteralPath, it still fails on files that have a non-breaking space (NBSP) in the file name.

Processing: q:\Executive\CLC\Budget\Co  2018 Budget - TO Bob (GA Prophix).xlsx
Get-ChildItem : Cannot find path 'Q:\Executive\CLC\Budget\Co  2018 Budget - TO Bob (GA Prophix).xlsx'
because it does not exist.
At ListFilesWithModifyTime.ps1:6 char:29
+     $objFile = Get-ChildItem <<<<  -LiteralPath $inFile
    + CategoryInfo          : ObjectNotFound: (Q:\Executive\CL...A Prophix).xlsx:String) [Get-ChildItem], ItemNotFound
   Exception
    + FullyQualifiedErrorId : PathNotFound,Microsoft.PowerShell.Commands.GetChildItemCommand

I know my input file has the non-breaking space in the path because I'm able to open it in Notepad, copy the offending path, paste into Word, and turn on paragraph marks. It shows a normal space followed by a NBSP just before 2018.

Is PowerShell not reading in the NBSP? Am I passing it wrong to -LiteralPath? I'm at my wit's end. I saw this solution, but in that case they are supplying the path as a literal in the script, so I can't see how I could use that approach.

I've also tried the -Encoding UTF8 parameter on Get-Content, but it made no difference.

I'm not even sure how I can check $inFile in the code just to confirm if it still contains the NBSP.

Grateful for any help to get unstuck!

Confirmed that $inFile has NBSP

Thank you all! As per @TheMadTechnician, I have updated the code like this, and also reduced my input file to only the one file having a problem.

$inFiles = @()
$inFiles += @(Get-Content -Path "C:\Users\sw_admin\FoundLinks.csv" -Encoding UTF8)

foreach ($inFile in $inFiles) {
    Write-Host("Processing: " + $inFile)

    # list out all chars to confirm it has an NBSP
    $inFile.ToCharArray()|%{"{0} -> {1}" -f $_,[int]$_}

    $objFile = Get-ChildItem -LiteralPath $inFile
    New-Object PSObject -Prop @{ 
        FullName = $objFile.FullName
        ModifyTime = $objFile.LastWriteTime
    }
} 

And so now I can confirm that $inFile in fact still contains the NBSP just as it gets passed to Get-ChildItem. Yet Get-ChildItem says the file does not exist.
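For a quicker check than dumping every character, a minimal sketch using String.IndexOf might look like the following. The sample path is hypothetical and only stands in for a line read from the input file; IndexOf returns -1 when the character is absent.

```powershell
# Hypothetical sample path: a regular space followed by U+00A0 (NBSP).
$inFile = 'q:\Budget\Co ' + [char] 0xA0 + '2018 Budget.xlsx'

# IndexOf does an ordinal search, so a regular space never matches the NBSP.
$nbspIndex = $inFile.IndexOf([char] 0xA0)

if ($nbspIndex -ge 0) {
    "NBSP found at zero-based index $nbspIndex"
} else {
    "No NBSP found"
}
```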

Other things I've tried:

  • Same result if I use Get-Item instead of Get-ChildItem
  • Same result if I use -Path instead of -LiteralPath
  • Windows Explorer and Excel can deal with the file successfully.

I'm on a Windows 7 machine, PowerShell 2.

Thanks again for all the responses!


Solution

  • It's still unclear why Sandra's code didn't work: PowerShell v2+ is capable of retrieving files with paths containing non-ASCII characters; perhaps a non-NTFS filesystem with different character encoding was involved?

    However, the following workaround turned out to be effective:

    $objFile = Get-ChildItem -Path ($inFile -replace ([char] 0xa0), '?')
    
    • The idea is to replace the non-breaking space char. (Unicode U+00A0; hex. 0xa0) in the input file path with the wildcard character ?, which represents any single char.

    • For Get-ChildItem to perform wildcard matching, -Path rather than -LiteralPath must be used (note that -Path is actually the default if you pass a path argument positionally, as the first argument).

    • Hypothetically, the wildcard-based paths could match multiple files; if that were the case, the individual matches would have to be examined to identify the specific match that has a non-breaking space in the position of the ?.
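The points above can be combined into a small helper that also handles the multiple-match case. This is only a sketch under stated assumptions: the function name Get-NbspFile is hypothetical, the input is assumed to be an absolute path in the same form that FullName reports, and the path is assumed to contain no other wildcard characters (*, [ or ]).

```powershell
# Hypothetical helper, not part of the original answer.
function Get-NbspFile {
    param([string] $Path)

    $nbsp = [string] [char] 0xA0

    # Swap each NBSP for the single-character wildcard '?'.
    $pattern = $Path.Replace($nbsp, '?')

    # -Path (not -LiteralPath) enables wildcard matching. Keep only the
    # match whose full path equals the input character for character
    # (ordinal comparison, ignoring case), so a file with a regular space
    # in the position of the '?' is filtered out.
    Get-ChildItem -Path $pattern | Where-Object {
        [string]::Equals($_.FullName, $Path,
            [System.StringComparison]::OrdinalIgnoreCase)
    }
}
```

Inside the loop from the question, the lookup would then read $objFile = Get-NbspFile -Path $inFile. The ordinal comparison is chosen so that a regular space and an NBSP are never treated as equal.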