windows powershell unicode arabic right-to-left

Reverse every word that contains Arabic letters


Because PowerShell automatically displays Arabic text reversed, I have to reverse it myself to make it readable.
For example, the text 'مرحبا' is displayed as 'ابحرم'.

I can fix words one at a time, but I am struggling to do it for many words at once.

function ReverseText {
    Return ( -Join ($Args.ToCharArray() | Sort {(--$script:i)}) )
}
$AJoin  = 'ابتثجحخدذرزسشصضطظعغفقكلمنهوي'
$AMatch = $AJoin.ToCharArray() -join '|'

$data = @(
"1. The verb for the male singular in Arabic is?"
"   a) 'تفعل'" 
"   b) 'تفعلان'"  
"   c) 'يفعلون'"  
"   d) 'يفعل'"
"2. 'تلميذ' is the mufrad form of the word?"
"   a) 'تلاميذ'"
"   b) 'تلميذات'"  
"   c) 'تلميذان'" 
"   d) 'تلميذين'"
)

ReverseText 'بيتان'
$Newdata = $data | Foreach {
    If ($_.Split("'") -match $AMatch) {
        'configuous to continue'
    }
}
$Newdata; pause

With this script, I can reverse a single word, 'بيتان', using the command ReverseText 'بيتان'.

In $data, I want to reverse the letter order of every Arabic word and save the result in $Newdata. The sample shows only 2 of the 50 entries, which are taken from a .txt file.
I don't know how to do this yet.

I'd appreciate any help. Thanks.


Solution

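  • Use one of the following to selectively reverse runs of Arabic characters in your input strings:

    # Works in both Windows PowerShell and PowerShell (Core) 7+:
    # a script block passed to [regex]::Replace() acts as the match evaluator.
    $data | ForEach-Object {
      [regex]::Replace($_, 
        '\p{IsArabic}{2,}',
        { param($m) [Array]::Reverse(($chars = $m.Value.ToCharArray())); [string]::new($chars) }
      )
    }
    
    # PowerShell (Core) 6.1+ only: -replace accepts a script block as the
    # replacement operand, in which $_ refers to the match at hand.
    $data -replace 
      '\p{IsArabic}{2,}', 
      { [Array]::Reverse(($chars = $_.Value.ToCharArray())); [string]::new($chars) }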
    

    • In both cases, regex '\p{IsArabic}{2,}' is used to match all runs of two or more ({2,}) Arabic characters (\p{IsArabic}), and any such run's characters are reversed.

    • In PowerShell hosts that do not support Unicode's bidirectional text-rendering algorithm - such as both the legacy Windows console host (conhost.exe) and Windows Terminal - the result then displays (mostly) correctly. Note, however, that this approach (a) should fundamentally be limited to such hosts and (b) should only be used to produce for-display output.

      • Caveat: As noted in the comments, this simple character-by-character reversal isn't always enough, as Arabic text can contain ligatures and other groups of characters that must be treated as a unit - see this answer for background information.

      • See this answer for general background information regarding bidirectional text rendering.
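
The caveat about characters that must be treated as a unit can be partly addressed by reversing text elements instead of individual characters. The following is a minimal sketch, not part of the original answer: the helper name Get-ReversedTextElements and the sample word are hypothetical, and it relies on .NET's [System.Globalization.StringInfo]::GetTextElementEnumerator() to keep a combining mark (such as a fatha) attached to the base letter before it. It does not attempt contextual shaping or ligature handling, so it should be read as an illustration of the idea rather than a complete fix.

```powershell
# Hypothetical helper, not part of the original answer: reverses a string by
# text elements (a base character plus any combining marks that follow it)
# instead of by individual UTF-16 code units.
function Get-ReversedTextElements {
  param([Parameter(Mandatory)] [string] $Text)

  $elements = [System.Collections.Generic.List[string]]::new()
  $enumerator = [System.Globalization.StringInfo]::GetTextElementEnumerator($Text)
  while ($enumerator.MoveNext()) {
    $elements.Add($enumerator.GetTextElement())
  }
  $elements.Reverse()
  -join $elements.ToArray()
}

# Sample line: BEH (U+0628) + FATHA (U+064E) + TEH (U+062A), built from code
# points so the example does not depend on the script file's encoding.
$word = -join ([char[]] (0x0628, 0x064E, 0x062A))
$line = "   a) '$word'"

# Same regex as in the answer; only the per-match reversal differs.
$newLine = [regex]::Replace($line, '\p{IsArabic}{2,}', {
  param($m) Get-ReversedTextElements $m.Value
})
```

With this sample, $newLine contains TEH followed by BEH + FATHA, so the fatha still follows the BEH it belongs to. A plain character-by-character reversal would instead produce TEH + FATHA + BEH, moving the mark onto the wrong letter.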