
Issues finding and replacing strings in PowerShell



I'm rather new to PowerShell and I'm trying to write a PowerShell script to convert some statements in VBScript to Microsoft JScript. Here is my code:

$vbs = 'C:\infile.vbs'
$js = 'C:\outfile.js'

(Get-Content $vbs | Set-Content $js)
(Get-Content $js) |
 Foreach-Object { $_ -match "Sub " } | Foreach-Object { "$_()`n`{" } | Foreach-Object { $_ -replace "Sub", "function" } | Out-File $js
 Foreach-Object { $_ -match "End Sub" } | Foreach-Object { $_ -replace "End Sub", "`}" } | Out-File $js
 Foreach-Object { $_ -match "Function " } | Foreach-Object { "$_()`n`{" } | Foreach-Object { $_ -replace "Function", "function" } | Out-File $js
 Foreach-Object { $_ -match "End Function" } | Foreach-Object { $_ -replace "End Function", "`}" } | Out-File $js

What I want is for my PowerShell program to take the code from the VBScript input file infile.vbs, convert it, and output it to the JScript output file outfile.js. Here is an example of what I want it to do:

Input file:

Sub HelloWorld
 (Code Here)
End Sub

Output File:

function HelloWorld()
{
 (Code Here)
}

Something similar would happen with regard to functions. From there, I would tweak the code manually to convert it. When I run my program in PowerShell v5.1, it does not show any errors. However, when I open outfile.js, I see only one line:

False

So really, I have two questions.
1. Why is this happening?
2. How can I fix this program so that it behaves how I want it to (as detailed above)?

Thanks,
Gabe


Solution

  • As for question #2 (How can I fix this program [...]?):

    Kirill Pashkov's helpful answer offers an elegant solution based on the switch statement.

    Note, however, that his solution:

    • is predicated on Sub <name> / Function <name> statement parts not being on the same line as the matching End Sub / End Function parts - while this is typically the case, it isn't a syntactical requirement; e.g., Sub Foo() WScript.Echo("hi") End Sub - on a single line - works too.

    • in line with your own solution attempt, blindly appends () to Sub / Function definitions, which won't work with input procedures / functions that already have parameter declarations (e.g., Sub Foo (bar, baz)).
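
The second limitation is easy to reproduce. The following is a minimal sketch that applies the same append-then-replace steps to the parameterized example above; the variable names $line and $naive are illustrative only.

```powershell
# A procedure header that already declares parameters.
$line = 'Sub Foo (bar, baz)'

# Blindly append "()" and then swap the keyword, as the original attempt does.
$naive = "$line()" -replace 'Sub', 'function'

$naive   # function Foo (bar, baz)()
```

The result carries two parameter lists, which is not a valid JScript function header.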

    The following solution:

    • also works with single-line Sub / Function definitions
    • correctly preserves parameter declarations
    Get-Content $vbs | ForEach-Object {
      # The parameter list is optional: 'Sub Foo' and 'Sub Foo (bar, baz)' both match.
      $_ -replace '\b(?:sub|function)\s+(\w+)\s*(?:\((.*?)\))?', 'function $1($2) {' `
         -replace '\bend\s+(?:sub|function)\b', '}'
    } | Out-File $js
    

    The above relies heavily on regexes (regular expressions) to transform the input; for specifics on how regex matching results can be referred to in the -replace operator's replacement-string operand, see this answer.
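
As a small illustration of how capture groups are referenced in the replacement operand, here is a sketch on a single in-memory line. The sample string and the variable names $line and $converted are made up for the example.

```powershell
$line = 'Sub Foo (bar, baz)'

# (\w+) is capture group 1 (the name); (\(.*?\)) is capture group 2 (the
# parenthesized parameter list). $1 and $2 in the replacement refer to them.
# The single quotes keep PowerShell from treating $1 / $2 as its own variables.
$converted = $line -replace '\bsub\s+(\w+)\s*(\(.*?\))', 'function $1$2 {'

$converted   # function Foo(bar, baz) {
```

Note that -replace is case-insensitive by default, which is why the lowercase sub in the pattern matches Sub in the input.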

    Caveat: There are many other syntax differences between VBScript and JScript that your approach doesn't cover, notably that VBScript has no return statement and instead uses <funcName> = ... to return values from functions.


  • As for question #1 (Why is this happening?):

    However, when I open outfile.js, I see only one line:
    False
    [...]
    1. Why is this happening?

    • All but the first ForEach-Object cmdlet call run in separate statements, because the initial pipeline ends with the first call to Out-File $js.

    • The subsequent ForEach-Object calls each start a new pipeline, and since each pipeline ends with Out-File $js, each such pipeline writes to file $js - and thereby overwrites whatever the previous one wrote.
      Therefore, it is the last pipeline that determines the ultimate contents of file $js.

    • A ForEach-Object that starts a pipeline receives no input. However, its associated script block ({...}) is still entered once in this case, with $_ being $null[1]:

      • The last pipeline starts with Foreach-Object { $_ -match "End Function" }, so its output is the equivalent of $null -match "End Function", which yields $False, because -match with a scalar LHS (a single input object) outputs a Boolean value that indicates whether a match was found or not.

      • Therefore, given that the middle pipeline segment (Foreach-Object { $_ -replace "End Function", "}" }) is an effective no-op ($False is stringified to 'False', and the -replace operator therefore finds no match to replace and passes the stringified input out unmodified), Out-File $js receives string 'False' and writes just that to output file $js.
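
The behavior of that last pipeline can be reproduced without any files. This is a minimal sketch that assumes nothing in the current scope has assigned a value to $_; the variable names $matchResult and $written are illustrative.

```powershell
# A ForEach-Object call that starts a pipeline receives no input, but its
# script block still runs once. $_ is $null here, assuming nothing in the
# current scope assigned a value to $_ explicitly.
$matchResult = ForEach-Object { $_ -match 'End Function' }

# -replace stringifies the Boolean and finds nothing to replace.
$written = $matchResult | ForEach-Object { $_ -replace 'End Function', '}' }

$matchResult.GetType().Name   # Boolean
$written.GetType().Name       # String
$written                      # False
```

$written is what Out-File $js would receive, which is why the file ends up containing only False.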


    Even if you combined your separate commands into a single pipeline with a single Out-File $js segment at the very end, the command still wouldn't work:

    Given that Get-Content sends the input file's lines through the pipeline one by one, something like $_ -match "Sub " will again produce a Boolean result - indicating whether the line at hand ($_) matched string "Sub " - and pass that on.

    While you could turn -match into a filter by making the LHS an array - by enclosing it in the array-subexpression operator @(...); e.g., @($_) -match "Sub " - that would:

    • pass lines that contain substring Sub through as a whole, and
    • omit lines that don't.

    In other words: This wouldn't work as intended, because:

    • lines that do not contain a matching substring would be omitted from the output, and
    • the lines that do match are reflected in full in $_ in the next pipeline segment - not just the matched part.
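
The difference between a scalar and an array LHS can be seen in a short sketch that uses the sample input from the question as an in-memory array. The variable names $lines, $flags, and $kept are illustrative.

```powershell
$lines = 'Sub HelloWorld', ' (Code Here)', 'End Sub'

# Scalar LHS: -match returns one Boolean per line, so the text itself is lost.
$flags = $lines | ForEach-Object { $_ -match 'Sub ' }
$flags   # True, False, False

# Array LHS: -match acts as a filter and returns the matching elements in full.
$kept = $lines | ForEach-Object { @($_) -match 'Sub ' }
$kept    # Sub HelloWorld
```

Neither form produces the desired output: the first yields only Booleans, and the second drops the body and End Sub lines.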

    [1] Strictly speaking, $_ retains whatever value it had in the current scope, but that value is only non-$null if you explicitly assigned one to $_. Because $_ is an automatic variable that PowerShell itself normally controls, doing so is ill-advised; see this GitHub discussion.