How to handle substring with dynamic length in PowerShell


I have a list of domains as follows:

facebook.com  
youtube.com/video  
google.com/
github.com/something/somewhere
microsoft.com/default.aspx

Now I want to use PowerShell to take a substring of each entry and produce a cleaned-up list, with the expected result as follows:

facebook.com
youtube.com
google.com
github.com
microsoft.com

This is the PowerShell I am using:

ForEach($website in $list){
    $finalList  = $website.Substring(0,$website.IndexOf('/'))
    Write-Output $finalList 
}

The problem is that when the loop reaches facebook.com, it throws the error Exception calling "Substring" with "2" argument(s): "Length cannot be less than zero." This happens because facebook.com contains no forward slash (/), so IndexOf returns -1.

I'm thinking of a way to exclude facebook.com but still can't seem to figure out a better way.


Solution

  • While using the methods of .NET types directly in PowerShell - such as .Substring() of the [string] type - is always an option, PowerShell's native features usually offer a more concise and elegant solution:

    # Create the list (an array of strings).
    $list = @'
    facebook.com  
    youtube.com/video  
    google.com/
    github.com/something/somewhere
    microsoft.com/default.aspx
    '@ -split '\r?\n'
    
    foreach ($website in $list) {
      # Split the list element by '/' (if present) and output the 1st token.
      ($website -split '/')[0]
    }
    

    The above yields:

    facebook.com  
    youtube.com
    google.com
    github.com
    microsoft.com
    

    The -split operator returns the array of tokens contained in the LHS ($website) based on the RHS separator (/), and index [0] returns the 1st such token.

    If the LHS happens to contain no instance of the separator, a single-element array is returned, which can still be safely accessed with index [0].
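
    To make the single-element behavior concrete, here is a minimal sketch. The variable names ($withSlash, $withoutSlash, and so on) are illustrative only and are not part of the original question. It also contrasts -split with .IndexOf(), which reports a missing separator as -1 and therefore needs an explicit guard before .Substring() can be called:

    ```powershell
    # Sample values chosen only for illustration.
    $withSlash    = 'github.com/something/somewhere'
    $withoutSlash = 'facebook.com'

    # -split always returns an array, even when the separator is absent.
    $tokensWith    = $withSlash -split '/'
    $tokensWithout = $withoutSlash -split '/'

    $tokensWith.Count      # 3
    $tokensWithout.Count   # 1
    $tokensWithout[0]      # facebook.com

    # By contrast, IndexOf() signals "not found" with -1, which is why
    # Substring(0, -1) throws in the original loop.
    $index = $withoutSlash.IndexOf('/')   # -1

    # A guarded Substring variant, if the .NET methods are preferred:
    $domain = if ($index -ge 0) { $withoutSlash.Substring(0, $index) } else { $withoutSlash }
    $domain                # facebook.com
    ```

    The guarded variant works, but it needs a branch for the no-separator case that the -split approach handles implicitly.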

    If the list of URLs also contains port numbers (e.g., website.com:8888), modify the split command as follows:

    ($website -split '[/:]')[0]
    

    This takes advantage of the fact that the -split operator accepts a regular expression as the separator, and that the character set [/:] matches either a single / or a single :.
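
    As a rough sketch of the regex-based split applied to a whole list: the input entries below (including the port-bearing ones) are made up for illustration, and the .Trim() call is an addition not found in the commands above, included on the assumption that trailing whitespace such as that after facebook.com in the sample data is unwanted:

    ```powershell
    # Hypothetical input; the entries with ports are invented for this example.
    $list = 'facebook.com  ', 'website.com:8888', 'example.org:8080/path', 'google.com/'

    $cleaned = foreach ($website in $list) {
      # Split on either '/' or ':' and keep the 1st token;
      # Trim() removes leading and trailing whitespace from that token.
      ($website -split '[/:]')[0].Trim()
    }

    $cleaned
    # facebook.com
    # website.com
    # example.org
    # google.com
    ```

    Note that this assumes the entries carry no scheme prefix: an entry such as https://website.com would split at the first : and yield https, so such input would need the prefix removed first.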