
How to parse a User-Agent string with PowerShell -match to get the browser and OS


I'm trying to parse an access log file from Caddy with PowerShell, and I have now gotten to the User-Agent string.

How would I go about getting the browser and operating system info out of the strings below?

Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36

Mozilla/5.0 (Linux; Android 13; SM-G991B) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Mobile Safari/537.36

Home Assistant/2023.1.1-3124 (Android 13; SM-G991B)

This is a User-Agent string from my own computer, and I can't fathom why Safari is in there when I use Chrome to access a page.

I thought about parsing the string with a regex, but my regex skills are almost nonexistent.

I found a regex at https://regex101.com/r/2McsiK/1, but it captures a whole lot more than just the actual browser and OS:

\((?<info>.*?)\)(\s|$)|(?<name>.*?)\/(?<version>.*?)(\s|$)

and it does not seem to work well with PowerShell's -match operator:

PS C:\> "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36" -match "\((?<info>.*?)\)(\s|$)|(?<name>.*?)\/(?<version>.*?)(\s|$)" | Out-Null

PS C:\> $Matches

Name                           Value
----                           -----
version                        5.0
name                           Mozilla
2
0                              Mozilla/5.0

Any advice would be helpful.


Solution

    • See Mathias' comments on the question for the perils of user-agent sniffing (parsing the user-agent string) in general.
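
    As for why the attempt in the question appears to return only Mozilla/5.0: the -match operator stops at the first match, so $Matches describes only that one. The following is a minimal sketch, assuming the question's regex is kept unchanged, that uses the .NET [regex]::Matches() method to enumerate every match instead. The Kind, Name, and Value property names are made up for illustration.

```powershell
$ua = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36'
$pattern = '\((?<info>.*?)\)(\s|$)|(?<name>.*?)\/(?<version>.*?)(\s|$)'

# [regex]::Matches() returns all matches in the input, not just the first one.
$tokens = foreach ($m in [regex]::Matches($ua, $pattern)) {
  if ($m.Groups['info'].Success) {
    # A parenthesized comment such as (Windows NT 10.0; Win64; x64)
    [pscustomobject] @{ Kind = 'Comment'; Name = $null; Value = $m.Groups['info'].Value }
  } else {
    # A product token such as Chrome/110.0.0.0
    [pscustomobject] @{ Kind = 'Product'; Name = $m.Groups['name'].Value; Value = $m.Groups['version'].Value }
  }
}
$tokens
```

    This lists every product token and comment, which still leaves the question of which token names the real browser; the solution below addresses that part.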

    Regex-based PowerShell-only solution:

    • The following tries hard to extract the relevant information, but given the lack of standardization of user-agent strings, there is no guarantee that it works meaningfully across all platforms and browsers.

    # Sample user-agent strings, spanning
    # * Windows, macOS, Linux, iOS, and Android
    # * Chrome, Safari, Edge, Firefox, Opera
    $userAgentStrings = @(
      'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36'
      'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.2 Safari/605.1.15'
      'Mozilla/5.0 (Windows NT 10.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36 Edg/110.0.1587.41'
      'Mozilla/5.0 (Linux; Android 13; SM-G991B) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Mobile Safari/537.36'
      'Home Assistant/2023.1.1-3124 (Android 13; SM-G991B)'
      'Mozilla/5.0 (iPhone; CPU iPhone OS 16_0_3 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.0 Mobile/15E148 Safari/604.1'
      'Mozilla/5.0 (iPad; CPU OS 6_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10A5355d Safari/8536.25'
      'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:101.0) Gecko/20100101 Firefox/101.0'
      'Mozilla/5.0 (X11; Linux ppc64le; rv:75.0) Gecko/20100101 Firefox/75.0'
      'Opera/9.80 (X11; Linux i686; Ubuntu/14.10) Presto/2.12.388 Version/12.16.2'
    )
    
    $userAgentStrings | ForEach-Object {
      if ($_ -match '^(?<browser1>.+?) \((?<os>.+?)\)(?: (?<engine>\S+(?: \(.+?\))?)(?: Version/(?<version>\S+(?: Mobile/\S+)?))?(?: (?<browser2>\S+))?(?: \S+? (?<browser4>\S+/\S+$))?)?') {
        # Determine the true browser name and version.
        $browser = if ($Matches.browser4) { $Matches.browser4 } elseif ($Matches.browser2) { $Matches.browser2 } else { $Matches.browser1 }
        if ($Matches.version) {
          $browser = ($browser -split '/')[0] + '/' + $Matches.version
        }
        # Output the captured substrings via a custom object.
        [pscustomobject] @{
          OS = $Matches.os
          Browser = $browser
          Engine = $Matches.engine
          IsMobile = $Matches.os -match '\bAndroid\b' -or $Matches.version -match '\bMobile\b'
        }
      }
    }
    

    For an explanation of the regex and the ability to experiment with it, see this regex101.com page.

    Output:

    OS                                         Browser                      Engine                                   IsMobile
    --                                         -------                      ------                                   --------
    Windows NT 10.0; Win64; x64                Chrome/110.0.0.0             AppleWebKit/537.36 (KHTML, like Gecko)      False
    Macintosh; Intel Mac OS X 10_15_7          Safari/16.2                  AppleWebKit/605.1.15 (KHTML, like Gecko)    False
    Windows NT 10.0                            Edg/110.0.1587.41            AppleWebKit/537.36 (KHTML, like Gecko)      False
    Linux; Android 13; SM-G991B                Safari/537.36                AppleWebKit/537.36 (KHTML, like Gecko)       True
    Android 13; SM-G991B                       Home Assistant/2023.1.1-3124                                              True
    iPhone; CPU iPhone OS 16_0_3 like Mac OS X Safari/16.0 Mobile/15E148    AppleWebKit/605.1.15 (KHTML, like Gecko)     True
    iPad; CPU OS 6_0 like Mac OS X             Safari/6.0 Mobile/10A5355d   AppleWebKit/536.26 (KHTML, like Gecko)       True
    Macintosh; Intel Mac OS X 10.15; rv:101.0  Firefox/101.0                Gecko/20100101                              False
    X11; Linux ppc64le; rv:75.0                Firefox/75.0                 Gecko/20100101                              False
    X11; Linux i686; Ubuntu/14.10              Opera/12.16.2                Presto/2.12.388                             False
    
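
    Note that the output above reports the Android Chrome string as Safari/537.36. One possible way to cross-check the browser name is to test for known product tokens in priority order. The following is only a sketch under that assumption: Get-BrowserToken is a hypothetical helper name, and the token list is illustrative rather than exhaustive.

```powershell
# Hypothetical helper: returns 'Name/Version' for the first known browser token
# found in the string, or $null if none of the listed tokens is present.
function Get-BrowserToken {
  param([Parameter(Mandatory)][string] $UserAgent)

  # Order matters: Edge also advertises Chrome and Safari tokens,
  # and Chrome also advertises a Safari token.
  $candidates = [ordered] @{
    Edge    = '\bEdg/(?<v>[\d.]+)'
    Firefox = '\bFirefox/(?<v>[\d.]+)'
    Chrome  = '\bChrome/(?<v>[\d.]+)'
    Safari  = '\bVersion/(?<v>[\d.]+).*\bSafari/'   # Safari's version is in Version/...
  }

  foreach ($name in $candidates.Keys) {
    if ($UserAgent -match $candidates[$name]) {
      return '{0}/{1}' -f $name, $Matches.v
    }
  }
  return $null
}

Get-BrowserToken 'Mozilla/5.0 (Linux; Android 13; SM-G991B) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Mobile Safari/537.36'
```

    Strings without any listed token, such as the Home Assistant one, yield $null here, so the regex-based result would still be needed as a fallback.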

    More complete, web-service-based PowerShell solution:

    • https://useragentstring.com/pages/api.php offers an API that returns the parsed components as a JSON object, which a call via Invoke-RestMethod automatically converts to a PowerShell custom object.

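    • While slower, this solution identifies browsers and operating systems more completely than the pure PowerShell solution, though it omits some details that the raw string contains, such as the CPU architecture (Win64; x64) and the device model (SM-G991B).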

    $userAgentStrings = @(
      'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36'
      'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.2 Safari/605.1.15'
      'Mozilla/5.0 (Windows NT 10.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36 Edg/110.0.1587.41'
      'Mozilla/5.0 (Linux; Android 13; SM-G991B) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Mobile Safari/537.36'
      'Home Assistant/2023.1.1-3124 (Android 13; SM-G991B)'
      'Mozilla/5.0 (iPhone; CPU iPhone OS 16_0_3 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.0 Mobile/15E148 Safari/604.1'
      'Mozilla/5.0 (iPad; CPU OS 6_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10A5355d Safari/8536.25'
      'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:101.0) Gecko/20100101 Firefox/101.0'
      'Mozilla/5.0 (X11; Linux ppc64le; rv:75.0) Gecko/20100101 Firefox/75.0'
      'Opera/9.80 (X11; Linux i686; Ubuntu/14.10) Presto/2.12.388 Version/12.16.2'
    )
    
    $userAgentStrings | 
      ForEach-Object {
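        # Percent-encode the string so that characters such as '&' or '+' cannot break the query.
        Invoke-RestMethod ('https://useragentstring.com?uas={0}&getJSON=all' -f [uri]::EscapeDataString($_))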
      } | 
      Format-Table
    

    Output:

    agent_type agent_name             agent_version os_type   os_name    os_versionName os_versionNumber os_producer os_producerURL linux_distibution
    ---------- ----------             ------------- -------   -------    -------------- ---------------- ----------- -------------- -----------------
    Browser    Chrome                 110.0.0.0     Windows   Windows 10                                                            Null
    Browser    Safari                 16.2          Macintosh OS X                      10_15_7                                     Null
    Browser    Chrome                 110.0.0.0     Windows   Windows 10                                                            Null
    Browser    Android Webkit Browser --            Android   Android                   13                                          Null
    unknown    unknown                              Android   Android                   13                                          Null
    Browser    Safari                 16.0          Macintosh iPhone OS                 16_0_3                                      Null
    Browser    Safari                 6.0           Macintosh iPhone OS                 6_0                                         Null
    Browser    Firefox                101.0         Macintosh OS X                      10.15                                       Null
    Browser    Firefox                75.0          Linux     Linux                                                                 Null
    Browser    Opera                  12.16.2       Linux     Linux                                                                 Ubuntu
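
    Since every lookup is a network round trip and an access log tends to repeat the same user-agent strings, it may be worth caching results per distinct string. The following is a sketch, not a tested client: Get-UserAgentApiUrl, Get-UserAgentInfo, and $uaCache are hypothetical names, and the URL format is the one used above.

```powershell
# Hypothetical helper: builds the request URL for one user-agent string,
# percent-encoding it so that spaces, ';', '/', '&', etc. are safe in the query.
function Get-UserAgentApiUrl {
  param([Parameter(Mandatory)][string] $UserAgent)
  'https://useragentstring.com?uas={0}&getJSON=all' -f ([uri]::EscapeDataString($UserAgent))
}

# Cache of already-parsed strings, keyed by the raw user-agent string.
$script:uaCache = @{}

# Hypothetical helper: calls the web service only for strings not seen before.
function Get-UserAgentInfo {
  param([Parameter(Mandatory)][string] $UserAgent)
  if (-not $script:uaCache.ContainsKey($UserAgent)) {
    $script:uaCache[$UserAgent] = Invoke-RestMethod (Get-UserAgentApiUrl $UserAgent)
  }
  $script:uaCache[$UserAgent]
}

# Usage (performs network requests):
# $userAgentStrings | ForEach-Object { Get-UserAgentInfo $_ } | Format-Table
```

    The cache lives only for the current session; persisting it between runs would be a separate design decision.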