powershell markdown stack-overflow invoke-webrequest

PowerShell, download the contents of a StackOverflow answer or comment


Each answer or comment on a Stack Overflow question thread has a unique URL. How can we use that URL with Invoke-WebRequest (or another tool) to capture just the contents of that answer or comment in mini-Markdown, and then extract useful information from it?

Some answers contain complete scripts, and I would sometimes like to automate retrieving them into .ps1 files on various systems. For example, given the URL https://superuser.com/questions/176624/linux-top-command-for-windows-powershell/1426271#1426271 , I would like to grab just the PowerShell code portion and pipe it into a file called mytop.ps1.


Solution

  • You can use the Stack Exchange REST API to pull the answer directly, in particular the answers-by-id method.

    With the withbody filter it still doesn't give you the Markdown, but drilling down to the answer's body in the JSON response is easier than parsing the full page source. Arguably, getting HTML for the answer body is even better than Markdown, because you consistently get <code> elements instead of having to parse all the different ways code can be formatted in Markdown (e.g. code fences and indentation).

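    # Fetch the answer by ID; the 'withbody' filter adds the HTML body to the response.
    $answer = Invoke-RestMethod 'https://api.stackexchange.com/2.3/answers/1426271?site=superuser&filter=withbody'
    
    # Extract every <code> element and decode HTML entities (&lt;, &quot;, ...) back to plain text.
    $codes = [RegEx]::Matches( $answer.items.body, '(?s)<code>(.*?)</code>' ).ForEach{ [System.Net.WebUtility]::HtmlDecode( $_.Groups[1].Value ) }
    
    # This gives you the PowerShell script for this particular answer only!
    $codes[6]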
    

    As there can be multiple <code> elements, you might want to use heuristics to determine which one contains the PowerShell script, e.g. sort by length and check whether the code spans multiple lines.
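
The length-and-multi-line heuristic could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name Get-LargestCodeBlock and the sample HTML are made up for this example, and the longest multi-line block is not guaranteed to be the script you want for every answer.

```powershell
# Hypothetical helper (not part of any module): returns the longest
# multi-line <code> block found in an HTML string, with entities decoded.
function Get-LargestCodeBlock {
    param(
        [Parameter(Mandatory)]
        [string] $Html
    )

    # Extract the inner text of every <code> element and decode HTML entities.
    $blocks = [regex]::Matches($Html, '(?s)<code>(.*?)</code>') |
        ForEach-Object { [System.Net.WebUtility]::HtmlDecode($_.Groups[1].Value) }

    # Keep only blocks that still contain a line break after trimming,
    # then pick the longest one.
    $blocks |
        Where-Object { $_.Trim() -match '\n' } |
        Sort-Object -Property Length -Descending |
        Select-Object -First 1
}

# Made-up sample body, so the sketch runs without network access.
$sampleHtml = @(
    '<p>Run <code>Get-Process</code> first.</p>'
    '<pre><code>while ($true) {'
    '    Get-Process | Sort-Object CPU -Descending | Select-Object -First 5'
    '    Write-Host &quot;tick&quot;'
    '    Start-Sleep -Seconds 2'
    '}'
    '</code></pre>'
) -join "`n"

$script = Get-LargestCodeBlock -Html $sampleHtml
$script
```

With a real response, the same helper could be fed $answer.items.body and its output piped to Set-Content to produce a file such as mytop.ps1. It is worth inspecting the result before running it, since the heuristic can pick the wrong block.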