Tags: powershell, selenium-webdriver, selenium-chromedriver

Download .xml file with selenium chrome driver


I'm trying to download an XML file with the Selenium Chrome driver, but I keep getting prompted with:

"This type of file can harm your computer. Do you want to keep the file anyway?".

I'm using Google Chrome version 80.0.3987.122 (Official Build) (32-bit) and ChromeDriver 80.0.3987.106. The PowerShell Chrome options I am using are below:

$ChromeOptions = New-Object OpenQA.Selenium.Chrome.ChromeOptions
$ChromeOptions.AddArguments(@(
    "--disable-extensions",
    "--ignore-certificate-errors"))

$download = "C:\temp\download"
$ChromeOptions.AddUserProfilePreference("safebrowsing.enabled", "true");
$ChromeOptions.AddUserProfilePreference("download.default_directory", $download);
$ChromeOptions.AddUserProfilePreference("download.prompt_for_download", "false");
$ChromeOptions.AddUserProfilePreference("download.directory_upgrade", "true");

$ChromeDriver = New-Object OpenQA.Selenium.Chrome.ChromeDriver($chromeOptions)

I'd appreciate the correct option to remove the prompt.


Solution

  • Why use a browser for this? The built-in PowerShell cmdlets Invoke-WebRequest and Start-BitsTransfer, or the .NET System.Net.WebClient class, can do it.

    You don't need a browser to scrape a web site or download a file. Note that some sites block any automated access, regardless of the tool you use.

    3 ways to download files with PowerShell

    # 1. Invoke-WebRequest
    
    $url = "http://mirror.internode.on.net/pub/test/10meg.test"
    $output = "$PSScriptRoot\10meg.test"
    $start_time = Get-Date
    
    Invoke-WebRequest -Uri $url -OutFile $output
    # TotalSeconds is the full duration; Seconds is only the 0-59 component
    Write-Output "Time taken: $((Get-Date).Subtract($start_time).TotalSeconds) second(s)"
    
    
    # 2. System.Net.WebClient
    
    $url = "http://mirror.internode.on.net/pub/test/10meg.test"
    $output = "$PSScriptRoot\10meg.test"
    $start_time = Get-Date
    
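    $wc = New-Object System.Net.WebClient
    $wc.DownloadFile($url, $output)
    #OR as a one-liner (use one form or the other, not both)
    # (New-Object System.Net.WebClient).DownloadFile($url, $output)
    
    # TotalSeconds is the full duration; Seconds is only the 0-59 component
    Write-Output "Time taken: $((Get-Date).Subtract($start_time).TotalSeconds) second(s)"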
    
    
    # 3. Start-BitsTransfer
    
    $url = "http://mirror.internode.on.net/pub/test/10meg.test"
    $output = "$PSScriptRoot\10meg.test"
    $start_time = Get-Date
    
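    Import-Module BitsTransfer
    Start-BitsTransfer -Source $url -Destination $output
    
    #OR run it as a background job; the file is only written once the job is completed
    # $job = Start-BitsTransfer -Source $url -Destination $output -Asynchronous
    # while ($job.JobState -in 'Queued', 'Connecting', 'Transferring') { Start-Sleep -Seconds 1 }
    # $job | Complete-BitsTransfer
    
    # TotalSeconds is the full duration; Seconds is only the 0-59 component
    Write-Output "Time taken: $((Get-Date).Subtract($start_time).TotalSeconds) second(s)"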
    
    
    # Get specifics for a module, cmdlet, or function
    (Get-Command -Name Invoke-WebRequest).Parameters
    (Get-Command -Name Invoke-WebRequest).Parameters.Keys
    Get-help -Name Invoke-WebRequest -Examples
    <#
    # Built-In Examples
    
    $R = Invoke-WebRequest -URI http://www.bing.com?q=how+many+feet+in+a+mile
    $R.AllElements | where {$_.innerhtml -like "*=*"} | Sort { $_.InnerHtml.Length } | Select InnerText -First 5
    # The shortest HTML value often helps you find the most specific element that matches that text.
    $R=Invoke-WebRequest http://www.facebook.com/login.php -SessionVariable fb
    $FB
    $Form = $R.Forms[0]
    $Form | Format-List
    $Form.fields
    $Form.Fields["email"]="[email protected]"
    $R=Invoke-WebRequest -Uri ("https://www.facebook.com" + $Form.Action) -WebSession $FB -Method POST -Body $Form.Fields
    # Sends a sign-in request by running the Invoke-WebRequest cmdlet. The command specifies a value of "fb" for the SessionVariable parameter, and saves the 
    $R.StatusDescription
    (Invoke-WebRequest -Uri "http://msdn.microsoft.com/en-us/library/aa973757(v=vs.85).aspx").Links.Href
    #>
    Get-help -Name Invoke-WebRequest -Full
    Get-help -Name Invoke-WebRequest -Online
    
    
    
    (Get-Command -Name Start-BitsTransfer).Parameters
    (Get-Command -Name Start-BitsTransfer).Parameters.Keys
    Get-help -Name Start-BitsTransfer -Examples
    <#
    # Built-In Examples
    
    Start-BitsTransfer -Source "http://server01/servertestdir/testfile1.txt" -Destination "c:\clienttestdir\testfile1.txt"
    Import-CSV filelist.txt | Start-BitsTransfer
    Start-BitsTransfer -Source "c:\clienttestdir\testfile1.txt" -Destination "http://server01/servertestdir/testfile1.txt" -TransferType Upload
    Start-BitsTransfer -Source "http://server01/servertestdir/testfile1.txt", "http://server01/servertestdir/testfile2.txt" -Destination 
    $Cred = Get-Credential
     Start-BitsTransfer -DisplayName MyJob -Credential $Cred -Source "http://server01/servertestdir/testfile1.txt" -Destination "c:\clienttestdir\testfile1.txt"
    Import-CSV filelist.txt | Start-BitsTransfer -Asynchronous -Priority Normal
    Start-BitsTransfer -Source "http://server01/servertestdir/*.*" -Destination "c:\clienttestdir\"
    Import-CSV filelist.txt | Start-BitsTransfer -TransferType Upload
    Start-BitsTransfer -Source .\Patch0416.msu -Destination $env:temp\Patch0416.msu -ProxyUsage Override -ProxyList BitsProxy:8080 -ProxyCredential 
    #>
    Get-help -Name Start-BitsTransfer -Full
    Get-help -Name Start-BitsTransfer -Online
    
    
    <#
    WebClient Class
    
    Definition 
    Namespace: System.Net
    Assembly:  System.Net.WebClient.dll
    
    Provides common methods for sending data to and receiving data from a resource identified by a URI.
    
    https://learn.microsoft.com/en-us/dotnet/api/system.net.webclient?view=netcore-3.1
    
    
    
    WebClient.DownloadFile Method
    
    Namespace: System.Net
    Assembly:  System.Net.WebClient.dll
    
    Downloads the resource with the specified URI to a local file.
    
    https://learn.microsoft.com/en-us/dotnet/api/system.net.webclient.downloadfile?view=netcore-3.1
    #>
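
    Since the question is about an .xml file, it can be worth checking that whatever was saved actually parses as XML (a blocked or redirected request often saves an HTML error page instead). Here is a minimal sketch of that check. Test-XmlFile is a hypothetical helper name, and the URL and path in the usage comment are placeholders, not values from the question.

    ```powershell
    function Test-XmlFile {
        # Hypothetical helper: returns $true when the file at $Path is well-formed XML
        param([Parameter(Mandatory = $true)] [string] $Path)

        try {
            $doc = New-Object System.Xml.XmlDocument
            # Resolve to a full path first; .NET does not follow PowerShell's current location
            $fullPath = (Resolve-Path -LiteralPath $Path -ErrorAction Stop).ProviderPath
            $doc.Load($fullPath)   # throws if the file is not well-formed XML
            return $true
        }
        catch {
            return $false
        }
    }

    # Usage (placeholder URL and path):
    # $url    = "https://example.com/data.xml"
    # $output = "C:\temp\download\data.xml"
    # Invoke-WebRequest -Uri $url -OutFile $output
    # if (-not (Test-XmlFile -Path $output)) { Write-Warning "Download is not well-formed XML" }
    ```

    Any of the three download methods above could stand in for Invoke-WebRequest in the usage lines; the check only looks at the saved file.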
    
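    The examples above time each download by subtracting two Get-Date values. When reading the resulting TimeSpan, keep in mind that Seconds is only the seconds component (0-59), while TotalSeconds is the whole duration. The sketch below shows the difference on a fixed 75-second interval and, as one possible alternative, times an operation with a Stopwatch. The Start-Sleep call is just a stand-in for a real download.

    ```powershell
    # A 75-second interval, built directly so no real download is needed
    $elapsed = New-TimeSpan -Seconds 75

    $elapsed.Seconds        # seconds component only: 15
    $elapsed.TotalSeconds   # whole duration in seconds: 75

    # Stopwatch avoids the Get-Date arithmetic altogether
    $sw = [System.Diagnostics.Stopwatch]::StartNew()
    Start-Sleep -Milliseconds 200   # stand-in for the download call
    $sw.Stop()
    Write-Output ("Time taken: {0:N2} second(s)" -f $sw.Elapsed.TotalSeconds)
    ```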

    If you do need to do this with Selenium and Chrome, this similar SO thread should help:

    How to control the download of files with Selenium + Python bindings in Chrome

    As well as this article and sample:

    PowerShell & Selenium: Automate Web Browser Interactions – Part II
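
    One variation worth trying on the options from the question is to pass real booleans ($true / $false) instead of the strings "true" / "false", so the preferences reach Chrome as JSON booleans. Treat this as a sketch under assumptions, not a confirmed fix: whether it suppresses the prompt depends on the Chrome version. The WebDriver.dll path is hypothetical, and chromedriver.exe is assumed to be discoverable (for example on the PATH).

    ```powershell
    # Assumption: WebDriver.dll from the Selenium .NET package is in this hypothetical folder
    Add-Type -Path "C:\selenium\WebDriver.dll"

    $download = "C:\temp\download"

    $ChromeOptions = New-Object OpenQA.Selenium.Chrome.ChromeOptions
    $ChromeOptions.AddArguments(@(
        "--disable-extensions",
        "--ignore-certificate-errors"))

    # Same preference keys as in the question, but with boolean values rather than strings
    $ChromeOptions.AddUserProfilePreference("download.default_directory", $download)
    $ChromeOptions.AddUserProfilePreference("download.prompt_for_download", $false)
    $ChromeOptions.AddUserProfilePreference("download.directory_upgrade", $true)
    $ChromeOptions.AddUserProfilePreference("safebrowsing.enabled", $true)

    # Assumes chromedriver.exe matching the installed Chrome version can be found
    $ChromeDriver = New-Object OpenQA.Selenium.Chrome.ChromeDriver($ChromeOptions)
    ```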

    As for this warning ...

    > "this type of file can harm your computer. Do you want to keep the file anyway?".

    ... this is not a PowerShell warning; it comes from Windows and the browser (IE, Edge, Chrome, et al.) alerting you to a potential threat.

    Working around the Google Chrome "This type of file can harm your computer" problem

    Google announced recently that it made the decision to improve protection against unwanted software downloads in the Chrome browser and Google search.

    See these SO threads on this topic as well.

    How to disable 'This type of file can harm your computer' pop up

    This type of file can harm your computer, trying to download an .ini file in Chrome using c# and selenium