Tags: javascript, angular, cors, same-origin-policy

How can I download a web page as a string with Angular 4?


I'm trying to download (GET) a web page that returns a plain string, not XML or JSON.

Basically, is there any way to download a web page as a string in Angular 4, like WebClient.DownloadString in C#?

Note: At first I thought I could use the HTTP methods (observables, promises, JSONP) to download a website, so I tried them.

As I understand it, I can't use JSONP because it parses the result as JSON, and I get an error because the response is a string, not JSON.

Observables and promises fail because I get a CORS error. I'm not sure why I'm getting a CORS error, because the target is not a RESTful service, WCF, Web API, etc.:

No 'Access-Control-Allow-Origin' header is present on the requested resource.

I have also tried HttpClient, but I got the CORS error again.

So I believe there should be some other method, component, or module in Angular for downloading a web page as a string.


Solution

  • You can use a CORS proxy to get the content of sites that don’t send the Access-Control-Allow-Origin header. Here’s a simple example:

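    // JavaScript: request the page through the proxy and read the body as text
    const proxyurl = "https://cors-anywhere.herokuapp.com/";
    const requesturl = "https://google.com";
    fetch(proxyurl + requesturl)
        .then(response => response.text())
        .then(text => document.querySelector("pre").textContent = text)
        .catch(error => console.error(error));

    <!-- HTML: the element that displays the downloaded text -->
    <pre></pre>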

    What’s happening there is this:

    If a site doesn’t itself send an Access-Control-Allow-Origin response header, then browsers will block your frontend JavaScript code from accessing the response from that site when you make a request to it using the Fetch API, XHR, or Ajax methods from JavaScript libraries.

    But using the URL https://cors-anywhere.herokuapp.com/https://google.com causes the request to be made through https://cors-anywhere.herokuapp.com, an open CORS proxy which forwards the request to https://google.com and then receives the response back from it. The https://cors-anywhere.herokuapp.com backend adds the Access-Control-Allow-Origin header to the response and passes that back to your requesting frontend code.

    The browser will then allow your frontend code to access the response, because that response with the Access-Control-Allow-Origin response header is what the browser sees.
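
    The forwarding step described above can be sketched with Node.js built-ins alone. This is a rough illustration of the mechanism, not how cors-anywhere itself is implemented; the names withCorsHeader, targetFromPath, and createProxy are hypothetical, and the sketch handles only GET-style forwarding and does not follow redirects.

    ```javascript
    // Minimal CORS proxy sketch (assumption: Node.js, CommonJS modules).
    const http = require("http");
    const https = require("https");

    // Return a copy of the upstream headers with the CORS header added.
    function withCorsHeader(upstreamHeaders) {
        return Object.assign({}, upstreamHeaders, {
            "access-control-allow-origin": "*",
        });
    }

    // Extract the target URL from a request path such as "/https://example.com/".
    // Returns null when the path is not an absolute http(s) URL.
    function targetFromPath(path) {
        const candidate = path.replace(/^\/+/, "");
        return /^https?:\/\//i.test(candidate) ? candidate : null;
    }

    // Build (but do not start) a server that forwards each request to the
    // target URL and relays the response with the CORS header attached.
    function createProxy() {
        return http.createServer((req, res) => {
            const target = targetFromPath(req.url);
            if (target === null) {
                res.writeHead(400, withCorsHeader({ "content-type": "text/plain" }));
                res.end("Expected a path like /https://example.com/");
                return;
            }
            const client = target.toLowerCase().startsWith("https:") ? https : http;
            client
                .get(target, (upstream) => {
                    // Relay status, headers (plus the CORS header), and body.
                    res.writeHead(upstream.statusCode, withCorsHeader(upstream.headers));
                    upstream.pipe(res);
                })
                .on("error", (err) => {
                    res.writeHead(502, withCorsHeader({ "content-type": "text/plain" }));
                    res.end("Upstream request failed: " + err.message);
                });
        });
    }

    // To try it locally (port 8080 is an arbitrary choice):
    // createProxy().listen(8080);
    ```

    With the server listening, frontend code would request http://localhost:8080/https://example.com/ in place of the target URL.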

    You can also easily set up your own CORS proxy using https://github.com/Rob--W/cors-anywhere/

    For details about what browsers do when you send cross-origin requests from frontend JavaScript code using XHR, the Fetch API, or Ajax methods from JavaScript libraries, and about which response headers must be received for browsers to allow frontend code to access the responses, see https://developer.mozilla.org/en-US/docs/Web/HTTP/Access_control_CORS.
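
To get something closer to WebClient.DownloadString from the question, the fetch call can be wrapped in a small helper. The sketch below is only one possible shape: downloadString, proxyUrl, and fetchImpl are hypothetical names introduced here, and it assumes an environment where fetch is available (a browser, or a recent Node.js).

```javascript
// Hypothetical helper: fetch a page as plain text, optionally through a CORS proxy.
// `proxyUrl` is prepended to the target URL; `fetchImpl` lets callers swap in
// another fetch-compatible function (for example, in tests).
async function downloadString(url, { proxyUrl = "", fetchImpl = globalThis.fetch } = {}) {
    const response = await fetchImpl(proxyUrl + url);
    if (!response.ok) {
        // Surface HTTP errors instead of returning an error page as the result.
        throw new Error("Request failed with status " + response.status);
    }
    // text() returns the body as a string without attempting to parse JSON.
    return response.text();
}

// Usage sketch:
// downloadString("https://google.com", { proxyUrl: "https://cors-anywhere.herokuapp.com/" })
//     .then(text => console.log(text.length));
```

If you prefer Angular's HttpClient, passing { responseType: 'text' } to get() returns the body as a string instead of parsing it as JSON. The same CORS restriction applies there, so the proxy is still needed for sites that don't send the header.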