c# uri image-extraction

Image extraction: "Uri is too long"


I am working on software that extracts images from a web page. I have created this function:

        public static void GetAllImages()
        {

            WebClient x = new WebClient();
            string source = x.DownloadString(@"http://www.bbc.com");

            var document = new HtmlWeb().Load(source);
            var urls = document.DocumentNode.Descendants("img")
                                .Select(e => e.GetAttributeValue("src", null))
                                .Where(s => !String.IsNullOrEmpty(s));

            document.Load(source);


        }

It fails with the error "Uri is too long".

I tried to use Uri.EscapeDataString, but I can't work out where it should go.

Any help would be appreciated.


Solution

  • HtmlWeb.Load takes a URL as its source and handles downloading the content itself, so you don't need a separate WebClient for that.

    What you are doing is downloading the content, then passing the downloaded HTML to Load as if it were a URL (probably under the assumption that Load means Parse). A whole page of HTML is far longer than any valid URI, which is why you get "Uri is too long".

    So remove

    WebClient x = new WebClient();
    string source = x.DownloadString(@"http://www.bbc.com");
    

    then change the next line to

    var document = new HtmlWeb().Load(@"http://www.bbc.com");
    

    and you'll be good to go.
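
For reference, here is a minimal sketch of what the corrected function might look like as a whole. It assumes the HtmlAgilityPack NuGet package is referenced. The class name ImageExtractor, the helper ExtractImageSources, the pageUrl parameter and the List<string> return type are illustrative choices, not part of the original code. The trailing document.Load(source) call from the question is left out, on the assumption that it was another attempt to parse the downloaded string. The second method shows the alternative if you really do want to keep your own download step: HtmlDocument.LoadHtml parses an HTML string that you already have.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack; // assumes the HtmlAgilityPack NuGet package is installed

// Hypothetical wrapper class, used here only to make the sketch self-contained.
public static class ImageExtractor
{
    // Option 1: let HtmlWeb download and parse the page in one step.
    public static List<string> GetAllImages(string pageUrl)
    {
        HtmlDocument document = new HtmlWeb().Load(pageUrl);
        return ExtractImageSources(document);
    }

    // Option 2: parse HTML that was already downloaded (for example with WebClient).
    public static List<string> GetAllImagesFromHtml(string html)
    {
        var document = new HtmlDocument();
        document.LoadHtml(html); // parses the string; does not treat it as a URL or path
        return ExtractImageSources(document);
    }

    // Collects the non-empty src attribute of every <img> element.
    private static List<string> ExtractImageSources(HtmlDocument document)
    {
        return document.DocumentNode.Descendants("img")
                       .Select(e => e.GetAttributeValue("src", ""))
                       .Where(s => !String.IsNullOrEmpty(s))
                       .ToList();
    }
}
```

Calling ImageExtractor.GetAllImages(@"http://www.bbc.com") would then return the src values as they appear in the page, so some of them may be relative URLs that still need resolving against the page address.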