With AngleSharp, to load an HTML page and wait until all stylesheets are downloaded (if required) and all scripts are ready to be executed by the parser, I do this:
public sealed class WebReader
{
    private IDocument _ashDocument;
    public async Task Load(string Url)
    {
        var config = Configuration.Default.WithDefaultLoader().WithJavaScript().WithCss();
        var context = BrowsingContext.New(config);
        _ashDocument = await context.OpenAsync(Url);
    }
    public IEnumerable<string> getImage()
    {
        return _ashDocument.QuerySelectorAll("img").Select(n => n.Attributes["src"].Value);
    }
}
static void Main(string[] args)
{
    WebReader wReader = new WebReader();
    AsyncContext.Run((Action)(async () =>
    {
        await wReader.Load("http://blogs.msdn.com/b/dotnet/");
    }));
    IEnumerable<string> imageUrls = wReader.getImage();
    foreach (string url in imageUrls)
    {
        Console.WriteLine(url);
    }
    Console.ReadKey();
}
AsyncContext is part of the AsyncEx library.
Is it possible to do the same thing without the AsyncEx library?
Not inside a console application. The whole point of AsyncContext is to allow you to await a method in Main, which itself isn't async (and, before C# 7.1, couldn't be). The only alternative is to block on the task. Additionally, as @StephanCleary notes, the continuation inside the context will resume on a single thread, instead of an arbitrary thread pool thread.
Without it, it would simply be:
static void Main(string[] args)
{
    WebReader wReader = new WebReader();
    // Block the main thread until the page has finished loading.
    wReader.Load("http://blogs.msdn.com/b/dotnet/").Wait();
    IEnumerable<string> imageUrls = wReader.getImage();
    foreach (string url in imageUrls)
    {
        Console.WriteLine(url);
    }
}
There are rare cases where blocking with Task.Wait is OK, and this is one of them.
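One detail worth knowing when blocking like this: if the task fails, Task.Wait wraps the failure in an AggregateException, whereas GetAwaiter().GetResult() rethrows the original exception. The following is only a sketch to illustrate that difference. FailingLoadAsync is a hypothetical stand-in for WebReader.Load and does not touch AngleSharp or the network.

```csharp
using System;
using System.Threading.Tasks;

public static class BlockingDemo
{
    // Hypothetical stand-in for WebReader.Load: fails after a short delay.
    public static async Task FailingLoadAsync()
    {
        await Task.Delay(10);
        throw new InvalidOperationException("load failed");
    }

    // Blocks with Wait() and reports the type of exception that surfaces.
    public static string ExceptionFromWait()
    {
        try { FailingLoadAsync().Wait(); return "none"; }
        catch (Exception ex) { return ex.GetType().Name; }
    }

    // Blocks with GetAwaiter().GetResult() and reports the exception type.
    public static string ExceptionFromGetResult()
    {
        try { FailingLoadAsync().GetAwaiter().GetResult(); return "none"; }
        catch (Exception ex) { return ex.GetType().Name; }
    }

    public static void Main()
    {
        Console.WriteLine(ExceptionFromWait());      // AggregateException
        Console.WriteLine(ExceptionFromGetResult()); // InvalidOperationException
    }
}
```

Either call blocks the main thread in the same way. The choice only affects which exception type a catch block has to handle.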
Side note: async methods should carry the Async suffix, so Load should be named LoadAsync. Also, .NET method naming conventions are Pascal case, not camel case, so getImage should be GetImage.
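As a rough sketch of those naming conventions, the reader could look like the code below. RenamedWebReader is a hypothetical name, and the body of LoadAsync is a placeholder (a short delay and a made-up image URL) standing in for the AngleSharp call, so the sample runs without any packages. The sketch also assumes C# 7.1 or later, where the compiler accepts an async Task Main, so no blocking call or AsyncContext is needed.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public sealed class RenamedWebReader
{
    private List<string> _imageUrls = new List<string>();

    // Pascal case plus the Async suffix for a Task-returning method.
    public async Task LoadAsync(string url)
    {
        // Placeholder for: _ashDocument = await context.OpenAsync(url);
        await Task.Delay(10);
        _imageUrls = new List<string> { url + "logo.png" }; // made-up data
    }

    // Pascal case for an ordinary synchronous method.
    public IEnumerable<string> GetImages()
    {
        return _imageUrls;
    }
}

public static class Program
{
    // An async entry point; requires C# 7.1 or later.
    public static async Task Main()
    {
        var reader = new RenamedWebReader();
        await reader.LoadAsync("http://blogs.msdn.com/b/dotnet/");
        foreach (string url in reader.GetImages())
        {
            Console.WriteLine(url);
        }
    }
}
```

Unlike AsyncContext, an async Main does not install a single-threaded context, so code after the await may continue on a thread pool thread.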