Headless browser for C# (.NET)?
There are some options:
- WebKit.Net (free)
- Awesomium: It is based on Chrome/WebKit and works like a charm. There is a free license available as well as a commercial one, and if need be you can buy the source code :-)
- HTML Agility Pack (free) (an HTML parser library, NOT a headless browser): This helps with extracting information from HTML etc. and might be useful in your case (possibly in combination with HttpWebRequest).
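To illustrate the HTML Agility Pack plus HttpWebRequest combination, here is a minimal sketch, not a definitive implementation. It assumes the HtmlAgilityPack NuGet package is installed; the class name LinkScraper, the user-agent string, and the example.com URL are made up for illustration. Note that this only downloads and parses static HTML, so no JavaScript is executed, and that newer .NET versions mark WebRequest.Create as obsolete in favour of HttpClient.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using HtmlAgilityPack; // assumption: NuGet package "HtmlAgilityPack" is referenced

public static class LinkScraper // hypothetical name, for illustration only
{
    // Downloads the raw HTML of a page. Scripts on the page are NOT run.
    public static string FetchHtml(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.UserAgent = "Mozilla/5.0 (compatible; ExampleScraper/1.0)"; // made-up value

        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }

    // Parses an HTML string and returns the href of every <a> that has one.
    public static List<string> ExtractLinks(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var links = new List<string>();
        var anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null)
        {
            return links; // SelectNodes returns null when nothing matches
        }

        foreach (var anchor in anchors)
        {
            links.Add(anchor.GetAttributeValue("href", string.Empty));
        }
        return links;
    }

    public static void Main()
    {
        // Placeholder URL; replace with the page you want to scrape.
        string html = FetchHtml("http://example.com/");
        foreach (var href in ExtractLinks(html))
        {
            Console.WriteLine(href);
        }
    }
}
```

Fetching and parsing are kept in separate methods so that ExtractLinks can be run against any HTML string, including one produced by a real headless browser when the page depends on JavaScript.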
More solutions:
- PhantomJS - full-featured headless web browser. Often paired with Selenium, which lets you drive the browser from a .NET application.
- Optimus (NuGet package) - lightweight headless web browser. It's in beta, but it is sufficient for some cases.
I used to use both for web testing, but they are also suitable for web scraping.
You may be after TrifleJS (currently in beta), or something similar built on the .NET WebBrowser class, which communicates with IE via a windowless ActiveX/COM API.
You'll essentially be running a fully fledged browser (not an HTTP request wrapper) using Internet Explorer's Trident engine. If you are not interested in the JavaScript API (a port of PhantomJS), you may still be able to use some of the C# codebase to see how key concepts are handled (custom headers, cookies, script execution, screenshot rendering etc).
Note that this can also emulate different versions of IE depending on what you have installed.