How to create an HtmlUnit HtmlPage object from a String?
With HtmlUnit 2.40, Grooveek's code won't compile; it fails with "Cannot make a static reference to the non-static method parseHtml(WebResponse, WebWindow) from the type HTMLParser". However, there is now a class, HtmlUnitNekoHtmlParser, that implements the HTMLParser interface, so the following code works:
StringWebResponse response = new StringWebResponse(
        "<html><head><title>Test</title></head><body></body></html>",
        new URL("http://www.example.com"));
HtmlPage page = new HtmlUnitNekoHtmlParser().parseHtml(
        response, new WebClient().getCurrentWindow());
There is sample code for this in the FAQ (https://htmlunit.sourceforge.io/faq.html#HowToParseHtmlString), e.g.:
final String htmlCode = "<html>"
        + " <head>"
        + " <title>Title</title>"
        + " </head>"
        + " <body>"
        + " content..."
        + " </body>"
        + "</html> ";
// browserVersion is assumed to be defined elsewhere, e.g. BrowserVersion.CHROME
try (WebClient webClient = new WebClient(browserVersion)) {
    final HtmlPage page = webClient.loadHtmlCodeIntoCurrentWindow(htmlCode);
    // work with the html page
}
This code works in GroovyConsole (it uses HtmlUnit 2.8, where HTMLParser.parseHtml is called statically):
@Grapes(
@Grab(group='net.sourceforge.htmlunit', module='htmlunit', version='2.8')
)
import com.gargoylesoftware.htmlunit.*
import com.gargoylesoftware.htmlunit.html.*
URL url = new URL("http://www.example.com");
StringWebResponse response = new StringWebResponse("<html><head><title>Test</title></head><body></body></html>", url);
WebClient client = new WebClient()
HtmlPage page = HTMLParser.parseHtml(response, client.getCurrentWindow());
System.out.println(page.getTitleText());