Remove HTML node from HtmlDocument: HtmlAgilityPack
It seems you're modifying the collection while enumerating it by calling the HtmlNode.RemoveChild method.
To fix this, copy your nodes to a separate list or array first, e.g. by calling Enumerable.ToList<T>() or Enumerable.ToArray<T>():
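var nodes = doc.DocumentNode
    .SelectNodes("//img[not(string-length(normalize-space(@src)))]");
// SelectNodes returns null when nothing matches, so guard before copying.
var nodesToRemove = nodes?.ToList() ?? new List<HtmlNode>();
foreach (var node in nodesToRemove)
    node.Remove();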
If I'm right, the problem will disappear.
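For illustration, the same failure mode can be reproduced without HtmlAgilityPack. This is only a sketch on a plain List<string>, assuming the error in question is the usual InvalidOperationException ("Collection was modified") that .NET list enumerators throw; the names items and Program are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Program
{
    static void Main()
    {
        var items = new List<string> { "a", "", "b", "" };

        try
        {
            // Removing from the list while enumerating it invalidates the enumerator.
            foreach (var item in items)
                if (item.Length == 0)
                    items.Remove(item);
        }
        catch (InvalidOperationException ex)
        {
            Console.WriteLine("Enumeration failed: " + ex.Message);
        }

        // Enumerating a snapshot (ToList) leaves the original free to change.
        foreach (var item in items.ToList())
            if (item.Length == 0)
                items.Remove(item);

        Console.WriteLine(string.Join(",", items)); // prints a,b
    }
}
```

The first loop fails on the step after the first removal; the second loop succeeds because it walks the copy, which is the same idea as calling ToList() on the result of SelectNodes.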
What I have done is:
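List<string> xpaths = new List<string>();
foreach (HtmlNode node in doc.DocumentNode.DescendantNodes())
{
    if (node.Name.ToLower() == "img")
    {
        // GetAttributeValue returns the default when src is missing,
        // avoiding a NullReferenceException on node.Attributes["src"].
        string src = node.GetAttributeValue("src", string.Empty);
        if (string.IsNullOrEmpty(src))
        {
            xpaths.Add(node.XPath);
        }
    }
}
// Remove in reverse document order: positional XPaths such as img[2]
// shift once an earlier sibling has been removed.
xpaths.Reverse();
foreach (string xpath in xpaths)
{
    doc.DocumentNode.SelectSingleNode(xpath)?.Remove();
}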