Example:
<HTML>
<HEAD><TITLE> HTML Agility Bug Demo</TITLE></HEAD>
<BODY>
<somestuff>stuff here</somestuff>
<table>
<tr><td>first row</td></tr>
<tr><td>second row</td></tr>
<tr><td>third row</td></tr>
</table>
</BODY>
</HTML>
HtmlAgilityPack.HtmlDocument doc = new HtmlDocument();
doc.Load(@"HtmlAgilityBugDemo.html");
HtmlNodeCollection rowNodes = doc.DocumentNode.SelectNodes("//table/tr");
foreach (HtmlNode row in rowNodes)
{
    string test1 = row.InnerText; // Works: returns the text of the current row
    string test2 = row.SelectSingleNode("//td").InnerText; // ALWAYS returns "first row", whichever row is current
    string test3 = row.SelectSingleNode("//somestuff").InnerText; // Finds <somestuff>, although it is not inside this row node
}
Comments: ** Comment from web user: HarryCallahan **
Yes, very annoying, as it's a common thing to do.
I've got around it, or rather contended with it, by loading the child's InnerHtml into a new doc and querying that. A heavyweight solution.
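A lighter-weight alternative may be to make the XPath relative to the row. In standard XPath, an expression that starts with `//` is evaluated from the document root regardless of the node it is called on, while `.//` starts from the current node. The sketch below assumes that this is what causes the behaviour described above. The class name `RelativeXPathDemo` and the method `FirstCellPerRow` are hypothetical names used only for illustration, and the markup is a shortened version of the example page.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

// Hypothetical helper class, for illustration only.
public static class RelativeXPathDemo
{
    // Returns the text of the first <td> found inside each table row.
    public static List<string> FirstCellPerRow(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var cells = new List<string>();
        HtmlNodeCollection rowNodes = doc.DocumentNode.SelectNodes("//table/tr");
        if (rowNodes == null)
        {
            return cells; // SelectNodes returns null when nothing matches
        }

        foreach (HtmlNode row in rowNodes)
        {
            // ".//td" searches only the descendants of the current row;
            // "//td" would search from the document root instead.
            HtmlNode cell = row.SelectSingleNode(".//td");
            if (cell != null)
            {
                cells.Add(cell.InnerText);
            }
        }
        return cells;
    }

    public static void Main()
    {
        string html =
            "<html><body><somestuff>stuff here</somestuff><table>" +
            "<tr><td>first row</td></tr>" +
            "<tr><td>second row</td></tr>" +
            "<tr><td>third row</td></tr>" +
            "</table></body></html>";

        foreach (string text in FirstCellPerRow(html))
        {
            Console.WriteLine(text);
        }
    }
}
```

With the relative expression, `row.SelectSingleNode(".//somestuff")` should also return null for every row, because `<somestuff>` is not a descendant of any `<tr>`, so the result needs a null check before `InnerText` is read.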