Use GetElementsByClassName in a script
If you figure out how to get GetElementsByClassName to work, I'd like to know. I just ran into this yesterday and ran out of time so I came up with a workaround:
$geturl.ParsedHtml.body.getElementsByTagName('div') |
Where {$_.getAttributeNode('class').Value -eq 'newstitle'}
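If you need this in more than one place, the workaround can be wrapped in a small helper. This is only a sketch: Get-ElementByClass and its parameter names are made up here, and it assumes $Root is a DOM node such as $geturl.ParsedHtml.body. It reads className instead of the attribute node and splits it on whitespace, so an element carrying several classes still matches.

```powershell
# Hypothetical helper (not part of PowerShell or the DOM API).
function Get-ElementByClass {
    param(
        [Parameter(Mandatory)] $Root,                # e.g. $geturl.ParsedHtml.body
        [Parameter(Mandatory)] [string] $ClassName,  # class to look for
        [string] $TagName = 'div'                    # tag to scan, as in the workaround
    )
    $Root.getElementsByTagName($TagName) | Where-Object {
        # className may hold several space-separated classes
        ("$($_.className)" -split '\s+') -contains $ClassName
    }
}
```

Usage would look like: Get-ElementByClass -Root $geturl.ParsedHtml.body -ClassName 'newstitle'. Note that -contains compares case-insensitively, like the -eq in the original workaround.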
Cannot, for the life of me, get that method to work either!
Depending on what you need back in the result, though, this might help:
function check-krpano {
    $geturl = Invoke-WebRequest http://krpano.com/news
    $news = ($geturl.Links | where href -match '\#news\d+')[0]
    $news
}
check-krpano
Gives me back:
innerHTML : krpano 1.16.5 released
innerText : krpano 1.16.5 released
outerHTML : <A href="#news1165">krpano 1.16.5 released</A>
outerText : krpano 1.16.5 released
tagName : A
href : #news1165
You can use those properties directly of course, so if you only wanted to know the most recently released version of krpano, this would do it:
function check-krpano {
    $geturl = Invoke-WebRequest http://krpano.com/news
    $news = ($geturl.Links | where href -match '\#news\d+')[0]
    $krpano_version = $news.outerText.Split(" ")[1]
    Write-Host $krpano_version
}
check-krpano
would print 1.16.5 at the time of writing.
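If the link text ever changes shape, splitting on a space may pick the wrong word. A slightly more defensive sketch is to pull the version out with a regex and cast it to [version]. The sample string below is copied from the outerText output shown earlier; in the function it would be $news.outerText.

```powershell
# Sample text taken from the output above; in real use this would be $news.outerText.
$text = 'krpano 1.16.5 released'

if ($text -match '\d+(?:\.\d+)+') {
    # $Matches[0] holds the matched version string, e.g. "1.16.5"
    $krpano_version = [version]$Matches[0]
    $krpano_version.ToString()
}
```

Casting to [version] also means two releases can be compared with -lt and -gt instead of as plain strings.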
Hope that achieves what you wanted, albeit in a different manner.
EDIT:
This is possibly a little faster than piping through Select-Object:
function check-krpano {
    $geturl = Invoke-WebRequest http://krpano.com/news
    ($geturl.Links | where href -match '\#news\d+' | where class -notmatch 'moreinfo+')[0..4].outerText
}
getElementsByClassName does not return an array directly but instead a proxy to the results via COM. As you have discovered, conversion to an array is not automatic with the [] operator. You can use the array subexpression operator, @(), to force it to an array first so that you can access individual elements:
@($body.getElementsByClassName("foo"))[0].innerText
As an aside, conversion is performed automatically if you use the object pipeline, e.g.:
$body.getElementsByClassName("foo") | Select-Object -First 1
It is also performed automatically with the foreach construct:
foreach ($element in $body.getElementsByClassName("foo"))
{
    $element.innerText
}
I realize this is an old question, but I wanted to add an answer for anyone else who might be trying to achieve the same thing by controlling Internet Explorer through its COM object, like so:
$ie = New-Object -com internetexplorer.application
$ie.navigate($url)
while ($ie.Busy -eq $true) { Start-Sleep -Milliseconds 100; }
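One caveat with a bare Busy loop is that it never ends if the page hangs. A possible variation, sketched here with a made-up function name (Wait-IEReady) and a made-up timeout parameter, also checks ReadyState and gives up after a deadline. It assumes $Browser is the $ie object created above.

```powershell
# Hypothetical helper: wait until the IE COM object reports the page is loaded,
# and throw after a timeout instead of looping forever.
function Wait-IEReady {
    param(
        [Parameter(Mandatory)] $Browser,   # e.g. the $ie COM object
        [int] $TimeoutSeconds = 30         # assumed default, adjust as needed
    )
    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    # ReadyState 4 is READYSTATE_COMPLETE
    while ($Browser.Busy -or $Browser.ReadyState -ne 4) {
        if ((Get-Date) -gt $deadline) {
            throw "Page did not finish loading within $TimeoutSeconds seconds."
        }
        Start-Sleep -Milliseconds 100
    }
}
```

It would be called right after $ie.navigate($url), for example Wait-IEReady -Browser $ie -TimeoutSeconds 20.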
I normally prefer to use Invoke-WebRequest as the original poster did, but I've found cases where it seemed like I needed a full-fledged IE instance in order to see all of the JavaScript-generated DOM elements even though I would expect parsedhtml.body to include them.
I found that I could do something like this to get a collection of elements by a class name:
$titles = $ie.Document.body.getElementsByClassName('newstitle')
foreach ($storyTitle in $titles) {
    Write-Output $storyTitle.innerText
}
I observed the same very slow performance the original poster noted when using PowerShell to search the DOM. With PowerShell 3.0 and IE11, however, Measure-Command shows that the collection of elements with that class is found in a 125 KB HTML document in 280 ms.
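For anyone who wants to take the same kind of measurement, the general shape is below. The script block here is only a placeholder so the snippet runs anywhere; in practice you would put the DOM lookup inside it, for instance @($ie.Document.body.getElementsByClassName('newstitle')), and your timings will differ from mine.

```powershell
# Placeholder workload; swap in the DOM lookup you want to time.
$lookup = { 1..1000 | Where-Object { $_ % 2 -eq 0 } }

# Measure-Command runs the script block and returns a TimeSpan
$elapsed = Measure-Command -Expression $lookup
'{0:N0} ms' -f $elapsed.TotalMilliseconds
```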