Use GetElementsByClassName in a script

If you figure out how to get GetElementsByClassName to work, I'd like to know. I just ran into this yesterday and ran out of time so I came up with a workaround:

$geturl.ParsedHtml.body.getElementsByTagName('div') | 
    Where {$_.getAttributeNode('class').Value -eq 'newstitle'}
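
One caveat with the -eq comparison above: it only matches when the class attribute is exactly 'newstitle', so an element with several classes (for example class="newstitle featured") would be skipped. A minimal sketch of a stricter check follows. Test-HasClass is a hypothetical helper name, not something from the original workaround, and the sample strings are made up for illustration.

```powershell
# Hypothetical helper (not part of the original answer): returns $true when the
# whitespace-separated class list contains the requested class name.
function Test-HasClass {
    param(
        [string]$ClassAttribute,   # raw class attribute value, e.g. 'newstitle featured'
        [string]$ClassName         # the single class to look for
    )
    # -split '\s+' breaks the attribute into individual class names;
    # -ccontains compares them case-sensitively.
    ($ClassAttribute -split '\s+') -ccontains $ClassName
}

Test-HasClass -ClassAttribute 'newstitle featured' -ClassName 'newstitle'   # True
Test-HasClass -ClassAttribute 'newstitles' -ClassName 'newstitle'           # False
```

In the pipeline above, the filter would then presumably become something like Where { Test-HasClass $_.className 'newstitle' }, assuming the element exposes its class list through className.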

Cannot, for the life of me, get that method to work either!

Depending on what you need back in the result, though, this might help:

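function check-krpano {
    # Download the news page; the response exposes its anchors via .Links
    $geturl = Invoke-WebRequest http://krpano.com/news

    # Keep the first link whose href looks like #news<digits>
    $news = ($geturl.Links | where href -match '\#news\d+')[0]

    $news
}

check-krpano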

Gives me back:

innerHTML : krpano 1.16.5 released
innerText : krpano 1.16.5 released
outerHTML : <A href="#news1165">krpano 1.16.5 released</A>
outerText : krpano 1.16.5 released
tagName   : A
href      : #news1165

You can use those properties directly of course, so if you only wanted to know the most recently released version of krpano, this would do it:

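function check-krpano {
    $geturl = Invoke-WebRequest http://krpano.com/news

    # First link whose href looks like #news<digits>
    $news = ($geturl.Links | where href -match '\#news\d+')[0]

    # outerText is e.g. "krpano 1.16.5 released"; the second word is the version
    $krpano_version = $news.outerText.Split(" ")[1]

    Write-Host $krpano_version
}

check-krpano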

This prints 1.16.5 at the time of writing.
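
Split(" ")[1] relies on the version always being the second word of the link text. If the wording ever changes, a regex match may hold up a little better. This is only a sketch: the sample string is copied from the output shown earlier, and $krpanoVersion is an illustrative variable name.

```powershell
# Sample text taken from the output above; in the real function this would be
# $news.outerText.
$outerText = 'krpano 1.16.5 released'

$krpanoVersion = $null
# Match the first dotted number sequence (e.g. 1.16.5) wherever it appears.
if ($outerText -match '\d+(?:\.\d+)+') {
    $krpanoVersion = $Matches[0]
}

$krpanoVersion   # 1.16.5
```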

Hope that achieves what you wanted, albeit in a different manner.

EDIT:

This is possibly a little faster than piping through Select-Object:

function check-krpano {
    $geturl = Invoke-WebRequest http://krpano.com/news

    # First five news links, skipping the "moreinfo" ones; return only their text
    ($geturl.Links | where href -match '\#news\d+' | where class -notmatch 'moreinfo+')[0..4].outerText
}

getElementsByClassName does not return an array directly; instead it returns a proxy to the results via COM. As you have discovered, the [] operator does not convert it to an array automatically. You can use the array subexpression operator, @(), to force it into an array first so that you can access individual elements:

@($body.getElementsByClassName("foo"))[0].innerText

As an aside, conversion is performed automatically if you use the object pipeline, e.g.:

$body.getElementsByClassName("foo") | Select-Object -First 1

It is also performed automatically with the foreach construct:

foreach ($element in $body.getElementsByClassName("foo"))
{
    $element.innerText
}

I realize this is an old question, but I wanted to add an answer for anyone else who might be trying to achieve the same thing by controlling Internet Explorer through its COM object, like this:

$ie = New-Object -com internetexplorer.application
$ie.navigate($url)
while ($ie.Busy -eq $true) { Start-Sleep -Milliseconds 100; }

I normally prefer to use Invoke-WebRequest as the original poster did, but I've found cases where it seemed like I needed a full-fledged IE instance in order to see all of the JavaScript-generated DOM elements even though I would expect parsedhtml.body to include them.

I found that I could do something like this to get a collection of elements by a class name:

$titles = $ie.Document.body.getElementsByClassName('newstitle')
foreach ($storyTitle in $titles) {
     Write-Output $storyTitle.innerText
}

I observed the same really slow performance the original poster noted when using PowerShell to search the DOM. With PowerShell 3.0 and IE11, however, Measure-Command shows that the collection of matching elements is found in a 125 KB HTML document in 280 ms.
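
For anyone wanting to take a similar timing, a rough sketch with Measure-Command follows. The script block here is a stand-in workload so that the snippet runs anywhere; against a live page, the getElementsByClassName call would go inside the braces instead.

```powershell
# Measure-Command runs the script block and returns a TimeSpan.
$elapsed = Measure-Command {
    # Stand-in workload (an assumption for illustration). With an IE instance
    # in $ie, this would be something like:
    #   @($ie.Document.body.getElementsByClassName('newstitle'))
    1..1000 | Where-Object { $_ % 2 -eq 0 } | Out-Null
}

'{0:N0} ms' -f $elapsed.TotalMilliseconds
```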
