How To Read HTML code as XML and get the output like the sample in sql?
I am trying to establish communication between nodes
h4
andul
.
You can use the <<
and >>
operator to check if a node is before or after another node in document order. Combine that with a predicate on position, [1]
, to get the first occurrence also in document order.
select H4.X.value('(span/text())[1]', 'varchar(10)') as Section,
UL.X.query('.') as UL
from @X.nodes('/div/section/h4') as H4(X)
cross apply H4.X.nodes('(let $h4 := . (: Save current h4 node :)
return /div/section/ul[$h4 << .])[1]') as UL(X);
rextester:
<<
and >>
are called Node Order Comparison Operators
If you have an XML fragment like this:
<N1>1</N1>
<N2>2</N2>
<N3>3</N3>
<N4>4</N4>
<N5>5</N5>
you can get all nodes before the first occurrence of N3
with this query:
select @X.query('/*[. << /N3[1]]');
Result:
<N1>1</N1>
<N2>2</N2>
/*
will give you all root nodes. What is enclosed in []
is a predicate. .
is the current node and /N3[1]
is the first N3 node in document order at the root level. So from each root node you get the nodes that precede N3
.
Here is almost the same query, only you get the nodes that follow the first N3
node:
select @X.query('/*[. >> /N3[1]]');
<N4>4</N4>
<N5>5</N5>
To only get the first node after the first N3
node, you add the predicate [1]
:
select @X.query('/*[. >> /N3[1]][1]');
<N4>4</N4>
This is not exactly elegant but seems to do the job.
DECLARE @X XML = REPLACE(REPLACE(@S, '<h4>', '<foo><h4>'), '</ul>', '</ul></foo>')
SELECT Category = x.value('../../h4[1]/span[1]', 'varchar(10)'),
Selection = x.value('descendant-or-self::text()[1]', 'varchar(10)'),
Value = REPLACE(
REPLACE(
REPLACE(
LTRIM(
RTRIM(
REPLACE(
REPLACE(
CAST(x.x.query('fn:data(descendant-or-self::text()[fn:position() > 1])') AS VARCHAR(MAX))
, char(10), '')
, char(13), '')
)
)
, ' ', ' |')
, '| ', '')
, '|', '')
FROM @X.nodes('div/section/foo/ul/li') x(x)
ORDER BY Category,
Selection
Which returns
+----------+-----------+-------+
| Category | Selection | Value |
+----------+-----------+-------+
| A | Ab | AD AC |
| A | Ag | AL |
| B | Bb | BD BC |
| B | Bg | BL |
+----------+-----------+-------+
I'm assuming this is what you want as the desired results table in the question does not return the "rest of the values as a concatenated string"