Get content between a pair of HTML tags using Bash
Using sed in shell/bash, so you needn't install something else.
tag=body
sed -n "/<$tag>/,/<\/$tag>/p" file
plain text processing is not good for html/xml parsing. I hope this could give you some idea:
kent$ xmllint --xpath "//body" f.html
<body>
text
<div>
text2
<div>
text3
</div>
</div>
</body>