What is parsing?

Parsing usually applies to text - the act of reading text and converting it into a more useful in-memory format, "understanding" what it means to some extent. So for example, an XML parser will take the sequence of characters (or bytes) and convert them into elements, attributes etc.

In some cases (particularly compilers) there's a separation between lexical analysis and syntactic analysis, so the real "understanding" part of the parser works on a sequence of tokens (identifiers, operators etc) rather than on the raw characters.


You can start here: http://en.wikipedia.org/wiki/Parsing. Short excerpt:

Parsing or syntactic analysis is the process of analysing a string of symbols, either in natural language or in computer languages, conforming to the rules of a formal grammar. The term parsing comes from Latin pars (orationis), meaning part (of speech).


Parsing is taking a set of data and extracting the meaningful information from it. With HTML parsing, you're looking to read some html and return a structured set of tags and text