Is there a good Python library that can parse C++?
C++ is notoriously hard to parse. Most people who try to do this properly end up taking apart a compiler. In fact, this is (in part) why Clang, the C/C++ front end of the LLVM project, was started: Apple needed a way to parse C++ for use in Xcode that matched the way the compiler parsed it.
That's why there are projects like GCC-XML, which dumps the compiler's view of your declarations as XML; you could combine it with a Python XML library.
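As a rough sketch of that combination, the snippet below reads GCC-XML-style output with Python's standard xml.etree.ElementTree module and lists the free functions it finds. The sample XML is hand-written for illustration and only imitates the general shape of real output (declarations as flat elements that refer to each other by id); the names SAMPLE_XML and list_functions are made up for this example and are not part of any tool.

```python
import xml.etree.ElementTree as ET

# Hand-written sample imitating the general shape of GCC-XML output.
# This is an assumption for illustration; real output carries many more
# elements and attributes (file, line, mangled name, and so on).
SAMPLE_XML = """<?xml version="1.0"?>
<GCC_XML>
  <Namespace id="_1" name="::" members="_2 _3 _4"/>
  <Class id="_2" name="Widget" context="_1"/>
  <Function id="_3" name="area" returns="_5" context="_1">
    <Argument name="width" type="_5"/>
    <Argument name="height" type="_5"/>
  </Function>
  <Function id="_4" name="main" returns="_6" context="_1"/>
  <FundamentalType id="_5" name="double"/>
  <FundamentalType id="_6" name="int"/>
</GCC_XML>
"""


def list_functions(xml_text):
    """Return (return type, name, [argument types]) for each Function element."""
    root = ET.fromstring(xml_text)
    # Declarations are flat children of the root and reference each other by id.
    by_id = {el.get("id"): el for el in root if el.get("id")}

    def type_name(type_id):
        el = by_id.get(type_id)
        # Types without a plain name (pointers, cv-qualified types, ...) are
        # not resolved in this sketch and show up as "?".
        return el.get("name", "?") if el is not None else "?"

    functions = []
    for fn in root.findall("Function"):
        arg_types = [type_name(arg.get("type")) for arg in fn.findall("Argument")]
        functions.append((type_name(fn.get("returns")), fn.get("name"), arg_types))
    return functions


if __name__ == "__main__":
    for ret, name, args in list_functions(SAMPLE_XML):
        print(f"{ret} {name}({', '.join(args)})")
```

The point of this design is that all the hard parsing is done by the compiler front end; the Python side only walks a flat list of already-resolved declarations. A real script would also need to follow references through pointer, reference and qualified types rather than printing "?".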
Some non-compiler projects that seem to do a pretty good job of parsing C++ are:
- Eclipse CDT
- OpenGrok
- Doxygen
Not an answer as such, but a demonstration of just how hard it is to parse C++ correctly. My favorite demo:
template<bool> struct a_t;
template<> struct a_t<true> {
    template<int> struct b {};
};
template<> struct a_t<false> {
    enum { b };
};
typedef a_t<sizeof(void*)==sizeof(int)> a;
enum { c, d };
int main() {
    a::b<c>d; // declaration or expression?
}
This is perfectly valid, standard-compliant C++, but the exact meaning of the commented line depends on your implementation. If sizeof(void*)==sizeof(int) (typical on 32-bit platforms), it is a declaration of a local variable d of type a::b<c>. If the condition doesn't hold, it is a no-op expression, ((a::b < c) > d). Adding a constructor for a::b will actually let you expose the difference via the presence or absence of side effects.