Is there a good Python library that can parse C++?

C++ is notoriously hard to parse. Most people who try to do it properly end up taking apart a compiler. In fact, this is (in part) why Clang, the C/C++ front end of the LLVM project, was started: Apple needed a way to parse C++ for use in Xcode that matched the way the compiler parsed it.

That's why there are projects like GCC-XML, which you could combine with a Python XML library.
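As a rough sketch of that route: the tool writes out an XML description of the declarations it parsed, and Python's standard xml.etree.ElementTree can then walk it. The XML below is a small hand-written sample that only imitates the general shape of such output (elements like Function and FundamentalType that refer to each other by id). It is an assumption for illustration, not captured from a real run, so check the element and attribute names against what your version of the tool actually emits. The helper name list_functions is made up here.

```python
import xml.etree.ElementTree as ET

# Hand-written sample imitating the shape of GCC-XML output (an assumption,
# not captured from a real run): declarations refer to each other by id.
SAMPLE = """\
<GCC_XML>
  <Namespace id="_1" name="::"/>
  <Function id="_2" name="add" returns="_4" context="_1">
    <Argument name="x" type="_4"/>
    <Argument name="y" type="_4"/>
  </Function>
  <Function id="_3" name="pi" returns="_5" context="_1"/>
  <FundamentalType id="_4" name="int"/>
  <FundamentalType id="_5" name="double"/>
</GCC_XML>
"""


def list_functions(xml_text):
    """Return a readable signature for every Function element (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    # Map each id to its element so that type references can be resolved.
    by_id = {el.get("id"): el for el in root if el.get("id")}

    def type_name(type_id):
        el = by_id.get(type_id)
        return el.get("name", "?") if el is not None else "?"

    signatures = []
    for fn in root.findall("Function"):
        args = ", ".join(
            f"{type_name(arg.get('type'))} {arg.get('name', '')}".strip()
            for arg in fn.findall("Argument")
        )
        signatures.append(f"{type_name(fn.get('returns'))} {fn.get('name')}({args})")
    return signatures


if __name__ == "__main__":
    for sig in list_functions(SAMPLE):
        print(sig)
```

The point of this approach is that all the hard C++ parsing stays inside the compiler-based tool; the Python side only has to follow id references. Real output would also contain pointer, reference, and class types, which this sketch does not try to resolve.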

Some non-compiler projects that seem to do a pretty good job at parsing C++ are:

  • Eclipse CDT
  • OpenGrok
  • Doxygen

This is not an answer as such, just a demonstration of how hard it actually is to parse C++ correctly. My favorite demo:

template<bool> struct a_t;

template<> struct a_t<true> {
    template<int> struct b {};
};

template<> struct a_t<false> {
    enum { b };
};

typedef a_t<sizeof(void*)==sizeof(int)> a;

enum { c, d };
int main() {
    a::b<c>d; // declaration or expression?
}

This is perfectly valid, standard-compliant C++, but the exact meaning of the commented line depends on your implementation. If sizeof(void*)==sizeof(int) (typical on 32-bit platforms), it is a declaration of a local variable d of type a::b<c>. If the condition doesn't hold, it is a no-op expression, (a::b < c) > d. Adding a constructor for a::b will actually let you expose the difference via the presence or absence of side effects.

Tags: Python, C++