Audio/video file formats that support embedded markers and comments/annotations?
Have you read up on Apple's Core Audio Format (CAF)?
Basically, it functions as a wrapper for many audio formats and lets you embed all sorts of data, including user-defined data. It may be an option, since interchange is not an issue in your case.
The CAF File Specification
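To give a feel for how that wrapper is laid out, here is a rough Python sketch that walks the chunks of a CAF file. As I understand the spec, the file starts with an 8-byte header (`caff`, a version, and flags), followed by chunks that each carry a 4-character type and a signed 64-bit big-endian size. Markers live in a `mark` chunk, text annotations in an `info` chunk, and user-defined data in a `uuid` chunk. The function name `list_caf_chunks` and the synthetic demo bytes are my own for illustration, so check the details against the spec before relying on it.

```python
import io
import struct


def list_caf_chunks(stream):
    """Return (chunk_type, size) pairs for each chunk in a CAF stream."""
    # File header: 4-byte type 'caff', UInt16 version, UInt16 flags.
    file_type, version, flags = struct.unpack(">4sHH", stream.read(8))
    if file_type != b"caff":
        raise ValueError("not a CAF file")

    chunks = []
    while True:
        # Chunk header: 4-byte type followed by a signed 64-bit size.
        header = stream.read(12)
        if len(header) < 12:
            break
        chunk_type, size = struct.unpack(">4sq", header)
        chunks.append((chunk_type.decode("ascii"), size))
        if size < 0:
            # An audio data chunk may declare size -1, meaning it
            # runs to the end of the file.
            break
        stream.seek(size, io.SEEK_CUR)  # skip the chunk body
    return chunks


# Synthetic example (not a playable file): header plus three chunks.
demo = (
    b"caff" + struct.pack(">HH", 1, 0)
    + b"desc" + struct.pack(">q", 32) + bytes(32)
    + b"info" + struct.pack(">q", 4) + bytes(4)
    + b"data" + struct.pack(">q", -1) + bytes(16)
)
print(list_caf_chunks(io.BytesIO(demo)))
```

For a real file you would pass `open(path, "rb")` instead of the `BytesIO` object, then look for `mark`, `info`, or `uuid` in the returned list to see which kinds of embedded data the file carries.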