The "Gold standard" in BibTeX databases
The Collection of Science Bibliographies provides a huge number of test cases. The files hosted on the site itself may have been generated by a pretty-printer and so be quite uniform in their syntactic features. However, there are also links to the original files, which should be more diverse.
For testing parsers in general, there is typically no single gold standard. Compiler writers rely on individual test cases and test case generators. A test case generator is driven by the grammar of a language and can help to find corner cases that rarely occur in practice. Developing one requires substantial effort, though. If you are curious, search for "fuzz testing". As an example for the C language, Finding and Understanding Bugs in C Compilers is a fascinating read on this subject.
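To make the idea of a grammar-driven generator a bit more concrete, here is a minimal sketch in Python. The toy grammar, the sample keys and field names, and the `generate` function are all made up for illustration; this is not an official BibTeX grammar, and a serious fuzzer would need far more care.

```python
import random

# Hypothetical toy grammar for a BibTeX-like record (an assumption for
# illustration, not an official grammar). Keys are nonterminals; any
# string that is not a key is emitted literally.
GRAMMAR = {
    "entry": [["@", "type", "{", "key", ",", "fields", "}"]],
    "type": [["article"], ["Book"], ["MISC"]],
    "key": [["knuth84"], ["a:b-c"], ["x"]],
    "fields": [["field"], ["field", ",", "fields"], ["field", ","]],
    "field": [["name", " = ", "value"]],
    "name": [["title"], ["AUTHOR"], ["year"]],
    "value": [["number"], ['"', "text", '"'], ["{", "text", "}"]],
    "number": [["1984"], ["0"]],
    "text": [["word"], ["word", " ", "text"], ["{", "text", "}"]],
    "word": [["TeX"], ["and"], [""]],
}


def generate(symbol, rng, depth=0, max_depth=8):
    """Expand `symbol` into a random string derived from GRAMMAR."""
    if symbol not in GRAMMAR:
        return symbol  # terminal: emit as-is
    alternatives = GRAMMAR[symbol]
    if depth >= max_depth:
        # Past the depth limit, always take the shortest production
        # so that recursive rules are forced to terminate.
        alternatives = [min(alternatives, key=len)]
    production = rng.choice(alternatives)
    return "".join(generate(s, rng, depth + 1, max_depth) for s in production)


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so failures can be reproduced
    for _ in range(3):
        print(generate("entry", rng))
```

Feeding the generated strings to your parser and checking that it neither crashes nor rejects them is the simplest use. Seeding the random generator keeps any failure reproducible.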
If you have biblatex installed, then you have a nice example .bib file at $TEXMF\bibtex\bib\biblatex\biblatex-examples.bib. It has a lot of comments describing the possible pitfalls of each BibTeX entry.
Alternatively you can download the file here: http://mirror.ctan.org/macros/latex/contrib/biblatex/bibtex/bib/biblatex/biblatex-examples.bib
I was looking for a BNF grammar of the BibTeX format, but I couldn't find one. The closest I found was this one (from a Tcl/Tk wiki):
# A rough grammar (case-insensitive):
#
# Database  ::= (Junk '@' Entry)*
# Junk      ::= .*?
# Entry     ::= Record
#            |  Comment
#            |  String
#            |  Preamble
# Comment   ::= "comment" [^\n]* \n          -- ignored
# String    ::= "string" '{' Field* '}'
# Preamble  ::= "preamble" '{' .* '}'        -- (balanced)
# Record    ::= Type '{' Key ',' Field* '}'
#            |  Type '(' Key ',' Field* ')'  -- not handled
# Type      ::= Name
# Key       ::= Name
# Field     ::= Name '=' Value
# Name      ::= [^\s\"#%'(){}]*
# Value     ::= [0-9]+
#            |  '"' ([^'"']|\\'"')* '"'
#            |  '{' .* '}'                   -- (balanced)
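The "(balanced)" notes are the part a plain regular expression cannot express, so here is a rough Python sketch of how the top-level rule Database ::= (Junk '@' Entry)* could be handled with an explicit brace counter. The helper names `read_balanced` and `split_entries` are my own, not from the wiki. As assumptions, the sketch only handles the '{' form of an entry, counts every brace (including ones preceded by a backslash), and returns @comment{...} like any other entry.

```python
import re

# '@', optional whitespace, a Name as in the grammar above, then '{'.
ENTRY_HEAD = re.compile(r"@\s*([^\s\"#%'(){}]+)\s*\{")


def read_balanced(text, start):
    """Return (body, end) for the brace group that opens at text[start].

    `body` excludes the outer braces; `end` is the index just past the
    closing brace. Every '{' and '}' is counted, escaped or not.
    """
    if text[start] != "{":
        raise ValueError("expected '{' at position %d" % start)
    depth = 0
    for i in range(start, len(text)):
        ch = text[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return text[start + 1:i], i + 1
    raise ValueError("unbalanced braces starting at position %d" % start)


def split_entries(database):
    """Split a .bib string into (lowercased type, raw body) pairs.

    Text between entries is skipped as Junk. Parenthesised entries are
    not recognised.
    """
    entries = []
    pos = 0
    while True:
        match = ENTRY_HEAD.search(database, pos)
        if match is None:
            return entries
        # match.end() - 1 is the index of the opening brace.
        body, pos = read_balanced(database, match.end() - 1)
        entries.append((match.group(1).lower(), body))
```

For example, `split_entries('@Article{k, title = {The {TeX}book}}')` yields a single pair whose type is `article` and whose body still contains the nested `{TeX}` group. Splitting the body into Key and Field items would be the next step.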
If your parser correctly handles all the .bib files you have tested, I think you have already done a great job!