I recommend you have a look at ANTLR (www.antlr.org), a parser generator that can generate C++ or Java parsers. It could be described as a "21st-century version of lex/yacc (aka flex/bison)".
Last-minute news: the upcoming version (v3), still in development, may include PHP among the target languages for generated parsers! (There are other targets too, but I have the feeling this one in particular may interest you... No?)
--23 May 2006