2001-08-27 - Release on Sourceforge.  Version 0.1.1. Bugfixes.

2001-06-29 - First release on Sourceforge.  Version 0.1.0

What is ebnf2yacc?

ebnf2yacc is a tool to help write yacc parsers/compilers. It takes as input a grammar written in bnf (ebnf support is planned) and outputs a c++ abstract syntax tree that supports the visitor pattern, along with a yacc parser that builds the tree.




If you wish to communicate with someone about ebnf2yacc, please do so on the mailing list, which is hosted on sourceforge and open to subscription.

You may also submit bugs or patches via the sourceforge group page.


Here is the README included with ebnf2yacc:

This is the first release of ebnf2yacc.  I started this as a personal experiment and then ended up using it as a tool at work (for OpenWBEM).  I finally got it into a usable state and decided to open source it.

The purpose of ebnf2yacc is to ease the creation of yacc parsers.  Yacc input files must be in bnf, but it is much easier to write a grammar in ebnf.  This program currently takes a bnf file as input; in the future it will take an input file in ebnf and convert it to a usable yacc file.

Caveat: Right now, it will only accept bnf input, essentially the same grammar that you would feed to yacc.  The main usefulness of ebnf2yacc right now is to create a c++ abstract syntax tree.  For a concrete example, see the WQL parser of OpenWBEM.  Support for most ebnf features is planned for the future.

ebnf2yacc generates a set of classes that represent the ast of the grammar.  These ast classes support the visitor pattern.  An abstract visitor base class is generated as well as a sample concrete visitor that simply traverses the tree.  ebnf2yacc also generates a yacc file that can be used (with slight modification if you need precedence or other yacc features) to build the ast.  To build a parser, you will still need to provide the appropriate framework.
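As a rough illustration of the kind of output described above, here is a minimal hand-written sketch of ast classes that support the visitor pattern, together with an abstract visitor and a traversal visitor.  All class and member names (Node, Visitor, Assignment, Number, TraversalVisitor, visited) are hypothetical and chosen for this example only; they are not the names ebnf2yacc generates, and the node counter is added just to make the traversal observable.

```cpp
#include <memory>
#include <string>
#include <utility>

class Assignment;
class Number;

// Abstract visitor base class: one visit() overload per ast class.
class Visitor {
public:
    virtual ~Visitor() {}
    virtual void visit(const Assignment& node) = 0;
    virtual void visit(const Number& node) = 0;
};

// Common base of all ast classes (hypothetical name).
class Node {
public:
    virtual ~Node() {}
    virtual void accept(Visitor& v) const = 0;
};

// Leaf node holding the text of a token.
class Number : public Node {
public:
    explicit Number(const std::string& t) : text(t) {}
    void accept(Visitor& v) const override { v.visit(*this); }
    std::string text;
};

// Interior node that owns one child, which may be null.
class Assignment : public Node {
public:
    Assignment(const std::string& n, std::unique_ptr<Node> v)
        : name(n), value(std::move(v)) {}
    void accept(Visitor& v) const override { v.visit(*this); }
    std::string name;
    std::unique_ptr<Node> value;
};

// Concrete visitor that simply walks the tree and counts the nodes it sees.
class TraversalVisitor : public Visitor {
public:
    TraversalVisitor() : visited(0) {}
    void visit(const Assignment& node) override {
        ++visited;
        if (node.value) {            // null check before descending
            node.value->accept(*this);
        }
    }
    void visit(const Number&) override { ++visited; }
    int visited;
};
```

A real visitor would derive from the abstract base in the same way and replace the counting with whatever work it needs to do at each node, such as code generation or evaluation.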

In order to implement certain features, ebnf2yacc makes use of certain characteristics of the names of grammar rules. Any symbol whose name is ALL CAPS is assumed to be a terminal, that is, a token that comes from the lexer. If a rule begins with "str" (e.g. strToken) or is ALL CAPS, it is stored as a string in the ast.  No ast class is generated for rules that begin with str. You should only use this for rules that are simple alternatives of a bunch of tokens. e.g.:

    | MINUS
    | TIMES
    | DIVIDE

If a rule begins with "opt", then code will be generated to check the ast for null in the sample traversal visitor. e.g.:

    optSemicolon: /* EMPTY */

If a rule ends with "List", then the ast will contain a list of the first non-terminal of the first alternative of the rule. e.g.:

    | varList COMMA var
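To make the list convention more concrete, here is a small sketch of how a list node could be built up by a left-recursive rule.  It assumes that the first alternative of the rule is a single var, which the fragment above does not show.  The names Var, VarList, startVarList and appendVar are hypothetical helpers for this example and are not part of what ebnf2yacc generates.

```cpp
#include <list>
#include <string>

// Hypothetical ast element for the non-terminal "var".
struct Var {
    std::string name;
};

// Hypothetical ast node for a rule named "varList": it holds a list of
// the first non-terminal of the first alternative, here Var.
struct VarList {
    std::list<Var> vars;
};

// What an action for the (assumed) first alternative "var" could do:
// start a new list containing one element.
inline VarList startVarList(const Var& first) {
    VarList result;
    result.vars.push_back(first);
    return result;
}

// What an action for "varList COMMA var" could do: append the new
// element to the list built so far.
inline VarList& appendVar(VarList& list, const Var& next) {
    list.vars.push_back(next);
    return list;
}
```

Because the rule is left recursive, elements are appended in the order they appear in the input, so the list preserves source order.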

There is no checking done to enforce these rules, so the "garbage in, garbage out" rule applies here.
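The naming conventions above can be summarized in a few lines of code.  The following is only a sketch of the rules as stated in this README, not the actual logic inside ebnf2yacc; SymbolTraits, classify and the helper functions are names invented for this example.

```cpp
#include <cctype>
#include <string>

// Hypothetical summary of what a symbol name implies.
struct SymbolTraits {
    bool isTerminal;      // ALL CAPS: a token that comes from the lexer
    bool storedAsString;  // ALL CAPS or "str" prefix: kept as a string in the ast
    bool isOptional;      // "opt" prefix: traversal visitor checks for null
    bool isList;          // "List" suffix: ast holds a list of elements
};

// True if the name has at least one letter and no lowercase letters.
inline bool isAllCaps(const std::string& name) {
    bool sawLetter = false;
    for (unsigned char c : name) {
        if (std::islower(c)) return false;
        if (std::isupper(c)) sawLetter = true;
    }
    return sawLetter;
}

inline bool startsWith(const std::string& s, const std::string& prefix) {
    return s.compare(0, prefix.size(), prefix) == 0;
}

inline bool endsWith(const std::string& s, const std::string& suffix) {
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Apply the naming conventions to a single symbol name.
inline SymbolTraits classify(const std::string& name) {
    SymbolTraits t;
    t.isTerminal = isAllCaps(name);
    t.storedAsString = t.isTerminal || startsWith(name, "str");
    t.isOptional = startsWith(name, "opt");
    t.isList = endsWith(name, "List");
    return t;
}
```

Under these assumptions, classify("MINUS") reports a terminal stored as a string, classify("optSemicolon") reports an optional rule, and classify("varList") reports a list rule.  As with the tool itself, nothing here validates that a rule named this way actually has the expected shape.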

Right now the names of the generated classes are fixed.  I plan to have this be configurable in the future, but have not yet decided on a good mechanism for that.

To build ebnf2yacc you need lex and yacc.  In particular, I used flex and bison. It has not been tested with other lex or yacc implementations.  If someone tries it with a different lex or yacc, I would like to know whether it works.  I have tried to write the code in portable c++, and have compiled it with gcc 2.95.2.

The project uses autoconf/automake, so to build it, you can simply run:

    ./configure
    make

and then to install it:

    make install

The binary is named ebnf2yacc.  The command line arguments are:
    <input grammar> <visitor file> <ast file> <yacc file> <traversal visitor header file> <traversal visitor cpp file>

If you find any bugs or have any suggestions for improvements and features, I am eager to hear them.  Please feel free to make use of the project's sourceforge facilities.

There is also a mailing list for ebnf2yacc, hosted at sourceforge, that you can subscribe to. You may also contact me directly via e-mail: nuffer at users dot sourceforge dot net

--Dan Nuffer
