Parsing with yacc

The yacc environment

You create a yacc parser with the command

   $ yacc grammar.y
where grammar.y is the file containing your yacc specification. (The .y suffix is a convention recognized by other UNIX system commands. It is not strictly necessary.) The output is a file of C language subroutines called y.tab.c. The function produced by yacc is called yyparse(), and is integer-valued. When it is called, it repeatedly calls yylex(), the lexical analyzer supplied by the user (see ``Lexical analysis''), to obtain input tokens. Eventually, one of two things happens: either an error is detected, in which case yyparse() returns the value 1 if no error recovery is possible, or the lexical analyzer returns the end-marker token and the parser accepts, in which case yyparse() returns the value 0.
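The token-fetching contract described above can be illustrated with a small hand-written lexical analyzer. This is only a sketch under stated assumptions: it reads from an in-memory string rather than a file, it treats every non-blank character as its own token, and the names input_text, input_pos, and set_lexer_input() are hypothetical helpers that are not part of yacc. The one convention it relies on is that a return value of 0 serves as the end-marker.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical input buffer; a real lexer would usually read from a file. */
static const char *input_text = "";
static size_t input_pos = 0;

/* Hypothetical helper: point the lexer at a new string. */
void set_lexer_input(const char *text)
{
    input_text = text;
    input_pos = 0;
}

/*
 * Return the next token to the parser. Each non-blank character is
 * returned as its own token; 0 is the end-marker that tells yyparse()
 * the input is finished.
 */
int yylex(void)
{
    /* Skip spaces, tabs, and newlines between tokens. */
    while (isspace((unsigned char)input_text[input_pos]))
        input_pos++;

    if (input_text[input_pos] == '\0')
        return 0;                       /* end-marker */

    return (unsigned char)input_text[input_pos++];
}
```

Once the input is exhausted, every further call keeps returning 0, so the parser never reads past the end of the buffer.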

You must provide a certain amount of environment for this parser in order to obtain a working program. For example, as with every C language program, a routine called main() must be defined that eventually calls yyparse(). In addition, a routine called yyerror() is needed to print a message when a syntax error is detected.

These two routines must be supplied in one form or another by the user. To ease the initial effort of using yacc, a library has been provided with default versions of main() and yyerror(). The library, liby, is accessed by a -ly argument to the cc command. The source codes

   main()
   {
        return (yyparse());
   }

and

   # include <stdio.h>

   yyerror(s)
        char *s;
   {
        (void) fprintf(stderr, "%s\n", s);
   }

show the triviality of these default programs. The argument to yyerror() is a string containing an error message, usually the string syntax error. The average application wants to do better than this. Ordinarily, the program should keep track of the input line number and print it along with the message when a syntax error is detected. The external integer variable yychar contains the lookahead token number at the time the error was detected. This may be of some interest in giving better diagnostics. Since the main() routine is probably supplied by the user (to read arguments, for instance), the yacc library is useful only in small projects or in the earliest stages of larger ones.
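A yyerror() that reports the line number and the lookahead token, as suggested above, might look like the following sketch. Several things here are assumptions made for illustration: line_number is a hypothetical counter that your yylex() would have to increment on each newline, format_syntax_error() is a hypothetical helper, and the ANSI-style signature is a modern choice rather than the library's own. In a real program yychar is defined by the generated parser in y.tab.c; it is defined here only so that the example compiles by itself.

```c
#include <stdio.h>

/* Stand-in definition: the yacc-generated parser normally provides yychar. */
int yychar = -1;

/* Hypothetical line counter; yylex() would increment it on each newline. */
int line_number = 1;

/*
 * Hypothetical helper: build the diagnostic in a caller-supplied buffer.
 * Returns the value of snprintf(), that is, the length of the full message.
 */
int format_syntax_error(char *buf, size_t size, const char *msg)
{
    return snprintf(buf, size, "line %d: %s (lookahead token %d)",
                    line_number, msg, yychar);
}

/* Called by the parser when a syntax error is detected. */
void yyerror(const char *s)
{
    char buf[256];

    format_syntax_error(buf, sizeof buf, s);
    (void) fprintf(stderr, "%s\n", buf);
}
```

Keeping the formatting in a separate function makes the message easy to check without capturing stderr. The token number is printed as a raw integer; mapping it to a token name would require a table that this sketch does not attempt.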

The external integer variable yydebug is normally set to 0. If it is set to a nonzero value, the parser outputs a verbose description of its actions, including the input symbols it reads and the parser actions it takes. It is possible to set this variable by using debug(1).
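Another way to set yydebug is from your own main() before it calls yyparse(). The sketch below assumes a hypothetical -d command-line flag and a hypothetical helper, set_debug_from_args(); neither is defined by yacc. As in any stand-alone example, yydebug is defined here only so the code compiles by itself, whereas a real program would use the variable provided by the generated parser. Note also that on many yacc implementations the tracing code is compiled in only when the parser is generated with the -t option or built with YYDEBUG defined as nonzero, so check your system's documentation.

```c
#include <string.h>

/* Stand-in definition: the yacc-generated parser normally provides yydebug. */
int yydebug = 0;

/*
 * Hypothetical helper: turn parser tracing on when "-d" appears among
 * the command-line arguments. Returns the resulting value of yydebug.
 */
int set_debug_from_args(int argc, char **argv)
{
    int i;

    for (i = 1; i < argc; i++) {
        if (strcmp(argv[i], "-d") == 0)
            yydebug = 1;
    }
    return yydebug;
}
```

A user-supplied main() would call set_debug_from_args(argc, argv) first and then return the result of yyparse().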


© 2004 The SCO Group, Inc. All rights reserved.
UnixWare 7 Release 7.1.4 - 27 April 2004