Lexical analysis with lex

Start conditions

Some problems call for more sensitivity to prior context than the ^ operator alone provides. You may want different rules applied to an expression depending on a prior context that is more complex than the end of a line or the start of a file. One approach is to set a flag that records the change in context, then write action code that tests the flag. Alternatively, you can define for lex the different ``start conditions'' under which each rule applies.

Consider this problem: copy the input to the output, except change the word magic to the word first on every line that begins with the letter a; change magic to second on every line that begins with b; change magic to third on every line that begins with c. Here is how the problem might be handled with a flag. Recall that ECHO is a lex macro equivalent to printf("%s", yytext):

       int flag;
   %%
   ^a  {flag = 'a'; ECHO;}
   ^b  {flag = 'b'; ECHO;}
   ^c  {flag = 'c'; ECHO;}
   \n  {flag = 0; ECHO;}
   magic {
       switch (flag)
       {
       case 'a': printf("first"); break;
       case 'b': printf("second"); break;
       case 'c': printf("third"); break;
       default: ECHO; break;
       }
       }

To handle the same problem with start conditions, each start condition must be introduced to lex in the definitions section with a line reading

   %Start name1 name2 . . .

where the conditions may be named in any order. The word Start may be abbreviated to ``S'' or ``s''. The conditions are referenced at the head of a rule with ``<>'' brackets. So

   <name1>expression

is a rule that is only recognized when the scanner is in start condition name1. To enter a start condition, execute the action statement

   BEGIN name1;

which changes the start condition to name1. To resume the normal state, execute

   BEGIN 0;

which resets the scanner to its initial condition. A rule may be active in several start conditions. That is,

   <name1,name2,name3>

is a valid prefix. Any rule not beginning with the <> prefix operators is always active.
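
One way to picture these prefix rules is as a bitmask test. The following C sketch is only a model of the behavior described above, not a description of how lex is implemented. The names SC_NAME1, struct rule, and rule_is_active are hypothetical, and the sketch assumes one bit per declared start condition, with 0 standing for the initial state reached by BEGIN 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names: one bit per condition declared with %Start. */
enum { SC_NAME1 = 1 << 0, SC_NAME2 = 1 << 1, SC_NAME3 = 1 << 2 };

/* A rule's <> prefix modelled as a bitmask; 0 means "no prefix". */
struct rule {
    const char *pattern;  /* kept for readability only, never matched here */
    unsigned    prefix;   /* OR of the SC_* bits named in the prefix */
};

/*
 * Returns true if the rule may be recognized while the scanner is in
 * start condition `current` (0 models the initial state, BEGIN 0).
 */
bool rule_is_active(const struct rule *r, unsigned current)
{
    assert(r != NULL);
    if (r->prefix == 0)
        return true;                    /* unprefixed rules are always active */
    return (r->prefix & current) != 0;  /* prefixed rules need a matching condition */
}
```

Under this model, a rule with prefix SC_NAME1 | SC_NAME2 | SC_NAME3 corresponds to <name1,name2,name3>expression, and a rule with prefix 0 corresponds to a plain expression with no prefix.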

The example can be written with start conditions as follows:

   %Start AA BB CC
   %%
   ^a        {ECHO; BEGIN AA;}
   ^b        {ECHO; BEGIN BB;}
   ^c        {ECHO; BEGIN CC;}
   \n        {ECHO; BEGIN 0;}
   <AA>magic     printf("first");
   <BB>magic     printf("second");
   <CC>magic     printf("third");
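
To make the effect of these rules concrete, here is a hand-written C sketch of the same behavior, with an enum standing in for the start conditions. It is an illustration under stated assumptions, not the code lex generates: the function name translate_magic and its buffer-based interface are hypothetical, and it works on an in-memory string rather than on standard input.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Models the start conditions of the example; INITIAL is "BEGIN 0". */
enum start_cond { INITIAL, AA, BB, CC };

/*
 * Hypothetical helper: copy `in` to `out` (at most outsz-1 bytes plus a
 * terminating NUL), replacing "magic" according to the letter that begins
 * the current line. Returns the number of bytes written, excluding the NUL.
 */
size_t translate_magic(const char *in, char *out, size_t outsz)
{
    enum start_cond sc = INITIAL;
    int at_line_start = 1;          /* plays the role of the ^ operator */
    size_t n = 0;

    assert(in != NULL && out != NULL && outsz > 0);

    while (*in != '\0') {
        /* ^a, ^b, ^c: enter a start condition; the letter is echoed below. */
        if (at_line_start) {
            if (*in == 'a')      sc = AA;
            else if (*in == 'b') sc = BB;
            else if (*in == 'c') sc = CC;
        }

        /* <AA>magic, <BB>magic, <CC>magic: only active in their condition. */
        if (sc != INITIAL && strncmp(in, "magic", 5) == 0) {
            const char *rep = (sc == AA) ? "first"
                            : (sc == BB) ? "second" : "third";
            while (*rep != '\0' && n + 1 < outsz)
                out[n++] = *rep++;
            in += 5;
            at_line_start = 0;
            continue;
        }

        /* \n echoes and does BEGIN 0; everything else is echoed by default. */
        if (*in == '\n') {
            sc = INITIAL;
            at_line_start = 1;
        } else {
            at_line_start = 0;
        }
        if (n + 1 < outsz)
            out[n++] = *in;
        in++;
    }
    out[n] = '\0';
    return n;
}
```

Calling translate_magic on the text "a magic" followed by a newline and "d magic" should leave "a first" on the first line and "d magic" unchanged on the second, because no start condition is entered on a line that begins with d.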


© 2004 The SCO Group, Inc. All rights reserved.
UnixWare 7 Release 7.1.4 - 27 April 2004