Lexical analysis with lex

Advanced lex usage

lex provides a suite of features that let you process input text containing quite complicated patterns. These include rules that decide which specification applies when more than one appears to match; functions that transform one matching pattern into another; and the use of definitions and subroutines. Before considering these features, you may want to confirm your understanding so far by examining an example that draws together several of the points already covered:

   -[0-9]+		printf("negative integer");
   \+?[0-9]+		printf("positive integer");
   -0\.[0-9]+		printf("negative fraction, no whole number part");
   rail[ \t]+road	printf("railroad is one word");
   crook		printf("Here's a crook");
   function		subprogcount++;
   G[a-zA-Z]*	{ printf("may have a G word here:%s", yytext);
   		Gstringcount++; }

The first three rules recognize negative integers, positive integers, and negative fractions between 0 and -1. The terminating + in each specification ensures that the number consists of one or more digits. Each of the next three rules recognizes a specific pattern. The specification for railroad matches cases where one or more blanks or tabs intervene between the two syllables of the word. For railroad and crook, the action could simply have printed a synonym rather than the messages shown. The rule recognizing the word function simply increments a counter. The last rule illustrates several points: braces enclose an action that extends over more than one line; the action uses yytext, the array in which lex stores the matched string; and the * in the specification indicates that zero or more letters may follow the G.
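
The rules in this example make up only the rules section of a specification. A minimal sketch of a complete file might look like the following. The counter declarations, the newline added to each message, and the main() and yywrap() routines are assumptions added for illustration and are not part of the example itself. The sketch also writes the period in the fraction rule as \. so that it matches only a literal decimal point rather than any character.

```lex
%{
/* Sketch only: the declarations here, and main() and yywrap() below,
   are assumptions added so that the rules form a complete file. */
#include <stdio.h>

int subprogcount = 0;	/* times the word "function" was recognized */
int Gstringcount = 0;	/* times a string starting with G was recognized */
%}
%%
-[0-9]+		printf("negative integer\n");
\+?[0-9]+	printf("positive integer\n");
-0\.[0-9]+	printf("negative fraction, no whole number part\n");
rail[ \t]+road	printf("railroad is one word\n");
crook		printf("Here's a crook\n");
function	subprogcount++;
G[a-zA-Z]*	{ printf("may have a G word here: %s\n", yytext);
		Gstringcount++; }
%%
/* Returning 1 tells the scanner that there is no further input. */
int yywrap(void)
{
	return 1;
}

/* Scan standard input to end of file, then report the counters. */
int main(void)
{
	yylex();
	printf("function: %d, G strings: %d\n", subprogcount, Gstringcount);
	return 0;
}
```

Assuming the file is saved as example.l (a hypothetical name), it can typically be built by running lex example.l and then cc lex.yy.c -o example; some systems may also need the lex library (-ll). Input that matches none of the rules is copied to the output unchanged, which is the default lex action.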


© 2004 The SCO Group, Inc. All rights reserved.
UnixWare 7 Release 7.1.4 - 27 April 2004