The general format of Lex source is:
{definitions}
%%
{rules}
%%
{user subroutines}
where the definitions and the user subroutines
are often omitted.
The second
%%
is optional, but the first is required
to mark the beginning of the rules.
The absolute minimum Lex program is thus
%%
(no definitions, no rules), which translates into a program
that copies the input to the output unchanged.
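As a rough illustration of that behavior, the following C sketch copies one stream to another character by character. It is not the code Lex generates; the function name `copy_stream` is an assumption made for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: copy every character from `in` to `out`
 * unchanged, which is the behavior the text describes for the
 * minimal "%%" program. Returns the number of characters copied. */
long copy_stream(FILE *in, FILE *out)
{
    int c;
    long n = 0;

    assert(in != NULL && out != NULL);
    while ((c = getc(in)) != EOF) {
        putc(c, out);
        n++;
    }
    return n;
}
```

Calling `copy_stream(stdin, stdout)` from a `main` function would give a filter with the same visible effect as the minimal Lex program.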
In the outline of Lex programs shown above, the
rules
represent the user's control
decisions; they are a table, in which the left column
contains
regular expressions
(see section 3)
and the right column contains
actions,
program fragments to be executed when the expressions
are recognized.
Thus an individual rule might appear
integer printf("found keyword INT");
to look for the string
integer
in the input stream and
print the message "found keyword INT" whenever it appears.
In this example the host procedural language is C and
the C library function
printf
is used to print the string.
The end
of the expression is indicated by the first blank or tab character.
If the action is merely a single C statement,
it can be given on the right side of the line; if it is
compound, or takes more than one line, it should be enclosed in
braces.
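The idea of rules as a table of expressions and actions can be sketched in plain C. This is a simplified illustration, not what Lex produces: the patterns here are literal strings rather than regular expressions, and `struct rule`, `found_int`, and `scan` are names invented for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One row of a hypothetical rule table: the left column is a pattern
 * (a literal string here, not a full regular expression) and the
 * right column is the action to run when the pattern is recognized. */
struct rule {
    const char *pattern;
    void (*action)(FILE *out);
};

static void found_int(FILE *out)
{
    fputs("found keyword INT", out);
}

static const struct rule rules[] = {
    { "integer", found_int },
};

/* Scan `input`; where a pattern matches, run its action instead of
 * copying the matched text, otherwise copy one character through.
 * Returns the number of times an action was executed. */
int scan(const char *input, FILE *out)
{
    size_t nrules = sizeof rules / sizeof rules[0];
    int hits = 0;

    assert(input != NULL && out != NULL);
    while (*input != '\0') {
        size_t i;
        for (i = 0; i < nrules; i++) {
            size_t len = strlen(rules[i].pattern);
            if (strncmp(input, rules[i].pattern, len) == 0) {
                rules[i].action(out);
                input += len;
                hits++;
                break;
            }
        }
        if (i == nrules)
            putc(*input++, out);   /* no rule matched: copy the character */
    }
    return hits;
}
```

Adding a rule in this sketch means adding a row to `rules`, which mirrors how a Lex rule pairs an expression with its action on one line.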
As a slightly more useful example, suppose it is desired to
change a number of words from British to American spelling.
Lex rules such as
colour      printf("color");
mechanise   printf("mechanize");
petrol      printf("gas");
would be a start. These rules are not quite enough,
since
the word
petroleum
would become
gaseum;
a way of dealing
with this will be described later.
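The problem with petroleum can be reproduced with a small C sketch that applies the same three substitutions wherever they occur in the input. It only imitates the effect of the rules with literal string matching; the table `spelling` and the function `americanize` are assumptions made for this illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical table mirroring the three rules in the text. */
static const char *const spelling[][2] = {
    { "colour",    "color"     },
    { "mechanise", "mechanize" },
    { "petrol",    "gas"       },
};

/* Rewrite `in` into `out` (capacity `cap`), replacing each listed
 * British spelling wherever it occurs, even inside a longer word.
 * Output that does not fit is truncated. Returns `out`. */
char *americanize(const char *in, char *out, size_t cap)
{
    size_t n = sizeof spelling / sizeof spelling[0];
    size_t used = 0;

    assert(in != NULL && out != NULL && cap > 0);
    while (*in != '\0') {
        const char *piece = NULL;
        size_t piece_len = 1;
        size_t i;
        for (i = 0; i < n; i++) {
            size_t len = strlen(spelling[i][0]);
            if (strncmp(in, spelling[i][0], len) == 0) {
                piece = spelling[i][1];
                piece_len = strlen(piece);
                in += len;
                break;
            }
        }
        if (piece == NULL)
            piece = in++;           /* no rule matched: copy one character */
        if (used + piece_len >= cap)
            break;                  /* truncate rather than overflow */
        memcpy(out + used, piece, piece_len);
        used += piece_len;
    }
    out[used] = '\0';
    return out;
}
```

With this sketch, the input `petrol` becomes `gas`, while `petroleum` becomes `gaseum`, because the match pays no attention to where a word ends.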