The actual implementation of awk uses the language development tools available on the UNIX operating system. The grammar is specified with yacc [Johnson]; the lexical analysis is done by lex; the regular expression recognizers are deterministic finite automata constructed directly from the expressions. An awk program is translated into a parse tree, which is then directly executed by a simple interpreter.
Awk was designed for ease of use rather than processing speed; the delayed evaluation of variable types and the necessity to break input into fields make high speed difficult to achieve in any case. Nonetheless, the program has not proven to be unworkably slow.
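The two costs just mentioned can be seen in a small sketch, assuming a POSIX awk is on the PATH. The helper name show_fields and the two sample lines are made up for illustration. Every line is split into fields before the action runs, and the third field is held as a string until it is used in an arithmetic context, at which point it is converted to a number (a non-numeric string counts as 0).

```shell
# Hypothetical helper, for illustration only.
# For each input line, print the number of fields, the third field as a
# string, and the third field plus one (which forces numeric conversion).
show_fields() {
    awk '{ print NF, $3, $3 + 1 }'
}

# Made-up sample input: one line with a numeric third field, one without.
printf '%s\n' 'a b 10 d' 'x y z' | show_fields
```

Because the type of $3 is not known until it is used, the decision between string and number has to be made at run time for every record.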
Table I below shows the execution (user + system) time on a PDP-11/70 of the UNIX programs wc, grep, egrep, fgrep, sed, lex, and awk on the following simple tasks:
The program wc merely counts words, lines and characters in its input; we have already mentioned the others. In all cases the input was a file containing 10,000 lines as created by the command ls -l; each line has the form
As might be expected, awk is not as fast as the specialized tools wc, sed, or the programs in the grep family, but is faster than the more general tool lex. In all cases, the tasks were about as easy to express as awk programs as programs in these other languages; tasks involving fields were considerably easier to express as awk programs. Some of the test programs are shown in awk, sed and lex.
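As a rough illustration of why field-oriented tasks are easier to express in awk, the sketch below prints the third field of a line in both awk and sed. These are not the programs that were timed; the function names and the sample line are hypothetical, and the sed version assumes blank-separated fields with no leading blanks.

```shell
# Hypothetical helpers, for illustration only.

# awk: the third field is available directly as $3.
third_field_awk() {
    awk '{ print $3 }'
}

# sed: the first two fields must be skipped by hand with a regular
# expression, and the third is captured in \1.
third_field_sed() {
    sed 's/^[^ ]* *[^ ]* *\([^ ]*\).*/\1/'
}

# Made-up sample line in the general style of ls -l output.
line='-rw-r--r-- 1 alice 123 Jan 1 12:00 notes'
printf '%s\n' "$line" | third_field_awk
printf '%s\n' "$line" | third_field_sed
```

Both commands print the same field for this input, but the awk program states the intent directly, while the sed program encodes the field layout in its pattern.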
        |                           Task                           |
Program |    1 |     2 |     3 |    4 |    5 |     6 |    7 |    8 |
--------+------+-------+-------+------+------+-------+------+------+
wc      |  8.6 |       |       |      |      |       |      |      |
grep    | 11.7 |  13.1 |       |      |      |       |      |      |
egrep   |  6.2 |  11.5 |  11.6 |      |      |       |      |      |
fgrep   |  7.7 |  13.8 |  16.1 |      |      |       |      |      |
sed     | 10.2 |  11.6 |  15.8 | 29.0 | 30.5 |  16.1 |      |      |
lex     | 65.1 | 150.1 | 144.2 | 67.7 | 70.3 | 104.0 | 81.7 | 92.8 |
awk     | 15.0 |  25.6 |  29.9 | 33.3 | 38.9 |  46.4 | 71.4 | 31.1 |
--------+------+-------+-------+------+------+-------+------+------+
The programs for some of these jobs are shown below. The lex programs are generally too long to show.
AWK:
SED:
LEX: