An awk action is a sequence of action statements terminated by newlines or semicolons. These action statements can be used to do a variety of bookkeeping and string-manipulating tasks.
Awk provides a length function to compute the length of a string of characters. A one-line action can use it to print each record preceded by its length.
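A minimal sketch of such a program, wrapped in a shell function so it can be run directly. The function name show_length and the sample input are assumptions made for illustration.

```sh
# show_length is a hypothetical helper name; the awk program is the point.
show_length() { awk '{ print length($0), $0 }'; }

# Made-up sample input: two records.
printf 'hello\nhi there\n' | show_length
```

Each output line carries the character count of the record, a space, and the record itself.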
Awk also provides the arithmetic functions sqrt, log, exp, and int, for square root, base e logarithm, exponential, and integer part of their respective arguments.
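The arithmetic functions can be tried on a single record, as in the sketch below. The helper name math_demo and the input values are assumptions chosen so that the results are easy to check by hand.

```sh
# math_demo is a hypothetical helper name.
# sqrt and int act on the first two fields; exp and log on constants.
math_demo() { awk '{ print sqrt($1), int($2), exp(0), log(1) }'; }

echo '16 3.9' | math_demo
```

Here int truncates 3.9 to its integer part, and exp(0) and log(1) give 1 and 0 respectively.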
The name of one of these built-in functions, without argument or parentheses, stands for the value of the function on the whole record, so a pattern can test length directly to select records by size.
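As a hedged illustration using length, the sketch below selects records whose length falls outside a range. The bounds 3 and 5, the helper name odd_lengths, and the sample input are all assumptions.

```sh
# odd_lengths is a hypothetical helper name.
# A pattern with no action prints every record it matches.
odd_lengths() { awk 'length < 3 || length > 5'; }

printf 'ab\nabcd\nabcdefg\n' | odd_lengths
```

Only the 2-character and 7-character records are printed; the 4-character record is inside the range and is skipped.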
The function substr(s, m, n) produces the substring of s that begins at position m (origin 1) and is at most n characters long. If n is omitted, the substring goes to the end of s. The function index(s1, s2) returns the position where the string s2 occurs in s1, or zero if it does not.
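A short sketch of substr and index on one word; the helper name substr_demo and the input word are assumptions for illustration.

```sh
# substr_demo is a hypothetical helper name.
substr_demo() {
  awk '{ print substr($0, 2, 3), substr($0, 5), index($0, "na"), index($0, "x") }'
}

echo banana | substr_demo
```

For the input banana, substr($0, 2, 3) is ana, substr($0, 5) runs to the end and gives na, index finds na starting at position 3, and the search for x returns 0.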
The function sprintf(f, e1, e2, ...) returns, as a string, the values of the expressions e1, e2, etc., formatted according to the printf format specified by f.
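A sketch of sprintf in use follows. The format string, the helper name fmt_demo, and the brackets printed around the result (added only to make the padding visible) are assumptions.

```sh
# fmt_demo is a hypothetical helper name.
# x receives the formatted string; nothing is printed until the print statement.
fmt_demo() { awk '{ x = sprintf("%8.2f %5d", $1, $2); print "[" x "]" }'; }

echo '3.14159 42' | fmt_demo
```

The first field is rounded to two decimals in a field 8 characters wide, and the second is right-justified in 5 characters.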
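Awk variables take on numeric (floating point) or string values according to context: assigning a number yields a numeric value, assigning a quoted string yields a string, and strings are converted to numbers wherever an arithmetic operator requires it.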
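The effect of context can be seen in a two-assignment sketch; the helper name ctx_demo and the operand values are assumptions.

```sh
# ctx_demo is a hypothetical helper name.
# "+" forces numeric context; juxtaposition forces string context.
ctx_demo() { awk 'BEGIN { x = "3" + "4"; y = "3" "4"; print x, y }'; }

ctx_demo
```

The same two string constants give the number 7 under addition and the string 34 under concatenation.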
By default, variables (other than built-ins) are initialized to the null string, which has numerical value zero; this eliminates the need for most BEGIN sections. For example, the sums of the first two fields can be accumulated in two variables with no explicit initialization.
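A minimal sketch of that computation, with no BEGIN section. The variable names s1 and s2, the helper name sum_fields, and the sample input are assumptions.

```sh
# sum_fields is a hypothetical helper name.
# s1 and s2 start out as the null string, which counts as zero in "+=".
sum_fields() { awk '{ s1 += $1; s2 += $2 } END { print s1, s2 }'; }

printf '1 10\n2 20\n3 30\n' | sum_fields
```

The END action prints the two column totals once all input has been read.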
Arithmetic is done internally in floating point. The arithmetic operators are +, -, *, /, and % (mod). The C increment ++ and decrement -- operators are also available, and so are the assignment operators +=, -=, *=, /=, and %=. These operators may all be used in expressions.
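A few of these operators in one expression sequence, as a sketch; the helper name arith_demo and the starting value are assumptions.

```sh
# arith_demo is a hypothetical helper name.
# n goes 7 -> 8 (increment) -> 16 (multiply-assign).
arith_demo() { awk 'BEGIN { n = 7; n++; n *= 2; print n, n % 5, n / 32 }'; }

arith_demo
```

Because arithmetic is floating point, the final division yields a fraction rather than truncating to zero.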
Fields in awk share essentially all of the properties of variables: they may be used in arithmetic or string operations, and may be assigned to. Thus one can, for instance, replace the first field of every record with a sequence number.
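A sketch of assigning to a field, using the record number NR as the sequence number. The helper name number_records and the sample input are assumptions.

```sh
# number_records is a hypothetical helper name.
# Assigning to $1 changes the record; the bare print then emits the modified record.
number_records() { awk '{ $1 = NR; print }'; }

printf 'x a b\ny c d\n' | number_records
```

The first field of each record is overwritten with 1, 2, and so on, while the remaining fields are left alone.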
Field references may be numerical expressions, so the field number after $ can be computed rather than written as a constant.
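One common use of a computed field reference is to count back from the last field. The sketch below assumes records with at least two fields; the helper name last_two is hypothetical.

```sh
# last_two is a hypothetical helper name.
# NF is the number of fields, so $NF is the last field and $(NF-1) the one before it.
last_two() { awk '{ print $NF, $(NF-1) }'; }

echo 'a b c d' | last_two
```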
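Each input line is split into fields automatically as necessary. It is also possible to split any variable or string into fields with the built-in function split(s, array, sep), which stores the pieces in array[1], ..., array[n] and returns n; if sep is omitted, the field separator FS is used.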
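A sketch of splitting a string with the built-in split function, using a colon as the separator. The array name part, the helper name split_demo, and the sample input are assumptions.

```sh
# split_demo is a hypothetical helper name.
# split returns the number of pieces and fills part[1] .. part[n].
split_demo() { awk '{ n = split($0, part, ":"); print n, part[1], part[n] }'; }

echo 'usr:local:bin' | split_demo
```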
Strings may be concatenated by writing them next to one another; there is no explicit concatenation operator.
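Concatenation is written by juxtaposition, as the following sketch shows; the helper name concat_demo and the sample input are assumptions.

```sh
# concat_demo is a hypothetical helper name.
# $1 $2 $3 is the three fields joined with nothing between them.
concat_demo() { awk '{ print length($1 $2 $3), $1 "-" $2 }'; }

echo 'ab cde f' | concat_demo
```

The first value is the length of abcdef; the second joins two fields around a literal hyphen.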
Array elements are not declared; they spring into existence by being mentioned. Subscripts may have any non-null value, including non-numeric strings. A conventional numeric subscript such as NR, for example, lets a program store every input record in an array and process the records later in an END action.
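A sketch with a numeric subscript: each record is saved under its record number, and the END action prints the saved records in reverse. The array name x, the helper name reverse_lines, and the reversal task itself are assumptions chosen for illustration.

```sh
# reverse_lines is a hypothetical helper name.
# x[NR] = $0 stores the current record as element NR of x.
reverse_lines() {
  awk '{ x[NR] = $0 } END { for (i = NR; i >= 1; i--) print x[i] }'
}

printf 'one\ntwo\nthree\n' | reverse_lines
```

Note that the whole input is held in memory, so this approach suits inputs of modest size.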
Array elements may be named by non-numeric values, which gives awk a capability rather like the associative memory of Snobol tables. Suppose the input contains fields with values like apple, orange, etc. Then a program can use those values directly as subscripts to count how often each one occurs.
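A sketch of counting with string subscripts follows. The array name x, the helper name count_fruit, and the sample input are assumptions.

```sh
# count_fruit is a hypothetical helper name.
# Each pattern increments the array element named by the matching word.
count_fruit() {
  awk '/apple/  { x["apple"]++ }
       /orange/ { x["orange"]++ }
       END      { print x["apple"], x["orange"] }'
}

printf 'apple\norange\napple pie\n' | count_fruit
```

Two records match apple and one matches orange, and the END action prints the two counts.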
Awk provides the basic flow-of-control statements if-else, while, for, and statement grouping with braces, as in C. We showed the if statement in section 3.3 without describing it. The condition in parentheses is evaluated; if it is true, the statement following the if is done. The else part is optional.
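An if-else in an action might look like the sketch below. The threshold 100, the labels, and the helper name classify are assumptions.

```sh
# classify is a hypothetical helper name.
# The condition is evaluated once per record; exactly one branch runs.
classify() {
  awk '{ if ($1 > 100) print $1, "large"; else print $1, "small" }'
}

printf '150\n7\n' | classify
```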
The while statement is exactly like that of C. A typical use is a loop that steps an index from 1 to NF to print all input fields one per line.
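A sketch of a while loop that prints the fields of each record one per line; the helper name fields_while and the sample input are assumptions.

```sh
# fields_while is a hypothetical helper name.
# i counts from 1 up to NF, the number of fields in the current record.
fields_while() { awk '{ i = 1; while (i <= NF) { print $i; ++i } }'; }

echo 'a b c' | fields_while
```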
The for statement is also exactly like that of C, with an initialization, a test, and an increment.
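The same field-printing job can be sketched with a for loop, which gathers the initialization, test, and increment in one place. The helper name fields_for is hypothetical.

```sh
# fields_for is a hypothetical helper name.
fields_for() { awk '{ for (i = 1; i <= NF; i++) print $i }'; }

echo 'a b c' | fields_for
```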
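There is an alternate form of the for statement which is suited for accessing the elements of an associative array: for (i in array) executes its statement with i set in turn to each subscript of array, in an unspecified order.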
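A sketch of the for (k in array) form, tallying the first field of each record. Because the traversal order is unspecified, this sketch pipes the result through sort to get a stable listing; the sort step, the array name n, and the helper name count_keys are assumptions.

```sh
# count_keys is a hypothetical helper name.
# k takes on each subscript of n in turn; sort fixes the output order.
count_keys() {
  awk '{ n[$1]++ } END { for (k in n) print k, n[k] }' | sort
}

printf 'b\na\nb\n' | count_keys
```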
The expression in the condition part of an if, while or for can include relational operators like <, <=, >, >=, == ("is equal to"), and != ("not equal to"); regular expression matches with the match operators ~ and !~; the logical operators ||, &&, and !; and of course parentheses for grouping.
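Several of these operators can be combined in one condition, as in the sketch below. The particular test, the helper name pick_rows, and the sample input are assumptions.

```sh
# pick_rows is a hypothetical helper name.
# Keep rows whose first field is at least 10 and whose second field
# starts with "a" but is not exactly "avoid".
pick_rows() {
  awk '{ if ($1 >= 10 && $2 ~ /^a/ && !($2 == "avoid")) print $2 }'
}

printf '12 apple\n5 apricot\n20 avoid\n30 berry\n' | pick_rows
```

Only the first record passes all three tests; the others fail the comparison, the negated equality, and the regular expression match respectively.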
The break statement causes an immediate exit from an enclosing while or for; the continue statement causes the next iteration to begin.
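A sketch showing both statements in one loop. The marker values "-" and "stop", and the helper name until_stop, are assumptions.

```sh
# until_stop is a hypothetical helper name.
# "-" fields are skipped with continue; "stop" ends the loop with break.
until_stop() {
  awk '{ for (i = 1; i <= NF; i++) {
           if ($i == "-") continue
           if ($i == "stop") break
           print $i
         } }'
}

echo 'a - b stop c' | until_stop
```

The field after stop is never reached, so only a and b are printed.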
The statement next causes awk to skip immediately to the next record and begin scanning the patterns from the top. The statement exit causes the program to behave as if the end of the input had occurred.
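A sketch of next and exit together. The conventions used here (lines starting with # are skipped, a line reading end stops processing) and the helper name until_end are assumptions.

```sh
# until_end is a hypothetical helper name.
# next skips the remaining patterns for this record;
# exit stops reading input, after which the END action still runs.
until_end() {
  awk '/^#/    { next }
       /^end$/ { exit }
               { print }
       END     { print "done" }'
}

printf '# note\none\nend\ntwo\n' | until_end
```

The record after end is never read, and the END action runs because exit behaves as if the input had ended.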
Comments may be placed in awk programs: they begin with the character # and end with the end of the line.
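A sketch of a commented statement follows; the helper name comment_demo is hypothetical. Since the comment runs to the end of the line, the closing brace has to go on the next line.

```sh
# comment_demo is a hypothetical helper name.
comment_demo() {
  awk '{ print $2, $1   # swap the first two fields
       }'
}

echo 'x y' | comment_demo
```

When the awk program is quoted on a shell command line like this, an apostrophe inside a comment would end the shell quoting, so comments in such programs are best written without one.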