4.  The Language

      We will not try to describe the language precisely here; interested readers may refer to the appendix for more details. Throughout this section, we will write expressions exactly as they are handed to the typesetting program (hereinafter called ``EQN''), except that we won't show the delimiters that the user types to mark the beginning and end of the expression. The interface between EQN and TROFF is described at the end of this section.

      As we said, typing x=y+z+1 should produce $x = y + z + 1$, and indeed it does. Variables are made italic, operators and digits become roman, and normal spacings between letters and operators are altered slightly to give a more pleasing appearance.

      Input is free-form. Spaces and new lines in the input are used by EQN to separate pieces of the input; they are not used to create space in the output. Thus

x = y
+ z + 1
also gives $x = y + z + 1$. Free-form input is easier to type initially; subsequent editing is also easier, for an expression may be typed as many short lines.

      Extra white space can be forced into the output by several characters of various sizes. A tilde ``~'' gives a space equal to the normal word spacing in text; a circumflex ``^'' gives half this much, and a tab character spaces to the next tab stop.

      Spaces (or tildes, etc.) also serve to delimit pieces of the input. For example, to get

$$f(t) = 2 \pi \int \sin ( \omega t ) \, dt$$

we write

f(t) = 2 pi int sin ( omega t )dt
Here spaces are necessary in the input to indicate that sin, pi, int, and omega are special, and potentially worth special treatment. EQN looks up each such string of characters in a table, and if appropriate gives it a translation. In this case, pi and omega become their greek equivalents, int becomes the integral sign (which must be moved down and enlarged so it looks ``right''), and sin is made roman, following conventional mathematical practice. Parentheses, digits and operators are automatically made roman wherever found.
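
      The table lookup just described might be sketched as follows in Python. This is only an illustration, not EQN's actual code: the table contents, the font names and the function classify are assumptions made for the example, and the real table is much larger.

```python
# Hypothetical lookup table: input word -> (output text, font).
TABLE = {
    "pi": ("\u03c0", "greek"),      # Greek small letter pi
    "omega": ("\u03c9", "greek"),   # Greek small letter omega
    "int": ("\u222b", "symbol"),    # integral sign
    "sin": ("sin", "roman"),
}


def classify(expr):
    """Split the input on white space and look each piece up in TABLE."""
    pieces = []
    for word in expr.split():
        # Words that are not in the table pass through for default treatment.
        pieces.append(TABLE.get(word, (word, "default")))
    return pieces


pieces = classify("f(t) = 2 pi int sin ( omega t )dt")
```

Because the lookup works on whole space-delimited pieces, an input such as ``sin(x)'' typed without spaces would not match the table entry for sin in this sketch, which is why the spaces in the example are necessary.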

      Fractions are specified with the keyword over:

a+b over c+d+e = 1
produces

$$\frac{a+b}{c+d+e} = 1$$

      Similarly, subscripts and superscripts are introduced by the keywords sub and sup:

$$x^2 + y^2 = z^2$$

is produced by

x sup 2 + y sup 2 = z sup 2
The spaces after the 2's are necessary to mark the end of the superscripts; similarly the keyword sup has to be marked off by spaces or some equivalent delimiter. The return to the proper baseline is automatic. Multiple levels of subscripts or superscripts are of course allowed: ``x sup y sup z'' is $x^{y^z}$. The construct ``something sub something sup something'' is recognized as a special case, so ``x sub i sup 2'' is $x_i^2$ instead of $x_{i^2}$.

      More complicated expressions can now be formed with these primitives:

$$\frac{\partial^2 f}{\partial x^2} = \frac{x^2}{a^2} + \frac{y^2}{b^2}$$

is produced by

{partial sup 2 f} over {partial x sup 2} = x sup 2 over a sup 2 + y sup 2 over b sup 2
Braces {} are used to group objects together; in this case they indicate unambiguously what goes over what on the left-hand side of the expression. The language defines the precedence of sup to be higher than that of over, so no braces are needed to get the correct association on the right side. Braces can always be used when in doubt about precedence.

      The braces convention is an example of the power of using a recursive grammar to define the language. It is part of the language that if a construct can appear in some context, then any expression in braces can also occur in that context.
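
      To make the role of the recursive grammar concrete, here is a minimal recursive-descent sketch in Python that turns a small subset of the language (sub, sup, over and braces) into LaTeX. It is not EQN's actual grammar or implementation: the class Parser, the function to_latex and the WORDS table are hypothetical, error handling is omitted, and LaTeX is used only as a convenient way to show the resulting structure.

```python
WORDS = {"partial": "\\partial"}  # hypothetical keyword translations


def tokenize(src):
    # Braces delimit on their own; everything else is split on white space.
    return src.replace("{", " { ").replace("}", " } ").split()


class Parser:
    def __init__(self, src):
        self.toks = tokenize(src)
        self.pos = 0

    def peek(self):
        return self.toks[self.pos] if self.pos < len(self.toks) else None

    def take(self):
        tok = self.toks[self.pos]
        self.pos += 1
        return tok

    def seq(self):
        # A sequence of objects, ended by a closing brace or end of input.
        parts = []
        while self.peek() not in (None, "}"):
            parts.append(self.frac())
        return " ".join(parts)

    def frac(self):
        # "over" binds less tightly than sub and sup.
        left = self.script()
        while self.peek() == "over":
            self.take()
            left = "\\frac{%s}{%s}" % (left, self.script())
        return left

    def script(self, allow_sup=True):
        base = self.primary()
        if self.peek() == "sub":
            self.take()
            # Special case: in "x sub i sup 2" the sup attaches to x, not i.
            base += "_{%s}" % self.script(allow_sup=False)
        if allow_sup and self.peek() == "sup":
            self.take()
            base += "^{%s}" % self.script()  # right-recursive: x sup y sup z
        return base

    def primary(self):
        tok = self.take()
        if tok == "{":
            inner = self.seq()  # any expression may appear inside braces
            self.take()         # consume the closing brace (assumed present)
            return inner
        return WORDS.get(tok, tok)


def to_latex(src):
    return Parser(src).seq()


print(to_latex("{partial sup 2 f} over {partial x sup 2}"))
print(to_latex("x sup 2 over a sup 2 + y sup 2 over b sup 2"))
```

The only place braces are handled is primary, which calls back into seq; that single recursive call is what lets a braced expression stand wherever a simple object could.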

      There is a sqrt operator for making square roots of the appropriate size: ``sqrt a+b'' produces $\sqrt{a+b}$, and

x = {-b +- sqrt{b sup 2 -4ac}} over 2a
is

$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$$

Since large radicals look poor on our typesetter, sqrt is not useful for tall expressions.

      Limits on summations, integrals and similar constructions are specified with the keywords from and to. To get

$$\sum_{i=0}^{\infty} x_i \rightarrow 0$$

we need only type

sum from i=0 to inf x sub i -> 0
Centering and making the $\Sigma$ big enough and the limits smaller are all automatic. The from and to parts are both optional, and the central part (e.g., the $\Sigma$) can in fact be anything:

lim from {x -> pi /2} ( tan~x) = inf
is

$$\lim_{x \rightarrow \pi / 2} ( \tan x ) = \infty$$

Again, the braces indicate just what goes into the from part.

      There is a facility for making braces, brackets, parentheses, and vertical bars of the right height, using the keywords left and right:

left [ x+y over 2a right ]~=~1
produces

$$\left[ \frac{x+y}{2a} \right] = 1$$

A left need not have a corresponding right, as we shall see in the next example. Any characters may follow left and right, but generally only various parentheses and bars are meaningful.

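      Big brackets, etc., are often used with another facility, called piles, which make vertical piles of objects. For example, to get

$$sign(x) \equiv \left\{ \begin{array}{rll} 1 & if & x>0 \\ 0 & if & x=0 \\ -1 & if & x<0 \end{array} \right.$$

we can type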

sign (x) ~==~ left {
rpile {1 above 0 above -1}
~~lpile {if above if above if}
~~lpile {x>0 above x=0 above x<0}
The construction ``left {'' makes a left brace big enough to enclose the ``rpile {...}'', which is a right-justified pile of ``above ... above ...''. ``lpile'' makes a left-justified pile. There are also centered piles. Because of the recursive language definition, a pile can contain any number of elements; any element of a pile can of course contain piles.

      Although EQN makes a valiant attempt to use the right sizes and fonts, there are times when the default assumptions are simply not what is wanted. For instance the italic sign in the previous example would conventionally be in roman. Slides and transparencies often require larger characters than normal text. Thus we also provide size and font changing commands: ``size 12 bold {A~x~=~y}'' will produce [equation]. Size is followed by a number representing a character size in points. (One point is 1/72 inch; this paper is set in 9 point type.)

      If necessary, an input string can be quoted in "...", which turns off grammatical significance, and any font or spacing changes that might otherwise be done on it. Thus we can say

lim~ roman "sup" ~x sub n = 0
to ensure that the supremum doesn't become a superscript:

$$\lim\ \mathrm{sup}\ x_n = 0$$

      Diacritical marks, long a problem in traditional typesetting, are straightforward:

$$\underline{\dot{x}} + \hat{x} + \tilde{y} + \hat{X} + \ddot{Y} = \overline{z+Z}$$

is made by typing

x dot under + x hat + y tilde
+ X hat + Y dotdot = z+Z bar

      There are also facilities for globally changing default sizes and fonts, for example for making viewgraphs or for setting chemical equations. The language allows for matrices, and for lining up equations at the same horizontal position.

      Finally, there is a definition facility, so a user can say

define name "..."
at any time in the document; henceforth, any occurrence of the token ``name'' in an expression will be expanded into whatever was inside the double quotes in its definition. This lets users tailor the language to their own specifications, for it is quite possible to redefine keywords like sup or over. Section 6 shows an example of definitions.
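
      The effect of a definition can be sketched in a few lines of Python. This is an illustration rather than EQN's implementation: the function expand and the names xi and sq are hypothetical, and the sketch assumes that replacement text is itself rescanned for defined names.

```python
def expand(words, defs):
    """Replace each defined name by the words of its definition."""
    out = []
    for word in words:
        if word in defs:
            # Assumption: the replacement is rescanned for further names.
            # A self-referential definition would recurse forever here.
            out.extend(expand(defs[word].split(), defs))
        else:
            out.append(word)
    return out


# Hypothetical definitions, as if the user had typed
#   define xi "x sub i"
#   define sq "sup 2"
defs = {"xi": "x sub i", "sq": "sup 2"}

print(" ".join(expand("xi sq + y sq".split(), defs)))
# prints: x sub i sup 2 + y sup 2
```

Since keywords are looked up in the same way as any other name in this sketch, an entry for ``sup'' or ``over'' in defs would take over that word, which mirrors the redefinition of keywords mentioned above.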

      The EQN preprocessor reads intermixed text and equations, and passes its output to TROFF. Since TROFF uses lines beginning with a period as control words (e.g., ``.ce'' means ``center the next output line''), EQN uses the sequence ``.EQ'' to mark the beginning of an equation and ``.EN'' to mark the end. The ``.EQ'' and ``.EN'' are passed through to TROFF untouched, so they can also be used by a knowledgeable user to center equations, number them automatically, etc. By default, however, ``.EQ'' and ``.EN'' are simply ignored by TROFF, so equations are printed in-line.

      ``.EQ'' and ``.EN'' can be supplemented by TROFF commands as desired; for example, a centered display equation can be produced with the input:

.ce
.EQ
x sub i = y sub i ...
.EN
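
      The division of labor between EQN and TROFF can be pictured with a short Python sketch. It is an assumption-laden illustration, not the real preprocessor: the function split_document and the labels ``pass'' and ``equation'' are hypothetical, and the sketch only classifies lines instead of translating them.

```python
def split_document(lines):
    """Label each input line as an equation line or a pass-through line."""
    out = []
    in_equation = False
    for line in lines:
        if line.startswith(".EQ"):
            in_equation = True
            out.append(("pass", line))      # the marker itself goes to TROFF
        elif line.startswith(".EN"):
            in_equation = False
            out.append(("pass", line))
        elif in_equation:
            out.append(("equation", line))  # would be translated by EQN
        else:
            out.append(("pass", line))      # ordinary text and TROFF commands
    return out


labelled = split_document([".ce", ".EQ", "x sub i = y sub i", ".EN"])
```

Only the line between the markers is treated as an equation here; ``.ce'', ``.EQ'' and ``.EN'' all pass through unchanged.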

      Since it is tedious to type ``.EQ'' and ``.EN'' around very short expressions (single letters, for instance), the user can also define two characters to serve as the left and right delimiters of expressions. These characters are recognized anywhere in subsequent text. For example if the left and right delimiters have both been set to ``#'', the input:

Let #x sub i#, #y# and #alpha# be positive
produces:

Let $x_i$, $y$ and $\alpha$ be positive
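
      Recognizing in-line delimiters amounts to splitting each text line at the delimiter character. The following Python sketch shows the idea under simplifying assumptions: the function split_inline is hypothetical, the same character is used for both delimiters, and unbalanced delimiters are not detected.

```python
def split_inline(line, delim="#"):
    """Split a text line into ("text", ...) and ("eqn", ...) segments."""
    segments = []
    for i, part in enumerate(line.split(delim)):
        if not part:
            continue  # skip empty pieces, e.g. when a line starts with delim
        # Pieces at odd positions lie between a pair of delimiters.
        kind = "eqn" if i % 2 else "text"
        segments.append((kind, part))
    return segments


segments = split_inline("Let #x sub i#, #y# and #alpha# be positive")
```

For the example line, the segments labelled ``eqn'' are ``x sub i'', ``y'' and ``alpha''; the remaining text is left alone.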

      Running a preprocessor is strikingly easy on UNIX. To typeset text stored in file ``f'', one issues the command:

eqn f | troff
The vertical bar connects the output of one process (EQN) to the input of another (TROFF).