2.  LANGUAGE DESCRIPTION

Design

      Ratfor attempts to retain the merits of Fortran (universality, portability, efficiency) while hiding the worst Fortran inadequacies. The language is Fortran except for two aspects. First, since control flow is central to any program, regardless of the specific application, the primary task of Ratfor is to conceal this part of Fortran from the user, by providing decent control flow structures. These structures are sufficient and comfortable for structured programming in the narrow sense of programming without GOTO's. Second, since the preprocessor must examine an entire program to translate the control structure, it is possible at the same time to clean up many of the ``cosmetic'' deficiencies of Fortran, and thus provide a language which is easier and more pleasant to read and write.

      Beyond these two aspects (control flow and cosmetics) Ratfor does nothing about the host of other weaknesses of Fortran. Although it would be straightforward to extend it to provide character strings, for example, they are not needed by everyone, and of course the preprocessor would be harder to implement. Throughout, the design principle which has determined what should be in Ratfor and what should not has been ``Ratfor doesn't know any Fortran.'' Any language feature which would require that Ratfor really understand Fortran has been omitted. We will return to this point in the section on implementation.

      Even within the confines of control flow and cosmetics, we have attempted to be selective in what features to provide. The intent has been to provide a small set of the most useful constructs, rather than to throw in everything that has ever been thought useful by someone.

      The rest of this section contains an informal description of the Ratfor language. The control flow aspects will be quite familiar to readers used to languages like Algol, PL/I, Pascal, etc., and the cosmetic changes are equally straightforward. We shall concentrate on showing what the language looks like.

Statement Grouping

      Fortran provides no way to group statements together, short of making them into a subroutine. The standard construction ``if a condition is true, do this group of things,'' for example,

if (x > 100)
    { call error("x>100"); err = 1; return }
cannot be written directly in Fortran. Instead a programmer is forced to translate this relatively clear thought into murky Fortran, by stating the negative condition and branching around the group of statements:
      if (x .le. 100) goto 10
      call error(5hx>100)
      err = 1
      return
10    ...
When the program doesn't work, or when it must be modified, this must be translated back into a clearer form before one can be sure what it does.

      Ratfor eliminates this error-prone and confusing back-and-forth translation; the first form is the way the computation is written in Ratfor. A group of statements can be treated as a unit by enclosing them in the braces { and }. This is true throughout the language: wherever a single Ratfor statement can be used, there can be several enclosed in braces. (Braces seem clearer and less obtrusive than begin and end or do and end, and of course do and end already have Fortran meanings.)

      Cosmetics contribute to the readability of code, and thus to its understandability. The character ``>'' is clearer than ``.GT.'', so Ratfor translates it appropriately, along with several other similar shorthands. Although many Fortran compilers permit character strings in quotes (like "x>100"), quotes are not allowed in ANSI Fortran, so Ratfor converts each quoted string into the nH form with the right count: computers count better than people do.
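The counting step is easy to picture. The following Python fragment is only an illustrative sketch, not Ratfor's own code: the function name hollerith is hypothetical, and escape characters inside the string are ignored here.

```python
def hollerith(text):
    """Return text as an nH constant: the character count, 'h', then the text."""
    return f"{len(text)}h{text}"


print(hollerith("x>100"))
```

The count is taken from the string itself, so it cannot drift out of step when the message is edited.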

      Ratfor is a free-form language: statements may appear anywhere on a line, and several may appear on one line if they are separated by semicolons. The example above could also be written as

if (x > 100) {
    call error("x>100")
    err = 1
    return
}
In this case, no semicolon is needed at the end of each line because Ratfor assumes there is one statement per line unless told otherwise.

      Of course, if the statement that follows the if is a single statement (Ratfor or otherwise), no braces are needed:

if (y <= 0.0 & z <= 0.0)
    write(6, 20) y, z
No continuation need be indicated because the statement is clearly not finished on the first line. In general Ratfor continues lines when it seems obvious that they are not yet done. (The continuation convention is discussed in detail later.)

      Although a free-form language permits wide latitude in formatting styles, it is wise to pick one that is readable, then stick to it. In particular, proper indentation is vital, to make the logical structure of the program obvious to the reader.

The ``else'' Clause

      Ratfor provides an else statement to handle the construction ``if a condition is true, do this thing, otherwise do that thing.''

if (a <= b)
    { sw = 0; write(6, 1) a, b }
else
    { sw = 1; write(6, 1) b, a }
This writes out the smaller of a and b, then the larger, and sets sw appropriately.

      The Fortran equivalent of this code is circuitous indeed:

      if (a .gt. b) goto 10
      sw = 0
      write(6, 1) a, b
      goto 20
10    sw = 1
      write(6, 1) b, a
20    ...
This is a mechanical translation; shorter forms exist, as they do for many similar situations. But all translations suffer from the same problem: since they are translations, they are less clear and understandable than code that is not a translation. To understand the Fortran version, one must scan the entire program to make sure that no other statement branches to statements 10 or 20 before one knows that indeed this is an if-else construction. With the Ratfor version, there is no question about how one gets to the parts of the statement. The if-else is a single unit, which can be read, understood, and ignored if not relevant. The program says what it means.
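To make the word ``mechanical'' concrete, here is a small Python sketch of one way such a translation could be generated. It is an assumption for illustration, not a description of the actual preprocessor: the function name, the label numbers, and the choice to negate the condition with .not. (so that the condition text never has to be understood) are all ours.

```python
def translate_if_else(condition, then_part, else_part, else_label=10, end_label=20):
    """Return fixed-form Fortran lines for: if (condition) then_part else else_part.

    The condition is copied through untouched and negated with .not., so the
    translator needs no knowledge of Fortran expressions. Label numbers are
    arbitrary illustrative defaults.
    """
    body = " " * 6                                   # statements begin in column 7
    lines = [f"{body}if (.not. ({condition})) goto {else_label}"]
    lines += [body + stmt for stmt in then_part]
    lines.append(f"{body}goto {end_label}")
    lines.append(f"{else_label:<5} continue")        # label occupies columns 1-5
    lines += [body + stmt for stmt in else_part]
    lines.append(f"{end_label:<5} continue")
    return lines


for line in translate_if_else("a .le. b",
                              ["sw = 0", "write(6, 1) a, b"],
                              ["sw = 1", "write(6, 1) b, a"]):
    print(line)
```

The output has the same shape as the hand translation above, which is the point: the reader of the Fortran must reconstruct the if-else from the branches, while the reader of the Ratfor sees it directly.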

      As before, if the statement following an if or an else is a single statement, no braces are needed:

if (a <= b)
    sw = 0
else
    sw = 1

      The syntax of the if statement is

if (legal Fortran condition)
    Ratfor statement
else
    Ratfor statement
where the else part is optional. The legal Fortran condition is anything that can legally go into a Fortran Logical IF. Ratfor does not check this clause, since it does not know enough Fortran to know what is permitted. The Ratfor statement is any Ratfor or Fortran statement, or any collection of them in braces.

Nested if's

      Since the statement that follows an if or an else can be any Ratfor statement, this leads immediately to the possibility of another if or else. As a useful example, consider this problem: the variable f is to be set to -1 if x is less than zero, to +1 if x is greater than 100, and to 0 otherwise. Then in Ratfor, we write

if (x < 0)
    f = -1
else if (x > 100)
    f = +1
else
    f = 0
Here the statement after the first else is another if-else. Logically it is just a single statement, although it is rather complicated.

      This code says what it means. Any version written in straight Fortran will necessarily be indirect because Fortran does not let you say what you mean. And as always, clever shortcuts may turn out to be too clever to understand a year from now.

      Following an else with an if is one way to write a multi-way branch in Ratfor. In general the structure

if (...)
    - - -
else if (...)
    - - -
else if (...)
    - - -
...
else
    - - -
provides a way to specify the choice of exactly one of several alternatives. (Ratfor also provides a switch statement which does the same job in certain special cases; in more general situations, we have to make do with spare parts.) The tests are laid out in sequence, and each one is followed by the code associated with it. Read down the list of decisions until one is found that is satisfied. The code associated with this condition is executed, and then the entire structure is finished. The trailing else part handles the ``default'' case, where none of the other conditions apply. If there is no default action, this final else part is omitted:
if (x < 0)
    x = 0
else if (x > 100)
    x = 100

if-else ambiguity

      There is one thing to notice about complicated structures involving nested if's and else's. Consider

if (x > 0)
    if (y > 0)
        write(6, 1) x, y
    else
        write(6, 2) y
There are two if's and only one else. Which if does the else go with?

      This is a genuine ambiguity in Ratfor, as it is in many other programming languages. The ambiguity is resolved in Ratfor (as elsewhere) by saying that in such cases the else goes with the closest previous un-else'ed if. Thus in this case, the else goes with the inner if, as we have indicated by the indentation.

      It is a wise practice to resolve such cases by explicit braces, just to make your intent clear. In the case above, we would write

if (x > 0) {
    if (y > 0)
        write(6, 1) x, y
    else
        write(6, 2) y
}
which does not change the meaning, but leaves no doubt in the reader's mind. If we want the other association, we must write
if (x > 0) {
    if (y > 0)
        write(6, 1) x, y
}
else
    write(6, 2) y

The ``switch'' Statement

      The switch statement provides a clean way to express multi-way branches which branch on the value of some integer-valued expression. The syntax is

switch (expression) {
    case expr1 :
        statements
    case expr2, expr3 :
        statements
    ...
    default:
        statements
}

      Each case is followed by a list of comma-separated integer expressions. The expression inside switch is compared against the case expressions expr1, expr2, and so on in turn until one matches, at which time the statements following that case are executed. If no cases match expression, and there is a default section, the statements with it are done; if there is no default, nothing is done. In all situations, as soon as some block of statements is executed, the entire switch is exited immediately. (Readers familiar with C[4] should beware that this behavior is not the same as the C switch.)
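The matching rule can be modelled in a few lines of Python. This is a sketch of the semantics described above, not of how Ratfor translates a switch; the function name and the representation of cases as (values, action) pairs are assumptions made for the illustration.

```python
def ratfor_switch(value, cases, default=None):
    """Model the switch: run the first case whose value list contains value.

    cases is a list of (values, action) pairs tried in order. At most one
    action runs and nothing falls through to the next case. If no case
    matches, the default action runs when there is one; otherwise nothing
    is done and None is returned.
    """
    for values, action in cases:
        if value in values:
            return action()
    if default is not None:
        return default()
    return None


cases = [((1,), lambda: "one"), ((2, 3), lambda: "two or three")]
print(ratfor_switch(3, cases))
print(ratfor_switch(9, cases, default=lambda: "something else"))
```

A C programmer should note that no break is needed after a case: leaving the switch after one block is the only behavior.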

The ``do'' Statement

      The do statement in Ratfor is quite similar to the DO statement in Fortran, except that it uses no statement number. The statement number, after all, serves only to mark the end of the DO, and this can be done just as easily with braces. Thus

do i = 1, n {
    x(i) = 0.0
    y(i) = 0.0
    z(i) = 0.0
}
is the same as
      do 10 i = 1, n
      x(i) = 0.0
      y(i) = 0.0
      z(i) = 0.0
10    continue
The syntax is:
do legal-Fortran-DO-text
    Ratfor statement
The part that follows the keyword do has to be something that can legally go into a Fortran DO statement. Thus if a local version of Fortran allows DO limits to be expressions (which is not currently permitted in ANSI Fortran), they can be used in a Ratfor do.

      The Ratfor statement part will often be enclosed in braces, but as with the if, a single statement need not have braces around it. This code sets an array to zero:

do i = 1, n
    x(i) = 0.0
Slightly more complicated,
do i = 1, n
    do j = 1, n
        m(i, j) = 0
sets the entire array m to zero, and
do i = 1, n
    do j = 1, n
        if (i < j)
            m(i, j) = -1
        else if (i == j)
            m(i, j) = 0
        else
            m(i, j) = +1
sets the upper triangle of m to -1, the diagonal to zero, and the lower triangle to +1. (The operator == is ``equals'', that is, ``.EQ.''.) In each case, the statement that follows the do is logically a single statement, even though complicated, and thus needs no braces.

``break'' and ``next''

      Ratfor provides a statement for leaving a loop early, and one for beginning the next iteration. break causes an immediate exit from the do; in effect it is a branch to the statement after the do. next is a branch to the bottom of the loop, so it causes the next iteration to be done. For example, this code skips over negative values in an array:

do i = 1, n {
    if (x(i) < 0.0)
        next
    process positive element
}
break and next also work in the other Ratfor looping constructions that we will talk about in the next few sections.

      break and next can be followed by an integer to indicate breaking or iterating that level of enclosing loop; thus

break 2
exits from two levels of enclosing loops, and break 1 is equivalent to break. next 2 iterates the second enclosing loop. (Realistically, multi-level break's and next's are not likely to be much used because they lead to code that is hard to understand and somewhat risky to change.)

The ``while'' Statement

      One of the problems with the Fortran DO statement is that it generally insists upon being done once, regardless of its limits. If a loop begins

DO I = 2, 1
this will typically be done once with I set to 2, even though common sense would suggest that perhaps it shouldn't be. Of course a Ratfor do can easily be preceded by a test
if (j <= k)
    do i = j, k {
        - - -
    }
but this has to be a conscious act, and is often overlooked by programmers.

      A more serious problem with the DO statement is that it encourages that a program be written in terms of an arithmetic progression with small positive steps, even though that may not be the best way to write it. If code has to be contorted to fit the requirements imposed by the Fortran DO, it is that much harder to write and understand.

      To overcome these difficulties, Ratfor provides a while statement, which is simply a loop: ``while some condition is true, repeat this group of statements''. It has no preconceptions about why one is looping. For example, this routine to compute sin(x) by the Maclaurin series combines two termination criteria.

real function sin(x, e)
    # returns sin(x) to accuracy e, by
    # sin(x) = x - x**3/3! + x**5/5! - ...

    sin = x
    term = x

    i = 3
    while (abs(term)>e & i<100) {
        term = -term * x**2 / float(i*(i-1))
        sin = sin + term
        i = i + 2
    }

    return
    end

      Notice that if the routine is entered with term already smaller than e, the loop will be done zero times, that is, no attempt will be made to compute x**3 and thus a potential underflow is avoided. Since the test is made at the top of a while loop instead of the bottom, a special case disappears: the code works at one of its boundaries. (The test i<100 is the other boundary, making sure the routine stops after some maximum number of iterations.)
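The boundary behavior can be checked by transcribing the routine into a language that is easy to run. The Python below is a direct transcription of the loop above with one addition that is ours, not the routine's: a counter named passes, returned alongside the result so that the zero-iteration case is visible.

```python
import math


def ratfor_sin(x, e):
    """Transcription of the Ratfor sin routine; returns (value, loop passes)."""
    total = x
    term = x
    i = 3
    passes = 0                     # added for illustration only
    while abs(term) > e and i < 100:
        term = -term * x**2 / float(i * (i - 1))
        total = total + term
        i = i + 2
        passes = passes + 1
    return total, passes


value, passes = ratfor_sin(0.5, 1e-9)
print(value, math.sin(0.5), passes)
print(ratfor_sin(0.001, 0.01))     # term is already below e: zero passes
```

In the second call the first term is already smaller than e, so the body never runs and x itself is returned.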

      As an aside, a sharp character ``#'' in a line marks the beginning of a comment; the rest of the line is comment. Comments and code can co-exist on the same line, so one can make marginal remarks, which is not possible with Fortran's ``C in column 1'' convention. Blank lines are also permitted anywhere (they are not in Fortran); they should be used to emphasize the natural divisions of a program.

      The syntax of the while statement is

while (legal Fortran condition)
    Ratfor statement
As with the if, legal Fortran condition is something that can go into a Fortran Logical IF, and Ratfor statement is a single statement, which may be multiple statements in braces.

      The while encourages a style of coding not normally practiced by Fortran programmers. For example, suppose nextch is a function which returns the next input character both as a function value and in its argument. Then a loop to find the first non-blank character is just

while (nextch(ich) == iblank)
    ;
A semicolon by itself is a null statement, which is necessary here to mark the end of the while; if it were not present, the while would control the next statement. When the loop is broken, ich contains the first non-blank. Of course the same code can be written in Fortran as
100 if (nextch(ich) .eq. iblank) goto 100
but many Fortran programmers (and a few compilers) believe this line is illegal. The language at one's disposal strongly influences how one thinks about a problem.

The ``for'' Statement

      The for statement is another Ratfor loop, which attempts to carry the separation of loop-body from reason-for-looping a step further than the while. A for statement allows explicit initialization and increment steps as part of the statement. For example, a DO loop is just

for (i = 1; i <= n; i = i + 1) ...
This is equivalent to
i = 1
while (i <= n) {
    ...
    i = i + 1
}
The initialization and increment of i have been moved into the for statement, making it easier to see at a glance what controls the loop.

      The for and while versions have the advantage that they will be done zero times if n is less than 1; this is not true of the do.

      The loop of the sine routine in the previous section can be re-written with a for as

for (i=3; abs(term) > e & i < 100; i=i+2) {
    term = -term * x**2 / float(i*(i-1))
    sin = sin + term
}

      The syntax of the for statement is

for ( init ; condition ; increment )
    Ratfor statement
init is any single Fortran statement, which gets done once before the loop begins. increment is any single Fortran statement, which gets done at the end of each pass through the loop, before the test. condition is again anything that is legal in a logical IF. Any of init, condition, and increment may be omitted, although the semicolons must always be present. A non-existent condition is treated as always true, so for(;;) is an indefinite repeat. (But see the repeat-until in the next section.)

      The for statement is particularly useful for backward loops, chaining along lists, loops that might be done zero times, and similar things which are hard to express with a DO statement, and obscure to write out with IF's and GOTO's. For example, here is a backwards DO loop to find the last non-blank character on a card:

for (i = 80; i > 0; i = i - 1)
    if (card(i) != blank)
        break
(``!='' is the same as ``.NE.''.) The code scans the columns from 80 through to 1. If a non-blank is found, the loop is immediately broken. (break and next work in for's and while's just as in do's.) If i reaches zero, the card is all blank.

      This code is rather nasty to write with a regular Fortran DO, since the loop must go forward, and we must explicitly set up proper conditions when we fall out of the loop. (Forgetting this is a common error.) Thus:

      DO 10 J = 1, 80
      I = 81 - J
      IF (CARD(I) .NE. BLANK) GO TO 11
10    CONTINUE
      I = 0
11    ...
The version that uses the for handles the termination condition properly for free; i is zero when we fall out of the for loop.

      The increment in a for need not be an arithmetic progression; the following program walks along a list (stored in an integer array ptr) until a zero pointer is found, adding up elements from a parallel array of values:

sum = 0.0
for (i = first; i > 0; i = ptr(i))
    sum = sum + value(i)
Notice that the code works correctly if the list is empty. Again, placing the test at the top of a loop instead of the bottom eliminates a potential boundary error.
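For readers who want to try the empty-list case, here is the same loop transcribed into Python as a sketch. The function name chain_sum and the sample data are ours; index 0 of each list is left unused so that the subscripts match the 1-based Fortran arrays.

```python
def chain_sum(first, ptr, value):
    """Add up value[i] along the chain first -> ptr[first] -> ... until 0."""
    total = 0.0
    i = first
    while i > 0:                   # test at the top: an empty list does nothing
        total = total + value[i]
        i = ptr[i]
    return total


# Hypothetical data: the chain is 1 -> 3 -> 2 -> end.
ptr = [0, 3, 0, 2]
value = [0.0, 1.5, 2.5, 4.0]

print(chain_sum(1, ptr, value))    # sums elements 1, 3 and 2
print(chain_sum(0, ptr, value))    # empty list
```

With first equal to zero the test fails immediately, so no element is touched and the sum stays at zero.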

The ``repeat-until'' statement

      In spite of the dire warnings, there are times when one really needs a loop that tests at the bottom after one pass through. This service is provided by the repeat-until:

repeat
    Ratfor statement
until (legal Fortran condition)
The Ratfor statement part is done once, then the condition is evaluated. If it is true, the loop is exited; if it is false, another pass is made.

      The until part is optional, so a bare repeat is the cleanest way to specify an infinite loop. Of course such a loop must ultimately be broken by some transfer of control such as stop, return, or break, or an implicit stop such as running out of input with a READ statement.

      As a matter of observed fact[8], the repeat-until statement is much less used than the other looping constructions; in particular, it is typically outnumbered ten to one by for and while. Be cautious about using it, for loops that test only at the bottom often don't handle null cases well.

More on break and next

      break exits immediately from do, while, for, and repeat-until. next goes to the test part of do, while and repeat-until, and to the increment step of a for.
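The rule for the for statement deserves a second look, because a next must not skip the increment. The Python model below is a sketch under our own assumptions: the helper ratfor_for and the convention that the loop body reports "break", "next", or nothing are invented for the illustration and are not part of Ratfor.

```python
def ratfor_for(state, condition, increment, body):
    """Model for (init; condition; increment) body, with state already initialized.

    body(state) returns "break" to leave the loop, or "next" or None to go on.
    Returns the value of the loop variable when the loop is left.
    """
    while condition(state):
        if body(state) == "break":
            break
        # A normal pass and a "next" both arrive here, at the increment step.
        state = increment(state)
    return state


def last_nonblank(card):
    """Column (1-based) of the last non-blank character, or 0 if all blank."""
    return ratfor_for(len(card),
                      lambda i: i > 0,
                      lambda i: i - 1,
                      lambda i: "break" if card[i - 1] != " " else None)


print(last_nonblank("ab  "))
print(last_nonblank("    "))
```

Because the increment is reached after a next, a body that skips some elements still advances the loop variable, which is exactly what a hand-written while with the increment at the bottom would get wrong.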

``return'' Statement

      The standard Fortran mechanism for returning a value from a function uses the name of the function as a variable which can be assigned to; the last value stored in it is the function value upon return. For example, here is a routine equal which returns 1 if two arrays are identical, and zero if they differ. The array ends are marked by the special value -1.

# equal - compare str1 to str2;
#    return 1 if equal, 0 if not
integer function equal(str1, str2)
    integer str1(100), str2(100)
    integer i

    for (i = 1; str1(i) == str2(i); i = i + 1)
        if (str1(i) == -1) {
            equal = 1
            return
        }
    equal = 0
    return
    end

      In many languages (e.g., PL/I) one instead says

return (expression)
to return a value from a function. Since this is often clearer, Ratfor provides such a return statement: in a function F, return(expression) is equivalent to
{ F = expression; return }
For example, here is equal again:
# equal - compare str1 to str2;
#    return 1 if equal, 0 if not
integer function equal(str1, str2)
    integer str1(100), str2(100)
    integer i

    for (i = 1; str1(i) == str2(i); i = i + 1)
        if (str1(i) == -1)
            return(1)
    return(0)
    end
If there is no parenthesized expression after return, a normal RETURN is made. (Another version of equal is presented shortly.)

Cosmetics

      As we said above, the visual appearance of a language has a substantial effect on how easy it is to read and understand programs. Accordingly, Ratfor provides a number of cosmetic facilities which may be used to make programs more readable.

Free-form Input

      Statements can be placed anywhere on a line; long statements are continued automatically, as are long conditions in if, while, for, and until. Blank lines are ignored. Multiple statements may appear on one line, if they are separated by semicolons. No semicolon is needed at the end of a line, if Ratfor can make some reasonable guess about whether the statement ends there. Lines ending with any of the characters

= + - * , | & ( _
are assumed to be continued on the next line. Underscores are discarded wherever they occur; all others remain as part of the statement.
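The trailing-character rule is simple enough to sketch in Python. The fragment below covers only that rule; it does not model the automatic continuation of conditions in if, while, for, and until, it takes no account of quoted strings or comments, and the function names are hypothetical.

```python
CONTINUATION_CHARS = "=+-*,|&(_"


def is_continued(line):
    """True if the last non-blank character says the statement is unfinished."""
    stripped = line.rstrip()
    return stripped != "" and stripped[-1] in CONTINUATION_CHARS


def join_continued(lines):
    """Glue continued lines together; blank lines are ignored."""
    statements = []
    parts = []
    for line in lines:
        if not line.strip():
            continue
        parts.append(line.strip())
        if not is_continued(line):
            # Underscores are discarded; the other characters stay.
            statements.append(" ".join(parts).replace("_", ""))
            parts = []
    if parts:                      # input ended in mid-statement
        statements.append(" ".join(parts).replace("_", ""))
    return statements


print(join_continued(["x = a +", "    b", "", "y = 1"]))
```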

      Any statement that begins with an all-numeric field is assumed to be a Fortran label, and placed in columns 1-5 upon output. Thus

write(6, 100); 100 format("hello")
is converted into
      write(6, 100)
100   format(5hhello)
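The column placement can be sketched as follows. This Python fragment is an illustration only: the function name is ours, labels longer than five digits are not handled, and the input is assumed to have had its quoted strings converted already.

```python
import re


def place_statement(stmt):
    """Put a leading all-numeric field in columns 1-5 and the statement in column 7."""
    match = re.match(r"\s*(\d+)\s+(.*)", stmt)
    if match:
        label, rest = match.groups()
        return f"{label:<5} {rest}"
    return " " * 6 + stmt.strip()


# Naive split on semicolons, adequate for this example only.
for piece in "write(6, 100); 100 format(5hhello)".split(";"):
    print(place_statement(piece))
```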

Translation Services

      Text enclosed in matching single or double quotes is converted to nH... but is otherwise unaltered (except for formatting: it may get split across card boundaries during the reformatting process). Within quoted strings, the backslash `\' serves as an escape character: the next character is taken literally. This provides a way to get quotes (and of course the backslash itself) into quoted strings:

"\\\'"
is a string containing a backslash and an apostrophe. (This is not the standard convention of doubled quotes, but it is easier to use and more general.)
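The escape rule can be traced with a short Python sketch. The function name unescape is hypothetical, and the treatment of a lone backslash at the very end of a string (kept as is) is our assumption, since the text does not say.

```python
def unescape(body):
    """Apply the backslash rule to the text between the quotes."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\" and i + 1 < len(body):
            i += 1                 # skip the backslash ...
            ch = body[i]           # ... and take the next character literally
        out.append(ch)
        i += 1
    return "".join(out)


# The four characters between the quotes in the example above.
print(unescape(r"\\\'"))
```

The first backslash protects the second, and the third protects the apostrophe, leaving a two-character string.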

      Any line that begins with the character `%' is left absolutely unaltered except for stripping off the `%' and moving the line one position to the left. This is useful for inserting control cards, and other things that should not be transmogrified (like an existing Fortran program). Use `%' only for ordinary statements, not for the condition parts of if, while, etc., or the output may come out in an unexpected place.

      The following character translations are made, except within single or double quotes or on a line beginning with a `%'.

==    .eq.        !=    .ne.
>     .gt.        >=    .ge.
<     .lt.        <=    .le.
&     .and.       |     .or.
!     .not.       ^     .not.
In addition, the following translations are provided for input devices with restricted character sets.
[     {           ]     }
$(    {           $)    }
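A table-driven translator along these lines might look like the following Python sketch. It is an illustration under stated assumptions, not the preprocessor's code: the names are ours, comments introduced by ``#'' are not treated specially, and the only protections modelled are the two named above (quoted strings and lines beginning with `%').

```python
import re

OPERATORS = {
    "==": ".eq.", "!=": ".ne.", ">=": ".ge.", "<=": ".le.",
    ">": ".gt.", "<": ".lt.", "&": ".and.", "|": ".or.",
    "!": ".not.", "^": ".not.",
    "[": "{", "]": "}", "$(": "{", "$)": "}",
}

DOUBLE = r'"(?:\\.|[^"\\])*"'      # double-quoted string, backslash escapes
SINGLE = r"'(?:\\.|[^'\\])*'"      # single-quoted string, backslash escapes

# Longer operators come first so that "!=" is not read as "!" followed by "=".
_ops = sorted(OPERATORS, key=len, reverse=True)
TOKEN = re.compile(DOUBLE + "|" + SINGLE + "|("
                   + "|".join(re.escape(op) for op in _ops) + ")")


def translate(line):
    """Translate operators outside quotes; pass '%' lines through untouched."""
    if line.startswith("%"):
        return line[1:]            # strip the '%' and shift left one position
    return TOKEN.sub(
        lambda m: OPERATORS[m.group(1)] if m.group(1) else m.group(0), line)


print(translate("if (x > 100 & y != 0)"))
print(translate('call error("x>100")'))
```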

``define'' Statement

      Any string of alphanumeric characters can be defined as a name; thereafter, whenever that name occurs in the input (delimited by non-alphanumerics) it is replaced by the rest of the definition line. (Comments and trailing white spaces are stripped off). A defined name can be arbitrarily long, and must begin with a letter.

      define is typically used to create symbolic parameters:

define ROWS 100
define COLS 50
dimension a(ROWS), b(ROWS, COLS)
if (i > ROWS | j > COLS) ...
Alternatively, definitions may be written as
define(ROWS, 100)
In this case, the defining text is everything after the comma up to the balancing right parenthesis; this allows multi-line definitions.
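The replacement rule for the one-line form can be sketched in Python. This is a simplified illustration with hypothetical function names: it handles only the ``define name text'' form, does not rescan replaced text, and does not skip quoted strings.

```python
import re

# A name: a letter followed by letters or digits, delimited by non-alphanumerics.
NAME = re.compile(r"(?<![A-Za-z0-9])[A-Za-z][A-Za-z0-9]*")


def parse_define(line):
    """Split 'define NAME text' into (NAME, text), dropping comments and trailing blanks."""
    _, name, text = line.split(None, 2)
    return name, text.split("#", 1)[0].rstrip()


def expand(line, definitions):
    """Replace each defined name in line by its defining text."""
    return NAME.sub(lambda m: definitions.get(m.group(0), m.group(0)), line)


definitions = dict(parse_define(d) for d in
                   ["define ROWS 100", "define COLS 50   # columns"])
print(expand("dimension a(ROWS), b(ROWS, COLS)", definitions))
```

Because a name must be delimited by non-alphanumerics, an identifier such as ROWS2 is left alone even though it begins with a defined name.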

      It is generally a wise practice to use symbolic parameters for most constants, to help make clear the function of what would otherwise be mysterious numbers. As an example, here is the routine equal again, this time with symbolic constants.

define YES 1
define NO 0
define EOS -1
define ARB 100

# equal - compare str1 to str2;
#    return YES if equal, NO if not
integer function equal(str1, str2)
    integer str1(ARB), str2(ARB)
    integer i

    for (i = 1; str1(i) == str2(i); i = i + 1)
        if (str1(i) == EOS)
            return(YES)
    return(NO)
    end

``include'' Statement

      The statement

include file
inserts the file found on input stream file into the Ratfor input in place of the include statement. The standard usage is to place COMMON blocks on a file, and include that file whenever a copy is needed:
subroutine x
    include commonblocks
    ...
    end

subroutine y
    include commonblocks
    ...
    end
This ensures that all copies of the COMMON blocks are identical.

Pitfalls, Botches, Blemishes and other Failings

      Ratfor catches certain syntax errors, such as missing braces, else clauses without an if, and most errors involving missing parentheses in statements. Beyond that, since Ratfor knows no Fortran, any errors you make will be reported by the Fortran compiler, so you will from time to time have to relate a Fortran diagnostic back to the Ratfor source.

      Keywords are reserved: using if, else, etc., as variable names will typically wreak havoc. Don't leave spaces in keywords. Don't use the Arithmetic IF.

      The Fortran nH convention is not recognized anywhere by Ratfor; use quotes instead.