Each file in the Pascal environment is represented by a pointer to a files structure in the heap. At the location addressed by the pointer is the element in the file's window variable. Behind this window variable is information about the file, at the following offsets:
    Offset  Name    Description
    -108    FNAME   Text name of associated UNIX file
    -30     LCOUNT  Current count of lines output
    -26     LLIMIT  Maximum number of lines permitted
    -22     FBUF    UNIX FILE pointer
    -18     FCHAIN  Chain to next file
    -14     FLEV    Pointer to associated file variable
    -10     PFNAME  Pointer to name of file for error messages
    -6      FUNIT   File status flags
    -4      FSIZE   Size of elements in the file
    0               File window element
Here FBUF is a pointer to the system FILE block for the file. The standard system I/O library is used; it provides block buffered input/output, with 1024 characters normally transferred at each read or write.
The files in the Pascal environment are all linked together on a single file chain through the FCHAIN links. For each file, the FLEV pointer gives its associated file variable. These are used to free files at block exit, as described in section 3.3 below.
FNAME and PFNAME give the associated file name for the file and the name to be used when printing error diagnostics, respectively. Although these names are usually the same, input and output usually have no associated file name, so the distinction is necessary.
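The offsets in the table above can be checked with a small C sketch. The structure name, the field types, and the 78-byte size of FNAME are assumptions inferred from the gaps between the listed offsets; pointers are held as 32-bit values, as on the VAX, so this is a model of the layout rather than a usable declaration on a 64-bit host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C view of the information kept behind the file window.
 * Byte packing is requested because LCOUNT starts at an offset that
 * is not a multiple of four. */
#pragma pack(push, 1)
struct iorec {
    char     fname[78];  /* -108: text name of associated UNIX file   */
    int32_t  lcount;     /*  -30: current count of lines output       */
    int32_t  llimit;     /*  -26: maximum number of lines permitted   */
    uint32_t fbuf;       /*  -22: UNIX FILE pointer (32-bit model)    */
    uint32_t fchain;     /*  -18: chain to next file                  */
    uint32_t flev;       /*  -14: pointer to associated file variable */
    uint32_t pfname;     /*  -10: pointer to name for error messages  */
    uint16_t funit;      /*   -6: file status flags                   */
    int32_t  fsize;      /*   -4: size of elements in the file        */
    /* offset 0: the file window element follows the record */
};
#pragma pack(pop)

/* Offset of a field relative to the window, as listed in the table. */
#define WINDOW_OFFSET(field) \
    ((long)offsetof(struct iorec, field) - (long)sizeof(struct iorec))
```

With these assumed sizes, WINDOW_OFFSET(fname) evaluates to -108 and WINDOW_OFFSET(funit) to -6, matching the table.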
The FUNIT word contains a set of flags whose representations are:
    Flag    Value   Meaning
    EOF     0x0100  At end-of-file
    EOLN    0x0200  At end-of-line (text files only)
    SYNC    0x0400  File window is out of sync
    TEMP    0x0800  File is temporary
    FREAD   0x1000  File is open for reading
    FWRITE  0x2000  File is open for writing
    FTEXT   0x4000  File is a text file; process EOLN
    FDEF    0x8000  File structure created, but file not opened
The EOF and EOLN bits here reflect the associated built-in function values. TEMP specifies that the file has a generated temporary name and that it should therefore be removed when its block exits. FREAD and FWRITE specify that reset and rewrite respectively have been done on the file so that input or output operations can be done. FTEXT specifies the file is a text file so that EOLN processing should be done, with newline characters turned into blanks, etc.
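As an illustration, the flag values can be written as C constants and tested with bit masks. The F_ prefix (needed because EOF is already a macro in stdio.h) and the two helper functions are hypothetical names chosen for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Flag values taken from the table; the F_ prefix is an assumption. */
enum {
    F_EOF    = 0x0100,  /* at end-of-file                        */
    F_EOLN   = 0x0200,  /* at end-of-line (text files only)      */
    F_SYNC   = 0x0400,  /* file window is out of sync            */
    F_TEMP   = 0x0800,  /* file is temporary                     */
    F_FREAD  = 0x1000,  /* file is open for reading              */
    F_FWRITE = 0x2000,  /* file is open for writing              */
    F_FTEXT  = 0x4000,  /* text file; process EOLN               */
    F_FDEF   = 0x8000   /* structure created, file not opened    */
};

/* Nonzero when the window holds no usable image yet. */
int window_is_stale(uint16_t funit)
{
    return (funit & F_SYNC) != 0;
}

/* Nonzero when rewrite has been done on the file. */
int is_open_for_output(uint16_t funit)
{
    return (funit & F_FWRITE) != 0;
}
```

For example, a text file that has been reset but whose window has not yet been filled would carry F_FTEXT | F_FREAD | F_SYNC, that is 0x5400.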
The SYNC bit, when true, specifies that there is no usable image in the file buffer window. As discussed in the Berkeley Pascal User's Manual, the interactive environment necessitates having ``input^'' undefined at the beginning of execution so that a program may print a prompt before the user is required to type input. The SYNC bit implements this. When it is set, it specifies that the element in the window must be updated before it can be used. This is never done until necessary.
All the variables in the Pascal runtime environment are cleared to zero on block entry. This is necessary for simple processing of files. If a file is unused, its pointer will be nil. All references to an inactive file are thus references through a nil pointer. If the Pascal system did not clear storage to zero before execution it would not be possible to detect inactive files in this simple way; it would probably be necessary to generate (possibly complicated) code to initialize each file on block entry.
When a file is first mentioned in a reset or rewrite call, a buffer of the form described above is associated with it, and the necessary information about the file is placed in this buffer. The file is also linked into the active file chain. This chain is kept sorted by block mark address, using the FLEV entries.
When block exit occurs the interpreter must free the files that are in use in the block and their associated buffers. This is simple and efficient because the files in the active file chain are sorted by increasing block mark address. This means that the files for the current block will be at the front of the chain. For each file that is no longer accessible the interpreter first flushes the file's buffer if it is an output file. The interpreter then returns the file buffer and the files structure and window to the free space in the heap and removes the file from the active file chain.
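The sorted chain and the block-exit cleanup can be sketched in C as follows. This is a minimal model, not the interpreter's code: the structure, the function names, and the use of a numeric limit to stand for the address below which file variables are no longer accessible are all assumptions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define F_FWRITE 0x2000u

/* Hypothetical, reduced file record: only the fields used here. */
struct file_rec {
    struct file_rec *fchain;  /* next file on the chain            */
    uintptr_t        flev;    /* address of the file variable      */
    unsigned         funit;   /* status flags                      */
    FILE            *fbuf;    /* stdio stream, or NULL in a test   */
};

/* Allocate a zeroed record for a file variable at address flev. */
struct file_rec *file_new(uintptr_t flev)
{
    struct file_rec *f = calloc(1, sizeof *f);
    assert(f != NULL);
    f->flev = flev;
    return f;
}

/* Link f into the chain, keeping increasing FLEV order. */
void chain_insert(struct file_rec **head, struct file_rec *f)
{
    while (*head != NULL && (*head)->flev < f->flev)
        head = &(*head)->fchain;
    f->fchain = *head;
    *head = f;
}

/* Free every file whose variable lies below limit. Because the chain
 * is sorted, these are all at the front. Returns the number freed. */
int chain_release_below(struct file_rec **head, uintptr_t limit)
{
    int released = 0;
    while (*head != NULL && (*head)->flev < limit) {
        struct file_rec *f = *head;
        *head = f->fchain;               /* unlink from the chain   */
        if (f->fbuf != NULL) {
            if (f->funit & F_FWRITE)
                fflush(f->fbuf);         /* flush output files      */
            fclose(f->fbuf);
        }
        free(f);                         /* return record to heap   */
        released++;
    }
    return released;
}
```

The loop stops at the first record that is still accessible, so the cost of a block exit is proportional to the number of files being freed rather than to the length of the chain.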
Flushing all the file buffers at abnormal termination, or on a call to the procedure flush or message, is done by flushing each file on the file chain that has the FWRITE bit set in its flags word.
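A sketch of this walk over the file chain, under the assumption of a reduced record with hypothetical field names, might look like this:

```c
#include <assert.h>
#include <stdio.h>

#define F_FREAD  0x1000u
#define F_FWRITE 0x2000u

/* Hypothetical, reduced file record: only the fields used here. */
struct file_rec {
    struct file_rec *fchain;  /* next file on the chain */
    unsigned         funit;   /* status flags           */
    FILE            *fbuf;    /* stdio stream           */
};

/* Flush each file on the chain that is open for writing.
 * Returns the number of files flushed. */
int flush_all(struct file_rec *head)
{
    int flushed = 0;
    for (; head != NULL; head = head->fchain) {
        if ((head->funit & F_FWRITE) && head->fbuf != NULL) {
            fflush(head->fbuf);
            flushed++;
        }
    }
    return flushed;
}
```

Files that are open only for reading are skipped, since they have nothing buffered for output.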
For input-output, px maintains a notion of an active file. Each operation that references a file makes the file it will be using the active file and then does its operation. A subtle point here is that one may do a procedure call to write that involves a call to a function that references another file, thereby destroying the active file set up before the write. Thus the active file is saved at block entry in the block mark and restored at block exit.**
** It would probably be better to dispense with the notion of active file and use another mechanism that did not involve extra overhead on each procedure and function call.
Files in Pascal can be used in two distinct ways: as the object of read, write, get, and put calls, or indirectly as though they were pointers. When a file is used as a pointer, care must be taken not to destroy the active file in a reference such as write(output, input^), or the system would incorrectly write on the input device.
The fundamental operator related to the use of a file is FNIL. This takes the file variable, as a pointer, ensures that the pointer is not nil, and also that a usable image is in the file window, by forcing the SYNC bit to be cleared.
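The behavior described for FNIL, together with the lazy filling of the window controlled by SYNC, can be modeled in C for a file of characters. The structure, the function name, and the choice to return NULL for a nil file (where px would report a runtime error) are assumptions made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define F_EOF  0x0100u
#define F_SYNC 0x0400u

/* Hypothetical model of a file of char and its window. */
struct char_file {
    FILE    *fbuf;    /* underlying stdio stream    */
    unsigned funit;   /* status flags               */
    char     window;  /* the file window element    */
};

/* Return a pointer to a usable window, or NULL for a nil file.
 * The window is only read from the stream when SYNC is set, so
 * input is never requested before it is actually needed. */
char *fnil(struct char_file *f)
{
    if (f == NULL)
        return NULL;              /* px would raise an error here */
    if (f->funit & F_SYNC) {
        int c = getc(f->fbuf);
        if (c == EOF)
            f->funit |= F_EOF;
        else
            f->window = (char)c;
        f->funit &= ~F_SYNC;      /* window is now in sync        */
    }
    return &f->window;
}
```

A second call on the same file returns the same element without touching the stream, because SYNC has been cleared by the first call.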
A simple example that demonstrates the use of the file operators is given by
    writeln(f)
that produces
    RV:l    f
    UNIT
    WRITLN
[Figure 3.2 - Enumerated type conversion structure: offsets of element names, followed by an array of null terminated element names]
See the description of NAM in the next section for an example.
[Figure 3.3 - Boolean type conversion structure]
The code for NAM is
    _NAM:
        incl    lc
        addl3   (lc)+,ap,r6     #r6 points to scalar name list
        movl    (sp)+,r3        #r3 has data value
        cmpw    r3,(r6)+        #check value out of bounds
        bgequ   enamrng
        movzwl  (r6)[r3],r4     #r4 has string index
        pushab  (r6)[r4]        #push string pointer
        jmp     (loop)
    enamrng:
        movw    $ENAMRNG,_perrno
        jbr     error
The address of the table is calculated by adding the base address of the interpreter code, ap, to the offset pointed to by lc. The first word of the table gives the number of records and provides a range check of the data to be output. The pointer is then calculated as
    tblbase = ap + A;
    size = *tblbase++;
    return(tblbase + tblbase[value]);
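The lookup can be tried out with a runnable C sketch. The table contents for the boolean type, the structure used to lay it out, and the function name are assumptions; the offsets are taken to be byte offsets measured from the start of the offset array, which is how the indexed addressing in the assembly code reads.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical conversion table for the boolean type: a count,
 * one 16-bit offset per element, then the null terminated names. */
struct bool_names {
    uint16_t count;
    uint16_t offset[2];
    char     names[11];
};

const struct bool_names bool_table = {
    2,
    { 4, 10 },       /* byte offsets from the start of offset[] */
    "false\0true"    /* "false" at offset 4, "true" at offset 10 */
};

/* Return the name for value, or NULL when value is out of range
 * (where the interpreter would signal ENAMRNG). */
const char *nam_lookup(const void *table, uint32_t value)
{
    const unsigned char *base = table;
    uint16_t count, off;

    memcpy(&count, base, sizeof count);
    base += sizeof count;            /* step past the count word */
    if (value >= count)
        return NULL;
    memcpy(&off, base + value * sizeof off, sizeof off);
    return (const char *)base + off;
}
```

Here nam_lookup(&bool_table, 1) yields the string "true", and any value of 2 or more fails the range check.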
The uses of files and the file operations are summarized in an example which outputs a real variable (r) with a variable width field (i).
    writeln('r =',r:i,' ',true);
that generates the code
    UNITOUT
    FILE
    CON14:1
    CON14:3
    LVCON:4     "r ="
    WRITES
    RV8:l       r
    RV4:l       i
    MAX:8       1
    RV4:l       i
    MAX:1       1
    LVCON:8     " %*.*E"
    FILE
    WRITEF:6
    CONC4       ' '
    WRITEC
    CON14:1
    NAM         bool
    LVCON:4     "%s"
    FILE
    WRITEF:3
    WRITLN
Here the operator UNITOUT is an abbreviated form of the operator UNIT that is used when the file to be made active is output. A file descriptor, record count, string size, and a pointer to the constant string ``r ='' are pushed and then output by WRITES. Next the value of r is pushed on the stack and the precision size is calculated by taking seven less than the width, but not less than one. This is followed by the width that is reduced by one to leave space for the required leading blank. If the width is too narrow, it is expanded by fprintf. A pointer to the format string is pushed followed by a file descriptor and the operator WRITEF that prints out r. The value of six on WRITEF comes from two longs for r and a long each for the precision, width, format string pointer, and file descriptor. The operator CONC4 pushes the blank character onto a long on the stack that is then printed out by WRITEC. The internal representation for true is pushed as a long onto the stack and is then replaced by a pointer to the string ``true'' by the operator NAM using the table bool for conversion. This string is output by the operator WRITEF using the format string ``%s''. Finally the operator WRITLN appends a newline to the file.
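The width and precision arithmetic for r:i can be reproduced with the C library directly. This sketch assumes that MAX:8 1 means "subtract 8, but yield at least 1" and that MAX:1 1 means "subtract 1, but yield at least 1"; the function names are hypothetical, and snprintf stands in for the fprintf call made by WRITEF.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Larger of two integers. */
static int max_int(int a, int b)
{
    return a > b ? a : b;
}

/* Format r in a field of i columns in the style of r:i, writing
 * into buf. Returns the number of characters produced. */
int write_real(char *buf, size_t size, double r, int i)
{
    int precision = max_int(i - 8, 1);  /* assumed meaning of MAX:8 1 */
    int width     = max_int(i - 1, 1);  /* assumed meaning of MAX:1 1 */

    /* The leading blank in the format supplies the column that was
     * subtracted from the width. */
    return snprintf(buf, size, " %*.*E", width, precision, r);
}
```

With i = 12 the precision is 4 and the inner width is 11, so the result occupies exactly 12 columns. With a width that is too narrow, such as i = 3, the library expands the field, as the text notes for fprintf.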
    0 - character at a time buffering
    1 - line at a time buffering
    2 - block buffering
The default value is 1.
The three remaining file operations are FLUSH, which flushes the active file; REMOVE, which takes the pointer to a file name and removes the specified file; and MESSAGE, which flushes all the output files and sets the standard error file to be the active file.