This section describes the procedure for upgrading a 4.3BSD system to 4.4BSD. This procedure may vary according to the version of the system running before conversion. If you are converting from a System V system, some of this section will still apply (in particular, the filesystem conversion). However, many of the system configuration files are different, and the executable file formats are completely incompatible.
In particular, be wary when using this information to upgrade a 4.3BSD HP300 system. There are at least four different versions of ``4.3BSD'' out there:
If you are running 4.3BSD, upgrading your system involves replacing your kernel and system utilities. In general, there are three possible ways to install a new BSD distribution: (1) boot directly from the distribution tape, use it to load new binaries onto empty disks, and then merge or restore any existing configuration files and filesystems; (2) use an existing 4.3BSD or later system to extract the root and /usr filesystems from the distribution tape, boot from the new system, then merge or restore existing configuration files and filesystems; or (3) extract the sources from the distribution tape onto an existing system, and use that system to cross-compile and install 4.4BSD. For this release, the second alternative is strongly advised, with the third alternative reserved as a last resort. In general, older binaries will continue to run under 4.4BSD, but there are many exceptions that are on the critical path for getting the system running. Ideally, the new system binaries (root and /usr filesystems) should be installed on spare disk partitions, then site-specific files should be merged into them. Once the new system is up and fully merged, the previous root and /usr filesystems can be reused. Other existing filesystems can be retained and used, except that (as usual) the new fsck should be run before they are mounted.
It is STRONGLY advised that you make full dumps of each filesystem before beginning, especially any that you intend to modify in place during the merge. It is also desirable to run filesystem checks of all filesystems to be converted to 4.4BSD before shutting down. This is an excellent time to review your disk configuration for possible tuning of the layout. Most systems will need to provide a new filesystem for system use mounted on /var (see below). However, the /tmp filesystem can be an MFS virtual-memory-resident filesystem, potentially freeing an existing disk partition. (Additional swap space may be desirable as a consequence.) See mount_mfs(8).
The recommended installation procedure includes the following steps. The order of these steps will probably vary according to local needs.
Section 3.2 lists the files to be saved as part of the conversion process. Section 3.3 describes the bootstrap process. Section 3.4 discusses the merger of the saved files back into the new system. Section 3.5 gives an overview of the major bug fixes and changes between 4.3BSD and 4.4BSD. Section 3.6 provides general hints on possible problems to be aware of when converting from 4.3BSD to 4.4BSD.
The following list enumerates the standard set of files you will want to save and suggests directories in which site-specific files should be present. This list will likely be augmented with non-standard files you have added to your system. If you do not have enough space to create parallel filesystems, you should create a tar image of the following files before the new filesystems are created. The rest of this subsection describes where these files have moved and how they have changed.
/.cshrc                    root csh startup script (moves to /root/.cshrc)
/.login                    root csh login script (moves to /root/.login)
/.profile                  root sh startup script (moves to /root/.profile)
/.rhosts                   for trusted machines and users (moves to /root/.rhosts)
/etc/disktab               in case you changed disk partition sizes
/etc/fstab              *  disk configuration data
/etc/ftpusers              for local additions
/etc/gettytab              getty database
/etc/group              *  group data base
/etc/hosts                 for local host information
/etc/hosts.equiv           for local host equivalence information
/etc/hosts.lpd             printer access file
/etc/inetd.conf         *  Internet services configuration data
/etc/named*                named configuration files
/etc/netstart              network initialization
/etc/networks              for local network information
/etc/passwd             *  user data base
/etc/printcap           *  line printer database
/etc/protocols             in case you added any local protocols
/etc/rc                 *  for any local additions
/etc/rc.local           *  site specific system startup commands
/etc/remote                auto-dialer configuration
/etc/services              for local additions
/etc/shells                list of valid shells
/etc/syslog.conf        *  system logger configuration
/etc/securettys         *  merged into ttys
/etc/ttys               *  terminal line configuration data
/etc/ttytype            *  merged into ttys
/etc/termcap               for any local entries that may have been added
/lib                       for any locally developed language processors
/usr/dict/*                for local additions to words and papers
/usr/include/*             for local additions
/usr/lib/aliases        *  mail forwarding data base (moves to /etc/aliases)
/usr/lib/crontab        *  cron daemon data base (moves to /etc/crontab)
/usr/lib/crontab.local  *  local cron daemon data base (moves to /etc/crontab.local)
/usr/lib/lib*.a            for local libraries
/usr/lib/mail.rc           system-wide mail(1) initialization (moves to /etc/mail.rc)
/usr/lib/sendmail.cf    *  sendmail configuration (moves to /etc/sendmail.cf)
/usr/lib/tmac/*            for locally developed troff/nroff macros (moves to /usr/share/tmac/*)
/usr/lib/uucp/*            for local uucp configuration files
/usr/man/manl           *  for manual pages for locally developed programs (moves to /usr/local/man)
/usr/spool/*               for current mail, news, uucp files, etc. (moves to /var/spool)
/usr/src/local             for source for locally developed programs
/sys/conf/HOST             configuration file for your machine (moves to /sys/<arch>/conf)
/sys/conf/files.HOST       list of special files in your kernel (moves to /sys/<arch>/conf)
/*/quotas               *  filesystem quota files (moves to /*/quotas.user)
**   Files that can be used from 4.3BSD without change.
***  Files that need local changes merged into 4.4BSD files.
*    Files that require special work to merge and are discussed in section 3.4.
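The tar image mentioned above can be built in several ways. The following is a minimal sketch, assuming a POSIX shell and a tar that reads and writes archives on standard input and output; the function name save_site_files and its arguments are hypothetical and are not part of the distribution. It archives only those listed paths that actually exist, storing them relative to the given root so that they can later be unpacked into a scratch directory.

```shell
# save_site_files ROOT ARCHIVE PATH...
# Hypothetical helper: archive whichever of the listed paths exist under ROOT.
# Paths are given relative to ROOT (for example etc/fstab, not /etc/fstab).
save_site_files() {
    root=$1
    archive=$2
    shift 2
    existing=""
    for f in "$@"; do
        # Keep only paths that are present; tar would otherwise report an error.
        if [ -e "$root/$f" ]; then
            existing="$existing $f"
        fi
    done
    if [ -z "$existing" ]; then
        echo "save_site_files: nothing to save" >&2
        return 1
    fi
    # $existing is deliberately unquoted: this sketch assumes the saved
    # names contain no whitespace, as is true of the files in the table.
    (cd "$root" && tar cf - $existing) > "$archive"
}
```

A call such as `save_site_files / /tmp/site.tar etc/fstab etc/group etc/passwd` would then produce an archive that `tar xp` can unpack elsewhere. Entries in the table that use wildcards would need to be expanded by the caller first.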
The next step is to build a working 4.4BSD system. This can be done by following the steps in section 2 of this document for extracting the root and /usr filesystems from the distribution tape onto unused disk partitions. For the SPARC, the root filesystem dump on the tape could also be extracted directly. For the HP300 and DECstation, the raw disk image can be copied into an unused partition and this partition can then be dumped to create an image that can be restored. The exact procedure chosen will depend on the disk configuration and the number of suitable disk partitions that may be used. It is also desirable to run filesystem checks of all filesystems to be converted to 4.4BSD before shutting down. In any case, this is an excellent time to review your disk configuration for possible tuning of the layout. Section 2.5 and config(8) are required reading.
The filesystem in 4.4BSD has been reorganized in an effort to meet several goals:
These goals are realized with the following general layouts. The reorganized root filesystem has the following directories:
/etc    (config files)
/bin    (user binaries needed when single-user)
/sbin   (root binaries needed when single-user)
/local  (locally added binaries used only by this machine)
/tmp    (mount point for memory based filesystem)
/dev    (local devices)
/home   (mount point for AMD)
/var    (mount point for per-machine variable directories)
/usr    (mount point for multiuser binaries and files)
The reorganized /usr filesystem has the following directories:
/usr/bin      (user binaries)
/usr/contrib  (software contributed to 4.4BSD)
/usr/games    (binaries for games, score files in /var)
/usr/include  (standard include files)
/usr/lib      (lib*.a from old /usr/lib)
/usr/libdata  (databases from old /usr/lib)
/usr/libexec  (executables from old /usr/lib)
/usr/local    (locally added binaries used site-wide)
/usr/old      (deprecated binaries)
/usr/sbin     (root binaries)
/usr/share    (mount point for site-wide shared text)
/usr/src      (mount point for sources)
The reorganized /usr/share filesystem has the following directories:
/usr/share/calendar    (various useful calendar files)
/usr/share/dict        (dictionaries)
/usr/share/doc         (4.4BSD manual sources)
/usr/share/games       (games text files)
/usr/share/groff_font  (groff font information)
/usr/share/man         (typeset manual pages)
/usr/share/misc        (dumping ground for random text files)
/usr/share/mk          (templates for 4.4BSD makefiles)
/usr/share/skel        (template user home directory files)
/usr/share/tmac        (various groff macro packages)
/usr/share/zoneinfo    (information on time zones)
The reorganized /var filesystem has the following directories:
/var/account       (accounting files, formerly /usr/adm)
/var/at            (at(1) spooling area)
/var/backups       (backups of system files)
/var/crash         (crash dumps)
/var/db            (system-wide databases, e.g. tags)
/var/games         (score files)
/var/log           (log files)
/var/mail          (users mail)
/var/obj           (hierarchy to build /usr/src)
/var/preserve      (preserve area for vi)
/var/quotas        (directory to store quota files)
/var/run           (directory to store *.pid files)
/var/rwho          (rwho databases)
/var/spool/ftp     (home directory for anonymous ftp)
/var/spool/mqueue  (sendmail spooling directory)
/var/spool/news    (news spooling area)
/var/spool/output  (printer spooling area)
/var/spool/uucp    (uucp spooling area)
/var/tmp           (disk-based temporary directory)
/var/users         (root of per-machine user home directories)
The 4.4BSD bootstrap routines pass the identity of the boot device through to the kernel. The kernel then uses that device as its root filesystem. Thus, for example, if you boot from /dev/sd1a, the kernel will use sd1a as its root filesystem. If /dev/sd1b is configured as a swap partition, it will be used as the initial swap area, otherwise the normal primary swap area (/dev/sd0b) will be used. The 4.4BSD bootstrap is backward compatible with 4.3BSD, so you can replace your old bootstrap if you use it to boot your first 4.4BSD kernel. However, the 4.3BSD bootstrap cannot access 4.4BSD filesystems, so if you plan to convert your filesystems to 4.4BSD, you must install a new bootstrap before doing the conversion. Note that SPARC users cannot build a 4.4BSD compatible version of the bootstrap, so must not convert their root filesystem to the new 4.4BSD format.
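The choice of initial swap area described above can be modelled in a few lines of shell. This is only an illustration of the rule as stated in the text, not kernel code; the function name initial_swap is hypothetical, and the sketch assumes the root is always the ``a'' partition of the boot disk.

```shell
# initial_swap ROOTDEV [SWAPDEV...]
# Hypothetical model of the rule in the text: use the "b" partition of the
# boot disk if it is configured as swap, otherwise the primary swap sd0b.
initial_swap() {
    rootdev=$1
    shift
    candidate="${rootdev%a}b"    # sd1a -> sd1b (assumes root is partition a)
    for s in "$@"; do
        if [ "$s" = "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    echo "sd0b"
}
```

For example, `initial_swap sd1a sd0b sd1b` selects sd1b, while `initial_swap sd1a sd0b` falls back to sd0b.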
Once you have extracted the 4.4BSD system and booted from it, you will have to build a kernel customized for your configuration. If you have any local device drivers, they will have to be incorporated into the new kernel. See section 4.1.3 and ``Building 4.3BSD UNIX Systems with Config'' (SMM:2).
If converting from 4.3BSD, your old filesystems should be converted. If you've modified the partition sizes from the original 4.3BSD ones, and are not already using the 4.4BSD disk labels, you will have to modify the default disk partition tables in the kernel. Make the necessary table changes and boot your custom kernel BEFORE trying to access any of your old filesystems! After doing this, if necessary, the remaining filesystems may be converted in place by running the 4.4BSD version of fsck(8) on each filesystem and allowing it to make the necessary corrections. The new version of fsck is more strict about the size of directories than the version supplied with 4.3BSD. Thus the first time that it is run on a 4.3BSD filesystem, it will produce messages of the form:
DIRECTORY ...: LENGTH xx NOT MULTIPLE OF 512 (ADJUSTED)
IMPOSSIBLE INTERLEAVE=0 IN SUPERBLOCK (SET TO DEFAULT) [note 4]
IMPOSSIBLE NPSECT=0 IN SUPERBLOCK (SET TO DEFAULT)
In addition, 4.4BSD removes several limits on filesystem sizes that were present in 4.3BSD. The limited filesystems continue to work in 4.4BSD, but should be converted as soon as it is convenient by running fsck with the -c 2 option. The sequence fsck -p -c 2 will update them all, fix the interleave and npsect fields, fix any incorrect directory lengths, expand maximum uids and gids to 32 bits, place symbolic links less than 60 bytes into their inode, and fill in directory type fields all at once. The new filesystem formats are incompatible with older systems. If you wish to continue using these filesystems with the older systems, you should make only the compatible changes using fsck -c 1.
When your system is booting reliably and you have the 4.4BSD root and /usr filesystems fully installed you will be ready to continue with the next step in the conversion process, merging your old files into the new system.
If you saved the files on a tar tape, extract them into a scratch directory, say /usr/convert:
# mkdir /usr/convert
# cd /usr/convert
# tar xp
The data files marked in the previous table with a dagger (**) may be used without change from the previous system. Those data files marked with a double dagger (***) have syntax changes or substantial enhancements. You should start with the 4.4BSD version and carefully integrate any local changes into the new file. Usually these local changes can be incorporated without conflict into the new file; some exceptions are noted below. The files marked with an asterisk (*) require particular attention and are discussed below.
As described in section 3.3, the most immediately obvious change in 4.4BSD is the reorganization of the system filesystems. Users of certain recent vendor releases have seen this general organization, although 4.4BSD takes the reorganization a bit further. The directories most affected are /etc, which now contains only system configuration files; /var, a new filesystem containing per-system spool and log files; and /usr/share, which contains most of the text files shareable across architectures such as documentation and macros. System administration programs formerly in /etc are now found in /sbin and /usr/sbin. Various programs and data files formerly in /usr/lib are now found in /usr/libexec and /usr/libdata, respectively. Administrative files formerly in /usr/adm are in /var/account and, similarly, log files are now in /var/log. The directory /usr/ucb has been merged into /usr/bin, and the sources for programs in /usr/bin are in /usr/src/usr.bin. Other source directories parallel the destination directories; /usr/src/etc has been greatly expanded, and /usr/src/share is new. The sources for the manual pages are, in general, kept with the source code for the applications they document. Manual pages not closely corresponding to an application program are found in /usr/src/share/man. The locations of all man pages are listed in /usr/src/share/man/man0/man[1-8]. The manual page hier(7) has been updated and made more detailed; it is included in the printed documentation. You should review it to familiarize yourself with the new layout.
A new utility, mtree(8), is provided to build and check filesystem hierarchies with the proper contents, owners and permissions. Scripts are provided in /etc/mtree (and /usr/src/etc/mtree) for the root, /usr and /var filesystems. Once a filesystem has been made for /var, mtree can be used to create a directory hierarchy there or you can simply use tar to extract the prototype from the second file of the distribution tape.
The /etc directory now contains nearly all the host-specific configuration files. Note that some file formats have changed, and those configuration files containing pathnames are nearly all affected by the reorganization. See the examples provided in /etc (installed from /usr/src/etc) as a guide. The following table lists some of the local configuration files whose locations and/or contents have changed.
4.3BSD and Earlier         4.4BSD                 Comments
------------------------------------------------------------------------------------
/etc/fstab                 /etc/fstab             new format; see below
/etc/inetd.conf            /etc/inetd.conf        pathnames of executables changed
/etc/printcap              /etc/printcap          pathnames changed
/etc/syslog.conf           /etc/syslog.conf       pathnames of log files changed
/etc/ttys                  /etc/ttys              pathnames of executables changed
/etc/passwd                /etc/master.passwd     new format; see below
/usr/lib/sendmail.cf       /etc/sendmail.cf       changed pathnames
/usr/lib/aliases           /etc/aliases           may contain changed pathnames
/etc/*.pid                 /var/run/*.pid

New in 4.3BSD-Tahoe        4.4BSD                 Comments
------------------------------------------------------------------------------------
/usr/games/dm.config       /etc/dm.conf           configuration for games (see dm(8))
/etc/zoneinfo/localtime    /etc/localtime         timezone configuration
/etc/zoneinfo              /usr/share/zoneinfo    timezone configuration
New in 4.4BSD        Comments
--------------------------------------------------------------------
/etc/aliases.db      database version of the aliases file
/etc/amd-home        location database of home directories
/etc/amd-vol         location database of exported filesystems
/etc/changelist      /etc/security files to back up
/etc/csh.cshrc       system-wide csh(1) initialization file
/etc/csh.login       system-wide csh(1) login file
/etc/csh.logout      system-wide csh(1) logout file
/etc/disklabels      directory for saving disklabels
/etc/exports         NFS list of export permissions
/etc/ftpwelcome      message displayed for ftp users; see ftpd(8)
/etc/kerberosIV      Kerberos directory; see below
/etc/man.conf        lists directories searched by man(1)
/etc/mtree           directory for local mtree files; see mtree(8)
/etc/netgroup        NFS group list used in /etc/exports
/etc/pwd.db          non-secure hashed user data base file
/etc/spwd.db         secure hashed user data base file
/etc/security        daily system security checker
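When merging saved files, it can help to have the relocations from the first two tables above in executable form. The following is a sketch only; the function name new_location is hypothetical, and the mapping covers just the entries whose location changed in those tables. Paths not listed are returned unchanged.

```shell
# new_location OLDPATH
# Hypothetical lookup: print the 4.4BSD location for a configuration file
# whose location changed according to the tables above.
new_location() {
    case $1 in
        /etc/passwd)             echo /etc/master.passwd ;;
        /usr/lib/sendmail.cf)    echo /etc/sendmail.cf ;;
        /usr/lib/aliases)        echo /etc/aliases ;;
        /etc/*.pid)              echo "/var/run/${1##*/}" ;;
        /usr/games/dm.config)    echo /etc/dm.conf ;;
        /etc/zoneinfo/localtime) echo /etc/localtime ;;
        /etc/zoneinfo|/etc/zoneinfo/*)
                                 echo "/usr/share/zoneinfo${1#/etc/zoneinfo}" ;;
        *)                       echo "$1" ;;    # location unchanged
    esac
}
```

Note that a changed location is only part of the work: as the comments column indicates, several of these files also have new formats or contain pathnames that must be edited by hand.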
System security changes require adding several new ``well-known'' groups to /etc/group. The groups that are needed by the system as distributed are:
name        number    purpose
------------------------------------------------------------------
wheel       0         users allowed superuser privilege
daemon      1         processes that need less than wheel privilege
kmem        2         read access to kernel memory
sys         3         access to kernel sources
tty         4         access to terminals
operator    5         read access to raw disks
bin         7         group for system binaries
news        8         group for news
wsrc        9         write access to sources
games       13        access to games
staff       20        system staff
guest       31        system guests
nobody      39        the least privileged group
utmp        45        access to utmp files
dialer      117       access to remote ports and dialers

Only users in the ``wheel'' group are permitted to su to ``root''. Most programs that manage directories in /var/spool now run set-group-id to ``daemon'' so that users cannot directly access the files in the spool directories. The special files that access kernel memory, /dev/kmem and /dev/mem, are made readable only by group ``kmem''. Standard system programs that require this access are made set-group-id to that group. The group ``sys'' is intended to control access to kernel sources, and other sources belong to group ``wsrc.'' Rather than make user terminals writable by all users, they are now placed in group ``tty'' and made only group writable. Programs that should legitimately have access to write on user terminals such as talkd and write now run set-group-id to ``tty''. The ``operator'' group controls access to disks. By default, disks are readable by group ``operator'', so that programs such as dump can access the filesystem information without being set-user-id to ``root''. The shutdown(8) program is executable only by group operator and is setuid to root so that members of group operator may shut down the system without root access.
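Before installing a merged group file, it may be worth confirming that none of the well-known groups listed above has been lost. The sketch below assumes the usual colon-separated group file format; the function name missing_groups is hypothetical. It checks names only and does not verify the group numbers.

```shell
# missing_groups GROUPFILE
# Hypothetical check: print each well-known group from the table above
# that has no entry in the given group file.
missing_groups() {
    groupfile=$1
    for g in wheel daemon kmem sys tty operator bin news wsrc games \
             staff guest nobody utmp dialer; do
        # An entry begins with the group name followed by a colon.
        grep -q "^$g:" "$groupfile" || echo "$g"
    done
}
```

An empty result from `missing_groups /usr/convert/etc/group` would indicate that every required name is present in the saved file.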
The ownership and modes of some directories have changed. The at programs now run set-user-id ``root'' instead of ``daemon.'' Also, the uucp directory no longer needs to be publicly writable, as tip reverts to privileged status to remove its lock files. After copying your version of /var/spool, you should do:
# chown -R root /var/spool/at
# chown -R uucp.daemon /var/spool/uucp
# chmod -R o-w /var/spool/uucp
The format of the cron table, /etc/crontab, has been changed to specify the user-id that should be used to run a process. The userid ``nobody'' is frequently useful for non-privileged programs. Local changes are now put in a separate file, /etc/crontab.local.
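As a starting point for converting an old cron table, the following sketch inserts a user name after the time fields of every entry. It assumes that each old entry consists of five whitespace-separated time fields followed by the command, and that the new format places the user between the time fields and the command; check crontab(5) on the new system before relying on this. The function name add_cron_user is hypothetical.

```shell
# add_cron_user USER < old-crontab > new-crontab
# Hypothetical filter: insert USER after the five time fields of each entry.
# Comment lines and blank lines are passed through unchanged.
add_cron_user() {
    user=$1
    sed -E "/^[[:space:]]*(#|\$)/!s/^([[:space:]]*([^[:space:]]+[[:space:]]+){5})/\\1$user /"
}
```

For instance, the entry `0 * * * * /usr/lib/atrun` becomes `0 * * * * root /usr/lib/atrun` when filtered through `add_cron_user root`. Entries that should not run as the chosen user (the text suggests ``nobody'' for non-privileged programs) still have to be adjusted by hand.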
Some of the commands previously in /etc/rc.local have been moved to /etc/rc; several new functions are now handled by /etc/rc, /etc/netstart and /etc/rc.local. You should look closely at the prototype versions of these files and read the manual pages for the commands contained in them before trying to merge your local copy. Note in particular that ifconfig has had many changes, and that host names are now fully specified as domain-style names (e.g., vangogh.CS.Berkeley.EDU) for the benefit of the name server.
Some of the commands previously in /etc/daily have been moved to /etc/security, and several new functions have been added to /etc/security to do nightly security checks on the system. The script /etc/daily runs /etc/security each night, and mails the output to the super-user. Some of the checks done by /etc/security are:
+ Syntax errors in the password and group files.
+ Duplicate user and group names and id's.
+ Dangerous search paths and umask values for the superuser.
+ Dangerous values in various initialization files.
+ Dangerous .rhosts files.
+ Dangerous directory and file ownership or permissions.
+ Globally exported filesystems.
+ Dangerous owners or permissions for special devices.
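To give a feel for the kind of check listed above, the following sketch reports user ids that appear more than once in a password file. It is not taken from the distributed /etc/security script; the function name dup_uids is hypothetical, and the sketch assumes only that the user id is the third colon-separated field.

```shell
# dup_uids PASSWDFILE
# Hypothetical check: print each numeric user id that occurs on more than
# one line of a colon-separated password file.
dup_uids() {
    awk -F: '!/^#/ && NF >= 3 { n[$3]++ }
             END { for (u in n) if (n[u] > 1) print u }' "$1" | sort -n
}
```

A duplicate is not always an error (some sites deliberately keep a second account with uid 0), which is one reason such checks report their findings rather than correcting anything.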
The C-library and system binaries on the distribution tape are compiled with new versions of gethostbyname and gethostbyaddr that use the name server, named(8). If you have only a small network and are not connected to a large network, you can use the distributed library routines without any problems; they use a linear scan of the host table /etc/hosts if the name server is not running. If you are on the Internet or have a large local network, it is recommended that you set up and use the name server. For instructions on how to set up the necessary configuration files, refer to ``Name Server Operations Guide for BIND'' (SMM:10). Several programs rely on the host name returned by gethostname to determine the local domain name.
If you are using the name server, your sendmail configuration file will need some updates to accommodate it. See the ``Sendmail Installation and Operation Guide'' (SMM:8) and the sample sendmail configuration files in /usr/src/usr.sbin/sendmail/cf. The aliases file, /etc/aliases, has also been changed to add certain well-known addresses.
The password file format adds change and expiration fields and its location has changed to protect the encrypted passwords stored there. The actual password file is now stored in /etc/master.passwd. The hashed dbm password files do not contain encrypted passwords, but contain the file offset to the entry with the password in /etc/master.passwd (which is readable only by root). Thus, the getpwnam() and getpwuid() functions will no longer return an encrypted password string to non-root callers. An old-style passwd file is created in /etc/passwd by the vipw(8) and pwd_mkdb(8) programs. See also passwd(5).
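A saved seven-field password file can be brought into the new layout mechanically. The sketch below assumes the ten-field order name, password, uid, gid, class, change, expire, gecos, home directory, shell; confirm this against passwd(5) on the new system. The function name to_master_passwd is hypothetical. The login class is left empty, and the change and expiration times are set to 0 on the assumption that 0 means ``not set''.

```shell
# to_master_passwd < old-passwd > master.passwd
# Hypothetical filter: insert an empty class field and zero change and
# expiration fields after the gid of each seven-field entry.
to_master_passwd() {
    awk -F: 'BEGIN { OFS = ":" }
        NF == 7 { print $1, $2, $3, $4, "", 0, 0, $5, $6, $7; next }
        { print "skipping malformed line " NR > "/dev/stderr" }'
}
```

The output is only a candidate for /etc/master.passwd; it should be reviewed and installed with the tools named in the text rather than copied into place directly.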
Several new users have also been added to the group of ``well-known'' users in /etc/passwd. The current list is:
name        number
------------------
root        0
daemon      1
operator    2
bin         3
games       7
uucp        66
nobody      32767
After installing your updated password file, you must run pwd_mkdb(8) to create the password database. Note that pwd_mkdb(8) is run whenever vipw(8) is run.
The spooling directories saved on tape may be restored in their eventual resting places without too much concern. Be sure to use the `-p' option to tar(1) so that files are recreated with the same file modes. The following commands provide a guide for copying spool and log files from an existing system into a new /var filesystem. At least the following directories should already exist on /var: output, log, backups and db.
SRC=/oldroot/usr
cd $SRC; tar cf - msgs preserve | (cd /var && tar xpf -)

# copy $SRC/spool to /var
cd $SRC/spool
tar cf - at mail rwho | (cd /var && tar xpf -)
tar cf - ftp mqueue news secretmail uucp uucppublic | \
	(cd /var/spool && tar xpf -)

# everything else in spool is probably a printer area
mkdir .save
mv at ftp mail mqueue rwho secretmail uucp uucppublic .save
tar cf - * | (cd /var/spool/output && tar xpf -)
mv .save/* .
rmdir .save

cd /var/spool/mqueue
mv syslog.7 /var/log/maillog.7
mv syslog.6 /var/log/maillog.6
mv syslog.5 /var/log/maillog.5
mv syslog.4 /var/log/maillog.4
mv syslog.3 /var/log/maillog.3
mv syslog.2 /var/log/maillog.2
mv syslog.1 /var/log/maillog.1
mv syslog.0 /var/log/maillog.0
mv syslog /var/log/maillog

# move $SRC/adm to /var
cd $SRC/adm
tar cf - . | (cd /var/account && tar xpf -)
cd /var/account
rm -f msgbuf
mv messages messages.[0-9] ../log
mv wtmp wtmp.[0-9] ../log
mv lastlog ../log
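The commands above assume that certain directories already exist on /var. A small precheck along the following lines could be run first. It is a sketch; the function name check_var_dirs is hypothetical, and it assumes that the ``output'' directory mentioned in the text is /var/spool/output, as in the /var layout listed earlier.

```shell
# check_var_dirs VARROOT
# Hypothetical precheck: confirm that the directories the copy relies on
# exist under VARROOT; report any that are missing and return non-zero.
check_var_dirs() {
    var=$1
    status=0
    for d in spool/output log backups db; do
        if [ ! -d "$var/$d" ]; then
            echo "missing: $var/$d" >&2
            status=1
        fi
    done
    return $status
}
```

Running `check_var_dirs /var` before the copy gives an early warning if the new /var hierarchy has not yet been created.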
The major new facilities available in the 4.4BSD release are a new virtual memory system, the addition of ISO/OSI networking support, a new virtual filesystem interface supporting filesystem stacking, a freely redistributable implementation of NFS, a log-structured filesystem, enhancement of the local filesystems to support files and filesystems that are up to 2^63 bytes in size, enhanced security and system management support, and the conversion to and addition of the IEEE Std1003.1 (``POSIX'') facilities and many of the IEEE Std1003.2 facilities. In addition, many new utilities and additions to the C library are present as well. The kernel sources have been reorganized to collect all machine-dependent files for each architecture under one directory, and most of the machine-independent code is now free of code conditional on specific machines. The user structure and process structure have been reorganized to eliminate the statically-mapped user structure and to make most of the process resources shareable by multiple processes. The system and include files have been converted to be compatible with ANSI C, including function prototypes for most of the exported functions. There are numerous other changes throughout the system.
This release includes several important structural kernel changes. The kernel uses a new internal system call convention; the use of global (``u-dot'') variables for parameters and error returns has been eliminated, and interrupted system calls no longer abort using non-local goto's (longjmp's). A new sleep interface separates signal handling from scheduling priority, returning characteristic errors to abort or restart the current system call. This sleep call also passes a string describing the process state, which is used by the ps(1) program. The old sleep interface can be used only for non-interruptible sleeps. The new sleep interface (tsleep) can be used at any priority, but is only interruptible if the PCATCH flag is set. When interrupted, tsleep returns EINTR or ERESTART.
Many data structures that were previously statically allocated are now allocated dynamically. These structures include mount entries, file entries, user open file descriptors, the process entries, the vnode table, the name cache, and the quota structures.
To protect against indiscriminate reading or writing of kernel memory, all writing and most reading of kernel data structures must be done using a new ``sysctl'' interface. The information to be accessed is described through an extensible ``Management Information Base'' (MIB) style name, described as a dotted set of components. A new utility, sysctl(8), retrieves kernel state and allows processes with appropriate privilege to set kernel state.
The kernel runs with four different levels of security. Any superuser process can raise the security level, but only init(8) can lower it. Security levels are defined as follows:
Normally, the system runs in level 0 mode while single user and in level 1 mode while multiuser. If the level 2 mode is desired while running multiuser, it can be set in the startup script /etc/rc using sysctl(8). If it is desired to run the system in level 0 mode while multiuser, the administrator must build a kernel with the variable securelevel in the kernel source file /sys/kern/kern_sysctl.c initialized to -1.
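The rule that the level may be raised but not lowered can be captured in a small helper that prints the command to place in /etc/rc rather than running it. This is a sketch: the function name securelevel_cmd is hypothetical, and both the MIB name kern.securelevel and the -w flag are assumptions that should be checked against sysctl(8) on the installed system.

```shell
# securelevel_cmd CURRENT WANTED
# Hypothetical helper: print the sysctl command that would raise the
# security level, or fail if the request would not raise it.
securelevel_cmd() {
    current=$1
    wanted=$2
    if [ "$wanted" -le "$current" ]; then
        echo "securelevel can only be raised (current $current, wanted $wanted)" >&2
        return 1
    fi
    # Assumed syntax; verify the name and flag in sysctl(8).
    echo "sysctl -w kern.securelevel=$wanted"
}
```

For example, `securelevel_cmd 1 2` prints the line that would move a multiuser system from level 1 to level 2, while `securelevel_cmd 2 1` is rejected.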
The new virtual memory implementation is derived from the Mach operating system developed at Carnegie-Mellon, and was ported to the BSD kernel at the University of Utah. It is based on the 2.0 release of Mach (with some bug fixes from the 2.5 and 3.0 releases) and retains many of its essential features such as the separation of the machine dependent and independent layers (the ``pmap'' interface), efficient memory utilization using copy-on-write and other lazy-evaluation techniques, and support for large, sparse address spaces. It does not include the ``external pager'' interface, instead using a primitive internal pager interface. The Mach virtual memory system call interface has been replaced with the ``mmap''-based interface described in the ``Berkeley Software Architecture Manual'' (see UNIX Programmer's Manual, Supplementary Documents, PSD:5). The interface is similar to the interfaces shipped by several commercial vendors such as Sun, USL, and Convex Computer Corp. The integration of the new virtual memory is functionally complete, but still has serious performance problems under heavy memory load. The internal kernel interfaces have not yet been completed and the memory pool and buffer cache have not been merged. Some additional caveats:
The ISO/OSI networking support consists of a kernel implementation of transport class 4 (TP-4), connectionless networking protocol (CLNP), and 802.3-based link-level support (hardware-compatible with Ethernet [note 5]). We also include support for ISO Connection-Oriented Network Service, X.25, TP-0. The session and presentation layers are provided outside the kernel using the ISO Development Environment by Marshall Rose, which is available via anonymous FTP (but is not included on the distribution tape). Included in this development environment are file transfer and management (FTAM), virtual terminals (VT), a directory services implementation (X.500), and miscellaneous other utilities.
Kernel support for the ISO OSI protocols is enabled with the ISO option in the kernel configuration file. The iso(4) manual page describes the protocols and addressing; see also clnp(4), tp(4) and cltp(4). The OSI equivalent to ARP is ESIS (End System to Intermediate System Routing Protocol); running this protocol is mandatory. However, one can manually add translations for machines that do not participate by use of the route(8) command. Additional information is provided in the manual page describing esis(4).
The command route(8) has a new syntax and several new capabilities: it can install routes with a specified destination and mask, and can change route characteristics such as hop count, packet size and window size.
Several important enhancements have been added to the TCP/IP protocols including TCP header prediction and serial line IP (SLIP) with header compression. The routing implementation has been completely rewritten to use a hierarchical routing tree with a mask per route to support the arbitrary levels of routing found in the ISO protocols. The routing table also stores and caches route characteristics to speed the adaptation of the throughput and congestion avoidance algorithms.
The format of the sockaddr structure (the structure used to describe a generic network address with an address family and family-specific data) has changed from previous releases, as have the address family-specific versions of this structure. The sa_family family field has been split into a length, sa_len, and a family, sa_family. System calls that pass a sockaddr structure into the kernel (e.g. sendto() and connect()) have a separate parameter that specifies the sockaddr length, and thus it is not necessary to fill in the sa_len field for those system calls. System calls that pass a sockaddr structure back from the kernel (e.g. recvfrom() and accept()) receive a completely filled-in sockaddr structure, thus the length field is valid. Because this would not work for old binaries, the new library uses a different system call number. Thus, most networking programs compiled under 4.4BSD are incompatible with older systems.
Although this change is mostly source and binary compatible with old programs, there are three exceptions. Programs with statically initialized sockaddr structures (usually the Internet form, a sockaddr_in) are not compatible. Generally, such programs should be changed to fill in the structure at run time, as C allows no way to initialize a structure without assuming the order and number of fields. Also, programs that use structures to describe a network packet format containing embedded sockaddr structures require change; a definition of an osockaddr structure is provided for this purpose. Finally, programs that use the SIOCGIFCONF ioctl to get a complete list of interface addresses need to check the sa_len field when iterating through the array of addresses returned, as not all the structures returned have the same length (this variance in length is nearly guaranteed by the presence of link-layer address structures).
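As a minimal sketch of filling in an Internet address at run time rather than with a static initializer, consider the following. The helper name make_inet_addr is hypothetical and not part of the system library; each field is assigned by name, so the code does not depend on the order or number of fields in the structure.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/*
 * make_inet_addr is a hypothetical helper, not a library routine.
 * It zeroes the structure and assigns fields by name, so it compiles
 * whether or not the structure begins with a length field.
 * Returns 0 on success, -1 if dotted is not a valid address.
 */
int
make_inet_addr(struct sockaddr_in *sin, const char *dotted, unsigned short port)
{
	memset(sin, 0, sizeof(*sin));
	sin->sin_family = AF_INET;
	sin->sin_port = htons(port);
	sin->sin_addr.s_addr = inet_addr(dotted);
	/* inet_addr returns INADDR_NONE for a malformed address
	 * (and, ambiguously, for 255.255.255.255). */
	return ((sin->sin_addr.s_addr == INADDR_NONE) ? -1 : 0);
}
```

The result can then be passed to connect() or sendto() together with sizeof(struct sockaddr_in) as the separate length parameter, which is why the length field inside the structure need not be set for those calls.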
The 4.4BSD distribution contains most of the interfaces specified in the IEEE Std1003.1 system interface standard. Filesystem additions include IEEE Std1003.1 FIFOs, byte-range file locking, and saved user and group identifiers.
A new virtual filesystem interface has been added to the kernel to support multiple filesystems. In comparison with other interfaces, the Berkeley interface has been structured for more efficient support of filesystems that maintain state (such as the local filesystem). The interface has been extended with support for stackable filesystems done at UCLA. These extensions allow for filesystems to be layered on top of each other and allow new vnode operations to be added without requiring changes to existing filesystem implementations. For example, the umap filesystem (see mount_umap(8)) is used to mount a sub-tree of an existing filesystem that uses a different set of uids and gids than the local system. Such a filesystem could be mounted from a remote site via NFS or it could be a filesystem on removable media brought from some foreign location that uses a different password file.
Other new filesystems that may be stacked include the loopback filesystem mount_lofs(8), the kernel filesystem mount_kernfs(8), and the portal filesystem mount_portal(8).
The buffer cache in the kernel is now organized as a file block cache rather than a device block cache. As a consequence, cached blocks from a file and from the corresponding block device would no longer be kept consistent. The block device thus has little remaining value. Three changes have been made for these reasons:
The root filesystem may be made writable while in single-user mode with the command:
mount -uw /
In addition to the local ``fast filesystem'', we have added an implementation of the network filesystem (NFS) that fully interoperates with the NFS shipped by Sun and its licensees. Because our NFS implementation was written by Rick Macklem of the University of Guelph using only the publicly available NFS specification, it does not require a license from Sun to use in source or binary form. By default it runs over UDP to be compatible with Sun's implementation. However, it can be configured on a per-mount basis to run over TCP. Using TCP allows it to be used quickly and efficiently through gateways and over long-haul networks. Using an extended protocol, it supports Leases to allow a limited callback mechanism that greatly reduces the network traffic necessary to maintain cache consistency between the server and its clients. Its use will be familiar to users of other implementations of NFS. See the manual pages mount(8), mountd(8), fstab(5), exports(5), netgroup(5), nfsd(8), nfsiod(8), and nfssvc(8), and the document ``The 4.4BSD NFS Implementation'' (SMM:6) for further information. The format of /etc/fstab has changed from previous BSD releases to a blank-separated format to allow colons in pathnames.
A new local filesystem, the log-structured filesystem (LFS), has been added to the system. It provides near disk-speed output and fast crash recovery. This work is based, in part, on the LFS filesystem created for the Sprite operating system at Berkeley. While the kernel implementation is almost complete, only some of the utilities to support the filesystem have been written, so we do not recommend it for production use. See newlfs(8), mount_lfs(8) and lfs_cleanerd(8) for more information. For an in-depth description of the implementation and performance characteristics of log-structured filesystems in general, and this one in particular, see Dr. Margo Seltzer's doctoral thesis, available from the University of California Computer Science Department.
We have also added a memory-based filesystem that runs in pageable memory, allowing large temporary filesystems without requiring dedicated physical memory.
The local ``fast filesystem'' has been enhanced to do clustering that allows large pieces of files to be allocated contiguously resulting in near doubling of filesystem throughput. The filesystem interface has been extended to allow files and filesystems to grow to 2^63 bytes in size. The quota system has been rewritten to support both user and group quotas (simultaneously if desired). Quota expiration is based on time rather than the previous metric of number of logins over quota. This change makes quotas more useful on fileservers onto which users seldom login.
The system security has been greatly enhanced by the addition of file flags that permit a file to be marked as immutable or append only. Once set, these flags can only be cleared by the super-user when the system is running in insecure mode (normally, single-user). In addition to the immutable and append-only flags, the filesystem supports a new user-settable flag ``nodump''. (File flags are set using the chflags(1) utility.) When set on a file, dump(8) will omit the file from incremental backups but retain it on full backups. See the ``-h'' flag to dump(8) for details on how to change this default. The ``nodump'' flag is usually set on core dumps, system crash dumps, and object files generated by the compiler. Note that the flag is not preserved when files are copied, so installing an object file will cause the installed copy to be preserved.
The filesystem format used in 4.4BSD has several additions. Directory entries have an additional field, d_type, that identifies the type of the entry (normally found in the st_mode field of the stat structure). This field is particularly useful for identifying directories without the need to use stat(2).
Short (less than sixty byte) symbolic links are now stored in the inode itself rather than in a separate data block. This saves disk space and makes access of symbolic links faster. Short symbolic links are not given a special type, so a user-level application is unaware of their special treatment. Unlike pre-4.4BSD systems, symbolic links do not have an owner, group, access mode, times, etc. Instead, these attributes are taken from the directory that contains the link. The only attributes returned from an lstat(2) that refer to the symbolic link itself are the file type (S_IFLNK), size, blocks, and link count (always 1).
An implementation of an auto-mounter daemon, amd, was contributed by Jan-Simon Pendry of the Imperial College of Science, Technology & Medicine. See the document ``AMD - The 4.4BSD Automounter'' (SMM:13) for further information.
The directory /dev/fd contains special files 0 through 63 that, when opened, duplicate the corresponding file descriptor. The names /dev/stdin, /dev/stdout and /dev/stderr refer to file descriptors 0, 1 and 2. See fd(4) and mount_fdesc(8) for more information.
The 4.4BSD system uses the IEEE P1003.1 (POSIX.1) terminal interface rather than the previous BSD terminal interface. The terminal driver is similar to the System V terminal driver with the addition of the necessary extensions to get the functionality previously available in the 4.3BSD terminal driver. Both the old ioctl calls and old options to stty(1) are emulated. This emulation is expected to be unavailable in many vendors' releases, so conversion to the new interface is encouraged.
4.4BSD also adds the IEEE Std1003.1 job control interface, which is similar to the 4.3BSD job control interface, but adds a security model that was missing in the 4.3BSD job control implementation. A new system call, setsid(), creates a job-control session consisting of a single process group with one member, the caller, that becomes a session leader. Only a session leader may acquire a controlling terminal. This is done explicitly via a TIOCSCTTY ioctl() call, not implicitly by an open() call. The call fails if the terminal is in use. Programs that allocate controlling terminals (or pseudo-terminals) require change to work in this environment. The versions of xterm provided in the X11R5 release include the necessary changes. New library routines are available for allocating and initializing pseudo-terminals and other terminals as controlling terminals; see /usr/src/lib/libutil/pty.c and /usr/src/lib/libutil/login_tty.c.
The POSIX job control model formalizes the previous conventions used in setting up a process group. Unfortunately, this requires that changes be made in a defined order and with some synchronization that were not necessary in the past. Older job control shells (csh, ksh) will generally not operate correctly with the new system.
Most of the other kernel interfaces have been changed to correspond with the POSIX.1 interface, although that work is not complete. See the relevant manual pages and the IEEE POSIX standard.
Both the HP300 and SPARC ports feature the ability to run binaries built for the native operating system (HP-UX or SunOS) by emulating their system calls. Building an HP300 kernel with the HPUXCOMPAT and COMPAT_OHPUX options or a SPARC kernel with the COMPAT_SUNOS option will enable this feature (on by default in the generic kernel provided in the root filesystem image). Though this native operating system compatibility was provided by the developers as needed for their purposes and is by no means complete, it is complete enough to run several non-trivial applications including those that require HP-UX or SunOS shared libraries. For example, the vendor supplied X11 server and windowing environment can be used on both the HP300 and SPARC.
It is important to remember that merely copying over a native binary and executing it (or executing it directly across NFS) does not imply that it will run. All but the most trivial of applications are likely to require access to auxiliary files that do not exist under 4.4BSD (e.g. /etc/ld.so.cache) or have a slightly different format (e.g. /etc/passwd). However, by using system call tracing and through creative use of symlinks, many problems can be tracked down and corrected.
The DECstation port also has code for ULTRIX emulation (kernel option ULTRIXCOMPAT, not compiled into the generic kernel) but it was used primarily for initially bootstrapping the port and has not been used since. Hence, some work may be required to make it generally useful.
We have been tracking the IEEE Std1003.2 shell and utility work and have included prototypes of many of the proposed utilities based on draft 12 of the POSIX.2 Shell and Utilities document. Because most of the traditional utilities have been replaced with implementations conformant to the POSIX standards, you should realize that the utility software may not be as stable, reliable or well documented as in traditional Berkeley releases. In particular, almost the entire manual suite has been rewritten to reflect the POSIX defined interfaces, and in some instances it does not correctly reflect the current state of the software. It is also worth noting that, in rewriting this software, we have generally been rewarded with significant performance improvements. Most of the libraries and header files have been converted to be compliant with ANSI C. The shipped compiler (gcc) is a superset of ANSI C, but supports traditional C as a command-line option. The system libraries and utilities all compile with either ANSI or traditional C.
This release uses a completely new version of the make program derived from the pmake program developed by the Sprite project at Berkeley. It supports existing makefiles, although certain incorrect makefiles may fail. The makefiles for the 4.4BSD sources make extensive use of the new facilities, especially conditionals and file inclusion, and are thus completely incompatible with older versions of make (but nearly all the makefiles are now trivial!). The standard include files for make are in /usr/share/mk. There is a bsd.README file in /usr/src/share/mk.
Another global change supported by the new make is designed to allow multiple architectures to share a copy of the sources. If a subdirectory named obj is present in the current directory, make descends into that directory and creates all object and other files there. We use this by building a directory hierarchy in /var/obj that parallels /usr/src. We then create the obj subdirectories in /usr/src as symbolic links to the corresponding directories in /var/obj. (This step is automated. The command ``make obj'' in /usr/src builds both the local symlink and the shadow directory, using /usr/obj, which may be a symbolic link, as the root of the shadow tree. The use of /usr/obj is for historic reasons only, and the system make configuration files in /usr/share/mk can trivially be modified to use /var/obj instead.) We have one /var/obj hierarchy on the local system, and another on each system that shares the source filesystem. All the sources in /usr/src except for /usr/src/contrib and portions of /usr/src/old have been converted to use the new make and obj subdirectories; this change allows compilation for multiple architectures from the same source tree (which may be mounted read-only).
The Kerberos authentication server from MIT (version 4) is included in this release. See kerberos(1) for a general, if MIT-specific, introduction. If it is configured, login(1), passwd(1), rlogin(1) and rsh(1) will all begin to use it automatically. The file /etc/kerberosIV/README describes the configuration. Each system needs the file /etc/kerberosIV/krb.conf to set its realm and local servers, and a private key stored in /etc/kerberosIV/srvtab (see ext_srvtab(8)). The Kerberos server should be set up on a single, physically secure, server machine. Users and hosts may be added to the server database manually with kdb_edit(8), or users on authorized hosts can add themselves and a Kerberos password after verification of their ``local'' (passwd-file) password using the register(1) program.
Note that by default the password-changing program passwd(1) changes the Kerberos password, which must exist. The -l option to passwd(1) changes the ``local'' password if one exists.
Note that Version 5 of Kerberos will be released soon; Version 4 should probably be replaced at that time.
The timezone conversion code in the C library uses data files installed in /usr/share/zoneinfo to convert from ``GMT'' to various timezones. The data file for the default timezone for the system should be copied to /etc/localtime. Other timezones can be selected by setting the TZ environment variable.
The data files initially installed in /usr/share/zoneinfo include corrections for leap seconds since the beginning of 1970. Thus, they assume that the kernel will increment the time at a constant rate during a leap second; that is, time just keeps on ticking. The conversion routines will then name a leap second 23:59:60. For purists, this effectively means that the kernel maintains TAI (International Atomic Time) rather than UTC (Coordinated Universal Time, aka GMT).
For systems that run current NTP (Network Time Protocol) implementations or that wish to conform to the letter of the POSIX.1 law, it is possible to rebuild the timezone data files so that leap seconds are not counted. (NTP causes the time to jump over a leap second, and POSIX effectively requires the clock to be reset by hand when a leap second occurs. In this mode, the kernel effectively runs UTC rather than TAI.)
The data files without leap second information are constructed from the source directory, /usr/src/share/zoneinfo. Change the variable REDO in Makefile from ``right'' to ``posix'', and then do
make obj	(if necessary)
make
make install
You will then need to copy the correct default zone file to /etc/localtime, as the old one would still have used leap seconds, and because the Makefile installs a default /etc/localtime each time ``make install'' is done.
It is possible to install both sets of timezone data files. This results in subdirectories /usr/share/zoneinfo/right and /usr/share/zoneinfo/posix. Each contains a complete set of zone files. See /usr/src/share/zoneinfo/Makefile for details.
Notable additions to the libraries include functions to traverse a filesystem hierarchy, database interfaces to btree and hashing functions, a new, faster implementation of stdio, and radix and merge sort functions.
The fts(3) functions will do either physical or logical traversal of a file hierarchy as well as handle essentially infinite depth filesystems and filesystems with cycles. All the utilities in 4.4BSD which traverse file hierarchies have been converted to use fts(3). The conversion has always resulted in a significant performance gain, often of four or five to one in system time.
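As a sketch of a physical traversal with fts(3), the following counts the regular files under a directory without following symbolic links. The helper name count_regular_files is hypothetical.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <fts.h>
#include <stddef.h>

/*
 * count_regular_files is a hypothetical helper. It walks the hierarchy
 * rooted at root without following symbolic links (a physical walk)
 * and returns the number of regular files seen, or -1 on error.
 */
long
count_regular_files(char *root)
{
	char *roots[2];
	FTS *ftsp;
	FTSENT *p;
	long n = 0;

	roots[0] = root;
	roots[1] = NULL;	/* the path list is NULL-terminated */
	if ((ftsp = fts_open(roots, FTS_PHYSICAL, NULL)) == NULL)
		return (-1);
	while ((p = fts_read(ftsp)) != NULL) {
		/* fts_info classifies each entry; FTS_F is a regular file. */
		if (p->fts_info == FTS_F)
			n++;
	}
	fts_close(ftsp);
	return (n);
}
```

Replacing FTS_PHYSICAL with FTS_LOGICAL gives the logical walk, in which symbolic links are followed. Passing NULL as the comparison function leaves entries in directory order.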
The dbopen(3) functions are intended to be a family of database access methods. Currently, they consist of hash(3), an extensible, dynamic hashing scheme, btree(3), a sorted, balanced tree structure (B+trees), and recno(3), a flat-file interface for fixed or variable length records referenced by logical record number. Each of the access methods stores associated key/data pairs and uses the same record oriented interface for access.
The qsort(3) function has been rewritten for additional performance. In addition, three new types of sorting functions, heapsort(3), mergesort(3) and radixsort(3) have been added to the system. The mergesort function is optimized for data with pre-existing order, in which case it usually significantly outperforms qsort. The radixsort(3) functions are variants of most-significant-byte radix sorting. They take time linear to the number of bytes to be sorted, usually significantly outperforming qsort on data that can be sorted in this fashion. An implementation of the POSIX 1003.2 standard sort(1), based on radixsort, is included in /usr/src/contrib/sort.
Some additional comments about the 4.4BSD C library:
The curses(3) library has been largely rewritten. Important additional features include support for scrolling and termios(3).
An application front-end editing library, named libedit, has been added to the system.
A superset implementation of the SunOS kernel memory interface library, libkvm, has been integrated into the system.
There are many new utilities, offering many new capabilities, in 4.4BSD. Skimming through the section 1 and section 8 manual pages is sure to be useful. The additions to the utility suite include greatly enhanced versions of programs that display system status information, implementations of various traditional tools described in the IEEE Std1003.2 standard, new tools not previously available on Berkeley UNIX systems, and many others. Also, with only a very few exceptions, all the utilities from 4.3BSD that included proprietary source code have been replaced, and their 4.4BSD counterparts are freely redistributable. Normally, this replacement resulted in significant performance improvements and the increase of the limits imposed on data by the utility as well.
A summary of specific additions and changes is as follows:
amd         An auto-mounter implementation.
ar          Replacement of the historic archive format with a new one.
awk         Replaced by gawk; see /usr/src/old/awk for the historic version.
bdes        Utility implementing DES modes of operation described in FIPS PUB 81.
calendar    Addition of an interface for system calendars.
cap_mkdb    Utility for building hashed versions of termcap style databases.
cc          Replacement of pcc with gcc suite.
chflags     A utility for setting the per-file user and system flags.
chfn        An editor based replacement for changing user information.
chpass      An editor based replacement for changing user information.
chsh        An editor based replacement for changing user information.
cksum       The POSIX 1003.2 checksum utility; compatible with sum.
column      A columnar text formatting utility.
cp          POSIX 1003.2 compatible, able to copy special files.
csh         Freely redistributable and 8-bit clean.
date        User specified formats added.
dd          New EBCDIC conversion tables, major performance improvements.
dev_mkdb    Hashed interface to devices.
dm          Dungeon master.
find        Several new options and primaries, major performance improvements.
fstat       Utility displaying information on files open on the system.
ftpd        Connection logging added.
hexdump     A binary dump utility, superseding od.
id          The POSIX 1003.2 user identification utility.
inetd       Tcpmux added.
jot         A text formatting utility.
kdump       A system-call tracing facility.
ktrace      A system-call tracing facility.
kvm_mkdb    Hashed interface to the kernel name list.
lam         A text formatting utility.
lex         A new, freely redistributable, significantly faster version.
locate      A database of the system files, by name, constructed weekly.
logname     The POSIX 1003.2 user identification utility.
mail.local  New local mail delivery agent, replacing mail.
make        Replaced with a new, more powerful make, supporting include files.
man         Added support for man page location configuration.
mkdep       A new utility for generating make dependency lists.
mkfifo      The POSIX 1003.2 FIFO creation utility.
mtree       A new utility for mapping file hierarchies to a file.
nfsstat     An NFS statistics utility.
nvi         A freely redistributable replacement for the ex/vi editors.
pax         The POSIX 1003.2 replacement for cpio and tar.
printf      The POSIX 1003.2 replacement for echo.
roff        Replaced by groff; see /usr/src/old/roff for the historic versions.
rs          New utility for text formatting.
shar        An archive building utility.
sysctl      MIB-style interface to system state.
tcopy       Fast tape-to-tape copying and verification.
touch       Time and file reference specifications.
tput        The POSIX 1003.2 terminal display utility.
tr          Addition of character classes.
uname       The POSIX 1003.2 system identification utility.
vis         A filter for converting and displaying non-printable characters.
xargs       The POSIX 1003.2 argument list constructor utility.
yacc        A new, freely redistributable, significantly faster version.
The new versions of lex(1) (``flex'') and yacc(1) (``zoo'') should be installed early on if attempting to cross-compile 4.4BSD on another system. Note that the new lex program is not completely backward compatible with historic versions of lex, although it is believed that all documented features are supported.
The find utility has two new options that are important to be aware of if you intend to use NFS. The ``fstype'' and ``prune'' options can be used together to prevent find from crossing NFS mount points. See /etc/daily for an example of their use.
This section summarizes changes between 4.3BSD and 4.4BSD that are likely to cause difficulty in doing the conversion. It does not include changes in the network; see section 5 for information on setting up the network.
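Since the stat st_size field is now 64-bits instead of 32, doing something like:
foo(st.st_size);
and then defining foo with an ``int'' parameter:
foo(size)
	int size;
{
	...
}
will fail, because the caller passes a 64-bit value that the function reads as 32 bits. A related problem is casting the offset argument of lseek(2) to the wrong type, or failing to cast it at all, as in:
lseek(fd, (long)off, 0);
or
lseek(fd, 0, 0);
The offset argument is now a 64-bit off_t and must be passed as one.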
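One way to avoid both pitfalls is to keep sizes and offsets in off_t throughout and to include <unistd.h>, so that the compiler sees a prototype for lseek. The following is a minimal sketch; the helper names file_size and rewind_fd are hypothetical.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * file_size and rewind_fd are hypothetical helpers. Both keep sizes
 * and offsets in off_t from end to end, so that no value is narrowed
 * to int or long on the way.
 */
off_t
file_size(int fd)
{
	struct stat st;

	if (fstat(fd, &st) == -1)
		return ((off_t)-1);
	return (st.st_size);
}

int
rewind_fd(int fd)
{
	/* The cast documents the intent; with the prototype from
	 * <unistd.h> in scope the constant is converted to off_t anyway. */
	return (lseek(fd, (off_t)0, SEEK_SET) == (off_t)-1 ? -1 : 0);
}
```

Callers should likewise store the result of file_size in an off_t variable rather than an int.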
Determining the ``namelen'' parameter for a connect(2) call on a unix domain socket should use the ``SUN_LEN'' macro from <sys/un.h>. One old way that was used:
addrlen = strlen(unaddr.sun_path) + sizeof(unaddr.sun_family);
no longer works, because the structure now also contains a length field (sun_len).
The kernel's limit on the number of open files has been increased from 20 to 64. It is now possible to change this limit almost arbitrarily. The standard I/O library autoconfigures to the kernel limit. Note that file (``_iob'') entries may be allocated by malloc from fopen; this allocation has been known to cause problems with programs that use their own memory allocators. Memory allocation does not occur until after 20 files have been opened by the standard I/O library.
Select can be used with more than 32 descriptors by using arrays of ints for the bit fields rather than single ints. Programs that used getdtablesize as their first argument to select will no longer work correctly. Usually the program can be modified to correctly specify the number of bits in an int. Alternatively the program can be modified to use an array of ints. There is a set of macros available in <sys/types.h> to simplify this. See select(2).
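As a sketch of the macro-based approach, the following waits for a single descriptor to become readable using fd_set and the FD_ZERO, FD_SET and FD_ISSET macros. The helper name wait_readable is hypothetical, and the inclusion of <sys/select.h> is an assumption for current systems, where the macros may live there rather than in <sys/types.h>.

```c
#include <sys/types.h>
#include <sys/time.h>
#include <sys/select.h>	/* assumption: needed on some current systems */
#include <unistd.h>

/*
 * wait_readable is a hypothetical helper. It waits up to timeout_sec
 * seconds for fd to become readable.
 * Returns 1 if readable, 0 on timeout, -1 on error.
 */
int
wait_readable(int fd, long timeout_sec)
{
	fd_set readfds;
	struct timeval tv;
	int n;

	if (fd < 0 || fd >= FD_SETSIZE)
		return (-1);
	FD_ZERO(&readfds);
	FD_SET(fd, &readfds);
	tv.tv_sec = timeout_sec;
	tv.tv_usec = 0;
	/* First argument: the highest descriptor of interest, plus one. */
	n = select(fd + 1, &readfds, NULL, NULL, &tv);
	if (n == -1)
		return (-1);
	return (FD_ISSET(fd, &readfds) ? 1 : 0);
}
```

The fd_set type hides the array of ints, so the same code works whether the descriptor is below or above 32.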
Old core files will not be intelligible to the current debuggers because of numerous changes to the user structure and because the kernel stack has been enlarged. The a.out header that was in the user structure is no longer present. Locally-written debuggers that try to check the magic number will need to be changed.
Files may not be deleted from directories having the ``sticky'' (ISVTX) bit set in their modes except by the owner of the file or of the directory, or by the superuser. This is primarily to protect users' files in publicly-writable directories such as /tmp and /var/tmp. All publicly-writable directories should have their ``sticky'' bits set with ``chmod +t.''
The following two sections contain additional notes about changes in 4.4BSD that affect the installation of local files; be sure to read them as well.