RCS arranges revisions in an ancestral tree. The ci command builds this tree; the auxiliary command rcs prunes it. The tree has a root revision, normally numbered 1.1, and successive revisions are numbered 1.2, 1.3, etc. The first field of a revision number is called the release number and the second one the level number. Unless given explicitly, the ci command assigns a new revision number by incrementing the level number of the previous revision. The release number must be incremented explicitly, using the -r option of ci. Assuming there are revisions 1.1, 1.2, and 1.3 in the RCS file f.c,v, the command
ci -r2.1 f.c or ci -r2 f.c
A young revision tree is slender: It consists of only one branch, called the trunk. As the tree ages, side branches may form. Branches are needed in the following 4 situations.
Figure 2. A slender revision tree.
Now imagine a customer requesting a fix of
a problem in revision 1.3, although actual development has moved on
to release 2. RCS does not permit an extra
revision to be spliced in between 1.3 and 2.1, since that would not reflect
the actual development history. Instead, create a branch
at revision 1.3, and check in the fix on that branch.
The first branch starting at 1.3 has number 1.3.1, and
the revisions on that branch are numbered 1.3.1.1, 1.3.1.2, etc.
The double numbering is needed to allow for another
branch at 1.3, say 1.3.2.
Revisions on the second branch would be numbered
1.3.2.1, 1.3.2.2, and so on.
The following steps create
branch 1.3.1 and add revision 1.3.1.1:
co -r1.3 f.c -- check out revision 1.3 edit f.c -- change it ci -r1.3.1 f.c -- check it in on branch 1.3.1This sequence of commands transforms the tree of Figure 2 into the one in Figure 3. Note that it may be necessary to incorporate the differences between 1.3 and 1.3.1.1 into a revision at level 2. The operation rcsmerge automates this process (see the Appendix).
Figure 3. A revision tree with one side branch
Figure 4. A customer's revision tree with local modifications.
This approach is actually practiced in the CSNET project, where several universities and a company cooperate in developing a national computer network.
Every node in a revision tree consists of the following attributes: a revision number, a check-in date and time, the author's identification, a log entry, a state and the actual text. All these attributes are determined at the time the revision is checked in. The state attribute indicates the status of a revision. It is set automatically to `experimental' during check-in. A revision can later be promoted to a higher status, for example `stable' or `released'. The set of states is user-defined.
For conserving space, RCS stores revisions in the form of deltas, i.e., as differences between revisions. The user interface completely hides this fact.
A delta is a sequence of edit commands that transforms one string into another. The deltas employed by RCS are line-based, which means that the only edit commands allowed are insertion and deletion of lines. If a single character in a line is changed, the edit scripts consider the entire line changed. The program diff2 produces a small, line-based delta between pairs of text files. A character-based edit script would take much longer to compute, and would not be significantly shorter.
Using deltas is a classical space-time tradeoff: deltas reduce the space consumed, but increase access time. However, a version control tool should impose as little delay as possible on programmers. Excessive delays discourage the use of version controls, or induce programmers to take shortcuts that compromise system integrity. To gain reasonably fast access time for both editing and compiling, RCS arranges deltas in the following way. The most recent revision on the trunk is stored intact. All other revisions on the trunk are stored as reverse deltas. A reverse delta describes how to go backward in the development history: it produces the desired revision if applied to the successor of that revision. This implementation has the advantage that extraction of the latest revision is a simple and fast copy operation. Adding a new revision to the trunk is also fast: ci simply adds the new revision intact, replaces the previous revision with a reverse delta, and keeps the rest of the old deltas. Thus, ci requires the computation of only one new delta.
Branches need special treatment. The naive solution would be to store complete copies for the tips of all branches. Clearly, this approach would cost too much space. Instead, RCS uses forward deltas for branches. Regenerating a revision on a side branch proceeds as follows. First, extract the latest revision on the trunk; secondly, apply reverse deltas until the fork revision for the branch is obtained; thirdly, apply forward deltas until the desired branch revision is reached. Figure 5 illustrates a tree with one side branch. Triangles pointing to the left and right represent reverse and forward deltas, respectively.
Figure 5. A revision tree with reverse and forward deltas.
Although implementing fast check-out for the latest trunk revision, this arrangement has the disadvantage that generation of other revisions takes time proportional to the number of deltas applied. For example, regenerating the branch tip in Figure 5 requires application of five deltas (including the initial one). Since usage statistics show that the latest trunk revision is the one that is retrieved in 95 per cent of all cases (see the section on usage statistics), biasing check-out time in favor of that revision results in significant savings. However, careful implementation of the delta application process is necessary to provide low retrieval overhead for other revisions, in particular for branch tips.
There are several techniques for delta application. The naive one is to pass each delta to a general-purpose text editor. A prototype of RCS invoked the UNIX editor ed both for applying deltas and for expanding the identification markers. Although easy to implement, performance was poor, owing to the high start-up costs and excess generality of ed. An intermediate version of RCS used a special-purpose, stream-oriented editor. This technique reduced the cost of applying a delta to the cost of checking out the latest trunk revision. The reason for this behavior is that each delta application involves a complete pass over the preceding revision.
However, there is a much better algorithm. Note that the deltas are line oriented and that most of the work of a stream editor involves copying unchanged lines from one revision to the next. A faster algorithm avoids unnecessary copying of character strings by using a piece table. A piece table is a one-dimensional array, specifying how a given revision is `pieced together' from lines in the RCS file. Suppose piece table PTr represents revision r. Then PTr[i] contains the starting position of line i of revision r. Application of the next delta transforms piece table PTr into PTr+1. For instance, a delete command removes a series of entries from the piece table. An insertion command inserts new entries, moving the entries following the insertion point further down the array. The inserted entries point to the text lines in the delta. Thus, no I/O is involved except for reading the delta itself. When all deltas have been applied to the piece table, a sequential pass through the table looks up each line in the RCS file and copies it to the output file, updating identification markers at the same time. Of course, the RCS file must permit random access, since the copied lines are scattered throughout that file. Figure 6 illustrates an RCS file with two revisions and the corresponding piece tables.
Figure 6 is not available.
Figure 6. An RCS file and its piece tables
The piece table approach has the property that the time for applying a single delta is roughly determined by the size of the delta, and not by the size of the revision. For example, if a delta is 10 per cent of the size of a revision, then applying it takes only 10 per cent of the time to generate the latest trunk revision. (The stream editor would take 100 per cent.)
There is an important alternative for representing deltas that affects performance. SCCS3, a precursor of RCS, uses interleaved deltas. A file containing interleaved deltas is partitioned into blocks of lines. Each block has a header that specifies to which revision(s) the block belongs. The blocks are sorted out in such a way that a single pass over the file can pick up all the lines belonging to a given revision. Thus, the regeneration time for all revisions is the same: all headers must be inspected, and the associated blocks either copied or skipped. As the number of revisions increases, the cost of retrieving any revision is much higher than the cost of checking out the latest trunk revision with reverse deltas. A detailed comparison of SCCS's interleaved deltas and RCS's reverse deltas can be found in Reference 4. This reference considers the version of RCS with the stream editor only. The piece table method improves performance further, so that RCS is always faster than SCCS, except if 10 or more deltas are applied.
Additional speed-up for both delta methods can be obtained by caching the most recently generated revision, as has been implemented in DSEE.5 With caching, access time to frequently used revisions can approach normal file access time, at the cost of some additional space.