This section outlines the changes made to the system since the 4.2BSD distribution. The changes reported here were made in response to the problems described in Section 3. The improvements fall into two major classes; changes to the kernel that are described in this section, and changes to the system libraries and utilities that are described in the following section.
Our goal has been to optimize system performance for our general timesharing environment. Since most sites running 4.2BSD have been forced to take advantage of declining memory costs rather than replace their existing machines with ones that are more powerful, we have chosen to optimize running time at the expense of memory. This tradeoff may need to be reconsidered for personal workstations that have smaller memories and higher latency disks. Decreases in the running time of the system may be unnoticeable because of higher paging rates incurred by a larger kernel. Where possible, we have allowed the size of caches to be controlled so that systems with limited memory may reduce them as appropriate.
Our initial profiling studies showed that more than one quarter of the time in the system was spent in the pathname translation routine, namei, translating path names to inodes[note 15] . An inspection of namei shows that it consists of two nested loops. The outer loop is traversed once per pathname component. The inner loop performs a linear search through a directory looking for a particular pathname component.
Our first idea was to reduce the number of iterations
around the inner loop of namei by observing that many programs
step through a directory performing an operation on each entry in turn.
To improve performance for processes doing directory scans,
the system keeps track of the directory offset of the last component of the
most recently translated path name for each process.
If the next name the process requests is in the same directory,
the search is started from the offset that the previous name was found
(instead of from the beginning of the directory).
Changing directories invalidates the cache, as
does modifying the directory.
For programs that step sequentially through a directory with
files, search time decreases from
to
.
The c
the cache is ab
c(aband 16 bytes per prst
Iuser vect
As a quick benchmark t
y the maximum effectiveness the
cache we ran ``ls -l''
iles.
Befused 22.3 sec
system time.
After adding the cache the pr user time, but the system time dr
This change pr
iled system
Inamei.
The results sh
Inamei
drstill acc
the system call time,
18% the kernel,
all the machine cycles.
This am
rThe results are sh
D'l |192u 0' parttime% kernel D'l |192u 0' self11.0 ms/call9.2% child10.6 ms/call8.9% D'l |192u 0' t D'l 0 |0u-1v' D'l 0 |0u-1v' Table 9. Call times fInamei with per-pr
The small perfwas caused by a lAlth
fective when hit,
it was
the names being translated.
An additi
alth
time spent in namei itself
decreased substantially,
msince each direct
rand
r
Frequent requests f
names are best handled
with a cache recent name translatiThis has the effect eliminating the inner l
namei.
Fnamei first l
recent translatifIf it exists, the direct
The system already maintained a cache recently accessed insmaintained a simple name-incheck each c
a path name during name translatiWe cwith its mbut eventually decided tkept names with pTagging inmany inthe in
time, but are never lup by name.
Other infrequently by many different names (e.g. ``..'').
By keeping a separate table names, the cache can
truly reflect the mAn added benefit is that the table can be sized independently
the in
memcan reduce the size the cache (with
ying the in
Anh
erences tN
erences'' by incrementing the
reference c
erence.
Since the system reuses
erence ca hard reference insures that the inH
the name cache h
erences,
it is limited t
racti
the size the insince s
t free f
iles.
It als
the kernel
t
y s
a device
ile.
These reas
erences
with
fecting the behavi
the inThus, we ch
t references'' prby a capability - a 32-bit number
guaranteed tWhen an entry is made in the name cache,
the capability its inWhen an inWhen a name cache hit the capability the name cache entry is cwith the capability the in
erences.
If the capabilities dSince the name cache h
t references,
it may be sized independent the size the inA final benefit using capabilities is that all
cached names fsearching thrinstead all y
The c
the name cache is ab
c(aband 48 bytes per cache entry.
Depending
the system,
ab
igured,
using 10-50 kil
physical memThe name cache is resident in mem
After adding the system wide name cache we reran ``ls -l''
The user time remained the same,
hThis was n
Inamei
nbut was never able t
it.
An
iled system was created and measurements
were csh
Inamei, with
namei acc
the system call time,
13% the time in the kernel,
all the machine cycles.
System time dr
rThe results are sh
D'l |192u 0' parttime% kernel D'l |192u 0' self4.2 ms/call6.2% child4.4 ms/call6.6% D'l |192u 0' t D'l 0 |0u-1v' D'l 0 |0u-1v' Table 10. Call times fInamei with b
On
ind that during the twelve
h
rname translatiStatistics
bthe large perfcaused by the high hit ratiThe name cache has a hit rate 70%-80%;
the direct
fset cache gets a hit rate 5%-15%.
The c
the twWith the additi
the twthe percentage system time devdr
rWhile the system wide cache reduces b
time in
the r
Inamei calls as well as namei itself
(since fewer directit is interesting t
system
time spent in namei itself increases even thactual time per call decreases.
This is because less thence a smaller abs
Mit can either generate an interrupt each time a character is received,
T
la silAscii terminals nan input rate less than 30 characters per secAt this input rate
they are m
ficiently handled with interrupt per character msince this generates fewer interrupts than draining the input sil the terminal multiplexWhen input is being generated by an
unctithe input rate is usually mIt is m
ficient tsince this generates fewer interrupts than handling each character
as a separate interrupt.
Since a given dialup pit is imp
ficient input m
We thereft
the sil
per-character interrupts.
At linterrupt basis, av checking each interface During peri
sustained input, the handler enables the siland starts a timer tThis timer runs less frequently than the cland is used
input.
The transiti
rdamped t
transiti
fic
(such as in netwInput characters serve tlush the silBy switching between these tw
the
checking the silwhen necessary.
In additithe cla stware interrupt after each hardware interrupt tThe stware-interrupt level p
the clneeded when timers expire an executi
ile.
Thus, the number interrupts attributable tis substantially reduced.
As systems have gr
the prhas gr
ar past 200 entries.
With large tables, linear searches must be eliminated
fr
requently used facility.
The kernel pr active and zA third list threads unused prFree slfr
r
the free list.
The number prSince the 4.2BSD release,
the kernel maintained linked lists the descendents each prThis linkage is nparents seeking the exit status children n the prIn additi
inding all descendents an exiting
pr
the prThis has been changed t
When fthe system must assign it a unique pr
ier.
The system previa new pr
ier that was nN
the system c
unused identifiers
that can be directly assigned.
Only when the set identifiers is exhausted is anscan required.
PreviPr
priPrbeen sleeping fOn systems running many prthe scheduler represented nearly 20% the system time.
Tthe scheduler has been changed trunnable prT
still get their apprtheir priSince the set runnable pr
racti the t
prthe c
inv
The hardware clat high priAs m
the clthe system schedules a l
tware interrupt ttime-critical events such as cpu scheduling and timeOften there are n
tware interrupt handler
finds nThe high pri
there are levents tif there is n
tware interrupt is nOften, the high primachine had been running at lRather than p
tware interrupt that wsthe hardware cland calls the stware clBetween these tw
the 100 stware
interrupts per sec
The file system uses a large blT
iles t
ficiently, the large blbe br
ragments, typically multiples 1024 bytes.
T
full-sized blintragments, the file system uses a best fit strategy.
Pr
iles using write 1024 bytes can f
ile system tsuccessively larger and larger fragments until it finally
gr
ull sized blThe file system still uses a best fit strategy the first time
a fragment is written.
H
irst time that the file system is ffragment it places it at the beginning a full sized blC
urther cby using up the rest the blIf the file ceases t
the blavailable f
ragments.
When creating a new file name,
the entire directtFBecause there was n
a direct
illed will increase the c file creati
ter the
illing is cThus, fafter it is cleared up.
Tthat it finds at the end a directscan t
The default am
buffer space all
pipes) has been increased tStream s
fer sizes in the bl
ield
the stat structure.
This inf
fering.
Unix din the stat structure twith The TCP maximum segment size is calculated accand interface in use; nf
On multiply-ht
ace that will be used in transmitting data packets fcSeveral bugs in the calculati
rTCP n
ails,
ICMP srate TCP streams by temp
icially small
send windrather than resending all queued data.
A send pthat decreases the number small packets f
fic [Nagle84],
pr
netwThe
packet rc
The buffer management strategy implemented by s
P has been
changed t
the increased size the s
fers
and a better tuned delayed acknR
ied t
the last
rMultiple messages send with the same destinatiPerf
leither the sender hAlsthe thrWe have
their cycles
transmitting netw
When exec-ing a new prprfragain
the newly created prThese tware nThis an argument list by a fact
ten;
the average time t
Iexec call decreased by 25%.
The kernel used t
tware event when it wanted ta prOften the pr
exiting the kernel, delaying the event trap.
At sbe selected tfinally causing the event tThe event wselecting the same prThe fix t
tware reschedule
events when saving a prThis change dcan synchr
The kernel r
Isetjmp, that saves the current system
c
registers than necessary under mBy trimming its the
13%.
The current c
d
icant Gthe C language is nbecause its rampant use unbThus, many classical analysis and selecti
register variables must be dby hand using ``exteri
when such
e.
Anis inline expansi
small
requently used rIn past Berkeley systems this has been d
Ised trun r
the r
ten
a single VAX instructiWhile this
the subrcall and return,
it did n
several arguments
tThe sed script has been replaced by a minline, that merges the pushes and pF
the C c
if (scanc(map[i], 1, 47, i - 63))
D'l |408u 0' Alternative C Language CD'l |408u 0' ccsedinline D'l |408u 0' subl3$64,_i,-(sp)subl3$64,_i,-(sp)subl3$64,_i,r5 pushl$47pushl$47mpushl$1pushl$1pushl$1 mull2$16,_i,r3mull2$16,_i,r3mull2$16,_i,r3 pushl-56(fp)[r3]pushl-56(fp)[r3]mp)[r3],r2 calls$4,_scancmtstlr0mjeqlL7mmscancr2,(r3),(r4),r5 tstlr0 jeqlL7 D'l 0 |0u-1v' D'l 0 |0u-1v' D'l 0 |0u-1v' D'l 0 |0u-1v' Table 11. Alternative inline c
Anexisting data structures in the c
the current system.
F
fer hashing was implemented when the system
typically had thirty tifty buffers.
M
fers.
C
the hash chains cten t
fers each!
The running time the l
fer management primitives was
dramatically impr
the hash table.
Intuitively, changes tpayf since they affect all prH
icant imprBy c
the libraries and utilities had never been tuned.
F
their
running time dChanging the utility trunning time by a fact
five!
Thus, while m
m
the speedups are because impr
the system.
S
the m
subsecti
UNIX pr
database management r
Idbm,
that can be used t
iles
with an external hashed index file.
The
dbm was designed tdatabase at a time. These rmultiple database files, enabling them t the passw
ile lused t
ile significantly imprtime many impthe C-shell (in d
Ils -l, etc.
The new filesystem with its larger blperf
by perf
ers rather than using
appr
fers.
The standard I/O library
aut
fer size f
ile.
SI/O
fering, hSeveral impand were buffering I/O using the
fer size,
1Kbytes; the pr
fer I/O acc
ile system blThese include the edit
ile cthe text f
The standard err
fered
tand t
r
buffers are n
lushed.
The in
sending single-byte packets thrthe netw
fering scheme errWithin a single call t
Ifprintf, all
fered tempBef
lushed and the stream is again
marked unbuffered.
As bef
fering mechanisms can be used
instead the default behavi
It is p
defeat the standard I/O library's ch
I/O buffer size by using
the setbuf call t
fer.
Because p
ault buffer size prby setbuf is 1024 bytes; this can lead, One such pr
Icat;
there are undas the system has changed much since they were
The pr
icant w
irst pr
ied was a bug
in the sysl
P pr
Isendmail
lc
acilities. Sysl
P then rec
a l
ile. Unf
Isysl
P was perf
Isync
ter each message it received, whether it was l
ile
fectiveness the
buffer cache and explained, textent, why sending mail theavy l
message recipient causing alm
sync
The hashed data base files were
installed in all mail pr
magnitude
speedup
I/bin/mail
that n
ies the c
P pra user was changed tspeedup
Next, the file l
acilities
pr
Ifl
P(2), were used in place the lThe mail system previ
Ilink and unlink in
implementing file lBecause these
y the c
directthey require synchradvantage the name cache maintained by the system.
Unlink requires that the entry be fit can be remlink requires that the directdBy c
acility in 4.2BSD is
efficient because it is all dThus, the mail system was m
ied t
ile lThis yielded an
delivering mail.
Extensive priling and tuning sendmail and
c
With the intr
the netw
acilities in 4.2BSD,
a myriad services became available, each which
required its Many these daem
ever used,
yet they lay asleep in the prsystem resRather than having many servers started at binetd was substituted.
This pr
igurati
ile
that specifies the services the system is willing tand listens fWhen a client requests service the apprand passed a service cthat require the identity their client may use the getpeername
system call; likewise gets
P may be used tind a server's l
iles.
This scheme is attractive f
With an increased numbers netwwe f
the rinSeveral changes were made in the rR
ault gateway.
This reduces the am
netw
fic and the time required
tIn additi
iled
and functi
time were The maj
aster hashing scheme,
and inline expansi
the ubiquit
uncti
Under certain circumstances, when attempts by the remtalth
Iselect call had indicated that data cThis resulted in cuser restarted This prand the
Several pe
in frequently used r
In particular the running time the string rcut in half by rewriting them using the VAX string instructiThe memmem
twCertain library r
ile input in have been cOther library r
Ifread and fwrite
have been rewritten f
ficiency.
The C-shell was cwriting a set rWhile this pr
unctiit was gr
ficient, generating up
tThe C-shell has been m
ied tfacilities directly,
cutting the number system calls per pr
.
Additi
priling
t
frequently used facilities.