CA2003342A1 - Memory management in high-performance fault-tolerant computer system - Google Patents
- Publication number
- CA2003342A1
- Authority
- CA
- Canada
- Prior art keywords
- memory
- cpus
- cpu
- global
- local
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Links
- 230000015654 memory Effects 0.000 title claims abstract description 425
- 230000001360 synchronised effect Effects 0.000 claims abstract description 32
- 238000000034 method Methods 0.000 claims description 40
- 230000000737 periodic effect Effects 0.000 claims 3
- 230000000977 initiatory effect Effects 0.000 claims 1
- 239000000872 buffer Substances 0.000 abstract description 33
- 230000006870 function Effects 0.000 abstract description 18
- 238000012546 transfer Methods 0.000 description 20
- 238000010586 diagram Methods 0.000 description 12
- 230000007246 mechanism Effects 0.000 description 10
- 230000009471 action Effects 0.000 description 8
- 230000008569 process Effects 0.000 description 8
- 230000004044 response Effects 0.000 description 7
- 230000008859 change Effects 0.000 description 4
- 238000013519 translation Methods 0.000 description 4
- 230000014616 translation Effects 0.000 description 4
- 230000002457 bidirectional effect Effects 0.000 description 3
- 230000008878 coupling Effects 0.000 description 3
- 238000010168 coupling process Methods 0.000 description 3
- 238000005859 coupling reaction Methods 0.000 description 3
- 239000013078 crystal Substances 0.000 description 3
- 230000000694 effects Effects 0.000 description 3
- 238000007667 floating Methods 0.000 description 3
- 230000032683 aging Effects 0.000 description 2
- 230000008901 benefit Effects 0.000 description 2
- 238000004891 communication Methods 0.000 description 2
- 238000002955 isolation Methods 0.000 description 2
- 238000012544 monitoring process Methods 0.000 description 2
- 230000002093 peripheral effect Effects 0.000 description 2
- 230000006833 reintegration Effects 0.000 description 2
- 238000012935 Averaging Methods 0.000 description 1
- 238000004364 calculation method Methods 0.000 description 1
- 230000015556 catabolic process Effects 0.000 description 1
- 230000002860 competitive effect Effects 0.000 description 1
- 238000010276 construction Methods 0.000 description 1
- 230000002596 correlated effect Effects 0.000 description 1
- 230000000875 corresponding effect Effects 0.000 description 1
- 238000006731 degradation reaction Methods 0.000 description 1
- 230000001066 destructive effect Effects 0.000 description 1
- 238000001914 filtration Methods 0.000 description 1
- 238000013507 mapping Methods 0.000 description 1
- 238000012545 processing Methods 0.000 description 1
- 238000005070 sampling Methods 0.000 description 1
- 238000004904 shortening Methods 0.000 description 1
- 230000002226 simultaneous effect Effects 0.000 description 1
- 238000012360 testing method Methods 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3861—Recovery, e.g. branch miss-prediction, exception handling
- G06F9/3863—Recovery, e.g. branch miss-prediction, exception handling using multiple copies of the architectural state, e.g. shadow registers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0706—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
- G06F11/0721—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment within a central processing unit [CPU]
- G06F11/0724—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment within a central processing unit [CPU] in a multiprocessor or a multi-core unit
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/1629—Error detection by comparing the output of redundant processing systems
- G06F11/1641—Error detection by comparing the output of redundant processing systems where the comparison is not performed by the redundant processing components
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/1658—Data re-synchronization of a redundant component, or initial sync of replacement, additional or spare unit
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/1666—Error detection or correction of the data by redundancy in hardware where the redundant component is memory or memory area
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/1675—Temporal synchronisation or re-synchronisation of redundant processing components
- G06F11/1683—Temporal synchronisation or re-synchronisation of redundant processing components at instruction level
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/1675—Temporal synchronisation or re-synchronisation of redundant processing components
- G06F11/1687—Temporal synchronisation or re-synchronisation of redundant processing components at event level, e.g. by interrupt or result of polling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/1675—Temporal synchronisation or re-synchronisation of redundant processing components
- G06F11/1691—Temporal synchronisation or re-synchronisation of redundant processing components using a quantum
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/18—Error detection or correction of the data by redundancy in hardware using passive fault-masking of the redundant circuits
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/18—Error detection or correction of the data by redundancy in hardware using passive fault-masking of the redundant circuits
- G06F11/183—Error detection or correction of the data by redundancy in hardware using passive fault-masking of the redundant circuits by voting, the voting not being performed by the redundant components
- G06F11/184—Error detection or correction of the data by redundancy in hardware using passive fault-masking of the redundant circuits by voting, the voting not being performed by the redundant components where the redundant components implement processing functionality
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/18—Error detection or correction of the data by redundancy in hardware using passive fault-masking of the redundant circuits
- G06F11/183—Error detection or correction of the data by redundancy in hardware using passive fault-masking of the redundant circuits by voting, the voting not being performed by the redundant components
- G06F11/184—Error detection or correction of the data by redundancy in hardware using passive fault-masking of the redundant circuits by voting, the voting not being performed by the redundant components where the redundant components implement processing functionality
- G06F11/185—Error detection or correction of the data by redundancy in hardware using passive fault-masking of the redundant circuits by voting, the voting not being performed by the redundant components where the redundant components implement processing functionality and the voting is itself performed redundantly
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/20—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
- G06F11/2017—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where memory access, memory control or I/O control functionality is redundant
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3824—Operand accessing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3836—Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
- G06F9/3851—Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution from multiple instruction streams, e.g. multistreaming
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3885—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0751—Error or fault detection not based on redundancy
- G06F11/0754—Error or fault detection not based on redundancy by exceeding limits
- G06F11/0757—Error or fault detection not based on redundancy by exceeding limits by exceeding a time limit, i.e. time-out, e.g. watchdogs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/08—Error detection or correction by redundancy in data representation, e.g. by using checking codes
- G06F11/10—Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
- G06F11/1008—Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1402—Saving, restoring, recovering or retrying
- G06F11/1405—Saving, restoring, recovering or retrying at machine instruction level
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1402—Saving, restoring, recovering or retrying
- G06F11/1415—Saving, restoring, recovering or retrying at system level
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/1675—Temporal synchronisation or re-synchronisation of redundant processing components
- G06F11/1679—Temporal synchronisation or re-synchronisation of redundant processing components at clock signal level
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/18—Error detection or correction of the data by redundancy in hardware using passive fault-masking of the redundant circuits
- G06F11/181—Eliminating the failing redundant component
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/18—Error detection or correction of the data by redundancy in hardware using passive fault-masking of the redundant circuits
- G06F11/182—Error detection or correction of the data by redundancy in hardware using passive fault-masking of the redundant circuits based on mutual exchange of the output between redundant processing components
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/20—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/20—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
- G06F11/2015—Redundant power supplies
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2201/00—Indexing scheme relating to error detection, to error correction, and to monitoring
- G06F2201/85—Active fault masking without idle spares
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11C—STATIC STORES
- G11C29/00—Checking stores for correct operation ; Subsequent repair; Testing stores during standby or offline operation
- G11C29/70—Masking faults in memories by using spares or by reconfiguring
- G11C29/74—Masking faults in memories by using spares or by reconfiguring using duplex memories, i.e. using dual copies
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Quality & Reliability (AREA)
- Software Systems (AREA)
- Multimedia (AREA)
- Hardware Redundancy (AREA)
- Multi Processors (AREA)
- Techniques For Improving Reliability Of Storages (AREA)
- Memory System Of A Hierarchy Structure (AREA)
- Memory System (AREA)
Abstract
ABSTRACT: A computer system in a fault-tolerant configuration employs three identical CPUs executing the same instruction stream, with two identical, self-checking memory modules storing duplicates of the same data. Memory references by the three CPUs are made by three separate busses connected to three separate ports of each of the two memory modules. The three CPUs are loosely synchronized, as by detecting events such as memory references and stalling any CPU ahead of the others until all execute the function simultaneously; interrupts can be synchronized by ensuring that all three CPUs implement the interrupt at the same point in their instruction stream. Memory references via the separate CPU-to-memory busses are voted at the three separate ports of each of the memory modules. I/O functions are implemented using two identical I/O busses, each of which is separately coupled to only one of the memory modules. A number of I/O processors are coupled to both I/O busses. Each CPU has its own fast cache and also a local memory not accessible by the other CPUs. A hierarchical virtual memory management arrangement for this system employs demand paging to keep the most-used data in the local memory, page-swapping with the global memory. Page swapping with disk memory is through the global memory; the global memory is used as a disk buffer and also to hold pages likely to be needed for loading to local memory. The operating system kernel is kept in local memory. A private-write area is included in the shared memory space in the memory modules to allow functions such as software voting of state information unique to CPUs. All CPUs write state information to their private-write area, then all CPUs read all the private-write areas for functions such as detecting differences in interrupt cause or the like.
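The private-write compare described in the abstract is a software-voting pattern: each CPU writes its own state word to its own slot in the shared memory, then every CPU reads all three slots and compares them. A minimal sketch in Python follows, purely illustrative — the patent defines a hardware memory layout, not this code, and `software_vote` and the slot representation are invented names:

```python
# Sketch of the software-voting pattern from the abstract: each CPU
# writes its private state word to its own slot in shared (global)
# memory, then every CPU reads all slots and compares them.
# Names and data structures are illustrative only.

NUM_CPUS = 3

def software_vote(private_write_area):
    """Given the three private-write slots (a dict keyed by CPU id),
    report whether the CPUs' state words agree, and the majority
    value if any."""
    values = [private_write_area[cpu] for cpu in range(NUM_CPUS)]
    counts = {}
    for v in values:
        counts[v] = counts.get(v, 0) + 1
    majority, votes = max(counts.items(), key=lambda kv: kv[1])
    return {
        "unanimous": votes == NUM_CPUS,
        "majority": majority if votes >= 2 else None,
        "values": values,
    }

# Example: CPU 1 recorded a different interrupt cause than the others.
area = {0: 0x10, 1: 0x20, 2: 0x10}
result = software_vote(area)
# result["majority"] is 0x10; result["unanimous"] is False
```

Because every CPU executes the identical instruction stream and uses identical addresses, each CPU can run this same comparison over all three slots and reach the same conclusion about which CPU disagrees.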
Description
RELATED CASES: This application discloses subject matter also disclosed in copending U.S. patent applications Ser. No. 282,538, 282,629, 283,139, and 283,141, filed Dec. 9, 1988, and Ser. No. 283,574, filed Dec. 13, 1988, all assigned to Tandem Computers Incorporated.
BACKGROUND OF THE INVENTION

This invention relates to computer systems, and more particularly to a memory management system used in a fault-tolerant computer having multiple CPUs.
Highly reliable digital processing is achieved in various computer architectures employing redundancy. For example, TMR (triple modular redundancy) systems may employ three CPUs executing the same instruction stream, along with three separate main memory units and separate I/O devices which duplicate functions, so if one of each type of element fails, the system continues to operate. Another fault-tolerant type of system is shown in U.S. Patent 4,228,496, issued to Katzman et al, for "Multiprocessor System", assigned to Tandem Computers Incorporated. Various methods have been used for synchronizing the units in redundant systems; for example, in said prior application Ser. No. 118,503, filed Nov. 9, 1987, by R. W. Horst, for "Method and Apparatus for
Synchronizing a Plurality of Processors", also assigned to Tandem Computers Incorporated, a method of "loose" synchronizing is disclosed, in contrast to other systems which have employed a lock-step synchronization using a single clock, as shown in U.S. Patent 4,453,215 for "Central Processing Apparatus for Fault-Tolerant Computing", assigned to Stratus Computer, Inc. A technique called "synchronization voting" is disclosed by Davies & Wakerly in "Synchronization and Matching in Redundant Systems", IEEE Transactions on Computers, June 1978, pp. 531-539. A method for interrupt synchronization in redundant fault-tolerant systems is disclosed by Yoneda et al in Proceedings of the 15th Annual Symposium on Fault-Tolerant Computing, June 1985, pp. 246-251, "Implementation of Interrupt Handler for Loosely Synchronized TMR Systems". U.S. Patent 4,644,498 for "Fault-Tolerant Real Time Clock" discloses a triple modular redundant clock configuration for use in a TMR computer system. U.S. Patent 4,733,353 for "Frame Synchronization of Multiply Redundant Computers" discloses a synchronization method using separately-clocked CPUs which are periodically synchronized by executing a synch frame.

As high-performance microprocessor devices have become available, using higher clock speeds and providing greater capabilities, such as the Intel 80386 and Motorola 68030 chips operating at 25-MHz clock rates, and as other elements of computer systems such as memory, disk drives, and the like have correspondingly become less expensive and of greater capability, the performance and cost of high-reliability processors has been required to follow the same trends. In addition, standardization on a few operating systems in the computer industry in general has vastly increased the availability of applications software, so a similar demand is made on the field of high-reliability systems; i.e., a standard operating system must be available.

It is therefore the principal object of this invention to provide an improved high-reliability computer system, particularly of the fault-tolerant type. Another object is to provide an improved redundant, fault-tolerant type of computing system, and one in which high performance and reduced cost are both possible; particularly, it is preferable that the improved system avoid the performance burdens usually associated with highly redundant systems. A further object is to provide a high-reliability computer system in which the performance, measured in reliability as well as speed and software compatibility, is improved but yet at a cost comparable to other alternatives of lower performance. An additional object is to provide a high-reliability computer system which is capable of executing an operating system which uses virtual memory management with demand paging, and having a protected (supervisory or "kernel") mode; particularly an operating system also permitting execution of multiple processes; all at a high level of performance.

SUMMARY OF THE INVENTION
In accordance with one embodiment of the invention, a computer system employs three identical CPUs typically executing the same instruction stream, and has two identical, self-checking memory modules storing duplicates of the same data. A configuration of three CPUs and two memories is therefore employed, rather than three CPUs and three memories as in the classic TMR systems. Memory references by the three CPUs are made by three separate busses connected to three separate ports of each of the two memory modules. In order to avoid imposing the performance burden of fault-tolerant operation on the CPUs themselves, and imposing the expense, complexity and timing problems of fault-tolerant clocking, the three CPUs each have their own separate and independent clocks, but are loosely synchronized, as by detecting events such as memory references and stalling any CPU ahead of the others until all execute the function simultaneously; the interrupts are also synchronized to the CPUs, ensuring that the CPUs execute the interrupt at the same point in their instruction stream. The three asynchronous memory references via
the separate CPU-to-memory busses are voted at the three separate ports of each of the memory modules at the time of the memory request, but read data is not voted when returned to the CPUs. The two memories both perform all write requests received from either the CPUs or the I/O busses, so that both are kept up-to-date, but only one memory module presents read data back to the CPUs or I/Os in response to read requests; the one memory module producing read data is designated the "primary" and the other is the back-up. Accordingly, incoming data is from only one source and is not voted. The memory requests to the two memory modules are implemented while the voting is still going on, so the read data is available to the CPUs a short delay after the last one of the CPUs makes the request. Even write cycles can be substantially overlapped because the DRAMs used for these memory modules use a large part of the write access to merely read and refresh, and if not strobed for the last part of the write cycle the read is non-destructive; therefore, a write cycle begins as soon as the first CPU makes a request, but does not complete until the last request has been received and voted good. These features of non-voted read-data returns and overlapped accesses allow fault-tolerant operation at high performance, but yet at minimum complexity and expense.
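The 2-of-3 vote performed at each memory-module port can be modeled in software terms. The sketch below is only an illustration of the decision rule — the patent implements this in bus-port hardware, and the function name and request-tuple layout are invented for the example:

```python
# Illustrative model of 2-of-3 voting at a memory-module port.
# Each CPU bus delivers an (address, command, data) request; the
# module collects all three, votes them, and executes the request
# only if at least two agree. All names here are invented.

from collections import Counter

def vote_requests(requests):
    """requests: list of three (address, command, data) tuples,
    one per CPU bus. Returns (winning_request, faulty_cpus)."""
    tally = Counter(requests)
    winner, votes = tally.most_common(1)[0]
    if votes < 2:
        raise RuntimeError("no majority: unrecoverable disagreement")
    faulty = [cpu for cpu, req in enumerate(requests) if req != winner]
    return winner, faulty

# CPU 2 presents a corrupted data word; the vote masks it.
reqs = [(0x1000, "W", 0xAB), (0x1000, "W", 0xAB), (0x1000, "W", 0xFF)]
winner, faulty = vote_requests(reqs)
# winner == (0x1000, "W", 0xAB); faulty == [2]
```

A disagreeing request is simply outvoted, so a single faulty CPU is masked without stopping the system; the no-majority case corresponds to an unrecoverable multiple fault.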
I/O functions are implemented using two identical I/O busses, each of which is separately coupled to only one of the memory modules. A number of I/O processors are coupled to both I/O busses, and I/O devices are coupled to pairs of the I/O processors but accessed by only one of the I/O processors. Since one memory module is designated primary, only the I/O bus for this module will be controlling the I/O processors, and I/O traffic between memory module and I/O is not voted. The CPUs can access the I/O processors through the memory modules (each access being voted just as the memory accesses are voted), but the I/O processors can only access the memory modules, not the CPUs; the I/O processors can only send interrupts to the CPUs, and these interrupts are collected in the memory modules before presenting to the CPUs. Thus synchronization overhead for I/O device access is not burdening the CPUs, yet fault tolerance is provided. If an I/O processor fails, the other one of the pair can take over control of the I/O devices for this I/O processor by merely changing the addresses used for the I/O device in the I/O page table maintained by the operating system. In this manner, fault tolerance and reintegration of an I/O device is possible without system shutdown, and yet without the hardware expense and performance penalty associated with voting and the like in these I/O paths.

The memory system used in the illustrated embodiment is hierarchical at several levels. Each CPU has its own cache, operating at essentially the clock speed of the CPU. Then each CPU has a local memory not accessible by the other CPUs, and virtual memory management allows the kernel of the operating system and pages for the current task to be in local memory for all three CPUs, accessible at high speed without fault-tolerance overhead such as voting or synchronizing imposed. Next is the memory module level, referred to as global memory, where voting and synchronization take place so some access-time burden is introduced; nevertheless, the speed of the global memory is much faster than disk access, so this level is used for page swapping with local memory to keep the most-used data in the fastest area, rather than employing disk for the first level of demand paging.

One of the features of the disclosed embodiment of the invention is the ability to replace faulty components, such as CPU modules or memory modules, without shutting down the system. Thus, the system is available for continuous use even though components may fail and have to be replaced. In addition, the ability to obtain a high level of fault tolerance with fewer system components, e.g., no fault-tolerant clocking needed, only two memory modules needed instead of three, voting circuits minimized, etc., means that there are fewer components to fail, and so the reliability is enhanced. That is, there are fewer
failures because there are fewer components, and when there are failures the components are isolated to allow the system to keep running, while the components can be replaced without system shut-down.

The CPUs of this system preferably use a commercially-available high-performance microprocessor chip for which operating systems such as Unix™ are available. The parts of the system which make it fault-tolerant are either transparent to the operating system or easily adapted to the operating system. Accordingly, a high-performance fault-tolerant system is provided which allows compatibility with contemporary widely-used multi-tasking operating systems and applications software.

Although the memory modules are essentially duplicates of one another, storing the same data, there is still a need in some situations to be able to store data separately by each CPU in a manner such that the data is readable by all CPUs. Of course, the CPUs of the example embodiment have local memory (not in the memory modules but instead on the CPU modules), but this local memory is not accessible by the other CPUs. Thus, according to a feature of one embodiment, an area of private-write memory is included in the shared memory area, so that unique state information can be written by each CPU then read by the others to do a compare operation, for example. The private write is accessed in a manner such that the instruction streams of the CPUs are still identical, and addresses used are identical, so the integrity of the identical code stream is maintained. Voting of data is suspended when a private-write operation is detected by the memory module, since this data may differ, but the addresses and commands are still voted. The area used for private write may be changed, or eliminated, under control of the instruction stream. Accordingly, the ability to compare unique data is provided in a flexible manner, without bypassing the synchronization and voting mechanisms, and without disturbing the identical nature of the code executed by the multiple CPUs.
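The private-write handling just described can be sketched in software terms. This is an illustrative model only: the window base and size are invented constants (in the system they are set under control of the instruction stream), and unanimity stands in for the hardware's fault-tolerant majority vote on addresses and commands.

```python
# Hypothetical private-write window; the real base and size are software-defined.
PRIVATE_BASE, PRIVATE_SIZE = 0xFF00, 0x100

def vote_write(writes):
    """writes: three (address, command, data) tuples, one per CPU.
    Addresses and commands are always voted; data voting is suspended
    when the address falls inside the private-write area."""
    if len({(a, c) for a, c, _ in writes}) != 1:
        raise RuntimeError("address/command mismatch: fault handling invoked")
    addr = writes[0][0]
    in_private = PRIVATE_BASE <= addr < PRIVATE_BASE + PRIVATE_SIZE
    if not in_private and len({d for _, _, d in writes}) != 1:
        raise RuntimeError("data mismatch outside the private-write area")
    return addr, in_private

# Ordinary write: identical data required.  Private write: data may differ.
assert vote_write([(0x10, "write", 7)] * 3) == (0x10, False)
assert vote_write([(0xFF10, "write", 1), (0xFF10, "write", 2),
                   (0xFF10, "write", 3)]) == (0xFF10, True)
```

The point the sketch makes is that the address/command check is never relaxed, so the identical instruction streams are still verified even while per-CPU data is allowed through.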
BRIEF DESCRIPTION OF THE DRAWINGS
The features believed characteristic of the invention are set forth in the appended claims. The invention itself, however, as well as other features and advantages thereof, may best be understood by reference to the detailed description of a specific embodiment which follows, when read in conjunction with the accompanying drawings, wherein:

Figure 1 is an electrical diagram in block form of a computer system according to one embodiment of the invention;
Figure 2 is an electrical schematic diagram in block form of one of the CPUs of the system of Figure 1;
Figure 3 is an electrical schematic diagram in block form of one of the microprocessor chips used in the CPU of Figure 2;
Figures 4 and 5 are timing diagrams showing events occurring in the CPU of Figures 2 and 3 as a function of time;
Figure 6 is an electrical schematic diagram in block form of one of the memory modules in the computer system of Figure 1;
Figure 7 is a timing diagram showing events occurring on the CPU-to-memory busses in the system of Figure 1;
Figure 8 is an electrical schematic diagram in block form of one of the I/O processors in the computer system of Figure 1;
Figure 9 is a timing diagram showing events vs. time for the transfer protocol between a memory module and an I/O processor in the system of Figure 1;
Figure 10 is a timing diagram showing events vs. time for execution of instructions in the CPUs of Figures 1, 2 and 3;
Figure 10a is a detail view of a part of the diagram of Figure 10;
Figures 11 and 12 are timing diagrams similar to Figure 10 showing events vs. time for execution of instructions in the CPUs of Figures 1, 2 and 3;
Figure 13 is an electrical schematic diagram in block form of the interrupt synchronization circuit used in the CPU of Figure 2;
Figures 14, 15, 16 and 17 are timing diagrams like Figures 10 or 11 showing events vs. time for execution of instructions in the CPUs of Figures 1, 2 and 3 when an interrupt occurs, illustrating various scenarios;
Figure 18 is a physical memory map of the memories used in the system of Figures 1, 2, 3 and 6;
Figure 19 is a virtual memory map of the CPUs used in the system of Figures 1, 2, 3 and 6;
Figure 20 is a diagram of the format of the virtual address and the TLB entries in the microprocessor chips in the CPU according to Figure 2 or 3;
Figure 21 is an illustration of the private memory locations in the memory map of the global memory modules in the system of Figures 1, 2, 3 and 6; and

Figure 22 is an electrical diagram of a fault-tolerant power supply used with the system of the invention according to one embodiment.
DETAILED DESCRIPTION OF SPECIFIC EMBODIMENT
With reference to Figure 1, a computer system using features of the invention is shown in one embodiment having three identical processors 11, 12 and 13, referred to as CPU-A, CPU-B and CPU-C, which operate as one logical processor, all three typically executing the same instruction stream; the only time the three processors are not executing the same instruction stream is in such operations as power-up self test, diagnostics and the like. The three processors are coupled to two memory modules 14 and 15, referred to as Memory-#1 and Memory-#2, each memory storing the same data in the same address space. In a preferred embodiment, each one of the processors 11, 12 and 13 contains its own local memory 16, as well, accessible only by the processor containing this memory.

Each one of the processors 11, 12 and 13, as well as each one of the memory modules 14 and 15, has its own separate clock oscillator 17; in this embodiment, the processors are not run in "lock step", but instead are loosely synchronized by a method such as is set forth in the above-mentioned application Ser. No. 118,503, i.e., using events such as external memory references to bring the CPUs into synchronization. External interrupts are synchronized among the three CPUs by a technique employing a set of busses 18 for coupling the interrupt requests and status from each of the processors to the other two; each one of the processors CPU-A, CPU-B and CPU-C is responsive to the three interrupt requests, its own and the two received from the other CPUs, to present an interrupt to the CPUs at the same point in the execution stream. The memory modules 14 and 15 vote the memory references, and allow a memory reference to proceed only when all three CPUs have made the same request (with provision for faults). In this manner, the processors are synchronized at the time of external events (memory references), resulting in the processors typically executing the same instruction stream, in the same sequence, but not necessarily during aligned clock
cycles in the time between synchronization events. In addition, external interrupts are synchronized to be executed at the same point in the instruction stream of each CPU.
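The voting of memory references just described can be sketched as follows. This is a software illustration of the concept, not the memory modules' actual vote-circuit logic; a simple two-of-three majority stands in for the "provision for faults".

```python
from collections import Counter

def vote(requests):
    """Vote three (address, command) memory requests, one per CPU.
    The reference proceeds only when at least two agree, so a single
    faulty CPU cannot block or corrupt the access."""
    winner, count = Counter(requests).most_common(1)[0]
    return winner if count >= 2 else None

# All three CPUs make the same request: the reference proceeds.
assert vote([(0x1000, "read")] * 3) == (0x1000, "read")
# A single disagreeing CPU is outvoted.
assert vote([(0x1000, "read"), (0x1000, "read"), (0x2000, "read")]) == (0x1000, "read")
# No majority: the reference is not allowed to proceed.
assert vote([(1, "read"), (2, "read"), (3, "read")]) is None
```

Because the CPUs are only loosely synchronized, the three requests arrive at slightly different times; the hardware latches each as it arrives and completes the vote once a majority is present.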
The CPU-A processor 11 is connected to the Memory-#1 module 14 and to the Memory-#2 module 15 by a bus 21; likewise the CPU-B is connected to the modules 14 and 15 by a bus 22, and the CPU-C is connected to the memory modules by a bus 23. These busses 21, 22, 23 each include a 32-bit multiplexed address/data bus, a command bus, and control lines for address and data strobes. The CPUs have control of these busses 21, 22 and 23, so there is no arbitration, or bus-request and bus-grant.

Each one of the memory modules 14 and 15 is separately coupled to a respective input/output bus 24 or 25, and each of these busses is coupled to two (or more) input/output processors 26 and 27. The system can have multiple I/O processors as needed to accommodate the I/O devices needed for the particular system configuration. Each one of the input/output processors 26 and 27 is connected to a bus 28, which may be of a standard configuration such as a VMEbus, and each bus 28 is connected to one or more bus interface modules 29 for interface with a standard I/O controller 30. Each bus interface module 29 is connected to two of the busses 28, so failure of one I/O processor 26 or 27, or failure of one of the bus channels 28, can be tolerated. The I/O processors 26 and 27 can be addressed by the CPUs 11, 12 and 13 through the memory modules 14 and 15, and can signal an interrupt to the CPUs via the memory modules. Disk drives, terminals with CRT screens and keyboards, and network adapters, are typical peripheral devices operated by the controllers 30. The controllers 30 may make DMA-type references to the memory modules 14 and 15 to transfer blocks of data. Each one of the I/O processors 26, 27, etc., has certain individual lines directly connected to each one of the memory modules for bus request, bus grant, etc.; these point-to-point connections
are called "radials" and are included in a group of radial lines.

A system status bus 32 is individually connected to each one of the CPUs 11, 12 and 13, to each memory module 14 and 15, and to each of the I/O processors 26 and 27, for the purpose of providing information on the status of each element. This status bus provides information about which of the CPUs, memory modules and I/O processors is currently in the system and operating properly.

An acknowledge/status bus 33 connecting the three CPUs and two memory modules includes individual lines by which the modules 14 and 15 send acknowledge signals to the CPUs when memory requests are made by the CPUs, and at the same time a status field is sent to report on the status of the command and whether it executed correctly. The memory modules not only check parity on data read from or written to the global memory, but also check parity on data passing through the memory modules to or from the I/O busses 24 and 25, as well as checking the validity of commands. It is through the status lines in bus 33 that these checks are reported to the CPUs 11, 12 and 13, so if errors occur a fault routine can be entered to isolate a faulty component.

Even though both memory modules 14 and 15 are storing the same data in global memory, and operating to perform every memory reference in duplicate, one of these memory modules is designated as primary and the other as back-up, at any given time. Memory write operations are executed by both memory modules so both are kept current, and also a memory read operation is executed by both, but only the primary module actually loads the read-data back onto the busses 21, 22 and 23, and only the primary memory module controls the arbitration for multi-master busses 24 and 25. To keep the primary and back-up modules executing the same operations, a bus 34 conveys control information from primary to back-up. Either module can assume
x-euting th- tam op-ration-, a bu- 3~ eonv-y- eontrol 3 in~or~ation ~rom primary to baek-up Either modul- ean assume . ;, .
,.
. .
`!
'~t , . :-.`, ~ .
~'`'`""'' ' ' ' , .
" ' '""' ' . "'' ' ' `"' '` ' .
'.~''" '. ~
' ' ,` '' ~ ' .
the role of primary at boot-up, and the roles can switch during operation under software control; the roles can also switch when selected error conditions are detected by the CPUs or other error-responsive parts of the system.

Certain interrupts generated in the CPUs are also voted by the memory modules 14 and 15. When the CPUs encounter such an interrupt condition (and are not stalled), they signal an interrupt request to the memory modules by individual lines in an interrupt bus 35, so the three interrupt requests from the three CPUs can be voted. When all interrupts have been voted, the memory modules each send a voted-interrupt signal to the three CPUs via bus 35. This voting of interrupts also functions to check on the operation of the CPUs. The three CPUs synch the voted interrupt signal via the inter-CPU bus 18 and present the interrupt to the processors at a common point in the instruction stream. This interrupt synchronization is accomplished without stalling any of the CPUs.

CPU Module
Referring now to Figure 2, one of the processors 11, 12 or 13 is shown in more detail. All three CPU modules are of the same construction in a preferred embodiment, so only CPU-A will be described here. In order to keep costs within a competitive range, and to provide ready access to already-developed software and operating systems, it is preferred to use a commercially-available microprocessor chip, and any one of a number of devices may be chosen. The RISC (reduced instruction set) architecture has some advantage in implementing the loose synchronization as will be described, but more-conventional CISC (complex instruction set) microprocessors such as Motorola 68030 devices or Intel 80386 devices (available in 20-MHz and 25-MHz speeds) could be used. High-speed 32-bit RISC microprocessor devices are available from several sources in three basic types; Motorola produces a device as part number 88000, MIPS Computer Systems,
Inc. and others produce a chip set referred to as the MIPS type, and Sun Microsystems has announced a so-called SPARC™ type (scalable processor architecture). Cypress Semiconductor of San Jose, California, for example, manufactures a microprocessor referred to as part number CY7C601 providing 20-MIPS (million instructions per second), clocked at 33-MHz, supporting the SPARC standard, and Fujitsu manufactures a CMOS RISC microprocessor, part number S-25, also supporting the SPARC standard.

The CPU board or module in the illustrative embodiment, used as an example, employs a microprocessor chip 40 which is in this case an R2000 device designed by MIPS Computer Systems, Inc., and also manufactured by Integrated Device Technology, Inc. The R2000 device is a 32-bit processor using RISC architecture to provide high performance, e.g., 12-MIPS at 16.67-MHz clock rate. Higher-speed versions of this device may be used instead, such as the R3000 that provides 20-MIPS at 25-MHz clock rate. The processor 40 also has a co-processor used for memory management, including a translation lookaside buffer to cache translations of logical to physical addresses. The processor 40 is coupled to a local bus having a data bus 41, an address bus 42 and a control bus 43. Separate instruction and data cache memories 44 and 45 are coupled to this local bus. These caches are each of 64K-byte size, for example, and are accessed within a single clock cycle of the processor 40. A numeric or floating point co-processor 46 is coupled to the local bus if additional performance is needed for these types of calculations; this numeric processor device is also commercially available from MIPS Computer Systems as part number R2010. The local bus 41, 42, 43 is coupled to an internal bus structure through a write buffer 50 and a read buffer 51. The write buffer is a commercially available device, part number R2020, and functions to allow the processor 40 to continue to execute Run cycles after storing data and address in the write buffer 50 for a write operation, rather than having to execute stall cycles while the write is completing.
~` Comput-r Sy~t~ a- part number R2010 Th- local buJ 41, 42, 43, coupl-d to an int-rnal bus ~tructur- through a write buffer 50 and a r-ad bu~f-r S~ Th- writ- buff-r i- a com~m rcially availabl- d-vic-, part nu~b r R2020, and function~ to allow th-proc---or 40 to continu- to x-cut- Run cycl-s aft-r toring data and addre~s in the writ- buff-r 50 for a writ- op-ration, rather ~ than having to ex-cut- stall cycl-- whil- th- writ- is ; 3S compl-ting ....
., ;~
:
.", ,:.~','' , ' .. i . . . .
. '`", ' ` ~ , . .
:~; - , . .
.. . ~ .
- ~ ,' :. .
~,'; ' ' - ..', :
In addition to the path through the write buffer 50, a path is provided to allow the processor 40 to execute write operations bypassing the write buffer 50. This path is a write buffer bypass 52 which allows the processor, under software selection, to perform synchronous writes. If the write buffer bypass 52 is enabled (write buffer 50 not enabled) and the processor executes a write, then the processor will stall until the write completes. In contrast, when writes are executed with the write buffer bypass 52 disabled the processor will not stall, because data is written into the write buffer 50 (unless the write buffer is full). If the write buffer 50 is enabled when the processor 40 performs a write operation, the write buffer 50 captures the output data from bus 41 and the address from bus 42, as well as controls from bus 43. The write buffer 50 can hold up to four such data-address sets while it waits to pass the data on to the main memory. The write buffer runs synchronously with the clock 17 of the processor chip 40, so the processor-to-buffer transfers are synchronous and at the machine cycle rate of the processor. The write buffer 50 signals the processor if it is full and unable to accept data. Read operations by the processor 40 are checked against the addresses contained in the four-deep write buffer 50, so if a read is attempted to one of the data words waiting in the write buffer to be written to memory 16 or to global memory, the read is stalled until the write is completed.

The write and read buffers 50 and 51 are coupled to an internal bus structure having a data bus 53, an address bus 54 and a control bus 55. The local memory 16 is accessed by this internal bus, and a bus interface 56 coupled to the internal bus is used to access the system bus 21 (or bus 22 or 23 for the other CPUs). The separate data and address busses 53 and 54 of the internal bus (as derived from busses 41 and 42 of the local bus) are converted to a multiplexed address/data bus 57 in the system bus 21, and the command and control lines are
correspondingly converted to command lines 58 and control lines 59 in this external bus.

The bus interface unit 56 also receives the acknowledge/status lines 33 from the memory modules 14 and 15. In these lines 33, separate status lines 33-1 or 33-2 are coupled from each of the modules 14 and 15, so the responses from both memory modules can be evaluated upon the event of a transfer (read or write) between CPUs and global memory, as will be explained.
The local memory 16, in one embodiment, comprises about 8-MByte of RAM which can be accessed in about three or four of the machine cycles of processor 40, and this access is synchronous with the clock 17 of this CPU, whereas the memory access time to the modules 14 and 15 is much greater than that to local memory, and this access to the memory modules 14 and 15 is asynchronous and subject to the synchronization overhead imposed by waiting for all CPUs to make the request and then voting. For comparison, access to a typical commercially-available disk memory through the I/O processors 26, 27 and 29 is measured in milliseconds, i.e., considerably slower than access to the modules 14 and 15. Thus, there is a hierarchy of memory access by the CPU chip 40, the highest being the instruction and data caches 44 and 45 which will provide a hit ratio of perhaps 95% when using 64-KByte cache size and suitable fill algorithms. The second highest is the local memory 16, and again by employing contemporary virtual memory management algorithms a hit ratio of perhaps 95% is obtained for memory references for which a cache miss occurs but a hit in local memory 16 is found, in an example where the size of the local memory is about 8-MByte. The net result, from the standpoint of the processor chip 40, is that perhaps greater than 99% of memory references (but not I/O references) will be synchronous and will occur in either the same machine cycle or in three or four machine cycles.
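The hit ratios quoted above imply the greater-than-99% figure directly; a small back-of-the-envelope check follows. The cycle counts for cache and local memory come from the text, but the cost of a voted global-memory access is an assumed number used purely for illustration.

```python
# Fractions of references served at each level of the hierarchy.
cache_hit, local_hit = 0.95, 0.95
t_cache, t_local, t_global = 1, 4, 20          # machine cycles; t_global assumed

frac_cache = cache_hit                         # hits in I- and D-caches 44, 45
frac_local = (1 - cache_hit) * local_hit       # cache miss, local memory 16 hit
frac_global = 1 - frac_cache - frac_local      # must go to modules 14 and 15

# Matches the text: greater than 99% of references are synchronous.
assert frac_cache + frac_local > 0.99

t_avg = frac_cache * t_cache + frac_local * t_local + frac_global * t_global
```

Under these assumptions the average reference costs only slightly more than one machine cycle, which is why the voting overhead at the global-memory level is tolerable.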
The local memory 16 is accessed from the internal bus by a memory controller 60 which receives the addresses from address bus 54, and the address strobes from the control bus 55, and generates separate row and column addresses, and RAS and CAS controls, for example, if the local memory 16 employs DRAMs with multiplexed addressing, as is usually the case. Data is written to or read from the local memory via data bus 53. In addition, several local registers 61, as well as non-volatile memory 62 such as NVRAMs, and high-speed PROMs 63, as may be used by the operating system, are accessed by the internal bus; some of this part of the memory is used only at power-on, some is used by the operating system and may be almost continuously within the cache 44, and other may be within the non-cached part of the memory map.

External interrupts are applied to the processor 40 by one of the pins of the control bus 43 or 55 from an interrupt circuit 65 in the CPU module of Figure 2. This type of interrupt is voted in the circuit 65, so that before an interrupt is executed by the processor 40 it is determined whether or not all three CPUs are presented with the interrupt; to this end, the circuit 65 receives interrupt pending inputs 66 from the other two CPUs 12 and 13, and sends an interrupt pending signal to the other two CPUs via line 67, these lines being part of the bus 18 connecting the three CPUs 11, 12 and 13 together. Also, for voting other types of interrupts, specifically CPU-generated interrupts, the circuit 65 can send an interrupt request from this CPU to both of the memory modules 14 and 15 by a line 68 in the bus 35, then receive separate voted-interrupt signals from the memory modules via lines 69 and 70; both memory modules will present the external interrupt to be acted upon. An interrupt generated in some external source such as a keyboard or disk drive on one of the I/O channels 28, for example, will not be presented to the interrupt pin of the chip 40 from the circuit 65 until each one of the CPUs 11, 12 and 13 is at the same point in the instruction stream, as will be explained.
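The all-three-pending condition enforced by the circuit 65 can be sketched as below. This is a software illustration with an invented function name; the actual mechanism is combinational logic on the pending lines 66 and 67.

```python
def interrupt_ready(own_pending, other_pending):
    """Model of circuit 65's external-interrupt vote: the interrupt is
    passed to the processor pin only once this CPU's own pending flag
    and the pending inputs from both other CPUs are all asserted."""
    return own_pending and all(other_pending)

assert not interrupt_ready(True, [True, False])   # one CPU not yet pending
assert interrupt_ready(True, [True, True])        # all three CPUs pending
```

Gating the interrupt on all three pending signals is what lets the loosely synchronized CPUs take the interrupt at a common point in the instruction stream.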
Since the processors 40 are clocked by separate clock oscillators 17, there must be some mechanism for periodically bringing the processors 40 back into synchronization. Even though the clock oscillators 17 are of the same nominal frequency, e.g., 16.67-MHz, and the tolerance for these devices is about 25-ppm (parts per million), the processors can potentially become many cycles out of phase unless periodically brought back into synch. Of course, every time an external interrupt occurs the CPUs will be brought into synch in the sense of being interrupted at the same point in their instruction stream (due to the interrupt synch mechanism), but this does not help bring the cycle count into synch. The mechanism of voting memory references in the memory modules 14 and 15 will bring the CPUs into synch (in real time), as will be explained. However, some conditions result in long periods where no memory reference occurs, and so an additional mechanism is used to introduce stall cycles to bring the processors 40 back into synch. A cycle counter 71 is coupled to the clock 17 and the control pins of the processor 40 via control bus 43 to count machine cycles which are Run cycles (but not Stall cycles). This counter 71 includes a count register having a maximum count value selected to represent the period during which the maximum allowable drift between CPUs would occur (taking into account the specified tolerance for the crystal oscillators); when this count register overflows, action is initiated to stall the faster processors until the slower processor or processors catch up. This counter 71 is reset whenever a synchronization is done by a memory reference to the memory modules 14 and 15. Also, a refresh counter 72 is employed to perform refresh cycles on the local memory 16, as will be explained. In addition, a counter 73 counts machine cycles which are Run cycles but not Stall cycles, like the counter 71 does, but this counter 73 is not reset by a memory reference; the counter 73 is used for interrupt synchronization as explained below, and to this end produces the output signals CC-4 and CC-8 to the interrupt synchronization circuit 65.
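The drift-limiting role of the counter 71 can be modeled as follows. The register size here is an invented figure, since the text only says the maximum count is chosen from the oscillator tolerance; this is a behavioral sketch, not the counter hardware.

```python
MAX_DRIFT = 4096   # assumed count-register capacity, chosen from 25-ppm tolerance

class CycleCounter:
    """Model of counter 71: it advances on Run cycles only, is cleared by
    every voted memory reference, and on overflow requests stall cycles
    so that the faster CPUs wait for the slowest one."""
    def __init__(self):
        self.count = 0

    def run_cycle(self):
        self.count += 1
        return self.count >= MAX_DRIFT      # True: initiate stall action

    def memory_reference(self):
        self.count = 0                      # voting resynchronizes the CPUs

c = CycleCounter()
for _ in range(MAX_DRIFT - 1):
    assert not c.run_cycle()
assert c.run_cycle()                        # overflow: stall the faster CPUs
c.memory_reference()
assert not c.run_cycle()                    # reset by a voted memory reference
```

The key property is that a CPU can never run more than one counter period ahead of its peers, even across long stretches with no memory references.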
The processor 40 has a RISC instruction set which does not support memory-to-memory instructions, but instead only memory-to-register or register-to-memory instructions (i.e., load or store). It is important to keep frequently-used data and the currently-executing code in local memory. Accordingly, a block-transfer operation is provided by a DMA state machine 74 coupled to the bus interface 56. The processor 40 writes a word to a register in the DMA circuit 74 to function as a command, and writes the starting address and length of the block to registers in this circuit 74. In one embodiment, the microprocessor stalls while the DMA circuit takes over and executes the block transfer, producing the necessary addresses, commands and strobes on the busses 53-55 and 21. The command executed by the processor 40 to initiate this block transfer can be a read from a register in the DMA circuit 74. Since memory management in the Unix operating system relies upon demand paging, these block transfers will most often be pages being moved between global and local memory and I/O traffic. A page is 4-KBytes. Of course, the busses 21, 22 and 23 support single-word read and write transfers between CPUs and global memory; the block transfers referred to are only possible between local and global memory.
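The block-transfer programming model of the DMA state machine 74 can be sketched like this. The class, its method names, and the dict-based memories are all invented for illustration; the word-at-a-time loop stands in for the addresses, commands and strobes the hardware produces on busses 53-55 and 21.

```python
class DmaStateMachine:
    """Sketch of the DMA circuit 74: the processor writes a start address,
    a length, and a command word, then stalls while the block moves."""
    def __init__(self, local_mem, global_mem):
        self.local, self.glob = local_mem, global_mem

    def block_transfer(self, start, length, to_global):
        src, dst = (self.local, self.glob) if to_global else (self.glob, self.local)
        for off in range(length):            # one word per transfer cycle
            dst[start + off] = src[start + off]

# Move a (tiny) "page" from local memory to global memory.
local = {addr: addr * 2 for addr in range(8)}
glob = {}
DmaStateMachine(local, glob).block_transfer(0, 4, to_global=True)
assert glob == {0: 0, 1: 2, 2: 4, 3: 6}
```

In the real system the transferred unit would typically be a 4-KByte page moved as part of demand paging, not four words.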
The Processor

Referring now to Figure 3, the R2000 or R3000 type of microprocessor 40 of the example embodiment is shown in more detail. This device includes a main 32-bit CPU 75 containing thirty-two 32-bit general purpose registers 76, a 32-bit ALU 77, a zero-to-64 bit shifter 78, and a 32-by-32 multiply/divide circuit 79. This CPU also has a program counter 80 along with associated incrementer and adder. These components are coupled to a processor bus structure 81, which is coupled to the local data bus 41 and to an instruction decoder 82 with associated control logic to execute instructions fetched via data bus 41. The 32-bit local address bus 42 is driven by a virtual memory
management arrangement including a translation lookaside buffer (TLB) 83 within an on-chip memory-management coprocessor. The TLB 83 contains sixty-four entries to be compared with a virtual address received from the microprocessor block 75 via virtual address bus 84. The low-order 16-bit part 85 of the bus 42 is driven by the low-order part of this virtual address bus 84, and the high-order part is from the bus 84 if the virtual address is used as the physical address, or is the tag entry from the TLB 83 via output 86 if virtual addressing is used and a hit occurs. The control lines 43 of the local bus are connected to pipeline and bus control circuitry 87, driven from the internal bus structure 81 and the control logic 82.

The microprocessor block 75 in the processor 40 is of the RISC type in that most instructions execute in one machine cycle, and the instruction set uses register-to-register and load/store instructions rather than having complex instructions involving memory references along with ALU operations. There are no complex addressing schemes included as part of the instruction set, such as "add the operand whose address is the sum of the contents of register A1 and register A2 to the operand whose address is found at the main memory location addressed by the contents of register B, and store the result in main memory at the location whose address is found in register C." Instead, this operation is done in a number of simple register-to-register and load/store instructions: add register A2 to register A1; load register B1 from memory location whose address is in register B; add register A1 and register B1; store register B1 to memory location addressed by register C. Optimizing compiler techniques are used to maximize the use of the thirty-two registers 76, i.e., assure that most operations will find the operands already in the register set. The load instructions actually take longer than one machine cycle, and to account for this a latency of one instruction is introduced; the data fetched by the load instruction is not used until the second cycle, and
the intervening cycle is used for some other instruction, if possible.
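The decomposition of the complex operation into simple steps can be written out concretely. This is a schematic model with dict-based registers and memory, not actual R2000 code, and it follows the four steps as the text lists them (the patent's decomposition is itself illustrative rather than a literal instruction sequence).

```python
def complex_op_as_risc(regs, mem):
    """The four simple register-to-register and load/store steps."""
    regs["A1"] = regs["A1"] + regs["A2"]   # add register A2 to register A1
    regs["B1"] = mem[regs["B"]]            # load B1 from address held in B
    regs["B1"] = regs["A1"] + regs["B1"]   # add register A1 and register B1
    mem[regs["C"]] = regs["B1"]            # store B1 at address held in C

regs = {"A1": 1, "A2": 2, "B": 0x10, "C": 0x20, "B1": 0}
mem = {0x10: 5}
complex_op_as_risc(regs, mem)
assert mem[0x20] == 8                      # (1 + 2) + 5 stored via register C
```

Each step touches at most one memory location, which is exactly the load/store discipline that lets most instructions finish in one machine cycle.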
The main CPU 75 is highly pipelined to facilitate the goal of averaging one instruction execution per machine cycle. Referring to Figure 4, a single instruction is executed over a period including five machine cycles, where a machine cycle is one clock period or 60-nsec for a 16.67-MHz clock 17. These five cycles or pipe stages are referred to as IF (instruction fetch from I-cache 44), RD (read operands from register set 76), ALU (perform the required operation in ALU 77), MEM (access D-cache 45 if required), and WB (write back ALU result to register file 76). As seen in Figure 5, these five pipe stages are overlapped so that in a given machine cycle, cycle-5 for example, instruction I#5 is in its first or IF pipe stage and instruction I#1 is in its last or WB stage, while the other instructions are in the intervening pipe stages.

Memory Module

With reference to Figure 6, one of the memory modules 14 or 15 is shown in detail. Both memory modules are of the same construction in a preferred embodiment, so only the Memory-#1 module is shown. The memory module includes three input/output ports 91, 92 and 93 coupled to the three busses 21, 22 and 23
coming from the CPUs 11, 12 and 13, respectively. Inputs to these ports are latched into registers 94, 95 and 96, each of which has separate sections to store data, address, command and strobes for a write operation, or address, command and strobes for a read operation. The contents of these three registers are voted by a vote circuit 100 having inputs connected to all sections of all three registers. If all three of the CPUs 11, 12 and 13 make the same memory request (same address, same command), as should be the case since the CPUs are typically executing the same instruction stream, then the memory request is allowed to complete; however, as soon as the first memory request is latched
into any one of the three latches 94, 95 or 96, it is passed on immediately to begin the memory access. To this end, the address, data and command are applied to an internal bus including data bus 101, address bus 102 and control bus 103. From this internal bus the memory request accesses various resources, depending upon the address, and depending upon the system configuration.

In one embodiment, a large DRAM 104 is accessed by the internal bus, using a memory controller 105 which accepts the address from address bus 102 and memory request and strobes from control bus 103 to generate multiplexed row and column addresses for the DRAM so that data input/output is provided on the data bus 101. This DRAM 104 is also referred to as global memory, and is of a size of perhaps 32-MByte in one embodiment. In addition, the internal bus 101-103 can access control and status registers 106, a quantity of non-volatile RAM 107, and write-protect RAM 108. The memory reference by the CPUs can also bypass the memory in the memory module 14 or 15 and access the I/O busses 24 and 25 by a bus interface 109 which has inputs connected to the internal bus 101-103. If the memory module is the primary memory module, a bus arbitrator 110 in each memory module controls the bus interface 109. If a memory module is the backup module, the bus 34 controls the bus interface 109.

A memory access to the DRAM 104 is initiated as soon as the first request is latched into one of the latches 94, 95 or 96, but is not allowed to complete unless the vote circuit 100 determines that a plurality of the requests are the same, with provision for faults. The arrival of the first of the three requests causes the access to the DRAM 104 to begin. For a read, the DRAM 104 is addressed, the sense amplifiers are strobed, and the data output is produced at the DRAM outputs, so if the vote is good after the third request is received then the requested data is ready for immediate transfer back to the CPUs. In this manner, voting is overlapped with DRAM access.
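The overlap of voting with the DRAM access can be sketched as follows. This is an illustrative model only: a mismatch between the early-started access and the voted address simply returns nothing here, whereas the hardware's fault handling and retry behavior is not modeled.

```python
def overlapped_read(requests, dram):
    """Model of the overlap: the DRAM access starts on the first request
    to arrive; data is returned only if the completed vote agrees with
    the address the access was started on."""
    first_addr = requests[0][0]
    data = dram.get(first_addr)            # access begins before voting ends
    addrs = [a for a, _ in requests]
    voted = max(set(addrs), key=addrs.count)
    if addrs.count(voted) < 2 or voted != first_addr:
        return None                        # vote bad: withhold the data
    return data                            # vote good: data ready at once

dram = {0x40: 99}
assert overlapped_read([(0x40, "read")] * 3, dram) == 99
assert overlapped_read([(0x40, "read"), (0x40, "read"), (0x50, "read")], dram) == 99
```

Starting the row access on the first arrival hides most of the DRAM latency behind the wait for the slower CPUs, so a good vote costs almost no extra access time.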
108 Th- m-mory reference by th- CPUa can also bypass the memory in the m-mory module 14 or 15 and acce~s th- I/O busses 24 and 25 by a bus interface 109 which ha~ inputs connected to the internal bus 101-103 If the memory module i~ th- primary memory module, a bus arbitrator 110 in each memory module controls the bus interface 109 If a memory modul- i~ the backup module, the bus 34 controls the bus int-rface 109 A memory access to th- DRAM 104 iJ initiat-d as soon as the tirst r-gu--t is latch-d into on- of the latches 94, 95 or 96, but i- not allow d to compl-t- unl--s th- vot- circuit 100 d-t-r~in-- that a plurality ot th- r-qu--ts are the sam-, with provi-ion tor taults Th- arrival ot th- tirst of th- three r-gu--t- cau--- th- access to th- DRAM 104 to ~egin For a read, th- DRAM 104 is addres--d, th- sens- ampliti-rs ar- strob-d, and th- data output i- produc-d at th- DRAM output-, so it th- vot-j is good att-r th- third r-qu--t is r-c-iv d th-n th- requ-sted '~ data is r-ady ~or imm-diat- transt-r back to th- CPUs In this mann-r, voting i~ ov-rlapped with DRAM acc-ss . .
Referring to Figure 7, the busses 21, 22 and 23 apply memory requests to ports 91, 92 and 93 of the memory modules 14 and 15 in the format illustrated. Each of these busses consists of thirty-two bidirectional multiplexed address/data lines, thirteen unidirectional command lines, and two strobes. The command lines include a field which specifies the type of bus activity, such as read, write, block transfer, single transfer, I/O read or write, etc. Also, a field functions as a byte enable for the four bytes. The strobes are AS, address strobe, and DS, data strobe. The CPUs 11, 12 and 13 each control their own bus 21, 22 or 23;
in this embodiment, these are not multi-master busses; there is no contention or arbitration. For a write, the CPU drives the address and command onto the bus in one cycle along with the address strobe AS (active low), then in a subsequent cycle (possibly the next cycle, but not necessarily) drives the data onto the address/data lines of the bus at the same time as a data strobe DS. The address strobe AS from each CPU causes the address and command then appearing at the ports 91, 92 or 93 to be latched into the address and command sections of the registers 94, 95 and 96, as these strobes appear, then the data strobe DS
causes the data to be latched. When a plurality (two out of three in this embodiment) of the busses 21, 22 and 23 drive the same memory request into the latches 94, 95 and 96, the vote circuit 100 passes on the final command to the bus 103 and the memory access will be executed; if the command is a write, an acknowledge ACK signal is sent back to each CPU by a line 112 (specifically line 112-1 for Memory#1 and line 112-2 for Memory#2) as soon as the write has been executed, and at the same time status bits are driven via acknowledge/status bus 33 (specifically lines 33-1 for Memory#1 and lines 33-2 for Memory#2) to each CPU at time T3 of Figure 7. The delay T4 between the last strobe DS (or AS if a read) and the ACK at T3 is variable, depending upon how many cycles out of synch the CPUs are at the time of the memory request, and depending upon the delay in the voting circuit and the phase of the internal
independent clock 17 of the memory module 14 or 15 compared to the CPU clocks 17. If the memory request issued by the CPUs is a read, then the ACK signal on lines 112-1 and 112-2 and the status bits on lines 33-1 and 33-2 will be sent at the same time as the data is driven to the address/data bus, during time T3; this will release the stall in the CPUs and thus synchronize the CPU chips 40 on the same instruction. That is, the fastest CPU will have executed more stall cycles as it waited for the slower ones to catch up, then all three will be released at the same time, although the clocks 17 will probably be out of phase; the first instruction executed by all three CPUs when they come out of stall will be the same instruction.

All data being sent from the memory module 14 or 15 to the CPUs 11, 12 and 13, whether the data is read data from the DRAM
104 or from the memory locations 106-108, or is I/O data from the busses 24 and 25, goes through a register 114. This register is loaded from the internal data bus 101, and an output 115 from this register is applied to the address/data lines for busses 21, 22 and 23 at ports 91, 92 and 93 at time T3. Parity is checked when the data is loaded to this register 114. All data written to the DRAM 104, and all data on the I/O busses, has parity bits associated with it, but the parity bits are not transferred on busses 21, 22 and 23 to the CPU modules. Parity errors detected at the read register 114 are reported to the CPU via the status busses 33-1 and 33-2. Only the memory module 14 or 15 designated as primary will drive the data in its register 114 onto the busses 21, 22 and 23. The memory module designated as back-up or secondary will complete a read operation all the way up to the point of loading the register 114 and checking parity, and will report status on busses 33-1 and 33-2, but no data will be driven to the busses 21, 22 and 23.
A controller 117 in each memory module 14 or 15 operates as a state machine clocked by the clock oscillator 17 for this module and receiving the various command lines from bus 103 and
busses 21-23, etc., to generate control bits to load registers and busses, generate external control signals, and the like. This controller also is connected to the bus 34 between the memory modules 14 and 15 which transfers status and control information between the two. The controller 117 in the module 14 or 15 currently designated as primary will arbitrate via arbitrator 110 between the I/O side (interface 109) and the CPU
side (ports 91-93) for access to the common bus 101-103. This decision made by the controller 117 in the primary memory module 14 or 15 is communicated to the controller 117 of the other memory module by the lines 34, and forces the other memory module to execute the same access. The controller 117 in each memory module also introduces refresh cycles for the DRAM 104, based upon a refresh counter 118 receiving pulses from the clock oscillator 17 for this module. The DRAM must receive 512 refresh cycles every 8-msec, so on average there must be a refresh cycle introduced about every 15-microsec. The counter 118 thus produces an overflow signal to the controller 117 every 15-microsec, and if an idle condition exists (no CPU access or I/O access executing) a refresh cycle is implemented by a command applied to the bus 103. If an operation is in progress, the refresh is executed when the current operation is finished. For lengthy operations such as block transfers used in memory paging, several refresh cycles may be backed up and executed in a burst mode after the transfer is completed; to this end, the number of overflows of counter 118 since the last refresh cycle are accumulated in a register associated with the counter 118.

Interrupt requests for CPU-generated interrupts are received from each CPU 11, 12 and 13 individually by lines 68 in the interrupt bus 35; these interrupt requests are sent to each memory module 14 and 15. These interrupt request lines 68 in bus 35 are applied to an interrupt vote circuit 119 which compares the three requests and produces a voted interrupt signal on
outgoing line 69 of the bus 35. The CPUs each receive a voted interrupt signal on the two lines 69 and 70 (one from each module 14 and 15) via the bus 35. The voted interrupts from each memory module 14 and 15 are ORed and presented to the interrupt synchronizing circuit 65. The CPUs, under software control, decide which interrupts to service. External interrupts, generated in the I/O processors or I/O controllers, are also signalled to the CPUs through the memory modules 14 and 15 via lines 69 and 70 in bus 35, and likewise the CPUs only respond to an interrupt from the primary module 14 or 15.
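The refresh bookkeeping described above for counter 118 reduces to simple arithmetic: 512 refresh cycles every 8 msec is one refresh per 15.625 microseconds, and overflows that arrive while an operation is in progress are accumulated and drained in a burst. A sketch of that accumulation (the class and method names are illustrative, not taken from the specification):

```python
# 512 refresh cycles every 8 ms works out to one refresh per ~15.6 us.
REFRESH_CYCLES = 512
REFRESH_PERIOD_US = 8000
INTERVAL_US = REFRESH_PERIOD_US / REFRESH_CYCLES   # 15.625

class RefreshCounter:
    """Models the register associated with counter 118: it accumulates
    overflow ticks while the bus is busy, then drains them in a burst."""
    def __init__(self):
        self.pending = 0
    def tick(self):
        # one ~15.6 us overflow signal from the clock divider
        self.pending += 1
    def burst(self):
        # bus went idle: issue all backed-up refresh cycles at once
        n, self.pending = self.pending, 0
        return n

rc = RefreshCounter()
for _ in range(4):            # a long block transfer spans four overflow periods
    rc.tick()
assert INTERVAL_US == 15.625
assert rc.burst() == 4        # four refresh cycles executed in burst mode
assert rc.burst() == 0        # nothing left pending afterwards
```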
I/O Processor

Referring now to Figure 8, one of the I/O processors 26 or 27 is shown in detail. The I/O processor has two identical ports, one port 121 to the I/O bus 24 and the other port 122 to the I/O bus 25. Each one of the I/O busses 24 and 25 consists of a 36-bit bidirectional multiplexed address/data bus 123 (containing 32-bits plus 4-bits parity), a bidirectional command bus 124 defining the read, write, block read, block write, etc., type of operation that is being executed, an address line that designates which location is being addressed, either internal to I/O processor or on busses 28, and the byte mask, and finally control lines 125 including address strobe, data strobe, address acknowledge and data acknowledge. The radial lines in bus 31 include individual lines from each I/O processor to each memory module: bus request from I/O processor to the memory modules, bus grant from the memory modules to the I/O processor, interrupt request lines from I/O processor to memory module, and a reset line from memory to I/O processor. Lines to indicate which memory module is primary are connected to each I/O processor via the system status bus 32. A controller or state machine 126 in the I/O processor of Figure 8 receives the command, control, status and radial lines and internal data, and command lines from the busses 28, and defines the internal operation of the I/O
processor, including operation of latches 127 and 128 which receive the contents of busses 24 and 25 and also hold information for transmitting onto the busses.
Transfer on the busses 24 and 25 from memory module to I/O processor uses a protocol as shown in Figure 9 with the address and data separately acknowledged. The arbitrator circuit 110 in the memory module which is designated primary performs the arbitration for ownership of the I/O busses 24 and 25. When a transfer from CPUs to I/O is needed, the CPU request is presented to the arbitration logic 110 in the memory module. When the arbiter 110 grants this request the memory modules apply the address and command to busses 123 and 124 (of both busses 24 and 25) at the same time the address strobe is asserted on bus 125 (of both busses 24 and 25) in time T1 of Figure 9; when the controller 126 has caused the address to be latched into latches 127 or 128, the address acknowledge is asserted on bus 125, then the memory modules place the data (via both busses 24 and 25) on the bus 123 and a data strobe on lines 125 in time T2, following which the controller causes the data to be latched into both latches 127 and 128 and a data acknowledge signal is placed upon the lines 125, so upon receipt of the data acknowledge, both of the memory modules release the bus 24, 25 by de-asserting the address strobe signal. The I/O processor then deasserts the address acknowledge signal.

For transfers from I/O processor to the memory module, when the I/O processor needs to use the I/O bus, it asserts a bus request by a line in the radial bus 31, to both busses 24 and 25, then waits for a bus grant signal from an arbitrator circuit 110 in the primary memory module 14 or 15, the bus grant line also being one of the radials. When the bus grant has been asserted, the controller 126 then waits until the address strobe and address acknowledge signals on busses 125 are deasserted (i.e., false) meaning the previous transfer is completed. At that time, the controller 126 causes the address to be applied from latches
127 and 128 to lines 123 of both busses 24 and 25, the command to be applied to lines 124, and the address strobe to be applied to the bus 125 of both busses 24 and 25. When address acknowledge is received from both busses 24 and 25, these are followed by applying the data to the address/data busses, along with data strobes, and the transfer is completed with a data acknowledge signal from the memory modules to the I/O processor.

The latches 127 and 128 are coupled to an internal bus 129 including an address bus 129a, a data bus 129b and a control bus 129c, which can address internal status and control registers 130 used to set up the commands to be executed by the controller state machine 126, to hold the status distributed by the bus 32, etc. These registers 130 are addressable for read or write from the CPUs in the address space of the CPUs. A bus interface 131 communicates with the VMEbus 28, under control of the controller 126. The bus 28 includes an address bus 28a, a data bus 28b, a control bus 28c, and radials 28d, and all of these lines are communicated through the bus interface modules 29 to the I/O
controllers 30; the bus interface module 29 contains a multiplexer 132 to allow only one set of bus lines 28 (from one I/O processor or the other but not both) to drive the controller 30. Internal to the controller 30 are command, control, status and data registers 133 which (as is standard practice for peripheral controllers of this type) are addressable from the CPUs 11, 12 and 13 for read and write to initiate and control operations in I/O devices.

Each one of the I/O controllers 30 on the VMEbuses 28 has connections via a multiplexer 132 in the BIM 29 to both I/O
processors 26 and 27 and can be controlled by either one, but is bound to one or the other by the program executing in the CPUs. A particular address (or set of addresses) is established for control and data-transfer registers 133 representing each controller 30, and these addresses are maintained in an I/O page table (normally in the kernel data section of local memory) by
the operating system. These addresses associate each controller 30 as being accessible only through either I/O processor #1 or #2, but not both. That is, a different address is used to reach a particular register 133 via I/O processor 26 compared to I/O processor 27. The bus interface 131 (and controller 126) can switch the multiplexer 132 to accept bus 28 from one or the other, and this is done by a write to the registers 130 of the I/O processors from the CPUs. Thus, when the device driver is called up to access this controller 30, the operating system uses these addresses in the page table to do it. The processors 40 access the controllers 30 by I/O writes to the control and data-transfer registers 133 in these controllers using the write buffer bypass path 52, rather than through the write buffer 50, so these are synchronous writes, voted by circuits 100, passed through the memory modules to the busses 24 or 25, thus to the selected bus 28; the processors 40 stall until the write is completed. The I/O processor board of Figure 8 is configured to detect certain failures, such as improper commands, time-outs where no response is received over VMEbus 28, parity-checked data if implemented, etc., and when one of these failures is detected the I/O processor quits responding to bus traffic, i.e., quits sending address acknowledge and data acknowledge as discussed above with reference to Figure 9. This is detected by the bus interface 56 as a bus fault, resulting in an interrupt as will be explained, and self-correcting action if possible.
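The per-I/O-processor address binding can be sketched as a table lookup that the operating system rewrites when it rebinds a controller. The table layout, controller names and register addresses below are hypothetical stand-ins, not the actual I/O page table format:

```python
# Hypothetical register addresses: the same controller register 133 is
# reachable at a different address through IOP 26 versus IOP 27.
io_page_table = {
    "controller_30a": {"iop26": 0xA000_0100, "iop27": 0xB000_0100},
}
binding = {"controller_30a": "iop26"}     # set up by the operating system

def register_address(ctrl):
    """Return the address the device driver must use for this controller."""
    return io_page_table[ctrl][binding[ctrl]]

def rebind(ctrl, new_iop):
    """Software reintegration: rewrite the binding after an IOP failure."""
    binding[ctrl] = new_iop

assert register_address("controller_30a") == 0xA000_0100
rebind("controller_30a", "iop27")         # IOP 26 failed; switch over
assert register_address("controller_30a") == 0xB000_0100
```

Because drivers always go through the page-table lookup, rewriting one table entry moves the controller to the other I/O processor without touching driver code, which is the point of the software fault-protection described later.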
Error Recovery:
The sequence used by the CPUs 11, 12 and 13 to evaluate responses by the memory modules 14 and 15 to transfers via busses 21, 22 and 23 will now be described. This sequence is defined by the state machine in the bus interface units 56 and in code executed by the CPUs.
In case one, for a read transfer, it is assumed that no data errors are indicated in the status bits on lines 33 from the primary memory. Here, the stall begun by the memory reference is ended by asserting a Ready signal via control bus 55 and 43 to allow instruction execution to continue in each microprocessor 40. But, another transfer is not started until acknowledge is received on line 112 from the other (non-primary) memory module (or it times out). An interrupt is posted if any error was detected in either status field (lines 33-1 or 33-2), or if the non-primary memory times out.

In case two, for a read transfer, it is assumed that a data error is indicated in the status lines 33 from the primary memory or that no response is received from the primary memory. The CPUs will wait for an acknowledge from the other memory, and if no data errors are found in status bits from the other memory, circuitry of the bus interface 56 forces a change in ownership (primary memory status), then a retry is instituted to see if data is correctly read from the new primary. If good status is received from the new primary, then the stall is ended as before, and an interrupt is posted to update the system (to note one memory bad and different memory is primary). However, if data error or timeout results from this attempt to read from the new primary, then an interrupt is asserted to the processor 40 via control bus 55 and 43.
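The two read-transfer cases reduce to a small decision procedure. The sketch below is a simplification: the status values and action names are invented labels for the fields on lines 33-1/33-2, and the retry against the new primary is folded into a single action rather than modeled in full:

```python
def evaluate_read(primary_status, other_status):
    """Return the actions taken after a voted read, per the two cases above.

    Each status is "good", "error", or "timeout" (simplified stand-ins for
    the status fields and the acknowledge time-out on line 112).
    """
    actions = []
    if primary_status == "good":                     # case one
        actions.append("end_stall")
        if other_status != "good":                   # error or timeout on backup
            actions.append("post_interrupt")
    else:                                            # case two
        if other_status == "good":
            actions += ["change_ownership",
                        "retry_from_new_primary",
                        "post_interrupt"]
        else:                                        # both sides bad
            actions.append("interrupt_processor_40")
    return actions

assert evaluate_read("good", "good") == ["end_stall"]
assert evaluate_read("good", "timeout") == ["end_stall", "post_interrupt"]
assert "change_ownership" in evaluate_read("error", "good")
assert evaluate_read("timeout", "error") == ["interrupt_processor_40"]
```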
For write transfers, with the write buffer 50 bypassed, case one is where no data errors are indicated in status bits 33-1 or 33-2 from either memory module. The stall is ended to allow instruction execution to continue. Again, an interrupt is posted if any error was detected in either status field.
For write transfers, write buffer 50 bypassed, case two is where a data error is indicated in status from the primary memory, or no response is received from the primary memory. The interface controller of each CPU waits for an acknowledge from
the other memory module, and if no data errors are found in the status from the other memory an ownership change is forced and an interrupt is posted. But if data errors or timeout occur for the other (new primary) memory module, then an interrupt is asserted to the processor 40.
For write transfers with the write buffer 50 enabled so the CPU chip is not stalled by a write operation, case one is with no errors indicated in status from either memory module. The transfer is ended, so another bus transfer can begin. But if any error is detected in either status field an interrupt is posted.

For write transfers, write buffer 50 enabled, case two is where a data error is indicated in status from the primary memory, or no response is received from the primary memory. The mechanism waits for an acknowledge from the other memory, and if no data error is found in the status from the other memory then an ownership change is forced and an interrupt is posted. But if data error or timeout occur for the other memory, then an interrupt is posted.
Once it has been determined by the mechanism just described that a memory module 14 or 15 is faulty, the fault condition is signalled to the operator, but the system can continue operating. The operator will probably wish to replace the memory board containing the faulty module, which can be done while the system is powered up and operating. The system is then able to re-integrate the new memory board without a shutdown. This mechanism also works to revive a memory module that failed to execute a write due to a soft error but then tested good so it need not be physically replaced. The task is to get the memory module back to a state where its data is identical to the other memory module. This revive mode is a two step process. First, it is assumed that the memory is uninitialized and may contain parity errors, so good data with good parity must be written into all locations; this could be all zeros at this point, but since
all writes are executed on both memories, the way this first step is accomplished is to read a location in the good memory module then write this data to the same location in both memory modules 14 and 15. This is done while ordinary operations are going on, interleaved with the task being performed. Writes originating from the I/O busses 24 or 25 are ignored by this revive routine in its first stage. After all locations have been thus written, the next step is the same as the first except that I/O accesses are also written; that is, I/O writes from the I/O busses 24 or 25 are executed as they occur in ordinary traffic in the executing task, interleaved with reading every location in the good memory and writing this same data to the same location in both memory modules. When the modules have been addressed from zero to maximum address in this second step, the memories are identical. During this second revive step, both CPUs and I/O
processors expect the memory module being revived to perform all operations without errors. The I/O processors 26, 27 will not use data presented by the memory module being revived during data read transfers. After completing the revive process the revived memory can then be (if necessary) designated primary.

A similar revive process is provided for CPU modules. When one CPU is detected faulty (as by the memory voter 100, etc.) the other two continue to operate, and the bad CPU board can be replaced without system shutdown. When the new CPU board has run its power-on self-test routines from on-board ROM 63, it signals this to the other CPUs, and a revive routine is executed. First, the two good CPUs will copy their state to global memory, then all three CPUs will execute a "soft reset" whereby the CPUs reset and start executing from their initialization routines in ROM, so they will all come up at the exact same point in their instruction stream and will be synchronized, then the saved state is copied back into all three CPUs and the task previously executing is continued.
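The two-step memory revive can be modeled as sweeps over the address space. This is a simplified sketch: Python lists stand in for DRAM contents, the interleaving with normal traffic is omitted, and the function name and arguments are illustrative:

```python
def revive(good, revived, io_writes_stage_two):
    """Two-step revive: make `revived` identical to `good`.

    `good` and `revived` are lists modeling the two modules' DRAM.  In
    step one, every location of the good module is read and written to
    both modules (I/O-originated writes are ignored by the routine).  In
    step two the same sweep runs again, but I/O writes are now applied
    to both modules as they occur in ordinary traffic.
    """
    # Step one: copy each location from the good module into the revived one.
    for addr in range(len(good)):
        revived[addr] = good[addr]
    # Step two: I/O writes land in both modules, interleaved with the sweep.
    for addr, value in io_writes_stage_two:
        good[addr] = revived[addr] = value
    for addr in range(len(good)):
        revived[addr] = good[addr]
    return good == revived

good = [1, 2, 3, 4]
revived = [0, 0, 0, 0]
assert revive(good, revived, [(2, 99)])   # modules identical at the end
assert revived == [1, 2, 99, 4]           # the I/O write survived in both
```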
As noted above, the vote circuit 100 in each memory module determines whether or not all three CPUs make identical memory references. If so, the memory operation is allowed to proceed to completion. If not, a CPU fault mode is entered. The CPU which transmits a different memory reference, as detected at the vote circuit 100, is identified in the status returned on bus 33-1 and or 33-2. An interrupt is posted and software subsequently puts the faulty CPU offline. This offline status is reflected on status bus 32. The memory reference where the fault was detected is allowed to complete based upon the two-out-of-three vote, then until the bad CPU board has been replaced the vote circuit 100 requires two identical memory requests from the two good CPUs before allowing a memory reference to proceed. The system is ordinarily configured to continue operating with one CPU off-line, but not two. However, if it were desired to operate with only one good CPU, this is an alternative available. A CPU is voted faulty by the voter circuit 100 if different data is detected in its memory request, and also by a time-out; if two CPUs send identical memory requests, but the third does not send any signals for a preselected time-out period, that CPU is assumed to be faulty and is placed off-line as before.

The I/O arrangement of the system has a mechanism for software reintegration in the event of a failure. That is, the CPU and memory module core is hardware fault-protected as just described, but the I/O portion of the system is software fault-protected. When one of the I/O processors 26 or 27 fails, the controllers 30 bound to that I/O processor by software as mentioned above are switched over to the other I/O processor by software; the operating system rewrites the addresses in the I/O page table to use the new addresses for the same controllers, and from then on these controllers are bound to the other one of the pair of I/O processors 26 or 27. The error or fault can be detected by a bus
error terminating a bus cycle at the bus interface 56, producing an exception dispatching into the kernel through an exception handler routine that will determine the
cause of the exception, and then (by rewriting addresses in the I/O table) move all the controllers 30 from the failed I/O processor 26 or 27 to the other one.

When the bus interface 56 detects a bus error as just described, the fault must be isolated before the reintegration scheme is used. When a CPU does a write, either to one of the I/O processors 26 or 27 or to one of the I/O controllers 30 on one of the busses 28 (e.g., to one of the control or status registers, or data registers, in one of the I/O elements), this is a bypass operation in the memory modules and both memory modules execute the operation, passing it on to the two I/O busses 24 and 25; the two I/O processors 26 and 27 both monitor the busses 24 and 25 and check parity and check the commands for proper syntax via the controllers 126. For example, if the CPUs are executing a write to a register in an I/O processor 26 or 27, if either one of the memory modules presents a valid address, valid command and valid data (as evidenced by no parity errors and proper protocol), the addressed I/O processor will write the data to the addressed location and respond to the memory module with an Acknowledge indication that the write was completed successfully. Both memory modules 14 and 15 are monitoring the responses from the I/O processor 26 or 27 (i.e., the address and data acknowledge signals of Figure 9, and associated status), and both memory modules respond to the CPUs with operation status on lines 33-1 and 33-2. (If this had been a read, only the primary memory module would return data, but both would return status.) Now the CPUs can determine if both executed the write correctly, or only one, or none. If only one returns good status, and that was the primary, then there is no need to force an ownership change, but if the backup returned good and the primary bad, then an ownership change is forced to make the one that executed correctly now the primary. In either case an interrupt is entered to report
the fault. At this point the CPUs do not know whether it is a memory module or something downstream of the memory modules that is bad. So, a similar write is attempted to
processor initially addressed could be hanging up a line on the bus 24 or 25, for example, and causing parity errors So, the s process can then selectively shut otf the I/o proces~ors and retry the operations, to see if both memory modules can correctly execute a write to the same I/O processor If so, the system can continue operating with the bad I/O proce-~or off-line until replaced and reintegrated But if the retry still gives bad status from one memory, the memory can be off-line, or further fault-isolation steps taken to mak- sur- the rault is in the memory and not in some other element; this can include switching all the controllers 30 to one I/O processor 26 or 27 then issuing a reset command to the off I/O processor and retry communication with the online I/O processor with both memory modules live -then if the reset I/O processor had been corrupting the bus 24 or 25 it~ bu~ drivers will have been turned off by the reset so if the retry of communication to the online I/O proce~sor (via both busses 24 and 25) now returnJ good status it iB known that the reset I/O processor was at fault In any event, for each bus error, some type of fault isolation seguence in implemented to determine which system component needs to be forced offline . . .
Synchronization The processor~ 40 u-ed in the illu-trative embodiment are of pip-lin-d archit-ctur- with overlapp-d in~truction execution, as di~cu---d abov- with r-f-r-nc- to Figure~ 4 and 5 Since a ~ynchronization techniqu- us-d in thi~ e~bodiment relies upon cycle counting, i - , incr-~ nting a count-r 71 and a counter 73 ; of Figur- 2 very tim an in~truction is executed, generally as s-t forth in application Ser No 118,503, th-r- mu~t be a definition of what con~titut-s th- exQcution of an instruction in the proc-~sor 40 A straightforward d-finition is that every tim- th- pip-line advanc-~ an instruction is executed One of .~
the control lines in the control bus 43 is a signal RUN# which indicates that the pipeline is stalled; when RUN# is high the pipeline is stalled, when RUN# is low (logic zero) the pipeline advances each machine cycle. This RUN# signal is used in the numeric processor 46 to monitor the pipeline of the processor 40 so this coprocessor 46 can run in lockstep with its associated processor 40. This RUN# signal in the control bus 43 along with the clock 17 are used by the counters 71 and 73 to count Run cycles.
The size of the counter register 71, in a preferred embodiment, is chosen to be 4096, i.e., 2^12, which is selected because the tolerances of the crystal oscillators used in the clocks 17 are such that the drift in about 4K Run cycles on average results in a skew or difference in number of cycles run by a processor chip 40 of about all that can be reasonably allowed for proper operation of the interrupt synchronization as explained below. One synchronization mechanism is to force action to cause the CPUs to synchronize whenever the counter 71 overflows. One such action is to force a cache miss in response to an overflow signal OVFL from the counter 71; this can be done by merely generating a false Miss signal (e.g., TagValid bit not set) on control bus 43 for the next I-cache reference, thus forcing a cache miss exception routine to be entered and the resultant memory reference will produce synchronization just as any memory reference does. Another method of forcing synchronization upon overflow of counter 71 is by forcing a stall in the processor 40, which can be done by using the overflow signal OVFL to generate a CP Busy (coprocessor busy) signal on control bus 43 via logic circuit 71a of Figure 2; this CP Busy signal always results in the processor 40 entering stall until CP
Busy is deasserted. All three processors will enter this stall because they are executing the same code and will count the same cycles in their counter 71, but the actual time they enter the stall will vary; the logic circuit 71a receives the RUN# signal from bus 43 of the other two processors via input R#, so when all
three have stalled the CP Busy signal is released and the processors will come out of stall in synch again.
Thus, two synchronization techniques have been described, the first being the synchronization resulting from voting the memory references in circuits 100 in the memory modules, and the second by the overflow of counter 71 as just set forth. In addition, interrupts are synchronized, as will be described below. It is important to note, however, that the processors 40 are basically running free at their own clock speed, and are substantially decoupled from one another, except when synchronizing events occur. The fact that microprocessors are used as illustrated in Figures 4 and 5 would make lock-step synchronization with a single clock more difficult, and would degrade performance; also, use of the write buffer 50 serves to decouple the processors, and would be much less effective with close coupling of the processors. Likewise, the high performance resulting from using instruction and data caches, and virtual memory management with the TLBs 83, would be more difficult to implement if close coupling were used, and performance would suffer.
The interrupt synchronization technique must distinguish between real time and so-called "virtual time". Real time is the external actual time, clock-on-the-wall time, measured in seconds, or for convenience, measured in machine cycles which are 60-nsec divisions in the example. The clock generators 17 each produce clock pulses in real time, of course. Virtual time is the internal cycle-count time of each of the processor chips 40 as measured in each one of the cycle counters 71 and 73, i.e., the instruction number of the instruction being executed by the processor chip, measured in instructions since some arbitrary beginning point. Referring to Figure 10, the relationship between real time, shown as t0 to t12, and virtual time, shown as instruction number (modulo-16 count in count register 73) I0 to I15, is illustrated. Each row of Figure 10 is the cycle count
for one of the CPUs A, B or C, and each column is a "point" in real time. The clocks for the CPUs will most likely be out of phase, so the actual time correlation will be as seen in Figure 10a, where the instruction numbers (columns) are not perfectly aligned, i.e., the cycle-count does not change on aligned real-time machine cycle boundaries; however, for explanatory purposes the illustration of Figure 10 will suffice. In Figure 10, at real time t3 the CPU-A is at the third instruction, CPU-B is at count-9 or executing the ninth instruction, and CPU-C is at the fourth instruction. Note that both real time and virtual time can only advance.
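The real-time/virtual-time distinction can be modeled by counting only the cycles in which the pipeline advanced. This is a toy model; the per-cycle RUN traces below are invented and do not reproduce the counts of Figure 10:

```python
def virtual_time(run_flags):
    """Count instructions executed (virtual time) from a per-cycle trace.

    `run_flags` is one CPU's history in real time: True where the pipeline
    advanced that machine cycle, False where it stalled.  Virtual time only
    advances on run cycles, so CPUs with different stall patterns sit at
    different instruction numbers at the same real-time instant even though
    they execute the identical code stream.
    """
    return sum(run_flags)

# After four real-time machine cycles, a CPU that stalled once is one
# instruction behind a CPU that never stalled.
cpu_a = [True, True, True, False]    # stalled on the fourth cycle
cpu_b = [True, True, True, True]     # never stalled
assert virtual_time(cpu_a) == 3
assert virtual_time(cpu_b) == 4
```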
The processor chip 40 in a CPU stalls under certain conditions when a resource is not available, such as a D-cache 45 or I-cache 44 miss during a load or an instruction fetch, a signal that the write buffer 50 is full during a store operation, a "CP Busy" signal via the control bus 43 that the coprocessor 46 is busy (the coprocessor receives an instruction it cannot yet handle due to data dependency or limited processing resources), or the multiplier/divider 79 being busy (the internal multiply/divide circuit has not completed an operation at the time the processor attempts to access the result register). Of these, the caches 44 and 45 are "passive resources" which do not change state without intervention by the processor 40, but the remainder of the items are active resources that can change state while the processor is not doing anything to act upon the resource. For example, the write buffer 50 can change from full to empty with no action by the processor (so long as the processor does not perform another store operation). So there are two types of stalls: stalls on passive resources and stalls on active resources. Stalls on active resources are called interlock stalls.
Since the code streams executing on the CPUs A, B and C are the same, the states of the passive resources such as caches 44 and 45 in the three CPUs are necessarily the same at every point
in virtual time. If a stall is a result of a conflict at a passive resource (e.g., the data cache 45) then all three processors will perform a stall, and the only variable will be the length of the stall. Referring to Figure 11, assume the cache miss occurs at I4, and that the access to the global memory 14 or 15 resulting from the miss takes eight clocks (actually it may be more than eight). In this case, CPU-C begins the access to global memory 14 and 15 at t1, and the controller 117 for global memory begins the memory access when the first processor CPU-C signals the beginning of the memory access. The controller 117 completes the access eight clocks later, at t8, although CPU-A and CPU-B each stalled less than the eight clocks required for the memory access. The result is that the CPUs become synchronized in real time as well as in virtual time. This example also illustrates the advantage of overlapping the access to DRAM 104 and the voting in circuit 100.

Interlock stalls present a different situation from passive resource stalls. One CPU can perform an interlock stall when another CPU does not stall at all. Referring to Figure 12, an interlock stall caused by the write buffer 50 is illustrated. The cycle counts for CPU-A and CPU-B are shown, and the full flags A_wb and B_wb from write buffers 50 for CPU-A and CPU-B are shown below the cycle counts (high or logic one means full, low or logic zero means empty). The CPU checks the state of the full flag every time a store operation is executed; if the full flag is set, the CPU stalls until the full flag is cleared and then completes the store operation. The write buffer 50 sets the full flag if the store operation fills the buffer, and clears the full flag whenever a store operation drains one word from the buffer, thereby freeing a location for the next CPU store operation. At time t0 the CPU-B is three clocks ahead of CPU-A, and the write buffers are both full. Assume the write buffers are performing a write operation to global memory, so when this write completes during t5 the write buffer full flags will be cleared; this clearing will occur synchronously in t6 in real time (for the reason illustrated by Figure 11) but not synchronously in virtual time. Now, assume the instruction at cycle count I6 is a store operation; CPU-A executes this store at t6 after the write buffer full flag is cleared, but CPU-B tries to execute this store operation at t3 and finds the write buffer full flag is still set and so has to stall for three clocks. Thus, CPU-B performs a stall that CPU-A did not.
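The Figure 12 interlock-stall scenario can be modeled with a short sketch. This is hypothetical Python (the patent's mechanism is hardware, and the step loop and function names here are invented): both write buffers drain at the same real-time clock, but CPU-B, running three clocks ahead in virtual time, reaches the store while its buffer is still full and stalls, while CPU-A does not.

```python
# Hypothetical model of the Figure 12 interlock stall; both CPUs run the
# same code, but only the one ahead in virtual time stalls.

def run_cpu(store_cycle, start_count, drain_time, end_time=12):
    """Step one CPU through real time t = 0..end_time-1 and return how
    many stall cycles it took. The cycle counter advances only on
    non-stall (run) cycles."""
    count, stalls = start_count, 0
    for t in range(end_time):
        buffer_full = t < drain_time        # full flag clears at drain_time
        if count == store_cycle and buffer_full:
            stalls += 1                     # stall: counter does not move
        else:
            count += 1                      # run cycle
    return stalls

# The full flags clear synchronously in real time at t6 on both CPUs.
stalls_a = run_cpu(store_cycle=6, start_count=0, drain_time=6)  # CPU-A
stalls_b = run_cpu(store_cycle=6, start_count=3, drain_time=6)  # CPU-B, 3 ahead
```

In this sketch CPU-A stalls zero clocks and CPU-B stalls three, matching the scenario described above.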
The property that one CPU may stall and the other not stall imposes a restriction on the interpretation of the cycle counter 71. In Figure 12, assume interrupts are presented to the CPUs on a cycle count of I7 (while the CPU-B is stalling from the I6 instruction). The run cycle for cycle count I7 occurs for both CPUs at t7. If the cycle counter alone presents the interrupt to the CPU, then CPU-A would see the interrupt on cycle count I7 but CPU-B would see the interrupt during a stall cycle resulting from cycle count I6, so this method of presenting interrupts would cause the two CPUs to take an exception on different instructions, a condition that would not have occurred if either all of the CPUs stalled or none stalled.

Another restriction on the interpretation of the cycle counter is that there should not be any delays between detecting the cycle count and performing an action. Again referring to Figure 12, assume interrupts are presented to the CPUs on cycle count I6, but because of implementation restrictions an extra clock delay is interposed between detection of cycle count I6 and presentation of the interrupt to the CPU. The result is that CPU-A sees this interrupt on cycle count I7, but CPU-B will see the interrupt during the stall from cycle count I6, causing the two CPUs to take an exception on different instructions. Again, the importance of monitoring the state of the instruction pipeline in real time is illustrated.
Interrupt Synchronization

The three CPUs of the system of Figures 1-3 are required to function as a single logical processor, thus requiring that the CPUs adhere to certain restrictions regarding their internal state to ensure that the programming model of the three CPUs is that of a single logical processor. Except in failure modes and in diagnostic functions, the instruction streams of the three CPUs are required to be identical. If not identical, then voting global memory accesses at voting circuitry 100 of Figure 6 would be difficult; the voter would not know whether one CPU was faulty or whether it was executing a different sequence of instructions. The synchronization scheme is designed so that if the code stream of any CPU diverges from the code stream of the other CPUs, then a failure is assumed to have occurred. Interrupt synchronization provides one of the mechanisms of maintaining a single CPU image.

All interrupts are required to occur synchronous to virtual time, ensuring that the instruction streams of the three processors CPU-A, CPU-B and CPU-C will not diverge as a result of interrupts (there are other causes of divergent instruction streams, such as one processor reading different data than the data read by the other processors). Several scenarios exist whereby interrupts occurring asynchronous to virtual time would cause the code streams to diverge. For example, an interrupt causing a context switch on one CPU before process A completes, but causing the context switch after process A completes on another CPU, would result in a situation where, at some point later, one CPU continues executing process A, but the other CPU cannot execute process A because that process had already completed. If in this case the interrupts occurred asynchronous to virtual time, then just the fact that the exception program counters were different could cause problems. The act of writing the exception program counters to global memory would result in
the voter detecting different data from the three CPUs, producing a vote fault.

Certain types of exceptions in the CPUs are inherently synchronous to virtual time. One example is a breakpoint exception caused by the execution of a breakpoint instruction. Since the instruction streams of the CPUs are identical, the breakpoint exception occurs at the same point in virtual time on all three of the CPUs. Similarly, all such internal exceptions inherently occur synchronous to virtual time. For example, TLB exceptions are internal exceptions that are inherently synchronous. TLB exceptions occur because the virtual page number does not match any of the entries in the TLB 83. Because the act of translating addresses is solely a function of the instruction stream (exactly as in the case of the breakpoint exception), the translation is inherently synchronous to virtual time. In order to ensure that TLB exceptions are synchronous to virtual time, the state of the TLBs 83 must be identical in all three of the CPUs 11, 12 and 13, and this is guaranteed because the TLB 83 can only be modified by software. Again, since all of the CPUs execute the same instruction stream, the state of the TLBs 83 is always changed synchronous to virtual time. So, as a general rule of thumb, if an action is performed by software then the action is synchronous to virtual time. If an action is performed by hardware, which does not use the cycle counters 71, then the action is generally synchronous to real time.
External exceptions are not inherently synchronous to virtual time. I/O devices 26, 27 or 30 have no information about the virtual time of the three CPUs 11, 12 and 13. Therefore, all interrupts that are generated by these I/O devices must be synchronized to virtual time before being presented to the CPUs, as explained below. Floating point exceptions are different from I/O device interrupts because the floating point coprocessor 46 is tightly coupled to the microprocessor 40 within the CPU.
External devices view the three CPUs as one logical processor, and have no information about the synchrony or lack of synchrony between the CPUs, so the external devices cannot produce interrupts that are synchronous with the individual instruction stream (virtual time) of each CPU. Without any sort of synchronization, if some external device drove an interrupt at real time t1 of Figure 10, and the interrupt was presented directly to the CPUs at this time, then the three CPUs would take an exception trap at different instructions, resulting in an unacceptable state of the three CPUs. This is an example of an event (assertion of an interrupt) which is synchronous to real time but not synchronous to virtual time.

Interrupts are synchronized to virtual time in the system of Figures 1-3 by performing a distributed vote on the interrupts and then presenting the interrupt to the processor on a predetermined cycle count. Figure 13 shows a more detailed block diagram of the interrupt synchronization logic 65 of Figure 2. Each CPU contains a distributor 135 which captures the external interrupt from the line 69 or 70 coming from the modules 14 or 15; this capture occurs on a predetermined cycle count, e.g., at count-4 as signalled on an input line CC-4 from the counter 71. The captured interrupt is distributed to the other two CPUs via the inter-CPU bus 18. These distributed interrupts are called pending interrupts. There are three pending interrupts, one from each CPU 11, 12 and 13. A voter circuit 136 captures the pending interrupts and performs a vote to verify that all of the CPUs did receive the external interrupt request. On a predetermined cycle count (detected from the cycle counter 71), in this example cycle-8 received by input line CC-8, the interrupt voter 136 presents the interrupt to the interrupt pin on its respective microprocessor 40 via line 137 and control buses 55 and 43. Since the cycle count that is used to present the interrupt is predetermined, all of the microprocessors 40 will receive the interrupt on the same cycle count and thus the interrupt will have been synchronized to virtual time.
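The distribute-then-vote sequence can be sketched as a software model. This is a hypothetical Python illustration, not the patent's logic design (the distributor 135 and voter 136 are hardware); the class and names are invented, and all three CPUs are stepped at the same cycle count, ignoring drift, so only the simple path is shown.

```python
# Hypothetical model of capture-at-CC-4, vote-and-present-at-CC-8.

CAPTURE_COUNT = 4   # CC-4 input to the distributor
VOTE_COUNT = 8      # CC-8 input to the voter

class ModelCPU:
    def __init__(self, name):
        self.name = name
        self.pending = False       # pending bit driven onto inter-CPU bus
        self.presented_at = None   # cycle count when IRQ reached the pin

    def step(self, count, ext_irq, all_pending):
        if count == CAPTURE_COUNT and ext_irq:
            self.pending = True              # capture and distribute
        if count == VOTE_COUNT and all_pending:
            self.presented_at = count        # present to the processor

cpus = [ModelCPU(n) for n in "ABC"]
for count in range(16):                      # one pass of a modulo-16 counter
    everyone = all(c.pending for c in cpus)  # the distributed vote
    for c in cpus:
        c.step(count, ext_irq=True, all_pending=everyone)
```

In this sketch every modeled CPU receives the interrupt at cycle count 8, i.e., at the same point in virtual time, even though in the real system those events fall at different points in real time.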
Figure 14 shows the sequence of events for synchronizing interrupts to virtual time. The rows labeled CPU-A, CPU-B, and CPU-C indicate the cycle count in counter 71 of each CPU at a point in real time. The rows labeled IRQ_A_PEND, IRQ_B_PEND, and IRQ_C_PEND indicate the state of the interrupt pending bits coupled via the inter-CPU bus 18 to the input of the voters 136 (a one signifies that the pending bit is set). The rows labeled IRQ_A, IRQ_B, and IRQ_C indicate the state of the interrupt input pin on the microprocessor 40 (the signals on lines 137), where a one signifies that an interrupt is present at the input pin. In Figure 14, the external interrupt (EX_IRQ) is asserted on line 69 at t0. If the interrupt distributor 135 captures and then distributes the interrupt to the inter-CPU bus 18 on cycle count 4, then IRQ_C_PEND will go active at t1, IRQ_B_PEND will go active at t2, and IRQ_A_PEND will go active at t4. If the interrupt voter 136 captures and then votes the interrupt pending bits on cycle count 8, then IRQ_C will go active at t5, IRQ_B will go active at t6, and IRQ_A will go active at t8. The result is that the interrupts were presented to the CPUs at different points in real time but at the same point in virtual time (i.e., cycle count 8).

Figure 15 illustrates a scenario which requires the algorithm presented in Figure 14 to be modified. Note that the cycle counter 71 is here represented by a modulo-8 counter. The external interrupt (EX_IRQ) is asserted at time t3, and the interrupt distributor 135 captures and then distributes the interrupt to the inter-CPU bus 18 on cycle count 4. Since CPU-B and CPU-C have executed cycle count 4 before time t3, their interrupt distributors do not capture the external interrupt. CPU-A, however, executes cycle count 4 after time t3. The result is that CPU-A captures and distributes the external interrupt at time t4. But if the interrupt voter 136 captures and votes the interrupt pending bits on cycle 7, the interrupt voter on CPU-A captures the IRQ_A_PEND signal at time t7, when the two other interrupt pending bits are not set. The interrupt voter 136 on CPU-A recognizes that not all of the CPUs have distributed the external interrupt and thus places the captured interrupt pending bit in a holding register 138. The interrupt voters 136 on CPU-B and CPU-C capture the single interrupt pending bit at times t5 and t4 respectively. Like the interrupt voter on CPU-A, these voters recognize that not all of the interrupt pending bits are set, and thus the single interrupt pending bit that is set is placed into the holding register 138. When the cycle counter 71 on each CPU reaches a cycle count of 7, the counter rolls over and begins counting at cycle count 0. Since the external interrupt is still asserted, the interrupt distributors 135 on CPU-B and CPU-C will capture the external interrupt at times t10 and t9 respectively. These times correspond to when the cycle count becomes equal to 4. At time t12, the interrupt voter on CPU-C captures the interrupt pending bits on the inter-CPU bus 18. The voter 136 determines that all of the CPUs did capture and distribute the external interrupt and thus presents the interrupt to the processor chip 40. At times t13 and t15, the interrupt voters 136 on CPU-B and CPU-A capture the interrupt pending bits and then present the interrupt to the processor chip 40. The result is that all of the processor chips received the external interrupt request at identical instructions, and the information saved in the holding registers is not needed.

Holding Register

In the interrupt scenario presented above with reference to Figure 15, the voter 136 uses a holding register 138 to save some state information. In particular, the saved state was that some, but not all, of the CPUs captured and distributed an external interrupt. If the system does not have any faults (as was the situation in Figure 15) then this state information is not
necessary because, as shown in the previous example, external interrupts can be synchronized to virtual time without the use of the holding register 138. The algorithm is that the interrupt voter 136 captures and votes the interrupt pending bits on a predetermined cycle count. When all of the interrupt pending bits are asserted, then the interrupt is presented to the processor chip 40 on the predetermined cycle count. In the example of Figure 15, the interrupts were voted on cycle count 7.

Referring to Figure 15, if CPU-C fails and the failure mode is such that the interrupt distributor 135 does not function correctly, then if the interrupt voters 136 waited until all of the interrupt pending bits were set before presenting the interrupt to the processor chip 40, the result would be that the interrupt would never get presented. Thus, a single fault on a single CPU would render the entire interrupt chain on all of the CPUs inoperable.

The holding register 138 provides a mechanism for the voter 136 to know that the last interrupt vote cycle captured at least one, but not all, of the interrupt pending bits. The interrupt vote cycle occurs on the cycle count that the interrupt voter captures and votes the interrupt pending bits. There are only two scenarios that result in some of the interrupt pending bits being set. One is the scenario presented in reference to Figure 15 in which the external interrupt is asserted before the interrupt distribution cycle on some of the CPUs but after the interrupt distribution cycle on other CPUs. In the second scenario, at least one of the CPUs fails in a manner that disables the interrupt distributor. If the reason that only some of the interrupt pending bits are set at the interrupt vote cycle is the first scenario, then the interrupt voter is guaranteed that all of the interrupt pending bits will be set on the next interrupt vote cycle. Therefore, if the interrupt voter discovers that the holding register has been set and not all of the interrupt pending bits are set, then an error must exist on
one or more of the CPUs. This assumes that the holding register 138 of each CPU gets cleared when an interrupt is serviced, so that the state of the holding register does not represent stale state on the interrupt pending bits. In the case of an error, the interrupt voter 136 can present the interrupt to the processor chip 40 and simultaneously indicate that an error has been detected in the interrupt synchronization logic.
The interrupt voter 136 does not actually do any voting but instead merely checks the state of the interrupt pending bits and the holding register 138 to determine whether or not to present an interrupt to the processor chip 40 and whether or not to indicate an error in the interrupt logic.
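The voter's decision rule described above can be summarized in a truth-table style sketch. This is a hypothetical Python function (the patent's voter is hardware, and the function name and return convention are invented): all pending bits set means present the interrupt; some set means remember that fact in the holding register; holding register already set with still-incomplete pending bits means a distributor on some CPU has failed, so present the interrupt and flag the error.

```python
# Hypothetical model of the voter-136 decision rule with holding register.

def vote(pending_bits, holding):
    """Return (present, error, new_holding) for one interrupt vote cycle."""
    if all(pending_bits):
        return True, False, False          # normal presentation
    if any(pending_bits):
        if holding:                        # stale partial vote: a fault
            return True, True, False       # present and indicate the error
        return False, False, True          # wait one more vote cycle
    return False, False, holding           # nothing pending this cycle

# Figure 15 flavor: a partial vote sets the holding register, and the
# complete vote on the next cycle presents the interrupt normally.
assert vote([True, False, False], holding=False) == (False, False, True)
assert vote([True, True, True], holding=True) == (True, False, False)
```

The fault case is the remaining branch: a partial vote while the holding register is already set yields both a presentation and an error indication.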
Modulo Cycle Counters
The interrupt synchronization example of Figure 15 represented the interrupt cycle counter 71 as a modulo-N counter (e.g., a modulo-8 counter). Using a modulo-N cycle counter simplified the description of the interrupt voting algorithm by allowing the concept of an interrupt vote cycle. With a modulo-N cycle counter, the interrupt vote cycle can be described as a single cycle count which lies between 0 and N-1, where N is the modulo of the cycle counter. Whatever value of cycle counter is chosen for the interrupt vote cycle, that cycle count is guaranteed to occur every N cycle counts; as illustrated in Figure 15 for a modulo-8 counter, every eight counts an interrupt vote cycle occurs. The interrupt vote cycle is used here merely to illustrate the periodic nature of a modulo-N cycle counter. Any event that is keyed to a particular cycle count of a modulo-N cycle counter is guaranteed to occur every N cycle counts. Obviously, an infinite (i.e., non-repeating) counter 71 could not be used.
A value of N is chosen to maximize system parameters that have a positive effect on the system and to minimize system parameters that have a negative effect on the system. Some of such effects are developed empirically. First, some of the parameters will be described; Cv and Cd are the interrupt vote cycle and the interrupt distribution cycle respectively (in the circuit of Figure 13 these are the inputs CC-8 and CC-4, respectively). The values of Cv and Cd must lie in the range between 0 and N-1, where N is the modulo of the cycle counter. Dmax is the maximum amount of cycle count drift between the three processors CPU-A, -B and -C that can be tolerated by the synchronization logic. The processor drift is determined by taking a snapshot of the cycle counter 71 from each CPU at a point in real time. The drift is calculated by subtracting the cycle count of the slowest CPU from the cycle count of the fastest CPU, performed as modulo-N subtraction. The value of Dmax is described as a function of N and the values of Cv and Cd.

First, Dmax will be defined as a function of the difference Cv - Cd, where the subtraction operation is performed as modulo-N subtraction. This allows us to choose values of Cv and Cd that maximize Dmax. Consider the scenario in Figure 16. Suppose that Cd = 8 and Cv = 9. From Figure 16 the processor drift can be calculated to be Dmax = 4. The external interrupt on line 69 is asserted at time t4. In this case, CPU-A will capture and distribute the interrupt at time t4. CPU-B will then capture and vote the interrupt pending bits at time t6. This scenario is inconsistent with the interrupt synchronization algorithm presented earlier because CPU-B executes its interrupt vote cycle before CPU-A has performed the interrupt distribution cycle. The flaw with this scenario is that the processors have drifted further apart than the difference between Cv and Cd. The relationship can be formally written as

Equation (1): Cv - Cd < Dmax - e
where e is the time needed for the interrupt pending bits to propagate on the inter-CPU bus 18. In previous examples, e has been assumed to be zero. Since wall-clock time has been quantized in clock cycle (Run cycle) increments, e can also be quantized. Thus the equation becomes

Equation (2): Cv - Cd < Dmax

where Dmax is expressed as an integer number of cycle counts.

Next, the maximum drift can be described as a function of N. Figure 17 illustrates a scenario in which N = 4 and the processor drift D = 3. Suppose that Cd = 0. The subscripts on cycle count 0 of each processor denote the quotient part (Q) of the instruction cycle count. Since the cycle count is now represented in modulo N, the value of the cycle counter is the remainder portion of I/N, where I is the number of instructions that have been executed since time t0. The Q of the instruction cycle count is the integer portion of I/N. If the external interrupt is asserted at time t3, then CPU-A will capture and distribute the interrupt at time t4, and CPU-B will execute its interrupt distribution cycle at time t5. This presents a problem because the interrupt distribution cycle for CPU-A has Q=1 and the interrupt distribution cycle for CPU-B has Q=2. The synchronization logic will continue as if there are no problems and will thus present the interrupt to the processors on equal cycle counts. But the interrupt will be presented to the processors on different instructions because the Q of each processor is different. The relationship of Dmax as a function of N is therefore

Equation (3): N/2 > Dmax

where N is an even number and Dmax is expressed as an integer number of cycle counts. (These equations 2 and 3 can be shown to be both equivalent to the Nyquist theorem in sampling theory.) Combining equations 2 and 3 gives

Equation (4): Cv - Cd < N/2 - 1

which allows optimum values of Cv and Cd to be chosen for a given value of N.
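The modulo-N arithmetic used in these relationships can be checked numerically. The following is a small illustrative Python sketch (the hardware keeps these quantities in counters and comparators; the function names are invented):

```python
# Illustrative check of modulo-N cycle-count arithmetic and Equation (4).

def mod_sub(a, b, N):
    """Modulo-N subtraction of cycle counts, as used to compute the
    drift between snapshots of the fastest and slowest CPU."""
    return (a - b) % N

def gap_ok(Cv, Cd, N):
    """Equation (4): the vote-to-distribute gap Cv - Cd, taken modulo N,
    must be less than N/2 - 1."""
    return mod_sub(Cv, Cd, N) < N / 2 - 1

# Drift computed across a counter rollover: count 1 minus count 7, mod 8.
assert mod_sub(1, 7, 8) == 2
# The preferred embodiment (N = 16, Cv = 8, Cd = 4): a gap of 4 < 7.
assert gap_ok(8, 4, 16)
# A gap of 8 would violate the bound for N = 16.
assert not gap_ok(12, 4, 16)
```

The modulo subtraction is what lets the comparison work even when one counter has rolled over and the other has not.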
All of the above equations suggest that N should be as large as possible. The only factor that tries to drive N to a small number is interrupt latency. Interrupt latency is the time interval between the assertion of the external interrupt on line 69 and the presentation of the interrupt to the microprocessor chip on line 137. Which processor should be used to determine the interrupt latency is not a clear-cut choice. The three microprocessors will operate at different speeds because of the slight differences in the crystal oscillators in clock sources 17 and other factors. There will be a fastest processor, a slowest processor, and the other processor. Defining the interrupt latency with respect to the slowest processor is reasonable because the performance of the system is ultimately determined by the performance of the slowest processor. The maximum interrupt latency is

Equation (5): Lmax = 2N - 1

where Lmax is the maximum interrupt latency expressed in cycle counts. The maximum interrupt latency occurs when the external interrupt is asserted after the interrupt distribution cycle Cd of the fastest processor but before the interrupt distribution cycle Cd of the slowest processor. The calculation of the average interrupt latency Lave is more complicated because it depends on the probability that the external interrupt occurs after the interrupt distribution cycle of the fastest processor and before the interrupt distribution cycle of the slowest processor. This probability depends on the drift between the processors, which in turn is determined by a number of external factors. If we assume that these probabilities are zero, then the average latency may be expressed as

Equation (6): Lave = N/2 + (Cv - Cd)

Using these relationships, values of N, Cv, and Cd are chosen using the system requirement for Dmax and interrupt latency. For example, choosing N = 128 and (Cv - Cd) = 10, Lave = 74 or about 4.4 microseconds (with no stall cycles). Using the preferred
embodiment where a four-bit (four binary stage) counter 71a is used as the interrupt synch counter, and the distribute and vote outputs are at CC-4 and CC-8 as discussed, it is seen that N = 16, Cv = 8 and Cd = 4, so Lave = 16/2 + (8-4) = 12 cycles or 0.7 microsecond.
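The latency arithmetic of Equations (5) and (6) can be reproduced directly. This is an illustrative Python sketch (the constant and function names are invented; the 60-nsec machine cycle is the example value from the text):

```python
# Illustrative check of Equations (5) and (6) and the worked examples.

CYCLE_NS = 60  # 60-nsec machine cycle of the example embodiment

def max_latency(N):
    """Equation (5): maximum interrupt latency, in cycle counts."""
    return 2 * N - 1

def avg_latency(N, gap):
    """Equation (6): average latency for a vote-to-distribute gap
    Cv - Cd, assuming the boundary probabilities are zero."""
    return N // 2 + gap

# N = 128 with Cv - Cd = 10: 74 cycle counts, about 4.4 microseconds.
assert avg_latency(128, 10) == 74
assert avg_latency(128, 10) * CYCLE_NS == 4440     # nanoseconds
# Preferred embodiment, N = 16 with Cv = 8 and Cd = 4: 12 cycles (~0.7 us).
assert avg_latency(16, 8 - 4) == 12
assert max_latency(16) == 31
```

Both worked examples from the text check out: 74 cycles at 60 nsec is 4.44 microseconds, and 12 cycles is 0.72 microsecond.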
Refresh Control for Local Memory
The refresh counter 72 counts non-stall cycles (not machine cycles) just as the counters 71 and 71a count. The object is that the refresh cycles will be introduced for each CPU at the same cycle count, measured in virtual time rather than real time. Preferably, each one of the CPUs will interpose a refresh cycle at the same point in the instruction stream as the other two. The DRAMs in local memory 16 must be refreshed on a 512-cycles-per-8-msec schedule, just as mentioned above regarding the DRAMs 104 of the global memory. Thus, the counter 72 could issue a refresh command to the DRAMs 16 once every 15 microseconds, addressing one row of 512, so the refresh specification would be satisfied; if a memory operation was requested during refresh then a Busy response would result until refresh was finished. But letting each CPU handle its own local memory refresh in real time independently of the others could cause the CPUs to get out of synch, and so additional control is needed. For example, if refresh mode is entered just as a divide operation is beginning, the timing is such that one CPU could take two clocks longer than the others; or, if a non-interruptable sequence was entered by a faster CPU and the others went into refresh before entering this routine, the CPUs could walk away from one another. However, using the cycle counter 71 (instead of real time) to avoid some of these problems means that stall cycles are not counted, and if a loop is entered causing many stalls (some can cause a 7-to-1 stall-to-run ratio) then the refresh specification is not met unless the period is decreased substantially from the 15-microsecond figure, but that would degrade performance. For this reason, stall cycles are also counted in a second counter 72a,
seen in Figure 2, and every time this counter reaches the same number as that counted in the refresh counter 72, an additional refresh cycle is introduced. For example, the refresh counter 72 counts 2^8 or 256 Run cycles, in step with the counter 71, and when it overflows a refresh is signalled via control bus 43. Meanwhile, counter 72a counts 2^8 stall cycles (responsive to the RUN# signal and clock 17), and every time it overflows a second counter 72b is incremented (counter 72b may be merely bits 9-to-11 for the eight-bit counter 72a), so when a refresh mode is finally entered the CPU does a number of additional refreshes indicated by the number in the counter register 72b. Thus, if a long period of stall-intensive execution is encountered, the average number of refreshes will stay in the one-per-15-microsecond range, even if up to 7x256 stall cycles are interposed, because when finally going into a refresh mode the number of rows refreshed will catch up to the nominal refresh rate, yet there is no degradation of performance by arbitrarily shortening the refresh cycle.

Memory Management

The CPUs 11, 12 and 13 of Figures 1-3 have memory space organized as illustrated in Figure 18. Using the example that the local memory 16 is 8-MByte and the global memory 14 or 15 is 32-MByte, note that the local memory 16 is part of the same continuous zero-to-40M map of CPU memory access space, rather than being a cache or a separate memory space; realizing that the 0-8M section is triplicated (in the three CPU modules), and the 8-40M section is duplicated, nevertheless logically there is merely a single 0-40M physical address space. An address over 8-MByte on bus 54 causes the bus interface 56 to make a request to the memory modules 14 and 15, but an address under 8-MByte will access the local memory 16 within the CPU module itself.
Performance is improved by placing more of the memory used by the applications being executed in local memory 16, and so as memory
chips are available in higher densities at lower cost and higher speeds, additional local memory will be added, as well as additional global memory. For example, the local memory might be 32-MByte and the global memory 128-MByte. On the other hand, if a very minimum-cost system is needed, and performance is not a major determining factor, the system can be operated with no local memory, all main memory being in the global memory area (in memory modules 14 and 15), although the performance penalty is high for such a configuration.

The content of local memory portion 141 of the map of Figure 18 is identical in the three CPUs 11, 12 and 13. Likewise, the two memory modules 14 and 15 contain identically the same data in their space 142 at any given instant. Within the local memory portion 141 is stored the kernel 143 (code) for the Unix operating system, and this area is physically mapped within a fixed portion of the local memory 16 of each CPU. Likewise, kernel data is assigned a fixed area 144 in each local memory 16; except upon boot-up, these blocks do not get swapped to or from global memory or disk. Another portion 145 of local memory 16 is employed for user program (and data) pages, which are swapped to area 146 of the global memory 14 and 15 under control of the operating system. The global memory area 142 is used as a staging area for user pages in area 146, and also as a disk buffer in an area 147; if the CPUs are executing code which performs a write of a block of data or code from local memory 16 to disk 148, then the sequence is to always write to a disk buffer area 147 instead, because the time to copy to area 147 is negligible compared to the time to copy directly to the I/O processors 26 and 27 and thus via I/O controller 30 to disk 148. Then, while the CPUs proceed to execute other code, the write-to-disk operation is done, transparent to the CPUs, to move the block from area 147 to disk 148. In a like manner, the global memory area 146 is mapped to include an I/O staging area 149, for similar treatment of I/O accesses other than disk (e.g., video).
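The address decode implied by Figure 18 can be sketched briefly. This is a hypothetical Python illustration using the example sizes from the text (the actual decode is done by bus interface 56 in hardware; the constants and the route() function are invented names):

```python
# Hypothetical sketch of the 0-40M physical address decode of Figure 18.

MBYTE = 1 << 20
LOCAL_TOP = 8 * MBYTE     # local memory 16: 0-8M, triplicated per CPU
GLOBAL_TOP = 40 * MBYTE   # global memory 14/15: 8-40M, duplicated

def route(addr):
    """Decide which memory services a physical address on bus 54."""
    if addr < LOCAL_TOP:
        return "local"    # satisfied inside the CPU module
    if addr < GLOBAL_TOP:
        return "global"   # bus interface 56 requests modules 14 and 15
    raise ValueError("outside the 0-40M physical address space")

assert route(1 * MBYTE) == "local"    # e.g., kernel area in portion 141
assert route(9 * MBYTE) == "global"   # e.g., staging area 146
```

The point of the single flat map is that software sees one 0-40M physical space; only the decode differs, with local references never leaving the CPU module.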
The physical memory map of Figure 18 is correlated with the virtual memory management system of the processor 40 in each CPU.
Figure 19 illustrates the virtual address map of the R2000 processor chip used in the example embodiment, although it is understood that other microprocessor chips supporting virtual memory management with paging and a protection mechanism would provide corresponding features. In Figure 19, two separate 2-GByte virtual address spaces 150 and 151 are illustrated; the processor 40 operates in one of two modes, user mode and kernel mode. The processor can only access the area 150 in the user mode, or can access both the areas 150 and 151 in the kernel mode. The kernel mode is analogous to the supervisory mode provided in many machines. The processor 40 is configured to operate normally in the user mode until an exception is detected forcing it into the kernel mode, where it remains until a restore from exception (RFE) instruction is executed. The manner in which the memory addresses are translated or mapped depends upon the operating mode of the microprocessor, which is defined by a bit in a status register. When in the user mode, a single, uniform virtual address space 150 referred to as "kuseg" of 2-GByte size is available. Each virtual address is also extended with a 6-bit process identifier (PID) field to form unique virtual addresses for up to sixty-four user processes. All references to this segment 150 in user mode are mapped through the TLB 83, and use of the caches 44 and 45 is determined by bit settings for each page entry in the TLB
entries; some pages may be cachable and some not, as specified by the programmer.
When in the kernel mode, the virtual address space includes both the areas 150 and 151 of Figure 19, and this space has four separate segments: kuseg 150, kseg0 152, kseg1 153 and kseg2 154. The kuseg 150 segment for the kernel mode is 2-GByte in size, coincident with the "kuseg" of the user mode, so when in the kernel mode the processor treats references to this segment just
like user mode references, thus streamlining kernel access to user data. The kuseg 150 is used to hold user code and data, but the operating system often needs to reference this same code or data. The kseg0 area 152 is a 512-MByte kernel physical address space direct-mapped onto the first 512-MBytes of physical address space, and is cached but does not use the TLB 83; this segment is used for kernel executable code and some kernel data, and is represented by the area 143 of Figure 18 in local memory 16. The kseg1 area 153 is also directly mapped into the first 512-MByte of physical address space, the same as kseg0, and is uncached and uses no TLB entries. Kseg1 differs from kseg0 only in that it is uncached. Kseg1 is used by the operating system for I/O
registers, ROM code and disk buffers, and so corresponds to areas 147 and 149 of the physical map of Figure 18. The kseg2 area 154 is a 1-GByte space which, like kuseg, uses TLB 83 entries to map virtual addresses to arbitrary physical ones, with or without caching. This kseg2 area differs from the kuseg area 150 only in that it is not accessible in the user mode, but instead only in the kernel mode. The operating system uses kseg2 for stacks and per-process data that must remap on context switches, for user page tables (memory map), and for some dynamically-allocated data areas. Kseg2 allows selective caching and mapping on a per-page basis, rather than requiring an all-or-nothing approach. The 32-bit virtual addresses generated in the registers 76 or PC 80 of the microprocessor chip and output on the bus 84 are represented in Figure 20, where it is seen that bits 0-11 are the offset, used unconditionally as the low-order 12 bits of the address on bus 42 of Figure 3, while bits 12-31 are the VPN or virtual page number, in which bits 29-31 select between kuseg, kseg0, kseg1 and kseg2. The process identifier PID for the currently-executing process is stored in a register also accessible by the TLB. The 64-bit TLB entries are represented in Figure 20 as well, where it is seen that the 20-bit VPN from the virtual address is compared to the 20-bit VPN field located in bits 44-63 of the 64-bit entry, while at the same time the PID is
compared to bits 38-43; if a match is found in any of the sixty-four 64-bit TLB entries, the page frame number PFN at bits 12-31 of the matched entry is used as the output via busses 82 and 42 of Figure 3 (assuming other criteria are met). Other one-bit values in a TLB entry include N, D, V and G. N is the non-cachable indicator, and if set the page is non-cachable and the processor directly accesses local memory or global memory instead of first accessing the cache 44 or 45. D is a write-protect bit, and if set means that the location is "dirty" and therefore writable, but if zero a write operation causes a trap. The V bit means valid if set, and allows the TLB entries to be cleared by merely resetting the valid bits; this V bit is used in the page-swapping arrangement of this system to indicate whether a page is in local or global memory. The G bit is to allow global accesses which ignore the PID match requirement for a valid TLB
translation; in kseg2 this allows the kernel to access all mapped data without regard for PID.
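The TLB lookup described above can be exercised with a short sketch. The VPN (bits 44-63), PID (bits 38-43) and PFN (bits 12-31) field positions follow the text; the positions chosen here for the N, D, V and G flag bits are an assumption, since the text does not state them.

```python
# Sketch of the 64-bit TLB entry match described above.  Field positions
# for VPN, PID and PFN follow the text; the flag-bit positions (N, D, V,
# G) are arbitrary choices for illustration only.

def make_tlb_entry(vpn, pid, pfn, n=0, d=0, v=1, g=0):
    assert vpn < 1 << 20 and pid < 1 << 6 and pfn < 1 << 20
    return (vpn << 44) | (pid << 38) | (pfn << 12) | (n << 3) | (d << 2) | (v << 1) | g

def tlb_lookup(entries, vaddr, current_pid):
    vpn = (vaddr >> 12) & 0xFFFFF            # bits 12-31 of the virtual address
    for e in entries:
        e_vpn = (e >> 44) & 0xFFFFF          # 20-bit VPN field, bits 44-63
        e_pid = (e >> 38) & 0x3F             # 6-bit PID field, bits 38-43
        g = e & 1                            # G bit ignores the PID match
        if e_vpn == vpn and (g or e_pid == current_pid):
            if not (e >> 1) & 1:             # V bit clear: page not in local memory
                raise LookupError("entry invalid - initiate a page swap")
            pfn = (e >> 12) & 0xFFFFF        # 20-bit PFN field, bits 12-31
            return (pfn << 12) | (vaddr & 0xFFF)   # PFN plus 12-bit offset
    raise LookupError("TLB miss - walk the page tables")

entries = [make_tlb_entry(vpn=0x00123, pid=5, pfn=0x00456)]
assert tlb_lookup(entries, 0x00123ABC, current_pid=5) == 0x00456ABC
```

Note how clearing only the V bit (rather than the whole entry) is enough to force the exception used by the page-swapping arrangement.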
The device controllers 30 cannot do DMA into local memory 16 directly, and so the global memory is used as a staging area for DMA-type block transfers, typically from disk 148 or the like. The CPUs can perform operations directly at the controllers 30, to initiate or actually control operations by the controllers (i.e., programmed I/O), but the controllers 30 cannot do DMA
except to global memory; the controllers 30 can become the VMEbus (bus 28) master and through the I/O processor 26 or 27 do reads or writes directly to global memory in the memory modules 14 and 15.
Page swapping between global and local memories (and disk) is initiated either by a page fault or by an aging process. A
page fault occurs when a process is executing and attempts to execute from or access a page that is in global memory or on disk; the TLB 83 will show a miss and a trap will result, so low-level trap code in the kernel will show the location of the page, and a routine will be entered to initiate a page swap. If the
page needed is in global memory, a series of commands are sent to the DMA controller 74 to write the least-recently-used page from local memory to global memory and to read the needed page from global to local. If the page is on disk, commands and addresses (sectors) are written to the controller 30 from the CPU to go to disk and acquire the page, then the process which made the memory reference is suspended. When the disk controller has found the data and is ready to send it, an interrupt is signalled which will be used by the memory modules (not reaching the CPUs) to allow the disk controller to begin a DMA to global memory to write the page into global memory, and when finished the CPU is interrupted to begin a block transfer under control of DMA
controller 74 to swap a least-used page from local to global and read the needed page to local. Then, the original process is made runnable again, state is restored, and the original memory reference will again occur, finding the needed page in local memory. The other mechanism to initiate page swapping is an aging routine by which the operating system periodically goes through the pages in local memory, marking them as to whether or not each page has been used recently, and those that have not are subject to be pushed out to global memory. A task switch does not itself initiate page swapping, but instead as the new task begins to produce page faults, pages will be swapped as needed, and the candidates for swapping out are those not recently used.
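Both swap triggers — a fault on a page held in global memory, and the periodic aging pass that clears usage marks so untouched pages become eviction candidates — can be modeled with a small toy class. All names are illustrative assumptions; the real mechanism involves the DMA controller 74 and the kernel page tables.

```python
# Toy model of the two page-swap triggers described above.  A reference
# to a page not in local memory evicts a not-recently-used local page to
# global memory and brings the needed page local; the periodic aging
# pass clears the "recently used" marks.  Illustrative only.

class PagingModel:
    def __init__(self, local_capacity):
        self.capacity = local_capacity
        self.local = {}          # page -> recently-used flag
        self.global_mem = set()  # pages staged in global memory

    def touch(self, page):
        """CPU references a page; swap it in from global memory if needed."""
        if page in self.local:
            self.local[page] = True          # mark as recently used
            return
        if len(self.local) >= self.capacity:
            # Prefer a page the aging pass found not recently used.
            victim = next((p for p, used in self.local.items() if not used),
                          next(iter(self.local)))
            del self.local[victim]
            self.global_mem.add(victim)      # push the victim out to global
        self.global_mem.discard(page)
        self.local[page] = True              # needed page is now local

    def aging_pass(self):
        """Periodic aging: clear marks; untouched pages become candidates."""
        for p in self.local:
            self.local[p] = False

m = PagingModel(local_capacity=2)
m.touch("A"); m.touch("B")
m.aging_pass()
m.touch("A")        # A is recently used again
m.touch("C")        # B, not recently used, is pushed out to global memory
assert "B" in m.global_mem and "A" in m.local and "C" in m.local
```

A task switch changes nothing here by itself: only subsequent `touch` calls (page faults) cause swaps, which matches the behavior described above.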
If a memory reference is made and a TLB miss is shown, but the page table lookup resulting from the TLB miss exception shows the page is in local memory, then a TLB entry is made to show this page to be in local memory. That is, the process takes an exception when the TLB miss occurs, goes to the page tables (in the kernel data section), finds the table entry, writes it to the TLB, then the process is allowed to proceed. But if the memory reference shows a TLB miss, and the page tables show the corresponding physical address is in global memory (over 8M
physical address), the TLB entry is made for this page, and when the process resumes it will find the page entry in the TLB as
before; yet another exception is taken because the valid bit will be zero, indicating the page is physically not in local memory, so this time the exception will enter a routine to swap the page from global to local and validate the TLB entry, so execution can then proceed. In the third situation, if the page tables show the address for the memory reference is on disk, not in local or global memory, then the system operates as indicated above, i.e., the process is put off the run queue and put in the sleep queue, a disk request is made, and when the disk has transferred the page to global memory and signalled a command-complete interrupt, then the page is swapped from global to local, and the TLB
updated, then the process can execute again.

Private Memory

Although the memory modules 14 and 15 store the same data at the same locations, and all three CPUs 11, 12 and 13 have equal access to these memory modules, there is a small area of the memory assigned under software control as a private memory in each one of the memory modules. For example, as illustrated in Figure 21, an area 155 of the map of the memory module locations is designated the private memory area, and is writable only when the CPUs issue a "private memory write" command on bus 59. In an example embodiment, the private memory area 155 is a 4K page starting at the address contained in a register 156 in the bus interface 56 of each one of the CPU modules; this starting address can be changed under software control by writing to this register 156 by the CPU. The private memory area 155 is further divided between the three CPUs; only CPU-A can write to area 155a, CPU-B to area 155b, and CPU-C to area 155c. One of the command signals in bus 57 is set by the bus interface 56 to inform the memory modules 14 and 15 that the operation is a private write, and this is set in response to the address generated by the processor 40 from a Store instruction; bits of the address (and a Write command) are detected by a decoder 157
in the bus interface (which compares bus addresses to the contents of register 156) and used to generate the "private memory write" command for bus 57. In the memory module, when a write command is detected in the registers 94, 95 and 96, and the addresses and commands are all voted good (i.e., in agreement) by the vote circuit 100, then the control circuit 100 allows the data from only one of the CPUs to pass through to the bus 101, this one being determined by two bits of the address from the CPUs. During this private write, all three CPUs present the same address on their bus 57 but different data on their bus 58 (the different data is some state unique to the CPU, for example). The memory modules vote the addresses and commands, and select data from only one CPU based upon part of the address field seen on the address bus. To allow the CPUs to vote some data, all three CPUs will do three private writes (there will be three writes on the busses 21, 22 and 23) of some state information unique to a CPU, into both memory modules 14 and 15. During each write, each CPU sends its unique data, but only one is accepted each time. So, the software sequence executed by all three CPUs is (1) Store (to location 155a), (2) Store (to location 155b), (3) Store (to location 155c). But data from only one CPU is actually written each time, and the data is not voted (because it is or could be different, and could show a fault if voted). Then, the CPUs can vote the data by having all three CPUs read all three of the locations 155a, 155b and 155c, and by software compare this data. This type of operation is used in diagnostics, for example, or in interrupts to vote the cause register data. The private-write mechanism is used in fault detection and recovery. For example, the CPUs may detect a bus error upon making a memory read request, such as a memory module 14 or 15 returning bad status on lines 33-1 or 33-2. At this point a CPU
doesn't know if the other CPUs received the same status from the memory module; the CPU could be faulty or its status detection circuit faulty, or, as indicated, the memory could be faulty.
So, to isolate the fault, when the bus fault routine mentioned above is entered, all three CPUs do a private write of the status information they just received from the memory modules in the preceding read attempt. Then all three CPUs read what the others have written, and compare it with their own memory status information. If they all agree, then the memory module is voted off-line. If not, and one CPU shows bad status for a memory module but the others show good status, then that CPU is voted off-line.

Fault-Tolerant Power Supply
Referring now to Figure 22, the system of the preferred embodiment may use a fault-tolerant power supply which provides the capability for on-line replacement of failed power supply modules, as well as on-line replacement of CPU modules, memory modules, I/O processor modules, I/O controllers and disk modules
as discussed above. In the circuit of Figure 22, an a/c power line 160 is connected directly to a power distribution unit 161 that provides power line filtering, transient suppressors, and a circuit breaker to protect against short circuits. To protect against a/c power line failure, redundant battery packs 162 and 163 provide 4-1/2 minutes of full system power so that orderly system shutdown can be accomplished. Only one of the two battery packs 162 or 163 is required to be operative to safely shut the system down.
The power subsystem has two identical AC-to-DC bulk power supplies 164 and 165 which exhibit high power factor and energize a pair of 36-volt DC distribution busses 166 and 167. The system can remain operational with one of the bulk power supplies 164 or 165 operational.
Four separate power distribution busses are included in these busses 166 and 167. The bulk supply 164 drives a power bus
166-1, 167-1, while the bulk supply 165 drives power bus 166-2, 167-2. The battery pack 162 drives bus 166-3, 167-3, and is itself recharged from both 166-1 and 166-2. The battery pack 163 drives bus 166-4, 167-4 and is recharged from busses 166-1 and 167-2. The three CPUs 11, 12 and 13 are driven from different combinations of these four distribution busses.
A number of DC-to-DC converters 168 connected to these 36-v busses 166 and 167 are used to individually power the CPU modules 11, 12 and 13, the memory modules 14 and 15, the I/O processors 26 and 27, and the I/O controllers 30. The bulk power supplies 164 and 165 also power the three system fans 169, and battery chargers for the battery packs 162 and 163. By having these separate DC-to-DC converters for each system component, failure of one converter does not result in system shutdown, but instead the system will continue under one of its failure recovery modes discussed above, and the failed power supply component can be replaced while the system is operating.

The power system can be shut down by either a manual switch (with standby and off functions) or under software control from a maintenance and diagnostic processor 170 which automatically defaults to the power-on state in the event of a maintenance and diagnostic power failure.
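The redundancy rule above can be summarized in a minimal sketch: any single bulk supply or battery pack can fail without losing either the ability to run or the ability to shut down safely. The component names and the pass/fail rule are simplifying assumptions for illustration, not circuit detail from Figure 22.

```python
# Minimal sketch of the power-redundancy rule described above: one bulk
# supply suffices to run, and one battery pack suffices for an orderly
# shutdown.  The names and the rule are illustrative assumptions.

BULK = {"bulk164", "bulk165"}       # the two AC-to-DC bulk supplies
BATTERY = {"batt162", "batt163"}    # the two redundant battery packs

def system_survives(failed):
    bulk_ok = len(BULK - failed) >= 1       # can keep running
    shutdown_ok = len(BATTERY - failed) >= 1  # can shut down safely
    return bulk_ok and shutdown_ok

# Any single failure among these components is tolerated...
assert all(system_survives({f}) for f in BULK | BATTERY)
# ...but losing both bulk supplies is not.
assert not system_survives(BULK)
```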
While the invention has been described with reference to a specific embodiment, the description is not meant to be construed in a limiting sense. Various modifications of the disclosed embodiment, as well as other embodiments of the invention, will be apparent to persons skilled in the art upon reference to this description. It is therefore contemplated that the appended claims will cover any such modifications or embodiments as fall within the true scope of the invention.
RELATED CASES: This application discloses subject matter also disclosed in copending U.S. patent applications Ser. No.
282,538, 282,629, 283,139, and 283,141, filed Dec. 9, 1988, and Ser. No. 283,574, filed Dec. 13, 1988, all assigned to Tandem Computers Incorporated.
BACKGROUND OF THE INVENTION
This invention relates to computer systems, and more particularly to a memory management system used in a fault-tolerant computer having multiple CPUs.
Highly reliable digital processing is achieved in various computer architectures employing redundancy. For example, TMR
(triple modular redundancy) systems may employ three CPUs executing the same instruction stream, along with three separate main memory units and separate I/O devices which duplicate functions, so if one of each type of element fails, the system continues to operate. Another fault-tolerant type of system is shown in U.S.
Patent 4,228,496, issued to Katzman et al, for "Multiprocessor System", assigned to Tandem Computers Incorporated. Various
methods have been used for synchronizing the units in redundant systems; for example, in said prior application Ser. No. 118,503, filed Nov. 9, 1987, by R. W. Horst, for "Method and Apparatus for
Synchronizing a Plurality of Processors", also assigned to Tandem Computers Incorporated, a method of "loose" synchronizing is disclosed, in contrast to other systems which have employed a lock-step synchronization using a single clock, as shown in U.S.
technique called "synchronization voting" i~ disclosed by Davies & Wakerly in "Synchronization and Matching in Redundant Sys-tems", IEEE Transactions on Computer~ June 1978, pp 531-539 A
method for interrupt synchronization in redu~dant fault-tolerant systems is disclosed by Yondea et al in Proceeding Or 15th Annual Symposium on Fault-Tolerant Computing, June 1985, pp 246-251, "Implementation of Interrupt Handler for Loosely Synchronized TMR
Systems" U S Patent 4,644,498 for "Fault-Tolerant Real Time Clock" discloses a triple modular redundant clock configuration ~or use in a ~MR computer system U S Patent 4,733,353 for "Frame Synchronization of Multiply Redundant Computers~ discloses a synchronization method using separately-clocked CPUs which are periodically synchronized by executing a synch frame As high-performance microproces~or devices have become ! available, using higher clock speeds and providing greater capabilities, such as the Intel 80386 and Motorola 68030 chips operating at 25-MHz clock rates, and as other elements of com-puter system~ such as memory, disk drives, and the like have corre-pondingly becom- le~- exp-n~iv- and o~ greater capability, th- p-rrormanc- and co-t ot high-reliability proce~sors has been r-guir-d to ~ollow the same trends In addition, ~tandardization on a ~ew op-rating ~ystems in the computer industry in general has vastly increased the availability o~ applications software, so a similar demand is made on the field Or high-reliability systems; i - , a standard operating system must be available It i~ thereror- the principal object Or this invention to provide an improved high-reliability computer ~ystem, particular-ly o~ the ~ault-tolerant type Another ob~ect i5 to provide an :`
improved redundant, fault-tolerant type o~ computing system, and one in which high per~ormanc- and reduced co~t are both possible;
particularly, it is preferable that the improved system avoid the performan~e burden~ u~ually associated with highly redundant S sy5temg. A further ob~ect is to provide a high-reliability computer ~ystem in which the performance, measured in reliability as well a~ speed and software compatibility, is $mproved but yet at a cost comparable to other altQrnatives of lower performance An additional object is to provide a high-reliability computer sy~tem which i~ capable of executing an operating system which uses virtual memory management with demand paging, and having protected (supervisory or "kernel") mode; particularly an operat-ing system also permitting execution of multiple processes; all at a high level of performance SUMMARY OF THE INVENTION
.
In accordance with one embodiment of the invention, a computer system employs three identical CPUs typically executing ; th- same instruction stream, and has two identical, self-checking memory modules storing duplicates of the samQ data A configura-tion of three CPUs and two memorie~ i9 th-refore employed, rather than three CPUs and three memorie~ as in the clas~ic TMR systems Memory references by the throe CPU~ are mad- by three separate bu~es connected to throe ~eparat- ports Or ach of the two m-mory module- In ordor to avoid impo~ing th- performance ~2S burd-n Or fault-tol-rant op-ration on th- CPUs thems-lve~, and impo-ing th- xp-n--, compl-xity and timing problem~ of fault-tol-rant clocking, th- thr-- CPU- ach hav their own separate and ind-p-nd-nt clock~, but ar- loo--ly ynchroniz-d, a~ by detecting event~ such as m mory r d -r-nc-~ and stalling any CPU
ah-ad of others until all xecut- tho function simultaneously;
th- int-rrupts are also ~ynchronised to th- CPU~ ensuring that the CPU~ xecut- th- int-rrupt at tho ~am point in their in-struction stream The thre- asynchronou- m-mory ref-rences via .' .
~ ., . ~. .. . . .
: , . - . . ~ . . ~
. . .
.: . . .
... . . .
:. :
. ` ~ .
the separate CPU-to-memory busses are voted at the three separate ports of each of the memory modules at the time of the memory request, but read data is not voted when returned to the CPUs. The two memories both perform all write requests received from either the CPUs or the I/O busses, so that both are kept up-to-date, but only one memory module presents read data back to the CPUs or I/Os in response to read requests; the one memory module producing read data is designated the "primary" and the other is the back-up. Accordingly, incoming data is from only one source and is not voted. The memory requests to the two memory modules are implemented while the voting is still going on, so the read data is available to the CPUs a short delay after the last one of the CPUs makes the request. Even write cycles can be substantially overlapped because DRAMs used for these memory modules use a large part of the write access to merely read and refresh, then if not strobed for the last part of the write cycle the read is non-destructive; therefore, a write cycle begins as soon as the first CPU makes a request, but does not complete until the last request has been received and voted good. These features of non-voted read-data returns and overlapped accesses allow fault-tolerant operation at high performance, but yet at minimum complexity and expense.
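The request-voting behavior can be illustrated with a minimal sketch: a read completes only when all three CPUs have presented an identical request, and the returned data comes from the primary module alone, unvoted. This is an illustrative model, not the memory-module hardware.

```python
# Sketch of the access pattern described above: the three CPUs' address
# and command are voted at the memory module, both modules perform the
# access and stay up to date, and only the primary returns read data,
# which is not voted.  Illustrative model only.

def memory_read(requests, primary_data, backup_data):
    """requests: the (address, command) pair presented by each of the 3 CPUs."""
    if len(set(requests)) != 1:
        raise RuntimeError("vote failed: CPU requests disagree")
    addr, cmd = requests[0]
    assert cmd == "read"
    # Both modules hold the same data (both perform all writes), but
    # only the primary's copy is returned to the CPUs.
    assert backup_data[addr] == primary_data[addr]
    return primary_data[addr]

reqs = [(0x100, "read")] * 3        # all three CPUs present the same request
assert memory_read(reqs, {0x100: 42}, {0x100: 42}) == 42
```

Because only the addresses and commands are voted, the read-data return path carries no voting delay, which is the performance point made above.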
I/O functions are implemented using two identical I/O busses, each of which is separately coupled to only one of the memory modules. A number of I/O processors are coupled to both I/O busses, and I/O devices are coupled to pairs of the I/O processors but accessed by only one of the I/O processors. Since one memory module is designated primary, only the I/O bus for this module will be controlling the I/O processors, and I/O traffic between memory module and I/O is not voted. The CPUs can access the I/O processors through the memory modules (each access being voted just as the memory accesses are voted), but the I/O
processors can only access the memory modules, not the CPUs; the I/O processors can only send interrupts to the CPUs, and these
interrupts are collected in the memory modules before presenting to the CPUs. Thus synchronization overhead for I/O device access is not burdening the CPUs, yet fault tolerance is provided. If an I/O processor fails, the other one of the pair can take over control of the I/O devices for this I/O processor by merely changing the addresses used for the I/O device in the I/O page table maintained by the operating system. In this manner, fault tolerance and reintegration of an I/O device is possible without system shutdown, and yet without hardware expense and performance penalty associated with voting and the like in these I/O paths.

The memory system used in the illustrated embodiment is hierarchical at several levels. Each CPU has its own cache, operating at essentially the clock speed of the CPU. Then each CPU has a local memory not accessible by the other CPUs, and virtual memory management allows the kernel of the operating system and pages for the current task to be in local memory for all three CPUs, accessible at high speed without fault-tolerance overhead such as voting or synchronizing imposed. Next is the memory module level, referred to as global memory, where voting and synchronization take place so some access-time burden is introduced; nevertheless, the speed of the global memory is much faster than disk access, so this level is used for page swapping with local memory to keep the most-used data in the fastest area, rather than employing disk for the first level of demand paging.

One of the features of the disclosed embodiment of the invention is the ability to replace faulty components, such as CPU
modules or memory modules, without shutting down the system. Thus, the system is available for continuous use even though components may fail and have to be replaced. In addition, the ability to obtain a high level of fault tolerance with fewer system components, e.g., no fault-tolerant clocking needed, only two memory modules needed instead of three, voting circuits minimized, etc., means that there are fewer components to fail, and so the reliability is enhanced. That is, there are fewer
u-p-nd-d wh n a privat- writ- op-ration i- d-t-eted by the ' m-mory modul~ ine- this data may differ, but th- addr-~ses and eommand- ar- still vot-d Th- ar-a us~d for privat- writ- may be ehang-d, or liminat-d, und-r eontrol of th- instruetion stream Aeeordingly, th- abllity to eompar- unigu- data i~ provided in a fl-xibl- manner, without bypassing the synehronization and voting m ehanism~, and without disturbing th- id-ntieal nature of the ~ eode ex-euted by th- multiple CPUs t 7 ~.' ~ ;~
,.~
BRIEF DESCRIPTION OF THE DRAWINGS
The features believed characteristic of the invention are set forth in the appended claims. The invention itself, however, as well as other features and advantages thereof, may best be understood by reference to the detailed description of a specific embodiment which follows, when read in conjunction with the accompanying drawings, wherein:

Figure 1 is an electrical diagram in block form of a computer system according to one embodiment of the invention;
Figure 2 is an electrical schematic diagram in block form of one of the CPUs of the system of Figure 1;
Figure 3 is an electrical schematic diagram in block form of one of the microprocessor chips used in the CPU of Figure 2;
Figures 4 and 5 are timing diagrams showing events occurring in the CPU of Figures 2 and 3 as a function of time;
Figure 6 is an electrical schematic diagram in block form of one of the memory modules in the computer system of Figure 1;
Figure 7 is a timing diagram showing events occurring on the CPU-to-memory bus in the system of Figure 1;
Figure 8 is an electrical schematic diagram in block form of one of the I/O processors in the computer system of Figure 1;
Figure 9 is a timing diagram showing events vs. time for the transfer protocol between a memory module and an I/O processor in the system of Figure 1;
Figure 10 is a timing diagram showing events vs. time for execution of instructions in the CPUs of Figures 1, 2 and 3;
Figure 10a is a detail view of a part of the diagram of Figure 10;
Figures 11 and 12 are timing diagrams similar to Figure 10 showing events vs. time for execution of instructions in the CPUs of Figures 1, 2 and 3;
Figure 13 is an electrical schematic diagram in block form of the interrupt synchronization circuit used in the CPU of Figure 2;
Figures 14, 15, 16 and 17 are timing diagrams like Figures 10 or 11 showing events vs. time for execution of instructions in the CPUs of Figures 1, 2 and 3 when an interrupt occurs, illustrating various scenarios;
Figure 18 is a physical memory map of the memories used in the system of Figures 1, 2, 3 and 6;
Figure 19 is a virtual memory map of the CPUs used in the system of Figures 1, 2, 3 and 6;
Figure 20 is a diagram of the format of the virtual address and the TLB entries in the microprocessor chips in the CPU
according to Figure 2 or 3;
Figure 21 is an illustration of the private memory locations in the memory map of the global memory modules in the system of Figures 1, 2, 3 and 6; and

Figure 22 is an electrical diagram of a fault-tolerant power supply used with the system of the invention according to one embodiment.
.:
,~ 9 , . !
~ . . , , ' ':', . ", '' '' ., ' ~., . ' "' : ' ' .' ' . '.
,' ~ ' , ' . . .
'~ ' ' , '.' " ' -': , . ' j . ', . ' i , '. ~
DETAILED DESCRIPTION OF SPECIFIC EMBODIMENT
With reference to Figure 1, a computer system using features of the invention is shown in one embodiment having three identical processors 11, 12 and 13, referred to as CPU-A, CPU-B
and CPU-C, which operate as one logical processor, all three typically executing the same instruction stream; the only time the three processors are not executing the same instruction stream is in such operations as power-up self test, diagnostics and the like. The three processors are coupled to two memory modules 14 and 15, referred to as Memory-#1 and Memory-#2, each memory storing the same data in the same address space. In a preferred embodiment, each one of the processors 11, 12 and 13 contains its own local memory 16, as well, accessible only by the processor containing this memory.

Each one of the processors 11, 12 and 13, as well as each one of the memory modules 14 and 15, has its own separate clock oscillator 17; in this embodiment, the processors are not run in "lock step", but instead are loosely synchronized by a method such as is set forth in the above-mentioned application Ser. No. 118,503, i.e., using events such as external memory references to bring the CPUs into synchronization. External interrupts are synchronized among the three CPUs by a technique employing a set of busses 18 for coupling the interrupt requests and status from each of the processors to the other two; each one of the processors CPU-A, CPU-B and CPU-C is responsive to the three interrupt requests, its own and the two received from the other CPUs, to present an interrupt to the CPUs at the same point in the execution stream. The memory modules 14 and 15 vote the memory references, and allow a memory reference to proceed only when all three CPUs have made the same request (with provision for faults). In this manner, the processors are synchronized at the time of external events (memory references), resulting in the processors typically executing the same instruction stream, in the same sequence, but not necessarily during aligned clock
~ 10 .~
.:;.
. : .
.. . ~ ..
.
cycles in the time between synchronization events In addition, external interrupts are synchronized to be xecuted at the same point in th- instruction ~tream of each CPU
The CPU-A processor 11 is connected to the Memory-#1 module 14 and to the Memory-#2 module 15 by a bus 21; likewise the CPU-B is connected to the modules 14 and 15 by a bus 22, and the CPU-C is connected to the memory modules by a bus 23. These busses 21, 22, 23 each include a 32-bit multiplexed address/data bus, a command bus, and control lines for address and data strobes. The CPUs have control of these busses 21, 22 and 23, so there is no arbitration, or bus-request and bus-grant.

Each one of the memory modules 14 and 15 is separately coupled to a respective input/output bus 24 or 25, and each of these busses is coupled to two (or more) input/output processors 26 and 27. The system can have multiple I/O processors as needed to accommodate the I/O devices needed for the particular system configuration. Each one of the input/output processors 26 and 27 is connected to a bus 28, which may be of a standard configuration such as a VMEbus, and each bus 28 is connected to one or more bus interface modules 29 for interface with a standard I/O controller 30. Each bus interface module 29 is connected to two of the busses 28, so failure of one I/O processor 26 or 27, or failure of one of the bus channels 28, can be tolerated. The I/O processors 26 and 27 can be addressed by the CPUs 11, 12 and 13 through the memory modules 14 and 15, and can signal an interrupt to the CPUs via the memory modules. Disk drives, terminals with CRT screens and keyboards, and network adapters, are typical peripheral devices operated by the controllers 30. The controllers 30 may make DMA-type references to the memory modules 14 and 15 to transfer blocks of data. Each one of the I/O processors 26, 27, etc., has certain individual lines directly connected to each one of the memory modules for bus request, bus grant, etc.; these point-to-point connections are called "radials" and are included in a group of radial lines 31.

A system status bus 32 is individually connected to each one of the CPUs 11, 12 and 13, to each memory module 14 and 15, and to each of the I/O processors 26 and 27, for the purpose of providing information on the status of each element. This status bus provides information about which of the CPUs, memory modules and I/O processors is currently in the system and operating properly.

An acknowledge/status bus 33 connecting the three CPUs and two memory modules includes individual lines by which the modules 14 and 15 send acknowledge signals to the CPUs when memory requests are made by the CPUs, and at the same time a status field is sent to report on the status of the command and whether it executed correctly. The memory modules not only check parity on data read from or written to the global memory, but also check parity on data passing through the memory modules to or from the I/O busses 24 and 25, as well as checking the validity of commands. It is through the status lines in bus 33 that these checks are reported to the CPUs 11, 12 and 13, so if errors occur a fault routine can be entered to isolate a faulty component.

Even though both memory modules 14 and 15 are storing the same data in global memory, and operating to perform every memory reference in duplicate, one of these memory modules is designated as primary and the other as back-up, at any given time. Memory write operations are executed by both memory modules so both are kept current, and also a memory read operation is executed by both, but only the primary module actually loads the read-data back onto the busses 21, 22 and 23, and only the primary memory module controls the arbitration for multi-master busses 24 and 25. To keep the primary and back-up modules executing the same operations, a bus 34 conveys control information from primary to back-up. Either module can assume
the role of primary at boot-up, and the roles can switch during operation under software control; the roles can also switch when selected error conditions are detected by the CPUs or other error-responsive parts of the system.

Certain interrupts generated in the CPUs are also voted by the memory modules 14 and 15. When the CPUs encounter such an interrupt condition (and are not stalled), they signal an interrupt request to the memory modules by individual lines in an interrupt bus 35, so the three interrupt requests from the three CPUs can be voted. When all interrupts have been voted, the memory modules each send a voted-interrupt signal to the three CPUs via bus 35. This voting of interrupts also functions to check on the operation of the CPUs. The three CPUs synchronize the voted-interrupt signal via the inter-CPU bus 18 and present the interrupt to the processors at a common point in the instruction stream. This interrupt synchronization is accomplished without stalling any of the CPUs.

CPU Module:

Referring now to Figure 2, one of the processors 11, 12 or 13 is shown in more detail. All three CPU modules are of the same construction in a preferred embodiment, so only CPU-A will be described here. In order to keep costs within a competitive range, and to provide ready access to already-developed software and operating systems, it is preferred to use a commercially-available microprocessor chip, and any one of a number of devices may be chosen. The RISC (reduced instruction set) architecture has some advantage in implementing the loose synchronization as will be described, but more-conventional CISC (complex instruction set) microprocessors such as Motorola 68030 devices or Intel 80386 devices (available in 20-MHz and 25-MHz speeds) could be used. High-speed 32-bit RISC microprocessor devices are available from several sources in three basic types; Motorola produces a device as part number 88000, MIPS Computer Systems,
Inc. and others produce a chip set referred to as the MIPS type, and Sun Microsystems has announced a so-called SPARC type (scalable processor architecture). Cypress Semiconductor of San Jose, California, for example, manufactures a microprocessor referred to as part number CY7C601 providing 20-MIPS (million instructions per second), clocked at 33-MHz, supporting the SPARC standard, and Fujitsu manufactures a CMOS RISC microprocessor, part number S-25, also supporting the SPARC standard.

The CPU board or module in the illustrative embodiment, used as an example, employs a microprocessor chip 40 which is in this case an R2000 device designed by MIPS Computer Systems, Inc., and also manufactured by Integrated Device Technology, Inc. The R2000 device is a 32-bit processor using RISC architecture to provide high performance, e.g., 12-MIPS at 16.67-MHz clock rate. Higher-speed versions of this device may be used instead, such as the R3000 that provides 20-MIPS at 25-MHz clock rate. The processor 40 also has a co-processor used for memory management, including a translation lookaside buffer to cache translations of logical to physical addresses. The processor 40 is coupled to a local bus having a data bus 41, an address bus 42 and a control bus 43. Separate instruction and data cache memories 44 and 45 are coupled to this local bus. These caches are each of 64K-byte size, for example, and are accessed within a single clock cycle of the processor 40. A numeric or floating point co-processor 46 is coupled to the local bus if additional performance is needed for these types of calculations; this numeric processor device is also commercially available from MIPS Computer Systems as part number R2010. The local bus 41, 42, 43 is coupled to an internal bus structure through a write buffer 50 and a read buffer 51. The write buffer is a commercially available device, part number R2020, and functions to allow the processor 40 to continue to execute Run cycles after storing data and address in the write buffer 50 for a write operation, rather than having to execute stall cycles while the write is completing.
In addition to the path through the write buffer 50, a path is provided to allow the processor 40 to execute write operations bypassing the write buffer 50. This path is a write buffer bypass 52 which allows the processor, under software selection, to perform synchronous writes. If the write buffer bypass 52 is enabled (write buffer 50 not enabled) and the processor executes a write, then the processor will stall until the write completes. In contrast, when writes are executed with the write buffer bypass 52 disabled, the processor will not stall because data is written into the write buffer 50 (unless the write buffer is full). If the write buffer 50 is enabled when the processor 40 performs a write operation, the write buffer 50 captures the output data from bus 41 and the address from bus 42, as well as controls from bus 43. The write buffer 50 can hold up to four such data-address sets while it waits to pass the data on to the main memory. The write buffer runs synchronously with the clock 17 of the processor chip 40, so the processor-to-buffer transfers are synchronous and at the machine cycle rate of the processor. The write buffer 50 signals the processor if it is full and unable to accept data. Read operations by the processor 40 are checked against the addresses contained in the four-deep write buffer 50, so if a read is attempted to one of the data words waiting in the write buffer to be written to memory 16 or to global memory, the read is stalled until the write is completed.

The write and read buffers 50 and 51 are coupled to an internal bus structure having a data bus 53, an address bus 54 and a control bus 55. The local memory 16 is accessed by this internal bus, and a bus interface 56 coupled to the internal bus is used to access the system bus 21 (or bus 22 or 23 for the other CPUs). The separate data and address busses 53 and 54 of the internal bus (as derived from busses 41 and 42 of the local bus) are converted to a multiplexed address/data bus 57 in the system bus 21, and the command and control lines are
correspondingly converted to command lines 58 and control lines 59 in this external bus. The bus interface unit 56 also receives the acknowledge/status lines 33 from the memory modules 14 and 15. In these lines 33, separate status lines 33-1 or 33-2 are coupled from each of the modules 14 and 15, so the responses from both memory modules can be evaluated upon the event of a transfer (read or write) between CPUs and global memory, as will be explained.

The local memory 16, in one embodiment, comprises about 8-MByte of RAM which can be accessed in about three or four of the machine cycles of processor 40, and this access is synchronous with the clock 17 of this CPU, whereas the memory access time to the modules 14 and 15 is much greater than that to local memory, and this access to the memory modules 14 and 15 is asynchronous and subject to the synchronization overhead imposed by waiting for all CPUs to make the request then voting. For comparison, access to a typical commercially-available disk memory through the I/O processors 26, 27 and 29 is measured in milliseconds, i.e., considerably slower than access to the modules 14 and 15. Thus, there is a hierarchy of memory access by the CPU chip 40, the highest being the instruction and data caches 44 and 45 which will provide a hit ratio of perhaps 95% when using 64-KByte cache size and suitable fill algorithms. The second highest is the local memory 16, and again by employing contemporary virtual memory management algorithms a hit ratio of perhaps 95% is obtained for memory references for which a cache miss occurs but a hit in local memory 16 is found, in an example where the size of the local memory is about 8-MByte. The net result, from the standpoint of the processor chip 40, is that perhaps greater than 99% of memory references (but not I/O references) will be synchronous and will occur in either the same machine cycle or in three or four machine cycles.
The local memory 16 is accessed from the internal bus by a memory controller 60 which receives the addresses from address bus 54, and the address strobes from the control bus 55, and generates separate row and column addresses, and RAS and CAS controls, for example, if the local memory 16 employs DRAMs with multiplexed addressing, as is usually the case. Data is written to or read from the local memory via data bus 53. In addition, several local registers 61, as well as non-volatile memory 62 such as NVRAMs, and high-speed PROMs 63, as may be used by the operating system, are accessed by the internal bus; some of this part of the memory is used only at power-on, some is used by the operating system and may be almost continuously within the cache 44, and other parts may be within the non-cached part of the memory map.

External interrupts are applied to the processor 40 by one of the pins of the control bus 43 or 55 from an interrupt circuit 65 in the CPU module of Figure 2. This type of interrupt is voted in the circuit 65, so that before an interrupt is executed by the processor 40 it is determined whether or not all three CPUs are presented with the interrupt; to this end, the circuit 65 receives interrupt pending inputs 66 from the other two CPUs 12 and 13, and sends an interrupt pending signal to the other two CPUs via line 67, these lines being part of the bus 18 connecting the three CPUs 11, 12 and 13 together. Also, for voting other types of interrupts, specifically CPU-generated interrupts, the circuit 65 can send an interrupt request from this CPU to both of the memory modules 14 and 15 by a line 68 in the bus 35, then receive separate voted-interrupt signals from the memory modules via lines 69 and 70; both memory modules will present the external interrupt to be acted upon. An interrupt generated in some external source such as a keyboard or disk drive on one of the I/O channels 28, for example, will not be presented to the interrupt pin of the chip 40 from the circuit 65 until each one of the CPUs 11, 12 and 13 is at the same point in the instruction stream, as will be explained.
Since the processors 40 are clocked by separate clock oscillators 17, there must be some mechanism for periodically bringing the processors 40 back into synchronization. Even though the clock oscillators 17 are of the same nominal frequency, e.g., 16.67-MHz, and the tolerance for these devices is about 25-ppm (parts per million), the processors can potentially become many cycles out of phase unless periodically brought back into synch. Of course, every time an external interrupt occurs the CPUs will be brought into synch in the sense of being interrupted at the same point in their instruction stream (due to the interrupt synch mechanism), but this does not help bring the cycle count into synch. The mechanism of voting memory references in the memory modules 14 and 15 will bring the CPUs into synch (in real time), as will be explained. However, some conditions result in long periods where no memory reference occurs, and so an additional mechanism is used to introduce stall cycles to bring the processors 40 back into synch. A cycle counter 71 is coupled to the clock 17 and the control pins of the processor 40 via control bus 43 to count machine cycles which are Run cycles (but not Stall cycles). This counter 71 includes a count register having a maximum count value selected to represent the period during which the maximum allowable drift between CPUs would occur (taking into account the specified tolerance for the crystal oscillators); when this count register overflows, action is initiated to stall the faster processors until the slower processor or processors catch up. This counter 71 is reset whenever a synchronization is done by a memory reference to the memory modules 14 and 15. Also, a refresh counter 72 is employed to perform refresh cycles on the local memory 16, as will be explained. In addition, a counter 73 counts machine cycles which are Run cycles but not Stall cycles, like the counter 71 does, but this counter 73 is not reset by a memory reference; the counter 73 is used for interrupt synchronization as explained below, and to this end produces the output signals CC-4 and CC-8 to the interrupt synchronization circuit 65.
The processor 40 has a RISC instruction set which does not support memory-to-memory instructions, but instead only memory-to-register or register-to-memory instructions (i.e., load or store). It is important to keep frequently-used data and the currently-executing code in local memory. Accordingly, a block-transfer operation is provided by a DMA state machine 74 coupled to the bus interface 56. The processor 40 writes a word to a register in the DMA circuit 74 to function as a command, and writes the starting address and length of the block to registers in this circuit 74. In one embodiment, the microprocessor stalls while the DMA circuit takes over and executes the block transfer, producing the necessary addresses, commands and strobes on the busses 53-55 and 21. The command executed by the processor 40 to initiate this block transfer can be a read from a register in the DMA circuit 74. Since memory management in the Unix operating system relies upon demand paging, these block transfers will most often be pages being moved between global and local memory and I/O traffic. A page is 4-KBytes. Of course, the busses 21, 22 and 23 support single-word read and write transfers between CPUs and global memory; the block transfers referred to are only possible between local and global memory.
The Processor

Referring now to Figure 3, the R2000 or R3000 type of microprocessor 40 of the example embodiment is shown in more detail. This device includes a main 32-bit CPU 75 containing thirty-two 32-bit general purpose registers 76, a 32-bit ALU 77, a zero-to-64 bit shifter 78, and a 32-by-32 multiply/divide circuit 79. This CPU also has a program counter 80 along with associated incrementer and adder. These components are coupled to a processor bus structure 81, which is coupled to the local data bus 41 and to an instruction decoder 82 with associated control logic to execute instructions fetched via data bus 41. The 32-bit local address bus 42 is driven by a virtual memory
management arrangement including a translation lookaside buffer (TLB) 83 within an on-chip memory-management coprocessor. The TLB 83 contains sixty-four entries to be compared with a virtual address received from the microprocessor block 75 via virtual address bus 84. The low-order 16-bit part 85 of the bus 42 is driven by the low-order part of this virtual address bus 84, and the high-order part is from the bus 84 if the virtual address is used as the physical address, or is the tag entry from the TLB 83 via output 86 if virtual addressing is used and a hit occurs. The control lines 43 of the local bus are connected to pipeline and bus control circuitry 87, driven from the internal bus structure 81 and the control logic 82.

The microprocessor block 75 in the processor 40 is of the RISC type in that most instructions execute in one machine cycle, and the instruction set uses register-to-register and load/store instructions rather than having complex instructions involving memory references along with ALU operations. There are no complex addressing schemes included as part of the instruction set, such as "add the operand whose address is the sum of the contents of register A1 and register A2 to the operand whose address is found at the main memory location addressed by the contents of register B, and store the result in main memory at the location whose address is found in register C." Instead, this operation is done in a number of simple register-to-register and load/store instructions: add register A2 to register A1; load register B1 from the memory location whose address is in register B; add register A1 and register B1; store register B1 to the memory location addressed by register C. Optimizing compiler techniques are used to maximize the use of the thirty-two registers 76, i.e., assure that most operations will find the operands already in the register set. The load instructions actually take longer than one machine cycle, and to account for this a latency of one instruction is introduced; the data fetched by the load instruction is not used until the second cycle, and
the intervening cycle is used for some other instruction, if possible.

The main CPU 75 is highly pipelined to facilitate the goal of averaging one instruction execution per machine cycle. Referring to Figure 4, a single instruction is executed over a period including five machine cycles, where a machine cycle is one clock period or 60-nsec for a 16.67-MHz clock 17. These five cycles or pipe stages are referred to as IF (instruction fetch from I-cache 44), RD (read operands from register set 76), ALU (perform the required operation in ALU 77), MEM (access D-cache 45 if required), and WB (write back ALU result to register file 76). As seen in Figure 5, these five pipe stages are overlapped so that in a given machine cycle, cycle-5 for example, instruction I#5 is in its first or IF pipe stage and instruction I#1 is in its last or WB stage, while the other instructions are in the intervening pipe stages.

Memory Module

With reference to Figure 6, one of the memory modules 14 or 15 is shown in detail. Both memory modules are of the same construction in a preferred embodiment, so only the Memory-#1 module is shown. The memory module includes three input/output ports 91, 92 and 93 coupled to the three busses 21, 22 and 23 coming from the CPUs 11, 12 and 13, respectively. Inputs to these ports are latched into registers 94, 95 and 96, each of which has separate sections to store data, address, command and strobes for a write operation, or address, command and strobes for a read operation. The contents of these three registers are voted by a vote circuit 100 having inputs connected to all sections of all three registers. If all three of the CPUs 11, 12 and 13 make the same memory request (same address, same command), as should be the case since the CPUs are typically executing the same instruction stream, then the memory request is allowed to complete; however, as soon as the first memory request is latched
into any one of the three latches 94, 95 or 96, it is passed on immediately to begin the memory access. To this end, the address, data and command are applied to an internal bus including data bus 101, address bus 102 and control bus 103. From this internal bus the memory request accesses various resources, depending upon the address, and depending upon the system configuration.

In one embodiment, a large DRAM 104 is accessed by the internal bus, using a memory controller 105 which accepts the address from address bus 102 and memory request and strobes from control bus 103 to generate multiplexed row and column addresses for the DRAM so that data input/output is provided on the data bus 101. This DRAM 104 is also referred to as global memory, and is of a size of perhaps 32-MByte in one embodiment. In addition, the internal bus 101-103 can access control and status registers 106, a quantity of non-volatile RAM 107, and write-protect RAM 108. The memory reference by the CPUs can also bypass the memory in the memory module 14 or 15 and access the I/O busses 24 and 25 by a bus interface 109 which has inputs connected to the internal bus 101-103. If the memory module is the primary memory module, a bus arbitrator 110 in each memory module controls the bus interface 109. If a memory module is the backup module, the bus 34 controls the bus interface 109.

A memory access to the DRAM 104 is initiated as soon as the first request is latched into one of the latches 94, 95 or 96, but is not allowed to complete unless the vote circuit 100 determines that a plurality of the requests are the same, with provision for faults. The arrival of the first of the three requests causes the access to the DRAM 104 to begin. For a read, the DRAM 104 is addressed, the sense amplifiers are strobed, and the data output is produced at the DRAM outputs, so if the vote is good after the third request is received then the requested data is ready for immediate transfer back to the CPUs. In this manner, voting is overlapped with DRAM access.
Referring to Figure 7, the busses 21, 22 and 23 apply memory requests to ports 91, 92 and 93 of the memory modules 14 and 15 in the format illustrated. Each of these busses consists of thirty-two bidirectional multiplexed address/data lines, thirteen unidirectional command lines, and two strobes. The command lines include a field which specifies the type of bus activity, such as read, write, block transfer, single transfer, I/O read or write, etc. Also, a field functions as a byte enable for the four bytes. The strobes are AS, address strobe, and DS, data strobe. The CPUs 11, 12 and 13 each control their own bus 21, 22 or 23; in this embodiment, these are not multi-master busses; there is no contention or arbitration. For a write, the CPU drives the address and command onto the bus in one cycle along with the address strobe AS (active low), then in a subsequent cycle (possibly the next cycle, but not necessarily) drives the data onto the address/data lines of the bus at the same time as a data strobe DS. The address strobe AS from each CPU causes the address and command then appearing at the ports 91, 92 or 93 to be latched into the address and command sections of the registers 94, 95 and 96, as these strobes appear, then the data strobe DS causes the data to be latched. When a plurality (two out of three in this embodiment) of the busses 21, 22 and 23 drive the same memory request into the latches 94, 95 and 96, the vote circuit 100 passes on the final command to the bus 103 and the memory access will be executed; if the command is a write, an acknowledge ACK signal is sent back to each CPU by a line 112 (specifically line 112-1 for Memory-#1 and line 112-2 for Memory-#2) as soon as the write has been executed, and at the same time status bits are driven via acknowledge/status bus 33 (specifically lines 33-1 for Memory-#1 and lines 33-2 for Memory-#2) to each CPU at time T3 of Figure 7. The delay T4 between the last strobe DS (or AS if a read) and the ACK at T3 is variable, depending upon how many cycles out of synch the CPUs are at the time of the memory request, and depending upon the delay in the voting circuit and the phase of the internal
independent clock 17 of the memory module 14 or 15 compared to the CPU clocks 17. If the memory request issued by the CPUs is a read, then the ACK signal on lines 112-1 and 112-2 and the status bits on lines 33-1 and 33-2 will be sent at the same time as the data is driven to the address/data bus, during time T3; this will release the stall in the CPUs and thus synchronize the CPU chips 40 on the same instruction. That is, the fastest CPU will have executed more stall cycles as it waited for the slower ones to catch up, then all three will be released at the same time, although the clocks 17 will probably be out of phase; the first instruction executed by all three CPUs when they come out of stall will be the same instruction.

All data being sent from the memory module 14 or 15 to the CPUs 11, 12 and 13, whether the data is read data from the DRAM 104 or from the memory locations 106-108, or is I/O data from the busses 24 and 25, goes through a register 114. This register is loaded from the internal data bus 101, and an output 115 from this register is applied to the address/data lines for busses 21, 22 and 23 at ports 91, 92 and 93 at time T3. Parity is checked when the data is loaded to this register 114. All data written to the DRAM 104, and all data on the I/O busses, has parity bits associated with it, but the parity bits are not transferred on busses 21, 22 and 23 to the CPU modules. Parity errors detected at the read register 114 are reported to the CPU via the status busses 33-1 and 33-2. Only the memory module 14 or 15 designated as primary will drive the data in its register 114 onto the busses 21, 22 and 23. The memory module designated as back-up or secondary will complete a read operation all the way up to the point of loading the register 114 and checking parity, and will report status on busses 33-1 and 33-2, but no data will be driven to the busses 21, 22 and 23.

A controller 117 in each memory module 14 or 15 operates as a state machine clocked by the clock oscillator 17 for this module and receiving the various command lines from bus 103 and
bussei9 21-23 , ~tC., to generate control bits to load registers and bus~e~, generate external control ~ignals, and tha like Thisi controller aliso iisi connected to the bus 34 between the memory moduleis 14 and 15 which tran~fers ~tatus and control in~o~mation betwe~n the two The controIlQr 117 in the module 14 or 15 currently designated as primary will arbitrate via arbitrator 110 between the I/O side (interface los) and the CPU
side (ports 91-93) for acceiss to the common bus 101-103 This decision made by the controller 117 in th- primary memory module 14 or 15 is communicated to the controller 117 of other memory modula by the lines 34, and forc-s the other memory module to execute the same access The controller 117 in each memory module also introduces refresh cycles for the DRAM 104, based upon a refresh counter 118 receiving pulses from th- clock oscillator 17 for this module The DRAM must receiv- 512 refresh cycles ev-ry 8-misec, so on average there must be a refresh cycl- introduced about every 15-microsec The counter 118 thuis produces an ov-rflow signal to the controller 117 every 15-microsec , and if an idle condition exiists (no CPU access or I/O access executing) a refresh cycle is implemented by a command applied to the bus 103 If an operation is in progress, the refresh is executed when the current operation i8 finished For lengthy operations such as block transfers used in m-mory paging, sev-ral rerresh cycles may be backed up and execut- in a bur~t mode after th- transfer is eompl-t-d; to thi- nd, the numb r of overflowo of counter 118 ine- th- la-t r-~r--h eyel- are aeeumulated in a regiister assoeiat-d with th- eount-r 118 ; lnt-rrupt r-gue~ts for CPU-g-nerated int-rrupts ar- receivedfrom each CPU 11, 12 and 13 individually by lines 68 in the int-rrupt bu~ 35; th--- int-rrupt regueot- ar- s-nt to each m-mory module 14 and 15 These interrupt r-qu-st lines 68 in bus 35 ar- applied to an interrupt vote eireuit 119 whieh eompares the thre- reguests and produeeis a vot-d int-rrupt signal on . ..
.: ' ' . ' . ' outgoing line 69 of the bus 35 Tho cPu~ each receive a voted interrupt ~ignal on the two lines 69 and 70 (one from each module 14 and 15) via the bus 35 The voted interrupts from each memory module 14 and 15 are ORed and presented to the interrupt synchronizing circuit 65 The CPUs, under software control, decide which interrupts to service External interrupts, generated in the I/O processors or I/O controllers, are also signalled to the CPUs through the memory modules 14 and 15 via lines 69 and 70 in bus 35, and likewise th- CPUs only respond to an interrupt from the primary module 14 or lS
I/O Processor

Referring now to Figure 8, one of the I/O processors 26 or 27 is shown in detail. The I/O processor has two identical ports, one port 121 to the I/O bus 24 and the other port 122 to the I/O bus 25. Each one of the I/O busses 24 and 25 consists of: a 36-bit bidirectional multiplexed address/data bus 123 (containing 32-bits plus 4-bits parity), a bidirectional command bus 124 defining the read, write, block read, block write, etc., type of operation that is being executed, an address line that designates which location is being addressed, either internal to I/O processor or on busses 28, and the byte mask, and finally control lines 125 including address strobe, data strobe, address acknowledge and data acknowledge. The radial lines in bus 31 include individual lines from each I/O processor to each memory module: bus request from I/O processor to the memory modules, bus grant from the memory modules to the I/O processor, interrupt request lines from I/O processor to memory module, and a reset line from memory to I/O processor. Lines to indicate which memory module is primary are connected to each I/O processor via the system status bus 32. A controller or state machine 126 in the I/O processor of Figure 8 receives the command, control, status and radial lines and internal data, and command lines from the busses 28, and defines the internal operation of the I/O processor, including operation of latches 127 and 128 which receive the contents of busses 24 and 25 and also hold information for transmitting onto the busses.
A transfer on the busses 24 and 25 from memory module to I/O processor uses a protocol as shown in Figure 9 with the address and data separately acknowledged. The arbitrator circuit 110 in the memory module which is designated primary performs the arbitration for ownership of the I/O busses 24 and 25. When a transfer from CPUs to I/O is needed, the CPU request is presented to the arbitration logic 110 in the memory module. When the arbiter 110 grants this request the memory modules apply the address and command to busses 123 and 124 (of both busses 24 and 25) at the same time the address strobe is asserted on bus 125 (of both busses 24 and 25) in time T1 of Figure 9; when the controller 126 has caused the address to be latched into latches 127 or 128, the address acknowledge is asserted on bus 125, then the memory modules place the data (via both busses 24 and 25) on the bus 123 and a data strobe on lines 125 in time T2, following which the controller causes the data to be latched into both latches 127 and 128 and a data acknowledge signal is placed upon the lines 125; so upon receipt of the data acknowledge, both of the memory modules release the bus 24, 25 by de-asserting the address strobe signal. The I/O processor then deasserts the address acknowledge signal.

For transfers from I/O processor to the memory module, when the I/O processor needs to use the I/O bus, it asserts a bus request by a line in the radial bus 31, to both busses 24 and 25, then waits for a bus grant signal from an arbitrator circuit 110 in the primary memory module 14 or 15, the bus grant line also being one of the radials. When the bus grant has been asserted, the controller 126 then waits until the address strobe and address acknowledge signals on busses 125 are deasserted (i.e., false) meaning the previous transfer is completed. At that time, the controller 126 causes the address to be applied from latches 127 and 128 to lines 123 of both busses 24 and 25, the command to be applied to lines 124, and the address strobe to be applied to the bus 125 of both busses 24 and 25. When address acknowledge is received from both busses 24 and 25, these are followed by applying the data to the address/data busses, along with data strobes, and the transfer is completed with data acknowledge signals from the memory modules to the I/O processor.

The latches 127 and 128 are coupled to an internal bus 129 including an address bus 129a, a data bus 129b and a control bus 129c, which can address internal status and control registers 130 used to set up the commands to be executed by the controller state machine 126, to hold the status distributed by the bus 32, etc. These registers 130 are addressable for read or write from the CPUs in the address space of the CPUs. A bus interface 131 communicates with the VMEbus 28, under control of the controller 126. The bus 28 includes an address bus 28a, a data bus 28b, a control bus 28c, and radials 28d, and all of these lines are communicated through the bus interface modules 29 to the I/O controllers 30; the bus interface module 29 contains a multiplexer 132 to allow only one set of bus lines 28 (from one I/O processor or the other but not both) to drive the controller 30. Internal to the controller 30 are command, control, status and data registers 133 which (as is standard practice for peripheral controllers of this type) are addressable from the CPUs 11, 12 and 13 for read and write to initiate and control operations in I/O devices.

Each one of the I/O controllers 30 on the VMEbuses 28 has connections via a multiplexer 132 in the BIM 29 to both I/O processors 26 and 27 and can be controlled by either one, but is bound to one or the other by the program executing in the CPUs. A particular address (or set of addresses) is established for control and data-transfer registers 133 representing each controller 30, and these addresses are maintained in an I/O page table (normally in the kernel data section of local memory) by the operating system. These addresses associate each controller 30 as being accessible only through either I/O processor #1 or #2, but not both. That is, a different address is used to reach a particular register 133 via I/O processor 26 compared to I/O processor 27. The bus interface 131 (and controller 126) can switch the multiplexer 132 to accept bus 28 from one or the other, and this is done by a write to the registers 130 of the I/O processors from the CPUs. Thus, when the device driver is called up to access this controller 30, the operating system uses these addresses in the page table to do it. The processors 40 access the controllers 30 by I/O writes to the control and data-transfer registers 133 in these controllers using the write buffer bypass path 52, rather than through the write buffer 50, so these are synchronous writes, voted by circuits 100, passed through the memory modules to the busses 24 or 25, thus to the selected bus 28; the processors 40 stall until the write is completed. The I/O processor board of Figure 8 is configured to detect certain failures, such as improper commands, time-outs where no response is received over VMEbus 28, parity-checked data if implemented, etc., and when one of these failures is detected the I/O processor quits responding to bus traffic, i.e., quits sending address acknowledge and data acknowledge as discussed above with reference to Figure 9. This is detected by the bus interface 56 as a bus fault, resulting in an interrupt as will be explained, and self-correcting action if possible.
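The separately acknowledged address/data handshake of Figure 9 can be viewed as a small state machine. The following is our illustrative reconstruction, not the actual logic of controller 126; the names bus_phase and bus_step are invented:

```c
#include <assert.h>

/* Sketch of the Figure 9 handshake as seen by the bus master:
   T1 drives the address and asserts the address strobe, then waits
   for address acknowledge; T2 drives the data and waits for data
   acknowledge, after which the strobes are released. */
enum bus_phase { BUS_IDLE, BUS_ADDR, BUS_DATA, BUS_DONE };

static enum bus_phase bus_step(enum bus_phase p, int addr_ack, int data_ack)
{
    switch (p) {
    case BUS_IDLE: return BUS_ADDR;                       /* T1: address + address strobe */
    case BUS_ADDR: return addr_ack ? BUS_DATA : BUS_ADDR; /* wait for address acknowledge */
    case BUS_DATA: return data_ack ? BUS_DONE : BUS_DATA; /* T2: wait for data acknowledge */
    default:       return BUS_DONE;                       /* strobes de-asserted, cycle over */
    }
}
```

Because both busses 24 and 25 carry the same cycle, the real controller runs this handshake against both acknowledge pairs at once; the sketch models only one.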
Error Recovery:
The sequence used by the CPUs 11, 12 and 13 to evaluate responses by the memory modules 14 and 15 to transfers via busses 21, 22 and 23 will now be described. This sequence is defined by the state machine in the bus interface units 56 and in code executed by the CPUs.
In case one, for a read transfer, it is assumed that no data errors are indicated in the status bits on lines 33 from the primary memory. Here, the stall begun by the memory reference is ended by asserting a Ready signal via control bus 55 and 43 to allow instruction execution to continue in each microprocessor 40. But, another transfer is not started until acknowledge is received on line 112 from the other (non-primary) memory module (or it times out). An interrupt is posted if any error was detected in either status field (lines 33-1 or 33-2), or if the non-primary memory times out.

In case two, for a read transfer, it is assumed that a data error is indicated in the status lines 33 from the primary memory or that no response is received from the primary memory. The CPUs will wait for an acknowledge from the other memory, and if no data errors are found in status bits from the other memory, circuitry of the bus interface 56 forces a change in ownership (primary memory status), then a retry is instituted to see if data is correctly read from the new primary. If good status is received from the new primary, then the stall is ended as before, and an interrupt is posted to update the system (to note one memory bad and different memory is primary). However, if data error or timeout results from this attempt to read from the new primary, then an interrupt is asserted to the processor 40 via control bus 55 and 43.
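The two read-transfer cases above reduce to a small decision rule. A simplified sketch follows (the enum values and the name read_policy are ours, and the interrupt posted on a secondary-memory error in case one is omitted):

```c
#include <assert.h>

/* Outcome of a read transfer given status from each memory module. */
enum outcome {
    END_STALL,        /* case one: good primary status, continue execution  */
    SWITCH_AND_RETRY, /* case two: force ownership change, retry new primary */
    FAULT_INTERRUPT   /* neither module returned good status                 */
};

static enum outcome read_policy(int primary_ok, int other_ok)
{
    if (primary_ok) return END_STALL;        /* Ready asserted on bus 55 and 43 */
    if (other_ok)   return SWITCH_AND_RETRY; /* bus interface 56 swaps primary  */
    return FAULT_INTERRUPT;                  /* interrupt to the processor 40   */
}
```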
For write transfers, with the write buffer 50 bypassed, case one is where no data errors are indicated in status bits 33-1 or 33-2 from either memory module. The stall is ended to allow instruction execution to continue. Again, an interrupt is posted if any error was detected in either status field.

For write transfers, write buffer 50 bypassed, case two is where a data error is indicated in status from the primary memory, or no response is received from the primary memory. The interface controller of each CPU waits for an acknowledge from the other memory module, and if no data errors are found in the status from the other memory an ownership change is forced and an interrupt is posted. But if data errors or timeout occur for the other (new primary) memory module, then an interrupt is asserted to the processor 40.
For write transfers with the write buffer 50 enabled, so the CPU chip is not stalled by a write operation, case one is with no errors indicated in status from either memory module. The transfer is ended, so another bus transfer can begin. But if any error is detected in either status field an interrupt is posted.

For write transfers, write buffer 50 enabled, case two is where a data error is indicated in status from the primary memory, or no response is received from the primary memory. The mechanism waits for an acknowledge from the other memory, and if no data error is found in the status from the other memory then an ownership change is forced and an interrupt is posted. But if data error or timeout occur for the other memory, then an interrupt is posted.
Once it has been determined by the mechanism just described that a memory module 14 or 15 is faulty, the fault condition is signalled to the operator, but the system can continue operating. The operator will probably wish to replace the memory board containing the faulty module, which can be done while the system is powered up and operating. The system is then able to re-integrate the new memory board without a shutdown. This mechanism also works to revive a memory module that failed to execute a write due to a soft error but then tested good, so it need not be physically replaced. The task is to get the memory module back to a state where its data is identical to the other memory module. This revive mode is a two step process. First, it is assumed that the memory is uninitialized and may contain parity errors, so good data with good parity must be written into all locations; this could be all zeros at this point, but since all writes are executed on both memories, the way this first step is accomplished is to read a location in the good memory module then write this data to the same location in both memory modules 14 and 15. This is done while ordinary operations are going on, interleaved with the task being performed. Writes originating from the I/O busses 24 or 25 are ignored by this revive routine in its first stage. After all locations have been thus written, the next step is the same as the first except that I/O accesses are also written; that is, I/O writes from the I/O busses 24 or 25 are executed as they occur in ordinary traffic in the executing task, interleaved with reading every location in the good memory and writing this same data to the same location in both memory modules. When the modules have been addressed from zero to maximum address in this second step, the memories are identical. During this second revive step, both CPUs and I/O processors expect the memory module being revived to perform all operations without errors. The I/O processors 26, 27 will not use data presented by the memory module being revived during data read transfers. After completing the revive process the revived memory can then be (if necessary) designated primary.

A similar revive process is provided for CPU modules. When one CPU is detected faulty (as by the memory voter 100, etc.) the other two continue to operate, and the bad CPU board can be replaced without system shutdown. When the new CPU board has run its power-on self-test routines from on-board ROM 63, it signals this to the other CPUs, and a revive routine is executed. First, the two good CPUs will copy their state to global memory, then all three CPUs will execute a "soft reset" whereby the CPUs reset and start executing from their initialization routines in ROM, so they will all come up at the exact same point in their instruction stream and will be synchronized; then the saved state is copied back into all three CPUs and the task previously executing is continued.
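The memory-revive pass above relies on every CPU write being executed by both memory modules. A toy array-based model (interleaving with ordinary traffic is not modeled; the name revive_pass is ours):

```c
#include <assert.h>

/* Toy model of a revive pass: each location is read from the good
   module and written back; since writes go to both modules, the
   module being revived ends up identical, location by location. */
static void revive_pass(const int *good, int *revived, int nwords)
{
    for (int a = 0; a < nwords; a++) {
        int v = good[a];  /* read the location from the good module */
        revived[a] = v;   /* the write is executed on both modules,
                             updating the module being revived      */
    }
}
```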
As noted above, the vote circuit 100 in each memory module determines whether or not all three CPUs make identical memory references. If so, the memory operation is allowed to proceed to completion. If not, a CPU fault mode is entered. The CPU which transmits a different memory reference, as detected at the vote circuit 100, is identified in the status returned on bus 33-1 and/or 33-2. An interrupt is posted and software subsequently puts the faulty CPU offline. This offline status is reflected on status bus 32. The memory reference where the fault was detected is allowed to complete based upon the two-out-of-three vote, then until the bad CPU board has been replaced the vote circuit 100 requires two identical memory requests from the two good CPUs before allowing a memory reference to proceed. The system is ordinarily configured to continue operating with one CPU off-line, but not two. However, if it were desired to operate with only one good CPU, this is an alternative available. A CPU is voted faulty by the voter circuit 100 if different data is detected in its memory request, and also by a time-out; if two CPUs send identical memory requests, but the third does not send any signals for a preselected time-out period, that CPU is assumed to be faulty and is placed off-line as before.

The I/O arrangement of the system has a mechanism for software reintegration in the event of a failure. That is, the CPU and memory module core is hardware fault-protected as just described, but the I/O portion of the system is software fault-protected. When one of the I/O processors 26 or 27 fails, the controllers 30 bound to that I/O processor by software as mentioned above are switched over to the other I/O processor by software; the operating system rewrites the addresses in the I/O page table to use the new addresses for the same controllers, and from then on these controllers are bound to the other one of the pair of I/O processors 26 or 27. The error or fault can be detected by a bus error terminating a bus cycle at the bus interface 56, producing an exception dispatching into the kernel through an exception handler routine that will determine the cause of the exception, and then (by rewriting addresses in the I/O page table) move all the controllers 30 from the failed I/O processor 26 or 27 to the other one.

When the bus interface 56 detects a bus error as just described, the fault must be isolated before the reintegration scheme is used. When a CPU does a write, either to one of the I/O processors 26 or 27 or to one of the I/O controllers 30 on one of the busses 28 (e.g., to one of the control or status registers, or data registers, in one of the I/O elements), this is a bypass operation in the memory modules and both memory modules execute the operation, passing it on to the two I/O busses 24 and 25; the two I/O processors 26 and 27 both monitor the busses 24 and 25 and check parity and check the commands for proper syntax via the controllers 126. For example, if the CPUs are executing a write to a register in an I/O processor 26 or 27, if either one of the memory modules presents a valid address, valid command and valid data (as evidenced by no parity errors and proper protocol), the addressed I/O processor will write the data to the addressed location and respond to the memory module with an Acknowledge indication that the write was completed successfully. Both memory modules 14 and 15 are monitoring the responses from the I/O processor 26 or 27 (i.e., the address and data acknowledge signals of Figure 9, and associated status), and both memory modules respond to the CPUs with operation status on lines 33-1 and 33-2. (If this had been a read, only the primary memory module would return data, but both would return status.) Now the CPUs can determine if both executed the write correctly, or only one, or none. If only one returns good status, and that was the primary, then there is no need to force an ownership change, but if the backup returned good and the primary bad, then an ownership change is forced to make the one that executed correctly now the primary. In either case an interrupt is entered to report the fault. At this point the CPUs do not know whether it is a memory module or something downstream of the memory modules that is bad. So, a similar write is attempted to the other I/O processor, but if this succeeds it does not necessarily prove the memory module is bad because the I/O processor initially addressed could be hanging up a line on the bus 24 or 25, for example, and causing parity errors. So, the process can then selectively shut off the I/O processors and retry the operations, to see if both memory modules can correctly execute a write to the same I/O processor. If so, the system can continue operating with the bad I/O processor off-line until replaced and reintegrated. But if the retry still gives bad status from one memory, the memory can be taken off-line, or further fault-isolation steps taken to make sure the fault is in the memory and not in some other element; this can include switching all the controllers 30 to one I/O processor 26 or 27, then issuing a reset command to the off I/O processor and retrying communication with the online I/O processor with both memory modules live; then if the reset I/O processor had been corrupting the bus 24 or 25, its bus drivers will have been turned off by the reset, so if the retry of communication to the online I/O processor (via both busses 24 and 25) now returns good status it is known that the reset I/O processor was at fault. In any event, for each bus error, some type of fault isolation sequence is implemented to determine which system component needs to be forced offline.
Synchronization

The processors 40 used in the illustrative embodiment are of pipelined architecture with overlapped instruction execution, as discussed above with reference to Figures 4 and 5. Since a synchronization technique used in this embodiment relies upon cycle counting, i.e., incrementing a counter 71 and a counter 73 of Figure 2 every time an instruction is executed, generally as set forth in application Ser. No. 118,503, there must be a definition of what constitutes the execution of an instruction in the processor 40. A straightforward definition is that every time the pipeline advances an instruction is executed. One of the control lines in the control bus 43 is a signal RUN# which indicates that the pipeline is stalled; when RUN# is high the pipeline is stalled, when RUN# is low (logic zero) the pipeline advances each machine cycle. This RUN# signal is used in the numeric processor 46 to monitor the pipeline of the processor 40 so this coprocessor 46 can run in lockstep with its associated processor 40. This RUN# signal in the control bus 43 along with the clock 17 are used by the counters 71 and 73 to count Run cycles.
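A minimal sketch of Run-cycle counting as just described, with RUN# active low (the structure and function names are ours):

```c
#include <assert.h>

/* Counters 71 and 73 advance once per machine cycle in which the
   pipeline advances, i.e. when RUN# is low (logic zero); when RUN#
   is high the pipeline is stalled and the counters hold. */
struct cycle_counters { unsigned c71, c73; };

static void machine_cycle(struct cycle_counters *k, int run_n)
{
    if (run_n == 0) {  /* pipeline advanced: one instruction executed */
        k->c71++;
        k->c73++;
    }
}
```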
The size of the counter register 71, in a preferred embodiment, is chosen to be 4096, i.e., 2^12, which is selected because the tolerances of the crystal oscillators used in the clocks 17 are such that the drift in about 4K Run cycles on average results in a skew or difference in number of cycles run by a processor chip 40 of about all that can be reasonably allowed for proper operation of the interrupt synchronization as explained below. One synchronization mechanism is to force action to cause the CPUs to synchronize whenever the counter 71 overflows. One such action is to force a cache miss in response to an overflow signal OVFL from the counter 71; this can be done by merely generating a false Miss signal (e.g., TagValid bit not set) on control bus 43 for the next I-cache reference, thus forcing a cache miss exception routine to be entered, and the resultant memory reference will produce synchronization just as any memory reference does. Another method of forcing synchronization upon overflow of counter 71 is by forcing a stall in the processor 40, which can be done by using the overflow signal OVFL to generate a CP Busy (coprocessor busy) signal on control bus 43 via logic circuit 71a of Figure 2; this CP Busy signal always results in the processor 40 entering stall until CP Busy is deasserted. All three processors will enter this stall because they are executing the same code and will count the same cycles in their counter 71, but the actual time they enter the stall will vary; the logic circuit 71a receives the RUN# signal from bus 43 of the other two processors via input R#, so when all three have stalled the CP Busy signal is released and the processors will come out of stall in synch again.
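The 12-bit counter and its overflow-driven synchronizing event can be sketched as follows; the OVFL response (forced cache miss or CP Busy stall) is reduced to a flag here, and the names are ours:

```c
#include <assert.h>

/* Counter 71 counts Run cycles modulo 4096 (2^12); the overflow
   signal OVFL forces a synchronizing event (a faked cache miss or
   a CP Busy stall, as described in the text). */
struct drift_counter { unsigned count; int ovfl; };

static void run_cycle(struct drift_counter *c)
{
    c->count = (c->count + 1) & 0xFFFu;  /* 12-bit wraparound */
    c->ovfl  = (c->count == 0);          /* OVFL asserted on the wrap */
}
```

Because all three CPUs count the same instruction stream, each reaches the overflow at the same point in virtual time, even though the real time of the event differs per CPU.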
Thus, two synchronization techniques have been described, the first being the synchronization resulting from voting the memory references in circuits 100 in the memory modules, and the second by the overflow of counter 71 as just set forth. In addition, interrupts are synchronized, as will be described below. It is important to note, however, that the processors 40 are basically running free at their own clock speed, and are substantially decoupled from one another, except when synchronizing events occur. The fact that microprocessors are used as illustrated in Figures 4 and 5 would make lock-step synchronization with a single clock more difficult, and would degrade performance; also, use of the write buffer 50 serves to decouple the processors, and would be much less effective with close coupling of the processors. Likewise, the high performance resulting from using instruction and data caches, and virtual memory management with the TLBs 83, would be more difficult to implement if close coupling were used, and performance would suffer.
The interrupt synchronization technique must distinguish between real time and so-called "virtual time". Real time is the external actual time, clock-on-the-wall time, measured in seconds, or for convenience, measured in machine cycles which are 60-nsec divisions in the example. The clock generators 17 each produce clock pulses in real time, of course. Virtual time is the internal cycle-count time of each of the processor chips 40 as measured in each one of the cycle counters 71 and 73, i.e., the instruction number of the instruction being executed by the processor chip, measured in instructions since some arbitrary beginning point. Referring to Figure 10, the relationship between real time, shown as t0 to t12, and virtual time, shown as instruction number (modulo-16 count in count register 73) I0 to I15, is illustrated. Each row of Figure 10 is the cycle count for one of the CPUs A, B or C, and each column is a "point" in real time. The clocks for the CPUs will most likely be out of phase, so the actual time correlation will be as seen in Figure 10a, where the instruction numbers (columns) are not perfectly aligned, i.e., the cycle-count does not change on aligned real-time machine cycle boundaries; however, for explanatory purposes the illustration of Figure 10 will suffice. In Figure 10, at real time t3 the CPU-A is at the third instruction, CPU-B is at count-9 or executing the ninth instruction, and CPU-C is at the fourth instruction. Note that both real time and virtual time can only advance.
The processor chip 40 in a CPU stalls under certain conditions when a resource is not available, such as a D-cache 45 or I-cache 44 miss during a load or an instruction fetch, or a signal that the write buffer 50 is full during a store operation, or a "CP Busy" signal via the control bus 43 that the coprocessor 46 is busy (the coprocessor receives an instruction it cannot yet handle due to data dependency or limited processing resources), or the multiplier/divider 79 is busy (the internal multiply/divide circuit has not completed an operation at the time the processor attempts to access the result register). Of these, the caches 44 and 45 are "passive resources" which do not change state without intervention by the processor 40, but the remainder of the items are active resources that can change state while the processor is not doing anything to act upon the resource. For example, the write buffer 50 can change from full to empty with no action by the processor (so long as the processor does not perform another store operation). So there are two types of stalls: stalls on passive resources and stalls on active resources. Stalls on active resources are called interlock stalls.

Since the code streams executing on the CPUs A, B and C are the same, the states of the passive resources such as caches 44 and 45 in the three CPUs are necessarily the same at every point in virtual time. If a stall is a result of a conflict at a passive resource (e.g., the data cache 45) then all three processors will perform a stall, and the only variable will be the length of the stall. Referring to Figure 11, assume the cache miss occurs at I4, and that the access to the global memory 14 or 15 resulting from the miss takes eight clocks (actually it may be more than eight). In this case, CPU-C begins the access to global memory 14 and 15 at t1, and the controller 117 for global memory begins the memory access when the first processor CPU-C signals the beginning of the memory access. The controller 117 completes the access eight clocks later, at t8, although CPU-A and CPU-B each stalled less than the eight clocks required for the memory access. The result is that the CPUs become synchronized in real time as well as in virtual time. This example also illustrates the advantage of overlapping the access to DRAM 104 and the voting in circuit 100.

Interlock stalls present a different situation from passive resource stalls. One CPU can perform an interlock stall when another CPU does not stall at all. Referring to Figure 12, an interlock stall caused by the write buffer 50 is illustrated. The cycle-counts for CPU-A and CPU-B are shown, and the full flags Awb and Bwb from write buffers 50 for CPU-A and CPU-B are shown below the cycle-counts (high or logic one means full, low or logic zero means empty). The CPU checks the state of the full flag every time a store operation is executed; if the full flag is set, the CPU stalls until the full flag is cleared, then completes the store operation. The write buffer 50 sets the full flag if the store operation fills the buffer, and clears the full flag whenever a store operation drains one word from the buffer, thereby freeing a location for the next CPU store operation. At time t0 the CPU-B is three clocks ahead of CPU-A, and the write buffers are both full. Assume the write buffers are performing a write operation to global memory, so when this write completes during t5 the write buffer full flags will be cleared; this clearing will occur synchronously in t6 in real time (for the reason illustrated by Figure 11) but not synchronously in virtual time. Now, assume the instruction at cycle-count I6 is a store operation; CPU-A executes this store at t6 after the write buffer full flag is cleared, but CPU-B tries to execute this store operation at t3 and finds the write buffer full flag is still set and so has to stall for three clocks. Thus, CPU-B performs a stall that CPU-A did not.
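The full-flag interlock just described can be sketched as simple bookkeeping on an N-word buffer (the structure and names are ours, not the patent's):

```c
#include <assert.h>

/* Write buffer 50 with a full flag: a store that finds the flag set
   must stall; draining one word to global memory clears the flag. */
struct wbuf { int depth, capacity, full; };

/* Returns 1 if the store must stall (interlock stall), 0 if accepted. */
static int cpu_store(struct wbuf *w)
{
    if (w->full)
        return 1;            /* full flag set: stall until a drain */
    if (++w->depth == w->capacity)
        w->full = 1;         /* this store filled the buffer */
    return 0;
}

static void drain_one(struct wbuf *w)
{
    if (w->depth > 0) {      /* one word written out to global memory */
        w->depth--;
        w->full = 0;         /* a location is free again */
    }
}
```

Because each CPU's buffer drains on its own schedule relative to that CPU's instruction stream, one CPU can take this stall while another does not, which is exactly the Figure 12 situation.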
The property that one CPU may stall and the other not stall imposes a restriction on the interpretation of the cycle counter 71. In Figure 12, assume interrupts are presented to the CPUs on a cycle count of I7 (while the CPU-B is stalling from the I6 instruction). The run cycle for cycle count I7 occurs for both CPUs at t7. If the cycle counter alone presents the interrupt to the CPU, then CPU-A would see the interrupt on cycle count I7 but CPU-B would see the interrupt during a stall cycle resulting from cycle count I6, so this method of presenting interrupts would cause the two CPUs to take an exception on different instructions, a condition that would not have occurred if either all of the CPUs stalled or none stalled.

Another restriction on the interpretation of the cycle counter is that there should not be any delays between detecting the cycle count and performing an action. Again referring to Figure 12, assume interrupts are presented to the CPUs on cycle count I6, but because of implementation restrictions an extra clock delay is interposed between detection of cycle count I6 and presentation of the interrupt to the CPU. The result is that CPU-A sees this interrupt on cycle count I7, but CPU-B will see the interrupt during the stall from cycle count I6, causing the two CPUs to take an exception on different instructions. Again, the importance of monitoring the state of the instruction pipeline in real time is illustrated.
2~)03342 Interrupt Synchronization The three cPus of the ~y~tem of Figures 1-3 are required to function a~ a single logical processor, thus requiring that the CPUs adhere to certain restrictions regarding their internal state to ensure that the programming model of the three CPUs is that o~ a ~ingle logical proces~or Excopt in failure modes and in diagnostic functions, the instruction streams of the three CPUs are required to be identical~ If not identical, then voting global memory accesses at voting circuitry 100 of Figure 6 would be difficult; the voter would not know whether one CPU was faulty or whether it was executing a different sequence of instructions The synchronization scheme is designed so that if the code stream of any CPU diverges from the code stream of the other CPUs, then a failure is assumed to have occurred Interrupt synchroniza-i 15 tion provides one of the mechanisms of maintaining a single CPU image All interrupts are required to occur synchronous to virtual time, ensuring that the instruction streams of the three proce~sors CPU-A, CPU-~ and CPU-C will not diverge as a result of interrupts (th-re are other causes of divergent instruction stream-, such as one proc-ssor r-ading diff-rent data than the data read by the oth-r processors) Several scenario~ exist wh-r-by interrupts occurring asynchronous to virtual time would cau-- th- code str-ams to div-rg- For example, an interrupt 2S cau-lng a cont-xt witch on on- CPU b for- proces~ A completes,but cau-ing th- cont-xt ~witch after proce~s A completes on ~ anoth-r CPU would r--ult in a ~ituation wh-r-, at some point ;~, lat-r, on- CPU continu-s executing proces~ A, but the other CPUcannot ex-cut- proc-s~ A becaus- that proc-s~ had alr-ady compl-ted If in thi~ case the int-rrupt- occurr-d aJynchronous to virtual tim-, then just thQ fact that th- xc-ption program counters wer- diff-rent could ~au-e probl-ms The act of writing `j th- xception program counters to global m mory would result in ~~ 41 
the voter detecting different data from the three CPUs, producing a vote fault.

Certain types of exceptions in the CPUs are inherently synchronous to virtual time. One example is a breakpoint exception caused by the execution of a breakpoint instruction. Since the instruction streams of the CPUs are identical, the breakpoint exception occurs at the same point in virtual time on all three of the CPUs. Similarly, all such internal exceptions inherently occur synchronous to virtual time. For example, TLB
exceptions are internal exceptions that are inherently synchronous. TLB exceptions occur because the virtual page number does not match any of the entries in the TLB 83. Because the act of translating addresses is solely a function of the instruction stream (exactly as in the case of the breakpoint exception), the translation is inherently synchronous to virtual time. In order to ensure that TLB exceptions are synchronous to virtual time, the state of the TLBs 83 must be identical in all three of the CPUs 11, 12 and 13, and this is guaranteed because the TLB 83 can only be modified by software. Again, since all of the CPUs execute the same instruction stream, the state of the TLBs 83 is always changed synchronous to virtual time. So, as a general rule of thumb, if an action is performed by software then the action is synchronous to virtual time. If an action is performed by hardware which does not use the cycle counters 71, then the action is generally synchronous to real time.
External exceptions are not inherently synchronous to virtual time. I/O devices 26, 27 or 30 have no information about the virtual time of the three CPUs 11, 12 and 13. Therefore, all interrupts that are generated by these I/O devices must be synchronized to virtual time before being presented to the CPUs, as explained below. Floating point exceptions are different from I/O device interrupts because the floating point coprocessor 46 is tightly coupled to the microprocessor 40 within the CPU.
External devices view the three CPUs as one logical processor, and have no information about the synchrony or lack of synchrony between the CPUs, so the external devices cannot produce interrupts that are synchronous with the individual instruction stream (virtual time) of each CPU. Without any sort of synchronization, if some external device drove an interrupt at real time t1 of Figure 10, and the interrupt was presented directly to the CPUs at this time, then the three CPUs would take an exception trap at different instructions, resulting in an unacceptable state of the three CPUs. This is an example of an event (assertion of an interrupt) which is synchronous to real time but not synchronous to virtual time.

Interrupts are synchronized to virtual time in the system of Figures 1-3 by performing a distributed vote on the interrupts and then presenting the interrupt to the processor on a predetermined cycle count. Figure 13 shows a more detailed block diagram of the interrupt synchronization logic 65 of Figure 2. Each CPU contains a distributor 135 which captures the external interrupt from the line 69 or 70 coming from the modules 14 or 15; this capture occurs on a predetermined cycle count, e.g., at count-4 as signalled on an input line CC-4 from the counter 71. The captured interrupt is distributed to the other two CPUs via the inter-CPU bus 18. These distributed interrupts are called pending interrupts. There are three pending interrupts, one from each CPU 11, 12 and 13. A voter circuit 136 captures the pending interrupts and performs a vote to verify that all of the CPUs did receive the external interrupt request. On a predetermined cycle count (detected from the cycle counter 71), in this example cycle-8 received by input line CC-8, the interrupt voter 136 presents the interrupt to the interrupt pin on its respective microprocessor 40 via line 137 and control busses 55 and 43. Since the cycle count that is used to present the interrupt is predetermined, all of the microprocessors 40 will receive the interrupt on the same cycle count and thus the interrupt will have been synchronized to virtual time.
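The distribute-then-vote scheme just described can be sketched in a few lines. This is an illustrative model only (the constants and function names are ours, not from the patent), not the hardware logic:

```python
CC_DIST = 4   # cycle count at which the distributor 135 captures (line CC-4)
CC_VOTE = 8   # cycle count at which the voter 136 presents (line CC-8)

def presentation(offsets, cc_vote=CC_VOTE):
    """Model three CPUs that each execute one instruction per real-time
    tick but start at different real times (drift).  Each presents the
    interrupt when its own counter reaches cc_vote, so presentation
    happens at different real times yet at the same virtual time."""
    return [(off + cc_vote, cc_vote) for off in offsets]

# Suppose CPU-A, CPU-B and CPU-C lag real time by 3, 1 and 0 ticks.
events = presentation([3, 1, 0])
assert [virtual for _, virtual in events] == [8, 8, 8]  # same virtual time
assert len({real for real, _ in events}) == 3           # three real times
```

The point of the model is the invariant in the two assertions: real presentation times differ, virtual-time presentation does not.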
Figure 14 shows the sequence of events for synchronizing interrupts to virtual time. The rows labeled CPU-A, CPU-B, and CPU-C indicate the cycle count in counter 71 of each CPU at a point in real time. The rows labeled IRQ_A_PEND, IRQ_B_PEND, and IRQ_C_PEND indicate the state of the interrupt pending bits coupled via the inter-CPU bus 18 to the input of the voters 136 (a one signifies that the pending bit is set). The rows labeled IRQ_A, IRQ_B, and IRQ_C indicate the state of the interrupt input pin on the microprocessor 40 (the signals on lines 137), where a one signifies that an interrupt is present at the input pin.

In Figure 14, the external interrupt (EX_IRQ) is asserted on line 69 at t0. If the interrupt distributor 135 captures and then distributes the interrupt to the inter-CPU bus 18 on cycle count 4, then IRQ_C_PEND will go active at t1, IRQ_B_PEND will go active at t2, and IRQ_A_PEND will go active at t4. If the interrupt voter 136 captures and then votes the interrupt pending bits on cycle count 8, then IRQ_C will go active at t5, IRQ_B will go active at t6, and IRQ_A will go active at t8. The result is that the interrupts were presented to the CPUs at different points in real time but at the same point in virtual time (i.e., cycle count 8).

Figure 15 illustrates a scenario which requires the algorithm presented in Figure 14 to be modified. Note that the cycle counter 71 is here represented by a modulo 8 counter. The external interrupt (EX_IRQ) is asserted at time t3, and the interrupt distributor 135 captures and then distributes the interrupt to the inter-CPU bus 18 on cycle count 4. Since CPU-B and CPU-C have executed cycle count 4 before time t3, their interrupt distributors do not capture the external interrupt. CPU-A, however, executes cycle count 4 after time t3. The result is that CPU-A captures and distributes the external interrupt at time t4. But if the interrupt voter 136 captures and votes the interrupt pending bits on cycle 7, the interrupt voter on CPU-A
captures the IRQ_A_PEND signal at time t7, when the two other interrupt pending bits are not set. The interrupt voter 136 on CPU-A recognizes that not all of the CPUs have distributed the external interrupt and thus places the captured interrupt pending bit in a holding register 138. The interrupt voters 136 on CPU-B and CPU-C capture the single interrupt pending bit at times t5 and t4 respectively. Like the interrupt voter on CPU-A, these voters recognize that not all of the interrupt pending bits are set, and thus the single interrupt pending bit that is set is placed into the holding register 138. When the cycle counter 71 on each CPU reaches a cycle count of 7, the counter rolls over and begins counting at cycle count 0. Since the external interrupt is still asserted, the interrupt distributors 135 on CPU-B and CPU-C will capture the external interrupt at times t10 and t9 respectively. These times correspond to when the cycle count becomes equal to 4. At time t12, the interrupt voter on CPU-C captures the interrupt pending bits on the inter-CPU bus 18. The voter 136 determines that all of the CPUs did capture and distribute the external interrupt and thus presents the interrupt to the processor chip 40. At times t13 and t15, the interrupt voters 136 on CPU-B and CPU-A capture the interrupt pending bits and then present the interrupt to the processor chip 40. The result is that all of the processor chips received the external interrupt request at identical instructions, and the information saved in the holding registers is not needed.

Holding Register

In the interrupt scenario presented above with reference to Figure 15, the voter 136 uses a holding register 138 to save some state information. In particular, the saved state was that some, but not all, of the CPUs captured and distributed an external interrupt. If the system does not have any faults (as was the situation in Figure 15), then this state information is not
necessary because, as shown in the previous example, external interrupts can be synchronized to virtual time without the use of
the holding register 138. The algorithm is that the interrupt voter 136 captures and votes the interrupt pending bits on a predetermined cycle count. When all of the interrupt pending bits are asserted, then the interrupt is presented to the processor chip 40 on the predetermined cycle count. In the example of Figure 15, the interrupts were voted on cycle count 7.

Referring to Figure 15, if CPU-C fails and the failure mode is such that the interrupt distributor 135 does not function correctly, then if the interrupt voters 136 waited until all of the interrupt pending bits were set before presenting the interrupt to the processor chip 40, the result would be that the interrupt would never get presented. Thus, a single fault on a single CPU would render the entire interrupt chain on all of the CPUs inoperable.

The holding register 138 provides a mechanism for the voter 136 to know that the last interrupt vote cycle captured at least one, but not all, of the interrupt pending bits. The interrupt vote cycle occurs on the cycle count on which the interrupt voter captures and votes the interrupt pending bits. There are only two scenarios that result in some, but not all, of the interrupt pending bits being set. One is the scenario presented in reference to Figure 15, in which the external interrupt is asserted before the interrupt distribution cycle on some of the CPUs but after the interrupt distribution cycle on other CPUs. In the second scenario, at least one of the CPUs fails in a manner that disables the interrupt distributor. If the reason that only some of the interrupt pending bits are set at the interrupt vote cycle is the first scenario, then the interrupt voter is guaranteed that all of the interrupt pending bits will be set on the next interrupt vote cycle. Therefore, if the interrupt voter discovers that the holding register has been set and not all of the interrupt pending bits are set, then an error must exist on
one or more of the CPUs. This assumes that the holding register 138 of each CPU gets cleared when an interrupt is serviced, so that the state of the holding register does not represent stale state on the interrupt pending bits. In the case of an error, the interrupt voter 136 can present the interrupt to the processor chip 40 and simultaneously indicate that an error has been detected in the interrupt synchronization logic.
The interrupt voter 136 does not actually do any voting but instead merely checks the state of the interrupt pending bits and the holding register 138 to determine whether or not to present an interrupt to the processor chip 40 and whether or not to indicate an error in the interrupt logic.
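The voter's decision rule described above reduces to a small state machine over the three pending bits and the holding-register bit. The following is a sketch under that reading (the tuple encoding and function name are ours):

```python
def vote_cycle(pending, holding):
    """One interrupt vote cycle: returns (present, error, new_holding).
    pending holds the three interrupt pending bits; holding is the bit
    saved by the holding register 138 on the previous vote cycle."""
    if all(pending):
        return (True, False, False)     # unanimous: present the interrupt
    if any(pending):
        if holding:                     # partial on two consecutive vote
            return (True, True, False)  # cycles: a distributor has failed
        return (False, False, True)     # partial once: hold, wait a cycle
    return (False, False, holding)      # nothing pending this cycle

assert vote_cycle((1, 1, 1), False) == (True, False, False)
assert vote_cycle((1, 0, 0), False) == (False, False, True)  # Figure 15 case
assert vote_cycle((1, 1, 0), True) == (True, True, False)    # fault detected
```

Note that, as the text states, the fault case both presents the interrupt and raises the error indication, so a single failed distributor cannot silence the interrupt chain.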
Modulo Cycle Counters
The interrupt synchronization example of Figure 15 represented the interrupt cycle counter 71 as a modulo N counter (e.g., a modulo 8 counter). Using a modulo N cycle counter simplified the description of the interrupt voting algorithm by allowing the concept of an interrupt vote cycle. With a modulo N cycle counter, the interrupt vote cycle can be described as a single cycle count which lies between 0 and N-1, where N is the modulo of the cycle counter. Whatever value of cycle counter is chosen for the interrupt vote cycle, that cycle count is guaranteed to occur every N cycle counts; as illustrated in Figure 15 for a modulo 8 counter, every eight counts an interrupt vote cycle occurs. The interrupt vote cycle is used here merely to illustrate the periodic nature of a modulo N cycle counter. Any event that is keyed to a particular cycle count of a modulo N cycle counter is guaranteed to occur every N cycle counts. Obviously, an infinite (i.e., non-repeating) counter 71 could not be used.
A value of N is chosen to maximize system parameters that have a positive effect on the system and to minimize system parameters that have a negative effect on the system. Some of such effects are developed empirically. First, some of the parameters will be described: Cv and Cd are the interrupt vote cycle and the interrupt distribution cycle respectively (in the circuit of Figure 13 these are the inputs CC-8 and CC-4, respectively). The values of Cv and Cd must lie in the range between 0 and N-1, where N is the modulo of the cycle counter. Dmax is the maximum amount of cycle count drift between the three processors CPU-A, -B and -C that can be tolerated by the synchronization logic. The processor drift is determined by taking a snapshot of the cycle counter 71 from each CPU at a point in real time. The drift is calculated by subtracting the cycle count of the slowest CPU from the cycle count of the fastest CPU, performed as modulo N subtraction. The value of Dmax is described as a function of N and the values of Cv and Cd.

First, Dmax will be defined as a function of the difference Cv - Cd, where the subtraction operation is performed as modulo N subtraction. This allows us to choose values of Cv and Cd that maximize Dmax. Consider the scenario in Figure 16. Suppose that Cd = 8 and Cv = 9. From Figure 16 the processor drift can be calculated to be 4. The external interrupt on line 69 is asserted at time t4. In this case, CPU-B will capture and distribute the interrupt at time t5. CPU-B will then capture and vote the interrupt pending bits at time t6. This scenario is inconsistent with the interrupt synchronization algorithm presented earlier, because CPU-B executes its interrupt vote cycle before CPU-A has performed the interrupt distribution cycle. The flaw with this scenario is that the processors have drifted further apart than the difference between Cv and Cd. The relationship can be formally written as

Equation (1): Cv - Cd < Dmax - e
where e is the time needed for the interrupt pending bits to propagate on the inter-CPU bus 18. In previous examples, e has been assumed to be zero. Since wall-clock time has been quantized in clock cycle (Run cycle) increments, e can also be quantized. Thus the equation becomes

Equation (2): Cv - Cd < Dmax

where Dmax is expressed as an integer number of cycle counts.

Next, the maximum drift can be described as a function of N.
Figure 17 illustrates a scenario in which N = 4 and the processor drift is 3. Suppose that Cd = 0. The subscripts on cycle count 0 of each processor denote the quotient part (Q) of the instruction cycle count. Since the cycle count is now represented in modulo N, the value of the cycle counter is the remainder portion of I/N, where I is the number of instructions that have been executed since time t0. The Q of the instruction cycle count is the integer portion of I/N. If the external interrupt is asserted at time t3, then CPU-A will capture and distribute the interrupt at time t4, and CPU-B will execute its interrupt distribution cycle at time t5. This presents a problem because the interrupt distribution cycle for CPU-A has Q=1 and the interrupt distribution cycle for CPU-B has Q=2. The synchronization logic will continue as if there are no problems and will thus present the interrupt to the processors on equal cycle counts. But the interrupt will be presented to the processors on different instructions because the Q of each processor is different. The relationship of Dmax as a function of N is therefore
Equation (3): N/2 > Dmax
where N is an even number and Dmax is expressed as an integer number of cycle counts. (These equations 2 and 3 can be shown to be both equivalent to the Nyquist theorem in sampling theory.) Combining equations 2 and 3 gives
Equation (4): Cv - Cd < N/2 - 1

which allows optimum values of Cv and Cd to be chosen for a given value of N.
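Equations (2) and (3) can be checked numerically. The helper below is our naming and our reading of the constraints, with the propagation term e taken as zero as in the examples:

```python
def drift_tolerated(n, cv, cd, drift):
    """Per Equations (2)-(3): the vote cycle must come far enough after
    the distribution cycle (modulo N) that the slowest CPU has already
    distributed, and the drift itself must stay under N/2 for the
    modulo arithmetic to remain unambiguous."""
    return drift <= (cv - cd) % n and drift < n / 2

assert not drift_tolerated(16, cv=9, cd=8, drift=4)  # Figure 16 failure case
assert drift_tolerated(16, cv=8, cd=4, drift=3)      # preferred embodiment
```

In the Figure 16 case, Cv - Cd = 1 while the drift is 4, so the check fails, matching the inconsistency the text describes.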
All of the above equations suggest that N should be as large as possible. The only factor that tries to drive N to a small number is interrupt latency. Interrupt latency is the time interval between the assertion of the external interrupt on line 69 and the presentation of the interrupt to the microprocessor chip on line 137. Which processor should be used to determine the interrupt latency is not a clear-cut choice. The three microprocessors will operate at different speeds because of the slight differences in the crystal oscillators in clock sources 17 and other factors. There will be a fastest processor, a slowest processor, and the other processor. Defining the interrupt latency with respect to the slowest processor is reasonable because the performance of the system is ultimately determined by the performance of the slowest processor. The maximum interrupt latency is

Equation (5): Lmax = 2N - 1

where Lmax is the maximum interrupt latency expressed in cycle counts. The maximum interrupt latency occurs when the external interrupt is asserted after the interrupt distribution cycle Cd of the fastest processor but before the interrupt distribution cycle Cd of the slowest processor. The calculation of the average interrupt latency Lavg is more complicated because it depends on the probability that the external interrupt occurs after the interrupt distribution cycle of the fastest processor and before the interrupt distribution cycle of the slowest processor. This probability depends on the drift between the processors, which in turn is determined by a number of external factors. If we assume that these probabilities are zero, then the average latency may be expressed as

Equation (6): Lavg = N/2 + (Cv - Cd)

Using these relationships, values of N, Cv, and Cd are chosen using the system requirements for Dmax and interrupt latency. For example, choosing N = 128 and (Cv - Cd) = 10, Lavg = 74, or about 4.4 microsec (with no stall cycles). Using the preferred
embodiment where a four-bit (four binary stage) counter 71a is used as the interrupt synchronization counter, and the distribute and vote outputs are at CC-4 and CC-8 as discussed, it is seen that N = 16, Cv = 8 and Cd = 4, so Lavg = 16/2 + (8-4) = 12 cycles, or 0.7 microsec.
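Equation (6) and the two worked examples can be reproduced directly (the function name is ours):

```python
def avg_latency(n, cv_minus_cd):
    """Equation (6): Lavg = N/2 + (Cv - Cd), in cycle counts, assuming
    the boundary-case probabilities discussed above are zero."""
    assert 0 < cv_minus_cd < n // 2 - 1   # Equation (4) design constraint
    return n // 2 + cv_minus_cd

assert avg_latency(128, 10) == 74   # about 4.4 microsec with no stalls
assert avg_latency(16, 4) == 12     # the four-bit counter 71a embodiment
```

Both parameter choices satisfy Equation (4), since 10 < 63 and 4 < 7.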
Refresh Control for Local Memory
The refresh counter 72 counts non-stall cycles (not machine cycles) just as the counters 71 and 71a count. The object is that the refresh cycles will be introduced for each CPU at the same cycle count, measured in virtual time rather than real time. Preferably, each one of the CPUs will interpose a refresh cycle at the same point in the instruction stream as the other two. The DRAMs in local memory 16 must be refreshed on a 512 cycles per 8-msec schedule, just as mentioned above regarding the DRAMs 104 of the global memory. Thus, the counter 72 could issue a refresh command to the DRAMs 16 once every 15-microsec, addressing one row of 512, so the refresh specification would be satisfied; if a memory operation was requested during refresh then a Busy response would result until refresh was finished. But letting each CPU handle its own local memory refresh in real time independently of the others could cause the CPUs to get out of synch, and so additional control is needed. For example, if refresh mode is entered just as a divide operation is beginning, then timing is such that one CPU could take two clocks longer than the others; or, if a non-interruptable sequence was entered by a faster CPU and the others went into refresh before entering this routine, the CPUs could walk away from one another. However, using the cycle counter 71 (instead of real time) to avoid some of these problems means that stall cycles are not counted, and if a loop is entered causing many stalls (some can cause a 7-to-1 stall-to-run ratio) then the refresh specification is not met unless the period is decreased substantially from the 15-microsec figure, but that would degrade performance. For this reason, stall cycles are also counted in a second counter 72a,
seen in Figure 2, and every time this counter reaches the same number as that counted in the refresh counter 72, an additional refresh cycle is introduced. For example, the refresh counter 72 counts 2^8 or 256 Run cycles, in step with the counter 71, and when it overflows a refresh is signalled via control bus 43. Meanwhile, counter 72a counts 2^8 stall cycles (responsive to the RUN# signal and clock 17), and every time it overflows a second counter 72b is incremented (counter 72b may be merely bits 9-to-11 for the eight-bit counter 72a), so when a refresh mode is finally entered the CPU does a number of additional refreshes indicated by the number in the counter register 72b. Thus, if a long period of stall-intensive execution is encountered, the average number of refreshes will stay in the one per 15-microsec range, even if up to 7x256 stall cycles are interposed, because when finally going into a refresh mode the number of rows
refreshed will catch up to the nominal refresh rate, yet there is no degradation of performance by arbitrarily shortening the refresh cycle.

Memory Management

The CPUs 11, 12 and 13 of Figures 1-3 have memory space organized as illustrated in Figure 18. Using the example that the local memory 16 is 8-MByte and the global memory 14 or 15 is 32-MByte, note that the local memory 16 is part of the same continuous zero-to-40M map of CPU memory access space, rather than being a cache or a separate memory space; realizing that the 0-8M section is triplicated (in the three CPU modules), and the 8-40M section is duplicated, nevertheless logically there is merely a single 0-40M physical address space. An address over 8-MByte on bus 54 causes the bus interface 56 to make a request to the memory modules 14 and 15, but an address under 8-MByte will access the local memory 16 within the CPU module itself.
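The 0-40M split just described amounts to a simple range check in the bus interface. A sketch, using the sizes from the 8-MByte/32-MByte example (names are ours):

```python
LOCAL_TOP  = 8 * 2**20    # 0-8M: triplicated local memory 16 in each CPU
GLOBAL_TOP = 40 * 2**20   # 8-40M: duplicated global memory modules 14, 15

def route(addr):
    """Decide which memory services a physical address."""
    if addr < LOCAL_TOP:
        return "local"    # satisfied inside the CPU module itself
    if addr < GLOBAL_TOP:
        return "global"   # bus interface 56 requests modules 14 and 15
    raise ValueError("address outside the 0-40M physical space")

assert route(1 * 2**20) == "local"
assert route(16 * 2**20) == "global"
```

As the next paragraph notes, the boundary constants would simply grow (e.g., 32M local, 128M global) as denser memory chips become available.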
Performance is improved by placing more of the memory used by the applications being executed in local memory 16, and so as memory
chips are available in higher densities at lower cost and higher speeds, additional local memory will be added, as well as additional global memory. For example, the local memory might be 32-MByte and the global memory 128-MByte. On the other hand, if a very minimum-cost system is needed, and performance is not a major determining factor, the system can be operated with no local memory, all main memory being in the global memory area (in memory modules 14 and 15), although the performance penalty is high for such a configuration.

The content of local memory portion 141 of the map of Figure 18 is identical in the three CPUs 11, 12 and 13. Likewise, the two memory modules 14 and 15 contain identically the same data in their space 142 at any given instant. Within the local memory portion 141 is stored the kernel 143 (code) for the Unix operating system, and this area is physically mapped within a fixed portion of the local memory 16 of each CPU. Likewise, kernel data is assigned a fixed area 144 in each local memory 16;
except upon boot-up, these blocks do not get swapped to or from global memory or disk. Another portion 145 of local memory 16 is employed for user program (and data) pages, which are swapped to area 146 of the global memory 14 and 15 under control of the operating system. The global memory area 142 is used as a staging area for user pages in area 146, and also as a disk buffer in an area 147; if the CPUs are executing code which performs a write of a block of data or code from local memory 16 to disk 148, then the sequence is to always write to a disk buffer area 147 instead, because the time to copy to area 147 is negligible compared to the time to copy directly to the I/O processors 26 and 27 and thus via I/O controller 30 to disk 148. Then, while the CPUs proceed to execute other code, the write-to-disk operation is done, transparent to the CPUs, to move the block from area 147 to disk 148. In a like manner, the global memory area 146 is mapped to include an I/O staging 149 area, for similar treatment of I/O accesses other than disk (e.g., video).
The physical memory map of Figure 18 is correlated with the virtual memory management system of the processor 40 in each CPU.
Figure 19 illustrates the virtual address map of the R2000 processor chip used in the example embodiment, although it is understood that other microprocessor chips supporting virtual memory management with paging and a protection mechanism would provide corresponding features. In Figure 19, two separate 2-GByte virtual address spaces 150 and 151 are illustrated; the processor 40 operates in one of two modes, user mode and kernel mode. The processor can only access the area 150 in the user mode, or can access both the areas 150 and 151 in the kernel mode. The kernel mode is analogous to the supervisory mode provided in many machines. The processor 40 is configured to operate normally in the user mode until an exception is detected forcing it into the kernel mode, where it remains until a restore from exception (RFE) instruction is executed. The manner in which the memory addresses are translated or mapped depends upon the operating mode of the microprocessor, which is defined by a bit in a status register. When in the user mode, a single, uniform virtual address space 150 referred to as "kuseg" of 2-GByte size is available. Each virtual address is also extended with a 6-bit process identifier (PID) field to form unique virtual addresses for up to sixty-four user processes. All references to this segment 150 in user mode are mapped through the TLB 83, and use of the caches 44 and 45 is determined by bit settings for each page entry in the TLB entries, i.e., some pages may be cachable and some not, as specified by the programmer.
When in the kernel mode, the virtual address space includes both the areas 150 and 151 of Figure 19, and this space has four separate segments: kuseg 150, kseg0 152, kseg1 153 and kseg2 154. The kuseg 150 segment for the kernel mode is 2-GByte in size, coincident with the "kuseg" of the user mode, so when in the kernel mode the processor treats references to this segment just
like user mode references, thus streamlining kernel access to user data. The kuseg 150 is used to hold user code and data, but the operating system often needs to reference this same code or data. The kseg0 area 152 is a 512-MByte kernel physical address space direct-mapped onto the first 512-MBytes of physical address space, and is cached but does not use the TLB 83; this segment is used for kernel executable code and some kernel data, and is represented by the area 143 of Figure 18 in local memory 16. The kseg1 area 153 is also directly mapped into the first 512-MByte of physical address space, the same as kseg0, and is uncached and uses no TLB entries. Kseg1 differs from kseg0 only in that it is uncached. Kseg1 is used by the operating system for I/O registers, ROM code and disk buffers, and so corresponds to areas 147 and 149 of the physical map of Figure 18. The kseg2 area 154 is a 1-GByte space which, like kuseg, uses TLB 83 entries to map virtual addresses to arbitrary physical ones, with or without caching. This kseg2 area differs from the kuseg area 150 only in that it is not accessible in the user mode, but instead only in the kernel mode. The operating system uses kseg2 for stacks and per-process data that must remap on context switches, for user page tables (memory map), and for some dynamically-allocated data areas. Kseg2 allows selective caching and mapping on a per-page basis, rather than requiring an all-or-nothing approach.

The 32-bit virtual addresses generated in the registers 76 or PC 80 of the microprocessor chip and output on the bus 84 are represented in Figure 20, where it is seen that bits 0-11 are the offset, used unconditionally as the low-order 12 bits of the address on bus 42 of Figure 3, while bits 12-31 are the VPN or virtual page number, in which bits 29-31 select between kuseg, kseg0, kseg1 and kseg2. The process identifier PID for the currently-executing process is stored in a register also accessible by the TLB. The 64-bit TLB entries are represented in Figure 20 as well, where it is seen that the 20-bit VPN from the virtual address is compared to the 20-bit VPN field located in bits 44-63 of the 64-bit entry, while at the same time the PID is
compared to bits 38-43; if a match is found in any of the sixty-four 64-bit TLB entries, the page frame number PFN at bits 12-31 of the matched entry is used as the output via busses 82 and 42 of Figure 3 (assuming other criteria are met). Other one-bit values in a TLB entry include N, D, V and G. N is the non-cachable indicator; if set, the page is non-cachable and the processor directly accesses local memory or global memory instead of first accessing the cache 44 or 45. D is a write-protect bit; if set it means that the location is "dirty" and therefore writable, but if zero a write operation causes a trap. The V bit means valid if set, and allows the TLB entries to be cleared by merely resetting the valid bits; this V bit is used in the page-swapping arrangement of this system to indicate whether a page is in local or global memory. The G bit is to allow global accesses which ignore the PID match requirement for a valid TLB translation; in kseg2 this allows the kernel to access all mapped data without regard for PID.
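The match rule for a 64-bit TLB entry follows from the bit positions given above (VPN in bits 44-63, PID in bits 38-43, PFN in bits 12-31). The flag bits N, D, V and G are omitted from this sketch, and the packing helper is ours, not the hardware format in full:

```python
def make_entry(vpn, pid, pfn):
    """Pack the explicitly located fields of a 64-bit TLB entry."""
    return (vpn & 0xFFFFF) << 44 | (pid & 0x3F) << 38 | (pfn & 0xFFFFF) << 12

def lookup(entries, vpn, pid):
    """Return the PFN of a matching entry, or None (a miss causes a trap)."""
    for e in entries:
        if (e >> 44) & 0xFFFFF == vpn and (e >> 38) & 0x3F == pid:
            return (e >> 12) & 0xFFFFF
    return None

tlb = [make_entry(vpn=0x12345, pid=7, pfn=0xABC)]
assert lookup(tlb, 0x12345, 7) == 0xABC
assert lookup(tlb, 0x12345, 8) is None   # PID mismatch (no G bit modeled)
```

A real entry with G set would skip the PID comparison, which is exactly the kseg2 behavior described above.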
The device controllers 30 cannot do DMA into local memory 16 directly, and so the global memory is used as a staging area for DMA-type block transfers, typically from disk 148 or the like. The CPUs can perform operations directly at the controllers 30, to initiate or actually control operations by the controllers (i.e., programmed I/O), but the controllers 30 cannot do DMA except to global memory; the controllers 30 can become the VMEbus (bus 28) master and, through the I/O processor 26 or 27, do reads or writes directly to global memory in the memory modules 14 and 15.
Page swapping between global and local memories (and disk) is initiated either by a page fault or by an aging process. A page fault occurs when a process is executing and attempts to execute from or access a page that is in global memory or on disk; the TLB 83 will show a miss and a trap will result, so low-level trap code in the kernel will show the location of the page, and a routine will be entered to initiate a page swap. If the
pag- rault occurs wh-n a proces~ i~ ex-cuting and att-mpts to ex-cute ~rom or access a pag- that i- in global m-mory or on disk; the TI~3 83 will show a mi~ and a trap will r-sult, so low ~ level trap code in th- kern-l will show th- location o~ the page, `~ and a routine will b- entered to initiat- a pag- swap If the r ~ 56 ;~
. ~
,~, `', - ~ ~ .. . ..
.~ . .
.. ...~.
' - ; , ,. '~ , `
~ ~;
page needed i~ in global memory, a serie- of command~ are sent to the DMA controller 74 to writ- t~ t-r~cently-used pag- ~rom local memory to global memory and to read th- needed page from global to local If the page i9 on disk, commands and addresses s (sectors) are written to the controller 30 rom tho CPU to go to disk and acquire th- page, then the proce-s which made the memory referenc- is suspended When the disk controller has ~ound the data and i~ ready to ond it, an interrupt i- signalled which will be used by the memory module~ (not reaching the CPUs) to allow the disk controller to begin a DMA to global memory to write th- page into global memory, and wh-n finished the CPU is interrupted to begin a block transfer under control of DMA
controller 74 to swap a least used page from local to global and read the needed page to local Then, th- original process is lS mad- runnabl- again, ~tate is re~tored, and the original memory referenc- will again occur, finding the needed page in local ~-mory The other mechanism to initiate pag- swapping is an aging routin- by which the operating ~y~tem periodically goes through th- page~ in local memory marking them as to whether or `20 not each page has been used recently, and tho~e that have not are sub~ect to be pushed out to global memory A ta~k switch does not itselr initiat- pag- swapping, but in~tead as the new task begin~ to produce page fault~ pages will be swapped as needed, i and th- candidate- ~or swapping out ar- those not recently used .
If a memory reference is made and a TLB miss is shown, but the page table lookup resulting from the TLB miss exception shows the page is in local memory, then a TLB entry is made to show this page to be in local memory. That is, the process takes an exception when the TLB miss occurs, goes to the page tables (in the kernel data section), finds the table entry, writes to TLB, then the process is allowed to proceed. But if the memory reference shows a TLB miss, and the page tables show the corresponding physical address is in global memory (over 8M physical address), the TLB entry is made for this page, and when the process resumes it will find the page entry in the TLB as
before; yet another exception is taken because the valid bit will be zero, indicating the page is physically not in local memory, so this time the exception will enter a routine to swap the page from global to local and validate the TLB entry, so execution can then proceed. In the third situation, if the page tables show the address for the memory reference is on disk, not in local or global memory, then the system operates as indicated above, i.e., the process is put off the run queue and put in the sleep queue, a disk request is made, and when the disk has transferred the page to global memory and signalled a command-complete interrupt, then the page is swapped from global to local, and the TLB updated, then the process can execute again.

Private Memory

Although the memory modules 14 and 15 store the same data at the same locations, and all three CPUs 11, 12 and 13 have equal access to these memory modules, there is a small area of the memory assigned under software control as a private memory in each one of the memory modules. For example, as illustrated in Figure 21, an area 155 of the map of the memory module locations is designated the private memory area, and is writable only when the CPUs issue a "private memory write" command on bus 59. In an example embodiment, the private memory area 155 is a 4K page starting at the address contained in a register 156 in the bus interface 56 of each one of the CPU modules; this starting address can be changed under software control by writing to this register 156 by the CPU. The private memory area 155 is further divided between the three CPUs; only CPU-A can write to area 155a, CPU-B to area 155b, and CPU-C to area 155c. One of the command signals in bus 57 is set by the bus interface 56 to inform the memory modules 14 and 15 that the operation is a private write, and this is set in response to the address generated by the processor 40 from a Store instruction; bits of the address (and a Write command) are detected by a decoder 157
in the bus interface (which compares bus addresses to the contents of register 156) and used to generate the "private memory write" command for bus 57. In the memory module, when a write command is detected in the registers 94, 95 and 96, and the addresses and commands are all voted good (i.e., in agreement) by the vote circuit 100, then the control circuit 100 allows the data from only one of the CPUs to pass through to the bus 101, this one being determined by two bits of the address from the CPUs. During this private write, all three CPUs present the same address on their bus 57 but different data on their bus 58 (the different data is some state unique to the CPU, for example). The memory modules vote the addresses and commands, and select data from only one CPU based upon part of the address field seen on the address bus. To allow the CPUs to vote some data, all three CPUs will do three private writes (there will be three writes on the busses 21, 22 and 23) of some state information unique to a CPU, into both memory modules 14 and 15. During each write, each CPU sends its unique data, but only one is accepted each time. So, the software sequence executed by all three CPUs is (1) Store (to location 155a), (2) Store (to location 155b), (3) Store (to location 155c). But data from only one CPU is actually written each time, and the data is not voted (because it is or could be different and could show a fault if voted). Then, the CPUs can vote the data by having all three CPUs read all three of the locations 155a, 155b and 155c, and by software compare this data. This type of operation is used in diagnostics, for example, or in interrupts to vote the cause register data.

The private-write mechanism is used in fault detection and recovery. For example, the CPUs may detect a bus error upon making a memory read request, such as a memory module 14 or 15 returning bad status on lines 33-1 or 33-2. At this point a CPU doesn't know if the other CPUs received the same status from the memory module; the CPU could be faulty or its status detection circuit faulty, or, as indicated, the memory could be faulty.
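The exchange-and-compare pattern described above (three private stores to locations 155a, 155b and 155c, then three reads and a software comparison) can be sketched in software. This is an illustrative sketch, not the hardware voting path: the function name, the dictionary standing in for the private memory area, and the simple majority compare are all assumptions made for the demonstration.

```python
# Hypothetical model of the private-write exchange. Each CPU stores its
# own status word into its own private slot; the memory module votes
# addresses but accepts data from only the owning CPU. Every CPU then
# reads all three slots back and compares them in software.
CPUS = ("A", "B", "C")

def exchange_and_compare(status):
    """status: dict mapping CPU -> the status word that CPU observed.
    Returns the set of CPUs whose view disagrees with the majority."""
    # Phase 1: three private writes -- slot i accepts data only from CPU i,
    # so differing data never reaches the hardware voter.
    private_area = {cpu: status[cpu] for cpu in CPUS}
    # Phase 2: every CPU reads all three slots; majority vote in software.
    values = list(private_area.values())
    majority = max(set(values), key=values.count)
    return {cpu for cpu, v in private_area.items() if v != majority}
```

In this model an empty disagreement set (all three CPUs saw the same bad status) points at the memory module, while a single dissenting CPU points at that CPU, mirroring the off-line voting decision described in the text.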
So, to isolate the fault, when the bus fault routine mentioned above is entered, all three CPUs do a private write of the status information they just received from the memory modules in the preceding read attempt. Then all three CPUs read what the others have written, and compare it with their own memory status information. If they all agree, then the memory module is voted off-line. If not, and one CPU shows bad status for a memory module but the others show good status, then that CPU is voted off-line.

Fault-Tolerant Power Supply

Referring now to Figure 22, the system of the preferred embodiment may use a fault-tolerant power supply which provides the capability for on-line replacement of failed power supply modules, as well as on-line replacement of CPU modules, memory modules, I/O processor modules, I/O controllers and disk modules as discussed above. In the circuit of Figure 22, an a/c power line 160 is connected directly to a power distribution unit 161 that provides power line filtering, transient suppressors, and a circuit breaker to protect against short circuits. To protect against a/c power line failure, redundant battery packs 162 and 163 provide 4-1/2 minutes of full system power so that orderly system shutdown can be accomplished. Only one of the two battery packs 162 or 163 is required to be operative to safely shut the system down.

The power subsystem has two identical AC-to-DC bulk power supplies 164 and 165 which exhibit high power factor and energize a pair of 36-volt DC distribution busses 166 and 167. The system can remain operational with one of the bulk power supplies 164 or 165 operational.

Four separate power distribution busses are included in these busses 166 and 167. The bulk supply 164 drives a power bus
166-1, 167-1, while the bulk supply 165 drives power bus 166-2, 167-2. The battery pack 162 drives bus 166-3, 167-3, and is itself recharged from both 166-1 and 166-2. The battery pack 163 drives bus 166-3, 167-3 and is recharged from busses 166-1 and 167-2. The three CPUs 11, 12 and 13 are driven from different combinations of these four distribution busses.

A number of DC-to-DC converters 168 connected to these 36-v busses 166 and 167 are used to individually power the CPU modules 11, 12 and 13, the memory modules 14 and 15, the I/O processors 26 and 27, and the I/O controllers 30. The bulk power supplies 164 and 165 also power the three system fans 169, and battery chargers for the battery packs 162 and 163. By having these separate DC-to-DC converters for each system component, failure of one converter does not result in system shutdown, but instead the system will continue under one of its failure recovery modes discussed above, and the failed power supply component can be replaced while the system is operating.

The power system can be shut down by either a manual switch (with standby and off functions) or under software control from a maintenance and diagnostic processor 170 which automatically defaults to the power-on state in the event of a maintenance and diagnostic power failure.
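The redundancy policy of this power subsystem reduces to two simple rules: the system stays operational if at least one bulk supply works, and on total a/c failure it can still shut down in an orderly way if at least one battery pack works. A minimal sketch of that policy, assuming a hypothetical function name and set-based model of working units:

```python
# Hypothetical model of the power-subsystem failure policy described in
# the text. bulk_ok and battery_ok are sets of working unit numbers,
# e.g. {164, 165} and {162, 163}.
def system_power_state(bulk_ok, battery_ok):
    if bulk_ok:
        return "operational"       # one of bulk supplies 164/165 suffices
    if battery_ok:
        return "orderly-shutdown"  # 4-1/2 minutes of full system power
    return "failed"
```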
While the invention has been described with reference to a specific embodiment, the description is not meant to be construed in a limiting sense. Various modifications of the disclosed embodiment, as well as other embodiments of the invention, will be apparent to persons skilled in the art upon reference to this description. It is therefore contemplated that the appended claims will cover any such modifications or embodiments as fall within the true scope of the invention.
Claims (44)
1. A computer system comprising:
a) multiple CPUs each executing the same instruction stream, each CPU employing virtual memory addressing with paging;
b) each CPU having a local memory accessible only by said CPU, the local memory containing selected pages;
c) a global memory accessible by all said CPUs, the local memory having faster access time than the global memory, the global memory containing selected pages and page-swapped with said local memory upon demand to maintain most-used pages in said local memory of each CPU.
2. A system according to claim 1 further including a disk memory coupled to said global memory and having access time slower than said global memory, the disk memory containing pages defined by said virtual memory addressing and page-swapped with said global memory and local memory upon demand.
3. A system according to claim 1 further including an operating system having a kernel stored in said local memory for each CPU.
4. A system according to claim 1 wherein each said CPU has a separate cache memory having access time faster than that of said local memory.
5. A system according to claim 1 wherein said CPUs are clocked independent of one another, and wherein said CPUs are synchronized upon accessing said global memory, and said global memory is duplicated.
6. A system according to claim 1 wherein said global memory is coupled to I/O means accessible only via said global memory, and said global memory is used for staging I/O requests by said CPUs.
7. A method of operating a computer system, comprising the steps of:
a) executing the same instruction stream in multiple CPUs using virtual memory addressing with paging;
b) accessing a local memory by each CPU in execution of said instruction stream, each local memory accessible only by one of said CPUs, to store selected pages in the local memory;
c) accessing a global memory by all of said CPUs in execution of said instruction stream, the global memory accessible by all said CPUs, the local memory having faster access time than the global memory, to store selected pages in the global memory page-swapped with said local memory upon demand to maintain most-used pages in said local memory of each CPU.
8. A method according to claim 7 further including the step of storing pages in a disk memory coupled to said global memory, the disk memory having access time slower than said global memory, the pages stored in said disk memory being defined by said virtual memory addressing and page-swapped with said global memory and local memory upon demand.
9. A method according to claim 7 including executing said instruction stream under an operating system having a kernel stored in said local memory for each CPU.
10. A method according to claim 7 wherein each said CPU has a separate cache memory having access time faster than that of said local memory.
11. A method according to claim 7 including the step of clocking the CPUs independently of one another, including the step of synchronizing the CPUs upon accessing said global memory, and wherein said global memory is duplicated.
12. A method according to claim 7 wherein said global memory is coupled to I/O means accessible only via said global memory, and including the step of transferring data between said CPUs and said I/O means using said global memory for staging.
13. A method of operating a computer system, comprising the steps of:
a) executing the same instruction stream in multiple processors using virtual memory addressing with paging under control of an operating system having a kernel;
b) accessing a local memory by each processor in execution of said instruction stream, each local memory accessible only by one of said processors, to store selected pages in the local memory and to store said kernel of said operating system;
c) accessing a duplicated global memory by all of said processors in execution of said instruction stream, the global memory accessible by all said processors, the local memory having faster access time than the global memory, to store selected pages in the global memory page-swapped with said local memory upon demand under control of said operating system to maintain most-used pages in said local memory of each processor; and d) storing pages in a disk memory coupled to said global memory, the disk memory having access time slower than said global memory, the pages stored in said disk memory being defined by said virtual memory addressing using said operating system and page-swapped with said global memory and local memory upon demand.
14. A method according to claim 13 wherein each said processor has a separate cache memory having access time faster than that of said local memory.
15. A method according to claim 13 including the steps of clocking the processors independently of one another, and including the step of synchronizing the processors upon accessing said global memory.
16. A method according to claim 13 wherein said global memory is coupled to I/O means accessible only via said global memory, and including the step of transferring data between said processor and said I/O means using said global memory for staging.
17. A computer system, comprising a) a plurality of CPUs each executing an instruction stream, the CPUs being clocked independently of one another to provide execution cycles, the CPUs executing stall cycles while awaiting implementation of some instruction execution;
b) each of the CPUs having a first counter to count execution cycles but not stall cycles, and having a second counter to count stall cycles;
c) each of said CPUs having a local memory requiring periodic refresh;
d) and a refresh control for each CPU responsive to said first and second counters to initiate a refresh of said local memory to perform a number of refresh cycles depending upon output of the second counter.
18. A system according to claim 17 wherein said refresh control initiates said refresh at execution of the same instruction in said instruction stream in each of said CPUs.
19. A system according to claim 17 wherein said CPUs are loosely synchronized by voting access to a common memory accessible by all said CPUs.
20. A system according to claim 17 wherein there are three said CPUs and wherein said CPUs access a duplicated common global memory.
21. A computer system comprising:
a) a CPU executing an instruction stream, the CPU being clocked to provide execution cycles, the CPU executing stall cycles while awaiting implementation of some instruction execution;
b) the CPU having a first counter to count execution cycles but not stall cycles, and having a second counter to count stall cycles;
c) said CPU having a memory requiring periodic refresh;
d) and a refresh control for said CPU responsive to said first and second counters to initiate a refresh of said memory to perform a number of refresh cycles depending upon output of the second counter.
22. A system according to claim 21 wherein a third counter counts the number of times said second counter overflows, and the number of said refresh cycles is determined by the content of said third counter.
23. A system according to claim 21 wherein said first counter is of a size related to the number of refresh cycles needed by said local memory in a given time period.
24. A method of operating a computer system, comprising the steps of:
a) executing an instruction stream in each of a plurality of CPUs, the CPUs being clocked independently of one another to provide execution cycles, the CPUs executing stall cycles while awaiting implementation of some instruction execution;
b) counting execution cycles but not stall cycles in each of the CPUs in a first counter, and counting stall cycles in each CPU in a second counter;
c) each of said CPUs accessing a local memory requiring periodic refresh;
d) and initiating refresh of said local memory for each CPU
responsive to said first and second counters to perform a number of refresh cycles depending upon output of the second counter.
25. A method according to claim 24 wherein said step of initiating said refresh is done at execution of the same instruction in said instruction stream in each of said CPUs.
26. A method according to claim 24 wherein said CPUs are loosely synchronized by voting access to a common memory accessible by all said CPUs.
27. A method according to claim 24 wherein there are three said CPUs and wherein said CPUs access a duplicated common global memory.
28. A computer system comprising:
a) multiple CPUs executing the same instruction stream, b) a common memory having memory space accessed by all said CPUs, c) a private memory space in said common memory for storing state information for each CPU writable only by one CPU, d) said state information in said private memory spaces for all CPUs being readable by all CPUs to thereby evaluate said state information for equality by each CPU.
29. A system according to claim 28 wherein there are a plurality of said private memory spaces, one for each one of said CPUs.
30. A system according to claim 28 wherein memory accesses made by said CPUs to said common memory are voted by said common memory before being executed.
31. A system according to claim 30 wherein memory accesses made by said CPUs to said private memory are voted to compare addresses but not data.
32. A system according to claim 28 wherein said private memory for each CPU has the same logical address associated with instructions executed by said CPUs, but is translated to a unique address for each private memory before addressing said common memory.
33. A computer system having multiple CPUs comprising:
a) a shared memory having memory space accessed by all of said multiple CPUs, b) each one of said multiple CPUs also having a separate private-write memory space in said shared memory for storing state information, each said private-write space writable only by one of said multiple CPUs;
c) said private-write memory spaces for each one of said multiple CPUs being readable by all of said multiple CPUs.
34. A system according to claim 33 wherein said multiple CPUs are executing the same instruction stream.
35. A system according to claim 34 wherein said shared memory votes memory requests made by said multiple CPUs to said shared memory.
36. A system according to claim 33 wherein said shared memory votes write requests made to said private-write spaces by comparing addresses but not data.
37. A method of operating a computer system having multiple processors, comprising the steps of:
a) storing data by each of said multiple processors in a shared memory having memory space accessed by all of said multiple processors, b) also storing information by each one of said multiple processors in a private memory space for each multiple processor writable only by one multiple processor.
38. A method according to claim 37 including the step of executing the same instruction stream in each one of said multiple processors.
39. A method according to claim 37 wherein said step of storing data includes voting memory requests to said shared memory made by said multiple processors.
40. A method according to claim 37 wherein said step of storing information in private memory space includes making a write request to all of said private memory spaces by each of said multiple processors but executing the write request only for the one processor for each write request associated with each private memory space.
41. A method according to claim 37 including the step of evaluating for equality said information from said private memory space by each one of said multiple processors.
42. A method according to claim 37 including the step of reading said information in said private memory spaces for all multiple processors by each multiple processor.
43. A method according to claim 42 including the step of executing the same instruction stream in each one of said multiple processors, and wherein said step of storing data includes voting memory requests to said shared memory made by said multiple processors.
44. A method according to claim 43 wherein said multiple processors are loosely synchronized upon the event of voting memory requests.
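The stall-compensated refresh scheme of claims 17 through 27 can be illustrated with a small model. Because every CPU executes the same instruction stream, a counter of execution cycles (but not stall cycles) reaches its limit at the same instruction on every CPU, while each CPU's stall-cycle counter tells it how many extra refresh cycles its real elapsed time requires. The sketch below is an assumption-laden model, not the hardware: the counter limits are invented, and the "third counter" of claim 22 is approximated by an integer division.

```python
# Hypothetical constants: execution cycles between refresh bursts, and
# stall cycles considered worth one extra refresh cycle.
EXEC_LIMIT = 1000
STALLS_PER_REFRESH = 250

class RefreshControl:
    def __init__(self):
        self.exec_count = 0      # first counter: execution cycles only
        self.stall_count = 0     # second counter: stall cycles only
        self.refreshes_done = 0

    def tick(self, stalled):
        if stalled:
            self.stall_count += 1
            return
        self.exec_count += 1
        if self.exec_count == EXEC_LIMIT:
            # Base refresh plus extras proportional to accumulated stalls
            # (stall_count // STALLS_PER_REFRESH stands in for the
            # overflow-counting "third counter" of claim 22).
            extra = self.stall_count // STALLS_PER_REFRESH
            self.refreshes_done += 1 + extra
            self.exec_count = 0
            self.stall_count %= STALLS_PER_REFRESH
```

In this model a CPU that stalled more (e.g. waiting on voted global-memory accesses) performs more refresh cycles per burst, keeping its local DRAM refreshed per unit of real time while the burst itself stays synchronized to the instruction stream across CPUs.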
Applications Claiming Priority (11)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US28246988A | 1988-12-09 | 1988-12-09 | |
US28262988A | 1988-12-09 | 1988-12-09 | |
US28254088A | 1988-12-09 | 1988-12-09 | |
US282,469 | 1988-12-09 | ||
US282,540 | 1988-12-09 | ||
US28357488A | 1988-12-13 | 1988-12-13 | |
US07/283,573 US4965717A (en) | 1988-12-09 | 1988-12-13 | Multiple processor system having shared memory with private-write capability |
US283,573 | 1988-12-13 | ||
EP90105102A EP0447577A1 (en) | 1988-12-09 | 1990-03-19 | High-performance computer system with fault-tolerant capability |
EP90105103A EP0447578A1 (en) | 1988-12-09 | 1990-03-19 | Memory management in high-performance fault-tolerant computer system |
AU52027/90A AU628497B2 (en) | 1988-12-09 | 1990-03-20 | Memory management in high-performance fault-tolerant computer systems |
Publications (1)
Publication Number | Publication Date |
---|---|
CA2003342A1 true CA2003342A1 (en) | 1990-06-09 |
Family
ID=41040648
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CA002003342A Abandoned CA2003342A1 (en) | 1988-12-09 | 1989-11-20 | Memory management in high-performance fault-tolerant computer system |
CA002003337A Abandoned CA2003337A1 (en) | 1988-12-09 | 1989-11-20 | High-performance computer system with fault-tolerant capability |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CA002003337A Abandoned CA2003337A1 (en) | 1988-12-09 | 1989-11-20 | High-performance computer system with fault-tolerant capability |
Country Status (7)
Country | Link |
---|---|
US (7) | US4965717A (en) |
EP (5) | EP0681239A3 (en) |
JP (3) | JPH079625B2 (en) |
AT (1) | ATE158879T1 (en) |
AU (1) | AU628497B2 (en) |
CA (2) | CA2003342A1 (en) |
DE (1) | DE68928360T2 (en) |
Families Citing this family (432)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CA2003338A1 (en) * | 1987-11-09 | 1990-06-09 | Richard W. Cutts, Jr. | Synchronization of fault-tolerant computer system having multiple processors |
AU616213B2 (en) | 1987-11-09 | 1991-10-24 | Tandem Computers Incorporated | Method and apparatus for synchronizing a plurality of processors |
JPH02103656A (en) * | 1988-10-12 | 1990-04-16 | Fujitsu Ltd | System for controlling successive reference to main storage |
AU625293B2 (en) * | 1988-12-09 | 1992-07-09 | Tandem Computers Incorporated | Synchronization of fault-tolerant computer system having multiple processors |
US4965717A (en) | 1988-12-09 | 1990-10-23 | Tandem Computers Incorporated | Multiple processor system having shared memory with private-write capability |
US5148533A (en) * | 1989-01-05 | 1992-09-15 | Bull Hn Information Systems Inc. | Apparatus and method for data group coherency in a tightly coupled data processing system with plural execution and data cache units |
EP0378415A3 (en) * | 1989-01-13 | 1991-09-25 | International Business Machines Corporation | Multiple instruction dispatch mechanism |
US5276828A (en) * | 1989-03-01 | 1994-01-04 | Digital Equipment Corporation | Methods of maintaining cache coherence and processor synchronization in a multiprocessor system using send and receive instructions |
IT1228728B (en) * | 1989-03-15 | 1991-07-03 | Bull Hn Information Syst | MULTIPROCESSOR SYSTEM WITH GLOBAL DATA REPLICATION AND TWO LEVELS OF ADDRESS TRANSLATION UNIT. |
NL8901825A (en) * | 1989-07-14 | 1991-02-01 | Philips Nv | PIPELINE SYSTEM WITH MULTI-RESOLUTION REAL-TIME DATA PROCESSING. |
US5307468A (en) * | 1989-08-23 | 1994-04-26 | Digital Equipment Corporation | Data processing system and method for controlling the latter as well as a CPU board |
JPH0666056B2 (en) * | 1989-10-12 | 1994-08-24 | 甲府日本電気株式会社 | Information processing system |
US5551050A (en) * | 1989-12-20 | 1996-08-27 | Texas Instruments Incorporated | System and method using synchronized processors to perform real time internal monitoring of a data processing device |
US5327553A (en) * | 1989-12-22 | 1994-07-05 | Tandem Computers Incorporated | Fault-tolerant computer system with /CONFIG filesystem |
US5317752A (en) * | 1989-12-22 | 1994-05-31 | Tandem Computers Incorporated | Fault-tolerant computer system with auto-restart after power-fall |
US5295258A (en) | 1989-12-22 | 1994-03-15 | Tandem Computers Incorporated | Fault-tolerant computer system with online recovery and reintegration of redundant components |
DE69033954T2 (en) * | 1990-01-05 | 2002-11-28 | Sun Microsystems Inc | ACTIVE HIGH SPEED BUS |
US5263163A (en) * | 1990-01-19 | 1993-11-16 | Codex Corporation | Arbitration among multiple users of a shared resource |
JPH0748190B2 (en) * | 1990-01-22 | 1995-05-24 | 株式会社東芝 | Microprocessor with cache memory |
EP0440456B1 (en) * | 1990-01-31 | 1997-01-08 | Hewlett-Packard Company | Microprocessor burst mode with external system memory |
US5680574A (en) * | 1990-02-26 | 1997-10-21 | Hitachi, Ltd. | Data distribution utilizing a master disk unit for fetching and for writing to remaining disk units |
US6728832B2 (en) * | 1990-02-26 | 2004-04-27 | Hitachi, Ltd. | Distribution of I/O requests across multiple disk units |
JPH03254497A (en) * | 1990-03-05 | 1991-11-13 | Mitsubishi Electric Corp | Microcomputer |
US5247648A (en) * | 1990-04-12 | 1993-09-21 | Sun Microsystems, Inc. | Maintaining data coherency between a central cache, an I/O cache and a memory |
US5289588A (en) * | 1990-04-24 | 1994-02-22 | Advanced Micro Devices, Inc. | Interlock acquisition for critical code section execution in a shared memory common-bus individually cached multiprocessor system |
EP0457308B1 (en) * | 1990-05-18 | 1997-01-22 | Fujitsu Limited | Data processing system having an input/output path disconnecting mechanism and method for controlling the data processing system |
US5276896A (en) * | 1990-06-11 | 1994-01-04 | Unisys Corporation | Apparatus for implementing data communications between terminal devices and user programs |
US5732241A (en) * | 1990-06-27 | 1998-03-24 | Mos Electronics, Corp. | Random access cache memory controller and system |
US5488709A (en) * | 1990-06-27 | 1996-01-30 | Mos Electronics, Corp. | Cache including decoupling register circuits |
DE69129252T2 (en) * | 1990-08-06 | 1998-12-17 | Ncr Int Inc | Method for operating a computer memory and arrangement |
WO1992003785A1 (en) * | 1990-08-14 | 1992-03-05 | Siemens Aktiengesellschaft | Device for monitoring the functions of external synchronisation units in a multi-computer system |
GB9019023D0 (en) * | 1990-08-31 | 1990-10-17 | Ncr Co | Work station having multiplexing and burst mode capabilities |
GB9018993D0 (en) * | 1990-08-31 | 1990-10-17 | Ncr Co | Work station interfacing means having burst mode capability |
EP0473802B1 (en) * | 1990-09-03 | 1995-11-08 | International Business Machines Corporation | Computer with extended virtual storage concept |
US6108755A (en) * | 1990-09-18 | 2000-08-22 | Fujitsu Limited | Asynchronous access system to a shared storage |
CA2059143C (en) * | 1991-01-25 | 2000-05-16 | Takeshi Miyao | Processing unit for a computer and a computer system incorporating such a processing unit |
US6247144B1 (en) * | 1991-01-31 | 2001-06-12 | Compaq Computer Corporation | Method and apparatus for comparing real time operation of object code compatible processors |
US5465339A (en) * | 1991-02-27 | 1995-11-07 | Vlsi Technology, Inc. | Decoupled refresh on local and system busses in a PC/at or similar microprocessor environment |
US5303362A (en) * | 1991-03-20 | 1994-04-12 | Digital Equipment Corporation | Coupled memory multiprocessor computer system including cache coherency management protocols |
US5339404A (en) * | 1991-05-28 | 1994-08-16 | International Business Machines Corporation | Asynchronous TMR processing system |
US5233615A (en) * | 1991-06-06 | 1993-08-03 | Honeywell Inc. | Interrupt driven, separately clocked, fault tolerant processor synchronization |
US5280608A (en) * | 1991-06-28 | 1994-01-18 | Digital Equipment Corporation | Programmable stall cycles |
US5319760A (en) * | 1991-06-28 | 1994-06-07 | Digital Equipment Corporation | Translation buffer for virtual machines with address space match |
JPH056344A (en) * | 1991-06-28 | 1993-01-14 | Fujitsu Ltd | Program run information sampling processing system |
JP3679813B2 (en) * | 1991-07-22 | 2005-08-03 | 株式会社日立製作所 | Parallel computer |
US5421002A (en) * | 1991-08-09 | 1995-05-30 | Westinghouse Electric Corporation | Method for switching between redundant buses in a distributed processing system |
US5386540A (en) * | 1991-09-18 | 1995-01-31 | Ncr Corporation | Method and apparatus for transferring data within a computer using a burst sequence which includes modified bytes and a minimum number of unmodified bytes |
JP2520544B2 (en) * | 1991-09-26 | 1996-07-31 | インターナショナル・ビジネス・マシーンズ・コーポレイション | Method for monitoring task overrun status and apparatus for detecting overrun of task execution cycle |
WO1993009494A1 (en) * | 1991-10-28 | 1993-05-13 | Digital Equipment Corporation | Fault-tolerant computer processing using a shadow virtual processor |
EP0543032A1 (en) * | 1991-11-16 | 1993-05-26 | International Business Machines Corporation | Expanded memory addressing scheme |
US5379417A (en) * | 1991-11-25 | 1995-01-03 | Tandem Computers Incorporated | System and method for ensuring write data integrity in a redundant array data storage system |
EP0550358A3 (en) * | 1991-12-30 | 1994-11-02 | Eastman Kodak Co | Fault tolerant multiprocessor cluster |
US5313628A (en) * | 1991-12-30 | 1994-05-17 | International Business Machines Corporation | Component replacement control for fault-tolerant data processing system |
JP2500038B2 (en) * | 1992-03-04 | 1996-05-29 | インターナショナル・ビジネス・マシーンズ・コーポレイション | Multiprocessor computer system, fault tolerant processing method and data processing system |
WO1993018461A1 (en) * | 1992-03-09 | 1993-09-16 | Auspex Systems, Inc. | High-performance non-volatile ram protected write cache accelerator system |
US5632037A (en) * | 1992-03-27 | 1997-05-20 | Cyrix Corporation | Microprocessor having power management circuitry with coprocessor support |
US5428769A (en) * | 1992-03-31 | 1995-06-27 | The Dow Chemical Company | Process control interface system having triply redundant remote field units |
AU4279793A (en) * | 1992-04-07 | 1993-11-08 | Video Technology Computers, Ltd. | Self-controlled write back cache memory apparatus |
JP2534430B2 (en) * | 1992-04-15 | 1996-09-18 | インターナショナル・ビジネス・マシーンズ・コーポレイション | Methods for achieving match of computer system output with fault tolerance |
DE4219005A1 (en) * | 1992-06-10 | 1993-12-16 | Siemens Ag | Computer system |
JPH07507892A (en) * | 1992-06-12 | 1995-08-31 | ザ、ダウ、ケミカル、カンパニー | Transparent interface for process control computers |
US5583757A (en) * | 1992-08-04 | 1996-12-10 | The Dow Chemical Company | Method of input signal resolution for actively redundant process control computers |
US5537655A (en) * | 1992-09-28 | 1996-07-16 | The Boeing Company | Synchronized fault tolerant reset |
US5379415A (en) * | 1992-09-29 | 1995-01-03 | Zitel Corporation | Fault tolerant memory system |
JPH06214969A (en) * | 1992-09-30 | 1994-08-05 | Internatl Business Mach Corp <Ibm> | Method and equipment for information communication |
US6951019B1 (en) * | 1992-09-30 | 2005-09-27 | Apple Computer, Inc. | Execution control for processor tasks |
US5434997A (en) * | 1992-10-02 | 1995-07-18 | Compaq Computer Corp. | Method and apparatus for testing and debugging a tightly coupled mirrored processing system |
US6237108B1 (en) * | 1992-10-09 | 2001-05-22 | Fujitsu Limited | Multiprocessor system having redundant shared memory configuration |
US5781715A (en) * | 1992-10-13 | 1998-07-14 | International Business Machines Corporation | Fault-tolerant bridge/router with a distributed switch-over mechanism |
US5448716A (en) * | 1992-10-30 | 1995-09-05 | International Business Machines Corporation | Apparatus and method for booting a multiple processor system having a global/local memory architecture |
DE69325769T2 (en) * | 1992-11-04 | 2000-03-23 | Digital Equipment Corp | Detection of command synchronization errors |
US5327548A (en) * | 1992-11-09 | 1994-07-05 | International Business Machines Corporation | Apparatus and method for steering spare bit in a multiple processor system having a global/local memory architecture |
EP0600623B1 (en) * | 1992-12-03 | 1998-01-21 | Advanced Micro Devices, Inc. | Servo loop control |
US5751932A (en) * | 1992-12-17 | 1998-05-12 | Tandem Computers Incorporated | Fail-fast, fail-functional, fault-tolerant multiprocessor system |
US6157967A (en) * | 1992-12-17 | 2000-12-05 | Tandem Computers Incorporated | Method of data communication flow control in a data processing system using busy/ready commands |
US5838894A (en) * | 1992-12-17 | 1998-11-17 | Tandem Computers Incorporated | Logical, fail-functional, dual central processor units formed from three processor units |
JP2826028B2 (en) * | 1993-01-28 | 1998-11-18 | 富士通株式会社 | Distributed memory processor system |
US5845329A (en) * | 1993-01-29 | 1998-12-01 | Sanyo Electric Co., Ltd. | Parallel computer |
US5473770A (en) * | 1993-03-02 | 1995-12-05 | Tandem Computers Incorporated | Fault-tolerant computer system with hidden local memory refresh |
JPH0773059A (en) * | 1993-03-02 | 1995-03-17 | Tandem Comput Inc | Fault-tolerant computer system |
DE59302826D1 (en) * | 1993-03-16 | 1996-07-11 | Siemens Ag | Synchronization procedure for automation systems |
JP2819982B2 (en) * | 1993-03-18 | 1998-11-05 | 株式会社日立製作所 | Multiprocessor system with cache match guarantee function that can specify range |
JP2784440B2 (en) * | 1993-04-14 | 1998-08-06 | インターナショナル・ビジネス・マシーンズ・コーポレイション | Data page transfer control method |
US5479599A (en) * | 1993-04-26 | 1995-12-26 | International Business Machines Corporation | Computer console with group ICON control |
US5381541A (en) * | 1993-05-26 | 1995-01-10 | International Business Machines Corp. | Computer system having planar board with single interrupt controller and processor card with plural processors and interrupt director |
JP3004861U (en) * | 1993-06-04 | 1994-11-29 | ディジタル イクイプメント コーポレイション | Fault Tolerant Storage Control System Using Tightly Coupled Dual Controller Modules |
US5435001A (en) | 1993-07-06 | 1995-07-18 | Tandem Computers Incorporated | Method of state determination in lock-stepped processors |
US5909541A (en) * | 1993-07-14 | 1999-06-01 | Honeywell Inc. | Error detection and correction for data stored across multiple byte-wide memory devices |
JPH0793274A (en) * | 1993-07-27 | 1995-04-07 | Fujitsu Ltd | System and device for transferring data |
US5572620A (en) * | 1993-07-29 | 1996-11-05 | Honeywell Inc. | Fault-tolerant voter system for output data from a plurality of non-synchronized redundant processors |
US5530907A (en) * | 1993-08-23 | 1996-06-25 | Tcsi Corporation | Modular networked image processing system and method therefor |
US5548711A (en) * | 1993-08-26 | 1996-08-20 | Emc Corporation | Method and apparatus for fault tolerant fast writes through buffer dumping |
JPH07129456A (en) * | 1993-10-28 | 1995-05-19 | Toshiba Corp | Computer system |
US5604863A (en) * | 1993-11-01 | 1997-02-18 | International Business Machines Corporation | Method for coordinating executing programs in a data processing system |
US5504859A (en) * | 1993-11-09 | 1996-04-02 | International Business Machines Corporation | Data processor with enhanced error recovery |
EP0986007A3 (en) * | 1993-12-01 | 2001-11-07 | Marathon Technologies Corporation | Method of isolating I/O requests |
US6161162A (en) * | 1993-12-08 | 2000-12-12 | Nec Corporation | Multiprocessor system for enabling shared access to a memory |
US5537538A (en) * | 1993-12-15 | 1996-07-16 | Silicon Graphics, Inc. | Debug mode for a superscalar RISC processor |
JPH07175698A (en) * | 1993-12-17 | 1995-07-14 | Fujitsu Ltd | File system |
US5535405A (en) * | 1993-12-23 | 1996-07-09 | Unisys Corporation | Microsequencer bus controller system |
US5606685A (en) * | 1993-12-29 | 1997-02-25 | Unisys Corporation | Computer workstation having demand-paged virtual memory and enhanced prefaulting |
JPH07219913A (en) * | 1994-01-28 | 1995-08-18 | Fujitsu Ltd | Method for controlling multiprocessor system and device therefor |
TW357295B (en) * | 1994-02-08 | 1999-05-01 | United Microelectronics Corp | Microprocessor's data writing, reading operations |
US5452441A (en) * | 1994-03-30 | 1995-09-19 | At&T Corp. | System and method for on-line state restoration of one or more processors in an N module redundant voting processor system |
JP2679674B2 (en) * | 1994-05-02 | 1997-11-19 | 日本電気株式会社 | Semiconductor production line controller |
JPH07334416A (en) * | 1994-06-06 | 1995-12-22 | Internatl Business Mach Corp <Ibm> | Method and means for initialization of page-mode memory in computer system |
US5566297A (en) * | 1994-06-16 | 1996-10-15 | International Business Machines Corporation | Non-disruptive recovery from file server failure in a highly available file system for clustered computing environments |
US5636359A (en) * | 1994-06-20 | 1997-06-03 | International Business Machines Corporation | Performance enhancement system and method for a hierarchical data cache using a RAID parity scheme |
EP0702306A1 (en) * | 1994-09-19 | 1996-03-20 | International Business Machines Corporation | System and method for interfacing risc busses to peripheral circuits using another template of busses in a data communication adapter |
US5530946A (en) * | 1994-10-28 | 1996-06-25 | Dell Usa, L.P. | Processor failure detection and recovery circuit in a dual processor computer system and method of operation thereof |
US5557783A (en) * | 1994-11-04 | 1996-09-17 | Canon Information Systems, Inc. | Arbitration device for arbitrating access requests from first and second processors having different first and second clocks |
US5630045A (en) * | 1994-12-06 | 1997-05-13 | International Business Machines Corporation | Device and method for fault tolerant dual fetch and store |
US5778443A (en) * | 1994-12-14 | 1998-07-07 | International Business Machines Corp. | Method and apparatus for conserving power and system resources in a computer system employing a virtual memory |
US5586253A (en) * | 1994-12-15 | 1996-12-17 | Stratus Computer | Method and apparatus for validating I/O addresses in a fault-tolerant computer system |
KR100397240B1 (en) * | 1994-12-19 | 2003-11-28 | 코닌클리케 필립스 일렉트로닉스 엔.브이. | Variable data processor allocation and memory sharing |
US5555372A (en) * | 1994-12-21 | 1996-09-10 | Stratus Computer, Inc. | Fault-tolerant computer system employing an improved error-broadcast mechanism |
FR2730074B1 (en) * | 1995-01-27 | 1997-04-04 | Sextant Avionique | FAULT-TOLERANT COMPUTER ARCHITECTURE |
US5692153A (en) * | 1995-03-16 | 1997-11-25 | International Business Machines Corporation | Method and system for verifying execution order within a multiprocessor data processing system |
US5864654A (en) * | 1995-03-31 | 1999-01-26 | Nec Electronics, Inc. | Systems and methods for fault tolerant information processing |
US5727167A (en) * | 1995-04-14 | 1998-03-10 | International Business Machines Corporation | Thresholding support in performance monitoring |
JP3329986B2 (en) * | 1995-04-28 | 2002-09-30 | 富士通株式会社 | Multiprocessor system |
JP3132744B2 (en) * | 1995-05-24 | 2001-02-05 | 株式会社日立製作所 | Operation matching verification method for redundant CPU maintenance replacement |
US5632013A (en) * | 1995-06-07 | 1997-05-20 | International Business Machines Corporation | Memory and system for recovery/restoration of data using a memory controller |
JP3502216B2 (en) * | 1995-07-13 | 2004-03-02 | 富士通株式会社 | Information processing equipment |
JP3595033B2 (en) * | 1995-07-18 | 2004-12-02 | 株式会社日立製作所 | Highly reliable computer system |
DE19529434B4 (en) | 1995-08-10 | 2009-09-17 | Continental Teves Ag & Co. Ohg | Microprocessor system for safety-critical regulations |
JP3526492B2 (en) * | 1995-09-19 | 2004-05-17 | 富士通株式会社 | Parallel processing system |
US5666483A (en) * | 1995-09-22 | 1997-09-09 | Honeywell Inc. | Redundant processing system architecture |
US5673384A (en) * | 1995-10-06 | 1997-09-30 | Hewlett-Packard Company | Dual disk lock arbitration between equal sized partition of a cluster |
US5790775A (en) * | 1995-10-23 | 1998-08-04 | Digital Equipment Corporation | Host transparent storage controller failover/failback of SCSI targets and associated units |
US5708771A (en) * | 1995-11-21 | 1998-01-13 | Emc Corporation | Fault tolerant controller system and method |
US5732209A (en) * | 1995-11-29 | 1998-03-24 | Exponential Technology, Inc. | Self-testing multi-processor die with internal compare points |
US5802265A (en) * | 1995-12-01 | 1998-09-01 | Stratus Computer, Inc. | Transparent fault tolerant computer system |
US5805789A (en) * | 1995-12-14 | 1998-09-08 | International Business Machines Corporation | Programmable computer system element with built-in self test method and apparatus for repair during power-on |
US5812822A (en) * | 1995-12-19 | 1998-09-22 | Selway; David W. | Apparatus for coordinating clock oscillators in a fully redundant computer system |
US5941994A (en) * | 1995-12-22 | 1999-08-24 | Lsi Logic Corporation | Technique for sharing hot spare drives among multiple subsystems |
US5742823A (en) * | 1996-01-17 | 1998-04-21 | Nathen P. Edwards | Total object processing system and method with assembly line features and certification of results |
US5761518A (en) * | 1996-02-29 | 1998-06-02 | The Foxboro Company | System for replacing control processor by operating processor in partially disabled mode for tracking control outputs and in write enabled mode for transferring control loops |
JPH09251443A (en) * | 1996-03-18 | 1997-09-22 | Hitachi Ltd | Processor fault recovery processing method for information processing system |
US5784625A (en) * | 1996-03-19 | 1998-07-21 | Vlsi Technology, Inc. | Method and apparatus for effecting a soft reset in a processor device without requiring a dedicated external pin |
US5724501A (en) * | 1996-03-29 | 1998-03-03 | Emc Corporation | Quick recovery of write cache in a fault tolerant I/O system |
US6002347A (en) * | 1996-04-23 | 1999-12-14 | Alliedsignal Inc. | Integrated hazard avoidance system |
TW320701B (en) * | 1996-05-16 | 1997-11-21 | Resilience Corp | |
US6141769A (en) | 1996-05-16 | 2000-10-31 | Resilience Corporation | Triple modular redundant computer system and associated method |
US5802397A (en) * | 1996-05-23 | 1998-09-01 | International Business Machines Corporation | System for storage protection from unintended I/O access using I/O protection key by providing no control by I/O key entries over access by CP entity |
US5900019A (en) * | 1996-05-23 | 1999-05-04 | International Business Machines Corporation | Apparatus for protecting memory storage blocks from I/O accesses |
US5787309A (en) * | 1996-05-23 | 1998-07-28 | International Business Machines Corporation | Apparatus for protecting storage blocks from being accessed by unwanted I/O programs using I/O program keys and I/O storage keys having M number of bits |
US5809546A (en) * | 1996-05-23 | 1998-09-15 | International Business Machines Corporation | Method for managing I/O buffers in shared storage by structuring buffer table having entries including storage keys for controlling accesses to the buffers |
US5724551A (en) * | 1996-05-23 | 1998-03-03 | International Business Machines Corporation | Method for managing I/O buffers in shared storage by structuring buffer table having entries including storage keys for controlling accesses to the buffers |
KR100195065B1 (en) * | 1996-06-20 | 1999-06-15 | 유기범 | Data network matching device |
US5953742A (en) * | 1996-07-01 | 1999-09-14 | Sun Microsystems, Inc. | Memory management in fault tolerant computer systems utilizing a first and second recording mechanism and a reintegration mechanism |
US5784386A (en) * | 1996-07-03 | 1998-07-21 | General Signal Corporation | Fault tolerant synchronous clock distribution |
EP0825506B1 (en) | 1996-08-20 | 2013-03-06 | Invensys Systems, Inc. | Methods and apparatus for remote process control |
US5790397A (en) | 1996-09-17 | 1998-08-04 | Marathon Technologies Corporation | Fault resilient/fault tolerant computing |
DE69718129T2 (en) * | 1996-10-29 | 2003-10-23 | Hitachi Ltd | Redundant data processing system |
US6000040A (en) * | 1996-10-29 | 1999-12-07 | Compaq Computer Corporation | Method and apparatus for diagnosing fault states in a computer system |
US5784394A (en) * | 1996-11-15 | 1998-07-21 | International Business Machines Corporation | Method and system for implementing parity error recovery schemes in a data processing system |
US6167486A (en) | 1996-11-18 | 2000-12-26 | Nec Electronics, Inc. | Parallel access virtual channel memory system with cacheable channels |
US5887160A (en) * | 1996-12-10 | 1999-03-23 | Fujitsu Limited | Method and apparatus for communicating integer and floating point data over a shared data path in a single instruction pipeline processor |
US6161202A (en) * | 1997-02-18 | 2000-12-12 | Ee-Signals Gmbh & Co. Kg | Method for the monitoring of integrated circuits |
US5805606A (en) * | 1997-03-13 | 1998-09-08 | International Business Machines Corporation | Cache module fault isolation techniques |
US6151684A (en) * | 1997-03-28 | 2000-11-21 | Tandem Computers Incorporated | High availability access to input/output devices in a distributed system |
US6119246A (en) * | 1997-03-31 | 2000-09-12 | International Business Machines Corporation | Error collection coordination for software-readable and non-software readable fault isolation registers in a computer system |
US6557121B1 (en) | 1997-03-31 | 2003-04-29 | International Business Machines Corporation | Method and system for fault isolation for PCI bus errors |
US6065139A (en) * | 1997-03-31 | 2000-05-16 | International Business Machines Corporation | Method and system for surveillance of computer system operations |
US5951686A (en) * | 1997-03-31 | 1999-09-14 | International Business Machines Corporation | Method and system for reboot recovery |
US6502208B1 (en) | 1997-03-31 | 2002-12-31 | International Business Machines Corporation | Method and system for check stop error handling |
KR19980081499A (en) * | 1997-04-17 | 1998-11-25 | 모리시다요이치 | In-memory data processing device and processing system |
US5933857A (en) * | 1997-04-25 | 1999-08-03 | Hewlett-Packard Co. | Accessing multiple independent microkernels existing in a globally shared memory system |
US5923830A (en) * | 1997-05-07 | 1999-07-13 | General Dynamics Information Systems, Inc. | Non-interrupting power control for fault tolerant computer systems |
US5896523A (en) * | 1997-06-04 | 1999-04-20 | Marathon Technologies Corporation | Loosely-coupled, synchronized execution |
US5991893A (en) * | 1997-08-29 | 1999-11-23 | Hewlett-Packard Company | Virtually reliable shared memory |
US6148387A (en) * | 1997-10-09 | 2000-11-14 | Phoenix Technologies, Ltd. | System and method for securely utilizing basic input and output system (BIOS) services |
US6381682B2 (en) | 1998-06-10 | 2002-04-30 | Compaq Information Technologies Group, L.P. | Method and apparatus for dynamically sharing memory in a multiprocessor system |
US6542926B2 (en) | 1998-06-10 | 2003-04-01 | Compaq Information Technologies Group, L.P. | Software partitioned multi-processor system with flexible resource sharing levels |
US6199179B1 (en) * | 1998-06-10 | 2001-03-06 | Compaq Computer Corporation | Method and apparatus for failure recovery in a multi-processor computer system |
US6633916B2 (en) | 1998-06-10 | 2003-10-14 | Hewlett-Packard Development Company, L.P. | Method and apparatus for virtual resource handling in a multi-processor computer system |
US6647508B2 (en) | 1997-11-04 | 2003-11-11 | Hewlett-Packard Development Company, L.P. | Multiprocessor computer architecture with multiple operating system instances and software controlled resource allocation |
US6332180B1 (en) | 1998-06-10 | 2001-12-18 | Compaq Information Technologies Group, L.P. | Method and apparatus for communication in a multi-processor computer system |
US6260068B1 (en) | 1998-06-10 | 2001-07-10 | Compaq Computer Corporation | Method and apparatus for migrating resources in a multi-processor computer system |
US6252583B1 (en) * | 1997-11-14 | 2001-06-26 | Immersion Corporation | Memory and force output management for a force feedback system |
JP2001523855A (en) | 1997-11-14 | 2001-11-27 | マラソン テクノロジーズ コーポレイション | Failure recovery / fault-tolerant computer |
US6965974B1 (en) * | 1997-11-14 | 2005-11-15 | Agere Systems Inc. | Dynamic partitioning of memory banks among multiple agents |
US6163840A (en) * | 1997-11-26 | 2000-12-19 | Compaq Computer Corporation | Method and apparatus for sampling multiple potentially concurrent instructions in a processor pipeline |
US6374367B1 (en) | 1997-11-26 | 2002-04-16 | Compaq Computer Corporation | Apparatus and method for monitoring a computer system to guide optimization |
US6549930B1 (en) | 1997-11-26 | 2003-04-15 | Compaq Computer Corporation | Method for scheduling threads in a multithreaded processor |
US6175814B1 (en) | 1997-11-26 | 2001-01-16 | Compaq Computer Corporation | Apparatus for determining the instantaneous average number of instructions processed |
US6442585B1 (en) | 1997-11-26 | 2002-08-27 | Compaq Computer Corporation | Method for scheduling contexts based on statistics of memory system interactions in a computer system |
US6237059B1 (en) | 1997-11-26 | 2001-05-22 | Compaq Computer Corporation | Method for estimating statistics of properties of memory system interactions among contexts in a computer system |
US6195748B1 (en) | 1997-11-26 | 2001-02-27 | Compaq Computer Corporation | Apparatus for sampling instruction execution information in a processor pipeline |
US6332178B1 (en) | 1997-11-26 | 2001-12-18 | Compaq Computer Corporation | Method for estimating statistics of properties of memory system transactions |
US6237073B1 (en) | 1997-11-26 | 2001-05-22 | Compaq Computer Corporation | Method for providing virtual memory to physical memory page mapping in a computer operating system that randomly samples state information |
FR2771526B1 (en) * | 1997-11-27 | 2004-07-23 | Bull Sa | ARCHITECTURE FOR MANAGING VITAL DATA IN A MULTI-MODULAR MACHINE AND METHOD FOR IMPLEMENTING SUCH AN ARCHITECTURE |
US6185646B1 (en) * | 1997-12-03 | 2001-02-06 | International Business Machines Corporation | Method and apparatus for transferring data on a synchronous multi-drop |
US6098158A (en) * | 1997-12-18 | 2000-08-01 | International Business Machines Corporation | Software-enabled fast boot |
US6502149B2 (en) * | 1997-12-23 | 2002-12-31 | Emc Corporation | Plural bus data storage system |
US6397281B1 (en) * | 1997-12-23 | 2002-05-28 | Emc Corporation | Bus arbitration system |
EP0926600B1 (en) * | 1997-12-24 | 2003-06-11 | Texas Instruments Inc. | Computer system with processor and memory hierarchy and its operating method |
JPH11203157A (en) * | 1998-01-13 | 1999-07-30 | Fujitsu Ltd | Redundancy device |
US6249878B1 (en) * | 1998-03-31 | 2001-06-19 | Emc Corporation | Data storage system |
DE19815263C2 (en) * | 1998-04-04 | 2002-03-28 | Astrium Gmbh | Device for fault-tolerant execution of programs |
US6058490A (en) * | 1998-04-21 | 2000-05-02 | Lucent Technologies, Inc. | Method and apparatus for providing scaleable levels of application availability |
US6216051B1 (en) | 1998-05-04 | 2001-04-10 | Nec Electronics, Inc. | Manufacturing backup system |
US6691183B1 (en) | 1998-05-20 | 2004-02-10 | Invensys Systems, Inc. | Second transfer logic causing a first transfer logic to check a data ready bit prior to each multibit transfer of a continuous transfer operation |
US6148348A (en) * | 1998-06-15 | 2000-11-14 | Sun Microsystems, Inc. | Bridge interfacing two processing sets operating in a lockstep mode and having a posted write buffer storing write operations upon detection of a lockstep error |
US6173351B1 (en) * | 1998-06-15 | 2001-01-09 | Sun Microsystems, Inc. | Multi-processor system bridge |
US6473840B2 (en) * | 1998-06-19 | 2002-10-29 | International Business Machines Corporation | Data processing system having a network and method for managing memory by storing discardable pages in a local paging device |
US6195739B1 (en) | 1998-06-29 | 2001-02-27 | Cisco Technology, Inc. | Method and apparatus for passing data among processor complex stages of a pipelined processing engine |
US6513108B1 (en) | 1998-06-29 | 2003-01-28 | Cisco Technology, Inc. | Programmable processing engine for efficiently processing transient data |
US6101599A (en) * | 1998-06-29 | 2000-08-08 | Cisco Technology, Inc. | System for context switching between processing elements in a pipeline of processing elements |
US6836838B1 (en) | 1998-06-29 | 2004-12-28 | Cisco Technology, Inc. | Architecture for a processor complex of an arrayed pipelined processing engine |
US6119215A (en) | 1998-06-29 | 2000-09-12 | Cisco Technology, Inc. | Synchronization and control system for an arrayed processing engine |
US6247143B1 (en) * | 1998-06-30 | 2001-06-12 | Sun Microsystems, Inc. | I/O handling for a multiprocessor computer system |
JP2000067009A (en) | 1998-08-20 | 2000-03-03 | Hitachi Ltd | Main storage shared type multi-processor |
EP1105876A4 (en) * | 1998-08-21 | 2003-09-17 | Credence Systems Corp | Method and apparatus for built-in self test of integrated circuits |
US7013305B2 (en) | 2001-10-01 | 2006-03-14 | International Business Machines Corporation | Managing the state of coupling facility structures, detecting by one or more systems coupled to the coupling facility, the suspended state of the duplexed command, detecting being independent of message exchange |
US6233690B1 (en) * | 1998-09-17 | 2001-05-15 | Intel Corporation | Mechanism for saving power on long latency stalls |
SE515461C2 (en) * | 1998-10-05 | 2001-08-06 | Ericsson Telefon Ab L M | Method and arrangement for memory management |
US6412079B1 (en) * | 1998-10-09 | 2002-06-25 | Openwave Systems Inc. | Server pool for clustered system |
US6230190B1 (en) * | 1998-10-09 | 2001-05-08 | Openwave Systems Inc. | Shared-everything file storage for clustered system |
US6397345B1 (en) * | 1998-10-09 | 2002-05-28 | Openwave Systems Inc. | Fault tolerant bus for clustered system |
US6728839B1 (en) | 1998-10-28 | 2004-04-27 | Cisco Technology, Inc. | Attribute based memory pre-fetching technique |
US6374402B1 (en) | 1998-11-16 | 2002-04-16 | Into Networks, Inc. | Method and apparatus for installation abstraction in a secure content delivery system |
US6763370B1 (en) | 1998-11-16 | 2004-07-13 | Softricity, Inc. | Method and apparatus for content protection in a secure content delivery system |
US7017188B1 (en) | 1998-11-16 | 2006-03-21 | Softricity, Inc. | Method and apparatus for secure content delivery over broadband access networks |
US6385747B1 (en) | 1998-12-14 | 2002-05-07 | Cisco Technology, Inc. | Testing of replicated components of electronic device |
US6173386B1 (en) | 1998-12-14 | 2001-01-09 | Cisco Technology, Inc. | Parallel processor with debug capability |
US6920562B1 (en) | 1998-12-18 | 2005-07-19 | Cisco Technology, Inc. | Tightly coupled software protocol decode with hardware data encryption |
US7206877B1 (en) | 1998-12-22 | 2007-04-17 | Honeywell International Inc. | Fault tolerant data communication network |
JP3809930B2 (en) * | 1998-12-25 | 2006-08-16 | 株式会社日立製作所 | Information processing device |
US6564311B2 (en) * | 1999-01-19 | 2003-05-13 | Matsushita Electric Industrial Co., Ltd. | Apparatus for translation between virtual and physical addresses using a virtual page number, a physical page number, a process identifier and a global bit |
US6526370B1 (en) * | 1999-02-04 | 2003-02-25 | Advanced Micro Devices, Inc. | Mechanism for accumulating data to determine average values of performance parameters |
US7370071B2 (en) * | 2000-03-17 | 2008-05-06 | Microsoft Corporation | Method for serving third party software applications from servers to client computers |
US7730169B1 (en) | 1999-04-12 | 2010-06-01 | Softricity, Inc. | Business method and system for serving third party software applications |
US8099758B2 (en) | 1999-05-12 | 2012-01-17 | Microsoft Corporation | Policy based composite file system and method |
US7272815B1 (en) | 1999-05-17 | 2007-09-18 | Invensys Systems, Inc. | Methods and apparatus for control configuration with versioning, security, composite blocks, edit selection, object swapping, formulaic values and other aspects |
AU5273100A (en) | 1999-05-17 | 2000-12-05 | Foxboro Company, The | Methods and apparatus for control configuration with versioning, security, composite blocks, edit selection, object swapping, formulaic values and other aspects |
US7089530B1 (en) | 1999-05-17 | 2006-08-08 | Invensys Systems, Inc. | Process control configuration system with connection validation and configuration |
US6754885B1 (en) | 1999-05-17 | 2004-06-22 | Invensys Systems, Inc. | Methods and apparatus for controlling object appearance in a process control configuration system |
US7096465B1 (en) | 1999-05-17 | 2006-08-22 | Invensys Systems, Inc. | Process control configuration system with parameterized objects |
US7043728B1 (en) | 1999-06-08 | 2006-05-09 | Invensys Systems, Inc. | Methods and apparatus for fault-detecting and fault-tolerant process control |
US6788980B1 (en) | 1999-06-11 | 2004-09-07 | Invensys Systems, Inc. | Methods and apparatus for control using control devices that provide a virtual machine environment and that communicate via an IP network |
US6501995B1 (en) | 1999-06-30 | 2002-12-31 | The Foxboro Company | Process control system and method with improved distribution, installation and validation of components |
US6510352B1 (en) | 1999-07-29 | 2003-01-21 | The Foxboro Company | Methods and apparatus for object-based process control |
US7953931B2 (en) * | 1999-08-04 | 2011-05-31 | Super Talent Electronics, Inc. | High endurance non-volatile memory devices |
US6438710B1 (en) * | 1999-08-31 | 2002-08-20 | Rockwell Electronic Commerce Corp. | Circuit and method for improving memory integrity in a microprocessor based application |
US6499113B1 (en) * | 1999-08-31 | 2002-12-24 | Sun Microsystems, Inc. | Method and apparatus for extracting first failure and attendant operating information from computer system devices |
EP1214653A2 (en) * | 1999-08-31 | 2002-06-19 | Times N Systems, Inc. | Shared memory disk |
US6529983B1 (en) | 1999-11-03 | 2003-03-04 | Cisco Technology, Inc. | Group and virtual locking mechanism for inter processor synchronization |
US6681341B1 (en) | 1999-11-03 | 2004-01-20 | Cisco Technology, Inc. | Processor isolation method for integrated multi-processor systems |
US6708254B2 (en) | 1999-11-10 | 2004-03-16 | Nec Electronics America, Inc. | Parallel access virtual channel memory system |
US6473660B1 (en) | 1999-12-03 | 2002-10-29 | The Foxboro Company | Process control system and method with automatic fault avoidance |
US7555683B2 (en) * | 1999-12-23 | 2009-06-30 | Landesk Software, Inc. | Inventory determination for facilitating commercial transactions during diagnostic tests |
US8019943B2 (en) * | 2000-01-06 | 2011-09-13 | Super Talent Electronics, Inc. | High endurance non-volatile memory devices |
US6574753B1 (en) * | 2000-01-10 | 2003-06-03 | Emc Corporation | Peer link fault isolation |
US6779128B1 (en) | 2000-02-18 | 2004-08-17 | Invensys Systems, Inc. | Fault-tolerant data transfer |
US6892237B1 (en) | 2000-03-28 | 2005-05-10 | Cisco Technology, Inc. | Method and apparatus for high-speed parsing of network messages |
US6687851B1 (en) | 2000-04-13 | 2004-02-03 | Stratus Technologies Bermuda Ltd. | Method and system for upgrading fault-tolerant systems |
US6820213B1 (en) | 2000-04-13 | 2004-11-16 | Stratus Technologies Bermuda, Ltd. | Fault-tolerant computer system with voter delay buffer |
US6862689B2 (en) | 2001-04-12 | 2005-03-01 | Stratus Technologies Bermuda Ltd. | Method and apparatus for managing session information |
US6901481B2 (en) | 2000-04-14 | 2005-05-31 | Stratus Technologies Bermuda Ltd. | Method and apparatus for storing transactional information in persistent memory |
US6802022B1 (en) | 2000-04-14 | 2004-10-05 | Stratus Technologies Bermuda Ltd. | Maintenance of consistent, redundant mass storage images |
US6647516B1 (en) * | 2000-04-19 | 2003-11-11 | Hewlett-Packard Development Company, L.P. | Fault tolerant data storage systems and methods of operating a fault tolerant data storage system |
US6708331B1 (en) * | 2000-05-03 | 2004-03-16 | Leon Schwartz | Method for automatic parallelization of software |
US6675315B1 (en) * | 2000-05-05 | 2004-01-06 | Oracle International Corp. | Diagnosing crashes in distributed computing systems |
US6505269B1 (en) | 2000-05-16 | 2003-01-07 | Cisco Technology, Inc. | Dynamic addressing mapping to eliminate memory resource contention in a symmetric multiprocessor system |
US20020018211A1 (en) * | 2000-06-07 | 2002-02-14 | Megerle Clifford A. | System and method to detect the presence of a target organism within an air sample using flow cytometry |
US6609216B1 (en) | 2000-06-16 | 2003-08-19 | International Business Machines Corporation | Method for measuring performance of code sequences in a production system |
US6804703B1 (en) * | 2000-06-22 | 2004-10-12 | International Business Machines Corporation | System and method for establishing persistent reserves to nonvolatile storage in a clustered computer environment |
US6438647B1 (en) | 2000-06-23 | 2002-08-20 | International Business Machines Corporation | Method and apparatus for providing battery-backed immediate write back cache for an array of disk drives in a computer system |
EP1215577B1 (en) * | 2000-08-21 | 2012-02-22 | Texas Instruments Incorporated | Fault management and recovery based on task-ID |
US6647470B1 (en) * | 2000-08-21 | 2003-11-11 | Micron Technology, Inc. | Memory device having posted write per command |
EP1182569B8 (en) * | 2000-08-21 | 2011-07-06 | Texas Instruments Incorporated | TLB lock and unlock operation |
EP1213650A3 (en) * | 2000-08-21 | 2006-08-30 | Texas Instruments France | Priority arbitration based on current task and MMU |
US6732289B1 (en) | 2000-08-31 | 2004-05-04 | Sun Microsystems, Inc. | Fault tolerant data storage system |
GB2370380B (en) | 2000-12-19 | 2003-12-31 | Picochip Designs Ltd | Processor architecture |
US6948010B2 (en) * | 2000-12-20 | 2005-09-20 | Stratus Technologies Bermuda Ltd. | Method and apparatus for efficiently moving portions of a memory block |
US6990657B2 (en) * | 2001-01-24 | 2006-01-24 | Texas Instruments Incorporated | Shared software breakpoints in a shared memory system |
US6886171B2 (en) * | 2001-02-20 | 2005-04-26 | Stratus Technologies Bermuda Ltd. | Caching for I/O virtual address translation and validation using device drivers |
JP3628265B2 (en) * | 2001-02-21 | 2005-03-09 | 株式会社半導体理工学研究センター | Multiprocessor system unit |
US7017073B2 (en) * | 2001-02-28 | 2006-03-21 | International Business Machines Corporation | Method and apparatus for fault-tolerance via dual thread crosschecking |
US6829693B2 (en) | 2001-02-28 | 2004-12-07 | International Business Machines Corporation | Auxiliary storage slot scavenger |
US6766413B2 (en) | 2001-03-01 | 2004-07-20 | Stratus Technologies Bermuda Ltd. | Systems and methods for caching with file-level granularity |
US6874102B2 (en) * | 2001-03-05 | 2005-03-29 | Stratus Technologies Bermuda Ltd. | Coordinated recalibration of high bandwidth memories in a multiprocessor computer |
EP1239369A1 (en) * | 2001-03-07 | 2002-09-11 | Siemens Aktiengesellschaft | Fault-tolerant computer system and method for its use |
US6754788B2 (en) * | 2001-03-15 | 2004-06-22 | International Business Machines Corporation | Apparatus, method and computer program product for privatizing operating system data |
US6751718B1 (en) * | 2001-03-26 | 2004-06-15 | Networks Associates Technology, Inc. | Method, system and computer program product for using an instantaneous memory deficit metric to detect and reduce excess paging operations in a computer system |
US7065672B2 (en) * | 2001-03-28 | 2006-06-20 | Stratus Technologies Bermuda Ltd. | Apparatus and methods for fault-tolerant computing using a switching fabric |
US6928583B2 (en) * | 2001-04-11 | 2005-08-09 | Stratus Technologies Bermuda Ltd. | Apparatus and method for two computing elements in a fault-tolerant server to execute instructions in lockstep |
US6971043B2 (en) * | 2001-04-11 | 2005-11-29 | Stratus Technologies Bermuda Ltd | Apparatus and method for accessing a mass storage device in a fault-tolerant server |
US6862693B2 (en) * | 2001-04-13 | 2005-03-01 | Sun Microsystems, Inc. | Providing fault-tolerance by comparing addresses and data from redundant processors running in lock-step |
US6996750B2 (en) | 2001-05-31 | 2006-02-07 | Stratus Technologies Bermuda Ltd. | Methods and apparatus for computer bus error termination |
US6799186B2 (en) * | 2001-10-11 | 2004-09-28 | International Business Machines Corporation | SLA monitor calendar buffering |
US7120901B2 (en) * | 2001-10-26 | 2006-10-10 | International Business Machines Corporation | Method and system for tracing and displaying execution of nested functions |
US7039774B1 (en) * | 2002-02-05 | 2006-05-02 | Juniper Networks, Inc. | Memory allocation using a memory address pool |
US6832270B2 (en) * | 2002-03-08 | 2004-12-14 | Hewlett-Packard Development Company, L.P. | Virtualization of computer system interconnects |
US20040002950A1 (en) * | 2002-04-15 | 2004-01-01 | Brennan Sean F. | Methods and apparatus for process, factory-floor, environmental, computer aided manufacturing-based or other control system using hierarchically enumerated data set |
KR100450320B1 (en) * | 2002-05-10 | 2004-10-01 | 한국전자통신연구원 | Method/Module of Digital TV image signal processing with Auto Error Correction |
JP3606281B2 (en) * | 2002-06-07 | 2005-01-05 | オムロン株式会社 | Programmable controller, CPU unit, special function module, and duplex processing method |
US7155721B2 (en) * | 2002-06-28 | 2006-12-26 | Hewlett-Packard Development Company, L.P. | Method and apparatus for communicating information between lock stepped processors |
US7136798B2 (en) * | 2002-07-19 | 2006-11-14 | International Business Machines Corporation | Method and apparatus to manage multi-computer demand |
US20040044508A1 (en) * | 2002-08-29 | 2004-03-04 | Hoffman Robert R. | Method for generating commands for testing hardware device models |
DE10251912A1 (en) * | 2002-11-07 | 2004-05-19 | Siemens Ag | Data processing synchronization procedure for redundant data processing units, involves obtaining access to run units for common storage zone via access confirmation request |
GB2396446B (en) * | 2002-12-20 | 2005-11-16 | Picochip Designs Ltd | Array synchronization |
US8281084B2 (en) * | 2003-01-13 | 2012-10-02 | Emulex Design & Manufacturing Corp. | Method and interface for access to memory within a first electronic device by a second electronic device |
US7149923B1 (en) * | 2003-01-17 | 2006-12-12 | Unisys Corporation | Software control using the controller as a component to achieve resiliency in a computer system utilizing separate servers for redundancy |
US7779285B2 (en) * | 2003-02-18 | 2010-08-17 | Oracle America, Inc. | Memory system including independent isolated power for each memory module |
US7467326B2 (en) * | 2003-02-28 | 2008-12-16 | Maxwell Technologies, Inc. | Self-correcting computer |
JP2004302713A (en) * | 2003-03-31 | 2004-10-28 | Hitachi Ltd | Storage system and its control method |
DE10328059A1 (en) * | 2003-06-23 | 2005-01-13 | Robert Bosch Gmbh | Method and device for monitoring a distributed system |
US20050039074A1 (en) * | 2003-07-09 | 2005-02-17 | Tremblay Glenn A. | Fault resilient/fault tolerant computing |
US7779212B2 (en) * | 2003-10-17 | 2010-08-17 | Micron Technology, Inc. | Method and apparatus for sending data from multiple sources over a communications bus |
US7107411B2 (en) | 2003-12-16 | 2006-09-12 | International Business Machines Corporation | Apparatus method and system for fault tolerant virtual memory management |
US7472320B2 (en) | 2004-02-24 | 2008-12-30 | International Business Machines Corporation | Autonomous self-monitoring and corrective operation of an integrated circuit |
US7321985B2 (en) * | 2004-02-26 | 2008-01-22 | International Business Machines Corporation | Method for achieving higher availability of computer PCI adapters |
US20050193378A1 (en) * | 2004-03-01 | 2005-09-01 | Breault Richard E. | System and method for building an executable program with a low probability of failure on demand |
US7761923B2 (en) | 2004-03-01 | 2010-07-20 | Invensys Systems, Inc. | Process control methods and apparatus for intrusion detection, protection and network hardening |
JP2005267111A (en) * | 2004-03-17 | 2005-09-29 | Hitachi Ltd | Storage control system and method for controlling storage control system |
US20060020852A1 (en) * | 2004-03-30 | 2006-01-26 | Bernick David L | Method and system of servicing asynchronous interrupts in multiple processors executing a user program |
US8799706B2 (en) * | 2004-03-30 | 2014-08-05 | Hewlett-Packard Development Company, L.P. | Method and system of exchanging information between processors |
US20050240806A1 (en) * | 2004-03-30 | 2005-10-27 | Hewlett-Packard Development Company, L.P. | Diagnostic memory dump method in a redundant processor |
JP4056488B2 (en) * | 2004-03-30 | 2008-03-05 | エルピーダメモリ株式会社 | Semiconductor device testing method and manufacturing method |
US7426656B2 (en) * | 2004-03-30 | 2008-09-16 | Hewlett-Packard Development Company, L.P. | Method and system executing user programs on non-deterministic processors |
JP2005293427A (en) * | 2004-04-02 | 2005-10-20 | Matsushita Electric Ind Co Ltd | Data transfer processing apparatus and data transfer processing method |
US7296181B2 (en) * | 2004-04-06 | 2007-11-13 | Hewlett-Packard Development Company, L.P. | Lockstep error signaling |
US7237144B2 (en) * | 2004-04-06 | 2007-06-26 | Hewlett-Packard Development Company, L.P. | Off-chip lockstep checking |
US7290169B2 (en) * | 2004-04-06 | 2007-10-30 | Hewlett-Packard Development Company, L.P. | Core-level processor lockstepping |
GB0411054D0 (en) * | 2004-05-18 | 2004-06-23 | Ricardo Uk Ltd | Fault tolerant data processing |
US7392426B2 (en) * | 2004-06-15 | 2008-06-24 | Honeywell International Inc. | Redundant processing architecture for single fault tolerance |
US7590823B1 (en) | 2004-08-06 | 2009-09-15 | Xilinx, Inc. | Method and system for handling an instruction not supported in a coprocessor formed using configurable logic |
US7590822B1 (en) | 2004-08-06 | 2009-09-15 | Xilinx, Inc. | Tracking an instruction through a processor pipeline |
US7346759B1 (en) | 2004-08-06 | 2008-03-18 | Xilinx, Inc. | Decoder interface |
US7546441B1 (en) | 2004-08-06 | 2009-06-09 | Xilinx, Inc. | Coprocessor interface controller |
US7243212B1 (en) * | 2004-08-06 | 2007-07-10 | Xilinx, Inc. | Processor-controller interface for non-lock step operation |
US7404105B2 (en) | 2004-08-16 | 2008-07-22 | International Business Machines Corporation | High availability multi-processor system |
TW200609721A (en) * | 2004-09-03 | 2006-03-16 | Inventec Corp | Redundancy control system and method thereof |
US20060080514A1 (en) * | 2004-10-08 | 2006-04-13 | International Business Machines Corporation | Managing shared memory |
JP4182948B2 (en) * | 2004-12-21 | 2008-11-19 | 日本電気株式会社 | Fault tolerant computer system and interrupt control method therefor |
US7778812B2 (en) * | 2005-01-07 | 2010-08-17 | Micron Technology, Inc. | Selecting data to verify in hardware device model simulation test generation |
US7418541B2 (en) * | 2005-02-10 | 2008-08-26 | International Business Machines Corporation | Method for indirect access to a support interface for memory-mapped resources to reduce system connectivity from out-of-band support processor |
US7467204B2 (en) * | 2005-02-10 | 2008-12-16 | International Business Machines Corporation | Method for providing low-level hardware access to in-band and out-of-band firmware |
EP1715589A1 (en) * | 2005-03-02 | 2006-10-25 | STMicroelectronics N.V. | LDPC decoder in particular for DVB-S2 LDPC code decoding |
US20060222126A1 (en) * | 2005-03-31 | 2006-10-05 | Stratus Technologies Bermuda Ltd. | Systems and methods for maintaining synchronicity during signal transmission |
US20060222125A1 (en) * | 2005-03-31 | 2006-10-05 | Edwards John W Jr | Systems and methods for maintaining synchronicity during signal transmission |
US20060236168A1 (en) * | 2005-04-01 | 2006-10-19 | Honeywell International Inc. | System and method for dynamically optimizing performance and reliability of redundant processing systems |
US7549082B2 (en) * | 2005-04-28 | 2009-06-16 | Hewlett-Packard Development Company, L.P. | Method and system of bringing processors to the same computational point |
US8103861B2 (en) * | 2005-04-28 | 2012-01-24 | Hewlett-Packard Development Company, L.P. | Method and system for presenting an interrupt request to processors executing in lock step |
US7730350B2 (en) * | 2005-04-28 | 2010-06-01 | Hewlett-Packard Development Company, L.P. | Method and system of determining the execution point of programs executed in lock step |
US7549029B2 (en) * | 2005-05-06 | 2009-06-16 | International Business Machines Corporation | Methods for creating hierarchical copies |
DE102005038567A1 (en) | 2005-08-12 | 2007-02-15 | Micronas Gmbh | Multi-processor architecture and method for controlling memory access in a multi-process architecture |
US8074059B2 (en) * | 2005-09-02 | 2011-12-06 | Binl ATE, LLC | System and method for performing deterministic processing |
US20070147115A1 (en) * | 2005-12-28 | 2007-06-28 | Fong-Long Lin | Unified memory and controller |
US7519754B2 (en) * | 2005-12-28 | 2009-04-14 | Silicon Storage Technology, Inc. | Hard disk drive cache memory and playback device |
JP4816911B2 (en) * | 2006-02-07 | 2011-11-16 | 日本電気株式会社 | Memory synchronization method and refresh control circuit |
WO2007123753A2 (en) | 2006-03-30 | 2007-11-01 | Invensys Systems, Inc. | Digital data processing apparatus and methods for improving plant performance |
FR2901379B1 (en) * | 2006-05-19 | 2008-06-27 | Airbus France Sas | METHOD AND DEVICE FOR SOFTWARE SYNCHRONIZATION CONSOLIDATION IN FLIGHT CONTROL COMPUTERS |
US8074109B1 (en) * | 2006-11-14 | 2011-12-06 | Unisys Corporation | Third-party voting to select a master processor within a multi-processor computer |
US8006029B2 (en) * | 2006-11-30 | 2011-08-23 | Intel Corporation | DDR flash implementation with direct register access to legacy flash functions |
US20080189495A1 (en) * | 2007-02-02 | 2008-08-07 | Mcbrearty Gerald Francis | Method for reestablishing hotness of pages |
JP4629793B2 (en) * | 2007-03-29 | 2011-02-09 | 富士通株式会社 | Information processing apparatus and error processing method |
US7472038B2 (en) * | 2007-04-16 | 2008-12-30 | International Business Machines Corporation | Method of predicting microprocessor lifetime reliability using architecture-level structure-aware techniques |
US7743285B1 (en) * | 2007-04-17 | 2010-06-22 | Hewlett-Packard Development Company, L.P. | Chip multiprocessor with configurable fault isolation |
US8375188B1 (en) * | 2007-08-08 | 2013-02-12 | Symantec Corporation | Techniques for epoch pipelining |
US20090049323A1 (en) * | 2007-08-14 | 2009-02-19 | Imark Robert R | Synchronization of processors in a multiprocessor system |
JP2009087028A (en) * | 2007-09-28 | 2009-04-23 | Toshiba Corp | Memory system and memory read method, and program |
JP5148236B2 (en) | 2007-10-01 | 2013-02-20 | ルネサスエレクトロニクス株式会社 | Semiconductor integrated circuit and method for controlling semiconductor integrated circuit |
GB2454865B (en) * | 2007-11-05 | 2012-06-13 | Picochip Designs Ltd | Power control |
US20090133022A1 (en) * | 2007-11-15 | 2009-05-21 | Karim Faraydon O | Multiprocessing apparatus, system and method |
US7809980B2 (en) * | 2007-12-06 | 2010-10-05 | Jehoda Refaeli | Error detector in a cache memory using configurable way redundancy |
FR2925191B1 (en) * | 2007-12-14 | 2010-03-05 | Thales Sa | HIGH-INTEGRITY DIGITAL PROCESSING ARCHITECTURE WITH MULTIPLE SUPERVISED RESOURCES |
US8243614B2 (en) * | 2008-03-07 | 2012-08-14 | Honeywell International Inc. | Hardware efficient monitoring of input/output signals |
US7996714B2 (en) * | 2008-04-14 | 2011-08-09 | Charles Stark Draper Laboratory, Inc. | Systems and methods for redundancy management in fault tolerant computing |
US8621154B1 (en) | 2008-04-18 | 2013-12-31 | Netapp, Inc. | Flow based reply cache |
US8161236B1 (en) | 2008-04-23 | 2012-04-17 | Netapp, Inc. | Persistent reply cache integrated with file system |
US8386664B2 (en) * | 2008-05-22 | 2013-02-26 | International Business Machines Corporation | Reducing runtime coherency checking with global data flow analysis |
US8281295B2 (en) * | 2008-05-23 | 2012-10-02 | International Business Machines Corporation | Computer analysis and runtime coherency checking |
CN104407518B (en) | 2008-06-20 | 2017-05-31 | 因文西斯系统公司 | The system and method interacted to the reality and Simulation Facility for process control |
US8285670B2 (en) * | 2008-07-22 | 2012-10-09 | International Business Machines Corporation | Dynamically maintaining coherency within live ranges of direct buffers |
US8583845B2 (en) | 2008-08-07 | 2013-11-12 | Nec Corporation | Multi-processor system and controlling method thereof |
TW201015321A (en) * | 2008-09-25 | 2010-04-16 | Panasonic Corp | Buffer memory device, memory system and data trnsfer method |
US8762621B2 (en) * | 2008-10-28 | 2014-06-24 | Micron Technology, Inc. | Logical unit operation |
JP5439808B2 (en) * | 2008-12-25 | 2014-03-12 | 富士通セミコンダクター株式会社 | System LSI with multiple buses |
GB2466661B (en) * | 2009-01-05 | 2014-11-26 | Intel Corp | Rake receiver |
DE102009007215A1 (en) * | 2009-02-03 | 2010-08-05 | Siemens Aktiengesellschaft | Automation system with a programmable matrix module |
JP2010198131A (en) * | 2009-02-23 | 2010-09-09 | Renesas Electronics Corp | Processor system and operation mode switching method for processor system |
US8171227B1 (en) | 2009-03-11 | 2012-05-01 | Netapp, Inc. | System and method for managing a flow based reply cache |
GB2470037B (en) | 2009-05-07 | 2013-07-10 | Picochip Designs Ltd | Methods and devices for reducing interference in an uplink |
US8463964B2 (en) | 2009-05-29 | 2013-06-11 | Invensys Systems, Inc. | Methods and apparatus for control configuration with enhanced change-tracking |
US8127060B2 (en) | 2009-05-29 | 2012-02-28 | Invensys Systems, Inc. | Methods and apparatus for control configuration with control objects that are fieldbus protocol-aware |
GB2470771B (en) | 2009-06-05 | 2012-07-18 | Picochip Designs Ltd | A method and device in a communication network |
GB2470891B (en) | 2009-06-05 | 2013-11-27 | Picochip Designs Ltd | A method and device in a communication network |
US8966195B2 (en) * | 2009-06-26 | 2015-02-24 | Hewlett-Packard Development Company, L.P. | Direct memory access and super page swapping optimizations for a memory blade |
JP5676950B2 (en) * | 2009-08-20 | 2015-02-25 | キヤノン株式会社 | Image forming apparatus |
GB2474071B (en) | 2009-10-05 | 2013-08-07 | Picochip Designs Ltd | Femtocell base station |
US8473818B2 (en) * | 2009-10-12 | 2013-06-25 | Empire Technology Development Llc | Reliable communications in on-chip networks |
DE102009050161A1 (en) * | 2009-10-21 | 2011-04-28 | Siemens Aktiengesellschaft | A method and apparatus for testing a system having at least a plurality of parallel executable software units |
US8516356B2 (en) | 2010-07-20 | 2013-08-20 | Infineon Technologies Ag | Real-time error detection by inverse processing |
GB2482869B (en) | 2010-08-16 | 2013-11-06 | Picochip Designs Ltd | Femtocell access control |
GB2484927A (en) * | 2010-10-26 | 2012-05-02 | Advanced Risc Mach Ltd | Provision of access control data within a data processing system |
US8443230B1 (en) * | 2010-12-15 | 2013-05-14 | Xilinx, Inc. | Methods and systems with transaction-level lockstep |
CN102621938A (en) * | 2011-01-28 | 2012-08-01 | 上海新华控制技术(集团)有限公司 | Triple redundancy control system in process control and method thereof |
US8972696B2 (en) | 2011-03-07 | 2015-03-03 | Microsoft Technology Licensing, Llc | Pagefile reservations |
GB2489919B (en) | 2011-04-05 | 2018-02-14 | Intel Corp | Filter |
GB2489716B (en) | 2011-04-05 | 2015-06-24 | Intel Corp | Multimode base system |
GB2491098B (en) | 2011-05-16 | 2015-05-20 | Intel Corp | Accessing a base station |
JP5699057B2 (en) * | 2011-08-24 | 2015-04-08 | 株式会社日立製作所 | Programmable device, programmable device reconfiguration method, and electronic device |
US8924780B2 (en) * | 2011-11-10 | 2014-12-30 | Ge Aviation Systems Llc | Method of providing high integrity processing |
US8832411B2 (en) | 2011-12-14 | 2014-09-09 | Microsoft Corporation | Working set swapping using a sequentially ordered swap file |
KR101947726B1 (en) * | 2012-03-08 | 2019-02-13 | 삼성전자주식회사 | Image processing apparatus and Method for processing image thereof |
JP5850774B2 (en) * | 2012-03-22 | 2016-02-03 | ルネサスエレクトロニクス株式会社 | Semiconductor integrated circuit device and system using the same |
WO2013181803A1 (en) | 2012-06-06 | 2013-12-12 | Qualcomm Incorporated | Methods and systems for redundant data storage in a register |
JP6111605B2 (en) * | 2012-11-08 | 2017-04-12 | 日本電気株式会社 | Computer system, computer system diagnostic method and diagnostic program |
US9569223B2 (en) * | 2013-02-13 | 2017-02-14 | Red Hat Israel, Ltd. | Mixed shared/non-shared memory transport for virtual machines |
US10102148B2 (en) | 2013-06-13 | 2018-10-16 | Microsoft Technology Licensing, Llc | Page-based compressed storage management |
CN107219999B (en) * | 2013-08-31 | 2020-06-26 | 华为技术有限公司 | Data migration method of memory module in server and server |
KR102116984B1 (en) * | 2014-03-11 | 2020-05-29 | 삼성전자 주식회사 | Method for controlling memory swap operation and data processing system adopting the same |
US9684625B2 (en) | 2014-03-21 | 2017-06-20 | Microsoft Technology Licensing, Llc | Asynchronously prefetching sharable memory pages |
JP6341795B2 (en) * | 2014-08-05 | 2018-06-13 | ルネサスエレクトロニクス株式会社 | Microcomputer and microcomputer system |
US9632924B2 (en) | 2015-03-02 | 2017-04-25 | Microsoft Technology Licensing, Llc | Using memory compression to reduce memory commit charge |
US10037270B2 (en) | 2015-04-14 | 2018-07-31 | Microsoft Technology Licensing, Llc | Reducing memory commit charge when compressing memory |
US11449452B2 (en) | 2015-05-21 | 2022-09-20 | Goldman Sachs & Co. LLC | General-purpose parallel computing architecture |
CN113641627A (en) | 2015-05-21 | 2021-11-12 | 高盛有限责任公司 | General parallel computing architecture |
JP2017146897A (en) * | 2016-02-19 | 2017-08-24 | 株式会社デンソー | Microcontroller and electronic control unit |
US9595312B1 (en) | 2016-03-31 | 2017-03-14 | Altera Corporation | Adaptive refresh scheduling for memory |
EP3607454A4 (en) * | 2017-04-06 | 2021-03-31 | Goldman Sachs & Co. LLC | General-purpose parallel computing architecture |
US10802932B2 (en) | 2017-12-04 | 2020-10-13 | Nxp Usa, Inc. | Data processing system having lockstep operation |
CN108804109B (en) * | 2018-06-07 | 2021-11-05 | 北京四方继保自动化股份有限公司 | Industrial deployment and control method based on multi-path functional equivalent module redundancy arbitration |
US11099748B1 (en) * | 2018-08-08 | 2021-08-24 | United States Of America As Represented By The Administrator Of Nasa | Fault tolerant memory card |
US10824573B1 (en) | 2019-04-19 | 2020-11-03 | Micron Technology, Inc. | Refresh and access modes for memory |
US11609845B2 (en) * | 2019-05-28 | 2023-03-21 | Oracle International Corporation | Configurable memory device connected to a microprocessor |
US20230267043A1 (en) * | 2022-02-23 | 2023-08-24 | Micron Technology, Inc. | Parity-based error management for a processing system |
CN114610472B (en) * | 2022-05-09 | 2022-12-02 | 上海登临科技有限公司 | Multi-process management method in heterogeneous computing and computing equipment |
Family Cites Families (135)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
FR1587572A (en) * | 1968-10-25 | 1970-03-20 | ||
GB1253309A (en) * | 1969-11-21 | 1971-11-10 | Marconi Co Ltd | Improvements in or relating to data processing arrangements |
GB1308497A (en) * | 1970-09-25 | 1973-02-21 | Marconi Co Ltd | Data processing arrangements |
US3864670A (en) * | 1970-09-30 | 1975-02-04 | Yokogawa Electric Works Ltd | Dual computer system with signal exchange system |
SE347826B (en) * | 1970-11-20 | 1972-08-14 | Ericsson Telefon Ab L M | |
US3810119A (en) * | 1971-05-04 | 1974-05-07 | Us Navy | Processor synchronization scheme |
BE790654A (en) * | 1971-10-28 | 1973-04-27 | Siemens Ag | TREATMENT SYSTEM WITH SYSTEM UNITS |
US3760365A (en) * | 1971-12-30 | 1973-09-18 | Ibm | Multiprocessing computing system with task assignment at the instruction level |
DE2202231A1 (en) * | 1972-01-18 | 1973-07-26 | Siemens Ag | PROCESSING SYSTEM WITH TRIPLE SYSTEM UNITS |
US3783250A (en) * | 1972-02-25 | 1974-01-01 | Nasa | Adaptive voting computer system |
US3828321A (en) * | 1973-03-15 | 1974-08-06 | Gte Automatic Electric Lab Inc | System for reconfiguring central processor and instruction storage combinations |
CH556576A (en) * | 1973-03-28 | 1974-11-29 | Hasler Ag | DEVICE FOR SYNCHRONIZATION OF THREE COMPUTERS. |
JPS5024046A (en) * | 1973-07-04 | 1975-03-14 | ||
US4099241A (en) * | 1973-10-30 | 1978-07-04 | Telefonaktiebolaget L M Ericsson | Apparatus for facilitating a cooperation between an executive computer and a reserve computer |
FR2253432A5 (en) * | 1973-11-30 | 1975-06-27 | Honeywell Bull Soc Ind | |
FR2253423A5 (en) * | 1973-11-30 | 1975-06-27 | Honeywell Bull Soc Ind | |
IT1014277B (en) * | 1974-06-03 | 1977-04-20 | Cselt Centro Studi Lab Telecom | CONTROL SYSTEM OF PROCESS COMPUTERS OPERATING IN PARALLEL |
FR2285458A1 (en) * | 1974-09-20 | 1976-04-16 | Creusot Loire | HYDROCARBON RETENTION DEVICE IN A CONVERTER |
US4015246A (en) * | 1975-04-14 | 1977-03-29 | The Charles Stark Draper Laboratory, Inc. | Synchronous fault tolerant multi-processor system |
US4015243A (en) * | 1975-06-02 | 1977-03-29 | Kurpanek Horst G | Multi-processing computer system |
US4034347A (en) * | 1975-08-08 | 1977-07-05 | Bell Telephone Laboratories, Incorporated | Method and apparatus for controlling a multiprocessor system |
JPS5260540A (en) * | 1975-11-14 | 1977-05-19 | Hitachi Ltd | Synchronization control of double-type system |
US4224664A (en) * | 1976-05-07 | 1980-09-23 | Honeywell Information Systems Inc. | Apparatus for detecting when the activity of one process in relation to a common piece of information interferes with any other process in a multiprogramming/multiprocessing computer system |
US4228496A (en) | 1976-09-07 | 1980-10-14 | Tandem Computers Incorporated | Multiprocessor system |
US4456952A (en) * | 1977-03-17 | 1984-06-26 | Honeywell Information Systems Inc. | Data processing system having redundant control processors for fault detection |
JPS53116040A (en) * | 1977-03-18 | 1978-10-11 | Nec Corp | System controller |
US4358823A (en) * | 1977-03-25 | 1982-11-09 | Trw, Inc. | Double redundant processor |
US4101960A (en) * | 1977-03-29 | 1978-07-18 | Burroughs Corporation | Scientific processor |
US4187538A (en) * | 1977-06-13 | 1980-02-05 | Honeywell Inc. | Read request selection system for redundant storage |
GB1545169A (en) * | 1977-09-22 | 1979-05-02 | Burroughs Corp | Data processor system including data-save controller for protection against loss of volatile memory information during power failure |
IT1111606B (en) * | 1978-03-03 | 1986-01-13 | Cselt Centro Studi Lab Telecom | MULTI-CONFIGURABLE MODULAR PROCESSING SYSTEM INTEGRATED WITH A PRE-PROCESSING SYSTEM |
US4270168A (en) * | 1978-08-31 | 1981-05-26 | United Technologies Corporation | Selective disablement in fail-operational, fail-safe multi-computer control system |
US4234920A (en) * | 1978-11-24 | 1980-11-18 | Engineered Systems, Inc. | Power failure detection and restart system |
US4257097A (en) * | 1978-12-11 | 1981-03-17 | Bell Telephone Laboratories, Incorporated | Multiprocessor system with demand assignable program paging stores |
US4253144A (en) * | 1978-12-21 | 1981-02-24 | Burroughs Corporation | Multi-processor communication network |
US4380046A (en) * | 1979-05-21 | 1983-04-12 | Nasa | Massively parallel processor computer |
US4449183A (en) * | 1979-07-09 | 1984-05-15 | Digital Equipment Corporation | Arbitration scheme for a multiported shared functional device for use in multiprocessing systems |
US4428044A (en) * | 1979-09-20 | 1984-01-24 | Bell Telephone Laboratories, Incorporated | Peripheral unit controller |
US4315310A (en) * | 1979-09-28 | 1982-02-09 | Intel Corporation | Input/output data processing system |
DE2939487A1 (en) * | 1979-09-28 | 1981-04-16 | Siemens AG, 1000 Berlin und 8000 München | COMPUTER ARCHITECTURE BASED ON A MULTI-MICROCOMPUTER STRUCTURE AS A FAULT-TOLERANT SYSTEM |
NL7909178A (en) * | 1979-12-20 | 1981-07-16 | Philips Nv | CALCULATOR WITH DISTRIBUTED REDUNDANCY DISTRIBUTED OVER DIFFERENT INSULATION AREAS FOR ERRORS. |
FR2474201B1 (en) * | 1980-01-22 | 1986-05-16 | Bull Sa | METHOD AND DEVICE FOR MANAGING CONFLICTS CAUSED BY MULTIPLE ACCESSES TO THE SAME CACH OF A DIGITAL INFORMATION PROCESSING SYSTEM COMPRISING AT LEAST TWO PROCESSES EACH HAVING A CACHE |
US4330826A (en) * | 1980-02-05 | 1982-05-18 | The Bendix Corporation | Synchronizer and synchronization system for a multiple computer system |
JPS56119596A (en) * | 1980-02-26 | 1981-09-19 | Nec Corp | Control signal generator |
US4351023A (en) * | 1980-04-11 | 1982-09-21 | The Foxboro Company | Process control system with improved system security features |
US4493019A (en) * | 1980-05-06 | 1985-01-08 | Burroughs Corporation | Pipelined microprogrammed digital data processor employing microinstruction tasking |
JPS573148A (en) * | 1980-06-06 | 1982-01-08 | Hitachi Ltd | Diagnostic system for other system |
US4412281A (en) * | 1980-07-11 | 1983-10-25 | Raytheon Company | Distributed signal processing system |
US4369510A (en) * | 1980-07-25 | 1983-01-18 | Honeywell Information Systems Inc. | Soft error rewrite control system |
US4392196A (en) * | 1980-08-11 | 1983-07-05 | Harris Corporation | Multi-processor time alignment control system |
US4399504A (en) * | 1980-10-06 | 1983-08-16 | International Business Machines Corporation | Method and means for the sharing of data resources in a multiprocessing, multiprogramming environment |
US4375683A (en) * | 1980-11-12 | 1983-03-01 | August Systems | Fault tolerant computational system and voter circuit |
US4371754A (en) * | 1980-11-19 | 1983-02-01 | Rockwell International Corporation | Automatic fault recovery system for a multiple processor telecommunications switching control |
US4414624A (en) * | 1980-11-19 | 1983-11-08 | The United States Of America As Represented By The Secretary Of The Navy | Multiple-microcomputer processing |
JPH0614328B2 (en) * | 1980-11-21 | 1994-02-23 | 沖電気工業株式会社 | Common memory access method |
US4430707A (en) * | 1981-03-05 | 1984-02-07 | Burroughs Corporation | Microprogrammed digital data processing system employing multi-phase subroutine control for concurrently executing tasks |
US4455605A (en) * | 1981-07-23 | 1984-06-19 | International Business Machines Corporation | Method for establishing variable path group associations and affiliations between "non-static" MP systems and shared devices |
US4556952A (en) * | 1981-08-12 | 1985-12-03 | International Business Machines Corporation | Refresh circuit for dynamic memory of a data processor employing a direct memory access controller |
US4438494A (en) * | 1981-08-25 | 1984-03-20 | Intel Corporation | Apparatus of fault-handling in a multiprocessing system |
US4597084A (en) * | 1981-10-01 | 1986-06-24 | Stratus Computer, Inc. | Computer memory apparatus |
US4920540A (en) * | 1987-02-25 | 1990-04-24 | Stratus Computer, Inc. | Fault-tolerant digital timing apparatus and method |
US4486826A (en) * | 1981-10-01 | 1984-12-04 | Stratus Computer, Inc. | Computer peripheral control apparatus |
IN160140B (en) * | 1981-10-10 | 1987-06-27 | Westinghouse Brake & Signal | |
DE3208573C2 (en) * | 1982-03-10 | 1985-06-27 | Standard Elektrik Lorenz Ag, 7000 Stuttgart | 2 out of 3 selection device for a 3 computer system |
US4497059A (en) * | 1982-04-28 | 1985-01-29 | The Charles Stark Draper Laboratory, Inc. | Multi-channel redundant processing systems |
DE3216238C1 (en) * | 1982-04-30 | 1983-11-03 | Siemens AG, 1000 Berlin und 8000 München | Data processing system with virtual subaddressing of the buffer memory |
JPS5914062A (en) * | 1982-07-15 | 1984-01-24 | Hitachi Ltd | Method for controlling duplicated shared memory |
DE3235762A1 (en) * | 1982-09-28 | 1984-03-29 | Fried. Krupp Gmbh, 4300 Essen | METHOD AND DEVICE FOR SYNCHRONIZING DATA PROCESSING SYSTEMS |
NL8203921A (en) * | 1982-10-11 | 1984-05-01 | Philips Nv | MULTIPLE REDUNDANT CLOCK SYSTEM, CONTAINING A NUMBER OF SYNCHRONIZING CLOCKS, AND CLOCK CIRCUIT FOR USE IN SUCH A CLOCK SYSTEM. |
US4667287A (en) * | 1982-10-28 | 1987-05-19 | Tandem Computers Incorporated | Multiprocessor multisystem communications network |
US4473452A (en) * | 1982-11-18 | 1984-09-25 | The Trustees Of Columbia University In The City Of New York | Electrophoresis using alternating transverse electric fields |
US4590554A (en) * | 1982-11-23 | 1986-05-20 | Parallel Computers Systems, Inc. | Backup fault tolerant computer system |
US4648035A (en) * | 1982-12-06 | 1987-03-03 | Digital Equipment Corporation | Address conversion unit for multiprocessor system |
US4541094A (en) * | 1983-03-21 | 1985-09-10 | Sequoia Systems, Inc. | Self-checking computer circuitry |
US4591977A (en) * | 1983-03-23 | 1986-05-27 | The United States Of America As Represented By The Secretary Of The Air Force | Plurality of processors where access to the common memory requires only a single clock interval |
US4644498A (en) * | 1983-04-04 | 1987-02-17 | General Electric Company | Fault-tolerant real time clock |
US4661900A (en) * | 1983-04-25 | 1987-04-28 | Cray Research, Inc. | Flexible chaining in vector processor with selective use of vector registers as operand and result registers |
US4577272A (en) * | 1983-06-27 | 1986-03-18 | E-Systems, Inc. | Fault tolerant and load sharing processing system |
US4646231A (en) * | 1983-07-21 | 1987-02-24 | Burroughs Corporation | Method of synchronizing the sequence by which a variety of randomly called unrelated activities are executed in a digital processor |
JPS6054052A (en) * | 1983-09-02 | 1985-03-28 | Nec Corp | Processing continuing system |
DE3334796A1 (en) * | 1983-09-26 | 1984-11-08 | Siemens AG, 1000 Berlin und 8000 München | METHOD FOR OPERATING A MULTIPROCESSOR CONTROLLER, ESPECIALLY FOR THE CENTRAL CONTROL UNIT OF A TELECOMMUNICATION SWITCHING SYSTEM |
US4912698A (en) * | 1983-09-26 | 1990-03-27 | Siemens Aktiengesellschaft | Multi-processor central control unit of a telephone exchange system and its operation |
US4564903A (en) * | 1983-10-05 | 1986-01-14 | International Business Machines Corporation | Partitioned multiprocessor programming system |
US4631701A (en) * | 1983-10-31 | 1986-12-23 | Ncr Corporation | Dynamic random access memory refresh control system |
US4607365A (en) * | 1983-11-14 | 1986-08-19 | Tandem Computers Incorporated | Fault-tolerant communications controller system |
US4783733A (en) * | 1983-11-14 | 1988-11-08 | Tandem Computers Incorporated | Fault tolerant communications controller system |
US4570261A (en) * | 1983-12-09 | 1986-02-11 | Motorola, Inc. | Distributed fault isolation and recovery system and method |
AU3746585A (en) * | 1983-12-12 | 1985-06-26 | Parallel Computers Inc. | Computer processor controller |
US4608688A (en) * | 1983-12-27 | 1986-08-26 | At&T Bell Laboratories | Processing system tolerant of loss of access to secondary storage |
US4622631B1 (en) * | 1983-12-30 | 1996-04-09 | Recognition Int Inc | Data processing system having a data coherence solution |
US4625296A (en) * | 1984-01-17 | 1986-11-25 | The Perkin-Elmer Corporation | Memory refresh circuit with varying system transparency |
DE3509900A1 (en) * | 1984-03-19 | 1985-10-17 | Konishiroku Photo Industry Co., Ltd., Tokio/Tokyo | METHOD AND DEVICE FOR PRODUCING A COLOR IMAGE |
US4638427A (en) * | 1984-04-16 | 1987-01-20 | International Business Machines Corporation | Performance evaluation for an asymmetric multiprocessor system |
US4633394A (en) * | 1984-04-24 | 1986-12-30 | International Business Machines Corp. | Distributed arbitration for multiple processors |
US4589066A (en) * | 1984-05-31 | 1986-05-13 | General Electric Company | Fault tolerant, frame synchronization for multiple processor systems |
US4823256A (en) * | 1984-06-22 | 1989-04-18 | American Telephone And Telegraph Company, At&T Bell Laboratories | Reconfigurable dual processor system |
US4959774A (en) * | 1984-07-06 | 1990-09-25 | Ampex Corporation | Shadow memory system for storing variable backup blocks in consecutive time periods |
JPS6184740A (en) * | 1984-10-03 | 1986-04-30 | Hitachi Ltd | Generating system of general-use object code |
US4827401A (en) * | 1984-10-24 | 1989-05-02 | International Business Machines Corporation | Method and apparatus for synchronizing clocks prior to the execution of a flush operation |
AU568977B2 (en) * | 1985-05-10 | 1988-01-14 | Tandem Computers Inc. | Dual processor error detection system |
JPS61265660A (en) * | 1985-05-20 | 1986-11-25 | Toshiba Corp | Execution mode switching control system in multiprocessor system |
US4757442A (en) * | 1985-06-17 | 1988-07-12 | Nec Corporation | Re-synchronization system using common memory bus to transfer restart data from non-faulty processor to failed processor |
US4751639A (en) * | 1985-06-24 | 1988-06-14 | Ncr Corporation | Virtual command rollback in a fault tolerant data processing system |
US4683570A (en) * | 1985-09-03 | 1987-07-28 | General Electric Company | Self-checking digital fault detector for modular redundant real time clock |
US4845419A (en) * | 1985-11-12 | 1989-07-04 | Norand Corporation | Automatic control means providing a low-power responsive signal, particularly for initiating data preservation operation |
JPS62135940A (en) * | 1985-12-09 | 1987-06-18 | Nec Corp | Stall detecting system |
US4733353A (en) | 1985-12-13 | 1988-03-22 | General Electric Company | Frame synchronization of multiply redundant computers |
JPH0778750B2 (en) * | 1985-12-24 | 1995-08-23 | 日本電気株式会社 | Highly reliable computer system |
US4703452A (en) * | 1986-01-03 | 1987-10-27 | Gte Communication Systems Corporation | Interrupt synchronizing circuit |
US4773038A (en) * | 1986-02-24 | 1988-09-20 | Thinking Machines Corporation | Method of simulating additional processors in a SIMD parallel processor array |
US4799140A (en) * | 1986-03-06 | 1989-01-17 | Orbital Sciences Corporation Ii | Majority vote sequencer |
US4757505A (en) * | 1986-04-30 | 1988-07-12 | Elgar Electronics Corp. | Computer power system |
US4868832A (en) * | 1986-04-30 | 1989-09-19 | Marrington S Paul | Computer power system |
US4763333A (en) * | 1986-08-08 | 1988-08-09 | Universal Vectors Corporation | Work-saving system for preventing loss in a computer due to power interruption |
US4819159A (en) * | 1986-08-29 | 1989-04-04 | Tolerant Systems, Inc. | Distributed multiprocess transaction processing system and method |
IT1213344B (en) * | 1986-09-17 | 1989-12-20 | Honeywell Information Systems | FAULT TOLERANCE CALCULATOR ARCHITECTURE. |
US4774709A (en) * | 1986-10-02 | 1988-09-27 | United Technologies Corporation | Symmetrization for redundant channels |
GB2211638A (en) * | 1987-10-27 | 1989-07-05 | Ibm | Simd array processor |
US4847837A (en) * | 1986-11-07 | 1989-07-11 | The United States Of America As Represented By The Administrator Of The National Aeronautics And Space Administration | Local area network with fault-checking, priorities and redundant backup |
JPS63165950A (en) * | 1986-12-27 | 1988-07-09 | Pfu Ltd | Common memory system |
US4831520A (en) * | 1987-02-24 | 1989-05-16 | Digital Equipment Corporation | Bus interface circuit for digital data processor |
US4967353A (en) | 1987-02-25 | 1990-10-30 | International Business Machines Corporation | System for periodically reallocating page frames in memory based upon non-usage within a time period or after being allocated |
US4805107A (en) * | 1987-04-15 | 1989-02-14 | Allied-Signal Inc. | Task scheduler for a fault tolerant multiple node processing system |
CH675781A5 (en) * | 1987-04-16 | 1990-10-31 | Bbc Brown Boveri & Cie | |
US4800462A (en) * | 1987-04-17 | 1989-01-24 | Tandem Computers Incorporated | Electrical keying for replaceable modules |
US4868826A (en) * | 1987-08-31 | 1989-09-19 | Triplex | Fault-tolerant output circuits |
US4868818A (en) * | 1987-10-29 | 1989-09-19 | The United States Of America As Represented By The Administrator Of The National Aeronautics And Space Administration | Fault tolerant hypercube computer system architecture |
AU616213B2 (en) * | 1987-11-09 | 1991-10-24 | Tandem Computers Incorporated | Method and apparatus for synchronizing a plurality of processors |
US4879716A (en) * | 1987-12-23 | 1989-11-07 | Bull Hn Information Systems Inc. | Resilient data communications system |
US4937741A (en) * | 1988-04-28 | 1990-06-26 | The Charles Stark Draper Laboratory, Inc. | Synchronization of fault-tolerant parallel processing systems |
US4907232A (en) * | 1988-04-28 | 1990-03-06 | The Charles Stark Draper Laboratory, Inc. | Fault-tolerant parallel processing system |
US4873685A (en) * | 1988-05-04 | 1989-10-10 | Rockwell International Corporation | Self-checking voting logic for fault tolerant computing applications |
US4965717A (en) * | 1988-12-09 | 1990-10-23 | Tandem Computers Incorporated | Multiple processor system having shared memory with private-write capability |
US5018148A (en) | 1989-03-01 | 1991-05-21 | Ncr Corporation | Method and apparatus for power failure protection |
US5020059A (en) | 1989-03-31 | 1991-05-28 | At&T Bell Laboratories | Reconfigurable signal processor |
- 1988
- 1988-12-13 US US07/283,573 patent/US4965717A/en not_active Expired - Lifetime
- 1989
- 1989-11-20 CA CA002003342A patent/CA2003342A1/en not_active Abandoned
- 1989-11-20 CA CA002003337A patent/CA2003337A1/en not_active Abandoned
- 1989-12-08 EP EP95111006A patent/EP0681239A3/en not_active Withdrawn
- 1989-12-08 DE DE68928360T patent/DE68928360T2/en not_active Expired - Fee Related
- 1989-12-08 AT AT89122708T patent/ATE158879T1/en active
- 1989-12-08 EP EP89122708A patent/EP0372579B1/en not_active Expired - Lifetime
- 1989-12-08 EP EP19890122707 patent/EP0372578A3/en not_active Withdrawn
- 1989-12-11 JP JP1322461A patent/JPH079625B2/en not_active Expired - Fee Related
- 1989-12-11 JP JP1322462A patent/JPH02202637A/en active Pending
- 1990
- 1990-03-19 EP EP90105102A patent/EP0447577A1/en not_active Withdrawn
- 1990-03-19 EP EP90105103A patent/EP0447578A1/en not_active Withdrawn
- 1990-03-20 AU AU52027/90A patent/AU628497B2/en not_active Expired - Fee Related
- 1990-12-17 US US07/629,698 patent/US5146589A/en not_active Expired - Lifetime
- 1991
- 1991-03-05 US US07/664,495 patent/US5276823A/en not_active Expired - Lifetime
- 1991-03-06 US US07/666,495 patent/US5193175A/en not_active Expired - Lifetime
- 1992
- 1992-11-24 US US07/982,074 patent/US5388242A/en not_active Expired - Lifetime
- 1994
- 1994-02-28 JP JP6054483A patent/JPH0713789A/en active Pending
- 1995
- 1995-01-31 US US08/381,467 patent/US5588111A/en not_active Expired - Lifetime
- 1997
- 1997-03-10 US US08/815,997 patent/US5758113A/en not_active Expired - Lifetime
Also Published As
Publication number | Publication date |
---|---|
EP0372579A3 (en) | 1991-07-24 |
EP0372578A2 (en) | 1990-06-13 |
JPH02202636A (en) | 1990-08-10 |
ATE158879T1 (en) | 1997-10-15 |
EP0681239A3 (en) | 1996-01-24 |
AU628497B2 (en) | 1992-09-17 |
US5276823A (en) | 1994-01-04 |
EP0372578A3 (en) | 1992-01-15 |
US5588111A (en) | 1996-12-24 |
DE68928360T2 (en) | 1998-05-07 |
JPH079625B2 (en) | 1995-02-01 |
AU5202790A (en) | 1991-09-26 |
US4965717B1 (en) | 1993-05-25 |
US5146589A (en) | 1992-09-08 |
EP0681239A2 (en) | 1995-11-08 |
CA2003337A1 (en) | 1990-06-09 |
EP0372579B1 (en) | 1997-10-01 |
US5388242A (en) | 1995-02-07 |
US5193175A (en) | 1993-03-09 |
EP0447578A1 (en) | 1991-09-25 |
DE68928360D1 (en) | 1997-11-06 |
JPH0713789A (en) | 1995-01-17 |
US4965717A (en) | 1990-10-23 |
JPH02202637A (en) | 1990-08-10 |
US5758113A (en) | 1998-05-26 |
EP0372579A2 (en) | 1990-06-13 |
EP0447577A1 (en) | 1991-09-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CA2003342A1 (en) | Memory management in high-performance fault-tolerant computer system | |
US5890003A (en) | Interrupts between asynchronously operating CPUs in fault tolerant computer system | |
US5384906A (en) | Method and apparatus for synchronizing a plurality of processors | |
US5317726A (en) | Multiple-processor computer system with asynchronous execution of identical code streams | |
US5327553A (en) | Fault-tolerant computer system with /CONFIG filesystem | |
US5317752A (en) | Fault-tolerant computer system with auto-restart after power-fail | |
US5295258A (en) | Fault-tolerant computer system with online recovery and reintegration of redundant components | |
CA1299756C (en) | Dual rail processors with error checking at single rail interfaces | |
EP0433979A2 (en) | Fault-tolerant computer system with /CONFIG filesystem | |
US6216236B1 (en) | Processing unit for a computer and a computer system incorporating such a processing unit | |
JPH07219913A (en) | Method for controlling multiprocessor system and device therefor | |
WO1994008293A1 (en) | Method and apparatus for operating tightly coupled mirrored processors | |
WO1994008293A9 (en) | Method and apparatus for operating tightly coupled mirrored processors | |
EP0683456B1 (en) | Fault-tolerant computer system with online reintegration and shutdown/restart | |
US5473770A (en) | Fault-tolerant computer system with hidden local memory refresh | |
Tamir et al. | The UCLA mirror processor: A building block for self-checking self-repairing computing nodes | |
KR19990057809A (en) | Error prevention system | |
Tamir | Self-checking self-repairing computer nodes using the Mirror Processor | |
EP0476262B1 (en) | Error handling in a VLSI central processor unit employing a pipelined address and execution module | |
CN112506701A (en) | Multiprocessor chip error recovery method based on three-mode lockstep |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
EEER | Examination request | ||
FZDE | Discontinued |