WO1993020506A1 - Semiconductor floor plan and method for a register renaming circuit - Google Patents

Semiconductor floor plan and method for a register renaming circuit

Info

Publication number
WO1993020506A1
Authority
WO
WIPO (PCT)
Prior art keywords
rows
instruction
data dependency
instructions
output lines
Prior art date
Application number
PCT/JP1993/000377
Other languages
French (fr)
Inventor
Kevin Ray Iadonato
Le Trong Nguyen
Original Assignee
Seiko Epson Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Seiko Epson Corporation
Priority to JP51729593A (JP3555140B2)
Publication of WO1993020506A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30098Register arrangements
    • G06F9/30141Implementation provisions of register files, e.g. ports
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00Computer-aided design [CAD]
    • G06F30/30Circuit design
    • G06F30/39Circuit design at the physical level
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3824Operand accessing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3824Operand accessing
    • G06F9/3826Bypassing or forwarding of data results, e.g. locally between pipeline stages or within a pipeline stage
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3836Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3836Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • G06F9/3838Dependency mechanisms, e.g. register scoreboarding
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3836Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • G06F9/3838Dependency mechanisms, e.g. register scoreboarding
    • G06F9/384Register renaming
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3854Instruction completion, e.g. retiring, committing or graduating
    • G06F9/3858Result writeback, i.e. updating the architectural state or memory

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Evolutionary Computation (AREA)
  • Geometry (AREA)
  • Advance Control (AREA)
  • Design And Manufacture Of Integrated Circuits (AREA)

Abstract

A semiconductor floor plan layout for integrating a Data Dependency Checker (DDC) circuit and a Tag Assignment Logic (TAL) of a Register Renaming Circuit (RRC) to conserve valuable semiconductor real estate. Floor plans of the present invention contemplate laying out the DDC and TAL in such a fashion as to reduce the distance signals must travel between the DDC and TAL, as well as the distance signals must travel between the TAL and the Register File Port Multiplexers (RPM). By rearranging selected DDC comparator rows and their associated TAL, a considerable amount of area can be conserved for performing register renaming for up to eight instructions.

Description

DESCRIPTION
SEMICONDUCTOR FLOOR PLAN AND METHOD FOR A REGISTER RENAMING CIRCUIT
CROSS-REFERENCE TO RELATED APPLICATIONS
The following are commonly owned, co-pending applications:
* "Superscalar RISC Instruction Scheduling", Serial No. 07/860,719, concurrently filed with the present application (Attorney Docket No. SP035);
* "High Performance RISC Microprocessor Architecture", Serial No. 07/817,810, filed 1/8/92 (Attorney Docket No. SP015);
* "Extensible RISC Microprocessor Architecture", Serial No. 07/817,809, filed 1/8/92 (Attorney Docket No. SP021).
The disclosures of the above applications are incorporated herein by reference.
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to a semiconductor floorplan layout, and more particularly, to a semiconductor floorplan layout which integrates sections of a register renaming circuit of a superscalar RISC chip.
2. Related Art
Given instructions with two input operands and one output value, as holds for typical RISC (reduced instruction set computer) instructions, there are five possible dependencies between any two instructions: two true dependencies, two anti-dependencies, and one output dependency. Furthermore, the number of dependencies between a group of instructions (such as a group of instructions in a window) varies with the square of the number of instructions in the group, because each instruction must be considered against every other instruction. Complexity is further multiplied by the number of instructions that the processor attempts to decode, issue, and complete in a single cycle, because these actions introduce dependencies, are controlled by dependencies, and remove dependencies from consideration.
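As a rough illustration of the growth described above, the following Python sketch counts the instruction pairs in a group and multiplies by the five dependency kinds named in the text. The function names are hypothetical and chosen only for this example; the result is an upper bound on pairwise checks, not a description of any particular circuit.

```python
# Illustrative only: all names below are assumptions, not taken from the application.
DEPENDENCY_KINDS_PER_PAIR = 5  # two true, two anti, one output (per the text)


def pair_count(n):
    """Number of (earlier, later) instruction pairs in a group of n instructions."""
    return n * (n - 1) // 2


def max_dependency_checks(n):
    """Upper bound on pairwise dependency checks for a group of n instructions."""
    return DEPENDENCY_KINDS_PER_PAIR * pair_count(n)


if __name__ == "__main__":
    for n in (2, 4, 8):
        print(n, pair_count(n), max_dependency_checks(n))
```

Doubling the group from four to eight instructions raises the pair count from 6 to 28, which is the roughly quadratic growth the paragraph describes.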
True dependencies (sometimes called "flow dependencies" or "write-read" dependencies) are often grouped with anti-dependencies (also called "read-write" dependencies) and output dependencies (also called "write-write" dependencies) into a single group of instruction dependencies. The reason for this grouping is that each of these dependencies manifests itself through the use of registers or other storage locations. However, it is important to distinguish true dependencies from the other two. True dependencies represent the flow of data and information through a program. Anti- and output dependencies arise because, at different points in time, registers or other storage locations hold different values for different computations.
When instructions are issued in order and complete in order, there is a one-to-one correspondence between registers and values. At any given point in execution, a register identifier precisely identifies the value contained in the corresponding register. When instructions are issued out of order and complete out of order, the correspondence between registers and values breaks down, and values conflict for registers. This problem is severe when the goal of register allocation is to keep as many values in as few registers as possible. Keeping a large number of values in a small number of registers creates a large number of conflicts when the execution order is changed from the order assumed by the register allocator.
Anti- and output dependencies are more properly called "storage conflicts" because the reuse of storage locations (including registers) causes instructions to interfere with one another even though the conflicting instructions are otherwise independent. Storage conflicts constrain instruction issue and reduce performance. In view of the above discussion it becomes clear that implementing data dependency circuits, and register renaming circuits in general, is complex and requires a great deal of semiconductor area. Superscalar RISC processors, in particular, strive to simultaneously execute multiple instructions. As this technology develops, chip developers attempt to simultaneously execute more and more instructions. Thus, the required amount of dependency checking increases at an exponential rate. What is needed is a layout technique (also called a floorplan) that can integrate sections of the register renaming circuit (RRC) to conserve semiconductor area.
A more detailed description of some of the basic concepts discussed in this application is found in a number of references, including Mike Johnson, Superscalar Microprocessor Design (Prentice-Hall, Inc., Englewood Cliffs, New Jersey, 1991); and John L. Hennessy et al., Computer Architecture: A Quantitative Approach (Morgan Kaufmann Publishers, Inc., San Mateo, California, 1990). Johnson's text, particularly Chapters 2, 6 and 7, provides an excellent discussion of the register renaming and data dependency issues addressed by the present invention.
SUMMARY OF THE INVENTION
The present invention is directed to a semiconductor floorplan layout for integrating a Data Dependency Checker (DDC) circuit and a Tag Assignment Logic (TAL) of a Register Renaming Circuit (RRC) to conserve valuable semiconductor real estate.
Floorplans of the present invention contemplate laying out the DDC and TAL in such a fashion as to reduce the distance signals must travel between the DDC and TAL, as well as the distance signals must travel between the TAL and RPM. By rearranging selected DDC comparator rows and their associated TAL, a considerable amount of area can be conserved for performing register renaming for up to eight instructions. The function of the DDC is to locate the dependencies between the instructions in these buckets. The DDC does this by comparing the addresses of the source registers of each instruction to the addresses of the destination registers of each previous instruction. For example, if instruction A reads a value from a register that is written to by instruction B, then instruction A is dependent upon instruction B and instruction A cannot start until instruction B has finished. The DDC outputs indicate these dependencies. The data dependency checking is done by a plurality of comparators which are laid out in rows.
The DDC results are used by the TAL to control result forwarding for out-of-order instruction execution. The TAL in turn generates input signals for Register File Port Multiplexers (RPM) which channel addresses of data that will be read from the register file or temporary buffer for use as operands for subsequently issuing instructions.
BRIEF DESCRIPTION OF THE DRAWINGS
The invention will be better understood if reference is made to the accompanying drawings in which:
Fig. 1 shows a representative high level block diagram of a register renaming circuit (RRC).
Fig. 2 is a representative floorplan showing a simple layout of an RRC.
Fig. 3 is a representative floorplan showing an improved layout of an RRC in accordance with the present invention.
Fig. 4 is a representative floorplan showing a further improved layout of an RRC in accordance with the present invention.
DETAILED DESCRIPTION
FIG. 1 shows a representative high level block diagram of an Instruction Execution Unit (IEU) 100 associated with the present invention. The goal of IEU 100 is to execute as many instructions as possible in the shortest amount of time. There are two basic ways to accomplish this: optimize IEU 100 so that each instruction takes as little time as possible or optimize IEU 100 so that it can execute several instructions at the same time. An IEU for use with the present invention is disclosed in commonly owned, co-pending applications titled, "High Performance RISC Microprocessor Architecture", Serial No. 07/817,810, filed 1/8/92 (Attorney Docket No. SP015/1397.0280001), and "Extensible RISC Microprocessor Architecture", Serial No. 07/817,809, filed 1/8/92 (Attorney Docket No. SP021/1397.0300001), the disclosures of which are incorporated herein by reference.
Instructions are sent to IEU 100 from an Instruction Fetch Unit (IFU, not shown) through an instruction FIFO (first-in-first-out register stack storage device) 101 in groups of four called "buckets." IEU 100 can decode and schedule up to two buckets of instructions at one time. FIFO 101 stores 16 total instructions in four buckets labeled 0-3. IEU 100 looks at an instruction window 102. In one embodiment of the present invention, window 102 comprises eight instructions (buckets 0 and 1). Every cycle IEU 100 tries to issue a maximum number of instructions from window 102. Window 102 functions as an instruction buffer register. Once the instructions in a bucket are executed and their results stored in the processor's register file (see block 117), the bucket is flushed out a bottom 104 and a new bucket is dropped in at a top 106.
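The bucket movement described above can be sketched in Python as follows. This is a minimal model under stated assumptions: instructions are represented by plain integers, index 0 of the deque stands for the bottom of the FIFO, and the names make_fifo, window and advance are hypothetical helpers introduced only for illustration.

```python
from collections import deque

BUCKET_SIZE = 4      # instructions per bucket (from the text)
FIFO_BUCKETS = 4     # buckets 0-3 (from the text)
WINDOW_BUCKETS = 2   # buckets 0 and 1 form the window in the described embodiment


def make_fifo(first_instruction=0):
    """Build a FIFO of four buckets; index 0 is assumed to be the bottom bucket."""
    fifo = deque()
    for b in range(FIFO_BUCKETS):
        start = first_instruction + b * BUCKET_SIZE
        fifo.append(list(range(start, start + BUCKET_SIZE)))
    return fifo


def window(fifo):
    """Instructions currently visible to the scheduler (the bottom two buckets)."""
    return [instr for bucket in list(fifo)[:WINDOW_BUCKETS] for instr in bucket]


def advance(fifo, new_bucket):
    """Flush the bottom bucket and drop a new bucket in at the top."""
    flushed = fifo.popleft()
    fifo.append(new_bucket)
    return flushed
```

In this model, calling advance once shifts the window from instructions 0-7 to instructions 4-11 while the FIFO keeps holding four buckets.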
In order to execute instructions in parallel or out of order, care must be taken so that the data that each instruction needs is available when the instruction needs it and also so that the result of each instruction is available for any future instructions that might need it. A Register Rename Circuit (RRC), which is part of the scheduling logic of the computer's IEU, performs this function by locating dependencies between current instructions and then renaming the sources (inputs) of the instruction.
As noted above, there are three types of dependencies: input dependencies, output dependencies and anti-dependencies. Input dependencies occur when an instruction, call it A, performs an operation on the result of a previous instruction, call it B. Output dependencies occur when the outputs of A and B are to be stored in the same place. Anti-dependencies occur when instruction A comes before B in the instruction stream and B's result will be stored in the same place as one of A's inputs.
Input dependencies are handled by not executing instructions until their inputs are available. RRC 112 is used to locate the input dependencies between current instructions and then to signal an Instruction Scheduler or Issuer 118 when all inputs for a particular instruction are ready. In order to locate these dependencies, RRC 112 compares the register file addresses of each instruction's inputs with the addresses of each previous instruction's output using a data dependency circuit (DDC) 108. If one instruction's input comes from a register where a previous instruction's output will be stored, then the latter instruction must wait for the former to finish.
This implementation of RRC 112 can check eight instructions at the same time, so a current instruction is defined as any one of those eight from window 102. It should become evident to those skilled in the art that the present invention can easily be adapted to check more or fewer instructions.
In one embodiment of the present invention, instructions can have from 0 to 3 inputs and 0 or 1 outputs. Most instructions' inputs and outputs come from, or are stored in, one of several register files. Each register file 117 (e.g., separate integer, floating and boolean register files) has 32 real entries plus the group of 8 temporary buffers 116. When an instruction completes (the term "complete" means that the operation is complete and the operand is ready to be written to its destination register), its result is stored in its preassigned location in the temporary buffers 116. Its result is later moved to the appropriate place in register file 117 after all previous instructions' results have been moved to their places in the register file. This movement of results from temporary buffers 116 to register file 117 is called "retirement" and is controlled by termination logic, as should become evident to those skilled in the art. More than one instruction may be retired at a time. Retirement comprises updating the "official state" of the machine including the computer's Program Counter, as will become evident to those skilled in the art. For example, if instruction I0 happens to complete directly before instruction I1, both results can be stored directly into register file 117. But if instruction I3 then completes, its result must be stored in temporary buffer 116 until instruction I2 completes. By having IEU 100 store each instruction's result in its preassigned place in the temporary buffers 116, IEU 100 can execute instructions out of program order and still avoid the problems caused by output and anti-dependencies.
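The in-order retirement rule from the preceding paragraph can be sketched as a pointer that advances only past consecutively completed instructions. The Python below is a simplified model, not the termination logic itself; the retire function and its arguments are assumptions made for illustration.

```python
WINDOW_SIZE = 8  # eight temporary buffer entries, one per instruction in the window


def retire(completed, next_to_retire):
    """Return the new retirement pointer.

    completed:      set of instruction indices whose results are complete.
    next_to_retire: index of the oldest instruction not yet moved to the register file.
    Results are moved strictly in program order, so the pointer stops at the
    first instruction that has not completed.
    """
    while next_to_retire < WINDOW_SIZE and next_to_retire in completed:
        next_to_retire += 1
    return next_to_retire
```

Following the example in the text: with instructions 0 and 1 complete the pointer moves to 2; if instruction 3 then completes the pointer stays at 2, so the result of instruction 3 waits in its temporary buffer; once instruction 2 completes the pointer moves to 4.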
RRC 112 sends a bit map to an Instruction Scheduler 118 via a bus 120 indicating which instructions in window 102 are ready for issuing. Instruction decode logic (not shown) indicates to Issuer 118 the resource requirements for each instruction over a bus 123. For each resource in IEU 100 (e.g., each functional unit being an adder, multiplier, shifter, or the like), Issuer 118 scans this information and selects the first and subsequent instructions for issuing by sending issue signals over bus 121. The issue signals select a group of Register File Port MUXes (RPMs) 124 inside RRC 112 whose inputs are the addresses of each instruction's inputs. Because the results may stay in temporary buffer 116 several cycles before going to register file 117, a mechanism is provided to get results from temporary buffer 116 before they go to register file 117, so the information can be used as operands for other instructions. This mechanism is called "result forwarding," and without it, Issuer 118 would not be able to issue instructions out of order. This result forwarding is done in register file 117 and is controlled by RRC 112. The control signals necessary for performing the result forwarding will become evident to those skilled in the art, as should the random logic used for generating such control signals. If an instruction is not dependent on any of the current instructions, result forwarding is not necessary since the instruction's inputs are already in register file 117. When Issuer 118 decides to execute that instruction, RRC 112 tells register file 117 to output its data.
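A minimal Python sketch of the operand selection that result forwarding implies is given below. It is an assumption-laden model: the text does not spell out how a producer is chosen when several earlier instructions write the same register, so the sketch takes the nearest earlier writer as given, and read_operand, done, register_file and temp_buffers are hypothetical names.

```python
def read_operand(src_addr, producer, done, register_file, temp_buffers):
    """Select where an operand comes from and report whether it is ready.

    producer: index of the nearest earlier instruction in the window that writes
              src_addr (an assumption of this sketch), or None if there is none.
    done:     set of instruction indices that have completed.
    Returns a (ready, value) pair.
    """
    if producer is None:
        # No dependency on a current instruction: the value is in the register file.
        return True, register_file[src_addr]
    if producer in done:
        # Producer has completed: forward its result from the temporary buffer.
        return True, temp_buffers[producer]
    # Producer has not completed yet: the consumer cannot be issued.
    return False, None
```

The three branches correspond to the three situations in the text: no dependency, a dependency whose result can be forwarded, and a dependency that still blocks issue.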
RRC 112 contains three subsections: a Data Dependency Checker (DDC) 108, Tag Assign Logic (TAL) 122 and Register File Port MUXes (RPM) 124. DDC 108 determines where the input dependencies are between the current instructions. TAL 122 monitors the dependencies for Issuer 118 and controls result forwarding. RPM 124 is controlled by Issuer 118 and directs the outputs of TAL 122 to the appropriate register file address ports 119. Instructions are passed to DDC 108 via bus 110. All source registers are compared with all previous destination registers for each instruction in window 102.
Each instruction has only one destination, which may be a double register in one embodiment. An instruction can only depend on a previous instruction and may have up to three source registers. There are various register file source and destination addresses that need to be checked against each other for any dependencies. As noted above, the eight bottom instructions corresponding to the lower two buckets are checked by DDC 108. All source register addresses are compared with all previous destination register addresses for the instructions in window 102.
For example, let's say a program has the following instruction sequence:
    add R0, R1, R2 (0)
    add R0, R2, R3 (1)
    add R4, R5, R2 (2)
    add R2, R3, R4 (3)
The first two registers in each instruction 0-3 are the source registers, and the last listed register in each instruction is the destination register. For example, R0 and R1 are the source registers for instruction 0 and R2 is the destination register. Instruction 0 adds the contents of registers 0 and 1 and stores the result in R2. For instructions 1-3 in this example, the following are the comparisons needed to evaluate all of the dependencies:
    I1S1, I1S2 vs. I0D
    I2S1, I2S2 vs. I1D, I0D
    I3S1, I3S2 vs. I2D, I1D, I0D
The key to the above is as follows: IXS1 is the address of source (input) number 1 of instruction X; IXS2 is the address of source (input) number 2 of instruction X; and IXD is the address of the destination (output) of instruction X.
Note also that RRC 112 can ignore the fact that instruction 2 is output dependent on instruction 0, because the processor has a temporary buffer where instruction 2's result can be stored without interfering with instruction 0's result. As discussed before, instruction 2's result will not be moved from temporary buffers 116 to register file 117 until instructions 0 and 1's results are moved to register file 117.
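The comparisons for the four-instruction example can be worked through in a short Python sketch. It only models the address comparisons (each source against every earlier destination); the Instr type and the ddc_matches function are hypothetical names, and the register numbers are those of the example sequence.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    s1: int  # address of source 1
    s2: int  # address of source 2
    d: int   # address of the destination


# add R0, R1, R2 (0); add R0, R2, R3 (1); add R4, R5, R2 (2); add R2, R3, R4 (3)
WINDOW = [Instr(0, 1, 2), Instr(0, 2, 3), Instr(4, 5, 2), Instr(2, 3, 4)]


def ddc_matches(window):
    """Return (consumer, source_name, producer) for every address match.

    Each source of instruction x is compared with the destination of every
    earlier instruction y < x, nearest earlier instruction first.
    """
    matches = []
    for x, instr in enumerate(window):
        for name, src in (("S1", instr.s1), ("S2", instr.s2)):
            for y in range(x - 1, -1, -1):
                if src == window[y].d:
                    matches.append((x, name, y))
    return matches


if __name__ == "__main__":
    for consumer, name, producer in ddc_matches(WINDOW):
        print("I{}{} matches I{}D".format(consumer, name, producer))
```

For this sequence the sketch reports that I1S2 matches I0D, that I3S1 matches both I2D and I0D, and that I3S2 matches I1D, while instruction 2 has no input dependency. Which of several matching producers supplies the operand is outside the scope of this sketch.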
The number of instructions that can be checked by RRC 112 is easily scalable. In order to check eight instructions at a time instead of four, the following additional comparisons would also need to be made:
I4S1, I4S2 vs. I3D, I2D, I1D, I0D
I5S1, I5S2 vs. I4D, I3D, I2D, I1D, I0D
I6S1, I6S2 vs. I5D, I4D, I3D, I2D, I1D, I0D
I7S1, I7S2 vs. I6D, I5D, I4D, I3D, I2D, I1D, I0D
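The comparison lists above follow a regular pattern, which the following Python sketch generates for any window size. The helper names comparison_row and destination_columns are assumptions made for this illustration, and the special cases (source/destination registers, long-word results) are deliberately left out.

```python
def comparison_row(x):
    """Text of the comparisons needed for instruction x (x >= 1)."""
    sources = "I{0}S1, I{0}S2".format(x)
    dests = ", ".join("I{}D".format(y) for y in range(x - 1, -1, -1))
    return "{} vs. {}".format(sources, dests)


def destination_columns(n):
    """Total number of (instruction, earlier destination) pairs for n instructions."""
    return sum(range(1, n))


if __name__ == "__main__":
    for x in range(1, 8):
        print(comparison_row(x))
    print(destination_columns(8))
```

For eight instructions the sketch counts 28 pairs, which agrees with the 28 data dependency blocks described for DDC 108.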
There are several special cases that RRC 112 must handle in order to do the dependency check. First, there are some instructions that use the same register as an input and an output. Thus, RRC 112 must compare this source/destination register address with the destination register addresses of all previous instructions. So for instruction 7, the following comparisons would be necessary:
I7S1, I7S2, I7S/D vs. I6D, I5D, I4D, I3D, I2D, I1D, I0D.
Another special case occurs when a program contains instructions that generate 64 bit outputs (called long-word operations). These instructions need two registers in which to store their results. In this embodiment, these registers must be sequential. Thus if RRC 112 is checking instruction 4's dependencies and instruction 1 is a long-word operation, then it must do the following comparisons:
I4S1, I4S2 vs. I3D, I2D, I1D, I1D+1, I0D
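The long-word case amounts to comparing a source against two sequential destination addresses instead of one. A minimal Python sketch, assuming that the second register is simply the destination address plus one and that no wrap-around occurs (the helper names are hypothetical):

```python
def destination_addresses(dest, is_long_word):
    """Register addresses written by an instruction.

    A long-word result occupies two sequential registers, dest and dest + 1.
    Assumption: dest + 1 is a valid address (no wrap-around handling here).
    """
    return (dest, dest + 1) if is_long_word else (dest,)


def source_matches(src, dest, is_long_word=False):
    """True when a source address hits any register the earlier instruction writes."""
    return src in destination_addresses(dest, is_long_word)
```

With a long-word instruction writing register 2, a later source reading register 3 is reported as dependent, which the single-register comparison would miss.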
Sometimes, instructions do not have destination registers. Thus RRC 112 must ignore any dependencies between instructions without destination registers and any future instructions. Also, instructions may have only one valid source register, so RRC 112 must ignore any dependencies between the unused source register (usually S2) and any previous instructions.
RRC 112 is also capable of dealing with multiple register files. When using multiple register files, dependencies only occur when one instruction's source register has the same address and is in the same register file as some other instruction's destination register. RRC 112 treats the information regarding which register file a particular address is from as part of the address. For example, in an implementation using four 32-entry register files, RRC 112 would do 7 bit compares instead of 5 bit compares (5 for the address and 2 for the register file). Signals indicating which instructions are long-word operations or have invalid source or destination registers are sent to RRC 112 from Instruction Decode Logic (IDL; not shown).
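Treating the register file identifier as part of the address can be sketched as forming a single 7 bit comparison key. In the Python below, placing the two register file bits above the five address bits is an assumption of this sketch (the text does not fix the bit order), and compare_key is a hypothetical name.

```python
ADDRESS_BITS = 5  # 5 bit register address (from the text)
FILE_BITS = 2     # four register files in the example from the text


def compare_key(register_file_id, address):
    """Combine register file id and register address into one 7 bit key."""
    if not 0 <= register_file_id < (1 << FILE_BITS):
        raise ValueError("register file id out of range")
    if not 0 <= address < (1 << ADDRESS_BITS):
        raise ValueError("register address out of range")
    # Assumed layout: file id in the upper 2 bits, address in the lower 5 bits.
    return (register_file_id << ADDRESS_BITS) | address
```

Two operands then match only when their keys are equal, so register 3 of one file never matches register 3 of another file.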
A straightforward, representative floorplan for laying out DDC 108, TAL 122 and RPM 124 for RRC 112 is shown in Fig. 2. DDC 108 has two sets of inputs. The first set includes source address signals from IFIFO 101 for all eight instructions of window 102; these inputs are shown at reference number 202. Inputs 202 are also supplied to TAL blocks 220, as shown by reference number 222. The second set of inputs includes long-word load operation flags, register file decode signals, invalid destination register flags, destination address signals and addressing mode flags for all eight instructions; these inputs are shown at reference number 203.
DDC 108 comprises 28 data dependency blocks 204. Each block 204 receives 3 inputs, IXS1, IXS2 and IXS/D. IXS1 is the address of source (input) number 1 of instruction X; IXS2 is the address of source (input) number 2 of instruction X; and IXS/D is the address of the source/destination (input) of instruction X. Each block 204 also receives input IYS/D, which is the destination register address for instruction Y. A first column 208, for example, receives I0S/D, which is the destination register address for instruction 0. Each block 204 outputs the data dependency results on a corresponding one of bus lines 214 to a TAL block 220. In this example, the address of I2S/D must be checked with operand addresses S1, S2 and S/D of instructions 7, 6, 5, 4, and 3.
Each tag assignment logic block 220 receives the corresponding data dependency results via buses 214, as well as further signals that come from the computer's IDL (not shown) via a set of input lines 226. A BKT bit forms the least significant bit of the tag. A set of DONE[X] flags for instructions 0 through 6 indicate if the instruction is done. A set of DBLREG[X] flags indicates which, if any, of the instructions is a double (long) word.
Each TAL block 220 also receives its own instruction's register addresses as inputs; this input is indicated by reference number 222. The miscellaneous DBLREG and BKT signals are implementation-dependent control signals. Each TAL block 220 outputs 0-3 TAGs 126 labeled IXS1, IXS2 and IXS/D, each of which is 6 bits. TALs 220 also output the least significant 5 bits of each TAG signal to RPMs 124 via output buses 224 which form a main bus 126, and the most significant TAG bit to ISL 218 via bus 120.
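The routing of the tag bits described above can be illustrated with a small Python sketch that splits a 6 bit tag into its most significant bit and its least significant 5 bits. The function name split_tag is hypothetical, and the sketch makes no claim about what the individual bits encode beyond what the text states.

```python
TAG_BITS = 6  # each TAG is 6 bits (from the text)


def split_tag(tag):
    """Split a 6 bit tag into (most significant bit, least significant 5 bits).

    Per the text, the lower 5 bits go to the RPMs and the top bit goes to the
    scheduling logic; this function only performs the bit split.
    """
    if not 0 <= tag < (1 << TAG_BITS):
        raise ValueError("tag must fit in 6 bits")
    return tag >> 5, tag & 0x1F
```

For example, the tag 0b100011 splits into a most significant bit of 1 and a 5 bit field of 3.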
The floorplan arrangement shown in Fig. 2 has two major limitations: it requires a large area, and some of the outputs 214 of DDC 108 have to travel a long distance to TAL 122, which limits the performance of RRC 112. A second floorplan embodiment is shown at Fig. 3. In this arrangement, TAL blocks 220 are placed between (e.g., integrated with) compare blocks 204 of DDC 108, as shown generally at reference number 302. This arrangement does, however, have one limitation. The most efficient arrangement of DDC 108 and TAL 122 requires that TAL 122 outputs 224 exit near the middle of rows 4, 5, 6 and 7, which is shown at a dashed box 304. This creates a wiring problem, because TAL 122 outputs 224 now must travel a long distance to RPM 124, especially in the case of I7.
To resolve this problem, the TAL outputs of the rows furthest away from RPM 124 must be channeled through the rows closest to RPM 124. One method would be to expand rows 4, 5 and 6 enough to get all of the wires through. Since compare blocks 204 must be lined up vertically, row 7 would also need to be expanded. This would increase the width of RRC 112.
A preferred floorplan embodiment of the present invention is shown in Fig. 4. In the floorplan layout shown in Fig. 4, the left sides of rows 4, 5, 6, and 7 have been flipped. In other words, referring to the vertically aligned comparators 204 and their associated TAL logic as columns, columns 3, 4, 5 and 6 have been flipped. This creates a gap in rows 4, 5 and 6 without increasing the length of row 7. (The gap is also called a center channel and is shown as a dashed box 402.) TAL outputs 224 of rows 4-7 are laid out in center channel 402 and are fed directly to RPM 124 in essentially a straight path. The overall area of RRC 112 therefore remains the same.
While various embodiments of the present invention have been described above, it should be understood that they have been presented by way of example, and not limitation. Thus the breadth and scope of the present invention should not be limited by any of the above-described exemplary embodiments, but should be defined only in accordance with the following claims and their equivalents.

Claims

CLAIMS
What is claimed is:
1. A method for laying out a floor plan for a register renaming circuit on a semiconductor chip to conserve chip area, said register renaming circuit permitting out-of-order issuing of multiple instructions by performing data dependency checking between N number of multiple instructions, such that each instruction's source and destination operands are compared to each preceding instruction's result operand, the method comprising the steps of: (1) arranging data dependency comparator blocks in rows and columns, wherein said arrangement defines layout regions between adjacent ones of said data dependency comparator blocks in said rows, and wherein said data dependency comparator blocks include output lines for forwarding dependency information to tag assignment logic; and (2) positioning said tag assignment logic in one or more of said layout regions to thereby integrate said tag assignment logic with said data dependency comparator logic for more easily receiving said dependency information from said data dependency comparator block output lines, wherein said tag assignment logic includes further output lines for forwarding tag information out of said layout regions.
2. The method according to claim 1, wherein step (2) further spatially defines a channel in one or more of said rows, said channel running substantially orthogonal to said rows.
3. The method according to claim 2, further comprising the step of routing said further output lines in said channel to minimize their length.
4. A floor plan for laying out a register renaming circuit on a semiconductor chip to conserve chip area, said register renaming circuit permitting out-of-order issuing of multiple instructions by performing data dependency checking between N number of multiple instructions, such that each instruction's source and destination operands are compared to each preceding instruction's result operand, the floor plan comprising: (a) data dependency comparator blocks arranged in rows and columns, said arrangement defining layout regions between adjacent ones of said data dependency comparator blocks in said rows, wherein said data dependency comparator blocks include output lines for forwarding dependency information; and (b) tag assignment logic for receiving said dependency information, being positioned in one or more of said layout regions to thereby integrate said tag assignment logic with said data dependency comparator logic, said tag assignment logic including further output lines for forwarding tag information out of said layout regions.
5. The floor plan according to claim 1, wherein said tag assignment logic is positioned to spatially define a channel in one or more of said rows, said channel running substantially orthogonal to said rows.
6. The floor plan according to claim 2, wherein said further output lines are routed in said channel to minimize their length and forward tag information out of said layout regions.
7. A system for laying out a floor plan for a register renaming circuit on a semiconductor chip to conserve chip area, the system comprising: (1) first means for arranging data dependency comparator blocks in rows and columns, wherein said arrangement defines layout regions between adjacent ones of said data dependency comparator blocks in said rows, and wherein said data dependency comparator blocks include output lines for forwarding dependency information to tag assignment logic; (2) second means, associated with said first means, for positioning said tag assignment logic in one or more of said layout regions to spatially define a channel in one or more of said rows, said channel running substantially orthogonal to said rows, wherein said tag assignment logic includes further output lines for forwarding tag information out of said layout regions; and (3) third means, associated with said second means, for routing said further output lines in said channel to minimize their length.
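The dependency check that the claims recite (each instruction's source and destination operands compared with each preceding instruction's result operand) can be illustrated in software. The following is a minimal sketch only, not the claimed circuit or floor plan: the names `Instr`, `dependency_matrix`, and `assign_tags` are hypothetical, and the rule that a source operand is tagged with the most recent earlier producer is an assumption made for illustration. It shows why the comparator blocks fall naturally into rows and columns: instruction i needs one block per preceding instruction, giving N(N-1)/2 blocks for a window of N instructions.

```python
from typing import NamedTuple, Optional


class Instr(NamedTuple):
    """Hypothetical three-operand instruction: one result, two sources."""
    dest: int  # register written by the instruction (result operand)
    src1: int  # first source register
    src2: int  # second source register


def dependency_matrix(window: list[Instr]) -> list[list[dict[str, bool]]]:
    """Row i holds one comparator result per preceding instruction j < i.

    Each entry mimics one comparator block: it compares instruction i's
    source and destination operands with instruction j's result operand.
    """
    deps = []
    for i, cur in enumerate(window):
        row = []
        for j in range(i):
            prev = window[j]
            row.append({
                "src1": cur.src1 == prev.dest,
                "src2": cur.src2 == prev.dest,
                "dest": cur.dest == prev.dest,
            })
        deps.append(row)
    return deps


def assign_tags(window: list[Instr]) -> list[dict[str, Optional[int]]]:
    """Assumed tag rule: each source is tagged with the index of the most
    recent earlier instruction that writes it, or None if there is none."""
    tags = []
    for row in dependency_matrix(window):
        tag: dict[str, Optional[int]] = {"src1": None, "src2": None}
        for j, hit in enumerate(row):  # a later producer overrides an earlier one
            for operand in ("src1", "src2"):
                if hit[operand]:
                    tag[operand] = j
        tags.append(tag)
    return tags


# Example window of N = 4 instructions (register numbers are arbitrary).
window = [Instr(1, 2, 3), Instr(4, 1, 5), Instr(1, 4, 4), Instr(6, 1, 4)]
tags = assign_tags(window)
```

In this example the fourth instruction reads register 1, which is written by both the first and the third instruction; under the assumed rule its first source is tagged with the third instruction (index 2), the more recent producer.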
PCT/JP1993/000377 1992-03-31 1993-03-26 Semiconductor floor plan and method for a register renaming circuit WO1993020506A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP51729593A JP3555140B2 (en) 1992-03-31 1993-03-26 Semiconductor floorplan and method for register rename circuit

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US07/860,718 1992-03-31
US07/860,718 US5371684A (en) 1992-03-31 1992-03-31 Semiconductor floor plan for a register renaming circuit

Publications (1)

Publication Number Publication Date
WO1993020506A1 (en) 1993-10-14

Family

ID=25333865

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP1993/000377 WO1993020506A1 (en) 1992-03-31 1993-03-26 Semiconductor floor plan and method for a register renaming circuit

Country Status (3)

Country Link
US (9) US5371684A (en)
JP (3) JP3555140B2 (en)
WO (1) WO1993020506A1 (en)

Families Citing this family (40)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2575564B2 (en) * 1991-03-05 1997-01-29 インターナショナル・ビジネス・マシーンズ・コーポレイション Automatic macro optimization ordering method
US5539911A (en) 1991-07-08 1996-07-23 Seiko Epson Corporation High-performance, superscalar-based computer system with out-of-order instruction execution
US5493687A (en) 1991-07-08 1996-02-20 Seiko Epson Corporation RISC microprocessor architecture implementing multiple typed register sets
US5371684A (en) * 1992-03-31 1994-12-06 Seiko Epson Corporation Semiconductor floor plan for a register renaming circuit
EP0636256B1 (en) 1992-03-31 1997-06-04 Seiko Epson Corporation Superscalar risc processor instruction scheduling
DE69308548T2 (en) * 1992-05-01 1997-06-12 Seiko Epson Corp DEVICE AND METHOD FOR COMPLETING THE COMMAND IN A SUPER-SCALAR PROCESSOR.
JP3644959B2 (en) 1992-09-29 2005-05-11 セイコーエプソン株式会社 Microprocessor system
US6735685B1 (en) 1992-09-29 2004-05-11 Seiko Epson Corporation System and method for handling load and/or store operations in a superscalar microprocessor
US5835745A (en) * 1992-11-12 1998-11-10 Sager; David J. Hardware instruction scheduler for short execution unit latencies
DE69330889T2 (en) 1992-12-31 2002-03-28 Seiko Epson Corp System and method for changing register names
US5628021A (en) 1992-12-31 1997-05-06 Seiko Epson Corporation System and method for assigning tags to control instruction processing in a superscalar processor
TW242673B (en) * 1993-08-18 1995-03-11 Ibm
US5613132A (en) * 1993-09-30 1997-03-18 Intel Corporation Integer and floating point register alias table within processor device
US5564056A (en) * 1994-03-01 1996-10-08 Intel Corporation Method and apparatus for zero extension and bit shifting to preserve register parameters in a microprocessor utilizing register renaming
US6112019A (en) * 1995-06-12 2000-08-29 Georgia Tech Research Corp. Distributed instruction queue
US5764532A (en) * 1995-07-05 1998-06-09 International Business Machines Corporation Automated method and system for designing an optimized integrated circuit
US6356918B1 (en) 1995-07-26 2002-03-12 International Business Machines Corporation Method and system for managing registers in a data processing system supports out-of-order and speculative instruction execution
US5664120A (en) * 1995-08-25 1997-09-02 International Business Machines Corporation Method for executing instructions and execution unit instruction reservation table within an in-order completion processor
US5768556A (en) * 1995-12-22 1998-06-16 International Business Machines Corporation Method and apparatus for identifying dependencies within a register
US5757657A (en) * 1996-02-07 1998-05-26 International Business Machines Corporation Adaptive incremental placement of circuits on VLSI chip
US5802386A (en) * 1996-11-19 1998-09-01 International Business Machines Corporation Latency-based scheduling of instructions in a superscalar processor
US5838941A (en) * 1996-12-30 1998-11-17 Intel Corporation Out-of-order superscalar microprocessor with a renaming device that maps instructions from memory to registers
US5996063A (en) * 1997-03-03 1999-11-30 International Business Machines Corporation Management of both renamed and architected registers in a superscalar computer system
DE10159699A1 (en) * 2001-12-05 2003-06-26 Infineon Technologies Ag Method of manufacturing a semiconductor integrated circuit
US20030154363A1 (en) * 2002-02-11 2003-08-14 Soltis Donald C. Stacked register aliasing in data hazard detection to reduce circuit
AU2003252157A1 (en) * 2002-07-23 2004-02-09 Gatechange Technologies, Inc. Interconnect structure for electrical devices
US7269811B1 (en) * 2003-01-10 2007-09-11 Xilinx, Inc. Method of and apparatus for specifying clock domains in electronic circuit designs
US20060095732A1 (en) * 2004-08-30 2006-05-04 Tran Thang M Processes, circuits, devices, and systems for scoreboard and other processor improvements
US7187606B1 (en) * 2005-08-22 2007-03-06 P.A. Semi, Inc. Read port circuit for register file
US7277353B2 (en) * 2005-08-22 2007-10-02 P.A. Semi, Inc. Register file
US7577038B2 (en) * 2005-09-29 2009-08-18 Hynix Semiconductor, Inc. Data input/output multiplexer of semiconductor device
US8813021B1 (en) 2006-02-16 2014-08-19 Cypress Semiconductor Corporation Global resource conflict management for an embedded application design
KR100798792B1 (en) 2006-12-27 2008-01-28 주식회사 하이닉스반도체 Semiconductor memory device
KR100801309B1 (en) * 2007-01-03 2008-02-05 주식회사 하이닉스반도체 Memory device performing write leveling operation
KR100825002B1 (en) * 2007-01-10 2008-04-24 주식회사 하이닉스반도체 Semiconductor memory device with ability to effectively check an error of data outputted in serial
KR100920830B1 (en) * 2007-04-11 2009-10-08 주식회사 하이닉스반도체 Write Control Signal Generation Circuit And Semiconductor Memory Apparatus Using The Same And Operation Method Thereof
KR100907928B1 (en) * 2007-06-13 2009-07-16 주식회사 하이닉스반도체 Semiconductor memory device
KR100933668B1 (en) * 2008-04-30 2009-12-23 주식회사 하이닉스반도체 Output circuit
US7990780B2 (en) * 2009-02-20 2011-08-02 Apple Inc. Multiple threshold voltage register file cell
JP2010282296A (en) * 2009-06-02 2010-12-16 Sanyo Electric Co Ltd Data check circuit

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0437044A2 (en) * 1989-12-20 1991-07-17 International Business Machines Corporation Data processing system with instruction tag apparatus

Family Cites Families (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US3913074A (en) * 1973-12-18 1975-10-14 Honeywell Inf Systems Search processing apparatus
US4814979A (en) * 1981-04-01 1989-03-21 Teradata Corporation Network to transmit prioritized subtask pockets to dedicated processors
JPS57204125A (en) * 1981-06-10 1982-12-14 Hitachi Ltd Electron-ray drawing device
US4498134A (en) * 1982-01-26 1985-02-05 Hughes Aircraft Company Segregator functional plane for use in a modular array processor
US4500963A (en) * 1982-11-29 1985-02-19 The United States Of America As Represented By The Secretary Of The Army Automatic layout program for hybrid microcircuits (HYPAR)
US5150509A (en) * 1984-05-03 1992-09-29 Kompan A/S Method of producing a stabilized bolt joint between a timber element and another construction element and the timber construction
JPH0652784B2 (en) * 1984-12-07 1994-07-06 富士通株式会社 Gate array integrated circuit device and manufacturing method thereof
US4613941A (en) * 1985-07-02 1986-09-23 The United States Of America As Represented By The Secretary Of The Army Routing method in computer aided customization of a two level automated universal array
US4945479A (en) * 1985-07-31 1990-07-31 Unisys Corporation Tightly coupled scientific processing system
JPS62175831A (en) * 1986-01-30 1987-08-01 Fujitsu Ltd Control system for pipeline with tag
US4814978A (en) * 1986-07-15 1989-03-21 Dataflow Computer Corporation Dataflow processing element, multiprocessor, and processes
JPH0793358B2 (en) * 1986-11-10 1995-10-09 日本電気株式会社 Block placement processing method
US5150309A (en) * 1987-08-04 1992-09-22 Texas Instruments Incorporated Comprehensive logic circuit layout system
JPH01231126A (en) 1988-03-11 1989-09-14 Oki Electric Ind Co Ltd Information processor
GB8817912D0 (en) * 1988-07-27 1988-09-01 Int Computers Ltd Data processing apparatus
JPH0673105B2 (en) 1988-08-11 1994-09-14 株式会社東芝 Instruction pipeline type microprocessor
US5241635A (en) * 1988-11-18 1993-08-31 Massachusetts Institute Of Technology Tagged token data processing system with operand matching in activation frames
US5317734A (en) * 1989-08-29 1994-05-31 North American Philips Corporation Method of synchronizing parallel processors employing channels and compiling method minimizing cross-processor data dependencies
US4964479A (en) * 1989-10-10 1990-10-23 Sumida Kunio A Weight scale compensating for tare
JPH03196334A (en) 1989-12-26 1991-08-27 Fujitsu Ltd Arithmetic control system
JP2988965B2 (en) 1990-06-07 1999-12-13 株式会社東芝 Pipeline information processing circuit
JPH0480824A (en) 1990-07-23 1992-03-13 Nec Corp Data processor
US5625836A (en) * 1990-11-13 1997-04-29 International Business Machines Corporation SIMD/MIMD processing memory element (PME)
US5826055A (en) * 1991-07-08 1998-10-20 Seiko Epson Corporation System and method for retiring instructions in a superscalar microprocessor
US5493687A (en) * 1991-07-08 1996-02-20 Seiko Epson Corporation RISC microprocessor architecture implementing multiple typed register sets
US5539911A (en) * 1991-07-08 1996-07-23 Seiko Epson Corporation High-performance, superscalar-based computer system with out-of-order instruction execution
EP0547247B1 (en) 1991-07-08 2001-04-04 Seiko Epson Corporation Extensible risc microprocessor architecture
EP0636256B1 (en) * 1992-03-31 1997-06-04 Seiko Epson Corporation Superscalar risc processor instruction scheduling
US5371684A (en) * 1992-03-31 1994-12-06 Seiko Epson Corporation Semiconductor floor plan for a register renaming circuit
US5615126A (en) * 1994-08-24 1997-03-25 Lsi Logic Corporation High-speed internal interconnection technique for integrated circuits that reduces the number of signal lines through multiplexing
US6093274A (en) * 1996-02-02 2000-07-25 Westvaco Corporation Method of making a composite paperboard structure with a silicon-oxide-coated film for improving the shelf life of oxygen-sensitive products
US5826065A (en) * 1997-01-13 1998-10-20 International Business Machines Corporation Software architecture for stochastic simulation of non-homogeneous systems

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
IEEE TRANSACTIONS ON COMPUTER AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS vol. 10, no. 1, January 1991, NEW YORK US pages 116 - 129 LUK ET AL 'multistack optimization for data-path chip layout' *

Also Published As

Publication number Publication date
JP3724582B2 (en) 2005-12-07
JP3755604B2 (en) 2006-03-15
JPH07505495A (en) 1995-06-15
US6083274A (en) 2000-07-04
US6782521B2 (en) 2004-08-24
US20020129324A1 (en) 2002-09-12
JP2004234642A (en) 2004-08-19
JP3555140B2 (en) 2004-08-18
US5371684A (en) 1994-12-06
US5566385A (en) 1996-10-15
US5831871A (en) 1998-11-03
JP2004158018A (en) 2004-06-03
US6401232B1 (en) 2002-06-04
US5734584A (en) 1998-03-31
US7555738B2 (en) 2009-06-30
US7174525B2 (en) 2007-02-06
US20070113214A1 (en) 2007-05-17
US20040243961A1 (en) 2004-12-02

Similar Documents

Publication Publication Date Title
US5371684A (en) Semiconductor floor plan for a register renaming circuit
US7802074B2 (en) Superscalar RISC instruction scheduling
US5809276A (en) System and method for register renaming
EP0605875B1 (en) Method and system for single cycle dispatch of multiple instruction in a superscalar processor system
US5898882A (en) Method and system for enhanced instruction dispatch in a superscalar processor system utilizing independently accessed intermediate storage

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): JP KR

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH DE DK ES FR GB GR IE IT LU MC NL PT SE

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
122 Ep: pct application non-entry in european phase