Automatic instruction set architecture generation

Publication number
WO2003034216B1
Authority
WO
WIPO (PCT)
Prior art keywords
instruction set
set architecture
instructions
existing
region
Application number
PCT/US2002/028898
Other languages
French (fr)
Other versions
WO2003034216A3 (en)
WO2003034216A2 (en)
Inventor
David William Goodwin
Dror Maydan
Ding-Kai Chen
Darin Stamenov Petkov
Steven Weng-Kiang Tjiang
Peng Tu
Christopher Rowen
Original Assignee
Tensilica Inc
Application filed by Tensilica Inc
Priority to JP2003536879A (JP4209776B2)
Priority to KR1020047005643A (KR100705509B1)
Priority to GB0407839A (GB2398144B)
Publication of WO2003034216A2
Publication of WO2003034216A3
Publication of WO2003034216B1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation
    • G06F8/44 Encoding
    • G06F8/447 Target code generation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation
    • G06F8/44 Encoding
    • G06F8/443 Optimisation

Abstract

A digital computer system automatically creates an Instruction Set Architecture (ISA) that potentially exploits VLIW instructions, vector operation, fused operations, and specialized operations with the goal of increasing the performance of a set of applications while keeping hardware cost below a designer specified limit, or with the goal of minimizing hardware cost given a required level of performance.
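The two goals named in the abstract can be illustrated with a small selection problem. The following Python sketch is not the patented method: it brute-forces every subset of a handful of hypothetical candidate configurations, and it assumes that hardware costs simply add up and that each region benefits from the best single configuration chosen. The names `CANDIDATES`, `best_under_budget`, and `cheapest_meeting_goal`, and all numbers, are invented for illustration.

```python
from itertools import combinations

# Hypothetical candidates: name -> (hardware cost in gates, cycles saved per region).
# All names and numbers are illustrative assumptions, not data from the patent.
CANDIDATES = {
    "vec4_add": (4000, {"loop_a": 900, "loop_b": 0}),
    "fused_mac": (2500, {"loop_a": 400, "loop_b": 700}),
    "vliw2": (6000, {"loop_a": 600, "loop_b": 650}),
}


def total_cost(chosen):
    """Assumed additive cost model: sum the cost of every chosen configuration."""
    return sum(CANDIDATES[name][0] for name in chosen)


def total_saving(chosen):
    """Assume each region uses whichever chosen configuration saves it the most."""
    regions = {r for _, savings in CANDIDATES.values() for r in savings}
    return sum(
        max((CANDIDATES[name][1].get(region, 0) for name in chosen), default=0)
        for region in regions
    )


def all_subsets():
    """Enumerate every subset of candidate names (exponential; fine for a toy)."""
    names = sorted(CANDIDATES)
    for size in range(len(names) + 1):
        yield from combinations(names, size)


def best_under_budget(budget):
    """Goal 1: largest saving whose total cost stays within the budget."""
    feasible = [s for s in all_subsets() if total_cost(s) <= budget]
    return max(feasible, key=lambda s: (total_saving(s), -total_cost(s)))


def cheapest_meeting_goal(goal):
    """Goal 2: smallest total cost whose saving reaches the goal, else None."""
    feasible = [s for s in all_subsets() if total_saving(s) >= goal]
    if not feasible:
        return None
    return min(feasible, key=lambda s: (total_cost(s), -total_saving(s)))


if __name__ == "__main__":
    print(best_under_budget(7000))
    print(cheapest_meeting_goal(1200))
```

With these made-up numbers, a 7000-gate budget selects the vector and fused configurations together, while a goal of 1200 saved cycles is met most cheaply by the VLIW configuration alone. A practical tool would need a search that scales beyond exhaustive enumeration.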

Claims

AMENDED CLAIMS
[received by the International Bureau on 08 April 2004 (08.04.2004); original claims 1, 2, 7, 14, 20, 21, 27, 30 and 31 amended; original claim 13 cancelled; new claims 32 and 33 added; remaining claims unchanged (5 pages)]
1. A system, comprising:
means for receiving at least one software program written in a high level language; and
means for automatically generating an instruction set architecture optimized for executing that program(s), wherein the instruction set architecture is represented as a set of configurations containing one or more extension instructions based on instructions in an existing standard or existing user defined instruction set architecture.
2. The system of claim 1, wherein:
the extension instructions operate on states and register files in the existing standard or existing user-defined instruction set architecture.
3. The system of claim 2, wherein the extension instructions contain vectorized versions of the existing instructions.
4. The system of claim 2, wherein the extension instructions contain VLIW combinations of the existing instructions.
5. The system of claim 2, wherein the extension instructions contain fused combinations of the existing instructions.
6. The system of claim 2, wherein the extension instructions contain specialized versions of the existing instructions.
7. The system of claim 2, wherein the extension instructions contain vectorized versions of operations supported by the high level language.
8. The system of claim 2, wherein the extension instructions contain VLIW combinations of operations supported by the high level language.
9. The system of claim 2, wherein the extension instructions contain fused combinations of operations supported by the high level language.
10. The system of claim 2, wherein the extension instructions contain specialized versions of operations supported by the high level language.
11. The system of claim 2, wherein the extension instructions contain at least two of vectorized, VLIW, fused and specialized versions of the existing instructions.
12. The system of claim 2, wherein the extension instructions contain at least two of vectorized, VLIW, fused and specialized versions of operations supported by the high level language.
14. The system of claim 2, wherein:
instruction set architecture generation is guided by analysis information gathered from the at least one software program; and the analysis information is gathered for each region of code that could get a performance improvement from a generated instruction set algorithm.
15. The system of claim 14, wherein the analysis information includes an execution count of each region as determined from real or estimated profiling information.
16. The system of claim 14, wherein the analysis information includes an execution count of each region as determined from user-supplied directives.
17. The system of claim 14, where the analysis information includes a dependence graph of each region.
18. The system of claim 14, where the analysis information includes a set of operation vector lengths that can be used to improve performance of each region.
19. The system of claim 14, wherein each region is evaluated with a set of instruction set architecture configurations to determine a performance improvement that would result if instructions, operations, register files, and states represented by the configuration could be used for the region.
20. The system of claim 19, wherein:
instruction set architecture generation uses as a guideline an estimate of hardware cost of each instruction set architecture configuration that includes a cost of logic necessary to implement instructions, operations, register files, and state represented by the configuration; and the hardware cost and performance improvement of each instruction set architecture configuration for each region is used to determine a set of instruction set architecture configurations that together describe the generated instruction set architecture such that the performance improvement of the software program(s) is increased as much as possible while the hardware cost of the generated instruction set architecture does not exceed a cost budget.
21. The system of claim 19, wherein:
instruction set architecture generation uses as a guideline an estimate of hardware cost of each instruction set architecture configuration that includes a cost of logic necessary to implement instructions, operations, register files, and state represented by the configuration; and the hardware cost and performance improvement of each instruction set architecture configuration for each region is used to determine the set of instruction set architecture configurations that together describe the generated instruction set architecture such that the hardware cost of the generated instruction set architecture is as small as possible while providing a performance improvement that is greater or equal to a performance goal.
22. The system of claim 19, wherein the hardware cost and performance improvement of each instruction set architecture hardware configuration for each region is used to determine the set of instruction set architecture configurations that together describe the generated instruction set architecture such that the hardware cost of the generated instruction set architecture is smaller than a predetermined function of the performance improvement.
23. The system of claim 19, wherein a performance improvement provided by a particular instruction set architecture configuration for a particular region is determined by an instruction scheduling algorithm operating on a modified dependence graph of the region.
24. The system of claim 23, wherein the dependence graph is modified to replicate operations with an operation width that is less than one.
25. The system of claim 23, wherein the dependence graph is modified to replace groups of operations with a single fused operation.
26. The system of claim 19, wherein the performance improvement provided by a particular instruction set architecture configuration for a particular region is determined using resource limits.
27. The system of claim 2, wherein instruction set architecture generation uses as a guideline an estimate of hardware cost of each instruction set architecture configuration that includes a cost of logic necessary to implement instructions, operations, register files, and state represented by the configuration.
28. The system of claim 27, wherein the hardware cost is estimated by adding hardware costs of components present in the instruction set architecture configuration.
29. The system of claim 27, wherein the hardware cost is reduced to represent reduced logic necessary when specialized operations replace generic operations.
30. A system, comprising:
means for receiving at least one software program written in a high level language; and
means for automatically generating an instruction set architecture optimized for executing said at least one program, by adding one or more new extension instructions to instructions in an existing instruction set architecture based on the analysis of the said at least one program,
wherein the new instruction(s) contain vectorized versions of the existing instructions.
31. A system, comprising:
means for receiving at least one software program written in a high level language; and means for automatically generating an instruction set architecture optimized for executing said at least one program, by adding one or more new register files based on the analysis of the said at least one program,
wherein the new register file(s) are vectorized versions of an existing standard or existing user defined instruction set architecture.
32. A system, comprising:
means for receiving at least one software program written in a high level language; and
means for automatically generating an instruction set architecture optimized for executing said at least one program, by adding one or more new extension instructions to instructions in an existing instruction set architecture based on the analysis of the said at least one program,
wherein the new instruction(s) contain specialized versions of existing standard or user-defined instructions in the existing instruction set architecture.
33. A system, comprising:
means for receiving at least one software program written in a high level language; and means for automatically generating an instruction set architecture optimized for executing said at least one program, by adding one or more new extension instructions to instructions in an existing instruction set architecture based on the analysis of the said at least one program,
wherein a single new instruction contains one of a fused, specialized and vectorized combination of existing standard or user-defined instructions in the existing instruction set architecture.
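Claims 23, 25 and 26 describe estimating a region's performance by scheduling a modified dependence graph under resource limits. The Python sketch below is one hedged reading of that idea, not the claimed algorithm: it assumes every operation has unit latency, models the resource limit as a single issue width, uses a simple greedy list scheduler, and assumes the fused group does not create a cycle. The functions `fuse` and `schedule_length`, the graph `REGION`, and the operation names are hypothetical.

```python
def fuse(deps, group, fused_name):
    """Return a new dependence graph in which `group` becomes one fused node.

    `deps` maps each operation to the set of operations it depends on.
    Assumes the group is convex, so fusing it cannot introduce a cycle.
    """
    group = set(group)
    new_deps = {fused_name: set()}
    for op, preds in deps.items():
        target = fused_name if op in group else op
        mapped = {fused_name if p in group else p for p in preds}
        mapped.discard(target)  # drop edges internal to the fused group
        new_deps.setdefault(target, set()).update(mapped)
    return new_deps


def schedule_length(deps, issue_width):
    """Greedy list scheduling with unit latency; returns the cycle count.

    `issue_width` is the assumed resource limit: operations issued per cycle.
    """
    done = set()
    remaining = set(deps)
    cycles = 0
    while remaining:
        ready = sorted(op for op in remaining if deps[op] <= done)
        if not ready:
            raise ValueError("dependence graph has a cycle")
        issued = ready[:issue_width]
        done.update(issued)
        remaining.difference_update(issued)
        cycles += 1
    return cycles


# Hypothetical region: two loads feed a multiply, then an add, then a store.
REGION = {
    "ld_a": set(),
    "ld_b": set(),
    "mul": {"ld_a", "ld_b"},
    "add": {"mul"},
    "st": {"add"},
}

if __name__ == "__main__":
    fused = fuse(REGION, {"mul", "add"}, "mac")
    for width in (1, 2):
        before = schedule_length(REGION, width)
        after = schedule_length(fused, width)
        print(f"issue width {width}: {before} cycles -> {after} cycles with fusion")
```

In this toy region, fusing the multiply and add into one operation shortens the schedule by one cycle at either issue width. The difference between the two schedule lengths, weighted by the region's execution count, would give a rough performance-improvement figure for that configuration.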

Priority Applications (3)

Application Number Priority Date Filing Date Title
JP2003536879A JP4209776B2 (en) 2001-10-16 2002-09-10 Automatic instruction set architecture generation
KR1020047005643A KR100705509B1 (en) 2001-10-16 2002-09-10 Automatic instruction set architecture generation
GB0407839A GB2398144B (en) 2001-10-16 2002-09-10 Automatic instruction set architecture generation

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US09/981,291 US6941548B2 (en) 2001-10-16 2001-10-16 Automatic instruction set architecture generation

Publications (3)

Publication Number Publication Date
WO2003034216A2 (en) 2003-04-24
WO2003034216A3 (en) 2004-09-10
WO2003034216B1 (en) 2004-12-16

Family

ID=25528272

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2002/028898 WO2003034216A2 (en) 2001-10-16 2002-09-10 Automatic instruction set architecture generation

Country Status (7)

Country Link
US (2) US6941548B2 (en)
JP (1) JP4209776B2 (en)
KR (1) KR100705509B1 (en)
CN (1) CN100367209C (en)
GB (1) GB2398144B (en)
TW (1) TW594581B (en)
WO (1) WO2003034216A2 (en)

Families Citing this family (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2367915B (en) * 2000-10-09 2002-11-13 Siroyan Ltd Instruction sets for processors
US7254810B2 (en) * 2002-04-18 2007-08-07 International Business Machines Corporation Apparatus and method for using database knowledge to optimize a computer program
US7353503B2 (en) * 2002-12-27 2008-04-01 Sun Microsystems, Inc. Efficient dead code elimination
JP4057938B2 (en) * 2003-03-26 2008-03-05 株式会社東芝 Compiler, compiling method, and program development tool
JP2005216177A (en) * 2004-01-30 2005-08-11 Toshiba Corp Configurable processor design apparatus and method, library optimizing method, processor, and method of manufacturing semiconductor device comprising processor
US7603546B2 (en) * 2004-09-28 2009-10-13 Intel Corporation System, method and apparatus for dependency chain processing
JP2006243839A (en) * 2005-02-28 2006-09-14 Toshiba Corp Instruction generation device and instruction generation method
JP2006243838A (en) * 2005-02-28 2006-09-14 Toshiba Corp Program development device
US7506326B2 (en) * 2005-03-07 2009-03-17 International Business Machines Corporation Method and apparatus for choosing register classes and/or instruction categories
JP4541218B2 (en) 2005-04-08 2010-09-08 三菱電機株式会社 Command generator
US20060265485A1 (en) * 2005-05-17 2006-11-23 Chai Sek M Method and apparatus for controlling data transfer in a processing system
US7603492B2 (en) * 2005-09-20 2009-10-13 Motorola, Inc. Automatic generation of streaming data interface circuit
WO2007085121A1 (en) * 2006-01-26 2007-08-02 Intel Corporation Scheduling multithreaded programming instructions based on dependency graph
US7441224B2 (en) * 2006-03-09 2008-10-21 Motorola, Inc. Streaming kernel selection for reconfigurable processor
US20080120497A1 (en) * 2006-11-20 2008-05-22 Motorola, Inc. Automated configuration of a processing system using decoupled memory access and computation
US8037466B2 (en) 2006-12-29 2011-10-11 Intel Corporation Method and apparatus for merging critical sections
US7971132B2 (en) * 2007-01-05 2011-06-28 Dialogic Corporation Universal multimedia engine and method for producing the same
US7802005B2 (en) * 2007-03-30 2010-09-21 Motorola, Inc. Method and apparatus for configuring buffers for streaming data transfer
US8136107B2 (en) * 2007-10-24 2012-03-13 International Business Machines Corporation Software pipelining using one or more vector registers
US8250342B1 (en) * 2008-01-09 2012-08-21 Xilinx, Inc. Digital signal processing engine
KR100946763B1 (en) * 2008-07-23 2010-03-11 성균관대학교산학협력단 Method and device for generating of code
IT1392495B1 (en) * 2008-12-29 2012-03-09 St Microelectronics Srl METHOD OF DESIGNING AN ACCELERATOR AT HIGH PERFORMANCE ASIC TYPE (INTEGRATED CIRCUIT WITH SPECIFIC APPLICATION - APPLICATION-SPECIFIC INTEGRATED CIRCUIT)
US8370784B2 (en) * 2010-07-13 2013-02-05 Algotochip Corporation Automatic optimal integrated circuit generator from algorithms and specification
US9811335B1 (en) 2013-10-14 2017-11-07 Quicklogic Corporation Assigning operational codes to lists of values of control signals selected from a processor design based on end-user software
US9785413B2 (en) * 2015-03-06 2017-10-10 Intel Corporation Methods and apparatus to eliminate partial-redundant vector loads
US10157164B2 (en) * 2016-09-20 2018-12-18 Qualcomm Incorporated Hierarchical synthesis of computer machine instructions
US20220156078A1 (en) * 2020-11-19 2022-05-19 Arm Limited Register rename stage fusing of instructions

Family Cites Families (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS6123275A (en) * 1984-07-11 1986-01-31 Nec Corp Vector processor
US5530881A (en) * 1991-06-06 1996-06-25 Hitachi, Ltd. Vector processing apparatus for processing different instruction set architectures corresponding to mingled-type programs and separate-type programs
US5361373A (en) 1992-12-11 1994-11-01 Gilson Kent L Integrated circuit computing device comprising a dynamically configurable gate array having a microprocessor and reconfigurable instruction execution means and method therefor
US5933642A (en) * 1995-04-17 1999-08-03 Ricoh Corporation Compiling system and method for reconfigurable computing
US5794062A (en) * 1995-04-17 1998-08-11 Ricoh Company Ltd. System and method for dynamically reconfigurable computing using a processing unit having changeable internal hardware organization
US5835771A (en) * 1995-06-07 1998-11-10 Rogue Wave Software, Inc. Method and apparatus for generating inline code using template metaprograms
US5696956A (en) 1995-11-08 1997-12-09 Digital Equipment Corporation Dynamically programmable reduced instruction set computer with programmable processor loading on program number field and program number register contents
US5819064A (en) 1995-11-08 1998-10-06 President And Fellows Of Harvard College Hardware extraction technique for programmable reduced instruction set computers
US6035123A (en) 1995-11-08 2000-03-07 Digital Equipment Corporation Determining hardware complexity of software operations
US6016395A (en) * 1996-10-18 2000-01-18 Samsung Electronics Co., Ltd. Programming a vector processor and parallel programming of an asymmetric dual multiprocessor comprised of a vector processor and a risc processor
US6233599B1 (en) * 1997-07-10 2001-05-15 International Business Machines Corporation Apparatus and method for retrofitting multi-threaded operations on a computer by partitioning and overlapping registers
US7275246B1 (en) * 1999-01-28 2007-09-25 Ati International Srl Executing programs for a first computer architecture on a computer of a second architecture
US7065633B1 (en) * 1999-01-28 2006-06-20 Ati International Srl System for delivering exception raised in first architecture to operating system coded in second architecture in dual architecture CPU
AU3484100A (en) 1999-02-05 2000-08-25 Tensilica, Inc. Automated processor generation system for designing a configurable processor and method for the same
US6779107B1 (en) * 1999-05-28 2004-08-17 Ati International Srl Computer execution by opportunistic adaptation
US6457173B1 (en) * 1999-08-20 2002-09-24 Hewlett-Packard Company Automatic design of VLIW instruction formats
US7036106B1 (en) 2000-02-17 2006-04-25 Tensilica, Inc. Automated processor generation system for designing a configurable processor and method for the same
US6615340B1 (en) * 2000-03-22 2003-09-02 Wilmot, Ii Richard Byron Extended operand management indicator structure and method
US7028286B2 (en) * 2001-04-13 2006-04-11 Pts Corporation Methods and apparatus for automated generation of abbreviated instruction set and configurable processor architecture
US20030005423A1 (en) * 2001-06-28 2003-01-02 Dong-Yuan Chen Hardware assisted dynamic optimization of program execution
US7278137B1 (en) * 2001-12-26 2007-10-02 Arc International Methods and apparatus for compiling instructions for a data processor
US7254696B2 (en) * 2002-12-12 2007-08-07 Alacritech, Inc. Functional-level instruction-set computer architecture for processing application-layer content-service requests such as file-access requests
US7543119B2 (en) * 2005-02-10 2009-06-02 Richard Edward Hessel Vector processor
US8214808B2 (en) * 2007-05-07 2012-07-03 International Business Machines Corporation System and method for speculative thread assist in a heterogeneous processing environment

Also Published As

Publication number Publication date
TW594581B (en) 2004-06-21
WO2003034216A3 (en) 2004-09-10
GB2398144B (en) 2005-10-12
US20050278713A1 (en) 2005-12-15
KR100705509B1 (en) 2007-04-13
US7971197B2 (en) 2011-06-28
US20030074654A1 (en) 2003-04-17
WO2003034216A2 (en) 2003-04-24
KR20040062567A (en) 2004-07-07
CN100367209C (en) 2008-02-06
JP2005507105A (en) 2005-03-10
GB0407839D0 (en) 2004-05-12
GB2398144A (en) 2004-08-11
JP4209776B2 (en) 2009-01-14
US6941548B2 (en) 2005-09-06
CN1608247A (en) 2005-04-20

Similar Documents

Publication Publication Date Title
WO2003034216B1 (en) Automatic instruction set architecture generation
US7861234B1 (en) System and method for binary translation to improve parameter passing
Mahlke et al. A comparison of full and partial predicated execution support for ILP processors
US5781758A (en) Software emulation system with reduced memory requirements
US7596781B2 (en) Register-based instruction optimization for facilitating efficient emulation of an instruction stream
US5920723A (en) Compiler with inter-modular procedure optimization
US6240544B1 (en) Simulation system, simulation evaluation system, simulation method, and computer-readable memory containing a simulation program, having less trace information for reverse execution
WO2004012083A2 (en) Source-to-source partitioning compilation
Leone et al. A declarative approach to run-time code generation
Janssen et al. A specification invariant technique for operation cost minimisation in flow-graphs
US6526572B1 (en) Mechanism for software register renaming and load speculation in an optimizer
Chang et al. Three architectural models for compiler-controlled speculative execution
Kim et al. Dynamic binary translation for accumulator-oriented architectures
CN105117269B (en) The optimization method of compiler based on vector interrupt
Chuang et al. Phi-predication for light-weight if-conversion
Ferdinand et al. Run-Time Guarantees for Real-Time Systems—The USES Approach
GB2378269A (en) Improved power efficiency in microprocessor systems
Guilfanov Simple type system for program reengineering
US7774766B2 (en) Method and system for performing reassociation in software loops
CN112540764A (en) Coding optimization method for conditional branch prediction direction transformation
US5437035A (en) Method and apparatus for compiling a program including a do-statement
JP2003131889A (en) Object program generation method
CN102902532B (en) Tool chain conversion and extension method in integrated development environment
Baird et al. Optimizing transfers of control in the static pipeline architecture
Franke C compilers and code optimization for DSPs

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BY BZ CA CH CN CO CR CU CZ DE DM DZ EC EE ES FI GB GD GE GH HR HU ID IL IN IS JP KE KG KP KR LC LK LR LS LT LU LV MA MD MG MN MW MX MZ NO NZ OM PH PL PT RU SD SE SG SI SK SL TJ TM TN TR TZ UA UG UZ VN YU ZA ZM

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ UG ZM ZW AM AZ BY KG KZ RU TJ TM AT BE BG CH CY CZ DK EE ES FI FR GB GR IE IT LU MC PT SE SK TR BF BJ CF CG CI GA GN GQ GW ML MR NE SN TD TG

ENP Entry into the national phase

Ref document number: 0407839

Country of ref document: GB

Kind code of ref document: A

Free format text: PCT FILING DATE = 20020910

121 EP: the EPO has been informed by WIPO that EP was designated in this application
WWE WIPO information: entry into national phase

Ref document number: 20028206045

Country of ref document: CN

Ref document number: 2003536879

Country of ref document: JP

Ref document number: 1020047005643

Country of ref document: KR

122 EP: PCT application non-entry in European phase
B Later publication of amended claims

Effective date: 20040408