CN100474236C - Multithreaded parallel processor and method for maintaining thread execution - Google Patents

Multithreaded parallel processor and method for maintaining thread execution

Info

Publication number
CN100474236C
Authority
CN
China
Prior art keywords
register
processor
thread
window
visit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CNB008144966A
Other languages
English (en)
Other versions
CN1390323A (zh)
Inventor
G. Wolrich
M. J. Adiletta
W. Wheeler
D. Bernstein
D. Hooper
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Intel Corp
Original Assignee
Intel Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/30007 Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/30018 Bit or string instructions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/3004 Arrangements for executing specific machine instructions to perform operations on memory
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/3005 Arrangements for executing specific machine instructions to perform operations for flow control
    • G06F9/30058 Conditional branch instructions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/30076 Arrangements for executing specific machine instructions to perform miscellaneous control operations, e.g. NOP
    • G06F9/30087 Synchronisation or serialisation instructions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30145 Instruction analysis, e.g. decoding, instruction word fields
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30145 Instruction analysis, e.g. decoding, instruction word fields
    • G06F9/3016 Decoding the operand specifier, e.g. specifier format
    • G06F9/30167 Decoding the operand specifier, e.g. specifier format of immediate specifier, e.g. constants
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/32 Address formation of the next instruction, e.g. by incrementing the instruction counter
    • G06F9/321 Program or instruction counter, e.g. incrementing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3824 Operand accessing
    • G06F9/3834 Maintaining memory consistency
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3836 Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • G06F9/3842 Speculative instruction execution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3836 Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • G06F9/3851 Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution from multiple instruction streams, e.g. multistreaming

Abstract

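A parallel, hardware-based multithreaded processor is described. The processor includes a general-purpose processor that coordinates system functions and a plurality of microengines that support multiple hardware threads or contexts (THREAD_3, … THREAD_0). The processor maintains execution threads (THREAD_3, … THREAD_0). The execution threads (THREAD_3, … THREAD_0) access a register set organized into a plurality of relatively addressable windows of registers that are relatively addressable per thread (THREAD_3, … THREAD_0).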

Description

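Multithreaded parallel processor and method for maintaining thread execution
Background
This invention relates to computer processors.
Parallel processing is an efficient form of information processing of concurrent events in a computing process. In contrast to sequential processing, parallel processing demands concurrent execution of many programs in a computer. In the context of a parallel processor, parallelism means doing more than one thing at the same time. Unlike a serial paradigm, in which all tasks are performed sequentially at a single station, or a pipelined machine, in which tasks are performed at specialized stations, parallel processing provides a plurality of stations, each capable of performing all tasks. That is, in general all or several of the stations work simultaneously and independently on the same or common elements of a problem. Certain problems are suitable for solution by applying parallel processing.
Brief Description of the Drawings
FIG. 1 is a block diagram of a communication system employing a hardware-based multithreaded processor.
FIG. 2 is a detailed block diagram of the hardware-based multithreaded processor of FIG. 1.
FIG. 3 is a block diagram of a microengine functional unit employed in the hardware-based multithreaded processor of FIGS. 1 and 2.
FIG. 4 is a block diagram of a pipeline in the microengine of FIG. 3.
FIG. 5 is a block diagram showing the general-purpose register address arrangement.
Description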
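Referring to FIG. 1, a communication system 10 includes a parallel, hardware-based multithreaded processor 12. The hardware-based multithreaded processor 12 is coupled to a bus such as a PCI bus 14, a memory system 16, and a second bus 18. The system 10 is especially useful for tasks that can be broken into parallel subtasks or functions. Specifically, the hardware-based multithreaded processor 12 is useful for tasks that are bandwidth oriented rather than latency oriented. The hardware-based multithreaded processor 12 has multiple microengines 22, each with multiple hardware-controlled threads that can be simultaneously active and work independently on a task.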
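The hardware-based multithreaded processor 12 also includes a central processor 20 that assists in loading microcode control for other resources of the hardware-based multithreaded processor 12 and performs other general-purpose computer type functions, such as handling protocols, handling exceptions, and providing extra support for packet processing where the microengines pass the packets off for more detailed processing, such as in boundary conditions. In one embodiment, the processor 20 is a Strong Arm® based architecture (Arm is a trademark of ARM Limited, United Kingdom). The general-purpose microprocessor 20 has an operating system. Through the operating system, the processor 20 can call functions to operate on the microengines 22a-22f. The processor 20 can use any supported operating system, preferably a real-time operating system. For the core processor implemented as a Strong Arm architecture, operating systems such as [Figure C00814496D00062] real-time, VXWorks, and µCUS, a freeware operating system available over the Internet, can be used.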
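The hardware-based multithreaded processor 12 also includes a plurality of functional microengines 22a-22f. Each of the functional microengines (microengines) 22a-22f maintains a plurality of program counters in hardware and states associated with the program counters. Effectively, a corresponding plurality of sets of threads can be simultaneously active on each of the microengines 22a-22f, while only one is actually operating at any one time.
In one embodiment, six microengines 22a-22f are shown. Each microengine 22a-22f has capabilities for processing four hardware threads. The six microengines 22a-22f operate with shared resources, including the memory system 16 and bus interfaces 24 and 28. The memory system 16 includes a synchronous dynamic random access memory (SDRAM) controller 26a and a static random access memory (SRAM) controller 26b. The SDRAM memory 16a and SDRAM controller 26a are typically used for processing large volumes of data, e.g., processing of network payloads from network packets. The SRAM controller 26b and SRAM memory 16b are used in a networking implementation for low-latency, fast-access tasks, e.g., accessing lookup tables, memory for the core processor 20, and so forth.
The six microengines 22a-22f access either the SDRAM 16a or the SRAM 16b based on characteristics of the data. Low-latency, low-bandwidth data is stored in and fetched from SRAM, whereas higher-bandwidth data for which latency is not as important is stored in and fetched from SDRAM. The microengines 22a-22f can execute memory reference instructions to either the SDRAM controller 26a or the SRAM controller 26b.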
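Advantages of hardware multithreading can be explained by SRAM or SDRAM memory accesses. As an example, an SRAM access requested by Thread_0 from a microengine causes the SRAM controller 26b to initiate an access to the SRAM memory 16b. The SRAM controller controls arbitration for the SRAM bus, accesses the SRAM 16b, fetches the data from the SRAM 16b, and returns the data to the requesting microengine 22a-22f. During an SRAM access, if the microengine, e.g., 22a, had only a single thread that could operate, that microengine would be dormant until data was returned from the SRAM. By employing hardware context swapping within each of the microengines 22a-22f, other contexts with unique program counters can execute in that same microengine. Thus, another thread, e.g., Thread_1, can function while the first thread, e.g., Thread_0, is awaiting the return of the read data. During execution, Thread_1 may access the SDRAM memory 16a. While Thread_1 operates on the SDRAM unit and Thread_0 operates on the SRAM unit, a new thread, e.g., Thread_2, can now operate in the microengine 22a. Thread_2 can operate for a certain amount of time until it needs to access memory or perform some other long-latency operation, such as making an access to a bus interface. Therefore, the processor 12 can simultaneously have a bus operation, an SRAM operation, and an SDRAM operation all being completed or operated upon by one microengine 22a, and still have one more thread available to process more work in the data path.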
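The latency-hiding effect described above can be illustrated with a toy model. The following Python sketch is not taken from the patent: the function name `engine_utilization`, the cycle counts, and the simple round-robin choice of the next ready thread are all assumptions made for illustration. It estimates how busy a single microengine stays when each thread computes for a few cycles and then waits on a memory reference.

```python
def engine_utilization(num_threads, compute_cycles, memory_latency, total_cycles):
    """Fraction of cycles in which one engine executes some thread.

    Hypothetical model: each thread runs for `compute_cycles` (must be > 0),
    then issues a memory reference and is not ready again until
    `memory_latency` cycles later. Context swaps cost zero cycles.
    """
    ready_at = [0] * num_threads   # first cycle at which each thread may run
    busy = 0
    cycle = 0
    next_thread = 0                # round-robin starting point
    while cycle < total_cycles:
        chosen = None
        for i in range(num_threads):
            t = (next_thread + i) % num_threads
            if ready_at[t] <= cycle:
                chosen = t
                break
        if chosen is None:
            # No thread is ready: the engine idles until the earliest wake-up.
            cycle = min(min(ready_at), total_cycles)
            continue
        run = min(compute_cycles, total_cycles - cycle)
        busy += run
        cycle += run
        ready_at[chosen] = cycle + memory_latency
        next_thread = (chosen + 1) % num_threads
    return busy / total_cycles


single = engine_utilization(1, compute_cycles=2, memory_latency=6, total_cycles=80)
four = engine_utilization(4, compute_cycles=2, memory_latency=6, total_cycles=80)
print(single, four)
```

Under these assumed numbers, a single thread keeps the engine busy a quarter of the time, while four threads cover each other's memory waits completely.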
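Hardware context swapping also synchronizes completion of tasks. For example, two threads could hit the same shared resource, e.g., SRAM. Each one of these separate functional units, e.g., the FBUS interface 28, the SDRAM controller 26a, and the SRAM controller 26b, reports back a flag signaling completion of an operation when it completes a requested task from one of the microengine thread contexts. When the microengine receives the flag, the microengine can determine which thread to turn on.
One example of an application for the hardware-based multithreaded processor 12 is as a network processor. As a network processor, the hardware-based multithreaded processor 12 interfaces to network devices such as media access controller devices, e.g., a 10/100BaseT Octal MAC 13a or a Gigabit Ethernet device 13b. In general, as a network processor, the hardware-based multithreaded processor 12 can interface to any type of communication device or interface that receives or sends large amounts of data. The communication system 10, functioning in a networking application, could receive a plurality of network packets from the devices 13a, 13b and process those packets in a parallel manner. With the hardware-based multithreaded processor 12, each network packet can be independently processed.
Another example of use of the processor 12 is as a print engine for a PostScript processor or as a processor for a storage subsystem, i.e., RAID disk storage. A further use is as a matching engine. In the securities industry, for example, the advent of electronic trading requires the use of electronic matching engines to match orders between buyers and sellers. These and other parallel types of tasks can be accomplished on the system 10.
The processor 12 includes a bus interface 28 that couples the processor to the second bus 18. In one embodiment, the bus interface 28 couples the processor 12 to the so-called FBUS 18 (FIFO bus). The FBUS interface 28 is responsible for controlling and interfacing the processor 12 to the FBUS 18. The FBUS 18 is a 64-bit wide FIFO bus used to interface to media access controller (MAC) devices.
The processor 12 includes a second interface, e.g., a PCI bus interface 24, that couples other system components residing on the PCI bus 14 to the processor 12. The PCI bus interface 24 provides a high-speed data path 24a to the memory 16, e.g., the SDRAM memory 16a. Through that path, data can be moved quickly from the SDRAM 16a through the PCI bus 14 via direct memory access (DMA) transfers. The hardware-based multithreaded processor 12 supports image transfers. The hardware-based multithreaded processor 12 can employ a plurality of DMA channels, so if one target of a DMA transfer is busy, another of the DMA channels can take over the PCI bus to deliver information to another target, maintaining high processor 12 efficiency. Additionally, the PCI bus interface 24 supports target and master operations. Target operations are operations in which slave devices on the bus 14 access the SDRAM through reads and writes that are serviced as a slave to the target operation. In master operations, the processor core 20 sends data directly to or receives data directly from the PCI interface 24.
Each of the functional units is coupled to one or more internal buses. As described below, the internal buses are dual 32-bit buses (i.e., one bus for read and one for write). The hardware-based multithreaded processor 12 is also constructed such that the sum of the bandwidths of the internal buses in the processor 12 exceeds the bandwidth of the external buses coupled to the processor 12. The processor 12 includes an internal core processor bus 32, e.g., an ASB bus (Advanced System Bus), that couples the processor core 20 to the memory controllers 26a, 26c and to an ASB translator 30 described below. The ASB bus is a subset of the AMBA bus that is used with the Strong Arm processor core. The processor 12 also includes a private bus 34 that couples the microengine units to the SRAM controller 26b, the ASB translator 30, and the FBUS interface 28. A memory bus 38 couples the memory controllers 26a, 26b to the bus interfaces 24 and 28 and to the memory system 16, including flash ROM 16c used for boot operations and so forth.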
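Referring to FIG. 2, each of the microengines 22a-22f includes an arbiter that examines flags to determine the available threads to be operated upon. Any thread from any of the microengines 22a-22f can access the SDRAM controller 26a, the SRAM controller 26b, or the FBUS interface 28. The memory controllers 26a and 26b each include a plurality of queues to store outstanding memory reference requests. The queues either maintain the order of memory references or arrange memory references to optimize memory bandwidth. For example, if Thread_0 has no dependencies on or relationship to Thread_1, there is no reason that Thread_1 and Thread_0 cannot complete their memory references to the SRAM unit out of order. The microengines 22a-22f issue memory reference requests to the memory controllers 26a and 26b. The microengines 22a-22f flood the memory subsystems 26a and 26b with enough memory reference operations such that the memory subsystems 26a and 26b become the bottleneck for processor 12 operation.
If the memory subsystem 16 is flooded with memory requests that are independent in nature, the processor 12 can perform memory reference sorting. Memory reference sorting improves achievable memory bandwidth. Memory reference sorting, as described below, reduces the dead time or bubble that occurs with accesses to SRAM. With memory references to SRAM, switching the current direction on signal lines between reads and writes produces a bubble or dead time while waiting for the current to settle on the conductors coupling the SRAM 16b to the SRAM controller 26b.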
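That is, the drivers that drive current on the bus need to settle out prior to changing states. Thus, repetitive cycles of a read followed by a write can degrade peak bandwidth. Memory reference sorting allows the processor 12 to organize references to memory such that long strings of reads can be followed by long strings of writes. This can be used to minimize dead time in the pipeline and effectively achieve closer to maximum available bandwidth. Reference sorting helps maintain parallel hardware context threads. On the SDRAM, reference sorting allows hiding of precharges from one bank to another bank. Specifically, if the SDRAM memory 16a is organized into an odd bank and an even bank, then while the processor is operating on the odd bank, the memory controller can start precharging the even bank. Precharging is possible if memory references alternate between odd and even banks. By ordering memory references to alternate accesses to opposite banks, the processor 12 improves SDRAM bandwidth. Additionally, other optimizations can be used. For example, merging optimizations, where operations that can be merged are merged prior to memory access; open-page optimizations, where, by examining addresses, an opened page of memory is not reopened; chaining, as described below; and refreshing mechanisms can be employed.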
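The benefit of grouping reads and writes can be sketched by counting bus direction changes. This is an illustrative Python sketch, not the controller's actual queueing logic: the helper names and the 'R'/'W' encoding are assumptions, and the sketch presumes the references are mutually independent so that reordering them is safe.

```python
def count_turnarounds(ops):
    """Count read<->write direction changes in a sequence of 'R'/'W' references.

    Each change stands for one bubble while the bus drivers settle.
    """
    return sum(1 for prev, cur in zip(ops, ops[1:]) if prev != cur)


def group_reads_then_writes(ops):
    """Reorder independent references so all reads precede all writes.

    sorted() is stable, so reads keep their relative order, as do writes.
    """
    return sorted(ops, key=lambda op: op != "R")


interleaved = list("RWRWRWRW")
grouped = group_reads_then_writes(interleaved)
print(count_turnarounds(interleaved), count_turnarounds(grouped))
```

In this assumed workload, the alternating stream pays for seven turnarounds, while the sorted stream pays for one.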
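The FBUS interface 28 supports transmit and receive flags for each port that a MAC device supports, along with an interrupt flag indicating when service is warranted. The FBUS interface 28 also includes a controller 28a that performs header processing of incoming packets from the FBUS 18. The controller 28a extracts the packet headers and performs a microprogrammable source/destination/protocol hashed lookup (used for address smoothing) in SRAM. If the hash does not successfully resolve, the packet header is sent to the processor core 20 for additional processing. The FBUS interface 28 supports the following internal data transactions:
FBUS unit  (shared bus SRAM)   to/from microengine.
FBUS unit  (via private bus)   writes from SDRAM unit.
FBUS unit  (via Mbus)          reads to SDRAM.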
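The FBUS 18 is a standard industry bus and includes a data bus, e.g., 64 bits wide, and sideband control for address and read/write control. The FBUS interface 28 provides the ability to input large amounts of data using a series of input and output FIFOs 29a-29b. From the FIFOs 29a-29b, the microengines 22a-22f fetch data from a receive FIFO, or command the SDRAM controller 26a to move data from a receive FIFO, in which data has come from a device on bus 18, into the FBUS interface 28. The data can be sent through the memory controller 26a to the SDRAM memory 16a via direct memory access. Similarly, the microengines can move data from the SDRAM 16a to the interface 28 and out to the FBUS 18 via the FBUS interface 28.
Data functions are distributed amongst the microengines. Connectivity to the SDRAM controller 26a, the SRAM controller 26b, and the FBUS is via command requests. A command request can be a memory request or an FBUS request. For example, a command request can move data from a register located in a microengine 22a to a shared resource, e.g., an SDRAM location, an SRAM location, flash memory, or some MAC address. The commands are sent out to each of the functional units and the shared resources. However, the shared resources do not need to maintain local buffering of the data. Rather, the shared resources access distributed data located inside the microengines. This enables the microengines 22a-22f to have local access to data rather than arbitrating for access on a bus and risking contention for the bus. With this feature, there is a 0-cycle stall for waiting for data internal to the microengines 22a-22f.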
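The data buses, e.g., the ASB bus 30, SRAM bus 34, and SDRAM bus 38, coupling these shared resources, e.g., the memory controllers 26a and 26b, are of sufficient bandwidth such that there are no internal bottlenecks. Thus, in order to avoid bottlenecks, the processor 12 has a bandwidth requirement where each of the functional units is provided with at least twice the maximum bandwidth of the internal buses. As an example, the SDRAM can run a 64-bit wide bus at 83 MHz. The SRAM data bus could have separate read and write buses, e.g., a 32-bit wide read bus running at 166 MHz and a 32-bit wide write bus running at 166 MHz. That is, in essence, 64 bits running at 166 MHz, which is effectively twice the bandwidth of the SDRAM.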
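The arithmetic in this example can be checked directly. The sketch below assumes one transfer per clock on each bus, which the text does not state explicitly, and uses a hypothetical helper name.

```python
def bus_bandwidth_bits_per_s(width_bits, clock_hz):
    """Peak bandwidth of a bus, assuming one transfer per clock cycle."""
    return width_bits * clock_hz


# SDRAM: one 64-bit bus at 83 MHz.
sdram = bus_bandwidth_bits_per_s(64, 83_000_000)

# SRAM: separate 32-bit read and 32-bit write buses, each at 166 MHz.
sram = 2 * bus_bandwidth_bits_per_s(32, 166_000_000)

ratio = sram / sdram
print(sdram, sram, ratio)
```

With these figures the combined SRAM read and write buses carry exactly twice the peak SDRAM bandwidth, matching the statement in the text.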
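The core processor 20 can also access the shared resources. The core processor 20 has direct communication to the SDRAM controller 26a, to the bus interface 24, and to the SRAM controller 26b via bus 32. However, to access the microengines 22a-22f and the transfer registers located at any of the microengines 22a-22f, the core processor 20 accesses the microengines 22a-22f via the ASB translator 30 over bus 34. The ASB translator 30 can physically reside in the FBUS interface 28, but logically it is distinct. The ASB translator 30 performs an address translation between FBUS microengine transfer register locations and core processor addresses (i.e., the ASB bus) so that the core processor 20 can access registers belonging to the microengines 22a-22f.
Although the microengines 22 can use the register set to exchange data as described below, a scratchpad memory 27 is also provided to permit microengines to write data out to the memory for other microengines to read. The scratchpad 27 is coupled to bus 34.
The processor core 20 includes a RISC core 50 implemented in a five-stage pipeline that performs a single-cycle shift of one operand or two operands in a single cycle and provides multiplication support and 32-bit barrel shift support. This RISC core 50 is a standard Strong Arm® architecture, but it is implemented with a five-stage pipeline for performance reasons. The processor core 20 also includes a 16-kilobyte instruction cache 52, an 8-kilobyte data cache 54, and a prefetch stream buffer 56. The core processor 20 performs arithmetic operations in parallel with memory writes and instruction fetches. The core processor 20 interfaces with other functional units via the ARM-defined ASB bus. The ASB bus is a 32-bit bidirectional bus 32.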
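Microengines:
Referring to FIG. 3, an exemplary one of the microengines 22a-22f, e.g., microengine 22f, is shown. The microengine includes a control store 70, which in one implementation includes a RAM of, here, 1,024 words of 32 bits. The RAM stores a microprogram. The microprogram is loadable by the core processor 20. The microengine 22f also includes controller logic 72. The controller logic includes an instruction decoder 73 and program counter (PC) units 72a-72d. The four microprogram counters 72a-72d are maintained in hardware. The microengine 22f also includes context event switching logic 74. The context event logic 74 receives messages (e.g., SEQ_#_EVENT_RESPONSE; FBI_EVENT_RESPONSE; SRAM_EVENT_RESPONSE; SDRAM_EVENT_RESPONSE; and ASB_EVENT_RESPONSE) from each one of the shared resources, e.g., the SDRAM controller 26a, the SRAM controller 26b, or the processor core 20, control and status registers, and so forth. These messages provide information on whether a requested function has completed. Based on whether or not a function requested by a thread has completed and signaled completion, the thread needs to wait for that completion signal, and if the thread is enabled to operate, the thread is placed on an available thread list (not shown). The microengine 22f can have a maximum of, e.g., four threads available.
In addition to event signals that are local to an executing thread, the microengines 22 employ signaling states that are global. With signaling states, an executing thread can broadcast a signal state to all microengines 22. On receipt of a request-available signal, any and all threads in the microengines can branch on these signaling states. These signaling states can be used to determine the availability of a resource or whether a resource is due for servicing.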
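The context event logic 74 has arbitration for the four threads. In one embodiment, the arbitration is a round-robin mechanism. Other techniques could be used, including priority queuing or weighted fair queuing. The microengine 22f also includes an execution box (EBOX) data path 76 that includes an arithmetic logic unit 76a and a general-purpose register set 76b. The arithmetic logic unit 76a performs arithmetic and logical functions as well as shift functions. The register set 76b has a relatively large number of general-purpose registers. As will be described with reference to FIG. 5, in this implementation there are 64 general-purpose registers in a first bank, Bank A, and 64 in a second bank, Bank B. The general-purpose registers are windowed, as will be described, so that they are relatively and absolutely addressable.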
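The round-robin arbitration among threads whose completion events have arrived can be sketched as follows. This is a minimal software model under stated assumptions, not the hardware design: the class name `RoundRobinArbiter` and its methods are hypothetical, and a thread is treated as available as soon as a single completion signal arrives for it.

```python
class RoundRobinArbiter:
    """Toy model of round-robin selection among available threads."""

    def __init__(self, num_threads=4):
        self.num_threads = num_threads
        self.available = [False] * num_threads
        self.last = num_threads - 1   # so that thread 0 is considered first

    def signal_completion(self, thread_id):
        """Record that the operation a thread was waiting on has completed."""
        self.available[thread_id] = True

    def pick_next(self):
        """Return the next available thread after the last one run, or None."""
        for offset in range(1, self.num_threads + 1):
            candidate = (self.last + offset) % self.num_threads
            if self.available[candidate]:
                self.available[candidate] = False
                self.last = candidate
                return candidate
        return None


arb = RoundRobinArbiter()
arb.signal_completion(2)
arb.signal_completion(0)
order = [arb.pick_next(), arb.pick_next(), arb.pick_next()]
print(order)
```

Starting the search just after the most recently run thread is what makes the policy round-robin: no available thread can be passed over twice in a row. Priority queuing or weighted fair queuing, which the text mentions as alternatives, would replace only `pick_next`.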
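The microengine 22f also includes a write transfer register stack 78 and a read transfer stack 80. These registers are also windowed so that they are relatively and absolutely addressable. The write transfer register stack 78 is where write data to a resource is located. Similarly, the read register stack 80 is for return data from a shared resource. Subsequent to or concurrent with data arrival, an event signal from the respective shared resource, e.g., the SDRAM controller 26a, the SRAM controller 26b, or the core processor 20, is provided to the context event arbiter 74, which then alerts the thread that the data is available or has been sent. Both transfer register banks 78 and 80 are connected to the execution box (EBOX) 76 through a data path. In one implementation, the read transfer register has 64 registers and the write transfer register has 64 registers.
Referring to FIG. 4, the microengine data path maintains a five-stage micro-pipeline 82. This pipeline includes lookup of microinstruction words 82a; formation of the register file addresses 82b; read of operands from the register file 82c; ALU, shift, or compare operations 82d; and write-back of results to registers 82e. By providing a write-back data bypass into the ALU/shifter units, and by assuming the registers are implemented as a register file (rather than a RAM), the microengine can perform a simultaneous register file read and write, which completely hides the write operation.
The SDRAM interface 26a provides a signal back to the requesting microengine on reads that indicates whether a parity error occurred on the read request. The microengine microcode is responsible for checking the SDRAM read parity flag when the microengine uses any return data. Upon checking the flag, if it was set, the act of branching on it clears it. The parity flag is sent only when the SDRAM is enabled for checking and the SDRAM is parity protected. The microengines and the PCI unit are the only requestors notified of parity errors. Therefore, if the processor core 20 or the FIFO requires parity protection, a microengine assists in the request.
Referring to FIG. 5, two register address spaces exist: locally accessible registers, and globally accessible registers that are accessible by all microengines. The general-purpose registers (GPRs) are implemented as two separate banks (A bank and B bank) whose addresses are interleaved on a word-by-word basis such that A bank registers have lsb=0 and B bank registers have lsb=1 (lsb is the least significant bit). Each bank is capable of performing a simultaneous read and write to two different words within its bank.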
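Across banks A and B, the register set 76b is also organized into four windows 76b0-76b3 of 32 registers each that are relatively addressable per thread. Thus, thread_0 finds its register 0 at 77a (register 0), thread_1 finds its register_0 at 77b (register 32), thread_2 finds its register_0 at 77c (register 64), and thread_3 finds its register_0 at 77d (register 96). Relative addressing is supported so that multiple threads can use the exact same control store and locations but access different windows of registers and perform different functions. The use of register window addressing and bank addressing provides the requisite read bandwidth using only dual-ported RAMs in the microengine 22f.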
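The window mapping given above (register 0 of thread_0 through thread_3 at registers 0, 32, 64, and 96) can be written as a short sketch. The function names are hypothetical, and applying the lsb bank rule to this 0-127 register numbering is an assumption drawn from the interleaving described for FIG. 5, not something the text states for this numbering.

```python
WINDOW_SIZE = 32   # registers per thread window, per the text
NUM_THREADS = 4    # four windows, one per hardware thread


def absolute_register(thread_id, relative_reg):
    """Map a thread-relative register number to its absolute register number."""
    if not 0 <= thread_id < NUM_THREADS:
        raise ValueError("thread_id out of range")
    if not 0 <= relative_reg < WINDOW_SIZE:
        raise ValueError("relative_reg out of range")
    return thread_id * WINDOW_SIZE + relative_reg


def bank_of(absolute_reg):
    """Assumed bank selection: lsb=0 selects bank A, lsb=1 selects bank B."""
    return "B" if absolute_reg & 1 else "A"


window_bases = [absolute_register(t, 0) for t in range(NUM_THREADS)]
print(window_bases)
```

Because every thread uses the same relative register numbers, the same microcode can run in any context; only the window base differs, which is why no registers need to be saved or restored on a context swap.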
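These windowed registers do not have to save data from context switch to context switch, so the normal push and pop of a context swap file or stack is eliminated. Context switching here has a 0-cycle overhead for changing from one context to another. Relative register addressing divides the register banks into windows across the address width of the general-purpose register set. Relative addressing allows access to any of the windows relative to the starting point of the window. Absolute addressing is also supported in this architecture, where any one of the absolute registers may be accessed by any of the threads by providing the exact address of the register.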
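Addressing of the general-purpose registers 76b can occur in two modes, depending on the microword format. The two modes are absolute and relative. In absolute mode, the register address is directly specified in a 7-bit source field (a6-a0 or b6-b0):
             7    6    5    4    3    2    1    0
           +----+----+----+----+----+----+----+----+
A GPR:     | a6 | 0  | a5 | a4 | a3 | a2 | a1 | a0 |  a6=0
B GPR:     | b6 | 1  | b5 | b4 | b3 | b2 | b1 | b0 |  b6=0
SRAM/ASB:  | a6 | a5 | a4 | 0  | a3 | a2 | a1 | a0 |  a6=1, a5=0, a4=0
SDRAM:     | a6 | a5 | a4 | 0  | a3 | a2 | a1 | a0 |  a6=1, a5=0, a4=1
The register address is directly specified in an 8-bit destination field (d7-d0):
             7    6    5    4    3    2    1    0
           +----+----+----+----+----+----+----+----+
A GPR:     | a6 | 0  | a5 | a4 | a3 | a2 | a1 | a0 |  a6=0
B GPR:     | b6 | 1  | b5 | b4 | b3 | b2 | b1 | b0 |  b6=0
SRAM/ASB:  | a6 | a5 | a4 | 0  | a3 | a2 | a1 | a0 |  a6=1, a5=0, a4=0
SDRAM:     | a6 | a5 | a4 | 0  | a3 | a2 | a1 | a0 |  a6=1, a5=0, a4=1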
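If <a6:a5>=1,1, <b6:b5>=1,1, or <d7:d6>=1,1, then the lower bits are interpreted as a context-relative address field (described below). When a non-relative A or B source address is specified in the A, B absolute field, only the lower half of the SRAM/ASB and SDRAM address spaces can be addressed. In effect, absolute reads of SRAM/SDRAM are limited to this reduced effective address space; because the restriction does not apply to the destination field, writes to SRAM/SDRAM can still use the full address space.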
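The selector bits of an A-side absolute source field can be decoded as in the sketch below. It follows only the conditions printed beside the source-field table (a6=0; a6=1, a5=0, a4=0; a6=1, a5=0, a4=1) and the <a6:a5>=1,1 rule; the function name and the returned labels are hypothetical, and the sketch deliberately does not reconstruct the final register address from the remaining bits.

```python
def classify_absolute_source(field):
    """Classify a 7-bit A-side absolute source field (bits a6..a0).

    Returns one of "GPR", "SRAM/ASB", "SDRAM", or "context-relative".
    """
    if not 0 <= field <= 0x7F:
        raise ValueError("source field must fit in 7 bits")
    a6 = (field >> 6) & 1
    a5 = (field >> 5) & 1
    a4 = (field >> 4) & 1
    if a6 == 0:
        return "GPR"
    if a5 == 1:
        # <a6:a5> = 1,1: the lower bits form a context-relative address field.
        return "context-relative"
    return "SDRAM" if a4 else "SRAM/ASB"


labels = [classify_absolute_source(f)
          for f in (0b0000101, 0b1000011, 0b1010011, 0b1100000)]
print(labels)
```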
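In relative mode, the specified address is an offset within the context space, as defined by a 5-bit source field (a4-a0 or b4-b0):
             7    6    5    4    3    2    1    0
           +----+----+----+----+----+----+----+----+
A GPR:     | a4 | 0  | context | a3 | a2 | a1 | a0 |  a4=0
B GPR:     | b4 | 1  | context | b3 | b2 | b1 | b0 |  b4=0
SRAM/ASB:  | ab4| 0  | ab3| context | b2 | b1 | ab0|  ab4=1, ab3=0
SDRAM:     | ab4| 0  | ab3| context | b2 | b1 | ab0|  ab4=1, ab3=1
or as defined by a 6-bit destination field (d5-d0):
             7    6    5    4    3    2    1    0
           +----+----+----+----+----+----+----+----+
A GPR:     | d5 | d4 | context | d3 | d2 | d1 | d0 |  d5=0, d4=0
B GPR:     | d5 | d4 | context | d3 | d2 | d1 | d0 |  d5=0, d4=1
SRAM/ASB:  | d5 | d4 | d3 | context | d2 | d1 | d0 |  d5=1, d4=0, d3=0
SDRAM:     | d5 | d4 | d3 | context | d2 | d1 | d0 |  d5=1, d4=0, d3=1
If <d5:d4>=1,1, then the destination address does not address a valid register, and thus no destination operand is written back.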
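The selector bits of the 6-bit relative destination field can be decoded in the same spirit. The sketch below uses only the conditions listed beside the destination table and the <d5:d4>=1,1 rule; the function name and labels are hypothetical, and the way the context number is merged into the final address is left out.

```python
def classify_relative_destination(field):
    """Classify a 6-bit relative destination field (bits d5..d0).

    Returns "A GPR", "B GPR", "SRAM/ASB", "SDRAM", or None when
    <d5:d4> = 1,1, in which case no destination operand is written back.
    """
    if not 0 <= field <= 0x3F:
        raise ValueError("destination field must fit in 6 bits")
    d5 = (field >> 5) & 1
    d4 = (field >> 4) & 1
    d3 = (field >> 3) & 1
    if d5 == 1 and d4 == 1:
        return None
    if d5 == 0:
        return "B GPR" if d4 else "A GPR"
    return "SDRAM" if d3 else "SRAM/ASB"


targets = [classify_relative_destination(f)
           for f in (0b000001, 0b010001, 0b100001, 0b101001, 0b110000)]
print(targets)
```

Returning `None` for the <d5:d4>=1,1 encoding mirrors the text: an instruction can be issued for its side effects alone, with nothing written back to a register.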
Other embodiments are within the scope of the appended claims.

Claims (18)

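1. A method of maintaining execution of threads in a parallel multithreaded processor, characterized in that the method comprises the step of:
accessing, by an executing thread in the multithreaded processor, a register set organized into a plurality of relatively addressable windows of registers that are relatively addressable per thread, whereby multiple threads can use the exact same control store and locations but access different windowed register banks and perform different functions.
2. The method of claim 1, characterized in that relative register addressing divides the register banks into windows across the address width of the register set.
3. The method of claim 1, characterized in that relative addressing allows access to any windowed register relative to the starting point of a window of registers.
4. The method of claim 1, characterized by further comprising:
organizing the register set into windows according to the number of threads that execute in the processor.
5. The method of claim 1, characterized in that the register windows are implemented using dual-ported random access memory.
6. The method of claim 1, characterized in that relative addressing allows access to any of the windows of registers relative to the starting point of a window of registers.
7. The method of claim 1, characterized in that the register set is absolutely addressable, wherein any one of the absolutely addressable registers may be accessed by any of the threads by providing the exact address of the register.
8. The method of claim 7, characterized in that an absolute address of a register is directly specified in a source field or a destination field of an instruction.
9. The method of claim 1, characterized in that a relative address is specified in an instruction as an address offset within a context execution space, as defined by a source field or destination field operand.
10. A hardware-based multithreaded processor, characterized by comprising:
a processor unit comprising:
control logic including context event switching logic, the context switching logic arbitrating access to the microengine for a plurality of executable threads;
an arithmetic logic unit to process data for executing threads; and
a register set organized into a plurality of relatively addressable windows of registers that are relatively addressable per executable thread, whereby multiple threads can use the exact same control store and locations but access different windowed register banks and perform different functions.
11. The processor of claim 10, characterized in that the control logic further comprises:
an instruction decoder; and
program counter units to track executing threads.
12. The processor of claim 11, characterized in that the program counter units are maintained in hardware.
13. The processor of claim 10, characterized in that relative addressing allows access to any register relative to the starting point of a window of registers.
14. The processor of claim 10, characterized in that the number of windows of the register set depends on the number of threads that execute in the processor.
15. The processor of claim 11, characterized in that the windowed registers are provided using dual-ported random access memory.
16. The processor of claim 10, characterized in that the processing unit is a microprogrammed processor unit.
17. An apparatus for managing execution of multiple threads in a multithreaded processor, characterized in that the apparatus comprises:
logic to access, by an executing thread in the multithreaded processor, a register set organized into a plurality of relatively addressable windows of registers that are relatively addressable per thread, whereby multiple threads can use the exact same control store and locations but access different windowed register banks and perform different functions.
18. The apparatus of claim 17, characterized in that the register set is also absolutely addressable, wherein any one of the absolutely addressable registers may be accessed by any of the threads by providing the exact address of the register.
CNB008144966A 1999-09-01 2000-08-31 Multithreaded parallel processor and method for maintaining thread execution Expired - Fee Related CN100474236C (zh)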

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US15196199P 1999-09-01 1999-09-01
US60/151,961 1999-09-01

Publications (2)

Publication Number Publication Date
CN1390323A (zh) 2003-01-08
CN100474236C (zh) 2009-04-01

Family

ID=22540994

Family Applications (7)

Application Number Title Priority Date Filing Date
CNB008144966A Expired - Fee Related CN100474236C (zh) 1999-09-01 2000-08-31 Multithreaded parallel processor and method for maintaining thread execution
CNB008154309A Expired - Fee Related CN1254739C (zh) 1999-09-01 2000-08-31 Processor and method of operating a processor
CNB008151237A Expired - Fee Related CN1296818C (zh) 1999-09-01 2000-08-31 Instructions for a multithreaded parallel processor
CNB008154376A Expired - Fee Related CN1184562C (zh) 1999-09-01 2000-08-31 Branch instruction for a processor
CNB008148740A Expired - Fee Related CN1271513C (zh) 1999-09-01 2000-08-31 Method and processor for branch instructions
CNB008154171A Expired - Fee Related CN100342326C (zh) 1999-09-01 2000-08-31 Multithreaded processor and method of operating a processor
CNB008154120A Expired - Fee Related CN100351781C (zh) 1999-09-01 2000-09-01 Memory reference instructions for a microengine used in a multithreaded parallel processor architecture

Country Status (10)

Country Link
US (1) US7421572B1 (zh)
EP (7) EP1236094B1 (zh)
CN (7) CN100474236C (zh)
AT (2) ATE396449T1 (zh)
AU (11) AU7340700A (zh)
CA (7) CA2383528C (zh)
DE (2) DE60044752D1 (zh)
HK (8) HK1046049A1 (zh)
TW (11) TW559729B (zh)
WO (8) WO2001018646A1 (zh)

Families Citing this family (39)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2001016702A1 (en) 1999-09-01 2001-03-08 Intel Corporation Register set used in multithreaded parallel processor architecture
US7681018B2 (en) 2000-08-31 2010-03-16 Intel Corporation Method and apparatus for providing large register address space while maximizing cycletime performance for a multi-threaded register file set
US7292586B2 (en) * 2001-03-30 2007-11-06 Nokia Inc. Micro-programmable protocol packet parser and encapsulator
US6785793B2 (en) * 2001-09-27 2004-08-31 Intel Corporation Method and apparatus for memory access scheduling to reduce memory access latency
EP1436724A4 (en) * 2001-09-28 2007-10-03 Consentry Networks Inc MORE THREAD PACKAGE PROCESSING ENGINE FOR CAREFUL PACKAGE PROCESSING
US7069442B2 (en) * 2002-03-29 2006-06-27 Intel Corporation System and method for execution of a secured environment initialization instruction
US7437724B2 (en) * 2002-04-03 2008-10-14 Intel Corporation Registers for data transfers
GB2409062C (en) 2003-12-09 2007-12-11 Advanced Risc Mach Ltd Aliasing data processing registers
US7027062B2 (en) * 2004-02-27 2006-04-11 Nvidia Corporation Register based queuing for texture requests
GB0420442D0 (en) * 2004-09-14 2004-10-20 Ignios Ltd Debug in a multicore architecture
US9038070B2 (en) 2004-09-14 2015-05-19 Synopsys, Inc. Debug in a multicore architecture
SE0403128D0 (sv) * 2004-12-22 2004-12-22 Xelerated Ab A method for a processor, and a processor
US8028295B2 (en) * 2005-09-30 2011-09-27 Intel Corporation Apparatus, system, and method for persistent user-level thread
US7882284B2 (en) * 2007-03-26 2011-02-01 Analog Devices, Inc. Compute unit with an internal bit FIFO circuit
US7991967B2 (en) * 2007-06-29 2011-08-02 Microsoft Corporation Using type stability to facilitate contention management
US9384003B2 (en) * 2007-10-23 2016-07-05 Texas Instruments Incorporated Determining whether a branch instruction is predicted based on a capture range of a second instruction
US9207968B2 (en) * 2009-11-03 2015-12-08 Mediatek Inc. Computing system using single operating system to provide normal security services and high security services, and methods thereof
CN101950277B (zh) * 2010-09-13 2012-04-25 青岛海信信芯科技有限公司 用于微控制单元的数据传输方法与装置以及数据传输系统
GB2486737B (en) * 2010-12-24 2018-09-19 Qualcomm Technologies Int Ltd Instruction execution
US8880851B2 (en) * 2011-04-07 2014-11-04 Via Technologies, Inc. Microprocessor that performs X86 ISA and arm ISA machine language program instructions by hardware translation into microinstructions executed by common execution pipeline
US8645618B2 (en) * 2011-07-14 2014-02-04 Lsi Corporation Flexible flash commands
WO2013101232A1 (en) 2011-12-30 2013-07-04 Intel Corporation Packed rotate processors, methods, systems, and instructions
CN102833336A (zh) * 2012-08-31 2012-12-19 河海大学 分散分布式信息采集与并发处理系统中数据分包处理方法
US10140129B2 (en) * 2012-12-28 2018-11-27 Intel Corporation Processing core having shared front end unit
CN103186438A (zh) * 2013-04-02 2013-07-03 浪潮电子信息产业股份有限公司 一种提高磁盘阵列数据重构效率的方法
CN103226328B (zh) * 2013-04-21 2015-06-24 中国矿业大学(北京) 采集次数控制模式下的多线程数据采集系统同步控制方法
US20150127927A1 (en) * 2013-11-01 2015-05-07 Qualcomm Incorporated Efficient hardware dispatching of concurrent functions in multicore processors, and related processor systems, methods, and computer-readable media
KR102254099B1 (ko) 2014-05-19 2021-05-20 삼성전자주식회사 메모리 스와핑 처리 방법과 이를 적용하는 호스트 장치, 스토리지 장치 및 데이터 처리 시스템
CN103984235B (zh) * 2014-05-27 2016-05-11 湖南大学 基于c/s结构的空间机械臂控制系统软件架构及构建方法
US20160381050A1 (en) 2015-06-26 2016-12-29 Intel Corporation Processors, methods, systems, and instructions to protect shadow stacks
US10394556B2 (en) * 2015-12-20 2019-08-27 Intel Corporation Hardware apparatuses and methods to switch shadow stack pointers
US10430580B2 (en) 2016-02-04 2019-10-01 Intel Corporation Processor extensions to protect stacks during ring transitions
US10838656B2 (en) 2016-12-20 2020-11-17 Mediatek Inc. Parallel memory access to on-chip memory containing regions of different addressing schemes by threads executed on parallel processing units
US10387037B2 (en) * 2016-12-31 2019-08-20 Intel Corporation Microarchitecture enabling enhanced parallelism for sparse linear algebra operations having write-to-read dependencies
EP4089531A1 (en) 2016-12-31 2022-11-16 Intel Corporation Systems, methods, and apparatuses for heterogeneous computing
CN107329812B (zh) * 2017-06-09 2018-07-06 腾讯科技(深圳)有限公司 一种运行协程的方法和装置
CN112463327B (zh) * 2020-11-25 2023-01-31 海光信息技术股份有限公司 逻辑线程快速切换的方法、装置、cpu芯片及服务器
TWI769080B (zh) * 2021-09-17 2022-06-21 瑞昱半導體股份有限公司 用於同步動態隨機存取記憶體之控制模組及其控制方法
US20230205869A1 (en) * 2021-12-23 2023-06-29 Intel Corporation Efficient exception handling in trusted execution environments

Family Cites Families (140)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US3373408A (en) 1965-04-16 1968-03-12 Rca Corp Computer capable of switching between programs without storage and retrieval of the contents of operation registers
US3478322A (en) 1967-05-23 1969-11-11 Ibm Data processor employing electronically changeable control storage
US3577189A (en) * 1969-01-15 1971-05-04 Ibm Apparatus and method in a digital computer for allowing improved program branching with branch anticipation reduction of the number of branches, and reduction of branch delays
BE795789A (fr) 1972-03-08 1973-06-18 Burroughs Corp Microprogramme comportant une micro-instruction de recouvrement
US3881173A (en) 1973-05-14 1975-04-29 Amdahl Corp Condition code determination and data processing
IT986411B (it) 1973-06-05 1975-01-30 Olivetti E C Spa Sistema per trasferire il control lo delle elaborazioni da un primo livello prioritario ad un secondo livello prioritario
FR2253415A5 (zh) 1973-12-04 1975-06-27 Cii
US3913074A (en) * 1973-12-18 1975-10-14 Honeywell Inf Systems Search processing apparatus
US4130890A (en) 1977-06-08 1978-12-19 Itt Industries, Inc. Integrated DDC memory with bitwise erase
US4392758A (en) 1978-05-22 1983-07-12 International Business Machines Corporation Underscore erase
JPS56164464A (en) 1980-05-21 1981-12-17 Tatsuo Nogi Parallel processing computer
US4400770A (en) 1980-11-10 1983-08-23 International Business Machines Corporation Cache synonym detection and handling means
CA1179069A (en) 1981-04-10 1984-12-04 Yasushi Fukunaga Data transmission apparatus for a multiprocessor system
US4471426A (en) * 1981-07-02 1984-09-11 Texas Instruments Incorporated Microcomputer which fetches two sets of microcode bits at one time
US4454595A (en) 1981-12-23 1984-06-12 Pitney Bowes Inc. Buffer for use with a fixed disk controller
US4477872A (en) 1982-01-15 1984-10-16 International Business Machines Corporation Decode history table for conditional branch instructions
US4569016A (en) 1983-06-30 1986-02-04 International Business Machines Corporation Mechanism for implementing one machine cycle executable mask and rotate instructions in a primitive instruction set computing system
JPS6014338A (ja) * 1983-06-30 1985-01-24 インタ−ナショナル ビジネス マシ−ンズ コ−ポレ−ション 計算機システムにおける分岐機構
US4606025A (en) 1983-09-28 1986-08-12 International Business Machines Corp. Automatically testing a plurality of memory arrays on selected memory array testers
US4808988A (en) 1984-04-13 1989-02-28 Megatek Corporation Digital vector generator for a graphic display system
US4868735A (en) 1984-05-08 1989-09-19 Advanced Micro Devices, Inc. Interruptible structured microprogrammed sixteen-bit address sequence controller
US4742451A (en) 1984-05-21 1988-05-03 Digital Equipment Corporation Instruction prefetch system for conditional branch instruction for central processor unit
US5187800A (en) 1985-01-04 1993-02-16 Sun Microsystems, Inc. Asynchronous pipelined data processing system
US5045995A (en) 1985-06-24 1991-09-03 Vicom Systems, Inc. Selective operation of processing elements in a single instruction multiple data stream (SIMD) computer system
US4755966A (en) 1985-06-28 1988-07-05 Hewlett-Packard Company Bidirectional branch prediction and optimization
US4754398A (en) * 1985-06-28 1988-06-28 Cray Research, Inc. System for multiprocessor communication using local and common semaphore and information registers
US4777587A (en) * 1985-08-30 1988-10-11 Advanced Micro Devices, Inc. System for processing single-cycle branch instruction in a pipeline having relative, absolute, indirect and trap addresses
US5021945A (en) * 1985-10-31 1991-06-04 Mcc Development, Ltd. Parallel processor system for processing natural concurrencies and method therefor
US4847755A (en) 1985-10-31 1989-07-11 Mcc Development, Ltd. Parallel processing method and apparatus for increasing processing throughout by parallel processing low level instructions having natural concurrencies
US4745544A (en) 1985-12-12 1988-05-17 Texas Instruments Incorporated Master/slave sequencing processor with forced I/O
US4724521A (en) * 1986-01-14 1988-02-09 Veri-Fone, Inc. Method for operating a local terminal to execute a downloaded application program
US5297260A (en) 1986-03-12 1994-03-22 Hitachi, Ltd. Processor having a plurality of CPUS with one CPU being normally connected to common bus
US5170484A (en) 1986-09-18 1992-12-08 Digital Equipment Corporation Massively parallel array processing system
US4992934A (en) 1986-12-15 1991-02-12 United Technologies Corporation Reduced instruction set computing apparatus and methods
US5073864A (en) 1987-02-10 1991-12-17 Davin Computer Corporation Parallel string processor and method for a minicomputer
US5142683A (en) 1987-03-09 1992-08-25 Unisys Corporation Intercomputer communication control apparatus and method
US4866664A (en) 1987-03-09 1989-09-12 Unisys Corporation Intercomputer communication control apparatus & method
US4816913A (en) 1987-11-16 1989-03-28 Technology, Inc., 64 Pixel interpolation circuitry as for a video signal processor
US5189636A (en) 1987-11-16 1993-02-23 Intel Corporation Dual mode combining circuitry
US5055999A (en) * 1987-12-22 1991-10-08 Kendall Square Research Corporation Multiprocessor digital data processing system
US5220669A (en) * 1988-02-10 1993-06-15 International Business Machines Corporation Linkage mechanism for program isolation
DE68913629T2 (de) 1988-03-14 1994-06-16 Unisys Corp Satzverriegelungsprozessor für vielfachverarbeitungsdatensystem.
US5056015A (en) 1988-03-23 1991-10-08 Du Pont Pixel Systems Limited Architectures for serial or parallel loading of writable control store
US5165025A (en) 1988-10-06 1992-11-17 Lass Stanley E Interlacing the paths after a conditional branch like instruction
US5202972A (en) 1988-12-29 1993-04-13 International Business Machines Corporation Store buffer apparatus in a multiprocessor system
US5155854A (en) 1989-02-03 1992-10-13 Digital Equipment Corporation System for arbitrating communication requests using multi-pass control unit based on availability of system resources
US5155831A (en) 1989-04-24 1992-10-13 International Business Machines Corporation Data processing system with fast queue store interposed between store-through caches and a main memory
US5113516A (en) 1989-07-31 1992-05-12 North American Philips Corporation Data repacker having controlled feedback shifters and registers for changing data format
US5168555A (en) 1989-09-06 1992-12-01 Unisys Corporation Initial program load control
US5263169A (en) 1989-11-03 1993-11-16 Zoran Corporation Bus arbitration and resource management for concurrent vector signal processor architecture
DE3942977A1 (de) 1989-12-23 1991-06-27 Standard Elektrik Lorenz Ag Method for restoring the correct cell sequence, in particular in an ATM exchange, and output unit therefor
US5544337A (en) 1989-12-29 1996-08-06 Cray Research, Inc. Vector processor having registers for control by vector registers
US5247671A (en) 1990-02-14 1993-09-21 International Business Machines Corporation Scalable schedules for serial communications controller in data processing systems
JPH0799812B2 (ja) * 1990-03-26 1995-10-25 株式会社グラフイックス・コミュニケーション・テクノロジーズ Signal encoding device, signal decoding device, and signal encoding/decoding device
US5390329A (en) 1990-06-11 1995-02-14 Cray Research, Inc. Responding to service requests using minimal system-side context in a multiprocessor environment
JPH0454652A (ja) * 1990-06-25 1992-02-21 Nec Corp Microcomputer
US5347648A (en) 1990-06-29 1994-09-13 Digital Equipment Corporation Ensuring write ordering under writeback cache error conditions
CA2045790A1 (en) * 1990-06-29 1991-12-30 Richard Lee Sites Branch prediction in high-performance processor
US5432918A (en) 1990-06-29 1995-07-11 Digital Equipment Corporation Method and apparatus for ordering read and write operations using conflict bits in a write queue
US5404482A (en) 1990-06-29 1995-04-04 Digital Equipment Corporation Processor and method for preventing access to a locked memory block by recording a lock in a content addressable memory with outstanding cache fills
DE4129614C2 (de) * 1990-09-07 2002-03-21 Hitachi Ltd System and method for data processing
JP2508907B2 (ja) * 1990-09-18 1996-06-19 日本電気株式会社 Control method for delayed branch instructions
EP0553158B1 (en) * 1990-10-19 1994-12-28 Cray Research, Inc. A scalable parallel vector computer system
US5367678A (en) 1990-12-06 1994-11-22 The Regents Of The University Of California Multiprocessor system having statically determining resource allocation schedule at compile time and the using of static schedule with processor signals to control the execution time dynamically
US5394530A (en) 1991-03-15 1995-02-28 Nec Corporation Arrangement for predicting a branch target address in the second iteration of a short loop
EP0522513A2 (en) * 1991-07-09 1993-01-13 Hughes Aircraft Company High speed parallel microcode program controller
US5247675A (en) * 1991-08-09 1993-09-21 International Business Machines Corporation Preemptive and non-preemptive scheduling and execution of program threads in a multitasking operating system
US5255239A (en) 1991-08-13 1993-10-19 Cypress Semiconductor Corporation Bidirectional first-in-first-out memory device with transparent and user-testable capabilities
US5623489A (en) 1991-09-26 1997-04-22 Ipc Information Systems, Inc. Channel allocation system for distributed digital switching network
US5392412A (en) 1991-10-03 1995-02-21 Standard Microsystems Corporation Data communication controller for use with a single-port data packet buffer
US5392391A (en) 1991-10-18 1995-02-21 Lsi Logic Corporation High performance graphics applications controller
DE69231957T2 (de) 1991-10-21 2002-04-04 Toshiba Kawasaki Kk High-speed processor capable of handling multiple interrupts
US5452437A (en) 1991-11-18 1995-09-19 Motorola, Inc. Methods of debugging multiprocessor system
US5357617A (en) 1991-11-22 1994-10-18 International Business Machines Corporation Method and apparatus for substantially concurrent multiple instruction thread processing by a single pipeline processor
US5442797A (en) 1991-12-04 1995-08-15 Casavant; Thomas L. Latency tolerant risc-based multiple processor with event driven locality managers resulting from variable tagging
JP2823767B2 (ja) 1992-02-03 1998-11-11 松下電器産業株式会社 Register file
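KR100309566B1 (ko) 1992-04-29 2001-12-15 리패치 Method and apparatus for grouping multiple instructions, issuing the grouped instructions simultaneously, and executing the grouped instructions in a pipelined processor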
US5459842A (en) 1992-06-26 1995-10-17 International Business Machines Corporation System for combining data from multiple CPU write requests via buffers and using read-modify-write operation to write the combined data to the memory
DE4223600C2 (de) 1992-07-17 1994-10-13 Ibm Multiprocessor computer system and method for transferring control information and data information between at least two processor units of a computer system
US5274770A (en) 1992-07-29 1993-12-28 Tritech Microelectronics International Pte Ltd. Flexible register-based I/O microcontroller with single cycle instruction execution
US5442756A (en) * 1992-07-31 1995-08-15 Intel Corporation Branch prediction and resolution apparatus for a superscalar computer processor
US5692167A (en) * 1992-07-31 1997-11-25 Intel Corporation Method for verifying the correct processing of pipelined instructions including branch instructions and self-modifying code in a microprocessor
US5463746A (en) 1992-10-30 1995-10-31 International Business Machines Corp. Data processing system having prediction by using an embedded guess bit of remapped and compressed opcodes
US5481683A (en) * 1992-10-30 1996-01-02 International Business Machines Corporation Super scalar computer architecture using remand and recycled general purpose register to manage out-of-order execution of instructions
US5428779A (en) 1992-11-09 1995-06-27 Seiko Epson Corporation System and method for supporting context switching within a multiprocessor system having functional blocks that generate state programs with coded register load instructions
US5450603A (en) 1992-12-18 1995-09-12 Xerox Corporation SIMD architecture with transfer register or value source circuitry connected to bus
KR100313261B1 (ko) 1992-12-23 2002-02-28 앙드래베이너,조엘브르리아드 Low-power multitasking controller (title corrected)
US5404464A (en) 1993-02-11 1995-04-04 Ast Research, Inc. Bus control system and method that selectively generate an early address strobe
US5448702A (en) 1993-03-02 1995-09-05 International Business Machines Corporation Adapters with descriptor queue management capability
US5522069A (en) 1993-04-30 1996-05-28 Zenith Data Systems Corporation Symmetric multiprocessing system with unified environment and distributed system functions
WO1994027216A1 (en) 1993-05-14 1994-11-24 Massachusetts Institute Of Technology Multiprocessor coupling system with integrated compile and run time scheduling for parallelism
CA2122182A1 (en) 1993-05-20 1994-11-21 Rene Leblanc Method for rapid prototyping of programming problems
US5363448A (en) * 1993-06-30 1994-11-08 United Technologies Automotive, Inc. Pseudorandom number generation and cryptographic authentication
CA2107299C (en) 1993-09-29 1997-02-25 Mehrad Yasrebi High performance machine for switched communications in a heterogenous data processing network gateway
US5446736A (en) 1993-10-07 1995-08-29 Ast Research, Inc. Method and apparatus for connecting a node to a wireless network using a standard protocol
DE69415126T2 (de) 1993-10-21 1999-07-08 Sun Microsystems Inc Counterflow pipeline processor
EP0650117B1 (en) 1993-10-21 2002-04-10 Sun Microsystems, Inc. Counterflow pipeline
TW261676B (zh) * 1993-11-02 1995-11-01 Motorola Inc
US5450351A (en) 1993-11-19 1995-09-12 International Business Machines Corporation Content addressable memory implementation with random access memory
US6079014A (en) * 1993-12-02 2000-06-20 Intel Corporation Processor that redirects an instruction fetch pipeline immediately upon detection of a mispredicted branch while committing prior instructions to an architectural state
US5487159A (en) 1993-12-23 1996-01-23 Unisys Corporation System for processing shift, mask, and merge operations in one instruction
EP0661625B1 (en) * 1994-01-03 1999-09-08 Intel Corporation Method and apparatus for implementing a four stage branch resolution system in a computer processor
US5490204A (en) 1994-03-01 1996-02-06 Safco Corporation Automated quality assessment system for cellular networks
US5659722A (en) * 1994-04-28 1997-08-19 International Business Machines Corporation Multiple condition code branching system in a multi-processor environment
US5542088A (en) 1994-04-29 1996-07-30 Intergraph Corporation Method and apparatus for enabling control of task execution
US5544236A (en) 1994-06-10 1996-08-06 At&T Corp. Access to unsubscribed features
US5574922A (en) 1994-06-17 1996-11-12 Apple Computer, Inc. Processor with sequences of processor instructions for locked memory updates
FR2722041B1 (fr) * 1994-06-30 1998-01-02 Samsung Electronics Co Ltd Huffman decoder
US5655132A (en) * 1994-08-08 1997-08-05 Rockwell International Corporation Register file with multi-tasking support
US5640538A (en) 1994-08-22 1997-06-17 Adaptec, Inc. Programmable timing mark sequencer for a disk drive
US5717760A (en) * 1994-11-09 1998-02-10 Channel One Communications, Inc. Message protection system and method
WO1996017295A1 (en) * 1994-12-02 1996-06-06 Hyundai Electronics America, Inc. Limited run branch prediction
US5610864A (en) 1994-12-23 1997-03-11 Micron Technology, Inc. Burst EDO memory device with maximized write cycle timing
US5550816A (en) 1994-12-29 1996-08-27 Storage Technology Corporation Method and apparatus for virtual switching
US5649157A (en) 1995-03-30 1997-07-15 Hewlett-Packard Co. Memory controller with priority queues
JP3130446B2 (ja) * 1995-05-10 2001-01-31 松下電器産業株式会社 Program conversion device and processor
US5592622A (en) 1995-05-10 1997-01-07 3Com Corporation Network intermediate system with message passing architecture
US5541920A (en) 1995-06-15 1996-07-30 Bay Networks, Inc. Method and apparatus for a delayed replace mechanism for a streaming packet modification engine
KR0180169B1 (ko) * 1995-06-30 1999-05-01 배순훈 Variable-length encoder
US5613071A (en) 1995-07-14 1997-03-18 Intel Corporation Method and apparatus for providing remote memory access in a distributed memory multiprocessor system
US5933627A (en) * 1996-07-01 1999-08-03 Sun Microsystems Thread switch on blocked load or store using instruction thread field
US6058465A (en) * 1996-08-19 2000-05-02 Nguyen; Le Trong Single-instruction-multiple-data processing in a multimedia signal processor
US6061711A (en) * 1996-08-19 2000-05-09 Samsung Electronics, Inc. Efficient context saving and restoring in a multi-tasking computing system environment
CN1147785C (zh) * 1996-08-27 2004-04-28 松下电器产业株式会社 Multi-program-flow simultaneous processor that executes multiple instruction streams
DE69718278T2 (de) * 1996-10-31 2003-08-21 Texas Instruments Inc Method and system for single-cycle execution of successive iterations of an instruction loop
US5857104A (en) 1996-11-26 1999-01-05 Hewlett-Packard Company Synthetic dynamic branch prediction
US6088788A (en) * 1996-12-27 2000-07-11 International Business Machines Corporation Background completion of instruction and associated fetch request in a multithread processor
US6029228A (en) * 1996-12-31 2000-02-22 Texas Instruments Incorporated Data prefetching of a load target buffer for post-branch instructions based on past prediction accuracy's of branch predictions
EP0863462B8 (en) * 1997-03-04 2010-07-28 Panasonic Corporation Processor capable of efficiently executing many asynchronous event tasks
US5835705A (en) * 1997-03-11 1998-11-10 International Business Machines Corporation Method and system for performance per-thread monitoring in a multithreaded processor
US5996068A (en) * 1997-03-26 1999-11-30 Lucent Technologies Inc. Method and apparatus for renaming registers corresponding to multiple thread identifications
US5907702A (en) * 1997-03-28 1999-05-25 International Business Machines Corporation Method and apparatus for decreasing thread switch latency in a multithread processor
US6009515A (en) * 1997-05-30 1999-12-28 Sun Microsystems, Inc. Digital data processing system including efficient arrangement to support branching within trap shadows
GB2326253A (en) * 1997-06-10 1998-12-16 Advanced Risc Mach Ltd Coprocessor data access control
US6385720B1 (en) * 1997-07-14 2002-05-07 Matsushita Electric Industrial Co., Ltd. Branch prediction method and processor using origin information, relative position information and history information
US6243735B1 (en) * 1997-09-01 2001-06-05 Matsushita Electric Industrial Co., Ltd. Microcontroller, data processing system and task switching control method
US5926646A (en) * 1997-09-11 1999-07-20 Advanced Micro Devices, Inc. Context-dependent memory-mapped registers for transparent expansion of a register file
UA55489C2 (uk) * 1997-10-07 2003-04-15 Каналь+ Сосьєте Анонім Device for multithreaded data processing (variants)
US6567839B1 (en) * 1997-10-23 2003-05-20 International Business Machines Corporation Thread switch control in a multithreaded processor system
US6560629B1 (en) * 1998-10-30 2003-05-06 Sun Microsystems, Inc. Multi-thread processing

Also Published As

Publication number Publication date
ATE396449T1 (de) 2008-06-15
AU7097900A (en) 2001-03-26
CA2383528A1 (en) 2001-03-08
WO2001016758A3 (en) 2001-10-25
TWI220732B (en) 2004-09-01
CA2383532A1 (en) 2001-03-08
EP1242869B1 (en) 2011-11-16
EP1236094A1 (en) 2002-09-04
US7421572B1 (en) 2008-09-02
DE60044752D1 (de) 2010-09-09
AU7342900A (en) 2001-03-26
TW594562B (en) 2004-06-21
CN1402846A (zh) 2003-03-12
EP1236088B9 (en) 2008-10-08
EP1236092A4 (en) 2006-07-26
TW486667B (en) 2002-05-11
AU7098700A (en) 2001-03-26
EP1236094B1 (en) 2010-07-28
WO2001016758A9 (en) 2002-09-12
CN1402845A (zh) 2003-03-12
CA2383540A1 (en) 2001-03-08
AU7101200A (en) 2001-03-26
DE60038976D1 (de) 2008-07-03
HK1046566A1 (zh) 2003-01-17
EP1242867A4 (en) 2008-07-30
AU7340700A (en) 2001-03-26
CA2386558A1 (en) 2001-03-08
EP1236092A1 (en) 2002-09-04
WO2001018646A9 (en) 2002-09-12
TW486666B (en) 2002-05-11
CN1296818C (zh) 2007-01-24
EP1236097A1 (en) 2002-09-04
CN1387640A (zh) 2002-12-25
CN1184562C (zh) 2005-01-12
EP1236094A4 (en) 2006-04-19
EP1236093A1 (en) 2002-09-04
WO2001016722A1 (en) 2001-03-08
EP1242869A1 (en) 2002-09-25
TW571239B (en) 2004-01-11
TW569133B (en) 2004-01-01
TWI221251B (en) 2004-09-21
AU7340600A (en) 2001-04-10
EP1236093A4 (en) 2006-07-26
CA2383528C (en) 2008-06-17
WO2001016698A3 (en) 2002-01-17
AU7099000A (en) 2001-03-26
TW475148B (en) 2002-02-01
WO2001016698A2 (en) 2001-03-08
WO2001016716A1 (en) 2001-03-08
AU7098600A (en) 2001-03-26
EP1242867A2 (en) 2002-09-25
WO2001018646A1 (en) 2001-03-15
CN100342326C (zh) 2007-10-10
HK1046565A1 (zh) 2003-01-17
CA2383531A1 (en) 2001-03-08
HK1051730A1 (en) 2003-08-15
WO2001016758A2 (en) 2001-03-08
HK1051247A1 (en) 2003-07-25
CN1387642A (zh) 2002-12-25
EP1236088B1 (en) 2008-05-21
CN1271513C (zh) 2006-08-23
EP1242869A4 (en) 2006-10-25
HK1051728A1 (en) 2003-08-15
EP1236097A4 (en) 2006-08-02
WO2001016715A1 (en) 2001-03-08
CA2386562A1 (en) 2001-03-08
CN1402844A (zh) 2003-03-12
WO2001016715A9 (en) 2002-09-12
EP1236088A4 (en) 2006-04-19
WO2001016713A1 (en) 2001-03-08
HK1051729A1 (en) 2003-08-15
CN1254739C (zh) 2006-05-03
TW548584B (en) 2003-08-21
AU7098400A (en) 2001-03-26
HK1049902A1 (en) 2003-05-30
CN100351781C (zh) 2007-11-28
HK1049902B (zh) 2005-08-26
CA2383526A1 (en) 2001-03-15
AU7340400A (en) 2001-03-26
HK1046049A1 (zh) 2002-12-20
WO2001016714A1 (en) 2001-03-08
WO2001016714A9 (en) 2002-09-12
ATE475930T1 (de) 2010-08-15
TW559729B (en) 2003-11-01
EP1236088A1 (en) 2002-09-04
TW546585B (en) 2003-08-11
CN1399736A (zh) 2003-02-26
CN1390323A (zh) 2003-01-08
AU7098500A (en) 2001-03-26
CA2386558C (en) 2010-03-09

Similar Documents

Publication Publication Date Title
CN100474236C (zh) Multithreaded parallel processor and method for maintaining thread execution
US7991983B2 (en) Register set used in multithreaded parallel processor architecture
CN101221493B (zh) Method and apparatus for parallel processing
US6629237B2 (en) Solving parallel problems employing hardware multi-threading in a parallel processing environment
CA2391833C (en) Parallel processor architecture
CN100367257C (zh) SDRAM controller for parallel processor architecture
EP1214660A1 (en) Sram controller for parallel processor architecture
US7191309B1 (en) Double shift instruction for micro engine used in multithreaded parallel processor architecture
WO2001016697A2 (en) Local register instruction for micro engine used in multithreaded parallel processor architecture

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20090401

Termination date: 20170831