US20100205408A1 - Speculative Region: Hardware Support for Selective Transactional Memory Access Annotation Using Instruction Prefix

Info

Publication number
US20100205408A1
Authority
US
United States
Prior art keywords
instruction
instructions
transaction
processor
memory
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US12/764,024
Inventor
Jaewoong Chung
David S. Christie
Michael P. Hohmuth
Stephan Diestelhorst
Martin Pohlack
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Advanced Micro Devices Inc
Original Assignee
Advanced Micro Devices Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Advanced Micro Devices Inc
Priority to US12/764,024
Assigned to ADVANCED MICRO DEVICES, INC. reassignment ADVANCED MICRO DEVICES, INC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: DIESTELHORST, STEPHAN, POHLACK, MARTIN, HOHMUTH, MICHAEL P., CHRISTIE, DAVID S., CHUNG, JAEWOONG
Publication of US20100205408A1
Current legal status: Abandoned

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/466 Transaction processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/3004 Arrangements for executing specific machine instructions to perform operations on memory
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/30076 Arrangements for executing specific machine instructions to perform miscellaneous control operations, e.g. NOP
    • G06F9/30087 Synchronisation or serialisation instructions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30181 Instruction operation extension or modification
    • G06F9/30185 Instruction operation extension or modification according to one or more bits in the instruction, e.g. prefix, sub-opcode
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30181 Instruction operation extension or modification
    • G06F9/30189 Instruction operation extension or modification according to execution mode, e.g. mode flag
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3824 Operand accessing
    • G06F9/3834 Maintaining memory consistency
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3836 Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • G06F9/3842 Speculative instruction execution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3854 Instruction completion, e.g. retiring, committing or graduating
    • G06F9/3858 Result writeback, i.e. updating the architectural state or memory
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3854 Instruction completion, e.g. retiring, committing or graduating
    • G06F9/3858 Result writeback, i.e. updating the architectural state or memory
    • G06F9/38585 Result writeback, i.e. updating the architectural state or memory with result invalidation, e.g. nullification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3861 Recovery, e.g. branch miss-prediction, exception handling
    • G06F9/3863 Recovery, e.g. branch miss-prediction, exception handling using multiple copies of the architectural state, e.g. shadow registers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45504 Abstract machines for programme code execution, e.g. Java virtual machine [JVM], interpreters, emulators
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/468 Specific access rights for resources, e.g. using capability register
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/52 Program synchronisation; Mutual exclusion, e.g. by means of semaphores

Definitions

  • Transactional memory In a transactional memory programming model, a programmer may designate a section of code (i.e., an execution path or a set of program instructions) as a “transaction”, which a transactional memory system should execute atomically with respect to other threads of execution. For example, if the transaction includes two memory store operations, then the transactional memory system ensures that all other threads may only observe either the cumulative effects of both memory operations or of neither, but not the effects of only one.
  • memory accesses are sometimes executed one by one speculatively and committed all at once at the end of the transaction. Otherwise, if an abort condition is detected (e.g., data conflict with another processor), those memory operations that have been executed speculatively may be rolled back or dropped and the transaction may be reattempted. Data from speculative memory accesses may be saved in a speculative data buffer, which may be implemented by various hardware structures, such as an on-chip data cache.
  • HTMs hardware-based transactional memory proposals
  • these may be a product of limited hardware resources, such as the size of one or more speculative data buffers used to buffer speculative data during transactional execution.
  • a computer processor may be configured to implement a hardware transactional memory system.
  • the system may execute a transactional region of code such that only a subset of the instructions in the transactional region of code (e.g., those including a given instruction prefix) are executed as a single atomic memory transaction while the other instructions in the transactional region (e.g., those lacking the given prefix) are not necessarily executed atomically.
  • a computer system may be configured to determine whether an instruction within a plurality of instructions in a transactional region includes a given prefix.
  • the prefix indicates that one or more memory operations performed by the processor to complete the instruction are to be executed as part of an atomic transaction.
  • the atomic transaction may include memory operations performed by the processor to complete one or more others of the plurality of instructions in the transactional region.
  • the one or more memory operations performed atomically by the processor to complete the instruction may correspond to implicit memory operations. That is, if executing the instruction with the given prefix requires executing multiple implicit memory operations, then the processor may execute these multiple implicit memory operations atomically as part of the atomic transaction.
  • FIG. 1 is a block diagram illustrating components of a multi-processor computer system configured to implement selective annotation of transactions, according to various embodiments.
  • FIG. 2 is a flow diagram of a method for executing a transaction that uses selective annotation, according to some embodiments.
  • FIG. 3 is a block diagram illustrating the hardware structures of a processor configured to implement selective transactional annotation as described herein, according to some embodiments.
  • FIG. 5 illustrates a computing system configured to implement selective annotation as described herein, according to various embodiments.
  • a programmer may designate a region of code as a transaction, such as by using designated start and end instructions to demarcate the execution boundaries of the transaction.
  • hardware capacity limitations, such as speculative buffer sizes, constrain the number of speculative memory access operations that can be executed together atomically by a hardware transactional memory (HTM) system as part of a single transaction.
  • HTM hardware transactional memory
  • a computer processor may be configured to determine a first non-empty subset of instructions within a transactional region that are speculative and another non-empty subset of instructions within the transactional region that are non-speculative. The processor may then execute the transactional region such that the speculative subset is executed as a single atomic transaction and the non-speculative subset of instructions is not necessarily executed atomically.
  • a processor may be configured to differentiate speculative instructions from non-speculative instructions, based at least in part on a predefined speculative-instruction prefix.
  • some instruction set architectures e.g., x86 and other CISC architectures
  • x86 instructions include one to five optional prefix bytes, followed by an operation code (opcode) field, and optional addressing mode byte, scale-index-base byte, displacement field, and immediate data field.
  • a processor may be configured to determine that a given instruction of a transactional region is speculative dependent at least on the value of the instruction prefix.
  • a “prefix” refers to a portion of an instruction that is distinct from the opcode field (as well as from any operands of the instruction), where the opcode field specifies an operation to be performed by a processor.
  • a prefix further specifies the manner in which the processor performs the operation specified by the opcode.
  • the processor may comprise different transactional memory mechanisms for executing the speculative instructions of the transactional region as a single atomic transaction while executing the non-speculative instructions without guarantee of atomicity.
  • This differentiation between speculative and non-speculative execution for different subsets of instructions in a transaction may be referred to herein as selective annotation, and therefore, a processor in various embodiments may be configured to implement selectively annotated transactions.
  • executing the instruction may include performing one or more explicit or implicit memory access operations.
  • an “explicit” memory access operation is one that the processor performs as part of executing an explicit load or store instruction (e.g., an instruction for loading from or writing to a given target memory location specified by an operand, such as the x86 MOV instruction).
  • an “implicit” memory access operation is one that is performed by the processor as part of executing a type of instruction other than a load/store instruction, but one which necessitates that the processor perform a memory access to one or more memory locations specified by one or more operands.
  • an “ADD” instruction having an operand that specifies a memory location rather than a register or immediate value is an example of an instruction whose execution includes an implicit memory access operation.
  • FIG. 1 is a block diagram illustrating components of a multi-processor computer system configured to implement selective annotation of transactions, according to various embodiments.
  • computer system 100 may include multiple processors, such as processors 110a and 110b.
  • the processors may be coupled to each other and/or to a shared memory (e.g., 150) over an interconnect, such as 140.
  • different interconnects may be used, such as a shared system bus or a point-to-point network in various topologies (e.g., fully connected, torus, etc.).
  • processors 110 may comprise multiple physical or logical (e.g., SMT) cores, each capable of executing a respective thread of execution concurrently.
  • SMT simultaneous multithreading
  • selective annotation may be implemented by a single processor with multiple cores.
  • the embodiments outlined herein are described using single-core processors. Those skilled in the art will recognize that the methods and systems described herein apply also to multi-core processors.
  • each processor 110 may include one or more levels of memory caches 130.
  • Levels of memory caches may be hierarchically arranged (e.g., L1 cache, L2 cache, L3 cache, etc.) and may be used to cache local copies of values stored in shared memory 150.
  • memory caches 130 may include various cache-coherence mechanisms 132.
  • Cache-coherence mechanisms 132 may, in some embodiments, implement a cache coherence communication protocol among the interconnected processors, which may ensure that the values contained in memory caches 130 of each processor 110 are coherent with values stored in shared memory and/or in the memory caches of other processors.
  • MESI (i.e., the Illinois protocol)
  • MOESI protocols
  • Cache coherence protocols may define a set of messages and rules by which processors may inform one another of modifications to shared data and thereby maintain cache coherence.
  • under the MESI protocol, for example, each block stored in a cache must be marked as being in one of four states: modified, exclusive, shared, or invalid.
  • a given protocol defines a set of messages and rules for sending and interpreting those messages, by which processors maintain the proper markings on each block.
  • a processor may be restricted from performing certain operations. For example, a processor may not execute program instructions that depend on a cache block that is marked as invalid.
  • Cache coherence mechanisms may be implemented in hardware, software, or in a combination thereof, in different embodiments.
  • Cache coherence messages may be communicated across interconnect 140 and may be broadcast or point-to-point.
  • each processor 110 may also include various transactional memory mechanisms (e.g., 134) for implementing transactional memory, as described herein.
  • Transactional memory mechanisms 134 may include selective annotation mechanisms 136 for implementing selective annotation using instruction prefixes, as described herein.
  • more processors 110 may be connected to interconnect 140, and various levels of cache memories may be shared among multiple ones of such processors and/or among multiple cores on each processor.
  • FIG. 2 is a flow diagram of a method for executing a transaction that uses selective annotation, according to some embodiments.
  • Method 200 may be performed by a processor (e.g., 110) in a multiprocessor system. In one embodiment, method 200 may be performed by processor 110 executing a thread, and is described in this manner below.
  • method 200 begins when the thread of execution encounters a transactional region of code, and thus begins transactional execution, as in 205.
  • the processor may begin transactional execution (i.e., enter a transactional mode of execution) in response to executing an explicit instruction that indicates the beginning of a transactional region of code (e.g., TxBegin, SPECULATE, etc.).
  • the processor examines each instruction encountered within the transactional region of code to determine whether the instruction is speculative.
  • the processor may be configured to ensure that the set of instructions in the transaction that are determined to be speculative are executed together as a single atomic memory transaction. However, the processor need not make such a guarantee for any instructions in the transactional region of code that are not determined to be speculative.
  • the processor may determine which of the instructions are speculative based, at least in part, on a prefix (a “speculative instruction prefix”) that indicates that a particular instruction is to be executed speculatively (e.g., within a single atomic memory transaction).
  • CISC Complex Instruction Set Computers
  • an instruction may include an instruction prefix informing the processor that the instruction is speculative and should be executed atomically as part of the speculative instruction subset in the transaction body.
  • the speculative-instruction prefix may be a one-byte encoding using reserved register D6 or F1.
  • the prefix may be implemented as a two-byte encoding where the first byte is 0F (the escape byte) and the second byte is one of the unused encodings available.
  • Various other encodings may be possible.
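The prefix check described in the preceding items can be sketched as follows. This is a minimal illustration under stated assumptions, not the decoder of the application: the one-byte values D6 and F1 and the 0F escape byte come from the text, while the second byte of the two-byte form (`0x3F`) is a placeholder chosen only for this example, since the text says only that it is one of the unused encodings. The function name `has_spec_prefix` and the sample instruction bytes are hypothetical.

```python
# Hypothetical sketch: detect a speculative-instruction prefix on raw bytes.
ONE_BYTE_PREFIXES = {0xD6, 0xF1}   # one-byte encodings named in the text
ESCAPE_BYTE = 0x0F                 # first byte of the two-byte form
ASSUMED_SECOND_BYTE = 0x3F         # placeholder only; the text names no value


def has_spec_prefix(insn: bytes, two_byte_form: bool = False) -> bool:
    """Return True if the instruction bytes begin with a speculative prefix."""
    if two_byte_form:
        return (
            len(insn) >= 2
            and insn[0] == ESCAPE_BYTE
            and insn[1] == ASSUMED_SECOND_BYTE
        )
    return len(insn) >= 1 and insn[0] in ONE_BYTE_PREFIXES


# Sample data only: the trailing bytes stand in for an opcode and its operands.
annotated = bytes([0xF1, 0x89, 0x18])
plain = bytes([0x89, 0x18])
print(has_spec_prefix(annotated), has_spec_prefix(plain))
```

A decoder built this way would route annotated instructions to the speculative path and all others to the ordinary path.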
  • the processor checks for a speculative-instruction prefix, as in 210.
  • the instruction decoding unit of the processor (as shown in FIG. 3) is configured to determine whether a given instruction includes a speculative instruction prefix.
  • the processor executes the next instruction speculatively, as in 220.
  • the processor executes the instruction non-speculatively, as in 225. That is, instructions without the speculative instruction prefix may be executed and committed upon being retired, regardless of whether the transaction has been committed. Executing such non-speculative instructions may consume fewer or no transactional memory hardware resources, such as transactional buffer space.
  • different instructions may include one or more explicit or implicit memory access operations.
  • an explicit memory store e.g., MOV
  • other instructions e.g., ADD
  • an ADD instruction may require the processor to read the respective memory locations before performing the summation.
  • the ADD instruction implicitly requires two memory read operations.
  • executing a set of speculative instructions as a single atomic transaction includes performing both the explicit and implicit memory operations necessitated by the speculative instructions as a single atomic transaction. For instance, if the ADD instruction from the example above is determined to be speculative, then the two implicit memory operations involved in reading the operands are included in the set of memory operations that are executed by the processor as a single atomic transaction.
  • the processor continues executing instructions of the transaction (as indicated by the feedback loop from 245 to 210) until either an abort condition is detected (affirmative exit from 230) or no more instructions exist in the transaction (negative exit from 245). If all instructions have been executed (as indicated by the negative exit from 245), then the transaction is committed (as in 250), and the processor exits the transactional execution mode (as in 255) and continues normal execution (as in 260). Normal execution may include executing more transactions, as indicated by the feedback loop from 260 to 205.
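The control flow of method 200 can be approximated in software as below. This is a simplified sketch, not the hardware mechanism: the `Insn` record, the `abort_detected` callback, and the use of Python lists to stand in for the speculative buffer and committed state are all assumptions made for illustration. Step numbers in the comments refer to the flow diagram of FIG. 2 as described in the text.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Insn:
    name: str
    speculative: bool  # True if the instruction carries the speculative prefix


def run_transactional_region(
    region: List[Insn],
    abort_detected: Callable[[Insn], bool] = lambda insn: False,
) -> Tuple[str, List[str]]:
    """Walk a transactional region following the flow of method 200."""
    speculative_buffer: List[str] = []  # effects held back until commit
    visible: List[str] = []             # effects other threads can observe
    for insn in region:                           # loop from 245 back to 210
        if insn.speculative:                      # prefix check (210)
            speculative_buffer.append(insn.name)  # speculative execution (220)
        else:
            visible.append(insn.name)             # non-speculative execution (225)
        if abort_detected(insn):                  # abort condition? (230)
            speculative_buffer.clear()            # abort (235): drop speculative data
            return "aborted", visible
    visible.extend(speculative_buffer)            # commit (250)
    return "committed", visible


region = [Insn("store A", True), Insn("store B", False), Insn("store C", True)]
print(run_transactional_region(region))
print(run_transactional_region(region, lambda i: i.name == "store C"))
```

Note that in the aborted run the non-speculative store remains visible, matching the statement that such instructions are committed upon retirement regardless of the outcome of the transaction.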
  • aborting the transaction attempt may include dropping speculative data and/or metadata and/or undoing various speculatively executed memory operations.
  • aborts may be caused by different conditions. For example, if there is insufficient hardware capacity to buffer speculative data in the transaction (e.g., the transaction is too long), then the processor may determine that the transaction attempt must be aborted (as in 235). Buffering speculative data is discussed in more detail below.
  • an abort may be caused by memory contention—that is, interference caused by another processor attempting to access one or more memory locations accessed by the processor as part of executing one of the speculative instructions.
  • the processor may include contention detection mechanisms configured to detect various cache coherence messages (e.g., invalidating and/or non-invalidating probes) sent by other processors.
  • the contention detection mechanisms may determine whether a received probe is relevant to one or more memory areas accessed as part of executing a speculative instruction and whether the probe indicates that a data conflict exists.
  • the processor may abort the transaction attempt, as in 235.
  • a transaction may be aborted if an invalidating probe relevant to a speculatively-read memory location is received and/or if a non-invalidating probe relevant to a speculatively-written memory location is received.
  • as an example of detecting a data conflict, consider a first thread executing in transactional mode on a first processor that has accessed a memory location as part of executing a speculative instruction. If a second thread executing on a second processor subsequently attempts a store to the speculatively-accessed memory location, then the second processor may send an invalidating probe to the first processor in accordance with the particular cache coherence protocol deployed by the system. If the first processor receives the invalidating probe while the memory location is still protected (e.g., before the first thread commits its transaction or otherwise releases the memory location), then a data conflict may exist and the first processor may abort the transaction.
  • speculative data buffer may be implemented by various processor structures, such as one or more data caches, a load queue, store queue, combined load/store queues, etc.
  • different transactional memory mechanisms may follow different policies with respect to speculative data values resulting from speculative write operations (implicit and/or explicit).
  • the processor may implement a redo protocol where speculative data values are kept in a private speculative data buffer until commit time (e.g., 250), when they are then collectively exposed to other threads in the system. In the case of an abort, the speculative data in the speculative buffer may simply be dropped.
  • the processor may implement an undo policy, where the processor records a checkpoint at the start of the transaction and restores the checkpoint in the case of an abort, thereby overriding any data modified as part of executing one or more speculative instructions during the transaction.
  • Such techniques for redoing or undoing modifications to memory in response to a transaction committing or aborting may be referred to herein broadly as “versioning” and the metadata recorded to enable versioning may be referred to as “versioning data.”
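The two versioning policies can be contrasted with a small software model. This is a sketch under stated assumptions: memory is modeled as a Python dictionary, the class names `RedoBuffer` and `UndoLog` are hypothetical, and the undo variant records old values per address on first write as a stand-in for the checkpoint described in the text rather than reproducing it.

```python
_MISSING = object()  # marks an address that did not exist before the transaction


class RedoBuffer:
    """Redo policy: speculative stores stay private until commit."""

    def __init__(self, memory: dict):
        self.memory = memory
        self.pending = {}                    # private speculative data buffer

    def store(self, addr, value):
        self.pending[addr] = value           # not yet visible in memory

    def load(self, addr):
        # The transaction sees its own speculative values first.
        return self.pending.get(addr, self.memory.get(addr))

    def commit(self):
        self.memory.update(self.pending)     # expose all values together
        self.pending.clear()

    def abort(self):
        self.pending.clear()                 # simply drop speculative data


class UndoLog:
    """Undo policy: stores update memory in place; old values are logged."""

    def __init__(self, memory: dict):
        self.memory = memory
        self.old = {}                        # versioning data: first-seen old values

    def store(self, addr, value):
        self.old.setdefault(addr, self.memory.get(addr, _MISSING))
        self.memory[addr] = value

    def commit(self):
        self.old.clear()                     # keep the new values

    def abort(self):
        for addr, value in self.old.items():
            if value is _MISSING:
                self.memory.pop(addr, None)
            else:
                self.memory[addr] = value    # restore the pre-transaction value
        self.old.clear()
```

The trade-off is visible in the two `commit` and `abort` methods: the redo variant makes aborts cheap and commits do the copying, while the undo variant makes commits cheap and aborts do the restoring.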
  • the ISA may support nested transactions so that another transaction can begin within a currently executing transaction.
  • the HTM implementation may support such nesting in different ways, such as by subsuming inner transactions into the outermost transaction (i.e., flattening) or by treating an inner transaction as a separate transaction, independent of the transaction that contains it.
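Flattened nesting is commonly modeled with a depth counter, as in the sketch below. The counter approach and the class name `FlattenedNesting` are assumptions made for illustration; the text does not say how an implementation tracks nesting.

```python
class FlattenedNesting:
    """Hypothetical model: only the outermost transaction end commits."""

    def __init__(self):
        self.depth = 0     # number of currently open (nested) transactions
        self.commits = 0   # number of commits actually performed

    def begin(self):
        self.depth += 1    # inner begins only deepen the nesting level

    def end(self) -> bool:
        """Close one level; return True if this end performed the commit."""
        if self.depth == 0:
            raise RuntimeError("transaction end without a matching begin")
        self.depth -= 1
        if self.depth == 0:
            self.commits += 1
            return True
        return False
```

Under this model, an abort at any depth would discard the speculative state of the whole flattened transaction, since inner transactions have no separate identity.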
  • FIG. 3 is a block diagram illustrating the hardware structures of a processor configured to implement selective transactional annotation as described herein, according to some embodiments.
  • processor 300 includes an instruction fetch unit 305 configured to fetch program instructions to execute, a decoder 310 configured to decode/interpret the fetched instructions, and a scheduler 315 configured to schedule the instructions for execution.
  • decoder 310 may be configured to recognize various transactional memory instructions, such as TxBegin for starting a transaction, TxEnd for ending a transaction, and the Tx prefix for determining which instructions in a transaction are speculative based on the use of selective annotation.
  • scheduler 315 is configured to dispatch instructions to the proper execution units, such as Load Store Unit 320 for memory instructions and Arithmetic Logic Unit 345 for arithmetic instructions. Both execution units 320 and 345 are configured to communicate with a register file 340, which contains operand data and/or various other register values.
  • processor 300 may include a shadow register file 335 configured to store a register file checkpoint of register file 340.
  • processor 300 may take a register checkpoint, such as by storing a backup copy of the current values held in various registers of register file 340 (e.g., program counter register).
  • the checkpoint values may be restored from the shadow register file 335 to the register file 340.
  • program control flow may be returned to the start of the transaction by restoring the value of the program counter register stored in shadow register file 335 to register file 340.
  • the transaction initiating instruction may accept a parameter indicating an alternative address for the program counter saved in the checkpoint operation, such that in case of an abort and/or rollback operation, the program execution could be made to jump to the alternative address.
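The checkpoint-and-restore behavior of the shadow register file can be sketched as follows. This is an illustrative model only: the `Core` class, the register names, and the `abort_pc` parameter name are assumptions, with `abort_pc` standing in for the alternative program counter address that the transaction initiating instruction may accept.

```python
class Core:
    """Hypothetical model of a register file with a shadow checkpoint."""

    def __init__(self):
        self.regs = {"pc": 0, "r0": 0}  # stands in for register file 340
        self.shadow = None              # stands in for shadow register file 335

    def tx_begin(self, abort_pc=None):
        # Take a checkpoint of the current register values.
        self.shadow = dict(self.regs)
        if abort_pc is not None:
            # Optionally record an alternative address to resume at on abort.
            self.shadow["pc"] = abort_pc

    def tx_abort(self):
        # Restore the checkpoint; control returns to the saved program counter.
        if self.shadow is None:
            raise RuntimeError("no active checkpoint")
        self.regs = dict(self.shadow)
        self.shadow = None


core = Core()
core.regs["pc"] = 100
core.tx_begin(abort_pc=500)
core.regs.update(pc=120, r0=7)   # work done inside the transaction
core.tx_abort()
print(core.regs)
```

Without `abort_pc`, the restored program counter is the one saved at the start of the transaction, so the transaction is simply reattempted.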
  • When processor 300 is executing in a transactional mode (e.g., a TxBegin instruction has been executed and no corresponding TxEnd instruction has been executed), it may perform some instructions speculatively and perform other instructions non-speculatively.
  • the processor may be configured to determine which instructions to execute speculatively based at least in part on decoder 310 detecting a speculative instruction prefix. For example, the processor may be configured to execute those instructions that include a given Tx prefix speculatively, while executing those instructions without the given prefix non-speculatively.
  • the processor may be configured to store versioning data in various components, such as data cache 350 and/or load-store unit 320.
  • processor 300 includes data cache 350, which is configured to store data from recently-accessed memory regions.
  • the data cache 350 may be arranged into multiple cache lines 352a-352n, each identified by one or more tags and each storing data (e.g., data 354) from recently-accessed regions of memory (e.g., shared memory 150 of FIG. 1).
  • data cache 350 may be used to implement a speculative buffer for buffering speculative transactional data.
  • each cache line 352 in data cache 350 may include versioning data, such as one or more associated transaction flags usable to indicate whether the data in the cache line has been accessed by the processor as part of executing a speculative instruction and/or the nature of such accesses.
  • processor 300 includes TW flag 356 , which is usable to indicate whether data in the cache line has been transactionally written (i.e., written as part of executing a speculative instruction) and TR flag 358 usable to indicate whether data in the cache line has been transactionally read.
  • TW flag 356 and/or TR flag 358 may comprise any suitable number of bits.
  • processor 300 may also buffer speculative data in a load and/or store queue, such as store queue 325 and load queue 330 in load/store unit 320.
  • load queue 330 may hold data indicative of an issued load instruction that has not yet been retired and store queue 325 may hold data indicative of an issued store instruction that has not yet been retired.
  • store queue 325 and load queue 330 may include one or more entries 322a-322n for storing such data.
  • each such entry may comprise a TW flag or TR flag, such as flags 356 and 358, indicating whether the data is associated with a speculative instruction.
  • the speculative data buffer may be implemented by one or more data caches (e.g., 350), in load and store queues (e.g., 330 and 325, respectively), a combined load/store queue, or any combination thereof.
  • speculative data from retired instructions may be moved from store queue 325 or load queue 330 into data cache 350.
  • LS unit 320 may be configured to detect whether such a transfer would overflow the capacity of data cache 350 to buffer all the speculative data of an active transaction, and in response, to delay flushing the speculative data and instead maintain it in the load or store queues.
  • Processor 300 may be configured to detect cache coherency probes sent from other processors via on-chip network 380, such as by using conflict detection unit 370.
  • Conflict detection unit 370 may receive a cache coherency probe (e.g., sent as part of a cache coherency protocol, such as MESI) and in response, check the speculative buffer implemented by data cache 350 and/or LS Unit 320 to determine if a data conflict exists. For example, conflict detection unit 370 may check the tag of each cache line 352 to determine if the received probe matches the cache line tag and check the TW flag 356 and/or TR flag 358 to determine whether the data contained in the cache line is speculative. In some embodiments, based on these determinations, the conflict detection unit 370 may determine whether the probe indicates a data conflict.
  • conflict detection unit 370 may detect a conflict if a received probe matches an entry in data cache 350 or LS Unit 320 and the entry has the TW flag set. Also, conflict detection unit 370 may detect a conflict if a received probe matches an entry and the probe indicates a write operation (e.g., an invalidating probe).
  • conflict detection unit 370 may determine that the probe indicates a data conflict and therefore signal an abort condition.
  • conflict detection unit 370 may invoke the microcoded transaction abort handler 365 in microcode ROM 360, which may invalidate the cache entries whose TW bits are set, clear all TW/TR bits, restore the register checkpoint taken when the transaction began, and/or flush the instruction pipeline. Since the checkpoint has been restored, including the old program counter, the execution flow then returns to the start of the transaction. Alternatively, if the transaction reaches TxEnd, it may be committed, which may include clearing all TW/TR bits and discarding the register checkpoint.
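  • The abort and commit steps described above can be sketched in software as follows. This is only an illustrative model, not the microcoded handler itself: the `CacheLine` class, the dictionary-based register file, and the function names are assumptions introduced for the example, and the pipeline flush is omitted.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CacheLine:
    """Hypothetical model of one cache line with its versioning flags."""
    tag: int
    valid: bool = True
    tw: bool = False  # transactionally written
    tr: bool = False  # transactionally read


def abort_transaction(cache: List[CacheLine],
                      registers: Dict[str, int],
                      checkpoint: Dict[str, int]) -> None:
    """Discard speculative writes and roll the registers back."""
    for line in cache:
        if line.tw:
            line.valid = False          # speculative data must not become visible
        line.tw = line.tr = False       # clear all TW/TR bits
    registers.clear()
    registers.update(checkpoint)        # includes the saved program counter


def commit_transaction(cache: List[CacheLine]) -> None:
    """Keep the data and clear the flags so the lines become ordinary."""
    for line in cache:
        line.tw = line.tr = False
```

  • In this model, a line that was only transactionally read stays valid after an abort, because its contents were never modified speculatively; only its TR bit is cleared.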
  • processor 300 may implement transaction workflow 200 of FIG. 2.
  • processor 300 may begin a transaction (i.e., enter transactional execution mode) as in 205 by executing a TxBegin instruction recognized by decoder 310.
  • Executing the TxBegin instruction may include storing a checkpoint in shadow register file 335.
  • This checkpoint may include the values held in various registers in register file 340, including the program counter value that can be used to roll back the transaction in case of an abort.
  • the decoder 310 may determine if the instruction includes the speculative instruction prefix (e.g., TX), as in 210. If the instruction includes the prefix, then decoder 310 determines that the instruction should be executed speculatively as part of the transaction, as indicated by the affirmative exit from 215. As in method 200 of FIG. 2, each instruction determined to be speculative is executed speculatively, as in 220.
  • FIG. 4 is a flow diagram illustrating a method by which processor 300 may execute a speculative memory access operation (as in 220), according to some embodiments.
  • Method 400 begins when the decoder (and/or other component(s) of the processor) determines (as in 405 ) whether executing the given prefixed instruction necessitates executing one or more implicit memory operations. If so, as indicated by the affirmative exit from 405 to 410 , the processor may split the instruction into multiple simpler instructions (e.g., RISC-style instructions), which may include a respective explicit memory access instruction (e.g., MOV) for each of the implicit memory operations that executing the prefixed instruction requires. However if the prefixed instruction is already an explicit memory access instruction, then the processor may skip step 410 , as indicated by the negative exit from 405 .
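  • The splitting step (405 to 410) can be illustrated with a small sketch. The tuple-based instruction format, the bracket notation for memory operands, and the temporary names `t0`, `t1` are all assumptions made for this example; the sketch also handles only memory source operands (implicit reads), not memory destinations.

```python
from typing import List, Tuple

# Hypothetical instruction form: (opcode, destination, sources).
Instr = Tuple[str, str, Tuple[str, ...]]


def is_memory_operand(operand: str) -> bool:
    """Treat operands written as "[...]" as memory references."""
    return operand.startswith("[") and operand.endswith("]")


def split_implicit_accesses(instr: Instr) -> List[Instr]:
    """Rewrite implicit memory reads as explicit MOV loads into temporaries."""
    opcode, dest, sources = instr
    if opcode == "MOV":                        # already explicit: skip step 410
        return [instr]
    micro_ops: List[Instr] = []
    new_sources = []
    for i, src in enumerate(sources):
        if is_memory_operand(src):             # implicit read becomes a load
            temp = f"t{i}"
            micro_ops.append(("MOV", temp, (src,)))
            new_sources.append(temp)
        else:
            new_sources.append(src)
    micro_ops.append((opcode, dest, tuple(new_sources)))
    return micro_ops
```

  • For example, `split_implicit_accesses(("ADD", "r3", ("[r1]", "[r2]")))` yields two explicit loads followed by a register-only ADD; in the scheme described above, it is those explicit loads that would be dispatched with the speculate signal.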
  • method 400 then includes dispatching the explicit memory instruction(s) to load/store unit 320 with a speculate signal, as in 415.
  • the speculate signal may indicate to LS unit 320 that the instruction is to be executed speculatively. This dispatching may be performed via an instruction scheduler, such as scheduler 315.
  • if the instruction is a store operation, as indicated by the affirmative exit from 420, then it may be transferred to store queue 325 for execution.
  • the store operation data may be stored in store queue 325 in one of entries 322, and the TW flag of the entry may be set, as in 425.
  • the data in the respective entry 322 may be sent to data cache 350 for buffering, as in 430, the TW flag in the store queue may be cleared, as in 435, and the TW flag of the new entry in the data cache may be set, as in 440.
  • the instruction may be a load operation and may be transferred to load queue 330 for execution.
  • the load operation may be stored in load queue 330 in one of entries 322, and the TR flag of the entry may be set, as in 445.
  • the instruction is retired (as in 450 ), the TR flag of the respective load queue entry is cleared (as in 455 ), and the TR flag of the data cache entry for the loaded data is set (as in 460 ).
  • decoder 310 does not send a speculate signal to LS unit 320.
  • such instructions may be executed non-speculatively, as in 225.
  • the processor is configured to not record versioning data, such as the TR and/or TW flags, for the instruction.
  • conflict detection unit 370 may invoke abort handler 365 to perform an abort, as in 235.
  • performing the abort may include invalidating entries of data cache 350 and/or of LS unit 320 whose TW flag is set and then clearing all TR and/or TW flags.
  • abort handler 365 may then restore the register checkpoint taken at the start of the transaction, including the old program counter value.
  • the abort procedure may return program control to the start of the transaction, as in 240, allowing the processor to reattempt execution.
  • decoder 310 may be configured to detect whether an instruction that initiates transactional execution (e.g., TxBegin) includes the speculative instruction prefix (e.g., TX). In such embodiments, if the transaction-initiating instruction (e.g., TxBegin) includes the speculative instruction prefix, the processor is configured to treat every explicit and implicit memory access instruction within the transaction as speculative.
  • FIG. 5 illustrates a computing system configured to implement selective annotation as described herein, according to various embodiments.
  • the computer system 500 may be any of various types of devices, including, but not limited to, a personal computer system, desktop computer, laptop or notebook computer, mainframe computer system, handheld computer, workstation, network computer, a consumer device, application server, storage device, a peripheral device such as a switch, modem, router, etc, or in general any type of computing device.
  • Computer system 500 may include one or more processors 570, each of which may include multiple cores, any of which may be single or multi-threaded.
  • the processor may be manufactured by configuring a semiconductor fabrication facility through the use of various mask works. These mask works may be created/generated by the use of netlists, HDL, GDS data, etc.
  • the computer system 500 may also include one or more persistent storage devices 550 (e.g. optical storage, magnetic storage, hard drive, tape drive, solid state memory, etc) and one or more memories 510 (e.g., one or more of cache, SRAM, DRAM, RDRAM, EDO RAM, DDR 12 RAM, SDRAM, Rambus RAM, EEPROM, etc.).
  • Various embodiments may include fewer or additional components not illustrated in FIG. 5 (e.g., video cards, audio cards, additional network interfaces, peripheral devices, a network interface such as an ATM interface, an Ethernet interface, a Frame Relay interface, etc.).
  • the one or more processors 570, the storage device(s) 550, and the system memory 510 may be coupled to the system interconnect 540.
  • One or more of the system memories 510 may contain program instructions 520.
  • Program instructions 520 may include program instructions executable to implement one or more multithreaded applications 522 and operating systems 524.
  • Program instructions 520 may be encoded in platform native binary, any interpreted language such as Java™ byte-code, or in any other language such as C/C++, Java™, etc., or in any combination thereof.
  • Any number of program instructions 520 may include a speculative instruction prefix as described herein for selective annotation of speculative regions.
  • Each processor 570 may include a decoder unit for recognizing instructions of program instructions 520 usable to signal the start of a transactional region (e.g., TxBegin), the end of a transactional region (e.g., TxEnd), and/or a speculative-instruction prefix (e.g., TX), as described herein.
  • Program instructions 520 may be provided on a computer readable storage medium.
  • the computer-readable storage medium may include any tangible (non-transitory) mechanism for storing information in a form (e.g., software, processing application) readable by a machine (e.g., a computer).
  • the computer-readable storage medium may include, but is not limited to, magnetic storage medium (e.g., floppy diskette); optical storage medium (e.g., CD-ROM); magneto-optical storage medium; read only memory (ROM); random access memory (RAM); erasable programmable memory (e.g., EPROM and EEPROM); flash memory; electrical, or other types of medium suitable for storing program instructions.
  • a computer-readable storage medium as described above can be used in some embodiments to store instructions read by a program and used, directly or indirectly, to fabricate the hardware comprising system processor 570.
  • the instructions may describe one or more data structures describing a behavioral-level or register-transfer level (RTL) description of the hardware functionality in a high level design language (HDL) such as Verilog or VHDL.
  • the description may be read by a synthesis tool, which may synthesize the description to produce a netlist.
  • the netlist may comprise a set of gates (e.g., defined in a synthesis library), which represent the functionality of processor 570.
  • the netlist may then be placed and routed to produce a data set describing geometric shapes to be applied to masks.
  • the masks may then be used in various semiconductor fabrication steps to produce a semiconductor circuit or circuits corresponding to processor 570.
  • the database may be the netlist (with or without the synthesis library) or the data set, as

Abstract

A computer system and method is disclosed for executing selectively annotated transactional regions. The system is configured to determine whether an instruction within a plurality of instructions in a transactional region includes a given prefix. The prefix indicates that one or more memory operations performed by the processor to complete the instruction are to be executed as part of an atomic transaction. The atomic transaction can include one or more other memory operations performed by the processor to complete one or more others of the plurality of instructions in the transactional region.

Description

  • This application is a continuation-in-part of U.S. application Ser. No. 12/510,884, filed Jul. 28, 2009, which claims the benefit of priority to U.S. Provisional Application No. 61/084,008, filed Jul. 28, 2008, both of which are incorporated by reference herein in their entireties.
  • BACKGROUND
  • Shared-memory computer systems allow multiple concurrent threads of execution to access shared memory locations. Unfortunately, writing correct multi-threaded programs is difficult due to the complexities of coordinating concurrent memory access. One approach to concurrency control between multiple threads of execution is transactional memory. In a transactional memory programming model, a programmer may designate a section of code (i.e., an execution path or a set of program instructions) as a “transaction”, which a transactional memory system should execute atomically with respect to other threads of execution. For example, if the transaction includes two memory store operations, then the transactional memory system ensures that all other threads may only observe either the cumulative effects of both memory operations or of neither, but not the effects of only one.
  • To implement transactional memory, memory accesses are sometimes executed one by one speculatively and committed all at once at the end of the transaction. Otherwise, if an abort condition is detected (e.g., data conflict with another processor), those memory operations that have been executed speculatively may be rolled back or dropped and the transaction may be reattempted. Data from speculative memory accesses may be saved in a speculative data buffer, which may be implemented by various hardware structures, such as an on-chip data cache.
  • Various transactional memory systems have been proposed in the past, including those implemented by software, by hardware, or by a combination thereof. However, many traditional implementations are bound by various limitations. For example, hardware-based transactional memory proposals (HTMs) sometimes impose limitations on the size of transactions supported (i.e., maximum number of speculative memory operations that can be executed before the transaction is committed). Often, this may be a product of limited hardware resources, such as the size of one or more speculative data buffers used to buffer speculative data during transactional execution.
  • SUMMARY
  • In various embodiments, a computer processor may be configured to implement a hardware transactional memory system. The system may execute a transactional region of code such that only a subset of the instructions in the transactional region of code (e.g., those including a given instruction prefix) are executed as a single atomic memory transaction while the other instructions in the transactional region (e.g., those lacking the given prefix) are not necessarily executed atomically.
  • In some embodiments, a computer system may be configured to determine whether an instruction within a plurality of instructions in a transactional region includes a given prefix. The prefix indicates that one or more memory operations performed by the processor to complete the instruction are to be executed as part of an atomic transaction. The atomic transaction may include memory operations performed by the processor to complete one or more others of the plurality of instructions in the transactional region.
  • In some embodiments, the one or more memory operations performed atomically by the processor to complete the instruction may correspond to implicit memory operations. That is, if executing the instruction with the given prefix requires executing multiple implicit memory operations, then the processor may execute these multiple implicit memory operations atomically as part of the atomic transaction.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 is a block diagram illustrating components of a multi-processor computer system configured to implement selective annotation of transactions, according to various embodiments.
  • FIG. 2 is a flow diagram of a method for executing a transaction that uses selective annotation, according to some embodiments.
  • FIG. 3 is a block diagram illustrating the hardware structures of a processor configured to implement selective transactional annotation as described herein, according to some embodiments.
  • FIG. 4 is a flow diagram illustrating a method by which processor 300 may execute a speculative memory access operation (as in 220), according to some embodiments.
  • FIG. 5 illustrates a computing system configured to implement selective annotation as described herein, according to various embodiments.
  • Any headings used herein are for organizational purposes only and are not meant to limit the scope of the description or the claims. As used herein, the word “may” is used in a permissive sense (i.e., meaning having the potential to) rather than the mandatory sense (i.e. meaning must). Similarly, the words “include”, “including”, and “includes” mean including, but not limited to.
  • DETAILED DESCRIPTION OF EMBODIMENTS
  • In transactional programming models, a programmer may designate a region of code as a transaction, such as by using designated start and end instructions to demarcate the execution boundaries of the transaction. In many implementations, hardware capacity limitations, such as speculative buffer sizes, constrain the number of speculative memory access operations that can be executed together atomically by a hardware transactional memory (HTM) system as part of a single transaction.
  • In traditional implementations, all memory accesses that occur within a designated transaction are executed together atomically. However, in some cases, correct program semantics may not strictly require that all memory operations within a given transaction be executed together atomically.
  • According to various embodiments, a computer processor may be configured to determine a first non-empty subset of instructions within a transactional region that are speculative and another non-empty subset of instructions within the transactional region that are non-speculative. The processor may then execute the transactional region such that the speculative subset is executed as a single atomic transaction and the non-speculative subset of instructions is not necessarily executed atomically.
  • In some embodiments, a processor may be configured to differentiate speculative instructions from non-speculative instructions, at least in part based on a predefined speculative-instruction prefix. For example, some instruction set architectures (e.g., x86 and other CISC architectures) include various potentially variable-length instructions that may include optional prefix fields. For instance, x86 instructions include up to five optional prefix bytes, followed by an operation code (opcode) field, and an optional addressing mode byte, scale-index-base byte, displacement field, and immediate data field. According to various embodiments, a processor may be configured to determine that a given instruction of a transactional region is speculative dependent at least on the value of the instruction prefix. As used herein, a “prefix” refers to a portion of an instruction that is distinct from the opcode field (as well as from any operands of the instruction), where the opcode field specifies an operation to be performed by a processor. A prefix further specifies the manner in which the processor performs the operation specified by the opcode.
  • In various embodiments, the processor may comprise different transactional memory mechanisms for executing the speculative instructions of the transactional region as a single atomic transaction while executing the non-speculative instructions without guarantee of atomicity. This differentiation between speculative and non-speculative execution for different subsets of instructions in a transaction may be referred to herein as selective annotation, and therefore, a processor in various embodiments may be configured to implement selectively annotated transactions.
  • For each of the speculative instructions, executing the instruction may include performing one or more explicit or implicit memory access operations. As used herein, an “explicit” memory access operation is one that the processor performs as part of executing an explicit load or store instruction (e.g., an instruction for loading from or writing to a given target memory location specified by an operand, such as the x86 MOV instruction). As used herein, an “implicit” memory access operation is one that is performed by the processor as part of executing a type of instruction other than a load/store instruction, but one which necessitates that the processor perform a memory access to one or more memory locations specified by one or more operands. For example, an “ADD” instruction having an operand that specifies a memory location rather than a register or immediate value is an example of instruction whose execution includes an implicit memory access operation.
  • FIG. 1 is a block diagram illustrating components of a multi-processor computer system configured to implement selective annotation of transactions, according to various embodiments. According to the illustrated embodiment, computer system 100 may include multiple processors, such as processors 110a and 110b. In different embodiments, the processors (e.g., 110) may be coupled to each other and/or to a shared memory (e.g., 150) over an interconnect, such as 140. In various embodiments, different interconnects may be used, such as a shared system bus or a point-to-point network in various topologies (e.g., fully connected, torus, etc.).
  • In some embodiments, processors 110 may comprise multiple physical or logical (e.g., SMT) cores, each capable of executing a respective thread of execution concurrently. In some such embodiments, selective annotation may be implemented by a single processor with multiple cores. However, for clarity of explanation, the embodiments outlined herein are described using single-core processors. Those skilled in the art will recognize that the methods and systems described herein apply also to multi-core processors.
  • According to the illustrated embodiment, each processor 110 may include one or more levels of memory caches 130. Levels of memory caches may be hierarchically arranged (e.g., L1 cache, L2 cache, L3 cache, etc.) and may be used to cache local copies of values stored in shared memory 150.
  • In various embodiments, memory caches 130 may include various cache-coherence mechanisms 132. Cache-coherence mechanisms 132 may, in some embodiments, implement a cache coherence communication protocol among the interconnected processors, which may ensure that the values contained in memory caches 130 of each processor 110 are coherent with values stored in shared memory and/or in the memory caches of other processors. Several such protocols exist (including the MESI (i.e., Illinois protocol) and MOESI protocols), and may be implemented in various embodiments. Cache coherence protocols may define a set of messages and rules by which processors may inform one another of modifications to shared data and thereby maintain cache coherence. For example, according to the MESI protocol, each block stored in a cache must be marked as being in one of four states: modified, exclusive, shared, or invalid. A given protocol defines a set of messages and rules for sending and interpreting those messages, by which processors maintain the proper markings on each block. Depending on the state of a given cache block, a processor may be restricted from performing certain operations. For example, a processor may not execute program instructions that depend on a cache block that is marked as invalid. Cache coherence mechanisms may be implemented in hardware, software, or in a combination thereof, in different embodiments. Cache coherence messages may be communicated across interconnect 140 and may be broadcast or point-to-point.
  • According to the illustrated embodiment, each processor 110 may also include various transactional memory mechanisms (e.g., 134) for implementing transactional memory, as described herein. Transactional memory mechanisms 134 may include selective annotation mechanisms 136 for implementing selective annotation using instruction prefixes, as described herein. In various embodiments, more processors 110 may be connected to interconnect 140, and various levels of cache memories may be shared among multiple ones of such processors and/or among multiple cores on each processor.
  • FIG. 2 is a flow diagram of a method for executing a transaction that uses selective annotation, according to some embodiments. Method 200 may be performed by a processor (e.g., 110) in a multiprocessor system. In one embodiment, method 200 may be performed by processor 110 executing a thread, and is described in this manner below.
  • According to the illustrated embodiment, method 200 begins when the thread of execution encounters a transactional region of code, and thus begins transactional execution, as in 205. In some embodiments, the processor may begin transactional execution (i.e., enter a transactional mode of execution) in response to executing an explicit instruction that indicates the beginning of a transactional region of code (e.g., TxBegin, SPECULATE, etc.).
  • In some embodiments, once the transactional execution begins, the processor examines each instruction encountered within the transactional region of code to determine whether the instruction is speculative. In one embodiment, the processor may be configured to ensure that the set of instructions in the transaction that are determined to be speculative are executed together as a single atomic memory transaction. However, the processor need not make such a guarantee for any instructions in the transactional region of code that are not determined to be speculative.
  • According to various embodiments, the processor may determine which of the instructions are speculative based, at least in part, on a prefix (a “speculative instruction prefix”) that indicates that a particular instruction is to be executed speculatively (e.g., within a single atomic memory transaction). For example, various Complex Instruction Set Computers (CISC), such as x86, allow each instruction to include (or exclude) a prefix field, which can be used to provide the processor with special directions for executing an instruction specified by the opcode portion of the instruction encoding. In some embodiments, an instruction may include an instruction prefix informing the processor that the instruction is speculative and should be executed atomically as part of the speculative instruction subset in the transaction body. For example, the speculative-instruction prefix may be a one-byte encoding using reserved byte D6 or F1. In other embodiments, the prefix may be implemented as a two-byte encoding where the first byte is 0F (escaping byte) and the second byte is one of the unused encodings available. Various other encodings may be possible.
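  • A decoder-side check for a one-byte speculative-instruction prefix might look like the following sketch. The byte values D6 and F1 are taken from the example encodings mentioned above; the function name and the byte-string representation of an instruction are assumptions, and the two-byte (0F-escaped) form is left out because its second byte is not specified here.

```python
from typing import Tuple

# Example one-byte prefix values from the text; treated here as an assumption,
# since other encodings are possible.
SPECULATIVE_PREFIXES = {0xD6, 0xF1}


def split_speculative_prefix(encoding: bytes) -> Tuple[bool, bytes]:
    """Return (is_speculative, remaining_bytes) for one encoded instruction."""
    if encoding and encoding[0] in SPECULATIVE_PREFIXES:
        return True, encoding[1:]      # strip the prefix, keep opcode and operands
    return False, encoding             # no prefix: execute non-speculatively
```

  • The returned flag corresponds to the decision the decoder makes before forwarding the instruction: prefixed instructions join the atomic transaction, while unprefixed ones are passed through unchanged.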
  • According to the illustrated embodiment, for each instruction in the transaction, the processor checks for a speculative-instruction prefix, as in 210. For example, in one embodiment, the instruction decoding unit of the processor (as shown in FIG. 3) is configured to determine whether a given instruction includes a speculative instruction prefix.
  • If the examined instruction includes the speculative instruction prefix, as indicated by the affirmative exit from 215, the processor executes the instruction speculatively, as in 220. Alternatively, if the instruction does not include the speculative-instruction prefix, as indicated by the negative exit from 215, the processor executes the instruction non-speculatively, as in 225. That is, instructions without the speculative instruction prefix may be executed and committed upon being retired, regardless of whether the transaction has been committed. Executing such non-speculative instructions may consume fewer or no transactional memory hardware resources, such as transactional buffer space.
  • As discussed above, different instructions may include one or more explicit or implicit memory access operations. For example, an explicit memory store (e.g., MOV) instruction explicitly instructs the processor to perform a memory access to a given memory location specified in an operand. However, other instructions (e.g., ADD) may include one or more operands that specify memory locations that contain data needed by the processor to execute the full instruction. In such cases, the processor must perform one or more memory read operations as part of executing the instruction, meaning that the instruction contains one or more implicit memory references. For example, if an ADD instruction includes three register operands (two that contain respective locations in memory where the values to be added are stored and a third register operand for storing the summation) then executing the ADD instruction may require the processor to read the respective memory locations before performing the summation. In this case, the ADD instruction implicitly requires two memory read operations.
  • In some embodiments, executing a set of speculative instructions as a single atomic transaction includes performing both the explicit and implicit memory operations necessitated by the speculative instructions as a single atomic transaction. For instance, if the ADD instruction from the example above is determined to be speculative, then the two implicit memory operations involved in reading the operands are included in the set of memory operations that are executed by the processor as a single atomic transaction.
  • As shown in FIG. 2, the processor continues executing instructions of the transaction (as indicated by the feedback loop from 245 to 210) until either an abort condition is detected (affirmative exit from 230) or no more instructions exist in the transaction (negative exit from 245). If all instructions have been executed (as indicated by the negative exit from 245), then the transaction is committed (as in 250), and the processor exits the transactional execution mode (as in 255) and continues normal execution (as in 260). Normal execution may include executing more transactions, as indicated by the feedback loop from 260 to 205.
  • Otherwise, if an abort condition is detected before the transaction commits (as indicated by the affirmative exit from 230), then the processor aborts the transaction attempt (as in 235), rolls back execution to the start of the transaction (as in 240), and reattempts to execute the transaction (as indicated by the feedback loop from 240 to 210). In some embodiments, aborting the transaction attempt may include dropping speculative data and/or metadata and/or undoing various speculatively executed memory operations.
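  • By way of illustration only, the execute/abort/reattempt loop of FIG. 2 may be modeled in software. The following Python sketch is a simplified model under stated assumptions, not the hardware described herein: the names `run_transaction` and `should_abort` are hypothetical, instructions are reduced to `(speculative, address, value)` writes, and the `max_attempts` bound is an assumption added so the example terminates (the workflow above simply reattempts).

```python
def run_transaction(instructions, memory, should_abort, max_attempts=8):
    """Model of the FIG. 2 loop (hypothetical software sketch).

    instructions: list of (speculative, address, value) writes.
    should_abort(attempt, index): hypothetical hook that models detection
    of an abort condition (230) before instruction `index` executes.
    Returns the attempt number on which the transaction committed.
    """
    for attempt in range(1, max_attempts + 1):
        spec_buffer = {}                    # speculative data for this attempt
        aborted = False
        for index, (speculative, addr, value) in enumerate(instructions):
            if should_abort(attempt, index):
                aborted = True              # abort (235): drop spec_buffer
                break
            if speculative:
                spec_buffer[addr] = value   # buffered until commit
            else:
                memory[addr] = value        # not protected by the transaction
        if not aborted:
            memory.update(spec_buffer)      # commit (250)
            return attempt
        # roll back to the start of the transaction (240) and reattempt
    raise RuntimeError("transaction did not commit within max_attempts")
```

  • In this model, a write performed by a non-speculative instruction remains in `memory` even when the attempt that performed it is aborted, while speculative writes only reach `memory` at commit.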
  • In various embodiments, aborts may be caused by different conditions. For example, if there is insufficient hardware capacity to buffer speculative data in the transaction (e.g., the transaction is too long), then the processor may determine that the transaction attempt must be aborted (as in 235). Buffering speculative data is discussed in more detail below. In another example, an abort may be caused by memory contention, that is, interference caused by another processor attempting to access one or more memory locations accessed by the processor as part of executing one of the speculative instructions. In various embodiments, the processor may include contention detection mechanisms configured to detect various cache coherence messages (e.g., invalidating and/or non-invalidating probes) sent by other processors. The contention detection mechanisms may determine whether a received probe is relevant to one or more memory areas accessed as part of executing a speculative instruction and whether the probe indicates that a data conflict exists. In response to detecting a data conflict, the processor may abort the transactional attempt, as in 235.
  • According to various embodiments, a transaction may be aborted if an invalidating probe relevant to a speculatively-read memory location is received and/or if a non-invalidating probe relevant to a speculatively-written memory location is received. In one example of detecting a data conflict, consider a first thread executing in transactional mode on a first processor and having an access to a memory location as part of executing a speculative instruction. If a second thread executing on a second processor subsequently attempts a store to the speculatively-accessed memory location, then the second processor may send an invalidating probe to the first processor in accordance with the particular cache coherence protocol deployed by the system. If the first processor receives the invalidating probe while the memory location is still protected (e.g., before the first thread commits its transaction or otherwise releases the memory location) then a data conflict may exist and the first processor may abort the transaction.
  • Once a transaction is committed, as in 250, all values written explicitly and/or implicitly by the processor as part of executing the speculative instructions in the transaction become visible to all other threads in the system atomically. However, data values read or written as part of executing the non-speculative instructions are not protected as part of the transactional execution.
  • In various embodiments, different mechanisms and/or techniques may be used to implement transactional memory. For example, data accessed by one or more speculative instructions may be marked as speculative in a speculative data buffer, which may be implemented by various processor structures, such as one or more data caches, a load queue, store queue, combined load/store queues, etc.
  • Additionally, different transactional memory mechanisms may follow different policies with respect to speculative data values resulting from speculative write operations (implicit and/or explicit). For example, in some embodiments, the processor may implement a redo protocol where speculative data values are kept in a private speculative data buffer until commit time (e.g., 250), when they are then collectively exposed to other threads in the system. In the case of an abort, the speculative data in the speculative buffer may simply be dropped. In other embodiments, the processor may implement an undo policy, where the processor records a checkpoint at the start of the transaction and restores the checkpoint in the case of an abort, thereby overriding any data modified as part of executing one or more speculative instructions during the transaction. Various other combinations are possible. Such techniques for redoing or undoing modifications to memory in response to a transaction committing or aborting may be referred to herein broadly as “versioning” and the metadata recorded to enable versioning may be referred to as “versioning data.”
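  • By way of illustration only, the two versioning policies described above may be contrasted in software. The Python sketch below is a simplified model: the class names `RedoBuffer` and `UndoLog` are hypothetical, shared memory is modeled as a dictionary, and the undo variant records the old value of each address on first write rather than a full checkpoint at the start of the transaction, which is an assumption made here for brevity.

```python
_ABSENT = object()  # marks an address that had no value before the transaction


class RedoBuffer:
    """Redo policy: speculative writes stay private until commit."""

    def __init__(self, memory):
        self.memory = memory
        self.pending = {}

    def write(self, addr, value):
        self.pending[addr] = value          # memory is untouched for now

    def read(self, addr):
        # The transaction sees its own speculative writes first.
        return self.pending.get(addr, self.memory.get(addr))

    def commit(self):
        self.memory.update(self.pending)    # expose all values together
        self.pending.clear()

    def abort(self):
        self.pending.clear()                # simply drop speculative data


class UndoLog:
    """Undo policy: write in place, keep old values for rollback."""

    def __init__(self, memory):
        self.memory = memory
        self.old_values = {}

    def write(self, addr, value):
        # Record the pre-transaction value only on the first write.
        self.old_values.setdefault(addr, self.memory.get(addr, _ABSENT))
        self.memory[addr] = value

    def commit(self):
        self.old_values.clear()             # new values are already in place

    def abort(self):
        for addr, old in self.old_values.items():
            if old is _ABSENT:
                self.memory.pop(addr, None)
            else:
                self.memory[addr] = old
        self.old_values.clear()
```

  • In both classes the recorded state (`pending` or `old_values`) plays the role of the versioning data: it is what allows a commit or an abort to be carried out.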
  • In various embodiments, the ISA may support nested transactions so that another transaction can begin within a currently executing transaction. In different embodiments, the HTM implementation may support such nesting in different ways, such as by subsuming the inner transactions into the outer transaction (i.e., flattening) or by treating an inner transaction as a separate, independent transaction from the transaction that contains it.
  • FIG. 3 is a block diagram illustrating the hardware structures of a processor configured to implement selective transactional annotation as described herein, according to some embodiments. In the illustrated embodiment, processor 300 includes an instruction fetch unit 305 configured to fetch program instructions to execute, a decoder 310 configured to decode/interpret the fetched instructions, and a scheduler 315 configured to schedule the instructions for execution. In various embodiments, decoder 310 may be configured to recognize various transactional memory instructions, such as TxBegin for starting a transaction, TxEnd for ending a transaction, and the Tx prefix for determining which instructions in a transaction are speculative based on the use of selective annotation.
  • In the illustrated embodiment, scheduler 315 is configured to dispatch instructions to the proper execution units, such as Load Store Unit 320 for memory instructions and Arithmetic Logic Unit 345 for arithmetic instructions. Both execution units 320 and 345 are configured to communicate with a register file 340, which contains operand data and/or various other register values.
  • According to the illustrated embodiment, processor 300 may include a shadow register file 335 configured to store a register file checkpoint of register file 340. For example, as part of executing a TxBegin instruction, processor 300 may take a register checkpoint, such as by storing a backup copy of the current values held in various registers of register file 340 (e.g., program counter register). In the event of a transaction abort, the checkpoint values may be restored from the shadow register file 335 to the register file 340. For instance, if the transaction is aborted, program control flow may be returned to the start of the transaction by restoring the value of the program counter register stored in shadow register file 335 to register file 340. In some embodiments, the transaction initiating instruction may accept a parameter indicating an alternative address for the program counter saved in the checkpoint operation, such that in case of an abort and/or rollback operation, the program execution could be made to jump to the alternative address.
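  • By way of illustration only, the register checkpoint and the optional alternative abort address may be modeled as follows. The Python sketch is a simplified model: the class `RegisterFile`, its method names, and the `abort_pc` parameter are hypothetical names introduced here, and discarding the checkpoint after an abort is an assumption (a reattempt is assumed to take a fresh checkpoint).

```python
class RegisterFile:
    """Register file with a shadow copy (hypothetical software model)."""

    def __init__(self, **regs):
        self.regs = dict(regs)
        self.shadow = None              # plays the role of shadow register file 335

    def tx_begin(self, abort_pc=None):
        # Take a checkpoint of the current register values.
        self.shadow = dict(self.regs)
        if abort_pc is not None:
            # Optional parameter: resume at an alternative address on abort.
            self.shadow["pc"] = abort_pc

    def tx_abort(self):
        # Restore the checkpoint, including the saved program counter.
        self.regs = dict(self.shadow)
        self.shadow = None              # assumption: a retry re-checkpoints

    def tx_end(self):
        # Commit: the checkpoint is no longer needed.
        self.shadow = None
```

  • For example, after `tx_begin(abort_pc=500)` and a subsequent `tx_abort()`, the modeled program counter holds 500 rather than the address at which the transaction began.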
  • When processor 300 is executing in a transactional mode (e.g., a TxBegin instruction has been executed and no corresponding TxEnd instruction has been executed), it may perform some instructions speculatively and perform other instructions non-speculatively. The processor may be configured to determine which instructions to execute speculatively based at least in part on decoder 310 detecting a speculative instruction prefix. For example, the processor may be configured to execute those instructions that include a given Tx prefix speculatively, while executing those instructions without the given prefix non-speculatively.
  • In various embodiments, the processor may be configured to store versioning data in various components, such as data cache 350 and/or load-store unit 320. For example, in the illustrated embodiment, processor 300 includes data cache 350, which is configured to store data from recently-accessed memory regions. The data cache 350 may be arranged into multiple cache lines 352 a-352 n, each identified by one or more tags and each storing data (e.g., data 354) from recently-accessed regions of memory (e.g., shared memory 150 of FIG. 1).
  • In addition to buffering data from recently-accessed regions, data cache 350 may be used to implement a speculative buffer for buffering speculative transactional data. For example, in some embodiments, each cache line 352 in data cache 350 may include versioning data, such as one or more associated transaction flags usable to indicate whether the data in the cache line has been accessed by the processor as part of executing a speculative instruction and/or the nature of such accesses. For example, in the illustrated embodiment, processor 300 includes TW flag 356, which is usable to indicate whether data in the cache line has been transactionally written (i.e., written as part of executing a speculative instruction) and TR flag 358 usable to indicate whether data in the cache line has been transactionally read. In various embodiments, TW flag 356 and/or TR flag 358 may comprise any suitable number of bits.
  • In addition to buffering speculative data in data cache 350, processor 300 may also buffer speculative data in a load and/or store queue, such as store queue 325 and load queue 330 in load/store unit 320. In some embodiments, load queue 330 may hold data indicative of an issued load instruction that has not yet been retired and store queue 325 may hold data indicative of an issued store instruction that has not yet been retired. For example, store queue 325 and load queue 330 may include one or more entries 322 a-322 n for storing such data. In various embodiments, each such entry may comprise a TW flag or TR flag, such as flags 356 and 358, indicating whether the data is associated with a speculative instruction.
  • In different embodiments, the speculative data buffer may be implemented by one or more data caches (e.g., 350), in load and store queues (e.g., 330 and 325 respectively), a combined load/store queue, or any combination thereof. For example, in some embodiments, speculative data from retired instructions may be moved from store queue 325 or load queue 330 into data cache 350. In some embodiments, LS unit 320 may be configured to detect whether such a transfer would overflow the capacity of data cache 350 to buffer all the speculative data of an active transaction, and in response, to delay flushing the speculative data and instead maintain it in the load or store queues.
  • According to the illustrated embodiment, processor 300 also includes an on-chip network 380 usable by multiple processors (and/or processing cores) to communicate with one another. In some embodiments, on-chip network 380 may be analogous to interconnect 140 of FIG. 1, and may implement various network topologies.
  • Processor 300 may be configured to detect cache coherency probes sent from other processors via on-chip network 380, such as by using conflict detection unit 370. Conflict detection unit 370 may receive a cache coherency probe (e.g., sent as part of a cache coherency protocol, such as MESI) and in response, check the speculative buffer implemented by data cache 350 and/or LS Unit 320 to determine if a data conflict exists. For example, conflict detection unit 370 may check the tag of each cache line 352 to determine if the received probe matches the cache line tag and check the TW flag 356 and/or TR flag 358 to determine whether the data contained in the cache line is speculative. In some embodiments, based on these determinations, the conflict detection unit 370 may determine whether the probe indicates a data conflict.
  • In some embodiments, a data conflict occurs if two processors have accessed a location in shared memory and at least one processor has written to it. Therefore, conflict detection unit 370 may detect a conflict if a received probe matches an entry in data cache 350 or LS Unit 320 and the entry has the TW flag set. Also, conflict detection unit 370 may detect a conflict if a received probe matches an entry and the probe indicates a write operation (e.g., an invalidating probe). In one example, if the probe indicates that the sending processor has read data from a memory location that matches the tag of 352 a in data cache 350, and TW flag 356 indicates that processor 300 has modified that data speculatively within an active (i.e., not yet committed) transaction, then conflict detection unit 370 may determine that the probe indicates a data conflict and therefore signal an abort condition.
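  • By way of illustration only, the conflict rule described above may be expressed as a small predicate. The Python sketch below is a simplified model: `CacheLine` and `probe_conflicts` are hypothetical names, a probe is reduced to a tag plus an `invalidating` flag, and only a single cache line is checked.

```python
from dataclasses import dataclass


@dataclass
class CacheLine:
    tag: int
    tw: bool = False   # transactionally written
    tr: bool = False   # transactionally read


def probe_conflicts(line, probe_tag, invalidating):
    """Return True if a received probe indicates a data conflict."""
    if line.tag != probe_tag:
        return False               # probe is not relevant to this line
    if line.tw:
        return True                # any remote access to speculatively written data
    return line.tr and invalidating  # remote write to speculatively read data
```

  • Under this rule a non-invalidating probe that matches a line with only the TR flag set is not a conflict, since two processors reading the same location do not interfere.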
  • In response to detecting a conflict, conflict detection unit 370 may invoke the microcoded transaction abort handler 365 in microcode ROM 360, which may invalidate the cache entries with the TW bits, clear all TW/TR bits, restore the register checkpoint taken when the transaction began, and/or flush the instruction pipeline. Since the checkpoint has been restored, including the old program counter, the execution flow then returns to the start of the transaction. Alternatively, if the transaction reaches TxEnd, it may be committed, which may include clearing all TW/TR bits and discarding the register checkpoint.
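  • By way of illustration only, the effect of the abort and commit paths on the flags in the data cache may be modeled as follows. The Python sketch is a simplified model: the function names are hypothetical, the cache is a dictionary from tag to a line record, and restoring the register checkpoint and flushing the pipeline are omitted.

```python
def abort_transaction(cache):
    """cache maps tag -> {"data", "tw", "tr"}; it is mutated in place."""
    # Invalidate every line holding speculatively written data.
    for tag in [t for t, line in cache.items() if line["tw"]]:
        del cache[tag]
    # Clear the remaining flags; surviving lines have TW clear already.
    for line in cache.values():
        line["tr"] = False


def commit_transaction(cache):
    """Keep all data and clear the TW/TR flags."""
    for line in cache.values():
        line["tw"] = False
        line["tr"] = False
```

  • The asymmetry is the point of the sketch: an abort discards lines whose TW flag is set, whereas a commit keeps their data and only clears the flags.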
  • In some embodiments, processor 300 may implement transaction workflow 200 of FIG. 2. For example, processor 300 may begin a transaction (i.e., enter transactional execution mode) as in 205 by executing a TxBegin instruction recognized by decoder 310. Executing the TxBegin instruction may include storing a checkpoint in shadow register file 335. This checkpoint may include the values held in various registers in register file 340, including the program counter value that can be used to roll back the transaction in case of an abort.
  • For each instruction in the transaction, the decoder 310 may determine if the instruction includes the speculative instruction prefix (e.g., TX), as in 210. If the instruction includes the prefix, then decoder 310 determines that the instruction should be executed speculatively as part of the transaction, as indicated by the affirmative exit from 215. As in method 200 of FIG. 2, each instruction determined to be speculative is executed speculatively, as in 220.
  • FIG. 4 is a flow diagram illustrating a method by which processor 300 may execute a speculative memory access operation (as in 220), according to some embodiments.
  • Method 400 begins when the decoder (and/or other component(s) of the processor) determines (as in 405) whether executing the given prefixed instruction necessitates executing one or more implicit memory operations. If so, as indicated by the affirmative exit from 405 to 410, the processor may split the instruction into multiple simpler instructions (e.g., RISC-style instructions), which may include a respective explicit memory access instruction (e.g., MOV) for each of the implicit memory operations that executing the prefixed instruction requires. However, if the prefixed instruction is already an explicit memory access instruction, then the processor may skip step 410, as indicated by the negative exit from 405.
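  • By way of illustration only, the splitting step (410) may be modeled as follows. The Python sketch is a simplified model: `split_prefixed`, the micro-op tuple layout `(opcode, destination, sources, speculate)`, and the temporary names `tmp0`, `tmp1` are all hypothetical, and only source operands are considered.

```python
def split_prefixed(opcode, dest, sources):
    """Split a prefixed instruction into simpler micro-ops.

    sources: list of ("reg", name) or ("mem", address) operands.
    Each memory source becomes an explicit LOAD flagged as speculative;
    the remaining register-only operation carries no speculate flag.
    """
    micro_ops = []
    regs = []
    for i, (kind, value) in enumerate(sources):
        if kind == "mem":
            temp = f"tmp{i}"                       # hypothetical temporary
            micro_ops.append(("LOAD", temp, (value,), True))
            regs.append(temp)
        else:
            regs.append(value)
    micro_ops.append((opcode, dest, tuple(regs), False))
    return micro_ops
```

  • For a prefixed ADD with two memory sources, the model yields two speculative LOAD micro-ops followed by a register-only ADD; an instruction with no memory sources passes through as a single micro-op.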
  • In the illustrated embodiment, method 400 then includes dispatching the explicit memory instruction(s) to load/store unit 320 with a speculate signal, as in 415. The speculate signal may indicate to LS unit 320 that the instruction is to be executed speculatively. This dispatching may be performed via an instruction scheduler, such as scheduler 315.
  • According to method 400, if the instruction is a store operation, as indicated by the affirmative exit from 420, then it may be transferred to store queue 325 for execution. Thus, the store operation data may be stored in store queue 325 in one of entries 322, and the TW flag of the entry may be set, as in 425. Once the instruction is executed, the data in the respective entry 322 may be sent to data cache 350 for buffering, as in 430, the TW flag in the store queue may be cleared as in 435, and the TW flag of the new entry in the data cache may be set, as in 440.
  • According to method 400, if the explicit memory access operation is not a store operation, as indicated by the negative exit from 420, then the instruction may be a load operation and may be transferred to load queue 330 for execution. Thus, the load operation may be stored in load queue 330 in one of entries 322, and the TR flag of the entry may be set, as in 445. Once the instruction is executed, the instruction is retired (as in 450), the TR flag of the respective load queue entry is cleared (as in 455), and the TR flag of the data cache entry for the loaded data is set (as in 460).
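  • By way of illustration only, the hand-off of the TW and TR flags from the store and load queues to the data cache (steps 425 through 460) may be modeled as follows. The Python sketch is a simplified model: `LoadStoreUnit` and its method names are hypothetical, queues retire in order, and the backing `memory` dictionary used to satisfy loads is an assumption.

```python
class LoadStoreUnit:
    """Software model of flag hand-off from queues to the data cache."""

    def __init__(self, memory):
        self.memory = memory      # hypothetical backing store for loads
        self.store_queue = []
        self.load_queue = []
        self.cache = {}           # address -> {"data", "tw", "tr"}

    def _line(self, addr):
        return self.cache.setdefault(
            addr, {"data": None, "tw": False, "tr": False})

    def dispatch_store(self, addr, value, speculate):
        # 425: queue entry carries the TW flag if the speculate signal is set.
        self.store_queue.append({"addr": addr, "value": value, "tw": speculate})

    def retire_store(self):
        entry = self.store_queue.pop(0)
        line = self._line(entry["addr"])
        line["data"] = entry["value"]                     # 430: send to cache
        speculative, entry["tw"] = entry["tw"], False     # 435: clear queue flag
        line["tw"] = line["tw"] or speculative            # 440: set cache flag

    def dispatch_load(self, addr, speculate):
        # 445: queue entry carries the TR flag if the speculate signal is set.
        self.load_queue.append({"addr": addr, "tr": speculate})

    def retire_load(self):
        entry = self.load_queue.pop(0)                    # 450: retire
        line = self._line(entry["addr"])
        if line["data"] is None:
            line["data"] = self.memory.get(entry["addr"])
        speculative, entry["tr"] = entry["tr"], False     # 455: clear queue flag
        line["tr"] = line["tr"] or speculative            # 460: set cache flag
        return line["data"]
```

  • A store or load dispatched with `speculate=False` passes through the same queues in this model but never sets a flag, so no versioning data is recorded for it.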
  • For instructions that do not have the speculative instruction prefix (e.g., negative exit from 215 in FIG. 2), decoder 310 does not send a speculate signal to LS unit 320. Thus, such instructions may be executed non-speculatively, as in 225. When an instruction is executed non-speculatively, the processor is configured to not record versioning data, such as the TR and/or TW flags, for the instruction.
  • If conflict detection unit 370 detects an abort condition (as in 230), then it may invoke abort handler 365 to perform an abort, as in 235. In some embodiments, performing the abort may include invalidating entries of data cache 350 and/or of LS unit 320 whose TW flag is set and then clearing all TR and/or TW flags. In performing the abort, abort handler 365 may then restore the register checkpoint taken at the start of the transaction, including the old program counter value. Thus, the abort procedure may return program control to the start of the transaction, as in 240, allowing the processor to reattempt execution.
  • In some instances, it may be desirable for every instruction in a transaction to be treated as speculative. For example, if an application invokes a function implemented in legacy code that does not use transactions, but correct program semantics dictate that the legacy function should be executed transactionally, then it may be desirable to indicate to the system that all explicit and/or implicit memory access operations performed by the legacy function should be treated as speculative. To accommodate such use cases, in some embodiments, decoder 310 may be configured to detect whether an instruction that initiates transactional execution (e.g., TxBegin) includes the speculative instruction prefix (e.g., TX). In such embodiments, if the transaction-initiating instruction (e.g., TxBegin) includes the speculative instruction prefix, the processor is configured to treat every explicit and implicit memory access instruction within the transaction as speculative.
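  • By way of illustration only, the decoder's decision, including the case of a prefixed transaction-initiating instruction, may be modeled as follows. The Python sketch is a simplified model: `classify` is a hypothetical name, each instruction is reduced to a `(has_prefix, mnemonic)` pair, and nesting is not modeled.

```python
def classify(instructions):
    """Return (mnemonic, speculative) for each non-delimiter instruction.

    instructions: list of (has_prefix, mnemonic) in program order.
    A prefixed TxBegin marks every instruction in the region as
    speculative; otherwise only individually prefixed instructions are.
    """
    result = []
    in_tx = False
    all_speculative = False
    for has_prefix, mnemonic in instructions:
        if mnemonic == "TxBegin":
            in_tx = True
            all_speculative = has_prefix
            continue
        if mnemonic == "TxEnd":
            in_tx = False
            all_speculative = False
            continue
        result.append((mnemonic, in_tx and (has_prefix or all_speculative)))
    return result
```

  • With an unprefixed TxBegin, only the prefixed instructions in the region are reported as speculative; with a prefixed TxBegin, every instruction up to the TxEnd is.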
  • FIG. 5 illustrates a computing system configured to implement selective annotation as described herein, according to various embodiments. The computer system 500 may be any of various types of devices, including, but not limited to, a personal computer system, desktop computer, laptop or notebook computer, mainframe computer system, handheld computer, workstation, network computer, a consumer device, application server, storage device, a peripheral device such as a switch, modem, router, etc., or in general any type of computing device.
  • Computer system 500 may include one or more processors 570, each of which may include multiple cores, any of which may be single or multi-threaded. The processor may be manufactured by configuring a semiconductor fabrication facility through the use of various mask works. These mask works may be created/generated by the use of netlists, HDL, GDS data, etc.
  • The computer system 500 may also include one or more persistent storage devices 550 (e.g., optical storage, magnetic storage, hard drive, tape drive, solid state memory, etc.) and one or more memories 510 (e.g., one or more of cache, SRAM, DRAM, RDRAM, EDO RAM, DDR 12 RAM, SDRAM, Rambus RAM, EEPROM, etc.). Various embodiments may include fewer or additional components not illustrated in FIG. 5 (e.g., video cards, audio cards, peripheral devices, additional network interfaces such as an ATM interface, an Ethernet interface, a Frame Relay interface, etc.).
  • The one or more processors 570, the storage device(s) 550, and the system memory 510 may be coupled to the system interconnect 540. One or more of the system memories 510 may contain program instructions 520. Program instructions 520 may include program instructions executable to implement one or more multithreaded applications 522 and operating systems 524. Program instructions 520 may be encoded in platform native binary, any interpreted language such as Java™ byte-code, or in any other language such as C/C++, Java™, etc., or in any combination thereof.
  • Any number of program instructions 520 may include a speculative instruction prefix as described herein for selective annotation of speculative regions. Each processor 570 may include a decoder unit for recognizing instructions of program instructions 520 usable to signal the start of a transactional region (e.g., TxBegin), the end of a transactional region (e.g., TxEnd), and/or a speculative-instruction prefix (e.g., TX), as described herein.
  • Program instructions 520, such as those used to implement multithreaded applications 522 and/or operating system 524, may be provided on a computer readable storage medium. The computer-readable storage medium may include any tangible (non-transitory) mechanism for storing information in a form (e.g., software, processing application) readable by a machine (e.g., a computer). The computer-readable storage medium may include, but is not limited to, magnetic storage medium (e.g., floppy diskette); optical storage medium (e.g., CD-ROM); magneto-optical storage medium; read only memory (ROM); random access memory (RAM); erasable programmable memory (e.g., EPROM and EEPROM); flash memory; electrical, or other types of medium suitable for storing program instructions.
  • A computer-readable storage medium as described above can be used in some embodiments to store instructions read by a program and used, directly or indirectly, to fabricate the hardware comprising system processor 570. For example, the instructions may describe one or more data structures describing a behavioral-level or register-transfer level (RTL) description of the hardware functionality in a high level design language (HDL) such as Verilog or VHDL. The description may be read by a synthesis tool, which may synthesize the description to produce a netlist. The netlist may comprise a set of gates (e.g., defined in a synthesis library), which represent the functionality of processor 570. The netlist may then be placed and routed to produce a data set describing geometric shapes to be applied to masks. The masks may then be used in various semiconductor fabrication steps to produce a semiconductor circuit or circuits corresponding to processor 570. Alternatively, the database may be the netlist (with or without the synthesis library) or the data set, as desired.
  • The scope of the present disclosure includes any feature or combination of features disclosed herein (either explicitly or implicitly), or any generalization thereof, whether or not it mitigates any or all of the problems addressed herein. Accordingly, new claims may be formulated during prosecution of this application (or an application claiming priority thereto) to any such combination of features. In particular, with reference to the appended claims, features from dependent claims may be combined with those of the independent claims and features from respective independent claims may be combined in any appropriate manner and not merely in the specific combinations enumerated in the appended claims.
  • Although the embodiments above have been described in considerable detail, numerous variations and modifications will become apparent to those skilled in the art once the above disclosure is fully appreciated. It is intended that the following claims be interpreted to embrace all such variations and modifications.

Claims (21)

1. An apparatus, comprising:
a computer processor configured to determine whether an instruction within a plurality of instructions in a transactional region of code includes a prefix indicating that one or more memory operations performed by the computer processor to complete the instruction are to be executed as part of an atomic transaction that includes memory operations performed by the computer processor to complete at least one other of the plurality of instructions.
2. The apparatus of claim 1, wherein the computer processor is further configured to determine that at least some other of the plurality of instructions do not include the prefix and in response, to execute those instructions non-atomically.
3. The apparatus of claim 1, wherein the one or more memory operations performed by the computer processor to complete the instruction are implicit memory operations of the instruction.
4. The apparatus of claim 1, wherein execution of the one or more memory operations includes buffering versioning data for the instruction in a data cache of the processor.
5. The apparatus of claim 1, wherein the computer processor comprises a decoder unit and at least one execution unit, wherein the decoder is configured to determine that the instruction includes the prefix and to send the instruction to the at least one execution unit with an indication that the instruction is speculative.
6. The apparatus of claim 1, wherein the computer processor is configured to execute all of the plurality of instructions as speculative instructions in response to an opcode portion of the instruction indicating the start of the transactional region of code.
7. The apparatus of claim 1, wherein at least one other of the plurality of instructions also includes the prefix indicating that one or more memory operations performed by the computer processor to complete the at least one other instruction are to be executed as part of the atomic transaction.
8. The apparatus of claim 1, wherein the computer processor is configured to detect an abort condition while executing the transactional region of code, and, in response thereto, to abort execution of the transactional region of code, at least by undoing modifications to values stored in memory as a result of executing one or more speculative instructions within the plurality of instructions without undoing modifications to one or more other values stored in memory as a result of executing one or more non-speculative instructions within the plurality of instructions.
9. A method comprising:
a computer processor detecting a transactional region of code having a plurality of instructions; and
the computer processor determining that an instruction within the transactional region includes a prefix indicating that the instruction is to be executed as part of an atomic memory transaction that includes one or more other instructions in the transactional region.
10. The method of claim 9, further comprising:
the computer processor determining that at least some other of the plurality of instructions are not to be executed as part of the atomic memory transaction; and
executing the at least some other instructions non-atomically.
11. The method of claim 9, further comprising:
determining that execution of the instruction includes at least one implicit memory operation and in response, executing the at least one implicit memory operation as part of the atomic memory transaction.
12. The method of claim 9, further comprising: executing the instruction as part of the atomic memory transaction, wherein said executing includes buffering versioning data for the instruction.
13. The method of claim 9, wherein the instruction indicates the start of the transactional region of code.
14. The method of claim 13, further comprising: in response to the instruction indicating the start of the transactional region of code, determining that all of the plurality of instructions are to be executed as part of the atomic memory transaction.
15. The method of claim 9, wherein the one or more other instructions in the transactional region included in the atomic transaction also include the prefix.
16. The method of claim 9, further comprising:
attempting to execute the atomic memory transaction, wherein said attempting includes:
detecting an abort condition; and
in response to detecting the abort condition, aborting execution of the transactional region of code at least by undoing memory effects of one or more instructions within the transactional region that include the prefix without undoing memory effects of one or more instructions within the transactional region that do not include the prefix; and
reattempting to execute the transactional region of code.
17. A computer-readable storage medium having stored thereon program instructions executable by a processor, wherein the program instructions comprise:
a plurality of instructions in a transactional region of code, the instructions executable by the processor in a transactional mode of execution;
wherein at least some of the instructions in the transactional region include a prefix that indicates to the processor that memory operations performed by the processor as part of executing the instructions that include the prefix are to be performed as a single atomic memory transaction.
18. The computer-readable storage medium of claim 17, wherein the plurality of instructions include:
a transaction-initiating instruction executable by the processor to begin the transactional mode of execution;
a transaction-terminating instruction executable by the processor to exit the transactional mode of execution;
wherein the processor is configured to determine that the transaction-initiating instruction includes the prefix and in response, to execute all memory operations performed as part of executing the plurality of instructions in the transactional region as part of the atomic memory transaction.
19. The computer-readable storage medium of claim 17, wherein the plurality of instructions include:
a transaction-initiating instruction executable by the processor to begin the transactional mode of execution;
a transaction-terminating instruction executable by the processor to exit the transactional mode of execution;
intermediate instructions appearing between the transaction-initiating instruction and transaction-terminating instruction in program execution order;
wherein each of two or more of the intermediate instructions includes a prefix indicating to the processor that the two or more intermediate instructions are to be executed together as part of the atomic memory transaction, and wherein at least one of the intermediate instructions does not include the prefix.
20. The computer-readable storage medium of claim 17, wherein the transaction-initiating instruction includes an operand indicating a memory address to which execution should jump in the event that an attempt to atomically execute two or more intermediate instructions appearing between the transaction-initiating instruction and transaction-terminating instruction in program execution order is aborted.
21. A computer readable storage medium comprising a data structure that is operated upon by a program executable on a computer system, the program operating on the data structure to perform a portion of a process to fabricate an integrated circuit including circuitry described by the data structure, the circuitry described in the data structure including:
a computer processor configured to determine whether an instruction within a plurality of instructions in a transactional region of code includes a prefix indicating that the instruction is to be executed speculatively within a single atomic memory transaction.
US12/764,024 2008-07-28 2010-04-20 Speculative Region: Hardware Support for Selective Transactional Memory Access Annotation Using Instruction Prefix Abandoned US20100205408A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US12/764,024 US20100205408A1 (en) 2008-07-28 2010-04-20 Speculative Region: Hardware Support for Selective Transactional Memory Access Annotation Using Instruction Prefix

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US8400808P 2008-07-28 2008-07-28
US12/510,884 US20100023703A1 (en) 2008-07-28 2009-07-28 Hardware transactional memory support for protected and unprotected shared-memory accesses in a speculative section
US12/764,024 US20100205408A1 (en) 2008-07-28 2010-04-20 Speculative Region: Hardware Support for Selective Transactional Memory Access Annotation Using Instruction Prefix

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US12/510,884 Continuation-In-Part US20100023703A1 (en) 2008-07-28 2009-07-28 Hardware transactional memory support for protected and unprotected shared-memory accesses in a speculative section

Publications (1)

Publication Number Publication Date
US20100205408A1 (en) 2010-08-12

Family

ID=41090366

Family Applications (5)

Application Number Title Priority Date Filing Date
US12/510,856 Active 2032-09-30 US8621183B2 (en) 2008-07-28 2009-07-28 Processor with support for nested speculative sections with different transactional modes
US12/510,905 Active 2032-01-12 US9372718B2 (en) 2008-07-28 2009-07-28 Virtualizable advanced synchronization facility
US12/510,884 Abandoned US20100023703A1 (en) 2008-07-28 2009-07-28 Hardware transactional memory support for protected and unprotected shared-memory accesses in a speculative section
US12/510,893 Active 2031-10-22 US8407455B2 (en) 2008-07-28 2009-07-28 Coexistence of advanced hardware synchronization and global locks
US12/764,024 Abandoned US20100205408A1 (en) 2008-07-28 2010-04-20 Speculative Region: Hardware Support for Selective Transactional Memory Access Annotation Using Instruction Prefix

Family Applications Before (4)

Application Number Title Priority Date Filing Date
US12/510,856 Active 2032-09-30 US8621183B2 (en) 2008-07-28 2009-07-28 Processor with support for nested speculative sections with different transactional modes
US12/510,905 Active 2032-01-12 US9372718B2 (en) 2008-07-28 2009-07-28 Virtualizable advanced synchronization facility
US12/510,884 Abandoned US20100023703A1 (en) 2008-07-28 2009-07-28 Hardware transactional memory support for protected and unprotected shared-memory accesses in a speculative section
US12/510,893 Active 2031-10-22 US8407455B2 (en) 2008-07-28 2009-07-28 Coexistence of advanced hardware synchronization and global locks

Country Status (6)

Country Link
US (5) US8621183B2 (en)
EP (1) EP2332043B1 (en)
JP (1) JP2011529603A (en)
KR (1) KR20110044884A (en)
CN (1) CN102144218A (en)
WO (1) WO2010014200A1 (en)

Cited By (41)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110307689A1 (en) * 2010-06-11 2011-12-15 Jaewoong Chung Processor support for hardware transactional memory
US20120254846A1 (en) * 2011-03-31 2012-10-04 Moir Mark S System and Method for Optimizing a Code Section by Forcing a Code Section to be Executed Atomically
US20130198491A1 (en) * 2012-01-31 2013-08-01 International Business Machines Corporation Major branch instructions with transactional memory
US20130198496A1 (en) * 2012-01-31 2013-08-01 International Business Machines Corporation Major branch instructions
US20130205119A1 (en) * 2012-02-02 2013-08-08 Ravi Rajwar Instruction and logic to test transactional execution status
US20140047196A1 (en) * 2012-08-10 2014-02-13 International Business Machines Corporation Transaction check instruction for memory transactions
US20140089640A1 (en) * 2012-09-27 2014-03-27 Texas Instruments Incorporated Processor with variable instruction atomicity
US20140089639A1 (en) * 2012-09-27 2014-03-27 Texas Instruments Incorporated Processor with instruction concatenation
CN104335183A (en) * 2012-06-29 2015-02-04 英特尔公司 Instruction and logic to test transactional execution status
CN104714848A (en) * 2013-12-12 2015-06-17 国际商业机器公司 Software indications and hints for coalescing memory transactions
US9146774B2 (en) 2013-12-12 2015-09-29 International Business Machines Corporation Coalescing memory transactions
US20150278121A1 (en) * 2014-03-26 2015-10-01 International Business Machines Corporation Transactional processing based upon run-time conditions
US20150278120A1 (en) * 2014-03-26 2015-10-01 International Business Machines Corporation Transactional processing based upon run-time storage values
US9158573B2 (en) 2013-12-12 2015-10-13 International Business Machines Corporation Dynamic predictor for coalescing memory transactions
US20150355937A1 (en) * 2014-03-02 2015-12-10 International Business Machines Corporation Indicating nearing the completion of a transaction
US20150378732A1 (en) * 2014-06-30 2015-12-31 International Business Machines Corporation Latent modification instruction for transactional execution
US9292337B2 (en) 2013-12-12 2016-03-22 International Business Machines Corporation Software enabled and disabled coalescing of memory transactions
US9336047B2 (en) 2014-06-30 2016-05-10 International Business Machines Corporation Prefetching of discontiguous storage locations in anticipation of transactional execution
US9348523B2 (en) 2013-12-12 2016-05-24 International Business Machines Corporation Code optimization to enable and disable coalescing of memory transactions
US9348643B2 (en) 2014-06-30 2016-05-24 International Business Machines Corporation Prefetching of discontiguous storage locations as part of transactional execution
US9372718B2 (en) 2008-07-28 2016-06-21 Advanced Micro Devices, Inc. Virtualizable advanced synchronization facility
US9448939B2 (en) 2014-06-30 2016-09-20 International Business Machines Corporation Collecting memory operand access characteristics during transactional execution
US9459877B2 (en) 2012-12-21 2016-10-04 Advanced Micro Devices, Inc. Nested speculative regions for a synchronization facility
US9465636B2 (en) 2011-08-24 2016-10-11 Kt Corporation Controlling virtual machine in cloud computing system
RU2606878C2 (en) * 2012-06-15 2017-01-10 Интернэшнл Бизнес Машинз Корпорейшн Transaction processing
US20170083331A1 (en) * 2015-09-19 2017-03-23 Microsoft Technology Licensing, Llc Memory synchronization in block-based processors
US9684537B2 (en) * 2015-11-06 2017-06-20 International Business Machines Corporation Regulating hardware speculative processing around a transaction
US9703560B2 (en) 2014-06-30 2017-07-11 International Business Machines Corporation Collecting transactional execution characteristics during transactional execution
EP3264317A1 (en) * 2016-06-29 2018-01-03 Arm Ltd Permission control for contingent memory access program instruction
US9870253B2 (en) 2015-05-27 2018-01-16 International Business Machines Corporation Enabling end of transaction detection using speculative look ahead
CN108292221A (en) * 2015-12-22 2018-07-17 英特尔公司 Affairs terminate to submit instruction, processor, method and system plus duration
US10162757B2 (en) * 2016-12-06 2018-12-25 Advanced Micro Devices, Inc. Proactive cache coherence
US10210019B2 (en) 2014-02-27 2019-02-19 International Business Machines Corporation Hint instruction for managing transactional aborts in transactional memory computing environments
US10402327B2 (en) 2016-11-22 2019-09-03 Advanced Micro Devices, Inc. Network-aware cache coherence protocol enhancement
US20190310941A1 (en) * 2018-04-04 2019-10-10 Nxp B.V. Secure speculative instruction execution in a data processing system
US11175896B2 (en) * 2014-05-13 2021-11-16 Oracle International Corporation Handling value types
US20220121617A1 (en) * 2020-10-20 2022-04-21 Micron Technology, Inc. Method of notifying a process or programmable atomic operation traps
US11403023B2 (en) 2020-10-20 2022-08-02 Micron Technology, Inc. Method of organizing a programmable atomic unit instruction memory
US11586439B2 (en) 2020-10-20 2023-02-21 Micron Technology, Inc. Detecting infinite loops in a programmable atomic transaction
US11693690B2 (en) 2020-10-20 2023-07-04 Micron Technology, Inc. Method of completing a programmable atomic transaction by ensuring memory locks are cleared
US11740929B2 (en) 2020-10-20 2023-08-29 Micron Technology, Inc. Registering a custom atomic operation with the operating system

Families Citing this family (128)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8479166B2 (en) * 2008-08-25 2013-07-02 International Business Machines Corporation Detecting locking discipline violations on shared resources
US9021502B2 (en) * 2008-12-29 2015-04-28 Oracle America Inc. Method and system for inter-thread communication using processor messaging
US8812796B2 (en) 2009-06-26 2014-08-19 Microsoft Corporation Private memory regions and coherence optimizations
US8375175B2 (en) * 2009-12-09 2013-02-12 Oracle America, Inc. Fast and efficient reacquisition of locks for transactional memory systems
US9092253B2 (en) 2009-12-15 2015-07-28 Microsoft Technology Licensing, Llc Instrumentation of hardware assisted transactional memory system
US8402218B2 (en) * 2009-12-15 2013-03-19 Microsoft Corporation Efficient garbage collection and exception handling in a hardware accelerated transactional memory system
US8972994B2 (en) * 2009-12-23 2015-03-03 Intel Corporation Method and apparatus to bypass object lock by speculative execution of generated bypass code shell based on bypass failure threshold in managed runtime environment
US8924692B2 (en) 2009-12-26 2014-12-30 Intel Corporation Event counter checkpointing and restoring
US20110208921A1 (en) * 2010-02-19 2011-08-25 Pohlack Martin T Inverted default semantics for in-speculative-region memory accesses
US9626187B2 (en) 2010-05-27 2017-04-18 International Business Machines Corporation Transactional memory system supporting unbroken suspended execution
US8782434B1 (en) * 2010-07-15 2014-07-15 The Research Foundation For The State University Of New York System and method for validating program execution at run-time
US8904189B1 (en) 2010-07-15 2014-12-02 The Research Foundation For The State University Of New York System and method for validating program execution at run-time using control flow signatures
US20120079212A1 (en) 2010-09-23 2012-03-29 International Business Machines Corporation Architecture for sharing caches among multiple processes
US20120079245A1 (en) * 2010-09-25 2012-03-29 Cheng Wang Dynamic optimization for conditional commit
US8549504B2 (en) 2010-09-25 2013-10-01 Intel Corporation Apparatus, method, and system for providing a decision mechanism for conditional commits in an atomic region
US8424015B2 (en) * 2010-09-30 2013-04-16 International Business Machines Corporation Transactional memory preemption mechanism
US9110691B2 (en) * 2010-11-16 2015-08-18 Advanced Micro Devices, Inc. Compiler support technique for hardware transactional memory systems
US8468169B2 (en) 2010-12-01 2013-06-18 Microsoft Corporation Hierarchical software locking
US9274962B2 (en) * 2010-12-07 2016-03-01 Intel Corporation Apparatus, method, and system for instantaneous cache state recovery from speculative abort/commit
US9122476B2 (en) 2010-12-07 2015-09-01 Advanced Micro Devices, Inc. Programmable atomic memory using hardware validation agent
US8788794B2 (en) * 2010-12-07 2014-07-22 Advanced Micro Devices, Inc. Programmable atomic memory using stored atomic procedures
US8612694B2 (en) 2011-03-07 2013-12-17 Advanced Micro Devices, Inc. Protecting large objects within an advanced synchronization facility
US8990823B2 (en) 2011-03-10 2015-03-24 International Business Machines Corporation Optimizing virtual machine synchronization for application software
US8677331B2 (en) * 2011-09-30 2014-03-18 Oracle International Corporation Lock-clustering compilation for software transactional memory
US8954680B2 (en) 2011-11-20 2015-02-10 International Business Machines Corporation Modifying data prefetching operation based on a past prefetching attempt
CN104011669B (en) * 2011-12-22 2017-12-12 英特尔公司 The method, apparatus and system performed for the selectivity for submitting instruction
US20140223140A1 (en) * 2011-12-23 2014-08-07 Intel Corporation Systems, apparatuses, and methods for performing vector packed unary encoding using masks
US20130326196A1 (en) * 2011-12-23 2013-12-05 Elmoustapha Ould-Ahmed-Vall Systems, apparatuses, and methods for performing vector packed unary decoding using masks
US8893094B2 (en) 2011-12-30 2014-11-18 Intel Corporation Hardware compilation and/or translation with fault detection and roll back functionality
US20140059333A1 (en) * 2012-02-02 2014-02-27 Martin G. Dixon Method, apparatus, and system for speculative abort control mechanisms
WO2013175858A1 (en) * 2012-05-23 2013-11-28 日本電気株式会社 Lock management system, lock management method, and lock management program
US9411595B2 (en) 2012-05-31 2016-08-09 Nvidia Corporation Multi-threaded transactional memory coherence
US9367323B2 (en) * 2012-06-15 2016-06-14 International Business Machines Corporation Processor assist facility
US9448796B2 (en) * 2012-06-15 2016-09-20 International Business Machines Corporation Restricted instructions in transactional execution
US9298631B2 (en) 2012-06-15 2016-03-29 International Business Machines Corporation Managing transactional and non-transactional store observability
US9442737B2 (en) 2012-06-15 2016-09-13 International Business Machines Corporation Restricting processing within a processor to facilitate transaction completion
US11538055B2 (en) * 2012-06-15 2022-12-27 Edatanetworks Inc. Systems and method for incenting consumers
US8880959B2 (en) * 2012-06-15 2014-11-04 International Business Machines Corporation Transaction diagnostic block
US10437602B2 (en) 2012-06-15 2019-10-08 International Business Machines Corporation Program interruption filtering in transactional execution
US8682877B2 (en) * 2012-06-15 2014-03-25 International Business Machines Corporation Constrained transaction execution
US9772854B2 (en) 2012-06-15 2017-09-26 International Business Machines Corporation Selectively controlling instruction execution in transactional processing
US9311101B2 (en) 2012-06-15 2016-04-12 International Business Machines Corporation Intra-instructional transaction abort handling
US9298469B2 (en) * 2012-06-15 2016-03-29 International Business Machines Corporation Management of multiple nested transactions
US9262320B2 (en) 2012-06-15 2016-02-16 International Business Machines Corporation Tracking transactional execution footprint
US9317460B2 (en) 2012-06-15 2016-04-19 International Business Machines Corporation Program event recording within a transactional environment
US20130339680A1 (en) 2012-06-15 2013-12-19 International Business Machines Corporation Nontransactional store instruction
US9740549B2 (en) 2012-06-15 2017-08-22 International Business Machines Corporation Facilitating transaction completion subsequent to repeated aborts of the transaction
US9436477B2 (en) * 2012-06-15 2016-09-06 International Business Machines Corporation Transaction abort instruction
US8966324B2 (en) 2012-06-15 2015-02-24 International Business Machines Corporation Transactional execution branch indications
US9223687B2 (en) 2012-06-15 2015-12-29 International Business Machines Corporation Determining the logical address of a transaction abort
US9361115B2 (en) 2012-06-15 2016-06-07 International Business Machines Corporation Saving/restoring selected registers in transactional processing
US9336046B2 (en) 2012-06-15 2016-05-10 International Business Machines Corporation Transaction abort processing
US9348642B2 (en) 2012-06-15 2016-05-24 International Business Machines Corporation Transaction begin/end instructions
US9384004B2 (en) * 2012-06-15 2016-07-05 International Business Machines Corporation Randomized testing within transactional execution
US9535768B2 (en) * 2012-07-16 2017-01-03 Sony Corporation Managing multi-threaded operations in a multimedia authoring environment
US9274963B2 (en) 2012-07-20 2016-03-01 International Business Machines Corporation Cache replacement for shared memory caches
US9411633B2 (en) * 2012-07-27 2016-08-09 Futurewei Technologies, Inc. System and method for barrier command monitoring in computing systems
US8914586B2 (en) 2012-07-31 2014-12-16 Advanced Micro Devices, Inc. TLB-walk controlled abort policy for hardware transactional memory
US8943278B2 (en) 2012-07-31 2015-01-27 Advanced Micro Devices, Inc. Protecting large regions without operating-system support
US9342454B2 (en) 2012-08-02 2016-05-17 International Business Machines Corporation Nested rewind only and non rewind only transactions in a data processing system supporting transactional storage accesses
US9396115B2 (en) 2012-08-02 2016-07-19 International Business Machines Corporation Rewind only transactions in a data processing system supporting transactional storage accesses
US9122873B2 (en) 2012-09-14 2015-09-01 The Research Foundation For The State University Of New York Continuous run-time validation of program execution: a practical approach
US9069782B2 (en) 2012-10-01 2015-06-30 The Research Foundation For The State University Of New York System and method for security and privacy aware virtual machine checkpointing
US9081607B2 (en) 2012-10-24 2015-07-14 International Business Machines Corporation Conditional transaction abort and precise abort handling
US9824009B2 (en) 2012-12-21 2017-11-21 Nvidia Corporation Information coherency maintenance systems and methods
US10102142B2 (en) 2012-12-26 2018-10-16 Nvidia Corporation Virtual address based memory reordering
US9569223B2 (en) * 2013-02-13 2017-02-14 Red Hat Israel, Ltd. Mixed shared/non-shared memory transport for virtual machines
US20140281236A1 (en) * 2013-03-14 2014-09-18 William C. Rash Systems and methods for implementing transactional memory
US9569385B2 (en) 2013-09-09 2017-02-14 Nvidia Corporation Memory transaction ordering
US9588801B2 (en) * 2013-09-11 2017-03-07 Intel Corporation Apparatus and method for improved lock elision techniques
US9424072B2 (en) 2014-02-27 2016-08-23 International Business Machines Corporation Alerting hardware transactions that are about to run out of space
US9454370B2 (en) 2014-03-14 2016-09-27 International Business Machines Corporation Conditional transaction end instruction
US10120681B2 (en) 2014-03-14 2018-11-06 International Business Machines Corporation Compare and delay instructions
US9558032B2 (en) 2014-03-14 2017-01-31 International Business Machines Corporation Conditional instruction end operation
US20150278123A1 (en) * 2014-03-28 2015-10-01 Alex Nayshtut Low-overhead detection of unauthorized memory modification using transactional memory
US9778949B2 (en) 2014-05-05 2017-10-03 Google Inc. Thread waiting in a multithreaded processor architecture
GB2528270A (en) * 2014-07-15 2016-01-20 Advanced Risc Mach Ltd Call stack maintenance for a transactional data processing execution mode
GB2529148B (en) * 2014-08-04 2020-05-27 Advanced Risc Mach Ltd Write operations to non-volatile memory
US10462185B2 (en) 2014-09-05 2019-10-29 Sequitur Labs, Inc. Policy-managed secure code execution and messaging for computing devices and computing device security
GB2533414B (en) * 2014-12-19 2021-12-01 Advanced Risc Mach Ltd Apparatus with shared transactional processing resource, and data processing method
GB2533415B (en) * 2014-12-19 2022-01-19 Advanced Risc Mach Ltd Apparatus with at least one resource having thread mode and transaction mode, and method
US10061583B2 (en) * 2014-12-24 2018-08-28 Intel Corporation Systems, apparatuses, and methods for data speculation execution
US10942744B2 (en) * 2014-12-24 2021-03-09 Intel Corporation Systems, apparatuses, and methods for data speculation execution
US10303525B2 (en) 2014-12-24 2019-05-28 Intel Corporation Systems, apparatuses, and methods for data speculation execution
US10540524B2 (en) 2014-12-31 2020-01-21 Mcafee, Llc Memory access protection using processor transactional memory support
WO2016172237A1 (en) 2015-04-21 2016-10-27 Sequitur Labs, Inc. System and methods for context-aware and situation-aware secure, policy-based access control for computing devices
US11847237B1 (en) 2015-04-28 2023-12-19 Sequitur Labs, Inc. Secure data protection and encryption techniques for computing devices and information storage
WO2016183504A1 (en) 2015-05-14 2016-11-17 Sequitur Labs, Inc. System and methods for facilitating secure computing device control and operation
US11228458B2 (en) * 2015-09-10 2022-01-18 Lightfleet Corporation Group-coherent memory
US9513960B1 (en) 2015-09-22 2016-12-06 International Business Machines Corporation Inducing transactional aborts in other processing threads
US10558582B2 (en) * 2015-10-02 2020-02-11 Intel Corporation Technologies for execute only transactional memory
US9563467B1 (en) 2015-10-29 2017-02-07 International Business Machines Corporation Interprocessor memory status communication
US10261827B2 (en) 2015-10-29 2019-04-16 International Business Machines Corporation Interprocessor memory status communication
US9760397B2 (en) 2015-10-29 2017-09-12 International Business Machines Corporation Interprocessor memory status communication
US9916179B2 (en) 2015-10-29 2018-03-13 International Business Machines Corporation Interprocessor memory status communication
US9652385B1 (en) * 2015-11-27 2017-05-16 Arm Limited Apparatus and method for handling atomic update operations
US10649773B2 (en) * 2016-04-07 2020-05-12 MIPS Tech, LLC Processors supporting atomic writes to multiword memory locations and methods
US10248564B2 (en) 2016-06-24 2019-04-02 Advanced Micro Devices, Inc. Contended lock request elision scheme
US10169106B2 (en) * 2016-06-30 2019-01-01 International Business Machines Corporation Method for managing control-loss processing during critical processing sections while maintaining transaction scope integrity
CN107766080B (en) * 2016-08-23 2021-11-09 阿里巴巴集团控股有限公司 Transaction message processing method, device, equipment and system
US10802971B2 (en) 2016-10-13 2020-10-13 International Business Machines Corporation Cache memory transaction shielding via prefetch suppression
US10700865B1 (en) 2016-10-21 2020-06-30 Sequitur Labs Inc. System and method for granting secure access to computing services hidden in trusted computing environments to an unsecure requestor
CN106502920B (en) * 2016-11-08 2019-09-24 郑州云海信息技术有限公司 A kind of caching method based on MESI, device and processor
US20180165073A1 (en) 2016-12-14 2018-06-14 International Business Machines Corporation Context information based on type of routine being called
US10095493B2 (en) 2016-12-14 2018-10-09 International Business Machines Corporation Call sequence generation based on type of routine
US10235190B2 (en) 2016-12-14 2019-03-19 International Business Machines Corporation Executing instructions to store context information based on routine to be executed
US10180827B2 (en) 2016-12-14 2019-01-15 International Business Machines Corporation Suppressing storing of context information
US10241769B2 (en) 2016-12-14 2019-03-26 International Business Machines Corporation Marking sibling caller routines
US10152338B2 (en) 2016-12-14 2018-12-11 International Business Machines Corporation Marking external sibling caller routines
US10496311B2 (en) 2017-01-19 2019-12-03 International Business Machines Corporation Run-time instrumentation of guarded storage event processing
US10725685B2 (en) 2017-01-19 2020-07-28 International Business Machines Corporation Load logical and shift guarded instruction
US10452288B2 (en) 2017-01-19 2019-10-22 International Business Machines Corporation Identifying processor attributes based on detecting a guarded storage event
US10579377B2 (en) 2017-01-19 2020-03-03 International Business Machines Corporation Guarded storage event handling during transactional execution
US10732858B2 (en) 2017-01-19 2020-08-04 International Business Machines Corporation Loading and storing controls regulating the operation of a guarded storage facility
US10496292B2 (en) 2017-01-19 2019-12-03 International Business Machines Corporation Saving/restoring guarded storage controls in a virtualized environment
GB201708439D0 (en) 2017-05-26 2017-07-12 Microsoft Technology Licensing Llc Compute node security
CN110785746B (en) 2017-06-28 2024-04-12 Arm有限公司 Memory region locking
GB2564097B (en) * 2017-06-28 2019-10-23 Advanced Risc Mach Ltd Memory region locking
EP3462308B1 (en) * 2017-09-29 2022-03-02 ARM Limited Transaction nesting depth testing instruction
GB2567433B (en) * 2017-10-10 2020-02-26 Advanced Risc Mach Ltd Checking lock variables for transactions in a system with transactional memory support
US10621103B2 (en) 2017-12-05 2020-04-14 Arm Limited Apparatus and method for handling write operations
US11580234B2 (en) 2019-06-29 2023-02-14 Intel Corporation Implicit integrity for cryptographic computing
US11403234B2 (en) 2019-06-29 2022-08-02 Intel Corporation Cryptographic computing using encrypted base addresses and used in multi-tenant environments
US11575504B2 (en) 2019-06-29 2023-02-07 Intel Corporation Cryptographic computing engine for memory load and store units of a microarchitecture pipeline
US11144322B2 (en) * 2019-11-05 2021-10-12 Mediatek Inc. Code and data sharing among multiple independent processors
US11580035B2 (en) 2020-12-26 2023-02-14 Intel Corporation Fine-grained stack protection using cryptographic computing
US11669625B2 (en) 2020-12-26 2023-06-06 Intel Corporation Data type based cryptographic computing
US20210318961A1 (en) * 2021-06-23 2021-10-14 Intel Corporation Mitigating pooled memory cache miss latency with cache miss faults and transaction aborts

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5239633A (en) * 1989-03-24 1993-08-24 Mitsubishi Denki Kabushiki Kaisha Data processor executing memory indirect addressing and register indirect addressing
US20070260942A1 (en) * 2006-03-30 2007-11-08 Ravi Rajwar Transactional memory in out-of-order processors
US20100332538A1 (en) * 2009-06-30 2010-12-30 Microsoft Corporation Hardware accelerated transactional memory system with open nested transactions

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7685583B2 (en) 2002-07-16 2010-03-23 Sun Microsystems, Inc. Obstruction-free mechanism for atomic update of multiple non-contiguous locations in shared memory
US7418577B2 (en) * 2003-02-13 2008-08-26 Sun Microsystems, Inc. Fail instruction to support transactional program execution
US7206903B1 (en) * 2004-07-20 2007-04-17 Sun Microsystems, Inc. Method and apparatus for releasing memory locations during transactional execution
US8041958B2 (en) * 2006-02-14 2011-10-18 Lenovo (Singapore) Pte. Ltd. Method for preventing malicious software from execution within a computer system
US8180967B2 (en) * 2006-03-30 2012-05-15 Intel Corporation Transactional memory virtualization
US20080005504A1 (en) * 2006-06-30 2008-01-03 Jesse Barnes Global overflow method for virtualized transactional memory
US8516201B2 (en) * 2006-12-05 2013-08-20 Intel Corporation Protecting private data from cache attacks
US7516365B2 (en) * 2007-07-27 2009-04-07 Sun Microsystems, Inc. System and method for split hardware transactions
CN102144218A (en) 2008-07-28 2011-08-03 超威半导体公司 Virtualizable advanced synchronization facility

Cited By (112)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9372718B2 (en) 2008-07-28 2016-06-21 Advanced Micro Devices, Inc. Virtualizable advanced synchronization facility
US20110307689A1 (en) * 2010-06-11 2011-12-15 Jaewoong Chung Processor support for hardware transactional memory
US9880848B2 (en) * 2010-06-11 2018-01-30 Advanced Micro Devices, Inc. Processor support for hardware transactional memory
US8533699B2 (en) * 2011-03-31 2013-09-10 Oracle International Corporation System and method for optimizing a code section by forcing a code section to be executed atomically
US20120254846A1 (en) * 2011-03-31 2012-10-04 Moir Mark S System and Method for Optimizing a Code Section by Forcing a Code Section to be Executed Atomically
US9465636B2 (en) 2011-08-24 2016-10-11 Kt Corporation Controlling virtual machine in cloud computing system
US20130198492A1 (en) * 2012-01-31 2013-08-01 International Business Machines Corporation Major branch instructions
US20130198497A1 (en) * 2012-01-31 2013-08-01 International Business Machines Corporation Major branch instructions with transactional memory
US20130198491A1 (en) * 2012-01-31 2013-08-01 International Business Machines Corporation Major branch instructions with transactional memory
US20130198496A1 (en) * 2012-01-31 2013-08-01 International Business Machines Corporation Major branch instructions
US9286138B2 (en) * 2012-01-31 2016-03-15 International Business Machines Corporation Major branch instructions
US9280398B2 (en) * 2012-01-31 2016-03-08 International Business Machines Corporation Major branch instructions
US9250911B2 (en) * 2012-01-31 2016-02-02 International Business Machines Corporation Major branch instructions with transactional memory
US9229722B2 (en) * 2012-01-31 2016-01-05 International Business Machines Corporation Major branch instructions with transactional memory
US20160202979A1 (en) * 2012-02-02 2016-07-14 Intel Corporation Instruction And Logic To Test Transactional Execution Status
US9268596B2 (en) * 2012-02-02 2016-02-23 Intel Corporation Instruction and logic to test transactional execution status
US10223227B2 (en) * 2012-02-02 2019-03-05 Intel Corporation Instruction and logic to test transactional execution status
US10210066B2 (en) * 2012-02-02 2019-02-19 Intel Corporation Instruction and logic to test transactional execution status
US10210065B2 (en) * 2012-02-02 2019-02-19 Intel Corporation Instruction and logic to test transactional execution status
US10152401B2 (en) * 2012-02-02 2018-12-11 Intel Corporation Instruction and logic to test transactional execution status
US10261879B2 (en) * 2012-02-02 2019-04-16 Intel Corporation Instruction and logic to test transactional execution status
US10248524B2 (en) * 2012-02-02 2019-04-02 Intel Corporation Instruction and logic to test transactional execution status
US20160203068A1 (en) * 2012-02-02 2016-07-14 Intel Corporation Instruction and logic to test transactional execution status
US20130205119A1 (en) * 2012-02-02 2013-08-08 Ravi Rajwar Instruction and logic to test transactional execution status
US20160266992A1 (en) * 2012-02-02 2016-09-15 Intel Corporation Instruction and logic to test transactional execution status
US20160203019A1 (en) * 2012-02-02 2016-07-14 Intel Corporation Instruction and logic to test transactional execution status
US20160188479A1 (en) * 2012-02-02 2016-06-30 Intel Corporation Instruction and logic to test transactional execution status
US20160202987A1 (en) * 2012-02-02 2016-07-14 Intel Corporation Instruction and logic to test transactional execution status
RU2606878C2 (en) * 2012-06-15 2017-01-10 International Business Machines Corporation Transaction processing
CN105760139A (en) * 2012-06-29 2016-07-13 Intel Corporation Instruction and logic to test transactional execution status
CN105760138A (en) * 2012-06-29 2016-07-13 Intel Corporation Instruction and logic to test transactional execution status
CN104335183A (en) * 2012-06-29 2015-02-04 Intel Corporation Instruction and logic to test transactional execution status
US20140047195A1 (en) * 2012-08-10 2014-02-13 International Business Machines Corporation Transaction check instruction for memory transactions
US9367264B2 (en) * 2012-08-10 2016-06-14 International Business Machines Corporation Transaction check instruction for memory transactions
US20140047196A1 (en) * 2012-08-10 2014-02-13 International Business Machines Corporation Transaction check instruction for memory transactions
US9367263B2 (en) * 2012-08-10 2016-06-14 International Business Machines Corporation Transaction check instruction for memory transactions
US11210103B2 (en) * 2012-09-27 2021-12-28 Texas Instruments Incorporated Execution of additional instructions prior to a first instruction in an interruptible or non-interruptible manner as specified in an instruction field
US9471317B2 (en) * 2012-09-27 2016-10-18 Texas Instruments Deutschland Gmbh Execution of additional instructions in conjunction atomically as specified in instruction field
US20140089640A1 (en) * 2012-09-27 2014-03-27 Texas Instruments Incorporated Processor with variable instruction atomicity
US20140089639A1 (en) * 2012-09-27 2014-03-27 Texas Instruments Incorporated Processor with instruction concatenation
US9612834B2 (en) * 2012-09-27 2017-04-04 Texas Instruments Deutschland Gmbh Processor with variable instruction atomicity
US9459877B2 (en) 2012-12-21 2016-10-04 Advanced Micro Devices, Inc. Nested speculative regions for a synchronization facility
US9292337B2 (en) 2013-12-12 2016-03-22 International Business Machines Corporation Software enabled and disabled coalescing of memory transactions
US9619383B2 (en) 2013-12-12 2017-04-11 International Business Machines Corporation Dynamic predictor for coalescing memory transactions
US9361031B2 (en) 2013-12-12 2016-06-07 International Business Machines Corporation Software indications and hints for coalescing memory transactions
CN104714848A (en) * 2013-12-12 2015-06-17 International Business Machines Corporation Software indications and hints for coalescing memory transactions
US9430276B2 (en) 2013-12-12 2016-08-30 International Business Machines Corporation Coalescing memory transactions
US9348523B2 (en) 2013-12-12 2016-05-24 International Business Machines Corporation Code optimization to enable and disable coalescing of memory transactions
US9690556B2 (en) 2013-12-12 2017-06-27 International Business Machines Corporation Code optimization to enable and disable coalescing of memory transactions
US9348522B2 (en) 2013-12-12 2016-05-24 International Business Machines Corporation Software indications and hints for coalescing memory transactions
US9383930B2 (en) 2013-12-12 2016-07-05 International Business Machines Corporation Code optimization to enable and disable coalescing of memory transactions
US9292357B2 (en) 2013-12-12 2016-03-22 International Business Machines Corporation Software enabled and disabled coalescing of memory transactions
US9158573B2 (en) 2013-12-12 2015-10-13 International Business Machines Corporation Dynamic predictor for coalescing memory transactions
US9582315B2 (en) 2013-12-12 2017-02-28 International Business Machines Corporation Software enabled and disabled coalescing of memory transactions
US9146774B2 (en) 2013-12-12 2015-09-29 International Business Machines Corporation Coalescing memory transactions
US10223154B2 (en) 2014-02-27 2019-03-05 International Business Machines Corporation Hint instruction for managing transactional aborts in transactional memory computing environments
US10210019B2 (en) 2014-02-27 2019-02-19 International Business Machines Corporation Hint instruction for managing transactional aborts in transactional memory computing environments
US10565003B2 (en) 2014-02-27 2020-02-18 International Business Machines Corporation Hint instruction for managing transactional aborts in transactional memory computing environments
US9830185B2 (en) * 2014-03-02 2017-11-28 International Business Machines Corporation Indicating nearing the completion of a transaction
US20150355937A1 (en) * 2014-03-02 2015-12-10 International Business Machines Corporation Indicating nearing the completion of a transaction
US20150278121A1 (en) * 2014-03-26 2015-10-01 International Business Machines Corporation Transactional processing based upon run-time conditions
US20150278120A1 (en) * 2014-03-26 2015-10-01 International Business Machines Corporation Transactional processing based upon run-time storage values
US9256553B2 (en) * 2014-03-26 2016-02-09 International Business Machines Corporation Transactional processing based upon run-time storage values
US9262343B2 (en) * 2014-03-26 2016-02-16 International Business Machines Corporation Transactional processing based upon run-time conditions
US11175896B2 (en) * 2014-05-13 2021-11-16 Oracle International Corporation Handling value types
US9921834B2 (en) 2014-06-30 2018-03-20 International Business Machines Corporation Prefetching of discontiguous storage locations in anticipation of transactional execution
US9632820B2 (en) 2014-06-30 2017-04-25 International Business Machines Corporation Prefetching of discontiguous storage locations in anticipation of transactional execution
US9710271B2 (en) 2014-06-30 2017-07-18 International Business Machines Corporation Collecting transactional execution characteristics during transactional execution
US9720725B2 (en) 2014-06-30 2017-08-01 International Business Machines Corporation Prefetching of discontiguous storage locations as part of transactional execution
US9727370B2 (en) 2014-06-30 2017-08-08 International Business Machines Corporation Collecting memory operand access characteristics during transactional execution
US11243770B2 (en) 2014-06-30 2022-02-08 International Business Machines Corporation Latent modification instruction for substituting functionality of instructions during transactional execution
US9348643B2 (en) 2014-06-30 2016-05-24 International Business Machines Corporation Prefetching of discontiguous storage locations as part of transactional execution
US9851971B2 (en) * 2014-06-30 2017-12-26 International Business Machines Corporation Latent modification instruction for transactional execution
US9448939B2 (en) 2014-06-30 2016-09-20 International Business Machines Corporation Collecting memory operand access characteristics during transactional execution
US9703560B2 (en) 2014-06-30 2017-07-11 International Business Machines Corporation Collecting transactional execution characteristics during transactional execution
US9336047B2 (en) 2014-06-30 2016-05-10 International Business Machines Corporation Prefetching of discontiguous storage locations in anticipation of transactional execution
US20150378735A1 (en) * 2014-06-30 2015-12-31 International Business Machines Corporation Latent modification instruction for transactional execution
US9600286B2 (en) * 2014-06-30 2017-03-21 International Business Machines Corporation Latent modification instruction for transactional execution
US9600287B2 (en) * 2014-06-30 2017-03-21 International Business Machines Corporation Latent modification instruction for transactional execution
US10061586B2 (en) * 2014-06-30 2018-08-28 International Business Machines Corporation Latent modification instruction for transactional execution
US20180253312A1 (en) * 2014-06-30 2018-09-06 International Business Machines Corporation Latent modification instruction for transactional execution
US20150378732A1 (en) * 2014-06-30 2015-12-31 International Business Machines Corporation Latent modification instruction for transactional execution
US10228943B2 (en) 2014-06-30 2019-03-12 International Business Machines Corporation Prefetching of discontiguous storage locations in anticipation of transactional execution
US20170123802A1 (en) * 2014-06-30 2017-05-04 International Business Machines Corporation Latent modification instruction for transactional execution
US20170123801A1 (en) * 2014-06-30 2017-05-04 International Business Machines Corporation Latent modification instruction for transactional execution
US9632819B2 (en) 2014-06-30 2017-04-25 International Business Machines Corporation Collecting memory operand access characteristics during transactional execution
US10876228B2 (en) 2015-05-27 2020-12-29 International Business Machines Corporation Enabling end of transaction detection using speculative look ahead
US9870253B2 (en) 2015-05-27 2018-01-16 International Business Machines Corporation Enabling end of transaction detection using speculative look ahead
US20170083331A1 (en) * 2015-09-19 2017-03-23 Microsoft Technology Licensing, Llc Memory synchronization in block-based processors
US9684537B2 (en) * 2015-11-06 2017-06-20 International Business Machines Corporation Regulating hardware speculative processing around a transaction
US9690623B2 (en) * 2015-11-06 2017-06-27 International Business Machines Corporation Regulating hardware speculative processing around a transaction
US20170249185A1 (en) * 2015-11-06 2017-08-31 International Business Machines Corporation Regulating hardware speculative processing around a transaction
US10606638B2 (en) * 2015-11-06 2020-03-31 International Business Machines Corporation Regulating hardware speculative processing around a transaction
US10996982B2 (en) * 2015-11-06 2021-05-04 International Business Machines Corporation Regulating hardware speculative processing around a transaction
EP3394724A4 (en) * 2015-12-22 2019-08-21 Intel Corporation Transaction end plus commit to persistence instructions, processors, methods, and systems
CN108292221A (en) * 2015-12-22 2018-07-17 Intel Corporation Transaction end plus commit to persistence instructions, processors, methods, and systems
EP3264317A1 (en) * 2016-06-29 2018-01-03 Arm Ltd Permission control for contingent memory access program instruction
US20190171376A1 (en) * 2016-06-29 2019-06-06 Arm Limited Permission control for contingent memory access program instruction
TWI746576B (en) * 2016-06-29 2021-11-21 Arm Limited Permission control for contingent memory access program instruction
US10824350B2 (en) * 2016-06-29 2020-11-03 Arm Limited Handling contingent and non-contingent memory access program instructions making use of disable flag
WO2018001643A1 (en) * 2016-06-29 2018-01-04 Arm Limited Permission control for contingent memory access program instruction
US10402327B2 (en) 2016-11-22 2019-09-03 Advanced Micro Devices, Inc. Network-aware cache coherence protocol enhancement
US10162757B2 (en) * 2016-12-06 2018-12-25 Advanced Micro Devices, Inc. Proactive cache coherence
US10657057B2 (en) * 2018-04-04 2020-05-19 Nxp B.V. Secure speculative instruction execution in a data processing system
US20190310941A1 (en) * 2018-04-04 2019-10-10 Nxp B.V. Secure speculative instruction execution in a data processing system
US20220121617A1 (en) * 2020-10-20 2022-04-21 Micron Technology, Inc. Method of notifying a process or programmable atomic operation traps
US11403023B2 (en) 2020-10-20 2022-08-02 Micron Technology, Inc. Method of organizing a programmable atomic unit instruction memory
US11436187B2 (en) * 2020-10-20 2022-09-06 Micron Technology, Inc. Method of notifying a process or programmable atomic operation traps
US11586439B2 (en) 2020-10-20 2023-02-21 Micron Technology, Inc. Detecting infinite loops in a programmable atomic transaction
US11693690B2 (en) 2020-10-20 2023-07-04 Micron Technology, Inc. Method of completing a programmable atomic transaction by ensuring memory locks are cleared
US11740929B2 (en) 2020-10-20 2023-08-29 Micron Technology, Inc. Registering a custom atomic operation with the operating system
US11829323B2 (en) 2020-10-20 2023-11-28 Micron Technology, Inc. Method of notifying a process or programmable atomic operation traps

Also Published As

Publication number Publication date
CN102144218A (en) 2011-08-03
EP2332043A1 (en) 2011-06-15
US8621183B2 (en) 2013-12-31
JP2011529603A (en) 2011-12-08
US20100023707A1 (en) 2010-01-28
US8407455B2 (en) 2013-03-26
US20100023703A1 (en) 2010-01-28
US20100023704A1 (en) 2010-01-28
WO2010014200A1 (en) 2010-02-04
KR20110044884A (en) 2011-05-02
US20100023706A1 (en) 2010-01-28
EP2332043B1 (en) 2018-06-13
US9372718B2 (en) 2016-06-21

Similar Documents

Publication Publication Date Title
US20100205408A1 (en) Speculative Region: Hardware Support for Selective Transactional Memory Access Annotation Using Instruction Prefix
US10956163B2 (en) Processor support for hardware transactional memory
US10409612B2 (en) Apparatus and method for transactional memory and lock elision including an abort instruction to abort speculative execution
US8612694B2 (en) Protecting large objects within an advanced synchronization facility
JP5118652B2 (en) Transactional memory in out-of-order processors
US8127057B2 (en) Multi-level buffering of transactional data
US8176266B2 (en) Transaction based shared data operations in a multiprocessor environment
US8190859B2 (en) Critical section detection and prediction mechanism for hardware lock elision
EP2641171B1 (en) Preventing unintended loss of transactional data in hardware transactional memory systems
US20150378731A1 (en) Apparatus and method for efficiently implementing a processor pipeline
US20080005504A1 (en) Global overflow method for virtualized transactional memory
US20150032998A1 (en) Method, apparatus, and system for transactional speculation control instructions
US8914586B2 (en) TLB-walk controlled abort policy for hardware transactional memory
US20090119459A1 (en) Late lock acquire mechanism for hardware lock elision (hle)
US8943278B2 (en) Protecting large regions without operating-system support
US20190065160A1 (en) Pre-post retire hybrid hardware lock elision (hle) scheme

Legal Events

Date Code Title Description
AS Assignment

Owner name: ADVANCED MICRO DEVICES, INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:CHUNG, JAEWOONG;CHRISTIE, DAVID S.;HOHMUTH, MICHAEL P.;AND OTHERS;SIGNING DATES FROM 20100408 TO 20100420;REEL/FRAME:024264/0696

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION