CA2280057A1 - Internal bus system for dfps, building blocks with two dimensional or multidimensional programmable cell structures to handle large amounts of data involving high networking requirements - Google Patents
- Publication number
- CA2280057A1
- Authority
- CA
- Canada
- Prior art keywords
- bus
- data
- node
- connection
- nodes
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03K—PULSE TECHNIQUE
- H03K19/00—Logic circuits, i.e. having at least two inputs acting on one output; Inverting circuits
- H03K19/02—Logic circuits, i.e. having at least two inputs acting on one output; Inverting circuits using specified components
- H03K19/173—Logic circuits, i.e. having at least two inputs acting on one output; Inverting circuits using specified components using elementary logic circuits as components
- H03K19/177—Logic circuits, i.e. having at least two inputs acting on one output; Inverting circuits using specified components using elementary logic circuits as components arranged in matrix form
- H03K19/17736—Structural details of routing resources
- H03K19/17704—Logic circuits, i.e. having at least two inputs acting on one output; Inverting circuits using specified components using elementary logic circuits as components arranged in matrix form the logic functions being realised by the interconnection of rows and columns
- H03K19/1778—Structural details for adapting physical parameters
- H03K19/17796—Structural details for adapting physical parameters for physical disposition of blocks
Abstract
The invention relates to a bus system for DFPs (DE 44 16 881 A1) and to building blocks with two-dimensional or multidimensional programmable cell structures (of the FPGA, DPGA or similar type), characterized in that a) the bus system comprises a plurality of electrically independent bus segments separated by nodes, b) the nodes actively connect or disconnect bus segments, whereby b1) several bus segments are joined and connected by gates to a bus bar located inside the node, or b2) connection occurs directly via circuitry elements, drivers and/or registers, c) each node has a routing table in which information on the structure of the connection is stored, and d) each node has a monitoring unit, which independently verifies whether a connection can be established.
Description
INTERNAL BUS SYSTEM FOR DFPS AND UNITS HAVING TWO-DIMENSIONAL
OR MULTIDIMENSIONAL PROGRAMMABLE CELL STRUCTURES FOR HANDLING
LARGE VOLUMES OF DATA INVOLVING COMPLEX INTERCONNECTION.
Background of the Invention Technical field The present invention concerns a bus system for units having cell structures arranged in a plurality of dimensions. The cells may either be FPGA-type cell structures (FPGA = Field Programmable Gate Array) in the known sense or DFP-type arithmetic units (DFP = Data Flow Processor). Connections are set up and cleared automatically during operation, and a plurality of cells use the same resources in an alternating manner. Data flow and resource handling are automatically synchronized.
Related art In systems having two-dimensional or multidimensional programmable FPGAs or DPGAs (DPGA = Dynamically Programmable Gate Array):
FPGAs and DPGAs have internal bus systems that are either globally connected to all or a plurality of logic cells, or alternatively have a local next-neighbor connection. The common feature of both types is that they both involve direct connections between two or a plurality of logic cells. In addition, in all instances only one signal can use the bus at a time, unless a multiplexer structure is configured in a plurality of logic cells together with a control. US Patent No. 5,521,837 describes a technology wherein a plurality of bus segments can be connected to one another via switch elements (switch boxes, also known as SBXs). Connections are specified by a place-and-route tool before the unit is programmed and are configured once. Connections cannot be used in an alternating manner by a plurality of different cells.
Furthermore, SBXs have no internal functions for managing, setting up, or clearing connections, or for synchronization.
Furthermore, there is no automatic synchronization of the data transfer via these connections.
In DFP-based systems:
DFP-based systems according to German Patent No. 44 16 881 include the same bus systems previously described. In addition, a bus system can be subdivided to create a plurality of partial buses that can be used independently of one another.
Problems The known bus systems are unsuitable for transferring large volumes of data in the form of signals grouped by byte or otherwise. Particularly if the units are being used to calculate algorithms, a large number of data packets have to be transferred simultaneously between a given unit's individual configured functional areas. With the known technology, a direct point-to-point connection must be set up for each data path, i.e., connection (bus) between two (or a plurality of) function blocks containing the same data. This connection then exclusively controls data traffic between these specific function blocks. In all instances only one data packet can be present on the bus at a time. As a result, highly complex interconnection is required. The speed of known internal buses is limited by the maximum bus width and the signal's propagation delay on the bus. In the technology described in US Patent No. 5,521,837, automatic multiple occupation of resources is not possible, as connections are permanently specified by software before runtime. These units cannot set up and clear connections automatically on demand. Moreover, there is no automatic synchronization of data transfer via dedicated lines. It is up to the user to program functionality of this kind. However, automatic synchronization
is vital to automatically set up and clear connections so as to ensure that data are correct and loss-free.
Improvements Achieved by the Invention; Object of the Invention The object of the present invention is to create a bus system that can transfer data between a plurality of function blocks in such a way that a plurality of data packets can be present on the bus simultaneously. The bus system automatically identifies the correct connections for the various different data types or data transmitters and sets up these connections.
The details, advantageous embodiments and features of the bus system according to the invention are the subject of the patent claims.
Description of the Invention Overview of the Invention, Abstract A bus system is described that can be integrated into a unit horizontally, vertically, diagonally or in any position desired.
The bus system is divided into a plurality of segments, and segments are separated by a bus control circuit. This bus control circuit is referred to as the node. The node also handles routing, i.e., it controls the direction in which data flow. Logic cells or PAE cells customary in DFPs are connected to the nodes, and they send and receive their data via the nodes; a target address can be assigned to each data packet.
In addition, the bus system can generate target addresses using lookup tables, referred to below as routing tables. To this end, a method can be used whereby at least one entry is selected from a table based on events that occur, and a specific receiver can be reconfigured via this entry (or entries). This method is known from, for example, German Patent No. 196 54 846.2-53. The bus systems are particularly suitable for setting up direct connections to the unit's external, peripheral connectors. Set-up and clearing of
connections, and synchronization of data, are carried out automatically. If a connection cannot be set up because the required bus segment is busy at that particular moment, connection set-up is repeated at a later time. Different types of connections can be set up automatically based on different data types or data transmitters.
Detailed Description of the Invention Bus System A plurality of buses are arranged horizontally, vertically, diagonally or in any desired position, on a unit. Individual buses do not extend continuously from one edge of the unit to the other, but rather they are divided into a plurality of bus segments. The bus segments are separated by the nodes described below. Bus segments can be used and interconnected independently of one another, and interconnection is handled by the nodes. A separate protocol managed by the nodes can be implemented on the bus system; it is also feasible for the protocol to be managed by the cells that use the bus, so that the nodes are simply passive switches.
Node A node is used to connect individual bus segments with each other. In addition, nodes handle the task of connecting cells to bus segments.
The bus segments for all directions come together in a node, i.e., - in a two-dimensional system, buses lead to the node from four directions, North (N), South (S), East (E), and West (W);
- in a three-dimensional system, buses lead to the node from six directions, North (N), South (S), East (E), West (W), Top (T), and Bottom (B);
- in an n-dimensional system, buses lead to the node from 2n directions (there are two directions for each dimension, and the direction is indicated by the plus or minus sign of the direction vector).
Type A Node Within the node there is a bus system to which the external buses are connected and which thus includes a plurality of bus bars. A first external bus is connected to a bus bar via a gate. The bus bar is connected to the second external bus via an additional gate. To allow broadcasting, i.e., sending of data to a plurality of receivers, a plurality of 'second' buses can be connected to the internal bus system. A gate may be a purely passive switch, a bus driver or a register/latch.
Furthermore, the node has access to one (or a plurality of) configurable elements (cells) connected to it. It connects them to one or a plurality of the adjacent bus segments.
Type B Node In contrast to type A nodes, there is no internal bus system.
The node includes n configurable switches which can connect each adjacent segment to any other adjacent segment. For example in two-dimensional structures, n = 6.
N   Connection
5   S-W / W-S
6   W-N / N-W
(N = North; E = East; S = South; W = West) A switch may be unidirectional or bidirectional, and may contain a register or latch for storing data.
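As a rough illustration of the switch count stated above, here is a small sketch (the function and label format are our own assumptions, not the patent's): a two-dimensional Type B node joins the four directions N, E, S, W, and one configurable switch exists per unordered pair of adjacent segments, which yields n = 6.

```python
from itertools import combinations

def pairwise_switches(directions):
    """Return one 'A-B/B-A' label per unordered pair of directions."""
    return [f"{a}-{b}/{b}-{a}" for a, b in combinations(directions, 2)]

# The four directions of a two-dimensional node give C(4, 2) = 6 switches.
switches = pairwise_switches(["N", "E", "S", "W"])
print(len(switches))   # 6
print(switches)
```

The same count generalizes to higher dimensions by listing more directions.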
'Standard' Routing Method A connection is initiated by a data transmitter (DS) - a configurable element (logic cell, bus cell or external connection) - that requires a connection to a data receiver (DR), which is also a configurable element. To this end, the DS communicates its bus request to the node. The first node downstream from a data transmitter is called the initiator node. It obtains the address of the next node required for data transfer from an internal table, as described below.
Provided the node is capable of addressing the first bus segment required - this is always possible if the bus segment AND an internal bus bar of the node are free - it sets up the connection. Otherwise, it rejects the DS's request, and the DS
attempts access again later, or waits and sustains the access request until the node accepts it.
Each subsequent node obtains the address of the next node from its internal table and continues to set up the connection. If the node is unable to set up the connection (either the required bus segment is busy or the bus bar of the node is not free), it can either jump to a wait condition or interrupt set-up and return an error to the initiator node.
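The hop-by-hop set-up described above can be sketched as follows. The `Node` class, its routing-table layout, and the segment bookkeeping are illustrative assumptions rather than the patent's implementation: each node maps a target to the next node, one bus segment is claimed per hop, and a busy segment aborts set-up so the data transmitter can retry later.

```python
class Node:
    def __init__(self, routing_table):
        self.routing_table = routing_table  # target name -> next node name
        self.busy_segments = set()          # segments this node has claimed

    def try_claim(self, segment):
        """Claim a segment if it is free; reject the request otherwise."""
        if segment in self.busy_segments:
            return False
        self.busy_segments.add(segment)
        return True

def set_up_connection(nodes, initiator, target):
    """Walk from node to node claiming segments; None means 'retry later'."""
    path, current = [initiator], initiator
    while current != target:
        nxt = nodes[current].routing_table[target]
        if not nodes[current].try_claim((current, nxt)):
            return None                     # segment busy: set-up rejected
        path.append(nxt)
        current = nxt
    return path

nodes = {"A": Node({"C": "B"}), "B": Node({"C": "C"}), "C": Node({})}
print(set_up_connection(nodes, "A", "C"))   # ['A', 'B', 'C']
print(set_up_connection(nodes, "A", "C"))   # None: segment A-B still claimed
```

The second call models the rejection path: the caller would repeat the attempt later or sustain the request, as the text describes.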
Once the connection has been completely set up, the data packets are sent and the transmitter receives the acknowledgement signals for data exchange (Ready/Acknowledge protocol). Thus data transfer is automatically synchronized with the data transmitters. If connection set-up fails and has to be repeated later, the data transmitter receives no acknowledge signal, so no data are lost.
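A toy model of that Ready/Acknowledge behaviour (our own simplification, not the patent's circuit): the data transmitter keeps its copy of a packet until the acknowledge arrives, so a rejected set-up loses no data.

```python
def transmit(packet, connection_up, received):
    """Send one packet; True means the transmitter saw an acknowledge."""
    if connection_up:
        received.append(packet)   # receiver takes the data ...
        return True               # ... and acknowledges it
    return False                  # no acknowledge: transmitter keeps the packet

pending, received = ["d0"], []
if not transmit(pending[0], connection_up=False, received=received):
    pass                          # set-up failed; "d0" stays in `pending`
if transmit(pending[0], connection_up=True, received=received):
    pending.pop(0)                # acknowledged: transmitter may drop its copy
print(received, pending)          # ['d0'] []
```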
Once the connection has been completely set up, it remains quasi-permanent (i.e., it appears to the DS and DR to be a direct connection) until the DS clears the connection by sending a message to the initiator node. It is conceivable to implement a timeout procedure which disconnects an existing connection after a given period of time, particularly if no data transfer has taken place for a fairly long period of time, so that bus segments can be cleared for other connections.
Extended routing method In the aforementioned method for setting up a connection, the addresses are only present on the bus during the set-up phase, and are no longer transferred during data transfer. In the extended routing method for setting up connections, addresses are sent continuously along with the data on separate lines.
There are two addressing schemes:
1. Spatial Coordinates The spatial coordinates of the target are supplied as the addresses. The spatial coordinates are dependent on the selected dimension of the system. For example, a three-dimensional system uses coordinates X/Y/Z, while a two-dimensional system uses X/Y.
In principle, the method will work in systems having any number of dimensions. A direction vector indicating whether the data/connection set-up moves in a positive or a negative direction is assigned to each coordinate.
The data moves from the source node in one of the specified directions. When a node is passed, the corresponding direction coordinate is modified in such a way that - in the case of movement in a positive direction 1 is subtracted;
- in the case of movement in a negative direction 1 is added.
The target axis is reached when the coordinate is 0, and the target node is reached when all coordinates are 0.
Depending on the implementation, a complement to two may be generated, and a coordinate may be represented as a negative number (which is added to); alternatively, it is possible to subtract from a positive number. In addition, it is possible to add to a positive number until an overflow is generated, which identifies the target position.
There are two sensible strategies for specifying the direction of movement of data:
a. Static method: Data always moves in the same direction, i.e., the system tries to always maintain the same direction until a change of direction is absolutely necessary. A change of direction is required if the instantaneous direction coordinate is equal to 0, i.e., the target position has been reached.
If the target position of a coordinate is reached, the data is moved in the direction of the coordinates that are not equal to 0. If all coordinates are equal to 0, the data have reached their target node.
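The static method can be sketched as follows, under assumed names: keep stepping the current coordinate toward 0 (positive movement subtracts 1, negative movement adds 1) and change axis only when that coordinate reaches 0; all coordinates at 0 means the target node is reached.

```python
def static_step(coords):
    """Advance one hop; returns (axis, new_coords), axis None at the target."""
    for axis, c in enumerate(coords):
        if c != 0:
            new_coords = list(coords)
            new_coords[axis] = c - 1 if c > 0 else c + 1
            return axis, new_coords
    return None, list(coords)        # every coordinate 0: target reached

def route(coords):
    """Collect the sequence of axes travelled until the target is reached."""
    hops = []
    axis, coords = static_step(coords)
    while axis is not None:
        hops.append(axis)
        axis, coords = static_step(coords)
    return hops

# e.g. a target at X = +2, Y = -1 in a two-dimensional system
print(route([2, -1]))   # two hops along axis 0, then one along axis 1: [0, 0, 1]
```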
b. Dynamic method. The data are moved in any one of the possible directions as desired, and movement is always in the direction specified by the direction vector. Movement 'as desired' here means that the data is always forwarded to the
node where there is the least traffic. Thus the system tries to always take the route involving the fewest collisions and blockages. In some cases this method is faster and more suitable for large-scale systems.
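The dynamic method differs only in how the next axis is chosen; in this sketch (the traffic figures and names are illustrative assumptions) the data is forwarded, among the axes whose coordinate is not yet 0, toward the neighbour reporting the least traffic, so the route adapts to congestion while still honouring every direction vector.

```python
def dynamic_step(coords, traffic):
    """Pick the legal axis with the least traffic; axis None at the target."""
    legal = [axis for axis, c in enumerate(coords) if c != 0]
    if not legal:
        return None, list(coords)                 # target node reached
    axis = min(legal, key=lambda a: traffic[a])   # least-congested direction
    new_coords = list(coords)
    new_coords[axis] += -1 if new_coords[axis] > 0 else 1
    return axis, new_coords

axis, coords = dynamic_step([2, -1], traffic=[5, 1])
print(axis, coords)   # axis 1 has less traffic: 1 [2, 0]
```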
2. Address Lookup If lookup addresses are sent, the next node is specified anew in each node, as follows: The entry of the lookup address is read out in the routing tables described below, so that data for the next target is specified. This procedure is the same as in the 'standard' routing method.
The advantage of spatial coordinates is that lookup in a table is not required, which means less management is required. The advantage of address lookup lies in its flexibility and in the fact that the connection can be accurately predicted in advance.
It sometimes makes sense to mix the two methods. In such cases both addresses (spatial coordinates and lookup addresses) must be sent at the same time. If the spatial coordinates are not equal to 0, the data is sent based on the spatial coordinates.
As soon as the spatial coordinates reach 0, a lookup using the lookup address is carried out in the routing table of the present node. One can thus flexibly specify the segments via which data are sent using either the lookup method or the spatial coordinates method.
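Purely as an illustration, one node's forwarding decision under this mixed method can be sketched as follows; all identifiers are illustrative, and the static direction choice of method (a) is assumed:

```python
def route_step(coords, lookup_address, routing_table):
    """One node's forwarding decision in the mixed method (sketch).

    coords: relative spatial coordinates of the target, one per axis.
    Nonzero coordinates take precedence; once all of them are 0, the
    lookup address is resolved in the local routing table instead.
    """
    for axis, c in enumerate(coords):
        if c != 0:
            step = 1 if c > 0 else -1
            new_coords = list(coords)
            new_coords[axis] -= step  # move one node toward coordinate 0
            return ('spatial', axis, step, new_coords)
    # All coordinates are 0: the spatial target is reached, so fall
    # back to a lookup in the present node's routing table.
    return ('lookup', routing_table[lookup_address])
```

A dynamic variant (method b) would instead pick, among the nonzero axes, the one whose outgoing segment currently carries the least traffic.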
Bus States in the Extended Routing Method
1. Quasi-permanent
A connection may be set up as quasi-permanent in much the same way as in the 'standard' routing method. Each node which a first data item passes is permanently enabled with respect to
the addresses assigned to this data item. Enablement is then maintained for all subsequent data until the connection is aborted. Setting up a connection in this way is not absolutely necessary in the extended routing method, but it has two advantages:
i. Data throughput time is reduced considerably.
ii. There are no time losses due to arbitration.
2. Registered
This is the normal bus status. Incoming data is first stored temporarily in a register. When the data is written to the register, an acknowledgment signal is sent to the sending node. Data is only written to the register if the latter is empty, i.e., if no data has been written to it previously, or if data written to it previously has already been sent on. If the register is not empty, the system waits, and an acknowledgment signal is not generated until the register is empty. The registers are arbitrated, and the contents of the register with the highest priority at the time are sent on.
Arbitration and transfer are carried out cyclically with each clock pulse. This method is very suitable for transfer channels along which data from many different sources is to be sent in a non-time-critical manner; it is known as 'source-optimized.'
3. Segmented
A segmented transfer channel includes both quasi-permanent and register-oriented nodes. This means the transfer channel is speed-optimized (quasi-permanent) at some points and source-optimized at others.
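The 'registered' bus state described under point 2 can be sketched as follows; this is an illustrative model, not the circuit itself:

```python
class RegisteredNode:
    """Sketch of the 'registered' bus state: incoming data is latched in
    a register only when it is empty; the acknowledgment is withheld
    until the previously latched data has been forwarded."""
    def __init__(self):
        self.register = None  # None means 'empty'

    def write(self, data):
        """Returns True (ACK) if data was latched, False (wait) otherwise."""
        if self.register is None:
            self.register = data
            return True
        return False

    def forward(self):
        """Send the latched data onward, freeing the register."""
        data, self.register = self.register, None
        return data
```

A sending node that receives False simply retries on a later bus cycle; the acknowledgment only appears once the register has been emptied by forwarding.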
Routing Tables
The basic component of a node is a routing table.
For an example of routing table structure, please see the exemplary embodiment below:
Gate      EALU      Bus   Entry Address     Propagation   Set-up
                          in Target Table   Delay         Time
1 2 3 4   O O R R   1 0   a..0              b..0          c..0

Each line represents a valid connection. A plurality of connections may be active simultaneously; the maximum number of active connections is determined by the number of free internal bus bars and free external bus segments. A monitoring logic circuit, as described below, performs a test to determine whether a newly selected connection can be set up.
Each line is addressed and selected by its binary address. In addition, there are special rows that have no binary address but rather are selected via trigger signals or status signals.
These signals include the following:
~ rRDYl, rRDYh, oACK1, oACK2
Bus connections to data receivers (rRDY) are set up automatically in all instances, provided the data transmitter cell has valid results. In the case of the oACK signals, the receiver/transmitter sequence is reversed: the data receiver sets up the connection to its data transmitter as soon as it has processed its operands and is ready to process new ones.
The active gates for the connection in question are entered in the Gate columns. It is possible for just one gate to be marked, and a connection to a configurable element (one or a plurality of cells) can be selected in the EALU columns (e.g., a connection to a cell's input or output). The internal bus bar used for the connection is selected in the Bus column; the value is binary, so in the table shown a total of four internal bus bars can be used. Internal bus bars do not have to be selected if a priority decoder detects the first free bus bar and assigns it automatically.
The address of the line of the table that controls the node to which a connection is being set up is indicated in the Entry Address in Target Table column. The routing information required for the present connection is present at this address in the next node.
The Propagation Delay column is optional, and may contain the anticipated signal propagation delay between DS and DR. This information may be used to calculate data throughput or to generate a timeout.
The Set-up Time column is optional. The maximum time for setting up the connection to the next node (or the entire connection from DS to DR) may be indicated here. If this time is exceeded, connection set-up can be aborted via a timeout, thus clearing the bus segments and nodes for other connections. The DS will make a further attempt to set up the connection at a later time.
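By way of illustration, one line of such a routing table can be modeled as a record; the field names below are illustrative, not from the specification:

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class RoutingTableEntry:
    """One line of a routing table as described above (sketch).
    Each line represents one valid connection."""
    gates: Tuple[int, ...]       # which of the four gates are active
    ealu: Tuple[int, ...]        # connection to configurable element(s)
    bus: int                     # internal bus bar selected (binary value)
    entry_address: int           # controlling line's address in the next node
    propagation_delay: Optional[int] = None  # optional, between DS and DR
    setup_time: Optional[int] = None         # optional set-up timeout
```

The two optional columns default to 'not indicated', matching the text above.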
The entries in the routing table can be configured and reconfigured by a superordinate PLU unit using known methods.
If the 'extended' routing method is used, spatial coordinates must be added to routing tables. In addition, a priority flag must be provided. The priority flag indicates how important a channel is in terms of a unit's performance. The higher the priority flag, the more important the channel. The flag can be set up in three ways:
1. Timeout The flag indicates after how many unused clock cycles the channel is to be interrupted, i.e., after how many cycles a DISCONNECT is generated.
2. Packet size The flag indicates after how many data packets a DISCONNECT is generated.
3. Clock cycles The flag indicates after how many clock cycles a DISCONNECT is generated.
One of the three types can be permanently implemented, or one type can be selected via additional data.
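The three priority-flag interpretations can be sketched as follows; the counter names are illustrative and are assumed to be maintained by the node:

```python
# Illustrative encoding of the three priority-flag types.
TIMEOUT, PACKET_SIZE, CLOCK_CYCLES = 0, 1, 2

def should_disconnect(flag_type, flag_value,
                      idle_cycles, packets_sent, total_cycles):
    """Decide whether the node generates a DISCONNECT for this channel."""
    if flag_type == TIMEOUT:
        return idle_cycles >= flag_value    # unused clock cycles elapsed
    if flag_type == PACKET_SIZE:
        return packets_sent >= flag_value   # data packets transferred
    if flag_type == CLOCK_CYCLES:
        return total_cycles >= flag_value   # clock cycles elapsed
    return False                            # e.g. a 'permanent' channel
```

A fourth, unmatched type value never disconnects, corresponding to a permanent channel.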
In the example of a routing table shown below, the priority type (shown as 'Prior. Type' in the table below) is evaluated as follows:

Bit Combination   Priority Type
00                Timeout
01                Packet size
10                Clock cycles
11                Permanent

To indicate 'permanent,' one can also permanently designate a priority flag value as 'permanent.' Usually the priority flag's maximum value or 0 is most suitable for this.
Gate      EALU      Bus   Entry Address     Y X Spatial Coordinate   Priority   Prior.
                          in Target Table   and Direction Vector     Flag       Type
1 2 3 4   O O R R   1 0   k..0              m..0  n..0                          1 0

Arbiter
An arbiter is connected upstream from the routing table and selects a plurality of signals from the multitude of requests to set up a connection via the routing table. It is advisable to base the arbiter on a known priority logic circuit and a known round-robin arbiter (a round-robin arbiter always assigns the highest priority on a time slice to the next signal, i.e., the signal that presently has the highest priority will subsequently have the lowest priority, and is then assigned a higher priority with each subsequent access).
One can use the priority logic circuit to basically assign especially high or low priority to some signals such as rACK
or oRDY. The round-robin arbiter ensures that a connection set-up that has been requested but not yet established is assigned lowest priority, and must wait until all other connection requests have either been set up or checked.
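The round-robin policy described above can be sketched as follows; the list-based bookkeeping is illustrative:

```python
class RoundRobinArbiter:
    """Sketch of a round-robin arbiter: the signal just granted becomes
    lowest priority, and each signal's priority rises with every
    subsequent arbitration it does not win."""
    def __init__(self, n):
        self.order = list(range(n))  # front of the list = highest priority

    def grant(self, requests):
        """requests: set of requesting signal indices. Returns the
        winner (or None) and rotates it to lowest priority."""
        for i, sig in enumerate(self.order):
            if sig in requests:
                self.order.append(self.order.pop(i))  # demote the winner
                return sig
        return None
```

A rejected connection set-up would likewise be rotated to the back of the order, so it waits until all other pending requests have been set up or checked.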
State Machine (Control)
The state machine controls internal sequences in nodes. The state machine is divided into two parts:
- Control of the node
- Control of bus transfer and synchronization
The state machines can be accessed by the system. State machine(s) can be implemented using known methods and are not described in detail here.
Monitoring Unit
A monitoring unit is connected downstream from the routing table. It receives the data entered in a line that has been addressed and checks whether the connection to be set up is feasible. In particular, it checks the following:
~ Is there a free internal bus bar available?
~ Is the gate that has been requested free?
~ Is the external bus segment that has been requested free?
a) If the results of the check are positive, an ACCEPT signal is generated and sent to the state machines and the unit sending the connection set-up request to indicate successful set-up.
b) If the results of the check are negative, a REJECT signal is generated and sent to the state machines and the unit issuing the connection set-up request to indicate that set-up has failed. The aforementioned arbiter can react to the signal and set the priority of the request to 'lowest priority.'

Clearing Connections
An existing connection can be cleared based on various criteria. The most important criteria include:
Timeout: A given connection is cleared because an extended period has elapsed during which no data was sent. A timeout is easily implemented using a loadable decrementer. Each time a data item is sent, the counter is loaded anew with a fixed initial value representing the length of time until the timeout. If no data is sent, it counts down by one with each bus clock cycle. When it reaches zero, the maximum permitted idle time has elapsed and the bus is cleared.
Data counter: A loadable counter is loaded with the number of data items to be sent. For each data transfer, the counter counts down one. If the counter reaches zero, all data has been sent and the bus is cleared.
Synchronization signals: Clearing of buses is controlled by the status signals and/or synchronization signals of the cell(s) sending the data. For example, if the cells indicate that they have completed data processing or that they can be reconfigured, the bus is cleared, as it is no longer required.
These signals include rRDY and oACK.
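The timeout and data-counter criteria can be sketched as follows; both classes are illustrative models of the loadable counters described above:

```python
class TimeoutCounter:
    """Sketch of the loadable decrementer: reloaded on every data item,
    counting down once per bus clock cycle otherwise; reaching zero
    means the connection is to be cleared."""
    def __init__(self, initial):
        self.initial = initial
        self.value = initial

    def data_sent(self):
        self.value = self.initial   # reload on each transfer

    def clock(self):
        """One bus clock cycle without data; True once the timeout elapses."""
        if self.value > 0:
            self.value -= 1
        return self.value == 0

class DataCounter:
    """Sketch of the loadable data counter: loaded with the number of
    data items to be sent; zero means all data has been sent."""
    def __init__(self, n_items):
        self.value = n_items

    def transfer(self):
        self.value -= 1
        return self.value == 0
```

In either case, reaching zero is the condition under which the node clears the bus.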
Connections are cleared as follows: a signal to clear the connection is sent from the initiator node. Each subsequent node forwards the signal received to its partner node(s) and immediately clears the connection.
In the extended routing method, quasi-permanent buses are either cleared via the RECONFIG synchronization signal or based on the priority flag. If a node detects the end of a connection based on the priority flag, it generates the DISCONNECT signal to clear the bus and forwards it to all other nodes. The nodes react to a DISCONNECT in the same way they react to a RECONFIG. In the case of register-oriented nodes, connections do not have to be cleared, as they are set up dynamically using the incoming data and based on the assigned addresses. After the data has been forwarded, the connection in question is cleared automatically and made available for other transfers.
Broadcasting
Bus systems that allow a data packet to be sent to a plurality of receivers, and allow the transfer of data to be acknowledged, are known. Such a system also works with the method described here: it is a simple matter to connect a plurality of gates to one bus bar, and just one row in the routing table is used. Inevitably, however, the address within a given target routing table (Entry Address in Target Table) then has to be the same in each selected routing table.
To cope with this problem, a plurality of entries can be made available for the Entry Address in Target Table, e.g., there could be a separate Entry Address in Target Table for each gate. Thus each gate would be assigned an address within the target routing table.
Acknowledgment is carried out via signal lines, which use a simple Ready/Acknowledge protocol or, if necessary, an extended, more complex protocol; they are driven by an open-collector driver and terminate at a transistor. To ensure that acknowledge signals can be easily implemented in existing chip technologies, the acknowledge signals can first be masked in a node and then subjected to a logic operation. The result of the logic operation is then moved to the next node. For example, if the acknowledge signals are subjected to an AND operation in all the nodes passed, the result is the same as in the case of an open-collector circuit.
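Per-node masking and ANDing of the acknowledge signals can be sketched as follows; chaining this through all nodes passed yields the wired-AND result of an open-collector line (identifiers are illustrative):

```python
def combine_acks(incoming_ack, local_acks, mask):
    """One node's acknowledge handling (sketch): local acknowledge
    signals are first masked (gates not involved in the broadcast are
    ignored), then ANDed together with the acknowledge arriving from
    the previous node. The result is forwarded to the next node."""
    relevant = [ack for ack, m in zip(local_acks, mask) if m]
    return incoming_ack and all(relevant)
```

Only the gates selected by the mask can block the acknowledgment; a single unacknowledged, involved gate anywhere in the chain forces the overall result low, just as on an open-collector line.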
Primary Logic Unit (PLU)
A primary logic unit according to German Patent Application No. 44 16 881 A1 is connected to the routing table. It is used to configure and reconfigure the routing table. Furthermore, an additional column, which is used to send an acknowledgment to the primary logic unit when the connection indicated in the line in question is set up or cleared, can be added to the routing tables. Thus a column may indicate whether an acknowledgment should be sent to the PLU if the connection in question is set up or cleared, and what type of acknowledgment should be sent. The acknowledgment is sent via a gate which completes a circuit either when the connection is set up or when it is cleared, depending on the setting. The gate also addresses the acknowledgment to the transistors (1502), which are connected as open-collector drivers.
Rest of the table (PLU):

Message if        Message if        Binary value of
connection is     connection is     feedback
set up            cleared           b 2 1 0

Brief Description of Diagrams
Figure 1 shows a two-dimensional unit having a cell array and nodes.
Figure 2 shows a detail of Figure 1.
Figure 3 shows a plurality of existing bus connections, and new connections being set up.
Figure 4 shows a new bus connection being set up.
Figure 5 shows the next step in setting up the connection.
Figure 6 shows the collision of two buses.
Figure 7 shows the step-by-step clearing of a connection following a collision.
Figure 8 shows the step-by-step clearing of a connection following a collision.
Figure 9 shows the step-by-step clearing of a connection following a collision.
Figure 10 shows the step-by-step clearing of a connection following a collision.
Figure 11 shows setting up a cleared connection anew after a specific period of time.
Figure 12 shows the continuation of Figure 6 if the node has more than one bus bar, and therefore no collision takes place.
Figure 13 shows bus segments connected to the bus bars of a node.
Figure 14 shows data transfer in the case of connection set-up. Each further sub-figure shows the status of the connection in intervals of one bus cycle.
Figure 15 shows data transfer in the case of connection clearance. Each further sub-figure shows the status of the connection in intervals of one bus cycle.
Figure 16 shows the control unit of a node.
Figure 17 shows broadcasting to a plurality of data receivers.
Figure 18 shows a control unit of a node having collision detectors that has been improved relative to Figure 16.
Figure 19 shows a routing table, improved relative to Figure 16, for use with the control unit shown in Figure 18.
Figure 20 shows bus bars suitable for Figures 18 and 19.
Figure 21 shows broadcasting to a plurality of nodes that has been improved relative to Figure 17.
Figure 22 shows the sequence control for Figures 18-21.
Figure 23 shows a bus node based on the extended routing method.
Figure 24 shows an optimized version of Figure 23.
Figure 25 shows a data register having a synchronization unit.
Figure 26 shows a collision detector based on the extended routing method.
Figure 27 shows the control unit for Figures 23-26.
Figure 28 shows a throughput-boosting, cascadable addition to Figure 27.
Figure 29 shows the control register for Figure 27 (2705).
Figure 30 shows a bus set up using relative spatial coordinates based on the extended routing method.
Figure 31 shows a bus set up using absolute spatial coordinates based on the extended routing method.
Figure 32 shows management of spatial co-ordinates.
Figure 33 shows a segmented bus set-up.
Detailed Description of the Diagrams
Figure 1 shows an FPGA-type, DPGA-type or DFP-type (German Patent No. 44 16 881 A1) unit. The unit is symmetrical in two dimensions, being made up of configurable cells (0101). 0101 can also represent a plurality of different configurable cells which are grouped together and interconnected with each other.
The nodes of the bus system (0102) are located between the cells. A plurality of nodes is indicated, and in the exemplary embodiment described below they will set up a plurality of connections. As described below, data transmitter A (0103) will set up a connection to data receiver A (0104), and data transmitter B (0106) will set up a connection to data receiver B (0105). An enlarged detail (0107) is shown in Figure 2.
Figure 2 shows a section from a unit of the type described previously. The configurable cell(s) from Figure 1 (0101) are shown as 0201. A bundled plurality of wires (0203), the number of which may be specified as desired and is not shown exactly, connects 0201 to a node 0202. Nodes (0202) are connected with one another via bus segments (0205). In addition, the nodes are connected to the nodes located outside the enlarged detail via bus segments 0206, which have the same design as bus segments 0205. The optional bundled wires (0204) indicate that the configurable cell(s) can also be connected to a plurality of nodes (0202) via a plurality of different bundled wires.
Figure 3 shows the unit during runtime. There is a plurality of connections:
~ Between nodes Z (0301), and
~ Between nodes Y (0306).
Data transmitter A (0302) attempts to set up a connection (0303) to data receiver A (0307). However, the connection is rejected (REJECT), as it is blocked at node Z (0308). At the same time, data transmitter B (0304) sets up a bus segment (0305) to its receiver. This attempt is successful, as the node that is addressed and the required bus segment are not blocked.
Figure 4 shows the next bus cycle. Connections Y and Z have in the meantime been cleared. Data transmitter A (0401) can now set up bus segment (0403), as node (0402) is no longer blocked. At the same time, data transmitter B (0404) extends the existing bus segment (0405) beyond node (0406), thus setting up new bus segment (0407).
Figure 5 shows the bus set-up started in Figure 3 being continued in the same manner as in Figure 4.
Figure 6 shows how the attempt to set up bus segment connection 0602 from data transmitter B fails. Node 0601 is busy; it sends node 0603 a REJECT signal indicating that connection set-up has failed, and the connection is cleared.
Figure 7 shows the complete connection set-up between data transmitter A (0701) and data receiver A (0702). The connection from data transmitter B is cleared further. Node 0703 sends a REJECT signal to node 0705 via bus segment (0704). As a result, segment 0704 is cleared.
Figure 8 shows data transfer between data transmitter A and data receiver A beginning. The connection from data transmitter B is cleared further. Node 0801 sends a REJECT signal to node 0803 via bus segment (0802). As a result, segment 0802 is cleared.
Figure 9 shows data transfer between data transmitter A and data receiver A continues. The last segment from data transmitter B is cleared. Node 0901 sends a REJECT signal to node 0903 via bus segment (0902). As a result, segment 0902 is cleared.
Figure 10 shows data transfer between data transmitter A and data receiver A continues. Data transmitter B (1001) waits for a given period of time before making a further attempt to set up a connection to its data receiver.
Figure 11 shows the status several bus cycles later: Data transfer between data transmitter A and data receiver A is still continuing. Data transmitter B (1101) starts a new attempt to set up the connection to its data receiver. It sets up a bus segment (1102) to node (1103). Provided the connection from data transmitter A to its data receiver A is cleared in the next bus cycles, connection set-up from data transmitter B (1101) will be successful. Otherwise it will fail again in the manner described in Figure 6.
Figure 12 shows the continuation of Figure 6 if node 1202 is able to set up more than one connection, i.e., if the node has a plurality of internal bus bars. If so, the connection from data transmitter A will be processed via the first bus bar and the connection from data transmitter B via the second. Node 1202 sets up bus segment 1203 to data receiver B (1201).
Figure 13 shows a node-internal interconnection structure.
There are four node-internal bus bars 1301, 1302, 1303 and 1304. The bus bars are connected to bus segments West (1318), North (1316), East (1319), and South (1317) via a group of gates (1308, 1309, 1310, 1311). The bus bars are connected with O-REG1/2 (1314, 1315) via gates 1307. The R-REG is connected to the bus bars via gates 1306. The lower-order R-REG and the higher-order R-REG (1312, 1313) can be connected separately. Gates are controlled via bus 1320. The gate system (West, North, East, South) required and the internal bus bar required are indicated in this way. The gate (1321, 1322, 1323, 1324) required is selected by ANDing (1325, 1326, 1327, 1328) the data of the direction with the data of the selected bus bar.
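The gate selection at the end of the paragraph above can be sketched as follows; each gate is the AND of a direction select with a bus-bar select (an illustrative model, not the circuit itself):

```python
def select_gates(direction_select, bus_select):
    """Gate selection sketch for Figure 13: the required gate is the
    AND of the direction signal (West/North/East/South) with the
    selected internal bus bar. Inputs are 0/1 select vectors; the
    result marks one gate per (direction, bus bar) pair."""
    return [[d & b for b in bus_select] for d in direction_select]
```

Exactly one gate is activated when exactly one direction and one bus bar are selected.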
Figure 14a shows the setting up of a connection. Data transmitter (1401) sends the first data packet to the node of the data transmitter (1402).
Figure 14b shows how the node selects the entry associated with rRDY (a status signal indicating that data is ready at the data transmitter) from the routing table. Based on this entry, the next bus segment is set up and the address of the routing table of the next node is sent.
Figure 14c shows the last node (1403), the data receiver node, receives the address for the entry within its routing table.
The entry does not indicate a further node, but rather a cell.
As a result, the node immediately activates its gates to the selected cell.
Figure 14d shows the data is moved directly to receiver cell 1404 via the activated gate of 1403.
Figure 14e shows the cell returns the oACK signal to acknowledge that the data has been received. The data transmitter will send the next data packet in the next bus cycle (see Figure 14e).
Figures 14e-g show normal exchange of data taking place between the cells.
Figure 15a shows a data connection from the data transmitter (1501) to the data receiver (1503) via a plurality of nodes exists.
Figure 15b shows the data transmitter (1501) has ended its data transfer and sends a DISCONNECT signal to the first node.
Figure 15c shows the first bus segment is cleared and the node forwards the DISCONNECT.
Figure 15d shows clearing of the connection continues.
Figure 15d shows the last node receives the DISCONNECT. As a result, the last node simultaneously clears the connection to the preceding node and to the data receiver.
Figure 15e shows the last bus segment and the connection to the data receiver has been cleared.
Figure 15f shows a clearing method in which the DISCONNECT
line is looped through all the nodes. Thus the DISCONNECT is propagated in one clock cycle, and all segments are cleared simultaneously.
Figure 15g shows the same status as Figure 15e.
Figure 16 shows the control unit of a node. Requests (1601) to set up a connection are sent to routing table (1603) via priority decoder (1602). The priority decoder selects the request with the highest priority, and a request that fails as a result is assigned lowest priority. The priority logic unit receives its requests via status signals (e. g., status signals rRDY and oACK from the configurable cells) or via bus segments 1316, 1317, 1318, 1319. If data is present on the bus segments without the gate of the bus segment in question having been activated, the priority logic unit interprets the data as an address of the routing table and treats it as a request. If status signals are present at the priority logic unit (rRDY, oACK), they are translated into addresses for the routing table. The addresses of the routing table select an entry. The data of entry (1604) is forwarded to an AND-gate unit (1605).
The binary number of the bus selection (BUS 1..0) is translated into select signals via a 2:4 decoder (1606). The AND-gate unit ANDs each signal with the same signal stored in a latch (1607), i.e., signal GATE1 of the routing table is ANDed with signal GATE1 in the latch, signal GATE2 of the routing table is ANDed with signal GATE2 in the latch, etc.
The signals in the latch represent the status of the present interconnection structure, i.e., the gates used and the bus bars used are entered in the latch. If, when a connection request is ANDed with the present status, the result is a true level, this means the new connection request requires resources presently in use. All AND gates are combined in a logic operation using an OR gate (1608). If the output of the OR gate is a true level, the connection request is rejected (REJECT) (1609), as the necessary resources are busy. The ACCEPT signal (1611) is generated from the REJECT signal using an inverter (1610). The signals are forwarded to a state machine (1612) which can be implemented using known methods.
The state machine controls whether the connection is accepted or rejected. If the connection request is rejected, the state machine communicates (1613) the REJECT to the priority decoder and the request is assigned lowest priority. If the request is accepted, the new status signals are ORed (1614) with the present status signals downstream from the latch (the OR unit has the same design as AND unit (1605) described previously) and written back to latch (1607). The state machine controls, via 1623, whether OR unit (1614) or mask (1616) is active.
The latch is triggered by the state machine via 1622. The new setting reaches the gates via bus 1615.
Bus connections are cleared in a similar manner. However, the REJECT signal must occur when the resources are checked, as the bus connection to be cleared has to exist. Based on the REJECT, state machine 1612 activates mask (1616) instead of OR unit 1614. The connection data of the connection to be cleared is masked out of the present interconnection status and written back to latch 1607. Before the new connection data is written, the state machine sends the DISCONNECT signal for clearing the connection to the next node.
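Treating the latch contents and a connection request as bitmasks, the AND/OR/mask logic of Figure 16 can be sketched as follows (an illustrative model; a set bit marks a gate or bus bar in use):

```python
def request_connection(status, request):
    """Set-up check (sketch): the request is REJECTed if it needs any
    resource already marked in the latch (AND unit + OR gate); on
    ACCEPT the latch is updated by ORing the request in (OR unit)."""
    if status & request:
        return status, 'REJECT'
    return status | request, 'ACCEPT'

def clear_connection(status, request):
    """Clearing (sketch): here the AND must yield true, since the
    connection to be cleared has to exist; its resources are then
    masked out of the latch."""
    assert status & request == request, "connection to be cleared must exist"
    return status & ~request
```

The same bit pattern that ACCEPTed a request is later masked out again when the connection is cleared.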
The control unit can access bus bars 1301, 1302, 1303, 1304 directly via gates 1617, 1618, 1619, 1620. The state machine can send control signals (DISCONNECT) to a given bus bar, and can also receive control signals from the bus bar (REJECT, ACCEPT) and react to them. In addition, these gates are used to send the Entry Address in Target Table (via 1621) to the given bus bar.
The primary logic unit (PLU) can access the routing table via 1624.
Figure 17 shows how data transmitter (1701) broadcasts to a plurality of data receivers (1702) via a plurality of nodes (1707), which will not be discussed further here. For purposes of clarity, the bus is shown as divided into acknowledgment line (ACK) (1703) and the rest of the bus (1704). ACK is negated and sent to the open-collector bus driver, which also performs an inversion. ACK is pulled to H via pullup resistor 1705. Due to the structure of the circuit, the following cases may arise:
- If the bus in question is not activated, L is present at the base of transistor (1706). As a result it does not place a load on the bus.
- If the bus in question is activated and the signal is not acknowledged, H is present at the base of transistor (1706). This means the bus is pulled to L. If a result is sent to a plurality of data receivers using broadcasting, all nodes that have not yet acknowledged the result data and require wait cycles pull the bus to L.
- If the bus in question is activated and the signal is acknowledged, L is present at the base of transistor (1706). This means no load is placed on the bus. If a result is sent to a plurality of data receivers using broadcasting, all nodes that have acknowledged the result data and do not require wait cycles do not place a load on the bus.
Because in its basic status the bus is at the H level, i.e., acknowledgment, non-acknowledgment per Case 2 overrides acknowledgment by pulling the bus to L. The bus does not go to the H level, i.e., to acknowledgment status, until all nodes involved in a connection send an acknowledgment. This therefore constitutes a wired AND circuit.
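The wired-AND behavior can be modeled compactly; the list-of-pairs encoding of the nodes is an illustrative assumption of this sketch:

```python
def wired_and_ack(drivers):
    """Open-collector ACK line per Figure 17: the pullup keeps the line
    at H (True); every activated node that has NOT yet acknowledged pulls
    it to L (False). The line is H only when all activated nodes have
    acknowledged - a wired AND.
    drivers: list of (active, acked) pairs, one per node."""
    pulled_low = any(active and not acked for active, acked in drivers)
    return not pulled_low  # True = H level = acknowledgment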
Below, we provide an example of implementation for a node having the following bus structure:
Data: Data signals
ACK: Data handshake and set-up handshake
RDY: Data handshake (data is present)
ESTABLISH: Set-up handshake (segment is being set up)
DISCONNECT: Request to clear bus (via a timeout)
RECONFIG: Request to clear bus (by reconfiguring the configurable cells involved)

Figure 18: Control unit of the bus bars shown in Figure 13. By contrast with the solution described above, in which a bus bar is preassigned in routing tables, the logic system finds a free bus bar itself and allocates it.
A control unit 1801 is assigned to each bus bar. A control unit consists of one gate (1801a) in order to send the address information of the routing table to the bus bar of a connected node during connection set-up, and a register (1801b) that controls the bus bar. Via bus 1813, 1801 receives data from the circuit described in Figure 19, which is connected to the routing table. The gates that have access to the assigned bus bar in question are enabled via 1801b. Each gate has an enable signal to which an entry in 1801b is assigned. If no entry is set, the assigned bus bar is not busy and may be freely assigned to any access request. Checking is implemented via an OR function (1802) applied to all enable signals to the gates.
The results of 1802 of all bus bars are sent to an arbiter (1803), which selects one of the free bus bars and addresses its 1801 via an address bus (1804). If no bus bar is free, 1803 communicates this to the controlling state machine via 1805. Each entry in 1801b indicates one gate assigned to the bus bar. The position is the same in every 1801b, i.e., the enable signal for gate p is always located at position p of an 1801b, the enable signal for gate p+1 is always located at position p+1, and the enable signal for gate q is always located at position q. If an OR function is applied to the enable signals of a gate p, the results indicate whether or not gate p is free. There is a checking function of this kind for each gate (1807 = gate p+1, 1808, 1809 = gate q). All gates that are irrelevant to the present connection set-up are masked via mask 1810, i.e., the mask forwards all irrelevant gates as 'not enabled.' An OR function (1811) is used to determine whether one of the gates is enabled. As all irrelevant gates are marked 'not enabled', only the status of the gate required for the present connection set-up is forwarded (1812) to the state machine. If the desired gate is enabled, it cannot be used for the present connection set-up, as this would result in a collision. The connection set-up is interrupted and either rejected or attempted again at a later time.
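The free-bus-bar search (OR 1802, arbiter 1803) and the gate check (1806-1812) reduce to OR operations over the enable registers. A sketch; the nested-list encoding of the 1801b registers and the function names are assumptions made for illustration:

```python
def select_bus_bar(enable_regs):
    """enable_regs[b] models the 1801b register of bus bar b as a list of
    gate-enable bits. A bar is busy if any enable bit is set (OR 1802);
    the arbiter (1803) picks the first free bar, or None if none is free
    (reported via 1805)."""
    for bar, reg in enumerate(enable_regs):
        if not any(reg):
            return bar
    return None

def required_gate_free(enable_regs, gate):
    """Gate p is busy if its enable bit is set in any 1801b (the per-gate
    OR, 1806-1809). Only the gate required for the present connection
    set-up is checked, as the mask (1810) hides all irrelevant gates."""
    return not any(reg[gate] for reg in enable_regs)
```

If the required gate is already enabled somewhere, the connection set-up is interrupted, matching the collision rule above.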
Figure 19 shows the arbiter for selecting the active bus, and the routing table. Each bus connected to a node (2004, 2005, 2006) sends its access request via a signal (1901) to an arbiter (1902), which selects one of the access requests.
Multiplexer (1904) is controlled by a decoder (1903) in such a way that either the number of the selected access (in the case of direct access by a configurable cell) (1905) or the lookup address of the selected access is sent to routing table 1906.
1906 outputs the data assigned to the value of 1905. The lookup address for the next node is sent to 1801a directly via 1813. The address of the next node is decoded into decimal (1907) via a collator (1908) which is usually configured with OR gates and sent to 1801b via bus 1813. If the bus to the next node has been set up, enabling of the gate to the preceding node is made possible in that the address of the preceding node decoded via decimal decoder 1909 is connected to bus 1813 in collator 1908 and sent to 1801b.
Figure 20 shows bus bars (2001, 2002, 2003) for connecting the buses that are present (2004, 2005, 2006). The buses are connected to the bus bars via multiplexer/demultiplexer (2007) based on control via 1801b; the output signals p of all 1801bs are sent to multiplexer p; the output signals p+1 of all 1801bs are sent to multiplexer p+1, etc. The individual signals represent bus bars that are present, as each bus bar control unit controls exactly one (p) of a plurality of signals (pX, pX+1, ..., qX). If a signal associated with a bus bar control unit is set, the corresponding bus bar is connected via a 2007.
Timeout generators (2008) control clearing of the segment in question and the connected buses. The timeout generators are configured directly by routing table (1906). The connection has been omitted from the diagrams for the sake of simplicity.
The 1801as assigned to a given bus bar are connected to that bus bar.
Figure 21 shows how a bus segment (2105) is broadcast to a plurality of nodes (2102, 2103, 2104) via a node (2101). The RDY handshake is sent directly to each receiver in the same manner as the data. The returning ACK handshakes are connected to OR gate (2107) and AND gate (2108) via masks (2105, 2106). The masks select which ACK is significant, and whether an ACK is forwarded via a Boolean AND function or via an OR function. The two functions are combined via an OR gate (2109). If an ACK is irrelevant, mask 2105 forwards a logical 0 (L level), while mask 2106 forwards a logical 1 (H level). Masks 2105 and 2106 are set separately by the routing table. The connection has been omitted from the diagrams for the sake of simplicity.
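The mask/OR/AND combination can be sketched as follows. The list encodings are illustrative, and the suppression of an unused AND path (which would otherwise acknowledge unconditionally on its all-1 neutral inputs) is an assumption of this sketch, not stated in the specification:

```python
def combined_ack(acks, or_mask, and_mask):
    """Combine returning ACK handshakes per Figure 21.
    or_mask models mask 2105: irrelevant ACKs are forwarded as logical 0
    to OR gate 2107. and_mask models mask 2106: irrelevant ACKs are
    forwarded as logical 1 to AND gate 2108. Both results are ORed (2109)."""
    or_part = any(a and m for a, m in zip(acks, or_mask))            # 2107
    if any(and_mask):
        and_part = all(a or not m for a, m in zip(acks, and_mask))   # 2108
    else:
        and_part = False  # assumption: an empty AND path is suppressed
    return or_part or and_part                                       # 2109
```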
Figure 22 shows the state machine of the circuit described above. The basic status is 'IDLE,' which the state machine does not leave until a 'request' (access) occurs AND a bus bar AND the selected gate are free. The state machine sends an acknowledgment of the bus set-up to the preceding state machine by sending an ACK handshake. The state machine goes into SEND status, during which the data of the routing table are sent (via 1801a) to the next routing table. The state machine only leaves this status when an ACK handshake of the next routing table arrives OR when a 'disconnect' signal arrives on the bus (e.g., via a timeout). In the case of a 'disconnect,' the state machine goes into DISCONNECT status in order to clear the bus (this status is not absolutely necessary; in the implementation it jumps back directly to IDLE; it is included in this example for greater clarity). When an ACK handshake arrives, it jumps back to IDLE status, and the gate of the preceding routing table in 1801b is enabled via 1909/1908. To boost the performance of the routing table, waiting for an ACK handshake during SEND status can be omitted. To this end, during SEND the access data to be sent to the next routing table must be stored in 1801a, i.e., 1801a is a register; at the same time the information regarding the preceding routing table must be written to an additional 1801b during SEND. In independently structured logic, arrival of the ACK handshake of the subsequent routing table causes disconnection of 1801a and switchover from the first 1801b to the second, and the connections of the gate of the preceding routing table are stored.
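The state transitions described for Figure 22 can be summarized in a small sketch; the class structure and the signal names passed as keyword arguments are illustrative, not part of the specification:

```python
class NodeFSM:
    """Sketch of the Figure 22 state machine: IDLE -> SEND on a request
    when a bus bar and the selected gate are free; SEND -> IDLE on the
    next routing table's ACK; SEND -> DISCONNECT -> IDLE on 'disconnect'
    (e.g., via a timeout)."""
    def __init__(self):
        self.state = "IDLE"

    def step(self, request=False, bar_free=False, gate_free=False,
             ack=False, disconnect=False):
        if self.state == "IDLE":
            if request and bar_free and gate_free:
                self.state = "SEND"      # routing-table data sent via 1801a
        elif self.state == "SEND":
            if disconnect:
                self.state = "DISCONNECT"
            elif ack:
                self.state = "IDLE"      # gate of preceding table enabled
        elif self.state == "DISCONNECT":
            self.state = "IDLE"          # bus cleared
        return self.state
```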
Remaining diagrams: An example of implementation based on the 'extended' routing method. The bus looks like this:

Data: Data signals
ACK: Data handshake and set-up handshake
RDY: Data handshake (data is present)
DISCONNECT: Request to clear bus (via a timeout)
RECONFIG: Request to clear bus (by reconfiguring the configurable cells involved)
X/Y: X/Y spatial coordinates
LUT: Lookup address for routing table
ADR-MODE: Indicates whether X/Y spatial coordinates or the lookup address is to be used. If X=0 AND Y=0, the lookup address is used automatically.

Figure 23 shows a node having switch elements 1, 2, 3, 4, 5, 6 and buses B1, B2, B3, B4.
Figure 23a: For purposes of clarity, a single-line system is shown, whereas in fact a bus system is involved. All diagrams should therefore be expanded to reflect the number of bus lines. In the most straightforward case, the switch elements are made up of a transistor (2301).
Figure 23b: To allow temporary storage of data, the switch element is expanded by one register (2302). Bidirectional buses Bn and Bm are connected to the register via transistors T1, T2, T3, T4 in such a way that either transition Bm -> Bn or transition Bn -> Bm is stored. Alternatively data transfer may be bidirectional with no storage, via T5. The switch element's mode is set via control signals S1, S2, S3, S4 as follows:
S1: Bm -> Bn (unidirectional, stored)
S2: Bm <-> Bn (bidirectional, non-stored)
S3: Bn <-> Bm (bidirectional, non-stored)
S4: Bn -> Bm (unidirectional, stored)

Figure 23c: Input Schmitt triggers and output drivers (2303) are used to ensure better signal quality. Drivers (2303) are designed so that either the output driver or the input driver can be enabled via a control signal based on the level. Bidirectionality is no longer feasible; only unidirectional switch procedures can be carried out.
The switch element's mode is set via control signals S1, S2, S3, S4 as follows:
S1: Bm -> Bn (unidirectional, stored)
S2: Bm -> Bn (unidirectional, non-stored)
S3: Bn -> Bm (unidirectional, non-stored)
S4: Bn -> Bm (unidirectional, stored)

Figure 23d: The inputs and outputs run on different lines (Bmi, Bmo, Bni, Bno), to allow greater ease of implementation in chips. Drivers (2304) are unidirectional. There is no need for the drivers to be activated.

S1: Bmi -> Bno (unidirectional, stored)
S2: Bmi -> Bno (unidirectional, non-stored)
S3: Bni -> Bmo (unidirectional, non-stored)
S4: Bni -> Bmo (unidirectional, stored)

Figure 24: A node similar to the node shown in Figure 23. This node has the following advantage: it is easier to implement, and it makes it easier to manage the register. For purposes of clarity, a single-line system is shown, whereas in fact a bus system is involved. All diagrams should therefore be expanded to reflect the number of bus lines.
The registers and drivers (A, B, C, D) are arranged upstream from switch elements (1, 2, 3, 4, 5, 6). The switch elements can be reduced to the cross-type arrangement shown in Figure 24a. Using control lines S5, S6, the inputs (Imi, Ini) are selectively connected to outputs (Imo, Ino) via transistors T6, T7.
Figures 24b-d show various embodiments of the registers and drivers (A, B, C, D).
Figure 24b: A bidirectional bus is connected either as input to register 2401 via T8 or as output via T9 with node-internal bus Imo as the signal source. T8 and T9 are activated via control line S7. Using transistor pair T10/T11, which are activated via S8, a register bypass can be connected to allow a quasi-permanent mode. The output of the register goes to node-internal bus Imi. Interconnection of Imi and Imo is carried out via the switch elements shown in Figure 24a.
Figure 24c: To improve signal quality to the bus, driver (2402) is connected instead of transistor pair T8/T9.
Figure 24d: The external bus is unidirectional, which makes it easier to implement in chips. Drivers 2403 are unidirectional, and control signal S7 is not required.
Figure 25 shows a possible synchronization circuit for Figure 24. Register 2401 for storing data is shown as 2501. Register 2502 is used to store an RDY handshake signal, i.e., the information indicating that valid data is present on the bus or in 2501. If there is no valid data in 2501, output Q of 2502 is a logical 0. If valid data arrives (RDY is active and a logical 1), an enable signal (EN) for registers 2501 and 2502 is generated via AND gate 2503, and the data and the RDY are stored on the rising clock pulse edge. The input of 2503 for Q (of 2502) is inverting! If further data arrives, output (Q) of 2502 is a logical 1. The AND gate supplies a logical 0 and the registers are not enabled via EN. If the data is forwarded via the bus, the activation signal of the output driver (OE) is used as 'Clear' for 2502; Q of 2502 becomes a logical 0, and new data can be stored at the next clock cycle.
Lines Din and Dout are shown in bold, as in this case a bus system is involved. 2501 is also shown in bold, as the register matches the width of the bus.
The registers may be designed as latches and coupled to the level of the clock (CLK) or handshake (RDY). However, this means the circuit will behave asynchronously, which can cause significant problems in implementation, and is likely to involve substantial extra cost.
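The clocked behavior described for Figure 25 can be modeled as follows; this is a minimal sketch, and the class and method names are illustrative:

```python
class SyncRegister:
    """Model of Figure 25: data register 2501 plus one-bit RDY register
    2502. New data is latched only while 2502 holds 0 (its Q input to AND
    gate 2503 is inverting); the output-enable (OE) of the driver clears
    2502 so the next datum can be accepted on a later clock edge."""
    def __init__(self):
        self.data = None
        self.rdy = 0

    def clock(self, din, rdy_in):
        """One rising clock edge; returns whether EN was generated."""
        en = bool(rdy_in) and not self.rdy   # AND gate 2503
        if en:
            self.data, self.rdy = din, 1     # 2501 and 2502 store
        return en

    def output_enable(self):
        """Data forwarded onto the bus; OE acts as 'Clear' for 2502."""
        self.rdy = 0
        return self.data
```

A second datum arriving before the first is forwarded is simply not stored, which is exactly the wait-cycle behavior described above.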
Figure 26 shows the testing method for determining whether a connection can be set up, i.e., whether the network is free.
The status information of switch elements 1, 2, 3, 4, 5, 6 that indicates whether a switch element is connected or free is arranged in matrix 2603. 90° switch elements 3, 4, 5, 6 form the corners; 180° switch elements 1 and 2 form the middle parts and occur twice. To ensure that a switch element can be used in a collision-free manner, the entire edge on which it is located must be free. For example, 1+2, 6+4 and 3+5 may be used. By contrast, 6+2, 6+1, 2+5, 2+4, 2+3, etc. cannot be used.
One must therefore test whether, and ensure that, each edge is occupied no more than once. The data for this is supplied via input r by register 2602, in which the present interconnection of nodes is stored, and by routing table 2601, which forwards the data of the desired new bus to the matrix via input t.
The test circuit is shown in Figure 26a. A given row (2605, 2606, 2607, 2608) is tested for the existence of a connection via an OR gate (2609, 2610, 2611, 2612). If a connected element exists in the row, the OR gate in question supplies a logical 1. The result of the row in question is ANDed with any connection to be newly set up present in the row. If the row is already occupied AND a further connection is being requested in the row, the AND gate in question supplies a logical 1. The outputs of all AND gates are OR-ed (2613). Thus the result of the test supplies a logical 0 to 2604 if the required connection is valid and a logical 1 if a collision is present.
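The row test of Figure 26a reduces to an OR per row, an AND with the new request, and a final OR over all rows. A sketch, with each matrix edge encoded as one row of bits (the encoding and function name are illustrative):

```python
def collision(current, new):
    """Figure 26a test circuit. current[r] / new[r] hold the switch-element
    bits of one matrix edge (row). A row that already contains a connection
    (OR gates 2609-2612) ANDed with a new request in the same row marks a
    collision; the per-row results are ORed (2613). Returns True (logical 1)
    on collision, False (logical 0) if the required connection is valid."""
    return any(any(current[r]) and any(new[r]) for r in range(len(current)))
```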
The circuit in Figure 26 can only process one request per time unit. A time-optimized version is shown in Figure 27. The access requests are moved from the buses to the circuit via 2701. The routing table (2702) includes a plurality of individual registers (2711) rather than a known memory. As a result, the data of all access requests can be read out simultaneously from the routing table via multiplexer 2703.
The data of each access request is sent to a given matrix (2704) of the kind shown in Figure 26, which receives comparison data from the register containing the present interconnection of nodes (2705). Circuit 2706 includes an OR
gate which determines whether a valid request to matrix 2704 is present. The result of 2704 is ANDed with the output of the OR gate via an inverter. In the case of an existing and valid access, the result is a logical 1; otherwise a logical 0 is supplied. Each matrix has its own circuit 2706. The results of these matrices are connected to arbiter 2707, which selects one of the valid accesses. Multiplexer 2708 is connected so that the data of the valid access is sent to collator 2709 which merges the valid new access and the existing connection and forwards them to register 2705 to be stored.
This circuit can select one valid access out of four accesses.
A valid access can be processed from any desired number of accesses by changing the number of multiplexers (2703) and matrices (2704), the width of the arbiter and of the multiplexer (2707, 2708), and modifying the associated logic circuits.
It is often necessary to select more than one valid access from a number of accesses. Lines 2801, 2805, 2802 and 2810, which lead to the additional circuit shown in Figure 28 and allow two accesses to be selected simultaneously, are used for this. If 2810 is connected, line 2710 is left out. Any desired number of accesses can be selected simultaneously based on the cascading principle described below.
The information regarding which access was selected as 'valid' is sent to decoder 2803 via 2801. The information is decoded in such a way that only the access data of the non-selected accesses is sent to the matrices via the three multiplexers 2804. As the access that has already been selected is not sent, the number of matrices decreases by one.
The decoder works in the manner shown in the table below:
Decoder (2802): MUX1, MUX2, MUX3
- a: b, c, d
- b: a, c, d
- c: a, b, d
- d: a, b, c
The 'valid' bus selected via 2802 is indicated in the 'Decoder' column in the table. The MUX1-MUX3 columns indicate which bus the multiplexer in question selects based on value 2802.
Matrices (2811), logic circuit (2806) and arbiter (2807) work in the manner described in Figure 27. The data of the access selected by the arbiter is sent to collator 2809 via multiplexer 2808. The collator adds the data of the access selected by the logic circuit shown in Figure 28 to the output data of 2709 in the same manner as collator 2709 and sends the access data that has been generated to register 2705 via 2810.
The input data of multiplexer 2808 must be taken from the outputs of multiplexers 2804, due to the connections in those multiplexers.
The circuit shown in Figure 28 can be further cascaded to a deeper level according to the principle just described; the number of matrices will decrease by one for each cascade.
Figure 29 shows register 2602 or 2705. The outputs of collator 2709 or 2809 are sent as input data to the register via 2901.
A given register bank 2902a/b manages one of the buses (B1, B2, .. Bm) of the node. The control circuit of the node is stored in part a of a given bank. The timeout of the bus connection is defined in part b. Part b includes a loadable counter whose enable and reload are selected via multiplexer 2903, which can be set by part a.
Timeout principle: Effect
- BUS-ACK: Data transfers are counted. (Bus in use / packet size)
- !BUS-ACK (inverted): Clock pulses with no data transfer are counted. (Bus NOT in use / timeout)
- en: Each clock pulse is counted / clock cycles
- (none): No timeout; permanent

The reload and enable signals of the counter are generated as follows:

Timeout principle: reload (rld) / enable (en)
- BUS-ACK: reload never; enable in the case of data transfer
- !BUS-ACK (inverted): reload in the case of data transfer; enable with no data transfer
- en: reload never; enable continuously
- (none): reload never; enable never

The register required for a reload of the counter that contains the counter status originally set is contained in 2902b. 2904 tests for counter status 0 to determine the timeout. 2904 is shown in the diagram as an aid to comprehension only; in implementation, the carry signal (ripple-carry) of the counter is used. The carry deletes the contents of 2902a, which then forwards the status information 'bus free' and thus clears the bus. From the carry, BUS-DISCONNECT is connected to the bus as a signal and used to clear the remaining bus segment. BUS-RECONF is sent along with the data, and also clears the bus if it occurs. Both signals are sent to 2902 via OR gate 2905 and cause the register and counter to be cleared. The timeout is deactivated when the enable signal is deactivated in accordance with the table shown above and the counter is loaded with a value greater than 0.
The data in the register is bus-oriented, not switch-element-oriented. This data is sent to collators 2709 and 2809 via 2906. Each control signal occurs m times (the number of buses) and is indicated using the notation S(i,m), where m stands for the bus and i for the number of the switch element. Before the data is sent to a matrix of the kind shown in Figure 26 or to a node as shown in Figures 23/24, it must be represented such that there is only one series T(i). The representation rule is thus T(i) = S(i,1) OR S(i,2) OR S(i,3) OR ... OR S(i,m); in other words, all S(i,1) to S(i,m) are ORed. 2907 handles this function and sends T to the matrices and switch elements via 2908.
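The representation rule handled by 2907 can be expressed compactly; the nested-list encoding of the control signals is an illustrative assumption:

```python
def collapse(S):
    """Representation rule T(i) = S(i,1) OR ... OR S(i,m) as performed by
    unit 2907: the bus-oriented control bits S[i][m] are ORed over all m
    buses, yielding one series T per switch element i."""
    return [int(any(row)) for row in S]
```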
Figure 30 shows an example of two bus connections. A node 3002 is assigned to configurable elements or groups of configurable elements (3001). Node 3003 sends data to node 3004; connection set-up is static. Node 3005 sends data to target node 3008 on a dynamic basis; segments 3006 and 3007 are occupied, so that the direction of movement changes in each case. The X/Y
spatial co-ordinates are indicated in the nodes that are passed. Depending on the direction of movement, the coordinates are left the same, or incremented or decremented by one. The direction of movement and the target can be determined based on the numerical value of the coordinates.
Deviation of direction of movement is calculated from position (A, B, C, D) of the incoming bus on the node and the plus or minus sign of the X/Y movement. Compass points are used for designation purposes: y is the North-South axis, and x the East-West axis:
Direction of movement: Movement as expressed in compass points

y = 0:  x > 0 -> E
        x < 0 -> W
y > 0:  -> N
y < 0:  -> S
x = 0:  y > 0 -> N
        y < 0 -> S
x > 0:  -> E
x < 0:  -> W
The direction of movement and the compass point of the incoming bus are used as the basis for calculating which of the switch elements (1, 2, 3, 4, 5, 6) is addressed. Both aforementioned calculations are very straightforward, so the processing unit required (XY2ADR) can be designed as, for example, lookup tables. The calculation is not discussed in greater detail here; instead, reference is made to the above table.
Addressing in this example is relative.
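The relative routing rule can be sketched as follows. The assumption that the y axis is exhausted before the x axis is one ordering permitted by the table above; the function name is illustrative:

```python
def step_relative(x, y):
    """One routing step under relative addressing (Figure 30): move along
    y (N/S) until y = 0, then along x (E/W). Each node passed decrements
    the coordinate toward zero. Returns (direction, new_x, new_y);
    direction is None once the target (x = y = 0) is reached."""
    if y > 0:
        return "N", x, y - 1
    if y < 0:
        return "S", x, y + 1
    if x > 0:
        return "E", x - 1, y
    if x < 0:
        return "W", x + 1, y
    return None, x, y
```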
Figure 31 shows the same example, but in this case with absolute coordinates. In contrast to Figure 30, the coordinates are not calculated in the nodes, but rather are compared with the coordinates of the nodes in accordance with superordinate coordinate system 3101. Connection set-up is controlled based on the comparisons greater than (>), less than (<) and equal to (=). If both coordinates (X and Y) are equal to the coordinates of the node, the target has been reached. If one coordinate is equal to the coordinate of the node, the target axis of the coordinate has been reached.
The examples shown in Figures 30 and 31 do not allow any deviation from the optimal direction. For example, if segment 3009 in Figure 30 were occupied, it would be impossible to send the data any further. In cases where a segment is occupied, one can allow deviation from the specified direction. This would mean the connection could be set up via 3010. However, the allowance for a possible deviation has to be limited, to keep unreasonable routing attempts from being made. +/-1 to +/-2 is a sensible limit for deviations from the specified direction.
Figures 32a and 32b show the periphery needed around a node 3201 used to evaluate or modify the spatial coordinates.
Figure 32a shows relative coordinates that have been modified based on the direction of movement. In the case of movement in the positive direction, subtraction is performed (3203); in the case of movement in the negative direction, addition is performed (3202). Comparators (3204) test whether a coordinate has reached 0.
Figure 32b compares absolute coordinates with the coordinates of the node via comparator 3205. To allow deviation from the specified direction, comparators 3205 and 3204 are extended so that they check and forward the information indicating whether a coordinate is within the deviation range (-deviation <
coordinate < deviation). Based on this information, the processing unit (XY2ADR) can modify the direction of movement within the permitted deviation boundaries in the case of a collision of the specified direction and allow or prevent a deviation. This calculation is also very straightforward and if necessary can be carried out by extending the lookup tables. In the table below, the maximum permitted deviation is shown as A:
Direction of movement: Movement as expressed in compass points

y - A <= 0:  x + A > 0 -> E
             x - A < 0 -> W
y + A > 0:   -> N
y - A < 0:   -> S
x - A <= 0:  y + A > 0 -> N
             y - A < 0 -> S
x + A > 0:   -> E
x - A < 0:   -> W
x and y thus become fuzzy, i.e., movements in opposing compass-point directions may become permissible, as k - A < 0 AND k + A > 0 may be true simultaneously. If desired, one can restrict this by stipulating that movement in the direction opposite to the sign of k is not permitted. If k = 0, all directions of movement are permitted.
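The fuzzy direction check can be sketched as follows. For A > 0 and x = y = 0, all four directions are permitted, matching the statement above; the optional restriction against moving opposite to the sign of k is omitted in this sketch, and the function name is illustrative:

```python
def permitted_directions(x, y, A):
    """Compass directions permitted with maximum deviation A: a direction
    is allowed when the coordinate offset by +/-A still points that way,
    so k - A < 0 AND k + A > 0 may hold simultaneously (fuzzy x and y)."""
    dirs = set()
    if y + A > 0:
        dirs.add("N")
    if y - A < 0:
        dirs.add("S")
    if x + A > 0:
        dirs.add("E")
    if x - A < 0:
        dirs.add("W")
    return dirs
```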
Figure 33 shows the behavior of a segmented bus. The structure of the diagram is the same as that of the earlier diagram.
Transmitter node Sa sends data to receiver node Ea; further transmitter node Sb sends data to Eb, and last node Sc sends data to Ec, which is also the receiver node Eb. Thus collisions occur at segments 3301 and 3302. To allow optimal use of the bus (aside from the fact that in principle a different route is conceivable), all buses are set up as quasi-permanent, with the exception of segments 3301 and 3302.
These segments function in 'registered' mode and arbitrate the buses present based on their respective assigned timeouts. The priority of a bus can be determined via its timeout. A
relevant bus is assigned generous 'timeout rights,' i.e., long cycles, whereas an irrelevant bus only has short cycles at its disposal.
For the sake of simplicity, Figures 23-27 only show node interconnections in the four compass-point directions. In fact, a configurable cell or group of configurable cells has to be connected in each node as well. The extensions for this are shown in Figure 34. The designations in Figure 34a match those in Figure 23, and those in Figure 34b match those in Figure 24; the connection of the configurable elements is shown as Z. Matrix 2603 should be changed to 3401, as shown in Figure 34c. Connections are modified based on Figure 26.
For ease of comprehension, the examples shown are two-dimensional systems. Complex multidimensional systems may also be constructed as desired based on the methods described.
CA 02280057 1999-08-10

Glossary

Address lookup: The address is not calculated but rather is generated by being "looked up" in a memory.
ALU: Arithmetic-logic unit. Basic unit for processing data.
The unit can carry out arithmetical operations such as addition, subtraction and also in some cases multiplication, division, series expansion etc. The unit may be designed as an integer unit or a floating-point unit. The unit can also carry out logic operations such as AND and OR and can perform comparisons.
Arbiter: Unit for distributing rights among signals.
Bidirectional: Data transfer in both directions (source/target 1 <-> source/target 2).
Broadcast: To send data of a PAE to a plurality or all data receivers.
Bus bar: Bus to which a plurality of bus segments are connected.
Bus request: Request to set up a bus connection for transferring data. (See also Connection request.)

Bus segment: Section of a bus system between two nodes.
Bus status: Manner in which a bus functions. There are two main states:
Quasi-permanent: The bus behaves like a continuous line. The bus can only be used by one data packet (until it is cleared).
Registered: A register that delays the data by one clock cycle is looped in between each segment. With each clock cycle (depending on the timeout) a different data packet can be arbitrated.
If the two statuses are mixed, the resulting status is termed 'Segmented.' This status combines the advantages of the two types.
Cells: Synonym for configurable elements.
Collator: Unit for combining a plurality of signals based on specific mapping rules.
a) Usually a logic operation (AND/OR) is carried out, or
b) the signals are combined into one bus, and possibly
c) a plurality of signal sources are selectively combined via a plurality of multiplexers.
Configurable cell: See Logic cells.
Configurable element: A configurable element is a unit of a logic unit that one can set to perform a special function using a configuration word. Configurable elements therefore include all types of RAM cells, multiplexers, arithmetic-logic units, registers and all types of internal and external interconnection description.
Configure: Set functionality and interconnection of a logical unit, FPGA cell or PAE (See also Reconfigure).
Connection request: Request to set up a bus connection for data transfer. (See also Bus request).
Data receiver: The unit(s) that subject(s) the results of the PAE to further processing.

Data transmitter: The unit(s) that make(s) data available as operands for the PAE.
Data type: Type of data: Signs, numbers, floating-point numbers, signals (Boolean), etc.
Decimal decoder: Changes a binary signal into a decimal signal.
DFP: Data Flow Processor according to German Patent/Offenlegungsschrift No. 44 16 881.
DISCONNECT: Signal generated by timeout counters/generators that clears a bus. Sent to all nodes of a bus.
DPGA: Dynamically configurable FPGA. Known art.
EALU: Expanded arithmetic-logic unit. ALU that is extended so that it can perform special functions required or useful for operating a data processing unit according to German Patent No. 44 16 881 A1 (counters in particular).
Elements: Collective term for any type of self-contained unit that can be used as a component in an electronic unit. The following are considered elements:
- Configurable cells of any type
- Clusters
- RAM blocks
- Logic circuits
- Arithmetic processing units
- Registers
- Multiplexers
- I/O pins of a chip

Enable: Enables a register or counter.
FPGA: Programmable logic unit. Known art.
Gate: Switch that forwards or blocks a signal. Compare: Relay.
H level: Logical 1 level; depends on the technology used.
Latch: Storage element that usually forwards a signal transparently during the H level and stores it during the L level. In PAEs, some latches work with exactly the opposite level function. In such cases, an inverter is connected in series with the clock pulse of a known latch.
L level: Logical 0 level; depends on the technology used.
Logic cells: Configurable cells used in the case of DFPs, FPGAs and DPGAs that carry out simple logical or arithmetic tasks based on their configuration.
Logic gate: Group of transistors that carry out a basic logic function. Basic functions include NAND, NOR, and transmission gates.
Lookup table (LUT): Table that receives a value as the address and returns a result. For example, a number is indicated as the address and its sine is returned.
Mask: Bit combination that indicates which signals of a source are to be forwarded and which to be interrupted (masked).
M-PLUREG: Register in which the interconnection of the PAE is set. This register is written by the PLU.
Node: Element that connects a plurality of bus segments to each other, actively controls connection set-up and is passive during data transfer.
Open collector: Circuit system in which the collector of a transistor is connected to a bus signal that is pulled to H
level via a pullup. The emitter of the transistor is connected to ground. If the transistor switches, the bus signal is pulled to L level. The advantage of the method is as follows:
A plurality of transistors of this kind can be used to control the bus so that there are no electrical collisions. The signals are ORed, which means the circuit constitutes a wired OR circuit.
PAE: Processing Array Element: EALU having O-REG, R-REG, R20-MUX, F-PLUREG, M-PLUREG, BM-, SM-, Sync-, StateBack- and Power-UNIT.
Partner node: Node with which a given node or bus segment has contact or wishes to set up contact.
PLU (Primary Logic Unit): Unit for configuring and reconfiguring the PAE. Includes a special micro-controller that is specially tailored to its tasks.
Priority decoder: The signal with the highest priority is forwarded or enabled.
Priority flag: Indicates the priority level (high to low) of a bus connection.
Priority logic: The signal with the highest priority is forwarded or enabled.
Priority type: Basis on which a priority flag is evaluated.
PullDown: Resistor that pulls a bus line to an L level.
PullUp: Resistor that pulls a bus line to an H level.
RECONFIG: Signal generated by configurable elements that indicates whether the elements can be reconfigured and have ended their activity. It is used to clear all buses involved and is forwarded to all nodes of a bus.
Reconfigure: Reconfigure any desired number of PAEs, while any desired number of remaining PAEs continue to carry out their own functions (See also Configure).
Register bank: A plurality of different registers of different sizes and functions combined in a group.
Register bypass: Line for bypassing a register. This disconnects the register's synchronization effect.
RELOAD: Reload a counter with its original value.
Routing table: Table within a node containing information about connections to be set up.
Round-robin arbiter: Arbiter that enables one signal after another in sequence. The presently enabled signal is assigned lowest priority and, as the last in the chain, is enabled once again. The arbiter works in a circle.
Schmitt trigger: Window comparator that assigns exactly one of two values to a signal via hysteresis and thus improves signal quality.
Set-up phase: Cycle during which a bus segment is set up.
Source-optimized: Bus system that is mostly registered and is set up with low priorities to ensure that as many data transmitters (sources) as possible have access to the bus.
Spatial coordinates: Specification of points using a multidimensional coordinate system. Absolute coordinates (exact address of a point) or relative coordinates (relative distance from a point of origin) can be used. In the case of movement in a positive direction, the numerical value of a coordinate increases; in the case of movement in a negative direction, the numerical value of a coordinate decreases.
Speed-optimized: Bus system that is generally set up as quasi-permanent, has a high priority and is not influenced by other accesses.
State machine: Logic unit that can be in various different statuses. The transitions from one status to another are dependent on various input parameters. These machines are used to control complex functions and are related art.
Switching table: A switching table is a loop memory that is triggered via a control unit. The entries of a switching table may contain any desired configuration words. The control unit can carry out commands. The switching table reacts to trigger signals and reconfigures configurable elements based on an entry in a loop memory.
Target axis: The X/Y axis on which X=0 lies or X = axis, or on which Y=0 lies or Y = axis, is the target axis of X or Y.
Timeout: Something happens (a procedure is started or interrupted) after a specific period of time elapses.
Timeout counter: See Timeout generator.
Timeout generator: Unit for generating a timeout based on various criteria such as:
~ Clock cycles in which no connection was set up
~ Data packets sent
~ Clock cycles
~ Clock cycles in which no data was sent
Unidirectional: Data transfer in one direction (Source -> Target).
Designation conventions:
Unit: -UNIT
Mode: -MODE
Multiplexer: -MUX
Negated signal: not-
Register for PLU is visible: -PLUREG
Register is internal: -REG
Shift register: -sft
Function conventions:
NOT function: !
I Q
AND function: &
A B Q
1 1 1
OR function: #
A B Q
1 1 1
GATE function: G
EN B Q
1 0 0
1 1 1
Key to diagrams:
R (= Receiver)
Gate | EALU | Bus | Entry Address in Target Table | Y/X Spatial Coordinate and Direction Vector | Priority Flag | Priority Type
1 2 3 4 | O O R R | 1 0 | k..0 | m..0 n..0 | 1 | 0

Arbiter
An arbiter is connected upstream from the routing table and selects a plurality of signals from the multitude of requests to set up a connection via the routing table. It is advisable to base the arbiter on a known priority logic circuit and a known round-robin arbiter (a round-robin arbiter always assigns the highest priority on a time slice to the next signal, i.e., the signal that presently has the highest priority will subsequently have the lowest priority, and is then assigned higher priority with each subsequent access).
One can use the priority logic circuit to basically assign especially high or low priority to some signals such as rACK
or oRDY. The round-robin arbiter ensures that a connection set-up that has been requested but not yet established is assigned lowest priority, and must wait until all other connection requests have either been set up or checked.
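The combined behavior - a priority stage plus round-robin rotation in which a granted request drops to the lowest priority - can be sketched behaviorally. This Python model is our own simplification of the hardware arbiter, and its interface names are assumptions:

```python
class RoundRobinArbiter:
    """Behavioral sketch of a round-robin arbiter: requests are granted
    in circular order, and the most recently granted requester is
    assigned the lowest priority for the next round."""

    def __init__(self, n_requesters: int):
        self.n = n_requesters
        self.last = self.n - 1  # search for the next grant starts after 'last'

    def grant(self, requests):
        """requests: one bool per requester. Returns the index of the
        granted requester, or None if no requester is active."""
        for offset in range(1, self.n + 1):
            idx = (self.last + offset) % self.n
            if requests[idx]:
                self.last = idx  # granted requester now has lowest priority
                return idx
        return None
```

A fixed priority stage (e.g., for signals such as rACK or oRDY) could simply be checked before consulting the round-robin stage.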
State Machine (Control)
The state machine controls internal sequences in nodes. The state machine is divided into two parts:
~ Control of the node
~ Control of bus transfer and synchronization
State machines can be accessed by the system. State machine(s) can be implemented using known methods and are not described in detail here.
Monitoring Unit
A monitoring unit is connected downstream from the routing table. It gets the data entered in a line that has been addressed and checks whether the connection to be set up is conceivable. In particular, it checks the following:
~ Is there a free internal bus bar available?
~ Is the gate that has been requested free?
~ Is the external bus segment that has been requested free?
a) If the results of the check are positive, an ACCEPT signal is generated and sent to the state machines and the unit sending the connection set-up request to indicate successful set-up.
b) If the results of the check are negative, a REJECT signal is generated and sent to the state machines and the unit issuing the connection set-up request to indicate that set-up has failed. The aforementioned arbiter can react to the signal and set the priority of the request to 'lowest priority.'

Clearing Connections
An existing connection can be cleared based on various criteria. The most important criteria include:
Timeout: Clear a given connection because a longer period has elapsed during which no data was sent. A timeout is easily implemented using a loadable decrementer. When each data item is sent, the counter is loaded anew with a fixed initial value representing the length of time until the timeout. If no data is sent, it counts down one for each bus clock cycle. If it reaches zero, the maximum time period allowed has elapsed and the bus is cleared.
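The loadable decrementer described above can be sketched as follows (behavioral Python model; the names are our own, not from the patent):

```python
class TimeoutCounter:
    """Behavioral sketch of the timeout decrementer: each data item
    reloads the counter with its initial value; each idle bus clock
    counts down; reaching zero signals a timeout and the bus is cleared."""

    def __init__(self, reload_value: int):
        self.reload_value = reload_value
        self.count = reload_value

    def data_sent(self):
        self.count = self.reload_value  # RELOAD on every data transfer

    def bus_clock(self) -> bool:
        """Advance one idle bus cycle; return True when the timeout fires."""
        if self.count > 0:
            self.count -= 1
        return self.count == 0
```

The data counter described next works the same way, except that it counts transfers down from a preset total instead of idle cycles.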
Data counter: A loadable counter is loaded with the number of data items to be sent. For each data transfer, the counter counts down one. If the counter reaches zero, all data has been sent and the bus is cleared.
Synchronization signals: Clearing of buses is controlled by the status signals and/or synchronization signals of the cell(s) sending the data. For example, if the cells indicate that they have completed data processing or that they can be reconfigured, the bus is cleared, as it is no longer required.
These signals include rRDY and oACK.
Connections are cleared as follows: A signal to clear the connection is sent from the initiator node. Each subsequent node forwards the signal received to its partner node(s) and immediately clears the connection.
In the extended routing method, quasi-permanent buses are either cleared via the RECONFIG synchronization signal or based on the priority flag. If a node detects the end of a connection based on the priority flag, it generates the DISCONNECT signal to clear the bus and forwards it to all other nodes. The nodes react to a DISCONNECT in the same way they react to a RECONFIG. In the case of register-oriented nodes, connections do not have to be cleared, as they are set up dynamically using the incoming data and based on the assigned addresses. After the data has been forwarded, the connection in question is cleared automatically and made available for other transfers.
Broadcasting
Bus systems that allow a data packet to be sent to a plurality of receivers and allow transfer of data to be acknowledged are known. This system also works in the case of the method described here. It is a simple matter to connect a plurality of gates to one bus bar. Just one row in the routing table is used. Inevitably, the address within a given target routing table, the Entry Address in Target Table, has to be the same in each selected routing table.
To cope with this problem, a plurality of entries can be made available for the Entry Address in Target Table, e.g., there could be a separate Entry Address in Target Table for each gate. Thus each gate would be assigned an address within the target routing table.
Acknowledgment is carried out via signal lines, which use a simple Ready/Acknowledge protocol or, if necessary, an extended, more complex protocol; they are driven by an open collector driver and terminate at a transistor. To ensure acknowledge signals can be easily implemented in existing chip technologies, acknowledge signals can first be masked in a node and then subjected to a logic operation. The result of the logic operation is then moved to the next node. For example, if the acknowledge signals are subjected to an AND
operation in all the nodes passed, the result is the same as in the case of an open collector circuit.
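The masked acknowledge aggregation can be sketched as follows (behavioral Python model; parameter names are assumptions): ANDing the unmasked ACK inputs in each node and chaining the result forward reproduces the open-collector wired-AND behavior.

```python
def node_ack(local_acks, mask, upstream_ack=True):
    """Mask out irrelevant ACK inputs, AND the remaining ones, and
    combine the result with the ACK arriving from the preceding node."""
    relevant = [ack for ack, m in zip(local_acks, mask) if m]
    return upstream_ack and all(relevant)
```

Chaining `node_ack` calls node by node yields True only when every relevant receiver along the path has acknowledged, just as the open collector line only returns to H when no node pulls it low.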
Primary Logic Unit (PLU)
A primary logic unit according to German Patent Application No. 44 16 881 A1 is connected to the routing table. It is used to configure and reconfigure the routing table. Furthermore, an additional column, which is used to send an acknowledgment to the primary logic unit if the connection indicated in the line in question is set up or cleared, can be added to the routing tables. Thus a column may indicate whether an acknowledgment should be sent to the PLU if the connection in question is set up or cleared, and what type of acknowledgment should be sent. The acknowledgment is sent via a gate which completes a circuit either when the connection is set up or when it is cleared, depending on the setting. The gate also addresses the acknowledgment to the transistors (1502), which are connected as open collector drivers.
Rest of the table:
Message if connection is set up | Message if connection is cleared | Binary value of PLU feedback (b 2 1 0)

Brief Description of Diagrams
Figure 1 shows a two-dimensional unit having a cell array and nodes.
Figure 2 shows a detail of Figure 1.
Figure 3 shows a plurality of existing bus connections, and new connections being set up.
Figure 4 shows a new bus connection being set up.
Figure 5 shows the next step in setting up the connection.
Figure 6 shows the collision of two buses.
Figure 7 shows the step-by-step clearing of a connection following a collision.
Figure 8 shows the step-by-step clearing of a connection following a collision.
Figure 9 shows the step-by-step clearing of a connection following a collision.
Figure 10 shows the step-by-step clearing of a connection following a collision.
Figure 11 shows setting up a cleared connection anew after a specific period of time.
Figure 12 shows the continuation of Figure 6 if the node has more than one bus bar, and therefore no collision takes place.
Figure 13 shows bus segments connected to the bus bars of a node.
Figure 14 shows data transfer in the case of connection set-up. Each further sub-figure shows the status of the connection in intervals of one bus cycle.
Figure 15 shows data transfer in the case of connection clearance. Each further sub-figure shows the status of the connection in intervals of one bus cycle.
Figure 16 shows the control unit of a node.
Figure 17 shows broadcasting to a plurality of data receivers.
Figure 18 shows a control unit of a node having collision detectors that has been improved relative to Figure 16.
Figure 19 shows a routing table that has been improved relative to Figure 16 having a control unit shown in Figure 18.
Figure 20 shows bus bars suitable for Figures 18 and 19.
Figure 21 shows broadcasting to a plurality of nodes that has been improved relative to Figure 17.
Figure 22 shows the sequence control for Figures 18-21.
Figure 23 shows a bus node based on the extended routing method.
Figure 24 shows an optimized version of Figure 23.
Figure 25 shows a data register having a synchronization unit.
Figure 26 shows a collision detector based on the extended routing method.
Figure 27 shows the control unit for Figures 23-26.
Figure 28 shows a throughput-boosting, cascadable addition to Figure 27.
Figure 29 shows the control register for Figure 27 (2705).
Figure 30 shows a bus set up using relative spatial coordinates based on the extended routing method.
Figure 31 shows a bus set up using absolute spatial coordinates based on the extended routing method.
Figure 32 shows management of spatial co-ordinates.
Figure 33 shows a segmented bus set-up.
Detailed Description of the Diagrams
Figure 1 shows an FPGA-type, DPGA-type or DFP-type (German Patent No. 44 16 881 A1) unit. The unit is symmetrical in two dimensions, being made up of configurable cells (0101). 0101 can also represent a plurality of different configurable cells which are grouped together and interconnected with each other.
The nodes of the bus system (0102) are located between the cells. A plurality of nodes is indicated, and in the exemplary embodiment described below they will set up a plurality of connections. As described below, data transmitter A (0103) will set up a connection to data receiver A (0104), and data transmitter B (0106) will set up a connection to data receiver B (0105). An enlarged detail (0107) is shown in Figure 2.
Figure 2 shows a section from a unit of the type described previously. The configurable cell(s) from Figure 1 (0101) are shown as 0201. A bundled plurality of wires (0203) - the number may be specified as desired and is not shown exactly - connects 0201 to a node 0202. Nodes (0202) are connected with one another via bus segments (0205). In addition, the nodes are connected to the nodes located outside the enlarged detail via bus segments 0206 having the same design as bus segments 0205. The optional bundled wires (0204) indicate that the configurable cell(s) can also be connected to a plurality of nodes (0202) via a plurality of different bundled wires.
Figure 3 shows the unit during runtime. There is a plurality of connections:
~ Between nodes Z (0301), and
~ Between nodes Y (0306).
Data transmitter A (0302) attempts to set up a connection (0303) to data receiver A (0307). However, the connection is rejected (REJECT), as it is blocked at node Z (0308). At the same time, data transmitter B (0304) sets up a bus segment (0305) to its receiver. This attempt is successful, as the node that is addressed and the required bus segment are not blocked.
Figure 4 shows the next bus cycle. Connections Y and Z have in the meantime been cleared. Data transmitter A (0401) can now set up bus segment (0403), as node (0402) is no longer blocked. At the same time, data transmitter B (0404) extends existing bus segment (0405) beyond node (0406), thus setting up new bus segment (0407).
Figure 5 shows the bus set-up started in Figure 3 and continued in Figure 4 continuing in the same manner.
Figure 6 shows that the attempt to set up bus segment connection 0602 from data transmitter B fails. Node 0601 is busy and sends node 0603 a REJECT signal indicating that connection set-up has failed, and the connection is cleared.
Figure 7 shows the complete connection set-up between data transmitter A (0701) and data receiver A (0702). The connection from data transmitter B is cleared further. Node 0703 sends a REJECT signal to node 0705 via bus segment (0704). As a result, segment 0704 is cleared.
Figure 8 shows that data transfer between data transmitter A and data receiver A begins. The connection from data transmitter B is cleared further. Node 0801 sends a REJECT signal to node 0803 via bus segment (0802). As a result, segment 0802 is cleared.
Figure 9 shows data transfer between data transmitter A and data receiver A continues. The last segment from data transmitter B is cleared. Node 0901 sends a REJECT signal to node 0903 via bus segment (0902). As a result, segment 0902 is cleared.
Figure 10 shows data transfer between data transmitter A and data receiver A continues. Data transmitter B (1001) waits for a given period of time before making a further attempt to set up a connection to its data receiver.
Figure 11 shows the status several bus cycles later: Data transfer between data transmitter A and data receiver A is still continuing. Data transmitter B (1101) starts a new attempt to set up the connection to its data receiver. It sets up a bus segment (1102) to node (1103). Provided the connection from data transmitter A to its data receiver A is cleared in the next bus cycles, connection set-up from data transmitter B (1101) will be successful. Otherwise it will fail again in the manner described in Figure 6.
Figure 12 shows the continuation of Figure 6 if node 1202 is able to set up more than one connection, i.e., if the node has a plurality of internal bus bars. If so, the connection from data transmitter A will be processed via the first bus bar and the connection from data transmitter B via the second. Node 1202 sets up bus segment 1203 to data receiver B (1201).
Figure 13 shows a node-internal interconnection structure.
There are four node-internal bus bars 1301, 1302, 1303 and 1304. The bus bars are connected to bus segments West (1318), North (1316), East (1319), and South (1317) via a group of gates (1308, 1309, 1310, 1311). The bus bars are connected with O-REG1/2 (1314, 1315) via gates 1307. The R-REG is connected to the bus bars via gates 1306. The lower-order R-REG and the higher-order R-REG (1312, 1313) can be connected separately. Gates are controlled via bus 1320. The gate system (West, North, East, South) required and the internal bus bar required are indicated in this way. The gate (1321, 1322, 1323, 1324) required is selected by ANDing (1325, 1326, 1327, 1328) the data of the direction with the data of the selected bus bar.
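The gate selection just described can be sketched behaviorally (Python model; the names are our own): a gate opens only when both its direction and its internal bus bar are selected, i.e., the two one-hot selections are ANDed pairwise.

```python
def select_gate(direction_select, busbar_select):
    """AND each direction select line (West, North, East, South) with
    each bus bar select line; when both inputs are one-hot, exactly one
    True pair results and opens exactly one gate."""
    return [[bool(d) and bool(b) for b in busbar_select]
            for d in direction_select]
```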
Figure 14a shows the setting up of a connection. Data transmitter (1401) sends the first data packet to the node of the data transmitter (1402).
Figure 14b shows the node selects the entry associated with rRDY - rRDY is a status signal that indicates that data is ready at the data transmitter - from the routing table. Based on this entry, the next bus segment is set up and the address of the routing table of the next node is sent.
Figure 14c shows the last node (1403), the data receiver node, receives the address for the entry within its routing table.
The entry does not indicate a further node, but rather a cell.
As a result, the node immediately activates its gates to the selected cell.
Figure 14d shows the data is moved directly to receiver cell 1404 via the activated gate of 1403.
Figure 14e shows the cell returns the oACK signal to acknowledge that the data has been received. The data transmitter will send the next data packet in the next bus cycle (see Figure 14e).
Figures 14e-g show the normal exchange of data between the cells.
Figure 15a shows a data connection from the data transmitter (1501) to the data receiver (1503) via a plurality of nodes exists.
Figure 15b shows the data transmitter (1501) has ended its data transfer and sends a DISCONNECT signal to the first node.
Figure 15c shows the first bus segment is cleared and the node forwards the DISCONNECT.
Figure 15d shows clearing of the connection continues.
Figure 15d shows the last node receives the DISCONNECT. As a result, the last node simultaneously clears the connection to the preceding node and to the data receiver.
Figure 15e shows the last bus segment and the connection to the data receiver has been cleared.
Figure 15f shows a clearing method in which the DISCONNECT
line is looped through all the nodes. Thus the DISCONNECT is propagated in one clock cycle, and all segments are cleared simultaneously.
Figure 15g shows same as Figure 15e.
Figure 16 shows the control unit of a node. Requests (1601) to set up a connection are sent to routing table (1603) via priority decoder (1602). The priority decoder selects the request with the highest priority, and a request that fails as a result is assigned lowest priority. The priority logic unit receives its requests via status signals (e. g., status signals rRDY and oACK from the configurable cells) or via bus segments 1316, 1317, 1318, 1319. If data is present on the bus segments without the gate of the bus segment in question having been activated, the priority logic unit interprets the data as an address of the routing table and treats it as a request. If status signals are present at the priority logic unit (rRDY, oACK), they are translated into addresses for the routing table. The addresses of the routing table select an entry. The data of entry (1604) is forwarded to an AND-gate unit (1605).
The binary number of the bus selection (BUS 1..0) is translated into select signals via a 2:4 decoder (1606). The AND-gate unit ANDs each signal with the same signal stored in a latch (1607), i.e., signal GATE1 of the routing table is ANDed with signal GATE1 in the latch, signal GATE2 of the routing table is ANDed with signal GATE2 in the latch, etc.
The signals in the latch represent the status of the present interconnection structure, i.e., the gates used and the bus bars used are entered in the latch. If, when a connection request is ANDed with the present status, the result is a true level, this means the new connection request requires resources presently in use. All AND gates are combined in a logic operation using an OR gate (1608). If the output of the OR gate is a true level, the connection request is rejected (REJECT) (1609), as the necessary resources are busy. The ACCEPT signal (1611) is generated from the REJECT signal using an inverter (1610). The signals are forwarded to a state machine (1612) which can be implemented using known methods.
The state machine controls whether the connection is accepted or rejected. If the connection request is rejected, the state machine communicates (1613) the REJECT to the priority decoder and the request is assigned lowest priority. If the request is accepted, the new status signals are ORed (1614) with the present status signals downstream from the latch - the OR unit has the same design as AND unit (1605) described previously - and written back to latch (1607). The state machine controls, via 1623, whether OR unit (1614) or mask (1616) is active.
The latch is triggered by the state machine via 1622. The new setting reaches the gates via bus 1615.
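The resource check of Figure 16 can be sketched as follows (behavioral Python model; the hardware uses AND gates 1605, OR gate 1608 and OR unit 1614, and the function names here are our own):

```python
def check_request(request_bits, latch_bits):
    """AND each bit of the connection request with the corresponding bit
    of the current interconnect status (latch 1607); if any pair is
    true, a required resource is busy and REJECT is raised (OR 1608)."""
    return any(r and l for r, l in zip(request_bits, latch_bits))

def accept_request(request_bits, latch_bits):
    """On ACCEPT, OR the new connection's bits into the status latch
    (OR unit 1614 writing the result back to latch 1607)."""
    return [r or l for r, l in zip(request_bits, latch_bits)]
```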
Bus connections are cleared in a similar manner. However, the REJECT signal must occur when resources are checked, as the bus connection to be cleared has to exist. Based on the REJECT, state machine 1612 activates mask (1616) instead of OR
unit 1614. The connection data of the connection to be cleared are masked out of the present interconnection status and written back to latch 1607. Before the new connection data is written, the state machine sends the DISCONNECT signal for clearing the connection to the next node.
The control unit can access bus bars 1301, 1302, 1303, 1304 directly via gates 1617, 1618, 1619, 1620. The state machine can send control signals (DISCONNECT) to a given bus bar, and can also receive control signals from the bus bar (REJECT, ACCEPT) and react to them. In addition, these gates are used to send the Entry Address in Target Table (via 1621) to the given bus bar.
The primary logic unit (PLU) can access the routing table via 1624.
Figure 17 shows how data transmitter (1701) broadcasts to a plurality of data receivers (1702) via a plurality of nodes (1707), which will not be discussed further here. For purposes of clarity, the bus is shown as divided into acknowledgement line (ACK) (1703) and the rest of the bus (1704). ACK is negated and sent to the open collector bus driver, which also performs an invert. ACK is pulled to H via pullup resistor 1705. Due to the structure of the circuit, the following cases may arise:
~ If the bus in question is not activated, L is present at the base of transistor (1706). As a result it does not place a load on the bus.
~ If the bus in question is activated and the signal is not acknowledged, H is present at the base of transistor (1706).
This means the bus is pulled to L. If a result is sent to a plurality of data receivers using broadcasting, all nodes that have not yet acknowledged the result data and require wait cycles pull the bus to L.
~ If the bus in question is activated and the signal is acknowledged, L is present at the base of transistor (1706).
This means no load is placed on the bus. If a result is sent to a plurality of data receivers using broadcasting, all nodes that have acknowledged the result data and do not require wait cycles do not place a load on the bus.
Because in its basic status the bus is at the H level, i.e., acknowledgment, non-acknowledgment per Case 2 overrides acknowledgment by pulling the bus to L. The bus does not go to the H level, i.e., to acknowledgment status, until all nodes involved in a connection send an acknowledgment. This therefore constitutes a wired AND circuit.
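The wired-AND acknowledge behavior can be sketched as follows (behavioral Python model; the tuple representation of node states is our own):

```python
def wired_and_ack(node_states):
    """The line idles at H via the pullup; every active node that has
    NOT yet acknowledged pulls it to L; only when all involved nodes
    have acknowledged does the line return to H.
    node_states: list of (active, acked) pairs, one per node."""
    pulled_low = any(active and not acked for active, acked in node_states)
    return not pulled_low  # True corresponds to the H level (acknowledged)
```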
Below, we provide an example of implementation for a node having the following bus structure:
Data: Data signals
ACK: Data handshake and set-up handshake
RDY: Data handshake (data is present)
ESTABLISH: Set-up handshake (segment is being set up)
DISCONNECT: Request to clear bus (via a timeout)
RECONFIG: Request to clear bus (by reconfiguring the configurable cells involved)

Figure 18: Control unit of the bus bars shown in Figure 13. By contrast with the solution described above, in which a bus bar is preassigned in routing tables, the logic system finds a free bus bar itself and allocates it.
A control unit 1801 is assigned to each bus bar. A control unit consists of one gate (1801a) in order to send the address information of the routing table to the bus bar of a connected node during connection set-up, and a register (1801b) that controls the bus bar. Via bus 1813, 1801 receives data from the circuit described in Figure 19, which is connected to the routing table. The gates that have access to the assigned bus bar in question are enabled via 1801b. Each gate has an enable signal to which an entry in 1801b is assigned. If no entry is set, the assigned bus bar is not busy and may be freely assigned to any access request. Checking is implemented via an OR function (1802) applied to all enable signals to the gates.
The results of 1802 of all bus bars are sent to an arbiter (1803), which selects one of the free bus bars and addresses its 1801 via an address bus (1804). If no bus bar is free, 1803 communicates this to the controlling state machine via 1805. Each entry in 1801b indicates one gate assigned to the bus bar. The position is the same in every 1801b, i.e., the enable signal for gate p is always located at position p of an 1801b, the enable signal for gate p+1 is always located at position p+1, and the enable signal for gate q is always located at position q. If an OR function is applied to the enable signals of a gate p, the results indicate whether or not gate p is free. There is a checking function of this kind for each gate (1807 = gate p+1, 1808, 1809 = gate q). All gates that are irrelevant to the present connection set-up are masked via mask 1810, i.e., the mask forwards all irrelevant gates as 'not enabled.' An OR function (1811) is used to determine whether one of the gates is enabled. As all irrelevant gates are marked 'not enabled', only the status of the gate required for the present connection set-up is forwarded (1812) to the state machine. If the desired gate is enabled, it cannot be used for the present connection set-up, as this would result in a collision. The connection set-up is interrupted and either rejected or attempted again at a later time.
Figure 19 shows the arbiter for selecting the active bus, and the routing table. Each bus connected to a node (2004, 2005, 2006) sends its access request via a signal (1901) to an arbiter (1902), which selects one of the access requests.
Multiplexer (1904) is controlled by a decoder (1903) in such a way that either the number of the selected access (in the case of direct access by a configurable cell) (1905) or the lookup address of the selected access is sent to routing table 1906.
1906 outputs the data assigned to the value of 1905. The lookup address for the next node is sent to 1801a directly via 1813. The address of the next node is decoded into decimal (1907) via a collator (1908) which is usually configured with OR gates and sent to 1801b via bus 1813. If the bus to the next node has been set up, enabling of the gate to the preceding node is made possible in that the address of the preceding node decoded via decimal decoder 1909 is connected to bus 1813 in collator 1908 and sent to 1801b.
Figure 20 shows bus bars (2001, 2002, 2003) for connecting the buses that are present (2004, 2005, 2006). The buses are connected to the bus bars via multiplexer/demultiplexer (2007) based on control via 1801b; the output signals p of all 1801bs are sent to multiplexer p; the output signals p+1 of all 1801bs are sent to multiplexer p+1, etc. The individual signals represent bus bars that are present, as each bus bar control unit controls exactly one (p) of a plurality of signals (pX, pX+1, ..., qX). If a signal associated with a bus bar control unit is set, the corresponding bus bar is connected via a 2007.
Timeout generators (2008) control clearing of the segment in question and the connected buses. The timeout generators are configured directly by routing table (1906). The connection has been omitted from the diagrams for the sake of simplicity.
The 1801as assigned to a given bus bar are connected to that bus bar.
Figure 21 shows how a bus segment (2105) is broadcast to a plurality of nodes (2102, 2103, 2104) via a node (2101). The RDY handshake is sent directly to each receiver in the same manner as the data. The returning ACK handshakes are connected to OR gate (2107) and AND gate (2108) via masks (2105, 2106).
The masks select which ACK is significant, and whether an ACK
is forwarded via a Boolean AND function or via an OR function.
The two functions are combined via an OR gate (2109). If an ACK is irrelevant, mask 2105 forwards a logical 0 (L level), while mask 2106 forwards a logical 1 (H level). Masks 2105 and 2106 are set separately by the routing table. The connection has been omitted from the diagrams for the sake of simplicity.
Figure 22 shows the state machine of the circuit described. The basic status is 'IDLE,' which the state machine does not leave until a 'request' (access) occurs AND a bus bar AND the selected gate are free. The state machine sends an acknowledgment of the bus set-up to the preceding state machine by sending an ACK handshake. The state machine goes into SEND status, during which the data of the routing table are sent (via 1801a) to the next routing table. The state machine only leaves this status when an ACK handshake of the next routing table arrives OR when a 'disconnect' signal arrives on the bus (e.g., via a timeout). In the case of a 'disconnect', the state machine goes into DISCONNECT status in order to clear the bus (this status is not absolutely necessary; in the implementation, it jumps back directly to IDLE; however, in this example it is included for greater clarity). When an ACK handshake arrives, it jumps back to IDLE
status, and the gate of the preceding routing table in 1801b is enabled via 1909/1908. To boost the performance of the routing table, waiting for an ACK handshake during SEND status can be left out. To this end, during SEND the access data to be sent to the next routing table must be stored in 1801a, i.e., 1801a is a register; at the same time the information regarding the preceding routing table must be written to an additional 1801b during SEND. In independently structured logic, arrival of the ACK handshake of the subsequent routing table causes disconnection of 1801a and switchover from the first 1801b to the second, and the connections of the gate of the preceding routing table are stored.
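The transitions just described can be summarized as a small transition function. This is a minimal Python sketch of the IDLE/SEND/DISCONNECT cycle; the parameter names are assumptions, and the optional DISCONNECT state is modeled as in the text.

```python
IDLE, SEND, DISCONNECT = "IDLE", "SEND", "DISCONNECT"

def next_state(state, request=False, busbar_free=False, gate_free=False,
               ack=False, disconnect=False):
    """One transition of the node state machine of Figure 22 (sketch).

    IDLE is left only when a request occurs AND a bus bar AND the
    selected gate are free. SEND is left on an ACK of the next routing
    table (back to IDLE) or on 'disconnect' (via the optional
    DISCONNECT state, which simply clears the bus and returns to IDLE).
    """
    if state == IDLE:
        return SEND if (request and busbar_free and gate_free) else IDLE
    if state == SEND:
        if disconnect:
            return DISCONNECT  # optional; may jump straight back to IDLE
        return IDLE if ack else SEND
    if state == DISCONNECT:
        return IDLE  # bus cleared
    raise ValueError(state)
```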
Remaining diagrams: an example of implementation based on the 'extended' routing method. The bus comprises the following signals:

Data: data signals
ACK: data handshake and set-up handshake
RDY: data handshake (data is present)
DISCONNECT: request to clear the bus (via a timeout)
RECONFIG: request to clear the bus (by reconfiguring the configurable cells involved)
X/Y: X/Y spatial coordinates
LUT: lookup address for the routing table
ADR-MODE: indicates whether the X/Y spatial coordinates or the lookup address is to be used; if X=0 AND Y=0, the lookup address is used automatically

Figure 23 shows a node having switch elements 1, 2, 3, 4, 5, 6 and buses B1, B2, B3, B4.
Figure 23a: For purposes of clarity, a single-line system is shown, whereas in fact a bus system is involved. All diagrams should therefore be expanded to reflect the number of bus lines. In the most straightforward case, the switch elements are made up of a transistor (2301).
Figure 23b: To allow temporary storage of data, the switch element is expanded by one register (2302). Bidirectional buses Bn and Bm are connected to the register via transistors T1, T2, T3, T4 in such a way that either transition Bm -> Bn or transition Bn -> Bm is stored. Alternatively data transfer may be bidirectional with no storage, via T5. The switch element's mode is set via control signals S1, S2, S3, S4 as follows:
S1: Bm -> Bn (unidirectional, stored)
S2: Bm <-> Bn (bidirectional, non-stored)
S3: Bn <-> Bm (bidirectional, non-stored)
S4: Bn -> Bm (unidirectional, stored)

Figure 23c: Input Schmitt triggers and output drivers (2303) are used to ensure better signal quality. Drivers (2303) are designed so that either the output driver or the input driver can be enabled via a level-based control signal.
Bidirectionality is no longer feasible; only unidirectional switch procedures can be carried out.
The switch element's mode is set via control signals S1, S2, S3, S4 as follows:
S1: Bm -> Bn (unidirectional, stored)
S2: Bm -> Bn (unidirectional, non-stored)
S3: Bn -> Bm (unidirectional, non-stored)
S4: Bn -> Bm (unidirectional, stored)

Figure 23d: The inputs and outputs run on different lines (Bmi, Bmo, Bni, Bno) to allow greater ease of implementation in chips. Drivers (2304) are unidirectional; there is no need for the drivers to be activated.
S1: Bmi -> Bno (unidirectional, stored)
S2: Bmi -> Bno (unidirectional, non-stored)
S3: Bni -> Bmo (unidirectional, non-stored)
S4: Bni -> Bmo (unidirectional, stored)

Figure 24: A node similar to the node shown in Figure 23, with the following advantage: it is easier to implement, and it makes the register easier to manage. For purposes of clarity, a single-line system is shown, whereas in fact a bus system is involved. All diagrams should therefore be expanded to reflect the number of bus lines.
The registers and drivers (A, B, C, D) are arranged upstream from the switch elements (1, 2, 3, 4, 5, 6). The switch elements can be reduced to the cross-type arrangement shown in Figure 24a. Using control lines S5 and S6, the inputs (Imi, Ini) are selectively connected to the outputs (Imo, Ino) via transistors T6 and T7.
Figures 24b-d show various embodiments of the registers and drivers (A, B, C, D).
Figure 24b: A bidirectional bus is connected either as input to register 2401 via T8 or as output via T9 with node-internal bus Imo as the signal source. T8 and T9 are activated via control line S7. Using transistor pair T10/T11, which are activated via S8, a register bypass can be connected to allow a quasi-permanent mode. The output of the register goes to node-internal bus Imi. Interconnection of Imi and Imo is carried out via the switch elements shown in Figure 24a.
Figure 24c: To improve signal quality to the bus, driver (2402) is connected instead of transistor pair T8/T9.
Figure 24d: The external bus is unidirectional, which makes it easier to implement in chips. Drivers 2403 are unidirectional, and control signal S7 is not required.
Figure 25 shows a possible synchronization circuit for Figure 24. Register 2401 for storing data is shown as 2501. Register 2502 is used to store an RDY handshake signal, i.e., the information indicating that valid data is present on the bus or in 2501. If there is no valid data in 2501, output Q of 2502 is a logical 0. If valid data arrives (RDY is active and a logical 1), an enable signal (EN) for registers 2501 and 2502 is generated via AND gate 2503, and the data and the RDY
are stored on the rising clock edge. The input of 2503 for Q (of 2502) is inverting. If further data arrives while output Q of 2502 is a logical 1, the AND gate supplies a logical 0 and the registers are not enabled via EN. When the data is forwarded via the bus, the activation signal of the output driver (OE) is used as 'Clear' for 2502; Q of 2502 becomes a logical 0, and new data can be stored at the next clock cycle.
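The behavior of this synchronization circuit over one clock edge can be sketched as follows. This is a simplification, assuming the OE 'Clear' of 2502 takes effect before the edge that latches new data; the function and argument names are assumptions.

```python
def sync_step(reg_full, rdy, oe):
    """One clock edge of the synchronization circuit of Figure 25.

    reg_full -- Q of register 2502 (1 = valid data already held in 2501)
    rdy      -- incoming RDY handshake
    oe       -- output-enable of the driver, used as 'Clear' for 2502
    Returns (new_reg_full, enable): 'enable' (EN, output of AND gate
    2503) tells data register 2501 to latch the incoming data.
    """
    if oe:
        reg_full = 0  # data was forwarded onto the bus: clear 2502
    # AND gate 2503; the Q input of 2502 is inverting
    enable = int(rdy and not reg_full)
    if enable:
        reg_full = 1  # RDY stored on the rising clock edge
    return reg_full, enable
```

As expected, a second RDY while 2501 is still full does not enable the registers, and clearing via OE makes room for new data in the same cycle.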
Lines Din and Dout are shown in bold, as in this case a bus system is involved; 2501 is also shown in bold, as the register matches the width of the bus.
The registers may be designed as latches and coupled to the level of the clock (CLK) or handshake (RDY). However, this means the circuit will behave asynchronously, which can cause significant problems in implementation, and is likely to involve substantial extra cost.
Figure 26 shows the testing method for determining whether a connection can be set up, i.e., whether the network is free.
The status information of switch elements 1, 2, 3, 4, 5, 6 that indicates whether a switch element is connected or free is arranged in matrix 2603. 90° switch elements 3, 4, 5, 6 form the corners; 180° switch elements 1 and 2 form the middle parts and occur twice. To ensure that a switch element can be used in a collision-free manner, the entire edge on which it is located must be free. For example, 1+2, 6+4 and 3+5 may be used. By contrast, 6+2, 6+1, 2+5, 2+4, 2+3, etc. cannot be used.
It must therefore be tested and ensured that each edge is occupied only once. The data for this test is supplied via input r by register 2602, in which the present interconnection of the nodes is stored, and by routing table 2601, which forwards the data of the desired new bus to the matrix via input t.
The test circuit is shown in Figure 26a. A given row (2605, 2606, 2607, 2608) is tested for the existence of a connection via an OR gate (2609, 2610, 2611, 2612). If a connected element exists in the row, the OR gate in question supplies a logical 1. The result of the row in question is ANDed with any connection to be newly set up present in the row. If the row is already occupied AND a further connection is being requested in the row, the AND gate in question supplies a logical 1. The outputs of all AND gates are OR-ed (2613). Thus the result of the test supplies a logical 0 to 2604 if the required connection is valid and a logical 1 if a collision is present.
The circuit in Figure 26 can only process one request per unit of time. A time-optimized version is shown in Figure 27. The access requests are moved from the buses to the circuit via 2701. The routing table (2702) consists of a plurality of individual registers (2711) rather than a conventional memory. As a result, the data of all access requests can be read out simultaneously from the routing table via multiplexers 2703.
The data of each access request is sent to a given matrix (2704) of the kind shown in Figure 26, which receives comparison data from the register containing the present interconnection of nodes (2705). Circuit 2706 includes an OR
gate which determines whether a valid request to matrix 2704 is present. The result of 2704 is ANDed with the output of the OR gate via an inverter. In the case of an existing and valid access, the result is a logical 1; otherwise a logical 0 is supplied. Each matrix has its own circuit 2706. The results of these matrices are connected to arbiter 2707, which selects one of the valid accesses. Multiplexer 2708 is connected so that the data of the valid access is sent to collator 2709 which merges the valid new access and the existing connection and forwards them to register 2705 to be stored.
This circuit can select one valid access out of four accesses.
A valid access can be processed from any desired number of accesses by changing the number of multiplexers (2703) and matrices (2704), the width of the arbiter and of the multiplexer (2707, 2708), and modifying the associated logic circuits.
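The selection of one collision-free access among several pending requests can be sketched as follows. A fixed-priority search stands in here for arbiter 2707 (an assumption; the real arbiter may be, e.g., round-robin), and the per-access matrix test is inlined.

```python
def collision_free(occupied_rows, req_rows):
    """Matrix test of Figure 26: no row both occupied and requested."""
    return not any(any(o) and any(r)
                   for o, r in zip(occupied_rows, req_rows))

def select_valid_access(requests, occupied_rows):
    """Pick one valid access (Figure 27, simplified sketch).

    requests -- one requested-rows structure per pending access
                (one matrix 2704 each in the circuit)
    Returns the index the arbiter grants, or None if no request is
    collision-free.
    """
    for i, req in enumerate(requests):
        if collision_free(occupied_rows, req):
            return i  # arbiter 2707 grants this access
    return None
```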
It is often necessary to select more than one valid access from a number of accesses. Lines 2801, 2805, 2802 and 2810, which lead to the additional circuit shown in Figure 28 and allow two accesses to be selected simultaneously, are used for this. If 2810 is connected, line 2710 is left out. Any desired number of accesses can be selected simultaneously based on the cascading principle described below.
The information regarding which access was selected as 'valid' is sent to decoder 2803 via 2801. The information is decoded in such a way that only the access data of the non-selected accesses is sent to the matrices via the three multiplexers 2804. As the access that has already been selected is not sent, the number of matrices decreases by one.
The decoder works in the manner shown in the table below:
Decoder (2802) | MUX1 | MUX2 | MUX3
a              | b    | c    | d
b              | a    | c    | d
c              | a    | b    | d
d              | a    | b    | c
The 'valid' bus selected via 2802 is indicated in the 'Decoder' column in the table. The MUX1-MUX3 columns indicate which bus the multiplexer in question selects based on value 2802.
Matrices (2811), logic circuit (2806) and arbiter (2807) work in the manner described in Figure 27. The data of the access selected by the arbiter is sent to collator 2809 via multiplexer 2808. The collator adds the data of the access selected by the logic circuit shown in Figure 28 to the output data of 2709 in the same manner as collator 2709 and sends the access data that has been generated to register 2705 via 2810.
The input data of multiplexer 2808 must be taken from the outputs of multiplexers 2804, due to the connections in those multiplexers.
The circuit shown in Figure 28 can be further cascaded to a deeper level according to the principle just described; the number of matrices will decrease by one for each cascade.
Figure 29 shows register 2602 or 2705. The outputs of collator 2709 or 2809 are sent as input data to the register via 2901.
A given register bank 2902a/b manages one of the buses (B1, B2, .. Bm) of the node. The control circuit of the node is stored in part a of a given bank. The timeout of the bus connection is defined in part b. Part b includes a loadable counter whose enable and reload are selected via multiplexer 2903, which can be set by part a.
Timeout principle     | Effect
BUS-ACK               | Data transfers are counted. (Bus in use / packet size)
!BUS-ACK (inverted)   | Clock pulses with no data transfer are counted. (Bus NOT in use / timeout)
en                    | Each clock pulse is counted (clock cycles)
-                     | No timeout (permanent)

The reload and enable signals of the counter are generated as follows:

Timeout principle     | reload (rld)                 | enable (en)
BUS-ACK               | never                        | in the case of data transfer
!BUS-ACK (inverted)   | in the case of data transfer | with no data transfer
en                    | never                        | continuously
-                     | never                        | never

The register required for reloading the counter, which contains the originally set counter value, is contained in 2902b. 2904 tests for counter value 0 to determine the timeout. 2904 is shown in the diagram as an aid to comprehension only; in the implementation, the carry signal (ripple carry) of the counter is used. The carry deletes the contents of 2902a, which then forwards the status information 'bus free' and thus clears the bus. From the carry, BUS-DISCONNECT is connected to the bus as a signal and used to clear the remaining bus segments. BUS-RECONF is sent along with the data and likewise clears the bus if it occurs. Both signals are sent to 2902 via OR gate 2905 and cause the register and counter to be cleared. The timeout is deactivated when the enable signal is deactivated in accordance with the table above and the counter is loaded with a value greater than 0.
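The timeout principles in the tables above can be modeled with a small counter class. This is a behavioral Python sketch; the class and attribute names, and the exact cycle-level ordering of reload versus decrement, are assumptions.

```python
class TimeoutCounter:
    """Loadable counter of register bank 2902b (sketch).

    mode selects the timeout principle:
      'BUS-ACK'  -- count data transfers (bus in use / packet size)
      '!BUS-ACK' -- count idle clocks; a data transfer reloads the counter
      'en'       -- count every clock cycle
      None       -- no timeout (permanent bus)
    """
    def __init__(self, mode, start):
        self.mode, self.start, self.count = mode, start, start

    def clock(self, bus_ack):
        """One clock cycle; returns True when the timeout fires
        (the carry, which clears 2902a and thus the bus)."""
        if self.mode is None:
            return False
        if self.mode == 'BUS-ACK':
            en, rld = bus_ack, False
        elif self.mode == '!BUS-ACK':
            en, rld = not bus_ack, bus_ack
        else:  # 'en'
            en, rld = True, False
        if rld:
            self.count = self.start
        elif en:
            self.count -= 1
        return self.count <= 0
```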
The data in the register is bus-oriented, not switch-element-oriented. This data is sent to collators 2709 and 2809 via 2906. Each control signal occurs m times (the number of buses) and is denoted S(i,m), where m stands for the bus and i for the number of the switch element. Before the data is sent to a matrix of the kind shown in Figure 26 or to a node as shown in Figures 23/24, it must be reduced so that there is only a single series T(i). The mapping rule is thus T(i) = S(i,1) v S(i,2) v S(i,3) v ... v S(i,m); in other words, all S(i,1) to S(i,m) are ORed. 2907 handles this function and sends T to the matrices and switch elements via 2908.
Figure 30 shows an example of two bus connections. A node (3002) is assigned to each configurable element or group of configurable elements (3001). Node 3003 sends data to node 3004; connection set-up is static. Node 3005 sends data to target node 3008 on a dynamic basis; segments 3006 and 3007 are occupied, so the direction of movement changes in each case. The X/Y
spatial coordinates are indicated in the nodes that are passed. Depending on the direction of movement, the coordinates are left unchanged, or incremented or decremented by one. The direction of movement and the target can be determined from the numerical values of the coordinates.
The deviation of the direction of movement is calculated from the position (A, B, C, D) of the incoming bus on the node and the sign of the X/Y movement. Compass points are used for designation purposes: y is the North-South axis, and x the East-West axis:

Direction of movement | Movement as expressed in compass points
y = 0: x > 0          | E
       x < 0          | W
y > 0                 | N
y < 0                 | S
x = 0: y > 0          | N
       y < 0          | S
x > 0                 | E
x < 0                 | W
The direction of movement and the compass point of the incoming bus are used as the basis for calculating which of the switch elements (1, 2, 3, 4, 5, 6) is addressed. Both of the aforementioned calculations are very straightforward, so the processing unit required (XY2ADR) can be designed, for example, as lookup tables. The calculation is not discussed in greater detail here; instead, reference is made to the table above.
Addressing in this example is relative.
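The relative-coordinate update performed in each passed node (Figure 32a) can be sketched as follows. The sign convention (East and North as the positive x and y directions) is an assumption for illustration.

```python
def step_relative(x, y, direction):
    """Update relative X/Y target coordinates after passing one node.

    Movement in the positive direction subtracts one (3203); movement
    in the negative direction adds one (3202). direction is one of
    'E', 'W', 'N', 'S'; x is the East-West axis, y the North-South axis.
    """
    if direction == 'E':
        x -= 1
    elif direction == 'W':
        x += 1
    elif direction == 'N':
        y -= 1
    elif direction == 'S':
        y += 1
    # comparators 3204: the target is reached when both coordinates are 0
    return x, y, (x == 0 and y == 0)
```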
Figure 31 shows the same example, but with absolute coordinates. In contrast to Figure 30, the coordinates are not recalculated in the nodes but are compared with the coordinates of the nodes in accordance with the superordinate coordinate system (3101). Connection set-up is controlled based on the comparisons greater than (>), less than (<) and equal to (=). If both coordinates (X and Y) are equal to the coordinates of the node, the target has been reached. If one coordinate is equal to the corresponding coordinate of the node, the target axis of that coordinate has been reached.
The examples shown in Figures 30 and 31 do not allow any deviation from the optimal direction. For example, if segment 3009 in Figure 30 were occupied, it would be impossible to send the data any further. In cases where a segment is occupied, one can allow deviation from the specified direction. This would mean the connection could be set up via 3010. However, the allowance for a possible deviation has to be limited, to keep unreasonable routing attempts from being made. +/-1 to +/-2 is a sensible limit for deviations from the specified direction.
Figures 32a and 32b show the periphery needed around a node 3201 used to evaluate or modify the spatial coordinates.
Figure 32a shows relative coordinates that have been modified based on the direction of movement. In the case of movement in the positive direction, subtraction is performed (3203); in the case of movement in the negative direction, addition is performed (3202). Comparators (3204) test whether a coordinate has reached 0.
Figure 32b compares absolute coordinates with the coordinates of the node via comparator 3205. To allow deviation from the specified direction, comparators 3205 and 3204 are extended so that they check and forward the information indicating whether a coordinate is within the deviation range (-deviation <
coordinate < deviation). Based on this information, the processing unit (XY2ADR) can modify the direction of movement within the permitted deviation boundaries in the case of a collision of the specified direction and allow or prevent a deviation. This calculation is also very straightforward and if necessary can be carried out by extending the lookup tables. In the table below, the maximum permitted deviation is shown as A:
Direction of movement      | Movement as expressed in compass points
y - A <= 0: x + A > 0      | E
            x - A < 0      | W
y + A > 0                  | N
y - A < 0                  | S
x - A <= 0: y + A > 0      | N
            y - A < 0      | S
x + A > 0                  | E
x - A < 0                  | W
x and y thus become fuzzy, i.e., movements in opposing compass-point directions may become permissible, as k - A < 0 AND k + A > 0 may be true simultaneously. If desired, one can restrict this by stipulating that movement in the direction opposite to the sign of k is not permitted. If k = 0, all directions of movement are permitted.
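The set of directions permitted under these fuzzy comparisons can be sketched as follows, with `dev` standing for the maximum permitted deviation A. The sketch applies the optional restriction mentioned above (no movement against the sign of a nonzero coordinate); the function name and exact form of the comparisons are assumptions.

```python
def allowed_directions(x, y, dev):
    """Compass directions permitted for relative target (x, y) when a
    deviation of up to 'dev' from the optimal direction is allowed.

    With x == 0 (or y == 0), both opposing directions on that axis
    become permissible, as k - A < 0 AND k + A > 0 hold simultaneously.
    """
    dirs = set()
    if x + dev > 0 and x >= 0:   # fuzzy test, sign restriction applied
        dirs.add('E')
    if x - dev < 0 and x <= 0:
        dirs.add('W')
    if y + dev > 0 and y >= 0:
        dirs.add('N')
    if y - dev < 0 and y <= 0:
        dirs.add('S')
    return dirs
```

With k = 0 on an axis, both directions of that axis are permitted, matching the remark above.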
Figure 33 shows the behavior of a segmented bus. The structure of the diagram is the same as that of the earlier diagram.
Transmitter node Sa sends data to receiver node Ea; further transmitter node Sb sends data to Eb, and last node Sc sends data to Ec, which is also the receiver node Eb. Thus collisions occur at segments 3301 and 3302. To allow optimal use of the bus (aside from the fact that in principle a different route is conceivable), all buses are set up as quasi-permanent, with the exception of segments 3301 and 3302.
These segments function in 'registered' mode and arbitrate the buses present based on their respective assigned timeouts. The priority of a bus can be determined via its timeout. A
relevant bus is assigned generous 'timeout rights,' i.e., long cycles, whereas an irrelevant bus only has short cycles at its disposal.
For the sake of simplicity, Figures 23-27 only show node interconnections in the four compass-point directions. In fact, a configurable cell or group of configurable cells has to be connected in each node as well. The extensions for this are shown in Figure 34. The designations in Figure 34a match those in Figure 23, and those in Figure 34b match those in Figure 24; the connection of the configurable elements is shown as Z. Matrix 2603 should be changed to 3401, as shown in Figure 34c. Connections are modified based on Figure 26.
For ease of comprehension, the examples shown are two-dimensional systems. Complex multidimensional systems may also be constructed as desired based on the methods described.
CA 02280057 1999-08-10

Glossary

Address lookup: The address is not calculated but rather is generated by being 'looked up' in a memory.
ALU: Arithmetic-logic unit. Basic unit for processing data.
The unit can carry out arithmetical operations such as addition, subtraction and also in some cases multiplication, division, series expansion etc. The unit may be designed as an integer unit or a floating-point unit. The unit can also carry out logic operations such as AND and OR and can perform comparisons.
Arbiter: Unit for distributing rights among signals.
Bidirectional: Data transfer in both directions (source/target 1 <-> source/target 2).
Broadcast: To send data of a PAE to a plurality or all data receivers.
Bus bar: Bus to which a plurality of bus segments are connected.
Bus request: Request to set up a bus connection for transferring data. (See also Connection request.)

Bus segment: Section of a bus system between two nodes.
Bus status: Manner in which a bus functions. There are two main states:
Quasi-permanent: The bus behaves like a continuous line. The bus can only be used by one data packet (until it is cleared).
Registered: A register that delays the data by one clock cycle is looped in between each segment. With each clock cycle (depending on the timeout) a different data packet can be arbitrated.
If the two statuses are mixed, the resulting status is termed 'segmented.' This status combines the advantages of the two types.
Cells: Synonym for configurable elements.
Collator: Unit for combining a plurality of signals based on specific mapping rules.
a) Usually a logic operation (AND/OR) is carried out, or b) the signals are combined into one bus, and possibly c) a plurality of signal sources are selectively combined via a plurality of multiplexers.
Configurable cell: See Logic cells.
Configurable element: A configurable element is a unit of a logic unit that one can set to perform a special function using a configuration word. Configurable elements therefore include all types of RAM cells, multiplexers, arithmetic-logic units, registers and all types of internal and external interconnection description.
Configure: Set functionality and interconnection of a logical unit, FPGA cell or PAE (See also Reconfigure).
Connection request: Request to set up a bus connection for data transfer. (See also Bus request).
Data receiver: The unit(s) that subject(s) the results of the PAE to further processing.

Data transmitter: The unit(s) that make(s) data available as operands for the PAE.
Data type: Type of data: Signs, numbers, floating-point numbers, signals (Boolean), etc.
Decimal decoder: Changes a binary signal into a decimal signal.
DFP: Data Flow Processor according to German Patent/Offenlegungsschrift No. 44 16 881.
DISCONNECT: Signal generated by timeout counters/generators that clears a bus. Sent to all nodes of a bus.
DPGA: Dynamically configurable FPGA. Known art.
EALU: Expanded arithmetic-logic unit. An ALU extended to perform special functions required for or useful in operating a data processing unit according to German Patent No. 44 16 881 A1 (counters in particular).
Elements: Collective terms for any type of self-contained unit that can be used as a component in an electronic unit. The following are considered elements:
- Configurable cells of any type
- Clusters
- RAM blocks
- Logic circuits
- Arithmetic processing units
- Registers
- Multiplexers
- I/O pins of a chip

Enable: Enable a register or counter.
FPGA: Programmable logic unit. Known art.
Gate: Switch that forwards or blocks a signal. Compare: Relay.
H level: Logical 1 level; depends on the technology used.
Latch: Storage element that usually forwards a signal transparently during the H level and stores it during the L level. In PAEs, some latches have exactly the opposite level function; in such cases, an inverter is connected in series with the clock of a known latch.
L level: Logical 0 level; depends on the technology used.
Logic cells: Configurable cells used in the case of DFPs, FPGAs and DPGAs that carry out simple logical or arithmetic tasks based on their configuration.
Logic gate: Group of transistors that carry out a basic logic function. Basic functions include NAND, NOR, and transmission gates.
Lookup table (LUT): Table that receives a value as the address and returns a result. For example, a number is indicated as the address and its sine is returned.
Mask: Bit combination that indicates which signals of a source are to be forwarded and which to be interrupted (masked).
M-PLUREG: Register in which the interconnection of the PAE is set. This register is written by the PLU.
Node: Element that connects a plurality of bus segments to each other, actively controls connection set-up and is passive during data transfer.
Open collector: Circuit in which the collector of a transistor is connected to a bus signal that is pulled to H level via a pull-up resistor. The emitter of the transistor is connected to ground. If the transistor switches, the bus signal is pulled to L level. The advantage of the method is that a plurality of such transistors can control the bus without electrical collisions. The signals are ORed, i.e., the circuit constitutes a wired-OR.
PAE: Processing Array Element: EALU having 0-REG, R-REG, R20-MUX, F-PLUREG, M-PLUREG, BM-, SM-, Sync, StateBack- and Power-UNIT.
Partner node: Node with which a given node or bus segment has contact or wishes to set up contact.
PLU (Primary Logic Unit): Unit for configuring and reconfiguring the PAE. Includes a special micro-controller that is specially tailored to its tasks.
Priority decoder: The signal with the highest priority is forwarded or enabled.
Priority flag: Indicates the priority level (high to low) of a bus connection.
Priority logic: The signal with the highest priority is forwarded or enabled.
Priority type: Basis on which a priority flag is evaluated.
PullDown: Resistor that pulls a bus line to L level.
PullUp: Resistor that pulls a bus line to an H level.
RECONFIG: Signal generated by configurable elements that indicates whether the elements can be reconfigured and have ended their activity. It is used to clear all buses involved and is forwarded to all nodes of a bus.
Reconfigure: Reconfigure any desired number of PAEs, while any desired number of remaining PAEs continue to carry out their own functions (See also Configure).
Register bank: A plurality of different registers of different sizes and functions combined in a group.
Register bypass: Line for bypassing a register. This disconnects the register's synchronization effect.
RELOAD: Reload a counter with its original value.
Routing table: Table within a node containing information about connections to be set up.
Round-robin arbiter: Arbiter that enables one signal after another in sequence. The presently enabled signal is assigned lowest priority and, as the last in the chain, is enabled once again. The arbiter works in a circle.
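An arbiter of this kind can be sketched in a few lines of Python (the function and parameter names are assumptions):

```python
def round_robin(requests, last):
    """Round-robin arbiter (sketch): grant the first active request
    after the previously granted index 'last'. The granted signal
    thereby becomes the lowest-priority one for the next round, so the
    arbiter works in a circle. Returns the granted index, or None if
    no request is active.
    """
    n = len(requests)
    for i in range(1, n + 1):
        idx = (last + i) % n
        if requests[idx]:
            return idx
    return None
```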
Schmitt trigger: Window comparator that assigns exactly one of two values to a signal via hysteresis and thus improves signal quality.
Set-up phase: Cycle during which a bus segment is set up.
Source-optimized: Bus system that is mostly registered and is set up with low priorities to ensure that as many data transmitters (sources) as possible have access to the bus.
Spatial coordinates: Specification of points using a multidimensional coordinate system. Absolute coordinates (exact address of a point) or relative coordinates (relative distance from a point of origin) can be used. In the case of movement in a positive direction, the numerical value of a coordinate increases; in the case of movement in a negative direction, the numerical value of a coordinate decreases.
Speed-optimized: Bus system that is generally set up as quasi-permanent, has a high priority and is not influenced by other accesses.
State machine: Logic unit that can assume various states. The transitions from one state to another depend on various input parameters. These machines are used to control complex functions and are known art.
Switching table: A switching table is a loop memory that is triggered via a control unit. The entries of a switching table may contain any desired configuration words. The control unit can carry out commands. The switching table reacts to trigger signals and reconfigures configurable elements based on an entry in a loop memory.
Target axis: The X/Y axis on which X=0 lies or X = axis, or on which Y=0 lies or Y = axis, is the target axis of X or Y.
Timeout: Something happens (a procedure is started or interrupted) after a specific period of time elapses.
Timeout counter: See Timeout generator.
Timeout generator: Unit for generating a timeout based on various criteria such as:
- Clock cycles in which no connection was set up
- Data packets sent
- Clock cycles
- Clock cycles in which no data was sent

Unidirectional: Data transfer in one direction (Source -> Target).

Designation conventions:
Unit: -UNIT
Mode: -MODE
Multiplexer: -MUX
Negated signal: not-
Register visible for the PLU: -PLUREG
Register is internal: -REG
Shift register: -sft

Function conventions:
NOT function (!): Q = !I
AND function (&): Q = A & B
OR function (#): Q = A # B
GATE function (G): Q = B when EN is set; otherwise the signal is blocked (Q = 0)
Key to diagrams:
R (= Receiver)
Claims (34)
1. Bus system for connecting individual cells or groups of cells in DFPs according to German Patent Application No. 44 16 881 A1, and units having two-dimensional or multidimensional programmable cell structures, the bus system being made up of a plurality of bus segments that are electrically independent of one another and are separated via nodes, characterized in that a) the nodes connect or disconnect bus segments automatically and during data processing (Figures 26-29), either a1) to achieve connection, a plurality of bus segments are connected via gates to a bus bar located within the node (Figure 20), or a2) they are connected directly via switch elements (Figure 23), drivers and/or registers, b) each node has a lookup table (2601) in which the information regarding setting up of connections is stored, c) each node has a monitoring unit 2603 which automatically checks whether or not a connection can be set up, sets up an appropriate, feasible connection or delays data transfer until such a connection can be set up.
2. Bus system according to Patent Claim 1, characterized in that a bus connection is set up in a step-by-step manner, and data transmitter (0401) first sets up only the connection to the node connected to it (0402), and the node then determines from its routing table which of its adjacent nodes is required for the connection and sets up a connection to that node via the connecting bus segment provided that segment is not already busy with another connection; if it is busy, bus connection set-up is interrupted or suspended; if not, the addressed node continues to set up the bus connection (Figures 4-12).
3. Bus system according to Patent Claim 1, characterized in that a bus segment for transferring a data packet is set up, and the data transmitter first sets up only the connection to the node connected to it, and the node then determines from its routing table which of its adjacent nodes is required for the connection and forwards the data via that bus segment provided it is not already busy with another connection; if it is busy, bus connection set-up is suspended; if not, the data is stored in the adjacent node (2302, 2401 and Figure 25) and the new data is accepted (Figure 33).
4. Bus system according to Patent Claims 1 to 3, characterized in that the addresses of the data or of a connection are translated (2709) into the addresses of the next node in the lookup table (2601) of a node.
5. Bus system according to Patent Claims 1 to 3, characterized in that the addresses of the data or of the connections are calculated in the processing unit of a node in such a way that either the address of the next node is calculated (Figure 32) or the addresses of the next node are generated via lookup table (2601).
6. Bus system according to Patent Claims 1 to 5, characterized in that if a connection set-up fails, a further attempt is made at a later time (Figure 11).
7. Bus system according to Patent Claims 1 to 6, characterized in that after a connection set-up has been carried out the nodes are passive and data transfer is controlled exclusively by the data transmitter and the data receiver (Figure 14).
8. Bus system according to Patent Claims 1 to 6, characterized in that a node transfers the data to the next node by itself (Figure 33) and handles synchronization automatically (Figure 25).
9. Bus system according to Patent Claims 1 to 8, characterized in that the data transmitter and the data receiver have no knowledge regarding the bus system either during connection set-up or during operation and do not actively affect the nodes (Figures 30-31).
10. Bus system according to Patent Claims 1 to 9, characterized in that the node and the routing tables are configured and reconfigured by a primary logic unit (1624).
11. Bus system according to Patent Claims 1 to 9, characterized in that a node can send data to a plurality of other nodes simultaneously (broadcasting), and the synchronization signals of the receiving nodes can be masked and subjected to configurable Boolean logic operations before being returned to the transmitter node (Figure 21).
12. Method for transferring data between individual cells in DFPs according to German Patent Application No. 44 16 881 A1, and units having two-dimensional or multidimensional programmable cell structures (FPGA-type, DPGA-type or similar), wherein the data is sent in segments in a multidimensional bus system and the segments can be connected as desired, characterized in that data transfer is synchronized automatically and independently by the nodes (Figure 22, Figure 25).
13. Method according to Claim 12, characterized in that the segments of the buses can be permanently connected to create a continuous undelayed bus system (Figure 14).
14. Method according to Claim 12, characterized in that the segments of the buses are delayed and arbitrated via registers (3301, 3302).
15. Method according to Claims 12 to 14, characterized in that both connection methods according to Claims 13 and 14 are used in one connection.
16. Method according to Claims 12 to 15, characterized in that setting-up of a bus is controlled by lookup tables (2601) in which connection data is stored, and the entry of the next table is referenced in each lookup table.
17. Method according to Claims 12 to 15, characterized in that a bus is set up by supplying an unambiguous calculable relative or absolute address of the target (Figure 30).
18. Method according to Claims 12 to 15 and 17, characterized in that the current set-up direction is changed when the target axis is reached (Figure 31).
19. Method according to Claims 12 to 15 and 17, characterized in that the current set-up direction is changed as soon as a blockage occurs in that direction (Figure 31).
20. Method according to Claims 12 to 15 and 19, characterized in that set-up cannot proceed opposite to the set-up direction or beyond the target axis (Figures 30-31).
21. Method according to Claims 12 to 15 and 17, characterized in that set-up can proceed opposite to the set-up direction and beyond the target axis, provided the deviation remains within a predefined interval (Figure 32).
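Claims 17 to 19 describe extending the connection toward a relative target address, changing the set-up direction at the target axis or on a blockage. An illustrative step-decision sketch only; the x-before-y ordering, the offset representation `(dx, dy)`, and all names are assumptions:

```python
# Illustrative next-hop choice for a relative target offset (dx, dy):
# proceed along x until the target axis is reached, then switch to y
# (cf. Figure 31); if the preferred direction is blocked, try the
# other axis; if nothing is free, suspend set-up (cf. Claim 6).

def next_step(dx, dy, blocked=frozenset()):
    """Return the unit direction for the next hop, or None to suspend."""
    prefer = []
    if dx:
        prefer.append((1 if dx > 0 else -1, 0))
    if dy:
        prefer.append((0, 1 if dy > 0 else -1))
    for d in prefer:
        if d not in blocked:
            return d
    return None
```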
22. Method according to Claims 12 to 21, characterized in that both set-up methods according to Claims 16 and 17 occur in one connection.
23. Method according to Claims 12 to 22, characterized in that if a plurality of requests occur at a node (3301, 3302) the requests are arbitrated.
24. Method according to Claims 12 to 23, characterized in that when a plurality of requests occur, a plurality of requests can be processed simultaneously (Figure 27).
25. Method according to Claims 12 to 24, characterized in that data is sent by a bus segment to a next node and the next node acknowledges the data (Figure 14).
26. Method according to Claims 12 to 24, characterized in that data is sent via a bus segment to a plurality of next nodes and the next nodes acknowledge the data, and the acknowledge signals are subjected to Boolean logic operations as desired (Figure 21).
27. Method according to Claims 12 to 24 and 26, characterized in that data is sent to a plurality of next nodes by a bus segment and the next nodes acknowledge the data, and a mask for masking individual acknowledge signals is present (Figure 21).
28. Method according to Claims 12 to 26, characterized in that both transfer methods according to Claims 25 and 26 occur in one connection.
29. Method according to Claims 12 to 28, characterized in that an existing bus is cleared based on a disconnect signal (Figure 22) which is sent to all nodes involved (Figure 15f).
30. Method according to Claims 12 to 29, characterized in that a disconnect signal based on a timeout is generated by one or a plurality of nodes and sent to all nodes involved (Figure 15f).
31. Method according to Claims 12 to 30, characterized in that a timeout occurs based on a period of time in which no data was sent (see 'Timeout generator' in Glossary).
32. Method according to Claims 12 to 30, characterized in that a timeout occurs based on the volume of data sent (see 'Timeout generator' in Glossary).
33. Method according to Claims 12 to 30, characterized in that a timeout is not possible.
34. Method according to Claims 13 to 33, characterized in that the timeout method according to Claims 31 to 33 is selected individually for each bus segment.
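Claims 29 to 34 describe clearing a bus via a disconnect signal triggered by a timeout that may be based on idle time, on the volume of data sent, or disabled entirely, selectable per bus segment. An illustrative sketch of such a timeout generator only; the class, mode strings, and cycle-based `tick` interface are assumptions, not from the patent:

```python
# Illustrative per-segment timeout generator (cf. 'Timeout generator'
# in the Glossary): mode "idle" trips after a number of cycles without
# data (Claim 31), mode "volume" after a data volume (Claim 32),
# mode "none" never trips (Claim 33).

class TimeoutGenerator:
    def __init__(self, mode, limit=None):
        self.mode = mode        # "idle", "volume", or "none"
        self.limit = limit      # threshold for the selected mode
        self.idle = 0           # cycles without data
        self.volume = 0         # total data transferred

    def tick(self, data_sent=0):
        """Advance one cycle; return True if a disconnect is due."""
        if self.mode == "none":
            return False
        if data_sent:
            self.idle = 0
            self.volume += data_sent
        else:
            self.idle += 1
        if self.mode == "idle":
            return self.idle >= self.limit
        return self.volume >= self.limit
```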
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
DE19704742A DE19704742A1 (en) | 1997-02-11 | 1997-02-11 | Internal bus system for DFPs, as well as modules with two- or multi-dimensional programmable cell structures, for coping with large amounts of data with high networking effort |
DE19704742.4 | 1997-02-11 | ||
PCT/DE1998/000456 WO1998035294A2 (en) | 1997-02-11 | 1998-02-11 | Internal bus system for dfps, building blocks with two dimensional or multidimensional programmable cell structures to handle large amounts of data involving high networking requirements |
Publications (1)
Publication Number | Publication Date |
---|---|
CA2280057A1 true CA2280057A1 (en) | 1998-08-13 |
Family
ID=7819649
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CA002280057A Abandoned CA2280057A1 (en) | 1997-02-11 | 1998-02-11 | Internal bus system for dfps, building blocks with two dimensional or multidimensional programmable cell structures to handle large amounts of data involving high networking requirements |
Country Status (8)
Country | Link |
---|---|
US (2) | US6405299B1 (en) |
EP (2) | EP0960374B1 (en) |
JP (1) | JP3894957B2 (en) |
AT (1) | ATE256888T1 (en) |
AU (1) | AU6392498A (en) |
CA (1) | CA2280057A1 (en) |
DE (3) | DE19704742A1 (en) |
WO (1) | WO1998035294A2 (en) |
Families Citing this family (149)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7266725B2 (en) | 2001-09-03 | 2007-09-04 | Pact Xpp Technologies Ag | Method for debugging reconfigurable architectures |
DE19651075A1 (en) * | 1996-12-09 | 1998-06-10 | Pact Inf Tech Gmbh | Unit for processing numerical and logical operations, for use in processors (CPU's), multi-computer systems, data flow processors (DFP's), digital signal processors (DSP's) or the like |
DE19654595A1 (en) | 1996-12-20 | 1998-07-02 | Pact Inf Tech Gmbh | I0 and memory bus system for DFPs as well as building blocks with two- or multi-dimensional programmable cell structures |
US6338106B1 (en) * | 1996-12-20 | 2002-01-08 | Pact Gmbh | I/O and memory bus system for DFPS and units with two or multi-dimensional programmable cell architectures |
DE19654846A1 (en) * | 1996-12-27 | 1998-07-09 | Pact Inf Tech Gmbh | Process for the independent dynamic reloading of data flow processors (DFPs) as well as modules with two- or multi-dimensional programmable cell structures (FPGAs, DPGAs, etc.) |
JP3961028B2 (en) | 1996-12-27 | 2007-08-15 | ペーアーツェーテー イクスペーペー テクノロジーズ アクチエンゲゼルシャフト | Data flow processor (DFP) automatic dynamic unloading method and modules with 2D or 3D programmable cell structure (FPGA, DPGA, etc.) |
DE19704728A1 (en) * | 1997-02-08 | 1998-08-13 | Pact Inf Tech Gmbh | Method for self-synchronization of configurable elements of a programmable module |
US6542998B1 (en) | 1997-02-08 | 2003-04-01 | Pact Gmbh | Method of self-synchronization of configurable elements of a programmable module |
DE19704742A1 (en) * | 1997-02-11 | 1998-09-24 | Pact Inf Tech Gmbh | Internal bus system for DFPs, as well as modules with two- or multi-dimensional programmable cell structures, for coping with large amounts of data with high networking effort |
US9092595B2 (en) | 1997-10-08 | 2015-07-28 | Pact Xpp Technologies Ag | Multiprocessor having associated RAM units |
US8686549B2 (en) | 2001-09-03 | 2014-04-01 | Martin Vorbach | Reconfigurable elements |
DE19861088A1 (en) | 1997-12-22 | 2000-02-10 | Pact Inf Tech Gmbh | Repairing integrated circuits by replacing subassemblies with substitutes |
CN1378665A (en) | 1999-06-10 | 2002-11-06 | Pact信息技术有限公司 | Programming concept |
JP2004506261A (en) | 2000-06-13 | 2004-02-26 | ペーアーツェーテー イクスペーペー テクノロジーズ アクチエンゲゼルシャフト | Pipeline CT protocol and CT communication |
US6636924B1 (en) * | 2000-08-17 | 2003-10-21 | Koninklijke Philips Electronics N.V. | Multiple port I2C hub |
ATE460020T1 (en) | 2000-09-13 | 2010-03-15 | Stratosaudio Inc | SYSTEM AND METHOD FOR ORDERING AND PROVIDING MEDIA CONTENT, USING ADDITIONAL DATA TRANSMITTED IN A BROADCAST SIGNAL |
ATE437476T1 (en) * | 2000-10-06 | 2009-08-15 | Pact Xpp Technologies Ag | CELL ARRANGEMENT WITH SEGMENTED INTERCELL STRUCTURE |
US8058899B2 (en) | 2000-10-06 | 2011-11-15 | Martin Vorbach | Logic cell array and bus system |
US6624056B2 (en) * | 2000-12-04 | 2003-09-23 | Pts Corporation | Methods and apparatus for providing improved physical designs and routing with reduced capacitive power dissipation |
US7444531B2 (en) | 2001-03-05 | 2008-10-28 | Pact Xpp Technologies Ag | Methods and devices for treating and processing data |
US20070299993A1 (en) * | 2001-03-05 | 2007-12-27 | Pact Xpp Technologies Ag | Method and Device for Treating and Processing Data |
US9037807B2 (en) | 2001-03-05 | 2015-05-19 | Pact Xpp Technologies Ag | Processor arrangement on a chip including data processing, memory, and interface elements |
US7210129B2 (en) * | 2001-08-16 | 2007-04-24 | Pact Xpp Technologies Ag | Method for translating programs for reconfigurable architectures |
US7844796B2 (en) | 2001-03-05 | 2010-11-30 | Martin Vorbach | Data processing device and method |
WO2005045692A2 (en) | 2003-08-28 | 2005-05-19 | Pact Xpp Technologies Ag | Data processing device and method |
US7624204B2 (en) * | 2001-03-22 | 2009-11-24 | Nvidia Corporation | Input/output controller node in an adaptable computing environment |
ATE478381T1 (en) * | 2001-06-20 | 2010-09-15 | Richter Thomas | DATA PROCESSING PROCEDURES |
US7996827B2 (en) | 2001-08-16 | 2011-08-09 | Martin Vorbach | Method for the translation of programs for reconfigurable architectures |
US7434191B2 (en) | 2001-09-03 | 2008-10-07 | Pact Xpp Technologies Ag | Router |
US8686475B2 (en) | 2001-09-19 | 2014-04-01 | Pact Xpp Technologies Ag | Reconfigurable elements |
US7594229B2 (en) * | 2001-10-09 | 2009-09-22 | Nvidia Corp. | Predictive resource allocation in computing systems |
US7644279B2 (en) * | 2001-12-05 | 2010-01-05 | Nvidia Corporation | Consumer product distribution in the embedded system market |
WO2003060747A2 (en) | 2002-01-19 | 2003-07-24 | Pact Xpp Technologies Ag | Reconfigurable processor |
WO2003067814A2 (en) * | 2002-02-01 | 2003-08-14 | California Institute Of Technology | Hardware-assisted fast router |
WO2003071432A2 (en) | 2002-02-18 | 2003-08-28 | Pact Xpp Technologies Ag | Bus systems and method for reconfiguration |
US8914590B2 (en) | 2002-08-07 | 2014-12-16 | Pact Xpp Technologies Ag | Data processing method and device |
DE10243322B4 (en) * | 2002-09-18 | 2004-12-02 | Pact Xpp Technologies Ag | Analog reconfigurable data processing device |
US7093255B1 (en) * | 2002-05-31 | 2006-08-15 | Quicksilver Technology, Inc. | Method for estimating cost when placing operations within a modulo scheduler when scheduling for processors with a large number of function units or reconfigurable data paths |
US7620678B1 (en) | 2002-06-12 | 2009-11-17 | Nvidia Corporation | Method and system for reducing the time-to-market concerns for embedded system design |
US7461234B2 (en) * | 2002-07-01 | 2008-12-02 | Panasonic Corporation | Loosely-biased heterogeneous reconfigurable arrays |
US7802108B1 (en) | 2002-07-18 | 2010-09-21 | Nvidia Corporation | Secure storage of program code for an embedded system |
US6816809B2 (en) * | 2002-07-23 | 2004-11-09 | Hewlett-Packard Development Company, L.P. | Hardware based utilization metering |
WO2004021176A2 (en) | 2002-08-07 | 2004-03-11 | Pact Xpp Technologies Ag | Method and device for processing data |
US7657861B2 (en) | 2002-08-07 | 2010-02-02 | Pact Xpp Technologies Ag | Method and device for processing data |
US7210112B2 (en) | 2002-08-21 | 2007-04-24 | California Institute Of Technology | Element placement method and apparatus |
WO2004038599A1 (en) | 2002-09-06 | 2004-05-06 | Pact Xpp Technologies Ag | Reconfigurable sequencer structure |
US7502915B2 (en) * | 2002-09-30 | 2009-03-10 | Nvidia Corporation | System and method using embedded microprocessor as a node in an adaptable computing machine |
US8949576B2 (en) * | 2002-11-01 | 2015-02-03 | Nvidia Corporation | Arithmetic node including general digital signal processing functions for an adaptive computing machine |
US7617100B1 (en) | 2003-01-10 | 2009-11-10 | Nvidia Corporation | Method and system for providing an excitation-pattern based audio coding scheme |
US7237216B2 (en) * | 2003-02-21 | 2007-06-26 | Infineon Technologies Ag | Clock gating approach to accommodate infrequent additional processing latencies |
US7917130B1 (en) | 2003-03-21 | 2011-03-29 | Stratosaudio, Inc. | Broadcast response method and system |
US7285487B2 (en) | 2003-07-24 | 2007-10-23 | California Institute Of Technology | Method and apparatus for network with multilayer metalization |
US8296764B2 (en) * | 2003-08-14 | 2012-10-23 | Nvidia Corporation | Internal synchronization control for adaptive integrated circuitry |
US7425841B2 (en) | 2004-02-14 | 2008-09-16 | Tabula Inc. | Configurable circuits, IC's, and systems |
US7622951B2 (en) * | 2004-02-14 | 2009-11-24 | Tabula, Inc. | Via programmable gate array with offset direct connections |
US7167025B1 (en) * | 2004-02-14 | 2007-01-23 | Herman Schmit | Non-sequentially configurable IC |
US7284222B1 (en) * | 2004-06-30 | 2007-10-16 | Tabula, Inc. | Method and apparatus for identifying connections between configurable nodes in a configurable integrated circuit |
US8130825B2 (en) * | 2004-05-10 | 2012-03-06 | Nvidia Corporation | Processor for video data encoding/decoding |
US8018463B2 (en) * | 2004-05-10 | 2011-09-13 | Nvidia Corporation | Processor for video data |
US7282950B1 (en) | 2004-11-08 | 2007-10-16 | Tabula, Inc. | Configurable IC's with logic resources with offset connections |
US7312630B2 (en) * | 2004-06-30 | 2007-12-25 | Tabula, Inc. | Configurable integrated circuit with built-in turns |
US7145361B1 (en) * | 2004-06-30 | 2006-12-05 | Andre Rohe | Configurable integrated circuit with different connection schemes |
TWI256013B (en) * | 2004-10-12 | 2006-06-01 | Uli Electronics Inc | Sound-effect processing circuit |
US7342415B2 (en) * | 2004-11-08 | 2008-03-11 | Tabula, Inc. | Configurable IC with interconnect circuits that also perform storage operations |
US7224181B1 (en) * | 2004-11-08 | 2007-05-29 | Herman Schmit | Clock distribution in a configurable IC |
US7301368B2 (en) * | 2005-03-15 | 2007-11-27 | Tabula, Inc. | Embedding memory within tile arrangement of a configurable IC |
US7573296B2 (en) * | 2004-11-08 | 2009-08-11 | Tabula Inc. | Configurable IC with configurable routing resources that have asymmetric input and/or outputs |
US7268586B1 (en) | 2004-11-08 | 2007-09-11 | Tabula, Inc. | Method and apparatus for accessing stored data in a reconfigurable IC |
US7917559B2 (en) * | 2004-11-08 | 2011-03-29 | Tabula, Inc. | Configurable IC's with configurable logic circuits that perform adder and/or subtractor operations |
US7295037B2 (en) * | 2004-11-08 | 2007-11-13 | Tabula, Inc. | Configurable IC with routing circuits with offset connections |
US7276933B1 (en) | 2004-11-08 | 2007-10-02 | Tabula, Inc. | Reconfigurable IC that has sections running at different looperness |
US20070244958A1 (en) * | 2004-11-08 | 2007-10-18 | Jason Redgrave | Configurable IC's with carry bypass circuitry |
US7743085B2 (en) * | 2004-11-08 | 2010-06-22 | Tabula, Inc. | Configurable IC with large carry chains |
US7259587B1 (en) * | 2004-11-08 | 2007-08-21 | Tabula, Inc. | Configurable IC's with configurable logic resources that have asymetric inputs and/or outputs |
US7317331B2 (en) | 2004-11-08 | 2008-01-08 | Tabula, Inc. | Reconfigurable IC that has sections running at different reconfiguration rates |
US7242216B1 (en) | 2004-11-08 | 2007-07-10 | Herman Schmit | Embedding memory between tile arrangement of a configurable IC |
US7330050B2 (en) * | 2004-11-08 | 2008-02-12 | Tabula, Inc. | Storage elements for a configurable IC and method and apparatus for accessing data stored in the storage elements |
US7236009B1 (en) * | 2004-12-01 | 2007-06-26 | Andre Rohe | Operational time extension |
DE102005005484A1 (en) * | 2005-02-04 | 2006-08-10 | Deutsche Thomson-Brandt Gmbh | Network station and computer program product which can be loaded into the internal memory of a network station |
US7298169B2 (en) * | 2005-03-15 | 2007-11-20 | Tabula, Inc | Hybrid logic/interconnect circuit in a configurable IC |
US20070244959A1 (en) * | 2005-03-15 | 2007-10-18 | Steven Teig | Configurable IC's with dual carry chains |
US7230869B1 (en) | 2005-03-15 | 2007-06-12 | Jason Redgrave | Method and apparatus for accessing contents of memory cells |
US7825684B2 (en) * | 2005-03-15 | 2010-11-02 | Tabula, Inc. | Variable width management for a memory of a configurable IC |
US7392446B1 (en) * | 2005-06-17 | 2008-06-24 | Xilinx, Inc. | Test channel usage reduction |
US7548085B2 (en) * | 2005-07-15 | 2009-06-16 | Tabula, Inc. | Random access of user design states in a configurable IC |
US7375550B1 (en) | 2005-07-15 | 2008-05-20 | Tabula, Inc. | Configurable IC with packet switch configuration network |
US20070094664A1 (en) * | 2005-10-21 | 2007-04-26 | Kimming So | Programmable priority for concurrent multi-threaded processors |
US7765249B1 (en) | 2005-11-07 | 2010-07-27 | Tabula, Inc. | Use of hybrid interconnect/logic circuits for multiplication |
US8463836B1 (en) | 2005-11-07 | 2013-06-11 | Tabula, Inc. | Performing mathematical and logical operations in multiple sub-cycles |
US7372297B1 (en) | 2005-11-07 | 2008-05-13 | Tabula Inc. | Hybrid interconnect/logic circuits enabling efficient replication of a function in several sub-cycles to save logic and routing resources |
US7818361B1 (en) | 2005-11-07 | 2010-10-19 | Tabula, Inc. | Method and apparatus for performing two's complement multiplication |
US7262633B1 (en) | 2005-11-11 | 2007-08-28 | Tabula, Inc. | Via programmable gate array with offset bit lines |
US7679401B1 (en) | 2005-12-01 | 2010-03-16 | Tabula, Inc. | User registers implemented with routing circuits in a configurable IC |
DE102005059522A1 (en) * | 2005-12-13 | 2007-06-14 | Siemens Ag | Communication system of a motor vehicle and method for establishing a wireless ad hoc wireless network |
US8250503B2 (en) | 2006-01-18 | 2012-08-21 | Martin Vorbach | Hardware definition method including determining whether to implement a function as hardware or software |
US7694083B1 (en) * | 2006-03-08 | 2010-04-06 | Tabula, Inc. | System and method for providing a virtual memory architecture narrower and deeper than a physical memory architecture |
US7797497B1 (en) * | 2006-03-08 | 2010-09-14 | Tabula, Inc. | System and method for providing more logical memory ports than physical memory ports |
DE102006012275B4 (en) * | 2006-03-15 | 2007-12-20 | Phoenix Contact Gmbh & Co. Kg | Data transmission and processing system with secure detection of critical conditions |
US7669097B1 (en) | 2006-03-27 | 2010-02-23 | Tabula, Inc. | Configurable IC with error detection and correction circuitry |
US7999820B1 (en) | 2006-10-23 | 2011-08-16 | Nvidia Corporation | Methods and systems for reusing memory addresses in a graphics system |
US20080111923A1 (en) * | 2006-11-09 | 2008-05-15 | Scheuermann W James | Processor for video data |
US7930666B1 (en) | 2006-12-12 | 2011-04-19 | Tabula, Inc. | System and method of providing a memory hierarchy |
EP2140548A4 (en) * | 2007-03-20 | 2010-06-09 | Tabula Inc | Configurable ic having a routing fabric with storage elements |
US7610566B1 (en) | 2007-03-22 | 2009-10-27 | Tabula, Inc. | Method and apparatus for function decomposition |
US8169789B1 (en) | 2007-04-10 | 2012-05-01 | Nvidia Corporation | Graphics processing unit stiffening frame |
US7987065B1 (en) | 2007-04-17 | 2011-07-26 | Nvidia Corporation | Automatic quality testing of multimedia rendering by software drivers |
US8572598B1 (en) | 2007-04-18 | 2013-10-29 | Nvidia Corporation | Method and system for upgrading software in a computing device |
US8726283B1 (en) | 2007-06-04 | 2014-05-13 | Nvidia Corporation | Deadlock avoidance skid buffer |
US7944453B1 (en) | 2007-06-07 | 2011-05-17 | Nvidia Corporation | Extrapolation texture filtering for nonresident mipmaps |
US7948500B2 (en) * | 2007-06-07 | 2011-05-24 | Nvidia Corporation | Extrapolation of nonresident mipmap data using resident mipmap data |
US8069425B2 (en) | 2007-06-27 | 2011-11-29 | Tabula, Inc. | Translating a user design in a configurable IC for debugging the user design |
US7839162B2 (en) * | 2007-06-27 | 2010-11-23 | Tabula, Inc. | Configurable IC with deskewing circuits |
US8412990B2 (en) * | 2007-06-27 | 2013-04-02 | Tabula, Inc. | Dynamically tracking data values in a configurable IC |
US7652498B2 (en) * | 2007-06-27 | 2010-01-26 | Tabula, Inc. | Integrated circuit with delay selecting input selection circuitry |
US7595655B2 (en) * | 2007-06-27 | 2009-09-29 | Tabula, Inc. | Retrieving data from a configurable IC |
US7501855B2 (en) | 2007-06-27 | 2009-03-10 | Tabula, Inc | Transport network for a configurable IC |
US7579867B2 (en) * | 2007-06-27 | 2009-08-25 | Tabula Inc. | Restructuring data from a trace buffer of a configurable IC |
US8344755B2 (en) * | 2007-09-06 | 2013-01-01 | Tabula, Inc. | Configuration context switcher |
US8990651B2 (en) * | 2007-09-19 | 2015-03-24 | Tabula, Inc. | Integrated circuit (IC) with primary and secondary networks and device containing such an IC |
US8631448B2 (en) | 2007-12-14 | 2014-01-14 | Stratosaudio, Inc. | Systems and methods for scheduling interactive media and events |
US20090177736A1 (en) * | 2007-12-14 | 2009-07-09 | Christensen Kelly M | Systems and methods for outputting updated media |
US8166081B2 (en) | 2008-02-05 | 2012-04-24 | Stratosaudio, Inc. | System and method for advertisement transmission and display |
EP2250749A2 (en) | 2008-02-05 | 2010-11-17 | StratosAudio, Inc. | Systems, methods, and devices for scanning broadcasts |
US8863067B1 (en) | 2008-02-06 | 2014-10-14 | Tabula, Inc. | Sequential delay analysis by placement engines |
JP5110433B2 (en) * | 2008-03-18 | 2012-12-26 | 公立大学法人高知工科大学 | Data-driven information processing device |
US8166435B2 (en) * | 2008-06-26 | 2012-04-24 | Tabula, Inc. | Timing operations in an IC with configurable circuits |
US9122809B2 (en) * | 2008-07-01 | 2015-09-01 | Hewlett-Packard Development Company, L.P. | Segmenting bus topology |
WO2010016857A1 (en) * | 2008-08-04 | 2010-02-11 | Tabula, Inc. | Trigger circuits and event counters for an ic |
US8674721B2 (en) | 2008-09-17 | 2014-03-18 | Tabula, Inc. | Controllable storage elements for an IC |
US7877897B2 (en) * | 2008-12-16 | 2011-02-01 | Skechers U.S.A., Inc. Ii | Shoe |
EP2441005A2 (en) | 2009-06-09 | 2012-04-18 | Martin Vorbach | System and method for a cache in a multi-core processor |
US8072234B2 (en) | 2009-09-21 | 2011-12-06 | Tabula, Inc. | Micro-granular delay testing of configurable ICs |
EP2553815A1 (en) | 2010-04-02 | 2013-02-06 | Tabula, Inc. | System and method for reducing reconfiguration power usage |
US8941409B2 (en) | 2011-07-01 | 2015-01-27 | Tabula, Inc. | Configurable storage elements |
US9148151B2 (en) | 2011-07-13 | 2015-09-29 | Altera Corporation | Configurable storage elements |
US20150012903A1 (en) | 2013-07-04 | 2015-01-08 | Tabula, Inc. | Non-intrusive monitoring and control of integrated circuits |
WO2018193354A1 (en) * | 2017-04-17 | 2018-10-25 | Cerebras Systems Inc. | Wavelet representation for accelerated deep learning |
GB2580165B (en) * | 2018-12-21 | 2021-02-24 | Graphcore Ltd | Data exchange in a computer with predetermined delay |
US11089140B2 (en) | 2019-08-01 | 2021-08-10 | Vulcan Technologies Shanghai Co., Ltd. | Intelligent controller and sensor network bus, system and method including generic encapsulation mode |
US11689386B2 (en) | 2019-08-01 | 2023-06-27 | Vulcan Technologies International Inc. | Intelligent controller and sensor network bus, system and method for controlling and operating an automated machine including a failover mechanism for multi-core architectures |
US11156987B2 (en) | 2019-08-01 | 2021-10-26 | Vulcan Technologies Shanghai Co., Ltd. | Intelligent controller and sensor network bus, system and method including a message retransmission mechanism |
US11809163B2 (en) | 2019-08-01 | 2023-11-07 | Vulcan Technologies Shanghai Co., Ltd. | Intelligent controller and sensor network bus, system and method including a message retransmission mechanism |
US11086810B2 (en) | 2019-08-01 | 2021-08-10 | Vulcan Technologies Shanghai Co., Ltd. | Intelligent controller and sensor network bus, system and method including multi-layer platform security architecture |
US10841230B1 (en) * | 2019-08-01 | 2020-11-17 | Vulcan Technologies Shanghai Co., Ltd. | Intelligent controller and sensor network bus, system and method |
US11269795B2 (en) | 2019-08-01 | 2022-03-08 | Vulcan Technologies Shanghai Co., Ltd. | Intelligent controller and sensor network bus, system and method including a link media expansion and conversion mechanism |
US11263157B2 (en) | 2019-08-01 | 2022-03-01 | Vulcan Technologies Shanghai Co., Ltd. | Intelligent controller and sensor network bus, system and method including a dynamic bandwidth allocation mechanism |
US11258538B2 (en) | 2019-08-01 | 2022-02-22 | Vulcan Technologies Shanghai Co., Ltd. | Intelligent controller and sensor network bus, system and method including an error avoidance and correction mechanism |
US11269316B2 (en) | 2019-08-01 | 2022-03-08 | Vulcan Technologies Shanghai Co., Ltd. | Intelligent controller and sensor network bus, system and method including smart compliant actuator module |
US11327771B1 (en) * | 2021-07-16 | 2022-05-10 | SambaNova Systems, Inc. | Defect repair circuits for a reconfigurable data processor |
Family Cites Families (215)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US2067477A (en) * | 1931-03-20 | 1937-01-12 | Allis Chalmers Mfg Co | Gearing |
GB971191A (en) * | 1962-05-28 | 1964-09-30 | Wolf Electric Tools Ltd | Improvements relating to electrically driven equipment |
GB1253309A (en) * | 1969-11-21 | 1971-11-10 | Marconi Co Ltd | Improvements in or relating to data processing arrangements |
DE2057312A1 (en) * | 1970-11-21 | 1972-05-25 | Bhs Bayerische Berg | Planetary gear with load pressure compensation |
US3855577A (en) * | 1973-06-11 | 1974-12-17 | Texas Instruments Inc | Power saving circuit for calculator system |
DE2943433A1 (en) | 1979-10-26 | 1981-05-07 | Bayer Ag, 5090 Leverkusen | METHOD FOR PRODUCING CARBONIC ACID HALOGENIDES |
US4498134A (en) * | 1982-01-26 | 1985-02-05 | Hughes Aircraft Company | Segregator functional plane for use in a modular array processor |
JPS58151661A (en) * | 1982-03-04 | 1983-09-08 | Omron Tateisi Electronics Co | Memory device |
US4498172A (en) * | 1982-07-26 | 1985-02-05 | General Electric Company | System for polynomial division self-testing of digital networks |
JPS5936857A (en) * | 1982-08-25 | 1984-02-29 | Nec Corp | Processor unit |
US4663706A (en) * | 1982-10-28 | 1987-05-05 | Tandem Computers Incorporated | Multiprocessor multisystem communications network |
US4739474A (en) | 1983-03-10 | 1988-04-19 | Martin Marietta Corporation | Geometric-arithmetic parallel processor |
US4566102A (en) * | 1983-04-18 | 1986-01-21 | International Business Machines Corporation | Parallel-shift error reconfiguration |
US5123109A (en) | 1983-05-31 | 1992-06-16 | Thinking Machines Corporation | Parallel processor including a processor array with plural data transfer arrangements including (1) a global router and (2) a proximate-neighbor transfer system |
USRE34363E (en) | 1984-03-12 | 1993-08-31 | Xilinx, Inc. | Configurable electrical circuit having configurable logic elements and configurable interconnects |
US4870302A (en) | 1984-03-12 | 1989-09-26 | Xilinx, Inc. | Configurable electrical circuit having configurable logic elements and configurable interconnects |
US4761755A (en) | 1984-07-11 | 1988-08-02 | Prime Computer, Inc. | Data processing system and method having an improved arithmetic unit |
US4682284A (en) * | 1984-12-06 | 1987-07-21 | American Telephone & Telegraph Co., At&T Bell Lab. | Queue administration method and apparatus |
DE3681463D1 (en) * | 1985-01-29 | 1991-10-24 | Secr Defence Brit | PROCESSING CELL FOR ERROR-TOLERANT MATRIX ARRANGEMENTS. |
US5023775A (en) | 1985-02-14 | 1991-06-11 | Intel Corporation | Software programmable logic array utilizing "and" and "or" gates |
US5247689A (en) * | 1985-02-25 | 1993-09-21 | Ewert Alfred P | Parallel digital processor including lateral transfer buses with interrupt switches to form bus interconnection segments |
US4706216A (en) | 1985-02-27 | 1987-11-10 | Xilinx, Inc. | Configurable logic element |
US5015884A (en) | 1985-03-29 | 1991-05-14 | Advanced Micro Devices, Inc. | Multiple array high performance programmable logic device family |
US5225719A (en) | 1985-03-29 | 1993-07-06 | Advanced Micro Devices, Inc. | Family of multiple segmented programmable logic blocks interconnected by a high speed centralized switch matrix |
US4967340A (en) | 1985-06-12 | 1990-10-30 | E-Systems, Inc. | Adaptive processing system having an array of individually configurable processing components |
US4720780A (en) * | 1985-09-17 | 1988-01-19 | The Johns Hopkins University | Memory-linked wavefront array processor |
DE3687400T2 (en) * | 1985-11-04 | 1993-07-15 | Ibm | DIGITAL NEWS TRANSMISSION NETWORKS AND STRUCTURE OF TRANSMISSION WAYS IN THESE NETWORKS. |
US4852048A (en) * | 1985-12-12 | 1989-07-25 | Itt Corporation | Single instruction multiple data (SIMD) cellular array processing apparatus employing a common bus where a first number of bits manifest a first bus portion and a second number of bits manifest a second bus portion |
US5021947A (en) * | 1986-03-31 | 1991-06-04 | Hughes Aircraft Company | Data-flow multiprocessor architecture with three dimensional multistage interconnection network for efficient signal and data processing |
GB8612396D0 (en) * | 1986-05-21 | 1986-06-25 | Hewlett Packard Ltd | Chain-configured interface bus system |
US4860201A (en) * | 1986-09-02 | 1989-08-22 | The Trustees Of Columbia University In The City Of New York | Binary tree parallel processor |
US4910665A (en) * | 1986-09-02 | 1990-03-20 | General Electric Company | Distributed processing system including reconfigurable elements |
US5367208A (en) | 1986-09-19 | 1994-11-22 | Actel Corporation | Reconfigurable programmable interconnect architecture |
FR2606184B1 (en) * | 1986-10-31 | 1991-11-29 | Thomson Csf | RECONFIGURABLE CALCULATION DEVICE |
US4811214A (en) | 1986-11-14 | 1989-03-07 | Princeton University | Multinode reconfigurable pipeline computer |
US5226122A (en) | 1987-08-21 | 1993-07-06 | Compaq Computer Corp. | Programmable logic system for filtering commands to a microprocessor |
US5115510A (en) * | 1987-10-20 | 1992-05-19 | Sharp Kabushiki Kaisha | Multistage data flow processor with instruction packet, fetch, storage transmission and address generation controlled by destination information |
US5113498A (en) * | 1987-11-10 | 1992-05-12 | Echelon Corporation | Input/output section for an intelligent cell which provides sensing, bidirectional communications and control |
US5303172A (en) | 1988-02-16 | 1994-04-12 | Array Microsystems | Pipelined combination and vector signal processor |
JPH06101043B2 (en) * | 1988-06-30 | 1994-12-12 | 三菱電機株式会社 | Microcomputer |
US4901268A (en) | 1988-08-19 | 1990-02-13 | General Electric Company | Multiple function data processor |
US5204935A (en) | 1988-08-19 | 1993-04-20 | Fuji Xerox Co., Ltd. | Programmable fuzzy logic circuits |
US5353432A (en) | 1988-09-09 | 1994-10-04 | Compaq Computer Corporation | Interactive method for configuration of computer system and circuit boards with user specification of system resources and computer resolution of resource conflicts |
ES2047629T3 (en) * | 1988-09-22 | 1994-03-01 | Siemens Ag | CIRCUIT ARRANGEMENT FOR TELECOMMUNICATION SWITCHING SYSTEMS, ESPECIALLY PCM TIME-DIVISION MULTIPLEX TELEPHONE SWITCHING SYSTEMS WITH A CENTRAL SWITCHING MATRIX AND SUB-SWITCHING MATRICES. |
EP0390907B1 (en) | 1988-10-07 | 1996-07-03 | Martin Marietta Corporation | Parallel data processor |
US5014193A (en) | 1988-10-14 | 1991-05-07 | Compaq Computer Corporation | Dynamically configurable portable computer system |
US5136717A (en) | 1988-11-23 | 1992-08-04 | Flavors Technology Inc. | Realtime systolic, multiple-instruction, single-data parallel computer system |
US5081375A (en) | 1989-01-19 | 1992-01-14 | National Semiconductor Corp. | Method for operating a multiple page programmable logic device |
GB8906145D0 (en) | 1989-03-17 | 1989-05-04 | Algotronix Ltd | Configurable cellular array |
US5203005A (en) * | 1989-05-02 | 1993-04-13 | Horst Robert W | Cell structure for linear array wafer scale integration architecture with capability to open boundary i/o bus without neighbor acknowledgement |
US5109503A (en) | 1989-05-22 | 1992-04-28 | Ge Fanuc Automation North America, Inc. | Apparatus with reconfigurable counter includes memory for storing plurality of counter configuration files which respectively define plurality of predetermined counters |
JP2584673B2 (en) * | 1989-06-09 | 1997-02-26 | 株式会社日立製作所 | Logic circuit test apparatus having test data change circuit |
CA2021192A1 (en) * | 1989-07-28 | 1991-01-29 | Malcolm A. Mumme | Simplified synchronous mesh processor |
US5212652A (en) | 1989-08-15 | 1993-05-18 | Advanced Micro Devices, Inc. | Programmable gate array with improved interconnect structure |
US5233539A (en) | 1989-08-15 | 1993-08-03 | Advanced Micro Devices, Inc. | Programmable gate array with improved interconnect structure, input/output structure and configurable logic block |
US5489857A (en) | 1992-08-03 | 1996-02-06 | Advanced Micro Devices, Inc. | Flexible synchronous/asynchronous cell structure for a high density programmable logic device |
US5128559A (en) | 1989-09-29 | 1992-07-07 | Sgs-Thomson Microelectronics, Inc. | Logic block for programmable logic devices |
JP2968289B2 (en) | 1989-11-08 | 1999-10-25 | 株式会社リコー | Central processing unit |
US5522083A (en) | 1989-11-17 | 1996-05-28 | Texas Instruments Incorporated | Reconfigurable multi-processor operating in SIMD mode with one processor fetching instructions for use by remaining processors |
DE58908974D1 (en) * | 1989-11-21 | 1995-03-16 | Itt Ind Gmbh Deutsche | Data controlled array processor. |
DE4000673C2 (en) * | 1990-01-11 | 2003-04-24 | Siemens Ag | Method and arrangement for the transmission of data in a bus system consisting of several segments |
US5125801A (en) | 1990-02-02 | 1992-06-30 | Isco, Inc. | Pumping system |
US5142469A (en) | 1990-03-29 | 1992-08-25 | Ge Fanuc Automation North America, Inc. | Method for converting a programmable logic controller hardware configuration and corresponding control program for use on a first programmable logic controller to use on a second programmable logic controller |
US5198705A (en) | 1990-05-11 | 1993-03-30 | Actel Corporation | Logic module with configurable combinational and sequential blocks |
US5483620A (en) | 1990-05-22 | 1996-01-09 | International Business Machines Corp. | Learning machine synapse processor system apparatus |
US5193202A (en) * | 1990-05-29 | 1993-03-09 | Wavetracer, Inc. | Processor array with relocated operand physical address generator capable of data transfer to distant physical processor for each virtual processor while simulating dimensionally larger array processor |
US5111079A (en) | 1990-06-29 | 1992-05-05 | Sgs-Thomson Microelectronics, Inc. | Power reduction circuit for programmable logic device |
SE9002558D0 (en) | 1990-08-02 | 1990-08-02 | Carlstedt Elektronik Ab | PROCESSOR |
US5274593A (en) | 1990-09-28 | 1993-12-28 | Intergraph Corporation | High speed redundant rows and columns for semiconductor memories |
US5144166A (en) | 1990-11-02 | 1992-09-01 | Concurrent Logic, Inc. | Programmable logic cell and array |
US5713037A (en) * | 1990-11-13 | 1998-01-27 | International Business Machines Corporation | Slide bus communication functions for SIMD/MIMD array processor |
US5590345A (en) * | 1990-11-13 | 1996-12-31 | International Business Machines Corporation | Advanced parallel array processor(APAP) |
US5734921A (en) * | 1990-11-13 | 1998-03-31 | International Business Machines Corporation | Advanced parallel array processor computer package |
US5588152A (en) | 1990-11-13 | 1996-12-24 | International Business Machines Corporation | Advanced parallel processor including advanced support hardware |
US5301284A (en) * | 1991-01-16 | 1994-04-05 | Walker-Estes Corporation | Mixed-resolution, N-dimensional object space method and apparatus |
US5301344A (en) | 1991-01-29 | 1994-04-05 | Analogic Corporation | Multibus sequential processor to perform in parallel a plurality of reconfigurable logic operations on a plurality of data sets |
DE59109046D1 (en) | 1991-02-22 | 1998-10-08 | Siemens Ag | Programming procedure for a logic module |
JPH04290155A (en) | 1991-03-19 | 1992-10-14 | Fujitsu Ltd | Parallel data processing system |
JPH04293151A (en) | 1991-03-20 | 1992-10-16 | Fujitsu Ltd | Parallel data processing system |
US5617547A (en) * | 1991-03-29 | 1997-04-01 | International Business Machines Corporation | Switch network extension of bus architecture |
EP0539595A4 (en) | 1991-04-09 | 1994-07-20 | Fujitsu Ltd | Data processor and data processing method |
US5551033A (en) * | 1991-05-17 | 1996-08-27 | Zenith Data Systems Corporation | Apparatus for maintaining one interrupt mask register in conformity with another in a manner invisible to an executing program |
CA2109799A1 (en) | 1991-05-24 | 1992-12-10 | Daniel Mark Nosenchuck | Optimizing compiler for computers |
US5659797A (en) | 1991-06-24 | 1997-08-19 | U.S. Philips Corporation | Sparc RISC based computer system including a single chip processor with memory management and DMA units coupled to a DRAM interface |
US5347639A (en) | 1991-07-15 | 1994-09-13 | International Business Machines Corporation | Self-parallelizing computer system and method |
US5338984A (en) | 1991-08-29 | 1994-08-16 | National Semiconductor Corp. | Local and express diagonal busses in a configurable logic array |
US5260610A (en) | 1991-09-03 | 1993-11-09 | Altera Corporation | Programmable logic element interconnections for programmable logic array integrated circuits |
FR2681791B1 (en) * | 1991-09-27 | 1994-05-06 | Salomon Sa | VIBRATION DAMPING DEVICE FOR A GOLF CLUB. |
CA2073516A1 (en) | 1991-11-27 | 1993-05-28 | Peter Michael Kogge | Dynamic multi-mode parallel processor array architecture computer system |
US5208491A (en) * | 1992-01-07 | 1993-05-04 | Washington Research Foundation | Field programmable gate array |
FR2686175B1 (en) | 1992-01-14 | 1996-12-20 | Andre Thepaut | MULTIPROCESSOR DATA PROCESSING SYSTEM. |
JP2791243B2 (en) * | 1992-03-13 | 1998-08-27 | 株式会社東芝 | Hierarchical synchronization system and large scale integrated circuit using the same |
US5452401A (en) | 1992-03-31 | 1995-09-19 | Seiko Epson Corporation | Selective power-down for high performance CPU/system |
US5611049A (en) * | 1992-06-03 | 1997-03-11 | Pitts; William M. | System for accessing distributed data cache channel at each network node to pass requests and data |
EP0643855A1 (en) * | 1992-06-04 | 1995-03-22 | Xilinx, Inc. | Timing driven method for laying out a user's circuit onto a programmable integrated circuit device |
US5475803A (en) | 1992-07-10 | 1995-12-12 | Lsi Logic Corporation | Method for 2-D affine transformation of images |
US5590348A (en) | 1992-07-28 | 1996-12-31 | International Business Machines Corporation | Status predictor for combined shifter-rotate/merge unit |
US5802290A (en) * | 1992-07-29 | 1998-09-01 | Virtual Computer Corporation | Computer network of distributed virtual computers which are EAC reconfigurable in response to instruction to be executed |
US5581778A (en) * | 1992-08-05 | 1996-12-03 | David Sarnoff Research Center | Advanced massively parallel computer using a field of the instruction to selectively enable the profiling counter to increase its value in response to the system clock |
US5425036A (en) | 1992-09-18 | 1995-06-13 | Quickturn Design Systems, Inc. | Method and apparatus for debugging reconfigurable emulation systems |
JPH06180653A (en) | 1992-10-02 | 1994-06-28 | Hudson Soft Co Ltd | Interruption processing method and device therefor |
US5329179A (en) | 1992-10-05 | 1994-07-12 | Lattice Semiconductor Corporation | Arrangement for parallel programming of in-system programmable IC logical devices |
US5497498A (en) | 1992-11-05 | 1996-03-05 | Giga Operations Corporation | Video processing module using a second programmable logic device which reconfigures a first programmable logic device for data transformation |
US5392437A (en) | 1992-11-06 | 1995-02-21 | Intel Corporation | Method and apparatus for independently stopping and restarting functional units |
US5361373A (en) | 1992-12-11 | 1994-11-01 | Gilson Kent L | Integrated circuit computing device comprising a dynamically configurable gate array having a microprocessor and reconfigurable instruction execution means and method therefor |
US5428526A (en) | 1993-02-03 | 1995-06-27 | Flood; Mark A. | Programmable controller with time periodic communication |
GB9303084D0 (en) | 1993-02-16 | 1993-03-31 | Inmos Ltd | Programmable logic circuit |
JPH06276086A (en) | 1993-03-18 | 1994-09-30 | Fuji Xerox Co Ltd | Field programmable gate array |
US5548773A (en) | 1993-03-30 | 1996-08-20 | The United States Of America As Represented By The Administrator Of The National Aeronautics And Space Administration | Digital parallel processor array for optimum path planning |
US5596742A (en) * | 1993-04-02 | 1997-01-21 | Massachusetts Institute Of Technology | Virtual interconnections for reconfigurable logic systems |
US5473266A (en) | 1993-04-19 | 1995-12-05 | Altera Corporation | Programmable logic device having fast programmable logic array blocks and a central global interconnect array |
DE4416881C2 (en) * | 1993-05-13 | 1998-03-19 | Pact Inf Tech Gmbh | Method for operating a data processing device |
US5349193A (en) | 1993-05-20 | 1994-09-20 | Princeton Gamma Tech, Inc. | Highly sensitive nuclear spectrometer apparatus and method |
US5444394A (en) | 1993-07-08 | 1995-08-22 | Altera Corporation | PLD with selective inputs from local and global conductors |
JPH0736858A (en) | 1993-07-21 | 1995-02-07 | Hitachi Ltd | Signal processor |
US5457644A (en) | 1993-08-20 | 1995-10-10 | Actel Corporation | Field programmable digital signal processing array integrated circuit |
US5440538A (en) | 1993-09-23 | 1995-08-08 | Massachusetts Institute Of Technology | Communication system with redundant links and data bit time multiplexing |
US5455525A (en) | 1993-12-06 | 1995-10-03 | Intelligent Logic Systems, Inc. | Hierarchically-structured programmable logic array and system for interconnecting logic elements in the logic array |
US5535406A (en) | 1993-12-29 | 1996-07-09 | Kolchinsky; Alexander | Virtual processor module including a reconfigurable programmable matrix |
US5680583A (en) | 1994-02-16 | 1997-10-21 | Arkos Design, Inc. | Method and apparatus for a trace buffer in an emulation system |
US5561738A (en) | 1994-03-25 | 1996-10-01 | Motorola, Inc. | Data processor for executing a fuzzy logic operation and method therefor |
US5430687A (en) | 1994-04-01 | 1995-07-04 | Xilinx, Inc. | Programmable logic device including a parallel input device for loading memory cells |
US5896551A (en) * | 1994-04-15 | 1999-04-20 | Micron Technology, Inc. | Initializing and reprogramming circuitry for state independent memory array burst operations control |
US5426378A (en) | 1994-04-20 | 1995-06-20 | Xilinx, Inc. | Programmable logic device which stores more than one configuration and means for switching configurations |
US5532693A (en) | 1994-06-13 | 1996-07-02 | Advanced Hardware Architectures | Adaptive data compression system with systolic string matching logic |
JP3365581B2 (en) | 1994-07-29 | 2003-01-14 | 富士通株式会社 | Information processing device with self-healing function |
US5530813A (en) * | 1994-08-04 | 1996-06-25 | Pattern Processing Technology | Field-programmable electronic crossbar system and method for using same |
US5574930A (en) | 1994-08-12 | 1996-11-12 | University Of Hawaii | Computer system and method using functional memory |
US5513366A (en) | 1994-09-28 | 1996-04-30 | International Business Machines Corporation | Method and system for dynamically reconfiguring a register file in a vector processor |
US5530946A (en) | 1994-10-28 | 1996-06-25 | Dell Usa, L.P. | Processor failure detection and recovery circuit in a dual processor computer system and method of operation thereof |
JPH08137824A (en) | 1994-11-15 | 1996-05-31 | Mitsubishi Semiconductor Software Kk | Single-chip microcomputer with built-in self-test function |
EP0721157A1 (en) * | 1994-12-12 | 1996-07-10 | Advanced Micro Devices, Inc. | Microprocessor with selectable clock frequency |
US5532957A (en) | 1995-01-31 | 1996-07-02 | Texas Instruments Incorporated | Field reconfigurable logic/memory array |
US5493239A (en) | 1995-01-31 | 1996-02-20 | Motorola, Inc. | Circuit and method of configuring a field programmable gate array |
US5742180A (en) * | 1995-02-10 | 1998-04-21 | Massachusetts Institute Of Technology | Dynamically programmable gate array with multiple contexts |
US5659785A (en) | 1995-02-10 | 1997-08-19 | International Business Machines Corporation | Array processor communication architecture with broadcast processor instructions |
US6052773A (en) * | 1995-02-10 | 2000-04-18 | Massachusetts Institute Of Technology | DPGA-coupled microprocessors |
US5537057A (en) | 1995-02-14 | 1996-07-16 | Altera Corporation | Programmable logic array device with grouped logic regions and three types of conductors |
US5892961A (en) * | 1995-02-17 | 1999-04-06 | Xilinx, Inc. | Field programmable gate array having programming instructions in the configuration bitstream |
US5675743A (en) | 1995-02-22 | 1997-10-07 | Callisto Media Systems Inc. | Multi-media server |
US5570040A (en) | 1995-03-22 | 1996-10-29 | Altera Corporation | Programmable logic array integrated circuit incorporating a first-in first-out memory |
US5541530A (en) | 1995-05-17 | 1996-07-30 | Altera Corporation | Programmable logic array integrated circuits with blocks of logic regions grouped into super-blocks |
US5649179A (en) | 1995-05-19 | 1997-07-15 | Motorola, Inc. | Dynamic instruction allocation for a SIMD processor |
US5646544A (en) | 1995-06-05 | 1997-07-08 | International Business Machines Corporation | System and method for dynamically reconfiguring a programmable gate array |
US5889982A (en) * | 1995-07-01 | 1999-03-30 | Intel Corporation | Method and apparatus for generating event handler vectors based on both operating mode and event type |
US5559450A (en) | 1995-07-27 | 1996-09-24 | Lucent Technologies Inc. | Field programmable gate array with multi-port RAM |
US5978583A (en) * | 1995-08-07 | 1999-11-02 | International Business Machines Corp. | Method for resource control in parallel environments using program organization and run-time support |
US5649176A (en) | 1995-08-10 | 1997-07-15 | Virtual Machine Works, Inc. | Transition analysis and circuit resynthesis method and device for digital circuit modeling |
US5583450A (en) | 1995-08-18 | 1996-12-10 | Xilinx, Inc. | Sequencer for a time multiplexed programmable logic device |
US5784313A (en) * | 1995-08-18 | 1998-07-21 | Xilinx, Inc. | Programmable logic device including configuration data or user data memory slices |
US5652894A (en) | 1995-09-29 | 1997-07-29 | Intel Corporation | Method and apparatus for providing power saving modes to a pipelined processor |
US5943242A (en) * | 1995-11-17 | 1999-08-24 | Pact Gmbh | Dynamically reconfigurable data processing system |
US5732209A (en) * | 1995-11-29 | 1998-03-24 | Exponential Technology, Inc. | Self-testing multi-processor die with internal compare points |
US7266725B2 (en) * | 2001-09-03 | 2007-09-04 | Pact Xpp Technologies Ag | Method for debugging reconfigurable architectures |
US5936424A (en) * | 1996-02-02 | 1999-08-10 | Xilinx, Inc. | High speed bus with tree structure for selecting bus driver |
US6279077B1 (en) * | 1996-03-22 | 2001-08-21 | Texas Instruments Incorporated | Bus interface buffer control in a microprocessor |
US6173434B1 (en) * | 1996-04-22 | 2001-01-09 | Brigham Young University | Dynamically-configurable digital processor using method for relocating logic array modules |
US5894565A (en) * | 1996-05-20 | 1999-04-13 | Atmel Corporation | Field programmable gate array with distributed RAM and increased cell utilization |
US5892370A (en) * | 1996-06-21 | 1999-04-06 | Quicklogic Corporation | Clock network for field programmable gate array |
US6023742A (en) * | 1996-07-18 | 2000-02-08 | University Of Washington | Reconfigurable computing architecture for providing pipelined data paths |
US6023564A (en) * | 1996-07-19 | 2000-02-08 | Xilinx, Inc. | Data processing system using a flash reconfigurable logic device as a dynamic execution unit for a sequence of instructions |
US5838165A (en) * | 1996-08-21 | 1998-11-17 | Chatter; Mukesh | High performance self modifying on-the-fly alterable logic FPGA, architecture and method |
US5859544A (en) * | 1996-09-05 | 1999-01-12 | Altera Corporation | Dynamic configurable elements for programmable logic devices |
US6005410A (en) * | 1996-12-05 | 1999-12-21 | International Business Machines Corporation | Interconnect structure between heterogeneous core regions in a programmable array |
DE19651075A1 (en) * | 1996-12-09 | 1998-06-10 | Pact Inf Tech Gmbh | Unit for processing numerical and logical operations, for use in processors (CPU's), multi-computer systems, data flow processors (DFP's), digital signal processors (DSP's) or the like |
US6338106B1 (en) * | 1996-12-20 | 2002-01-08 | Pact Gmbh | I/O and memory bus system for DFPS and units with two or multi-dimensional programmable cell architectures |
DE19654593A1 (en) * | 1996-12-20 | 1998-07-02 | Pact Inf Tech Gmbh | Reconfiguration procedure for programmable blocks at runtime |
DE19654846A1 (en) * | 1996-12-27 | 1998-07-09 | Pact Inf Tech Gmbh | Process for the independent dynamic reloading of data flow processors (DFPs) as well as modules with two- or multi-dimensional programmable cell structures (FPGAs, DPGAs, etc.) |
DE19704044A1 (en) * | 1997-02-04 | 1998-08-13 | Pact Inf Tech Gmbh | Address generation with systems having programmable modules |
US5865239A (en) * | 1997-02-05 | 1999-02-02 | Micropump, Inc. | Method for making herringbone gears |
DE19704728A1 (en) * | 1997-02-08 | 1998-08-13 | Pact Inf Tech Gmbh | Method for self-synchronization of configurable elements of a programmable module |
US6542998B1 (en) * | 1997-02-08 | 2003-04-01 | Pact Gmbh | Method of self-synchronization of configurable elements of a programmable module |
DE19704742A1 (en) * | 1997-02-11 | 1998-09-24 | Pact Inf Tech Gmbh | Internal bus system for DFPs, as well as modules with two- or multi-dimensional programmable cell structures, for coping with large amounts of data with high networking effort |
US5884075A (en) * | 1997-03-10 | 1999-03-16 | Compaq Computer Corporation | Conflict resolution using self-contained virtual devices |
US6321366B1 (en) * | 1997-05-02 | 2001-11-20 | Axis Systems, Inc. | Timing-insensitive glitch-free logic system and method |
US6389379B1 (en) * | 1997-05-02 | 2002-05-14 | Axis Systems, Inc. | Converification system and method |
US6047115A (en) * | 1997-05-29 | 2000-04-04 | Xilinx, Inc. | Method for configuring FPGA memory planes for virtual hardware computation |
US6421817B1 (en) * | 1997-05-29 | 2002-07-16 | Xilinx, Inc. | System and method of computation in a programmable logic device using virtual instructions |
US6011407A (en) * | 1997-06-13 | 2000-01-04 | Xilinx, Inc. | Field programmable gate array with dedicated computer bus interface and method for configuring both |
US5966534A (en) * | 1997-06-27 | 1999-10-12 | Cooke; Laurence H. | Method for compiling high level programming languages into an integrated processor with reconfigurable logic |
US6038656A (en) * | 1997-09-12 | 2000-03-14 | California Institute Of Technology | Pipelined completion for asynchronous communication |
JP4128251B2 (en) * | 1997-10-23 | 2008-07-30 | 富士通株式会社 | Wiring density prediction method and cell placement apparatus |
US5915123A (en) * | 1997-10-31 | 1999-06-22 | Silicon Spice | Method and apparatus for controlling configuration memory contexts of processing elements in a network of multiple context processing elements |
DE69737750T2 (en) * | 1997-12-17 | 2008-03-06 | Hewlett-Packard Development Co., L.P., Houston | Method using first and second processors |
DE19861088A1 (en) * | 1997-12-22 | 2000-02-10 | Pact Inf Tech Gmbh | Repairing integrated circuits by replacing subassemblies with substitutes |
US6172520B1 (en) * | 1997-12-30 | 2001-01-09 | Xilinx, Inc. | FPGA system with user-programmable configuration ports and method for reconfiguring the FPGA |
US6049222A (en) * | 1997-12-30 | 2000-04-11 | Xilinx, Inc | Configuring an FPGA using embedded memory |
US6034538A (en) * | 1998-01-21 | 2000-03-07 | Lucent Technologies Inc. | Virtual logic system for reconfigurable hardware |
WO1999038071A1 (en) * | 1998-01-26 | 1999-07-29 | Chameleon Systems, Inc. | Reconfigurable logic for table lookup |
DE19807872A1 (en) * | 1998-02-25 | 1999-08-26 | Pact Inf Tech Gmbh | Method of managing configuration data in data flow processors |
US6282627B1 (en) * | 1998-06-29 | 2001-08-28 | Chameleon Systems, Inc. | Integrated processor and programmable data path chip for reconfigurable computing |
US6202182B1 (en) * | 1998-06-30 | 2001-03-13 | Lucent Technologies Inc. | Method and apparatus for testing field programmable gate arrays |
JP3585800B2 (en) * | 1999-01-13 | 2004-11-04 | 株式会社東芝 | Information processing equipment |
US6243808B1 (en) * | 1999-03-08 | 2001-06-05 | Chameleon Systems, Inc. | Digital data bit order conversion using universal switch matrix comprising rows of bit swapping selector groups |
US6298472B1 (en) * | 1999-05-07 | 2001-10-02 | Chameleon Systems, Inc. | Behavioral silicon construct architecture and mapping |
US6347346B1 (en) * | 1999-06-30 | 2002-02-12 | Chameleon Systems, Inc. | Local memory unit system with global access for use on reconfigurable chips |
US6370596B1 (en) * | 1999-08-03 | 2002-04-09 | Chameleon Systems, Inc. | Logic flag registers for monitoring processing system events |
US6341318B1 (en) * | 1999-08-10 | 2002-01-22 | Chameleon Systems, Inc. | DMA data streaming |
US6288566B1 (en) * | 1999-09-23 | 2001-09-11 | Chameleon Systems, Inc. | Configuration state memory for functional blocks on a reconfigurable chip |
US6311200B1 (en) * | 1999-09-23 | 2001-10-30 | Chameleon Systems, Inc. | Reconfigurable program sum of products generator |
US6349346B1 (en) * | 1999-09-23 | 2002-02-19 | Chameleon Systems, Inc. | Control fabric unit including associated configuration memory and PSOP state machine adapted to provide configuration address to reconfigurable functional unit |
US6519674B1 (en) * | 2000-02-18 | 2003-02-11 | Chameleon Systems, Inc. | Configuration bits layout |
US6539477B1 (en) * | 2000-03-03 | 2003-03-25 | Chameleon Systems, Inc. | System and method for control synthesis using a reachable states look-up table |
US6657457B1 (en) * | 2000-03-15 | 2003-12-02 | Intel Corporation | Data transfer on reconfigurable chip |
JP2004506261A (en) * | 2000-06-13 | 2004-02-26 | PACT XPP Technologies AG | Pipeline CT protocol and CT communication |
US6538468B1 (en) * | 2000-07-31 | 2003-03-25 | Cypress Semiconductor Corporation | Method and apparatus for multiple boot-up functionalities for a programmable logic device (PLD) |
US20040015899A1 (en) * | 2000-10-06 | 2004-01-22 | Frank May | Method for processing data |
US6392912B1 (en) * | 2001-01-10 | 2002-05-21 | Chameleon Systems, Inc. | Loading data plane on reconfigurable chip |
US7210129B2 (en) * | 2001-08-16 | 2007-04-24 | Pact Xpp Technologies Ag | Method for translating programs for reconfigurable architectures |
US6792588B2 (en) * | 2001-04-02 | 2004-09-14 | Intel Corporation | Faster scalable floorplan which enables easier data control flow |
US20020143505A1 (en) * | 2001-04-02 | 2002-10-03 | Doron Drusinsky | Implementing a finite state machine using concurrent finite state machines with delayed communications and no shared control signals |
US6999984B2 (en) * | 2001-05-02 | 2006-02-14 | Intel Corporation | Modification to reconfigurable functional unit in a reconfigurable chip to perform linear feedback shift register function |
US20030056091A1 (en) * | 2001-09-14 | 2003-03-20 | Greenberg Craig B. | Method of scheduling in a reconfigurable hardware architecture with multiple hardware configurations |
US20030055861A1 (en) * | 2001-09-18 | 2003-03-20 | Lai Gary N. | Multipler unit in reconfigurable chip |
US20030052711A1 (en) * | 2001-09-19 | 2003-03-20 | Taylor Bradley L. | Despreader/correlator unit for use in reconfigurable chip |
1997
- 1997-02-11 DE DE19704742A patent/DE19704742A1/en not_active Withdrawn
1998
- 1998-02-11 CA CA002280057A patent/CA2280057A1/en not_active Abandoned
- 1998-02-11 DE DE19880128T patent/DE19880128D2/en not_active Expired - Lifetime
- 1998-02-11 AU AU63924/98A patent/AU6392498A/en not_active Abandoned
- 1998-02-11 EP EP98909365A patent/EP0960374B1/en not_active Expired - Lifetime
- 1998-02-11 EP EP03021585A patent/EP1398706A3/en not_active Withdrawn
- 1998-02-11 AT AT98909365T patent/ATE256888T1/en not_active IP Right Cessation
- 1998-02-11 DE DE59810469T patent/DE59810469D1/en not_active Expired - Lifetime
- 1998-02-11 JP JP53357998A patent/JP3894957B2/en not_active Expired - Fee Related
- 1998-02-11 WO PCT/DE1998/000456 patent/WO1998035294A2/en active IP Right Grant
- 1998-08-28 US US09/145,139 patent/US6405299B1/en not_active Expired - Lifetime
2002
- 2002-04-05 US US10/116,986 patent/US7010667B2/en not_active Expired - Lifetime
Also Published As
Publication number | Publication date |
---|---|
DE59810469D1 (en) | 2004-01-29 |
US7010667B2 (en) | 2006-03-07 |
DE19880128D2 (en) | 2000-01-13 |
JP2001511326A (en) | 2001-08-07 |
US20030135686A1 (en) | 2003-07-17 |
WO1998035294A2 (en) | 1998-08-13 |
EP1398706A3 (en) | 2006-04-12 |
EP1398706A2 (en) | 2004-03-17 |
WO1998035294A3 (en) | 1998-10-22 |
JP3894957B2 (en) | 2007-03-22 |
US6405299B1 (en) | 2002-06-11 |
EP0960374B1 (en) | 2003-12-17 |
DE19704742A1 (en) | 1998-09-24 |
EP0960374A2 (en) | 1999-12-01 |
AU6392498A (en) | 1998-08-26 |
ATE256888T1 (en) | 2004-01-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CA2280057A1 (en) | Internal bus system for dfps, building blocks with two dimensional or multidimensional programmable cell structures to handle large amounts of data involving high networking requirements | |
US9256575B2 (en) | Data processor chip with flexible bus system | |
KR100250437B1 (en) | Path control device for round robin arbitration and adaptation | |
US7595659B2 (en) | Logic cell array and bus system | |
KR900006792B1 (en) | Load balancing for packet switching nodes | |
US4985830A (en) | Interprocessor bus switching system for simultaneous communication in plural bus parallel processing system | |
EP0200780B1 (en) | Packet switched multiple queue nxm switch node and processing method | |
US5594918A (en) | Parallel computer system providing multi-ported intelligent memory | |
US9294551B1 (en) | Collective engine method and apparatus | |
US5630162A (en) | Array processor dotted communication network based on H-DOTs | |
US20070143578A1 (en) | System and method for message passing fabric in a modular processor architecture | |
Konishi et al. | PCA-1: A fully asynchronous, self-reconfigurable LSI | |
JPH05241947A (en) | Switching array in distributed cross-bar switch architecture | |
CN100401279C (en) | Configurable multi-port multi-protocol network interface to support packet processing | |
US20190065428A9 (en) | Array Processor Having a Segmented Bus System | |
JP3375658B2 (en) | Parallel computer and network for it | |
JP3481445B2 (en) | Competition mediation method | |
US9081901B2 (en) | Means of control for reconfigurable computers | |
Amano et al. | A dynamically adaptive switching fabric on a multicontext reconfigurable device | |
US9626325B2 (en) | Array processor having a segmented bus system | |
EP0293616A2 (en) | Dynamic switch with task allocation capability | |
Diab et al. | Simulation of dynamic input buffer space in multistage interconnection networks | |
WO2023016910A2 (en) | Interconnecting reconfigurable regions in an field programmable gate array | |
JPH09179840A (en) | Inter-processor coupling network for parallel computer | |
JPH0652126A (en) | Message passing device of interconnected node network and computer system thereof |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
EEER | Examination request | ||
FZDE | Discontinued | ||