US5864512A - High-speed video frame buffer using single port memory chips - Google Patents
- Publication number
- US5864512A (application US08/832,708)
- Authority
- US
- United States
- Prior art keywords
- pixels
- pixel
- frame buffer
- memory
- display
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Lifetime
Classifications
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09G—ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
- G09G5/00—Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators
- G09G5/36—Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators characterised by the display of a graphic pattern, e.g. using an all-points-addressable [APA] memory
- G09G5/363—Graphics controllers
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09G—ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
- G09G5/00—Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators
- G09G5/36—Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators characterised by the display of a graphic pattern, e.g. using an all-points-addressable [APA] memory
- G09G5/39—Control of the bit-mapped memory
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09G—ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
- G09G5/00—Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators
- G09G5/36—Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators characterised by the display of a graphic pattern, e.g. using an all-points-addressable [APA] memory
- G09G5/39—Control of the bit-mapped memory
- G09G5/393—Arrangements for updating the contents of the bit-mapped memory
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09G—ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
- G09G5/00—Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators
- G09G5/36—Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators characterised by the display of a graphic pattern, e.g. using an all-points-addressable [APA] memory
- G09G5/39—Control of the bit-mapped memory
- G09G5/395—Arrangements specially adapted for transferring the contents of the bit-mapped memory to the screen
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11C—STATIC STORES
- G11C7/00—Arrangements for writing information into, or reading information out from, a digital store
- G11C7/10—Input/output [I/O] data interface arrangements, e.g. I/O data control circuits, I/O data buffers
- G11C7/1015—Read-write modes for single port memories, i.e. having either a random port or a serial port
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11C—STATIC STORES
- G11C7/00—Arrangements for writing information into, or reading information out from, a digital store
- G11C7/10—Input/output [I/O] data interface arrangements, e.g. I/O data control circuits, I/O data buffers
- G11C7/1015—Read-write modes for single port memories, i.e. having either a random port or a serial port
- G11C7/1018—Serial bit line access mode, e.g. using bit line address shift registers, bit line address counters, bit line burst counters
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11C—STATIC STORES
- G11C8/00—Arrangements for selecting an address in a digital store
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11C—STATIC STORES
- G11C8/00—Arrangements for selecting an address in a digital store
- G11C8/04—Arrangements for selecting an address in a digital store using a sequential addressing device, e.g. shift register, counter
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09G—ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
- G09G2360/00—Aspects of the architecture of display systems
- G09G2360/12—Frame memory handling
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09G—ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
- G09G2360/00—Aspects of the architecture of display systems
- G09G2360/12—Frame memory handling
- G09G2360/121—Frame memory handling using a cache memory
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09G—ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
- G09G2360/00—Aspects of the architecture of display systems
- G09G2360/12—Frame memory handling
- G09G2360/123—Frame memory handling using interleaving
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09G—ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
- G09G2360/00—Aspects of the architecture of display systems
- G09G2360/12—Frame memory handling
- G09G2360/128—Frame memory using a Synchronous Dynamic RAM [SDRAM]
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09G—ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
- G09G5/00—Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators
- G09G5/02—Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators characterised by the way in which colour is displayed
- G09G5/022—Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators characterised by the way in which colour is displayed using memory planes
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09G—ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
- G09G5/00—Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators
- G09G5/36—Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators characterised by the display of a graphic pattern, e.g. using an all-points-addressable [APA] memory
- G09G5/39—Control of the bit-mapped memory
- G09G5/399—Control of the bit-mapped memory using two or more bit-mapped memories, the operations of which are switched in time, e.g. ping-pong buffers
Definitions
- High performance graphics processing commonly requires a specialized graphics frame buffer including a graphics engine in communication with a host processor over a bus. Control over a graphics frame buffer of this sort has been achieved by a variety of means, typically involving hardware configured to supervise the operation of the graphics engine.
- the graphics engine is typically controlled through commands from a host computer's processor over a bus so as to provide request code and data from the host processor to the graphics engine.
- High-performance frame buffers in the prior art have three general characteristics.
- First, the video board logic for performing texture processing, i.e. the integrated circuit that performs those functions, is separate from the circuitry for performing other frame buffer manipulations, such as graphics display requests. This limits the performance of the graphics system, because the frame buffer designer has to arrange for a communication path between the texture processor and other components on the board.
- Second, prior art video frame buffers arrange video memory in a linear fashion, such that consecutive memory locations represent the next pixel upon a given row of the display.
- That is, prior art video memory arrangements track the scanline of the display.
- Third, prior art video frame buffers store as one word in memory all information relevant to a particular display pixel. Consequently, acquiring the color value information for displaying a row of pixels upon the display requires skipping through video memory to obtain the values. This can be a very inefficient process.
- Prior art video frame buffers are exemplified by the Edge III graphics processing system sold by Intergraph Corporation and described in a technical white paper titled GLZ5 Hardware User's Guide, which is incorporated herein by reference. The Edge III represents the state of the prior art in graphics processing systems.
- The Edge III, as do other prior art video buffers, suffers from the three general limitations referenced above: lack of integration, linear video buffer memory, and consecutive placement of pixel information within the frame buffer. These limitations result in a graphics processing system that is not as efficient or speedy as it could be.
- the present invention resolves these issues.
- the present invention in accordance with a preferred embodiment, provides a device for storing pixel information for displaying a graphics image on a display.
- the information includes an intensity value and a value associated with each of a plurality of additional planes for each pixel.
- the device has a video frame buffer memory having a series of consecutive addresses for storing information to be output to the display.
- the buffer memory is subdivided into a plurality of blocks, each block corresponding to a region of the display having a plurality of contiguous pixels.
- the device also has a processor for placing the pixel information within the frame buffer memory so that in a given block there are placed at a first collection of consecutive addresses the intensity values for each of the pixels in the block. (Typically the processor is implemented by one or more resolvers.)
- the frame buffer memory has a single port.
- the placement of pixel information within the frame buffer includes a processor for placing at a second collection of consecutive addresses values for each of the pixels in the block associated with a first one of the plurality of additional planes.
- the present invention provides a device for storing pixel information for displaying a graphics image on a display, the information including an intensity value and a value associated with each of a plurality of additional planes for each pixel.
- This embodiment has a video frame buffer for storing information to be output to the display, the buffer memory having a plurality of banks, each bank being separately addressable and being subdivided into a plurality of blocks, each block corresponding to a region of the display having a plurality of contiguous pixels.
- This embodiment also has a processor for placing the pixel information within the frame buffer so that pixel information relating to first and second contiguous blocks is stored in different ones of the plurality of banks.
- the buffer memory has two banks, a first bank and a second bank, and the pixel information relating to first and second contiguous blocks is stored in the first and second banks respectively, so that there results a checkerboard form of allocation of pixels of the image over the display.
- the contiguous blocks are rectangular in shape, each block having more than 4 pixels on a side. In alternate embodiments, each block may have more than 7 pixels on a first side, and more than 7, 15, 31, 63, or 79 pixels on a second side.
- the invention provides a device for storing pixel information for displaying a graphics image on a display, the information including an intensity value and a value associated with each of a plurality of additional planes for each pixel.
- This embodiment has a video frame buffer memory having a series of consecutive addresses for storing information to be output to the display, the buffer memory subdivided into a plurality of banks, each bank being separately addressable and subdivided into a plurality of blocks, each block corresponding to a region of the display having a plurality of contiguous pixels; and a processor for placing the pixel information within the frame buffer so that, first, that pixel information relating to first and second contiguous blocks is stored in different ones of the plurality of banks, and second, in a given block there are placed at a first collection of consecutive addresses the intensity values for each of the pixels in the block.
- the buffer memory has two banks, a first bank and a second, and the pixel information relating to first and second contiguous blocks is stored in the first and second banks respectively, so that there results a checkerboard form of allocation of pixels of the image over the display.
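The checkerboard allocation described above can be sketched as follows. This is a minimal illustration, assuming two banks and an 8 × 8 pixel block; the block dimensions and the names `BLOCK_W`, `BLOCK_H`, `bank_for_pixel`, and `bank_map` are hypothetical and do not come from the patent text.

```python
# Hypothetical block dimensions; the text only requires more than 4 pixels per side.
BLOCK_W = 8
BLOCK_H = 8


def block_of(x, y):
    """Return (block_col, block_row) of the pixel block containing pixel (x, y)."""
    return x // BLOCK_W, y // BLOCK_H


def bank_for_pixel(x, y):
    """Checkerboard rule: blocks that share an edge land in different banks."""
    bx, by = block_of(x, y)
    return (bx + by) % 2


def bank_map(blocks_wide, blocks_high):
    """Bank number for every block of a display, row by row."""
    return [[(bx + by) % 2 for bx in range(blocks_wide)]
            for by in range(blocks_high)]
```

With this rule, moving one block to the right or one block down always switches banks, which is the property the two-bank embodiment relies on.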
- FIG. 1 is a diagram showing the general structure of a preferred embodiment of the graphics invention.
- FIG. 2 is a chart showing a comparison between packed versus full pixel information storage.
- FIG. 3 is a chart showing memory to display address mapping.
- FIG. 4a is an example of memory within a video frame buffer.
- FIG. 4b is a chart showing an example of checkerboard memory addressing.
- FIG. 5 is a chart showing a texture processing memory interface for 2M × 8 SyncDRAMs.
- FIG. 6 is a chart showing a texture processing memory interface for 1M × 16 SyncDRAMs.
- FIG. 7 is a chart showing a texture processing memory interface for 256 × 16 SyncDRAMs.
- FIG. 8 is a chart showing a texel mapping for 2M × 8 SyncDRAMs.
- FIG. 9 is a chart showing a texel mapping for 1M × 16 SyncDRAMs.
- FIG. 10 is a chart showing a texel mapping for 256 × 16 SyncDRAMs.
- a preferred embodiment of the present invention has been implemented in a graphics controller-processor having the general structure shown in FIG. 1.
- This embodiment is suitable for use with computers, such as those utilizing the Intel family of 80x86 processors (including the Pentium, Pentium Pro and MMX compliant technologies), running an operating system such as Microsoft Windows NT, designed to communicate over a Peripheral Component Interconnect (PCI) Local Bus, pursuant to the PCI Local Bus Specification version 2.0 published by PCI Special Interest Group, 5200 NE Elam Young Parkway, Hillsboro, Oreg. 97124-6497, which is hereby incorporated herein by reference.
- the embodiment may also be configured, for example, to operate in an X-windows or other windowing environment, and on other buses, such as the VESA local bus (VLB), fibre channel and fibre optic buses.
- graphics processing may be off loaded to the central processing unit.
- FIG. 1 shows a block diagram for a preferred implementation of the invention.
- the principal components are the PCI DMA bridge chip 102 connecting the high-speed video RAM buffer 104 to the PCI bus 106, the graphics engine circuitry 108, a set of dual resolver chips 110, a RAM DAC chip 112, the texture buffer 114, and the frame buffer 116.
- the basic flow of data within the high-speed video frame buffer system starts with a host computer's processor, which writes requests to the Request FIFO 118 inside the graphics engine 108 via a PCI address.
- the graphics engine interprets the request, breaks it down to pixel requests, and sends pixel requests over a dedicated bus 120 (IZ bus) to the appropriate Dual Resolver 110.
- When a Resolver module receives a pixel request, it may alter the pixel's color, as well as determine whether the pixel should be written to the frame buffer. Independent of the rendering path, a Screen Refresh module 122 inside each Dual Resolver 110 requests data from the frame buffer 116 and sends the pixel's color data to the RAM DAC 112, which converts the digital color data to analog signals for display.
- the Screen Refresh Module (SRM) 122 is responsible for supplying the video stream with pixel data.
- the video stream is scanline oriented: pixels are supplied starting at the left edge of the screen and painted from left to right. When the right edge of the screen is reached, the beam is reset to the left edge. This process continues for the entire screen.
- the memory organization in the invention is not inherently scanline oriented, but pixel block oriented (see discussion hereinbelow defining pixel blocking). For the 2 Mpixel case, each Resolver is only assigned 8 pixels per scanline within one pixel block.
- Pixel data includes Image, Image VLT Context, Overlay (or Highlight), and FastClear plane sets from the visible buffer. Some plane sets, such as FastClear, are stored 32 pixels per word.
- When the memory controller reads FastClear, it reads enough data for the 8 pixels (for 2 MP) on the current scanline, plus the next three scanlines. Image is stored 1 pixel per word. To reduce the bandwidth impact of supplying data to the Pixel FIFO, the SRM will read the dense plane sets on the first scanline and temporarily store the portion of the word that is not used for the current scanline. On the next scanlines, the data is fetched from temporary storage (called Overrun RAMs) instead of the frame buffer. What results, however, is that for the first and fifth scanlines within a pixel block, the memory controller must read at least one word for all of the plane sets that comprise a pixel's visible information.
- The first and fifth scanlines are referred to as "Long" scanlines, and the remaining scanlines as "Short".
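The long/short distinction follows from the packing arithmetic given above: 32 FastClear pixels per word and 8 pixels per Resolver per scanline mean that one word covers four scanlines. A minimal sketch, assuming an 8-scanline pixel block (implied by "first and fifth", not stated outright) and using hypothetical names:

```python
PIXELS_PER_WORD = 32      # FastClear packing density stated in the text
PIXELS_PER_SCANLINE = 8   # per-Resolver pixels per scanline in one block (2 Mpixel case)
BLOCK_HEIGHT = 8          # assumption: implied by "first and fifth scanlines"

# One packed word holds the current scanline plus the next three.
SCANLINES_PER_WORD = PIXELS_PER_WORD // PIXELS_PER_SCANLINE


def is_long_scanline(row_in_block):
    """True when the dense plane sets must be read from the frame buffer."""
    return row_in_block % SCANLINES_PER_WORD == 0


def dense_plane_source(row_in_block):
    """Where the dense plane data for this scanline is fetched from."""
    return "frame buffer" if is_long_scanline(row_in_block) else "overrun RAM"


long_rows = [row for row in range(BLOCK_HEIGHT) if is_long_scanline(row)]
```

Rows are counted from zero here, so rows 0 and 4 correspond to the first and fifth scanlines of the block.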
- Flags generated by the Pixel FIFO help the SRM determine when to start and stop requesting more pixels from the Resolver's memory controller. To generate the flags, the FIFO compares the current depth of the FIFO with programmable "water marks". If the current depth is lower than the low water mark (LWM), then the SRM begins requesting data. If the current depth is higher than the high water mark (HWM), then the SRM quits requesting data.
- the worst case latency is from when the low water mark (LWM) is reached to when memory actually begins to supply Image data.
- the instantaneous fill rate is potentially very low on long scanlines.
- While the memory controller is filling the pixel FIFO, it cannot service any graphics requests in its IZ input FIFOs. Therefore, for long scanline cases, if the memory controller waits until the pixel FIFO is full before it services any IZ requests, then the IZ input FIFOs will fill, the IZ bus will stall, and system performance will be lost.
- the requirements on the water marks may be summarized as (1) set LWM high enough so the pixel FIFO won't go empty under the worst case latency conditions; and (2) set HWM low enough to minimize the time the IZ bus stalls.
- For short scanlines, the worst case latency is better than for long scanlines. Latency is shorter because there are fewer (or no) plane sets to read in front of the Image planes.
- the instantaneous fill rate is very high, so it will take much less time to fill the pixel FIFO for short scanlines than for the long ones.
- The SRM uses two different sets of water marks: LWM1 and HWM1 when it is requesting pixels for "long" scanlines, and LWM2 and HWM2 when it is requesting pixels for "short" scanlines. In preferred embodiments, these values are programmable.
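The water-mark behaviour can be sketched as a small hysteresis controller. This is an illustration only: the class names and the numeric water-mark values in the usage note are hypothetical, and the real SRM is hardware rather than software.

```python
from dataclasses import dataclass


@dataclass
class WaterMarks:
    low: int   # start requesting when the FIFO depth falls below this
    high: int  # stop requesting when the FIFO depth rises above this


class RefreshRequester:
    """Hysteresis on Pixel FIFO depth, with separate marks per scanline kind."""

    def __init__(self, long_marks, short_marks):
        self.marks = {"long": long_marks, "short": short_marks}
        self.requesting = False

    def update(self, fifo_depth, scanline_kind):
        """Return True while the SRM should keep requesting pixel data."""
        wm = self.marks[scanline_kind]
        if fifo_depth < wm.low:
            self.requesting = True
        elif fifo_depth > wm.high:
            self.requesting = False
        # Between the two marks the previous decision is kept.
        return self.requesting
```

For example, `RefreshRequester(WaterMarks(16, 24), WaterMarks(8, 28))` would model a high LWM1 for long scanlines (to survive the worst case latency) and a lower LWM2 for short scanlines; these numbers are placeholders, not values from the patent.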
- When the Screen Refresh Manager requests the last visible pixel on the display, it will stop requesting data, even if it has not reached its HWM. This feature is present so that software has additional time during vertical blank to swap buffers before the SRM accesses the "visible" pixels for the upper left region of the screen. If this artificial stall is not introduced, then visual integrity could be degraded for that region for some frames.
- the SRM will begin requesting pixels for the Pixel FIFO after it receives a restart signal from the VSG approximately one half-line before vertical blank ends. Note that the Pixel FIFO will go completely empty once per frame.
- For storing the video display information, a preferred embodiment uses single-ported SDRAMs in the frame buffer and texture buffer. However, a preferred embodiment need not be limited to SDRAMs, and reference to SDRAMs is intended to encompass use of equivalent RAMs.
- prior art video frame buffers stored their information in VRAM-type memory chips. These chips were dual-ported, meaning that the video board could read and write to video memory simultaneously, and resulted in parallel processing with fairly high performance video frame buffers.
- video frame buffers using dual ported RAM represented the best the frame buffer industry could offer.
- Using SDRAM-type memory instead of VRAM, while raising the complexity associated with memory access, also greatly increases performance.
- a texture processor and a graphics engine are integrated into a single chip 124. By placing both into the same chip, it is possible to double the clock rate of the video card, as there are no external bus technologies to consider. An issue relevant to a single-chip design, however, is that memory access is more complex.
- the texture processor directly accesses the texture memory 114 via a dedicated bus 126.
- the graphics engine 108 does not have direct access to the frame buffer 116; instead the graphics engine 108 sends pixel commands to the resolvers 110, whereupon the resolvers 110 directly access frame buffer memory 116.
- the graphics engine 108 groups requests into three different categories: (1) Block requests (long-span, block-oriented requests such as PutBlock, GetBlock, RecFill, FastRecFill, etc.), (2) Blit requests (blits consist of first reading a short subspan and then writing the subspan to a different part of the screen), and (3) Region requests (multiple spans with a high probability of pixel-block locality, such as triangles. Vectors are lumped into this request type). For Resolver Read, Write, and Fill requests, the IZP then sets two IZ Request Type bits in the first IZ header to indicate which category of request is being sent.
- a preferred embodiment implements page crossing algorithms based on the request category identified by the graphics processor 108 in the request made to the communication bus 120.
- the Resolvers 134, 136 optimize their page crossings differently according to the data transfer category. Optimizing page crossings is important, since the FastClear cache is filled and flushed during page crossings. Indiscriminate page crossings, therefore, cost performance.
- the two different page crossing modes are discussed below. Each mode corresponds to a specific request category. Note that SDRAMs have two banks. One bank may be accessed while the other bank is idle, being closed, precharging, or being opened.
- Mode0 Wait. Assume a Resolver is currently accessing a page (a "page" is synonymous with a pixel block) from Bank0 of the SDRAMs. When the Resolver stops accessing Bank0 and begins accessing a page from Bank1, close the page in Bank0. The Resolver may then access Bank1 while Bank0 is precharging. Wait until future activity specifically requires another pixel block in Bank0 before opening that pixel block.
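The Mode0 (Wait) policy can be sketched as a small model of two SDRAM banks. This is a simplified illustration under stated assumptions: the class name, the log format, and the idea of recording open/close events are hypothetical, and precharge timing is not modelled.

```python
class Mode0Banks:
    """Two-bank model: close the old bank's page on a bank switch, open on demand."""

    def __init__(self):
        self.open_page = [None, None]  # page (pixel block) open in each bank
        self.current_bank = None
        self.log = []                  # sequence of (event, bank, page) tuples

    def _close(self, bank):
        if self.open_page[bank] is not None:
            self.log.append(("close", bank, self.open_page[bank]))
            self.open_page[bank] = None

    def access(self, bank, page):
        # Leaving a bank: close its page so it can precharge while the
        # other bank is being accessed.
        if self.current_bank is not None and bank != self.current_bank:
            self._close(self.current_bank)
        # Wait policy: a page is opened only when an access needs it.
        if self.open_page[bank] != page:
            self._close(bank)
            self.log.append(("open", bank, page))
            self.open_page[bank] = page
        self.current_bank = bank
        self.log.append(("access", bank, page))
```

Nothing is reopened speculatively in this model; a closed bank stays closed until a later request names a pixel block in it.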
- In a preferred embodiment, a complex method for storing pixel data within memory, referred to herein as pixel-packing or packed-pixel format, is used to store graphics information.
- a frame buffer contains information about pixels, and a pixel is the basic unit within a graphics environment.
- a collection of pixels forms the screen of the display monitor used to show the graphics output.
- In the prior art, VRAM type memory chips are used to store the attributes that control the display of the pixels, and all data associated with a pixel is stored in the same word in VRAM memory. Consequently, if 124 bits were associated with each pixel, of which 24 were used for recording color intensity (i.e. 8 bits each to encode red, green, and blue color information), there would be 100-bit gaps in the VRAM memory between occurrences of pixel coloration information. In an environment where getting access to such information is the most important task, this spreading out of the information is not the most efficient arrangement for the pixel information.
- a preferred embodiment reduces the inefficiency by subdividing the display region into many logical rectangular pixel blocks, where each pixel block contains the pixels for that region upon the video display. For each logical pixel block, there is a corresponding region of video RAM. This region of RAM is broken into a display partition and a non-display partition. Unlike the prior art, preferred embodiments arrange the information for pixels within the pixel block so that the display intensity (e.g. color) values are stored within the display partition, and the rest of the information for the pixels is arranged by plane category and stored in the non-display partition. For example, single-bit planes, such as the fast-clear and select buffer planes, are packed into consecutive addresses in memory.
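One way to picture the packed-pixel layout is as an address calculation within the RAM region of a single pixel block. The sketch below is an assumption-laden illustration: the 8 × 8 block size, the 32-bit word, the list of non-display planes and their bit widths, and every function name are hypothetical; only the split into a display partition followed by planes grouped by category comes from the text.

```python
BLOCK_W, BLOCK_H = 8, 8          # hypothetical block size
PIXELS = BLOCK_W * BLOCK_H
BITS_PER_WORD = 32               # hypothetical word width

DISPLAY_BASE = 0                 # image words, one pixel per word
NON_DISPLAY_BASE = PIXELS        # non-display partition follows the display partition

# Hypothetical non-display plane sets: (name, bits per pixel).
NON_DISPLAY_PLANES = [("z", 32), ("stencil", 8), ("fast_clear", 1), ("select", 1)]


def words_for_plane(bits_per_pixel):
    """Words needed to hold one plane set for every pixel in the block."""
    return (PIXELS * bits_per_pixel + BITS_PER_WORD - 1) // BITS_PER_WORD


def plane_offsets():
    """Word offset of each non-display plane set, plus the region's total size."""
    offsets, addr = {}, NON_DISPLAY_BASE
    for name, bpp in NON_DISPLAY_PLANES:
        offsets[name] = addr
        addr += words_for_plane(bpp)
    return offsets, addr


def image_address(col, row):
    """Horizontally adjacent pixels get sequential addresses in the display partition."""
    return DISPLAY_BASE + row * BLOCK_W + col


def packed_bit_location(plane, col, row):
    """(word address, bit index) of a single-bit plane value for one pixel."""
    offsets, _ = plane_offsets()
    index = row * BLOCK_W + col
    return offsets[plane] + index // BITS_PER_WORD, index % BITS_PER_WORD
```

Under these assumptions a screen refresh only walks addresses 0 to 63 of the block, and the fast-clear bits for all 64 pixels occupy just two words.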
- a preferred embodiment also stores image planes for horizontally-adjacent pixels at sequential memory addresses.
- a preferred embodiment is able to take advantage of SDRAM burst mode.
- Burst mode means that a memory access controller may be given just a starting memory address, and consecutive memory addresses may be read without having to specify the addresses for the consecutive memory locations.
- VRAM video RAM
- Because the present invention stores pixel information in linear memory addresses, and relocates other non-display related information, the invention is able to utilize burst mode and greatly exceed prior art performance.
- manipulations of pixels that require some combination of read or write access to memory will be collected into a variable length burst of reads, and if applicable, followed by a variable length burst of writes.
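The read-burst-then-write-burst grouping can be sketched as follows. This is a simplified illustration: the function names and the `(kind, address)` request format are hypothetical, and a burst is modelled simply as a start address plus a length over consecutive addresses.

```python
def to_bursts(addresses):
    """Collapse addresses into (start, length) runs of consecutive locations."""
    bursts = []
    for addr in sorted(set(addresses)):
        if bursts and addr == bursts[-1][0] + bursts[-1][1]:
            start, length = bursts[-1]
            bursts[-1] = (start, length + 1)   # extend the current run
        else:
            bursts.append((addr, 1))           # start a new run
    return bursts


def schedule(ops):
    """Turn mixed ('r' | 'w', address) requests into read bursts, then write bursts."""
    reads = to_bursts(addr for kind, addr in ops if kind == "r")
    writes = to_bursts(addr for kind, addr in ops if kind == "w")
    return ([("read", start, n) for start, n in reads] +
            [("write", start, n) for start, n in writes])
```

Only the starting address of each run has to be issued, which is the saving that burst mode offers over addressing every location individually.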
- a preferred pixel packing arrangement also reduces the bus width needed from each resolver to the frame buffer memory it controls. Also, the invention is able to quickly toggle the sense of which buffer is to be displayed. In just a few clocks, all of the select buffer planes can be written for all of the pixels. Further, it is possible to quickly fill and flush the fast clear cache inside the resolvers (the fast clear cache is more thoroughly discussed hereinbelow). And, to perform a screen refresh, it is not necessary to waste any clock cycles reading unnecessary information from the frame buffer as all relevant information has already been placed in the display partitions of the video memory regions. Related to this, preferred embodiments are able to quickly read and write just the planes (e.g. image and Z buffer) that are involved in the rendering process.
- Optimizing rendering is crucial because one of the most complex graphics tasks is the simulation and animation of three dimensional objects.
- realistic representations of real-world objects such as a bowling ball
- This frame buffer then processes (e.g. renders) the triangles for display upon a display screen.
- the rendered image may appear very realistic.
- the packing method for display pixels is such that usually at least one triangle will fit within a video RAM region. This means that the video frame buffer is able to render an entire triangle in burst-mode, resulting in a substantial performance increase over the prior art.
- the next triangle to be drawn is likely to be in another video RAM block.
- This allows for queuing a chain of pixel blocks to be burst-mode displayed.
- the present invention takes advantage of the SDRAM burst-mode displaying by supplying the next memory address of a memory page to display upon the monitor while the invention is currently reading and displaying the previous memory page.
- a burst mode algorithm employed during rendering will allow grouping memory accesses together for pixels residing within a pair of pixel blocks, so long as the pixel blocks come from opposite banks within the SDRAMs. In this fashion, during the drawing process, no extra clock cycles are wasted on opening or precharging memory locations.
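The grouping rule can be sketched as a greedy pass over the pixel blocks touched by successive requests: a group may span at most two blocks, and only if they sit in opposite banks. The function names and the greedy strategy are assumptions made for illustration; the patent text does not specify the grouping procedure at this level.

```python
def checkerboard_bank(block):
    """Bank of a block given as (block_col, block_row), two-bank checkerboard."""
    bx, by = block
    return (bx + by) % 2


def group_accesses(requests, bank_of=checkerboard_bank):
    """Split a sequence of block ids into groups that can share one burst sequence."""
    groups, current, blocks = [], [], []
    for blk in requests:
        if blk in blocks:
            current.append(blk)                      # block already in this group
        elif not blocks or (len(blocks) == 1 and
                            bank_of(blk) != bank_of(blocks[0])):
            blocks.append(blk)                       # first block, or partner in the other bank
            current.append(blk)
        else:
            groups.append(current)                   # same-bank conflict or third block
            current, blocks = [blk], [blk]
    if current:
        groups.append(current)
    return groups
```

In this model a new group starts whenever the next block would collide with a bank already in use, which is where an open or precharge delay would otherwise be exposed.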
- video memory is broken into eight memory banks.
- a vertical stripe of pixels is stored within one memory bank.
- Each adjacent stripe of pixels is stored within a different memory bank.
- one memory bank stores every eighth stripe.
- A resolver is the logic necessary to build an image in the frame buffer and to send pixel display data to a display monitor.
- Preferred embodiments use resolvers operating in parallel to process graphics display requests.
- a first resolver may handle the pixel display requirements for pixels 0, 8, and 16, while a second may handle pixels 1, 9, and 17.
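The striping of memory banks and the interleaving of pixels across resolvers both reduce to modular arithmetic. In the sketch below, the count of eight banks is stated in the text, while the count of eight resolvers is an assumption inferred from the 0, 8, 16 and 1, 9, 17 example; the function names are hypothetical.

```python
NUM_BANKS = 8       # stated: video memory is broken into eight memory banks
NUM_RESOLVERS = 8   # assumption: inferred from the pixel example (stride of 8)


def bank_for_stripe(stripe_index):
    """Adjacent vertical stripes fall in different banks; each bank holds every eighth stripe."""
    return stripe_index % NUM_BANKS


def resolver_for_pixel(x):
    """Resolver that handles the pixel at horizontal position x."""
    return x % NUM_RESOLVERS


def pixels_for_resolver(resolver, width):
    """All horizontal pixel positions below `width` handled by one resolver."""
    return list(range(resolver, width, NUM_RESOLVERS))
```

Because neighbouring pixels map to different resolvers, a horizontal span of eight pixels can be worked on by all resolvers in parallel.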
- Each location of an SDRAM memory bank contains two sub-memory locations (referred to herein as sub-banks). It is possible, while one sub-bank is being used, to simultaneously prepare the other for future use. Due to a latency involved with setting up the other sub-bank, alternating sub-banks are used when storing display pixel information. That is, for a series of pixel blocks, the pixels will be stored in alternating sub-banks. In effect, this arrangement is what makes burst-mode possible. As one sub-bank is being used, by design of the SDRAMs, another can be simultaneously prepared for future use.
- a preferred embodiment also supports performing fast clears.
- another stored attribute is whether the pixel is going to be cleared (un-displayed) in the next write cycle. That is, this information is in addition to the invention's storing RGB, alpha, Z buffer, stencil, and overlay information for a particular pixel.
- preferred embodiments store clear information for many pixels in a single location. The grouping of clear bits is designed to correspond with the video RAM blocks. Consequently, when reading in the values for the pixels within a block, the video frame buffer is able, in a single memory access, to read the clear information for an entire group of pixels. This arrangement in effect caches the information for other pixels. When this is coupled with memory accesses being performed in burst-mode, the pixel clearing scheme is faster than prior art methods.
- Preferred embodiments of the invention will have a cache internal to the resolvers for maintaining fast clear bits.
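A fast-clear cache of this kind can be sketched as a packed bit array, one bit per pixel of the block. The class name, the 32-bit word width, and the exact meaning of a set bit (here: the pixel still holds the clear value until it is next written) are assumptions made for illustration.

```python
BITS_PER_WORD = 32  # assumed word width
WORD_MASK = 0xFFFFFFFF


class FastClearCache:
    """Packed clear bits for one pixel block, one bit per pixel."""

    def __init__(self, pixels_in_block):
        n_words = (pixels_in_block + BITS_PER_WORD - 1) // BITS_PER_WORD
        self.words = [0] * n_words

    def mark_all_cleared(self):
        """Clear the whole block by writing a few words instead of every pixel."""
        self.words = [WORD_MASK] * len(self.words)

    def is_cleared(self, index):
        word, bit = divmod(index, BITS_PER_WORD)
        return bool((self.words[word] >> bit) & 1)

    def write_pixel(self, index):
        """A real write to the pixel drops its clear bit."""
        word, bit = divmod(index, BITS_PER_WORD)
        self.words[word] &= ~(1 << bit) & WORD_MASK
```

For a 64-pixel block the whole clear state fits in two words, so a single memory access brings in the clear bits for many pixels at once.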
- the present invention incorporates a highly integrated ASIC chip that provides hardware acceleration of graphics applications written for the OpenGL graphics standard used in windowing environments.
- the high-speed video frame buffer supports a wide variety of graphics users and applications. These features include scalable display architecture, multiple display sizes, multiple frame buffer configurations, variable amounts of texture memory, and high-performance.
- the present invention will also accelerate OpenGL's high-end visualization features to speed up operations such as texture mapping, alpha blending, and depth cueing via an integrated Z-buffer.
- the present invention will preferably allow floating point processing to be performed by the system CPU.
- an optional accelerator may be used instead to off-load work from the system CPU.
- the rendering ASICs will preferably be packaged in 625-pin and 361-pin ball grid-type arrays
- the frame buffer memory will be stored in high-density (preferably at least 16-Mbit) SDRAMs
- the optional texture memory will be available on vertically-installed DIMMs
- preferred implementations of the invention will be configured as single or dual-PCI card subsystems.
- the high-speed video frame buffer will preferably support storing images in 24-bit double-buffered image planes and accelerating OpenGL operations such as stencil functions with, in preferred embodiments, 8 Stencil planes per pixel, ownership tests (masking), scissor tests (clipping of triangles and vectors), alpha blending, and z-buffering.
- the invention in preferred embodiments, will also support texturing features (if texture memory is installed on the card) such as texturing of lines and triangles through trilinear interpolation, 32 bits per texel (RGBA), storage of mipmaps in a variable-size texture buffer, from 4 to 64 Megabytes, partial mipmap loading, 1-texel borders around the texture images, multiple texture color modes, such as 4-component decals and 1-component (luminance) texture maps.
- Preferred embodiments will also provide a FastClear function for rapidly clearing large regions of the screen, support for the Display Data Channel proposal from VESA for monitor identification, Dynamic Contrast Mapping (DCM) per Image Context to map 16-bit frame buffer data to 8-bit display data in the back end video stream in real time, and generation of video timing for industry-standard multiply synchronous monitors, as well as for specialized monitors such as Intergraph's Multiple Sync monitors.
- Preferred embodiments of the invention will also support various screen display modes such as Monoscopic, Dual-Screen, Interlaced Stereo, Frame-Sequential Stereo, Color-Sequential for Head-Mounted Displays, VGA compatibility (as well as allowing concurrent residency within a computer with 3rd-party VGA cards).
- the invention will provide, in preferred embodiments, at least 2.6 million pixels in monoscopic single-screen mode, and at least 1.3 million pixels per field for stereo modes.
- Preferred embodiments will also provide features to enhance performance and visual integrity of both interlaced and frame-sequential stereo images. Such embodiments will allow programmable control over inhibiting the draws of pixels to either even or odd scanlines without checking the frame buffer's mask planes, as well as programmable control over drawing to both the even and odd fields of stereo images through one request from software.
- the primary function of the PCI DMA 102 device is to increase the speed at which requests are received by the graphics engine.
- the PCI DMA 102 device has bus master and direct memory access (DMA) capabilities, allowing it to perform unattended transfers of large blocks of data from host memory to the Request FIFO.
- the texture processor 128 inside the combined graphics engine/texture processor ASIC 124 may optionally perform pre-processing of pixels before they are sent to the Dual Resolvers 110. This extra processing may be used to add a texture or some other real-world image to a rendered object. In preferred embodiments, the texturing step would be transparent to the resolvers 110.
- the graphics engine 108 receives requests from a host processor via the PCI bus. It buffers the requests in its Request FIFO 118.
- the graphics engine 108 reads from the Request FIFO 118, decodes the request, and then executes the request.
- Requests are usually graphic primitives that are vertex oriented (i.e. points, lines, and triangles), rectangular fills, gets and puts of pixel data, blits, and control requests.
- the graphics engine's initial breakdown of the graphics request is to the span level, which is a horizontal sequence of adjacent pixels.
- the graphics engine sends span requests over the dedicated bus 120 to the Dual Resolvers 110. Before it sends the span request, the graphics engine 108 may texture the span.
- Span requests may include a fixed color for each pixel in the request, or each pixel may have its own color. Some of the requests may return data.
- the graphics engine 108 provides this data to the application by placing it back into the Request FIFO 118. In the preferred embodiment of the present invention, there is only one Request FIFO 118, and it operates in a half-duplex fashion (either in input mode or output mode).
- the texture processor 128 inside the integrated ASIC 124 writes and reads the texture buffer 114.
- software first loads a family of images into texture memory.
- the family is called a mipmap.
- a mipmap includes an original image and smaller versions of the same image. The smaller versions represent the image as it would be seen at a greater distance from the eye.
- a partial mipmap set can be loaded into the invention's texture memory.
- a texture space is treated as a collection of sub-blocks. For example, a 1K × 1K space may be tiled with 64 × 64 sub-blocks, and each sub-block can be replaced independently.
- Texture memory 114 looks like frame buffer 116 memory to the graphics engine 108, and is loaded by normal put and fill operations, or is read back by normal get operations.
- This technique was implemented for the bilinear blends within a texture map, the linear blend between texture mipmaps, and the final blend between the fragment color and the texture color. A summary of the results is given below.
- the output result of 0xff is only obtained in the previous blend method if both a and b are 0xff. This biases the results slightly towards 0, demonstrated by a mean blended result of 127.0. Under the invention's blending method, the mean blended result is 127.5 (that is, 255.0/2.0). In fact, the distribution of blended results and maximum absolute error are symmetric across the output range about 127.5 for all possible inputs.
- the proposed blend in C code is the following.
- fb ((f ⁇ 1)
- the hardware gate count and timing path delay impacts of this new blending method are minimal.
- Logic synthesis is able to take advantage of the fact that the LSBs of the a and b operands are always 1.
- the hardware is implemented in a partial sum adder tree.
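The C listing for the proposed blend is truncated in this text, so the following is not the patent's code. It is one possible formulation, written under the assumption that every 8-bit operand is widened to 9 bits with its LSB forced to 1, which is consistent with the remark above that the LSBs of the operands are always 1. The names `widen9` and `blend8` are hypothetical.

```c
#include <stdint.h>

/* Widen an 8-bit value to 9 bits with the LSB forced to 1 (range 1..511). */
uint32_t widen9(uint8_t v)
{
    return ((uint32_t)v << 1) | 1u;
}

/* Blend a and b with an 8-bit factor f: f weights a, its complement
 * weights b. In this assumed formulation the two 9-bit weights always
 * sum to 512, so the weighted sum carries 10 fractional bits. */
uint8_t blend8(uint8_t a, uint8_t b, uint8_t f)
{
    uint32_t wf  = widen9(f);                /* odd, 1..511        */
    uint32_t wb  = widen9((uint8_t)~f);      /* equals 512 - wf    */
    uint32_t sum = widen9(a) * wf + widen9(b) * wb;
    return (uint8_t)(sum >> 10);             /* at most 255        */
}
```

Under this assumption, blending a value with itself returns that value for every f, and complementing both inputs complements the output (`blend8(255 - a, 255 - b, f) == 255 - blend8(a, b, f)`), which gives the symmetric distribution about 127.5 that the text describes.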
- software sends textured requests (triangle or vector requests containing texture coordinates).
- when the graphics engine 108 receives a textured request, it sends special span requests to the texture processor's input FIFO 130.
- the texture processor 128 textures the pixels within the span, and places the resulting values in its output FIFO 132.
- the graphics engine 108 transfers these altered spans to the Dual Resolver chips 110.
- the high-speed video frame buffer is composed of either two or four Dual Resolver chips 110.
- Each Dual Resolver is built from three main modules: two resolver modules 134, 136 and one Screen Refresh module 122.
- a resolver module 134, 136 is responsible for translating span requests into manipulations of the frame buffer 116, while the Screen Refresh module 122 is responsible for sending pixel data to the RAM DAC 112.
- the resolvers 134, 136 also perform masking, alpha tests, Z buffering, stencil tests, frame buffer merges (read/modify/write operations such as alpha blending and logical operations), and double-buffering.
- a resolver 134, 136 receives requests from the graphics engine 108 in its input FIFO 138, 140, and parses the request and translates it into a series of frame buffer reads and/or writes. After performing the appropriate operations on the pixel data, the resolver then determines whether or not to write the pixel. Further, if it is going to write a pixel, the resolver determines which planes it will write. Each resolver is only responsible for a subset of the pixels on the screen. Therefore, each resolver only reads and writes the portion of the frame buffer that it "owns".
- the Screen Refresh module has a pixel FIFO 142. This FIFO supplies pixels (digital RGB plus Image Context) to the RAM DAC 112 for display on a monitor 144. To keep the FIFO from emptying, the Screen Refresh module 122 requests pixel data from the two resolver modules 134, 136 within the same Dual Resolver chip 110, which in turn read the frame buffer 116. As long as the Screen Refresh module 122 requests pixel data, both of the resolver modules 134, 136 continue to supply data. After the pixel FIFO 142 has temporarily stored enough pixels, the Screen Refresh module stops the requests, and the resolvers 134, 136 may return to other operations.
- the Screen Refresh module 122 also interprets the color of a pixel. Since a pixel may consist of double-buffered image planes, double-buffered overlay planes, double-buffered fast clear planes, and double-buffered image context planes, the Screen Refresh Module must determine which planes drive a pixel's color. After it determines the pixel's color, the Screen Refresh module may also map the pixel through the DCM logic 144. This special-purpose logic maps 16-bit pixel data (stored in the red and green planes) into 8-bit data. When this feature is enabled, the Screen Refresh module replicates the 8-bit result onto its red, green, and blue outputs.
- the frame buffer 116 has a full set of planes for each pixel.
- Each plane for each pixel represents information being tracked for that pixel. Planes are logically bundled into sets. The three most common plane sets are red, green, and blue, representing the pixel's display color. In the present invention there are over 100 planes of information per pixel.
- One such plane set is Overlay. These planes, if transparent, allow the values of the Red, Green, and Blue planes (hereinafter the Image planes) to show. If the Overlay planes are opaque, however, one viewing the display would see the Overlay values replicated onto all three RAM DAC 112 channels.
- image planes that are double-buffered 24-bit planes. All 24 planes represent 24-bit RGB color. It is possible to configure the invention to represent 16-bit image data mapped to 8-bit data through dynamic contrast mapping, or to assert pseudo-color mode to assign an image arbitrary colors.
- Image Context planes that are Double-buffered 4-bit planes. These planes select dynamic contrast mapping lookup table entries in the screen refresh modules, and also choose the appropriate set of video lookup tables in the RAM DAC 112.
- Fast Clear planes that are Double-buffered single-bit planes. These planes, if set, indicate that the frame buffer contents for the pixel are stale, in that values in a static register are newer.
- Select Buffer Image planes that are Single-buffered single-bit planes. These planes indicate which buffer of Image, Image Context, and Fast Clear is the front buffer.
- Overlay planes that are Double-buffered single-bit, 4-bit, or 8-bit planes, depending on operation mode of the invention (controlled by software). These planes are displayed if their value is "opaque". Otherwise, the image layer displays.
- Select Buffer Overlay planes that are Single-buffered single-bit planes. These planes indicate which buffer of Overlay is the front buffer.
- planes are logically grouped together. For example, in preferred embodiments, writes to the frame buffer are made to a "visual", which is a set of related planes.
- visual 2 is the Image visual. It primarily accesses the image (RGBA) planes, but it can also affect Z, Stencil, and Image Context planes.
- Image Context planes are only included as implied data: their value is sourced by a static register in the graphics engine. The writing of implied data is enabled and disabled separately via plane enables.
- the RAM DAC has three primary functions: provide the palette RAM for mapping incoming RGB to input data for the digital to analog converter (DAC), provide a 64 × 64 hardware cursor, and convert digital RGB to analog RGB.
- four sets of video lookup tables are available in the RAM DAC.
- the Image Context values sent with each pixel determine which lookup table maps the pixel.
- the lookup tables output three 10-bit values (one value each for red, green, and blue), which are sent to the DAC. 10-bit values allow more flexible storage of gamma correction curves than 8-bit values. Recall that the particular bit widths are dependent on the RAM architecture chosen, which in present embodiments, is SDRAMs.
- the concept of Highlight and Overlay planes will be implemented through visuals 0 or 1.
- Preferred embodiments intend to use visual 1 exclusively to access Overlay.
- supporting Highlight and Overlay is more complex.
- one double-buffered plane serves as an Opaque plane
- ImageNibble6 (NEB6): the nibbles in memory stored with both buffers of R, G, B, and IC
- "Opaque" reflects the state of a layer of planes that can either obscure the underlying image (when opaque), or allow the underlying image to show through (when transparent).
- this layer of planes is the 4-bit Overlay. When the Overlay value matches the transparent Overlay value, the Opaque bit is clear. For all other Overlay values, Opaque is set.
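The relationship between the Overlay value and the Opaque bit can be written as a short C sketch. The function names, the packed 0x00RRGGBB return format, and the direct replication of the raw overlay value onto the three channels are assumptions for illustration; in the actual data path the values would still pass through the RAM DAC lookup tables.

```c
#include <stdint.h>

/* Opaque is clear only when the 4-bit Overlay value matches the
 * transparent Overlay value; every other value sets it. */
int overlay_is_opaque(uint8_t overlay, uint8_t transparent_value)
{
    return (overlay & 0x0Fu) != (transparent_value & 0x0Fu);
}

/* Choose what drives the display for one pixel. An opaque overlay is
 * replicated onto all three channels; otherwise the image shows through.
 * Colors are packed as 0x00RRGGBB (hypothetical format). */
uint32_t select_display(uint32_t image_rgb, uint8_t overlay,
                        uint8_t transparent_value)
{
    if (overlay_is_opaque(overlay, transparent_value)) {
        uint32_t o = overlay & 0x0Fu;
        return (o << 16) | (o << 8) | o;
    }
    return image_rgb;
}
```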
- VSG Video Sync Generator
- the VSG generates horizontal and vertical timing markers, in addition to synchronizing the Screen Refresh modules 122 with the video stream.
- the RAM DAC 112 receives the sync signals that the VSG generates, and sends them through its pixel pipeline along with the pixel data. The RAM DAC then drives the monitor's sync and analog RGB signals.
- feed-through display signals 148 that may or may not undergo processing by the invention before being displayed upon a monitor. Such signals could be the input of video information that is to be directly displayed upon the display monitor, as well as feed-through VGA signals.
- the present invention provides pixel-mode double-buffering, wherein a single bit per pixel determines which image-related planes are to be displayed. Similarly, an additional bit per pixel determines which overlay planes are to be displayed.
- the high-speed video frame buffer supports four frame buffer combinations. These configurations are derived from the possible combinations of pixel depth (number of planes per pixel), and the number of Resolvers 110 installed. Preferred embodiments will support at least two pixel depth options: 102 planes per pixel and 128 planes per pixel. The following table shows the available plane sets in the 128 PPP (planes per pixel) embodiment.
- choosing between supported pixel depths and modes is through setting a bit within a special purpose register contained in each Resolver.
- a preferred embodiment will have either two or four Dual Resolver 110 devices present.
- Each Dual Resolver 110 will preferably control its own buffer memory 116.
- each buffer 116 is four 1M × 16 SDRAM devices, so that the combinations of preferred pixel depths and number of Dual Resolver devices create four preferred frame buffer (FB) embodiments:
- the present invention may be utilized in a stereo mode to allow stereo viewing of images.
- the number of pixels available for each eye is half the total number of pixels.
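The capacity of each configuration follows from simple arithmetic on the figures given above (four 1M × 16 SDRAMs per Dual Resolver, two or four Dual Resolvers, 102 or 128 planes per pixel). The sketch below computes an upper bound on the number of stored pixels; the function names are hypothetical, and the bound ignores any padding the packed-pixel layout may introduce.

```c
#include <stdint.h>

/* Bits in one 1M x 16 SDRAM device, and devices per Dual Resolver. */
#define SDRAM_BITS               (1024ull * 1024ull * 16ull)
#define SDRAMS_PER_DUAL_RESOLVER 4ull

/* Upper bound on stored pixels: total memory bits / planes per pixel. */
uint64_t fb_max_pixels(unsigned dual_resolvers, unsigned planes_per_pixel)
{
    uint64_t total_bits =
        (uint64_t)dual_resolvers * SDRAMS_PER_DUAL_RESOLVER * SDRAM_BITS;
    return total_bits / planes_per_pixel;
}

/* In stereo mode each eye receives half of the pixels. */
uint64_t fb_max_pixels_per_eye(unsigned dual_resolvers,
                               unsigned planes_per_pixel)
{
    return fb_max_pixels(dual_resolvers, planes_per_pixel) / 2u;
}
```

With these inputs the four combinations come to roughly 1.0, 1.3, 2.0, and 2.6 million pixels, which appears to match the frame buffer sizes named elsewhere in the document.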
- the invention will store the texture buffer 114 in high-density memory DIMMs.
- the presence of DIMMs is preferably optional, allowing the user to either install no DIMMs, or one pair of DIMMs.
- Software automatically detects the presence of texture memory. If no DIMMs are present, then the graphics engine renders textured requests to the frame buffer untextured.
- the invention's texturing subsystem should support a variety of DIMMs, including Synchronous DRAMs or Synchronous GRAMs.
- the texture processor should also support many densities of memory chips, including 256K × 16, 256K × 32, 1M × 16, and 2M × 8 devices.
- the high-speed video frame buffer supports various monitor configurations, dependent upon the amount of memory installed upon the invention, and the properties of the monitor.
- a subtle point regarding monitors stems from the high-speed video frame buffer being organized as rectangular regions of pixels, or pixel blocks.
- one page (row) of memory in the SDRAMs corresponds to one pixel block.
- the high-speed video frame buffer only supports an integer number of pixel blocks in the x dimension. Therefore, if a resolution to be supported is not divisible by the pixel block width, then some pixels off the right edge of the display are held in off-screen memory. In that situation, the high-speed video frame buffer supports fewer displayable pixels than technically possible according to available video memory and monitor characteristics.
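The effect of the whole-block constraint in the x dimension can be illustrated with a small C sketch. The function names are hypothetical; the block widths used in the examples are among those listed for the preferred configurations.

```c
/* Number of pixel blocks needed to cover one scanline (rounded up). */
unsigned blocks_across(unsigned width, unsigned block_width)
{
    return (width + block_width - 1u) / block_width;
}

/* Pixel columns past the right edge of the display that still occupy
 * frame buffer memory because the last block is only partly visible. */
unsigned offscreen_columns(unsigned width, unsigned block_width)
{
    return blocks_across(width, block_width) * block_width - width;
}
```

For instance, a 1600-pixel-wide display with 64-pixel blocks needs exactly 25 blocks and wastes nothing, while a 1024-pixel-wide display with 40-pixel blocks needs 26 blocks and leaves 16 columns off-screen.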
- the module in the high-speed video frame buffer system that generates timing signals for the video display places further restrictions on the monitor configurations that are supported.
- the maximum vertical period is 1K lines per field in interlaced stereo, 2K lines per field in frame-sequential stereo, and 2K lines in monoscopic mode. "Maximum vertical period" includes the displayed lines, plus the blank time. Additionally, the minimum horizontal period is 64 pixels.
- the back end video logic restricts the maximum frequency of the pixel clock to approximately 160 MHz for the 1.0 MP and 1.3 MP frame buffers and approximately 220 MHz for the 2.0 MP and 2.6 MP frame buffers.
- the present invention is a request-driven graphics system.
- Requests are used for operations such as loading registers, clearing a window, and drawing triangles.
- Drawing is accomplished via the DrawVec, DrawClipVec, and DrawTri requests.
- Sending graphics data to the invention is accomplished via fills and puts, and graphics data is retrieved via get requests.
- Data is moved within the system with the blit request (BitBlit).
- the context of the system can be changed via the requests that load registers and data tables.
- the context of the system can be observed via the "Read" requests.
- Control requests exist for miscellaneous purposes. These requests include the NoOp, PNoOp, Interlock, SetUserID, and Wait commands.
- the Request FIFO may be half-duplex, and if so, after software issues a request that will return data, it may not accept further requests until the returned data has been emptied. If software does not obey this constraint, then a "FIFO Duplex Error" will result. Requests are further divided into protected and not protected requests. Protected requests will not be executed unless they were written to the protected FIFO. Not protected requests will execute from either FIFO. Note there is only one physical FIFO, mapped into several addresses. The sync FIFO is considered a protected FIFO, and hence can execute protected requests.
- VRAMs Video RAMs
- SDRAMs Synchronous DRAMs
- the primary reason for the choice of SDRAMs in the invention is cost. SDRAMs cost less per bit than VRAMs, while they are available in much higher densities than VRAMs. Their higher densities allow for more compact packaging.
- the 2 Mpixel frame buffer is built from 136 VRAMs in Edge III, but only 16 SDRAMs in the invention.
- an alternate type of RAM may be utilized instead of SDRAMs, so long as similar functionality is achieved.
- VRAMs are dual-ported, while SDRAMs are single-ported.
- the VRAM's additional port is a serial shift register that provides a path from the frame buffer to the display, while only minimally impacting bandwidth between the memory controller and the frame buffer.
- a characteristic of both types of RAM devices is that they hold a matrix of memory. Each row in the matrix is referred to as a page of memory. Accesses to locations within a page can occur very quickly. When a location to be accessed falls outside the page that is currently being accessed, then the memory controller must cross the page boundary. A page crossing involves closing the open page, precharging, and then opening the new page. Page crossings in SDRAMs are relatively more expensive than in VRAMs.
- the actual time to perform a page crossing is about equal for the two devices, but the memory interface for an SDRAM may provide new data to the controller synchronously at speeds of around 100 MHz, while VRAMs provide new data to the controller asynchronously from 20 to 30 MHz.
- the differences between VRAMs and SDRAMs led to several new memory configurations that provide superior performance to that of the prior art.
- Such new configurations include using a packed pixel format, rather than storing a whole pixel in one word of memory, and mapping pixels to the display in a pixel block organization, versus a linear mapping scheme.
- the prior art does not utilize SDRAMs for texture memory. Often regular asynchronous DRAMs are used to contain the texture memory.
- preferred embodiments hold texture memory in SDRAMs.
- because page crossings are relatively expensive, texels are arranged into texel blocks (analogous to pixel blocks) to maintain high performance.
- a Resolver's wide data bus provides simultaneous read access to all of a pixel's single-buffered planes and one set of its double-buffered planes. Therefore, in one memory cycle, the prior art Resolver may typically access all of the information relevant to a pixel.
- each Resolver within a Dual Resolver package may only access 32 bits of data per cycle (due to current SDRAM width limitations discussed hereinabove). Since a pixel in a high-performance graphics system is usually represented by over 100 planes, each Resolver may only access a fraction of a pixel at one time, so the pixel data must be stored differently in the invention than used in the prior art.
- some words of memory hold a partial pixel, while other words of memory hold a plane set for many pixels. This format is called a Packed Pixel format in the invention.
- FIG. 2 shows a comparison between the present invention and how data is stored in the 2.0 Mpixel Frame Buffer of a prior art Edge III graphics processor.
- the Resolver may access one of several possible plane set combinations.
- the contents are: For Buffer0, Image (Red, Green, Blue) and Image VLT Context, Alpha[3:0] for a single pixel, Overlay for 4 pixels, and FastClear for 32 pixels 202.
- in moving information from the frame buffer to the display, most prior art designs use VRAMs with a built-in serial shift register, since a linear address map is convenient to implement.
- in a linear address map, a memory address of zero accesses the upper left pixel on the screen. Increasing memory addresses correspond to screen locations further to the right until the right edge of the screen is reached. Further increasing the memory address by one corresponds to a location on the left edge of the next scan line.
- mapping screen addresses to memory locations is linear oriented.
- a page of VRAM memory may hold 512 locations.
- all four Resolvers would access two sets of VRAMs via one data bus, one for each adjacent pixel on the display. Therefore, one page of VRAM spans 4096 (512*2*4) pixels.
- the first page of memory accessible by the combined Resolvers spans from pixel zero on the screen to pixel 4095.
- the second page accesses pixels 4096 to 8191, and so on. If the monitor displays 1600 pixels in the x-dimension, then page zero spans the first two lines of the display, and 896 pixels on the third line of the display. Page one then spans from pixel 896 on the third line to pixel 191 on the sixth line, and so on.
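The page-span figures quoted for the prior-art linear mapping can be checked with a few lines of C. The names are hypothetical; the constants (512 locations per page, two VRAM sets, four Resolvers, a 1600-pixel-wide display) come from the text above, and line and pixel numbers in the code are zero-based.

```c
/* Pixels covered by one VRAM page across the combined Resolvers. */
#define PIXELS_PER_PAGE (512u * 2u * 4u)   /* 4096 */

typedef struct {
    unsigned line;   /* zero-based scanline            */
    unsigned x;      /* zero-based pixel within a line */
} screen_pos;

/* Linear address map: pixel index to screen position. */
screen_pos linear_to_screen(unsigned pixel_index, unsigned screen_width)
{
    screen_pos p;
    p.line = pixel_index / screen_width;
    p.x    = pixel_index % screen_width;
    return p;
}

/* First and last pixel index held in a given page. */
unsigned page_first_pixel(unsigned page)
{
    return page * PIXELS_PER_PAGE;
}

unsigned page_last_pixel(unsigned page)
{
    return (page + 1u) * PIXELS_PER_PAGE - 1u;
}
```

Page one starts at index 4096, which is line 2 (the third line), pixel 896, and ends at index 8191, which is line 5 (the sixth line), pixel 191. A single page therefore covers a thin horizontal slice of the screen rather than a compact region.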
- a preferred embodiment uses a pixel-block arrangement to map addresses to physical screen coordinates.
- pixel blocks are 8 rows tall, and their width is determined by the number of Dual Resolver chips installed, and the pixel depth chosen. In preferred embodiments, several configurations are available, and others could easily be implemented.
- the pixel block width is 32 pixels.
- pixel block width is 40 pixels.
- pixel block width is 64 pixels.
- pixel block width is 80 pixels.
- FIG. 3 illustrates a pixel-block mapping for a preferred embodiment of the invention's pixel-block mapping, and its assignment of pixels to Resolvers for the 128 PPP embodiment of the invention. As shown, a Resolver is assigned every eighth vertical pixel stripe across the screen. (For a 102 PPP embodiment, each Resolver would be assigned every fourth pixel stripe.)
- the pixel-block mapping is configured so as to minimize page crossings during triangle draws and during surface rendering.
- the rationale is that triangles, and vectors to a lesser degree, are more typically drawn into a rectangular region of the screen, as opposed to being drawn in a thin horizontal screen slice that prior-art linear mapping produces.
- Each pixel block is wide enough so that page crossings are also reduced during block-oriented requests, such as puts and gets. Note, however, that Blits will likely cause page crossings when a switch is made from the read to the write portion of the blit.
- the invention exploits another feature of SDRAMs (or other memory with similar features): they are dual-bank.
- in dual-bank SDRAMs, two different pages in memory may be open at the same time, one in each bank (referenced hereinbelow as BankA and BankB). While one of the banks is being closed and reopened, say BankA, a page in BankB may be accessed. This effectively hides most if not all of the page crossings in BankA.
- each Resolver is assigned 64 pixels in one pixel block 302 (eight rows of eight pixels).
- the FC0, FC1, HL0 and HL1 (102 PPP case), SBI, SBO (128 PPP case), and SBH (102 PPP case) plane sets are each packed such that only two memory words are required to store each plane set.
- When the monitor is in non-interlaced mode, the ScreenRefresh Module (SRM) in the Dual Resolver chip must provide a pixel stream that uses every line of pixels in the frame buffer. To supply this stream, the SRM receives a complete word of SBI (for example) from the memory controller. It supplies one row of pixels immediately to satisfy the display, and temporarily stores the three other rows. On the succeeding scanlines, all of this data is provided to the display. When the monitor is placed in interlaced mode, however, the SRM only needs to supply every other line of pixels to the display during one frame. The next frame consumes the remaining lines. In this case, if the memory storage were the same as in the non-interlaced mode, the SRM would receive a memory word that only contained two useful rows of pixels. Therefore, memory would have to be read more often to supply the pixel stream. To enhance the efficiency of the frame buffer's bandwidth, the pixel rows are stored differently in interlaced mode by the Resolver:
- the 102 PPP case is very similar to this example, with the exception that each resolver is responsible for more pixels per pixel block (80, or 8 rows of 10 pixels), which means that 2 1/2 words in memory store the packed pixels.
- the two storage modes are as shown below for the 4-Resolver case.
- mapping from a pixel location in one frame buffer to a pixel location in the other frame buffer just requires that the pixel row number be modified such that noninterlaced row numbers 0 through 7 map to interlaced row numbers 0, 2, 4, 6, 1, 3, 5, and 7.
- This mapping is accomplished by a left-circular rotate of the pixel row number.
- This mapping is driven by the packed pixel plane sets, but it is also applied to all the other plane sets for a consistently-mapped scheme.
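The row renumbering between the two storage modes is a 3-bit left-circular rotate, which can be sketched in C as follows. The function names are hypothetical; the mapping itself (rows 0 through 7 to 0, 2, 4, 6, 1, 3, 5, 7) is the one stated above.

```c
/* Non-interlaced row number (0..7) to interlaced row number:
 * rotate the 3-bit value left by one position. */
unsigned row_to_interlaced(unsigned row)
{
    row &= 7u;
    return ((row << 1) | (row >> 2)) & 7u;
}

/* Inverse mapping: rotate the 3-bit value right by one position. */
unsigned row_from_interlaced(unsigned row)
{
    row &= 7u;
    return ((row >> 1) | (row << 2)) & 7u;
}
```

After the rotate, the four even rows occupy interlaced rows 0 through 3 and the four odd rows occupy rows 4 through 7, so the rows of one field are stored together.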
- FIG. 4a and FIG. 4b show a standard mapping versus a preferred embodiment's checkerboard mapping.
- a scan-line (segment CD) is part of a PutBlock32.
- a Resolver might first open Page n in BankA, draw pixels from left to right until the right side of the pixel block is reached, close Page n in BankA and open Page n in BankB, draw pixels from left to right until the right side of the pixel block is reached, close Page n in BankB and open Page n+1 in BankA, and then draw pixels from left to right until point D is reached.
- a faster way to write the scanline into memory is to hide the page crossings in the drawing time, or open Page n in BankA, and while drawing pixels in Page n, BankA, open Page n, BankB, and while drawing pixels in Page n, BankB, close Page n, BankA and open Page n+1 in BankA, and then draw pixels in Page n+1 in BankA until point D is reached.
- FIG. 4b corresponds to a preferred embodiment's intentional checkerboarding of frame buffer pixel blocks.
- pixels from opposite banks in memory are placed into adjacent pixel blocks on the screen, so that if an even number of pixel blocks fill the screen in the x-dimension, then all pixel blocks in a vertical line would fall within different pages in the SDRAM bank.
- an imaginary line drawn in either the horizontal or vertical directions passes through alternating SDRAM banks.
- the pixel blocks naturally form a checkerboard pattern (i.e. two pages within the same bank are never adjacent to each other).
- By intentionally addressing memory differently, the Resolvers always access memory banks in a checkerboarded fashion.
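One simple way to obtain a checkerboard of banks is to derive the bank from the parity of the pixel block's coordinates. The sketch below is an assumption about how such a selection could be expressed, not the address logic of the invention; `block_bank` and the enum names are hypothetical.

```c
typedef enum { BANK_A = 0, BANK_B = 1 } sdram_bank;

/* Bank for the pixel block at block coordinates (block_x, block_y).
 * Horizontally and vertically adjacent blocks always differ in bank,
 * regardless of how many blocks span the screen. */
sdram_bank block_bank(unsigned block_x, unsigned block_y)
{
    return (sdram_bank)((block_x ^ block_y) & 1u);
}
```

With this rule a scanline or a vertical edge that crosses a block boundary always moves into the opposite bank, so the next page can be opened while the current one is still in use.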
- the address mapping is linear. Therefore, any off screen pixels are mapped in the same linear fashion as the rest of memory.
- the frame buffer always holds 2 Mpixels.
- the amount of off-screen memory varies with the monitor and resolution chosen.
- a preferred embodiment has support for such off screen memory, but as with the prior art, the amount of off screen memory varies according to the monitor, resolution, and frame buffer configuration.
- the off screen memory is grouped into pixel blocks. Consequently, it is possible that even though there are many off screen pixels, there may be no full rows of pixels.
- An advantage to the present invention's utilization of the dual-pages is that the apparent page size of the SDRAMs is increased, while dynamically altering the physical dimensions of the pixel block. As a result, objects that are large enough to span multiple pixel blocks may be drawn more quickly.
- Another advantage is that during reads of memory to satisfy screen refresh requirements, it becomes possible to hide page crossings. While data is being read from one SDRAM bank, the page that maps to the next pixel block to be hit by the raster scan is opened. In preferred embodiments, this page is always in the opposite SDRAM bank. And, while reading from the now-open bank, the previous bank is closed.
- preferred embodiments of the invention also support interlaced mode.
- pixels are stored differently in the frame buffer when in interlaced mode than in non-interlaced mode.
- Interlaced mode is enabled by setting a bit in a control register for the invention. Setting this bit causes some plane sets for some pixels to be stored differently.
- the logic for the texture processor is included in the graphics engine. Therefore, if texture memory is available, texturing is available. SDRAMs are used for texture memory instead of the DRAMs used by the prior art. SDRAMs provide faster texturing performance.
- FIG. 5, FIG. 6, and FIG. 7 show hardware block diagrams for three general configurations supported by the texture processor. Preferred embodiments of the invention support several such memory configurations. Some general features of the subsystem are apparent.
- the texture processor accesses texture memory via four independent 32-bit data and address buses. Memory is still split into two logical (and physical) banks: "Texture Memory Even" (TME) and "Texture Memory Odd" (TMO). TME is subdivided into two sets of SDRAMs: Set0 and Set1. TMO is subdivided similarly.
- the size and number of banks varies depending on the organization and quantity of SDRAMs that hold the texture memory.
- Each one of these organizations presents a different memory map to applications.
- the common features are that the maximum U dimension is fixed at 2K, the bank bit (B) is stored in an internal register and toggles access between TME and TMO banks, and a mipmap with the next-lower level of detail from the current mipmap is stored in the opposite bank.
- the limiting factor for maps with borders is that the border information must be stored in the same bank (Texture Memory Even or Texture Memory Odd) as the image data.
- LOD CLAMP may represent the actual number of mipmaps; if clear, it is assumed no mipmaps exist for a given texture.
- the LOD CLAMP field determines the final level of detail (LOD) to use when mipmapping. Normally, this is set to the minimum of U SIZE (size of selected texture map in U direction) and V SIZE (size of selected texture map in V direction). For example, a preferred embodiment modifies OpenGL border storage by storing the actual texture data centered in the next larger map size. If just min(U SIZE, V SIZE) were used, the texture processor would go one level of detail beyond where the method still remains correct. Also, as an alternate way to do borders, a 1K × 1K space may be tiled with 64 × 64 subblocks. Mipmap sets only exist for the 64 × 64 blocks. Normally, the maximum LOD would be 11, but in this case the maximum LOD should be 7. By default, its value is set to 0x0, giving the same behavior as U SIZE and V SIZE after warm reset.
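The two LOD figures quoted above are consistent with counting the maps in a full mipmap chain, where a map with a power-of-two edge is halved until it reaches a single texel. The following is a minimal sketch of that count only; the function name `mip_level_count` is hypothetical and does not come from the patent.

```c
#include <assert.h>

/* Hypothetical helper: number of maps in a full mipmap chain for a
 * power-of-two edge length (size, size/2, ..., 1). */
static unsigned mip_level_count(unsigned size)
{
    unsigned levels = 0;
    assert(size != 0 && (size & (size - 1)) == 0); /* power of two only */
    while (size != 0) {
        levels++;
        size >>= 1;
    }
    return levels;
}
```

Under this reading, a 1K edge yields 11 maps and a 64 edge yields 7, which is why a space tiled with 64 × 64 subblocks would need the lower clamp.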
- mapping of SDRAM addresses to texel (UV) space is carefully constructed to allow high texturing performance.
- eight memory locations (eight texels) must be read: four texels that surround a specific (u,v) coordinate at one level of detail, and the four texels that surround that same coordinate at the next lower level of detail.
- FIG. 8 shows how texels are mapped from memory to UV space.
- the upper right-hand corner of the figure indicates that TME is divided into texel blocks.
- the lower left-hand corner of the figure shows that each texel block is divided into texels.
- Each texel within a texel block has been assigned one of four symbols.
- when the texture processor reads a group of four texels at one level of mipmap detail, it reads one texel of each symbol type.
- the rows of texels represented by the circles and squares are read from Set0 of TME, while the rows of texels represented by the triangles and crosses are read from Set1 of TME (refer to the hardware block diagram in FIG. 5).
- FIG. 8, FIG. 9, and FIG. 10 show how texels are mapped for three general hardware configurations, although other configurations are possible.
- the texture processor reads eight texels in two read cycles. Normally, this occurs on two consecutive clock cycles. First clock--1 texel from TME, Set0; 1 texel from TME, Set1; 1 texel from TMO, Set0; and 1 texel from TMO, Set1. Second clock--repeat the reads from the first clock after changing the texel addresses. The only exception to this pattern occurs in some cases of texturing with borders. Due to conflicting requests among SDRAM pages, there is a pause between the first and second reads while the current pages are closed and new pages are opened.
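The one-texel-of-each-symbol property can be illustrated with a small sketch. The figures are not reproduced here, so the assignment below is an assumption: rows alternate between Set0 and Set1, and the two symbols within a row alternate along U. All names are hypothetical.

```c
#include <assert.h>

/* Assumed symbol assignment (the actual one is defined by FIG. 8). */
enum texel_symbol { SYM_CIRCLE, SYM_SQUARE, SYM_TRIANGLE, SYM_CROSS };

static enum texel_symbol texel_symbol_at(unsigned u, unsigned v)
{
    return (enum texel_symbol)(((v & 1u) << 1) | (u & 1u));
}

/* Circles and squares come from Set0, triangles and crosses from Set1. */
static unsigned texel_set(enum texel_symbol s)
{
    assert(s <= SYM_CROSS);
    return (s == SYM_CIRCLE || s == SYM_SQUARE) ? 0u : 1u;
}

/* Bit mask of the symbols touched by the 2x2 footprint whose first
 * texel is (u, v); 0xF means one texel of each symbol type. */
static unsigned footprint_symbol_mask(unsigned u, unsigned v)
{
    unsigned mask = 0;
    for (unsigned dv = 0; dv < 2; dv++)
        for (unsigned du = 0; du < 2; du++)
            mask |= 1u << texel_symbol_at(u + du, v + dv);
    return mask;
}
```

With a parity-based assignment like this one, every 2 × 2 footprint covers all four symbols regardless of where it starts, so each footprint draws two texels from Set0 and two from Set1.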
- one page of SDRAM memory represents one texel block. Since the SDRAMs are dual-bank, two texel blocks may be open at one time, so the texture processor may often access texels that straddle two texel blocks without any page crossing penalties.
- the pixel blocks are checkerboarded as in the frame buffer memory, further enabling the texture processor to access texels from adjacent texel blocks without opening and closing pages in the middle of the access. In most situations, no matter which direction the texture is traversed, all texels required may be accessed without interruption from a page crossing.
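One way to picture the checkerboarding is to derive a block's SDRAM bank from the parity of its block coordinates. This is a sketch under that assumption only; the text does not give the actual address mapping, and the function names and block dimensions are hypothetical parameters.

```c
#include <assert.h>

/* Assumed checkerboard rule: a block's SDRAM bank is the parity of its
 * block coordinates, so blocks that share an edge never share a bank. */
static unsigned block_bank(unsigned block_x, unsigned block_y)
{
    return (block_x + block_y) & 1u;
}

/* Bank holding texel (u, v) for blocks of block_w x block_h texels. */
static unsigned texel_bank(unsigned u, unsigned v,
                           unsigned block_w, unsigned block_h)
{
    assert(block_w != 0 && block_h != 0);
    return block_bank(u / block_w, v / block_h);
}
```

Under this rule, horizontal and vertical neighbors always fall in opposite banks and can both be open at once. Diagonal neighbors share a bank, which may be one reason the text says "in most situations" rather than "always".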
- a preferred embodiment implements WRITE VISIBLE and READ VISIBLE register bits to indicate when a given visual is double-buffered.
- When WRITE VISIBLE is set, fast clear planes for the associated pixel will be ignored, and will be neither read nor written when the pixel is accessed.
- When READ VISIBLE is set for a read, the Resolver will determine the clear status of the pixel from the VISIBLE FC plane. (Note that the video display driver should be aware of this interpretation, since the visible buffer may not own the construction plane sets.) When these bits are set, a Resolver must first read the appropriate SelectBufferImage (SBI) or SelectBufferOverlay (SBO) bit from the frame buffer to determine which buffer is visible. Also, the ScreenRefresh module in the Dual Resolver must read these bits to determine which buffer it should display as it updates the screen.
- SBI SelectBufferImage
- SBO SelectBufferOverlay
- double-buffering utilizes two planes per pixel to control the displayed buffer.
- the first is the SelectBufferImage (SBI) plane for the Image planes (Red, Green, Blue, and Image VLT Context) and the SelectBufferOverlay (SBO) plane for the Overlay planes.
- a preferred embodiment supports Displayed-Buffer Detection (DBD).
- DBD Displayed-Buffer Detection
- the usefulness of this feature relies on the assumption that for well-behaved cases, all of the pixels on the screen are displaying buffer0. This condition will be true before any application begins double-buffering; while one application is double-buffering, and it is currently displaying buffer0; while many applications are double-buffering, and all of them are currently displaying buffer0; or after applications have stopped double-buffering, and the device driver cleans up all SBI and SBO bits to point to buffer0.
- DBD determination occurs as the ScreenRefresh module in the Dual Resolver must determine which buffer is displayed for every pixel on the screen as it is filling its pixel FIFO. If an entire screen is updated from buffer0, the ScreenRefresh module may set a flag to the Resolvers, signaling that READ VISIBLE actually means "read from buffer0", etc. If the Resolver modules interacting with the ScreenRefresh module detect that one of the SBI or SBO bits has been written with a "1", then they may reset the flag, forcing reads of the frame buffer to resume for visible determination. This flag may also be monitored by the ScreenRefresh module itself so that it may avoid reading the SBI and SBO bits during the next pass of the screen update.
- FCEn FastClearEnable
- Each Dual-Resolver chip has two registers that hold FastClearEnable (FCEn) bits.
- Each FCEn bit corresponds to a plane set or an individual plane. If a plane's FCEn bit is disabled, then the polarity of the FC bit does not affect the interpretation of that plane. For all enabled planes, the FC bit determines whether the frame buffer's contents or the clear value represent a pixel.
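The interaction between the FCEn bits and the FC bit can be sketched as a per-plane select. The conventions here are assumptions, not taken from the text: one mask bit per plane, and an FC value of 1 meaning that the pixel is currently represented by the clear value. The function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of per-plane resolution under the assumed conventions above.
 * stored      : plane contents read from the frame buffer
 * clear_value : the value a fast clear stands for
 * fcen_mask   : 1 bits mark planes whose FastClearEnable is set
 * fc_set      : the pixel's FC bit (0 or 1) */
static uint32_t resolve_planes(uint32_t stored, uint32_t clear_value,
                               uint32_t fcen_mask, int fc_set)
{
    assert(fc_set == 0 || fc_set == 1);
    /* Planes with FCEn disabled always show the stored contents. */
    uint32_t use_clear = fc_set ? fcen_mask : 0u;
    return (stored & ~use_clear) | (clear_value & use_clear);
}
```

Planes outside `fcen_mask` pass through untouched whatever the FC bit says, which matches the statement that a disabled FCEn bit makes the FC polarity irrelevant for that plane.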
- performance is enhanced through All-Normal Detection.
- As with Displayed-Buffer Detection discussed above, by linking this function to the ScreenRefresh module's functions, preferred embodiments of the invention may detect the presence of any set FC bits on the displayed pixels of the screen at least 76 times per second, and preferably at speeds of at least 85 Hz.
- a preferred embodiment also implements a FastClear Cache.
- the FC bits for a pixel must be evaluated before the pixel may be accurately manipulated.
- the FC bit for the pixel must be reset. Performing these reads and writes takes memory cycles that could otherwise be dedicated to rendering.
- these "read/modify/writes" have a tendency to break pixels that could otherwise be bursted together into many smaller bursts. To minimize this impact, each Resolver module in the invention holds a FastClear cache.
- the Resolver may fill and flush this cache much more quickly (for most operations) than updating one pixel at a time in memory.
- the Resolver normally fills and flushes the FastClear cache during page crossings. On opening a page, the Resolver fills the cache for that bank. During accesses to the open page, the Resolver updates the cache locations instead of the FC locations in the frame buffer. When a page is to be closed, the Resolver flushes the appropriate cache lines, updating the frame buffer. A cache line holds the data that is stored in one word (32 bits) of memory. In preferred embodiments, the fill algorithm for a full cache is simply to completely fill the cache regardless of the request. All lines in the cache are set to clean. Any lines touched by requests while the page remains open are marked as dirty.
- When a page is scheduled to be closed, the Resolver must first flush the cache (while the page is still open) to make room for the FC bits for the next page. To determine which lines in the cache to flush, the Resolver examines the dirty bit for each line. All lines that are dirty are flushed. After flushing the caches, the Resolver marks all lines as clean (it does not destroy the contents of the cache).
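The fill, update, and flush behavior described above can be sketched as a small write-back cache with one dirty bit per line. This is an illustration only: the cache depth `FC_LINES`, the structure layout, and the function names are hypothetical, and the sketch marks lines dirty on writes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FC_LINES 8 /* hypothetical cache depth; the real size is not given */

struct fc_cache {
    uint32_t line[FC_LINES];  /* one 32-bit memory word per line */
    uint8_t  dirty[FC_LINES];
};

/* Page open: load every line, regardless of the request, and mark clean. */
static void fc_fill(struct fc_cache *c, const uint32_t *fc_words)
{
    memcpy(c->line, fc_words, sizeof c->line);
    memset(c->dirty, 0, sizeof c->dirty);
}

/* Access while the page is open: update the cache, not the frame buffer. */
static void fc_write(struct fc_cache *c, unsigned idx, uint32_t value)
{
    assert(idx < FC_LINES);
    c->line[idx] = value;
    c->dirty[idx] = 1;
}

/* Before page close: write back dirty lines only, then mark them clean.
 * Cache contents are left intact. Returns the number of words written. */
static unsigned fc_flush(struct fc_cache *c, uint32_t *fc_words)
{
    unsigned written = 0;
    for (unsigned i = 0; i < FC_LINES; i++) {
        if (c->dirty[i]) {
            fc_words[i] = c->line[i];
            c->dirty[i] = 0;
            written++;
        }
    }
    return written;
}
```

Because only dirty lines are written back, a page that was opened but never modified costs no flush cycles when it is closed.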
- the Resolver accesses the FC bits in their caches instead of frame buffer memory.
- Preferred embodiments of the invention allow for the Resolver, if necessary, to manipulate these bits in the frame buffer directly instead of within the confines of the caches.
- An example of when this would be necessary is when the bits are read to fill the ScreenRefresh module's pixel FIFO.
- the scanlines affected by the request are the current line indicated by the address in the first header word and the next seven scanlines. The spans on all eight scanlines begin at the same x coordinate on the screen.
- the IZP sends single-scanline requests until it reaches a horizontal pixel block boundary. It then sends multiple-scanline requests via BLOCK FILL mode until the number of scanlines remaining in the request is less than eight (the height of a pixel block). The IZP then resumes sending single-scanline requests until it completes the fill.
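The three phases of such a fill can be sketched as simple arithmetic on the starting scanline and the scanline count. The sketch assumes that pixel block boundaries fall on scanlines that are multiples of eight; the structure and function names are hypothetical.

```c
#include <assert.h>

#define BLOCK_HEIGHT 8u /* pixel block height in scanlines, from the text */

struct fill_plan {
    unsigned leading_singles;  /* single-scanline requests before alignment */
    unsigned block_fills;      /* eight-scanline BLOCK FILL requests */
    unsigned trailing_singles; /* single-scanline requests at the end */
};

/* Split a fill of `count` scanlines starting at scanline `y`. */
static struct fill_plan plan_fill(unsigned y, unsigned count)
{
    struct fill_plan p = {0, 0, 0};
    unsigned to_boundary = (BLOCK_HEIGHT - y % BLOCK_HEIGHT) % BLOCK_HEIGHT;

    p.leading_singles = to_boundary < count ? to_boundary : count;
    count -= p.leading_singles;
    p.block_fills = count / BLOCK_HEIGHT;
    p.trailing_singles = count % BLOCK_HEIGHT;
    assert(p.leading_singles < BLOCK_HEIGHT &&
           p.trailing_singles < BLOCK_HEIGHT);
    return p;
}
```

For example, a 30-scanline fill starting at scanline 5 would break into three single-scanline requests, three block fills, and three more single-scanline requests.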
- each Resolver 134, 136 is not assigned adjacent pairs of pixels.
- each Resolver covers every fourth pixel.
- each Resolver covers every eighth pixel. Since Resolvers are packaged in pairs in the invention, a package covers every other pixel or every fourth pixel for the 4- and 8-Resolver configurations, respectively.
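The pixel interleave can be sketched with modular arithmetic. Both rules below are assumptions chosen to agree with the stated coverage (a Resolver sees every Nth pixel, a package every N/2th pixel); the text does not say which two Resolvers share a package, and the function names are hypothetical.

```c
#include <assert.h>

/* Assumed interleave: pixel x on a scanline goes to Resolver x mod N. */
static unsigned resolver_for_pixel(unsigned x, unsigned num_resolvers)
{
    assert(num_resolvers == 4 || num_resolvers == 8);
    return x % num_resolvers;
}

/* Assumed pairing: Resolvers r and r + N/2 share a Dual Resolver package,
 * so neither Resolver in a package handles adjacent pixels. */
static unsigned package_for_pixel(unsigned x, unsigned num_resolvers)
{
    return resolver_for_pixel(x, num_resolvers) % (num_resolvers / 2);
}
```

With four Resolvers this places pixels 0 and 2 in the same package and pixel 1 in the other, so a package covers every other pixel; with eight Resolvers a package covers every fourth pixel.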
- each Resolver module within a Dual Resolver device controls a pair of Synchronous DRAMs (SDRAMs).
- SDRAMs Synchronous DRAMs
- the preferred memory devices for the frame buffer 116 (FIG. 1) are 1 Meg × 16 SDRAMs, but the Resolvers support 2 Meg × 8 SDRAMs in case they are more available for prototype checkout than the ×16s, and could be designed to support other memory configurations if necessary.
- the feature subset of such memory most important to the frame buffer architecture of the invention includes pipeline mode--the ability to issue a new column command on every clock cycle; dual-bank--the memory array is divided into two equal halves, each of which may have a page open with an independent page address; pulsed RAS; high-speed--at least a 100 MHz clock rate; low-voltage; LVTTL signaling interface; support for 4096 pages of memory; support for a page size of 256 locations; full-page burst length; CAS latency of 3; DQM write latency of zero, DQM read latency of 2; and preferably packaged in a 400 mil, 50 pin, TSOP II package.
- When accessing the frame buffer 116 (FIG. 1) in response to a draw request, the Resolver's memory controller tries to satisfy the request via onboard logic referred to herein as the Burst Builder.
- the Resolver's Burst Builder groups sequences of reads and writes to the frame buffer into bursts to use the SDRAM interface more efficiently.
- a burst is a sequence of transactions that occur without intervening page crossings.
- the general structure of a burst is as follows: [Page Commands] [Read Requests] [Read->Write Transition] [Write Requests].
- Implied by this format is that all page requests (Close, Open) are performed before the burst is started. Also implied is that all read requests for all the pixels in the burst will be completed before the SDRAM bus is reversed. After the bus is reversed, all the write requests for all the pixels in the burst will be completed. By minimizing the number of dead clocks on the SDRAM bus incurred from switching the bus frequently from a read to a write, performance is optimized.
- the Burst Builder generates one list of required plane sets to be accessed for all pixels in the burst. Two pixels may be placed in the same burst only if certain conditions are true: (1) Only one page in a SDRAM bank may be opened at one time. A JEDEC-standard SDRAM is dual-banked. Therefore, only pixels destined for the same pair of pages (one from each SDRAM bank) may be bursted together; (2) If a read/modify/write is required for a pixel, then only one access to that pixel is allowed within the same burst; and (3) If a plane set to be written for two pixels lies within the same byte in memory (for example, Mask in 2 Mpixel), then those two pixels must be split into separate bursts.
- if the Burst Builder indicates that Fast Clears are necessary for a burst, then the FastClear cache will be filled when the page is opened. Also, if a page is scheduled to be closed before the next burst begins, the Resolver will flush any dirty pages in the cache before closing the current page. Therefore, a more general format for bursts is as follows: [Flush Cache] [Page Commands] [Fill Cache] [Read Requests] ... [Read->Write Transition] [Write Requests].
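The three conditions for placing two pixels in the same burst can be sketched as a pairwise predicate. The request record below is hypothetical: the field names, and the idea of carrying a bank, page, and written byte address per pixel, are illustrative assumptions rather than the Burst Builder's actual data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical request record; field names are illustrative only. */
struct pixel_req {
    uint32_t pixel_id;    /* identifies the pixel */
    unsigned bank;        /* SDRAM bank holding the pixel, 0 or 1 */
    uint32_t page;        /* page within that bank */
    uint32_t write_byte;  /* byte address of the plane set being written */
    bool     needs_rmw;   /* read/modify/write required */
};

/* Apply the three conditions to decide whether a and b may share a burst. */
static bool can_share_burst(const struct pixel_req *a,
                            const struct pixel_req *b)
{
    assert(a->bank < 2 && b->bank < 2);
    /* (1) only one page per bank may be open during the burst */
    if (a->bank == b->bank && a->page != b->page)
        return false;
    /* (2) a pixel needing read/modify/write appears once per burst */
    if (a->pixel_id == b->pixel_id && (a->needs_rmw || b->needs_rmw))
        return false;
    /* (3) two pixels writing into the same byte must be split */
    if (a->pixel_id != b->pixel_id && a->write_byte == b->write_byte)
        return false;
    return true;
}
```

A real burst holds many pixels, so a builder following this sketch would test each incoming pixel against every pixel already accepted and start a new burst on the first failure.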
- the border data is stored in texture memory along with the texture data, and it may be thought of as a collection of single maps. For this discussion, it is assumed that:
- the currently active border is defined by the GE TEX BDR ORG register.
- the V coordinate of GE TEX BDR ORG is an integral multiple of 64. This is regardless of map size or whether mipmapping is enabled.
- the U coordinate of GE TEX BDR ORG is an integral multiple of the base map U size.
- the border for a map must be stored in the same texture memory bank (Texture Memory Odd or Texture Memory Even) as the associated texture data. For mipmaps, this means the borders swap banks along with the normal texture image data.
- a group of 8 lines is required to store the borders for a map.
- 8 such groups are possible since the V coordinate of GE TEX BDR ORG must be an integral multiple of 64.
- Line 0 Bottom border
- Line 1 Top border
- Line 2 Left border, copy 0
- Line 3 Left border, copy 1
- Line 4 Right border, copy 0
- Line 5 Right border, copy 1
- Line 6 Corner borders, copy 0
- Line 7 Corner borders, copy 1.
- border storage does not depend on the type of synchronous DRAMs that are used. Not all the border texels will always be accessed according to the rules, but all 8 lines of border storage are required with the current texture memory layout.
- BDR.u and BDR.v are the U and V values respectively from GE TEX BDR ORG register.
- the bottom and top rows are stored at V addresses of BDR.v+0 and BDR.v+1.
- the U addresses start with BDR.u and increment once every texel until the entire width of the map is loaded.
- the left and right borders are each duplicated on two rows.
- the two rows for the left border are loaded at BDR.v+2 and BDR.v+3, and the two rows for the right border are loaded at BDR.v+4 and BDR.v+5.
- the U address in BDR.u corresponds to the top of the map and increments as the border is traversed from top to bottom.
- the corner texels are duplicated on rows BDR.v+6 and BDR.v+7.
- the corners are stored in the order top-left, top-right, bottom-left, bottom-right.
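The eight-line layout for a non-mipmapped map can be sketched as a row offset from BDR.v plus a running U offset from BDR.u. The enum, structure, and function names are hypothetical; `bdr_u` and `bdr_v` stand for BDR.u and BDR.v.

```c
#include <assert.h>

/* Row offsets from BDR.v, following the eight-line layout above. */
enum border_line {
    BORDER_BOTTOM = 0,    BORDER_TOP = 1,
    BORDER_LEFT_0 = 2,    BORDER_LEFT_1 = 3,
    BORDER_RIGHT_0 = 4,   BORDER_RIGHT_1 = 5,
    BORDER_CORNERS_0 = 6, BORDER_CORNERS_1 = 7
};

struct texel_addr { unsigned u, v; };

/* Address of the i-th texel of a border line. For the left and right
 * borders, i counts from the top of the map downward; for the corner
 * lines, i = 0..3 selects top-left, top-right, bottom-left, bottom-right. */
static struct texel_addr border_texel(unsigned bdr_u, unsigned bdr_v,
                                      enum border_line line, unsigned i)
{
    struct texel_addr a;
    assert(bdr_v % 64 == 0); /* border origin V is a multiple of 64 */
    a.u = bdr_u + i;
    a.v = bdr_v + (unsigned)line;
    return a;
}
```

For instance, with a border origin of (128, 64), the fourth texel of the first right-border copy would sit at U = 131, V = 68 under this sketch.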
- the border storage is very similar to what is described above. Note that the texture memory bank is swapped for every increasing integer LOD value.
- the border storage is in the same bank as the associated texture data.
- the bank for base v is the same as for the corresponding mipmap.
- the order of the rows and the U addresses are as follows: ##STR2##
- left and right borders must be duplicated as well if V map size is 1. Although not imposed by the physical arrangement of texture memory, this simplifies the hardware address translation. Starting at BDR.u, the same border value is stored into two adjacent locations for each of these rows.
Abstract
Description
Maximum absolute error in output for N fractional bits, ABS(theoretical - actual):

Blend Method | 5-bit frac | 8-bit frac | Mean of All Possible Results |
---|---|---|---|
Current | 4.419 | 1.439 | 127.0 |
Proposed | 4.065 | 0.910 | 127.5 |
    /* a, b are the two operands to blend between.
     * f is the fraction of b desired in the blended result.
     * (1 - f) is the fraction of a desired in the blended result.
     * out is the result of the linear blend.
     */
    fa = ((~f << 1) | 1) & 0x1ff; /* 9-bit fractions, with */
    fb = ((f << 1) | 1) & 0x1ff;  /* rounding LSB added.   */
    out = (fa*a + fb*b) >> 9;
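The blend fragment can be wrapped in a function to make its arithmetic concrete. This is a sketch under stated assumptions: 8-bit operands, an 8-bit fraction, and a shift amount of 1 before the rounding LSB is ORed in, which is the value that makes the two 9-bit weights sum to exactly 512 so that the final shift by 9 normalizes the result. The name `blend8` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Linear blend of a and b with an 8-bit fraction f of b (0..255). */
static uint8_t blend8(uint8_t a, uint8_t b, uint8_t f)
{
    /* 9-bit weights: the 8-bit fraction shifted up, with a rounding LSB. */
    uint32_t fa = (((uint32_t)(uint8_t)~f << 1) | 1u) & 0x1ffu; /* 511 - 2f */
    uint32_t fb = (((uint32_t)f << 1) | 1u) & 0x1ffu;           /* 2f + 1   */
    assert(fa + fb == 512u); /* weights always sum to 2^9 */
    return (uint8_t)((fa * a + fb * b) >> 9);
}
```

Because the weights sum to 512 for every f, blending a value with itself returns that value unchanged, and neither weight is ever zero, so both operands always contribute slightly to the result.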
Plane Set | Buffering | Planes Per Pixel | Total Planes Per Pixel |
---|---|---|---|
Image | double | 24 | 48 |
Image VLT Context | double | 4 | 8 |
Fast Clear | double | 1 | 2 |
Overlay | double | 8 | 16 |
Mask | single | 4 | 4 |
Z Buffer | single | 32 | 32 |
Alpha | single | 8 | 8 |
Stencil | single | 8 | 8 |
Select Buffer Image | single | 1 | 1 |
Select Buffer Overlay | single | 1 | 1 |
Configuration | Dual Resolvers | SDRAMs | Frame Buffer Size | Planes Per Pixel |
---|---|---|---|---|
1 Mpixel | 2 | 8 | 16 Mbytes | 128 |
1.3 Mpixel | 2 | 8 | 16 Mbytes | 102 |
2 Mpixel | 4 | 16 | 32 Mbytes | 128 |
2.6 Mpixel | 4 | 16 | 32 Mbytes | 102 |
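The pixel counts in these configurations are consistent with dividing the total number of frame buffer bits by the planes per pixel. The following is a minimal sketch of that arithmetic, assuming 1 Mbyte = 2^20 bytes and one bit per plane; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Pixels that fit in a frame buffer of `mbytes` megabytes when each pixel
 * needs `planes` one-bit planes. */
static uint32_t pixels_supported(uint32_t mbytes, uint32_t planes)
{
    assert(planes != 0);
    uint64_t bits = (uint64_t)mbytes * 1024u * 1024u * 8u;
    return (uint32_t)(bits / planes);
}
```

With 16 Mbytes, 128 planes per pixel gives exactly 1,048,576 pixels (1 Mpixel), while 102 planes per pixel gives about 1.3 million; doubling the memory to 32 Mbytes doubles both figures.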
Key:

Symbol | Meaning |
---|---|
--> | stored in increasing order, indicates a span >= 1 |
T | top border |
B | bottom border |
L | left border |
R | right border |
TL | top-left corner |
TR | top-right corner |
BL | bottom-left corner |
BR | bottom-right corner |
Claims (25)
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US08/832,708 US5864512A (en) | 1996-04-12 | 1997-04-11 | High-speed video frame buffer using single port memory chips |
US09/129,293 US6278645B1 (en) | 1997-04-11 | 1998-08-05 | High speed video frame buffer |
US09/934,444 US6667744B2 (en) | 1997-04-11 | 2001-08-21 | High speed video frame buffer |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US1534996P | 1996-04-12 | 1996-04-12 | |
US08/832,708 US5864512A (en) | 1996-04-12 | 1997-04-11 | High-speed video frame buffer using single port memory chips |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US09/129,293 Continuation US6278645B1 (en) | 1997-04-11 | 1998-08-05 | High speed video frame buffer |
Publications (1)
Publication Number | Publication Date |
---|---|
US5864512A true US5864512A (en) | 1999-01-26 |
Family
ID=21770881
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US08/832,708 Expired - Lifetime US5864512A (en) | 1996-04-12 | 1997-04-11 | High-speed video frame buffer using single port memory chips |
Country Status (3)
Country | Link |
---|---|
US (1) | US5864512A (en) |
EP (1) | EP0892972A1 (en) |
WO (1) | WO1997039437A1 (en) |
Cited By (25)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6011579A (en) * | 1996-12-10 | 2000-01-04 | Motorola, Inc. | Apparatus, method and system for wireline audio and video conferencing and telephony, with network interactivity |
US6052134A (en) * | 1997-12-22 | 2000-04-18 | Compaq Computer Corp. | Memory controller and method for dynamic page management |
WO2001029818A1 (en) * | 1999-10-18 | 2001-04-26 | S3 Incorporated | Atomic operation in system with burst mode memory access |
US20020003518A1 (en) * | 2000-06-29 | 2002-01-10 | Kabushiki Kaisha Toshiba | Semiconductor device for driving liquid crystal and liquid crystal display apparatus |
US6385566B1 (en) * | 1998-03-31 | 2002-05-07 | Cirrus Logic, Inc. | System and method for determining chip performance capabilities by simulation |
US6441818B1 (en) * | 1999-02-05 | 2002-08-27 | Sony Corporation | Image processing apparatus and method of same |
US6476816B1 (en) * | 1998-07-17 | 2002-11-05 | 3Dlabs Inc. Ltd. | Multi-processor graphics accelerator |
US20030016225A1 (en) * | 2001-07-19 | 2003-01-23 | International Business Machines Corporation | Selecting between double buffered stereo and single buffered stereo in a windowing system |
US6687798B1 (en) * | 2001-05-31 | 2004-02-03 | Oracle International Corporation | Methods for intra-partition parallelism for inserts |
US6724396B1 (en) * | 2000-06-01 | 2004-04-20 | Hewlett-Packard Development Company, L.P. | Graphics data storage in a linearly allocated multi-banked memory |
US6734866B1 (en) * | 2000-09-28 | 2004-05-11 | Rockwell Automation Technologies, Inc. | Multiple adapting display interface |
US6778177B1 (en) * | 1999-04-15 | 2004-08-17 | Sp3D Chip Design Gmbh | Method for rasterizing a graphics basic component |
US20040160446A1 (en) * | 2003-02-18 | 2004-08-19 | Gosalia Anuj B. | Multithreaded kernel for graphics processing unit |
US6897874B1 (en) * | 2000-03-31 | 2005-05-24 | Nvidia Corporation | Method and apparatus for providing overlay images |
US6947100B1 (en) * | 1996-08-09 | 2005-09-20 | Robert J. Proebsting | High speed video frame buffer |
US20050231519A1 (en) * | 1999-03-22 | 2005-10-20 | Gopal Solanki | Texture caching arrangement for a computer graphics accelerator |
US20060031565A1 (en) * | 2004-07-16 | 2006-02-09 | Sundar Iyer | High speed packet-buffering system |
US7016903B1 (en) | 2001-01-25 | 2006-03-21 | Oracle International Corporation | Method for conditionally updating or inserting a row into a table |
US7136068B1 (en) * | 1998-04-07 | 2006-11-14 | Nvidia Corporation | Texture cache for a computer graphics accelerator |
CN100463511C (en) * | 2003-04-28 | 2009-02-18 | 三星电子株式会社 | Image data processing system and image data reading and writing method |
US7724253B1 (en) * | 2006-10-17 | 2010-05-25 | Nvidia Corporation | System and method for dithering depth values |
US20100149426A1 (en) * | 2008-12-17 | 2010-06-17 | Ho-Tzu Cheng | Systems and methods for bandwidth optimized motion compensation memory access |
US20100325319A1 (en) * | 2005-11-01 | 2010-12-23 | Frank Worrell | Systems for implementing sdram controllers, and buses adapted to include advanced high performance bus features |
US20140085193A1 (en) * | 2009-05-29 | 2014-03-27 | Microsoft Corporation | Protocol and format for communicating an image from a camera to a computing environment |
US20170230603A1 (en) * | 2016-02-04 | 2017-08-10 | Samsung Electronics Co., Ltd. | High resolution user interface |
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB2381103B (en) * | 1997-12-17 | 2003-06-04 | Fujitsu Ltd | Memory access methods and devices for use with random access memories |
EP0935199B1 (en) * | 1998-02-04 | 2011-05-04 | Panasonic Corporation | Memory control unit and memory control method and medium containing program for realizing the same |
JPH11339005A (en) * | 1998-05-22 | 1999-12-10 | Sony Corp | Image processor and special effect device and image processing method |
US6456746B2 (en) * | 1999-01-26 | 2002-09-24 | Sarnoff Corporation | Method of memory utilization in a predictive video decoder |
US6798418B1 (en) * | 2000-05-24 | 2004-09-28 | Advanced Micro Devices, Inc. | Graphics subsystem including a RAMDAC IC with digital video storage interface for connection to a graphics bus |
KR20040075927A (en) * | 2000-08-17 | 2004-08-30 | 주식회사 이노티브 | Information service method |
US6891545B2 (en) * | 2001-11-20 | 2005-05-10 | Koninklijke Philips Electronics N.V. | Color burst queue for a shared memory controller in a color sequential display system |
KR100878231B1 (en) * | 2002-02-08 | 2009-01-13 | 삼성전자주식회사 | Liquid crystal display and driving method thereof and frame memory |
US8045021B2 (en) | 2006-01-05 | 2011-10-25 | Qualcomm Incorporated | Memory organizational scheme and controller architecture for image and video processing |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0279229A2 (en) * | 1987-02-12 | 1988-08-24 | International Business Machines Corporation | A graphics display system |
WO1989006031A2 (en) * | 1987-12-18 | 1989-06-29 | Digital Equipment Corporation | Method of drawing in graphics rendering system |
US5170468A (en) * | 1987-08-18 | 1992-12-08 | Hewlett-Packard Company | Graphics system with shadow ram update to the color map |
US5187575A (en) * | 1989-12-29 | 1993-02-16 | Massachusetts Institute Of Technology | Source adaptive television system |
US5333299A (en) * | 1991-12-31 | 1994-07-26 | International Business Machines Corporation | Synchronization techniques for multimedia data streams |
US5517253A (en) * | 1993-03-29 | 1996-05-14 | U.S. Philips Corporation | Multi-source video synchronization |
US5557342A (en) * | 1993-07-06 | 1996-09-17 | Hitachi, Ltd. | Video display apparatus for displaying a plurality of video signals having different scanning frequencies and a multi-screen display system using the video display apparatus |
-
1997
- 1997-04-11 US US08/832,708 patent/US5864512A/en not_active Expired - Lifetime
- 1997-04-11 WO PCT/US1997/005983 patent/WO1997039437A1/en not_active Application Discontinuation
- 1997-04-11 EP EP97921139A patent/EP0892972A1/en not_active Withdrawn
Non-Patent Citations (8)
Title |
---|
"Design considerations and applications for innovative display option using projector arrays" Index; Nov. 1, 1996; pp. 1-2. |
"GLZ5 Hardware User's Guide," Intergraph Corporation, Nov. 1995. |
"Silicon Graphics United Kingdom," Nov. 1, 1996; pp. 1-4. |
C. Cruz-Neira, et al., "Surround-Screen Projection-Based Virtual Reality: The Design and Implementation of the CAVE," Nov. 1, 1996; pp. 1-13. |
D. Pape, "A Hardware-Independent Virtual Reality Development System," Nov. 1, 1996; pp. 1-5. |
K. Suizu, et al., "Emerging Memory Solutions for Graphics Applications," IEICE Transactions on Electronics, vol. E78-c, No. 7, Jul. 1995; pp. 773-781. |
M. C. Miller et al., "Multiscale Terrain Tiling for Real Time Rendering," Nov. 1, 1996; pp. 1-10. |
T. DeFanti, et al., "Overview of the I-WAY: Wide Area Visual Supercomputing," International Journal of Supercomputing Applications, Oct. 2, 1996; pp. 1-10. |
Cited By (44)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6947100B1 (en) * | 1996-08-09 | 2005-09-20 | Robert J. Proebsting | High speed video frame buffer |
US6011579A (en) * | 1996-12-10 | 2000-01-04 | Motorola, Inc. | Apparatus, method and system for wireline audio and video conferencing and telephony, with network interactivity |
US6052134A (en) * | 1997-12-22 | 2000-04-18 | Compaq Computer Corp. | Memory controller and method for dynamic page management |
US6385566B1 (en) * | 1998-03-31 | 2002-05-07 | Cirrus Logic, Inc. | System and method for determining chip performance capabilities by simulation |
US7136068B1 (en) * | 1998-04-07 | 2006-11-14 | Nvidia Corporation | Texture cache for a computer graphics accelerator |
US6476816B1 (en) * | 1998-07-17 | 2002-11-05 | 3Dlabs Inc. Ltd. | Multi-processor graphics accelerator |
US6441818B1 (en) * | 1999-02-05 | 2002-08-27 | Sony Corporation | Image processing apparatus and method of same |
US8018467B2 (en) | 1999-03-22 | 2011-09-13 | Nvidia Corporation | Texture caching arrangement for a computer graphics accelerator |
US7330188B1 (en) | 1999-03-22 | 2008-02-12 | Nvidia Corp | Texture caching arrangement for a computer graphics accelerator |
US20050231519A1 (en) * | 1999-03-22 | 2005-10-20 | Gopal Solanki | Texture caching arrangement for a computer graphics accelerator |
US6778177B1 (en) * | 1999-04-15 | 2004-08-17 | Sp3D Chip Design Gmbh | Method for rasterizing a graphics basic component |
US6956578B2 (en) | 1999-10-18 | 2005-10-18 | S3 Graphics Co., Ltd. | Non-flushing atomic operation in a burst mode transfer data storage access environment |
US6756986B1 (en) | 1999-10-18 | 2004-06-29 | S3 Graphics Co., Ltd. | Non-flushing atomic operation in a burst mode transfer data storage access environment |
WO2001029818A1 (en) * | 1999-10-18 | 2001-04-26 | S3 Incorporated | Atomic operation in system with burst mode memory access |
US20050007374A1 (en) * | 1999-10-18 | 2005-01-13 | S3 Graphics Co., Ltd. | Non-flushing atomic operation in a burst mode transfer data storage access environment |
US6897874B1 (en) * | 2000-03-31 | 2005-05-24 | Nvidia Corporation | Method and apparatus for providing overlay images |
GB2364873B (en) * | 2000-06-01 | 2005-02-02 | Hewlett Packard Co | Graphics data storage in a linearly allocated multi-banked memory |
US6724396B1 (en) * | 2000-06-01 | 2004-04-20 | Hewlett-Packard Development Company, L.P. | Graphics data storage in a linearly allocated multi-banked memory |
US6933915B2 (en) | 2000-06-29 | 2005-08-23 | Kabushiki Kaisha Toshiba | Semiconductor device for driving liquid crystal and liquid crystal display apparatus |
US20020003518A1 (en) * | 2000-06-29 | 2002-01-10 | Kabushiki Kaisha Toshiba | Semiconductor device for driving liquid crystal and liquid crystal display apparatus |
US6734866B1 (en) * | 2000-09-28 | 2004-05-11 | Rockwell Automation Technologies, Inc. | Multiple adapting display interface |
US7016903B1 (en) | 2001-01-25 | 2006-03-21 | Oracle International Corporation | Method for conditionally updating or inserting a row into a table |
US20040158570A1 (en) * | 2001-05-31 | 2004-08-12 | Oracle International Corporation | Methods for intra-partition parallelism for inserts |
US6687798B1 (en) * | 2001-05-31 | 2004-02-03 | Oracle International Corporation | Methods for intra-partition parallelism for inserts |
US6895487B2 (en) | 2001-05-31 | 2005-05-17 | Oracle International Corporation | Methods for intra-partition parallelism for inserts |
US6888550B2 (en) | 2001-07-19 | 2005-05-03 | International Business Machines Corporation | Selecting between double buffered stereo and single buffered stereo in a windowing system |
US20030016225A1 (en) * | 2001-07-19 | 2003-01-23 | International Business Machines Corporation | Selecting between double buffered stereo and single buffered stereo in a windowing system |
US20100122259A1 (en) * | 2003-02-18 | 2010-05-13 | Microsoft Corporation | Multithreaded kernel for graphics processing unit |
US8671411B2 (en) | 2003-02-18 | 2014-03-11 | Microsoft Corporation | Multithreaded kernel for graphics processing unit |
US7673304B2 (en) * | 2003-02-18 | 2010-03-02 | Microsoft Corporation | Multithreaded kernel for graphics processing unit |
US20080301687A1 (en) * | 2003-02-18 | 2008-12-04 | Microsoft Corporation | Systems and methods for enhancing performance of a coprocessor |
CN1538296B (en) * | 2003-02-18 | 2010-05-26 | 微软公司 | Method and system for scheduling coprocessor |
US9298498B2 (en) | 2003-02-18 | 2016-03-29 | Microsoft Technology Licensing, Llc | Building a run list for a coprocessor based on rules when the coprocessor switches from one context to another context |
US20040160446A1 (en) * | 2003-02-18 | 2004-08-19 | Gosalia Anuj B. | Multithreaded kernel for graphics processing unit |
CN100463511C (en) * | 2003-04-28 | 2009-02-18 | 三星电子株式会社 | Image data processing system and image data reading and writing method |
US20060031565A1 (en) * | 2004-07-16 | 2006-02-09 | Sundar Iyer | High speed packet-buffering system |
US20100325319A1 (en) * | 2005-11-01 | 2010-12-23 | Frank Worrell | Systems for implementing sdram controllers, and buses adapted to include advanced high performance bus features |
US8046505B2 (en) * | 2005-11-01 | 2011-10-25 | Lsi Corporation | Systems for implementing SDRAM controllers, and buses adapted to include advanced high performance bus features |
US7724253B1 (en) * | 2006-10-17 | 2010-05-25 | Nvidia Corporation | System and method for dithering depth values |
US20100149426A1 (en) * | 2008-12-17 | 2010-06-17 | Ho-Tzu Cheng | Systems and methods for bandwidth optimized motion compensation memory access |
US20140085193A1 (en) * | 2009-05-29 | 2014-03-27 | Microsoft Corporation | Protocol and format for communicating an image from a camera to a computing environment |
US9215478B2 (en) * | 2009-05-29 | 2015-12-15 | Microsoft Technology Licensing, Llc | Protocol and format for communicating an image from a camera to a computing environment |
US20170230603A1 (en) * | 2016-02-04 | 2017-08-10 | Samsung Electronics Co., Ltd. | High resolution user interface |
US11064150B2 (en) * | 2016-02-04 | 2021-07-13 | Samsung Electronics Co., Ltd. | High resolution user interface |
Also Published As
Publication number | Publication date |
---|---|
WO1997039437A1 (en) | 1997-10-23 |
EP0892972A1 (en) | 1999-01-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US6667744B2 (en) | | High speed video frame buffer |
US5864512A (en) | | High-speed video frame buffer using single port memory chips |
US6104418A (en) | | Method and system for improved memory interface during image rendering |
US6906720B2 (en) | | Multipurpose memory system for use in a graphics system |
US6377266B1 (en) | | Bit BLT with multiple graphics processors |
US5990912A (en) | | Virtual address access to tiled surfaces |
US5233689A (en) | | Methods and apparatus for maximizing column address coherency for serial and random port accesses to a dual port ram array |
US6674443B1 (en) | | Memory system for accelerating graphics operations within an electronic device |
KR100648293B1 (en) | | Graphic system and graphic processing method for the same |
US6704026B2 (en) | | Graphics fragment merging for improving pixel write bandwidth |
US20030169265A1 (en) | | Memory interleaving technique for texture mapping in a graphics system |
US20060098021A1 (en) | | Graphics system and memory device for three-dimensional graphics acceleration and method for three dimensional graphics processing |
US6885384B2 (en) | | Method of creating a larger 2-D sample location pattern from a smaller one by means of X, Y address permutation |
US6836272B2 (en) | | Frame buffer addressing scheme |
US6741256B2 (en) | | Predictive optimizer for DRAM memory |
US5859646A (en) | | Graphic drawing processing device and graphic drawing processing system using thereof |
US5321809A (en) | | Categorized pixel variable buffering and processing for a graphics system |
US6532018B1 (en) | | Combined floating-point logic core and frame buffer |
US6812928B2 (en) | | Performance texture mapping by combining requests for image data |
US6720969B2 (en) | | Dirty tag bits for 3D-RAM SRAM |
JPH087565A (en) | | Dynamic random access memory and access method and system for dynamic random access memory |
US6778179B2 (en) | | External dirty tag bits for 3D-RAM SRAM |
US6992673B2 (en) | | Memory access device, semiconductor device, memory access method, computer program and recording medium |
Sproull | | Frame-buffer display architectures |
JPH07199907A (en) | | Display controller |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | AS | Assignment | Owner name: INTERGRAPH CORPORATION, ALABAMA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:BUCKELEW, MATT E.;CARLTON, STEWART G.;DEMING, JAMES L.;AND OTHERS;REEL/FRAME:008682/0693;SIGNING DATES FROM 19970624 TO 19970625 |
| | STCF | Information on status: patent grant | Free format text: PATENTED CASE |
| | AS | Assignment | Owner name: FOOTHILL CAPITAL CORPORATION, CALIFORNIA Free format text: SECURITY AGREEMENT;ASSIGNOR:INTEGRAPH CORPORATION;REEL/FRAME:010425/0955 Effective date: 19991130 |
| | AS | Assignment | Owner name: 3DLABS, CALIFORNIA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:INTERGRAPH CORPORATION;REEL/FRAME:011122/0951 Effective date: 20000816 |
| | AS | Assignment | Owner name: FOOTHILL CAPITAL CORPORATION, CALIFORNIA Free format text: SECURITY AGREEMENT;ASSIGNORS:3DLABS INC., LTD., AND CERTAIN OF PARENT'S SUBSIDIARIES;3DLABS INC., LTD.;3DLABS (ALABAMA) INC.;AND OTHERS;REEL/FRAME:012043/0880 Effective date: 20010727 |
| | FPAY | Fee payment | Year of fee payment: 4 |
| | AS | Assignment | Owner name: 3DLABS (ALABAMA) INC., ALABAMA Free format text: RELEASE OF SECURITY AGREEMENT;ASSIGNOR:WELL FARGO FOOTHILL, INC., FORMERLY KNOWN AS FOOTHILL CAPITAL CORPORATION;REEL/FRAME:015722/0752 Effective date: 20030909 Owner name: 3DLABS INC., A CORP. OF DE, CALIFORNIA Free format text: RELEASE OF SECURITY AGREEMENT;ASSIGNOR:WELL FARGO FOOTHILL, INC., FORMERLY KNOWN AS FOOTHILL CAPITAL CORPORATION;REEL/FRAME:015722/0752 Effective date: 20030909 Owner name: 3DLABS INC., A COMPANY ORGANIZED UNDER THE LAWS OF Free format text: RELEASE OF SECURITY AGREEMENT;ASSIGNOR:WELL FARGO FOOTHILL, INC., FORMERLY KNOWN AS FOOTHILL CAPITAL CORPORATION;REEL/FRAME:015722/0752 Effective date: 20030909 Owner name: 3DLABS LIMITED, A COMPANY ORGANIZED UNDER THE LAWS Free format text: RELEASE OF SECURITY AGREEMENT;ASSIGNOR:WELL FARGO FOOTHILL, INC., FORMERLY KNOWN AS FOOTHILL CAPITAL CORPORATION;REEL/FRAME:015722/0752 Effective date: 20030909 Owner name: 3DLABS (ALABAMA) INC.,ALABAMA Free format text: RELEASE OF SECURITY AGREEMENT;ASSIGNOR:WELL FARGO FOOTHILL, INC., FORMERLY KNOWN AS FOOTHILL CAPITAL CORPORATION;REEL/FRAME:015722/0752 Effective date: 20030909 Owner name: 3DLABS INC., A CORP. OF DE,CALIFORNIA Free format text: RELEASE OF SECURITY AGREEMENT;ASSIGNOR:WELL FARGO FOOTHILL, INC., FORMERLY KNOWN AS FOOTHILL CAPITAL CORPORATION;REEL/FRAME:015722/0752 Effective date: 20030909 Owner name: 3DLABS INC., LTD., A COMPANY ORGANIZED UNDER THE L Free format text: RELEASE OF SECURITY AGREEMENT;ASSIGNOR:WELL FARGO FOOTHILL, INC., FORMERLY KNOWN AS FOOTHILL CAPITAL CORPORATION;REEL/FRAME:015722/0752 Effective date: 20030909 |
| | FPAY | Fee payment | Year of fee payment: 8 |
| | AS | Assignment | Owner name: 3DLABS, INC. LTD., BERMUDA Free format text: CORRECTION;ASSIGNOR:INTERGRAPH CORPORATION, A DELAWARE CORPORATION;REEL/FRAME:018480/0116 Effective date: 20000816 |
| | FPAY | Fee payment | Year of fee payment: 12 |
| | AS | Assignment | Owner name: ZIILABS INC., LTD., BERMUDA Free format text: CHANGE OF NAME;ASSIGNOR:3DLABS INC., LTD.;REEL/FRAME:032588/0125 Effective date: 20110106 |
| | AS | Assignment | Owner name: RPX CORPORATION, CALIFORNIA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:ZIILABS INC., LTD.;REEL/FRAME:044476/0621 Effective date: 20170809 |