WO1999067925A1 - Content storage and redundancy elimination - Google Patents

Content storage and redundancy elimination

Publication number
WO1999067925A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
packet
network
destination
maintained
Application number
PCT/IL1999/000345
Other languages
French (fr)
Inventor
Marco Talmon
Ziv Haparnas
Original Assignee
Infit Communications Ltd.
Priority date
1998-06-23
Filing date
1999-06-23
Publication date
1999-12-29
Application filed by Infit Communications Ltd.
Priority to AU43890/99A
Publication of WO1999067925A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/16 Implementation or adaptation of Internet protocol [IP], of transmission control protocol [TCP] or of user datagram protocol [UDP]
    • H04L69/163 In-band adaptation of TCP data exchange; In-band control procedures
    • H04L69/22 Parsing or analysis of headers

Abstract

A method for reducing data traffic associated with the transmission of packets in a packet-based network including: a) maintaining data transmitted in a data stream at a network source (16) and destination (18), the network source (16) is in communication with the destination (18) via a network path (8); b) subsequently receiving a packet associated with the data stream at the network source (16); c) extracting the packet's data; d) comparing the data extracted in step c) with the data maintained in step a); and e) where the data extracted in step c) at least partially matches the data maintained in step a); f) sending a system packet to the network destination (18) identifying the data extracted in step c) with the data maintained at the network destination (18); and g) recreating the packet at the network destination (18) from the data maintained at the destination.

Description

CONTENT STORAGE AND REDUNDANCY ELIMINATION
FIELD OF THE INVENTION
The present invention relates to packet-based networks in general, and in particular to apparatus and methods for reducing data traffic associated with the transmission of packets in such networks.
BACKGROUND OF THE INVENTION
In computer networks having a client-server architecture, data files are often sent multiple times from a server to the same destination or the same subnet in response to multiple requests. For example, on the Internet a server may transmit the same Graphics Interchange Format (GIF) file over and over again to different users at a single Internet Service Provider (ISP). In packet-switched networks such as the Internet, a data file is segmented prior to transmission into one or more "packets" that are transmitted in a "data stream" to the destination, where they are then reassembled into the original data file. In order to facilitate the transmission of the packets over the network, they are "wrapped" by one or more protocols. The Internet server in the previous example may use the HyperText Transfer Protocol (HTTP) in order to transmit the GIF file. The HTTP protocol adds a header that provides additional information about the GIF file (such as size, time, and type of server), and may also concatenate several files together (according to the HTTP/1.1 protocol). The Transmission Control Protocol (TCP) then splits the HTTP transmission into packets for physical transmission over the Internet using the Internet Protocol (IP).
SUMMARY OF THE INVENTION
The present invention seeks to provide apparatus and methods for reducing data traffic associated with the transmission of packets in a packet-based network, such as between two routers in the network. There is thus provided in accordance with a preferred embodiment of the present invention a method for reducing data traffic associated with the transmission of packets in a packet-based network including a) maintaining data transmitted in a data stream at a network source and a network destination, wherein the network source is in communication with the network destination via a network path, b) subsequently receiving a packet associated with the data stream at the network source, c) extracting the packet's data, d) comparing the data extracted in step c) with the data maintained in step a), and e) where the data extracted in step c) at least partially matches the data maintained in step a): f) sending a system packet to the network destination identifying the data extracted in step c) with the data maintained at the network destination, and g) recreating the packet at the network destination from the data maintained at the destination.
Further in accordance with a preferred embodiment of the present invention the system packet includes a checksum of the data stream.
Still further in accordance with a preferred embodiment of the present invention the checksum uniquely identifies the data stream. Additionally in accordance with a preferred embodiment of the present invention the system packet includes an offset identifying the position of the data extracted in step c) in the data maintained in step a).
BRIEF DESCRIPTION OF THE DRAWINGS
The present invention will be understood and appreciated more fully from the following detailed description taken in conjunction with the appended drawings in which:
Fig. 1 is a simplified block diagram of a system for reducing data traffic associated with the transmission of packets in a network, the system constructed and operative in accordance with a preferred embodiment of the present invention; and
Fig. 2 is a simplified flowchart illustration of a method of operation of the system of Fig. 1 in accordance with a preferred embodiment of the present invention.
DETAILED DESCRIPTION OF THE PRESENT INVENTION
Reference is now made to Fig. 1, which is a simplified block diagram of a system for reducing data traffic associated with the transmission of packets in a network, the system constructed and operative in accordance with a preferred embodiment of the present invention, and Fig. 2, which is a simplified flowchart illustration of a method of operation of the system of Fig. 1 in accordance with a preferred embodiment of the present invention. In the system of Fig. 1 a stream of packets is transmitted via a network path 8 from a server 10 to a client 12. Client 12 may be any recipient of the data stream sent by server 10, such as an end-user computer or an ISP server. Server 10 is connected to network path 8 via a router 14 and a source device 16. Client 12 is connected to network path 8 via a router 20 and a destination device 18.
Typical operation of the system of Fig. 1 is now explained with specific reference to Fig. 2. Source device 16 preferably receives each packet sent by server 10 for transmission via network path 8 (block 102), extracts the data contained in the packet, and notes the data stream or streams to which the packet belongs (blocks 104 and 106). Next, the data from each packet is checked to determine whether some of it, or all of it, was already sent over network path 8 to destination device 18 (block 108), which preferably stores data sent to it (block 110). One method of determining whether data received at source device 16 from server 10 already exists at destination device 18 is by maintaining in both source device 16 and destination device 18 a copy of all data sent in all data streams or specific data streams, such as those containing GIF data, that previously traversed network path 8 (blocks 110 and 118). Thus if data in a data stream is present in source device 16, it is also present in destination device 18, assuming that a reliable connection exists between source device 16 and destination device 18. If a reliable connection does not exist, destination device 18 may inform source device 16 that destination device 18 is missing certain data and request a retransmission of the original data.
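The mirrored storage described above can be illustrated with a minimal Python sketch. It assumes a reliable path, so that whatever the source records the destination records too. The class name `StreamStore`, its methods, and the stream identifier `"stream-1"` are hypothetical and do not appear in the source; a plain substring search stands in for whatever matching method an implementation would actually use.

```python
from typing import Dict, Optional, Tuple


class StreamStore:
    """Hypothetical per-device store mapping a stream id to the bytes seen so far."""

    def __init__(self) -> None:
        self._streams: Dict[str, bytearray] = {}

    def append(self, stream_id: str, data: bytes) -> None:
        # Record data that traversed the network path (blocks 110 and 118).
        self._streams.setdefault(stream_id, bytearray()).extend(data)

    def find(self, data: bytes) -> Optional[Tuple[str, int]]:
        # Return (stream id, offset) of the first stored copy of `data`,
        # or None if the data was never sent before (block 108).
        if not data:
            return None
        for stream_id, stored in self._streams.items():
            offset = stored.find(data)
            if offset != -1:
                return stream_id, offset
        return None


source_store = StreamStore()
destination_store = StreamStore()

payload = b"GIF89a...image bytes..."

# First transmission: nothing matches, so both ends record the data.
first_lookup = source_store.find(payload)
source_store.append("stream-1", payload)
destination_store.append("stream-1", payload)

# Second transmission of the same bytes: the source now finds a match,
# and the destination holds the same copy.
second_lookup = source_store.find(payload)
```

Because both stores are updated with the same data, a match at the source implies the data is also available at the destination, which is the invariant the paragraph above relies on.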
If the data were already sent over network path 8, source device 16 communicates to destination device 18 that the data is already available at destination device 18 (block 130). One way of doing this is by sending a stream identifier including the checksum of the stream combined with an offset into the stream, and a length. Destination device 18 retrieves the data from memory (block 132), regenerates the packet (block 134), and sends the regenerated packet to client 12 via router 20 (block 120).
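A stream identifier of this kind might look as follows in Python. This is a sketch under stated assumptions: the source says only "checksum", so the use of SHA-256 here is an assumption, and the names `StreamRef`, `make_ref`, and `resolve_ref` are hypothetical.

```python
import hashlib
from typing import Dict, NamedTuple


class StreamRef(NamedTuple):
    checksum: bytes  # identifies the stored stream
    offset: int      # where the matched data starts within that stream
    length: int      # how many bytes to copy


def stream_checksum(stream: bytes) -> bytes:
    # SHA-256 is an assumption; the source does not name an algorithm.
    return hashlib.sha256(stream).digest()


def make_ref(stream: bytes, offset: int, length: int) -> StreamRef:
    return StreamRef(stream_checksum(stream), offset, length)


def resolve_ref(memory: Dict[bytes, bytes], ref: StreamRef) -> bytes:
    # Blocks 132 and 134: fetch the stored stream and cut out the referenced bytes.
    stream = memory[ref.checksum]
    end = ref.offset + ref.length
    if ref.offset < 0 or end > len(stream):
        raise ValueError("reference lies outside the stored stream")
    return stream[ref.offset:end]


stored_stream = b"HTTP header|GIF89a image data|trailer"
destination_memory = {stream_checksum(stored_stream): stored_stream}

ref = make_ref(stored_stream, offset=12, length=17)
regenerated = resolve_ref(destination_memory, ref)  # b"GIF89a image data"
```

The reference occupies a fixed, small number of bytes regardless of how much data it stands for, which is what makes sending it cheaper than resending the data.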
Data not found at destination device 18 is sent by source device 16 via network path 8 (block 112). The data may be sent using any conventional technique, compressed or "as is," provided that the destination device 18 can recreate the original data packet.
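As one illustration of sending unmatched data either compressed or "as is," the following Python sketch prefixes the data with a one-byte flag so that the destination can always recreate the original bytes. The flag values and function names are hypothetical, and zlib is merely one conventional technique chosen for the example; the source does not prescribe any.

```python
import zlib

RAW = 0         # hypothetical flag: data follows "as is"
COMPRESSED = 1  # hypothetical flag: data follows zlib-compressed


def encode_unmatched(data: bytes) -> bytes:
    packed = zlib.compress(data)
    # Use compression only when it actually saves bytes.
    if len(packed) < len(data):
        return bytes([COMPRESSED]) + packed
    return bytes([RAW]) + data


def decode_unmatched(wire: bytes) -> bytes:
    flag, body = wire[0], wire[1:]
    if flag == COMPRESSED:
        return zlib.decompress(body)
    if flag == RAW:
        return body
    raise ValueError(f"unknown encoding flag {flag}")


repetitive = b"a" * 500
short = b"xyz"
round_trip_ok = all(
    decode_unmatched(encode_unmatched(sample)) == sample
    for sample in (repetitive, short, b"")
)
```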
It is appreciated that data streams may be transmitted in the context of one or more network sessions. In the TCP/IP model, a session might be composed of all packets related to a certain TCP connection. A network session may also be defined as all the IP packets sent from one IP address to another IP address that share an additional element such as a certain identifier. A session may also include one or more data streams.
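The two session definitions above can be sketched as a keying function in Python. The `PacketInfo` fields, the `identifier` value `"flow-7"`, and the key layout are assumptions made for illustration; the source does not specify how a session is represented.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class PacketInfo:
    src_ip: str
    dst_ip: str
    src_port: Optional[int] = None    # None for non-TCP traffic
    dst_port: Optional[int] = None
    identifier: Optional[str] = None  # hypothetical shared element


def session_key(pkt: PacketInfo) -> tuple:
    if pkt.src_port is not None and pkt.dst_port is not None:
        # TCP case: one session per connection.
        return ("tcp", pkt.src_ip, pkt.src_port, pkt.dst_ip, pkt.dst_port)
    # Otherwise: the address pair plus a shared identifier.
    return ("ip", pkt.src_ip, pkt.dst_ip, pkt.identifier)


sessions: Dict[tuple, List[PacketInfo]] = {}
packets = [
    PacketInfo("10.0.0.1", "10.0.0.2", 80, 40000),
    PacketInfo("10.0.0.1", "10.0.0.2", 80, 40000),
    PacketInfo("10.0.0.1", "10.0.0.2", 80, 40001),
    PacketInfo("10.0.0.1", "10.0.0.2", identifier="flow-7"),
]
for pkt in packets:
    # Group each packet under the session it belongs to.
    sessions.setdefault(session_key(pkt), []).append(pkt)
```

Here the first two packets share a TCP connection and so fall into one session, while the third and fourth each open a session of their own.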
A preferred method of associating the data found in a packet with the appropriate data stream is now described. Each packet that is received by source device 16 is checked to determine if it is associated with an existing network session, if a new network session should be defined for the packet transmission, or if it is not to be associated with any session and therefore should be sent "as is," without additional processing by source device 16. In one preferred embodiment of the present invention source and destination devices 16 and 18 are installed at each end of an IP hop. For example, devices 16 and 18 may be installed between the routers at each end of the IP hop. The device at the sending end of the hop, i.e. source device 16, intercepts each IP packet the router sends over the hop, analyzes the packet, and creates a "system packet." The system packet typically includes a packet identifier combined with data. The identifier indicates whether this is a "regular" (bypass) packet, for which no matching data is found at destination device 18 and which is hence sent in its entirety. Alternatively, if some data matches were found, then the packet may include both the unmatched data and indexing information that destination device 18 may use to identify the matched stream, such as a unique checksum of the original stream and information about the location of the data in the stream, such as an offset-length pair. The destination device on the receiving end of the hop receives the system packet and checks the identifier. If the identifier indicates a "regular" (bypass) packet, the packet is analyzed, and the data found in the packet is stored in the memory of destination device 18. The original IP packet is then sent to router 20.
However, if the packet identifier indicates that the packet contains information about data streams already present at destination device 18, destination device 18 then analyzes the packet and recreates the original IP packet from its memory using the information found in the system packet.
Other approaches to data stream identification may be employed, thus supporting data streams which are not yet completed and for which a checksum of the entire stream cannot be calculated.
In another preferred embodiment of the present invention the functionality of source device 16 and destination device 18 may be incorporated into routers 14 and 20 respectively in hardware and/or software using conventional techniques. The routers may thus be configured to process packets as described above, with the data being stored in the routers' internal memory as necessary.
Source device 16 may use several methods for matching incoming packets to data stored at destination device 18. For example, source device 16 may perform matching on a packet-by-packet basis, then reassemble the data file (e.g., GIF file), and compare the received and stored files. A system utilizing the teachings of the present invention may provide additional data throughput as compared with existing systems, as it is typically more efficient to send indexing information as described above, which is usually several bytes in length, instead of the entire packet, which may contain hundreds of bytes of data.
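The saving can be made concrete with a small Python calculation. The figures used (a 500-byte payload and a 12-byte reference) are illustrative assumptions chosen to match "hundreds of bytes" and "several bytes"; they are not taken from the source, and the function name `wire_size` is hypothetical.

```python
def wire_size(payload_len: int, matched_len: int, ref_len: int) -> int:
    """Bytes sent when `matched_len` bytes of the payload are replaced by a reference."""
    if not 0 <= matched_len <= payload_len:
        raise ValueError("matched_len must lie within the payload")
    if matched_len == 0:
        return payload_len  # bypass: nothing to reference
    # The reference replaces the matched bytes; unmatched bytes are still sent.
    return ref_len + (payload_len - matched_len)


full_match = wire_size(payload_len=500, matched_len=500, ref_len=12)     # 12
partial_match = wire_size(payload_len=500, matched_len=300, ref_len=12)  # 212
no_match = wire_size(payload_len=500, matched_len=0, ref_len=12)         # 500
```

Under these assumed figures a full match cuts the bytes on the path from 500 to 12, and even a partial match of 300 bytes reduces them to 212.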
The methods and apparatus disclosed herein have been described without reference to specific hardware or software. Rather, the methods and apparatus have been described in a manner sufficient to enable persons of ordinary skill in the art to readily adapt commercially available hardware and software as may be needed to reduce any of the embodiments of the present invention to practice without undue experimentation and using conventional techniques. While the present invention has been described with reference to a few specific embodiments, the description is intended to be illustrative of the invention as a whole and is not to be construed as limiting the invention to the embodiments shown. It is appreciated that various modifications may occur to those skilled in the art that, while not specifically shown herein, are nevertheless within the true spirit and scope of the invention.

Claims

What is claimed is:
1. A method for reducing data traffic associated with the transmission of packets in a packet-based network comprising: a) maintaining data transmitted in a data stream at a network source and a network destination, wherein said network source is in communication with said network destination via a network path; b) subsequently receiving a packet associated with said data stream at said network source; c) extracting said packet's data; d) comparing said data extracted in step c) with said data maintained in step a); and e) where said data extracted in step c) at least partially matches said data maintained in step a): f) sending a system packet to said network destination identifying said data extracted in step c) with said data maintained at said network destination; and g) recreating said packet at said network destination from said data maintained at said destination.
2. A method according to claim 1 wherein said system packet includes a checksum of said data stream.
3. A method according to claim 2 wherein said checksum uniquely identifies said data stream.
4. A method according to claim 1 wherein said system packet includes an offset identifying the position of said data extracted in step c) in said data maintained in step a).
PCT/IL1999/000345 1998-06-23 1999-06-23 Content storage and redundancy elimination WO1999067925A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU43890/99A AU4389099A (en) 1998-06-23 1999-06-23 Content storage and redundancy elimination

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US9042598P 1998-06-23 1998-06-23
US60/090,425 1998-06-23

Publications (1)

Publication Number Publication Date
WO1999067925A1 (en) 1999-12-29

Family

ID=22222714

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IL1999/000345 WO1999067925A1 (en) 1998-06-23 1999-06-23 Content storage and redundancy elimination

Country Status (2)

Country Link
AU (1) AU4389099A (en)
WO (1) WO1999067925A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9083708B2 (en) 2010-05-17 2015-07-14 Microsoft Technology Licensing, Llc Asymmetric end host redundancy elimination for networks

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4899148A (en) * 1987-02-25 1990-02-06 Oki Electric Industry Co., Ltd. Data compression method
US5293379A (en) * 1991-04-22 1994-03-08 Gandalf Technologies, Inc. Packet-based data compression method

Also Published As

Publication number Publication date
AU4389099A (en) 2000-01-10

Similar Documents

Publication Publication Date Title
EP1330720B1 (en) Network architecture and methods for transparent on-line cross-sessional encoding and transport of network communications data
US7143169B1 (en) Methods and apparatus for directing messages to computer systems based on inserted data
US6032193A (en) Computer system having virtual circuit address altered by local computer to switch to different physical data link to increase data transmission bandwidth
US8009696B2 (en) System and method for achieving accelerated throughput
JP3225924B2 (en) Communication quality control device
JP4759389B2 (en) Packet communication device
US7079501B2 (en) Method and system for efficiently delivering content to multiple requesters
US20030084185A1 (en) Apparatus and method for scaling TCP off load buffer requirements by segment size
US20030206549A1 (en) Method and apparatus for multicast delivery of information
US6321269B1 (en) Optimized performance for transaction-oriented communications using stream-based network protocols
US20030229809A1 (en) Transparent proxy server
WO2002035795A1 (en) Transparent proxy server
US20030149792A1 (en) System and method for transmission of data through multiple streams
WO2022193447A1 (en) Data packet deduplication and transmission method, electronic device, and storage medium
US6466987B2 (en) Wire protocol for a media server system
EP2084860A2 (en) Selective session interception method
US7283527B2 (en) Apparatus and method of maintaining two-byte IP identification fields in IP headers
US7564848B2 (en) Method for the establishing of connections in a communication system
US20090055919A1 (en) Unauthorized communication detection method
US7876757B2 (en) Router-assisted fast processing of packet termination in host
US20030055915A1 (en) Method and apparatus for transmitting data over a network
US6963568B2 (en) Method for transmitting data packets, method for receiving data packets, data packet transmitter device, data packet receiver device and network including such devices
US6542503B1 (en) Multicast echo removal
WO2002051077A1 (en) A method and system for distinguishing higher layer protocols of the internet traffic
US7978598B1 (en) Connection replication

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW SD SL SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase