WO2006130830A2 - System and method for measuring traffic and flow matrices - Google Patents

System and method for measuring traffic and flow matrices

Info

Publication number
WO2006130830A2
WO2006130830A2 (PCT/US2006/021447)
Authority
WO
WIPO (PCT)
Prior art keywords
sketch
data packet
network
node
bitmap
Prior art date
Application number
PCT/US2006/021447
Other languages
French (fr)
Other versions
WO2006130830A3 (en)
Inventor
Qi Zhao
Abhishek Kumar
Jun Xu
Original Assignee
Georgia Tech Research Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Georgia Tech Research Corporation filed Critical Georgia Tech Research Corporation
Publication of WO2006130830A2 publication Critical patent/WO2006130830A2/en
Publication of WO2006130830A3 publication Critical patent/WO2006130830A3/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/02 Capturing of monitoring data
    • H04L43/022 Capturing of monitoring data by sampling
    • H04L43/024 Capturing of monitoring data by adaptive sampling
    • H04L43/026 Capturing of monitoring data using flow identification
    • H04L43/06 Generation of reports
    • H04L43/067 Generation of reports using time frame reporting
    • H04L43/10 Active monitoring, e.g. heartbeat, ping or trace-route
    • H04L43/106 Active monitoring using time related information in packets, e.g. by adding timestamps
    • H04L43/16 Threshold monitoring

Definitions

  • Figure 8 illustrates a method 80 for monitoring a network link in accordance with the invention using a novel data streaming technique that processes a long stream of data items in one pass using a small working memory in order to answer a class of queries regarding the stream.
  • the method may use the counter array sketch or the bit map sketch described above to generate a traffic matrix or a flow matrix based on the data packets in the data stream monitored at the communications link.
  • the monitoring method shown in Figure 8 may be implemented by the collection unit, and in particular by the online streaming module, which is a plurality of lines of computer code executed by the node processor (the node being the hardware on which the online streaming module runs) that implements the steps described below.
  • In step 82, the online streaming module waits for a packet on the link and, when a packet arrives, extracts the invariant portion (or the flow label information when the counter array sketch is used) from the data packet in step 84, as described above.
  • In step 86, the online streaming module performs a hash operation on the extracted data packet information and, as described above, generates an index into the sketch based on the data packet information so that, in step 88, the sketch position identified by the index is incremented (or a bit position is changed to "1" for the bit map sketch).
  • the counter array sketch or bit map sketch data structure may be stored in a smaller amount of memory than the actual data packets.
  • In step 90, the online streaming module determines whether the sketch period has been exceeded (either the epoch period of the counter array has expired or the bit map sketch has exceeded a threshold level of fullness). If the sketch period is not exceeded, the method loops back to step 82 to check for the next data packet on the link. If the sketch period is exceeded, then in step 92, the sketch is stored on the node, the sketch data structure is reset, the new sketch data structure is filled with data, and the method loops back to step 82.
  • the sketches may be communicated to the central monitoring unit on demand when an estimate is requested by a user or periodically communicated to the central monitoring unit.
  • the monitoring application analyzes the sketches and generates an estimate of the volume of the data packets over the link based on the one or more bit map sketches (a traffic matrix element) or generates an estimate of the volume of data packets for a particular flow over the link based on the one or more counter array sketches (a flow matrix).
  • To generate an estimate of a traffic matrix element, TMij, two bitmap sketches are collected from the corresponding nodes i and j and are fed to the monitoring application, which is able to estimate the traffic matrix element as described below in more detail. Since only the bitmap sketches from the two nodes, i and j, are needed, the monitoring application can estimate a submatrix using the minimum amount of information possible, namely, only the bitmaps from the rows and columns of the submatrix. The estimation of the submatrix allows large ISP networks to focus on particular portions of their network. The estimation of the submatrix also permits the incremental deployment of the network monitoring device since the existence of non-participating nodes does not affect the estimation of the traffic submatrix between all participating ingress and egress nodes.
  • to generate an estimate of a flow matrix, the counter array sketch is used.
  • the counter array sketch permits the volume of data packets from a plurality of different flows (based on the flow labels) to be estimated.
  • the counter array sketch may also be used to estimate the traffic matrix, although it is less cost-effective for that purpose than the bitmap sketch.
  • the network 40 has one or more nodes 42 that may each have the online streaming module (in the collection unit 24 shown in Figure 3) that generates the sketches (either the bitmap sketches or the counter array sketches) that are thousands of times smaller than the raw data packet traffic.
  • the sketches may be stored locally for a period of time, and will be shipped to a central monitoring unit 44 on demand.
  • the data analysis unit 26 running at the central monitoring unit 44 obtains the sketches needed for estimating the traffic matrix and flow matrix through queries.
  • the online streaming module (within the collection unit 24) and the analysis unit 26 are used in combination with the bitmap sketch.
  • the method for generating the bitmap sketch in the online streaming module was discussed above with respect to Figure 5B.
  • the bitmap sketch is stored in the online streaming module (at each node using the same hash function and same bitmap size b) until the sketch is filled to a threshold percentage over a time interval wherein the time interval may be known as a "bitmap epoch".
  • the bitmap sketch data structure is reset and then again filled with data.
  • FIG 9 illustrates a method 100 for estimating a traffic matrix in accordance with the invention that may be implemented by the monitoring application (that may be software executed by a processor that is part of the central monitoring unit) within the analysis unit 26 that may be within the central monitoring unit 44.
  • the monitoring application may determine whether a traffic matrix estimate (TMij) between two nodes (i and j) during an interval, t, has been requested. If a traffic matrix estimate is requested, then in step 104, the monitoring application requests the bitmap(s) from the two nodes that are contained in, or partly contained in, the time interval t.
  • An example of the bitmaps from the two nodes over a time interval requested by the monitoring application is shown in Figure 10.
  • the analysis unit 26 receives the requested bitmaps from the nodes.
  • the monitoring application estimates the traffic matrix element (TMij) for the volume of data packets between the two nodes given the bitmaps delivered from the nodes. For purposes of an example, it is assumed that both node i and node j produce exactly one bitmap during the time interval (the measurement interval) when the traffic matrix is estimated.
  • the estimator may be adapted from the article "A Linear-Time Probabilistic Counting Algorithm for Database Applications" by K.-Y. Whang et al., ACM Transactions on Database Systems, pages 208-229 (June 1990), which is incorporated herein by reference, in which the estimator is used for databases.
  • suppose the set of packets arriving at the ingress node i during the measurement interval is Ti, the resulting bitmap is BTi, the number of bits that remain "0" in BTi is UTi, and the size of the bitmap is b.
  • the number of distinct elements (packets) in Ti, denoted DTi, is then estimated as b ln(b / UTi).
  • TMij (the quantity to be estimated) is |Ti ∩ Tj|, the number of packets common to the two sets.
  • BTi∪Tj denotes the result of hashing the set of packets Ti ∪ Tj into a single bitmap.
  • the bitmap BTi∪Tj is computed as the bitwise-OR of BTi and BTj. It can be shown that the combination DTi + DTj - DTi∪Tj, with each term obtained from the estimator above, is an estimate of |Ti ∩ Tj| = TMij.
  • the computational complexity of estimating each element of the matrix is O(b) for the bitwise operation of the two bitmaps.
  • the overall complexity of estimating the entire m x n matrix is therefore O(mnb). Note that the bitmaps from other nodes are not needed when only TMij is of interest. This poses a significant advantage in computational complexity over existing indirect measurement approaches, in which the whole traffic matrix needs to be estimated, due to the holistic nature of the inference method, even if only a small subset of the matrix elements is of interest.
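  • As an illustration of this computation, the following Python sketch implements the linear-counting estimator and the bitwise-OR combination described above. It is a simplified rendering rather than the patented implementation; representing bitmaps as byte strings and the function names are assumptions.

```python
import math

def zero_bits(bitmap: bytes) -> int:
    """Count the bit positions still set to 0 in the bitmap."""
    b = len(bitmap) * 8
    return b - sum(bin(byte).count("1") for byte in bitmap)

def linear_counting_estimate(bitmap: bytes) -> float:
    """Estimate the number of distinct packets hashed into the bitmap (Whang et al.)."""
    b = len(bitmap) * 8
    u = zero_bits(bitmap)
    if u == 0:
        raise ValueError("bitmap is completely full; the estimate is unbounded")
    return b * math.log(b / u)

def estimate_tm_element(bitmap_i: bytes, bitmap_j: bytes) -> float:
    """Estimate TMij from the ingress bitmap of node i and the egress bitmap of node j."""
    assert len(bitmap_i) == len(bitmap_j), "both nodes must use the same bitmap size b"
    union = bytes(x | y for x, y in zip(bitmap_i, bitmap_j))  # bitwise-OR of the two bitmaps
    return (linear_counting_estimate(bitmap_i)
            + linear_counting_estimate(bitmap_j)
            - linear_counting_estimate(union))  # inclusion-exclusion on distinct packet counts
```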
  • it is assumed, first, that the measurement interval is exactly one bitmap epoch. In practice, some network management tasks, such as capacity planning and routing configuration, need traffic matrices on long time scales, such as tens of minutes or a few hours, while each epoch in these measurements is typically much smaller, especially for high-speed links. Therefore the scheme needs to be extended to support arbitrary time scales.
  • it is assumed, second, that the bitmap epochs between nodes i and j are well aligned. Traffic going through different nodes can have rates orders of magnitude apart, resulting in some bitmaps being filled up very fast (hence short bitmap epochs) and others being filled up very slowly (hence long bitmap epochs). This phenomenon is referred to as heterogeneity. Because of heterogeneity, the bitmap epochs on different nodes may not be well aligned.
  • the above estimation was for the ideal case.
  • the general case for the estimation is explained.
  • suppose the measurement interval spans exactly bitmap epochs 1, 2, ..., k1 at node i and bitmap epochs 1, 2, ..., k2 at node j, respectively.
  • the traffic matrix element TMij can then be estimated as the sum of the common-traffic estimates over all temporally overlapping page pairs: TMij ≈ Σq Σr Nq,r × overlap(q, r).
  • here Nq,r is the estimate of the common traffic between bitmap (also known as a page) q at node i and page r at node j, and overlap(q, r) is 1 when page q at node i overlaps temporally with page r at node j and is 0 otherwise.
  • the timestamps of their starting times will be stored along with the pages in a process known as "multipaging".
  • the multipaging process eliminates the assumptions set forth above. For example, the multipaging process supports the measurements over multiple epochs so that the first assumption is eliminated.
  • an exemplary measurement interval 110 corresponds to the rear part of epoch 1, epochs 2 and 3, and the front part of epoch 4 at node i.
  • the exemplary measurement interval also corresponds to the rear part of epoch 1, epoch 2, and the front part of epoch 3 at node j.
  • the common-traffic estimates N1,1, N2,1, N2,2, N3,2, N3,3, and N4,3 are therefore computed for the page pairs that overlap temporally, and their sum gives the estimate of the traffic matrix element.
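  • The combination across misaligned epochs can be sketched in code as follows; the representation of a page as a (bitmap, start_time, end_time) tuple and the function names are assumptions for illustration, and estimate_pair can be the estimate_tm_element() sketch shown earlier.

```python
def estimate_tm_multipage(pages_i, pages_j, estimate_pair):
    """
    Combine per-page common-traffic estimates across misaligned epochs (multipaging).
    pages_i / pages_j: lists of (bitmap, start_time, end_time) tuples for nodes i and j.
    estimate_pair: callable returning the common-traffic estimate N_qr for two bitmaps.
    """
    def overlaps(p, q):
        # Two pages overlap temporally when their [start, end) intervals intersect.
        return p[1] < q[2] and q[1] < p[2]

    return sum(
        estimate_pair(page_q[0], page_r[0])
        for page_q in pages_i
        for page_r in pages_j
        if overlaps(page_q, page_r)
    )
```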
  • In some situations, it is desirable to store the bitmaps for a long period of time for later troubleshooting, which could result in huge storage complexity for very high-speed links, but sampling can be used to reduce this requirement significantly.
  • When sampling the data packets, the impact on accuracy should be minimized. In addition, it is desirable to use DRAM to conduct online streaming for very high-speed links (e.g., beyond OC-192), and it is then important to sample only a certain percentage of the packets so that the DRAM speed can keep up with the data stream speed.
  • for example, the constraint can be one bitmap of 4 Mbits per second; suppose 40 million packets arrive within that second.
  • One option is that the process does no sampling, but hashes all these packets into the bitmap, referred to as "squeezing". But the resulting high load factor of approximately 10 would lead to high estimation error.
  • An alternative option is to sample only a certain percentage p of packets to be squeezed into the bitmap, and many different p values can be chosen. For example, 50% of the packets can be sampled, thereby squeezing 20 million sampled packets into the bitmap, or only 25% of them can be sampled and squeezed, so it is necessary to determine an optimal value of p. At one extreme, if the sampling rate is very low, the bitmap will only be lightly loaded and the error of estimating the total sampled traffic, as well as its common traffic with another node (a traffic matrix element), becomes lower; the error introduced by the sampling process itself, however, becomes higher.
  • the optimal value of p may be determined based on the following principle:
  • each overlapping page pair may have its own optimal t* to achieve the optimal accuracy of estimating its common traffic. Therefore it is impossible to adapt t* to satisfy every other node, as their needs (t* for optimal accuracy) conflict with each other. Therefore, a default t* for every node is identified such that the estimation accuracy for the most common cases is high.
  • the goal is therefore to determine the optimal p* and t* for a pair of overlapping pages, given the expected traffic demand in a bitmap epoch. In fact, only one of them needs to be determined since the other follows from Principle 1.
  • a sampling technique called consistent sampling is used to significantly reduce the estimation error. With consistent sampling, the estimator of X (the common traffic of interest) is obtained by estimating the common sampled traffic from the bitmaps, as described above, and scaling the result by 1/p, where p is the sampling rate.
  • the variance of this estimator consists of two terms.
  • the first term corresponds to the variance from estimating the sampled traffic (equation 2 above) scaled by 1/p 2 (to compensate for the sampling), and the second term corresponds to the variance of the sampling process. Since these two errors are orthogonal to each other, their total variance is the sum of their individual variances.
  • the resulting estimator is an almost unbiased estimator of X.
  • since the optimal t* value is a function of T and X, setting it according to some global default value may not be optimal all the time. Fortunately, we observe through extensive experiments that t*, the optimal load factor, does not vary much for different T and X values. In addition, we can observe from Figure 11 that the curve is quite flat in a large range around the optimal load factor. For example, the average errors corresponding to any load factor between 0.09 and 1.0 only fluctuate between around 0.012 and 0.015. Combining the above two observations, we conclude that by setting a global default load factor t* according to some typical parameter settings, the average error will stay very close to the optimal values. Throughout this work we set the default load factor to 0.7.
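  • Because the load factor is the number of sampled packets hashed per bit of the bitmap, the default t* translates directly into a sampling rate. A small illustrative calculation follows; the function and variable names are assumptions, not part of the patent.

```python
def sampling_rate(expected_packets: int, bitmap_bits: int, target_load_factor: float = 0.7) -> float:
    """Choose p so that the sampled traffic fills the bitmap to roughly the default load factor t*."""
    return min(1.0, target_load_factor * bitmap_bits / expected_packets)

# Example from the text: 40 million packets per second and a 4 Mbit bitmap per second.
p = sampling_rate(expected_packets=40_000_000, bitmap_bits=4 * 1024 * 1024)
# p is roughly 0.07, i.e. about 7% of the packets are sampled into each one-second bitmap.
```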
  • the consistent sampling scheme works by fixing a hash function h' (different from the aforementioned h, which is used to generate the bitmap) that maps the invariant portion of a packet to an l-bit binary number.
  • the range of the hash function h' is {0, 1, ..., 2^l - 1}.
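  • One way such a consistent sampling rule can be realized in software is sketched below. The threshold comparison (keep a packet when h'(pkt) < p * 2^l) and the helper names are assumptions for illustration; the description above only fixes the second hash function h' and its range.

```python
import hashlib

L_BITS = 32  # h' maps the invariant packet portion to an l-bit number in {0, ..., 2**l - 1}

def h_prime(invariant: bytes) -> int:
    """Second hash h', chosen independently of the hash h that indexes the bitmap."""
    return int.from_bytes(hashlib.md5(invariant).digest()[:4], "big")  # 32-bit value

def sampled(invariant: bytes, p: float) -> bool:
    """
    Consistent sampling decision: because every node applies the same h' to the same
    invariant portion, every node keeps exactly the same subset of packets.
    """
    return h_prime(invariant) < p * (2 ** L_BITS)

def scale_for_sampling(sampled_common_traffic: float, p: float) -> float:
    """Compensate for sampling: the estimate of the sampled common traffic is scaled by 1/p."""
    return sampled_common_traffic / p
```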
  • the flow matrix contains finer grained information than the traffic matrix and the counter array sketch described above may be used to estimate the flow matrix.
  • the flow matrix is the traffic matrix combined with the information on how each OD element is split into flows of different sizes.
  • a flow matrix element FMij is the set of sizes of flows that travel from node i to node j during a measurement interval.
  • the counter array sketch can be used to estimate the traffic matrix as well.
  • the online streaming module (within the collection unit 24) and the analysis unit 26 are used in combination with the counter array sketch.
  • the method for generating the counter array sketch (using the flow label described above) in the online streaming module was discussed above with respect to Figure 4B.
  • the counter array sketch is stored on the online streaming module (at each node using the same hash function and same array size b). Since, for each packet, this process requires only one hash operation, one memory read and one memory write (to the same location), the online streaming module can operate at OC-768 (40 Gbps) speed with off-the-shelf 10ns SRAM and an efficient hardware implementation of the H3 family of hash functions.
  • the counter array scheme is holistic in the sense that all ingress and egress nodes have to participate.
  • the counter epochs in this scheme need to be aligned with each other; that is, all counter epochs in all ingress and egress nodes need to start and end at approximately the same time.
  • the practical implication is that the counter array size b needs to be large enough to accommodate the highest link speed among all nodes (i.e., the worst case). Similar to the definition of "bitmap epoch", we refer to the amount of time the highest-speed link takes to fill up the counter array to a threshold percentage as a "counter epoch", or epoch for short.
  • the memory and storage complexities of the online streaming module for the counter array scheme are explored.
  • the counter epoch ranges from one to a few tens of seconds, and very accurate estimates can be achieved by setting the number of counters in the array to around the same order as the number of flows during an epoch. Therefore, for an OC-192 or an OC-768 link, one to a few million counters need to be employed for a measurement interval of one to a few seconds. If each counter has a "safe" size of 64 bits to prevent overflow, the memory requirement would be quite high.
  • Huffman-type compression can easily reduce the storage complexity to only a few bits per counter. Since the average flow length is about 10 packets (observed in our evaluation described below), the average storage cost per packet is amortized to less than 1 bit.
  • the counter arrays during that interval need to be shipped to the central monitoring unit for analysis. If the measurement interval spans more than one epoch, the sketches in each epoch will be processed independently.
  • for each counter index k, the sum of the ingress counter values approximately equals the sum of the egress counter values: Σi CIi[k] ≈ Σj CEj[k]. The approximation comes from the fact that the clock is not perfectly synchronized at all nodes and the packet traversal time from an ingress node to an egress node is non-zero. Both factors have only a marginal impact on the accuracy of the estimation. In addition, the impact of this approximation on the accuracy of the data analysis process is further alleviated by the "elephant matching" nature of the data analysis process described below.
  • Figure 12 is a piece of pseudocode that illustrates a method 120 for estimating the flow matrix by matching counter values at index k, wherein the large and medium flows are matched. For each index k in all the counter arrays, the steps shown in lines 2-13 of Figure 12 are executed. In the matching process, the largest ingress counter value CImax_i[k] is matched with the largest egress counter value CEmax_j[k]. The smaller of the two values is considered a flow from max_i to max_j (determined in lines 6 and 10 of the pseudocode), and this value is subtracted from both counter values (see lines 7 and 11 of the pseudocode), which reduces the smaller counter value to 0.
  • the computational complexity of the process shown in Figure 12 is O((m + n - 1)(log m + log n)) because the binary search operation (lines 3 and 4 of the pseudocode, which determine the largest ingress and egress counter values) dominates the complexity of each iteration and there are at most m + n - 1 iterations.
  • the overall complexity of estimating the flow matrix is therefore O(b(m + n - 1)(log m + log n)).
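  • A compact Python rendering of this matching procedure is shown below as an illustration. It follows the description above (repeatedly pair the largest remaining ingress counter with the largest remaining egress counter at each index k), but the data layout is an assumption and a simple linear scan replaces the binary search of the pseudocode.

```python
from collections import defaultdict

def estimate_flow_matrix(ingress_counters, egress_counters):
    """
    ingress_counters: dict node_i -> list of b counter values C_Ii
    egress_counters:  dict node_j -> list of b counter values C_Ej
    Returns dict (i, j) -> list of estimated (medium and large) flow sizes.
    """
    b = len(next(iter(ingress_counters.values())))
    flows = defaultdict(list)
    for k in range(b):
        ing = {i: c[k] for i, c in ingress_counters.items() if c[k] > 0}
        egr = {j: c[k] for j, c in egress_counters.items() if c[k] > 0}
        while ing and egr:
            max_i = max(ing, key=ing.get)        # largest remaining ingress counter at index k
            max_j = max(egr, key=egr.get)        # largest remaining egress counter at index k
            size = min(ing[max_i], egr[max_j])
            flows[(max_i, max_j)].append(size)   # treated as a flow from max_i to max_j
            ing[max_i] -= size                   # subtract the matched value from both counters
            egr[max_j] -= size
            if ing[max_i] == 0:
                del ing[max_i]
            if egr[max_j] == 0:
                del egr[max_j]
    return flows
```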
  • an exact flow matrix can be used to indicate intrusions such as DDoS attacks.
  • the estimation process in Figure 12 provides accurate estimates of the medium and large flow matrix elements, whereas some typical intrusions (e.g., DDoS attacks) consist of a large number of small flows.
  • the process shown in Figure 12 can still be used to provide valuable information about intrusions.
  • the flow label of the process can be selected to provide the valuable information.
  • the flow label can be the destination IP address of a packet so that the traffic of a DDoS attack becomes a large flow going through the network instead of a large number of small ones.
  • the method may include other known sampling or streaming processes as set forth in "New Directions in Traffic Measurement and Accounting", C. Estan and G. Varghese, Proceedings of ACM SIGCOMM, August 2002, which is incorporated herein by reference.
  • the traffic matrix can also be obtained by adding up the sizes of all the flows determined to go from node i to node j using the above process. This is in fact a fairly accurate estimation of the traffic matrix since the process tracks kangaroos (medium flows) and elephants (large flows) very accurately and thus accounts for the majority of traffic.
  • two error metrics are used to evaluate the accuracy of the estimates: the Root Mean Squared Error (RMSE) and the Root Mean Squared Relative Error (RMSRE).
  • the RMSE provides an overall measure of the absolute errors in the estimates, while the RMSRE provides a relative measure. Note that the relative errors for small matrix elements are usually not very important for network engineering, so only matrix elements greater than some threshold T are used in the computation of the RMSRE (properly normalized). In the RMSRE, N_T refers to the number of matrix elements greater than T, i.e., N_T = |{x_k : x_k > T, k = 1, 2, ..., N}|.
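  • For reference, the standard forms of the two metrics, consistent with the description above (x_k denotes a true matrix element, x̂_k its estimate, N the number of elements, and N_T the number of elements exceeding T), are:

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{k=1}^{N}\left(\hat{x}_k - x_k\right)^{2}},
\qquad
\mathrm{RMSRE} = \sqrt{\frac{1}{N_T}\sum_{k\,:\,x_k > T}\left(\frac{\hat{x}_k - x_k}{x_k}\right)^{2}}
```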
  • NLANR trace-driven evaluation: the set of traces used consists of 16 publicly available packet header traces from NLANR.
  • the number of flows in these traces varies from 170K to 320K and the number of packets varies from 1.8M to 3.5M.
  • a synthetic scenario is constructed that appears as if these traces were collected simultaneously at all ingress nodes of a network.
  • the challenge in constructing this scenario lies in assigning the flows in the input stream at an ingress node to 16 different egress nodes such that the generated matrix will reflect some properties of real traffic matrices.
  • For simplicity, the size of the bitmap and the counter array is configured to fit the data set size without adopting the enhancement techniques (i.e., multipaging and sampling).
  • Figures 13A and 13B compare the estimated traffic matrix elements using the bitmap scheme (Figure 13A) and the counter array scheme (Figure 13B) with the original traffic matrix elements.
  • the solid diagonal line in each figure denotes a perfect estimation, while the dashed lines denote an estimation error of ⁇ 5% so that points closer to the diagonal mean a more accurate estimate.
  • both schemes are very accurate, and the bitmap scheme is more accurate than the counter array scheme.
  • Figure 14 shows the impact of varying T on the RMSRE.
  • the results above reflect relative accuracy on a small time scale (one to several seconds for high-speed routers), and they should not be directly compared with other reported results, since those results are on much larger time scales.
  • the schemes usually can achieve much higher relative accuracy on larger time scales (e.g., tens of minutes) as shown below.
  • Figure 16 shows the RMSREs for various thresholds T. We observe a sharp downward trend in the value of the RMSRE for increasing threshold values. When the threshold is equal to 10 packets, the error drops to below 15%. The accurate estimation of these flows is very important since, in this trace, flows of size 10 and above (71,345 of them) account for 87% of the total traffic.
  • a one-hour router-level traffic matrix from a tier-1 ISP network is obtained to analytically evaluate the accuracy of the bitmap scheme.
  • traffic volume between each pair of backbone routers is evenly distributed over the one hour time period.
  • An hour's traffic is too large (we assume a conservative average packet size of 200 bytes) to fit in a single bitmap, and therefore the aforementioned multipaging technique is used.
  • Given a traffic matrix, we split the traffic on each ingress/egress node into multiple pages of 4 Mbits (i.e., 512 KB) with load factor 0.7 (the default load factor described above). Then, we compute the standard deviation for each pair of overlapping pages using Theorem 1.

Abstract

A system and method for measuring traffic and flow matrices is provided that, for a very high speed traffic stream, is able to provide accurate estimates of the volume of traffic flows using traffic digests that are orders of magnitude smaller than the traffic stream. The system and method may also incorporate sampling methodologies.

Description

SYSTEM AND METHOD FOR MEASURING TRAFFIC AND FLOW MATRICES
Qi (George) Zhao
Abhishek Kumar
Jun (Jim) Xu
Priority Claim/Related Application
This application claims priority to US Provisional Patent Application 60/689,651 entitled "Data Streaming Algorithms for Efficient Estimation of Flow Size Distribution," filed June 10, 2005; US Provisional Patent Application 60/686,560 entitled "Data Streaming Algorithm for Estimating Subpopulation Flow Size Distribution," and filed June 2, 2005; US Provisional Patent Application 60/686,570 entitled "Data Streaming Algorithms for Accurate and Efficient Measurement of Traffic and Flow Matrices," filed June 2, 2005; US Provisional Patent Application 60/709,198 entitled "Data Streaming Algorithms for Detection of Super Sources and Super Destination," filed August 17, 2005; and US Provisional Patent Application 60/709,191 entitled "Including Network Data Streaming in a Broad Architecture for Network Monitoring Applications," filed August 17, 2005; all of which are incorporated herein by reference. This application is also related to the PCT Patent Application filed on June 2, 2006 entitled "System and Method for Data Streaming" which is also incorporated herein by reference.
This invention was made with Government Support under Grant No. ANI-0238315 awarded by the National Science Foundation of the United States. The Government has certain rights in the invention.
Brief Description of the Drawings
Figure 1 illustrates an example of a network monitoring device;
Figure 2 illustrates another example of a network monitoring device;
Figure 3 illustrates an example of a network that includes the network monitoring device in accordance with the invention;
Figures 4A and 4B illustrate an example of a counter array sketch data structure used with the network monitoring device;
Figures 5A and 5B illustrate an example of a bit map sketch data structure used with the network monitoring device;
Figure 6 illustrates a network in which the traffic matrix and/or the flow matrix for a network link can be measured using the network device;
Figure 7 illustrates more details of each node of the network shown in Figure 6;
Figure 8 illustrates a method for monitoring a network link in accordance with the invention;
Figure 9 illustrates a method for estimating a traffic matrix in accordance with the invention;
Figure 10 illustrates an example of a flow measurement interval;
Figure 11 compares an observed average error with the predicted error for various load factors;
Figure 12 illustrates a method for estimating a flow matrix by matching counter values in accordance with the invention;
Figures 13A and 13B compare an estimated traffic matrix using the bit map sketch and counter array sketch, respectively, to the original traffic matrix estimates using the NLANR traces;
Figure 14 illustrates the impact of varying the threshold on the relative error of a traffic matrix estimation;
Figure 15 illustrates a comparison of an average error of the bitmap scheme to the known sampling scheme;
Figure 16 illustrates a flow matrix estimation error for various thresholds; and
Figure 17 illustrates a cumulative distribution of traffic with certain average errors.
Detailed Description of Embodiments of the Invention
The measurement of a traffic matrix, TM, is used to determine the traffic between m ingress nodes and n egress nodes (between origin/destination (OD) pairs) in a network, wherein a node could be a link, router, point of presence (PoP), or an application server (AS). In general, the traffic matrix can be measured between any two nodes of the network, such as between two routers. The total traffic volume traversing the network from the ingress node i ∈ {1, 2, ..., m} to the egress node j ∈ {1, 2, ..., n} is TMij. In one embodiment of the invention, estimating the TM on a high-speed network in a specified measurement interval is performed.
Figure 1 illustrates an example of a network monitoring device 20 that monitors a stream of data packets 22 over a link (not shown) of a communications network. The device may include a collection unit 24 and an analysis unit 26. The collection unit 24 collects sampled data packets or generates a sketch of the data packet stream over the link during a period of time while the analysis unit 26 analyzes the sampled data packets or the sketches. The collection unit and the analysis unit may be implemented in software, but may also be implemented in hardware or in a combination of hardware and software. The collection unit and analysis unit may be co-located or may be located at different physical locations and may be executed on the same piece of hardware or different pieces of hardware. The sketch is a data structure that stores information about each data packet in a packet stream wherein the sketch is typically a smaller size than the actual data packets. Two examples of a sketch that can be used with the network monitoring device are shown in Figures 4A-4B and 5A-5B, respectively.
The collection unit 24 may further comprise one or more of a selection process 28, an online streaming module 32 and a reporting process 30 which are each a piece of software code that implements the functions and methods described below. The selection process performs a sampling of the packet stream and selects the sampled data packets that are communicated to the reporting process 30 that aggregates the sampled data packets and generates a report based on the sampled data packets. The online streaming module monitors the packet stream and generates one or more sketches (shown in Figures 4A, 4B and/or 5A, 5B) based on the data packets in the packet stream. As shown, the collection unit may communicate flow records/reports and the sketches to the analysis unit 26.
The analysis unit 26 may further comprise a collector 34, a digest collector 36 and one or more network monitoring applications 38, such as application 38₁, application 38₂ and application 38₃. The collector 34 receives the flow records/reports from the reporting process 30 while the digest collector 36 receives the sketches generated by the online streaming module 32. The flow records/reports and/or the sketches may then be input into the monitoring applications 38 that perform different network monitoring functions. For example, one application can generate a data packet volume estimate over a link between a first and second node of a network, while another application may generate a flow estimate between a flow source and a flow destination. In general, the network monitoring device 20 is a platform on which a plurality of different monitoring applications can be executed to perform various network monitoring functions and operations. One or more examples of the monitoring applications that may be executed by the network monitoring device are described in more detail below.
Figure 2 illustrates another example of a network monitoring device 20 that includes the collection unit 24 and the analysis unit 26. In this example, the collection unit 24 includes only the online streaming module 32 that generates the sketches and periodically communicates the sketches to the analysis unit 26 that includes the offline processing module 38. Thus, in this example, the network monitoring device 20 generates sketches and then analyzes those sketches to perform a network monitoring function. Each data packet 22₁, 22₂, 22₃ and 22₄ may include a header 23₁, 23₂, 23₃ and 23₄ that may be used by the online streaming module 32 to generate the sketches as described in more detail below. In this example, a user may submit a query, such as the traffic volume of a particular link, to the monitoring application and the monitoring application returns a result to the user, such as the traffic volume of the particular link based on the sketches generated by the online streaming module.
Figure 3 illustrates an example of a network 40 that includes the network monitoring device in accordance with the invention. The network may include one or more nodes 42, such as routers 42₁, 42₂, 42₃, 42₄ and 42₅, wherein the network may include one or more ingress routers and one or more egress routers. The routers form the network that permits data packets to be communicated across the network. The links between the nodes over which data packets are communicated are shown as solid lines. In this example of the network, the collection unit 24 is physically located at each link interface at each node 42 and is a piece of software executed by each router. Furthermore, the analysis unit 26 may be physically located at a central monitoring unit 44, such as a server computer, that is remote from the nodes of the network, and the analysis unit is a piece of software executed by the central monitoring unit 44. Each collection unit 24 may generate one or more sketches for its link during a particular time period and then communicate those sketches to the central monitoring unit as shown by the dotted lines in Figure 3.
Figures 4A and 4B illustrate an example of a counter array sketch data structure 50 used with the network monitoring device. The sketch data structure may include one or more counters 52 (C1 to Cb for example) in an array (known as a counter array) wherein each counter has an index number (1 to b in the example shown in Figure 4A) associated with the counter. Each counter can be incremented based on the scanning of the data packets in the data stream performed by the online streaming module shown above. Furthermore, each counter is associated with a particular set of one or more packet flow label attributes. Each data packet flow label (typically contained in the header portion of the data packet and having 13 bytes) may include a field containing a source node address (an address of the source of the particular data packet), a field containing a destination node address (an address of the destination of the particular data packet), a field containing a source port (an application at the source from which the particular data packet is generated), a field containing a destination port (the application at the destination to which the particular data packet is being sent) and a field containing a protocol designation that identifies the type of protocol being used for the particular data packet, such as HTTP, UDP, SNMP, etc. For example, one counter for a particular network may be assigned to count all data packets (during a predetermined time interval) that are sent from a particular source node while another counter may be assigned to count all data packets (during a predetermined time interval) that are sent to a particular application in a particular destination node. Thus, the assignment of each counter in the counter array is configurable depending on the particular network and the particular user and what the particular user needs to monitor in the network.
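To make the counter array concrete, the following Python sketch shows one way such a structure can be maintained. It is a simplified illustration that mirrors the initialization and update steps of the Figure 4B pseudocode described next; the class name, the SHA-1 stand-in for the hash, and the flow-label tuple layout are assumptions rather than part of the patent.

```python
import hashlib

class CounterArraySketch:
    """Array of b counters indexed by a hash of a packet's flow label (Figures 4A/4B)."""

    def __init__(self, b: int):
        self.b = b
        self.counters = [0] * b  # initialization: every counter reset to the default value of zero

    @staticmethod
    def _hash(flow_label: tuple) -> int:
        # Stand-in for h(pkt.flow_label); a hardware implementation would use an H3 hash.
        digest = hashlib.sha1(repr(flow_label).encode()).digest()
        return int.from_bytes(digest[:8], "big")

    def update(self, flow_label: tuple) -> None:
        ind = self._hash(flow_label) % self.b  # index into the counter array
        self.counters[ind] += 1                # one read and one write to the same location

# Hypothetical usage: flow label = (source address, destination address, source port,
# destination port, protocol), as described above.
sketch = CounterArraySketch(b=1_000_000)
sketch.update(("10.0.0.1", "10.0.0.2", 5432, 80, "TCP"))
```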
Figure 4B illustrates an example of a piece of pseudocode 54 that implements the counter array data structure shown in Figure 4A. The pseudocode shows that, during an initialization process 56 (which may occur when the monitoring is started or to reset the sketch data structure when the predetermined time period (an epoch period such as 5 minutes) has expired), each counter in the counter array is reset to a default value that may be zero. During an update process 58, upon the arrival of each data packet, a hash function is performed on the flow label of the data packet (illustrated as h(pkt.flow_label) in the pseudocode) which generates an index value (ind) into the counter array, and the counter at that index location is incremented by one to indicate that a data packet with the particular set of one or more packet flow label attributes was monitored by the particular online streaming module. For each data packet, only one hash operation, one memory read and one memory write (to the same location) is performed, so the counter array is able to operate at OC-768 (40 Gbps) speed with off-the-shelf 10ns SRAM and an efficient hardware implementation of the H3 family of hash functions. The hash function used by the counter array (or the bitmap sketch described below) may be the known H3 family of hash functions described in the article by J. Carter and M. Wegman entitled "Universal classes of hash functions", Journal of Computer and System Sciences, pages 143-154 (1979), which is incorporated herein by reference. Each hash function in the H3 class is a linear transformation B^T = Q·A^T that maps a w-bit binary string A = a1a2...aw to an r-bit binary string B = b1b2...br.
Here a k-bit string is treated as a k-dimensional vector over the finite field GF(2) = {0, 1} and T stands for transposition. Q is an r x w matrix defined over GF(2) and its value is fixed for each hash function in H3. Multiplication and addition in GF(2) are boolean AND (denoted as ∘) and XOR (denoted as ⊕), respectively. Each bit of B is calculated as:
bi = (a1 ∘ qi,1) ⊕ (a2 ∘ qi,2) ⊕ ... ⊕ (aw ∘ qi,w),  i = 1, 2, ..., r
Since the computation of each output bit goes through log2 w stages of Boolean circuitry, and all output bits can be computed in parallel, the hash function can finish well within 10ns. The bit map sketch and the counter array sketch may also use other known or unknown hash functions, such as, for example, SHA-1 or MD5, and the invention is not limited to any particular hash function. Now, an example of another sketch data structure is described.
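For illustration, an H3-style hash can also be evaluated in software as follows. The description above targets a hardware implementation; the bit widths, the random choice of Q, and the function names below are assumptions for this sketch.

```python
import random

def make_h3(w: int, r: int, seed: int = 0):
    """
    Build one member of the H3 family: fix a random r x w matrix Q over GF(2) and return a
    function mapping a w-bit integer A to an r-bit integer B, where output bit i is the
    XOR (GF(2) addition) of the AND (GF(2) multiplication) of A with row i of Q.
    """
    rng = random.Random(seed)
    q_rows = [rng.getrandbits(w) for _ in range(r)]  # row i of Q stored as a w-bit mask

    def h3(a: int) -> int:
        b = 0
        for i, row in enumerate(q_rows):
            bit = bin(a & row).count("1") & 1  # parity of the selected bits gives b_i
            b |= bit << i
        return b

    return h3

# Example: map a 13-byte (104-bit) flow label to a 20-bit counter-array index.
h = make_h3(w=104, r=20, seed=42)
flow_label_bits = int.from_bytes(b"\x0a\x00\x00\x01\x0a\x00\x00\x02\x15\x38\x00\x50\x06", "big")
index = h(flow_label_bits)
```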
Figures 5A and 5B illustrate an example of a bitmap sketch data structure 60 used with the network monitoring device. In this example, the sketch data structure may include one or more bit positions 62 (1 to b for example) in an array (known as a bit map sketch) wherein each bit position has an index number (1 to b in the example shown in Figure 5A) associated with the bit position. Each bit position can have a value of "0" or "1" and be set to "1" based on the scanning of the data packets in the data stream performed by the online streaming module shown above. Furthermore, each bit position is associated with a particular data packet characteristic that uniquely identifies the data packet, wherein that portion of the data packet is input to the hash function. In particular, the invariant portion of a packet used as the input to the hash function must uniquely represent the packet and by definition should remain the same when it travels from one router to another. At the same time, it is desirable to make its size reasonably small to allow for fast hash processing. Therefore, the invariant portion of a packet consists of the packet header, where the variant fields (e.g., TTL, ToS, and checksum) are marked as 0's, and the first 8 bytes of the payload if there is any. As is known, these 28 bytes are sufficient to differentiate almost all non-identical packets.
Figure 5B illustrates an example of a piece of pseudocode 64 that implements the bitmap data structure shown in Figure 5A. The pseudocode shows that, during an initialization process 66 (which may occur when the monitoring is started or to reset the sketch data structure when the predetermined time period (an epoch period such as 5 minutes) has expired), each bit in the bitmap is reset to a default value that may be zero. Then, during an update process 68, upon the arrival of each data packet (pkt), a hash function is performed on the invariant portion of the packet (φ(pkt) in the pseudocode) which generates an index value (ind) into the bitmap, and the bit at that index location is set to "1" to indicate that a data packet with the particular invariant portion has been monitored by the collection unit. The hash function used by the bitmap sketch may be the same hash function used for the counter array sketch. As with the counter array set forth above, the bitmap, once it reaches a threshold level of fullness, is stored in a memory or disk and then communicated to the analysis unit as described above. The bitmap has the same performance characteristics as the counter array.
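A minimal software sketch of the bitmap update (assuming Python and, purely for illustration, SHA-1 as the shared hash; the class and method names are not from the original disclosure) is:

import hashlib

class BitmapSketch:
    # Bitmap sketch: one bit per hash bucket, set when any packet hashes to it.
    def __init__(self, b):
        assert b % 8 == 0                      # assume b is a multiple of 8 for simplicity
        self.b = b
        self.bits = bytearray(b // 8)          # initialization: all bits zero

    def _hash(self, invariant_bytes):
        # Stand-in for the shared hash; any function agreed on by all nodes works.
        digest = hashlib.sha1(invariant_bytes).digest()
        return int.from_bytes(digest[:8], "big") % self.b

    def update(self, invariant_bytes):
        ind = self._hash(invariant_bytes)
        self.bits[ind >> 3] |= 1 << (ind & 7)  # set the indexed bit to "1"

    def zero_count(self):
        # Number of "0" bits, used later by the traffic matrix estimator.
        ones = sum(bin(byte).count("1") for byte in self.bits)
        return self.b - ones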
For each of the sketch data structures described above, the sketch data structure trades off complete information about each data packet for a limited amount of information about each data packet on the link. Unlike a typical system in which the data packets are sampled (and only information about the sampled data packets is stored), the sketches store some information about every data packet due to the hash function and the bitmap or counter array.
In accordance with the invention, one of the monitoring applications that may be resident on the network monitoring device is an application that generates a traffic matrix or a flow matrix based on the sketches generated by the collection unit from the monitored packet stream. The traffic matrix provides an estimate of the volume of data packets over a link while the flow matrix provides an estimate of the volume of the data packets for a particular flow over a link.
Figure 6 illustrates a network 70 in which the traffic matrix and/or the flow matrix for a network link can be measured using the network device. The network 70 may include a source node 72₁ and a destination node 72₂ that may, for example, be a server computer in Seattle with a particular IP address and a server computer in Atlanta with a particular IP address. In the example, the source node 72₁ may run an email application that generates data packets over a particular port using the SMTP protocol that are destined for an email application of the destination node 72₂. As shown, a data packet from the source node to the destination node may pass through one or more intermediate nodes 74 that may be routers in the example shown in Figure 6, and the data packets can pass over various different links between the nodes during the transit of the data packets from the source node to the destination node. This network 70 may include the central monitoring unit 44 that houses the analysis unit 26 (not shown) that includes the monitoring application that generates traffic matrices and flow matrices for the network. The monitoring application may be a piece of software executed by the central monitoring unit that is a computer-based device. As shown in Figure 7, each node 74 may include the collection unit 24 associated with each link interface of each node. For example, a node connected to two different communications links would have a collection unit associated with each link or may have a single collection unit with two online streaming modules. The collection unit 24 as shown in Figure 7 may be integrated into the node 74 or the collection unit may be a separate piece of hardware. The traffic matrix or flow matrix is measured between a first observation point and a second observation point, wherein the volume of data packets between the first and second observation points is estimated (the traffic matrix) or the distribution of flow sizes (data packets in each flow) within the volume of data packets between the first and second observation points is estimated (the flow matrix). In accordance with the invention, the first observation point may be a link or node and the second observation point may also be a node or a link. For example, the flow matrix between the node in Seattle and the node in Atlanta (shown in Figure 6) may be estimated using the methods described herein.
Figure 8 illustrates a method 80 for monitoring a network link in accordance with the invention using a novel data streaming technique that processes a long stream of data items in one pass using a small working memory in order to answer a class of queries regarding the stream. The method may use the counter array sketch or the bitmap sketch described above to generate a traffic matrix or a flow matrix based on the data packets in the data stream monitored at the communications link. The monitoring method shown in Figure 8 may be implemented by the collection unit, and in particular the online streaming module, which is a plurality of lines of computer code executed by the node processor (the node being the hardware on which the online streaming module runs) that implements the steps described below. In step 82, the online streaming module waits for a packet on the link and, when there is a packet, extracts the invariant portion (or the flow label information when the counter array sketch is used) from the data packet in step 84, as described above. In step 86, the online streaming module performs a hash operation on the extracted data packet information and, as described above, generates an index into the sketch based on the data packet information so that, in step 88, the sketch position identified by the index is incremented (or a bit position is changed to "1" for the bitmap sketch). As described above, the counter array sketch or bitmap sketch data structure may be stored in a smaller amount of memory than the actual data packets. In step 90, the online streaming module determines if the sketch period has been exceeded (either the epoch period of the counter array is exceeded or the bitmap sketch has exceeded a threshold level of fullness). If the sketch period is not exceeded, the method loops back to step 82 to check for the next data packet on the link. If the sketch period is exceeded, then in step 92, the sketch is stored on the node, the sketch data structure is reset, the new sketch data structure begins to be filled with data, and the method loops back to step 82. The sketches may be communicated to the central monitoring unit on demand when an estimate is requested by a user or periodically communicated to the central monitoring unit. Once the sketches are communicated to the central monitoring unit (and the monitoring application), the monitoring application analyzes the sketches and generates an estimate of the volume of the data packets over the link based on the one or more bitmap sketches (a traffic matrix element) or generates an estimate of the volume of data packets for a particular flow over the link based on the one or more counter array sketches (a flow matrix).
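The loop of Figure 8 can be sketched in Python as follows. This is a simplified illustration: the packet iterator, the invariant_bytes attribute, and the purely time-based epoch test stand in for details of a real implementation, which may instead (or also) test a fullness threshold.

import time

def run_streaming_module(packets, make_sketch, epoch_seconds=5.0):
    # Online streaming loop of Figure 8: hash each packet into the current sketch,
    # and when the epoch ends, archive the sketch with its start timestamp and reset.
    archived = []                                    # pages kept locally until requested
    sketch, epoch_start = make_sketch(), time.time()
    for pkt in packets:                              # step 82: wait for a packet
        sketch.update(pkt.invariant_bytes)           # steps 84-88: extract, hash, update
        now = time.time()
        if now - epoch_start >= epoch_seconds:       # step 90: sketch period exceeded?
            archived.append((epoch_start, sketch))   # step 92: store page plus timestamp
            sketch, epoch_start = make_sketch(), now # reset for the next epoch
    return archived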
To generate an estimate of a traffic matrix element, TM_ij, two bitmap sketches are collected from the corresponding nodes i and j, and are fed to the monitoring application that is able to estimate the traffic matrix element as described below in more detail. Since only the bitmap sketches from the two nodes, i and j, are needed, the monitoring application can estimate a submatrix using the minimum amount of information possible, namely, only the bitmaps from the rows and columns of the submatrix. The estimation of the submatrix allows large ISP networks to focus on particular portions of their network. The estimation of the submatrix also permits the incremental deployment of the network monitoring device since the existence of non-participating nodes does not affect the estimation of the traffic submatrix between all participating ingress and egress nodes. To generate an estimate of a flow matrix, the counter array sketch is used. The counter array sketch permits the volume of data packets from a plurality of different flows (based on the flow labels) to be estimated. The counter array sketch may also be used to estimate the traffic matrix, even though it is less cost-effective than the bitmap sketch for that purpose.
Returning to Figure 3, the network 40 has one or more nodes 42 that may each have the online streaming module (in the collection unit 24 shown in Figure 3) that generates the sketches (either the bitmap sketches or the counter array sketches) that are thousands of times smaller than the raw data packet traffic. As described above, the sketches may be stored locally for a period of time, and will be shipped to a central monitoring unit 44 on demand. The data analysis unit 26 running at the central monitoring unit 44 obtains the sketches needed for estimating the traffic matrix and flow matrix through queries.
Traffic Matrix Estimation Using the BitMap Sketch
For the traffic matrix estimation, the online streaming module (within the collection unit 24) and the analysis unit 26 are used in combination with the bitmap sketch. The method for generating the bitmap sketch in the online streaming module was discussed above with respect to Figure 5B. To generate the traffic matrix estimation, the bitmap sketch is stored in the online streaming module (at each node using the same hash function and same bitmap size b) until the sketch is filled to a threshold percentage over a time interval wherein the time interval may be known as a "bitmap epoch". When the first bitmap sketch is filled, the bitmap sketch data structure is reset and then again filled with data.
Figure 9 illustrates a method 100 for estimating a traffic matrix in accordance with the invention that may be implemented by the monitoring application (that may be software executed by a processor that is part of the central monitoring unit) within the analysis unit 26 that may be within the central monitoring unit 44. In step 102, the monitoring application may determine if a traffic matrix estimate (TM_ij) between two nodes (i and j) during an interval, t, has been requested. If a traffic matrix estimate is requested, then in step 104, the monitoring application requests the bitmap(s) from the two nodes that are contained in or partly contained in the time interval, t. An example of the bitmaps from the two nodes over a time interval requested by the monitoring application is shown in Figure 10. In step 106, the analysis unit 26 receives the requested bitmaps from the nodes. In step 108, the monitoring application estimates the traffic matrix element (TM_ij) for the volume of data packets between the two nodes given the bitmaps delivered from the nodes. For purposes of an example, it is assumed that both node i and node j produce exactly one bitmap during the time interval (the measurement interval) when the traffic matrix is estimated. The estimator may be adapted from the article "A Linear-Time Probabilistic Counting Algorithm for Database Applications" by K.Y. Whang et al., ACM Transactions on Database Systems, pages 208-229 (June 1990), which is incorporated herein by reference, in which the estimator is used for databases. The set of packets arriving at the ingress node i during the measurement interval is T_i, the resulting bitmap is B_{T_i}, the number of "0" bits in B_{T_i} is U_{T_i}, and the size of the bitmap is b. The estimator D_{T_i} of the number of elements (packets) in T_i is:
D_{T_i} = b · ln(b / U_{T_i})    (1)

and TM_ij (the quantity to be estimated) is |T_i ∩ T_j|. An estimator for this quantity (also adapted from the above article) is:

D_{T_i} + D_{T_j} − D_{T_i∪T_j}    (2)

where D_{T_i∪T_j} is defined as b · ln(b / U_{T_i∪T_j}), and U_{T_i∪T_j} denotes the number of "0"s in B_{T_i∪T_j} (the result of hashing the set of packets T_i ∪ T_j into a single bitmap). The bitmap B_{T_i∪T_j} is computed as the bitwise-OR of B_{T_i} and B_{T_j}. It can be shown that D_{T_i} + D_{T_j} − D_{T_i∪T_j} is a good estimator of |T_i| + |T_j| − |T_i ∪ T_j|, which is exactly |T_i ∩ T_j|.
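As a concrete illustration, equations (1) and (2) can be computed directly from the two bitmaps and their bitwise-OR. This is a Python sketch under the assumptions that both bitmaps use the same size b and the same hash and that no bitmap is completely full; the function names are illustrative.

import math

def linear_count(zero_bits, b):
    # D_T = b * ln(b / U_T): linear-counting estimate of the packets hashed into a bitmap.
    return b * math.log(b / zero_bits)

def estimate_tm_element(bits_i, bits_j, b):
    # Estimate TM_ij = |T_i ∩ T_j| from two same-sized bitmaps (equations 1 and 2).
    u_i = b - sum(bin(x).count("1") for x in bits_i)
    u_j = b - sum(bin(x).count("1") for x in bits_j)
    union = bytes(x | y for x, y in zip(bits_i, bits_j))   # bitwise-OR = bitmap of T_i ∪ T_j
    u_union = b - sum(bin(x).count("1") for x in union)
    return linear_count(u_i, b) + linear_count(u_j, b) - linear_count(u_union, b)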
The computational complexity of estimating each element of the matrix is O(b) for the bitwise operation on the two bitmaps. The overall complexity of estimating the entire m × n matrix is therefore O(mnb). Note that the bitmaps from other nodes are not needed when we are only interested in estimating TM_ij. This poses a significant advantage in computational complexity over existing indirect measurement approaches, in which the whole traffic matrix needs to be estimated even if we are only interested in a small subset of the matrix elements, due to the holistic nature of the inference method.
The above estimate is for the ideal case with the following three assumptions:
(i) The measurement interval is exactly one bitmap epoch. Practically, some network management tasks such as capacity planning and routing configuration need the traffic matrices on long time scales such as tens of minutes or a few hours. Each epoch in our measurements is typically much smaller, especially for high speed links. Therefore we need to extend our scheme to support arbitrary time scales.
(ii) The clocks on nodes i and j are perfectly synchronized. Using GPS synchronization, or more cost-effective schemes, clocks at different nodes only differ by tens of microseconds. Since each bitmap epoch is typically one to several seconds at OC-192 or OC-768 speeds (even longer for lower link speeds), the effect of clock skew on our measurement is negligible. Due to the high price of GPS cards today, the standard network time protocol (NTP) is most commonly used to synchronize clocks. As we show later in this section, our measurements still work accurately with relatively large clock skews (e.g., tens of milliseconds, as one may get from clocks synchronized using NTP).
(iii) The bitmap epochs between nodes i and j are well aligned. Traffic going through different nodes can have rates orders of magnitude different from each other, resulting in some bitmaps being filled up very fast (hence a short bitmap epoch) and some others filled up very slowly (hence a long bitmap epoch). We refer to this phenomenon as heterogeneity. Because of heterogeneity, the bitmap epochs on different nodes may not be well aligned.
As mentioned above, the above estimation was for the ideal case. Now, the general case for the estimation is explained. For the general case, it is assumed that the measurement interval spans exactly bitmap epochs 1, 2, ..., k_1 at node i and bitmap epochs 1, 2, ..., k_2 at node j, respectively. Then the traffic matrix element TM_ij can be estimated as
TM_ij = Σ_{q=1}^{k_1} Σ_{r=1}^{k_2} N_{q,r} × overlap(q, r)    (3)

where N_{q,r} is the estimate of the common traffic between the bitmap (also known as a page) q at node i and the page r at node j, and overlap(q, r) is 1 when the page q at node i overlaps temporally with page r at node j and is 0 otherwise. To determine whether two bitmap epochs overlap with each other temporally, the timestamps of their starting times are stored along with the pages in a process known as "multipaging". The multipaging process eliminates the assumptions set forth above. For example, the multipaging process supports measurements over multiple epochs, so that the first assumption is eliminated. As shown in Figure 10, multipaging can be applied to the case where the measurement interval does not necessarily align with the epoch starting times. As shown in Figure 10, an exemplary measurement interval 110 corresponds to the rear part of epoch 1, epochs 2 and 3, and the front part of epoch 4 at node i. The exemplary measurement interval also corresponds to the rear part of epoch 1, epoch 2, and the front part of epoch 3 at node j. Based on equation 3 above, we need to add up the terms N_{1,1}, N_{2,1}, N_{2,2}, N_{3,2}, N_{3,3}, and N_{4,3} based on their temporal overlap relationships. However, this would be more than TM_ij because the measurement interval only has the rear part of epoch 1 and the front part of epoch 4 at node i. The solution is to adjust N_{1,1} to the proportion of epoch 1 that overlaps with the measurement interval. N_{4,3} is also adjusted proportionally. Since traffic matrix estimation is typically on time scales of tens of minutes, which will span many pages, the inaccuracies resulting from this proportional rounding are negligible.
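A small Python sketch of this aggregation is given below; it is illustrative only. Pages are assumed to carry start/end timestamps and their bitmap bytes, the slack parameter anticipates the clock-skew and traversal-time adjustments discussed next, the proportional adjustment of boundary pages is omitted, and estimate_tm_element is the function sketched above.

def overlaps(page_q, page_r, slack=0.0):
    # 1 if the two pages overlap temporally; slack can absorb clock skew and
    # packet traversal time, as discussed below.
    return int(page_q.start <= page_r.end + slack and
               page_r.start <= page_q.end + slack)

def estimate_tm_multipage(pages_i, pages_j, b, slack=0.0):
    # Equation (3): sum the pairwise common-traffic estimates over overlapping pages.
    total = 0.0
    for q in pages_i:
        for r in pages_j:
            if overlaps(q, r, slack):
                total += estimate_tm_element(q.bits, r.bits, b)
    return total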
The assumption (ii) above can be eliminated by combining the clock skew factor into the definition of "temporal overlapping" in equation 3. Using the example in Figure 10, epoch 1 at node i does not overlap temporally with epoch 2 at node j visually. But if the interval between the end of epoch 1 at node i and the start of epoch 2 at node j is smaller than an upper bound T_1 on the clock skew (e.g., 50 ms for an NTP-enabled network), we still consider them temporally overlapping.
A remaining hurdle is extending equation 3 to handle packets in transit. Returning to Figure 10, if there are packets departing from i in epoch 1 (at node i) and arriving at j in epoch 2 (at node j) due to nontrivial traversal time from i to j, our measurement will miss these packets because only N_{1,1} is computed. This can be easily fixed using the same method used above to eliminate assumption (ii), i.e., combining another upper bound T_2 on the traversal time (e.g., 50 ms) into the definition of "temporal overlapping". In other words, if the interval between the end of epoch 1 at node i and the start of epoch 2 at node j is within T_1 + T_2, it should be labeled "temporally overlapping" (overlap(1, 2) = 1) and join the estimation.

Sampling with the Bit Map Sketch
In some situations, it is desirable to store the bitmaps for a long period of time for later troubleshooting, which could result in huge storage complexity for very high speed links, but sampling can be used to reduce this requirement significantly. To sample the data packets, the impact on the accuracy should be minimized, but it is desirable to use DRAM to conduct online streaming for very high speed links (e.g., beyond OC-192) and it is important to sample only a certain percentage of the packets so that the DRAM speed can keep up with the data stream speed. Suppose there is a hard resource constraint on how much storage the online streaming module can consume every second. For example, the constraint can be one bitmap of 4 Mbits per second, and suppose we have 40 million packets arriving within one second. One option is that the process does no sampling, but hashes all of these packets into the bitmap, referred to as "squeezing". But the resulting high load factor of approximately 10 would lead to high estimation error. An alternative option is to sample only a certain percentage p of packets to be squeezed into the bitmap, and many different p values can be chosen. For example, we can sample 50% of the packets and thereby squeeze 20 million sampled packets into the bitmap, or we can sample and squeeze only 25% of them, so it is necessary to determine an optimal value of p. On one extreme, if we sample at a very low rate, the bitmap will only be lightly loaded and the error of estimating the total sampled traffic, as well as its common traffic with another node (a traffic matrix element), becomes lower. However, since the sampled traffic is only a small percentage of the total traffic, the overall error will be blown up by a large factor, as discussed below. On the other extreme, if we sample with very high probability, the error from sampling becomes low but the error from estimating the sampled traffic becomes high. The optimal value of p may be determined based on the following principle:
PRINCIPLE 1. If the expected traffic demand in a bitmap epoch does not make the resulting load factor exceed t*, no sampling is needed. Otherwise, the sampling rate p* should be set so that the load factor of the sampled traffic on the bitmap is approximately t*.

To illustrate this principle, a scenario is used in which each ingress node and egress node coordinate to use the same load factor t after sampling, which allows us to optimize t for estimating most of the traffic matrix elements accurately. Then, the process minimizes the error of an arbitrary N_{q,r} term shown in equation 3. Recall that N_{q,r} is the amount of common traffic between two overlapping bitmap pages, page q at node i and page r at node j. We denote N_{q,r} and its estimator as X and X̂, respectively. Also, let α and p_α be the aforementioned page q at node i and the corresponding sampling rate, respectively. Similarly, β and p_β denote page r at node j and the corresponding sampling rate, respectively. Note that each overlapping page pair may have its own optimal t* to achieve the optimal accuracy of estimating its common traffic. Therefore it is impossible to adapt t* to satisfy every other node, as their needs (the t* for optimal accuracy) conflict with each other. Therefore, a default t* for every node is identified such that the estimation accuracy for the most common cases is high. The optimal p* and t* between pages α and β are determined given the expected traffic demand in a bitmap epoch; in fact, only one of them needs to be determined since the other follows from Principle 1. Below, a sampling technique called consistent sampling is used to significantly reduce the estimation error. With consistent sampling, X̂, the estimator of X, is given by X̂ = (1/p)·N̂, where p = min(p_α, p_β) and N̂ is the estimation result (by equation 2) on the sampled traffic. We denote by T the total sampled traffic volume squeezed into the page with sampling rate p. The following theorem characterizes the variance of X̂.
THEOREM 2. The variance of X̂ is approximately given by the sum of two terms. The first term corresponds to the variance of estimating the sampled traffic (by equation 2 above) scaled by 1/p² (to compensate for the sampling), and the second term corresponds to the variance of the sampling process. Since these two errors are orthogonal to each other, their total variance is the sum of their individual variances.
The average error of X̂, which is equal to the standard deviation of the ratio X̂/X since X̂ is an almost unbiased estimator of X, is given by:

√(Var(X̂)) / X    (5)
We have performed Monte-Carlo simulations to verify the accuracy of the above formula since there is some approximation in its derivation. The following parameter settings are used in the simulations: 1) the size b of the bitmap is 1M bits and both the ingress page and the egress page have the same load factor of 10 without sampling, so that 10M packets each are to be squeezed, or sampled and then squeezed, into the bitmaps; and 2) among both sets of 10M packets, 1M packets are common between the ingress page and the egress page (i.e., X = 1M packets). Figure 11 shows that the observed average error from the Monte-Carlo simulation matches well with the value from Equation 5. The curve verifies the estimation error caused by sampling and squeezing. When the load factor is very low, the sampling error dominates; when the load factor becomes large, the error caused by "excessive squeezing" is the main factor. At the optimal t* (≈ 0.4), the scheme achieves the smallest average relative error (≈ 0.012).
Since the optimal t* value is a function of T and X, setting it according to some global default value may not be optimal all the time. Fortunately, we observe through extensive experiments that t*, the optimal load factor, does not vary much for different T and X values. In addition, we can observe from Figure 11 that the curve is quite flat in a large range around the optimal load factor. For example, the average errors corresponding to any load factor between 0.09 and 1.0 only fluctuate between around 0.012 and 0.015. Combining the above two observations, we conclude that by setting a global default load factor t* according to some typical parameter settings, the average error will stay very close to the optimal values. Throughout this work we set the default load factor to 0.7.
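In code, Principle 1 reduces to a one-line rate choice (a Python illustration; the function name is hypothetical and the default t* = 0.7 mirrors the discussion above):

def choose_sampling_rate(expected_packets, b, t_star=0.7):
    # Principle 1: no sampling if the expected load factor stays below t*;
    # otherwise pick p* so that the sampled load factor is about t*.
    expected_load = expected_packets / b
    return 1.0 if expected_load <= t_star else t_star / expected_load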
Consistent sampling

If the sampling is performed randomly and independently, only a p_α·p_β fraction of the common traffic that goes from page α to page β is recorded by both nodes on average. To estimate X, although it is possible to obtain an unbiased estimate by blowing up the estimate of the sampled portion by 1/(p_α·p_β), doing so also blows up the error of our estimate by 1/(p_α·p_β). To address this problem, a consistent sampling scheme is used which has the desirable property that, when p_α ≤ p_β, among the set of packets that go from page α to page β, the data packets sampled and squeezed into page α are a subset of those sampled and squeezed into page β, and vice versa. In this way, a min(p_α, p_β) fraction of the traffic between page α and page β will be sampled by both nodes and the error of our estimation will only be blown up by 1/min(p_α, p_β) times.

The consistent sampling scheme works by fixing a hash function h' (different from the aforementioned h, which is used to generate the bitmap) that maps the invariant portion of a packet to an l-bit binary number. The range of the hash function h' is {0, 1, ..., 2^l − 1}. If a node would like to sample packets with rate p = c/2^l, it simply samples the set of packets {pkt | h'(φ(pkt)) < c}. If l is sufficiently large, any desirable sampling rate p can be approximated by c/2^l for some c between 1 and 2^l. When every node uses the same hash function h', the above property is achieved because, when p_α = c_1/2^l ≤ c_2/2^l = p_β, the set of packets sampled and squeezed into page α, {pkt | h'(φ(pkt)) < c_1}, is clearly a subset of those sampled and squeezed into page β, {pkt | h'(φ(pkt)) < c_2}.
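A minimal Python illustration of consistent sampling follows; the choice of MD5 and the salt used to derive h' are purely illustrative assumptions, and any hash shared by all nodes and independent of h would do.

import hashlib

def h_prime(invariant_bytes, l=32):
    # Second hash h', independent of the bitmap hash h; maps a packet to an l-bit value.
    digest = hashlib.md5(b"h-prime" + invariant_bytes).digest()
    return int.from_bytes(digest[:8], "big") % (1 << l)

def consistently_sampled(invariant_bytes, p, l=32):
    # Sample with rate p = c / 2^l by keeping packets with h'(pkt) < c. All nodes use
    # the same h', so a node sampling at a lower rate keeps a subset of the packets
    # kept by a node sampling at a higher rate.
    c = int(p * (1 << l))
    return h_prime(invariant_bytes, l) < c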
Flow and Traffic Matrix Estimation Using the Counter Array Sketch
The flow matrix contains finer grained information than the traffic matrix, and the counter array sketch described above may be used to estimate the flow matrix. The flow matrix is the traffic matrix combined with the information on how each OD element is split into flows of different sizes. Formally, a flow matrix element FM_ij is the set of sizes of flows that travel from node i to node j during a measurement interval. Thus, a traffic matrix element TM_ij is simply the summation of all the flow sizes in FM_ij, that is, TM_ij = Σ_{s ∈ FM_ij} s. Thus, the counter array sketch can be used to estimate the traffic matrix as well. For the flow matrix estimation (or traffic matrix estimation) using the counter array, the online streaming module (within the collection unit 24) and the analysis unit 26 are used in combination with the counter array sketch. The method for generating the counter array sketch (using the flow label described above) in the online streaming module was discussed above with respect to Figure 4B. To generate the flow matrix estimation, the counter array sketch is stored on the online streaming module (at each node using the same hash function and the same array size b). Since, for each packet, this process requires only one hash operation, one memory read and one memory write (to the same location), the online streaming module can operate at OC-768 (40 Gbps) speed with off-the-shelf 10ns SRAM and an efficient hardware implementation of the H3 family of hash functions.
Due to the delicate nature of the data analysis process (discussed below in more detail) for the counter array scheme, much more stringent requirements are placed on the online streaming module. First, unlike in the bitmap scheme, the counter array scheme is holistic in the sense that all ingress and egress nodes have to participate. Second, unlike in the bitmap scheme, the counter epochs in this scheme need to be aligned with each other, that is, all counter epochs at all ingress and egress nodes need to start and end at approximately the same time. The practical implication is that the counter array size b needs to be large enough to accommodate the highest link speed among all nodes (i.e., the worst case). Similar to the definition of "bitmap epoch", we refer to the amount of time the highest-speed link takes to fill up the counter array to a threshold percentage as a "counter epoch", or epoch for short.
Given these constraints, the memory and storage complexities of the online streaming module for the counter array scheme are explored. For the memory, depending on the maximum link speed among all the nodes, the counter epoch ranges from one to a few tens of seconds, and very accurate estimates can be achieved by setting the number of counters in the array to around the same order as the number of flows during an epoch. Therefore, for an OC-192 or an OC-768 link, one to a few million counters need to be employed for a measurement interval of one to a few seconds. If each counter has a "safe" size of 64 bits to prevent overflow, the memory requirement would be quite high. Fortunately, this requirement is reduced to 9 bits per counter (using an existing technique), an 85% reduction, so that a million counters will only cost 1.1 MB of SRAM. The key idea of this technique is to keep short counters in SRAM and long counters in DRAM. When a short counter in SRAM exceeds a certain threshold value due to increments, the value of this counter is "flushed" to the corresponding long counter in DRAM. As for storage complexity, at the same link speed, the storage complexity is even smaller than that of the bitmap scheme. In the bitmap scheme, each packet results in 1 or 2 bits of storage. In the counter array, each flow results in 64 bits of storage. However, since most of the counter values are small, resulting in many repetitions of small counter values, Huffman-type compression can easily reduce the storage complexity to only a few bits per counter. Since the average flow length is about 10 packets (observed in our evaluation described below), the average storage cost per packet is amortized to less than 1 bit.
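A software stand-in for this counter-flushing idea is sketched below (illustrative Python only; real implementations place the short counters in SRAM and the long counters in DRAM, and the 9-bit width is the figure quoted above).

class HybridCounters:
    # Small counters in fast memory, full counters in slow memory; a small counter
    # that reaches its ceiling is flushed (added) into the corresponding large counter.
    def __init__(self, b, short_bits=9):
        self.ceiling = (1 << short_bits) - 1
        self.short = [0] * b          # stands in for the SRAM counters
        self.long = [0] * b           # stands in for the DRAM counters

    def increment(self, ind):
        self.short[ind] += 1
        if self.short[ind] >= self.ceiling:
            self.long[ind] += self.short[ind]   # flush to the large counter
            self.short[ind] = 0

    def value(self, ind):
        return self.long[ind] + self.short[ind]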
Once there is a need to estimate the flow matrix during a measurement interval, the counter arrays for that interval need to be shipped to the central monitoring unit for analysis. If the measurement interval spans more than one epoch, the sketches in each epoch are processed independently.
Thus, a data analysis process for estimating the flow matrix from the counter arrays during a single epoch is described first. Let I^k = {C_{I_1}[k], C_{I_2}[k], ..., C_{I_m}[k]}, where C_{I_i}[k], i = 1, ..., m, is the value of the kth counter at the ingress node I_i; and let E^k = {C_{E_1}[k], C_{E_2}[k], ..., C_{E_n}[k]}, where C_{E_j}[k], j = 1, ..., n, is the value of the kth counter at the egress node E_j. Since every node uses the same hash function, packets recorded in I^k have an approximate one-to-one correspondence with packets recorded in E^k, i.e., Σ_{i=1}^{m} C_{I_i}[k] ≈ Σ_{j=1}^{n} C_{E_j}[k]. The approximation comes from the fact that the clock is not perfectly synchronized at all nodes and the packet traversal time from an ingress node to an egress node is non-zero. Both factors only have a marginal impact on the accuracy of our estimation. In addition, the impact of this approximation on the accuracy of the data analysis process is further alleviated by the "elephant matching" nature of the data analysis process described below.
Now, an ideal case in which there is no hash collision on index k in any counter array, so that each counter value represents a flow of that size, is considered. In this ideal case, it is further assumed that: (i) the number of ingress nodes is the same as the number of egress nodes (i.e., m = n); (ii) the elements in I^k are all distinct; and (iii) the flows in I^k all go to distinct elements of E^k. Then the values in E^k are simply a permutation of the values in I^k. Thus, a straightforward "one-to-one matching" can be used for this ideal case.
In reality, none of the above assumptions is true, for various reasons. For example, at an ingress node, multiple flows can collide into one counter, flows from multiple ingress nodes can collide into the same counter at an egress node, and m is in general not equal to n. Therefore, the one-to-one matching cannot be used in general. However, since there are only a small number of medium to large flows due to the Zipfian nature of Internet traffic, matching medium flows (known as kangaroos) and large flows (known as elephants) between ingress and egress nodes turns out to work very well, though it does not work well on small flows. Figure 12 is a piece of pseudocode that illustrates a method 120 for estimating the flow matrix by matching counter values at index k, wherein the large and medium flows are matched. For each index k in all the counter arrays, the steps shown in lines 2-13 of Figure 12 are executed. In the matching process, the largest ingress counter value C_{I_max_i}[k] is matched with the largest egress counter value C_{E_max_j}[k]. The smaller of the two values is considered a flow from max_i to max_j (determined in lines 6 and 10 of the pseudocode), and this value is subtracted from both counter values (see lines 7 and 11 of the pseudocode), which reduces the smaller counter value to 0. The above sequence of steps is repeated until either all ingress counters or all egress counters at index k become 0. When there is a tie on the maximum ingress or egress counter values, a random tie-breaking is performed. The above process produces surprisingly accurate estimation of the flow matrix for medium to large flows.
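A Python sketch of this greedy matching is shown below; it is illustrative only. Ties are broken deterministically here rather than randomly, node indices stand in for node identifiers, and linear scans are used for clarity rather than the logarithmic-time search assumed in the complexity analysis that follows.

def match_counters(ingress, egress):
    # Greedy matching of Figure 12 at one index k: repeatedly pair the largest remaining
    # ingress counter with the largest remaining egress counter and credit the smaller
    # of the two values to that (ingress, egress) pair.
    ingress, egress = list(ingress), list(egress)
    flows = []                                    # (ingress node, egress node, estimated size)
    while any(v > 0 for v in ingress) and any(v > 0 for v in egress):
        i = max(range(len(ingress)), key=lambda x: ingress[x])
        j = max(range(len(egress)), key=lambda x: egress[x])
        size = min(ingress[i], egress[j])
        flows.append((i, j, size))
        ingress[i] -= size
        egress[j] -= size
    return flows

def estimate_flow_matrix(ingress_arrays, egress_arrays, b):
    # ingress_arrays[i][k] is counter k at ingress node i; likewise for egress_arrays.
    fm = []
    for k in range(b):
        fm.extend(match_counters([a[k] for a in ingress_arrays],
                                 [a[k] for a in egress_arrays]))
    return fm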
The computational complexity of the process shown in Figure 12 is O((m + n − 1)(log m + log n)) because the binary searching operation (lines 3 and 4 of the pseudocode that determine the largest ingress and egress counter values) dominates the complexity of each iteration and there are at most m + n − 1 iterations. Thus the overall complexity of estimating the flow matrix is O(b(m + n − 1)(log m + log n)).
As is well known, an exact flow matrix can be used to indicate intrusions such as DDoS attacks. The estimation process in Figure 12 provides accurate estimates of the medium and large flow matrix elements, while some typical intrusions (e.g., DDoS attacks) consist of a large number of small flows. Nevertheless, the process shown in Figure 12 can still be used to provide valuable information about intrusions. In particular, the flow label of the process can be selected to provide the valuable information. For example, to detect DDoS attacks, the flow label can be the destination IP address of a packet so that the traffic of a DDoS attack becomes a large flow going through the network instead of a large number of small ones.
In order to obtain the identity of a flow (the flow label), the method may include other known sampling or streaming processes as set forth in "New Directions in Traffic Measurement and Accounting", C. Estan and G. Varghese, Proceedings of ACM SIGCOMM, August 2002, which is incorporated herein by reference. As set forth above, the traffic matrix can also be obtained by adding up the sizes of all the flows that are determined to go from node i to node j using the above process. This is in fact a fairly accurate estimation of the traffic matrix since the process tracks kangaroos and elephants very accurately and thus accounts for the majority of the traffic.
Evaluation of Flow Matrix and Traffic Matrix Estimates

Although an ideal evaluation of our traffic matrix and flow matrix estimation mechanisms would require packet-level traces collected simultaneously at hundreds of ingress and egress routers in an ISP network for a certain period of time (which is very time-intensive and may be impossible to accomplish), two existing data sets are used: 1) synthetic traffic matrices generated from publicly available packet-level traces from NLANR (http://pma.nlanr.net); and 2) actual traffic matrices from a tier-1 ISP, wherein the ISP traffic matrix was produced in "Fast Accurate Computation of Large-Scale IP Traffic Matrices from Link Loads", Y. Zhang et al., Proceedings of ACM SIGMETRICS, June 2003, and corresponds to traffic during a one-hour interval in a tier-1 network.
We adopt two known performance metrics (proposed in the "Fast Accurate" article above), the Root Mean Squared Error (RMSE) and the Root Mean Squared Relative Error (RMSRE), for evaluating the accuracy of the estimated traffic matrix.
RMSE = sqrt( (1/N) Σ_{i=1}^{N} (x̂_i − x_i)² )

RMSRE = sqrt( (1/N_T) Σ_{i=1, x_i > T}^{N} ((x̂_i − x_i) / x_i)² )

where x_i denotes a true traffic matrix element and x̂_i its estimate. The RMSE provides an overall measure of the absolute errors in the estimates, while the RMSRE provides a relative measure. Note that the relative errors for small matrix elements are usually not very important for network engineering, so only matrix elements greater than some threshold T are used in the computation of the RMSRE (properly normalized). In the above equations, N_T refers to the number of matrix elements greater than T, i.e.,

N_T = |{x_i | x_i > T, i = 1, 2, ..., N}|.
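For reference, the two metrics can be computed directly with a small Python helper (function and parameter names are illustrative; estimates and actuals correspond to x̂ and x):

import math

def rmse(estimates, actuals):
    n = len(actuals)
    return math.sqrt(sum((e - a) ** 2 for e, a in zip(estimates, actuals)) / n)

def rmsre(estimates, actuals, threshold):
    # Only elements larger than the threshold T enter the relative-error metric.
    pairs = [(e, a) for e, a in zip(estimates, actuals) if a > threshold]
    if not pairs:
        return 0.0
    return math.sqrt(sum(((e - a) / a) ** 2 for e, a in pairs) / len(pairs))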
NLANR trace-driven Evaluation

The set of traces used consists of 16 publicly available packet header traces from NLANR.
The number of flows in these traces varies from 170K to 320K and the number of packets varies from 1.8M to 3.5M. To evaluate the matrices, a synthetic scenario is constructed that appears as if these traces were collected simultaneously at all ingress nodes of a network. We set up the experimental scenario as follows: there are 16 ingress nodes and 16 egress nodes in the measurement domain. Each trace corresponds to the packet stream for one ingress node. The challenge in constructing this scenario lies in assigning the flows in the input stream at an ingress node to 16 different egress nodes such that the generated matrix will reflect some properties of real traffic matrices.
Recent work shows that the Internet has "hot spot" behavior, so that a few OD pairs have very large traffic volume, while the majority of OD pairs have substantially less volume between them. Following the observed quantitative properties of real Internet traffic matrices, for each ingress node, we randomly divide the 16 egress nodes into three categories: 2 nodes belonging to the large category (large data flow), 7 nodes belonging to the medium category, and the remaining 7 nodes belonging to the small category. For each flow at an ingress node, we assign an egress node randomly, with nodes belonging to the large category twice as likely to be picked as medium category nodes, which in turn are twice as likely to be picked as small category nodes. For simplicity, we configure the size of the bitmap and the counter array to fit the data set size without adopting the enhancement techniques (i.e., multipaging and sampling). Thus, we set the size of the bitmap to 2,880K bits and the size of the counter array to 320K counters, which occupy approximately 2,880K bits of fast memory (SRAM).
Figures 13A and 13B compare the estimated traffic matrix elements using the bitmap scheme (Figure 13A) and the counter array scheme (Figure 13B) with the original traffic matrix elements. The solid diagonal line in each figure denotes a perfect estimation, while the dashed lines denote an estimation error of ±5%, so that points closer to the diagonal mean a more accurate estimate. As shown in the figures, both schemes are very accurate, and the bitmap scheme is more accurate than the counter array scheme.
Figure 14 shows the impact of varying T on the RMSRE. We observe that both schemes produce very close estimates for the large and medium matrix elements. The traffic volume of the thresholded matrix elements decreases as the threshold increases, and the performance improves. For example, the RMSRE actually drops to below 0.05 for the top 70% of traffic for the counter array scheme. For the bitmap scheme, it drops even further to below 0.01 for the top 70% of traffic. In absolute terms, the RMSEs of the bitmap scheme and the counter array scheme are equal to 4,136 packets and 11,918 packets, respectively, which are very small in comparison to the average traffic on a node. All of the above results confirm that the bitmap scheme achieves higher accuracy than the counter array scheme. The overall RMSRE of the bitmap scheme is below 6%, and that of the counter array scheme evolves from around 1% for large elements to 16% for the overall elements.
Note that the results above reflect relative accuracy on a small time scale (one to several seconds for high speed routers), and they should not be directly compared with other reported results since those results are on much larger time scales. The schemes usually can achieve much higher relative accuracy on larger time scales (e.g., tens of minutes) as shown below.
For the evaluation, we also compare the average (relative) error between our bitmap scheme and the existing sampling-based schemes such as NetFlow. We adopt a method similar to that in "Deriving Traffic Demands for Operational IP Networks", A. Feldmann et al., Proceedings of ACM SIGCOMM, August 2000, to infer the traffic matrix by collecting the packets at each ingress node with the typical NetFlow sampling rate of 1/500 (which generates a similar amount of data per second as our bitmap scheme) and inferring the traffic matrix according to the egress nodes we assigned above for each sampled flow. Here, the variance of the estimated TM_ij is given by TM_ij(1 − p)/p, where p is the sampling rate 1/500. Figure 15 plots the average error of each element of the traffic matrix in the trace-driven experiments for both our bitmap scheme and the sampling-based scheme. We observe that our bitmap scheme achieves a consistently higher accuracy than the sampling-based scheme.

Now, the accuracy of flow matrix estimation is evaluated using the NLANR data. Note that our flow matrix estimation is counter-based and cannot distinguish flows which are hashed to the same location with the same ingress node and egress node (we call this an indistinguishable collision). Our goal is to accurately estimate the medium and large flow matrix elements. We observe that indistinguishable collisions happen rarely for medium and large flows. In our experiments, among the total 4 million flows with an average size of about 10 packets, there are only 41 out of 71,345 medium and large flows (> 10 packets) which suffer from this collision. Thus the impact of such indistinguishable collisions on the estimation accuracy is negligible. Figure 16 shows the RMSREs for various thresholds T. We observe a sharp downward trend in the value of the RMSRE for increasing threshold values. When the threshold is equal to 10 packets, the error drops to below 15%. The accurate estimation of these flows is very important since, in this trace, flows of size 10 and above (71,345 of them) account for 87% of the total traffic.
Evaluation based on ISP traffic matrices
For the evaluation using the ISP traffic matrices, a one-hour router-level traffic matrix from a tier-1 ISP network is obtained to analytically evaluate the accuracy of the bitmap scheme. We assume that the traffic volume between each pair of backbone routers is evenly distributed over the one-hour time period. An hour's traffic is too large (we assume a conservative average packet size of 200 bytes) to fit in a single bitmap, and therefore the aforementioned multipaging technique is used. Given a traffic matrix, we split the traffic on each ingress/egress node into multiple pages of 4 Mbits (i.e., 512KB) with load factor 0.7 (the default load factor described above). Then, we compute the standard deviation for each pair of overlapping pages using Theorem 1. The sum of the standard deviations divided by the real matrix element value gives us the predicted error for the entire 1-hour interval. Figure 17 shows the cumulative distribution of traffic with the analytically predicted average error. We observe that our bitmap scheme provides very accurate results. Over 98% of the traffic has negligible average error (< 0.03) and the error for around 80% of the traffic is even below 0.005. Compared with the result in "An Information-Theoretic Approach to Traffic Matrix Estimation", Y. Zhang et al., Proceedings of ACM SIGCOMM, August 2003, the above-described bitmap process improves the accuracy by more than an order of magnitude. For example, the error for around 80% of the traffic in the article above is about 20%. In addition, the average error across all traffic matrix elements in our estimation is around 0.5%, which is also more than an order of magnitude lower than that in the above article (i.e., 11.3%).
While the foregoing has been with reference to a particular embodiment of the invention, it will be appreciated by those skilled in the art that changes in this embodiment may be made without departing from the principles and spirit of the invention, the scope of which is defined by the appended claims.

Claims:
1. A network monitoring system, comprising: a collection unit that monitors a packet stream over an observation point in a network, the collection unit further comprising a sketch generator that generates a sketch data structure during a sketch period, the sketch data structure containing one or more pieces of information about each data packet in the packet stream during the sketch period; and an analysis unit that, in response to a request for a data packet volume between a first observation point and a second observation point during a measurement interval, receives one or more sketches from each observation point whose sketch period is within the measurement interval, the analysis unit further comprising a monitoring application that generates a data packet volume estimate for data packets between the first and second observation points based on the one or more received sketches from each observation point.
2. The system of claim 1 , wherein the sketch further comprises a bitmap sketch having a plurality of bit positions each capable of storing a low value or a high value and wherein the sketch generator further comprises a hash function that hashes the invariant portion of each data packet to generate an index value wherein each index value corresponds to a bit position of the bitmap sketch so that each bit position having a high value indicates that a data packet whose invariant portion hashes to that index value has been monitored at the collection unit.
3. The system of claim 2, wherein the hash function further comprises an H3 family of hash functions.
4. The system of claim 2, wherein the monitoring application further comprises a traffic matrix estimator that estimates the volume of data packets between the first and second observation points based on the bitmap sketches.
5. The system of claim 1, wherein the sketch further comprises a counter array sketch having a plurality of counters each capable of storing a count wherein the sketch generator further comprises a hash function that hashes an attribute of a flow label of each data packet to generate an index value wherein each index value corresponds to a counter of the counter array so that each counter contains a count of a number of data packets whose flow label attribute hashes to that index value that have been monitored at the collection unit.
6. The system of claim 5, wherein the hash function further comprises an H3 family of hash functions.
7. The system of claim 5, wherein the flow label attribute for each data packet further comprises one or more of a source address for the data packet, a destination address for the data packet, a source port for the data packet, a destination port for the data packet and a protocol for the data packet.
8. The system of claim 6, wherein the monitoring application further comprises a flow matrix estimator that estimates the flow size distribution between the first and second observation points based on the counter array sketches.
9. The system of claim 1, wherein the collection unit further comprises a piece of software executed on a node of the network.
10. The system of claim 9, wherein the analysis unit further comprises a piece of software executed on a central monitoring unit that is remote from the network.
11. The system of claim 1 , wherein the analysis unit further comprises a piece of software executed on a central monitoring unit that is remote from the network.
12. The system of claim 11, wherein the node further comprises one of a switch, a router, a point of presence and an application server and wherein the central monitoring unit further comprises a server computer.
13. The system of claim 1, wherein the first observation point further comprises one of a node in a network and a link in a network and wherein the second observation point further comprises one of a node in a network and a link in a network.
14. A method for monitoring a network, comprising: monitoring the packet stream over an observation point in a network using a collection unit; generating a sketch data structure during a sketch period, the sketch data structure containing one or more pieces of information about each data packet in the packet stream during the sketch period; receiving, in response to a request for a data packet volume estimate between a first observation point and a second observation point in the network during a measurement interval, one or more sketches at each observation point whose sketch period is within the measurement interval; and generating an estimate of the volume of data packets between the first and second observation points based on the one or more received sketches.
15. The method of claim 14, wherein generating the sketch data structure further comprises generating a bitmap sketch having a plurality of bit positions each capable of storing a low value or a high value and hashing an invariant portion of each data packet to generate an index value wherein each index value corresponds to a bit position of the bitmap sketch so that each bit position having a high value indicates that a data packet whose invariant portion hashes to that index value has been monitored at the collection unit.
16. The method of claim 15, wherein hashing the invariant portion of the data packet further comprises hashing the invariant portion of the data packet using an H3 family of hash functions.
17. The method of claim 15, wherein generating the estimate further comprises generating a traffic matrix estimate that estimates the volume of data packets between the first and second observation points based on the bitmap sketches.
18. The method of claim 14, wherein generating the sketch data structure further comprises generating a counter array sketch having a plurality of counters each capable of storing a count and hashing an attribute of a flow label of each data packet to generate an index value wherein each index value corresponds to a counter of the counter array so that each counter contains a count of a number of data packets whose flow label attribute hashes to that index value that have been monitored at the collection unit.
19. The method of claim 18, wherein hashing the attribute of the flow label of each data packet further comprises hashing the attribute of the flow label of each data packet using an H3 family of hash functions.
20. The method of claim 18, wherein the flow label attribute for each data packet further comprises one or more of a source address for the data packet, a destination address for the data packet, a source port for the data packet, a destination port for the data packet and a protocol for the data packet.
21. The method of claim 19, wherein generating the estimate further comprises generating a flow matrix estimate that estimates the flow size distribution between the first and second observation points based on the counter array sketches.
22. The method of claim 14, wherein the first observation point further comprises one of a node in a network and a link in a network and wherein the second observation point further comprises one of a node in a network and a link in a network.
PCT/US2006/021447 2005-06-02 2006-06-02 System and method for measuring traffic and flow matrices WO2006130830A2 (en)

Applications Claiming Priority (10)

Application Number Priority Date Filing Date Title
US68656005P 2005-06-02 2005-06-02
US68657005P 2005-06-02 2005-06-02
US60/686,560 2005-06-02
US60/686,570 2005-06-02
US68965105P 2005-06-10 2005-06-10
US60/689,651 2005-06-10
US70919805P 2005-08-17 2005-08-17
US70919105P 2005-08-17 2005-08-17
US60/709,198 2005-08-17
US60/709,191 2005-08-17

Publications (2)

Publication Number Publication Date
WO2006130830A2 true WO2006130830A2 (en) 2006-12-07
WO2006130830A3 WO2006130830A3 (en) 2007-08-30

Family

ID=37482345

Family Applications (2)

Application Number Title Priority Date Filing Date
PCT/US2006/021447 WO2006130830A2 (en) 2005-06-02 2006-06-02 System and method for measuring traffic and flow matrices
PCT/US2006/021512 WO2006130840A2 (en) 2005-06-02 2006-06-02 System and method for data streaming

Family Applications After (1)

Application Number Title Priority Date Filing Date
PCT/US2006/021512 WO2006130840A2 (en) 2005-06-02 2006-06-02 System and method for data streaming

Country Status (1)

Country Link
WO (2) WO2006130830A2 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102833134A (en) * 2012-09-04 2012-12-19 中国人民解放军理工大学 Workload adaptation method for measuring flow of network data stream
WO2013155021A3 (en) * 2012-04-09 2014-01-03 Cisco Technology, Inc. Distributed demand matrix computations
US9979613B2 (en) 2014-01-30 2018-05-22 Hewlett Packard Enterprise Development Lp Analyzing network traffic in a computer network

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8799754B2 (en) 2009-12-07 2014-08-05 At&T Intellectual Property I, L.P. Verification of data stream computations using third-party-supplied annotations
JP5937990B2 (en) * 2013-03-12 2016-06-22 日本電信電話株式会社 Traffic distribution estimation device, traffic distribution estimation system, and traffic distribution estimation method
US10084752B2 (en) 2016-02-26 2018-09-25 Microsoft Technology Licensing, Llc Hybrid hardware-software distributed threat analysis
US10608992B2 (en) 2016-02-26 2020-03-31 Microsoft Technology Licensing, Llc Hybrid hardware-software distributed threat analysis
US10656960B2 (en) 2017-12-01 2020-05-19 At&T Intellectual Property I, L.P. Flow management and flow modeling in network clouds

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030105976A1 (en) * 2000-11-30 2003-06-05 Copeland John A. Flow-based detection of network intrusions
US20040218529A1 (en) * 2000-11-01 2004-11-04 Robert Rodosek Traffic flow optimisation system
US20050039086A1 (en) * 2003-08-14 2005-02-17 Balachander Krishnamurthy Method and apparatus for sketch-based detection of changes in network traffic
US6873600B1 (en) * 2000-02-04 2005-03-29 At&T Corp. Consistent sampling for network traffic measurement

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1312892C (en) * 1999-06-30 2007-04-25 倾向探测公司 Method and apparatus for monitoring traffic in network
US6807156B1 (en) * 2000-11-07 2004-10-19 Telefonaktiebolaget Lm Ericsson (Publ) Scalable real-time quality of service monitoring and analysis of service dependent subscriber satisfaction in IP networks

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6873600B1 (en) * 2000-02-04 2005-03-29 At&T Corp. Consistent sampling for network traffic measurement
US20040218529A1 (en) * 2000-11-01 2004-11-04 Robert Rodosek Traffic flow optimisation system
US20030105976A1 (en) * 2000-11-30 2003-06-05 Copeland John A. Flow-based detection of network intrusions
US20050039086A1 (en) * 2003-08-14 2005-02-17 Balachander Krishnamurthy Method and apparatus for sketch-based detection of changes in network traffic

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013155021A3 (en) * 2012-04-09 2014-01-03 Cisco Technology, Inc. Distributed demand matrix computations
US9106510B2 (en) 2012-04-09 2015-08-11 Cisco Technology, Inc. Distributed demand matrix computations
US9237075B2 (en) 2012-04-09 2016-01-12 Cisco Technology, Inc. Route convergence monitoring and diagnostics
US9479403B2 (en) 2012-04-09 2016-10-25 Cisco Technology, Inc. Network availability analytics
CN102833134A (en) * 2012-09-04 2012-12-19 中国人民解放军理工大学 Workload adaptation method for measuring flow of network data stream
US9979613B2 (en) 2014-01-30 2018-05-22 Hewlett Packard Enterprise Development Lp Analyzing network traffic in a computer network

Also Published As

Publication number Publication date
WO2006130840A2 (en) 2006-12-07
WO2006130840A3 (en) 2007-07-19
WO2006130830A3 (en) 2007-08-30

Similar Documents

Publication Publication Date Title
US9781427B2 (en) Methods and systems for estimating entropy
WO2006130830A2 (en) System and method for measuring traffic and flow matrices
Dai et al. Finding persistent items in data streams
Katabi et al. A passive approach for detecting shared bottlenecks
US7779143B2 (en) Scalable methods for detecting significant traffic patterns in a data network
Kompella et al. Every microsecond counts: tracking fine-grain latencies with a lossy difference aggregator
Li et al. Low-complexity multi-resource packet scheduling for network function virtualization
Zhao et al. Data streaming algorithms for accurate and efficient measurement of traffic and flow matrices
Xu et al. ELDA: Towards efficient and lightweight detection of cache pollution attacks in NDN
US20090303879A1 (en) Algorithms and Estimators for Summarization of Unaggregated Data Streams
Chefrour One-way delay measurement from traditional networks to sdn: A survey
US11706114B2 (en) Network flow measurement method, network measurement device, and control plane device
Basat et al. Routing oblivious measurement analytics
Duffield et al. Trajectory sampling with unreliable reporting
Callegari et al. When randomness improves the anomaly detection performance
Zheng et al. Unbiased delay measurement in the data plane
Sanjuàs-Cuxart et al. Sketching the delay: tracking temporally uncorrelated flow-level latencies
Kong et al. Time-out bloom filter: A new sampling method for recording more flows
Wang et al. A new virtual indexing method for measuring host connection degrees
Singh et al. Hh-ipg: Leveraging inter-packet gap metrics in p4 hardware for heavy hitter detection
Cao et al. A quasi-likelihood approach for accurate traffic matrix estimation in a high speed network
Shahzad et al. Accurate and efficient per-flow latency measurement without probing and time stamping
AT&T
JP7174303B2 (en) Topology estimation system, traffic generator, and traffic generation method
Zhang et al. Chat: Accurate network latency measurement for 5g e2e networks

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
NENP Non-entry into the national phase in:

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 06771942

Country of ref document: EP

Kind code of ref document: A2