CN100399283C - Method and apparatus for extending dispersion frame technique behavior using dynamic rule sets - Google Patents

Info

Publication number
CN100399283C
Authority
CN
China
Prior art keywords
rule
mistake
user
error
dft
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CNB2005100591727A
Other languages
Chinese (zh)
Other versions
CN1707438A (en)
Inventor
迈克尔·加斯塔德
托马斯·费兰
布伦特·亚德利
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corp
Publication of CN1707438A
Application granted
Publication of CN100399283C

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/008 Reliability or availability analysis

Abstract

A method, apparatus and program storage device for providing control of statistical processing of error data over a multitude of sources using a dynamically modifiable DFT rule set is disclosed. The dispersion frame technique is extended in the present invention to provide dispersion frame rules with user-defined parameters thereby creating a dynamically modifiable rule set.

Description

Method and apparatus for extending dispersion frame technique behavior using dynamic rule sets
Technical field
The present invention relates to the processing of error data and, more particularly, to a method, apparatus and program storage device that use a dynamically modifiable DFT rule set to control the statistical processing of error data from a multitude of sources.
Background art
As consumers come to depend more on computer systems to perform tasks reliably, their tolerance for computer system errors decreases. Computer systems typically experience downtime when a soft fault occurs. As hardware ages, the frequency of computing errors rises and the likelihood of a soft fault increases. Without a safety mechanism, a computer system will inevitably produce failures that leave users dissatisfied.
To avoid computer system failures, methods of predicting or diagnosing impending system failures have been proposed. Specification-based system fault diagnosis, for example, determines the expected behavior of a system under specified operating conditions from the system design specification. Tests based on the expected system behavior are then devised and used to diagnose system failures. Specification-based diagnosis, however, is limited in its ability to find unanticipated faults and to formulate tests that diagnose them.
Another example of a mechanism for diagnosing system failures is symptom-based diagnosis. It uses event or error logs to reconstruct the system failure and to assess the environment in which the errors occurred, identifying the failure condition from the symptoms surrounding the error that caused the failure. Compared with specification-based diagnosis, symptom-based diagnosis is driven by indicators of system failure rather than by tests.
One specific symptom-based diagnostic technique is the dispersion frame technique (DFT), which was proposed on the basis of the observation that the error rate of computer systems and other electronic equipment gradually increases before a catastrophic failure. DFT examines how closely errors occur in time and space and uses rules to determine the relationship between error occurrences. Extending the DFT rules increases the capability of the DFT engine and allows stricter control over the statistical processing of error data from a large number of computing devices. Such rules also allow a significant increase in the error rate within a specified time frame to be treated as a single error event. The single error event is recognized only when the increase exceeds the watermark specified by the rule definition. Existing methods that use DFT, however, rely on static rules and provide only a single dimension of statistical analysis.
There is therefore a need for a method, apparatus and program storage device that provide and implement a dynamically modifiable DFT rule set.
Summary of the invention
To overcome the limitations described above, and to overcome other limitations that will become apparent upon reading and understanding this specification, the present invention discloses a method, apparatus and program storage device that use a dynamically modifiable DFT rule set to control the statistical processing of error data from a multitude of sources.
According to the present invention, there is provided a method of using a dynamically modifiable dispersion frame technique (DFT) rule set to control the statistical processing of error data from a multitude of sources, the method comprising the steps of:
applying received user-defined error thresholds to the dispersion frame rules to set a user-definable DFT error threshold rule set;
detecting an error event and saving information relating to the error event;
comparing the saved information with the threshold rules in the DFT error threshold rule set; and
when the saved information meets the error threshold of one of the threshold rules, sending an alert associated with that threshold rule.
According to the present invention, there is also provided a computing device that uses a dynamically modifiable dispersion frame technique (DFT) rule set to control the statistical processing of error data from a multitude of sources, comprising:
a memory that stores error information relating to error events; and a processor coupled to the memory, the processor being configured to:
apply received user-defined error thresholds to the dispersion frame rules to set a user-definable DFT error threshold rule set;
detect an error event and save information relating to the error event in the memory;
compare the saved information with the threshold rules in the DFT error threshold rule set; and
when the saved information meets the error threshold of one of the threshold rules, send an alert associated with that threshold rule.
According to the present invention, there is also provided a method of using a dynamically modifiable dispersion frame technique (DFT) rule set to control the statistical processing of error data from a multitude of sources, the method comprising the steps of:
applying received user-defined error thresholds to the dispersion frame rules to set a user-definable DFT error threshold rule set;
detecting a plurality of errors from a source and calculating the time periods between the plurality of errors;
saving information relating to the plurality of errors and to the time periods between them;
comparing the saved information with the threshold rules in the DFT error threshold rule set; and
when the saved information meets the error threshold of one of the threshold rules, sending an alert associated with that threshold rule.
According to the present invention, there is also provided a computing device that uses a dynamically modifiable dispersion frame technique (DFT) rule set to control the statistical processing of error data from a multitude of sources, comprising:
a memory that stores error information relating to error events; and
a processor coupled to the memory, the processor being configured to:
apply received user-defined error thresholds to the dispersion frame rules to set a user-definable DFT error threshold rule set;
detect a plurality of errors from a source and calculate the time periods between the plurality of errors;
save information relating to the plurality of errors and to the time periods between them;
compare the saved information with the threshold rules in the DFT error threshold rule set; and
when the saved information meets the error threshold of one of the threshold rules, send an alert associated with that threshold rule.
The present invention solves the above problems by extending the dispersion frame technique to provide the dispersion frame rules with user-defined parameters, thereby producing a dynamically modifiable rule set that allows the DFT engine to operate over varying data ranges.
A method of providing error data processing with user-defined parameters includes applying user-defined error thresholds to a plurality of user-definable error threshold rules, processing error events, saving information relating to the processed error events, and determining from the saved information when one of the plurality of user-definable error threshold rules has been satisfied.
In another embodiment of the invention, a computing device for use in an error data processing system is provided. The computing device includes a memory that stores error information and a processor, coupled to the memory, that applies user-defined error threshold data to a plurality of user-definable error threshold rules and determines from the stored error information when one of the plurality of user-definable error threshold rules has been satisfied.
In another embodiment of the invention, a method of providing error data processing with user-defined parameters is provided. The method includes applying user-defined error thresholds to a plurality of user-definable error threshold rules, detecting a plurality of errors from a source, calculating the time periods between the plurality of errors, saving information relating to the plurality of errors and to the time between them, and determining from the saved information when one of the plurality of user-definable error threshold rules has been satisfied.
In another embodiment of the invention, a computing device for use in an error data processing system is provided. The computing device includes a memory that stores error information relating to error sources and error interarrival times, and a processor coupled to the memory. The processor applies user-defined error threshold data to a plurality of user-definable error threshold rules and determines from the stored error sources and error interarrival times when one of the plurality of user-definable error threshold rules has been satisfied.
In another embodiment of the invention, a program storage device is provided. The program storage device includes program instructions executable by a processing device to perform operations that provide error data processing with user-defined parameters. The operations include applying user-defined error thresholds to a plurality of user-definable error threshold rules, processing error events, saving information relating to the processed error events, and determining from the saved information when one of the plurality of user-definable error threshold rules has been satisfied.
In another embodiment of the invention, a program storage device is provided. The program storage device includes program instructions executable by a processing device to perform operations that provide error data processing with user-defined parameters. The operations include applying user-defined error thresholds to a plurality of user-definable error threshold rules, detecting a plurality of errors from a source, calculating the time between the plurality of errors, saving information relating to the plurality of errors and to the time between them, and determining from the saved information when one of the plurality of user-definable error threshold rules has been satisfied.
In another embodiment of the invention, a computing device for use in an error data processing system is provided. The computing device includes means for storing error information and means, coupled to the storing means, for applying user-defined error threshold data to a plurality of user-definable error threshold rules and for determining from the stored error information when one of the plurality of user-definable error threshold rules has been satisfied.
These and various other advantages and features of novelty that characterize the invention are pointed out with particularity in the appended claims, which form a part of this disclosure. For a better understanding of the invention, its advantages and the objects attained by its use, however, reference should be made to the drawings, which form a further part of this disclosure, and to the accompanying description, in which specific examples of an apparatus in accordance with the invention are illustrated and described.
Description of drawings
Referring now to the drawings, in which like reference numerals represent corresponding parts throughout:
Fig. 1 shows a network of data processing systems in which the present invention may be implemented;
Fig. 2 is a block diagram of a computer processing system that may be implemented as a server or computer system shown in Fig. 1;
Fig. 3 schematically depicts error events on a timeline to illustrate an implementation of one embodiment of the invention;
Fig. 4 is a flowchart of an error data processing method according to an embodiment of the invention;
Fig. 5 is a flowchart of a method of providing user-defined parameters to an extended dispersion frame technique (DFT) rule set according to an embodiment of the invention; and
Fig. 6 is a flowchart of a method of processing errors in accordance with an extended DFT rule set according to an embodiment of the invention.
Detailed description of the embodiments
In the following description of the embodiments, reference is made to the accompanying drawings, which form a part of this specification and which show, by way of illustration, specific embodiments in which the invention may be practiced. It should be understood that other embodiments may be used, and structural changes may be made, without departing from the scope of the invention.
One embodiment of the invention provides a method, apparatus and program storage device that use a dynamically modifiable DFT rule set to control the statistical processing of error data from a multitude of sources. The dispersion frame technique is extended in the present invention to provide the dispersion frame rules with user-defined parameters, thereby producing a dynamically modifiable rule set.
Fig. 1 shows a network data processing system 100 in which the present invention may be implemented. Network data processing system 100 includes network 102, which is the medium used to provide communication links between the various devices and computers connected together within network data processing system 100. Network 102 may include connections such as wired links, wireless communication links, or fiber optic cables.
In the depicted example, server 104 is connected to network 102 together with storage unit 106. In addition, clients 108, 110 and 112 are connected to network 102. Clients 108, 110 and 112 may be, for example, personal computers, network computers or workstations. In Fig. 1, server 104 provides data such as boot files, operating system images and applications to clients 108-112. Clients 108, 110 and 112 are clients of server 104. Network data processing system 100 may include additional servers, clients and other devices that are not shown.
Fig. 2 is a block diagram of a computer processing system 200 that may be implemented as a server or computer system shown in Fig. 1. Computer processing system 200 may be a symmetric multiprocessor (SMP) system that includes a plurality of processors 202 and 204 connected to system bus 206. Alternatively, a single-processor system may be employed. Memory controller/cache 208 is also connected to system bus 206 and provides an interface to local memory 209. I/O bus bridge 210 is connected to system bus 206 and provides an interface to I/O bus 212. Memory controller/cache 208 and I/O bus bridge 210 may be integrated as depicted.
Peripheral Component Interconnect (PCI) bus bridge 214, connected to I/O bus 212, provides an interface to PCI local bus 216. A number of modems 218 may be connected to PCI local bus 216. Typical PCI bus implementations support four PCI expansion slots or add-in connectors. Communication links to clients 108-112 in Fig. 1 may be provided through modem 218 and network adapter 220, which are connected to PCI local bus 216 through add-in boards.
Additional PCI bus bridges 222 and 224 provide interfaces to additional PCI local buses 226 and 228, from which additional modems or network adapters may be supported. In this manner, computer processing system 200 allows connections to multiple network computers. Memory-mapped graphics adapter 230 and hard disk 232 may also be connected to I/O bus 212 as depicted, either directly or indirectly.
Those of ordinary skill in the art will appreciate that the hardware depicted in Fig. 2 may vary. For example, other peripheral devices, such as optical disk drives, may be used in addition to or in place of the hardware depicted. The types of buses may also differ. The depicted example is not meant to imply architectural limitations on embodiments of the invention.
As mentioned above, symptom-based diagnosis uses event or error logs to reconstruct the system failure and to assess the environment in which the errors occurred, identifying the failure condition from the symptoms surrounding the error that caused the failure. Compared with specification-based diagnosis, symptom-based diagnosis is driven by indicators of system failure rather than by tests. One specific symptom-based diagnostic technique is the dispersion frame technique (DFT), which was proposed on the basis of the observation that the error rate of computer systems and other electronic equipment gradually increases before a catastrophic failure. DFT examines how closely errors occur in time and space and uses rules to determine the relationship between error occurrences. Table 1 below illustrates the DFT rule set.
[Table 1: the DFT rule set; image C20051005917200151 is not reproduced here]
Methods that use DFT apply static rules such as those shown in Table 1. Static rules, however, provide only a single dimension of statistical analysis. As shown in Table 1, the typical dispersion frame technique (DFT) provides five statistical rules. The error dispersion index (EDI) is the number of error occurrences in half of a dispersion frame. A dispersion frame is defined by the interarrival time, that is, the time between consecutive error events of the same type. The first rule (the 3.3 rule) applies when two consecutive EDIs from successive applications of the same dispersion frame each show an EDI of at least 3. The second rule (the 2.2 rule) applies when two consecutive EDIs from two successive dispersion frames each show an EDI of at least 2. The third rule (the 2-in-1 rule) applies when a dispersion frame is less than 1 hour. The fourth rule (the 4-in-1 rule) applies when four error events occur within a 24-hour time frame. The fifth rule (the 4 decreasing rule) applies when there are four monotonically decreasing dispersion frames and at least one frame is half the size of its previous frame. By examining the types of errors that occur and how closely they occur in time and space, these rules can therefore be used to determine the relationship between error occurrences.
DFT uses a model based on the interarrival times of observations within a dispersion frame. Drawing on experience gained when factoring error logs into individual error sources, a predictive failure analysis (PFA) engine extracts, organizes and examines error log entries from persistent storage. Depending on the interarrival pattern of the errors, the engine applies one of its five failure prediction rules. These five rules capture, within dispersion frames, behavior that corresponds to the behavior detected by traditional statistical analysis techniques. The PFA engine determines the relationship between error occurrences by examining how closely errors occur in time (duration) and in space (affected area).
More specifically, the 3.3 rule focuses on consecutive EDIs from the same dispersion frame. When successive applications of a dispersion frame yield an EDI of at least 3, an alert corresponding to the 3.3 rule is sent. The 3.3 rule requires two consecutive EDIs and an EDI of at least 3. In the DFT rule set, these requirements are fixed.
The 2.2 rule focuses on the EDIs of consecutive dispersion frames. When two dispersion frames each have an EDI of at least 2, an alert associated with the 2.2 rule is sent. Like the 3.3 rule, the 2.2 rule has static requirements: two consecutive EDIs in consecutive dispersion frames, and an EDI of at least 2.
The 2-in-1 rule and the 4-in-1 rule focus on the time span between error events. The 2-in-1 rule is satisfied when a dispersion frame, that is, the interarrival time between errors, spans less than 1 hour. The 4-in-1 rule is satisfied when four error events occur within one day. Both rules combine a fixed time requirement with a fixed number of detected errors.
The 4 decreasing rule focuses on the time span between dispersion frames and on the rate at which errors occur. Under this rule, an alert is sent when each of four dispersion frames is equal to or smaller than its previous frame and one of the frames is half the size of its previous frame. These requirements (four frames, each equal to or smaller than its predecessor, and one frame half the size of its predecessor) are static.
Fig. 3 is a diagram 300 that schematically depicts, on a timeline, events that trigger a 3.3 rule alert, a 2.2 rule alert and a 4 decreasing rule alert. Error events i-4, i-3, i-2, i-1 and i are shown. A dispersion frame is defined as the interarrival time between consecutive error events of the same type; the interarrival time is thus the time period between two error events. Dispersion frame (i-3) 310 is the interarrival time between events i-4 and i-3. Frame (i-2) 320 is the dispersion frame between events i-3 and i-2.
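The definition of a dispersion frame as an interarrival time can be illustrated with a minimal sketch. The function name `dispersion_frames` and the sample timestamps are hypothetical and are not taken from the patent; the sketch only assumes that error times for one source and one error type are available as numbers.

```python
from typing import List, Sequence


def dispersion_frames(timestamps: Sequence[float]) -> List[float]:
    """Return the interarrival time between each pair of consecutive errors."""
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


# Five hypothetical error times (in hours) standing in for events i-4 .. i.
event_times = [0.0, 8.0, 12.0, 14.0, 15.0]
frames = dispersion_frames(event_times)
print(frames)
```

Five events yield four frames, matching frames (i-3) through (i) in the description of Fig. 3.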
The number of errors from the center of each frame to its right end is measured and designated the error dispersion index (EDI). The EDI of frame (i-3) 310 is 3, and the EDI of frame (i-2) 320 is 2. Frame (i-3) 310, for example, is the time between errors i-4 and i-3.
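The EDI measurement can be sketched as a count over the right half of a frame. This is an illustration under stated assumptions, not the patent's implementation: the frame is taken to be centered on a chosen point in time, the error at the center itself is not counted, and an error falling exactly on the right edge is counted. The name `error_dispersion_index` is hypothetical.

```python
from typing import Sequence


def error_dispersion_index(timestamps: Sequence[float], center: float, frame: float) -> int:
    """Count errors in the right half of a frame of length `frame` centered on `center`.

    Assumption: the window is (center, center + frame / 2], so the center
    error is excluded and an error on the right edge is included.
    """
    right_edge = center + frame / 2.0
    return sum(1 for t in timestamps if center < t <= right_edge)


# Hypothetical error times; a frame of length 10 is centered on the error at t = 10.
times = [0.0, 10.0, 12.0, 13.0, 14.0, 30.0]
edi = error_dispersion_index(times, center=10.0, frame=10.0)
print(edi)
```

Applying the same frame to successive errors produces the sequence of consecutive EDIs that the 3.3 and 2.2 rules examine.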
With regard to the 3.3 rule, the two consecutive indices 305 and 315, obtained by applying the same frame (i-3) 310, each have an EDI of 3. The time and space requirements of the 3.3 rule between error events are met, and a 3.3 rule alert is sent.
With regard to the 2.2 rule, the consecutive indices across frames (i-3) 310 and (i-2) 320 each have an EDI of at least 2. Time span 315 of frame (i-3), adjacent to frame (i-2), has an index of 3, and time span 325 of frame (i-2), adjacent to frame (i-3), has an index of 2. The time and space requirements of the 2.2 rule are met, and alert 322 corresponding to the 2.2 rule is sent.
Looking at frames (i-3) through (i), it can be seen that the sizes of the four frames (i-3) 310, (i-2) 320, (i-1) 330 and (i) 340 decrease or stay the same over time, and that at least one of the four, frame (i) 340, is half the size of its previous frame (i-1) 330. The 4 decreasing rule 344 is therefore satisfied. The DFT rules described above, however, are static and provide only a single dimension of statistical analysis.
Fig. 4 is a flowchart 400 of providing a rule set with user-defined parameters for error data processing according to an embodiment of the invention. User-defined error thresholds are received (410), and error threshold rules are set according to the user-defined error thresholds (420). Errors are detected, and information relating to the errors is saved (430). The saved information is compared with the threshold rules (440) to determine whether an error threshold has been met (450). While no error threshold has been met, the engine that drives the rule set continues to process and save detected errors (430) and to compare the saved information (440) until an error threshold is met. Once an error threshold has been reached, an alert is sent (460).
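The loop of Fig. 4 can be sketched as follows. The names `make_count_rule` and `run_until_alert`, the threshold name `too_many_errors`, and the simple count-based rule are all hypothetical placeholders chosen for brevity; they stand in for the DFT threshold rules and are not part of the patent.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Rule = Callable[[List[float]], bool]


def make_count_rule(limit: int) -> Rule:
    """Build a rule that is met once `limit` errors have been saved (step 420)."""
    return lambda saved: len(saved) >= limit


def run_until_alert(error_times: Iterable[float],
                    rules: Dict[str, Rule]) -> Optional[Tuple[str, float]]:
    """Save each detected error and stop at the first rule whose threshold is met."""
    saved: List[float] = []
    for t in error_times:                  # detect and save the error (430)
        saved.append(t)
        for name, rule in rules.items():   # compare saved data with each rule (440)
            if rule(saved):                # threshold met? (450)
                return name, t             # send the alert (460)
    return None                            # input exhausted without an alert


user_thresholds = {"too_many_errors": 3}   # user-defined thresholds (410)
rules = {name: make_count_rule(limit) for name, limit in user_thresholds.items()}
alert = run_until_alert([1.0, 2.5, 4.0, 9.0], rules)
print(alert)
```

Because the rules are built from `user_thresholds` at run time, changing a threshold changes the behavior of the engine without changing its code, which is the point of a dynamically modifiable rule set.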
In embodiments of the invention, the DFT rules described above are modified and assigned to devices with unique patterns. User-defined rules are received as input to the extended DFT processing engine described below. Table 2 illustrates an extended DFT rule set according to an embodiment of the invention.
[Table 2: the extended DFT rule set; images C20051005917200171 and C20051005917200181 are not reproduced here]
As in Table 1, the error dispersion index (EDI) is the number of error occurrences in half of a dispersion frame. A dispersion frame is defined by the interarrival time between consecutive error events of the same type.
Fig. 5 is a flowchart 500 illustrating how user-defined parameters are provided to an extended dispersion frame rule set according to an embodiment of the invention. Extended dispersion frame rules are defined by the user and received (505). Each variable is set in the rule set (510). The variables include: the time frames for the 2-in-1 rule and the 4-in-1 rule; the number of errors required by the 4-in-1 rule; the EDI values required by the 3.3 and 2.2 rules; the number of consecutive indices required by the 3.3 and 2.2 rules; the number of frames for the 4 decreasing rule; and the number of frames that the 4 decreasing rule requires to be half the size of their previous frame. Dispersion frames are identified (515) and compared with the extended dispersion frame rule set having the user-defined parameters.
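The variables listed for step 510 could be grouped into a single parameter object. The sketch below is one possible layout; the class name `ExtendedDftParams` and every field name are hypothetical. The default values are the static values of the five rules described earlier (1 hour, 24 hours, four errors, EDIs of 3 and 2, two consecutive indices, four frames, one halved frame).

```python
from dataclasses import asdict, dataclass, replace


@dataclass(frozen=True)
class ExtendedDftParams:
    """User-definable parameters; defaults mirror the static DFT rules."""
    two_in_one_window_hours: float = 1.0     # 2-in-1 time frame
    four_in_one_window_hours: float = 24.0   # 4-in-1 time frame
    four_in_one_error_count: int = 4         # errors required by 4-in-1
    rule_33_min_edi: int = 3                 # EDI required by 3.3
    rule_22_min_edi: int = 2                 # EDI required by 2.2
    rule_33_consecutive_indices: int = 2     # consecutive indices for 3.3
    rule_22_consecutive_indices: int = 2     # consecutive indices for 2.2
    decreasing_frame_count: int = 4          # frames for 4 decreasing
    decreasing_halved_frames: int = 1        # frames that must be halved

    def __post_init__(self) -> None:
        # Reject non-positive values so a bad user entry fails early.
        for name, value in asdict(self).items():
            if value <= 0:
                raise ValueError(f"{name} must be positive, got {value}")


defaults = ExtendedDftParams()
# A user tightening the 4-in-1 rule: six errors within twelve hours.
tuned = replace(defaults, four_in_one_window_hours=12.0, four_in_one_error_count=6)
print(tuned)
```

Making the object immutable and deriving variants with `replace` keeps each rule set self-consistent while still letting a user swap in new values at run time.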
For the 3.3 rule, the errors are compared with the 3.3 rule requirements (520). When the user-defined number of EDIs from successive applications of the same dispersion frame each reach at least the user-defined EDI value, the threshold of the 3.3 rule is met (530), and an alert associated with the 3.3 rule is sent (535).
For the 2.2 rule, the errors are compared with the 2.2 rule requirements having user-defined parameters (520). When the user-defined number of consecutive EDIs from two successive frames each reach at least the user-defined EDI value, the 2.2 rule requirements are met (540), and the associated 2.2 rule alert is sent (545).
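Both EDI-based rules reduce to the same threshold comparison once the sequence of EDIs is available, which the following sketch illustrates. The function name `edi_rule_satisfied` is hypothetical. The sketch covers only the comparison: whether the EDIs come from one frame applied successively (3.3 rule) or from consecutive frames (2.2 rule) is decided by the caller.

```python
from typing import Sequence


def edi_rule_satisfied(edis: Sequence[int], consecutive: int, min_edi: int) -> bool:
    """True when the most recent `consecutive` EDI values are all at least `min_edi`."""
    if consecutive <= 0 or len(edis) < consecutive:
        return False
    return all(value >= min_edi for value in edis[-consecutive:])


# Static 3.3 rule: two consecutive EDIs of at least 3.
print(edi_rule_satisfied([1, 3, 3], consecutive=2, min_edi=3))
# Static 2.2 rule: two consecutive EDIs of at least 2 (values 3 and 2, as in Fig. 3).
print(edi_rule_satisfied([3, 2], consecutive=2, min_edi=2))
```

Passing different `consecutive` and `min_edi` values gives the user-defined variants of either rule.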
For the 2-in-1 rule, the time frame between errors is compared with the user-defined 2-in-1 time frame (520). When errors are received within the defined time frame, the 2-in-1 rule is satisfied (550), and a 2-in-1 error message is sent (555).
For the 4-in-1 rule, the time spanned by the user-defined number of errors must fall within the user-defined time frame. The saved error information is compared with the user-defined requirements of the 4-in-1 rule (520), and when the requirements are met (560), a 4-in-1 error message is sent (565).
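The two time-window rules share one shape: some number of errors inside some window. A minimal sketch, assuming timestamps in hours and a strict "less than" comparison (stated in the source for the 2-in-1 rule, assumed here for the 4-in-1 rule); the name `errors_within_window` is hypothetical.

```python
from typing import Sequence


def errors_within_window(timestamps: Sequence[float], error_count: int, window: float) -> bool:
    """True when some run of `error_count` consecutive errors spans less than `window`."""
    if error_count < 2:
        return False
    ordered = sorted(timestamps)
    last_start = len(ordered) - error_count
    return any(
        ordered[i + error_count - 1] - ordered[i] < window
        for i in range(last_start + 1)
    )


hours = [0.0, 5.0, 5.5, 30.0]
print(errors_within_window(hours, error_count=2, window=1.0))    # 2-in-1, static values
print(errors_within_window(hours, error_count=4, window=24.0))   # 4-in-1, static values
print(errors_within_window(hours, error_count=4, window=36.0))   # 4-in-1, user-defined window
```

With the static values the second call is not satisfied, because the four errors span 30 hours; widening the user-defined window to 36 hours changes the outcome without changing the code.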
For the 4 decreasing rule, the user-defined number of dispersion frames must decrease monotonically, and the user-defined number of those frames must be half the size of their previous frame. The error data are compared with the user-defined 4 decreasing rule (520), and when the requirements of the rule are met (570), an error message associated with the 4 decreasing rule is sent (575).
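The decreasing-frames check can be sketched as below. Two points are assumptions rather than statements from the source: "half the size" is read as "at most half", and only frames inside the most recent `frame_count` frames are compared with their predecessors. Frames that stay the same size are accepted, as in the description of Fig. 3. The name `decreasing_rule_satisfied` is hypothetical.

```python
from typing import Sequence


def decreasing_rule_satisfied(frames: Sequence[float],
                              frame_count: int = 4,
                              halved_frames: int = 1) -> bool:
    """Check the most recent `frame_count` frames for a non-increasing, halving pattern."""
    if frame_count < 2 or len(frames) < frame_count:
        return False
    recent = list(frames[-frame_count:])
    pairs = list(zip(recent, recent[1:]))   # (previous frame, current frame)
    non_increasing = all(cur <= prev for prev, cur in pairs)
    halved = sum(1 for prev, cur in pairs if cur <= prev / 2.0)
    return non_increasing and halved >= halved_frames


print(decreasing_rule_satisfied([8.0, 4.0, 2.0, 1.0]))   # shrinking, with halvings
print(decreasing_rule_satisfied([8.0, 7.0, 6.0, 5.0]))   # shrinking, but never halved
```

Raising `halved_frames` demands a steeper collapse of the interarrival times before the alert fires.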
When none of the above rules is satisfied, the process returns and continues to identify dispersion frames from memory (505) until the requirements of a rule are met.
Fig. 6 is a flowchart 600 of a method of processing errors in accordance with an extended DFT rule set according to an embodiment of the invention. A plurality of errors from a source are detected (605). The time periods between the errors are determined (610), and information relating to the errors is saved (615). Each extended DFT rule is compared with the saved error data (620, 630, 640, 650 and 660), and a determination is made as to whether each extended DFT rule is satisfied (625, 635, 645, 655 and 665). For each extended DFT rule that is satisfied, an alert associated with that particular rule is sent (628, 638, 648, 658 and 668). When the requirements of the rule set are not met, the process returns to the step of detecting a plurality of errors (605).
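The per-source flow of Fig. 6 can be sketched as a small dispatcher that groups errors by source, derives the interarrival times, and checks every rule. The function name `evaluate_sources`, the source labels `disk0` and `nic1`, and the single sample rule are hypothetical; a fuller engine would register one predicate per extended DFT rule.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

FrameRule = Callable[[List[float]], bool]


def evaluate_sources(events: Iterable[Tuple[str, float]],
                     rules: Dict[str, FrameRule]) -> List[Tuple[str, str]]:
    """Return (source, rule name) for every rule satisfied by a source's errors."""
    times_by_source: Dict[str, List[float]] = defaultdict(list)
    for source, timestamp in events:                 # detect errors per source (605)
        times_by_source[source].append(timestamp)

    alerts: List[Tuple[str, str]] = []
    for source, times in sorted(times_by_source.items()):
        ordered = sorted(times)
        # Time periods between consecutive errors (610); kept for the rules (615).
        frames = [b - a for a, b in zip(ordered, ordered[1:])]
        for name, rule in rules.items():             # compare each rule (620-660)
            if rule(frames):                         # rule satisfied? (625-665)
                alerts.append((source, name))        # send its alert (628-668)
    return alerts


# One sample rule: a 2-in-1 style check with a one-hour window.
sample_rules = {"2-in-1": lambda frames: any(f < 1.0 for f in frames)}
sample_events = [("disk0", 0.0), ("disk0", 0.5), ("nic1", 2.0), ("nic1", 9.0)]
alerts = evaluate_sources(sample_events, sample_rules)
print(alerts)
```

Keeping the rules in a dictionary of predicates means several rules can fire for the same source, matching the flowchart's separate alert for each satisfied rule.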
Again referring to Fig. 2, appropriate according to an embodiment of the invention computingasystem environment 200.For example, environment 200 can be client computer, data server and/or the master server of having described.Computingasystem environment 200 is an example of appropriate computing environment, is not intended any restriction of suggestion to use of the present invention or envelop of function.Computing environment 200 should not be understood as that dependence or the requirement that has about the combination of any one assembly of graphic extension in the operating environment 200 of illustration or assembly yet.Especially, environment 200 is can realize server, client computer or the example of the computerized equipment of other node of having illustrated.
Computer storage media include volatile and non-volatile, removable and non-removable media implemented in any method or technology for storing information such as computer-readable instructions, data structures, program modules or other data. Memory 209, 208, memory connected to, for example, PCI bus 226, 228, and/or hard disk drive 232 are all examples of computer storage media. Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by device 200. Any such computer storage media may be part of device 200.
Device 200 may also contain communication connections 218 that allow the device to communicate with other devices. Communication connection 218 is an example of communication media. Communication media typically embody computer-readable instructions, data structures, program modules or other data in a modulated data signal, such as a carrier wave or other transport mechanism, and include any information delivery media. The term "modulated data signal" means a signal that has at least one of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, communication media include, but are not limited to, wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared and other wireless media. The term "computer-readable media" as used herein includes both storage media and communication media.
The methods described above may be computer-implemented on device 200. A computer-implemented method is preferably implemented, at least in part, as one or more programs running on a computer. The programs can be executed from a computer-readable medium, such as memory, by a processor of the computer. The programs are preferably storable on a machine-readable medium, such as a floppy disk or a CD-ROM, for distribution to, and installation and execution on, another computer. The program or programs may be part of a computer system, a computer or a computerized device.
In other embodiments of the invention, an extended DFT rule allows a significant increase in the rate of errors occurring within a specified time frame to be treated as a single error event. However, the single error event is recognized only when the increase exceeds a specified watermark defined by the rule.
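One possible reading of this behavior is sketched below in Python, purely for illustration. The function `coalesce_bursts` and its parameters `frame` and `watermark` are hypothetical names. The sketch assumes that timestamps are sorted in ascending order and that the "increase" is measured simply as the number of errors falling within one time frame, which is a simplification of whatever rate measure a given embodiment may use.

```python
from typing import List


def coalesce_bursts(timestamps: List[float],
                    frame: float,
                    watermark: int) -> List[float]:
    """Collapse bursts of errors into single error events.

    A run of errors that starts at timestamps[i] and stays within `frame`
    time units of it is reported as one event (at the time of its first
    error) only when the run holds more than `watermark` errors; otherwise
    each error is reported individually. Timestamps must be sorted.
    """
    events: List[float] = []
    i, n = 0, len(timestamps)
    while i < n:
        j = i
        # Extend the run while errors stay inside the frame opened at i.
        while j < n and timestamps[j] - timestamps[i] <= frame:
            j += 1
        events.append(timestamps[i])
        if j - i > watermark:
            i = j      # watermark exceeded: whole burst is one event
        else:
            i += 1     # below the watermark: errors stay separate
    return events
```

Under these assumptions, raising the watermark makes the rule less eager to merge errors, so the same input yields more individual events.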
Embodiments of the invention provide a protocol for dynamically modifying the extended DFT rules. This forces the DFT to operate within user-defined, continuously varying data ranges. These varying ranges can also be applied to specific hardware components that are being monitored and are capable of reporting errors. Users of the extended DFT thus gain the flexibility to set stricter statistical constraints and to tune the DFT engine to operate in a constantly changing processing environment.
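The specification does not detail the modification protocol itself. As a minimal sketch under stated assumptions, a rule set whose thresholds can be changed at run time, either globally or for one monitored hardware component, might look as follows in Python. The names `Thresholds`, `RuleSet`, `update` and `for_component`, the field names and the default values are all hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Dict, Optional


@dataclass(frozen=True)
class Thresholds:
    # Illustrative defaults only; real values would be user-defined.
    max_interval_s: float = 3600.0   # alert if errors are closer than this
    burst_count: int = 4             # errors needed within the window
    burst_window_s: float = 86400.0  # window for the burst count


class RuleSet:
    """Holds default thresholds plus optional per-component overrides."""

    def __init__(self, defaults: Thresholds) -> None:
        self._defaults = defaults
        self._per_component: Dict[str, Thresholds] = {}

    def update(self, component: Optional[str] = None, **changes) -> None:
        """Modify thresholds while the engine is running."""
        if component is None:
            self._defaults = replace(self._defaults, **changes)
        else:
            base = self._per_component.get(component, self._defaults)
            self._per_component[component] = replace(base, **changes)

    def for_component(self, component: str) -> Thresholds:
        """Return the thresholds currently in force for a component."""
        return self._per_component.get(component, self._defaults)
```

In this sketch a component override is a snapshot taken when the override is created, so a later change to the defaults does not alter components that already have their own thresholds; an embodiment could equally choose to merge overrides with the current defaults.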
The foregoing description of exemplary embodiments of the invention has been presented for purposes of illustration. It is not intended to be exhaustive or to limit the invention to the precise form disclosed. Many modifications and variations are possible in light of the above teaching. The scope of the invention is limited not by this detailed description, but by the appended claims.

Claims (30)

1. A method for providing control over statistical processing of error data from a multitude of sources using a dynamically modifiable dispersion frame technique (DFT) rule set, the method comprising the steps of:
applying received user-defined error thresholds to dispersion frame rules to set a user-definable DFT error threshold rule set;
detecting error events and storing information related to the error events;
comparing the stored information with threshold rules in the DFT error threshold rule set; and
when the stored information satisfies the error threshold corresponding to one of the threshold rules, sending an alert associated with that threshold rule.
2. The method of claim 1, wherein detecting error events comprises:
detecting a plurality of errors from a source; and
calculating the time periods between the plurality of errors.
3. The method of claim 2, wherein storing information related to the error events further comprises:
storing information related to the plurality of errors and the time periods between the plurality of errors.
4. The method of claim 2, wherein comparing the stored information with the threshold rules in the DFT error threshold rule set further comprises:
comparing the number of detected errors and the time periods between the plurality of errors with the user-definable error threshold rules.
5. The method of claim 2, wherein comparing the stored information with the threshold rules in the DFT error threshold rule set further comprises:
determining that the detected plurality of errors satisfies a user-defined error dispersion index, wherein the error dispersion index is the number of errors within half of the time period between errors of the same type.
6. The method of claim 5, further comprising reaching the user-defined error dispersion index a user-defined number of consecutive times within the same time period between errors of the same type.
7. The method of claim 6, further comprising reaching the user-defined error dispersion index a user-defined number of consecutive times within two consecutive dispersion frames.
8. The method of claim 2, wherein detecting error events and storing information related to the error events further comprises:
detecting errors that occur within a user-defined time frame and, when an error rate based on the calculated time periods between the plurality of errors satisfies a user-definable error threshold rule, identifying the plurality of errors as a single error.
9. The method of claim 1, further comprising the steps of:
when the error threshold is not satisfied, continuing, by an engine driving the user-definable DFT error threshold rule set, to store the detected error events; and
comparing the detected error events with the threshold rules in the DFT error threshold rule set until the error threshold is satisfied.
10. The method of claim 1, wherein the alert is a specific type of alert based on one of the user-definable error threshold rules.
11. The method of claim 1, wherein comparing the stored information with the threshold rules in the DFT error threshold rule set further comprises:
determining that the time period between detected errors is less than a user-defined time frame.
12. The method of claim 1, wherein comparing the stored information with the threshold rules in the DFT error threshold rule set further comprises:
determining that a user-defined number of detected errors occur within a user-defined time frame.
13. The method of claim 1, wherein comparing the stored information with the threshold rules in the DFT error threshold rule set further comprises:
determining that errors occur at a monotonically decreasing rate for each of a user-defined number of time periods between errors.
14. The method of claim 13, wherein the errors occurring at a monotonically decreasing rate further include a user-defined number of errors occurring within half of the preceding time period between errors.
15. A computing device for providing control over statistical processing of error data from a multitude of sources using a dynamically modifiable dispersion frame technique (DFT) rule set, comprising:
a memory for storing error information related to error events; and
a processor coupled to the memory, the processor being configured to:
apply received user-defined error thresholds to dispersion frame rules to set a user-definable DFT error threshold rule set;
detect error events and store information related to the error events in the memory;
compare the stored information with threshold rules in the DFT error threshold rule set; and
when the stored information satisfies the error threshold corresponding to one of the threshold rules, send an alert associated with that threshold rule.
16. The computing device of claim 15, wherein detecting error events and storing information related to the error events in the memory further comprises:
detecting a plurality of errors from a source; and
calculating the time periods between the plurality of errors.
17. The computing device of claim 16, wherein storing information related to the error events further comprises:
storing information related to the plurality of errors and the time periods between the plurality of errors.
18. The computing device of claim 16, wherein comparing the stored information with the threshold rules in the DFT error threshold rule set further comprises:
comparing the number of detected errors and the time periods between the plurality of errors with the user-definable error threshold rules.
19. The computing device of claim 16, wherein comparing the stored information with the threshold rules in the DFT error threshold rule set further comprises:
determining that the detected plurality of errors satisfies a user-defined error dispersion index, wherein the error dispersion index is the number of errors within half of the time period between errors of the same type.
20. The computing device of claim 19, further comprising reaching the user-defined error dispersion index a user-defined number of consecutive times within the same time period between errors of the same type.
21. The computing device of claim 20, further comprising reaching the user-defined error dispersion index a user-defined number of consecutive times within two consecutive dispersion frames.
22. The computing device of claim 16, wherein the processor detecting error events and storing information related to the error events in the memory further comprises:
detecting the plurality of errors includes processing errors that occur within a user-defined time frame and, when an error rate based on the calculated time periods between the plurality of errors satisfies a user-definable error threshold rule, identifying the plurality of errors as a single error.
23. The computing device of claim 15, wherein the processor is further configured to:
when the error threshold is not satisfied, cause an engine driving the user-definable DFT error threshold rule set to continue storing and processing the detected error events; and
compare the detected error events with the threshold rules in the DFT error threshold rule set until the error threshold is satisfied.
24. The computing device of claim 15, wherein the alert is a specific type of alert based on one of the user-definable error threshold rules.
25. The computing device of claim 15, wherein comparing the stored information with the threshold rules in the DFT error threshold rule set further comprises:
determining that the time period between detected errors is less than a user-defined time frame.
26. The computing device of claim 15, wherein comparing the stored information with the threshold rules in the DFT error threshold rule set further comprises:
determining that a user-defined number of detected errors occur within a user-defined time frame.
27. The computing device of claim 15, wherein comparing the stored information with the threshold rules in the DFT error threshold rule set further comprises:
determining that errors occur at a monotonically decreasing rate for each of a user-defined number of time periods between errors.
28. The computing device of claim 27, wherein the errors occurring at a monotonically decreasing rate further include a user-defined number of errors occurring within half of the preceding time period between errors.
29. A method for providing control over statistical processing of error data from a multitude of sources using a dynamically modifiable dispersion frame technique (DFT) rule set, the method comprising the steps of:
applying received user-defined error thresholds to dispersion frame rules to set a user-definable DFT error threshold rule set;
detecting a plurality of errors from a source, and calculating the time periods between the plurality of errors;
storing information related to the plurality of errors and the time periods between the plurality of errors;
comparing the stored information with threshold rules in the DFT error threshold rule set; and
when the stored information satisfies the error threshold corresponding to one of the threshold rules, sending an alert associated with that threshold rule.
30. A computing device for providing control over statistical processing of error data from a multitude of sources using a dynamically modifiable dispersion frame technique (DFT) rule set, comprising:
a memory for storing error information related to error events; and
a processor coupled to the memory, the processor being configured to:
apply received user-defined error thresholds to dispersion frame rules to set a user-definable DFT error threshold rule set;
detect a plurality of errors from a source, and calculate the time periods between the plurality of errors;
store information related to the plurality of errors and the time periods between the plurality of errors;
compare the stored information with threshold rules in the DFT error threshold rule set; and
when the stored information satisfies the error threshold corresponding to one of the threshold rules, send an alert associated with that threshold rule.
CNB2005100591727A 2004-06-10 2005-03-24 Method and apparatus for extending dispersion frame technique behavior using dynamic rule sets Expired - Fee Related CN100399283C (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US10/865,206 US7480828B2 (en) 2004-06-10 2004-06-10 Method, apparatus and program storage device for extending dispersion frame technique behavior using dynamic rule sets
US10/865,206 2004-06-10

Publications (2)

Publication Number Publication Date
CN1707438A CN1707438A (en) 2005-12-14
CN100399283C true CN100399283C (en) 2008-07-02

Family

ID=35461905

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2005100591727A Expired - Fee Related CN100399283C (en) 2004-06-10 2005-03-24 Method and apparatus for extending dispersion frame technique behavior using dynamic rule sets

Country Status (3)

Country Link
US (2) US7480828B2 (en)
CN (1) CN100399283C (en)
TW (1) TWI327694B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7480828B2 (en) * 2004-06-10 2009-01-20 International Business Machines Corporation Method, apparatus and program storage device for extending dispersion frame technique behavior using dynamic rule sets
US7752488B2 (en) * 2006-01-06 2010-07-06 International Business Machines Corporation Method to adjust error thresholds in a data storage and retrieval system
US7647533B2 (en) * 2006-04-25 2010-01-12 Alcatel Lucent Automatic protection switching and error signal processing coordination apparatus and methods
WO2012074515A1 (en) * 2010-11-30 2012-06-07 Hewlett-Packard Development Company, L.P. Change message broadcast error detection
US8479086B2 (en) * 2011-10-03 2013-07-02 Lsi Corporation Systems and methods for efficient parameter modification
US9223673B1 (en) 2013-04-08 2015-12-29 Amazon Technologies, Inc. Custom host errors definition service
JP6164307B2 (en) * 2014-01-09 2017-07-19 富士通株式会社 Analysis method, analysis program, and analysis apparatus
CN109933448B (en) * 2014-12-25 2021-04-20 华为技术有限公司 Method and device for predicting fault of nonvolatile storage medium
KR102413096B1 (en) * 2018-01-08 2022-06-27 삼성전자주식회사 Electronic device and control method thereof
US11789842B2 (en) * 2021-10-11 2023-10-17 Dell Products L.P. System and method for advanced detection of potential system impairment

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6336065B1 (en) * 1999-10-28 2002-01-01 General Electric Company Method and system for analyzing fault and snapshot operational parameter data for diagnostics of machine malfunctions
US20020078382A1 (en) * 2000-11-29 2002-06-20 Ali Sheikh Scalable system for monitoring network system and components and methodology therefore
US6650949B1 (en) * 1999-12-30 2003-11-18 General Electric Company Method and system for sorting incident log data from a plurality of machines

Family Cites Families (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH08249133A (en) * 1994-12-15 1996-09-27 Internatl Business Mach Corp <Ibm> Method and system for measures against fault of disk drive array
US5664093A (en) * 1994-12-27 1997-09-02 General Electric Company System and method for managing faults in a distributed system
US5974576A (en) 1996-05-10 1999-10-26 Sun Microsystems, Inc. On-line memory monitoring system and methods
US5768501A (en) * 1996-05-28 1998-06-16 Cabletron Systems Method and apparatus for inter-domain alarm correlation
US5944782A (en) * 1996-10-16 1999-08-31 Veritas Software Corporation Event management system for distributed computing environment
US5828830A (en) * 1996-10-30 1998-10-27 Sun Microsystems, Inc. Method and system for priortizing and filtering traps from network devices
JPH10187321A (en) 1996-12-25 1998-07-14 Casio Comput Co Ltd Error announcing device and recording medium for recording error announcement control program
CA2197263A1 (en) * 1997-02-11 1998-08-11 Dan Burke Method of detecting signal degradation fault conditions within sonet and sdh signals
US5935261A (en) * 1997-06-05 1999-08-10 International Business Machines Corporation Method and apparatus for detecting handling damage in a disk drive
US6553403B1 (en) * 1998-06-03 2003-04-22 International Business Machines Corporation System, method and computer program product for monitoring in a distributed computing environment
US6760391B1 (en) * 1999-09-14 2004-07-06 Nortel Networks Limited Method and apparatus for line rate control in a digital communications system
US6615367B1 (en) * 1999-10-28 2003-09-02 General Electric Company Method and apparatus for diagnosing difficult to diagnose faults in a complex system
US6598179B1 (en) * 2000-03-31 2003-07-22 International Business Machines Corporation Table-based error log analysis
GB0008952D0 (en) * 2000-04-12 2000-05-31 Mitel Corp Dynamic rule sets for generated logs
US6983413B2 (en) 2000-12-12 2006-01-03 Kabushiki Kaisha Toshiba Data processing method using error-correcting code and an apparatus using the same method
US20020124214A1 (en) * 2001-03-01 2002-09-05 International Business Machines Corporation Method and system for eliminating duplicate reported errors in a logically partitioned multiprocessing system
US6966015B2 (en) * 2001-03-22 2005-11-15 Micromuse, Ltd. Method and system for reducing false alarms in network fault management systems
US7007172B2 (en) 2001-06-01 2006-02-28 Microchip Technology Incorporated Modified Harvard architecture processor having data memory space mapped to program memory space with erroneous execution protection
US6898733B2 (en) * 2001-10-31 2005-05-24 Hewlett-Packard Development Company, L.P. Process activity and error monitoring system and method
US7234086B1 (en) * 2003-01-16 2007-06-19 Pmc Sierra, Inc. Overlapping jumping window for SONET/SDH bit error rate monitoring
US7480828B2 (en) * 2004-06-10 2009-01-20 International Business Machines Corporation Method, apparatus and program storage device for extending dispersion frame technique behavior using dynamic rule sets

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Error Log Analysis: Statistical Modeling and Heuristic Trend Analysis. Ting-Ting Y. Lin, Daniel P. Siewiorek. IEEE Transactions on Reliability, Vol. 39, No. 4. 1990 *
Rule acquisition method for inconsistent information systems based on rough sets (基于粗糙集的不一致信息系统规则获取方法). 菅利荣, 达庆利, 陈伟达. 中国管理科学 (Chinese Journal of Management Science), Vol. 11, No. 4. 2003 *

Also Published As

Publication number Publication date
CN1707438A (en) 2005-12-14
US7480828B2 (en) 2009-01-20
TW200622580A (en) 2006-07-01
US20050278570A1 (en) 2005-12-15
TWI327694B (en) 2010-07-21
US20090063906A1 (en) 2009-03-05
US7725773B2 (en) 2010-05-25

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20080702

Termination date: 20190324
