CN103198016B

CN103198016B - Software error localization method based on joint dependence probability modeling

Info

Publication number: CN103198016B
Application number: CN201310099997.6A
Authority: CN
Inventors: 苏小红; 龚丹丹; 马培军; 王甜甜; 赵玲玲; 王煜
Original assignee: Harbin Institute of Technology
Current assignee: Harbin Institute of Technology
Priority date: 2013-03-26
Filing date: 2013-03-26
Publication date: 2016-08-03
Anticipated expiration: 2033-03-26
Also published as: CN103198016A

Abstract

A software error localization method based on joint dependence probability modeling. The invention relates to the field of computer program analysis and addresses the low localization precision of traditional software error localization methods. Step 1: execute the passing test cases and the failing test cases separately, and build a joint dependence probability model for each of the two sets. Step 2: on the basis of step 1, compute the suspiciousness of each node from the joint dependence probability models. Step 3: sort the error localization information in descending order of suspiciousness; nodes with higher suspiciousness are regarded as more likely to be faulty, which completes software error localization based on joint dependence probability modeling. The invention applies to the field of computer program analysis.

Description

Software error localization method based on joint dependence probability modeling

Technical field

The present invention relates to the field of computer program analysis.

Background art

As computer software is widely used in fields such as the economy, the military, and business, its reliability has drawn increasing attention. However, as software systems grow more complex, software often does not run as expected; in other words, it does not run reliably, which adversely affects computer application systems and can even cause huge economic losses and catastrophic consequences. Ensuring high software quality and reliability has therefore become an indispensable part of system development and maintenance.

The main cause of unreliable software is errors in the program source code. Programming is a complex activity: it is difficult to derive every possible execution path in a program and to predict the environmental factors that may affect the program or be affected by it. Even if a program appears to execute correctly, it may still fail in rare cases or when specific conditions are met. Software errors are therefore a problem in urgent need of a solution.

Software testing and debugging are important stages of the software development process, and together they can effectively identify and eliminate software errors: testing exposes errors, and debugging eliminates them. However, errors are often eliminated during debugging more slowly than they are found during testing. Many automated software testing tools already exist, but debugging is still mostly done by manual analysis, which is a difficult and time-consuming task for two reasons. (1) The error must first be located. In some cases, the point at which a developer observes a failure during program execution is far from the point of the error, and a great deal of time and effort is needed to find the code that caused it. (2) The error must then be understood. Locating the error is only the first step of debugging; the error must next be eliminated by correctly modifying the program source code. In some cases it is not obvious how to modify a statement correctly, so the developer has to analyze the debugging context manually to understand why a statement is wrong, and then find a way to correct the code without introducing new errors during the modification.

If software debugging could be automated, that is, if a computer could automatically find the location of an error in program code, analyze its cause, and then correct it, software reliability could be ensured more effectively. Automatic software error localization uses a computer to analyze the program source code or the runtime state produced during execution, to detect and analyze abnormal conditions in the program, and to isolate them as suspicious code. Automatically filtering out code unrelated to the error and retaining only the relevant code that needs further debugging narrows the search for faulty code, helps developers identify faulty statements more quickly, and effectively improves debugging efficiency. The present invention therefore targets the practical background and requirements of software reliability and studies automatic software error localization, in order to provide a theoretical basis for software debugging and error correction, improve software quality, ensure high software reliability, and improve the understandability and maintainability of software.

Summary of the invention

The present invention addresses the low localization precision of traditional software error localization methods and provides a software error localization method based on joint dependence probability modeling.

The software error localization method based on joint dependence probability modeling comprises the following steps:

Step 1: execute the passing test cases and the failing test cases separately, and build a joint dependence probability model for each of the two sets;

Step 2: on the basis of step 1, compute the suspiciousness of each node from the joint dependence probability models;

Step 3: sort the error localization information in descending order of suspiciousness; nodes with higher suspiciousness are regarded as more likely to be faulty, which completes software error localization based on joint dependence probability modeling.
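Step 3 amounts to a descending sort. The following is a minimal Python sketch, assuming the per-node scores from step 2 are already available as a dictionary; the function name `rank_nodes` and the sample scores are hypothetical and do not come from the patent.

```python
def rank_nodes(scores):
    """Sort (node, score) pairs so the most suspicious node comes first."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical scores for three statements, keyed by statement number.
scores = {3: 0.4, 5: 2.5, 7: 1.0}
ranking = rank_nodes(scores)

for node, score in ranking:
    print(f"node {node}: {score}")
```

A developer would then inspect the statements in the order given by `ranking`.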

Effects of the invention:

The basic idea of the present invention is that the joint dependence of a node represents well the data dependence relations between the node and its parent nodes under different execution states, which helps error localization. If the joint dependence of a node occurs frequently during the execution of failing test cases but rarely or never during the execution of passing test cases, that joint dependence is likely to be faulty. The suspiciousness of the joint dependence of each statement is computed on this basis, so that software errors are located effectively.

The software error localization method based on joint dependence probability modeling of the present invention can effectively locate software errors related to data dependence. Compared with the error localization methods SBI, SOBER, and Tarantula, localization precision can improve by more than 15%, and the method is suitable for error localization in large-scale program code.

Brief description of the drawings

Fig. 1 is a flowchart of the method of the present invention;

Fig. 2 is a schematic diagram of building the joint dependence probability model in Embodiment 1;

Fig. 3 is a schematic example of a program's control flow graph and data dependence graph in Embodiment 1;

Fig. 4 is a schematic example of a program's control dependence path and data dependence path in Embodiment 1.

Detailed description of the embodiments

Embodiment 1: this embodiment is described with reference to Fig. 1 to Fig. 4. The software error localization method based on joint dependence probability modeling of this embodiment comprises the following steps:

In this embodiment, the upper left part of Fig. 3 is the program void distance(), the lower left part is the data dependence graph of this program, and the right side is the control flow graph of this program;

In Fig. 4, the left side is the test case that is run, the middle part is the control dependence path obtained by executing the test case, and the right part is the data dependence path obtained by executing the test case. For example, 5(true) in the control dependence path means that node 5 was executed and that its state when executed was true; 5(true, [d1(i), d2(n)]) in the data dependence path means that node 5 was executed with state true and is data dependent on node 1 and node 2 through the variables i and n, respectively;

For example, in step E of this embodiment, n(5(true, [d1(i), d2(n)])) denotes the number of times 5(true, [d1(i), d2(n)]) occurs in the data dependence path, and n(5(true)) denotes the total number of times node 5 occurs with state true.
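The two counts can be illustrated with a small Python sketch. The tuple encoding of a path entry and the sample path are assumptions made for illustration (node 8 in particular is hypothetical); only the notation 5(true, [d1(i), d2(n)]) comes from the text above.

```python
from collections import Counter

# Each entry is (node, state, data dependences); a dependence is a
# (defining node, variable) pair, so d1(i) becomes (1, "i").
data_path = [
    (5, True, ((1, "i"), (2, "n"))),
    (5, True, ((8, "i"), (2, "n"))),
    (5, True, ((8, "i"), (2, "n"))),
    (5, False, ((8, "i"), (2, "n"))),
]

# n(node(state, dependences)): occurrences of each complete entry.
n_entry = Counter(data_path)
# n(node(state)): occurrences of each node/state pair, whatever the dependences.
n_state = Counter((node, state) for node, state, _ in data_path)

print(n_entry[(5, True, ((1, "i"), (2, "n")))])
print(n_state[(5, True)])
```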

Effects of this embodiment:

The basic idea of this embodiment is that the joint dependence of a node represents well the data dependence relations between the node and its parent nodes under different execution states, which helps error localization. If the joint dependence of a node occurs frequently during the execution of failing test cases but rarely or never during the execution of passing test cases, that joint dependence is likely to be faulty. The suspiciousness of the joint dependence of each statement is computed on this basis, so that software errors are located effectively.

The software error localization method based on joint dependence probability modeling of this embodiment can effectively locate software errors related to data dependence. Compared with the error localization methods SBI, SOBER, and Tarantula, localization precision can improve by more than 15%, and the method is suitable for error localization in large-scale program code.

Embodiment 2: this embodiment differs from Embodiment 1 in that the joint dependence probability model is built as follows:

A. First build the control flow graph of the program and record the control dependence relations between statements;

B. Then build the data dependence graph of the program and record the data dependence relations between statements;

C. Then run the test cases and use instrumentation to capture the control dependence path and the data dependence path of each node;
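As a rough illustration of the instrumentation in step C, the sketch below inserts probes by hand into a small Python function and records which nodes ran and with which branch outcome. The `probe` helper, the node numbering, and the `count_positive` function are all hypothetical; the source does not describe how its probes are implemented, and a real tool would insert them automatically and would also record data dependences.

```python
control_path = []  # filled in as the instrumented program runs


def probe(node, state=None):
    """Record that `node` ran (with a branch outcome, if any) and pass the outcome through."""
    control_path.append((node, state))
    return state


def count_positive(values):
    probe(1)
    total = 0
    for v in values:
        probe(2, True)            # loop condition held
        if probe(3, v > 0):       # branch node: records True or False
            probe(4)
            total += 1
    probe(2, False)               # loop condition failed, loop exits
    return total


result = count_positive([3, -1, 2])
print(control_path)
```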

D. From the control flow graph and the control dependence paths of the nodes, compute the state dependence probability of each node;

Here, the probability that each node is executed is denoted P(node); for a branch node, the probabilities that its state is true and false are additionally recorded on the basis of the execution probability, denoted P(node(true)) and P(node(false));

For each node, the probability P(node) that the node is executed is computed by the following formula:

$$ P(\text{node}) = \frac{n(\text{node})}{n(\text{para}(\text{node}))} \times P(\text{para}(\text{node})) \tag{1} $$

where P(para(node)) is the probability that the parent node of node is executed, n(node) is the number of times node is executed in the control dependence path, and n(para(node)) is the number of times the parent node of node is executed in the control dependence path;
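Formula (1) can be evaluated recursively from the entry node downward. The sketch below is one possible reading, assuming that the entry node has execution probability 1 and that a node whose parent never ran gets probability 0; neither boundary case is stated in the source, and the names `execution_probability`, `parent`, and `counts` are illustrative.

```python
def execution_probability(node, parent, counts, cache=None):
    """Formula (1): P(node) = n(node) / n(para(node)) * P(para(node))."""
    if cache is None:
        cache = {}
    if node not in cache:
        par = parent[node]
        if par is None:
            cache[node] = 1.0  # assumption: the entry node always executes
        elif counts[par] == 0:
            cache[node] = 0.0  # assumption: parent never ran, so neither did node
        else:
            p_parent = execution_probability(par, parent, counts, cache)
            cache[node] = counts[node] / counts[par] * p_parent
    return cache[node]


# Hypothetical control dependence: entry -> if -> then, with execution counts.
parent = {"entry": None, "if": "entry", "then": "if"}
counts = {"entry": 10, "if": 10, "then": 4}
p_then = execution_probability("then", parent, counts)
print(p_then)
```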

For a branch node, the probabilities that the state of the node is true and false, P(node(true)) and P(node(false)), are computed on the basis of its execution probability:

$$ P(\text{node}) = P(\text{node}(\text{true})) + P(\text{node}(\text{false})) \tag{2} $$

where

$$ P(\text{node}(\text{false})) = \frac{n(\text{node}(\text{false}))}{n(\text{node})} \times P(\text{node}) = \frac{n(\text{node}(\text{false}))}{n(\text{node}(\text{true})) + n(\text{node}(\text{false}))} \times P(\text{node}) $$

where n(node(true)) and n(node(false)) are the numbers of times node is executed with state true and false, respectively, in the control dependence path. By formula (2), P(node(true)) is the same expression with n(node(true)) in the numerator;
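The split of P(node) between the two branch outcomes can be sketched in Python as follows. The function name and the sample counts are hypothetical, and returning zero for a branch that was never executed is an assumption, since the source does not cover that case.

```python
def branch_state_probabilities(n_true, n_false, p_node):
    """Split P(node) into P(node(true)) and P(node(false)) by observed counts."""
    total = n_true + n_false
    if total == 0:
        return 0.0, 0.0  # assumption: a branch that never ran has no state probability
    return n_true / total * p_node, n_false / total * p_node


# Hypothetical branch: taken 3 times, not taken once, executed with probability 0.8.
p_true, p_false = branch_state_probabilities(n_true=3, n_false=1, p_node=0.8)
print(p_true, p_false)
```

The two results always add up to `p_node`, which is the relation stated in formula (2).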

E. From the data dependence graph and the data dependence paths of the nodes, compute the conditional dependence probability P(data dependency | state dependency) of each node:

$$ P(\text{data dependency} \mid \text{state dependency}) = \frac{n(\text{node}(\text{state dependency}, \text{data dependency}))}{n(\text{node}(\text{state dependency}))} \tag{3} $$

where n(node(state dependency, data dependency)) is the number of times node occurs in the data dependence path with the given state and the given data dependence, and n(node(state dependency)) is the total number of times node occurs in the data dependence path with the given state;
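Formula (3) is a ratio of two occurrence counts over the data dependence path. A minimal Python sketch follows, assuming each path entry is encoded as a (node, state, dependences) tuple; this encoding, the function name, and the sample path (including node 8) are illustrative assumptions rather than the patent's own representation.

```python
from collections import Counter


def conditional_dependence_probabilities(data_path):
    """Formula (3) for every (node, state, dependences) entry seen in the path."""
    n_entry = Counter(data_path)
    n_state = Counter((node, state) for node, state, _ in data_path)
    return {
        entry: count / n_state[(entry[0], entry[1])]
        for entry, count in n_entry.items()
    }


# Hypothetical data dependence path for node 5.
data_path = [
    (5, True, ((1, "i"), (2, "n"))),
    (5, True, ((8, "i"), (2, "n"))),
    (5, True, ((8, "i"), (2, "n"))),
    (5, False, ((8, "i"), (2, "n"))),
]
cond = conditional_dependence_probabilities(data_path)
for entry, probability in cond.items():
    print(entry, probability)
```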

F. From the state dependence probability and the conditional dependence probability of each node, compute the joint dependence probability of the node by the following formula:

$$ P(\text{joint dependency}) = P(\text{state dependency}) \times P(\text{data dependency} \mid \text{state dependency}) \tag{4} $$
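Formula (4) is a single multiplication. The short Python sketch below uses hypothetical input values for node 5; the function name is illustrative.

```python
def joint_dependence_probability(p_state, p_conditional):
    """Formula (4): state dependence probability times conditional dependence probability."""
    return p_state * p_conditional


# Hypothetical inputs: P(5(true)) = 0.6 and P([d1(i), d2(n)] | 5(true)) = 0.25.
p_joint = joint_dependence_probability(0.6, 0.25)
print(p_joint)
```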

G. Build the joint dependence probability model

Definition (joint dependence probability model): the joint dependence probability model of program code P is a triple (D, S, R), where:

(1) D = (N, E) is the data dependence graph of P, where N is the set of nodes and E is the set of data dependence edges, representing the data dependence relations of the program;

(2) S is the mapping from nodes to states;

(3) R is the joint dependence probability of the nodes.
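One way to hold the triple (D, S, R) in memory is sketched below in Python. The class name, field names, and the `add_observation` helper are assumptions made for illustration; the source specifies only the three components, not a concrete data structure.

```python
from dataclasses import dataclass, field


@dataclass
class JointDependenceModel:
    """The triple (D, S, R) with D = (N, E)."""
    nodes: set = field(default_factory=set)          # N
    edges: set = field(default_factory=set)          # E: (defining node, using node, variable)
    states: dict = field(default_factory=dict)       # S: node -> set of observed states
    joint_prob: dict = field(default_factory=dict)   # R: (node, state, dependences) -> probability

    def add_observation(self, node, state, deps, probability):
        """Register one (node, state, dependences) entry and its joint dependence probability."""
        self.nodes.add(node)
        self.states.setdefault(node, set()).add(state)
        for def_node, var in deps:
            self.nodes.add(def_node)
            self.edges.add((def_node, node, var))
        self.joint_prob[(node, state, deps)] = probability


model = JointDependenceModel()
model.add_observation(5, True, ((1, "i"), (2, "n")), 0.15)
print(model)
```

Under this sketch, one such model would be built from the passing runs and another from the failing runs, as step 1 of the method requires.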

Other steps and parameters are the same as in Embodiment 1.

Embodiment 3: this embodiment differs from Embodiment 2 in that the suspiciousness of the execution state of each node is computed as follows:

Compute the suspiciousness suspicious_score of the joint dependence of each node:

$$ \text{suspicious\_score}(\text{node}) = \frac{P_{\text{failed}}(\text{joint dependency})}{P_{\text{passed}}(\text{joint dependency})} \tag{5} $$

where P_passed(joint dependency) is the joint dependence probability of node when the passing test cases are executed, and P_failed(joint dependency) is the joint dependence probability of node when the failing test cases are executed.
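Formula (5) can be sketched in Python as below. The source does not say what happens when the passing-run probability is zero, so the handling of that case here (infinite score if the dependence appears in failing runs, zero otherwise) is an assumption chosen to match the stated idea that a dependence seen only in failing runs is likely faulty.

```python
import math


def suspicious_score(p_failed, p_passed):
    """Formula (5): ratio of joint dependence probabilities, failing over passing runs."""
    if p_passed == 0.0:
        # Assumption: the source does not define this case. A dependence seen only in
        # failing runs is treated as maximally suspicious, one never seen as harmless.
        return math.inf if p_failed > 0.0 else 0.0
    return p_failed / p_passed


print(suspicious_score(0.3, 0.1))
print(suspicious_score(0.2, 0.0))
```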

Other steps and parameters are the same as in Embodiment 2.

Embodiment 4: this embodiment differs from Embodiment 3 in that each node corresponds to one statement in the program. Other steps and parameters are the same as in Embodiment 3.

Embodiment 5: this embodiment differs from Embodiment 4 in that the branch nodes correspond to the selection statements and loop statements in the program. Other steps and parameters are the same as in Embodiment 4.

Claims

1. A software error localization method based on joint dependence probability modeling, characterized in that the method comprises the following steps:

The joint dependence probability model is built as follows:

where the probability that each node is executed is denoted P(node), and for a branch node the probabilities that the state of the node is true and false, P(node(true)) and P(node(false)), are computed on the basis of the execution probability;

$$ P(\text{node}) = \frac{n(\text{node})}{n(\text{para}(\text{node}))} \times P(\text{para}(\text{node})) \tag{1} $$

For a branch node, the probabilities that the state of the node is true and false, P(node(true)) and P(node(false)), are computed on the basis of its execution probability:

$$ P(\text{node}) = P(\text{node}(\text{true})) + P(\text{node}(\text{false})) \tag{2} $$

where

$$ P(\text{node}(\text{false})) = \frac{n(\text{node}(\text{false}))}{n(\text{node})} \times P(\text{node}) = \frac{n(\text{node}(\text{false}))}{n(\text{node}(\text{true})) + n(\text{node}(\text{false}))} \times P(\text{node}) $$

$$ P(\text{data dependency} \mid \text{state dependency}) = \frac{n(\text{node}(\text{state dependency}, \text{data dependency}))}{n(\text{node}(\text{state dependency}))} \tag{3} $$

G. Build the joint dependence probability model

Definition: the joint dependence probability model of program code P is a triple (D, S, R), where:

(2) S is the mapping from nodes to states;

(3) R is the joint dependence probability of the nodes;

2. The software error localization method based on joint dependence probability modeling according to claim 1, characterized in that computing the suspiciousness of each node from the joint dependence probability model is specifically:

Compute the suspiciousness suspicious_score(node) of the joint dependence of each node:

$$ \text{suspicious\_score}(\text{node}) = \frac{P_{\text{failed}}(\text{joint dependency})}{P_{\text{passed}}(\text{joint dependency})} \tag{5} $$

3. The software error localization method based on joint dependence probability modeling according to claim 2, characterized in that each node corresponds to one statement in the program.

4. The software error localization method based on joint dependence probability modeling according to claim 3, characterized in that the branch nodes correspond to the selection statements and loop statements in the program.