Abstract:
Of all software development activities, debugging (locating the defective
source code statements that cause a failure) can be by far the most
time-consuming. We employ probabilistic modeling to support programmers in
finding defective code. Most defects are identifiable in control flow graphs of
software traces. A trace is represented by a sequence of code positions (line
numbers in source filenames) that are executed when the software runs. The
control flow graph represents the finite state machine of the program, in which
states depict code positions and arcs indicate valid follow-up code positions.
In this work, we extend this definition to an n-gram control flow graph,
where a state represents a fragment of consecutive code positions, also referred
to as an n-gram of code positions. We devise a probabilistic model for such
graphs in order to infer code positions in which anomalous program behavior can
be observed. This model is evaluated on real-world data obtained from the
open-source AspectJ project and compared to the well-known multinomial and
multivariate Bernoulli models.
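
To make the notion of an n-gram control flow graph concrete, the following minimal sketch builds one from a single trace by counting transitions between consecutive n-grams of code positions. It is an illustration only and not the probabilistic model proposed in the paper; the function name `ngram_control_flow_graph`, the `file:line` string encoding of code positions, and the example trace are assumptions made for this sketch.

```python
from collections import defaultdict


def ngram_control_flow_graph(trace, n):
    """Count transitions between consecutive n-grams of code positions.

    trace: list of code positions (here hypothetical "file:line" strings).
    n:     number of code positions per state; n = 1 yields the ordinary
           control flow graph observed in this trace.
    Returns a dict mapping each n-gram (tuple) to a dict of successor
    n-grams and how often that transition was observed.
    """
    # Slide a window of length n over the trace to obtain the states.
    grams = [tuple(trace[i:i + n]) for i in range(len(trace) - n + 1)]
    graph = defaultdict(lambda: defaultdict(int))
    # Each pair of adjacent windows is one arc in the graph.
    for src, dst in zip(grams, grams[1:]):
        graph[src][dst] += 1
    return {src: dict(dsts) for src, dsts in graph.items()}


# Hypothetical trace: a loop over lines 2 and 3, executed twice.
trace = ["A.java:1", "A.java:2", "A.java:3",
         "A.java:2", "A.java:3", "A.java:4"]

cfg = ngram_control_flow_graph(trace, 1)     # states are single positions
bigram = ngram_control_flow_graph(trace, 2)  # states are pairs of positions

for state, successors in bigram.items():
    print(state, "->", successors)
```

With n = 2, a state such as `("A.java:2", "A.java:3")` carries one step of execution history, so its successors are distinguished by the path taken to reach them. Normalizing the counts per state would give empirical transition frequencies, which is one possible starting point for a probabilistic treatment.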