
SOFTWARE TESTING

UNIT 3
PATH TESTING, DATA FLOW TESTING
3.1 Path Testing
Definition
 A program graph is a directed graph in which nodes are either entire statements or fragments
of a statement, and edges represent flow of control.
 If i and j are nodes in the program graph, there is an edge from node i to node j iff the
statement (fragment) corresponding to node j can be executed immediately after the
statement (fragment) corresponding to node i.
 Constructing a program graph from a given program is an easy process. Line numbers refer to
statements and statement fragments. There is an element of judgment here: sometimes it is
convenient to keep a fragment (like a BEGIN) as a separate node; other times it seems better
to include it with another portion of a statement.
1. Program triangle2 'Structured programming version of simpler specification'
2. Dim a, b, c As Integer
3. Dim IsATriangle As Boolean
Step 1: Get Input
4. Output('Enter three integers which are sides of a triangle:')
5. Input(a, b, c)
6. Output('Side A is ', a)
7. Output('Side B is ', b)
8. Output('Side C is ', c)
Step 2: Is A Triangle?
9. IF (a < b + c) AND (b < a + c) AND (c < a + b)
10.   THEN IsATriangle := TRUE
11.   ELSE IsATriangle := FALSE
12. END IF
Step 3: Determine Triangle Type
13. IF IsATriangle
14.   THEN IF (a = b) AND (b = c)
15.     THEN Output('Triangle is Equilateral')
16.     ELSE IF (a <> b) AND (a <> c) AND (b <> c)
17.       THEN Output('Triangle is Scalene')
18.       ELSE Output('Isosceles')

By- Shrikant Pujar Dept of CSE, JIT Page 1



19.     END IF
20.   END IF
21.   ELSE Output('Not a Triangle')
22. END IF
23. End triangle2
 A program graph of this program is given in Figure 3.1. Nodes 4 through 8 are a sequence,
nodes 9 through 12 are an IF-THEN-ELSE construct, and nodes 13 through 22 are nested
IF-THEN-ELSE constructs. Nodes 4 and 23 are the program source and sink nodes,
corresponding to the single entry, single exit criteria.

Figure 3.1 Program Graph of the Triangle Program


 There are no loops, so this is a directed acyclic graph. The importance of the program
graph is that program executions correspond to paths from the source to the sink nodes.
 Figure 3.2 is a graph of a simple program; it is typical of the kind of example used to show the
impossibility of completely testing even simple programs [Schach 93].
 In this program, there are 5 paths from node B to node F in the interior of the loop. If the loop
may have up to 18 repetitions, there are some 4.77 trillion distinct program execution paths.

Figure 3.2 Trillions of Paths
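
The 4.77 trillion figure can be checked with a few lines of arithmetic. The sketch below assumes that the loop body executes between 1 and 18 times and that each pass independently takes one of the 5 interior paths; the function name is illustrative and does not come from the source.

```python
def count_execution_paths(paths_per_pass: int, max_passes: int) -> int:
    """Count distinct paths when the loop runs 1..max_passes times
    and each pass can take any of paths_per_pass interior paths."""
    return sum(paths_per_pass ** k for k in range(1, max_passes + 1))


total = count_execution_paths(5, 18)
print(f"{total:,} paths (about {total / 1e12:.2f} trillion)")
```

Under these assumptions the sum 5^1 + 5^2 + ... + 5^18 is dominated by its last term, which is why adding even one more permitted repetition multiplies the testing effort by roughly five.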

3.2 DD-Paths
 The best known form of structural testing is based on a construct known as a decision-to-
decision path (DD-Path) [Miller 77]. The name refers to a sequence of statements that, in
Miller’s words, begins with the “outway” of a decision statement and ends with the “inway”
of the next decision statement.
 There are no internal branches in such a sequence, so the corresponding code is like a row
of dominoes lined up so that when the first falls, all the rest in the sequence fall.
 Miller’s original definition works well for second generation languages like FORTRAN II,
because decision making statements (such as arithmetic IFs and DO loops) use statement
labels to refer to target statements.
 With block structured languages (Pascal, Ada, C), the notion of statement fragments
resolves the difficulty of applying Miller’s original definition; otherwise, we end up with
program graphs in which some statements are members of more than one DD-Path.


 We will define DD-Paths in terms of paths of nodes in a directed graph. We might call these
paths chains, where a chain is a path in which the initial and terminal nodes are distinct, and
every interior node has indegree = 1 and outdegree = 1.
 Notice that the initial node is 2-connected to every other node in the chain, and there are no
instances of 1- or 3-connected nodes, as shown in Figure 3.3. The length (number of edges)
of the chain in Figure 3.3 is 6. We can have a degenerate case of a chain that is of length 0,
that is, a chain consisting of exactly one node and no edges.

Figure 3.3 A Chain of Nodes in a Directed Graph (initial node, interior nodes, terminal node)

Definition:
A DD-Path is a sequence of nodes in a program graph such that
Case 1: it consists of a single node with indeg = 0,
Case 2: it consists of a single node with outdeg = 0,
Case 3: it consists of a single node with indeg ≥ 2 or outdeg ≥ 2,
Case 4: it consists of a single node with indeg = 1 and outdeg = 1,
Case 5: it is a maximal chain of length ≥ 1.

 Cases 1 and 2 establish the unique source and sink nodes of the program graph of a structured
program as initial and final DD-Paths.
 Case 3 deals with complex nodes; it assures that no node is contained in more than one DD-
Path. Case 4 is needed for “short branches”; it also preserves the one fragment, one DD-Path
principle.
 Case 5 is the “normal case”, in which a DD-Path is a single entry, single exit sequence of
nodes (a chain). The “maximal” part of the case 5 definition is used to determine the final node
of a normal (non-trivial) chain.


Table 1 Types of DD-Paths in Figure 3.1

Program Graph Nodes    DD-Path Name    Case of Definition
4                      First           1
5-8                    A               5
9                      B               3
10                     C               4
11                     D               4
12                     E               3
13                     F               3
14                     H               3
15                     I               4
16                     J               3
17                     K               4
18                     L               4
19                     M               3
20                     N               3
21                     G               4
22                     O               3
23                     Last            2

 This is a complex definition, so we’ll apply it to the program graph in Figure 3.1. Node 4 is a
Case 1 DD-Path, which we’ll call “first”; similarly, node 23 is a Case 2 DD-Path, and we’ll
call it “last”. Nodes 5 through 8 are a Case 5 DD-Path.
 We know that node 8 is the last node in this DD-Path because it is the last node that
preserves the 2-connectedness property of the chain. If we went beyond node 8 to include
node 9, we would violate the indegree = outdegree = 1 criterion of a chain.
 If we stopped at node 7, we would violate the “maximal” criterion. Nodes 10, 11, 15, 17,
18, and 21 are Case 4 DD-Paths. Nodes 9, 12, 13, 14, 16, 19, 20, and 22 are Case 3 DD-Paths.
Finally, node 23 is a Case 2 DD-Path.
 All of this is summarized in Table 1, where the DD-Path names correspond to the DD-Path
graph in Figure 3.4.
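
The five cases can be applied mechanically. The following is a minimal sketch, not a definitive implementation: the edge list is an assumption read off the numbered triangle listing (the figure itself is not reproduced), and the function name `dd_paths` is hypothetical. It treats every node with indegree = outdegree = 1 as a chain candidate and groups consecutive candidates into maximal runs, which is one reading of the definition that reproduces Table 1.

```python
from collections import defaultdict

# Edges of the triangle program graph, read off the numbered listing
# (an assumption of this sketch).
PROGRAM_GRAPH_EDGES = [
    (4, 5), (5, 6), (6, 7), (7, 8), (8, 9),
    (9, 10), (9, 11), (10, 12), (11, 12), (12, 13),
    (13, 14), (13, 21), (14, 15), (14, 16), (15, 20),
    (16, 17), (16, 18), (17, 19), (18, 19), (19, 20),
    (20, 22), (21, 22), (22, 23),
]


def dd_paths(edges):
    """Return a list of (nodes, case) pairs, one per DD-Path."""
    succ, pred = defaultdict(list), defaultdict(list)
    for u, v in edges:
        succ[u].append(v)
        pred[v].append(u)
    nodes = sorted(set(succ) | set(pred))

    def is_simple(n):
        # Chain candidate: exactly one way in and one way out.
        return len(pred[n]) == 1 and len(succ[n]) == 1

    result, seen = [], set()
    for n in nodes:
        if n in seen:
            continue
        indeg, outdeg = len(pred[n]), len(succ[n])
        if indeg == 0:
            result.append(((n,), 1))          # Case 1: source
        elif outdeg == 0:
            result.append(((n,), 2))          # Case 2: sink
        elif indeg >= 2 or outdeg >= 2:
            result.append(((n,), 3))          # Case 3: complex node
        else:
            # Grow the run of simple nodes in both directions (maximality).
            run = [n]
            while is_simple(pred[run[0]][0]):
                run.insert(0, pred[run[0]][0])
            while is_simple(succ[run[-1]][0]):
                run.append(succ[run[-1]][0])
            seen.update(run)
            # A lone simple node is Case 4; two or more form a Case 5 chain.
            result.append((tuple(run), 5 if len(run) > 1 else 4))
    return result


for path_nodes, case in dd_paths(PROGRAM_GRAPH_EDGES):
    print(path_nodes, "-> Case", case)
```

The only multi-node DD-Path found is (5, 6, 7, 8), which matches the observation that a logic-intensive program yields mostly single-node DD-Paths.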


 Part of the confusion with this example is that the triangle problem is logic intensive and
computationally sparse. This combination yields many short DD-Paths. If the THEN and ELSE
clauses contained blocks of computational statements, we would have longer DD-Paths, as
we do in the commission problem.

Figure 3.4 DD-Path Graph for the Triangle Program


Definition
 Given a program written in an imperative language, its DD-Path graph is the directed
graph in which nodes are DD-Paths of its program graph, and edges represent control
flow between successor DD-Paths.
 In effect, the DD-Path graph is a form of condensation graph; in this condensation, 2-connected
components are collapsed into individual nodes that correspond to Case 5 DD-Paths.
 The single node DD-Paths (corresponding to Cases 1 - 4) are required to preserve the
convention that a statement (or statement fragment) is in exactly one DD-Path. Without
this convention, we end up with rather clumsy DD-Path graphs, in which some statement
(fragments) are in several DD-Paths.
3.3 Test Coverage Metrics
 The raison d’être of DD-Paths is that they enable very precise descriptions of test coverage.
One of the fundamental limitations of functional testing is that it is impossible to know either
the extent of redundancy or the possibility of gaps corresponding to the way a set of functional
test cases exercises a program.
 Test coverage metrics are a device to measure the extent to which a set of test cases covers
(or exercises) a program. There are several widely accepted test coverage metrics; most of
those in Table 2 are due to the early work of E. F. Miller [Miller 77].
Table 2 Structural Test Coverage Metrics

Metric    Description of Coverage
C0        Every statement
C1        Every DD-Path (predicate outcome)
C1p       Every predicate to each outcome
C2        C1 coverage + loop coverage
Cd        C1 coverage + every dependent pair of DD-Paths
CMCC      Multiple condition coverage
Cik       Every program path that contains up to k repetitions of a loop (usually k = 2)
Cstat     “Statistically significant” fraction of paths
C∞        All possible execution paths


3.3.1 Metric Based Testing


 The test coverage metrics in Table 2 tell us what to test, but not how to test it. In this
section, we take a closer look at techniques that exercise source code in terms of the
metrics in Table 2.
 Miller’s test coverage metrics are based on program graphs in which nodes are full
statements, whereas our formulation allows statement fragments to be nodes.

Statement and Predicate Testing


 Because our formulation allows statement fragments to be individual nodes, the statement
and predicate levels (C0 and C1) collapse into one consideration.
 In our triangle example (see Figure 3.1), nodes 9, 10, and 11 are a complete IF-THEN-
ELSE statement. If we required nodes to correspond to full statements, we could execute just
one of the decision alternatives and satisfy the statement coverage criterion.
 Because we allow statement fragments, it is “natural” to divide such a statement into three
nodes. Doing so results in predicate outcome coverage. Whether or not our convention is
followed, these coverage metrics require that we find a set of test cases such that, when
executed, every node of the program graph is traversed at least once.
DD-Path Testing
 When every DD-Path is traversed (the C1 metric), we know that each predicate outcome has
been executed; this amounts to traversing every edge in the DD-Path graph (or program
graph), as opposed to just every node.
 For IF-THEN and IF-THEN-ELSE statements, this means that both the true and the false
branches are covered (C1p coverage). For CASE statements, each clause is covered.
Dependent Pairs of DD-Paths
 The Cd metric foreshadows dataflow testing. The most common dependency among
pairs of DD-Paths is the define/reference relationship, in which a variable is defined
(receives a value) in one DD-Path and is referenced in another DD-Path.
 The importance of these dependencies is that they are closely related to the problem of
infeasible paths.
 For example, in Figure 3.4, B and D are such a pair, as are DD-Paths C and L. Simple DD-Path
coverage might not exercise these dependencies, thus a deeper class of faults would not be
revealed.
Multiple Condition Coverage
 Look closely at the compound conditions in DD-Paths B, H, and J (nodes 9, 14, and 16 in
Table 1). Rather than simply traversing such predicates to their TRUE and FALSE outcomes,
we should investigate the different ways that each outcome can occur.
 One possibility is to make a truth table; a compound condition of three simple conditions
would have eight rows, yielding eight test cases. Another possibility is to reprogram
compound predicates into nested simple IF-THEN-ELSE logic, which will result in more DD-
Paths to cover.
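
A truth table for a compound condition is easy to generate. This is a minimal sketch for the three-part triangle predicate; the names `CONDITIONS` and `truth_table` are illustrative only. It enumerates the eight combinations of the simple conditions and the outcome of their conjunction. Some rows may be unattainable with real side lengths (for positive sides, at most one of the three inequalities can fail), so not every row necessarily becomes an executable test case.

```python
from itertools import product

# The three simple conditions of the compound predicate at node 9.
CONDITIONS = ("a < b + c", "b < a + c", "c < a + b")


def truth_table(conditions):
    """List every combination of condition values with the AND outcome."""
    rows = []
    for values in product((True, False), repeat=len(conditions)):
        rows.append((values, all(values)))
    return rows


print(" | ".join(CONDITIONS), "| IsATriangle")
for values, outcome in truth_table(CONDITIONS):
    print(values, "->", outcome)
```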

Loop Coverage
 The condensation graphs provide us with an elegant resolution to the problems of testing
loops. Loops are a highly fault prone portion of source code. To start, there is an
amusing taxonomy of loops in [Beizer 83]: concatenated, nested, and horrible, shown in
Figure 3.5.

Figure 3.5 Concatenated, Nested, and Knotted Loops


 Concatenated loops are simply a sequence of disjoint loops, while nested loops are such that
one is contained inside another.


 Horrible loops cannot occur when the structured programming precepts are followed.
When it is possible to branch into (or out from) the middle of a loop, and these branches
are internal to other loops, the result is Beizer’s horrible loop. (Other sources define this as a
knot—how appropriate.)
 The simple view of loop testing is that every loop involves a decision, and we need to test
both outcomes of the decision: one is to traverse the loop, and the other is to exit (or not enter)
the loop.
 Once a loop has been tested, the tester condenses it into a single node. If loops are nested,
this process is repeated starting with the innermost loop and working outward. This results
in the same multiplicity of test cases we found with boundary value analysis, which makes
sense, because each loop index variable acts like an input variable.
 If loops are knotted, it will be necessary to carefully analyze them in terms of the dataflow
methods. As a preview, consider the infinite loop that could occur if one loop tampers with
the value of the other loop’s index.

3.3.2 Test Coverage Analyzers


 Coverage analyzers are a class of test tools that offer automated support for this approach to
testing management. With a coverage analyzer, the tester runs a set of test cases on a program
that has been “instrumented” by the coverage analyzer.
 The analyzer then uses information produced by the instrumentation code to generate a
coverage report. For example, the instrumentation identifies and labels all DD-Paths in an
original program. When the instrumented program is executed with test cases, the analyzer
tabulates the DD-Paths traversed by each test case. In this way, the tester can experiment
with different sets of test cases to determine the coverage of each set.
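
The idea of instrumentation can be illustrated by hand. The sketch below is only an illustration of the principle, not how any particular coverage tool works: a Python rendering of the triangle logic calls a hypothetical `hit` probe with the DD-Path names from Table 1 (First and Last are omitted), and `coverage` tabulates the DD-Paths reached by a set of test cases.

```python
ALL_DD_PATHS = set("ABCDEFGHIJKLMNO")


def triangle_type(a, b, c, hit):
    """Triangle logic with a probe call at the start of each DD-Path."""
    hit("A")                      # nodes 5-8: echo the input
    hit("B")                      # node 9: compound decision
    if a < b + c and b < a + c and c < a + b:
        hit("C")
        is_triangle = True
    else:
        hit("D")
        is_triangle = False
    hit("E")                      # node 12
    hit("F")                      # node 13
    if is_triangle:
        hit("H")                  # node 14
        if a == b and b == c:
            hit("I")
            result = "Equilateral"
        else:
            hit("J")              # node 16
            if a != b and a != c and b != c:
                hit("K")
                result = "Scalene"
            else:
                hit("L")
                result = "Isosceles"
            hit("M")              # node 19
        hit("N")                  # node 20
    else:
        hit("G")
        result = "Not a Triangle"
    hit("O")                      # node 22
    return result


def coverage(test_cases):
    """Return the set of DD-Paths traversed by the given test cases."""
    covered = set()
    for sides in test_cases:
        triangle_type(*sides, covered.add)
    return covered


one_case = coverage([(3, 4, 5)])
four_cases = coverage([(3, 4, 5), (2, 2, 2), (2, 2, 3), (1, 2, 4)])
print("Missed by one case:", sorted(ALL_DD_PATHS - one_case))
print("Missed by four cases:", sorted(ALL_DD_PATHS - four_cases))
```

Comparing the two reports shows how a tester can experiment with different test sets: a single scalene case leaves D, G, I, and L untouched, while one case per triangle type reaches every DD-Path.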

3.4 Basis Path Testing


 The mathematical notion of a “basis” has attractive possibilities for structural testing.
Certain sets can have a basis, and when they do, the basis has very important properties with
respect to the entire set.
 Mathematicians usually define a basis in terms of a structure called a “vector space”, which is
a set of elements (called vectors) and which has operations that correspond to
multiplication and addition defined for the vectors.


 The basis of a vector space is a set of vectors such that the vectors are independent of each
other and they “span” the entire vector space in the sense that any other vector in the space
can be expressed in terms of the basis vectors.
 Thus a set of basis vectors somehow represents “the essence” of the full vector space:
everything else in the space can be expressed in terms of the basis, and if one basis element is
deleted, this spanning property is lost.
 The potential for testing is that, if we can view a program as a vector space, then the basis for
such a space would be a very interesting set of elements to test. If the basis is “OK”, we
could hope that everything that can be expressed in terms of the basis is also “OK”.

3.4.1 McCabe’s Basis Path Method


 Figure 3.6 is taken from [McCabe 82]; it is a directed graph which we might take to be the
program graph (or the DD-Path graph) of some program.
 The original notation for nodes and edges is repeated here. (Notice that this is not a graph
derived from a structured program: nodes B and C are a loop with two exits, and the edge
from B to E is a branch into the IF-THEN statement in nodes D, E, and F. The program does
have a single entry (A) and a single exit (G).)

Figure 3.6 McCabe’s Control Graph


 McCabe based his view of testing on a major result from graph theory, which states that the
cyclomatic number of a strongly connected graph is the number of linearly independent
circuits in the graph. (A circuit is similar to a chain: no internal loops or decisions, but the
initial node is the terminal node. A circuit is a set of 3-connected nodes.)
 We can always create a strongly connected graph by adding an edge from the (every) sink
node to the (every) source node. (Notice that, if the single entry, single exit precept is
violated, we greatly increase the cyclomatic number, because we need to add edges from each
sink node to each source node.) Figure 3.7 shows the result of doing this; it also contains
edge labels that are used in the discussion that follows.
 There is some confusion in the literature about the correct formula for cyclomatic complexity.
Some sources give the formula as V(G) = e - n + p, while others use the formula
V(G) = e - n + 2p; everyone agrees that e is the number of edges, n is the number of nodes,
and p is the number of connected regions.

 The confusion apparently comes from the transformation of an arbitrary directed graph (Figure
3.6) to a strongly connected directed graph obtained by adding one edge from the sink to the
source node (as in Figure 3.7). Adding an edge clearly affects the value computed by the
formula, but it shouldn’t affect the number of circuits.
 Here’s a way to resolve the apparent inconsistency: The number of linearly independent paths
from the source node to the sink node in Figure 3.6 is
V(G) = e - n + 2p = 10 - 7 + 2(1) = 5
and the number of linearly independent circuits in the graph in Figure 3.7 is
V(G) = e - n + p = 11 - 7 + 1 = 5
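
Both formulas can be evaluated directly from an edge list. In the sketch below, the edge list for Figure 3.6 is an assumption reconstructed from the node sequences and edge numbers used for paths p1 to p5 in this section, and `cyclomatic_number` is an illustrative name.

```python
# Edges of Figure 3.6 as (from, to) pairs, listed in edge-label order 1..10
# (reconstructed from the paths in this section; an assumption).
EDGES = [
    ("A", "B"), ("A", "D"), ("C", "B"), ("B", "C"), ("B", "E"),
    ("D", "E"), ("D", "F"), ("E", "F"), ("C", "G"), ("F", "G"),
]


def cyclomatic_number(edges, p=1, strongly_connected=False):
    """V(G) = e - n + p for a strongly connected graph, else e - n + 2p."""
    nodes = {node for edge in edges for node in edge}
    e, n = len(edges), len(nodes)
    return e - n + (p if strongly_connected else 2 * p)


v_paths = cyclomatic_number(EDGES)
# Adding the edge from sink G back to source A gives the graph of Figure 3.7.
v_circuits = cyclomatic_number(EDGES + [("G", "A")], strongly_connected=True)
print(v_paths, v_circuits)
```

The extra edge raises e by one, and the formula drops one p, so the two counts agree.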


Figure 3.7 McCabe’s Derived Strongly Connected Graph

 The cyclomatic complexity of the strongly connected graph in Figure 3.7 is 5, thus there are
five linearly independent circuits. If we now delete the added edge from node G to node A,
these five circuits become five linearly independent paths from node A to node G.
 In small graphs, we can visually identify independent paths. Here we identify paths as
sequences of nodes:
p1: A, B, C, G
p2: A, B, C, B, C, G
p3: A, B, E, F, G
p4: A, D, E, F, G
p5: A, D, F, G
 We can force this beginning to look like a vector space by defining notions of addition and
scalar multiplication: path addition is simply one path followed by another path, and
multiplication corresponds to repetitions of a path.
 McCabe arrives at a vector space of program paths. His illustration of the basis part of this
framework is that the path A, B, C, B, E, F, G is the basis sum p2 + p3 - p1, and the path A,
B, C, B, C, B, C, G is the linear combination 2p2 -p1.
 It is easier to see this addition with an incidence matrix in which rows correspond to
paths, and columns correspond to edges, as in Table 3. The entries in this table are obtained
by following a path and noting which edges are traversed.
 Path p1, for example, traverses edges 1, 4, and 9, while path p2 traverses the following edge
sequence: 1, 4, 3, 4, 9. Since edge 4 is traversed twice by path p2, 2 is the entry for the edge
4 column.
 We can check the independence of paths p1 - p5 by examining the first five rows of this
incidence matrix. Edges 3, 5, 6, and 7 each appear in exactly one of these paths (p2, p3, p4,
and p5, respectively), so paths p2 - p5 must be independent.
 Path p1 is independent of all of these, because any attempt to express p1 in terms of the
others introduces unwanted edges. None can be deleted, and these five paths span the set of
all paths from node A to node G. At this point, you should check the linear combinations of
the two example paths. The addition and multiplication are performed on the column entries.
 McCabe next develops an algorithmic procedure (called the “baseline method”) to determine
a set of basis paths. The method begins with the selection of a “baseline” path, which should
correspond to some “normal case” program execution.
 This can be somewhat arbitrary; McCabe advises choosing a path with as many decision
nodes as possible. Next the baseline path is retraced, and in turn each decision is “flipped”;
that is, when a node of outdegree ≥ 2 is reached, a different edge must be taken.

Table 3 Path/Edge Traversal

Path \ Edges traversed         1  2  3  4  5  6  7  8  9  10
p1: A, B, C, G                 1  0  0  1  0  0  0  0  1  0
p2: A, B, C, B, C, G           1  0  1  2  0  0  0  0  1  0
p3: A, B, E, F, G              1  0  0  0  1  0  0  1  0  1
p4: A, D, E, F, G              0  1  0  0  0  1  0  1  0  1
p5: A, D, F, G                 0  1  0  0  0  0  1  0  0  1
ex1: A, B, C, B, E, F, G       1  0  1  1  1  0  0  1  0  1
ex2: A, B, C, B, C, B, C, G    1  0  2  3  0  0  0  0  1  0
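
The two linear combinations can be checked column by column. This is a small sketch in which the edge numbering is inferred from the rows of Table 3 (an assumption, since the figure is not reproduced) and the helper names `edge_vector` and `combine` are hypothetical.

```python
# Edge labels 1..10 keyed by (from, to), inferred from Table 3.
EDGE_LABEL = {
    ("A", "B"): 1, ("A", "D"): 2, ("C", "B"): 3, ("B", "C"): 4, ("B", "E"): 5,
    ("D", "E"): 6, ("D", "F"): 7, ("E", "F"): 8, ("C", "G"): 9, ("F", "G"): 10,
}


def edge_vector(path):
    """Row of the incidence matrix: how often each edge is traversed."""
    vector = [0] * len(EDGE_LABEL)
    for u, v in zip(path, path[1:]):
        vector[EDGE_LABEL[(u, v)] - 1] += 1
    return vector


def combine(*terms):
    """Column-wise sum of coefficient * vector over (coefficient, vector) terms."""
    return [sum(c * vec[i] for c, vec in terms) for i in range(len(EDGE_LABEL))]


p1, p2, p3 = edge_vector("ABCG"), edge_vector("ABCBCG"), edge_vector("ABEFG")
ex1 = edge_vector("ABCBEFG")
ex2 = edge_vector("ABCBCBCG")

print(ex1 == combine((1, p2), (1, p3), (-1, p1)))   # ex1 = p2 + p3 - p1
print(ex2 == combine((2, p2), (-1, p1)))            # ex2 = 2p2 - p1
```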

 Here we follow McCabe’s example, in which he first postulates the path through nodes A,
B, C, B, E, F, G as the baseline. (This was expressed in terms of paths p1 - p5 earlier.) The first
decision node (outdegree ≥ 2) in this path is node A, so for the next basis path, we traverse
edge 2 instead of edge 1.
 We get the path A, D, E, F, G, where we retrace nodes E, F, G in path 1 to be as minimally
different as possible. For the next path, we can follow the second path, and take the other
decision outcome of node D, which gives us the path A, D, F, G.
 Now only decision nodes B and C have not been flipped; doing so yields the last two basis
paths, A, B, E, F, G and A, B, C, G. Notice that this set of basis paths is distinct from the one
in Table 3: this is not problematic, because there is no requirement that a basis be unique.
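
That the baseline method produced a genuine basis can be confirmed by computing the rank of the corresponding edge vectors. The sketch below uses plain Gaussian elimination over exact fractions; the edge numbering is inferred from Table 3, and the function names are illustrative assumptions rather than part of McCabe's method.

```python
from fractions import Fraction

# Edge labels 1..10 keyed by (from, to), inferred from Table 3.
EDGE_LABEL = {
    ("A", "B"): 1, ("A", "D"): 2, ("C", "B"): 3, ("B", "C"): 4, ("B", "E"): 5,
    ("D", "E"): 6, ("D", "F"): 7, ("E", "F"): 8, ("C", "G"): 9, ("F", "G"): 10,
}


def edge_vector(path):
    """Count how often each labeled edge is traversed by the path."""
    vector = [0] * len(EDGE_LABEL)
    for u, v in zip(path, path[1:]):
        vector[EDGE_LABEL[(u, v)] - 1] += 1
    return vector


def rank(rows):
    """Rank of a matrix via Gaussian elimination with exact arithmetic."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivot_row = 0
    for col in range(len(m[0])):
        candidates = [r for r in range(pivot_row, len(m)) if m[r][col] != 0]
        if not candidates:
            continue
        first = candidates[0]
        m[pivot_row], m[first] = m[first], m[pivot_row]
        for r in range(pivot_row + 1, len(m)):
            factor = m[r][col] / m[pivot_row][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[pivot_row])]
        pivot_row += 1
        if pivot_row == len(m):
            break
    return pivot_row


# The five paths produced by the baseline method in the text.
baseline_basis = ["ABCBEFG", "ADEFG", "ADFG", "ABEFG", "ABCG"]
basis_rank = rank([edge_vector(p) for p in baseline_basis])
# Adding p2 from the earlier basis should not raise the rank.
extended_rank = rank([edge_vector(p) for p in baseline_basis + ["ABCBCG"]])
print(basis_rank, extended_rank)
```

A rank of 5 for the five baseline paths means none is redundant, and an unchanged rank after adding p2 means p2 is already expressible in terms of this second basis.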

3.4.2 Observations on McCabe’s Basis Path Method


 If you had trouble following some of the discussion on basis paths and sums and products of
these, you may have felt a haunting skepticism, something along the lines of “Here’s
another academic oversimplification of a real-world problem”.

 Rightly so, because there are two major soft spots in the McCabe view: one is that testing the
set of basis paths is sufficient (it’s not), and the other has to do with the yoga-like contortions
we went through to make program paths look like a vector space.
 McCabe’s example that the path A, B, C, B, C, B, C, G is the linear combination 2 p2 - p1 is
very unsatisfactory. What does the 2p2 part mean? Execute path p2 twice? (Yes, according to
the math.) Even worse, what does the - p1 part mean? Execute path p1 backwards? Undo
the most recent execution of p1? Don’t do p1 next time? Mathematical sophistries like this
are a real turn-off to practitioners looking for solutions to their very real problems.
 To get a better understanding of these problems, we’ll go back to the triangle program
example. Start with the DD-Path graph of the triangle program in Figure 3.4. We begin with
a baseline path that corresponds to a scalene triangle, say with sides 3, 4, 5.
 This test case will traverse the path p1. Now if we flip the decision at node B, we get path
p2. Continuing the procedure, we flip the decision at node F, which yields the path p3. Now
we continue to flip decision nodes in the baseline path p1; the next node with outdegree = 2 is
node H.
 When we flip node H, we get the path p4. Finally (we know we’re done, because there are
only 5 basis paths), we flip node J to get p5. This procedure yields the following basis paths:


Table 4: Basis Paths in Figure 3.4

Original        p1: A-B-C-E-F-H-J-K-M-N-O-Last    Scalene
Flip p1 at B    p2: A-B-D-E-F-H-J-K-M-N-O-Last    Infeasible
Flip p1 at F    p3: A-B-C-E-F-G-O-Last            Infeasible
Flip p1 at H    p4: A-B-C-E-F-H-I-N-O-Last        Equilateral
Flip p1 at J    p5: A-B-C-E-F-H-J-L-M-N-O-Last    Isosceles

 Time for a reality check: if you follow paths p2 and p3, you find that they are both
infeasible. Path p2 is infeasible, because passing through node D means the sides are
not a triangle, so the outcome of the decision at node F must be node G.
 Similarly, in p3, passing through node C means the sides do form a triangle, so node G
cannot be traversed. Paths p4 and p5 are both feasible and correspond to equilateral and
isosceles triangles, respectively.
 The problem here is that there are several inherent dependencies in the triangle problem.
One is that if three integers constitute sides of a triangle, they must be one of the three
possibilities: equilateral, isosceles, or scalene. A second dependency is that the three
possibilities are mutually exclusive: if one is true, the other two must be false.
 Recall that dependencies in the input data domain caused difficulties for boundary value
testing, and that we resolved these by going to decision table based functional testing, where
we addressed data dependencies in the decision table.
 Here we are dealing with code level dependencies, and these are absolutely incompatible
with the latent assumption that basis paths are independent. McCabe’s procedure
successfully identifies basis paths that are topologically independent, but when these
contradict semantic dependencies, topologically possible paths are seen to be logically
infeasible.
 One solution to this problem is to always require that flipping a decision results in a
semantically feasible path. Another is to reason about logical dependencies. If we think
about this problem we can identify several rules:
If node C is traversed, then we must traverse node H.
If node D is traversed, then we must traverse node G.


 Taken together, these rules, in conjunction with McCabe’s baseline method, will yield the
following feasible basis path set:

p1: A-B-C-E-F-H-J-K-M-N-O-Last    Scalene
p6: A-B-D-E-F-G-O-Last            Not a Triangle
p4: A-B-C-E-F-H-I-N-O-Last        Equilateral
p5: A-B-C-E-F-H-J-L-M-N-O-Last    Isosceles

 The triangle problem is atypical in that there are no loops. The program has only eight
topologically possible paths, and of these, only the four basis paths listed above are feasible.
Thus for this special case, we arrive at the same test cases as we did with special value testing
and output range testing.
 For a more positive observation, basis path coverage guarantees DD-Path coverage: the
process of flipping decisions guarantees that every decision outcome is traversed, which is
the same as DD-Path coverage.
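
The counts of topologically possible and feasible paths can be checked by brute-force enumeration. In this sketch the adjacency list for the DD-Path graph is an assumption, reconstructed from the program listing and the node sequences in Table 4, and the two feasibility rules are the ones stated above; the function names are hypothetical.

```python
# DD-Path graph of the triangle program (reconstructed; an assumption).
DD_GRAPH = {
    "First": ["A"], "A": ["B"], "B": ["C", "D"], "C": ["E"], "D": ["E"],
    "E": ["F"], "F": ["H", "G"], "G": ["O"], "H": ["I", "J"], "I": ["N"],
    "J": ["K", "L"], "K": ["M"], "L": ["M"], "M": ["N"], "N": ["O"],
    "O": ["Last"], "Last": [],
}


def all_paths(graph, node="First"):
    """Enumerate every path from node to a sink (the graph is acyclic)."""
    if not graph[node]:
        return [[node]]
    return [[node] + rest
            for successor in graph[node]
            for rest in all_paths(graph, successor)]


def is_feasible(path):
    """Apply the two dependency rules from the text."""
    if "C" in path and "H" not in path:   # C traversed => must traverse H
        return False
    if "D" in path and "G" not in path:   # D traversed => must traverse G
        return False
    return True


paths = all_paths(DD_GRAPH)
feasible_paths = [p for p in paths if is_feasible(p)]
print(len(paths), "topologically possible,", len(feasible_paths), "feasible")
for p in feasible_paths:
    print("-".join(p))
```

Enumeration like this only works because the graph has no loops; with loops, the number of paths is unbounded and the dependency rules would have to be applied while selecting basis paths instead.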

3.4.3 Essential Complexity


 Part of McCabe’s work on cyclomatic complexity does more to improve programming than
testing. Recall that condensation graphs are a way of simplifying an existing graph; so far our
simplifications have been based on removing either strong components or DD-Paths. Here,
we condense around the structured programming constructs, which are repeated as Figure 3.8.


Figure 3.8 Structured Programming Constructs (sequence, if-then, if-then-else, case, pre-test
loop, post-test loop)


 The basic idea is to look for the graph of one of the structured programming constructs,
collapse it into a single node, and repeat until no more structured programming constructs
can be found.
 This process is followed in Figures 3.9 and 3.10, which start with the DD-Path graph of the
triangle program. The IF-THEN-ELSE construct involving nodes A, B, C, and D is
condensed into node a, and then the three IF-THEN constructs are condensed into nodes b,
c, and d.
 The remaining IF-THEN-ELSE (which corresponds to the IF IsATriangle statement) is
condensed into node e, resulting in a condensed graph with cyclomatic complexity V(G) = 1.
 In general, when a program is well structured (i.e., is composed solely of the structured
programming constructs), it can always be reduced to a graph with one path.
 The graph in Figure 3.6 cannot be reduced in this way (try it!). The loop with nodes B and C
cannot be condensed because of the edge from B to E. Similarly, nodes D, E, and F look like
an IF-THEN construct, but the edge from B to E violates the structure.


 McCabe went on to find elemental “unstructures” that violate the precepts of structured
programming [McCabe 76]. These are shown in Figure 3.11.

Figure 3.9 Condensing with Respect to the Structured Programming Constructs


Figure 3.10 Condensing with Respect to the Structured Programming Constructs



 Each of these “unstructures” contains three distinct paths, as opposed to the two paths present
in the corresponding structured programming constructs, so one conclusion is that such
violations increase cyclomatic complexity.
 The pièce de résistance of McCabe’s analysis is that these unstructures cannot occur by
themselves: if there is one in a program, there must be at least one more, so a program
cannot be just slightly unstructured. Since these increase cyclomatic complexity, the
minimum number of test cases is thereby increased.
 The bottom line for testers is this: programs with high cyclomatic complexity require more
testing. Of the organizations that use the cyclomatic complexity metric, most set some
guideline for maximum acceptable complexity; V(G) = 10 is a common choice.
 What happens if a unit has a higher complexity? Two possibilities: either simplify the unit or
plan to do more testing. If the unit is well structured, its essential complexity is 1, so it can
be simplified easily. If the unit has an essential complexity that exceeds the guidelines,
often the best choice is to eliminate the unstructures.

Figure 3.11 Violations of Structured Programming (branching into a loop, branching into a
decision, branching out of a loop, branching out of a decision)


3.5 Guidelines and Observations
 In our study of functional testing, we observed that gaps and redundancies can both exist, and
at the same time, cannot be recognized. The problem was that functional testing removes
us “too far” from the code.
 The path testing approaches to structural testing represent the case where the pendulum
has swung too far the other way: moving from code to directed graph representations and
program path formulations obscures important information that is present in the code, in
particular the distinction between feasible and infeasible paths.
 McCabe was partly right when he observed: “It is important to understand that these are
purely criteria that measure the quality of testing, and not a procedure to identify test cases”
[McCabe 82]. He was referring to the DD-Path coverage metric (which is equivalent to the
predicate outcome metric) and the cyclomatic complexity metric, which requires that at least
the cyclomatic number of distinct program paths be traversed.
 Basis path testing therefore gives us a lower bound on how much testing is necessary. Path
based testing also provides us with a set of metrics that act as cross checks on functional
testing. We can use these metrics to resolve the gaps and redundancies question.
 When we find that the same program path is traversed by several functional test cases, we
suspect that this redundancy is not revealing new faults. When we fail to attain DD-Path
coverage, we know that there are gaps in the functional test cases.
 As an example, suppose we have a program that contains extensive error handling, and we test
it with boundary value test cases (min, min+, nom, max-, and max). Because these are all
permissible values, DD-Paths corresponding to the error handling code will not be
traversed. If we add test cases derived from robustness testing or traditional equivalence
class testing, the DD-Path coverage will improve.
 The coverage metrics in Table 2 can operate in two ways: as a blanket mandated standard
(e.g., all units shall be tested to attain full DD-Path coverage) or as a mechanism to
selectively test portions of code more rigorously than others.
 We might choose multiple condition coverage for modules with complex logic, while
those with extensive iteration might be tested in terms of the loop coverage techniques. This
is probably the best view of structural testing: use the properties of the source code to identify


appropriate coverage metrics, and then use these as a cross check on functional test cases.
 When the desired coverage is not attained, follow interesting paths to identify additional
(special value) test cases. This is a good place to revisit the Venn diagram view of testing that
we used in Chapter 1. Figure 3.12 shows the relationship between specified behaviors (set S),
programmed behaviors (set P), and topologically feasible paths in a program (set T).
 As usual, region 1 is the most desirable: it contains specified behaviors that are
implemented by feasible paths. By definition, every feasible path is topologically possible,
so the shaded portion (regions 2 and 6) of the set P must be empty.
 Region 3 contains feasible paths that correspond to unspecified behaviors. Such extra
functionality needs to be examined: if useful, the specification should be changed, otherwise
these feasible paths should be removed. Regions 4 and 7 contain the infeasible paths; of
these, region 4 is problematic.
 Region 4 refers to specified behaviors that have almost been implemented: topologically
possible yet infeasible program paths. This region very likely corresponds to coding errors,
where changes are needed to make the paths feasible.
 Region 5 still corresponds to specified behaviors that have not been implemented. Path
based testing will never recognize this region. Finally, region 7 is a curiosity: unspecified,
infeasible, yet topologically possible paths.
 There is no problem here, because infeasible paths cannot execute. If the corresponding
code is incorrectly changed by a maintenance action (maybe by a programmer who doesn’t
fully understand the code), these could become feasible paths, as in region 3.
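The region descriptions above amount to a small decision table, sketched here in Python. The function name and its three boolean parameters are our own illustrative choices. Region 8 is assumed to lie outside all three sets, which the text does not discuss.

```python
def region(specified: bool, feasible: bool, topological: bool) -> int:
    """Return the region number of Figure 3.12 for one behavior or path.

    specified   -- the behavior is in the specification (set S)
    feasible    -- the behavior is programmed, i.e. a feasible path (set P)
    topological -- the path is topologically possible (set T)
    """
    if feasible and not topological:
        # Every feasible path is topologically possible,
        # so regions 2 and 6 must be empty.
        raise ValueError("regions 2 and 6 must be empty")
    if specified and feasible:
        return 1  # specified and implemented by a feasible path
    if feasible:
        return 3  # extra, unspecified functionality
    if specified and topological:
        return 4  # "almost implemented": likely a coding error
    if specified:
        return 5  # specified, not implemented, invisible to path testing
    if topological:
        return 7  # unspecified and infeasible: a curiosity
    return 8  # assumption: outside all three sets
```

For example, `region(True, False, True)` returns 4, the problematic case of a specified behavior whose path exists in the graph but can never execute.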

[Figure: Venn diagram of three overlapping sets, Specification (S), Programmed Behaviors (P), and Topologically possible paths (T), with regions numbered 1 through 8.]

Figure 3.12 Feasible and Topologically Possible Paths


DATA FLOW TESTING


3.6 Data Flow Testing
 Data flow testing refers to forms of structural testing that focus on the points at which
variables receive values and the points at which these values are used (or referenced).
 We will look at two mainline forms of data flow testing: one provides a set of basic
definitions and a unifying structure of test coverage metrics, while the second is based on a
concept called a “program slice”. Both of these formalize intuitive behaviors (and analyses)
of testers, and although they both start with a program graph, both move back in the direction
of functional testing.
 Most programs deliver functionality in terms of data. Variables that represent data somehow
receive values, and these values are used to compute values for other variables. Since the
early 1960s, programmers have analyzed source code in terms of the points (statements)
at which variables receive values and points at which these values are used. Many times,
their analyses were based on concordances that list statement numbers in which variable
names occur.
 Early “data flow” analyses often centered on a set of faults that are now known as
define/reference anomalies:
• a variable that is defined but never used (referenced)
• a variable that is used but never defined
• a variable that is defined twice before it is used
 Each of these anomalies can be recognized from the concordance of a program. Since
the concordance information is compiler generated, these anomalies can be discovered by
what is known as “static analysis”: finding faults in source code without executing it.
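As a rough illustration of this kind of static analysis, the sketch below scans a concordance-like list of events. The event format `(line, variable, kind)` and the sample data are assumptions made for this example. The scan follows textual order only, so it ignores branches and loops.

```python
def find_anomalies(events):
    """Scan (line, variable, kind) events in statement order.

    kind is "def" or "use". Returns a sorted list of (variable, anomaly) pairs.
    """
    defined = {var for _, var, kind in events if kind == "def"}
    used = {var for _, var, kind in events if kind == "use"}

    anomalies = {(v, "defined but never used") for v in defined - used}
    anomalies |= {(v, "used but never defined") for v in used - defined}

    unused_def = set()  # variables whose latest definition has not been used
    for _line, var, kind in events:
        if kind == "def":
            if var in unused_def:
                anomalies.add((var, "defined twice before use"))
            unused_def.add(var)
        else:
            unused_def.discard(var)
    return sorted(anomalies)


# Hypothetical concordance for a five-line fragment.
events = [
    (1, "a", "def"),
    (2, "b", "def"),
    (3, "b", "def"),
    (4, "c", "use"),
    (5, "b", "use"),
]
report = find_anomalies(events)
```

For a statement such as `x = x + 1`, the use event should be listed before the definition event, because the right-hand side is evaluated first.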

3.6.1 Define/Use Testing


 The following definitions refer to a program P that has a program graph G(P), and a set of
program variables V. The program graph G(P) is constructed with statement fragments as
nodes, and edges that represent node sequences.
 G(P) has a single entry node, and a single exit node. We also disallow edges from a node to
itself. Paths, subpaths, and cycles are as defined for path testing; the set of all paths in P is
PATHS(P).


Definition
 Node n ∈ G(P) is a defining node of the variable v ∈ V, written as DEF(v,n), iff the value of
the variable v is defined at the statement fragment corresponding to node n.
 Input statements, assignment statements, loop control statements, and procedure calls are all
examples of statements that are defining nodes. When the code corresponding to such
statements executes, the contents of the memory location(s) associated with the variables are
changed.

Definition
 Node n ∈ G(P) is a usage node of the variable v ∈ V, written as USE(v, n), iff the value
of the variable v is used at the statement fragment corresponding to node n.
 Output statements, assignment statements, conditional statements, loop control statements, and
procedure calls are all examples of statements that are usage nodes. When the code
corresponding to such statements executes, the contents of the memory location(s) associated
with the variables remain unchanged.

Definition
 A usage node USE(v, n) is a predicate use (denoted as P-use) iff the statement n is a
predicate statement; otherwise USE(v, n) is a computation use , (denoted C-use).
 The nodes corresponding to predicate uses always have an outdegree ≥ 2, and nodes
corresponding to computation uses always have outdegree ≤ 1.

Definition
 A definition-use (sub)path with respect to a variable v (denoted du-path) is a
(sub)path in PATHS(P) such that, for some v ∈ V, there are define and usage nodes DEF(v,
m) and USE(v, n) such that m and n are the initial and final nodes of the (sub)path.

Definition
 A definition-clear (sub)path with respect to a variable v (denoted dc-path) is a
definition-use (sub)path in PATHS(P) with initial and final nodes DEF (v, m) and USE (v, n)
such that no other node in the (sub)path is a defining node of v.


 Testers should notice how these definitions capture the essence of computing with stored
data values. Du-paths and dc-paths describe the flow of data across source statements from
points at which the values are defined to points at which the values are used. Du-paths that
are not definition-clear are potential trouble spots.
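The two path definitions translate almost directly into predicates. The sketch below assumes that a path is given as a list of node numbers already known to be in PATHS(P), and that the defining and usage nodes of one variable are supplied as sets. The function names and the sample variable x are ours.

```python
def is_du_path(path, def_nodes, use_nodes):
    """True if path starts at a defining node and ends at a usage node of v."""
    return len(path) >= 2 and path[0] in def_nodes and path[-1] in use_nodes


def is_definition_clear(path, def_nodes, use_nodes):
    """True if path is a du-path with no defining node of v strictly inside it."""
    return is_du_path(path, def_nodes, use_nodes) and not any(
        node in def_nodes for node in path[1:-1]
    )


# Hypothetical variable x: defined at nodes 1 and 4, used at nodes 3 and 6.
X_DEFS = {1, 4}
X_USES = {3, 6}

short_path = [1, 2, 3]           # du-path, and definition-clear
long_path = [1, 2, 3, 4, 5, 6]   # du-path, but node 4 redefines x on the way
```

Here `long_path` is a du-path that is not definition-clear, which marks it as a potential trouble spot in the sense described above.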

3.6.2 Example
 We will use the Commission Problem and its program graph to illustrate these definitions.
The numbered source code is given next, followed by a program graph constructed
according to the procedure used for path testing.
 This program computes the commission on the sales of one salesperson. The input lines for
that person are read to compute the total numbers of locks, stocks, and barrels sold.
 The While-loop is a classical sentinel controlled loop in which a value of -1 for locks
signifies the end of that person’s data. The totals are accumulated as the data lines are read in
the While-loop.
 After printing this preliminary information, the sales value is computed, using the constant
item prices defined at the beginning of the program. The sales value is then used to compute
the commission in the conditional portion of the program.

1 Program Commission (Input, Output)
2 Dim locks, stocks, barrels As Integer
3 Dim lockprice, stockprice, barrelprice As Real
4 Dim totallocks, totalstocks, totalbarrels As Integer
5 Dim locksales, stocksales, barrelsales As Real
6 Dim sales, commission As Real
7 lockprice = 45.0
8 stockprice = 30.0
9 barrelprice = 25.0
10 totallocks = 0
11 totalstocks = 0
12 totalbarrels = 0
13 Input (locks)
14 While NOT (locks = -1) 'loop condition uses -1 to indicate end of data
15     Input (stocks, barrels)
16     totallocks = totallocks + locks
17     totalstocks = totalstocks + stocks
18     totalbarrels = totalbarrels + barrels
19     Input (locks)
20 Endwhile
21 Output ("Locks sold: ", totallocks)
22 Output ("Stocks sold: ", totalstocks)
23 Output ("Barrels sold: ", totalbarrels)
24 locksales = lockprice * totallocks
25 stocksales = stockprice * totalstocks
26 barrelsales = barrelprice * totalbarrels
27 sales = locksales + stocksales + barrelsales
28 Output ("Total sales: ", sales)
29 If (sales > 1800.0)
30 Then
31     commission = 0.10 * 1000.0
32     commission = commission + 0.15 * 800.0
33     commission = commission + 0.20 * (sales - 1800.0)
34 Else If (sales > 1000.0)
35     Then
36         commission = 0.10 * 1000.0
37         commission = commission + 0.15 * (sales - 1000.0)
38     Else
39         commission = 0.10 * sales
40     Endif
41 Endif
42 Output ("Commission is $", commission)
43 End Commission
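For readers who want to execute the data-gathering part of the listing, here is a hedged Python rendering of statements 7 through 27. The function name `total_sales` and the use of a list in place of the input device are assumptions of this sketch, not part of the original program.

```python
LOCK_PRICE, STOCK_PRICE, BARREL_PRICE = 45.0, 30.0, 25.0


def total_sales(values):
    """Accumulate totals from an input stream and compute the sales value.

    values mimics the input device: locks, stocks, barrels, locks, ..., -1.
    The stream must end with the sentinel -1 in a locks position.
    """
    stream = iter(values)
    total_locks = total_stocks = total_barrels = 0

    locks = next(stream)              # priming read (statement 13)
    while locks != -1:                # sentinel test (statement 14)
        stocks = next(stream)         # statement 15 reads two values
        barrels = next(stream)
        total_locks += locks          # statement 16
        total_stocks += stocks        # statement 17
        total_barrels += barrels      # statement 18
        locks = next(stream)          # statement 19

    sales = (LOCK_PRICE * total_locks
             + STOCK_PRICE * total_stocks
             + BARREL_PRICE * total_barrels)   # statements 24 to 27
    return total_locks, total_stocks, total_barrels, sales
```

Calling `total_sales([1, 1, 1, 2, 3, 4, -1])` processes two data lines and then stops at the sentinel. An input of just `[-1]` skips the loop body entirely, which is the kind of path the du-path analysis below distinguishes.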

 The DD-Paths in this program are given in Table 1, and the DD-Path graph is shown in Figure
3.2. Tables 2 and 3 list the define and usage nodes for five variables in the commission
problem. We use this information in conjunction with the program graph in Figure 3.1 to
identify various definition-use and definition-clear paths.
Table 1 DD-Paths in Figure 3.1
DD-Path   Nodes
A         7,8,9,10,11,12,13
B         14
C         15,16,17,18,19,20
D         21,22,23,24,25,26,27,28
E         29
F         30,31,32,33
G         34
H         35,36,37
I         38,39
J         40
K         41,42,43

 It’s a judgment call whether or not non-executable statements such as constant (CONST) and
variable (VAR) declaration statements should be considered as defining nodes. Technically,
these only define memory space (the CONST declaration creates a compiler-produced
initial value). Such nodes aren’t very interesting when we follow what happens along their
du-paths, but if there is something wrong, it’s usually helpful to include them. Take your pick.
We will refer to the various paths as sequences of node numbers.

3.1.2 du-paths for Stocks


 First, let’s look at the du-paths for the variable stocks. We have DEF(stocks, 15) and
USE(stocks, 17), so the path <15, 16, 17> is a du-path wrt (with respect to) stocks. Since there
are no other defining nodes for stocks, this path is also definition-clear.


[Figure: program graph of the commission program, with nodes 7 through 43.]

Figure 3.1 Program Graph of the Commission Program


3.1.3 du-paths for Locks
 Two defining and two usage nodes make the locks variable more interesting: we have
DEF(locks, 13), DEF(locks, 19), USE(locks, 14), and USE(locks, 16). These yield four du-
paths:


p1 = <13, 14>
p2 = <13, 14,15,16>
p3 = <19, 20, 14>
p4 = <19,20, 14,15, 16>

 Du-paths p1 and p2 refer to the priming value of locks which is read at node 13: locks
has a predicate use in the While statement (node 14), and if the condition is true (as in
path p2), a computation use at statement 16. The other two du-paths start near the end of the
While loop and occur when the loop repeats.
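The four du-paths for locks can also be found mechanically. The sketch below enumerates loop-free paths from each defining node to each usage node. The edge list is transcribed by hand from statements 13 through 21 of the listing and is an assumption of this example, as is the helper name `simple_paths`.

```python
# Edges of the program graph around the While-loop (statements 13 to 21).
EDGES = {
    13: [14],
    14: [15, 21],   # loop body, or exit to the first Output statement
    15: [16],
    16: [17],
    17: [18],
    18: [19],
    19: [20],
    20: [14],       # Endwhile returns to the loop test
    21: [],
}


def simple_paths(graph, start, goal, path=None):
    """Yield every path from start to goal that visits no node twice."""
    path = (path or []) + [start]
    if start == goal and len(path) > 1:
        yield path
        return
    for nxt in graph.get(start, []):
        if nxt not in path:
            yield from simple_paths(graph, nxt, goal, path)


LOCKS_DEFS = (13, 19)
LOCKS_USES = (14, 16)

du_paths = [
    p
    for d in LOCKS_DEFS
    for u in LOCKS_USES
    for p in simple_paths(EDGES, d, u)
]
```

The result lists p1 through p4 in the same order as the text. Restricting the search to paths without repeated nodes sidesteps the loop repetition issue; it is a simplification, not part of the du-path definition.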

[Figure: DD-Path graph with nodes A through K, as listed in Table 1.]

Figure 3.2 DD-Path Graph of the Commission Program

3.1.4 du-paths for totallocks


 The du-paths for totallocks will lead us to typical test cases for computations. With two
defining nodes (DEF(totallocks, 10) and DEF(totallocks, 16)) and three usage nodes


(USE(totallocks, 16), USE(totallocks, 21), USE(totallocks, 24)), we might expect six du-
paths.
 Path p5 = <10,11,12,13,14,15,16> is a du-path in which the initial value (0) has a
computation use. This path is definition-clear. The next path is problematic:
p6 = <10,11,12,13,14,15,16,17,18,19,20,14,21>
 We have ignored the possible repetition of the While-loop. We could highlight this by
noting that the subpath <16,17,18,19,20,14,15> might be traversed several times. Ignoring
this for now, we still have a du-path that fails to be definition-clear. If there is a problem
with the value of totallocks at node 21 (the Output statement), we should look at the
intervening DEF(totallocks, 16) node.
 The next path contains p6; we can show this by using a path name in place of its
corresponding node sequence:
p7 = <10,11,12,13,14,15,16,17,18,19,20,14,21, 22, 23, 24>
p7 = < p6, 22, 23, 24>
Du-path p7 is not definition-clear because it includes node 16.
 Subpaths that begin with node 16 (an assignment statement) are interesting. The first, <16,
16>, seems degenerate. If we “expanded” it into machine code, we would be able to separate
the define and usage portions. We will disallow these as du-paths.
 Technically, the usage on the right-hand side of the assignment refers to a value defined at
node 10, (see path p5). The remaining two du-paths are both subpaths of p7:
p8 = <16,17,18,19,20,14,21>
p9 = <16,17,18,19,20,14,21,22,23,24>
Both of these are definition-clear, and both have the loop iteration problem we discussed
before.
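The containment and definition-clear claims about p6 through p9 are easy to check by machine. In this sketch the helper names are ours, and node 18 is used as the third node of p8 and p9, since statement 18 follows statement 17 in the listing.

```python
def contains_subpath(path, sub):
    """True if sub occurs in path as a contiguous run of nodes."""
    n = len(sub)
    return any(path[i:i + n] == sub for i in range(len(path) - n + 1))


def interior_defs(path, def_nodes):
    """Defining nodes strictly between the first and last node of path."""
    return [node for node in path[1:-1] if node in def_nodes]


TOTALLOCKS_DEFS = {10, 16}

p6 = [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 14, 21]
p7 = p6 + [22, 23, 24]
p8 = [16, 17, 18, 19, 20, 14, 21]
p9 = p8 + [22, 23, 24]
```

An empty result from `interior_defs` means the du-path is definition-clear. For p6 and p7 the result is `[16]`, the intervening definition the text tells us to inspect.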

3.1.5 du-paths for sales


 Since there is only one defining node for sales, all the du-paths wrt sales must be definition-
clear. They are interesting because they illustrate predicate and computation uses. The first
three du-paths are easy:
p10 = <27,28>
p11 = <27,28,29>


p12 = <27,28,29,30,31,32,33>

 Notice that p12 is a definition-clear path with three usage nodes; it also contains paths p10 and
p11. If we were testing with p12, we know we would also have covered the other two
paths.
 The IF, ELSE IF logic in statements 29 through 40 highlights an ambiguity in the original
research. There are two choices for du-paths that begin with path p11: the static choice is the
path <27,28,29,30,31,32,33>, the dynamic choice is the path <27,28,29,34>. Here we will
use the dynamic view, so the remaining du-paths for sales are
p13 = <27,28,29,34>
p14 = <27,28,29,34,35,36,37>
p15 = <27,28,29,34,38,39>

Note that the dynamic view is very compatible with the kind of thinking we used for DD-
Paths.

3.1.6 du-paths for commission


 If you have followed this discussion carefully, you are probably dreading the stuff on du-
paths wrt commission. You’re right -- it’s time for a change of pace. In statements 29
through 41, the calculation of commission is controlled by ranges of the variable sales.
 Statements 31 to 33 build up the value of commission by using the memory location to
hold intermediate values.
 This is a common programming practice, and it is desirable because it shows how the
final value is computed. (We could replace these lines with the statement “commission :=
220 + 0.20* (sales -1800)”, where 220 is the value of 0.10*1000 + 0.15*800, but this
would be hard for a maintainer to understand.)
 The “built-up” version uses intermediate values, and these will appear as define and usage
nodes in the du-path analysis. Since we decided to disallow du-paths from assignment
statements like 31 and 32, we’ll just consider du-paths that begin with the three “real”
defining nodes: DEF(commission, 33), DEF(commission, 37), and DEF(commission, 39).
There is only one usage node, USE(commission, 42).
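The arithmetic behind the "built-up" and single-statement versions can be checked with a short sketch. The two function names are hypothetical. Both functions cover only the top bracket (sales above 1800.0), which is the case handled by statements 31 to 33.

```python
import math


def commission_built_up(sales):
    """Top bracket (sales > 1800.0), computed as in statements 31 to 33."""
    commission = 0.10 * 1000.0                          # first 1000 at 10%
    commission = commission + 0.15 * 800.0              # next 800 at 15%
    commission = commission + 0.20 * (sales - 1800.0)   # remainder at 20%
    return commission


def commission_closed_form(sales):
    """The single-statement alternative mentioned in the text."""
    return 220.0 + 0.20 * (sales - 1800.0)


# The two forms agree up to floating-point rounding.
agree = all(
    math.isclose(commission_built_up(s), commission_closed_form(s))
    for s in (1800.5, 2000.0, 5000.0, 12345.0)
)
```

For sales of 2000.0, both forms give 100 + 120 + 40 = 260. The built-up version introduces the intermediate definitions at nodes 31 and 32 that appear in the du-path analysis; the closed form would leave a single defining node.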


3.6.3 Du-path Test Coverage Metrics


 The whole point of analyzing a program as in the previous section is to define a set of test
coverage metrics known as the Rapps-Weyuker data flow metrics [Rapps 85].
 The first three of these are equivalent to three of E. F. Miller’s metrics: All-Paths, All-Edges,
and All-Nodes. The others presume that define and usage nodes have been identified for all
program variables, and that du-paths have been identified with respect to each variable.
 In the following definitions, T is a set of (sub)paths in the program graph G(P) of a program P,
with the set V of variables.

Definition
The set T satisfies the All-Defs criterion for the program P iff for every variable v ∈ V, T
contains definition-clear (sub)paths from every defining node of v to a use of v.

Definition
The set T satisfies the All-Uses criterion for the program P iff for every variable v ∈ V, T
contains definition-clear (sub)paths from every defining node of v to every use of v, and to the
successor node of each USE(v,n).

Definition
The set T satisfies the All-P-Uses/Some C-Uses criterion for the program P iff for every
variable v ∈ V, T contains definition-clear (sub)paths from every defining node of v to every
predicate use of v, and if a definition of v has no P-uses, there is a definition-clear path to at
least one computation use.

Definition
The set T satisfies the All-C-Uses/Some P-Uses criterion for the program P iff for every
variable v ∈ V, T contains definition-clear (sub)paths from every defining node of v to every
computation use of v, and if a definition of v has no C-uses, there is a definition-clear path
to at least one predicate use.


Definition
The set T satisfies the All-DU-paths criterion for the program P iff for every variable v ∈ V,
T contains definition-clear (sub)paths from every defining node of v to every use of v,
and to the successor node of each USE(v,n), and these paths are either single loop
traversals or cycle free.
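To make the difference between All-Defs and All-Uses concrete, the sketch below checks a candidate set T for a single variable. It is deliberately simplified. It ignores the "successor node of each USE(v,n)" clause, and it expects the caller to supply the (define, use) pairs that must be covered. All function names are ours. The sample data are the du-paths for locks from the commission example.

```python
def dc_pairs(paths, def_nodes, use_nodes):
    """(def, use) pairs joined by a definition-clear path in T."""
    pairs = set()
    for p in paths:
        starts_at_def = p[0] in def_nodes
        ends_at_use = p[-1] in use_nodes
        clear = not any(n in def_nodes for n in p[1:-1])
        if starts_at_def and ends_at_use and clear:
            pairs.add((p[0], p[-1]))
    return pairs


def satisfies_all_defs(paths, def_nodes, use_nodes):
    """Every defining node starts at least one definition-clear path in T."""
    starts = {d for d, _ in dc_pairs(paths, def_nodes, use_nodes)}
    return def_nodes <= starts


def satisfies_all_uses(paths, def_nodes, use_nodes, required_pairs):
    """Every required (def, use) pair is joined by a definition-clear path in T."""
    return required_pairs <= dc_pairs(paths, def_nodes, use_nodes)


DEFS = {13, 19}
USES = {14, 16}
REQUIRED = {(13, 14), (13, 16), (19, 14), (19, 16)}

p1, p2 = [13, 14], [13, 14, 15, 16]
p3, p4 = [19, 20, 14], [19, 20, 14, 15, 16]

T_small = [p1, p3]
T_full = [p1, p2, p3, p4]
```

With these data, `T_small` satisfies All-Defs but not All-Uses, while `T_full` satisfies both. This is consistent with All-Uses being the stronger criterion.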

 These test coverage metrics have several set-theory based relationships, which are referred
to as “subsumption” in [Rapps 85]. When one test coverage metric subsumes another, a set
of test cases that attains coverage in terms of the first metric necessarily attains coverage
with respect to the subsumed metric. These relationships are shown in Figure 3.3.

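All-Paths subsumes All-DU-Paths
All-DU-Paths subsumes All-Uses
All-Uses subsumes All-C-Uses/Some-P-Uses and All-P-Uses/Some-C-Uses
All-C-Uses/Some-P-Uses subsumes All-Defs
All-P-Uses/Some-C-Uses subsumes All-Defs and All-P-Uses
All-P-Uses subsumes All-Edges
All-Edges subsumes All-Nodes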

Figure 3.3 Rapps/Weyuker Hierarchy of Data Flow Coverage Metrics
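Subsumption is transitive, so the hierarchy can be treated as a directed graph and queried. In the sketch below, the edge list is our reading of the hierarchy published in [Rapps 85]. The flattened figure does not preserve the arrows, so treat the edges as an assumption of this example.

```python
# Direct subsumption edges: each metric maps to the metrics it subsumes.
SUBSUMES = {
    "All-Paths": ["All-DU-Paths"],
    "All-DU-Paths": ["All-Uses"],
    "All-Uses": ["All-C-Uses/Some-P-Uses", "All-P-Uses/Some-C-Uses"],
    "All-C-Uses/Some-P-Uses": ["All-Defs"],
    "All-P-Uses/Some-C-Uses": ["All-Defs", "All-P-Uses"],
    "All-P-Uses": ["All-Edges"],
    "All-Edges": ["All-Nodes"],
    "All-Defs": [],
    "All-Nodes": [],
}


def subsumed_by(metric):
    """All metrics that the given metric subsumes, directly or transitively."""
    seen, stack = set(), list(SUBSUMES[metric])
    while stack:
        m = stack.pop()
        if m not in seen:
            seen.add(m)
            stack.extend(SUBSUMES[m])
    return seen
```

For instance, a test set that attains All-Uses coverage also attains All-Edges coverage, whereas All-Defs coverage alone says nothing about All-Edges.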

3.7 Slice-Based Testing


 Program slices have surfaced and submerged in software engineering literature since the
early 1980s. They were originally proposed in [Weiser 85], used as an approach to software
maintenance in [Gallagher 91], and most recently used to quantify functional cohesion in
[Bieman 94]. Part of this versatility is due to the natural, intuitively clear intent of the


program slice concept.


 Informally, a program slice is a set of program statements that contribute to, or affect a value
for a variable at some point in the program. This notion of slice corresponds to other
disciplines as well. We might study history in terms of slices: US history, European history,
Russian history, Far East history, Roman history, and so on. The way such historical slices
interact turns out to be very analogous to the way program slices interact.
 We’ll start by growing our working definition of a program slice. We continue with the
notation we used for define-use paths: a program P that has a program graph G(P), and a
set of program variables V. The first try refines the definition in [Gallagher 91] to allow
nodes in G(P) to refer to statement fragments.

Definition
Given a program P, and a set V of variables in P, a slice on the variable set V at
statement n, written S(V,n), is the set of all statements in P that contribute to the values of
variables in V.
Listing elements of a slice S(V,n) will be cumbersome, because the elements are
program statement fragments. Since it is much simpler to list fragment numbers in G(P), we
make the following trivial change (it keeps the set theory purists happy):

Definition
Given a program P, and a program graph G(P) in which statements and statement
fragments are numbered, and a set V of variables in P, the slice on the variable set V at
statement fragment n, written S(V,n), is the set of node numbers of all statement fragments in
P prior to n that contribute to the values of variables in V at statement fragment n.

 The idea of slices is to separate a program into components that have some useful meaning.
First, we need to explain two parts of the definition. Here we mean “prior to” in the dynamic
sense, so a slice captures the execution time behavior of a program with respect to the
variable(s) in the slice.
 Eventually, we will develop a lattice (a directed, acyclic graph) of slices, in which nodes are
slices, and edges correspond to the subset relationship.


 The “contribute” part is more complex. In a sense, declarative statements (such as
CONST and TYPE) have an effect on the value of a variable. One resolution might be to
simply exclude all non-executable statements.
 The notion of contribution is partially clarified by the predicate (P-use) and computation
(C-use) usage distinction of [Rapps 85], but we need to refine these forms of variable usage.
Specifically, the USE relationship pertains to five forms of usage:

P-use   used in a predicate (decision)
C-use   used in computation
O-use   used for output
L-use   used for location (pointers, subscripts)
I-use   used for iteration (internal counters, loop indices)

While we’re at it, we identify two forms of definition nodes:

I-def   defined by input
A-def   defined by assignment
 For now, presume that the slice S(V, n) is a slice on one variable, that is, the set V consists
of a single variable, v. If statement fragment n is a defining node for v, then n is included in
the slice.
 If statement fragment n is a usage node for v, then n is not included in the slice. P-uses and C-
uses of other variables (not the v in the slice set V) are included to the extent that their
execution affects the value of the variable v.
 As a guideline, if the value of v is the same whether a statement fragment is included or
excluded, exclude the statement fragment. L-use and I-use variables are typically
invisible outside their modules, but this hardly precludes the problems such variables often
create.
 Another judgment call: here (with some peril) we choose to exclude these from the
intent of “contribute”. Thus O-use, L-use, and I-use nodes are excluded from slices.

Example


 The commission problem is used in this book because it contains interesting data flow
properties, and these are not present in the Triangle problem (or in NextDate). Follow these
examples while looking at the source code for the commission problem that we analyzed
in terms of define-use paths.
 Slices on the locks variable show why it is potentially fault-prone. It has a P-use at node 14 and
a C-use at node 16, and has two definitions, the I-defs at nodes 13 and 19.

S1: S(locks, 13) = {13}
S2: S(locks, 14) = {13,14,19,20}
S3: S(locks, 16) = {13,14,19,20}
S4: S(locks, 19) = {19}
 The slices for stocks and barrels are boring. Both are short, definition-clear paths contained
entirely within a loop, so they are not affected by iterations of the loop. (Think of the loop
body as a DD-Path.)

S5: S(stocks, 15) = {13,14,15,19,20}
S6: S(stocks, 17) = {13,14,15,19,20}
S7: S(barrels, 15) = {13,14,15,19,20}
S8: S(barrels, 18) = {13,14,15,19,20}
 The next three slices illustrate how repetition appears in slices. Node 10 is an A-def for
totallocks, and node 16 contains both an A-def and a C-use. The remaining nodes in S10
(13,14,19 and 20) pertain to the While-loop controlled by locks. Slices S10 and S11 are
equal because node 21 is an O-use of totallocks; a slice at node 24, a C-use of totallocks,
would be equal as well.

S9: S(totallocks, 10) = {10}
S10: S(totallocks, 16) = {10,13,14,16,19,20}
S11: S(totallocks, 21) = {10,13,14,16,19,20}
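One way to picture how such slices arise is as the transitive closure of a "directly depends on" relation. The sketch below is an illustration only. The dependence table is written by hand for a few nodes, under the assumptions noted in its comments, and is not the output of a slicing tool. The starting node n is always included in the result, which matches the slices listed so far.

```python
# Hand-written direct dependences for a few nodes of the commission program.
# Assumptions of this sketch: input statements depend on nothing, the
# While test at 14 depends on both reads of locks and on the Endwhile
# at 20, and statement 16 depends on its operands and on the loop test.
DEPENDS_ON = {
    10: [],
    13: [],
    14: [13, 19, 20],
    16: [10, 13, 14, 19],
    19: [],
    20: [14],
}


def backward_slice(n, depends_on=DEPENDS_ON):
    """Node n plus every node it depends on, directly or transitively."""
    result, stack = {n}, list(depends_on[n])
    while stack:
        node = stack.pop()
        if node not in result:
            result.add(node)
            stack.extend(depends_on[node])
    return result
```

Under these assumptions, `backward_slice(16)` reproduces S10 and `backward_slice(14)` reproduces S2. Changing the dependence table changes the slices, which is why the judgment calls about what "contributes" matter.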
 The slices on totalstocks and totalbarrels are quite similar. They are initialized by A-defs at
nodes 11 and 12, and then are redefined by A-defs at nodes 17 and 18. Again, the remaining
nodes (13,14,19 and 20) pertain to the While-loop controlled by locks.

S12: S(totalstocks, 11) = {11}
S13: S(totalstocks, 17) = {11,13,14,15,17,19,20}
S14: S(totalstocks, 22) = {11,13,14,15,17,19,20}
S15: S(totalbarrels, 12) = {12}
S16: S(totalbarrels, 18) = {12,13,14,15,18,19,20}
S17: S(totalbarrels, 23) = {12,13,14,15,18,19,20}

 The next three slices demonstrate our convention regarding compiler-defined values.

S18: S(lockprice, 24) = {7}
S19: S(stockprice, 25) = {8}
S20: S(barrelprice, 26) = {9}
S21: S(locksales, 24) = {7,10,13,14,16,19,20,24}
S22: S(stocksales, 25) = {8,11,13,14,15,17,19,20,25}
S23: S(barrelsales, 26) = {9,12,13,14,15,18,19,20,26}

 The slices on sales and commission are the interesting ones. There is only one defining
node for sales, the A-def at node 27. The remaining slices on sales show the P-uses, C-uses,
and the O-use in definition-clear paths.

S24: S(sales, 27) = {7,8,9,10,11,12,13,14,15,16,17,18,19,20,24,25,26,27}
S25: S(sales, 28) = {7,8,9,10,11,12,13,14,15,16,17,18,19,20,24,25,26,27}
S26: S(sales, 29) = {7,8,9,10,11,12,13,14,15,16,17,18,19,20,24,25,26,27}
S27: S(sales, 33) = {7,8,9,10,11,12,13,14,15,16,17,18,19,20,24,25,26,27}
S28: S(sales, 34) = {7,8,9,10,11,12,13,14,15,16,17,18,19,20,24,25,26,27}
S29: S(sales, 37) = {7,8,9,10,11,12,13,14,15,16,17,18,19,20,24,25,26,27}
S30: S(sales, 39) = {7,8,9,10,11,12,13,14,15,16,17,18,19,20,24,25,26,27}

 Think about slice S24 in terms of its “components”, the slices on the C-use variables. We can
write S24 = S10 ∪ S13 ∪ S16 ∪ S21 ∪ S22 ∪ S23 ∪ {27}, where the values of the six C-use
variables at node 27 are defined by the six slices joined together by the union operation, and
node 27 is the A-def of sales itself. Notice
how the formalism corresponds to our intuition: if the value of sales is wrong, we first look
at how it is computed, and if this is OK, we check how the components are computed.
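The composition of the slice on sales can be verified with ordinary set operations. In this sketch the slice contents are copied from the lists above, with node 18 (the redefinition of totalbarrels) in S16. The variable names are ours.

```python
S10 = {10, 13, 14, 16, 19, 20}
S13 = {11, 13, 14, 15, 17, 19, 20}
S16 = {12, 13, 14, 15, 18, 19, 20}
S21 = {7, 10, 13, 14, 16, 19, 20, 24}
S22 = {8, 11, 13, 14, 15, 17, 19, 20, 25}
S23 = {9, 12, 13, 14, 15, 18, 19, 20, 26}

# S24 = S(sales, 27): nodes 7 to 20 plus 24, 25, 26 and 27.
S24 = set(range(7, 21)) | {24, 25, 26, 27}

# Union of the six component slices.
components = S10 | S13 | S16 | S21 | S22 | S23

# What the slice on sales adds beyond its components.
extra = S24 - components
```

The only node in `extra` is 27, the assignment that defines sales. Everything else in S24 is inherited from the slices on the variables that the assignment uses.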
 Everything comes together (literally) with the slices on commission. There are six A-def
nodes for commission (corresponding to the six du-paths we identified earlier). Three
computations of commission are controlled by P-uses of sales in the IF, ELSE IF logic. This
yields three “paths” of slices that compute commission. (See Figure 3.4.)
S31: S(commission, 31) = {31}
S32: S(commission, 32) = {31, 32}
S33: S(commission, 33) = {7,8,9,10,11,12,13,14,15,16,17,18,19, 20, 24, 25, 26,
27,29,30,31,32,33}
S34: S(commission, 36) = {36}
S35: S(commission, 37) = {7,8,9,10,11,12,13,14,15,16,17,18,19, 20, 24, 25, 26,
27,29,34,35,36,37}
S36: S(commission, 39) = {7,8,9,10,11,12,13,14,15,16,17,18,19, 20, 24, 25, 26,
27,29,34,38,39}
Whichever computation is taken, all come together in the last slice.
S37: S(commission, 41) = {7,8,9,10,11,12,13,14,15,16,17,18,19, 20, 24, 25, 26,
27,29,30,31,32,33,34,35,36,37,38,39}
 The slice information improves our insight. Look at the lattice in Figure 3.4; it is a directed
acyclic graph in which slices are nodes, and an edge represents the proper subset relationship.

[Figure: lattice in which the chain S31, S32, S33, the chain S34, S35, and slice S36 all lead down to S37.]

Figure 3.4 Lattice of Slices on Commission
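The edges of such a lattice can be derived from the slices themselves: draw an edge from A to B when A is a proper subset of B and no other slice lies strictly between them. The sketch below does this for the slices on commission. The slice S35 is assumed to include the predicate nodes 29, 34 and 35 on its branch, in the same way that S33 and S36 include theirs; the resulting edges do not depend on that assumption.

```python
BASE = set(range(7, 21)) | {24, 25, 26, 27}   # the slice on sales at node 27

SLICES = {
    "S31": {31},
    "S32": {31, 32},
    "S33": BASE | {29, 30, 31, 32, 33},
    "S34": {36},
    "S35": BASE | {29, 34, 35, 36, 37},
    "S36": BASE | {29, 34, 38, 39},
    "S37": BASE | set(range(29, 40)),
}


def lattice_edges(slices):
    """Edges (a, b): a is a proper subset of b with no slice strictly between."""
    names = sorted(slices)
    proper = {(a, b) for a in names for b in names if slices[a] < slices[b]}
    return sorted(
        (a, b)
        for a, b in proper
        if not any((a, c) in proper and (c, b) in proper for c in names)
    )


edges = lattice_edges(SLICES)
```

The computed edges are S31 to S32, S32 to S33, S34 to S35, and three edges into S37 from S33, S35 and S36. These are the three "paths" of slices that compute commission.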


[Figure: lattice containing slices S10, S13, S16, S21, S22, S23 and S24 on sales, and slices S31 through S37 on commission.]

Figure 3.5 Lattice on Sales and Commission


 This lattice is drawn so that the position of the slice nodes roughly corresponds with their
position in the source code. The definition-clear paths <33, 41>, <37, 41>, and <39,41>
correspond to the edges that show slices S33, S35, and S36 are subsets of slice S37. Figure
3.5 shows a lattice of slices for the entire program. Some slices (those that are identical to
others) have been deleted for clarity.

3.7.2 Style and Technique


 When we analyze a program in terms of “interesting” slices, we can focus on parts of interest
while disregarding unrelated parts. We couldn’t do this with du-paths — they are sequences
that include statements and variables that may not be of interest.
 Before discussing some analytic techniques, we’ll first look at “good style”. We could have
built these stylistic precepts into the definitions, but then the definitions become even more
cumbersome.
1. Never make a slice S(V, n) for which variables v of V do not appear in statement
fragment n. This possibility is permitted by the definition of a slice, but it is bad
practice. As an example, suppose we defined a slice on the locks variable at node 27.
Defining such slices necessitates tracking the values of all variables at all points in the
program.


2. Make slices on one variable. The set V in slice S(V,n) can contain several variables,
and sometimes such slices are useful. The slice S(V, 26) where
V = { locksales, stocksales, barrelsales }
contains all the elements of the slice S({sales}, 27) except statement 27. Since these two
slices are so similar, why define the one in terms of C-uses?

3. Make slices for all A-def nodes. When a variable is computed by an assignment
statement, a slice on the variable at that statement will include (portions of) all du-paths
of the variables used in the computation. Slice S({sales}, 27) is a good example of an
A-def slice.

4. Make slices for P-use nodes. When a variable is used in a predicate, the slice on that
variable at the decision statement shows how the predicate variable got its value. This
is very useful in decision-intensive programs like the Triangle program and NextDate.

5. Slices on non-P-use usage nodes aren’t very interesting. We discussed C-use slices
in point 2, where we saw they were very redundant with the A-def slice. Slices on O-
use variables can always be expressed as unions of slices on all the A-defs (and I-defs)
of the O-use variable. Slices on I-use and O-use variables are useful during debugging,
but if they are mandated for all testing, the test effort is dramatically increased.

6. Consider making slices compilable. Nothing in the definition of a slice requires that
the set of statements is compilable, but if we make this choice, it means that a set of
compiler directive and declarative statements is a subset of every slice. If we added this
same set of statements to all the slices we made for the commission program, our
lattices remain undisturbed, but each slice is separately compilable (and therefore
executable).


Guidelines and Observations


 Dataflow testing is clearly indicated for programs that are computationally intensive. As a
corollary, in control intensive programs, if control variables are computed (P-uses), dataflow
testing is also indicated.
 The definitions we made for define/use paths and slices give us very precise ways to describe
parts of a program that we would like to test. There are academic tools that support these
definitions, but they haven’t migrated to the commercial marketplace.
 Some pieces are there; you can find programming language compilers that provide on-screen
highlighting of slices, and most debugging tools let you “watch” certain variables as you step
through a program execution.
 Here are some tidbits that may prove helpful to you, particularly when you have a difficult
module to test.
1. Slices don’t map nicely into test cases (because the other, non-related code is still in
an executable path). On the other hand, they are a handy way to eliminate interaction
among variables. Use the slice composition approach to re-develop difficult sections of
code, and test these slices before you splice (compose) them with other slices.

2. Relative complements of slices yield a “diagnostic” capability. The relative
complement of a set B with respect to another set A is the set of all elements of A that
are not elements of B. It is denoted as A - B. Consider the relative complement set
S(commission, 41) - S(sales, 27):

S(commission, 41) = {7,8,9,10,11,12,13,14,15,16,17,18,19,20,24,25,26,
27,29,30,31,32,33,34,35,36,37,38,39}
S(sales, 27) = {7,8,9,10,11,12,13,14,15,16,17,18,19,20,24,25,26,27}
S(commission, 41) - S(sales, 27) = {29,30,31,32,33,34,35,36,37,38,39}

If there is a problem with commission at line 41, we can divide the program into two
parts, the computation of sales at line 27, and the computation of commission between
lines 29 and 41. If sales is OK at line 27, the problem must lie in the relative
complement; if not, the problem may be in either portion.
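The diagnostic use of a relative complement maps directly onto set difference. The sketch below is a minimal illustration. The function `suspect_nodes` and its boolean argument are hypothetical names for the check described above.

```python
S_COMMISSION_41 = (
    set(range(7, 21)) | {24, 25, 26, 27} | set(range(29, 40))
)
S_SALES_27 = set(range(7, 21)) | {24, 25, 26, 27}


def suspect_nodes(sales_is_correct: bool):
    """Nodes to inspect when commission is wrong at the end of the program."""
    if sales_is_correct:
        # The fault must lie in the relative complement.
        return sorted(S_COMMISSION_41 - S_SALES_27)
    # Otherwise the fault may be in either portion.
    return sorted(S_COMMISSION_41)


narrowed = suspect_nodes(True)
```

When sales is known to be correct, the search narrows from 29 nodes to the 11 nodes numbered 29 through 39.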


3. There is a many-to-many relationship between slices and DD-Paths: statements in
one slice may be in several DD-Paths, and statements in one DD-Path may be in several
slices. Well-chosen relative complements of slices can be identical to DD-Paths. For
example, consider S(commission, 40) - S(commission, 37).

4. If you develop a lattice of slices, it’s convenient to postulate a slice on the very first
statement. This way, the lattice of slices always terminates in one root node. Show
equal slices with a two-way arrow.

5. Slices exhibit define/reference information. Consider the following slices on
totallocks:
S9: S(totallocks, 10) = {10}
S10: S(totallocks, 16) = {10,13,14,16,19,20}
S11: S(totallocks, 21) = {10,13,14,16,19,20}
When slices are equal, the corresponding paths are definition-clear.
