Quantum Neural Networks Versus Conventional Feedforward Neural Networks
INTRODUCTION
The limitations of conventional FFNNs motivated the development of inherently fuzzy feedforward neural networks, known as quantum neural networks (QNNs) [6], [7]. Conventional FFNNs and QNNs satisfy the requirements outlined in [3] for universal function approximators. In addition to their function approximation capabilities, QNNs have also been shown to be capable of representing and quantifying the uncertainty inherent in the training data. More specifically, QNNs can identify overlapping between classes due to their capacity to approximate any arbitrary membership profile to any degree of accuracy.
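The hidden units of a QNN differ from those of a conventional FFNN in their transfer function: each hidden unit uses a multilevel, staircase-like activation formed by superposing several shifted sigmoids, whose jump positions (the quantum intervals) are adjusted during training [6], [7]. The sketch below illustrates one common form of such an activation; the slope and jump positions are placeholder values rather than parameters from the reported experiments.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def multilevel_activation(x, thetas, beta=5.0):
    # Superposition of shifted sigmoids: one sigmoid per quantum level.
    # The jump positions `thetas` (quantum intervals) are adapted during
    # training; `beta` controls the steepness of each step.
    thetas = np.asarray(thetas, dtype=float)
    return np.mean([sigmoid(beta * (x - t)) for t in thetas], axis=0)

# With one level this reduces to an ordinary sigmoid hidden unit;
# with several levels the response becomes a staircase-like function.
x = np.linspace(-2.0, 3.0, 7)
print(multilevel_activation(x, thetas=[-1.0, 0.0, 1.0]))
```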
QNNs have recently been used, together with FFNNs, to classify and
remove bird-contaminated data recorded by a 1290-MHz wind profiler [2].
QNNs outperformed FFNNs in this application, but the high dimensionality
of the input space did not allow visual inspection of the results. This paper
presents an experimental comparison of conventional FFNNs and QNNs on
a pattern classification problem involving two-dimensional (2-D) vowel data.
EXPERIMENTAL RESULTS
Figure 1: The normalized vowel data: (a) the training set and (b) the testing set. The two inputs x1 and x2 represent the normalized values of the first two formants of the vowels. Each symbol denotes a different vowel.
The lowest percentage of classification errors on the testing set was produced by an FFNN containing six hidden units. This
FFNN was utilized in the experiments that followed. These experiments also calculated the classification errors E and the values of the average class-conditional variance G produced on the testing set by the FFNN and by QNNs with six hidden units containing multilevel activation functions with 2, 3, 4, 5, and 10 quantum levels. The initializations that resulted in the lowest values of E and G are summarized in Tables 1 and 2, respectively. The percentage of classification errors E was computed based on a winner-takes-all strategy, that is, the input vector was assigned to the class represented by the output unit with the largest response. When evaluated in terms of the percentage of classification errors E, the FFNN and the QNNs tested in these experiments produced comparable results. On average, the QNNs led to lower values of the average class-conditional variance G than the FFNN. This is not surprising, since the quantum intervals of the QNNs were updated during learning by minimizing the average class-conditional variance.
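For concreteness, the snippet below shows one plausible reading of these two figures of merit: the winner-takes-all error E and an average class-conditional variance G computed from network responses. The exact definition of G used by the QNN learning algorithm is given in [6], [7] and is not reproduced here, so this is only an illustrative sketch with hypothetical array names.

```python
import numpy as np

def winner_takes_all_error(outputs, labels):
    # Percentage of samples whose largest network output does not
    # correspond to the true class.
    predicted = np.argmax(outputs, axis=1)
    return 100.0 * np.mean(predicted != labels)

def average_class_conditional_variance(responses, labels):
    # For each class, compute the variance of each response component over
    # the samples of that class, then average over components and classes.
    classes = np.unique(labels)
    per_class = [responses[labels == c].var(axis=0).mean() for c in classes]
    return float(np.mean(per_class))

# Hypothetical usage with network outputs Y (N x K) and class labels t (N,):
# E = winner_takes_all_error(Y, t)
# G = average_class_conditional_variance(Y, t)
```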
Figure 2: Contour plots of the output values of one specific output unit (i.e., one specific class) produced by (a) the FFNN, (b) QNN (2), (c) QNN (3), (d) QNN (4), (e) QNN (5), and (f) QNN (10).
The second set of experiments evaluated the ability of FFNNs and QNNs
to estimate class membership from the data. This investigation employed
the neural networks that produced the lowest percentage of classification
errors E in the previous set of experiments (see Table 1). Figures 2(a)-2(f)
show the contour plots produced by the neural networks listed in Table 1
for class 1. All samples of class 1 that are represented by circles in Figure 1
are also shown as circles in Figure 2. The exact position of each sample of
class 1 is marked by a dot. No specific symbols are used in Figure 2 for the
samples of all other classes. Instead, their exact positions are marked by dots.
The contour lines (i.e., lines of constant values) of the network output values
that represent class 1 were obtained by presenting to the neural networks
input vectors produced by a regular 100 x 100 grid that covered the region
[-1,1] x [-0.5,2.5] of the input space. Dense contour lines produced dark
areas in Figure 2 and mark regions of the input space corresponding to a
steep slope in the output. Less dense contour lines produced light areas
in Figure 2 and mark regions of the input space corresponding to almost
constant output values. Figure 2 also shows the output values corresponding
to some selected contour lines. The highest output values were found near
the center of class 1 (i.e., near the center of the figures) and towards the top
right. The contour lines produced by the FFNN define a smooth surface.
In contrast, the contour lines produced by the QNNs quantize the input
space into regions corresponding to almost constant output values. These
regions represent certain levels of class membership or certain degrees of
class overlapping. This can be verified by observing the distribution of the
circled and non-circled data points in Figures 2(b)-2(f). The number of flat
regions produced by the QNNs increased as the number of quantum levels
increased. For a small number of quantum levels, QNNs produced a rough
membership profile representing the uncertainty in the training data, such
as that shown in Figure 2(b). For a large number of quantum levels, QNNs
seem to assign an individual level of uncertainty to small groups of training
vectors or even to isolated training vectors. An example of such a behavior
is shown in Figure 2(f). Referring to Figure 2, the most reliable membership profiles were produced by the QNNs with 3, 4, and 5 quantum levels.
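As a rough sketch of how such plots can be generated, the snippet below evaluates a network on the 100 x 100 grid over [-1, 1] x [-0.5, 2.5] described above and draws contour lines for the output unit representing class 1. The network used here is a placeholder Gaussian-bump model, not one of the trained networks from the experiments.

```python
import numpy as np
import matplotlib.pyplot as plt

def network(X):
    # Stand-in for a trained FFNN or QNN: one Gaussian bump per class,
    # centred at hypothetical class centres (placeholder values only).
    centres = np.array([[0.0, 1.0], [-0.5, 0.3], [0.5, 2.0]])
    d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2)

# Regular 100 x 100 grid covering [-1, 1] x [-0.5, 2.5] of the input space.
x1 = np.linspace(-1.0, 1.0, 100)
x2 = np.linspace(-0.5, 2.5, 100)
X1, X2 = np.meshgrid(x1, x2)
grid = np.column_stack([X1.ravel(), X2.ravel()])

outputs = network(grid)              # shape (10000, number of classes)
z = outputs[:, 0].reshape(X1.shape)  # output unit representing class 1

# Dense contour lines mark steep slopes; sparse lines mark nearly flat regions.
cs = plt.contour(X1, X2, z, levels=20)
plt.clabel(cs, inline=True, fontsize=7)
plt.xlabel("Input x1")
plt.ylabel("Input x2")
plt.show()
```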
The third set of experiments investigated how the response of trained
QNNs is affected by the minimization of the average class-conditional vari-
ance during the learning process. Figures 3(a) and 3(b) show the contour
lines produced by two QNNs with 3 quantum levels trained using two differ-
ent random initializations of their free parameters. These two initializations
produced different values for the average class-conditional variance corre-
sponding to class 1 and led to different membership profiles, as indicated by
comparing Figures 3(a) and 3(b). It is also clear that the minimization of the average class-conditional variance has a significant impact on the representation of uncertainty by trained QNNs.
Figure 3: Comparison of two QNNs with 3 quantum levels whose training resulted in the following values of the average class-conditional variance for class 1: (a) 1.4, and (b) 3.9.
Figure 4: Decision boundaries produced by the FFNN on (a) the training set and (b) the testing set, the QNN (4) on (c) the training set and (d) the testing set, and the QNN (5) on (e) the training set and (f) the testing set.
The fourth set of experiments compared the decision boundaries produced by the trained neural networks on the training and testing sets. The neural networks tested in these experiments produced dif-
ferent decision boundaries despite the fact that they all produced almost the
same number of classification errors. According to Figures 4(a) and 4(b), the
FFNN produced relatively smooth and fairly adequate decision boundaries
for this data set. Figures 4(c)-4(f) indicate that the QNNs produced slightly
eccentric decision boundaries for certain regions of the input space. This ex-
perimental outcome reveals that a winner-takes-all strategy is not capable of
interpreting the rich and complex structure of the QNN output. As a result,
this strategy discards information that could improve class label assignment.
This reveals the need for the development of alternative strategies for class
label assignment that could potentially improve the reliability of decision
making.
The results of the previous experiments motivated the development of
simple rules for identifying regions of uncertainty in the input space based
on the outputs of trained neural networks. These rules relied on a statistical
analysis of the most dominant network outputs corresponding to correctly
and incorrectly classified training samples. In these experiments, the input
vectors produced by a regular 100 x 100 grid were assigned to an uncertainty
region if
Rule 1: the highest output of the neural network was lower than the average of the highest outputs of correctly classified samples and the highest outputs of incorrectly classified samples;

Rule 2: the absolute value of the difference of the two highest outputs was lower than the average of the two highest outputs (see the sketch below).
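A minimal sketch of the two rules is given below, assuming the network outputs on the evaluation grid and on the training set are available as NumPy arrays; the averaging in Rule 1 admits more than one reading, and the one used here is only a plausible interpretation.

```python
import numpy as np

def uncertainty_regions(grid_outputs, train_outputs, train_labels):
    # grid_outputs : (G, K) network outputs on the evaluation grid
    # train_outputs: (N, K) network outputs on the training set
    # train_labels : (N,)   true class indices of the training samples
    correct = np.argmax(train_outputs, axis=1) == train_labels

    # Rule 1 threshold: average of the highest outputs of correctly and
    # incorrectly classified training samples (one plausible reading).
    top_correct = train_outputs[correct].max(axis=1)
    top_incorrect = train_outputs[~correct].max(axis=1)
    threshold = 0.5 * (top_correct.mean() + top_incorrect.mean())

    ordered = np.sort(grid_outputs, axis=1)
    top1, top2 = ordered[:, -1], ordered[:, -2]

    rule1 = top1 < threshold
    # Rule 2: the two highest outputs differ by less than their average.
    rule2 = np.abs(top1 - top2) < 0.5 * (top1 + top2)
    return rule1, rule2
```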
Figures 5(a)-5(f) show the decision boundaries and the regions of uncertainty produced by the two rules for the FFNN and the two QNNs with 4 and 5 quantum levels that achieved the lowest percentage of classification errors E (see Table 1). According to the regions of uncertainty shown in Figure 5, Rule 2 appears to be more reliable than Rule 1. The distribution of the training data in the input space also indicates that the regions of uncertainty produced by Rule 2 can be interpreted as the regions of the input space where class label assignment according to a winner-takes-all strategy may not be valid. Comparison of Figures 5(a)-5(f) indicates that the QNN with
5 quantum levels provided a more reliable basis for identifying regions of
uncertainty than the QNN with 4 quantum levels. This can be attributed to
the fact that the average class-conditional variance computed on the trained
QNN with 5 quantum levels was lower than that computed on the QNN with
4 quantum levels (see Table 1). According to Figures 5(e) and 5(f), Rule 2
produced uncertainty regions where no training data were available. Note
also that the QNN with 5 quantum levels produced very narrow uncertainty
regions where the classes are clearly separated and wider uncertainty regions
where there is significant overlapping between different classes.
CONCLUSIONS
The study outlined in this paper compared conventional FFNNs and QNNs
on a pattern classification problem involving a set of 2-D vowel data. This
data set is particularly challenging for any pattern classifier due to exten-
sive overlapping between the samples belonging to different classes. FFNNs
and QNNs produced comparable classification rates when class label assign-
ment was based on a winner-takes-all strategy. However, there were some
remarkable differences between the responses of the output units of trained
FFNNs and QNNs. Unlike FFNNs, which produced smooth bell-like surfaces, the QNNs quantized the input space into regions of almost constant output values that represent different levels of class membership or degrees of class overlapping.
REFERENCES