Classification of Iris Data Set

Mentor:
Assist. Prof. Primož Potočnik

Student:
Vitaly Borovinskiy
Ljubljana, 2009
1. Problem statement
Fisher's Iris database (Fisher, 1936) is perhaps the best known
database to be found in the pattern recognition literature. The data
set contains 3 classes of 50 instances each, where each class refers
to a type of iris plant. One class is linearly separable from the other
two; the latter are not linearly separable from each other.
The database contains the following attributes:
1) sepal length in cm
2) sepal width in cm
3) petal length in cm
4) petal width in cm
5) class:
- Iris Setosa
- Iris Versicolour
- Iris Virginica
Fisher's Iris database is available in Matlab (load fisheriris) and on the
Internet (for example, at http://archive.ics.uci.edu/ml/datasets/Iris).
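For orientation, this is what the Matlab copy of the data set looks like
(a minimal sketch; the variables meas and species are what load fisheriris
provides):

% inspect the built-in copy of the Iris data set
load fisheriris
size(meas)        % 150x4: sepal length, sepal width, petal length, petal width
unique(species)   % {'setosa'; 'versicolor'; 'virginica'}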
The goal of the seminar is to demonstrate the process of building a
neural network based classifier that solves the classification problem.
During the seminar various neural network based approaches will be
shown, the process of building various neural network architectures
will be demonstrated, and finally classification results will be
presented.
2. Theoretical part
In this seminar the classification problem is solved by 3 types of neural
networks:
1) multilayer perceptron;
2) radial basis function network;
3) probabilistic neural network.
These network types are briefly described in this seminar, and the
corresponding toolbox constructors are sketched below. Each of these
networks has adjustable parameters that affect its performance.
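A minimal sketch of the three constructors as they are used later in the
appendix code (grp2idx from the Statistics Toolbox and all parameter values
here are my illustrative choices, not taken from the original):

% build one network of each type on the full Iris data
load fisheriris
P  = meas';                                 % 4x150 input matrix
Tc = grp2idx(species)';                     % 1x150 class indices 1..3
T  = full(ind2vec(Tc));                     % 3x150 one-hot targets
net_mlp = newff(P,T,[3 3],{},'trainbfg');   % multilayer perceptron
net_rbf = newrb(P,T,0,1.0,40,5);            % radial basis function network (trains on creation)
net_pnn = newpnn(P,ind2vec(Tc),0.5);        % probabilistic neural network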
2.1 Multilayer perceptron
The multilayer perceptron is a multilayer feedforward network.
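In compact form (the notation is mine, not from the original), a perceptron
with one hidden layer computes

    y = f_2( W_2 f_1( W_1 x + b_1 ) + b_2 )

where x is the input vector, W_i and b_i are the layer weight matrices and
bias vectors, and f_i are the layer transfer functions; a second hidden
layer nests one more affine map and transfer function. The hidden layer
sizes are the design parameters adjusted in section 3.2.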
3. Practical part
3.1 Cross-validation
In this seminar a cross-validation procedure is applied to provide
better generalization of the neural network classifiers. To perform the
cross-validation procedure the input data is partitioned into 3 sets:
1) training set;
2) validation set;
3) test set.
The training set is used to train the network. The validation set is used
to validate the network, i.e. to adjust the network design parameters.
The test set is used to test the generalization performance of the
selected neural network design.
To ensure a correct comparison of the different types of neural networks,
the division of the input data into training, validation and test sets is
performed by an independent part of the code (see Appendix) and the
division result is stored.
The partitioning of the input data is performed randomly, with a certain
ratio of the input samples assigned to the training, validation and test
sets (0.7, 0.15 and 0.15 respectively), as sketched below.
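A minimal sketch of such a per-class split (the 4xN orientation and the
trainSeto/valSeto/testSeto names follow the appendix code; the use of
randperm and round here is my assumption, not the original division code):

% randomly divide one class into training/validation/test parts
load fisheriris
seto = meas(1:50,:)';                         % 4x50 block of Iris setosa
idx = randperm(50);                           % random order within the class
nTrain = round(0.70*50);                      % 35 samples
nVal   = round(0.15*50);                      % 8 samples (7.5 rounded up)
trainSeto = seto(:,idx(1:nTrain));
valSeto   = seto(:,idx(nTrain+1:nTrain+nVal));
testSeto  = seto(:,idx(nTrain+nVal+1:end));   % remaining 7 samples
% ... repeated for versicolor and virginica, then stored:
% save divinp.mat trainSeto valSeto testSeto ...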
3.2 Multilayer perceptron
Since the architecture and the performance of the multilayer perceptron
are determined by the number of hidden layers and by the number of
neurons in each hidden layer, these are the network design parameters
that are adjusted. The correct classification function is introduced as
the ratio of the number of correctly classified inputs to the total
number of inputs.
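In the appendix code this function is computed from the sign agreement
between the +-1 coded targets and the network outputs; each sample
contributes three elements, so the raw percentage runs up to 300 and is
divided by 3. A self-contained toy check (the Y values are hypothetical):

% correct classification from sign agreement, as in the appendix code
a = [-1 -1 +1]'; b = [-1 +1 -1]'; c = [+1 -1 -1]';  % class coding
T = [a b c a];                                   % 3x4 targets (4 samples)
Y = [-0.9 -0.7  0.8 -0.6;
     -0.8  0.9 -0.7 -0.5;
      0.7 -0.6 -0.9  0.4];                       % hypothetical outputs
cor = 100 * length(find(T.*Y > 0)) / length(T);  % 12 matches / 4 samples = 300
cor = cor / 3                                    % per-sample percentage: 100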
Fig. 1. Correct classification function for multilayer perceptron with 1 hidden layer.
Blue line: training set; green line: validation set.
Fig. 2. Correct classification function for multilayer perceptron with 2 hidden layers
(2 orthogonal projections of the surface). Training set.
Fig. 3. Correct classification function for multilayer perceptron with 2 hidden layers
(2 orthogonal projections of the surface). Validation set.
3.3 Radial basis function network

The spread of the radial basis functions is the design parameter that is
adjusted. For each value of the spread the values of the correct
classification function for the training set and for the validation set
are stored. The values of the correct classification function are plotted
versus the spread.

Fig. 4. Correct classification function for radial basis function network.
Blue line: training set; green line: validation set.

Fig. 5. Correct classification function for radial basis function network.
Blue line: training set; green line: validation set.
4. Results

Sets of inputs          Multilayer    Radial basis        Probabilistic
                        perceptron    function network    neural network
-----------------------------------------------------------------------
training + validation   99.483%       99.225%             98.450%
test                    96.825%       100%                95.238%
Sets of inputs          Multilayer    Radial basis        Probabilistic
                        perceptron    function network    neural network
-----------------------------------------------------------------------
training + validation   100%          99.483%             100%
test                    96.825%       96.825%             95.238%
... Levenberg-Marquardt algorithm, because the ...
5. Conclusions
1. The classification performance of all 3 investigated types of neural
networks is acceptable.
2. The radial basis function network exhibits better generalization
performance than the multilayer perceptron and the probabilistic
neural network.
3. The small number of inputs crucially affects the generalization
performance of a neural network classifier.
Appendix
1. Multilayer perceptron Matlab code
close all; clear; clc
%% load divided input data set
load divinp.mat
% coding (+1/-1) of 3 classes
a = [-1 -1 +1]';
b = [-1 +1 -1]';
c = [+1 -1 -1]';
% define training inputs
trainInp = [trainSeto trainVers trainVirg];
% define targets
T = [repmat(a,1,length(trainSeto)) repmat(b,1,length(trainVers)) ...
    repmat(c,1,length(trainVirg))];
%% network training
trainCor = zeros(10,10);
valCor = zeros(10,10);
Xn = zeros(1,10);
Yn = zeros(1,10);
for k = 1:10,
Yn(1,k) = k;
for n = 1:10,
Xn(1,n) = n;
net = newff(trainInp,T,[k n],{},'trainbfg');
net = init(net);
net.divideParam.trainRatio = 1;
net.divideParam.valRatio = 0;
net.divideParam.testRatio = 0;
%net.trainParam.show = NaN;
net.trainParam.max_fail = 2;
valInp = [valSeto valVers valVirg];
valT = [repmat(a,1,length(valSeto)) repmat(b,1,length(valVers)) ...
    repmat(c,1,length(valVirg))];
% validation structure for old-style train(): both inputs and targets
VV.P = valInp;
VV.T = valT;
net = train(net,trainInp,T,[],[],VV);
Y = sim(net,trainInp);
[Yval,Pfval,Afval,Eval,perfval] = sim(net,valInp,[],[],valT);
% calculate [%] of correct classifications
trainCor(k,n) = 100 * length(find(T.*Y > 0)) / length(T);
valCor(k,n) = 100 * length(find(valT.*Yval > 0)) / length(valT);
end
end
figure
surf(Xn,Yn,trainCor/3);
view(2)
figure
surf(Xn,Yn,valCor/3);
view(2)
%% final training
k = 3;
n = 3;
fintrain = [trainInp valInp];
finT = [T valT];
net = newff(fintrain,finT,[k n],{},'trainbfg');
net.divideParam.trainRatio = 1;
net.divideParam.valRatio = 0;
net.divideParam.testRatio = 0;
net = train(net,fintrain,finT);
finY = sim(net,fintrain);
finCor = 100 * length(find(finT.*finY > 0)) / length(finT);
fprintf('Num of neurons in 1st layer = %d\n',net.layers{1}.size)
fprintf('Num of neurons in 2nd layer = %d\n',net.layers{2}.size)
fprintf('Correct class = %.3f %%\n',finCor/3)
%% Testing
% define test set
testInp = [testSeto testVers testVirg];
testT = [repmat(a,1,length(testSeto)) repmat(b,1,length(testVers)) ...
    repmat(c,1,length(testVirg))];
testOut = sim(net,testInp);
testCor = 100 * length(find(testT.*testOut > 0)) / length(testT);
fprintf('Correct class = %.3f %%\n',testCor/3)
% plot targets and network response
figure;
plot(testT')
xlim([1 21])
ylim([-2 2])
set(gca,'ytick',[-2 0 2])
hold on
grid on
plot(testOut','r')
legend('Targets','Network response')
xlabel('Sample No.')
2. Radial basis function network Matlab code

%% choose a spread constant (1st step)
Sp(1,i) = spread;
% choose max number of neurons
K = 40;
% performance goal (SSE)
goal = 0;
% number of neurons to add between displays
Ki = 5;
% create a neural network
net = newrb(trainInp,T,goal,spread,K,Ki);
% simulate RBFN on training data
Y = sim(net,trainInp);
% define validation vector
valInp = [valSeto valVers valVirg];
valT = [repmat(a,1,length(valSeto)) repmat(b,1,length(valVers)) ...
    repmat(c,1,length(valVirg))];
[Yval,Pf,Af,E,perf] = sim(net,valInp,[],[],valT);
% calculate [%] of correct classifications
Cor(1,i) = 100 * length(find(T.*Y > 0)) / length(T);
Cor(2,i) = 100 * length(find(valT.*Yval > 0)) / length(valT);
end
figure
pl = plot(Sp,Cor/3);
set(pl,{'linewidth'},{1,3}');
%% choose a spread constant (2nd step)
spread = 1.0;
Cor = zeros(2,410);
Sp = zeros(1,410);
Sp(1,1) = spread;
for i = 1:410,
spread = spread - 0.001;
Sp(1,i) = spread;
% choose max number of neurons
K = 40;
% performance goal (SSE)
goal = 0;
% number of neurons to add between displays
Ki = 5;
% create a neural network
net = newrb(trainInp,T,goal,spread,K,Ki);
% simulate RBFN on training data
Y = sim(net,trainInp);
% define validation vector
valInp = [valSeto valVers valVirg];
valT = [repmat(a,1,length(valSeto)) repmat(b,1,length(valVers)) ...
    repmat(c,1,length(valVirg))];
[Yval,Pf,Af,E,perf] = sim(net,valInp,[],[],valT);
% calculate [%] of correct classifications
Cor(1,i) = 100 * length(find(T.*Y > 0)) / length(T);
Cor(2,i) = 100 * length(find(valT.*Yval > 0)) / length(valT);
end
figure
pl = plot(Sp,Cor/3);
set(pl,{'linewidth'},{1,3}');
%% final training
spr = 0.8;
fintrain = [trainInp valInp];
finT = [T valT];
[net,tr] = newrb(fintrain,finT,goal,spr,K,Ki);
% simulate RBFN on training data
finY = sim(net,fintrain);
% calculate [%] of correct classifications
finCor = 100 * length(find(finT.*finY > 0)) / length(finT);
fprintf('\nSpread = %.3f\n',spr)
fprintf('Num of neurons = %d\n',net.layers{1}.size)
fprintf('Correct class = %.3f %%\n',finCor/3)
% plot targets and network response on the final training set
figure;
plot(finT')
ylim([-2 2])
set(gca,'ytick',[-2 0 2])
hold on
grid on
plot(finY','r')
legend('Targets','Network response')
xlabel('Sample No.')
%% Testing
% define test set
testInp = [testSeto testVers testVirg];
testT = [repmat(a,1,length(testSeto)) repmat(b,1,length(testVers)) ...
    repmat(c,1,length(testVirg))];
testOut = sim(net,testInp);
testCor = 100 * length(find(testT.*testOut > 0)) / length(testT);
fprintf('\nSpread = %.3f\n',spr)
fprintf('Num of neurons = %d\n',net.layers{1}.size)
fprintf('Correct class = %.3f %%\n',testCor/3)
% plot targets and network response
figure;
plot(testT')
ylim([-2 2])
set(gca,'ytick',[-2 0 2])
hold on
grid on
plot(testOut','r')
legend('Targets','Network response')
xlabel('Sample No.')
3. Probabilistic neural network Matlab code

%% choose a spread constant
spread = 1.1;
Cor1 = zeros(2,109);
Sp1 = zeros(1,109);
Sp1(1,1) = spread;
for i = 1:109,
spread = spread - 0.01;
Sp1(1,i) = spread;
% create a neural network
net = newpnn(trainInp,ind2vec(T),spread);
% simulate PNN on training data
Y = sim(net,trainInp);
% convert PNN outputs to class indices
Y = vec2ind(Y);
Yval = sim(net,valInp,[],[],ind2vec(valT));
Yval = vec2ind(Yval);
% calculate [%] of correct classifications
Cor1(1,i) = 100 * length(find(T==Y)) / length(T);
Cor1(2,i) = 100 * length(find(valT==Yval)) / length(valT);
end
figure
pl1 = plot(Sp1,Cor1);
set(pl1,{'linewidth'},{1,3}');
%% final training
spr = 0.242;
fintrain = [trainInp valInp];
finT = [T valT];
net = newpnn(fintrain,ind2vec(finT),spr);
% simulate PNN on training data
finY = sim(net,fintrain);
% convert PNN outputs
finY = vec2ind(finY);
% calculate [%] of correct classifications
finCor = 100 * length(find(finT==finY)) / length(finT);
fprintf('\nSpread = %.3f\n',spr)
fprintf('Num of neurons = %d\n',net.layers{1}.size)
fprintf('Correct class = %.3f %%\n',finCor)
% plot targets and network response on the final training set
figure;
plot(finT')
ylim([0 4])
set(gca,'ytick',[1 2 3])
hold on
grid on
plot(finY','r')
legend('Targets','Network response')
xlabel('Sample No.')
%% Testing
% define test set
testInp = [testSeto testVers testVirg];
testT = [repmat(a,1,length(testSeto)) repmat(b,1,length(testVers)) ...
    repmat(c,1,length(testVirg))];
testOut = sim(net,testInp);
testOut = vec2ind(testOut);
testCor = 100 * length(find(testT==testOut)) / length(testT);
fprintf('\nSpread = %.3f\n',spr)
fprintf('Num of neurons = %d\n',net.layers{1}.size)
fprintf('Correct class = %.3f %%\n',testCor)
% plot targets and network response
figure;
plot(testT')
ylim([0 4])
set(gca,'ytick',[1 2 3])
hold on
grid on
plot(testOut','r')
legend('Targets','Network response')
xlabel('Sample No.')