A Comparative Study of Existing Machine Learning Approaches For Parkinson's Disease Detection
To cite this article: Gunjan Pahuja & T. N. Nagabhushan (2018): A Comparative Study of Existing
Machine Learning Approaches for Parkinson's Disease Detection, IETE Journal of Research, DOI:
10.1080/03772063.2018.1531730
ABSTRACT
Parkinson's disease (PD) has affected millions of people worldwide and is more prevalent in people over the age of 50. Even today, with many technologies and advancements, early detection of this disease remains a challenge. This necessitates machine learning-based automatic approaches that help clinicians detect the disease accurately in its early stage. The focus of this research paper is therefore to provide an insightful survey and comparison of the existing computational intelligence techniques used for PD detection. To save time and increase treatment efficiency, classification has found its place in PD detection. The existing knowledge review indicates that many classification algorithms have been used to achieve better results, but the problem is to identify the most efficient classifier for PD detection. The challenge in identifying the most appropriate classification algorithm lies in its application to a local dataset. Thus, in this paper three types of classifiers, namely the Multilayer Perceptron, Support Vector Machine and K-nearest neighbor, are evaluated on a benchmark (voice) dataset to determine which of them is the most efficient and accurate for PD classification. The voice input dataset for these classifiers was obtained from the UCI machine learning repository. ANN with the Levenberg–Marquardt algorithm was found to be the best classifier, having the highest classification accuracy (95.89%). Moreover, we compared our results with those obtained by Resul Das ["A comparison of multiple classification methods for diagnosis of Parkinson disease," Expert Systems with Applications, vol. 37, pp. 1568–1572, 2010].

KEYWORDS
Artificial neural networks (ANN); K-nearest neighbors (KNN); Parkinson's disease (PD); support vector machine (SVM)
1. INTRODUCTION
Parkinson's disease (PD) is a progressive neurodegenerative disorder of the nervous system that affects body movements, including speech. Dr. James Parkinson discovered this disease in 1817 [1] and described the condition, which he called the "Shaking Palsy". Neurodegenerative diseases are defined as hereditary and sporadic conditions characterized by progressive dysfunction of the nervous system (JPND research, 2015). Among the many neurodegenerative diseases, such as Alzheimer's disease, brain cancer, degenerative nerve diseases and epilepsy, Parkinson's disease is considered to be the second most common [2].

PD is mainly caused by the progressive loss of dopamine neurons in the area of the midbrain called the substantia nigra – the "movement control center" of the brain (Figure 1). Loss of dopamine causes the neurons to fire in an uncontrolled manner, resulting in the condition called hypokinetic movement disorder [3]. Although this disease can be diagnosed easily in the advanced stage, effective treatment is still very challenging. To date, there exists no cure or medical treatment for PD.

After decades of exhaustive study, the causes of PD are still unknown. Many researchers think that a combination of genetic [4] and environmental factors [5], such as exposure to environmental toxins, head injury, rural living, drinking water, manganese and exposure to pesticides, is responsible for PD. These factors may vary from person to person. Also, there are some specific symptoms that an individual experiences, and each PD patient experiences these symptoms differently. A description of the different stages of PD is given in Table 1. Primary motor symptoms of PD include tremor of the hands, arms, legs, jaw and face; bradykinesia or slowness of movement; rigidity or stiffness of the limbs and trunk; and postural instability or impaired balance and coordination [6–8]. In addition to these symptoms, some non-motor symptoms like depression and loss of memory may occur and affect the quality of life [9,10]. At the advanced stage, PD can be easily and accurately diagnosed, but effective treatment is a challenging task. Also, if treatment is started in the advanced stages, it may be less effective in controlling PD progression. This situation necessitates the early and accurate diagnosis of PD, which in turn helps the patients.
Hui-Ling Chen et al. [23] used a fuzzy k-nearest neighbor approach with Principal Component Analysis for predicting PD and constructing the feature subset from the whole feature space. The authors reported that their proposed method outperformed the other methods in the literature.

Omer et al. [16] compared the performance of LS-SVM, SVM, MLPNN and GRNN in the remote tracking of PD progression. It was observed that LS-SVM outperforms the other methods while mapping the vocal features to UPDRS data.

It is clear from the literature that most PD patients exhibit gait disorder [20] along with vocal impairment. You-Yin et al. [24] developed a gait regression model for predicting the severity of motor dysfunction from gait image sequences. The studies done so far also indicate that there is a loss of neurons in the dopamine region of the brain in individuals affected by PD. Thus, over the past two decades, neuroimaging techniques such as MRI, SPECT, fMRI and PET have been used to visually assess and quantify the loss of neurons in different lobes of the brain [25–27]. MRI is preferred over the others because of its non-invasiveness and high spatial resolution [28,29]. In the literature, various machine learning techniques and approaches have been found to be effective for diagnosing PD patients using neuroimaging.

The changes in the functional connectivity of motor networks in the resting state in PD, using fMRI and a network model based on graph theory, were demonstrated by Tao et al. [30]. The authors found that functional connectivity in the supplementary motor area, left dorsal lateral prefrontal cortex and left putamen of PD patients in the off state had significantly decreased, while functional connectivity in the left cerebellum, left primary motor cortex and left parietal cortex had increased compared with normal subjects. Defeng et al. [31] conducted a real-time case study using deep brain electrode implantation to predict PD tremor. Similarly, Christian Salvatore et al. [32] used a dataset of MRI scans from 28 controls, 28 PD patients and 28 Progressive Supranuclear Palsy patients; a supervised machine learning algorithm was applied, based on PCA as the feature extraction method and SVM as the classification algorithm. The authors tried to overcome the problem of an imbalanced dataset by taking the same number of patients from each class (PD, HC and PSP).

Nowadays many classifiers are available for PD detection, and their performance is measured with metrics such as accuracy, sensitivity and specificity [15,17,33,34]. In general, accuracy is a measure of how many cases are correctly identified in total, irrespective of positive or negative cases, i.e. it measures the overall performance of the method. Table 2 describes some of the studies available in the literature for PD diagnosis and classification using machine learning approaches.
Table 2: Literature survey for diagnosis of Parkinson's disease using machine learning approaches

Study | Dataset | Method | Results
Song Pan et al. [15] | Local field potential signals | Radial Basis Function + Support Vector Machine + Multilayer Perceptron | Accuracy: SVM 81.14%, RBF 80.13%, MLP 79.25%
Sang-Hong Lee and Joon S. Lim [17] | Gait characteristics | Wavelet-based feature extraction + neural network with weighted fuzzy membership functions | Accuracy: 77.33%
G. Sateesh Babu and S. Suresh [18] | Gene expressions | ICA + meta-cognitive neural classifier | Accuracy: 95.55%
R. Armananzas et al. [35] | Movement disorder | Wrapper feature selection + 5 classifiers: Naïve Bayes (NB), k-nearest neighbors (KNN), LDA, C4.5 decision trees, ANN | Accuracy: NB 82.08%, KNN 80.06%, LDA 83.24%, C4.5 81.50%, ANN 64.74%
G. S. Babu et al. [33] | Brain MRI images | Voxel-based morphometry + PBL-McRBFN + RFE | Accuracy: 87.21%
F. J. Martinez-Murcia et al. [36] | DaTSCAN images | Independent Component Analysis (ICA) + Support Vector Machine (SVM) | Accuracy: 91.3% (PPMI dataset); 94.7% ("Virgen de la Victoria" Hospital in Málaga (VV), Spain)
G. Singh and L. Samavedham [37] | T1-weighted MRI images | Kohonen Self-Organizing Map + Least Squares Support Vector Machine | Accuracy: 99.9% (for classifying PD, HC and SWEDD subjects)
A. Benba et al. [38] | Voice assessment | Principal Component Analysis + Support Vector Machine | Accuracy: 87.50% (on 3 vowel samples /a/, /o/, /u/)
L. Naranjo et al. [21] | Acoustic features extracted from replicated voice recordings | Gibbs sampling algorithm + Bayesian approach | Accuracy: 86.2%; Sensitivity: 82.5%; Specificity: 90.0%
2.1 Feature Subset Selection (FSS) Techniques

The diagnosis of neurodegenerative diseases through machine learning approaches includes the following steps:

(1) Data acquisition (brain MRI images, gait movements, vocal data, local field potentials, etc.).
(2) Feature extraction (extracting the features suitable for training and testing a classifier).
(3) Feature subset selection (to remove redundant features).
(4) Training and validating the performance of the classifier.

Figure 2 shows the steps involved in medical image processing (MIP) using machine learning techniques.

Figure 2: Steps involved in medical image processing (MIP) using machine learning techniques

In the literature, a variety of machine learning algorithms exist for medical imaging classification, such as induction-based algorithms (ID3, CART) and instance-based algorithms (IBL). However, these algorithms suffer in prediction accuracy because of the presence of many features that are not necessary for predicting the output. Thus there is a need for FSS methods, which optimize the number of features by selecting the relevant subset and thereby improve the classification accuracy. A typical FSS procedure consists of four basic steps: subset generation, subset evaluation, a stopping criterion and result validation [39].

Subset generation is a search procedure that produces feature subsets for evaluation based on a predefined criterion [40]. An evaluation function is used to evaluate the subset under examination, the stopping criterion is used to decide when to stop, and a validation procedure is used to check whether the subset is valid. Based on different evaluation criteria, FSS algorithms fall into three categories: (1) the filter model, (2) the wrapper model [39] and (3) the hybrid model. Within each category, algorithms can be further differentiated by how the space of feature subsets is explored and by the exact nature of their evaluation function.

The filter model relies on general characteristics of the data to evaluate and select feature subsets without involving any learning algorithm. However, the filter method sometimes fails to select the right subset of features if the applied criterion deviates from the one used for training. The filter approach may also fail to find a feature subset that would jointly maximize the criterion, thus degrading the performance of the learning model [41,42]. On the other hand, the wrapper method requires a learning algorithm and uses its performance as the evaluation criterion. Wrappers can show even better results than other approaches by directly considering prediction accuracy, but wrapper models are less general and more computationally expensive than filter models because they need more computational resources and are tied to a specific learning algorithm [43,44]. Since filters execute many times faster than wrappers, the filter approach has a much better chance of scaling to databases with a large number of features. Also, filters do not require re-execution for different learning algorithms, and can thus provide the same benefits for learning as wrappers do.

The hybrid model combines the advantages of the filter and wrapper models by utilizing different evaluation criteria in different search stages [45]. Although hybrid methods are more efficient than wrapper and filter approaches, they are much more complex and limited to a specific learning machine [46,47]. Much work has been done in this field as well [48]; different researchers have discussed the advantages and disadvantages of the filter and wrapper approaches. Table 3 lists the FSS/dimensionality reduction methods currently available in the literature for reducing dimensionality or removing irrelevant/redundant features in the case of PD detection and classification using machine learning methods.
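To make the filter/wrapper distinction above concrete, the minimal sketch below ranks features with a data-only criterion (univariate F-scores, a filter) and with a learner-driven search (recursive feature elimination around a linear SVM, a wrapper). It is an illustration using scikit-learn on synthetic placeholder data, not the feature-selection procedure of any study surveyed here.

    # Minimal sketch: filter- vs wrapper-style feature subset selection.
    # X and y below are synthetic stand-ins for an extracted feature matrix and labels.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.feature_selection import SelectKBest, f_classif, RFE
    from sklearn.svm import SVC

    X, y = make_classification(n_samples=200, n_features=22, n_informative=8,
                               random_state=0)

    # Filter model: rank features by a criterion computed from the data alone.
    filt = SelectKBest(score_func=f_classif, k=10).fit(X, y)
    X_filter = filt.transform(X)

    # Wrapper model: let a learning algorithm's performance drive the search.
    wrapper = RFE(estimator=SVC(kernel="linear"), n_features_to_select=10).fit(X, y)
    X_wrapper = wrapper.transform(X)

    print("filter-selected features: ", np.where(filt.get_support())[0])
    print("wrapper-selected features:", np.where(wrapper.support_)[0])

The filter ranking is computed once and can be reused with any classifier, whereas the wrapper search must be re-run whenever the learning algorithm changes, which mirrors the cost trade-off described above.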
2.2 Classification

After feature extraction and subset selection, the next phase is classification. It is an instance of supervised learning and can be defined as the problem of identifying the category to which a new observation belongs. The methods used for classification can be categorized as (a) statistical algorithms, (b) pattern recognition and learning-based algorithms, and (c) search heuristics and combinations of algorithms.

In statistical approaches, the mean and standard deviation of the features in the template are computed. Distance measures such as the Euclidean distance, weighted Euclidean distance and Manhattan distance are used to compare the training data with the testing data.

Pattern recognition is defined as the act of taking raw data and classifying it into different categories based on machine learning algorithms such as the K-NN rule, the Bayes classifier, SVM, artificial neural networks (ANN) [13] and clustering techniques like K-means [13,16,19,35,50].

Various evolutionary algorithms, such as ant colony optimization and particle swarm optimization, can also be used for classification [18,49,51,52]. The advantage of using these evolutionary algorithms is that they can handle large databases. Figure 3 depicts the methods applied for PD classification in this study.

Figure 3: Methods applied for PD classification

3. MATERIALS AND METHODS

This section describes the methods and materials used in this study for classifying PD patients from healthy subjects.

3.1 Dataset

In this paper, a dataset of PD patients relating to general voice disorders has been used. M. A. Little [53] of the University of Oxford, in collaboration with the National Centre for Voice and Speech, Denver, Colorado, recorded the speech signals and created the database. In addition to classifying the PD patients using the voice dataset, we also evaluated the classifiers' performance on two other benchmark datasets available from the UCI repository. Table 4 summarizes the benchmark datasets used in this study.

3.2 Applied Methods

3.2.1 Artificial Neural Network (ANN)
ANN represents a parallel architecture that is motivated by the way biological neural processing takes place. Although many types of ANN architectures exist, the MLP (multi-layer feed-forward neural network) is the most commonly used architecture (Figure 4). The backpropagation algorithm proposed by Rumelhart in 1986 is a generalized delta rule that is utilized by the MLP network for the adjustment of weights [13,16,54]. Levenberg–Marquardt, scaled conjugate gradient descent and resilient backpropagation are some of the variants of the backpropagation algorithm. According to M. T. Hagan and M. Menhaj [55], the Levenberg–Marquardt algorithm is efficient and strongly recommended for training small- and medium-sized neural networks; therefore, the same algorithm has been implemented here.
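As a rough, self-contained illustration of training such an MLP classifier, the sketch below fits a small feed-forward network on a held-out split. Note that scikit-learn does not provide Levenberg–Marquardt training (the variant used in this paper); the L-BFGS solver is used here purely as a stand-in, and the synthetic data, hidden-layer size and split are placeholders rather than the paper's configuration.

    # Hedged sketch of an MLP classifier; L-BFGS stands in for Levenberg-Marquardt,
    # which scikit-learn does not implement. Data are synthetic placeholders.
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    X, y = make_classification(n_samples=195, n_features=22, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                              stratify=y, random_state=0)

    mlp = make_pipeline(StandardScaler(),
                        MLPClassifier(hidden_layer_sizes=(16,), solver="lbfgs",
                                      max_iter=1000, random_state=0))
    mlp.fit(X_tr, y_tr)
    print("hold-out accuracy:", mlp.score(X_te, y_te))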
3.2.2 Support Vector Machine (SVM)
SVM is considered to be a supervised classification approach; Vapnik [56] first proposed SVM for binary classification. Binary classification is based on the concept of dividing the data into classes using a hyperplane. For linear classification problems, SVM can be considered an extension of the perceptron. From Figure 5, it is clear that the distance between the two hyperplanes is 2/||w||, so the optimization problem is to minimize ||w||, i.e. to maximize the margin.
Performance of the ANN, KNN and SVM classifier variants on the voice dataset (all values in %):

Performance parameter | ANN: Levenberg–Marquardt algorithm | ANN: Scaled conjugate gradient | KNN: Euclidean distance | KNN: Cityblock distance | SVM: RBF kernel | SVM: Polynomial kernel | SVM: Linear kernel
Classification accuracy | 95.89 | 85.12 | 72.31 | 69.74 | 88.21 | 81.03 | 82.9
Sensitivity | 93.75 | 70 | 68.75 | 66.67 | 91.67 | 79.17 | 87.33
Specificity | 96.59 | 96.59 | 73.47 | 70.75 | 77.55 | 87.76 | 78.56
Geometric mean | 95.16 | 82.23 | 71.07 | 68.68 | 84.31 | 83.35 | 82.83
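The sensitivity, specificity and geometric mean reported in the table above follow directly from the confusion matrix, and the two KNN columns differ only in the distance metric. The sketch below shows how these quantities can be computed for a K-nearest-neighbor classifier; the data, the value of k and the split are placeholders, not the paper's setup, and Manhattan distance is used as the city-block metric.

    # Computing the performance measures used in the tables (accuracy, sensitivity,
    # specificity, geometric mean) for KNN with Euclidean and city-block distances.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.metrics import confusion_matrix
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier

    X, y = make_classification(n_samples=195, n_features=22, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                              stratify=y, random_state=0)

    for metric in ("euclidean", "manhattan"):   # manhattan == city-block distance
        knn = KNeighborsClassifier(n_neighbors=5, metric=metric).fit(X_tr, y_tr)
        tn, fp, fn, tp = confusion_matrix(y_te, knn.predict(X_te)).ravel()
        accuracy = (tp + tn) / (tp + tn + fp + fn)
        sensitivity = tp / (tp + fn)            # true positive rate
        specificity = tn / (tn + fp)            # true negative rate
        g_mean = np.sqrt(sensitivity * specificity)
        print(metric, round(accuracy, 3), round(sensitivity, 3),
              round(specificity, 3), round(g_mean, 3))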
Table 6: Classifier performance comparison with studies available in the literature on the vocal dataset

Study | Method | Accuracy (%)
R. Das [13] | ANN | 92.9
F. Astrom and R. Koker [14] | 9 parallel neural networks | 91.2
A. Khemphila and V. Boonjing [54] | Information gain + ANN | 83.33
H.-L. Chen et al. [23] | PCA + FKNN | 96.07
A. Benba et al. [38] | PCA + SVM | 87.21

The other two biggest challenges that still exist in MIP are the dataset and the computational power. For example, G. S. Babu et al. [33] developed a meta-cognitive algorithm for the identification of the brain regions responsible for PD using an RFE approach; 87.21% accuracy was achieved, but the computational cost was high. From a machine learning perspective, the dataset must be clean and of significant size to solve the problem. However, the availability of clean datasets is limited owing to their inherent complexity.

Dataset collection has some inherent challenges, such as the "class imbalance problem" [23] and the presence of noise and outliers in the data. The class imbalance problem means that the total number of samples from one class of data (+ve) is not equal to the total number of samples from the other class (−ve). This problem exists not only in medical diagnosis but in approximately all fields where machine learning is used, such as face recognition and biometrics. It may be overcome by using a balanced dataset, so that the decision model can learn without bias. The presence of noise and outliers during data collection can lead to poor diagnosis. Thus, preprocessing of medical data is a necessary step and must be handled automatically. After removal of noise and outliers, medical images can be processed and analyzed to extract meaningful information, such as the volume, shape and motion of organs, which is helpful in the diagnosis of the disease and abnormalities.
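When a fully balanced dataset cannot be collected, one common generic remedy for the class imbalance problem described above is to re-weight the classes during training. The sketch below illustrates this with scikit-learn's class_weight option and a stratified split on deliberately imbalanced synthetic data; it is only an illustration, not a procedure taken from the surveyed studies.

    # Hedged sketch: counteracting class imbalance via class re-weighting and a
    # stratified train/test split (a generic remedy, not the authors' procedure).
    from collections import Counter
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC

    # Imbalanced placeholder data: roughly three samples of one class per sample
    # of the other.
    X, y = make_classification(n_samples=195, n_features=22,
                               weights=[0.25, 0.75], random_state=0)
    print("class counts:", Counter(y))

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                              stratify=y, random_state=0)

    # 'balanced' rescales each class inversely to its frequency, so the minority
    # class is not ignored during training.
    clf = SVC(kernel="rbf", class_weight="balanced").fit(X_tr, y_tr)
    print("hold-out accuracy:", clf.score(X_te, y_te))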
Table 7: Performance comparison of ANN, KNN and SVM on the Wisconsin Breast Cancer dataset and the Pima Indians Diabetes dataset (all values in %)

Dataset | Performance parameter | ANN: Levenberg–Marquardt algorithm | ANN: Scaled conjugate gradient | KNN: Euclidean distance | KNN: Cityblock distance | SVM: RBF kernel | SVM: Polynomial kernel | SVM: Linear kernel
Wisconsin Breast Cancer Database | Classification accuracy | 98 | 97 | 73.33 | 72.31 | 96.71 | 90.1 | 95.02
Wisconsin Breast Cancer Database | Sensitivity | 97.8 | 97.16 | 68.75 | 66.67 | 96.29 | 92.16 | 96.72
Wisconsin Breast Cancer Database | Specificity | 95.85 | 98.3 | 74.83 | 74.15 | 97.51 | 88.8 | 94.51
Wisconsin Breast Cancer Database | Geometric mean | 96.82 | 97.73 | 71.73 | 70.31 | 96.90 | 90.46 | 95.61
Pima Indians Diabetes Dataset | Classification accuracy | 81.11 | 78.51 | 72.82 | 72.31 | 75.01 | 73.16 | 74.61
Pima Indians Diabetes Dataset | Sensitivity | 90 | 80.62 | 68.75 | 68.75 | 73.4 | 77.4 | 78.3
Pima Indians Diabetes Dataset | Specificity | 68.33 | 73.3 | 74.15 | 73.47 | 72.76 | 69.4 | 71.04
Pima Indians Diabetes Dataset | Geometric mean | 78.42 | 76.87 | 71.40 | 71.07 | 73.08 | 73.29 | 74.58
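For readers who want to reproduce a comparison of this general kind, one representative of each classifier family can be cross-validated on a public benchmark, as sketched below using scikit-learn's built-in copy of the Wisconsin (Diagnostic) Breast Cancer data. The solver, hyper-parameters and dataset variant are assumptions and need not match those behind Table 7.

    # Illustrative cross-validated comparison of the three classifier families on a
    # public benchmark; settings are scikit-learn defaults, not the paper's setup.
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import cross_val_score
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.neural_network import MLPClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    X, y = load_breast_cancer(return_X_y=True)

    models = {
        "ANN (MLP, L-BFGS)": MLPClassifier(hidden_layer_sizes=(16,), solver="lbfgs",
                                           max_iter=2000, random_state=0),
        "KNN (Euclidean)":   KNeighborsClassifier(n_neighbors=5),
        "SVM (RBF kernel)":  SVC(kernel="rbf"),
    }
    for name, model in models.items():
        acc = cross_val_score(make_pipeline(StandardScaler(), model),
                              X, y, cv=10).mean()
        print(f"{name:18s} mean 10-fold accuracy = {acc:.3f}")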
6. CONCLUSION

Research highlights that 90% of people with PD exhibit vocal impairment. Vocal impairment, or disorders of the voice, means that the voice sounds hoarse, strained or effortful. Several studies have been carried out to automate PD diagnosis using voice datasets. In this paper, the performance of ANN, KNN and SVM classifiers has been evaluated using sensitivity, specificity, total classification accuracy and geometric mean on the voice database. A similar discussion is also carried out for the Wisconsin Breast Cancer database and the Pima Indians Diabetes dataset. It is observed that the Artificial Neural Network with the Levenberg–Marquardt algorithm gives the highest classification accuracy of 95.89% for the voice dataset. We believe that the use of the machine learning techniques discussed here will be a great support to doctors. Although a large number of techniques are available for PD diagnosis, their performance is still imperfect. Hence, to improve the accuracy of CAD algorithms, there is a need for further enhancements. In future, we will attempt to use other evolutionary algorithms, such as the Genetic Algorithm, and the Extreme Learning Machine for PD detection and classification.
REFERENCES

1. J. Parkinson, An Essay on the Shaking Palsy. London: Whittingham and Rowland Printing, 1817.
2. D. B. Calne, "Is idiopathic parkinsonism the consequence of an event or a process," Neurology, Vol. 44, no. 15, pp. 5–5, 1994.
3. A. E. Lang and A. M. Lozano, "Parkinson's disease: First of two parts," New England J. Med., Vol. 339, pp. 1044–53, 1998.
4. A. Samii, J. G. Nutt, and B. R. Ransom, "Parkinson's disease," Lancet, Vol. 363, no. 9423, pp. 1783–93, 2004.
5. L. M. de Lau and M. M. Breteler, "Epidemiology of Parkinson's disease," Lancet Neurol., Vol. 5, pp. 525–35, 2006.
6. E. M. Morris, "Movement disorders in people with Parkinson disease: A model for physical therapy," Phys. Ther., Vol. 80, pp. 578–97, 2000.
7. A. Schrag, C. D. Good, K. Miszkiel, H. R. Morris, C. J. Mathias, A. J. Lees, and N. P. Quinn, "Differentiation of atypical parkinsonian syndromes with routine MRI," Neurology, Vol. 54, pp. 697–702, 2000.
8. R. Angel, W. Alston, and J. R. Higgins, "Control of movement in Parkinson's disease," Brain, Vol. 93, no. 1, pp. 1–14, 1970.
9. S. L. Wu, R. M. Liscic, S. Kim, S. Sorbi, and Y. H. Yang, "Nonmotor symptoms of Parkinson's disease," Parkinson's Dis., 2017. DOI:10.1155/2017/4382518.
10. T. Yousaf, H. Wilson, and M. Politis, "Imaging the nonmotor symptoms in Parkinson's disease," Int. Rev. Neurobiol., Vol. 133, pp. 179–257, 2017.
11. C. G. Goetz, et al., "Movement Disorder Society-sponsored revision of the Unified Parkinson's Disease Rating Scale (MDS-UPDRS): Scale presentation and clinimetric testing results," Mov. Disord., Vol. 23, no. 15, pp. 2129–70, 2008.
12. B. Post, M. P. Merkus, R. M. de Bie, R. J. de Haan, and J. D. Speelman, "Unified Parkinson's disease rating scale motor examination: Are ratings of nurses, residents in neurology, and movement disorders specialists interchangeable?," Mov. Disord., Vol. 20, pp. 1577–84, 2005.
13. R. Das, "A comparison of multiple classification methods for diagnosis of Parkinson disease," Expert Syst. Appl., Vol. 37, pp. 1568–72, 2010.
14. F. Astrom and R. Koker, "A parallel neural network approach to prediction of Parkinson's disease," Expert Syst. Appl., Vol. 38, pp. 12470–4, 2011.
15. S. Pan, S. Iplikci, K. Warwick, and T. Z. Aziz, "Parkinson's disease tremor classification – a comparison between support vector machines and neural networks," Expert Syst. Appl., Vol. 39, pp. 10764–71, 2012.
16. O. Eskidere, F. Ertas, and C. Hanilci, "A comparison of regression methods for remote tracking of Parkinson's disease progression," Expert Syst. Appl., Vol. 39, pp. 5523–8, 2012.
17. S.-H. Lee and J. S. Lim, "Parkinson's disease classification using gait characteristics and wavelet-based feature extraction," Expert Syst. Appl., Vol. 39, pp. 7338–44, 2012.
18. G. Sateesh Babu and S. Suresh, "Parkinson's disease prediction using gene expression – a projection based learning meta-cognitive neural classifier approach," Expert Syst. Appl., Vol. 40, pp. 1519–29, 2013.
19. M. Hariharan, K. Polat, and R. Sindhu, "A new hybrid intelligent system for accurate detection of Parkinson's disease," Comp. Methods Prog. Biomed., Vol. 113, pp. 904–13, 2014.
20. W. Zeng and C. Wang, "Classification of neurodegenerative diseases using gait dynamics via deterministic learning," Inform. Sci., Vol. 317, pp. 246–58, 2015.
21. L. Naranjo, C. J. Perez, J. Martín, and Y. Campos-Roca, "A two-stage variable selection and classification approach for Parkinson's disease detection by using voice recording replications," Comp. Methods Prog. Biomed., Vol. 142, pp. 147–56, 2017.
22. M. A. Little, P. E. McSharry, E. J. Hunter, J. Spielman, and L. O. Ramig, "Suitability of dysphonia measurements for telemonitoring of Parkinson's disease," IEEE Trans. Biomed. Eng., Vol. 56, pp. 1015–22, 2009.
23. H.-L. Chen, C.-C. Huang, X.-G. Yu, X. Xu, X. Sun, G. Wang, and S.-J. Wang, "An efficient diagnosis system for detection of Parkinson's disease using fuzzy k-nearest neighbor approach," Expert Syst. Appl., Vol. 40, pp. 263–71, 2013.
24. Y.-Y. Chen, et al., "A vision-based regression model to evaluate parkinsonian gait from monocular image sequences," Expert Syst. Appl., Vol. 39, pp. 520–6, 2012.
25. P. Piccini and A. Whone, "Functional brain imaging in the differential diagnosis of Parkinson's disease," Lancet Neurol., Vol. 3, pp. 284–90, 2004.
26. A. G. Filler, "The history, development and impact of computed imaging in neurological diagnosis and neurosurgery: CT, MRI, and DTI," Nature, Vol. 7, pp. 1–69, 2009.
27. B. S. Mahanand, S. Suresh, N. Sundararajan, and M. Aswatha Kumar, "Identification of brain regions responsible for Alzheimer's disease using a self-adaptive resource allocation network," Neural Netw., Vol. 32, pp. 313–22, 2012.
28. A. Schrag, M. Jahanshahi, and N. Quinn, "How does Parkinson's disease affect quality of life? A comparison with quality of life in the general population," Mov. Disord., Vol. 15, pp. 1112–8, 2000.
29. B. Ravina, et al., "The role of radiotracer imaging in Parkinson disease," Neurology, Vol. 64, pp. 208–15, 2005.
30. T. Wu, L. Wang, Y. Chen, C. Zhao, K. Li, and P. Chan, "Changes of functional connectivity of the motor network in the resting state in Parkinson's disease," Neurosci. Lett., Vol. 460, pp. 6–10, 2009.
31. D. Wu, K. Warwick, Z. Ma, J. G. Burgess, S. Pan, and T. Z. Aziz, "Prediction of Parkinson's disease tremor onset using radial basis function neural networks," Expert Syst. Appl., Vol. 37, pp. 2923–8, 2010.
32. C. Salvatore, et al., "Machine learning on brain MRI data for differential diagnosis of Parkinson's disease and progressive supranuclear palsy," J. Neurosci. Methods, Vol. 222, pp. 230–7, 2014.
33. G. Sateesh Babu, S. Suresh, and B. S. Mahanand, "A novel PBL-McRBFN-RFE approach for identification of critical brain regions responsible for Parkinson's disease," Expert Syst. Appl., Vol. 41, pp. 478–88, 2014.
34. B. Rana, A. Juneja, M. Saxena, S. Gudwani, S. Senthil Kumaran, R. K. Agrawal, and M. Behari, "Regions-of-interest based automated diagnosis of Parkinson's disease using T1-weighted MRI," Expert Syst. Appl., Vol. 42, pp. 4506–16, 2015.
35. R. Armananzas, C. Bielza, K. R. Chaudhuri, P. Martinez-Martin, and P. Larrañaga, "Unveiling relevant non-motor Parkinson's disease severity symptoms using a machine learning approach," Artif. Intell. Med., Vol. 58, pp. 195–202, 2013.
36. F. J. Martinez-Murcia, J. M. Górriz, J. Ramírez, I. A. Illán, A. Ortiz, and the PPMI, "Automatic detection of parkinsonism using significance measures and component analysis in DaTSCAN imaging," Neurocomputing, Vol. 126, pp. 58–70, 2014.
37. G. Singh and L. Samavedham, "Unsupervised learning based feature extraction for differential diagnosis of neurodegenerative diseases: A case study on early-stage diagnosis of Parkinson disease," J. Neurosci. Methods, Vol. 256, pp. 30–40, 2015.
38. A. Benba, A. Jilbab, and A. Hammouch, "Voice assessments for detecting patients with Parkinson's diseases using PCA and NPCA," Int. J. Speech Technol., Vol. 19, no. 4, pp. 743–54, 2016.
39. M. Dash and H. Liu, "Feature selection for classification," Intell. Data Anal., Vol. 1, pp. 131–56, 1997.
40. H. Liu and H. Motoda, Feature Selection for Knowledge Discovery and Data Mining. Boston, MA: Kluwer Academic, 1998.
41. M. E. ElAlami, "A filter model for feature subset selection based on genetic algorithm," Knowledge Based Syst., Vol. 22, pp. 356–62, 2009.
42. H. Yoon, C.-S. Park, J. S. Kim, and J.-G. Baek, "Algorithm learning based neural network integrating feature selection and classification," Expert Syst. Appl., Vol. 40, pp. 231–41, 2013.
43. R. Kohavi and G. H. John, "Wrappers for feature subset selection," Artif. Intell., Vol. 97, pp. 273–324, 1997.
44. M. Hall, "Correlation based feature selection for machine learning," Ph.D. dissertation, Dept. of Computer Science, University of Waikato, 1999.
45. A. H. Hadjahmadi and T. J. Askari, "A decision support system for Parkinson's disease diagnosis using classification and regression tree," J. Math. Comp. Sci., Vol. 4, pp. 257–63, 2012.
46. O. Uncu and I. B. Turksen, "A novel feature selection approach: Combining feature wrappers and filters," Inf. Sci., Vol. 177, pp. 449–66, 2007.
47. J. Huang, Y. Cai, and X. Xu, "A hybrid genetic algorithm for feature selection wrapper based on mutual information," Pattern Recog. Lett., Vol. 28, pp. 1825–44, 2007.
48. D. Guan, W. Yuan, Y.-K. Lee, K. Najeebullah, and M. K. Rasel, "A review of ensemble learning based feature selection," IETE Tech. Rev., Vol. 31, pp. 190–8, 2014.
49. P. Shrivastava, A. Shukla, P. Vepakomma, N. Bhansali, and K. Verma, "A survey of nature-inspired algorithms for feature selection to identify Parkinson's disease," Comp. Methods Prog. Biomed., Vol. 139, pp. 171–9, 2017.
50. M. Poletti, M. Emre, and U. Bonuccelli, "Mild cognitive impairment and cognitive reserve in Parkinson's disease," Parkinsonism Relat. Disord., Vol. 17, pp. 579–86, 2011.
51. K. Chandrasekaran, S. P. Simon, and N. P. Padhy, "Cuckoo search algorithm for emission reliable economic multi-objective dispatch problem," IETE J. Res., Vol. 60, pp. 128–38, 2014.
52. V. Mangat and R. Vig, "Dynamic PSO-based associative classifier for medical datasets," IETE Tech. Rev., Vol. 31, pp. 258–65, 2014.
53. M. A. Little, P. E. McSharry, S. J. Roberts, D. A. Costello, and I. M. Moroz, "Exploiting nonlinear recurrence and fractal scaling properties for voice disorder detection," BioMed. Eng. Online, Vol. 6, p. 23, 2007.
54. A. Khemphila and V. Boonjing, "Parkinson's disease classification using neural network and feature selection," World Acad. Sci. Tech., Vol. 64, pp. 1–15, 2012.
55. M. T. Hagan and M. Menhaj, "Training feedforward networks with the Marquardt algorithm," IEEE Trans. Neural Netw., Vol. 5, pp. 989–93, 1994.
56. V. Vapnik, The Nature of Statistical Learning Theory. New York: Springer-Verlag, 1995.
57. P. Hall, U. B. Park, and R. J. Samworth, "Choice of neighbor order in nearest-neighbor classification," Annals Stat., Vol. 36, pp. 2135–52, 2008.
58. I. Scholl, T. Aach, M. T. Deserno, and T. Kuhlen, "Challenges of medical image processing," Comput. Sci. Res. Develop., Vol. 26, pp. 5–13, 2011.