An Ensemble Method For Phishing Detection
Asib Hasan
4/3/2021 A New Ensemble Model for Phishing Detection Based on Cumulative Feature Selection 3
OBJECTIVE
WORK FLOW DIAGRAM
[Figure: workflow diagram. Recoverable label: Output Classification.]
METHODOLOGY
PHASE - 01
[Figure: Phase 01 diagram. Recoverable labels: Dataset, Phase - 02.]
PHASE - 02
[Figure: Phase 02 diagram. Recoverable labels: Phase - 01, Majority Voting on Reduced Feature Set Classifier, Majority Voting on Full Feature Set Classifier, Result.]
MAJORITY VOTING
Each base classifier casts one vote per sample, and the class with the most votes becomes the ensemble's prediction.
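As an illustration of the voting step, here is a minimal sketch in plain Python. The slides do not say how ties are broken or which classifiers vote together, so the function names (`majority_vote`, `ensemble_predict`), the tie-breaking rule, and the sample predictions below are all assumptions made for the example.

```python
from collections import Counter


def majority_vote(votes):
    """Return the label chosen by the most classifiers for one sample.

    Assumption: ties go to the label that appears first in `votes`
    (Counter.most_common keeps first-seen order for equal counts).
    """
    return Counter(votes).most_common(1)[0][0]


def ensemble_predict(per_classifier_predictions):
    """Combine predictions sample by sample.

    `per_classifier_predictions` holds one list of labels per classifier,
    all of the same length.
    """
    return [majority_vote(votes) for votes in zip(*per_classifier_predictions)]


# Hypothetical outputs of three base classifiers on four samples.
rf = ["phishing", "legitimate", "phishing", "legitimate"]
svm = ["phishing", "phishing", "phishing", "legitimate"]
knn = ["legitimate", "legitimate", "phishing", "phishing"]

combined = ensemble_predict([rf, svm, knn])
print(combined)
```

With an odd number of voters and two classes, as in this example, ties cannot occur, which is one reason ensembles of this kind often use an odd number of base classifiers.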
ROC Curve
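For readers unfamiliar with how an ROC curve is produced, the sketch below sweeps a decision threshold over classifier scores and records the false positive rate and true positive rate at each step, then estimates the area under the curve with the trapezoidal rule. The labels and scores are made-up illustration data, not results from this work, and `roc_points` and `auc` are hypothetical helper names.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) points from sweeping a threshold over the scores.

    labels: 1 for phishing (positive), 0 for legitimate (negative).
    scores: higher values mean "more likely phishing".
    Assumes both classes are present in `labels`.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])

    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Samples that share a score cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Made-up example: three phishing and three legitimate samples.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(labels, scores)
print(points)
print(round(auc(points), 4))
```

An area of 1.0 means every phishing sample is scored above every legitimate one, while 0.5 corresponds to random ranking.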
Performance Comparison Between Top-n Feature Subset and Full Feature Subset
Model Name               Number of Features   Accuracy (%)
Random Forest            32                   98.36
Random Forest            48                   98.27
Support Vector Machine   44                   94.01
Support Vector Machine   48                   93.97
Naïve Bayes              41                   85.78
Naïve Bayes              48                   85.26
C4.5                     42                   97.53
C4.5                     48                   91.11
JRip                     48                   97.35
PART                     41                   97.59
PART                     48                   97.48
KNN                      31                   96.42
KNN                      35                   95.23
KNN                      48                   95.26
PDCFS                    48                   98.24
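The comparison above contrasts a top-n feature subset with the full 48-feature set. One way such a subset could be found is sketched below: rank the features, then grow the subset one feature at a time and keep the size that scores best. The slides do not state the ranking criterion or the evaluation procedure, so the use of absolute Pearson correlation, the toy evaluator, and every name and data value here are assumptions for illustration only.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for a constant input."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)


def rank_features(rows, labels):
    """Order feature indices by |correlation with the label|, strongest first."""
    n_features = len(rows[0])
    strength = [abs(pearson([r[j] for r in rows], labels)) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: -strength[j])


def best_top_n(rows, labels, evaluate):
    """Grow the subset cumulatively and keep the smallest best-scoring size."""
    ranking = rank_features(rows, labels)
    best_n, best_acc = 0, -1.0
    for n in range(1, len(ranking) + 1):
        acc = evaluate(ranking[:n])
        if acc > best_acc:  # strict '>' prefers the smaller subset on ties
            best_n, best_acc = n, acc
    return ranking[:best_n], best_acc


# Toy binary data (hypothetical): 6 samples, 3 features.
rows = [[1, 1, 0], [1, 0, 1], [1, 1, 0], [0, 0, 1], [0, 1, 0], [0, 0, 0]]
labels = [1, 1, 1, 0, 0, 0]


def toy_accuracy(subset):
    """Stand-in for a real classifier: predict 1 when at least half of the
    selected features are set, and score on the same toy data."""
    correct = 0
    for row, label in zip(rows, labels):
        predicted = 1 if 2 * sum(row[j] for j in subset) >= len(subset) else 0
        correct += predicted == label
    return correct / len(rows)


selected, accuracy = best_top_n(rows, labels, toy_accuracy)
print(selected, accuracy)
```

In practice `evaluate` would train and cross-validate a real classifier on the selected columns; the toy evaluator only keeps the example self-contained.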
Average Runtime for Classification Per Sample