A Semi-Supervised Learning Scheme to Detect Unknown DGA Domain Names Based on Graph Analysis
F Yan, J Liu, L Gu, Z Chen - … on Trust, Security and Privacy in …, 2020 - ieeexplore.ieee.org
2020 IEEE 19th International Conference on Trust, Security and …, 2020 • ieeexplore.ieee.org
A large number of malware families use domain generation algorithms (DGAs) to randomly generate large numbers of domain names. This is an effective way to bypass conventional domain name blacklists, because we cannot predict which of the randomly generated domain names will be selected for command and control (C&C) communications. An effective approach for detecting known DGA families is to reverse engineer the malware to recover the generation algorithms it uses. Because reverse engineering cannot handle variants of DGA families, some studies leverage supervised learning to find new variants. However, supervised learning offers low explainability and cannot find previously unseen DGA families. In this paper, we propose a graph-based semi-supervised learning scheme to track the evolution of known DGA families and find previously unseen DGA families. With a domain relation graph, we can clearly see how new variants relate to known DGA domain names, which yields better explainability. We deployed the proposed scheme in real network scenarios and show that it not only finds known DGA families comprehensively and precisely, but also finds new DGA families that have not been seen before.
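To make the idea of a domain relation graph more concrete, the following is a minimal sketch, not the authors' method. The abstract does not say how the graph is built or which semi-supervised algorithm is used, so this sketch assumes that two domains are related when the same host queries both, and that labels spread by a simple majority vote among already-labeled neighbors. All function names (`build_graph`, `propagate`, `unseen_clusters`) and the input format are hypothetical.

```python
from collections import Counter, defaultdict


def build_graph(queries):
    """Build an undirected domain graph from (host, domain) pairs.

    Assumption: two domains are linked if the same host queried both.
    """
    by_host = defaultdict(set)
    for host, domain in queries:
        by_host[host].add(domain)
    graph = defaultdict(set)
    for domains in by_host.values():
        for d in domains:
            graph[d] |= domains - {d}
    return dict(graph)


def propagate(graph, seeds, rounds=10):
    """Spread known family labels to unlabeled neighbors by majority vote.

    `seeds` maps a few known DGA domains to their family name.
    """
    labels = dict(seeds)
    for _ in range(rounds):
        updates = {}
        for node, neighbors in graph.items():
            if node in labels:
                continue
            votes = Counter(labels[n] for n in neighbors if n in labels)
            if votes:
                updates[node] = votes.most_common(1)[0][0]
        if not updates:
            break  # nothing changed, so the labeling has converged
        labels.update(updates)
    return labels


def unseen_clusters(graph, labels, min_size=2):
    """Return connected groups of domains that received no label.

    Such groups are candidates for previously unseen families.
    """
    visited, clusters = set(), []
    for start in graph:
        if start in visited or start in labels:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            # Only walk through neighbors that are still unlabeled.
            stack.extend(n for n in graph[node]
                         if n not in labels and n not in component)
        visited |= component
        if len(component) >= min_size:
            clusters.append(component)
    return clusters
```

In this sketch, a domain that inherits a label can be traced back through its edges to the seed domains that caused it, which is one way a graph can support the kind of explainability the abstract describes. Clusters returned by `unseen_clusters` are only candidates and would need further checks before being treated as a new family.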