PoS - Proceedings of Science
Volume 415 - International Symposium on Grids & Clouds 2022 (ISGC2022) - Network, Security, Infrastructure & Operations Session
Malicious Traffic Detection with Class Imbalanced Data Based on Coarse-grained Labels
Z. Li*, J. Liu, J. Wang, J. Liu, T. Yan, D. An, C. Zhou and Z. Wang
Published on: September 28, 2022
Abstract
To resist complex cyber-attacks, a Security Operations Center (SOC) named IHEPSOC has been developed and deployed at the Institute of High Energy Physics (IHEP) of the Chinese Academy of Sciences, where it has contributed to the reliability and security of the IHEP network. Integrating state-of-the-art cyber-attack detection methods into IHEPSOC to improve its threat detection capability has become a major task. Malicious traffic detection based on machine learning is an emerging security paradigm that can effectively detect both known and unknown cyber-attacks. However, existing studies usually adopt traditional supervised learning, which often encounters problems when applied to real-world production environments because of its implicit assumptions about operating conditions. For example, most studies rely on datasets that already have accurate labels, but labeling such datasets accurately requires significant manual effort. In addition, in real-world services the volume of benign traffic is larger than that of malicious traffic, and this imbalance between the benign and malicious classes also makes many machine learning detection models difficult to apply in a production environment. To address these issues, we propose a detection method for class-imbalanced malicious traffic based on coarse-grained data labels, which achieves performance comparable to that of other supervised learning methods. We conducted three experiments using the Android Malware 2017 dataset and verified the practicability and effectiveness of the proposed method.
DOI: https://doi.org/10.22323/1.415.0030
Open Access
Copyright owned by the author(s) under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.