Volume 488 - International Symposium on Grids & Clouds (ISGC2025) - Network, Security, Infrastructure & Operations
A Novel Fine-Grained Source Code Vulnerability Detection Model via Joint Token and Statement Representation Learning
S. Sun*, J. Wang, T. Yan and F. Qi
*: corresponding author
Published on: October 20, 2025
Abstract
With the increasing amount of code and growing complexity of software systems, defects in source code can lead to significant security risks—for example, malicious intrusions, data breaches, compromised availability, and erroneous scientific computation results—making their detection crucial. Currently, mainstream code defect detection methods are divided into two categories: graph neural network (GNN)-based detection methods and sequence-based detection methods. Both categories have achieved tremendous success in this field; however, each also suffers from certain shortcomings. Graph-based detection methods typically face issues such as huge memory overhead for graph construction, over-smoothing, and incomplete utilization of heterogeneous edge information. Sequence model-based detection methods generally treat code as a regular text sequence, learning only token-level features while ignoring the structural information of the code, which results in suboptimal detection performance. Moreover, these methods rarely support line-level vulnerability detection. To address these issues, this paper proposes a novel sequence model-based detection method that simultaneously learns both token-level and statement-level feature representations and supports line-level detection, thereby significantly enhancing detection capabilities. The proposed method achieves an F1 score of 92.71% for function-level detection and a top-5 accuracy of 61% for line-level vulnerability detection on a public dataset.
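The two figures reported in the abstract can be illustrated with a minimal sketch. The abstract does not define how top-5 accuracy is computed, so the code below assumes one common convention: a vulnerable function counts as a hit if at least one truly vulnerable line appears among its k highest-scored lines. The function names (`f1_score`, `top_k_hit`, `top_k_accuracy`) and the toy inputs are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Set, Tuple


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Function-level F1: harmonic mean of precision and recall for label 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def top_k_hit(line_scores: Sequence[float], vulnerable_lines: Set[int], k: int = 5) -> bool:
    """Assumed convention: hit if any vulnerable line index is in the k top-scored lines."""
    ranked = sorted(range(len(line_scores)), key=lambda i: line_scores[i], reverse=True)
    return any(i in vulnerable_lines for i in ranked[:k])


def top_k_accuracy(samples: Sequence[Tuple[Sequence[float], Set[int]]], k: int = 5) -> float:
    """Fraction of vulnerable functions with a top-k hit (samples must be non-empty)."""
    hits = sum(top_k_hit(scores, lines, k) for scores, lines in samples)
    return hits / len(samples)


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    print(f1_score([1, 1, 0, 0], [1, 0, 1, 0]))
    print(top_k_accuracy([([0.1, 0.9, 0.3], {1}), ([0.1, 0.9, 0.3], {0})], k=1))
```

Under this assumed convention, a top-5 accuracy of 61% would mean that for 61% of the evaluated vulnerable functions, at least one vulnerable line is ranked among the five most suspicious lines.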
DOI: https://doi.org/10.22323/1.488.0010
Open Access
Copyright owned by the author(s) under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.