Hengkai Ye

Ph.D. Candidate
College of Information Sciences and Technology
The Pennsylvania State University

E-mail: hengkai at psu dot edu

Short Bio: I am a second-year Ph.D. candidate at Penn State, advised by Prof. Hong Hu. Before joining Penn State, I obtained my Bachelor's degree from Huazhong University of Science and Technology and my Master's degree from Purdue University. My research interests include software and system security.

Publications

Conference and Journal Papers

  1. Too Subtle to Notice: Investigating Executable Stack Issues in Linux Systems (To Appear) Website Paper
    Hengkai Ye and Hong Hu
    In Proceedings of the 32nd Annual Network and Distributed System Security Symposium (NDSS 2025)
    Abstract: Code injection was a favored technique for attackers to exploit buffer overflow vulnerabilities decades ago. Subsequently, the widespread adoption of lightweight solutions like write-xor-execute (W⊕X) effectively mitigated most of these attacks by disallowing writable-and-executable memory. However, we observe multiple concerning cases where software developers accidentally disabled W⊕X and reintroduced executable stacks to popular applications. Although each individual violation has been properly fixed, a lingering question remains: what underlying factors contribute to these recurrent mistakes among developers, even in contemporary software development practices? In this paper, we conduct two investigations aimed at gaining a comprehensive understanding of the challenges associated with properly enforcing W⊕X in Linux systems. First, we delve into program-hardening tools to assess whether experienced security developers consistently catch the necessary steps to avoid executable stacks. Second, we analyze the enforcement of W⊕X on Linux by inspecting the source code of the compilation toolchain, the kernel, and the loader. Our investigation reveals that properly enforcing W⊕X on Linux requires close collaboration among multiple components. These tools form a complex chain of trust and dependency to safeguard the program stack. However, developers, including security researchers, may overlook the subtle yet essential .note.GNU-stack section when writing assembly code for various purposes, and inadvertently introduce executable stacks. For example, 11 program-hardening tools implemented as inlined reference monitors (IRM) introduce executable stacks to all hardened applications. Based on these findings, we discuss potential exploitation scenarios by attackers and provide suggestions to mitigate this issue.
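    The missing-note problem described above ultimately surfaces in the linked binary as the permissions of its PT_GNU_STACK program header. As a rough standalone illustration (not part of the paper's tooling), the sketch below parses that header with Python's standard struct module and reports whether a 64-bit little-endian ELF binary requests an executable stack; treating a missing header as executable follows the historical Linux default and may differ on newer kernels or other architectures.

        import struct
        import sys

        PT_GNU_STACK = 0x6474E551   # program header that carries the stack permissions
        PF_X = 0x1                  # "executable" bit in p_flags

        def stack_is_executable(path):
            """Return True if the ELF binary at `path` asks for an executable stack.

            Minimal parser for 64-bit little-endian ELF files only. A missing
            PT_GNU_STACK header is treated as executable, following the historical
            Linux default for such binaries.
            """
            with open(path, "rb") as f:
                ident = f.read(16)
                if ident[:4] != b"\x7fELF" or ident[4] != 2:    # magic + ELFCLASS64
                    raise ValueError(f"{path}: not a 64-bit ELF file")
                # e_type, e_machine, e_version, e_entry, e_phoff, e_shoff,
                # e_flags, e_ehsize, e_phentsize, e_phnum
                hdr = struct.unpack("<HHIQQQIHHH", f.read(42))
                e_phoff, e_phentsize, e_phnum = hdr[4], hdr[8], hdr[9]
                for i in range(e_phnum):
                    f.seek(e_phoff + i * e_phentsize)
                    p_type, p_flags = struct.unpack("<II", f.read(8))
                    if p_type == PT_GNU_STACK:
                        return bool(p_flags & PF_X)
            return True   # no PT_GNU_STACK header at all

        if __name__ == "__main__":
            for path in sys.argv[1:]:
                verdict = "executable" if stack_is_executable(path) else "non-executable"
                print(f"{path}: {verdict} stack")

    Running the script on one or more binaries prints a verdict per file; `readelf -lW` on the same binary shows the underlying GNU_STACK entry that the sketch inspects.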

  2. VIPER: Spotting Syscall-Guard Variables for Data-Only Attacks Website Slides Paper
    Hengkai Ye, Song Liu, Zhechang Zhang and Hong Hu
    In Proceedings of the 32nd USENIX Security Symposium (Security 2023)
    Abstract: As control-flow protection techniques are widely deployed, it is difficult for attackers to modify control data, like function pointers, to hijack program control flow. Instead, data-only attacks corrupt security-critical non-control data (critical data), and can bypass all control-flow protections to revive severe attacks. Previous works have explored various methods to help construct or prevent data-only attacks. However, no solution can automatically detect program-specific critical data. In this paper, we identify an important category of critical data, syscall-guard variables, and propose a set of solutions to automatically detect such variables in a scalable manner. Syscall-guard variables determine whether to invoke security-related system calls (syscalls), and altering them will allow attackers to request extra privileges from the operating system. We propose branch force, which intentionally flips every conditional branch during the execution and checks whether new security-related syscalls are invoked. If so, we conduct dataflow analysis to estimate the feasibility of flipping such branches through common memory errors. We build a tool, VIPER, to implement our ideas. VIPER successfully detects 34 previously unknown syscall-guard variables from 13 programs. We build four new data-only attacks on sqlite and v8, which execute arbitrary commands or delete arbitrary files. VIPER completes its analysis within five minutes for most programs, showing its practicality for spotting syscall-guard variables.
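    To make the branch-force idea concrete, here is a toy, self-contained Python model (not VIPER's actual instrumentation): a stand-in target program is run once normally, then re-run with one conditional at a time forcibly inverted, and a branch is flagged as a syscall-guard candidate if the flip makes a new security-related "syscall" observable. The branch sites, the recorded "syscalls", and toy_target are all invented for illustration.

        # Toy model of "branch force": flip one conditional per run and watch
        # whether a security-sensitive "syscall" becomes reachable.
        flipped_site = None        # branch site to force-flip in the current run
        observed_syscalls = set()  # security-related "syscalls" seen in the current run

        def branch(site, condition):
            """Evaluate a conditional, inverting it if this site is being forced."""
            return (not condition) if site == flipped_site else condition

        def record_syscall(name):
            observed_syscalls.add(name)

        def toy_target(is_admin=False, debug=False):
            """A stand-in program: the `is_admin` check guards a privileged action."""
            if branch("check_admin", is_admin):     # syscall-guard branch
                record_syscall("execve")            # privileged behavior
            if branch("check_debug", debug):        # unrelated branch
                pass                                # logging only, no syscall

        def branch_force(sites):
            """Return branch sites whose flip introduces new security-related syscalls."""
            global flipped_site, observed_syscalls

            flipped_site, observed_syscalls = None, set()
            toy_target()                            # baseline run, nothing flipped
            baseline = set(observed_syscalls)

            candidates = []
            for site in sites:
                flipped_site, observed_syscalls = site, set()
                toy_target()                        # same input, one branch forced
                if observed_syscalls - baseline:    # new syscalls => guard candidate
                    candidates.append(site)
            return candidates

        if __name__ == "__main__":
            print(branch_force(["check_admin", "check_debug"]))  # -> ['check_admin']

    The point of the toy is only the control flow of the search: flipping the admin check exposes a new "syscall", while flipping the unrelated branch does not.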

  3. BET: Black-box Efficient Testing for Convolutional Neural Networks Paper
    Jialai Wang, Han Qiu, Yi Rong, Hengkai Ye, Qi Li, Zongpeng Li and Chao Zhang
    In Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA 2022)
    Abstract: Testing convolutional neural networks (CNNs) to find defects (e.g., error-inducing inputs) before deploying them in security-sensitive scenarios is crucial. Although existing white-box testing methods can effectively test CNN models with high coverage, they require full knowledge of the target CNN models, which may not always be available in privacy-sensitive scenarios. In this paper, we propose a novel Black-box Efficient Testing (BET) method for CNN models. The core insight of BET is that CNNs are generally prone to be affected by continuous perturbations. Thus, by generating such continuous perturbations in a black-box manner, we design a tunable objective function to guide our testing process for thoroughly exploring defects in different decision boundaries of target CNN models. We also design an efficiency-centric policy to find more error-inducing inputs with a fixed query budget. We conduct extensive evaluations with three well-known datasets and five popular CNN structures. The results show that BET significantly outperforms existing white-box or black-box testing methods considering the effective error-inducing inputs found in a fixed query/inference budget. We further show that the error-inducing inputs found by BET can be used to fine-tune the target model to improve the accuracy by up to 3%.
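    The following sketch is a loose, hypothetical rendering of the black-box setting described above, not BET's algorithm: smooth (low-frequency) perturbations are added to a seed input, a stand-in dummy_model is queried, and inputs whose predicted label differs from the seed's are kept, all within a fixed query budget. Only numpy is assumed.

        import numpy as np

        rng = np.random.default_rng(0)

        def smooth_perturbation(shape, coarse=4, scale=0.3):
            """Low-frequency noise: a coarse random grid upsampled to image size.

            Assumes both image dimensions are divisible by `coarse`.
            """
            h, w = shape
            grid = rng.uniform(-scale, scale, size=(coarse, coarse))
            return np.kron(grid, np.ones((h // coarse, w // coarse)))

        def dummy_model(x):
            """Stand-in black-box classifier: label = sign of mean brightness."""
            return int(x.mean() > 0.5)

        def black_box_test(model, seed, query_budget=200):
            """Collect perturbed inputs whose prediction differs from the seed's."""
            base_label = model(seed)
            errors, queries = [], 1          # the baseline prediction costs one query
            while queries < query_budget:
                candidate = np.clip(seed + smooth_perturbation(seed.shape), 0.0, 1.0)
                queries += 1
                if model(candidate) != base_label:
                    errors.append(candidate)  # error-inducing input found
            return errors

        if __name__ == "__main__":
            seed_image = np.full((32, 32), 0.45)   # a 32x32 grayscale "image"
            found = black_box_test(dummy_model, seed_image)
            print(f"{len(found)} error-inducing inputs within the query budget")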

  4. Interpreting Deep Learning-based Vulnerability Detector Predictions Based on Heuristic Searching Paper
    Deqing Zou, Yawei Zhu, Shouhuai Xu, Zhen Li, Hai Jin and Hengkai Ye
    In ACM Transactions on Software Engineering and Methodology (TOSEM), 2021
    Abstract: Detecting software vulnerabilities is an important problem, and a recent development in tackling the problem is the use of deep learning models to detect software vulnerabilities. While effective, it is hard to explain why a deep learning model predicts a piece of code as vulnerable or not because of the black-box nature of deep learning models. Indeed, the interpretability of deep learning models is a daunting open problem. In this article, we make a significant step toward tackling the interpretability of deep learning models in vulnerability detection. Specifically, we introduce a high-fidelity explanation framework, which aims to identify a small number of tokens that make significant contributions to a detector's prediction with respect to an example. Systematic experiments show that the framework indeed has a higher fidelity than existing methods, especially when features are not independent of each other (which often occurs in the real world). In particular, the framework can produce some vulnerability rules that can be understood by domain experts for accepting a detector's outputs (i.e., true positives) or rejecting a detector's outputs (i.e., false positives and false negatives). We also discuss limitations of the present study, which indicate interesting open problems for future research.
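    As a simplified, hypothetical illustration of the explanation problem above (not the paper's framework), the sketch below greedily masks the individual tokens that move a stand-in detector's score the most, returning a small set of tokens that the prediction appears to hinge on; the dummy_detector and its token heuristics are invented for demonstration.

        def dummy_detector(tokens):
            """Stand-in vulnerability detector: scores a token sequence in [0, 1]."""
            risky = {"strcpy", "malloc", "len"}
            return min(1.0, sum(tok in risky for tok in tokens) / 3.0)

        def important_tokens(detector, tokens, k=2, mask="<MASK>"):
            """Greedily pick up to k tokens whose masking changes the score the most."""
            current = list(tokens)
            base = detector(current)
            picked = []
            for _ in range(k):
                best_idx, best_delta = None, 0.0
                for i, tok in enumerate(current):
                    if tok == mask:
                        continue
                    trial = current[:i] + [mask] + current[i + 1:]
                    delta = abs(detector(trial) - base)
                    if delta > best_delta:
                        best_idx, best_delta = i, delta
                if best_idx is None:      # no token changes the score any further
                    break
                picked.append(current[best_idx])
                current[best_idx] = mask
                base = detector(current)  # re-score with the token masked out
            return picked

        if __name__ == "__main__":
            code = ["buf", "=", "malloc", "(", "len", ")", ";",
                    "strcpy", "(", "buf", ",", "src", ")"]
            print(important_tokens(dummy_detector, code))  # e.g. ['malloc', 'len']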

Others

  1. Executable Stack Issues in Software Transformation
    (Workshop)
    Hengkai Ye and Hong Hu
    In Workshop on Forming an Ecosystem Around Software Transformation (FEAST 2024)
  2. One Flip is All It Takes: Identifying Syscall-Guard Variables for Data-Only Attacks
    (Industry Conference) Paper
    Hengkai Ye, Hong Hu, Song Liu and Zhechang Zhang
    Black Hat Asia Briefings (Black Hat Asia 2024)
