Code injection was a favored technique for attackers to exploit buffer overflow vulnerabilities decades ago. Subsequently, the widespread adoption of lightweight solutions like write-xor-execute (W⊕X) effectively mitigated most of these attacks by disallowing writable-and-executable memory. However, we observe multiple concerning cases where software developers accidentally disabled W⊕X and reintroduced executable stacks to popular applications. Although each individual violation has been properly fixed, a lingering question remains: what underlying factors contribute to these recurrent mistakes among developers, even in contemporary software development practices?
In this paper, we conduct two investigations aimed at gaining a comprehensive understanding of the challenges associated with properly enforcing W⊕X in Linux systems. First, we delve into program-hardening tools to assess whether experienced security developers consistently take the steps necessary to avoid executable stacks. Second, we analyze the enforcement of W⊕X on Linux by inspecting the source code of the compilation toolchain, the kernel, and the loader. Our investigation reveals that properly enforcing W⊕X on Linux requires close collaboration among multiple components, which form a complex chain of trust and dependency to safeguard the program stack. However, developers, including security researchers, may overlook the subtle yet essential .note.GNU-stack section when writing assembly code for various purposes, and thereby inadvertently introduce executable stacks. For example, 11 program-hardening tools implemented as inlined reference monitors (IRMs) introduce executable stacks into all hardened applications. Based on these findings, we discuss potential exploitation scenarios and provide suggestions for mitigating this issue.
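The failure mode described here, a hand-written assembly file that lacks a .note.GNU-stack section, is visible in the linked binary's PT_GNU_STACK program header. As a minimal illustration (ours, not part of the paper's tooling), the following Python sketch shells out to GNU binutils' readelf and flags binaries whose stack segment is marked executable; the availability of readelf on PATH is the only assumption.

```python
#!/usr/bin/env python3
"""Flag ELF binaries that request an executable stack.

A minimal sketch, assuming GNU binutils' readelf is installed: it
inspects the PT_GNU_STACK program header, whose flags are "RW" for a
safe binary and "RWE" when the stack is executable.
"""
import subprocess
import sys


def has_executable_stack(path: str) -> bool:
    """Return True if `path` declares (or defaults to) an executable stack."""
    out = subprocess.run(
        ["readelf", "-lW", path],  # -l: program headers, -W: wide output
        capture_output=True, text=True, check=True,
    ).stdout
    for line in out.splitlines():
        if "GNU_STACK" in line:
            # "RWE" flags typically result from linking an assembly object
            # that lacks a .note.GNU-stack section; the usual fix is adding
            #   .section .note.GNU-stack,"",@progbits
            # to the .s file.
            return "RWE" in line
    # No PT_GNU_STACK header at all: the kernel may fall back to an
    # executable stack on some configurations, so treat it as suspicious.
    return True


if __name__ == "__main__":
    for binary in sys.argv[1:]:
        verdict = "EXECUTABLE stack" if has_executable_stack(binary) else "ok"
        print(f"{binary}: {verdict}")
```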
As control-flow protection techniques are widely deployed, it is difficult for attackers to modify control data, such as function pointers, to hijack program control flow. Instead, data-only attacks corrupt security-critical non-control data (critical data) and can bypass all control-flow protections to revive severe attacks. Previous works have explored various methods to help construct or prevent data-only attacks. However, no solution can automatically detect program-specific critical data. In this paper, we identify an important category of critical data, syscall-guard variables, and propose a set of solutions to automatically detect such variables in a scalable manner. Syscall-guard variables determine whether to invoke security-related system calls (syscalls), and altering them allows attackers to request extra privileges from the operating system. We propose branch force, which intentionally flips every conditional branch during execution and checks whether new security-related syscalls are invoked. If so, we conduct data-flow analysis to estimate the feasibility of flipping such branches through common memory errors. We build a tool, VIPER, to implement our ideas. VIPER successfully detects 34 previously unknown syscall-guard variables from 13 programs. We build four new data-only attacks on sqlite and v8, which execute arbitrary commands or delete arbitrary files. VIPER completes its analysis within five minutes for most programs, showing its practicality for spotting syscall-guard variables.
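To make the branch-force idea concrete, here is a toy sketch of our own, not VIPER's binary-level implementation: a stand-in program runs under a harness that can forcibly invert any conditional branch, and the harness diffs the security-related syscalls observed with and without each flip. All names in it (run_program, branch, SYSCALL_TRACE) are hypothetical.

```python
"""Toy illustration of the branch-force idea: flip each conditional
branch and check whether new security-related syscalls appear."""

SECURITY_SYSCALLS = {"execve", "unlink", "setuid"}  # illustrative subset
SYSCALL_TRACE: list[str] = []


def syscall(name: str) -> None:
    """Stand-in for an actual syscall; we only record the request."""
    SYSCALL_TRACE.append(name)


def branch(branch_id: int, cond: bool, flips) -> bool:
    """Evaluate a conditional branch, inverting it if it is being forced."""
    return (not cond) if branch_id in flips else cond


def run_program(is_admin: bool, flips=frozenset()) -> None:
    """Toy target: `is_admin` plays the role of a syscall-guard variable."""
    if branch(0, is_admin, flips):      # branch 0 guards execve
        syscall("execve")
    if branch(1, True, flips):          # branch 1 guards nothing sensitive
        syscall("getpid")


def find_guard_branches() -> list[int]:
    """Flip each branch in turn; report those that unlock new syscalls."""
    global SYSCALL_TRACE
    SYSCALL_TRACE = []
    run_program(is_admin=False)
    baseline = set(SYSCALL_TRACE) & SECURITY_SYSCALLS

    guards = []
    for branch_id in (0, 1):
        SYSCALL_TRACE = []
        run_program(is_admin=False, flips={branch_id})
        if (set(SYSCALL_TRACE) & SECURITY_SYSCALLS) - baseline:
            guards.append(branch_id)
    return guards


if __name__ == "__main__":
    print("branches guarding security syscalls:", find_guard_branches())
```

In the paper's setting, the branch condition that "unlocks" a security syscall points back (via data-flow analysis) to the syscall-guard variable an attacker would need to corrupt.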
Testing convolutional neural networks (CNNs) to find defects (e.g., error-inducing inputs) before deploying them in security-sensitive scenarios is crucial. Although existing white-box testing methods can effectively test CNN models with high coverage, they require full knowledge of the target CNN models, which may not always be available in privacy-sensitive scenarios. In this paper, we propose a novel Black-box Efficient Testing (BET) method for CNN models. The core insight of BET is that CNNs are generally prone to be affected by continuous perturbations. Thus, by generating such continuous perturbations in a black-box manner, we design a tunable objective function to guide our testing process toward thoroughly exploring defects in different decision boundaries of the target CNN models. We also design an efficiency-centric policy to find more error-inducing inputs within a fixed query budget. We conduct extensive evaluations with three well-known datasets and five popular CNN structures. The results show that BET significantly outperforms existing white-box and black-box testing methods in terms of the effective error-inducing inputs found within a fixed query/inference budget. We further show that the error-inducing inputs found by BET can be used to fine-tune the target model, improving its accuracy by up to 3%.
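As a rough illustration of query-budgeted black-box testing (a simplification of our own, not BET's actual objective function or efficiency-centric policy), the sketch below treats the model as an opaque callable returning class probabilities, takes random continuous perturbation steps, and greedily keeps steps that shrink the predicted class's margin until the label flips or the budget runs out.

```python
import numpy as np


def black_box_test(model, x, budget=1000, step=0.02, seed=0):
    """Simplified search for an error-inducing input around `x`.

    Assumptions (not from the paper): `model(batch)` returns class
    probabilities of shape (N, C) and is only queried, never inspected;
    inputs live in [0, 1].
    """
    rng = np.random.default_rng(seed)
    orig = int(np.argmax(model(x[None])[0]))

    def margin(p):
        # Confidence gap between the original class and the runner-up;
        # driving it below zero means the prediction has flipped.
        return p[orig] - np.max(np.delete(p, orig))

    adv = x.copy()
    best = margin(model(adv[None])[0])
    for _ in range(budget):
        d = rng.normal(size=x.shape)
        cand = np.clip(adv + step * d / (np.linalg.norm(d) + 1e-12), 0.0, 1.0)
        p = model(cand[None])[0]
        if int(np.argmax(p)) != orig:
            return cand                     # prediction flipped: defect found
        if margin(p) < best:                # greedy: keep margin-reducing steps
            adv, best = cand, margin(p)
    return None                             # budget exhausted without a flip
```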
Detecting software vulnerabilities is an important problem, and a recent development in tackling it is the use of deep learning models. While effective, it is hard to explain why a deep learning model predicts a piece of code as vulnerable or not, owing to the black-box nature of such models. Indeed, the interpretability of deep learning models is a daunting open problem. In this article, we make a significant step toward tackling the interpretability of deep learning models in vulnerability detection. Specifically, we introduce a high-fidelity explanation framework, which aims to identify a small number of tokens that make significant contributions to a detector's prediction with respect to an example. Systematic experiments show that the framework indeed has higher fidelity than existing methods, especially when features are not independent of each other (which often occurs in the real world). In particular, the framework can produce vulnerability rules that can be understood by domain experts for accepting a detector's outputs (i.e., true positives) or rejecting them (i.e., false positives and false negatives). We also discuss the limitations of the present study, which indicate interesting open problems for future research.
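For intuition, one simple perturbation-based way to score token contributions is occlusion: mask each token in turn and measure the drop in the detector's vulnerability score. The sketch below shows this baseline only, not the article's framework; it treats tokens independently, which is precisely the regime where the article's method is reported to achieve higher fidelity. The detect callable and the mask token are assumptions.

```python
import numpy as np


def top_tokens(detect, tokens, k=5, mask="<unk>"):
    """Occlusion-style token attribution for a vulnerability detector.

    A minimal sketch, assuming `detect(tokens)` returns the probability
    that the token sequence is vulnerable. Each token is masked in turn,
    and the resulting drop in that probability is taken as the token's
    contribution to the prediction.
    """
    base = detect(tokens)
    drops = []
    for i in range(len(tokens)):
        occluded = tokens[:i] + [mask] + tokens[i + 1:]
        drops.append(base - detect(occluded))
    # Report the k tokens whose removal hurts the prediction the most.
    order = np.argsort(drops)[::-1][:k]
    return [(tokens[i], drops[i]) for i in order]
```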