diff --git a/README.md b/README.md
index 2f21937..e77f3b5 100644
--- a/README.md
+++ b/README.md
@@ -1,62 +1,143 @@
+
 # Selfpatch SLR
 
-Your annoyingly hard to use tool for adding structure layout randomization at (early) run-time to your C project.
+Selfpatch SLR (SPSLR) is a research prototype that implements **structure layout randomization (SLR)** for C programs on **x86_64 Linux**.
 
-Currently, this project is a RESEARCH PROTOTYPE and only for x86\_64 (to be expanded in the future) Linux (definitely not to be expanded in the future).
+The system instruments compilation to collect metadata about structure layouts and accesses, compiles that metadata into a patch program before linking, embeds that program into the final binary, and applies randomized layouts at runtime through a self-patching mechanism.
 
-## Description
+---
 
-Selfpatch SRL (SPSLR) is a 3-stage system that allows ELF binaries to patch themselves (usually when starting), achieving structure layout randomization with different layouts after each reboot. In theory, there is no run-time overhead after the randomization itself has completed, though, currently, there is a 1 clock-cycle overhead per member access (so far, I was just too busy to deal with that). To do this, all instructions that perform struct field accesses are patched to use the newly generated field offsets. Additionally, all relevant variables with static storage are reordered in-memory to match the new layouts.
+## Overview
 
-To get accurate information about what instructions and variables to patch, the spslr\_pinpoint compiler plugin follows the compilation process, labels instructions and dumps all the information it has learned. The second stage, spslr\_finalize, accumulates that data from all compilation units, matches struct types that are used in multiple units and compiles a byte-code patcher program. This patcher program is then inserted into the previously compiled target binary. The actual patcher and third stage is linked into the binary. It reads and runs the byte-code that the finalizer generated. This byte-code loads initial structure layout, randomizes them with a fancy shuffle algorithm (respects alignment and struct size) and performs the patching.
+SPSLR introduces controlled randomness into the in-memory layout of C structures. Instead of relying on fixed field offsets, programs are compiled together with metadata that allows rewriting those offsets at startup.
 
-After the patcher has done its thing, the binary can do whatever it always used to do and simply not worry about layouts. Though, be aware that SLR contradicts the C standard, so there is things you can do that break using SLR (e.g. casting a struct pointer to a pointer to its first element).
+The workflow consists of:
 
-## Getting Started
+1. Collecting structure and access metadata during compilation
+2. Compiling metadata into a patch program before linking
+3. Embedding the patch program into the executable
+4. Executing the patch program at startup to randomize layouts and update references
 
-### Dependencies
+---
 
-The finalizer requires LIEF to patch the binary. Unfortunately, most distros do not provide the correct package. To build it from source, do this:
+## Architecture
 
-```bash
-git clone https://github.com/lief-project/LIEF.git
-cd LIEF
-git checkout 0.17.1
-cmake -B build -S . -DCMAKE_BUILD_TYPE=Release -DLIEF_PYTHON_API=OFF -DLIEF_DOC=OFF -DLIEF_EXAMPLES=OFF
-cmake --build build -j$(nproc)
-sudo cmake --install build
+SPSLR consists of three main components:
+
+### `pinpoint` — GCC plugin
+
+The `spslr_pinpoint` plugin runs during compilation and emits `.spslr` metadata files for each compilation unit.
+
+It tracks:
+- structure definitions
+- field accesses
+- relevant data references
+
+The plugin requires two arguments:
+- `metadir` — output directory for metadata
+- `srcroot` — source root directory
+
+---
+
+### `patchcompile` — pre-link patch compiler
+
+The `spslr_patchcompile` tool consumes `.spslr` metadata files and produces an assembly file containing the SPSLR patch program.
+
+Responsibilities:
+- merge metadata across compilation units
+- group compatible targets
+- generate patch instructions
+- emit an assembly representation of the patch program
+
+The generated assembly is assembled into an object file and linked into the final executable.
+
+---
+
+### `selfpatch` — runtime patcher
+
+The `spslr_selfpatch` static library executes the embedded patch program at runtime.
+
+It exposes a single entry point:
+
+```c
+void spslr_selfpatch(void);
 ```
 
-Warning: Rough road ahead.
+At startup, this function:
+- loads the embedded patch program
+- randomizes structure layouts
+- patches instruction operands and data references
+- finalizes execution before normal program logic continues
 
-Now to the annoying part: The GNU front-end for the C language folds offsetof-like expressions into constants. In the parser. IN. THE. PARSER. Any hooks/events available to plugins happen significantly later in the pipeline. Thus, SPSLR can not detect offsetofs, be it \_\_builtin\_offsetof or the DIY variants (`((size\_t)&((struct S\*)0)-\>m)`), using current GCC versions.
+---
 
-To deal with this reliably, using a custom GCC build is necessary. The required patch is provided in this repo (do NOT use the v2 yet!). To use it and install the custom gcc, use these commands:
+## Repository Structure
+
+- `pinpoint/` — GCC plugin for metadata extraction
+- `patchcompile/` — pre-link patch compiler
+- `selfpatch/` — runtime patch execution library
+- `subject/` — example target demonstrating integration
+- `docs/` — additional documentation and notes
+
+---
+
+## Requirements
+
+### Platform
+
+- x86_64 Linux
+
+### Toolchain
+
+- `gcc-16`
+- `g++-16`
+
+The repository includes GCC patch files used to preserve structure-access expressions required by SPSLR metadata collection.
+
+---
+
+## Building the Custom GCC Toolchain
+
+SPSLR relies on a custom GCC build so that `offsetof`-style structure access expressions remain observable to the plugin infrastructure.
 
 ```bash
 git clone git://gcc.gnu.org/git/gcc.git
 cd gcc
 git checkout basepoints/gcc-16
-git am gcc_component_ref.patch
+git am /path/to/selfpatch-slr/gcc_component_ref.patch
 cd ..
-mkdir build
-cd build
-../gcc/configure --enable-host-shared --prefix=/usr/local/gcc-16 --program-suffix=16 --enable-languages=c,c++ --enable-plugin --disable-multilib --disable-werror --disable-bootstrap --disable-libsanitizer --disable-libquadmath --disable-libvtv
+
+mkdir gcc-build
+cd gcc-build
+../gcc/configure \
+    --enable-host-shared \
+    --prefix=/usr/local/gcc-16 \
+    --program-suffix=16 \
+    --enable-languages=c,c++ \
+    --enable-plugin \
+    --disable-multilib \
+    --disable-werror \
+    --disable-bootstrap \
+    --disable-libsanitizer \
+    --disable-libquadmath \
+    --disable-libvtv
 make -j$(nproc)
 sudo make install
+
 sudo ln -s /usr/local/gcc-16/bin/gcc16 /usr/local/bin/gcc-16
 sudo ln -s /usr/local/gcc-16/bin/g++16 /usr/local/bin/g++-16
 ```
 
-Afterwards, you have gcc-16 and g++-16 available on your system. This repo's CMake setup is already configured to use them.
+Verify installation:
 
-I am currently trying to get (a version of) the patch upstreamed so gcc 16 will actually support the SPSLR pipeline with its first stable release in 2026. Getting that done, would certainly make things a little less tedious.
+```bash
+gcc-16 --version
+g++-16 --version
+```
 
-### How To Use
+---
 
-With the dependencies in place, you can use CMake to build all 3 stages of SPSLR and apply/add them to your target.
-
-In the directory you cloned this repo to, make a build directory and use cmake+make to build everything. As a result, you get all the stages and the example subject. Refer to the CMakeLists.txt files for information on how exactly all components come together.
+## Build
 
 ```bash
 mkdir build
@@ -65,4 +146,55 @@ cmake ..
 make -j$(nproc)
 ```
 
-The example subject with SPSLR applied is called subject\_final.
+This builds:
+- `spslr_pinpoint`
+- `spslr_patchcompile`
+- `spslr_selfpatch`
+- the example `subject` executable
+
+---
+
+## Integration Workflow
+
+To integrate SPSLR into a project:
+
+1. Compile all source files using the `spslr_pinpoint` plugin
+2. Provide `metadir` and `srcroot` plugin arguments
+3. Collect generated `.spslr` metadata files
+4. Run `spslr_patchcompile` to produce a patch program assembly file
+5. Assemble the generated assembly into an object file
+6. Link the object together with:
+   - compiled program objects
+   - `spslr_selfpatch`
+7. Call `spslr_selfpatch()` early in program startup
+
+---
+
+## Example
+
+The `subject` target demonstrates the full pipeline:
+
+- compiles sources with the plugin
+- generates metadata
+- builds the SPSLR patch program
+- links the program into the executable
+- calls `spslr_selfpatch()` at the start of `main()`
+
+The example performs operations on randomized structures and accesses both local and global data after patching.
+
+---
+
+## Limitations
+
+- Platform support: **x86_64 Linux**
+- Requires a **custom GCC 16 toolchain**
+- Structure layout randomization alters standard memory layout assumptions
+
+Code that relies on fixed structure layouts, manual offset calculations, or layout-dependent casting may not behave correctly under SPSLR.
+
+---
+
+## Notes
+
+SPSLR is a research platform for exploring structure layout randomization and its implications for systems programming, security, and compiler-assisted transformations.
+
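The Limitations section in this diff warns that layout-dependent casting may misbehave under SPSLR. The following is a minimal sketch of that failure mode in plain C. It does not use SPSLR: `struct packet_shuffled` is a hypothetical stand-in for one possible randomized ordering of `struct packet_declared`, and every name below is an illustrative assumption, not something taken from the repository.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Field order as written in the source. */
struct packet_declared {
    int id;
    char tag;
    double value;
};

/*
 * Hypothetical ordering after randomization. SPSLR would permute one type at
 * run time; a second type is used here only to make the effect visible
 * without the toolchain.
 */
struct packet_shuffled {
    double value;
    int id;
    char tag;
};

/*
 * Layout-dependent read: assumes `id` sits at offset 0 of the object.
 * memcpy is used so the read is well-defined even when the assumption fails.
 */
static inline int read_id_assuming_first(const void *p)
{
    int out;
    memcpy(&out, p, sizeof out);
    return out;
}

/* Layout-independent reads: go through the member, whatever its offset is. */
static inline int read_id_declared(const struct packet_declared *p)
{
    return p->id;
}

static inline int read_id_shuffled(const struct packet_shuffled *p)
{
    return p->id;
}
```

With the declared order, `read_id_assuming_first` and `read_id_declared` agree because `id` is at offset 0. With the shuffled order, only the member access still finds `id`; the offset-0 read returns the leading bytes of `value` instead. Member accesses are the kind of access the README describes as being patched, whereas a hard-coded offset or cast is the kind of assumption the Limitations section cautions against.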