Home

eBPF is a technology that allows running user-supplied programs inside the Linux kernel. For more info see the “What is eBPF?” documentation.

Aya is an eBPF library built with a focus on operability and developer experience. It does not rely on libbpf or bcc; it’s built from the ground up purely in Rust, using only the libc crate to execute syscalls. With BTF support and when linked with musl, it offers a true compile once, run everywhere solution, where a single self-contained binary can be deployed on many Linux distributions and kernel versions.

Some of the major features provided include:

  • Support for the BPF Type Format (BTF), which is transparently enabled when supported by the target kernel. This allows eBPF programs compiled against one kernel version to run on different kernel versions without the need to recompile.
  • Support for function call relocation and global data maps, which allows eBPF programs to make function calls and use global variables and initializers.
  • Async support with both tokio and async-std.
  • Easy to deploy and fast to build: Aya doesn’t require a kernel build, compiled headers, or even a C toolchain; a release build completes in a matter of seconds.

Who’s Using Aya

Anza

Anza is using Aya with XDP in Agave, a Solana validator implementation. See source code.

Deepfence

Deepfence is using Aya with LSM for managing Linux security policies. See ebpfguard.

Exein

Exein is using Aya in Pulsar, a Runtime Security Observability Tool for IoT. See pulsar on GitHub.

Kubernetes SIGs

The Kubernetes Special Interest Groups (SIGs) are using Aya to develop Blixt, a load-balancer that supports the development and maintenance of the Gateway API project.

Red Hat

Red Hat is using Aya to develop bpfman, an eBPF program loading daemon.

Getting Started

This getting started guide will help you use the Rust Programming Language and the Aya library to build extended Berkeley Packet Filter (eBPF) programs.

Who Aya Is For

Rust is proving to be a popular systems programming language because of its safety features and excellent C interoperability. The safety features are less important in the context of eBPF as programs often need to read kernel memory, which is considered unsafe. However, what Rust combined with Aya does offer is a fast and efficient development experience:

  • Cargo for project scaffolding, build, test and debugging
  • Generation of Rust bindings to Kernel Headers with Compile-Once, Run-Everywhere (CO-RE) support
  • Easy code sharing between user-space and eBPF programs
  • Fast compile times
  • No runtime dependency on LLVM, BCC or libbpf

Scope

The goals of this guide are:

  • Get developers up to speed with eBPF Rust development, e.g. how to set up a development environment.
  • Share current best practices about using Rust for eBPF.

Who This Guide is For

This guide caters to people with either some eBPF or some Rust background. For those without any prior knowledge, we suggest you read the “Assumptions and Prerequisites” section first. You can check out the “Other Resources” section to find resources on topics you might want to read up on.

Assumptions and Prerequisites

  • You are comfortable using the Rust Programming Language, and have written, run, and debugged Rust applications on a desktop environment. You should also be familiar with the idioms of the 2021 edition as this guide targets Rust 2021.
  • You are familiar with the core concepts of eBPF

Other Resources

If you are unfamiliar with anything mentioned above or if you want more information about a specific topic mentioned in this guide you might find some of these resources helpful.

  • Rust: Rust Book. Read this if you are not yet comfortable with Rust.
  • eBPF: Cilium Guide. Excellent starting point for eBPF.

How to Use This Guide

This guide generally assumes that you’re reading it front-to-back. Later chapters build on concepts in earlier chapters, and earlier chapters may not dig into details on a topic, revisiting the topic in a later chapter.

eBPF Program Constraints

The eBPF Virtual Machine, where our eBPF programs will be run, is a constrained runtime environment:

  • There are only 512 bytes of stack (or 256 bytes if we are using tail calls).
  • There is no access to heap space and data must instead be written to maps.

Even applications written in C are restricted to a subset of language features, and we have similar constraints in Rust:

  • We may not use the standard library. We use core instead.
  • core::fmt may not be used and neither can traits that rely on it, for example Display and Debug
  • As there is no heap, we cannot use alloc or collections.
  • We must not panic as the eBPF VM does not support stack unwinding, or the abort instruction.
  • There is no main function

Alongside this, a lot of the code that we write is unsafe, as we are reading directly from kernel memory.
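One way to keep an eye on the 512-byte stack limit is a compile-time size check on the types you intend to place on the stack. The sketch below uses only core, so the same idea should carry over to a no_std eBPF crate. The Event struct and the fits_on_bpf_stack helper are hypothetical names used for illustration; they are not part of Aya.

```rust
use core::mem::size_of;

/// Stack available to an eBPF program, in bytes (see the constraints above).
pub const BPF_STACK_LIMIT: usize = 512;

/// Hypothetical event type: fixed-size fields only, no heap allocation.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct Event {
    pub pid: u32,
    pub comm: [u8; 16],
}

/// Returns true if a value of type `T` fits within the eBPF stack limit.
pub const fn fits_on_bpf_stack<T>() -> bool {
    size_of::<T>() <= BPF_STACK_LIMIT
}

// Fails the build if `Event` ever grows beyond the limit.
const _: () = assert!(fits_on_bpf_stack::<Event>());
```

Passing this check is necessary but not sufficient: other locals share the same stack, so larger data still belongs in a map.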

Development Environment

Prerequisites

Before getting started you will need the Rust stable and nightly toolchains installed on your system. This is easily achieved with rustup:

rustup install stable
rustup toolchain install nightly --component rust-src

Once you have the Rust toolchains installed, you must also install bpf-linker. The linker depends on LLVM, and it can be built against the version shipped with the Rust toolchain if you are running on a Linux x86_64 system with:

cargo install bpf-linker

On Debian based distributions, you need to install the llvm-19-dev, libclang-19-dev and libpolly-19-dev packages (if using LLVM 19).

If you are running macOS, or Linux on any other architecture, you need to install the newest stable version of LLVM first (for example, with brew install llvm), then install the linker with:

LLVM_SYS_180_PREFIX=$(brew --prefix llvm) cargo install \
    --no-default-features bpf-linker

To generate the scaffolding for your project, you’re going to need cargo-generate, which you can install following these instructions.

And finally to generate bindings for kernel data structures, you must install bpftool, either from your distribution or building it from source.

Warning

If you’re running on Ubuntu 20.04, there is a bug with bpftool and the default kernel installed by the distribution. To avoid running into it, you can install a newer bpftool version that does not include the bug with:

sudo apt install linux-tools-5.8.0-63-generic
export PATH=/usr/lib/linux-tools/5.8.0-63-generic:$PATH

Starting A New Project

To start a new project, you can use cargo-generate:

cargo generate https://github.com/aya-rs/aya-template

This will prompt you for a project name - we’ll be using myapp in this example. It will also prompt you for a program type and possibly other options depending on the chosen type (for example, the attach direction for network classifiers).

If you prefer, you can set template options directly from the command line, e.g.:

cargo generate --name myapp -d program_type=xdp https://github.com/aya-rs/aya-template

See the cargo-generate.toml file (in the aya-template repository) for the full list of available options.

A Simple XDP program

In this section we’ll walk you through the process of writing, building and running a simple eBPF/XDP program and userspace application.

Hello XDP

Note

Full code for the example in this chapter is available on GitHub.

Example Project

While there are myriad trace points to attach to and program types to write, we should start somewhere simple.

XDP (eXpress Data Path) programs permit our eBPF program to make decisions about packets that have been received on the interface to which our program is attached. To keep things simple, we’ll build a very simplistic firewall to permit or deny traffic.

eBPF Component

Permit All

We must first write the eBPF component of our program. This is a minimal generated XDP program that permits all traffic. The logic for this program is located in xdp-hello-ebpf/src/main.rs and currently looks like this:

#![no_std] // (1)
#![no_main] // (2)

use aya_ebpf::{bindings::xdp_action, macros::xdp, programs::XdpContext};
use aya_log_ebpf::info;

#[xdp] // (4)
pub fn xdp_hello(ctx: XdpContext) -> u32 {
    // (5)
    match unsafe { try_xdp_hello(ctx) } {
        Ok(ret) => ret,
        Err(_) => xdp_action::XDP_ABORTED,
    }
}

unsafe fn try_xdp_hello(ctx: XdpContext) -> Result<u32, u32> {
    // (6)
    info!(&ctx, "received a packet");
    // (7)
    Ok(xdp_action::XDP_PASS)
}

#[cfg(not(test))]
#[panic_handler] // (3)
fn panic(_info: &core::panic::PanicInfo) -> ! {
    loop {}
}
  1. #![no_std] is required since we cannot use the standard library.
  2. #![no_main] is required as we have no main function.
  3. The #[panic_handler] is required to keep the compiler happy, although it is never used since we cannot panic.
  4. This indicates that this function is an XDP program.
  5. Our main entry point defers to another function and performs error handling, returning XDP_ABORTED on error, which will drop the packet.
  6. Write a log entry every time a packet is received.
  7. This function returns a Result that permits all traffic.

Now we can compile this using cargo build.

Verifying The Program

Let’s take a look at the compiled eBPF program:

$ llvm-objdump -S target/bpfel-unknown-none/debug/xdp-hello

target/bpfel-unknown-none/debug/xdp-hello:	file format elf64-bpf

Disassembly of section .text:

0000000000000000 <memset>:
        0:	15 03 06 00 00 00 00 00	if r3 == 0 goto +6 <LBB1_3>
        1:	b7 04 00 00 00 00 00 00	r4 = 0

0000000000000010 <LBB1_2>:
        2:	bf 15 00 00 00 00 00 00	r5 = r1
        3:	0f 45 00 00 00 00 00 00	r5 += r4
        4:	73 25 00 00 00 00 00 00	*(u8 *)(r5 + 0) = r2
        5:	07 04 00 00 01 00 00 00	r4 += 1
        6:	2d 43 fb ff 00 00 00 00	if r3 > r4 goto -5 <LBB1_2>

0000000000000038 <LBB1_3>:
        7:	95 00 00 00 00 00 00 00	exit

0000000000000040 <memcpy>:
        8:	15 03 09 00 00 00 00 00	if r3 == 0 goto +9 <LBB2_3>
        9:	b7 04 00 00 00 00 00 00	r4 = 0

0000000000000050 <LBB2_2>:
        10:	bf 15 00 00 00 00 00 00	r5 = r1
        11:	0f 45 00 00 00 00 00 00	r5 += r4
        12:	bf 20 00 00 00 00 00 00	r0 = r2
        13:	0f 40 00 00 00 00 00 00	r0 += r4
        14:	71 00 00 00 00 00 00 00	r0 = *(u8 *)(r0 + 0)
        15:	73 05 00 00 00 00 00 00	*(u8 *)(r5 + 0) = r0
        16:	07 04 00 00 01 00 00 00	r4 += 1
        17:	2d 43 f8 ff 00 00 00 00	if r3 > r4 goto -8 <LBB2_2>

0000000000000090 <LBB2_3>:
        18:	95 00 00 00 00 00 00 00	exit

Disassembly of section xdp/xdp_hello:

0000000000000000 <xdp_hello>:
        0:	bf 16 00 00 00 00 00 00	r6 = r1
        1:	b7 07 00 00 00 00 00 00	r7 = 0
        2:	63 7a fc ff 00 00 00 00	*(u32 *)(r10 - 4) = r7
        3:	bf a2 00 00 00 00 00 00	r2 = r10
:
        245:	18 03 00 00 ff ff ff ff 00 00 00 00 00 00 00 00	r3 = 4294967295 ll
        247:	bf 04 00 00 00 00 00 00	r4 = r0
        248:	b7 05 00 00 aa 00 00 00	r5 = 170
        249:	85 00 00 00 19 00 00 00	call 25

00000000000007d0 <LBB0_2>:
        250:	b7 00 00 00 02 00 00 00	r0 = 2
        251:	95 00 00 00 00 00 00 00	exit

The output was trimmed for brevity. We can see an xdp/xdp_hello section here. And in <LBB0_2>, r0 = 2 sets register 0 to 2, which is the value of the XDP_PASS action. exit ends the program.

Simple!
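For reading disassembly or log output, it can help to translate the numeric return code back into a name. The following is a small sketch in plain Rust; the values are those of the kernel's enum xdp_action, while the action_name helper is a hypothetical name and not part of Aya.

```rust
/// Maps an XDP return code to its name in the kernel's `enum xdp_action`.
pub fn action_name(code: u32) -> &'static str {
    match code {
        0 => "XDP_ABORTED",
        1 => "XDP_DROP",
        2 => "XDP_PASS",
        3 => "XDP_TX",
        4 => "XDP_REDIRECT",
        _ => "unknown",
    }
}
```

With this helper, the r0 = 2 seen in the disassembly resolves to XDP_PASS.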

User-space Component

Now that our eBPF program is complete and compiled, we need a user-space program to load it and attach it to a network interface. Fortunately, we have a generated program ready in xdp-hello/src/main.rs which is going to do that for us.

Starting Out

Let’s look at the details of our generated user-space application:

use anyhow::Context;
use aya::programs::{Xdp, XdpFlags};
use aya_log::EbpfLogger;
use clap::Parser;
use log::{info, warn};
use tokio::signal; // (1)

#[derive(Debug, Parser)]
struct Opt {
    #[clap(short, long, default_value = "eth0")]
    iface: String, // (2)
}

#[tokio::main] // (3)
async fn main() -> Result<(), anyhow::Error> {
    let opt = Opt::parse();

    env_logger::init();

    // This will include your eBPF object file as raw bytes at compile-time and load it at
    // runtime. This approach is recommended for most real-world use cases. If you would
    // like to specify the eBPF program at runtime rather than at compile-time, you can
    // reach for `Ebpf::load_file` instead.
    // (4)
    // (5)
    let mut bpf = aya::Ebpf::load(aya::include_bytes_aligned!(concat!(
        env!("OUT_DIR"),
        "/xdp-hello"
    )))?;
    match EbpfLogger::init(&mut bpf) {
        Err(e) => {
            // This can happen if you remove all log statements from your eBPF program.
            warn!("failed to initialize eBPF logger: {e}");
        }
        Ok(logger) => {
            let mut logger = tokio::io::unix::AsyncFd::with_interest(
                logger,
                tokio::io::Interest::READABLE,
            )?;
            tokio::task::spawn(async move {
                loop {
                    let mut guard = logger.readable_mut().await.unwrap();
                    guard.get_inner_mut().flush();
                    guard.clear_ready();
                }
            });
        }
    }
    // (6)
    let program: &mut Xdp = bpf.program_mut("xdp_hello").unwrap().try_into()?;
    program.load()?; // (7)
    // (8)
    program.attach(&opt.iface, XdpFlags::default())
        .context("failed to attach the XDP program with default flags - try changing XdpFlags::default() to XdpFlags::SKB_MODE")?;

    info!("Waiting for Ctrl-C...");
    signal::ctrl_c().await?;
    info!("Exiting...");

    Ok(())
}
  1. tokio is the async library we’re using, which provides our Ctrl-C handler. It will come in useful later as we expand the functionality of the initial program.
  2. Here we declare our CLI flags. Just --iface for now, for passing the interface name.
  3. Here’s our main entry point.
  4. include_bytes_aligned!() copies the contents of the BPF ELF object file at compile time.
  5. Ebpf::load() reads the BPF ELF object file contents from the output of the previous command, creates any maps, and performs BTF relocations.
  6. We extract the XDP program.
  7. And then load it into the kernel.
  8. Finally, we can attach it to an interface.

Let’s try it out!

$ cargo run -- -h
    Finished dev [optimized] target(s) in 0.90s
    Finished dev [unoptimized + debuginfo] target(s) in 0.60s
xdp-hello

USAGE:
    xdp-hello [OPTIONS]

OPTIONS:
    -h, --help             Print help information
    -i, --iface <IFACE>    [default: eth0]

Note

This command assumes the interface is eth0 by default. To use a different interface name, run:

RUST_LOG=info cargo run --config 'target."cfg(all())".runner="sudo -E"' -- \
  --iface wlp2s0

Replace wlp2s0 with your interface.

$ RUST_LOG=info cargo run --config 'target."cfg(all())".runner="sudo -E"'
[2022-12-21T18:03:09Z INFO  xdp_hello] Waiting for Ctrl-C...
[2022-12-21T18:03:11Z INFO  xdp_hello] received a packet
[2022-12-21T18:03:11Z INFO  xdp_hello] received a packet
[2022-12-21T18:03:11Z INFO  xdp_hello] received a packet
[2022-12-21T18:03:11Z INFO  xdp_hello] received a packet
^C[2022-12-21T18:03:11Z INFO  xdp_hello] Exiting...

So every time a packet was received on the interface, a log was printed!

Warning

If you get an error loading the program, try changing XdpFlags::default() to XdpFlags::SKB_MODE

The Lifecycle of an eBPF Program

The program runs until CTRL+C is pressed and then exits. On exit, Aya takes care of detaching the program for us.

If you issue the sudo bpftool prog list command when xdp_hello is running you can verify that it is loaded:

958: xdp  name xdp_hello  tag 0137ce4fce70b467  gpl
    loaded_at 2022-06-23T13:55:28-0400  uid 0
    xlated 2016B  jited 1138B  memlock 4096B  map_ids 275,274,273
    pids xdp-hello(131677)

Running the command again once xdp_hello has exited will show that the program is no longer loaded.

Parsing packets

In the previous chapter, our XDP application ran until Ctrl-C was hit and permitted all the traffic. Each time a packet was received, the eBPF program logged the string "received a packet". In this chapter we’re going to show how to parse packets.

While we could go all out and parse data all the way up to L7, we’ll constrain our example to L3, and to make things easier, IPv4 only.

Note

Full code for the example in this chapter is available on GitHub.

Using network types

We’re going to log the source IP address of incoming packets. So we’ll need to:

  • Read the Ethernet header to determine if we’re dealing with an IPv4 packet, else terminate parsing.
  • Read the source IP Address from the IPv4 header.

We could read the specifications of those protocols and parse manually, but instead we’re going to use the network-types crate which provides convenient type definitions for many of the common Internet protocols.
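To give a sense of what parsing manually would involve, here is a minimal sketch in plain Rust that works on a byte slice rather than on an XdpContext. The offsets come from the Ethernet II and IPv4 header layouts. The function name source_ipv4 and the constants are hypothetical, and the sketch assumes an untagged frame (no VLAN header).

```rust
use std::net::Ipv4Addr;

const ETH_HDR_LEN: usize = 14; // destination MAC (6) + source MAC (6) + EtherType (2)
const ETHERTYPE_IPV4: u16 = 0x0800;
const IPV4_SRC_OFFSET: usize = 12; // offset of the source address within the IPv4 header

/// Returns the IPv4 source address of an Ethernet frame, or `None` if the
/// frame is too short or does not carry IPv4.
pub fn source_ipv4(frame: &[u8]) -> Option<Ipv4Addr> {
    // The EtherType is the last two bytes of the Ethernet header, big endian.
    let et = frame.get(12..14)?;
    if u16::from_be_bytes([et[0], et[1]]) != ETHERTYPE_IPV4 {
        return None;
    }
    // `get` performs the bounds check that the verifier would demand in eBPF.
    let start = ETH_HDR_LEN + IPV4_SRC_OFFSET;
    let b = frame.get(start..start + 4)?;
    Some(Ipv4Addr::new(b[0], b[1], b[2], b[3]))
}
```

The network-types crate replaces hand-written offsets like these with typed header definitions.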

Let’s add it to our eBPF crate by adding a dependency on network-types in our xdp-log-ebpf/Cargo.toml:

[package]
name = "xdp-log-ebpf"
version = "0.1.0"
edition.workspace = true

[dependencies]
aya-ebpf = { git = "https://github.com/aya-rs/aya" }
aya-log-ebpf = { git = "https://github.com/aya-rs/aya" }
xdp-log-common = { path = "../xdp-log-common" }
network-types = "0.1.0"

[build-dependencies]
which = { version = "8.0.0", default-features = false, features = ["real-sys"] }

[[bin]]
name = "xdp-log"
path = "src/main.rs"

Getting packet data from the context

XdpContext contains two fields that we’re going to use: data and data_end, which are respectively a pointer to the beginning and to the end of the packet.

In order to access the data in the packet and to ensure that we do so in a way that keeps the eBPF verifier happy, we’re going to introduce a helper function called ptr_at. The function ensures that before we access any packet data, we insert the bounds checks which are required by the verifier.

Finally, to access individual fields from the Ethernet and IPv4 headers, we’re going to use the memoffset crate. Let’s add a dependency for it in xdp-log-ebpf/Cargo.toml.

Tip

As there is limited stack space, it’s more memory efficient to use the offset_of! macro to read a single field from a struct, rather than reading the whole struct and accessing the field by name.
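As a rough illustration of this tip, the sketch below reads a single field by its offset instead of copying a whole header. It is plain Rust on a byte slice, and it uses core::mem::offset_of! (available in recent stable Rust, 1.77 and later) in place of the memoffset crate so that it has no dependencies; the memoffset macro takes the same (Type, field) arguments. Ipv4Header is a hypothetical stand-in, not the network-types definition.

```rust
use core::mem::offset_of;

/// Hypothetical stand-in for an IPv4 header type. Every field is a byte or a
/// byte array, so the `repr(C)` layout has no padding.
#[repr(C)]
pub struct Ipv4Header {
    pub version_ihl: u8,
    pub tos: u8,
    pub total_len: [u8; 2],
    pub id: [u8; 2],
    pub frag_off: [u8; 2],
    pub ttl: u8,
    pub proto: u8,
    pub checksum: [u8; 2],
    pub src_addr: [u8; 4],
    pub dst_addr: [u8; 4],
}

/// Reads only the 4-byte source address out of a raw IPv4 header,
/// instead of copying all 20 bytes of the header onto the stack.
pub fn src_addr(header: &[u8]) -> Option<[u8; 4]> {
    let start = offset_of!(Ipv4Header, src_addr);
    let bytes = header.get(start..start + 4)?;
    Some([bytes[0], bytes[1], bytes[2], bytes[3]])
}
```

Only four bytes are materialised here, which matters when the whole program has to fit in a few hundred bytes of stack.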

The resulting code looks like this:

#![no_std]
#![no_main]

use aya_ebpf::{bindings::xdp_action, macros::xdp, programs::XdpContext};
use aya_log_ebpf::info;

use core::mem;
use network_types::{
    eth::{EthHdr, EtherType},
    ip::{IpProto, Ipv4Hdr},
    tcp::TcpHdr,
    udp::UdpHdr,
};

#[cfg(not(test))]
#[panic_handler]
fn panic(_info: &core::panic::PanicInfo) -> ! {
    loop {}
}

#[xdp]
pub fn xdp_firewall(ctx: XdpContext) -> u32 {
    match try_xdp_firewall(ctx) {
        Ok(ret) => ret,
        Err(_) => xdp_action::XDP_ABORTED,
    }
}

#[inline(always)] // (1)
fn ptr_at<T>(ctx: &XdpContext, offset: usize) -> Result<*const T, ()> {
    let start = ctx.data();
    let end = ctx.data_end();
    let len = mem::size_of::<T>();

    if start + offset + len > end {
        return Err(());
    }

    Ok((start + offset) as *const T)
}

fn try_xdp_firewall(ctx: XdpContext) -> Result<u32, ()> {
    let ethhdr: *const EthHdr = ptr_at(&ctx, 0)?; // (2)
    match unsafe { (*ethhdr).ether_type() } {
        Ok(EtherType::Ipv4) => {}
        _ => return Ok(xdp_action::XDP_PASS),
    }

    let ipv4hdr: *const Ipv4Hdr = ptr_at(&ctx, EthHdr::LEN)?;
    let source_addr = u32::from_be_bytes(unsafe { (*ipv4hdr).src_addr });

    let source_port = match unsafe { (*ipv4hdr).proto } {
        IpProto::Tcp => {
            let tcphdr: *const TcpHdr =
                ptr_at(&ctx, EthHdr::LEN + Ipv4Hdr::LEN)?;
            u16::from_be_bytes(unsafe { (*tcphdr).source })
        }
        IpProto::Udp => {
            let udphdr: *const UdpHdr =
                ptr_at(&ctx, EthHdr::LEN + Ipv4Hdr::LEN)?;
            unsafe { (*udphdr).src_port() }
        }
        _ => return Err(()),
    };

    // (3)
    info!(&ctx, "SRC IP: {:i}, SRC PORT: {}", source_addr, source_port);

    Ok(xdp_action::XDP_PASS)
}
  1. Here we define ptr_at to ensure that packet access is always bounds-checked.
  2. Use ptr_at to read our ethernet header.
  3. Here we log IP and port.
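The arithmetic inside ptr_at can be tried out in isolation. The sketch below mirrors the same comparison using plain addresses, so it runs as ordinary Rust; checked_addr is a hypothetical name, and start and end stand in for ctx.data() and ctx.data_end().

```rust
use core::mem::size_of;

/// Mirrors the check in `ptr_at`, using plain addresses instead of a context.
/// `start` and `end` play the role of `ctx.data()` and `ctx.data_end()`.
pub fn checked_addr<T>(start: usize, end: usize, offset: usize) -> Result<usize, ()> {
    let len = size_of::<T>();
    // The whole value, not just its first byte, must lie before `end`.
    if start + offset + len > end {
        return Err(());
    }
    Ok(start + offset)
}
```

A 4-byte read at offset 10 of a 14-byte packet ends exactly at data_end and is accepted, while the same read at offset 11 would run one byte past the end and is rejected.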

Don’t forget to rebuild your eBPF program!

User-space component

Our user-space code doesn’t really differ from the previous chapter, but for reference, here’s the code:

use anyhow::Context;
use aya::programs::{Xdp, XdpFlags};
use aya_log::EbpfLogger;
use clap::Parser;
use log::{info, warn};
use tokio::signal;

#[derive(Debug, Parser)]
struct Opt {
    #[clap(short, long, default_value = "eth0")]
    iface: String,
}

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    let opt = Opt::parse();

    env_logger::init();

    // This will include your eBPF object file as raw bytes at compile-time and load it at
    // runtime. This approach is recommended for most real-world use cases. If you would
    // like to specify the eBPF program at runtime rather than at compile-time, you can
    // reach for `Ebpf::load_file` instead.
    let mut bpf = aya::Ebpf::load(aya::include_bytes_aligned!(concat!(
        env!("OUT_DIR"),
        "/xdp-log"
    )))?;
    match EbpfLogger::init(&mut bpf) {
        Err(e) => {
            // This can happen if you remove all log statements from your eBPF program.
            warn!("failed to initialize eBPF logger: {e}");
        }
        Ok(logger) => {
            let mut logger = tokio::io::unix::AsyncFd::with_interest(
                logger,
                tokio::io::Interest::READABLE,
            )?;
            tokio::task::spawn(async move {
                loop {
                    let mut guard = logger.readable_mut().await.unwrap();
                    guard.get_inner_mut().flush();
                    guard.clear_ready();
                }
            });
        }
    }
    let program: &mut Xdp =
        bpf.program_mut("xdp_firewall").unwrap().try_into()?;
    program.load()?;
    program.attach(&opt.iface, XdpFlags::default())
        .context("failed to attach the XDP program with default flags - try changing XdpFlags::default() to XdpFlags::SKB_MODE")?;

    info!("Waiting for Ctrl-C...");
    signal::ctrl_c().await?;
    info!("Exiting...");

    Ok(())
}

Running the program

As before, the interface can be overridden by providing the interface name as a parameter, for example, RUST_LOG=info cargo xtask run -- --iface wlp2s0.

$ RUST_LOG=info cargo xtask run
[2022-12-22T11:32:21Z INFO  xdp_log] SRC IP: 172.52.22.104, SRC PORT: 443
[2022-12-22T11:32:21Z INFO  xdp_log] SRC IP: 172.52.22.104, SRC PORT: 443
[2022-12-22T11:32:21Z INFO  xdp_log] SRC IP: 172.52.22.104, SRC PORT: 443
[2022-12-22T11:32:21Z INFO  xdp_log] SRC IP: 172.52.22.104, SRC PORT: 443
[2022-12-22T11:32:21Z INFO  xdp_log] SRC IP: 234.130.159.162, SRC PORT: 443

Dropping Packets

In the previous chapter our XDP program just logged traffic. In this chapter we’re going to extend it to allow the dropping of traffic.

Note

Full code for the example in this chapter is available on GitHub.

Design

In order for our program to drop packets, we’re going to need a list of IP addresses to drop. Since we want to be able to look them up efficiently, we’re going to use a HashMap to hold them.

We’re going to:

  • Create a HashMap in our eBPF program that will act as a blocklist
  • Check the IP address from the packet against the HashMap to make a policy decision (pass or drop)
  • Add entries to the blocklist from userspace
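The policy itself is small enough to model in ordinary Rust before moving it into the kernel. In the sketch below, std::collections::HashMap stands in for the eBPF map (std is not available inside an eBPF program), and decide is a hypothetical name. The constants use the kernel's XDP action values.

```rust
use std::collections::HashMap;

pub const XDP_DROP: u32 = 1;
pub const XDP_PASS: u32 = 2;

/// Returns the action for a packet from `source`: drop it if the address
/// is a key in the blocklist, pass it otherwise. The map value is unused.
pub fn decide(blocklist: &HashMap<u32, u32>, source: u32) -> u32 {
    if blocklist.contains_key(&source) {
        XDP_DROP
    } else {
        XDP_PASS
    }
}
```

Only the presence of the key matters, which is why the value stored alongside each address can be a placeholder.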

Dropping packets in eBPF

We will create a new map called BLOCKLIST in our eBPF code. In order to make the policy decision, we will need to look up the source IP address in our HashMap. If it exists, we drop the packet; otherwise, we allow it. We’ll keep this logic in a function called block_ip.

Here’s what the code looks like now:

#![no_std]
#![no_main]
#![allow(nonstandard_style, dead_code)]

use aya_ebpf::{
    bindings::xdp_action,
    macros::{map, xdp},
    maps::HashMap,
    programs::XdpContext,
};
use aya_log_ebpf::info;

use core::mem;
use network_types::{
    eth::{EthHdr, EtherType},
    ip::Ipv4Hdr,
};

#[cfg(not(test))]
#[panic_handler]
fn panic(_info: &core::panic::PanicInfo) -> ! {
    loop {}
}

#[map] // (1)
static BLOCKLIST: HashMap<u32, u32> =
    HashMap::<u32, u32>::with_max_entries(1024, 0);

#[xdp]
pub fn xdp_firewall(ctx: XdpContext) -> u32 {
    match try_xdp_firewall(ctx) {
        Ok(ret) => ret,
        Err(_) => xdp_action::XDP_ABORTED,
    }
}

#[inline(always)]
unsafe fn ptr_at<T>(ctx: &XdpContext, offset: usize) -> Result<*const T, ()> {
    let (start, end) = (ctx.data(), ctx.data_end());
    let len = mem::size_of::<T>();

    if start + offset + len > end {
        return Err(());
    }

    // Hand back the raw pointer; callers dereference it inside `unsafe`.
    Ok((start + offset) as *const T)
}

// (2)
fn block_ip(address: u32) -> bool {
    unsafe { BLOCKLIST.get(&address).is_some() }
}

fn try_xdp_firewall(ctx: XdpContext) -> Result<u32, ()> {
    let ethhdr: *const EthHdr = unsafe { ptr_at(&ctx, 0)? };
    match unsafe { (*ethhdr).ether_type() } {
        Ok(EtherType::Ipv4) => {}
        _ => return Ok(xdp_action::XDP_PASS),
    }

    let ipv4hdr: *const Ipv4Hdr = unsafe { ptr_at(&ctx, EthHdr::LEN)? };
    let source = u32::from_be_bytes(unsafe { (*ipv4hdr).src_addr });

    // (3)
    let action = if block_ip(source) {
        xdp_action::XDP_DROP
    } else {
        xdp_action::XDP_PASS
    };
    info!(&ctx, "SRC: {:i}, ACTION: {}", source, action);

    Ok(action)
}
  1. Create our map
  2. Check if we should allow or deny our packet
  3. Return the correct action

Populating our map from userspace

In order to add the addresses to block, we first need to get a reference to the BLOCKLIST map. Once we have it, it’s simply a case of calling blocklist.insert(). We’ll use the Ipv4Addr type to represent our IP address as it’s human-readable and can be easily converted to a u32. We’ll block all traffic originating from 1.1.1.1 in this example.

Note

IP addresses are always encoded in network byte order (big endian) within packets. In our eBPF program, before checking the blocklist, we convert them to host endian using u32::from_be_bytes. Therefore it’s correct to write our IP addresses in host endian format from userspace.

The other approach would work too: we could convert IPs to network endian when inserting from userspace, and then we wouldn’t need to convert when indexing from the eBPF program.
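The agreement between the two sides can be checked with a short sketch in plain Rust. The helper names userspace_key and ebpf_key are hypothetical; the conversions themselves are the ones used in this chapter.

```rust
use std::net::Ipv4Addr;

/// Key as computed in user space: `Ipv4Addr` converts to a host-order `u32`.
pub fn userspace_key(addr: Ipv4Addr) -> u32 {
    u32::from(addr)
}

/// Key as computed in the eBPF program from the four address bytes on the wire.
pub fn ebpf_key(wire_bytes: [u8; 4]) -> u32 {
    u32::from_be_bytes(wire_bytes)
}
```

Both functions produce the same u32 for the same address, so a key inserted from user space matches the one looked up in the eBPF program.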

Here’s how the userspace code looks:

use anyhow::Context;
use aya::{
    maps::HashMap,
    programs::{Xdp, XdpFlags},
};
use aya_log::EbpfLogger;
use clap::Parser;
use log::{info, warn};
use std::net::Ipv4Addr;
use tokio::signal;

#[derive(Debug, Parser)]
struct Opt {
    #[clap(short, long, default_value = "eth0")]
    iface: String,
}

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    let opt = Opt::parse();

    env_logger::init();

    // This will include your eBPF object file as raw bytes at compile-time and load it at
    // runtime. This approach is recommended for most real-world use cases. If you would
    // like to specify the eBPF program at runtime rather than at compile-time, you can
    // reach for `Ebpf::load_file` instead.
    let mut bpf = aya::Ebpf::load(aya::include_bytes_aligned!(concat!(
        env!("OUT_DIR"),
        "/xdp-drop"
    )))?;
    match EbpfLogger::init(&mut bpf) {
        Err(e) => {
            // This can happen if you remove all log statements from your eBPF program.
            warn!("failed to initialize eBPF logger: {e}");
        }
        Ok(logger) => {
            let mut logger = tokio::io::unix::AsyncFd::with_interest(
                logger,
                tokio::io::Interest::READABLE,
            )?;
            tokio::task::spawn(async move {
                loop {
                    let mut guard = logger.readable_mut().await.unwrap();
                    guard.get_inner_mut().flush();
                    guard.clear_ready();
                }
            });
        }
    }
    let program: &mut Xdp =
        bpf.program_mut("xdp_firewall").unwrap().try_into()?;
    program.load()?;
    program.attach(&opt.iface, XdpFlags::default())
        .context("failed to attach the XDP program with default flags - try changing XdpFlags::default() to XdpFlags::SKB_MODE")?;

    // (1)
    let mut blocklist: HashMap<_, u32, u32> =
        HashMap::try_from(bpf.map_mut("BLOCKLIST").unwrap())?;

    // (2)
    let block_addr: u32 = Ipv4Addr::new(1, 1, 1, 1).into();

    // (3)
    blocklist.insert(block_addr, 0, 0)?;

    info!("Waiting for Ctrl-C...");
    signal::ctrl_c().await?;
    info!("Exiting...");

    Ok(())
}
  1. Get a reference to the map
  2. Create an Ipv4Addr
  3. Write this to our map

Running the program

$ RUST_LOG=info cargo run --config 'target."cfg(all())".runner="sudo -E"'
[2022-10-04T12:46:05Z INFO  xdp_drop] SRC: 1.1.1.1, ACTION: 1
[2022-10-04T12:46:05Z INFO  xdp_drop] SRC: 192.168.1.21, ACTION: 2
[2022-10-04T12:46:05Z INFO  xdp_drop] SRC: 192.168.1.21, ACTION: 2
[2022-10-04T12:46:05Z INFO  xdp_drop] SRC: 18.168.253.132, ACTION: 2
[2022-10-04T12:46:05Z INFO  xdp_drop] SRC: 1.1.1.1, ACTION: 1
[2022-10-04T12:46:05Z INFO  xdp_drop] SRC: 18.168.253.132, ACTION: 2
[2022-10-04T12:46:05Z INFO  xdp_drop] SRC: 18.168.253.132, ACTION: 2
[2022-10-04T12:46:05Z INFO  xdp_drop] SRC: 1.1.1.1, ACTION: 1
[2022-10-04T12:46:05Z INFO  xdp_drop] SRC: 140.82.121.6, ACTION: 2

Working With Aya

This chapter covers some of the more advanced concepts of working with Aya.

Program Lifecycle

In Aya, an instance of the Ebpf type manages the lifetime of all the eBPF objects created through it.

Consider the following example:

use aya::Ebpf;
use aya::programs::{Xdp, XdpFlags};

fn main() -> Result<(), anyhow::Error> {
    {
        // (1)
        let mut bpf = Ebpf::load_file("bpf.o")?;

        let program: &mut Xdp = bpf.program_mut("xdp").unwrap().try_into()?;
        // (2)
        program.load()?;
        // (3)
        program.attach("eth0", XdpFlags::default())?;
    }
    // (4)

    Ok(())
}
  1. When you call load or load_file, all the maps referenced by the eBPF code are created and stored inside the returned Ebpf instance.
  2. Similarly, when you load a program into the kernel, it’s stored inside the Ebpf instance.
  3. When you attach a program, it stays attached until the parent Ebpf instance gets dropped.
  4. At this point the bpf variable has been dropped. Our program and maps are detached/unloaded.
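The scoping behaviour described above is ordinary Rust ownership. The following sketch models it with std only: Loader and Attachment are hypothetical stand-ins for the owning instance and an attached program, and the shared log exists purely so the order of events can be observed. It does not use or reproduce Aya's internals.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical stand-in for an attached program; records its own teardown.
pub struct Attachment {
    name: &'static str,
    log: Rc<RefCell<Vec<String>>>,
}

impl Drop for Attachment {
    fn drop(&mut self) {
        // A real loader would detach the program from its hook here.
        self.log.borrow_mut().push(format!("detached {}", self.name));
    }
}

/// Hypothetical stand-in for the owning instance: everything attached
/// through it lives exactly as long as it does.
pub struct Loader {
    attachments: Vec<Attachment>,
    log: Rc<RefCell<Vec<String>>>,
}

impl Loader {
    pub fn new(log: Rc<RefCell<Vec<String>>>) -> Self {
        Loader { attachments: Vec::new(), log }
    }

    pub fn attach(&mut self, name: &'static str) {
        self.log.borrow_mut().push(format!("attached {name}"));
        self.attachments.push(Attachment { name, log: Rc::clone(&self.log) });
    }
}
```

When a Loader goes out of scope, its Vec of attachments is dropped with it, and each Attachment records its detach. This is why a long-running user-space program keeps the owning instance alive until it is ready to exit.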

Reading Values From A Context

This page is a work in progress, please feel free to open a Pull Request!

Using aya-tool

Note

Full code for the example in this chapter is available on GitHub.

Very often you will need type definitions from the source code of your running Linux kernel. For example, you might need a definition of task_struct because you are about to write a BPF program that receives information about a newly scheduled process/task. Aya doesn’t provide a definition of this structure, so how do we get one? We also need that definition in Rust, not in C.

That’s what aya-tool is designed for. It’s a tool that generates Rust bindings for specific kernel structures.

It can be installed with the following commands:

cargo install bindgen-cli
cargo install --git https://github.com/aya-rs/aya -- aya-tool

Ensure that you have bpftool and bindgen installed on your system; aya-tool will not work without them.

The syntax of the command is:

$ aya-tool
aya-tool

USAGE:
    aya-tool <SUBCOMMAND>

OPTIONS:
    -h, --help    Print help information

SUBCOMMANDS:
    generate    Generate Rust bindings to Kernel types using bpftool
    help        Print this message or the help of the given subcommand(s)

Let’s assume that we want to generate a Rust definition of task_struct. Let’s also assume that your project is called myapp, with the userspace part in the myapp subdirectory and the eBPF part in myapp-ebpf. We need to generate the bindings for the eBPF part, which can be done with:

aya-tool generate task_struct > myapp-ebpf/src/vmlinux.rs

Tip

You can also specify multiple types to generate, for example:

aya-tool generate task_struct dentry > vmlinux.rs

The remainder of this example focuses only on task_struct.

Then we can use vmlinux as a module with mod vmlinux in our eBPF program, like this:

#![no_std]
#![no_main]

#[allow(
    clippy::all,
    dead_code,
    improper_ctypes_definitions,
    non_camel_case_types,
    non_snake_case,
    non_upper_case_globals,
    unnecessary_transmutes,
    unsafe_op_in_unsafe_fn,
)]
#[rustfmt::skip]
mod vmlinux;

use aya_ebpf::{
    cty::{c_int, c_ulong},
    macros::{lsm, map},
    maps::HashMap,
    programs::LsmContext,
};

use vmlinux::task_struct;

#[map]
static PROCESSES: HashMap<i32, i32> = HashMap::with_max_entries(32768, 0);

#[lsm(hook = "task_alloc")]
pub fn task_alloc(ctx: LsmContext) -> i32 {
    match unsafe { try_task_alloc(ctx) } {
        Ok(ret) => ret,
        Err(ret) => ret,
    }
}

unsafe fn try_task_alloc(ctx: LsmContext) -> Result<i32, i32> {
    let (pid, _clone_flags, retval): (_, c_ulong, c_int) = unsafe {
        let task: *const task_struct = ctx.arg(0);
        ((*task).pid, ctx.arg(1), ctx.arg(2))
    };
    // Save the PID of a new process in map.
    PROCESSES.insert(&pid, &pid, 0).map_err(|e| e as i32)?;

    // Handle results of previous LSM programs.
    if retval != 0 {
        return Ok(retval);
    }

    Ok(0)
}

#[cfg(not(test))]
#[panic_handler]
fn panic(_info: &core::panic::PanicInfo) -> ! {
    loop {}
}

Portability and different kernel versions

Structures generated by aya-tool are portable across different Linux kernel versions thanks to a mechanism called BPF CO-RE. The structures are not simply generated from kernel headers. However, the target kernel (regardless of version) must have the CONFIG_DEBUG_INFO_BTF option enabled.
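
As a quick sanity check before deploying, you can test whether the target kernel exposes its BTF. The sketch below assumes the conventional location /sys/kernel/btf/vmlinux, where kernels built with CONFIG_DEBUG_INFO_BTF publish it; the helper name `kernel_btf_available` is made up for this example and is not part of Aya or aya-tool.

```rust
use std::path::Path;

/// Returns true if something exists at `btf_path`.
/// Kernels built with CONFIG_DEBUG_INFO_BTF normally expose their BTF
/// at `/sys/kernel/btf/vmlinux` (assumed default in `main` below).
fn kernel_btf_available(btf_path: &Path) -> bool {
    btf_path.exists()
}

fn main() {
    let path = Path::new("/sys/kernel/btf/vmlinux");
    if kernel_btf_available(path) {
        println!("kernel BTF found at {}", path.display());
    } else {
        println!("no kernel BTF found at {}", path.display());
    }
}
```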

Using aya-log

This page is a work in progress, please feel free to open a Pull Request!

Program Types

This section contains information about concrete types of eBPF programs which can be written and loaded using Aya.

Probes

Note

Full code for the example in this chapter is available on GitHub.

What are the probes in eBPF?

Probe BPF programs attach to kernel (kprobes) or user-space (uprobes) functions and can access the parameters of those functions. You can find more information about probes in the kernel documentation, including the difference between kprobes and kretprobes.

Example project

To illustrate kprobes with Aya, let’s write a program which attaches an eBPF handler to the tcp_connect function and prints the source and destination IP addresses from the socket parameter.

Design

For this demo program, we are going to rely on aya-log to print IP addresses from the BPF program, and we are not going to have any custom BPF maps (besides those created by aya-log).

eBPF code

  • From the tcp_connect signature, we see that struct sock *sk is the only function parameter. We will access it from the ProbeContext ctx handle.
  • We call the bpf_probe_read_kernel helper to copy the struct sock_common __sk_common portion of the socket structure. (For uprobe programs, we would need to call bpf_probe_read_user instead.)
  • We match on the skc_family field, and for the AF_INET (IPv4) and AF_INET6 (IPv6) values, extract and print the source and destination addresses using the aya-log info! macro.

Here’s what the eBPF code looks like:

#![no_std]
#![no_main]

#[allow(
    clippy::all,
    dead_code,
    improper_ctypes_definitions,
    non_camel_case_types,
    non_snake_case,
    non_upper_case_globals,
    unnecessary_transmutes,
    unsafe_op_in_unsafe_fn,
)]
#[rustfmt::skip]
mod vmlinux;

use crate::vmlinux::{sock, sock_common};

use aya_ebpf::{
    helpers::bpf_probe_read_kernel, macros::kprobe, programs::ProbeContext,
};
use aya_log_ebpf::info;

const AF_INET: u16 = 2;
const AF_INET6: u16 = 10;

#[kprobe]
pub fn kprobetcp(ctx: ProbeContext) -> u32 {
    match try_kprobetcp(ctx) {
        Ok(ret) => ret,
        Err(ret) => ret.try_into().unwrap_or(1),
    }
}

fn try_kprobetcp(ctx: ProbeContext) -> Result<u32, i64> {
    let sock: *mut sock = ctx.arg(0).ok_or(1i64)?;
    let sk_common = unsafe {
        bpf_probe_read_kernel(&(*sock).__sk_common as *const sock_common)
    }?;
    match sk_common.skc_family {
        AF_INET => {
            let src_addr = u32::from_be(unsafe {
                sk_common.__bindgen_anon_1.__bindgen_anon_1.skc_rcv_saddr
            });
            let dest_addr: u32 = u32::from_be(unsafe {
                sk_common.__bindgen_anon_1.__bindgen_anon_1.skc_daddr
            });
            info!(
                &ctx,
                "AF_INET src address: {:i}, dest address: {:i}",
                src_addr,
                dest_addr,
            );
            Ok(0)
        }
        AF_INET6 => {
            let src_addr = sk_common.skc_v6_rcv_saddr;
            let dest_addr = sk_common.skc_v6_daddr;
            info!(
                &ctx,
                "AF_INET6 src addr: {:i}, dest addr: {:i}",
                unsafe { src_addr.in6_u.u6_addr8 },
                unsafe { dest_addr.in6_u.u6_addr8 }
            );
            Ok(0)
        }
        _ => Ok(0),
    }
}

#[cfg(not(test))]
#[panic_handler]
fn panic(_info: &core::panic::PanicInfo) -> ! {
    loop {}
}
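
The u32::from_be calls in the IPv4 branch exist because the kernel stores socket addresses in network byte order. The user-space sketch below shows the same conversion with the standard library; the function names are illustrative and do not appear in the example project.

```rust
use std::net::{Ipv4Addr, Ipv6Addr};

/// Converts an IPv4 address held in network byte order (as the kernel
/// stores it in the socket) into a printable address.
fn ipv4_from_network_order(raw: u32) -> Ipv4Addr {
    Ipv4Addr::from(u32::from_be(raw))
}

/// Converts the 16 raw bytes of an IPv6 address into a printable address.
fn ipv6_from_bytes(raw: [u8; 16]) -> Ipv6Addr {
    Ipv6Addr::from(raw)
}

fn main() {
    // Simulate the in-memory value the kernel would hold for 10.53.149.148.
    let raw_v4 = u32::from_ne_bytes([10, 53, 149, 148]);
    println!("{}", ipv4_from_network_order(raw_v4));

    // Simulate the loopback address ::1.
    let mut raw_v6 = [0u8; 16];
    raw_v6[15] = 1;
    println!("{}", ipv6_from_bytes(raw_v6));
}
```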

Userspace code

The purpose of the userspace code is to load the eBPF program and attach it to the tcp_connect function.

Here’s what the code looks like:

use aya::programs::KProbe;
use aya_log::EbpfLogger;
use clap::Parser;
use log::{info, warn};
use tokio::signal;

#[derive(Debug, Parser)]
struct Opt {}

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    let _opt = Opt::parse();

    env_logger::init();

    // This will include your eBPF object file as raw bytes at compile-time and load it at
    // runtime. This approach is recommended for most real-world use cases. If you would
    // like to specify the eBPF program at runtime rather than at compile-time, you can
    // reach for `Ebpf::load_file` instead.
    let mut bpf = aya::Ebpf::load(aya::include_bytes_aligned!(concat!(
        env!("OUT_DIR"),
        "/kprobetcp"
    )))?;
    match EbpfLogger::init(&mut bpf) {
        Err(e) => {
            // This can happen if you remove all log statements from your eBPF program.
            warn!("failed to initialize eBPF logger: {e}");
        }
        Ok(logger) => {
            let mut logger = tokio::io::unix::AsyncFd::with_interest(
                logger,
                tokio::io::Interest::READABLE,
            )?;
            tokio::task::spawn(async move {
                loop {
                    let mut guard = logger.readable_mut().await.unwrap();
                    guard.get_inner_mut().flush();
                    guard.clear_ready();
                }
            });
        }
    }
    let program: &mut KProbe =
        bpf.program_mut("kprobetcp").unwrap().try_into()?;
    program.load()?;
    program.attach("tcp_connect", 0)?;

    info!("Waiting for Ctrl-C...");
    signal::ctrl_c().await?;
    info!("Exiting...");

    Ok(())
}

Running the program

$ RUST_LOG=info cargo run --config 'target."cfg(all())".runner="sudo -E"'
[2022-12-28T20:50:00Z INFO  kprobetcp] Waiting for Ctrl-C...
[2022-12-28T20:50:05Z INFO  kprobetcp] AF_INET6 src addr: 2001:4998:efeb:282::249, dest addr: 2606:2800:220:1:248:1893:25c8:1946
[2022-12-28T20:50:11Z INFO  kprobetcp] AF_INET src address: 10.53.149.148, dest address: 10.87.116.72
[2022-12-28T20:50:30Z INFO  kprobetcp] AF_INET src address: 10.53.149.148, dest address: 98.138.219.201

Tracepoints

This page is a work in progress, please feel free to open a Pull Request!

Sockets

This page is a work in progress, please feel free to open a Pull Request!

Classifiers

Note

Full code for the example in this chapter is available on GitHub.

What is Classifier in eBPF?

A Classifier is a type of eBPF program that is attached to a queuing discipline (often referred to as qdisc) in Linux kernel networking. It can therefore make decisions about packets that pass through the network interface associated with the qdisc.

For each network interface, there are separate qdiscs for ingress and egress traffic. When attaching a Classifier program to an interface, you choose whether it applies to ingress or egress traffic.

What’s the difference between Classifiers and XDP?

  • Classifier is older than XDP: it has been available since kernel 4.1, while XDP has been available since 4.8.
  • Classifier can inspect both ingress and egress traffic. XDP is limited to ingress.
  • XDP provides better performance, because it’s executed earlier - it receives a raw packet from the NIC driver, before it goes to any layers of kernel networking stack and gets parsed to the sk_buff structure.

Example project

To differ from the XDP example, let’s write a program which drops egress traffic.

Design

We’re going to:

  • Create a HashMap that will act as a blocklist.
  • Check the destination IP address from the packet against the HashMap to make a policy decision (pass or drop).
  • Add entries to the blocklist from userspace.

eBPF code

The program code starts with a definition of the BLOCKLIST map. To enforce the policy, the program looks up the destination IP address in that map. If a map entry for that address exists, we drop the packet. Otherwise, we pipe it with the TC_ACT_PIPE action, which allows it on our side but lets the packet also be inspected by other Classifier programs and qdisc filters.

Note

It is also possible to allow the packet while bypassing the other programs and filters, using TC_ACT_OK. We recommend that option only if you are absolutely sure that you want your program to take precedence over the other programs or filters.

Here’s what the eBPF code looks like:

#![no_std]
#![no_main]

use aya_ebpf::{
    bindings::{TC_ACT_PIPE, TC_ACT_SHOT},
    macros::{classifier, map},
    maps::HashMap,
    programs::TcContext,
};
use aya_log_ebpf::info;
use network_types::{
    eth::{EthHdr, EtherType},
    ip::Ipv4Hdr,
};

#[map] // (1)
static BLOCKLIST: HashMap<u32, u32> = HashMap::with_max_entries(1024, 0);

#[classifier]
pub fn tc_egress(ctx: TcContext) -> i32 {
    match try_tc_egress(ctx) {
        Ok(ret) => ret,
        Err(_) => TC_ACT_SHOT,
    }
}

// (2)
fn block_ip(address: u32) -> bool {
    unsafe { BLOCKLIST.get(&address).is_some() }
}

fn try_tc_egress(ctx: TcContext) -> Result<i32, ()> {
    let ethhdr: EthHdr = ctx.load(0).map_err(|_| ())?;
    match ethhdr.ether_type() {
        Ok(EtherType::Ipv4) => {}
        _ => return Ok(TC_ACT_PIPE),
    }

    let ipv4hdr: Ipv4Hdr = ctx.load(EthHdr::LEN).map_err(|_| ())?;
    let destination = u32::from_be_bytes(ipv4hdr.dst_addr);

    // (3)
    let action = if block_ip(destination) {
        TC_ACT_SHOT
    } else {
        TC_ACT_PIPE
    };

    info!(&ctx, "DEST {:i}, ACTION {}", destination, action);

    Ok(action)
}

#[cfg(not(test))]
#[panic_handler]
fn panic(_info: &core::panic::PanicInfo) -> ! {
    loop {}
}
  1. Create our map.
  2. Check if we should allow or deny our packet.
  3. Return the correct action.

Userspace code

The purpose of the userspace code is to load the eBPF program, attach it to the given network interface and then populate the map with an address to block.

In this example, we’ll block all egress traffic going to 1.1.1.1.

Here’s what the code looks like:

use std::net::Ipv4Addr;

use aya::{
    maps::HashMap,
    programs::{SchedClassifier, TcAttachType, tc},
};
use aya_log::EbpfLogger;
use clap::Parser;
use log::{info, warn};
use tokio::signal;

#[derive(Debug, Parser)]
struct Opt {
    #[clap(short, long, default_value = "eth0")]
    iface: String,
}

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    let opt = Opt::parse();

    env_logger::init();

    // This will include your eBPF object file as raw bytes at compile-time and load it at
    // runtime. This approach is recommended for most real-world use cases. If you would
    // like to specify the eBPF program at runtime rather than at compile-time, you can
    // reach for `Ebpf::load_file` instead.
    let mut bpf = aya::Ebpf::load(aya::include_bytes_aligned!(concat!(
        env!("OUT_DIR"),
        "/tc-egress"
    )))?;
    match EbpfLogger::init(&mut bpf) {
        Err(e) => {
            // This can happen if you remove all log statements from your eBPF program.
            warn!("failed to initialize eBPF logger: {e}");
        }
        Ok(logger) => {
            let mut logger = tokio::io::unix::AsyncFd::with_interest(
                logger,
                tokio::io::Interest::READABLE,
            )?;
            tokio::task::spawn(async move {
                loop {
                    let mut guard = logger.readable_mut().await.unwrap();
                    guard.get_inner_mut().flush();
                    guard.clear_ready();
                }
            });
        }
    }
    // error adding clsact to the interface if it is already added is harmless
    // the full cleanup can be done with 'sudo tc qdisc del dev eth0 clsact'.
    let _ = tc::qdisc_add_clsact(&opt.iface);
    let program: &mut SchedClassifier =
        bpf.program_mut("tc_egress").unwrap().try_into()?;
    program.load()?;
    program.attach(&opt.iface, TcAttachType::Egress)?;

    // (1)
    let mut blocklist: HashMap<_, u32, u32> =
        HashMap::try_from(bpf.map_mut("BLOCKLIST").unwrap())?;

    // (2)
    let block_addr: u32 = Ipv4Addr::new(1, 1, 1, 1).into();

    // (3)
    blocklist.insert(block_addr, 0, 0)?;

    info!("Waiting for Ctrl-C...");
    signal::ctrl_c().await?;
    info!("Exiting...");

    Ok(())
}
  1. Get a reference to the map.
  2. Create an Ipv4Addr.
  3. Populate the map with the remote IP addresses to which we want to block egress traffic.

The third step is done by getting a reference to the BLOCKLIST map and calling blocklist.insert. Using the Ipv4Addr type in Rust lets us write the IP address in its human-readable form and convert it to u32, which is an appropriate type to use in eBPF maps.
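
The conversion from a human-readable address to a map key can be tried in isolation. This is a minimal sketch using only the standard library; `parse_blocklist_key` is a hypothetical helper name, not something the example project defines.

```rust
use std::net::{AddrParseError, Ipv4Addr};

/// Parses dotted-quad text and returns the u32 used as the map key.
/// `u32::from(Ipv4Addr)` yields a host-order integer whose most
/// significant byte is the first octet.
fn parse_blocklist_key(text: &str) -> Result<u32, AddrParseError> {
    let addr: Ipv4Addr = text.parse()?;
    Ok(u32::from(addr))
}

fn main() {
    match parse_blocklist_key("1.1.1.1") {
        Ok(key) => println!("key = {key:#010x}"),
        Err(err) => println!("invalid address: {err}"),
    }
}
```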

Running the program

$ RUST_LOG=info cargo run --config 'target."cfg(all())".runner="sudo -E"'
LOG: SRC 1.1.1.1, ACTION 2
LOG: SRC 35.186.224.47, ACTION 3
LOG: SRC 35.186.224.47, ACTION 3
LOG: SRC 1.1.1.1, ACTION 2
LOG: SRC 168.100.68.32, ACTION 3
LOG: SRC 168.100.68.239, ACTION 3
LOG: SRC 168.100.68.32, ACTION 3
LOG: SRC 168.100.68.239, ACTION 3
LOG: SRC 1.1.1.1, ACTION 2
LOG: SRC 13.248.212.111, ACTION 3
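
The ACTION column holds the kernel’s numeric TC action codes: 2 is TC_ACT_SHOT (drop) and 3 is TC_ACT_PIPE. The following user-space sketch models the same policy decision with a standard HashSet in place of the eBPF map; `egress_action` and `action_name` are illustrative names, not part of Aya.

```rust
use std::collections::HashSet;
use std::net::Ipv4Addr;

// Values as defined in the kernel's include/uapi/linux/pkt_cls.h.
const TC_ACT_OK: i32 = 0;
const TC_ACT_SHOT: i32 = 2;
const TC_ACT_PIPE: i32 = 3;

/// User-space model of the classifier's policy decision.
fn egress_action(blocklist: &HashSet<u32>, destination: Ipv4Addr) -> i32 {
    if blocklist.contains(&u32::from(destination)) {
        TC_ACT_SHOT
    } else {
        TC_ACT_PIPE
    }
}

/// Maps a numeric action code back to its symbolic name.
fn action_name(action: i32) -> &'static str {
    match action {
        TC_ACT_OK => "TC_ACT_OK",
        TC_ACT_SHOT => "TC_ACT_SHOT",
        TC_ACT_PIPE => "TC_ACT_PIPE",
        _ => "other",
    }
}

fn main() {
    let mut blocklist = HashSet::new();
    blocklist.insert(u32::from(Ipv4Addr::new(1, 1, 1, 1)));

    for dest in [Ipv4Addr::new(1, 1, 1, 1), Ipv4Addr::new(35, 186, 224, 47)] {
        let action = egress_action(&blocklist, dest);
        println!("{dest}: {action} ({})", action_name(action));
    }
}
```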

Cgroups

This page is a work in progress, please feel free to open a Pull Request!

Cgroup SKB

Note

Full code for the example in this chapter is available on GitHub.

What is Cgroup SKB?

Cgroup SKB programs are attached to v2 cgroups and get triggered by network traffic (egress or ingress) associated with processes inside the given cgroup. They make it possible to intercept and filter the traffic associated with particular cgroups (and therefore containers).

What’s the difference between Cgroup SKB and Classifiers?

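Both Cgroup SKB and Classifier programs operate on the same kernel socket buffer (__sk_buff). In Aya, Cgroup SKB programs receive it as SkBuffContext, while Classifiers receive it as TcContext.

The difference is that Classifiers are attached to a network interface, while Cgroup SKB programs are attached to a cgroup.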

Example project

This example will be similar to the Classifier example: a program which drops egress traffic, but only for a specific cgroup.

Design

We’re going to:

  • Create a HashMap that will act as a blocklist.
  • Check the destination IP address from the packet against the HashMap to make a policy decision (pass or drop).
  • Add entries to the blocklist from userspace.

Using network types

In this example, we are going to read the IPv4 protocol header. We need to use the network-types crate that provides it, as well as many other type definitions of the common Internet protocols.

Let’s add it to cgroup-skb-egress-ebpf/Cargo.toml:

[package]
name = "cgroup-skb-egress-ebpf"
version = "0.1.0"
edition.workspace = true

[dependencies]
aya-ebpf = { git = "https://github.com/aya-rs/aya" }
aya-log-ebpf = { git = "https://github.com/aya-rs/aya" }
cgroup-skb-egress-common = { path = "../cgroup-skb-egress-common" }
memoffset = "0.9"
network-types = "0.1.0"

[build-dependencies]
which = { version = "8.0.0", default-features = false, features = ["real-sys"] }

[[bin]]
name = "cgroup-skb-egress"
path = "src/main.rs"

eBPF code

The program starts with a definition of the BLOCKLIST map. To enforce the policy, the program looks up the destination IP address in that map. If a map entry for that address exists, we drop the packet by returning 0. Otherwise, we accept it by returning 1.

Here’s what the eBPF code looks like:

#![no_std]
#![no_main]

use aya_ebpf::{
    macros::{cgroup_skb, map},
    maps::{HashMap, PerfEventArray},
    programs::SkBuffContext,
};
use memoffset::offset_of;
use network_types::ip::Ipv4Hdr;

use cgroup_skb_egress_common::PacketLog;

#[map]
static EVENTS: PerfEventArray<PacketLog> = PerfEventArray::new(0);

#[map] // (1)
static BLOCKLIST: HashMap<u32, u32> = HashMap::with_max_entries(1024, 0);

#[cgroup_skb]
pub fn cgroup_skb_egress(ctx: SkBuffContext) -> i32 {
    try_cgroup_skb_egress(ctx).unwrap_or(0)
}

// (2)
fn block_ip(address: u32) -> bool {
    unsafe { BLOCKLIST.get(&address).is_some() }
}

fn try_cgroup_skb_egress(ctx: SkBuffContext) -> Result<i32, i64> {
    let protocol = unsafe { (*ctx.skb.skb).protocol };
    if protocol != ETH_P_IP {
        return Ok(1);
    }

    let destination =
        u32::from_be_bytes(ctx.load(offset_of!(Ipv4Hdr, dst_addr))?);

    // (3)
    let action = if block_ip(destination) { 0 } else { 1 };

    let log_entry = PacketLog {
        ipv4_address: destination,
        action,
    };
    EVENTS.output(&ctx, &log_entry, 0);
    Ok(action)
}

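// ETH_P_IP is 0x0800. The protocol field is stored in network byte order,
// so on a little-endian host it reads as 0x0008.
const ETH_P_IP: u32 = 8;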

#[cfg(not(test))]
#[panic_handler]
fn panic(_info: &core::panic::PanicInfo) -> ! {
    loop {}
}
  1. Create our map.
  2. Check if we should allow or deny our packet.
  3. Return the correct action.
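
The ctx.load call reads the destination address at its offset inside the IPv4 header, with no Ethernet header in front of it. A minimal user-space sketch of that offset calculation follows. `Ipv4HeaderSketch` is a simplified, hypothetical layout written for this example (the real program uses network_types::ip::Ipv4Hdr), and the sketch relies on std::mem::offset_of!, which needs Rust 1.77 or newer.

```rust
use std::mem::offset_of;

/// Hypothetical, simplified IPv4 header layout used only in this sketch.
#[allow(dead_code)]
#[repr(C)]
struct Ipv4HeaderSketch {
    version_ihl: u8,
    tos: u8,
    total_len: [u8; 2],
    id: [u8; 2],
    frag_off: [u8; 2],
    ttl: u8,
    protocol: u8,
    checksum: [u8; 2],
    src_addr: [u8; 4],
    dst_addr: [u8; 4],
}

/// Byte offset of the destination address inside the header.
const DST_ADDR_OFFSET: usize = offset_of!(Ipv4HeaderSketch, dst_addr);

/// Reads the destination address from a buffer that starts at the IPv4
/// header and returns it as a host-order integer, or None if too short.
fn destination(packet: &[u8]) -> Option<u32> {
    let bytes = packet.get(DST_ADDR_OFFSET..DST_ADDR_OFFSET + 4)?;
    Some(u32::from_be_bytes(bytes.try_into().ok()?))
}

fn main() {
    let mut packet = [0u8; 20];
    packet[16..20].copy_from_slice(&[1, 1, 1, 1]);
    println!("offset = {DST_ADDR_OFFSET}, destination = {:?}", destination(&packet));
}
```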

Userspace code

The purpose of the userspace code is to load the eBPF program, attach it to the cgroup and then populate the map with an address to block.

In this example, we’ll block all egress traffic going to 1.1.1.1.

Here’s what the code looks like:

use std::net::Ipv4Addr;

use aya::{
    maps::{
        HashMap,
        perf::{Events, PerfEventArray},
    },
    programs::{CgroupAttachMode, CgroupSkb, CgroupSkbAttachType},
    util::online_cpus,
};
use bytes::BytesMut;
use clap::Parser;
use log::info;
use tokio::{signal, task};

use cgroup_skb_egress_common::PacketLog;

#[derive(Debug, Parser)]
struct Opt {
    #[clap(short, long, default_value = "/sys/fs/cgroup/unified")]
    cgroup_path: String,
}

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    let opt = Opt::parse();

    env_logger::init();

    // This will include your eBPF object file as raw bytes at compile-time and load it at
    // runtime. This approach is recommended for most real-world use cases. If you would
    // like to specify the eBPF program at runtime rather than at compile-time, you can
    // reach for `Ebpf::load_file` instead.
    let mut bpf = aya::Ebpf::load(aya::include_bytes_aligned!(concat!(
        env!("OUT_DIR"),
        "/cgroup-skb-egress"
    )))?;
    let program: &mut CgroupSkb =
        bpf.program_mut("cgroup_skb_egress").unwrap().try_into()?;
    let cgroup = std::fs::File::open(opt.cgroup_path)?;
    // (1)
    program.load()?;
    // (2)
    program.attach(
        cgroup,
        CgroupSkbAttachType::Egress,
        CgroupAttachMode::Single,
    )?;

    let mut blocklist: HashMap<_, u32, u32> =
        HashMap::try_from(bpf.map_mut("BLOCKLIST").unwrap())?;

    let block_addr: u32 = Ipv4Addr::new(1, 1, 1, 1).into();

    // (3)
    blocklist.insert(block_addr, 0, 0)?;

    let mut perf_array =
        PerfEventArray::try_from(bpf.take_map("EVENTS").unwrap())?;

    for cpu_id in online_cpus().map_err(|(_, error)| error)? {
        let buf = perf_array.open(cpu_id, None)?;
        let mut buf = tokio::io::unix::AsyncFd::with_interest(
            buf,
            tokio::io::Interest::READABLE,
        )?;

        task::spawn(async move {
            let mut buffers =
                std::iter::repeat_with(|| BytesMut::with_capacity(1024))
                    .take(10)
                    .collect::<Vec<_>>();

            loop {
                let mut guard = buf.readable_mut().await.unwrap();
                loop {
                    let Events { read, lost: _ } = guard
                        .get_inner_mut()
                        .read_events(&mut buffers)
                        .unwrap();
                    for buf in buffers.iter_mut().take(read) {
                        let ptr = buf.as_ptr() as *const PacketLog;
                        let data = unsafe { ptr.read_unaligned() };
                        // The eBPF program logs the destination address.
                        let dst_addr = Ipv4Addr::from(data.ipv4_address);
                        info!("LOG: DST {}, ACTION {}", dst_addr, data.action);
                    }
                    if read != buffers.len() {
                        break;
                    }
                }
                guard.clear_ready();
            }
        });
    }

    info!("Waiting for Ctrl-C...");
    signal::ctrl_c().await?;
    info!("Exiting...");

    Ok(())
}
  1. Loading the eBPF program.
  2. Attaching it to the given cgroup.
  3. Populating the map with the remote IP addresses to which we want to block egress traffic.

The third step is done by getting a reference to the BLOCKLIST map and calling blocklist.insert. Using the Ipv4Addr type in Rust lets us write the IP address in its human-readable form and convert it to u32, which is an appropriate type to use in eBPF maps.
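
The perf buffer loop turns raw bytes back into a PacketLog with read_unaligned. The sketch below isolates that step. The PacketLog layout shown here is an assumption inferred from how the fields are used in this chapter (the real definition lives in the cgroup-skb-egress-common crate), and the length check is an addition that the example code does not perform.

```rust
use std::mem::size_of;

/// Assumed layout of the shared log record; inferred, not copied from the
/// cgroup-skb-egress-common crate.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
struct PacketLog {
    ipv4_address: u32,
    action: i32,
}

/// Decodes one record from the start of `buf`, or returns None if the
/// buffer is too short to hold a full record.
fn decode(buf: &[u8]) -> Option<PacketLog> {
    if buf.len() < size_of::<PacketLog>() {
        return None;
    }
    // SAFETY: the length was checked above, PacketLog contains only plain
    // integers, and read_unaligned tolerates any alignment.
    Some(unsafe { (buf.as_ptr() as *const PacketLog).read_unaligned() })
}

fn main() {
    let mut buf = Vec::new();
    buf.extend_from_slice(&0x0101_0101u32.to_ne_bytes());
    buf.extend_from_slice(&0i32.to_ne_bytes());
    println!("{:?}", decode(&buf));
}
```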

Testing the program

First, check where cgroups v2 are mounted:

$ mount | grep cgroup2
cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate,memory_recursiveprot)

The most common locations are either /sys/fs/cgroup or /sys/fs/cgroup/unified.

Inside that location, we need to create our new cgroup (as root):

mkdir /sys/fs/cgroup/foo

Then run the program with:

RUST_LOG=info cargo run --config 'target."cfg(all())".runner="sudo -E"'

And then, in a separate terminal, as root, try to access 1.1.1.1:

bash -c "echo \$$ >> /sys/fs/cgroup/foo/cgroup.procs && curl 1.1.1.1"

That command should hang and the logs of our program should look like:

LOG: DST 1.1.1.1, ACTION 0
LOG: DST 1.1.1.1, ACTION 0

On the other hand, accessing any other address should be successful, for example:

$ bash -c "echo \$$ >> /sys/fs/cgroup/foo/cgroup.procs && curl google.com"
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>301 Moved</TITLE></HEAD><BODY>
<H1>301 Moved</H1>
The document has moved
<A HREF="http://www.google.com/">here</A>.
</BODY></HTML>

And should result in the following logs:

LOG: DST 192.168.88.10, ACTION 1
LOG: DST 192.168.88.10, ACTION 1
LOG: DST 172.217.19.78, ACTION 1
LOG: DST 172.217.19.78, ACTION 1
LOG: DST 172.217.19.78, ACTION 1
LOG: DST 172.217.19.78, ACTION 1
LOG: DST 172.217.19.78, ACTION 1
LOG: DST 172.217.19.78, ACTION 1

XDP

Note

Full code for the example in this chapter is available on GitHub.

What is XDP in eBPF?

XDP (eXpress Data Path) is a type of eBPF program that attaches to the network interface. It enables filtering, manipulation and redirection of network packets as soon as they are received from the network driver, even before they enter the Linux kernel networking stack, resulting in low latency and high throughput.

The idea behind XDP is to add an early hook in the RX path of the kernel, and let a user supplied eBPF program decide the fate of the packet. The hook is placed in the NIC driver just after the interrupt processing, and before any memory allocation needed by the network stack itself.

The XDP program is allowed to edit the packet data and, after the XDP program returns, an action code determines what to do with the packet:

  • XDP_PASS: let the packet continue through the network stack
  • XDP_DROP: silently drop the packet
  • XDP_ABORTED: drop the packet and raise a tracepoint exception
  • XDP_TX: bounce the packet back to the same NIC it arrived on
  • XDP_REDIRECT: redirect the packet to another NIC or user space socket via the AF_XDP address family
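
Each action code is a small integer returned by the program. As a rough illustration, the enum below mirrors the numeric values from the kernel’s bpf.h; the `XdpAction` type and its methods are written for this sketch and are not the aya_ebpf bindings, which expose the codes as constants under bindings::xdp_action.

```rust
/// Illustrative mirror of the kernel's XDP action codes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u32)]
enum XdpAction {
    Aborted = 0,
    Drop = 1,
    Pass = 2,
    Tx = 3,
    Redirect = 4,
}

impl XdpAction {
    /// Converts a raw return value into a known action, if there is one.
    fn from_code(code: u32) -> Option<Self> {
        match code {
            0 => Some(Self::Aborted),
            1 => Some(Self::Drop),
            2 => Some(Self::Pass),
            3 => Some(Self::Tx),
            4 => Some(Self::Redirect),
            _ => None,
        }
    }

    /// Only XDP_PASS hands the packet on to the regular network stack.
    fn continues_to_stack(self) -> bool {
        matches!(self, Self::Pass)
    }
}

fn main() {
    for code in 0..=5u32 {
        match XdpAction::from_code(code) {
            Some(action) => println!("{code}: {action:?}"),
            None => println!("{code}: unknown"),
        }
    }
}
```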

AF_XDP

Along with XDP, a new address family entered the Linux kernel, starting at 4.18. AF_XDP, formerly known as AF_PACKETv4 (which was never included in the mainline kernel), is a raw socket optimized for high performance packet processing and allows zero-copy between kernel and applications. As the socket can be used for both receiving and transmitting, it supports high performance network applications purely in user-space.

If you want a more extensive explanation about AF_XDP, you can find it in the kernel documentation.

XDP operation modes

You can connect an XDP program to an interface using the following modes:

Generic XDP

  • XDP programs are loaded into the kernel as part of the ordinary network path
  • Doesn’t need support from the network card driver to function
  • Doesn’t provide full performance benefits
  • Easy way to test XDP programs

Native XDP

  • XDP programs are loaded by the network card driver as part of its initial receive path
  • Requires support from the network card driver to function
  • Default operation mode

Offloaded XDP

  • XDP programs are loaded directly on the NIC, and executed without using the CPU
  • Requires support from the NIC

Driver support for native XDP

For more information, please visit the Cilium XDP documentation under Drivers supporting native XDP.

Driver support for offloaded XDP

Currently, only the Netronome NFP drivers have support for offloaded XDP.

Example project

Now that you have a little more understanding about XDP, let’s follow up with a practical example. We are going to write a simple XDP Program that drops packets incoming from certain IPs.

Setting up the development environment

Make sure you already have the prerequisites.

Since we are writing an XDP program, we will use the XDP template (created with cargo generate):

cargo generate --name simple-xdp-program -d program_type=xdp \
    https://github.com/aya-rs/aya-template

Creating the eBPF component

First, we must create the eBPF component for our program. In this component, we will decide what to do with the incoming packets.

Since we want to drop the incoming packets from certain IPs, we are going to use the XDP_DROP action code whenever the IP is in our blocklist, and everything else will be treated with the XDP_PASS action code.

#![no_std]
#![no_main]

use aya_ebpf::{
    bindings::xdp_action,
    macros::{map, xdp},
    maps::HashMap,
    programs::XdpContext,
};

use aya_log_ebpf::info;

use core::mem;
use network_types::{
    eth::{EthHdr, EtherType},
    ip::Ipv4Hdr,
};

We import the necessary dependencies:

  • aya_ebpf: For XDP actions (bindings::xdp_action), the XDP context struct XdpContext (programs::XdpContext), map definitions (for our HashMap) and XDP program macros (macros::{map, xdp})
  • aya_log_ebpf: For logging within the eBPF program
  • core::mem: For mem::size_of, which the bounds check uses
  • network_types: For Ethernet and IP header definitions

Important

Make sure you add the network_types dependency in your Cargo.toml.

Here’s how the code looks:

#[cfg(not(test))]
#[panic_handler]
fn panic(_info: &core::panic::PanicInfo) -> ! {
    loop {}
}

An eBPF-compatible panic handler is provided because eBPF programs cannot use the default panic behavior.

#[map]
static BLOCKLIST: HashMap<u32, u32> = HashMap::with_max_entries(1024, 0);

Here, we define our blocklist with a HashMap, which stores integers (u32), with a maximum of 1024 entries.

#[xdp]
pub fn xdp_firewall(ctx: XdpContext) -> u32 {
    match try_xdp_firewall(ctx) {
        Ok(ret) => ret,
        Err(_) => xdp_action::XDP_ABORTED,
    }
}

The xdp_firewall function (picked up in user-space) accepts an XdpContext and returns a u32. It delegates the main packet processing logic to the try_xdp_firewall function. If an error occurs, the function returns xdp_action::XDP_ABORTED (which is equal to the u32 0).

#[inline(always)]
unsafe fn ptr_at<T>(
    ctx: &XdpContext, offset: usize
) -> Result<*const T, ()> {
    let start = ctx.data();
    let end = ctx.data_end();
    let len = mem::size_of::<T>();

    if start + offset + len > end {
        return Err(());
    }

    let ptr = (start + offset) as *const T;
    Ok(ptr)
}

Our ptr_at function is designed to provide safe access to a generic type T within an XdpContext at a specified offset. It performs bounds checking by comparing the desired memory range (start + offset + len) against the end of the data (end). If the access is within bounds, it returns a pointer to the specified type; otherwise, it returns an error. We are going to use this function to retrieve data from the XdpContext.
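
The same bounds check can be exercised in user space against a byte slice. This is only a model of the idea: it takes a `&[u8]` in place of XdpContext, so the `start` and `end` addresses come from the slice and not from ctx.data() and ctx.data_end().

```rust
use std::mem;

/// Bounds-checked pointer into `data`, mirroring the check in `ptr_at`.
fn ptr_at<T>(data: &[u8], offset: usize) -> Result<*const T, ()> {
    let start = data.as_ptr() as usize;
    let end = start + data.len();
    let len = mem::size_of::<T>();

    // Reject any access whose last byte would fall past the end of the data.
    if start + offset + len > end {
        return Err(());
    }

    Ok(data[offset..].as_ptr().cast())
}

fn main() {
    // 14 bytes: the size of an Ethernet header, with EtherType 0x0800 at the end.
    let frame = [0u8, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0x08, 0x00];
    match ptr_at::<[u8; 2]>(&frame, 12) {
        // SAFETY: ptr_at verified that two bytes are readable at this offset.
        Ok(ptr) => println!("EtherType bytes: {:02x?}", unsafe { ptr.read_unaligned() }),
        Err(()) => println!("out of bounds"),
    }
}
```

In the real program the check matters for a second reason: the eBPF verifier rejects packet accesses that are not preceded by a comparison against data_end.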

fn block_ip(address: u32) -> bool {
    unsafe { BLOCKLIST.get(&address).is_some() }
}

fn try_xdp_firewall(ctx: XdpContext) -> Result<u32, ()> {
    let ethhdr: *const EthHdr = unsafe { ptr_at(&ctx, 0)? };
    match unsafe { (*ethhdr).ether_type() } {
        Ok(EtherType::Ipv4) => {}
        _ => return Ok(xdp_action::XDP_PASS),
    }

    let ipv4hdr: *const Ipv4Hdr = unsafe { ptr_at(&ctx, EthHdr::LEN)? };
    let source = u32::from_be_bytes(unsafe { (*ipv4hdr).src_addr });

    let action = if block_ip(source) {
        xdp_action::XDP_DROP
    } else {
        xdp_action::XDP_PASS
    };
    info!(&ctx, "SRC: {:i}, ACTION: {}", source, action);

    Ok(action)
}

The block_ip function checks if a given IP address (address) exists in the blocklist.

As said before, try_xdp_firewall contains the main logic for our firewall. We first retrieve the Ethernet header from the XdpContext with the ptr_at function. The header is located at the beginning of the packet data, so we use 0 as the offset.

If the packet is not IPv4 (ether_type check), the function returns xdp_action::XDP_PASS and allows the packet to pass through the network stack.

ipv4hdr points to the IPv4 header, and source stores the source IP address read from it. We then check the address against our blocklist using the block_ip function we created earlier. If block_ip returns true, meaning that the IP is in the blocklist, we use the XDP_DROP action code so that the packet doesn’t reach the network stack; otherwise we let it pass with the XDP_PASS action code.

Lastly, we log the activity: SRC is the source IP address and ACTION is the action code that was applied to it. We then return Ok(action) as the result.
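
To experiment with this logic without loading anything into the kernel, the flow can be modeled in user space over a plain byte slice. In this sketch the HashSet stands in for the eBPF map, and `firewall_action` and `ipv4_frame` are hypothetical helpers written for this example; the header lengths assume an untagged Ethernet frame and an IPv4 header without options.

```rust
use std::collections::HashSet;

const XDP_DROP: u32 = 1;
const XDP_PASS: u32 = 2;
const ETH_HDR_LEN: usize = 14;
const IPV4_HDR_LEN: usize = 20;
const ETHERTYPE_IPV4: u16 = 0x0800;

/// User-space model of try_xdp_firewall: Err(()) means the frame was too short.
fn firewall_action(frame: &[u8], blocklist: &HashSet<u32>) -> Result<u32, ()> {
    let eth = frame.get(..ETH_HDR_LEN).ok_or(())?;
    if u16::from_be_bytes([eth[12], eth[13]]) != ETHERTYPE_IPV4 {
        return Ok(XDP_PASS);
    }

    let ip = frame.get(ETH_HDR_LEN..ETH_HDR_LEN + IPV4_HDR_LEN).ok_or(())?;
    // The source address sits 12 bytes into the IPv4 header.
    let source = u32::from_be_bytes([ip[12], ip[13], ip[14], ip[15]]);

    Ok(if blocklist.contains(&source) { XDP_DROP } else { XDP_PASS })
}

/// Builds a zeroed Ethernet + IPv4 frame with the given source address.
fn ipv4_frame(source: [u8; 4]) -> Vec<u8> {
    let mut frame = vec![0u8; ETH_HDR_LEN + IPV4_HDR_LEN];
    frame[12..14].copy_from_slice(&ETHERTYPE_IPV4.to_be_bytes());
    frame[ETH_HDR_LEN + 12..ETH_HDR_LEN + 16].copy_from_slice(&source);
    frame
}

fn main() {
    let blocklist: HashSet<u32> = [u32::from_be_bytes([1, 1, 1, 1])].into_iter().collect();
    for source in [[1, 1, 1, 1], [8, 8, 8, 8]] {
        let action = firewall_action(&ipv4_frame(source), &blocklist);
        println!("{source:?} -> {action:?}");
    }
}
```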

The full code:

#![no_std]
#![no_main]
#![allow(nonstandard_style, dead_code)]

use aya_ebpf::{
    bindings::xdp_action,
    macros::{map, xdp},
    maps::HashMap,
    programs::XdpContext,
};
use aya_log_ebpf::info;

use core::mem;
use network_types::{
    eth::{EthHdr, EtherType},
    ip::Ipv4Hdr,
};

#[cfg(not(test))]
#[panic_handler]
fn panic(_info: &core::panic::PanicInfo) -> ! {
    loop {}
}

#[map]
static BLOCKLIST: HashMap<u32, u32> = HashMap::with_max_entries(1024, 0);

#[xdp]
pub fn xdp_firewall(ctx: XdpContext) -> u32 {
    match try_xdp_firewall(ctx) {
        Ok(ret) => ret,
        Err(_) => xdp_action::XDP_ABORTED,
    }
}

#[inline(always)]
unsafe fn ptr_at<T>(
    ctx: &XdpContext, offset: usize,
) -> Result<*const T, ()> {
    let start = ctx.data();
    let end = ctx.data_end();
    let len = mem::size_of::<T>();

    if start + offset + len > end {
        return Err(());
    }

    let ptr = (start + offset) as *const T;
    Ok(ptr)
}

fn block_ip(address: u32) -> bool {
    unsafe { BLOCKLIST.get(&address).is_some() }
}

fn try_xdp_firewall(ctx: XdpContext) -> Result<u32, ()> {
    let ethhdr: *const EthHdr = unsafe { ptr_at(&ctx, 0)? };
    match unsafe { (*ethhdr).ether_type() } {
        Ok(EtherType::Ipv4) => {}
        _ => return Ok(xdp_action::XDP_PASS),
    }

    let ipv4hdr: *const Ipv4Hdr = unsafe { ptr_at(&ctx, EthHdr::LEN)? };
    let source = u32::from_be_bytes(unsafe { (*ipv4hdr).src_addr });

    let action = if block_ip(source) {
        xdp_action::XDP_DROP
    } else {
        xdp_action::XDP_PASS
    }; 
    info!(&ctx, "SRC: {:i}, ACTION: {}", source, action);

    Ok(action)
}

Populating our map from user-space

In order to add the addresses to block, we first need to get a reference to the BLOCKLIST map.

Once we have it, it’s simply a case of calling blocklist.insert() to insert the IPs into the blocklist.

We’ll use the Ipv4Addr type to represent our IP address, as it’s human-readable and can easily be converted to a u32.

We’ll block all traffic originating from 1.1.1.1 in this example.

Note

IP addresses are always encoded in network byte order (big endian) within packets. In our eBPF program, before checking the blocklist, we convert them to host endian using u32::from_be_bytes. Therefore it’s correct to write our IP addresses in host endian format from userspace.

The other approach would work too: we could convert IPs to network endian when inserting from userspace, and then we wouldn’t need to convert when indexing from the eBPF program.
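Both conventions from the note can be checked with the standard library alone. The sketch below is not part of the example project, and the four function names are hypothetical. It shows that the key built from packet bytes matches the key built from an Ipv4Addr in user space, as long as both sides use the same convention.

```rust
use std::net::Ipv4Addr;

/// Host-endian key, as the eBPF program computes it from the packet bytes.
fn host_key_from_packet(src_addr: [u8; 4]) -> u32 {
    u32::from_be_bytes(src_addr)
}

/// Host-endian key, as user space computes it before inserting into the map.
fn host_key_from_userspace(addr: Ipv4Addr) -> u32 {
    u32::from(addr)
}

/// Alternative convention: the packet bytes are reinterpreted without byte swapping.
fn network_key_from_packet(src_addr: [u8; 4]) -> u32 {
    u32::from_ne_bytes(src_addr)
}

/// Alternative convention: user space converts the key to network byte order.
fn network_key_from_userspace(addr: Ipv4Addr) -> u32 {
    u32::from(addr).to_be()
}
```

For 192.168.0.1 both host-endian functions return 0xC0A80001 on any platform, while the value of the network-endian key depends on the endianness of the host. What matters is that the eBPF side and the user-space side agree.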

Let’s begin with writing the user-space code:

Importing dependencies

use anyhow::Context;
use aya::{
    maps::HashMap,
    programs::{Xdp, XdpFlags},
};
use aya_log::EbpfLogger;
use clap::Parser;
use log::{info, warn};
use std::net::Ipv4Addr;
use tokio::signal;

  • anyhow::Context: Provides additional context for error handling
  • aya: Provides the Ebpf structure and related functions for loading eBPF programs, the HashMap map type (aya::maps::HashMap), as well as the XDP program type and its flags (aya::programs::{Xdp, XdpFlags})
  • aya_log::EbpfLogger: Receives the log messages emitted by the eBPF program and forwards them to the user-space logger
  • clap::Parser: Provides argument parsing
  • log::{info, warn}: The logging library we use for informational and warning messages
  • std::net::Ipv4Addr: A struct to work with IPv4 addresses
  • tokio::signal: For handling signals asynchronously, see this link for more information

Note

aya::Bpf is deprecated since version 0.13.0 and aya_log::BpfLogger since 0.2.1. Use aya::Ebpf and aya_log::EbpfLogger instead if you are using more recent versions.

Defining command-line arguments

#[derive(Debug, Parser)]
struct Opt {
    #[clap(short, long, default_value = "eth0")]
    iface: String,
}

A simple struct is defined for command-line parsing using clap’s derive feature, with the optional argument iface to provide our network interface name.

Main function

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    let opt = Opt::parse();

    env_logger::init();

    let mut bpf = aya::Ebpf::load(aya::include_bytes_aligned!(concat!(
        env!("OUT_DIR"),
        "/simple-xdp-program"
    )))?;
    match EbpfLogger::init(&mut bpf) {
        Err(e) => {
            // This can happen if you remove all log statements from your eBPF program.
            warn!("failed to initialize eBPF logger: {e}");
        }
        Ok(logger) => {
            let mut logger = tokio::io::unix::AsyncFd::with_interest(
                logger,
                tokio::io::Interest::READABLE,
            )?;
            tokio::task::spawn(async move {
                loop {
                    let mut guard = logger.readable_mut().await.unwrap();
                    guard.get_inner_mut().flush();
                    guard.clear_ready();
                }
            });
        }
    }
    let program: &mut Xdp =
        bpf.program_mut("xdp_firewall").unwrap().try_into()?;
    program.load()?;
    program.attach(&opt.iface, XdpFlags::default())
        .context("failed to attach the XDP program with default flags - \
                  try changing XdpFlags::default() to XdpFlags::SKB_MODE")?;

    let mut blocklist: HashMap<_, u32, u32> =
        HashMap::try_from(bpf.map_mut("BLOCKLIST").unwrap())?;

    let block_addr: u32 = Ipv4Addr::new(1, 1, 1, 1).into();

    blocklist.insert(block_addr, 0, 0)?;

    info!("Waiting for Ctrl-C...");
    signal::ctrl_c().await?;
    info!("Exiting...");

    Ok(())
}

Parsing command-line arguments

Inside the main function, we first parse the command-line arguments, using Opt::parse() and the struct defined earlier.

Initializing environment logging

Logging is initialized using env_logger::init(). We use this logger later in our code for the info! and warn! messages.

Loading the eBPF program

The compiled eBPF object is embedded into the binary at compile time with include_bytes_aligned!, which reads it from the build output directory (OUT_DIR), and is then loaded using Ebpf::load().

Loading and attaching our XDP

The XDP program named xdp_firewall is retrieved from the eBPF program we defined earlier using bpf.program_mut(). The XDP program is then loaded and attached to our network interface.

Setting up the IP blocklist

The IP blocklist (BLOCKLIST map) is loaded from the eBPF program and converted to a HashMap. The IP 1.1.1.1 is added to the blocklist.

Waiting for the exit signal

The program awaits the CTRL+C signal asynchronously using signal::ctrl_c().await. Once the signal is received, it logs an exit message and returns Ok(()).

Full user-space code

use anyhow::Context;
use aya::{
    maps::HashMap,
    programs::{Xdp, XdpFlags},
};
use aya_log::EbpfLogger;
use clap::Parser;
use log::{info, warn};
use std::net::Ipv4Addr;
use tokio::signal;

#[derive(Debug, Parser)]
struct Opt {
    #[clap(short, long, default_value = "eth0")]
    iface: String,
}

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    let opt = Opt::parse();

    env_logger::init();

    let mut bpf = aya::Ebpf::load(aya::include_bytes_aligned!(concat!(
        env!("OUT_DIR"),
        "/simple-xdp-program"
    )))?;
    match EbpfLogger::init(&mut bpf) {
        Err(e) => {
            // This can happen if you remove all log statements from your eBPF program.
            warn!("failed to initialize eBPF logger: {e}");
        }
        Ok(logger) => {
            let mut logger = tokio::io::unix::AsyncFd::with_interest(
                logger,
                tokio::io::Interest::READABLE,
            )?;
            tokio::task::spawn(async move {
                loop {
                    let mut guard = logger.readable_mut().await.unwrap();
                    guard.get_inner_mut().flush();
                    guard.clear_ready();
                }
            });
        }
    }
    let program: &mut Xdp =
        bpf.program_mut("xdp_firewall").unwrap().try_into()?;
    program.load()?;
    program.attach(&opt.iface, XdpFlags::default())
        .context("failed to attach the XDP program with default flags - \
                  try changing XdpFlags::default() to XdpFlags::SKB_MODE")?;

    let mut blocklist: HashMap<_, u32, u32> =
        HashMap::try_from(bpf.map_mut("BLOCKLIST").unwrap())?;

    let block_addr: u32 = Ipv4Addr::new(1, 1, 1, 1).into();

    blocklist.insert(block_addr, 0, 0)?;

    info!("Waiting for Ctrl-C...");
    signal::ctrl_c().await?;
    info!("Exiting...");

    Ok(())
}

Running our program

Now that we have all the pieces for our eBPF program, we can run it using:

RUST_LOG=info cargo run --config 'target."cfg(all())".runner="sudo -E"'

or

RUST_LOG=info cargo run --config 'target."cfg(all())".runner="sudo -E"' -- \
  --iface <interface>

if you want to provide another network interface name. Note that you can also omit RUST_LOG=info, but then you won’t get any log output.

LSM

Note

Full code for the example in this chapter is available on GitHub.

What is LSM

LSM stands for Linux Security Modules, a framework that allows developers to write security systems on top of the Linux kernel. It’s also briefly described in the Linux kernel documentation.

LSM is used by kernel modules or (since kernel 5.7) by eBPF programs. The most popular modules that make use of LSM are AppArmor, SELinux, Smack and TOMOYO. eBPF LSM programs allow developers to implement the same functionality implemented by the modules just mentioned, using eBPF APIs.

The central concept behind LSM is LSM hooks. LSM hooks are exposed in key locations in the kernel, and eBPF programs can attach to them to implement custom security policies. Examples of operations that can be policed via hooks include:

  • filesystem operations
    • opening, creating, moving and removing files
    • mounting and unmounting filesystems
  • task/process operations
    • allocating and freeing tasks, changing user and group identity for a task
  • socket operations
    • creating and binding sockets
    • receiving and sending messages

Each of those actions has a corresponding LSM hook. Each hook takes a number of arguments, which provide context about the program and its operation so that policy decisions can be made. The list of hooks with their arguments can be found in the lsm_hook_defs.h header.

For example, consider the task_setnice hook, which has the following definition:

LSM_HOOK(int, 0, task_setnice, struct task_struct *p, int nice)

The hook is triggered when a nice value is set for any process in the system. If you are not familiar with the concept of process niceness, check out this article. As you can see from the definition, this hook takes the following arguments:

  • p is the instance of task_struct which represents the process on which the nice value is set
  • nice is the nice value

By attaching to the hook, an eBPF program can decide whether to accept or reject the given nice value.

In addition to the arguments found in the hook definition, eBPF programs have access to one extra argument - ret - which is a return value of potential previous eBPF LSM programs.

Ensure that BPF LSM is enabled

Before proceeding further and trying to write a BPF LSM program, please make sure that:

  • Your kernel version is at least 5.7.
  • BPF LSM is enabled.

The second point can be checked with:

$ cat /sys/kernel/security/lsm
capability,lockdown,landlock,yama,apparmor,bpf

The correct output should contain bpf. If it doesn’t, BPF LSM has to be enabled manually by adding it to the kernel command-line parameters. This can be done by editing the GRUB configuration in /etc/default/grub and adding the following to the kernel parameters:

GRUB_CMDLINE_LINUX="lsm=[YOUR CURRENTLY ENABLED LSMs],bpf"

Then rebuild the GRUB configuration with one of the commands listed below (which of them is available depends on the Linux distribution):

update-grub2
grub2-mkconfig -o /boot/grub2/grub.cfg
grub-mkconfig -o /boot/grub/grub.cfg

Finally, reboot the system.
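The same check can be done programmatically, for example from a setup script or an integration test. The following is a minimal sketch using only the standard library. The function names bpf_lsm_enabled and lsm_param_with_bpf are hypothetical helpers, not part of Aya, and the input is assumed to be the comma-separated list shown above.

```rust
/// Returns true if `bpf` is one of the comma-separated LSM names.
fn bpf_lsm_enabled(lsm_list: &str) -> bool {
    lsm_list.trim().split(',').any(|name| name == "bpf")
}

/// Builds the `lsm=` kernel parameter from the currently enabled LSMs,
/// appending `bpf` only if it is not already present.
fn lsm_param_with_bpf(lsm_list: &str) -> String {
    let current = lsm_list.trim();
    if bpf_lsm_enabled(current) {
        format!("lsm={}", current)
    } else if current.is_empty() {
        String::from("lsm=bpf")
    } else {
        format!("lsm={},bpf", current)
    }
}
```

The list itself can be obtained with std::fs::read_to_string("/sys/kernel/security/lsm") and passed to either function. The value returned by lsm_param_with_bpf is what goes between the quotes of GRUB_CMDLINE_LINUX.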

Writing LSM BPF program

Let’s try to create an LSM eBPF program which is triggered by the task_setnice hook. The purpose of this program is to deny setting a nice value lower than 0 (which means higher priority) for a particular process.

The renice tool can be used to change niceness values:

renice [value] -p [pid]

With our eBPF program, we want to make it impossible to call renice for a given pid with a negative [value].

eBPF projects come with two parts: eBPF program(s) and the userspace program. To make our example simple, we can try to deny a change of a nice value of the userspace process which loads the eBPF program.

The first step is to create a new project:

cargo generate --name lsm-nice -d program_type=lsm \
    -d lsm_hook=task_setnice https://github.com/aya-rs/aya-template

That command should create a new Aya project with an empty program attaching to the task_setnice hook. Let’s go to its directory:

cd lsm-nice

One of the arguments passed to the task_setnice hook is a pointer to a task_struct type. Therefore we need to generate a binding to task_struct with aya-tool.

If you are not familiar with aya-tool, please refer to this section.

aya-tool generate task_struct > lsm-nice-ebpf/src/vmlinux.rs

Now it’s time to modify the lsm-nice-ebpf project and write an actual program there. The full program code should look like this:

#![no_std]
#![no_main]

use aya_ebpf::{cty::c_int, macros::lsm, programs::LsmContext};
use aya_log_ebpf::info;

// (1)
#[allow(
    clippy::all,
    dead_code,
    improper_ctypes_definitions,
    non_camel_case_types,
    non_snake_case,
    non_upper_case_globals,
    unnecessary_transmutes,
    unsafe_op_in_unsafe_fn,
)]
#[rustfmt::skip]
mod vmlinux;

use vmlinux::task_struct;

// (2)
/// PID of the process for which setting a negative nice value is denied.
#[unsafe(no_mangle)]
static PID: i32 = 0;

#[lsm(hook = "task_setnice")]
pub fn task_setnice(ctx: LsmContext) -> i32 {
    match unsafe { try_task_setnice(ctx) } {
        Ok(ret) => ret,
        Err(ret) => ret,
    }
}

// (3)
unsafe fn try_task_setnice(ctx: LsmContext) -> Result<i32, i32> {
    let (pid, nice, ret, global_pid): (c_int, c_int, c_int, c_int) = unsafe {
        let p: *const task_struct = ctx.arg(0);
        (
            (*p).pid,
            ctx.arg(1),
            ctx.arg(2),
            core::ptr::read_volatile(&PID),
        )
    };

    info!(
        &ctx,
        "The PID supplied to this program is: {}, with nice value {} and return value {}. Monitoring for changes in PID: {}",
        pid,
        nice,
        ret,
        global_pid
    );
    if ret != 0 {
        return Err(ret);
    }

    if pid == global_pid && nice < 0 {
        return Err(-1);
    }

    Ok(0)
}

#[cfg(not(test))]
#[panic_handler]
fn panic(_info: &core::panic::PanicInfo) -> ! {
    loop {}
}

  1. We include the autogenerated binding to task_struct.
  2. Then we define a global variable PID. We initialize the value to 0, but at runtime the userspace side will patch the value with the actual PID we’re interested in.
  3. Finally, we have the program itself, with the logic that decides what to do with nice values.
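The decision taken in step 3 is easy to reason about in isolation. Below is a user-space sketch of that logic as a pure function. The name setnice_decision and the constant DENY are hypothetical and only serve this illustration. The value -1 corresponds to the negated EPERM errno, which is why renice later reports "Operation not permitted".

```rust
/// Value returned to deny the operation (negated `EPERM`).
const DENY: i32 = -1;

/// User-space model of the decision made in `try_task_setnice`.
///
/// * `pid`: PID of the task whose nice value is being changed
/// * `nice`: the requested nice value
/// * `prev_ret`: return value of a previous eBPF LSM program on the same hook
/// * `target_pid`: the PID the policy protects (the patched `PID` global)
fn setnice_decision(pid: i32, nice: i32, prev_ret: i32, target_pid: i32) -> i32 {
    // A previous program already rejected the operation: keep its verdict.
    if prev_ret != 0 {
        return prev_ret;
    }

    // Deny raising the priority (negative nice value) of the protected process.
    if pid == target_pid && nice < 0 {
        return DENY;
    }

    // Everything else is allowed.
    0
}
```

For example, setnice_decision(42, -10, 0, 42) denies the change, while a positive nice value for the same PID, or a negative one for any other PID, is allowed.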

After that we also need to modify the userspace part. We don’t need as much work as with the eBPF part, but we need to:

  1. Get the PID.
  2. Log it.
  3. Write it to the global variable in the eBPF object.

The final result should look like:

use std::process;

use aya::{Btf, programs::Lsm};
use aya_log::EbpfLogger;
use log::{info, warn};
use tokio::signal;

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    env_logger::init();

    // (1)
    let pid = process::id() as i32;
    info!("PID: {pid}");

    // This will include your eBPF object file as raw bytes at compile-time and load it at
    // runtime. This approach is recommended for most real-world use cases. If you would
    // like to specify the eBPF program at runtime rather than at compile-time, you can
    // reach for `Ebpf::load_file` instead.
    let mut bpf = aya::EbpfLoader::new()
        // (2)
        .override_global("PID", &pid, true)
        .load(aya::include_bytes_aligned!(concat!(
            env!("OUT_DIR"),
            "/lsm-nice"
        )))?;
    match EbpfLogger::init(&mut bpf) {
        Err(e) => {
            // This can happen if you remove all log statements from your eBPF program.
            warn!("failed to initialize eBPF logger: {e}");
        }
        Ok(logger) => {
            let mut logger = tokio::io::unix::AsyncFd::with_interest(
                logger,
                tokio::io::Interest::READABLE,
            )?;
            tokio::task::spawn(async move {
                loop {
                    let mut guard = logger.readable_mut().await.unwrap();
                    guard.get_inner_mut().flush();
                    guard.clear_ready();
                }
            });
        }
    }
    let btf = Btf::from_sys_fs()?;
    let program: &mut Lsm =
        bpf.program_mut("task_setnice").unwrap().try_into()?;
    program.load("task_setnice", &btf)?;
    program.attach()?;

    info!("Waiting for Ctrl-C...");
    signal::ctrl_c().await?;
    info!("Exiting...");

    Ok(())
}

  1. We start by getting and logging the PID.
  2. We then write it to the PID global variable with override_global, before the eBPF object is loaded.

After that, we can build and run our project with:

RUST_LOG=info cargo run --config 'target."cfg(all())".runner="sudo -E"'

The output should contain our log line showing the PID of the userspace process, for example:

16:32:30 [INFO] lsm_nice: [lsm-nice/src/main.rs:22] PID: 587184

Now we can try to change the nice value for that process. Setting a positive value (lowering the priority) should still work:

$ renice 10 -p 587184
587184 (process ID) old priority 0, new priority 10

But setting a negative value should not be allowed:

$ renice -10 -p 587184
renice: failed to set priority for 587184 (process ID): Operation not permitted

If doing that resulted in Operation not permitted, congratulations, your LSM eBPF program works!

Community

How to get involved in the Aya Community!

  • Join the discussion on Discord
  • Add your project to Awesome
  • Contribute to Aya on GitHub

Contributor Covenant Code of Conduct

Our Pledge

We as members, contributors, and leaders pledge to make participation in our community a harassment-free experience for everyone, regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, religion, or sexual identity and orientation.

We pledge to act and interact in ways that contribute to an open, welcoming, diverse, inclusive, and healthy community.

Our Standards

Examples of behavior that contributes to a positive environment for our community include:

  • Demonstrating empathy and kindness toward other people
  • Being respectful of differing opinions, viewpoints, and experiences
  • Giving and gracefully accepting constructive feedback
  • Accepting responsibility and apologizing to those affected by our mistakes, and learning from the experience
  • Focusing on what is best not just for us as individuals, but for the overall community

Examples of unacceptable behavior include:

  • The use of sexualized language or imagery, and sexual attention or advances of any kind
  • Trolling, insulting or derogatory comments, and personal or political attacks
  • Public or private harassment
  • Publishing others’ private information, such as a physical or email address, without their explicit permission
  • Other conduct which could reasonably be considered inappropriate in a professional setting

Enforcement Responsibilities

Community leaders are responsible for clarifying and enforcing our standards of acceptable behavior and will take appropriate and fair corrective action in response to any behavior that they deem inappropriate, threatening, offensive, or harmful.

Community leaders have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, and will communicate reasons for moderation decisions when appropriate.

Scope

This Code of Conduct applies within all community spaces, and also applies when an individual is officially representing the community in public spaces. Examples of representing our community include using an official e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event.

Enforcement

Instances of abusive, harassing, or otherwise unacceptable behavior may be reported to the community leaders responsible for enforcement on Discord. All complaints will be reviewed and investigated promptly and fairly.

All community leaders are obligated to respect the privacy and security of the reporter of any incident.

Enforcement Guidelines

Community leaders will follow these Community Impact Guidelines in determining the consequences for any action they deem in violation of this Code of Conduct:

1. Correction

Community Impact: Use of inappropriate language or other behavior deemed unprofessional or unwelcome in the community.

Consequence: A private, written warning from community leaders, providing clarity around the nature of the violation and an explanation of why the behavior was inappropriate. A public apology may be requested.

2. Warning

Community Impact: A violation through a single incident or series of actions.

Consequence: A warning with consequences for continued behavior. No interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, for a specified period of time. This includes avoiding interactions in community spaces as well as external channels like social media. Violating these terms may lead to a temporary or permanent ban.

3. Temporary Ban

Community Impact: A serious violation of community standards, including sustained inappropriate behavior.

Consequence: A temporary ban from any sort of interaction or public communication with the community for a specified period of time. No public or private interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, is allowed during this period. Violating these terms may lead to a permanent ban.

4. Permanent Ban

Community Impact: Demonstrating a pattern of violation of community standards, including sustained inappropriate behavior, harassment of an individual, or aggression toward or disparagement of classes of individuals.

Consequence: A permanent ban from any sort of public interaction within the community.

Attribution

This Code of Conduct is adapted from the Contributor Covenant, version 2.0, available at https://www.contributor-covenant.org/version/2/0/code_of_conduct.html.

Community Impact Guidelines were inspired by Mozilla’s code of conduct enforcement ladder.

For answers to common questions about this code of conduct, see the FAQ at https://www.contributor-covenant.org/faq. Translations are available at https://www.contributor-covenant.org/translations.