Full code for the example in this chapter is available here
What is Cgroup SKB?
Cgroup SKB programs are attached to v2 cgroups and get triggered by network
traffic (egress or ingress) associated with processes inside the given cgroup.
They make it possible to intercept and filter the traffic of particular
cgroups (and therefore of containers).
What's the difference between Cgroup SKB and Classifiers?
Both Cgroup SKB and Classifiers receive the same type of context,
SkBuffContext.
The difference is where they are attached: Classifiers are attached to a
network interface, whereas Cgroup SKB programs are attached to a cgroup.
Example project
This example is similar to the Classifier example: a program that drops
egress traffic, but only for a specific cgroup.
Design
We're going to:
Create a HashMap that will act as a blocklist.
Check the destination IP address from the packet against the HashMap to
make a policy decision (pass or drop).
Add entries to the blocklist from userspace.
Generating bindings to vmlinux.h
In this example, we are going to use one kernel structure called iphdr, which
represents the IP protocol header. We need to generate Rust bindings to it.
First, we must make sure that bindgen is installed.
cargo install bindgen-cli
Let's use xtask to automate the process of generating bindings so we can
easily reproduce it in the future by adding the following code:
use aya_tool::generate::InputFile;
use std::{fs::File, io::Write, path::PathBuf};

pub fn generate() -> Result<(), anyhow::Error> {
    let dir = PathBuf::from("cgroup-skb-egress-ebpf/src");
    let names: Vec<&str> = vec!["iphdr"];
    let bindings = aya_tool::generate(
        InputFile::Btf(PathBuf::from("/sys/kernel/btf/vmlinux")),
        &names,
        &[],
    )?;
    // Write the bindings to cgroup-skb-egress-ebpf/src/bindings.rs.
    let mut out = File::create(dir.join("bindings.rs"))?;
    write!(out, "{bindings}")?;
    Ok(())
}
After generating the file by running cargo xtask codegen from the root of the
project, we can access it from the eBPF code by declaring mod bindings.
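To make the role of iphdr more concrete, here is a minimal sketch in plain Rust (not eBPF code, and not the generated bindings) of how the destination address can be read from a raw IPv4 header. In the kernel's struct iphdr, the daddr field sits at byte offset 16. The helper name destination and the two constants are hypothetical and exist only for this illustration.

```rust
use std::net::Ipv4Addr;

/// Byte offset of `daddr` within the kernel's `struct iphdr`
/// (the fields before it occupy 16 bytes).
const IPHDR_DADDR_OFFSET: usize = 16;
/// Minimum length of an IPv4 header (no options).
const IPHDR_MIN_LEN: usize = 20;

/// Hypothetical helper: extract the destination address from a raw IPv4
/// header. Returns `None` if the buffer is too short or is not IPv4.
fn destination(header: &[u8]) -> Option<Ipv4Addr> {
    // The upper nibble of the first byte holds the IP version.
    if header.len() < IPHDR_MIN_LEN || header[0] >> 4 != 4 {
        return None;
    }
    let o = IPHDR_DADDR_OFFSET;
    let raw = [header[o], header[o + 1], header[o + 2], header[o + 3]];
    // Addresses are stored on the wire in network (big-endian) byte order.
    Some(Ipv4Addr::from(u32::from_be_bytes(raw)))
}

fn main() {
    let mut header = [0u8; 20];
    header[0] = 0x45; // version 4, header length of 5 32-bit words
    header[16..20].copy_from_slice(&[1, 1, 1, 1]);
    println!("{:?}", destination(&header));
}
```

In an actual eBPF program the same field would be read through the context and the generated bindings rather than from a byte slice; the sketch only shows which bytes are involved.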
eBPF code
The program starts with the definition of the BLOCKLIST map. To enforce
the policy, the program looks up the destination IP address in that
map. If a map entry for that address exists, we drop the packet
by returning 0. Otherwise, we accept it by returning 1.
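The policy decision described above can be modeled in plain Rust. This is only a sketch of the logic, assuming a standard-library HashMap as a stand-in for the BLOCKLIST eBPF map; the function name verdict and the constants DROP and PASS are hypothetical names chosen for this illustration.

```rust
use std::collections::HashMap;
use std::net::Ipv4Addr;

/// Return codes named in the text: 0 drops the packet, 1 lets it through.
const DROP: i32 = 0;
const PASS: i32 = 1;

/// Hypothetical model of the lookup: `blocklist` stands in for the
/// BLOCKLIST map, keyed by IPv4 addresses converted to u32.
fn verdict(blocklist: &HashMap<u32, u32>, destination: Ipv4Addr) -> i32 {
    let key = u32::from(destination);
    if blocklist.contains_key(&key) {
        DROP
    } else {
        PASS
    }
}

fn main() {
    let mut blocklist = HashMap::new();
    blocklist.insert(u32::from(Ipv4Addr::new(1, 1, 1, 1)), 0u32);

    // Prints "1.1.1.1 -> 0" (dropped) and "8.8.8.8 -> 1" (passed).
    println!("1.1.1.1 -> {}", verdict(&blocklist, Ipv4Addr::new(1, 1, 1, 1)));
    println!("8.8.8.8 -> {}", verdict(&blocklist, Ipv4Addr::new(8, 8, 8, 8)));
}
```

Only the presence of the key matters here; the stored value is not used for the decision.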
The userspace code is shown below; the numbered comments mark the steps
described after the listing.
use std::net::Ipv4Addr;

use aya::{
    include_bytes_aligned,
    maps::{perf::AsyncPerfEventArray, HashMap},
    programs::{CgroupAttachMode, CgroupSkb, CgroupSkbAttachType},
    util::online_cpus,
    Ebpf,
};
use bytes::BytesMut;
use clap::Parser;
use log::info;
use tokio::{signal, task};

use cgroup_skb_egress_common::PacketLog;

#[derive(Debug, Parser)]
struct Opt {
    #[clap(short, long, default_value = "/sys/fs/cgroup/unified")]
    cgroup_path: String,
}

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    let opt = Opt::parse();

    env_logger::init();

    // This will include your eBPF object file as raw bytes at compile-time and load it at
    // runtime. This approach is recommended for most real-world use cases. If you would
    // like to specify the eBPF program at runtime rather than at compile-time, you can
    // reach for `Ebpf::load_file` instead.
    #[cfg(debug_assertions)]
    let mut bpf = Ebpf::load(include_bytes_aligned!(
        "../../target/bpfel-unknown-none/debug/cgroup-skb-egress"
    ))?;
    #[cfg(not(debug_assertions))]
    let mut bpf = Ebpf::load(include_bytes_aligned!(
        "../../target/bpfel-unknown-none/release/cgroup-skb-egress"
    ))?;
    let program: &mut CgroupSkb =
        bpf.program_mut("cgroup_skb_egress").unwrap().try_into()?;
    let cgroup = std::fs::File::open(opt.cgroup_path)?;
    // (1)
    program.load()?;
    // (2)
    program.attach(
        cgroup,
        CgroupSkbAttachType::Egress,
        CgroupAttachMode::Single,
    )?;

    let mut blocklist: HashMap<_, u32, u32> =
        HashMap::try_from(bpf.map_mut("BLOCKLIST").unwrap())?;

    let block_addr: u32 = Ipv4Addr::new(1, 1, 1, 1).try_into()?;

    // (3)
    blocklist.insert(block_addr, 0, 0)?;

    let mut perf_array =
        AsyncPerfEventArray::try_from(bpf.take_map("EVENTS").unwrap())?;

    // Spawn one reader task per online CPU to consume the logged events.
    for cpu_id in online_cpus().map_err(|(_, error)| error)? {
        let mut buf = perf_array.open(cpu_id, None)?;

        task::spawn(async move {
            let mut buffers = (0..10)
                .map(|_| BytesMut::with_capacity(1024))
                .collect::<Vec<_>>();

            loop {
                let events = buf.read_events(&mut buffers).await.unwrap();
                for buf in buffers.iter_mut().take(events.read) {
                    let ptr = buf.as_ptr() as *const PacketLog;
                    let data = unsafe { ptr.read_unaligned() };
                    let dst_addr = Ipv4Addr::from(data.ipv4_address);
                    info!("LOG: DST {}, ACTION {}", dst_addr, data.action);
                }
            }
        });
    }

    info!("Waiting for Ctrl-C...");
    signal::ctrl_c().await?;
    info!("Exiting...");

    Ok(())
}
(1) Loading the eBPF program.
(2) Attaching it to the given cgroup.
(3) Populating the map with the remote IP addresses to which we want to
prevent egress traffic.
The third step is done by getting a reference to the BLOCKLIST map and
calling blocklist.insert. Using the Ipv4Addr type in Rust lets us write the
IP address in its human-readable form and convert it to u32, which
is an appropriate type to use in eBPF maps.
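As a small standalone illustration of that conversion, the sketch below parses a dotted-quad string into an Ipv4Addr and converts it to the u32 used as a map key. The helper name blocklist_key is hypothetical; only standard-library calls are used.

```rust
use std::net::{AddrParseError, Ipv4Addr};

/// Hypothetical helper: turn dotted-quad text into a u32 map key.
fn blocklist_key(text: &str) -> Result<u32, AddrParseError> {
    let addr: Ipv4Addr = text.parse()?;
    // The first octet ends up in the most significant byte of the u32.
    Ok(u32::from(addr))
}

fn main() {
    let key = blocklist_key("1.1.1.1").expect("valid IPv4 address");
    println!("key = {:#010x}", key);
    // The conversion is reversible, which is how a logged address can be
    // turned back into readable form.
    println!("back = {}", Ipv4Addr::from(key));
}
```

Because the key's byte layout depends on this conversion, whichever side performs the lookup needs to use the same representation; a value read straight off the wire is in big-endian order and would have to be converted first.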
Testing the program
First, check where cgroups v2 are mounted:
$ mount | grep cgroup2
cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate,memory_recursiveprot)
The most common locations are either /sys/fs/cgroup or /sys/fs/cgroup/unified.
Inside that location, we need to create our new cgroup (as root):
# mkdir /sys/fs/cgroup/foo
Then run the program with:
RUST_LOG=info cargo xtask run
And then, in a separate terminal, as root, try to access 1.1.1.1: