Hey,

Some days ago I wrote about how you can configure YouCompleteMe to navigate the Linux source code.

One great benefit of going through that setup is that it showed me how I could use the very same concepts to develop eBPF code more comfortably, leveraging autocompletion and jumps to definitions and declarations.

Example of YouCompleteMe searching available structure fields

While it’s common for people to just embed their eBPF code directly into Python scripts, I find that approach very hard to debug (not being very accustomed to kernel stuff).

Given that not all eBPF code targets the same subsystems in the Kernel, some peculiarities arise when trying to provide proper autocompletion.

ps.: I’m still learning these things!

Autocompleting eBPF compiled via Clang without BCC

If you’re performing the builds yourself (calling clang with the BPF backend specified via -target bpf), nothing more than a basic .ycm_extra_conf.py, like the one shown in the article mentioned before, is needed.

Usually, all you need is to reference /usr/include and you’re done.

def FlagsForFile( filename, **kwargs ):
        return { 'flags': [
            '-I/usr/include',
            '-I/usr/local/include',
            '-std=gnu99', # this allows us to leverage `asm`
            '-xc'
        ]}

ps.: if you need to reference internal kernel structures, make sure you include the proper paths from your local kernel source. For instance, if I have the kernel cloned at /home/ubuntu/linux, then I’d probably add something like -I/home/ubuntu/linux/include or whatever else you need from it.
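A minimal sketch of how that could look in .ycm_extra_conf.py, assuming the kernel tree lives at /home/ubuntu/linux as in the example above. The KERNEL_SRC environment variable and the kernel_include_flags helper are names made up for illustration; only FlagsForFile and the base flags come from the configuration shown earlier.

```python
import os

# Assumption: the kernel tree location from the example above. The
# KERNEL_SRC environment variable is a hypothetical override.
KERNEL_SRC = os.environ.get('KERNEL_SRC', '/home/ubuntu/linux')

# Same flags as in the basic configuration.
BASE_FLAGS = [
    '-I/usr/include',
    '-I/usr/local/include',
    '-std=gnu99',
    '-xc',
]


def kernel_include_flags(kernel_src):
    """Return `-I` flags for the kernel include directories that exist."""
    candidates = [os.path.join(kernel_src, 'include')]
    return ['-I' + path for path in candidates if os.path.isdir(path)]


def FlagsForFile(filename, **kwargs):
    # Directories that are missing are skipped, so the same file keeps
    # working on machines without a kernel checkout.
    return {'flags': BASE_FLAGS + kernel_include_flags(KERNEL_SRC)}
```

Skipping directories that don't exist keeps the flag list clean; add more entries to `candidates` for whatever else you need from the tree.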

As a more specific example, if you’re targeting eBPF programs for tc, iproute2 ships with a set of helpers that come in handy, so it might be worth cloning iproute2 and manually installing its BPF helpers file:

# Clone the IPROUTE2 repository and get into the directory
git clone git://git.kernel.org/pub/scm/network/iproute2/iproute2.git 
cd ./iproute2

# Copy the `bpf_api.h` helpers file that lives under `./include`
# to your `/usr/include` directory.
#
# ps.: this could be anywhere - including your current source tree.
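#
# ps.: `-D` creates the `/usr/include/iproute2` directory if it's
# missing; writing under `/usr/include` usually requires root.
sudo install \
        -D \
        -m 0644 \
        ./include/bpf_api.h \
        /usr/include/iproute2/bpf_api.h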

Now, create a bpf hello world that gets loaded into the ingress path and you’ll see that the definitions properly show up:

#include <iproute2/bpf_api.h>
#include <linux/bpf.h>

__section("ingress") 
int cls_ingress(struct __sk_buff* skb)
{
	(void)skb;
	printt("Hello World\n");

	return TC_ACT_UNSPEC;
}

With the autocompletion working, we can quickly see which fields the __sk_buff structure gives us:

Example of YouCompleteMe searching available structure fields

More importantly, we can place the Vim cursor on it and jump to its declaration (:YcmCompleter GoToDeclaration).

Given that there are helpers defined in iproute2/bpf_api.h, and that we include it (and YouCompleteMe knows where to find this file given that we specified the include list in .ycm_extra_conf.py), we can do the same for the BPF helper functions (as shown in the image at the beginning of the article).

Autocompleting eBPF compiled using BCC

If you’re using bcc to compile and load your eBPF code, then the story changes a little bit - there’s a subset of code that gets injected behind the scenes (not very good for us, as we’re trying to provide the whole source code to the autocompletion engine).

That subset is what gives you some handy definitions like BPF_HASH.

Not only does BCC provide helpers, it also takes part in the Clang compilation process, transforming the eBPF source code that we provide.

For instance, we can see how bcc adds some helpers:

// Helpers are inlined in the following file (C). 
// Load the definitions and pass the partially compiled 
// module to the B frontend to continue with.
auto helpers_h = ExportedFiles::headers()
        .find("/virtual/include/bcc/helpers.h");

if (helpers_h == ExportedFiles::headers().end()) {
        fprintf(stderr, "Internal error: missing bcc/helpers.h");
        return -1;
}

if (int rc = load_includes(helpers_h->second))
        return rc;

BLoader b_loader(flags_);

Source code from bpf_module.cc#L970-L980

I’m still not completely sure about how it interacts with the LLVM toolchain, but it seems to work roughly like this:

Illustration of the BCC LLVM compilation path

ps.: There’s a great presentation by Alexei Starovoitov illustrating the process: Slides - BPF in LLVM and kernel.

If we want to leverage those definitions (from the injected helpers.h) in our autocompletion though, we need to reference that file manually. The catch here is that if you look at it, there’s an extra line at both the beginning and end that would invalidate our use - together they wrap the whole file in a C++ raw string literal.

R"********(
/*
 * Copyright (c) 2015 PLUMgrid, Inc.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.

 [...]

        } while (0);
#endif
)********"

Naturally, that’s not acceptable for us as including it would fail.

To remove the first and last line, we can then make use of head and tail:

# Create the location at which we plan to store the `helpers.h`
# file.
sudo mkdir -p /usr/local/include/bcc


# Clone the bcc project
git clone https://github.com/iovisor/bcc


# `tail -n +2` removes the first line: it prints
# everything from the second line onwards to `stdout`.
#
# `head -n -1` does the opposite: it prints all of the
# lines except the last one (GNU `head` syntax).
#
# Given that we're doing those two operations in a
# pipeline, in the end, we have removed the first and
# the last lines.
#
# `sudo tee` then writes the modified content to
# `/usr/local/include/bcc/helpers.h`. The redirection to
# `/dev/null` only silences the copy that `tee` echoes to
# `stdout`; a plain `>` into the target would run without
# root privileges and fail.
tail -n +2 ./bcc/src/cc/export/helpers.h | \
        head -n -1 | \
        sudo tee /usr/local/include/bcc/helpers.h > /dev/null
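If GNU head isn't available (the negative line count is a GNU extension), the same transformation can be sketched in Python. This is only an illustration: strip_raw_string_wrapper is a made-up name, and the check on the first and last lines assumes the wrapper looks like the excerpt above.

```python
def strip_raw_string_wrapper(text):
    """Drop the first and last lines, like `tail -n +2 | head -n -1`."""
    lines = text.splitlines()
    if len(lines) < 2:
        raise ValueError('expected at least an opening and a closing line')
    # Assumption: the wrapper opens with `R"` and closes with `"`, as in
    # the excerpt of helpers.h shown above.
    if not lines[0].startswith('R"') or not lines[-1].endswith('"'):
        raise ValueError('input does not look like a wrapped header')
    return '\n'.join(lines[1:-1]) + '\n'


# Tiny stand-in for the real header, just to show the behaviour.
sample = 'R"********(\n#define FOO 1\n)********"\n'
print(strip_raw_string_wrapper(sample), end='')
```

Refusing input that doesn't look wrapped avoids silently chopping two lines off a header that was already stripped.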

Now, let’s create an example to be used with bcc to see the autocompletion working.

First, create a Python wrapper that is going to take our source and then make use of the BCC toolchain.

The BCC toolchain uses its frontend to translate our code into Clang-compatible code, which then gets turned into BPF bytecode by the LLVM BPF backend.

from bcc import BPF

# BPF creates a new BPF module with the given
# source code that we can specify either via
# `src_file` or `text`.
#
# `trace_print` reads the kernel debug trace pipe
# and then prints what's there to stdout.
#
# This way, we're able to easily visualize our
# debug statements issued via `bpf_trace_printk`.
BPF(src_file='./trace_vfs_open.c').trace_print()

With the wrapper set, we can then write our trace_vfs_open.c file:

/**
 * If `BPF_HASH` is already defined, it means that we're
 * in the BCC compilation path, so we don't need to include
 * `bcc/helpers.h` as it has already been added to this
 * file.
 */
#ifndef BPF_HASH
#include <bcc/helpers.h>
#endif

/**
 * kprobe__vfs_open() - instrument the vfs_open call with a custom handler
 * @ctx: registers and the bpf context (unused).
 *
 * Instruments the internal call `vfs_open` (`fs/open.c`) which is responsible
 * for allocating a file structure, initializing it and then submitting a
 * subsequent `open` call to the filesystem.
 *
 * By using the special prefix `kprobe__`, `bcc` will automatically
 * attach the kprobe for us to the kernel function `vfs_open`.
 */
int
kprobe__vfs_open(void* __attribute__((unused)) ctx)
{
	/**
	 * `bpf_trace_printk` is a method that gets defined by the bcc
	 * toolchain - see [1].
	 *
	 * It defines a `printk`-like facility for debugging (that should
	 * really just be used for quick debugging).
	 *
	 * [1]:
	 * https://github.com/iovisor/bcc/blob/d17d5a8fd4f3b8a9638c8326a77b56ba56dc5eec/src/cc/frontends/clang/b_frontend_action.cc#L840-L852
	 */
	bpf_trace_printk("vfs_open called\n");

	return 0;
}

Although the example is pretty contrived, we can already see where the autocompletion support shines - we can discover which operations we can perform with the ctx argument:

Animated example of EBPF autocomplete

If we switch this trace from vfs_open to something more elaborate, like tcp_v4_connect, then the functionality becomes even more useful. Consider the tcpv4connect.py example from the BCC repository:

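// Map declared in the original example; it is what `currsock` refers to.
BPF_HASH(currsock, u32, struct sock *);

int kprobe__tcp_v4_connect(
        struct pt_regs *ctx,
        struct sock *sk)
{
	u32 pid = bpf_get_current_pid_tgid();
	// stash the sock ptr for lookup on return
	currsock.update(&pid, &sk);
	return 0;
}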

If we wanted to know more about sock, the autocomplete feature lets us find out quickly:

Screenshot of the terminal with VIM showing the autocomplete of a socket linux structure
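For kernel-internal types such as struct sock to resolve, the completion engine also has to find the kernel headers. Below is a rough sketch of a .ycm_extra_conf.py for that case. It rests on several assumptions: the headers for the running kernel sit under /lib/modules/<release>/build, the architecture is x86, and the list of include directories is a common kernel-style layout rather than the exact set bcc passes to Clang. The bcc_style_flags name is made up for illustration, and your setup may well need extra flags.

```python
import os
import platform


def bcc_style_flags(kernel_dir, arch='x86'):
    """Build completion flags that roughly mirror a kernel-style compile."""
    arch_inc = os.path.join(kernel_dir, 'arch', arch, 'include')
    inc = os.path.join(kernel_dir, 'include')
    include_dirs = [
        arch_inc,
        os.path.join(arch_inc, 'generated'),
        inc,
        os.path.join(arch_inc, 'uapi'),
        os.path.join(arch_inc, 'generated', 'uapi'),
        os.path.join(inc, 'uapi'),
        os.path.join(inc, 'generated', 'uapi'),
    ]
    # `/usr/local/include` is where the stripped `bcc/helpers.h` was placed.
    flags = ['-xc', '-std=gnu99', '-D__KERNEL__', '-I/usr/local/include']
    flags += ['-I' + d for d in include_dirs]
    return flags


def FlagsForFile(filename, **kwargs):
    # Assumption: headers for the running kernel are installed here.
    kernel_dir = os.path.join('/lib/modules', platform.release(), 'build')
    return {'flags': bcc_style_flags(kernel_dir)}
```

Keeping the flag construction in a separate function makes it easy to point the same configuration at a local kernel checkout instead of the installed headers.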

Closing thoughts

Not being a kernel developer myself, I found that leveraging some additional tooling made me much more productive when learning about eBPF.

It seems to me that the more people without a deep Linux background get into eBPF, the more approachable this area tends to become.

So, I hope that this helped you! Please let me know if you find something weird (or completely wrong). I’m cirowrc on Twitter.

Have a good one!
