During the work of getting concourse to have its containers created by containerd, one of the steps that I took was to disable the cri plugin in the containerd configuration file:
disabled_plugins = ["cri"]  # <<< (!)

[grpc]
  address = "/run/containerd/containerd.sock"

[debug]
  address = "/run/containerd/debug.sock"
  level = "debug"

[plugins]
  [plugins."io.containerd.runtime.v1.linux"]
    runtime = "runc"
    shim = "containerd-shim"
When doing so, we were essentially configuring containerd to not serve the CRI (Container Runtime Interface) - the interface that lets kubelet communicate with a runtime that provides the primitives to materialize pods into something “tangible”.
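With the plugin left enabled, one way of poking at that interface directly is crictl (from the cri-tools project) - shown here purely as an illustration, assuming you have it installed; it’s not required for anything that follows:

# ask containerd's CRI implementation for its version info
crictl --runtime-endpoint unix:///run/containerd/containerd.sock version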
Essentially, I was taking that “cri plugin” from the diagram below away:
kubelet
   |
   |    .---------------------------.
   |    | .------------.            |
   *----+-> cri plugin |  containerd+---> containers
        | '------------'            |
        '---------------------------'
But what if, instead, we put concourse aside and try to get containerd to materialize the pods that a kubelet asks it to run?
ps.: the whole article assumes you’re running Linux - in my case, kernel 5.3 from Ubuntu Eoan (19.10).
building kubelet from source
I thought this would be one of the hardest parts (a bunch of dependencies to figure out), but it turned out to be the easiest.
Having Go 1.13.5 already set up, I pretty much followed the k8s development guide that lives under the kubernetes/community repository:
git clone https://github.com/kubernetes/kubernetes
pushd $_
make WHAT=cmd/kubelet
popd
As a result of that, _output/local/bin/linux/amd64/kubelet got populated with the freshly built kubelet binary.
For the sake of being able to call kubelet from anywhere, I linked /usr/local/bin/kubelet to that destination:
ln -s \
  $(realpath ./_output/local/bin/linux/amd64/kubelet) \
  /usr/local/bin/kubelet
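A quick way of verifying both the build and the symlink is just asking for the version (a trivial sanity check):

kubelet --version
# prints the version corresponding to the commit we just built from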
setting the kernel up
We’re dealing with container tech, so we need to customize some kernel parameters and make sure that some kernel modules are loaded.
Under the hood, containerd will use overlayfs (the overlay filesystem) at least for managing container images, and br_netfilter (bridge netfilter) for working on packets that go through the bridge device that it sets up.
To ensure that those are loaded, we can use modprobe:
modprobe overlay
modprobe br_netfilter
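To confirm that they’re indeed loaded, listing the loaded modules works as a quick check:

# both modules should show up in the list
lsmod | grep -E 'overlay|br_netfilter'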
Naturally, this keeps them activated only until a reboot. The systemd-modules-load service (from systemd itself) can take care of loading these modules during initialization, so that we don’t have to run modprobe every time the system boots:
cat > /etc/modules-load.d/containerd.conf <<EOF
overlay
br_netfilter
EOF
As the network functionality requires packets traversing the bridge to be sent to iptables for processing, we need to enable the kernel parameter that allows that to occur: net.bridge.bridge-nf-call-iptables (and its IPv6 counterpart, net.bridge.bridge-nf-call-ip6tables).
With IP packets needing to be routed between multiple network interfaces (e.g., a container - with an internal virtual ethernet device - trying to “ping” an external service will need its packets forwarded through a default gateway that is another ethernet device, thus making the machine act as a router), we need to explicitly allow that in the kernel (through net.ipv4.ip_forward):
cat > /etc/sysctl.d/99-kubernetes-cri.conf <<EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF

sysctl --system
That done, from the kernel perspective, it’s good to go.
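To double-check that the values took effect, we can read them back (just a sanity check):

# each of these should report `1`
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward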
containerd requires setting three things up:

- its own binaries (and configuration)
- runc
- the cni plugins
I summarized the whole process in a Makefile under a repo (cirocosta/containerd-install), but here’s how it goes.
In order to facilitate cleaning things up, I put all binaries and configurations under the same directory tree (/usr/local/kubelet-sample).
runc is a single binary that can be retrieved right from the GitHub releases page. It’s essentially a dependency of the shim that implements containerd’s runtime v2 interface, giving containerd the ability to create containers through runc:
containerd
  '--- tasks
         '---- runtime v2
                 '---- containerd-shim-runc-v2
                         '--- runc
As runc is a binary that gets called with a very specific set of arguments by whoever consumes it, the version here is very important - get one that’s not compatible with containerd and things might just suddenly fail.
For this reason,
containerd makes that version explicit in its vendor file:
github.com/opencontainers/runc d736ef14f0288d6993a1845745d6756cfc9ddd5a # v1.0.0-rc9
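Fetching it could look roughly like the sketch below (the release asset name and the install location under /usr/local/kubelet-sample/bin are assumptions based on the layout used in this article):

# grab the runc binary matching the version vendored by containerd
curl -SOL https://github.com/opencontainers/runc/releases/download/v1.0.0-rc9/runc.amd64

# make it executable under the directory tree used here
install -m 0755 runc.amd64 /usr/local/kubelet-sample/bin/runc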
containerd is composed of not only the containerd daemon that provides the capabilities of creating containers, managing container images (and more) - what we get from its release is a set of binaries that provide the bulk of the functionality:
ctr                      - cli to interact w/ containerd
containerd-stress        - stress testing
containerd               - the daemon
containerd-shim          - acts as a parent to the containers
containerd-shim-runc-v2  - implements the runtime iface for runc
containerd-shim-runc-v1  - same as v2, but older (I guess?)
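Getting those onto the machine could look roughly like this (the version here is an assumption - pick whichever release matches the runc commit pinned above):

VERSION=1.3.2  # assumed - use the release you actually want

curl -SL -o containerd.tgz \
  https://github.com/containerd/containerd/releases/download/v${VERSION}/containerd-${VERSION}.linux-amd64.tar.gz

# the tarball ships everything under `bin/`, so strip that prefix
tar xvzf containerd.tgz -C /usr/local/kubelet-sample/bin --strip-components=1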
As containerd is supposed to be running in the background as a daemon, I preferred to go with a systemd service, but anything would do the job here.
[Unit]
After=network.target
Description=an open and reliable container runtime
Documentation=https://containerd.io

[Service]
Delegate=yes
Environment=PATH=/usr/local/kubelet-sample/bin:/usr/sbin
ExecStart=/usr/local/kubelet-sample/bin/containerd --config=/usr/local/kubelet-sample/conf/containerd.toml
KillMode=process
LimitCORE=infinity
LimitNOFILE=1048576
LimitNPROC=infinity
Restart=always
TasksMax=infinity

[Install]
WantedBy=multi-user.target
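With that unit dropped somewhere systemd looks for units (here assuming /etc/systemd/system/containerd.service), reloading and enabling it gets the daemon running:

# assuming the unit above was written to /etc/systemd/system/containerd.service
systemctl daemon-reload
systemctl enable --now containerd

# then verify that it's indeed up
systemctl status containerd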
With regards to the
containerd configuration, not much is really needed aside
from letting it know where the
cni binaries are, and where the network
configurations can be found:
diff --git a/default.toml b/./containerd.toml
index 2e72de9..0ddc311 100644
--- a/tmp/before
+++ b/./containerd.toml
@@ -84,8 +84,8 @@
         oom_score = 0
         runtime_root = ""
         privileged_without_host_devices = false
       [plugins."io.containerd.grpc.v1.cri".cni]
-        bin_dir = "/opt/cni/bin"
-        conf_dir = "/etc/cni/net.d"
+        bin_dir = "/usr/local/kubelet-sample/bin"
+        conf_dir = "/usr/local/kubelet-sample/conf/cni"
         max_conf_num = 1
         conf_template = ""
       [plugins."io.containerd.grpc.v1.cri".registry]
cni is just an interface that gets implemented by plugins; what we actually need to download in this case is the set of plugins that we plan to use when having our pods set up.
The reference ones that are maintained by the CNI team can be found under containernetworking/plugins, which releases all of the binaries in the form of a compressed tarball.
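A rough sketch of putting them where our containerd configuration expects (the plugin version, file names, and the subnet below are all assumptions - not something mandated by containerd or kubelet):

# fetch the reference cni plugins (bridge, host-local, loopback, ...)
curl -SL -o cni-plugins.tgz \
  https://github.com/containernetworking/plugins/releases/download/v0.8.5/cni-plugins-linux-amd64-v0.8.5.tgz

tar xvzf cni-plugins.tgz -C /usr/local/kubelet-sample/bin

# and drop a network configuration into the `conf_dir` we set earlier
cat > /usr/local/kubelet-sample/conf/cni/10-mynet.conflist <<EOF
{
  "cniVersion": "0.4.0",
  "name": "mynet",
  "plugins": [
    {
      "type": "bridge",
      "bridge": "cni0",
      "isGateway": true,
      "ipMasq": true,
      "ipam": {
        "type": "host-local",
        "subnet": "10.88.0.0/16",
        "routes": [{ "dst": "0.0.0.0/0" }]
      }
    }
  ]
}
EOF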
getting kubelet targeting containerd
Having all of those installed in known locations, it was now a matter of letting kubelet know where containerd lives, so that it can target it whenever it realizes that it needs to instantiate the pods whose definitions were assigned to it.
To do so, we tweak just two parameters:
# specify that we'd like to connect to something that
# implements the CRI
#
--container-runtime remote

# specify where the CRI implementation lives
#
--container-runtime-endpoint unix:///run/containerd/containerd.sock
running a pod
Lastly, to have kubelet know which pods we want to run, we tell it so:
# where kubelet should look for pod definitions
#
--pod-manifest-path $(realpath ./pods)
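Putting the three flags together, the invocation ends up looking roughly like this (a sketch - depending on the machine, extra flags might be needed, e.g., around swap):

kubelet \
  --container-runtime remote \
  --container-runtime-endpoint unix:///run/containerd/containerd.sock \
  --pod-manifest-path $(realpath ./pods)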
That’s because it’s capable of discovering pod definitions from the filesystem,
making the whole thing possible without an
apiserver at all.
Put a pod definition there, and
kubelet will run it for you.
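For instance, a minimal definition dropped into that directory (the image and names here are just an example):

mkdir -p ./pods

cat > ./pods/nginx.yaml <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
    - name: nginx
      image: nginx:alpine
      ports:
        - containerPort: 80
EOF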
If you want to see the whole thing put together in a Makefile that does it all, check out https://github.com/cirocosta/containerd-install/tree/kubelet-sample.