Hey,
During the work of getting concourse to have its containers created by containerd, one of the steps I took was to disable the cri plugin in the containerd configuration:
disabled_plugins = ["cri"]    # <<< (!)

[grpc]
  address = "/run/containerd/containerd.sock"

[debug]
  address = "/run/containerd/debug.sock"
  level = "debug"

[plugins]
  [plugins."io.containerd.runtime.v1.linux"]
    runtime = "runc"
    shim = "containerd-shim"
When doing so, we were essentially configuring containerd to not serve the interface that lets kubelet communicate with a runtime that should provide the primitives to materialize pods into something “tangible”.
Essentially, I was taking that “cri plugin” away from the diagram below:
kubelet
   |
   |    .---------------------------.
   |    |.------------.             |
   *----++->cri plugin| containerd -+---> containers
        |'------------'             |
        '---------------------------'
But, what if instead we put concourse aside and try to get containerd to materialize the Pods that a kubelet sees?
ps.: all of this article assumes you’re running Linux; in my case, kernel 5.3 from Ubuntu Eoan (19.10).
building kubelet from source
I thought this would be one of the hardest parts (a bunch of dependencies to figure out), but it turned out to be the easiest.
Having Go 1.13.5 already set up, I pretty much followed the k8s development guide that lives under kubernetes/community:
git clone https://github.com/kubernetes/kubernetes
pushd $_
make WHAT=cmd/kubelet
popd
As a result of that, _output/local/bin/linux/amd64/kubelet got populated with the kubelet binary.
For the sake of being able to call kubelet from anywhere, I linked /usr/local/bin/kubelet to that destination:
ln -s \
$(realpath ./_output/local/bin/linux/amd64/kubelet) \
/usr/local/bin/kubelet
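To double-check that the link works, a quick sanity check (the exact output depends on the commit you built from):

kubelet --version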
setting the kernel up
We’re dealing with container tech, so we need to customize some kernel parameters and ensure that some kernel modules are loaded.
Under the hood, containerd will use overlayfs (overlay filesystem), at least to manage container images, and br_netfilter (bridge netfilter) to work on packets that go through a bridge device that it sets up.
To ensure that those are loaded, we can use modprobe:
modprobe overlay
modprobe br_netfilter
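To confirm that both modules actually got loaded:

lsmod | grep -E 'overlay|br_netfilter'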
Naturally, this keeps them loaded only until a reboot. The systemd-modules-load service (from systemd itself) can take care of loading these modules for us during initialization, so that we don’t have to run modprobe every time the system boots.
cat > /etc/modules-load.d/containerd.conf <<EOF
overlay
br_netfilter
EOF
As the network functionality requires packets traversing the bridge to be sent to iptables for processing, we need to enable a kernel parameter that allows that to occur: net.bridge.bridge-nf-call-iptables (and its IPv6 equivalent).
With IP packets needing to be routed between multiple network interfaces (e.g., a container with an internal virtual ethernet device that tries to “ping” an external service will need its packets forwarded through a default gateway on another ethernet device, thus making the machine act as a router), we need to explicitly allow that in the kernel (through net.ipv4.ip_forward).
cat > /etc/sysctl.d/99-kubernetes-cri.conf <<EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF
sysctl --system
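A quick way of confirming that the settings took effect (each should print = 1):

sysctl net.bridge.bridge-nf-call-iptables \
       net.bridge.bridge-nf-call-ip6tables \
       net.ipv4.ip_forward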
That done, from the kernel perspective, it’s good to go.
installing containerd
Installing containerd requires setting three things up:
- its own binaries
- runc
- cni
I summarized the whole process in a Makefile under a repo (cirocosta/containerd-install), but here’s how it goes.
In order to facilitate cleaning things up, I put all binaries and configurations under the same directory tree (/usr/local/kubelet-sample/{bin,conf}).
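Here’s a minimal sketch of creating that tree (the conf/cni subdirectory is where I later point containerd’s CNI configuration at):

mkdir -p /usr/local/kubelet-sample/bin \
         /usr/local/kubelet-sample/conf/cni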
runc
runc is a single binary that can be retrieved right from the GitHub releases page: https://github.com/opencontainers/runc/releases
It’s essentially a dependency of containerd-shim-runc-v2, which implements containerd’s runtime v2 interface, providing containerd with the ability to create containers using runc.
containerd
  '--- tasks
         '---- runtime v2
                  '---- containerd-shim-runc-v2
                           '--- runc
Given that runc is a binary that gets called with a very specific set of arguments by whoever consumes it, the version here is very important - get one that’s not compatible with containerd and things might just suddenly fail.
For this reason, containerd makes that version explicit in its vendor file:
github.com/opencontainers/runc d736ef14f0288d6993a1845745d6756cfc9ddd5a # v1.0.0-rc9
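A minimal sketch for fetching exactly that version (assuming an amd64 machine and the runc.amd64 asset naming that the project uses for its releases):

VERSION=v1.0.0-rc9

curl -sSL -o /usr/local/kubelet-sample/bin/runc \
  "https://github.com/opencontainers/runc/releases/download/${VERSION}/runc.amd64"
chmod +x /usr/local/kubelet-sample/bin/runc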
containerd
As containerd is composed of more than just the containerd daemon (which provides the capabilities of creating containers, managing container images, and more), what we get from its release is a set of binaries that provide the bulk of the functionality:
ctr                     - cli to interact w/ containerd
containerd-stress       - stress testing
containerd              - the daemon
containerd-shim         - acts as a parent to the containers
containerd-shim-runc-v2 - implements the runtime iface for runc
containerd-shim-runc-v1 - same as v2, but older (I guess?)
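To pull those binaries in, here’s a sketch assuming an era-appropriate release (1.3.2 is just my pick here; the tarball ships the binaries under a bin/ directory, so extracting it right into the tree from before does the job):

VERSION=1.3.2

curl -sSL "https://github.com/containerd/containerd/releases/download/v${VERSION}/containerd-${VERSION}.linux-amd64.tar.gz" \
  | tar xzf - -C /usr/local/kubelet-sample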
As containerd is supposed to be running in the background as a daemon, I preferred to go with a systemd service, but anything would do the job here.
[Unit]
After=network.target
Description=an open and reliable container runtime
Documentation=https://containerd.io

[Service]
Delegate=yes
Environment=PATH=/usr/local/kubelet-sample/bin:/usr/sbin
ExecStart=/usr/local/kubelet-sample/bin/containerd --config=/usr/local/kubelet-sample/conf/containerd.toml
KillMode=process
LimitCORE=infinity
LimitNOFILE=1048576
LimitNPROC=infinity
Restart=always
TasksMax=infinity

[Install]
WantedBy=multi-user.target
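Assuming that unit is saved as /etc/systemd/system/containerd.service (the path is just my choice), registering it is the usual routine:

cp ./containerd.service /etc/systemd/system/containerd.service
systemctl daemon-reload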
With regards to the containerd configuration, not much is really needed aside from letting it know where the cni binaries are, and where the network configurations can be found:
diff --git a/default.toml b/./containerd.toml
index 2e72de9..0ddc311 100644
--- a/tmp/before
+++ b/./containerd.toml
@@ -84,8 +84,8 @@ oom_score = 0
       runtime_root = ""
       privileged_without_host_devices = false
     [plugins."io.containerd.grpc.v1.cri".cni]
-      bin_dir = "/opt/cni/bin"
-      conf_dir = "/etc/cni/net.d"
+      bin_dir = "/usr/local/kubelet-sample/bin"
+      conf_dir = "/usr/local/kubelet-sample/conf/cni"
       max_conf_num = 1
       conf_template = ""
     [plugins."io.containerd.grpc.v1.cri".registry]
cni
As cni is just an interface that gets implemented by plugins adhering to it, what we actually need to download in this case is the set of plugins that we plan to use when setting our pods up.
The reference ones, maintained by the CNI team, can be found under containernetworking/plugins, which releases all the binaries in the form of a compressed tarball.
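Here’s a sketch of fetching those plugins and dropping a basic network definition into the directories configured in the diff above (the v0.8.5 version, the network name, and the 10.88.0.0/16 subnet are all just assumptions of mine):

VERSION=v0.8.5

# the reference plugins (bridge, host-local, portmap, ...) come as a single tarball
curl -sSL "https://github.com/containernetworking/plugins/releases/download/${VERSION}/cni-plugins-linux-amd64-${VERSION}.tgz" \
  | tar xzf - -C /usr/local/kubelet-sample/bin

# a basic bridge + host-local configuration for the pods' network
cat > /usr/local/kubelet-sample/conf/cni/10-bridge.conflist <<EOF
{
  "cniVersion": "0.4.0",
  "name": "pod-network",
  "plugins": [
    {
      "type": "bridge",
      "bridge": "cni0",
      "isGateway": true,
      "ipMasq": true,
      "ipam": {
        "type": "host-local",
        "subnet": "10.88.0.0/16",
        "routes": [{ "dst": "0.0.0.0/0" }]
      }
    },
    {
      "type": "portmap",
      "capabilities": { "portMappings": true }
    }
  ]
}
EOF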
getting kubelet targeting containerd
Having all of those installed in known locations, it was now a matter of letting kubelet know where containerd lives, so that it can be targeted when kubelet realizes that it needs to instantiate the pods whose definitions were assigned to it.
To do so, we tweak just two parameters:
# specify that we'd like to connect to something that implements the CRI
#
--container-runtime remote

# specify where the CRI implementation lives
#
--container-runtime-endpoint unix:///run/containerd/containerd.sock
running a pod
Lastly, to let kubelet know which pods we want to run, we tell it where to look:
# where kubelet should look for pod definitions
#
--pod-manifest-path $(realpath ./pods)
That’s because it’s capable of discovering pod definitions from the filesystem, making the whole thing possible without an apiserver at all.
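Putting the three flags together, the invocation ends up looking something like this (depending on the machine, extra flags such as --fail-swap-on=false might be needed too):

kubelet \
  --container-runtime remote \
  --container-runtime-endpoint unix:///run/containerd/containerd.sock \
  --pod-manifest-path $(realpath ./pods)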
Put a pod definition there, and kubelet will run it for you.
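For instance, a minimal pod definition dropped into that directory (nginx here is just an example image):

mkdir -p ./pods

cat > ./pods/nginx.yaml <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
    - name: nginx
      image: nginx:alpine
      ports:
        - containerPort: 80
EOF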
If you want to see the whole thing wired up in a Makefile that does all of this, check out https://github.com/cirocosta/containerd-install/tree/kubelet-sample