Hey,
I recently bought a Raspberry Pi 3B+, and it seemed like a great target to run a Concourse worker on.
It turns out that just the process of building and adapting the Concourse binary itself was already an interesting exercise, so here I share the lessons learned and what that process looked like.
To provide ARM-compiled binaries for node_extra_exporter (a Rust-based Prometheus exporter for exposing some metrics that the traditional node_exporter doesn’t expose), it felt like having an ARM-based machine in my Concourse cluster would be a great idea - add an extra job for building the ARM binary and there you go!
If you’d like to know what it takes to get a Concourse worker ready to take workloads on ARM, make sure you stick around to the end.
- Concourse workers
- Compiling the Concourse binary
- Cross compiling Go code
- Compiling CGO code
- Cross compiling CGO code
- Building Guardian, the containerizer
- Building Baggageclaim, the volumizer
- Running the Concourse worker
- Generating the modified registry-image-resource
- Automating the process of building an ARM-based Concourse distribution
- Summarizing
- Some surprises
- Closing Thoughts
Concourse workers
Regardless of your knowledge of Concourse, the tl;dr is that, being a “continuous thing-doer”, it needs some form of compute nodes to do the things you want - these nodes are the workers.
To perform the job of “doing things”, on any platform, these workers are made of three components:
- a “containerizer”, that manages containers
- a “volumizer”, that manages volumes
- “beacon”, the piece that registers the worker against the cluster and manages its lifecycle
In the case of Linux, two1 of those three components (the “volumizer” and the “beacon”) are already part of the concourse binary. This means that we not only need to build concourse, but we also need to build the “containerizer”.
1 there’s an exception, but that won’t be covered here.
Compiling the Concourse binary
As Concourse is a project written entirely in Go (well, there’s also the UI, which is in Elm), all that we need at this stage is the Go toolchain.
As almost all of the Go code necessary for building Concourse now lives under a single repository (github.com/concourse/concourse), and Go modules are used to handle dependencies there, this should be the most straightforward part.
Given that installing Go is quite straightforward, I decided to do that right on my Raspberry Pi (why not?).
This means that just the following two steps are required:
# clone the main concourse repository
#
git clone https://github.com/concourse/concourse .
# build the main Concourse binary and put the result of
# the compilation into `$GOPATH/bin/`.
#
go install -v ./cmd/concourse
The problem, though, is that the Raspberry Pi is not as powerful as one would imagine: despite the 4-core SoC, the build takes no less than seven minutes once all of the dependencies are already in place. Yeah, SEVEN MINUTES with all dependencies already fetched. Without the dependencies: 20 minutes.
(For a full picture of the dashboard, click here to check out the interactive snapshot.)
As a side note, it was quite interesting for me to see how there are clearly very different phases that the compiler toolchain goes through when performing the build (not including the dependency fetching, which is not in this panel).
Cross compiling Go code
Knowing that if I’d need to recompile this multiple times, it’d be crazy slow and, thus, super time consuming, I decided to go with cross compiling - this way we could just use all of the speed we have available and, in the end, just ship the binaries.
Another benefit is that it’d be easy for anyone to build it too! No need for an actual Raspberry PI (or any ARMv7) to build Concourse.
The good news is that Go makes that whole job easy for us, making cross compilation dependent on just a few flags:
- GOOS: the name of the target operating system
- GOARCH: the name of the target architecture
- GOARM: the version of ARM that we want to target
For Go code that is free of any calls to C code (via CGO), this “Just Works™” by the magic of the default Go toolchain - the Go compilation infrastructure has all the necessary bits to perform the right translation for each of the architectures and OSes that it supports.
If you’re curious about the intermediate representation that the Go compiler generates and how that becomes machine-dependent code, check out the following example views of Go’s SSA:
ps.: to reproduce: GOSSAFUNC=main go build -gcflags "-S" main.go
However, being free of calls to C code is not a property of all projects - in the case of Concourse itself, dex, one of our dependencies, depends on go-sqlite3, which has bindings to C, which means that we now depend on CGO.
├ github.com/concourse/concourse/skymarshal/storage
├ github.com/concourse/dex/storage
├ github.com/concourse/dex/storage/sql
├ database/sql
├ database/sql/driver
...
└ github.com/mattn/go-sqlite3
├ database/sql
├ database/sql/driver
..
├ unsafe
└ C << CGO!!
If you’re curious about an example in go-sqlite3 where CGO gets used, check out sqlite3.go.
To see what I mean by “CGO doesn’t ‘just work’”, let’s try doing cross compilation with just those flags:
CGO_ENABLED=1 \
GOARCH=arm \
GOARM=7 \
GOOS=linux \
go build ./cmd/concourse/
# runtime/cgo
gcc: error: unrecognized command line option '-marm';
did you mean '-mabm'?
And the reason why it doesn’t work out of the box makes sense: when it comes to CGO, you’re not only in the realm of the Go toolchain - you’re also relying on the infrastructure that builds the C code, and there, things are slightly different. Let’s expand on that.
Compiling CGO code
As an example of what cross compilation with CGO looks like, let’s assume that we have a super extra very efficient library in C that is optimized for printing strings to stdout.
First, starting with the declaration of the function we want to consume from Go (in the printer.h file):
#include <stdio.h>
void super_optimized_print(char* str);
Then, the definition (printer.c):
#include "printer.h"

void super_optimized_print(char* str)
{
	printf("%s\n", str);
	return;
}
And now, finally, our Go code that uses CGO (main.go):
package main

// #include "printer.h"
// #include <stdlib.h>
import "C"

import "unsafe"

func main() {
	str := C.CString("hello world")
	defer C.free(unsafe.Pointer(str))
	C.super_optimized_print(str)
}
We can then build all of that code for our own OS and architecture without changing anything: give the package to the go compiler and let it do its job:
go build -v .
As, in this case, our C code is definitely portable, and we’re targeting our own machine architecture, everything works. Let’s now try to build for a different platform, though.
Cross compiling CGO code
However, if, again, we try the cross compilation, it’ll fail with the very same error we saw before:
CGO_ENABLED=1 GOOS=linux GOARCH=arm GOARM=7 go build -v .
gcc: error: unrecognized command line
option '-marm'; did you mean '-mabm'?
The good news is that, without even looking at the Go code, we can see how that separation of “who compiles what” happens by tracing all of the execves that take place:
PCOMM PID PPID ARGS
--------------------------------------------------------
go 29352 26142 go build .
cgo 29361 29352 cgo -V=full
compile 29362 29352 compile -V=full
compile 29365 29352 compile -V=full
compile 29367 29352 compile -V=full
compile 29376 29352 compile -V=full
asm 29381 29352 asm -V=full
asm 29382 29352 asm -V=full
cgo 29394 29352 cgo -objdir /tmp/go-build123931463/b003/...
gcc 29399 29394 gcc -E -dM -marm -I /tmp/go-build1239314...
For the output above, execsnoop from iovisor/bcc was used.
And that seems perfectly reasonable - to build C code, one needs a C compiler, something totally out of the realm of what the Go team should be focusing on, which is why the toolchain calls out to an external C compiler.
The failure we see there, though, comes from the fact that to have GCC perform the cross compilation, we first need to install separate cross-toolchain packages, and then point the build at the right compiler.
Luckily, letting Go know which compiler to use when performing the build comes down to setting the CC environment variable (see golang/go#gccBaseCmd()).
CGO_ENABLED=1 \
GOOS=linux \
GOARCH=arm \
GOARM=7 \
CC=arm-linux-gnueabihf-gcc \
go build -v
# works!!
PCOMM PID PPID ARGS
go 30825 26142 go build -v .
cgo 30834 30825 cgo -V=full
compile 30835 30825 compile -V=full
compile 30840 30825 compile -V=full
compile 30841 30825 compile -V=full
compile 30842 30825 compile -V=full
asm 30857 30825 asm -V=full
asm 30862 30825 asm -V=full
asm 30863 30825 asm -V=full
asm 30872 30825 asm -V=full
cgo 30877 30825 cgo -objdir /tmp/go-build130487924/b003/ ...
arm-linux-30882 30877 /usr/bin/arm-linux-gnueabihf-gcc -E -dM ...
cc1 30883 30882 /usr/lib/gcc-cross/arm-linux-gnueabihf/7/cc1 -E ...
...
Now, with the Concourse binary built, we can move to its dependencies.
Building Guardian, the containerizer
Concourse steps and resource checks run in containers1, and there’s a piece of the Concourse worker that is responsible for that. As the Concourse team is not necessarily in the business of creating container runtimes, Concourse uses a separate component for doing so: Guardian (gdn), an implementation of the Garden interface for container management.
With the process of creating Linux containers now standardized (see opencontainers/runtime-spec), Guardian takes the approach of leveraging what’s already there: it wraps runc, the de facto implementation of the Runtime Spec, allowing consumers of the Garden interface to have containers created by runc without leaking the implementation details through the Garden interface.
The detail here, though, is that while most of gdn is pure Go, there are many bits of runc (gdn’s dependency) that are C-based, and Guardian itself depends on other binaries (one being a C program).
Another detail in the process of building gdn is that, by default, gdn is not suitable for multiple architectures due to the way that it interacts with runc when asking runc to block specific syscalls.
// Seccomp represents syscall restrictions
//
// By default, only the native architecture of the kernel is allowed to be used
// for syscalls. Additional architectures can be added by specifying them in
// Architectures.
//
type Seccomp struct {
DefaultAction Action `json:"default_action"`
Architectures []string `json:"architectures"`
Syscalls []*Syscall `json:"syscalls"`
}
That means that, in gdn itself, we needed to have ARM included in the Architectures slice:
var seccomp = &specs.LinuxSeccomp{
DefaultAction: specs.ActErrno,
Architectures: []specs.Arch{
specs.ArchX86_64,
specs.ArchX86,
specs.ArchX32,
+ specs.ArchARM,
},
Syscalls: []specs.LinuxSyscall{
As we already know how to build all of those in a cross-platform way, we just need to follow the same recipe: set the right compiler, and there you go.
1: in platforms that support containers.
Building Baggageclaim, the volumizer
As mentioned before, the “volumizer” is already part of the worker; thus, by building concourse (the binary), we already have baggageclaim built.
The only detail for baggageclaim is that if its backing filesystem is set to btrfs, then the machine that runs the worker needs to have the btrfs CLI on it (built for the right platform).
Running the Concourse worker
At this point, our Concourse worker is in a state where it could run - it has all of its dependencies - even though it has no base resource types, meaning that it wouldn’t be able to run any steps or even checks.
The reason for that is that the root filesystem that Concourse fetches to run a container needs to come from somewhere - the resource type that is configured to retrieve those bits. As there are none to do so, nothing can fetch the base image, thus, nothing can run.
To break out of that, we have to create a resource type to ship with our cross-platform build that is able to retrieve root filesystems built for the specific platform that we target.
As we’re targeting something other than Linux amd64, that meant going through the cross compilation dance again, now for the registry-image resource.
The problem, though, is that compiling alone wouldn’t be sufficient - when a registry client asks for a container image that lives in a registry, it has to specify which platform that image was created for.
By default, the underlying library that registry-image-resource uses assumes the linux amd64 tuple, thus, I created a pull request (PR) to address that: https://github.com/concourse/registry-image-resource/pull/36.
Generating the modified registry-image-resource
As a base resource type is defined by having a rootfs and a resource_metadata.json (see https://concourse-ci.org/tasks.html#task-image-resource), the easiest way of getting to a rootfs would be to have a container image generated from a Dockerfile, then extracting the final root filesystem and placing it into a tarball.
Having that rootfs.tgz
that contains the root filesystem for the modified registry-image-resource
, that would mean that I could then distribute this resource in the tarball that contains all of the necessary bits for Concourse, effectively bootstrapping the whole thing!
While that sounds great, we have to remember that registry-image-resource makes requests to external systems.
The problem here is that to have the rootfs properly created, we need to execute some commands within the container that creates the rootfs - for instance, installing ca-certificates so that we can make requests to HTTPS endpoints.
# the final representation of the registry-image
# Concourse resource type.
#
FROM rootfs-${arch} AS registry-image-resource
COPY --from=registry-image-resource-build \
/assets/ \
/opt/resource/
RUN apt update -y && \
apt install -y ca-certificates && \
rm -rf /var/lib/apt/lists/*
As the base image of that container must be an ARM-based image, any binaries that we try executing there will be executing instructions that only ARM machines can run. Damn!
To overcome that, we have at least two options:
- create a bootstrapping container image once in the target architecture, or
- emulate the target architecture.
While 1 sounds like something that could work, 2 is now quite simple to achieve if you’re using a macOS or Windows 10 machine.
Since not too long ago, Docker for Desktop has been shipping their internal VM with the right hooks to be able to emulate other architectures in a very transparent way for the developer (see the recent announcement: Building Multi-Arch Image for Arm and X86 with Docker Desktop).
Automating the process of building an ARM-based Concourse distribution
As manually building all of those binaries is not fun at all, I made the whole process buildable by creating a multi-stage Dockerfile (see cirocosta/concourse-arm#Dockerfile).
Given that building such a Dockerfile requires a task that is able to do so, the Dockerfile builds a version of the builder-task, which wraps genuinetools/img, which is able to build container images from Dockerfiles (using buildkit).
Yeah, that’s a lot of names in a single article!
Summarizing
In the end, we built a bunch of stuff!
For instance, consider the building of the binaries:
And, for the registry-image-resource rootfs:
The good news is that it’s all declared in that very same Dockerfile I mentioned, making the build mostly reproducible.
Some surprises
- While trying to figure out what was going wrong with Guardian, I wanted so much to use dlv to troubleshoot what was going on, but, unfortunately, it doesn’t support any 32-bit systems at the moment.
- I didn’t know that in some Linux distros you have to run modprobe configs to have the /proc/config.gz file accessible for checking the configuration used to build that kernel - interesting to know!
Closing Thoughts
In the end, it turns out that it’s not super complicated to build a Go project (even one with some C code in its dependency tree) for other architectures - set the right variables here and there, make use of some emulation if needed, and there you go.
It’s quite cool what you can do using cross compilation, and how a combination of Go and container images with a well defined set of build steps can make the whole process of building for multi platforms work great even when you don’t have access to those platforms.
I’m very curious about the whole movement of supporting other architectures other than amd64, so it’s nice to start having a foot in this space.
Even though we made use of cross compilation, there were steps where we still needed to run some things on the target architecture itself, which makes me think that it might be worth investing in having Concourse workers running smoothly on these other platforms too.
Please let me know what you think! I’m @cirowrc on Twitter.
See you!