Hey,
Recently, something quite interesting came to Concourse’s attention: while
bumping the base image that we used to build our binaries, we ended up breaking
fly, the client-side CLI used to interact with Concourse installations, for
all of those users who were using it with an old version of Linux (2.6):
fly --version
FATAL: kernel too old
That’s crazy, right? How could we suddenly be facing that?
Going down the path of figuring out where that came from, it turns out that the
offender was glibc, which is statically linked into fly: when we bumped the
base image used to build fly, that ended up bumping glibc too, bringing it to a
version that puts a constraint on the minimum kernel version it supports, and
the rest is history.
With that resolved, it was time to learn a bit more about the pieces involved.
ps.: I’m not an expert on what I talk about in this blog post. Please let me know if I wrote something wrong by reaching out to me at @cirowrc.
glibc? libc?
First of all, I didn’t have a good idea of what glibc really was - sure, it
was something related to implementing the C standard, but what exactly?
“The term “libc” is commonly used as a shorthand for the “standard C library”, a library of standard functions that can be used by all C programs (and sometimes by programs in other languages).”
“[glibc is] By far the most widely used C library on Linux.”
Thus, naturally, we can infer that:
- there’s a standard set of functions and conventions that C programs can expect to leverage
- glibc is one of many possible implementations of such a standard.
Let’s go over these two items then.
1. standards and libc
The first item can be verified by a quick look at the glibc website:
“These libraries provide critical APIs including ISO C11, POSIX.1-2008, BSD, OS-specific APIs and more”
Well, that said, what if we went searching for that ISO C11 standard? It turns
out that you can either buy the 2018 version right from ISO (at a price of
almost US$200!), or get free access to what (from what I understand) is a draft
very close to the final version of the standard from open-std.org:
ISO/IEC 9899:201x.
At the very beginning of it, we can see what it’s all about:
[…] specifies the form and establishes the interpretation of programs expressed in the programming language C. Its purpose is to promote portability, reliability, maintainability, and efficient execution of C language programs on a variety of computing systems.
That’s pretty cool! But, more practically, what is it really about?
“There are five standard signed integer types, designated as signed char, short int, int, long int, and long long int.”
[…]
“An identifier can denote an object; a function; a tag or a member of a structure, union, or enumeration; a typedef name; a label name; a macro name; or a macro parameter”
But that’s definitely not something that glibc would care about, right?
That’s something that a compiler (like GCC) would implement.
Going further in the reading (that doc has 701 pages), we can see a section about something that is not compiler-specific: libraries.
7.1.2 Standard headers
The standard headers are: <assert.h>, <complex.h>, <ctype.h>, <errno.h>, …
7.2 Diagnostics <assert.h>
7.2.1.1 The assert macro
void assert(scalar expression);
When it is executed, if expression (which shall have a scalar type) is false (that is, compares equal to 0), the assert macro writes information about the particular call that failed (including the text of the argument, the name of the source file, the source line number, and the name of the enclosing function […]
Now, that’s something interesting - there are some default facilities that as a consumer of the language I can use to be productive with it.
Guess who’s implementing that stuff? Exactly, glibc!
#define assert(expr) \
  ((expr) \
   ? __ASSERT_VOID_CAST (0) \
   : __assert_fail (#expr, __FILE__, __LINE__, __ASSERT_FUNCTION))

void
__assert_fail (const char *assertion, const char *file, unsigned int line,
               const char *function)
{
  __assert_fail_base (_("%s%s%s:%u: %s%sAssertion `%s' failed.\n%n"),
                      assertion, file, line, function);
}
(from assert/assert.h and assert/assert.c)
If glibc implements a standard (which is not driven by a single
implementation), who else is implementing it?
2. multiple implementations of libc
Aside from glibc, another implementation that is quite popular is musl,
whose license (MIT) is very different from glibc’s (LGPL).
“musl is a new general-purpose implementation of the C library. It is lightweight, fast, simple, free, and aims to be correct in the sense of standards-conformance and safety.”
At least in size, one can tell the difference:
- 2.0M, libc-2.27.so (ubuntu 18.04)
- 566.5K, ld-musl-x86_64.so.1 (alpine 3.10)
I’m not very familiar with the practical differences between one and the other,
aside from the fact that musl seems to be the de-facto choice when it comes to
statically linking libc (e.g., it’s Rust’s choice).
I highly recommend checking out musl’s FAQ.
what does golang have to do with glibc?
It turns out that those implementations of libc don’t only bring the
implementation of the C library - they’re also concerned with the Portable
Operating System Interface (POSIX), a standard that’s all about how one goes
about communicating with the underlying OS.
POSIX.1-200x defines a standard operating system interface and environment, including a command interpreter (or “shell”), and common utility programs to support applications portability at the source code level.
That sounds great from the perspective of those trying to provide the standard library of a language, doesn’t it? It’s essentially all about letting people have portable code across a variety of systems.
For instance, it standardizes how getaddrinfo(3) should behave:
The getaddrinfo() function shall translate the name of a service location (for example, a host name) and/or a service name and shall return a set of socket addresses and associated information to be used in creating a socket with which to address the specified service.
And that’s the kind of stuff that, once again, glibc ends up implementing
(just like musl does!).
While that all sounds good, it turns out that implementation details matter:
many users have long relied on the fact that glibc consults NSS (the Name
Service Switch) before performing such lookups, which ends up forcing others
who want to provide the same functionality to replicate that behavior.
What did Go do in this case? It went with linking to libc (very clearly
glibc specifically, as POSIX and ISO C have nothing to do with NSS), and
deferring to it the resolution of names and users.
Now, you might ask: “what about those not using glibc? Do things break when
using Go?” It turns out that the answer is: yes!
For instance, Alpine Linux doesn’t have an nsswitch configuration file
(remember, Alpine is musl-based), and thus Go behaves badly by default.
Given that such an interface has been stable for … a very long time, and people
have been leveraging it to further extend what can be done with it, Go ships
with the ability to compile a binary so that it links against a libc, extending
its address resolution to leverage all of the extra stuff that projects like
glibc have already implemented.
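To make the other side of that trade-off concrete, here is a minimal sketch of opting out of the libc-backed resolver for a single lookup. It assumes the standard library’s net.Resolver with PreferGo set, which asks Go to use its built-in DNS client instead of going through cgo; the helper name newPureGoResolver is hypothetical and only there for illustration.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

// newPureGoResolver is a hypothetical helper (not part of Go or Concourse)
// that returns a resolver preferring Go's built-in DNS client over libc.
func newPureGoResolver() *net.Resolver {
	return &net.Resolver{PreferGo: true}
}

func main() {
	// Bound the lookup so a missing or slow DNS server cannot hang the demo.
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()

	addrs, err := newPureGoResolver().LookupHost(ctx, "localhost")
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Println("localhost resolves to:", addrs)
}
```

The same preference can be applied to a whole process at run time with GODEBUG=netdns=go, or at build time with the netgo build tag.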
glibc does have kernel requirements
Back to the problem of the requirements that glibc imposes on a system…
Typically, as one would dynamically link against a local glibc, you’d have a
version of it that is compatible with the target, and thus a piece of
software that works.
By statically linking glibc, that premise is not true anymore though, and we learned this “the hard way”.
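If you are unsure which of the two situations a given binary is in, a quick check is to list the shared libraries it asks the dynamic loader for. The sketch below uses Go’s debug/elf package for that; importedLibraries is a hypothetical helper name, and treating an empty list as "statically linked" is my own simplification rather than a rigorous test.

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
)

// importedLibraries returns the shared libraries (DT_NEEDED entries) that the
// ELF binary at path requests from the dynamic loader. An empty result
// suggests that the binary is statically linked.
func importedLibraries(path string) ([]string, error) {
	f, err := elf.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	return f.ImportedLibraries()
}

func main() {
	// Inspect the binary given as the first argument, or this program itself.
	path, err := os.Executable()
	if err != nil {
		fmt.Println("could not locate the current executable:", err)
		return
	}
	if len(os.Args) > 1 {
		path = os.Args[1]
	}

	libs, err := importedLibraries(path)
	if err != nil {
		fmt.Println("could not read ELF file:", err)
		return
	}
	if len(libs) == 0 {
		fmt.Println(path, "has no dynamic dependencies (likely statically linked)")
		return
	}
	fmt.Println(path, "depends on:", libs)
}
```

Running it against a dynamically linked binary would typically list something like the system’s libc among the dependencies.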
Looking at the Golang runtime specifications, the version of Linux announced under Golang minimum requirements is 2.6.23+.
The problem is that this does not account at all for glibc being included, as
glibc can carry its own requirements on the kernel version.
For instance, by inspecting the mailing list, we can see that when glibc 2.26
was released, a new minimum kernel version requirement was introduced:
Linux kernel 3.2 or later is required at runtime, on all architectures supported by that kernel. (This is a change from version 2.25 only for x86-32 and x86-64.)
if (__LINUX_KERNEL_VERSION > 0 && version < __LINUX_KERNEL_VERSION) \
/* Not sufficent. */ \
FATAL ("FATAL: kernel too old\n"); \
(from glibc/.../dl-osinfo.h)
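To get a feel for what that comparison does, here is a rough Go sketch of the same idea. It assumes that the kernel release string is packed one byte per component (major, minor, patch), which is my understanding of how glibc encodes it; the names encodeRelease, tooOld and minKernel are made up for this example, and the clamping to 255 is my own addition.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// minKernel is the 3.2.0 floor quoted above, packed one byte per component
// (an assumption about the encoding, see the text).
const minKernel = 3<<16 | 2<<8 | 0

// encodeRelease packs the leading "major.minor.patch" of a kernel release
// string such as "2.6.32-431.el6.x86_64" into a single comparable integer.
func encodeRelease(release string) (int, error) {
	version, parts, i := 0, 0, 0
	for parts < 3 && i < len(release) && release[i] >= '0' && release[i] <= '9' {
		n := 0
		for i < len(release) && release[i] >= '0' && release[i] <= '9' {
			n = n*10 + int(release[i]-'0')
			i++
		}
		if n > 255 {
			n = 255 // clamp so one component cannot spill into the next
		}
		version = version<<8 | n
		parts++
		if i >= len(release) || release[i] != '.' {
			break
		}
		i++ // skip the dot
	}
	if parts == 0 {
		return 0, fmt.Errorf("no version number in %q", release)
	}
	// Missing components count as zero, e.g. "5.15" is treated as 5.15.0.
	return version << (8 * (3 - parts)), nil
}

// tooOld reports whether release is below the minKernel floor.
func tooOld(release string) (bool, error) {
	v, err := encodeRelease(release)
	if err != nil {
		return false, err
	}
	return v < minKernel, nil
}

func main() {
	release := "2.6.32-431.el6.x86_64" // sample used when /proc is unavailable
	if b, err := os.ReadFile("/proc/sys/kernel/osrelease"); err == nil {
		release = strings.TrimSpace(string(b))
	}
	old, err := tooOld(release)
	if err != nil {
		fmt.Println("could not parse kernel release:", err)
		return
	}
	fmt.Printf("kernel %s is below the 3.2 floor: %v\n", release, old)
}
```

With this encoding, a 2.6.32 kernel becomes 0x020620, which is smaller than 0x030200 (3.2.0), so the check would fail in the same way fly did.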
how can I see what requires cgo?
Before bringing up that we could potentially be better off by just removing all of our use of CGO in Concourse (see concourse#4342), I needed to find out where we were making use of it.
Here are some commands that might help you do the same:
# tags we'd use during regular compilation.
#
gotags="-tags netgo"
# grab a space-separated list of all of the dependencies we have.
#
dependencies=$(go list $gotags -f "{{.ImportPath}}{{range .Deps}} {{.}}{{end}}")
# list those that have cgo files.
#
go list $gotags -f "{{if .CgoFiles}}{{.ImportPath}}{{end}}" $dependencies