Hey,
While figuring out how to run a standalone kubelet with containerd, I had quite a bit of trouble getting the networking part set up, given that I had previously misconfigured the PATH environment variable set for the systemd service that I was using to run containerd.
Everything made sense to me: I did indeed have iptables in my PATH (after confirming that its location matched what I'd gotten for PATH), so why was it failing like that?
failed to locate iptables: exec: \"iptables\": executable file not found in $PATH
Having spent quite a bit of time just to figure out my stupid mistake of setting the wrong PATH in the systemd service, here's how I got to it.
who’s trying to find iptables?
That was the first question I had - perhaps I could trace who was failing to execve, and then that'd be all: I'd be able to tell which process did the execve, and thus be able to inspect its environment (through /proc/pid/environ):

proc1:
  | execve(iptables)
  |     <--- ERR
  | printf("failed to locate ...")
  |    '-------> with a misconfigured PATH
  '--> just gotta figure out who that is
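(For the record, the kind of one-liner I had in mind for that first idea would have looked roughly like this - it just prints who is calling execve and with which path; as we're about to see, it wouldn't have caught this particular failure.)

# print every exec along with the calling process
bpftrace -e 'tracepoint:syscalls:sys_enter_execve { printf("%-16s %s\n", comm, str(args->filename)); }'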
But, being a bit more familiar with execve, I knew that's just not how it works - execve expects either a relative path (from AT_FDCWD or another directory that you provide via a file descriptor), or an absolute path; it doesn't search PATH on its own.
Regardless, someone is going through what is set for their PATH, and then trying to see if iptables exists there.
All we needed now was to figure out which syscalls to trace:
ls /sys/kernel/debug/tracing/events/syscalls/ | grep enter | grep stat
sys_enter_fstatfs // filesystem statistics
sys_enter_newfstat
sys_enter_newfstatat
sys_enter_newlstat
sys_enter_newstat
sys_enter_statfs // filesystem statistics
sys_enter_statx
sys_enter_ustat // filesystem statistics
Setting aside the ones that only report filesystem statistics, newfstat is the only one that would require more than a one-liner to trace, as it takes an open file descriptor (in which case, an open would have already occurred first, which is not very likely to be something done under the hood). Thus, let's trace the rest and figure out who's issuing them:
#!/snap/bin/bpftrace
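// catch every path-based stat variant, skipping anything done by iptables itself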
tracepoint:syscalls:sys_enter_newfstatat,
tracepoint:syscalls:sys_enter_newlstat,
tracepoint:syscalls:sys_enter_newstat,
tracepoint:syscalls:sys_enter_statx
/ comm != "iptables" /
{
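// print who's asking, and which path they're checking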
printf("%-16s %s\n", comm, str(args->filename));
}
And then, there I found it:
bridge /opt/containerd/bin/iptables
the bridge command (a cni plugin, part of containernetworking/plugins) was only searching for iptables under /opt/containerd/bin (which matched exactly the PATH I had set up for containerd - the process that spawned bridge).
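If you ever land in the same spot, it's worth checking what environment the service actually runs with before anything else. A rough sketch of how I'd check and widen it (the unit name and the exact directory list here are assumptions; adjust them to your setup):

# see which PATH the service is really running with
systemctl show containerd.service --property=Environment

# then widen it through a drop-in (systemctl edit containerd.service), e.g.:
#   [Service]
#   Environment="PATH=/opt/containerd/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"

That keeps the original /opt/containerd/bin entry while also covering the standard locations where iptables usually lives.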
what else could’ve been done?
To further improve that code, we could have traced the exit side of those syscalls, and then filtered for the ones that were failing (i.e., the lookups that resulted in a “not found”):
cat /sys/kernel/debug/tracing/events/syscalls/sys_exit_newfstatat/format
name: sys_exit_newfstatat
ID: 691
format:
field:unsigned short common_type; offset:0; size:2; signed:0;
field:unsigned char common_flags; offset:2; size:1; signed:0;
field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
field:int common_pid; offset:4; size:4; signed:1;
field:int __syscall_nr; offset:8; size:4; signed:1;
field:long ret; offset:16; size:8; signed:1;
print fmt: "0x%lx", REC->ret
This way, the filter would look like / args->ret != 0 / (or, to catch only the “not found” case, / args->ret == -2 /, i.e. -ENOENT).
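Putting it all together, here's a sketch of what that could look like (same probe set as the script above; the filename has to be stashed at entry because the exit tracepoint only carries the return value, and -2 is -ENOENT, i.e. “not found”):

#!/snap/bin/bpftrace

// remember which path each thread is checking
tracepoint:syscalls:sys_enter_newfstatat,
tracepoint:syscalls:sys_enter_newlstat,
tracepoint:syscalls:sys_enter_newstat,
tracepoint:syscalls:sys_enter_statx
/ comm != "iptables" /
{
	@fname[tid] = args->filename;
}

// report only the lookups that came back "not found" (-ENOENT)
tracepoint:syscalls:sys_exit_newfstatat,
tracepoint:syscalls:sys_exit_newlstat,
tracepoint:syscalls:sys_exit_newstat,
tracepoint:syscalls:sys_exit_statx
/ @fname[tid] != 0 && args->ret == -2 /
{
	printf("%-16s %s\n", comm, str(@fname[tid]));
	delete(@fname[tid]);
}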