Moving volumes between workers in Concourse involves two steps:
- streaming a volume out of a machine
- streaming a volume into a machine
In case 1, baggageclaim creates an archive of the volume's directory (using tar), compresses it (using either zstd or gzip), and then sends the contents over to whoever is trying to consume it [1].
In case 2, baggageclaim does the opposite: it takes a stream of bytes, decompresses it, and then lets tar convert that into a directory tree in an empty volume (thus filling the volume with the contents as they were on the other machine).
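The two cases can be sketched end to end on a single machine, with a local pipe standing in for the network. This is only an illustration built from plain tar and gzip, not baggageclaim's actual code; the src and dst directory names and the file contents are made up for the example.

```shell
#!/bin/sh
# Sketch: "stream out" piped straight into "stream in" on one machine.
set -eu

# hypothetical source volume (src) and empty destination volume (dst)
work="$(mktemp -d)"
mkdir -p "$work/src/nested" "$work/dst"
printf 'hello\n' > "$work/src/file.txt"
printf 'world\n' > "$work/src/nested/other.txt"

# stream out: archive the directory to stdout and compress it
# stream in:  decompress stdin and unpack it into the empty directory
tar -cf - -C "$work/src" . \
  | gzip \
  | gzip -d \
  | tar -xf - -C "$work/dst"

# both directory trees should now hold the same contents
diff -r "$work/src" "$work/dst" && echo "volumes match"
```

In a real transfer the two halves of the pipe run on different machines, which is exactly where the question of ownership comes from.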
Given that a user (let's say, cirocosta with UID 1004) might exist on machine-a but not on machine-b, how does tar deal with that?
It turns out that tar does keep the UIDs around in the archive when you create it:
# create a compressed tarball w/ files that are owned by `1234`
#
touch {a,b,c}
sudo chown 1234 {a,b,c}
tar czvf ./unpriv-files.tgz ./a ./b ./c
# check who the owner is
#
tar --numeric-owner -tzvf ./unpriv-files.tgz
-rw-r--r-- 1234/0 ./a
-rw-r--r-- 1234/0 ./b
-rw-r--r-- 1234/0 ./c
           |
           UID
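The same check can be reproduced without root. The following is a minimal sketch that assumes GNU tar, whose --owner and --group options override the ownership written into the archive headers, so no chown is needed. The temporary directory is there only to keep the example self-contained.

```shell
#!/bin/sh
# Sketch (assumes GNU tar): record UID 1234 in an archive without chown'ing.
set -eu

work="$(mktemp -d)"
cd "$work"
touch a b c

# --owner / --group replace the on-disk owner in the archive headers
tar --owner=1234 --group=0 -czf ./unpriv-files.tgz ./a ./b ./c

# list the archive with numeric ids instead of user and group names
listing="$(tar --numeric-owner -tzvf ./unpriv-files.tgz)"
printf '%s\n' "$listing"
```

Each listed entry should carry 1234/0 in its owner column, even though the files on disk still belong to whoever ran the script.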
And we can see that this is also true for UID 0:
# make those files privileged
#
sudo chown 0 {a,b,c}
tar czvf ./priv-files.tgz ./a ./b ./c
# check the uid
#
tar --numeric-owner -tzvf ./priv-files.tgz
-rw-r--r-- 0/0 ./a
-rw-r--r-- 0/0 ./b
-rw-r--r-- 0/0 ./c
           |
           UID
When it comes to extracting the archive, though, things change quite a bit.
extracting as an unprivileged user
Regardless of how the UIDs are set up inside the archive, the files get extracted with the current user’s UID.
For instance:
# extracting the unprivileged payload (files w/ uid 1234), ends up w/
# files owned by myself (cirocosta uid=1004)
#
tar xvzf ./unpriv-files.tgz
ls -ln ./a ./b ./c
-rw-r--r-- 1 1004 1006 0 Nov 27 13:55 a
-rw-r--r-- 1 1004 1006 0 Nov 27 13:55 b
-rw-r--r-- 1 1004 1006 0 Nov 27 13:55 c
             |    |
             UID  GID
# extracting the privileged payload (files w/ uid 0), ends up w/
# files owned by myself (cirocosta uid=1004)
#
tar xvzf ./priv-files.tgz
ls -ln ./a ./b ./c
-rw-r--r-- 1 1004 1006 0 Nov 27 14:00 a
-rw-r--r-- 1 1004 1006 0 Nov 27 14:00 b
-rw-r--r-- 1 1004 1006 0 Nov 27 14:00 c
             |    |
             UID  GID
extracting as a privileged user
Given that a privileged user is free to change file ownership with chown(2), tar leverages that in this case and applies the ownership recorded in the archive:
# extracting the unprivileged payload (files w/ uid 1234), ends up w/
# files owned by 1234 (not ourselves - what was set in the archive).
#
tar xvzf ./unpriv-files.tgz
ls -ln ./a ./b ./c
-rw-r--r-- 1 1234 0 0 Nov 27 13:55 a
-rw-r--r-- 1 1234 0 0 Nov 27 13:55 b
-rw-r--r-- 1 1234 0 0 Nov 27 13:55 c
             |    |
             UID  GID
# extracting the privileged payload (files w/ uid 0), ends up w/
# files owned by 0 (just like in the archive).
#
tar xvzf ./priv-files.tgz
ls -ln ./a ./b ./c
-rw-r--r-- 1 0 0 0 Nov 27 14:00 a
-rw-r--r-- 1 0 0 0 Nov 27 14:00 b
-rw-r--r-- 1 0 0 0 Nov 27 14:00 c
             |    |
             UID  GID
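The difference between the two modes comes down to a default that can also be set explicitly. Assuming GNU tar, --no-same-owner (the default for ordinary users) ignores the ownership stored in the archive, while --same-owner (the default for root) tries to restore it. The sketch below forces the unprivileged behaviour, so it gives the same result no matter who runs it. The file names and the out directory are illustrative, and stat -c assumes GNU coreutils.

```shell
#!/bin/sh
# Sketch (assumes GNU tar and GNU stat): choose the ownership behaviour explicitly.
set -eu

work="$(mktemp -d)"
cd "$work"
mkdir out
touch a

# archive a file that claims to be owned by UID 1234
tar --owner=1234 --group=0 -czf ./files.tgz ./a

# --no-same-owner: ignore the archived UID/GID; the extracted file
# belongs to whoever runs tar, even when that user is root
tar --no-same-owner -xzf ./files.tgz -C ./out
owner="$(stat -c '%u' ./out/a)"
echo "archived uid: 1234, extracted uid: $owner, my uid: $(id -u)"

# --same-owner is the opposite and needs the privilege to chown,
# so it is left commented out here:
#   sudo tar --same-owner -xzf ./files.tgz -C ./out
```

Being explicit about the flag is useful when the same extraction command may run both as root and as a regular user and the resulting ownership has to be predictable.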
1: that’s more of a pipeline actually - tar | zstd | ...