Hey,
this is a quick tip that might save you some time if you need all the output of a script to go to a file and you don’t have control over how the script gets executed.
Here I go through two possible ways to tackle the problem:
- using `exec` to redirect the `stdout` and `stderr` to a specific file;
- using a subprocess and controlling its `stdout` and `stderr`.
Both achieve the same result, but in slightly different ways.
- When do you need this?
- Using exec to redirect stdout and stderr
- Redirecting subprocess stdout and stderr
- Closing thoughts
When do you need this?
The typical case for needing to ensure (from within the script) that the logs end up in a file is when you can’t control the shell invocation of your script (if you could, you’d simply forward both standard out and standard error to a file without modifying the script at all).
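Just to contrast, if you did control the invocation, plain shell redirection would be enough (the script and log file names below are just placeholders):
# No changes to the script itself - both stdout and
# stderr of the whole execution go to the log file.
./script.sh >/var/log/script.log 2>&1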
Note: if you’re using AWS EC2 Auto Scaling groups with a custom script for initializing your instance, you don’t necessarily need this to get your logs sent to a file - you can already grab them at `/var/log/cloud-init-output.log` (even if you don’t use cloud-init directly), unless you want to change the default file.
Using exec to redirect stdout and stderr
The whole thing is rather simple, although it relies on some bash details that most people don’t know:
#!/bin/bash
set -o errexit
readonly LOG_FILE="/var/log/script.log"
# Create the destination log file that we can
# inspect later if something goes wrong with the
# initialization.
sudo touch $LOG_FILE
# Make sure that the file is accessible by the user
# that we want to give permissions to read later
# (maybe this script is executed by some other user)
sudo chown ubuntu $LOG_FILE
# Open standard out at `$LOG_FILE` for write.
# This has the effect of sending everything that
# the script writes to stdout, from this point on,
# to the log file.
exec 1>$LOG_FILE
# Redirect standard error to standard out such that
# standard error ends up going to wherever standard
# out goes (the file).
exec 2>&1
The whole deal is about making use of `exec`, which is documented in `man bash`:
exec [-cl] [-a name] [command [arguments]]
If command is specified, it replaces the shell.
No new process is created.
The arguments become the arguments to command.
So, nothing fancy - just like an `execvp` call would do: replace the image of the current process and start executing the program we pass.
The catch is at the end of the command's description though:
...
If command is not specified, any redirections
take effect in the current shell, and the return
status is 0. If there is a redirection error, the
return status is 1.
It means that by making use of `exec` without a program, we can affect the redirections of the current process (the one executing our commands).
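To make the two behaviors concrete, here’s a tiny sketch (file and log names are made up) contrasting `exec` with and without a command - note that the pid never changes:
#!/bin/bash
# exec_demo.sh (hypothetical)

echo "running as pid=$$"

# Without a command: only the redirection takes effect
# and the shell keeps executing the following lines.
exec 1>/tmp/exec_demo.log
echo "same shell, same pid=$$, now writing to the log"

# With a command: the shell image gets replaced by
# `sleep` (same pid, new program) - nothing below runs.
exec sleep 60
echo "never reached"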
To understand that better, I started pausing the script at several break points to see what’s going on under the hood from the perspective of `/proc/$PROC/fd`.
At the first moment (I), nothing fancy - the standard file descriptors point to the pseudo terminal:
ls -lah /proc/$(pgrep script.sh)/fd
0 -> /dev/pts/1
1 -> /dev/pts/1
2 -> /dev/pts/1
255 -> /tmp/script.sh
At the second moment (II) though, the file has already been opened and standard out got redirected:
ls -lah /proc/$(pgrep script.sh)/fd
0 -> /dev/pts/1 # stdin, untouched
1 -> /var/log/script.log # redirected!
2 -> /dev/pts/1 # not redirected to the log file
# yet
255 -> /tmp/script.sh
At the third moment (III - the end of the script), as expected, both standard out and standard error are redirected to the log file:
0 -> /dev/pts/1 # stdin, untouched
1 -> /var/log/script.log
2 -> /var/log/script.log
255 -> /tmp/script.sh # other file descriptors - untouched
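If you want to reproduce the inspection, a minimal way of adding those break points is to drop some `sleep` calls into the script and run the `ls -lah /proc/$(pgrep script.sh)/fd` from another terminal during each pause (durations are arbitrary):
#!/bin/bash
set -o errexit

readonly LOG_FILE="/var/log/script.log"

sudo touch $LOG_FILE
sudo chown ubuntu $LOG_FILE

# Moment (I): everything still points to the pty.
sleep 30

exec 1>$LOG_FILE

# Moment (II): only stdout redirected so far.
sleep 30

exec 2>&1

# Moment (III): stdout and stderr both point to the log file.
sleep 30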
If you’re extra curious, take a look at the use of syscalls during the script execution using `strace`:
# Open the log file with mode `0666`, create it
# if it doesn't exist (`O_CREAT`), truncate it
# if it already exists (`O_TRUNC`) and open it
# as write only (`O_WRONLY`).
#
# With the successful `openat`, the return value
# tells us that we can access the file via the
# file descriptor `3`.
openat(AT_FDCWD,
"/var/log/script.log",
O_WRONLY|O_CREAT|O_TRUNC,
0666) = 3
...
# dup2 duplicates a file descriptor (first
# argument) creating a copy of it and giving
# it the number specified in the second
# argument (1, in this case).
#
# This comes from `exec 1>$LOG_FILE`, where `$LOG_FILE`
# is the file currently referred to by descriptor `3`.
dup2(3, 1) = 1
# Close the original file descriptor associated
# with our log file as we don't use it directly
# in the rest of the script - it was only
# opened to be duplicated later.
close(3) = 0
# Duplicate the stdout to also be used as the
# stderr (`2>&1`).
dup2(1, 2) = 2
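A trace like the one above can be obtained with an invocation along these lines (limiting the output to the syscalls we care about):
# `-f` follows any forked children; `-e trace=...`
# restricts the trace to the redirection-related
# syscalls.
strace -f -e trace=openat,dup2,close ./script.sh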
Redirecting subprocess stdout and stderr
Using bash, any command (or series of them) inside a parenthesis group (`(` and `)`) gets executed in a subshell (a copy of the original shell process, but in a `fork`ed one).
From the bash man pages:
A compound command is one of the following.
In most cases a list in a command's description
may be separated from the rest of the command by
one or more newlines, and may be followed by
a newline in place of a semicolon.
>>>>> (list) list is executed in a subshell environment.
Let’s take an example:
#!/bin/bash
set -o errexit
say_hello() {
echo "Hello! pid=$BASHPID"
}
# Calls `say_hello` from the main
# process.
say_hello
# Creates a subshell (in a sub process)
# and then calls `say_hello`.
#
# By sleeping we can look at `pstree` and
# figure out what the process tree looks
# like (the outer process waits for this
# inner process).
(
say_hello
sleep 3600
)
The execution leads to:
# Run the script and leave it running in the
# background
./script.sh &
Hello! pid=20127
Hello! pid=20128
# Check out the process tree of the recently
# executed script (`./script.sh`)
pstree -p $!
script.sh(20127)───script.sh(20128)───sleep(20129)
Being a compound command, the parenthesis group can have IO redirection applied to it as a whole:
#!/bin/bash
set -o errexit
readonly LOG_FILE="/var/log/script.log"
say_hello() {
echo "Hello! pid=$BASHPID"
}
# Calls `say_hello` from the main
# process.
#
# Logs the hello message to the process
# stdout as no redirection has been
# performed at this point.
say_hello
# Creates a subshell (in a sub process)
# and then calls `say_hello`.
#
# The subshell that gets created has both
# stdout and stderr redirected to the LOG
# file `/var/log/script.log`.
(
say_hello
sleep 3600
) &>$LOG_FILE
Execute the script and then inspect `/proc/<pid>/fd`, and you should see the redirection taking place:
# Look at the file descriptors of the
# parent process
ls -lah /proc/20370/fd
0 -> /dev/pts/1
1 -> /dev/pts/1
2 -> /dev/pts/1
255 -> /tmp/script.sh
# Look at the file descriptors of the
# child process (subshell) that got
# the redirection:
ls -lah /proc/20371/fd
0 -> /dev/pts/1
1 -> /var/log/script.log
2 -> /var/log/script.log
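Those pids are from my run; one way of grabbing them without hard-coding anything (assuming the script was started in the background) is something like:
# Start the script in the background, take the parent
# pid from `$!` and find the subshell through its
# parent-child relationship.
./script.sh &
parent=$!
sleep 1   # give the subshell a moment to get created
child=$(pgrep -P "$parent")

ls -lah /proc/"$parent"/fd
ls -lah /proc/"$child"/fd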
Closing thoughts
Maybe you’ll never need to do this kind of thing, but now you know how it works haha.
If you have any questions or just want to chat about something, reach me at @cirowrc on Twitter.
Have a good one!
finis