In my previous post, I provided insight on the importance of running docker with the
--init
flag to ensure the proper exit code is returned when an application calls the abort
function.
While researching on the subject, I stumbled upon a bug report on GitHub regarding this issue, for which a member of the community was only able to provide a workaround. My intrigue got the best of me and as a result, I am happy to share that I was able to update this bug report with my findings.
In this post, I am going to expand on these findings on the mechanism of signal handling in a docker container when it runs without this
flag.
The first step is to take a look at how abort
is
implemented in the GNU C Library.
The function initially unblocks the SIGABRT
signal:
1
2
3
4
5
6
7
8
/* Unblock SIGABRT. */
if (stage == 0)
{
++stage;
__sigemptyset (&sigs);
__sigaddset (&sigs, SIGABRT);
__sigprocmask (SIG_UNBLOCK, &sigs, 0);
}
Then it sends SIGABRT
(line 13):
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
/* Send signal which possibly calls a user handler. */
if (stage == 1)
{
/* This stage is special: we must allow repeated calls of
`abort' when a user defined handler for SIGABRT is installed.
This is risky since the `raise' implementation might also
fail but I don't see another possibility. */
int save_stage = stage;
stage = 0;
__libc_lock_unlock_recursive (lock);
raise (SIGABRT);
__libc_lock_lock_recursive (lock);
stage = save_stage + 1;
}
If the application does not have a user-defined signal handler (which my program inthe previous post did not), then it should be caught by the default signal handler, i.e. the kernel.
Since my program’s container was started without
the --init
flag, the application had the PID 1 in the container’s PID
namespace, and was treated as a standalone
“init” process for this namespace.
The Linux kernel handles signals differently for the init process
than it does for other processes. Signal handlers are not automatically registered for this process, meaning that
signals will not have effect by default.
Hence, as in my program, the signal would not be handled and the function would continue running.
At this stage, abort
assumes that the program has a user-defined handler which is malfunctioning, i.e. not killing the
process, thus abort
replaces it with the default handler (line 6):
1
2
3
4
5
6
7
8
9
10
/* There was a handler installed. Now remove it. */
if (stage == 2)
{
++stage;
memset (&act, '\0', sizeof (struct sigaction));
act.sa_handler = SIG_DFL;
__sigfillset (&act.sa_mask);
act.sa_flags = 0;
__sigaction (SIGABRT, &act, NULL);
}
It then makes the last attempt to send the signal:
1
2
3
4
5
6
/* Try again. */
if (stage == 3)
{
++stage;
raise (SIGABRT);
}
As expected, sending the signal again does not help.
As a natural response to a consistently failing methodology, abort
tries a different approach, where it attempts to execute a platform-specific command to terminate the
process:
1
2
3
4
5
6
/* Now try to abort using the system specific command. */
if (stage == 4)
{
++stage;
ABORT_INSTRUCTION;
}
For the x86_64 architecture, the command is defined in glibc/sysdeps/x86_64/abort-instr.h:
1
#define ABORT_INSTRUCTION asm ("hlt")
The instruction just pauses the CPU until the next external interrupt is fired, though it requires ring 0
access which is available only for privileged software, such as the kernel. When a process
attempts to violate such permissions, the hardware triggers the general protection fault (GPF) interrupt, and a kernel is expected to kill the violating process. In Linux OS, the kernel’s general protection fault exception handler
(exc_general_protection) is called. The
handler checks if the violator is a userspace process and if so, it calls force_sig(SIGSEGV);
which terminates the
process running in the docker, eventually setting its exit code to 139:
1
2
3
4
5
6
7
8
if (user_mode(regs)) {
tsk->thread.error_code = error_code;
tsk->thread.trap_nr = X86_TRAP_GP;
show_signal(tsk, SIGSEGV, "", desc, regs, error_code);
force_sig(SIGSEGV);
goto exit;
}
Once the application exits, docker sets its own exit code equal to the application’s.
As a demonstration of this behavior, I ran dmesg right after the container stopped in order to see the kernel logs:
$ dmesg [523555.291893] traps: app[109848] general protection fault ip:7ff4d9391a10 sp:7ffe992508a0 error:0 in libc-2.27.so[7ff4d9351000+1e7000]
This message shows the process “app” with the PID 109848 (the program’s PID outside of the
container’s PID namespace), the GPF error, and
Glibc (libc-2.27.so) that tried to execute the HLT
instruction.
Conclusion
Unless your application has user-defined signal handlers, it is strongly advised and encouraged to run a
container with the --init
flag as a safety measure, as this tiny docker-implemented init process will enable default signal handling for
your application and reap zombie processes.
Please share your thoughts on Twitter.