Killing a container from the inside

This past week a tsuru user was having a problem in one of his apps. Some units of his app would simply get stuck, timing out every request and apparently doing nothing. As a workaround he wanted to kill the problematic unit and force it to restart. On tsuru you are able to restart all units of an app but may not choose a particular unit to restart.

We then “ssh’ed” into the problematic container, using tsuru app shell -a <app-name>, and tried sending a SIGKILL to his application process (pid 1). Surprisingly, it did not work.

$ sudo kill -9 1 # our first try, from inside the container

We tried SIGTERM and SIGQUIT, and nothing happened. We then ssh’ed into the host, found the pid (using docker top), issued the SIGKILL and boom, the container restarted.

Reading the man page for kill(2) helped us understand this behavior:

The only signals that can be sent to process ID 1, the init process, are those for which init has explicitly installed signal handlers. This is done to assure the system is not brought down accidentally.

So, to be able to kill the container from the inside, you need to register a handler for the particular signal. It turns out that you cannot register a handler for SIGKILL (you are also not able to ignore this signal). So one must handle a different signal, e.g., SIGTERM, and use it to shut down the application (for example, by simply exiting from the handler).
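The claim that SIGKILL cannot be handled is easy to check. The snippet below is a minimal sketch, not part of the original investigation: can_install_handler is a hypothetical helper name that tries to register a do-nothing handler with sigaction and reports whether the kernel accepted it.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>

/* Do-nothing handler: we only care whether registration is accepted. */
void noop_handler(int sig)
{
	(void)sig;
}

/*
 * Hypothetical helper (not from the original post): returns 1 if a
 * handler could be installed for sig, 0 otherwise. On failure errno
 * is left as set by sigaction().
 */
int can_install_handler(int sig)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = noop_handler;
	sigemptyset(&sa.sa_mask);

	return sigaction(sig, &sa, NULL) == 0;
}
```

Calling can_install_handler(SIGTERM) should succeed, while can_install_handler(SIGKILL) should fail with errno set to EINVAL, which is how sigaction reports an attempt to change the action for SIGKILL or SIGSTOP.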

The following code shows an example that might be used to check this behavior.

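#include <signal.h>
#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>

/* Terminate the process, using the signal number as the exit status. */
void handler(int sig)
{
	/* _exit() is async-signal-safe; exit() is not. */
	_exit(sig);
}


int main(int argc, char *argv[])
{
	int duration;

	/* With an argument: sleep without registering any handler. */
	if (argc > 1)
	{
		duration = atoi(argv[1]);
		printf("Sleeping for %ds\n", duration);
		sleep(duration);
		exit(EXIT_SUCCESS);
	}

	/* Without arguments: register a SIGQUIT handler, then wait. */
	if (signal(SIGQUIT, handler) == SIG_ERR)
		exit(EXIT_FAILURE);

	/* Block until a signal arrives. */
	for (;;)
		pause();
}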

If the code is run as ./killable 30, the application will sleep for 30 seconds and then just exit. If that is the init process (pid 1) of a container, signals sent to it from inside the container have no effect, as no handler was registered. If no argument is provided, a handler for the SIGQUIT signal is registered, so a SIGQUIT sent from inside the container is delivered. In this latter case, we are able to kill the container successfully.

In the end, our advice to the user was to set up a signal handler for SIGTERM and to shut down the application when receiving this signal.
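For an application that needs to clean up before exiting, one common pattern is to have the SIGTERM handler only set a flag and let the main loop do the actual shutdown. The following is a minimal sketch under that assumption; install_sigterm_handler and shutdown_requested are hypothetical names chosen for illustration, not something the user's application or tsuru provides.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Set from the handler; volatile sig_atomic_t can be written safely there. */
static volatile sig_atomic_t term_requested = 0;

/* Handler: only records that SIGTERM arrived, no cleanup happens here. */
void on_sigterm(int sig)
{
	(void)sig;
	term_requested = 1;
}

/*
 * Hypothetical helper: installs the SIGTERM handler.
 * Returns 0 on success, -1 on failure (as sigaction() does).
 */
int install_sigterm_handler(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_sigterm;
	sigemptyset(&sa.sa_mask);

	return sigaction(SIGTERM, &sa, NULL);
}

/* Hypothetical helper: non-zero once SIGTERM has been received. */
int shutdown_requested(void)
{
	return term_requested != 0;
}
```

The application would call install_sigterm_handler() once at startup, check shutdown_requested() on each iteration of its main loop, and release its resources and exit when the flag is set. Because a handler is now registered, a kill -TERM 1 issued from inside the container should reach the process.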
