Toolserver:Daemons

In UNIX circles, a program that runs in the background is called a daemon. This page is a guide to running a program on the Toolserver as a daemon.

Setting up a daemon
The following procedure uses cron to run /sbin/start-stop-daemon at set intervals; start-stop-daemon checks if your daemon is still running, and starts a new instance if it is not.

Step 1. Write the daemon program itself. This is the program that actually does whatever it is you want to do. You have to add a small amount of initialization code so that your program will run as a daemon; see below.

Step 2. Test the command line for the start-stop-daemon program. This program does not run all the time. Its only purpose is to kick-start your program from step 1.

The basic command line is:
 * /sbin/start-stop-daemon -q --oknodo --start --pidfile /home/cbm/daemon/daemon.pid --name daemon.pl --startas /home/cbm/daemon/daemon.pl

where
 * /home/cbm/daemon/daemon.pid is the name of a file into which your daemon writes its pid when it starts
 * daemon.pl is the filename of the program
 * /home/cbm/daemon/daemon.pl is the command line to start the program.

For testing purposes, you will want to remove the -q option and add -v and sometimes -t, as documented at man start-stop-daemon. Make sure your command line for start-stop-daemon will start the daemon. Then make sure it will not start a second instance while the first is running. Then kill the first instance and make sure your command line will start a new instance. Then remove the -v and -t options, and add the -q option.

Step 3. Add the command line from step 2 to your crontab. First run crontab -e to edit your crontab. Then, if COMMANDLINE is the command from step 2, add this line to the bottom of the file:
 * 0,30 * * * * COMMANDLINE

This line tells cron to run /sbin/start-stop-daemon every 30 minutes, at minute 0 and minute 30 of each hour. See man -s 5 crontab for details on the format of this line.

Finally, you want to double-check that your daemon is not leaving zombie processes. Run ps aux | grep daemon.pl and look at the output. If you see any "defunct" processes, like the example below, then your program has not fully disassociated itself. In that case, disable the crontab entry, fix the program, and then repeat step 3.

cbm@nightshade:~/daemon$ ps aux | grep daemon.pl
cbm     28651  0.2  0.0      0     0 ? Zs  21:10   0:00 [daemon.pl] <defunct>
cbm     28706  0.0  0.0 105816  2064 ? Ss  21:10   0:00 /usr/bin/perl /home/cbm/daemon/daemon.pl

Writing your daemon program
To initialize itself as a daemon, your program must follow these general steps:
 * Fork a child, and have the parent exit.
 * The child (now the main process) calls setsid to start a new session and become its session and process group leader. The daemon is now detached from the original terminal, so closing that terminal (for example, when you log out) will not kill it. You can still kill the daemon with the kill command.
 * Chdir to the root directory /.
 * Close and/or reopen the standard I/O streams STDOUT, STDERR, and STDIN so that they are no longer attached to the original terminal. Failure to do this can result in zombies. You will want to make sure your own I/O only uses streams that were opened after the program became a daemon.
 * Set the umask to something appropriate.
 * For the second time, fork a child, and have the parent exit. Now the main process is the grandchild of the original process. This second fork is needed to prevent zombies in some circumstances.
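
The pid file named in step 2 has to be written by the daemon itself, and it should hold the pid of the final process, so the write belongs after the second fork. The following is a minimal Python sketch of that one step, assuming the example path from step 2. The helper name write_pidfile is hypothetical and not part of any library.

```python
import os

# Assumed path: use the same file you pass to --pidfile in step 2.
PIDFILE = "/home/cbm/daemon/daemon.pid"


def write_pidfile(path=PIDFILE):
    """Write the current process id to path, replacing any old content."""
    # Write to a temporary name in the same directory and then rename it,
    # so that start-stop-daemon never reads a half-written file.
    # On POSIX systems os.rename replaces an existing file atomically.
    tmp = "%s.%d.tmp" % (path, os.getpid())
    with open(tmp, "w") as f:
        f.write("%d\n" % os.getpid())
    os.rename(tmp, path)
```

Call write_pidfile() once, after the last fork; a pid written earlier would belong to a process that has already exited.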

Make sure your program can be run from the command line; use chmod to set its permissions so it is executable (for example, chmod u+x daemon.pl).

There are a few things you want to keep in mind with a long-running process:
 * Since you cannot check on the program from the console, some sort of logging is needed to diagnose runtime errors.
 * Don't consume excessive CPU, memory, or I/O.
 * Open files when you need them and close them as soon as you are done. Don't hog or leak filehandles.
 * If your program has an expensive state, save it regularly so that nothing is lost in a system crash.
 * Make your code robust against temporary errors such as network outages. Expect that your process may be unable to connect to external hosts for minutes at a time; retry all network requests after appropriate timeouts.
 * Use full paths to files that you open and programs that you execute.
 * Make sure the permissions are correct on files you create.
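
Two of the points above, logging and retrying after temporary errors, can be sketched in Python as follows. This is only an illustration under stated assumptions: the log path is made up, the names setup_logging and retry are hypothetical helpers, and the defaults of 5 attempts and 60 seconds are arbitrary.

```python
import logging
import time

# Assumed log location; an absolute path keeps working after chdir("/").
LOGFILE = "/home/cbm/daemon/daemon.log"


def setup_logging(path=LOGFILE):
    """Send log records to a file, since the daemon has no terminal."""
    logging.basicConfig(
        filename=path,
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(message)s",
    )


def retry(operation, attempts=5, delay=60, sleep=time.sleep):
    """Call operation() until it succeeds or the attempts run out.

    Waits `delay` seconds between tries and logs each failure.
    Re-raises the last error if every attempt fails.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except (IOError, OSError) as err:
            # Socket and URL errors are subclasses of these exceptions.
            logging.warning("attempt %d of %d failed: %s",
                            attempt, attempts, err)
            if attempt == attempts:
                raise
            sleep(delay)
```

A daemon would call setup_logging() once at startup, after its standard streams have been reopened, and then wrap each network request in retry(...), passing a function that performs the request. The sleep parameter exists only so the waiting can be replaced in tests.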

Python
This example code isn't best practice. You should probably use the daemon module instead. (This is installed on both nightshade and wolfsbane.)