Facebook Open Academy/Cron

A common requirement in infrastructure maintenance is the ability to execute tasks at scheduled times and intervals. On Unix systems (and, by extension, Linux) this is traditionally handled by a cron daemon. A traditional cron daemon, however, runs on a single server, so it does not scale and is a single point of failure. The few open source alternatives to cron that provide distributed scheduling either depend on a specific "cloud" management system or on other complex external dependencies, or are not generally compatible with cron.

Requirements
Wikimedia Labs needs a scheduler that:


 * Is configurable by traditional crontabs;
 * Can run on more than one server, distributing execution between them; and
 * Guarantees that scheduled events execute as long as at least one server is operational.

The ideal distributed cron replacement would have as few external dependencies as possible.

Research
Several interesting avenues of investigation, as well as possible alternatives and counterarguments, have already been raised in the related Bugzilla entry (which see).

What are the current solutions that exist and what lessons can be learned from them?
 * Chronos
 ** Supports dependencies between jobs
 ** Will retry failed jobs
 ** One of multiple nodes is elected a master
 ** Has many dependencies including Apache Mesos and Zookeeper
 * Cronie
 ** If I understand the man page correctly:
 *** It only allows jobs to be executed on one chosen server at a time
 *** Must manually switch the chosen server if it goes down
 *** Requires a network-mounted share for the directory containing the shared crontabs
 **** (FYI, that requirement is met for Lab's particular use case, but would normally indeed be considered onerous) -- Coren (talk)/(enwp) 06:22, 4 February 2014 (UTC)
 * Jenkins
 ** Meant for continuous integration not job scheduling
 ** Only allows one master (single point of failure)
 * Gearman
 ** Framework for distributing tasks
 ** Has fault tolerance and job retries
 ** Would still require a scheduler and worker application for the APIs to be written
 ** Seems to be better suited to executing jobs at arbitrary times rather than being scheduled

Language to use
Set to Python by fiat for expediency
 * Widely distributed, well known by a large development base
 * High availability of libraries
 * Many (most) Linux distributions default to it for system scripts
 * Open question: Python 2 or Python 3? (GLM)

What is the difference between a job and a schedule?
A job represents an entry in a crontab (e.g. */5 * * * * echo "megacron"). A schedule is a specific instance of that job, that is, one planned run at one particular time. In the example above, a new schedule is created for the job every five minutes.
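
To make the distinction concrete, here is a minimal sketch in Python. The field names follow the Job and Schedule objects in the API section; the `runs_between` helper is hypothetical and deliberately limited to a `*/N` minute field with every other field set to `*`. A real implementation would hand the interval to a crontab parser such as croniter instead.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Job:
    interval: str   # crontab time fields, e.g. "*/5 * * * *"
    command: str    # e.g. 'echo "megacron"'


@dataclass(frozen=True)
class Schedule:
    job: Job
    time_to_run: datetime


def runs_between(job, start, end):
    """Return one Schedule per firing of `job` in the range [start, end).

    Hypothetical helper: only supports a minute field of the form */N
    with every other field set to "*".
    """
    minute_field, *rest = job.interval.split()
    if not minute_field.startswith("*/") or any(f != "*" for f in rest):
        raise ValueError("this sketch only handles '*/N * * * *'")
    step = int(minute_field[2:])

    # Round start up to a whole minute, then walk forward minute by minute.
    t = start.replace(second=0, microsecond=0)
    if t < start:
        t += timedelta(minutes=1)
    schedules = []
    while t < end:
        if t.minute % step == 0:
            schedules.append(Schedule(job, t))
        t += timedelta(minutes=1)
    return schedules


job = Job("*/5 * * * *", 'echo "megacron"')
noon = datetime(2014, 2, 4, 12, 0)
schedules = runs_between(job, noon, noon + timedelta(minutes=15))
print([s.time_to_run.strftime("%H:%M") for s in schedules])
# ['12:00', '12:05', '12:10']
```

One job therefore yields many schedules over time, each pointing back at the same job.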

Should servers that run jobs be decoupled from servers that handle scheduling?
No, every server can potentially become a scheduler. This makes it easier for the users of our project because it abstracts the distinction between a worker and scheduler away from them.

How are the schedules and jobs stored and distributed between servers?
We depend on a database that is available and accessible to every server. The user will be able to choose from a few supported database types.

How is the worker that runs the schedule chosen and how is that information synchronized?
The server that has been running the longest will become the scheduler. A server will know that it has been running the longest if it has the oldest Id in the database. This approach is by far the simplest to implement and it seems to be working fine (so far).
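
The election rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: that the database hands out increasing Ids so the lowest Id is the oldest, that workers whose heartbeat is stale are skipped, and that the staleness limit is 30 seconds. The `Worker` fields and the `elect_scheduler` name are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Worker:
    id: int               # assumed to increase with registration order
    heartbeat: datetime   # time of the last check-in


def elect_scheduler(workers, now, timeout=timedelta(seconds=30)):
    """Return the live worker with the lowest id, or None if none is live."""
    live = [w for w in workers if now - w.heartbeat <= timeout]
    return min(live, key=lambda w: w.id, default=None)


now = datetime(2014, 2, 4, 12, 0, 0)
workers = [
    Worker(id=1, heartbeat=now - timedelta(minutes=5)),   # stale, skipped
    Worker(id=2, heartbeat=now - timedelta(seconds=3)),
    Worker(id=3, heartbeat=now - timedelta(seconds=1)),
]
print(elect_scheduler(workers, now).id)  # 2
```

Because every server evaluates the same rule against the same database rows, no extra messaging between servers is needed to agree on the scheduler.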

How late can a job run?
If a run of a job is missed, it will be run at the next possible opportunity. If multiple runs are missed, only one will be run.
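
A rough sketch of this catch-up rule follows. It assumes, for simplicity, that the job repeats at a fixed period measured from its last run; real crontab intervals are aligned to the clock and would need a crontab parser. The `due_runs` name is hypothetical.

```python
from datetime import datetime, timedelta


def due_runs(last_time_run, now, every):
    """Return (missed, run_now).

    missed  -- how many firings fell between last_time_run and now
    run_now -- True when at least one firing was missed; however many
               were missed, the job is run only once
    """
    missed = (now - last_time_run) // every
    return missed, missed > 0


last = datetime(2014, 2, 4, 12, 0)
print(due_runs(last, datetime(2014, 2, 4, 12, 17), timedelta(minutes=5)))  # (3, True)
print(due_runs(last, datetime(2014, 2, 4, 12, 3), timedelta(minutes=5)))   # (0, False)
```

After the single catch-up run, recording the current time as the job's last run time discards the remaining missed firings.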

How bad is it if a job runs multiple times?
This should never happen.

What libraries should we use to solve subproblems?

 * croniter for parsing crontab intervals
 * setuptools for packaging the project

How do we tell if a worker goes down?
A worker will periodically "check in" by updating its heartbeat timestamp in the database. If the timestamp becomes older than a predetermined threshold, the worker is considered to be down.
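
The check can be sketched as follows. The `HeartbeatTable` class is a hypothetical in-memory stand-in for the database column, and the 30-second threshold is an assumed value, not one taken from the project. The clock is injectable so the behaviour can be shown without waiting.

```python
import time

HEARTBEAT_TIMEOUT = 30.0  # seconds; assumed value


class HeartbeatTable:
    """In-memory stand-in for the heartbeat column in the shared database."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._last_seen = {}   # worker id -> timestamp of last check-in

    def update_heartbeat(self, worker_id):
        self._last_seen[worker_id] = self._clock()

    def is_down(self, worker_id):
        # A worker that never checked in is treated as down.
        last = self._last_seen.get(worker_id)
        return last is None or self._clock() - last > HEARTBEAT_TIMEOUT


now = [1000.0]                       # fake clock for the demonstration
table = HeartbeatTable(clock=lambda: now[0])
table.update_heartbeat("megacron-one")
now[0] += 10
print(table.is_down("megacron-one"))  # False
now[0] += 60
print(table.is_down("megacron-one"))  # True
```

The threshold should be several times the check-in period, so that one delayed update does not mark a healthy worker as down.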

How does the scheduler distribute schedules?
Whenever it creates a schedule, it will specify a time to run and a worker to run it. Workers are assigned in a round-robin fashion. Workers then fetch a list of jobs assigned to them, and run them at the appropriate time.
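
Round-robin assignment can be sketched with a rotating queue, mirroring the get_next_worker function in the API section ("returns the next available worker and moves it to the end of the queue"). The `WorkerQueue` class and `assign` helper are hypothetical names, and worker names stand in for real worker records.

```python
from collections import deque


class WorkerQueue:
    def __init__(self, workers):
        self._queue = deque(workers)

    def get_next_worker(self):
        """Return the worker at the front and move it to the back.

        Raises IndexError when no workers are registered.
        """
        worker = self._queue[0]
        self._queue.rotate(-1)   # rotate left: the front element goes last
        return worker


def assign(run_times, queue):
    """Pair each run time with the next worker in round-robin order."""
    return [(t, queue.get_next_worker()) for t in run_times]


queue = WorkerQueue(["one", "two", "three"])
print(assign(["12:00", "12:05", "12:10", "12:15"], queue))
# [('12:00', 'one'), ('12:05', 'two'), ('12:10', 'three'), ('12:15', 'one')]
```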

What user executes the schedules?
The UID of the user that created the job is stored. The schedules are then executed as that user.
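
One possible way to do this in Python is the `user` argument of `subprocess.run`, available on POSIX systems from Python 3.9. The sketch below is an assumption about how it could be done, not the project's implementation, and `run_as` is a hypothetical name. Switching to a UID other than the caller's own requires root, so the demonstration uses the current UID.

```python
import os
import subprocess


def run_as(uid, command):
    """Run `command` through the shell with the given numeric UID.

    Requires Python 3.9+ on a POSIX system. Switching to a UID other
    than the caller's own needs root privileges.
    """
    return subprocess.run(
        command,
        shell=True,           # crontab commands are shell command lines
        user=uid,
        capture_output=True,
        text=True,
        check=False,          # a failing job is reported, not raised
    )


result = run_as(os.getuid(), "id -u")
print(result.stdout.strip() == str(os.getuid()))  # True
```

A fuller implementation would probably also set the group, supplementary groups, environment and working directory of the job's owner.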

How bad is it if a deleted job still gets run?

 * If a server gets isolated and the crontab is updated, it'll run all the jobs on the old table in perpetuity for as long as it cannot connect to any of the other servers. GLM
 * If a job is deleted after its start time, it should run (unless deleted refers to "aborted"). Such cases would need to be handled manually by an administrator. JT
 * In a common case of deleting a job from the crontab during the job's idle period, it should not run on non-isolated servers (this definition depends on the technique we use for server selection). Otherwise, there is essentially no delete functionality provided. FC

API

 * Functions
 ** get_jobs [Obtains a list of jobs that correspond 1:1 to each entry in every user's crontab]
 *** returns [Job]
 ** get_jobs_for_user(user_id) [Obtains a list of jobs that correspond 1:1 to each entry in a single specified user's crontab]
 *** returns [Job]
 ** set_jobs([Job], user_id) [Removes existing jobs for the specified user and puts the list of Jobs into the datastore]
 ** set_job_time(Job) [Sets the last time run for the job to the current time]
 ** get_schedules(Worker) [Obtains a list of schedules that are currently assigned to the worker]
 *** returns [Schedule]
 ** add_schedules([Schedule]) [Puts a list of schedules into the datastore]
 ** add_schedule(Schedule) [Puts a schedule into the datastore]
 ** remove_schedule(Schedule) [Removes a schedule from the datastore]
 ** get_heartbeat(Worker) [Obtains the value of the heartbeat timestamp associated with the worker]
 *** returns timestamp
 ** update_heartbeat(Worker) [Updates the worker's timestamp to the current time]
 ** get_next_worker [Returns the next available worker and moves it to the end of the queue]
 *** returns Worker
 ** get_workers [Obtains a list of all workers]
 *** returns [Worker]
 ** create_worker [Adds a new worker to the datastore and returns it]
 *** returns Worker
 ** destroy_worker(Worker) [Removes a worker from the datastore]
 * Objects
 ** Job [A single parsed crontab row representing a single command to be run]
 *** String : interval [Ex: "* * * * *"]
 *** String : command [Ex: "foo.py"]
 *** String : user_id
 *** Datetime : last_time_run
 ** Schedule [A single run extracted from a Job, in a format that a worker can run]
 *** Worker : worker
 *** Datetime : time_to_run
 *** Job : job
 ** Worker [A machine capable of performing crontab tasks; the Scheduler is one specific Worker]
 *** Datetime : heartbeat [timestamp of last contact]
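
As an illustration of how these objects and functions fit together, here is a minimal in-memory sketch in Python. It covers only part of the API, and the `MemoryStore` class is a hypothetical stand-in for the shared database; the storage layout and the use of the local clock are assumptions.

```python
from collections import deque
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Job:
    interval: str
    command: str
    user_id: str
    last_time_run: Optional[datetime] = None


@dataclass
class Worker:
    heartbeat: datetime


@dataclass
class Schedule:
    worker: Worker
    time_to_run: datetime
    job: Job


class MemoryStore:
    """In-memory stand-in for the shared database (illustration only)."""

    def __init__(self):
        self._jobs = {}            # user_id -> list of Job
        self._schedules = []
        self._workers = deque()    # front of the queue is the next worker

    def get_jobs(self):
        return [job for jobs in self._jobs.values() for job in jobs]

    def get_jobs_for_user(self, user_id):
        return list(self._jobs.get(user_id, []))

    def set_jobs(self, jobs, user_id):
        # Replaces whatever jobs the user had before.
        self._jobs[user_id] = list(jobs)

    def set_job_time(self, job):
        job.last_time_run = datetime.now()

    def add_schedule(self, schedule):
        self._schedules.append(schedule)

    def get_schedules(self, worker):
        # Compare by identity: two workers may share a heartbeat value.
        return [s for s in self._schedules if s.worker is worker]

    def remove_schedule(self, schedule):
        self._schedules.remove(schedule)

    def create_worker(self):
        worker = Worker(heartbeat=datetime.now())
        self._workers.append(worker)
        return worker

    def get_next_worker(self):
        worker = self._workers[0]
        self._workers.rotate(-1)   # move the returned worker to the back
        return worker


store = MemoryStore()
a, b = store.create_worker(), store.create_worker()
job = Job("*/5 * * * *", 'echo "megacron"', user_id="1000")
store.set_jobs([job], "1000")
store.add_schedule(
    Schedule(store.get_next_worker(), datetime(2014, 2, 4, 12, 5), job))
print(len(store.get_schedules(a)), len(store.get_schedules(b)))  # 1 0
```

A database-backed version would keep the same method names and replace the dictionaries and queue with tables.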

Where can the code be tested in an isolated environment?
With appropriate credentials, you can ssh into all three of the following servers:
 * megacron-one.wmflabs.org
 * megacron-two.wmflabs.org
 * megacron-three.wmflabs.org

You should have root access on each of them using sudo. An NFS shared filesystem is mounted on each server at "/data/project/".