If you’re reading this article, it’s probably because you’ve heard of cron jobs, cron tasks, or crontab. Cron is a piece of software written for *nix-type operating systems to help with the scheduling of recurring tasks. You may want to use cron to schedule certain recurring actions in your Rails application, such as checking each day for expired accounts or clearing out expired sessions from your database.
It’s pretty easy to start working with cron jobs. You can start editing your cron tasks by running crontab -e.
This will open up a text file in your terminal’s default editor (probably vim or nano). You can change the editor that is used by prefixing the crontab command with a setting for the EDITOR environment variable, for example EDITOR=nano crontab -e.
Remember that your scheduled tasks will be run as the user that you use when you invoke crontab.
Once you’re in the editor, you can start creating and editing cron tasks. To schedule a task, you need to define when the task will run, then define the command to trigger that task. The scheduling syntax has five parts, separated by spaces (diagram taken from drupal.org):
The parts represent, in this order, the minute(s), hour(s), day(s) of the month, month(s), and day(s) of the week to run the command. You can use the asterisk (*) to represent every unit of the part in question. So using * * * * * schedules your task to run every minute of every hour on every day of every month and on every day of the week. In addition to the asterisk, you can use comma-delimited integers and names of the days of the week. You can also replace the 5 parts with a single predefined schedule, as explained in the Wikipedia article on Cron. Once you have your schedule defined, you can follow it with any valid shell command. For example, 0 0 * * * followed by a command runs that command every day at midnight.
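To make the five-part layout concrete, here is a minimal Ruby sketch that splits a crontab line into its named schedule parts and its command. The helper name split_cron_line and the sample entry are assumptions made for illustration; they are not part of cron or Rails, and the sketch does not handle predefined schedules such as @daily.

```ruby
# Names of the five schedule parts, in the order cron expects them.
CRON_FIELDS = %i[minute hour day_of_month month day_of_week].freeze

# Splits one crontab line into a hash of schedule parts and the command.
# Hypothetical helper for illustration only: it does not validate the values
# and does not understand shortcuts such as @daily.
def split_cron_line(line)
  parts = line.strip.split(/\s+/, 6)
  raise ArgumentError, "expected 5 schedule parts and a command" if parts.size < 6

  [CRON_FIELDS.zip(parts.first(5)).to_h, parts.last]
end

# Runs at 02:30 every Monday (day of week 1). The path and task name are made up.
schedule, command = split_cron_line("30 2 * * 1 cd /path/to/app && bundle exec rake some_task")
puts schedule.inspect
puts command
```

Everything after the fifth part is kept together as the command, which is why the command itself may contain spaces.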
Running Rails Application Code via Cron
It can be useful to run system and server tasks with cron, but a lot of times you’ll need to run Rails application code on a schedule. You can do this the hard way and set up a controller action (e.g. CronJobsController#some_task) that triggers application code to run, then set up a cron task to send a GET request to that action (e.g. /usr/bin/wget -O - -q -t 1 http://example.com/cron_jobs/some_task), or you can do it the easy way and run the code directly via cron. Rails has runners and rake tasks to facilitate this.
Rake Tasks from Cron
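To run an existing rake task, you can change your directory to your application root, then call the rake task through Bundler, for example cd /path/to/your/app && /path/to/bundle exec rake some_task. Cron runs commands with a minimal PATH, so it is safest to give the full path to the bundle executable. You can find that path by typing which bundle from within your application’s root folder.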
There’s no need to limit your rake tasks to ones that are already available via Rails or whatever framework and gems you happen to be using. You can write your own rake tasks and use those in your cron jobs as well. You can find more information about how to write your own rake tasks in the Rails Guides.
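As a rough illustration of a custom rake task, the sketch below defines one namespaced task and invokes it. The Account class and the accounts:deactivate_expired task name are hypothetical stand-ins so the example runs on its own; in a real Rails app the task would normally live in lib/tasks and depend on :environment so that your models are loaded.

```ruby
require "rake"

# Hypothetical stand-in for an ActiveRecord model so the sketch runs by itself.
class Account
  def self.deactivate_expired
    0 # a real implementation would update records and return how many changed
  end
end

namespace :accounts do
  desc "Deactivate accounts that have expired"
  # In a Rails app, write `task deactivate_expired: :environment` instead so
  # the application is loaded before the block runs.
  task :deactivate_expired do
    count = Account.deactivate_expired
    puts "Deactivated #{count} expired account(s)"
  end
end

# Cron would run this as `rake accounts:deactivate_expired`; invoking it
# directly here only shows that the task is defined and runnable.
Rake::Task["accounts:deactivate_expired"].invoke
```

Keeping the real work in a class method and the rake task as a thin wrapper means the same code can also be called from a runner or a test.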
Another way to run code in your Rails app directly from cron is by using Rails runners. To execute Rails code using rails runner, all you have to do is call rails runner "ruby code", for example rails runner -e production "Model.long_running_method". Here -e production sets the environment to “production”, and can be altered as needed. The Model.long_running_method portion represents a class method that can be called from within your Rails application. Using rails runner to run the code loads up your application into memory first, then evaluates the Ruby code in the Rails environment. When the task completes, the runner exits.
Debugging Custom Code
To debug a runner or a custom rake task, you can print to STDOUT from within your code (using puts or similar methods). Within crontab, make sure to set the MAILTO variable to your email address (for example, a MAILTO=you@example.com line above your tasks) so that you’ll receive that output.
As long as the server is set up properly to send outgoing email, you’ll get emailed the output.
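One way to make that emailed output more useful is to wrap the task in a small helper that prints when it started, when it finished, and any error it hit. The following is a minimal sketch; run_with_report and the task name are hypothetical, not something Rails or cron provides.

```ruby
# Runs the block and reports progress on `out` (STDOUT by default), so that
# cron can mail the output to the MAILTO address. Hypothetical helper.
def run_with_report(name, out: $stdout)
  started = Time.now
  out.puts "[#{started}] #{name} started"
  result = yield
  out.puts "[#{Time.now}] #{name} finished in #{(Time.now - started).round(2)}s"
  result
rescue StandardError => e
  out.puts "[#{Time.now}] #{name} failed: #{e.class}: #{e.message}"
  raise # re-raise so the process still exits with an error status
end

# Usage: wrap whatever the runner or rake task would normally call.
run_with_report("expire_sessions") { :done }
```

Re-raising after logging keeps the failure visible to anything that checks the exit status, while the printed line still ends up in the email.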
The Whenever Gem
One difficulty with using cron is that it is awkward to maintain and store your cron jobs. The syntax can be cumbersome for beginners as well, and it’s easy to make mistakes in setting up your schedules and your executables’ paths. One way to overcome these difficulties is to use the whenever gem. Just add gem 'whenever', require: false to your Gemfile and run bundle install. Then, within your application’s root directory, run bundle exec wheneverize . (the trailing dot is the path to your application).
This will add a schedule file to your config folder (config/schedule.rb). This file is where you set up your scheduled tasks. You can use regular cron scheduling, or you can use a more idiomatic DSL for scheduling. The gem’s README.md has examples of both.
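For orientation, a schedule file might look roughly like the sketch below. The every, runner, rake, and command methods come from the whenever DSL; the model, task, and script names are hypothetical placeholders. This is a configuration fragment that whenever evaluates, not a standalone script.

```ruby
# config/schedule.rb (a sketch; model, task, and script names are hypothetical)

# Idiomatic DSL: run a class method through `rails runner` every day at 4:30 am.
every 1.day, at: "4:30 am" do
  runner "Account.deactivate_expired"
end

# Several job types can share one schedule.
every 3.hours do
  rake "sessions:clear_expired"
  command "/usr/local/bin/some_script"
end

# Raw cron syntax is also accepted: midnight on the first day of each month.
every "0 0 1 * *" do
  runner "Report.generate_monthly"
end
```

Because this file lives in your repository, the schedule is versioned and reviewed along with the code it triggers.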
Once you’ve set up your scheduled tasks in config/schedule.rb, you still need to apply them to your crontab. From your terminal, run whenever with no options to preview the generated cron syntax without changing anything, then run whenever --update-crontab to write the jobs to your crontab.
There is a Capistrano extension for whenever, so if you are using Capistrano to deploy your application, make sure you take advantage of it.
Pitfalls of Crontab
While using the whenever gem can fix some of the issues with using cron to manage your application’s recurring tasks, there are several reasons you may want to avoid using cron tasks at all:
- It’s easy to forget to set up cron tasks on servers.
- If you are running multiple servers, you may not want them all to run the cron tasks. Doing so can cause conflicts, or at least force you to handle semaphores or locks in your code, which can get messy fast.
- Cron’s finest granularity is one minute, so it cannot schedule a task to run several times a minute.
- If your server goes down, it could miss a critical scheduled task, so it often becomes a necessity to run cron more frequently and have the code account for missed runs.
- Cron tasks can be difficult to debug. It is common practice to suppress all output, but by doing so you are possibly taking away a lot of good debugging information.
- Each time a runner or rake task is run on a Rails app, the entire app has to be loaded into memory, which can take a substantial amount of CPU time and a substantial amount of memory. If your cron tasks start overlapping, you can quickly run out of RAM and cause your server to go kaput.
Despite the above issues, there are some cases where running a cron task is perfectly acceptable and possibly better than some of the alternatives. When it is not absolutely essential that the task be run consistently, a cron job can be a quick and easy solution that doesn’t require a lot of setup. Combined with a whenever schedule.rb file and a deployment strategy that maintains your cron tasks, it can be a viable strategy.
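If you do stay with cron, one common way to keep overlapping runs from piling up on a single server is a non-blocking file lock. The sketch below uses Ruby’s File#flock; with_exclusive_lock and the lock file name are hypothetical. A file lock only coordinates processes on the same machine, so it does not solve the multiple-server case.

```ruby
require "tmpdir"

# Runs the block only if no other process holds the lock file.
# Returns true if the block ran, false if a previous run still holds the lock.
def with_exclusive_lock(path)
  File.open(path, File::RDWR | File::CREAT, 0o644) do |file|
    # LOCK_NB makes flock return false immediately instead of waiting.
    return false unless file.flock(File::LOCK_EX | File::LOCK_NB)

    yield
    true
  end
end

# Usage: the lock is released when the file is closed at the end of the block.
lock_path = File.join(Dir.tmpdir, "nightly_cleanup.lock")
ran = with_exclusive_lock(lock_path) { puts "Running nightly cleanup" }
puts "Skipped: previous run still in progress" unless ran
```

Skipping a run is usually cheaper than loading a second copy of the application while the first is still working.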
Alternatives to Crontab
Sidekiq and Sidetiq
In cases where cron tasks don’t quite cut it, there are alternatives. My personal favorite is a combination of Sidekiq and Sidetiq. Understand, though, that these aren’t simple to set up. They require a Redis server and additional code. When you run Sidekiq, your application is loaded into memory once and runs tasks as you schedule them in your code. This is nice because the code is constantly running and can quickly pick up new tasks and process them without additional overhead. When multiple application servers point to a single Redis server (or server cluster) endpoint, you can also have multiple Sidekiq instances pointing to that same Redis server and have them pick off tasks as they have availability. This is great for scaling and (mostly) eliminates issues of locking, semaphores, and race conditions.
The Heroku Scheduler
Another alternative to cron tasks is the Heroku Scheduler, if your application is running on Heroku. As is noted in the documentation, the Heroku Scheduler does not guarantee that tasks will be run, so you are encouraged to include a custom clock in your application to make sure that tasks are run. It also has a limit of running at most once every 10 minutes, so you may need to use a worker dyno and create your own looping mechanism in a long-running task to make sure tasks are run as frequently as they should be.
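A looping mechanism of that kind can be sketched in a few lines of plain Ruby. The run_every helper below is hypothetical, and its iterations and sleeper parameters exist only so the sketch can be tried quickly; a real worker process would loop indefinitely and would also need error handling.

```ruby
# Calls the block every `interval` seconds, measured from start to start.
# `iterations` and `sleeper` are hypothetical knobs for trying the sketch out.
def run_every(interval, iterations: Float::INFINITY, sleeper: ->(seconds) { sleep(seconds) })
  count = 0
  while count < iterations
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    yield
    count += 1
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    # Sleep only for the time left in this interval, and not after the last run.
    sleeper.call([interval - elapsed, 0].max) if count < iterations
  end
  count
end

# Usage: run a (made-up) task three times, a tenth of a second apart.
run_every(0.1, iterations: 3) { puts "checking for work" }
```

Subtracting the time the task took keeps the runs evenly spaced instead of drifting later with every iteration.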
While cron jobs are useful, and can get the job done in many instances, you should be careful when deciding how to implement scheduled and delayed tasks. Cron jobs are best for tasks that are not highly critical or essential to your application’s functionality or performance. They should also usually execute in a reasonably short amount of time so that they don’t bog down your server(s). When you have scheduled tasks of a critical nature, or tasks that need to be run more than once per minute, tools such as Heroku’s worker dynos or Sidekiq are very performant and viable solutions.