How to Set Up Healthchecks for Backups and Cron Jobs

Use Healthchecks to monitor whether your backups, cron jobs, and maintenance scripts actually ran, instead of discovering a failure only when you need the backup most.

Monitoring, Backups, Cron jobs

What you will learn

How to connect a scheduled task to Healthchecks, capture success and failure, and build a simple alerting loop around important jobs.

Good for

Restic backups, database dumps, system maintenance scripts, and any recurring job that should never fail quietly.

Risk to watch

Backup systems often fail from neglect, not complexity. A job that stopped running last week can look perfectly fine until restore day.

Before you begin

A backup or maintenance script that already works when you run it by hand.
curl installed on the server.
A Healthchecks account or self-hosted instance.
A place to store the ping URL safely, such as a protected env file.

Monitoring a web app is useful, but background jobs matter just as much. Backups, certificate renewal hooks, cleanup tasks, and database dumps often fail without any visible user-facing symptom. Healthchecks is a simple answer to that specific problem.

Warning: Do not add monitoring to a script you have never tested manually. Otherwise you can end up monitoring a broken job and only proving that broken code still runs on schedule.

Why this works so well for small systems

Healthchecks flips the monitoring model. Instead of polling your server from the outside and guessing whether a scheduled task ran, the task itself sends a ping when it starts, succeeds, or fails. That fits cron jobs well, because the only question that matters for a scheduled job is simple: did it finish in time?

Two common free or open-source-first options are the hosted Healthchecks service, for convenience, and a self-hosted Healthchecks instance, if you want to keep monitoring fully under your own control.

Step 1: Create a check in Healthchecks

Create a new check named something clear like nightly-restic-backup. Set:

The expected schedule, such as every 24 hours.
A grace period, such as 60 to 120 minutes.
The notification channels you actually look at, such as email, Slack, or Matrix.
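
As a rough model of how the schedule and the grace period interact, the sketch below classifies a check from the age of its last success ping. It assumes a simple period-based check. The function name check_status is hypothetical and is not part of Healthchecks; the service does this calculation for you.

```shell
#!/usr/bin/env bash

# Hypothetical helper: classify a check from the age of its last success ping.
# All three arguments are in seconds: age of last ping, period, grace period.
check_status() {
  local age=$1 period=$2 grace=$3
  if (( age <= period )); then
    echo "up"       # the ping arrived within the expected schedule
  elif (( age <= period + grace )); then
    echo "late"     # overdue, but still inside the grace period
  else
    echo "down"     # grace period elapsed, this is when alerts go out
  fi
}

# Period of 24 hours (86400 s) with a grace period of 90 minutes (5400 s).
check_status $((23 * 3600)) 86400 5400
check_status $((25 * 3600)) 86400 5400
check_status $((26 * 3600)) 86400 5400
```

A longer grace period means fewer false alarms from slow runs, at the cost of hearing about a missed backup later.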

Healthchecks will give you a unique ping URL. It usually looks something like this:

https://hc-ping.com/12345678-90ab-cdef-1234-567890abcdef

Save it in a protected file, not in a public repo.
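
One way to keep the URL out of the script itself is a small env file that only its owner can read. The sketch below is a minimal example under stated assumptions: the path /etc/restic-backup/healthchecks.env and the function names are hypothetical, and the directory must already exist.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical location; set ENV_FILE beforehand to use a different path.
ENV_FILE="${ENV_FILE:-/etc/restic-backup/healthchecks.env}"

# Write the ping URL to the env file with owner-only permissions.
write_env_file() {
  local url=$1
  # umask 077 ensures a newly created file is never group- or world-readable.
  ( umask 077 && printf "PING_URL='%s'\n" "$url" > "$ENV_FILE" )
  # chmod also tightens a file that already existed with wider permissions.
  chmod 600 "$ENV_FILE"
}

# Load PING_URL into the current shell and fail loudly if it is missing.
load_ping_url() {
  # shellcheck source=/dev/null
  . "$ENV_FILE"
  : "${PING_URL:?PING_URL is not set in $ENV_FILE}"
}
```

A job script would then call load_ping_url near the top instead of hardcoding the URL, which keeps the secret out of version control.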

Step 2: Wire the ping into your backup or cron script

Here is a practical backup wrapper script:

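#!/usr/bin/env bash
# Stop on errors, unset variables, and failed pipeline stages.
set -euo pipefail

# Placeholder URL: replace it with the ping URL of your own check.
PING_URL="https://hc-ping.com/12345678-90ab-cdef-1234-567890abcdef"
BACKUP_LOG="/var/log/restic-backup.log"

# Signal the start. "|| true" keeps the backup running if the ping fails.
curl -fsS -m 10 --retry 3 "$PING_URL/start" >/dev/null || true

# Run the backup, appending stdout and stderr to the log.
if /usr/local/bin/run-restic-backup.sh >>"$BACKUP_LOG" 2>&1; then
  # Success ping. If it cannot be delivered, the script exits non-zero.
  curl -fsS -m 10 --retry 3 "$PING_URL" >/dev/null
else
  # Explicit failure ping, then propagate the failure to cron.
  curl -fsS -m 10 --retry 3 "$PING_URL/fail" >/dev/null || true
  exit 1
fi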

This gives you three useful signals:

/start tells Healthchecks the job began.
The plain ping marks success.
/fail marks failure explicitly.

If you want to send command output with the ping, Healthchecks also supports attaching data, but start simple first.
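
When you are ready for that step, a sketch like the following sends the tail of the log in the request body and appends the job's exit status to the ping URL, which Healthchecks treats as success for 0 and failure for any other value. The function name report_result and the 10000-byte cut-off are assumptions for illustration; check the size limit of your own instance.

```shell
#!/usr/bin/env bash

# Hypothetical helper: report a finished job to Healthchecks.
# $1 = ping URL, $2 = exit status of the job, $3 = log file to excerpt.
report_result() {
  local ping_url=$1 status=$2 log_file=$3
  local body
  # Send only the tail of the log so the request body stays small.
  body=$(tail -c 10000 "$log_file" 2>/dev/null || true)
  # Appending the exit status: 0 is recorded as success, non-zero as failure.
  curl -fsS -m 10 --retry 3 --data-raw "$body" "$ping_url/$status" >/dev/null
}
```

In a wrapper you would capture the job's exit status first, for example by running the job followed by `|| status=$?`, and then call `report_result "$PING_URL" "$status" "$BACKUP_LOG" || true`.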

Expected outcome: A successful run appears in the Healthchecks dashboard with a recent timestamp, and failed runs alert you quickly instead of hiding in log files.

Step 3: Schedule it and verify the full flow

Add the wrapper to cron:

crontab -e

Example nightly job:

15 2 * * * /usr/local/bin/run-restic-backup-with-healthcheck.sh

Test it manually instead of waiting for the nightly run:

/usr/local/bin/run-restic-backup-with-healthcheck.sh

Then confirm all three layers:

The script exits successfully when the backup succeeds.
The log file contains useful output.
The Healthchecks dashboard shows the run and schedule correctly.
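
The first two layers can be checked from the shell. The sketch below is a hypothetical helper, not part of Healthchecks: it runs the wrapper once, prints the exit status, and reports whether the log grew. The dashboard still has to be checked by eye.

```shell
#!/usr/bin/env bash

# Print the size of a file in bytes, or 0 if it does not exist yet.
log_size() {
  if [ -f "$1" ]; then wc -c < "$1"; else echo 0; fi
}

# Hypothetical check for a manual test run.
# $1 = wrapper script to run, $2 = log file the job appends to.
verify_run() {
  local wrapper=$1 log_file=$2
  local before after status=0
  before=$(log_size "$log_file")
  "$wrapper" || status=$?
  after=$(log_size "$log_file")
  echo "exit status: $status"
  echo "new log bytes: $((after - before))"
  # Succeed only if the job exited cleanly and wrote something to the log.
  (( status == 0 && after > before ))
}
```

For the script in this guide, the call would be `verify_run /usr/local/bin/run-restic-backup-with-healthcheck.sh /var/log/restic-backup.log`.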

Recovery notes:

If the monitoring endpoint is temporarily unreachable, your backup should still run. That is why the /start and /fail pings above tolerate curl errors.
If the job logic changed, re-test the whole script manually before trusting the new schedule.
If the check fires false alerts after maintenance, use pause controls or update the grace period instead of ignoring notifications long term.

Troubleshooting common mistakes

The job runs, but Healthchecks says it missed a ping.
Confirm the exact ping URL, schedule, and grace period. Also check whether the script exits before reaching the success ping.

Healthchecks shows starts but not success.
The job is probably failing after the /start ping. Check the script log and test the underlying command separately.

Cron runs the script differently than my shell.
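Use absolute paths for binaries and files. Cron runs jobs with a minimal environment, typically a short PATH and only a few variables, so commands that work in your interactive shell may not be found.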
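
To reproduce this class of problem without waiting for the schedule, you can run the script in a stripped-down environment. The helper below is a hypothetical approximation: the exact variables and PATH that cron sets differ between cron implementations, so treat the values as assumptions.

```shell
#!/usr/bin/env bash

# Hypothetical helper: run a command string with a minimal, cron-like
# environment. env -i clears every variable except the ones listed here.
run_like_cron() {
  env -i \
    HOME="${HOME:-/}" \
    LOGNAME="${LOGNAME:-$(id -un)}" \
    SHELL=/bin/sh \
    PATH=/usr/bin:/bin \
    /bin/sh -c "$1"
}
```

For example, `run_like_cron /usr/local/bin/run-restic-backup-with-healthcheck.sh` should behave the same as the manual test. If it fails with "command not found", the script depends on something from your interactive PATH.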

I am getting alerts during planned downtime.
Pause the check in advance or temporarily widen the grace period while maintenance is in progress.
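
Pausing can also be scripted as part of a maintenance procedure. The sketch below assumes the Healthchecks Management API with a read-write API key and the v3 pause endpoint; verify the path and API version against the documentation of your own instance. The function name and parameters are hypothetical.

```shell
#!/usr/bin/env bash

# Hypothetical helper: pause a check through the Healthchecks Management API.
# $1 = API base URL, $2 = UUID of the check, $3 = read-write API key.
pause_check() {
  local api_base=$1 uuid=$2 api_key=$3
  # The endpoint expects a POST; the empty --data sends an empty body.
  curl -fsS -m 10 --request POST \
    --header "X-Api-Key: $api_key" \
    --data "" \
    "$api_base/api/v3/checks/$uuid/pause" >/dev/null
}
```

A call might look like `pause_check "https://healthchecks.io" "$CHECK_UUID" "$HC_API_KEY"`, with both variables loaded from a protected file. A paused check normally resumes on its own when the next ping arrives.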

What to do next

Once your jobs are observable, the next improvement is making notification delivery from your self-hosted apps more reliable. Continue with How to Send Email From Self-Hosted Apps With an SMTP Relay.