Continuing with the topic of using Jenkins as a dashboard, I’d like to highlight a few things that are essential for reliable job execution:

  1. Always use “set -xeu -o pipefail” as the first line of your bash script (or its equivalent if you use another language for the build script). You must be sure that every non-zero exit code triggers a notification. Some tools do not follow the convention and return exit code 0 even when there are errors. In that case, use the Log Parser plugin to search the output for errors and warnings and, if needed, fail the build based on what it finds.
  2. Always use the Build Timeout plugin and set reasonable timeouts for your builds, with failure on timeout. This ensures you don’t end up with hung jobs. Also add the Timestamper plugin, which prefixes every line of console output with a timestamp, so you can see how long each command took. This is useful for debugging.
  3. One more pitfall is that a slave may get disconnected. In that case, jobs tied to it will not run and no notification will be sent. Set up a special job that makes sure this doesn’t go unnoticed.
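
To illustrate the first point, here is a minimal sketch of why “pipefail” matters. It is a demonstration, not a real build script: the pipeline “false | cat” is a stand-in for any build step whose failure is hidden by a later, successful command, and the variable names are made up for this example. It assumes bash is available on the build machine.

```shell
#!/usr/bin/env bash
# Strict mode as the first command:
#   -x           print each command before running it (shows up in the console log)
#   -e           exit immediately when a command fails
#   -u           treat unset variables as errors
#   -o pipefail  a pipeline fails if any stage fails, not only the last one
set -xeu -o pipefail

# The demonstrations run in child shells so this script itself keeps going.
# Without pipefail, the failure of `false` is masked by the successful `cat`.
status_without=$(bash -c 'false | cat; echo $?')

# With pipefail, the same pipeline reports the failure.
status_with=$(bash -c 'set -o pipefail; false | cat; echo $?')

echo "without pipefail: $status_without, with pipefail: $status_with"
```

The first pipeline reports status 0 and the second reports status 1. Jenkins marks a shell build step as failed only when the script exits with a non-zero code, so without “pipefail” a broken stage in the middle of a pipeline would go unnoticed.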

This covers almost all possible cases in which a job doesn’t run and the administrator doesn’t get a notification. (Two remaining issues have been reported in the Jenkins Jira: one, two. Vote and comment if you want them to be fixed.)

That’s it, you’ve been warned. If you still use cron for critical tasks such as backups, don’t act surprised when you discover they’ve been failing for half a year.
