Using Jenkins as an automation dashboard, part 2

Continuing with Jenkins as a dashboard, I’d like to highlight some things that are essential for reliable job execution:

  1. Always use “set -xeu -o pipefail” as the first line of your bash script (or its equivalent if you use another language for the build script). You must be sure that every non-zero exit code triggers a notification.
    Some tools don’t follow the convention and return a 0 exit code even when there are errors. In that case, use the Log Parser plugin to search the output for errors and warnings, and possibly fail the build based on that.
  2. Always use the Build Timeout plugin and set reasonable timeouts for the builds, with failure on timeout. This ensures that you don’t have hung jobs.
    Also add the Timestamper plugin, which puts a timestamp on every line of the console output, so you can see how long each command took. Useful for debugging.
  3. One more pitfall is that a slave may get disconnected. In that case, no job will run and no notification will be sent. Use a dedicated job to detect disconnected slaves.
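
To make item 1 concrete, here is a minimal sketch of what each flag changes. Every case runs in a child bash so the failure can be observed without stopping the demo itself; the variable names (including UNDEFINED_DEMO_VAR) are made up for illustration, and -x is left out only to keep the output short.

```shell
#!/usr/bin/env bash
set -eu -o pipefail

# Without pipefail, a pipeline reports only the status of its last command,
# so the failing `false` goes unnoticed.
lax_status=0
bash -c 'false | cat' || lax_status=$?

# With pipefail, a failure anywhere in the pipeline fails the pipeline.
strict_status=0
bash -c 'set -o pipefail; false | cat' || strict_status=$?

# With -u, expanding an unset variable (hypothetical name) is an error.
unset_status=0
bash -c 'set -u; echo "$UNDEFINED_DEMO_VAR"' 2>/dev/null || unset_status=$?

# With -e, the script stops at the first failing command,
# so the echo after `false` never runs.
errexit_output=$(bash -c 'set -e; false; echo "still running"' || true)

echo "lax=$lax_status strict=$strict_status unset=$unset_status"
echo "output after failed command: '$errexit_output'"
```

Without these flags, a build script can fail halfway through and still finish with exit code 0, and Jenkins would mark the build as successful.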

This covers almost all the cases in which a job doesn’t run and the administrator doesn’t get a notification. (The 2 remaining issues are reported in the Jenkins Jira: one, two – vote and comment if you want them fixed.)

That’s it, you’ve been warned. If you still use cron for critical tasks such as backups, don’t act surprised the next time you discover they’ve been failing for half a year.

Using Jenkins as an automation dashboard, part 1


The most common automation tool on Unix-like systems is cron. It seems convenient at first, but it has a number of drawbacks. There are just too many things to worry about:

  1. If some command fails, there’s no notification.
  2. A command might not fail, but hang. No notification about that either.
  3. On top of that, a slow-running or hung command might lead to multiple instances of the script running simultaneously, which in some cases might be disastrous. So you have to add lock files, pid files, etc.
  4. Even if you do add a notification in the script, it might not be sent due to, say, the mail server being down.
  5. You need to store script logs. And to rotate them, too.
  6. And if something goes wrong and you do get a notification, you have to log into the problem server, find the log, and search it to see what really happened.

All this isn’t impossible to solve with enough scripting. But it’s tedious and time consuming.
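
As a rough illustration of that scripting, here is a sketch of a wrapper that handles only two of the items above: overlapping runs and hung commands. The function name run_guarded, its arguments, and the exit code 75 are hypothetical choices, and it assumes the `timeout` utility from GNU coreutils is installed.

```shell
#!/usr/bin/env bash
# Hypothetical cron wrapper: run_guarded LOCK_DIR TIMEOUT_SECONDS COMMAND...

run_guarded() {
    local lock_dir=$1 limit=$2
    shift 2

    # mkdir is atomic: it fails if a previous run still holds the lock.
    if ! mkdir "$lock_dir" 2>/dev/null; then
        echo "previous run still active: $lock_dir" >&2
        return 75
    fi

    local status=0
    # timeout kills the command after $limit seconds and returns 124.
    timeout "$limit" "$@" || status=$?

    # Release the lock and pass the command's exit status through.
    rmdir "$lock_dir"
    return "$status"
}
```

Even this leaves gaps: if the wrapper itself is killed, the stale lock directory blocks every later run, and notifications, log storage and log rotation are still not handled.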

It would be rather nice to have a bird’s eye view of your tasks instead, wouldn’t it? Current state, logs, history of launches, launch on demand, execution time control, chaining tasks, etc – all one click away in a browser? Well, Jenkins CI is just that. It solves all those problems.

Move any task that you care about to Jenkins, and sleep well – you will know when there are issues. It’s so much more convenient than scavenging servers for logs.

Stay tuned for the second part – proper Jenkins configuration for automation.

Using multiple ELBs with Cloudflare DNS

If you have an autoscaling app on EC2, you need an ELB to distribute traffic. And if you don’t trust ELB to be HA, you need at least 2 of them. The issue is that ELBs don’t have static IPs, so they can only be referenced with a CNAME. That limits your DNS hosting options to just Route53, because an apex record can’t be a CNAME. But what if you (for some reason) don’t want to use Route53?