Backup

ROLE bacula-client, bacula-director

We use Bacula to manage backups, usually across at least two machines. Each machine to be backed up includes the bacula-client role. A separate machine, configured by the bacula-director role, initiates the backups and stores the backup data.
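As a rough sketch of that split, a playbook might assign the two roles as shown here. The group names cloud_servers and backup_server are hypothetical placeholders for your own inventory, not names defined by these roles.

```yaml
# Sketch only: the host group names are placeholders, not part of the roles.
- hosts: cloud_servers        # every machine that should be backed up
  roles:
    - bacula-client

- hosts: backup_server        # the machine that initiates and stores backups
  roles:
    - bacula-director
```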

The client and director communicate over several ports. Since the client is in the cloud and our director is usually NATed and behind a firewall, we leave those ports blocked on both. When it is time for a backup, the director sets up an ssh tunnel for the ports. We use a restricted backup account with a separate ssh key to set up the tunnel, as described in the Bacula wiki. We can back up a mix of many machines, some needing a tunnel (like a cloud machine) and some that can directly connect (like a local PC).

We back up user data like the mail spool and document roots for the web sites, as well as system data like certs, DKIM keys, and so forth. The snapshot tag is another way to make a short-term copy of the system data. Is it worth it to back up logs?

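Bacula runs several processes, usually spread over at least two machines. Each is configured by its own file in /etc/bacula:

  bacula-dir.conf   Director: schedules and initiates jobs (director machine)
  bacula-sd.conf    Storage daemon: writes the backup data to storage (director machine)
  bacula-fd.conf    File daemon: reads the files on each machine being backed up (client)
  bconsole.conf     Console: settings for the bconsole program (director machine)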

Variables

The director uses a database for the catalog of files and jobs. Postgres is the default, but MySQL and SQLite3 are also available. Postgres and MySQL need a password but SQLite does not. The next three passwords authenticate the director to the client, storage, and console processes; they are usually long (50-60 character) random strings of letters and numbers. The ssh tunnel that Bacula uses requires a public and private key file.

bacula_db: pgsql        # or mysql / sqlite3
bacula_db_pw: changeme  # not needed for sqlite3

bacula_client_pw:  "{{ vault_bacula_client_pw }}"
bacula_storage_pw: "{{ vault_bacula_storage_pw }}"
bacula_console_pw: "{{ vault_bacula_console_pw }}"

bacula_tunnel_sshpub:   "{{ lookup('file', '~/.ssh/tunnel.auth') }}"
bacula_tunnel_sshkeypath: ~/.ssh/tunnel.key

# put the backups where you prefer.
bacula_storage_datadir: /backup/bacula
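
The three vault_* variables referenced above have to be defined somewhere, typically in a file encrypted with ansible-vault. A minimal sketch follows; the file location is a hypothetical choice and the values are placeholders, so substitute your own long random strings.

```yaml
# group_vars/all/vault.yml  (hypothetical location; keep it encrypted)
# Placeholder values only: give each variable its own random string
# of 50-60 letters and numbers.
vault_bacula_client_pw:  "REPLACE-WITH-RANDOM-STRING-FOR-CLIENT"
vault_bacula_storage_pw: "REPLACE-WITH-RANDOM-STRING-FOR-STORAGE"
vault_bacula_console_pw: "REPLACE-WITH-RANDOM-STRING-FOR-CONSOLE"
```

If you store them this way, encrypt the file with "ansible-vault encrypt" and supply the vault password when running the playbook, for example with --ask-vault-pass.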

The client does not need many settings. The bacula_client_fileset variable describes the directories you want backed up. There are many other knobs to play with if you want to do something fancy.

bacula_client_fileset:
  include:
    - /etc/letsencrypt
    - "{{mail_db_root}}"
    - "{{mail_dkim_root}}"
    - "{{mail_spool_root}}"
    - "{{webdata_root}}"

If you have detailed knowledge of Bacula, you can add options and exclude entries to the bacula_client_fileset dictionary. These translate to the equivalent Bacula FileSet sub-blocks. You can also include a freeform entry with raw text for your own jobs and filesets.
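
As an illustration only, a fileset with options and exclude entries might look something like the sketch below. The key names come from the paragraph above, but the value formats are assumptions (exclude as a list of paths like include, options as a list of raw FileSet Options directives), and the cache path is hypothetical. Check the role source for the exact shape it expects.

```yaml
# Sketch under assumptions: the value formats for options and exclude
# are guesses, not taken from the role itself.
bacula_client_fileset:
  options:
    - signature = SHA1          # standard Bacula FileSet Options directive
    - compression = GZIP        # standard Bacula FileSet Options directive
  include:
    - /etc/letsencrypt
    - "{{webdata_root}}"
  exclude:
    - "{{webdata_root}}/cache"  # hypothetical path, for illustration
```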

If a client does not need an ssh tunnel, set bacula_client_tunnel to false in the variables for that client. You can have some clients with a tunnel and others without. Without a tunnel, the Bacula ports on the director and client need to be open so they can contact each other: add 'bacula' to the firewall_services list.

# Set to no if a client does not need an SSH tunnel.   
#bacula_client_tunnel: no
#firewall_services: ['ssh', 'other things', 'bacula']

Creating these roles forced a deeper understanding of Ansible variables. Most roles need settings for just one machine at a time. The Bacula roles need settings over a network of machines, like each client of a director. Defaults make it even more interesting. The bacula-dflts role is used by the client and director roles to coordinate these settings. The source files for that role contain more discussion on defaults and hostvars.

Using Bacula

Backups are usually automatic, and a status email will be sent to the admin account when they are done. All other interaction is done on the director machine using the Bacula console, bconsole.

I recommend firing up the console and getting familiar with it before you need to. Try to manually run a backup ("run") and try to restore some files ("restore"). The Bacula manual has a nice walkthrough of the restore process.

Why Bacula?

We do backups so that we can recover to a prior "known good" state. Two reasons we might need that:

  1. Hardware problem, accidental deletion, or other PEBKAC situation.
  2. Someone has exploited the machine.

Initially I planned to use Borg because it is simple to configure, has nice features like deduplication and encryption, and has a good community. It stores data on local disk or over SSH to a server like those offered by rsync.net. Borg won't go to an S3 bucket, but a similar project called Restic will.

Once I thought more deeply about who could access the backups, I changed my mind. For a disk crash or mistake, any backup software is fine. All of them can provide a recent copy of any damaged files so you can get your machine back to normal.

The second scenario — a p0wned machine — is trickier. If you find yourself in this situation, you should not trust anything that the attacker had access to and, after figuring out how they got in, you should probably rebuild the machine from scratch. Happily, with everything in an Ansible playbook, this is not as big a job as it used to be.

Borg and Restic work locally on the machine, and can access past backups for pruning, etc. Once an attacker is on the machine, they can mess with the backups too. Borg has an append-only mode for this, but it just queues deletions until a trusted machine touches the backups. This makes any automatic pruning of old backups, even from a trusted machine, dangerous unless you are sure nothing was compromised.

So after considering all of these factors, I returned to Bacula. It is slightly more complex to configure, but it handles both restore scenarios, scales to as many machines as you want, and supports a wide range of storage options, including tape (which Borg thinks no longer exists). It requires setting up more than one machine, but you can keep your Bacula director and storage on a machine that you physically control, with minimal access to the outside world. Even a $10 Raspberry Pi Zero sitting on a shelf can back up a personal email/web server.

Choose backup storage wisely. A 128GB USB Key is spacious and portable, but a 300MB Fujitsu Eagle can resist small-arms fire.

NOTE: A new Fujitsu Eagle cost $10,000 in 1982. Think about that when you complain about terabyte SSD prices.