
bacula framework on github

I’m pretty happy with the Bacula environment I’ve created.

It has gone through a few iterations, and I’ve learned a lot since I started using it a few years ago. I think it’s only appropriate to share the evolution of my environment with as many people as possible, and I hope it can help save new bacula administrators some time.

Enough of the preamble: the project lives on my github project page. The most important feature this project has is the bcreate-fd tool I’ve written.

Bacula (the open source version) comes with very little in the way of tools to help you add clients and manage storage nodes. I originally wrote a pretty gnarly shell script that had to be run directly on the bacula director server to add clients. The script relied upon sed to modify a “template” file with the options specified, and when I put Bacula under git control, I started to run into usability issues whenever multiple administrators tried to add new clients.

After extending my original script far beyond its usefulness, I finally re-wrote it in Python.

Once I started working with Python, I was immediately able to use handy libraries like CouchDB Kit to interface directly with the CouchDB database (as opposed to running curl commands and parsing the output) and Jinja2 for templating.
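To give a feel for why a template engine is nicer than sed substitutions, here is a minimal sketch of rendering a client stanza from a template. It uses the standard library's string.Template as a stand-in for Jinja2 so it runs without extra packages. The trimmed-down template and the render_client helper are assumptions for illustration, not the project's actual template or code.

```python
from string import Template

# Hypothetical, trimmed-down client template. The real project uses Jinja2
# and a much longer template; string.Template is only a stdlib stand-in.
CLIENT_TEMPLATE = Template("""\
Client {
    Name = ${fqdn}-fd
    Address = ${fqdn}
    FDPort = 9102
    Password = ${password}
}
""")


def render_client(hostname, domain, password):
    """Return a Bacula Client stanza for the given host."""
    fqdn = f"{hostname}.{domain}"
    # substitute() raises KeyError if a placeholder is left unfilled,
    # which surfaces mistakes that a sed pipeline would silently ignore.
    return CLIENT_TEMPLATE.substitute(fqdn=fqdn, password=password)


if __name__ == "__main__":
    print(render_client("testing", "bayphoto.local", "ItsASecret"))
```

Because the values are passed in as named arguments, two administrators can render clients from the same template without editing a shared file in place.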

I have also geared the script towards a more distributed development model. Since we keep bacula under git control, you can specify your own bacula_dir in the fd.conf file. That way, you don’t need to replicate /usr/local/etc/bacula (or /etc/bacula), or sudo to root to make your changes.

This framework is not entirely turn-key though. It is based on the two Bacula environments I’ve built: my current bacula environment at Bay Photo Lab, and the infrastructure I started at LLNL.

The biggest external dependency is CouchDB. I use CouchDB as a middle man between my Puppet environment and Bacula. Puppet pulls down the configuration settings for a client (Bacula FD), such as the FD name, the password used to communicate with the bacula server (no shared passwords here, each client gets its own) and an SSL certificate that the client uses to encrypt/decrypt its data.
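As a rough sketch of the "each client gets its own password" idea, the snippet below generates a random secret and builds the kind of record Puppet could later pull down. The field names (fd_name, password) and the new_client_record helper are assumptions for illustration; the certificate is left out, and the real tool stores its document in CouchDB rather than returning a dict.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits


def generate_password(length=32):
    """Return a random alphanumeric password, generated per client."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def new_client_record(hostname, domain):
    """Build a hypothetical metadata document for one Bacula FD client."""
    fqdn = f"{hostname}.{domain}"
    return {
        "_id": fqdn,  # CouchDB uses _id as the document key
        "fd_name": f"{fqdn}-fd",
        "password": generate_password(),
    }


if __name__ == "__main__":
    print(new_client_record("testing", "bayphoto.local"))
```

Using the secrets module rather than random matters here, since the password protects the channel between the director and the client.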

Otherwise, everything else is part of how Bacula operates. I did set up a few conventions, though.

Schedules go in $bacula_dir/schedules.conf, Storage Nodes are declared as separate conf files in $bacula_dir/storage.d/, and FD configurations go in $bacula_dir/clients.d/. The bcreate_fd tool will parse schedules.conf and prepare a list of Schedules to choose from as well, so it’s all there to help the administrator out.
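Pulling the schedule names out of schedules.conf could look roughly like this. It is a simplified sketch that assumes each Schedule resource opens on its own line and carries a Name directive on a separate line; the list_schedules function is a hypothetical name, not the tool's actual parser.

```python
import re


def list_schedules(text):
    """Return the names of the Schedule resources found in the given text."""
    names = []
    in_schedule = False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if re.match(r"schedule\s*\{", line, re.IGNORECASE):
            in_schedule = True
        elif line.startswith("}"):
            in_schedule = False
        elif in_schedule:
            # Accept both Name = "Standard" and Name = Standard
            match = re.match(r'name\s*=\s*"?([^"]+?)"?$', line, re.IGNORECASE)
            if match:
                names.append(match.group(1))
    return names


if __name__ == "__main__":
    sample = """
    Schedule {
        Name = "WeeklyCycle"
        Run = Full 1st sun at 23:05
    }
    Schedule {
        Name = Standard
        Run = Incremental mon-sat at 23:05
    }
    """
    print(list_schedules(sample))
```

The resulting list can then be offered to the administrator as the set of valid schedule choices.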

Let me demonstrate my workflow so the configuration file makes a little more sense.

I only use this github project so I can share what I’ve done. My real bacula git repository is cloned from the github repo, and then managed on a separate server.

    mikec@b-bot ~/projects> git clone ssh://git@scm.bayphoto.local:7999/BACULA/bacula.git bacula_example.git
    Cloning into 'bacula_example.git'...
    remote: Counting objects: 735, done.
    remote: Compressing objects: 100% (700/700), done.
    remote: Total 735 (delta 134), reused 605 (delta 21)
    Receiving objects: 100% (735/735), 453.53 KiB, done.
    Resolving deltas: 100% (134/134), done.
    mikec@b-bot ~/projects> cd bacula_example.git/
    mikec@b-bot ~/p/bacula_example.git> cat scripts/fd.conf
    schedule: Standard
    os_type: unix
    storage_node: sd-1
    domain: bayphoto.local
    couchdb_server: https://puppet.bayphoto.local
    couchdb_db: bacula_meta
    bacula_dir: /home/BAYPHOTO/mikec/projects/bacula
    client_conf_dir: /home/BAYPHOTO/mikec/projects/bacula/clients.d
    bacula_cert_dir: /home/BAYPHOTO/mikec/projects/bacula/certs/

I’m first going to modify the bacula_dir, client_conf_dir and bacula_cert_dir values in my configuration file to match my new project home directory (which is now /home/BAYPHOTO/mikec/projects/bacula_example.git):

    schedule: Standard
    os_type: unix
    storage_node: sd-1
    domain: bayphoto.local
    couchdb_server: https://puppet.bayphoto.local
    couchdb_db: bacula_meta
    bacula_dir: /home/BAYPHOTO/mikec/projects/bacula_example
    client_conf_dir: /home/BAYPHOTO/mikec/projects/bacula_example/clients.d
    bacula_cert_dir: /home/BAYPHOTO/mikec/projects/bacula_example/certs/

Now, I’ll add a new client:

    mikec@b-bot ~/p/bacula_example.git> cd scripts
    mikec@b-bot ~/p/b/scripts> ./ 
    usage: [options] error: argument -H/--hostname is required
    mikec@b-bot ~/p/b/scripts> ./  --help
    usage: [options]
    optional arguments:
      -h, --help            show this help message and exit
      -c CONFIGFILE, --config-file CONFIGFILE
                            Use a different config file other than ./fd.conf
      -H HOSTNAME, --hostname HOSTNAME
                            Short hostname of fd client
      -d {bayphoto.local,}, --domain {bayphoto.local,}
                            Domain (ie: that the fd client is in
      -s {WeeklyCycle,WeeklyCycleAfterBackup,Standard,workstation_afterhours,workstation_duringmeeting}, --schedule {WeeklyCycle,WeeklyCycleAfterBackup,Standard,workstation_afterhours,workstation_duringmeeting}
                            Set a backup schedule for the client
      -t {unix,win,osx}, --os-type {unix,win,osx}
                            FD Client OS type
      -n {sd-1}, --storage-node {sd-1}
                            Bacula storage node
      --client-conf-dir CLIENT_CONF_DIR
                            Override the default client configuration directory
      --bacula-dir BACULA_DIR
                            Override the default bacula configuration directory

Those are all of my default options, pulled in from fd.conf, and as you can see, the schedules were automatically populated.
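The pattern of "defaults from fd.conf, overridable on the command line" can be sketched with argparse as follows. This is an illustration under assumptions: load_defaults is a hypothetical helper that reads plain key: value lines (the real file may well be parsed as YAML), and build_parser only mirrors a few of the options from the help output.

```python
import argparse


def load_defaults(text):
    """Parse simple 'key: value' lines into a dict of defaults."""
    defaults = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first colon only, so URLs in values stay intact.
        key, _, value = line.partition(":")
        defaults[key.strip()] = value.strip()
    return defaults


def build_parser(defaults, schedules):
    """Build a parser whose defaults come from the config file."""
    parser = argparse.ArgumentParser(description="Add a Bacula FD client")
    parser.add_argument("-H", "--hostname", required=True,
                        help="Short hostname of fd client")
    parser.add_argument("-d", "--domain", default=defaults.get("domain"))
    parser.add_argument("-s", "--schedule", choices=schedules,
                        default=defaults.get("schedule"),
                        help="Set a backup schedule for the client")
    parser.add_argument("-t", "--os-type", choices=["unix", "win", "osx"],
                        default=defaults.get("os_type"),
                        help="FD Client OS type")
    return parser


if __name__ == "__main__":
    conf = "schedule: Standard\nos_type: unix\ndomain: bayphoto.local\n"
    parser = build_parser(load_defaults(conf), ["Standard", "WeeklyCycle"])
    print(parser.parse_args(["-H", "testing"]))
```

Passing the parsed schedule names in as choices is what makes them show up in the --help output and lets argparse reject a schedule that does not exist.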

I’ll add a test client:

    mikec@b-bot ~/p/b/scripts> ./  -H testing
            hostname:   testing
            domain:     bayphoto.local
            schedule:   Standard
            fqdn:       testing.bayphoto.local
            os_type:    unix
            storage:    bup-sd-1
            client.d:   /home/BAYPHOTO/mikec/projects/bacula_example/clients.d
            couch_server: https://puppet.bayphoto.local
            couch_db: bacula_meta
    Client does not exist. A new record for testing will be created.
    testing does not have a certificate in https://puppet.bayphoto.local/bacula_meta. A new certificate will be generated.
    certificate pushed to https://puppet.bayphoto.localbacula_meta for testing

Let’s take a look at the generated testing.conf:

    mikec@b-bot ~/p/b/scripts> cat ../clients.d/testing.conf 
    # -*- coding: utf-8 -*-
    client {
        Name = testing.bayphoto.local-fd
        Address = testing.bayphoto.local
        FDPort = 9102
        Catalog = MyCatalog
        Password = HxTxlSQkSNS0Pt8Go9kqy6ZfkXIJaWBu
        File Retention = 40 days
        Job Retention = 1 months
        AutoPrune = yes
        Maximum Concurrent Jobs = 10
        Heartbeat Interval = 300
    }
    Console {
        Name = testing.bayphoto.local-acl
        Password = ItsASecret
        JobACL = "testing.bayphoto.local RestoreFiles", "testing.bayphoto.local"
        ScheduleACL = *all*
        ClientACL = testing.bayphoto.local-fd
        FileSetACL = "testing.bayphoto.local FileSet"
        CatalogACL = MyCatalog
        CommandACL = *all*
        StorageACL = *all*
        PoolACL = testing.bayphoto.local-File
    }
    Job {
        Name = "testing.bayphoto.local"
        Type = Backup
        Level = Incremental
        FileSet = "testing.bayphoto.local FileSet"
        Client = "testing.bayphoto.local-fd"
        Storage =  SD1FileD401
        Pool = testing.bayphoto.local-File
        Schedule = "Standard"
        Messages = Standard
        Priority = 10
        Write Bootstrap = "/var/db/bacula/%c.bsr"
        Maximum Concurrent Jobs = 10
        Reschedule On Error = yes
        Reschedule Interval = 1 hour
        Reschedule Times = 1
        Max Wait Time = 30 minutes
        Cancel Lower Level Duplicates = yes
        Allow Duplicate Jobs = no
    }
    Pool {
        Name = testing.bayphoto.local-File
        Pool Type = Backup
        Recycle = yes
        AutoPrune = yes
        Volume Retention = 1 months
        Maximum Volume Jobs = 1
        Maximum Volume Bytes = 5G
        LabelFormat = "testing.bayphoto.local"
        Maximum Volume Jobs = 5
    }
    Job {
        Name = "testing.bayphoto.local RestoreFiles"
        Type = Restore
        Client= testing.bayphoto.local-fd
        FileSet="testing.bayphoto.local FileSet"
        Storage = SD1FileD401
        Pool = testing.bayphoto.local-File
        Messages = Standard
        #Where = /tmp/bacula-restores
    }
    FileSet {
        Name = "testing.bayphoto.local FileSet"
        Include {
            Options {
                signature = MD5
                compression = GZIP6
                fstype = ext2
                fstype = xfs
                fstype = jfs
                fstype = ufs
                fstype = zfs
                onefs = no
                Exclude = yes
            }
            File = /
            File = /usr/local
            Exclude Dir Containing = .excludeme
        }
        Exclude {
        }
    }

Here is a really cool part that my co-worker helped me implement: leveraging git’s post-receive hook to automatically update the bacula director node.

In the remote git repository, I have a post-receive hook (hooks/post-receive) that automatically updates a git repo on our bacula server:

    git push service_account@bacula-dir:/data/staging/bacula.git

My push here will update that remote repository, the one we consider our “master” (it’s also tied into JIRA and Stash, great for tracking issues and progress).

We have another hook in THAT repo (bacula-dir:/data/staging/bacula.git) that will run bacula-dir -t to validate the config, and if that passes, update the real bacula/ directory:

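    # Check the pushed config out into the live Bacula directory
    export GIT_WORK_TREE=/usr/local/etc/bacula
    git checkout -f
    # Ask the director to test the configuration; non-zero exit means failure
    sudo /usr/local/sbin/bacula-dir -t
    if [ $? -ne 0 ]; then
        # Mail the offending commit (recipient address omitted)
        git show | mailx -s "Bacula Director test failed"
    fi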

If a commit fails the test, we get an email about where it failed and in what commit.
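The "validate first, update only if the test passes" ordering can also be expressed as a small Python helper. This is a sketch under assumptions, not our actual hook: deploy_if_valid and its deploy callback are hypothetical names, and for the ordering to make sense the test command would have to point at the staged config (bacula-dir accepts -c to select a config file) rather than the live one.

```python
import subprocess
import sys


def deploy_if_valid(test_cmd, deploy):
    """Run test_cmd and call deploy() only when it exits with status 0.

    Returns (ok, output), where output holds the test command's combined
    stdout and stderr on failure, ready to be mailed to the administrators.
    """
    result = subprocess.run(test_cmd, capture_output=True, text=True)
    if result.returncode != 0:
        return False, result.stdout + result.stderr
    deploy()
    return True, ""


def report_failure(output):
    """Print the failure to stderr; mail delivery is site-specific."""
    print("Bacula Director test failed", file=sys.stderr)
    print(output, file=sys.stderr)
```

In a hook this might be called as deploy_if_valid(["sudo", "/usr/local/sbin/bacula-dir", "-t"], update_live_config), where update_live_config is a hypothetical function that performs the git checkout into the live directory.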

So go ahead and take a look, fork it and have fun. I’m very new to Python, and open to pull requests and feedback.