bacula framework on github
2012-09-21
I’m pretty happy with the Bacula environment I’ve created.
It has gone through a few iterations, and I’ve learned a lot since I started using it a few years ago. I think it’s only appropriate to share the evolution of my environment with as many people as possible, and I hope it can save new Bacula administrators some time.
Enough preamble. Here is my github project page: https://github.com/m87carlson/bacula-director-conf The most important feature of this project is the bcreate-fd tool I’ve written.
Bacula (the open source version) comes with very little in the way of tools to help you add clients and manage storage nodes. I originally wrote a pretty gnarly shell script that had to be run directly on the Bacula director server to add clients. The script relied on sed to modify a “template” file with the options specified, and when I put Bacula under git control, I started to run into usability issues whenever multiple administrators tried to add new clients.
After extending my original create_client.sh script far beyond its usefulness, I finally re-wrote it in Python.
Once I started working with Python, I was immediately able to use handy libraries like CouchDB Kit to interface directly with the CouchDB database (as opposed to running curl commands and parsing the output) and Jinja2 for templating.
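To give a feel for the templating step, here is a minimal sketch of rendering a client stanza from a template. It uses the standard library's string.Template as a stand-in for Jinja2 so it runs without extra packages; the template text, the render_client helper, and the placeholder password are all made up for illustration and are not the project's actual template.

```python
from string import Template

# Hypothetical, trimmed-down template for a Bacula Client resource.
# string.Template stands in for Jinja2 here; a Jinja2 template would
# write {{ fqdn }} where this one writes ${fqdn}.
CLIENT_TEMPLATE = Template(
    "Client {\n"
    "  Name = ${fqdn}-fd\n"
    "  Address = ${fqdn}\n"
    "  FDPort = 9102\n"
    "  Password = ${password}\n"
    "}\n"
)


def render_client(hostname, domain, password):
    """Fill in the template for one client and return the config text."""
    fqdn = f"{hostname}.{domain}"
    # substitute() raises KeyError if a placeholder is left unfilled,
    # which is safer than silently writing a half-rendered config.
    return CLIENT_TEMPLATE.substitute(fqdn=fqdn, password=password)


rendered = render_client("testing", "bayphoto.local", "not-a-real-password")
print(rendered)
```

Compared with sed edits on a template file, each run renders from a clean template, so two administrators adding clients at the same time do not step on a shared file.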
I have also geared the script towards a more distributed development model. Since we keep Bacula under git control, you can specify your own bacula_dir in the fd.conf file. That way, you don’t need to replicate /usr/local/etc/bacula (or /etc/bacula), or sudo to root to make your changes.
This framework is not entirely turn-key though, and it is based on the two Bacula environments I’ve built: my current Bacula environment at Bay Photo Lab, and the infrastructure I started at LLNL.
The biggest external dependency is CouchDB. I use CouchDB as a middle man between my Puppet environment and Bacula. Puppet pulls down configuration settings for a client (Bacula FD), such as the FD name, the password used to communicate with the Bacula server (no shared passwords here; each client gets its own), and an SSL certificate that the client uses to encrypt and decrypt its data.
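As a rough illustration of the one-password-per-client idea, the sketch below generates a random alphanumeric password and wraps it in a dictionary shaped like a document you might store in CouchDB. The field names (_id aside, which CouchDB uses as the document key) and both helper functions are assumptions of mine for the example, not the schema the tool actually uses, and no database is contacted.

```python
import secrets
import string

# Letters and digits only, so the value can be dropped into a Bacula
# config file without quoting problems.
ALPHABET = string.ascii_letters + string.digits


def generate_fd_password(length=32):
    """Return a random alphanumeric password for a single client."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def build_client_record(fqdn):
    """Build a hypothetical per-client document; field names are assumptions."""
    return {
        "_id": fqdn,
        "fd_name": f"{fqdn}-fd",
        "password": generate_fd_password(),
    }


record = build_client_record("testing.bayphoto.local")
print(record["fd_name"], len(record["password"]))
```

Because every record carries its own password, a compromised client only exposes its own credentials.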
Otherwise, everything else is part of how Bacula operates. I did set up a few standards though.
Schedules go in $bacula_dir/schedules.conf, Storage Nodes are declared as separate conf files in $bacula_dir/storage.d/, and FD configurations go in $bacula_dir/clients.d/. The bcreate-fd tool also parses schedules.conf and prepares a list of Schedules to choose from, so it’s all there to help the administrator out.
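Pulling the schedule names out of schedules.conf can be sketched with a couple of regular expressions. This is only an approximation of what such a parser might do: the sample text is invented, the schedule_names function is hypothetical, and the regexes ignore comments and other corner cases of Bacula's config syntax.

```python
import re

# Invented sample in the style of a Bacula schedules.conf.
SAMPLE_SCHEDULES = """
Schedule {
  Name = "Standard"
  Run = Full 1st sun at 23:05
  Run = Incremental mon-sat at 23:05
}
Schedule {
  Name = WeeklyCycle
  Run = Full sun at 01:05
}
"""

# A Schedule resource has no nested braces, so a non-greedy match up to
# the first closing brace is enough for this sketch.
SCHEDULE_BLOCK = re.compile(r"Schedule\s*\{(.*?)\}", re.DOTALL | re.IGNORECASE)
NAME_LINE = re.compile(
    r'^\s*Name\s*=\s*"?([^"\n]+?)"?\s*$', re.MULTILINE | re.IGNORECASE
)


def schedule_names(text):
    """Return the Name of every Schedule resource, in file order."""
    names = []
    for block in SCHEDULE_BLOCK.finditer(text):
        match = NAME_LINE.search(block.group(1))
        if match:
            names.append(match.group(1))
    return names


names = schedule_names(SAMPLE_SCHEDULES)
print(names)
```

A list like this can then be offered to the administrator as the set of valid schedule choices.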
Let me demonstrate my workflow so the configuration file makes a little more sense.
I only use this github project so I can share what I’ve done. My real bacula git repository is cloned from the github repo, and then managed on a separate server.
mikec@b-bot ~/projects> git clone ssh://git@scm.bayphoto.local:7999/BACULA/bacula.git bacula_example.git
Cloning into 'bacula_example.git'...
remote: Counting objects: 735, done.
remote: Compressing objects: 100% (700/700), done.
remote: Total 735 (delta 134), reused 605 (delta 21)
Receiving objects: 100% (735/735), 453.53 KiB, done.
Resolving deltas: 100% (134/134), done.
mikec@b-bot ~/projects> cd bacula_example.git/
mikec@b-bot ~/p/bacula_example.git> cat scripts/fd.conf
[default]
schedule: Standard
os_type: unix
storage_node: sd-1
domain: bayphoto.local
couchdb_server: https://puppet.bayphoto.local
couchdb_db: bacula_meta
bacula_dir: /home/BAYPHOTO/mikec/projects/bacula
client_conf_dir: /home/BAYPHOTO/mikec/projects/bacula/clients.d
bacula_cert_dir: /home/BAYPHOTO/mikec/projects/bacula/certs/
[default]
schedule: Standard
os_type: unix
storage_node: sd-1
domain: bayphoto.local
couchdb_server: https://puppet.bayphoto.local
couchdb_db: bacula_meta
bacula_dir: /home/BAYPHOTO/mikec/projects/bacula_example
client_conf_dir: /home/BAYPHOTO/mikec/projects/bacula_example/clients.d
bacula_cert_dir: /home/BAYPHOTO/mikec/projects/bacula_example/certs/
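A file in this shape can be read with Python's configparser, which accepts "key: value" pairs out of the box. The sketch below is my guess at how defaults and command-line overrides could be combined; load_defaults and merge_overrides are hypothetical helpers, and the config text is a shortened copy of the file shown above.

```python
import configparser

# Shortened copy of the fd.conf shown above.
FD_CONF = """\
[default]
schedule: Standard
os_type: unix
storage_node: sd-1
domain: bayphoto.local
couchdb_server: https://puppet.bayphoto.local
couchdb_db: bacula_meta
bacula_dir: /home/BAYPHOTO/mikec/projects/bacula_example
"""


def load_defaults(text, section="default"):
    """Parse fd.conf-style text and return one section as a plain dict."""
    # Interpolation is switched off so a literal % in a value is harmless.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return dict(parser[section])


def merge_overrides(defaults, overrides):
    """Let explicitly given options (anything not None) win over fd.conf."""
    merged = dict(defaults)
    merged.update({k: v for k, v in overrides.items() if v is not None})
    return merged


defaults = load_defaults(FD_CONF)
settings = merge_overrides(defaults, {"schedule": "WeeklyCycle", "domain": None})
print(settings["schedule"], settings["domain"], settings["bacula_dir"])
```

Keeping bacula_dir in a per-user file like this is what lets each administrator point the tool at their own clone.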
Now, I’ll add a new client:
mikec@b-bot ~/p/bacula_example.git> cd scripts
mikec@b-bot ~/p/b/scripts> ./bcreate-fd.py
usage: bcreate-fd.py [options]
bcreate-fd.py: error: argument -H/--hostname is required
mikec@b-bot ~/p/b/scripts> ./bcreate-fd.py --help
usage: bcreate-fd.py [options]
optional arguments:
  -h, --help            show this help message and exit
  -c CONFIGFILE, --config-file CONFIGFILE
                        Use a different config file other than ./fd.conf
  -H HOSTNAME, --hostname HOSTNAME
                        Short hostname of fd client
  -d {bayphoto.local,bayphoto.com}, --domain {bayphoto.local,bayphoto.com}
                        Domain (ie: example.com) that the fd client is in
  -s {WeeklyCycle,WeeklyCycleAfterBackup,Standard,workstation_afterhours,workstation_duringmeeting}, --schedule {WeeklyCycle,WeeklyCycleAfterBackup,Standard,workstation_afterhours,workstation_duringmeeting}
                        Set a backup schedule for the client
  -t {unix,win,osx}, --os-type {unix,win,osx}
                        FD Client OS type
  -n {sd-1}, --storage-node {sd-1}
                        Bacula storage node
  --client-conf-dir CLIENT_CONF_DIR
                        Override the default client configuration directory
  --bacula-dir BACULA_DIR
                        Override the default bacula configuration directory
Those are all of my default options, pulled in from fd.conf, and as you can see, the schedules were automatically populated.
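Here is a small sketch of how an argparse parser can take its defaults from the config file and its choices from the parsed schedule list. It covers only three of the options from the help output above, and build_parser is a hypothetical function written for this example rather than the code in bcreate-fd.py.

```python
import argparse


def build_parser(defaults, schedules, domains):
    """Build a parser whose defaults and choices come from outside data."""
    parser = argparse.ArgumentParser(
        prog="bcreate-fd.py", usage="%(prog)s [options]"
    )
    parser.add_argument(
        "-H", "--hostname", required=True, help="Short hostname of fd client"
    )
    parser.add_argument(
        "-d", "--domain", choices=domains, default=defaults["domain"],
        help="Domain (ie: example.com) that the fd client is in",
    )
    parser.add_argument(
        "-s", "--schedule", choices=schedules, default=defaults["schedule"],
        help="Set a backup schedule for the client",
    )
    return parser


# In the real workflow these values would come from fd.conf and schedules.conf.
defaults = {"domain": "bayphoto.local", "schedule": "Standard"}
parser = build_parser(
    defaults,
    schedules=["WeeklyCycle", "Standard"],
    domains=["bayphoto.local", "bayphoto.com"],
)

args = parser.parse_args(["-H", "testing"])
print(args.hostname, args.domain, args.schedule)
```

Passing the schedule names as choices means a typo in a schedule name is rejected before any config file is written.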
I’ll add a test client:
mikec@b-bot ~/p/b/scripts> ./bcreate-fd.py -H testing
hostname: testing
domain: bayphoto.local
schedule: Standard
fqdn: testing.bayphoto.local
os_type: unix
storage: bup-sd-1
client.d: /home/BAYPHOTO/mikec/projects/bacula_example/clients.d
couch_server: https://puppet.bayphoto.local
couch_db: bacula_meta
Client does not exist. A new record for testing will be created.
testing does not have a certificate in https://puppet.bayphoto.local/bacula_meta. A new certificate will be generated.
certificate pushed to https://puppet.bayphoto.localbacula_meta for testing
Let’s take a look at the generated testing.conf:
mikec@b-bot ~/p/b/scripts> cat ../clients.d/testing.conf
# -*- coding: utf-8 -*-
client {
  Name = testing.bayphoto.local-fd
  Address = testing.bayphoto.local
  FDPort = 9102
  Catalog = MyCatalog
  Password = HxTxlSQkSNS0Pt8Go9kqy6ZfkXIJaWBu
  File Retention = 40 days
  Job Retention = 1 months
  AutoPrune = yes
  Maximum Concurrent Jobs = 10
  Heartbeat Interval = 300
}
Console {
  Name = testing.bayphoto.local-acl
  Password = ItsASecret
  JobACL = "testing.bayphoto.local RestoreFiles", "testing.bayphoto.local"
  ScheduleACL = *all*
  ClientACL = testing.bayphoto.local-fd
  FileSetACL = "testing.bayphoto.local FileSet"
  CatalogACL = MyCatalog
  CommandACL = *all*
  StorageACL = *all*
  PoolACL = testing.bayphoto.local-File
}
Job {
  Name = "testing.bayphoto.local"
  Type = Backup
  Level = Incremental
  FileSet = "testing.bayphoto.local FileSet"
  Client = "testing.bayphoto.local-fd"
  Storage = SD1FileD401
  Pool = testing.bayphoto.local-File
  Schedule = "Standard"
  Messages = Standard
  Priority = 10
  Write Bootstrap = "/var/db/bacula/%c.bsr"
  Maximum Concurrent Jobs = 10
  Reschedule On Error = yes
  Reschedule Interval = 1 hour
  Reschedule Times = 1
  Max Wait Time = 30 minutes
  Cancel Lower Level Duplicates = yes
  Allow Duplicate Jobs = no
}
Pool {
  Name = testing.bayphoto.local-File
  Pool Type = Backup
  Recycle = yes
  AutoPrune = yes
  Volume Retention = 1 months
  Maximum Volume Jobs = 1
  Maximum Volume Bytes = 5G
  LabelFormat = "testing.bayphoto.local"
  Maximum Volume Jobs = 5
}
Job {
  Name = "testing.bayphoto.local RestoreFiles"
  Type = Restore
  Client= testing.bayphoto.local-fd
  FileSet="testing.bayphoto.local FileSet"
  Storage = SD1FileD401
  Pool = testing.bayphoto.local-File
  Messages = Standard
  #Where = /tmp/bacula-restores
}
FileSet {
  Name = "testing.bayphoto.local FileSet"
  Include {
    Options {
      signature = MD5
      compression = GZIP6
      fstype = ext2
      fstype = xfs
      fstype = jfs
      fstype = ufs
      fstype = zfs
      onefs = no
      Exclude = yes
      @/etc/bacula/excludes.d/common.conf
    }
    File = /
    File = /usr/local
    Exclude Dir Containing = .excludeme
  }
  Exclude {
    @/etc/bacula/excludes.d/unix.conf
  }
}
Here is a really cool part that my co-worker helped me put in place: using git’s post-receive hook to automatically update the Bacula director node.
In the remote git repository, I have a post-receive hook (hooks/post-receive) that automatically updates a git repo on our Bacula server:
#!/bin/sh
git push service_account@bacula-dir:/data/staging/bacula.git
My push here will update that remote repository, the one we consider our “master” (it’s also tied into JIRA and Stash, great for tracking issues and progress).
We have another hook in THAT repo (bacula-dir:/data/staging/bacula.git) that checks the new commit out into the real bacula/ directory and then runs bacula-dir -t to validate the config:
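#!/bin/sh
# Check the pushed revision out into the live Bacula config directory.
export GIT_WORK_TREE=/usr/local/etc/bacula
git checkout -f
# Test the config; any non-zero exit status counts as a failure.
if ! sudo /usr/local/sbin/bacula-dir -t; then
git show | mailx -s "Bacula Director test failed" bacula-admins@bayphoto.com
fi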
If a commit fails the test, we get an email about where it failed and in what commit.
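For anyone who would rather write that check in Python, the same idea can be sketched with subprocess. The config_is_valid function is hypothetical, and a tiny child Python process stands in for the real test command so the example runs without Bacula installed; in the hook the command would be the sudo /usr/local/sbin/bacula-dir -t call shown above.

```python
import subprocess
import sys


def config_is_valid(command):
    """Run a config test command; any non-zero exit status is a failure."""
    result = subprocess.run(
        command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT
    )
    return result.returncode == 0, result.stdout.decode(errors="replace")


# Stand-in for the real test command: a child process that prints an
# error and exits non-zero, the way a failing config test would.
fake_failing_test = [
    sys.executable, "-c", "import sys; print('parse error'); sys.exit(2)"
]

ok, output = config_is_valid(fake_failing_test)
if not ok:
    # A real hook would mail this output to the admins instead of printing it.
    print("Bacula Director test failed:")
    print(output)
```

Capturing the command's output means the notification can carry the actual error text alongside the commit details.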
So go ahead and take a look, fork it, and have fun. I’m very new to Python, and open to pull requests and feedback.