This is part four of my Bacula tutorial. The first part covered some basics as well as installing Bacula and starting the three daemons that it consists of. Part two dealt with allowing all components of Bacula to interact with each other locally, debugging config problems and doing a first test backup. In part three the configuration was cleaned up and split into smaller parts, the first self-created resources (fileset, device and storage) were added and a backup job customized using the bconsole.
Part four will discuss jobs, and show how to do a restore. We will change the default settings for the backup job so that it’s no longer necessary to modify it using the bconsole. Volumes, labels and pools will also be discussed.
The fourth part continues where the third one left off. During this tutorial series we use VirtualBox VMs managed by Vagrant to make things as simple as possible. If you don’t know how to use them, have a look here and here for a detailed introduction into using Vagrant and on how to prepare a VM template with FreeBSD that is used in the Bacula tutorial.
Jobs
We’ve already seen and used backup jobs. There are also jobs for different actions like restore (and some others). But what is a job? It basically is your way of telling Bacula what to do and how to do it. The job type or action defines what Bacula should do in the first place. Back up data? Restore it? Verify (compare) data? The client determines who to operate on; it answers the question: Which host to back up data from or restore data to?
Then there’s the fileset, which in the case of a backup determines which files should be included, and the pool, which determines where to store the data. And finally there’s the schedule that determines when a job is run. So far we have only started jobs manually using the bconsole, but that’s certainly not what you have in mind for your backup solution! (Or maybe for simple backups it is. But in that case Bacula is most likely overkill and you might want to look for a simpler backup utility.)
When we did our second backup, we changed a lot of settings using the bconsole. Now let’s modify the configuration instead so that those will be the new defaults for the backup job. Of course we’ll first have Vagrant spin up the VM, SSH into it and so on (you know the score by now):
% cd ~/vagrant/backuphost
% vagrant snapshot restore tut_3_end
% vagrant ssh
% sudo su -
# cd /usr/local/etc/bacula/
Then we can edit the job defaults (if you haven’t read the previous part(s) and wonder why you don’t have that file or even the directory – that’s because we’ve split the configuration for better readability!):
# vi includes/dir_job.conf
The topmost resource should be the JobDefs one that has the name DefaultJob. This is a kind of template for all jobs: each job only overrides the directives that differ from the defaults and uses the rest as set here. Change FileSet to etc, Storage to File3 and Pool to Default. Save the file and exit the editor.
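After the edit, the resource might look roughly like the following sketch. Only the FileSet, Storage and Pool values come from this tutorial; Type, Level and Client are assumptions derived from the job summary that the bconsole prints, and your file will most likely contain further directives that simply stay untouched.

```
JobDefs {
  Name = "DefaultJob"
  Type = Backup                  # assumption: taken from the job summary
  Level = Incremental            # assumption: taken from the job summary
  Client = backuphost.local-fd   # assumption: taken from the job summary
  FileSet = "etc"                # changed in this step
  Storage = File3                # changed in this step
  Pool = Default                 # changed in this step
  # any other directives in your file (schedule, messages, ...) stay as they are
}
```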
Now restart the director and prepare to run the backup job again using the bconsole:
# service bacula-dir restart
# bconsole
* run
Automatically selected Catalog: MyCatalog
Using Catalog "MyCatalog"
A job name must be specified.
The defined Job resources are:
1: Backuphost.local
2: BackupCatalog
3: RestoreFiles
Select Job resource (1-3):
Choose 1:
Run Backup job
JobName: Backuphost.local
Level: Incremental
Client: backuphost.local-fd
FileSet: etc
Pool: Default (From Job resource)
Storage: File3 (From Job resource)
When: 2016-09-23 23:48:01
Priority: 10
OK to run? (yes/mod/no):
That’s looking like it should. No more need to use mod multiple times! Type no now as we don’t actually need to do another backup at this time.
Restoring files
Instead we’ll be doing a restore next. Issuing the command to initiate a restore job leads to a long list of choices:
* restore
[…]
To select the JobIds, you have the following choices:
1: List last 20 Jobs run
2: List Jobs where a given File is saved
3: Enter list of comma separated JobIds to select
4: Enter SQL list command
5: Select the most recent backup for a client
6: Select backup for a client before a specified time
7: Enter a list of files to restore
8: Enter a list of files to restore before a specified time
9: Find the JobIds of the most recent backup for a client
10: Find the JobIds for a backup for a client before a specified time
11: Enter a list of directories to restore for found JobIds
12: Select full restore to a specified Job date
13: Cancel
Select item: (1-13):
Pick option 5 (the one that I use most often, BTW):
Defined Clients:
1: backuphost.local-fd
2: fbsd-template.local-fd
Select the Client (1-2):
Huh? Where’s that fbsd-template.local client coming from? Haven’t we removed it from the configuration completely? Yes, we have. However we did our very first backup when this was still the hostname of the virtual machine and the catalog remembers that it holds a backup for that client! Ignore that for now and select 1:
[…]
424 files inserted into the tree.
You are now entering file selection mode where you add (mark) and
remove (unmark) files to be restored. No files are initially added, unless
you used the "all" keyword on the command line.
Enter "done" to leave this mode.
cwd is: /
Notice that the prompt symbol changed ($)? You’re in a virtual shell now from which you can navigate through a filesystem rebuilt from the files contained in the selected backup. It is fairly limited, however. The most obvious limitation is that it does not provide auto-completion. You’ll have to live with that. And of course it only provides a basic set of commands that allow you to change paths, list files, etc.
Bacula said that we’re in /. Let’s see what we have there:
$ ls
etc/
usr/
Ok, so obviously our filesystem consists of a subset of both /etc and /usr (subset because we’ve excluded /etc/caspar from /etc and only included /usr/local/etc and not the whole /usr, remember?).
Let’s see what is in /usr/local/etc, shall we:
$ ls /usr/local/etc
Nothing? Ouch. Does that mean that the backup is broken for whatever reason? No, in fact everything is fine. The problem here is that even this rather simple command is too advanced for Bacula! You want to see the contents of some directory? Go there and have a look again:
$ cd /usr/local/etc
cwd is: /usr/local/etc/
$ ls
X11/
bacula/
bash_completion.d/
drirc
man.d/
pam.d/
periodic/
pkg.conf
pkg.conf.sample
rc.d/
sudoers
sudoers.d/
sudoers.sample
xdg/
There you go. It’s all there. Let’s mark the sudoers file so that it’ll be added to the restore job (you can also use the mark command, but add is shorter!):
$ add sudoers
1 file marked.
Ok, that worked. Just bear in mind that you always have to enter the directory first before you can mark (or view) any files. Even if you know where in the filesystem something is, Bacula can’t cope with anything more complicated than the very basic way of doing things.
Now let’s change to /etc:
$ cd /etc
cwd is: /etc/
I won’t show an ls here since that’d be too much output. But do it yourself and see if /etc/casper was really left out from the backup. Alright. Now let’s assume we want to restore csh.cshrc, csh.login and csh.logout as well. Thankfully Bacula’s virtual shell does support globbing (wildcard expansion):
$ add csh*
3 files marked.
After selecting a bunch of files, let’s tell Bacula that we’ve finished adding files:
$ done
[…]
4 files selected to be restored.
Run Restore job
JobName: RestoreFiles
Bootstrap: /var/db/bacula/backuphost.local-dir.restore.1.bsr
Where: /tmp/bacula-restores
Replace: Always
FileSet: Full Set
Backup Client: backuphost.local-fd
Restore Client: backuphost.local-fd
Storage: File3
When: 2016-09-25 08:02:20
Catalog: MyCatalog
Priority: 10
Plugin Options:
OK to run? (yes/mod/no):
Bacula has prepared a restore job and shows us a summary so we can either run, modify or cancel it. One thing to take note of is the Where: line. All files that are restored will have their path prefixed with /tmp/bacula-restores. You could choose another directory or set it to just / if you want Bacula to overwrite the current files in-place. For now accept the current settings by entering yes:
Job queued. JobId=3
Wait a moment and hit Enter to see if Bacula has any news for you. It should:
You have messages.
Let’s look at those:
* mes
You know the job report by now. Look for the following line that shows that everything went right:
Termination: Restore OK
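To make the effect of the Where: setting a bit more tangible, here is a minimal sketch in plain sh. The function name restored_path is made up for this illustration and is not part of Bacula; it merely mimics the prefixing described above, including the special case of / for an in-place restore.

```shell
#!/bin/sh
# restored_path WHERE ORIGINAL
# Print the path at which a restored file should end up, assuming the
# "Where" prefix is simply put in front of the original absolute path.
# (Hypothetical helper for illustration, not a Bacula command.)
restored_path() {
    where=${1%/}    # strip one trailing slash, so "/" becomes ""
    printf '%s%s\n' "$where" "$2"
}

# Default prefix from the job summary:
restored_path /tmp/bacula-restores /usr/local/etc/sudoers
# Where set to "/", i.e. restore over the original file:
restored_path / /etc/csh.cshrc
```

The first call prints the location below /tmp/bacula-restores, the second one prints the original path unchanged.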
The restore job completed successfully. There are some more useful commands that you can use when you select the files for the restore. I just want to mention two of them: unmark and lsmark. What the former does should be pretty obvious: it deselects files that were previously marked for restore. This allows you to e.g. add * and then unmark a few files, which is much less painful when there are more files to restore than files to leave out! The other one shows marked files in and below the current directory. That means if you want to see the full list of marked files, change to / before you use lsmark!
File examination
Let’s quit the bconsole now and take a look at the files that we just recovered from the backup:
* exit
# ls -1 /tmp/bacula-restores/etc/
csh.cshrc
csh.login
csh.logout
Looks like something was restored. Since the original files have not actually been modified since we’ve backed them up, comparing the original and the restored ones should assure us of the files being intact:
# diff -q /usr/local/etc/sudoers /tmp/bacula-restores/usr/local/etc/sudoers
No output means that the files match exactly. Good! But where did those files get restored from? Remember what we did when we configured our backup device. Let’s take a look at the directory that we specified there:
# ls -lh /var/backup/
total 1952
-rw-r----- 1 bacula bacula 1.9M Sep 24 22:08 file3a
This is the volume that we specified in the configuration and that was actually created when we had Bacula label it.
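If you would like to repeat the diff check for all four restored files instead of just sudoers, a small helper can do the comparison in one go. This is only a sketch: the function name compare_restored is made up for this post, and it assumes that the restore prefix is put in front of each original absolute path.

```shell
#!/bin/sh
# compare_restored WHERE FILE...
# Compare every original FILE with its restored copy below WHERE.
# Returns 0 if all files match and 1 if at least one differs or is missing.
# (Hypothetical helper for illustration, not a Bacula command.)
compare_restored() {
    where=${1%/}    # strip one trailing slash from the prefix
    shift
    status=0
    for f in "$@"; do
        # cmp -s is silent and only reports the result via its exit status
        if cmp -s "$f" "$where$f"; then
            echo "same:    $f"
        else
            echo "differs: $f"
            status=1
        fi
    done
    return $status
}
```

For our restore you could call it as compare_restored /tmp/bacula-restores /usr/local/etc/sudoers /etc/csh.cshrc /etc/csh.login /etc/csh.logout and should ideally see four "same:" lines.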
For learning purposes our very simple setup (just one volume) worked great. But before we move on, it’s time to take care of creating a storage system that’s a little bit more advanced: We need a pool! But how do those work?
Volumes, pools and labels
Speaking of labels… In the previous part we had to create one before the job that we queued could actually start. To be able to come up with a sensible backup solution for your use case you will have to understand how Bacula stores backup data. It uses so-called volumes. Think of a volume as some kind of storage medium. This could either be tape or disk-backed storage (i.e. a file). Backup data can be written to a volume until the maximum capacity is reached. Additional data will have to be written to another volume.
We’re not really talking about using tapes here (which comes with its own set of problems from what I’ve read in Bacula’s manual). Still it makes sense to remember that tapes are the reason for some design choices of Bacula. Volumes are such a case. While supporting multiple files may not seem like a huge benefit (for one host that is), it’s easy to see that supporting more than one tape does. Once it’s full, write to the next. But to be able to distinguish them, Bacula needs some means of telling them apart. This is where the label comes in. A label basically means that some medium is marked as a volume that Bacula may use, combined with a unique name so that multiple volumes won’t get confused. So each volume needs a label before Bacula will use it to put data on.
If backup jobs were tied to a volume this could work for some cases but would probably lead to problems sooner or later. Imagine a volume that is only half full, yet the next backup won’t fit on it. That backup would have to be written to the next volume, wasting the free space on the former. Issues and inflexibilities like that are solved by introducing pools. A pool is basically a list of volumes (plus some options). If your job targets a pool, it no longer matters which volume to put it on, because Bacula can take care of that for you in a dynamic way. Pools also allow enforcing some restrictions (like maximum size, maximum time to use) on volumes depending on what your needs are and what you are trying to do.
Since this post is already long enough, it’s time to end this part. As always, let’s save our progress by shutting down the VM and taking the next snapshot (when the status has reached poweroff):
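To give you an idea of what such restrictions look like in the director’s configuration, here is a rough sketch of a pool resource. The pool name and both limits are made-up values for illustration only and are not part of our setup; the two directives shown correspond to the "maximum size" and "maximum time to use" restrictions mentioned above.

```
Pool {
  Name = ExamplePool               # hypothetical name, not used in this tutorial
  Pool Type = Backup
  Maximum Volume Bytes = 1G        # example value: stop writing to a volume at this size
  Volume Use Duration = 7 days     # example value: stop appending after a week of use
}
```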
# shutdown -p now
% vagrant status
% vagrant snapshot save tut_4_end
Intermission
After this part of the series we finally know how to restore files from a backup. We also have a better understanding of what jobs, volumes, labels and pools are.
In the next post we’ll create and test a new pool, do some configuration cleanup and reset the catalog. This should conclude the single node part of the tutorial.
Have any comments for me? Or did you perhaps find a mistake? Just leave me a comment.