Bacula on FreeBSD (pt. 5): A day at the pool

Edit: I put up the original post only for a short time before I decided that I wasn’t really happy with it at all. So I took it down again to rework it a bit. By now I’ve more or less rewritten it completely. I hope that this version will be more useful.

This is part five of my Bacula tutorial. The first part covered some basics as well as installing Bacula and starting all three daemons. Part two dealt with allowing all components of Bacula to interact with each other locally, debugging config problems and doing a first test backup.

In part three the configuration was cleaned up and split into smaller parts, the first self-created resources (fileset, device and storage) were added and a backup job customized using the bconsole. Part four detailed restoring files from backup and discussed jobs as well as volumes, labels and pools.

Part five will show how to create a new storage pool. We’ll use memory disks so that we can simulate partitions. Then we’ll talk about jobs again and have a closer look at volumes in our pool and some of the states they can have. Finally we’ll briefly touch upon the topic of recycling volumes. You’ll need to know all that (and actually some more) prior to planning your real pool(s).

The fifth part continues where the fourth one left off. During this tutorial series we use VirtualBox VMs managed by Vagrant to make things as simple as possible. If you don’t know how to use them, have a look here and here for a detailed introduction to using Vagrant and on how to prepare the FreeBSD VM template that is used in the Bacula tutorial.

Preparations for a storage pool

Load the latest snapshot, enter the VM and switch to root:

% cd ~/vagrant/backuphost
% vagrant snapshot restore tut_4_end
% vagrant ssh
% sudo su -

Our VM has a simple partition scheme where essentially everything is put in one partition. A volume that provides storage backed by a file will keep growing until the disk is full. To enforce restrictions we have to create a pool. Also our VM certainly does not have a tape drive or anything like that. Still I’d like to simulate filling a medium! How can we do that? By creating a virtual disk backed by a file, for which we can set a fixed size.

This time we’ll back up a directory with a bit more data in it. I suggest /usr/bin. Let’s see how big it is (mind this size! I’m using the 11.0 release here; if you’re using a newer version, the size may be different):

# du -sh /usr/bin

143M /usr/bin

Ok, next we’re going to create a sparse file of 300 MB in size (so that a bit more than two full backups would fit in there) as well as a directory for the mountpoint:

# truncate -s 300M /var/backup/ufs0
# mkdir /mnt/storage0
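If you have not worked with sparse files before, the following minimal sketch shows the idea on a throwaway file: the file reports its full size right away, while the space actually allocated on disk usually stays tiny until data is written. The temporary directory and the variable names are just assumptions for this illustration; how much space is really allocated depends on the filesystem.

```shell
#!/bin/sh
# Sketch: create a throwaway 300 MB sparse file and compare its
# apparent size with the space actually allocated for it.
tmpdir=$(mktemp -d)
sparse="$tmpdir/ufs0"

truncate -s 300M "$sparse"

# Apparent size in bytes (what the file claims to hold).
apparent_bytes=$(wc -c < "$sparse" | tr -d ' ')
# Allocated space in KB (what it really occupies; filesystem dependent).
allocated_kb=$(du -k "$sparse" | cut -f1)

echo "apparent size: $apparent_bytes bytes"
echo "allocated:     $allocated_kb KB"

rm -rf "$tmpdir"
```

The apparent size is 300 * 1024 * 1024 bytes, no matter how little of it has been written yet.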

Now we need to make the storage accessible as a device rather than just a file. FreeBSD calls these devices memory disks because most of the time you’ll want such a device to be backed by RAM (to benefit from its speed), but file-backed storage is also possible. As a first step we need to create the device and, as a second one, put a filesystem on it. Then in step three we can mount it on a directory to make it available in the machine’s global filesystem hierarchy. Fortunately FreeBSD is pretty good at handling memory disks and comes with a command that can do all three steps at once:

# mdmfs -F /var/backup/ufs0 -S -n -p 700 -w bacula:bacula md0 /mnt/storage0

Let’s check if that worked:

# mount

See the mount? Looks like we have our file-backed storage in place.
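For reference, here is a rough sketch of the three separate steps that mdmfs combined for us above (create the device, put a filesystem on it, mount it), plus the ownership and permission changes. It is written as a dry run that only prints the commands; the run and setup_storage helpers and the DRY_RUN variable are made up for this illustration, and the exact calls that mdmfs issues internally may differ in detail.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of executing them.
# Setting DRY_RUN=0 (as root, on FreeBSD) would really run them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

setup_storage() {
    backing_file=$1
    unit=$2
    mountpoint=$3

    # Step 1: attach the file as a vnode-backed memory disk (/dev/mdN).
    run mdconfig -a -t vnode -f "$backing_file" -u "$unit"
    # Step 2: create a UFS filesystem without a .snap directory.
    run newfs -n "/dev/md$unit"
    # Step 3: mount it on the prepared directory.
    run mount "/dev/md$unit" "$mountpoint"
    # Ownership and mode, matching -w bacula:bacula and -p 700.
    run chown bacula:bacula "$mountpoint"
    run chmod 700 "$mountpoint"
}

setup_storage /var/backup/ufs0 0 /mnt/storage0
```

Doing it by hand like this is mostly useful for understanding what happens; for everyday use the single mdmfs call is more convenient.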

Configuration changes

Now we can create a new fileset to back up /usr/bin:

# cd /usr/local/etc/bacula
# vi includes/dir_fileset.conf

Add the following lines at the end of the file:

FileSet {
Name = "usr-bin"
Include {
Options {
signature = MD5
}
File = /usr/bin
}
}

Save the file and exit. Of course we need to tell the sd where to store the backups – we need to create another device:

# vi includes/sd_device.conf

Add this device resource at the end of the file:

Device {
Name = Stor0-Test
Media Type = Test
Archive Device = /mnt/storage0
LabelMedia = yes;
Random Access = Yes;
AutomaticMount = yes;
RemovableMedia = no;
AlwaysOpen = no;
Maximum Concurrent Jobs = 5
}

And the director needs to know about this storage as well:

# vi includes/dir_storage.conf

Add this block at the end of the file:

Storage {
Name = Test
Address = localhost
SDPort = 9103
Password = "sdPASSWORD"
Device = Stor0-Test
Media Type = Test
Maximum Concurrent Jobs = 10
}

We’ve done all this before but the next step is new. We’re going to create a pool:

# vi includes/dir_pool.conf

Again add the following lines to the file:

Pool {
Name = Testpool
Pool Type = Backup
Recycle = no
Maximum Volume Bytes = 90M
}

Just set recycle to “no” for now – you’ll see what that does in a minute. We also want to force a maximum volume size that is smaller than the amount of data in one backup, so that two volumes will be needed for one full backup. And to avoid having to use “mod” at the bconsole all the time, we’ll add a new job for convenience:

# vi includes/dir_job.conf

Here’s the job resource to put at the end of the file:

Job {
Name = "TestJob"
JobDefs = "DefaultJob"
Level = Full
FileSet = usr-bin
Storage = Test
Pool = Testpool
}
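Before running the job, it can help to estimate how many volumes one full backup will need with the pool’s Maximum Volume Bytes setting. The following is a minimal sketch using the numbers from this tutorial (about 143 MB in /usr/bin, 90 MB per volume); the volumes_needed helper is made up for this illustration and ignores any overhead that Bacula adds.

```shell
#!/bin/sh
# Sketch: number of volumes needed for a given amount of data,
# rounding up because a partly filled volume still counts.
volumes_needed() {
    data_mb=$1
    volume_mb=$2
    echo $(( (data_mb + volume_mb - 1) / volume_mb ))
}

volumes_needed 143 90   # prints 2
```

So one full backup of /usr/bin should fill one volume completely and spill over into a second one.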

Jobs

Ok. Our basic preparations are done. Let’s restart the daemons and then try and see what happens if we run our new backup job:

# service bacula-dir restart
# service bacula-sd restart
# bconsole
* run
4
yes
* mes

[…]
29-Oct 12:48 backuphost.local-sd JobId 4: Job Testjob.2016-10-31_12.48.33_03 is waiting. Cannot find any appendable volumes.
Please use the “label” command to create a new Volume for:
[…]

The job cannot start because there are no volumes that Bacula could use. We could now create volumes using the label command as we’ve done before. But we configured our device with the LabelMedia directive set to “yes”! It should create volumes automatically as needed! Time to look at the configuration again. But let’s first take the chance to cancel the currently pending job.

We already know the JobId from the message above. But let’s pretend we didn’t know. We’ll ask the director for an overview of jobs (both past and present):

* list jobs

There we have a simple table. Let’s see what information it holds. First we have a unique JobId for each job. The second column holds the name of the backup job; this will usually be the name of the client, or RestoreFiles for a restore job. But as you can see in our case, “TestJob” will work as well (however, in a production environment you really should stick to names that hint at which client they belong to, as things would get confusing really fast otherwise). StartTime is self-explanatory. Type is “B” for backup and “R” for restore in our example. Level is the backup level; we’ve only run full backups so far. JobFiles and JobBytes are self-explanatory again. And in the case of JobStatus, “T” means terminated, “R” running, “A” aborted, etc. Now we’ll cancel the job that is waiting for a new volume:

* cancel 4
yes
* mes

[…]
Backup Canceled
[…]
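The single-letter codes in the job table are easy to forget, so here is a small lookup sketch covering only the codes that come up in this part of the tutorial (including the “f” that we will run into further down). The function names are made up for this illustration, and Bacula knows more codes than these.

```shell
#!/bin/sh
# Sketch: decode the Type and JobStatus letters mentioned in this post.
describe_job_type() {
    case "$1" in
        B) echo "backup" ;;
        R) echo "restore" ;;
        *) echo "not covered in this post" ;;
    esac
}

describe_job_status() {
    case "$1" in
        T) echo "terminated" ;;
        R) echo "running" ;;
        A) echo "aborted" ;;
        f) echo "failed" ;;
        *) echo "not covered in this post" ;;
    esac
}

describe_job_type B
describe_job_status A
```

Note that “R” means different things in the two columns, which is why the sketch uses two separate functions.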

So we’ve canceled the job. Let’s exit the console now and take a look at the pool configuration again:

* exit
# vi includes/dir_pool.conf

Our pool resource is missing a directive that specifies how the label names are to be composed. We should add that one line real quick:

Label Format = "Test-"

Now we need to restart the dir and invoke the bconsole again.

# service bacula-dir restart
# bconsole

Turn up the volume(s)!

Then we can take a look at the volumes that we have so far:

* list volumes

[…]
Pool: Testpool
No results to list.
[…]

The pool Testpool is empty right now. Take a look at the pools used for our previous jobs and get an idea of what they look like. Now let’s run the test job again and see what happens:

* run
4
yes
* mes

[…]
Labeled new Volume “Test-0003” on file device “Stor0-Test” (/mnt/storage0).
[…]
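Judging from the output above, a plain Label Format like “Test-” results in volume names made of that prefix followed by a zero-padded, four-digit number. The following sketch only mimics that naming pattern; the volume_name helper is made up for this illustration, and how Bacula picks the number is not shown here.

```shell
#!/bin/sh
# Sketch: build a volume name from a label prefix and a number,
# padded to four digits as seen in the job messages.
volume_name() {
    prefix=$1
    number=$2
    printf '%s%04d\n' "$prefix" "$number"
}

volume_name "Test-" 3   # prints Test-0003
```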

So auto labeling obviously works. This is one huge benefit of using a pool! But there are others like limiting the volume sizes and many more. Did the backup complete successfully by now?

* mes

[…]
Backup OK
[…]

It did. Time to look at the volumes again; look for the column VolStatus:

* list volumes

[…]
Full
Append
[…]

The first volume has a VolStatus of Full and the second one is Append which means that more backup data can be written to it. We’ll do that by simply running our test job again:

* run
4
yes
* mes

[…]
WARNING: device is full! Please add more disk space then …

Please mount append Volume “Test-0005” or label a new one for:
[…]

We’ve created a really small memory disk for /mnt/storage0 and it is full before the second full backup could be completed. But how’s that? There’s 300 MB of space and the fileset backs up less than 150 MB! Is there so much overhead? No, not really. What has happened here is that we restricted the pool to volumes of 90 MB each. The three volumes occupy 270 MB, and while there are 30 MB left on the disk, that’s too little space to create another volume on it! So what do we do now? First have a look at the volume list again:

* list volumes

[…]
Full
Full
Full
[…]
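The numbers behind the full disk can be checked with a bit of shell arithmetic. This is only a sketch of the simplified accounting used in the explanation above: whole 90 MB volumes on a 300 MB disk, with no filesystem or Bacula overhead taken into account. The variable names are made up for this illustration.

```shell
#!/bin/sh
# Sketch: simplified space accounting for the test pool.
DISK_MB=300     # size of the file-backed memory disk
VOLUME_MB=90    # Maximum Volume Bytes of Testpool
BACKUP_MB=143   # approximate size of /usr/bin

max_volumes=$(( DISK_MB / VOLUME_MB ))     # whole volumes that fit
used_mb=$(( max_volumes * VOLUME_MB ))     # space taken by those volumes
leftover_mb=$(( DISK_MB - used_mb ))       # too small for another volume
needed_mb=$(( 2 * BACKUP_MB ))             # two full backups

echo "volumes that fit: $max_volumes ($used_mb MB), leftover: $leftover_mb MB"
echo "two full backups need: $needed_mb MB"

if [ "$needed_mb" -gt "$used_mb" ]; then
    echo "the second full backup cannot complete"
fi
```

Two full backups need 286 MB, which would fit into 300 MB of raw space but not into the 270 MB that three whole volumes provide.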

Coming full cycle

All of them are listed as Full. But there’s some more interesting info there. Notice the Recycle column? We’ve forbidden Bacula to recycle old volumes when we defined the pool resource. We can fix that, right? Let’s exit the bconsole and edit the configuration file:

* exit
# vi includes/dir_pool.conf

Change the respective line to:

Recycle = yes

Then save the file and exit the editor. The configuration changed and so the dir needs to reinitialize. To do so we restart it.

# service bacula-dir restart

Seems like it’s not responding? Hit CTRL+C to cancel. Let’s stop and start it instead:

# service bacula-dir stop
# service bacula-dir start

That worked. Now we enter the bconsole again and take a look at the volumes:

# bconsole
* list volumes

Huh? That configuration change didn’t work! The recycling flag is still set to 0. Why? There is an easy answer to that: Because this value is not read from the configuration! It comes from the catalog. The configuration setting is applied at the moment a new volume is created. Once the volume exists, the configuration setting is irrelevant. Of course we are not out of luck here. We can modify the flag using the bconsole (and yes, that asterisk right before the MediaId (the three in this case) is NOT a prompt symbol; type it in!):

* update
1
7
4
*3
* yes
18

Let’s see if that worked:

* list volumes

It did! So will Bacula now reuse the old volume and overwrite all data on it? No it won’t. Bacula knows that there’s data on it because it keeps track of all that in the catalog. And it tries to preserve that data even though the volume allows recycling.

Purging a volume

However we can tell Bacula to get rid of the catalog data that references this volume. To do so, we purge job and volume information for that volume from the catalog:

purge [purge jobs volume]
3
4
*3
* list volumes

[…]
Purged
Full
Full
[…]

See how the VolStatus changed? It’s purged and recycling is enabled. That means that Bacula will reuse the volume. But will our job resume automatically? Let’s take a look at it:

* list jobs

Oh no! What’s that JobStatus? It’s “f” for failed! What happened? Well, we stopped the director, remember? That killed the running job! So to try out recycling we need to run another backup job. Let’s start the job now:

* run
4
yes

Take a look at the volumes again:

* list volumes

[…]
Recycle
Full
Full
[…]

The VolStatus changed to Recycle and Bacula will reuse the old volume. In theory we’d have to purge one more volume for our backup to succeed. But since this is just an example job to show off some important things, we’re done for today at this point.
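To sum up what we have seen about reusing volumes, here is a small sketch of the rule as it showed itself in this part: a volume was only reused once recycling was enabled for it and its catalog data had been purged. The can_reuse helper is made up for this illustration; it is a deliberately simplified model and leaves out everything that has not been covered yet, such as retention periods.

```shell
#!/bin/sh
# Sketch: simplified reuse rule based on VolStatus and the Recycle flag
# as shown by "list volumes" (1 = recycling allowed, 0 = forbidden).
can_reuse() {
    vol_status=$1
    recycle_flag=$2
    if [ "$recycle_flag" = "1" ] &&
       { [ "$vol_status" = "Purged" ] || [ "$vol_status" = "Recycle" ]; }; then
        echo "yes"
    else
        echo "no"
    fi
}

can_reuse Full 1     # prints no: the catalog still references its data
can_reuse Purged 1   # prints yes
can_reuse Purged 0   # prints no: recycling is forbidden for this volume
```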

Save the current status for later:

* exit
# shutdown -p now
% vagrant status
% vagrant snapshot save tut_5_end

Intermission

You now know the basics of pool creation and some of the features that come with it. You’ve also purged and recycled a volume and should have a better understanding of how Bacula works in general. There’s a lot more to pools however and the next post in this series should probably go into retention periods and discuss a topic that we’ve only touched upon so far: The Catalog.

However this post concludes my „Bacula October“ and I’ll end this tutorial series here. It takes a lot of effort and time to write these posts and while I hope that this is of use to somebody, I have no idea whether it is or not. For that reason I might or might not take this topic up again in the future. I had planned to simulate multiple backup clients with Vagrant, do encrypted backups and so on. But now I’m looking forward to writing about something else again! Of course feel free to comment on any of the parts if you liked (or want to tell me why you didn’t like) this tutorial.
