Writing a daemon using FreeBSD and Python pt.3

Part 1 of this series covered Python fundamentals, signal handling and logging. We wrote an init script as well as a program that can be daemonized by daemon(8).

In the previous part we modified the program as well as the init script so that it can daemonize itself using the Python daemon module. I also covered a few topics that people totally new to programming (or Python) might want to know to better understand what’s happening.

Part 3 is about exploring a simple means of IPC (inter-process communication) by using named pipes.

Creating a named pipe

What is a named pipe – also known as a fifo (first in, first out)? It is a way of connecting two processes together, where one can sequentially send data and the other receives it in exactly the same order. It’s basically what we Unix lovers use on our command lines all the time when we pipe the output of one program into the input of another. E.g.:

ls | wc -l

In this case the output of ls is piped to wc, which then prints the number of lines to stdout (which could in turn be used as input for another program with another pipe). This kind of pipe between two programs is usually short-lived. When the first program is done sending output and the second one has received all the data, the pipe goes away along with the two processes. It also only exists between the two processes involved.

A named pipe in contrast is something a bit more permanent and more flexible. It has a representation in the filesystem (which is why it’s a named pipe). One program creates a named pipe (usually in /var/run) and attaches to the receiving end of the pipe. Another process can then attach to the sending end and start putting data into it, which will then be received by the former. Named pipes have their own file type character (p) marking a file as a named pipe, looking like this when you ls -l:

prw-rw-r--

Here’s what the next version of the code looks like:

#!/usr/local/bin/python3.6
 
 # Imports #
import daemon, daemon.pidfile, logging, os, signal, time
 
 # Globals #
IN_PIPE = '/var/run/bd_in.pipe'
 
 # Functions #
def handler_sigterm(signum, frame):
    logging.debug("Caught SIGTERM! Cleaning up...")
    if os.path.exists(IN_PIPE):
        try:
            os.unlink(IN_PIPE)
        except:
            raise
    logging.info("All done, terminating now.")
    exit(0)

def start_logging():
    try:
        logging.basicConfig(filename='/var/log/bdaemon.log', format='%(levelname)s:%(message)s', level=logging.DEBUG)
    except:
        print("Error: Could not create log file! Exiting...")
        exit(1)

def assert_no_pipe_exists():
    if os.path.exists(IN_PIPE):
        logging.critical("Cannot start: Pipe file \"" + IN_PIPE + "\" already exists!")
        exit(1)

def make_pipe():
    try:
        os.mkfifo(IN_PIPE)
    except:
        logging.critical("Cannot start: Creating pipe file \"" + IN_PIPE + "\" failed!")
        exit(1)
    logging.debug("Created pipe \"" + IN_PIPE)

 # Main #
with daemon.DaemonContext(pidfile=daemon.pidfile.TimeoutPIDLockFile("/var/run/bdaemon.pid"), umask=0o002):
    signal.signal(signal.SIGTERM, handler_sigterm)
    start_logging()
    assert_no_pipe_exists()
    make_pipe()

    logging.info("Baby Daemon started up and ready!")
    while True:
        time.sleep(1)

We’re using a new import here: os. It gives the programmer access to various OS-dependent functions (like FIFOs, which don’t exist in this form on Windows, for example). I’ve also added a global definition for the location of the named pipe.

The next thing that you’ll notice is that the signal handler function got some new code. Before the daemon terminates it tries to clean up. If the named pipe exists the program will attempt to delete it. I’m not handling what could possibly go wrong here as this is just an example. That’s why in this case I just re-raise the exception and let the program error out.

Then we have a new “start_logging()” function that I put the logging stuff into to unclutter main. Except for that changed structure, there’s really nothing new here.

The next new function, “assert_no_pipe_exists()” should be fairly easy to read: It checks if a file by the name it wants to use is already present in the filesystem (be it as a leftover from an unclean exit or by chance from some other program). If it is found, the daemon aborts because it cannot really continue. If the filename is not taken, however, “make_pipe()” will attempt to create the named pipe.

The other thing that I did was moving the main part out of a function and back directly into the program. And since we’re doing small incremental steps, that’s it for today’s step 1. Fire up the daemon using the init script and you should see that the named pipe was created in /var/run. Stop the process and the pipe should be gone.

Using the named pipe

Creating and removing the named pipe is a good first step, but now let’s use it! To do so we must first modify the daemon to attach to the receiving end of the pipe:

#!/usr/local/bin/python3.6
 
 # Imports #
import daemon, daemon.pidfile, errno, logging, os, signal, time
 
 # Globals #
IN_PIPE = '/var/run/bd_in.pipe'
 
 # Functions #
def handler_sigterm(signum, frame):
    try:
        os.close(inpipe)
    except:
        pass

    logging.debug("Caught SIGTERM! Cleaning up...")
    if os.path.exists(IN_PIPE):
        try:
            os.unlink(IN_PIPE)
        except:
            raise
    logging.info("All done, terminating now.")
    exit(0)

def start_logging():
    try:
        logging.basicConfig(filename='/var/log/bdaemon.log', format='%(levelname)s:%(message)s', level=logging.DEBUG)
    except:
        print("Error: Could not create log file! Exiting...")
        exit(1)

def assert_no_pipe_exists():
    if os.path.exists(IN_PIPE):
        logging.critical("Cannot start: Pipe file \"" + IN_PIPE + "\" already exists!")
        exit(1)

def make_pipe():
    try:
        os.mkfifo(IN_PIPE)
    except:
        logging.critical("Cannot start: Creating pipe file \"" + IN_PIPE + "\" failed!")
        exit(1)
    logging.debug("Created pipe \"" + IN_PIPE)

def read_from_pipe():
    try:
        buffer = os.read(inpipe, 255)
    except OSError as err:
        if err.errno == errno.EAGAIN or err.errno == errno.EWOULDBLOCK:
            buffer = None
        else:
            raise
 
    if buffer is None or len(buffer) == 0:
        logging.debug("Inpipe not ready.")
    else:
        logging.debug("Got data from the pipe: " + buffer.decode())
    
 # Main #
with daemon.DaemonContext(pidfile=daemon.pidfile.TimeoutPIDLockFile("/var/run/bdaemon.pid"), umask=0o002):
    signal.signal(signal.SIGTERM, handler_sigterm)
    start_logging()
    assert_no_pipe_exists()
    make_pipe()
    inpipe = os.open(IN_PIPE, os.O_RDONLY | os.O_NONBLOCK)
    logging.info("Baby Daemon started up and ready!")

    while True:
        time.sleep(5)
        read_from_pipe()

Apart from one more import, errno, we have three important changes here: the cleanup has been extended, there is a new function called “read_from_pipe()”, and main has been modified as well. We’ll take a look at the latter first.

There are tons of examples of named pipes on the net, but they usually use just one program that forks off a child process and then communicates with it over the pipe. That’s pretty simple to do and works nicely if you just copy and paste the example code into a file. But adapting it for our little daemon does not work: The daemon just seems to “hang” after its first attempt to read something from the pipe. What’s happening there?

By default, reads from the pipe are in blocking mode, which means that on a read attempt the system simply waits for data if there is none! The solution is to use non-blocking mode, which however means using the raw os.open function (which supports passing flags to the operating system) instead of Python’s friendly built-in open with its convenient file object.
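
To see the difference, here’s a tiny sketch of what the blocking default would look like – just for illustration, it’s exactly the behavior we do not want in the daemon:

import os

IN_PIPE = '/var/run/bd_in.pipe'

# Without O_NONBLOCK, even opening a FIFO read-only blocks
# until some writer attaches to the other end...
fd = os.open(IN_PIPE, os.O_RDONLY)
# ...and the read blocks again until data actually arrives.
data = os.read(fd, 255)
os.close(fd)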

So what does the line starting with “inpipe” do? It calls the function os.open on IN_PIPE, where we defined the location of our pipe, and passes the flags that tell the operating system how to open the file: in read-only and non-blocking mode. We need to open it read-only, because the daemon sits at the receiving end of the pipe. And, yes, we want non-blocking, so that the program carries on if there is no data in the pipe instead of waiting for it all the time!

What might look a little strange to you is the | character between the two flags – especially since on the terminal it’s known as the pipe character, and we’re talking about pipes here, right? In this case it’s something completely unrelated, however: that symbol just happens to be Python’s choice for the bit-wise OR operator. Let’s leave it at that (I’ll explain a bit more of it in a future “Python pieces” section, but this article will be long enough without it).
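
If you’re curious anyway, here’s a tiny taste of what OR-ing numbers does (with made-up flag values, not the real os ones):

FLAG_A = 0b001  # 1
FLAG_B = 0b100  # 4

combined = FLAG_A | FLAG_B
print(bin(combined))  # prints '0b101' - the bits of both flags are set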

However, that’s still not all that this line does: the os.open() function returns a file descriptor, which we assign to the inpipe variable to keep it around.

What’s left is a new infinite loop that calls read_from_pipe() every 5 seconds.

Speaking of that function, let’s take a closer look at what it does. It tries to use the os.read function to read up to 255 bytes from the pipe into the variable named buffer. We’re doing so in a try/except block, because the read is somewhat likely to fail (e.g. if the pipe is empty). When there’s an exception, the code checks for the exact error that happened: if it’s EAGAIN or EWOULDBLOCK, we deliberately set the buffer to None. If some other error occurred, it’s something that we didn’t expect, so we’d better take the straight way out by re-raising the exception and crashing the program.

On FreeBSD the error numbers are defined in /usr/include/errno.h. If you take a look at it, you see that EAGAIN and EWOULDBLOCK are the same thing, so checking for one of them would be enough. But it makes sense to know that on some systems these are separate errors and that it’s good practice to check for both.
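
You can verify that with Python itself; on a FreeBSD system both of these should print the same number (35):

import errno

print(errno.EAGAIN)       # 35 on FreeBSD
print(errno.EWOULDBLOCK)  # also 35 there - it's the same underlying error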

If the buffer either has the None value or has a length of 0, we assume that the read failed. Otherwise we put the data into the log. To make it readable we have to use decode, because we will be receiving encoded data.

All that’s left is the cleanup function. I’ve added another try/except block that simply tries to close the pipe file before trying to delete it. This is example code, so to make things not even more complex, I just silently ignore if the attempt fails.

Control script

Ok, great! That was quite a bit of things to cover, but now we have a daemon that creates a pipe and tries to read data from it. There’s just one problem: How can we test it? By creating another, separate program, that puts data in the pipe of course! For that let’s create another file with the name bdaemonctl.py:

#!/usr/local/bin/python3.6

 # Imports #
import os, time

 # Globals #
OUT_PIPE = '/var/run/bd_in.pipe'

 # Main #
try:
    outpipe = os.open(OUT_PIPE, os.O_WRONLY)
except:
    raise

for i in range(0, 21):
    print(i)
    try:
        os.write(outpipe, str(i).encode('utf-8'))
    except BrokenPipeError:
        print("Pipe has disappeared, exiting!")
        os.close(outpipe)
        exit(1)
    time.sleep(3)
os.close(outpipe)

Fortunately this one is fairly simple. We do our imports and define a variable for the pipe. We could skip the latter, because we’re using it on only one occasion but in general it’s a good idea to keep it as it is. Why? Because hiding things deep in the code may not be such a smart move. Defining things like this at the top of the file increases the maintainability of your code a lot. And since we want to send data this time, of course we name our variable OUT_PIPE appropriately.

In the main section we just try to open the pipe file and crash if that doesn’t work. It’s pretty obvious that such a case (e.g. the pipe is not there because the daemon is not running) should be better handled. But I wanted to keep things simple here because it’s just an example after all.

Then we have a loop that counts from 0 to 20, outputs the current number to stdout and tries to also send the data down the pipe. If that works, the program waits three seconds and then continues the loop.

To be able to write to the pipe we need a byte stream, but we only have numbers. So we first convert each number to a string and then encode that string as UTF-8, which yields the bytes that can be sent over the pipe.
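
You can watch that conversion happen in the interactive interpreter:

i = 7
text = str(i)                # '7'  - a string now
data = text.encode('utf-8')  # b'7' - bytes, ready for os.write
print(data.decode('utf-8'))  # '7' again - what the daemon does on its end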

When the loop is over, we close the pipe file properly because we as the sender are done with it. I added a little bit of code to handle the case when the daemon exits while the control script runs and still tries to send data over the pipe. This results in a “broken pipe” error. If that happens, we just print an error message, close the file (to not leak the file descriptor) and exit with an error code of 1.

So for today we’re done! We can now send data from a control program to the daemon and thus have achieved uni-directional communication between two processes.

What’s next?

I’ll take a break from these programming-related posts and write about something else next.

However I plan to continue with a 4th part later which will cover argument parsing. With that we could e.g. modify our control program to send arbitrary data to the daemon from the command line – which would of course be much more useful than the simple test case that we have right now.

Writing a daemon using FreeBSD and Python pt.2

The previous part of this series left off with a running “baby daemon” example. It covered Python fundamentals, signal handling, logging as well as an init script to start the daemon.

Daemonization with Python

The outcome of part 1 was a program that needed external help to actually be daemonized. I used FreeBSD’s handy daemon(8) utility to put the program into the background, to handle the pidfile, etc. Now we’re taking one step forward and trying to achieve the same thing using just Python.

To do that, we need a module that is not part of Python’s standard library. So you might need to install the package py36-daemon first if you don’t already have it on your system. Here’s a small piece of code for you – but don’t get fooled by the line count, there’s actually a lot going on in there (and a lot of concepts to grasp):

#!/usr/local/bin/python3.6
 
 # Imports #
import daemon, daemon.pidfile
import logging
import signal
import time

 # Functions #
def handler_sigterm(signum, frame):
    logging.debug("Exiting on SIGTERM")
    exit(0)

def main_program():
    signal.signal(signal.SIGTERM, handler_sigterm)
    try:
        logging.basicConfig(filename='/var/log/bdaemon.log', format='%(levelname)s:%(message)s', level=logging.DEBUG)
    except:
        print("Error: Could not create log file! Exiting...")
        exit(1)
 
    logging.info("Started!")
    while True:
        time.sleep(1)

 # Main #
with daemon.DaemonContext(pidfile=daemon.pidfile.TimeoutPIDLockFile("/var/run/bdaemon.pid"), umask=0o002):
    main_program()

I dropped some ballast from the previous version; e.g. overriding SIGINT was a nice thing to try out once, but it’s not useful as we move on. Also that countdown is gone. Now the daemon continues running until it’s signaled to terminate (thanks to what is called an “infinite loop”).

We have two new imports here that we need for the daemonization. As you can see, it is possible to import multiple modules in one line. For readability reasons I wouldn’t recommend it in general. I only do it when I import multiple modules that kind of belong together anyway. However in the coming examples I might just put everything together to save some lines.

The first more interesting thing about this version is that the main program was moved to a function called “main_program”. We could have done that before if we really wanted to, but I did it now so that the code doesn’t draw attention away from the primary beast of this example. Take a look at the line that starts with the with keyword. Now that’s a mouthful, isn’t it? Let’s break it up into a couple of pieces so that it’s easier to chew, shall we?

The value for umask looks a bit strange. It contains an “o” among the numbers, so it has to be a string, doesn’t it? But why is it written without quotes then? Well, it is a number: Python uses the 0o prefix to denote octal (base-8) numbers, just like 0x denotes hexadecimal (base-16) ones.
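
A quick test in the interpreter makes this obvious:

print(0o002)     # 2   - an octal literal is just a number
print(0o755)     # 493 - the classic permission bits in decimal
print(0x1f)      # 31  - hexadecimal for comparison
print(oct(493))  # '0o755' - and converting back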

Remember that we talked about try/except before (for the logging)? You can expand on that. A try block can not only have except blocks, it can also have a finally block. Statements in such a block are meant to be executed no matter the outcome of the try block. The classical example is that when you open a file, you definitely want to close it again (everything else is a total mess and would make your program an exceptionally bad one).

Closing it when you are done is simple. But what if an exception is raised? Then the code path that properly closes the file might never be reached! You could close the file in every thinkable scenario – but that would be both tedious and error-prone. For that reason there’s another way to handle those cases: close the file in the finally block and you can be sure that it will be closed regardless of what happens in the try or in any except block.
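
Sketched with the classic file example (the path is just a placeholder):

f = open('/tmp/example.txt', 'w')
try:
    f.write('important data')  # this might raise an exception...
finally:
    f.close()                  # ...but the file gets closed either way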

Ok, but what does this have to do with our little daemon? Actually a lot. That case of try/finally has been so common that Python provides a shortcut with so-called context managers. They are objects that manage a resource for you like this: You request it, it is valid only inside one block (the with one!) and when the block ends, the context manager takes care of properly cleaning up for you without having you add any extra code (or even without you knowing, if you just copy/paste code from the net without reading explanations like this).
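
With a context manager, the file example from above shrinks down to this:

with open('/tmp/example.txt', 'w') as f:
    f.write('important data')
# Once the block ends, the file is guaranteed to be closed -
# even if the write raised an exception.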

So the with statement in our code above lets Python handle the daemonization process while the main_program function is running. When it ends on the signal, Python cleans up everything and the process terminates – which is great for us. Accept that for now and live with the fact that you might not know just how it does that. We’ll come back to things like that.

Updated init script

Ok, the one thing left to do here is making the required changes to the init script. We are no longer using the daemon(8) utility, so we need to adjust it. Here’s the new one:

#!/bin/sh

. /etc/rc.subr

name=bdaemon
rcvar=bdaemon_enable

command="/root/bdaemon.py"
command_interpreter=/usr/local/bin/python3.6
pidfile="/var/run/${name}.pid"

load_rc_config $name
run_rc_command "$1"

Not too much changed here, but let’s still go over what has. The command definition is pretty obvious: The program can now daemonize itself, so we call it directly. It doesn’t take any arguments, which means we can drop command_args.

However we need to add command_interpreter instead (one important thing that I had overlooked at first), because the program will look like this in the process list:

/usr/local/bin/python3.6 /root/bdaemon.py

Without defining the interpreter, the init system would not recognize this process as being the correct one. Then we also need to point it to the pidfile, because in theory there could be multiple processes that match otherwise.

And that’s it! Now we have a daemon process running on FreeBSD, written in pure Python.

Python pieces

This next part is a completely optional excursion for people who are pretty new to programming. We’ll take a step back and discuss concepts like functions and arguments, modules, as well as namespaces. This should help you better understand what’s happening here, if you like to know more. Feel free to save some time and skip the excursion if you are familiar with those things.

Functions and arguments

As you’ve seen, functions are defined in Python by using the def keyword, the function name and – at the very least – an empty pair of parentheses. Inside the parentheses you could put one or more arguments if needed:

def greet(name):
    print("Hi, " + name + "!")

greet("Alice")
greet("Bob")

Here we’re passing a string to the function that it uses to greet that person. We can add a second argument like this:

def greet(name, phrase):
    print("Hi, " + name + "! " + phrase)

greet("Alice", "Great to see you again!")
greet("Bob", "How are you doing?")

The arguments used here are called positional arguments, because their position decides what goes where. Swap them when calling the function and the output will obviously be garbage, as the strings get assigned to the wrong function variable. However, it’s also possible to refer to the variables by name, so that the order no longer matters:

def greet(name, phrase):
    print("Hi, " + name + "! " + phrase)

greet(phrase="Great to see you again!", name="Alice")
greet("Bob", "How are you doing?")

This is what is used to assign the values for the daemon context. Technically it’s possible to mix the ways of calling (as done here), but that’s a bit ugly.

We’re not using them yet, but it’s good to know that they exist: default values. They mean that you can leave out some arguments when calling a function – if you are ok with the default value.

def greet(name, phrase = "Pleased to meet you."):
    print("Hi, " + name + "! " + phrase)

greet(phrase="Great to see you again!", name="Alice")
greet("Bob", "How are you doing?")
greet("Carol")

And then there’s something known as function overloading: multiple functions with the same name but a different number of arguments (so that it’s still possible to precisely identify which one needs to be called). Languages like C++ or Java support this – Python, however, does not: a later def with the same name simply replaces the earlier one. Default values (and mechanisms like *args) are Python’s way of achieving similar flexibility.

Modules

When reading about Python it usually won’t take too long before you come across the word module. But what’s a module? Luckily that’s rather easy to explain: It’s a file with the .py extension and with Python code in it. So if you’ve been following this daemon tutorial, you’ve been creating Python modules all the way!

Usually modules are what you might want to refer as to libraries in other languages. You can import them and they provide you with additional functions. You can either use modules that come with Python by default (that collection of modules is known as the standard library, so don’t get confused by the terminology there), additional third-party modules (there are probably millions) or modules that you wrote yourself.

It’s fairly easy to do the latter. Let’s pick up the previous example and put the following into a file called “greeter.py”:

forgot_name = "Sorry, what was your name again?"

def greet(name, phrase = "Pleased to meet you."):
    print("Hi, " + name + "! " + phrase)

Now you can do this in another Python program:

import greeter

greeter.greet("Carol")
print(greeter.forgot_name)

This shows that after importing we can use the “greet()” function in this program, even though it’s defined elsewhere. We can also access variables used in the imported module (greeter.forgot_name in this case).

Namespaces

Ever wondered what that dot means (when it’s not used in a filename)? You can think of it as a hierarchical separator. The standard Python functions (e.g. print) are available in the global namespace and can thus be used directly. Others live in a different namespace, and to use them it’s necessary to refer to that namespace as well as the function name, so that Python understands what you want and finds the function. One example that we’ve used is time.sleep().

Where does this additional namespace come from? Well, remember that we did import time at the top of the program? That created the “time” namespace (and made the functions from the time module available there).

There’s another way of importing; we could import either everything (using an asterisk (*) character, but that’s considered poor coding) or just specific functions from one module into the global namespace:

from time import sleep
sleep(2)
exit(0)

This code will work because the “from MODULE import FUNCTION” statement in this example imported the sleep function so that it becomes available in the global namespace.

So why do we go through all the hassle of having multiple namespaces in the first place? Can’t we just put everything in the global one? Sure, we could – and for simpler programs that’s in fact an option. But consider the following case: Python provides the built-in open function. It’s used to open a file and get a nice object back that makes accessing or manipulating data really easy. But then there’s also os.open, which is not as friendly but lets you use more advanced things, since it exposes the raw operating system functionality. See the problem?

If you import the functions from os into the global namespace, you have a name clash in the case of open. This is not an error, mind you. You can actually do that, but you should know what happens. The function imported later will override the one that went by that name previously, effectively making the original one inaccessible. This is called “shadowing” of the original function.
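
Here’s a small demonstration of that shadowing (on CPython under Unix; the __module__ attribute tells us which module a function came from):

print(open.__module__)  # 'io' - the friendly built-in open

from os import open     # shadows the built-in open!
print(open.__module__)  # 'posix' - open now refers to os.open

# The built-in usage would now fail, because os.open
# expects integer flags instead of a mode string:
# open('/tmp/demo.txt', 'w')  -> TypeError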

To avoid problems like this it’s often better to have your own separate namespace where you can be sure that no clashes happen.

What’s next?

In the next part we’ll take a look at implementing IPC (inter-process communication) using named pipes (a.k.a “fifos”).

Writing a daemon using FreeBSD and Python pt.1

Being a sysadmin by profession, I don’t code. At least not often enough, or with output of high enough quality, that programmers would accept calling it coding. I do write and maintain shell scripts. I also write new formulas for configuration management with SaltStack.

The latter is Python-based and after hearing mostly good things about that language, I’ve been trying to do some simple things with it for a while now. And guess what: It’s just so much more convenient compared to using shell code! I’ll definitely keep doing some simple tasks in Python, just to get some experience with it.

Not too long ago I thought about a little project that I’d try to do and decided to go with Python again. Thinking about what the program should do, I figured that a daemon would make a nice fit for it. But how do you write a daemon? Fortunately it’s especially easy on FreeBSD. So let’s go!

Python

The first thing that I did was to create a new file called bdaemon.py (for “baby daemon”) and use chmod to make it executable. And here’s what I put into it as a first test:

#!/usr/local/bin/python3.6

 # Imports #
import time

 # Globals #
TTL_SECONDS = 30
TTL_CHECK_INTERVAL = 5

 # Functions #

 # Main #
print("Started!")
for i in range(1, TTL_SECONDS + 1):
    time.sleep(1)
    if i % TTL_CHECK_INTERVAL == 0:
        print("Running for " + str(i) + " seconds...")
print("TTL reached, terminating!")
exit(0)

This very simple program has the shebang line that points the operating system to the right interpreter. Then I import Python’s time module which gives me access to a lot of time-related functions. Next I define two global variables that control how long the program runs and in which interval it will give output.

The main part of the program first outputs a starting message on the terminal. It then enters a for loop that counts from 1 to 30. In Python you do this by providing a list of values after the in keyword. Counting to 5 could have been written as for i in [1, 2, 3, 4, 5]: for example.

With range we can have Python create a list of sequential numeric values on the fly – and since it’s much less to type (and allows for dynamic list creation by setting the final number via a variable), I chose to go with that. Oh, BTW: In Python the last value of those ranges is exclusive, not inclusive. This means that range(1, 5) leads to [1, 2, 3, 4] – if you want the 5 included in the list, you have to use range(1, 6)! That’s why I add 1 to the TTL_SECONDS variable.
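
You can see this for yourself in the interactive interpreter:

print(list(range(1, 5)))  # [1, 2, 3, 4] - the 5 is excluded!
print(list(range(1, 6)))  # [1, 2, 3, 4, 5]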

I use time.sleep to create a delay in the loop block. Then I check if the remainder of the division of the current running time by the defined check interval is zero (% is the modulus operator, which gives the remainder of a division). If it is, the program creates more output.
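
Here’s the same check in isolation, so you can play with it:

for i in range(1, 11):
    if i % 5 == 0:
        print(str(i) + " is divisible by 5")  # prints this for 5 and 10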

Mind the indentation: In Python it is used to create code blocks. The for statement is not indented, but it ends with a colon. That means that it’s starting a code block. Everything up to (but not including) the second to last print statement is indented by four spaces and thus part of the code block. Said print statement is indented two levels (8 spaces) – that’s because it’s another block of its own started by the if statement before it. We could create a third, fourth and deeper levels of indentation if we required other blocks beneath the if block.

Eventually the program will print that the TTL has been reached and exit the program with an error code of 0 (which means that there was no error).

Have you noticed the str(i) part in one of the print statements? That is required because the counter variable “i” holds numeric values and we’re printing data of a different type. So to be able to concatenate (that’s what the plus sign is doing in this case!) the variable’s contents to the rest of the data, it needs to match its type. We’re achieving this by doing a conversion to a string (think converting the number 5 to the literal “5” that can be part of a line of text where it looks similar but is actually a different thing).
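
The difference is easy to provoke in two lines:

i = 5
print("Running for " + str(i) + " seconds...")  # works fine
# print("Running for " + i + " seconds...")     # TypeError: can only concatenate str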

Oh, and the pound signs are used to start comments that are ignored by Python. And that’s already it for some fundamental Python basics. Hopefully enough to understand this little example code (if not, tell me!).

Signals

The next thing to explore is signal handling. Since a daemon is essentially a program running in the background, we need a way to tell it to quit for example. This is usually done by using signals. You can send some of them to normal programs running in the terminal by hitting key combinations, while all of them can be sent by the kill command.

If you press CTRL-C for example, you’re sending SIGINT to the currently running application, telling it “abort operation”. A somewhat similar one is SIGTERM, which kind of means “hey, please quit”. It’s a graceful shutdown signal, allowing the program to e.g. do some cleanup and then shut down properly.

If you use kill -9, however, you’re sending SIGKILL, the ungraceful shutdown signal, that effectively means “die!” for the process targeted (if you’ve ever done that to a live database or another touchy application, you know that you really have to think before using it – or you might be in for all kinds of pain for the next few hours).

#!/usr/local/bin/python3.6

 # Imports #
import signal
import time

 # Globals #
TTL_SECONDS = 30
TTL_CHECK_INTERVAL = 5

 # Functions #
def signal_handler(signum, frame):
    print("Received signal" + str(signum) + "!")
    if signum == 2:
        exit(0)

 # Main #
signal.signal(signal.SIGHUP, signal_handler)
signal.signal(signal.SIGINT, signal_handler)
signal.signal(signal.SIGQUIT, signal_handler)
signal.signal(signal.SIGILL, signal_handler)
signal.signal(signal.SIGTRAP, signal_handler)
signal.signal(signal.SIGABRT, signal_handler)
signal.signal(signal.SIGEMT, signal_handler)
#signal.signal(signal.SIGKILL, signal_handler)
signal.signal(signal.SIGSEGV, signal_handler)
signal.signal(signal.SIGSYS, signal_handler)
signal.signal(signal.SIGPIPE, signal_handler)
signal.signal(signal.SIGALRM, signal_handler)
signal.signal(signal.SIGTERM, signal_handler)
#signal.signal(signal.SIGSTOP, signal_handler)
signal.signal(signal.SIGTSTP, signal_handler)
signal.signal(signal.SIGCONT, signal_handler)
signal.signal(signal.SIGCHLD, signal_handler)
signal.signal(signal.SIGTTIN, signal_handler)
signal.signal(signal.SIGTTOU, signal_handler)
signal.signal(signal.SIGIO, signal_handler)
signal.signal(signal.SIGXCPU, signal_handler)
signal.signal(signal.SIGXFSZ, signal_handler)
signal.signal(signal.SIGVTALRM, signal_handler)
signal.signal(signal.SIGPROF, signal_handler)
signal.signal(signal.SIGWINCH, signal_handler)
signal.signal(signal.SIGINFO, signal_handler)
signal.signal(signal.SIGUSR1, signal_handler)
signal.signal(signal.SIGUSR2, signal_handler)
#signal.signal(signal.SIGTHR, signal_handler)

print("Started!")
for i in range(1, TTL_SECONDS + 1):
    time.sleep(1)
    if i % TTL_CHECK_INTERVAL == 0:
        print("Running for " + str(i) + " seconds...")
print("TTL reached, terminating!")
exit(0)

For this little example code I’ve added a function called “signal_handler” – because that’s what it is for. And in the main program I installed that signal handler for quite a lot of signals. To be able to do that, I needed to import the signal module, of course.

If this program is run, it will handle every signal you can send on a FreeBSD system (run kill -l to list all available signals on a Unix-like operating system). Why are some of those commented out? Well, try commenting those lines in! Python will complain and stop your program. This is because not all signals are allowed to be handled.

SIGKILL for example is by its nature something that you don’t want to allow to be overridden with custom behavior after all! While your program can choose to handle e.g. SIGINT and decide to ignore it, SIGKILL means that the process totally needs to be shut down immediately.

Try running the program and send some signals while it’s running. On BSD systems you can e.g. press CTRL-T to send SIGINFO. The operating system prints some information about the current load. And then the program has the chance to output some additional information (some may tell you which file they currently process, how many percent of a copy they have finished, etc.). If you send SIGINT, this program terminates as it should.

Logging

There’s another thing that we have to consider when dealing with processes running in the background: A daemon detaches from the TTY. That means it can no longer receive input the usual way from STDIN. But we investigated signals so that’s fine. However it also means a daemon cannot use STDOUT or STDERR to print anything to the terminal.

Where does the data go that a daemon writes to e.g. STDOUT? It goes to the system log. If no special configuration for it exists, you will find it in /var/log/messages. Since we expect quite a bit of debug output during the development phase, we don’t really want to clutter /var/log/messages with all of that. So to write a well-behaving little daemon, there’s one more topic that we have to look into: Logging.

#!/usr/local/bin/python3.6

 # Imports #
import logging
import signal
import time

 # Globals #
TTL_SECONDS = 30
TTL_CHECK_INTERVAL = 5

 # Functions #
def handler_sigterm(signum, frame):
    logging.debug("Exiting on SIGTERM")
    exit(0)

def handler_sigint(signum, frame):
    logging.debug("Not going to quit, there you have it!")

 # Main #
signal.signal(signal.SIGINT, handler_sigint)
signal.signal(signal.SIGTERM, handler_sigterm)
try:
    logging.basicConfig(filename='bdaemon.log', format='%(levelname)s:%(message)s', level=logging.DEBUG)
except:
    print("Error: Could not create log file! Exiting...")
    exit(1)

logging.info("Started!")
for i in range(1, TTL_SECONDS + 1):
    time.sleep(1)
    if i % TTL_CHECK_INTERVAL == 0:
        logging.info("Running for " + str(i) + " seconds...")
logging.info("TTL reached, terminating!")
exit(0)

The code has been simplified a bit: Now it installs only handlers for two signals – and we’re using two different handler functions. One overrides the default behavior of SIGINT with a dummy function, effectively refusing the expected behavior for testing purposes. The other one handles SIGTERM in the way it should. If you are fast enough on another terminal window, you can figure out the PID of the running program and then kill -15 it.

Logging with Python is extremely simple: You import the module for it, call a function like logging.basicConfig – and start logging. This line sets the filename of the log to “bdaemon.log” (for “baby daemon”) in the current directory. It changes the default format to displaying just the log level and the actual message. And then it defines the lowest level that should be logged.

There are various pre-defined levels like debug, info, warning, critical, etc. But what’s that try and except thing? Well, the logging module will attempt to create a logfile (or append to it, if it already exists). This is an operation that could fail. Perhaps we’re running the program in a directory where we don’t have the permission to create the log file? Or maybe for whatever reason a directory of that name exists? In both cases Python cannot create the file and an error occurs.

If such a thing happens, Python doesn’t know what to do. It knows what the programmer wanted to do, but has no clue on what to do if things fail. Does it make sense to keep the program running if something unexpected happened? Probably not. So it throws an exception. If an unhandled exception occurs, the program aborts. But we can catch the exception.

By putting the function that opens the file in a try block, we’re telling Python that we’re expecting it could fail. And with except we can catch an exception and handle expected problems. There are a lot of exception types; by not specifying any, we’re catching all of them. That might not be the best idea, because maybe something else happened and we’re just expecting that the logfile could not be created. But let’s keep it simple for now.
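
Catching only what we actually expect could look roughly like this (just a sketch – the example code above deliberately keeps the simple bare except):

import logging

try:
    logging.basicConfig(filename='bdaemon.log', format='%(levelname)s:%(message)s', level=logging.DEBUG)
except OSError:  # e.g. PermissionError if we may not write here
    print("Error: Could not create log file! Exiting...")
    exit(1)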

The one remaining thing to do is to change any print statements so that we’re using logging instead. Depending on how important the log entry is, we can also use different levels, from least important (DEBUG) to most important (CRITICAL).
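
The standard levels in ascending order of severity look like this:

import logging
logging.basicConfig(level=logging.DEBUG)

logging.debug("details that only matter during development")
logging.info("normal operational messages")
logging.warning("something odd happened, but we can continue")
logging.error("an operation failed")
logging.critical("the program cannot reasonably continue")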

You can either wait for the program to finish and then take a look at the log, or you open a second terminal and tail -f bdaemon.log there to watch output as the program is running.

Alright! With this we have everything required to daemonize the program next. Let’s write a little init script for it, shall we?

Init

Init scripts are used to control daemons (start and stop them, telling them to reload the configuration, etc.). There are various init systems in use across Unix-like operating systems. FreeBSD uses the standard BSD init system called rc.d. It works with little (or not so little, if you need to manage very complex daemons) shell scripts.

Since a lot of the functionality of the init system is the same across most of these scripts, rc.d handles all the common cases in shell files of its own that are then used in each of the scripts. In Python this would be done by importing a module; the term in shell scripting is to source another shell script (or fragment).

Create the file /usr/local/etc/rc.d/bdaemon with the following contents:

#!/bin/sh

. /etc/rc.subr

name=bdaemon
rcvar=bdaemon_enable

command="/usr/sbin/daemon"
command_args="-p /var/run/${name}.pid /path/to/script/bdaemon.py"

load_rc_config $name
run_rc_command "$1"

Yes, you need root privileges to do that. Daemons are system services and so we’re messing with the system now (totally at beginner level, though). Save the file and you should be able to start the program as a daemon e.g. by running service bdaemon onestart!

How’s that? What does that all mean and where does the daemonization happen? Well, the first line after the shebang sources the main rc fragment with all the required functions (read the dot as “source”). Then it defines a name for the daemon and an rcvar.

What is an rcvar? Well, by putting “bdaemon_enable=YES” into your /etc/rc.conf you could enable this daemon for automatic startup when the system is coming up. If that line is not present there, the daemon will not start. That’s why we need to use “onestart” to start it anyway (try it without the “one” if you’ve never done that and see what happens!).

Then the command to run as well as the arguments for that command are defined. And eventually two helper functions from rc.subr are called which do all the actual complex magic that they thankfully hide from us!

Ok, but what is /usr/sbin/daemon? Well, FreeBSD comes with an extremely useful little utility that handles the daemonization process for others! This means it can help you if you want to use something as a background service but you don’t want to handle the actual daemonization yourself. Which is perfect in our case! With it you could even write a daemon in shell script for example.

The “-p” argument tells the daemon utility to handle the PID file for the process as well. This is required for the init system to control the daemon. While our little example program is short-lived, we can still do something while it runs. Try out service bdaemon onestatus and service bdaemon onestop for example. If there was no PID file present, the init system would claim that the daemon is not running, even if it is! And it would not be able to shut it down.

There we go. Our first FreeBSD daemon process written in Python! One last thing that you should do is change the filename for the logfile to use an absolute path like /var/log/bdaemon.log. If you want to read more about the daemon utility, read its manpage, daemon(8). And should you be curious about what the init system can do, have a look at the rc(8) manual page.

What’s next?

While using /usr/sbin/daemon is perfectly fine, you might feel that we kind of cheated. So next time we’ll take a brief look at daemonizing with Python directly.

I also want to explore IPC (“inter-process communication”) with named pipes. This will allow for a little bit more advanced daemon that can be interacted with using a separate program.

The history of *nix package management

Very few people will argue against the statement that Unix-like operating systems conquered the (professional) world due to a whole lot of strong points – one of which is package management. Whenever you take a look at another *nix OS or even just another Linux distro, one of the first things to do (if not the first!) is to get familiar with how package management works there. You want to be able to install and uninstall programs after all, right?

If you’re looking for another article on using jails on a custom-built OPNsense BSD router, please bear with me. We’re getting there. To make our jails useful we will use packages. And while you can safely expect any BSD or Linux user to understand that topic pretty well, products like OPNsense are also popular with people who are Windows users. So while this is not exactly a follow-up article on the BSD router series, I’m working towards it. Should you not care for how that package management stuff all came to be, just skip this post.

When there’s no package manager

There’s this myth that Slackware Linux has no package manager, which is not true. However Slackware’s package management lacks automatic dependency resolving. That’s a very different thing but probably the reason for the confusion. But what is package management and what is dependency resolving? We’ll get to that in a minute.

To be honest, it’s not very likely today to encounter a *nix system that doesn’t provide some form of package manager. If you have such a system at hand, you’re quite probably doing Linux from Scratch (a “distribution” meant to learn the nuts and bolts of Linux systems by building everything yourself) or have manually installed a Linux system and deliberately left out the package manager. Both are special cases. Well, or you have a fresh install of FreeBSD. But we’ll talk about FreeBSD’s modern package manager in detail in the next post.

Even Microsoft has included Pkgmgr.exe since Windows Vista. While it goes by the name of “package manager”, it turns pale when compared to *nix package managers. It is a command-line tool that allows you to install and uninstall packages, yes. But those are limited to operating system fixes and components from Microsoft. Nice try, but what Redmond offered in late 2006 is vastly inferior to what the *nix world had more than 10 years earlier.

There’s the somewhat popular Chocolatey package manager for Windows and Microsoft said that they’d finally include a package manager called “one-get” (apt-get anyone?) with Windows 10 (or was it “nu-get” or something?). I haven’t read a lot about it on major tech sites, though, and thus have no idea if people are actually using it and if it’s worth trying out (I would, but I disagree with Microsoft’s EULA and thus haven’t had a Windows PC in roughly 10 years).

But how on earth are you expected to work with a *nix system when you cannot install any packages?

Before package managers: Make magic

Unix began its life as an OS by programmers for programmers. Want to use a program on your box that is not part of your OS? Go get the source, compile and link it, and then copy the executable to /usr/local/whatever. In times where you would have just some 100 MB of storage in total (or even less), this probably worked well enough. You simply couldn’t go on a rampage and install unneeded software anyway, and by sticking to the /usr/local scheme you separated optional stuff from the actual operating system.

More space became available however and software grew bigger and more complex. Unix got the ability to use libraries (“shared objects”), ELF executables, etc. To solve the task of building more complicated software easily, make was developed: a tool that reads a Makefile which tells it exactly what to do. Software began shipping not just with the source code but also with Makefiles. Provided that all dependencies existed on the system, it was quite simple to build software again.

[Image: Compilation process (invoked by make)]

Makefiles also provide a facility called “targets” which made a single file support multiple actions. In addition to a simple make statement that builds the program, it became common to add a target that allowed for make install to copy the program files into their assumed place in the filesystem. Doing an update meant building a newer version and simply overwriting the files in place.

Make can do a lot more, though. Faster recompiles by looking at the generated files’ timestamps (and only rebuilding what has changed and needs to be rebuilt) and other features like this are not of particular interest for our topic. But they certainly helped with the quick adoption of make by most programmers. So the outcome for us is that we use Makefiles instead of compile scripts.

Dependency and portability trouble

Being able to rely on make to build (and install) software is much better than always having to invoke compiler, linker, etc. by hand. But that didn’t mean that you could just type “make” on your system and expect it to work! You had to read the readme file first (which is still a good idea, BTW) to find out which dependencies you had to install beforehand. If those were not available, the compilation process would fail. And there was more trouble: Different implementations of core functionality in various operating systems made it next to impossible for programmers to make their software work on multiple Unices. The introduction of the POSIX standard helped quite a bit, but operating systems still had differences to take into account.

[Image: Configure script running]

Two of the answers to the dependency and portability problems were autoconf and metaconf (the latter is still used for building Perl, where it originated). Autoconf is a tool used to generate configure scripts. Such a script is run right after extracting the source tarball; it inspects your operating system. It will check if all the needed dependencies are present and if core OS functionality meets the expectations of the software that is going to be built. This is a very complex matter – but thanks to the people who invested that tremendous effort in building those tools, actually building fairly portable software became much, much easier!

How to get rid of software?

Back to make. So we’re now in the pleasant situation that it’s quite easy to build software (at least when you compare it to the dark days of the past). But what would you do if you want to get rid of some program that you installed previously? Your best bet might be to look closely at what make install did and remove all the files that it installed. For simple programs this is probably not that bad but for bigger software it becomes quite a pain.

Some programs also came with an uninstall target for make, however, which would delete all installed files again. That’s quite nice, but there’s a problem: after building and installing a program you would probably delete the source code. And having to unpack the sources again to uninstall the software is quite some effort if you didn’t keep them around. Especially since you need the source for exactly the same version – newer versions might install more or different files, too!

This is the point where package management comes to the rescue.

Simple package management

So how does package management work? Well, let’s look at packages first. Imagine you just built version 1.0.2 of the program foo. You probably ran ./configure and then make. The compilation process succeeded and you could now issue make install to install the program on your system. The package building process is somewhat similar – the biggest difference is that the install destination was changed! Thanks to the modifications, make wouldn’t put the executable into /usr/local/bin, the manpages into /usr/local/man, etc. Instead make would then put the binaries e.g. into the directory /usr/obj/foo-1.0.2/usr/local/bin and the manpages into /usr/obj/foo-1.0.2/usr/local/man.

[Image: Installing tmux with installpkg (on Slackware)]

Since this location is not in the system’s PATH, it’s not of much use on this machine. But we wanted to create a package and not just install the software, right? As a next step, the contents of /usr/obj/foo-1.0.2/ could be packaged up nicely into a tarball. Now if you distribute that tarball to other systems running the same OS version, you can simply untar the contents to / and achieve the same result as running make install after an unmodified build. The benefit is obvious: You don’t have to compile the program on each and every machine!

So much for primitive package usage. Advancing to actual package management, you would include a list of files and some metadata in the tarball. Then you wouldn’t extract packages by hand but leave that to the package manager. Why? Because it does not only extract all the needed files: it also records the installation in its package database and keeps the file list around in case it’s needed again.

[Image: Uninstalling tmux and extracting the package to look inside]

Installing using a package manager means that you can query it for a list of installed packages on a system. This is much more convenient than ls /usr/local, especially if you want to know which version of some package is installed! And since the package manager keeps the list of files installed by a package around, it can also take care of a clean uninstall without leaving you wondering if you missed something when you deleted stuff manually. Oh, and it will be able to lend you a hand in upgrading software, too!

That’s about what Slackware’s package management does: It enables you to install, uninstall and update packages. Period.

Dependency tracking

But what about programs that require dependencies to run? If you install them from a package you never ran configure and thus might not have the dependency installed, right? Right. In that case the program won’t run. As simple as that. This is the time to ldd the program executable to get a list of all libraries it is dynamically linked against. Note which ones are missing on your system, find out which other packages provide them and install those, too.

[Image: Pacman (Arch Linux) handles dependencies automatically]

If you know your way around, this works OK. If not… Well, while there are a lot of libraries where you can guess from the name which package they likely belong to, there are others, too. Happy hunting! Got frustrated already? Keep telling yourself that you’re learning fast the hard way. This might ease the pain. Or go and use a package management system that provides dependency handling!

Here’s an example: You want to install BASH on a *nix system that just provides the old bourne shell (/bin/sh). The package manager will look at the packaging information and see: BASH requires readline to be installed. Then the package manager will look at the package information for that package and find out: Readline requires ncurses to be present. Finally it will look at the ncurses package and nod: No further dependencies. It will then offer you to install ncurses, readline and BASH for you. Much easier, eh?
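
Expressed as code, the idea is a simple recursive walk over the package metadata. Here’s a toy sketch in Python with a made-up, hypothetical package database – real package managers are of course far more sophisticated:

# Hypothetical metadata, for illustration only:
DEPENDS = {
    'bash': ['readline'],
    'readline': ['ncurses'],
    'ncurses': [],
}

def resolve(pkg, order=None):
    # Returns the packages in the order they need to be installed.
    if order is None:
        order = []
    for dep in DEPENDS[pkg]:
        resolve(dep, order)
    if pkg not in order:
        order.append(pkg)
    return order

print(resolve('bash'))  # ['ncurses', 'readline', 'bash']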

[Image: Xterm and all dependencies downloaded and installed (Arch Linux)]

First package managers

A lot of people claim that the RedHat Package Manager (RPM) and Debian’s dpkg are examples of the earliest package managers. While both of them are so old that using them directly is in fact inconvenient enough to justify the existence of other programs that allow using them indirectly (yum/dnf and e.g. apt-get), this is not true.

PMS (short for “package management system”) is generally regarded to be the first (albeit primitive) package manager. Version 1.0 was ready in mid 1994 and used on the Bogus Linux distribution. With a few intermediate steps this led to the first incarnation of RPM, Red Hat’s well-known package manager, which first shipped with Red Hat Linux 2.0 in late 1995.

FreeBSD 1.0 (released in late 1993) already came with what is called the ports tree: A very convenient package building framework using make. It included version 0.5 of pkg_install, the pkg_* tools that would later become part of the OS! I’ll cover the ports tree in some detail in a later article because it’s still used to build packages on FreeBSD today.

[Image: Part of a Makefile (actually for a FreeBSD port)]

Version 2.0-RELEASE (late 1994) shipped the pkg_* tools. They consisted of a set of tools like pkg_add to install a package, pkg_info to show installed packages, pkg_delete to delete packages and pkg_create to create packages.

FreeBSD’s pkg_add got support for using remote repositories in version 3.1-RELEASE (early 1999). But those tools were really showing their age when they were put to rest with 10.0-RELEASE (early 2014). A replacement had been developed in the form of the much more modern solution initially called pkg-ng, or simply pkg. Again, that will be covered in another post (the next one, actually).

With the ports tree, FreeBSD undoubtedly had the most sophisticated package building framework of its time. It’s still one of the most flexible ones, and a joy to work with compared to creating DEB or RPM packages… And since Bogus’s PMS was started at least a month after pkg_install, it’s even entirely possible that the first working package management tool was in fact another FreeBSD innovation.

Precomp (or: How to compress already compressed data?)

It’s a kind of strange feeling, but while half of the IT world seems to either burn already or tremble with fear, I can choose freely whatever topic I want to write about this month. I haven’t had a Windows box for almost a decade now, and the people who I work or keep in contact with are also mostly *nix-only. So this post is not about encryption or ransomware at all. It is about useful, respectable compression. Or, more precisely: the art of re-compressing already compressed data!

In January Precomp, a precompression utility, was open-sourced! The first two sections tell a bit about how I became interested in this topic and in Precomp. Skip them if you don’t want to read that kind of stuff.

Compressing compressed data?

When I was young and new to PCs, I once tried to compress a ZIP archive with ACE (a lesser known archiver that once was comparable to the more popular RAR). I knew that ACE offered stronger compression and so I thought that this should make the file smaller. Just imagine my surprise when it turned out that I was wrong!

I guess that most of us have a story like that to tell, a story from our childhood when compression was nothing short of magic. Later I began to understand that even though it does in fact start with “m”, it’s not magic but math (a subject that I totally sucked at in school – but fortunately I grasped enough to get a rough idea of how compression works ;)). Then there was no surprise anymore: compressed data is not a good fit for any other general-purpose compression method, even if it was compressed with a weak algorithm.

How to work around that? Well, decompressing the ZIP file and creating a new ACE archive does the trick in the case mentioned above. Of course things are not always that straightforward. If they were, I wouldn’t really have much to write about right now and this post would be really, really short!

For whatever reason, compression continued to fascinate me and I loved compressing things to sizes as tiny as possible. It was fun to try out new experimental compression programs specialized in specific types of files. I did that for years – until I had to stop due to a lack of time.

Games

Let’s fast-forward some years from that failed compression experiment with ACE; I had replaced DOS 6.22 with Win95, which I had replaced with Win98 (SE), which I had replaced with WinME, … One day I wanted to install Quake ]|[ Arena (yes, friends, I once was 1337 young enough to spell it like that!) on my main computer to get into it again for a LAN party the next weekend. So I went looking for the darn CD. It took me a while, but I finally found the CD case. I opened it up and… the CD itself was missing. Oh great! Since I didn’t feel like looking into all the other cases to find out into which one I might have put it accidentally, I decided to just copy the game off an older computer which had it already installed (ID were nice people. I don’t remember which version of Q3A it was, but there eventually was an official patch which also removed the CD check for the game, so there was no need for a crack or anything).

Now, different versions of Windows didn’t always play together too well on the LAN, and since my Quake installation was on a computer with an older Windows (and I didn’t have another cable at hand), I decided that I’d just burn it to CD. It turned out, however, that the other machine didn’t have vanilla Q3A installed but the expansion set as well. Together it was obviously too big to fit on one CD. There would have been easy solutions: Leave out the resource files for the expansion, burn two CDs, put the hard drive into the new computer, … Sure, easy solutions are nice and all. But sometimes they are also boring! And when you’re young and have some free time, you don’t do boring stuff. So of course I opted for the more challenging solution: Get it all on one CD!

Quake 3’s resource containers go by the file extension of .pk3 and, more importantly, are in fact ZIP files without any compression. This meant that they could be compressed well because there was no ZIP compression getting in the way. But guess what: Even after applying the most extreme compression programs, the result simply would not fit onto one CD…
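Since a .pk3 is just a renamed ZIP archive, you can peek inside one with a few lines of Python using the standard zipfile module. Here’s a little sketch (the file name pak0.pk3 is only an example; adjust the path to your installation) that lists each entry and whether it is stored uncompressed or deflated:

import zipfile

# A .pk3 is an ordinary ZIP archive, so zipfile can read it directly.
# "pak0.pk3" is an example file name, not a fixed path.
with zipfile.ZipFile("pak0.pk3") as pk3:
    for info in pk3.infolist():
        method = "stored" if info.compress_type == zipfile.ZIP_STORED else "deflated"
        print(info.filename, method, info.file_size)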

Bad luck, eh? Well, not really. Unpacking the container files was in fact the solution in this case. Not because of weak compression, but because it enabled me to test each of the contained files separately with all the compressors and to group together the files that compressed best with one compression utility or another! I think I was able to shrink it all down to just a couple of megs over the CD limit. There were blank CDs with 800 MB capacity as well, so it would have fit onto one of those – but I didn’t have any. So I replaced the ID video with an empty video file and I was set.

Since I liked doing these things, I began making backups like that for a lot of my favorite games: ripping apart (and later rebuilding) resource containers, converting between file formats, decompressing whatever could be decompressed before applying stronger compression, etc.

How Precomp works

The more I got into free and open source things, the more I wondered if some of them wouldn’t benefit from better compression. A friend and former classmate of mine invented Precomp and I of course was among the first to make use of it and provide feedback. But what is Precomp?

Precomp is what the name says: A pre-compressor. It is not directly meant to reduce the size of files. On the contrary: It can make some files even bigger than the original input. But that’s actually a good thing! How’s that? Well, it’s meant to prepare files for compression so that eventually these files can be compressed to a smaller size than the original file could be – without losing data, of course!

What Precomp does is look for streams in its input file that are compressed with a compression method known to Precomp. It then decompresses and recompresses them so that they can be compared. If they are identical, Precomp will write the decompressed stream (plus the information needed to recompress it properly) to its output file.

While this sounds quite simple in theory, it is in fact a bit more complex. The reason for that lies in the flexibility of some compression algorithms. Have you ever zipped up a file? Then you know that there are a lot of parameters you can provide which affect how the file will be compressed: “fast”, “normal”, “strong” or “maximum” compression? What about the dictionary size? A lot of things like that. Any such combination of compression parameters will result in a valid zip stream that can be decompressed by any zip-compatible utility. Replacing such a stream with a compatible one is fairly easy. Reproducing the exact, bit-for-bit identical stream is not.

To be truly lossless, Precomp uses trial and error on each stream. If it can figure out the combination of parameters that results in the original stream: Great! If not, that stream has to be left untouched.
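To give a rough idea of that trial and error step, here’s a minimal Python sketch using the standard zlib module. This is of course not Precomp’s actual code (which handles far more parameters and stream formats); it only brute-forces the compression level of a plain zlib stream:

import zlib

def find_zlib_level(original):
    """Return the zlib compression level (0-9) that reproduces the
    original stream bit for bit, or None if no level matches (in that
    case the stream would have to be left untouched)."""
    raw = zlib.decompress(original)
    for level in range(10):
        if zlib.compress(raw, level) == original:
            return level
    return None

# Example: A stream compressed at level 6 is correctly identified.
stream = zlib.compress(b"some example data " * 50, 6)
print(find_zlib_level(stream))  # prints: 6

If a matching level is found, storing the decompressed data plus that one number is enough to reconstruct the exact original stream later.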

What Precomp can do

Early versions of Precomp were only available on Windows but there have been Linux versions for quite a while as well. I also use it on FreeBSD without any problems. The .PCF files are platform-independent. You can restore the original file on Windows from a file precompressed on Linux or BSD and vice versa.

While Precomp originally was only a pre-compressor for zlib streams (which are used in a variety of file formats like ZIP, GZIP, PNG, PDF, …), it can do more things now. It can use bzip2 to compress its input file after precompression. It can losslessly compress some JPEG pictures to smaller sizes (thanks to an external library). And in the current development version there’s even support for compressing MP3 music files further (also using an external lib)!

Currently, Precomp relies on temporary files for all the extracted streams and thus puts heavy load on your hard drive (and is a bit slow due to that bottleneck). SSDs obviously perform better, but it totally makes sense to use a memdrive if you can spare some RAM for it. I’ve forked the project on Github and added an experimental shell script to assist with the creation of such a memdrive. It’s currently FreeBSD only (I’ve migrated all of my boxes to *BSD and currently have no Linux machine remaining but will set up one for cases like that some time in the future). Feel free to take a look at it if you’re into portable shell scripting and please do tell me if you have any suggestions!

Precomp is not at all at the limit of its possibilities. There are a lot of things that can be tweaked, optimized or added. If you feel like that could be a fun project – go ahead and play with it, it’s on Github. Or perhaps you have an idea what this could be useful for? Please help yourself and use it. It’s free software after all (Apache licensed).

Thea: The gain of giving away for free

This post is inspired by the game Thea: The Awakening. No, Eerie Linux has not mutated into a games blog. Yes, I will give a short description of the game. But what this post is really about is some thoughts about software development in the past, today and what could be a more open future.

Why Thea? Because the developers did something very uncommon: They decided to give the game away for free – if you’re a Linux user that is!

Thea: The Awakening

The game in question is a turn-based strategy game with a strong focus on survival. There’s a nice background story: The world has turned to darkness (playing the game you will discover why) and is haunted by creatures and spirits of the dark. Now the sun is rising again and the gods have returned, but both are very weak and darkness will not give up without a fierce fight. Slavic mythology makes for a very nice and rather uncommon setting.

In case you want to give it a try, you can find a download link here. And yes, it is really completely free. You don’t need to buy the Windows version first or something.

I’ve successfully run the game on the Mint laptop that I share with my wife and can confirm that it works well. No luck on a 32-bit machine that I installed Arch on to give the 32-bit version of the game a try: It wouldn’t start, and the console messages gave no clues as to why. So if you’re still stuck with 32-bit-only systems, you’re probably out of luck. 😉

The developers stated that they have not even tested the Linux version themselves! So what works and what doesn’t? Most things seem to work surprisingly well, in fact. Sound, graphics, even the intro video. I’ve experienced graphical glitches with some white pixels appearing for a second (nope, no AMD video card – it’s Intel!). But this happens only rarely and is a fairly minor issue. Far more annoying is the fact that you cannot really use the keyboard: A key press works but the release event doesn’t… This is a known issue with the version of the Unity engine that Thea uses. It may or may not be addressed in a future release. You can however get the keys released by ALT-TABbing out of the game and back in. That way you can at least always access the menu.

You choose one of the gods when starting a game. I’ve played scenarios for multiple gods now. The main story (“Cosmic Tree”) gets pretty repetitive soon since it’s always the same. This is also true for a lot of the other quests. However, the game has options to skip a lot of the text in case you already know it, which certainly was a good idea. Some of the quests differ depending on which god you chose, which keeps things interesting story-wise. Maps, resources, encounters, etc. are randomly generated for each game. Together with the challenging survival aspect, plenty of combinations to try when crafting items and interesting gameplay, this gives Thea a rather high replay value.

Software development models

I’d like to distinguish several development approaches here and give each model, as I see it, a name. These are no official models (I’m not a game developer) but an attempt to sum each approach up in one heading.

The shareware model

There was once a time when software was developed in a purely closed manner. It was developed internally and when it was ready, a release was done and advertised. The good thing was that games were often cut into “episodes” and the first one given away as shareware so people could try out the game for free and might decide to buy the full product.

The public relations model

Advertising grew bigger and bigger as well as more and more aggressive. Top titles were often announced as soon as development began, and some material was released along the way to keep people hooked. This worked in some cases and failed in others (say, Duke Nukem Forever, announced in 1996).

It was a reasonable move to try to build up an audience interested in a certain title early. The problem with that is mainly twofold: You cannot keep people hooked for an arbitrary amount of time, and such a continuous advertising campaign costs a whole lot of money long before you start earning anything from sales.

These problems lead to a new one, however: Very high pressure on the developers to meet deadlines and stay on schedule. And sometimes the people in charge may even decide to release a half-baked product, which almost always is a very bad idea… (What was the latest example? That Batman game, perhaps?)

The community-aware model

It’s not a new insight that it is rather helpful for any title to have a large community. Some studios provide forums to make building up a community easier. And it’s also common knowledge today that feedback from that community is extremely valuable: Knowing your audience better helps a lot to provide the perfect product, after all!

The most important point of this model is that interacting with the players is now bidirectional: There’s advertising targeting them, but you certainly want to have (and honor) the feedback they provide. It also makes sense to design the game for modification and/or to provide tools that make creating mods as easy as possible. This can be a huge plus when it leads to a bigger, more active and longer-living community!

Independent of a single title, a studio can earn itself a good name by opening the source code of older games. This may require some cleaning-up work first, but some studios have also released code as-is (which can be rather terrible). Usually the community figures out what to do with it, and before long the game is ported to new platforms and receives technical updates and enhancements. This has made some titles downright immortal: There are still new episodes, mods and total conversions for Wolfenstein being released. Yes, for a game from 1992 with extremely “poor graphics” (320×200, 8-bit) by today’s standards! And there’s not one week without new maps for the mighty DooM (1993).

The community-supported model

There’s this interesting trend of “early access” games: Players are given the opportunity to playtest games before they are ready for release. People know they have to expect bugs but they can try out a game of their interest early and if they are very committed to it, they can report bugs as they encounter them.

This is a classic win-win situation: The developers get broad testing done for free and the players get an early peek at the game. Oh, and any form of interaction is of course always a good thing.

The community-backed model

That’s a rather new thing and basically means that some developers try to get their game crowd-funded. This can succeed and it can fail; there are examples of both. But while this is clearly a development model, since it has a lot of impact on development, I’d say that it’s also more of a special case than a general model.

The future?

MuHa Games have made one clever step ahead with Thea, as the gain from giving the title away for free on Linux is really considerable. How’s that? Well, if there were no Linux version, Linux people wouldn’t have bought the game either. So giving it away is no actual loss: The number of people of the “hey, I would have bought it for Windows, but why should I since I can play it for free on Linux!” kind is most likely extremely small – if they exist at all.

No loss is fine, but where’s the actual gain? Well, there’s the “Just bought the Windows version. Besides: I don’t run Windows at all” type of guy. These people alone should suffice to cover the costs of the additional effort to package a Linux release and upload it somewhere. But that’s not the main point at all: Can you say “free advertising”? People talk about the game and people write about the game, many of whom would not have done so if it had just been an ordinary game! With the free Linux release, MuHa managed to make the game stand out (and that is not too easy today).

For these reasons, giving it away proves to be a very sensible PR move! Whether or not that was intended doesn’t matter; it doesn’t change the facts.

Community-assisted model?

So what could the future hold? I can imagine that making the community engage even more would be a big benefit. From a studio’s perspective, fans do unpaid work because they love the product. And from the fan’s perspective it’s just cool to be part of one of your favorite games and help improve it.

What could this look like? My vision is to sort of blend closed source development with what we learned from open source development. It’s cool that people playtesting a game can report bugs via forum or email. But when will the first project set up a public bugtracker along with a tutorial on how to use that for bug reports and maybe (sensible) feature requests?

Then: What about translation? Open source has achieved very, very good results using translation frameworks like Transifex. Right now Thea is only available in English. My native language is German and I would not have minded at all dedicating some time to translating a few strings (I got a nice game for free, after all!). There’s a lot of potential in this.

Along with that, it would totally make sense to avoid using proprietary containers for files. I did not bother to try to extract text out of whatever format it is that MuHa uses for Thea. In 1999, ID Software did a clever thing for Quake III Arena: They used container files called “.pk3” – which were simply renamed, uncompressed Zip files. The benefit is obvious: Everybody can extract the resources, modify them and put things back together. Great! I noticed a lot of spelling mistakes in Thea. If I had had access to the game text, you’d have received a series of patches from me (and by applying them you’d instantly see which ones are still valid, fixing the mistakes along the way). Wouldn’t that be a great way to improve the game?

Licensed Open Source model?

Can open source work for a commercial game? Well, why not? Open source alone means just that: The source is open. It does not say under which license, and it does not say that it’s free. Now, I generally support as much freedom as possible – but that last word there is important. A more open development is a nice improvement IMO. There’s no reason to demand more than that.

In this model, the customers pay for the game data, without which you obviously cannot play the game, but the program source is open (or perhaps semi-open, where it is included with the copy of the game you get when you buy it and you’re free to distribute a series of patches but not the source itself). I’m pretty sure that this can work. One potential problem here may be deadlines. The code in commercial games must often be horrible – not because the programmers suck but because unrealistic deadlines blow. A lot of studios may hesitate to open up their code for that very reason…

Addressing the problem could, however, also be easy: You sell games in early access? Buyers get the code and know that it’s early and may not be in perfect shape (and they can actually help improve it). Again, both sides win: The studio gets code review and maybe some patches, and some people may even attempt to port the game to platforms unsupported by the studio. The players get better games they can help to improve, can take modding to the next level, and even get a chance to see what coding is like – and to gather some reference work if they intend to work in that industry!

There’s one other issue, though. In many cases studios will want to hide some things from competitors. That may be old (and at some point hopefully obsolete) thinking, but we have to accept it as a present fact. So what about this? Well, those things could be put into libraries… It’s far better to have the program code open and make it use closed libraries than to have nothing open at all!

Time for change

Who’s stepping forward to take the next step in game development? I’m really curious whether something in the direction of what I wrote here happens any time in the future. For each step there’s good press to catch for free again, you know? 😉 Perhaps some small studio dares to make the move.

Update: I wrote this in a hurry on 11/30 to rush out my November post. And then I once again forgot to make it public. But now it is…

An interview with the Nanolinux developer

2014 is nearly over and for the last post this year I have something special for you again. Last year I posted an interview with the EDE developer and I thought that another interview would conclude this year of blogging quite fine.

In the previous post I reviewed Nanolinux (and two years ago XFDOS). Since I was in email contact with the author about another project as well, it suggested itself that I ask him whether he’d agree to an interview. He did!

So here’s an interview with Georg Potthast (known for a variety of projects: DOSUSB, Nanolinux and Netrider – to just name some of them) about his projects, the FLTK toolkit, DOS and developing Open Source software in general. Enjoy!

Interview with Georg Potthast

This interview was conducted via email.

Please introduce yourself first: How old are you and where are you from?

I am 61 years old and live in Ahlen, Germany. This is about 30 minutes drive from Dortmund where they used to brew beer and where the BVB Dortmund soccer team is currently struggling.

Do you have any hobbies which have nothing to do with the IT sector?

Not really. I did some genealogy, which has a lot to do with IT these days. But now I have several IT projects I am working on.

DOS

You’re involved in the FreeDOS community and have put a lot of effort into XFDOS. A lot of people shake their heads and mumble something like “It’s 2014 now and not 1994…” – you know the score. What is your motivation to keep DOS alive?

I have been using DOS for a long time and wish it would not go away completely. So I developed these DOS applications, hoping to get more people to use DOS. But I have to agree that I have not been successful with that.

Potential software developers find only very few users for their applications, which is demotivating. Also, there is simply no hardware available today that is so limited that you’d be better off using DOS on it. Everything is 32/64-bit, has at least 4 GB of memory and terabytes of disk space. And even the desktop PC market is suffering from people moving to tablets and smartphones.

People are still buying my DOSUSB driver frequently. They use it mostly for embedded applications that, for one reason or another, are not to be ported to a different operating system.

Do you have any current plans regarding DOS?

I usually port my FLTK applications to DOS if it is not too much effort to do so. So they are available for Linux, Windows and DOS. Such as my FlDev IDE (Link here).

Recently I made a Qemu/FreeDOS bundle named DOS4WIN64 (Link here) that you can run as an application on any Windows 7/8 machine. This includes XFDOS. I see this as a path to running 16-bit applications on 64-bit Windows.

How complicated and time consuming is porting FLTK applications from Linux to DOS or vice versa?

It depends on the size and the dependencies on external libraries. I usually run ./configure on Linux and then copy the makefile to DOS, where I replace -lXlib with -lNXlib plus -lnano-X. Then, provided the required external libraries can be downloaded from the djgpp site, it will compile if the makefile is not too complicated (recursive). Sometimes I also compile needed libraries for DOS, which is usually not difficult if they have a command line interface.

You then have to test whether all the features of the application work on DOS and make some adjustments here and there. Often you can use the Windows branch, if available, for the path definitions.

Porting DOS applications to Linux can be more complicated than vice versa.

Linux

For how long have you been using Linux?

I have been using Linux on and off. I began with SCO-Unix. However, I did not like setting things up with (case-sensitive) configuration files scattered over many directories. It took me over a week to get serial communications working to connect a modem. When I asked Linux developers for help, they recommended recompiling the kernel first – which means they did not know how to do it either. So I returned to DOS at that time. But I have been using Linux a lot for several years now.

What is your distribution of choice and why?

I mainly use SUSE, but I think Ubuntu may work just as well. This may sound dull, but you do not have to spend time adding drivers to the operating system or porting the libraries you need. The mainstream Linux distributions are well tested and documented, and you do not have to spend time tailoring the distro to your needs. They simply do much more than you need, so you are all set to start right away.

My own distro, Nanolinux, is a specialized distro which is meant to show how small a working Linux distro can be. It can be used on a flash disk, as an embedded system, as a download-on-demand system or to quickly switch to Linux from within Windows.

However, if you have a 2 Terabyte hard disk available I would not use Nanolinux as the main distribution.

FLTK

Which programming languages do you prefer?

I like Assembler. To be able to use X11 and FLTK I learned C and C++ which I currently use. I have not done any assembler in a while though.

You seem to like the idea of minimalism. Do you use those minimalist applications on a daily basis or are they more of a nice hobby?

Having a DOS and assembler background, I always try not to use more disk space than necessary. Programming is just my hobby.

Many of your projects use the FLTK toolkit. Why did you choose this one and not e.g. FOX?

I had ported Nano-X to DOS to provide an Xlib alternative for DOS developers. In addition I ported FLTK to DOS as well since FLTK can be used on the basis of Nano-X. So I am now used to FLTK.

Compared to the more common toolkits, FLTK suffers from a lack of applications. Which three FLTK applications that don’t exist (yet) do you miss the most?

I think FLTK is a GUI toolkit for developers, so it is not so important what applications are available based on FLTK.

If you look at my Nanolinux – given I add the NetRider browser and my FlMail program to the distro – it comes with all the main office applications done in FLTK. However, the quality of these applications is not as good as Libre Office, Firefox or Gimp. I do not expect anyone to write Libre Office with a FLTK GUI.

When you awake at night, a strange light surrounds you. The good FOSS fairy floats in the air before you! She can do absolutely everything FOSS related; whether it’s FLTK 3 being completed and released this year, a packaging standard that all Linux distros agree on or something a bit less unlikely. 😉 She grants you three wishes!

As for FLTK 3, I wish it would change its name and development would concentrate on FLTK 1.3.

Regarding the floating fairy, I would wish that the internet were used by nice and friendly people only. Currently I see it endangered by massive spam, viruses, criminals and even cyber war – such as the one North Korea apparently waged over the movie its ruling dictator wanted to stop from being shown.

Back to serious. What do you think: Why is FLTK such a little known toolkit? And what could be done about that?

I do not think it is little known; it’s just that most people use GTK, so that is the “market leader”. If you work in a professional team, it will usually decide to go for GTK since most members will be familiar with that.

What could be done about that? If KDE and Gnome were based on FLTK, I think the situation would change.

From your perspective of a developer: What do you miss about FLTK that the toolkit really should provide?

Frankly speaking, as a DOS developer the alternative would be to write your own GUI. And FLTK provides more features than you could ever develop on your own.

What I do not like is the lack of support for third-party schemes. Dimitrj, a Russian FLTK developer who frequently posts as “kdiman” on the FLTK forums, created a very nice Oxy scheme. But it has not been added to FLTK since the developers do not have the time to test all the changes he made to make FLTK look that good.

What do you think about the unfortunate FLTK 2 and the direction of FLTK 3?

I think these branches have been very unfortunate for FLTK. Many developers expected FLTK 2 to supersede FLTK 1.1 and waited for FLTK 2 to finish before developing an FLTK application. But FLTK 2 never got into a state where it could replace FLTK 1.1. Now the same seems to happen with FLTK 3.

So they should have named FLTK 2/3 the XYZ toolkit and not FLTK 2, to avoid stopping people from choosing FLTK 1.1.

Currently there is no development on FLTK 2/3 that I am aware of and I think the developers should concentrate on one version only. FLTK 1.3 works very well and does all that you need as a software developer as far as I can say.

Somebody with a bit of programming experience and some free time would like to get into FLTK. Any tips for him/her?

I wrote a tutorial which should allow even beginners in C++ programming to use FLTK successfully (Link here).

Nanolinux

You’ve written quite a number of such applications yourself. Which of your projects is the most popular one (in terms of downloads or feedback)?

This is the Nanolinux distro. It has been downloaded 30,000 times this year.

NanoLinux… Can you describe what it is?

Let me cite Distrowatch, I cannot describe it better: Nanolinux is an open-source, free and very lightweight Linux distribution that requires only 14 MB of disk space. It includes tiny versions of the most common desktop applications and several games. It is based on the “MicroCore” edition of the Tiny Core Linux distribution. Nanolinux uses BusyBox, Nano-X instead of X.Org, FLTK 1.3.x as the default GUI toolkit, and the super-lightweight SLWM window manager. The included applications are mainly based on FLTK.

After compiling the XFDOS distro I thought I would gain more users if I ported it to Linux. The size makes Nanolinux quite different from the others, and I got a lot of downloads and reviews for it.

The project is based on TinyCore which makes use of FLTK itself. Is that the reason you chose this distro?

TinyCore was done by the former main developer of Damn Small Linux. So he had a lot of experience and did set up a very stable distro. Since I wanted to make a very small distro this was a good choice to use as a base. And I did not have to start from scratch and test that part of the distro forever.

NanoLinux uses an alternative windowing system. What can you tell us about the differences between NanoX and Xorg’s X11?

Nano-X is simply a tiny Xlib compatible library which has been used in a number of embedded Linux projects. Development started about 15 years ago as far as I recall. At that time many Linux application developers used X11 directly and therefore were willing to use an alternative like nano-X for their projects.

Since nano-X is not fully compatible with X11, a wrapper called NXlib was developed, which provides this compatibility and allows basing FLTK and other X11 applications on nano-X without code changes. The compatibility is not 100%, of course, but it is sufficient for FLTK and many X11 applications.

Since nano-X supported DOS in the early days I took this library and ported the current version to DOS again.

Netrider

The project you are working on currently is NetRider, a browser based on WebKit and FLTK. Please tell us how you came up with the idea for it.

Over the years I looked at other browser applications and thought how I could build my own browser, just out of interest. Finally Laura, another developer from the US, and I discussed it together. She came up with additional ideas and thoughts. That made me have a go at WebKit with FLTK.

What are your aims for NetRider?

I wanted to add a better browser to my Nanolinux distro replacing the Dillo browser. Also, as a FLTK user I wanted to provide a FLTK GUI for the WebKit package as an alternative to GTK and Qt.

There’s also the project Fifth which has quite similar aims at first sight. Why don’t you work together?

Lauri, the author of Fifth, and I started out about the same time with our FLTK browser projects, not knowing of each other’s plans. Now our projects run in parallel. Even though we both use FLTK, the projects are quite different.

We have not discussed working together yet and our objectives are different. He wants to write an Opera compatible browser and competes with the Otter browser while I am satisfied to come up with something better than Dillo.

I did not ask Lauri whether he thinks we should combine the projects. I am also not sure if this would help us both, because we implemented different WebKit APIs for our browsers, so we would have to make a WebKit library featuring two APIs. This could be done, though. Also, he is not interested in supporting Windows, which Laura and I want to support.

Would you say that NetRider is your biggest project so far? And what plans do you have for it?

Setting up Nanolinux and developing/porting all the applications for it was a big project too, and I plan to make a new release at the beginning of next year.

As for NetRider, it depends on whether people like to use it or are interested in developing for or porting it. Depending on the feedback, I will make my plans. Recently I incorporated some of the observations I got from beta testers, added support for additional languages, initial printing support, etc.

The last one is yours: Which question would you have liked me to ask in addition to those and what is the answer to it?

I think you already asked more questions than I would have been able to come up with. Thank you for the interesting questions.

Thanks a lot Georg, for answering these questions! Best wishes for your current and future projects!

What’s next?

I have a few things in mind… But I don’t know yet which one I’ll write about next. A happy new year to all my readers!