Dystopian Open Source

[New to Gemini? Have a look at my Gemini FAQ.]

This article was bi-posted to Gemini and the Web; Gemini version is here: gemini://gemini.circumlunar.space/users/kraileth/neunix/2021/dystopian_open_source.gmi

Happy New Year dear reader! The other day I watched a video on YouTube that had only 6 views since last October. It covers a very important topic, though, and I wish it had a larger impact and got more people alarmed and thinking about the current trends in Open Source. This is not an “OMG we’re all doomed!!1” post, but I want to talk about what I feel are grave dangers that deserve some really serious consideration.

“Pay to Play”

For the readers who would like to watch the video (about 7 minutes), it’s here. Some background info: It’s by Lucas Holt. He is the lead developer of MidnightBSD, a project that began as a fork of FreeBSD 6.1 and aimed for better usability on the desktop. There were a couple of people who contributed to the project over time, but it never really took off. Therefore it has continued as a project almost entirely done by one man.

It’s not hard to imagine just how much work it is to keep an entire operating system going; much larger teams have failed to deliver something useful after all. So while it’s no wonder that MidnightBSD is not in a state where anybody would recommend it for everyday use, I cannot deny that I admire all the work that has been done.

Holt has merged changes back from FreeBSD several times, eventually updating the system to basically 11.4 plus the MidnightBSD additions and changes. He maintains almost 5,000 ports for his platform (not all of them in perfect shape, of course). And he has kept the project going since about 2006 – despite all the taunting and acid-tongued comments on “the most useless OS ever” and things like that. Even though I never found a really serious use for MidnightBSD (and I tried a couple of times!), considering all of that he has earned my deepest respect.

To sum up the video: He talks about a trend in Open Source where some very important projects have started to raise the bar on contributing to them. Sometimes you’re required to employ two full-time (!) developers to even be considered worth hearing. Others require you to provide them with e.g. a paid Amazon EC2 instance to run their CI on. And even where that’s not the case, some decision makers will just turn you down if you dare to hand in patches for a platform that’s not a huge player itself.

Quite a few people do not even try to hide that they only ever care about Linux, and Holt has made the observation that some of the worst-behaving, most arrogant of these are – Red Hat employees. There are people on various developer teams who choose to deliberately ruin things for smaller projects, which is certainly not good and shouldn’t be what Open Source is about.

What does Open Source mean to us?

At a bare minimum, Open Source only means that the source for some application, collection of software or even entire operating system is available to look at. I could write some program, put the code under an extremely restrictive license and still call this thing “Open Source” as long as I make the code available by some means. One could argue that in the truest sense of the two words that make up the term, that would be a valid way to do things. But that’s not what Open Source is or ever was about!

There are various licenses out there that are closely related to Open Source. Taking a closer look at them is one great way to find the very essence of what Open Source actually is. There are two important families of such licenses: The so-called Copyleft licenses and the permissive licenses. One could say that downright religious wars have been waged about which side holds the one real truth…

People who have been reading my blog for a while know that I do have a preference and have made it quite clear which camp I belong to, even though I reject the insane hostility that some zealots preach. But while the long-standing… err… let’s say: controversy, is an important part of Open Source culture, the details are less relevant to our topic here. They basically disagree on the question of what requirements to put in the license. Should there be any at all? Is it sufficient to ask for giving credit to the original authors? Or should users be forced to keep the source open, for example?

Both license families however do not dispute the fundamental rights given to users: They want you to be able to study the code, to build it yourself, to make changes and to put the resulting programs to good use. While it’s usually not explicit, the very idea behind all of Open Source is to allow for collaboration.

Forkability of Open Source projects

Over the years we’ve seen a lot of uproar in the community when the leaders of some project made decisions that go against these core values of Open Source. While some even committed the ultimate sin of closing down formerly open code, most of the time it’s been slightly less harsh. Still we have seen XFree86 basically falling into oblivion after Xorg was forked from it. The reason this happened was a license change: One individual felt that it was time for a little bit of extra fame – and eventually he ended up blowing his work to pieces. Other examples are pfSense and OPNsense, Owncloud and Nextcloud or Bacula and Bareos. When greed strikes, some previously sane people begin to think that it’s a good idea to implement restrictions, rip off the community and go “premium”.

One of the great virtues of Open Source is that a continuation of the software in the old way of the project is possible. With OPNsense we still have a great, permissively licensed firewall OS based on FreeBSD and Pf despite NetGate’s efforts to mess with pfSense. Bareos still has the features that Bacula cut out (!) of the Open Source version and moved to the commercial one. And so on. The very nature of Open Source also allows for people to pick up and continue some software when the original project shuts down for whatever reason.

There are a lot of benefits to Open Source over Closed Source models. But is it really immune to each and every attack you can aim at it?

Three dangers to Open Source!

There is always the pretty obvious danger of closing down source code if the license does not prohibit that. Though I make the claim that this is in fact mostly a non-issue. There are a lot of voices out there going hysteric about this. But despite how they try to make things look, it is impossible to close down source code that is under an Open Source license! A project can stop releasing the source for newer versions, effectively ceasing to distribute current code. But then the Open Source community can always stop using that stuff and continue on with a fork that stays open.

But we haven’t talked about three other imminent dangers: narrow-mindedness, non-portability and leadership driven by monetary interest.

Narrow-mindedness

One could say that today Open Source is a victim of its overwhelming success. A lot of companies and individual developers jumped on the bandwagon because it’s very beneficial for them. “Let’s put the source on GitHub and people might report issues or even open pull-requests, actively improving our code – all for free!” While this is a pretty smart thing to do from a commercial point of view, in this case the code was not opened up because somebody really believes in the ideas of Open Source. It was merely done to benefit from some of the most obvious advantages.

Depending on how far-sighted such an actor is, he might understand the indirect advantages to the project when keeping things as open as possible – or maybe not. For example a developer might decide that he’ll only ever use Ubuntu. Somebody reports a problem with Arch Linux: Close (“not supported!”). Another person opens a PR adding NetBSD support: Close (“Get lost, freak!”).

Such behavior is about as stupid and, when it comes to the values, as anti-Open Source as it gets. Witnessing something like this makes people who actually care about Open Source cringe. How can anybody be too blind to see that they are hurting themselves in the long run? But it happens time and time again. By turning down the Arch guy, the project has probably lost a future contributor – and maybe the issue reported was due to incompatibilities with the newer GCC in Arch that will eventually land in Ubuntu, too, and could have been fixed ahead of time…

Open Source is about being open-minded. Just publishing the source and fishing for free contributions while living the ways of a closed-source spirit is in fact a real threat to Open Source. I wish more people would just say no to projects that regularly say “no” to others (without a good reason). It’s perfectly fine that some project cannot guarantee that their software will even compile on illumos all the time. But the illumos people will take care of that and probably submit patches if needed. Refusing to even talk about possible support for that platform, however, is very bad style and does not fit well with the ideals of Open Source.

If I witness an arrogant developer insulting, say, a Haiku person, I’ll go looking for more welcoming alternatives (and am perfectly willing to accept something that is technically less ideal for now). Not because I’ve ever used Haiku or plan to do so. But simply because I believe in Open Source and in fact have a heart for the cool smaller projects that are doing interesting things aside from the often somewhat boring mainstream.

Non-portability

Somewhat related to the point above is (deliberate) non-portability. A great example of this is Systemd. Yes, there have been many, many hateful comments about it and there are people who have stated that they really hope the main developer will keep the promise to never make it portable “so that *BSD is never going to be infected”.

But whatever your stance on this particular case is – there is an important fact: As soon as any such non-portable Open Source project gains a certain popularity, it will begin to poison other projects, too. Some developers will add dependencies to such non-portable software and thus make their own software unusable on other platforms even though that very software alone would work perfectly fine! Sometimes this happens because developers make the false assumption that “everybody uses Systemd today, anyway”, sometimes because they use it themselves and don’t realize the implication of making it a mandatory requirement.

If this happens to a project that basically has three users world-wide, it’s a pity but does not have a major impact. If it’s a software however that is a critical component in various downstream projects, it can potentially affect millions of users. The right thing here is not to break solidarity with other platforms. Even if the primary platform for your project is Linux, never ever go as far as adding a hard dependency on Systemd and other such software! If you can, it’s much better to make support optional so that people who want to use it benefit from existing support. But don’t ruin the day for everybody else!

And think again about the example NetBSD pull-request mentioned above: Assume that the developer had shown less hostility and accepted the PR (with no promises to ever test if things actually work properly or at all). The software would have landed in Pkgsrc and somebody else would soon have hit a problem due to a corner case on NetBSD/SPARC64. A closer inspection of that would have revealed a serious bug that had remained undetected and unfixed. After a new feature was added not much later, the bug became exploitable. Eventually the project gained a “nice” new CVE of severity 9.2 – which could well have been avoided in an alternate reality where the project leader had had a more friendly and open-minded personality…

Taking portability very seriously is exceptionally hard work. But remember: Nobody is asking you to support all the hardware you probably don’t even have or all the operating systems that you don’t know your way around on. But just be open to enthusiasts who care for such platforms and let them at least contribute.

Leadership with commercial interests

This one is a no-brainer – but unfortunately one that we can see happening more and more often. Over the last few years people started to complain about e.g. Linux being “hi-jacked by corporations”. And there is some truth to it: There is a lot of paid work being done on various Open Source projects. Some of the companies that pay developers do so because they have an interest in improving Open Source software they use. A couple even fund such projects because they feel giving back something after receiving for free is the right thing to do. But then there’s the other type, too: Corporations that have their very own agenda and leverage the fact that decision makers on some projects are their employees to influence development.

Be it the person responsible for a certain kernel subsystem turning down good patches that would benefit a lot of people for seemingly no good reason – but in fact because they were handed in by a competitor, while his employer is secretly working on something similar and has an interest in getting that one in instead. Be it because the employer thinks that the developer is not paid to do anything for platforms that are of no interest to its own commercial plans and is expected to simply turn those down to “save time” for “important work”. Things like that actually happen and have been happening for a while now.

Limiting the influence of commercial companies is a topic of its own. IMO more projects should think much more deeply about governance models and consider what can happen if a malicious actor buys in.

Towards a more far-sighted, “vrij” Open Source?

As noted above, I feel that some actors in Open Source are focused only on their own use-case and are completely ignorant of what other people might be interested in. But as this post’s topic was a very negative one, I’d like to end it more positively. Despite the relatively rare but very unfortunate misbehavior of some representatives of important projects, the overwhelming majority of people in Open Source are happy to allow contributions from more “exotic” projects.

But what’s that funny looking word doing there in the heading? Let me explain. We already have FOSS, an acronym for “Free and Open Source Software”. There’s a group of people arguing that we should rather focus on what they call FLOSS, “Free and Libre Open Source Software”. The “libre” in there is meant to put focus on some copyleft ideas of freedom – “free” was already taken and has the problem that the English word doesn’t distinguish between free “as in freedom” and free of charge. I feel that a term that emphasizes the community aspect of Open Source, the invitation to just about anybody to collaborate and Open Source solidarity with systems other than what I use, could be helpful. How about VOSS? I think it’s better than fitting in another letter there.

Vrij is the Dutch word for free. Why Dutch? For one part to honor the work that has been done at the Vrije Universiteit of Amsterdam (for readers who noticed the additional “e”: That’s due to inflection). Just think of the nowadays often overlooked work of Professor Tanenbaum e.g. with Minix (which inspired Linux among other things). The other thing is that it’s relatively easy to pronounce for people who speak English. It’s not completely similar but relatively close to the English “fray”. And if you’re looking for the noun, there’s both vrijheid and vrijdom. I think the latter is less common, but again: It’s much closer to English “freedom” and thus probably much more practical.

So… I really care for vrij(e) Open Source Software! Do you?

Writing a daemon using FreeBSD and Python pt.3

Part 1 of this series covered Python fundamentals, signal handling and logging. We wrote an init script as well as a program that can be daemonized by daemon(8).

In the previous part we modified the program as well as the init script so that it can daemonize itself using the Python daemon module. I also covered a few topics that people totally new to programming (or Python) might want to know to better understand what’s happening.

Part 3 is about exploring a simple means of IPC (inter-process communication) by using named pipes.

Creating a named pipe

What is a named pipe – also known as a fifo (first in, first out)? It is a way of connecting two processes together, where one can sequentially send data and the other receives it in exactly the same order. It’s basically what we Unix lovers use on our command lines all the time when we pipe the output of one program into another. E.g.:

ls | wc -l

In this case the output of ls is piped to wc, which will then print the number of lines to stdout (which could be used as input for another program with another pipe). This kind of pipe between two programs is usually short-lived. When the first program is done sending output and the second one has received all the data, the pipe goes away with the two processes. It also only exists between the two processes involved.

A named pipe in contrast is a bit more permanent and more flexible. It has a representation in the filesystem (which is why it’s a named pipe). One program creates a named pipe (usually in /var/run) and attaches to the receiving end of the pipe. Another process can then attach to the sending end and start putting data into it, which will then be received by the former. Named pipes have their own file type character (p), showing in ls -l that a file is a named pipe:

prw-rw-r--

Here’s what the next version of the code looks like:

#!/usr/local/bin/python3.6
 
 # Imports #
import daemon, daemon.pidfile, logging, os, signal, time
 
 # Globals #
IN_PIPE = '/var/run/bd_in.pipe'
 
 # Functions #
def handler_sigterm(signum, frame):
    logging.debug("Caught SIGTERM! Cleaning up...")
    if os.path.exists(IN_PIPE):
        try:
            os.unlink(IN_PIPE)
        except:
            raise
    logging.info("All done, terminating now.")
    exit(0)

def start_logging():
    try:
        logging.basicConfig(filename='/var/log/bdaemon.log', format='%(levelname)s:%(message)s', level=logging.DEBUG)
    except:
        print("Error: Could not create log file! Exiting...")
        exit(1)

def assert_no_pipe_exists():
    if os.path.exists(IN_PIPE):
        logging.critical("Cannot start: Pipe file \"" + IN_PIPE + "\" already exists!")
        exit(1)

def make_pipe():
    try:
        os.mkfifo(IN_PIPE)
    except:
        logging.critical("Cannot start: Creating pipe file \"" + IN_PIPE + "\" failed!")
        exit(1)
    logging.debug("Created pipe \"" + IN_PIPE)

 # Main #
with daemon.DaemonContext(pidfile=daemon.pidfile.TimeoutPIDLockFile("/var/run/bdaemon.pid"), umask=0o002):
    signal.signal(signal.SIGTERM, handler_sigterm)
    start_logging()
    assert_no_pipe_exists()
    make_pipe()

    logging.info("Baby Daemon started up and ready!")
    while True:
        time.sleep(1)

We’re using a new import here: os. It gives the programmer access to various OS-dependent functions (like pipes which are not existent on Windows for example). I’ve also added a global definition for the location of the named pipe.

The next thing that you’ll notice is that the signal handler function got some new code. Before the daemon terminates it tries to clean up. If the named pipe exists the program will attempt to delete it. I’m not handling what could possibly go wrong here as this is just an example. That’s why in this case I just re-raise the exception and let the program error out.

Then we have a new “start_logging()” function that I put the logging stuff into to unclutter main. Except for that changed structure, there’s really nothing new here.

The next new function, “assert_no_pipe_exists()” should be fairly easy to read: It checks if a file by the name it wants to use is already present in the filesystem (be it as a leftover from an unclean exit or by chance from some other program). If it is found, the daemon aborts because it cannot really continue. If the filename is not taken, however, “make_pipe()” will attempt to create the named pipe.

The other thing that I did was moving the main part back from being a function directly to the program. And since we’re doing small incremental steps, that’s it for today’s step 1. Fire up the daemon using the init script and you should see that the named pipe was created in /var/run. Stop the process and the pipe should be gone.

Using the named pipe

Creating and removing the named pipe is a good first step, but now let’s use it! To do so we must first modify the daemon to attach to the receiving end of the pipe:

#!/usr/local/bin/python3.6
 
 # Imports #
import daemon, daemon.pidfile, errno, logging, os, signal, time
 
 # Globals #
IN_PIPE = '/var/run/bd_in.pipe'
 
 # Functions #
def handler_sigterm(signum, frame):
    try:
        os.close(inpipe)
    except:
        pass

    logging.debug("Caught SIGTERM! Cleaning up...")
    if os.path.exists(IN_PIPE):
        try:
            os.unlink(IN_PIPE)
        except:
            raise
    logging.info("All done, terminating now.")
    exit(0)

def start_logging():
    try:
        logging.basicConfig(filename='/var/log/bdaemon.log', format='%(levelname)s:%(message)s', level=logging.DEBUG)
    except:
        print("Error: Could not create log file! Exiting...")
        exit(1)

def assert_no_pipe_exists():
    if os.path.exists(IN_PIPE):
        logging.critical("Cannot start: Pipe file \"" + IN_PIPE + "\" already exists!")
        exit(1)

def make_pipe():
    try:
        os.mkfifo(IN_PIPE)
    except:
        logging.critical("Cannot start: Creating pipe file \"" + IN_PIPE + "\" failed!")
        exit(1)
    logging.debug("Created pipe \"" + IN_PIPE)

def read_from_pipe():
    try:
        buffer = os.read(inpipe, 255)
    except OSError as err:
        if err.errno == errno.EAGAIN or err.errno == errno.EWOULDBLOCK:
            buffer = None
        else:
            raise
 
    if buffer is None or len(buffer) == 0:
        logging.debug("Inpipe not ready.")
    else:
        logging.debug("Got data from the pipe: " + buffer.decode())
    
 # Main #
with daemon.DaemonContext(pidfile=daemon.pidfile.TimeoutPIDLockFile("/var/run/bdaemon.pid"), umask=0o002):
    signal.signal(signal.SIGTERM, handler_sigterm)
    start_logging()
    assert_no_pipe_exists()
    make_pipe()
    inpipe = os.open(IN_PIPE, os.O_RDONLY | os.O_NONBLOCK)
    logging.info("Baby Daemon started up and ready!")

    while True:
        time.sleep(5)
        read_from_pipe()

Apart from one more import, errno, we have three important changes here: First, the cleanup has been extended. Second, there is a new function called “read_from_pipe()”. And third, main has been modified as well. We’ll take a look at the latter first.

There’s a ton of examples on named pipes on the net, but they usually use just one program that forks off a child process and then communicates over the pipe with that. That’s pretty simple to do and works nicely by just copying and pasting the example code in a file. But adapting it for our little daemon does not work: The daemon just seems to “hang” after first trying to read something from the pipe. What’s happening there?

By default, reads from the pipe are in blocking mode, which means that on the attempt to read, the system just waits for data if there is none! The solution is to use non-blocking mode, which however means using the raw os.open function (which supports flags to be passed to the operating system) instead of the nice Python open function with its convenient file object.

So what does the line starting with “inpipe” do? It calls the function os.open and tells it to open IN_PIPE where we defined the location of our pipe. Then it gives the flags, so that the operating system knows how to open the file, in this case in read-only and in non-blocking mode. We need to open it read-only, because the daemon should be at the receiving side of the pipe. And, yes, we want non-blocking, so that the program continues on if there is no data in the pipe without waiting for it all the time!

What might look a little strange to you, is the | character between the two flags. Especially since on the terminal it’s known as the pipe character and we’re talking about pipes here, right? In this case it’s something completely unrelated however. That symbol just happens to be Python’s choice for representing the bit-wise OR operator. Let’s leave it at that (I’ll explain a bit more of it in a future “Python pieces” section, but this article will be long enough without it).
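
Just to give a rough idea of what that operator does (the exact numbers printed are platform-dependent and don’t matter here, this is only an illustration):

import os

# Each flag is a distinct bit in an integer; the | operator simply combines those bits.
print(bin(os.O_WRONLY))               # one bit set
print(bin(os.O_CREAT))                # a different bit set
print(bin(os.O_WRONLY | os.O_CREAT))  # both bits set in a single flags value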

However that’s still not all that the line we’re just discussing does. The os.open() function returns a file descriptor that we’re then assigning to the inpipe variable to keep it around.

What’s left is a new infinite loop that calls read_from_pipe() every 5 seconds.

Speaking of that function, let’s take a closer look at what it does. It tries to use the os.read function to read up to 255 bytes from the pipe into the variable named buffer. We’re doing so in a try/except block, because the read is somewhat likely to fail (e.g. if the pipe is empty). When there’s an exception, the code checks for the exact error that happened and if it’s EAGAIN or EWOULDBLOCK, we deliberately set the buffer to None. If some other error occurred, it’s something that we didn’t expect, so we’d better take the straight way out by re-raising the exception and crashing the program.

On FreeBSD the error numbers are defined in /usr/include/errno.h. If you take a look at it, you see that EAGAIN and EWOULDBLOCK are the same thing, so checking for one of them would be enough. But it makes sense to know that on some systems these are separate errors and that it’s good practice to check for both.
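
You can verify that quickly from an interactive Python session; on FreeBSD both names should print the same number:

import errno

# On FreeBSD, EWOULDBLOCK is simply defined as EAGAIN, so this prints one value twice.
print(errno.EAGAIN, errno.EWOULDBLOCK)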

If the buffer either has the None value or has a length of 0, we assume that the read failed. Otherwise we put the data into the log. To make it readable we have to use decode, because we will be receiving encoded data.

All that’s left is the cleanup function. I’ve added another try/except block that simply tries to close the pipe file before trying to delete it. This is example code, so to make things not even more complex, I just silently ignore if the attempt fails.

Control script

Ok, great! That was quite a bit of things to cover, but now we have a daemon that creates a pipe and tries to read data from it. There’s just one problem: How can we test it? By creating another, separate program, that puts data in the pipe of course! For that let’s create another file with the name bdaemonctl.py:

#!/usr/local/bin/python3.6

 # Imports #
import os, time

 # Globals #
OUT_PIPE = '/var/run/bd_in.pipe'

 # Main #
try:
    outpipe = os.open(OUT_PIPE, os.O_WRONLY)
except:
    raise

for i in range(0, 21):
    print(i)
    try:
        os.write(outpipe, bytes(str(i).encode('utf-8')))
    except BrokenPipeError:
        print("Pipe has disappeared, exiting!")
        os.close(outpipe)
        exit(1)
    time.sleep(3)
os.close(outpipe)

Fortunately this one is fairly simple. We do our imports and define a variable for the pipe. We could skip the latter, because we’re using it on only one occasion but in general it’s a good idea to keep it as it is. Why? Because hiding things deep in the code may not be such a smart move. Defining things like this at the top of the file increases the maintainability of your code a lot. And since we want to send data this time, of course we name our variable OUT_PIPE appropriately.

In the main section we just try to open the pipe file and crash if that doesn’t work. It’s pretty obvious that such a case (e.g. the pipe is not there because the daemon is not running) should be better handled. But I wanted to keep things simple here because it’s just an example after all.

Then we have a loop that counts from 0 to 20, outputs the current number to stdout and tries to also send the data down the pipe. If that works, the program waits three seconds and then continues the loop.

To be able to write to the pipe we need a byte stream but we only have numbers. We first convert them to a string and use a proper encoding (utf8) and then convert them to bytes that can be sent over the pipe.
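
Spelled out for a single value (say a counter of 7), the conversion chain looks like this:

i = 7
payload = bytes(str(i).encode('utf-8'))  # int -> str ('7') -> UTF-8 encoded bytes (b'7')
print(payload)                           # b'7' is something os.write() can send down the pipe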

When the loop is over, we close the pipe file properly because we as the sender are done with it. I added a little bit of code to handle the case when the daemon exits while the control script runs and still tries to send data over the pipe. This results in a “broken pipe” error. If that happens, we just print an error message, close the file (to not leak the file descriptor) and exit with an error code of 1.

So for today we’re done! We can now send data from a control program to the daemon and thus have achieved uni-directional communication between two processes.

What’s next?

I’ll take a break from these programming-related posts and write about something else next.

However I plan to continue with a 4th part later which will cover argument parsing. With that we could e.g. modify our control program to send arbitrary data to the daemon from the command line – which would of course be much more useful than the simple test case that we have right now.

Writing a daemon using FreeBSD and Python pt.2

The previous part of this series left off with a running “baby daemon” example. It covered Python fundamentals, signal handling, logging as well as an init script to start the daemon.

Daemonization with Python

The outcome of part 1 was a program that needed external help to actually be daemonized. I used FreeBSD’s handy daemon(8) utility to put the program into the background, to handle the pidfile, etc. Now we’re taking one step forward and trying to achieve the same thing using just Python.

To do that, we need a module that is not part of Python’s standard library. So you might need to first install the package py36-daemon if you don’t already have it on your system. Here’s a small piece of code for you – but don’t get fooled by the line count, there’s actually a lot going on there (and a lot of concepts to grasp):

#!/usr/local/bin/python3.6
 
 # Imports #
import daemon, daemon.pidfile
import logging
import signal
import time

 # Functions #
def handler_sigterm(signum, frame):
    logging.debug("Exiting on SIGTERM")
    exit(0)

def main_program():
    signal.signal(signal.SIGTERM, handler_sigterm)
    try:
        logging.basicConfig(filename='/var/log/bdaemon.log', format='%(levelname)s:%(message)s', level=logging.DEBUG)
    except:
        print("Error: Could not create log file! Exiting...")
        exit(1)
 
    logging.info("Started!")
    while True:
        time.sleep(1)

 # Main #
with daemon.DaemonContext(pidfile=daemon.pidfile.TimeoutPIDLockFile("/var/run/bdaemon.pid"), umask=0o002):
    main_program()

I dropped some ballast from the previous version; e.g. overriding SIGINT was a nice thing to try out once, but it’s not useful as we move on. Also that countdown is gone. Now the daemon continues running until it’s signaled to terminate (thanks to what is called an “infinite loop”).

We have two new imports here that we need for the daemonization. As you can see, it is possible to import multiple modules in one line. For readability reasons I wouldn’t recommend it in general. I only do it when I import multiple modules that kind of belong together anyway. However in the coming examples I might just put everything together to save some lines.

The first more interesting thing with this version is that the main program was moved to a function called “main_program”. We could have done that before if we really wanted to, but I did it now so the code doesn’t take attention away from the primary beast of this example. Take a look at the line that starts with the with keyword. Now that’s a mouthful, isn’t it? Let’s break this one up into a couple of pieces so that it’s easier to chew, shall we?

The value for umask is looking a bit strange. It contains an “o” among the numbers, so it has to be a string, doesn’t it? But why is it written without quotes then? Well, it is a number. Python uses the “0o” prefix to denote octal (the base-8 numbering system) numbers and 0x would mean hexadecimal (base-16) ones.
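
You can convince yourself with a couple of print statements:

# 0o marks an octal literal, 0x a hexadecimal one – both are ordinary integers.
print(0o002)   # 2
print(0o755)   # 493
print(0x10)    # 16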

Remember that we talked about try/except before (for the logging)? You can expand on that. A try block can not only have except blocks, it can also have a finally block. Statements in such a block are meant to be executed no matter the outcome of the try block. The classical example is that when you open a file, you definitely want to close it again (everything else is a total mess and would make your program an exceptionally bad one).

Closing it when you are done is simple. But what if an exception is raised? Then the code path that properly closes the file might never be reached! You could close the file in every thinkable scenario – but that would be both tedious and error-prone. For that reason there’s another way to handle those cases: Close the file in the finally block and you can be sure that it will be closed regardless of what happens in the try or in any except block.

Ok, but what does this have to do with our little daemon? Actually a lot. That case of try/finally has been so common that Python provides a shortcut with so-called context managers. They are objects that manage a resource for you like this: You request it, it is valid only inside one block (the with one!) and when the block ends, the context manager takes care of properly cleaning up for you without having you add any extra code (or even without you knowing, if you just copy/paste code from the net without reading explanations like this).
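
Here’s a tiny sketch (not part of the daemon, the file name is just an example) that shows the manual try/finally pattern next to the equivalent with version:

# Manual cleanup: the finally block closes the file no matter what happens in try.
f = open("/tmp/example.txt", "w")
try:
    f.write("hello\n")
finally:
    f.close()

# The same thing with a context manager: the file object returned by open()
# cleans up after itself when the with block ends – even if an exception occurs.
with open("/tmp/example.txt", "w") as f:
    f.write("hello\n")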

So the with statement in our code above lets Python handle the daemonization process while the main_program function is running. When it ends on the signal, Python cleans up everything and the process terminates – which is great for us. Accept that for now and live with the fact that you might not know just how it does that. We’ll come back to things like that.

Updated init script

Ok, the one thing left to do here is making the required changes to the init script. We are no longer using the daemon(8) utility, so we need to adjust it. Here is the new one:

#!/bin/sh

. /etc/rc.subr

name=bdaemon
rcvar=bdaemon_enable

command="/root/bdaemon.py"
command_interpreter=/usr/local/bin/python3.6
pidfile="/var/run/${name}.pid"

load_rc_config $name
run_rc_command "$1"

Not too much changed here, but let’s still go over what has. The command definition is pretty obvious: The program can now daemonize itself, so we call it directly. It doesn’t take any arguments, which means we can drop command_args.

However we need to add command_interpreter instead (one important thing that I had overlooked at first), because the program will look like this in the process list:

/usr/local/bin/python3.6 /root/bdaemon.py

Without defining the interpreter, the init system would not recognize this process as being the correct one. Then we also need to point it to the pidfile, because in theory there could be multiple processes that match otherwise.

And that’s it! Now we have a daemon process running on FreeBSD, written in pure Python.

Python pieces

This next part is a completely optional excursion for people who are pretty new to programming. We’ll take a step back and discuss concepts like functions and arguments, modules, as well as namespaces. This should help you better understand what’s happening here, if you like to know more. Feel free to save some time and skip the excursion if you are familiar with those things.

Functions and arguments

As you’ve seen, functions are defined in Python by using the def keyword, the function name and – at the very least – an empty pair of parentheses. Inside the parentheses you could put one or more arguments if needed:

def greet(name):
    print("Hi, " + name + "!")

greet("Alice")
greet("Bob")

Here we’re passing a string to the function that it uses to greet that person. We can add a second argument like this:

def greet(name, phrase):
    print("Hi, " + name + "! " + phrase)

greet("Alice", "Great to see you again!")
greet("Bob", "How are you doing?")

The arguments used here are called positional arguments, because their position decides what goes where. Swap them when calling the function and the output will obviously be garbage, as the strings are assigned to the wrong function variable. However it’s also possible to refer to the variables by name, so that the order no longer matters:

def greet(name, phrase):
    print("Hi, " + name + "! " + phrase)

greet(phrase="Great to see you again!", name="Alice")
greet("Bob", "How are you doing?")

This is what is used to assign the values for the daemon context. Technically it’s possible to mix the ways of calling (as done here), but that’s a bit ugly.

We’re not using it, yet, but it’s good to know that it exists: There are also default values. Those mean that you can leave out some arguments when calling a function – if you are ok with the default value.

def greet(name, phrase = "Pleased to meet you."):
    print("Hi, " + name + "! " + phrase)

greet(phrase="Great to see you again!", name="Alice")
greet("Bob", "How are you doing?")
greet("Carol")

And then there’s something known as function overloading in some other languages: multiple functions with the same name but a different number of arguments, so that it’s still possible to precisely identify which one needs to be called. Python doesn’t do that – a later def with the same name simply replaces the earlier one – but default values like the ones shown above cover many of the same use cases.

Modules

When reading about Python it usually won’t take too long before you come across the word module. But what’s a module? Luckily that’s rather easy to explain: It’s a file with the .py extension and with Python code in it. So if you’ve been following this daemon tutorial, you’ve been creating Python modules all the way!

Usually modules are what you might want to refer as to libraries in other languages. You can import them and they provide you with additional functions. You can either use modules that come with Python by default (that collection of modules is known as the standard library, so don’t get confused by the terminology there), additional third-party modules (there are probably millions) or modules that you wrote yourself.

It’s fairly easy to do the latter. Let’s pick up the previous example and put the following into a file called “greeter.py”:

forgot_name = "Sorry, what was your name again?"

def greet(name, phrase = "Pleased to meet you."):
    print("Hi, " + name + "! " + phrase)

Now you can do this in another Python program:

import greeter

greeter.greet("Carol")
print(greeter.forgot_name)

This shows that after importing we can use the “greet()” function in this program, even though it’s defined elsewhere. We can also access variables used in the imported module (greeter.forgot_name in this case).

Namespaces

Ever wondered what that dot means (when it’s not used in a filename)? You can think of it as a hierarchical separator. The standard Python functions (e.g. print) are available in the global namespace and can thus be used directly. Others are in a different namespace and to use them, it’s necessary to refer to that namespace as well as the function name so that Python understands what you want and finds the function. One example that we’ve used is time.sleep().

Where does this additional namespace come from? Well, remember that we did import time at the top of the program? That created the “time” namespace (and made the functions from the time module available there).

There’s another way of importing; we could import either everything (using an asterisk (*) character, but that’s considered poor coding) or just specific functions from one module into the global namespace:

from time import sleep
sleep(2)
exit(0)

This code will work because the “from MODULE import FUNCTION” statement in this example imported the sleep function so that it becomes available in the global namespace.

So why do we go through all the hassle of having multiple namespaces in the first place? Can’t we just put everything in the global one? Sure, we could – and for more simple programs that’s in fact an option. But consider the following case: Python provides the built-in open function. It’s used to open a file and get a nice object back that makes accessing or manipulating data really easy. But then there’s also os.open, which is not as friendly, but lets you use more advanced things since it uses the raw operating system functionality. See the problem?

If you import the functions from os into the global namespace, you have a name clash in the case of open. This is not an error, mind you. You can actually do that, but you should know what happens. The function imported later will override the one that went by that name previously, effectively making the original one inaccessible. This is called “shadowing” of the original function.
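
Here’s a small sketch of that shadowing effect (reading /etc/passwd simply because it exists on any Unix-like system):

from os import open, close, O_RDONLY   # 'open' now shadows the built-in open

fd = open("/etc/passwd", O_RDONLY)     # this is os.open – it returns a raw file descriptor
# open("/etc/passwd", "r")             # would now fail: os.open expects integer flags, not "r"
close(fd)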

To avoid problems like this it’s often better to have your own separate namespace where you can be sure that no clashes happen.

What’s next?

In the next part we’ll take a look at implementing IPC (inter-process communication) using named pipes (a.k.a “fifos”).

Writing a daemon using FreeBSD and Python pt.1

Being a sysadmin by profession, I don’t code. At least not often enough or with output of high enough quality that programmers would accept to call it coding. I do write and maintain shell scripts. I also write new formulas for configuration management with SaltStack.

The latter is Python-based and after hearing mostly good things about that language, I’ve been trying to do some simple things with it for a while now. And guess what: It’s just so much more convenient compared to using shell code! I’ll definitely keep doing some simple tasks in Python, just to get some experience with it.

Not too long ago I thought about a little project that I’d try to do and decided to go with Python again. Thinking about what the program should do, I figured that a daemon would make a nice fit for it. But how do you write a daemon? Fortunately it’s especially easy on FreeBSD. So let’s go!

Python

The first thing that I did, was to create a new file called bdaemon.py (for “baby daemon”) and use chmod to make it executable. And here’s what I put into it as a first test:

#!/usr/local/bin/python3.6

 # Imports #
import time

 # Globals #
TTL_SECONDS = 30
TTL_CHECK_INTERVAL = 5

 # Functions #

 # Main #
print("Started!")
for i in range(1, TTL_SECONDS + 1):
    time.sleep(1)
    if i % TTL_CHECK_INTERVAL == 0:
        print("Running for " + str(i) + " seconds...")
print("TTL reached, terminating!")
exit(0)

This very simple program has the shebang line that points the operating system to the right interpreter. Then I import Python’s time module which gives me access to a lot of time-related functions. Next I define two global variables that control how long the program runs and in which interval it will give output.

The main part of the program first outputs a starting message on the terminal. It then enters a for loop, that counts from 1 to 30. In Python you do this by providing a list of values after the in keyword. Counting to 5 could have been written as for i in [1, 2, 3, 4, 5]: for example.

With range we can have Python create a list of sequential numeric values on the fly – and since it’s much less to type (and allows for dynamic list creation by setting the final number via a variable), I chose to go with that. Oh, BTW: In Python the last value of those ranges is exclusive, not inclusive. This means that range(1, 5) leads to [1, 2, 3, 4] – if you want the 5 included in the list, you have to use range(1, 6)! That’s why I add 1 to the TTL_SECONDS variable.
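
You can check that exclusive behavior directly:

print(list(range(1, 5)))   # [1, 2, 3, 4] – the end value is not included
print(list(range(1, 6)))   # [1, 2, 3, 4, 5]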

I use time.sleep to create a delay in the loop block. Then I do a check if the remainder of the division of the current running time by the defined check interval is zero (% is the modulus operator which gives that remainder value of the division). If it is, the program creates more output.
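
With the values from this program, the check works out like this:

TTL_CHECK_INTERVAL = 5
print(10 % TTL_CHECK_INTERVAL)   # 0 – a multiple of 5, so that second produces output
print(12 % TTL_CHECK_INTERVAL)   # 2 – not a multiple, so that iteration stays silent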

Mind the indentation: In Python it is used to create code blocks. The for statement is not indented, but it ends with a colon. That means that it’s starting a code block. Everything up to (but not including) the second to last print statement is indented by four spaces and thus part of the code block. Said print statement is indented two levels (8 spaces) – that’s because it’s another block of its own started by the if statement before it. We could create a third, fourth and so on level of indentation if we required other blocks beneath the if block.

Eventually the program will print that the TTL has been reached and exit with a return code of 0 (which means that there was no error).

Have you noticed the str(i) part in one of the print statements? That is required because the counter variable “i” holds numeric values and we’re printing data of a different type. So to be able to concatenate (that’s what the plus sign is doing in this case!) the variable’s contents to the rest of the data, it needs to match its type. We’re achieving this by doing a conversion to a string (think converting the number 5 to the literal “5” that can be part of a line of text where it looks similar but is actually a different thing).

Oh, and the pound signs are used to start comments that are ignored by Python. And that’s already it for some fundamental Python basics. Hopefully enough to understand this little example code (if not, tell me!).

Signals

The next thing to explore is signal handling. Since a daemon is essentially a program running in the background, we need a way to tell it to quit for example. This is usually done by using signals. You can send some of them to normal programs running in the terminal by hitting key combinations, while all of them can be sent by the kill command.

If you press CTRL-C for example, you’re sending SIGINT to the currently running application, telling it “abort operation”. A somewhat similar one is SIGTERM, which kind of means “hey, please quit”. It’s a graceful shutdown signal, allowing the program to e.g. do some cleanup and then shut down properly.

If you use kill -9, however, you’re sending SIGKILL, the ungraceful shutdown signal, that effectively means “die!” for the process targeted (if you’ve ever done that to a live database or another touchy application, you know that you really have to think before using it – or you might be in for all kinds of pain for the next few hours).

#!/usr/local/bin/python3.6

 # Imports #
import signal
import time

 # Globals #
TTL_SECONDS = 30
TTL_CHECK_INTERVAL = 5

 # Functions #
def signal_handler(signum, frame):
    print("Received signal" + str(signum) + "!")
    if signum == 2:
        exit(0)

 # Main #
signal.signal(signal.SIGHUP, signal_handler)
signal.signal(signal.SIGINT, signal_handler)
signal.signal(signal.SIGQUIT, signal_handler)
signal.signal(signal.SIGILL, signal_handler)
signal.signal(signal.SIGTRAP, signal_handler)
signal.signal(signal.SIGABRT, signal_handler)
signal.signal(signal.SIGEMT, signal_handler)
#signal.signal(signal.SIGKILL, signal_handler)
signal.signal(signal.SIGSEGV, signal_handler)
signal.signal(signal.SIGSYS, signal_handler)
signal.signal(signal.SIGPIPE, signal_handler)
signal.signal(signal.SIGALRM, signal_handler)
signal.signal(signal.SIGTERM, signal_handler)
#signal.signal(signal.SIGSTOP, signal_handler)
signal.signal(signal.SIGTSTP, signal_handler)
signal.signal(signal.SIGCONT, signal_handler)
signal.signal(signal.SIGCHLD, signal_handler)
signal.signal(signal.SIGTTIN, signal_handler)
signal.signal(signal.SIGTTOU, signal_handler)
signal.signal(signal.SIGIO, signal_handler)
signal.signal(signal.SIGXCPU, signal_handler)
signal.signal(signal.SIGXFSZ, signal_handler)
signal.signal(signal.SIGVTALRM, signal_handler)
signal.signal(signal.SIGPROF, signal_handler)
signal.signal(signal.SIGWINCH, signal_handler)
signal.signal(signal.SIGINFO, signal_handler)
signal.signal(signal.SIGUSR1, signal_handler)
signal.signal(signal.SIGUSR2, signal_handler)
#signal.signal(signal.SIGTHR, signal_handler)

print("Started!")
for i in range(1, TTL_SECONDS + 1):
    time.sleep(1)
    if i % TTL_CHECK_INTERVAL == 0:
        print("Running for " + str(i) + " seconds...")
print("TTL reached, terminating!")
exit(0)

For this little example code I’ve added a function called “signal_handler” – because that’s what it is for. And in the main program I installed that signal handler for quite a lot of signals. To be able to do that, I needed to import the signal module, of course.

If this program is run, it will handle every signal you can send on a FreeBSD system (run kill -l to list all available signals on a Unix-like operating system). Why are some of those commented out? Well, try commenting those lines in! Python will complain and stop your program. This is because not all signals are allowed to be handled.

SIGKILL for example is by its nature something that you don’t want to allow to be overridden with custom behavior after all! While your program can choose to handle e.g. SIGINT and decide to ignore it, SIGKILL means that the process totally needs to be shut down immediately.

Try running the program and send some signals while it’s running. On BSD systems you can e.g. press CTRL-T to send SIGINFO. The operating system prints some information about the current load. And then the program has the chance to output some additional information (some may tell you what file they are currently processing, what percentage of a copy they have finished, etc.). If you send SIGINT, this program terminates as it should.

Logging

There’s another thing that we have to consider when dealing with processes running in the background: A daemon detaches from the TTY. That means it can no longer receive input the usual way from STDIN. But we investigated signals so that’s fine. However it also means a daemon cannot use STDOUT or STDERR to print anything to the terminal.

Where does the data go that a daemon writes to e.g. STDOUT? It goes to the system log. If no special configuration for it exists, you will find it in /var/log/messages. Since we expect quite a bit of debug output during the development phase, we don’t really want to clutter /var/log/messages with all of that. So to write a well-behaving little daemon, there’s one more topic that we have to look into: Logging.

#!/usr/local/bin/python3.6

 # Imports #
import logging
import signal
import time

 # Globals #
TTL_SECONDS = 30
TTL_CHECK_INTERVAL = 5

 # Functions #
def handler_sigterm(signum, frame):
    logging.debug("Exiting on SIGTERM")
    exit(0)

def handler_sigint(signum, frame):
    logging.debug("Not going to quit, there you have it!")

 # Main #
signal.signal(signal.SIGINT, handler_sigint)
signal.signal(signal.SIGTERM, handler_sigterm)
try:
    logging.basicConfig(filename='bdaemon.log', format='%(levelname)s:%(message)s', level=logging.DEBUG)
except:
    print("Error: Could not create log file! Exiting...")
    exit(1)

logging.info("Started!")
for i in range(1, TTL_SECONDS + 1):
    time.sleep(1)
    if i % TTL_CHECK_INTERVAL == 0:
        logging.info("Running for " + str(i) + " seconds...")
logging.info("TTL reached, terminating!")
exit(0)

The code has been simplified a bit: Now it installs only handlers for two signals – and we’re using two different handler functions. One overrides the default behavior of SIGINT with a dummy function, effectively refusing the expected behavior for testing purposes. The other one handles SIGTERM in the way it should. If you are fast enough on another terminal window, you can figure out the PID of the running program and then kill -15 it.

Logging with Python is extremely simple: You import the module for it, call a function like logging.basicConfig – and start logging. This line sets the filename of the log to “bdaemon.log” (for “baby daemon”) in the current directory. It changes the default format to displaying just the log level and the actual message. And then it defines the lowest level that should be logged.

There are various pre-defined levels like debug, info, warning, critical, etc. But what’s that try and except thing? Well, the logging module will attempt to create a logfile (or append to it, if it already exists). This is an operation that could fail. Perhaps we’re running the program in a directory where we don’t have the permission to create the log file? Or maybe for whatever reason a directory of that name exists? In both cases Python cannot create the file and an error occurs.

If such a thing happens, Python doesn’t know what to do. It knows what the programmer wanted to do, but has no clue on what to do if things fail. Does it make sense to keep the program running if something unexpected happened? Probably not. So it throws an exception. If an unhandled exception occurs, the program aborts. But we can catch the exception.

By putting the function that opens the file in a try block, we’re telling Python that we’re expecting it could fail. And with except we can catch an exception and handle expected problems. There are a lot of exception types; by not specifying any, we’re catching all of them. That might not be the best idea, because maybe something else happened and we’re just expecting that the logfile could not be created. But let’s keep it simple for now.
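
If you want to be stricter, you could catch only the file-related errors you actually expect; everything else would then still abort with a full traceback. A minimal sketch of that idea (not what the example code above does):

import logging

try:
    logging.basicConfig(filename='bdaemon.log', format='%(levelname)s:%(message)s', level=logging.DEBUG)
except OSError as err:
    # Covers "permission denied", "is a directory" and similar problems with the log file.
    print("Error: Could not create log file: " + str(err))
    exit(1)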

The one remaining thing to do is to change any print statements so that we’re using the logging instead. Depending on how important the log entry is, we can also use different levels from least important (DEBUG) to most important (CRITICAL).

You can either wait for the program to finish and then take a look at the log, or you open a second terminal and tail -f bdaemon.log there to watch output as the program is running.

Alright! With this we have everything required to daemonize the program next. Let’s write a little init script for it, shall we?

Init

Init scripts are used to control daemons (start and stop them, tell them to reload the configuration, etc.). There are various init systems in use across Unix-like operating systems. FreeBSD uses the standard BSD init system called rc.d. It works with little (or not so little, if you need to manage very complex daemons) shell scripts.

Since a lot of the functionality of the init system is the same across most of these scripts, rc.d handles all the common cases in shell files of its own that are then used in each of the scripts. In Python this would be done by importing a module; the term in shell scripting is to source another shell script (or fragment).

Create the file /usr/local/etc/rc.d/bdaemon with the following contents:

#!/bin/sh

. /etc/rc.subr

name=bdaemon
rcvar=bdaemon_enable

command="/usr/sbin/daemon"
command_args="-p /var/run/${name}.pid /path/to/script/bdaemon.py"

load_rc_config $name
run_rc_command "$1"

Yes, you need root privileges to do that. Daemons are system services and so we’re messing with the system now (totally at beginner level, though). Save the file and you should be able to start the program as a daemon e.g. by running service bdaemon onestart!

How’s that? What does that all mean and where does the daemonization happen? Well, the first line after the shebang sources the main rc fragment with all the required functions (read the dot as “source”). Then it defines a name for the daemon and an rcvar.

What is an rcvar? Well, by putting “bdaemon_enable=YES” into your /etc/rc.conf you could enable this daemon for automatic startup when the system is coming up. If that line is not present there, the daemon will not start. That’s why we need to use “onestart” to start it anyway (try it without the “one” if you’ve never done that and see what happens!).

Then the command to run as well as the arguments for that command are defined. And eventually two helper functions from rc.subr are called which do all the actual complex magic that they thankfully hide from us!

Ok, but what is /usr/sbin/daemon? Well, FreeBSD comes with an extremely useful little utility that handles the daemonization process for others! This means it can help you if you want to use something as a background service but you don’t want to handle the actual daemonization yourself. Which is perfect in our case! With it you could even write a daemon in shell script for example.

The “-p” argument tells the daemon utility to handle the PID file for the process as well. This is required for the init system to control the daemon. While our little example program is short-lived, we can still do something while it runs. Try out service bdaemon onestatus and service bdaemon onestop for example. If there was no PID file present, the init system would claim that the daemon is not running, even if it is! And it would not be able to shut it down.

There we go – our first FreeBSD daemon process, written in Python! One last thing that you should do is change the filename for the logfile to an absolute path like /var/log/bdaemon.log. If you want to read more about the daemon utility, read its manpage, daemon(8). And should you be curious about what the init system can do, have a look here.

What’s next?

While using /usr/sbin/daemon is perfectly fine, you might feel that we kind of cheated. So next time we’ll take a brief look at daemonizing with Python directly.

I also want to explore IPC (“inter-process communication”) with named pipes. This will allow for a slightly more advanced daemon that can be interacted with using a separate program.

The history of *nix package management

Very few people will argue against the statement that Unix-like operating systems conquered the (professional) world due to a whole lot of strong points – one of which is package management. Whenever you take a look at another *nix OS or even just another Linux distro, one of the first things (if not the first!) is to get familiar with how package management works there. You want to be able to install and uninstall programs after all, right?

If you’re looking for another article on using jails on a custom-built OPNsense BSD router, please bear with me. We’re getting there. To make our jails useful we will use packages. And while you can safely expect any BSD or Linux user to understand that topic pretty well, products like OPNsense are also popular with people who are Windows users. So while this is not exactly a follow-up article on the BSD router series, I’m working towards it. Should you not care for how that package management stuff all came to be, just skip this post.

When there’s no package manager

There’s this myth that Slackware Linux has no package manager, which is not true. However Slackware’s package management lacks automatic dependency resolving. That’s a very different thing but probably the reason for the confusion. But what is package management and what is dependency resolving? We’ll get to that in a minute.

To be honest, it’s not very likely today to encounter a *nix system that doesn’t provide some form of package manager. If you have such a system at hand, you’re quite probably doing Linux from Scratch (a “distribution” meant to learn the nuts and bolts of Linux systems by building everything yourself) or have manually installed a Linux system and deliberately left out the package manager. Both are special cases. Well, or you have a fresh install of FreeBSD. But we’ll talk about FreeBSD’s modern package manager in detail in the next post.

Even Microsoft has included Pkgmgr.exe since Windows Vista. While it goes by the name of “package manager”, it pales in comparison to *nix package managers. It is a command-line tool that allows you to install and uninstall packages, yes. But those are limited to operating system fixes and components from Microsoft. Nice try, but what Redmond offered in late 2006 is vastly inferior to what the *nix world had more than 10 years earlier.

There’s the somewhat popular Chocolatey package manager for Windows, and Microsoft said that they’d finally include a package manager called “one-get” (apt-get anyone?) with Windows 10 (or was it “nu-get” or something?). I haven’t read a lot about it on major tech sites, though, and thus have no idea whether people are actually using it and whether it’s worth trying out (I would, but I disagree with Microsoft’s EULA and thus haven’t had a Windows PC in roughly 10 years).

But how on earth are you expected to work with a *nix system when you cannot install any packages?

Before package managers: Make magic

Unix began its life as an OS by programmers for programmers. Want to use a program on your box that is not part of your OS? Go get the source, compile and link it and then copy the executable to /usr/local/whatever. In times where you would have just some 100 MB of storage in total (or even less), this probably worked well enough. You simply couldn’t go on a rampage and install unneeded software anyway, and by sticking to the /usr/local scheme you kept optional stuff separated from the actual operating system.

More space became available, however, and software grew bigger and more complex. Unix got the ability to use libraries (“shared objects”), ELF executables, etc. To make building more complicated software easier, make was developed: a tool that reads a Makefile which tells it exactly what to do. Software began shipping not just with the source code but also with Makefiles. Provided that all dependencies existed on the system, it was quite simple to build the software again.

Compilation process (invoked by make)

Makefiles also provide a facility called “targets” which lets a single file support multiple actions. In addition to a plain make invocation that builds the program, it became common to add a target that allows make install to copy the program files into their assumed place in the filesystem. Doing an update meant building a newer version and simply overwriting the files in place.

Make can do a lot more, though. Faster recompiles by looking at the generated files’ timestamps (and only rebuilding what has changed and needs to be rebuilt) and other features like this are not of particular interest for our topic. But they certainly helped with the quick adoption of make by most programmers. So the outcome for us is that we use Makefiles instead of compile scripts.

Dependency and portability trouble

Being able to rely on make to build (and install) software is much better than always having to invoke compiler, linker, etc. by hand. But that didn’t mean that you could just type “make” on your system and expect it to work! You had to read the readme file first (which is still a good idea, BTW) to find out which dependencies you had to install beforehand. If those were not available, the compilation process would fail. And there was more trouble: Different implementations of core functionality in various operating systems made it next to impossible for programmers to make their software work on multiple Unices. The introduction of the POSIX standard helped quite a bit, but operating systems still had differences to take into account.

Configure script running

Two of the answers to the dependency and portability problems were autoconf and metaconf (the latter is still used for building Perl where it originated). Autoconf is a tool used to generate configure scripts. Such a script is run first after extracting the source tarball to inspect your operating system. It will check if all the needed dependencies are present and if core OS functionality meets the expectations of the software that is going to be built. This is a very complex matter – but thanks to the people who invested that tremendous effort in building those tools, actually building fairly portable software became much, much easier!

How to get rid of software?

Back to make. So we’re now in the pleasant situation that it’s quite easy to build software (at least when you compare it to the dark days of the past). But what would you do if you want to get rid of some program that you installed previously? Your best bet might be to look closely at what make install did and remove all the files that it installed. For simple programs this is probably not that bad but for bigger software it becomes quite a pain.

Some programs also came with an uninstall target for make, however, which would delete all the installed files again. That’s quite nice, but there’s a problem: After building and installing a program you would probably delete the source code. Having to unpack the sources again just to uninstall the software is quite some effort if you didn’t keep them around – especially since you need the source for exactly the same version, as newer versions might install additional or different files!

This is the point where package management comes to the rescue.

Simple package management

So how does package management work? Well, let’s look at packages first. Imagine you just built version 1.0.2 of the program foo. You probably ran ./configure and then make. The compilation process succeeded and you could now issue make install to install the program on your system. The package building process is somewhat similar – the biggest difference is that the install destination was changed! Thanks to the modifications, make wouldn’t put the executable into /usr/local/bin, the manpages into /usr/local/man, etc. Instead make would then put the binaries e.g. into the directory /usr/obj/foo-1.0.2/usr/local/bin and the manpages into /usr/obj/foo-1.0.2/usr/local/man.

Installing tmux with installpkg (on Slackware)

Since this location is not in the system’s PATH, it’s not of much use on this machine. But we wanted to create a package and not just install the software, right? As a next step, the contents of /usr/obj/foo-1.0.2/ could be packaged up nicely into a tarball. Now if you distribute that tarball to other systems running the same OS version, you can simply untar the contents to / and achieve the same result as running make install after an unmodified build. The benefit is obvious: You don’t have to compile the program on each and every machine!

So much for primitive package usage. Advancing to actual package management, you would include a list of files and some metadata in the tarball. Then you wouldn’t extract packages by hand anymore but leave that to the package manager. Why? Because it does not just extract all the needed files: it also records the installation in its package database and keeps the file list around in case it’s needed again.
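
To make that idea a little more concrete, here is a toy sketch in Python. This is not how any real package manager is implemented – the database path and the package format are made up – it merely illustrates recording an installation and using the stored file list for a clean uninstall:

import json
import os
import tarfile

PKG_DB = "/var/db/toypkg.json"   # made-up location for our toy package database

def install_package(tarball, root="/"):
    """Extract a package tarball and remember which files it contained."""
    with tarfile.open(tarball) as tar:
        files = [m.name for m in tar.getmembers() if m.isfile()]
        tar.extractall(path=root)
    db = {}
    if os.path.exists(PKG_DB):
        with open(PKG_DB) as f:
            db = json.load(f)
    db[os.path.basename(tarball)] = files   # record the installation
    with open(PKG_DB, "w") as f:
        json.dump(db, f, indent=2)

def uninstall_package(pkgname, root="/"):
    """Delete all files recorded for a package and forget about it."""
    with open(PKG_DB) as f:
        db = json.load(f)
    for relpath in db.pop(pkgname):
        path = os.path.join(root, relpath)
        if os.path.isfile(path):
            os.remove(path)
    with open(PKG_DB, "w") as f:
        json.dump(db, f, indent=2)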

Uninstalling tmux and extracting the package to look inside

Installing using a package manager means that you can query it for a list of installed packages on a system. This is much more convenient than ls /usr/local, especially if you want to know which version of some package is installed! And since the package manager keeps the list of files installed by a package around, it can also take care of a clean uninstall without leaving you wondering if you missed something when you deleted stuff manually. Oh, and it will be able to lend you a hand in upgrading software, too!

That’s about what Slackware’s package management does: It enables you to install, uninstall and update packages. Period.

Dependency tracking

But what about programs that require dependencies to run? If you install them from a package you never ran configure and thus might not have the dependency installed, right? Right. In that case the program won’t run. As simple as that. This is the time to ldd the program executable to get a list of all libraries it is dynamically linked against. Note which ones are missing on your system, find out which other packages provide them and install those, too.

Pacman (Arch Linux) handles dependencies automatically

If you know your way around, this works OK. If not… Well, while there are a lot of libraries where you can guess from the name which package they likely belong to, there are others, too. Happy hunting! Got frustrated already? Keep telling yourself that you’re learning fast – the hard way. That might ease the pain. Or go and use a package management system that provides dependency handling!

Here’s an example: You want to install BASH on a *nix system that only provides the old Bourne shell (/bin/sh). The package manager will look at the packaging information and see: BASH requires readline to be installed. Then it will look at the package information for that package and find out: readline requires ncurses to be present. Finally it will look at the ncurses package and nod: no further dependencies. It will then offer to install ncurses, readline and BASH for you. Much easier, eh?
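
What the package manager does here is essentially a recursive walk over the dependency information. A tiny Python sketch with a made-up dependency table (mirroring the BASH example above, not any real package format) could look like this:

# Made-up packaging metadata: package name -> list of direct dependencies.
DEPENDS = {
    "bash": ["readline"],
    "readline": ["ncurses"],
    "ncurses": [],
}

def resolve(pkg, resolved=None):
    """Return the packages to install, dependencies first, each one only once."""
    if resolved is None:
        resolved = []
    for dep in DEPENDS[pkg]:
        if dep not in resolved:
            resolve(dep, resolved)
    if pkg not in resolved:
        resolved.append(pkg)
    return resolved

print(resolve("bash"))   # ['ncurses', 'readline', 'bash']

A real package manager additionally has to deal with version constraints and circular dependencies, which this sketch happily ignores.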

Xterm and all dependencies downloaded and installed (Arch Linux)

First package managers

A lot of people claim that the RedHat Package Manager (RPM) and Debian’s dpkg are examples of the earliest package managers. Both are indeed so old that using them directly is inconvenient enough to justify the existence of front-end programs that use them indirectly (yum/dnf and e.g. apt-get) – but the claim itself is not true.

PMS (short for “package management system”) is generally regarded as the first (albeit primitive) package manager. Version 1.0 was ready in mid-1994 and used on the Bogus Linux distribution. With a few intermediate steps this led to the first incarnation of RPM, Red Hat’s well-known package manager, which first shipped with Red Hat Linux 2.0 in late 1995.

FreeBSD 1.0 (released in late 1993) already came with what is called the ports tree: A very convenient package building framework using make. It included version 0.5 of pkg_install, the pkg_* tools that would later become part of the OS! I’ll cover the ports tree in some detail in a later article because it’s still used to build packages on FreeBSD today.

Part of a Makefile (actually for a FreeBSD port)

Version 2.0-RELEASE (late 1994) shipped the pkg_* tools. They consisted of a set of tools like pkg_add to install a package, pkg_info to show installed packages, pkg_delete to delete packages and pkg_create to create packages.

FreeBSD’s pkg_add got support for using remote repositories in version 3.1-RELEASE (early 1999). But those tools were really showing their age when they were put to rest with 10.0-RELEASE (early 2014). A replacement had been developed in the form of the much more modern solution initially called pkg-ng or simply pkg. Again, that will be covered in another post (the next one, actually).

With the ports tree, FreeBSD undoubtedly had the most sophisticated package building framework of its time. It is still one of the most flexible ones and a joy to work with compared to creating DEB or RPM packages… And since Bogus’s PMS was started at least a month after pkg_install, it’s entirely possible that the first working package management tool was in fact another FreeBSD innovation.

Precomp (or: How to compress already compressed data?)

It’s kind of a strange feeling: while half of the IT world seems to either already be on fire or to tremble with fear, I can freely choose whatever topic I want to write about this month. I haven’t had a Windows box for almost a decade now, and the people I work or keep in contact with are also mostly *nix only. So this post is not about encryption or ransomware at all. It is about useful, respectable compression. Or more precisely: the art of re-compressing already compressed data!

In January, Precomp, a precompression utility, was open-sourced! The first two sections tell a bit about how I became interested in this topic and in Precomp. Skip them if you don’t want to read that kind of stuff.

Compressing compressed data?

When I was young and new to PCs, I once tried to compress a ZIP archive with ACE (a lesser known archiver that once was comparable to the more popular RAR). I knew that ACE offered stronger compression and so I thought that this should make the file smaller. Just imagine my surprise when it turned out that I was wrong!

I guess that most of us have a story like that to tell, a story from our childhood when compression was nothing short of magic. Later I began to understand that even though it does in fact start with “m”, it’s not magic but math (a subject that I totally sucked at in school – but fortunately I grasped enough to get a rough idea of how compression works ;)). Then there was no surprise anymore: The compressed data is not well suited for any other general-purpose compression method, even if it was compressed with a weak algorithm.

How to work around that? Well, decompressing the ZIP file and creating a new ACE archive does the trick in the case mentioned above. Of course things are not always that straightforward. If they were, I wouldn’t have much to write about right now and this post would be really, really short!

For whatever reason, compression continued to fascinate me and I loved compressing things to sizes as tiny as possible. It was fun to try out new experimental compression programs specialized on some specific types of files. I did that for years – until I had to stop due to a lack of time.

Games

Let’s fast forward some years from that failed compression experiment with ACE; I had replaced DOS 6.22 with Win95 which I had replaced with Win98 (SE) that I had replaced with WinME, … On some day I wanted to install Quake ]|[ Arena (yes, friends, I once was 1337 young enough to spell it like that!) on my main computer to get into it again for a LAN party next weekend. So I went looking for the darn CD. It took me a while but I finally found the CD case. I opened it up and… the CD itself was missing. Oh great! Since I didn’t feel like looking into all the other cases to find out into which I might have put it accidentally, I decided to just copy it off an older computer which had it already installed (ID were nice people. I don’t remember which version of Q3A it was, but there eventually was an official patch which also removed the CD check for the game so there was no need for a crack or anything).

Now, different versions of Windows didn’t always play together too well on the LAN, and since my Quake installation was on a computer with an older Windows (and I didn’t have another cable at hand), I decided that I’d just burn it to CD. It turned out, however, that the other machine didn’t have vanilla Q3A installed but the expansion set as well. Together it was obviously too big to fit on one CD. There would have been easy solutions: Leave out the resource files for the expansion, burn two CDs, put the hard drive into the new computer, … Sure, easy solutions are nice and all. But sometimes they are also boring! And when you’re young and have some free time, you don’t do boring stuff. So of course I opted for the more challenging solution: Get it all on one CD!

Quake 3’s resource containers go by the file extension of .pk3 and, more importantly, are in fact ZIP files without any compression. This meant that they could be compressed well because there was no ZIP compression getting in the way. But guess what: Even after applying the most extreme compression programs, the result simply would not fit onto one CD…

Bad luck, eh? Well, not really. Unpacking the container files was in fact the solution in this case. Not because of weak compression, but because it enabled me to test each of the contained files separately with all compressors and to group together the files that compressed best with one compression utility or another! I think I was able to shrink it down to almost the required size, ending up just a couple of megs over the CD limit. There were blank CDs with 800 MB capacity as well, so it would have fit onto one of those – but I didn’t have any. So I replaced the ID video with an empty video file and I was set.

Since I liked doing these things, I began making backups like that for a lot of my favorite games: ripping apart (and later rebuilding) resource containers, converting between file formats, decompressing whatever could be decompressed before applying stronger compression, etc.

How Precomp works

The more I got into free and open source things, the more I wondered if some of them wouldn’t benefit from better compression. A friend and former classmate of mine invented Precomp and I of course was among the first to make use of it and provide feedback. But what is Precomp?

Precomp is what the name says: A pre-compressor. It is not directly meant to reduce the size of files. On the contrary: It can make some files even bigger than the original input. But that’s a good thing really! How’s that? Well, it’s meant to prepare files for compression so that eventually these files can be compressed to a smaller size than the original file could – without losing data of course!

What Precomp does is look for streams in its input file that are compressed with a compression method known to Precomp. It then decompresses and recompresses them so that they can be compared. If they are identical, Precomp will write the decompressed stream (plus how to recompress it properly) to its output file.

While this sounds quite simple in theory, it is in fact a bit more complex. The reason for that lies in the flexibility of some compression algorithms. Have you ever zipped up a file? Then you know that there are a lot of parameters you can provide which affect how the file will be compressed: “fast”, “normal”, “strong” or “maximum” compression? What about the dictionary size? A lot of things like that. Any combination of compression parameters will result in a valid zip stream that can be decompressed by any zip-compatible utility. Replacing such a stream with a compatible one is fairly easy. Reproducing the exact, bit-for-bit identical stream is not.

To be truly lossless, Precomp uses trial and error on each stream. If it can figure out the combination of parameters that result in the original stream: Great! If not, that stream has to be left untouched.
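
Just to illustrate the principle (this is merely a toy sketch using Python’s zlib module, not Precomp’s actual code), brute-forcing the compression level of a plain zlib stream could look like this:

import zlib

def find_zlib_level(original_stream):
    """Try to find a compression level that reproduces the stream bit for bit."""
    decompressed = zlib.decompress(original_stream)
    for level in range(10):              # zlib knows the levels 0..9
        if zlib.compress(decompressed, level) == original_stream:
            # Success: keep the decompressed data plus the level for later.
            return decompressed, level
    # No luck - this stream was created with parameters we cannot reproduce,
    # so it has to be left untouched.
    return None

# A stream we created ourselves, so level 6 (the zlib default) must match:
stream = zlib.compress(b"An example payload that compresses quite well. " * 20, 6)
print(find_zlib_level(stream)[1])        # prints: 6

The real thing has to try far more parameter combinations than just the level, which is exactly why some streams cannot be reproduced and have to be left alone.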

What Precomp can do

Early versions of Precomp were only available on Windows but there have been Linux versions for quite a while as well. I also use it on FreeBSD without any problems. The .PCF files are platform-independent. You can restore the original file on Windows from a file precompressed on Linux or BSD and vice versa.

While Precomp originally was only a pre-compressor for zlib streams (which are used in a variety of file formats like ZIP, GZIP, PNG, PDF, …), it can do more things now. It can use bzip2 to compress its input file after precompression. It can losslessly compress some JPEG pictures to smaller sizes (thanks to an external library). And in the current development version there’s even support for compressing MP3 music files further (also using an external lib)!

Currently, Precomp relies on temporary files for all the extracted streams and thus puts heavy load on your hard drive (and is a bit slow due to that bottleneck). SSDs obviously perform better, but it totally makes sense to use a memdrive if you can spare some RAM for it. I’ve forked the project on Github and added an experimental shell script to assist with the creation of such a memdrive. It’s currently FreeBSD only (I’ve migrated all of my boxes to *BSD and currently have no Linux machine remaining but will set up one for cases like that some time in the future). Feel free to take a look at it if you’re into portable shell scripting and please do tell me if you have any suggestions!

Precomp is not at all at the limit of its possibilities. There are a lot of things that can be tweaked, optimized or added. If you feel like that could be a fun project – go ahead and play with it, it’s on Github. Or perhaps you have an idea what this could be useful for? Please help yourself and use it. It’s free software after all (Apache licensed).

Thea: The gain of giving away for free

This post is inspired by the game Thea: The Awakening. No, Eerie Linux has not mutated into a games blog. Yes, I will give a short description of the game. But what this post is really about is some thoughts about software development in the past, today and what could be a more open future.

Why Thea? Because the developers did something very uncommon: They decided to give the game away for free – if you’re a Linux user that is!

Thea: The Awakening

The game in question is a turn-based strategy game with a strong focus on survival. There’s a nice background story: The world had turned to darkness (playing the game you will discover why) and is haunted by creatures and spirits of the dark. Now the sun is rising again and the gods have returned but both are very weak and darkness will not give up without a fierce fight. Slavic mythology makes for a very nice and rather uncommon setting.

In case you want to give it a try, you can find a download link here. And yes, it is really completely free. You don’t need to buy the Windows version first or something.

I’ve successfully run the game on the Mint laptop that I share with my wife and can confirm that it works well. No luck on a 32-bit machine that I installed Arch on to give the 32-bit version of the game a try. It won’t start and the console messages give no clues why this may be. So if you’re still stuck with 32-bit only systems, you’re probably out of luck. 😉

The developers stated that they have not even tested the Linux version themselves! So what works and what doesn’t? Most things seem to work surprisingly well in fact. Sound, graphics, even the intro video. I’ve experienced graphical glitches with some white pixels appearing for a second (nope, no AMD video card – it’s Intel!). But this happens just rarely and is a fairly minor issue. Far more annoying is the fact that you cannot really use the keyboard: A key press works but the release event doesn’t… This is a known issue with the version of the Unity engine that Thea uses. It may or may not be addressed in a future release. You can however get the keys released by ALT-TABbing out of the game and back in. That way you can at least always access the menu.

You choose one of the gods when starting a game. I’ve played scenarios for multiple gods now. The main story (“Cosmic Tree”) gets pretty repetitive soon since it’s always the same. This is also true for a lot of the other quests. However, the game has options to skip a lot of the text in case you already know it, which certainly was a good idea. Some of the quests are different depending on which god you chose, which keeps things interesting story-wise. Maps, resources, encounters, etc. are randomly generated for each game. Together with the challenging survival aspect, plenty of combinations to try for crafting items and the interesting gameplay, this gives Thea a rather high replay value.

Software development models

I’d like to distinguish some development approaches here and sum each of them up by giving the model, as I see it, a name. These are no official models (I’m not a game developer) but an attempt to capture the whole thing in one heading each.

The shareware model

There was once a time when software was developed in a purely closed manner. It was developed internally and when it was ready, a release was done and advertised. The good thing was that games were often cut into “episodes” and the first one given away as shareware so people could try out the game for free and might decide to buy the full product.

The public relations model

Advertising grew bigger and bigger as well as more and more aggressive. Top titles were often announced as soon as development began, and some material was released along the way to keep people hooked. This worked in some cases and failed in others (see Duke Nukem Forever, announced in 1996).

It was a reasonable move to try to build up an audience interested in a certain title early. The problem with that mainly comes down to two things: You cannot keep people hooked for an arbitrary amount of time, and such a continuing advertising campaign costs a whole lot of money long before you start earning anything from sales.

These problems lead to a new one, however: very high pressure on the developers to meet deadlines and stay on schedule. And sometimes the people in charge may even decide to release a half-baked product, which almost always is a very bad idea… (what was the latest example? That Batman game perhaps?)

The community-aware model

It’s not a new insight that it is rather helpful for any title to have a large community. Some studios provide forums in an attempt to make building up a community easier. And it’s also common knowledge today that feedback from that community is extremely valuable: Knowing your audience better helps a lot to provide the perfect product after all!

The most important point of this model is that interacting with the players is now bidirectional: There’s advertising targeting them, but you certainly want to have (and honor) the feedback they provide. It also makes sense to design the game for easy modification and/or to provide the tools for it, making it as simple as possible to create mods. This can be a huge plus when it leads to a bigger, more active and longer-living community!

Independent of a single title, there is a possibility for a studio to get itself a good name by opening the source code for older games. This may require some cleaning-up work first, but some studios have also released code as-is (which can be rather terrible). But usually the community figures out what to do with it and before long the game is ported to new platforms and receives technical updates and enhancements. This has made some titles downright immortal: There are still new episodes, mods and total conversions for Wolfenstein being released. Yes, for a game from 1992 with extremely “poor graphics” (320×200, 8-bit) by today’s standards! And there’s not one week without new maps for the mighty DooM (1993).

The community-supported model

There’s this interesting trend of “early access” games: Players are given the opportunity to playtest games before they are ready for release. People know they have to expect bugs but they can try out a game of their interest early and if they are very committed to it, they can report bugs as they encounter them.

This is a classical win-win situation: The developers get broad testing done for free and the players get an early peek into the game. Oh, and any form of interaction is of course always a good thing.

The community-backed model

That’s a rather new thing and basically means that some developers try to get their game crowd-funded. This can succeed and this can fail. There are examples for both cases. But while this is clearly a development model since it has a lot of impact on it, I’d say that it’s also more of a special case than a general model.

The future?

MuHa Games have made a clever step ahead with Thea, as the gain from giving the title away for free on Linux is really considerable. How’s that? Well, if there was no Linux version, Linux people wouldn’t have bought the game either. So giving it away is no actual loss: People of the “hey, I would have bought it for Windows, but why should I when I can play it for free on Linux!” kind are most likely extremely rare – if they exist at all.

No loss is fine, but where’s the actual gain? Well, there’s the “Just bought the Windows version. Besides: I don’t run Windows at all” type of guy. These people alone should suffice to cover the costs of the additional effort of packaging a Linux release and uploading it somewhere. But that’s not the main point at all: Can you say “free advertising”? People talk about the game and people write about the game, many of whom would not have done so if it had just been an ordinary game! With the free Linux release, MuHa managed to make the game stand out (and that is not too easy today).

For these reasons giving it away proves to be a very sensible PR move! I don’t mind whether that was intended or not. It doesn’t change the facts.

Community-assisted model?

So what could the future hold? I can imagine that making the community engage even more would be a big benefit. From a studio’s perspective, fans do unpaid work because they love the product. And from the fan’s perspective it’s just cool to be part of one of your favorite games and help improve it.

What could this look like? My vision is to sort of blend closed source development with what we learned from open source development. It’s cool that people playtesting a game can report bugs via forum or email. But when will the first project set up a public bugtracker along with a tutorial on how to use that for bug reports and maybe (sensible) feature requests?

Then: What about translation? Open source has achieved very, very good results using translation frameworks like Transifex. Right now Thea is only available in English. My native language is German and I would not have minded at all dedicating some time to translating a few strings (I got a nice game for free after all!). There’s a lot of potential in this.

And along with that it would totally make sense to avoid using proprietary containers for files. I did not bother trying to extract text out of whatever format it is that MuHa uses for Thea. In 1999 ID Software did a clever thing for Quake III Arena: They used container files called “.pk3” – which were simply renamed, uncompressed Zip files. The benefit is obvious: Everybody can extract the resources, modify them and put things back together. Great! I noticed a lot of spelling mistakes in Thea. If I had had access to the game text, you’d have received a series of patches from me (and by applying them you’d instantly see which ones are still valid, fixing the mistakes along the way). Wouldn’t that be a great way to improve the game?

Licensed Open Source model?

Can open source work for a commercial game? Well, why not? Open source alone means just that: The source is open. It does not say under which license, and it does not say that it’s free. Now I generally support as much freedom as possible – but that last word is important. A more open development is a nice improvement IMO. There’s no reason to demand more than that.

In this model the customers pay for the game data, without which you obviously cannot play the game, but the program source is open (or perhaps semi-open, where it is included with the copy of the game you get when you buy it and you’re free to distribute a series of patches but not the source itself). I’m pretty sure that this can work. One potential problem here may be deadlines. Often the code in commercial games must be horrible – not because the programmers suck but because unrealistic deadlines blow. A lot of studios may hesitate to open up their code for that very reason…

Addressing that problem could also be easy, however: You sell games in early access? Buyers get the code and know that it’s early and may not be in perfect shape (and they can actually help improving it). Again both sides win: The studio gets code review and maybe some patches, plus some people may even attempt to port the game to platforms unsupported by the studio. The players get better games they can help to improve, take modding to the next level and even get a chance to see what coding is like – and some reference work if they intend to work in that industry!

There’s one other issue, though. In many cases studios will want to hide some things from competitors. That may be old (and at some point hopefully obsolete) thinking but we have to accept it as a present fact. So what about this? Well, those things could be put into libraries… It’s far better to have the program code open and make it use closed libraries than having nothing open at all!

Time for change

Who’s stepping forward making the next step in game development? I’m really curious if something in the direction of what I wrote here happens any time in the future. For each step there’s good press to catch for free again, you know? 😉 Perhaps some small studio dares to make the move.

Update: I wrote this in a hurry on 11/30 to rush out my November post. And then I once again forgot to make it public. But now it is…

An interview with the Nanolinux developer

2014 is nearly over and for the last post this year I have something special for you again. Last year I posted an interview with the EDE developer and I thought that another interview would conclude this year of blogging quite fine.

In the previous post I reviewed Nanolinux (and two years ago XFDOS). Since I was in mail contact with the author about another project as well, it suggested itself that I’d ask him if he’d agree to give me an interview. He did!

So here’s an interview with Georg Potthast (known for a variety of projects: DOSUSB, Nanolinux and Netrider – to just name some of them) about his projects, the FLTK toolkit, DOS and developing Open Source software in general. Enjoy!

Interview with Georg Potthast

This interview was conducted via email.

Please introduce yourself first: How old are you and where are you from?

I am 61 years old and live in Ahlen, Germany. This is about 30 minutes drive from Dortmund where they used to brew beer and where the BVB Dortmund soccer team is currently struggling.

Do you have any hobbies which have nothing to do with the IT sector?

Not really. I did some Genealogy, which has to do a lot with IT these days. But now I have several IT projects I am working on.

DOS

You’re involved in the FreeDOS community and have put a lot of effort into XFDOS. A lot of people shake their heads and mumble something like “It’s 2014 now and not 1994…” – you know the score. What is your motivation to keep DOS alive?

I have been using DOS for a long time and wish it would not go away completely. So I developed these DOS applications, hoping to get more people to use DOS. But I have to agree that I have not been successful with that.

Potential software developers find only very few users for their applications which is demotivating. Also there is simply no hardware available today that is limited so much that you better use DOS on it. Everything is 32/64 bit, has at least 4 GB of memory and terabytes of disk space. And even the desktop PC market is suffering from people moving to tablets and smartphones.

People are still buying my DOSUSB driver frequently. They are using it mostly for embedded applications which shall not be ported to a different operating system for one reason or another.

Do you have any current plans regarding DOS?

I usually port my FLTK applications to DOS if it is not too much effort to do so. So they are available for Linux, Windows and DOS. Such as my FlDev IDE (Link here).

Recently I made a Qemu/FreeDOS bundle named DOS4WIN64 (Link here) that you can run as an application on any Windows 7/8 machine. This includes XFDOS. I see this as a path to run 16bit applications on 64bit Windows.

How complicated and time consuming is porting FLTK applications from Linux to DOS or vice versa?

It depends on the size and the dependencies on external libraries. I usually run ./configure on Linux and then copy the makefile to DOS where I replace -lXlib with -lNXlib plus -lnano-X. Then, provided the required external libraries could be downloaded from the djgpp site, it will compile if the makefile is not too complicated (recursive). Sometimes I also compile needed libraries for DOS, which is usually not difficult if they have a command line interface.

You then have to test if all the features of the application work on DOS and make some adjustments here and there. Often you can use the Windows branch if available for the path definitions.

Porting DOS applications to Linux can be more complicated than vice versa.

Linux

For how long have you been using Linux?

I have been using Linux on and off. I began using SCO-Unix. However, I did not like setting things up with configuration files (case sensitive) scattered over many directories. It took me over a week to get serial communications to work to connect a modem. When I asked Linux developers for help they recommended to recompile the kernel first – which means they did not know how to do that either. So I returned to DOS at that time. But I have been using Linux a lot for several years now.

What is your distribution of choice and why?

I mainly use SUSE but I think Ubuntu may work just as well. This may sound dull but you do not have to spend time on adding drivers to the operating system or porting the libraries you need. The mainstream Linux distributions are well tested and documented and you do not have to spend the time to tailor the distro to your needs. They do just much more than you need so you are all set to start right away.

My own distro, Nanolinux, is a specialized distro which is meant to show how small a working Linux distro can be. It can be used on a flash disk, as an embedded system, a download on demand system or to quickly switch to Linux from within Windows.

However, if you have a 2 Terabyte hard disk available I would not use Nanolinux as the main distribution.

FLTK

Which programming languages do you prefer?

I like Assembler. To be able to use X11 and FLTK I learned C and C++ which I currently use. I have not done any assembler in a while though.

You seem to like the idea of minimalism. Do you do use those minimalist applications on a daily base or are they more of a nice hobby?

Having a DOS and assembler background I always try not to use more disk space than necessary. Programming is just my hobby.

Many of your projects use the FLTK toolkit. Why did you choose this one and not e.g. FOX?

I had ported Nano-X to DOS to provide an Xlib alternative for DOS developers. In addition I ported FLTK to DOS as well since FLTK can be used on the basis of Nano-X. So I am now used to FLTK.

Compared to the more common toolkits, FLTK suffers from a lack of applications. Which three FLTK applications that don’t exist (yet) do you miss the most?

I think FLTK is a GUI toolkit for developers, so it is not so important what applications are available based on FLTK.

If you look at my Nanolinux – given I add the NetRider browser and my FlMail program to the distro – it comes with all the main office applications done in FLTK. However, the quality of these applications is not as good as Libre Office, Firefox or Gimp. I do not expect anyone to write Libre Office with a FLTK GUI.

When you awake at night, a strange light surrounds you. The good FOSS fairy floats in the air before you! She can do absolutely everything FOSS related; whether it’s FLTK 3 being completed and released this year, a packaging standard that all Linux distros agree on or something a bit less unlikely. 😉 She grants you three wishes!

As with FLTK 3 I wish it would change its name and the development would concentrate on FLTK 1.3.

Regarding the floating fairy I would wish the internet would be used by nice and friendly people only. Currently I see it endangered by massive spam, viruses, criminals and even cyber war as North Korea apparently did regarding the movie the ruling dictator wanted to stop being shown.

Back to serious. What do you think: Why is FLTK such a little known toolkit? And what could be done about that?

I do not think it is little known, just most people use GTK and so this is the “market leader”. If you work in a professional team this will usually decide to go for GTK since most members will be familiar with that.

What could be done about that? If KDE and Gnome would be based on FLTK I think the situation will change.

From your perspective of a developer: What do you miss about FLTK that the toolkit really should provide?

Frankly speaking, as a DOS developer the alternative would be to write your own GUI. And FLTK provides more features than you could ever develop on your own.

What I do not like is the lack of support for third party schemes. Dimitrj, a Russian FLTK developer who frequently posts as “kdiman” on the FLTK forums, created a very nice Oxy scheme. But it is not added to FLTK since the developers do not have the time to test all the changes he made to make FLTK look that good.

What do you think about the unfortunate FLTK 2 and the direction of FLTK 3?

I think these branches have been very unfortunate for FLTK. Many developers expected FLTK 2 to supersede FLTK 1.1 and waited for FLTK 2 to finish before developing an FLTK application. But FLTK 2 never got into a state where it could replace FLTK 1.1. Now the same seems to happen with FLTK 3.

So they should have named FLTK2/3 the XYZ-Toolkit and not FLTK 2 to avoid stopping people to choose FLTK 1.1.

Currently there is no development on FLTK 2/3 that I am aware of and I think the developers should concentrate on one version only. FLTK 1.3 works very well and does all that you need as a software developer as far as I can say.

Somebody with a bit of programming experience and some free time would like to get into FLTK. Any tips for him/her?

I wrote a tutorial which should allow even beginners in C++ programming to use FLTK successfully (Link here).

Nanolinux

You’ve written quite a number of such applications yourself. Which of your projects is the most popular one (in terms of downloads or feedback)?

This is the Nanolinux distro. It has been downloaded 30,000 times this year.

NanoLinux… Can you describe what it is?

Let me cite Distrowatch, I cannot describe it better: Nanolinux is an open-source, free and very lightweight Linux distribution that requires only 14 MB of disk space. It includes tiny versions of the most common desktop applications and several games. It is based on the “MicroCore” edition of the Tiny Core Linux distribution. Nanolinux uses BusyBox, Nano-X instead of X.Org, FLTK 1.3.x as the default GUI toolkit, and the super-lightweight SLWM window manager. The included applications are mainly based on FLTK.

After compiling the XFDOS distro I thought I would gain more users if I would port it to Linux. The size makes Nanolinux quite different from the others and I got a lot of downloads and reviews for it.

The project is based on TinyCore which makes use of FLTK itself. Is that the reason you chose this distro?

TinyCore was done by the former main developer of Damn Small Linux. So he had a lot of experience and did set up a very stable distro. Since I wanted to make a very small distro this was a good choice to use as a base. And I did not have to start from scratch and test that part of the distro forever.

NanoLinux uses an alternative windowing system. What can you tell us about the differences between NanoX and Xorg’s X11?

Nano-X is simply a tiny Xlib compatible library which has been used in a number of embedded Linux projects. Development started about 15 years ago as far as I recall. At that time many Linux application developers used X11 directly and therefore were willing to use an alternative like nano-X for their projects.

Since nano-X is not fully compatible with X11, a wrapper called NXlib was developed, which provides this compatibility and allows basing FLTK and other X11 applications on nano-X without code changes. The compatibility is not 100% of course, but it is sufficient for FLTK and many X11 applications.

Since nano-X supported DOS in the early days I took this library and ported the current version to DOS again.

Netrider

The project you are working on currently is NetRider, a browser based on webkit and FLTK. Please tell us how you came up with the idea for it.

Over the years I looked at other browser applications and thought how I could build my own browser, just out of interest. Finally Laura, another developer from the US, and I discussed it together. She came up with additional ideas and thoughts. That made me have a go at WebKit with FLTK.

What are your aims for NetRider?

I wanted to add a better browser to my Nanolinux distro replacing the Dillo browser. Also, as a FLTK user I wanted to provide a FLTK GUI for the WebKit package as an alternative to GTK and Qt.

There’s also the project Fifth which has quite similar aims at first sight. Why don’t you work together?

Lauri, the author of Fifth, and I started out about the same time with our FLTK browser projects, not knowing of each other’s plans. Now our projects run in parallel. Even though we both use FLTK, the projects are quite different.

We have not discussed working together yet and our objectives are different. He wants to write an Opera compatible browser and competes with the Otter browser while I am satisfied to come up with something better than Dillo.

I did not ask Lauri whether he thinks we should combine the projects. I am also not sure if this would help us both because we implemented different WebKit APIs for our browsers, so we would have to make a WebKit library featuring two APIs. This could be done though. Also he is not interested in supporting Windows, which Laura and I want to support.

Would you say that NetRider is your biggest project so far? And what plans do you have for it?

Setting up Nanolinux and developing/porting all the applications for it was a big project too, and I plan to make a new release beginning of next year.

As with NetRider, it depends on whether people like to use it or are interested in developing for / porting it. Depending on the feedback I will make my plans. Recently I incorporated some of the observations I got from beta testers, added support for additional languages, initial printing support etc.

The last one is yours: Which question would you have liked me to ask in addition to those and what is the answer to it?

I think you already asked more questions than I would have been able to come up with. Thank you for the interesting questions.

Thanks a lot Georg, for answering these questions! Best wishes for your current and future projects!

What’s next?

I have a few things in mind… But I don’t know yet which one I’ll write about next. A happy new year to all my readers!

The concepts of complexity and simplicity

Life in general is a very complex thing. Society is a complex matter, too. Also the IT world is a complex one. And so are many of today’s programs – for the good or the bad.

In many fields complexity is a necessity. You cannot build a simple microprocessor that satisfies today’s needs. And there is no way to have a very simple kernel that can do everything that we need. I agree to that and I do not want to condemn complexity as a whole. But – and I cannot stress that enough – while more and more sophisticated programs are being developed, projects have the tendency to become overly complex. And this is what I criticize.

A bit of history

Most of my readers are probably happy users of some Unix-like operating system. Some have been around long enough to have witnessed how these systems changed over time. Many of us younger ones did not, and so we only know what we have read about those times (or probably not even that).

Thinking about the heritage of Unix, another OS called Multics comes to mind. This system was jointly developed by AT&T, GE and MIT. It was a sophisticated operating system which had many remarkable or even truly revolutionary features for its time. A lot of effort and money was put into it. High expectations were put on Multics. And then eventually – it failed.

AT&T had pulled out of the project when they realized that it was rather slow and overly complex. They learned from it and attempted to create a system which followed the opposite approach: Aim for simplicity. This system led to an incredible success: Unix.

So it is important to know that enthusiasm for technology and the urge to develop more and more complex programs is not a new phenomenon at all. In fact I’d claim that it is the logical consequence of how man thinks. While all things begin with relatively simple forms, complexity as a concept does not follow after the concept of simplicity. On the contrary: Simplicity is the lesson learned after realizing the downsides of complexity.

Universalism and particularism

Some people seem to be fascinated with the idea to have one tool that should do nearly everything. Let’s assume we have that tool available today. The result will be an extremely complex application which has an overwhelming number of features. There will hardly be any single person who will know all these features (let alone bring all of them to use).

Now each feature you don’t use wastes space on your drive. While this is true, it is certainly the smallest problem when you’re not working in the embedded field. A bigger one is that it will surely be of low quality: While it can do a hell of a lot of things, it is very unlikely that all of its features will be comprehensive. The program is likely to be rather slow because optimizing a very complex program is extremely difficult. The worst thing however is that it is bound to contain a high amount of bugs, too!

It is a well-known fact that program code whose functions are longer than what fits on one screen contains far more bugs. For some reason a lot of programmers seem not interested in writing good code but either just want to get something done or aim at too ambitious goals which make the project overly complex.

On the other hand there are projects which specialize in a single, narrow field. If you suggest a new feature it may very well happen that it will be rejected. The people who work on this project do not care for stuff just because that’s currently ultra-hip. Instead they often refer to features which are not really needed as unnecessary bloat. These programs cannot do a lot of things by themselves but excel at what they can do.

Following the latter idea is the Unix way of doing things. The true power comes from the combination of specialized tools which can yield mind-blowing results when used by an experienced user.

Featuritis?

There are quite a few programs which suffer from a strange illness which could be called “featuritis”. It often makes the host look handsome and appealing for many people. This illness is usually not deadly and often invisible for quite some time. But it does bear a very destructive aspect, too…

Two of the programs recently found infected are OpenSSL and BASH. The former kept so much legacy code in the project and even re-implemented things done better by others that it was impossible to have a good overview of the whole project code. The latter implements a lot of features which are hardly ever used by anybody and also uses some functions of its own which are arguably wasted code since there are better alternatives out there.

Both projects succeeded in being widely distributed, but they were read by few and understood by even fewer. And those few didn’t look at all the obscure parts of that unclear and confusing code. This is why severe bugs could exist for a very long time before anybody ever noticed them.

Probably the most important project where I diagnose a particularly intense form of featuritis is Systemd. It acts like an init system but absorbed the functionality of so many other programs by now that I’m getting dizzy thinking of it. Worse: A lot of people who have looked at it more than just a bit claim that it is badly designed and the code is rather unclean. Even worse: The developers of Systemd have had a conflict with Linus Torvalds because they broke things with their code and even refused to fix it insisting that it was not their problem! And the true tragedy is that it has spread to a great many Linux distros. Once a really bad bug is found concerning Systemd, this will probably take suffering for admins and users to a whole new level.

An exit strategy for the brave

My respect for the OpenBSD guys continues to grow the more I read about it. They claim to have a very secure OS and from what they do I can only say that they mean it. The LibreSSL fork or the SystemBSD project are just two examples that show how dead serious they are. A lot of people seem to ridicule them because there are not too many OpenBSD users out there when compared to Linux. That’s true of course. Their OS may also not be very familiar from a Linux user’s point of view and the OpenBSD guys may not be too friendly towards newbies. But they are nice enough to make their projects portable so that everybody can profit from them!

And in case you want to stick with Linux, there’s a great source for this platform as well. The guys over at suckless aim at creating programs “that suck less”. Go ahead and read a bit – especially the sucks and rocks pages! On the first one you’ll be flabbergasted at how bad the situation really is with a lot of programs. Yes, they are fundamentally broken – and their developers don’t care about that. Code correctness doesn’t pay off if you just want to target the masses. But if you want to do things right, it does.

Are there really people out there who care? You bet there are. Think about this topic again and try out a few alternatives. You might well find a real gem here and there – if you are able to look over some of the shortcomings compared to the well-known, featureful and bloated defaults.

Software licenses (pt. 1): A general introduction

This is obviously not about the EERIE distro or Arch:E5. The reason for that is simply that I didn’t succeed in getting everything working. And to be honest, I didn’t have much time to attempt it in the first place. My second child was born this month and I guess everybody would agree that family comes first. So here’s the first post of a series that I had in mind since well before Christmas. Time just goes by so damn fast!

I’ve been thinking about software licenses a bit and decided to write about it. It’s a rather special topic for sure. Many programmers like dealing with licenses just as much as they like to write documentation: Not at all. For that reason a lot of people who support open source software decide to take the easy way and simply GPL their code.

However, licensing is a very important matter and should be taken seriously. This article is meant to give a quick introduction – while avoiding the major problem of the whole topic: being boring for most people!

What is a license?

Ever read a typical Microsoft EULA (“End User License Agreement”)? No? Well, it’s not just you. Most people haven’t. Still, you probably should. Or at least read a bit of it. Even if you didn’t read the license, you’ve agreed to it if you’re using Microsoft software. And that means you are bound by its content – no matter if you’re in fact ignorant of it.

But what is a license? You can think of the license which comes with a program as a blueprint: it is a draft of what the author proposes to you. Simply put: if you accept it, it becomes a valid contract. Yes – accepting a license isn’t some negligible action. In doing so, you’re entering into an agreement which legally binds you.

In general, licenses contain various clauses which permit and prohibit certain actions. For example, you may be granted the right to install one copy of a program on one computer and use it. You’re usually forbidden to analyze the program by means of reverse engineering. There may also be additional requirements imposed on you: perhaps you have to pay a monthly fee to continue using some service, or you’re required to supply a valid address and keep it up to date.

Once you have understood that by accepting a license you’re signing a contract, you might no longer do so carelessly. After all, you’re legally accountable if you knowingly or unknowingly violate it.

There has been a steady development in proprietary licenses over the past decade: many of them are getting stricter and more intrusive all the time. There are probably some quite nasty passages in the next license you’re going to “agree” to. So you should probably care about it.

(Why) do licenses matter?

Software – like any other artificial thing – doesn’t emerge out of nowhere. It has one or more authors who wrote the program and dedicated time and effort to the project. Now it is absolutely understandable that the author gets to decide what he or she would like to do with it. The creator of some software project may decide to keep it private, try to sell it or give it away for free. And of course it’s entirely the author’s decision whether to open-source the project or to keep it closed.

Depending on the nature of the program, either path may seem like a good choice. The really important thing, however, is that each possible decision is perfectly legitimate. Whoever creates something gets to decide. So if a coder decides to give the source of his program away to the public – then this project is open-source and we can do all the good things with it, right? Wrong. Unfortunately.

Our coder may give the source of his or her program away for free but still remains the sole copyright holder. If the code is made available to you, that implies that you may take a look at it. Certainly a nice move by the author. But other than that you’re not actually allowed to do anything with it. It is the intellectual property of another person. So without permission you may not redistribute it, you are not allowed to modify it, nor may you in fact put it to actual use! Yes: in each of these cases you’d have to ask for permission first.

Let’s assume you cannot reach the author because he or she got a new email address and didn’t update the project page. Or the author simply doesn’t have the desire (or time) to answer mail asking for permission. Well, if you’re a coder yourself, the source code may help you get an idea of how to solve certain problems. But that’s pretty much it. If you need some functionality, you’ll have to repeat the work already done and reimplement it, because you’re not allowed to reuse the code despite it being publicly available.
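
To make the difference tangible, here’s a minimal sketch of what it looks like when an author does grant permission explicitly – in this case via a short SPDX license identifier in the file header. The file name, author, year and e-mail address below are made up purely for illustration; the point is only that without some explicit statement like this (or a LICENSE file shipped alongside the code), copyright law’s default of “all rights reserved” applies, no matter how publicly the source is hosted.

```c
/*
 * hello.c - a made-up example file (name, author and year are
 * purely illustrative).
 *
 * Copyright (c) 2015 Jane Doe <jane@example.org>
 *
 * SPDX-License-Identifier: ISC
 *
 * The SPDX tag above (or a full permission notice / LICENSE file in
 * the repository) is what actually grants rights to others. A file
 * published without any such statement remains "all rights reserved"
 * by default, even though everybody can read the source.
 */

#include <stdio.h>

int main(void)
{
    puts("Hello, licensing!");
    return 0;
}
```

A bare SPDX tag only works because it refers to a well-known license text, of course; pasting the full permission notice into the header or shipping it as a separate file achieves the same thing.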

So for these reasons the answer to the question in the headline can only be: Yes. Licensing does matter!

What’s next?

I currently have three things in the making: 1) the basic repo of a musl-based Arch-like Linux distro, 2) quite a few i686 packages for ArchBSD, and 3) an article about “Optimistic and pessimistic licenses”. And 4) I have no idea how much time I can spend in front of my screen in the coming weeks. 😉