Writing a daemon using FreeBSD and Python pt.3

Part 1 of this series covered Python fundamentals, signal handling and logging. We wrote an init script as well as a program that can be daemonized by daemon(8).

In the previous part we modified the program as well as the init script so that it can daemonize itself using the Python daemon module. I also covered a few topics that people totally new to programming (or Python) might want to know to better understand what’s happening.

Part 3 is about exploring a simple means of IPC (inter-process communication) by using named pipes.

Creating a named pipe

What is a named pipe – also known as a fifo (first in, first out)? It is a way of connecting two processes together, where one can sequentially send data and the other receives it in exactly the same order. It’s basically what we Unix lovers use on our command lines all the time when we pipe the output of one program into the input of another. E.g.:

ls | wc -l

In this case the output of ls is piped to wc, which will then print the number of lines to stdout (which could be used as input for another program with another pipe). This kind of pipe between two programs is usually short-lived. When the first program is done sending output and the second one has received all the data, the pipe goes away with the two processes. It also only exists between the two processes involved.

A named pipe in contrast is a bit more permanent and more flexible. It has a representation in the filesystem (which is why it’s a named pipe). One program creates a named pipe (usually in /var/run) and attaches to the receiving end of the pipe. Another process can then attach to the sending end and start putting data into it, which will then be received by the former. Named pipes have their own file type character (p) showing that a file is a named pipe; in the output of ls -l it looks like this:

prw-rw-r--
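
If you want to try this out before we get to the daemon code, here is a minimal sketch (the path /tmp/demo.pipe is just an arbitrary example) that creates a fifo from Python and verifies its file type:

import os, stat

PIPE_PATH = '/tmp/demo.pipe'        # arbitrary example location

os.mkfifo(PIPE_PATH)                # create the named pipe
mode = os.stat(PIPE_PATH).st_mode   # look at the file's mode bits
print(stat.S_ISFIFO(mode))          # True: the file really is a fifo
os.unlink(PIPE_PATH)                # clean up again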

Here’s what the next version of the code looks like:

#!/usr/local/bin/python3.6
 
 # Imports #
import daemon, daemon.pidfile, logging, os, signal, time
 
 # Globals #
IN_PIPE = '/var/run/bd_in.pipe'
 
 # Functions #
def handler_sigterm(signum, frame):
    logging.debug("Caught SIGTERM! Cleaning up...")
    if os.path.exists(IN_PIPE):
        try:
            os.unlink(IN_PIPE)
        except:
            raise
    logging.info("All done, terminating now.")
    exit(0)

def start_logging():
    try:
        logging.basicConfig(filename='/var/log/bdaemon.log', format='%(levelname)s:%(message)s', level=logging.DEBUG)
    except:
        print("Error: Could not create log file! Exiting...")
        exit(1)

def assert_no_pipe_exists():
    if os.path.exists(IN_PIPE):
        logging.critical("Cannot start: Pipe file \"" + IN_PIPE + "\" already exists!")
        exit(1)

def make_pipe():
    try:
        os.mkfifo(IN_PIPE)
    except:
        logging.critical("Cannot start: Creating pipe file \"" + IN_PIPE + "\" failed!")
        exit(1)
    logging.debug("Created pipe \"" + IN_PIPE)

 # Main #
with daemon.DaemonContext(pidfile=daemon.pidfile.TimeoutPIDLockFile("/var/run/bdaemon.pid"), umask=0o002):
    signal.signal(signal.SIGTERM, handler_sigterm)
    start_logging()
    assert_no_pipe_exists()
    make_pipe()

    logging.info("Baby Daemon started up and ready!")
    while True:
        time.sleep(1)

We’re using a new import here: os. It gives the programmer access to various OS-dependent functions (like os.mkfifo, which isn’t available on Windows, for example). I’ve also added a global definition for the location of the named pipe.

The next thing that you’ll notice is that the signal handler function got some new code. Before the daemon terminates it tries to clean up. If the named pipe exists the program will attempt to delete it. I’m not handling what could possibly go wrong here as this is just an example. That’s why in this case I just re-raise the exception and let the program error out.

Then we have a new “start_logging()” function that I put the logging stuff into to unclutter main. Except for that changed structure, there’s really nothing new here.

The next new function, “assert_no_pipe_exists()” should be fairly easy to read: It checks if a file by the name it wants to use is already present in the filesystem (be it as a leftover from an unclean exit or by chance from some other program). If it is found, the daemon aborts because it cannot really continue. If the filename is not taken, however, “make_pipe()” will attempt to create the named pipe.

The other thing that I did was move the main part out of a function and back into the program directly. And since we’re doing small incremental steps, that’s it for today’s step 1. Fire up the daemon using the init script and you should see that the named pipe was created in /var/run. Stop the process and the pipe should be gone.

Using the named pipe

Creating and removing the named pipe is a good first step, but now let’s use it! To do so we must first modify the daemon to attach to the receiving end of the pipe:

#!/usr/local/bin/python3.6
 
 # Imports #
import daemon, daemon.pidfile, errno, logging, os, signal, time
 
 # Globals #
IN_PIPE = '/var/run/bd_in.pipe'
 
 # Functions #
def handler_sigterm(signum, frame):
    try:
        os.close(inpipe)
    except:
        pass

    logging.debug("Caught SIGTERM! Cleaning up...")
    if os.path.exists(IN_PIPE):
        try:
            os.unlink(IN_PIPE)
        except:
            raise
    logging.info("All done, terminating now.")
    exit(0)

def start_logging():
    try:
        logging.basicConfig(filename='/var/log/bdaemon.log', format='%(levelname)s:%(message)s', level=logging.DEBUG)
    except:
        print("Error: Could not create log file! Exiting...")
        exit(1)

def assert_no_pipe_exists():
    if os.path.exists(IN_PIPE):
        logging.critical("Cannot start: Pipe file \"" + IN_PIPE + "\" already exists!")
        exit(1)

def make_pipe():
    try:
        os.mkfifo(IN_PIPE)
    except:
        logging.critical("Cannot start: Creating pipe file \"" + IN_PIPE + "\" failed!")
        exit(1)
    logging.debug("Created pipe \"" + IN_PIPE)

def read_from_pipe():
    try:
        buffer = os.read(inpipe, 255)
    except OSError as err:
        if err.errno == errno.EAGAIN or err.errno == errno.EWOULDBLOCK:
            buffer = None
        else:
            raise
 
    if buffer is None or len(buffer) == 0:
        logging.debug("Inpipe not ready.")
    else:
        logging.debug("Got data from the pipe: " + buffer.decode())
    
 # Main #
with daemon.DaemonContext(pidfile=daemon.pidfile.TimeoutPIDLockFile("/var/run/bdaemon.pid"), umask=0o002):
    signal.signal(signal.SIGTERM, handler_sigterm)
    start_logging()
    assert_no_pipe_exists()
    make_pipe()
    inpipe = os.open(IN_PIPE, os.O_RDONLY | os.O_NONBLOCK)
    logging.info("Baby Daemon started up and ready!")

    while True:
        time.sleep(5)
        read_from_pipe()

Apart from one more import, errno, we have three important changes here: the cleanup has been extended, there is a new function called “read_from_pipe()”, and main has been modified as well. We’ll take a look at the latter first.

There’s a ton of examples on named pipes on the net, but they usually use just one program that forks off a child process and then communicates over the pipe with that. That’s pretty simple to do and works nicely by just copying and pasting the example code in a file. But adapting it for our little daemon does not work: The daemon just seems to “hang” after first trying to read something from the pipe. What’s happening there?

By default, reads from the pipe are in blocking mode, which means that on the attempt to read, the system simply waits for data if there is none! The solution is to use non-blocking mode, which however means using the raw os.open function (which supports passing flags to the operating system) instead of the convenient Python open function with its nice file object.

So what does the line starting with “inpipe” do? It calls the os.open function and tells it to open the file whose location we defined in IN_PIPE. Then it passes the flags, so that the operating system knows how to open the file, in this case in read-only and in non-blocking mode. We need to open it read-only, because the daemon should be on the receiving side of the pipe. And, yes, we want non-blocking, so that the program continues if there is no data in the pipe instead of waiting for it all the time!

What might look a little strange to you is the | character between the two flags. Especially since on the terminal it’s known as the pipe character and we’re talking about pipes here, right? In this case, however, it’s something completely unrelated. That symbol just happens to be Python’s choice for representing the bit-wise OR operator, which combines the two flag values into one number. Let’s leave it at that (I’ll explain a bit more of it in a future “Python pieces” section, but this article will be long enough without it).
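
If you’re curious, here’s a tiny sketch of what the operator does with the two flags (the exact numeric values differ between operating systems, so treat the output as an example):

import os

print(os.O_RDONLY)                   # 0 on FreeBSD
print(os.O_NONBLOCK)                 # a single bit, e.g. 4 on FreeBSD
print(os.O_RDONLY | os.O_NONBLOCK)   # both flags combined into one number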

However, that’s still not all that the os.open line does. The os.open() function returns a file descriptor, which we assign to the inpipe variable to keep it around.

What’s left is a new infinite loop that calls read_from_pipe() every 5 seconds.

Speaking of that function, let’s take a closer look at what it does. It tries to use the os.read function to read up to 255 bytes from the pipe into the variable named buffer. We’re doing so in a try/except block, because the read is somewhat likely to fail (e.g. if the pipe is empty). When there’s an exception, the code checks for the exact error that happened, and if it’s EAGAIN or EWOULDBLOCK, we deliberately empty the buffer. If some other error occurred, it’s something that we didn’t expect, so we take the straight way out by re-raising the exception and crashing the program.

On FreeBSD the error numbers are defined in /usr/include/errno.h. If you take a look at it, you see that EAGAIN and EWOULDBLOCK are the same thing, so checking for one of them would be enough. But it makes sense to know that on some systems these are separate errors and that it’s good practice to check for both.
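
You can check that from Python itself; on FreeBSD this one-liner should print True:

import errno
print(errno.EAGAIN == errno.EWOULDBLOCK)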

If the buffer either has the None value or has a length of 0, we assume that there was nothing to read. Otherwise we put the data into the log. To make it readable we have to use decode, because os.read hands us raw bytes rather than a string.
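
Here’s a quick illustration of that difference between bytes and strings:

data = b"21"           # what os.read gives us: raw bytes
print(data)            # b'21'
print(data.decode())   # 21 - a regular string again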

All that’s left is the cleanup function. I’ve added another try/except block that simply tries to close the pipe file before trying to delete it. This is example code, so to make things not even more complex, I just silently ignore if the attempt fails.

Control script

Ok, great! That was quite a bit of things to cover, but now we have a daemon that creates a pipe and tries to read data from it. There’s just one problem: How can we test it? By creating another, separate program, that puts data in the pipe of course! For that let’s create another file with the name bdaemonctl.py:

#!/usr/local/bin/python3.6

 # Imports #
import os, time

 # Globals #
OUT_PIPE = '/var/run/bd_in.pipe'

 # Main #
try:
    outpipe = os.open(OUT_PIPE, os.O_WRONLY)
except:
    raise

for i in range(0, 21):
    print(i)
    try:
        os.write(outpipe, bytes(str(i).encode('utf-8')))
    except BrokenPipeError:
        print("Pipe has disappeared, exiting!")
        os.close(outpipe)
        exit(1)
    time.sleep(3)
os.close(outpipe)

Fortunately this one is fairly simple. We do our imports and define a variable for the pipe. We could skip the latter since it’s only used in one place, but in general it’s a good idea to keep it: hiding things deep in the code is not such a smart move, and defining them at the top of the file increases the maintainability of your code a lot. And since we want to send data this time, we name our variable OUT_PIPE accordingly.

In the main section we just try to open the pipe file and crash if that doesn’t work. It’s pretty obvious that such a case (e.g. the pipe is not there because the daemon is not running) should be better handled. But I wanted to keep things simple here because it’s just an example after all.

Then we have a loop that counts from 0 to 20, outputs the current number to stdout and tries to also send the data down the pipe. If that works, the program waits three seconds and then continues the loop.

To be able to write to the pipe we need a byte stream, but we only have numbers. So we first convert each number to a string and then encode it (as UTF-8) into bytes that can be sent over the pipe.
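
Interactively the conversion looks like this (note that str.encode already returns bytes, so the extra bytes() wrapper in the script above is not strictly necessary):

i = 7
print(str(i))                   # '7' - a string
print(str(i).encode('utf-8'))   # b'7' - bytes, ready for os.write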

When the loop is over, we close the pipe file properly because we as the sender are done with it. I added a little bit of code to handle the case when the daemon exits while the control script runs and still tries to send data over the pipe. This results in a “broken pipe” error. If that happens, we just print an error message, close the file (to not leak the file descriptor) and exit with an error code of 1.

So for today we’re done! We can now send data from a control program to the daemon and thus have achieved uni-directional communication between two processes.

What’s next?

I’ll take a break from these programming-related posts and write about something else next.

However I plan to continue with a 4th part later which will cover argument parsing. With that we could e.g. modify our control program to send arbitrary data to the daemon from the command line – which would of course be much more useful than the simple test case that we have right now.

Writing a daemon using FreeBSD and Python pt.2

The previous part of this series left off with a running “baby daemon” example. It covered Python fundamentals, signal handling, logging as well as an init script to start the daemon.

Daemonization with Python

The outcome of part 1 was a program that needed external help to actually be daemonized. I used FreeBSD’s handy daemon(8) utility to put the program into the background, to handle the pidfile, etc. Now we’re taking one step forward and trying to achieve the same thing using just Python.

To do that, we need a module that is not part of Python’s standard library. So you might need to first install the package py36-daemon if you don’t already have it on your system. Here’s a small piece of code for you – but don’t get fooled by the line count, there are actually a lot of things going on here (and a number of concepts to grasp):

#!/usr/local/bin/python3.6
 
 # Imports #
import daemon, daemon.pidfile
import logging
import signal
import time

 # Functions #
def handler_sigterm(signum, frame):
    logging.debug("Exiting on SIGTERM")
    exit(0)

def main_program():
    signal.signal(signal.SIGTERM, handler_sigterm)
    try:
        logging.basicConfig(filename='/var/log/bdaemon.log', format='%(levelname)s:%(message)s', level=logging.DEBUG)
    except:
        print("Error: Could not create log file! Exiting...")
        exit(1)
 
    logging.info("Started!")
    while True:
        time.sleep(1)

 # Main #
with daemon.DaemonContext(pidfile=daemon.pidfile.TimeoutPIDLockFile("/var/run/bdaemon.pid"), umask=0o002):
    main_program()

I dropped some ballast from the previous version; e.g. overriding SIGINT was a nice thing to try out once, but it’s not useful as we move on. Also that countdown is gone. Now the daemon continues running until it’s signaled to terminate (thanks to what is called an “infinite loop”).

We have two new imports here that we need for the daemonization. As you can see, it is possible to import multiple modules in one line. For readability reasons I wouldn’t recommend it in general. I only do it when I import multiple modules that kind of belong together anyway. However in the coming examples I might just put everything together to save some lines.

The first more interesting thing with this version is that the main program was moved to a function called “main_program”. We could have done that before if we really wanted to, but I did it now so the code doesn’t take attention away from the primary beast of this example. Take a look at the line that starts with the with keyword. Now that’s a mouthful, isn’t it? Let’s break this one up into a couple of pieces so that it’s easier to chew, shall we?

The value for umask looks a bit strange. It contains an “o” among the numbers, so it has to be a string, doesn’t it? But why is it written without quotes then? Well, it is a number. Python uses the “0o” prefix to denote octal (base-8) numbers, and 0x would mean hexadecimal (base-16) ones.
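
A couple of quick examples of these numeric literals (pure illustration, nothing daemon-specific):

print(0o002)   # 2   - octal literal, same value as plain 2
print(0o755)   # 493 - the classic permission mask, shown in decimal
print(0x10)    # 16  - hexadecimal literal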

Remember that we talked about try/except before (for the logging)? You can expand on that. A try block can not only have except blocks, it can also have a finally block. Statements in such a block are meant to be executed no matter the outcome of the try block. The classical example is that when you open a file, you definitely want to close it again (everything else is a total mess and would make your program an exceptionally bad one).

Closing it when you are done is simple. But what if an exception is raised? Then the code path that properly closes the file might never be reached! You could close the file in every thinkable scenario – but that would be both tedious and error-prone. For that reason there’s another way to handle those cases: Close the file in the finally block and you can be sure that it will be closed regardless of what happens in the try or in any except block.
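
As a small generic sketch (the filename is just an example), the pattern looks like this:

f = open("/tmp/example.txt", "w")   # hypothetical example file
try:
    f.write("some data\n")          # might raise, e.g. if the disk is full
finally:
    f.close()                       # runs whether or not an exception occurred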

Ok, but what does this have to do with our little daemon? Actually a lot. That case of try/finally has been so common that Python provides a shortcut with so-called context managers. They are objects that manage a resource for you like this: You request it, it is valid only inside one block (the with one!) and when the block ends, the context manager takes care of properly cleaning up for you without having you add any extra code (or even without you knowing, if you just copy/paste code from the net without reading explanations like this).

So the with statement in our code above lets Python handle the daemonization process while the main_program function is running. When it ends on the signal, Python cleans up everything and the process terminates – which is great for us. Accept that for now and live with the fact that you might not know just how it does that. We’ll come back to things like that.
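
The file example from above is the classic illustration: file objects are themselves context managers, so the try/finally boilerplate disappears while the file is still guaranteed to be closed when the block ends:

with open("/tmp/example.txt", "w") as f:   # same hypothetical file as above
    f.write("some data\n")
# at this point the file has been closed for us, exception or not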

Updated init script

Ok, the one thing left to do here is making the required changes to the init script. We are no longer using the daemon(8) utility, so we need to adjust it. Here is the new one:

#!/bin/sh

. /etc/rc.subr

name=bdaemon
rcvar=bdaemon_enable

command="/root/bdaemon.py"
command_interpreter=/usr/local/bin/python3.6
pidfile="/var/run/${name}.pid"

load_rc_config $name
run_rc_command "$1"

Not too much changed here, but let’s still go over what has. The command definition is pretty obvious: The program can now daemonize itself, so we call it directly. It doesn’t take any arguments, which means we can drop command_args.

However we need to add command_interpreter instead (one important thing that I had overlooked at first), because the program will look like this in the process list:

/usr/local/bin/python3.6 /root/bdaemon.py

Without defining the interpreter, the init system would not recognize this process as being the correct one. Then we also need to point it to the pidfile, because in theory there could be multiple matching processes otherwise.

And that’s it! Now we have a daemon process running on FreeBSD, written in pure Python.

Python pieces

This next part is a completely optional excursion for people who are pretty new to programming. We’ll take a step back and discuss concepts like functions and arguments, modules, as well as namespaces. This should help you better understand what’s happening here, if you like to know more. Feel free to save some time and skip the excursion if you are familiar with those things.

Functions and arguments

As you’ve seen, functions are defined in Python by using the def keyword, the function name and – at the very least – an empty pair of parentheses. Inside the parentheses you could put one or more arguments if needed:

def greet(name):
    print("Hi, " + name + "!")

greet("Alice")
greet("Bob")

Here we’re passing a string to the function that it uses to greet that person. We can add a second argument like this:

def greet(name, phrase):
    print("Hi, " + name + "! " + phrase)

greet("Alice", "Great to see you again!")
greet("Bob", "How are you doing?")

The arguments used here are called positional arguments, because their position decides what goes where. Swap them when calling the function and the output will obviously be garbage, as the strings are assigned to the wrong function variable. However it’s also possible to refer to the variables by name, so that the order no longer matters:

def greet(name, phrase):
    print("Hi, " + name + "! " + phrase)

greet(phrase="Great to see you again!", name="Alice")
greet("Bob", "How are you doing?")

This is what is used to assign the values for the daemon context. Technically it’s possible to mix the ways of calling (as done here), but that’s a bit ugly.

We’re not using it, yet, but it’s good to know that it exists: There are also default values. Those mean that you can leave out some arguments when calling a function – if you are ok with the default value.

def greet(name, phrase = "Pleased to meet you."):
    print("Hi, " + name + "! " + phrase)

greet(phrase="Great to see you again!", name="Alice")
greet("Bob", "How are you doing?")
greet("Carol")

And then there’s something known as function overloading in many other languages: multiple functions with the same name but a different number (or different types) of arguments, so that it’s still possible to identify precisely which one needs to be called. Python doesn’t support this directly – a later def with the same name simply replaces the earlier one – but default values and keyword arguments cover most of the same use cases.
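
Here’s a tiny sketch of that replacement behavior:

def greet(name):
    print("Hi, " + name + "!")

def greet(name, phrase):   # this definition replaces the one above
    print("Hi, " + name + "! " + phrase)

greet("Alice", "Nice to see you!")
# greet("Bob")             # would now fail: the 'phrase' argument is missing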

Modules

When reading about Python it usually won’t take too long before you come across the word module. But what’s a module? Luckily that’s rather easy to explain: It’s a file with the .py extension and with Python code in it. So if you’ve been following this daemon tutorial, you’ve been creating Python modules all the way!

Usually modules are what you might want to refer as to libraries in other languages. You can import them and they provide you with additional functions. You can either use modules that come with Python by default (that collection of modules is known as the standard library, so don’t get confused by the terminology there), additional third-party modules (there are probably millions) or modules that you wrote yourself.

It’s fairly easy to do the latter. Let’s pick up the previous example and put the following into a file called “greeter.py”:

forgot_name = "Sorry, what was your name again?"

def greet(name, phrase = "Pleased to meet you."):
    print("Hi, " + name + "! " + phrase)

Now you can do this in another Python program:

import greeter

greeter.greet("Carol")
print(greeter.forgot_name)

This shows that after importing we can use the “greet()” function in this program, even though it’s defined elsewhere. We can also access variables used in the imported module (greeter.forgot_name in this case).

Namespaces

Ever wondered what that dot means (when it’s not used in a filename)? You can think of it as a hierarchical separator. The standard Python functions (e.g. print) are available in the global namespace and can thus be used directly. Others live in a different namespace, and to use them it’s necessary to refer to that namespace as well as the function name so that Python understands what you want and finds the function. One example that we’ve used is time.sleep().

Where does this additional namespace come from? Well, remember that we did import time at the top of the program? That created the “time” namespace (and made the functions from the time module available there).

There’s another way of importing; we could import either everything (using an asterisk (*) character, but that’s considered poor coding) or just specific functions from one module into the global namespace:

from time import sleep
sleep(2)
exit(0)

This code will work because the “from MODULE import FUNCTION” statement in this example imported the sleep function so that it becomes available in the global namespace.

So why do we go through all the hassle of having multiple namespaces in the first place? Can’t we just put everything in the global one? Sure, we could – and for simple programs that’s in fact an option. But consider the following case: Python provides the built-in open function. It’s used to open a file and get a nice object back that makes accessing or manipulating data really easy. But then there’s also os.open, which is not as friendly but lets you use more advanced things, since it exposes the raw operating system functionality. See the problem?

If you import the functions from os into the global namespace, you have a name clash in the case of open. This is not an error, mind you. You can actually do that, but you should know what happens. The function imported later will override the one that went by that name previously, effectively making the original one inaccessible. This is called “shadowing” of the original function.
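
A short sketch of such a clash (the builtins module is only used here to show that the name no longer points at the built-in function):

import builtins, os
from os import open            # shadows the built-in open in this module

print(open is builtins.open)   # False - the name now refers to os.open
print(open is os.open)         # True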

To avoid problems like this it’s often better to have your own separate namespace where you can be sure that no clashes happen.

What’s next?

In the next part we’ll take a look at implementing IPC (inter-process communication) using named pipes (a.k.a “fifos”).

Writing a daemon using FreeBSD and Python pt.1

Being a sysadmin by profession, I don’t code. At least not often enough, or with output of high enough quality, that programmers would accept to call it coding. I do write and maintain shell scripts. I also write new formulas for configuration management with SaltStack.

The latter is Python-based and after hearing mostly good things about that language, I’ve been trying to do some simple things with it for a while now. And guess what: It’s just so much more convenient compared to using shell code! I’ll definitely keep doing some simple tasks in Python, just to get some experience with it.

Not too long ago I thought about a little project that I’d try to do and decided to go with Python again. Thinking about what the program should do, I figured that a daemon would make a nice fit for it. But how do you write a daemon? Fortunately it’s especially easy on FreeBSD. So let’s go!

Python

The first thing that I did, was to create a new file called bdaemon.py (for “baby daemon”) and use chmod to make it executable. And here’s what I put into it as a first test:

#!/usr/local/bin/python3.6

 # Imports #
import time

 # Globals #
TTL_SECONDS = 30
TTL_CHECK_INTERVAL = 5

 # Functions #

 # Main #
print("Started!")
for i in range(1, TTL_SECONDS + 1):
    time.sleep(1)
    if i % TTL_CHECK_INTERVAL == 0:
        print("Running for " + str(i) + " seconds...")
print("TTL reached, terminating!")
exit(0)

This very simple program has the shebang line that points the operating system to the right interpreter. Then I import Python’s time module which gives me access to a lot of time-related functions. Next I define two global variables that control how long the program runs and in which interval it will give output.

The main part of the program first outputs a starting message on the terminal. It then enters a for loop, that counts from 1 to 30. In Python you do this by providing a list of values after the in keyword. Counting to 5 could have been written as for i in [1, 2, 3, 4, 5]: for example.

With range we can have Python create a list of sequential numeric values on the fly – and since it’s much less to type (and allows for dynamic list creation by setting the final number via a variable), I chose to go with that. Oh, BTW: In Python the last value of those ranges is exclusive, not inclusive. This means that range(1, 5) leads to [1, 2, 3, 4] – if you want the 5 included in the list, you have to use range(1, 6)! That’s why I add 1 to the TTL_SECONDS variable.
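
You can see this behavior interactively (list just turns the range into something printable):

print(list(range(1, 5)))   # [1, 2, 3, 4] - the end value is excluded
print(list(range(1, 6)))   # [1, 2, 3, 4, 5]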

I use time.sleep to create a delay in the loop block. Then I do a check if the remainder of the division of the current running time by the defined check interval is zero (% is the modulus operator which gives that remainder value of the division). If it is, the program creates more output.

Mind the indentation: In Python it is used to create code blocks. The for statement is not indented, but it ends with a colon. That means that it’s starting a code block. Everything up to (but not including) the second to last print statement is indented by four spaces and thus part of the code block. Said print statement is indented two levels (8 spaces) – that’s because it’s another block of its own, started by the if statement before it. We could create a third, fourth and so on level of indentation if we required other blocks beneath the if block.

Eventually the program will print that the TTL has been reached and exit the program with an error code of 0 (which means that there was no error).

Have you noticed the str(i) part in one of the print statements? That is required because the counter variable “i” holds numeric values and we’re printing data of a different type. So to be able to concatenate (that’s what the plus sign is doing in this case!) the variable’s contents to the rest of the data, it needs to match its type. We’re achieving this by doing a conversion to a string (think converting the number 5 to the literal “5” that can be part of a line of text where it looks similar but is actually a different thing).

Oh, and the pound signs are used to start comments that are ignored by Python. And that’s already it for some fundamental Python basics. Hopefully enough to understand this little example code (if not, tell me!).

Signals

The next thing to explore is signal handling. Since a daemon is essentially a program running in the background, we need a way to tell it to quit for example. This is usually done by using signals. You can send some of them to normal programs running in the terminal by hitting key combinations, while all of them can be sent by the kill command.

If you press CTRL-C for example, you’re sending SIGINT to the currently running application, telling it “abort operation”. A somewhat similar one is SIGTERM, which kind of means “hey, please quit”. It’s a graceful shutdown signal, allowing the program to e.g. do some cleanup and then shut down properly.

If you use kill -9, however, you’re sending SIGKILL, the ungraceful shutdown signal, that effectively means “die!” for the process targeted (if you’ve ever done that to a live database or another touchy application, you know that you really have to think before using it – or you might be in for all kinds of pain for the next few hours).

#!/usr/local/bin/python3.6

 # Imports #
import signal
import time

 # Globals #
TTL_SECONDS = 30
TTL_CHECK_INTERVAL = 5

 # Functions #
def signal_handler(signum, frame):
    print("Received signal" + str(signum) + "!")
    if signum == 2:
        exit(0)

 # Main #
signal.signal(signal.SIGHUP, signal_handler)
signal.signal(signal.SIGINT, signal_handler)
signal.signal(signal.SIGQUIT, signal_handler)
signal.signal(signal.SIGILL, signal_handler)
signal.signal(signal.SIGTRAP, signal_handler)
signal.signal(signal.SIGABRT, signal_handler)
signal.signal(signal.SIGEMT, signal_handler)
#signal.signal(signal.SIGKILL, signal_handler)
signal.signal(signal.SIGSEGV, signal_handler)
signal.signal(signal.SIGSYS, signal_handler)
signal.signal(signal.SIGPIPE, signal_handler)
signal.signal(signal.SIGALRM, signal_handler)
signal.signal(signal.SIGTERM, signal_handler)
#signal.signal(signal.SIGSTOP, signal_handler)
signal.signal(signal.SIGTSTP, signal_handler)
signal.signal(signal.SIGCONT, signal_handler)
signal.signal(signal.SIGCHLD, signal_handler)
signal.signal(signal.SIGTTIN, signal_handler)
signal.signal(signal.SIGTTOU, signal_handler)
signal.signal(signal.SIGIO, signal_handler)
signal.signal(signal.SIGXCPU, signal_handler)
signal.signal(signal.SIGXFSZ, signal_handler)
signal.signal(signal.SIGVTALRM, signal_handler)
signal.signal(signal.SIGPROF, signal_handler)
signal.signal(signal.SIGWINCH, signal_handler)
signal.signal(signal.SIGINFO, signal_handler)
signal.signal(signal.SIGUSR1, signal_handler)
signal.signal(signal.SIGUSR2, signal_handler)
#signal.signal(signal.SIGTHR, signal_handler)

print("Started!")
for i in range(1, TTL_SECONDS + 1):
    time.sleep(1)
    if i % TTL_CHECK_INTERVAL == 0:
        print("Running for " + str(i) + " seconds...")
print("TTL reached, terminating!")
exit(0)

For this little example code I’ve added a function called “signal_handler” – because that’s what it is for. And in the main program I installed that signal handler for quite a lot of signals. To be able to do that, I needed to import the signal module, of course.

If this program is run, it will handle every signal you can send on a FreeBSD system (run kill -l to list all available signals on a Unix-like operating system). Why are some of those commented out? Well, try commenting those lines in! Python will complain and stop your program. This is because not all signals are allowed to be handled.

SIGKILL for example is by its nature something that you don’t want to allow to be overridden with custom behavior! While your program can choose to handle e.g. SIGINT and even ignore it, SIGKILL means that the process absolutely needs to be shut down immediately.

Try running the program and send some signals while it’s running. On BSD systems you can e.g. press CTRL-T to send SIGINFO. The operating system prints some information about the current load. And then the program has the chance to output some additional information (some programs will tell you what file they currently process, how many percent of a copy they have finished, etc.). If you send SIGINT, this program terminates as it should.

Logging

There’s another thing that we have to consider when dealing with processes running in the background: A daemon detaches from the TTY. That means it can no longer receive input the usual way from STDIN. But we investigated signals so that’s fine. However it also means a daemon cannot use STDOUT or STDERR to print anything to the terminal.

Where does the data go that a daemon writes to e.g. STDOUT? It goes to the system log. If no special configuration for it exists, you will find it in /var/log/messages. Since we expect quite a bit of debug output during the development phase, we don’t really want to clutter /var/log/messages with all of that. So to write a well-behaving little daemon, there’s one more topic that we have to look into: Logging.

#!/usr/local/bin/python3.6

 # Imports #
import logging
import signal
import time

 # Globals #
TTL_SECONDS = 30
TTL_CHECK_INTERVAL = 5

 # Functions #
def handler_sigterm(signum, frame):
    logging.debug("Exiting on SIGTERM")
    exit(0)

def handler_sigint(signum, frame):
    logging.debug("Not going to quit, there you have it!")

 # Main #
signal.signal(signal.SIGINT, handler_sigint)
signal.signal(signal.SIGTERM, handler_sigterm)
try:
    logging.basicConfig(filename='bdaemon.log', format='%(levelname)s:%(message)s', level=logging.DEBUG)
except:
    print("Error: Could not create log file! Exiting...")
    exit(1)

logging.info("Started!")
for i in range(1, TTL_SECONDS + 1):
    time.sleep(1)
    if i % TTL_CHECK_INTERVAL == 0:
        logging.info("Running for " + str(i) + " seconds...")
logging.info("TTL reached, terminating!")
exit(0)

The code has been simplified a bit: Now it installs only handlers for two signals – and we’re using two different handler functions. One overrides the default behavior of SIGINT with a dummy function, effectively refusing the expected behavior for testing purposes. The other one handles SIGTERM in the way it should. If you are fast enough on another terminal window, you can figure out the PID of the running program and then kill -15 it.

Logging with Python is extremely simple: You import the module for it, call a function like logging.basicConfig – and start logging. This line sets the filename of the log to “bdaemon.log” (for “baby daemon”) in the current directory. It changes the default format to displaying just the log level and the actual message. And then it defines the lowest level that should be logged.

There are various pre-defined levels like debug, info, warning, critical, etc. But what’s that try and except thing? Well, the logging module will attempt to create a logfile (or append to it, if it already exists). This is an operation that could fail. Perhaps we’re running the program in a directory where we don’t have the permission to create the log file? Or maybe for whatever reason a directory of that name exists? In both cases Python cannot create the file and an error occurs.

If such a thing happens, Python doesn’t know what to do. It knows what the programmer wanted to do, but has no clue on what to do if things fail. Does it make sense to keep the program running if something unexpected happened? Probably not. So it throws an exception. If an unhandled exception occurs, the program aborts. But we can catch the exception.

By putting the function that opens the file in a try block, we’re telling Python that we’re expecting it could fail. And with except we can catch an exception and handle expected problems. There are a lot of exception types; by not specifying any, we’re catching all of them. That might not be the best idea, because maybe something else happened and we’re just expecting that the logfile could not be created. But let’s keep it simple for now.
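
If we did want to be more precise, we could name the error we actually expect: failures while creating the log file surface as OSError (or one of its subclasses such as PermissionError), so a stricter version might look roughly like this:

import logging

try:
    logging.basicConfig(filename='bdaemon.log', format='%(levelname)s:%(message)s', level=logging.DEBUG)
except OSError:
    # only catches the file-related errors we expect (permissions, bad path, ...)
    print("Error: Could not create log file! Exiting...")
    exit(1)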

The one remaining thing to do is to change any print statements so that we’re using logging instead. Depending on how important the log entry is, we can also use different levels, from least important (DEBUG) to most important (CRITICAL).
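
The logging module provides one convenience function per level, so switching over from print is mostly a matter of picking the right one (assuming logging has been configured as above):

import logging

logging.debug("Details that only matter during development")
logging.info("Normal operational message")
logging.warning("Something looks odd")
logging.error("Something went wrong")
logging.critical("Cannot continue!")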

You can either wait for the program to finish and then take a look at the log, or you open a second terminal and tail -f bdaemon.log there to watch output as the program is running.

Alright! With this we have everything required to daemonize the program next. Let’s write a little init script for it, shall we?

Init

Init scripts are used to control daemons (start and stop them, tell them to reload the configuration, etc.). There are various init systems in use across Unix-like operating systems. FreeBSD uses the standard BSD init system called rc.d. It works with little (or not so little, if you need to manage very complex daemons) shell scripts.

Since a lot of the functionality of the init system is the same across most of these scripts, rc.d handles all the common cases in shell files of its own, which are then used in each of the scripts. In Python this would be done by importing a module; the term in shell scripting is to source another shell script (or fragment).

Create the file /usr/local/etc/rc.d/bdaemon with the following contents:

#!/bin/sh

. /etc/rc.subr

name=bdaemon
rcvar=bdaemon_enable

command="/usr/sbin/daemon"
command_args="-p /var/run/${name}.pid /path/to/script/bdaemon.py"

load_rc_config $name
run_rc_command "$1"

Yes, you need root privileges to do that. Daemons are system services and so we’re messing with the system now (totally at beginner level, though). Save the file and you should be able to start the program as a daemon e.g. by running service bdaemon onestart!

How’s that? What does that all mean and where does the daemonization happen? Well, the first line after the shebang sources the main rc fragment with all the required functions (read the dot as “source”). Then it defines a name for the daemon and an rcvar.

What is an rcvar? Well, by putting “bdaemon_enable=YES” into your /etc/rc.conf you could enable this daemon for automatic startup when the system is coming up. If that line is not present there, the daemon will not start. That’s why we need to use “onestart” to start it anyway (try it without the “one” if you’ve never done that and see what happens!).

Then the command to run as well as the arguments for that command are defined. And eventually two helper functions from rc.subr are called which do all the actual complex magic that they thankfully hide from us!

Ok, but what is /usr/sbin/daemon? Well, FreeBSD comes with an extremely useful little utility that handles the daemonization process for others! This means it can help you if you want to use something as a background service but you don’t want to handle the actual daemonization yourself. Which is perfect in our case! With it you could even write a daemon in shell script for example.

The “-p” argument tells the daemon utility to handle the PID file for the process as well. This is required for the init system to control the daemon. While our little example program is short-lived, we can still do something while it runs. Try out service bdaemon onestatus and service bdaemon onestop on it for example. If there was no PID file present, the init system would claim that the daemon is not running, even if it is! And it would not be able to shut it down.

There we go. Our first FreeBSD daemon process written in Python! One last thing that you should do is change the filename for the logfile to use an absolute path like /var/log/bdaemon.log. If you want to read more about the daemon utility, read its manpage, daemon(8). And should you be curious about what the init system can do, have a look here.

What’s next?

While using /usr/sbin/daemon is perfectly fine, you might feel that we kind of cheated. So next time we’ll take a brief look at daemonizing with Python directly.

I also want to explore IPC (“inter-process communication”) with named pipes. This will allow for a slightly more advanced daemon that can be interacted with using a separate program.

Illumos (v9os) on SPARC64 SunFire v100

Over the last month or so I’ve written a couple of articles on an old SunFire v100 machine that I have owned for a while now. First I took a look at the hardware of the machine and the LOM (Lights Out Management). Then I installed OpenBSD 6.0 from CD and updated all the way to 6.5. Finally I played a bit with OpenBSD to see what it can do and how well it supports SPARC64. This post will be the last SPARC64 one before I visit other topics again.

v9os?

While I was pretty happy with OpenBSD on the SunFire, there’s one reason that I wanted to try out something else, too. That reason has three letters: Z-F-S. The first thing that I tried out when I got the hardware was FreeBSD – but I ran into problems. I managed to circumvent them (might be worth another story in the future), only to find that FreeBSD does not support ZFS on SPARC64!

One option that suggests itself is just putting Solaris on there. I have a copy of Solaris 10 for SPARC, but I prefer to keep things open-source. Also there’s the problem that my machine is old enough to not have a DVD drive, and it doesn’t support booting from USB and the like.

So it’s illumos. Since I’m really just getting started with the broader Solaris universe, I had to do a little research first. And I was a little surprised that most illumos distros seem to not even support SPARC at all! Of the four that do:

  • OpenSXCE seems dead (last release in 2014)
  • DilOS uses Debian packaging (which is not my cup of tea at all)
  • Tribblix sounds really interesting to me, but does not fit on a CD
  • v9os is a minimal Sparc distro that is small enough

As you can see, there wasn’t so much choice after all! While v9os is an experimental one-man project that you should probably stay away from for production use, it might be just right for my purposes of tinkering with an old machine.

Installing the OS – first try

There are not many preparations necessary: I downloaded the ISO image and burned it on a CD. Then I connected to my SunFire via serial, powered it on and put the CD into the drive. It takes quite some time, but after a while I can read that v9os is in fact starting.

Booting up v9os from the CD

After the system booted, it gives the user the option to select a keymap.

Keymap selection

Then it shows the installation menu. There you can choose if you want to install, load additional drivers, drop to a shell, change the terminal type or reboot. I go with the first option.

v9os installation menu

After a moment the installer has started and a welcome screen is printed. Unfortunately in my case there’s a problem with the CD, so four lines of debug info overwrite important information: how to actually proceed with the installation! But this is an OpenSolaris derivative, and so it’s not that hard to figure out that F2 is the key to go on.

v9os installer: Welcome screen

Next it’s selecting the disk to install on. I thought that it all looked good – and didn’t pay much attention to the message “A VTOC label was not found.”. VTOC is the Volume Table Of Contents, the SPARC partition scheme (think MBR/GPT on amd64). We’ll come back to that a little later. 😉

v9os installer: Disk selection

I think that the installer is quite nice. It even offers help pages that give newcomers like me an idea of what they should do for the current step. Great work on that!

v9os installer: Disks help page

Then you can choose to either dedicate the whole disk to v9os or just use a slice. I decide to go the easy route and select the former.

v9os installer: Disk layout selection

Now the installer wants to know the hostname for the new system. The suggested default of v9os is fine for me since I don’t plan to add another machine with that OS to my network anytime soon.

v9os installer: Hostname selection

Finally you can select the time zone – or rather: the zone region.

v9os installer: Time zone selection

Unfortunately things went sideways after that choice and I had to reset the machine…

Ok, after going through the previous steps again, I decided to give the advanced setup a try and selected slicing up the drive.

v9os installer: Slice selection

Unfortunately the result was the same as before: The installer just died. I tried again a few times, playing with different slice setups, but didn’t have any luck.

The installer died… Time to reboot.

At this point I was out of ideas on what else I could try, so I removed the CD and powered down the system.

Writing the label manually

When I powered the system on again, I had forgotten that I removed the CD and to my surprise OpenBSD (the system that I had previously installed on the machine) booted up! This meant that the installer had not even changed anything on the disk, yet!

My next guess was (and still is) that the v9os installer might have problems with BSD disklabels being present on the drive. I took a look at the disklabel from OpenBSD, just to find out some information about the drive.

OpenBSD’s disklabel information of the system hard drive

Then I booted the v9os install medium again but this time selected the shell option. After a little research I found out how to get some drive information on Solaris with iostat.

v9os shell session: Collecting drive hardware info

Next I decided to give the format utility a try. I don’t know whether v9os stripped out some hardware information or whether the disk is simply so old that it wasn’t properly auto-detected. Either way, I had to do something that I haven’t done in years (and never missed): typing in the geometry information by hand!

Typing in disk geometry information (Ah, the (bad!) memories…)

Once the drive has been described to the utility, it shows a menu of what it can do. I haven’t used that program before and judging from the name alone was a bit surprised at how powerful it seems to be. Things like being able to define profiles must have been pretty useful in the past.

Solaris’ format utility

Since I want to partition the drive, I select that. I’m presented with a sub-menu, giving me some more choices.

Partitioning menu of format

I have no clue what a Solaris partitioning scheme should look like (I need to explore some older versions of that OS at some point!).

Partitioning the drive for Solaris

So I look around a little but eventually accept the proposed default and just hope that this works.

Installing the OS – second try

After restarting the machine again and choosing the installer, it looks like this time there is no missing disklabel. At least! But will it make a difference?

Returning to the installer: Partitioning was detected

And yes! Now the installer continues and gets the data written to disk!

Finally installing the OS!

The process takes quite a while – but that’s due to the slow machine that I’m using. Eventually the installation is finished.

v9os installer: All done!

First steps with v9os

Another reboot and after removing the CD-ROM from the drive, the freshly installed system boots up. A moment later it displays the prompt where I can log in using the user root and the password solaris.

First start of v9os

The first thing that I want to do is to get rid of the serial console. So I set up networking and enable SSH.

Setting up networking and enabling SSH

Then I disable the automounter to make the home directory writable and create a user for remote SSH login. Finally I enable the machine to do name resolution and give the new user a password.

Adding a user and name resolution capabilities

That should suffice to SSH into the box from another machine.

Package management with IPS

Logging in remotely works just fine. As v9os does not have an online package repository, I have to download a compressed copy of the repository from SourceForge.

SSHing into the v9os box and downloading the package repository

I don’t know much about the IPS package system and thus really struggle to make it all work. There is no guide on the v9os site and so I try to put the downloaded file in various locations, decompress it and try everything again. Since that also doesn’t work, I unpack the contents of the archive but still cannot get it right…

Struggling to get the repo working…

After more than an hour of struggling with pkg, reading manpages, doing online research and trying to fit everything together, I finally manage to remove the default publisher that comes with the system and add a new one that eventually works!

Finally figured out how to deal with IPS publishers

The v9os operating system is one of the strangest Unices that I’ve ever touched, in that it does not provide the vi editor with the system! But now that I have the repository available, I can simply install vim to find out that using packages does work after all.

Installing packages (vim) works!

This is about how far I wanted to take this quick post on v9os. If I had a faster machine, I might have been tempted to try and build the system from source. But with my old SunFire… No.

While v9os might not be fit for production use, I accomplished one goal over OpenBSD: I have an operating system on the machine that is installed on ZFS!

ZFS on SPARC64 with v9os

Conclusion

The v9os operating system is an exotic one for sure. But it’s nice to see that somebody values SPARC64 machines and illumos enough to put the required time into such a project. And actually I think it’s not half bad! I didn’t do too much with it, but it seemed stable, and except for the installer problem (it would probably just have worked on an empty drive) everything worked fine.

Well, maybe some hints on how to get the package repo in place would have saved me some time… On the other hand, Solaris veterans are likely to get it working with just a few commands. And while it was kind of frustrating for a while, it also led to at least a basic understanding of what IPS is and how it works. I’m sure that I’d have missed at least some of that if I had just copied some lines from a guide.

I might not end up making v9os my primary operating system (for various obvious reasons). But it’s another nice little part in the mosaic of the illumos world that I’ve started exploring. Also I noticed that I’ve become a little bit more comfortable with using an OpenSolaris-derivative. Compared to my first encounter with OmniOS, it didn’t take me as long to figure out the very basics again. Which is always a good sign.

Running OpenBSD on SPARC64 (HTTPd, packages, patching, X11, …)

In my previous post I described the process of installing OpenBSD 6.0 on a SPARC64 machine and updating it all the way to 6.5. Now it’s time to actually do something with it to get an idea of how well OpenBSD works on this architecture!

OpenBSD’s base system

The OpenBSD team takes pride in providing an ultra-secure operating system. It’s a well-known fact that the project’s extremely high standards only apply to the base system. Every now and then critics pop up and claim that this basically defeats the whole idea and even accuse the project of “keeping their base system so small that it’s useless by itself” to keep up their defined goals.

There’s some truth to it: The base system is kept (relatively) small if you compare it to some of the fatter operating systems out there. But that’s about it, because actually these allegations could not be further from the truth. The base system includes doas, a simpler sudo replacement. It comes with tmux. OpenBSD even maintains its own fork of X.org, called Xenocara (not even FreeBSD comes with an X11 server by default), and there’s in fact a lot that you can achieve with the base system alone! Let’s look at one such possibility.

HTTPd

Since the OpenBSD developers are convinced that a webserver is something to keep around all the time, there’s one in base. Originally they used the Apache HTTPd for this. The problem was that at some point the Apache Foundation decided to give up their Apache 1.0 license and replace it with version 2.0 (they had been criticized a lot for being incompatible with the GPL, and the new version solved that problem). The newer version also made the license less simple and permissive than it had been before, and OpenBSD did not like the new license. For that reason they basically stayed with the old Apache 1.3 webserver for a long time. They maintained and patched it all that time, but the software really began to show its age.

So for version 5.6, OpenBSD finally removed the old Apache webserver from base and replaced it with Nginx. One release later, they did away with that, too, because they felt that it was starting to become too bloated for their needs. They imported OpenBSD HTTPd into base instead: a home-grown, very simple webserver. It evolved over time, but even though it has gained more features and become a fine little webserver, it strives to stay simple.

The developers resist the temptation to add new features just because they could, and have even made a list of things that some people might want but which will never be implemented because they would raise complexity to an unacceptable level. OpenBSD HTTPd does not want to be a webserver for everyone. It wants to be an ultra-secure webserver that does enough to be useful to many people. If you have any needs above what it offers – get another one.

Simple static website configuration of OpenHTTPd

The simplicity of HTTPd adds a lot to its beauty. I’ve written some HTML for a test page (see screenshot). All of the configuration that I need to do for HTTPd is as follows:

server spaffy.local {
    listen on egress port 80
}

Yes, that’s all that is required: I basically define a vHost (“Server” in HTTPd lingo) and have the application listen on the HTTP default port 80 on egress (a keyword which means whatever interface has the default route). Let’s check if that configuration really is valid by issuing

httpd -n

And it is! Impossible? No. Remember that OpenBSD comes with sane defaults. For that reason there’s usually pretty little that you need to configure. You could, of course. And we’ll be doing that a little later.

Now let’s force-start httpd (we need -f since the service is not enabled, yet, and we want to manually start it once):

rcctl -f start httpd

I’ve edited the /etc/hosts file on my laptop to be able to use the spaffy.local name. So now I can just type that into the address bar of my browser and reach the test page that the SPARC64 machine hosts. OK, a static page probably doesn’t impress you so much. Fortunately that’s not all that we can do in just relying on what base offers!

Static test page displayed in browser

CGI

OpenBSD also comes with Perl as part of the default install. I got the Llama book several years ago, read through about 2/3 of it and then decided that I didn’t like Perl too much. For that reason I never really did anything with it, but here I want to do something with what OpenBSD provides me, so Perl is a logical choice and I might finally put it to use. Here’s what I came up with:

#!/usr/bin/perl
use strict;
use warnings;

my $osname = `uname -s`;
my $osver = `uname -r`;
my $osarch = `uname -m`;
chomp($osname, $osver, $osarch);

my @months = qw( Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec );
my @days = qw( Sun Mon Tue Wed Thu Fri Sat );
my ( $sec, $min, $hour, $mday, $mon, $year, $wday, $yday ) = gmtime();

print "Content-type: text/html\n\n";
print "<html><head><title>Greetings!</title></head>";
print "<body>Hello from <strong>$osname $osver</strong> on <strong>$osarch</strong>!";
print "<br><br>This page was created by Perl $^V on $days[$wday], $months[$mon] $mday";

# Print the ordinal suffix for the day of the month (1st, 2nd, 3rd, ...).
if (length($mday) < 2) {
  if (substr($mday, -1) eq "1") {
    print "st"; }
  elsif (substr($mday, -1) eq "2") {
    print "nd"; }
  elsif (substr($mday, -1) eq "3") {
    print "rd"; }
  else {
    print "th"; }
} else {
  if ((substr($mday, 0, 1) ne "1") and (substr($mday, -1) eq "1")) {
    print "st"; }
  elsif ((substr($mday, 0, 1) ne "1") and (substr($mday, -1) eq "2")) {
    print "nd"; }
  elsif ((substr($mday, 0, 1) ne "1") and (substr($mday, -1) eq "3")) {
    print "rd"; }
  else {
    print "th"; }
}

print ", $hour:";
if (length($min) == 1) {
  print "0";
}
print "$min (UTC)</body></html>";

Nothing too fancy, but for a first attempt at writing Perl it’s probably OK. After making the script executable, I can run it on the system and get the expected output. Things get a little more complex, though. HTTPd runs in a chroot for security reasons, and just copying the script into the chroot and trying to execute it in the chrooted environment fails with “no such file or directory”.

Huh? I just copied it there, didn’t I? I sure did. The reason is that the Perl interpreter is not available inside the chroot. So let’s copy that over as well and try again. Abort trap! As they say: getting a different error can be considered progress…

Perl CGI script failing due to chroot

Ok, now Perl is there, but it’s not functional: it requires some system libraries that are not present in the chroot. Using ldd on the Perl executable, I learn which libraries it needs, and after providing them I can run the script in the chroot! There is a new problem, though: Perl complains about missing modules. The simplest solution in our case is to just remove them from the demo script, as they are not strictly (haha!) necessary.
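If you want to reproduce this, the procedure looked roughly like the following sketch – the exact list of libraries is whatever ldd reports on your system, so treat the names below as placeholders:

# mkdir -p /var/www/usr/bin /var/www/usr/lib /var/www/usr/libexec
# cp /usr/bin/perl /var/www/usr/bin/
# ldd /usr/bin/perl
# cp /usr/lib/libperl.so.* /usr/lib/libm.so.* /usr/lib/libc.so.* /var/www/usr/lib/
# cp /usr/libexec/ld.so /var/www/usr/libexec/

The dynamic linker (ld.so) needs to be available in the chroot as well, otherwise dynamically linked binaries cannot run at all.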

Providing Perl dependencies in the web chroot

On to the next step. Here’s a little addition to the HTTPd configuration:

    location "/cgi-bin/*" {
        root "/"
        fastcgi
    }

It basically adds different rules for the case that anything below /cgi-bin is requested: it changes the document root for those requests and enables FastCGI. Now I only need to start the slowcgi service (OpenBSD’s cheekily named FastCGI implementation) and restart HTTPd. My Perl program makes use of the system’s uname command, so that has to be made accessible in the chroot, too, of course.
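Roughly, the remaining commands for this step look like this (again a sketch – uname is dynamically linked, too, so it relies on the libraries that already went into the chroot for Perl):

# cp /usr/bin/uname /var/www/usr/bin/
# rcctl -f start slowcgi
# rcctl -f restart httpd

By default slowcgi listens on a socket inside the chroot, which is also where httpd’s fastcgi directive expects it, so no further wiring should be needed.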

Finishing the dynamic webpage setup

And that’s it. The script is executed by the webserver and the expected page is generated, which is then served properly:

Dynamically created webpage displayed in browser

I think this is pretty cool. Try to do that with just the default install of other operating systems! BTW: Want to make HTTPd and slowcgi start automatically after boot? No problem, just put the following into /etc/rc.conf.local:

httpd_flags=""
slowcgi_flags=""

This makes the init system start both daemons by default (and once they are enabled you can of course drop the “-f” flag to rcctl when interacting with them).
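Alternatively, rcctl can write those entries for you, which amounts to the same thing:

# rcctl enable httpd
# rcctl enable slowcgi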

Binary packages

For OpenBSD 6.5, pre-built packages are offered for 8 of the 13 supported architectures – among them SPARC64. There are just a couple short of 9,500 packages available (on amd64 it’s about 10,600 – so in fact most packages are there)!

Things like GCC 8.3 and even GNAT 4.9 (the Ada part of GCC which is interesting because it’s written in Ada and thus needs to be bootstrapped to every new architecture by means of cross-compiling) are among the packages, as is LLVM 7.0. When it comes to desktop environments, you can choose e.g. between recent versions of Xfce, MATE and Gnome.

Actually, SPARC64 is one of only 4 architectures (the others being the popular ones amd64, i386 and arm64) that receive package updates via the packages-stable repository. In there you’ll find newer versions of e.g. PHP and Exim (which had some pretty bad remote exploits fixed).

Basic OpenBSD package management

I chose to install the sysclean package. Remember when I said that I skipped deleting the obsolete files while updating the OS in my last post? This program helps with finding files that should be deleted. It’s not too intelligent, however – it just compares a list for a fresh system to the actual system on disk. For that reason it also lists a lot of files that I wouldn’t want to delete. Still it’s helpful for spotting obsolete files that you might have forgotten to remove.
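Getting it onto the system and running it went something like this (piping through a pager, since the list can be long):

# pkg_add sysclean
# sysclean | less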

Sysclean shows a lot of possible remove candidates

Errata patches

While OpenBSD tries its very best to provide an operating system that is safe to use, nothing in IT is both useful and free from errors. If problems with some component of the system are found after a release, an erratum is published for it. If you are using OpenBSD in production, you are supposed to keep an eye on errata as they are released. Usually they consist of a patch or set of patches for the system source code as well as instructions on how to apply them and recompile the affected parts.

Since version 6.1, OpenBSD comes with a handy utility called syspatch(8), which can be used to fetch binary patches for all known errata that have not yet been applied to the OS on the respective machine. This is nice – but it’s only available for amd64, i386 and arm64. So on SPARC64 we still have to keep the system secure the old, manual way. However, errata patches are also applied to the -STABLE branch, and we can use that to get all the fixes.

No syspatch on SPARC64 – tracking -STABLE manually as it used to be

To upgrade our installation to 6.5-STABLE, the first step is to get the operating system source of the current release (the sys tarball contains the kernel, src the rest of the base system). After extracting those, CVS is used to update the code to the latest 6.5-STABLE.
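The whole dance looks roughly like this (the mirror URL is one example and the anoncvs server is a placeholder – use whichever ones are close to you):

# cd /usr/src
# ftp https://cdn.openbsd.org/pub/OpenBSD/6.5/src.tar.gz
# ftp https://cdn.openbsd.org/pub/OpenBSD/6.5/sys.tar.gz
# tar xzf src.tar.gz
# tar xzf sys.tar.gz
# cvs -qd anoncvs@anoncvs.example.org:/cvs up -rOPENBSD_6_5 -Pd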

Done getting the stable changes from CVS

Once that’s done, it’s time to build the new (non-SMP) kernel:

# cd /sys/arch/$(machine)/compile/GENERIC
# make obj
# make config
# make && make install
# reboot

Building a 6.5-STABLE kernel

On my SunFire v100 the kernel build took 1h 20m. I was curious enough to build the userland as well, just to see how long it would take… The answer is: 85h 17m! I think that LLVM alone took about three days. The rest of the system wasn’t much of a problem for this old machine, but LLVM certainly was.

BTW, I ran into “permission denied” errors when trying to “make obj”. After reading the manpage for release(8), I found out that /usr/obj should be owned by build:wobj with 770 permissions, which had not been the case on my system.
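The fix was as simple as:

# chown build:wobj /usr/obj
# chmod 770 /usr/obj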

Kernel build complete

Having done that, I thought that I might build Xenocara as well, to compare how long it takes to build. So I got the sources for that, too, updated them via CVS and built it. It took 9h 26m to build and install.

Xenocara built from (-STABLE) source

X11 on SPARC64

I had left out all X11-related distribution sets when installing OpenBSD. But after having installed Xenocara from source, I had it all available, so I decided to do something with it. Since the server does not have a graphics card, I cannot run any X program on it directly, because the X server won’t start. Instead I decided to install a graphical application that is not part of Xenocara. After browsing through the list, I settled on Midori, a WebKitGTK-based webbrowser.

Installing the Midori browser via packages

It took a moment to install all the dependencies, but everything worked. As the next step I enabled SSH X11 forwarding and restarted SSH.
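In concrete terms that just means flipping one switch in the sshd configuration and restarting the daemon:

# vi /etc/ssh/sshd_config    # set "X11Forwarding yes"
# rcctl restart sshd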

Midori is installed, allowing X11 forwarding for SSH

After connecting to the SPARC64 machine via SSH and checking that the DISPLAY environment variable was set, I could just launch Midori and have it sent over to the laptop from which I had SSHed into the box. So the browser is being executed on the SPARC64 server but displayed on my other machine.
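From the laptop’s side the whole thing boils down to something like this (user name and the exact DISPLAY value are just illustrative):

$ ssh -X user@spaffy.local
spaffy$ echo $DISPLAY    # should show something like localhost:10.0
spaffy$ midori &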

SSHing into the SPARC64 machine and forwarding Midori to my amd64 laptop

Everything worked well, I could even visit the OpenBSD homepage and it was rendered correctly.

The webkit-based browser works well on SPARC64!

Conclusion

OpenBSD is a fine operating system for people who value quality. The SPARC64 port seems to be in pretty good shape: most packages and even stable-package updates are available. What is missing is syspatch support – but only three architectures have that right now. Also, the system compiler is still the ancient GCC 4.2, the last version before the project switched its license to the GPLv3.

OpenBSD 6.6 was released one day after I finished compiling 6.5-STABLE. On amd64 I could now use sysupgrade(8) to upgrade to the new release even more easily than before. This is not supported on SPARC64 either. But these two little shortcomings just mean a bit of extra work that OpenBSD users on every platform had to do anyway until not that long ago.

For 6.6 there are even more packages available for SPARC64; e.g. the Rust compiler has been bootstrapped on this architecture, which definitely is great news. Maybe the system compiler will change to LLVM/Clang one day, too. Right now the SPARC64 backend for Clang is incomplete upstream at the LLVM project, if I understood things right. But we’ll see. I guess I’ll really have to get a newer SPARC64-based machine with a faster processor – luckily OpenBSD supports quite a few of them.

OpenBSD on SPARC64 (6.0 to 6.5)

Earlier this year I came by an old SunFire v100 that I wrote about in my previous article. After taking a look at the hardware and the LOM, it’s time to actually do something with it! And that of course means installing an operating system first.

OpenBSD

OpenBSD, huh? Yes, I usually write about FreeBSD and that’s in fact what I tried installing on the machine first. But I ran into problems with it very early on (never even reached single user mode) and put it aside for later. Since I powered up the SunFire again last month, I needed an OS now and chose OpenBSD for the simple reason that I have it available.

First I wanted to call this article simply “OpenBSD on SPARC” – but that would have been misleading since OpenBSD used to support 32-bit SPARC processors, too. The platform was just put to rest after the 5.9 release.

OpenBSD 6.0 CD set

Version 6.0 was the last release of OpenBSD that came on CD-ROM. When I bought it, I thought that I’d never use the SPARC CD. But here was the chance! While it is an obsolete release, it comes with the cryptographic signatures needed to verify the next release. So the plan is to start at 6.0, which I can trust since it comes from the original CDs, and then update to the latest release. This will also be an opportunity to recap some of the things that changed over the various versions.

Preparations

I had already prepared the machine for installation previously, so I only had to make a serial connection and everything was good to go. If you need to do this and don’t feel like reading the whole previous article, here are the important steps:

  1. Attach power to go to the lom prompt
  2. Issue boot forth and then poweron to go to the loader
  3. At the ok prompt use setenv boot-device cdrom disk to set the boot order
  4. Set an alias for the CD-ROM device with nvalias cdrom /pci@1f,0/ide@d/cdrom@3,0:f
  5. Reset the machine with reset-all or powerdown and then poweron again

Booting up the OpenBSD 6.0 sparc64 CD

Insert the OpenBSD installation CD for SPARC64 and after just a moment you should be in the installation program.

Installing 6.0

OpenBSD’s installation program is very simple. It’s basically an installation script that asks the user several questions and then goes ahead and does the things required for the desired options. In the Linux world e.g. Alpine Linux does the same, and I’ve always liked that approach.

OpenBSD 6.0 installer started

On a regular installation, the script would ask for the keyboard layout. But since we’re installing over serial here, that doesn’t matter; it asks for the kind of terminal instead. Since our CPU architecture is SPARC64, OpenBSD assumes we’re using a Sun terminal. Well, I don’t, so I choose xterm.

Of course we need a hostname for the new system. Since it’s Puffy (the OpenBSD mascot) on SPARC here, I settled on spaffy. 😉

Choosing the root password

Next is network configuration. DHCP is fine for this test machine. Then the root password is being set.

Of course I want to access the box over SSH later, so that I don’t need the serial connection anymore and can put the machine in a different room. It’s not as loud as many x86 servers, but still quite a bit louder than you’d want from a machine sitting right next to you. Allowing root logins over SSH is very bad practice, so I create a user next and disallow remote root logins.

Selecting the partitioning

Then I choose my timezone. Next is deciding on the partitioning. There I noticed a difference compared to i386/amd64 installations. I have a habit of creating partition B first (to put the swap space at the beginning of the drive). When I tried to do this, the installer told me that this architecture doesn’t allow that. I assume that limitation is due to Sun’s VTOC partitioning scheme that is used on SPARC machines. So I created them in order.

What you can see on the screenshot is OpenBSD’s default partitioning. It’s more complex than many people may be used to, but for a good reason. Remember that you can mount filesystems with different options? That way you can e.g. have /tmp mounted noexec. OpenBSD makes good use of this, e.g. enabling or disabling W^X protection on a filesystem-wide basis. This is not a production machine, though, and the drive is fairly small for today’s needs. So in the end I went with a much simpler way of dividing the drive.

Selecting the distribution sets to install

Finally I need to choose what to install. OpenBSD offers so-called “sets” for various parts of the full operating system. Since I’m only installing 6.0 as a starting point, I go with the minimum required options: The kernel (bsd) and the base system.

I have no use for the install (ramdisk) kernel (bsd.rd) or the SMP-enabled multi processor kernel (bsd.mp). Also I don’t need the system compiler (comp), manpages (man) or small games (game). Of course I also don’t need the X11-related sets.

Installation finished!

Then the installer goes off and prepares everything. When it has finished, the only thing that is left is rebooting the system (and removing the CD). Now we can also change the boot order in the ok prompt, to set it to booting from disk only, speeding up the boot time minimally:

ok> setenv boot-device disk

And that’s it! Now I have an old but known good version of OpenBSD on my SunFire box.

Freshly installed OpenBSD 6.0 booted up

Updating to 6.1

Alright. What’s next? Running a three-year-old version of OpenBSD is probably not that good an idea if newer versions are available for this architecture – and they are.

So the first thing to do is fetching the ramdisk kernel of version 6.1 and the signature for it. Then I check the integrity of the kernel with signify(1). Everything is fine, so I go on and replace the standard kernel with the install kernel for the newer version. There’s probably a better way to do this, but the SPARC bootcode seems to have “bsd” as the kernel file name hard-coded and I admittedly didn’t dig very deep to figure out a different way of booting alternate kernels.
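For reference, that step looks roughly like this – the mirror URL is a placeholder for one that still carries the 6.1 release:

# ftp https://mirror.example.org/pub/OpenBSD/6.1/sparc64/bsd.rd
# ftp https://mirror.example.org/pub/OpenBSD/6.1/sparc64/SHA256.sig
# signify -C -p /etc/signify/openbsd-61-base.pub -x SHA256.sig bsd.rd
# cp /bsd /bsd.6.0
# cp bsd.rd /bsd

The matching public key already ships with 6.0, which is exactly why starting from the trusted CDs works.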

Getting 6.1 ramdisk kernel and verifying signature

After restarting, the system boots into the install kernel. This time I select upgrade instead of install, of course. The installer then checks the existing operating system (or at least the root partition).

I then select http for the location of the sets and point the installer to a mirror that still holds the old releases.

Installer started in upgrade mode

Next is selecting the distribution sets to be installed. Again I choose only the bare minimum, since the upgrade is just an intermediary step to upgrading all the way to a current release.

In earlier versions of OpenBSD, etc was a separate set. Since the files required to verify newer releases live in /etc, I would have chosen a different installation strategy if it were still separate. However, the etc set has been included in the big base set for a while now.

Necessary sets updated

After the sets have been downloaded and extracted the upgrade is mostly complete. The remaining things are done in the live system. So it’s time to complete this step and reboot.

Configuration files get updated on first boot after the OS upgrade

OpenBSD automatically updates various configuration files for the new release. If you pay attention, you’ll see that there is one case where the changes could not be merged automatically, so I will need to see to that myself.

The system also checked whether newer firmware was available. However, this was not the case (which really is no wonder on this old machine).

Merging OpenSSH config and adding installurl

After doing the manual merge of the OpenSSH configuration, it’s time to do the final tasks to complete the upgrade. OpenBSD keeps a detailed upgrade guide for each version that lists the required manual steps. In fact you should read it before doing the upgrade, since it can involve steps that need to be done prior to booting the install kernel and updating the base system! I skipped them, because they didn’t apply in my case – e.g. I hadn’t installed the manpages anyway.

I chose to only set the installurl since that one is really convenient. Actually I should remove some obsolete files from the filesystem, too. But I decided to leave this for later as there is another method to do so.
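Setting it is a one-liner; the mirror is of course your choice:

# echo "https://ftp.openbsd.org/pub/OpenBSD" > /etc/installurl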

Updating to 6.2

Getting the system updated to 6.2 means repeating what I did for the 6.1 update: Get the ramdisk kernel for the new release as well as the signature and verify it. Once that’s done, another reboot is in order.

Downloading and preparing OpenBSD 6.2 install kernel

One thing that’s different is that the installer now defaults to fetching from the web instead of from CD. And thanks to setting the installurl before I rebooted, it also knows the default mirror to get the sets from, which makes the process of upgrading even more straightforward and convenient.

OpenBSD 6.2 installer: Now knows the URL to fetch from

Finishing the upgrade after the actual unpacking of the new files takes a bit longer for this version. After making all known device nodes, the installer re-links the kernel! This is due to a new feature called KARL (Kernel Address Randomized Link). The idea here is that the objects that make up the kernel are linked in random order for each reboot, essentially creating a new and unique kernel every time. This makes it much harder or even impossible to use parts of the kernel otherwise known to be in certain memory regions for sophisticated attacks.

OpenBSD 6.2 introduced Kernel re-linking (“KARL”)

Oh, and did you notice that the bsd.mp set is gone? This machine only has a single-core CPU and therefore the SMP kernel doesn’t make much sense. The installer detected the CPU and did not offer to install the SMP kernel (even though it of course is still available for machines with multiple cores).

As always, the system needs to be rebooted after the upgrade is complete. Just a moment later I’m greeted by my new OpenBSD 6.2! Again I’m skipping the manual steps to be taken afterwards.

OpenBSD 6.2 booted up

Updating to 6.3

Preparing and doing the upgrade for 6.3 is just like you’ve seen twice now, so I’m not going to repeat it. There’s one new feature in the installer that could be mentioned, though: After the upgrade is complete, the reboot option is now the default thing that the installer offers instead of just dropping you to a shell. This means you can save another 6 keystrokes when updating! Yay! 😉

OpenBSD 6.3 install kernel: Rebooting after completion is now the default choice

Updating to 6.5

The upgrade to 6.4 is simply more of the same. Of course I did that step, but I’m cutting it out here. 6.5 is the most recent release as I’m writing this (though 6.6 is already around the corner). This means I’m going to do one more upgrade, following the process that we know pretty well by now: Get and verify bsd.rd, boot it and select “Upgrade”.

Choosing all the sets except for X11-related ones for 6.5

This time I decide to install all the sets except for anything X11-related. The SunFire v100 is a server-class machine which does not even have a graphics card! For that reason there’s no VGA port to connect a monitor to, either. And while X11 could still be of some use, it’s simply not needed at all.

Upgrade to OpenBSD 6.5 complete

Again the upgrade process takes a bit longer, but that’s only due to the additional sets (as well as the base distribution getting a little bigger with each release). After just a little while everything is done and there’s one more reboot to make.

OpenBSD 6.5 booted up and ready

All done! I now have a fine OpenBSD 6.5 system up and running on my old SPARC64 box. And even better: everything has been cryptographically verified to be the data that I want, and no bad actor has tampered with it. Sure, the system has not been cleaned up yet – and it’s just 6.5-RELEASE with no errata fixes applied. Still I’d say: we’re off to a good start! Aren’t we?

What’s next?

In the next post I intend to explore the system a little and find out where there are differences from a common amd64 installation of OpenBSD.

A SPARC in the night – SunFire v100 exploration

While we see a total dominance of x86_64 CPUs today, there are at least some alternatives like ARM and in the long run hopefully RISC-V. But there are other interesting architectures as well – one of them is SPARC (the Scalable Processor ARChitecture).

This article is purely historic; I’m not reviewing new hardware here. It’s more of a “20 years ago” thing (the v100 is almost that old), written for people interested in the old Sun platform. The intended audience is people who are new to the Sun world – either because they are too young, like me (while I had a strong interest in computers back in the day, I hadn’t even finished school yet, and heck… I was still using Windows!), or because they never had the chance to work with that kind of hardware in their professional career. Readers who know machines like this quite well and don’t feel like reading for nostalgic reasons might just want to skip this article.

The SPARC platform

SPARC is a Reduced Instruction Set Computing (RISC) Instruction Set Architecture (ISA) developed by Sun Microsystems and Fujitsu in 1986. Up to the Sun-3 series of computers, Sun had used m68k processors, but with Sun-4 they switched to 32-bit SPARC processors. The first implementation is known as SPARCv7. In 1992 Sun introduced machines with v8, also known as SuperSPARC, and in 1995 the first SPARCv9 processors became available. Version 9, known as UltraSPARC, is a 64-bit architecture that is still in use today.

SunFire v100: Top and front view

SPARC is a fully open ISA, taken care of by SPARC International. Architecture licenses are available for free (only an administration fee of $99 has to be paid), so any interested company could start designing, manufacturing and marketing components conforming to the SPARC architecture. And Sun really meant it with OpenSPARC: they released the Verilog code for their T1 and T2 processors under the GPLv2, making them the first 64-bit processors ever to be open-sourced. And as if that wasn’t enough, they also released a lot of tools along with it, like a verification suite, a simulator, hypervisor code and such!

After Sun was acquired by Oracle in 2010, the future of the platform became unclear. Initially, Oracle continued development of SPARC processors, but in 2017 completely terminated any further efforts and laid off employees from the SPARC team.

Fujitsu has made official statements that they are continuing to develop SPARC-based servers and has even spoken of a “100 percent commitment”. At the beginning of this year they even wrote about a resurgence of SPARC/Solaris on the company’s blog, and since they are the last vendor providing SPARC servers (which are still highly valued by some customers), chances are that they will continue improving SPARC. According to their roadmap, a new generation is even due for 2020.

So while SPARC is not getting a lot of attention these days, it’s not a dead platform either. But will it survive in the long run? Time will tell.

SunFire v100

I’m working for a company that offers various hosting services. We run our own data center where we also provide colocation for customers who want that. Years ago a customer ran a root server on a (now old) SunFire v100 machine. I don’t remember when it was decommissioned and removed from the rack, but that must have been quite a while ago.

SunFire v100: Back view

That customer was meant to come over and collect the old hardware, so we put the machine in the storage room. For whatever reason, he never came to get it. Since it had been sitting there for years now, I decided to mail the customer and ask if he still wanted the machine. He didn’t, and would in fact prefer to have us dispose of it. So I asked if he’d be OK with us shredding the hard drives and me taking the actual machine home. He didn’t have any objections, and thus I got another interesting machine to play with.

The SunFire v100 is a 1U server that was introduced in 2001 and went EOL in 2006. According to the official documentation, the machine came with 64 bit Solaris 8 pre-installed. It was available with an UltraSPARC IIe or IIi processor and had a 40 GB, 7200 RPM IDE HDD built-in. My v100 has 1GB of RAM and a 550 MHz UltraSPARC IIe. I also put a 60 GB IBM HDD into it.

It has a single power supply (PSU), two Ethernet ports as well as two USB ports. It also features two serial ports – and these are a little special. Not only are they RJ-45, they also have two different use cases. One is for the LOM (we’ll come to that a little later), the other one is a regular serial port that can be used e.g. to transfer data without interference (i.e. without the LOM processing it). The serial connection uses 9600 baud, no parity, one stop bit and full duplex mode.

RJ-45 to DB9 cable and DB9 to USB cable

The other interesting thing is the system configuration card. It stores the host ID and MAC address of the server as well as NVRAM settings. What is NVRAM? It’s an acronym for Non-Volatile Random-Access Memory, a means of storing information that must not be lost when the power goes off, as it would be with regular RAM. If you’re thinking “CMOS” in PC terms, you’re right – except that Sun seems to have used proper NVRAM and not volatile memory made “non-volatile” by keeping the data alive with a battery. The data is stored on a dedicated chip, or in this case on a card. The advantage of the latter is that it can easily be transferred to another system, taking all the important configuration with it! Pretty neat.

Inside the v100

When I opened up the box, I was actually astonished by how much space there was inside. I know some old 1U x86 servers from around that time (or probably a little later) that really are a pain to work with. Fitting two drives into them? It’s certainly possible, but not fun at all. At least I hated doing anything with them. And those at least used SATA drives – I haven’t seen any IDE machines in our data center, not even among the oldest spare parts (it was all thrown out way before I got my job). But this old Sun machine? I must say that I immediately liked it.

SunFire v100: Inside view

Taking out the HDD and replacing it with another drive was a real joy compared to what I had feared that I’d be in for. The drive bays are fixed using a metal clamp that snaps into a small plastic part (the lavender ones in the picture). I’ve removed the empty bay and leaned it against the case so that it’s easier to see what they look like. It belongs where the ribbon cable lies – rotated 90 degrees of course.

Old x86 server for comparison – getting two drives in there is very unpleasant to do…

All the other parts are easily accessible as well: the power supply in the upper left corner of the picture, the CD-ROM drive in the lower right, and the RAM modules in the lower left. It’s all nicely laid out and well assembled. Hats off to Sun, they really knew what they were doing!

Lights out!

I briefly mentioned the LOM before. It’s short for Lights-Out Management. You might want to think IPMI here: while this LOM is specific to Sun, its basic idea is the same as the widespread x86 management system. It allows you to do things to the machine even when it’s powered off. You can turn it on, for example. Or change values stored in the NVRAM.

LOM starting up

How do we access it? Well, the machine has an RJ-45 socket for serial connections, appropriately labeled “LOM”. The server came with two cables to use with it: one RJ-45 to DB25 (“parallel port” style connector) for use with e.g. a Sun workstation, and one RJ-45 to DB9 (“serial port” a.k.a. “COM port”). You can then use any of the various tools usually used for serial connections, like cu, tip or even screen.
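With the DB9 end attached to a USB serial adapter on a laptop, connecting can be as simple as the following (the device name depends on your operating system and adapter, so treat it as an example):

$ cu -l /dev/cuaU0 -s 9600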

Just plug your cable into, say, your laptop and the other end into the A/LOM port, and you can access the serial console. If you plug in the power cable of the SunFire machine now, you will see the LOM starting up. Notice that the actual server is still off: it’s in standby mode, but the LOM is independent of that.

LOM help text

By default, the LOM port operates in mixed mode, allowing to access both the LOM and the serial console. These two things can be separated if desired; then the A port is dedicated to the LOM only and the console can be accessed via the B port.

In case you have no idea how to work with the LOM, there’s a help command available to at least give you an idea what commands are supported. Most of these commands have names that make it pretty easy to guess what they do. Let’s try out some!

LOM monitoring overview (powered off)

Viewing the environment gives some important information about the system. Here it reveals that ALARM 3 is set. Alarm 1, 2 and 3 are software flags that don’t do anything by themselves. They can be set and used by software installed on the Solaris operating system that came with the machine.

I really have no idea why the alarm is set. It was that way when I got the server. Even though it’s harmless, let’s just clear it.

Disabling alarm, showing users and booting to the ok prompt

The LOM is pretty advanced in even supporting users and privileges. Up to four LOM users can be created, each with an individual password. There are four privileges that these can have: A for general LOM administration like setting variables, U for managing LOM users, C to allow console access as well as R for power-related commands (e.g. resetting the machine). When no users are configured, the LOM prompt is not protected and has full privileges.

OpenBoot prompt

It is also possible to set the boot mode in the LOM. By doing this, the boot process can e.g. be interrupted at the OpenBoot prompt which (for obvious reasons) is also called the ok prompt. In case you wonder why the command is “boot forth” – this is because of the programming language Forth which the loader is written in (and can be programmed in).

ok prompt help

In the ok prompt you can also get help if you are lost. As you can see, it is also somewhat complex and you can get more help on the respective areas that interest you.

Resetting defaults and probing devices

OpenBoot has various variables to control the boot sequence. Since I got a used machine, it’s probably a good idea to reset everything to the defaults.

From the ok prompt it’s also possible to probe for devices built into the server. In this case, an HDD and a CDROM drive were found which is correct.

Setting NVRAM variables, escaping to LOM, returning to the ok prompt and resetting the machine

The ok prompt allows for setting variables, too, of course. Here I create an alias for the CD-ROM drive so I don’t have to work with the long and complex device path. Don’t ask me about the details of the latter, however; I found this alias on the net and it worked. I don’t know enough about Solaris’ device naming to explain it.

Next I set the boot order to CDROM first and then HDD. Just to show it off here, I switch back to the LOM – using #. (hash sign and dot character). That is the default LOM escape sequence, however it can be reconfigured if desired. In the LOM I use the date command to display how long the LOM has been running and then switch back to the ok prompt using break.

LOM monitoring overview while the machine is running

Finally I reset the machine, so that the normal startup process is initiated and an attempt at booting from the CD-ROM is made. I threw in a FreeBSD CD and escaped to the FreeBSD bootloader (which was also written in Forth until it was replaced with a Lua-based one recently).

Showing the monitoring overview while the machine is actually running is much more interesting of course. Here we can see that all the devices still work fine which is great.

LOM log and date, returning to console and powering off

Finally I wanted to show the LOM log and returning to the console. The latter shows the OK prompt now. Mind the case here! It’s OK and not ok. Why? Because this is not the OpenBoot prompt from the SunFire but the prompt from the FreeBSD loader which is the second-stage loader in my case!

That’s it for exploring this old machine’s capabilities and special features. I just go back to the LOM again and power down the server.

Conclusion

The SunFire v100 is a very old machine now and probably not that useful anymore (can you say: IDE drive?). Still it was an interesting adventure for me to figure out what the old Sun platform would have been like.

While I’m not entirely sure how useful this knowledge is (SPARC servers in the wild are more exotic than ever – and who knows what the platform has evolved into in almost 20 years!), I enjoy digging into Unix history. And Sun’s SPARC servers are most definitely an important piece of that big picture!

What’s next?

Reviewing this old box without installing something on there would feel very incomplete. For that reason I plan to do another article about installing a BSD and something Solaris-like on it.