ARM’d and dangerous: FreeBSD on Cavium ThunderX (aarch64)

While I don’t remember for how many years I’ve had an interest in CPU architectures that could be an alternative to AMD64, I know pretty well when I started proposing to test 64-bit ARM at work. It was shortly after the disaster named Spectre / Meltdown that I first dug out server-class ARM hardware and asked whether we should get one such server and run some tests with it.

While the answer wasn’t a clear “no”, it also wasn’t exactly “yes”. I tried again a few times over the course of 2018, and each time I presented a few more reasons why I thought it might be a good thing to test this. But still I wasn’t able to get a positive answer. Finally, in January 2019, I got a definitive answer – and it was “yes, go ahead”! The fact that Amazon had just presented their Graviton ARM processor may have helped the decision.

Getting my hands on an ARM64 server

Eventually a Gigabyte R120-T32 was ordered and arrived at our data center in early February. It wasn’t perfect timing at all, since quite a few important projects are currently running that draw just about all the resources we have. Still we put together a draft of an evaluation sheet with quite a few things to test. There are a lot of “yes / no” items and some that require performance measurements.

We set aside a few hours to put some drives into the machine, put it into the rack, configure the EFI and get BMC/IPMI working. We quickly found that there was no firmware update or anything available, so we should be good to go. However… The IPMI console (Java…) is not working at all. A few colleagues who have various Linux distros running on their workstations tried to access it – but no luck. It looks like a problem with Java versions that are too new (*sigh*). I didn’t even try it with my BSD workstation: I could never get Supermicro IPMIs to work – not even on AMD64. So for me it’s Linux in Bhyve for that kind of task (if anybody has any idea how to get IPMI console access working on FreeBSD, please let me know!).

Our common installation procedure for special servers involves a little device that stores CD/DVD images and presents a virtual optical drive to the machine. Since ARM64 machines usually don’t have optical drives anymore, there are no CD images available for that architecture. Fortunately it’s not that complicated to prepare a USB thumb drive instead.
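
Preparing the stick on a FreeBSD machine is a quick job; a minimal sketch (the file name is an example and /dev/da0 is an assumption – double-check the device node before overwriting anything):

# write the arm64 memstick image to the thumb drive
# (check "geom disk list" to find the right device first!)
dd if=FreeBSD-12.0-RELEASE-arm64-aarch64-memstick.img of=/dev/da0 bs=1m conv=sync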

Once that was done I was able to confirm that remote administration using SoL (“Serial over LAN”) worked, so I could start playing around with the new server in my free time.

Hello Beastie!

Being our company’s “BSD guy”, it was pretty obvious that testing FreeBSD on the machine (even if it’s only for comparison with Linux) would be my natural task. FreeBSD on ARM64 is currently a tier 2 platform. However, it has been the plan for several years now to eventually make it tier 1 – i.e. a perfectly well supported platform. Even better: The Cavium ThunderX was selected as the reference platform for ARM64 and a partnership was established between Cavium and the FreeBSD Foundation! So doing a few tests there should be a piece of cake, right?

I booted the machine with the USB stick attached. Just as expected, there’s no working VGA console support (this is pretty much the default when working with ARM). The SoL connection worked, though, and showed the nice beastie menu! “Off to a good start”, I thought. I was wrong.

The loader did its job of loading the kernel and… that’s where everything stopped. I rebooted again and again, trying to set loader tunables to get the system to boot (need I mention that the machine literally takes minutes to execute EFI stuff before it gets to the loader?). No luck whatsoever. The line “Entered …. EfiSelReportStatusCode Value: 3101019” was always the last thing printed to the screen. Even after waiting for minutes, the machine didn’t do anything further.

“Maybe that status code shows that something is wrong?” I thought. A quick search on the net revealed a console log for NetBSD booting on the same hardware. It printed the same line but the kernel obviously did its thing. Weird – and quite discouraging. At this point I had no more ideas that I could try and was about to accept that it “just didn’t work”. Then I read a post to the FreeBSD-arm mailing list: Somebody else had the same problem! However it looked like nobody had a solution – there was not a single answer to that mail… But the post included one important piece of information: FreeBSD 11.2 did work!

FreeBSD 11.2

Ok, that would be a lot better than nothing. So I downloaded the 11.2 image, dd’d it to the stick and booted. This time there was no beastie (before 12.0 FreeBSD used a different loader on ARM) – but the kernel booted! I even got to the installer. Yay!

I did a fairly simple install like I often do. The big difference was that I couldn’t configure the network. But let’s leave that for later, shall we? The installation completed and I rebooted. The loader appeared, the kernel booted and… Bam! The system was unable to import the pool! Too bad…

So I booted the stick again, did a second install choosing UFS instead of ZFS. This time FreeBSD would boot successfully into the new system and allow me to log in. Unfortunately no NICs were detected – and a server without network connectivity is rather… boring. What now?

After a bit of research on the net I learned about LIQUIDIO(4), the “Cavium 10Gb/25Gb Ethernet driver”. Sounded great. However the manpage says that it “first appeared in FreeBSD 12.0”. *sigh* Looks like I have to upgrade to 12.0 somehow.

FreeBSD 12?

I downloaded the source for the latest 12.0 from the releng branch, put it on a stick and attached it to the ARM server. That alone – I didn’t even get to import the pool – was enough to make the system panic! Oh great…

A UFS formatted stick then? Yes, that worked. I transferred the source, built the system, installed the kernel and saw that familiar line: “Entered …. EfiSelReportStatusCode Value: 3101019”. Obviously it’s not the images that are broken but actually FreeBSD 12.0 is…

What’s next? Let’s try 12-STABLE. Maybe the problem is fixed there. Except that… the installed kernel is broken and trying to boot kernel.old does not work either! Oh excellent, it looks like “make installkernel” did not create any backup! This probably makes a lot of sense on small ARM devices, but it doesn’t make any in this case. Now I’m in for another reinstall.

After reinstalling 11.2 I did “cp -r /boot/kernel /boot/kernel.11” – just in case. Then I transferred the code for 12-STABLE onto the machine and built it. You would probably have guessed it by now: It’s broken.

HEADaches

The next day at work I got hold of a USB to ethernet adapter and configured ue0 so that I could finally ssh into the OS. That felt a lot better and I could also use svnlite directly to check out the source code. Quite some time later I had built and installed -CURRENT. But would it work? Nope!
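
For the record, getting such an adapter up is the usual FreeBSD procedure – a sketch, assuming the adapter shows up as ue0 and DHCP is available on that network:

# /etc/rc.conf
ifconfig_ue0="DHCP"

# apply without rebooting
service netif restart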

Now this was all very unfortunate. But I happen to like FreeBSD and really want it to work on ARM64, too. Just waiting for somebody to fix it didn’t sound like a good strategy, however. The hardware is still pretty exotic and if just one person reported breakage in the final days of RCs for 12.0, it’s pretty clear that not too many developers have access to it to test things regularly… So I figured that I should try to find out exactly when HEAD broke. That would most likely be helpful information.

Alright… r302409 was the commit that turned HEAD into 12-CURRENT. Let’s make sure that this version still works – otherwise it would mean that 11.0 on ThunderX was fixed after branching off HEAD. Seems very unlikely, but making sure wouldn’t hurt, right?

/usr/local/aarch64-freebsd/bin/ranlib -D libc_pic.a
--- libc.so.7.full ---
/usr/local/aarch64-freebsd/bin/ld: getutxent.So(.debug_info+0x3c): R_AARCH64_ABS64 used with TLS symbol udb                                                                                   
/usr/local/aarch64-freebsd/bin/ld: getutxent.So(.debug_info+0x59): R_AARCH64_ABS64 used with TLS symbol uf                                                                                    
/usr/local/aarch64-freebsd/bin/ld: utxdb.So(.debug_info+0x5c): R_AARCH64_ABS64 used with TLS symbol futx_to_utx.ut                                                                            
/usr/local/aarch64-freebsd/bin/ld: jemalloc_tsd.So(.debug_info+0x3d): R_AARCH64_ABS64 used with TLS symbol __je_tsd_tls                                                                       
/usr/local/aarch64-freebsd/bin/ld: jemalloc_tsd.So(.debug_info+0x1434): R_AARCH64_ABS64 used with TLS symbol __je_tsd_initialized                                                             
/usr/local/aarch64-freebsd/bin/ld: xlocale.So(.debug_info+0x404): R_AARCH64_ABS64 used with TLS symbol __thread_locale                                                                        
/usr/local/aarch64-freebsd/bin/ld: setrunelocale.So(.debug_info+0x3d): R_AARCH64_ABS64 used with TLS symbol _ThreadRuneLocale                                                                 
cc: error: linker command failed with exit code 1 (use -v to see invocation)
*** [libc.so.7.full] Error code 1

Perhaps building an older version of FreeBSD on a newer one is not such a good idea… So I chose to install 11.0 first. That worked without problems. However… 11.0 was cut before the switch to LLVM 4.0 – which brought in a usable lld! For that reason 11.0 needed an external toolchain. Of course that version is not supported anymore, so there are no packages for it around. To really build anything on 11.0 would mean that I’d first have to cross-compile a binutils package for it! I admit that this wasn’t terribly attractive to me – especially since my actual goal was a completely different one.

Ok… So HEAD likely broke somewhere between r302409 and r343815 – that’s roughly 40,000 commits to check! Let’s just hope that it broke after FreeBSD on ARM64 became self-hosting…

Checking r330000 – it works! r340000? Broken. r335000: Still works. r337500: Nope. After a lot of deleting previous builds and source, checking out other old revisions and rebuilding kernel-toolchain and kernel (and hitting some versions that didn’t build on ARM in the first place), I found out that it was commit r336520 that broke FreeBSD on ThunderX: The vt_efifb device was added to the GENERIC kernel on ARM64. According to the commit, it was tested on PINE64 but obviously it does not work with ThunderX, yet.
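
For reference, a single bisection step looked roughly like this (the revision number is just an example; -j 49 matches what I used for the other builds):

# get the revision to test (later steps can use "svnlite update -r ..." instead)
svnlite co -r 337500 svn://svn.freebsd.org/base/head /usr/src
cd /usr/src
# build the bootstrap toolchain and the kernel, then install and reboot
make -j 49 kernel-toolchain
make -j 49 buildkernel
make installkernel
shutdown -r now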

Problem solved?

So if it’s just one line in the GENERIC configuration it should not be any problem to fix this, right? To test, I checked out HEAD again, removed the line, rebuilt the kernel and installed it once more – and it booted just fine!

Then I restored the GENERIC config and wrote a custom kernel configuration instead:

/usr/src/sys/arm64/conf/THUNDERX:

include GENERIC
ident THUNDERX

nodevice        vt_efifb
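
Building and installing a kernel from that config is then the usual procedure (a quick sketch):

cd /usr/src
make -j 49 buildkernel KERNCONF=THUNDERX
make installkernel KERNCONF=THUNDERX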

I wrote to the mailing list that I had identified a problem with ThunderX and made proposals on how to unbreak HEAD.

An answer pointed out that a custom kernel was not even required: It suffices to set the loader tunable hw.syscons.disable to make HEAD work again! No need to remove the device from GENERIC. That’s even better! But what I actually wanted was 12.0 or 12-STABLE and not HEAD. Installing that, however, I found that it’s… well. Broken!
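
For reference, that workaround is a single line in /boot/loader.conf (a sketch – setting the tunable to 1 is my reading of the suggestion from the list):

# /boot/loader.conf -- skip the console device that hangs the ThunderX boot
hw.syscons.disable=1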

It broke twice!

What’s going on here? Could it be that HEAD broke twice and the other problem was fixed in a later revision? I had no better explanation. So let’s get compiling for another few nights and see…

r339436 is when HEAD was renamed to 13-CURRENT. Would it work just setting the loader tunable? Nope. I played the same game as before to eventually figure out that r338537 was the offending commit here. Its message says that it increases two values because of ThunderX – but unfortunately that actually broke the platform.

On to more compiling… Finally here’s the other commit: r343764 changed those values again in preparation for ThunderX2 – and accidentally fixed ThunderX, too! Unfortunately that wasn’t until after 12.0 was branched off.

Again it’s a rather simple patch to make 12-STABLE work:

--- sys/arm/arm/physmem.c.orig  2019-02-17 08:47:05.675448000 +0100
+++ sys/arm/arm/physmem.c       2019-02-17 08:48:53.209050000 +0100
@@ -29,6 +29,7 @@
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD: stable/12/sys/arm/arm/physmem.c 341760 2018-12-09 06:46:53Z mmel $");
 
+#include "opt_acpi.h"
 #include "opt_ddb.h"
 
 /*
@@ -48,8 +49,13 @@
  * that can be allocated, or both, depending on the exclusion flags associated
  * with the region.
  */
+#ifdef DEV_ACPI
+#define MAX_HWCNT       32      /* ACPI needs more regions */
+#define MAX_EXCNT       32
+#else
 #define MAX_HWCNT       16
 #define MAX_EXCNT       16
+#endif
 
 #if defined(__arm__)
 #define MAX_PHYS_ADDR   0xFFFFFFFFull

Fixing 12?

After I had a working 12 system, I wrote to the mailing list again, asking if the change could be MFC’d from r343764 into 12-STABLE.

Rod Grimes was nice enough to ping the developer who had made the commit. So far he hasn’t replied but I really hope that it’ll be possible to MFC the change. Because otherwise that would mean all of FreeBSD 12.x would remain broken – and that would definitely not be cool… If it is possible we’re likely to have a working 12.1 image when that version is out. 12.0 will remain broken, I guess.

Until we have a working version again, you can either install using 11.2 and build from (currently patched) source, or you can try a fixed image that I built (if you trust me). What I did to build it was this:

  1. svnlite co svn://svn.freebsd.org/base/stable/12 /usr/src
  2. cd /usr/src
  3. patch -i patchfile
  4. make -j 49 buildworld
  5. make -j 49 buildkernel
  6. edit stand/defaults/loader.conf to set hw.syscons.disable
  7. cd release
  8. make memstick NOPKG=1 NOPORTS=1 NOSRC=1 NODOC=1 WITH_COMPRESSED_IMAGES=1

Conclusion

Tier 1? Obviously we’re not quite there, yet. Booting off of ZFS does not work. There are no binary updates. Even with 12-STABLE I’ve been unable to get the NICs working – and you wouldn’t want to run a server with four 10Gbit NICs and one 40Gbit NIC using just a 1Gbit USB to ethernet adapter.

On the plus side, FreeBSD generally works and seems stable with UFS. Building from source using 49 threads works and so does building software from ports. I even installed Synth and ran the “build-everything” for over a day to put some serious load on the machine without any problems.

I’d like to at least get the network working, too, before that machine turns into a penguin for production. Maybe I can set aside a few more hours at night. If I end up with anything noteworthy, I’ll follow up on this post. Oh, and if anybody has any information on making it work, please let me know!


ZFS and GPL terror: How much freedom is there in Linux?

There has been a long debate about whether man is able to learn from history. I’d argue that we can – at least to some degree. One of the lessons that we could have learned by now is how revolutions work. They begin with the noblest of ideas that many followers wholeheartedly support and may even risk their lives for. They promise to rid the poor oppressed people of the dreaded current authorities. When they are over (and didn’t obviously fail) they will have replaced the old authorities with – new authorities. These might be “better” (or rather: somewhat less bad) than the old ones if we’re lucky, but they are sure to be miles away from what the revolution promised to establish.

Death to the monopoly!

Do you remember Microsoft? No, not the modern “cloud first” company that runs Azure and bought Github. I mean good old Microsoft that used to dominate the PC market with their Windows operating system. The company that used their market position with Windows 3.x to FUD Digital Research and their superior DR-DOS out of the market by displaying a harmless line of text with a warning about possible compatibility issues. The company that spent time and resources on strategies to extinguish Open Source.

Yes, due to vendor lock-in (e.g. people needing software that only runs on Windows) and to laziness (just using whatever comes installed on a PC), they have maintained their dominance on the desktop. However the importance of it has been on a long decline: Even Microsoft have acknowledged this by demoting their former flagship product and even thinking about making it available for free. They didn’t quite take that extreme step, but it’s hard to argue that Windows still has the importance it had from the 1990s up to the early 2010s.

They’ve totally lost the mobile market – Windows Phone is dead – and are not doing too well in the server market, either.

A software Golden Age with Linux?

In both areas Linux has won: It’s basically everywhere today! Got a web-facing server? It’s quite likely running some Linux distro. With most smart phones on the planet it’s Android – using a modified Linux kernel – that drives them. And even in space – on the ISS – Linux is in use.

All of us who have fought against the evil monopoly could now be proud of what was accomplished, right? Right? Not so much. There’s a new monopolist out there, and while it’s adhering to Open Source principles by the letter, it has long since started actually violating the idea by turning it against us.

For those who do not deliberately look the other way, Linux has mostly destroyed POSIX. How’s that? By more or less absorbing it! If software is written with POSIX in mind, today that means it’s written to work on Linux. However, the whole idea of POSIX was to establish a common ground and ensure that software runs across all of the *nix platforms! Reducing it to basically one target shattered the whole vision to pieces. Just ask a developer working on a Unix-like OS that is not Linux about POSIX and the penguin OS… You’re not in for stories about respect and being considerate of other systems. One could even say that they have repeatedly acted quite rude and ignorant.

But that’s only one of the problems with Linux. There are definitely others – like people acting all high and mighty and bullying others. The reason for this post is one such case.

ZFS – the undesirable guest

ZFS is today’s most advanced filesystem. It originated on the Solaris operating system and thanks to Sun’s decision to open it up, we have it available on quite a number of Unix-like operating systems. That’s just great! Great for everyone.

For everyone? Nope. There are people out there who don’t like ZFS. Which is totally fine, they don’t need to use it after all. But worse: There are people who actively hate ZFS and think that others should not use it. Ok, it’s nothing new that some random guys on the net are acting like assholes, trying to tell you what you must not do, right? Whoever has been online for more than a couple of days probably already got used to it. Unfortunately it’s still worse: One such spoilsport is Greg Kroah-Hartman, Linux guru and informal second-in-command after Linus Torvalds.

There have been some attempts to defend the stance of this kernel developer. One was to point at the fact that the “ZFS on Linux” (ZoL) port uses two kernel functions, __kernel_fpu_begin() and __kernel_fpu_end(), which have been deprecated for a very long time, and that it makes sense to finally get rid of them since nothing in-kernel uses them anymore. Nobody is going to argue against that. The problem becomes clear by looking at the bigger picture, though:

The need for functions doing just what the old ones did has of course not vanished. The functions have been replaced with other ones. And those ones are deliberately made GPL-only. Yes, that’s right: There’s no technical reason whatsoever! It’s purely ideology – and it’s a terrible one.

License matters

I’ve written about licenses in the past, making my position quite clear: It’s the author’s right to choose whatever license he or she thinks is right for the project, but personally I would not recommend using pessimistic (copyleft) licenses since they do more harm than good.

While I didn’t have any plans to re-visit this topic anytime soon, I feel like I have to. Discussing the matter on a German tech forum, I encountered all the usual arguments and claims – most of which are either inappropriate or even outright wrong:

  • It’s about Open Source!
  • No it’s absolutely not. ZFS is Open Source.

  • Only copyleft will make sure that code remains free!
  • Sorry, ZFS is licensed under the CDDL – which is a copyleft license.

  • Sun deliberately made the CDDL incompatible with the GPL!
  • This is a claim supported primarily by one former employee of Sun. Others disagree. And even if it was verifiably true: What about Open Source values? Since when is the GPL the only acceptable Open Source license? (If you want to read more, user s4b dug out some old articles about Sun actually supporting GPLv3 and thinking about re-licensing OpenSolaris! The forum post is in German, but the interesting thing there is the links.)

  • Linux owes its success to the GPL! Every Open Source project needs to adopt it!
  • This is a pretty popular myth. Like every myth there’s some truth to it: Linux benefited from the GPL. If it had been licensed differently, it might have benefited from that other license. Nobody can prove that it benefited more from the GPL or would have from another license.

  • The GPL is needed, because otherwise greedy companies will suck your project dry and close down your code!
  • This has undoubtedly happened. Still it’s not as much of a problem as some people claim: They like to suggest that formerly free code somehow vanishes when used in proprietary projects. Of course that’s not true. What those people actually dislike is that a corporation is using free code for commercial products. This can be criticized, but it makes sense to do that in an honest way.

  • Linux and BSD had the same preconditions. Linux prospers while BSD is dying and has fallen into insignificance! You see the pattern?
  • *sigh* Looks like you don’t know the history of Unix…

  • You’re an idiot. Whenever there’s a GPL’d project and a similar one that’s permissively licensed, the former succeeds!
  • I bet you use Mir (GPL) or DirectFB (LGPL) and not X.org or Wayland (both MIT), right?

What we can witness here is the spirit of what I’d describe as GPL supremacism. The above (and more) attacks aren’t much of a problem. They are usually pretty weak and the GPL zealots get enraged quite easily. It’s the whole idea of trading the values of Open Source for religious GPL worship (Thou shalt not have any licenses before me!) that’s highly problematic.

And no, I’m not calling everybody who supports the idea of the GPL a zealot. There are people who use the license because it fits their plans for a piece of software and who can make very sensible points for why they are using it. I think that in general the GPL is far from being the best license out there, but that’s my personal preference. It’s perfectly legitimate to use the GPL and to promote it – it is an Open Source license after all! And it’s also fine to argue about which license is right for some project.

My point here is that those overzealous people who try to actually force others to turn towards the GPL are threatening license freedom and that it’s time to just say “no” to them.

Are there any alternatives?

Of course there are alternatives. If you are concerned about things like this (whether you are dependent on modules that are developed out-of-kernel or not), you might want to make 2019 the year to evaluate *BSD. Despite repeated claims, BSD is not “dying” – it’s well alive and innovative. Yes, there are areas where it’s lagging behind, which is no wonder considering that there’s a much smaller community behind it and far fewer companies pumping money into it. There are companies interested in seeing BSD prosper, though. In fact even some big ones like Netflix, Intel and others.

Linux developer Christoph Hellwig actually advises to switch to FreeBSD in a reply to a person who has been a Linux advocate for a long time but depends on ZFS for work. And that recommendation is not actually a bad one. A monopoly is never a good thing. Not even for Linux. It makes sense to support the alternatives out there, especially since there are some viable options!

Speaking about heterogeneous environments: Have you heard of Verisign? They run the registry for .com and .net among other things. They’ve built their infrastructure 1/3 on Linux, 1/3 on FreeBSD and 1/3 on Solaris for ultra-high resiliency. While that might be an excellent choice for critical services, it might be hard for smaller companies to find employees who are specialized in those operating systems. But bringing a little BSD into your Linux-only infrastructure might be a good idea anyway and could in fact even lead to a future competitive advantage.

FreeBSD is an excellent OS for your server and also well fit if you are doing embedded development. It’s free, ruled by a core team elected by the developers, and available under the very permissive BSD 2-clause license. While it’s completely possible to run it as a desktop, too (I do that on all of my machines both private and at work and it has been my daily driver for a couple of years now), it makes sense to look at a desktop-focused project like GhostBSD or Project Trident for an easy start.

So – how important is ZFS to you – and how much do you value freedom? The initial difficulty that the ZOL project had has been overcome – however they are just working around it. The potential problem that non-GPL code has when working closely with Linux remains. Are you willing to look left and right? You might find that there’s some good things out there that actually make life easier.

Ravenports explained: Why not just join XYZ?

As the year comes to an end, I’ve seen quite some interest in my previous post. There has been a question on Reddit asking what the benefit(s) of Raven over Pkgsrc might be and why the developers don’t simply join an existing effort instead of building something new.

I’ve touched on this topic about half a year ago, but I think the question is worth a detailed reply that fully covers both parts of it. So I’ll try to answer 1) why Ravenports exists in the first place and 2) what sets it apart from Pkgsrc and other ports systems.

Why maintain Ravenports instead of working on Pkgsrc?

Well, obviously because its author felt it was worthwhile to start and maintain the project! Of course that leads to another and more important question – why didn’t John Marino just join e.g. Pkgsrc instead? The answer to that is: Well… He did.

John got his NetBSD commit bit and became a Pkgsrc developer back in the day when DragonflyBSD still used Pkgsrc by default. He maintained a ton of ports there and made sure that other people’s ports still worked on DF after they had been updated. DragonflyBSD had been considered a first-class citizen by Pkgsrc. However there had been two big problems:

1) Being primarily a NetBSD project, Pkgsrc development takes place mostly on NetBSD of course. Things were tested on NetBSD and then committed. There was no testing done on the other supported platforms – which is a completely comprehensible decision given the number of ports available and the number of supported platforms as well as the need to get software updated in a somewhat timely manner! However this led to frequent breakage. A few suggestions that made sense from the Dragonfly perspective could not be agreed upon taking the whole of Pkgsrc into account. In the end the policy was: “If things in the tree break for your platform, go ahead and fix it.” So basically the answer to problem 1 was: “Throw more manpower at it.”

2) As the small project that DragonflyBSD is, there simply were not too many people available for this task however. In fact it was largely John alone who did most of the work with some help here and there. It’s impossible to spend resources that you don’t have available!

As you can see problem 1 causes problem 2 – and that one proved to be unfixable. Thus the problems with Pkgsrc grew and there was really not much that could have been done about it. And as the suggestions to somewhat relieve the worst impact were turned down, Dragonfly had to give up Pkgsrc. Please keep in mind that there’s a major difference between how Dragonfly used Pkgsrc and how some other platforms do. Sure, it’s great that you can use Pkgsrc on AIX to obtain some current software. Same thing for many other systems. Dragonfly used Pkgsrc just as NetBSD does, though: As the primary means to get software installed. Large-scale breakage of packages is a no-go in such a case, especially if it happens somewhat often and was bound to happen again and again.

Ok – another project then. Adapt the FPC maybe?

John then brought the new FreeBSD package manager as well as the FreeBSD ports collection over to Dragonfly with a system called “delta ports” or Dports. It’s basically an overlay with patches that Dfly requires to build those ports. Even though the FPC is meant for FreeBSD only and Pkgsrc – being cross-platform – might seem like the more logical candidate, this worked out a lot better and John maintained Dports for years.

In maintaining so many ports for both Pkgsrc and Dports he had quite a few ideas on how to do things better. They wouldn’t fit into the projects as they were organized, though. So he began playing with various things on his own. Then… FreeBSD introduced flavored ports.

Don’t get me wrong here: I’m a FreeBSD user and I’m glad that flavored ports are finally available. However, from a technical point of view they are implemented in a way that’s far from perfect. This is no wonder, though: When the ports tree was first introduced, nobody thought of flavors. What we have today is a fine example of a feature implemented as an afterthought. It works, yes, but it meant a disruptive change that broke the expectations of all ports-related programs. It also made maintaining Dports much, much more time-intensive – to the point where it was no longer feasible to keep it up.

What does Ravenports have to offer over Pkgsrc?

Just like every younger project, Ravenports has the considerable advantage of starting fresh without the burden of choices that seemed right in the past but were probably regretted later. If this is combined with the will to learn from previous attempts to get packaging right as well as considerable experience with those, this has a lot of potential.

Think about it for a moment: FreeBSD’s ports collection shipped with the 1.0 release of the OS – and thus was created back in 1993. Pkgsrc began as a fork of it in 1997. So both were originally designed in a decade that has long passed (and in fact not even in this millennium!). Yes, both have been modernized over time. There are limits to this, however. It can be pretty hard to integrate new features into a structure that was never meant to support anything like that. Do you think anybody in the mid 90’s could have thought about the needs of today? Ravenports deliberately does not support some old cruft. It’s meant for the coming decade of the 2020’s.

Here’s some strong points where Raven is ahead of Pkgsrc:

  • Tooling:
  • It offers a modern, integrated solution. There’s one control program (“ravenadm”) that deals with everything regarding Ravenports: It’s used to configure the package building system, it fetches the buildsheets (ports) and keeps them up to date, it builds all the packages or a subset thereof, … (see the short workflow sketch after this list)

  • Pristine package builds:
  • Everything is built in a chroot sandbox specifically assembled for that build process. There is no way that build dependencies clutter your build system (chances are you don’t want to use m4 or automake yourself and thus don’t need them installed on the OS). There’s also no way that installed packages of your system pollute the packages that Raven builds: The isolation prevents e.g. linking against additional stuff that you didn’t mean to.

  • It’s fast:
  • Did you ever run a bulk-build for Pkgsrc packages? Ravenports optimizes build times on modern systems by taking advantage of memory disks and such. The port scan alone makes a huge difference.

  • Potentially package manager agnostic:
  • Currently Raven supports only the Pkg package manager but as all it does is build packages, it was designed to support additional package managers if needed. You actually want it to generate rpm or pacman packages? Not currently implemented but certainly possible if desired.

  • Powerful default package manager:
  • Pkg, a modern tool for package management, is quite capable. If you read the manpages for it you will find out that it’s loaded with useful features. The old pkg_tools that Pkgsrc still use totally pale in comparison – and rightfully so.

  • Easy administration of multiple repos:
  • Need multiple repositories? No problem. Just create profiles for them. E.g. one that uses LibreSSL and another one that links against OpenSSL instead. Also you can choose the default version of Perl, Python, Ruby, etc. to use. And you can choose whether MySQL should be Oracle’s MySQL, MariaDB, Galera, etc.

  • Convenient use of custom ports:
  • Can you use custom ports that are not in the official buildsheet collection? Sure thing. You can create directories for your custom ports and even use different ones in different profiles. Want to change an existing port? Just place one with the same name in your custom port directory and it will override the original one. Buildsheets from custom ports are generated automatically so there’s no hassle there. It probably doesn’t get much more convenient!

  • Variants and subpackages:
  • Package variants (i.e. “flavors”) and subpackages are not an afterthought and are thus used excessively right from the beginning. This makes package management with Raven very flexible.

  • Testing:
  • The Ravenports system has very strict rules for buildsheets. If the ravenadm tool considers a port to be valid, it is almost guaranteed that it is actually fine. Also packages can not only be mass-built but they can also be tested automatically as well (Is the RPATH ok? Are all required shared objects available? Is the manifest file complete? Are the required descriptions in place? Is the license ok or lacking? Things like that).

  • Automation:
  • Ravenports tries to automate many things that do not actually need human attention. For example quite often Python-related ports can be auto-generated. This saves time and effort of the maintainers that can be better spent on other things.

  • Modern day development:
  • Want to contribute something? It’s extremely easy. If you have a GitHub account you’re all set: Fork the git repo, make your changes, then commit and push them. Now all that’s left is opening a Pull Request. Yes, that’s all. If you don’t have a GH account, create one. Or send us patches as it was traditionally done. Ravenadm will happily create a template for you to assist you if you want to contribute a new port.

  • No ports ownership:
  • In Ravenports nobody “owns” a port. If you submitted one you become a contact for it. If somebody wants to make major changes to the port, that person is expected to contact you and communicate the proposals. Small or trivial changes however (like a simple version upgrade) can be done by anybody. This ensures rapid development and very fast adoption of new versions even if the original porter does not currently have the time to maintain everything in a timely manner.

  • Fast releases:
  • Ravensource provides new releases quite often. This way you can get pretty fresh software early on. There is no fixed time frame for it, though: Releases are made when it makes sense. If there have been major changes to the tree the next release might be delayed for testing.

  • Binary bootstrap:
  • Ravenports has a very simple and fast bootstrap process that makes use of binary packages for the respective platform. No system compiler required! Raven brings in its own full toolchain.
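
Regarding the tooling point above, a rough sketch of the basic workflow looks like this. The subcommand names are written down from memory and may be slightly off – please consult ravenadm’s built-in help for the authoritative list:

ravenadm configure          # interactive setup of the build environment
ravenadm update-ports       # fetch / refresh the buildsheets ("conspiracy")
ravenadm build nano         # build a single port (plus its dependencies) as packages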

There are of course cases where it makes sense to use Pkgsrc and it’s not too hard to find any: E.g. if you need packages for a platform that’s unsupported in Raven or if you need software not yet available there. In the end this is Open Source: We’re all friends and using the right tool for the job makes sense.

Couldn’t Ports/Pkgsrc be modernized?

I’ve used Pkgsrc both in private and at work and I’m pretty happy that it’s available when I need it. But I don’t like the old pkg_tools much. They do their job but they are far from modern programs and really feel like relics today. And while I’m pretty happy with FreeBSD’s ports, those aren’t portable (and for some reason I’ve never been completely happy with Poudriere, FreeBSD’s package builder).

Before finally creating Ravenports, John wrote Synth, a very nice package builder for FreeBSD and DragonflyBSD that supports Ports/Dports. It has been put on hold in favor of Raven, but it is still maintained and I continue to use it on FreeBSD to build my packages.

John also created Pkgsrc-synth. It’s a version of Pkgsrc that uses the Pkg package manager. I’ve never tried it out – but it was stopped exactly two months ago as there seems to not have been any interest from the Pkgsrc people. I think this is a pity, as pkg is really nice and has the right license for any BSD project. It could have been a chance to move Pkgsrc into a more modern direction. But meh.

Conclusion

Raven does not exist because everything else sucks. It exists because all the other candidates proved to not quite fit the needs of Ravenport’s author. As such it is a chance to keep the good parts of its various precursors that it heavily draws inspiration from. It’s a chance to combine these good parts to make something awesome. And it’s a chance to implement a lot of new ideas that should make sense in modern-day *nix package building which – for various reasons – cannot have a place in the old projects.

There’s still a lot of work to do, but we’re getting there. In my previous post I wrote that one of the big shortcomings was the lack of Rust. In the meantime Rust support has landed for DragonflyBSD, FreeBSD and Linux.

If there are any more questions feel free to post them here. I’m not on Reddit and I just saw the above question by accident. So I cannot promise to answer anywhere else than here.

Happy new year everyone!

One year of flying with the Raven: Ready for the Desktop?

It has been a little over one year now that I’ve been with the Ravenports project. Time to reflect on my involvement, my expectations and hopes.

Ravenports

Ravenports is a universal packaging framework for *nix operating systems. For the user it provides easy access to binary packages of common software for multiple platforms. It has been the long-lasting champion on Repology’s top 10 repositories regarding package freshness (rarely dropping below 96 percent while all other projects keep below 90!).

For the porter it offers a well-designed and elegant means of writing cross-platform buildsheets that allow building the same version of the software with (completely or mostly) the same compile-time configuration on different operating systems or distributions.

And for the developer it means a real-world project that’s written in modern Ada (ravenadm) and C (pkg) – as well as some Perl for support scripts and make. Things feel very optimized and fast. Not being a programmer though, I cannot really say anything about the actual code and thus leave it to the interested reader’s judgement.

If you’re interested in a more comprehensive introduction to Ravenports, I’ve written one half a year ago.

Platforms

Ravenports has initially been developed on DragonFly BSD. When I became aware of it, it had already been ported to work on Linux, too. I liked the idea of the project, but had no DF or Linux boxes available for tinkering and didn’t feel like setting one up. Thus I moved on.

As I checked back a little later, FreeBSD support had been added. Since I had just lost my excuse not to try it out right away, I started playing with it – and was pretty happy. At that time I had trouble getting a port that I had written into FreeBSD’s Ports Collection and thought that Raven could be an excellent playground to learn something and get a bit of experience that might help me later with FreeBSD.

The Xfce4 desktop – installed via Raven

I’ve long changed my mind, though! Raven is rather similar to FreeBSD’s ports system in many ways but where it differs it’s clearly superior. Also I love the cross-platform aspect and thus Raven is simply the better place for me to make home.

This year saw the introduction of Solaris/Illumos support that I tried out on OmniOS. Also Darwin support landed, upping the count of supported platforms to 5 already! Not too bad for a young project, huh? While Raven does work on all five platforms now it does so to varying degrees. But more on that later.

General activity

The Ravenports project consists of multiple Git repositories hosted on GitHub. The first one is Ravensource which most importantly holds the “raw” ports as they are written by the porters. It’s the busiest repo with over 5,200 commits since March 2017 (including almost 500 by me).

Then there’s the actual Ravenports repo that mostly contains the buildsheets which are compiled from Ravensource. It has over 1,400 commits right now.

Installing the xfce-single-core meta-package

Finally there’s the repo for the Ravenadm command-line tool. It’s approaching 900 commits since February 2017.

There’s still more to Raven like the Pkg package manager from FreeBSD (that was modified to add Zstd compression support) or libbsd4sol, a portability library which allows building code on Solaris that uses BSDisms (which was needed to add support for that platform to Raven). Most of the work on all repos was done by John alone.

With over 100 pull requests and more than 20 issues it’s clear now that there’s some interest in the project. Raven is still very small, though, with 6 people having contributed ports so far. After learning the basics and opening pull requests for half a year, I was granted write access to the source repository. Just recently I was able to push my 100th active port (there have been ports that became obsolete and were removed).

In general I’d say that there could of course be more people around and that the project would benefit from being able to provide more packages – though more than 3,200 is not bad at all! Also it’s good that there seems to be a growing user base, which is even more important than having more porters join in. From my point of view, Raven is a healthy and fast-moving project. Still young, but doing well and heading in the right direction.

Major changes

There have been some pretty big changes that happened with Raven over time. Initially John started with a GCC6-based toolchain, only to switch to GCC7 when that was released. That was before my time with the project, but I witnessed the switch to GCC8.

Changing the toolchain certainly is a major interruption and most people are advised to just wait for the official repository to be re-rolled and then update. I had some bad luck in this regard – literally the day after I finally completed a working (and almost complete) set of basic packages for the FreeBSD_i386 platform, I faced the change to GCC8. Due to a lack of time I still haven’t repeated the switch on i386 (but I still plan to do it sometime).

The thunar file manager

Other changes that always have a huge impact (causing lots and lots of packages to be rebuilt) is adopting a new version (as well as dropping an old one) of the popular interpreter languages like Python, Perl and Ruby. Ravenports always supports two versions of Perl and Ruby and two versions of Python 3 (as well as 2.7 for now). So when Python 3.7 was released, 3.5 was removed and Perl 5.24 had to go when 5.28 was added.

Recently the former LLVM port that included everything regarding LLVM was split (LLVM, Clang, lld, openmp). Also now and then new statements are added to Ravenadm, so that old versions cannot work with a new release of the buildsheet repository (which is called “conspiracy”). But this is pretty easy to work around compared to the changes mentioned before.

So on the whole, Raven has proven that it can easily stand even big changes. For me this is essential to build faith in a project. And Raven is doing well in this regard.

Desktop-ready?

There are lots of people who will want to use Raven on servers. That’s totally fine of course. But for a project as ambitious as Ravenports, it’s necessary to provide a somewhat comfortable environment for the developers and the users alike. If it doesn’t manage to become a daily driver for people it cannot succeed.

For that reason I decided to work towards good desktop support for the little dev machine that I dedicated to my work on the project. When I started, X11 was already working and Openbox had freshly landed in the repos. So I had a simplistic environment to work with: Openbox + Xterm. However I could not even change my keyboard layout! Therefore I wrote a port for setxkbmap and eventually it was accepted as the first outside contribution to the project.

The Surf web browser

Next I did some work to get the FLTK toolkit and the EDE desktop in. Then I added my favorite terminal emulator, Sakura. This worked out pretty well and the biggest shortcoming at the end of 2017 was that there was no real graphical browser available. A lot has changed since then!

Desktop choices

Today you can choose between multiple window managers, both floating and tiling:

  • twm
  • cwm
  • openbox
  • fluxbox
  • xfwm4
  • pekwm
  • i3

And in case you prefer a real desktop environment, there are also several available:

  • Lumina (moderate, Qt-based)
  • Xfce4 (somewhat light-weight, GTK-based)
  • EDE (extremely frugal and minimalistic, FLTK-based)

Two graphical web browsers are available, Surf (which is deliberately simplistic and does not even support tabs) as well as an old version of Firefox (the last one that builds without Rust). This is certainly not perfect but much better than a year before.

Also other important programs are available, including LibreOffice! Last month the Apache webserver landed – which is a pretty complex port compared to many others.

Shortcomings

Are there packages you’ll miss? Most certainly. However there’s a wishlist now with ports that people would like to see created (please feel free to add more requests there). And that’s another good step ahead. Currently it’s almost 120 items long. Fortunately there’s been some success, too, and 26 requested ports have been created and taken off the list so far.

There are some future ports that will require lots of effort (hint: Help wanted!). The most important one that blocks some other important ports is the Rust compiler. There has been some work done on this but it’s not done, yet. Another real beast is TeX. This totally must be supported at some point. Current versions of Firefox and Chromium are often asked for. And somebody even requested Eclipse (which needs Java!). So there’s definitely more than enough work to do.

Using Raven on Linux works, but there are some flaws. Initially the Pkg package manager used to crash quite often. John traced that back to a bug in the version of SQLite that’s used internally by Pkg: The problem only struck on Linux and was fixed by using a newer version instead. While it’s much better now, there’s still the occasional problem with it.

While the packages from the repo work fine on Solaris 10u8 and above as well as Illumos, exactly version 10u8 is currently required to build packages. This is due to Solaris not being able to work with older system libraries in the build chroot. It would be great to have an alternative ravensys-root for any Illumos distribution (OmniOS, SmartOS, Tribblix, …) available so that interested people without access to that specific closed-source Solaris version can develop Raven on that platform.

I don’t know how well Raven works on Darwin. Since I don’t have access to any macOS machines and PureDarwin is not really ready, yet, there’s currently no chance for me to test it. I intend to buy an older MacBook or something in the future, though, if I come across a fair offer and have some money available to spend on my hobby.

Some ports are not available on one platform or the other: Illumos mostly because they’d require patches to build and Linux often because it relies on additional libraries that have not yet been added to Raven. And then there’s a lot of packages that are mostly untested. All of these issues can be fixed, of course. All of those require a larger user-base, though. So it’s probably the best strategy to keep working on making Raven attractive to more users and address things when the right people show up.

What’s to come?

Currently Raven uses the primordial X11 input drivers (xf86-input-keyboard and xf86-input-mouse) on all platforms. In 2013 Linux pioneered support for generic input drivers by exposing the kernel’s “event devices”. Not too much later many Linux distributions adopted xf86-input-evdev. In 2014 there was a GSOC project to add evdev support for FreeBSD. Like many projects it came a good part of the way but was eventually left unfinished. It was picked up and completed by a FreeBSD developer in 2016.

Xfce’s settings and applications menu

To use it, a special kernel had to be built so it would expose /dev/input device nodes. Then a sysctl had to be set – and eventually X11 had to be patched for emulated udev support… Why would anybody want to do all this just for different input drivers? Multi-touch support is just one valid reason. Another one is that having evdev-based input drivers is half the way to eventually support libinput, too. And that is one of the prerequisites for Wayland!
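
As a rough sketch, the pieces involved on such a pre-12 FreeBSD system looked something like this (option and sysctl names as I remember them; the mask value is an assumption and may need adjusting):

# custom kernel configuration additions:
device          evdev           # expose /dev/input/eventX device nodes
device          uinput          # user-level input event injection
options         EVDEV_SUPPORT   # evdev hooks in the input drivers

# at runtime: also route keyboard/mouse events through evdev
sysctl kern.evdev.rcpt_mask=12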

This month FreeBSD has finally enabled evdev support in the GENERIC kernel in both -CURRENT and 12-STABLE. That means the upcoming FreeBSD 12.0 will not support it out of the box, but most likely a future 12.1 will. Dragonfly BSD has also grown support for event devices and people are interested in working towards Wayland. I hope that we’ll be able to get xf86-input-evdev working with our X11 (on Dragonfly, FreeBSD and Linux) next year.

I’m taking a little break from Xfce now (but plan to port most of the remaining components later to make it a well-supported DE in Raven). There are a few things I have planned like adding Linux support for OpenVPN (it depends on some libraries and programs that are Linux only which are not yet in Raven). Also I intend to take a look at adding some more Qt5 components and write a few requested ports. And finally I want to write another post next year – a tutorial on using Ravenports and creating new ports.

So keep flying with us – it’s exciting times!

Exploring OmniOS in a VM (2/2)

This is the second part of my post about “exploring OmniOS in a VM”. The first post showed my adventures with service and user management on a fresh installation. My initial goal was to make the system let me ssh into it: Now the SSH daemon is listening and I’ve created an unprivileged user. So the only thing still missing is bringing up the network to connect to the system from outside.

Network interfaces

Networking can be complicated, but I have rather modest requirements here (make use of DHCP) – so that should not be too much of a problem, right? The basics are pretty much the same on all Unices that I’ve come across so far (even if Linux is moving away from ifconfig with their strange ip utility). I was curious to see what the Solaris-derived systems call the NIC – but it probably couldn’t be any worse than enp2s0 or something like that which is common with Linux these days…

# ifconfig
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
          inet 127.0.0.1 netmask ff000000
lo0: flags=200200849<UP,LOOPBACK,RUNNING,MULTICAST,IPv6,VIRTUAL> mtu 8252 index 1
          inet6 ::1/128

Huh? Only two entries for lo0 (IPv4 and v6)? Strange! First thought: Could it be that the type of NIC emulated by VirtualBox is not supported? I looked for that info on the net – and the opposite is true: The default VirtualBox NIC (Intel PRO/1000) is supported, the older models that VBox offers aren’t!

Ifconfig output and fragment of the corresponding man page

Obviously it’s time again to dig into manpages. Fortunately there’s a long SEE ALSO section again with ifconfig(1M). And I’ve already learned something so far: *adm commands are a good candidate to be read first. Cfgadm doesn’t seem to be what I’m looking for, but dladm looks promising – the short description reads: “Administer data links”.

I must say that I like the way the parameters are named. No cryptic and hard to remember stuff here that makes you wonder what it actually means. Parameters like “show-phys” and “show-link” immediately give you an idea of what they do. And I’m pretty sure: With a little bit of practice it will work out well to guess parameters that you’ve not come across, yet. Nice!

# dladm show-link
LINK         CLASS     MTU    STATE     BRIDGE     OVER
e1000g0      phys      1500   unknown   --         --

Ok, there we have the interface name: e1000g0. I don’t want to create bridges, bond together NICs or anything, so I’m done with this tool. But what to do with that information?

The ifconfig manpage mentions the command ipadm – but it’s kind of hidden in the long text. For whatever reason it’s missing in SEE ALSO! I’d definitely suggest that it’d be added, too. More often than not people don’t have the time to really read through a long manpage (or are not as curious about a new system as I am and thus don’t feel like reading more than they have to). But anyways.

Ipadm manpage

# ipadm create-if e1000g0

This should attach the driver and create the actual interface. And yes, now ifconfig can show it:

Interface e1000g0 created

Network connection?

Almost there! Now the interface needs an IP address. Looks like ipadm is the right tool for this job as well. I had some trouble finding out what an interface-id is, though. The manpage obviously assumes the reader is familiar with that term, which was not the case with me. I tried to find out more about it, but was unable to locate any useful info in other manpages. So I resorted to the OmniOS wiki again and that helped (it seems that you can actually choose almost anything as an Interface ID but there are certain conventions). Ok, let’s try it out and see if it works:

# ipadm create-addr -T dhcp e1000g0/v4

No error or anything, so now the system should have acquired an IPv4 address.

IPv4 address assigned via DHCP

10.0.2.15, great! Let’s see if we can reach the internet:

# ping elderlinux.org
ping: unknown host elderlinux.org

DNS pt. 1

Ok, looks like we don’t have name resolution, yet. Is the right nameserver configured?

# cat /etc/resolv.conf
cat: cannot open /etc/resolv.conf: No such file or directory

Oops. There’s something not right here! If I configure my FreeBSD box to use DHCP that makes sure resolv.conf is populated properly. The interface e1000g0 got an IP – so the DHCP request must have been successful. But did something break before the network was configured completely? Is the DHCP client daemon even running?

DHCP

# ps aux | grep -i [d]hcp
root       551  0.0  0.1 2836 1640 ?        S 18:52:22  0:00 /sbin/dhcpagent

Hm! I’m used to dhclient, but dhcpagent is unknown to me. According to the manpage, it’s the DHCP client daemon, so that must be what OmniOS uses to initiate and renew DHCP requests. And obviously it’s running. However the manpage solves the mystery:

Aside from the IP address, and for IPv4 alone, the netmask, broadcast address, and the default router, the agent does not directly configure the workstation, but instead acts as a database which may be interrogated by other programs, and in particular by dhcpinfo(1).

Ah-ha! So I have to configure things like DNS myself. Nevertheless the system should have received the correct nameserver IP. Let’s see if we can actually get it via the dhcpinfo command that I’ve just learned about. One look at the manpage as well as at the DHCP inittab later, I knew how to ask for that information:

# dhcpinfo -i e1000g0 DNSserv
192.168.2.1

Right, that’s my nameserver.
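
With that information, populating /etc/resolv.conf by hand is a quick job – a sketch of what needs to happen, since dhcpagent won’t do it for you:

# write the DHCP-provided nameserver into resolv.conf
echo "nameserver $(dhcpinfo -i e1000g0 DNSserv)" > /etc/resolv.conf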

Some lines from /etc/dhcp/inittab

Route?

Is the routing information correct? Let’s check.

# netstat -r -f inet
[...]
default      10.0.2.2  [...]
10.0.2.0     10.0.2.15 [...]
localhost    localhost [...]

Looks good and things should work. One more test:

# route get 192.168.2.1
   route to: fw.local
destination: default
       mask: default
    gateway: 10.0.2.2
  interface: e1000g0
[...]

Alright, then it’s probably not a network issue… But what could it be?

DNS pt. 2

Eventually I found another hint in the wiki. It looks like OmniOS has a very… well, old-school default when it comes to sources for name resolution: only /etc/hosts is consulted! I haven’t messed with nsswitch.conf for quite a while, but in this case it’s the solution to this little mystery.

# fgrep "hosts:" /etc/nsswitch.conf
hosts:      files

There are a couple of example configurations that can be used, though:

# ls -1 /etc/nsswitch.*
/etc/nsswitch.ad
/etc/nsswitch.conf
/etc/nsswitch.dns
/etc/nsswitch.files
/etc/nsswitch.ldap
/etc/nsswitch.nis

Copying nsswitch.dns over nsswitch.conf should fix that problem.
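That’s a one-liner; something like this should do (it simply replaces the default with the DNS-enabled example file):

# cp /etc/nsswitch.dns /etc/nsswitch.conf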

And indeed: instead of ping: unknown host elderlinux.org I now get no answer from elderlinux.org – which is fine, since my nameserver doesn’t answer ping requests.

Remote access

Now with the network set up correctly, it’s finally time to connect to the VM remotely. To be able to do so, I configured VirtualBox to forward port 22 of the VM to port 10022 on the host machine. Then I tried to connect – and yes, it works!
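I did that via the GUI, but for reference it could also be done from the host’s command line; a sketch (the VM name “omnios” is just a placeholder, and the VM has to be powered off for modifyvm):

$ VBoxManage modifyvm "omnios" --natpf1 "ssh,tcp,,10022,,22"
$ ssh -p 10022 kraileth@127.0.0.1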

SSHing into the OmniOS VM… finally!

Conclusion

So much for my first adventure with OmniOS. I had quite a few difficulties to overcome even in this very simple scenario of just SSHing into the VM. But what could have been a trivial task proved to be rather educational. And being a curious person, I actually enjoyed it.

I’ll take a break from the Illumos universe for now, but I definitely plan to visit OmniOS again. Then I’ll do an installation on real hardware and take a look at package management and other areas. Hope you enjoyed reading these articles, too!

Exploring OmniOS in a VM (1/2)

While I’ve been using Unix-like systems for quite a while and heavily for about a decade, I’m completely new to operating systems of the Solaris/Illumos family. So this post might be of interest for other *BSD or Linux users who want to take a peek at an Illumos distribution – or to the Illumos people interested in how accessible their system is for users coming from other *nix operating systems.

In my previous post I wrote about why I wanted to take a look at OmniOS in the first place. I also detailed the installation process. This post is about the first steps that I took on my newly installed OmniOS system. There’s a lot to discover – more actually than I first thought. For that reason I decided to split what was meant to be one post into two. The first part covers service management and creating a user.

Virtualize or not?

According to a comment, a Reddit user would be more interested in an installation on real hardware. There are two reasons why I like to try things out using a hypervisor: 1) it’s quick to do and requires no spare hardware, and 2) it’s pretty easy to create screenshots for an article like this.

OmniOS booted up and root logged in

However I see the point in trying things out on real hardware and I’m always open to suggestions. For that reason I’ll be writing a third post about OmniOS after this one – and will install it on real hardware. I’ve already set a somewhat modern system aside so that I can test if things work with UEFI and so on.

Fresh system: Where to go?

In the default installation I can simply login as root with a blank password. So here we are on a new OmniOS system. What shall we explore first?

The installer asked me which keymap to use and I chose German. While I’m familiar enough with both the DE and US keymaps to work with them, I’ve also learned an ergonomic variant called Neo² (think Dvorak or Colemak on steroids) and strongly prefer that.

It’s supported by X11 and after a simple setxkbmap de neo -option everything is fine. On the console however… For FreeBSD there’s a keymap available, but for Illumos-based systems it seems I’m out of luck.

So here’s the plan: Configure the system to let me SSH into it! Then I can use whatever I want on my client PC. Also this scenario touches upon a nice set of areas: System services, users, network… That should make a nice start.

Manpages (pt. 1)

During the first startup, smf(5) was mentioned, so it seemed like a good idea to look that up. So that’s the Service Management Facility. But hey, what’s this? The manpage clearly does not describe any type of configuration file. And the category headline actually reads “Standards, Environments, and Macros”! What’s happening here?

smf(5) manpage

First discovery: The manpage sections are different. Way different, actually! Sections like man1, man1b, man1c, man3, man3bsm, man3contract, man3pam, etc… Just take a look. Very unfamiliar, but clearly arranged.

The smf manpage is also pretty informative and comprehensive, including the valuable SEE ALSO section. The same is true for the other pages that I’ve looked at so far. On the whole this left a good impression on me.

System services

Solaris has replaced the traditional init system with smf and Illumos inherited it. It does more than the old init did, though: Services are now supervised, too. If a service is meant to be kept running, smf can take care of restarting it, should it die. It could be compared to systemd in some regards, but smf was created earlier (and doesn’t have the same … controversial reputation).

Services are described by a Fault Management Resource Identifier or FMRI, which looks like svc:/network/loopback:default but can be shortened if the short form is unambiguous. I had no idea how to work with all this, but the first “see also” reference was already helpful: svcs can be used to view service states (and thus get an idea of what the system is doing anyway).
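A few example invocations straight from the manpage: svcs on its own lists the enabled service instances and their states, -a includes disabled ones, and -l prints a long listing for a single service:

# svcs
# svcs -a | grep ssh
# svcs -l ssh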

svcs: service status

Another command caught my attention right away, too: svcadm sounded like it might be useful. And indeed this was just what I was searching for! The manpage revealed a really straight-forward syntax, so here we go:

# svcadm enable sshd
svcadm: Pattern 'sshd' doesn't match any instances
# svcadm enable ssh

The latter command did not produce any output. And since we’re in Unix territory here, that’s just what you want: if something worked as intended, there’s no need for output. Issuing svcs and grepping for ssh shows the service now, so it should be running. But is SSH really ready? Let’s check:

# sockstat -4 -l
-bash: sockstat: command not found

Yes, right, this is not FreeBSD. One netstat -tulpen later I know that it’s not exactly like Linux, either. Once more: man to the rescue!

# netstat -af inet -P tcp

The output (see image below) looks good. Let’s do one last check and simply connect to the default SSH port on the VM: Success. SSHd is definitely running and accepting connections.

Testing if the SSH daemon is running

Alright, so much for very basic service management! It’s just a test box, but I don’t really like SSHing into it as root. Time to add a user to the system. How hard could that be?

User management

Ok, there’s no user database like on FreeBSD or anything like that. It’s just plain old /etc/passwd and /etc/shadow. How boring. So let’s just create a user and go on with the next topic, right?

# useradd -m kraileth
UX: useradd: ERROR: Unable to create home directory: Operation not applicable.

Uh oh… What is this? Maybe it’s not going to be so simple after all. And really: reading through the manpage for useradd, I get the feeling that this topic is many things – but certainly not boring!

There’s a third file, /etc/user_attr, involved in user management. Users and roles can be assigned extended attributes there. Users can have authorizations, roles and profiles, and can be part of a project. I won’t go into any detail here (it makes sense to actually read the fine manpages yourself if you’re interested). Now if you’re thinking that some of this might be related to what FreeBSD does with login.conf, you’re on the right track. I cannot claim that I understood everything after reading about it just once. But it was sufficient to get an idea of the complex and fine-grained user management Illumos is capable of!
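To give an idea of what the file looks like, here is a completely made-up entry (the field layout is taken from the user_attr manpage; the profile and role are just placeholders):

kraileth::::profiles=Basic Solaris User;roles=root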

Content of /etc/user_attr

Manpages (pt. 2)

Ok, while this has been an interesting read and is certainly good to know, it didn’t solve the problem that I had. The useradd manpage even has a section called DIAGNOSTICS where several possible errors are explained – however, the one that I’m having isn’t. And that’s a pity, since some of the ones listed there are pretty self-explanatory while “Operation not applicable” isn’t (at least not to me).

I read a bit more but didn’t find any clue to what’s going on here. When my man skills failed me, I turned to documentation on the net. And what better place to start than the OmniOSce Wiki?

When it comes to local users, the first sentence (ignoring the missing word) reads: On OmniOS, home is under automounter control, so the directory is not writable.

Ah-ha! That sounds quite a bit like the reason for the mysterious “not applicable”! That should be hint enough so I can go on with manpages, right?

# apropos automounter
/usr/share/man/whatis: No such file or directory
# man -k automounter
/usr/share/man/whatis: No such file or directory

Hm… Looks like the whatis database does not exist! Let’s create it and try again:

# man -w
# apropos automounter
# apropos automount
autofs(4)      - automount configuration properties
automount(1m)  - install automatic mount points
automountd(1m) - autofs mount/unmount daemon

There we go, more pages to read and understand.

User’s home directories

The automounter is another slightly complex topic (and the fact that the wiki mentions /export/home while useradd seems to default to /home doesn’t help prevent confusion, either). So I’m going to sum up what I found out so far:

It seems that people at Sun had the idea that it would be nice to be able to work not only on your own workstation but at any other one, too. Managing users locally on each machine would be a nightmare (with people coming and going). Therefore they created the Yellow Pages, later renamed to NIS (Network Information Service). If you have never heard of it, think LDAP (as that has more or less completely replaced NIS today). Thus it was possible to get user information over the net instead of from local passwd and friends.

The logical next step was shared home directories so employees could have fully networked user logins on multiple machines. Sun already had NFS (Network File System) which could be used for the job. But it made sense to accompany it with the automounter. So this is the very short story of why home directories are typically located in /export/home on Solaris-derived operating systems: They were meant to be shared via NFS!

So we have to add a line to /etc/auto_home to let the automounter know to handle our new home directory:

* localhost:/export/home/&

Configuring the automounter for the home directory

Most of the content of the two automounter configuration files is the CDDL (pronounced “cuddle”) license header – I’ve left it out here by using tail (see picture). After adding the needed rule (following the /export/home standard even though I don’t plan on using shared home directories), the autofs daemon needs to be restarted (see the sketch below); then the user can finally be created:

# mkdir /export/home
# useradd -m -b /export/home kraileth
# passwd kraileth

Creating a new user
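One detail that is easy to miss in the picture: the automounter restart mentioned above. With smf that should be a single command; something like this (full FMRI spelled out just to be explicit):

# svcadm restart svc:/system/filesystem/autofs:default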

So with the user present on the system, we should be able to SSH into the local machine with that user:

ssh kraileth@localhost

SSHing into the system locally

Success! Now all that remains is bringing the net up.

What’s next?

The next post will be mostly network related and feature a conclusion of my first impressions. I hope to finish it next week.

A look beyond the BSD teacup: OmniOS installation

Five years ago I wrote a post about taking a look beyond the Linux teacup. I was an Arch Linux user back then and since there were projects like ArchBSD (called PacBSD today) and Arch Hurd, I decided to take a look at and write about them.

Things have changed. Today I’m a happy FreeBSD user, but it’s time again to take a look beyond the teacup of operating systems that I’m familiar with.

Why Illumos / OmniOS?

There are a couple of reasons. The Solaris derivatives are the other big community in the *nix family besides Linux and the BSDs, and we hadn’t met so far. Working with ZFS on FreeBSD, I now and then read messages that contain a reference to Illumos, which certainly helps to keep up the awareness. Of course there has also been a bit of curiosity – what might the OS be like that grew ZFS?

Also the Ravenports project that I participate in planned to support Solaris/Illumos right from the beginning. I wanted to at least be somewhat “prepared” when support for that platform would finally land. So I did a little research on the various derivatives available and settled on the one that I had heard a talk about at last year’s conference of the German Unix Users Group: “OmniOS – Solaris for the Rest of Us”. I would have chosen SmartOS as I admire what Bryan Cantrill does but for getting to know Illumos I prefer a traditional installation over a run-from-RAM system.

There was also a meme about FreeBSD that got me thinking:

Internet Meme: Making fun of FreeBSD

Of course FreeBSD is not run by corporations, especially when compared to the state of Linux. And when it comes to sponsoring, OpenBSD also takes the money… When it comes to FreeBSD developers, there’s probably some truth to the claim that some of them use macOS as their desktop systems, while OpenBSD devs are more likely to develop on their OS of choice. But then there’s the statement that “every innovation in the past decade comes from Solaris”. Bhyve alone proves this wrong. But let’s be honest: two of the major technologies that make FreeBSD a great platform today – ZFS and DTrace – actually do come from Solaris. PAM originates there, as does a more modern way of managing services. Also, you hear good things about their zones and about a lot of small utilities in general.

In the end it was a lack of time that made me cheat and go down the easiest road: create a Vagrantfile and just pull a VM image off the net that someone else had prepared… This worked well enough to make sure that the Raven packages work on OmniOS. I was determined to return, though – someday. You know how things go: “someday” is a pretty common alias for “probably never, actually.”

But then I heard about a forum post on the BSDNow! podcast. The title “Initial OmniOS impressions by a BSD user” caught my attention. I read that it was written by somebody who had used FreeBSD for years but loathed the new Code of Conduct enough to leave. I also oppose the Conduct and have made that pretty clear in my February post [ ! -z ${COC} ] && exit 1. As stated there, I have stayed with my favorite OS and continue to advocate it. I decided to stop reading the post and try things out on my own instead. Now I’ve finally found the time to do so.

First attempt at installing the OS

OmniOS offers images for three branches: Stable, LTS and Bloody. Stable releases are made available twice a year with every fourth release being supported for three years (LTS) instead of one. Bloody images are more or less development snapshots meant for advanced users who want to test the newest features.

I downloaded the latest stable ISO and spun up a VM in Virtual Box. This is how things went:

Familiar Boot Loader

Ah, the good old beastie menu – with some nice ASCII artwork! OmniOS used GRUB before, but not too long ago the FreeBSD loader was ported over to Illumos. A good choice!

Two installers available

It looks like the team has created a new installer. I’m a curious person and want to know what it was like before – so I went with the old text-based installer.

Text installer: Keymap selection

Not much of a surprise: The first thing to do is selecting the right keymap.

ZFS pool creation options

Ok, next it’s time to create the ZFS pool for the operating system to install on. It seems like the Illumos term is rpool (short for root pool). Since I’m just exploring the OS for the first time, I picked option 1 and… Nothing happened! Well, that’s not exactly true, since a message appears for a fraction of a second. If I press 1 again, it blinks up briefly again. Hm!

I kept the key pressed and tried my best to read what it was saying:
/kayak/installer/kayak-menu[254]: /kayak/installer/find-and-install: not found [No such file or directory]

Oops! Looks like there’s something broken on the current install media… So this was a dead end pretty early on. However, since we’re all friends in Open Source, I filed an issue with OmniOS’s kayak installer. A developer responded the next day and the issue was solved. This left a very good impression on me. Quality in development doesn’t show in never introducing bugs (which is nearly impossible even for really simple programs) but in how you react to bugs being found. Two thumbs up for OmniOS here (my latest PRs with FreeBSD have been rotting for about a year now)!

Dialog-based installer

What a great opportunity to test the new installer as well! Will it work?

Dialog-based installer: Keymap selection

Back on track with the dialog-based installer. Keymap selection is now done via a simple menu.

ZFS pool creation options

Ok, here we are again! Pool creation time. In the new installer it just does its job…

Disk selection

… and finds the drives, giving me a choice on where to install to. Of course it’s a pretty easy decision to make in case of my VM with just one virtual drive!

ZFS Root Pool Configuration

Next the installer allows for setting a few options regarding the pool. It’s nice to see that UEFI seems to be already supported. For this VM I went with BIOS GPT, though.

Hostname selection

Then the hostname is set. For the impatient (and uncreative) it suggests omniosce. There’s not too much any installer could do to set itself apart (and no need to).

Time zone selection 1

Another important system configuration is time zone settings. Since there’s a lot of time zones, it makes sense to group them together by continent instead of providing one large list. This looks pretty familiar from other OS installations.

Time zone selection 2

The next menu allows for selecting the actual time zone.

Time zone confirmation

Ok, a confirmation screen. A chance to review your settings probably doesn’t hurt.

Actual copying of OS data

Alright! Now the actual installation of the files to the pool starts.

Installer: Success!

Just a moment later the installation is finished. Cool, it even created a boot environment on its own! Good to see that they are so tightly integrated into the system.

Last step

Finally it’s time to reboot. The installer is offering to do some basic configuration of the system in case you want to do that.

Basic configuration options

I decided not to do it, as you probably learn most when you force yourself to figure out how to configure stuff yourself. Of course I was curious, though, and took a peek at it. If you choose to create a user (I just did this in another VM, so I can actually write about the installer), you get to decide whether you want to make it an administrative user, whether to give it sudo privileges and whether to allow passwordless sudo. Nice!

First start: Preparing services

After rebooting the string “Loading unix…” made me smile and I was very curious about what’s to come. On the first boot it takes a bit longer since the service descriptions need to be parsed once. It’s not a terribly long delay, though.

First login on the new system

And there we have it, my first login into an OmniOS system that I installed myself.

What’s next?

That’s it for part one. In part two I’ll try to make the system useful. So far I have run into a problem that I haven’t been able to solve. But I have some time now to figure things out for the next post. Let’s see if I manage to get it working or if I have to report failure!