Linux Setup Notes

name and address
Dec 02, 2013; updated Dec 11, 2013

Opensuse 13.1: Review and Problems

We purchased a new high-end server last month for molecular modeling. The software we plan to use only runs on Linux. The challenge was getting some version of Linux on it.

This computer uses UEFI with Secure Boot, which is the latest abomination created by Microsoft to ensure that new operating systems are crushed before they can get a foothold. The OS needs a “signature” from Microsoft before it can boot. Apparently, people are cool with that. But Linux has somehow gotten a signature, and that may be a bad sign, because it could mean Microsoft doesn't consider Linux to be a serious threat. And after dealing with Opensuse 13.1, I'm thinking they could be right.

Attempted installation of Opensuse 11 and 12

In my opinion, 11.0 was by far the best version of Opensuse ever. This is the version I put on every computer that's available. Unfortunately, it's very finicky about what kinds of DVD drives it recognizes, and it wouldn't install on this one: after displaying the green introductory screen, it dropped into text mode and started asking for a CD. “Please make sure CD #1 is in the drive!” The text screen of death.

Installing over NFS didn't work either. I tried copying the DVD to a memory stick, until I discovered that the maximum speed of the stick was only 309 kB/s. It would have taken 7.4 hours to copy it. Yes, it's old.

Update: using the bs=4M option in dd increased the speed to 7.0 MB/s.
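
For reference, here is a minimal sketch of the copy with the larger block size. The helper name copy_image and the scratch files are made up for illustration. On a real stick the output would be the whole device (something like /dev/sdX, a placeholder), and dd will overwrite whatever is on it. The last lines just scale the numbers above: 7.4 hours at 309 kB/s works out to roughly 20 minutes at 7.0 MB/s, assuming 1 MB = 1000 kB.

```shell
# copy_image is a hypothetical helper: copy an image using 4 MiB blocks.
# For a real USB stick, $2 would be the whole device (placeholder: /dev/sdX),
# and everything on that device would be destroyed.
copy_image() {
    dd if="$1" of="$2" bs=4M 2>/dev/null
}

# Safe dry run on scratch files instead of a real device.
workdir=$(mktemp -d)
head -c 1048576 /dev/zero > "$workdir/fake.iso"   # 1 MiB stand-in for the ISO
copy_image "$workdir/fake.iso" "$workdir/fake-stick.img"

# Verify that the copy is byte-for-byte identical to the source.
copy_ok=no
cmp -s "$workdir/fake.iso" "$workdir/fake-stick.img" && copy_ok=yes
echo "copy verified: $copy_ok"

# Scale the old estimate (7.4 hours at 309 kB/s) to the new speed (7000 kB/s).
minutes=$(awk 'BEGIN { printf "%.1f", 7.4 * 60 * 309 / 7000 }')
echo "estimated copy time at 7.0 MB/s: $minutes minutes"
```

Comparing the result with cmp (or against a checksum of the ISO) is cheap insurance before trying to boot from the stick.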

Warning: The dd command copied the ISO file to the USB stick cleanly. However, the system load went up to 10, and while it was copying, the network interface started flickering on and off. Afterward, the network interface would not come back up. The RTL8169 network driver had actually wedged a port in our HP Procurve network switch! It was necessary to power-cycle the switch to recover. This should not be possible. Is the kernel sending garbage out the network interface when it's overloaded? This could be a problem ....

Opensuse 12.3 was a little better. It went through the entire process of installing, but was unable to install a boot loader. Both grub and elilo installation just crashed, saying that mkinitrd had failed and /boot/grub2 didn't exist. Nothing we did could get around that.

Installation of Opensuse 13.1

So we resigned ourselves to moving to 13.1. OpenSuse 13.1 looks almost exactly like the 12.x versions, but as always, there were a few small problems. This time, though, there were two big problems: UEFI and libjpeg. For the first time, these turned out to be insurmountable.

Minor Problems

First, the minor problems that surfaced after we got the OS installed.

NFS problems

At first, mounting NFS disks didn't work. It was necessary to start portmap manually. The NFS Client window in Yast2 just hung on “starting services,” and editing the system config in Yast2 gave the message “unable to write changes.”

After a few tries, and a few subtle sarcastic remarks, starting it from the command line with systemctl start nfs.service finally worked.

imlib2 problems

E16 could not find imlib2, even though it was installed. It turns out that the second 'i' in libimlib2 is capitalized in Suse (libImlib2), but Enlightenment expected all lower-case. Once this was fixed, and after we created a fake pkg-config file for imlib2 in /usr/share/pkgconfig, e16 finally compiled and started up. As described in more detail below, though, it was still unable to use the Enlightenment libraries and was thus unusable.
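
A fake pkg-config file of the kind mentioned above might look roughly like this. This is a sketch, not the exact file we used: the version number, the lib64 path, and the description are assumptions, and the file is written to a scratch directory rather than the system pkgconfig directory so it can be tried without root.

```shell
# Write a hand-made imlib2.pc into a scratch directory.
# The quoted 'EOF' keeps the shell from expanding ${prefix} and friends;
# pkg-config expands those variables itself.
pcdir=$(mktemp -d)
cat > "$pcdir/imlib2.pc" <<'EOF'
prefix=/usr
libdir=${prefix}/lib64
includedir=${prefix}/include

Name: imlib2
Description: Hand-written stand-in for the missing imlib2 pkg-config file
Version: 1.4.5
Libs: -L${libdir} -lImlib2
Cflags: -I${includedir}
EOF
# Version 1.4.5 and lib64 above are assumptions; adjust to the installed library.

# If pkg-config is available, ask it what a configure script would see.
if command -v pkg-config >/dev/null 2>&1; then
    PKG_CONFIG_PATH="$pcdir" pkg-config --libs imlib2 \
        || echo "pkg-config could not read $pcdir/imlib2.pc"
fi
```

The point of the file is the mismatch in case: the package name that configure scripts ask for is lower-case imlib2, while the linker flag has to name the library with the capital I.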

findutils-locate problems

Even though findutils was marked as having been installed, locate and updatedb did not seem to exist. It turned out to be necessary to install mlocate as well, and to reset the permissions of /usr/bin/locate and /var/lib/mlocate/mlocate.db as well as the directory /var/lib/mlocate. Even so, every time updatedb was run, it changed the permissions of the database /var/lib/mlocate/mlocate.db so that regular users couldn't use it.
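
Since updatedb keeps undoing the change, one workaround is to reapply the permissions after every run. The sketch below shows the idea on a scratch directory. The helper name fix_locate_perms and the modes 755 and 644 are assumptions for illustration, not a record of exactly what we set.

```shell
# fix_locate_perms is a hypothetical helper: make the database directory and
# the database itself readable by regular users. Modes are an assumption.
fix_locate_perms() {
    dbdir=$1
    chmod 755 "$dbdir" && chmod 644 "$dbdir/mlocate.db"
}

# Dry run on a scratch directory standing in for /var/lib/mlocate.
scratch=$(mktemp -d)
: > "$scratch/mlocate.db"
chmod 600 "$scratch/mlocate.db"     # simulate the state updatedb leaves behind
fix_locate_perms "$scratch"
ls -l "$scratch/mlocate.db"
```

On the real system this would be run as root against /var/lib/mlocate right after updatedb. Note that a world-readable database lets every user see every file name on the machine, which is presumably why the permissions are locked down in the first place.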

Networking

This machine had a dual-port gigabit NIC, as well as a modular network jack on the motherboard, so we ended up with 3 network interfaces, called for some reason eno1, enp1s0, and ens1. The networking functioned flawlessly, as far as I could tell; our entire IT department is out this week, so I never did get an IP address from them. (This is not due to any holiday. I think it's part of the University's new strategy to keep their email system running.) Networking is one thing Opensuse has done very well.

Window Manglers

There was no more need for SaX; the correct screen mode and resolution were selected automatically. Upon starting X11, a password screen popped up saying that authentication is required to create a color profile. The username was set to Administrator, and there was no password that would work, but it didn't prevent logging in. The Gnome desktop consisted of giant icons on a charcoal gray background, which was mind-bogglingly hideous. So there aren't too many changes there.

The default terminal in Gnome is gterm. Gterm got confused after reading our customized /etc/profile (which is the same as the default one except for the command prompt and the path), so that the command prompt stopped working. It was necessary to type pwd and whoami every so often to find out if we were still root and what directory we were in. But this was a small matter, since gterm was only needed long enough to compile rxvt-unicode (yes, Opensuse has dropped it).

So the next step was to build a window manager. Since I didn't want to risk wearing out my 'K' key again, I passed over KDE and went to Enlightenment. Enlightenment was dropped from Opensuse a long time ago, so I downloaded all the e17 stuff. The Enlightenment libraries compiled and installed, but e17 itself wouldn't, so I went back to e16. Unfortunately e16 couldn't see the libraries, and it retained its bland appearance with no border decorations and unusable config menus. So I compiled and installed my modified version of fvwm95. Yes, a 20-year-old window manager was the one that looked best on this system.

Major problems

Before we could get to the minor problems, we had to overcome a bigger problem: getting Opensuse 13.1 to create a bootable hard drive.

UEFI

UEFI created major problems for Opensuse 13.1. After eight attempts at installation, we finally discovered the trick by trial and error.

But in fact, Grub2-EFI is the only one that works. There's just a trick you need to go through before you can select it. Install grub2, selecting boot from MBR instead of the default root partition. The grub boot screen should then show a menu with 3 options:

So it still won't boot up. But if you select New Installation at this point, it now becomes possible to select grub2-efi and check Enable Secure Boot Support. The red error message was no longer present, and Opensuse successfully created a bootable hard drive. At boot-up, Grub finally gave Opensuse as an item in a green menu. This means Opensuse 13.1 is actually able to boot up properly. Success.

At this point, anyone with any sense would have stopped. But no, we are Linux fans. Common sense is not in our vocabulary. We had skipped so many packages that we thought it would be easier to re-install. Big mistake. There is no option to install without trashing your existing boot configuration. It went back to creating un-bootable disks again. It took a couple more hours of going through the above trick to get back to a bootable system again.

Partition table madness

Opensuse used to handle partition tables very well. But with Opensuse 13.1, each time we re-installed it mangled the partition table a little more. Some of your old partitions are left dangling, unformatted and unmounted, and more and more new ones, labeled sda5, sda6, sda7, etc. are automatically tacked on. Some of them have a size of zero. You have to delete these extra useless partitions manually, making sure to leave the first one formatted with VFAT, like this:

   sda1 156.88 MB EFI boot  FAT   /boot/efi
   sda2 50 GB               ext4  /
   sda3 2 GB                swap
   sda4 1.75 TB             ext4  /home 

It doesn't seem to matter whether the VFAT partition starts on sector 0 or on the default of 200 (or whatever it was), but it has to be present or the system will never boot. This isn't Suse's fault. You can thank the conspiracy between MS and Intel for that.
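
After cleaning up the partition table by hand, it is easy to delete one partition too many. A quick sanity check along these lines can confirm that the EFI partition is still in the layout before rebooting. The helper name has_efi_partition is made up, and the sample listing is simply the table above pasted in as text. It is a sketch of the check, not a tool that reads the disk.

```shell
# has_efi_partition is a hypothetical helper: succeed if the listing on stdin
# contains a FAT or VFAT partition whose last column is /boot/efi.
has_efi_partition() {
    awk '$NF == "/boot/efi" && tolower($0) ~ /fat/ { found = 1 }
         END { exit !found }'
}

# The layout from the table above, as plain text.
layout='sda1 156.88 MB EFI boot  FAT   /boot/efi
sda2 50 GB               ext4  /
sda3 2 GB                swap
sda4 1.75 TB             ext4  /home'

if printf '%s\n' "$layout" | has_efi_partition; then
    efi_check=present
else
    efi_check=missing
fi
echo "EFI partition: $efi_check"
```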

Libjpeg problems

Finally we got to the show-stopper. Opensuse now includes a new, bogus jpeg library called libjpeg 8.0.2. This library has a completely different API from the libjpegs from the Independent Jpeg Group, and made it impossible to compile any of our existing software.

The solution is not pretty: either rewrite your code to detect and handle two incompatible libraries, hard-code the library paths in your config script, or install a valid libjpeg and then re-compile half the software on the system. Libjpegs from IJG, including jpeg-8, jpeg-9 and the venerable jpeg-6b all worked fine, but wherever OpenSuse got this 8.0.2 from, I wish they'd put it back.

It is possible to deselect libjpeg8 during installation, but if you do, very little else works. Firefox won't install. Yast2 runs in text mode, with a blue screen that must be navigated by pressing TAB multiple times, like we used to do in the 1990s. Even Samba, which surely doesn't read JPEGs, doesn't install, due to a hopeless tangle of dependencies.

With libjpeg8 installed, it was fiendishly hard to get software to select which jpeg headers to use, and we ended up with crazy compilation errors while compiling libtiff 4.0.3, like this one:

#define FALSE 0
expected identifier before numeric constant in line 72

If you remove the ifdef, it says

error: 'FALSE' undeclared in tif_codec.c

To make it compile you have to go through and edit all the C files, putting

#define TRUE 1
#define FALSE 0

at the top. Once it's compiled, it still doesn't work. Our software linked to it, but the image sizes were wrong, turning our JPEGs into trash. Deleting the evil libjpeg 8.0.2 was a mistake, and prevented Yast2 from running.

Even the bona fide IJG jpeg-9 had problems:

/usr/local/include/jmorecfg.h:264.16:error: expected identifier before numeric constant
typedef enum { FALSE = 0, TRUE = 1 } boolean;

According to IJG, we're now supposed to add HAVE_BOOLEAN to all our programs:

#define HAVE_BOOLEAN

However, this prevented it from compiling. It will compile if you add these two additional lines:

#include <stdbool.h>
#define boolean bool

but if you do this while your software is linked to jpeg-6b you get the following message when your software tries to read a JPEG image:

JPEG parameter struct mismatch: library thinks size is 632, caller expects 600

So, don't do that, either.

Conclusion

Most of the changes in Opensuse 13.1 seem to be under-the-hood bug fixes. The installation screens and the list of software looked identical to previous versions. Even so, it's still quite easy to create a system that won't boot. For example, if you select LILO instead of grub as your boot loader, you might think that lilo would be installed automatically. But no. It's not selected in the package list, and unless you select it manually Opensuse happily plows itself into the ground.

Once we got it running, it was blazingly fast on our computer. It ought to be, considering what we paid for it. This thing is no toy. We plan to use it for drug discovery, and it will be running number-crunching programs for weeks at a time.

But the jpeg problem is nasty. The amount of effort in editing and recompiling Motif, libtiff, libjpeg, and so on in order to get our software to run was ridiculous. I'm not optimistic about ever getting this to work: Opensuse 13.1 is by far the most frustrating version I have ever used.

The following Monday I tried again, to no avail. Sorry to say, the libjpeg problem was fatal. I finally gave up on this version of Opensuse. I wiped the machine, and purchased and installed a copy of Suse Linux Enterprise Server 11 SP3 instead.

