FreeBSD 10.1 upgrade trouble

2014-12-15

I’ve used freebsd-update(8) since its release, and I can only recall one issue where I had to take extreme action (rolling back, due to a driver issue with an Intel 10Gb Ethernet card…).

Both my co-worker and I faced the same upgrade problem between FreeBSD 10 and 10.1, and it turns out we were not alone.

Here is the scene. It’s Friday afternoon, which means the fantastic idea of upgrading a VM used to build our custom FreeBSD packages popped into my head.

What could go wrong?

Installing updates...Segmentation fault (core dumped)
Segmentation fault (core dumped)
Segmentation fault (core dumped)
Segmentation fault (core dumped)
Segmentation fault (core dumped)
Segmentation fault (core dumped)
Segmentation fault (core dumped)
Segmentation fault (core dumped)
Segmentation fault (core dumped)
Segmentation fault (core dumped)
Segmentation fault (core dumped)

A quick Google search brought me to the FreeBSD Forums, where it seems the main problem was with nsswitch.

We, like other posters in that thread, use network-based authentication, either via LDAP or Winbind. While Samba authentication seemed to work after the first freebsd-update round, freebsd-update was seg-faulting while setting file permissions.

This left me in the lurch. First off, this is a VM running on a bare-bones Xen hypervisor. Booting off a USB image to reinstall is kind of a pain, and it’s difficult to get to the boot options before the VM is already loading the FreeBSD boot loader.

Secondly, it seemed that all of the userland tools that dealt with writing data would seg-fault.
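To see which name service databases actually point at LDAP or Winbind, something like the following could be used. This is only a sketch: the function name list_network_nss is made up for illustration, and it assumes the sources appear in nsswitch.conf under the names ldap and winbind.

```shell
#!/bin/sh
# Hypothetical helper: print every nsswitch.conf entry that uses the
# ldap or winbind source, one "database: source" pair per line.
list_network_nss() {
    awk '$1 !~ /^#/ && $1 ~ /:$/ {
        # $1 is the database name (e.g. "passwd:"); the rest are sources.
        for (i = 2; i <= NF; i++)
            if ($i == "ldap" || $i == "winbind")
                print $1, $i
    }' "$1"
}
```

Running list_network_nss /etc/nsswitch.conf on a machine configured like ours would be expected to list the passwd and group databases.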

Since I knew that the issue lay within nsswitch, I decided to copy the stock file from /usr/src/etc/nsswitch.conf to /etc:

# cp /usr/src/etc/nsswitch.conf /etc/nsswitch.conf
# cat /etc/nsswitch.conf
#
# nsswitch.conf(5) - name service switch configuration file
# $FreeBSD: releng/10.1/etc/nsswitch.conf 224765 2011-08-10 20:52:02Z dougb $
#
group: compat
group_compat: nis
hosts: files dns
networks: files
passwd: compat
passwd_compat: nis
shells: files
services: compat
services_compat: nis
protocols: files
rpc: files
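A slightly more careful variant of the copy above keeps the existing file around, so the LDAP or Winbind entries can be restored once the upgrade is finished. This is a sketch rather than what was run here; the function name replace_with_stock and the .pre-upgrade suffix are assumptions.

```shell
#!/bin/sh
# Hypothetical helper: back up a config file, then replace it with a stock copy.
replace_with_stock() {
    stock=$1
    target=$2
    # Stop early if the stock file is missing (e.g. an incomplete /usr/src).
    [ -f "$stock" ] || { echo "missing stock file: $stock" >&2; return 1; }
    # Keep the current file, with its permissions, next to the original.
    if [ -f "$target" ]; then
        cp -p "$target" "$target.pre-upgrade" || return 1
    fi
    cp "$stock" "$target"
}
```

It would be called as replace_with_stock /usr/src/etc/nsswitch.conf /etc/nsswitch.conf, leaving the old configuration in /etc/nsswitch.conf.pre-upgrade.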

Then I attempted a ‘make buildworld’, but /usr/src was incomplete, and the make process immediately errored out due to missing source files.

With this knowledge, I fetched the 10.1 txz archive files from ftp://ftp.freebsd.org:

# fetch ftp://ftp.freebsd.org/pub/FreeBSD/releases/amd64/10.1-RELEASE/kernel.txz
# fetch ftp://ftp.freebsd.org/pub/FreeBSD/releases/amd64/10.1-RELEASE/base.txz
# fetch ftp://ftp.freebsd.org/pub/FreeBSD/releases/amd64/10.1-RELEASE/lib32.txz
# fetch ftp://ftp.freebsd.org/pub/FreeBSD/releases/amd64/10.1-RELEASE/doc.txz
# fetch ftp://ftp.freebsd.org/pub/FreeBSD/releases/amd64/10.1-RELEASE/src.txz
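Since these archives are about to be unpacked over a live system, it may be worth checking them before extraction. The sketch below assumes a MANIFEST file has been fetched from the same release directory and that its first two whitespace-separated columns are the archive name and its SHA-256 hash. The function names are made up for illustration.

```shell
#!/bin/sh
# Print the SHA-256 of a file using whichever tool is available.
sha256_of() {
    if command -v sha256 >/dev/null 2>&1; then
        sha256 -q "$1"                      # FreeBSD
    else
        sha256sum "$1" | awk '{print $1}'   # fallback for systems with sha256sum
    fi
}

# verify_dist MANIFEST FILE: succeed only if FILE's hash matches its MANIFEST entry.
verify_dist() {
    manifest=$1
    file=$2
    # Look up the expected hash by archive name (column 1 -> column 2).
    expected=$(awk -v name="$(basename "$file")" '$1 == name {print $2}' "$manifest")
    [ -n "$expected" ] || { echo "$file: not listed in $manifest" >&2; return 1; }
    [ "$(sha256_of "$file")" = "$expected" ]
}
```

For example, verify_dist MANIFEST base.txz || echo "base.txz is corrupt" would flag a damaged download before anything is unpacked.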

Extracted:

# sh
# export DESTDIR=/
for file in base.txz kernel.txz doc.txz src.txz; 
do (cat $file | tar --unlink -xpJf - -C ${DESTDIR:-/}); 
done

This worked out okay, except for extracting base.txz, which I had anticipated.

Just to ensure a sane userland, I then ran make buildworld followed by make installworld.

I’m certainly happy this was not a more critical piece of our infrastructure, but I am usually very cautious during those upgrades (never on a Friday afternoon :) ).

After all of this, I reinstalled the packages, re-ran the salt states, and I was up and running once again.