In my earlier post I described my experiments with LXC on Debian 9, part of my project to move away from OpenVZ containers, with good success.
However, at work we run CentOS servers exclusively, and we have a lot of custom-built RPMs and team knowledge around CentOS. Switching to Debian just to move away from OpenVZ containers would be a dramatic change and would require re-training the team.
We have no issues with CentOS 6 beyond its age, so if we could move to a newer container system (like LXC) while migrating only to CentOS 7, that would be ideal.
It was with this mindset that I set about trying LXC (and LXCFS) on CentOS 7.
In summary, LXC and LXCFS work great on CentOS 7. However, I initially had a lot of trouble with LXCFS, because CentOS's kernel lacks cgroup namespaces, and LXCFS's emulation of them caused containers to hang intermittently when starting and stopping.
The solution was to move to an ELRepo-style vanilla upstream kernel, rebuilt as my own package, kernel-llt (Latest Long Term), which provides a more current kernel while retaining long-term stability and support.
LXC 3.x package for CentOS 7
The first step was to produce an RPM to install the latest stable LXC version (3.0.2 at the time of writing).
Thankfully, there was already a SPEC file in the official LXC repo, which I tried initially. However, I found that it pulled in some additional network dependencies (like dnsmasq) that I didn't require.
LXC is simple to compile, so I produced a new minimal LXC SPEC file and built my RPM from that.
This worked like a charm, and I was now able to start creating containers on CentOS 7.
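For reference, a minimal SPEC for such a build might look roughly like the sketch below. This is an illustrative skeleton, not my actual file; the version, build options, and file list are assumptions:

```spec
Name:           lxc
Version:        3.0.2
Release:        1%{?dist}
Summary:        Linux Containers userspace tools (minimal build)
License:        LGPLv2+
URL:            https://linuxcontainers.org/
Source0:        https://linuxcontainers.org/downloads/lxc/lxc-%{version}.tar.gz
BuildRequires:  gcc, make, libcap-devel, libseccomp-devel

%description
Minimal LXC build without the optional network tooling
(dnsmasq etc.) pulled in by the upstream SPEC.

%prep
%setup -q

%build
%configure
make %{?_smp_mflags}

%install
make install DESTDIR=%{buildroot}

%files
%{_bindir}/lxc-*
%{_libdir}/liblxc.so.*
%{_datadir}/lxc/
```

A SPEC like this is built with rpmbuild -ba lxc.spec; once installed, containers can be created with, for example, lxc-create -t download -n test1.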
LXCFS 3.x package for CentOS 7
Next up, the LXC project provides another application called LXCFS. This application runs as a service on the node and provides emulation of /proc and other virtual Linux filesystems that expose information about the running machine.
The idea behind LXCFS is to emulate some of these virtual system files so that applications running inside a container appear to be running inside a proper virtual machine. This allows commands like free, for example, to show the memory allowed in the container rather than on the host.
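A quick way to see which view of /proc a process is getting. This is a sketch based on how lxcfs works in general: it bind-mounts its FUSE-backed files over entries such as /proc/meminfo, so inside a container with lxcfs active the figure reflects the cgroup memory limit rather than the host's RAM (the /var/lib/lxcfs path is the common default and may vary by packaging):

```shell
# Inspect MemTotal as free(1) would see it. On the host (or in a
# container without lxcfs) this is the machine's physical RAM; with
# lxcfs mounted over /proc/meminfo inside a container it is the
# container's memory limit.
grep MemTotal /proc/meminfo

# lxcfs's emulated files live under its FUSE mount on the host,
# typically /var/lib/lxcfs:
ls /var/lib/lxcfs/proc 2>/dev/null || echo "lxcfs not mounted here"
```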
There was no RPM SPEC file available for LXCFS, so again I created my own.
At first this appeared to work well. Containers started and could see the emulated system files, so applications like top functioned as if they were running inside a virtual machine (showing the container's memory limits and uptime).
For a few days all seemed well. However, during routine starting and stopping of containers, I noticed that they would intermittently hang during shutdown. The lxc-stop command would never return (unless forcefully killed), leaving the container running, with systemd just waiting inside it.
After some diagnosis I found that LXCFS itself was crashing, leaving errors in the syslog, so I opened an issue with the LXCFS team. Christian Brauner quickly fixed a potential memory-free bug, but sadly this did not solve the problem. He asked for a core dump captured with GDB (a learning experience in itself for me), which I provided for several instances of the crash.
Unusually, in this instance the LXCFS devs were stumped and no solution was forthcoming. I suspect their focus is on ensuring LXC works well on vanilla and Ubuntu kernels, not on the rather strange CentOS kernels, which tend to be very old with many features back-ported into them.
Undeterred, I tried an alternative approach, at least to diagnose where the problem was coming from (even if I couldn't fix it). From my experience creating the RPM for LXCFS, I knew that LXCFS installs a mount hook script that runs for all containers. The script is broadly split into two sections: one sets up the emulation of the /proc directory, and the other sets up emulation of the cgroup filesystem (which is skipped on kernels that provide cgroup namespaces directly).
I tried adding an exit statement at various places in the hook script to see whether the container would still start, whether the LXCFS functionality was broken, and whether the shutdown hangs were fixed.
I found that adding an exit statement after the /proc emulation part, but before the cgroup emulation part, allowed the majority of the LXCFS functionality to keep working, and the shutdown hangs no longer occurred.
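The experiment can be sketched like this. This is an illustrative skeleton, not the actual hook script; the function names are invented stand-ins showing the two halves and where the exit was placed:

```shell
#!/bin/sh
# Hypothetical skeleton of the lxcfs mount hook's two sections.
# The real script bind-mounts lxcfs files; these functions just
# echo markers so the control flow is visible.

setup_proc_emulation() {
    # stands in for: bind-mount lxcfs files over /proc/meminfo,
    # /proc/cpuinfo, /proc/uptime, etc.
    echo "proc emulation configured"
}

setup_cgroup_emulation() {
    # stands in for: emulate a per-container cgroup tree (only
    # needed on kernels without cgroup namespaces)
    echo "cgroup emulation configured"
}

run_hook() {
    setup_proc_emulation
    exit 0   # added exit: skip the cgroup emulation below, which
             # was triggering the shutdown hangs
    setup_cgroup_emulation
}

# Run the hook in a subshell so its exit doesn't end this script.
output=$(run_hook)
echo "$output"
```

Running this prints only the /proc marker, mirroring what happened on the real system: /proc emulation stayed active while the cgroup half never ran.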
This gave me a clue, as the comments in the hook script suggested that if the kernel had cgroup namespaces available, the script would exit early and skip its latter half.
A newer kernel was needed
I needed a newer kernel, one that supported cgroup namespaces. Support wasn't added until kernel 4.6 (which probably explains why Debian's 4.9 kernel doesn't experience these issues).
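Whether a running kernel provides cgroup namespaces is easy to check: on kernels with support (4.6+), each process has a cgroup entry under /proc/<pid>/ns/. A quick check:

```shell
# Print the kernel version and whether cgroup namespaces are available.
uname -r
if [ -e /proc/self/ns/cgroup ]; then
    echo "cgroup namespaces: supported"
else
    echo "cgroup namespaces: NOT supported (lxcfs must emulate cgroups)"
fi
```

On a stock CentOS 7 kernel (3.10.x) the ns file is absent, which is exactly why the lxcfs hook fell through to its cgroup emulation half.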
After some searching, I found a third-party YUM repo, ELRepo, whose kernel-lt packages provide RPMs of the vanilla upstream kernel, built directly from kernel.org sources but packaged for compatibility with CentOS 7. Awesome!
However, there was another snag: the kernel-lt (Long Term) version supplied by ELRepo was 4.4, but I needed at least 4.6. I could see at kernel.org that the current long-term kernel was 4.14, which would be ideal.
So I set about using the newer SPEC file from the kernel-ml package and backporting it to build kernel 4.14. I am now maintaining this as a package called kernel-llt (Latest Long Term) for use with CentOS 7. I plan to switch it to 4.19 once that becomes the latest long-term version at kernel.org.
With kernel 4.14 installed, LXC and LXCFS have been rock solid for months now!