One-day build: Testing Gluster on ARM
I’ve had a small group of four ROCK64 SBCs for a while now, without much purpose other than testing. Each has one USB 3.0 port, some reasonably powerful CPU cores, and a decent amount of RAM for an SBC. I also recently freed up four Sabrent USB-SATA adapters from another project and thought about making a four-node storage cluster with my ARM devices, mostly as a proof-of-concept.
To make it a bit of a challenge, I decided to see how far I could get in a day, using only these four SBCs and Ansible. I ended up with a reasonably fast cluster and some realizations about how far ARM SBCs have come, and how far they still have to go.
I put the resulting role on GitHub, and I'm planning to keep updating it in the future. There are a few possibilities for a small cluster like this; something like Gluster geo-replication could eventually be useful as a remote, redundant backup.
Ceph vs. Gluster
Originally I wanted to use Ceph, but I ran into a series of issues. Ceph only has APT support for a small number of OSes, and Bookworm isn't one of them: Bullseye is only supported by Pacific and Quincy, while Reef only supports Ubuntu 20.04 (Focal) and 22.04 (Jammy). There were other issues around architecture names, the special kernel mods from Armbian, and a bunch more that threatened to eat up the time I would need to get this all working in a day.
After a few attempts with no success, I decided to switch to GlusterFS, since that has a native server and client in Debian. In the future I’ll revisit Ceph with my Raspberry Pis, since those have an official Ubuntu build that might be more stable. I might also try it with my RISC-V devices, just to see if I can even get it running on those.
Gluster isn’t as flexible as Ceph, but in some ways it’s better suited to this, since it’s far faster to get running. All Gluster needs is a server daemon, a backing filesystem, and some straightforward commands, and voilà, a working GlusterFS filesystem. Ansible can handle that easily.
The Process
I decided to break the Gluster process down into parts to make building the playbook easier. Once each step was working, I could move on to the next, testing as I went. This made the whole process both fast and simple.
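As a rough idea of the shape this took, here's a minimal sketch of splitting the role into per-step task files. The file names and the guard variable are illustrative, not necessarily what the role in the repo actually uses:

```yaml
# tasks/main.yml: hypothetical layout, one task file per step
- name: Wipe and partition the disks (only when explicitly enabled)
  ansible.builtin.include_tasks: partition.yml
  when: gluster_wipe_disks | default(false) | bool

- name: Create and mount the XFS backing filesystem
  ansible.builtin.include_tasks: filesystem.yml

- name: Install GlusterFS, peer the nodes, and create the volumes
  ansible.builtin.include_tasks: gluster.yml
```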
Step 1: Cleaning the Disks
The first step was to wipe the disks. Ansible has a community module for parted, along with documented examples for clearing disks, which made this part very easy. To prevent the role from deleting the partitions every single time it runs, I added a set of flags that, when unset, skip over the partitioning steps.
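A minimal sketch of that wipe step, loosely based on the parted module's documented example for removing all partitions. The gluster_wipe_disks flag and the device path are placeholders, not necessarily what the role in the repo uses:

```yaml
# Probe the disk first, then remove every partition it reports.
- name: Read partition information from the target disk
  community.general.parted:
    device: /dev/sda        # placeholder; the real device is a role variable
    unit: MiB
  register: disk_info

- name: Remove all existing partitions
  community.general.parted:
    device: /dev/sda
    number: "{{ item.num }}"
    state: absent
  loop: "{{ disk_info.partitions }}"
  when: gluster_wipe_disks | default(false) | bool   # guard flag: skip unless explicitly enabled
```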
The first issue appeared early: one of the drives wouldn't accept the modifications, which produced errors about the partition table needing to be refreshed and the changes not being handled properly. Attempting a reboot made things worse, since the SBC tried to boot from the external SSD, which had originally been a boot drive. If Ansible had triggered that reboot, the play would have failed, because the device would never have come back up.
The drive itself turned out to be the problem: it had effectively gone into read-only mode. After swapping it out, the errors went away and I moved on. It's a precaution I'll have to take with this setup in the future: either pre-wipe the drives, or accept that any disk error will end the playbook and require some manual intervention.
Step 2: Creating the File System
After the partitioning, I moved on to the backing filesystem. Originally I wanted to use ZFS to take advantage of snapshots, scrubbing, and potentially mirroring or adding spares in the future. The problem was that the zfs-dkms package just would not install. I eventually traced the problem back to the custom Armbian kernel, since even after installing the kernel headers, the package still failed.
I decided that a one-day build was not the time to learn how to recompile the Armbian kernel, and switched to XFS, which is what Gluster recommends in its install guide anyway. While less capable, it's also less work: all that's needed is to create the partition and the filesystem, mount it, and create the folder structure.
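As a sketch, that whole chain of partition, filesystem, mount, and brick directories looks something like this in Ansible; the device, mount point, and volume names here are placeholders:

```yaml
- name: Create a single partition spanning the disk
  community.general.parted:
    device: /dev/sda          # placeholder device
    number: 1
    label: gpt
    state: present

- name: Format the partition as XFS
  community.general.filesystem:
    dev: /dev/sda1
    fstype: xfs

- name: Mount the backing filesystem
  ansible.posix.mount:
    path: /data/glusterfs     # placeholder mount point
    src: /dev/sda1
    fstype: xfs
    state: mounted

- name: Create brick directories for the two volumes
  ansible.builtin.file:
    path: "/data/glusterfs/{{ item }}/brick"
    state: directory
    mode: "0755"
  loop:
    - vol01                   # placeholder volume names
    - vol02
```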
However, somewhere in the process, probably during the repeated installation and uninstallation of the ZFS DKMS module, one of the ROCK64s stopped booting. After some UART debugging, it looked like the initramfs had become corrupted. This led to a step I hadn't been anticipating.
Unintentional Step 2.5: Disaster Recovery
One of the reasons I use DietPi is that I can copy in a config file when I flash the OS, and it will install and set up the entire system for me. This works very, very well with Ansible since I can drop in the SSH keys, and boom, working system.
After confirming that I had indeed corrupted the initramfs on one ROCK64's eMMC module, and figuring I had likely caused issues on all four during my debugging and testing, I decided to start with a clean slate and set them all up from scratch.
Thankfully DietPi makes this easy. I had the whole cluster back up and running in about ten minutes, with most of that time spent removing and flashing the eMMC modules, copying in the parameters file, reinstalling the modules, and waiting for the systems to boot so I could run my setup playbooks.
One theory as to why the system got corrupted is that this particular ROCK64 was the one that accidentally tried to boot from the dead SSD. If something on that SSD loaded and broke the eMMC, that would explain why only this one failed to start back up. I decided not to risk any more unwanted USB booting and zeroed out all four SSDs before continuing with the experiment.
Step 3: File Structure and Gluster Cluster
For my test I decided to have two Gluster volumes, so I would need two backing stores. With XFS, each is just a folder on the filesystem, which made it easy to handle with Ansible. After mounting the partition and creating the file structure, it was a simple matter of installing the Gluster server, starting it, and peering the servers.
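Roughly, that install-and-peer sequence looks like the tasks below. I'm assuming an inventory group called gluster and the gluster.gluster collection's peer module here; the actual role may do the same thing with the plain gluster CLI instead:

```yaml
- name: Install the GlusterFS server
  ansible.builtin.apt:
    name: glusterfs-server
    state: present
    update_cache: true

- name: Start and enable the Gluster daemon
  ansible.builtin.systemd:
    name: glusterd
    state: started
    enabled: true

- name: Peer all nodes in the cluster (run from a single node)
  gluster.gluster.gluster_peer:
    state: present
    nodes: "{{ groups['gluster'] }}"   # assumed inventory group name
  run_once: true
```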
This last part was anticlimactic. It just worked, and then mounted. No problems, no issues, no fuss.
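Volume creation and the client mount follow the same pattern. The sketch below assumes a 2x2 distributed-replicated layout and the gluster.gluster.gluster_volume module; the volume name, replica count, and mount point are placeholders rather than what my role necessarily does:

```yaml
- name: Create a distributed-replicated volume across the four nodes
  gluster.gluster.gluster_volume:
    state: present
    name: vol01
    bricks: /data/glusterfs/vol01/brick
    replicas: 2                        # assumed layout: 2 replicas over 4 bricks
    cluster: "{{ groups['gluster'] }}"
  run_once: true

- name: Mount the volume (requires glusterfs-client on the mounting host)
  ansible.posix.mount:
    path: /mnt/gluster/vol01
    src: "{{ groups['gluster'][0] }}:/vol01"
    fstype: glusterfs
    opts: defaults,_netdev
    state: mounted
```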
Step 4: Benchmarking
The results aren’t too surprising. The Gluster cluster isn’t fast, and the Gigabit network wasn’t much of a bottleneck. At barely 0.61 MiB/s reads and 0.41 MiB/s writes, the cluster is less than a tenth as fast as the test VM’s virtual disk on ZFS-mirrored SSDs. In a more real-world test, I copied a 10 GB file from the test system to the cluster; it reached 55 MB/s and took about 3 minutes on average to rsync the file.
I’ll include the test results here since they aren’t very relevant to the git repo.