I've been using my last-generation gaming PC as my home NAS/server for about 7-8 years now, serving media files to Plex and Jellyfin, hosting Docker containers for various services, and acting as my mass data storage and archive. Having worked in IT full time for 16 years, I'll admit there have been times when I neglected some things at home, hoping they would keep working with no repercussions.
About 3 weeks ago, we had a storm that knocked our power out a few times, which killed off a 3TB drive that I've had for an unknown number of years. All of my computers are plugged into a UPS, but I've had the worst luck trying to get UPS controls working properly with NUT and Ubuntu Linux; shutting down the server would also cut power to other devices for some reason (the UPS shows good health too!). I was lucky that I only lost 2 years of Clonezilla OS images (a "backup" of sorts for my OS drive), so nothing of REAL importance in my opinion.
This really got me paranoid and sent me down a rabbit hole on how I can properly store and back up all of my important data. Actually, it made me re-think my entire home network setup, realizing that I've been neglecting it for some time now.
Assessing the Damage
My first step was to take a full inventory of all my data and hard drives, getting a good idea of where things were. This led to a few notes and tables in Obsidian that I've heavily referenced while researching my next steps. Over a span of 7-8 years, I have not been a good steward of keeping things tidy on the server, so I took time to combine things, move some files around, and simplify the layout and structure. I also used SmartMonTools to check the health and "Power On Hours" of my drives to get a rough idea of how old they were.
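As a rough illustration of that drive-age check, here is a minimal Python sketch, not the exact steps I ran. It assumes smartmontools 7.0 or newer (for the `--json` flag) and that the drive reports a `power_on_time` entry; the function names are made up for this example.

```python
import json
import subprocess

HOURS_PER_YEAR = 24 * 365  # 8760, ignoring leap days


def read_smart_report(device: str) -> dict:
    """Run smartctl on a device and return its JSON report (usually needs root)."""
    # smartctl's exit status is a bitmask and can be non-zero even when the
    # report is usable, so the return code is deliberately not checked here.
    result = subprocess.run(
        ["smartctl", "--json", "--all", device],
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


def power_on_hours(report: dict):
    """Return the power-on hours from a report, or None if the drive omits them."""
    return report.get("power_on_time", {}).get("hours")


def approx_years(hours: int) -> float:
    """Convert power-on hours into a rough age in years of continuous runtime."""
    return round(hours / HOURS_PER_YEAR, 1)
```

Calling `power_on_hours(read_smart_report("/dev/sda"))` (the device path is only an example) and passing the result to `approx_years` gives a ballpark age. It only counts time the drive was powered, so it is a lower bound on how old the drive really is.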
Next, I replaced the now-dead 3TB hard drive with a manufacturer-refurbished 12TB Seagate IronWolf Pro hard drive from ServerPartDeals.com. I then copied all of my now-organized data over to the new drive and pointed all of my services to it (mainly Docker and Samba); this way, my old drives would become an "archive".
Fresh Start for the OS
Using Ubuntu Server for so many years (many updates and 1 major LTS version upgrade) means a lot of leftover, residual junk is hanging around from where I've tried applications, packages, Docker containers, etc. I'm not a fan of things like this "clogging up" the system, and I've been trying to think of things like operating systems as "disposable" instead of "critical". This is one reason why I like Docker containers so much: the persistent data is critical but the application stack isn't. App acting up and not working as expected? Kill the container, build a new one, mount the persistent data, and you have a fresh instance.
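To make that "kill it and rebuild it" idea concrete, here is a small Python sketch that wraps the Docker CLI. Everything in it is hypothetical (the function names, the container name, image, and paths); it only shows the shape of the workflow: the container is thrown away, while the host directory holding the persistent data is mounted back in.

```python
import subprocess


def recreate_commands(name: str, image: str, host_dir: str, container_dir: str) -> list:
    """Build the docker commands that replace a container but keep its data."""
    return [
        # Remove the old container; the data lives in host_dir, not in the container.
        ["docker", "rm", "--force", name],
        # Start a fresh container with the same persistent data mounted.
        ["docker", "run", "--detach", "--name", name,
         "--volume", f"{host_dir}:{container_dir}", image],
    ]


def recreate(name: str, image: str, host_dir: str, container_dir: str) -> None:
    """Run the removal (allowed to fail) and then the fresh start (must succeed)."""
    remove_cmd, run_cmd = recreate_commands(name, image, host_dir, container_dir)
    # The removal may fail if the container does not exist yet; that is fine.
    subprocess.run(remove_cmd, check=False)
    subprocess.run(run_cmd, check=True)
```

For example, `recreate("myapp", "example/app:latest", "/srv/myapp", "/data")` would drop the `myapp` container and start a new one with `/srv/myapp` mounted at `/data`. In practice a Docker Compose file does the same job declaratively.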
One option was to reinstall Ubuntu Server with the latest LTS version, but then I started to read about another option that seemed a little more interesting: NixOS. NixOS follows the ideology that the OS itself is not what's important, but rather the state of the OS and the persistent data. Imagine setting up the entire server in a single (or multiple) configuration file(s) one time and not having to really care about the OS anymore. Also imagine trying something out and being able to roll back easily if it doesn't turn out like you wanted. No more creating images via Clonezilla each quarter, no more residual junk hanging around, and most importantly, your server configuration is self-documented. If the OS screws up or the hard drive dies, I can simply re-install the OS, put in my configuration file, and I'm back exactly to where I want the server to be.
Converting
I decided to go through the entire server, write down every task that it performs, and then find the best way to declare each one in a Nix configuration. This was actually really simple to do, and I was able to have the server up and running in no time. The only issue that I ran across is that some of the Docker containers would not spin up when declared in Nix. For those, I had to fall back to a simple Docker Compose file, which worked with no problem. I would say that this is my fault due to a lack of knowledge on some things, but that's another story.
Things Left to Do
One thing that is still on my to-do list is to get Network UPS Tools (NUT) working properly with my UPS, but have it declared in a Nix way. I plan on doing this soon so I can physically unplug the UPS and make sure that it works as intended.
Finally, I was able to set up Borgmatic for a few backup tasks; however, the cron job I've set up doesn't want to run the command for some reason.
Needless to say, I'm about 80% or so through the conversion process and really happy with the results. I'm glad that I have a Nix configuration that I can easily reproduce and a little more peace of mind when it comes to any hardware dying on me.
Once I get to a good stopping point where my server won't see many more changes, I'll write a few more posts that go more in depth. You can see my Nix configuration on my GitHub page.