I’ve been a fan of ZFS storage for a while now and I find it particularly useful in this type of standalone environment. One of my constant preoccupations (personally and professionally) is data protection from both a backup and disaster recovery perspective. So in this case, I’m going to create a VM locally that is going to play the role of a storage server that will publish NFS shares to the ESXi server internally.
From here, I gain all of the advantages of the ZFS built-in features like snapshots, compression, and the ability to efficiently replicate data back to the ZFS servers at home.
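The replication in question boils down to zfs send/receive piped over ssh. Here's a hedged sketch of what an incremental run looks like; the snapshot names, the "backuphost" hostname and the "backup" pool are placeholder examples, not my actual setup:

```shell
# Take a new snapshot, then send only the blocks that changed since the
# previous one to the remote machine. Names here are examples.
zfs snapshot ssd/mail@rep-today
zfs send -i ssd/mail@rep-yesterday ssd/mail@rep-today | \
    ssh backuphost zfs receive backup/mail
```

In practice you'd script the snapshot rotation so each run sends the increment from the last successful transfer.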
There are basically two main streams of ZFS based systems: Solaris kernel based solutions and BSD ones. ZFS on Linux is making some headway, but I haven’t yet had the time to poke at the current release to check out its stability.
If you want a simple setup with a really nice web interface for management, FreeNAS is probably the best bet. I personally prefer OmniOS for this kind of thing since I work mostly from the command line and it’s a very lean distribution with almost no extra overhead. Generally speaking, the big advantage of FreeNAS is its extensive hardware support thanks to its BSD roots, but in my case the VM “hardware” is supported either way, so this matters less.
I’ll give a walkthrough on the process for using OmniOS here.
You could simply deploy the ZFS VM directly on the virtual LAN network you’ve created, but I like to keep storage traffic segmented from the regular network traffic. So I create a new virtual switch with a port for connecting VMs called Storage (or NFS or whatever). I also create a new VMkernel interface on this vSwitch using a completely different subnet that will be used to connect the NFS server to the ESXi server.
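If you'd rather script this than click through the vSphere client, the same setup can be done from the ESXi shell with esxcli (assuming ESXi 5.x; the switch, port group and address values here are just examples):

```shell
# Create a dedicated vSwitch, a port group for the storage VM, and a
# VMkernel port on a separate subnet. Names and IPs are examples.
esxcli network vswitch standard add --vswitch-name=vSwitchStorage
esxcli network vswitch standard portgroup add --portgroup-name=Storage \
    --vswitch-name=vSwitchStorage
esxcli network ip interface add --interface-name=vmk1 --portgroup-name=Storage
esxcli network ip interface ipv4 set --interface-name=vmk1 \
    --ipv4=192.168.101.1 --netmask=255.255.255.0 --type=static
```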
If this server were completely isolated and there were no need to route traffic, I would just leave it this way, but there are a couple of things to keep in mind: how do you manage the server, and how does it replicate data? You could manage the server using the Remote Console in the VMware client, but ssh is more convenient. For replication, the server will also need to connect to machines on the remote network, so it needs to be attached to the routed LAN subnet as well.
So the preferred configuration for the VM is with two network cards, one for the LAN for management and replication, and one for NFS traffic published to the ESXi server.
Installation & Configuration
The install process is just a matter of downloading the ISO, attaching it to a new VM and following the instructions. I configured mine with 3 GB of RAM (ZFS likes memory), an 8 GB virtual hard disk from the SSD for the OS, and two 400 GB virtual disks, one from the SSD and one from the HD. When I build custom appliances like this, I usually use the VMDirectPath feature to pass a disk controller directly to the VM, but that’s a little hard to bootstrap remotely.
Then there’s some configuration to do. The first stages are best done from the console, since you’ll be resetting the networking from DHCP to a fixed address.
Here’s a set of commands that will configure the networking to use a fixed IP address, setup DNS and so on. You’ll need to replace the values with ones appropriate to your environment. Most of this requires that you run this as root, either via sudo or by activating the root account by setting a password.
echo "192.168.3.1" > /etc/defaultrouter # sets the default router to the LAN router
echo "nameserver 192.168.3.201" > /etc/resolv.conf # create resolv.conf and add a name server
echo "nameserver 192.168.2.205" >> /etc/resolv.conf # append a second DNS server
echo "domain infrageeks.com" >> /etc/resolv.conf # and your search domain
echo "infrageeks.com" > /etc/defaultdomain
cp /etc/nsswitch.dns /etc/nsswitch.conf # a configuration file for name services that uses DNS
svcadm enable svc:/network/dns/client:default # enable the DNS client service
echo "192.168.3.2 nfsserver" >> /etc/hosts # the LAN IP and the name of your server
echo "192.168.3.2/24" > /etc/hostname.e1000g0 # the IP for the LAN card - a virtual Intel E1000 card
echo "192.168.101.2/24" > /etc/hostname.e1000g1 # the IP for the NFS card
svcadm disable physical:nwam # disable the DHCP service
svcadm enable physical:default # enable fixed IP configuration
svcadm enable svc:/network/dns/multicast:default # enable mDNS
Reboot and verify that all of the network settings work properly.
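A quick way to do that check from the console (the router address here matches the example configuration above):

```shell
ifconfig -a              # both e1000g interfaces should show their fixed IPs
ping 192.168.3.1         # the default router should answer "is alive"
nslookup www.google.com  # confirms the DNS servers in resolv.conf work
```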
Setting up zpools
The first thing you’ll need to do is to find the disks and their current identifiers. You can do this with the format command to list the disks and Ctrl-C to cancel the format application.
# format
Searching for disks...done

AVAILABLE DISK SELECTIONS:
       0. c2t0d0 <VMware-Virtualdisk-1.0 cyl 4093 alt 2 hd 128 sec 32>
          /pci@0,0/pci15ad,1976@10/sd@0,0
       1. c2t1d0 <VMware-Virtual disk-1.0-400.00GB>
          /pci@0,0/pci15ad,1976@10/sd@1,0
       2. c2t2d0 <VMware-Virtual disk-1.0-400.00GB>
          /pci@0,0/pci15ad,1976@10/sd@2,0
Specify disk (enter its number): ^C
In my case, my two 400 GB disks are available at the identifiers c2t1d0 and c2t2d0, which correspond to Controller 2, Targets 1 & 2, Disk 0. The first entry is the OS disk.
I’m going to create my first non-redundant zpool on the SSD disk (the first one) with:
zpool create ssd c2t1d0
Note that these names are case sensitive. The result is a new zpool called ssd which is automatically mounted on the root file system, so its contents are available at /ssd.
Before continuing, there are a few settings that I like to configure. Any filesystems I create will inherit these settings, so I only need to set them once.
zfs set atime=off ssd
zfs set compression=on ssd
Disabling atime means that the filesystem won’t update the last-accessed time metadata on files, which incurs an unnecessary write overhead for our use case. On a regular file server this value may be more useful. Setting compression is a practically free way to get more usable space, and the overhead is negligible.
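You can confirm the settings took, and later see how much compression is actually buying you:

```shell
zfs get atime,compression ssd   # should show off / on with SOURCE "local"
zfs get compressratio ssd       # check real savings once data is on the pool
```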
Now that we have a pool, we can create filesystems on it. I’m going to keep it simple with top level filesystems only. So for my setup, I’m creating one filesystem for my mail server all by itself and a second one for general purpose VMs on the LAN.
zfs create ssd/mail
zfs create ssd/lan
To make these accessible to ESXi over NFS, we need to share the filesystems:
zfs set sharenfs=rw=@192.168.101.0/24,root=@192.168.101.0/24 ssd/mail
Here I’m publishing the volume over NFS to all IP addresses in the 192.168.101.0/24 subnet. Currently the only other IP in this zone is the VMkernel interface on the storage vSwitch, but you could imagine having others here. This means that the NFS share is not available to the LAN, which is the desired configuration since the only thing on this volume should be the mail server VM. You can also set this to use specific IP addresses by omitting the @ prefix, which designates a subnet.
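To double-check what’s actually being exported, you can look at the property and the active shares:

```shell
zfs get sharenfs ssd/mail   # shows the NFS options set on the filesystem
share                       # lists everything currently shared over NFS
```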
Now I’ve had some permissions issues from time to time with NFS and VMware, so to keep things simple, I open up all rights on this filesystem. Not really a security issue since the only client that can see it is the ESXi server.
chmod -R a+rwx /ssd/mail
Now on the ESXi Configuration we need to mount the NFS share so that we can start using it to store VMs. This is done in Configuration > Storage > Add Storage… and select NFS.
The server address is its IP address on the storage subnet, and the path to the share is the complete path: /ssd/mail in this case.
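This step can also be scripted from the ESXi shell; the addresses and the datastore name here are examples matching the setup above:

```shell
# Mount the ZFS-backed NFS share as a datastore named "ssd_mail".
esxcli storage nfs add --host=192.168.101.2 --share=/ssd/mail \
    --volume-name=ssd_mail
esxcli storage nfs list   # verify the datastore shows up
```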
Snapshots and backups
Now that we’ve got a VMware datastore where we can put VMs, we can profit from the advanced features of the ZFS filesystem underlying the NFS share.
The first cool feature is snapshots, which record the state of the filesystem at a point in time so that you can roll back to it later, or mount it as a new filesystem to pick out individual files you want to recover. It’s worth remembering that snapshots are not backups: if you lose the underlying storage device, you lose the data. But they are a reasonable tool for data recovery.
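Mounting a snapshot for file recovery is done with a clone (the snapshot and clone names here are examples):

```shell
# A clone is a writable filesystem based on a snapshot; it costs almost
# no space until you modify it.
zfs clone ssd/mail@snapshot_name ssd/mail_recovery
ls /ssd/mail_recovery           # browse and copy out the files you need
zfs destroy ssd/mail_recovery   # drop the clone when you're done
```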
Creating snapshots is as simple as:
zfs snapshot ssd/mail@snapshot_name
Any new writes are now written to storage in such a way as to not touch the blocks referred to by the filesystem at the time the snapshot was taken. The side effect is that you can consume more disk space, since all writes have to go to fresh storage, so you also need to delete (or commit) snapshots so that they don’t hang around forever.
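You can keep an eye on how much space snapshots are holding onto with:

```shell
zfs list -t snapshot -r ssd   # USED column shows space unique to each snapshot
zfs list -o space ssd/mail    # usage breakdown, including usedbysnapshots
```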
zfs destroy ssd/mail@snapshot_name
This will delete the snapshot and free any blocks that were uniquely referenced by it. Obviously, this is the sort of thing that you want to automate, and there are a number of tools out there. I’ve created a couple of little scripts that you can pop into cron jobs to help manage these tasks. simple-snap just creates a snapshot with a date/time stamp as the name. The following line in crontab will create a new snapshot every hour.
0 * * * * /root/scripts/simple-snap.ksh ssd/mail
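The core of a script like simple-snap is just building a date/time-stamped snapshot name. This is a hypothetical minimal sketch, not the published script:

```shell
# Hypothetical sketch of a simple-snap style script. make_snap_name
# builds a timestamped snapshot name for the filesystem passed as $1.
make_snap_name() {
    printf '%s@auto-%s\n' "$1" "$(date +%Y-%m-%d_%H.%M)"
}
# The real script would then run: zfs snapshot "$(make_snap_name "$fs")"
make_snap_name ssd/mail   # e.g. ssd/mail@auto-2013-01-15_14.00
```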
For cleaning them up, I have auto-snap-cleanup, which takes a filesystem and the number of snapshots to keep as arguments, so the following crontab line will delete all but the last 24 snapshots:
1 * * * * /root/scripts/auto-snap-cleanup.ksh ssd/mail 24
So you’ll have a day’s worth of “backups” to go back to.
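The pruning logic itself is simple enough to sketch. This hypothetical version (not the published script) reads snapshot names oldest-first and prints the ones beyond the keep count, which the real script would feed to zfs destroy:

```shell
# Hypothetical sketch of auto-snap-cleanup's retention logic. Reads
# snapshot names on stdin, oldest first, and prints all but the newest $1.
snaps_to_prune() {
    awk -v keep="$1" '{ lines[NR] = $0 }
        END { for (i = 1; i <= NR - keep; i++) print lines[i] }'
}
# Real usage would be something like:
# zfs list -H -t snapshot -o name -s creation -d 1 ssd/mail \
#     | snaps_to_prune 24 | xargs -n 1 zfs destroy
```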
For remote replication, go check out the scripts under the projects tab. If you have two physical machines, each with an SSD and an HD, a reasonable approach is to replicate the SSD filesystems to the HD of the other machine (assuming you have an IPsec tunnel between the two). Then if one machine goes offline, you simply set the replicated filesystems on the HD to read/write, mount them on the local ESXi host, register the VMs and start the machines. Obviously there will be a performance hit since you’ll be running from the hard drive, but the machines can be made available practically immediately, with at most an hour’s worth of lost data.
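The failover itself is only a handful of commands. Here's a hedged sketch; the pool name, addresses, paths and VM name are examples:

```shell
# On the surviving ZFS server: make the replica writable.
zfs set readonly=off hd/mail

# On the local ESXi host: mount the replica and register the VM.
esxcli storage nfs add --host=192.168.101.3 --share=/hd/mail \
    --volume-name=hd_mail
vim-cmd solo/registervm /vmfs/volumes/hd_mail/mailserver/mailserver.vmx
```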
You can also use a ZFS server as a Time Machine destination with Netatalk, which adds the AFP protocol to the server. Compiling Netatalk can be a picky process, so on OmniOS I tend to use the napp-it package, which automates all of this and provides a simple web interface for configuration, making it a similar system to FreeNAS (but not as pretty).