Tuesday
Apr182017

Unsung feature of the Airpods

There’s one thing that seems to be consistently missing from all of the reviews of modern headphones is the quality of the microphones. Lots of time and talk about the quality of the headphones for listening to music and podcasts and so on, but rarely do I see tests of the microphones.

I am a happy user of Bose QC headphones for the noise cancelling features, and in this space Bose still reigns over all of the others I’ve tried. But the microphones are just awful. They pick up everything in the local environment and make it pretty much useless for actually talking to anyone. And if there’s even a tiny bit of wind, just forget about it. 

This is one spot where the AirPods stand out from the crowd as they seem to have integrated some noise cancelling on the microphones rather than the headphone portion. I’ve done tests with people who literally can’t hear me while I’m sitting at an outdoor café when using the Bose QC35, but switching to the AirPods makes it seem like I’m in a relatively quiet office setting to people on the other end.

So a shout out to all reviewers out there, please include a section to qualify the microphones since we do sometime actually use these things to talk to people…

In the meantime, I’m stuck swapping from the Bose to the AirPods depending on whether I’m listening to something to talking to someone.

Tuesday
Mar212017

FreeNAS Corral Notes & First Impressions

Well, it’s finally here - a complete rewrite of the FreeNAS middleware and UI with the latest version of FreeBSD bring things up to modern standards with nice things like USB 3 support for those of us building smaller home systems.

Lots of new goodies that I haven’t yet had the time to do a deep dive into like the addition of a proper hypervisor so that you can run virtual machines directly on a FreeNAS box as well as Docker support to replace the older BSD jails method of packaging applications without the overhead of a full virtual machine.

Virtual Machines vs Docker

For the moment, the virtual machine option isn’t going to be terribly useful to me since generally I’ll be using VMware ESXi for my VMs and relying on FreeNAS to provide the underlying storage, whether via NFS or iSCSI (but almost always on NFS). I can see that this could be useful for a lot of people that would like to be able to take advantage of Linux applications and stuff that don’t have current equivalents in the BSD world.

The Docker side of the house is considerably more interesting to me since this may be a quick shortcut to having a Docker-enabled system in the home lab without having to build it out myself. That said, I have to dig in a little deeper to see how to add additional registries and stuff since it currently points to a restricted list from the FreeNAS registry. But definitely something to look into.

Odd bits

Browsers

The first thing that I ran into as an issue is the fact that for some reason, the UI doesn’t seem to behave 100% when using Safari. The quick example is that things like activating services like NFS via Safari doesn’t work. Click the button to enable it and then save doesn’t work. However, switching to Chrome fixes this directly and the same actions result in the service being correctly started. I’m a little disappointed that Safari isn’t a first class client browser, but at least there are reasonable alternatives.

Multi-disk USB

This would seem to be more of an underlying BSD issue but I have a bunch of multi-disk USB 3 boxes that I like to use for quick testing and for big data movements where I can take advantage of the ZFS mirroring with checksums and read load balancing across disks. However there is an odd situation that you might run into when using these kinds of boxes or it may be related to the way that disk serial numbers are interpreted in BSD.

My first test bed is an Intel NUC attached to an Akitio NT2 with two 4 Tb Seagate 2.5” disks. When I go to create a volume (zpool) and look at the disks that are available, it only shows one disk with the curious name of mpath0, rather than the expected BSD naming of da0 and da1 for the two disks. Digging in on the command line, I found that da0 and da1 are both present in /dev, but for some reason I can’t use them to create a zpool, even from the command line. So the hint here is the name mpath0 in the name of the disk that I do see so I’m thinking that there’s some multipath code running that would map the presence of the same disk twice via two different communications paths (either remote iSCSI volumes or dual-ported SAS disks).

Looking at the contents of dmesg, it becomes clear that the gmultipath has determined that my disks da0 and da1 are in fact the same disk! Which is understandeable when you dig into what the box is reporting back to the BSD APIs

[root@freenas] ~# camcontrol devlist
<TS32GMTS400 N1126KB>              at scbus0 target 0 lun 0 (ada0,pass0)
<inXtron, NT2 U3.1 0>              at scbus2 target 0 lun 0 (pass1,da0)
<inXtron, NT2 U3.1 0>              at scbus2 target 0 lun 1 (pass2,da1)
[root@freenas] ~# camcontrol inquiry da0 -S
160318E6006C
[root@freenas] ~# camcontrol inquiry da1 -S
160318E6006C

So from the BSD perspective, these two disk are the same ones.

So if you are not using hardware that is explicitly designed for multipathing on the back-end, you’ll probably want to disable the multipath kernel module. I’ve tried a number of approaches without success, so I went with the brutal method of simply moving the module out of the /boot/kernel directory:

mv /boot/kernel/geommultipath.ko \~/

After this command and a reboot, it goes back to the old school da0 and da1 disks that are seen as two different devices.

I suspect that this module will probably come back when you run an update so keep an eye open.

And in general be warned that the BSD USB hotplug feature is a little hit and miss. You’re much better off having any USB storage devices you want online connected at boot time in order to properly have them seen by the OS. And despite this, for some reason, the webUI will not always show you all available disks for creating your zpool. But all is not lost - you can create the zpool at the command line and have it recognized by the UI. In fact the websockets part of the UI is startlingly fast in that as soon as I created the pool at the command line, it popped up in real time in the webUI.

Keyboard layout

One thing that still hasn’t been addressed in the installation process is the ability to select the local keyboard type before entering the new root password. You can change the console keyboard layout in the installation wizard post-install, but it should be an option available during the initial install so that a complex password can be used.

UI

The new UI is a lot prettier than the last version and a little clearer without the confusion between the duplicate names in the top bar and the hierarchical menu on the left. I really like the way that alerts are displayed in the top corner and actions are show in the right column with progress and status, along with some more details when you click on them. I’d like to see something that gives more details on task failures (log information etc.).

I like the fact that I can select a few widgets that will live on the dashboard (IOPS, ARC hits, network bandwidth etc.) so that I have that information immediately upon connecting to the server.

UPS

The UPS integration seems to have gotten a serious overhaul and exposed in the new UI which makes me happy with an exceedingly complete list of UPS models that can be used, including the EATON models that I use on my home lab. Now to find out just how far I can push things with the possibility of scripts to shut down other devices on the network.

Friday
Aug122016

Thunderbolt: a fast and cheap SAN

The lack up uptake by major vendors of Thunderbolt as a transport for storage systems has baffled me for a long time. For large systems there are a number of hard limitations that make it less than ideal, but in the smaller SAN/NAS space it seems like it would be a perfect replacement for SAS connected disk trays. 

This may be changing now that we are seeing some cluster interconnects coming on the market oriented towards leveraging the fact that Thunderbolt is basically a PCIe bus extension and we’re starting to see PCIe switches coming on the market from players like IDTAvago and Broadcom.

But getting back to practical applications, I started having a need for a portable SAN/NAS box to help with some client projects involving datacenter migrations or storage migrations where they needed a little extra swing space. So with the current state of the art there are quite a few Thunderbolt based storage systems that are well adapted to what I had in mind. But the first issue I ran into is that while Thunderbolt 3 on USB-C connectors is starting to appear in newer laptops, NUCs and MicroPCs, they are almost always single port setups and I wanted to be able to dedicate one port to storage and another port to the network interconnect. Which led me back to what appears to be the only small form factor dual Thunderbolt equipped machine on the market: the Mac Mini. Now this means that I’m “stuck” with Thunderbolt 2 over the Mini DisplayPort connections, but hey it’s only 20 Gbps so for a mobile NAS, this ought to be OK.

What pushed this over the line from something on my list of things I should do some day was the discovery of a relatively new line of disk bays from Akitio that have four 2.5” slots that will accept the thicker high density drives and are daisy chainable with two Thunderbolt 2 ports.

So with this in mind, my bill of materials looks like this:

And I completed this with a set of four 1 Tb drives that I had kicking around.

With all of this in hand, I started off with FreeNAS, but for some reason I couldn’t get it to install with a native EFI boot on the Mac Mini, so I ended up tweaking the configuration using the rEFInd Boot Manager to get FreeNAS running.

The basic install and configuration worked just fine, but for some reason I could never track down, the system would freeze once I started putting some load on it, whether local stress tests or copying data from another system. So about this time, I noticed that the latest Ubuntu 16 distribution includes ZFS natively and went to give that a spin.

In the first place, the Ubuntu installs natively and boots via EFI without a hitch so that simplified the setup a little bit. Then just a matter of installing the zfs utilities (apt-get install zfsutils-linux) and setting up an SSD pool and a couple of disk pools.

On my local network I am sadly behind the times and am still living in GbE-land, but my initial load tests of transferring some ZFS file systems (4 Tb of data) using zfs send/recv over netcat worked flawlessly and saturated the GbE link using either the built-in network port or the Sonnet.

Physical assembly

Similarly to the Mobile Lab I just built, I wanted to simplify the power cabling and management and limit the box to a single power plug. I did look at including a power bar and using the included adaptors, but that actually takes a lot of space and adds a significant amount of weight. Happily the Akitio boxes take 12V in, so it was just a matter of soldering some connectors onto the 12V lines out of the PSU and running one direct line from the plug over to the Mac Mini.

Then it was off to my metalworking shop to build a case to hold all of this which resulted in the following design:

 

Real life

I’ve got a project going where I’m working on a team that is consolidating a number of data centers where we deploy a local staging area equipped with 2 ESXi servers and a 24Tb ZFS based NAS. So from there I need to move the data to another data center and we’re leveraging the ability of ZFS to sync across different systems using the snapshots (as discussed here in my auto-replicate scripts). 

Given the volume of data involved, I do the initial replication manually using netcat instead of the script that uses ssh since ssh is CPU bound on a single core which limits the potential throughput. Using this method I was getting sustained network throughput of 500MB/sec. Yes that’s Megabytes, not bits. Peaks were hitting 750MB/sec. All of this through a Mac Mini…

Mobile NAS next to its big brother:

Miscellany

I’m usually trying to design systems to be as quiet as possible, and while there are fans on the Akitio boxes, they are low RPM and make hardly any noise. Using the included power adapters it’s actually very very quiet. In the final mobile configuration, the only thing that makes any noise is the fan on the PSU. So this setup could very well be used as a home NAS without having to hide it in the garage. If I were used this as a design spec for a home NAS, I’d probably start with an Intel NUC or Gigabyte BRIX with Thunderbolt3 using FreeNAS though for the simplicity of management and easy access to protocols other than NFS.

While it’s certainly easiest to do all of this over Ethernet, I can also extend the setup to be able to handle Fibre Channel with something like the Promise SANLink2 and the Linux FC Target software.

Thunderbolt 2 on the Mac Mini is extensible to up to 6 external devices on each chain so I could theoretically add two more Akitio boxes on the storage chain and another five more if I wanted to share the Thunderbolt connection I’m using for the network.

Friday
Aug122016

In a pinch...

A fun story for the sysadmins out there on an ugly situation that got fixed relatively easily. Recently, I ran into a situation in a client datacenter where they were running a FreeNAS system where the USB key with the OS had died. All of the important file services, notably the NFS service to a couple of ESXi servers were still running, but anything that touched the OS was dead. So no web console, and no SSH connections.

In my usual carry bag, I have my MacBook Pro, Lightning to GbE adaptor and a Samsung 1Tb T2 USB 3 flash drive, formatted in ZFS. And of course some spare USB keys.

So first up, using VMware Fusion, I installed the latest version of FreeNAS on a spare key in case the original was a complete loss. How to do this? Well, you can’t boot a BIOS based VM off a USB key, but you can boot from an ISO and then connect the USB key as a destination for the install. So now I have something to run the server on later.

Then the question is, how to swap this out without taking down the production machines that are running on the ESXi servers? For this I created a new Ubuntu VM and installed ZFS on Linux plus the NFS kernel server. Now that I have an environment that has native USB 3 and automatic NFS publishing from the ZFS attribute “sharenfs” I connected the Samsung T2 to the VM and imported the zpool. I couldn’t use FreeNAS in this case since it’s support for USB 3 is not great.

Then there was a quick space calculation to see if I could squeeze the running production machines into the free space. I had to blow away some temporary test machines and some older iso images to be sure I was OK. Then create a new file system with the ever so simple “zfs create t2ssd/panic” followed by “zfs set sharenfs=on” and open up all of the rights on the new filesystem. Oh, and of course, “zfs set compression=lz4” wasn’t necessary since was already on by default on the pool.

Then it was just a matter of mounting the NFS share on the ESXi servers and launching a pile of svMotion operations to move them to the VM on my portable computer on a USB Drive. Despite the complete non enterprisey nature of this kludge, I was completely saturating the GbE link (the production system runs on 10GbE - thank god for 10GBase-T and ethernet backwards compatibility).

Copying took a while, but after a few hours I had all of the running production machines transferred over and running happily on a VM on my portable computer on a USB Drive.

Then it was just a matter of rebooting the server off of the new USB key, importing the pool and setting up the appropriate IP addresses and sharing out the volumes. Once the shares came back online, they were immediately visible to the ESXi servers.

Then I left the MacBook in the rack overnight while the svMotion operations copied all the VMs back to the points of origin.

Best part: nobody noticed.

Monday
Jun272016

An interesting dual-site ScaleIO Configuration (probably unsupported)

ScaleIO is a member of the new class of scale-out storage systems that permits you to scale-out your storage by adding additional nodes either in a hyperconverged configuration with VMs installed in your hypervisors or as a bare-metal storage cluster.

I have been a fan of this type of architecture since it gets rid of many of the limitations of the traditional scale-up SANs and offers (potentially) a new degree of portability and finally the end of the fork-lift upgrade cycle.

However, with the latest version of ScaleIO there are some odd design choices that can be problematic in the smaller and mid-sized environments. Specifically, it is now enforcing the minimum of three fault sets (should you decide to use them). The concept of the fault set is a group of nodes that are more likely to fail as a group due to some common dependency, generally power to a rack. For data protection reasons, whenever a block is written, a second copy block is written to another node in the cluster. Adding fault sets to the mix forces this second block to go to a node outside the fault set where the original block was written to ensure availability.

The problem with ScaleIO’s new enforcement of the three fault set model is that this means you can no longer easily build out a dual room configuration for availability which is pretty much the design of most highly available configuration in small and medium sized configurations (and even in quite a number of large ones). With this limitation in mind and knowing a bit about the way the data paths and metadata are placed in ScaleIO I decided to see if this really was a hard limitation or if there was a way to work around it to build a more traditional dual-site configuration with the 2.0 release.

Cluster configuration

In order to ensure a minimum level of viability when one site is offline, I set up a test bed with a cluster of two fault sets of three nodes each. The nodes used here all have three 100 Gb disks (yes, these are virtual machines). There is also a third fault set configured with a single node with the minimum of 100 Gb of storage assigned to it.

There is a shared L2 network across the entire cluster for storage services so this would be similar to having a stretched VLAN across two rooms.

On the MDM side of things, I used the 5 node cluster configuration with the primary MDM in one fault set and the standby in the second fault set.

These are attached to a three node vSphere cluster to general load and test connectivity with a half-dozen Linux VMs.

Operations

Once all of the ScaleIO nodes are online, I can use the CLI or the vSphere plugin to create and map volumes from the cluster to the SDCs on the ESXi hosts. Here there is no problem. There is an alert in the ScaleIO reporting that the fault sets are not balanced, but this simply has the result that the data distribution is not equal by volume across the fault sets, but simply by percentage used. Otherwise, the cluster is fully operational. At this stage I have all of the VMs running nicely and am running bonnie++ to generate a read and write load across the cluster.

At this point I take the single node of the third fault set offline politely using the delete_service.sh command in /opt/emc/scaleio/sds/bin.

This has the expected result of activating a rebuild operation to properly protect the blocks that were stored on the 100 Gb of the third fault set. Since there is a relatively small amount of data involved, this goes fairly quickly.

At this point, the storage is still available and operational to the SDCs and everything is running. However there is one limitation at this point: I cannot modify the structure of the cluster without the third fault set online. That’s to say I can’t create or delete volumes to present to the SDCs. In a steady state operation this is not a big deal since I don’t modify the volumes on a daily basis.

Once the rebalance has finished, I have my desired state: a dual-site setup with data being written across the two fault sets that are online. Now for the “disaster” test. Here I brutally poweroff all three of the nodes in one of the remaining fault sets and observe the results. At this stage, the result is that the storage is still available to the SDCs and the VMs are still running and generating read/write traffic. So we have a reasonable DR test for a single site failure.

Now for the fail-back: I bring the nodes in the failed fault set back online and the expected rebuild operation kicks off, reestablishing the two fault-set cluster with blocks distributed across the two fault sets.

Resume

ScaleIO is an impressively robust and resilient system that allows for things that the designers probably didn’t have in mind. That said, a simple dual-room setup based on two fault sets with a minimum number of nodes per fault set should be part of the standard configuration options given the ubiquity of this type of configuration and to put them on level competitive ground with all of the dual-site HA offerings available from HP, Huawei, Datacore, etc.

And to finish, I would also recommend separating the MDM roles from the SDS on completely different systems, perhaps in VMs pinned to local storage on site for a clear separation of responsibility. For those getting started with ScaleIO the fact that the two roles can cohabit the same servers can lead some confusion when you’re just getting started and not clear on the dependencies.