The T5500 was all but useless without suitably fast shared storage. With its very limited 2 x 3.5″ bays, options were limited without purchasing more hardware, such as an eSATA enclosure or a SAS card. Call it a sports car with a tiny boot. The dilemma was that the only network storage available was my 13TB Unraid server (which I built back here, out of an old E5300 and 4GB of RAM), which was coming up on 3 years of solid service. Even when I assembled it, it was put together from leftover mixed hard disks and desktop throwaway components – there was no requirement to buy new gear for it, which is one of the outstanding features of Unraid, and one I love.
However, there were two big issues. Firstly, Unraid’s write performance is its Achilles heel. Since each write needs parity calculated from across all drives and then written, it’s remarkably slow, in the vicinity of 30-50MB/s if that. Reads are fine at 70-80MB/s (the speed of a single drive), but writes are not. There are stopgap measures, like a dedicated non-redundant cache drive which buffers writes and flushes them to the main array at intervals, but for shared VM storage where continual fast writes are required, it’s ineffective. ZIL it is not.
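To make the parity cost concrete, here is a toy sketch of single-parity XOR, the general scheme behind parity-protected arrays. It is not Unraid's actual code; the three one-byte "disks" and all variable names are made up for illustration. It shows that updating one disk means touching the parity as well, and that a lost disk can be rebuilt from the rest.

```shell
#!/bin/sh
# Toy single-parity example: three "data disks" holding one byte each.
# All values and names are illustrative assumptions, not Unraid internals.
d1=$((0x5A)); d2=$((0x3C)); d3=$((0xF0))

# Parity is the XOR of every data disk.
parity=$(( d1 ^ d2 ^ d3 ))

# Overwrite d2. The parity disk must be updated too: cancel out the old
# data and fold in the new data (read old data + old parity, write both).
new_d2=$((0x0F))
new_parity=$(( parity ^ d2 ^ new_d2 ))

# Recomputing parity from scratch across all disks gives the same value.
full_parity=$(( d1 ^ new_d2 ^ d3 ))

# If d1 dies, XOR of the survivors and the parity rebuilds it.
rebuilt_d1=$(( new_d2 ^ d3 ^ new_parity ))

printf 'old parity:   0x%02X\n' "$parity"
printf 'new parity:   0x%02X (full recompute: 0x%02X)\n' "$new_parity" "$full_parity"
printf 'rebuilt d1:   0x%02X (original: 0x%02X)\n' "$rebuilt_d1" "$d1"
```

Every small write turns into extra reads and writes on the parity disk, which is why a single write stream ends up slower than a bare drive.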
Secondly, ESXi VMs usually store their virtual disks in large single VMDK files, often thin provisioned, and continual changes to the file contents would bypass the cache drive. I’ve tried hosting VMs from Unraid before and it works in a pinch, it just works very poorly. Migrating storage on more than 1 or 2 machines at a time results in gridlock.
The good news was that I had 2 x 750GB and 1 x 1TB drives spare. They had each clocked more than 35,000 hours of service, and I had little use for them anymore. Nor did I think I could get any more life from them. Until now.
I pondered on this conundrum.
I settled on a compromise. Unraid had to keep running, because of the drive size mish-mash in the existing array, and migrating the data was not an option – there was just no space anywhere else to move it to. Yet I could still create a reasonably well-performing second shared storage array from the 3 leftover drives with a bit of fiddling.
The franken-RAID is back, but in a different body. The topmost 3 drives are the Ubuntu RAID5 array.
That way, there would be two storage arrays: one 13TB array that was cost- and space-efficient, but slow, and one 1.5TB array that was much faster, but more volatile (though still with redundancy).
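The 1.5TB figure follows from how RAID5 sizes itself: one member's worth of space goes to parity, and every member is treated as being as small as the smallest one. A quick sketch of that arithmetic (the function name is my own, sizes are in GB):

```shell
#!/bin/sh
# Usable RAID5 capacity = (number of members - 1) x smallest member.
# raid5_usable_gb is a hypothetical helper name; arguments are sizes in GB.
raid5_usable_gb() {
    count=0
    smallest=
    for size in "$@"; do
        count=$((count + 1))
        if [ -z "$smallest" ] || [ "$size" -lt "$smallest" ]; then
            smallest=$size
        fi
    done
    echo $(( (count - 1) * smallest ))
}

# The three spare drives: 2 x 750GB and 1 x 1TB.
raid5_usable_gb 750 750 1000
```

The extra 250GB on the 1TB drive simply goes unused by the array.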
Here’s what I did:
* Step 1) De-commission the E5300 machine. All 6 drives which make up the Unraid array are installed in the i5-machine (Bonus: it has 6 SATA slots instead of 4, but the port multiplier was still a requirement for later).
* Step 2) Set up USB passthrough for the Unraid VM for licensing (the license is tied to the stick’s UID). ESXi VMs can’t boot from USB directly, so Plop boot (plpbt) was required. Plpbt waits for user input at the boot screen, so plpbtcfg was needed to modify the boot ISO. All working and dandy so far.
* Step 3) Install the 3 spare HDDs mentioned above in the machine and pass them through to a VM as physical raw device mappings, using the vmkfstools -z switch. I intentionally avoided converting to VMFS, to allow for easier rebuilding on other machines if required. Not to mention squeezing out that last bit of performance, snapshots be damned.
* Step 4) Create a 2-core, 2GB RAM Ubuntu 13.10 VM for these drives. BSD/Solaris + ZFS was an option, but I had other plans for this VM. If you miss ZIL and L2ARC on an SSD, there’s always bcache, dm-cache and flashcache. RAID5 the 3 drives together with mdadm and format as ext4. NFS and SMB shares are set up, along with iSCSI targets.
* Step 5) Tuning and Benching
– dev.raid.speed_limit_min/max for RAID rebuild speed
– /sys/block/mdX/md/stripe_cache_size set to 8192
– blockdev --setra 8192 /dev/mdX
Observations – RAID5 sequential performance is poorer than RAID0, as expected, but random reads and writes, where it matters, are very good. I had issues with NAS4Free’s results, but didn’t have time to investigate further. The RAID5 array performs nearly as well as the RAID0 array when tested under Ubuntu: identical read speeds, but marginally slower write speeds, with reasonable CPU overhead. Mdadm is reliable and polished.
* Step 6) Add a 120GB SSD as a standalone device. Allocate 40GB as a cache drive for the Unraid array to improve write performance. Allocate 16GB to the local host for swapfile caching if physical memory runs low (this will come into play later). Allocate the rest to system VMs which are time-critical in the event of a reboot, like the DHCP server, domain controller and vCenter server. Everything else moves onto the RAID5 array for leisurely starts.
* Step 7) Set up cron to automate live backups of all VMs from both ESXi servers to the Unraid server (which itself is hosted on one of the ESXi servers). It’s not Inception level, but it can be confusing.
* Step 8) Add Intel I217V PCI-E NIC to ESXi server for load-balancing and failover. Bit of extra bandwidth never hurt anybody.
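For anyone wanting to retrace steps 3 to 5, here is a rough dry-run sketch that only prints the commands instead of running them. The device names, disk identifier, datastore path, and rebuild speed values are all placeholders I made up; only the tools and the 8192 settings come from the steps above. Check every name against your own system before running anything for real, since mdadm --create is destructive.

```shell
#!/bin/sh
# Dry-run sketch of steps 3-5: every command is printed, not executed.
# All names below are placeholder assumptions, not the original values.
MD_DEV="/dev/md0"                        # assumed array name (mdX in the text)
MEMBERS="/dev/sdb /dev/sdc /dev/sdd"     # assumed names of the 3 disks inside the VM
RDM_DIR="/vmfs/volumes/datastore1/rdm"   # hypothetical datastore folder on the host

# Step 3 (ESXi host): one physical-mode RDM pointer file per disk.
# $1 = disk identifier under /vmfs/devices/disks, $2 = pointer file name
plan_rdm() {
    echo "vmkfstools -z /vmfs/devices/disks/$1 $RDM_DIR/$2.vmdk"
}

# Step 4 (Ubuntu VM): build the 3-disk RAID5 array and format it as ext4.
plan_array() {
    echo "mdadm --create $MD_DEV --level=5 --raid-devices=3 $MEMBERS"
    echo "mkfs.ext4 $MD_DEV"
}

# Step 5 (Ubuntu VM): rebuild speed limits, stripe cache and read-ahead.
# $1 / $2 = min / max rebuild speed in KB/s (illustrative values only)
plan_tuning() {
    echo "sysctl -w dev.raid.speed_limit_min=$1"
    echo "sysctl -w dev.raid.speed_limit_max=$2"
    echo "echo 8192 > /sys/block/${MD_DEV#/dev/}/md/stripe_cache_size"
    echo "blockdev --setra 8192 $MD_DEV"   # read-ahead in 512-byte sectors
}

plan_rdm "DISK_ID_1" "raid5-disk1"
plan_array
plan_tuning 50000 200000
```

One thing to keep in mind with a 2GB VM: the stripe cache costs page size x members x entries, so 8192 entries across 3 disks with 4KiB pages pins roughly 96MiB of RAM.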
Even without jumbo frames, network performance wasn’t too bad.
Performance of the RAID5 array so far has been quite adequate – the local VM’s write cache, which is sized at around 1.5GB, fills up quickly, but dirty writes never exceed about 30-40MB on heavy transfers. Ioping also reveals that storage latency to the array doesn’t exceed 30-40ms under heavy use.
The T5500 runs VMs nicely off the RAID5 array. Unraid still functions exactly as it did, except now under ESXi, and overall performance has greatly improved. In the far future when some of the drives start dying, I plan to set up a SAS enclosure on the T5500, and migrate all the working large drives across to create a compressed and de-duplicated array. But for now, success!
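If you want to keep an eye on the same numbers, something along these lines should do. It assumes "dirty writes" corresponds to the Dirty line in /proc/meminfo, and the mount point in the ioping comment is a made-up example.

```shell
#!/bin/sh
# Report dirty (not yet written back) page cache in MB.
# dirty_mb is a hypothetical helper; it reads meminfo-style text on stdin.
dirty_mb() {
    awk '$1 == "Dirty:" { printf "%d\n", $2 / 1024 }'
}

# On a Linux VM, read the live value.
if [ -r /proc/meminfo ]; then
    echo "Dirty pages: $(dirty_mb < /proc/meminfo) MB"
fi

# Latency check, assuming ioping is installed and the array is mounted
# at /mnt/raid5 (hypothetical path). Run it during a heavy transfer:
#   ioping -c 10 /mnt/raid5
```

Running the dirty page check in a loop during a big copy gives a feel for how much data is waiting to hit the disks at any moment.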
SSDs have no complaints about being stuck on strange surfaces upside-down.