(updated) Distributed File System benchmark

Note: this is an update to my previous test

I'm investigating various distributed file systems (a term used loosely here to include SAN-like solutions) for use with Docker, Drupal, etc. I couldn't find recent benchmark stats for some popular solutions, so I figured I'd put a benchmark together.

Disclaimer: This is a simple benchmark test with no optimization or advanced configuration so the results should not be interpreted as authoritative.  Rather, it's a 'rough ballpark' product comparison to augment additional testing and review.

My Requirements:
  • No single-point-of-failure (masterless, multi-master, or automatic near-instantaneous master failover)
  • POSIX-compliant (user-land FUSE)
  • Open source (non-proprietary)
  • Production ready (version 1.0+, self-proclaimed, or widely recognized as production-grade)
  • New GA release within the past 12 months
  • Ubuntu-compatible and easy enough to set up via CloudFormation (for benchmark testing purposes)

Products Tested:
  • GlusterFS 3.7.6
  • LizardFS 3.9.4
  • CephFS (kernel and FUSE clients)
  • SXFS

AWS Test Instances:
  • Ubuntu 14.04 LTS paravirtual x86_64 (AMI)
  • m1.medium (1 vCPU, 3.75 GB memory, moderate network performance)
  • 410 GB hard drive (local instance storage)

Test Configuration:

Three master servers were used for each test of 2, 4, and 6 clients.  Each client generated a small amount of background disk activity (one file update and one file create per minute) via this cron job:

(crontab -l ; echo "* * * * * ( echo \$(date) >> /mnt/glusterfs/\$(hostname).txt && echo \$(date) > /mnt/glusterfs/\$(hostname)_\$(cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 25 | head -n 1).txt )") | sort - | uniq - | crontab -
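The cron one-liner above is dense, so here is a rough sketch of the same work as a standalone script. The function name background_write and the MOUNT variable are my own illustrative names, not part of the original setup. It does what the cron entry does each minute: appends a timestamp to a per-host file (update) and writes a timestamp to a new, randomly named file (create).

```shell
#!/bin/sh
# Hypothetical standalone form of the cron entry above.
# MOUNT defaults to the GlusterFS mount point used in this post.
MOUNT="${MOUNT:-/mnt/glusterfs}"

# background_write [DIR] : one round of background disk activity in DIR.
background_write() {
    dir="${1:-$MOUNT}"
    host="$(hostname)"
    # 25 random alphanumeric characters for a unique file name
    suffix="$(LC_ALL=C tr -dc 'a-zA-Z0-9' < /dev/urandom | head -c 25)"
    # Update: append a timestamp to the per-host file, then
    # create: write a timestamp to a brand-new file
    date >> "$dir/$host.txt" &&
        date > "$dir/${host}_${suffix}.txt"
}

# Run one round only when a directory is given on the command line.
if [ "$#" -gt 0 ]; then
    background_write "$1"
fi
```

Saved under an assumed name such as bgwrite.sh, it could be run from cron once a minute with the mount point as its argument.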

Results of the three tests were averaged.  Benchmark testing was performed with bonnie++ 1.97 and fio 2.1.3.
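The averaging step isn't shown here, so the following is only a sketch of one way to do it, assuming the figure of interest from each test has been collected as one CSV line per test. The function name average_column and the sample values are made up for illustration; they are not measurements from this benchmark.

```shell
#!/bin/sh
# average_column COLUMN < csv : mean of one comma-separated column.
# Lines where that field is not a plain number are skipped (for example
# the "+" placeholders bonnie++ prints for tests too quick to measure).
average_column() {
    awk -F, -v col="$1" '
        $col ~ /^[0-9]+(\.[0-9]+)?$/ { sum += $col; n++ }
        END { if (n) printf "%.2f\n", sum / n; else print "n/a" }
    '
}

# Illustrative values only: one line per test (2, 4, and 6 clients),
# with a throughput figure in column 2.
printf '%s\n' '2-clients,100' '4-clients,110' '6-clients,120' | average_column 2
# prints 110.00
```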
Example Run:
$ sudo su -
# apt-get update -y && apt-get install -y bonnie++ fio 
# screen 
# bonnie++ -d /mnt/glusterfs -u root -n 1:50m:1k:6 -m 'GlusterFS with 2 data nodes' -q | bon_csv2html >> /tmp/bonnie.html
# cd /tmp
# wget -O crystaldiskmark.fio http://www.winkey.jp/downloads/visit.php/fio-crystaldiskmark
# sed -i 's/directory=\/tmp\//directory=\/mnt\/glusterfs/' crystaldiskmark.fio
# sed -i 's/direct=1/direct=0/' crystaldiskmark.fio 
# fio crystaldiskmark.fio
Translation: "Log in as root, update the package lists, install bonnie++ and fio, and start a screen session (so a long run can outlive the SSH connection).  Run the bonnie++ benchmark tool in the GlusterFS-synchronized directory as the root user, using a test sample of 1,024 files ranging between 1 KB and 50 MB in size spread out across 6 sub-directories.  When finished, send the raw CSV result to the HTML converter and append the output to /tmp/bonnie.html.  Next, download the CrystalDiskMark fio script by WinKey, point it at the GlusterFS directory, switch it from direct to buffered I/O (direct=0), and run it with fio."
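The least obvious part of the bonnie++ command is the -n argument, which has the form files:max_size:min_size:directories, with the file count given in multiples of 1,024. As a small illustration, this hypothetical helper (describe_n is my own name, not a bonnie++ tool) splits such a spec and prints what it means:

```shell
#!/bin/sh
# describe_n SPEC : explain a bonnie++ "-n" spec of the form
# files:max_size:min_size:directories (files is in multiples of 1024).
describe_n() {
    old_ifs=$IFS
    IFS=:
    # Split the spec on ":" into the function's positional parameters
    set -- $1
    IFS=$old_ifs
    echo "$(( $1 * 1024 )) files, $3 to $2 each, in $4 directories"
}

describe_n 1:50m:1k:6
# prints: 1024 files, 1k to 50m each, in 6 directories
```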

Important Notes: 

1.  Only GlusterFS and LizardFS could complete the intense multi-day bonnie++ test.  The others failed with these errors:
  • CephFS (both kernel and fuse)
    • Can't write block.: Software caused connection abort
    • Can't write block 585215.
    • Can't sync file.
  • SXFS
    • Can't write data.
2.  GlusterFS and LizardFS had significant differences in bonnie++ latency which couldn't be shown on the graph without distorting the scale:

                   Seq Create (sec)   Rand Create (sec)
GlusterFS 3.7.6    173                164
LizardFS 3.9.4     3                  3

3.  GlusterFS took at least twice as long as LizardFS to complete the bonnie++ tests (literally 48 hours!).  Switching GlusterFS to xfs out of curiosity helped performance significantly (less than 24 hours); however, all reported tests were done with ext4 (the Ubuntu default).

4.  CephFS did not complete the "Rand-Read-4K-QD32" fio test.
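For readers unfamiliar with fio, a test name like "Rand-Read-4K-QD32" typically stands for random reads with a 4 KB block size at a queue depth of 32. The sketch below writes a minimal job file along those lines. Every value in it (ioengine, size, the write_job helper, the output path) is an assumption for illustration; the actual WinKey job file may differ.

```shell
#!/bin/sh
# write_job FILE DIR : write a minimal fio job resembling a
# "Rand-Read-4K-QD32" test. All values are assumptions for illustration,
# not a copy of the WinKey CrystalDiskMark job file.
write_job() {
    cat > "$1" <<EOF
[Rand-Read-4K-QD32]
ioengine=libaio
rw=randread
bs=4k
iodepth=32
direct=0
size=1g
directory=$2
EOF
}

write_job /tmp/randread-4k-qd32.fio /mnt/glusterfs
# Then run it with: fio /tmp/randread-4k-qd32.fio
```

One caveat when reading queue-depth results: the fio documentation notes that on Linux, buffered I/O (direct=0) through libaio is not truly asynchronous, so the effective queue depth may be lower than the requested 32.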

Results:

(Note: raw results can be found here)


Concluding Remarks:
  • Since GlusterFS and LizardFS were the only ones that could complete the more intense bonnie++ test, I would feel more confident recommending them as "production ready" for heavy, long-term loads.
  • Also (as mentioned above), LizardFS was much faster than GlusterFS (at the cost of higher CPU usage).
  • In terms of setup and configuration, GlusterFS was easiest, followed by LizardFS, then SXFS, and finally (in a distant last place) CephFS.
  • SXFS shows promise but they'll need to simplify their setup process (especially for non-interactive configuration) and resolve the bonnie++ failure.
  • My overall recommendation is currently GlusterFS.  (Update: I originally recommended LizardFS but have stopped, because metadata HA is not currently supported out of the box; see comments below.)


Mateus Mattos said...

Thank you for sharing your tests.

Mr. Blue Coat said...

You're welcome, Mateus!

274 said...

Greatly appreciate these

Rijnhard Hessel said...

thanks for these, pretty informative

Benjamin said...

Thanks very much for not only providing some nice, neutral results but sharing the exact methods needed to reproduce this for others.

Chris said...

Hello Mr. Blue Coat! Thank you for sharing these results! :)

You say that you use a "410 GB hard drive (local instance storage)" but the CloudFormation script for CephFS doesn't seem to have that configured. I'm sorry if this sounds noobish, but how do you set up such a thing in a CloudFormation script?

Thanks again! :)

Mr. Blue Coat said...

Hi Chris, the 410 GB hard drive is the default with my selected AWS instance (see https://gist.github.com/anonymous/f9e36edf2c341db4d8c3#file-cephfs-json-L54-L64). You don't have to configure additional drive space; it comes with the instance by default.

Shine and Go said...

Is it possible to share performance results comparing LizardFS and MooseFS?


Mr. Blue Coat said...

MooseFS is a commercial product and I don't have a license. Sorry.

Kiran Ranjane said...

MooseFS has two versions of the software: one is completely open source (GPLv2 license) and does not need a license, and the other is commercial and does. The only difference between them is that the commercial version comes with built-in HA support for the master server, while in the open source version you need to configure it yourself using corosync, ucarp, etc.

You could try comparing LizardFS with the open source version of MooseFS, since LizardFS is a fork of the open source version of MooseFS.

Link to open source Moosefs


Mr. Blue Coat said...

Hi Kiran, this guide is intended for busy and new sysadmins that simply want a single free HA product that meets their needs. My goal is not academic research-grade completeness by combining and configuring a suite of technical tools. Feel free to use my scripts above, though, and perform that additional study.

Danny Kulchinsky said...

Thank you for this work, this is very useful and informative! I've been considering LizardFS myself; however, after chatting with one of their reps today I found that both the Windows native client and the metadata HA mechanism are closed source and provided only as part of a commercial agreement. Both are important for my use case, as we have a mixed Windows/Linux environment and metadata HA is kind of a basic requirement :)

I do wonder, how did you set up HA for the metadata servers? From a brief review of the CloudFormation script I see that you have 3 masters: Master1 (Personality=master) and Master2+3 (Personality=shadow).

With such a configuration, fail-over is not automatic: if Master1 dies, one of the shadow masters needs to be promoted manually (as was explained to me today).

I guess something like corosync could be used to handle this, but I do wonder how you've managed this.


Mr. Blue Coat said...

Good point Danny! It seems from https://github.com/lizardfs/lizardfs/issues/266#issuecomment-93455849 and https://github.com/lizardfs/lizardfs/issues/299 and https://github.com/lizardfs/lizardfs/issues/326 that Metadata HA is not reliable or requires additional tooling. I'll update my recommendation to prefer GlusterFS. Thanks!
