Distributed File System benchmark

Update: see my updated post at http://mrbluecoat.blogspot.com/2016/02/updated-distributed-file-system.html


I'm investigating various distributed file systems (a term used loosely here to include SAN-like solutions) for use with Docker, Drupal, etc. I couldn't find recent benchmark stats for some popular solutions, so I figured I'd put a benchmark together.

Disclaimer: This is a simple benchmark test with no optimization or advanced configuration so the results should not be interpreted as authoritative.  Rather, it's a 'rough ballpark' product comparison to augment additional testing and review.


My Requirements:
  • No single-point-of-failure (masterless, multi-master, or automatic near-instantaneous master failover)
  • POSIX-compliant (user-land FUSE)
  • Open source (non-proprietary)
  • Production ready (version 1.0+, self-proclaimed, or widely recognized as production-grade)
  • New GA release within the past 12 months
  • *.deb package and easy enough to set up via CloudFormation (for benchmark testing purposes)

Products Tested:
Others:

AWS Test Instances:
  • Debian 7 (wheezy) paravirtual x86_64 (AMI)
  • m1.medium (1 vCPU, 3.75 GB memory, moderate network performance)
  • 410 GB hard drive (local instance storage)

Test Configuration:

Two master servers were used for each test of 2, 10, and 18 clients.  Results of the three tests were averaged.  Benchmark testing was performed with bonnie++ 1.96 and fio 2.0.8.
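The averaging step can be sketched as follows. This is a minimal illustration and not the script used for this post: the `average` helper is hypothetical, and the sample values are placeholders rather than measured results.

```shell
#!/bin/sh
# Average one metric across the three runs (2, 10, and 18 clients).
# The values passed in below are placeholders, NOT measured results.
average() {
    # Print the arithmetic mean of all arguments with two decimal places.
    printf '%s\n' "$@" | awk '{ sum += $1 } END { if (NR > 0) printf "%.2f\n", sum / NR }'
}

average 100 200 300
```

Each argument stands for the same metric (for example, a throughput figure) taken from one of the three runs.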
Example Run:
$ sudo su -
# apt-get update -y && apt-get install -y bonnie++ fio 
# screen
# bonnie++ -d /mnt/glusterfs -u root -n 4:50m:1k:6 -m 'GlusterFS with 2 data nodes' -q | bon_csv2html >> /tmp/bonnie.html
# cd /tmp
# wget -O crystaldiskmark.fio http://www.winkey.jp/downloads/visit.php/fio-crystaldiskmark
# sed -i 's/directory=\/tmp\//directory=\/mnt\/glusterfs/' crystaldiskmark.fio
# sed -i 's/direct=1/direct=0/' crystaldiskmark.fio 
# fio crystaldiskmark.fio
Translation: "Log in as root, refresh the package lists, install bonnie++ and fio, then run the bonnie++ benchmark tool in the GlusterFS-synchronized directory as the root user using a test sample of 4,096 files (4*1024) ranging between 1 KB and 50 MB in size spread out across 6 sub-directories. When finished, send the raw CSV result to the HTML converter and append the output to /tmp/bonnie.html. Next, download the CrystalDiskMark job file by WinKey referenced here, point its working directory at the GlusterFS mount, switch it from direct I/O to buffered I/O (direct=0), and run it with the fio benchmark tool."
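To make the two sed edits concrete, here is a small sketch that applies the same substitutions to a stand-in job file read from standard input. The `retarget_job` helper and the two-line sample input are hypothetical; the real crystaldiskmark.fio contains more settings than shown here.

```shell
#!/bin/sh
# Apply the same two substitutions as the sed commands in the example run.
# $1 is the mount point of the file system under test.
retarget_job() {
    # First expression: point fio at the mount point instead of /tmp/.
    # Second expression: switch from direct (unbuffered) I/O to buffered I/O.
    sed -e "s|directory=/tmp/|directory=$1|" -e 's|direct=1|direct=0|'
}

# Hypothetical two-line excerpt of a job file, piped through the helper.
printf 'directory=/tmp/\ndirect=1\n' | retarget_job /mnt/glusterfs
```

Reading from standard input instead of editing in place with `sed -i` leaves the downloaded file untouched, which makes it easier to reuse the same job file for each product under test.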

Results:

[Benchmark result images]

(Note: raw results can be found here)

_______________________________________________________


Concluding Remarks:

Both GlusterFS and LizardFS had strong showings with pros and cons for each. Both should work fine for production use. While not an endorsement, I will mention that GlusterFS had more consistent results (fewer spikes and outliers) across the tests, and I also like the fact that GlusterFS doesn't distinguish between master servers (master-master peers versus LizardFS' master-shadow[slave] configuration).

Update: GlusterFS requires your number of bricks to be a multiple of the replica count. This adds complexity to your scaling solution. For example, if you want two copies of each file kept in the cluster you must add/remove bricks in multiples of two. Similarly, if you want three copies of each file kept in the cluster you must add/remove bricks in multiples of three. And so on. Since they recommend one brick per server as a best practice, this will also likely add cost to your scaling solution. For this reason, I now prefer LizardFS over GlusterFS since it does not impose that limitation.
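The multiple-of-replica-count rule is simple arithmetic, sketched below. Both helper functions are hypothetical illustrations and are not part of any GlusterFS tooling.

```shell
#!/bin/sh
# Succeed if a brick count is usable with the given replica count.
valid_brick_count() {
    bricks=$1; replica=$2
    [ "$bricks" -gt 0 ] && [ $((bricks % replica)) -eq 0 ]
}

# Round a desired brick count up to the next multiple of the replica count.
next_brick_count() {
    bricks=$1; replica=$2
    echo $(( (bricks + replica - 1) / replica * replica ))
}

next_brick_count 5 2   # prints 6
next_brick_count 7 3   # prints 9
```

With one brick per server, growing a replica-3 cluster from 6 to 7 servers is therefore not an option; the next valid size is 9.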


P.S. Check out this related article by Luis Elizondo for further reading on Docker and distributed file systems.


30 comments:

Bernd Schubert said...

Maybe you should check release dates before claiming there have been no updates for some time?

Mr. Blue Coat said...

Hi Bernd, I tried to find release dates for each product before making the claims above but I'm not perfect (and software updates happen all the time) so please feel free to suggest corrections. Thanks!

Bernd Schubert said...

I'm biased here, as it is my previous work, but there are quite frequent updates to BeeGFS anyway, e.g. Changes in 2014.01-r14 (release date: 2015-03-23)

Problem seems to be as ever the version scheme - the date and the month specify the major update date (so far network protocol incompatible updates), but minor releases (which also can be full of new features) are given by -rXY

Bernd Schubert said...

Sorry, I mean the "Year and the month"...

Mr. Blue Coat said...

Oh, I see. I'll update the entry for BeeGFS to remind me to check the Changelog date next time. Thanks again!

Mr. Blue Coat said...

Unfortunately, BeeGFS is not fully open source: http://www.beegfs.com/wiki/FAQ#open_source

Bernd Schubert said...

Yes, sorry, that is unfortunately true.

oxide94 said...

LizardFS is a fork of an *old* version of MooseFS (1.6). LizardFS 2.5.x and 2.6.x = MooseFS 1.6.x (e.g. in performance).

There is a *huge* difference in performance between MooseFS 2.0 and 1.6. Also, a lot of algorithms (e.g. the rebalance algorithm) have been improved since 1.6.

MooseFS is not dead software. It is updated more frequently than LizardFS, so your sentence "MooseFS was forked by LizardFS (tested above)" is unfair to MFS, because it suggests that MFS is not being developed anymore and that LizardFS has replaced it, which is not true.

You can find a lot of info about the new version, download the source code, etc. on https://moosefs.com (not .org).

Mr. Blue Coat said...

Hi oxide94, thanks for the update! Unfortunately, MooseFS 2.x or 3.x will not work for my needs because I need an open source product *with* high availability. I've updated my MooseFS reasoning above.

oxide94 said...

Ok, I see ;)

Jane Logan said...

There have been a lot of GfarmFS releases since the Apr-13 date you mentioned above, in fact as recently as Aug-15:
http://sourceforge.net/projects/gfarm/files/?source=navbar

Mr. Blue Coat said...

Hi Jane, does gfarm2fs (the fuse client) have a *.deb package?

Mr. Blue Coat said...

https://packages.debian.org/source/wheezy/gfarm looks pretty outdated

Jane Logan said...

Try this:
https://answers.launchpad.net/debian/+source/gfarm/+changelog

Mr. Blue Coat said...

Hi Jane, I'll do a follow-up test soon using Ubuntu 15.10

Mr. Blue Coat said...

Sorry, Jane, but after wrestling with Gfarm for a couple days I gave up because the documentation is awful, there are missing critical files, and there's too much manual process (copy shared secret to hostb, copy gfarm2.conf to hostb, run the following command on hosta, etc.) to automate via CloudFormation. If you have a working CloudFormation template with master-slave metadata synchronization I'll include it in future tests.

Mateus Mattos said...

Hello, could you please update the tests?
https://lizardfs.com/release-of-lizardfs-3-9-2/
https://lizardfs.com/download/

Mr. Blue Coat said...

Hi Mateus, thanks for the heads up! LizardFS jumped from 2.6.0 to 3.9.2? Weird. Either way, I'll carve out some time soon to update the tests since technically LizardFS has moved from 2.x to 3.x

Mr. Blue Coat said...

Hi Mateus, looks like 2.6.0 is the latest version in the Ubuntu repositories: http://packages.ubuntu.com/search?keywords=lizardfs&searchon=names&suite=wily&section=all

Let me know when that changes and I'll re-run the tests.

Doophy said...

Would be nice to see a comparison to GPFS or Lustre. Despite the reasons you mention for not including them, they are probably the two most widely-used enterprise clustered file systems. Performance numbers from one or both of these would provide a good baseline for comparison, since most storage professionals are pretty familiar with the two products. Also, any notes you have on documentation (like you have in the comments for Gfarm) and support experiences you've had are valuable. While GPFS is proprietary, it's very stable, has excellent documentation, scales easily, and offers solid support avenues (phone (paid) and the GPFS forum).

Full disclosure, I've run GPFS for years both as a customer and employee of IBM. I now work for another company and am looking at testing different products for service offerings. The kind of testing you're doing here is really helpful.

Mr. Blue Coat said...

Hi Doophy,

I too would like to expand the result set to more products and was disappointed that my fairly straightforward requirements didn't produce more matching options.

Regarding your suggestions (and thank you for your full disclosure), I generally distrust commercial offerings in this sector since I believe they're overpriced and overcomplicated, and obtaining GPFS would be difficult since they don't appear to offer it as a trial. Furthermore, there appear to be issues getting it to work on Debian/Ubuntu (which the Docker community favors). For example, see: https://help.ubuntu.com/community/SettingUpGPFSHowTo

Lustre, on the other hand, would be nice to include. I'm rather surprised they don't support Debian considering their product popularity and developer community size, but I guess they don't feel it's a priority. I'll do some research on how feasible it would be to convert and install via CloudFormation.

As for tech notes, I agree I could do better at highlighting the hidden dragons and tricky setup gotchas that are buried in the CloudFormation scripts. Since this posting seems to be gaining in popularity, I may flesh out the details a little more.

Mr. Blue Coat said...

Hi Doophy,

Does Lustre require a patched Linux kernel? See: https://lists.01.org/pipermail/hpdd-discuss/2015-June/002286.html

If so, I'd say that would be a show-stopper for me.

Doophy said...

Storage is big money, and only getting bigger! Companies have no problem leveraging open source code, but they hate giving their stuff away when money could be made!

FYI, GPFS actually comes w/ deb packages now. The doc you linked to mentions v3.1 which hit EOL years ago. Deb packages were included somewhere in the 3.5 timeline, which began in 2012. As for getting copies of GPFS, it's sad that IBM doesn't offer it for personal use. Sad, but not surprising.

You wouldn't be alone in not wanting to patch the kernel. From what I've seen at various customer engagements, most enterprise storage installations exist in pretty closed-off environments. Hence, admins often don't see any need to keep things patched. This is particularly true when supporting scientific communities. Consequently, lots of code intended for enterprise use is only supported on stock kernels (RHEL#(.#) or SLES##(SP#)).

My experience comes mainly from GPFS, where kernel modules are installed. The process is very simple and generally supports updated kernels. I'm not sure about Lustre, but it wouldn't surprise me if it has similar requirements. I'm also not terribly surprised by its lack of Ubuntu support. While its popularity is growing in the enterprise sector, it's still a far cry from Red Hat's market share. I tend to think many companies support SUSE only because it's RPM-based and adapting a product from supporting RH to SUSE generally doesn't require much work.

Personally, I'm not averse to kernel patching - even in stateless installations. I wish more effort was put into DKMS as that has the possibility of alleviating much of the pain of patching kernels!

Doophy said...

Considering that GPFS 3.5 is still maintained, and 4.1 is the current recommended release, it's especially sad that IBM doesn't offer 3.5 for personal/small business/non-profit use.

Unknown said...

Interesting comparison! Would love to see sxfs in the list.
It's included in Skylable SX 2.0: http://www.skylable.com/get-started

Mr. Blue Coat said...

Is it POSIX-compatible?

Mr. Blue Coat said...

Unfortunately SXFS won't work for this test due to incompatibility with CloudFormation (see updated note in my post) and it leans too heavily on interactive mode (prompting user for data input) for my tastes. In addition, it doesn't separate the metadata server role from the mounted client role so it will consume resources from client instances and the client setup is just as complex as the volume servers (since they're the same thing).

Mr. Blue Coat said...

Hi All,

I finally found a way to securely share runtime secrets between cluster nodes in a CloudFormation template so I'm in the process of creating an updated test with Ubuntu 14.04 LTS and the following products:

GlusterFS
LizardFS
XtreemFS
CephFS
SheepFS
SXFS

...stay tuned

kacho said...

Really looking forward to the above - staying tuned....

Mr. Blue Coat said...

Updated test results have been posted: http://mrbluecoat.blogspot.com/2016/02/updated-distributed-file-system.html
