(updated) Distributed File System benchmark
Note: this is an update to my previous test
I'm investigating various distributed file systems (loosely termed here to include SAN-like solutions) for use with Docker, Drupal, etc., and couldn't find recent benchmark stats for some popular solutions, so I figured I'd put one together.
Disclaimer: This is a simple benchmark test with no optimization or advanced configuration, so the results should not be interpreted as authoritative. Rather, it's a 'rough ballpark' product comparison to augment additional testing and review.
Candidates had to meet the following criteria:
- No single point of failure (masterless, multi-master, or automatic near-instantaneous master failover)
- POSIX-compliant (user-land FUSE)
- Open source (non-proprietary)
- Production ready (version 1.0+, self-proclaimed, or widely recognized as production-grade)
- New GA release within the past 12 months
- Ubuntu-compatible and easy enough to set up via CloudFormation (for benchmark testing purposes)
The contenders:
- GlusterFS 3.7.6 [2015-11-09]
- ('replicated volume' configuration) - CloudFormation script
- LizardFS 3.9.4 [2015-12-09]
- XtreemFS 1.5.1 [2015-03-12]
- couldn't get write-replication (WqRq or WaR1) to work ("Input/output error") - CloudFormation script
- CephFS 9.2.0 [2015-11-06]
- SheepFS 0.9.3 [2015-11-05]
- can't write any files to the client mounted folder ("cannot touch 'text.txt': Function not implemented") - CloudFormation script
- SXFS 2.0 [2015-12-15]
The following were considered but excluded:
- Bazil is not production ready
- BeeGFS server-side components are not open source (EULA, Section 2)
- Behrooz (BFS) is not production ready
- Chirp/Parrot does not have a *.deb package
- Gfarm 2.6.8 (compiled from source) kept returning x.xx/x.xx/x.xx from gfhost -H for every non-local filesystem node (and, in general, the documentation and setup process were terrible)
- GPFS is proprietary
- Hadoop's HDFS is not POSIX-compliant
- Lustre does not have a *.deb package and requires a patched kernel
- MaggieFS has a single point of failure
- MapR-FS is proprietary
- MooseFS only provides high availability in their commercial professional edition
- ObjectiveFS is proprietary
- OpenAFS's Kerberos requirement is too complex for CloudFormation
- OrangeFS is not POSIX-compliant
- Ori latest release Jan 2014
- QuantcastFS has a single point of failure
- PlasmaFS latest release Oct 2011
- Pomegranate (PFS) latest release Feb 2013
- S3QL does not support concurrent mounts and read/write from multiple machines
- SeaweedFS is not POSIX-compliant
- Tahoe-LAFS is not recommended for POSIX/fuse use cases
- TokuFS latest release Feb 2014
AWS Test Instances:
- Ubuntu 14.04 LTS paravirtual x86_64 (AMI)
- m1.medium (1 vCPU, 3.75 GB memory, moderate network performance)
- 410 GB hard drive (local instance storage)
Three master servers were used for each test run of 2, 4, and 6 clients. Each client generates a small amount of background disk activity (one file update and one file create per minute):
(crontab -l ; echo "* * * * * ( echo \$(date) >> /mnt/glusterfs/\$(hostname).txt && echo \$(date) > /mnt/glusterfs/\$(hostname)_\$(cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 25 | head -n 1).txt )") | sort - | uniq - | crontab -
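The one-liner above can be unpacked into a plain script. This is only a readable sketch of what each cron invocation does; the function names are illustrative, and the mount path mirrors the one in the original command:

```shell
#!/bin/sh
# Sketch of the background load each cron run generates:
# one file update plus one file create per invocation.
MOUNT="${1:-/mnt/glusterfs}"

random_suffix() {
    # 25 random alphanumeric characters, as in the original one-liner
    # (the 'cat /dev/urandom' is replaced by a plain redirect).
    tr -dc 'a-zA-Z0-9' < /dev/urandom | fold -w 25 | head -n 1
}

generate_load() {
    dir="$1"
    # Update: append the current date to a stable per-host file.
    echo "$(date)" >> "$dir/$(hostname).txt"
    # Create: write the date to a new, randomly named per-host file.
    echo "$(date)" > "$dir/$(hostname)_$(random_suffix).txt"
}

if [ -d "$MOUNT" ]; then
    generate_load "$MOUNT"
fi
```

Registering it under `* * * * *` in crontab, as above, yields one update and one new file per client per minute.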
Results of the three tests were averaged. Benchmark testing was performed with bonnie++ 1.97 and fio 2.1.3.
$ sudo su -
# apt-get update -y && apt-get install -y bonnie++ fio
# bonnie++ -d /mnt/glusterfs -u root -n 1:50m:1k:6 -m 'GlusterFS with 2 data nodes' -q | bon_csv2html >> /tmp/bonnie.html
# cd /tmp
# wget -O crystaldiskmark.fio http://www.winkey.jp/downloads/visit.php/fio-crystaldiskmark
# sed -i 's/directory=\/tmp\//directory=\/mnt\/glusterfs/' crystaldiskmark.fio
# sed -i 's/direct=1/direct=0/' crystaldiskmark.fio
# fio crystaldiskmark.fio
Translation: "Login as root, update the server, install bonnie++ and fio, then run the bonnie++ benchmark tool in the GlusterFS-synchronized directory as the root user using a test sample of 1,024 files ranging between 1 KB and 50 MB in size spread out across 6 sub-directories. When finished, send the raw CSV result to the html converter and output the result as /tmp/bonnie.html. Next, run the fio benchmark tool using the CrystalDiskMark script by WinKey referenced here."
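The two sed edits retarget the CrystalDiskMark job file at the DFS mount and turn off direct I/O (many FUSE filesystems reject O_DIRECT). A quick sketch of their effect on a stand-in job snippet (the real WinKey file has more sections than this):

```shell
#!/bin/sh
# Apply the same sed edits from above to a minimal stand-in fio job
# snippet; this is NOT the actual WinKey job file, just two of its
# presumed global settings.
job=$(mktemp)
cat > "$job" <<'EOF'
[global]
directory=/tmp/
direct=1
EOF

# Point the job at the distributed mount instead of local /tmp/ ...
sed -i 's/directory=\/tmp\//directory=\/mnt\/glusterfs/' "$job"
# ... and disable O_DIRECT so FUSE-backed mounts don't error out.
sed -i 's/direct=1/direct=0/' "$job"

cat "$job"
```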
1. Only GlusterFS and LizardFS could complete the intense multi-day bonnie++ test. The others failed with these errors:
- CephFS (both kernel and fuse)
- Can't write block.: Software caused connection abort
- Can't write block 585215.
- Can't sync file.
- Can't write data.
2. [Table: Seq Create (sec) | Rand Create (sec)]
3. GlusterFS took at least twice as long as LizardFS to complete the bonnie++ tests (literally 48 hours!). Out of curiosity I switched to xfs, which improved performance significantly (under 24 hours); however, all reported tests were done with ext4 (the Ubuntu default).
4. CephFS did not complete the "Rand-Read-4K-QD32" fio test.
Results:
(Note: raw results can be found here)
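Averaging the 2-, 4-, and 6-client runs per metric amounts to a simple mean, e.g. with awk. The two-column CSV layout below is a stand-in for illustration, not bonnie++'s actual column order:

```shell
#!/bin/sh
# Average one metric across the 2-, 4-, and 6-client runs.
# The CSV below is a made-up stand-in layout: label,value.
csv=$(mktemp)
cat > "$csv" <<'EOF'
2-clients,100
4-clients,110
6-clients,120
EOF

# Mean of column 2 across all rows.
avg=$(awk -F, '{ sum += $2; n++ } END { printf "%d", sum / n }' "$csv")
echo "$avg"
```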
- Since GlusterFS and LizardFS were the only ones that could complete the more intense bonnie++ test, I would feel more confident recommending them as "production ready" for heavy, long-term loads.
- Also (as mentioned above), LizardFS was much faster than GlusterFS (at the cost of higher CPU usage).
- In terms of setup and configuration, GlusterFS was easiest, followed by LizardFS, then SXFS, and finally (in a distant last place) CephFS.
- SXFS shows promise but they'll need to simplify their setup process (especially for non-interactive configuration) and resolve the bonnie++ failure.
- My overall recommendation is currently ~~LizardFS~~ GlusterFS. (Update: I have stopped recommending LizardFS because metadata HA is not currently supported out of the box; see comments below.)