
Distributed Filesystems

A round-up of distributed file systems

Goals

Distributed file systems?

A distributed file system generally provides an ideally POSIX-compliant file system interface. This is the most important piece of its definition, because building a cluster of nodes that holds data in a distributed fashion can be achieved in many different ways, but building one that provides access to a usable file system interface is challenging. A file system is usually assumed to be local, and as such many applications assume fast access to it, disregarding the latency issues that can arise on a file system backed by remote data. Very few applications discern between local and remote file systems.
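
As a rough illustration of that gap, a small probe along these lines (a sketch, not from the original post; the paths are placeholders) can be pointed at a local directory and then at a network mount, and a metadata-heavy loop that feels instantaneous locally can turn out orders of magnitude slower over the wire:

    import os
    import time

    def time_small_ops(directory, iterations=1000):
        """Average seconds per create/stat/unlink cycle in `directory`."""
        start = time.perf_counter()
        for i in range(iterations):
            path = os.path.join(directory, f"probe-{i}")
            with open(path, "wb") as f:   # create + open
                f.write(b"x")             # tiny write
            os.stat(path)                 # metadata lookup
            os.unlink(path)               # delete
        return (time.perf_counter() - start) / iterations

    if __name__ == "__main__":
        # e.g. a local disk vs a network mount; both paths are examples
        for mount in ("/tmp", "/mnt/dfs"):
            print(mount, time_small_ops(mount))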

Swapping out a file system for a distributed one can be considered a form of backward compatibility...in the case you want to deploy an application that relies on file system access for its data layer in a cloud environment, the cloud has to provide a file system interface that can arbitrarily replicate across machines. In a single-user case, however, it can also be considered a way to reduce management overhead...instead of tracking backups of the data from every single server you run, you can track the health of the network-based file system and schedule backups on it.

If you don't need strict file system semantics, a distributed object storage interface is simpler and as portable and universal as a file system, with less of a synchronization burden on the network, since object storage per se doesn't hold file system metadata. Some object storage software offers a file system interface built on top.
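
For contrast, the object storage interface boils down to a flat key-to-bytes mapping; a toy in-memory version (purely illustrative) shows why there is less shared state to keep in sync: the "/" in a key is just a character, not a directory entry that every node has to agree on.

    class ObjectStore:
        """Toy object store: a flat namespace, no directories to synchronize."""

        def __init__(self):
            self._objects = {}

        def put(self, key: str, data: bytes) -> None:
            self._objects[key] = data     # whole-object write, no partial updates

        def get(self, key: str) -> bytes:
            return self._objects[key]

    store = ObjectStore()
    store.put("backups/2020/db.tar.gz", b"...")   # "/" is part of the key, not a path
    print(len(store.get("backups/2020/db.tar.gz")))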

Round-up

Since our goal is not big data, we ignore solutions like HDFS.

Here are some benchmark results in tables. They do not cover all the file systems and might be outdated at this point, and in the f2fs results caching might have slipped through :)

Bandwidth

FS         seq      rread     rrw     files   create   read     append   rename  delete
raw        78793    1.0409e6  89958   179483  17300.0  23550.0  14408.0  4677    5373
zfs        102121   1.3985e6  92391   198410  29180.0  4470.0   18980.0  4695    8468
f2fs       2.064e6  1.455e6   101674  184495  28320.0  10950.0  16890.0  4233    3912
xtreemefs  159310   29117     29468   1690    510.0    1190.0   520.0    274     330
glusterfs  178026   17222     18152   5681    4380.0   7620.0   3110.0   413     1076
beegfs     79934    103006    85983   24867   9830.0   12660.0  10470.0  2889    3588
orangefs   330781   54735     41611   5523    5120.0   7020.0   6130.0   638     1989

IOPS

FS         seq   rread   rrw    files  create  read  append
raw        76    266440  22489  44870  4430    6028  3688
zfs        99    358000  23097  49602  7470    1146  4860
f2fs       2064  372524  25418  46123  7250    2803  4325
xtreemefs  155   7279    7366   422    131     306   134
glusterfs  173   4305    4537   1420   1123    1951  798
beegfs     78    25751   21495  6216   2518    3242  2682
orangefs   323   13683   10402  1380   1310    1979  1571

Resources

FS         CPU (Server)  CPU (Client)  RAM (Server)  RAM (Client)
xtreemefs  100           25            300           201
glusterfs  100           50            92            277
beegfs     80            80            42            31
orangefs   15            75            60            20

Data

Here is the benchmark data.
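
For context, a micro-benchmark in the spirit of these tables can be sketched as follows (this is illustrative, not the harness actually used; the mount point is a placeholder):

    import os
    import time

    def write_test(path, block_size, blocks):
        """Write `blocks` blocks of `block_size` bytes, return elapsed seconds."""
        buf = os.urandom(block_size)
        start = time.perf_counter()
        with open(path, "wb") as f:
            for _ in range(blocks):
                f.write(buf)
            f.flush()
            os.fsync(f.fileno())   # force the data out of the page cache
        return time.perf_counter() - start

    if __name__ == "__main__":
        target = "/mnt/testfs/bench.dat"         # placeholder mount point
        t = write_test(target, 1 << 20, 256)     # few large blocks: bandwidth
        print(f"seq write: {256 / t:.1f} MB/s")
        t = write_test(target, 4096, 10000)      # many small blocks: rough IOPS
        print(f"small writes: {10000 / t:.0f} ops/s")
        os.unlink(target)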

The sysctl knobs were tuned for max throughput, but they are arguably useless and probably skew the benchmarks: in a heterogeneous network those knobs are not applied on every node, and they are network dependent anyway, so even where they are applied there could be other bottlenecks in place.
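
For reference, the kind of network knobs in question looks something like the following (an illustrative set, not the exact values used for these runs):

    # /etc/sysctl.d/90-net-throughput.conf -- illustrative values only
    net.core.rmem_max = 16777216
    net.core.wmem_max = 16777216
    net.ipv4.tcp_rmem = 4096 87380 16777216
    net.ipv4.tcp_wmem = 4096 65536 16777216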

Additional comparisons: from Wikipedia, from seaweedfs.