
When you talk about data protection against hardware failure in a storage system, RAID (Redundant Array of Independent Disks) is probably the first technology that comes to mind.

RAID can be hardware or software based, and in the second case it can also be implemented at the filesystem level, with functions similar to RAID but not necessarily the same (think of ZFS, for example).

But with hard disk drive (HDD) capacities edging upwards (6TB HDDs are now available), traditional RAID is becoming increasingly problematic, both because of the rebuild time and because of the bottleneck of each single disk.
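To give a feeling for the rebuild-time problem, here is a rough back-of-the-envelope sketch. The function name `rebuild_hours` and the sustained throughput of 150 MB/s are assumptions chosen only for illustration. Real rebuilds are usually slower, because the array keeps serving I/O during the rebuild.

```python
def rebuild_hours(capacity_tb: float, throughput_mb_s: float) -> float:
    """Best-case rebuild time: the whole replacement disk has to be rewritten once."""
    capacity_bytes = capacity_tb * 1e12                  # decimal TB, as drive vendors count
    seconds = capacity_bytes / (throughput_mb_s * 1e6)   # decimal MB/s
    return seconds / 3600

# 150 MB/s is an assumed sustained write speed, not a measured value.
for tb in (1, 2, 4, 6):
    print(f"{tb} TB at 150 MB/s: {rebuild_hours(tb, 150):.1f} h")
```

Under these assumptions a 6TB disk needs about 11 hours in the best case, and the time grows linearly with capacity because a single disk is the bottleneck.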

New technologies are becoming more interesting as a replacement for RAID: EC and RAIN. According to the TechTarget definitions:

  • Erasure coding (EC) is a method of data protection in which data is broken into fragments, expanded and encoded with redundant data pieces, and stored across a set of different locations, such as disks, storage nodes or geographic locations.
  • RAIN (also called channel bonding, redundant array of independent nodes, reliable array of independent nodes, or random array of independent nodes) is a cluster or group of nodes connected in a network topology with multiple interfaces and redundant storage, providing fault tolerance and graceful degradation.

As you can see, RAIN is just a kind of scale-out storage with a data protection level distributed across nodes. This is nothing new: the LeftHand product, for example, already used something similar to RAID1 across the network and the nodes.

The interesting new part is EC (used in a RAIN environment), which provides a replacement for RAID5, RAID6 or other parity RAID levels in a scale-out environment:

[Figure: Erasure coding]

This is not completely new either: in information theory, erasure codes are well studied, and they were already implemented in object storage and in large public clouds (see, for example, Erasure Coding in Windows Azure Storage).

But the combination of RAIN and EC is becoming more common in storage.

Compared to parity-based techniques, erasure coding is much more complex because it breaks data into smaller fragments that are expanded and encoded with a configurable number of redundant pieces of data and stored across different locations, such as disks, storage nodes (in the RAIN case) or different geographical locations. The key is that you can recover the data from any combination of a smaller number of those fragments, so the array can tolerate the failure of two or more elements.

So it offers more protection than RAID, and Marc Staimer, president of Dragon Slayer Consulting, describes erasure coding as up to 10,000 times more resilient than RAID6.

As written above, an erasure code provides redundancy by breaking objects up into smaller fragments and storing the fragments in different places, so the implementation depends on these numbers:

  • Number of fragments the data is divided into: m
  • Number of fragments the data is recoded into: n (n > m)
  • The key property of erasure codes is that the object stored in n fragments can be reconstructed from any m of those fragments.
  • Encoding rate: r = m/n (r < 1)
  • Storage required: 1/r times the original data size
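As a quick worked example of these numbers, the following sketch computes r, 1/r and the number of fragments that can be lost (n - m, which follows from the "any m of n" property). The function name `ec_profile` and the sample (m, n) pairs are hypothetical, chosen only to compare a 3-copy replica with two erasure-coded layouts.

```python
from fractions import Fraction

def ec_profile(m: int, n: int) -> dict:
    """Space efficiency and fault tolerance of an (m, n) erasure code."""
    if not 0 < m < n:
        raise ValueError("need 0 < m < n")
    rate = Fraction(m, n)                  # r = m/n
    return {
        "rate": rate,
        "storage_factor": 1 / rate,        # 1/r times the original size
        "tolerated_losses": n - m,         # any m of the n fragments are enough
    }

# (1, 3) models plain 3-copy replication; the other pairs are illustrative.
for m, n in [(1, 3), (4, 6), (10, 16)]:
    p = ec_profile(m, n)
    print(f"m={m:2d} n={n:2d}  r={float(p['rate']):.3f}  "
          f"storage={float(p['storage_factor']):.2f}x  "
          f"survives {p['tolerated_losses']} lost fragments")
```

With these sample values, three full copies cost 3x the space and survive two losses, while m=4, n=6 survives the same two losses at 1.5x.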

These numbers largely define the space efficiency and, in part, the performance (more fragments require more computing power, but more nodes can also make more power available). Of course, performance also depends on which erasure code is used and how it is implemented.
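To make the "any m of n" property concrete, here is a minimal sketch of one possible erasure code: a toy Reed-Solomon-style scheme based on polynomial interpolation over the prime field of size 257. It is not the code used by any specific product. The names `encode` and `decode` are hypothetical, it handles a single stripe of m bytes, and it keeps fragment values as plain integers. Production implementations typically work on large blocks and use optimized finite-field arithmetic instead.

```python
from itertools import combinations

P = 257  # smallest prime above 255, so every byte value is a field element

def _interpolate(points, x):
    """Evaluate at x the unique polynomial of degree < len(points) through points (mod P)."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+)
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

def encode(data: bytes, n: int):
    """Return n fragments (x, value); the first m = len(data) carry the data itself."""
    m = len(data)
    if not 0 < m < n < P:
        raise ValueError("need 0 < m < n < 257")
    points = list(enumerate(data))  # fragment x holds data[x] for x < m
    return points + [(x, _interpolate(points, x)) for x in range(m, n)]

def decode(fragments, m: int) -> bytes:
    """Rebuild the m data bytes from any m surviving fragments."""
    if len(fragments) < m:
        raise ValueError("not enough fragments to rebuild the data")
    points = list(fragments)[:m]
    return bytes(_interpolate(points, x) for x in range(m))

data = b"RAIN"                 # m = 4 data fragments
frags = encode(data, n=6)      # n = 6 fragments in total
# Every possible choice of 4 surviving fragments rebuilds the original data.
assert all(decode(subset, 4) == data for subset in combinations(frags, 4))
```

Each extra fragment is one more evaluation of the same polynomial, which is why both encoding and rebuilding cost more computation as m and n grow.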


Virtualization, Cloud and Storage Architect. Tech Field delegate. VMUG IT Co-Founder and board member. VMware VMTN Moderator and vExpert 2010-24. Dell TechCenter Rockstar 2014-15. Microsoft MVP 2014-16. Veeam Vanguard 2015-23. Nutanix NTC 2014-20. Several certifications including: VCDX-DCV, VCP-DCV/DT/Cloud, VCAP-DCA/DCD/CIA/CID/DTA/DTD, MCSA, MCSE, MCITP, CCA, NPP.