Who knew storage could be cool again? Pure Storage has shaken up the legacy IT business of enterprise storage with several innovations. Starting from its original FlashArray //M, the company decided to build modular architectures, which resulted in two platforms: FlashArray //X for primary storage, and FlashBlade for object storage and unstructured data.
In the FlashArray //X, Pure Storage built a software-defined flash management layer across the storage fabric. The goal of this layer is to eliminate the inefficiencies that come with using flash drives, such as each drive running its own Flash Translation Layer (FTL). The DirectFlash modules in the FlashArray //X follow this strategy: they make the array more efficient by handling operations such as garbage collection, allocation, I/O optimization and error correction globally across the array rather than per drive.
While this makes FlashArray //X an industry-leading all-flash array (AFA), there are use cases where data needs to be as close as possible to compute resources, such as shared-nothing architectures and big data workloads. Until now, the usual answer has been direct-attached storage (DAS), which resides in a server and is not shared with any other node. Customers choose DAS because it is fast, simple to configure, and very cheap.
But DAS has limitations: it offers no snapshots, replication, deduplication, thin provisioning or business continuity features, and it lacks enterprise-grade reliability. So what does Pure Storage do? They develop DirectFlash Fabric, an enterprise-grade approach that resolves the performance, reliability, and manageability problems of DAS.
DirectFlash Fabric is an end-to-end NVMe solution. The term “end-to-end NVMe” means the storage platform supports NVMe on the front end (the fabric or network that connects servers to the array) as well as on the back end (the NVMe flash devices themselves). With this architecture, a Pure Storage array looks, feels and tastes like DAS.
As a performance comparison, Pure Storage reports that database query times on a standard test workload dropped from 10 seconds on its first FlashArray //M storage array to 2.5 seconds on the FlashArray //X with DirectFlash Fabric, a 4x reduction. For reference, the same query would take five minutes to run on a legacy disk array.
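To make the comparison concrete, here is a minimal sketch that works through the arithmetic using only the three timings quoted above. The function and variable names are illustrative, not part of any Pure Storage tool.

```python
def speedup(baseline_seconds, improved_seconds):
    """Return how many times faster the improved run is than the baseline."""
    return baseline_seconds / improved_seconds


# Query times quoted in the article, in seconds.
flasharray_m_seconds = 10.0
flasharray_x_fabric_seconds = 2.5
legacy_disk_seconds = 5 * 60  # five minutes

# FlashArray //M versus FlashArray //X with DirectFlash Fabric.
print(speedup(flasharray_m_seconds, flasharray_x_fabric_seconds))  # 4.0

# Legacy disk array versus FlashArray //X with DirectFlash Fabric.
print(speedup(legacy_disk_seconds, flasharray_x_fabric_seconds))  # 120.0
```

Going by the quoted numbers, the same query is roughly 120 times faster than on the legacy disk array.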
The illustration below shows the latency areas (in orange) that have been either eliminated or optimized during each step.
To get this level of performance, all you need is a FlashArray //X storage array with NVMe/RoCE adapters installed, plus servers with network adapters that support RoCE (RDMA over Converged Ethernet). RoCE network adapters use hardware offloading and communicate directly with the storage array using remote direct memory access and native NVMe commands. This eliminates the latencies associated with the iSCSI protocol and reduces CPU utilization (Pure measured a 25% CPU saving in its tests).
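On a Linux host, NVMe over RDMA targets are typically attached with the open-source `nvme-cli` tool. The sketch below only builds and prints the command lines; it does not run them. It assumes `nvme-cli` is the host-side tool in use, which the article does not state. The IP address and subsystem NQN are placeholders, and the vendor's own host configuration guide remains the authority on the supported procedure.

```python
import shlex

# 4420 is the port registered for NVMe over Fabrics and the usual default.
NVME_FABRICS_PORT = 4420


def nvme_discover_cmd(target_ip, port=NVME_FABRICS_PORT):
    """Build the nvme-cli command that lists subsystems exposed by a target."""
    return [
        "nvme", "discover",
        "--transport=rdma",
        f"--traddr={target_ip}",
        f"--trsvcid={port}",
    ]


def nvme_connect_cmd(target_ip, subsystem_nqn, port=NVME_FABRICS_PORT):
    """Build the nvme-cli command that attaches one subsystem over RDMA."""
    return [
        "nvme", "connect",
        "--transport=rdma",
        f"--nqn={subsystem_nqn}",
        f"--traddr={target_ip}",
        f"--trsvcid={port}",
    ]


# Placeholder values (assumptions): replace with the array's data-port
# address and the NQN reported by the discover step.
target_ip = "192.0.2.10"
subsystem_nqn = "nqn.2014-08.org.example:placeholder-subsystem"

for cmd in (nvme_discover_cmd(target_ip),
            nvme_connect_cmd(target_ip, subsystem_nqn)):
    print(shlex.join(cmd))
```

Once connected, the remote volume appears to the host as a local NVMe block device, which is what gives the fabric its DAS-like feel.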
RDMA over Converged Ethernet is only the first step: Pure Storage plans to deliver support for NVMe over Fibre Channel later in 2019 and NVMe over TCP in 2020.
Summary: DirectFlash Fabric provides not only better performance but also a smaller data center footprint. A single FlashArray //X with 15 physical servers attached can deliver 1 PB of usable capacity (based on a 5:1 data reduction ratio) in a half-rack footprint. Compare this to a DAS solution that requires nineteen 2U servers and delivers around 250 TB of usable capacity, without any data management services such as deduplication, compression, replication and snapshots.
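The footprint figures can be checked with a few lines of arithmetic. This is a rough sketch using only the numbers quoted above. It assumes decimal units (1 PB = 1000 TB), and the names are illustrative.

```python
TB_PER_PB = 1000  # assumption: decimal units


def capacity_before_reduction_tb(usable_tb, reduction_ratio):
    """Capacity that must be stored physically once data reduction is applied."""
    return usable_tb / reduction_ratio


def rack_units(server_count, units_per_server):
    """Total rack units occupied by a set of identical servers."""
    return server_count * units_per_server


# FlashArray //X figures from the article.
flasharray_usable_tb = 1 * TB_PER_PB
data_reduction_ratio = 5

# DAS figures from the article.
das_server_count = 19
das_units_per_server = 2
das_usable_tb = 250

print(capacity_before_reduction_tb(flasharray_usable_tb, data_reduction_ratio))  # 200.0
print(rack_units(das_server_count, das_units_per_server))  # 38
print(flasharray_usable_tb / das_usable_tb)  # 4.0
```

Under these assumptions, the 1 PB usable figure corresponds to about 200 TB stored after data reduction. The DAS configuration occupies 38U, most of a typical 42U rack, while offering a quarter of the usable capacity.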