Pure Storage wants to make hard drives permanently obsolete

“Disk is done”, we hear very often during Pure Accelerate 2023. Pure Storage has a clear vision of how it plans to achieve this, and claims it will make good on this promise. We outline this vision for you here.

Pure Storage is an unusual player in the storage market. Since its founding in 2009, nearly 15 years ago, the company has focused entirely on flash storage. It has never made a product that includes hard drives. From that perspective, the statement that we need to get rid of hard disks as soon as possible makes a lot of sense: as more storage moves to flash, Pure’s market share grows. Flash is also becoming more widely available. The advent of QLC flash in particular has made it a lot more affordable. TLC, and before that SLC, was (and is) very pricey.

However, the natural transition from magnetic hard drives to flash is not happening fast enough for Pure. Starting this year in particular, it feels it’s in a position to attack hard drives head-on. There are reasons for this, as we will see later. Pure put its money where its mouth is, as it were, when it predicted earlier this year that no new hard drives will be sold after 2028. That’s only five years from now. When you consider that the storage market is currently still overwhelmingly made up of systems that use hard drives, there is a lot to be done before then.

In itself, it stands to reason that hard drives will eventually disappear. They are already gone from most devices. The first iPod contained a hard drive, for example, and laptops used them as well. All of that has been replaced by NAND flash. Basically, NAND is more energy efficient and offers better performance than hard drives. So you create significantly better products when you move away from old-fashioned technology to a modern one. Right now, HDDs are only used in two locations: data centers of (enterprise) organizations and data centers of hyperscalers. The only reason they are still used there is that HDDs are cheaper to buy than flash-based storage.

Pure has the wind in its sails

Before we get into the specific technical reasons why Pure thinks it has a good chance of taking over a large part of the traditional storage market, let’s take a brief look at Pure’s position in the market. The company is showing very healthy growth. Annual sales are now well past $2 billion and Pure is also increasingly replacing hard drives, CEO Charlie Giancarlo indicated in a session we attended during Accelerate.

To further highlight Pure’s achievements, one slide pops up several times during the event. Based on figures from IDC, it shows that Pure is the only storage vendor that gained market share between 2014 and 2022. All others have had to give up a little to a lot.

DFM as a differentiator

When we look for reasons why Pure is doing so well in the market, we can’t ignore DFM. You can see the DirectFlash Module as the basis of all the company’s success. The name already gives a bit of an indication of what these modules are and what they do. Pure develops the modules itself and places NAND flash on them. It can then address this flash memory directly from Pure1, the management environment for FlashArray and FlashBlade products.

The architecture of DFMs is radically different from that of hard disk drives and SSDs. For example, SSDs use an extra layer of mapping tables stored in DRAM. These tables map the SSD’s writes to flash memory. This requires setting aside memory on the SSD, which comes down to about 1 GB per 1 TB. SSDs also need to overprovision, to make sure they can always deliver. Think of the overwrites they have to do because of I/O alignment issues. This overprovisioning quickly amounts to about 20 percent. The same applies to HDDs, by the way, only there the percentages are even higher.
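To put these figures in perspective, here is a rough back-of-the-envelope calculation in Python. It is only a sketch: it reads the 1 GB per 1 TB figure as DRAM for the mapping table, treats the roughly 20 percent overprovisioning as a share of raw flash that is held back, and uses a 30 TB SSD as an example. The function name and parameters are our own and do not come from Pure.

```python
def ssd_overhead(raw_tb, dram_gb_per_tb=1.0, overprovision=0.20):
    """Estimate the overhead of a conventional SSD.

    Assumptions (ours): DRAM scales linearly with raw capacity, and
    overprovisioning is a fraction of raw flash that cannot be used.
    Returns (dram_gb, usable_tb).
    """
    dram_gb = raw_tb * dram_gb_per_tb
    usable_tb = raw_tb * (1 - overprovision)
    return dram_gb, usable_tb


# Example: a 30 TB SSD
dram_gb, usable_tb = ssd_overhead(30)
print(f"DRAM for mapping: {dram_gb:.0f} GB, usable capacity: {usable_tb:.0f} TB")
```

Under these assumptions, a fifth of the raw flash in such a drive is not available for data, on top of the DRAM the drive has to carry.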

With a DFM, there is no need to run a mapping table in an additional layer in DRAM. It also does not require overprovisioning of NAND. Mapping writes to flash is host-based. That is, it is done directly from the management environment.

Of course, the above is only possible if Pure optimally integrates hardware and software. Therein lies the crux. Pure develops and builds everything itself. This was not true from the beginning of its existence, by the way. Its very first FlashArrays still used SSDs. In 2014, it made the switch to DFMs.

The choice in 2014 to use DFMs makes Pure a fundamentally different storage vendor from all the other well-known names. Those are also increasingly bringing flash-based systems to market, but they all use SSDs. However, SSDs work broadly the same as HDDs, only they are a lot faster. They have the same issues around things like overhead as HDDs.

These issues mean that the capacity of SSDs will grow a lot slower than that of DFMs. In the new FlashArray//E, for example, there is a DFM with a capacity of 75 TB. SSDs are at 30 TB. By the end of next year, Pure will launch a model with a capacity of 150 TB. SSDs are expected to be at 60 TB by then. By the end of 2025, Pure will hit 300 TB per DFM. This is without the modules getting any bigger, by the way. Pure simply puts more NAND on a module.

There is no need to talk about HDDs at all in these comparisons. They will reach their limits much sooner in their current form. The differences between DFMs, on the one hand, and SSDs and HDDs, on the other, will only increase, John Colgrove, one of Pure’s founders, predicts during Accelerate.

Getting more affordable

DFMs with huge capacities sound very interesting. However, there is also such a thing as affordability. That was (and still is) the main reason for organizations to go for HDDs as opposed to flash. The announcement of FlashArray//E should address this. With it, the company completes its portfolio (which we will discuss in more detail below). FlashArray//E is aimed at offering the highest possible capacity of file and block storage, at the lowest possible price. It is the counterpart to FlashBlade//E, which came to market earlier this year and offers file and object storage in a single platform.

Charlie Giancarlo, CEO of Pure Storage

Pure wants to make a statement with the //E line. It basically wants to cover the entire lower end of the market with it. This is the first time a FlashArray and a FlashBlade have been given the same naming convention. They both aim to achieve the same thing, but for different workloads. Pure set the price per GB at 20 cents. We hear from Jay Subramanian, in charge of Pure’s FlashBlade business, that this price is based on the 48TB DFM included in FlashBlade//E. The latest DFM with a capacity of 75TB will hit the market in a month or so.

Price per GB, by the way, has played a leading role in the development of FlashBlade//E and FlashArray//E. We gather this from the words of CTO International Alex McMullan, when we ask him how the 20 cents per GB is calculated. According to him, this is not a hugely complicated calculation and is based mainly on the so-called cost of goods. For the math to add up, a certain capacity of flash must be purchased. The minimum commitment of 1 PB for FlashArray//E and 4 PB for FlashBlade//E is largely the result of this. In other words, with 1 and 4 petabytes, respectively, you end up with 20 cents per gigabyte. It also follows that the cost per gigabyte goes down as you purchase more.
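As a rough illustration of what these minimum commitments mean in absolute terms, the arithmetic can be sketched as follows. This assumes decimal units (1 PB = 1,000,000 GB) and a flat 20 cents per GB at the minimum commitment. What exactly that price includes (hardware only or also services) is not something we can derive from the figures above, and the function name is our own.

```python
GB_PER_PB = 1_000_000  # decimal units; an assumption on our part


def commitment_cost_usd(capacity_pb, price_per_gb_usd=0.20):
    """Total cost of a commitment at a flat price per GB (illustrative only)."""
    return capacity_pb * GB_PER_PB * price_per_gb_usd


# Minimum commitments mentioned above
for name, capacity_pb in [("FlashArray//E", 1), ("FlashBlade//E", 4)]:
    print(f"{name}: {capacity_pb} PB at $0.20/GB = ${commitment_cost_usd(capacity_pb):,.0f}")
```

Since the price per gigabyte reportedly drops as capacity increases, larger deployments would come out below this simple linear estimate.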

More than price per gigabyte

A price of 20 cents per GB is undoubtedly low for flash. However, if we look at HDDs, they still seem to be quite a bit cheaper. Their price is already close to one cent per GB these days. So it doesn’t sound all that affordable yet.

Yet again, there are a few caveats to this. First of all, a GB on an HDD is not the same as a GB on a DFM. We saw above that a DFM has much more usable capacity than an HDD (and an SSD). To get to a DFM’s usable capacity of 75 TB, you need a lot more than 75 TB of HDDs. So this should mean that the ‘realistic’ price per gigabyte of an HDD is higher, and as such closer to that of a DFM.
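The reasoning above boils down to dividing the raw price per GB by the fraction of capacity that is actually usable. A minimal sketch of that calculation follows. The 25 percent HDD overhead is a purely hypothetical placeholder (we only know it is said to be higher than the roughly 20 percent for SSDs), and the DFM is assumed to have no overprovisioning overhead, in line with Pure’s claim. Real-world figures would also depend on data reduction, RAID and other factors that this sketch ignores.

```python
def price_per_usable_gb(raw_price_per_gb, overhead_fraction):
    """Raw price per GB corrected for capacity that cannot be used for data."""
    if not 0 <= overhead_fraction < 1:
        raise ValueError("overhead_fraction must be in [0, 1)")
    return raw_price_per_gb / (1 - overhead_fraction)


# Hypothetical inputs: HDD at ~1 cent/GB with 25% overhead (placeholder value),
# DFM at 20 cents/GB with no overprovisioning overhead (as Pure claims).
hdd = price_per_usable_gb(0.01, 0.25)
dfm = price_per_usable_gb(0.20, 0.0)
print(f"HDD: ${hdd:.4f} per usable GB, DFM: ${dfm:.4f} per usable GB")
```

With these placeholder values the gap narrows somewhat but remains large, which is why the other factors discussed in this article, such as power consumption and footprint, matter for the overall comparison.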

There is also power consumption. This is where flash can make a huge impact. During Accelerate, someone from Virgin Media O2 was on stage talking about a 98 percent reduction in power consumption compared to what they had before they switched to Pure. Of course, such math depends on where you’re coming from and how old (read: inefficient) the equipment was. But it should be clear that significantly lower square footage and number of racks coupled with inherently lower power consumption can result in substantial savings in terms of energy consumption. Especially with the developments in the energy market over the past few years in mind, this is a very interesting side benefit, not to mention the green aspect of it.

It is very difficult to make an accurate cost estimate for HDDs, SSDs and DFMs as it involves quite a few variables. But we would venture to say that blindly choosing HDDs because they are cheaper is no longer the best approach. Organizations would do well to look a little further than this figure.

Complete and clear portfolio

We have already mentioned some of the products from Pure’s portfolio above. It’s time now to take a closer look at the portfolio as a whole. It is rather remarkable, first of all because it is very clear. Pure basically carries two product lines, FlashArray and FlashBlade. The first focuses on unified file and block storage, the second on unified file and object storage. You can think of FlashArray as the storage for all mission-critical and everyday workloads, while FlashBlade focuses more on things like HPC, training AI models, analytics, genomics, that sort of thing.

Within the two main categories, Pure has subdivided based on a spectrum, running from ultimate performance to ultimate capacity. For the best performance within FlashArray, there’s FlashArray//XL; within FlashBlade, it’s FlashBlade//S. FlashArray//X still has the focus on performance. It uses TLC flash, just like the //XL. FlashArray//C should offer the optimal balance between performance and capacity, while the new FlashArray//E is mainly about the highest possible capacity. Here we also find the FlashBlade//E. The latter three models use QLC flash.

So FlashBlade//E is also used for the heavy workloads we cited above for FlashBlade//S, but for data that is used less frequently. According to Subramanian, this is in about eighty percent of cases. So the addition of FlashBlade//E earlier this year is a huge leap forward for Pure in this market. After all, this is all potentially new business. Previously, organizations used HDDs or SSDs for this. He expects a lot of growth in this.

Even more simplification and convergence

The portfolio in itself is pretty straightforward, but behind the scenes Pure has been working to fuse it all together a bit more, we hear during a technical deep dive. Pure often refers to the new version of the FlashArray//X and FlashArray//C, the R4 versions, which Pure announced during Accelerate, as XC R4. This is not out of laziness; Pure has good reasons for that. In fact, behind the scenes there is increasing convergence. These are arrays with the same NICs, the same CPUs and the same controllers. The controller is deployed differently, of course, but is basically the same. The main difference is that FlashArray//C R4 uses QLC flash, while FlashArray//X R4 uses TLC flash.

Such behind-the-scenes simplification not only means that products can be developed faster. It also makes things more manageable in terms of spare parts. This, in turn, should eventually be noticeable to customers. Pure can make these parts in larger quantities, which should also mean that their prices go down.

Despite simplification behind the scenes, the new FlashArray//C R4 and FlashArray//X R4 offer a huge leap forward in terms of performance. Where the performance improvement from one generation to the next used to be about 20 percent, it’s about 40 percent with the R4 version, we hear from Shawn Hansen, VP and GM for FlashArray at Pure. During the technical deep dive, one of Pure’s spokespeople states that the performance jump for FlashArray//C is as much as 65 percent. Those are undoubtedly very solid performance gains.

Exactly what these performance gains are attributable to is a bit difficult to determine. How much is due to new hardware (like Intel’s Sapphire Rapids Xeon Scalable CPUs) and how much is due to what Pure adds? We can’t get a very clear answer to that. However, we can report that PCIe 5 is not available on the FlashArray for now, even though Sapphire Rapids does support it. A major reason for staying on PCIe 4 has to do with power consumption. In addition, Pure sees that there is also currently no demand for the bandwidth that PCIe 5 has to offer.

Simplicity comes first

In terms of the portfolio, Pure overwhelmingly likes clarity and simplicity. The only deviation from that might be that there are now two different models with the //E suffix. Since FlashBlade and FlashArray have fundamentally different use cases, the confusion this may cause won’t be too bad in practice. Other than that, it is simplicity that rules the roost in Pure’s offerings.

This simplicity is very difficult to achieve, we hear in conversation with Pure CTO Rob Lee. Traditionally, it has also not been common to focus on simplicity. Every time a new feature is added, there is an expectation that users can configure it, he points out. But offering many options creates complexity, and that is not the intention. A good example of how seriously Pure takes this simplicity is the fact that there is no power button on its equipment. Since Pure guarantees zero downtime, even when upgrading arrays and blades to the latest generation according to its Evergreen principles, there is no need for such a button at all.

Lee does still see a challenge at this point within many organizations. Pure regularly gets feedback from potential customers that they want to do and see more. However, this is mostly about fear, Lee believes. The idea that you know more if you have more data is pretty ingrained in people’s minds.

At the end of the day, Pure’s emphasis on simplicity produces a very uncluttered picture. There is one OS with Purity, one management layer with Pure1 and two hardware architectures with FlashArray and FlashBlade. Within those components, Pure also tries to simplify as much as possible. It succeeds quite well, although the Pure1 interface could use some attention as far as we are concerned. The panel on the left side of the screen contains a lot of options. As the capabilities of such a management platform increase, there is no escaping the fact that eventually something has to be done about things like that.

From left to right: Pure’s new 75TB DFM, an SSD module and an HDD.

Evergreen cloud model with SLAs

While DFM is the differentiator for Pure in terms of technology, it also wants to differentiate itself in terms of product and service delivery. Its goal is to always have and keep customers on the latest version of the platform. This applies to both software and hardware. The first, software-as-a-service, is not extremely complicated. The second, hardware-as-a-service, is. That’s what we hear from Prakash Darji, GM Digital Experience Business Unit at Pure. After all, the company promises zero downtime. How do you do that when it is time to upgrade or replace a system? Pure always does this without having to migrate customers’ data, he points out.

When we look at a cloud model, it’s not just about how you purchase it, but also how you use it. This is where Pure Fusion plays an important role. With it, you actually abolish tiering within your organization. You manage all your storage as a storage pool (or purchase it as a service), no longer having to think about what data should land where.

A third important pillar in terms of what Pure has to offer consists of the SLAs it offers. This then involves what you might call the digital experience. Pure now offers six SLAs: uptime, buffer capacity, performance, zero planned downtime, energy efficiency and, most recently, ransomware recovery. With the latter, Pure is also getting more involved in the security posture of organizations. In addition, it is the first SLA that does not come standard with a contract with Pure. It is an add-on.

According to Taruna Gandhi, VP of Product Marketing, Digital Experience Business Unit at Pure, the addition of this most recent SLA is an important step, because resilience needs to be set up optimally at all layers of the IT environment. With this most recent SLA, you get recommendations on how to do this for the part that Pure operates in. In addition, anomaly detection is already available in this first iteration. With this, Pure1 raises a signal when there is a sudden decrease in the amount of data. More such signals will be added in the future, she indicates. Think of file renaming, something that could also indicate foul play.
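To make the idea of such a signal concrete, here is a minimal sketch of how a sudden decrease in the amount of stored data could be flagged. This is explicitly not how Pure1 works internally, which we have no insight into. The function name, the 10 percent threshold and the sample values are all hypothetical.

```python
def sudden_drops(samples_tb, max_drop_fraction=0.10):
    """Return the indices of samples where the stored data volume fell by
    more than max_drop_fraction compared with the previous sample.

    Hypothetical illustration; threshold and sampling interval are assumptions.
    """
    flagged = []
    for i in range(1, len(samples_tb)):
        previous, current = samples_tb[i - 1], samples_tb[i]
        if previous > 0 and (previous - current) / previous > max_drop_fraction:
            flagged.append(i)
    return flagged


# Hypothetical daily measurements of stored data in TB
usage_tb = [100.0, 101.5, 102.0, 71.0, 71.5]
print(sudden_drops(usage_tb))  # [3]: the drop from 102.0 TB to 71.0 TB
```

A production system would presumably combine several such signals, for example the file renaming mentioned above, rather than rely on a single threshold.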

Killing hard drives is not a pure price/technology discussion

In this article, we have tried to outline Pure’s unique approach in the storage market. This is to clarify how the company can begin to deliver on the claim/promise that hard disks have had their day. In doing so, it is good to realize that it is about much more than just the price per gigabyte and the technology of DFMs. Pure likes to advertise all sorts of impressive numbers: how much better the performance is, how much higher the capacity, how much lower the footprint and so on. That in itself is very interesting, but as far as we’re concerned, it’s not the way Pure is going to get everyone off hard drives.

The ‘problem’ in the storage market is that storage is seen as a commodity. That in itself is a good thing, but it also means that many organizations think of it as a one-size-fits-all market when it comes to equipment. Storage is storage, no more, no less. For most vendors, that’s true. With Pure, however, things are a little different. Partly because of the DFMs, but mostly because of the model it has built around them. Pure’s SaaS and HwaaS approach makes the company special. As far as we’re concerned, that’s where the real added value of the investment a customer makes is.

That’s not necessarily a simple story to tell. Hardware-as-a-service in particular does not have a tremendously good reputation in the infrastructure world because it is expensive. However, Pure also does this in a different way than many other vendors. That is the story that customers need to understand. If they do, Pure will grow even faster in the coming years than it is already doing. The focus on simplicity that Pure has must, of course, continue to be strictly monitored. That goes without saying. Without simplicity there is no Pure, even if the simple route is not always the fastest and easiest.