The Composable Platform - Enabling Innovation Through Flexible Infrastructure
Data centers have the formidable task of improving operating efficiency and maximizing their IT investments in hardware infrastructure in the face of evolving and varied application requirements. Last year alone, data centers worldwide spent well over $140B on server and storage infrastructure, yet they still do not operate at peak efficiency.
New architectures are needed to better utilize and optimize hardware assets spanning compute, memory, and storage. Enabling resource agility, where physical compute, memory, and storage resources are treated as composable building blocks, is key to unlocking efficiencies and eliminating stranded, underutilized assets.
In part 1 of this 2-part excerpt, we will explore the inefficiency of today’s data center platform and the challenges this poses. In part 2 we will learn how Microchip innovations solve these challenges, building an agile infrastructure of compute, memory and storage. We will also highlight technology advancements that both the industry and vendors like us are enabling to meet the needs of a composable platform.
Challenges in Computing
The data center accelerator market alone recorded $2.8B in revenue in 2018, and over the next five years it is expected to grow at a compound annual growth rate of over 50%, exceeding $20B in 2023. That is a lot of new money going into accelerators, FPGAs, GPUs and the like.
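As a quick sanity check on that projection, here is a back-of-envelope sketch; the 2018 baseline and the 50% growth rate are the figures cited above, and the rest is simple arithmetic:

```python
# Back-of-envelope check: does a 50% CAGR take a $2.8B market
# past $20B in five years (2018 -> 2023)?
base_revenue_b = 2.8  # 2018 accelerator revenue, in $B (figure cited above)
cagr = 0.50           # compound annual growth rate
years = 5             # 2018 through 2023

projected_b = base_revenue_b * (1 + cagr) ** years
print(f"Projected 2023 revenue: ${projected_b:.1f}B")  # ~$21.3B
```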
Compounding the spending problem, each workload has unique optimization points for compute, memory, and storage resources. Yet today's data center operations largely rely on fixed-configuration systems that cannot easily adapt as workload needs change.
Challenges in Memory
Over $20B of DRAM was purchased and deployed in data centers last year, but much of that DRAM is not being used. In many data centers, average DRAM utilization sits at around 50%. How many billions of dollars of stranded DRAM are sitting in data centers around the world right now? And this isn't only a CAPEX problem: DRAM consumes 15-20% of a data center's power. Stranded memory will become an even bigger concern in the future, and we need to start using those resources efficiently.
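A rough, illustrative calculation puts numbers on the problem. The spend, utilization, and power figures are those cited above; the assumption that DRAM power tracks provisioned rather than used capacity is an added simplification:

```python
# Rough estimate of stranded DRAM spend and power, using the figures
# cited above. Illustrative only; assumes DRAM power is consumed by
# provisioned capacity whether or not it is actually used.
dram_spend_b = 20.0        # $B of DRAM deployed last year (cited above)
avg_utilization = 0.50     # average DRAM utilization (cited above)
dc_power_fraction = 0.175  # DRAM share of data center power, midpoint of 15-20%

stranded_spend_b = dram_spend_b * (1 - avg_utilization)
idle_power_fraction = dc_power_fraction * (1 - avg_utilization)
print(f"Stranded DRAM spend: ~${stranded_spend_b:.0f}B")                # ~$10B
print(f"Data center power feeding idle DRAM: ~{idle_power_fraction:.0%}")  # ~9%
```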
A second significant memory challenge is that memory bandwidth has not been scaling with compute core counts. Core counts keep increasing, but the number of memory channels attached to the CPU has not kept pace due to physical packaging and pinout constraints. The result is that memory bandwidth is decreasing and memory latency is increasing on a per-core basis.
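A simple model makes the trend concrete. The core and channel counts below are illustrative round numbers, not specs for any particular CPU, and the per-channel figure assumes a DDR4-3200 channel:

```python
# Illustrative model: per-core memory bandwidth shrinks as core counts
# outpace memory channels. Round example numbers, not specific CPUs.
BW_PER_CHANNEL_GBS = 25.6  # one DDR4-3200 channel: 3200 MT/s x 8 bytes

generations = [
    # (cores, memory channels)
    (8, 4),
    (16, 6),
    (32, 8),
    (64, 8),  # channel count stalls on packaging/pinout limits
]

for cores, channels in generations:
    per_core = channels * BW_PER_CHANNEL_GBS / cores
    print(f"{cores:3d} cores, {channels} channels -> {per_core:5.2f} GB/s per core")
```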
Challenges in Storage
Enterprise NVMe SSD revenue topped $15B last year, and that is in addition to the $10B or so being spent on spinning media and packed into data centers around the world.
So, what are some challenges with all this storage? Traditional architectures require that storage be over-provisioned, resulting in stranded storage in a typical server. And even though we have very high-performance NVMe SSDs, machine learning applications require more bandwidth than a single NVMe SSD can deliver. We also live in a world of multiple media types and interfaces. We have SSDs and we have HDDs. In the world of an HDD, it's not just an HDD: is it single-ported or multi-ported? Single-actuator or multi-actuator? Is it a SATA device or a SAS device? In the world of SSDs: is it an NVMe, SAS, or SATA interface? There is a multitude of different interfaces to deal with.
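To make the bandwidth point concrete, here is a hypothetical sizing sketch. Both throughput numbers are assumptions chosen for illustration, not measurements of any particular drive or workload:

```python
# Hypothetical sizing sketch: how many NVMe SSDs must be aggregated,
# for example behind a PCIe switch, to feed a bandwidth-hungry ML job?
# Both figures below are illustrative assumptions.
import math

ssd_read_gbs = 7.0          # assumed sequential read rate of one PCIe Gen4 x4 SSD
workload_demand_gbs = 40.0  # assumed ingest rate of a multi-GPU training job

drives_needed = math.ceil(workload_demand_gbs / ssd_read_gbs)
print(f"Drives to stripe across: {drives_needed}")  # 6 in this example
```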
Inside the SSD, the pace of innovation also continues unabated. NAND is moving from MLC to TLC to QLC. NAND layer counts are ever-increasing, from 96 layers to 128 to 176. And latencies continue to be driven down to previously unheard-of levels. New drive form factors are being designed to improve capacity per unit of real estate and to provide the thermal headroom that higher capacities demand. How do we keep fixed hardware configurations in the data center from isolating all that capacity, leaving it wasted or stranded by workloads that don't require it?
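The compounding effect of more bits per cell and more layers is easy to see with a little arithmetic. The sketch below is a deliberate simplification that ignores die area, cell pitch, and process differences:

```python
# Simplified view of how bits-per-cell and layer count compound to raise
# NAND density. Ignores die area, cell pitch, and process differences.
def relative_density(bits_per_cell: int, layers: int,
                     base_bits: int = 3, base_layers: int = 96) -> float:
    """Density relative to a TLC, 96-layer baseline."""
    return (bits_per_cell * layers) / (base_bits * base_layers)

print(f"TLC, 128 layers: {relative_density(3, 128):.2f}x")  # 1.33x
print(f"TLC, 176 layers: {relative_density(3, 176):.2f}x")  # 1.83x
print(f"QLC, 176 layers: {relative_density(4, 176):.2f}x")  # 2.44x
```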
There must be a better way! In the next and final installment in this series, I will describe our building-block approach for adapting to new use cases and requirements while enabling system-level composability. With composable, flexible infrastructure, or what we like to call agile infrastructure, tremendous strides in efficiency are possible.
Andrew Dieckmann is the Vice President of Marketing Application Engineering for the Data Center Solutions Division at Microchip Technology. He is responsible for product management, product marketing, product strategy and the application engineering team supporting Microchip’s broad portfolio of storage solutions, including SSD controllers, RAID solutions, HBAs, PCIe switches, and SAS expanders.