NVMe over TCP: How it supercharges SSD storage using standard IP networks
For maximum storage performance, NVMe/TCP marks the next step forward in SSD networking.
Soon after data centers began transitioning from hard drives to solid-state drives (SSDs), the NVMe protocol arrived to support high-performance, direct-attached PCIe SSDs. NVMe was followed by NVMe over Fabrics (NVMe-oF), which was designed to efficiently support hyperscale remote SSD pools, effectively replacing direct-attached storage (DAS) to become the default protocol for disaggregated storage within a cloud infrastructure.
Most recently, NVMe over TCP has arrived to provide a more powerful NVMe-oF technology, promising high performance with lower deployment costs as well as reduced design complexity. In essence, NVMe over TCP extends NVMe across the entire data center using the simple and efficient TCP/IP fabric.
“Having the ability to communicate at high bandwidth with low latency, while gaining physical separation between storage arrays, and then adding a normal switched network incorporating the TCP protocol for transport, is a game changer,” says Eric Killinger, IT director at business and technology advisory firm Capgemini North America. “Cloud hyperscalers are already adopting this technology, replacing formerly new two- and three-year-old SSD technologies to enable greater query access for data analytics and IoT,” he says.
Background: Emergence of NVMe and NVMe-oF
Storage received a massive speed boost when the first arrays built with NVMe SSDs arrived, but the devices still talked to servers over a SCSI-based host connection. NVMe-oF closes that gap by carrying NVMe commands to NVMe-based block storage devices across switched fabrics, and NVMe-oF deployments can support remote direct memory access (RDMA).
“It’s a high-performance storage networking protocol that’s optimized specifically for solid-state storage … and offers much lower latencies, higher bandwidth, parallelism, and much better efficiencies,” says Eric Burgener, research vice president, infrastructure systems at technology research firm IDC.
NVMe-oF can be used over different types of network transports, including Fibre Channel (FC), Ethernet and InfiniBand. Within Ethernet, there are different transport options, including RDMA over Converged Ethernet (RoCE) and iWARP, as well as TCP.
The downside is that FC, InfiniBand, RoCE and iWARP options all require custom host bus adapters and drivers, making them challenging and expensive to implement and maintain. “NVMe over TCP is a true industry standard and works with the standard Converged Ethernet adapters that ship on almost every enterprise server,” Burgener explains. In addition, most major Linux variants now include an NVMe-over-TCP driver in their standard distribution.
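Because the driver ships with the distribution, a Linux host can report which transport each of its NVMe controllers uses. The following is a minimal sketch, not a vendor tool: it assumes the kernel exposes a "transport" attribute for each controller under /sys/class/nvme (with values such as "pcie" or "tcp"), and the function names nvme_transports and tcp_controllers are hypothetical helpers introduced here for illustration.

```python
from pathlib import Path


def nvme_transports(sysfs_root="/sys/class/nvme"):
    """Return {controller_name: transport} for each NVMe controller found.

    Assumption: each controller directory under sysfs_root contains a
    'transport' file holding a short string such as 'pcie' or 'tcp'.
    """
    root = Path(sysfs_root)
    found = {}
    if not root.is_dir():
        # No NVMe controllers, or not a Linux host: report nothing.
        return found
    for ctrl in sorted(root.iterdir()):
        try:
            found[ctrl.name] = (ctrl / "transport").read_text().strip()
        except OSError:
            # Skip entries that lack a readable 'transport' attribute.
            continue
    return found


def tcp_controllers(transports):
    """Names of the controllers whose transport is reported as 'tcp'."""
    return [name for name, kind in transports.items() if kind == "tcp"]
```

Calling tcp_controllers(nvme_transports()) on a host returns the controllers attached over NVMe/TCP, or an empty list if there are none. The sysfs root is a parameter so that the logic can be exercised against a test directory rather than a live system.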
“It’s a published and accepted standard, which means that it will dominate NVMe-oF deployments in the long run,” Burgener says. “It’s also less expensive to implement and doesn’t require an upgrade schedule outside of standard Linux or Ethernet adapters, but will have a bit higher latency than RoCE, the other Ethernet option that to date has been widely deployed.”
InfiniBand, RoCE and iWARP each support RDMA, while FC and TCP don’t. RDMA support allows slightly lower latency, yet all of the approaches deliver a significant performance improvement over traditional SCSI-based storage networking technologies, such as SCSI over Fibre Channel and iSCSI.
NVMe/TCP deployment and use
A primary reason for adopting NVMe/TCP is to provide a low-latency shared storage solution.
“If you have an all-flash array based on NVMe, but still connected to servers across a SCSI-based storage network, you’re potentially leaving a lot of performance on the table and you aren’t using your solid-state storage resources nearly as efficiently,” Burgener says. “If you want the performance of an NVMe-based all-flash array to be delivered all the way to your applications, you need an NVMe-oF storage network.”
For most organizations, the final transport choice will be driven by whatever technology has already been deployed or by performance-at-scale requirements.
“FC is the best transport for that latter requirement, but the differentiator narrows with each new release of higher bandwidth Ethernet networks, because Ethernet is able to handle more … storage workloads with its higher bandwidth,” Burgener says. “There are very performance-sensitive apps that will do better with FC as the transport layer for NVMe-oF, but there are going to be fewer and fewer of them over time,” he adds.
If an enterprise already has an FC network, it’s relatively easy to run NVMe-oF on it, as many organizations already do. Most commercial greenfield deployments will opt for Ethernet, however, and TCP will eventually win out there, Burgener says.
In terms of applications, “We will probably see a lot of NVMe over TCP in time for AI/ML-driven big data analytics workloads, particularly if they are real-time in nature,” Burgener says. “Another place where [adoption] makes sense is in environments that have consolidated a lot of workloads onto a single storage array and need to be able to deliver performance at scale at high workload density.”
While AI/ML-driven big data analytics adoption is growing, the field is still at a relatively nascent stage. More immediately, NVMe/TCP is gaining traction in settings with ultra-large flash-based storage deployments, particularly when large pools of low-latency data must be rapidly accessible via existing investments in high-bandwidth switched networks.
“Hyperscalers are a natural consumer of this technology, since it enables lightning-fast data access and allows data to be distributed across multiple data-center pods, providing power grid, cooling, and localized high availability architecture benefits without the added costs incurred through normal fiber-optic network buildouts,” Killinger says.
NVMe/TCP can also allow adopters to leverage existing investments in switched networking technologies that are already commoditized and available from many OEMs. “The cost-per-port to connect multiple bonded 10G switched Ethernet ports versus InfiniBand or Fibre Channel alone makes the case to leverage TCP stack implementations of NVMe,” Killinger says.
Many high-end storage adopters are already committed to FC storage networks, and have already upgraded, or are currently planning to upgrade, to RoCE, Burgener notes. Yet this situation will likely change in the next few years.
As storage infrastructure moves increasingly to solid-state storage, and as customers care more about infrastructure efficiency, NVMe over TCP will be the clear winner over SCSI, and it will be inexpensive and easy to implement, Burgener says.
NVMe and its specifications are owned and maintained by NVM Express, Inc., a consortium of network, storage, and other IT-related firms. The NVMe specification, released in 2011, defines how host software communicates with non-volatile memory across a PCI Express (PCIe) bus and is now the industry standard for PCIe SSDs in all form factors. NVMe/TCP was ratified by the NVM Express consortium in 2018.
As things currently stand, NVMe/TCP support is primarily available from networking vendors, such as Lightbits Labs and Mellanox Technologies (now owned by NVIDIA), as well as a handful of storage startups, including Excelero, Pavilion Data, and Infinidat. SSD chipmaker Kioxia (formerly a part of Toshiba) also supports NVMe over TCP.
NVMe/TCP availability is expected to grow rapidly over the next few years. “Most of the major enterprise storage vendors haven’t introduced it yet, but probably will within the next 12 to 18 months,” Burgener predicts.
One current roadblock to short-term, wide-scale NVMe-over-TCP adoption is also a reason that it’s likely to become a long-term success.
“Most IT organizations are at least experimenting with public cloud services, with estimates of over 90% of businesses having some presence in one or more public clouds,” Killinger says. Meanwhile, IT refresh rates are trending downward, and many organizations aren’t budgeting sufficient funds to refresh their aging storage infrastructures with high-end NVMe technologies. “Some of these same companies, however, are looking toward public cloud service providers to replace their corporate IT services, and that’s where NVMe over TCP will thrive, growing many times greater than corporate IT buying power ever could,” he notes.
Looking ahead, Burgener sees a bright future for NVMe/TCP. “But [adoption] probably won’t really start to ramp up until the end of 2022 or 2023,” he predicts.
Killinger is also optimistic that NVMe/TCP will eventually emerge as a mainstay technology. “I can see no reason for NVMe over TCP not to thrive, and even accelerate SSD deployments for the coming years,” he says.
Killinger expects there will soon be a large marketing push from SSD storage OEMs, eager to showcase their products’ performance on NVMe over TCP. “To the right corporate vice president of IT, that will be just enough to sway their procurement decision,” he says.