NVIDIA STX: Storage and Context Memory Enter the AI Factory Stack

Date: March 16, 2026 | Event: GTC 2026 Day 1 Announcement | Ticker: MULTI; read-throughs to NVDA, DDN, VAST, WEKA, DELL, HPE, NTAP, PSTG/Everpure, STX (Seagate), WDC, SNDK, MU | Category: AI Infrastructure, Storage & Context Memory

Bottom Line

STX is best interpreted as the storage and context-memory arm of NVIDIA’s broader AI-factory platformization strategy. The underlying problem is real, the architecture aligns with external system research, and the component choices are consistent with NVIDIA’s objective of moving beyond GPU ownership into ownership of the data path that determines utilization, responsiveness, and cost per token.

The available disclosures do not yet support a near-term revenue step-function: adoption disclosures are preliminary, public benchmarks are vendor-generated rather than independent, and broad partner availability is not expected until 2H 2026. But the strategic signal is strong. STX indicates that NVIDIA is seeking to define the memory, storage, and data-movement hierarchy around agentic inference with the same degree of control it already exerts over the compute layer. That is the central investment takeaway.

Near-term financial relevance to NVIDIA is likely modest against a base of $62.3B in Data Center revenue for the quarter ended January 25, 2026 and $215.9B in total fiscal 2026 revenue. The larger issue is strategic duration. If AI clusters continue shifting from training-dominant architectures toward long-context, multi-turn, agentic inference, STX increases the probability that a larger share of non-GPU infrastructure spend remains inside NVIDIA’s platform boundary over the next system cycle.

1. What STX Is

NVIDIA STX is a modular reference architecture for AI-native data platforms — not a conventional storage product. The STX announcement describes it as the layer that bridges massive AI compute and data storage by centralizing intelligent data handling on top of Vera Rubin, BlueField-4, and Spectrum-X, spanning training, analytics, and real-time agentic inference. The initial rack-scale implementation is the CMX context memory storage platform, which extends GPU memory with a high-performance context layer designed for scalable inference. The central conclusion of the launch is that NVIDIA is formalizing storage and context memory as a core subsystem of the AI factory rather than as an external peripheral behind the GPU cluster.

The strategic significance is broader than the product label. STX inserts storage, KV-cache handling, vector processing, and enterprise data movement into the same co-designed envelope as Rubin compute, BlueField DPUs, Spectrum-X Ethernet, DOCA, Dynamo, and AI Enterprise. On March 16, 2026, NVIDIA presented BlueField-4 STX storage racks alongside Vera Rubin NVL72 GPU racks, Vera CPU racks, Groq 3 LPX inference racks, and Spectrum-6 Ethernet racks inside the Vera Rubin platform. That rack-level framing is a platform strategy signal: storage is being elevated to a top-level, NVIDIA-authored building block of the AI factory, carrying the same canonical status as compute and networking.

STX vs. CMX: Architecture and Instantiation

The distinction between STX and CMX is important and often conflated in coverage. STX is the umbrella reference architecture — the framework of protocols, silicon choices, software stack, and integration assumptions that define how storage and context memory fit into the AI factory. CMX is the first concrete instantiation of STX, purpose-built for the long-context inference use case. CMX extends GPU memory capacity with a pod-level shared context tier optimized for ephemeral KV cache — the short-lived, latency-sensitive key-value data generated during transformer inference that represents the dominant I/O bottleneck in agentic workflows.

STX is designed around a specific systems problem: conventional storage is optimized for durability, capacity, and data-management semantics, while long-context agentic inference is increasingly dominated by ephemeral, latency-sensitive KV cache and related vector operations. NVIDIA’s architecture offloads KV-cache and vector tasks to BlueField-4 and accelerates high-performance storage, enterprise AI data, and context memory. NVIDIA’s January 2026 Inference Context Memory Storage (ICMS) technical documentation describes this as a new G3.5 tier: Ethernet-attached flash that sits between local memory/storage and durable shared storage, with petabytes of shared capacity per GPU pod and enough bandwidth to prestage KV state back into G2 and G1 memory without stalling decode. In effect, STX is not primarily a better storage array; it is a new memory-and-data-path layer for AI inference and AI data infrastructure.
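The tiering idea can be sketched as a toy lookup-and-prestage model. This is an illustrative sketch only, not NVIDIA's implementation: the class and method names are hypothetical, the G-labels follow the text, and treating G3 as a local-storage tier sitting between G2 and G3.5 is an assumption.

```python
from dataclasses import dataclass, field

# Tier order, fastest first. The labels follow the text; the existence of a
# "G3" local-storage tier between G2 and G3.5 is an assumption.
TIERS = ["G1", "G2", "G3", "G3.5", "G4"]


@dataclass
class TieredKVStore:
    """Toy model: each tier maps a KV block id to its payload."""
    tiers: dict = field(default_factory=lambda: {name: {} for name in TIERS})

    def put(self, tier: str, block_id: str, payload: bytes) -> None:
        self.tiers[tier][block_id] = payload

    def locate(self, block_id: str):
        """Return the fastest tier currently holding the block, or None."""
        for name in TIERS:
            if block_id in self.tiers[name]:
                return name
        return None

    def prestage(self, block_id: str, target: str = "G1"):
        """Copy a block up to `target` ahead of decode; return the source tier."""
        source = self.locate(block_id)
        if source is None:
            return None  # miss in every tier: the KV state must be recomputed
        if TIERS.index(source) > TIERS.index(target):
            self.tiers[target][block_id] = self.tiers[source][block_id]
        return source


store = TieredKVStore()
store.put("G3.5", "session-42/block-0", b"kv-bytes")
source_tier = store.prestage("session-42/block-0")
fastest_tier = store.locate("session-42/block-0")
print(source_tier, fastest_tier)
```

The point of the sketch is the ordering: a block held only in the shared flash tier is copied upward before decode needs it, so the decode step finds it in the fastest tier rather than waiting on storage.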

Open Ecosystem, Closed Control Plane

The interoperability stance is material to the investment thesis. NVIDIA’s technical collateral describes use of NVMe, NVMe-oF, object/RDMA protocols, NIXL, DOCA, and plugin-based storage connectors, giving the ecosystem broad entry points. The fast path, however, remains centered on BlueField, Spectrum-X, Dynamo, and DOCA Memos. That architecture is open at the ecosystem boundary but closed at the control plane. The likely result is an ecosystem model in which storage vendors can participate on NVIDIA’s terms, but the critical orchestration logic and performance envelope are increasingly defined by NVIDIA silicon, networking, and software. This dynamic — open APIs, proprietary value capture — is the same model NVIDIA has deployed across CUDA, NVLink, and NVLink-C2C, and it is the correct frame for analyzing where economic value accretes.

2. Why the Problem Is Real

The technical premise behind STX and CMX is not a marketing construct. It is supported by a growing body of external academic and industry system research demonstrating that long-context agentic inference has become a memory-and-I/O problem at least as much as a FLOPS problem. Two papers in particular — the FAST 2025 Mooncake study and the 2026 DualPath study — provide empirical grounding for the KV cache bottleneck that STX is designed to address.

The Mooncake paper describes a KVCache-centric disaggregated serving architecture already operating at production scale across thousands of nodes and processing more than 100 billion tokens per day. In production at Kimi, Mooncake enabled 115% more requests on A800 clusters and 107% more on H800 clusters relative to baseline serving configurations. Controlled laboratory tests showed 59% to 498% gains in effective request capacity under latency constraints. The paper’s core insight is that the KV cache — not the GPU FLOPs — is the binding resource in high-concurrency long-context inference, and that disaggregating the cache from the GPU memory hierarchy dramatically improves system utilization.

The 2026 DualPath study independently reaches a consistent conclusion from a different angle, analyzing representative agentic traces characterized by 157 rounds of interaction, an average context length of 32,700 tokens, only 429 appended tokens per round, and a 98.7% KV-cache hit rate. Those statistics are significant: nearly all of the KV state needed in a typical agentic round was computed in earlier rounds and can be re-read rather than recomputed. DualPath reports up to 1.87x offline and 1.96x online throughput gains by better utilizing disaggregated storage bandwidth rather than increasing GPU count. This is the empirical case for external context storage as a cost-effective performance lever.
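The reported hit rate is roughly what the other trace statistics imply. As a back-of-envelope check (an illustrative simplification, not DualPath's methodology), assume that in a typical round every token of the average context except the newly appended ones is served from cached KV state:

```python
# Back-of-envelope check of the DualPath trace statistics quoted above.
# Assumption (not from the paper): in a typical round, every token of the
# average context except the newly appended ones is served from cached KV.

AVG_CONTEXT_TOKENS = 32_700   # average context length per round (from the text)
APPENDED_TOKENS = 429         # tokens appended per round (from the text)


def implied_hit_rate(context_tokens: int, appended_tokens: int) -> float:
    """Fraction of context tokens whose KV state is reused, not recomputed."""
    if context_tokens <= 0 or not 0 <= appended_tokens <= context_tokens:
        raise ValueError("need context_tokens > 0 and 0 <= appended_tokens <= context_tokens")
    return 1.0 - appended_tokens / context_tokens


hit_rate = implied_hit_rate(AVG_CONTEXT_TOKENS, APPENDED_TOKENS)
print(f"implied KV-cache hit rate: {hit_rate:.1%}")
```

Under that assumption the implied figure is approximately 98.7%, in line with the reported hit rate, which suggests the quoted trace statistics are internally consistent.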

Evidence | Source | Key Finding
KVCache-centric disaggregated serving at Kimi | Mooncake, FAST 2025 | 115% more requests on A800, 107% more on H800; 59%–498% throughput gains in controlled tests; >100B tokens/day in production across thousands of nodes
Agentic trace profiling and prefill-decode disaggregation | DualPath, 2026 | Representative traces: 157 rounds, 32.7k avg context, 429 appended tokens/round, 98.7% KV-cache hit rate; up to 1.87x offline and 1.96x online throughput via storage-bandwidth optimization
KV offloading in production serving stack | vLLM documentation; LMCache integration | vLLM production stack supports KV offloading to CPU and disk via LMCache; confirms software ecosystem already treating external KV storage as a first-class resource
Dynamo inference orchestration with external KV tiers | NVIDIA Dynamo documentation, 2026 | NVIDIA’s own inference orchestration layer positions KV offload to CPU RAM, SSDs, and networked storage as mainstream; integrates with vLLM and LMCache; uses NIXL and plugin-based connectors for third-party storage
Partner validation: VAST and WEKA KV performance | NVIDIA Dynamo partner blog; vendor-generated benchmarks | VAST reported 35 GB/s to a single H100; WEKA reported up to 270 GB/s across 8 H100s; VAST separately reported time-to-first-token falling from >11 seconds to 1.5 seconds on a 130,858-token Qwen3-32B workload via persistent KV reuse

This trend is also visible in software adoption. The vLLM production stack uses LMCache to move KV data from GPU memory to CPU or disk, and NVIDIA Dynamo positions KV offload as a mainstream capability in its inference orchestration layer. The relevant implication for investment analysis is that STX does not originate the KV-offload concept — it is better understood as NVIDIA’s attempt to industrialize it at rack scale with higher bandwidth, better power efficiency, and tighter integration into the broader AI-factory stack. The market is already moving in this direction; STX is NVIDIA’s bid to own the canonical implementation.

3. Architecture and Performance

The public performance claims for STX and CMX are directionally strong but not yet sufficient for precision underwriting. The March 16 STX press release claims up to 5x token throughput, up to 4x energy efficiency, and 2x faster enterprise AI data ingestion relative to traditional approaches. The CMX page separately claims up to 5x higher throughput and up to 5x better power efficiency versus general-purpose storage approaches. NVIDIA’s January ICMS technical documentation also cited 5x higher sustained tokens per second and 5x power efficiency. All three sets of materials support the same broad conclusion: general-purpose storage is an increasingly poor fit for ephemeral KV-cache-heavy workloads. The precise baselines, benchmark boundaries, and workload mixes are not fully disclosed in the public materials, so the 4x-versus-5x efficiency language should be treated as evidence of directional advantage rather than as a model input.

The economic logic is clearer at the component level. STX explicitly replaces CPU-centric storage services with DPU- and fabric-centric data services. The STX press release claims 4x higher energy efficiency than traditional CPU architectures for high-performance storage, while the 2025 AI Data Platform launch claimed BlueField DPUs can deliver up to 1.6x higher performance than CPU-based storage with up to 50% lower power. That framing positions BlueField not as an incremental add-on but as a structural replacement of a significant portion of the CPU and NIC budget in storage-serving nodes.

Component | Role in STX | Key Spec
BlueField-4 DPU | Primary data-path processor; handles KV-cache offload, NVMe-oF, DOCA Memos, and fabric-attached storage services; replaces CPU in storage hot path | 800 Gb/s connectivity; 6x the compute of prior generation; 64-core Grace-based design per ICMS documentation; pairs with ConnectX-9 SuperNIC in STX rack implementation
Vera CPU | Orchestration and control-plane compute within the STX rack; supports DOCA and Dynamo integration | 88 Arm cores; up to 1.2 TB/s memory bandwidth; up to 1.5 TB LPDDR5X capacity per Vera CPU
ConnectX-9 SuperNIC | Host-side fabric connectivity; enables RDMA and NVMe-oF at line rate between compute racks and STX storage racks | Integrated into Rubin NVL72; supports NVIDIA Spectrum-X AI Ethernet fabric at full bandwidth
Spectrum-X Ethernet Fabric | High-performance AI Ethernet interconnect linking compute, inference, and STX storage racks; provides deterministic low-latency fabric for KV cache reads and writes | Spectrum-X claims 1.6x AI network performance over off-the-shelf Ethernet; prior benchmarks showed 20%–48% read-bandwidth gains and 9%–41% write-bandwidth gains in storage-fabric tests; Spectrum-6 presented at GTC 2026
Rubin NVL72 | Primary GPU compute cluster; the demand-side consumer of KV cache managed by CMX/STX; STX storage racks are co-specified alongside NVL72 racks in the Vera Rubin platform | 72 GPUs; 36 Vera CPUs; ConnectX-9 SuperNICs; BlueField-4 DPUs integrated into cluster topology
DOCA / Dynamo / NIXL Software Stack | Software control plane for STX; DOCA manages DPU data services; Dynamo provides inference orchestration and KV-offload scheduling; NIXL is the data-movement abstraction layer enabling plugin-based third-party storage connectors | Integrates with vLLM and LMCache; supports NVMe, NVMe-oF, object/RDMA protocols; plugin architecture enables vendor participation at ecosystem boundary while preserving NVIDIA orchestration control

The rack-level framing materially reinforces the strategic interpretation. In the Vera Rubin platform launch, STX storage racks were presented as a canonical rack type alongside compute, CPU, inference-accelerator, and Ethernet racks. That framing implies a future cluster topology in which compute, networking, inference acceleration, orchestration CPU, and context memory are all specified as interoperating NVIDIA-defined rack modules. Platform codification of that kind typically shifts bargaining power toward the architecture owner, because the value migrates from standalone components to the integration logic that determines system-level utilization and cost per token. This is the mechanism by which NVIDIA has expanded its share of wallet in every prior generation.

4. Relationship to NVIDIA’s Storage Stack

The cleanest way to interpret STX is as the umbrella architecture linking two previously separate NVIDIA storage initiatives into a single coherent platform narrative. Understanding the timeline and the relationships between layers is essential for correctly mapping financial exposure across the storage ecosystem.

On March 18, 2025, NVIDIA introduced AI Data Platform as a customizable enterprise-storage reference design built around Blackwell GPUs, BlueField-3 DPUs, Spectrum-X, and AI Enterprise. The 2025 design center was oriented toward enterprise data preparation workflows: continuously indexing multimodal data, making unstructured data AI-ready, and supporting multimodal agentic RAG, deep research agents, centralized cache for distributed inference, and semantic search. Partners — including what was then Pure Storage (subsequently rebranded to Everpure in February 2026) — were planning to offer solutions beginning in March 2025.

On March 16, 2026, STX expanded the design center to Vera Rubin and BlueField-4 and explicitly encompassed high-performance storage, enterprise AI data, and context memory within a single reference architecture. CMX then emerged as the concrete STX instantiation for long-context inference, extending GPU memory with a shared pod-level context tier optimized for ephemeral KV cache. The architecture stack therefore resolves into three distinct but interrelated layers: AI Data Platform for enterprise data preparation, CMX for context-memory inference, and STX as the reference layer that connects both across the full AI lifecycle.

Layer | Function | Timeline
AI Data Platform | Enterprise data preparation reference design; continuously indexes multimodal data; supports agentic RAG, deep research agents, centralized distributed-inference cache, and semantic search; built on Blackwell GPUs, BlueField-3 DPUs, Spectrum-X, and AI Enterprise | Introduced March 18, 2025; partner solutions available from March 2025; Blackwell-generation design center
CMX (Context Memory eXtension) | Initial instantiation of STX; extends GPU memory with a pod-level shared context tier; purpose-built for ephemeral KV-cache storage in long-context agentic inference; G3.5 Ethernet-attached flash tier between local and durable storage layers | Announced March 16, 2026 at GTC 2026; partner availability in 2H 2026; Vera Rubin–generation design center
STX (Storage and Transformation eXtension) | Umbrella reference architecture linking AI Data Platform and CMX across the full AI lifecycle; formalizes storage, KV-cache handling, vector processing, and enterprise data movement as a top-level AI factory subsystem alongside compute and networking | Announced March 16, 2026 at GTC 2026; umbrella layer encompassing both 2025 AI Data Platform and 2026 CMX; partner availability in 2H 2026

The relative maturity differential between layers is strategically important. AI Data Platform partners were building and shipping solutions as early as March 2025. STX-based and CMX-based platforms are not scheduled for broad partner availability until 2H 2026. That gap confirms evolution rather than replacement: STX is the Rubin-era extension of the AI Data Platform thesis, adding a more explicit context-memory and inference-storage layer to a storage strategy that initially emphasized enterprise RAG, query agents, and AI-ready data preparation.

The implication for ecosystem positioning is that vendors who built for AI Data Platform are being given a path to evolve into STX without starting over. Enterprise incumbents — Dell, HPE, NetApp, Everpure — who made the AI Data Platform bet in 2025 are the natural continuation candidates for STX in 2026 and 2027. The question is whether their product architectures can adapt to the BlueField-4 and Vera Rubin design center quickly enough to remain competitive with AI-native vendors who may have less legacy to carry.

5. Ecosystem and Partner Landscape

The partner mix disclosed at the STX launch is broad and strategically revealing. The STX page lists Cloudian, DDN, Dell, Everpure, Hitachi Vantara, HPE, IBM, MinIO, NetApp, Nutanix, VAST Data, and WEKA, plus AIC, QCT, and Supermicro on the system-building side. The broader AI Storage ecosystem page adds Hammerspace. The 2025 AI Data Platform announcement included Pure Storage, which rebranded to Everpure in February 2026 and began trading under the new name in March 2026. The roster spans enterprise incumbents, object-storage vendors, AI-native software-defined storage vendors, and system builders. That breadth confirms that NVIDIA is not narrowing the ecosystem to a few preferred storage stacks; it is creating a reference substrate that many vendors can implement, provided they accept NVIDIA’s data-path assumptions.

The technical beneficiaries inside storage may not be distributed evenly. STX is optimized for disaggregated, high-bandwidth, software-defined data paths rather than for traditional dual-controller enterprise arrays. That architecture is structurally favorable for AI-native vendors such as DDN, VAST, WEKA, MinIO, and Cloudian in frontier-lab and AI-cloud contexts, where flexibility, scale-out performance, and software-defined data management are the primary purchase criteria. Enterprise incumbents such as Dell, HPE, IBM, NetApp, Nutanix, and Everpure retain significant customer access, installed base, and procurement leverage in enterprise AI Data Platform deployments, but architecture control continues shifting toward NVIDIA.

Sandisk (SNDK) is a notable case that merits separate treatment. Sandisk is not named as a direct STX ecosystem partner in NVIDIA’s March 16 disclosures, but the company has three distinct vectors of exposure to the architectural shift STX represents. First, Sandisk’s 256 TB UltraQLC SSD, built on BiCS8 QLC NAND, is the highest-capacity enterprise SSD on the market and targets the durable/capacity storage tier (G4 in STX’s nomenclature) for training data, checkpoints, and RAG persistence. Second, and potentially more important, Sandisk and SK hynix jointly announced on February 25 the High Bandwidth Flash (HBF) standard through OCP — a new memory tier designed to sit between HBM and conventional SSDs, specifically targeting AI inference KV-cache workloads. HBF is both complementary to and potentially competitive with NVIDIA’s ICMS/CMX approach: where STX uses NVMe SSDs behind BlueField-4, HBF proposes a closer-coupled NAND tier with higher bandwidth and lower latency. That standard is still early, but it represents incremental NAND content demand that is not yet reflected in consensus estimates. Third, Sandisk’s SSDs can still be used inside STX implementations built by named partners — STX is a reference architecture, not a closed bill of materials. The net read-through for Sandisk is positive: STX structurally expands flash demand per AI rack, UltraQLC leads in the capacity tier, and HBF positions Sandisk at the specification table for an entirely new inference memory category.

Category | Partners | Positioning
AI-native storage vendors | DDN, VAST Data, WEKA, MinIO, Cloudian | Structurally favored in frontier-lab and AI-cloud STX deployments; architecture is optimized for disaggregated, high-bandwidth, software-defined data paths that match these vendors’ native capabilities; VAST and WEKA have already published partner benchmark results via NVIDIA Dynamo blog demonstrating production-grade KV offload performance
Enterprise storage incumbents | Dell, HPE, IBM, NetApp, Nutanix, Everpure (formerly Pure Storage) | Retain customer access, installed base, and procurement leverage in enterprise AI Data Platform deployments; architecture control is shifting toward NVIDIA; STX compatibility maintained but full performance envelope likely requires deeper DPU and software-defined architecture investment; Everpure (PSTG) is notable as an incumbent with a software-defined heritage and an active product refresh cycle
System builders / ODMs | AIC, QCT, Supermicro | Manufacturing and integration partners for STX reference-design hardware; positioned to build the STX storage rack form factor alongside existing compute and networking rack production; incremental revenue exposure tied to BOM content per STX rack deployed
Early adopters / cloud customers | CoreWeave, Crusoe, IREN, Lambda, Mistral AI, Nebius, OCI, Vultr | Named in the March 16 STX press release as organizations planning to adopt STX for context memory storage; skews toward AI-focused cloud providers and frontier-model labs rather than enterprise IT; validates demand signal but disclosures represent planning commitments, not yet production deployments at scale
Additional ecosystem | Hammerspace, Hitachi Vantara | Hammerspace listed in broader AI Storage ecosystem page; Hitachi Vantara listed on STX page as storage partner; both extend the reference-architecture coverage to hybrid and enterprise data management use cases

Demand signals are encouraging but still preliminary. As of March 16, 2026, the STX press release named CoreWeave, Crusoe, IREN, Lambda, Mistral AI, Nebius, OCI, and Vultr as organizations planning to adopt STX for context memory storage, and stated that STX-based platforms will be available from partners in 2H 2026. The wording is important. These disclosures indicate ecosystem momentum and architectural mindshare, but they do not yet constitute proof of production-scale deployment, material revenue contribution, or long-term standardization. The likely next inflection point for financial relevance is partner product launches in 2H 2026 followed by hyperscaler adoption disclosures in early 2027 earnings calls.

6. Risks and Execution Considerations

Execution Risk

Execution risk is material. The STX launch materials are explicitly forward-looking and state that many of the described products and features remain in various stages of development and will be offered on a when-and-if-available basis. There is also some inconsistency in public hardware descriptions across the launch cycle: NVIDIA’s January 2026 ICMS technical documentation described BlueField-4 powering context memory with 800 Gb/s connectivity, a 64-core Grace CPU, and high-bandwidth LPDDR memory, whereas the March 16 STX press release and Vera Rubin platform launch described a storage-optimized BlueField-4 that combines a Vera CPU with a ConnectX-9 SuperNIC. That discrepancy may reflect the difference between an earlier ICMS platform description and the finalized STX rack implementation, or simply evolving launch collateral between January and March 2026. Either interpretation reduces pre-GA precision and argues for underwriting conservatism until production hardware ships and independent benchmarks are available.

Substitution Risk

Substitution risk also remains real. Open-source software already supports KV offloading at a meaningful performance level, and external system research demonstrates that a substantial portion of the throughput gain can be achieved through smarter use of disaggregated storage bandwidth rather than through a single proprietary hardware implementation. NVIDIA’s own Dynamo documentation cites partner-generated benchmark results from VAST and WEKA showing 35 GB/s to a single H100 and up to 270 GB/s across 8 H100s, respectively, while VAST separately reported time-to-first-token falling from more than 11 seconds to 1.5 seconds on a 130,858-token Qwen3-32B workload through persistent KV reuse. These results are vendor-generated and should not be treated as neutral benchmarks, but they do show that the broader market is already experimenting successfully with high-performance external KV tiers using existing software-defined storage platforms. The underwriting question is therefore not whether external context storage works; it is whether NVIDIA’s vertically integrated implementation captures enough performance, efficiency, and ecosystem gravity to become the default implementation at scale or whether a heterogeneous software-defined market persists in parallel.
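The vendor figures can be put on a common footing with some rough arithmetic. The sketch below is illustrative only: the bandwidth, latency, and token counts come from the text, while the Qwen3-32B attention shape (64 layers, 8 KV heads, head dimension 128) and 16-bit KV precision are assumptions that should be checked against the model's published configuration.

```python
# Normalizing the vendor-reported figures quoted above. Inputs marked
# "from the text" are vendor-generated; the model shape is an assumption.

VAST_GBPS_SINGLE_GPU = 35.0   # GB/s to a single H100 (from the text)
WEKA_GBPS_TOTAL = 270.0       # GB/s across 8 H100s (from the text)
WEKA_GPUS = 8
TTFT_BEFORE_S = 11.0          # text says "more than 11 seconds", so a floor
TTFT_AFTER_S = 1.5            # seconds (from the text)
PROMPT_TOKENS = 130_858       # tokens in the workload (from the text)

# Assumed Qwen3-32B attention shape and 16-bit KV precision (not stated in
# the text; verify against the model's published configuration).
LAYERS, KV_HEADS, HEAD_DIM, BYTES_PER_VALUE = 64, 8, 128, 2


def kv_bytes_per_token(layers: int, kv_heads: int, head_dim: int, bytes_per_value: int) -> int:
    """Keys plus values for one token, summed across all layers."""
    return 2 * layers * kv_heads * head_dim * bytes_per_value


weka_gbps_per_gpu = WEKA_GBPS_TOTAL / WEKA_GPUS
ttft_speedup_floor = TTFT_BEFORE_S / TTFT_AFTER_S
kv_total_bytes = PROMPT_TOKENS * kv_bytes_per_token(LAYERS, KV_HEADS, HEAD_DIM, BYTES_PER_VALUE)
load_seconds = kv_total_bytes / (VAST_GBPS_SINGLE_GPU * 1e9)

print(f"WEKA per-GPU bandwidth: {weka_gbps_per_gpu:.2f} GB/s")
print(f"TTFT speedup: at least {ttft_speedup_floor:.1f}x")
print(f"KV footprint: {kv_total_bytes / 1e9:.1f} GB; load time at 35 GB/s: {load_seconds:.2f} s")
```

Under these assumptions the two vendors' per-GPU bandwidths are of similar magnitude, and the time needed to stream the prompt's KV state at the quoted bandwidth is of the same order as the reported 1.5-second time-to-first-token. A different KV precision or model shape would scale the footprint proportionally.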

Near-Term Financial Relevance

Near-term financial relevance to NVIDIA consolidated results is likely modest. Against a Data Center revenue base of $62.3B in the quarter ended January 25, 2026 and $215.9B in total fiscal 2026 revenue, a newly announced reference architecture with partner availability not expected until 2H 2026 is unlikely to alter consolidated results in the near term. The incremental revenue in the first deployments will likely accrue to BlueField-4 ASP uplift per cluster, Spectrum-X attach rates, and AI Enterprise software licensing rather than to any standalone “STX revenue” line. The framework for sizing the opportunity is therefore not a new product category contribution but rather an expanded addressable surface for NVIDIA’s existing platform economics — more silicon and software per AI factory rack.

Competitive Response

The competitive response from storage vendors is an additional variable. VAST Data, DDN, and WEKA are already deeply embedded in frontier AI clusters and are capable of adapting their software-defined architectures to BlueField-4 without requiring NVIDIA’s STX reference design. If those vendors deliver comparable KV-offload performance outside the STX certification framework, it dilutes NVIDIA’s ability to capture ecosystem gravity through the reference-architecture model. The counterargument is that NVIDIA’s control over Dynamo, NIXL, and DOCA Memos — the software layers that manage KV scheduling and data movement — gives it leverage that no storage vendor can fully replicate through hardware alone. The strategic contest is therefore between NVIDIA’s software control plane and the storage vendors’ performance and flexibility advantages at the infrastructure layer.

7. Investment Read-Throughs

The following table synthesizes investment implications across the relevant equity universe. Directionality is based on the architectural and ecosystem analysis in this note; confidence levels reflect the current disclosure state and execution timelines.

Sector / Name | Direction | Rationale
NVDA | Positive | STX expands NVIDIA’s platform boundary beyond GPU compute into the data path: storage, context memory, KV orchestration, and fabric management. Incremental attach of BlueField-4, Spectrum-X, ConnectX-9, and AI Enterprise per cluster deepens per-rack economics. The strategic signal is most important: NVIDIA is pursuing the same degree of platform control over the memory and storage hierarchy that it has already achieved over compute. Near-term revenue impact is modest against the $62.3B/quarter Data Center base, but the strategic duration is multi-year.
DDN, VAST Data, WEKA | Positive | AI-native storage vendors are structurally favored in the frontier and hyperscale context where STX is designed to deploy. VAST and WEKA have already published partner benchmark results through NVIDIA Dynamo showing production-grade KV offload performance, and all three vendors are named STX ecosystem partners. The risk is that NVIDIA’s vertically integrated STX reference design eventually crowds out independent software-defined implementations at the margin; the near-term read is positive given ecosystem tailwind and frontier-customer demand signal.
Dell (DELL), HPE | Mixed | Enterprise incumbents retain customer access and installed base in AI Data Platform deployments but cede architecture control to NVIDIA in the STX design center. Dell and HPE are named STX partners and retain integration and procurement relevance in enterprise accounts, but the BOM structure increasingly favors NVIDIA-specified components. Net impact depends on whether Dell and HPE capture the system-integration margin on STX racks or whether AI-native vendors and ODMs undercut them on price.
NetApp (NTAP), Everpure (PSTG) | Mixed | NetApp and Everpure are listed STX partners and retain enterprise customer access. Everpure (formerly Pure Storage) is notable as an incumbent with a software-defined architecture and an active product refresh; its positioning on STX will be watched more closely than that of traditional dual-controller vendors. Architecture control, however, is shifting toward NVIDIA. The read-through depends on how quickly each company adapts its storage OS and DPU integration to the BlueField-4 design center.
Seagate (ticker STX, unrelated to NVIDIA STX), Western Digital (WDC) | Positive | NVIDIA’s ICMS documentation describes a G3.5 Ethernet-attached flash tier with petabytes of shared capacity per GPU pod. At that scale, high-density NAND flash is the primary storage medium for the context tier, so the most direct beneficiaries are NAND suppliers. Seagate and Western Digital are primarily HDD vendors (Western Digital separated its flash business as Sandisk in 2025), so any read-through to them would more plausibly run through the durable capacity tier (G4) behind CMX than through the flash tier itself. The magnitude depends on cluster count and adoption timeline and is not yet quantifiable from public disclosures.
Sandisk (SNDK) | Positive | Three vectors of STX exposure: (1) the 256 TB UltraQLC SSD is best-in-class for the capacity storage tier (G4) that remains essential for training data, checkpoints, and RAG; (2) the High Bandwidth Flash (HBF) standard, jointly announced with SK hynix on Feb 25, creates an entirely new NAND-based memory tier between HBM and SSDs targeting inference KV-cache workloads, which is incremental demand not yet in estimates; (3) Sandisk NAND can be used inside STX implementations built by named ecosystem partners even though Sandisk is not directly listed. The omission from NVIDIA’s named STX partner list is a near-term gap, but the structural demand tailwind from flash entering the active inference memory hierarchy is positive for the largest pure-play NAND supplier. HBF standardization progress and hyperscaler UltraQLC qualification are the key catalysts to watch.
Micron (MU) | Positive | CMX racks pair the flash-based context tier with Vera CPUs, each of which supports up to 1.5 TB of LPDDR5X. As CMX racks proliferate alongside Rubin NVL72 compute racks, incremental LPDDR5X content per cluster grows substantially. Micron is a leading supplier of LPDDR5X and HBM3E for AI infrastructure. The structural demand driver from STX/CMX is additive to the already-strong AI memory cycle.
Optical / Networking: Coherent (COHR), Lumentum (LITE), Arista (ANET) | Positive | STX storage racks are specified to run on Spectrum-X AI Ethernet fabric, and the CMX design center requires high-bandwidth, low-latency fabric connectivity between compute and storage racks. That fabric requirement structurally supports demand for optical transceivers (COHR, LITE) and AI Ethernet switching (ANET). The Spectrum-X 1.6x AI-fabric performance claim is relative to commodity Ethernet; building out STX cluster topologies implies incremental optical and switching content per rack pair.
BlueField / DPU Ecosystem (NVDA, downstream integrators) | Positive | BlueField-4 is the central silicon in STX storage racks, combining 800 Gb/s connectivity with DPU compute and ConnectX-9 SuperNIC. Every STX storage rack deployed is an incremental BlueField-4 attach into an AI cluster, generating ASP uplift relative to a CPU-only storage node. As STX proliferates in Rubin-generation clusters, BlueField attach rates should increase meaningfully. This is an underappreciated vector of NVDA networking-segment growth beyond the GPU compute narrative.

The read-through framework above is subject to revision as production hardware ships, independent benchmarks become available, and hyperscaler purchasing decisions in 2H 2026 and 2027 clarify adoption velocity. The most important near-term signposts are: (1) partner product launches from DDN, VAST, WEKA, Dell, and HPE in 2H 2026 confirming STX certification; (2) early hyperscaler and cloud-builder commentary on CMX deployments in Q3 and Q4 2026 earnings calls; (3) NVIDIA BlueField attach-rate disclosures in Data Center segment reporting; and (4) any independent academic or operator benchmarks comparing STX/CMX performance to software-defined KV offload alternatives.


Data sources may include: Bloomberg, FactSet, S&P Capital IQ, company filings, earnings call transcripts, expert network interviews, SEC EDGAR.

Sources cited: NVIDIA STX official materials, March 16, 2026; NVIDIA GTC 2026 keynote; NVIDIA CMX and ICMS technical documentation; Mooncake (FAST 2025); DualPath (2026); vLLM documentation; company filings.
