In HPC systems and hyper-scale data centers, the adoption of high-performance NVMe SSDs and high-speed networks has shifted storage bottlenecks to the network stack. Under high-concurrency workloads, frequent interrupt processing exhausts CPU resources while protocol-level control–data dependencies in the NVMe over TCP write path introduce additional serialization penalties. Existing optimizations either require specialized hardware, dedicate CPU cores to user-space polling, or apply semantically blind batching that delays time-sensitive control messages. We present SENS, a Semantic-aware NVMe over TCP Scheduler embedded within the NVMe over TCP driver of the Linux kernel. SENS combines two mechanisms: (1) PDU vectorization, which aggregates discrete Protocol Data Units into memory vectors before network transmission, amortizing per-I/O system call overhead and reducing soft-interrupt frequency; and (2) instruction-aware dispatch, which detects control PDUs such as R2T and triggers an early flush of the aggregation window, mitigating the serialization penalty on the write path. A prototype evaluation with physical NVMe SSDs and 100 GbE networks shows that SENS saturates the SSD throughput ceiling using 4–5 CPU cores, halving the host-side core budget compared to the native TCP driver. With a RAMDisk backend that removes storage-media constraints, SENS sustains up to 2.5× higher concurrent IOPS. These results show that exposing storage-protocol semantics to the batching layer improves the scalability of NVMe over TCP without additional hardware.
Qiao et al. studied this question.
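The interplay of the two mechanisms named in the abstract, PDU vectorization and instruction-aware dispatch, can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the SENS kernel implementation: the names `Pdu`, `SemanticBatcher`, `send_vector`, and `max_batch` are hypothetical, the window is modeled as a simple count-based limit, and one call to `send_vector` stands in for one vectored network send.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Assumption: only R2T is treated as a time-sensitive control PDU here;
# the abstract names R2T as one example of such a PDU.
CONTROL_KINDS = {"R2T"}


@dataclass
class Pdu:
    kind: str       # illustrative labels, e.g. "CMD", "DATA", "R2T"
    payload: bytes


@dataclass
class SemanticBatcher:
    # One call to send_vector models one vectored send of many PDUs.
    send_vector: Callable[[List[bytes]], None]
    max_batch: int = 8  # hypothetical aggregation window size
    pending: List[Pdu] = field(default_factory=list)

    def submit(self, pdu: Pdu) -> None:
        """Queue a PDU; flush early if it is a control PDU or the window is full."""
        self.pending.append(pdu)
        if pdu.kind in CONTROL_KINDS or len(self.pending) >= self.max_batch:
            self.flush()

    def flush(self) -> None:
        """Hand all queued payloads to the sink as a single vector."""
        if not self.pending:
            return
        self.send_vector([p.payload for p in self.pending])
        self.pending.clear()


# Usage: five PDUs result in two vectored sends instead of five single sends.
sent: List[List[bytes]] = []
batcher = SemanticBatcher(send_vector=sent.append, max_batch=4)
for kind in ["CMD", "CMD", "R2T", "DATA", "DATA"]:
    batcher.submit(Pdu(kind, kind.encode()))
batcher.flush()  # drain whatever is still queued
```

In this model the R2T PDU leaves in the first vector together with the two commands queued before it, rather than waiting for the window to fill. That is the behavior the abstract contrasts with semantically blind batching.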