Data Center Bridging (DCB)
Data Center Bridging (DCB) is a suite of Institute
of Electrical and Electronics Engineers (IEEE) standards that enable converged fabrics
in the data center, where storage, data networking, cluster IPC, and management traffic
all share the same Ethernet network infrastructure. DCB underlies the ability of the Windows Server operating system to provide a defined Quality of Service (QoS) for Windows Server features such as SMB Direct. DCB provides hardware-based
bandwidth allocation to a specific type of traffic and enhances Ethernet transport
reliability with the use of priority-based flow control. Hardware-based bandwidth
allocation is essential if traffic bypasses the operating system and is offloaded
to a converged network adapter, which might support Internet Small Computer System
Interface (iSCSI), Remote Direct Memory Access (RDMA) over Converged Ethernet, or
Fibre Channel over Ethernet (FCoE). Priority-based flow control is essential if
the upper-layer protocol, such as Fibre Channel, assumes a lossless underlying transport.
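To make the priority-based flow control mechanism concrete, the C sketch below lays out an IEEE 802.1Qbb PFC pause frame. It is illustrative only; the field names are mine, while the constants (MAC Control EtherType 0x8808, PFC opcode 0x0101) follow the published standard.

#include <stdint.h>

/* Illustrative layout of an IEEE 802.1Qbb Priority-based Flow Control
 * (PFC) pause frame. Field names are hypothetical; all multi-byte
 * fields are big-endian on the wire, and the frame is padded to the
 * 64-byte Ethernet minimum. */
#pragma pack(push, 1)
typedef struct {
    uint8_t  dst_mac[6];        /* 01-80-C2-00-00-01 (MAC Control multicast) */
    uint8_t  src_mac[6];
    uint16_t ethertype;         /* 0x8808 = MAC Control */
    uint16_t opcode;            /* 0x0101 = PFC (0x0001 is classic PAUSE) */
    uint16_t priority_vector;   /* bit n set => pause traffic class n */
    uint16_t pause_quanta[8];   /* per-priority pause time, in 512-bit times */
} pfc_frame_t;
#pragma pack(pop)

Unlike classic 802.3x PAUSE, which stops the whole link, the per-priority vector lets the receiver pause only the lossless traffic class (for example, the one carrying FCoE) while other classes keep flowing.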
Fibre Channel Interface ANSI developed the Fibre Channel (FC) standard in 1988 as a practical
and expandable method of using fiber-optic cabling to transfer data among desktop
computers, workstations, mainframes, supercomputers, storage devices, and display
devices. ANSI later changed the standard to support copper cabling; today, some
kinds of FC use two-pair copper wire to connect the outer four pins of a nine-pin connector.
Fibre Channel over Ethernet Interface Fibre Channel over Ethernet (FCoE) is
a computer network technology that encapsulates Fibre Channel frames over Ethernet
networks. This allows Fibre Channel to use Ethernet networks while preserving the
Fibre Channel protocol. The specification was part of the International Committee
for Information Technology Standards T11 FC-BB-5 standard published in 2009. FCoE
maps Fibre Channel directly over Ethernet while being independent of the Ethernet
forwarding scheme. The FCoE protocol specification replaces the FC0 and FC1 layers
of the Fibre Channel stack with Ethernet. By retaining the native Fibre Channel
constructs, FCoE is meant to integrate with existing Fibre Channel networks and
management software. FCoE operates directly above Ethernet in the network protocol
stack, in contrast to iSCSI which runs on top of TCP and IP. As a consequence, FCoE
is not routable at the IP layer and will not work across routed IP networks. Because
classical Ethernet, unlike Fibre Channel, had no priority-based flow control, FCoE
required enhancements to the Ethernet standard to support a priority-based flow
control mechanism (to reduce frame loss from congestion). The IEEE added these
enhancements through its Data Center Bridging Task Group.
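As a rough illustration of the encapsulation described above, the following C structs sketch the FCoE frame layout from FC-BB-5 (EtherType 0x8906). Field names are mine, and the bit-level version/reserved fields are collapsed into byte arrays for readability.

#include <stdint.h>

/* Sketch of an FCoE frame (T11 FC-BB-5). Names are hypothetical;
 * sizes follow the specification: a 4-bit version plus 100 reserved
 * bits (13 bytes total), a 1-byte SOF delimiter, the encapsulated FC
 * frame, then a 1-byte EOF and 3 reserved bytes before the FCS. */
#pragma pack(push, 1)
typedef struct {
    uint8_t  dst_mac[6];
    uint8_t  src_mac[6];
    uint16_t ethertype;        /* 0x8906 = FCoE */
    uint8_t  ver_reserved[13]; /* 4-bit version + 100 reserved bits */
    uint8_t  sof;              /* start-of-frame delimiter */
    /* The unmodified Fibre Channel frame follows: 24-byte FC header,
     * up to 2112 bytes of payload, and the 4-byte FC CRC. */
} fcoe_header_t;

typedef struct {
    uint8_t eof;               /* end-of-frame delimiter */
    uint8_t reserved[3];
    /* 4-byte Ethernet FCS follows */
} fcoe_trailer_t;
#pragma pack(pop)

Because the FC frame travels inside the Ethernet payload unchanged, the FC2 and higher layers (and the management software built on them) see the same constructs they would on a native Fibre Channel link.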
Hardware Quality of Service Hardware Quality of Service (HW QoS) is a new feature of Windows Server 2022. It is designed to address the issues of the previous software-based implementation, specifically:
The software implementation had substantial processor overhead and was limited in scheduling by the granularity of software timers.
Software cannot account for packets that are not delivered through the software path, for example with SR-IOV, where packets move directly to and from the VM, and with RDMA.
To avoid these problems, the reservation system must be implemented entirely in hardware. For more information, see "Link is TBD after RTM"
Hardware Timestamping The Hardware Timestamping API in Windows Server 2022 enables the use of hardware packet timestamps by an application implementing PTP version 2 in two-step mode, as defined by IEEE 1588-2008. The API provides the ability to discover the network adapter's timestamping capabilities, to associate the network adapter's hardware clock with PTP v2 traffic running over UDP, and to establish a relationship between the network adapter's clock and the system clock. For more information, see "Link is TBD after RTM"
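A minimal sketch of capability discovery and clock correlation using the IP Helper functions GetInterfaceActiveTimestampCapabilities and CaptureInterfaceHardwareCrossTimestamp. The interface index (11) is a placeholder, and the structure field names used here are assumptions; consult the Windows SDK headers for the authoritative layout.

#include <winsock2.h>
#include <iphlpapi.h>
#include <stdio.h>
#pragma comment(lib, "iphlpapi.lib")

int main(void)
{
    NET_LUID luid;
    /* Interface index 11 is a placeholder; use your adapter's index. */
    if (ConvertInterfaceIndexToLuid(11, &luid) != NO_ERROR)
        return 1;

    INTERFACE_TIMESTAMP_CAPABILITIES caps;
    if (GetInterfaceActiveTimestampCapabilities(&luid, &caps) == NO_ERROR) {
        /* Field name assumed from the published structure. */
        printf("HW clock frequency: %llu Hz\n",
               (unsigned long long)caps.HardwareClockFrequencyHz);
    }

    INTERFACE_HARDWARE_CROSSTIMESTAMP xts;
    if (CaptureInterfaceHardwareCrossTimestamp(&luid, &xts) == NO_ERROR) {
        /* The hardware clock reading is bracketed by two system clock
         * readings, which lets software relate the two clocks. */
        printf("sys1=%llu hw=%llu sys2=%llu\n",
               (unsigned long long)xts.SystemTimestamp1,
               (unsigned long long)xts.HardwareClockTimestamp,
               (unsigned long long)xts.SystemTimestamp2);
    }
    return 0;
}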
Internet Protocol Security (IPsec) Internet Protocol Security (IPsec) is
a protocol suite for securing Internet Protocol (IP) communications by authenticating
and encrypting each IP packet of a communication session. IPsec includes protocols
for establishing mutual authentication between agents at the beginning of the session
and negotiation of cryptographic keys to be used during the session. IPsec can be
used in protecting data flows between a pair of hosts (host-to-host), between a
pair of security gateways (network-to-network), or between a security gateway and
a host (network-to-host). IPsec is an end-to-end security scheme operating in the
Internet Layer of the Internet Protocol Suite, while some other Internet security
systems in widespread use, such as Secure Sockets Layer (SSL), Transport Layer Security
(TLS) and Secure Shell (SSH), operate in the upper layers of the TCP/IP model. Hence,
IPsec protects any application traffic across an IP network. Applications do not
need to be specifically designed to use IPsec.
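For illustration, the C structs below sketch the wire format of the Encapsulating Security Payload (ESP, RFC 4303), the IPsec protocol that provides the per-packet encryption and authentication described above. Field names are mine; the trailer is shown separately because it follows the variable-length encrypted payload.

#include <stdint.h>

/* Sketch of an IPsec ESP packet (RFC 4303). Field names are
 * hypothetical; the layout follows the RFC, big-endian on the wire. */
#pragma pack(push, 1)
typedef struct {
    uint32_t spi;        /* Security Parameters Index: selects the SA */
    uint32_t seq;        /* anti-replay sequence number */
    /* cipher-dependent IV and the encrypted payload follow */
} esp_header_t;

typedef struct {
    /* 0-255 padding bytes precede these fields */
    uint8_t pad_len;     /* number of padding bytes */
    uint8_t next_header; /* protocol of the protected payload, e.g. 6 = TCP */
    /* Integrity Check Value (ICV) follows, length fixed by the SA */
} esp_trailer_t;
#pragma pack(pop)

Because the SPI and sequence number sit at a fixed offset in every protected packet, hosts and gateways can apply the negotiated keys without any participation from the application whose traffic is being carried.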
iSCSI Interface iSCSI is an acronym for Internet Small Computer System Interface,
an Internet Protocol (IP)-based storage networking standard for linking data storage
facilities. By carrying SCSI commands over IP networks, iSCSI is used to facilitate
data transfers over intranets and to manage storage over long distances. iSCSI can
be used to transmit data over local area networks (LANs), wide area networks (WANs),
or the Internet and can enable location-independent data storage and retrieval.
The protocol allows clients (called initiators) to send SCSI commands (CDBs) to
SCSI storage devices (targets) on remote servers. It is a storage area network (SAN)
protocol, allowing organizations to consolidate storage into data center storage
arrays while providing hosts (such as database and web servers) with the illusion
of locally attached disks. Unlike Fibre Channel, which requires special-purpose
cabling, iSCSI can be run over long distances using existing network infrastructure.
iSCSI was submitted as a draft standard in March 2000.
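To make the initiator-to-target exchange concrete, the C struct below sketches the 48-byte Basic Header Segment of an iSCSI SCSI Command PDU (opcode 0x01), which is how an initiator carries a CDB to a target. Field names are mine; the offsets follow RFC 7143.

#include <stdint.h>

/* Sketch of the 48-byte Basic Header Segment (BHS) of an iSCSI
 * SCSI Command PDU (RFC 7143). Field names are hypothetical;
 * multi-byte fields are big-endian on the wire. */
#pragma pack(push, 1)
typedef struct {
    uint8_t  opcode;             /* 0x01 = SCSI Command */
    uint8_t  flags;              /* Final, Read/Write, task attributes */
    uint8_t  reserved[2];
    uint8_t  total_ahs_length;   /* extra header segments, in 4-byte words */
    uint8_t  data_seg_len[3];    /* length of the data segment that follows */
    uint8_t  lun[8];             /* logical unit number */
    uint32_t initiator_task_tag; /* matches the target's response PDU */
    uint32_t expected_xfer_len;  /* bytes the initiator expects to move */
    uint32_t cmd_sn;             /* command sequence number */
    uint32_t exp_stat_sn;        /* acknowledges target status PDUs */
    uint8_t  cdb[16];            /* the SCSI CDB itself, e.g. READ(10) */
} iscsi_scsi_cmd_bhs_t;
#pragma pack(pop)

Everything the target needs to execute the command rides in this header and its data segment, which is why ordinary TCP/IP infrastructure is enough to present a remote array as a locally attached disk.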
Kernel Mode Remote Direct Memory Access (kRDMA)
Kernel RDMA uses the Network
Direct Kernel Provider Interface
(NDKPI), which is an extension to NDIS that allows
IHVs to provide kernel-mode Remote Direct Memory Access (RDMA) support in a network
adapter. To expose the adapter's RDMA functionality, the IHV must implement the
NDKPI interface as defined in the NDKPI Reference. A NIC vendor implements RDMA
as a combination of software, firmware, and hardware. The hardware and firmware
portion is a network adapter that provides NDK/RDMA functionality. This type of
adapter is also called an RDMA-enabled NIC (RNIC). The software portion is an NDK-capable
miniport driver, which implements the NDKPI interface. NDK providers must support
Network Direct connectivity via both IPv4 and IPv6 addresses assigned to NDK-capable miniport adapters.
Receive Segment Coalescing (RSC) RSC is a stateless offload technology that
helps reduce CPU utilization for network processing on the receive side by offloading
tasks from the CPU to an RSC-capable network adapter. CPU saturation due to networking-related
processing can limit server scalability. This problem in turn reduces the transaction
rate, raw throughput, and efficiency. RSC enables an RSC-capable network interface
card to do the following: parse multiple TCP/IP packets and strip the headers from
the packets while preserving the payload of each packet; combine the payloads
of the multiple packets into one packet; and send the single packet, which contains
the payload of multiple packets, to the network stack for subsequent delivery to
applications. The network interface card performs these tasks based on rules that
are defined by the network stack subject to the hardware capabilities of the specific
network adapter. This ability to receive multiple TCP segments as one large segment
significantly reduces the per-packet processing overhead of the network stack. Because
of this, RSC significantly improves the receive-side performance of the operating
system (by reducing the CPU overhead) under network I/O intensive workloads.
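The toy C function below simulates the coalescing rule just described, under the simplifying assumption that segments of one connection arrive in order: payloads whose TCP sequence numbers are contiguous are merged into a single buffer, standing in for the one large segment the NIC would indicate to the stack. All names are hypothetical.

#include <stdint.h>
#include <string.h>

#define MAX_COALESCED 65535

/* A received TCP segment after header parsing (illustrative). */
typedef struct {
    uint32_t seq;            /* TCP sequence number */
    const uint8_t *payload;
    uint16_t len;
} tcp_segment_t;

/* Merge in-order, contiguous segments of one flow into a single
 * payload, as an RSC-capable NIC would before indicating one large
 * segment. Returns the number of segments consumed. */
size_t coalesce(const tcp_segment_t *segs, size_t n,
                uint8_t *out, uint16_t *out_len)
{
    size_t i = 0;
    uint16_t total = 0;
    uint32_t expect = n ? segs[0].seq : 0;

    while (i < n && segs[i].seq == expect &&
           (uint32_t)total + segs[i].len <= MAX_COALESCED) {
        memcpy(out + total, segs[i].payload, segs[i].len);
        total += segs[i].len;
        expect += segs[i].len;   /* next contiguous sequence number */
        i++;
    }
    *out_len = total;            /* one indication instead of i */
    return i;
}

The CPU saving comes from the last line: the stack runs its per-packet receive path once for i segments instead of i times.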
Receive Side Scaling (RSS) Receive side scaling (RSS) is a network driver
technology that enables the efficient distribution of network receive processing
across multiple physical cores (not hyper-threaded logical processors) in multiprocessor systems. To
process received data efficiently, a miniport driver's receive interrupt service
function schedules a deferred procedure call (DPC). Without RSS, a typical DPC indicates
all received data within the DPC call. Therefore, all of the receive processing
that is associated with the interrupt runs on the CPU where the receive interrupt
occurs. With RSS, the NIC and miniport driver provide the ability to schedule receive
DPCs on other processors. Also, the RSS design ensures that the processing that
is associated with a given connection stays on an assigned CPU.
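The mechanism that keeps a connection on one assigned CPU is a hash over the packet's connection tuple: RSS computes the Toeplitz hash and uses its low-order bits to index an indirection table of CPUs. The C sketch below implements that computation for an IPv4/TCP 4-tuple; the 40-byte secret key and the table contents are placeholders chosen by the host.

#include <stdint.h>
#include <stddef.h>

/* Toeplitz hash as used by RSS: for every set bit of the input,
 * XOR in the 32-bit window of the secret key aligned at that bit. */
uint32_t rss_toeplitz(const uint8_t *key, const uint8_t *in, size_t len)
{
    uint32_t hash = 0;
    uint32_t window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
                      ((uint32_t)key[2] << 8)  |  (uint32_t)key[3];

    for (size_t i = 0; i < len; i++) {
        for (int b = 0; b < 8; b++) {
            if (in[i] & (0x80 >> b))
                hash ^= window;
            /* slide the key window left by one bit */
            window = (window << 1) | ((key[i + 4] >> (7 - b)) & 1);
        }
    }
    return hash;
}

/* Pick the CPU for an IPv4/TCP flow: hash the 12-byte tuple
 * (src IP, dst IP, src port, dst port, all big-endian) and index
 * a 128-entry indirection table with the low 7 bits. */
uint32_t rss_cpu(const uint8_t key[40], const uint8_t tuple[12],
                 const uint8_t table[128])
{
    return table[rss_toeplitz(key, tuple, 12) & 0x7F];
}

Because the tuple is identical for every packet of a connection, the hash, and therefore the DPC's target CPU, never changes for that connection's lifetime.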
Single Root I/O Virtualization (SR-IOV)
The single root I/O virtualization
(SR-IOV) interface is an extension to the PCI Express (PCIe) specification. SR-IOV
allows a device, such as a network adapter, to separate access to its resources
among various PCIe hardware functions. These functions consist of the following:
A PCIe Physical Function (PF). This function is the primary function of the device
and advertises the device's SR-IOV capabilities. The PF is associated with the Hyper-V
parent partition in a virtualized environment.
One or more PCIe Virtual Functions (VFs). Each VF is associated with the device's
PF. A VF shares one or more physical resources of the device, such as a memory and
a network port, with the PF and other VFs on the device. Each VF is associated with
a Hyper-V child partition in a virtualized environment. Each PF and VF is assigned
a unique PCI Express Requester ID (RID) that allows an I/O memory management unit
(IOMMU) to differentiate between different traffic streams and apply memory and
interrupt translations between the PF and VFs. This allows traffic streams to be
delivered directly to the appropriate Hyper-V parent or child partition. As a result,
non-privileged data traffic flows from the PF to VF without affecting other VFs.
SR-IOV enables network traffic to bypass the software switch layer of the Hyper-V
virtualization stack. Because the VF is assigned to a child partition, the network
traffic flows directly between the VF and the child partition. As a result, the I/O
overhead in the software emulation layer is diminished, achieving network performance
that is nearly the same as in non-virtualized environments.
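Since the description above hinges on each PF and VF carrying a unique Requester ID, the small C helpers below show the conventional bus/device/function encoding of the 16-bit PCIe RID that the IOMMU keys its memory and interrupt translations on. The function names are mine.

#include <stdint.h>

/* Conventional PCIe Requester ID layout: 8-bit bus, 5-bit device,
 * 3-bit function. (With ARI, device and function merge into a single
 * 8-bit function number, which is how large VF counts are addressed.) */
static inline uint16_t rid_encode(uint8_t bus, uint8_t dev, uint8_t fn)
{
    return (uint16_t)((bus << 8) | ((dev & 0x1F) << 3) | (fn & 0x07));
}

static inline uint8_t rid_bus(uint16_t rid) { return (uint8_t)(rid >> 8); }
static inline uint8_t rid_dev(uint16_t rid) { return (rid >> 3) & 0x1F; }
static inline uint8_t rid_fn(uint16_t rid)  { return rid & 0x07; }

Every DMA transaction a VF issues carries its RID, so the IOMMU can map it into the address space of exactly one child partition without consulting the hypervisor on the data path.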
Virtual Machine Queue (VMQ) Virtual machine queue (VMQ) is a feature available
to computers with the Hyper-V server role installed, that have VMQ-capable network
hardware. VMQ uses hardware packet filtering to deliver packet data from an external
virtual machine network directly to virtual machines, which reduces the overhead
of routing packets and copying them to the virtual machine. When VMQ is enabled,
a dedicated queue is established on the physical network adapter for each virtual
network adapter that has requested a queue. As packets arrive for a virtual network
adapter, the physical network adapter places them in that network adapter's queue.
When packets are indicated up, all the packet data in the queue is delivered directly
to the virtual network adapter. Packets arriving for virtual network adapters that
don't have a dedicated queue, as well as all multicast and broadcast packets, are
delivered to the virtual network in the default queue. The virtual network handles
routing of these packets to the appropriate virtual network adapters as it normally would.
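A toy C model of the queue assignment just described: the NIC matches each arriving frame's destination MAC against the per-vNIC filters and places it in that adapter's dedicated queue, falling back to the default queue for unmatched, multicast, and broadcast traffic. All names below are hypothetical.

#include <stdint.h>
#include <string.h>
#include <stdbool.h>

#define DEFAULT_QUEUE 0

/* One hardware MAC filter: frames for this vNIC go to its queue. */
typedef struct {
    uint8_t mac[6];
    int     queue;   /* dedicated queue id (> 0) */
} vmq_filter_t;

/* Select the receive queue for a frame, VMQ-style: a dedicated queue
 * on an exact destination-MAC match, otherwise the default queue,
 * which also takes all multicast/broadcast (low bit of first octet). */
int vmq_select_queue(const uint8_t dst_mac[6],
                     const vmq_filter_t *filters, size_t nfilters)
{
    bool multicast = dst_mac[0] & 0x01;
    if (!multicast) {
        for (size_t i = 0; i < nfilters; i++)
            if (memcmp(dst_mac, filters[i].mac, 6) == 0)
                return filters[i].queue;
    }
    return DEFAULT_QUEUE;
}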
Switch Embedded Teaming Switch Embedded Teaming (SET) merges NIC Teaming
capabilities into the SDN switch. In addition, with SET a user may team RDMA-capable
NICs without giving up their RDMA capabilities, and may team
SR-IOV NICs without giving up their SR-IOV capabilities.
Host RDMA The Host RDMA feature provides the Network Direct Kernel consumer
interface at the management or other virtual NIC (vNIC) exposed to the host partition.
This enables, for example, SMB Direct to work over vNICs instead of requiring separate
RDMA-capable physical NICs.