QoS Design Considerations for Virtual UC with UCS
Revision as of 15:45, 3 November 2011
This section covers best-practice guidelines and recommendations for LAN access for UC VM application traffic (e.g. voice/video media and signaling). It does NOT cover best practices for storage access (e.g. VM to SAN/NAS-attached array).
UCS Network Switching Hardware Compatibility
By default, UC on UCS permits any model of UCS 2100, UCS 6100, UCS 6200 and future product generations as long as the LAN access and storage access requirements of the UC VMs are met. There are no UC-specific rules on models or firmware levels, other than that some designs will require use of Specifications-based hardware support instead of Tested Reference Configurations.
However, recall that use of UCS 6100/6200 with UCS C-Series is supported only via specs-based VMware support, and not via UC on UCS Tested Reference Configurations.
Guidelines for Physical LAN Links, Trunking and Traffic Sizing
Redundant physical LAN interfaces are recommended.
Cisco UCS B-Series servers connect to the LAN via UCS 6100, so use redundant 1Gbps or 10Gbps ports on the UCS 6100.
Cisco UCS C-Series tested reference configurations ship with two or six 1Gbps Ethernet ports: two on the motherboard for C200 and C210, plus, on C210, an additional four on a PCIe NIC. The recommended best-practice configuration when using these tested reference configurations is:
- One or two pairs of teamed NICs for UC VM traffic. One pair is usually sufficient on C200 due to the low load per VM.
- One pair of NICs (teamed or dedicated) for VMware-specific traffic (e.g. management, vMotion, VMware High Availability, etc.)
If using specs-based VMware support, other LAN interconnect options may be used, such as using Cisco VIC in NIV mode instead of multiple physical NICs.
Whether you configure the UC VM traffic links as trunks depends on your deployment; UC applications do not mandate this. For example, if most of your VMs are in the same VLAN, trunks may not be needed. If you decide to use trunks, 802.1q is recommended over ISL.
VM vNIC traffic can be tagged with 802.1q if you use a trunk to your VMware host and a VLAN other than the native VLAN. UC applications do not mandate this, and recall that any tags will be removed by the vSwitch before traffic enters the VM.
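To make the tagging concrete, the following minimal Python sketch builds and parses the 4-byte 802.1q tag, which carries both the VLAN ID and the 3-bit L2 CoS (priority) field that the QoS discussion later in this page relies on. The function names are hypothetical and chosen only for illustration; the field layout (TPID 0x8100, then 3 bits priority, 1 bit DEI, 12 bits VLAN ID) follows the 802.1Q standard.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that identifies an 802.1Q tag


def build_dot1q_tag(vlan_id: int, cos: int = 0, dei: int = 0) -> bytes:
    """Return the 4-byte 802.1Q tag (TPID + TCI) in network byte order."""
    if not 0 <= vlan_id <= 0xFFF:
        raise ValueError("VLAN ID must fit in 12 bits")
    if not 0 <= cos <= 7:
        raise ValueError("CoS must fit in 3 bits")
    if dei not in (0, 1):
        raise ValueError("DEI is a single bit")
    # TCI layout: CoS (3 bits) | DEI (1 bit) | VLAN ID (12 bits)
    tci = (cos << 13) | (dei << 12) | vlan_id
    return struct.pack("!HH", TPID_8021Q, tci)


def parse_dot1q_tag(tag: bytes) -> dict:
    """Split a 4-byte 802.1Q tag back into its fields."""
    tpid, tci = struct.unpack("!HH", tag)
    if tpid != TPID_8021Q:
        raise ValueError("not an 802.1Q tag")
    return {"cos": tci >> 13, "dei": (tci >> 12) & 1, "vlan_id": tci & 0xFFF}
```

For example, `build_dot1q_tag(100, cos=3)` yields a tag for VLAN 100 with CoS 3. The vSwitch strips this tag before the frame reaches the VM, so the guest never sees these fields.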
LAN Traffic Sizing
Besides redundancy, link quantity and speed will depend on aggregate LAN traffic of UC VMs. Most UC VMs have only one or two vNICs. Each vNIC will use 1Gbps if available, but does not require this much bandwidth.
Use the design guides to size UC VM LAN traffic. For example:
- CUCM VM traffic bandwidth = Database Replication + ICCS + Signaling + MOH + CFB + MTP + TFTP as described in http://www.cisco.com/en/US/docs/voice_ip_comm/cucm/srnd/7x/models.html
- Cisco Unity Connection VM traffic bandwidth = port media traffic + Database replication + Exchange traffic as described in http://www.cisco.com/en/US/docs/voice_ip_comm/connection/8x/design/guide/8xcucdg060.html#wp1052981
This traffic sizing can be used to size LAN access links to handle the aggregate bandwidth load of the UC VMs.
- On Cisco UCS B-Series, this can be used to size FEX links for UCS 2100 to UCS 6100, and LAN access uplinks from UCS 6100
- On Cisco UCS C-Series, this can be used to size motherboard and PCIe NICs
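The sizing arithmetic described above can be sketched as follows. This is an illustrative calculation only: the function names, the 50% utilization ceiling, and the "one extra link for redundancy" rule are assumptions made for the example, and the per-component bandwidth figures must come from the design guides, not from this sketch.

```python
import math


def aggregate_mbps(vm_components: dict) -> float:
    """Sum per-component bandwidth (in Mbps) across all UC VMs.

    vm_components maps a VM name to a dict of {component: Mbps}, e.g. the
    CUCM components (database replication, ICCS, signaling, MOH, ...).
    """
    return sum(sum(parts.values()) for parts in vm_components.values())


def links_needed(total_mbps: float, link_mbps: float,
                 max_utilization: float = 0.5, redundant: bool = True) -> int:
    """Number of access links for a given aggregate load.

    max_utilization is an assumed planning ceiling per link; redundant adds
    one link so the load still fits after a single link failure (assumption).
    """
    usable = link_mbps * max_utilization
    n = max(1, math.ceil(total_mbps / usable))
    return n + 1 if redundant else n
```

As a usage example with placeholder numbers (not taken from any design guide), a host whose VMs total 20 Mbps needs `links_needed(20, 1000)` = 2 links of 1Gbps, that is, one link for the load plus one for redundancy.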
QoS Design Considerations for VMs with Cisco UCS B-Series Blade Servers
In a virtualized environment, Unified Communications applications such as Cisco Unified Communications Manager (Unified CM) run as virtual machines on top of VMware. These Unified Communications virtual machines are connected to a virtual software switch, rather than to a hardware-based Ethernet switch as in Media Convergence Server (MCS) deployments.
The following types of virtual software switches are available:
- Local VMware vSwitch: Available with all editions of the VMware ESXi hypervisor and independent of the type of VMware licensing scheme. Virtual software switching is limited to the local physical blade server on which the virtual machine is running.
- Distributed VMware vSwitch: Available only with the Enterprise Plus Edition of the VMware ESXi hypervisor. Distributed virtual software switching can span multiple physical blade servers and helps simplify manageability of the software switch.
- Cisco Nexus 1000V Switch: Cisco has a software switch called the Nexus 1000 Virtual (1000V) Switch. The Cisco Nexus 1000V requires the Enterprise Plus Edition of VMware ESXi. It is a distributed virtual switch visible to multiple VMware hosts and virtual machines. The Cisco Nexus 1000V Series provides policy-based virtual machine connectivity, mobile virtual machine security, enhanced QoS, and network policy.
From the virtual connectivity point of view, each virtual machine can connect to any one of the above virtual switches residing on a physical server.
UC virtual machine traffic flows from the application, to the virtual machine's vNIC, then to the VMware host's virtual software switch (one of VMware vSwitch, VMware Distributed vSwitch, or Cisco Nexus 1000V Switch), then out a physical adapter. What kind of physical adapter depends on the compute/network hardware in use:
- E.g. with UCS B-Series Blade Servers, the virtual software switch sends traffic through the blade's mezzanine physical Network Adapter (CNA or Cisco VIC), then through the blade chassis' physical UCS 2100 Series Fabric Extender, then to the physical UCS 6100 Series Fabric Interconnect Switch, and finally through a port module to the rest of the LAN.
- E.g. with UCS C-Series Rack-Mount Servers, the virtual software switch sends traffic through a local physical NIC, CNA or Cisco VIC, and then to the rest of the LAN (unless the C-Series is attached to a UCS 6100, in which case the NIC, CNA or VIC will send traffic to the physical UCS 6100).
Note that CNA, Cisco VIC and UCS 2100/6100 carry both the IP and Fibre Channel SAN traffic via Fibre Channel over Ethernet (FCoE) on a single wire. When UCS 6100 Fabric Interconnect Switches are used, they send IP traffic to an IP switch (for example, Cisco Catalyst or Nexus Series Switch) and SAN traffic to a Fibre Channel SAN Switch (for example, Cisco MDS Series Switch).
Standard Switching Element QoS Behavior
By default within the UCS 6100 Series Fabric Interconnect Switch, a priority QoS class is automatically created for all Fibre Channel (FC) traffic destined to the SAN switch. This FC QoS class has a no-drop policy, and all the FC traffic is marked with Layer 2 CoS value of 3. By default all other traffic (Ethernet and IP), including voice signaling and media traffic, falls into the Best Effort QoS class.
The VMware local vSwitch, VMware distributed vSwitch, and UCS 6100 Series switches cannot map L3 DSCP values to L2 CoS values. Traffic can be prioritized or de-prioritized inside the UCS 6100 Switch based on L2 CoS only.
Note: Unified Communications applications mark the L3 DSCP values only (for instance, CS3 for voice signaling). However, it is possible to mark all traffic originating from a blade server Network Adapter with a single L2 CoS value.
The Nexus 1000V software switch has the ability to map L3 DSCP values to L2 CoS values, and vice versa, like traditional Cisco physical switches such as the Catalyst Series Switches. Therefore, when Unified Communications traffic leaves a virtual machine and enters the Nexus 1000V switch, its L3 DSCP values can be mapped to corresponding L2 CoS values. This traffic can then be prioritized or de-prioritized based on the L2 CoS value inside the UCS 6100 Switch.
For instance, voice signaling traffic with L3 DSCP value of CS3 is mapped to L2 CoS value of 3 by Nexus 1000V. All Fibre Channel over Ethernet (FCoE) traffic is marked with L2 CoS value of 3 by Cisco UCS. When voice signaling and FCoE traffic enter the Cisco UCS 6100 Fabric Interconnect Switch, both will carry a CoS value of 3. In this situation voice signaling traffic will share queues and scheduling with the Fibre Channel priority class and will be given lossless behavior. (Fibre Channel priority class for CoS 3 in the UCS 6100 Fabric Interconnect Switch does not imply that the class cannot be shared with other types of traffic.)
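The CoS collision described above can be illustrated with a short Python sketch. It assumes the conventional default mapping in which the CoS value is the three most significant bits of the 6-bit DSCP value; the actual mapping on a Nexus 1000V is set by its configured policy and may differ. All names here are hypothetical.

```python
# Standard DSCP code points (decimal) for a few common markings.
DSCP = {"BE": 0, "CS3": 24, "AF41": 34, "EF": 46}

FCOE_DEFAULT_COS = 3  # Cisco UCS marks FCoE traffic with CoS 3 by default


def dscp_to_cos(dscp: int) -> int:
    """Map a 6-bit DSCP value to a 3-bit CoS using the top three bits.

    This is the conventional default mapping, assumed here for illustration.
    """
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must fit in 6 bits")
    return dscp >> 3


def shares_fcoe_class(dscp: int) -> bool:
    """True if traffic with this DSCP lands in the same CoS as FCoE."""
    return dscp_to_cos(dscp) == FCOE_DEFAULT_COS
```

Under this assumption, `shares_fcoe_class(DSCP["CS3"])` is true (voice signaling shares the no-drop class with FCoE), while `shares_fcoe_class(DSCP["EF"])` is false because EF maps to CoS 5.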
On the other hand, the L2 CoS value for FCoE traffic can be changed from its default value of 3 to another value, and CoS 3 can be reserved exclusively for the voice signaling traffic. However, Cisco does not suggest or recommend this approach because some Converged Network Adapters (CNAs) cause problems when the FCoE CoS value is not set to a value of 3.
In the physical server design, the hard drives are locally attached to the MCS server, and the SCSI traffic never competes with the Ethernet IP traffic.
Virtual Unified Communications designs with UCS B-Series Systems differ from traditional MCS-based designs. In a virtual Unified Communications design, because the hard drive is remote and accessed via the FC SAN, there is a potential for FC SAN traffic to compete for bandwidth with the Ethernet IP traffic inside the UCS 6100 Series Switch. This could result in voice-related IP traffic (signaling and media) being dropped because FC traffic has a no-drop policy inside the UCS 6100 Switch. This congestion or oversubscription scenario is highly unlikely, however, because the UCS 6100 switch provides a high-capacity switching fabric, and the usable bandwidth per server blade far exceeds the maximum traffic requirements of a typical Unified Communications application.
The Nexus 1000V provides enhanced QoS and other features (for example, ACLs, DHCP snooping, IP Source Guard, SPAN, and so forth) that are essential for virtualized data centers and are not available in the other virtual switch implementations. With its capability to map L3 DSCP values to L2 CoS values, the Nexus 1000V switch is recommended for large data center implementations where Cisco Unified Communications Applications are deployed with many other virtual machines running on UCS B-Series systems. For other Unified Communications deployments, the decision to use the Nexus 1000V will vary on a case-by-case basis, depending on the available bandwidth for Unified Communications Applications within the UCS architecture. If there is a possibility that a congestion scenario will arise, then the Nexus 1000V switch should be deployed.
If the Nexus 1000V is not deployed, an alternative is to use the Cisco VIC card to provide some level of QoS. The Cisco VIC card allows the administrator to create more than two virtual Ethernet NICs, and a QoS policy can be applied to each virtual Ethernet NIC. Depending on which traffic is most critical to a UC application in your deployment, and on the QoS policy that this traffic requires, the UC application VMs can be assigned different QoS policies.
- For example, for a typical CUCM deployment, the most critical traffic is the signaling traffic (assuming the CUCM built-in software conference bridge is not heavily used in your deployment), which is usually assigned a DSCP value of CS3. So a pair of virtual Ethernet NICs (one active, one standby) can be created in UCS Manager for the CUCM VMs, with a CoS of 3 and a "No Drop" policy.
- Similarly, the most critical traffic for CUP is typically signaling traffic, and therefore the CUP VMs could use the same virtual Ethernet NIC pair. Note that in this example the FC and UC VMs would both use the same QoS policy with CoS=3 and a No Drop policy, but the active virtual Ethernet NIC for FC could be configured to use one of the Fabric Interconnect switches and the active virtual Ethernet NIC for the UC VMs could be configured to use the other Fabric Interconnect switch. FC and the UC VMs would use the same Fabric Interconnect only in case of failure.
- For Cisco Unity Connection or Cisco Unified Contact Center Express, real-time audio traffic is typically the most critical traffic. These VMs could be configured to use a virtual Ethernet NIC pair that is assigned a QoS policy where CoS=5 and packet drop is enabled, similar to the treatment of traffic marked with DSCP EF.
In general, the downside to this approach is that ALL traffic types in a UC VM will have their CoS set to the same value, so lower priority traffic in the VM (for example backups, CDRs, logs, Web traffic) would have the same CoS value as higher priority traffic such as signaling or real-time audio. To implement an optimal solution, the Nexus 1000V should still be deployed because it can mark different types of traffic with different CoS values, even if the different traffic flows are part of the same VM and are using the same vNIC.
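The difference between the two approaches can be modeled in a few lines of Python. This is a conceptual sketch, not a representation of UCS Manager or Nexus 1000V configuration: the `Flow` type and function names are hypothetical, the DSCP value of 0 for backup traffic is an assumed example, and the per-flow mapping assumes CoS is taken from the top three bits of DSCP.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    name: str
    dscp: int  # L3 marking applied by the UC application


def cos_per_vnic(flows, vnic_cos: int) -> dict:
    """Cisco VIC style: every flow on the vNIC gets the vNIC's single CoS."""
    return {f.name: vnic_cos for f in flows}


def cos_per_flow(flows) -> dict:
    """Nexus 1000V style: each flow's CoS is derived from its own DSCP.

    Uses the top three DSCP bits as the CoS (assumed default mapping).
    """
    return {f.name: f.dscp >> 3 for f in flows}


# Example flows inside one CUCM VM (DSCP values are illustrative).
cucm_flows = [Flow("signaling", 24), Flow("backup", 0)]
```

With `cos_per_vnic(cucm_flows, 3)` both signaling and backup traffic receive CoS 3, whereas `cos_per_flow(cucm_flows)` keeps signaling at CoS 3 and leaves backup traffic at CoS 0.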