Cisco Unified Communications -- Isolating Point(s) of Failure
This information applies to all Cisco Unified Communications System releases.
After collecting information on the symptoms and behavior of the problem, narrow the focus of your efforts as follows:
- Identify the specific devices involved in the problem.
- Check the version of software running on each device.
- Determine if something has changed in the network.
- Verify the integrity of the IP network.
Identify Devices Involved in the Problem
In large- to medium-sized networks, it is crucial to identify the specific phones, routers, switches, servers and other devices that were involved in a reported problem. Isolating these devices allows you to rule out the vast majority of equipment within the network and focus your time and energy on suspect devices. To help you isolate which devices were involved in a problem, two types of information can prove invaluable:
- Network topology diagrams: It is strongly recommended that you have one or more diagrams that show the arrangement of all Cisco Unified Communications products in your network. These diagrams illustrate how these devices are connected and also capture each device's IP address and name (you may want to also have a spreadsheet or database of the latter information). This information can help you visualize the situation and focus on the devices that may be contributing to the reported problem.
- Call flow diagrams: Cisco equipment, including Unified Communications Manager servers, typically provides detailed debug and call trace log files. To interpret these log files, however, it is useful to understand the signaling that occurs between devices as calls are set up and disconnected. Using the network topology and call flow diagrams in conjunction with the log files, you can trace how far a call progressed before it failed and identify which device reported the problem.
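As a rough illustration of how topology information narrows the list of suspect devices, the following Python sketch stores a small inventory and finds the devices that sit between two endpoints. The device names, IP addresses, and links are hypothetical placeholders rather than values from any real network, and the search follows only the physical links you record; the signaling path through the call processing server may differ from it.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    name: str
    ip: str
    role: str  # e.g. "phone", "switch", "router", "call-manager", "gateway"


# Hypothetical inventory; in practice this comes from your spreadsheet or database.
INVENTORY = {
    d.name: d
    for d in [
        Device("phone-a", "10.1.10.21", "phone"),
        Device("sw-access-1", "10.1.10.2", "switch"),
        Device("rtr-site-1", "10.1.0.1", "router"),
        Device("cucm-pub", "10.0.0.10", "call-manager"),
        Device("gw-pstn-1", "10.0.0.20", "gateway"),
        Device("phone-b", "10.2.10.33", "phone"),
    ]
}

# Hypothetical links copied from a topology diagram (treated as undirected).
LINKS = [
    ("phone-a", "sw-access-1"),
    ("sw-access-1", "rtr-site-1"),
    ("rtr-site-1", "cucm-pub"),
    ("rtr-site-1", "gw-pstn-1"),
    ("phone-b", "rtr-site-1"),
]


def devices_on_path(src, dst, links=LINKS):
    """Return device names on a shortest path from src to dst, or [] if none."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)

    previous = {src: None}  # also serves as the visited set
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for nxt in neighbors.get(node, []):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return []
```

For a failed call from phone-a to the PSTN gateway, devices_on_path("phone-a", "gw-pstn-1") lists the devices to examine first, and INVENTORY supplies the IP address of each one for matching against log entries.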
Check Software Release Versions for Compatibility
After you have identified which devices may be involved in the problem, verify that the version of software running on each device is compatible with the software running on every other device. As part of Cisco Unified Communications release verification, Cisco Systems performs interoperability and load testing on simulated network environments running specific software versions. The Release Matrix lists the combinations of software releases that were tested. If the combination of releases installed in your network does not match an entry in the Release Matrix, the combination is not necessarily invalid; it only means that it was not part of the tested set.
To check interoperability for a specific device and software release, locate and review its Release Notes. Release Notes contain up-to-date information on compatibility between the product and various releases of other products. They also describe open caveats, which are known issues that may cause unexpected behavior. Before beginning extensive troubleshooting work, examine the Release Notes to determine if you are experiencing a known problem that has an available workaround. You can search for known problems in the Bug Toolkit.
- Tip: The Bug Toolkit requires that you are a Cisco partner or a registered Cisco.com user with a Cisco service contract. Using the Bug Toolkit, you can find caveats for any release.
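The comparison against the Release Matrix can be sketched in Python as shown below. The component names and version strings are made-up placeholders, not real Release Matrix entries, and the function only reports which components differ from the closest tested combination. A reported difference marks a component whose Release Notes deserve a closer look; it does not prove an incompatibility.

```python
# Hypothetical tested combinations; real values come from the Release Matrix.
TESTED_COMBINATIONS = [
    {"call-control": "1.0", "gateway": "1.0", "phone-firmware": "1.0"},
    {"call-control": "2.0", "gateway": "2.0", "phone-firmware": "2.0"},
]


def untested_components(installed, tested=TESTED_COMBINATIONS):
    """Return components that differ from the closest tested combination.

    An empty list means the installed set matches a tested combination exactly.
    """
    best = None
    for combo in tested:
        diff = sorted(k for k in installed if combo.get(k) != installed[k])
        if best is None or len(diff) < len(best):
            best = diff
    # With no tested combinations on record, every component is untested.
    return best if best is not None else sorted(installed)
```

For example, an installed set of call-control 2.0, gateway 1.0, and phone-firmware 2.0 is closest to the second combination, so the function returns only "gateway" as the component to check first.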
Determine if Network Changes Have Occurred
Before focusing on the particular device or site where the problem occurred, it may be useful to determine if a change was made to surrounding devices. If something has been added, reconfigured or removed from elsewhere in the network, that change may be the source of the problem. It is recommended that you track changes to the IP telephony network such as:
- New user phones added
- Modifications to Cisco Unified Communications Manager call routing settings, such as new directory numbers, route patterns and dial rules to support new sites or devices
- Changes to port configurations on switches, routers or gateways (new equipment, wiring changes or new port activation)
- Changes to IP addressing schemes (such as adding new subnets) that may have affected route tables
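If changes such as those listed above are recorded with a timestamp, the ones made shortly before a problem was reported are easy to pull out. The following Python sketch assumes a simple in-memory change log; the Change record, its fields, and the seven-day default window are illustrative assumptions, not part of any Cisco tool.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Change:
    when: datetime
    device: str
    summary: str


def recent_changes(changes, problem_time, window=timedelta(days=7)):
    """Return changes made within `window` before the problem, newest first."""
    start = problem_time - window
    hits = [c for c in changes if start <= c.when <= problem_time]
    return sorted(hits, key=lambda c: c.when, reverse=True)


# Hypothetical change log entries.
CHANGE_LOG = [
    Change(datetime(2024, 3, 1, 9, 0), "rtr-site-1", "Added new subnet"),
    Change(datetime(2024, 3, 9, 14, 30), "cucm-pub", "New route pattern for site 2"),
    Change(datetime(2024, 3, 10, 8, 15), "sw-access-1", "Activated additional ports"),
]
```

Calling recent_changes(CHANGE_LOG, datetime(2024, 3, 10, 12, 0)) returns the switch and call routing changes, newest first, and leaves out the older subnet change.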
Verify the IP Network Integrity
Always remember that Cisco Unified Communications equipment relies on a backbone IP network. Many connectivity problems are not caused by configuration errors or operational failures on Cisco devices, but rather by the IP network that interconnects them. Problems such as poor voice quality are typically due to IP network congestion, while call failures between locations may be the result of network outages due to disconnected cables or improperly configured IP route tables.
Before assuming that call processing problems result from Cisco Unified Communications devices themselves, check the integrity of the backbone IP network. Keep the OSI model in mind as you perform these checks. Start from the bottom, at the physical layer, by checking the end-to-end cabling. Then verify the status of Layer 2 switches, looking for any port errors. Move from there to confirm that the Layer 3 routers are running and contain correct routing tables. Continue up the OSI stack to Layer 7, the application layer. To resolve problems occurring at the top levels of the stack, a protocol analyzer (or "sniffer") may be useful. You can use a sniffer to examine the IP traffic passing between devices and also decode the packets. Sniffers are particularly useful for troubleshooting errors between devices that communicate using Media Gateway Control Protocol (MGCP) or Session Initiation Protocol (SIP).
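The bottom-up order described above can be captured in a small Python sketch that runs checks by ascending OSI layer and stops at the first failure. The check functions are placeholders you would supply yourself. The tcp_port_open helper uses only the standard socket module, and the example assumes SIP signaling over TCP on port 5060, which may not match your deployment.

```python
import socket


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def first_failing_layer(checks):
    """Run (layer, description, check) tuples from the lowest layer upward.

    Returns (layer, description) for the first check that returns False,
    or None if every check passes.
    """
    for layer, description, check in sorted(checks, key=lambda c: c[0]):
        if not check():
            return layer, description
    return None


def example_checks(host):
    """Build a hypothetical check list; replace the stubs with real tests."""
    return [
        (1, "end-to-end cabling verified", lambda: True),   # stub: manual check
        (2, "switch ports free of errors", lambda: True),   # stub: manual check
        (3, "routes to remote site present", lambda: True), # stub: manual check
        (4, "SIP signaling port reachable", lambda: tcp_port_open(host, 5060)),
    ]
```

Stopping at the lowest failing layer keeps attention on the most basic fault first, since a cabling or routing problem usually explains the failures seen at the layers above it.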