Video conferencing standards
Video Conferencing (H.3xx)
H.3xx are ITU-T Study Group XVI umbrella recommendations for video conferencing. These recommendations reference other recommendations that include the protocols for coding video and audio, multiplexing, signalling and control. The core H.3xx recommendations are:
- H.320 Narrow-band videoconferencing over circuit-switched networks (N-ISDN, SW56, dedicated networks).
- H.321 Narrow-band videoconferencing over ATM and B-ISDN.
- H.323 Narrow-band videoconferencing over non-guaranteed quality-of-service packet networks (LAN, Internet, etc.).
- H.324 Very narrow-band videoconferencing over the general (dial-up) telephone network.
- H.310 Wide-band (MPEG-2) videoconferencing over ATM and B-ISDN.
For the H.3xx recommendations, referencing the T.120 recommendation for data collaboration is optional.
H.323 is an International Telecommunication Union (ITU) standard that describes the protocols, services and equipment necessary for multimedia communications, including audio, video and data, on networks without guaranteed Quality of Service (QoS). These network technologies may include Ethernet, Fast Ethernet and Token Ring, and protocols such as Internet Protocol (IP) or Internetwork Packet Exchange (IPX). Because of the need to communicate between smaller networks connected to the Internet, H.323 will be more popular on IP networks.
H.323 Basic Network Components
H.323 specifies several new standards to allow for communications between terminals on IP networks. These standards dictate how different mandatory and optional components of the H.323 standard interoperate with each other. The major network components of H.323 include:
- Terminals: The terminal or endpoint must support a minimum of G.711 audio, H.225, H.245, Q.931 and RTP. If the terminal supports video, it must support a minimum of H.261 QCIF. T.120 data sharing is optional.
- Gatekeepers: The gatekeeper is an optional component of H.323 that manages the other components of an H.323 network, and it is central to any managed network. Its responsibilities include translating E.164 aliases to IP or IPX addresses, managing bandwidth for incoming and outgoing calls, admitting or denying calls, and managing zones. Gatekeepers can also support an optional feature that reroutes a call if the intended terminal does not answer, and they help manage H.323 MCU sessions. Although gatekeepers are optional, an H.323 terminal must use the gatekeeper's services if one is present in the network. Gatekeepers are typically software products that reside on a server. Many H.323 MCUs and gateways have embedded gatekeepers, but these usually offer fewer features than stand-alone gatekeepers.
- Gateways: If an H.323 terminal needs to communicate with a terminal on an H.320, H.324 or analogue PSTN network, an H.323 gateway is required to perform the translation. This optional component typically has ISDN and IP network connections and translates between the two networks. The number of simultaneous connections allowed through a gateway is not specified in any standard, so capacity varies between manufacturers. Gateways typically have built-in gatekeepers with minimal features.
- Multipoint Control Units: The last of the major components is the MCU, which controls conferences between three or more terminals. The H.323 MCU may be a separate component or may be incorporated into a terminal. Some systems have optional software packages that enable internal H.323 MCU capabilities. One such built-in MCU, Multisite, can establish meetings with up to 4 video sites (5 if at least one site is a telephone call over ISDN, analogue or mobile). A conference can consist of any combination of ISDN and LAN sites. Up to 4 Multisite systems can be cascaded in one meeting, for a maximum of 10 video systems and 4 telephones.
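The gatekeeper duties listed above (alias translation, call admission and bandwidth management) can be illustrated with a small toy model. This is only a sketch: the class, its method names, the example alias and the 768 kbit/s zone budget are all hypothetical, and it does not reproduce the actual H.225 signalling that a real gatekeeper uses.

```python
from dataclasses import dataclass, field


@dataclass
class Gatekeeper:
    """Toy model of a gatekeeper zone (hypothetical, not real H.323 signalling)."""

    zone_bandwidth_kbps: int                      # total bandwidth the zone may use
    aliases: dict = field(default_factory=dict)   # E.164 alias -> network address
    in_use_kbps: int = 0                          # bandwidth held by admitted calls

    def register(self, e164_alias: str, address: str) -> None:
        """Record which network address an E.164 alias maps to."""
        self.aliases[e164_alias] = address

    def resolve(self, e164_alias: str):
        """Address translation: return the address, or None if the alias is unknown."""
        return self.aliases.get(e164_alias)

    def admit(self, e164_alias: str, call_kbps: int) -> bool:
        """Call admission: accept only known aliases that fit in the remaining budget."""
        if self.resolve(e164_alias) is None:
            return False
        if self.in_use_kbps + call_kbps > self.zone_bandwidth_kbps:
            return False
        self.in_use_kbps += call_kbps
        return True

    def release(self, call_kbps: int) -> None:
        """Return bandwidth to the zone when a call ends."""
        self.in_use_kbps = max(0, self.in_use_kbps - call_kbps)


gk = Gatekeeper(zone_bandwidth_kbps=768)      # assumed zone budget
gk.register("5551001", "192.0.2.10")          # example alias and documentation address
print(gk.resolve("5551001"))
print(gk.admit("5551001", 384))               # fits in the budget
print(gk.admit("5551001", 384))               # fills the budget exactly
print(gk.admit("5551001", 384))               # would exceed the budget
print(gk.admit("5559999", 128))               # alias was never registered
```

Keeping translation and admission in one object mirrors the way the text describes the gatekeeper as a single point of management for a zone; a real product would separate these concerns and persist its registrations.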
In 1991 the European Videophony Experiment (EVE) saw various manufacturers trialing equipment using the CCITT standard H.320. Simultaneously in Japan the HATS trial was initiated, also H.320 based and this included some ISDN trial work. Today the H.320 standard forms an umbrella for a whole host of standards adopted by the main manufacturers of video conferencing equipment and ensures a fair degree of inter-connectivity. A group is currently being formed to help ensure interoperability and promote the use of the H.320 set of standards.
H.320 is an overall standard and requires, as a minimum for video conferencing intercommunication, that the following standards are used:
- G.711 Audio, 3 kHz bandwidth.
- H.261 Video Quarter Common Intermediate Format (QCIF).
- H.221 Packaging.
- H.242 Handshaking.
- H.230 Frame-synchronous Control
The governing standard for the transmission of audio is G.725. This standard encompasses:
- G.711 G.711 is the oldest compression algorithm. It is mandated by all H.3xx recommendations (except H.324). G.711 codes toll-quality (3 kHz analog bandwidth) audio into 48, 56, or 64 kbit/s.
- G.722 Optional recommendation. G.722 codes enhanced-quality (7 kHz analog bandwidth) audio into 48, 56, or 64 kbit/s.
- G.722.1 Codes enhanced-quality (7 kHz analog bandwidth) audio into 24 or 32 kbit/s. G.722.1 was approved in September 1999.
- G.723 Speech coder (published as G.723.1) at 6.3 and 5.3 kbit/s data rates. Medium complexity. Required for H.324; optional for H.323.
- G.728 Optional recommendation. G.728 codes toll-quality (3 kHz analog bandwidth) audio into 16 kbit/s.
- G.729 Codes toll-quality (3 kHz analog bandwidth) audio into 8 kbit/s.
- G.703 A standard associated with PCM that requires a bandwidth of 64 kbit/s. G.703 is the electrical and functional description.
Digitization of the audio signal is achieved using Pulse Code Modulation (PCM). Within Europe the A-law conversion is used, whilst the US and Japan use the μ-law conversion. Essentially, PCM takes an 8-bit sample of the audio waveform at a sampling rate of 8 kHz, giving 64 kbit/s. Each sample is actually 7 bits of magnitude, with the most significant bit (msb) used as a sign bit.
The G.711 protocol strips off the least significant bit (lsb) from each sample to reduce the 64 kbit/s stream to 56 kbit/s. This is essential to ensure that the first 64 kbit/s channel has the capacity to carry signalling information in addition to audio. When using SDS56, where only a 56 kbit/s channel is available, G.711 strips off a further bit to reduce the audio signal to 48 kbit/s.
The upshot of this bit stripping is that more noise is introduced into the audio, although to the human ear the difference is virtually imperceptible. G.722, which would normally require a channel capacity of 56 kbit/s, can also be reduced to 48 kbit/s; this is achieved both at code level, by altering the compression algorithm, and at bit level, by simple lsb stripping.
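The 64, 56 and 48 kbit/s figures follow directly from the sample width and the 8 kHz sampling rate. The short sketch below reproduces that arithmetic; the function names are illustrative, and `strip_lsbs` models bit stripping as a plain right shift, which is a simplification of how the freed bit positions are actually reused on the channel.

```python
SAMPLE_RATE_HZ = 8000  # PCM sampling rate quoted in the text


def pcm_bit_rate(bits_per_sample: int, sample_rate_hz: int = SAMPLE_RATE_HZ) -> int:
    """Bit rate in bit/s for a given number of bits kept per sample."""
    return bits_per_sample * sample_rate_hz


def strip_lsbs(sample: int, bits_to_strip: int) -> int:
    """Drop the lowest bits of an 8-bit code word (simplified as truncation)."""
    return (sample & 0xFF) >> bits_to_strip


for bits in (8, 7, 6):
    # 8 bits is unmodified PCM, 7 bits is one lsb stripped, 6 bits is two stripped
    print(bits, "bits/sample ->", pcm_bit_rate(bits) // 1000, "kbit/s")
```

Each stripped bit halves the number of quantization levels available per sample, which is where the additional noise mentioned above comes from.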
- H.261 The governing standard for the transmission of video is H.261, although this also acts as an umbrella for G.725. H.261 determines whether communication takes place at the Common Intermediate Format (CIF) level or the Quarter Common Intermediate Format (QCIF) level, and thereby provides a uniform process for a receiving codec to interpret a compressed video signal. H.261 is based on the Discrete Cosine Transform (DCT), DPCM and motion compensation techniques. In simple terms, for any connection rate up to 2 Mbit/s, H.261 enables:
- CIF - n frames/s @ a screen res. of 352x288 pixels.
- QCIF - n+ frames/s @ a screen res. of 176x144 pixels.
n depends on the degree of movement present in the image being transmitted (more movement generally means fewer frames per second at a given bit rate) and is limited by processor speed and available bandwidth.
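To see why the frame rate is limited by bandwidth, it helps to compare the uncompressed size of CIF and QCIF pictures with the capacity of a typical connection. The sketch below assumes 12 bits per pixel on average (8-bit samples with the colour components subsampled) and uses an illustrative 15 frames/s and a 128 kbit/s channel; none of these three figures comes from the text above.

```python
# Picture sizes quoted in the text (width, height in pixels)
FORMATS = {"CIF": (352, 288), "QCIF": (176, 144)}

# Assumption: 8-bit samples with subsampled colour, about 12 bits per pixel on average
BITS_PER_PIXEL = 12


def raw_bit_rate(fmt: str, frames_per_second: int) -> int:
    """Uncompressed video bit rate in bit/s for the given picture format."""
    width, height = FORMATS[fmt]
    return width * height * BITS_PER_PIXEL * frames_per_second


def required_compression(fmt: str, frames_per_second: int, channel_bps: int) -> float:
    """How many times smaller the stream must become to fit the channel."""
    return raw_bit_rate(fmt, frames_per_second) / channel_bps


for name in FORMATS:
    ratio = required_compression(name, frames_per_second=15, channel_bps=128_000)
    print(f"{name}: raw {raw_bit_rate(name, 15)} bit/s, needs about {ratio:.0f}:1 compression")
```

Because CIF has four times as many pixels as QCIF, it needs four times the compression at the same frame rate, which is why QCIF sustains a higher frame rate on the same link.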
Beneath the H.261 umbrella sit the following standards:
- H.221 - Frame structure, protocol and video/audio synchronization
- H.230 - As above but for MCU communications
- H.242 - Inter-device communication, i.e. in-band information exchange
Once an ISDN connection has been established, the synchronization of communicating devices is handled by H.221, which continues to handle all synchronization issues throughout a conversation. H.242 caters for inter-device communication.
- H.261 Annex D Protocol for transferring high quality still images 4CIF.
- H.263 H.263 is a newer compression algorithm and is optimized for lower data rates. It was originally developed under the H.324 umbrella for coding video at very low data rates (15 to 20 kbit/s). H.263's performance is superior to H.261, especially at data rates below 128 kbit/s. H.263 has five resolution modes:
- SQCIF (128 pixels per line by 96 lines)
- QCIF (176 pixels per line by 144 lines)
- CIF (352 pixels per line by 288 lines)
- 4CIF (704 pixels per line by 576 lines)
- 16CIF (1408 pixels per line by 1152 lines)
H.263 is mandatory for H.324 and optional for all other H.3xx recommendations. H.263-enabled systems are required to decode both the sub-QCIF (SQCIF) and QCIF resolution modes and encode either the SQCIF or QCIF modes. All other resolution modes are optional.
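The five resolution modes and the decode/encode requirement just described can be captured in a few lines. This is an illustrative sketch only; the dictionary and the two checking functions are hypothetical names, not part of any H.263 implementation.

```python
# The five H.263 resolution modes listed above: (pixels per line, lines)
H263_MODES = {
    "SQCIF": (128, 96),
    "QCIF": (176, 144),
    "CIF": (352, 288),
    "4CIF": (704, 576),
    "16CIF": (1408, 1152),
}

# Modes every H.263 decoder must handle, per the text
MANDATORY = {"SQCIF", "QCIF"}


def pixels(mode: str) -> int:
    """Total pixels per picture for a resolution mode."""
    width, height = H263_MODES[mode]
    return width * height


def decoder_is_compliant(supported) -> bool:
    """A decoder must support both SQCIF and QCIF."""
    return MANDATORY <= set(supported)


def encoder_is_compliant(supported) -> bool:
    """An encoder must support at least one of SQCIF or QCIF."""
    return bool(MANDATORY & set(supported))


for mode in H263_MODES:
    print(f"{mode}: {pixels(mode)} pixels per picture")

print(decoder_is_compliant({"QCIF", "CIF"}))   # missing SQCIF
print(encoder_is_compliant({"QCIF", "CIF"}))   # QCIF alone is enough to encode
```

The pixel counts also make the naming clear: 4CIF and 16CIF hold four and sixteen times as many pixels as CIF.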
- H.221 Frame Structure 64-1920 Kbps.
- H.223 Multiplexing protocol for low-bit rate multimedia communication.
- H.224 Protocol for simplex use of the data channel in H.221.
- H.225 Media stream packetization and synchronization on non-guaranteed quality-of-service LANs.
- H.230 Frame-synchronous control and indication signals for audiovisual systems (MCU/graphics).
- H.231 MCU (Multipoint Control Unit) for digital network up to 2 Mbit/s
- H.233 The ITU-T's data-encryption standard for real-time multimedia, H.233 is supported across a wide range of standard services, including H.320, H.323, and H.324. A related standard is H.234, which specifies how encryption keys are handled. See H.320, H.323, H.324.
- H.234 Encryption key management and authentication system for audiovisual services. Three methods of encryption key management are ISO 8732, Diffie-Hellman, RSA. They are applicable to the encryption of audiovisual signals transmitted digitally using the H.221 frame structure. The management messages defined are transmitted within the encryption control signal (ECS) channel of H.221, whose structure and use is defined in H.233.
- H.242 Protocols for call set up and disconnect for digital networks up to 2 Mbit/s. In-band information exchange. Used with H.320, not H.323.
- H.243 MCU call set up (3 or more users) for digital network up to 2 Mbit/s
- H.244 H.221 + Bonding
- H.245 Control of communications between visual telephone systems and terminal equipment on non-guaranteed bandwidth LANs (H.323).
- H.281 Far end Camera Control
- H.331 Broadcast-mode video conferencing. H.242 is not used, so the receiver must know which video and audio algorithms are in use for a connection to succeed.
- BONDING Frame structure for multiplexing 1 to 30 channels of 56 or 64 kbit/s. Alternative to H.221 (not part of the H.320 standards but the most commonly used industry standard for multiplexing channels)
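The channel-aggregation figures in this list are simple multiples of the basic channel rate: 1 to 30 channels of 64 kbit/s gives the 64 to 1920 kbit/s range quoted for H.221. The sketch below works through that arithmetic; the function names and the 384 kbit/s example call are illustrative assumptions.

```python
def aggregate_rate_kbps(channels: int, channel_rate_kbps: int = 64) -> int:
    """Total rate when 1 to 30 channels of 56 or 64 kbit/s are multiplexed."""
    if not 1 <= channels <= 30:
        raise ValueError("channel count must be between 1 and 30")
    if channel_rate_kbps not in (56, 64):
        raise ValueError("channel rate must be 56 or 64 kbit/s")
    return channels * channel_rate_kbps


def channels_needed(target_kbps: int, channel_rate_kbps: int = 64) -> int:
    """Smallest number of channels whose combined rate reaches the target."""
    count = -(-target_kbps // channel_rate_kbps)  # ceiling division
    aggregate_rate_kbps(count, channel_rate_kbps)  # reuse the range checks
    return count


print(aggregate_rate_kbps(1), "to", aggregate_rate_kbps(30), "kbit/s")
print("384 kbit/s call:", channels_needed(384, 64), "x 64k or", channels_needed(384, 56), "x 56k")
```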
T.120 is a family of recommendations that define the protocols for data collaboration. The T.120 recommendations are arranged in a layered hierarchy in which each layer builds on the layers below it to define its protocols and services. The following is a list of core recommendations that fall under the T.120 umbrella:
- T.121 General template that provides guidance for developing T.120 application protocols.
- T.122/125 Multipoint Communication Service protocols
- T.123 Transport protocol stack
- T.124 Generic Conference Control (GCC); the application protocol supporting reservations and basic conference control services for Multipoint teleconferences
- T.125 Multipoint Communication Service (MCS); protocol specification
- T.126 Still image and annotation protocol
- T.127 Binary file transfer protocol
- T.128 Application sharing protocol
T.122 through T.125 define services for any application that uses these protocols; they can be thought of as the "plumbing" that links the user to the data management infrastructure.
The important application (peer-to-peer) protocols are T.126, T.127, and T.128. T.126 defines the protocol for sharing images and text, and for whiteboard annotations; T.127 defines the protocol for transferring binary files; and T.128 defines the protocols for sharing applications.
It should be noted that T.120 is "network independent." In other words, T.120 works over IP, ISDN, ATM, or even the analog telephone network. This is in contrast to the videoconferencing recommendations discussed below, which are keyed to a particular network transport.
H.320 and H.323 have their own collection of standards that are defined in the chart below. We have included the other popular communications standards as a comparison.
|Standard||H.320||H.321||H.322||H.323||H.324|
|Network||Narrowband switched digital ISDN||Broadband ISDN, ATM, LAN||Guaranteed-bandwidth packet-switched networks||Non-guaranteed-bandwidth packet-switched networks (Ethernet)||PSTN or POTS, the analog phone system|
|Video||H.261, H.263||H.261, H.263||H.261, H.263||H.261, H.263||H.261, H.263|
|Audio||G.711, G.722, G.722.1, G.728||G.711, G.722, G.728||G.711, G.722, G.728||G.711, G.722, G.728, G.723, G.729||G.723|
|Control||H.230, H.242||H.242||H.242, H.230||H.245||H.245|
|Multipoint||H.231, H.243||H.231, H.243||H.231, H.243||H.323|| |
|Comm. Interface||I.400||AAL I.363, ATM I.361, PHY I.400||I.400, TCP/IP||TCP/IP||V.34 modem|
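One practical use of the chart is to see what two endpoints on different networks have in common, and therefore what a gateway has to translate. The sketch below encodes three of the columns, reading the ISDN column as H.320, the non-guaranteed packet column as H.323 and the PSTN column as H.324, in line with the definitions at the start of this document. The dictionary layout and function name are hypothetical.

```python
# Three columns of the chart, keyed by the recommendation that covers each network
CHART = {
    "H.320": {
        "video": {"H.261", "H.263"},
        "audio": {"G.711", "G.722", "G.722.1", "G.728"},
        "control": {"H.230", "H.242"},
    },
    "H.323": {
        "video": {"H.261", "H.263"},
        "audio": {"G.711", "G.722", "G.728", "G.723", "G.729"},
        "control": {"H.245"},
    },
    "H.324": {
        "video": {"H.261", "H.263"},
        "audio": {"G.723"},
        "control": {"H.245"},
    },
}


def common(standard_a: str, standard_b: str, kind: str) -> set:
    """Standards of one kind (video, audio, control) that both columns list."""
    return CHART[standard_a][kind] & CHART[standard_b][kind]


for kind in ("video", "audio", "control"):
    shared = sorted(common("H.320", "H.323", kind))
    print(f"H.320 <-> H.323 {kind}: {shared if shared else 'gateway must translate'}")
```

For an H.320 to H.323 call the video and several audio codecs are shared, but the control protocols are not, which matches the gateway's translation role described earlier.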