The H.323 protocol and Session Initiation Protocol, or SIP, both support voice over IP and multimedia communications, but the standards were developed by different standards-setting bodies and have evolved in different ways. As a result, both target multimedia transmitted over IP networks but have different capabilities. This can be a strength or a weakness, depending on the needs of the network operator or enterprise.
While both protocols were developed beginning in 1996, SIP has become the VoIP and multimedia standard of choice over time and is being used by large hardware and software vendors, including Microsoft and Cisco. While few manufacturers are working on new H.323 implementations, it is still in use in many legacy systems, and some standards work continues.
What is the H.323 protocol?
H.323 is a binary-based standard developed by the International Telecommunication Union's Telecommunication Standardization Sector (ITU-T) to support rich-media communications over IP networks. The ITU-T approved the first version of H.323 in 1996. H.323 was initially developed to support video conferencing, but it grew to also support audio conferencing and fax.
H.323 is a well-defined and well-structured protocol with specific definitions for establishing sessions -- which can be loosely compared to calls -- services and session components. Because its services were rigidly defined by a telecom-based standards body, H.323 implementations are generally interoperable.
What is the SIP protocol?
SIP is an ASCII text-based communications protocol that uses much of the existing design of the Hypertext Transfer Protocol (HTTP). SIP development began in the late 1990s and was driven by the goal of creating an IP-optimized and flexible communications protocol based on already defined internet standards. The Internet Engineering Task Force (IETF) issued its first SIP documentation in 1999. SIP supports voice, video, screen sharing and messaging.
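Because SIP is text-based and modeled on HTTP, a request can be read and split apart with ordinary string handling. The following is a minimal sketch, not a full parser: the message, the addresses and the function name parse_sip_request are illustrative assumptions, and repeated headers (which real SIP messages can carry) are collapsed to the last value for brevity.

```python
# Hypothetical INVITE request; all names and addresses are illustrative.
RAW_INVITE = (
    "INVITE sip:bob@example.com SIP/2.0\r\n"
    "Via: SIP/2.0/UDP pc33.example.org;branch=z9hG4bK776asdhds\r\n"
    "Max-Forwards: 70\r\n"
    "To: Bob <sip:bob@example.com>\r\n"
    "From: Alice <sip:alice@example.org>;tag=1928301774\r\n"
    "Call-ID: a84b4c76e66710@pc33.example.org\r\n"
    "CSeq: 314159 INVITE\r\n"
    "Contact: <sip:alice@pc33.example.org>\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)


def parse_sip_request(raw):
    """Split a SIP request into request line, headers and body.

    Simplification: a repeated header keeps only its last value.
    """
    # As in HTTP, a blank line separates the headers from the body.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    # Request line: METHOD, request URI, protocol version.
    method, request_uri, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {
        "method": method,
        "uri": request_uri,
        "version": version,
        "headers": headers,
        "body": body,
    }


request = parse_sip_request(RAW_INVITE)
print(request["method"], request["uri"])
```

The request line and the "Name: value" headers follow the same shape as an HTTP request, which is the design reuse the paragraph above describes.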
While most SIP development work is complete, the IETF continues to support additional standards work. In addition, the SIP Forum, an industry association of IP communications companies, develops interoperable implementations of SIP features, called primitives.
Use Cases: H.323 vs. SIP
H.323 and SIP have similar uses. Both were initially designed to support rich-media sessions -- including voice, video and screen sharing -- between two or more endpoints. Today, their primary uses include video conferencing, telephony, voice chat and screen sharing.
H.323 vs. SIP comparison
As H.323 and SIP were developed by different standards-setting bodies, they have evolved in different ways. While they aim to achieve the same goal of transmitting rich media over IP networks, they have different capabilities.
H.323 has a very well-defined architecture, which includes rules for implementing features and requirements for specific components of an H.323-based service. These components include the following:
- Terminals that are the endpoints in an H.323 service.
- Gateways that allow interconnection between H.323- and non-H.323-based endpoints. For example, a gateway could allow a public switched telephone network (PSTN)- or SIP-based phone to connect to an H.323 conference.
- Gatekeepers that control sessions and manage bandwidth by limiting the number of participating endpoints. Gatekeepers are optional, but most endpoints must use them if they are present.
- Multipoint control units, which serve as the hub for interconnecting three or more terminals. MCUs receive voice and video from terminals and distribute media out to other terminals participating in a conference.
SIP loosely defines the way endpoints connect to each other. It lets developers define options for voice and video codecs and for security. The protocol also offers flexibility for developers to create additional features. Key SIP components include the following:
- user agents, which are endpoints like phones and video conferencing apps or platforms; and
- intermediaries that allow scaling of conferencing, administration and security, and gateway functions. SIP intermediaries include the following:
  - proxy servers to support features like discovery of endpoints, policy enforcement, multipoint control and endpoint registration;
  - redirect servers to route SIP session requests to alternative endpoints based on policy, such as routing an incoming call to a backup phone or contact center; and
  - back-to-back user agents to enable insertion of devices into a SIP session for security, policy or gateway functionality.
Session Description Protocol (SDP) is core to SIP, as it defines the characteristics of a session. SDP lets endpoints specify the transport protocol, the voice and video codecs, and whether encryption and compression are used.
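To make the role of SDP more concrete, the sketch below reads the media lines of a small session description and lists the transport and codecs each one offers. The SDP text, the addresses and the function name parse_media are assumptions made for illustration; a production parser would handle many more line types and error cases.

```python
# Hypothetical SDP offer; addresses and ports are illustrative.
SDP_OFFER = """\
v=0
o=alice 2890844526 2890844526 IN IP4 pc33.example.org
s=-
c=IN IP4 pc33.example.org
t=0 0
m=audio 49170 RTP/AVP 0 8
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
m=video 51372 RTP/AVP 96
a=rtpmap:96 H264/90000
"""


def parse_media(sdp):
    """Return one entry per "m=" line with its port, transport and codecs."""
    media = []
    for line in sdp.splitlines():
        if line.startswith("m="):
            # m=<kind> <port> <transport> <payload types...>
            kind, port, transport, *formats = line[2:].split()
            media.append({
                "kind": kind,
                "port": int(port),
                "transport": transport,
                "formats": formats,
                "codecs": {},
            })
        elif line.startswith("a=rtpmap:") and media:
            # a=rtpmap:<payload type> <encoding>/<clock rate>
            payload_type, encoding = line[len("a=rtpmap:"):].split(" ", 1)
            media[-1]["codecs"][payload_type] = encoding
    return media


for stream in parse_media(SDP_OFFER):
    print(stream["kind"], stream["transport"], stream["codecs"])
```

Each "m=" line describes one media stream, and the "a=rtpmap" attributes that follow it name the codecs the endpoint is willing to use for that stream.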
Both protocols offer reliable means to establish sessions, as they support the use of TCP to ensure delivery of session setup and control messages between endpoints and intermediaries.
SIP reliability, however, is dependent on the design of the application, and the potential use of intermediaries to prevent introducing new sessions or endpoints onto a congested network.
Most H.323 implementations support H.235 for encryption of signaling and media streams, like voice and video. SIP supports the use of Transport Layer Security (TLS) to encrypt all session management messages, typically on TCP port 5061. SIP implementations may also define encryption of media streams.
Both H.323 and SIP are extremely scalable. H.323, however, has strict server requirements, which can make a large H.323 implementation more costly and complex. H.323's inherent traffic management capabilities, such as load balancing across gatekeepers and strict call admission control, mean that it will prevent session creation when the network is unable to support the desired session.
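Call admission control can be pictured as a bandwidth budget that each new session must fit into. The following toy sketch shows only that idea; it is not an H.323 implementation, and the class name Gatekeeper, its methods and the bandwidth figures are assumptions chosen for illustration.

```python
class Gatekeeper:
    """Toy admission controller: admit a call only if bandwidth remains."""

    def __init__(self, capacity_kbps):
        self.capacity_kbps = capacity_kbps
        self.in_use_kbps = 0
        self.calls = {}  # call id -> reserved bandwidth in kbps

    def admit(self, call_id, requested_kbps):
        """Reserve bandwidth and return True, or return False to reject."""
        if call_id in self.calls:
            return False  # the same call cannot be admitted twice
        if self.in_use_kbps + requested_kbps > self.capacity_kbps:
            return False  # the network cannot support this session
        self.calls[call_id] = requested_kbps
        self.in_use_kbps += requested_kbps
        return True

    def release(self, call_id):
        """Free the bandwidth held by a finished call, if it exists."""
        self.in_use_kbps -= self.calls.pop(call_id, 0)


gatekeeper = Gatekeeper(capacity_kbps=1000)
print(gatekeeper.admit("video-1", 768))  # fits within the budget
print(gatekeeper.admit("video-2", 384))  # would exceed the budget
```

Rejecting the second request up front, rather than letting both sessions degrade, is the behavior the paragraph above attributes to H.323's strict admission control.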
SIP lacks inherent control functions, but it provides the flexibility for application developers to implement their own approaches for call admission control and scalability. SIP session setup, management, and teardown typically require less network traffic than H.323.
In addition, SIP messages incorporate the IP addresses of the sender and receiver of a session request into the SIP header, creating difficulties in establishing sessions across network address translation boundaries. Its lack of tightly-defined service implementations -- unlike H.323 -- means not all instances of SIP are interoperable and supporting SIP across firewalls is problematic. For that reason, many enterprises deploy session border controllers, which are SIP-optimized devices, to manage security, protocol translation and policy enforcement between SIP instances.
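The address translation problem can be illustrated by checking whether an address advertised in a SIP header is a private one that a peer outside the local network could not reach. This is a minimal sketch under stated assumptions: the helper names, the simplified URI pattern and the sample header values are all hypothetical, and bracketed IPv6 hosts are not handled.

```python
import ipaddress
import re

# Simplified pattern: optional "user@" part, then the host, which ends
# at a port, a parameter, a closing bracket or whitespace.
SIP_URI_HOST = re.compile(r"sips?:(?:[^@>\s]+@)?([^:;>\s]+)")


def host_from_header(value):
    """Extract the host part of the first SIP URI in a header value."""
    match = SIP_URI_HOST.search(value)
    return match.group(1) if match else None


def advertises_private_address(value):
    """True if the header's SIP URI uses a private IP address literal."""
    host = host_from_header(value)
    if host is None:
        return False
    try:
        return ipaddress.ip_address(host).is_private
    except ValueError:
        return False  # a hostname, not an IP literal


print(advertises_private_address("<sip:alice@10.0.0.5:5060>"))
print(advertises_private_address("<sip:bob@example.com>"))
```

A session border controller or similar device typically has to rewrite addresses like the first one before the message leaves the private network, which is one reason such devices are deployed.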
H.323 call setup is handled by gatekeepers that receive session setup requests and join appropriate endpoints.
SIP setup is either direct between endpoints, if one endpoint knows the IP address of the other and can reach it directly, or through proxy servers. A SIP user agent sends a session initiation request to the proxy server, which forwards the request to the destination user agent. The destination user agent responds to the request, and both user agents use SDP to negotiate the characteristics of the session.
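The setup path through an intermediary can be sketched in two steps: the intermediary (a proxy server, in SIP terms) looks up where the called party is currently registered, and the called endpoint then answers with the subset of offered codecs it supports. The class ProxyServer, the function answer_codecs and all addresses below are hypothetical names for illustration, not part of any real SIP library.

```python
class ProxyServer:
    """Toy SIP proxy: maps a public SIP address to a registered contact."""

    def __init__(self):
        self.registrations = {}

    def register(self, address, contact):
        """Record where the user behind a public address can be reached."""
        self.registrations[address] = contact

    def route(self, request_uri):
        """Return the contact to forward a request to, or None if unknown."""
        return self.registrations.get(request_uri)


def answer_codecs(offered, supported):
    """Keep the offered codecs the answerer also supports, in offer order."""
    return [codec for codec in offered if codec in supported]


proxy = ProxyServer()
proxy.register("sip:bob@example.com", "sip:bob@pc12.example.com")

# The caller's request names Bob's public address; the proxy resolves it.
target = proxy.route("sip:bob@example.com")

# Bob's endpoint answers the caller's offer with the codecs it shares.
agreed = answer_codecs(["opus", "PCMU", "PCMA"], {"PCMA", "PCMU"})
print(target, agreed)
```

Keeping the codecs in the order of the offer preserves the caller's stated preference, which is a common but not mandatory choice for the answering side.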
The future of SIP and H.323
While H.323 is still in use in many legacy systems, SIP has become the de facto standard for existing and emerging voice and video communications, thanks to its flexibility and the near-ubiquity of IP networks.
SIP-based cloud services for audio conferencing, telephony, PSTN access and video conferencing are widely available. SIP will continue as the dominant protocol for multimedia session establishment and maintenance for the foreseeable future.