IP network design, part 3: Designing the WAN

The wide area network (WAN) is the single biggest contributor to a corporate network's cost of ownership. Therefore, this is the area where the cost-versus-performance tradeoff is most pronounced and most critical. This article explores the various alternatives that must be evaluated when choosing and designing a WAN infrastructure. The different topological and technological options are discussed in terms of how they relate to the fundamental WAN design goals. Traditional technological alternatives that include synchronous serial lines, frame relay and ATM will be discussed along with more state-of-the-art options such as DSL and MPLS.

Choosing the WAN technology


Synchronous serial lines

Clear channel leased lines are the simplest and most traditional method of interconnecting geographically dispersed sites; however, this is also the most expensive method. The main advantage of synchronous leased lines relates to their technological simplicity. This means that less expertise is required to install and troubleshoot the technology, which can ultimately reduce support costs.

Point to point serial links are also characterized by minimal overhead, thus increasing the effective throughput and eliminating extra contributors to delay and jitter (i.e. the variation in delay). Serial links of sufficient bandwidth have the potential to exhibit excellent quality of service (QoS) characteristics. The main contributors to delay and jitter on a serial link are the queuing and packet serialization procedures at the router.

Serialization delay can be experienced when a small packet is waiting for a large packet to be sent over the link. This type of delay is more likely on low-speed links, so one remedy is simply to buy more bandwidth. Bandwidth budgets, however, always have a ceiling, and there are more cost-effective methods of reducing delay and jitter on serial links. Sophisticated queuing technologies fragment large packets and give higher priority to small packets, thus ensuring a more uniform delay profile on the serial link. This is particularly important for delay-sensitive real-time applications such as packetized voice, video and multimedia.

The ultimate disadvantage of serial leased lines is cost, so much so that many sections of the industry now regard them as an inefficient use of expensive bandwidth. This has fuelled a migration from serial leased line technology to packet-switched technology such as frame relay, or ATM cell relay for higher bandwidth requirements.
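To put rough numbers on the serialization delay described above, the following sketch computes how long one packet occupies a link. The packet sizes and link speeds are illustrative assumptions, not figures from this article.

```python
def serialization_delay_ms(packet_bytes: int, link_bps: int) -> float:
    """Time needed to clock one packet onto the wire, in milliseconds."""
    return packet_bytes * 8 / link_bps * 1000


# Hypothetical cases: a full-size data packet and a small fragment
# on a 64 kbps line, then the same data packet on a T-1.
for size, rate in ((1500, 64_000), (80, 64_000), (1500, 1_544_000)):
    delay = serialization_delay_ms(size, rate)
    print(f"{size:>5} bytes at {rate:>9} bps: {delay:7.2f} ms")
```

On the slow link, a small voice packet queued behind the 1500-byte packet waits for the whole transmission to finish, which is why fragmenting large packets gives a more uniform delay profile.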

Frame relay

The frame relay protocol is run between the customer's router or FRAD (frame relay access device) and the local frame relay switch that typically belongs to the service provider. A permanent virtual circuit (PVC) is used for inter-site connectivity. The PVC is termed "permanent" since the end points are always the same just like a leased line. The word "virtual" is used since there is no dedicated physical connection along the entire path through the carrier's network. Instead, the carrier programs its switches to ensure that, for example, traffic entering the frame relay network from Site A will exit the carrier's network at Site B. Thus, at a very basic level, this may seem similar to the use of a leased line to connect Site A to Site B.

However, there are a number of fundamental and far-reaching differences. Frame relay incurs additional overhead by virtue of the fact that it is a packet-switched technology. The fact that there is no dedicated physical circuit along the entire path enables the carrier to provide a flexible bandwidth offering that may prove cost-effective for the customer. A frame relay service entails the purchasing of a committed information rate (CIR) for each PVC.

The CIR is the end-to-end bandwidth that the carrier will guarantee. The customer can also purchase an additional burst rate and this is the maximum traffic rate that will be supported across the PVC. Obviously, the maximum possible burst rate is the physical speed of the customer's access circuit into the frame relay service provider. However, the carrier does not guarantee that traffic will be transmitted at rates exceeding the CIR.

Once this rate is exceeded, all subsequent frames are marked "Discard Eligible" by setting the DE bit in the frame relay header. This is performed at the local frame relay switch. If congestion is detected at a node in the frame relay network, frames marked DE are the first to be dropped. Upon detecting congestion, a frame relay switch will send a Backward Explicit Congestion Notification (BECN) back to the source. If the sending router or FRAD has sufficient intelligence to process this message, then it may throttle the sending rate back to the CIR. A customer can therefore tailor the choice of CIR and maximum burst rate in order to attain a cost-effective bandwidth profile that adequately supports the application's requirements.
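The marking behavior described above can be sketched as a simple per-interval policer. This is a simplified model, not a carrier's actual algorithm: the one-second measurement interval, the verdict names and the decision to drop traffic above the burst rate are assumptions made for illustration.

```python
def police_interval(frame_bits, cir_bps, burst_bps, interval_s=1.0):
    """Classify the frames offered during one measurement interval.

    Frames that fit within the CIR are forwarded untouched, frames
    between the CIR and the burst rate are forwarded with DE set, and
    frames beyond the burst rate are dropped (assumed behavior).
    """
    committed = cir_bps * interval_s   # bits guaranteed in this interval
    ceiling = burst_bps * interval_s   # most bits the PVC will accept
    accepted = 0
    verdicts = []
    for bits in frame_bits:
        if accepted + bits <= committed:
            verdicts.append("forward")
            accepted += bits
        elif accepted + bits <= ceiling:
            verdicts.append("forward-DE")
            accepted += bits
        else:
            verdicts.append("drop")
    return verdicts


# Hypothetical PVC: 64 kbps CIR, 80 kbps burst, six 16,000-bit frames.
print(police_interval([16_000] * 6, cir_bps=64_000, burst_bps=80_000))
```

In this example the first four frames fit within the CIR, the fifth is carried with DE set, and the sixth exceeds the burst rate.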

Applications that use TCP will be more resilient to dropped packets and therefore will incur less of a performance hit than unreliable UDP-based applications. For applications such as voice, an excessive percentage of dropped packets will hinder voice quality.

There is an additional problem that may also be perceptible when running voice traffic at rates in excess of the CIR. Apart from dropping DE traffic during times of congestion, the frame relay switch may simply buffer it with a low priority. This means that the traffic may reach the destination, but with a large delay and with jitter, which has a serious effect on the quality of voice or any real-time playback. It should be taken as a general rule to avoid running real-time traffic at rates in excess of the CIR. This is pertinent because, although frame relay services may entail certain bandwidth guarantees, they do not feature any latency guarantees. This may necessitate using different PVCs for real-time and non-real-time traffic.

A frame relay network can provide resilience in a cost-effective manner. Backup PVCs can be employed that have a lower CIR than the equivalent primary PVC. Such a backup PVC should ideally reside in a different cable duct to the local frame relay switch, as it is important to ensure that the resilience is not just theoretical.

Asynchronous transfer mode

ATM is a compromise technology designed to combine the consistency in bandwidth and delay that's associated with traditional clear channel TDM technology with the flexibility of packet switching. ATM's higher layers support the dynamic rerouting of SVCs using PNNI. It is also adaptable to bursty traffic conditions. Small fixed-length 53-byte cells serve to minimize the variation in delay or jitter experienced in the WAN. While ATM employs many similar principles to frame relay, the switching of small fixed-length cells coupled with QoS features inherent in the ATM protocol suite make it more suitable for heterogeneous and real-time applications.

ATM resource and QoS parameters

ATM offers the user a similar flexibility in bandwidth offering to that of frame relay. With ATM, a sustainable cell rate (SCR) and a peak cell rate (PCR) can be purchased from the service provider. This is very much equivalent to the CIR and the burst rate above it (the excess information rate, or EIR) in frame relay. Thus, as with frame relay, the customer has a certain amount of control over access speeds and can tailor this to application requirements.

ATM incorporates QoS parameters apart from the traffic parameters relating to cell rate. These can be requested at the user network interface and are intended to provide better service for various delay-sensitive and loss-sensitive applications.

  • Cell loss ratio (CLR): This is the ratio of dropped cells to the total number of cells sent across the connection. This ought to be a very small number. A maximum CLR may be requested for an application that is sensitive to packet loss, such as a UDP-based data application.
  • Cell delay variation (CDV): The CDV is the average variation in delay across the ATM connection over a specific time interval. A maximum CDV value may be requested from the ATM network for applications that do not tolerate a large variation in delay such as voice and video.
  • Cell transfer delay (CTD): The CTD is the total end-to-end latency or delay across the ATM connection. This value may be set for time-sensitive voice or data applications.
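As a rough illustration of the three parameters above, the sketch below derives them from measured samples. The function names and sample values are hypothetical, and CDV is computed here as the mean absolute deviation from the average delay, which is only one way to read "average variation in delay".

```python
from statistics import mean


def cell_loss_ratio(cells_sent: int, cells_received: int) -> float:
    """Dropped cells as a fraction of all cells sent."""
    return (cells_sent - cells_received) / cells_sent


def cell_transfer_delay(delays_ms: list) -> float:
    """Average end-to-end delay over the sample window."""
    return mean(delays_ms)


def cell_delay_variation(delays_ms: list) -> float:
    """Mean absolute deviation from the average delay (assumed metric)."""
    avg = mean(delays_ms)
    return mean([abs(d - avg) for d in delays_ms])


delays = [10.0, 12.0, 11.0, 15.0, 12.0]  # hypothetical samples in ms
print("CLR:", cell_loss_ratio(1_000_000, 999_990))
print("CTD:", cell_transfer_delay(delays), "ms")
print("CDV:", cell_delay_variation(delays), "ms")
```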

ATM also supports a number of fundamentally different classes of service that relate to how bandwidth is allocated on the ATM network. The ATM Forum has specified four service categories:

  • Constant bit rate (CBR): This service category ensures a constant bit rate across the ATM PVC. Constant bit rate is a prerequisite for high quality voice and video transmission. This is the most expensive type of service on a public ATM network since the provider must allocate sufficient bandwidth along the entire path of the PVC in order to meet the specification. The constant bit rate is equivalent to the SCR value purchased from the service provider. If traffic is sent across the PVC at a rate in excess of the SCR, cells may be dropped during times of congestion in the ATM network. The cell loss priority (CLP) bit in the ATM header can determine what traffic is dropped in such instances.
  • Variable bit rate (VBR): The bit rate can vary in line with network conditions with this service category. A predefined maximum PCR can be achieved across the PVC when network congestion is completely absent, which of course cannot be guaranteed. An average throughput can be negotiated between the ATM access device and the switch for a particular time interval. A guaranteed maximum bit rate can also be negotiated for a short time interval. The VBR class of service is suitable for bursty data applications that are not particularly time-sensitive. VBR has a standard defined for non-real time traffic, which is termed VBR-NRT and this is very typically used for the transport of data traffic. Its "real-time" equivalent, VBR-RT was more recently ratified. VBR-RT is recognized as the most efficient way to send voice, video and other delay sensitive traffic across ATM. This is because it has been recognized that voice and video, while delay- and jitter-sensitive, are not constant bit rate by nature. Video protocols only send delta frames and statistically voice calls are around 30% silence.
  • Available bit rate (ABR): ABR is a specific type of variable bit rate. A feedback loop is implemented between the ATM switch and router (or whatever ATM adapter is accessing the network). The adapter requests a particular bit rate but will accept whatever the current network utilization permits. If the bit rate provided by the switch is lower than the requested rate, then the switch may increase it after a certain time interval when the network is underutilized. Similarly, if the original requested rate is granted by the switch, the switch may subsequently reduce that rate if the network utilization grows. Despite the apparent complexity of ABR, it is less expensive than the CBR or VBR class of service since there is only a limited guaranteed allocation of bandwidth.
  • Unspecified bit rate (UBR): There is absolutely no guarantee of bit rate with UBR. All cells sent by the access device may be dropped by the network or may be successfully transported to the destination. The actual throughput achieved depends entirely on network conditions. For this reason UBR is frequently compared to "flying on standby."

The ATM adaptation layer (AAL) prepares cells for transmission over the ATM network. At the sending end, variable length packets are segmented into fixed-length cells and are reassembled at the receiving end. This particular function of the ATM adaptation layer is called Segmentation and Reassembly (SAR).

Different AAL protocols are defined in order to support optimized transport of traffic types that have different requirements and characteristics. There are five different AAL protocols, which differ in terms of bit rate profile, connection-oriented/connectionless nature and timing characteristics.

The most commonly used ATM adaptation layer encapsulations are AAL1 and AAL5. AAL1 is connection-oriented and provides constant bit rate. Constant delay is achieved by implementing connection timing end to end between the source and destination. This constant bit rate and delay makes AAL1 ideal for delay-sensitive applications such as voice and video. AAL5 is the most popular ATM adaptation layer protocol used for data transmission. AAL5 is connection-oriented and supplies a variable bit rate.
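The segmentation step described above has a measurable cost. The sketch below estimates how many 53-byte cells an AAL5 packet needs, assuming the standard 5-byte cell header (48 bytes of payload per cell) and the 8-byte AAL5 trailer, neither of which is spelled out in this article. Any LLC/SNAP encapsulation is ignored for simplicity.

```python
CELL_BYTES = 53      # fixed ATM cell size
PAYLOAD_BYTES = 48   # cell size minus the 5-byte header
AAL5_TRAILER = 8     # trailer appended before padding


def aal5_cells(packet_bytes: int) -> int:
    """Cells needed once trailer and padding are added (ceiling division)."""
    return (packet_bytes + AAL5_TRAILER + PAYLOAD_BYTES - 1) // PAYLOAD_BYTES


def wire_bytes(packet_bytes: int) -> int:
    """Total bytes actually sent across the ATM link."""
    return aal5_cells(packet_bytes) * CELL_BYTES


for size in (40, 64, 1500):  # hypothetical packet sizes
    cells = aal5_cells(size)
    efficiency = size / wire_bytes(size)
    print(f"{size:>4} bytes -> {cells:>2} cells, {efficiency:.1%} efficient")
```

Small packets that just spill into a second cell fare worst, which is worth bearing in mind when sizing a PVC for traffic dominated by short packets.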

The type of AAL protocol that is to be used for an ATM PVC is selected and configured on the router and ATM switch. Different AAL protocols can be run on different PVCs. Thus, a particular PVC could be used for voice and video traffic and another PVC could be dedicated to data. The ability to support different AAL protocols makes ATM a suitable protocol in supporting applications that have different characteristics and networking requirements. Apart from the inherent delay parameters that can be requested, the ATM transmission profile can also be tailored in other ways to enable the support of traffic types with different transport requirements.

Some commentators favor the use of the Cell Loss Priority bit to give a higher priority to delay-sensitive applications such as voice and video. The network will drop traffic with the CLP bit set if the transmission rate is greater than the SCR and congestion is detected. The only advantage to setting it at the ingress to the network is that it affords the customer some control over which cells are marked CLP. If, for example, it were decided to mark delay-sensitive UDP-based voice traffic CLP so that it gets dropped rather than delayed, then this traffic will always be marked CLP regardless of traffic conditions. Ultimately it may simply mean that other customers are getting traffic through the carrier's network at your expense.

ATM is typically used for WAN speeds in excess of T-1/E-1 and scales up to the SONET multiples of 155Mbps. Its niche market is therefore for high bandwidth requirements and networks that have stringent QoS specifications.

Digital subscriber line (DSL) 

DSL is a family of access technologies that use high transmission frequencies and advanced modulation techniques to deliver high bandwidth over conventional and legacy copper cabling at limited distances.

Asymmetric DSL (ADSL) is the most popular deployment of DSL. It is designed to coexist with POTS in the local loop to the central office, by using higher frequencies for data transmission and reserving the sub-4kHz range for legacy PSTN voice. Therefore no change is required in the local loop connection. The residential unit or small branch office has a DSL modem installed along with a frequency splitter that separates voice and data based on frequency. The DSL connection terminates on a DSL Access Multiplexor (DSLAM) at the Central Office. The DSLAM allows the service provider to divert voice traffic to the PSTN and data traffic to the Internet.

ADSL supports a maximum downstream speed of 1.5Mbps and up to 640kbps upstream. This asymmetry is deemed more efficient because the bandwidth requirement at remote offices and for residential users is usually greater downstream. The ADSL router or modem should be no more than 18,000 feet from the central office due to the attenuation in the local loop. Other variants within the DSL family include Symmetric DSL, which offers approximately 1.1Mbps in each direction with a distance limitation of 12,000 feet.

Very high rate DSL (VDSL) can offer increased bandwidth in both directions but has a short reach, which is the main reason it never became a standard.

What has become an ITU standard is G.SHDSL (single-pair high-speed DSL, ITU-T G.991.2). This supplies multi-rate service (between 192kbps and 2.3Mbps in each direction) and approximately 30% longer reach than most currently deployed DSL technologies.

In summary, DSL offers an efficient and cost-effective access technology for branch offices and small office/home office (SOHO) sites. Connections at reasonable speeds can be made to HQ and other offices via the Internet. DSL can also be used for backup purposes and in many respects is replacing ISDN as a remote access technology that provides higher bandwidth and increased cost-effectiveness. One clear limitation of DSL is that, since it uses the Internet for connecting to remote sites, QoS cannot be guaranteed unless a Service Level Agreement (SLA) is purchased from the ISP.

Multiprotocol Label Switching (MPLS)

In conventional Layer 3 forwarding, as a packet traverses the network, each router extracts forwarding information from the Layer 3 header. Header analysis is repeated at each router (hop) through which the packet passes.

In an MPLS network, packets are forwarded based on labels. Each IP network that is reachable through an interface is assigned a unique label. A mapping is established between an incoming label and an outgoing label. This is maintained in the Label Forwarding Information Base (LFIB) table. Each node examines the incoming label, does a table lookup, swaps the incoming label for the outgoing label and then forwards the packet out of the outgoing interface. The use of such tables allows the MPLS network to build a Label Switched Path (LSP) across the network.
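The lookup-and-swap step can be sketched with one small table per router. The router names and label values below are hypothetical, and using None to mean "remove the label at the egress router" is an assumption made to keep the example short.

```python
# One LFIB per router: incoming label -> (outgoing label, next hop).
LFIBS = {
    "A": {17: (22, "B")},
    "B": {22: (35, "C")},
    "C": {35: (None, "egress")},  # None: label removed, packet leaves MPLS
}


def trace_lsp(router: str, label):
    """Follow the label swaps hop by hop and return the resulting LSP."""
    path = []
    while label is not None:
        out_label, next_hop = LFIBS[router][label]  # the table lookup
        path.append((router, label, out_label))     # the label swap
        router, label = next_hop, out_label
    return path


for hop, in_label, out_label in trace_lsp("A", 17):
    print(f"router {hop}: in {in_label} -> out {out_label}")
```

Each hop only consults its own table with the incoming label as the key, and the sequence of swaps is what forms the LSP.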

Figure 1 shows the details of the MPLS header. It is located between the Layer 2 header and the Layer 3 (IP) header. The EXP bits and the TTL field of the MPLS header can be copied from the IP header. The S bit marks the bottom of the label stack: it is set to 1 on the last label, so a value of 0 indicates that further MPLS labels follow in this packet.


Figure 1: The MPLS Label Header
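For readers without the figure to hand, the sketch below packs and unpacks a single label entry using the standard 32-bit layout: a 20-bit label, 3 EXP bits, the 1-bit S flag and an 8-bit TTL. The field widths come from the MPLS label stack encoding rather than from this article, and the function names and sample values are hypothetical.

```python
import struct


def pack_mpls(label: int, exp: int, bottom: bool, ttl: int) -> bytes:
    """Build one 4-byte label stack entry in network byte order."""
    word = (label << 12) | (exp << 9) | (int(bottom) << 8) | ttl
    return struct.pack("!I", word)


def unpack_mpls(data: bytes) -> dict:
    """Split the first 4 bytes of data into the four header fields."""
    (word,) = struct.unpack("!I", data[:4])
    return {
        "label": word >> 12,       # top 20 bits
        "exp": (word >> 9) & 0x7,  # 3 bits
        "s": (word >> 8) & 0x1,    # 1 bit, bottom of stack
        "ttl": word & 0xFF,        # low 8 bits
    }


entry = pack_mpls(label=17, exp=5, bottom=True, ttl=64)
print(entry.hex(), unpack_mpls(entry))
```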

A protocol is used between the routers in an MPLS network to assign labels to IP networks and exchange label information with other routers. The most commonly used protocol today is the Label Distribution Protocol (LDP, port 646), which is TCP-based and runs on the MPLS Label Switch Router (LSR).

The concept of an LSR is frequently used to describe MPLS devices. They run a routing protocol and thus have Layer 3 intelligence. However, once the LSP has been set up, core MPLS devices merely perform a label lookup when forwarding traffic. Thus they combine the intelligence of routing with the speed of switching. Figure 2 illustrates some of the basic operations executed by an MPLS network.


Figure 2: Label Switching Overview

MPLS offers a cost-effective method for service providers to sell bandwidth on a shared network infrastructure rather than using dedicated TDM-based leased lines. The Quality of Service offered by TDM can be emulated through the use of high-priority label markings. The experimental (or EXP) field is used for this purpose and is generally copied from the IP Precedence field of the packet entering the network. For example, Voice over IP packets are usually marked with a high precedence value of 5, and this would correspond to an EXP value of 5 within the MPLS core network.
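The copy from IP Precedence to EXP amounts to taking the top three bits of the IP type-of-service byte. The sketch below shows that mapping; the function name is hypothetical, and the second sample value (0xB8, the byte produced by the DSCP "expedited forwarding" marking) is an illustration that goes beyond what this article covers.

```python
def exp_from_tos(tos_byte: int) -> int:
    """Return the 3-bit IP Precedence, which is copied into the EXP field."""
    return (tos_byte & 0xFF) >> 5


# 0xA0 is precedence 5 with the remaining bits clear; 0xB8 shares the
# same top three bits, so it also maps to EXP 5.
for tos in (0x00, 0xA0, 0xB8):
    print(f"ToS 0x{tos:02X} -> EXP {exp_from_tos(tos)}")
```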

The network traffic of different customers on the same service provider network is separated through the deployment of MPLS Virtual Private Networks (VPNs). It should be noted, however, that MPLS VPNs, unlike IPsec VPNs, do not offer encryption as standard.

MPLS is often regarded as the latest enhancement on frame relay and ATM. All three technologies provide access to a shared service provider infrastructure but MPLS injects more intelligence into the network by making all core devices IP-aware.

IP network design series

Part 1: Fundamental principles

Part 2: The IP addressing plan

Part 3: Designing the wide area network

Part 4: LAN design
