What is a network bottleneck?
A bottleneck, in a communications context, is a point in the enterprise network where the flow of data is impaired or stopped entirely. In effect, a bottleneck results when there is not enough data-handling capacity to accommodate the current volume of traffic.
A bottleneck can occur in the user network, in the storage fabric or within servers where there is excessive contention for internal resources, such as central processing unit (CPU) power, memory or input/output (I/O). As a result, data flow slows down to the speed of the slowest point in the data path. This slowdown affects application performance, especially for databases and other heavy transactional applications, and can even cause some applications to crash.
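The idea that data flow slows to the speed of the slowest point can be sketched in a few lines. This is a simplified model, not a measurement tool: the function name and the segment speeds are hypothetical, and it ignores protocol overhead, latency and contention.

```python
def path_throughput_mbps(segment_capacities_mbps):
    """Best-case end-to-end throughput of a path, in Mbps.

    Simplifying assumption: the path can move data no faster
    than its slowest segment allows.
    """
    if not segment_capacities_mbps:
        raise ValueError("path must contain at least one segment")
    return min(segment_capacities_mbps)


# Hypothetical path: server NIC, access switch port, uplink
print(path_throughput_mbps([1000, 100, 1000]))  # prints 100
```

However fast the other segments are, the 100 Mbps segment sets the ceiling for the whole path.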
What causes network bottlenecks?
A bottleneck frequently arises from poor network or storage fabric design. Mismatched hardware selection is a common cause. For example, if a workgroup server is fitted with a Gigabit Ethernet port but the switch port it connects to is only a legacy 10/100 Ethernet port, the slower switch port becomes a bottleneck for the server.
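To get a rough sense of what such a mismatch costs, the following sketch estimates the transfer time of a file at the two link speeds. The function name and the 1 GB file size are illustrative assumptions, and the calculation uses raw line rate only, so real transfers would take somewhat longer.

```python
def transfer_seconds(size_bytes, link_mbps):
    """Idealized transfer time: payload bits divided by line rate.

    Assumes decimal units (1 Mbps = 1,000,000 bits per second)
    and ignores protocol overhead and retransmissions.
    """
    if link_mbps <= 0:
        raise ValueError("link speed must be positive")
    return size_bytes * 8 / (link_mbps * 1_000_000)


one_gigabyte = 1_000_000_000  # hypothetical file size in bytes

print(transfer_seconds(one_gigabyte, 1000))  # Gigabit port: 8.0 seconds
print(transfer_seconds(one_gigabyte, 100))   # 10/100 port:  80.0 seconds
```

Under these assumptions, the legacy switch port makes the same transfer take ten times as long, regardless of the server's Gigabit interface.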
Another design flaw common to storage networks is excess fan-in, where multiple storage devices are connected to the same switch port to maximize the use of that port's bandwidth. For example, connecting multiple 4 Gbps Fibre Channel (FC) storage devices to the same switch port can easily overwhelm the port and cause performance problems if several of the devices are active simultaneously. In many cases, bottlenecks develop over time because network administrators fail to track the demands of increased network and storage traffic.
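Fan-in can be expressed as a simple ratio of attached device bandwidth to port bandwidth. The sketch below assumes a hypothetical case of three 4 Gbps devices sharing one 4 Gbps switch port; the port speed and device count are illustrative, not taken from any particular fabric.

```python
def fan_in_ratio(device_speeds_gbps, port_speed_gbps):
    """Ratio of total attached device bandwidth to port bandwidth.

    A ratio above 1.0 means the port is oversubscribed if every
    device transmits at full speed at the same time.
    """
    if port_speed_gbps <= 0:
        raise ValueError("port speed must be positive")
    return sum(device_speeds_gbps) / port_speed_gbps


# Hypothetical: three 4 Gbps FC devices on one 4 Gbps switch port
ratio = fan_in_ratio([4, 4, 4], 4)
print(ratio)  # prints 3.0
```

A ratio above 1.0 is not automatically a problem, since devices are rarely all busy at once, but the higher the ratio, the more likely simultaneous activity is to overwhelm the port.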
Bottlenecks can also develop due to poor or suboptimal configuration of switches or host bus adapters (HBAs). For example, using multiple FC ports to connect devices within the storage switching fabric can improve storage availability and performance, but if the devices are not configured for load balancing, much of the benefit will be lost. Hardware failures can create bottleneck conditions as well. In the previous example, suppose that one of two FC links fails. Although failover should keep the storage device accessible, all the traffic that used to be carried by two links now travels over one, potentially resulting in a bottleneck if the combined traffic exceeds the bandwidth of a single link.
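The failover case reduces to a capacity check: does the combined traffic still fit on the links that remain? The sketch below is a minimal model that assumes traffic is spread evenly across surviving links; the function name, link speeds and load figures are hypothetical.

```python
def survives_failover(link_loads_gbps, link_capacity_gbps, failed_links=1):
    """Return True if the surviving links can carry the combined load.

    Assumes identical links and an even split of traffic across
    whichever links remain after the failure.
    """
    surviving = len(link_loads_gbps) - failed_links
    if surviving <= 0:
        return False  # no path left at all
    load_per_link = sum(link_loads_gbps) / surviving
    return load_per_link <= link_capacity_gbps


# Two hypothetical 4 Gbps links, each carrying 2.5 Gbps
print(survives_failover([2.5, 2.5], 4.0))  # prints False: 5.0 Gbps on one link

# The same links, each carrying 1.5 Gbps
print(survives_failover([1.5, 1.5], 4.0))  # prints True: 3.0 Gbps on one link
```

Running this kind of check against peak load figures, rather than averages, gives a more conservative picture of whether redundancy will hold up under failure.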
How can network bottlenecks be identified and fixed?
Bottlenecks are typically located by systematically testing network performance at various devices along a data path and isolating the devices that are performing noticeably slower than other points in the network. Once identified, the bottleneck can usually be resolved by reconfiguring, upgrading or replacing the offending device. At the network level, this may involve upgrading a switch or HBA. For servers, a CPU or memory upgrade may help, or the server may need to be replaced entirely with a newer dual- or quad-CPU server. Bottlenecks can often be avoided by proactively monitoring traffic load trends over time and implementing improvements before serious problems develop.
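Both halves of this approach, isolating slow devices and watching load trends, can be sketched with simple arithmetic. The example below assumes that throughput and utilization figures have already been collected by some monitoring tool. The device names, the 50% cutoff and the straight-line trend are illustrative assumptions rather than recommended values.

```python
from statistics import median


def find_slow_points(throughput_by_device, fraction=0.5):
    """Names of devices whose measured throughput falls well below
    the median of all measured points (below fraction * median)."""
    typical = median(throughput_by_device.values())
    return sorted(
        name
        for name, mbps in throughput_by_device.items()
        if mbps < fraction * typical
    )


def periods_until_limit(utilization_samples, limit_percent):
    """Rough number of sampling periods until utilization reaches
    the limit, using a straight line through the first and last
    samples. Returns None if the load is flat or falling."""
    if len(utilization_samples) < 2:
        raise ValueError("need at least two samples")
    slope = (utilization_samples[-1] - utilization_samples[0]) / (
        len(utilization_samples) - 1
    )
    if slope <= 0:
        return None
    return max(0.0, (limit_percent - utilization_samples[-1]) / slope)


# Hypothetical measurements in Mbps along one data path
measured = {
    "server-nic": 940,
    "edge-switch": 95,
    "core-switch": 930,
    "storage-array": 910,
}
print(find_slow_points(measured))  # prints ['edge-switch']

# Hypothetical weekly utilization of one link, in percent
print(periods_until_limit([40, 45, 50, 55], 80))  # prints 5.0
```

In this made-up data set, the edge switch stands out as the candidate for reconfiguration or replacement, and the link would reach 80% utilization in about five more weeks if the growth stayed linear.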