The Distributed Replicated Block Device (DRBD) offers an easy, affordable way to implement block-level synchronization between disks over the network. Using DRBD can bring huge financial benefits over purchasing a proprietary SAN solution, but if the setup goes awry, you can lose big. Here are the five most common mistakes administrators make when performing a DRBD setup.
1. Selecting the wrong DRBD mode
DRBD can be configured in two ways: dual primary mode or primary/secondary mode. In dual primary mode, both nodes can access blocks on the DRBD device at the same time, and synchronization runs in both directions. In primary/secondary mode, one node is active and the other is passive. Dual primary mode is powerful, but it requires a file system that supports simultaneous writes from both primaries. The OCFS2 and GFS file systems are commonly used for this, but setting them up requires a cluster stack. If you don't need dual primary mode, select primary/secondary mode instead; it keeps the setup as simple as possible and avoids future headaches.
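As a rough illustration of the difference: dual primary mode is normally enabled with the allow-two-primaries option in a resource's net section, so a resource file without it runs in primary/secondary mode. The sketch below is a hypothetical helper, not part of DRBD. The function name drbd_mode is made up, and it assumes the option sits on its own line.

```sh
# Hypothetical helper (not part of DRBD): report which mode a resource file selects.
# Assumes allow-two-primaries sits on its own line inside the net section.
drbd_mode() {
    conf_file=$1
    # Commented-out lines are skipped because the option must start the line;
    # an explicit "allow-two-primaries no;" counts as primary/secondary.
    if grep -E '^[[:space:]]*allow-two-primaries' "$conf_file" |
        grep -Evq '[[:space:]]no[[:space:]]*;'; then
        echo "dual-primary"
    else
        echo "primary/secondary"
    fi
}
```

It could be called as "drbd_mode /etc/drbd.d/r0.res", where the path is only an example; use whatever file holds your resource definition.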
2. Mishandling a disconnected state
If the connection between the nodes is lost, your DRBD might enter standalone mode, or “disconnected state.” In this state, the nodes no longer replicate to each other, and DRBD cannot determine which node has the most reliable set of data. The only way out of the disconnected state is to restore the connection manually, which means choosing one node and discarding all modifications made on it. On that sacrificial node, enter the following commands:
drbdadm secondary <resource>
drbdadm -- --discard-my-data connect <resource>
Next, on the node that is going to be set as the primary, enter “drbdadm connect <resource>”. Then, set this node as primary using “drbdadm primary <resource>” and proceed as usual.
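The recovery steps above can be wrapped in two small functions, one per node. This is only a sketch: the function names, the run wrapper and the DRY_RUN switch are made up for illustration, and the drbdadm commands are the ones given in this section. Setting DRY_RUN=1 prints the commands instead of running them, which is a safer way to check them first.

```sh
# Sketch only: wraps the commands from this section in two functions.
# Set DRY_RUN=1 to print the commands instead of running them.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Run on the sacrificial node: its local modifications are thrown away.
discard_local_changes() {
    run drbdadm secondary "$1" &&
    run drbdadm -- --discard-my-data connect "$1"
}

# Run on the node whose data you keep.
reconnect_and_promote() {
    run drbdadm connect "$1" &&
    run drbdadm primary "$1"
}
```

For example, "DRY_RUN=1 discard_local_changes r0" prints the two commands for a resource named r0, which is a placeholder name.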
3. Using unsynchronized devices
After setting up DRBD, let the devices involved synchronize completely before you use them. You can't build anything on a setup that hasn't been fully synchronized. Check the progress by entering “service drbd status”, which shows the current state and synchronization speed of each resource. Once the status indicates that both nodes are up to date, you're ready to continue and create a file system on top of the DRBD.
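If you want to script that check, one option is to test the disk-state field in the status output. The sketch below assumes a DRBD 8.x-style status file such as /proc/drbd, where each device line carries a field like ds:UpToDate/UpToDate. The function name is hypothetical, and other DRBD versions may report status differently.

```sh
# Sketch: succeed only when every device line in a DRBD 8.x-style status file
# reports ds:UpToDate/UpToDate. The default path /proc/drbd is an assumption.
drbd_all_up_to_date() {
    status_file=${1:-/proc/drbd}
    # Lines with a "ds:" (disk state) field describe devices; any other
    # disk state, such as UpToDate/Inconsistent, marks the setup as stale.
    awk '/ds:/ { seen = 1; if ($0 !~ /ds:UpToDate\/UpToDate/) stale = 1 }
         END   { exit (seen && !stale) ? 0 : 1 }' "$status_file"
}
```

A guard such as "drbd_all_up_to_date && echo ready" then runs the second command only after synchronization has finished.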
4. Failing to integrate DRBD in HA Clustering
To automate resource failover, it's a good idea to integrate DRBD in a high availability (HA) cluster. The HA cluster monitors the current DRBD master and makes sure another node becomes DRBD master if the original goes down. For this to work, the cluster, not the init system, must control DRBD, so take the drbd service out of your runlevels with the command “chkconfig drbd off”. Then let the cluster start DRBD instead of starting it from each local node.
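What the cluster side looks like depends on the cluster stack, which this tip does not name. As one possible sketch, assuming Pacemaker with the crm shell and the ocf:linbit:drbd resource agent, the function below prints a master/slave definition for a single DRBD resource. The function name, the p_drbd_ and ms_drbd_ prefixes and the monitor intervals are illustrative choices, not requirements.

```sh
# Sketch: print a Pacemaker (crm shell) definition for one DRBD resource.
# Assumes Pacemaker with the ocf:linbit:drbd agent; all names are illustrative.
print_drbd_crm_config() {
    res=$1
    cat <<EOF
primitive p_drbd_$res ocf:linbit:drbd params drbd_resource="$res" op monitor interval="29s" role="Master" op monitor interval="31s" role="Slave"
ms ms_drbd_$res p_drbd_$res meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
EOF
}

print_drbd_crm_config r0
```

Review the printed lines before entering them in the crm configure shell; r0 is a placeholder for your own resource name.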
5. Setting the wrong synchronization speed
The most important parameter in a DRBD setup is the syncer speed, which sets the rate, in MBps, at which the two block devices involved in DRBD synchronize. Often, a default value of 7 MBps is used, which is sized for a network with a speed of approximately 100 Mbps. As most modern networks are faster, you might get a better synchronization speed if you increase this parameter to 70 MBps. This works well if you want synchronization to use a large part of the bandwidth that a gigabit network offers.
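To see how these numbers relate, it helps to convert the sync rate from megabytes to megabits and compare it with the link speed. The sketch below does that with integer arithmetic; the function name is made up, and the calculation ignores protocol overhead and the difference between MB and MiB.

```sh
# Sketch: what share of a link's raw bandwidth does a given sync rate use?
# Integer arithmetic; ignores protocol overhead and the MB/MiB distinction.
sync_share_percent() {
    rate_mbytes=$1   # sync rate in MBps (megabytes per second)
    link_mbits=$2    # link speed in Mbps (megabits per second)
    echo $(( rate_mbytes * 8 * 100 / link_mbits ))
}

sync_share_percent 7 100     # prints 56
sync_share_percent 70 1000   # prints 56
```

Both settings take roughly the same share of their link, a little over half, which leaves headroom for regular replication traffic.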
ABOUT THE AUTHOR: Sander van Vugt is an independent trainer and consultant living in the Netherlands. Van Vugt is an expert in Linux high availability, virtualization and performance and has completed several projects that implement all three. He is also a regular speaker at Linux conferences all over the world and the author of various Linux-related books, such as Beginning the Linux Command Line, Beginning Ubuntu Server Administration and Pro Ubuntu Server Administration.