Debugging node eviction issues
One of the most common and complex tasks in RAC is performing the root cause analysis (RCA) of node evictions. A node is evicted from the cluster after it kills itself because it is no longer able to service the applications. This generally happens when communication between the instances fails, when an instance is not able to send heartbeat information to the control file, or for various other reasons.
During failures, to avoid data corruption, the failing instance evicts itself from the cluster group. The node eviction process is reported as Oracle error ORA-29740 in the alert log and LMON trace files. To determine the root cause, the alert logs and trace files should be carefully analyzed; this process may require assistance from Oracle Support. To understand the node eviction process in more depth, you need to understand the basics of node membership and Instance Membership Recovery (IMR), also referred to as Instance Membership Reconfiguration.
Instance Membership Recovery
When a communication failure occurs between the instances, or when an instance is not able to issue the heartbeat information to the control file, the cluster group may be in danger of possible data corruption. In addition, if no mechanism is present to detect such failures, the entire cluster can hang. To address the issue, IMR was introduced in Oracle 9i and improved in Oracle 10g. IMR removes the failed instance from the cluster group. When only a subset of a cluster group survives a failure, IMR ensures that the larger partition group survives and kills all other, smaller groups.
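The "larger partition group survives" rule can be illustrated with a minimal Python sketch. It is not Oracle's implementation: the function names are hypothetical, and the tie-break used when two groups are the same size (the group holding the lowest instance number wins) is an assumption made only so the example is deterministic.

```python
from typing import FrozenSet, Iterable, List


def pick_surviving_partition(partitions: Iterable[Iterable[int]]) -> FrozenSet[int]:
    """Return the partition that survives under a 'largest group wins' rule.

    ASSUMPTION (not from the text): if two groups have the same size, the
    group containing the lowest instance number is kept.
    """
    groups: List[FrozenSet[int]] = [g for g in (frozenset(p) for p in partitions) if g]
    if not groups:
        raise ValueError("no partitions supplied")
    # Sort key: larger groups first, then the lowest member number first.
    return min(groups, key=lambda g: (-len(g), min(g)))


def evicted_instances(partitions: Iterable[Iterable[int]]) -> List[int]:
    """List the instances that belong to any non-surviving partition."""
    parts = [frozenset(p) for p in partitions]
    survivor = pick_surviving_partition(parts)
    return sorted(i for p in parts for i in p if i not in survivor)
```

For example, if a four-instance cluster splits into the groups [1, 2, 3] and [4], the sketch keeps the first group and reports instance 4 as evicted.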
IMR is a part of the service offered by Cluster Group Services (CGS). LMON is the key process that handles many of the CGS functionalities. As you know, cluster software (known as Cluster Manager, or CM) can be a vendor-provided or Oracle-provided infrastructure tool. CM facilitates communication between all nodes of the cluster and provides information on the health of each node—the node state. It detects failures and manages the basic membership of nodes in the cluster. CM works at the cluster level and not at the database or instance level.
Inside RAC, the Node Monitor (NM) provides information about nodes and their health by registering and communicating with the CM. NM services are provided by LMON. Node membership is represented as a bitmap in the GRD. A value of 0 denotes that a node is down and a value of 1 denotes that the node is up. There is no value to indicate a "transition" period such as during bootup or shutdown. LMON uses the global notification mechanism to let others know of a change in the node membership. Every time a node joins or leaves a cluster, this bitmap in the GRD has to be rebuilt and communicated to all registered members in the cluster.
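To make the bitmap idea concrete, here is a small Python sketch of a membership bitmap that holds only the values 0 (down) and 1 (up) and notifies subscribers whenever a node's state changes. The class and method names are hypothetical and chosen for illustration; the real GRD structure and LMON notification mechanism are internal to Oracle and are not modeled here.

```python
from typing import Callable, List


class MembershipBitmap:
    """Toy node-membership bitmap: 1 = node up, 0 = node down.

    There is deliberately no third 'transition' state.
    """

    def __init__(self, max_nodes: int) -> None:
        self._bits: List[int] = [0] * max_nodes
        self._listeners: List[Callable[[List[int]], None]] = []

    def subscribe(self, callback: Callable[[List[int]], None]) -> None:
        """Register a callback that receives a copy of the bitmap on change."""
        self._listeners.append(callback)

    def set_state(self, node: int, up: bool) -> bool:
        """Mark a node up or down; return True only if the bitmap changed."""
        new_bit = 1 if up else 0
        if self._bits[node] == new_bit:
            return False
        self._bits[node] = new_bit
        snapshot = list(self._bits)
        for callback in self._listeners:
            callback(snapshot)  # stand-in for the global notification
        return True

    def up_nodes(self) -> List[int]:
        """Return the node numbers currently marked as up."""
        return [n for n, bit in enumerate(self._bits) if bit == 1]
```

A caller could subscribe a function that logs each new bitmap, then call `set_state(0, True)` when node 0 joins; a repeated call with the same state changes nothing and sends no notification.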
Node membership registration and deregistration are done in a series of synchronized steps, a topic beyond the scope of this chapter. Basically, cluster members register with and deregister from a group. The important thing to remember is that NM always communicates with the other instances in the cluster about their health and status using the CM. In contrast, if LMON needs to send a message to LMON on another instance, it can do so directly without the help or involvement of CM. It is important to differentiate between cluster communication and RAC communication.
A simple extract from the alert log file about member registration is provided here:
Thu Jan 1 00:02:17 1970
alter database mount
Thu Jan 1 00:02:17 1970
lmon registered with NM - instance id 1 (internal mem no 0)
Thu Jan 1 00:02:17 1970
Reconfiguration started
List of nodes: 0,
Global Resource Directory frozen
Here you can see that this instance was the first to start up and that LMON registered itself with the NM interface, which is a part of the Oracle kernel.
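When scanning a long alert log for these registration messages, a short script can help. The following Python sketch extracts the instance ID and internal member number from lines shaped like the extract above. The regular expression is based only on that single sample line, so treat the pattern as an assumption that may need adjusting for other Oracle versions; the function name is hypothetical.

```python
import re
from typing import List, Tuple

# Pattern derived from the one sample line shown in the extract.
REGISTRATION = re.compile(
    r"lmon registered with NM - instance id (\d+) \(internal mem no (\d+)\)"
)


def find_lmon_registrations(alert_log_text: str) -> List[Tuple[int, int]]:
    """Return (instance_id, internal_member_no) for each registration line."""
    return [(int(inst), int(mem)) for inst, mem in REGISTRATION.findall(alert_log_text)]


sample = """Thu Jan 1 00:02:17 1970
alter database mount
Thu Jan 1 00:02:17 1970
lmon registered with NM - instance id 1 (internal mem no 0)
"""
registrations = find_lmon_registrations(sample)
```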
When an instance joins or leaves the cluster, the LMON trace of another instance shows the reconfiguration of the GRD:
kjxgmpoll reconfig bitmap: 0 1 3
*** 1970-01-01 01:20:51.423
kjxgmrcfg: Reconfiguration started, reason 1
You may find these lines together with other lines asking SMON to perform instance recovery. This happens when any instance crash occurs or when an instance departs the cluster without deregistering in a normal fashion:
Post SMON to start 1st pass IR
*** 1970-01-01 01:20:51.423
kjxgmpoll reconfig bitmap: 0 1 3
*** 1970-01-01 01:20:51.423
kjxgmrcfg: Reconfiguration started, reason 1
kjxgmcs: Setting state to 2 0.
*** 1970-01-01 01:20:51.423
Name Service frozen
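A similar approach works for LMON trace files. The Python sketch below pairs each "reconfig bitmap" entry with the "Reconfiguration started, reason N" entry that follows it. The patterns are inferred from the trace lines quoted above and are assumptions about the format rather than a documented specification; the function name is hypothetical, and the sketch makes no attempt to interpret what a given reason code means.

```python
import re
from typing import Dict, List, Optional

# One pattern with two alternatives, so matches come back in file order.
TRACE_EVENT = re.compile(
    r"kjxgmpoll reconfig bitmap:(?P<bitmap>(?: \d+)+)"
    r"|kjxgmrcfg: Reconfiguration started, reason (?P<reason>\d+)"
)


def parse_reconfig_events(trace_text: str) -> List[Dict[str, object]]:
    """Return one dict per reconfiguration, with its bitmap and reason code."""
    events: List[Dict[str, object]] = []
    pending_bitmap: Optional[List[int]] = None
    for match in TRACE_EVENT.finditer(trace_text):
        if match.group("bitmap") is not None:
            pending_bitmap = [int(n) for n in match.group("bitmap").split()]
        else:
            events.append({"bitmap": pending_bitmap, "reason": int(match.group("reason"))})
            pending_bitmap = None  # a bitmap belongs to one reconfiguration only
    return events


sample_trace = """Post SMON to start 1st pass IR
*** 1970-01-01 01:20:51.423
kjxgmpoll reconfig bitmap: 0 1 3
*** 1970-01-01 01:20:51.423
kjxgmrcfg: Reconfiguration started, reason 1
kjxgmcs: Setting state to 2 0.
"""
reconfig_events = parse_reconfig_events(sample_trace)
```

Because the pattern is applied to the whole text rather than line by line, the sketch also works when the trace has been flattened onto a single line.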
The CGS is present primarily to provide a coherent and consistent view of the cluster from an OS perspective. It tells Oracle how many nodes are in the cluster. It is designed to provide a synchronized view of the cluster instance membership. Its main responsibilities are to perform regular status checks of the members, to determine whether they are valid in the group, and, very importantly, to detect split-brain scenarios when communication fails.
Specific rules bind together members within the cluster group, which keeps the cluster in a consistent state:
Each member should be able to communicate without any problems with any other registered and valid member in the group.
Members should see all other registered members in the cluster as valid and have a consistent view.
All members must be able to read from and write to the control file.
So, when a communication failure occurs between the instances, or when an instance is not able to issue the heartbeat information to the control file, IMR is triggered. Without IMR, there would be no mechanism to detect these failures, and the entire cluster could hang.
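The three membership rules can be expressed as a simple consistency check. The Python sketch below is purely illustrative: the function name and its inputs (reachability sets, per-instance membership views, and a control file access flag) are hypothetical stand-ins for information that CGS gathers internally, and a non-empty result represents a condition under which IMR would be expected to step in.

```python
from typing import Dict, List, Set


def membership_violations(
    members: Set[int],
    can_reach: Dict[int, Set[int]],
    views: Dict[int, Set[int]],
    controlfile_ok: Dict[int, bool],
) -> List[str]:
    """Return a description of every rule broken; an empty list means healthy."""
    problems: List[str] = []
    for m in sorted(members):
        # Rule 1: each member must be able to communicate with every other member.
        unreachable = members - {m} - can_reach.get(m, set())
        if unreachable:
            problems.append(f"instance {m} cannot reach {sorted(unreachable)}")
        # Rule 2: each member must see the same set of valid members.
        if views.get(m, set()) != members:
            problems.append(f"instance {m} has an inconsistent membership view")
        # Rule 3: each member must be able to read and write the control file.
        if not controlfile_ok.get(m, False):
            problems.append(f"instance {m} cannot read/write the control file")
    return problems
```

For a healthy two-instance cluster the function returns an empty list; if instance 2 loses its interconnect path to instance 1, the list contains a rule 1 violation for instance 2.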
Oracle Database 10g: Real Application Clusters Handbook
About the book:
Oracle Database 10g: Real Application Clusters Handbook. Learn to implement Oracle real application clusters from the ground up. Maximize database availability, scalability, and efficiency. Find RAC concepts, administration, tuning, and troubleshooting information. You'll learn how to prepare and create Oracle RAC databases and servers, and automate administrative tasks. You'll also get full coverage of cutting-edge Oracle RAC diagnostic tools, backup and recovery procedures, performance tweaks, and custom application design strategies. Buy this book at McGraw-Hill/Osborne.
About the author:
K Gopalakrishnan is a senior principal consultant with the Advanced Technology Services group at Oracle Corporation, specializing exclusively in performance tuning, high availability, and disaster recovery. He is a recognized expert in Oracle RAC and Database Internals and has used his extensive expertise in solving many vexing performance issues all across the world for telecom giants, banks, financial institutions, and universities.