Methods to deal with a split-brain situation:
1. Redundant heartbeat paths
Network-port communication plus serial-port communication, so that one failed link does not make a live peer look dead.
2. I/O fencing
The remaining nodes cut the failed node off from shared storage, either by shutting down / rebooting it through its power port or by blocking its storage port (a fencing sketch follows this list).
3. Quorum disk
A quorum disk is a kind of I/O fencing, but the reboot is executed by the failed node's own quorum daemon. It also has an additional feature: contributing votes to the cluster. If you want the last standing node to keep a multiple-node cluster running, a quorum disk appears to be the only solution.
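For option 2, a minimal power-fencing sketch in cluster.conf terms, assuming a hypothetical APC power switch (the device name, address, and credentials below are placeholders, not taken from this cluster):

<fencedevices>
  <!-- hypothetical APC power switch used to power-cycle a failed node -->
  <fencedevice agent="fence_apc" name="apc" ipaddr="apc.example.com"
               login="fenceuser" passwd="fencepass"/>
</fencedevices>
<clusternode name="station1.example.com" nodeid="1">
  <fence>
    <!-- power-fence station1 via outlet 1 on the switch -->
    <method name="1">
      <device name="apc" port="1"/>
    </method>
  </fence>
</clusternode>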
RHCS (Red Hat Cluster Suite) Quorum disk facts
- A shared block device (SCSI, iSCSI, FC, ...); the device only needs to be approximately 10 MiB
- Supports a maximum of 16 nodes; node IDs must be sequentially ordered
- The quorum disk contributes votes; in a multiple-node cluster, these extra votes let the last standing node keep the cluster running
- Vote sizing rule: single node's votes + 1 <= quorum disk votes < total node votes (see the worked sketch after this list)
- Failure of the shared quorum disk won't result in cluster failure, as long as quorum disk votes < total node votes
- Each node writes its own health information to its own region of the disk; health is determined by external check programs such as "ping"
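To make the vote rules concrete, here is a small shell sketch using the same numbers as the three-node example below (the script is purely illustrative, not part of RHCS):

#!/bin/sh
# Vote arithmetic for a 3-node cluster with 2 votes per node.
# Quorum disk votes chosen so that: node_votes+1 <= qdisk_votes < total_node_votes
NODE_VOTES=2
NODES=3
QDISK_VOTES=3                                  # satisfies 2+1 <= 3 < 6
TOTAL_NODE_VOTES=$((NODE_VOTES * NODES))       # 6
EXPECTED=$((TOTAL_NODE_VOTES + QDISK_VOTES))   # 9
QUORUM=$((EXPECTED / 2 + 1))                   # 5, matches "Quorum: 5" in the output below
LAST_NODE=$((NODE_VOTES + QDISK_VOTES))        # 5 -> the last node standing stays quorate
echo "expected=$EXPECTED quorum=$QUORUM last-node-standing=$LAST_NODE"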
Set up quorum disk
# initialise the quorum disk once, on any one node
mkqdisk -c /dev/sdx -l myqdisk

Add quorum disk to cluster
Use luci or system-config-cluster to add the quorum disk; the following is the resulting XML in /etc/cluster/cluster.conf:
<clusternodes>
  <clusternode name="station1.example.com" nodeid="1" votes="2"><fence/></clusternode>
  <clusternode name="station2.example.com" nodeid="2" votes="2"><fence/></clusternode>
  <clusternode name="station3.example.com" nodeid="3" votes="2"><fence/></clusternode>
</clusternodes>

# expected votes = 9 = (total node votes + quorum disk votes) = (2+2+2) + 3
<cman expected_votes="9"/>
# health check results are written to the quorum disk every 2 seconds (interval="2")
# if health checks keep failing for tko=5 cycles (2*5 = 10 seconds), the node is rebooted by its own quorum daemon
# each heuristic runs every 2 seconds and earns its score of 1 if its program exits with 0
<quorumd interval="2" label="myqdisk" min_score="2" tko="5" votes="3">
  <heuristic interval="2" program="ping -c1 -t1 192.168.1.60" score="1"/>
  <heuristic interval="2" program="ping -c1 -t1 192.168.1.254" score="1"/>
</quorumd>
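For orientation, a minimal /etc/cluster/cluster.conf assembling the fragments above might look roughly like this (the cluster name "mycluster" and config_version are placeholders, and per-node fence devices are omitted):

<?xml version="1.0"?>
<cluster name="mycluster" config_version="2">
  <cman expected_votes="9"/>
  <clusternodes>
    <clusternode name="station1.example.com" nodeid="1" votes="2"><fence/></clusternode>
    <clusternode name="station2.example.com" nodeid="2" votes="2"><fence/></clusternode>
    <clusternode name="station3.example.com" nodeid="3" votes="2"><fence/></clusternode>
  </clusternodes>
  <quorumd interval="2" label="myqdisk" min_score="2" tko="5" votes="3">
    <heuristic interval="2" program="ping -c1 -t1 192.168.1.60" score="1"/>
    <heuristic interval="2" program="ping -c1 -t1 192.168.1.254" score="1"/>
  </quorumd>
</cluster>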
Start quorum disk daemon
The daemon is also one of the daemons started automatically by cman.
service qdiskd start
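To make qdiskd survive reboots, the standard RHEL service tooling applies; a quick sketch:

# enable qdiskd at boot on each node
chkconfig qdiskd on
# verify it is running
service qdiskd status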
Check quorum disk information
$ mkqdisk -L -d
mkqdisk v0.6.0
/dev/disk/by-id/scsi-1IET_00010002:
/dev/disk/by-uuid/55fbf858-df75-493b-a764-5640be5a9b46:
/dev/sdc:
	Magic:                eb7a62c2
	Label:                myqdisk
	Created:              Sat May  7 05:56:35 2011
	Host:                 station2.example.com
	Kernel Sector Size:   512
	Recorded Sector Size: 512

Status block for node 1
	Last updated by node 1
	Last updated on Sat May  7 15:09:37 2011
	State: Master
	Flags: 0000
	Score: 0/0
	Average Cycle speed: 0.001500 seconds
	Last Cycle speed: 0.000000 seconds
	Incarnation: 4dc4d1764dc4d176

Status block for node 2
	Last updated by node 2
	Last updated on Sun May  8 01:09:38 2011
	State: Running
	Flags: 0000
	Score: 0/0
	Average Cycle speed: 0.001000 seconds
	Last Cycle speed: 0.000000 seconds
	Incarnation: 4dc55e164dc55e16

Status block for node 3
	Last updated by node 3
	Last updated on Sat May  7 15:09:38 2011
	State: Running
	Flags: 0000
	Score: 0/0
	Average Cycle speed: 0.001500 seconds
	Last Cycle speed: 0.000000 seconds
	Incarnation: 4dc4d2f04dc4d2f0
The cluster is still running with the last node standing. Please note Total votes = quorum votes = 5 = 2 (the surviving node) + 3 (the quorum disk), which exactly meets Quorum: 5; if the quorum disk's votes had been less than (a single node's votes + 1), the last node could not have reached quorum and the cluster would not have survived.

$ cman_tool status
...
Nodes: 1
Expected votes: 9
Quorum device votes: 3
Total votes: 5
Quorum: 5
...
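During a failure test it is handy to watch these fields change as nodes drop out; a simple sketch using only the command above:

# re-run cman_tool every 2 seconds and show only the vote/quorum lines
watch -n2 'cman_tool status | grep -E "^(Nodes|Expected votes|Quorum device votes|Total votes|Quorum):"'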