This does the trick on Scientific Linux v4.3 — in no particular order:
apt-get install cman cman-kernel-smp
apt-get install gulm ccs
apt-get install gnbd gnbd-kernel-smp
apt-get install dlm dlm-kernel-smp
apt-get install magma-plugins
apt-get install fence
apt-get install lvm2-cluster
apt-get install GFS-kernel-smp GFS
Versions of cluster software must be the same on all nodes in the cluster (some versions are incompatible, which results in communication problems).
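Because mismatched versions lead to communication problems, it may be worth comparing package versions across nodes before going further. The following is only a minimal sketch: `versions_agree` is a hypothetical helper (not part of the cluster suite) that reads "node version" pairs on stdin and succeeds only if every node reports the same version string. How the pairs are collected (here, `rpm -q` over passwordless ssh) is an assumption.

```shell
#!/bin/sh
# versions_agree: hypothetical helper, not part of the cluster suite.
# Reads lines of the form "<node> <version>" on stdin and returns 0
# only if at least one line was read and all versions are identical.
versions_agree() {
    awk '
        NF >= 2 { seen[$2] = 1; n++ }
        END {
            count = 0
            for (v in seen) count++
            exit ((n > 0 && count == 1) ? 0 : 1)
        }
    '
}

# Example collection loop (assumes passwordless ssh and rpm on each node):
#   for n in R5-13 R5-14 R5-15; do
#       echo "$n $(ssh "$n" rpm -q cman)"
#   done | versions_agree && echo "cman versions match"
```

The same loop can be repeated for the other packages listed above (dlm, gnbd, GFS and so on).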
The packages install init symlinks, which sometimes cause Big Trouble, so turn them off:
chkconfig --level 35 ccsd off
chkconfig --level 35 cman off
chkconfig --level 35 lock_gulmd off
chkconfig --level 35 fenced off
chkconfig --level 35 clvmd off
chkconfig --level 35 gfs off
Load kernel modules:
modprobe cman
modprobe dlm
modprobe lock_dlm
modprobe gnbd
modprobe gfs
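To confirm that the modules really are loaded before starting the daemons, the `lsmod` listing can be checked against the names above. A minimal sketch, assuming the usual `lsmod` layout (one header line, module name in the first column); `missing_modules` is a hypothetical helper name.

```shell
#!/bin/sh
# missing_modules: hypothetical helper. Reads lsmod-style output on stdin
# (header line first, module name in column 1) and prints every module
# named in its arguments that does not appear in that listing.
missing_modules() {
    loaded=$(awk 'NR > 1 { print $1 }')
    for m in "$@"; do
        printf '%s\n' "$loaded" | grep -qxF -- "$m" || printf '%s\n' "$m"
    done
}

# Usage on a cluster node (prints nothing when all five are loaded):
#   lsmod | missing_modules cman dlm lock_dlm gnbd gfs
```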
ccsd
#  second may be required
#
#   -- if the first fails, try the second --- <clustername> is given
#      in cluster.conf :
#
ccs_test connect
ccs_test connect force block <clustername>
#
#  ...the second is required if the cluster is not yet quorate (see
#     man page for ccs_test)...
Also:
ccs_test get node '//cluster/@name'
/sbin/cman_tool join
cat /proc/cluster/nodes
Node  Votes Exp Sts  Name     # ...be patient, this might take a while
   1    1    3   M   R5-14    #    to appear...
   2    1    3   M   R5-13
   3    1    3   M   R5-15
and
cat /proc/cluster/status
Protocol version: 5.0.1
Config version: 1
Cluster name: clusterR5-13
Cluster ID: 36497
Cluster Member: Yes
Membership state: Cluster-Member
Nodes: 3
Expected_votes: 3
Total_votes: 3
Quorum: 3
Active subsystems: 9
Node name: R5-13
Node ID: 2
Node addresses: 10.10.1.13
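The status listing above can also be checked mechanically, for example in a script that waits for the cluster to come up. A minimal sketch: `is_quorate` is a hypothetical helper that reads text in the format shown above, one field per line, and succeeds when the node is a member and `Total_votes` is at least `Quorum`. The field names are taken from the listing; treating "votes at least quorum" as the quorate condition is an assumption.

```shell
#!/bin/sh
# is_quorate: hypothetical helper. Reads /proc/cluster/status-style text
# on stdin ("Field name: value", one per line) and returns 0 only if
# "Cluster Member" is Yes and Total_votes >= Quorum.
is_quorate() {
    awk -F': *' '
        $1 == "Cluster Member" { member = $2; sub(/[ \t]+$/, "", member) }
        $1 == "Total_votes"    { total  = $2 + 0; have_total  = 1 }
        $1 == "Quorum"         { quorum = $2 + 0; have_quorum = 1 }
        END {
            ok = (member == "Yes" && have_total && have_quorum && total >= quorum)
            exit (ok ? 0 : 1)
        }
    '
}

# Usage on a cluster node:
#   is_quorate < /proc/cluster/status && echo "quorate"
```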
/sbin/fence_tool join
/sbin/clvmd      # on both
vgchange -aly    # on both
cat /proc/cluster/services
/etc/cluster/cluster.conf:
<?xml version="1.0"?>
<cluster name="cluster1" config_version="1">
  <cman two_node="1" expected_votes="1"> </cman>
  <clusternodes>
    <clusternode name="login" votes="1">
      <fence>
        <method name="single">
          <device name="human" ipaddr="10.10.1.250"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="R5-13" votes="1">
      <fence>
        <method name="single">
          <device name="human" ipaddr="10.10.1.13"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fence_devices>
    <device name="human" agent="fence_manual"/>
  </fence_devices>
</cluster>
<?xml version="1.0"?>
<cluster name="clusterR5-13" config_version="1">
  <cman></cman>
  <clusternodes>
    <clusternode name="R5-13">
      <fence>
        <method name="single">
          <device name="gnbd" nodename="R5-13"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="R5-14">
      <fence>
        <method name="single">
          <device name="gnbd" nodename="R5-14"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="R5-15">
      <fence>
        <method name="single">
          <device name="gnbd" nodename="R5-15"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fence_devices>
    <device name="gnbd" agent="fence_gnbd" servers="R5-14 R5-15"/>
  </fence_devices>
</cluster>
As described in the RedHat Cluster FAQ:
To add a node to a cluster of three-or-more:
Here:
exporting nodes: R5-14, R5-15
importing nodes: R5-13
On each exporting node:
R5-14> /sbin/gnbd_serv -v
R5-14> /sbin/gnbd_export -v -e R5-14-sdb1 -d /dev/sdb1
#
#  ...for example: the "-e" label
R5-15> /sbin/gnbd_serv -v
R5-15> /sbin/gnbd_export -v -e R5-15-sdb1 -d /dev/sdb1
R5-14> /sbin/gnbd_export -v -l
R5-15> /sbin/gnbd_export -v -l
On importing node:
R5-13> gnbd_import -v -i R5-14
R5-13> gnbd_import -v -i R5-15
R5-13> gnbd -v -l
Device name : R5-14-sdb1
----------------------
    Minor # : 0
 sysfs name : /block/gnbd0
     Server : R5-14
.
Device name : R5-15-sdb1
----------------------
    Minor # : 1
 sysfs name : /block/gnbd1
     Server : R5-15
.
ls -l /dev/gnbd
total 0
brw-r--r--  1 root root 251, 0 Feb 16 14:43 R5-14-sdb1
brw-r--r--  1 root root 251, 1 Feb 16 14:44 R5-15-sdb1
Use remote device as local on importing node:
R5-13> mke2fs /dev/gnbd/R5-14-sdb1
R5-13> mke2fs /dev/gnbd/R5-15-sdb1
R5-13> mount /dev/gnbd/R5-14-sdb1 /mnt/gfs-R5-14-sdb1
R5-13> mount /dev/gnbd/R5-15-sdb1 /mnt/gfs-R5-15-sdb1
#
#  ...N.B. One can then, on the exporting node, mount /dev/sdb1
#     as ext2 --- but the two nodes won't sync, so this
#     spells disaster...
#
#
#  -- Either "mkfs" locally on the exporters...
#
R5-14> gfs_mkfs -p lock_dlm -t clusterR5-13:R5-14 -j 3 /dev/sdb1
R5-15> gfs_mkfs -p lock_dlm -t clusterR5-13:R5-15 -j 3 /dev/sdb1
#
#  ...or "mkfs" on the importer...
#
R5-13> gfs_mkfs -p lock_dlm -t clusterR5-13:R5-14-sdb1 -j 3 /dev/gnbd/R5-14-sdb1
R5-13> gfs_mkfs -p lock_dlm -t clusterR5-13:R5-15-sdb1 -j 3 /dev/gnbd/R5-15-sdb1
#
#  ...then mount each file-system on both exporter and importer...
#
R5-14> mount -t gfs /dev/sdb1 /mnt/sdb1
R5-15> mount -t gfs /dev/sdb1 /mnt/sdb1
#
R5-13> mount -t gfs /dev/gnbd/R5-14-sdb1 /mnt/gfs-R5-14-sdb1
R5-13> mount -t gfs /dev/gnbd/R5-15-sdb1 /mnt/gfs-R5-15-sdb1
#
#  ...creating and deleting files, see that local and remote
#     mounted filesystems are synced, hooray...
#
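The `gfs_mkfs` invocations above all follow one pattern: the lock table given to `-t` is `<clustername>:<fsname>`, with the cluster name matching cluster.conf, and `-j` sets the number of journals (one per node that will mount the filesystem, three here). As a minimal sketch, the hypothetical helper `gfs_mkfs_cmd` below only prints the command line so it can be reviewed before being run by hand; it does not call `gfs_mkfs` itself.

```shell
#!/bin/sh
# gfs_mkfs_cmd: hypothetical helper that prints (does NOT run) a gfs_mkfs
# command line in the pattern used above.
#   $1 cluster name (as in cluster.conf)   $2 filesystem name
#   $3 number of journals                  $4 block device
gfs_mkfs_cmd() {
    if [ "$#" -ne 4 ]; then
        echo "usage: gfs_mkfs_cmd <cluster> <fsname> <journals> <device>" >&2
        return 2
    fi
    case "$3" in
        ''|*[!0-9]*)
            echo "journals must be a positive integer" >&2
            return 2 ;;
    esac
    if [ "$3" -lt 1 ]; then
        echo "journals must be a positive integer" >&2
        return 2
    fi
    printf 'gfs_mkfs -p lock_dlm -t %s:%s -j %s %s\n' "$1" "$2" "$3" "$4"
}

# Usage:
#   gfs_mkfs_cmd clusterR5-13 R5-14-sdb1 3 /dev/gnbd/R5-14-sdb1
```

Printing rather than executing is a deliberate choice here, since making a filesystem on the wrong device destroys whatever was on it.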
This is a very good place to look:
Look in the system-logs for messages re. what the cluster-daemons are up to:
tail -f /var/log/messages    # ...or kernel or daemon, or whatever,
                             #    looking for "kernel: CMAN:"- and
                             #    ccsd-related messages...
#
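To cut down the noise, the log can be filtered for the two kinds of message mentioned above. A minimal sketch: `cluster_log_filter` is a hypothetical helper name, and matching ccsd lines on the bare string "ccsd" is an assumption about how that daemon tags its messages.

```shell
#!/bin/sh
# cluster_log_filter: hypothetical helper. Reads log text on stdin and
# passes through only lines that contain "kernel: CMAN:" or mention ccsd.
cluster_log_filter() {
    grep -E 'kernel: CMAN:|ccsd'
}

# Usage:
#   tail -f /var/log/messages | cluster_log_filter
#   cluster_log_filter < /var/log/messages
```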
A cluster of R5-13, R5-14 and R5-15 worked fine; R5-16 would not add to this cluster. On R5-13:
Feb 16 11:45:46 R5-13 kernel: CMAN: removing node R5-16 from the cluster : No response to messages
and on R5-16:
Feb 16 11:46:14 R5-16 kernel: CMAN: No functional network interfaces, leaving cluster
The FAQ (above) suggested "...inconsistent versions of cman software..." and sure enough the version number was not the same. Patching the four nodes to the same level solved the problem.