Monday, February 18, 2013

MySQL Apache Failover System with DRBD, Pacemaker, Corosync


To set up a failover system with Pacemaker, Corosync and DRBD, we need 2 servers, 3 IP addresses in the same subnet, and one partition of the same size on each server. This guide assumes that MySQL and Apache are already configured.
Software requirements:
  1. DRBD & drbdlinks
  2. Pacemaker & Corosync
  3. psmisc package (needed by pacemaker)

Using separate network adapters for synchronization is highly recommended, but the setup also works with a single adapter.
This example uses the following configuration:
  • Node 1, hostname: fo2, ip:, synchronization partition: /dev/sdb1
  • Node 2, hostname: fo3, ip:, synchronization partition: /dev/sdb1
  • IP Cluster:
  • Domain: test.ok
  • Synchronization folder: /sync
Modify the /etc/hosts on each node: fo2.test.ok fo3.test.ok
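A quick way to confirm that both names really are in the hosts file is sketched below. The function name hosts_has_name is hypothetical and not part of any package; it only looks for the name on non-comment lines and does not validate the IP address.

```shell
#!/bin/sh
# hosts_has_name FILE NAME   (hypothetical helper, for illustration only)
# Succeeds (exit 0) if NAME appears as a hostname or alias on a
# non-comment line of FILE, which would normally be /etc/hosts.
hosts_has_name() {
    awk -v name="$2" '
        { sub(/#.*/, "") }                 # strip comments; awk re-splits the fields
        { for (i = 2; i <= NF; i++)        # field 1 is the IP address
              if ($i == name) found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$1"
}

# Usage on each node:
#   for n in fo2.test.ok fo3.test.ok; do
#       hosts_has_name /etc/hosts "$n" || echo "missing: $n"
#   done
```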

DRBD Setup


aptitude install drbd8-utils drbdlinks


Configure each node to use an NTP server. This matters for DRBD because of the filesystem timestamps. We also need to disable the drbd init script, because Pacemaker will handle starting and stopping DRBD.
update-rc.d -f drbd remove
Edit /etc/drbd.d/global_common.conf so it contains:
global {
        usage-count no;
}

common {
        protocol C;

        handlers {
                pri-on-incon-degr "/usr/lib/drbd/; /usr/lib/drbd/; echo b > /proc/sysrq-trigger ; reboot -f";
                pri-lost-after-sb "/usr/lib/drbd/; /usr/lib/drbd/; echo b > /proc/sysrq-trigger ; reboot -f";
                local-io-error "/usr/lib/drbd/; /usr/lib/drbd/; echo o > /proc/sysrq-trigger ; halt -f";
        }

        startup {
                degr-wfc-timeout 120;
        }

        disk {
                on-io-error detach;
        }

        syncer {
                # rate after al-extents use-rle cpu-mask verify-alg csums-alg
                rate 100M;
                al-extents 257;
        }
}
Create the file /etc/drbd.d/r0.res with the following configuration:
resource r0 {
 protocol C;
 device /dev/drbd0 minor 0;
 disk /dev/sdb1;
 flexible-meta-disk internal;

 # following 2 definition are equivalent
 on fo2 {
 on fo3 {
 net {
  after-sb-0pri discard-younger-primary; #discard-zero-changes;
  after-sb-1pri discard-secondary;
  after-sb-2pri call-pri-lost-after-sb;
Copy both configuration files to fo3:
fo2# scp /etc/drbd.d/* fo3:/etc/drbd.d/

Metadata initialization

Run the following command on each node:
drbdadm create-md r0
If the command fails with an error saying that a filesystem already exists on the partition, wipe the start of the partition with this command (it destroys any data on /dev/sdb1):
dd if=/dev/zero bs=512 count=512 of=/dev/sdb1
After this, start DRBD on both nodes:
/etc/init.d/drbd start
If everything is OK, cat /proc/drbd shows output similar to this:
version: 8.3.7 (api:88/proto:86-91)
srcversion: EE47D8BF18AC166BE219757
 0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:524236
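The fields on the "0:" line are the ones to watch: cs is the connection state, ro the roles (local/peer) and ds the disk states (local/peer). As a minimal sketch, a small helper can pull one of these fields out of the status text. The name drbd_field is hypothetical, and the helper assumes the DRBD 8.3 /proc/drbd layout shown above and a single resource on minor 0.

```shell
#!/bin/sh
# drbd_field FIELD [FILE]   (hypothetical helper, for illustration only)
# Prints the value of FIELD (cs, ro or ds) for minor 0, reading
# /proc/drbd-style text from FILE (default: /proc/drbd).
drbd_field() {
    awk -v key="$1" '
        $1 == "0:" {
            for (i = 2; i <= NF; i++)
                if (index($i, key ":") == 1) {      # field starts with "key:"
                    print substr($i, length(key) + 2)
                    exit
                }
        }
    ' "${2:-/proc/drbd}"
}

# Usage:
#   drbd_field cs    # connection state
#   drbd_field ds    # disk states, local/peer
```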

Initial Synchronization

Run the following command to initiate synchronization:
fo2# drbdadm -- --overwrite-data-of-peer primary r0
The command above makes fo2 the primary node and starts synchronization between the nodes. After that you can create a filesystem. Do this on the primary node only.
fo2# mkfs.ext4 /dev/drbd0
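The initial synchronization runs in the background, and /proc/drbd reports ds:UpToDate/UpToDate once it has finished. If you want a script to wait for that point before continuing, a polling loop along these lines should work. This is only a sketch: the function names, the default interval and the retry count are assumptions, and it only looks at minor 0.

```shell
#!/bin/sh
# drbd_synced [FILE]   (hypothetical helper)
# Succeeds when minor 0 reports both disks as UpToDate.
drbd_synced() {
    grep -Eq '^ *0: .*ds:UpToDate/UpToDate' "${1:-/proc/drbd}"
}

# wait_for_sync [FILE] [INTERVAL_SECONDS] [MAX_TRIES]   (hypothetical helper)
# Polls FILE until drbd_synced succeeds; returns 1 if MAX_TRIES is used up.
wait_for_sync() {
    file=${1:-/proc/drbd}
    interval=${2:-10}
    tries=${3:-360}
    while [ "$tries" -gt 0 ]; do
        drbd_synced "$file" && return 0
        tries=$((tries - 1))
        sleep "$interval"
    done
    return 1
}

# Usage on fo2:
#   wait_for_sync && echo "r0 is in sync"
```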

drbdlinks configuration

Copy drbdlinks init script to /etc/init.d/
cp /usr/sbin/drbdlinks /etc/init.d/
Modify /etc/drbdlinks.conf so that it contains something like the following:

Pacemaker & Corosync Setup


aptitude install pacemaker


Unicast messaging (transport: udpu) is not available in corosync 1.2, which is the version Debian squeeze stable provides. Install version 1.4 from Debian testing or squeeze-backports instead.
Edit /etc/corosync/corosync.conf:
totem {
        version: 2
        token: 3000
        token_retransmits_before_loss_const: 10
        join: 60
        consensus: 3600
        vsftype: none
        max_messages: 20
        clear_node_high_bit: yes
        secauth: off
        threads: 0
        rrp_mode: none
        interface {
                member {
                member {
                # The following values need to be set based on your environment
                ringnumber: 0
                mcastport: 5405
                ttl: 1
        transport: udpu

amf {
        mode: disabled
service {
        # Load the Pacemaker Cluster Resource Manager
        ver:       0
        name:      pacemaker

aisexec {
        user:   root
        group:  root

logging {
        fileline: off
        to_stderr: yes
        to_logfile: yes
          logfile: /var/log/corosync/corosync.log
        debug: off
        timestamp: on
        logger_subsys {
                subsys: AMF
                debug: off
                tags: enter|leave|trace1|trace2|trace3|trace4|trace6
Copy that config to fo3:
fo2# scp /etc/corosync/corosync.conf fo3:/etc/corosync/
Create the corosync key and copy it to fo3. Generating the key takes some time:
fo2# corosync-keygen
fo2# scp /etc/corosync/authkey fo3:/etc/corosync/
Start the cluster:
/etc/init.d/corosync start
Check the cluster state. The result will look similar to this:
fo2# crm_mon -1f -V
crm_mon[1929]: 2022/02/21_00:14:30 ERROR: unpack_resources: Resource start-up disabled since no STONITH resources have been defined
crm_mon[1929]: 2022/02/21_00:14:30 ERROR: unpack_resources: Either configure some or disable STONITH with the stonith-enabled option
crm_mon[1929]: 2022/02/21_00:14:30 ERROR: unpack_resources: NOTE: Clusters with shared data need STONITH to ensure data integrity
Last updated: Mon Feb 21 00:14:30 2022
Stack: openais
Current DC: fo2 - partition with quorum
Version: 1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b
2 Nodes configured, 2 expected votes
0 Resources configured.

Online: [ fo2 fo3 ]
The following commands need to be run on one node only; the configuration is synchronized between the nodes automatically. This setup does not use STONITH resources, so we disable STONITH with this command:
fo2# crm configure property stonith-enabled="false"
Because this failover system has only 2 nodes, we must also tell the cluster to ignore loss of quorum. Otherwise the resources will not move to the other node when the primary node fails.
fo2# crm configure property no-quorum-policy="ignore"
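To confirm that both properties were stored, you can inspect the output of crm configure show. The helper below is a sketch with a hypothetical name; it reads that output on standard input and accepts the value with or without quotes, since the quoting differs between crm shell versions.

```shell
#!/bin/sh
# cib_property_is NAME VALUE   (hypothetical helper, for illustration only)
# Reads `crm configure show` output on stdin and succeeds if the
# text contains NAME=VALUE, with or without quotes around VALUE.
cib_property_is() {
    tr -d '"\\' |                    # drop quotes and line-continuation backslashes
        tr -s '[:space:]' '\n' |     # one token per line
        grep -Fxq "$1=$2"            # exact, whole-token match
}

# Usage on either node:
#   crm configure show | cib_property_is stonith-enabled false && echo "STONITH disabled"
#   crm configure show | cib_property_is no-quorum-policy ignore && echo "quorum ignored"
```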

Failover configuration

Open the cluster configuration shell:
crm configure
Add the following configuration:
primitive ClusterIP ocf:heartbeat:IPaddr2 \
        params ip= \
        op monitor interval=30s
primitive WebSite ocf:heartbeat:apache params configfile=/etc/apache2/apache2.conf op monitor interval=1min
primitive DBase ocf:heartbeat:mysql
primitive Links heartbeat:drbdlinks
primitive r0 ocf:linbit:drbd \
       params drbd_resource="r0" \
       op monitor interval="29s" role="Master" \
       op monitor interval="31s" role="Slave"
ms ms_r0 r0 \
        meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
primitive WebFS ocf:heartbeat:Filesystem \
        params device="/dev/drbd0" directory="/sync" fstype="ext4"
group WebServer ClusterIP WebFS Links DBase WebSite
colocation WebServer-with-ms_ro inf: WebServer ms_r0:Master
order WebServer-after-ms_ro inf: ms_r0:promote WebServer:start
location prefer-fo2 WebServer 50: fo2

The configuration above makes fo2 the preferred node: when fo2 recovers from a failure, it takes the resources back.
Check the cluster state again. The resources should now be active on fo2.

Failover test

To test the failover, put fo2 into standby:
fo2# crm node standby
The resources should move to the other node.
To bring fo2 back online, run:
fo2# crm node online
The resources should move back to fo2.
To move the resources to the other node manually:
crm resource move WebServer fo3
To give control back to the cluster:
crm resource unmove WebServer
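While running these tests it helps to see which node currently holds a resource. The sketch below reads crm_mon -1 output on standard input and prints the node for one resource. The name started_on is hypothetical, and the helper assumes resource lines of the form "ClusterIP (ocf::heartbeat:IPaddr2): Started fo2", which is an assumption about the output format and may differ between Pacemaker versions.

```shell
#!/bin/sh
# started_on RESOURCE   (hypothetical helper, for illustration only)
# Reads `crm_mon -1` output on stdin and prints the node on which
# RESOURCE is reported as Started. Prints nothing if it is not started.
started_on() {
    awk -v res="$1" '
        $1 == res && $(NF - 1) == "Started" { print $NF; exit }
    '
}

# Usage, before and after `crm node standby`:
#   crm_mon -1 | started_on ClusterIP
```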