Linux HA Using LVS/Heartbeat+DRBD+OCFS2


by Adzmely Mansor

independent consultant

HA configuration and installation


for ACME Inc

System Specifications Version 0.9b

Adzmely Mansor / Consultant Kuala Terengganu, Terengganu Darul Iman T +6.019.959.1513 [email protected] http://blog.xjutsu.com
Private and Confidential

High Availability For Acme Inc

Introduction
The objective of this exercise is to set up a High Availability environment for Acme Inc web services. The Acme
homepage provides up-to-date meteorological information such as weather forecasts, satellite images, earthquake alarms,
etc. This information is regularly updated by the internal system and transferred via FTP from several servers. The Acme
web service back-end system processes the data and updates the website periodically. Previously, the web services ran
on a single server, without any shared/external storage.

Services                        Port/Transport
httpd / Apache Web Server       80/tcp
ftp / File Transfer Protocol    21/tcp & 20/tcp
rtmp / Flash Media Server       1935/tcp
mysql / RDBMS                   3306/tcp

Services running on the Acme web server

xJutsu Labs : SPA Open LDAP Installation / Configuration System Specifications 1



High Availability Setup Configured

This is an Active - Active (hot standby mode) HA setup: two servers act as web servers, and another two servers are
installed and configured with Linux Virtual Server (LVS) to provide the failover/fallback mechanism for the services
offered by the web servers. In this scenario, two LVS servers are used to guarantee maximum availability; they are
connected via heartbeat and run in Active - Passive mode.

Storage Synchronization
Videos and meteorological images are updated by the internal Acme back-end intranet system and uploaded via the FTP
service and the content management system (CMS) in use. With two servers, the secondary acting as a fallback/failover
server, both machines must have access to the same content and files, and updates must be available on both servers.
Currently there is no dedicated storage system or server for this purpose. Hosting shared storage on the primary server
would introduce a single point of failure: when the primary server is totally inaccessible, the secondary failover
server becomes useless.


The idea is that any update or change on the primary server is also applied to the secondary server, and the
requirement is that this synchronization happens in real time. With no dedicated storage system/server, the best option
for synchronizing two servers is Distributed Replicated Block Device (DRBD).

DRBD® refers to block devices designed as a building block to form high availability (HA) clusters. This is done by
mirroring a whole block device via an assigned network. DRBD can be understood as network-based RAID-1.

In a failover situation, all storage updates go to the secondary hot standby server. After the primary server recovers,
changes made on the secondary during the primary's downtime must also be reflected on the primary server. However, DRBD
in its default single-primary mode only replicates in one direction. That is why, in this solution, DRBD runs with both
nodes as primary and Oracle Cluster File System 2 (OCFS2) is used on top of it, so that two-way synchronization is
possible.

Database Replication
Replication is a flexible way to deal with scalability; if not done right, however, it can result in disaster. The
most common problem with replication is primary key collision. When this happens, replication stops. For this reason,
both web servers are set up with MySQL dual master replication, configured to avoid/minimize primary key collisions.
In a dual master setup, each server functions as both a master and a slave to the other server.

Because only one web server is active at a time, this master - master database replication is also known as active -
active hot standby replication. While the primary server is active, all updates are synchronized to the secondary
server, which acts as the slave in this situation. In a failover situation, however, the secondary database no longer
acts as the slave: because all requests go to the secondary server, the secondary MySQL acts as the master during this
time. Once the primary is back online, it acts as the slave and synchronizes with the secondary server.
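
The key collision avoidance used in this setup relies on the MySQL settings auto_increment_increment and
auto_increment_offset (increment 2 on both servers, offset 1 on server 1 and offset 2 on server 2). The following is a
minimal Python sketch of the idea only, not of MySQL internals: it assumes each server simply hands out offset,
offset + increment, offset + 2 * increment and so on. The function name auto_increment_ids is hypothetical.

```python
from itertools import islice


def auto_increment_ids(offset, increment):
    """Yield the ids one server would hand out: offset, offset + increment, ..."""
    next_id = offset
    while True:
        yield next_id
        next_id += increment


# Values taken from the /etc/my.cnf settings described in this document.
server1_ids = list(islice(auto_increment_ids(offset=1, increment=2), 5))
server2_ids = list(islice(auto_increment_ids(offset=2, increment=2), 5))

print(server1_ids)                          # [1, 3, 5, 7, 9]
print(server2_ids)                          # [2, 4, 6, 8, 10]
print(set(server1_ids) & set(server2_ids))  # set(), the two ranges never overlap
```

Server 1 only generates odd ids and server 2 only even ids, so rows inserted on either side during a failover cannot
collide on an auto increment key when replication catches up.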

Failover and Service Monitoring using Linux Virtual Server (LVS)

Linux Virtual Server (LVS) is an advanced load balancing solution for Linux systems. It is an open source project whose
mission is to build a high-performance and highly available server for Linux using clustering technology,
which provides good scalability, reliability and serviceability. The LVS method used for Acme is the direct routing
technique.

When a user accesses a virtual service provided by the server cluster, a packet destined for the virtual IP address
(161.XXX.XXX.30) arrives. The load balancer (LinuxDirector) examines the packet's destination address and port. If they
match a virtual service, the connection is added to the hash table that records connections, and the load balancer
forwards the packet directly to the primary server (161.XXX.XXX.157). When an incoming packet belongs to a connection
already found in the hash table, it is again routed directly to the primary server. When the server receives the
forwarded packet, it finds that the packet is for the address on its alias interface or for a local socket, so it
processes the request and returns the result directly to the user. After a connection terminates or times out, the
connection record is removed from the hash table.
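
The connection tracking described above can be illustrated with a small Python sketch. This is a simplified model
under stated assumptions, not the LVS kernel implementation: the connection table is a plain dictionary, the function
names route_packet and close_connection are hypothetical, and the client address used in the usage note is made up.
The masked addresses are the ones given in this document.

```python
VIRTUAL_IP = "161.XXX.XXX.30"       # masked virtual IP from this document
PRIMARY_SERVER = "161.XXX.XXX.157"  # masked primary web server address
VIRTUAL_SERVICES = {(VIRTUAL_IP, 80), (VIRTUAL_IP, 21), (VIRTUAL_IP, 1935)}

# Hypothetical connection table: (client ip, client port, dest ip, dest port) -> real server.
connections = {}


def route_packet(client_ip, client_port, dest_ip, dest_port):
    """Return the real server for a packet, or None if it is not for a virtual service."""
    key = (client_ip, client_port, dest_ip, dest_port)
    if key in connections:                 # packet belongs to a known connection
        return connections[key]
    if (dest_ip, dest_port) not in VIRTUAL_SERVICES:
        return None                        # destination is not a virtual service
    connections[key] = PRIMARY_SERVER      # record the new connection
    return PRIMARY_SERVER


def close_connection(client_ip, client_port, dest_ip, dest_port):
    """Remove the record when the connection terminates or times out."""
    connections.pop((client_ip, client_port, dest_ip, dest_port), None)
```

For example, route_packet("10.0.0.5", 40000, VIRTUAL_IP, 80) records the connection and returns the primary server;
later packets of the same connection are answered from the table until close_connection removes the record.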

The LVS server regularly checks the primary web server (currently every 60 seconds) to monitor the web service
(port 80). If the primary server cannot be contacted, the LVS forwards all web requests to the secondary server
(161.XXX.XXX.158). The same applies to the other services managed by the LVS, such as ftp/21 and rtmp/1935 (Flash Media
Server).
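
The monitoring and failover decision can be sketched in Python as follows. This is an illustration only and assumes a
plain TCP connect test; the actual check performed by the LVS monitoring tool may differ (for example, it may request
a page over HTTP). The function names tcp_check and choose_real_server are hypothetical.

```python
import socket


def tcp_check(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def choose_real_server(primary, secondary, port, check=tcp_check):
    """Prefer the primary server; fall back to the secondary when the check fails."""
    return primary if check(primary, port) else secondary
```

A monitor loop would call choose_real_server once per check interval (60 seconds in this setup) for each managed port
and forward requests to whichever server it returns.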

Installation (RPMS installed and dependencies)


MySQL is installed from the RedHat EL 5 distribution CD. The DRBD source can be downloaded from LINBIT via the
following URL: http://oss.linbit.com/drbd/ . The DRBD version used for Acme is DRBD 8.3.8, compiled and installed
from source. DRBD source compilation and installation:

# ./configure --with-km --with-heartbeat --with-initdir --prefix=/
# make
# make install

OCFS2 can be downloaded from the Oracle website (http://oss.oracle.com/projects/ocfs2/files/RedHat/RHEL5/x86_64/) .

The OCFS2 RPMs are installed using the following commands:

# rpm -i ocfs2-2.6.18-92.el5-1.4.4-1.el5.x86_64.rpm
# rpm -i ocfs2console-1.4.3-1.el5.x86_64.rpm
# rpm -i ocfs2-tools-1.4.3-1.el5.x86_64.rpm


Configurations (MySQL Master - Master Replication)

Descriptions                    MySQL Server 1                        MySQL Server 2
                                /etc/my.cnf                           /etc/my.cnf

client port; unix socket        [client]                              [client]
connection commented out.       port = 3306                           port = 3306
Clients only connect via        #                                     #
TCP/IP.                         # socket = ....                       # socket = ....
                                [mysqld]                              [mysqld]
                                port = 3306                           port = 3306
                                ...                                   ...

in master - master mode, to     [mysqld]                              [mysqld]
avoid any possibility of        ...                                   ...
primary key collision,          ...                                   ...
especially for keys with an     auto_increment_increment = 2          auto_increment_increment = 2
auto increment field            auto_increment_offset = 1             auto_increment_offset = 2

binary logging used by          [mysqld]                              [mysqld]
MySQL replication               ...                                   ...
                                ...                                   ...
                                log-bin=mysql-bin                     log-bin=mysql-bin
                                relay-log=primemaster-relay-bin       relay-log=secondmaster-relay-bin

master replication server id    [mysqld]                              [mysqld]
                                ...                                   ...
                                ...                                   ...
                                server-id = 1                         server-id = 2

replication master config;      [mysqld]                              [mysqld]
because both servers are        ...                                   ...
masters, each is configured     ...                                   ...
with master settings            master-host = 192.168.1.102           master-host = 192.168.1.101
(replicating each other;        master-user = replication             master-user = replication
both are slave and master       master-password = slave               master-password = slave
at the same time)               master-port = 3306                    master-port = 3306

create and grant a user with    mysql> grant replication slave        mysql> grant replication slave
'slave' permission for the        on *.* to 'replication'@'%'           on *.* to 'replication'@'%'
replication connection and        identified by 'slave';                identified by 'slave';
access; both IP addresses
are added

at this point both servers      # /etc/init.d/mysql restart           # /etc/init.d/mysql restart
need to be restarted


Descriptions                    MySQL Server 1                                      MySQL Server 2

start slave on server 1 using   mysql> start slave;                                 mysql> show master status;
the mysql prompt and check      # check slave status
the replication status          mysql> show slave status\G
                                # if everything is ok, Slave_IO_State should
                                # read as follows:
                                Slave_IO_State: Waiting for master to send event

start slave on server 2 using   mysql> show master status;                          mysql> start slave;
the mysql prompt and check                                                          # check slave status
the replication status                                                              mysql> show slave status\G
                                                                                    # if everything is ok, Slave_IO_State should
                                                                                    # read as follows:
                                                                                    Slave_IO_State: Waiting for master to send event
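
The status check above can also be automated. The following Python sketch parses the text printed by
"show slave status\G" and reports whether replication looks healthy. It assumes the output has already been captured
as a string (for example from the mysql command line client); the function names and the abbreviated sample text are
illustrative only, and real output contains many more fields.

```python
def parse_slave_status(text):
    """Parse the text of `show slave status\\G` into a dict of field -> value."""
    fields = {}
    for line in text.splitlines():
        if ":" not in line or line.lstrip().startswith("*"):
            continue  # skip blank lines and the "*** 1. row ***" separator
        name, _, value = line.partition(":")
        fields[name.strip()] = value.strip()
    return fields


def replication_healthy(fields):
    """True when both replication threads report 'Yes'."""
    return (fields.get("Slave_IO_Running") == "Yes"
            and fields.get("Slave_SQL_Running") == "Yes")


# Abbreviated sample output for illustration only.
sample = """
*************************** 1. row ***************************
             Slave_IO_State: Waiting for master to send event
           Slave_IO_Running: Yes
          Slave_SQL_Running: Yes
"""

status = parse_slave_status(sample)
print(status["Slave_IO_State"])     # Waiting for master to send event
print(replication_healthy(status))  # True
```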

Test Procedure ( UAT Simulation ) for MySQL Replication

Descriptions                    Commands                                          Expected Output

connect to the mysql server     # on both servers, go to the mysql prompt from    no errors; the test database is created
using the mysql client and      # the shell and use the test database             automatically by mysql during installation
use the test database           mysql> use test;

create a table with an auto     mysql> create table dummy (id int(3)              # on server 2, list all tables
increment key on server 1         auto_increment not null key,                    mysql> show tables;
                                  usr_name varchar(30));                          the show tables statement lists the dummy
                                                                                  table created on server 1

insert a record into the newly  mysql> insert into dummy values ('','Mr Foo');    # on server 2, run a select statement
created table on server 1                                                         mysql> select * from dummy;
                                                                                  the select statement displays the new record
                                                                                  added to the dummy table on server 1

insert a record into the newly  mysql> insert into dummy values ('','Mr Acme');   # on server 1, run a select statement
created table on server 2                                                         mysql> select * from dummy;
                                                                                  the select statement displays the new record
                                                                                  added to the dummy table on server 2

monitor slave replication       # on both servers, issue the following command    no error; Slave_IO_State reads "Waiting for
status                          mysql> show slave status\G                        master to send event"
                                                                                  Slave_IO_Running and Slave_SQL_Running must
                                                                                  both equal 'Yes'


MySQL Installation/Configuration Information

#   Item                        Server 1            Server 2

1   data folder                 /var/lib/mysql      /var/lib/mysql
2   replication id              replication         replication
3   replication id password     slave               slave
4   eth0 ip address             161.XXX.XXX.157     161.XXX.XXX.158
5   eth1 ip address             192.168.1.101       192.168.1.102
6   configuration file          /etc/my.cnf         /etc/my.cnf
8   master replication id       1                   2
9   mysql server version        5.0.45              5.0.45

DRBD and OCFS2 Configuration

#   Item                                Server 1                    Server 2

1   drbd configuration file             /etc/drbd.conf              /etc/drbd.conf
2   drbd device                         /dev/drbd1                  /dev/drbd1
3   disk/partition allocated for drbd   /dev/cciss/c0d0p3           /dev/cciss/c0d0p5
4   drbd ip address and listen port     192.168.1.101/7789          192.168.1.102/7789
5   eth0 ip address                     161.XXX.XXX.157             161.XXX.XXX.158
6   eth1 ip address                     192.168.1.101               192.168.1.102
7   ocfs2 configuration file            /etc/ocfs2/cluster.conf     /etc/ocfs2/cluster.conf
8   ocfs2 node number                   0                           1
9   ocfs2 ip address and listen port    192.168.1.101/7777          192.168.1.102/7777
10  ocfs2 node name                     primemaster                 secondmaster
11  mounted folder                      /storage                    /storage

Monitoring DRBD and OCFS2 and Services

Descriptions                    Commands                                          Expected Output

check DRBD status               # on either server, issue the following command   # the output must be as follows:
                                cat /proc/drbd                                    1: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r----
                                                                                  both nodes must be connected and in Primary mode;
                                                                                  UpToDate on both sides indicates that both servers
                                                                                  are in sync

the drbd device must be         # on both servers, issue the df command to list   /storage must be mounted
mounted automatically on        # mounted file systems
/storage during the boot        df -h
process


Descriptions                    Commands                                          Expected Output

services that must be up        # on both servers, the ps command shows the       make sure the httpd, mysqld and fms services
at all times                    # running processes                               are running
                                ps aux
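
The DRBD status check can be scripted as well. The sketch below, in Python, extracts the cs (connection state),
ro (roles) and ds (disk states) fields from a device line of /proc/drbd. It assumes the DRBD 8.3 line format shown in
this document and treats ds:UpToDate/UpToDate as the in-sync state; the function names are hypothetical and the two
sample lines are illustrative, not captured from the Acme servers.

```python
import re


def parse_drbd_line(line):
    """Extract the cs:, ro: and ds: fields from a /proc/drbd device line."""
    return dict(re.findall(r"\b(cs|ro|ds):(\S+)", line))


def drbd_healthy(state):
    """True when connected, dual-primary, and both disks are up to date."""
    return (state.get("cs") == "Connected"
            and state.get("ro") == "Primary/Primary"
            and state.get("ds") == "UpToDate/UpToDate")


# Illustrative lines: a healthy pair, and a node waiting for its peer.
good = " 1: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r----"
degraded = " 1: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r----"

print(drbd_healthy(parse_drbd_line(good)))      # True
print(drbd_healthy(parse_drbd_line(degraded)))  # False
```

On a server, the input line would come from reading /proc/drbd and selecting the line for device 1.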

DRBD Configuration: /etc/drbd.conf (same for both servers)


#
# please have a look at the example configuration file in
# /usr/share/doc/drbd83/drbd.conf
#
common {
    startup {
        wfc-timeout 60;
        degr-wfc-timeout 60;
    }
    disk {
        on-io-error detach;
    }
    syncer {
        rate 500M;
        al-extents 80;
    }
    protocol C;
}
resource r0 {
    startup {
        become-primary-on both;
    }
    net {
        allow-two-primaries;
        after-sb-0pri discard-zero-changes;
        after-sb-1pri discard-secondary;
        after-sb-2pri disconnect;
    }
    on primemaster {
        device /dev/drbd1;
        disk /dev/cciss/c0d0p3;
        address 192.168.1.101:7789;
        meta-disk internal;
    }
    on secondmaster {
        device /dev/drbd1;
        disk /dev/cciss/c0d0p3;
        address 192.168.1.102:7789;
        meta-disk internal;
    }
}


OCFS2 Configuration: /etc/ocfs2/cluster.conf (same for both servers)


node:
	ip_port = 7777
	ip_address = 192.168.1.101
	number = 0
	name = primemaster
	cluster = ocfs2

node:
	ip_port = 7777
	ip_address = 192.168.1.102
	number = 1
	name = secondmaster
	cluster = ocfs2

cluster:
	node_count = 2
	name = ocfs2
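
As a sanity check on this file, the stanza layout can be parsed and the node_count value compared with the number of
node stanzas. The Python sketch below is a simplified reader written for illustration; it is not the parser used by
the OCFS2 tools, and the function names parse_cluster_conf and node_count_matches are hypothetical.

```python
def parse_cluster_conf(text):
    """Parse stanzas into a list of (stanza_name, {key: value}) pairs."""
    stanzas = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.endswith(":") and "=" not in stripped:
            stanzas.append((stripped[:-1], {}))        # new "node:" or "cluster:" stanza
        elif "=" in stripped and stanzas:
            key, _, value = stripped.partition("=")
            stanzas[-1][1][key.strip()] = value.strip()
    return stanzas


def node_count_matches(stanzas):
    """True when cluster node_count equals the number of node stanzas."""
    nodes = [params for name, params in stanzas if name == "node"]
    clusters = [params for name, params in stanzas if name == "cluster"]
    return len(clusters) == 1 and int(clusters[0]["node_count"]) == len(nodes)


# Contents of /etc/ocfs2/cluster.conf as listed in this document.
conf = """
node:
    ip_port = 7777
    ip_address = 192.168.1.101
    number = 0
    name = primemaster
    cluster = ocfs2

node:
    ip_port = 7777
    ip_address = 192.168.1.102
    number = 1
    name = secondmaster
    cluster = ocfs2

cluster:
    node_count = 2
    name = ocfs2
"""

stanzas = parse_cluster_conf(conf)
print([params["name"] for name, params in stanzas if name == "node"])  # ['primemaster', 'secondmaster']
print(node_count_matches(stanzas))                                     # True
```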
