Linux HA Using LVS/Heartbeat+DRBD+OCFS2
independent consultant
Adzmely Mansor / Consultant Kuala Terengganu, Terengganu Darul Iman T +6.019.959.1513 [email protected] http://blog.xjutsu.com
Private and Confidential
Introduction
The objective of this exercise is to set up a High Availability environment for Acme Inc web services. The Acme
Homepage provides up-to-date meteorological information such as weather forecasts, satellite images, earthquake alarms,
etc. This information is regularly updated by the internal system and transferred via FTP from several servers. The Acme
Web service back-end system processes the data and updates the website periodically. Previously, the web services
ran on a single server, without any shared or external storage.
Services                              Port/Transport
httpd/Apache Web Server               80/tcp
ftp/File Transfer Protocol            21/tcp & 20/tcp
rtmp/Flash Media Server               1935/tcp
mysql/RDBMS                           3306/tcp
This is an Active - Active (hot standby mode) HA setup using two servers acting as web servers and another two servers
configured and installed with Linux Virtual Server, providing a failover/fallback mechanism for the services provided by the
web servers. In this scenario we use two LVS servers to guarantee maximum availability, configured with a heartbeat
connection in Active - Passive mode.
Storage Synchronization
Videos and meteorological images are updated by the internal Acme back-end intranet system and uploaded via the FTP service
and the content management system (CMS) in use. With two servers, the secondary acting as a fallback/failover server,
both machines must have access to the same content and files, and updates must be available on both servers.
Currently there is no dedicated storage system or server for this purpose. Placing shared storage on the primary server
would introduce a single point of failure: when the primary server is totally inaccessible, the secondary failover server
becomes useless.
The idea is that any update or change on the primary server is also applied to the secondary server, and the requirement
is that the two are synchronized in real time. With no dedicated storage system or server, the best option for
synchronizing two servers is Distributed Replicated Block Device (DRBD).
DRBD® refers to block devices designed as a building block to form high availability (HA) clusters. This is done by
mirroring a whole block device via an assigned network. DRBD can be understood as network-based RAID-1.
In a failover situation, all storage updates go to the secondary hot standby server. After the primary server recovers,
the changes made on the secondary during the primary's downtime must also be applied to the primary
server. However, DRBD in its usual single-primary mode provides only one-way replication. That is why in this solution
Oracle Cluster File System 2 (OCFS2) is used on top of DRBD, so that two-way synchronization is possible.
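One way to confirm that the mirrored device is ready for two-way use is to inspect /proc/drbd on either node. The sketch below is only an illustration under assumptions that the original setup does not state: it assumes DRBD 8.x, whose status line carries cs:, st: (or ro: from 8.3 on) and ds: fields, and it assumes the resource runs with both nodes in the Primary role so that OCFS2 can be mounted on both servers. The function name drbd_dual_primary is hypothetical.

```shell
# drbd_dual_primary [FILE]
# Succeeds if a DRBD status line in FILE (default: /proc/drbd) shows a
# connected resource with both nodes Primary and both disks UpToDate.
# Assumption: DRBD 8.x status format ("st:" before 8.3, "ro:" from 8.3 on).
drbd_dual_primary() {
    awk '
        /cs:Connected/ && /(st|ro):Primary\/Primary/ && /ds:UpToDate\/UpToDate/ { ok = 1 }
        END { exit !ok }
    ' "${1:-/proc/drbd}"
}

# Example use on a node:
#   drbd_dual_primary && echo "DRBD is connected, dual-primary and up to date"
```

The function reads a file instead of calling any DRBD tool, so it can be tried against a saved copy of /proc/drbd before it is used on a live node.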
Database Replication
Replication is the most flexible way to deal with scalability; if not done right, however, replication can result in disaster. The
most common problem with replication is primary key collision. When this happens, replication stops. For this reason, the
setup configured on both web servers is MySQL dual-master replication, with a configuration that avoids or minimizes
primary key collisions. In a dual-master setup, each server functions as both a master and a slave to the other server.
Because only one web server is active at any one time, this master - master database replication is also known as active -
active hot standby replication. While the primary server is active, all updates are synchronized to the secondary server,
which acts as the slave in this situation. In a failover situation, however, the secondary database no longer acts as a slave:
because all requests go to the secondary server, the secondary MySQL acts as the master during this time. Once the
primary is back online, it acts as a slave and synchronizes with the secondary server.
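A common way to avoid primary key collisions in a dual-master setup is to give each server its own interleaved auto-increment sequence through the MySQL variables auto_increment_increment and auto_increment_offset. The original configuration values are not shown in this document, so the settings below are assumptions, and auto_ids is a hypothetical helper that only prints the values each server would generate.

```shell
# auto_ids OFFSET INCREMENT COUNT
# Print the first COUNT auto-increment values a server would generate for the
# given auto_increment_offset and auto_increment_increment.
auto_ids() {
    offset=$1; increment=$2; count=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        echo $((offset + i * increment))
        i=$((i + 1))
    done
}

# Assumed settings (not taken from the original my.cnf):
#   server 1: auto_increment_increment = 2, auto_increment_offset = 1
#   server 2: auto_increment_increment = 2, auto_increment_offset = 2
echo "server 1: $(auto_ids 1 2 5 | xargs)"
echo "server 2: $(auto_ids 2 2 5 | xargs)"
```

With these assumed values server 1 only produces odd keys and server 2 only even keys, so rows inserted on either side during a failover cannot collide on an auto-increment column.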
Linux Virtual Server (LVS) is an advanced load balancing solution for Linux systems. It is an open source project whose
mission is to build a high-performance and highly available server for Linux using clustering technology,
which provides good scalability, reliability and serviceability. The LVS method used for Acme is the direct routing technique.
When a user accesses a virtual service provided by the server cluster, the packet destined for the virtual IP address
(161.XXX.XXX.30) arrives at the load balancer. The load balancer (LinuxDirector) examines the packet's destination address
and port. If they match a virtual service, the connection is added to the hash table that records connections. Then the load
balancer forwards the packet directly to the primary server (161.XXX.XXX.157). When a later incoming packet belongs to a
connection that is already in the hash table, it is again routed directly to the primary server. When the server
receives the forwarded packet, it finds that the packet is for the address on its alias interface or for a local
socket, so it processes the request and returns the result directly to the user. After a connection terminates or
times out, the connection record is removed from the hash table.
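The direct routing setup described above can be expressed as two ipvsadm rules: one that defines the virtual service and one that attaches a real server to it in direct routing mode. This is a sketch only. The document does not say which tool manages the LVS table, so the use of ipvsadm is an assumption, as is the rr scheduler. The hypothetical function lvs_dr_rules prints the commands instead of running them, and VIP and PRIMARY are placeholders because the real addresses are masked in this document.

```shell
# lvs_dr_rules VIP PORT REAL_SERVER
# Print (not run) the ipvsadm commands for one TCP virtual service that
# forwards to one real server using direct routing.
#   -A  add a virtual service     -t  TCP service address:port
#   -s  scheduler (rr = round robin; an arbitrary choice for this sketch)
#   -a  add a real server         -r  real server address
#   -g  gatewaying, i.e. direct routing
lvs_dr_rules() {
    vip=$1; port=$2; real=$3
    echo "ipvsadm -A -t $vip:$port -s rr"
    echo "ipvsadm -a -t $vip:$port -r $real:$port -g"
}

# VIP and PRIMARY are placeholders for the masked addresses.
lvs_dr_rules VIP 80 PRIMARY
```

For direct routing to work, the real server must also carry the virtual IP address on an alias interface, which is why the forwarded packet is accepted locally and the reply goes straight back to the user.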
The LVS server regularly checks the primary web server (currently every 60 seconds) to monitor
the web service (port 80). If the primary server cannot be contacted, the LVS
forwards all web requests to the secondary server (161.XXX.XXX.158). The same applies to the other services managed
by the LVS, such as ftp/21 and rtmp/1935 (Flash Media Server).
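The check-then-fall-back decision can be sketched in a few lines of shell. This is not the monitoring component used in the Acme setup, which the document does not name. It is a minimal illustration that assumes a plain TCP connect test with nc (netcat) is an acceptable health check. The function names check_port and pick_server are hypothetical.

```shell
# check_port HOST PORT
# Succeed if a TCP connection to HOST:PORT can be opened.
# Assumes nc (netcat) is installed; -z = connect only, -w 3 = 3 second timeout.
check_port() {
    nc -z -w 3 "$1" "$2" >/dev/null 2>&1
}

# pick_server CHECK PRIMARY SECONDARY PORT
# Print PRIMARY if the CHECK command succeeds for it, otherwise SECONDARY.
pick_server() {
    check=$1; primary=$2; secondary=$3; port=$4
    if "$check" "$primary" "$port"; then
        echo "$primary"
    else
        echo "$secondary"
    fi
}

# Example loop using the 60 second interval mentioned above
# (PRIMARY and SECONDARY are placeholders for the masked addresses):
#   while true; do
#       target=$(pick_server check_port PRIMARY SECONDARY 80)
#       echo "forwarding web requests to $target"
#       sleep 60
#   done
```

Passing the check as a command name keeps the decision logic separate from the probe, so the same function can serve port 21 or 1935 with a different probe if a bare TCP connect is not a sufficient test.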
# rpm -i ocfs2-2.6.18-92.el5-1.4.4-1.el5.x86_64.rpm
# rpm -i ocfs2console-1.4.3-1.el5.x86_64.rpm
# rpm -i ocfs2-tools-1.4.3-1.el5.x86_64.rpm
                                      /etc/my.cnf             /etc/my.cnf
client port                           [client]                [client]
                                      port = 3306             port = 3306
                                      #                       #
unix socket connection commented      # socket = ....         # socket = ....
out. Clients connect only via
TCP/IP.
                                      [mysqld]                [mysqld]
                                      port = 3306             port = 3306
                                      ...                     ...
need to be restarted
start slave in server 1 using         mysql> start slave;                                 mysql> show master status;
start slave in server 2 using         mysql> show master status;                          mysql> start slave;

create a table with auto increment    mysql> create table dummy (                         # in server 2, list all tables
key in server 1                           id int(3) not null auto_increment primary key,  mysql> show tables;
                                          usr_name varchar(30));                          show tables lists the dummy table created in
                                                                                          server 1

insert record into newly created      mysql> insert into dummy values (NULL,'Mr Foo');    # in server 2, run a select statement
table

insert record into newly created      mysql> insert into dummy values (NULL,'Mr Acme');   # in server 1, run a select statement
table

monitoring slave replication status   # on both servers issue the following command       No error in Slave_IO_State, which shows
                                      mysql> show slave status\G                          "Waiting for master to send event".

drbd device must be automatically     # on both servers issue the df command to           /storage must be mounted
mounted in /storage during the boot   # view/list mounted file systems
process.                              df -h
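The last two checks in the table (replication status and the /storage mount) can be scripted so that they are easy to repeat on both servers. The sketch below is an illustration rather than part of the original procedure. The function names slave_ok and is_mounted are hypothetical. slave_ok assumes that a healthy slave reports Yes for both Slave_IO_Running and Slave_SQL_Running in the output of show slave status\G.

```shell
# slave_ok
# Read "show slave status\G" output on stdin and succeed only if both
# replication threads report Yes.
slave_ok() {
    awk -F': ' '
        /Slave_IO_Running:/  { io  = $2 }
        /Slave_SQL_Running:/ { sql = $2 }
        END { exit !(io == "Yes" && sql == "Yes") }
    '
}

# is_mounted MOUNTPOINT [FILE]
# Succeed if MOUNTPOINT appears as a mount point in FILE (default: /proc/mounts).
is_mounted() {
    awk -v mp="$1" '$2 == mp { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Example use on each server:
#   mysql -e 'show slave status\G' | slave_ok && echo "replication running"
#   is_mounted /storage && echo "/storage is mounted"
```

Both functions only read text, so they can be tried against saved command output before being used in a boot-time or monitoring script.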
node:
	ip_port = 7777
	ip_address = 192.168.1.102
	number = 1
	name = secondmaster
	cluster = ocfs2

cluster:
	node_count = 2
	name = ocfs2