Achieving High-Availability with WSO2 Load Balancer and Keepalived

A system is considered highly available if it continues to serve requests even in the event of a failure. At the service level, high availability can be achieved by running a cluster of service nodes and fronting the system with a load balancer that has fail-over capability. If one of the nodes in the cluster fails, the load balancer detects the failure and directs requests to the remaining nodes, ensuring continuous availability of the services. Note, however, that all requests pass through the load balancer. So what if the load balancer itself fails unexpectedly during operation? There must be some way to handle failures of the load balancer as well.

One possible way is to have a cluster of load balancers fronted by a master load balancer. But the same question above applies to that master as well.

So the solution is to add a fail-over mechanism to the main load balancer itself. This can be done by running a backup load balancer alongside the master, which comes alive when the master goes into an error state. The question then is: how do we achieve this fail-over? Luckily, there are tools built for exactly this situation. Keepalived [1] is one such tool: it performs health checks on the services currently running in your system, and it can detect failures and act upon them.

In this post, I will go through how to use keepalived with WSO2 Elastic Load Balancer (WSO2 ELB). We will be using two ELBs: one as the master, which will be in the active state, and the other as the backup node, which will be in the passive state.

The following diagram illustrates the scenario of interest here.

WSO2 ELB and Keepalived configuration


Both ELBs run in the same sub-domain and are exposed to the outside world via the same virtual IP. Initially, keepalived assigns the virtual IP to the master ELB, which is active. When the master goes into an error state or goes offline, keepalived's health check detects this and assigns the virtual IP to the backup ELB. These details are hidden from the outside world, which only sees the virtual IP.
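Under the hood, this hand-off follows VRRP's election rule: the node advertising the highest priority holds the virtual IP. The rule can be sketched as a toy shell function (the node names and priorities below are illustrative, not from a real deployment):

```shell
# Given "name:priority" pairs, print the name of the node that VRRP
# would elect to hold the virtual IP (highest priority wins).
elect_vip_holder() {
  printf '%s\n' "$@" | sort -t: -k2,2 -rn | head -n1 | cut -d: -f1
}

# With the master alive (priority 101), it holds the VIP:
elect_vip_holder "elb-master:101" "elb-backup:100"    # prints "elb-master"

# If the master drops out of the election, the backup takes over:
elect_vip_holder "elb-backup:100"                     # prints "elb-backup"
```

This is why, later in the configuration, the master node gets priority 101 and the backup gets 100.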

Installing keepalived
Keepalived is available as a Debian package, so Debian-based systems can easily install it using the .deb package.

But some may want to install it on systems like Redhat/CentOS, because their production systems run on those Linux distributions. Installing keepalived on those systems is a bit harder: you first have to compile the keepalived source packages and then install them manually. This post [2] explains how to do this.

The same installation steps should be carried on both the WSO2 ELB nodes.

Configuring keepalived

As explained in [2], keepalived.conf is the main configuration file, where you specify the master/backup node settings. The following is the configuration for the master node.

vrrp_script wso2_elb_health_check {
    script "/opt/bin/check_wso2_elb"
    interval 2
    weight 2
}

vrrp_instance VRRP-director1 {
    virtual_router_id 51
    advert_int 1
    priority 101
    interface eth0
    state MASTER
    track_script {
        wso2_elb_health_check
    }
    virtual_ipaddress {
    }
}

The highlighted parts are the things to note here. For the same file on the backup node, you have to change the priority to “100” and the state to “BACKUP”.
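For completeness, the backup node's keepalived.conf would look as follows. This is a sketch mirroring the master configuration; the script path is assumed and the virtual_ipaddress block is left for you to fill in with your own virtual IP:

```
vrrp_script wso2_elb_health_check {
    script "/opt/bin/check_wso2_elb"
    interval 2
    weight 2
}

vrrp_instance VRRP-director1 {
    virtual_router_id 51
    advert_int 1
    priority 100
    interface eth0
    state BACKUP
    track_script {
        wso2_elb_health_check
    }
    virtual_ipaddress {
    }
}
```

Both nodes must use the same virtual_router_id and list the same virtual IP; only the priority and state differ.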

The check_wso2_elb script does the health check on the running WSO2 ELB instance. The following should be added to that script file.

#!/bin/bash
# Endpoint of the HealthCheckService proxy hosted on this ELB node;
# adjust the host and port to match your setup.
HealthCheckServiceEP="http://localhost:8280/services/HealthCheckService"

CURL=`which curl`
CODE=$(${CURL} -sL -w "%{http_code}" ${HealthCheckServiceEP} -o /dev/null)
if [[ ${CODE} == "200" ]]; then
    echo "Alive"
    exit 0
else
    echo "Dead"
    exit 1
fi

The above script is self-explanatory. It makes a call, using curl, to the endpoint on which the HealthCheckService is running and returns status 0 if the service is active and 1 if it is not. Based on this status, keepalived keeps or removes the virtual IP from the master node and assigns it to the backup node. When the master node comes back online, keepalived assigns the virtual IP back to it. The script can be extended further to do more upon detecting failures, for example sending alert e-mails to the system admin, which helps get the issue fixed sooner.
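The exit-code contract between the script and keepalived can be isolated into a small function for clarity (a sketch; the function name and the sample status codes are illustrative):

```shell
# Maps the HTTP status code returned by curl to the exit status that
# keepalived's vrrp_script expects: 0 = healthy, non-zero = failed.
health_status() {
  if [[ "$1" == "200" ]]; then
    echo "Alive"
    return 0
  else
    echo "Dead"
    return 1
  fi
}

health_status 200                                              # the node keeps the VIP
health_status 503 || echo "keepalived would move the VIP away" # any non-200 fails the check
```

Keepalived only looks at the exit status; the echoed text is just for humans reading the logs.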

Since WSO2 ELB has the capability to host proxy services, the HealthCheckService (which is a proxy service) will be hosted on it.

Configure WSO2 Load Balancer

As explained earlier, we will host the HealthCheck proxy service on the WSO2 ELB itself. The following is the sample proxy service. You have to place this file (HealthCheckService.xml) under $WSO2ELB_HOME/repository/deployment/server/synapse-configs/default/proxy-services/ on both ELB nodes.

<?xml version="1.0" encoding="UTF-8"?>
<proxy xmlns="http://ws.apache.org/ns/synapse" name="HealthCheckService"
       transports="https http" startOnLoad="true" trace="disable">
    <target>
        <inSequence>
            <header name="To" action="remove"/>
            <property name="RESPONSE" value="true"/>
            <property name="NO_ENTITY_BODY" scope="axis2" action="remove"/>
            <payloadFactory>
                <format>
                    <ns:getHealthCheckResponse xmlns:ns="http://services.samples"/>
                </format>
            </payloadFactory>
            <send/>
        </inSequence>
    </target>
</proxy>

When this proxy endpoint is called, it generates a 200 OK with the mock response below and returns it to the calling party.

<ns:getHealthCheckResponse xmlns:ns="http://services.samples"/>

Starting and checking the services
Keepalived is started using the following command.

$ /etc/init.d/keepalived start
Starting keepalived: [ OK ]

If you don’t see the above output or run into issues when starting, see the “Some tips and tricks” section below on fixing those.

After starting both ELBs, you can check whether the proxy service is correctly deployed, either by running the health check script manually or by making a curl request to the service endpoint.
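The curl check looks like the following sketch. Since an actual ELB is not always at hand, a local Python HTTP server stands in for the service endpoint here; in a real setup you would point curl at the HealthCheckService endpoint on the ELB node, and the port used below is illustrative:

```shell
# Stand-in for the ELB: serve HTTP locally so the check has something to hit.
python3 -m http.server 18280 --bind 127.0.0.1 >/dev/null 2>&1 &
SERVER_PID=$!
sleep 1

# The same curl invocation the health check script uses: fetch only the
# HTTP status code, discarding the response body.
CODE=$(curl -sL -w "%{http_code}" -o /dev/null http://127.0.0.1:18280/)

kill ${SERVER_PID}
echo "${CODE}"
```

A correctly deployed and reachable service returns 200; anything else means the health check would report the node as dead.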

Also, to check whether keepalived is running correctly, you can check the virtual IP addresses on both ELB nodes.

This can be done using the following command.

$ ip addr show eth0

You should find an entry in the master node's eth0 address list of the form

inet <virtual-ip> scope global secondary eth0

This is the virtual IP on the ethernet interface (eth0).
If you kill the master ELB manually, the above virtual IP should get assigned to the backup node. This is also what happens in a real failure case.

The above steps ensure the continuous availability of the load balancer service.

The above steps are also common to using other load balancers with keepalived. The only thing to work out is how to run a HealthCheckService on the instance on which the load balancer is running.

Some tips and tricks
On Redhat/CentOS
Sometimes, after installing keepalived on Redhat/CentOS, you may find that starting keepalived fails with the following error.

Starting keepalived: /bin/bash: keepalived: command not found [FAILED]

This is due to some missing items in the startup script of keepalived. You have to add the following to the start segment of the script.

start() {
    echo -n $"Starting $prog: "
    daemon /usr/local/sbin/keepalived ${KEEPALIVED_OPTIONS}
}

Also, after configuring and adding the relevant symbolic links, you need to add another symbolic link as follows.

ln -s /usr/local/etc/keepalived/ /etc/keepalived

This is because keepalived looks for its configuration file in that location. This also solves the issue of the virtual IP not being assigned to the relevant ELB instance.



About kishanthan

I’m currently working as a Software Engineer at WSO2, an open source software company. I hold an Engineering degree, majoring in Computer Science & Engineering field, from University of Moratuwa, Sri Lanka.
This entry was posted in WSO2. Bookmark the permalink.
