In this article we examine how Pacemaker and Corosync might be used to supercharge OpenSIPS and build a highly available clustered solution. The focus is entirely on High Availability rather than any form of load sharing. This means we are looking for a way to have more than one server contactable on the same IP address.
The starting point is assumed to be a pair of similar or identical servers connected on the same subnet, each running an instance of the opensips service (for all tests I was using OpenSIPS version 2.2.x).
A quick tour of Pacemaker concepts
Pacemaker and its sidekick Corosync are available as easily installed packages on most Linux distros. They are used to create and manage clusters. A cluster is simply a group of inter-connected servers that, at the application level, behave like a single system to provide high availability or load sharing. Each server within the cluster is referred to as a node.
Corosync provides a communication framework within which nodes are aware of each other and can communicate with each other regarding their own state and their awareness of the network environment around them. Corosync also acts as an event generator alerting the management system, through an API, of changes when they are detected.
Pacemaker is the management system that sits on top of Corosync (or a similar compatible communication backbone). Pacemaker manages one or more resources, where a resource is typically a service or facility running on the servers. The out-of-the-box installation usually includes support, in the form of agents, for a large number of potential resources. You are likely to only need a small number of these.
Agents are sub-divided according to “standard” and “provider”. Standards include ocf, systemd, stonith, etc. Providers include heartbeat and pacemaker itself.
Command line management and configuration tools (shells) are generally available alongside the pacemaker package – on newer CentOS or RedHat servers this is generally pcs and on Debian it is likely to be crm. These should provide a command that can be used to list the available resources. For example, using pcs the following commands will show a list of known standards, list of known providers and then a list of agents provided by heartbeat and using the ocf standard:
pcs resource standards pcs resource providers pcs resource agents ocf:heartbeat
The resource agent that is of most interest to us is the ocf heartbeat agent called IPaddr2. It provides a mechanism for dynamically assigning a floating or ‘Virtual IP’ address to one node in a multi-node cluster.
Installing Pacemaker and Corosync
There is no point providing full step-by-step instructions for installing the Linux HA platform here because there are a number of existing online articles that cover the topic very well. I would recommend the following:
- General information – http://clusterlabs.org/
- Step-by-step instructions for Ubuntu – https://www.digitalocean.com/ …tutorials…
- Step-by-step instructions for CentOS 7 – http://jensd.be/156/linux/building-a-high-available-failover-cluster…
- This Packt book covers all the essentials for CentOS 7 users: CentOS High Availability by Mitja Resman (furthermore, the embedded link currently opens an Amazon “preview” that seems to give free access to all 155 pages )
One tip I would offer regarding basic configuration is, from the outset, to edit the /etc/hosts file on every node in the cluster and make sure the primary IP address of every cluster node is listed along with the computer name you are using in your cluster configuration commands. For example, this is what my /etc/hosts file looked like when I was setting up a test rig on my LAN with three server nodes:
127.0.0.1 localhost localhost.localdomain ::1 localhost localhost.localdomain localhost6 localhost6.localdomain6 192.168.3.125 Test-vSvr2A 192.168.3.126 Test-vSvr2B 192.168.3.130 Test-vSvr3
Once the packages have been installed and the basic Corosync configuration (access details, server names, cluster created, nodes defined, etc) is completed, you will need to create and configure resources.
Using the IPaddr2 agent with OpenSIPS
As mentioned above, the IPaddr2 agent provides a facility for automatically assigning a floating or ‘Virtual IP’ (VIP) address to one node in the cluster. We are trying achieve the goal of making two OpenSIPS servers “act like a single system” and the simplest way to do this is to have the primary target address for OpenSIPS (as seen by the rest of the network) as the VIP address and for Pacemaker to control which of our two OpenSIPS servers currently holds the VIP address. For an OpenSIPS cluster, this effectively means that the current active node in the cluster is the one that has the VIP. You then have to get used to switching the active node back and forth using the relevant pcs or crm commands – for example, momentarily switching a node to “standby” will cause Pacemaker to assign the VIP to the other node (provided it is still “online”). Always remember to set the node back to an “online” status after switching it to standby or you risk complete failure should both nodes be in standby mode.
Ideally, we might want the opensips service to be active on both servers irrespective of which one currently holds the VIP address. However, there is a problem with this – when the opensips service starts it attempts to bind to the interfaces defined in any “listen” statements. The service will fail to start if the address given in a listen statement is not currently assigned to any interface on the local host server, as would be the case on the standby server. Thus, we must choose one of the following options:
- Automatically start the opensips service on the node that receives the VIP and stop it on the node that looses the VIP
- Find a trick that allows the opensips service to start (and keep running) on a node even when that node does not have the VIP
If we choose option 1, the solution lies within Pacemaker itself because there are agents available to control the facilities that start and stop services. For example, you could use the systemd agent to manage and control the opensips service. Other agents are available for non-systemd Linux distros. Please note that you will also need to configure resource constraints to ensure the two resources are colocated and that they start in the right order. Details are given below.
If we choose option 2, the best solution I know of is to edit the file /etc/sysctl.conf and add the following line:
net.ipv4.ip_nonlocal_bind = 1
This allows the opensips service to start even if the config script contains listen statements for IP addresses that are not currently assigned to a local interface. Note that changes to the sysctl.conf file only become effective following a reboot of the server.
Using Pacemaker resource constraints to co-locate IPaddr2 and SystemD
A quirk of Pacemaker’s default behaviour is that it will attempt to spread resources fairly evenly across your cluster nodes. In other words if there are two nodes and two resources, it will assign one resource to each node. It is trying to be helpful by doing some load sharing, but when one of the resources is IPaddr2 it may actually be doing the opposite of what you want! In the case of option 1 above, we require the opensips service to run on the cluster node that currently has the Virtual IP address.
The solution comes in the form of “resource constraints”. First you need one that tells Pacemaker to co-locate the two resources; then you need another to tell Pacemaker that the floating IP address must be assigned first before the opensips service is allowed to start. The pcs commands to do this are as follows (I’ve used resource names of “ClusterIP” for IPaddr2 and “OSIPS” for the systemd resource that controls the opensips service):
pcs constraint colocation add OSIPS ClusterIP INFINITY pcs constraint order ClusterIP then OSIPS
Under test – does it work?
The solution described above provides a starting point on the road towards a resilient highly-available system, but the job is by no means complete and there are design choices that need to be made in the light of your own expectations. Let me explain in a little more detail.
What will trigger failover?
Kill the power to server 1 and pacemaker will very quickly re-assign the VIP to server 2. Great, we can tick that box.
However, if the opensips service stops with a segfault then pacemaker might restart the service after a delay, but only if you configured a resource agent to start and stop the opensips service. If you chose not to use a resource agent to start the service then you can achieve the same result by adding the parameter “Restart=always” in your systemd unit file.
If you deliberately stop the opensips service on the active node (for example, using the command “systemctl stop opensips”) then Pacemaker will probably not restart the service or flip the VIP across to the other server.
Suppose you wanted to make a small change to the opensips.cfg config file, but you made a mistake and broke the script with a syntax error. When you restart the opensips service and see that dreaded “service failed to start” message, you might hope that Pacemaker would come to the rescue and save you from the embarrassment of having to explain to your boss what happened. No such luck I’m afraid! Even though the backup server is still viable (where we hope the old error-free version of the script is still in place), Pacemaker – using the basic configuration described in this article – will almost certainly just keep trying to restart the opensips service with the dodgy config file on server 1.
The default behaviour for Pacemaker, when failover is triggered, is to then leave the resources running on the new node even if the original fault that triggered failover is fixed on the old one. If you do not want this behaviour, it is necessary to assign a resource location constraint that instructs Pacemaker to “prefer” one node or another. A command like this, although node and resource names may differ, should do:
pcs constraint location OSIPS prefers Test-vSvr2A=50
What about the database?
What arrangement does your OpenSIPS server use for storing persistent data? The most commonly used solution is a MySQL database, but this could be running on the same server (localhost) or could be on a remote address. The remote system might be a single server or a complex clustered highly-available solution such as MySQL Cluster or Galera Cluster.
Now that we have two OpenSIPS servers arranged as a cluster, should each server be pointing to a different database or should they both be pointed at the same “shared” database? There is no “right answer” I’m afraid. You have to make that decision based on knowledge of reliability and resilience of the remote database system. Also, it depends a lot on whether a mechanism for data replication exists. Data replication is possible between database servers, but since version 2.2 of OpenSIPS was released, it is also possible for OpenSIPS to provide limited replication of data – registration contacts (the location table) and dialogs at least.
Will calls be dropped when failover is triggered?
This depends very much on the path being used for the RTP media and on how widespread the fault is that triggered failover. If it was a local fault on just one of your OpenSIPS servers and that server had no role in handling the RTP media then existing calls should continue uninterrupted. The SIP messages are mostly sent and received when a call is set up and when it ends. During the call, there are a lot of RTP packets flowing but very few SIP messages being exchanged.
When RTPproxy or Mediaproxy are used, you may prefer to host them on different servers to OpenSIPS. This spreads the risk and makes it less likely that calls will be dropped when a problem happens.
However, it is very difficult to avoid the risk that audio would stop on many calls should one of your mediaproxy servers die.
Pacemaker and Corosync give us an excellent tool kit that can be used to great effect, especially for management of a virtual IP address shared between two or more servers in a cluster. However, it has to be seen as just one piece of the jigsaw. Building a resilient highly-available solution is a multi-faceted problem and the overall shape and architecture of your design will depend on a wide range of factors, not least your own expectations for reliability and cost.