Using the Clusterer Module for contact replication

Summary

In this, the second part of a three-part article about the Clusterer Module, I explain how I got on when testing a pair of OpenSIPS Registrar Proxies configured as a highly available cluster. The design, which uses Pacemaker to assign a floating IP address (the VIP) to the currently active server, is described in part 1 (see Scenario 2 for a complete description of the solution).

My tests simulated a typical real-world scenario, using a hardware IP phone registered on the OpenSIPS Proxy. It was considered especially important during testing to check for any unexpected interactions with the nathelper module when it sends “NAT Pings” to the remote IP phone. This is a crucial part of the operation of a Registrar Proxy for many ITSPs who need to be able to support a wide range of client devices connecting via the Internet, often through some form of NAT router/firewall at the customer’s premises.

Essential Elements for a SIP Registrar Server

The purpose of a SIP Registrar server is to accept registrations from customer devices and save the details of each device in a store called the location table. When a call or other SIP request needs to reach the device, it does so through an integrated SIP Proxy server which looks up the current address of the device in the location table. If a device moves or its home address changes, the data in the location table are updated the next time the device re-registers. If the device is disabled or stops working, the relevant record in the location table will expire (typically within 30 minutes) and its details are automatically deleted by the Registrar server.
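
In OpenSIPS terms, that behaviour maps onto two functions from the registrar module: save(), which writes or refreshes the Contact in the location table when a REGISTER arrives, and lookup(), which retrieves the device’s current address when a request needs to be routed to it. Purely as an illustration, a stripped-down fragment might look like this (a real opensips.cfg needs a great deal more, of course):

loadmodule "signaling.so"
loadmodule "usrloc.so"
loadmodule "registrar.so"

route {
  if (method == "REGISTER") {
    # Save (or refresh) this device's Contact in the location table
    save("location");
    exit;
  }

  # For calls and other requests, look up the device's current address
  if (!lookup("location")) {
    send_reply("404", "Not Found");
    exit;
  }

  # ... onward routing (t_relay etc.) would go here
}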

Since setting up Smartvox Limited in 2004, I’ve been lucky enough to work with some great customers at more than 25 different Internet Telephony Service Providers. While no two businesses have wanted exactly the same solution, there is one requirement that comes up time and time again – the need for a SIP Registrar server that can deliver reliable connectivity for a wide range of SIP User Agents running in various network environments. The scenario that most often breaks this connectivity is far-end NAT. In other words, the customer has a SIP device (VoIP phone or IP-PBX) that sits behind a NAT router at the customer’s premises and connects to the ITSP via the Internet.

Fortunately, OpenSIPS has some built-in tools designed specifically to overcome the problems of far-end NAT – specifically, the following:

  • NATHELPER module: Provides functions to detect far-end NAT, to fix the IP addresses in certain headers and to send regular “NAT Pings” to devices behind NAT so as to keep the pinhole in the firewall permanently open
  • Mediaproxy integration: Allows tight integration with the AG-Projects Mediaproxy product which is used to proxy the RTP media streams in a way that avoids the need for inbound connections to be made through the customer’s NAT router/firewall. Since most NAT routers allow unrestricted outbound connections, this generally overcomes the problem of NAT traversal for RTP.
  • RTPProxy integration: An alternative to Mediaproxy that performs a very similar role, but uses an open source product.

In all the tests described in this article, the Nathelper module is used to detect registrations from behind NAT and to send SIP Pings (OPTIONS requests) to all registered devices with an interval of 25 seconds.

loadmodule "nathelper.so"
modparam("nathelper", "natping_interval", 25)
modparam("nathelper", "ping_nated_only", 0)
modparam("nathelper", "sipping_bflag", "NAT_SIP_PINGS")

#.. in main route
 if (nat_uac_test("59")) {
   force_rport();
   if (method=="REGISTER")
     fix_nated_register();
   else
     fix_nated_contact();
   setflag(NATTED_SRC);
 }

#..in the section that handles REGISTER requests
 setbflag(NAT_SIP_PINGS);

 if (isflagset(NATTED_SRC))
   setbflag(NATTED_CLIENT);

Putting Contact Replication under test

To begin with, both servers in the cluster – vSvr2A and vSvr2B – are fully operational. This simulates normal operating conditions where the OpenSIPS service is running on both servers, but only the active server has the VIP. Next, I power up a VoIP phone which has an account configured to register with the cluster using the VIP address. Contact Replication is enabled as described in part 1.
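
For reference, the replication itself is driven by the clusterer module working alongside a couple of usrloc parameters. The exact settings are in part 1, but broadly the configuration looks something like the snippet below. Note that the parameter names are those from the OpenSIPS 2.x documentation current at the time of writing (they have changed in later releases), and the address, port and node id shown here are placeholders rather than the values from my lab:

# each node listens on its own address for the binary replication
# protocol - the VIP is only used for SIP
listen = bin:192.168.3.61:5566

loadmodule "proto_bin.so"
loadmodule "clusterer.so"
modparam("clusterer", "db_url", "mysql://opensips:opensipsrw@localhost/opensips")
modparam("clusterer", "server_id", 1)              # 2 on vSvr2B

# usrloc: push location updates to cluster 1 and accept updates from peers
modparam("usrloc", "replicate_contacts_to", 1)
modparam("usrloc", "accept_replicated_contacts", 1)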

Even though each server has its own separate copy of the location table, Contact Replication ensures that both copies are synchronised. It not only updates the database table, but also updates the cached copy of the table stored in memory by the OpenSIPS application. As soon as a device registers on the active server, we can see the new registration appearing in the location table of both servers. This all works very nicely, not just for new registrations, but also for re-registrations and de-registrations.

A point worth noting is that every field from the location table record is duplicated. This means the ‘socket’ field, which stores details of the protocol and interface IP where the registration request arrived, shows the VIP address in both copies of the data.

Screen shot of location table data
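
If you do not have a GUI to hand, the same check can be made directly against the database with a simple query (this assumes the standard MySQL schema, where the table is simply called location; adjust the database name and credentials to suit your installation):

mysql -u opensips -p opensips -e \
  "SELECT username, contact, received, socket, expires FROM location;"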

The same is true for the cached data too:

# opensipsctl ul show
      Domain:: location table=512 records=1
      AOR:: 4321
      Contact:: sip:4321@192.168.3.60:6050;line=tswye3w3 Q=1
      Expires:: 1451
      Callid:: 3c26701a3d9f-osz6q2mtbkax
      Cseq:: 6
      User-agent:: snom360/7.3.30
      Received:: sip:192.168.3.60:6050
      State:: CS_SYNC
      Flags:: 0
      Cflags:: NAT_SIP_PINGS NATTED_CLIENT
      Socket:: udp:192.168.3.100:5060
      Methods:: 7999

Using a SIP packet capture application to monitor the network interfaces on the primary (active) server, we can see that OpenSIPS sends an OPTIONS request to the registered handset every 25 seconds, as expected. But what happens on the backup server? The answer is that the backup server attempts, but fails, to send an OPTIONS request to the registered device. It fails because OpenSIPS always attempts to send the NAT Ping request from the interface defined in the ‘socket’ field (assuming that field is populated) and, in our tests, that field contains the VIP address. Since the VIP address is not assigned on the backup server, this causes errors every 25 seconds like this:

10:57:03 ERROR:core:proto_udp_send: sendto(sock,0x7f0090d68d40,269,0,0x7ffe3306a2a0,16): Invalid argument(22)
10:57:03 CRITICAL:core:proto_udp_send: invalid sendtoparameters#012one possible reason is the server is bound to localhost and#012attempts to send to the net
10:57:03 ERROR:nathelper:msg_send: send() to 192.168.3.60:6050 for proto udp/1 failed
10:57:03 ERROR:nathelper:nh_timer: sip msg_send failed
10:57:28 ERROR:core:proto_udp_send: sendto(sock,0x7f0090d68d40,269,0,0x7ffe3306a2a0,16): Invalid argument(22)
10:57:28 CRITICAL:core:proto_udp_send: invalid sendtoparameters#012one possible reason is the server is bound to localhost and#012attempts to send to the net
10:57:28 ERROR:nathelper:msg_send: send() to 192.168.3.60:6050 for proto udp/1 failed
10:57:28 ERROR:nathelper:nh_timer: sip msg_send failed

A critical flaw

Other than filling up the log files, the error messages on the backup server are doing no real harm as such, but they have the potential to do great harm in one particular circumstance: when the OpenSIPS v2 feature designed to automatically delete zombie registrations is also in use. This feature can be enabled by adding the following lines to your opensips.cfg script:

#..in the module config section
modparam("nathelper", "remove_on_timeout_bflag", "KILL_ZOMBIES")
modparam("nathelper", "max_pings_lost", 4)

#..somewhere in the section that deals with REGISTER requests
setbflag(KILL_ZOMBIES);

Enabling this feature in combination with contact replication completely breaks the registrar server. The problem is that the zombie-detection logic is triggered (a false positive) by the UDP transmission failures on the backup server when it tries to ping the registered handset. The backup server then deletes the offending registration from its location table and, if you have two-way contact replication, the delete operation is also replicated across to the primary server, resulting in deletion of the contact on both servers, in the database table and in the memory-cached copy held by OpenSIPS. Oops!

How to overcome the problems with Clusterer plus Nathelper

When a description of this problem was sent to the OpenSIPS users forum, the only suggestion offered was to toggle pinging on or off using the MI command nh_enable_ping. This was an imaginative idea and, by using Pacemaker Alerts to run a bash script, I have been able to create a working solution. It is disappointing, however, that OpenSIPS does not provide a viable solution out-of-the-box for designers who want to combine far-end NAT remedies with detection of zombie registrations and High Availability clustering using the Clusterer module.
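
I will not reproduce my scripts in full here, but the outline is straightforward: Pacemaker calls an alert agent whenever a resource changes state, and the agent switches pinging on or off with the nh_enable_ping MI command. The sketch below illustrates the idea; it is not the exact script I used and it makes a few assumptions - that the floating IP resource is called ClusterIP, that the VIP is 192.168.3.100 and that opensipsctl can reach the OpenSIPS FIFO from the alert agent:

#!/bin/bash
# /usr/local/bin/nat_ping_alert.sh - Pacemaker alert agent (sketch only)
# Enables NAT pinging when this node holds the VIP and disables it when it
# does not, using the nathelper MI command nh_enable_ping.
# The resource name and VIP address below are assumptions - adjust to suit.

VIP="192.168.3.100"
VIP_RESOURCE="ClusterIP"

# Only act on resource events that concern the floating IP
[ "$CRM_alert_kind" = "resource" ] || exit 0
[ "$CRM_alert_rsc" = "$VIP_RESOURCE" ] || exit 0

if ip -o addr show | grep -q "inet $VIP/"; then
    opensipsctl fifo nh_enable_ping 1    # this node owns the VIP - ping away
else
    opensipsctl fifo nh_enable_ping 0    # VIP is elsewhere - stay silent
fi
exit 0

The agent is then registered with Pacemaker using something along the lines of “pcs alert create path=/usr/local/bin/nat_ping_alert.sh”.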

Testing cluster failover

After allowing myself to be slightly sidetracked there, let’s get back to the main theme of this article. How does the HA cluster perform when one node stops? What role does contact replication play in this situation?

Controlled shutdown of the primary active node

There are several different ways that we could choose to test node failure. The first test I tried represents a deliberate, controlled shutdown of one node, carried out by issuing a command telling Pacemaker to put the primary node (vSvr2A) into standby mode:

pcs cluster standby vSvr2A

The results were very promising, giving an apparently seamless and immediate failover. Calls to my registered handset worked exactly the same after I issued the command as they did before. Even a call that was in progress was not cut off and, because the media for my test call was being transmitted directly from endpoint to endpoint, there was no loss of audio. The OPTIONS NAT-pings to the handset continued, now sent by the standby server (vSvr2B), as soon as it had taken over the VIP address. Failover was almost instantaneous and there was no need for the OpenSIPS service to start or restart – it was already running on the standby server and it already held all the required data, including the registration contact details.
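
One small footnote: when the maintenance work is finished, the node is brought back into the cluster with the matching command shown below (whether the VIP then moves straight back depends on the resource stickiness and location constraints you have configured):

pcs cluster unstandby vSvr2A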

Uncontrolled failure or segfault

My next test was meant to represent an uncontrolled breakdown of the service, such as might happen if there were a segfault. To simulate this condition, I simply issued a killall command on the active server:

killall opensips

The results from this test were not so good. My SIP packet capture software showed that the OPTIONS NAT-pings were no longer being sent to the registered handset, and calls made to the handset immediately after the simulated fault returned “service unavailable”. However, after about 30-40 seconds, Pacemaker restarted the OpenSIPS service on the original server and everything went back to normal. The active node had not changed, so in this scenario being part of a cluster offered little or no benefit.

Complete server failure

For this last test I simply did the equivalent of switching off the power to the primary active node (in fact I was using virtual servers so this was simulated through the power options on the virtual machine management console).

As with my first test, the results were excellent. The standby server took over immediately and a call made to the registered handset just after I hit the power switch worked without a hitch. The SIP packet monitoring software on the backup server showed that it started to send OPTIONS NAT-pings to the registered handset very soon after the simulated failure – certainly within the 25 second interval.

Conclusions

Mixed results overall, but for what is essentially a “home-baked” clustering solution it goes a long way towards meeting most people’s objectives without the need to spend thousands on a turn-key commercial solution (supposing such an option even exists). I particularly like the ability to do an almost seamless controlled shutdown of the active node, because this allows planned maintenance of the software on that node. This is essential if you want to keep your servers up to date: it lets you update installed packages, or even the kernel, and it lets you upgrade OpenSIPS itself as and when newer versions or patches are released. A big advantage of each node having its own independent copy of the database is that you can leave one node running while updating the DB schema on the other node.

The less attractive findings from my tests are:

  1. The incompatibility found when using the Clusterer and Nathelper modules alongside zombie registration detection
  2. The poor response to a service failure such as a segfault, simulated using the killall command

The module incompatibility (1) is very unfortunate, but it can be overcome using a somewhat elaborate work-around based on Pacemaker Alerts, tailored bash scripts and the MI command that tells OpenSIPS to enable or disable NAT pings. As for the poor response to a segfault (2), I am sure it must be possible to configure Pacemaker to detect and react to service failures; it is quite likely that my knowledge of Pacemaker is simply insufficient to know how best to fix this. Ideally, one would want some kind of system monitoring that checks whether the main SIP port (usually 5060) is in “listen” mode and forces the node into standby whenever the port ceases to be listening. If anyone reading this article knows how that can be done, please post a comment.
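
In the meantime, as a rough illustration of the kind of check I have in mind, the untested sketch below tests whether anything is listening on UDP port 5060 and, if not, asks Pacemaker to put the local node into standby. It assumes UDP transport on port 5060 and the same pcs syntax used earlier, so please treat it as a starting point only:

#!/bin/bash
# /usr/local/bin/sip_port_watchdog.sh - untested sketch
# If nothing is listening on UDP port 5060, ask Pacemaker to put this node
# into standby so that the VIP moves to the other node.

NODE="$(uname -n)"

if ! ss -lun | grep -q ':5060 '; then
    logger -t sip-watchdog "No listener on UDP 5060 - putting $NODE into standby"
    pcs cluster standby "$NODE"
fi

Run from cron or a systemd timer, that would at least force a failover; a complete solution would also need a safe way of taking the node back out of standby once OpenSIPS is listening again.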

Further reading

In the third part of this article, I will be looking at how the Clusterer module may be used to replicate data between two data centres. This is a slightly different scenario to the two HA Clustering scenarios already considered, but demonstrates how the Clusterer module can greatly enhance the effectiveness of a dual site resilient solution.

 
