As I mentioned in an earlier post, we have deployed an HA pair of SIP T-Servers at this client and are using Windows Network Load Balancing (NLB) between them.
Although this is a Genesys-recommended solution, it has a number of known issues, and Genesys are apparently developing a stateless proxy to address them.
In the interim we have been using an alarm reaction to enable or disable port 5060 on the relevant host during a switchover.
We have finally managed to make the switchover process reliable by using the following sequence:
- Switch over to the backup component
- Stop and then start the component that was previously in PRIMARY mode. This seems to be the secret!
- Enable the cluster host where the backup component is running and disable the cluster host where the previous primary component was running
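
The ordering above can be written down as a small planning function. This is only a sketch of the sequence, not the actual alarm reaction scripts: the `Step` type, the action names, and the application and host names are hypothetical, and mapping each step onto real Genesys and NLB control commands is left to your environment.

```python
from typing import List, NamedTuple

SIP_PORT = 5060  # port toggled on the NLB hosts, as described in this post


class Step(NamedTuple):
    action: str  # hypothetical action name, not a real Genesys or NLB command
    target: str  # application or NLB cluster host the action applies to


def plan_switchover(old_primary_app: str,
                    old_primary_host: str,
                    backup_host: str) -> List[Step]:
    """Return the switchover steps in the order that proved reliable."""
    return [
        # 1. Switch over so the backup component becomes primary.
        Step("switchover", old_primary_app),
        # 2. Stop and then start the component that was previously primary.
        Step("stop_application", old_primary_app),
        Step("start_application", old_primary_app),
        # 3. Only then move the SIP traffic: enable the host that is now
        #    primary and disable the host that was primary before.
        Step(f"nlb_enable_port_{SIP_PORT}", backup_host),
        Step(f"nlb_disable_port_{SIP_PORT}", old_primary_host),
    ]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for step in plan_switchover("SIPServer_A", "nlb-host-a", "nlb-host-b"):
        print(step.action, "->", step.target)
```

For fail-back, the same function would be called with the roles swapped, so the restart always happens before the NLB hosts are toggled.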
I have configured a pair of alarm reactions using Log Event 5150 to automate the process. Note that the cancel timeout is set to 5 seconds to ensure that the alarm reactions fire again on fail-back.
During normal running on the primary, the NLB status is as follows:
On failover, the NLB status is as follows: