Get well Rob!

Rob from Empirix had a “little” accident this week. No excuse for not cracking on with a baseline test though!

Get well soon mate.

[Photo: Rob Hearn in hospital]


SIP Server 8.0.400.25 – Unsupported URI Scheme

As feared, our Avaya SIP interoperability issues from last year have come back now that we have upgraded to SIP Server 8.0.400.25 in order to fix a Stream Manager resilience problem (see earlier post – Release 1 Operational Acceptance Testing).

SIP INVITE messages are now getting bounced with error 416 – Unsupported URI Scheme. We believe that this is because the INVITE contains “;transport=tls” even though we are not using TLS on the Genesys side and it is not enabled in SES on the Genesys mappings!

However, TLS is enabled between Avaya CM (CLAN cards) and SES. The options would seem to be:

1) Set “sip-tls-port=0” on SIP server to disable it

2) Disable TLS between CM and SES (mappings and SIP signalling links)

My money is on option (2) at the moment.
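
For reference, a quick way to confirm which leg is adding the offending parameter is to pull the URIs out of a SIP trace and flag any that carry ";transport=tls". The sketch below is illustrative only: it assumes the trace has been exported as plain text (for example from Wireshark) to a file called sip_trace.txt.

# Minimal sketch: flag SIP URIs in a plain-text trace that specify transport=tls.
# The trace file name and export format are assumptions, not part of the actual setup.
import re

URI_PATTERN = re.compile(r"sips?:[^\s<>;]+(?:;[^\s<>]*)?")

def find_tls_uris(path):
    """Return (line_number, uri) pairs where a URI carries ;transport=tls."""
    hits = []
    with open(path, encoding="utf-8", errors="replace") as trace:
        for lineno, line in enumerate(trace, start=1):
            for uri in URI_PATTERN.findall(line):
                if "transport=tls" in uri.lower():
                    hits.append((lineno, uri))
    return hits

if __name__ == "__main__":
    for lineno, uri in find_tls_uris("sip_trace.txt"):
        print("line %d: %s" % (lineno, uri))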


PSN2414 and TSAPI Link Flooding (again)

In a previous post I mentioned TSAPI link flooding problems and that the fix was contained in PSN2414.

Basically PSN2414 is an upgrade to AES 4.2.4 plus configuration to use reserved TSAPI licenses as opposed to checking global licenses in and out via Avaya WebLM.

At this client, however, we have an Enterprise Wide licensing model, which means that reserved licenses cannot be used.

Currently there are discussions about this taking place between Genesys and Avaya. I suspect the outcome may be a “commercial agreement”. Other than that, there may be some options if we upgrade to AES 5.2.2: the release note for this version contains a section, “Reserving TSAPI User Licenses”, which implies that with AES 5.2.2 reserved licenses can be implemented with an Enterprise Licensing model.

The release note also contains the comment “For AE Services 5.2, the use of floating licenses is not recommended”.

Will keep you posted!


TSAPI Link Flooding

Another problem this week caused an aborted Empirix performance testing cycle. When we added additional extension DNs we started to see problems with DNs not being registered/monitored correctly. The Avaya TSAPI server logs were full of the following:

CSTAUniversalFailureConfEvent
{
error requestTimeoutRejection
}

CSTAUniversalFailureConfEvent
{
error outstandingRequestLimitExceeded
}

Also, in the Avaya AES logs we saw the following:

16:30:35 ERROR:CRITICAL:TSAPI:TSERVER:../ClnMsg.cpp/417 10 CLNTMSG[1]: Message CSTAMonitorCallsViaDevice for client Genesys avayatsapi_server ac 10.52.151.6, driver AVAYA#SWITCH#CSTA#KFNAY6206P, is being rejected because of driver flow control. The number of messages for this driver exceeds the allowed threshold. Messages Queued to Tserver/Driver: 752 (0x2f0), Priority Messages Queued: 0 (0x0), Messages Allocated: 51 (0x33), Max Flow Allowed: 800 (0x320)

16:30:35 ERROR:FYI:TSAPI:TSERVER:../ClnMsg.cpp/417 10 CLNTMSG[1]: If flow control occurs frequently for driver AVAYA#SWITCH#CSTA#KFNAY6206P, consider distributing traffic for this driver across additional AE Servers. If this problem occurs only intermittently, use CTI OAM Administration (Administration > Security Database > Tlinks) to increase the value of the Max Flow Allowed field.
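
To get a feel for how often flow control is kicking in, the snippet below pulls the queue depth and “Max Flow Allowed” values out of log lines in the format shown above and summarises them. The log file name is an assumption; the parsing is based only on the message text quoted here.

# Rough sketch: summarise driver flow-control rejections in an AES TSAPI log.
# The file name is an assumption; parsing follows the message format quoted above.
import re

LINE_RE = re.compile(r"Messages Queued to Tserver/Driver:\s*(\d+).*Max Flow Allowed:\s*(\d+)")

def summarise(path):
    rejections = 0
    worst_queue = 0
    max_allowed = None
    with open(path, encoding="utf-8", errors="replace") as log:
        for line in log:
            match = LINE_RE.search(line)
            if match:
                rejections += 1
                worst_queue = max(worst_queue, int(match.group(1)))
                max_allowed = int(match.group(2))
    print("flow-control rejections: %d" % rejections)
    if max_allowed is not None:
        print("worst queue depth seen: %d (limit %d)" % (worst_queue, max_allowed))

if __name__ == "__main__":
    summarise("aes_tsapi.log")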

We are using T-Server for Avaya TSAPI (8.0.006.03) connected to Avaya AES 4.2.1.

After some solution searching we upgraded the Avaya TSAPI client from 4.1 to 5.2.4, without any success. After a bit of playing around we found a temporary workaround by setting various T-Server options:

background-processing = false
use-link-bandwidth = 8 ***
use-link-bandwidth-startup = 8 ***
use-link-bandwidth-backup = 8 ***
max-attempts-to-register = 10
register-attempts = 5
register-tout = 2 sec

However, the above settings limit the CTI link bandwidth to 8 messages/second so it takes a long time to restart T-Server!
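
To put that in perspective, here is a back-of-the-envelope estimate of how long DN re-registration takes at a throttled link rate. The DN count and the assumption of roughly one CTI message per DN registration are illustrative guesses, not measured figures from this environment.

# Back-of-the-envelope sketch: time to re-register DNs at a given CTI link rate.
# The DN count and messages-per-DN figure below are illustrative assumptions.

def restart_minutes(dn_count, messages_per_dn, link_msgs_per_sec):
    total_messages = dn_count * messages_per_dn
    return total_messages / link_msgs_per_sec / 60.0

if __name__ == "__main__":
    for bandwidth in (8, 100):  # the throttled value vs a higher illustrative setting
        minutes = restart_minutes(dn_count=3000, messages_per_dn=1, link_msgs_per_sec=bandwidth)
        print("use-link-bandwidth=%d: ~%.1f minutes to re-register 3000 DNs" % (bandwidth, minutes))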

Further solution searching came up with the following:

“In most cases Avaya PSN2414r2 is required for TSAPI TServer to function correctly. Avaya PSN2414r2 is a restricted patch that allows TSAPI to pull licenses up front instead of individual SSL sessions each time TServer registers DNs, routes calls, or monitors calls.”

We are running AES 4.2.1, which does not include the PSN2414 patch. The next step is to upgrade AES to 4.2.4 (version 5.2, although officially supported by Genesys, is a major upgrade and deemed too risky at the moment).


Release 1 Operational Acceptance Testing (OAT)

We are now into week 2 of Release 1 OAT testing. Some notable fixes this week are:

Configuration Server

After a failure of the primary Configuration Server, it is not possible to log Configuration Manager or CCPulse in to Config Server Proxy while the primary remains down.

This has been fixed by setting the reconnect timeout on Configuration Server to 10 (seconds) rather than the default of 0. The underlying issue was fixed in Configuration Server 7.6.000.42 as part of ER# 221781113; however, since we are on Configuration Server 8.0.000.09, that fix does not seem to have made it into version 8 of Configuration Server!

Stream Manager Resilience

We have 4 Stream Managers at each site. When 3 out of the 4 Stream Managers are shut down, all calls to an advisor default route. With two Stream Managers up, calls also default route. With 3 Stream Managers up, a call routes correctly to an advisor.

This seems to be fixed in SIP Server 8.0.400.25 as part of ER# 230151967 and ER# 102209228:

“SIP Server now retries treatments only on media servers that are still in service (the out-of-service check shows the Voice over IP Service DN (service-type set to treatment) as available)”

“SIP Server no longer sets a DN to out of service in a scenario where a call is routed to an unresponsive device and a caller abandons the call before the sip-invite-timeout timer expires. If the caller does not abandon the call during the sip-invite-timeout time period, then, when this timeout expires, SIP Server sets the unresponsive device to out of service. Once the recovery-timeout timer configured for this device expires, SIP Server sets it back in service”

There are some possible workarounds, but it looks as though a SIP Server upgrade is on the cards – let’s hope our Avaya SIP interoperability issues from last year do not come back!

For information, the possible workarounds are:

1. Set option “sip-invite-treatment-timeout = 5” on SIP servers


2. Remove OOS configuration on VoIP Service DNs. To do this add a new option:

sip-oos-enabled = false

This should ensure that a treatment is re-applied to a call on the Stream Manager during a failover.

3. Change the VoIP Service DN options to:

sip-oos-enabled = true
oos-check=2
oos-force=2
recovery-timeout=3
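
For what it’s worth, here is a rough sketch of what those numbers imply for how quickly a dead Stream Manager would be taken out of, and brought back into, service. The interpretation of oos-check as a polling interval and oos-force as the grace period before the DN is forced out of service is my assumption rather than documented behaviour.

# Rough sketch of the detection/recovery window implied by the workaround 3 values.
# ASSUMPTION (not confirmed from documentation): oos-check is the polling interval
# in seconds, oos-force is the grace period before an unresponsive DN is forced out
# of service, and recovery-timeout is how long it stays out of service before retry.

OOS_CHECK = 2         # seconds between out-of-service checks
OOS_FORCE = 2         # seconds of no response before forcing out of service
RECOVERY_TIMEOUT = 3  # seconds before the DN is put back in service

worst_case_detection = OOS_CHECK + OOS_FORCE
print("worst-case time to mark a dead Stream Manager out of service: ~%ds" % worst_case_detection)
print("time before SIP Server puts the Stream Manager back in service: %ds" % RECOVERY_TIMEOUT)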


Empirix Performance Testing

I mentioned in an earlier post that we have been undertaking (or trying to undertake) performance testing of the end-to-end solution at this client prior to rollout of the pilot (which went live in November 2009) to an additional two Contact Centre sites and a couple of thousand advisors.

The Empirix testing infrastructure consists of 6 G5 Load Generators (2 at each of the 3 Contact Centre sites) and 3 Virtual Agent Simulators (VAS) at each of the Contact Centre sites.

We are injecting calls directly into the centralised Avaya SES server to be as representative of ISDN call ingress as possible. From a call flow perspective this means that calls are injected into SES, which then forwards the SIP INVITE to Avaya Communication Manager (CLAN cards). The call hits a VDN in the same way as a normal ISDN call and is routed to Genesys via SES over SIP signalling links.

We have been working on 2 major issues for the last few weeks:

  • GVP (IPCS) crashes under load at a rate of 50 calls/minute. Although new calls continue OK, we observe “stuck” calls on GVP ports
  • Calls hanging on Avaya stations even though they have been cleared down on the G5 load generators (e.g. a SIP BYE message has been sent). We do not want to clear down from the VAS end as this is not representative of the business process, whereby advisors must wait for the caller to hang up

This week we have finally resolved both issues and have had an informal test run at moderate load. Here is what we found…

IPCS Crashes

This turned out to be a JavaScript issue affecting IPCS 7.6.410 (MR1) all the way up to IPCS 7.6.470 (MR7), which is the latest release at the time of writing. The root cause is still under investigation by Genesys Engineering, since it is not a good idea for a GVP Studio application “bug” that causes a JavaScript exception to be able to crash a whole IPCS.

The error occurred when retrieving configuration data from a custom “config.xml” file in the following JavaScript line:

<assign name="VOXFILEDIR" expr="GetData(VOXFILEDIR, 'VOX_FIlE_PATH')"/>

And was fixed by changing this line to:

<!-- Fetch the custom config.xml as an XML document -->
<data name="VOXFILEDIR" src="Config.xml"></data>
<!-- Expose the document's root element for the lookup below -->
<assign name="document.VOXFILEDIR" expr="VOXFILEDIR.documentElement"/>
<!-- The GetData helper now has a loaded document to read from -->
<assign name="VOXFILEDIR" expr="GetData(VOXFILEDIR, 'VOX_FIlE_PATH')"/>

Hung calls on Avaya stations

This turned out to be a SIP interoperability issue (surprise surprise!) and was fixed by a “downgrade” to the Empirix G5 SIP state machine.

We believe that the problem was caused by the SIP routing information (record-route attribute) being updated to include the IP address of the Avaya CLAN card in addition to the Avaya SES server via which the initial INVITE was sent:

[SIP trace: routing information updated to include the Avaya CLAN IP address alongside SES]

Since the Empirix G5 is stateful this updated routing information was being maintained and then re-used on the BYE message at the end of the call:

[SIP trace: BYE sent using the updated routing information]

As a result the BYE was being ignored and the call left hanging (the Empirix G5 thought that the call had been disconnected even though it never received an ACK back). To fix this problem, the Empirix G5 SIP state machine was modified to ignore updated routing information (record-route attribute).
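
To illustrate the behaviour, the sketch below keeps the route set learned at dialog setup and ignores a later record-route update, which is roughly what the modified G5 state machine now does. The header values are made up for illustration only.

# Illustration only: ignore mid-dialog record-route updates and keep the route
# set learned at dialog setup. Header values below are made up.

def choose_route_set(initial_record_route, updated_record_route):
    """Return the route set to use for in-dialog requests such as BYE."""
    if updated_record_route != initial_record_route:
        # A mid-dialog change (e.g. an extra hop appearing) is ignored.
        return initial_record_route
    return updated_record_route

if __name__ == "__main__":
    initial = ["<sip:ses.example.net;lr>"]                       # SES only (illustrative)
    updated = ["<sip:ses.example.net;lr>", "<sip:10.0.0.1;lr>"]  # SES plus CLAN (illustrative)
    print("Route set used for BYE:", choose_route_set(initial, updated))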

I’m not going to argue who is in the wrong here, although I strongly suspect it is Avaya! The reason for saying this is another SIP interoperability issue that has popped up since: this time it is Avaya SIP interoperability with Kofax, which we are using for Fax channel integration (hopefully!).

Please see: http://www.avayausers.com/showthread.php?p=77475

“The problem we always see is when Cisco sends a BYE to Avaya. Avaya sees the BYE but for whatever reason Avaya will never send an OK back to Call Manager. This results in a hung call leg in the Avaya. The hung call leg stays up in Call Manager until my timers expire and then the call is flushed.”

“What we have found is that Avaya fails to honour any SIP method unless record-route is used. If we run our SIP proxy servers in non-stateful mode (record-route off) Avaya fails to honor any method that didn’t come back from the first proxy that routed the call”

In our case, Kofax does not include the record-route attribute in any of its response methods. This can be seen in the Wireshark trace below:

[Wireshark trace: Kofax responses with no record-route attribute]

“The only way the SIP stack on Avaya will function with a stateless proxy and respond to all parties that the proxy may send the call to requires the creation of a “dummy” signaling group on the Avaya PBX. Basically, you have to build your main signaling group with trunking to the proxy and then add a “dummy” signaling group into Avaya for each end point IP address that you may see SIP methods come back from. E.g. Kofax”

We have one main signaling group with trunking to the stateless proxy. Since the proxy is stateless it will only be in the call flow until the proxy sends the final OK response back from Cisco Call Manager.

In addition to the signaling on Avaya to the proxy, we also had to build a signaling group on the Avaya that has no trunking but has the IP address of the far end Call Manager server that would be in the call flow after the proxy (SES) sets the call up. This “dummy” signaling group has no trunking in Avaya – we have only defined the far end IP address in the signaling group page on Avaya.

Therefore, from a Kofax perspective, I think what they are saying here is to create a “Kofax signaling group” with the IP address of the Kofax server specified as the far end IP address.

Will let you know how we get on with this in a future post.
