“Lather, rinse, repeat” – the other day that is how we felt.
Thanks go to a variety of folks who helped get us through this. A special call-out to Justin Hiedeman for maintaining his perspective in trying times.
Scenario
We have a semi-greenfield Lync 2013 deployment: HA with three Front End (FE) nodes, a single PSTN trunk to an AudioCodes Mediant 3000, and Level 3 as the SIP trunk provider.
The Issue (Lather, Rinse, Repeat)
Incoming PSTN calls were being disconnected and reported as a missed call after approximately two rings and a brief two-second toast from the Lync client. If the user was logged out, the calls would immediately go to VM. If the user set “Call Forwarding – Unanswered Calls” to 5 seconds it would work; anything higher than that resulted in the bad behavior described above.
Lync to Lync (P2P) calls operated normally; going to voice mail direct from inside the Lync client was normal. The client is Lync 2013.
Needless to say, this was very annoying to the end-user. At one point we thought a certain individual might burst into flame. Not having a fire extinguisher handy, we thought we might look into what was causing this (annoying) call behavior.
Troubleshooting
Both Level 3 and AudioCodes indicated this was an issue with the Mediation server configuration. The fact that Lync to Lync (P2P) calls worked as advertised led us to believe the problem was with either the AudioCodes or Level 3. After discussing the obvious finger-pointing, we turned back to the job at hand. Troubleshooting with OCSLogger and Snooper narrowed the point at which the call appeared to drop to the diagnostic line below, which was consistent across all calls. Using the ACSyslog, we discovered that after the error below, a 183 was received from Level 3, which terminated the call.
ms-diagnostics: 10037;source="serverfqdn.domain.com";reason="Normal termination response from gateway before the call was established";component="MediationServer";sip-reason="Q.850 ;cause=31 ;text="local, RTP Broken Connection""
ms-diagnostics-public: 10037;reason="Normal termination response from gateway before the call was established";component="MediationServer";sip-reason="Q.850 ;cause=31 ;text="local, RTP Broken Connection""
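If you find yourself staring at a pile of these headers in Snooper, it can help to break them into fields. What follows is only a rough sketch, not anything from the Lync tooling: the function name `parse_ms_diagnostics` and the regular expression are my own assumptions about the layout, based solely on the sample above. The one wrinkle is that the sip-reason value carries its own nested quotes, so a value is treated as ending only at a quote followed by the next key or the end of the line. For reference, Q.850 cause 31 is "Normal, unspecified".

```python
import re

# Sample header copied from the trace above (the server name is a placeholder).
SAMPLE = (
    'ms-diagnostics: 10037;source="serverfqdn.domain.com";'
    'reason="Normal termination response from gateway before the call was established";'
    'component="MediationServer";'
    'sip-reason="Q.850 ;cause=31 ;text="local, RTP Broken Connection""'
)

# A field is key="value". The sip-reason value contains nested quotes, so a
# value only ends at a quote followed by the next ;key=" or the end of line.
FIELD = re.compile(r'([\w-]+)="(.*?)"(?=;[\w-]+="|$)')


def parse_ms_diagnostics(header):
    """Split one ms-diagnostics header line into its ID and key/value fields."""
    name, _, value = header.partition(":")
    error_id, _, rest = value.strip().partition(";")
    parsed = {"header": name.strip(), "id": int(error_id)}
    for key, val in FIELD.findall(rest):
        parsed[key] = val
    # Pull the numeric Q.850 cause out of the embedded sip-reason, if present.
    cause = re.search(r"cause=(\d+)", parsed.get("sip-reason", ""))
    if cause:
        parsed["q850_cause"] = int(cause.group(1))
    return parsed


if __name__ == "__main__":
    for key, val in parse_ms_diagnostics(SAMPLE).items():
        print(f"{key}: {val}")
```

Run against a saved trace, something like this makes it easier to confirm that every failed call carries the same ID, component, and cause.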
We also stopped and disabled the Mediation Server service on two of the three Front End servers.
Because AudioCodes was so adamant about it not being their issue, we then spent the better part of 3 hours with a friendly soul from Microsoft. It was gratifying to note that the “expert” support engineer went over all the exact same items we had already checked – and guess what? We now had verification that we were right: the Lync trunk to gateway setup was fine. As a final note to the Microsoft CSS call, the CSS engineer had us try this: http://support.microsoft.com/kb/2817465 in the vain hope that something loosely related would fix our issue. Note that the date on this update would indicate that it is part of CU2 for the Lync 2013 client, but one can never tell.
However – and here is why it sometimes pays dividends to call CSS – within the MS internal article database our technician discovered a similar issue in a previous case that involved an AudioCodes gateway. The MS recommendation was to change the “Disconnect on Broken Connection” setting from Yes to No under the Advanced Parameters setting for SIP definitions.
This resulted in no change in the behavior. At this time, armed with the knowledge that the Lync setup was correct, we called AudioCodes. We also got Level 3 on the same call.
The Fix
So here we are, two days into this <sigh>. All the same symptoms were occurring: Level 3 saw the disconnects, and AudioCodes and the Lync servers (the Mediation service was shut down on two of the three servers) were seeing the broken connection. On further inspection of the AudioCodes configuration, we discovered that the ‘Disconnect on Broken Connection’ setting also exists in the profile settings under Coders and Profiles, where it overrides the setting in the SIP definitions. For this implementation two profiles were set up, one going to Lync and one going to Level 3.
After backing up the configuration this was changed to “No” for both profiles. Note that this profile also has an “Advanced Parameter List” – more on that in just a bit.
Level 3 Profile
The Lync Profile was already done, but remember that “Advanced Parameters List” – you ought to, I have highlighted it twice now. I will wait while you go back and check my math.
On the Profile for Lync the ‘Media Early 183’ setting was also enabled. It would appear that this was already set on the Level 3 profile.
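The reason the first change did nothing is worth spelling out: a value set on a profile wins over the same value in the SIP definitions. Here is a toy model of that precedence, strictly as an illustration. Nothing in it is AudioCodes syntax; the key name, the `Profile` class, and the helper functions are all hypothetical, and the Yes values on the two profiles are my assumption about the state before we changed them.

```python
from dataclasses import dataclass, field

KEY = "disconnect_on_broken_connection"  # hypothetical key name, not AudioCodes syntax


@dataclass
class Profile:
    name: str
    settings: dict = field(default_factory=dict)


def effective(global_settings, profile, key):
    """Value in force for calls using this profile: the profile's own value wins."""
    return profile.settings.get(key, global_settings[key])


def overriding_profiles(global_settings, profiles, key):
    """Names of profiles whose value differs from the global one."""
    return [
        p.name
        for p in profiles
        if key in p.settings and p.settings[key] != global_settings[key]
    ]


# Illustrative state after the CSS-recommended change: global set to No (False),
# both profiles assumed to still carry Yes (True).
GLOBAL = {KEY: False}
PROFILES = [Profile("Lync", {KEY: True}), Profile("Level 3", {KEY: True})]

if __name__ == "__main__":
    for p in PROFILES:
        print(p.name, "->", effective(GLOBAL, p, KEY))
    print("overrides:", overriding_profiles(GLOBAL, PROFILES, KEY))
```

In this model, flipping the global value alone leaves both profiles disconnecting on a broken connection, which matches what we saw until the profiles themselves were changed.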
The Outcome
After all this, an inbound call to Lync behaves as you would expect; user logged in or not.
YMMV
1 comment:
That's a crazy scenario John. It's always one little setting that's the cause for issues like this huh.
Good to hear you got to the bottom of it in the end.