SN10100 and SN10200 High Availability mechanisms
Posted by Michal Podoski, Last modified by Danny Staub on 15 November 2017 02:57 PM

SN10100 and SN10200 come with several layers of High Availability.

*SN10300 is a cluster-based system that introduces a few dedicated HA features, which can be found in this article.

1. Single MGW HA design

a) PSU failover
- Hot-swappable PSUs with real-time failover

b) SS7/MTP3 signaling link failover

c) TDM Clock failover between Line Services

d) Automatic protection mechanism on the TelcoBoard
- On-board process monitoring: deadlocked or crashed processes are detected automatically and the board is rebooted to keep downtime to a minimum.
- HW Watchdog: while a SmartMedia system is active, the System Manager polls the TelcoBoard continuously. If this polling stops, the TelcoBoard has lost contact with the SmartMedia applications, and it reboots automatically if communication is not restored in a timely manner.
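
The HW Watchdog behaviour described above can be pictured as a timer that each poll refreshes. The sketch below is only an illustration: the class name `PollWatchdog`, the 5-second timeout and the explicit `now` timestamps are assumptions made for the example, not SmartMedia internals or settings.

```python
class PollWatchdog:
    """Illustrative watchdog: expires if no poll arrives within timeout_s."""

    def __init__(self, timeout_s, now=0.0):
        self.timeout_s = timeout_s   # hypothetical value, not a SmartMedia setting
        self.last_poll = now

    def poll(self, now):
        """Record a poll received from the System Manager."""
        self.last_poll = now

    def must_reboot(self, now):
        """True once polling has been silent for longer than the timeout."""
        return (now - self.last_poll) > self.timeout_s


wd = PollWatchdog(timeout_s=5.0)
wd.poll(now=1.0)
recent = wd.must_reboot(now=4.0)    # 3 s since last poll
expired = wd.must_reboot(now=7.0)   # 6 s since last poll
print(recent, expired)
```

Passing the timestamp in explicitly keeps the example deterministic; a real implementation would read a monotonic clock instead.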

e) IP Network (SIP Signaling)
For outgoing calls, the system can be configured with alternate addresses for reaching the SIP Proxy, so that if one network path is down, the other is used. For incoming calls, you can configure SIP stacks on two separate IP interfaces so that if a peer cannot reach one, it can reach the second. Note, however, that a mechanism such as Virtual IP is available only in the 1+1 HA model between units, so the peer SIP agent must be configured with both IP addresses used on the TelcoBoard and must be able to switch from one to the other if one is down.
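
As a rough illustration of the outgoing-call case, the sketch below picks the first reachable address from a configured list. The function name, the reachability callback and the addresses (taken from the RFC 5737 documentation ranges) are all hypothetical; how SmartMedia actually probes and selects a proxy address is not described here.

```python
def pick_proxy_address(addresses, is_reachable):
    """Return the first reachable address, in configured priority order."""
    for addr in addresses:
        if is_reachable(addr):
            return addr
    return None  # no network path is currently available


# Hypothetical proxy addresses, one per network path.
proxy_addresses = ["192.0.2.10", "198.51.100.10"]
down = {"192.0.2.10"}  # simulate a failure of the first path

chosen = pick_proxy_address(proxy_addresses, lambda a: a not in down)
print(chosen)
```

The same logic applies on the peer side for incoming calls: the peer SIP agent holds both TelcoBoard addresses and falls back to the second when the first stops answering.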

f) IP VOIP interfaces bonding
SmartMedia version 2.7 introduced bonding of VOIP interfaces, which makes it possible to configure one IP address across two interfaces.

g) IP Network (Voice Path)
We support redundant voice paths for SIP calls. If one path is down, the other is selected for all new calls. If both paths are up, they can be used in load sharing.
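
The selection rule above can be sketched as follows. Round-robin is used here purely as an assumption to illustrate load sharing; the article does not say which distribution algorithm SmartMedia applies, and the path names are placeholders.

```python
from itertools import cycle


def assign_paths(paths_up, n_calls):
    """Spread new calls over the voice paths that are currently up."""
    if not paths_up:
        return []  # no path available, no call can be placed
    rr = cycle(paths_up)  # round-robin is an assumed policy
    return [next(rr) for _ in range(n_calls)]


both_up = assign_paths(["path-A", "path-B"], 4)   # load sharing
one_down = assign_paths(["path-B"], 3)            # path-A has failed
print(both_up)
print(one_down)
```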

h) Flexible IP (SmartMedia 2.8 and above)
Version 2.8 allows creating multiple dot1q (802.1Q) VLAN-based Virtual Interfaces on Physical Interfaces and assigning them different roles. Bonding is also possible on all TelcoBoard interfaces (not only VOIP0 and VOIP1).

i) STM1 - Automatic Protection Switching
STM1 versions of SN10k support linear-topology APS fiber protection, for 1+1 configurations only. In some implementations this standard is also called MSP 1+1 (Multiplex Section Protection).
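
In a linear 1+1 scheme the same signal is sent on both the working and the protection fiber, and the receiving end selects one of them. The sketch below models only that selector. It assumes non-revertive behaviour (the receiver stays on the protection line after the working line recovers), which this article does not specify for SN10k, and the function and line names are illustrative.

```python
def select_line(current, working_ok, protection_ok):
    """Pick which fiber the receiver listens to (non-revertive sketch)."""
    ok = {"working": working_ok, "protection": protection_ok}
    if ok[current]:
        return current  # stay put while the selected line is healthy
    other = "protection" if current == "working" else "working"
    # Switch only if the other line is usable; otherwise nothing is gained.
    return other if ok[other] else current


after_fault = select_line("working", working_ok=False, protection_ok=True)
after_repair = select_line(after_fault, working_ok=True, protection_ok=True)
print(after_fault, after_repair)
```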

2. 1+1 Active/Standby HA design

The 1+1 HA model consists of a Master (regular) SN10k unit, a special Backup (+1) SN10k unit and a passive 1+1 patch panel. An example SN10200/STM1 1+1 setup is shown below:

[Figure: SN10200 1+1 setup]

The most distinctive feature of this setup is the 1+1 patch panel, which connects the Active/Standby pair to one shared TDM line. During normal operation the Active MGW processes all calls and holds both the active TDM signaling stack and the Virtual IP. The Backup unit is idle and monitors the Active MGW using a dedicated heartbeat protocol. If the Active unit fails, the Backup unit takes over the Virtual IP and activates its TDM stack.

Virtual IP is based on a proprietary protocol that works as a floating IP and announces the MAC address switchover with a Gratuitous ARP.
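
For readers unfamiliar with Gratuitous ARP, the sketch below builds a generic gratuitous ARP request frame, in which the sender IP and target IP are both the announced address. It shows only the standard frame layout. The Virtual IP protocol itself is proprietary, so the exact frame SmartMedia emits may differ, and the MAC and IP values are made-up examples.

```python
import socket
import struct


def build_gratuitous_arp(mac: bytes, ip: str) -> bytes:
    """Build an Ethernet frame carrying a gratuitous ARP request for ip."""
    broadcast = b"\xff" * 6
    ip_bytes = socket.inet_aton(ip)
    # Ethernet header: destination, source, EtherType 0x0806 (ARP).
    eth_header = broadcast + mac + struct.pack("!H", 0x0806)
    arp = struct.pack(
        "!HHBBH6s4s6s4s",
        1,            # hardware type: Ethernet
        0x0800,       # protocol type: IPv4
        6, 4,         # hardware / protocol address lengths
        1,            # opcode: request
        mac, ip_bytes,            # sender MAC and IP (the new owner)
        b"\x00" * 6, ip_bytes,    # target MAC unused, target IP == sender IP
    )
    return eth_header + arp


# Hypothetical values: a locally administered MAC and a documentation-range IP.
frame = build_gratuitous_arp(bytes.fromhex("020000000001"), "192.0.2.50")
print(len(frame), frame.hex())
```

Neighbours that receive such a frame update their ARP cache entry for the address, which is what redirects traffic to the unit that now holds the floating IP.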

 

Simplified diagram of 1+1 setup during normal operation

[Figure: 1+1 normal]

Simplified diagram of switched 1+1 setup

[Figure: 1+1 switched]

 

HW Architecture
The SN10k architecture separates Application processing (Control Host) from Real Time Traffic (TelcoBoard) at the hardware level:

[Figure: HW Architecture]

This approach provides high call-processing capacity and also lets the 1+1 design be used to its full potential, mainly because of the crossover control capability between Control Hosts and TelcoBoards.

Application Level failover in 1+1
All SmartMedia applications support Active-Standby mode, so you can have two hosts, each running an instance of the SmartMedia application. If the active server crashes, the application on the standby server takes over. Furthermore, on each host, SmartMedia installs a service that monitors all SmartMedia applications and automatically restarts crashed or deadlocked ones.

Configuration Database HA in 1+1
We also support HA at the DB level. SmartMedia can configure two DB servers and set up replication between them so that all changes made to the Master DB are replicated to the Slave DB. If the Master DB is lost, SmartMedia automatically switches to the Slave DB. After the failure, the database roles are swapped to ensure proper replication.
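
The role swap can be pictured with a toy model like the one below. The class and the server names `db-a` and `db-b` are hypothetical, and the sketch leaves out replication itself; it only shows that the surviving server becomes Master and the failed one is treated as the Slave once it returns.

```python
class DbPair:
    """Toy model of a replicated database pair (names are hypothetical)."""

    def __init__(self, master, slave):
        self.master = master
        self.slave = slave

    def on_master_lost(self):
        """Promote the Slave; the failed server becomes the new Slave."""
        self.master, self.slave = self.slave, self.master
        return self.master


pair = DbPair(master="db-a", slave="db-b")
active = pair.on_master_lost()
print(active, pair.slave)
```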

IP Network (GW Control and Management) (1+1)
All communication between application<-->application, application<-->TelcoBoard and TelcoBoard<-->TelcoBoard uses redundant IP interfaces. This is built from the ground up into all our code.

1+1 Redundancy mechanism
a) The SmartMedia Heartbeat application controls which unit is active
b) In case of failure:
- The primary unit is restarted (to clear any configuration) by Heartbeat or the HW Watchdog
- SS7, SIP, ISDN stacks and resources are allocated on the secondary unit
- The Virtual IP is transferred to the secondary unit through a proprietary low-level driver API
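
The failure handling listed above can be summarised as an ordered set of actions. The sketch below simply returns those actions as strings in the order the list gives them. The function name and unit labels are illustrative assumptions, and the real Heartbeat application may sequence or overlap the steps differently.

```python
def failover_actions(active, standby, heartbeat_ok):
    """Return the ordered actions taken when the active unit has failed."""
    if heartbeat_ok:
        return []  # active unit is healthy, nothing to do
    return [
        f"restart {active}",
        f"allocate SS7/SIP/ISDN stacks and resources on {standby}",
        f"transfer Virtual IP to {standby}",
    ]


steps = failover_actions("primary", "secondary", heartbeat_ok=False)
for step in steps:
    print(step)
```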

1+1 failure scenarios:
a) Switchovers without loss of active calls:
- SmartMedia stopped on one of the units
- An application is shut down
b) Switchovers causing loss of all calls:
- TelcoBoard unit shutdown
- TelcoBoard reboot requested
- Package installation
- Primary unit is back (and auto-switch-back is used)
c) Host applications switchovers (no loss of active calls)
- Application crash
- Control Host crash
d) TelcoBoard switchovers (loss of active calls)
- Process crash on the TelcoBoard
- Loss of communication with the TelcoBoard
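
For quick reference, the scenarios above can be condensed into a lookup that answers "do active calls survive this event?". The event keys are names invented for this sketch; the True/False values are taken directly from the lists above.

```python
# True = active calls survive the switchover, False = active calls are lost.
CALLS_SURVIVE = {
    "smartmedia_stopped_on_one_unit": True,
    "application_shut_down": True,
    "application_crash": True,
    "control_host_crash": True,
    "telcoboard_unit_shutdown": False,
    "telcoboard_reboot_requested": False,
    "package_installation": False,
    "auto_switch_back_to_primary": False,
    "telcoboard_process_crash": False,
    "telcoboard_communication_loss": False,
}


def calls_survive(event):
    """Look up whether active calls are expected to survive the event."""
    return CALLS_SURVIVE[event]


print(calls_survive("control_host_crash"))
print(calls_survive("package_installation"))
```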

 

 
