Figure 4.8 Dynamic individual trunking
~ 4.1.2 Robustness
A highly available telephony infrastructure must deal with the fact that an IP Telephony server might crash or be down for administrative reasons. Telephony services can be adversely affected in various ways when a server goes down and another server takes over. The following are some approaches for implementing robustness in the server infrastructure.
[IP Telephony Cookbook] / Setting Up Basic Services
1. The first approach is to set up more than one server for a zone and treat each of them as a separate router (see Section 4.1.1.2.3) that shares the same configuration. In this case, there is no replication of registration or call data across multiple servers.
If the primary server fails, a phone that is in a call will not notice at first, because the UDP media stream is usually transmitted directly between the two endpoints. However, the TCP signalling connections do not survive such a crash, so the first TCP message sent afterwards leads to an error and, very likely, to the call being cleared.
A phone that is not currently in a call has no way of detecting a server crash in time. After its registration period expires, it will try to refresh its registration with the primary server and fail.
It then needs to find a new IP Telephony server. H.323 provides a mechanism called ‘Alternate gatekeeper’: when a gatekeeper registers an endpoint, it informs the endpoint of secondary gatekeepers that can be used instead. The telephone stores this information and, in case of a server failure, tries to contact the other listed gatekeepers.
Another possibility that works for SIP and H.323 is to configure a prioritised list of H.323 gatekeepers or SIP proxies in a DNS SRV record for the zone. This requires that the telephone is aware of its DNS domain and is able to query DNS servers, a concept that is common in the SIP world, but seldom found in H.323 devices.
In general, without synchronisation between the replicated servers, the failure of one server normally results in the loss of all calls. The server loss is discovered only after the defined registration timeout, which is usually measured in minutes but can theoretically be set to days. After that time, the phones should be able to find an alternate IP Telephony server to register with.
2. Another approach is to use servers that maintain replicated registration data while only one of them is the active server and the other is the standby server. If the active server fails, the standby server detects this instantly and can use the replicated information about which devices are registered to inform all endpoints (phones) that it is now the new active server. As a result, the outage will be noticeable for only a few seconds. Of course, active calls will still be cleared, and definitely not resumed.
3. If the previous approach is pushed a bit further, both servers could replicate every kind of state they keep internally, down to the connection layer. If the active server crashes, the other system takes over and can announce (via ARP) the same MAC address as the crashed server. This kind of ‘Hot Standby Server’ would take over instantly and seamlessly, allowing even ongoing calls to continue without noticeable interruption.
In terms of server infrastructure, this is the most advanced and complicated solution a manufacturer could implement. It does not require the phones to be intelligent or to support any kind of robustness mechanism. The downside of this approach is that the ARP rewriting mechanism might not work in switched networks, which would force both servers to be in the same shared network segment.
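The takeover logic of the second approach (a standby server with replicated registration data) can be sketched as follows. This is only an illustration, not a description of any particular product: the class name `StandbyServer`, the heartbeat idea used to detect the failure, and the three-second timeout are all assumptions that do not appear in the text above.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class StandbyServer:
    """Hypothetical standby that holds replicated registrations."""

    heartbeat_timeout: float = 3.0   # assumed: seconds of silence before takeover
    last_heartbeat: float = 0.0      # time of the last sign of life from the active server
    registrations: dict[str, str] = field(default_factory=dict)  # alias -> endpoint address
    active: bool = False

    def replicate(self, alias: str, address: str) -> None:
        """Store a registration copied from the active server."""
        self.registrations[alias] = address

    def heartbeat(self, now: float) -> None:
        """Record that the active server was alive at time `now`."""
        self.last_heartbeat = now

    def check(self, now: float) -> list[str]:
        """Take over if the active server has been silent for too long.

        Returns the endpoint addresses that must be informed about the
        new active server, or an empty list if nothing has to be done.
        """
        if self.active or now - self.last_heartbeat <= self.heartbeat_timeout:
            return []
        self.active = True
        return sorted(self.registrations.values())
```

Because the standby already knows every registered endpoint, it can notify them directly instead of waiting for their registration periods to expire, which is why the outage shrinks from minutes to seconds. Active calls are still lost, since no call state is replicated in this sketch.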
It is hard to give general advice on which kind of robustness mechanism to use. The third solution allows the use of ‘dumb’ endpoints because operation of the backup server is completely transparent to them, reducing the cost of endpoint equipment. The other two solutions offer the possibility of placing the redundant servers in different buildings, allowing the telephone system to operate even if one building burns down. A general observation is that telephones capable of switching servers immediately are not very common, and servers supporting Hot Standby, as described above, are equally hard to obtain.
Every manufacturer that offers IP Telephony solutions implements some robustness mechanism. One should be aware of the endpoint requirements that must be met to take advantage of the mechanism offered.
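As an illustration of what the first approach demands from an endpoint, the following sketch walks through a prioritised server list such as one obtained from a DNS SRV record. The names `SrvRecord`, `order_candidates` and `register_with_first_available`, as well as the `try_register` callback, are hypothetical. The ordering is also a simplification: a lower priority value is tried first, and the weight is used only as a tie-breaker, whereas RFC 2782 specifies a weighted random choice among records of equal priority.

```python
from typing import Callable, List, NamedTuple, Optional


class SrvRecord(NamedTuple):
    """The four data fields of a DNS SRV record."""

    priority: int  # lower value is tried first
    weight: int    # used here only as a tie-breaker (higher first)
    port: int
    target: str


def order_candidates(records: List[SrvRecord]) -> List[SrvRecord]:
    """Return the servers in the order in which the phone should try them."""
    return sorted(records, key=lambda r: (r.priority, -r.weight))


def register_with_first_available(
    records: List[SrvRecord],
    try_register: Callable[[str, int], bool],
) -> Optional[SrvRecord]:
    """Try each server in turn and return the first that accepts the registration.

    `try_register(host, port)` is a placeholder for the real H.323 or SIP
    registration attempt; it returns True on success.
    """
    for record in order_candidates(records):
        if try_register(record.target, record.port):
            return record
    return None  # no server reachable
```

A phone would run this once its registration refresh with the primary server fails. The same loop applies to a list of alternate gatekeepers learned at registration time; only the source of the list differs.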
~ 4.1.3 Management issues
When setting up an IP Telephony infrastructure, certain issues concerning administration tasks should be considered ahead of time.
~ 4.1.3.1 Multiple account databases
The need to run legacy and IP Telephony side by side during migration gives rise to the problem of maintaining multiple account databases. The legacy PBX already has a configuration that defines valid numbers. The same usually applies to an IP Telephony server (see Figure 4.4). A shared configuration database for both the PBX and the IP Telephony server is very uncommon, unless they are from the same manufacturer and have similar configuration interfaces. Keeping two separate databases consistent is difficult in the long run.
To make this problem even worse, it is possible that the gateway between the IP and the PSTN world needs access to the valid numbers as well. Potentially, this implies a third database with valid numbers, making the administration of telephony accounts (e.g., creating a new account or moving one account from legacy telephony to IP Telephony) a tough job.
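A periodic consistency check across the three databases can catch such drift early. The sketch below rests on two assumptions that are not stated in the text: each number is served by exactly one of the PBX and the IP Telephony server, and the gateway must know every valid number. The function name and the report keys are hypothetical, and the export of numbers from each system is left out.

```python
from typing import Dict, List, Set


def check_number_plan(
    pbx: Set[str], ip_server: Set[str], gateway: Set[str]
) -> Dict[str, List[str]]:
    """Compare the valid numbers known to the PBX, the IP server and the gateway."""
    valid = pbx | ip_server
    return {
        # configured on both systems, e.g. after an incomplete migration
        "defined_twice": sorted(pbx & ip_server),
        # valid somewhere, but the gateway would reject calls to it
        "unknown_to_gateway": sorted(valid - gateway),
        # still routed by the gateway although no system serves it
        "stale_in_gateway": sorted(gateway - valid),
    }
```

Moving an account from legacy telephony to IP Telephony touches all three sets, so a report with three empty lists is a reasonable condition to verify after each such change.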
~ 4.1.3.2 Decentralisation
Another issue is the question of who is allowed to administer what in the telephone system. In classic environments, there is a small group of PBX administrators that sits in a special location, whereas the network world, at least on campuses, often consists of locally administered networks.
When introducing IP Telephony, there is the chance, or pitfall, depending on your point of view, to apply the structures from the network world to the telephony world. For instance, consider the IP Telephony infrastructure of a university that gives every student a telephony account. At the start of the semester, several hundred new accounts must be created. To reduce the workload, this could be delegated to administrators of the different departments. Or consider a research staff member who moves from one office to another, which might require a configuration change when port-based authentication is used. It would be convenient if these changes could be decentralised.
Decentralised administration does not necessarily mean that all administrators have the same permissions. An IP Telephony server might, for example, separate the permission to change account data from the permissions to view the call detail records.
Many available products allow remote administration through a Web interface, which generally permits decentralised administration. Whether different administration permissions can be granted depends heavily on the products used.
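The separation of permissions mentioned above, changing account data versus viewing call detail records, can be modelled with a small permission table. This is a generic sketch rather than the scheme of any specific product; the role names and the `is_allowed` helper are invented for illustration.

```python
from enum import Flag, auto


class Permission(Flag):
    EDIT_ACCOUNTS = auto()      # create, change or move telephony accounts
    VIEW_CALL_RECORDS = auto()  # read call detail records


# Hypothetical roles: department administrators manage accounts only,
# while call detail records are reserved for a separate role.
ROLES = {
    "department_admin": Permission.EDIT_ACCOUNTS,
    "billing_auditor": Permission.VIEW_CALL_RECORDS,
    "central_admin": Permission.EDIT_ACCOUNTS | Permission.VIEW_CALL_RECORDS,
}


def is_allowed(role: str, needed: Permission) -> bool:
    """Return True if `role` holds every permission in `needed`."""
    granted = ROLES.get(role, Permission(0))  # unknown roles get nothing
    return (granted & needed) == needed
```

With such a table, account creation at the start of a semester can be delegated to departments without exposing call records to them.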
~ 4.2 Dial plans
The previous sections have already touched on issues regarding the dial plan that is used. There is no ideal solution that addresses every need, but there are a number of techniques for solving specific problems.
This section addresses the most common problems faced when dealing with dial plans.