Monday, 4 June 2018

On PABXs...

We eventually decided that our existing PABX solution no longer met requirements. In particular, it was very opaque, with a licensing model (and vendor locked-in handset requirements) that was punishingly expensive.

We considered "DIY" with something from Yealink (most likely an S300), with an interim set of BRI and POTS links to move away from Avaya onto another platform, before investigating SIP trunks for the "uplink". However, we looked at our work schedules (and the offers that came in) and ultimately decided to get a VOIP telecommunications company to help us install it, implementing SIP from launch.

Here are a few things we learned along the way...

We considered a number of options, including carrying on the way we were. However, the existing system (an Avaya IP Office 500 system) is/was extraordinarily expensive, and the SIP trunk offerings from some of the suppliers were quite good.

It turned out that VOIP call rates (and line rentals) are so competitive (compared to incumbent call rates and BRI line rentals) that we could pay off the change in system in a few years, which certainly made persuading the finance part of the school easier.

One of the potential vendors, who had supported the partly broken Avaya system for several years, immediately started dishing out FUD, at which point we lost most of our interest in any further relationship with them.

It was something of a challenge to find someone who wanted to give us SIP trunks on our own bandwidth - many insist on laying fibre to the premises - but then you point out where you live, and they look confused, as they never have fibre to this town. Eventually, we found a fairly major national company willing to meet our requirements.

We ultimately opted for a system (Com.X10) from Far South Networks, a local company that basically "puts Asterisk in a box", along with a few different Yealink handsets (T23G as a basic model, T27G for power users, T29G for switchboard with some EXP20 modules, W52P for people needing mobile handsets and CP860 as a conference phone), implemented, supplied and supported by a major national VOIP telephony provider.
We also got an iTA with the requisite cards, for "analogue" lines, but we're not using it in practice (it was a backup option in case the SIP trunk was an absolute failure, which it isn't).
The Yealink handsets almost all support gigabit pass-through, which is useful in 2018 in an environment where most areas still tend to have a network point ratio hovering around "how could we ever need more than one jack per room?"...

We liked that the PABX is pretty platform agnostic - it's "just SIP" - meaning we're not locked into any particular vendor from a handset point of view (and can theoretically investigate "softphones" and apps that do SIP).

We were looking forward to having a 3rd party do "most of the work" for us, which is more or less the case. Of course, there are still some things we need to "tidy up"...

Comparisons between Far South and Avaya

Obviously, the "big name" VOIP platform has a few advantages - and one very considerable disadvantage (cost).

We most missed the fairly easy centralised provisioning of handsets.

The IP Office Manager software provides a much "richer" set of programming/provisioning options for the handsets. One can use autoprovisioning on the Far South, but it's much more involved, and requires editing config files.
We were not given admin access onto the PABX server, so we haven't experimented, but Yealink's configuration generator tool may be useful (although it looks like it doesn't quite support the models we have "out of the box", that may be fairly easy to fix by copying the relevant handset-specific XML config files somewhere). It's available from the support page of each handset, so you would expect it to be more compatible. Autoprovisioning also comes in very handy when it becomes necessary to update firmware. Theoretically, it's possible to run a 3rd party autoprovisioning server, but when one exists on the PABX and there's a way of doing the BLFs, it's silly to do so.

It surprised me that a major VOIP telecomms provider doesn't seem to use this modality, because it's the most scalable, sensible way of managing fleets of SIP endpoints; that may be a function of being serviced by a fairly minor branch, but I don't know. The unique handset identifier, not surprisingly, is its MAC address. Discussions with a friend who recently implemented a similar system at another site (same supplier, too) suggest there were some bugs/incompatibilities between certain Yealink handsets and Far South's autoprovisioning system - but I imagine those ought to be fixed by now.

At the moment, we're stuck with connecting to the webserver on each handset to configure things from there (which can include just uploading a config file that is "correct" already - if you have a lot of the same model(s) of phones, get one right, download the config file, edit the SIP account details and any other things that change between handsets, and upload away).
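As a rough illustration of the "get one right, then edit what differs" approach, here is a minimal Python sketch. It assumes the downloaded config is a plain text file of `key = value` lines; the key names (`account.1.user_name` and friends), the extension numbers, the MAC address and the idea of naming each file after the handset's MAC are all assumptions for illustration, so check them against what your own handsets actually export.

```python
import re


def apply_overrides(template_text, overrides):
    """Return template_text with the given `key = value` lines rewritten.

    Lines whose key is not in `overrides` pass through untouched.
    Raises KeyError if an override key never appears in the template,
    so a typo does not silently produce a half-configured phone.
    """
    seen = set()
    out = []
    for line in template_text.splitlines():
        match = re.match(r"^\s*([A-Za-z0-9_.]+)\s*=", line)
        if match and match.group(1) in overrides:
            key = match.group(1)
            out.append(f"{key} = {overrides[key]}")
            seen.add(key)
        else:
            out.append(line)
    missing = set(overrides) - seen
    if missing:
        raise KeyError(f"keys not found in template: {sorted(missing)}")
    return "\n".join(out) + "\n"


def config_filename(mac):
    """Name the output file after the handset's MAC (assumed convention)."""
    return mac.replace(":", "").replace("-", "").lower() + ".cfg"


# Hypothetical template, as downloaded from a "known good" handset.
TEMPLATE = """\
account.1.enable = 1
account.1.label = 201
account.1.user_name = 201
account.1.password = changeme
"""

# Hypothetical per-handset details: MAC address -> values that differ.
HANDSETS = {
    "00:15:65:AA:BB:01": {
        "account.1.label": "202",
        "account.1.user_name": "202",
        "account.1.password": "example-only",
    },
}

if __name__ == "__main__":
    for mac, overrides in HANDSETS.items():
        print(config_filename(mac))
        print(apply_overrides(TEMPLATE, overrides))
```

The generated files would still need uploading through each handset's web interface, but at least the error-prone hand editing is done once, in one place.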

The other surprising "no brainer" feature that seems to be absent is an autogenerated centralised address book. With the Yealink phones, you have a couple of options (LDAP or XML files are the two front-runners for centralised provisioning) to access phone books, other than the "on phone" manually maintained option (not usually a good idea). I can't see an LDAP or XML address book I can just connect to on the Far South that basically lists the names and numbers as configured. If you cracked the autoprovisioning, you could probably push local address books in the config files, but that is not particularly responsive to change.

Fortunately, there's a 3rd party application that is fairly easy to get running that creates an easy to use web-based GUI that generates the required XML format. Octivi Labs have kindly created the open source Yealink Phone Book Manager. Still, this means you're stuck maintaining a 3rd party system just for a working address book - and means you need to remember to update this when you change anything on the PABX. With Avaya's IPO, the address book functionality is built in, and "just works" on compatible handsets, and tracks changes to the numbering plan (and can of course also include important 3rd party numbers).
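If you would rather script the XML than maintain another web application, something along these lines might do. This is a minimal Python sketch, not the Octivi tool's method: the element names (`YealinkIPPhoneDirectory`, `DirectoryEntry`, `Name`, `Telephone`) are my recollection of the Yealink remote phone book layout and should be verified against Yealink's documentation for your firmware, and the names and extensions are made up.

```python
import xml.etree.ElementTree as ET


def build_phonebook(entries):
    """Build a remote phone book XML string from (name, number) pairs.

    Element names are an assumption about the Yealink format; verify
    them before pointing real handsets at the output.
    """
    root = ET.Element("YealinkIPPhoneDirectory")
    # Sort by name so the directory is easy to browse on the handset.
    for name, number in sorted(entries, key=lambda entry: entry[0]):
        entry = ET.SubElement(root, "DirectoryEntry")
        ET.SubElement(entry, "Name").text = name
        ET.SubElement(entry, "Telephone").text = str(number)
    # encoding="unicode" returns a str rather than bytes.
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical extensions for illustration only.
    print(build_phonebook([("Reception", "200"), ("IT Helpdesk", "201")]))
```

The output would be served from any web server the phones can reach. It still needs regenerating whenever the numbering plan changes, so the "remember to update it" problem remains.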

Training

Reasonably, they like people to be trained on the system before letting them loose on it. However, from what I've seen of the system, the admin web GUI is very straightforward, and the backend is "just Ubuntu with some add-ons", which is not at all scary.

Routing

Another amusing thing was watching their routing announcements in action. We utilise bandwidth from an NREN - a national consortium of educational institutions (primarily Universities and national research organisations) that pool resources to access better Internet facilities than might otherwise be the case. In this instance, it means they run/fund their own ISP, which peers around the country at multiple IXP facilities.
The first peering point our traffic passes was in Durban, and the problems were all on the final hop (i.e. the SIP server itself on their end).
The connection quality to their "main" SIP trunk endpoint in Johannesburg was terrible - latency anywhere from 35 to 1250-odd milliseconds, with horrific jitter and ~30% packet loss. Changing to the Durban endpoint was modestly better, but I see they're now using their Cape Town SIP trunk server. Of course, the traffic still dumps out in Durban and is carried on their backhaul from there. Our ISP's network is, of course, very good (Universities make very grumpy customers). mtr is a wonderful tool for tracking this sort of thing down - far better than anything available in Windows. It's quite fun to expose MPLS tunnels, where the service provider you are routing over on that hop makes this information available.

You may need to give the provider some information about your network (your IP address range(s), AS numbers, and so on) so that they can adjust their routing if need be. It helps if you understand your own typical routes to various places, or can supply "typical" routes. It also helps if they give you a list of SIP endpoints you can traceroute to, and supply average RTT and jitter for.
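If you collect your own round-trip samples (from ping or mtr output, say), the figures worth handing to a provider are easy to compute. Here is a minimal Python sketch, assuming you already have a list of RTTs in milliseconds with `None` for each lost probe. Jitter here is simply the mean absolute difference between consecutive replies, which is one common simple definition and not the smoothed estimator used in RTP; the sample values are made up.

```python
from statistics import mean


def summarise_rtt(samples_ms):
    """Summarise RTT samples (milliseconds; None marks a lost probe).

    Returns average RTT, jitter (mean absolute difference between
    consecutive received samples) and packet loss as a percentage.
    """
    if not samples_ms:
        raise ValueError("no samples supplied")
    received = [s for s in samples_ms if s is not None]
    loss_pct = 100.0 * (len(samples_ms) - len(received)) / len(samples_ms)
    avg = mean(received) if received else None
    diffs = [abs(b - a) for a, b in zip(received, received[1:])]
    jitter = mean(diffs) if diffs else None
    return {"avg_rtt_ms": avg, "jitter_ms": jitter, "loss_pct": loss_pct}


if __name__ == "__main__":
    # Made-up samples: four replies and one lost probe.
    print(summarise_rtt([40.0, 50.0, None, 45.0, 35.0]))
```

Running the same summary against each SIP endpoint the provider offers gives you something concrete to compare, rather than "it sounds bad".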

Lessons Learned

We've learnt some things that might be worth bearing in mind. Many of them are quite obvious, but they bear repeating/stating. 

1. No matter how much stakeholder engagement you think you've done, do more. Likewise, your own prep work!


It's amazing how much time and effort you need to put into "stakeholder engagement" - don't assume anyone else is going to do this for you. Make sure you understand the "missing/broken/desired" feature-sets, all the extensions on your campus, etc. Go around and "interview" whomever you consider (or your boss considers) key "stakeholders" in this regard (bare minimum is all people who direct calls for others, including PAs/secretaries). Do this BEFORE you issue Requests for Proposal, and certainly before you put in the finalised order. It's surprising how many things keep crawling out of the woodwork (even from people you've engaged with a lot). Declare a change freeze for a period once the order goes in. If someone forgot something, tough (make sure it's not you)! 

Some of this stakeholder work is likely to be "Politics"; see point 6, below. 



1.b. Likewise, your own prep work!

Similar to that, no matter how much preparation you've done, do more. 

It's almost worth visiting each and every extension to verify what is there, because there are sometimes surprises (like handsets you thought were one thing when they're actually something else, or extensions that turn out to have been manually "split" over the years, sometimes between buildings!). 

Make sure you have up-to-date network diagrams (physical layout, including interconnections), and can log into all managed devices. Pre-configure as much as you can (ideally during a change window). You may or may not need/want more advanced things like various LLDP features - depends on your switch support, and the VOIP handset vendor (and the implementer's preference). Find out if any changes are needed to the DHCP server(s) active on the VOIP VLAN(s) (like DHCP option 66 or others).  

Identify (and eliminate) unmanaged/non-PoE switches. PoE switching is the "right" way of doing IP telephony (as are dedicated VLANs, for which you need a managed switch).  PoE is perhaps "negotiable"; management of switches really isn't, in an enterprise network of any size/complexity. 

PoE on switches, however, allows you to at least try to ensure phones remain up "for a while" during power outages; people are used to phones being on when the power isn't - where this is NOT the case, ensure your stakeholders understand the limitation, and what to do otherwise. Centralised power means easy centralised power backup - i.e. put a UPS on each PoE switch/stack powering phones (and WiFi APs) and you have them remain up - for a while at least. Certainly, if you have a standby generator, ensure the UPS runtime is long enough to cover the delay in switchover from utility to backup power. 

2. Running two PABXs in parallel may -or may not- be a good idea. 


If you're changing technology (i.e. going from analogue or ISDN trunk lines to SIP or other VOIP backhaul) you may have the option to run both systems partially in parallel. Depending on your environment, this may be quite attractive, in that you can implement a new system in parallel, test it, and only cut across when you're happy with it. There may be complications if you're partly replacing a hybrid PABX with some VOIP handsets, but you should be able to work around this with managed switches and perhaps some external phone power supplies and/or PoE injectors. IP address space limitations are easily dealt with through larger scopes, or additional voice VLAN(s). 

Obviously, this means you have more systems floating around, and perhaps, more confusion - and probably means more than one visit to each end point. However, this multiple visit "bug" is actually a feature, because it means, once you've officially switched over, you (or other helpdesk people) can go around and make sure that all the "essential" features on the handsets are understood, and report/fix any outstanding issues.  Many people are somehow bad at reading/following instructions, and "monkey see, monkey do" can be very helpful. Prepare to deflect any non-VOIP related issues that may crop up on such contacts - get them to fill in a helpdesk ticket for non-VOIP issues, that will be handled by others. 

Make certain that at least "essential" people have both systems as soon as possible - and that "essential" endpoints are the last to have the old system removed. This is your switchboard, and any "emergency/vital" contact points (which may include IT, maintenance, healthcare, marketing/admissions/fees, other departmental switchboards, and so on). 



3. Don't assume that the installers know as much as you do about what is needed, even if you've spent ages filling out (their!) needs requirements documents for them. 


Make sure you assign people (or yourself, stretched really thin) to stay with the person(s) doing the central configuration - as well as to help them find all the endpoints for installation - you may need to co-opt "helpers" from the staff or student body. Make sure every on site implementation person has a reasonably "clueful" companion from your school/team.

Sometimes the post-sales spec team(s), and the actual implementation team(s) aren't even the same people.

If their documentation doesn't make sense to you, make your own, and that should be the "gold standard" - if you don't know what a feature is called, ask.

Don't assume any feature is available just because you had it before, or have seen it elsewhere - make sure it's part of your spec documentation, and they say that it is supported (in writing) in the final order/contract.

Onboard sufficient VOIP/telephony jargon that you can properly describe what you want/need. For example, if you don't know what FXO and FXS are, and you're implementing analogue stuff, learn it! If you don't know what IVR is, learn - and so on. Basically, go through spec sheets, and Google unfamiliar terms and acronyms.

4. Don't assume the install team has actually been given/read commissioning/order documentation.

It's surprising how people pitch up and ask questions you've already documented at length. Don't get annoyed, but do have clear guidelines to give them, and answers to likely questions - or have a clear understanding of where to get the definitive answer. 

We were quite surprised how much time the implementation person had to spend on the phone to various "higher tier" support people to get things working. Eventually, it seems to have transpired that the SIP account/lines were not even correctly created by the central infrastructure team. Re-creation seemed to solve a multitude of sins (like not being able to dial out...!) 

5. Clear your schedules before and after you do this sort of thing. 

In order to do steps 1, 8, and 10 properly, you need (lots of) time. We all know how busy IT helpdesks at schools can be. 

Make sure sysadmin chaos is eliminated through change freezes (and avoiding "patch Tuesdays"). Declare (management approved) reduced support availability, and what will happen instead, and perhaps even what problems people will have to "just deal with" whilst you/IT team get VOIP sorted. You may have the luxury of a sufficiently large IT support operation that you can run a "skeleton" IT helpdesk for non-VOIP issues. This is unlikely at schools! 

There will, no matter how good the implementation was, be outstanding things that need attention. Make sure you "block out" at least two full days each side of the implementation window in your calendar. A week each side is probably more realistic.

Ensure the time to attend to these outstanding items - and to write the documentation, whilst things are "fresh" in your mind - is available. 


6. Don't make Politics your problem. 


It's amazing how much "politics" there can be between administrators whose literal job it is to answer phones. (I understand this; phones are ludicrously intrusive on all other types of work. I hate phones - asynchronous communication [like email, text chat, etc.] is MUCH better). Still, your organisation's external customers and stakeholders (parents, suppliers) expect to speak to a human that can actually help them quite quickly if they phone you. 

Don't get involved. 

Don't play favourites. 

The correct point of view on how the system should function is that of the (external) customer - what do (prospective) parents reasonably expect? Deliver that experience as much as possible. Explain this to management. 

Get senior management to understand what the "best practice" is, and, diplomatically(!), hint that their PA may not be giving them the best answer as to how things "ought" to be. Make sure you present this best practice "out of band" of any meeting/email where people that deliver Politics unto the system (most commonly, the PAs of senior management!) are present/recipients. Remember that many may give their PAs access to email, so that may not be the best modality to bring things to their attention...!

I don't recommend it (because it's often an "untruth", which makes me uncomfortable), but a new system is often an opportunity to implement the "right" way as "the only supported way"/"way it has to be now, because of system limitations/features"...! This also gives management an opportunity to "get out of" difficult politics with their direct reports... An inanimate object can take quite a lot of hate and just not care, particularly if irate users cannot throw it out of the window! 

If there are unpopular "ways it has to be", point out some awesome new feature they're getting as a (partial?) recompense! Commiserate about the horrors of technology. Move on with the day! :) 


7. Number portability is awesome. 


There was once a time when a change in telephony provider meant all your numbers had to change. 

No longer! 

This feature is extremely useful, because it means that external stakeholders don't "lose" your contact information, and all your printed stationery, etc. remains substantively correct. Changing phone numbers is at least as much expense and trouble as changing your organisational branding. 

If you live somewhere benighted that *doesn't* yet have number portability, make sure you lend your voice to any campaign to enact it through the appropriate governmental/regulatory processes. 

8. Document as you go

You will likely learn things as you go along. Make a note of them (in your hardback notebook) and transfer them into your team Wiki once you have a chance. Refer to your checklist(s) as implementation proceeds. 

There are additional things you should document as policy - does a person, or a functional role, take precedence for Moves, Adds and Changes? Who can request a phone (or a fancier phone, or an additional phone)? Schools are odd as organisations (from a telephony perspective, as well as others!), because not everyone has a supplied "office phone" - they're pretty annoying in a classroom setting, and that is the "average" teacher's office environment. 

I would say functional role should take precedence - because from an organisational perspective, people phoning number X for official business probably mostly want Role Q, not Person P. If a person who used extension X wears multiple job role "hats", this may become somewhat complex. Documentation (i.e. contact details on websites or other forms of directory) should be kept up to date, if you supply such information publicly (or privately!). This is similar to the "role based" vs "person based" email account issue, but abstraction is harder/more expensive (but not impossible) with telephony. Indeed, when there are clearly different functional roles, it may be worth "pre-allocating" external numbers or at least extensions to those separate roles, and having multiple lines on a telephone to support this. 

9. Make sure your own implementation team knows what is required. 

Make 100% sure your "helpers" understand what "in parallel" means if you're going to run both systems together during the switch-over. Language barriers are real, and sometimes, people don't pay attention in meetings (or even read documentation) when you explain what is required. 

Have separate "your team" and then "everyone" meetings - i.e. make sure your in-house people "get" what you're planning, and that the external implementation team is also on the same page. Provide a clear "command chain", and (somehow) find some time to make sure the distributed teams are getting it right - having a centralised deployment location with all the new gear near to you may help (but not in the same room, because that's not ideal - because it's distracting when you're doing "hard" things).

Many of our problems came back to having insufficient time to get this implementation planning done and communicated, and conflicting commitments as the supplier kept delaying the implementation date (on one occasion because of damage in transit of a part). 

If you can, plan extra faaaaaaar in advance, so you can do changes like this in a vacation/holiday period (making sure you have access to ALL keys/buildings...!). In our case, installation sooner rather than later was desired to realise cost savings - otherwise, this would have been delayed until August - although we've been planning this move for well over 5 months already; the chosen supplier could not deliver during our last holiday period (to be fair, we only finalised this project just before that holiday began). 

10. Make checklists. Follow them!

Checklists are a fairly good way of making sure things go properly, and that you don't forget anything important. So make them, and use them! Every single step (or group of steps) should be on a checklist, and as they are (successfully and completely) finished, they ought to be checked off. Where they are not successfully/completely finished, "red flag" them for attention/resolution/escalation. 

SIP Security


As you can imagine, in a world where phone calls cost money (and money-making premium rate lines are a thing a dubious actor might run, and direct fake calls at), SIP trunk accounts (and SIP servers/PABXs) are a tempting target; you need to secure SIP. You may want to ensure your service provider implements this, and that anyone using SIP to your PABX externally is making use of a suitably secure connection (a VPN back to your home base is a good bet). A clear indicator is that the SIP URI scheme is "sips" rather than "sip" - just like https is secure http. As with HTTP, TLS is a relatively easy (PKI complexities aside) mechanism to get this right. Even the early SIP specification, RFC 3261, covers secure use.
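The "sips" versus "sip" check is simple enough to script if you want to audit a list of configured URIs. A minimal Python sketch follows; the function names and the example.com addresses are made up for illustration, and the check only looks at the scheme, so it says nothing about whether the far end actually negotiates TLS or whether the media stream is encrypted.

```python
def sip_uri_scheme(uri):
    """Return the lower-cased scheme of a SIP URI ('sip' or 'sips')."""
    scheme, sep, rest = uri.strip().partition(":")
    if not sep or not rest:
        raise ValueError(f"not a URI: {uri!r}")
    scheme = scheme.lower()
    if scheme not in ("sip", "sips"):
        raise ValueError(f"not a SIP URI: {uri!r}")
    return scheme


def is_secure_sip_uri(uri):
    """True only when the URI asks for secure (TLS) signalling."""
    return sip_uri_scheme(uri) == "sips"


if __name__ == "__main__":
    # Hypothetical URIs for illustration only.
    for uri in ("sips:reception@example.com:5061", "sip:reception@example.com"):
        print(uri, "->", "secure" if is_secure_sip_uri(uri) else "NOT secure")
```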

You may need to specifically ensure that the liability for malicious calls and compromises is contractually agreed. 

Interestingly, our SIP trunk account/PABX (the latter is more likely) got hacked/compromised and exploited within days of being implemented, and ran up thousands in billable illicit calls overnight (to the point that some of our staff got phoned in the middle of the night to ask if that was "normal")...
At least some of the traffic is related to this: https://badpackets.net/ongoing-large-scale-sip-attack-campaign-coming-from-online-sas-as12876/

This irritates me, as we were explicitly instructed by the suppliers not to firewall the PABX at our border. They say they've implemented a firewall on the PABX host, but I didn't see much evidence of that; perhaps they've tightened up some Asterisk/FreePBX setting(s). (They have now implemented some incoming rules, but they need to be blocking outgoing traffic, too, as I can see traffic originating at the PABX to weird locations...). I really suspect the PABX box itself has been pwned and needs to be nuked 'n paved.

I strongly suspect the service provider also does not encrypt their SIP sessions - and even though we reserved a routable IP address for all of our legitimate SIP traffic, this perhaps was not used to limit SIP trunk access. With your SIP traffic going across the Internet, there are plenty of places someone could intercept that traffic (and plaintext credentials) - and of course there may be pwned endpoints and devices all over the place. As I can see dubious looking calls in the PABX's CDRs (Call Data Records), it seems the PABX was used in the compromise rather than the SIP trunk itself being compromised (also, I can see weird traffic in our border firewall logs). 
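Trawling CDRs by eye gets old quickly, so a small script to flag the obviously odd calls may be worthwhile. This is a minimal Python sketch, assuming the CDRs have been exported to CSV with a header row naming `src`, `dst`, `start` and `billsec` columns (the real layout depends on how the PABX is configured). The "00" international prefix, the quiet-hours window and the sample rows are all assumptions to be tuned for your own site.

```python
import csv
import io
from datetime import datetime

SUSPECT_PREFIXES = ("00",)  # assumption: international dialling prefix
QUIET_HOURS = range(0, 5)   # assumption: no legitimate calls 00:00-04:59


def flag_suspicious(cdr_csv_text):
    """Return (src, dst, start, reasons) for each call that looks odd."""
    flagged = []
    for row in csv.DictReader(io.StringIO(cdr_csv_text)):
        started = datetime.strptime(row["start"], "%Y-%m-%d %H:%M:%S")
        reasons = []
        if row["dst"].startswith(SUSPECT_PREFIXES):
            reasons.append("international destination")
        if started.hour in QUIET_HOURS:
            reasons.append("placed during quiet hours")
        if reasons:
            flagged.append((row["src"], row["dst"], row["start"], reasons))
    return flagged


if __name__ == "__main__":
    # Made-up records for illustration only.
    sample = (
        "src,dst,start,billsec\n"
        "201,0315550100,2018-05-30 10:15:00,60\n"
        "205,0044700000000,2018-05-31 02:10:00,600\n"
    )
    for call in flag_suspicious(sample):
        print(call)
```

Something like this, run from cron with the output mailed to the helpdesk, would at least mean you hear about an overnight spree from your own systems first.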

You may also need to check that voicemail and IVR trees and other such features don't allow "dial-through" to "other locations". I was quite surprised to find the IVR tree allowed me to dial internal extensions, for instance (an undocumented "feature"). 

"Defense in Depth" would suggest you might want to be making use of secure SIP across your LAN as well - dedicated VLAN or not.

Asterisk security is a fairly big topic, e.g. https://www.voip-info.org/asterisk-security

You may also want to check that your firewall doesn't blow a hole in SIP security. SIP ALGs and "session helpers" can be quite poorly implemented. Some versions of FortiOS seem to have issues. A fairly in-depth guide to SIP in FortiOS 5.6 is available.

It's important that secure credentials are sent securely, both when remote sysadmin teams send credentials, and when phones and technicians are connecting to PABXs... 

Looming Worries

The PABX itself is some sort of 1U device running Ubuntu 12.x, which is of course now out of support - even in LTS edition. This is something that you need to be wary about with embedded devices. It may be that there is an updated version coming out RealSoonNow with a new Ubuntu distribution, but we'd expect them to ship with the current available version - or be updated to it by the manufacturer. They (Far South) have their own Ubuntu repository (update.commanet.co.za), so they *may* be manually maintaining things that need patching. Of course, as/when they get onto the next Ubuntu release (presumably the latest LTS), upgrading looks fairly straightforward, if likely to cause some downtime.

Given that these things, by their nature, need to at least partially live "on the Internet", it is a worry to have potentially unpatched systems online - particularly as the supplied configs don't seem particularly hardened. We'll probably lock down our firewall more as and when we get a chance to figure out what it needs beyond the basic SIP ports. It seems to run shorewall, so that could also be tightened up on the host itself - either there or in iptables.
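Whatever ends up enforcing it (border firewall, shorewall or iptables), the rule we want amounts to "only accept SIP from sources we know about". The logic can be sketched in a few lines of Python using the standard `ipaddress` module. This is an illustration of the allowlist idea only, not a firewall, and the networks shown are documentation address ranges standing in for a provider's real SIP servers.

```python
import ipaddress

# Hypothetical allowlist: documentation ranges standing in for the
# provider's SIP trunk servers and any other trusted sources.
ALLOWED_SIP_SOURCES = [
    ipaddress.ip_network("192.0.2.10/32"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_allowed_sip_source(addr, allowed=ALLOWED_SIP_SOURCES):
    """True if addr falls inside any network on the allowlist."""
    ip = ipaddress.ip_address(addr)
    return any(ip in network for network in allowed)


if __name__ == "__main__":
    for source in ("198.51.100.7", "203.0.113.5"):
        verdict = "allow" if is_allowed_sip_source(source) else "drop"
        print(source, "->", verdict)
```

The same list, kept in one place, could be used both to generate firewall rules and to check firewall logs for SIP traffic from anywhere else.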
