M1’s 3G downtime: what’s the root cause?

January 17th, 2013 | by Alfred Siew

As thousands of M1 subscribers have found themselves suddenly cut off from phone calls, SMSes and e-mail over the past couple of days, it is hardly surprising to see many taking to Facebook to hit out at their telecom operator.

That the most serious outage in recent memory was caused by M1’s vendors somehow setting off a water sprinkler at a network centre makes it even harder to accept. As a friend who works in the industry remarked, this looked like a rather “noob” mistake.

A few questions immediately spring to mind.

How did these vendors, there to upgrade transmission equipment at 3am on Tuesday, set off the emergency systems? Why was there seemingly a single point of failure, meaning that traffic could not be re-routed elsewhere quickly enough?

Right now, about 48 hours after the network first went down, many users still cannot connect to it. M1 expects to fully restore its services only at midnight on Thursday, which means close to three days of downtime.

During this time, users may not be able to call each other to arrange meetings. Those on the road who depend on a 3G connection could miss important e-mail until they reach the office.

The dependence on mobile Internet has never been heavier. Businesses are badly affected when it goes down for such a long time.

Accordingly, the industry regulator has said it was “concerned” by the breakdown. In the past two years, the Infocomm Development Authority (IDA) has been tough on telcos to improve their quality of service, handing out heavy fines of hundreds of thousands of dollars for serious infringements.

In the most recent case last month, it fined all three cellphone operators S$10,000 each for not providing satisfactory 3G coverage.

Just months before, SingTel was handed a record S$400,000 fine for disruptions to its 3G service. In September 2011, its subscribers could not make calls, send SMSes or go online after 5 per cent of its base stations were down for two days.

M1’s outage looks more serious. If it is found to be responsible, and its recovery job deemed too slow, it will be staring at a hefty fine from IDA. The regulator is empowered to hand out penalties of up to S$1 million or 10 per cent of a telco’s annual turnover, whichever is higher.
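To make the "whichever is higher" rule concrete, here is a minimal sketch of how that ceiling works out. The function name and the turnover figures are hypothetical and chosen only for illustration; they are not M1's actual numbers, and the real penalty would be decided by the regulator, not by a formula.

```python
def max_penalty(annual_turnover_sgd: float) -> float:
    """Ceiling on a fine: the higher of S$1 million or 10 per cent of annual turnover."""
    flat_cap_sgd = 1_000_000   # fixed S$1 million limit cited above
    turnover_percent = 10      # 10 per cent of annual turnover
    return max(flat_cap_sgd, annual_turnover_sgd * turnover_percent / 100)


# Hypothetical turnover figures, for illustration only.
for turnover in (5_000_000, 500_000_000):
    print(f"turnover S${turnover:,} -> ceiling S${max_penalty(turnover):,.0f}")
```

For a large operator, the 10 per cent limb dominates, so the ceiling sits far above the S$1 million figure.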

Yet, this latest episode also shows that it is perhaps time to think beyond fines and get to the root of the problem. The engineers have a phrase for it. It’s called root cause analysis.

Are telcos struggling to cope with running three networks (2G, 3G and 4G) at the same time? They keep 2G running because it connects users from overseas who may turn up with a 2G phone and “roam” on their network during a visit.

The question is whether telcos are up to the job, handling the complexity that’s needed to keep all the networks running at the same time. As they rush out their 4G services, are they becoming less careful?

The regulator can fine them more heavily each time, but that won’t solve the problem if the technical issues are not ironed out. The deterrent is important, but more can be done.

Should the level of backup be stepped up? Voice calls, for example, should be re-routed much more quickly than M1 has managed when one piece of network equipment gets damaged. Having so little backup suggests a poorly planned network.

M1 has advised users to switch their phones to the older 2G network to keep using its services, but many users say they still cannot access any services after doing so. So, is there enough 2G bandwidth in the air to cater for a sudden influx of users, now that much of the spectrum has been re-assigned to newer 4G services of late?

Going further, would local telcos be open to routing each other’s calls or SMSes as a backup link if one of them is partially out of action, like M1 this week? This is not something that can be turned on with a switch – it requires prior planning and setup – but it’s not such a controversial idea either.

During the catastrophic East Asian telecom blackout of 2006, telcos that were affected turned to alternative, though slower, links such as satellite and over-land cables after many of their undersea cables were cut following an earthquake in Taiwan.

Those who had extra capacity sold temporary links at really expensive prices – they could because the connections were needed desperately – but at least they provided a connection for many users as affected telcos sent divers out to sea to patch up their cables.

The same could work in Singapore. At the least, a link between the three telcos could ensure that some calls, for example, can be routed through.

If they can easily take on traffic from travellers who roam on their networks while in Singapore, why not provide an emergency lifeline for a local user from another local telco?

True, the rivalry is fierce among the red, orange and green camps. Yet, if they can be brought together by the IDA, then Singapore’s infrastructure stands to be robust enough to shake off setbacks like the one this week.

In future, technology companies expect thousands of sensors to be connected via a mobile Internet link, feeding live information, such as the level of flood water or the quality of water in a reservoir, to government agencies and citizens.

As dependence on the mobile network grows, its reliability has to improve as well. Surely, it has to be better than what M1 has shown this week, when you’re lucky just to be able to make a call in some parts of the island.

