Canada left without communications: the Rogers outage and the failure of all systems

Friday morning starts differently for everyone. Municipal workers get up very early and begin preparing Toronto for the influx of people who flood the city's streets and roads. At 4:30 in the morning, city services noticed that mobile service had disappeared and that many vehicles had suddenly vanished from the map and stopped showing signs of life. It was easy to assume that something had happened to the city's own systems, some minor glitch of the kind that leads to similar consequences from time to time. But after a few hours it became clear that the failure had nothing to do with the city's systems: Rogers, an operator providing both mobile and fixed-line services, had stopped working throughout Canada. Canada has a population of 38.8 million, of whom 11 million are Rogers Wireless users and 3 million use wireline services. In addition, the largest operator serves banks, government agencies, hospitals and even the 911 service. From early morning, both ordinary people and government services were left without communications.

It is impossible to blame hackers for shutting down the biggest provider: no one attacked Rogers' systems, except for the company's own engineers, who do this all the time. During the pandemic, Rogers began to economize on staff as well as on equipment; the operator mainly uses systems from Ericsson. In April 2021 there had already been a large-scale failure, when Canadians were left without service for the whole day: voice calls, SMS and data transfer did not work. At that time, the blame was placed on a software update. The president of Rogers apologized and promised to look into the causes, to compensate all those affected for the inconvenience and, in general, never to let such a thing happen again. The company's shares did not suffer from that large-scale failure in 2021; the stock market did not react to the event.

Fifteen months later, the situation repeats itself and looks exactly the same; no lessons from the past have been learned. The total network downtime was about 19 hours, and even then the connection was not restored for everyone, only the main services. Since Friday is a working day and many Canadians work from home, they instantly flooded cafes, in particular Starbucks, where the Internet still worked because it is supplied by another provider. Starbucks was not happy about this, as people ordered almost nothing, nursing a single cup of coffee for several hours while sitting at their computers. Similar scenes played out across the country.

More than ten million of the country's inhabitants, that is, roughly one in four, found themselves without communications. To appreciate how helpless people felt, try to imagine that you cannot call your loved ones, read the news or message your friends; your phone becomes an almost useless toy. Social networks were flooded with messages about what was not working and how. Yes, people found fallback ways to get in touch.
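The "one in four" proportion can be checked with simple arithmetic. The sketch below uses only the figures quoted in this article (38.8 million inhabitants, 11 million Rogers Wireless users, "more than ten million" affected); the constant and function names are illustrative. It assumes, for simplicity, that each subscription corresponds to one person, which is not necessarily true.

```python
# Figures quoted in this article, in millions.
POPULATION = 38.8
ROGERS_WIRELESS = 11.0
AFFECTED_LOWER_BOUND = 10.0  # "more than ten million"


def share_of_population(people: float, population: float = POPULATION) -> float:
    """Return the given number of people as a fraction of the population."""
    return people / population


# Assumption: one subscription is counted as one inhabitant.
wireless_share = share_of_population(ROGERS_WIRELESS)
lower_bound_share = share_of_population(AFFECTED_LOWER_BOUND)

print(f"Rogers Wireless users: {wireless_share:.1%} of the population")
print(f"Ten million people:    {lower_bound_share:.1%} of the population")
```

Under these assumptions the share comes out at a little over a quarter of the population in both cases.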

Rogers reacted to the situation in the usual way: the president of the company addressed Canadians and promised to sort everything out and to compensate everyone automatically for the inconvenience. You can find this post here.

Interestingly, the message prudently does not give even an approximate time for the restoration of the network! This can be explained by the fact that, inside the operator, they simply did not yet understand what had happened.

The next day, when the connection had been restored, the president of Rogers issued another address describing what had happened in detail, which can be found here.

The not very detailed explanation says that the operator updated its software, which caused some of the routers to stop working, and the entire network went down. Rogers shut down the faulty equipment, redirected the traffic, and everything was up and running again. It took the operator 19 hours!


It would be useful to recall how Russian operators, who have experienced software update failures in the past, including with Ericsson equipment, developed a working model that has proven viable. First, the update is tested in a sandbox: it is not rolled out to the entire network, and engineers watch how it behaves in a limited environment. Then the update is carried out at night (just as at Rogers), while the engineering team keeps its finger on the pulse, making sure that everything goes according to plan. If a network failure starts to appear, everything is rolled back to the previous state, which takes minutes, not hours. Russian operators, taught by experience, treat the operation of the network with great care. There is nothing complex here that could not be replicated in Canada, yet for some reason failures keep occurring that look unimaginable: in effect, a whole day without communications, most of it during working hours.
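The procedure described above (sandbox first, then a monitored rollout, with a return to the previous state on any sign of failure) can be sketched roughly as follows. This is only an illustration of the general idea, not any operator's actual tooling: the `Router` class, the `staged_update` function and the `is_healthy` check are all hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Router:
    """Hypothetical stand-in for one piece of network equipment."""
    name: str
    version: str
    history: List[str] = field(default_factory=list)

    def apply(self, new_version: str) -> None:
        self.history.append(self.version)  # remember the state to return to
        self.version = new_version

    def rollback(self) -> None:
        if self.history:
            self.version = self.history.pop()


def staged_update(
    sandbox: List[Router],
    production: List[Router],
    new_version: str,
    is_healthy: Callable[[Router], bool],
) -> str:
    """Update the sandbox first, then production; roll back on any failure."""
    # Stage 1: limited environment only. Production is not touched yet.
    for router in sandbox:
        router.apply(new_version)
    if not all(is_healthy(r) for r in sandbox):
        for router in sandbox:
            router.rollback()
        return "rejected in sandbox"

    # Stage 2: production rollout, checking health after every step.
    updated: List[Router] = []
    for router in production:
        router.apply(new_version)
        updated.append(router)
        if not is_healthy(router):
            # Return everything touched so far to its previous state.
            for touched in updated:
                touched.rollback()
            return "rolled back"
    return "updated"
```

With a health check that rejects the new version, `staged_update` returns "rejected in sandbox" and the production routers are never modified. If the problem shows up only on a production router, every router updated so far is returned to its previous version.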

Why did this become possible? I found people inside Ericsson who know firsthand how the Rogers network works and what equipment is used there. The key word, apart from obscene language addressed to incompetent engineers, is economy. The operator saves heavily on equipment, forgoes redundancy and tries to cut its bills from suppliers. The whole business is geared towards saving as much as possible, and this is not only a Rogers problem but rather a shortcoming of all Canadian operators, a peculiarity of the country. Rogers does not have a sandbox in which to observe an update; it is rolled out to the entire network at once. The reason? Creating a sandbox is expensive, and it is considered better to accept the risk of a failure of all systems than to spend money on hardware and software. The quality of the Rogers network is quite average, and there are many bottlenecks. If you look at the Ookla test, it turns out that all Canadian operators are better than those in Russia.

But in fact the opposite is true: average speeds are higher in Russia, network quality is also higher, and Ookla's measurements suffer from a number of shortcomings. For example, network coverage in Canada is noticeably lower than in Russia. And if we take into account only large cities, the picture changes beyond recognition. It is simply that in Russia there is coverage in places where Canada has none, and speeds are also measured in villages and small towns, which dramatically changes the average. In Canada, there is often no connection at all in such places, hence the unequal comparison.

But we have talked about this many times, and I do not want to repeat it here. Let's take a look at how much service from Rogers costs. A regular plan with 25 GB of data at maximum speed costs 85 Canadian dollars (approximately 4,200 rubles). For 50 GB you will have to pay $105, or 5,280 rubles.

By the way, compensation from Rogers will be applied automatically: customers will not be charged for the day of downtime. One may note that this is very generous by Canadian standards. Last year the compensation was exactly the same, which caused indignation among owners of various businesses: they could not accept payments and lost a full business day of turnover. But nothing has changed today.

Expensive services, mediocre investment in the network, a minimum of equipment with no redundancy, and low-quality engineers who make such mistakes but cost the operator less than those who know and can do more. Failures are relatively rare, whereas equipment and people have to be paid for every month. This is the fundamental difference from Russia, where the approach is completely different: all players bet on the quality of service.

The example of Canada is an excellent illustration of what starts to happen when operators save on redundancy and on people. And this is no longer an exception but an event that recurs every year. Unfortunately, in the Western world it will become the norm in the coming years, and such outages will be perceived as something normal. They do not make people angry, at least not angry enough for the operator to feel an outflow of customers and financial losses; other operators are no better. It is noteworthy that service is so expensive that people cannot afford to keep two SIM cards just in case. This is another fundamental difference from Russia, where more than a third of all users have two or even three SIM cards, which makes resilience to possible failures much higher than in any other country in the world.

But in any case, the experience of Rogers is important. It should be shown to everyone who believes that the safety margin on our operators' networks is too high and that investment in equipment can be painlessly reduced. It cannot! We cannot afford such shutdowns, which means that we need to keep developing telecom at the same scale as before.

For a Russian user, a failure of this magnitude, affecting so many people, is unimaginable, all the more so a recurring one. Individual failures that leave a particular network partly inoperable for several hours are force majeure events that are analyzed at every level, and they happen extremely rarely. But in Canada things are somewhat different.

