You may have heard about the recent O2 outage that left many of their customers unable to use the Internet, but what can you do when something goes wrong?
One of the best things you can do is to stay calm, find out what the problem is and keep everyone informed. For your users, knowing that someone is on the case can be reassuring and helps them to be aware that a fix is in progress.
With that in mind, what happened with O2 over the weekend?
Some of the equipment used by O2 is manufactured by Swedish company Ericsson, a well-known and highly trusted company with customers worldwide. Given that their equipment handles data ranging from Tweets to highly sensitive personal data and much more in between, keeping things secure is vital.
Many websites now use HTTPS rather than the older HTTP, which is often shown by a green padlock in the address bar. This helps you to know that anything going between you and the server is encrypted to prevent someone else from eavesdropping. One of the things this needs is an SSL Certificate, also known as a Security Certificate. These can be used for other purposes as well, not just websites.
Ericsson makes heavy use of Security Certificates to ensure that data going through their equipment is encrypted because a wide range of data is handled. SSL Certificates do not last forever and must be renewed before they expire - otherwise the encryption will stop working.
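As a rough illustration of how an expiry date can be watched before it becomes a problem, here is a minimal Python sketch using only the standard library. It is not how Ericsson or O2 monitor their systems; the function names, the 30 day warning threshold and the example host are all assumptions made for this example.

```python
import socket
import ssl
from datetime import datetime, timezone

WARN_DAYS = 30  # assumed threshold for this sketch; choose what suits you


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Days between `now` and a certificate's 'notAfter' date.

    `not_after` uses the text format returned by getpeercert(),
    for example "Jan 31 12:00:00 2030 GMT". A negative result
    means the certificate has already expired.
    """
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).total_seconds() / 86400


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Connect to `host` and return its certificate's 'notAfter' string.

    If the certificate has already expired, the handshake itself
    raises ssl.SSLCertVerificationError, which is also worth alerting on.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def needs_renewal(not_after: str, now: datetime, warn_days: int = WARN_DAYS) -> bool:
    """True when the certificate expires within `warn_days` (or already has)."""
    return days_until_expiry(not_after, now) <= warn_days
```

A scheduled job could call `needs_renewal(fetch_not_after("example.com"), datetime.now(timezone.utc))` once a day and notify someone when it returns `True`, which leaves time to renew before anything stops working.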
On the 6th December 2018, some of the Security Certificates used by Ericsson expired and weren't renewed. The systems correctly picked up on this and blocked all unencrypted data from passing through, effectively shutting down the entire system. As a result, anything that relied on the equipment also failed, with O2 being one of the worst affected. Fixing the issue took several hours of investigation and a number of attempts. By the evening, some systems had started to recover as the root cause was found and fixed. Both Ericsson and O2 are still cleaning up and it may take some time before the full answers are available.
While an outage of this size doesn't happen every day, it's always important to be prepared for when something does go wrong.