A Lack of Accountability at AT&T and Microsoft Led To AT&T’s Feb 22nd Network Outage
Microsoft’s Centralized, Horizontal Management Structure Is A Liability
One of TEK2day’s regular readers kindly pointed out that AT&T (T) outsourced its core 5G network to Microsoft (MSFT) back in 2021 (MSFT release HERE). AT&T’s February 22nd, 2024 Core 5G network outage provides a teachable leadership moment.
Microsoft publicly touted its collaboration with AT&T in 2021 and 2022. However, Microsoft did not call attention to itself on February 22nd when thousands of AT&T customers lost connectivity to AT&T’s Core 5G network – which is operated by Microsoft Azure.
I believe that AT&T CEO John Stankey was transparent (read the Feb 25th letter from John Stankey to AT&T employees HERE). Stankey said in his letter that AT&T’s review of the “outage indicates it was due to the application and execution of an incorrect process used while working to expand our network.”
Stankey’s explanation is highly plausible. Do I believe that a cyberattack could have taken AT&T’s Core 5G network down? Sure. Just look at what we wrote from 2020 to 2022 about the SolarWinds cyberattack, which darn near affected every Enterprise Software company. Companies clearly underweight the risk of cyberattacks and are underprepared for them.
However, I believe AT&T’s February 22nd network outage was most likely caused by a lack of accountability. For example, it appears that AT&T customer phone settings may have determined whether a particular AT&T customer was ultimately affected by the network outage. AT&T owns the customer relationship. Therefore, shouldn’t AT&T engineers have been thinking about potential risks associated with the upgrade across the value chain, from the Core network (in this case Azure) all the way through to the endpoints/phones? The answer is emphatically “Yes”. Yet the act of outsourcing the Core network to Microsoft likely changed the focus of AT&T engineers. Many likely thought “out of sight, out of mind” once Microsoft acquired the core network in the deal.
Similarly, shouldn’t Microsoft Azure engineers have prepared AT&T for potential risks associated with the network upgrade? The answer is emphatically “Yes”. Perhaps some did. However, it is likely that many Microsoft Azure engineers were focused only on Azure’s operations and did not put as much thought into risk prevention at customer endpoints.
Microsoft’s Executive Management structure is centralized and horizontal. That is a significant liability.
Microsoft ought to have C-Level Executives, each of whom owns a single industry vertical (Telecom, Technology, Pharma, Auto, Energy, Financial Services, Healthcare and so on) and each of whom reports directly to MSFT CEO Satya Nadella. Such a decentralized management model would drive accountability deep into MSFT’s industry vertical organizations.
C-Level Executives would own and be accountable for the full technology stack that touches their customers (in addition to having P&L responsibility for their industry verticals). Azure ought to be in a support role, supporting various industry vertical solutions.
If the above were true, the Microsoft C-Level Executive who owns Telecom would have been all over the AT&T relationship. Prospective risks would have been identified in conjunction with AT&T well in advance of the network expansion. Real-time communication between MSFT and T would have been optimized.
Resources used to research certain technical elements of this article:
https://www.thousandeyes.com/blog/internet-report-pulse-update-att-outage-and-other-news
https://www.wired.com/story/att-network-outage-verizon-tmobile/
https://www.lightreading.com/mobile-core/at-t-s-outage-twists-up-its-mwc-story
https://www.lightreading.com/mobile-core/at-t-to-offload-5g-into-microsoft-s-cloud



