On January 25, 2023, at ~2:31 AM ET, Microsoft communicated via tweet (@MSFT365status) that they were investigating an issue impacting multiple Microsoft 365 services.
For IT professionals and system admins with access to the Microsoft Admin Center, the service incident number to reference is MO502273.
In their second message, approximately one hour later, Microsoft said it had identified a networking issue as the possible cause, but remediation steps were still to be determined.
Followers on Twitter were quick to point out that even the status.office365.com page was unavailable during this time.
Approximately two hours after first reporting the issue, Microsoft stated that the recent network change had been rolled back and that they were continuing to monitor the situation.
Several tweets from the MVP community voiced frustration over yet another Microsoft service incident this month, with a previous Microsoft 365 service incident taking place just last week.
At approximately 5:47 AM ET, Microsoft's next communication via @MSFT365status indicated they were still monitoring the incident and that some customers were reporting service restoration.
As of 7:30 AM ET, a check of status.office365.com confirmed service degradation for multiple Microsoft 365 services, including but not limited to Teams, Exchange Online, Outlook, Power BI, OneDrive, and the Admin Portal.
The outage was significant enough that larger media outlets reported on it.
At approximately 9:30 AM ET, Microsoft provided another update: all impacted services had recovered and remained stable. However, Microsoft also stated that they were actively investigating ongoing impacts to Exchange Online services, and provided a new service incident number for the Exchange Online matter: EX502694. As of this writing, there have been no further messages from @MSFT365status about the outage.
However, Microsoft has provided more comprehensive information about this event outside of Twitter, namely on their Azure status page. So what exactly happened? According to Microsoft, a change made to the Microsoft Wide Area Network (WAN) impacted connectivity between clients on the internet and Azure, connectivity between services within regions, and ExpressRoute connections. Microsoft rolled back the change, and recovery was observed across all regions and services.
Microsoft has indicated that a preliminary Post Incident Review (PIR) will be delivered to the public shortly, as well as a more detailed PIR several days after their preliminary report.
The Importance of Microsoft 365 Monitoring
In a cloud world, outages are bound to happen. While Microsoft is responsible for restoring service during outages, IT needs to take ownership of its environment and user experience. It is crucial to have visibility into business impacts the moment a service outage happens.
ENow’s Microsoft 365 Monitoring and Reporting solution enables IT Pros to pinpoint the exact services affected and the root cause of the issues an organization is experiencing during a service outage by providing:
- The ability to monitor networks and entire environments in one place with ENow’s OneLook dashboard, which makes identifying a problem fast and easy, without having to scramble through Twitter and the Service Health Dashboard looking for answers.
- A full picture of all services and subsets of services affected during an outage with ENow’s remote probes, which cover several Microsoft 365 apps and other cloud-based collaboration services.
Identify the scope of Microsoft 365 service outage impacts and restore workplace productivity with ENow’s Microsoft 365 Monitoring and Reporting solution. Access your free 14-day trial today!