On Friday, June 9th, 2023, customers and users of the Microsoft Azure Portal were unable to access the service. Microsoft confirmed the incident in several communications to its customers but offered few specifics about the cause of the Azure outage.
Earlier in the week, on Monday, June 5th, Microsoft dealt with a service incident ("outage") impacting Microsoft Outlook, Teams, OneDrive for Business, SharePoint Online, and other Microsoft 365 services.
Then, on June 8th, Microsoft dealt with a OneDrive outage that prevented customer access worldwide. As reported by several news agencies, the hacktivist threat actor known as 'Anonymous Sudan' claimed responsibility for all of the Microsoft service incidents that week, allegedly employing DDoS attacks to disrupt several Microsoft services.
This is what many customers saw Friday, June 9th, when attempting to access the Azure Portal:
And this is what IT professionals, equipped with ENow's Microsoft 365 monitoring and reporting solution, saw at the exact same time: a "red light" indicator front and center on ENow's OneLook dashboard. This gave ENow clients an immediate indication that something serious was happening, allowing IT pros and system admins to take action before tickets and calls to the helpdesk came in:
Incidents like this one underscore the need for IT professionals to have a third-party monitoring tool at their disposal, such as ENow's Microsoft 365 monitoring and reporting solution, to fill the gaps and clear up blind spots where Microsoft's native tools are limited. In situations like these, it is vital for an organization to obtain critical service and health status information as soon as an issue manifests, without waiting on a delayed tweet from Microsoft.
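To make the idea of independent detection concrete, here is a minimal Python sketch of a synthetic availability probe that turns recent check results into a dashboard-style light. It is an illustration only and does not depict how ENow's product works: the function names (`http_probe`, `light`, `watch`), the three-failure threshold, and the "yellow" and "green" states are all assumptions made for this example.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, List


def http_probe(url: str, timeout: float = 5.0) -> bool:
    """Return True if the endpoint answers with an HTTP status below 500."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status < 500
    except urllib.error.HTTPError as err:
        # A 4xx reply still means the service is answering; 5xx is a failure.
        return err.code < 500
    except OSError:
        # Covers DNS errors, refused connections, and timeouts.
        return False


def light(history: List[bool], red_after: int = 3) -> str:
    """Map probe results (oldest first) to a status colour.

    Hypothetical rule: count the failures at the end of the history;
    `red_after` or more consecutive failures is "red", fewer is "yellow".
    """
    failures = 0
    for ok in reversed(history):
        if ok:
            break
        failures += 1
    if failures >= red_after:
        return "red"
    if failures > 0:
        return "yellow"
    return "green"


def watch(probe: Callable[[], bool], rounds: int, interval: float = 60.0) -> List[str]:
    """Run `probe` `rounds` times and return the light after each check."""
    history: List[bool] = []
    lights: List[str] = []
    for i in range(rounds):
        history.append(probe())
        lights.append(light(history))
        if i + 1 < rounds:
            time.sleep(interval)
    return lights
```

For example, `watch(lambda: http_probe("https://portal.azure.com"), rounds=10)` would check the portal once a minute for ten rounds. Requiring several consecutive failures before showing "red" is a simple way to avoid alerting on a single dropped request; a production tool would also probe from multiple locations and test sign-in and real user workflows, not just a single URL.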
Microsoft's first public communication about the Azure Portal outage came at approximately 11:00 AM (ET) via the Microsoft Azure status page. As with all of its public messages regarding the outage, Microsoft was vague and short on specifics:
At approximately 12:35 PM (ET), Microsoft posted a very sparse message about the outage from its @AzureSupport account:
⚠️We are investigating an issue impacting the Azure Portal. More information and updates can be found on the Azure Status page at https://t.co/Slt24X1wGM
— Azure Support (@AzureSupport) June 9, 2023
Several minutes later, Microsoft repeated the same limited message via @MSFT365status.
At approximately 3:18 PM (ET), Microsoft posted its next, and final, tweet about this major outage from @AzureSupport. It was brief, stating only that the situation was now mitigated and that "detailed" information could be found on the Azure Status History page.
✏️Engineers have confirmed that an issue that impacted Azure Portal is now mitigated. A detailed resolution statement has been posted to the Azure Status History page at https://t.co/GMEztPRZKv.
— Azure Support (@AzureSupport) June 9, 2023
Thus far, Microsoft has disclosed little about the cause of the issue. According to the brief statement on its Azure Status page, Microsoft identified a "spike in network traffic" which in turn impacted its ability to manage traffic. Microsoft also disclosed that it employed load-balancing processes in conjunction with auto-recovery operations to mitigate the situation. That was the extent of Microsoft's explanation of the cause of the June 9th Azure Portal outage.
Microsoft has promised to provide a post-incident review within the next few days. At this time,* however, more precise and comprehensive details from Microsoft about the June 9th Azure outage are still pending.
[* Microsoft has since publicly confirmed that the June 9th Azure outage and the other Microsoft service incidents that same week were indeed caused by DDoS attacks.]
In a cloud-based world, outages are bound to happen. While Microsoft is responsible for restoring service during outages, IT needs to take ownership of their environment and user experience. It is crucial to have greater visibility into business impacts during a service outage the moment it happens.
ENow’s Microsoft 365 Monitoring and Reporting platform provides this type of visibility so that users remain productive. It monitors Microsoft 365, your on-premises hybrid services, and remote locations through our OneLook dashboard. This dashboard provides a single, easy-to-use way to view, at a glance, everything you must monitor in your environment without the need for separate tools: everything is integrated and monitored as one system.
ENow’s Microsoft 365 Monitoring and Reporting solution enables organizations to detect outages immediately, validate the end user experience, and increase adoption while controlling costs. It enables IT Pros to pinpoint the exact services affected and the root cause of the issues an organization is experiencing during a service outage by providing: