Today, Office 365 customers experienced another global outage caused by Azure AD. When did you first know that there was a problem? Were you alerted to the problem by a frustrated user trying to log into email on a Monday morning? How much productivity was lost within your organization due to the outage?
This is the second outage in as many months. This time it was caused by a multi-factor authentication (MFA) outage. Users around the world who rely on Azure multi-factor authentication to access Office 365 workloads found that they could not successfully authenticate if they had an expired token. Luckily(?), most Office 365 customers still don’t use MFA to protect their Office 365 accounts, so those users were unaffected.
However, most Office 365 admin accounts are protected by MFA (as is best practice), and many admins found they could not sign in to manage their tenants. Both admins and end users also found that they could not use the Azure self-service password reset service, because it also relies on MFA.
The cause of the outage is still largely unknown at press time, but there have been references to a failed patch rollout. Microsoft’s health advisory describes only a “service degradation” in which users are “Unable to sign in to Microsoft 365 services”, starting Monday, November 19, 2018, at 4:39 AM UTC. Although the stated scope of impact was limited to users in EMEA and APAC, users in North America were also affected.
Of course, if your MFA-enabled admin account is affected by this outage, you wouldn’t be able to sign in to see this health advisory, which presents a catch-22.
This is the second wide-scale outage affecting Azure AD this year. The first was on September 4, 2018, when a cooling problem in the San Antonio, TX datacenter, caused by a voltage surge due to lightning strikes, affected authentication around the globe.
Azure MFA can be bypassed to work around this issue in two ways, depending on how MFA is configured in your tenant. Unfortunately, both methods must be configured ahead of time, because both are set up in the Azure Portal, which you may not be able to sign in to during an outage.
User Assignment MFA
If you assign MFA to a user or admin account, that user is prompted for MFA every time they access an Office 365 resource. This is configured in the Azure AD Portal.
Here, you can choose which users are required to use MFA. Users configured as Disabled will use Conditional Access MFA if it’s configured. If you click the Service Settings heading at the top of this same page, you can configure Trusted IP subnets. Federated users authenticating from these subnets will bypass MFA.
Unfortunately, the example IP subnets listed in the Azure MFA dialog show private IP subnets. You actually should enter your public IP subnet(s).
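Before pasting subnets into the Trusted IPs box, it can help to check that none of them are private ranges. The following is a minimal sketch using Python’s standard `ipaddress` module; the function name `private_subnets` and the sample subnets are illustrative assumptions, not anything taken from the Azure Portal.

```python
import ipaddress


def private_subnets(subnets):
    """Return the CIDR entries that fall in private (non-public) ranges.

    Hypothetical helper: entries returned here are probably mistakes
    if they were meant for the Trusted IPs list.
    """
    flagged = []
    for cidr in subnets:
        # strict=False tolerates entries such as 10.1.2.3/8 with host bits set
        network = ipaddress.ip_network(cidr, strict=False)
        if network.is_private:
            flagged.append(cidr)
    return flagged


if __name__ == "__main__":
    # Placeholder values; substitute your organization's own egress subnets.
    candidates = ["10.0.0.0/8", "192.168.1.0/24", "8.8.8.0/24"]
    print(private_subnets(candidates))
```

Any subnet the helper returns is one your users would never appear to come from when they reach Azure AD over the internet, so it would not give them a bypass.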
It’s important to know that User Assignment MFA overrides Conditional Access MFA, described next.
Conditional Access MFA
You can use Conditional Access MFA to apply MFA under certain conditions rather than every time a user accesses an Office 365 resource. Conditional Access offers built-in policies to use MFA. The first one, “Baseline policy: Require MFA for admins (Preview)”, requires MFA for all Office 365 admin accounts and is disabled by default.
If this policy is enabled, the outage experienced today would prevent you from signing in to Office 365 with any admin account. There is no option to bypass MFA for admin accounts with this policy. I see this as a problem, but Microsoft says this is by design. Perhaps they will rethink this after today’s outage.
The second policy, “Require MFA from external networks”, requires you to enter the subnet(s) for networks where you want to bypass Conditional Access MFA. Again, here you enter the public subnet(s), not the private ones shown in the example.
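To plan for the next outage, it may be useful to model which sign-ins would still be prompted for MFA. The sketch below is a simplified model of the rules as described in this article, not Azure AD’s actual policy engine; the `SignIn` class, the `mfa_required` function, and all parameter names are hypothetical.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class SignIn:
    per_user_mfa: bool   # account is assigned MFA under User Assignment MFA
    is_admin: bool       # account holds an Office 365 admin role
    is_federated: bool   # account authenticates through federation
    source_ip: str       # public IP address the sign-in comes from


def mfa_required(sign_in, trusted_subnets, admin_baseline_on, external_policy_on):
    """Simplified, assumed model of the MFA rules described above."""
    ip = ipaddress.ip_address(sign_in.source_ip)
    trusted = any(ip in ipaddress.ip_network(s) for s in trusted_subnets)

    # User Assignment MFA overrides Conditional Access MFA.
    if sign_in.per_user_mfa:
        # Only federated users coming from a Trusted IP subnet bypass MFA.
        return not (sign_in.is_federated and trusted)

    # The baseline admin policy offers no bypass at all.
    if admin_baseline_on and sign_in.is_admin:
        return True

    # "Require MFA from external networks" skips MFA for listed subnets.
    if external_policy_on:
        return not trusted

    return False
```

Under this model, an admin covered by the baseline policy is prompted for MFA even from a trusted subnet, which is exactly the situation that locks admins out when the MFA service is down.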
Despite recent outages, I still believe that MFA is one of the very best ways to secure accounts and resources in Office 365. Proper planning can keep you from getting into a pickle in the event of an outage like today’s.
Mailscape 365 from ENow Software provides visibility
ENow Software is the leading provider of Office 365 management solutions that help you save money and increase end-user productivity.
Let’s quickly walk through how Mailscape 365 surfaces problems in real time and enabled our customers to successfully navigate the Azure AD outages over the past two months to achieve SLA transparency.
Once the September 4th outage and today’s outage started to affect the ability to authenticate via Azure AD to Office 365 systems, the Mailscape 365 OneLook dashboard turned red as a visual indicator for the NOC. You can see in the screenshot below that the Network and Directory & Authentication services are showing red.
The visual cue of the red indicators quickly shows that there are issues with the Office 365 service.
During today’s outage on November 19th, Mailscape 365 customers were greeted with a failed multi-factor authentication status when selecting the failed network indicator.
Mailscape 365 to the Rescue
ENow customers like Barclays, Facebook and Wells Fargo were able to quickly identify and drill down to the root cause of the problem as it was happening.
Don't believe us? See how to triage today’s issue in real-time by viewing the video below!
The Importance of Office 365 Monitoring
In a cloud world, outages are bound to happen. While Microsoft is responsible for restoring service during outages, IT needs to take ownership of its environment and user experience. It is crucial to have greater visibility into business impacts during a service outage the moment it happens.
ENow’s Office 365 Monitoring and Reporting solution enables IT Pros to pinpoint the exact services affected and the root cause of the issues an organization is experiencing during a service outage by providing:
- The ability to monitor entire environments in one place with ENow’s OneLook dashboard, which makes identifying a problem fast and easy, without having to scramble through Twitter and the Service Health Dashboard looking for answers.
- A full picture of all services and subsets of services affected during an outage with ENow’s remote probes, which cover several Office 365 apps and other cloud-based collaboration services.