
Active Directory Monitoring: Network AD Crashes

AmyKelly Petruzzella

When a network issue leaves your DC stranded on an “island”

Your users know immediately when they lose their internet connection, and the “internet is down!” tickets start flowing. But what happens when the network segment hosting their domain controller (DC) becomes unreachable? Microsoft refers to the isolated segment as a “replication island”: a part of the domain or forest that cannot communicate with the other DCs. That’s a more insidious problem because the symptoms are not immediately obvious.

From the user’s point of view, not much has changed. Their mapped drives will still be mapped (assuming the drive is not mapped to that DC). Even subsequent logins should succeed, assuming the default setting that allows locally cached credentials. As long as the DC was not also their DHCP server, addressing keeps working too, unless they try to renew their IP address. Software that relies on Active Directory authentication, however, may fail or kick users out. Group policies are applied at login, so everything except recent changes remains in effect. Tools like GPOTOOL.exe can reveal Group Policy issues, but only if you already have an inkling that something is wrong. Another confirmation test is to run the PowerShell Test-ComputerSecureChannel cmdlet from a client workstation. The best option, though, is to check automatically every day. That’s where effective monitoring software shows its value.
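
As a rough illustration of what a simple automated check might look like, the Python sketch below probes a DC's well-known TCP ports from a monitoring host. It only shows whether the ports accept connections, not whether AD itself is healthy. The function names are hypothetical, and Python is used here purely for illustration.

```python
import socket

# Well-known TCP ports a domain controller normally listens on.
DC_PORTS = {"Kerberos": 88, "RPC endpoint mapper": 135, "LDAP": 389, "SMB": 445}


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def probe_dc(host, ports=DC_PORTS, timeout=3.0):
    """Map each service name to whether its port accepted a connection."""
    return {name: port_open(host, port, timeout) for name, port in ports.items()}
```

Calling `probe_dc("dc01.example.com")` (a placeholder host name) from a scheduled task returns a dictionary of service names and True/False values. Any False is a reason to look at the network path to that DC.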

Even when the network segment is live, network misconfiguration is a common culprit in AD network errors. An unintended firewall change might block the RPC traffic that AD replication depends on (the endpoint mapper on TCP port 135, plus the dynamic RPC ports it hands out). What if the downed segment contained the DNS servers referenced by a DC? That would eventually show up as replication failures. But again, if you have the recommended DC redundancy, the symptoms will be slow to appear. Microsoft suggests monitoring replication health daily using REPADMIN. A better solution is to let Active Directory monitoring software do the work for you automatically.
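
To make the daily REPADMIN check concrete, here is a minimal Python sketch that reads the CSV text produced by `repadmin /showrepl * /csv` and lists the replication links reporting failures. The column names are assumed from typical output and should be verified against your own export. The function name and the returned dictionary keys are hypothetical.

```python
import csv
import io


def failing_links(showrepl_csv):
    """Return replication links that report one or more failures.

    Expects the text produced by `repadmin /showrepl * /csv`. The column
    names used below are assumptions based on typical output.
    """
    rows = csv.DictReader(io.StringIO(showrepl_csv))
    problems = []
    for row in rows:
        # Short or malformed rows yield None here, so fall back to "".
        raw = (row.get("Number of Failures") or "").strip()
        if raw.isdigit() and int(raw) > 0:
            problems.append({
                "destination": row.get("Destination DSA"),
                "source": row.get("Source DSA"),
                "naming_context": row.get("Naming Context"),
                "failures": int(raw),
                "last_success": row.get("Last Success Time"),
            })
    return problems
```

On a domain-joined machine with the AD tools installed, the input text could come from `subprocess.run(["repadmin", "/showrepl", "*", "/csv"], capture_output=True, text=True).stdout`. An empty result means no link reported a failure at the time of the run.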

The Knowledge Consistency Checker (KCC) can be both a blessing and a curse in this scenario. In a large enterprise with a hierarchical site topology, the KCC can change the replication flow as needed each time it runs. This is great for working around the sudden inaccessibility of a DC, but the result is sometimes not so easy to undo. With site-link bridging in effect, the KCC may pick another DC to replicate with, which can eventually disrupt the entire topology. The disruption may be worse if legacy, manually created site links still exist. Depending on how long the network outage went undetected, the offline DC could be significantly out of sync. It may even have to be removed and restored. And because the KCC is good at maintaining replication, users usually notice no disruption in the meantime.

The inaccessibility of a particular domain controller may only become apparent when significant AD changes are made. The AD console on the isolated DC will show a different “view” of the AD forest than its peers. Even this would not be obvious unless you were looking for something specific, such as a newly hired user’s account. If the isolated DC holds the PDC emulator role, the problem is more serious. You can verify which DC holds the PDC role using the "dsquery server -hasfsmo pdc" command.
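
The dsquery command above prints the role holder as a distinguished name. As a small sketch of how a script might use that output, the Python below extracts the server and site names from such a DN and checks the server against a list of DCs already known to be unreachable. The function names are hypothetical. The parser assumes the usual `CN=<server>,CN=Servers,CN=<site>,CN=Sites,...` layout and does not handle escaped commas.

```python
def parse_server_dn(dn):
    """Split a server DN as printed by dsquery into (server, site)."""
    parts = [p.strip() for p in dn.strip().strip('"').split(",")]
    # Expected shape: CN=<server>,CN=Servers,CN=<site>,CN=Sites,...
    well_formed = (
        len(parts) >= 4
        and all(p.lower().startswith("cn=") for p in parts[:4])
        and parts[1].lower() == "cn=servers"
        and parts[3].lower() == "cn=sites"
    )
    if not well_formed:
        raise ValueError(f"unexpected server DN: {dn!r}")
    server = parts[0].split("=", 1)[1]
    site = parts[2].split("=", 1)[1]
    return server, site


def pdc_is_isolated(pdc_dn, unreachable_servers):
    """Return True if the PDC role holder is in the unreachable list."""
    server, _site = parse_server_dn(pdc_dn)
    return server.lower() in {s.lower() for s in unreachable_servers}
```

The list of unreachable servers would come from whatever reachability check you already run. The comparison ignores case because DC names are not case-sensitive.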

The PDC emulator is the one FSMO role that a domain can’t live without for very long. By default, Group Policy objects are created and edited against the DC holding this role. More importantly, this role processes account lockouts, so its absence could be a significant security hole until access is restored. An administrator would normally choose to transfer the role, but that is not possible while the network segment is down. Seizing the role is also an option, but an impractical one. Microsoft’s suggested method is the PowerShell Move-ADDirectoryServerOperationMasterRole cmdlet with its -Force parameter, yet Microsoft doesn’t recommend a seizure unless the previous role holder is never going to return to the domain. The better answer is to repair the inaccessibility issue as soon as possible. The problem is compounded when you have multiple replication islands: if each island contains some FSMO roles, there may be problems even after communication is restored. This is why you need automatic, immediate reporting on both AD and network problems.

A recent study by Semperis revealed that 84% of those surveyed said that an Active Directory outage would be “significant, severe, or catastrophic.” Yet a third of those organizations revealed that, while they had an AD recovery plan in place, they had not tested it. Early response can help prevent network issues from developing into major AD problems. ENow’s Active Directory Monitoring and Reporting automatically detects DCs that are inaccessible due to network issues, and much more. You can effectively monitor all critical components of AD from a “single pane of glass.”

 


Active Directory Monitoring and Reporting

Active Directory is the foundation of your network, and the structure that controls access to the most critical resources in your organization. The ENow Active Directory Monitoring and Reporting tool uncovers cracks in your Active Directory that can cause a security breach or a poor end-user experience. It also enables you to quickly identify and remove users who have inappropriate access to privileged groups (Schema Admins, Domain Admins). While ENow is not auditing software, our reports reduce the amount of work required for HIPAA, SOX, and other compliance audits.

Access your FREE 14-day trial to accelerate your security awareness and simplify your compliance audits. The trial includes the entire library of reports.

