The curious case of the Azure AD Connect installation that stopped working after a reboot

Hybrid Identity, the relationship between Active Directory and Azure AD, has benefitted from many improvements in Azure AD Connect. For the vast majority of organizations with Hybrid Identity, Azure AD Connect provides the synchronization part of the Hybrid Identity story and can also play a vital role in the authentication part of it.

With the Azure AD Connect v2 release in July 2021, Microsoft took its free synchronization solution to the next level, at least in terms of software compatibility. Azure AD Connect v2’s SQL Server 2019-based LocalDB solution replaced Azure AD Connect v1’s SQL Server 2012 SP4-based LocalDB solution. It is more stable, performs better and makes Azure AD Connect ready for the next couple of years.

However, the LocalDB solution has also made Azure AD Connect installations go belly up over the last couple of months . . .

Over the past few months, I have been getting messages from admins whose Azure AD Connect installations stopped working after installing the latest Windows Server cumulative update. I dug into many of these installations, only to find that the Azure AD Connect-managed LocalDB instance could no longer start after any reboot. The monthly cumulative update didn’t break Azure AD Connect; it was merely the reason for the reboot.

Common causes ruled out

There are many common reasons why Azure AD Connect stops working and/or is no longer supported:

  • The LocalDB instance has grown larger than 10 GB
  • There’s insufficient RAM to start the LocalDB instance
  • A Group Policy setting is preventing Azure AD Connect or its core components from starting
  • The Windows Server installation running Azure AD Connect was upgraded in-place
  • The service account or its permissions changed, or the service account’s password expired or was changed (as these credentials are used to connect to the database)

I ruled out all of these causes for the Azure AD Connect installations I investigated.

What’s more, Azure AD Connect staging mode servers suffered the same fate. Restoring Azure AD Connect from a previous backup didn’t help either, as Azure AD Connect would stop working again at the next reboot. Microsoft’s suggested solution, uninstalling and then reinstalling Azure AD Connect, merely postponed the problem: a couple of months down the road, the LocalDB instance would refuse to start again . . .

Testing, testing . . . Is this thing on?

In demo environments, a couple of people started investigating Azure AD Connect. This led to the understanding that the cause of the non-starting LocalDB is corruption of the LocalDB instance’s model database. Didier van Hoye documented the finding in the most detail.

In all cases in which the issue was reproducible, the same two artifacts could be observed:

  1. In the error.log file, typically located at C:\Windows\ServiceProfiles\ADSync\AppData\Local\Microsoft\Microsoft SQL Server Local DB\Instances\ADSync2019, the following log lines can be read:

    Error: 9903, Severity: 20, State: 1. The log scan number (x) passed to log scan in database 'model' is not valid. This error may indicate data corruption or that the log file (.ldf) does not match the data file (.mdf). If this error occurred during replication, re-create the publication. Otherwise, restore from backup if the problem results in a failure during startup.

  2. An event is logged in the Application log with Event ID 528:

    Event 528 with source SQLLocalDB 15.0: Windows API call WaitForMultipleObjects returned error code: 575. Windows system error message is: {Application Error} The application was unable to start correctly (0x%lx). Click OK to close the application. Reported at line: 3714.
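
Both artifacts can be checked from an elevated Windows PowerShell session. What follows is a minimal sketch, not an official diagnostic. The function names Find-ModelLogScanError and Get-LocalDbStartupFailure are made up for this example, the provider name is taken from the event text above, and the path in the usage comments assumes the default NT SERVICE\ADSync profile:

```powershell
# Sketch only: both function names are hypothetical helpers.

# Returns the error.log lines that carry SQL Server error 9903.
function Find-ModelLogScanError {
    param(
        [Parameter(Mandatory)]
        [string]$LogPath
    )
    Select-String -Path $LogPath -Pattern 'Error: 9903' -SimpleMatch
}

# Returns recent Event ID 528 entries from the Application log (Windows only).
# The default provider name is an assumption based on the event shown above.
function Get-LocalDbStartupFailure {
    param(
        [string]$ProviderName = 'SQLLocalDB 15.0',
        [int]$MaxEvents = 5
    )
    Get-WinEvent -FilterHashtable @{
        LogName      = 'Application'
        ProviderName = $ProviderName
        Id           = 528
    } -MaxEvents $MaxEvents -ErrorAction SilentlyContinue
}

# Example usage (default service profile assumed):
# $instance = 'C:\Windows\ServiceProfiles\ADSync\AppData\Local\Microsoft\Microsoft SQL Server Local DB\Instances\ADSync2019'
# Find-ModelLogScanError -LogPath (Join-Path $instance 'error.log')
# Get-LocalDbStartupFailure | Format-List TimeCreated, Message
```

If the first function returns matching lines and the second returns events, the installation most likely shows the same symptoms as described here.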

Microsoft also investigated the issue. With 30 million organizations using Azure AD Connect, admins at the end of their rope had raised the issue with Microsoft, too.

The cause

The SQL team at Microsoft have identified the root cause of the issue. The issue is caused by a software error in the backup logic that creates an inconsistent state in the SQL Server model database start page.

After a backup occurs, the model database is set to FULL recovery mode (dbi_status == 0x40010000), and the dbi_dbbackupLSN (the log sequence number for the database backup) is set to a value that points to a log file.

The actual recovery mode that is governed by the master database is SIMPLE. In SIMPLE recovery mode, database logs are truncated automatically. In contrast, in FULL recovery mode, logs are truncated only after a backup.

When the LocalDB instance is restarted after the log file is truncated, it detects a backup log sequence number that's earlier than the earliest log file. Therefore, it won't start the service.
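
The condition described above can be pictured with a toy model. This is only an illustration under heavy assumptions: plain integers stand in for log sequence numbers, the function name Test-BackupLsnStillInLog is made up, and SQL Server's real startup checks are internal and more involved:

```powershell
# Toy model only: real LSNs are structured values, not plain integers.
function Test-BackupLsnStillInLog {
    param(
        [Parameter(Mandatory)][long]$BackupLsn,     # stands in for dbi_dbbackupLSN
        [Parameter(Mandatory)][long]$OldestLogLsn   # oldest LSN left after truncation
    )
    # In this model, startup is fine only while the recorded backup LSN
    # still points into the part of the log that exists.
    return ($BackupLsn -ge $OldestLogLsn)
}

# Right after the backup: the recorded LSN is still inside the log.
Test-BackupLsnStillInLog -BackupLsn 100 -OldestLogLsn 90

# After truncation has moved the start of the log past the recorded LSN.
Test-BackupLsnStillInLog -BackupLsn 100 -OldestLogLsn 150
```

The first call returns True and the second returns False, which mirrors the situation after a reboot: the boot information still points at a log position that SIMPLE-mode truncation has already discarded.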

The solution

If you experience this issue, you can get your Azure AD Connect installation working again with these steps, performed in an elevated Windows PowerShell session:

  1. Stop the Microsoft Azure AD Sync service:

    Set-Service ADSync -StartupType Disabled
    Stop-Service ADSync -Force

  2. Copy over the known-good model database template:

    Copy-Item "C:\Program Files\Microsoft SQL Server\150\LocalDB\Binn\Templates\model.mdf" "C:\Windows\ServiceProfiles\ADSync\AppData\Local\Microsoft\Microsoft SQL Server Local DB\Instances\ADSync2019"

    Copy-Item "C:\Program Files\Microsoft SQL Server\150\LocalDB\Binn\Templates\modellog.ldf" "C:\Windows\ServiceProfiles\ADSync\AppData\Local\Microsoft\Microsoft SQL Server Local DB\Instances\ADSync2019"


  3. Start the Microsoft Azure AD Sync service:

    Set-Service ADSync -StartupType Automatic
    Start-Service ADSync
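
As an optional precaution that is not part of the steps above, you may want to keep a copy of the existing model database files before step 2 overwrites them. Here is a minimal sketch; the function name Backup-LocalDbModelFile and the backup folder in the usage comment are hypothetical, and it should only be run after the service has been stopped in step 1:

```powershell
# Hypothetical helper: copies the current model database files to a backup
# folder and outputs the path of every copy it made.
function Backup-LocalDbModelFile {
    param(
        [Parameter(Mandatory)][string]$InstancePath,
        [Parameter(Mandatory)][string]$BackupPath
    )
    # Create the backup folder if it does not exist yet.
    New-Item -ItemType Directory -Path $BackupPath -Force | Out-Null

    foreach ($name in 'model.mdf', 'modellog.ldf') {
        $source = Join-Path $InstancePath $name
        if (Test-Path -LiteralPath $source) {
            Copy-Item -LiteralPath $source -Destination $BackupPath
            Join-Path $BackupPath $name
        }
    }
}

# Example usage (default service profile assumed, backup folder is made up):
# $instance = 'C:\Windows\ServiceProfiles\ADSync\AppData\Local\Microsoft\Microsoft SQL Server Local DB\Instances\ADSync2019'
# Backup-LocalDbModelFile -InstancePath $instance -BackupPath 'C:\Temp\ADSyncModelBackup'
```

Files that are missing are simply skipped, so the output tells you which of the two files were actually saved.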

The location of Azure AD Connect’s service profile ("C:\Windows\ServiceProfiles\ADSync\AppData\Local\Microsoft\Microsoft SQL Server Local DB\Instances\ADSync2019") could be different in your situation. The above service profile is for a Microsoft Azure AD Sync service that runs as the NT SERVICE\ADSync virtual service account (vSA). This is the default account to run the service. If you run the service as another account or as a group Managed Service Account, change the account name in the service profile location above.
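
To find out which account your installation uses and to build the matching folder path, something along these lines could help. It is a sketch under assumptions: both function names are made up, and you still need to supply the profile folder that belongs to your service account yourself:

```powershell
# Returns the logon account of the ADSync service (Windows only).
function Get-AdSyncServiceAccount {
    (Get-CimInstance -ClassName Win32_Service -Filter "Name='ADSync'").StartName
}

# Builds the LocalDB instance folder below a given profile folder.
# Uses plain string formatting so no drive lookup takes place.
function Get-AdSyncLocalDbPath {
    param(
        [Parameter(Mandatory)][string]$ProfilePath,
        [string]$InstanceName = 'ADSync2019'
    )
    $relative = 'AppData\Local\Microsoft\Microsoft SQL Server Local DB\Instances'
    '{0}\{1}\{2}' -f $ProfilePath.TrimEnd('\'), $relative, $InstanceName
}

# Example usage:
# Get-AdSyncServiceAccount
# Get-AdSyncLocalDbPath -ProfilePath 'C:\Windows\ServiceProfiles\ADSync'
```

For the default virtual service account, the second call returns the same path used in step 2.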

To prevent the issue from recurring, upgrade Azure AD Connect to version 2.1.1.0 or later, as the Azure AD Connect team has added logic to this version to prevent the issue from occurring.
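
To check whether an installation is already on a version with this logic, a version comparison like the one below could be used. It is a sketch: the function name Test-AadConnectHasModelFix is made up, and reading the product version from miiserver.exe in the default installation folder is an assumption that may not hold for every installation:

```powershell
# Hypothetical helper: compares an installed version with 2.1.1.0,
# the version named above. [version] compares numerically per part,
# so 2.1.20.0 is correctly treated as newer than 2.1.3.0.
function Test-AadConnectHasModelFix {
    param(
        [Parameter(Mandatory)][version]$InstalledVersion
    )
    return ($InstalledVersion -ge [version]'2.1.1.0')
}

# Example usage (assumed default install path):
# $exe = 'C:\Program Files\Microsoft Azure AD Sync\Bin\miiserver.exe'
# Test-AadConnectHasModelFix -InstalledVersion (Get-Item $exe).VersionInfo.ProductVersion
```

Casting to [version] rather than comparing strings matters here, because a plain string comparison would sort multi-digit build numbers incorrectly.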
