Back to Blog

Removing Duplicate Items from a Mailbox

Image of Michel de Rooij
Michel de Rooij
application screenshot

For those involved with Exchange migration projects or managing Exchange environments, at some point you probably have experienced a situation where individuals ended up with duplicate items in their mailbox. Duplicate items can be caused by many things, but most common are:

  •  
  • Synchronization tools or plug-in. Entries from the mailbox are treated as new entries and as a consequence are added to the mailbox when synchronizing information back to the mailbox, creating duplicates. In the past, I’ve seen this happening with Nokia PC Suite and Google Apps Sync for example;
  • Importing existing data. Accidental import from – for example – a PST file to a mailbox  can lead to duplicate entries.

Contacts window

When looking for a solution, you’ll probably encounter MSKB299349, “How to remove duplicate imported items in Outlook”. This article describes a manual procedure to remove duplicates entries from your calendar, contacts, inbox or other folders. Not very helpful and labor intensive.

When continuing your search, you’ll find lots (I mean lots!) of tools and Outlook add-ins, like Vaita’s DIR or MAPILab’s Duplicate Remover. Not all of this software is free (some even require payment per duplicate removal of appointments, contacts or e-mail) and some might not even work (MAPI-based tools may not work against Exchange 2013).

When you finally have selected a tool, in most cases they require installation of a piece of software and someone to perform the removal process using the tool or Outlook with add-in. When you’re an Apple shop you’ll require different tools, unless you’re running a Windows desktop somewhere (I’ll just pretend I didn’t hear you saying ‘Why don’t you install the tool on the Exchange server’).

Wouldn’t it be nice if you’d have a PowerShell script you can conveniently run from any workstation (or server) with PowerShell installed, removing those duplicate items from a user’s mailbox remotely? If the answer is yes, the Remove-DuplicateItems.ps1 script may be something for you.

Requirements
Using the Remove-DuplicateItems.p1 script requires Exchange 2007 or later and Exchange Web Services (EWS) Managed API 1.2 (or later). Alternatively, when you already got the EWS Managed API Microsoft.Exchange.WebServices.DLL somewhere, you can copy it to the same folder where the script is located and it will be picked up. The script has been developed and tested against Exchange 2007, meaning it’s a PowerShell 1.0 script which should be compatible with later versions of PowerShell or Exchange.

Also take notice that since you’ll be processing user mailboxes, you’ll need to have full mailbox access or impersonation permissions; the latter is preferred. For details on how to configure impersonation for Exchange 2010 using RBAC.

Usage
The script Remove-DuplicateItems.ps1 uses the following syntax:

Remove-DuplicateItems.ps1 [-Mailbox] <String> [[-Type] <String>] [-Server <String>] [-Impersonation] [-DeleteMode <String>] [-Mode <String>][-WhatIf] [-Confirm] [<CommonParameters>]

A quick walk-through on the parameters and switches:

  •  
  • Mailbox is the name of the mailbox to process;
  • Type determines what folders are checked for duplicates. Valid options are Mail, Calendar, Contacts, Tasks, Notes or All (Default);
  • Server is the name of the Client Access Server to access for Exchange Web Services. When omitted, the script will attempt to use Autodiscover;
  • When the Impersonation switch is specified, impersonation will be used for mailbox access, otherwise the current user context will be used;
  • DeleteMode specifies how to remove messages. Possible values are HardDelete (permanently deleted), SoftDelete (use dumpster, default) or MoveToDeletedItems (move to Deleted Items folder).
  • Mode determines how items are matched. Options are Quick, which uses PidTagSearchKey and is the default mode, or Full which uses a predefined set of attributes to match items, depending on the item class:
ItemClass window
  •  
  •  

Few notes:

  • When MoveToDeletedItems is specified, the Deleted Items folder will be skipped;
  • When Type is omitted or set to All, all folders are scanned, including folders like Conversation History, RSS Feeds, etc.;
  • When Quick mode is used and PidTagSearchKey is missing or inaccessible, search will fall back to Full mode;
  • For more info on PidTagSearchKey

So, suppose you want to remove duplicate Appointments from the calendar of mailbox migtester1 using attribute matching, moving duplicate items to the DeletedItems, using Impersonation and you want to generate extra output using Verbose. In such case, you could use the following cmdlet:

Remove-DuplicateItems.ps1 -Mailbox <MailboxID> -Type Calendar -Impersonation -DeleteMode MoveToDeletedItems -Mode Full -Verbose

cmdlet window

In case you want to process multiple mailboxes, you can use a CSV file which needs to contain the Mailbox field. An example of how the CSV could look:

Mailbox
francis
philip

The cmdlet could then be something like:

Import-CSV users.csv1 | Remove-DuplicateItems.ps1 -DeleteMode HardDelete -Impersonation

keyboard keys

Are You Automating Enough?

Image of Michael Van Horenbeeck MVP, MCSM
Michael Van Horenbeeck MVP, MCSM

Earlier this week, Tony Redmond wrote about Jeffrey Snover – also known as the godfather of...

Read more
computer code

Preservation Policies in Office 365

Image of Vasil Michev MVP
Vasil Michev MVP

Preservation policies were introduced almost a year ago as part of the Compliance Center in Office...

Read more