TL;DR summary
- When performing a mailbox backup from Office 365, progress may slow down unexpectedly, especially when processing the “Calendar” folder.
- There are two main causes of a slowdown:
  - Throttling at the Exchange Server end.
  - Dirty data that causes server-side errors within Exchange Server. This requires special handling at the BackupAssist 365 end.
- BackupAssist 365 is engineered for performance and resilience, and deals with this with high efficiency – despite the appearance of a stall.
- It’s generally nothing to worry about. Backups will complete over a number of runs.
Backing up Office 365 mailboxes to local PST files – how we engineered for performance with large batches
On the whole, BackupAssist 365 has been optimized for performance when downloading mailboxes from Office 365 or Exchange to PST files.
A key part of this is minimizing the number of requests to the server. Each request incurs latency, and therefore is expensive. By default, BackupAssist 365 will download mailbox items from Office 365 or Exchange in large batches, such as 400 items at a time.
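As a rough illustration of why batch size matters, the sketch below counts the server round trips needed to download a folder at different batch sizes. The function name `requests_needed` and the 10,000-item folder are assumptions made for this example, not BackupAssist 365 internals.

```python
import math


def requests_needed(item_count: int, batch_size: int) -> int:
    """Number of server requests needed to fetch item_count items in batches."""
    if item_count < 0 or batch_size <= 0:
        raise ValueError("item_count must be >= 0 and batch_size must be > 0")
    return math.ceil(item_count / batch_size)


# Compare one-at-a-time downloads with larger batches (hypothetical folder size).
for batch_size in (1, 50, 400):
    print(f"batch size {batch_size}: {requests_needed(10_000, batch_size)} requests")
```

Since every request pays the same latency cost, fewer and larger requests mean less time spent waiting on the server.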
There are occasional circumstances where it might appear that BackupAssist 365 gets “stuck” or slows down – especially with Calendar items. In the progress log, you might see a message like:
“Downloading items 1 to 400 from Calendar” – and this message may stay there for minutes.
In the backup progress log and report, you may also see messages like:
The mailbox is throttled on folder ‘Calendar’. Skipping this folder – the backup will continue on the next run.
Note: this problem can occur whether you use a single login to back up multiple mailboxes, or not.
Cause 1: Throttling at Microsoft Exchange Server
Throttling is something that users may encounter when the Exchange Server cannot handle the load of requests from all the users it serves. This means that it’s not predictable, and does not happen for every person.
Most human users will never notice this, because if Outlook fails to sync due to throttling, it’s generally invisible to the user. However, backup software for Office 365 like BackupAssist 365 will report this occurrence, because if throttling happens, backups will take multiple runs to complete.
We’ve found that Calendar items are particularly problematic. Our findings, in summary:
- BackupAssist 365 can generally download 400 to 1000 calendar items per user per run before throttling happens.
- Downloading Calendar items is slower than regular emails.
- If BackupAssist 365 detects that Exchange is throttling and not delivering any additional calendar items, it will skip the folder and continue the rest of the backup.
- Mailboxes with many calendar items will take several runs to complete. For example, at 400 to 1000 items per run, a mailbox with 10,000 calendar items can be expected to take between 10 and 25 runs to complete.
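The run estimate in the last point is simple division, and can be sketched as follows. The function `estimate_runs` is a hypothetical helper for this article; the 400 and 1000 defaults are the per-run figures quoted above, and real throttling limits will vary.

```python
import math


def estimate_runs(calendar_items: int,
                  min_per_run: int = 400,
                  max_per_run: int = 1000) -> tuple[int, int]:
    """Return (best_case, worst_case) number of backup runs for a Calendar folder."""
    best_case = math.ceil(calendar_items / max_per_run)   # most items per run
    worst_case = math.ceil(calendar_items / min_per_run)  # fewest items per run
    return best_case, worst_case


print(estimate_runs(10_000))  # (10, 25)
```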
In our experiments, we’ve found that other applications like Outlook also encounter throttling, especially on calendar items. Outlook also downloads these items gradually.
The take-home message from this is – don’t worry. BackupAssist 365 will download as much data as it can on each backup run. Some mailboxes will just take multiple backup runs to complete.
Cause 2: Dirty data in the mailbox is not possible to back up
If throttling is not the cause, then almost always, the cause of the slowdown is dirty data. Here’s why.
When we issue the commands to back up a mailbox from Office 365 or Exchange, along the lines of “fetch items #1 to #400 from the Calendar folder”, the server will retrieve those items and return them in one operation.
However, occasionally we get an error message instead. If just one of those items is “dirty” – that means, if it does not conform to the expected data format – then the Exchange server will throw an error, instead of returning any data.
(We’ll explain how dirty data gets into a mailbox a bit later.)
Why it can get slow (the price of resilient backups for Office 365 and Exchange)
In the example above, instead of ignoring the error returned from the server and skipping all 400 items, the correct thing to do is to download every item that isn’t “dirty”.
But how do we know which one of the 400 items is the dirty one? Unfortunately, we don’t. We have to search for it… and the most efficient way to search for the dirty data is to do a kind of binary search: to break the request for 400 items into two requests for 200 items. One of those requests will fail. Then we break the failed request into two requests for 100 items. And then down to two requests for 50 items. And so on… until we have downloaded everything we can.
This means that the number of requests to the server will explode.
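The halving strategy described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the BackupAssist 365 implementation: `BatchError`, `fetch_resilient`, and the toy server `fake_fetch` are all hypothetical names, and the toy server simply rejects any batch that contains a dirty item.

```python
from typing import Callable, List, Sequence


class BatchError(Exception):
    """Hypothetical stand-in for the server rejecting a whole batch."""


def fetch_resilient(ids: Sequence[int],
                    fetch_batch: Callable[[Sequence[int]], List[str]]) -> List[str]:
    """Fetch every item that can be fetched, halving any batch that fails."""
    if not ids:
        return []
    try:
        return fetch_batch(ids)
    except BatchError:
        if len(ids) == 1:
            return []  # a single dirty item has been isolated; skip it
        mid = (len(ids) + 1) // 2  # the first half takes the extra item
        return (fetch_resilient(ids[:mid], fetch_batch)
                + fetch_resilient(ids[mid:], fetch_batch))


# Toy server (assumption): item #2 is dirty and poisons any batch containing it.
DIRTY = {2}


def fake_fetch(ids: Sequence[int]) -> List[str]:
    if any(i in DIRTY for i in ids):
        raise BatchError("batch contains a dirty item")
    return [f"item-{i}" for i in ids]


items = fetch_resilient(range(1, 401), fake_fetch)
print(len(items))  # 399
```

The sketch recurses into the first half before trying the second, so the order of requests can differ from a level-by-level search, but the set of items recovered is the same.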
Here’s an example
In the example above, let’s say that item #2 is dirty but every other item is normal. You’d then see behaviour something like this:
1. Fetch 400 items (#1 - #400) – failed
2. Fetch 200 items (#1 - #200) – failed
3. Fetch 200 items (#201 - #400) – success
4. Fetch 100 items (#1 - #100) – failed
5. Fetch 100 items (#101 - #200) – success
6. Fetch 50 items (#1 - #50) – failed
7. Fetch 50 items (#51 - #100) – success
8. Fetch 25 items (#1 - #25) – failed
9. Fetch 25 items (#26 - #50) – success
10. Fetch 13 items (#1 - #13) – failed
11. Fetch 12 items (#14 - #25) – success
12. Fetch 7 items (#1 - #7) – failed
13. Fetch 6 items (#8 - #13) – success
14. Fetch 4 items (#1 - #4) – failed
15. Fetch 3 items (#5 - #7) – success
16. Fetch item #1 – success
17. Fetch item #2 – failed
18. Fetch item #3 – success
19. Fetch item #4 – success
So in the example above, one dirty item means 19 requests to Exchange instead of 1.
In summary:
- It takes 1 request to get 400 items when all items are clean, and
- It takes 19 requests to get 399 items when one item is dirty.
The more dirty items there are, the more server requests are needed. Because we don’t know how many dirty items there are in a batch, doing the binary search is going to be the most efficient way to ensure we back up all the valid items.
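To get a feel for how the request count grows, the sketch below counts the requests a halving search would make for a given set of dirty item numbers. It is an illustration only: `count_requests` is a hypothetical helper, it halves all the way down to single items, and it assumes a batch fails exactly when it contains at least one dirty item.

```python
from typing import AbstractSet


def count_requests(start: int, end: int, dirty: AbstractSet[int]) -> int:
    """Requests needed to recover items start..end (inclusive) by halving on failure."""
    requests = 1  # the request for this range itself
    failed = any(start <= d <= end for d in dirty)
    if failed and end > start:
        size = end - start + 1
        mid = start + (size + 1) // 2 - 1  # last item of the first half
        requests += count_requests(start, mid, dirty)
        requests += count_requests(mid + 1, end, dirty)
    return requests


print(count_requests(1, 400, set()))  # 1: a clean batch needs a single request
print(count_requests(1, 400, {2}))    # 19: one dirty item, as in the example above

# Try other combinations (no expected values are claimed for these):
for dirty in ({2, 399}, {2, 3}, set(range(1, 401, 40))):
    print(len(dirty), "dirty items ->", count_requests(1, 400, dirty), "requests")
```

Dirty items that sit close together share most of their failed requests, so the cost depends on where the dirty items are as well as how many there are.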
What causes the dirty data in Office 365 or Exchange mailboxes?
We are not exactly sure what the causes are. We have noticed that it is more prevalent with data that has been migrated one or more times. For example, it happened on our own on-premises Exchange Server when data had been migrated from Exchange 2003 to Exchange 2007 to Exchange 2010, and also included data added from PST files using Outlook.
Any fault with the EDB files in Exchange, faulty copying from PST files, or any bug in Outlook clients could have caused the problem. Another possibility is 3rd party applications and mail clients that add items to a mailbox. If these clients introduced data that did not meet the required specifications, that 3rd party app might appear to work but cause problems with backups.
Conclusion
When designing BackupAssist 365, we had to engineer it for reliability and dependability. As explained above, that means sometimes performance appears to slow down when there is dirty data present. The server returns an error when trying to retrieve the dirty data.
Despite the appearance of a stall, BackupAssist 365 attempts to handle the presence of dirty data in the most efficient way possible. As a result, the PST files produced by the Office 365 mailbox backup will contain all the valid items.
If you experience a slowdown or stall lasting more than an hour, please contact our technical support department and we’ll investigate it further.
Image credit: Photo by Daria Nepriakhina on Unsplash
Article last updated: 31 May 2021