It’s been 10 months since the last instalment of my tales of a Microsoft 365 migration series! In that time I’ve had a new son, moved house (again and again), and migrated a boatload of data (pun intended). In this post I’ll detail how we migrated over 20TB of data into SharePoint Online from on-premise file shares.
If you want to take a look back at my other posts documenting my organisation’s migration journey, you can find them below:
- Tales of a Microsoft 365 migration – path to a trial migration
- Tales of a Microsoft 365 migration – OneDrive
I’ve used this before, but the graphic below highlights the phases of our migration efforts, what we have completed thus far (green) and where we currently are (yellow):
- Exchange email migration
- Home drive migration
- File share migration
Our aim was to migrate all of the file share data, server by server, into SharePoint Online, then make the on-premise file servers read-only and finally decommission them. These new SharePoint sites were to act as a temporary bridging location, allowing the organisation to continue working as normal while, over a longer period of time, reviewing each site’s contents and moving the data into existing organisational Teams.
Work had been done previously to reduce the volume of content and also identify locations that would require more granular permissions changes.
As with the home drive migration project, we needed to ascertain:
- How many file servers + individual shares are there?
- What is the total volume of data on each server, and across all servers combined?
- How do users gain access to the data once it’s migrated to SharePoint Online?
Early on we realised that Microsoft Migration Manager would not be suitable for this project. Pre-scans in Migration Manager reported a significantly higher number of errors, and running the exact same pre-scan in ShareGate reduced those errors by up to 90%. This led us to purchase additional ShareGate licenses to ramp up our migration efforts, which honestly was the best thing we ever did!
The pre-scan migrations using ShareGate gave us a great baseline for understanding our data, but we also purchased Treesize Pro, which we used to generate reports against each file server. These reports made it much easier to understand what we were migrating, and they are interactive (you can drill down through folders, which comes in handy – more on this later).
With all the above tooling at our disposal, we went to work using Treesize to generate reports for each file server, which we shared with data owners pre-migration. We also kept an active scan of each location open within the Treesize application to refer back to, plus an export of each migration task from ShareGate for the first pass and final incremental migration.
Using the above reports we were able to break down each server into tasks that would stay within ShareGate’s software limitations:
- We found that migrating over 1TB or 1 million items in one task would generally cause issues. The tasks would either not complete or once completed we would be unable to open or export the task reports.
- We decided to stick to creating tasks that contained no more than 500GB of data, spread across multiple VMs to also avoid throttling. We highlighted each task in a block of colour in each Treesize spreadsheet to help identify it.
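The task planning described above can be sketched in a few lines of Python. This is only an illustration of the idea, not what ShareGate or Treesize do internally: the `Folder` and `Task` classes and the `plan_tasks` function are hypothetical names, the 500GB cap comes from the list above, and applying the 1 million item figure as a per-task cap is my assumption. It uses a simple "largest first, first task with room" approach.

```python
from dataclasses import dataclass, field

MAX_TASK_BYTES = 500 * 1024**3  # 500GB cap per task, as described above
MAX_TASK_ITEMS = 1_000_000      # assumed per-task item cap (illustrative)


@dataclass
class Folder:
    path: str
    size_bytes: int
    item_count: int


@dataclass
class Task:
    folders: list = field(default_factory=list)
    size_bytes: int = 0
    item_count: int = 0

    def fits(self, folder):
        # True if adding this folder keeps the task under both caps
        return (self.size_bytes + folder.size_bytes <= MAX_TASK_BYTES
                and self.item_count + folder.item_count <= MAX_TASK_ITEMS)

    def add(self, folder):
        self.folders.append(folder)
        self.size_bytes += folder.size_bytes
        self.item_count += folder.item_count


def plan_tasks(folders):
    """Group folders into tasks, largest first, each into the first task with room."""
    tasks, oversized = [], []
    for folder in sorted(folders, key=lambda f: f.size_bytes, reverse=True):
        if folder.size_bytes > MAX_TASK_BYTES or folder.item_count > MAX_TASK_ITEMS:
            oversized.append(folder)  # too big for one task: split it further first
            continue
        for task in tasks:
            if task.fits(folder):
                task.add(folder)
                break
        else:
            new_task = Task()
            new_task.add(folder)
            tasks.append(new_task)
    return tasks, oversized
```

Feeding it the per-folder sizes and item counts from a report export gives a list of tasks to spread across VMs, plus a list of folders that are too large for a single task and need breaking down further.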
Based on the volume of data and the number of files/folders we had to migrate, we ran a two-stage migration for all our file share data:
- First pass – this was the big migration. We ran it silently in the background so that the longer migration tasks had time to run and shift all the data across.
- Incremental – this was the migration we made public to our users. We published migration schedules for the entire organisation to see and scheduled tasks in ShareGate to run over the weekends to ensure no loss in service following the migration.
In SharePoint Online, at the site level we opened up access to everyone except external users. We then created document libraries within each SharePoint site and mapped each existing on-premise domain group to the corresponding file share library. Due to the sheer size of the folder structures within our file server environment, we decided to only publish the 2nd and 3rd level of each file share structure as part of the Treesize pre-migration scans. Data owners for each area of the business reviewed the reports we produced using Treesize and identified folders that would require further permission changes to be made.
In some cases we were required to break down ShareGate migration tasks into several parts. We had to do this where the file share contained over 1TB of data or more than 1 million files/folders, as mentioned above. To do this we drilled down further into the file share structure using Treesize until each part was under 1TB and 1 million files/folders.
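The drill-down we did by hand in Treesize follows a simple recursive pattern, sketched below. This is a hedged illustration rather than a tool we actually used: the `Node` class and `split_points` function are hypothetical, and the sizes and counts are assumed to be totals that include all subfolders, as a folder report would show them.

```python
from dataclasses import dataclass, field

LIMIT_BYTES = 1024**4    # 1TB
LIMIT_ITEMS = 1_000_000  # 1 million files/folders


@dataclass
class Node:
    path: str
    size_bytes: int  # total, including subfolders
    item_count: int  # total, including subfolders
    children: list = field(default_factory=list)


def split_points(node):
    """Return (path, note) pairs for the shallowest folders under both limits."""
    if node.size_bytes < LIMIT_BYTES and node.item_count < LIMIT_ITEMS:
        return [(node.path, "whole folder")]
    if not node.children:
        # Nothing left to drill into: needs splitting by files manually
        return [(node.path, "over limit, no subfolders")]
    points = []
    loose_bytes = node.size_bytes - sum(c.size_bytes for c in node.children)
    if loose_bytes > 0:
        # Files sitting directly in this folder need their own task
        points.append((node.path, "loose files only"))
    for child in node.children:
        points.extend(split_points(child))
    return points
```

Each returned path is a candidate starting point for a separate migration task; anything flagged as over limit with no subfolders still needs a manual decision.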
Issues & workarounds + things of note
#1 Migration Manager vs ShareGate
As mentioned above, when we ran pre-scans of our file share locations using Migration Manager the tool reported thousands of errors that largely related to invalid characters, blocked file types or filenames or paths being too long – meaning we were unable to use the tool for our migration efforts.
Workaround: ShareGate works best for your file share migrations
After running an identical pre-scan using ShareGate, our error reports reduced by as much as 90% in some cases, mainly due to ShareGate’s built-in capability of replacing invalid characters. It’s also worth noting that the ShareGate UI is much more feature-rich and allows far more control over migration configuration.
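To give a feel for the kind of errors involved, here is a minimal Python sketch of a pre-scan check for invalid characters and over-long paths. It is not how either tool works internally. The character set and the 400-character path limit are the SharePoint Online restrictions as I understand them, while the function names and the underscore replacement are purely illustrative assumptions.

```python
import re

# Characters not allowed in SharePoint Online file/folder names: " * : < > ? / \ |
INVALID_CHARS = re.compile(r'["*:<>?/\\|]')
MAX_PATH_LENGTH = 400  # assumed limit for the full path, in characters


def check_relative_path(rel_path, site_prefix_length=0):
    """Return a list of issues for a path that uses '/' as its separator."""
    issues = []
    for segment in rel_path.split("/"):
        if INVALID_CHARS.search(segment):
            issues.append(f"invalid character in '{segment}'")
    if site_prefix_length + len(rel_path) > MAX_PATH_LENGTH:
        issues.append("path too long")
    return issues


def sanitise_segment(segment, replacement="_"):
    """Replace invalid characters (the underscore is an illustrative choice)."""
    return INVALID_CHARS.sub(replacement, segment)
```

Running a check like this over a file listing before the first pass shows roughly how many items would need renaming or shortening if the migration tool did not handle it for you.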
#2 50,000 unique item permissions limit
When migrating file share data using ShareGate, we saw this error message when migrating files/folders with more than 50K unique permissions:
The custom permission associated with the item could not be copied since it would exceed the list limit. The maximum number of permission scope for this list has been reached.
Source: https://support-desktop.sharegate.com/hc/en-us/articles/360001453523-The-custom-permission-associated-with-the-item-could-not-be-copied-since-it-would-exceed-the-list-limit
SharePoint has a limit of 50k permission scopes within a list or library. A scope is either an item, document or folder with broken inheritance. The documentation provided by Microsoft refers to SharePoint Server, but this limitation also applies to SharePoint Online.
Once the 50K unique permission limit is reached, all items scheduled for migration after that point will fail.
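If you can export a list of which files and folders have their own (non-inherited) permissions, a quick count per planned library shows where this limit would bite before you start. The sketch below assumes a hypothetical record format of (library, path, inherits) tuples; neither function is part of any real tool.

```python
from collections import Counter

SCOPE_LIMIT = 50_000  # unique permission scopes per list or library


def scopes_per_library(records):
    """Count items with broken inheritance for each target library.

    records: iterable of (library, path, inherits) tuples, where inherits
    is False when the item has its own unique permissions.
    """
    counts = Counter()
    for library, _path, inherits in records:
        if not inherits:
            counts[library] += 1
    return counts


def libraries_over_limit(records, limit=SCOPE_LIMIT):
    """Return the libraries whose unique permission count exceeds the limit."""
    counts = scopes_per_library(records)
    return sorted(lib for lib, n in counts.items() if n > limit)
```

Any library this flags is a candidate for either migrating without permissions or splitting across more than one library.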
Workaround: Don’t migrate permissions as part of the migration
Let’s face it, if you have file shares with more than 50K unique item permissions you likely have bigger problems than just trying to migrate the data! In my case we made the decision to not include permissions as part of each migration task. This was a big decision that required senior management backing, but ultimately was the right one as it allowed us to migrate the data.
As we had already created reports for the top-level file share structure, we mapped the top-level permissions from the file shares onto each library in each SharePoint site. Having published the same spreadsheet, we gave our data owners the chance to highlight any other locations that would require additional unique permissions, and included those as part of the migration effort.
#3 Breaking permissions inheritance in document libraries
I’ve already written a separate article on how to break permissions inheritance on large libraries/lists in SharePoint. However, I’ll summarise the key areas to watch out for and how to work around them:
- 5,000 list view threshold for libraries in SharePoint Online
- 100,000 item limit for breaking permissions inheritance for document libraries in SharePoint Online
Workaround: Delete items to get library under 100,000 items > break permissions inheritance > then restore the deleted items
Also if you read this before starting, just stop inheriting permissions for each library before doing anything else!
#4 Using on-premise Active Directory domain groups to manage document libraries
We decided to use the existing on-premise domain groups to manage our document libraries in SharePoint Online, as we felt it would be easier to troubleshoot issues if we could trace back where users originally had permissions using the same domain group. However, we found that you cannot add or manage the membership of a synced on-premise domain group in Microsoft 365, as explained in this article from Microsoft:
If the group is synced from on premises Windows AD they cannot be managed in Azure AD. They must be managed on-prem with tools like the Active Directory Users and Computers.
Source: https://docs.microsoft.com/en-us/microsoft-365/community/all-about-groups
Workaround: Create your own Azure Active Directory security groups
As we wanted to be able to manage everything in Microsoft 365, having to manage the membership of our SharePoint Online libraries using on-premise Active Directory just wasn’t going to work. So we decided to create our own security groups in Azure Active Directory, following a similar naming convention, and use these groups to manage the document library permissions instead.
#5 Syncing and accidental deletion
Once we started to go live with our new SharePoint file server replacement sites, it didn’t take long before users started to figure out they had the ability to sync or create shortcuts to the libraries/folders. The first issue we faced was that when users started syncing these large file shares it would kill their OneDrive. It would take a very long time to complete the sync, and in many scenarios the synced location would contain blocked file types that would cause the user’s OneDrive to error and stop syncing.
We also observed that the nature of syncing confused some of our users. They thought syncing had created a separate local copy on their machines and began deleting files, not realising it was an active connection to the SharePoint document library!
Workaround: Switch off syncing on large document libraries
Greg Zelfond over at SharePoint Maven came to the rescue for me here! He has a great article on how you can disable sync in SharePoint and OneDrive that allowed us to stop our users syncing large libraries and accidentally deleting items.
#6 Moving files creates duplicates in the recycle bin
Something I did not know beforehand was that when you move files in SharePoint Online it actually creates a copy in the destination location and then deletes the file from the source once completed. Here’s what Microsoft say about it:
When a file is moving, it continues to appear in the source directory until it’s fully moved to the destination and then it will be deleted. The file remains in the source site’s Recycle Bin after the Move is complete and is subject to the normal recycle schedule unless a user recovers it from the Recycle Bin.
Source: https://support.microsoft.com/en-us/office/move-or-copy-files-in-sharepoint-00e2f483-4df3-46be-a861-1f5f0c1a87bc
Workaround: Consider the destination of files/folders before moving
In hindsight, had we realised this beforehand, we would have either planned ahead and bought more storage space to allow for duplicates sitting in our environment for 93 days at a time, or worked with the data owners to better identify files/folders to be migrated straight into their final organisational Team home, rather than sit in a file server replacement SharePoint site.
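For anyone wanting to plan for this, the extra storage needed can be roughly estimated from how much data you expect to move each day. The Python sketch below is only back-of-the-envelope arithmetic under my own assumptions: that moved files stay in the source recycle bin for the full 93 days and count against storage for that whole time. The function name and input format are made up for illustration.

```python
RETENTION_DAYS = 93  # recycle bin retention period mentioned above


def peak_recycle_bin_gb(moves_gb_by_day, retention_days=RETENTION_DAYS):
    """Return the largest amount of moved data (GB) held in recycle bins at once.

    moves_gb_by_day[i] is the volume of data moved on day i.
    """
    peak = 0
    held = 0
    for day, moved in enumerate(moves_gb_by_day):
        held += moved
        if day >= retention_days:
            # Data moved retention_days ago has now been purged
            held -= moves_gb_by_day[day - retention_days]
        peak = max(peak, held)
    return peak
```

For example, moving a steady 10GB a day would leave around 930GB of duplicates held at any one time once the first 93 days have passed.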