Estimating Drupal 8 Migration Scope
Joshua Turton | Senior Developer
January 18, 2018
In my last post, we discussed why marketers might want to migrate their content to Drupal 8, and the strategy and planning required to get started. The spreadsheet we shared with you in that post is the foundation of a good migration, and it usually takes a couple of sprints of research, discussion, and documentation to compile. It’s also a process that applies to any content migration, no matter the source or destination framework.
In this post, we will talk about what’s required from your internal teams to actually pull off a content migration to Drupal 8. In later posts, we’ll cover the actual technical details of making the migration happen.
Migration: A definition
It’s probably worth taking some time here to clarify what, exactly, we’re talking about when we say ‘migration’. In this context, a migration is a (usually automated) transfer of existing content from an old website to a new one. This also usually implies a systems upgrade, from an outdated version of your content management system to a current version. In these exercises, we’re assuming that you’re moving from Drupal 6 or 7 to Drupal 8.
What kind of team is required?
There are several phases of migration, each of which requires a different skill set. The first phase, analysis and documentation, is outlined in detail in my last post. It is a joint effort, generally requiring input from a project manager and/or analyst, a marketing manager, and a developer.
The project manager and analyst should be well versed in information architecture and content strategy (there is some great information on this topic at usability.gov). Further, it is really helpful if they have an understanding of the capabilities of the source and target systems, as this often informs what content is transferable, and how.
It’s also helpful if your team has a handle on the site’s traffic and usage. This usually falls to a senior content editor or marketing manager. Also important is that they have the ability to decide what content is worth migrating, and in what form.
In the documentation phase of migration, the developer often has limited input, as this is the least technical phase of the whole process. However, they should definitely have some level of oversight on the decisions being made, just to ensure technical feasibility. That requires a thorough understanding of the capabilities of the source and target systems.
One of the parties should also have the ability to make and export content types and fields. You can see Mike Potter’s excellent Guide to Configuration Management for more information on that.
Once development on the migration begins, it mostly becomes a developer task. Migrations are a really great mentoring opportunity (we’re really big on this at Phase2).
Finally, someone on the team also needs the ability to set up the source and target databases and files for use in all the environments (development, testing, production).
Estimation
“How long will all this take?” We hear this a lot. And, of course, there’s no one set answer. Migration is a complicated task with a lot of testing and a lot of patience required. It’s pretty difficult to pin down, but here are some (really, really rough) guidelines for you to start from. Many of the tasks below may sound unfamiliar; they will be covered in detail in later posts.
Node/User/Taxonomy migrations | 1-5 content types | 6-10 content types | 11+ content types |
Initial analysis (“the spreadsheet”) | 16-24 hours | 32-40 hours | 48-56 hours |
Content type creation & export | 16-40 hours | 40-80 hours | 8 hours/type |
Configuration Grouping | 16-24 hours | 24-40 hours | 24-40 hours |
Content migrations | 16-40 hours | 32-56 hours | 8 hours/type |
Testing | 24-32 hours | 40-56 hours | 8 hours/type |
Additional Migrations | Estimate |
Files & media migration | 32-56 hours |
Other entity types | 16-40 hours per entity type |
Migrations from non-Drupal sources | 16-40 hours per source type |
The numbers here are in “averaged person-hours” format - this would be what it would take for a single experienced developer to accomplish these tasks. Again, remember that these are really rough numbers and your mileage will vary.
You might note, reading the numbers closely, that most of the tasks are ‘front-loaded’. Migration is definitely a case where the heavy work happens at the start, to get things established. Adding more content types becomes simpler with time - fields are often reused, or at least similar enough to each other to allow for some overlap of code and configuration.
Finally, these numbers are also based on content types of "average" complexity. By this I mean somewhere between 5 and 15 un-customized content fields. Content types with substantially more fields, or with fields that require a lot of data handling, will expand the complexity of the migration. More complexity means more time. This is an area where it's hard to provide any specific numbers even as a guideline, but your migration planning spreadsheet will likely give you an idea of how much extra work is necessary. Use your best judgement and don't be afraid to give yourself some wiggle room in the overall total to cover these special cases.
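If it helps to see the arithmetic, here is a rough Python sketch that adds up the ranges from the tables above. The function name, its parameters, and the `buffer` option for wiggle room are all made up for illustration. It also assumes that "8 hours/type" means 8 hours multiplied by the total number of content types, which is one reading of the table rather than a stated rule.

```python
# (low, high) person-hours per task for the 1-5, 6-10 and 11+ tiers,
# copied from the table above. None marks an "8 hours/type" cell.
TIERED_HOURS = {
    "Initial analysis": [(16, 24), (32, 40), (48, 56)],
    "Content type creation & export": [(16, 40), (40, 80), None],
    "Configuration grouping": [(16, 24), (24, 40), (24, 40)],
    "Content migrations": [(16, 40), (32, 56), None],
    "Testing": [(24, 32), (40, 56), None],
}
PER_TYPE_HOURS = 8             # the "8 hours/type" cells
FILES_AND_MEDIA = (32, 56)     # files & media migration
PER_EXTRA_MIGRATION = (16, 40) # per other entity type or non-Drupal source


def estimate_hours(content_types, files_and_media=False,
                   other_entity_types=0, non_drupal_sources=0, buffer=0.0):
    """Return a (low, high) person-hour range for one experienced developer.

    buffer is an optional fraction of wiggle room, e.g. 0.25 adds 25%.
    """
    if content_types < 1:
        raise ValueError("content_types must be at least 1")

    # Pick the table column: 1-5, 6-10 or 11+ content types.
    tier = 0 if content_types <= 5 else 1 if content_types <= 10 else 2

    low = high = 0
    for ranges in TIERED_HOURS.values():
        hours = ranges[tier]
        if hours is None:
            # Assumption: per-type cells scale with the total type count.
            per_type_total = PER_TYPE_HOURS * content_types
            hours = (per_type_total, per_type_total)
        low += hours[0]
        high += hours[1]

    if files_and_media:
        low += FILES_AND_MEDIA[0]
        high += FILES_AND_MEDIA[1]

    extras = other_entity_types + non_drupal_sources
    low += PER_EXTRA_MIGRATION[0] * extras
    high += PER_EXTRA_MIGRATION[1] * extras

    return round(low * (1 + buffer)), round(high * (1 + buffer))


print(estimate_hours(4))                          # small site, nodes only
print(estimate_hours(8, files_and_media=True))    # mid-size site with media
```

With these assumptions, a four-content-type site comes out at 88 to 160 hours, and an eight-type site with files and media at 200 to 328 hours. Treat the result as a starting point for a conversation, not a quote.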
Security and safety considerations
As with all web development, a key consideration in migrating content is security. The good news is that migration is usually a one-time occurrence. Once it’s done, all the modules and custom code you’ve written are disabled, so they don’t typically present any security holes. As long as your development and database servers are set up to industry standards, migration doesn’t present any additional challenges in and of itself.
That said, it’s important to remember that you are likely to be working with extremely sensitive data - user data almost always contains PII (Personally Identifiable Information). It is therefore important to make sure that user data - in the form of database dumps, XML files, or other stores - does not get passed around in emails or other insecure channels.
Depending on your business, you may also have the same concerns with actual content, or with image and video files. Be sensible and take proper precautions. And make sure that your git repository is not public.
I also strongly recommend sanitizing user accounts and email addresses on your development databases. There’s no feeling quite like accidentally sending a few thousand dummy emails to your unsuspecting and confused customers. Use drush sql-sanitize and avoid any possibly embarrassing and unprofessional gaffes.
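To make "sanitizing" concrete, here is a minimal Python sketch of the idea, run against a throwaway in-memory SQLite database. The `users` table with `uid`, `mail`, and `pass` columns is a simplified stand-in, not the real Drupal schema, and the `example.invalid` address pattern is my own choice. On a real site, let drush do this work; the sketch only shows what the end result should look like.

```python
import sqlite3


def sanitize_users(conn):
    """Replace real emails and password hashes in a development copy.

    Assumes a simplified, hypothetical `users` table (uid, mail, pass).
    Returns the number of accounts that were rewritten.
    """
    cur = conn.execute(
        "UPDATE users "
        "SET mail = 'user+' || uid || '@example.invalid', pass = '' "
        "WHERE uid > 0"  # uid 0 is treated as the anonymous user here
    )
    conn.commit()
    return cur.rowcount


# Throwaway demo database standing in for a development copy.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE users (uid INTEGER PRIMARY KEY, mail TEXT, pass TEXT)")
demo.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [(0, "", ""), (1, "alice@customer.example", "hash1"),
     (2, "bob@customer.example", "hash2")],
)
print(sanitize_users(demo), "accounts sanitized")
print(demo.execute("SELECT uid, mail FROM users ORDER BY uid").fetchall())
```

The `.invalid` top-level domain is reserved and never resolves, so any mail the development site tries to send to these addresses cannot reach a real inbox.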
What’s next?
Well, we’ve covered all the project management aspects of migration - next up is some tech talk! Stay tuned for my next post, which will cover the foundations of developing a migration.