Managing Your Drupal 8 Migration
Joshua Turton | Senior Developer
February 13, 2018
In this post, we’ll begin to talk about the development considerations of actual website code migration and other technological details. In these exercises, we’re assuming that you’re moving from Drupal 6 or 7 to Drupal 8. In a later post, I will examine ways to move other source formats into Drupal 8 - including CSV files, non-Drupal content management systems, or database dumps from weird or proprietary frameworks.
Migration: A Primer
Before we get too deep into the actual tech here, we should probably take a minute to define some terms and explain what’s actually happening under the hood when we run a migration, or the rest of this won’t make much sense.
When we run a migration, the web server loads the content from the old site, converts it to a Drupal 8 format, and saves it in the new site. Sounds simple, right?
Actually, it pretty much is that simple. At least, conceptually. So, try to keep those three steps in mind as we go through the hard stuff later. Everything we do is designed to make one of those three steps work.
Key Phrases
- Migration: The process of moving content from one site to another. ‘A migration’ typically refers to all the content of a single content or entity type (in other words, one node type, one taxonomy, and so on).
- Migration Group: A collection of Migrations with common traits
- Source: The Drupal 6 or 7 database from which you’re drawing your content (or other weird source of data, if applicable)
- Process: The stuff that Drupal code does to the data after it’s been loaded, in order to digest it into a format that Drupal 8 can work with
- Destination: The Drupal 8 site
Interestingly, each of those key phrases above corresponds directly to a code file that’s required for migration. Each Migration has a configuration (.yml) file, and each is individually tailored for the content of that entity. As config files, each of these is pretty independent and not reusable. However, we can also assign them to Migration Groups. Groups are also configuration (.yml) files. They allow us to declare common configurations once, and reuse them in each migration that belongs to that group.
The Source Plugin code is responsible for doing queries to the Source database, retrieving the data, and formatting it into PHP objects that can be worked on. The Process Plugin takes that data, does stuff to it, and passes it to the next step. The Destination Plugin then saves it in Drupal 8 format. Rinse, repeat.
On a Drupal-to-Drupal migration, around 75% of your time will be spent working in the Migration or Migration Group config, declaring the different Process Plugins to use. You may wind up writing one or more Process Plugins as part of your migration development, but a lot of really useful ones are included in Drupal core migration code and are documented on Drupal.org. A few more are included with Migrate Plus.
Drupal 8 core has Source Plugins for all standard Drupal 6 and Drupal 7 entity types (node, taxonomy, user, etc.). The only time you’ll ever need to write a Source plugin is for a migration from a source other than Drupal 6 or 7, and many of these are already available as Contrib modules.
Also included in Drupal core are Destination Plugins for all of the core entity types. Unless you’re using a custom entity in Drupal 8, and migrating data into that entity, you’ll probably never write a Destination Plugin.
Development Foundations
There are a few key requirements you need to have in place before you can begin development. First, and probably foremost, you need to have both your Drupal 6/7 and Drupal 8 sites - the former full of all your valuable content, and the latter empty of everything but structure.
An important note: though the completed migration will be run on your production server, you should be using development environments for this work. At Phase2, we use Outrigger to simplify and standardize our dev and production environments.
For migration purposes, we only actually need the Drupal 7 site’s database itself, in a place that’s accessible to the destination site. I usually take an SQL dump from production, and install it as an additional database on the same server as the destination, to avoid network latency and complicated authentication requirements. Obviously, unless you freeze content for the duration of the migration development, you’ll have to repeat this process for final content migration on production.
I’d like to reiterate some advice from my last post: I strongly recommend sanitizing user accounts and email addresses on your development databases. Use drush sql-sanitize and avoid any possibly embarrassing and unprofessional gaffes.
On your Drupal 8 site, you should already have completed the creation of the new content types, based on information you discovered and documented in your first steps. This should also encompass the creation of taxonomy vocabularies, and any fields on your user entities.
In your Drupal 8 settings.php file, add a second database config array pointed at the Drupal 7 source database.
$databases['migration_source_db']['default'] = array(
  'database' => 'example_source',
  'username' => 'username',
  'password' => 'password',
  'prefix' => '',
  'host' => 'db',
  'port' => '',
  'namespace' => 'Drupal\Core\Database\Driver\mysql',
  'driver' => 'mysql',
);
Finally, you’ll need to add the migration module suite to your site. The baseline for migrations is migrate, migrate_drupal, migrate_plus, and migrate_tools. The Migrate and Migrate Drupal modules are core code. Migrate provides the basic functionality required to take content and put it into Drupal 8. Migrate Drupal provides code that understands the structure of Drupal 6 and 7 content, and makes it much more straightforward to move content forward within the Drupal ecosystem.
Both Migrate Plus and Migrate Tools are contributed modules available on Drupal.org. Migrate Plus, as the name implies, adds some new features, most importantly migration groups. Migrate Tools provides the drush integration we will use to run and roll back migrations.
Drupal 8 core code also provides migrate_drupal_ui, but I recommend against using it. By using Migrate Tools, we can make use of drush, which is more efficient, can be incorporated into shell scripts, and has clearer error messages.
Framing the House
We’ve done the planning and laid the foundations, so now it’s time to start building this house!
We start with a new, custom module. This can be pretty bare-bones, to start with.
type: module
name: 'Example Migrate'
description: 'Example custom migrations'
package: 'Example Migrate'
core: '8.x'
dependencies:
  - drupal:migrate
  - migrate_plus:migrate_plus
  - migrate_tools:migrate_tools
  - drupal:migrate_drupal
Within our module folder, we need a config/install directory. This is where all our config files will go.
Migration Groups
The first thing we should make is a general migration group. While it’s possible to put all the configuration into each and every migration you write, I’m a strong believer in DRY programming (Don’t Repeat Yourself). Migrate Plus gives us the ability to put common configuration into a single file and use it for multiple migrations, so let’s take advantage of that power!
Note the filename we’re using here: migrate_plus.migration_group.example_general.yml. This naming convention gives Migrate Plus the ability to find and parse this configuration, and marks it as a migration group.
# The machine name of the group, by which it is referenced in individual migrations.
id: example_general
# A human-friendly label for the group.
label: General Imports
# More information about the group.
description: Common configuration for simple migrations.
# Short description of the type of source, e.g. "Drupal 6" or "WordPress".
source_type: Drupal 7 Site
# Here we add any default configuration settings to be shared among all
# migrations in the group.
shared_configuration:
  source:
    key: migration_source_db
# We add dependencies just to make sure everything we need will be available
dependencies:
  enforced:
    module:
      - example_migrate
      - migrate_drupal
      - migrate_tools
This is a very simple group that we’ll use for migrations of simple content. Most of the stuff in here is self-descriptive. However, source is a critical config - it uses the key of the database configuration we added earlier, to give migrate access to that database. We’ll examine a more complicated migration group another time.
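To make the sharing concrete, here is a rough sketch of the source section a single migration would have to declare if it did not belong to the group. It assumes the d7_user source plugin used later in this post and the migration_source_db key from settings.php. Treat it as an illustration, not a file you need to create.

```yaml
# Sketch only: a migration's source section without a migration group.
# When migration_group: example_general is set, the "key" line below comes
# from the group's shared configuration and can be left out.
source:
  plugin: d7_user
  # Must match the array key added to $databases in settings.php.
  key: migration_source_db
```

With ten or twenty migrations, keeping that key in one group file means a renamed database connection is a one-line change.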
User Migration
In Drupal, users pretty much have their fingers in every pie. They are listed as authors on content, they are creators of files… you get the picture. That’s why it’s usually the first migration to get run.
Note again the filename convention here (migrate_plus.migration.example_user.yml), which allows Migrate Plus to find it, and marks it as a migration (as opposed to a group).
# Migration for user accounts.
id: example_user
label: User Migration
migration_group: example_general
source:
  plugin: d7_user
destination:
  plugin: entity:user
process:
  mail:
    plugin: get
    source: mail
  status: status
  name:
    - plugin: get
      source: name
    - plugin: dedupe_entity
      entity_type: user
      field: name
  roles:
    plugin: static_map
    source: roles
    map:
      2: authenticated
      3: administrator
      4: author
      5: guest_author
      6: content_approver
  created: created
  changed: changed
migration_dependencies:
  required: { }
dependencies:
  enforced:
    module:
      - example_migrate
Wow! There’s lots of stuff going on here. Let’s try and break it down a bit.
id: example_user
label: User Migration
migration_group: example_general
The id designation is a standard machine name for this migration. We will call this with drush to run the migration. Label is a standard human-readable name. The migration_group should be obvious - it connects this migration to the group we designed above, which means we are now importing all the config in there. Notably, that connects us to the D7 database.
source:
  plugin: d7_user
destination:
  plugin: entity:user
Here are two key items. The source plugin defines where we are getting our data, and what format it’s going to come in. In this case, we are using Drupal core’s d7_user plugin.
The destination plugin defines what we’re making out of that data, and the format it ends up in. In this case, we’re using Drupal core’s entity:user plugin.
process:
  mail:
    plugin: get
    source: mail
  status: status
  name:
    - plugin: get
      source: name
    - plugin: dedupe_entity
      entity_type: user
      field: name
  roles:
    plugin: static_map
    source: roles
    map:
      2: authenticated
      3: administrator
      4: author
      5: guest_author
      6: content_approver
  created: created
  changed: changed
Now we get into the real meat of a migration - the Process section. Each field you’re going to migrate has to be defined here. They are keyed by their field machine name in Drupal 8.
Each field assigns a plugin parameter, which defines the Process Plugin to use on the data. Each of these process plugins will take a source parameter, and then possibly others. The source parameter defines the field in the data array provided by the source plugin. (Yeah, like I’ve said before, naming things clearly isn’t Drupal’s strong suit).
Our first example is mail. Here we are assigning it the get process plugin. This is the easiest process to understand, as it literally takes the data from the old site and gives it to the new site without transforming it in any way. Since email addresses don’t have any formatting changes or necessary transformations, we just move them.
In fact, the get process plugin is Drupal’s default, and our next example shows a shortcut to use it. The status field is getting its data from the old status field. Since get is the default, we don’t need to specify the plugin at all; we simply give the source field name as the value. See the documentation for more detail.
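As a quick side-by-side, the sketch below writes status in the long form and leaves created in the shortcut form. Both rely on the same get plugin. The field names are the ones from the user migration in this post.

```yaml
process:
  # Long form, equivalent to the shortcut "status: status".
  status:
    plugin: get
    source: status
  # Shortcut form: the get plugin is implied, the value is the source field.
  created: created
```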
Name is a slightly more complicated matter. While usernames don’t change much in their format, we want to make absolutely sure that they are unique. This leads us to Plugin Chaining, an interesting option that allows us to pass data from one plugin to another, before saving it. The YML array syntax, as demonstrated above, allows us to define more than one plugin for a single field.
We start off by defining the get plugin, which just gets the data from a source field. (You can’t use the default shortcut when you’re chaining, incidentally.)
We then pass it off to the next plugin in the chain, dedupe_entity. This plugin ensures that each record is absolutely certain to be unique. It has the additional parameters entity_type and field. These define the entity type to check against for uniqueness, and the field in which to look on that entity. See the documentation for more detail.
Note that this usage of dedupe_entity does not specify a source parameter. That’s because plugin chaining hands off the data from the first plugin in line to the next, becoming, in effect, the source. It’s very similar to method chaining in jQuery or OOP PHP. You can chain together as many process plugins as you need, though if you start getting up above four it might be time to re-evaluate what you’re doing, and possibly write a custom processor.
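Here is a minimal sketch of a three-step chain, assuming you also want to strip stray whitespace from usernames before checking uniqueness. The extra step uses core’s callback process plugin with PHP’s trim function. That step is my illustration and is not part of the migration above.

```yaml
process:
  name:
    # Step 1: read the raw value from the source row.
    - plugin: get
      source: name
    # Step 2 (illustrative addition): no source here, so the plugin
    # receives the output of step 1 and passes it through trim().
    - plugin: callback
      callable: trim
    # Step 3: receives the trimmed value and makes it unique among users.
    - plugin: dedupe_entity
      entity_type: user
      field: name
```

Only the first plugin names a source; every later plugin works on whatever the previous one returned. As far as I know, newer core releases offer this same dedupe behavior under the name make_unique_entity_field, so check which one your version provides.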
Our final example to examine is roles. User roles in Drupal 7 were keyed numerically, but in Drupal 8 they are based on machine names. The static_map plugin takes the old numbers, and assigns them to a machine name, which becomes the new value.
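One thing to plan for with static_map is a source value that isn’t in your map. The plugin accepts a default_value setting for that case. The sketch below uses invented names (field_workflow_state and legacy_state are hypothetical, not part of the user migration) and a simple single-value source field, purely to show the shape of the config.

```yaml
process:
  # field_workflow_state and legacy_state are hypothetical names.
  field_workflow_state:
    plugin: static_map
    source: legacy_state
    map:
      0: draft
      1: published
    # Used when the source value has no entry in the map.
    default_value: draft
```

Without a fallback like this, an unmapped value is reported as a problem for that row rather than silently saved, which is often what you want while you’re still discovering what the old data contains.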
The last two process items are changed and created. Like status, they are using the get process plugin, and being designated in the shortcut default syntax.
migration_dependencies:
  required: { }
dependencies:
  enforced:
    module:
      - example_migrate
The last two configs are pretty straightforward. Migration Dependencies are used when a migration requires data from other migrations (we’ll get into that more another time). Dependencies are used when a migration requires a specific additional module to be enabled. In my opinion it’s pretty redundant with the dependencies declared in the module itself, so I don’t use it much.
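For a sense of what a non-empty migration dependency looks like, here is a fragment of a hypothetical follow-up migration. The example_article ID and label are invented for illustration. It declares that the user migration from this post must run first, so that authors already exist when content is imported.

```yaml
# Fragment of a hypothetical migration; example_article is an invented ID.
id: example_article
label: Article Migration
migration_group: example_general
migration_dependencies:
  required:
    # Users must be migrated first so authors can be resolved.
    - example_user
```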
In the next post, we’ll cover taxonomy migrations and simple node migrations. We’ll also share a really useful tool for migration development. Thanks for reading!