BTMash

Blob of contradictions

Migrating Content Part 2: Nodes

(Updated November 10, 2011 - Updated information about file actions. Thank you Patrick Thurmond!)
(Updated June 4, 2011 - File Handler Compatibility with Migrate 2.1)
(Updated June 8, 2011 - Image to show how to write out your destination field names)
(Updated August 29, 2012 - A few weeks ago I wrote a blog post on the new image/file handler changes in Migrate 2.4+. I have not updated this post with those changes, so for anything file related, combine what you learn here with the new post, or just read the new one, since it also links to the new reference codebase. Hopefully it helps :)

It's been a while since I wrote about using the Migrate module to migrate content from various sources. Last time, I covered migrating users into Drupal; this time, I will show how to migrate content into Drupal as nodes. Because the Migrate module is so flexible, it makes this quite easy.

The first article discussed the merits of using the Migrate module while the second article was a walkthrough on how to import users.

Much like last time, we first define our migration class:

  class MyBasicPageMigration extends Migration {
    ...
  }

This class will consist of two methods: one is the class constructor, while the other is for massaging or adding extra information on a per-row basis.

  /**
   * Class Constructor
   */
  public function __construct() {
    ...
  }

  /**
   * Add any additional data / clean up data for the current row / node that will be added/updated.
   */
  public function prepareRow($current_row) {
    ...
  }

Let's start by defining the class constructor.
We need to create the mapping information. For more detail on how the mapping information works, my previous article on migrating users is recommended reading, as I will be diving headfirst into the code here. Please note that I had already defined some constants beforehand: MY_MIGRATION_DATABASE_NAME is the name of the database, and MY_MIGRATION_FILES_DIRECTORY is the path to my files directory.

  public function __construct() {
    parent::__construct();

    $this->description = t('Migrate basic pages');

    // Source fields that are not part of the query; they get filled in by prepareRow().
    $source_fields = array(
      'nid' => t('The node ID of the page'),
      'linked_files' => t('The set of linked files'),
      'right_side_images' => t('The set of images that previously appeared on the side'),
    );

    // Pull page nodes from the old D6 database, along with their body and author.
    $query = db_select(MY_MIGRATION_DATABASE_NAME .'.node', 'n')
      ->fields('n', array('nid', 'vid', 'type', 'language', 'title', 'uid', 'status', 'created', 'changed', 'comment', 'promote', 'moderate', 'sticky', 'tnid', 'translate'))
      ->condition('n.type', 'page', '=');
    $query->join(MY_MIGRATION_DATABASE_NAME .'.node_revisions', 'nr', 'n.vid = nr.vid');
    $query->addField('nr', 'body');
    $query->addField('nr', 'teaser');
    $query->join(MY_MIGRATION_DATABASE_NAME .'.users', 'u', 'n.uid = u.uid');
    $query->addField('u', 'name');
    // The query must be sorted by the highwater column.
    $query->orderBy('n.changed');

    $this->highwaterField = array(
      'name' => 'changed', // Column to be used as highwater mark
      'alias' => 'n', // Table alias containing that column
    );

    $this->source = new MigrateSourceSQL($query, $source_fields);
    $this->destination = new MigrateDestinationNode('page');

    // Map the source key (the D6 nid) to the destination node key.
    $this->map = new MigrateSQLMap($this->machineName,
      array(
        'nid' => array(
          'type' => 'int',
          'unsigned' => TRUE,
          'not null' => TRUE,
          'description' => 'D6 Unique Node ID',
          'alias' => 'n',
        )
      ),
      MigrateDestinationNode::getKeySchema()
    );

    ...
  }

The code above is very similar to the mappings done for users in the previous article. We are simply defining more tables to join against the initial table, and the columns that we want pulled in. However, there is one new component: highwater. Highwater is a more recent addition to the Migrate module that lets you update content that has already been migrated into your new Drupal site. This means that content does not have to be rolled back and migrated in again! You define which column from your table will be the 'highwater' mark that denotes whether or not a piece of content needs to be updated. The other requirement is that your SQL query must be sorted by that same highwater column.

  /*
    Nodes will be checked for updates based on the 'changed' column
    from the original db defined in the sql query above.
  */
  $this->highwaterField = array(
    'name' => 'changed', // Column to be used as highwater mark
    'alias' => 'n', // Table alias containing that column
  );
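
To make the highwater idea concrete, here is a small plain-PHP sketch of the concept. It is not the Migrate module's internal code: process_since_highwater() and the row layout are hypothetical names made up for illustration. It only shows why sorting by the highwater column matters and how a later run can pick up just the edited content.

```php
<?php
// Conceptual sketch only: this is NOT the Migrate module's internal code.
// process_since_highwater() and the row layout are hypothetical.

/**
 * Returns the nids of rows changed after $highwater, plus the new mark.
 * Rows must already be sorted ascending by 'changed'.
 */
function process_since_highwater(array $rows, int $highwater): array {
  $processed = array();
  foreach ($rows as $row) {
    if ($row['changed'] <= $highwater) {
      continue; // Already imported on a previous run.
    }
    $processed[] = $row['nid'];
    // Safe to advance the mark row by row because the rows are sorted.
    $highwater = $row['changed'];
  }
  return array($processed, $highwater);
}

$rows = array(
  array('nid' => 1, 'changed' => 100),
  array('nid' => 2, 'changed' => 200),
  array('nid' => 3, 'changed' => 300),
);

// First run: nothing has been imported yet, so every row is processed.
list($first_run, $mark) = process_since_highwater($rows, 0);

// Node 2 is edited on the old site; re-sort the way ORDER BY would.
$rows[1]['changed'] = 400;
usort($rows, function ($a, $b) { return $a['changed'] <=> $b['changed']; });

// Second run: only the edited node is picked up.
list($second_run, $mark) = process_since_highwater($rows, $mark);
```

If the rows were not sorted, the mark could jump past a row that has not been handled yet, which is presumably why the query has to be ordered by the same column.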

With that out of the way, we are now ready to start mapping fields. As a reference, this is what part of my migration screen looks like for the various pieces of node info (this includes node title, status, uid, etc., along with fields).
The migration UI describes the names of fields that need to get mapped. This includes core node info such as uid, created, status, etc., along with fields such as field_content_associated_images, which will be described in more detail below.
Most of the mappings below are done the same way as they were for users.

  // Make the mappings
  $this->addFieldMapping('title', 'title');
  $this->addFieldMapping('is_new')->defaultValue(TRUE);
  $this->addFieldMapping('uid', 'uid');
  $this->addFieldMapping('revision')->defaultValue(TRUE);
  $this->addFieldMapping('revision_uid', 'uid');
  $this->addFieldMapping('created', 'created');
  $this->addFieldMapping('changed', 'changed');
  $this->addFieldMapping('status', 'status');
  $this->addFieldMapping('promote', 'promote');
  $this->addFieldMapping('sticky', 'sticky');
  $this->addFieldMapping('comment', 'comment');
  $this->addFieldMapping('language')->defaultValue('und');

  $this->addFieldMapping('path')->issueGroup(t('DNM'));
  $this->addFieldMapping('pathauto_perform_alias')->defaultValue('1');

  $this->addFieldMapping(NULL, 'name');
  $this->addFieldMapping(NULL, 'vid');
  $this->addFieldMapping(NULL, 'type');
  $this->addFieldMapping(NULL, 'language');
  $this->addFieldMapping(NULL, 'moderate');
  $this->addFieldMapping(NULL, 'tnid');
  $this->addFieldMapping(NULL, 'translate');

But the migration wouldn't be much of a migration if we only had simple node table column values to migrate and no actual fields. The Migrate module currently handles all core fields (Migrate Extras covers fields defined by other modules, though work there is ongoing). For text fields (including text fields with a summary), there are arguments that control how that content is migrated into your new Drupal site.

  $body_arguments = MigrateTextFieldHandler::arguments(NULL, filter_default_format(), NULL);
  $this->addFieldMapping('field_content_body', 'body')
    ->arguments($body_arguments);

The first argument to MigrateTextFieldHandler::arguments() says which source field maps to the summary of the text field (if there is one). If we did have one, we would define where to find the source field:
$arguments = MigrateTextFieldHandler::arguments(array('source_field' => 'excerpt'));
The second argument is the filter you wish to apply to the content (so if you know the machine name of the filter you wish to use, or want to pull it dynamically, you may do so in the same way as for the first argument). The final argument is the language of the text field.

The migrate module also provides the ability to import file fields and have them get attached to nodes.

  $associated_file_arguments = MigrateFileFieldHandler::arguments(NULL, 'file_copy', FILE_EXISTS_RENAME);

  $this->addFieldMapping('field_content_associated_images', 'right_side_images')
    ->arguments($associated_file_arguments);

Update as of June 4, 2011: There is a lot less going on in the code above. We are going to be getting multiple files and assigning attributes to them for the import. The arguments consist of:

  1. Source path to the file.
  2. File action: file_move, file_copy, file_fid, file_link, file_blob, or file_fast (which doesn't really do anything)
  3. Action if file already exists.

If the first item is NULL, then the values you pass in will need to define a path to the file (this all gets defined in the prepareRow() implementation). Also, the file definition is *MUCH* simpler than it was with the separator method from Migrate 2.0: a PHP array is defined for each file and JSON-encoded using drupal_json_encode for Migrate. For the array, we can define a few things:

  1. Source Path to the file
  2. Language
  3. File Alt Text
  4. File Title Text
  5. File Description
  6. File Display name

  public function prepareRow($current_row) {
    // Right Side Images: look up the images attached to the current D6 node.
    $right_side_image_query = db_select(MY_MIGRATION_DATABASE_NAME .'.content_field_right_side_image', 'rsi')
      ->fields('rsi')
      ->condition('rsi.nid', $current_row->nid, '=');
    $right_side_image_query->join(MY_MIGRATION_DATABASE_NAME .'.files', 'f', 'rsi.field_right_side_image_fid = f.fid');
    $right_side_image_query->addField('f', 'filepath');
    $results = $right_side_image_query->execute();

    $images = array();

    foreach ($results as $row) {
      // Alt and title text are stored serialized in the D6 data column.
      $image_data = unserialize($row->field_right_side_image_data);
      $current_image_data = array(
        'path' => MY_MIGRATION_FILES_DIRECTORY .'/'. $row->filepath,
        'alt' => $image_data['alt'],
        'title' => $image_data['title'],
      );
      $images[] = drupal_json_encode($current_image_data);
    }
    $current_row->right_side_images = $images;
    return TRUE;
  }

Since the right_side_images above are now an array of images, you can have 0, 1, 10, 20, N files associated with that particular field! And with all that, we have our various fields defined for migrating content from another database into nodes!
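
To see what each entry in that array looks like, here is a minimal plain-PHP sketch that runs outside Drupal. build_image_value() is a hypothetical helper, the paths are made up, and plain json_encode() stands in for drupal_json_encode(). It also assumes the serialized data column may be empty for some rows and falls back to empty alt/title text in that case.

```php
<?php
// Sketch only: build_image_value() is a hypothetical helper, and plain
// json_encode() stands in for drupal_json_encode() so this runs outside Drupal.

/**
 * Builds the JSON-encoded value for a single file.
 */
function build_image_value(string $files_dir, string $filepath, $serialized_data): string {
  // Assumption: the data column may be empty, so fall back to an empty array.
  $data = $serialized_data ? @unserialize($serialized_data) : array();
  if (!is_array($data)) {
    $data = array();
  }
  return json_encode(array(
    // Normalize slashes so the directory and file path join cleanly.
    'path' => rtrim($files_dir, '/') . '/' . ltrim($filepath, '/'),
    'alt' => isset($data['alt']) ? $data['alt'] : '',
    'title' => isset($data['title']) ? $data['title'] : '',
  ));
}

// One JSON string per file; the field value is simply an array of them.
$images = array(
  build_image_value('/var/old-site', 'files/a.jpg', serialize(array('alt' => 'A', 'title' => 'First'))),
  build_image_value('/var/old-site/', '/files/b.jpg', NULL),
);

// Decode again just to inspect what was stored.
$decoded = array_map(function ($json) { return json_decode($json, TRUE); }, $images);
```

Guarding against missing alt/title data like this avoids PHP notices for rows where nothing was ever entered for the image.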
I hope all of this is helpful. If something doesn't make a lot of sense, leave a comment! I'll try and improve this documentation. And next time, I'll post how to migrate an event node with dates, taxonomy terms, and attached files.