Drupal 7 Import content with custom fields, images, and taxonomy

By Greg Boggs

I’m in the process of migrating an old PHP website for Portland State University from a custom database into Drupal 7. Doing the data entry by hand for hundreds of nodes with multiple custom fields, taxonomies, and images would have taken ages, so I imported the data into Drupal with a PHP script instead.

At first, I tried exporting a CSV with phpMyAdmin and importing it with the Feeds module. However, the Feeds module for Drupal 7 doesn’t seem to do field matching well, and I couldn’t get the taxonomies or images to import correctly. So, with the help of my intelligent coworker, I decided to use Drush to execute a PHP script that imports the content. The process wasn’t hard for a PHP programmer, but the directions I found on other blogs either confused me or were missing a few steps. Mainly, the examples were missing the code to insert dynamic data from a database in a loop.

This being my first Drupal 7 project, I had a lot to learn about new Drupal features. Don’t worry, I’ve attached the code, so you can cut and paste my hard work.

Import content into Drupal 7

  1. Create the taxonomies and terms that you will need. This can be done with a script, but most websites don’t have that many categories.
  2. Create a content type for the nodes and include all the custom fields you will need. Be sure to include term reference fields for each taxonomy from step 1.
  3. Install Drush and make sure “drush help” works.
  4. Create a PHP script, and execute it with: drush php-script (script name)

Drush is nifty because it loads the Drupal environment and allows you to access Drupal’s functions. This saves you from having to figure out where to insert all the node data manually.
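As a rough illustration of what “loads the Drupal environment” means in practice, a script can check that the functions it relies on are defined before it touches any data. This is a minimal sketch and was not part of the original import script: the helper name cm_missing_functions() is made up for this example, and the list of functions is simply the ones the import script below calls.

```php
<?php
// Hypothetical helper (not part of Drupal or Drush): returns the names from
// $required that are not defined in the current PHP process.
function cm_missing_functions(array $required) {
  $missing = array();
  foreach ($required as $name) {
    if (!function_exists($name)) {
      $missing[] = $name;
    }
  }
  return $missing;
}

// Functions the import script calls. Under "drush php-script" with a full
// Drupal 7 bootstrap these should all be defined; under plain "php" they are not.
$required = array('db_query', 'node_object_prepare', 'node_submit', 'node_save', 'drush_print');

$missing = cm_missing_functions($required);
if ($missing) {
  echo 'Missing: ' . implode(', ', $missing) . ". Run this with drush php-script.\n";
}
else {
  echo "Drupal environment loaded.\n";
}
```

Run with plain php, the script should list the missing functions; run through Drush from inside the site directory, it should report that the environment is loaded.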

Connect to the database and get content

// My source data was not normalized. All data was in 1 table! Lucky me. No joins.
$query = 'SELECT * FROM `tb|record`';
$result = db_query($query);
$i = 20;
foreach ($result as $row) {
// Build and save one node per row. See the full source code below.
}

Most of what Tim wrote worked perfectly, but here’s what Tim has to say about assigning content to categories…

Add a term to a node

$node->field_tags[$node->language][]['tid'] = 1;
‘field_tags’ here is the name of a term reference field attached to your content type, and ‘1’ is the ID of the term you wish to assign to the node. Simple!

Now here’s the code I actually used to insert my content into a category. It wasn’t simple:

 if ( $term = taxonomy_get_term_by_name($row->culture) ) {
$terms_array = array_keys($term);
$node->field_culture[$node->language][]['tid'] = $terms_array['0'];
}
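
The array_keys() step is needed because, in Drupal 7, taxonomy_get_term_by_name() returns an array of matching term objects keyed by term ID rather than a single term. The sketch below isolates that step in a small helper. cm_first_tid() is a name made up for this example, and the $matches array only imitates the shape of the return value, with invented IDs and names, so the snippet runs without Drupal.

```php
<?php
// Hypothetical helper: given an array shaped like the return value of
// taxonomy_get_term_by_name() (term objects keyed by term ID), return the
// first term ID, or NULL if the array is empty.
function cm_first_tid(array $terms) {
  $tids = array_keys($terms);
  return $tids ? $tids[0] : NULL;
}

// Stand-in data with invented term IDs and names.
$matches = array(
  12 => (object) array('tid' => 12, 'name' => 'Roman'),
  47 => (object) array('tid' => 47, 'name' => 'Roman'),
);

var_dump(cm_first_tid($matches));
var_dump(cm_first_tid(array()));
```

If the same term name exists in more than one vocabulary, the lookup can return several matches. As far as I know, later Drupal 7 releases let you pass a vocabulary machine name as a second argument to taxonomy_get_term_by_name() to narrow the result; check the API documentation for your version before relying on it.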

If you review my code, you’ll notice that I failed to get my loop to read and load the taxonomies from the database, so I added them one at a time. If anyone can figure out the syntax error in the block of code that’s commented out, that would be awesome. Contact me, and I’ll give you credit. That said, I doubt many sites have more than a handful of categories.
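
One likely explanation for the commented-out loop, offered as a sketch rather than a confirmed diagnosis: under PHP 5, $node->$field[...] is read as “use $field[...] as the property name”, and because that expression contains an empty [], PHP stops with a fatal error before running anything. Wrapping the variable in braces, $node->{$field}, makes the intent explicit. The loop also treats the lookup result as a single term object, while the working code treats it as an array keyed by term ID. The snippet below uses a stand-in, cm_fake_term_lookup(), in place of taxonomy_get_term_by_name() so that it runs without Drupal; both helper names and all term IDs are invented for this example.

```php
<?php
// Stand-in for taxonomy_get_term_by_name(): returns term objects keyed by
// term ID, or an empty array when nothing matches. Names and IDs are invented.
function cm_fake_term_lookup($name) {
  $known = array('Roman' => 12, 'Bronze' => 31);
  if (!isset($known[$name])) {
    return array();
  }
  $tid = $known[$name];
  return array($tid => (object) array('tid' => $tid, 'name' => $name));
}

// Hypothetical helper: for each field => column pair, look up the term named
// in the row and append its ID to the matching field on the node.
function cm_attach_terms($node, $row, array $categories) {
  foreach ($categories as $field => $column) {
    $terms = cm_fake_term_lookup($row->{$column});
    if ($terms) {
      $tids = array_keys($terms);
      // Braces make PHP resolve the property name before the array offsets.
      $node->{$field}[$node->language][]['tid'] = $tids[0];
    }
  }
  return $node;
}

$node = new stdClass();
$node->language = 'und'; // The value of LANGUAGE_NONE in Drupal 7.
$row = (object) array('culture' => 'Roman', 'medium' => 'Bronze', 'theme' => 'Unknown');

$node = cm_attach_terms($node, $row, array(
  'field_culture' => 'culture',
  'field_medium' => 'medium',
  'field_theme' => 'theme',
));

print_r($node);
```

In the real script, the call to cm_fake_term_lookup() would be replaced by taxonomy_get_term_by_name(). Rows whose term name has no match are skipped, which mirrors the if checks in the long version.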

The script took me about 10 hours to perfect after I threw away 2-3 different methods. Hopefully, you can copy this script and get your site converted over to Drupal in much less time.

The Full Source Code


<?php
cm_load_rep_data();

function cm_load_rep_data( ) {

// My source data was not normalized. All data was in 1 table! Lucky me. No joins.
$query = 'SELECT * FROM `tb|record`';
$result = db_query($query);
$i = 20;
foreach ($result as $row) {
$node = new stdClass(); // We create a new node object
$node->type = 'piece'; // Or any other content type you want
$node->title = $row->subject;
$node->language = LANGUAGE_NONE; // Or any language code if Locale module is enabled. More on this below *
$node->uid = $i; // Or any id you wish
$url = str_replace(' ', '_', $row->subject);
$node->path = array('alias' => $url); // Setting a node path
node_object_prepare($node); // Set some default values.
$i++;

// Grab the file for this node based on the “schema” from the old website…
$file_path = '/home/gboggs/Images_data/'
. $row->collection_alias . '/'
. $row->collection_alias . $row->accession_no . '.jpg';

if (file_exists($file_path)) {
$file = (object) array(
'uid' => $i,
'uri' => $file_path,
'filemime' => file_get_mimetype($file_path),
'status' => 1,
);
$file = file_copy($file, 'public://');
$node->field_image[LANGUAGE_NONE][0] = (array)$file;
}

// Grab the body content. My database had the body content repeated. You don’t need this piece.
if (!isset($row->long_des) || $row->long_des == null) {
$body = $row->short_des;
} else {
$body = $row->long_des;
}

// Insert the content into the node
$node->body[$node->language][0]['value'] = $body;
$node->body[$node->language][0]['format'] = 'full_html';

// Let’s add some CCK fields. This is pretty similar to the body example
$node->field_collection_number[$node->language][0]['value'] = $row->accession_no;
$node->field_dimensions[$node->language][0]['value'] = $row->dimension;
$node->field_date[$node->language][0]['value'] = $row->date;
$node->field_artist_author[$node->language][0]['value'] = $row->artist;
$node->field_provenance[$node->language][0]['value'] = $row->provenance_prior_pub;
$node->field_details[$node->language][0]['value'] = $row->object_details;

/* This doesn’t execute, and I couldn’t figure out why in < 10 minutes. so doing it the long way.
$categories = array('field_culture' => 'culture',
'field_medium' => 'medium',
'field_time_period' => 'time_periods',
'field_collection' => 'collection',
'field_theme' => 'theme');
foreach ($categories as $field => $category) {
$term = taxonomy_get_term_by_name($row->$category);
$node->$field[$node->language][]['tid'] = $term->tid;
}
*/

// The long way to check taxonomy terms and add them to the node.
if ( $term = taxonomy_get_term_by_name($row->culture) ) {
$terms_array = array_keys($term);
$node->field_culture[$node->language][]['tid'] = $terms_array['0'];
}

if ( $term = taxonomy_get_term_by_name($row->medium) ) {
$terms_array = array_keys($term);
$node->field_medium[$node->language][]['tid'] = $terms_array['0'];
}

if ( $term = taxonomy_get_term_by_name($row->time_periods) ) {
$terms_array = array_keys($term);
$node->field_time_period[$node->language][]['tid'] = $terms_array['0'];
}

if ( $term = taxonomy_get_term_by_name($row->collection) ) {
$terms_array = array_keys($term);
$node->field_collection[$node->language][]['tid'] = $terms_array['0'];
}

if ( $term = taxonomy_get_term_by_name($row->theme) ){
$terms_array = array_keys($term);
$node->field_theme[$node->language][]['tid'] = $terms_array['0'];
}

$node = node_submit($node); // Prepare node for a submit
node_save($node); // After this call we’ll get a nid

// printing from Drush is easy
drush_print($row->subject . " added.\n");
}
}
?>





All content © Copyright 2014 by Greg Boggs.