REA XML Parser and WordPress

A while back I was asked to display a directory full of REAXML files on a WordPress site. To accomplish this I created a class that converts the XML files into an associative array, and then a plugin that calls wp_insert_post() for each property. realestate.com.au accepts property data in its proprietary XML format (REAXML) as a way to import properties into your account on its website. Because of this, many businesses already produce REAXML documents from their custom software or through another XML provider. With this REAXML parser we can easily pull that data into our own website.

Creating the plugin

<?php
/*
Plugin Name: Property Inserter
Plugin URI: http://devblog.com.au
Description: A plugin to insert properties from REAXML files
Version: 1.0
Author: Ben Dougherty
Author URI: http://devblog.com.au
*/

Scheduling the Importer

We want the importer to run periodically, in this case every hour. To achieve this we register a WordPress scheduled event when the plugin is activated. We also register a deactivation hook to remove any scheduled events when the plugin is deactivated.

register_activation_hook(__FILE__, 'db_importer_activation');
register_deactivation_hook(__FILE__, 'db_importer_deactivation');
// Run the importer whenever the scheduled 'db_importer_run_hourly' event fires
add_action('db_importer_run_hourly', 'db_importer_run_hourly');

/*
 * Schedules our events on plugin activation
 */
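function db_importer_activation() {
  // wp_schedule_event() expects a UTC Unix timestamp, so use time()
  if ( ! wp_next_scheduled( 'db_importer_run_hourly' ) ) {
    wp_schedule_event( time(), 'hourly', 'db_importer_run_hourly' );
  }
}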

/*
 * Remove any scheduled events on plugin deactivation
 */
function db_importer_deactivation() {
  wp_clear_scheduled_hook('db_importer_run_hourly');
}

/**
 * Runs Hourly
 */
function db_importer_run_hourly() {
  db_importer_load_properties();
}
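
If hourly is not the right cadence, WordPress lets a plugin register extra recurrence names through the cron_schedules filter. The sketch below is illustrative only: the function name db_importer_add_schedules and the 'db_importer_15min' recurrence are made up for this example and are not part of the original plugin.

```php
<?php
/**
 * Adds a hypothetical 15-minute recurrence to the list of cron schedules.
 * Each entry needs an 'interval' in seconds and a human-readable 'display'.
 */
function db_importer_add_schedules($schedules) {
  $schedules['db_importer_15min'] = array(
    'interval' => 15 * 60,
    'display'  => 'Every 15 minutes (property importer)',
  );
  return $schedules;
}

// Only register the filter when running inside WordPress
if (function_exists('add_filter')) {
  add_filter('cron_schedules', 'db_importer_add_schedules');
}
```

With the filter in place, 'db_importer_15min' could be passed to wp_schedule_event() instead of 'hourly'. Keep in mind that WP-Cron is triggered by page loads, so on a quiet site events may run later than scheduled.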

Loading the properties using our REA XML Parser

We now need to specify some defaults, such as the user to attach the posts to, the category to insert the posts into and the custom post type. The REA XML parser class is included and we use its parse_directory() method to parse all of our XML files. Directories are also defined for the XML files to be moved into after processing, depending on whether they failed or were processed successfully. You can download the REA XML parser code from GitHub. As the call to db_importer_insert_properties() shows, we are only going to insert residential properties.

function db_importer_load_properties() {
  /* The location of our data directory */
  $xml_dir = "../../../data";

  /* Where our failed files will be moved */
  $failed_dir = "../../../data/failed";

  /* Where our successfully processed xml files will be moved */
  $processed_dir = "../../../data/processed";

  /* Files to exclude in our data folder */
  $excluded_data_files = array(".", "..", ".ftpquota", "processed", "failed");

  /* Residential Category */
  $residential_cat = 5;

  /* My user ID */
  $user_id = 1;	

  /* Post Type */
  $post_type = "property";

  //Include our REA_XML class (include_once avoids redeclaring it)
  include_once("rea_xml/rea_xml.class.php");
  $rea = new REA_XML(false); //false = debug mode off

  //set some additional excluded files
  $excluded_files = array("processed", "failed", ".DS_Store", ".ftpquota");

  //parse the whole directory
  $properties = $rea->parse_directory($xml_dir, false, false, $excluded_files);

  //Insert all residential properties
  db_importer_insert_properties($properties['residential'], $residential_cat, $user_id, $post_type);	
}
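
The call above inserts only the residential listings. If other listing types should be imported too, one option is to flatten every type into a single list first. This is a minimal sketch, assuming parse_directory() returns an array keyed by listing type (as the 'residential' key suggests). The helper name db_importer_flatten_properties and the 'listing_type' key are hypothetical, and the sample data is made up to mirror the shape used in this post.

```php
<?php
/**
 * Flattens an array keyed by listing type into one list of properties,
 * tagging each property with the type it came from.
 * The 'listing_type' key is a hypothetical addition for this example.
 */
function db_importer_flatten_properties($properties_by_type) {
  $all = array();
  if (!is_array($properties_by_type)) {
    return $all;
  }
  foreach ($properties_by_type as $type => $list) {
    if (!is_array($list)) {
      continue; // skip types with no parsed listings
    }
    foreach ($list as $property) {
      $property['listing_type'] = $type;
      $all[] = $property;
    }
  }
  return $all;
}

// Made-up sample data in the assumed shape
$sample = array(
  'residential' => array(
    array('description' => 'Three bedroom house'),
    array('description' => 'Two bedroom unit'),
  ),
  'rental' => array(
    array('description' => 'Studio apartment'),
  ),
);

$all_properties = db_importer_flatten_properties($sample);
echo count($all_properties) . " properties\n";
```

The flattened list could then be passed to db_importer_insert_properties() in place of $properties['residential'].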

Inserting the Properties

Once we have successfully loaded the properties we want to insert them all into WordPress as posts. We simply loop through the properties creating posts and inserting any custom meta data.

/**
 * Inserts properties from an associative array as WordPress posts.
 * get_property_title() and feedback() are helper functions that are
 * not shown in this post.
 */
function db_importer_insert_properties($properties, $category, $user_id, $post_type) {
  //nothing to do if no properties were passed in
  if(empty($properties)) {
    feedback("No properties were selected");
    return false;
  }

  $count = 0;
  foreach($properties as $property) {
    $title = get_property_title($property); //get title

    //set up the post
    $new_post = array(
      'post_title' => $title,
      'post_content' => $property['description'],
      'post_status' => 'publish',
      'post_date' => current_time('mysql'), //site-local time
      'post_author' => $user_id,
      'post_type' => $post_type,
      'post_category' => array($category)
    );

    $post_id = wp_insert_post($new_post); //returns 0 on failure
    if($post_id == 0) {
      feedback("Failed to add property $title");
      continue;
    }

    //add metadata
    add_post_meta($post_id, "_bathrooms", esc_attr($property['features']['bathrooms']));
    add_post_meta($post_id, "_bedrooms", esc_attr($property['features']['bedrooms']));

    //images are stored one URL per line
    if(isset($property['images']) && is_array($property['images'])) {
      add_post_meta($post_id, "_images", esc_attr(implode("\n", $property['images'])));
    }

    feedback("Added property $title with post_id $post_id"); //feedback
    $count++; //count
  }//end loop

  feedback("Added $count properties");
  return true;
}
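
When an updated file arrives for a listing that was already imported, the function above creates a second post, because it always inserts. One way around this, sketched below, is to store a unique identifier as post meta and look it up before inserting. This assumes each parsed property exposes the REAXML uniqueID value under a 'uniqueID' key, which this post does not confirm. The function names and the '_unique_id' meta key are hypothetical, and the code has to run inside WordPress.

```php
<?php
/**
 * Returns the ID of the post already holding this listing, or 0 if none.
 * '_unique_id' is a hypothetical meta key chosen for this sketch.
 */
function db_importer_find_existing($unique_id, $post_type) {
  $ids = get_posts(array(
    'post_type'      => $post_type,
    'post_status'    => 'any',
    'meta_key'       => '_unique_id',
    'meta_value'     => $unique_id,
    'posts_per_page' => 1,
    'fields'         => 'ids',
  ));
  return empty($ids) ? 0 : (int) $ids[0];
}

/**
 * Updates the matching post if one exists, otherwise inserts a new one.
 * Assumes $property['uniqueID'] is set (an assumption about the parser output).
 * Returns the post ID, or 0 on failure.
 */
function db_importer_upsert_property($property, $new_post) {
  $unique_id = $property['uniqueID'];
  $existing  = db_importer_find_existing($unique_id, $new_post['post_type']);

  if ($existing) {
    $new_post['ID'] = $existing; // tells wp_update_post() which post to change
    $post_id = wp_update_post($new_post);
  } else {
    $post_id = wp_insert_post($new_post);
  }

  if ($post_id) {
    // Remember the identifier so the next run can find this post again
    update_post_meta($post_id, '_unique_id', $unique_id);
  }
  return (int) $post_id;
}
```

Inside the loop, db_importer_upsert_property($property, $new_post) would take the place of the direct wp_insert_post() call. Swapping add_post_meta() for update_post_meta() would also stop repeated runs from stacking duplicate meta values.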

I hope this code saves somebody some time.

Comments

Permalink

I'm currently doing a plugin on top of your code.. Thanks for the code Ben..

I'm trying to build a working plugin for this, but I'm not having much luck so far .. I'm not sure if the wordpress plugin specs have changed since this was written, but when I activate it it just dumps a heap of code to the screen :(

Would really appreciate any help with this!

Permalink

Can you post the code which you think is causing issues?

Permalink

I am confused by these parameters. What is the source of the XML file, and where does the REAXML class get the XML feed from? Please help me.

Permalink

@simon, the XML in my case was coming from an external application. I'm presuming that if you're using this code you have an XML feed of properties. Somebody else has forked my initial project on GitHub and they have some sample data: https://github.com/satheshf12000/REA-XML-Parser/tree/mybranch/data/processed

Permalink

Thanks Ben, I used this sample and it solved my problem, but it raised another issue :(
It imports only 1 post from each XML file. I used the sample XML in the data folder.

Permalink

@simon it's a little hard to help you without seeing the code. Why don't you post a question on Stack Overflow with all the things you've tried and code samples and then post the link here and I'll see if I can help?

Permalink

No worries. Glad you got it sorted.

Permalink

Hoping you can help. For some reason, it only returns properties with type rental not residential. Any ideas why?

Permalink

I've noticed though that if i delete rental out of the xml file, it then shows the residential listings in the array?

Permalink

But i do get the line: Warning: REA_XML::parse_xml() [rea-xml.parse-xml]: Node no longer exists in D:\wamp\www\www.visionrealestateqld.com.au\wp-content\plugins\reaxmlparser\classes\rea_xml.class.php on line 149

Permalink

Just additionally, when there are only two items (one rental, one residential) it only returns the rental one.

Permalink

Hi crashfellow,

It's hard for me to comment on your problems without seeing the code. The code in this post is a basis, not a complete solution.

If you're still stuck I would suggest you post a more thorough question on Stack Overflow with code examples and then let me know, and I'll see what I can do to help.

Permalink

Hi Ben, same issue as crashfellow.

It seems to ignore anything with a property type other than the first entry in the XML. Should the db importer method be sent all properties instead of only residential?

Pete

Permalink

Hi Pete,

That sounds like a valid issue. If you create a pull request I don't mind merging it in. The code is more of a starting point than a finalised solution.

Permalink

G'day Ben - have logged as an issue on github.

Cheers and thanks,

Pete

Permalink

Hi Ben,
Thanks heaps for the code! Awesome work!
I just have one question, how do we retrieve any attribute values that may be set?

for example:

blah blah

so how would we go about getting the display attribute value?

Thanks again!

Permalink

sorry my previous message had all the XML tags stripped out of it...the blah blah was actually inside the suburb tag

Permalink

Hi Ben,

Thanks for sharing this wonderful tutorial. I noticed that this code only inserts new posts into WordPress. If I need to update an existing post from the XML, how can I achieve this?

Thanks

Permalink

Hi Spencer,

You'd have to insert a unique meta value for each property and then query the WordPress database to see if the property already existed. From there you could handle updates to the meta data rather than creating a new post.

Ben.

Permalink

Hi Ben,

Thanks for your reply. Which element in the xml file do you think can be used as the key to distinguish each property? Is there something in the xml file that can be used as a unique key for each property?

Thank you very much.

Spencer

Permalink

How about <uniqueID>?

Permalink

Thanks Ben, I think it is unique :)

Spencer

Permalink

Hi Ben,

Absolutely great work!!!

How to retrieve the category name or any attribute value?

Thanks
Abhi

Permalink

@Ben, Its around a week but no reply! Hmmm, nevermind i have found it myself :)

Thanks,
Abhi
