Shearer Software

Andrew Shearer’s Drivel

99 and 44/100 percent pure.

Thursday, June 10, 2004

WordPress RSS Import

WordPress 1.2 now has its own RSS import feature. However, it’s based on a different technique (regular expressions) from the code I contributed in January (which uses a true SAX XML parser). So I’m posting my code here as open source under the GPL. This code has some additional features:

  • It can import single files from either your local drive or from a URL you specify, or it can import entire folder hierarchies of RSS files (blogBrowser-style: one folder per year, one file per month), making it a general-purpose weblog batch import tool using RSS as the exchange format.
  • It aggregates RSS feeds, if you point one or more copies of it at feeds on the web and set it to run regularly. (Even when run frequently, it won’t import the same item twice.) You can also use this to maintain several WordPress sites that share the same content, such as a test site and a production site.
  • It handles time zones in a sophisticated way, preserving the timezone offset so that each item can appear on your weblog under the author’s original local time, while using GMT for all date comparisons.
  • It respects and stores modification dates if given in the RSS file.
  • If modification dates are given in the RSS file, it can optionally import only new or changed posts, leaving alone any posts that haven’t changed or that have been changed more recently on the local machine.
  • Using the above feature and two or more copies of WordPress, it can synchronize weblogs bidirectionally or multi-directionally. New and changed posts on any one weblog will automatically show up on the others.
  • It complies with the XML specification, for correct behavior with XML namespaces with arbitrary prefixes and CDATA sections in arbitrary locations, both of which can trip up a regular-expression-based parser.
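
The time zone and modification-date handling described in the list above could look roughly like the following Python sketch. It is not the importer's actual code (which runs inside WordPress); the function names are hypothetical, and it assumes RSS 2.0 style RFC 822 dates. The idea is to keep the author's offset for display while doing every comparison in GMT.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def parse_rss_date(text):
    """Parse an RFC 822 date; return (UTC datetime, offset in minutes)."""
    local = parsedate_to_datetime(text)
    if local.tzinfo is None:
        # Assumption: a date with no usable offset is treated as GMT.
        local = local.replace(tzinfo=timezone.utc)
    offset_minutes = int(local.utcoffset().total_seconds() // 60)
    return local.astimezone(timezone.utc), offset_minutes


def author_local_time(utc_time, offset_minutes):
    """Rebuild the author's original wall-clock time for display."""
    return utc_time.astimezone(timezone(timedelta(minutes=offset_minutes)))


def should_import(feed_modified_utc, local_modified_utc):
    """Import new posts and posts changed more recently in the feed."""
    if local_modified_utc is None:
        return True  # no local copy yet, so the item is new
    # Unchanged posts, or posts edited more recently locally, are left alone.
    return feed_modified_utc > local_modified_utc
```

Storing the GMT time and the offset separately means that two items written at the same instant in different time zones compare as equal, yet each can still be shown under its author's local time.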

As long as your RSS feed passes the XML well-formedness test (which it probably does, even if it doesn’t validate according to the RSS Validator), you can use this RSS Import filter. If it’s not well-formed XML, you’re better off with the RSS import filter built into WordPress.
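
To check well-formedness yourself, and to see why a SAX parser copes with arbitrary namespace prefixes and CDATA sections, here is a minimal Python sketch using the standard library's xml.sax module. It is an illustration only, not the import filter itself: the class and function names are hypothetical, it assumes RSS 2.0 style items (no namespace on item and title), and it assumes post bodies arrive in the content module's encoded element.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_external_ges, feature_namespaces

# Assumption: full post bodies use the RSS content module namespace.
CONTENT_NS = "http://purl.org/rss/1.0/modules/content/"
WANTED = {(None, "title"), (CONTENT_NS, "encoded")}


class ItemHandler(ContentHandler):
    """Collects the title and body of each RSS item."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._item = None   # dict for the item being read, if any
        self._field = None  # (namespace URI, local name) being captured
        self._text = []

    def startElementNS(self, name, qname, attrs):
        if name == (None, "item"):
            self._item = {}
        elif self._item is not None and name in WANTED:
            self._field = name
            self._text = []

    def characters(self, content):
        # Plain text and CDATA sections both arrive through this callback.
        if self._field is not None:
            self._text.append(content)

    def endElementNS(self, name, qname):
        if name == self._field:
            self._item[name[1]] = "".join(self._text)
            self._field = None
        elif name == (None, "item") and self._item is not None:
            self.items.append(self._item)
            self._item = None


def parse_items(data):
    """Return a list of items, or None if the feed is not well-formed."""
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)     # match on URI, not prefix
    parser.setFeature(feature_external_ges, False)  # do not fetch external entities
    handler = ItemHandler()
    parser.setContentHandler(handler)
    try:
        parser.parse(io.BytesIO(data))
    except xml.sax.SAXParseException:
        return None
    return handler.items
```

Because elements are matched on the namespace URI rather than the prefix, a feed that declares the content namespace under an unusual prefix is read the same way as one that uses the conventional prefix. A return value of None corresponds to the case where the built-in regular-expression importer is the better choice.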

Versions are available for WordPress 0.9 through 1.2.

More Info and Download

Open Source, Software, General | Posted at 8:15 AM

