I finally broke down and purchased the Developers Resource Kit 4 which includes the RSS Untangler (RSSU) object. Once I unzipped the code samples and did some studying up on how it was called, I quickly realized that there was probably an easier way to use the object than they were showing.
In this tutorial, we'll install the RSSU object, write a CFC that uses the object to parse RSS feeds, and then write some code that calls the CFC. It's all very simple stuff, but getting there took some figuring out.
Installing the RSSU Object
Obviously, before you can install the object you need to purchase it. Macromedia's DRK4 is a pretty good resource, and if you're using RSS a lot in your code, it's worth the $99.00. You can use the Java object in anything that supports Java objects as well, so keep that in mind.
Unzip the DRK4 and look for the rssu.zip file in DevResKit_V4\coldfusion\. Unzip the rssu.zip file and copy the rssu.jar file to your class path.
Now, before going on, let me talk about ColdFusion and class paths. I never know where my default class path is. I have development boxes that use the internal ColdFusion webserver, and that standard install puts a class path somewhere I always forget. I have other development servers using Apache, and that standard install puts the class path somewhere I also always forget. And I have a production Linux server using Apache, which you might guess I also forget the class path for. It's probably comical to watch as I flounder whenever looking for the class path.
To get around this bit of discomfort, I usually create a folder somewhere that is consistent among all my environments, and is secure from the outside world. I specify that class path in the ColdFusion Admin and I'm confused no more.
Now, back to the subject - once you've copied the rssu.jar file to your class path, you should restart ColdFusion. I'm not sure if it's required, but I will say this: I have never been able to call a new Java class without first restarting ColdFusion. Perhaps there's an easier way I'm unaware of.
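If you want to confirm that ColdFusion can actually see the jar after the restart, a quick check along these lines may help. This is only a sketch: the file name checkRssu.cfm is made up, and the class path it prints is the JVM's own, which may not list every directory ColdFusion's class loader searches.

```cfml
<!--- checkRssu.cfm: hypothetical file name for a quick class path check --->
<cftry>
    <!--- same class name the CFC in this tutorial uses --->
    <cfset testParser = createObject("java", "com.macromedia.rssu.AutoParser")>
    <cfoutput><p>rssu.jar was found on the class path.</p></cfoutput>
    <cfcatch type="any">
        <!--- createObject should throw if the class cannot be located --->
        <cfoutput><p>Could not load the RSSU class: #HTMLEditFormat(cfcatch.message)#</p></cfoutput>
    </cfcatch>
</cftry>

<!--- print the JVM class path as a hint for where to look --->
<cfset jvmClassPath = createObject("java", "java.lang.System").getProperty("java.class.path")>
<cfoutput><p>#HTMLEditFormat(jvmClassPath)#</p></cfoutput>
```

If the first message doesn't appear, the jar is probably in a folder ColdFusion isn't watching, or the server hasn't been restarted since you copied it.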
Writing Some Code
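Create a CFC anywhere you like and call it anything you like. For simplicity, I've created an rssu folder off my webroot where I've placed my CFC. The calling code later in this tutorial refers to the component as "rss", so it assumes the file is saved as rss.cfc. Here's what the CFC looks like: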
<cfcomponent>
    <cffunction name="parseURL" output="No" returntype="query">
        <cfargument name="url" required="yes" type="string">
        <cfscript>
            // create a container query that we'll use to send back two query objects, one for the RSS feed info and one for the entries
            qryFeedContainer = QueryNew("feedInfo,feed");
            // create our query objects
            qryFeedInfo = QueryNew("title,url"); // * will add more columns soon
            qryFeed = QueryNew("title,permalink,content,category,author,date");
            // create the rssu object
            parser = createObject("java", "com.macromedia.rssu.AutoParser");
            parser.init();
            // build a java.net.URL object from the address that was passed in
            url = createObject("java", "java.net.URL").init(arguments.url);
            // get and parse the feed
            feed = parser.parse(url);
            entries = feed.getItems();
            // drop the feed info into the first query
            QueryAddRow(qryFeedInfo);
            QuerySetCell(qryFeedInfo, "title", feed.getTitle());
            QuerySetCell(qryFeedInfo, "url", feed.getLink()); // * will add more columns soon
            // loop over the entries, filling the columns for each one
            for (i=1; i lte arrayLen(entries); i=i+1) {
                QueryAddRow(qryFeed);
                QuerySetCell(qryFeed, "title", entries[i].getTitle());
                QuerySetCell(qryFeed, "permalink", entries[i].getLink());
                QuerySetCell(qryFeed, "content", entries[i].getDescription());
                QuerySetCell(qryFeed, "category", entries[i].getCategory());
                QuerySetCell(qryFeed, "author", entries[i].getAuthor());
                QuerySetCell(qryFeed, "date", entries[i].getPubDate());
            }
            // put our query objects into the query container
            QueryAddRow(qryFeedContainer);
            QuerySetCell(qryFeedContainer, "feedInfo", qryFeedInfo);
            QuerySetCell(qryFeedContainer, "feed", qryFeed);
        </cfscript>
        <cfreturn qryFeedContainer>
    </cffunction>
    <cffunction name="jsFeed" output="No" returntype="string">
        <cfargument name="feedURL" required="yes" type="string">
        <cfargument name="entries" type="numeric" required="no" default="0">
        <!--- parseURL names its argument "url", so pass the feed address under that name --->
        <cfinvoke component="rss" method="parseURL" returnvariable="qryFeedContainer">
            <cfinvokeargument name="url" value="#arguments.feedURL#">
        </cfinvoke>
        <cfset qryFeed = qryFeedContainer.feed>
        <!--- build the JS string to send back; the variable is not called jsFeed so it can't overwrite this method --->
        <cfset jsOutput = "">
        <cfif arguments.entries eq 0>
            <!--- 0 means "no limit": write out every entry --->
            <cfoutput query="qryFeed">
                <cfset jsOutput = jsOutput & "document.writeln(""<a href='#permalink#' class='rssLink'>#title#</a><br />"");" & chr(13) & chr(10)>
            </cfoutput>
        <cfelse>
            <cfoutput query="qryFeed" maxrows="#arguments.entries#">
                <cfset jsOutput = jsOutput & "document.writeln(""<a href='#permalink#' class='rssLink'>#title#</a><br />"");" & chr(13) & chr(10)>
            </cfoutput>
        </cfif>
        <cfreturn jsOutput>
    </cffunction>
</cfcomponent>
The most important part of the parseURL method is where we call the RSSU object.
parser = createObject("java", "com.macromedia.rssu.AutoParser");
parser.init();
It's very simple. A lot of the examples supplied by Macromedia seem to be a lot more complicated, but really, the parser object we're creating is ready to go with those two lines of code.
The second most important part is where we create a URL object that contains the location of the feed we'll be parsing, followed by the actual call to the parser object.
url = createObject("java", "java.net.URL").init(arguments.url);
...
feed = parser.parse(url);
If you were to consolidate the above snippets of code into a single test file, you'd be able to get some immediate results. Dump the feed object using CFDUMP, and you'll see all of the methods within the object. Dump feed.getItems() and you'll see all the entries in the feed. It's a wonderful thing.
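A consolidated test file might look something like the sketch below. The file name rssuTest.cfm and the feed address are placeholders of mine, so substitute a feed you know is well-formed. The variable name feedLocation is also my own choice.

```cfml
<!--- rssuTest.cfm: hypothetical file name for a standalone test --->
<!--- placeholder address: replace it with a real, well-formed feed --->
<cfset feedAddress = "http://www.example.com/index.xml">
<cfscript>
    // create the parser exactly as the CFC does
    parser = createObject("java", "com.macromedia.rssu.AutoParser");
    parser.init();
    // wrap the address in a java.net.URL object and parse it
    feedLocation = createObject("java", "java.net.URL").init(feedAddress);
    feed = parser.parse(feedLocation);
</cfscript>
<!--- the first dump shows the feed object's methods, the second shows the entries --->
<cfdump var="#feed#">
<cfdump var="#feed.getItems()#">
```

Running this outside the CFC is a handy way to see which getters are available before deciding which columns your queries need.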
Something that might stick out about the code in the CFC is the fact that I'm creating two queries that I then return in another query. This is by no means the only way, but I find it helpful to separate the metadata of the RSS feed (site title, feed info, etc.) from the part of the feed that contains the entries. Also, there are more fields available from the RSS feed. I've only used what I need.
Using Our CFC
Now that we've created a CFC, we can put it to good use. I'm using the CFC above to consume RSS feeds and then display them on this site. There were a few obstacles to overcome in doing this. The first is that this site runs on TypePad, which currently doesn't provide the ability to consume RSS feeds or run ColdFusion code. Anil suggests that the former will eventually be available, but I'm pretty sure it's safe to say it will never support the latter. I had to get tricky.
I wrote a file called jsFeed.cfm that calls the CFC I wrote above, then returns the results of the CFC as JavaScript code that writes out the feeds and URLs to the actual articles. When you boil it all down, it's all very simple, but at first it was quite the puzzle to figure out.
The jsFeed.cfm code looks like this:
<cfsetting enablecfoutputonly="Yes">
<cfinvoke component="rss" method="jsFeed" returnvariable="js">
<cfinvokeargument name="feedURL" value="#url.feedURL#">
<cfinvokeargument name="entries" value="#url.entries#">
</cfinvoke>
<cfoutput>#js#</cfoutput>
<cfsetting enablecfoutputonly="No">
I include that code on this site by making a simple JavaScript call to the file. Your browser assumes that the output of jsFeed.cfm is valid JavaScript (and it is) and writes the results to the rendered HTML page.
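For reference, a call like that might look something like the sketch below. The host name, the /rssu/ path, the feed address, and the entry count of 5 are all placeholders I made up for illustration. I've written it as a small ColdFusion test page so that URLEncodedFormat can encode the feed address for the query string. On a static page, you would paste the rendered script tag instead.

```cfml
<!--- testFeed.cfm: hypothetical test page; the host, path, and feed address are placeholders --->
<cfset feedAddress = "http://www.example.com/index.xml">
<cfoutput>
<!--- the browser requests jsFeed.cfm and runs the document.writeln calls it returns --->
<script type="text/javascript"
        src="http://www.example.com/rssu/jsFeed.cfm?feedURL=#URLEncodedFormat(feedAddress)#&amp;entries=5"></script>
</cfoutput>
```

Both feedURL and entries are passed here because jsFeed.cfm reads url.feedURL and url.entries directly. Leaving either one off would raise an error before the CFC is ever called.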
Something I haven't done, which I should, is add error handling to the CFC that consumes the feed, to the jsFeed.cfm file, or to both. The RSSU has a problem if the feed supplied to it is malformed, which happens more often than you'd think. All it takes is an ampersand in the wrong place, and it breaks the XML well-formedness of the feed.
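One way to approach it, as a rough sketch rather than something I've put into production, is to wrap the call in jsFeed.cfm with cftry and hand back a harmless line of JavaScript when anything goes wrong. The fallback message is made up for illustration.

```cfml
<!--- a variation on jsFeed.cfm with basic error handling --->
<cfsetting enablecfoutputonly="Yes">
<cftry>
    <cfinvoke component="rss" method="jsFeed" returnvariable="js">
        <cfinvokeargument name="feedURL" value="#url.feedURL#">
        <cfinvokeargument name="entries" value="#url.entries#">
    </cfinvoke>
    <cfcatch type="any">
        <!--- if parsing fails, still return valid JavaScript so the host page doesn't break --->
        <cfset js = "document.writeln('Feed unavailable.');">
    </cfcatch>
</cftry>
<cfoutput>#js#</cfoutput>
<cfsetting enablecfoutputonly="No">
```

Catching type="any" is deliberately broad, since the failure may surface as a Java exception from the parser rather than a ColdFusion error. You could log cfcatch.message inside the cfcatch block if you want to know which feeds are misbehaving.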
Summary
Don't put a lot of time into figuring out the samples that come with the RSSU. On the three machines where I've tried them, they've all failed right out of the box. Distill it down, as I have, to the most basic elements. Make sure that you have the rssu.jar file in an active class path. Make sure you've restarted ColdFusion after putting the rssu.jar file in the active class path. Make sure the feed you're testing is valid XML, RSS, or RDF.