An introduction to RSS news feeds
By James Lewin - 2004-04-05

Parsing RSS files

Once you start working with RSS files, you will want to parse the file back into discrete units of information. You can do this with the help of a variety of open-source tools written in Java, Perl, PHP, and even ASP. The parser reads a stream of XML text, identifies the opening and closing tags, finds the text enclosed in each tag, and creates handles to work with the parsed information. Once parsed, this information can be incorporated into dynamically generated pages.
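
To make the steps in that description concrete, here is a minimal sketch built on the XML::Parser module from CPAN. The inline feed, the @open stack, and the @events list are all invented for illustration. The handlers simply record which element each piece of text was found in.

```perl
use strict;
use warnings;
use XML::Parser;

# A tiny made-up feed, kept inline so the sketch needs no network access.
my $feed = <<'XML';
<?xml version="1.0"?>
<rss version="0.91">
  <channel>
    <title>Example News</title>
    <link>http://www.example.com/</link>
    <description>Sample channel</description>
    <item>
      <title>First headline</title>
      <link>http://www.example.com/1.html</link>
    </item>
  </channel>
</rss>
XML

my @open;      # names of the elements that are currently open
my @events;    # one "path = text" entry per piece of enclosed text

my $parser = XML::Parser->new(
    Handlers => {
        # Called at each opening tag: remember which element we are inside.
        Start => sub { my ($expat, $name) = @_; push @open, $name; },
        # Called at each closing tag: leave the current element.
        End   => sub { pop @open; },
        # Called for text between tags.
        Char  => sub {
            my ($expat, $text) = @_;
            return unless $text =~ /\S/;    # ignore whitespace between tags
            push @events, join('/', @open) . " = $text";
        },
    },
);

$parser->parse($feed);
print "$_\n" for @events;
```

Each recorded entry pairs an element path with its text, for example rss/channel/title = Example News. Note that the Char handler may be called more than once for a single run of text in larger documents, so real code usually accumulates text until the End handler fires. Higher-level modules such as XML::RSS take care of this bookkeeping for you.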

Listings 7 and 8 show two simple Perl programs that read RSS files. Even if you don't write Perl, the examples may give you some ideas that you can use in your own development environment.

Perl is a great language for manipulating RSS files; there is a substantial amount of open-source code readily available to help get you started. Jonathan Eisenzopf has developed the XML::RSS module, which writes and parses RSS files. To take advantage of this parser, you will also need the XML::Parser module. These two Perl modules are available for free at CPAN (see Resources).

Here is an example of how XML::RSS can be used:

Listing 7. A Perl example using XML::RSS
# Setup includes
use strict;
use warnings;
use XML::RSS;
use LWP::Simple;
# Declare variable for the RSS content to be parsed
my $url2parse;
# Get the command-line argument (the URL of the RSS file)
my $arg = shift or die "Usage: $0 <rss-url>\n";
# Create new instance of XML::RSS
my $rss = XML::RSS->new;
# Get the URL, assign its content to url2parse, and then parse the RSS content
$url2parse = get($arg);
die "Could not retrieve $arg" unless $url2parse;
$rss->parse($url2parse);

This script takes the URL of an RSS file as its command-line argument, retrieves the file, and parses it. Once parsed, the elements of the RSS file can be used in many ways. For example, you could use RSS items to create a list of headlines:

# Print the channel items
foreach my $item (@{$rss->{'items'}}) {
     next unless defined($item->{'title'}) && defined($item->{'link'});
     print "<li><a href=\"$item->{'link'}\">$item->{'title'}</a><BR>\n";
}

This sample loops through the array of RSS items, verifying that each item comes complete with a title and link. Incomplete items are skipped; complete items are included in a list of linked headlines.
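
Titles and links come from someone else's feed, so they may contain characters such as & or < that are special in HTML. The following is a minimal sketch of one way to handle that, assuming the parsed values are plain text. The helpers escape_html and headline_list are hypothetical names and are not part of XML::RSS. The sample items are made up, but they have the same shape as the entries in $rss->{'items'}.

```perl
use strict;
use warnings;

# Hypothetical helper: escape the characters that are special in HTML.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;     # must run first so later entities are not re-escaped
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Hypothetical helper: build an HTML list from item hashes, skipping
# any item that lacks a title or a link.
sub headline_list {
    my (@items) = @_;
    my $html = "<ul>\n";
    foreach my $item (@items) {
        next unless defined($item->{'title'}) && defined($item->{'link'});
        $html .= sprintf qq{<li><a href="%s">%s</a></li>\n},
            escape_html($item->{'link'}), escape_html($item->{'title'});
    }
    return $html . "</ul>\n";
}

# Made-up items for illustration; the second one has no link and is skipped.
my @sample_items = (
    { title => 'Q&A: RSS <explained>', link => 'http://www.example.com/?a=1&b=2' },
    { title => 'No link here' },
);
print headline_list(@sample_items);
```

With a parsed feed, the same call would be headline_list(@{$rss->{'items'}}).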

If you plan to use the XML::RSS module, open and read it with any text editor; it is heavily commented with suggestions for using it effectively.

Once you have tried your hand at RSS files, you'll find that there are many ways that you can use them. For example, you can write scripts that generate RSS summaries every time your site is updated, or scripts that periodically retrieve news from other sites and automatically update your own news page. (How to write those scripts is fodder for another article, but you may find some useful open-source tools to automatically generate RSS summaries in the tool sources listed in Resources.)
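
As a rough idea of the first kind of script, here is a minimal sketch that uses the writing side of XML::RSS. The helper build_feed, the channel details, and the story data are all invented for illustration. A real script would pull its stories from wherever your site keeps its updates.

```perl
use strict;
use warnings;
use XML::RSS;

# Hypothetical helper: turn a list of { title, link, description } hashes
# into an RSS 1.0 document and return it as a string.
sub build_feed {
    my (@stories) = @_;
    my $rss = XML::RSS->new(version => '1.0');

    # Channel details are placeholders; substitute your own site's values.
    $rss->channel(
        title       => 'Example Site News',
        link        => 'http://www.example.com/',
        description => 'Recent updates to an example site',
    );

    # Add one RSS item per story.
    foreach my $story (@stories) {
        $rss->add_item(
            title       => $story->{title},
            link        => $story->{link},
            description => $story->{description},
        );
    }
    return $rss->as_string;
}

# Made-up story data for illustration.
my $xml = build_feed(
    {
        title       => 'Site redesign launched',
        link        => 'http://www.example.com/news/1.html',
        description => 'A short summary of the change.',
    },
);
print $xml;
```

The module also documents a save method for writing the result to a file instead of returning it as a string.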

I've offered a few suggestions for creating and using RSS files. The resource section provides additional information, such as sources for RSS files, the RSS specifications, and places where you can post your headlines.

First published by IBM developerWorks

Copyright 2004-2019 All rights reserved.
Article copyright and all rights retained by the author.