Difference between revisions of "Extension talk:SimpleRSS.php"

From Organic Design wiki

Revision as of 01:13, 18 October 2007

New version available
Doesn't do the aggregation yet but now working as a parser function. See docs here.

Installation

Create a new directory in extensions called RobsRSSExtension. Download Magpie RSS and unpack the tarball into that directory. The resulting structure should look something like the one shown in RSS-install-folder-structure.png.

Add to your LocalSettings.php:

include("extensions/RobsRSSExtension/RobsRSSExtension.php");

On a wiki page, paste in this wikitext to test:

<rss>
http://angelacarter.co.nz/blog/?feed=rss2
http://rss.slashdot.org/Slashdot/slashdot
http://digg.com/rss/index.xml
</rss>

NOTE: This extension isn't working yet. At present it gives debugging output.
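The body of the rss tag above is just a newline-separated list of feed URLs, so the first job of the tag hook is to turn that text into a clean list. A minimal sketch of that step, assuming the hook receives the raw text between the tags as a string; rssTagUrls is a hypothetical helper name, not something in the current code:

```php
<?php
// Hypothetical helper: split the text between <rss> and </rss> into feed URLs.
function rssTagUrls($input) {
    $urls = array();
    // \R matches any newline sequence (\n, \r\n or \r)
    foreach (preg_split('/\R/', $input) as $line) {
        $line = trim($line);
        // Keep only lines that look like http(s) URLs; skip blanks and anything else
        if (preg_match('#^https?://\S+$#i', $line)) {
            $urls[] = $line;
        }
    }
    return $urls;
}

// Example with two of the test feeds and a stray blank line
$body = "\nhttp://rss.slashdot.org/Slashdot/slashdot\n\nhttp://digg.com/rss/index.xml\n";
print_r(rssTagUrls($body));
```

Filtering out non-URL lines here keeps the later fetch code from having to deal with blank or malformed entries.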

RSS keys

In the process of development I've found that different RSS feeds use a variety of keys to represent information. We need to arrive at a common set that we can support and supply to the template for rendering. We may want to 'squash' keys in some reasonable manner to provide consistent keys for similar data.

The extension is set up to test this at present. Although it is retrieving the data for RSS correctly, it only outputs the keys used by each feed for channel, and items in the feed. This way we can arrive at a good set of keys that are widely supported.

Another option might be to publish all keys that are not arrays.
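Both ideas (squashing similar keys, and publishing only non-array values) could be combined in one small function. This is only a sketch: the alias table below is a guess at the kind of pairs we might settle on, not an agreed set, and squashKeys is a hypothetical name.

```php
<?php
// Hypothetical alias table: variant key => common key.
// These pairs are illustrative assumptions, not the agreed set.
$rssKeyAliases = array(
    'dc:creator' => 'author',
    'creator'    => 'author',
    'pubdate'    => 'date',
    'dc:date'    => 'date',
    'summary'    => 'description',
);

// Reduce one parsed item to a flat array with consistent keys.
function squashKeys($item, $aliases) {
    $out = array();
    foreach ($item as $key => $value) {
        // Publish only keys whose values are not arrays
        if (is_array($value)) {
            continue;
        }
        $key = strtolower($key);
        if (isset($aliases[$key])) {
            $key = $aliases[$key];
        }
        // If two variants map to the same common key, the first one wins
        if (!isset($out[$key])) {
            $out[$key] = $value;
        }
    }
    return $out;
}
```

Keeping the aliases in a plain array means the set can be adjusted as more feeds are tested without touching the function itself.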

Features

We want a more solid solution for the wiki RSS reading and rendering than the current MW:Extension:RSS Reader, including the following main concerns:

  • Single script install with no dependencies - Rob wants to use Magpie initially to get it up and running asap.
  • Syntax and parameters compatible with MW:Extension:RSS Reader (but we should add parser-function support too)
  • Aggregation by specifying a list of urls
  • More control over what fields are displayed and their layout (with an extra template parameter?), for example so that an RSS-based replacement for recentchanges could be made
  • Optional Ajax so page load isn't held up and can update on change without page reloads
  • Control over file caching time and ajax update time
  • Have more control over descriptions
Yes the code should grab all available fields and then allow the row-template to decide which ones are displayed and where. --Nad 10:08, 16 October 2007 (NZDT)
The key thing we want is aggregation which doesn't seem to be part of their plan (also ajax is necessary with aggregation since page load time will be very slow if reading headers from many sources) --Nad 10:12, 16 October 2007 (NZDT)
  • image and video support?
Rob said that another reason for using Magpie is its support for Atom, which handles the different media types and encoding formats properly and so works well for video etc. --Nad 18:14, 16 October 2007 (NZDT)
  • Another idea I'm looking into (which is related but not part of this extension) is that we should be able to use DPL in combination with Extension:XmlOutput.php to produce dynamic query-generated RSS feeds. --Nad 18:17, 16 October 2007 (NZDT)
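For the aggregation item in the list above, the core step is merging the items from several feeds into one list ordered by date. A rough sketch, assuming each feed has already been parsed into an array of items and each item carries its date string under a 'date' key (both assumptions; aggregateItems is a hypothetical name):

```php
<?php
// Merge items from several feeds into one list, newest first.
// $feeds is an array of item lists; each item is an associative array.
function aggregateItems($feeds, $limit = 10) {
    $all = array();
    foreach ($feeds as $items) {
        foreach ($items as $item) {
            $ts = isset($item['date']) ? strtotime($item['date']) : false;
            if ($ts === false) {
                continue; // skip items with a missing or unparseable date
            }
            $item['timestamp'] = $ts;
            $all[] = $item;
        }
    }
    // Sort by timestamp, descending
    usort($all, function ($a, $b) {
        return $b['timestamp'] - $a['timestamp'];
    });
    return array_slice($all, 0, $limit);
}
```

Undated items are dropped here rather than guessed at; whether that is the right behaviour depends on how many feeds turn out to omit dates.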

Purpose

We want to have a dynamically changing list of user-customisable aggregated RSS sources available on pages and in the sidebar. These sources can include recentchanges feeds from our wikia so we can have the old changes-in-the-logo style effect, but using standard feeds from any sources and without consuming server resources from centralised polling.

Development

Rob has had experience with developing RSS-related code before so he'll handle the main functionality and structure of the extension. Nad will handle the XML/DOM and Ajax aspects.

Testing

Installed ok (lists tag in Special:Version but not author details etc). Does not seem to give output even when arbitrary output is hardwired into the return value. Maybe not quite installed properly? --Rob 18:25, 14 October 2007 (NZDT)

WikiFS:Special:Version isn't showing the rss tag but there is an empty tag there - I'll have a look in a sec... --Nad 18:59, 14 October 2007 (NZDT)
It's not showing up in the credits either so it hasn't installed, only the setup function has registered. --Nad 19:04, 14 October 2007 (NZDT)
That fixed it - wrong syntax for setting tag hook - don't forget to use action=purge when testing too as code changes don't invalidate cache --Nad 19:15, 14 October 2007 (NZDT)

RSS parsing

The current extension uses an external dependency, lastRSS.php, to read the RSS with a simple file-based cache/expiry system using MD5(url) for the filename. It may be doing some things we need to replicate ourselves, but most of the code is concerned with regular expressions for handling the XML, and that aspect can be handled much more compactly by reading the RSS as a DOM object instead.
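To give an idea of how compact the DOM approach could be, here is a sketch of the two pieces described above: an MD5(url) cache filename with an age check, and item extraction with DOMDocument. The function names, the '.xml' suffix and the cache directory argument are all assumptions for illustration, not taken from lastRSS.php.

```php
<?php
// Cache file path using MD5 of the feed URL as the filename.
// The '.xml' suffix is an arbitrary choice for this sketch.
function rssCachePath($cacheDir, $url) {
    return rtrim($cacheDir, '/') . '/' . md5($url) . '.xml';
}

// True if the cache file exists and is younger than $maxAge seconds.
function rssCacheFresh($path, $maxAge) {
    return file_exists($path) && (time() - filemtime($path)) < $maxAge;
}

// Read all <item> elements from an RSS string into flat arrays
// keyed by element name (e.g. 'title', 'link', 'dc:creator').
function rssItemsFromXml($xml) {
    $doc = new DOMDocument();
    // Suppress parser warnings; a document that fails to parse yields no items
    if (!@$doc->loadXML($xml)) {
        return array();
    }
    $items = array();
    foreach ($doc->getElementsByTagName('item') as $node) {
        $item = array();
        foreach ($node->childNodes as $child) {
            if ($child->nodeType === XML_ELEMENT_NODE) {
                $item[$child->nodeName] = trim($child->textContent);
            }
        }
        $items[] = $item;
    }
    return $items;
}
```

The caller would check rssCacheFresh first and only fetch the URL again when the cached copy has expired.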

Entities

Some news feeds (eg RSS2 from Wordpress) include entities in the title tags where they are illegal according to strict XML. This means the document is not well formed and the parser will break with an error like:

Warning: DOMDocument::load() [function.DOMDocument-load]:
    Entity 'raquo' not defined in http://angelacarter.co.nz/blog/?feed=rss2,
    line: 10 in /home/rob/www/extensions/RobsRSSExtension.php on line 63

To fix this problem we need to do a replace on the contents of title tags that contain entities and turn them into spaces.

That's why I reckoned we'd still want to take some of the code from lastRSS.php such as the following function which looks like it's made to handle what you're talking about:

<php>
// -------------------------------------------------------------------
// Replace HTML entities &something; by real characters
// -------------------------------------------------------------------
function unhtmlentities($string) {
   // Get HTML entities table
   $trans_tbl = get_html_translation_table (HTML_ENTITIES, ENT_QUOTES);
   // Flip keys<==>values
   $trans_tbl = array_flip ($trans_tbl);
   // Add support for &apos; entity (missing in HTML_ENTITIES)
   $trans_tbl += array('&apos;' => "'");
   // Replace entities by values
   return strtr ($string, $trans_tbl);
}
</php>
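One catch if we run a blanket replacement like the function above over a whole feed before parsing it: decoding &amp;amp; or &amp;lt; into real characters would itself break the XML. A possible variation, sketched below, leaves the five entities XML predefines alone, decodes the other HTML ones, and turns unknown names into a space as suggested earlier. The function name is hypothetical, and this version runs over the whole document rather than only the title tags.

```php
<?php
// Make HTML named entities safe for an XML parser.
// Hypothetical helper; applies to the whole document, not just <title>.
function decodeHtmlEntitiesForXml($xml) {
    // Entities predefined by XML itself must stay encoded
    $keep = array('amp', 'lt', 'gt', 'quot', 'apos');
    return preg_replace_callback(
        '/&([A-Za-z][A-Za-z0-9]*);/',
        function ($m) use ($keep) {
            if (in_array($m[1], $keep, true)) {
                return $m[0];
            }
            $char = html_entity_decode($m[0], ENT_QUOTES | ENT_HTML401, 'UTF-8');
            // html_entity_decode leaves unknown names unchanged;
            // replace those with a space so the document still parses
            return $char === $m[0] ? ' ' : $char;
        },
        $xml
    );
}

// The failing case from the warning above: &raquo; inside a title
$fixed = decodeHtmlEntitiesForXml('<title>Blog &raquo; Post &amp; more</title>');
$doc = new DOMDocument();
var_dump($doc->loadXML($fixed));
```

A known limitation of this sketch is that it also rewrites entity-like text inside CDATA sections, where it would have been harmless to leave it.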

Ajax

The Ajax client should be able to poll for headers only; when they change it should post a request to MediaWiki that includes the changed feed and the timestamp of the last update. MediaWiki should then use an AjaxDispatcher to respond with the new item(s), which the client can then prepend to the innerHTML of the target area.
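On the server side, the part that picks out "the new item[s]" could look something like the sketch below. It assumes items are flat arrays with 'title', 'link' and 'pubDate' keys and that the client sends its last-update time as a Unix timestamp; the function names and the list-item markup are made up for illustration, and the AjaxDispatcher wiring itself is not shown.

```php
<?php
// Return only the items published after the client's last update.
// $since is a Unix timestamp supplied by the Ajax request.
function itemsNewerThan($items, $since) {
    $new = array();
    foreach ($items as $item) {
        $ts = isset($item['pubDate']) ? strtotime($item['pubDate']) : false;
        if ($ts !== false && $ts > $since) {
            $new[] = $item;
        }
    }
    return $new;
}

// Render items as an HTML fragment the client can prepend to the target area.
function renderNewItems($items) {
    $html = '';
    foreach ($items as $item) {
        $link  = htmlspecialchars(isset($item['link']) ? $item['link'] : '', ENT_QUOTES, 'UTF-8');
        $title = htmlspecialchars(isset($item['title']) ? $item['title'] : '', ENT_QUOTES, 'UTF-8');
        $html .= '<li><a href="' . $link . '">' . $title . "</a></li>\n";
    }
    return $html;
}
```

Returning a ready-made fragment keeps the client code down to a single prepend, at the cost of fixing the row layout on the server.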

Special:Recentchanges example feed

<xml>
<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/css" href="http://www.organicdesign.co.nz/wiki/skins/common/feed.css?97"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>OrganicDesign - Recent changes [en]</title>
    <link>http://www.organicdesign.co.nz/Special:Recentchanges</link>
    <description>Track the most recent changes to the wiki in this feed.</description>
    <language>en</language>
    <generator>MediaWiki 1.11.0</generator>
    <lastBuildDate>Sun, 14 Oct 2007 05:06:24 GMT</lastBuildDate>
    <item>
      <title>Extension talk:Robs RSS extension.php</title>
      <link>http://www.organicdesign.co.nz/Extension_talk:Robs_RSS_extension.php</link>
      <description> </description>
      <pubDate>Sun, 14 Oct 2007 05:04:19 GMT</pubDate>
      <dc:creator>Rob</dc:creator>
      <comments>http://www.organicdesign.co.nz/Extension_talk:Robs_RSS_extension.php</comments>
    </item>
  </channel>
</rss>
</xml>

See also