Difference between revisions of "Extension talk:SimpleRSS.php"
(→Testing: fixed) |
(RSS aggregator) |
||
(28 intermediate revisions by 4 users not shown) | |||
Line 1: | Line 1: | ||
− | == | + | To test the extension you'll need to set up some templates: |
+ | *[[Template:RSSHead]] | ||
+ | *[[Template:RSSItem]] | ||
+ | *[[Template:RSSFoot]] | ||
+ | |||
+ | Each template is passed the keys in the RSS feed. | ||
+ | == Todo == | ||
+ | *<s>Templates to use can be passed params to #rss</s> | ||
+ | |||
+ | == Problem == | ||
+ | Where different feeds have different sets of keys: how to deal with them consistently when aggregating? | ||
+ | |||
+ | == Name == | ||
+ | Why did you make a new extension and not just move the other one? --[[User:Nad|Nad]] 17:03, 18 October 2007 (NZDT) | ||
+ | |||
+ | |||
+ | ==Installation== | ||
+ | Create a new directory in extensions called RobsRSSExtension. Download Magpie RSS from [http://optusnet.dl.sourceforge.net/sourceforge/magpierss/magpierss-0.72.tar.gz here] | ||
+ | Unpack the tarball into the directory. The structure should look something like this: [[Image:RSS-install-folder-structure.png]] | ||
+ | |||
+ | Add to your LocalSettings.php | ||
+ | <pre> | ||
+ | include("extensions/RobsRSSExtension/RobsRSSExtension.php"); | ||
+ | </pre> | ||
+ | |||
+ | On a wiki page paste in this wikitext to test | ||
+ | <pre> | ||
+ | <rss> | ||
+ | http://angelacarter.co.nz/blog/?feed=rss2 | ||
+ | http://rss.slashdot.org/Slashdot/slashdot | ||
+ | http://digg.com/rss/index.xml | ||
+ | </rss> | ||
+ | </pre> | ||
+ | |||
+ | NOTE: This extension isn't working yet. At present it gives debugging output. | ||
+ | |||
+ | ==RSS keys== | ||
+ | In the process of development I've found that different RSS feeds use a variety of keys to represent information. We need to arrive at a common set that we can support and supply to the template for rendering. We may want to 'squash' keys in some reasonable manner to provide consistent keys for similar data. | ||
+ | |||
+ | The extension is set up to test this at present. Although it is retrieving the data for RSS correctly, it only outputs the keys used by each feed for channel, and items in the feed. This way we can arrive at a good set of keys that are widely supported. | ||
+ | |||
+ | Another option might be to publish all keys that are not arrays. | ||
+ | |||
+ | *[[Extension:Robs RSS extension.php/List of keys|Example of output]] | ||
+ | |||
+ | == Features == | ||
We want a more solid solution for the wiki RSS reading and rendering than the current [[MW:Extension:RSS Reader]], including the following main concerns: | We want a more solid solution for the wiki RSS reading and rendering than the current [[MW:Extension:RSS Reader]], including the following main concerns: | ||
− | *Single script install with no dependencies | + | *<s>Single script install with no dependencies</s> ''- Rob wants to use [http://magpierss.sourceforge.net Magpie] initially to get it up and running asap. |
− | *Syntax and parameters compatible with [[MW:Extension:RSS Reader]] | + | *Syntax and parameters compatible with [[MW:Extension:RSS Reader]] (but we should add parser-function support too) |
*Aggregation by specifying a list of urls | *Aggregation by specifying a list of urls | ||
*More control over what fields are displayed and their layout (with an extra template parameter?) | *More control over what fields are displayed and their layout (with an extra template parameter?) | ||
Line 8: | Line 53: | ||
*Optional Ajax so page load isn't held up and can update on change without page reloads | *Optional Ajax so page load isn't held up and can update on change without page reloads | ||
*Control over file caching time and ajax update time | *Control over file caching time and ajax update time | ||
+ | *Have more control over descriptions | ||
+ | ::Yes the code should grab all available fields and then allow the row-template to decide which ones are displayed and where. --[[User:Nad|Nad]] 10:08, 16 October 2007 (NZDT) | ||
+ | :::The key thing we want is aggregation which doesn't seem to be part of their plan (also ajax is necessary with aggregation since page load time will be very slow if reading headers from many sources) --[[User:Nad|Nad]] 10:12, 16 October 2007 (NZDT) | ||
+ | *image and video support? | ||
+ | :Rob said that another reason for using Magpie is its support for atom which handles the different media types and encoding formats properly which allows it to work well for video etc --[[User:Nad|Nad]] 18:14, 16 October 2007 (NZDT) | ||
+ | *Another idea I'm looking into (which is related but not part of this extension) is that we should be able to use DPL in combination with [[Extension:XmlOutput.php]] to produce dynamic query-generated RSS feeds. --[[User:Nad|Nad]] 18:17, 16 October 2007 (NZDT) | ||
== Purpose == | == Purpose == | ||
Line 16: | Line 67: | ||
==Testing== | ==Testing== | ||
+ | *[http://wikifs.org/Sandbox Here] | ||
Installed ok (lists tag in Special:Version but not author details etc). Does not seem to give output even when arbitrary output is hardwired into the return value. Maybe not quite installed properly? --[[User:Rob|Rob]] 18:25, 14 October 2007 (NZDT) | Installed ok (lists tag in Special:Version but not author details etc). Does not seem to give output even when arbitrary output is hardwired into the return value. Maybe not quite installed properly? --[[User:Rob|Rob]] 18:25, 14 October 2007 (NZDT) | ||
:[[WikiFS:Special:Version]] isn't showing the rss tag but there is an empty tag there - I'll have a look in a sec... --[[User:Nad|Nad]] 18:59, 14 October 2007 (NZDT) | :[[WikiFS:Special:Version]] isn't showing the rss tag but there is an empty tag there - I'll have a look in a sec... --[[User:Nad|Nad]] 18:59, 14 October 2007 (NZDT) | ||
::It's not showing up in the credits either so it hasn't installed, only the setup function has registered. --[[User:Nad|Nad]] 19:04, 14 October 2007 (NZDT) | ::It's not showing up in the credits either so it hasn't installed, only the setup function has registered. --[[User:Nad|Nad]] 19:04, 14 October 2007 (NZDT) | ||
:::That fixed it - wrong syntax for setting tag hook - don't forget to use action=purge when testing too as code changes don't invalidate cache --[[User:Nad|Nad]] 19:15, 14 October 2007 (NZDT) | :::That fixed it - wrong syntax for setting tag hook - don't forget to use action=purge when testing too as code changes don't invalidate cache --[[User:Nad|Nad]] 19:15, 14 October 2007 (NZDT) | ||
− | |||
== RSS parsing == | == RSS parsing == | ||
The current extension uses an external dependency, [http://lastrss.oslab.net/lastRSS.phps lastRSS.php] to read the RSS with simple file-based cache/expiry system using MD5(url) for filename. It may be doing some things we need to replicate ourselves, but most of the code is concerned with regular expressions for handling the XML, and that aspect can be handled much more compactly by reading the RSS as a DOM object instead. | The current extension uses an external dependency, [http://lastrss.oslab.net/lastRSS.phps lastRSS.php] to read the RSS with simple file-based cache/expiry system using MD5(url) for filename. It may be doing some things we need to replicate ourselves, but most of the code is concerned with regular expressions for handling the XML, and that aspect can be handled much more compactly by reading the RSS as a DOM object instead. | ||
+ | ===Entities=== | ||
+ | Some news feeds (eg RSS2 from Wordpress) include entities in the ''title'' tags where they are illegal according to strict XML. This means the document is not well formed and the parser will break with an error like: | ||
+ | <pre> | ||
+ | Warning: DOMDocument::load() [function.DOMDocument-load]: | ||
+ | Entity 'raquo' not defined in http://angelacarter.co.nz/blog/?feed=rss2, | ||
+ | line: 10 in /home/rob/www/extensions/RobsRSSExtension.php on line 63 | ||
+ | </pre> | ||
+ | To fix this problem we need to do a replace on the contents of title tags that contain entities and turn them into spaces. | ||
+ | :That's why I reckoned we'd still want to take some of the code from [http://lastrss.oslab.net/lastRSS.phps lastRSS.php] such as the following function which looks like it's made to handle what you're talking about: | ||
+ | <php> | ||
+ | // ------------------------------------------------------------------- | ||
+ | // Replace HTML entities &something; by real characters | ||
+ | // ------------------------------------------------------------------- | ||
+ | function unhtmlentities($string) { | ||
+ | // Get HTML entities table | ||
+ | $trans_tbl = get_html_translation_table (HTML_ENTITIES, ENT_QUOTES); | ||
+ | // Flip keys<==>values | ||
+ | $trans_tbl = array_flip ($trans_tbl); | ||
+ | // Add support for &apos; entity (missing in HTML_ENTITIES) | ||
+ | $trans_tbl += array('&apos;' => "'"); | ||
+ | // Replace entities by values | ||
+ | return strtr ($string, $trans_tbl); | ||
+ | } | ||
+ | </php> | ||
== Ajax == | == Ajax == | ||
Line 58: | Line 133: | ||
*<s>[http://www.mediawiki.org/wiki/Extension:FeedImport Extension:FeedImport]</s> ''- this is an even bigger dependency than [[MW:Extension:RSS Reader|RSS Reader]]'' | *<s>[http://www.mediawiki.org/wiki/Extension:FeedImport Extension:FeedImport]</s> ''- this is an even bigger dependency than [[MW:Extension:RSS Reader|RSS Reader]]'' | ||
*[[w:RSS (file format)#RSS_2.0]] | *[[w:RSS (file format)#RSS_2.0]] | ||
+ | |||
+ | ==Unnecessary fork?== | ||
+ | [[Extension:SimpleRSS]] looks like it can be deleted --[[User:Sven|Sven]] 11:19, 16 November 2007 (NZDT) | ||
__NOTOC__ | __NOTOC__ | ||
+ | == RSS aggregator == | ||
+ | Have you seen this? [[M:XFeed - RSS Feed Aggregator]], I found it on http://www.ehartwell.com/InfoDabble/index.php?title=Special:Version --[[User:Sven|Sven]] 13:11, 15 November 2007 (NZDT) |
Latest revision as of 22:21, 15 November 2007
To test the extension you'll need to set up some templates:
- Template:RSSHead
- Template:RSSItem
- Template:RSSFoot
Each template is passed the keys in the RSS feed.
Todo
- Templates to use can be passed as params to #rss (struck out)
Problem
Where different feeds have different sets of keys: how to deal with them consistently when aggregating?
Name
Why did you make a new extension and not just move the other one? --Nad 17:03, 18 October 2007 (NZDT)
Installation
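Create a new directory in extensions called RobsRSSExtension. Download Magpie RSS from http://optusnet.dl.sourceforge.net/sourceforge/magpierss/magpierss-0.72.tar.gz and unpack the tarball into that directory. The structure should look something like the one shown in Image:RSS-install-folder-structure.png.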
Add to your LocalSettings.php
include("extensions/RobsRSSExtension/RobsRSSExtension.php");
On a wiki page paste in this wikitext to test
<rss>
http://angelacarter.co.nz/blog/?feed=rss2
http://rss.slashdot.org/Slashdot/slashdot
http://digg.com/rss/index.xml
</rss>
NOTE: This extension isn't working yet. At present it gives debugging output.
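As a rough sketch of how the tag body above might be turned into a list of feed URLs (the function name parseFeedUrls is hypothetical and not taken from the extension, and the real code may handle input differently):

```php
<?php
// Hypothetical helper: split the text between <rss> and </rss> into feed URLs.
function parseFeedUrls(string $input): array {
    // Split on any run of whitespace (spaces or newlines), dropping empty pieces.
    $parts = preg_split('/\s+/', trim($input), -1, PREG_SPLIT_NO_EMPTY);

    // Keep only strings that look like http(s) URLs.
    $urls = array_filter($parts, function ($part) {
        return preg_match('#^https?://#i', $part) === 1;
    });

    // Drop duplicates and reindex from zero.
    return array_values(array_unique($urls));
}
```

Splitting on whitespace means one URL per line or several on one line both work.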
RSS keys
In the process of development I've found that different RSS feeds use a variety of keys to represent information. We need to arrive at a common set that we can support and supply to the template for rendering. We may want to 'squash' keys in some reasonable manner to provide consistent keys for similar data.
The extension is set up to test this at present. Although it is retrieving the data for RSS correctly, it only outputs the keys used by each feed for channel, and items in the feed. This way we can arrive at a good set of keys that are widely supported.
Another option might be to publish all keys that are not arrays.
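A minimal sketch of what 'squashing' could look like, combined with the option of only publishing non-array keys. The alias map and the function name squashKeys are assumptions made up for illustration; the real set of common keys still has to be worked out from the feeds:

```php
<?php
// Hypothetical alias map: variant key => common key supplied to the templates.
$keyAliases = [
    'dc:creator' => 'author',
    'creator'    => 'author',
    'pubdate'    => 'date',
    'dc:date'    => 'date',
    'updated'    => 'date',
    'summary'    => 'description',
];

function squashKeys(array $item, array $aliases): array {
    $out = [];
    foreach ($item as $key => $value) {
        // Only publish keys whose values are not arrays.
        if (is_array($value)) {
            continue;
        }
        // Compare keys case-insensitively, then map variants to the common name.
        $key = strtolower($key);
        $common = $aliases[$key] ?? $key;
        // If two variants map to the same common key, the first one seen wins.
        if (!isset($out[$common])) {
            $out[$common] = $value;
        }
    }
    return $out;
}
```

With this approach the row template only ever sees the common names, whatever the source feed called them.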
Features
We want a more solid solution for the wiki RSS reading and rendering than the current MW:Extension:RSS Reader, including the following main concerns:
- Single script install with no dependencies (struck out) - Rob wants to use Magpie initially to get it up and running asap.
- Syntax and parameters compatible with MW:Extension:RSS Reader (but we should add parser-function support too)
- Aggregation by specifying a list of urls
- More control over what fields are displayed and their layout (with an extra template parameter?)
- For example so that an RSS-based replacement for the recentchanges could be made
- Optional Ajax so page load isn't held up and can update on change without page reloads
- Control over file caching time and ajax update time
- Have more control over descriptions
- Yes the code should grab all available fields and then allow the row-template to decide which ones are displayed and where. --Nad 10:08, 16 October 2007 (NZDT)
- The key thing we want is aggregation which doesn't seem to be part of their plan (also ajax is necessary with aggregation since page load time will be very slow if reading headers from many sources) --Nad 10:12, 16 October 2007 (NZDT)
- image and video support?
- Rob said that another reason for using Magpie is its support for atom which handles the different media types and encoding formats properly which allows it to work well for video etc --Nad 18:14, 16 October 2007 (NZDT)
- Another idea I'm looking into (which is related but not part of this extension) is that we should be able to use DPL in combination with Extension:XmlOutput.php to produce dynamic query-generated RSS feeds. --Nad 18:17, 16 October 2007 (NZDT)
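For the aggregation item in the list above, one possible shape is to merge the items of every feed and sort them newest first. This is only a sketch: the function name aggregateItems and the assumption that each item is an array with a 'date' string are illustrative, not part of the extension.

```php
<?php
// $feeds: list of feeds, each a list of items; each item may carry a 'date' string.
function aggregateItems(array $feeds, int $limit = 10): array {
    $all = [];
    foreach ($feeds as $items) {
        foreach ($items as $item) {
            // strtotime() returns false for unparseable dates; treat those as oldest.
            $ts = isset($item['date']) ? strtotime($item['date']) : false;
            $item['timestamp'] = ($ts === false) ? 0 : $ts;
            $all[] = $item;
        }
    }

    // Newest first across all feeds.
    usort($all, function ($a, $b) {
        return $b['timestamp'] <=> $a['timestamp'];
    });

    return array_slice($all, 0, $limit);
}
```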
Purpose
We want to have a dynamically changing list of user-customisable aggregated RSS sources available on pages and in the sidebar. These sources can include recentchanges feeds from our wikia so we can have the old changes-in-the-logo style effect, but using standard feeds from any sources and without consuming server resource from centralised polling.
Development
Rob has had experience with developing RSS related code before so he'll handle the main functionality and structure of the extension. Nad will handle the XML/DOM and Ajax aspects.
Testing
Installed ok (lists tag in Special:Version but not author details etc). Does not seem to give output even when arbitrary output is hardwired into the return value. Maybe not quite installed properly? --Rob 18:25, 14 October 2007 (NZDT)
- WikiFS:Special:Version isn't showing the rss tag but there is an empty tag there - I'll have a look in a sec... --Nad 18:59, 14 October 2007 (NZDT)
- It's not showing up in the credits either so it hasn't installed, only the setup function has registered. --Nad 19:04, 14 October 2007 (NZDT)
- That fixed it - wrong syntax for setting tag hook - don't forget to use action=purge when testing too as code changes don't invalidate cache --Nad 19:15, 14 October 2007 (NZDT)
RSS parsing
The current extension uses an external dependency, lastRSS.php to read the RSS with simple file-based cache/expiry system using MD5(url) for filename. It may be doing some things we need to replicate ourselves, but most of the code is concerned with regular expressions for handling the XML, and that aspect can be handled much more compactly by reading the RSS as a DOM object instead.
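A rough sketch of the two pieces described above: a cache file named from MD5(url) with a simple expiry check, and reading the items through the DOM instead of regular expressions. The function names and the 'rsscache_' file prefix are assumptions for illustration, and fetching the feed is left out.

```php
<?php
// Cache file name derived from the feed URL via MD5 (the prefix is an assumption).
function cachePath(string $cacheDir, string $url): string {
    return rtrim($cacheDir, '/') . '/rsscache_' . md5($url);
}

// A cached copy is usable if it exists and is younger than $maxAge seconds.
function cacheIsFresh(string $path, int $maxAge): bool {
    return is_file($path) && (time() - filemtime($path)) < $maxAge;
}

// Read each <item> of an RSS document into an array keyed by child element name.
function readItems(string $xml): array {
    $doc = new DOMDocument();
    // loadXML() returns false (and warns) if the document is not well formed.
    if (!@$doc->loadXML($xml)) {
        return [];
    }
    $items = [];
    foreach ($doc->getElementsByTagName('item') as $node) {
        $item = [];
        foreach ($node->childNodes as $child) {
            if ($child->nodeType === XML_ELEMENT_NODE) {
                $item[$child->nodeName] = trim($child->textContent);
            }
        }
        $items[] = $item;
    }
    return $items;
}
```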
Entities
Some news feeds (eg RSS2 from Wordpress) include entities in the title tags where they are illegal according to strict XML. This means the document is not well formed and the parser will break with an error like:
Warning: DOMDocument::load() [function.DOMDocument-load]:
Entity 'raquo' not defined in http://angelacarter.co.nz/blog/?feed=rss2,
line: 10 in /home/rob/www/extensions/RobsRSSExtension.php on line 63
To fix this problem we need to do a replace on the contents of title tags that contain entities and turn them into spaces.
- That's why I reckoned we'd still want to take some of the code from lastRSS.php such as the following function which looks like it's made to handle what you're talking about:
<php>
// -------------------------------------------------------------------
// Replace HTML entities &something; by real characters
// -------------------------------------------------------------------
function unhtmlentities($string) {
    // Get HTML entities table
    $trans_tbl = get_html_translation_table (HTML_ENTITIES, ENT_QUOTES);
    // Flip keys<==>values
    $trans_tbl = array_flip ($trans_tbl);
    // Add support for &apos; entity (missing in HTML_ENTITIES)
    $trans_tbl += array('&apos;' => "'");
    // Replace entities by values
    return strtr ($string, $trans_tbl);
}
</php>
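One caveat with decoding every entity is that turning &amp;amp; or &amp;lt; into real characters would itself break the XML. A possible variation, sketched here with the hypothetical name fixEntities, leaves the five entities XML predefines alone, decodes the other known HTML entities to UTF-8 characters, and replaces unknown names with a space as suggested above:

```php
<?php
// Make a feed parseable by replacing named entities that XML does not predefine.
function fixEntities(string $xml): string {
    $xmlPredefined = ['amp', 'lt', 'gt', 'quot', 'apos'];
    return preg_replace_callback(
        '/&([A-Za-z][A-Za-z0-9]*);/',
        function ($m) use ($xmlPredefined) {
            // These are legal in strict XML, so keep them as they are.
            if (in_array($m[1], $xmlPredefined, true)) {
                return $m[0];
            }
            $char = html_entity_decode($m[0], ENT_QUOTES | ENT_HTML401, 'UTF-8');
            // html_entity_decode() leaves unknown names unchanged; use a space for those.
            return ($char === $m[0]) ? ' ' : $char;
        },
        $xml
    );
}
```

The result would be passed to DOMDocument::loadXML() in place of loading the URL directly.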
Ajax
The ajax should be able to poll for headers only and then when changed it should post a request to MediaWiki which includes the changed feed and the timestamp of the last update. MediaWiki should then use an AjaxDispatcher to respond with the new item[s] which the ajax can then prepend to the innerHTML of the target area.
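On the server side, the response to such a request could be as simple as filtering the feed's items by the timestamp the client sends. A sketch, assuming items are arrays with a 'pubDate' string; the function name itemsSince is hypothetical, and the AjaxDispatcher wiring is not shown:

```php
<?php
// Return the items published after $since (a Unix timestamp), newest first.
function itemsSince(array $items, int $since): array {
    $new = array_filter($items, function ($item) use ($since) {
        // Items with a missing or unparseable date are skipped.
        $ts = strtotime($item['pubDate'] ?? '');
        return $ts !== false && $ts > $since;
    });

    // Newest first, so the client can prepend them in order.
    usort($new, function ($a, $b) {
        return strtotime($b['pubDate']) <=> strtotime($a['pubDate']);
    });

    return $new;
}
```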
Special:Recentchanges example feed
<xml>
<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/css" href="http://www.organicdesign.co.nz/wiki/skins/common/feed.css?97"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>OrganicDesign - Recent changes [en]</title>
    <link>http://www.organicdesign.co.nz/Special:Recentchanges</link>
    <description>Track the most recent changes to the wiki in this feed.</description>
    <language>en</language>
    <generator>MediaWiki 1.11.0</generator>
    <lastBuildDate>Sun, 14 Oct 2007 05:06:24 GMT</lastBuildDate>
    <item>
      <title>Extension talk:Robs RSS extension.php</title>
      <link>http://www.organicdesign.co.nz/Extension_talk:Robs_RSS_extension.php</link>
      <description> </description>
      <pubDate>Sun, 14 Oct 2007 05:04:19 GMT</pubDate>
      <dc:creator>Rob</dc:creator>
      <comments>http://www.organicdesign.co.nz/Extension_talk:Robs_RSS_extension.php</comments>
    </item>
  </channel>
</rss>
</xml>
See also
- MW:Extension:RSS Reader - the one we currently use which depends on lastRSS.php
- PHP/DOM/AJAX/RSS example from W3C - this has everything we want! just needs a little customisation
- Extension:FeedImport (struck out) - this is an even bigger dependency than RSS Reader
- w:RSS (file format)#RSS_2.0
Unnecessary fork?
Extension:SimpleRSS looks like it can be deleted --Sven 11:19, 16 November 2007 (NZDT)
RSS aggregator
Have you seen this? M:XFeed - RSS Feed Aggregator, I found it on http://www.ehartwell.com/InfoDabble/index.php?title=Special:Version --Sven 13:11, 15 November 2007 (NZDT)