|
|
Line 8: |
Line 8: |
| *hooks used | | *hooks used |
| *hooks defined | | *hooks defined |
− | :Absolutely --[[User:Sven|Sven]] 15:55, 11 February 2008 (NZDT)
| |
− |
| |
− | ==Selecting current articles by category==
| |
− | See [http://upload.wikimedia.org/wikipedia/commons/4/41/Mediawiki-database-schema.png MediaWiki schema] for a description of the tables. Essentially '''categorylinks''' store the category member relationships , '''page''' identifies the title and metadata information, and '''text''' contains the actual wikitext of articles.
| |
− |
| |
− | The database schema used by MediaWiki allows variable article content to be stored as key => value pairs where the atomic unit for the content '''varies''' depending on the content of the article. A way around this is to use categorization to group common atomic unit structure together. Basically this is a filtering problem, however queried atomic unit structure needs to be checked downstream by any downstream processing of content within categories.
| |
− |
| |
− | ===Example===
| |
− | *[[Template:SQLcat]] is transcluded into the SELECT statements of this article, changing it will change category the statements QUERY on.
| |
− | *[[Template:SQLprefix]] is transcluded into the SELECT statements of this article, changing it will change the database prefix statements QUERY on.
| |
− | SELECT * FROM {{SQLprefix}}categorylinks WHERE cl_to LIKE '{{SQLcat}}';
| |
− |
| |
− | ;This selects all pages (article names) from category '{{SQLcat}}'.
| |
− | SELECT * FROM {{SQLprefix}}page
| |
− | WHERE page_id IN (SELECT cl_from FROM {{SQLprefix}}categorylinks WHERE cl_to LIKE '{{SQLcat}}');
| |
− |
| |
− |
| |
− | ;This selects all revisions of articles from category '{{SQLcat}}'.
| |
− | SELECT * FROM {{SQLprefix}}revision
| |
− | WHERE rev_page IN (SELECT cl_from FROM {{SQLprefix}}categorylinks WHERE cl_to LIKE '{{SQLcat}}');
| |
− |
| |
− | From the output of this information it can be seen that what we want is the rev_text_id that is largest for any rev_page.
| |
− |
| |
− | ;This selects all text from articles of category ''Rosaceae'';
| |
− | Select * FROM {{SQLprefix}}text
| |
− | WHERE old_id IN (SELECT page_latest FROM {{SQLprefix}}page WHERE page_id IN (SELECT cl_from FROM {{SQLprefix}}categorylinks WHERE cl_to LIKE '{{SQLcat}}'));
| |
− |
| |
− | The query is extremely slow taking up to 3 seconds due to multiple nesting of SELECT statements.
| |
− | the following optimization by joining the two inner most nested SELECT statements speeds up the query by at least an order of magnitude.
| |
− | SELECT * FROM {{SQLprefix}}text
| |
− | WHERE old_id IN (SELECT page_latest FROM {{SQLprefix}}page,{{SQLprefix}}categorylinks WHERE cl_to = '{{SQLcat}}' AND cl_from = page_id);
| |