Difference between revisions of "Talk:SQL"

From Organic Design wiki
(Update)
(Notes on correct query)
 
(One intermediate revision by the same user not shown)
Line 8: Line 8:
 
*hooks used
 
*hooks used
 
*hooks defined
 
*hooks defined
:Absolutely --[[User:Sven|Sven]] 15:55, 11 February 2008 (NZDT)
 
  
==Selecting current articles by category==
+
==Building up the query==
See [http://upload.wikimedia.org/wikipedia/commons/4/41/Mediawiki-database-schema.png MediaWiki schema] for a description of the tables. Essentially '''categorylinks''' store the category member relationships , '''page''' identifies the title and metadata information, and '''text''' contains the actual wikitext of articles.
+
: SELECT page_latest FROM page, categorylinks
 +
: WHERE cl_to = "Portal" AND cl_from = page_id;
  
The database schema used by MediaWiki allows variable article content to be stored as key => value pairs where the atomic unit for the content '''varies''' depending on the content of the article. A way around this is to use categorization to group common atomic unit structure together. Basically this is a filtering problem, however queried atomic unit structure needs to be checked downstream by any downstream processing of content within categories.
+
:SELECT rev_text_id FROM revision
 +
: WHERE rev_id IN (SELECT page_latest FROM page, categorylinks WHERE cl_to = "Portal" AND cl_from = page_id);
  
===Example===
+
;This is slow dues to a lack of indexing
*[[Template:SQLcat]] is transcluded into the SELECT statements of this article, changing it will change category the statements QUERY on.
+
:SELECT * FROM text WHERE old_id IN (SELECT rev_text_id FROM revision
*[[Template:SQLprefix]] is transcluded into the SELECT statements of this article, changing it will change the database prefix statements QUERY on.
+
:WHERE rev_id IN (SELECT page_latest FROM page, categorylinks WHERE cl_to = "Portal" AND cl_from = page_id));
SELECT * FROM {{SQLprefix}}categorylinks WHERE cl_to LIKE '{{SQLcat}}';
 
  
;This selects all pages (article names) from category '{{SQLcat}}'.
+
;Two parts step 1;
SELECT * FROM {{SQLprefix}}page
+
:SELECT page_latest FROM page, categorylinks WHERE cl_to = "Portal" AND cl_from = page_id)
WHERE page_id IN (SELECT cl_from FROM {{SQLprefix}}categorylinks WHERE cl_to LIKE '{{SQLcat}}');
+
;step 2
 +
:SELECT * FROM text WHERE old_id IN (SELECT rev_text_id FROM revision
 +
:WHERE rev_id IN (3031, 3033)
  
 
+
;Select statement that works
;This selects all revisions of articles from category '{{SQLcat}}'.
+
:SELECT  old_text FROM {{SQLprefix}}text, {{SQLprefix}}revision
SELECT * FROM {{SQLprefix}}revision 
+
:WHERE old_id = rev_text_id and rev_id IN (
WHERE rev_page IN (SELECT  cl_from FROM {{SQLprefix}}categorylinks WHERE cl_to LIKE '{{SQLcat}}');
+
SELECT page_latest FROM {{SQLprefix}}page, {{SQLprefix}}categorylinks
 
+
WHERE cl_to = "Portal" and cl_from = page_id );
From the output of this information it can be seen that what we want is the rev_text_id that is largest for any rev_page.
 
 
 
;This selects all text from articles of category ''Rosaceae'';
 
Select * FROM {{SQLprefix}}text
 
WHERE old_id IN (SELECT page_latest FROM {{SQLprefix}}page WHERE page_id IN (SELECT  cl_from FROM {{SQLprefix}}categorylinks WHERE cl_to LIKE '{{SQLcat}}'));
 
 
 
The query is extremely slow taking up to 3 seconds due to multiple nesting of SELECT statements.
 
the following optimization by joining the two inner most nested SELECT statements speeds up the query by at least an order of magnitude.
 
SELECT * FROM {{SQLprefix}}text
 
WHERE old_id IN (SELECT page_latest FROM {{SQLprefix}}page,{{SQLprefix}}categorylinks WHERE cl_to = '{{SQLcat}}' AND cl_from = page_id);
 

Latest revision as of 03:53, 27 May 2008

I need to add links to an article I have on SQL, I also propose a section listing every mediawiki extension that contains SQL queries --Sven 14:09, 11 February 2008 (NZDT)

Sounds like a job for SMW - adding script related properties to them all. Or maybe to their talk pages since scripts aren't wikitext articles - or could add a namespace for that - like properties:? --Nad 14:26, 11 February 2008 (NZDT)

Ideas for script-related properties

  • has db queries
  • classes/methods used
  • classes defined
  • hooks used
  • hooks defined

Building up the query

SELECT page_latest FROM page, categorylinks
WHERE cl_to = "Portal" AND cl_from = page_id;
SELECT rev_text_id FROM revision
WHERE rev_id IN (SELECT page_latest FROM page, categorylinks WHERE cl_to = "Portal" AND cl_from = page_id);
This is slow dues to a lack of indexing
SELECT * FROM text WHERE old_id IN (SELECT rev_text_id FROM revision
WHERE rev_id IN (SELECT page_latest FROM page, categorylinks WHERE cl_to = "Portal" AND cl_from = page_id));
Two parts step 1;
SELECT page_latest FROM page, categorylinks WHERE cl_to = "Portal" AND cl_from = page_id)
step 2
SELECT * FROM text WHERE old_id IN (SELECT rev_text_id FROM revision
WHERE rev_id IN (3031, 3033)
Select statement that works
SELECT old_text FROM 1120_text, 1120_revision
WHERE old_id = rev_text_id and rev_id IN (
SELECT page_latest FROM 1120_page, 1120_categorylinks
WHERE cl_to = "Portal" and cl_from = page_id );