SQL

From Organic Design wiki
Revision as of 11:41, 26 May 2008 by Sven

SQL (Structured Query Language) is a computer language used to create, retrieve, update and delete data from relational database management systems. SQL has been standardised by both ANSI and ISO.

SQL is commonly spoken either as the names of the letters ess-cue-el, or like the word sequel. The official pronunciation of SQL according to ANSI is ess-cue-el. However, each of the major database products (or projects) containing the letters SQL has its own convention: MySQL is officially and commonly pronounced "My Ess Cue El"; PostgreSQL is expediently pronounced postgres (being the name of the predecessor to PostgreSQL); and Microsoft SQL Server is commonly spoken as Microsoft-sequel-server.

Common MediaWiki MySQL queries

Insert into interwiki

INSERT INTO interwiki (iw_prefix,iw_url,iw_local) VALUES('example','http://www.example.org/$1',0);

To delete the entry:

DELETE FROM interwiki WHERE iw_prefix = 'example';
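The $1 in iw_url is the placeholder that gets replaced with the target page title when an interwiki link is resolved. The following is only a minimal sketch of that round trip, run against an in-memory SQLite database rather than the wiki's MySQL database. The three-column table is a simplified stand-in for the real interwiki table, expand_interwiki is a hypothetical helper, and URL-encoding of the title is left out.

```python
import sqlite3

# Simplified stand-in for MediaWiki's interwiki table (assumption: only
# the three columns used in the INSERT statement above).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE interwiki ("
    "iw_prefix TEXT PRIMARY KEY, iw_url TEXT, iw_local INTEGER)"
)

# Add the entry, using bound parameters instead of inline literals.
conn.execute(
    "INSERT INTO interwiki (iw_prefix, iw_url, iw_local) VALUES (?, ?, ?)",
    ("example", "http://www.example.org/$1", 0),
)


def expand_interwiki(conn, prefix, title):
    """Hypothetical helper: return the URL for prefix:title, or None."""
    row = conn.execute(
        "SELECT iw_url FROM interwiki WHERE iw_prefix = ?", (prefix,)
    ).fetchone()
    # Substitute the page title for the $1 placeholder (no URL-encoding here).
    return row[0].replace("$1", title) if row else None


url_before = expand_interwiki(conn, "example", "Main_Page")

# Remove the entry again, matching the prefix exactly.
conn.execute("DELETE FROM interwiki WHERE iw_prefix = ?", ("example",))
url_after = expand_interwiki(conn, "example", "Main_Page")

print(url_before, url_after)
```

Binding the values as parameters sidesteps quoting questions, which matters most for the DELETE, where a stray wildcard could remove more rows than intended.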

Backup and compress DB & FS

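7za a -si backupfile.sql.7z # Backup: -si makes 7za read the data to compress from standard input

tar cf - directory # writes the archive of directory to standard output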



Reset a password

UPDATE user SET user_password=md5(CONCAT('184-',md5('password'))) WHERE user_id=184;
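The value stored by the statement above is an MD5 hash of the user ID, a hyphen, and the MD5 hash of the password. As a rough sketch, the same value can be computed outside the database, for example to check a hash before writing it. The function name salted_md5 is hypothetical, and the sketch assumes the password is encoded as UTF-8.

```python
import hashlib


def salted_md5(user_id: int, password: str) -> str:
    """Mirror md5(CONCAT('<user_id>-', md5('<password>'))) from the SQL above."""
    # Inner hash: lowercase hex digest of the password, like MySQL's md5().
    inner = hashlib.md5(password.encode("utf-8")).hexdigest()
    # Outer hash: the user ID and a hyphen act as a per-user salt.
    return hashlib.md5(f"{user_id}-{inner}".encode("utf-8")).hexdigest()


new_hash = salted_md5(184, "password")
print(new_hash)
```

Because the user ID is part of the hashed string, the same password produces a different stored value for each user, so the ID in the CONCAT call must match the ID in the WHERE clause.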

Reset or set a page hit counter

UPDATE page SET page_counter=0 WHERE page_title='Main_Page';

Adjust user groups

INSERT INTO user_groups (ug_user,ug_group) VALUES(999,'sysop');

Selecting current articles by category

See MediaWiki schema for a description of the tables. Essentially, categorylinks stores the category membership relationships, page holds the title and metadata of each article, and text contains the actual wikitext of articles.

The database schema used by MediaWiki stores article content as key => value pairs, and the atomic unit of that content varies from article to article. A way around this is to use categorization to group articles that share a common atomic unit structure. Selecting such articles is then basically a filtering problem. However, category membership does not guarantee the structure, so any downstream processing of content within a category still needs to check the structure of what the query returns.

Example

Illustrating the relevant tables in the query:

SELECT * FROM 1120_categorylinks WHERE cl_to LIKE 'Selenium';

This selects all pages (article names) from category 'Selenium':

SELECT * FROM 1120_page
WHERE page_id IN (SELECT cl_from FROM 1120_categorylinks WHERE cl_to LIKE 'Selenium');

This selects all revisions of articles from category 'Selenium':

SELECT * FROM 1120_revision
WHERE rev_page IN (SELECT cl_from FROM 1120_categorylinks WHERE cl_to LIKE 'Selenium');

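From this output it can be seen that, for each rev_page, what we want is the revision with the largest rev_text_id, i.e. the most recent one. The page table already records that revision in page_latest. Note that page_latest holds a rev_id, while the queries below match it directly against old_id; this works only where rev_id and rev_text_id coincide, which is common but not guaranteed by the schema.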

This selects all text from articles of category 'Selenium':

SELECT old_text FROM 1120_text
WHERE old_id IN (SELECT page_latest FROM 1120_page WHERE page_id IN (SELECT cl_from FROM 1120_categorylinks WHERE cl_to LIKE 'Selenium'));

This query is very slow, taking up to 3 seconds, because of the multiply nested SELECT statements. The following optimization, which joins the two innermost SELECT statements into one, speeds up the query by at least an order of magnitude:

SELECT old_text FROM 1120_text
WHERE old_id IN (SELECT page_latest FROM 1120_page,1120_categorylinks WHERE cl_to = 'Selenium' AND cl_from = page_id);
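The rewrite only pays off if both forms return the same rows. The sketch below checks that on a tiny in-memory SQLite database. It is an illustration under assumptions: the tables are cut down to the columns the queries touch, the 1120_ prefix is dropped, the sample rows are invented, and it makes no claim about timings on a real MySQL server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Cut-down versions of the three tables (assumed columns only).
conn.executescript("""
CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_title TEXT, page_latest INTEGER);
CREATE TABLE categorylinks (cl_from INTEGER, cl_to TEXT);
CREATE TABLE text (old_id INTEGER PRIMARY KEY, old_text TEXT);
INSERT INTO page VALUES (1, 'Selenium_IDE', 11), (2, 'Sandbox', 12), (3, 'Selenium_RC', 13);
INSERT INTO categorylinks VALUES (1, 'Selenium'), (2, 'Test'), (3, 'Selenium');
INSERT INTO text VALUES (10, 'old IDE text'), (11, 'IDE text'),
                        (12, 'sandbox text'), (13, 'RC text');
""")

# Doubly nested form, as in the slow query.
nested_sql = """
SELECT old_text FROM text
WHERE old_id IN (SELECT page_latest FROM page
                 WHERE page_id IN (SELECT cl_from FROM categorylinks
                                   WHERE cl_to = 'Selenium'))
ORDER BY old_id
"""

# Two innermost SELECTs merged into a single join, as in the faster query.
joined_sql = """
SELECT old_text FROM text
WHERE old_id IN (SELECT page_latest FROM page, categorylinks
                 WHERE cl_to = 'Selenium' AND cl_from = page_id)
ORDER BY old_id
"""

nested_rows = conn.execute(nested_sql).fetchall()
joined_rows = conn.execute(joined_sql).fetchall()
print(nested_rows == joined_rows, joined_rows)
```

On a real installation, running EXPLAIN on both statements is one way to see how the server plans each form.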

Selecting the most recent revision of all articles

SELECT * FROM 1120_text
WHERE old_id IN (SELECT page_latest FROM 1120_page);
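In MediaWiki schemas that have a rev_text_id column, page_latest stores a revision ID (rev_id), and it is the revision row's rev_text_id that points at old_id in the text table. The query above compares page_latest with old_id directly, which gives the right rows only when the two ID sequences happen to match. The following sketch resolves the latest text through the revision table instead. It uses an in-memory SQLite database with cut-down, unprefixed tables and invented rows in which the two IDs deliberately differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Cut-down tables (assumed columns only); rev_id and rev_text_id differ on purpose.
conn.executescript("""
CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_title TEXT, page_latest INTEGER);
CREATE TABLE revision (rev_id INTEGER PRIMARY KEY, rev_page INTEGER, rev_text_id INTEGER);
CREATE TABLE text (old_id INTEGER PRIMARY KEY, old_text TEXT);
INSERT INTO page VALUES (1, 'Selenium_IDE', 102), (2, 'Sandbox', 103);
INSERT INTO revision VALUES (101, 1, 7), (102, 1, 9), (103, 2, 8);
INSERT INTO text VALUES (7, 'IDE v1'), (8, 'Sandbox v1'), (9, 'IDE v2');
""")

# page_latest -> revision.rev_id, then revision.rev_text_id -> text.old_id.
latest_sql = """
SELECT page_title, old_text
FROM page
JOIN revision ON rev_id = page_latest
JOIN text ON old_id = rev_text_id
ORDER BY page_id
"""
latest_rows = conn.execute(latest_sql).fetchall()

# The direct comparison finds nothing here, because no old_id equals a rev_id.
direct_rows = conn.execute(
    "SELECT old_text FROM text WHERE old_id IN (SELECT page_latest FROM page)"
).fetchall()

print(latest_rows, direct_rows)
```

The extra join costs one more table lookup, but it stays correct when revisions and text rows are not created in lockstep.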

Some SQL queries executable by admin (need fixing to work in new MW1.9.3 environment)

Documentation

A nice tutorial on SQL Syntax

MySQL vs MSSQL

MySQL News & Information

Towards SQL for P2P environments