import CSV data into a wiki

From Organic Design wiki
Revision as of 04:21, 19 August 2011 by Nad (talk | contribs) ({{lowercase}})
This code is in our Git repository.


csv2wiki imports data from a CSV file into a MediaWiki wiki. The program is run from the shell and takes a single argument: the filename of a job file, which is a text file describing the parameters for the job. For example:

./csv2wiki.pl /home/foo/my.job

Here is an example of what the content of the job file may look like:

$wiki  = "http://foo.bar/wiki/index.php";
$user  = "Foo";
$pass  = "Bar";
$csv   = "/home/foo/projects/bar.csv";
$title = "$1 $2 $3";
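
This page does not describe how csv2wiki.pl itself reads the job file. Purely as an illustration of the format, the following Python sketch collects the `$name = "value";` lines into a dictionary. The function name parse_job and the regular expression are assumptions made for this example and are not taken from the script.

```python
import re

# Matches lines such as:  $wiki  = "http://foo.bar/wiki/index.php";
# (hypothetical pattern, written for this illustration only)
JOB_LINE = re.compile(r'^\s*\$(\w+)\s*=\s*"(.*)"\s*;\s*$')

def parse_job(text):
    """Return a dict mapping parameter names to values from job-file text."""
    params = {}
    for line in text.splitlines():
        match = JOB_LINE.match(line)
        if match:
            params[match.group(1)] = match.group(2)
    return params

example = '''
$wiki  = "http://foo.bar/wiki/index.php";
$user  = "Foo";
$pass  = "Bar";
$csv   = "/home/foo/projects/bar.csv";
$title = "$1 $2 $3";
'''

job = parse_job(example)
print(job["csv"])    # the source file to import
print(job["title"])  # the title format, still containing its $n placeholders
```

Lines that do not match the pattern, such as blank lines, are simply skipped in this sketch.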

Fields and values

The first line of the input file specifies the field names. Each subsequent line is imported into its own article in the wiki.

Title, Firstname, Surname
Mr, Bob, McFoo
Miss, Mary, Barson


The first row defines the field names, and the second row is imported into the first article with the following content:

{{Record
 | Title = Mr
 | Firstname = Bob
 | Surname = McFoo
}}


If the page already contains a Record template, only that template is updated rather than the whole article.
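
The mapping from a CSV row to a template call, and the replacement of an existing template, can be sketched in Python as follows. This is an illustration under stated assumptions, not the code of csv2wiki.pl: the function names render_record and replace_record are hypothetical, whitespace after each comma is assumed to be trimmed (as the output above suggests), and template calls are assumed not to be nested.

```python
import csv
import io
import re

def render_record(fields, values, template="Record"):
    """Build the template call for one CSV row."""
    lines = ["{{" + template]
    for name, value in zip(fields, values):
        lines.append(" | " + name + " = " + value)
    lines.append("}}")
    return "\n".join(lines)

def replace_record(text, record, template="Record"):
    """Replace the first existing template call in text with record.

    Returns (new_text, replaced). Assumes the template call contains no
    nested "}}" sequences.
    """
    pattern = re.compile(r"\{\{" + re.escape(template) + r"\b.*?\}\}", re.DOTALL)
    new_text, count = pattern.subn(lambda m: record, text, count=1)
    return new_text, count == 1

source = "Title, Firstname, Surname\nMr, Bob, McFoo\nMiss, Mary, Barson\n"
rows = list(csv.reader(io.StringIO(source), skipinitialspace=True))
fields, records = rows[0], rows[1:]

first = render_record(fields, records[0])
print(first)

# An article that already holds a Record template keeps its other text.
existing = "Intro text.\n{{Record\n | Title = Dr\n}}\nNotes."
updated, replaced = replace_record(existing, first)
print(updated)
```

When no template is found, replace_record returns the text unchanged with replaced set to False; where the template then goes is governed by the append parameter.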

Multi-select fields

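In the following example, some of the field names in the first row are left blank. A blank field name indicates that the column holds a further value for the preceding named field, making it a multi-value field. Here the Interests field can have up to three items.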

Title, Firstname, Surname, Interests,,, Email
Mr, Bob, McFoo, Art, History,, bob@foo.bar
Miss, Mary, Barson, Chemistry, Maths, Physics, mary.barson@bar.baz


In this case, the first row imported looks like this:

{{Record
 | Title = Mr
 | Firstname = Bob
 | Surname = McFoo
 | Interests = Art
History
 | Email = bob@foo.bar
}}


The separator used to join multiple values can be changed with the multisep parameter in the job file. The default value for multisep is the newline character, because that is how the RecordAdmin extension expects multiple values to be formatted. If multisep were set to ";", for example, the above content would instead be imported as follows:

{{Record
 | Title = Mr
 | Firstname = Bob
 | Surname = McFoo
 | Interests = Art;History
 | Email = bob@foo.bar
}}
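
The grouping of blank-named columns can be sketched in Python as below. This is a minimal illustration, not the script's own code: the function name group_fields is hypothetical, and it is assumed that empty cells are dropped from a multi-value field, as the Bob McFoo example suggests.

```python
def group_fields(header, row, multisep="\n"):
    """Merge columns with blank header names into the preceding named field.

    Returns a list of (field name, joined value) pairs in header order.
    Empty cells are skipped (an assumption based on the example above).
    """
    names = []
    grouped = {}
    current = None
    for name, value in zip(header, row):
        name, value = name.strip(), value.strip()
        if name:
            current = name
            names.append(name)
            grouped[name] = []
        if current is not None and value:
            grouped[current].append(value)
    return [(name, multisep.join(grouped[name])) for name in names]

# A plain split is enough for this sample; quoted CSV cells would need a real parser.
header = "Title, Firstname, Surname, Interests,,, Email".split(",")
row = "Mr, Bob, McFoo, Art, History,, bob@foo.bar".split(",")

default_join = dict(group_fields(header, row))
semicolon_join = dict(group_fields(header, row, multisep=";"))
print(repr(default_join["Interests"]))
print(repr(semicolon_join["Interests"]))
```

The two calls differ only in the separator, mirroring the two Record examples above.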

Parameters

The job file contains all the information necessary to update your wiki from the content of a single source file, which should be UTF-8 encoded. The possible parameters in the job file and their meanings are:

  • csv: the source file to import
  • wiki: the full long-form URL of the wiki including index.php
  • user: username of a user on the wiki with permission to create the necessary articles
  • pass: the user's password
  • separator: the separator character, default is comma; the short form "sep" is also allowed
  • multisep: the separator character to use for multi-value fields, default is newline, which is used by RecordAdmin
  • title: the format of the title using $n to specify fields, default is NULL, which means GUIDs are used for titles
  • template: the template that the parameters should be wrapped by in the created wiki articles, defaults to Template:Record
  • append: specifies whether the template should be placed before or after existing text if the template doesn't already exist in the article
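
As an illustration of the title and append parameters, here is a Python sketch under stated assumptions. The names make_title and place_record are hypothetical; $n is assumed to refer to the n-th column counting from 1, which fits the "$1 $2 $3" example; uuid4 merely stands in for whatever GUID format the script produces; and the boolean after is an assumed encoding of the append setting, whose actual values are not documented on this page.

```python
import re
import uuid

def make_title(fmt, values):
    """Expand $n placeholders in fmt using 1-based column values.

    If fmt is None (standing in for the NULL default), a random GUID is used.
    """
    if fmt is None:
        return str(uuid.uuid4())

    def field(match):
        index = int(match.group(1))
        # Placeholders that point past the last column expand to nothing.
        return values[index - 1] if 1 <= index <= len(values) else ""

    return re.sub(r"\$(\d+)", field, fmt).strip()

def place_record(text, record, after=True):
    """Put record before or after existing article text that lacks the template."""
    if not text.strip():
        return record
    if after:
        return text.rstrip("\n") + "\n" + record
    return record + "\n" + text

row = ["Mr", "Bob", "McFoo"]
title = make_title("$1 $2 $3", row)
print(title)
print(make_title(None, row))  # a different GUID on every run

record = "{{Record\n | Title = Mr\n}}"
print(place_record("Existing notes.", record, after=False))
```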