Difference between revisions of "Import CSV data into a wiki"

From Organic Design wiki
(Change source-code blocks to standard format)

Revision as of 18:10, 22 May 2015

This code is in our Git repository.

Note: If there is no information in this page about this code and it's a MediaWiki extension, there may be something at mediawiki.org.

csv2wiki imports data from a CSV file into a MediaWiki wiki. The program is run from the shell and takes one argument: the filename of a job file, which is a text file describing the parameters for the job. For example:

./csv2wiki.pl /home/foo/my.job

Here is an example of what the content of the job file may look like:

$wiki  = "http://foo.bar/wiki/index.php";
$user  = "Foo";
$pass  = "Bar";
$csv   = "/home/foo/projects/bar.csv";
$title = "$1 $2 $3";
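To illustrate the job-file format, here is a minimal Python sketch that reads lines of the form $name = "value"; into a dictionary. It is not how csv2wiki itself (a Perl script) loads the file; the function name parse_job and the regular expression are assumptions made for this example.

```python
import re

# Hypothetical pattern for job-file lines such as: $user = "Foo";
JOB_LINE = re.compile(r'^\s*\$(\w+)\s*=\s*"(.*)"\s*;\s*$')


def parse_job(text):
    """Return a dict mapping parameter names to values from job-file text."""
    params = {}
    for line in text.splitlines():
        match = JOB_LINE.match(line)
        if match:  # lines that do not look like assignments are ignored
            params[match.group(1)] = match.group(2)
    return params


example = '''
$wiki  = "http://foo.bar/wiki/index.php";
$user  = "Foo";
$pass  = "Bar";
$csv   = "/home/foo/projects/bar.csv";
$title = "$1 $2 $3";
'''

job = parse_job(example)
print(job["csv"])
```

The values are kept as plain strings, so the $1 $2 $3 placeholders in the title survive untouched for later expansion.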

Fields and values

Each line of the input file will be imported into an article in the wiki, and the first line of the input file specifies the field names.

Title, Firstname, Surname
Mr, Bob, McFoo
Miss, Mary, Barson


The first row defines the field names, and the second row is imported into the first article with the following content:

{{Record
 | Title = Mr
 | Firstname = Bob
 | Surname = McFoo
}}


If the page already contains a Record template, only that template is updated rather than the whole article.
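The mapping from CSV rows to template calls can be sketched in Python as follows. This is an illustration under stated assumptions, not the csv2wiki implementation: the function name rows_to_records is hypothetical, and the sketch only builds the template text; it does not contact a wiki or update a template that already exists on a page.

```python
import csv
import io


def rows_to_records(csv_text, template="Record"):
    """Build one template call per data row, using the first row as field names."""
    # skipinitialspace drops the blank that follows each comma in the sample data
    reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    header = next(reader)
    records = []
    for row in reader:
        if not row:  # skip empty lines
            continue
        lines = ["{{" + template]
        for name, value in zip(header, row):
            lines.append(f" | {name} = {value}")
        lines.append("}}")
        records.append("\n".join(lines))
    return records


sample = "Title, Firstname, Surname\nMr, Bob, McFoo\nMiss, Mary, Barson\n"
records = rows_to_records(sample)
print(records[0])
```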

Multi-select fields

In the following example, some of the field names in the first row are left blank to indicate that the preceding field is a multi-value field. Here the Interests field can hold up to three items.

Title, Firstname, Surname, Interests,,, Email
Mr, Bob, McFoo, Art, History,, bob@foo.bar
Miss, Mary, Barson, Chemistry, Maths, Physics, mary.barson@bar.baz


In this case, the first row imported looks like this:

{{Record
 | Title = Mr
 | Firstname = Bob
 | Surname = McFoo
 | Interests = Art
History
 | Email = bob@foo.bar
}}


The multisep parameter can be set in the job file to change how multiple values are joined. The default is the newline character, because that is how the RecordAdmin extension expects multiple values to be formatted. If multisep were set to ";", for example, the above content would instead be imported as follows:

{{Record
 | Title = Mr
 | Firstname = Bob
 | Surname = McFoo
 | Interests = Art;History
 | Email = bob@foo.bar
}}
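The grouping of blank-named columns can be sketched in Python like this. The function name group_fields is hypothetical, and the rule that empty cells are dropped before joining is an assumption inferred from the examples above, where the empty third interest leaves no trailing separator.

```python
def group_fields(header, row, multisep="\n"):
    """Attach columns with a blank header to the nearest named column on their left."""
    fields = {}
    current = None
    for name, value in zip(header, row):
        name, value = name.strip(), value.strip()
        if name:  # a named column starts a new field
            current = name
            fields[current] = []
        if current is not None and value:  # empty cells are skipped (assumption)
            fields[current].append(value)
    return {name: multisep.join(values) for name, values in fields.items()}


header = ["Title", "Firstname", "Surname", "Interests", "", "", "Email"]
row = ["Mr", "Bob", "McFoo", "Art", "History", "", "bob@foo.bar"]

default_fields = group_fields(header, row)
semicolon_fields = group_fields(header, row, multisep=";")
print(semicolon_fields["Interests"])
```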

Parameters

The job file contains all the information necessary to update your wiki from the content of a single source file, which should be UTF-8 encoded. The possible parameters in the job file and their meanings are:

  • csv: the source file to import
  • wiki: the full long-form URL of the wiki, including index.php
  • user: the username of a wiki user with permission to create the necessary articles
  • pass: the user's password
  • separator: the field separator character (may also be written as "sep"); the default is a comma
  • multisep: the separator character for multi-value fields; the default is a newline, which is what RecordAdmin uses
  • title: the format of the title, using $n to refer to fields; the default is NULL, which means GUIDs are used as titles
  • template: the template that wraps the parameters in the created wiki articles; defaults to Template:Record
  • append: specifies whether the template is placed before or after existing text when the article does not already contain the template
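As an illustration of the title parameter, the following Python sketch expands $n placeholders from a row and falls back to a generated identifier when no format is given. It assumes that $1 refers to the first field of the row and that an out-of-range placeholder expands to an empty string; neither detail is confirmed by this page, and make_title is a hypothetical name.

```python
import re
import uuid


def make_title(fmt, row):
    """Expand $n placeholders in fmt from row, or return a GUID if fmt is empty."""
    if not fmt:
        return str(uuid.uuid4())  # stands in for the GUID default

    def field(match):
        index = int(match.group(1)) - 1  # assumption: $1 is the first field
        return row[index] if 0 <= index < len(row) else ""

    return re.sub(r"\$(\d+)", field, fmt)


title = make_title("$1 $2 $3", ["Mr", "Bob", "McFoo"])
guid_title = make_title(None, ["Mr", "Bob", "McFoo"])
print(title)
```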