Difference between revisions of "Import CSV data into a wiki"

From Organic Design wiki
({{legacy}})
(update docs)
Line 1: Line 1:
{{legacy}}
 
 
{{lowercase}}{{svn|tools|csv2wiki.pl}}
 
{{lowercase}}{{svn|tools|csv2wiki.pl}}
  
'''csv2wiki''' is a way to import data from a CSV file into a MediaWiki. The program is run from the shell and requires one parameter which is the filename of a text file describing the parameters for the job, for example:
+
'''csv2wiki''' is a way to import data from a CSV file into a MediaWiki. The program is run from the shell and requires one parameter which is the filename of a configuration file describing the parameters for the job. It depends on a second script called {{svn|tools|wiki.pl}} existing in the same location.
 
<source>
 
<source>
./csv2wiki.pl /home/foo/my.job
+
./csv2wiki.pl /home/myJob.conf
 
</source>
 
</source>
  
Here is an example of what the content of the job file may look like:
+
Here is an example configuration file:
<source>
+
<source lang="js">
$wiki = "http://foo.bar/wiki/index.php";
+
wiki     = "http://foo.bar/wiki/index.php"
$user = "Foo";
+
user     = "Foo"
$pass = "Bar";
+
pass     = "Bar"
$csv   = "/home/foo/projects/bar.csv";
+
csv     = "/home/foo/projects/bar.csv"
$title = "$1 $2 $3";
+
title   = "$1 $2 $3"
 +
template = Customer
 
</source>
 
</source>
  
Line 25: Line 25:
  
  
The first row defines the fieldnames, and the second is imported into the first article with the following content:
+
In the above example the first row of data will be imported into an article called "Mr Bob McFoo" with the following content:
<source>
 
{{Record
 
| Title = Mr
 
| Firstname = Bob
 
| Surname = McFoo
 
}}
 
</source>
 
 
 
 
 
If there was already a ''Record'' template on the page, then only that template would be updated rather than the whole article.
 
 
 
=== Multi-select fields ===
 
In the following example, some of the field names in the first row are left blank to indicate a multi-value field. The ''interests'' field can have up to three items.
 
<source>
 
Title, Firstname, Surname, Interests,,, Email
 
Mr, Bob, McFoo, Art, History,, bob@foo.bar
 
Miss, Mary, Barson, Chemistry, Maths, Physics, mary.barson@bar.baz
 
</source>
 
 
 
 
 
In this case, the first row imported looks like this:
 
 
<source>
 
<source>
{{Record
+
{{Customer
 
  | Title = Mr
 
  | Title = Mr
 
  | Firstname = Bob
 
  | Firstname = Bob
 
  | Surname = McFoo
 
  | Surname = McFoo
| Interests = Art
 
History
 
| Email = bob@foo.bar
 
 
}}
 
}}
 
</source>
 
</source>
  
  
Note that the '''multisep''' parameter can be set in the job file, for example if ''multisep'' were set to ";", then the above content would instead be imported as follows. The default value for ''multisep'' is the newline character because that's how the RecordAdmin extension expects multiple values to be formatted.
+
If the page already existed, then the new template would be inserted at the top of the page (or the bottom if "append" is set). Or if the template already exists in the page, then it's values will be updated.
<source>
 
{{Record
 
| Title = Mr
 
| Firstname = Bob
 
| Surname = McFoo
 
| Interests = Art;History
 
| Email = bob@foo.bar
 
}}
 
</source>
 
  
== Parameters ==
+
== Configuration options ==
The job file contains all the information necessary to update your wiki from content in a single Source File which should be '''UTF-8''' encoded. Here is a description of the possible parameters in the job file and their meaning:
+
Here is a description of the possible parameters in the job file and their meaning:
*'''csv:''' ''the source file to import''
+
{| class=od-info
*'''wiki:''' ''the full long-form URL of the wiki including index.php''
+
!Name!!Meaning!!Default value
*'''user:''' ''username of a user on the wiki with permission to create the necessary articles''
+
|-
*'''pass:''' ''the users password''
+
|csv||The source file to import relative to the location of the script||-
*'''separator:''' ''also just "sep" is allowed, specifies the separator character, default is comma''
+
|-
*'''multisep:''' ''specifies the separator character to use for multi-value fields, default is newline which is used by RecordAdmin''
+
|wiki||the full long-form URL of the wiki including index.php||-
*'''title:''' ''the format of the title using '''$n''' to specify fields, default is NULL which means to use GUID's for titles''
+
|-
*'''template:''' ''The template that the parameters should be wrapped by in the created wiki articles, defaults to [[Template:Record]]''
+
|user||Username of a user on the wiki with permission to create the necessary articles||-
*'''append:''' ''Specifies whether the template should be placed before or after existing text if the template doesn't already exist in the article''
+
|-
 +
|pass||The users password||-
 +
|-
 +
|separator||Also just "sep" is allowed, specifies the separator character used in the CSV file||''comma''
 +
|-
 +
|title||The format of the title using '''$n''' to specify the column numbers, default is ''$1'' which means to use just the first columns as the title||$1
 +
|-
 +
|template||The template that the parameters should be wrapped by in the created wiki articles||Record
 +
|-
 +
|append||Set this to 1 to place the template at the end of the text if the template doesn't already exist in the article||0
 +
|-
 +
|include||A comma-separated list of the columns that should be included from the CSV, by default all columns are used||''not set''
 +
|-
 +
|}
 
[[Category:Tools]][[Category:PERL]]
 
[[Category:Tools]][[Category:PERL]]

Revision as of 21:10, 6 May 2020

Info.svg This code is in our Git repository here.

Note: If there is no information in this page about this code and it's a MediaWiki extension, there may be something at mediawiki.org.

csv2wiki is a way to import data from a CSV file into a MediaWiki. The program is run from the shell and requires one parameter which is the filename of a configuration file describing the parameters for the job. It depends on a second script called

Info.svg This code is in our Git repository here.

Note: If there is no information in this page about this code and it's a MediaWiki extension, there may be something at mediawiki.org.

existing in the same location.

./csv2wiki.pl /home/myJob.conf

Here is an example configuration file:

wiki     = "http://foo.bar/wiki/index.php"
user     = "Foo"
pass     = "Bar"
csv      = "/home/foo/projects/bar.csv"
title    = "$1 $2 $3"
template = Customer

Fields and values

Each line of the input file will be imported into an article in the wiki, and the first line of the input file specifies the field names.

Title, Firstname, Surname
Mr, Bob, McFoo
Miss, Mary, Barson


In the above example the first row of data will be imported into an article called "Mr Bob McFoo" with the following content:

{{Customer
 | Title = Mr
 | Firstname = Bob
 | Surname = McFoo
}}


If the page already existed, then the new template would be inserted at the top of the page (or the bottom if "append" is set). Or if the template already exists in the page, then it's values will be updated.

Configuration options

Here is a description of the possible parameters in the job file and their meaning:

Name Meaning Default value
csv The source file to import relative to the location of the script -
wiki the full long-form URL of the wiki including index.php -
user Username of a user on the wiki with permission to create the necessary articles -
pass The users password -
separator Also just "sep" is allowed, specifies the separator character used in the CSV file comma
title The format of the title using $n to specify the column numbers, default is $1 which means to use just the first columns as the title $1
template The template that the parameters should be wrapped by in the created wiki articles Record
append Set this to 1 to place the template at the end of the text if the template doesn't already exist in the article 0
include A comma-separated list of the columns that should be included from the CSV, by default all columns are used not set