Set up a new bot job

From Organic Design wiki
Revision as of 04:14, 23 November 2009 by Nad (talk | contribs) (Example main function)
Procedure.svg Set up a new bot job
Organic Design procedure

First ensure that you have a running wiki daemon (wikid) local to the server and wiki which will be used to administer the jobs. Set up the EventPipe extension and ensure that the daemon is receiving the events either by checking /var/www/tools/wikid.log or by configuring the bot to publish its notifications to an IRC channel. For more details on these aspects, see the install a new server and configure IRC procedures. After the robot framework and necessary extensions are installed you're ready to begin setting up a new job type which is described following.

Job Functions

First you will need to create some functions in Perl which will do the actual work. These can reside anywhere and are made available to the daemon by appending a require statement to /var/www/tools/wikid.conf.

There are three special functions which may be directly called by the daemon, one when an instance of the job type is first started, one which is called iteratively until the job is complete, and another after the job has completed so that any loose ends can be tied up such as closing database connections or file handles. These three functions are init, main and stop and are combined with the name of the job type when defining the functions, for example initFoo. Only the main is mandatory, the init and stop may not be necessary for some kinds of jobs.

Internal functions

In addition to the three special job functions, a job script may also define a number of special internal functions to be called by the other main functions when needed. These should follow the naming convention of jobFooBar (the "Bar" function used by functions in the "Foo" job) to avoid possible naming conflicts with other jobs or components.

The global job hash

When the daemon calls any of the three special job functions, it makes all of the data concerning the job they refer to available in the hash referred to by the $::job hashref. The job functions can store any data they like in this hash and it will be persistently available even if the daemon or server need to be restarted. Some of the keys in the job hash are used by the job execution framework as follows (the $$ is due to $::job being a reference to a hash rather than an actual hash).

  • $$::job{'id'} the daemon will set this to a GUID when a new job is started
  • $$::job{'type'} the daemon sets this when the job is started
  • $$::job{'user'} the daemon sets this to the wiki user it will interact with the wiki through
  • $$::job{'start'} the daemon sets this to the UNIX time stamp when the job is started
  • $$::job{'paused'} a jobs main will not be called if its paused value is not empty
  • $$::job{'finish'} the daemon sets this to the UNIX time stamp when the job finishes
  • $$::job{'length'} the job's init function should calculate and set this if the number of iterations is able to be determined
  • $$::job{'status'} this can be set by the job functions and is displayed in the WikidAdmin special page
  • $$::job{'revisions'} if the jobs change wiki articles, this value should track the number of changes made
  • $$::job{'errors'} any errors occurring during execution should be logged in this value
  • $$::job{'wptr'} the daemon sets this to zero when the job starts and increments it on every execution of the job's main function. The job is automatically stopped when this value reaches the length value.

Example init function

A typical init function for a job would be something like the following example. It reads in the file from the location provided in the job's parameters, extracts the useful information out of the file ready for processing, and stops the job if there are any problems with that. Note that for very large files, it would be better to read the lines of the input file one at a time in the main function instead of pre-processing the whole things like this.

{{{1}}}

, \( $1, $2, $3 ) if /^(.+?),(.+?),(.+?)$/;

} close INPUT; } else { workLogError( "Could not open input file '$file'!" ); $errors++; }

# Stop job and log errors, or set the job length ready to start if ( $errors ) { workLogError( "$errors errors were encountered during preprocessing, job aborted" ); workStopJob(); } else { $$::job{'length'} = 1 + $#{$$::job{'data'}}; }

1; } </perl>}}

Example main function

" . join( "

Example stop function

Stop functions are only required if the job exhibits persistent database connections or file handles. It might look something like the following example.

<perl>

sub stopFoo { $$::job{'handle'}->close(); 1; } </perl>

Creating jobs to be started from the special page

The usual way to run a job is manually from the WikidAdmin special page. If its functions have been correctly defined and included by the daemon, then the job type will appear in the list in the special page. Simple select the job type and click the start button.

Supplying parameters to jobs

Some jobs require some specific parameters to be set before they can run, for example import jobs require a file to be uploaded or a previously uploaded file to be selected. To create a new job which requires such parameters to be filled in first two additional functions must be written. These functions are MediaWiki hooks written in PHP. The hook names depends on the job name, for example the following hooks may be added to the LocalSettings.php file of the wiki to provide a form for our previous "Foo" job:

{{{1}}}

Form rendering hook

Here's a typical function definition for the form rendering hook, all that's required here is to set the passed $html variable with the HTML content of the form. In this example we're providing an upload button and a list of files that have been previously uploaded, the files are stored in the internal location specified in $wgFooDataDir.

{{{1}}}

Form processing hook

And here's the corresponding processing hook function which saves a file into the internal data location ($wgFooDataDir) if one has been uploaded with the form, and starts a job if a valid file has been specified from the list or uploaded. The $job array is empty on entry and should be filled with data to be made available to the executing job (in the global %::job hash). The $start variable is used to specify whether or not the job should run, this allows the processing function to abort the job if there's a problem with supplied data.

{{{1}}}

Creating event-driven jobs

Jobs may be started manually from the WikidAdmin special page, or by various events in the wiki or the Perl environment. To handle wiki events simply define an event handler function for the wiki hook name, this will override any default handler supplied by the daemon. In the following example code, the "Foo" job is started when an article in the "Bar" namespace is edited.

{{{1}}}