Difference between revisions of "Set up a new bot job"

From Organic Design wiki
(Creating jobs to be started manually: change $args to $job)
(Job Functions: the $::job hashref)

Revision as of 02:04, 23 November 2009

Set up a new bot job
Organic Design procedure

First ensure that a wiki daemon (wikid) is running locally on the server and wiki that will be used to administer the jobs. Set up the EventPipe extension and confirm that the daemon is receiving events, either by checking /var/www/tools/wikid.log or by configuring the bot to publish its notifications to an IRC channel. For more details on these aspects, see the install a new server and configure IRC procedures. Once the robot framework and the necessary extensions are installed, you're ready to set up a new job type as described below.

Job Functions

First you will need to create some functions in Perl which will do the actual work. These can reside anywhere and are made available to the daemon by appending a require statement to /var/www/tools/wikid.conf.

There are three special functions which the daemon may call directly: one when an instance of the job type is first started, one which is called iteratively until the job is complete, and one after the job has completed so that loose ends such as open database connections or file handles can be tied up. These functions are init, main and stop, and each is combined with the name of the job type when it is defined, for example initFoo. Only main is mandatory; init and stop may not be necessary for some kinds of jobs.

In addition to the three special job functions, a job script may also define internal helper functions to be called by the job functions when needed. These should follow the naming convention jobFooBar (the "Bar" function used by functions in the "Foo" job) to avoid naming conflicts with other jobs or components.
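The naming scheme described above can be sketched in Perl as follows. This is only an illustration: the "Foo" functions do no real work, and call_job_function is a hypothetical dispatcher written for this example, not the daemon's actual code. It simply shows how a function name can be built from the phase and the job type, and how a missing init or stop can be skipped.

```perl
use strict;
use warnings;

my @log;    # records which functions were called, for demonstration only

# Internal helper, named job<Type><Name> to avoid clashes with other jobs.
sub jobFooBar {
    my ($text) = @_;
    return uc $text;
}

# The three special functions for the hypothetical "Foo" job type.
sub initFoo { push @log, 'init' }               # once, when the job starts
sub mainFoo { push @log, jobFooBar('main') }    # repeatedly, until the job is done
sub stopFoo { push @log, 'stop' }               # once, after the job completes

# Hypothetical dispatcher (an assumption, not taken from wikid itself):
# build the function name from phase and job type, call it if it exists.
sub call_job_function {
    my ( $phase, $type ) = @_;
    my $func = main->can( $phase . $type ) or return 0;    # e.g. "initFoo"
    $func->();
    return 1;
}

call_job_function( $_, 'Foo' ) for qw(init main main stop);
print join( ',', @log ), "\n";
```

Looking the function up by name means a job type that defines only its main function still works, since the missing init and stop lookups just return false.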

When the daemon calls any of the three special job functions, it makes all of the data concerning the current job available through the $::job hashref. The job functions can store any data they like in this hash, and it remains available even if the daemon or the server needs to be restarted. Some of the keys in the job hash are used by the job execution framework as follows (the $$ is needed because $::job is a reference to a hash rather than an actual hash).

  • $$::job{'id'} the daemon will set this to a GUID when a new job is started
  • $$::job{'type'} the daemon sets this when the job is started
  • $$::job{'user'} the daemon sets this to the wiki user it will interact with the wiki through
  • $$::job{'start'} the daemon sets this to the UNIX time stamp when the job is started
  • $$::job{'paused'} a job's main will not be called if its paused value is not empty
  • $$::job{'finish'} the daemon sets this to the UNIX time stamp when the job finishes
  • $$::job{'length'} the job's init function should calculate and set this if the number of iterations is able to be determined
  • $$::job{'status'} this can be set by the job functions and is displayed in the WikidAdmin special page
  • $$::job{'revisions'} if the job changes wiki articles, this value should track the number of changes made
  • $$::job{'errors'} any errors occurring during execution should be logged in this value
  • $$::job{'wptr'} the daemon sets this to zero when the job starts and increments it on every execution of the job's main function. The job is automatically stopped when this value reaches the length value.
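As a rough sketch of how these keys fit together, the following Perl script builds a job hash by hand and runs a simplified loop in place of the daemon. Everything outside the job functions is an assumption made for illustration: the @items work list is hypothetical, and the loop only imitates the behaviour described above (it is not the daemon's code). The arrow form $::job->{key} is used here; it is equivalent to the $$::job{key} form shown in the list.

```perl
use strict;
use warnings;

# Stand-in for the job hash; in a real job the daemon supplies $::job.
$::job = { type => 'Foo', paused => '', wptr => 0, revisions => 0, errors => '' };

my @items = qw(alpha beta gamma);    # hypothetical work list

sub initFoo {
    $::job->{length} = scalar @items;    # iteration count is known up front
    $::job->{status} = 'Starting';
}

sub mainFoo {
    my $item = $items[ $::job->{wptr} ];    # wptr is the current iteration
    $::job->{revisions}++;                  # pretend one article was changed
    $::job->{status} = "Processed $item";
}

sub stopFoo {
    $::job->{status} = 'Done';
}

# Simplified imitation of the daemon's handling of one job.
$::job->{start} = time;
initFoo();
while ( $::job->{wptr} < $::job->{length} ) {
    last if $::job->{paused};    # main is not called while paused is non-empty
    mainFoo();
    $::job->{wptr}++;            # incremented after every call to main
}
if ( $::job->{wptr} == $::job->{length} ) {
    $::job->{finish} = time;
    stopFoo();
}

printf "%s: %d/%d iterations, %d revisions\n",
    $::job->{status}, $::job->{wptr}, $::job->{length}, $::job->{revisions};
```

Because the job keeps its position in wptr rather than in a local variable, a main function written this way can resume from the same item after a restart.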

Creating jobs to be started manually

The usual way to run a job is manually from the WikidAdmin special page. If its functions have been correctly defined and included by the daemon, the job type will appear in the list on the special page. Simply select the job type and click the start button.

Some jobs require specific parameters to be set before they can run; for example, import jobs require a file to be uploaded or a previously uploaded file to be selected. To create a new job which requires such parameters to be filled in first, two additional functions must be written. These functions are MediaWiki hooks written in PHP. The hook names depend on the job name; for example, the following hooks may be added to the LocalSettings.php file of the wiki to provide a form for our previous "Foo" job:

{{{1}}}


Here's a typical function definition for the form rendering hook. All that's required here is to set the passed $html variable to the HTML content of the form. In this example we're providing an upload button and a list of files that have been previously uploaded; the files are stored in the internal location specified in $wgFooDataDir.

{{{1}}}


And here's the corresponding processing hook function, which saves a file into the internal data location ($wgFooDataDir) if one has been uploaded with the form, and starts a job if a valid file has been selected from the list or uploaded. The $job array is empty on entry and should be filled with the data to be made available to the executing job (through the global $::job hashref). The $start variable specifies whether or not the job should run; this allows the processing function to abort the job if there's a problem with the supplied data.

{{{1}}}

Creating event-driven jobs

Jobs may be started manually from the WikidAdmin special page, or by various events in the wiki or the Perl environment. To handle a wiki event, simply define an event handler function for the wiki hook name; this will override any default handler supplied by the daemon. In the following example code, the "Foo" job is started when an article in the "Bar" namespace is edited.

{{{1}}}