gangliarc provides a way to monitor an ARC Computing Element through an
existing ganglia installation. Running gangliarc adds various ARC-related
metrics to ganglia via gmetric; these metrics can then be viewed on the
ganglia web page for the ARC CE host.

The available metrics are configurable and explained below.

gangliarc takes its information (e.g. the control directory and cache
directories) from the standard ARC configuration file, arc.conf. If arc.conf
is not in the default location (/etc/arc.conf), the ARC_CONFIG environment
variable must be set to point to it.
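The lookup described above can be sketched as follows (an illustrative
snippet mirroring the rule, not gangliarc's own code):

```python
import os

# Sketch of the configuration lookup described above; not gangliarc's own code.
def resolve_arc_conf():
    # Honour ARC_CONFIG when set, otherwise fall back to the default path.
    return os.environ.get('ARC_CONFIG', '/etc/arc.conf')
```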
 
Requirements:
  python >= 2.4.x
  ganglia >= 3.0.x
  ARC >= 0.8.x (Some metrics are only available with ARC >= 1.0)

To install from sources:

  python setup.py install (as superuser)
  
To start/stop/restart:

  /etc/init.d/gangliarc start/stop/restart (as superuser)
  
  Log messages are logged to /var/log/arc/gangliarc.log by default but this can
  be configured.

To configure:
 
 In many cases no gangliarc configuration is necessary; however, the following
 parameters may be specified in a [gangliarc] section of arc.conf.

 - frequency -- how often information is gathered, in seconds. Default is 20.
 - dmax -- expiration period of information, passed to gmetric, in seconds.
           0 means infinity. Default is 180.
 - gmetric_exec -- path to the gmetric executable. Default is /usr/bin/gmetric.
 - logfile -- log file of the daemon. Default is /var/log/arc/gangliarc.log.
 - pidfile -- pid file of the daemon. Default is /var/run/gangliarc.pid.
 - python_bin_path -- path to the python executable. Default is /usr/bin/python.
 - metrics -- the metrics to be monitored. Default is all.
 
 metrics takes a comma-separated list of one or more of the following:
 - staging -- number of tasks in different data staging states
 - cache -- free cache space
 - session -- free session directory space
 - heartbeat -- last modification time of the A-REX heartbeat
 - processingjobs -- the number of jobs currently being processed by ARC (jobs
                     between PREPARING and FINISHING states)
 - failedjobs -- the number of failed jobs among the last 100 finished jobs
 - jobstates -- number of jobs in different A-REX internal states
 - all -- all of the above metrics
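How the metrics option might be interpreted can be sketched like this (the
metric names come from the list above; the expansion and validation logic is
illustrative, not gangliarc's own code):

```python
# Metric names from the list above; the expansion logic itself is illustrative.
VALID_METRICS = frozenset(['staging', 'cache', 'session', 'heartbeat',
                           'processingjobs', 'failedjobs', 'jobstates'])

def expand_metrics(value):
    # Split the comma-separated option; 'all' selects every metric.
    names = set(m.strip() for m in value.split(',') if m.strip())
    if 'all' in names:
        return set(VALID_METRICS)
    unknown = names - VALID_METRICS
    if unknown:
        raise ValueError('unknown metrics: %s' % ', '.join(sorted(unknown)))
    return names
```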

Explanation of data staging states:

 All data staging state metrics start with the prefix ARC_STAGING
 
 CACHE_WAIT -- Files waiting for a lock in the cache which is currently held
               by another request downloading the same file
 STAGE_PREPARE -- There is a limit (max_prepared in arc.conf) on files prepared
                  (pinned) on SRM storage. Files in this state are waiting for
                  a free slot to prepare their source or destination.
 STAGING_PREPARING_WAIT -- Files which have made a staging request to SRM and
                           are waiting for a TURL to be returned
 TOTAL -- All files in the data staging system (corresponding to jobs in
          PREPARING and FINISHING)
 TRANSFERRING_hostname -- Files actively transferring data, with the transfer
                          running on hostname
 TRANSFER_WAIT -- Files ready for transfer and waiting on a TRANSFERRING slot  
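Each sampled value reaches ganglia through a gmetric call, roughly as in the
helper below (an illustrative sketch; the -n, -v, -t and -d flags are
standard gmetric command-line options):

```python
import subprocess

def publish_metric(name, value, gmetric_exec='/usr/bin/gmetric', dmax=180):
    # Push one value into ganglia via gmetric. The -n (name), -v (value),
    # -t (type) and -d (dmax) flags are standard gmetric options.
    return subprocess.call([gmetric_exec,
                            '-n', name,          # e.g. ARC_STAGING_TRANSFER_WAIT
                            '-v', str(value),
                            '-t', 'uint32',      # metric value type
                            '-d', str(dmax)])    # expire after dmax seconds
```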

Example configuration:
 
 [gangliarc]
 python_bin_path="/usr/bin/python2.6"
 metrics="cache,heartbeat,failedjobs,processingjobs,staging"
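Read as an ini-style file, a configuration like the example above could be
loaded as sketched below (this assumes arc.conf parses with Python's
ConfigParser and that quotes around values should be stripped; gangliarc's
own parser may differ):

```python
# Sketch: read the [gangliarc] section and fall back to the documented
# defaults. Assumes an ini-style arc.conf with quoted values; illustrative
# only, not gangliarc's own code.
try:
    from configparser import ConfigParser          # Python 3
except ImportError:
    from ConfigParser import ConfigParser          # Python 2

DEFAULTS = {
    'frequency': '20',
    'dmax': '180',
    'gmetric_exec': '/usr/bin/gmetric',
    'logfile': '/var/log/arc/gangliarc.log',
    'pidfile': '/var/run/gangliarc.pid',
    'python_bin_path': '/usr/bin/python',
    'metrics': 'all',
}

def load_gangliarc_config(path):
    parser = ConfigParser()
    parser.read(path)
    conf = dict(DEFAULTS)
    if parser.has_section('gangliarc'):
        for key, value in parser.items('gangliarc'):
            conf[key] = value.strip('"')   # arc.conf values may be quoted
    return conf
```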


Authors:

 David Cameron (d.g.cameron@fys.uio.no)
 Dmytro Karpenko (dmytro.karpenko@fys.uio.no)
 
More information:

 http://wiki.nordugrid.org/index.php/Gangliarc
