
CB-12 Set up monitoring and usage reporting for a campus HPC resource

A campus IT administrator wants to reuse XSEDE practices for monitoring compute
resources and reporting on their usage. The administrator would like to be able to
easily monitor load on the HPC environment and generate detailed usage reports based on
resource manager logs. We assume XSEDE has built a consensus around its own
“common environment” expectations for HPC-specific monitoring and metric-gathering.

In most cases, the campus IT administrator would like to experience it as follows.

  1.  The administrator finds software packages on XSEDE's website to install standard monitoring solutions in an HPC environment.
  2.  The administrator finds documentation on XSEDE’s website that he/she can use with minimal alterations for the local environment.
  3.  After installation and configuration, the administrator is able to see the load and job activity of the cluster environment under administration over time.

We’ll accept any solution to this problem, as long as the following are true.

  1.  Documentation and training materials provided by XSEDE must be released with a license that allows reuse and modification, such as the CC BY 3.0 license. [5]
  2.  Documentation and training materials should be provided in editable, commonly used formats.
  3.  Software provided by XSEDE must be available for use under a free-use license.
  4.  This solution does not require participation in XSEDE allocation processes.
  5.  It does not take longer than 1 day to set up these services on local resources.
  6.  The solution should provide an option to aggregate monitoring data from multiple systems. At this time, it does not matter which format or mechanism is used.