C.4 RRD Definition Using Deployable .metric Files

The “aggregation” functions performed by the Metrics Facility’s internal RRD data structures can be customized using deployable XML .metric definition files. This accommodates a flexible configuration of the following:

The deployable definition files, one per metric to be aggregated, consist of the following:

When an RRD is defined through deployment of its definition file, three RRAs are created for each Period: AVERAGE, MAX, and MIN. A new DS (datasource) is added to the RRD for each resource reporting the metric to be aggregated. This requires the RRD file to be re-created each time a new resource begins reporting a given metric, with the previously aggregated values copied from the old RRD to the new one. This approach enhances performance and flexibility, but the RRD file is not of fixed size: over time, the RRD grows or shrinks as resources are added to the system or deleted from the Orchestrate model.

NOTE: The RRD is actually re-created with a new DS added for each new resource, and the data from the “old” RRAs is copied into it.

Deleting a Resource GridObject removes its DS from the RRD file (actually, from all RRDs with metrics reported by that resource).
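The re-create-and-copy behavior described above can be illustrated with a minimal sketch. The ToyRrd class and the add_resource and remove_resource helpers are hypothetical in-memory stand-ins invented for illustration; they are not the rrd4j API or the Metrics Facility implementation.

```python
from dataclasses import dataclass, field


@dataclass
class ToyRrd:
    """Hypothetical in-memory stand-in for an RRD file (not the rrd4j API)."""
    datasources: tuple
    # archives[(period, cf)] is a list of rows; each row maps DS name -> value
    archives: dict = field(default_factory=dict)


def recreate(old, datasources):
    """Build a new ToyRrd with the given DS set, copying retained values."""
    new = ToyRrd(datasources=tuple(datasources))
    for key, rows in old.archives.items():
        new.archives[key] = [
            # A DS with no history in the old RRD gets None (unknown).
            {ds: row.get(ds) for ds in new.datasources}
            for row in rows
        ]
    return new


def add_resource(rrd, resource):
    """A new resource starts reporting: re-create the RRD with one more DS."""
    if resource in rrd.datasources:
        return rrd
    return recreate(rrd, rrd.datasources + (resource,))


def remove_resource(rrd, resource):
    """A resource is deleted: re-create the RRD without its DS."""
    return recreate(rrd, tuple(ds for ds in rrd.datasources if ds != resource))


rrd = ToyRrd(("node1",), {("1_minute", "AVERAGE"): [{"node1": 0.42}]})
grown = add_resource(rrd, "node2")
shrunk = remove_resource(grown, "node1")
```

In this sketch the original object is never modified; each change produces a new structure, which mirrors the way the old RRD is replaced rather than edited in place.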

One optimization you can implement for storing the smallest Period (consisting of a single step) is to create only a single RRA (instead of three), because the average of a single datapoint is equal to the maximum and minimum of a single datapoint.
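The reasoning behind this optimization can be checked with a short sketch. The consolidation_functions and consolidate helpers are hypothetical names used only for illustration; they assume that a Period's steps value is the number of samples consolidated into each stored datapoint.

```python
from statistics import mean


def consolidation_functions(steps):
    """Pick the RRAs to create for a Period with the given number of steps."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    # With a single step, AVERAGE, MAX, and MIN would store identical values.
    return ("AVERAGE",) if steps == 1 else ("AVERAGE", "MAX", "MIN")


def consolidate(samples):
    """Consolidate one Period's worth of samples with all three functions."""
    return {"AVERAGE": mean(samples), "MAX": max(samples), "MIN": min(samples)}


single = consolidate([0.75])       # one step: all three values coincide
several = consolidate([0.5, 1.5])  # several steps: the values differ
```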

NOTE: The RRD files created by the Java rrd4j library are not binary compatible with RRD files generated by the rrdtool used by gmetad. They are, however, portable across operating system architectures (for example, 32-bit big-endian vs. 64-bit little-endian), which is not possible with traditional RRD files created using rrdtool.

C.4.1 XML Format for Deployable .metric Definitions

An example of the format of the deployable RRD definition is shown below.

<metric name="load_one" heartbeat="120"
        description="Ganglia one-minute load average">

    <period name="1_minute" steps="1" rows="60" xff=".5"
        description="1 hour worth of 1 minute (raw) data"/>

    <period name="5_minute" steps="5" rows="12" xff=".5"
        description="1 hour worth of 5 minute aggregations"/>

    <period name="10_minute" steps="10" rows="72" xff=".5"
        description="12 hours worth of 10 minute aggregations"/>

    <period name="1_hour" steps="60" rows="24" xff=".5"
        description="1 day's worth of 1 hour aggregations"/>

</metric>

This example creates an RRD for the load_one metric, with four aggregation periods (RRAs) called 1_minute, 5_minute, 10_minute, and 1_hour. The default sample (RRD update) time is one minute, so the 10_minute aggregation period has 10 steps. The RRA retains 72 “rows” of aggregated datapoints before the oldest is dropped, which represents 12 hours of data (72 rows * 10 minutes = 720 minutes).
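The retention arithmetic for each period can be verified with a short sketch that parses a definition like the example and multiplies steps by rows by the sample time. The retention function is a hypothetical helper, and the 60-second sample time is taken from the default stated in this section; a deployment with a different sample time would give different results.

```python
import xml.etree.ElementTree as ET

SAMPLE_SECONDS = 60  # assumption: the default one-minute sample (update) time

METRIC_XML = """<metric name="load_one" heartbeat="120"
        description="Ganglia one-minute load average">
    <period name="1_minute" steps="1" rows="60" xff=".5"/>
    <period name="5_minute" steps="5" rows="12" xff=".5"/>
    <period name="10_minute" steps="10" rows="72" xff=".5"/>
    <period name="1_hour" steps="60" rows="24" xff=".5"/>
</metric>"""


def retention(xml_text, sample_seconds=SAMPLE_SECONDS):
    """Return the seconds of history each period retains, keyed by name."""
    root = ET.fromstring(xml_text)
    result = {}
    for period in root.findall("period"):
        steps = int(period.get("steps"))  # samples per aggregated datapoint
        rows = int(period.get("rows"))    # datapoints kept in the RRA
        result[period.get("name")] = steps * rows * sample_seconds
    return result


RETENTION = retention(METRIC_XML)
for name, seconds in RETENTION.items():
    print(f"{name}: {seconds / 3600:g} hours")
```

Under these assumptions, the computed retention agrees with the description attribute of each period in the example: 1 hour, 1 hour, 12 hours, and 24 hours.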

NOTE: The 1_minute period is not a true aggregation because the default sample (RRD update) time is also one minute. In this case, the “raw” datapoints are stored for historical reference.