GemStone supports Shared Page Cache monitoring via Prometheus, on Linux on x86/64.
GemStone’s statprom executable starts a process that can be queried from the Prometheus monitoring software, to retrieve GemStone cache statistics values. A statprom process may connect to either the Stone’s cache or a remote cache; you will need multiple instances to monitor multiple caches.
When statprom is started, it takes a configuration file as an argument. This file is customized for a specific cache and the specific monitoring requirements; it includes the port number, the Stone name, and a list of statistics to report. The configuration file is in JSON format.
While statprom can report any GemStone statistic originating in the shared page cache, host system statistics are not available.
Prometheus is an open-source systems monitoring and alerting toolkit that collects and stores metrics (numeric data) as time series, along with key-value tags. Prometheus is widely used, and the Prometheus GitHub project has an active developer and user community.
In addition to Prometheus, Grafana can be installed and used for live monitoring of Prometheus data; Grafana provides out-of-the-box support for Prometheus, and no additional configuration is needed to collect the GemStone data from Prometheus.
statprom is built on the open-source GitHub project Prometheus Client Library for Modern C++ (jupp0r.github.io/prometheus-cpp). This embeds a web server from the CivetWeb project (civetweb.github.io/civetweb), which handles HTTP requests from Prometheus. statprom in turn uses the GemStone C Statistics Interface (GCSI) to access cache statistics. The GCSI attaches the process to the shared page cache as read-only; it does not create a gem session, and therefore has no view of the repository.
The statprom process must be started with a configuration file that specifies the stone or cache name, the port on which to expect queries from Prometheus, and the specific cache statistics that can be returned to Prometheus. The statprom configuration file is in JSON format and requires specific keys to be present in a specific structure.
For details on the file format, see Configuration file JSON format. An example configuration file is included in the distribution, $GEMSTONE/examples/Prometheus/stats.json.
The statprom configuration file allows you to monitor the Stone, the Shared Page Cache Monitor, or Gems. Any GemStone cache-based statistic of these processes can be accessed; however, host process statistics cannot be monitored directly.
The Stone and Shared Page Cache Monitor are singletons within a cache, and thus straightforward to monitor. However, a GemStone system contains multiple gems with different purposes and monitoring requirements. Prometheus does not handle multiple instances, so there is additional configuration required for monitoring Gems.
When multiple Gems match the criteria, the statistics values for all of them are added together. The monitoring criteria should be designed carefully so that either a single Gem process is identified, or if there is a chance that multiple Gem processes will be matched, that the specific statistic being monitored can logically be summed over all the matched Gems.
There are two options for filtering on the Gem you wish to monitor:
Cache statistics data includes an entry for a String ProcessName, which is set by the System; for example, Gem or TopazL. This can be set to a specific value in your application in several ways: using the -u option on the topaz command line, executing System cacheName: gemName, or set cachename gemName in topaz (note that topaz set cachename takes effect on the next login, and does not affect any current sessions).
statprom matches the Gem ProcessName using regular expressions, e.g. MyGemName.*.
Starting with this version, cache statistics data includes an entry for an integer GemKind. This is 0 by default, and negative for system Gems. This can be set using System setGemKind: anInt. Negative values are reserved for use by GemTalk.
statprom matches the Gem processes’ GemKind, using a high and low value, forming an inclusive range.
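The two filters can be pictured with a short Python sketch. This is only an illustration of the matching rules described above, not statprom's implementation: the Gem record, the gem_matches helper, and the choice of full-string regular expression matching are all assumptions made for the example.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Gem:
    """Hypothetical stand-in for one Gem's entry in the cache statistics."""
    process_name: str  # e.g. "Gem", "TopazL", or an application-set name
    gem_kind: int      # 0 by default, negative for system Gems


def gem_matches(gem: Gem,
                name_regex: Optional[str] = None,
                kind_min: Optional[int] = None,
                kind_max: Optional[int] = None) -> bool:
    """Return True if the Gem passes every filter that was supplied."""
    if name_regex is not None:
        # Assumption: the pattern must match the whole ProcessName.
        if re.fullmatch(name_regex, gem.process_name) is None:
            return False
    if kind_min is not None and kind_max is not None:
        # The GemKind range is inclusive at both ends.
        if not (kind_min <= gem.gem_kind <= kind_max):
            return False
    return True


gems = [Gem("MyGemName1", 5), Gem("MyGemName2", 7), Gem("TopazL", 0)]
by_name = [g.process_name for g in gems
           if gem_matches(g, name_regex=r"MyGemName.*")]
by_kind = [g.process_name for g in gems
           if gem_matches(g, kind_min=5, kind_max=7)]
print(by_name)  # ['MyGemName1', 'MyGemName2']
print(by_kind)  # ['MyGemName1', 'MyGemName2']
```

Both filters select the same two Gems here; the GemKind values 5 and 7 are arbitrary non-negative values chosen for the example.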
statprom -f cfgFile [-c] [-d] [-r]
statprom -h | -v
  -c          Check the JSON file (-f argument) for errors and exit.
              Requires -f.
  -d          Enable printing debug output to stderr.
  -f <cfgFile>
              Specifies a configuration file in JSON format which
              determines which processes and statistics are collected.
  -h          Print this help screen and exit.
  -r          Retry if the shared page cache not running. If the cache
              connection is lost, sleep and attempt to reattach.
              Without -r the process exits if the cache connection is
              lost or not present at startup time.
  -v          Print the program version and exit.
Prometheus in turn must be started with a configuration file that specifies the node on which statprom is running and the configured port.
For example, the Prometheus configuration file may include an additional entry such as:
  - job_name: 'gemstone'
    static_configs:
      - targets: ['nodename.gemtalksystems.com:9985']
Once Prometheus is started with its updated configuration file, and statprom has been started with its configuration file, Prometheus will start querying and recording information for the specified statistics.
Prometheus will retrieve and store values with the name provided in the statprom configuration file. In addition to data tags such as the job name, the name of the Stone is provided as a tag with the key StoneName.
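To make the naming and tagging concrete, the sketch below parses a sample in the Prometheus text exposition format and pulls out the metric name, the StoneName tag, and the value. The sample text is hypothetical (it is not captured statprom output, and the value 12 is made up), and parse_sample is a helper written for this illustration that handles only simple lines with no timestamps and no escaped characters in label values.

```python
import re

# Hypothetical scrape text; the value is made up for illustration.
SAMPLE = """\
# HELP gemstone_stone_commit_records Number of commit records.
# TYPE gemstone_stone_commit_records gauge
gemstone_stone_commit_records{StoneName="gs64stone"} 12
"""

# name, optional {labels}, then a single value
LINE_RE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="([^"]*)"')


def parse_sample(text):
    """Yield (name, labels, value) for each non-comment line."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        m = LINE_RE.match(line)
        if m is None:
            continue  # skip anything this simple parser does not understand
        name, label_text, value = m.groups()
        labels = dict(LABEL_RE.findall(label_text or ""))
        yield name, labels, float(value)


for name, labels, value in parse_sample(SAMPLE):
    print(name, labels["StoneName"], value)
```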
A sample JSON file that could be used as an argument to statprom is included in $GEMSTONE/examples/Prometheus/stats.json.
This provides an example of the main features of statprom configuration.
"http" : {
  "listen_addresses" : ["[::]:9985","9985"]
},
This specifies that either IPv6 or IPv4 connections are accepted on port 9985 on localhost.
Note that statprom does not support SSL connections at this time.
"gemstone" : {
  "cache_name" : null,
  "stone_name" : "gs64stone",
  "sample_interval": 60
},
Three types of metrics are supported: monitor (shared page cache monitor), stone, and gem. You do not need to include all three types.
For each type, there should be an array of specific metrics to be monitored for that type. The JSON metrics objects have the following members:
"metrics" : {
  "stone" : [
    {
      "vsd_name" : "CommitRecordCount",
      "metric_name" : "gemstone_stone_commit_records",
      "metric_type" : "Gauge",
      "metric_help" : "Number of commit records.",
      "metric_units" : "Commit Records"
    },
In addition to the above object members, entries for Gems require filter criteria: either cache_name_regex, or both gem_kind_min and gem_kind_max, or all three.
All Gems in the cache that match the filter criteria have their statistics values added together for return to Prometheus. You must be careful to ensure that either a unique Gem can be confidently matched to the filter criteria, or that the particular statistic values are meaningful if added together for multiple Gems.
If you have multiple Gems with statistics that need to be separately reported, you will need separate entries, each with a unique Prometheus statistic name, for each Gem/statistic.
Because the values for all Gems matching the filter criteria are summed, you can use this deliberately, for example when a task is divided over multiple Gems.
"gem" : [
  {
    "cache_name_regex" : "Widget.*",
    "vsd_name" : "SessionStat01",
    "metric_name" : "gemstone_gem_widgets_produced",
    "metric_type" : "Counter",
    "metric_help" : "Number of widgets produced.",
    "metric_units" : "Widgets"
  }]
With this example, if there are multiple Gems with names that match Widget.*, the values will be summed. For example, suppose the cache contains the following Gems:
Gem Widget1, SessionStat01 == 6
Gem Widget2, SessionStat01 == 8
In this case, Prometheus will show the aggregate value of 14 (6 + 8) for the statistic gemstone_gem_widgets_produced.
Note that this example would also match Gems named WidgetLogging, WidgetDefects, and so on; you must be aware of your application’s Gem cacheName conventions.
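The arithmetic, and the risk of an over-broad pattern, can be checked with a few lines of Python. This is a sketch under stated assumptions, not statprom code: summed_value is a hypothetical helper, full-string matching is assumed, and the Inventory and WidgetLogging values are invented for the illustration.

```python
import re

# Hypothetical snapshot of SessionStat01 for each Gem in the cache.
session_stat_01 = {
    "Widget1": 6,
    "Widget2": 8,
    "Inventory": 3,
}


def summed_value(pattern, values):
    """Sum the statistic over every Gem whose name matches the pattern."""
    # Assumption: the pattern must match the whole Gem name.
    return sum(v for name, v in values.items() if re.fullmatch(pattern, name))


print(summed_value(r"Widget.*", session_stat_01))  # 14
print(summed_value(r"Widget1", session_stat_01))   # 6

# The same pattern also picks up any other Gem whose name starts with Widget.
session_stat_01["WidgetLogging"] = 100
print(summed_value(r"Widget.*", session_stat_01))  # 114
```

The last line shows how a Gem that was never meant to count widgets would silently inflate the reported total.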
If you do not wish to sum the values, you must be able to identify the specific Gem or Gems for reporting, and create specific entries for each individual value. In the above example, if you wished to monitor Widget1’s 6 and Widget2’s 8 separately, rather than summed, you could create the following entries:
"gem" : [
  {
    "cache_name_regex" : "Widget1",
    "vsd_name" : "SessionStat01",
    "metric_name" : "gemstone_gem_widgets1_produced",
    "metric_type" : "Counter",
    "metric_help" : "Number of widgets produced by Widget1.",
    "metric_units" : "Widgets"
  },
  {
    "cache_name_regex" : "Widget2",
    "vsd_name" : "SessionStat01",
    "metric_name" : "gemstone_gem_widgets2_produced",
    "metric_type" : "Counter",
    "metric_help" : "Number of widgets produced by Widget2.",
    "metric_units" : "Widgets"
  }]
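Pulling the fragments together, the Python sketch below assembles one configuration from the pieces shown in this section and checks each gem entry against the filter rule (cache_name_regex, or both gem_kind_min and gem_kind_max, or all three). Several things here are assumptions rather than documented behavior: that http, gemstone, and metrics are top-level keys of a single JSON object, that all five metric members are required, and that gem_kind_min and gem_kind_max must appear together. check_gem_entry is a hypothetical helper, not part of statprom; statprom -c remains the authoritative check.

```python
import json

config = {
    "http": {"listen_addresses": ["[::]:9985", "9985"]},
    "gemstone": {"cache_name": None, "stone_name": "gs64stone",
                 "sample_interval": 60},
    "metrics": {
        "stone": [{
            "vsd_name": "CommitRecordCount",
            "metric_name": "gemstone_stone_commit_records",
            "metric_type": "Gauge",
            "metric_help": "Number of commit records.",
            "metric_units": "Commit Records",
        }],
        "gem": [{
            "cache_name_regex": "Widget.*",
            "vsd_name": "SessionStat01",
            "metric_name": "gemstone_gem_widgets_produced",
            "metric_type": "Counter",
            "metric_help": "Number of widgets produced.",
            "metric_units": "Widgets",
        }],
    },
}

# Assumption: every metric entry carries all five of these members.
COMMON_KEYS = ("vsd_name", "metric_name", "metric_type",
               "metric_help", "metric_units")


def check_gem_entry(entry):
    """Return a list of problems found in one gem metric entry."""
    problems = [f"missing {k}" for k in COMMON_KEYS if k not in entry]
    has_regex = "cache_name_regex" in entry
    has_min = "gem_kind_min" in entry
    has_max = "gem_kind_max" in entry
    if has_min != has_max:
        problems.append("gem_kind_min and gem_kind_max must be given together")
    if not has_regex and not (has_min and has_max):
        problems.append("no filter criteria")
    return problems


for entry in config["metrics"]["gem"]:
    print(entry["metric_name"], check_gem_entry(entry) or "ok")

# Serialize to JSON text that could be saved and passed to statprom -f.
print(json.dumps(config, indent=2))
```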