The questions an enterprise traffic manager's metrics must answer are straightforward. How loaded is the system? How many requests is each vService handling? Which backend is slowing down? Which health check is DOWN? Is the WAAP attack rate climbing? Yet in many architectures, answering those questions means deploying, monitoring, updating and recovering a separate exporter binary.
The problem compounds in multi-process and fork architectures. Every worker generates its own statistics; those figures need to be merged at a single scrape point. If the aggregation is done poorly, Prometheus ends up with missing metrics, double-counted values or inconsistent panels. The operations team ends up managing the metric pipeline like a second application.
The dashboard side carries its own cost. Connecting data to a blank Grafana instance is only the start: panels, labels, alerts, thresholds and per-service breakdowns all have to be built from scratch. Without a label model that covers vService, backend group, health-check status and tenant, the dashboard yields only generic system graphs rather than actionable operational insight.
Metric type discipline is another critical concern. Monotonically increasing values must be exposed as counters; instantaneous readings and limit values as gauges. A wrong type assignment breaks rate calculations, alert rules and long-term trend analysis alike.
TR7 Native Prometheus + Grafana Integration removes this burden. More than 50 metrics, multi-process aggregation, correct gauge/counter separation, a vService and backend label model and ready-made Grafana dashboard JSON files make observability a natural part of the platform.
TR7 solves metric publishing through a built-in endpoint, process aggregation and a ready dashboard package — no external exporter required.
TR7 publishes metrics in Prometheus exposition format including HELP and TYPE lines. Gauge and counter values are presented in a directly scrapeable form without any additional configuration.
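To give a feel for what a scraper receives, the sketch below parses a small hand-written sample in the Prometheus text exposition format. The sample is illustrative only: the values, the help text, the `mode` label and the metric name `tr7_tm_vservice_req_tot` are assumptions made for this example, not output captured from a TR7 device. The parser is deliberately simplified and assumes no timestamps and no spaces inside label values.

```python
# Hand-written sample; names marked hypothetical are assumptions, not TR7 output.
SAMPLE = """\
# HELP tr7_device_cpu_detailed CPU usage breakdown (help text is illustrative).
# TYPE tr7_device_cpu_detailed gauge
tr7_device_cpu_detailed{host="lb-01",mode="user"} 12.5
# HELP tr7_tm_vservice_req_tot Total requests (metric name is hypothetical).
# TYPE tr7_tm_vservice_req_tot counter
tr7_tm_vservice_req_tot{host="lb-01",vservice="web"} 1027
"""


def parse_exposition(text):
    """Return (types, samples) parsed from exposition-format text.

    types maps metric name -> declared type; samples is a list of
    (series string, float value) pairs in input order.
    """
    types = {}
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("# TYPE "):
            # "# TYPE <name> <type>" splits into exactly four tokens.
            _, _, name, metric_type = line.split(None, 3)
            types[name] = metric_type
        elif line.startswith("#"):
            continue  # HELP lines and other comments
        else:
            series, value = line.rsplit(" ", 1)
            samples.append((series, float(value)))
    return types, samples


types, samples = parse_exposition(SAMPLE)
for series, value in samples:
    print(series, value)
```

The TYPE line is what tells Prometheus tooling whether a family is a gauge or a counter, which is why it matters that the endpoint emits it for every family.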
Traffic statistics from fork workers and child processes are aggregated in the primary metric publisher. Prometheus scrapes a single endpoint and receives the consolidated view; operators have no per-process exporters to manage.
Monotonically increasing counters and instantaneous gauges are correctly typed in the schema. This separation provides the right data model for Prometheus rate calculations, alert rules and dashboard panels.
TR7 ships Grafana dashboard JSON files for both global and detailed views. Operations teams import them and work with a production-ready metric model instead of building panels from scratch.
Native Prometheus + Grafana Integration brings device, vService, backend, QoS and health-check metrics together in a single observability model.
`tr7_device_uptime` reports per-host device uptime in seconds. `tr7_device_cpu_detailed` exposes the CPU breakdown by user, system, nice and irq as gauges. `tr7_device_mem_detailed` reports total, active, cached and buffer memory in megabytes. These metrics form the baseline for correlating traffic behavior with underlying system resources.
`tr7_tm_qos_cpu_count` reports the number of CPU cores allocated to a vService. `tr7_tm_qos_cpu_percent_limit` exposes the CPU percentage limit and `tr7_tm_qos_memory_limit` exposes the memory limit. These metrics are essential for capacity planning and tenant-level resource tracking. Operators can view traffic growth alongside the allocated resource envelope, not just as raw request counts.
At the vService level, metrics include uptime, process idle percent, SSL connections, SSL totals, SSL rate, compression in/out, logs dropped, memory usage, session limit, session total, request rate and request total. Response code counts across 1xx through 5xx are exposed as counters. Connection totals, bytes in/out and request errors clarify service behavior. These metrics are the primary panel data for SLA tracking, capacity analysis and error diagnosis.
`tr7_tm_vservice_waf_attack_rate` carries the WAAP attack rate to the Prometheus side. Security teams can write alert rules against this metric and track attack trends on their dashboards. Traffic volume and attack rate share the same vService label model, so the security signal stays connected to the operational context.
At the backend level, metrics cover newsession, session, response class counters, bytes in/out, connection error, response error and connection pool state. Queue time, connect time, response time and total time metrics help analyze backend latency. These measurements show which specific target is slowing down or starting to generate errors, so real backend behavior stays visible behind the aggregate vService graph.
`tr7_tm_bservice_hc_state` reports health check status with host, vservice, bservice_group, bservice and state labels. UP is encoded as 1, DOWN as 0 and NOCHECK as 2. This numeric model is convenient for Prometheus alert rules, because a DOWN backend can trigger an alert directly. A companion metric, `tr7_tm_bservice_hc_time`, tracks health check duration in milliseconds.
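The numeric encoding can be illustrated with a short sketch that picks DOWN backends out of a list of samples. The label names and the 1/0/2 encoding come from the description above; the sample data (hostnames, backend names, group names) is invented for the example.

```python
# Numeric encoding as described for tr7_tm_bservice_hc_state.
HC_STATE = {"UP": 1, "DOWN": 0, "NOCHECK": 2}


def down_backends(samples):
    """Return the label sets of all samples whose value encodes DOWN."""
    return [labels for labels, value in samples if value == HC_STATE["DOWN"]]


# Invented sample data: (labels, value) pairs for one vService.
samples = [
    ({"host": "lb-01", "vservice": "web", "bservice_group": "default",
      "bservice": "app-1", "state": "UP"}, 1),
    ({"host": "lb-01", "vservice": "web", "bservice_group": "default",
      "bservice": "app-2", "state": "DOWN"}, 0),
    ({"host": "lb-01", "vservice": "web", "bservice_group": "canary",
      "bservice": "app-3", "state": "NOCHECK"}, 2),
]

for labels in down_backends(samples):
    print(f'{labels["vservice"]}/{labels["bservice_group"]}/{labels["bservice"]} is DOWN')
```

Because NOCHECK is encoded as 2 rather than 0, a simple equality test on 0 does not mistake an unchecked backend for a failed one.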
The backend label model includes a bservice_group field that distinguishes the default group from dynamically or conditionally assigned backend groups. In large vService configurations, operators can identify which group is affected directly from the dashboard panel. The operations team gains topological visibility instead of a flat list of targets.
Metrics from TR7 worker processes are merged in the primary publisher. Prometheus scrapes a single `/metrics` endpoint and receives full visibility. This eliminates the need for per-process scraping and manual aggregation, which is particularly critical for producing consistent dashboards in high-traffic multi-fork deployments.
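How TR7 merges worker statistics internally is not described here. As a generic sketch of the idea, the example below sums per-worker values that share the same metric name and label set, which is a common way to merge counters. The `series_key` helper, the worker snapshots and the metric name `tr7_tm_vservice_req_tot` are hypothetical. Some gauges would need a different merge rule (for example a maximum or an average), which this sketch does not cover.

```python
from collections import defaultdict


def series_key(name, **labels):
    """Build a hashable key from a metric name and its labels (hypothetical helper)."""
    return (name, tuple(sorted(labels.items())))


def merge_worker_stats(worker_snapshots):
    """Sum values across workers for every (name, labels) key."""
    merged = defaultdict(float)
    for snapshot in worker_snapshots:
        for key, value in snapshot.items():
            merged[key] += value
    return dict(merged)


# Invented per-worker snapshots.
web = series_key("tr7_tm_vservice_req_tot", host="lb-01", vservice="web")
api = series_key("tr7_tm_vservice_req_tot", host="lb-01", vservice="api")
worker_a = {web: 600}
worker_b = {web: 427, api: 50}

merged = merge_worker_stats([worker_a, worker_b])
for (name, labels), value in sorted(merged.items()):
    print(name, dict(labels), value)
```

With a merge step like this, the scraper sees one series per label set regardless of how many workers contributed to it.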
Metric fields with no value are not emitted. This keeps empty placeholder series out of Prometheus, so dashboard panels show only values that genuinely exist. Fields absent from the current configuration do not inflate the metric series count.
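The effect of skipping empty fields can be sketched as a small renderer that emits one line per field and drops any field whose value is missing. The function names, the `tr7_tm_vservice` prefix plus field-name scheme and the sample fields are assumptions for illustration. Label values are not escaped here, which a real renderer would need to do.

```python
def format_labels(labels):
    """Render a label dict as {k="v",...} without escaping (simplification)."""
    inner = ",".join(f'{key}="{value}"' for key, value in labels.items())
    return "{" + inner + "}"


def render_fields(prefix, labels, fields):
    """Return exposition lines for every field that has a value."""
    lines = []
    for field, value in fields.items():
        if value is None:
            continue  # no value, so no series is emitted
        lines.append(f"{prefix}_{field}{format_labels(labels)} {value}")
    return lines


labels = {"host": "lb-01", "vservice": "web"}
fields = {"req_rate": 12, "ssl_rate": None}  # ssl_rate has no value here

for line in render_fields("tr7_tm_vservice", labels, fields):
    print(line)
```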
TR7_Detailed_Dashboard and TR7_Global_Dashboard JSON packages can be imported directly into Grafana. The global dashboard provides an overall device and service view; the detailed dashboard focuses on per-vService and per-backend breakdowns. Operations teams do not need to build panels from zero. Both dashboards are structured around the Prometheus label model shipped by TR7.
The Prometheus integration is built on metric prefixes, a defined label model, gauge/counter type separation and numeric health-check state codes.
Traffic manager metrics are published under the `tr7_tm_*` prefix. System-level metrics use the `tr7_device_*` prefix. This naming convention makes the metric family easy to locate in PromQL queries and Grafana variable selectors.
vService metrics are published with a `{host, vservice}` label set. The host value comes from the device hostname. The vservice label is used for per-service filtering and Grafana dashboard variables.
Backend metrics are published with a `{host, vservice, bservice_group, bservice}` label set. This model supports analysis at service, backend group and individual target levels. Alert rules can be narrowed down to a specific backend.
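The three levels of analysis correspond to grouping the same samples by different label subsets, much like a `sum by (...)` aggregation in PromQL. The sketch below shows the idea on invented session counts. The data and the `sum_by` helper are hypothetical; only the four label names come from the model described above.

```python
from collections import defaultdict


def sum_by(samples, *label_names):
    """Sum sample values grouped by the given label names."""
    totals = defaultdict(float)
    for labels, value in samples:
        group = tuple(labels[name] for name in label_names)
        totals[group] += value
    return dict(totals)


# Invented backend samples: (labels, current sessions).
samples = [
    ({"host": "lb-01", "vservice": "web", "bservice_group": "default", "bservice": "app-1"}, 10),
    ({"host": "lb-01", "vservice": "web", "bservice_group": "default", "bservice": "app-2"}, 20),
    ({"host": "lb-01", "vservice": "web", "bservice_group": "canary", "bservice": "app-3"}, 15),
]

print(sum_by(samples, "vservice"))                    # service level
print(sum_by(samples, "vservice", "bservice_group"))  # backend group level
print(sum_by(samples, "bservice"))                    # individual target level
```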
The health check state metric carries the state label with values UP, DOWN or NOCHECK. The sample value encodes the same state numerically (UP as 1, DOWN as 0, NOCHECK as 2), which simplifies writing alert rules. DOWN matches can be linked directly to Prometheus alert definitions.
Monotonically increasing values — req_tot, ssl_tot, session_total, response code counts, bytes in/out and request errors — are exposed as counters. These values should be analyzed with Prometheus rate or increase functions. They are the correct metric type for long-term traffic trend analysis.
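To show why counters are read through rate-style functions rather than as raw values, the sketch below computes an increase and a per-second rate from a few counter samples, treating any drop in value as a counter reset. This is a simplified approximation of the idea behind Prometheus `increase` and `rate`. The real functions also extrapolate to the window boundaries, which this sketch does not attempt. The sample values are invented.

```python
def counter_increase(samples):
    """Total increase over (timestamp, value) samples sorted by time.

    A drop in value is treated as a counter reset, so the post-reset
    value counts as the increase since the reset.
    """
    increase = 0.0
    for (_, previous), (_, current) in zip(samples, samples[1:]):
        increase += (current - previous) if current >= previous else current
    return increase


def counter_rate(samples):
    """Average per-second increase between the first and last sample."""
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        raise ValueError("need at least two samples with increasing timestamps")
    return counter_increase(samples) / elapsed


# Invented samples: the counter resets between t=30 and t=60.
samples = [(0, 100), (30, 160), (60, 40)]
print(counter_increase(samples))
print(counter_rate(samples))
```

If the same series were typed as a gauge, the drop from 160 to 40 would read as a fall in traffic rather than a restart, which is the kind of error correct typing avoids.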
Instantaneous readings — request rate, current connection count, limit values, health check time, queue time, connect time and response time — are exposed as gauges. Gauges reflect current state and are used for threshold-based alert rules. Limit and utilization values can be shown side by side on the same dashboard panel.
SRE teams add the TR7 `/metrics` endpoint as a Prometheus scrape target. Importing the ready-made Grafana dashboard JSON files immediately opens global and detailed views. No separate exporter deployment is required.
Operations teams can track `tr7_tm_vservice_memory_alloc` and related memory metrics over time. An alert can fire when utilization approaches a defined threshold. Capacity decisions are based on measured trends rather than estimates.
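A threshold check of this kind can be sketched as follows. The function, the 80% threshold and the requirement of three consecutive samples are illustrative assumptions, loosely mirroring how an alert rule with a hold duration behaves. The sketch also assumes that usage and limit are expressed in the same unit, which is not stated here.

```python
def should_alert(usage_samples, limit, threshold=0.8, min_consecutive=3):
    """True when the latest min_consecutive samples all reach threshold * limit.

    usage_samples is a time-ordered list of usage readings; limit is the
    configured ceiling in the same unit (an assumption of this sketch).
    """
    if limit <= 0 or len(usage_samples) < min_consecutive:
        return False
    recent = usage_samples[-min_consecutive:]
    return all(usage / limit >= threshold for usage in recent)


# Invented readings against a limit of 1000 units.
print(should_alert([700, 820, 850, 900], limit=1000))  # sustained high usage
print(should_alert([700, 820, 790, 900], limit=1000))  # one dip below threshold
```

Requiring several consecutive samples avoids alerting on a single short spike.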
Security teams can define a Prometheus alert rule on `tr7_tm_vservice_waf_attack_rate`. When the attack rate rises on a specific vService, the incident management workflow is triggered. Traffic and security visibility converge on the same dashboard.
When `tr7_tm_bservice_hc_state` reports 0 (DOWN), an alert can be raised. The alert identifies the affected target directly through its host, vservice, bservice_group and bservice labels. SRE teams can pinpoint which backend has gone down without scanning logs.
TR7 ships 50+ built-in metrics, multi-process aggregation and ready-made dashboard JSON files. Let us walk through a live setup in your own environment.