The standard GSLB pattern looks like this: monitor an endpoint; when it goes unhealthy, shift DNS responses to the backup; when it recovers, shift back. Implementations from major GSLB vendors treat the two transitions as inverses of a single condition.
Production reality is not symmetric. Failing over to a backup is a defensive move executed under pressure; returning to the primary is an offensive move executed under confidence. The conditions, the wait windows, the operator approvals, and the triggers that should fire are different in each direction.
Three examples. Fast-in / slow-out: fail over on the first detected error to protect the user, but require 15 minutes of clean health before returning, to avoid a flap. Conservative-in / fast-out: require three consecutive failures before initiating failover, but return immediately when the primary recovers, to meet RPO/RTO commitments. Auto-in / manual-out: failover is automated, but the return path requires SRE acknowledgment after a runbook review.
None of these policies can be expressed in a one-way scenario, where a single condition drives both transitions. The operator either picks one direction and lives with the wrong policy in the other, or wires together brittle custom scripts that drift from the GSLB's source of truth.
TR7 GTM bidirectional scenarios let activation and deactivation carry independent conditions, independent gating checks and independent trigger actions — the policy structure your incident response runbook already assumes.
A scenario is a named, reusable state machine with two directions. Each direction is defined by a combined condition expression and a trigger set; the two directions do not need to be inverses.
On the activation side, a combined condition expression evaluates the underlying health checks. When it returns true, the scenario activates. Optional triggers then fire actions (HTTP/HTTPS webhooks, Oracle queries), and an optional gating check confirms that the triggers should proceed.
On the deactivation side, a separate combined condition expression evaluates its own set of health checks. The deactivation path can be the inverse of activation, or it can require additional stability, additional probes, or different triggers entirely.
Conditions are not single booleans — they are groups of health-check results joined with AND inside a group and OR across groups, with optional negation. The same DSL that drives traffic-rule logic on the ADC also drives scenario evaluation here.
A scenario is defined once and referenced by name from DNS records, disaster-recovery configurations and cross-DC failover policy. Operators do not re-author the same logic in multiple places.
Bidirectional scenarios are the foundation of TR7 GTM's failover and recovery policy.
The activation condition is built from health-check results across one or more health-check profiles. Groups join checks with AND; multiple groups join with OR. Each individual check can be negated. Operators express conditions like "(API is up AND database is up) OR (failover path A is up AND failover path B is up)" without scripting.
The deactivation condition is independent of activation. Operators express conditions like "primary has been healthy for 15 minutes AND latency is below threshold" while the activation condition may have been simply "primary is down."
Activation triggers and deactivation triggers are separate selectable sets. An activation event may notify the on-call SRE; a deactivation event may notify the SRE plus run a synthetic transaction plus emit a webhook to the deployment system.
An optional gating condition runs before each direction's triggers execute. If the gating check returns false, the state transition still happens but the triggers do not fire. Use case: state transitions automatically, but external notifications only fire during business hours.
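The gating behavior described above can be sketched as follows. This is a minimal illustration, not TR7 GTM code: the `Direction` class, the `transition` function and the `business_hours` gate are hypothetical names, and the only behavior taken from the text is that a failed gating check suppresses the triggers while the state transition still happens.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Direction:
    """One direction of a scenario (hypothetical model, not the product's API)."""
    triggers: List[Callable[[], None]] = field(default_factory=list)
    gate: Optional[Callable[[], bool]] = None  # None means "no gating check"


def transition(direction: Direction) -> dict:
    """Apply the state change; run triggers only if the gating check allows it."""
    gate_passed = direction.gate is None or direction.gate()
    fired = 0
    if gate_passed:
        for trigger in direction.triggers:
            trigger()
            fired += 1
    # The transition itself is unconditional once the condition has matched.
    return {"transitioned": True, "gate_passed": gate_passed, "triggers_fired": fired}


def business_hours(hour: int) -> bool:
    """Assumed definition of business hours: 09:00 to 17:59."""
    return 9 <= hour < 18


notified: List[str] = []
failover = Direction(
    triggers=[lambda: notified.append("page-sre")],
    gate=lambda: business_hours(3),  # 03:00, outside business hours
)
result = transition(failover)
```

Here `result["transitioned"]` is true while `notified` stays empty, which matches the business-hours use case: the failover takes effect, but nobody is paged at 03:00.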
Each direction supports three operator-selectable modes: "auto", "on" and "off". In auto, the direction follows its condition expression. In on, the direction is forced to activate regardless of conditions (manual override). In off, the direction is disabled entirely (e.g., to disable failback during a maintenance window).
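One way to read the three modes is as a small decision function per direction. The sketch below is an assumption-laden illustration: the `Mode` enum and `direction_fires` are hypothetical names, and the product's internal representation may differ.

```python
from enum import Enum


class Mode(Enum):
    AUTO = "auto"  # follow the condition expression
    ON = "on"      # manual override: force the direction
    OFF = "off"    # direction disabled


def direction_fires(mode: Mode, condition_result: bool) -> bool:
    """Decide whether a direction fires, given its mode and condition result."""
    if mode is Mode.ON:
        return True
    if mode is Mode.OFF:
        return False
    return condition_result


# Failback disabled during a maintenance window, even though the primary is healthy.
failback_allowed = direction_fires(Mode.OFF, condition_result=True)
```

With the deactivation direction set to off, `failback_allowed` is false no matter what the health checks report.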
When two data centers are defined, TR7 GTM auto-generates four scenarios per pair: from-active, to-active, from-backup, to-backup — each with appropriate condition logic based on WAN access, LAN access and internet reachability checks. Operators can use the auto-generated scenarios as-is, customize them, or create their own from scratch.
DNS records can have their healthy/unhealthy state driven by a scenario rather than a static boolean or a single health check. The per-record `cond` field accepts a scenario reference: while the scenario is active, the record is excluded from responses; once it deactivates, the record returns.
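The effect of a per-record `cond` reference on the answer set can be sketched like this. Only the field name `cond` comes from the text; the `Record` class, the `answer_set` function, the scenario name and the addresses (taken from documentation ranges) are illustrative assumptions, as is treating an unknown scenario name as inactive.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Record:
    address: str
    cond: Optional[str] = None  # name of the scenario that excludes this record


def answer_set(records: List[Record], active_scenarios: Dict[str, bool]) -> List[str]:
    """Return the addresses that may appear in a DNS response."""
    return [
        r.address
        for r in records
        # Assumption: a scenario missing from the map counts as inactive.
        if r.cond is None or not active_scenarios.get(r.cond, False)
    ]


records = [
    Record("192.0.2.10", cond="dc-a-down"),  # primary, excluded while dc-a-down is active
    Record("198.51.100.10"),                 # backup, always eligible
]
during_incident = answer_set(records, {"dc-a-down": True})
after_recovery = answer_set(records, {"dc-a-down": False})
```

During the incident only the backup address is served; after the scenario deactivates, the primary address returns to the answer set.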
Disaster-recovery records can specify `drCond` — a scenario that determines when the DR record set replaces the primary record set in responses. The DR scenario evaluation is bidirectional, supporting controlled failover and controlled failback.
Triggers fire as HTTP/HTTPS calls (custom URI, method, headers, body, expected status codes, content-match query) or Oracle database calls (configured SQL). Operators wire scenario activations into existing incident management, deployment, or audit pipelines.
Every scenario state change is recorded: which direction fired, which conditions evaluated true/false, which triggers ran, which gating check passed. Post-incident review reconstructs the exact sequence of automated decisions without manual log archaeology.
Scenarios are operated together with health-check definitions, trigger configurations, DNS record bindings, and disaster-recovery configurations.
Within a condition group, all listed checks must evaluate true (AND). Across groups, any one group evaluating true is sufficient (OR). The `!` suffix on a check ID negates it. The grouping structure is symmetric for activation and deactivation; each direction has its own group set.
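The grouping rules above are straightforward to model. The following is a minimal sketch, assuming a condition is stored as a list of groups and each group as a list of check IDs; that storage format, the function names and the check IDs are hypothetical. Only the AND/OR semantics and the `!` suffix come from the text.

```python
from typing import Dict, List

# Hypothetical representation: a condition is a list of groups, and each
# group is a list of check IDs. A trailing "!" negates the check.
Condition = List[List[str]]


def check_is_true(token: str, health: Dict[str, bool]) -> bool:
    """Resolve one check token against the precomputed health states."""
    negated = token.endswith("!")
    check_id = token[:-1] if negated else token
    state = health[check_id]  # an unknown ID raises KeyError on purpose
    return (not state) if negated else state


def evaluate(condition: Condition, health: Dict[str, bool]) -> bool:
    """AND inside each group, OR across groups."""
    return any(
        all(check_is_true(token, health) for token in group)
        for group in condition
    )


health = {"api": True, "db": False, "path_a": True, "path_b": True}

# (api AND db) OR (path_a AND path_b)
activation = [["api", "db"], ["path_a", "path_b"]]
# (NOT api) OR (NOT db)
primary_down = [["api!"], ["db!"]]

activation_result = evaluate(activation, health)
primary_down_result = evaluate(primary_down, health)
```

With the database down, the first group of `activation` fails but the second succeeds, so the expression is true; `primary_down` is true because `db!` negates a failing check. Activation and deactivation would each hold their own `Condition` value and be evaluated with the same function.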
Conditions reference health checks by ID. User-defined health-check profiles and auto-generated DC-pair checks share the same ID space. Operators mix manual and auto checks in the same condition group.
When activation is forced to on (manual override), deactivation evaluation typically continues — operators can manually activate, then let the deactivation condition decide when to restore. Forcing both directions to on creates a stuck state and is logged as a configuration warning.
Triggers fire with a structured payload carrying scenario ID, direction, evaluation timestamp, and the configuration snapshot at trigger time. Trigger failure (HTTP non-2xx, Oracle error) is logged and optionally retried per trigger profile.
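A trigger call with retry might look roughly like the sketch below. The payload field names, the `max_attempts` parameter and the injected `send` callable are all assumptions made for illustration; the source only states which pieces of information the payload carries and that a non-2xx HTTP result counts as a failure that may be retried.

```python
import json
from typing import Callable


def build_payload(scenario_id: str, direction: str, evaluated_at: str, config: dict) -> str:
    """Serialize the trigger payload (field names are hypothetical)."""
    return json.dumps(
        {
            "scenario_id": scenario_id,
            "direction": direction,
            "evaluated_at": evaluated_at,
            "config_snapshot": config,
        },
        sort_keys=True,
    )


def fire_with_retry(send: Callable[[str], int], payload: str, max_attempts: int = 3) -> bool:
    """Call send(payload) until it returns a 2xx status or attempts run out."""
    for _ in range(max_attempts):
        status = send(payload)
        if 200 <= status < 300:
            return True
    return False


payload = build_payload(
    "dc-a-to-dc-b", "activation", "2024-01-01T03:00:00Z", {"mode": "auto"}
)

# Stand-in for a real HTTP client: fails twice, then succeeds.
responses = iter([503, 503, 200])
delivered = fire_with_retry(lambda body: next(responses), payload)
```

Passing the sender in as a callable keeps the retry logic testable without a network; a real deployment would also log each failed attempt, as the text describes.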
Scenarios are evaluated on every health-check state change, not on a polling timer. The first state change that crosses an activation or deactivation threshold triggers the transition. Evaluation cost stays low because conditions reference precomputed health states.
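Event-driven evaluation can be illustrated with a small store that re-evaluates only when a check actually changes state. The `HealthStore` class and its `report` method are hypothetical names; the sketch shows the general technique, not the product's internals.

```python
from typing import Callable, Dict, List


class HealthStore:
    """Keeps the latest state per check and notifies only on state changes."""

    def __init__(self, on_change: Callable[[Dict[str, bool]], None]) -> None:
        self._states: Dict[str, bool] = {}
        self._on_change = on_change

    def report(self, check_id: str, healthy: bool) -> bool:
        """Record a probe result; return True if it caused a re-evaluation."""
        if self._states.get(check_id) == healthy:
            return False  # same state as before: nothing to evaluate
        self._states[check_id] = healthy
        self._on_change(dict(self._states))  # pass a snapshot of current states
        return True


evaluations: List[Dict[str, bool]] = []
store = HealthStore(on_change=evaluations.append)

store.report("api", True)   # first result for this check: evaluates
store.report("api", True)   # unchanged: skipped
store.report("api", False)  # state change: evaluates
```

Three probe results produce two evaluations, because the repeated healthy result does not change any precomputed state.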
Operators see the current state of every scenario (activated / deactivated), the time of the last transition, the last evaluation result for each condition group, and the trigger outcomes. The dashboard surfaces stuck transitions and conflicting overrides.
Activate failover on the first detected error to protect the user. Deactivate only after 15 minutes of clean primary health to avoid a flap. Different conditions, different timing — same scenario object.
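The fast-in / slow-out policy can be sketched as a small state holder. How TR7 GTM measures "15 minutes of clean health" is not specified here, so this example assumes one interpretation: any unhealthy observation restarts the clean window. The class name, method name and timestamps in seconds are illustrative.

```python
from typing import Optional

HOLD_SECONDS = 15 * 60  # the 15-minute clean-health window


class FastInSlowOut:
    """Activate on the first failure; deactivate after a clean hold period."""

    def __init__(self, hold_seconds: float = HOLD_SECONDS) -> None:
        self.hold_seconds = hold_seconds
        self.active = False
        self.healthy_since: Optional[float] = None

    def observe(self, primary_healthy: bool, now: float) -> bool:
        """Feed one health observation; return whether the scenario is active."""
        if not primary_healthy:
            self.active = True         # fast-in: the first error activates
            self.healthy_since = None  # assumption: any error restarts the window
        else:
            if self.healthy_since is None:
                self.healthy_since = now
            if self.active and now - self.healthy_since >= self.hold_seconds:
                self.active = False    # slow-out: clean window has elapsed
        return self.active


scenario = FastInSlowOut()
scenario.observe(False, now=0)                # failure: active immediately
still_active = scenario.observe(True, now=60)  # healthy, but only for 0 s so far
scenario.observe(True, now=60 + 899)          # 899 s clean: still active
recovered = not scenario.observe(True, now=60 + 900)  # 900 s clean: deactivates
```

The asymmetry is visible in the timeline: activation takes a single observation, while deactivation requires 900 seconds of uninterrupted healthy results.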
Failover is automated; the return path requires SRE acknowledgment. The deactivation direction is set to off; an operator manually flips it to on after a runbook review. Activation continues to evaluate automatically.
DC A → DC B failover triggers an HTTP webhook to the incident management system. DC B → DC A failback triggers the same webhook plus a deployment-system call to re-warm caches. Triggers in each direction are independent.
Use the Oracle trigger to query a database before failover — for example, confirm the backup database has caught up via log shipping. The trigger result gates the actual state transition.
Walk through a bidirectional scenario built on your own runbook: fast-in / slow-out, manual failback, asymmetric trigger sets — your policy, not the platform's default.