Capability

Per-vService Traffic Shaping and QoS

Apply per-vService, per-user or shared bandwidth limits and distribute traffic capacity in a controlled way at the application layer.

TR7 Per-vService Traffic Shaping and QoS lets you manage traffic capacity by actual bandwidth consumption, not only by connection count or request rate. Separate upload and download bandwidth ceilings can be defined for each vService. The feature supports three usage models: a separate limit per connection, a limit keyed to a user, IP or tenant identifier, and a shared limit across a high-availability pair. This prevents a single tenant, a single IP or a single heavy transfer from consuming the entire service capacity. Traffic shaping here is not a complex network-level queuing architecture; it is a capacity ceiling applied at the connection and flow level. Operators can quickly establish bandwidth policies for a vService, tenant, user or risky traffic group. The result: bandwidth management no longer depends on separate network appliances or complex system tuning, and becomes conditional, auditable, service-level traffic control running directly on the ADC.

Limit modes: stream, per-key, shared

100K

Maximum tracked keys in per-key mode

100 ms

Inter-node synchronisation interval in shared mode

When bandwidth is left uncapped, a single tenant or a single heavy flow can affect the entire service.

In modern applications, capacity problems are not measured by request count alone. Large file downloads, video streaming, API exports, backup traffic or bot-driven bulk data pulls can consume high bandwidth even at low request rates. In these cases, a request rate limit on its own is not sufficient.

In multi-tenant environments the problem becomes more visible. When one tenant runs a heavy data transfer, latency for other tenants can increase. Without a capacity boundary between customers sharing the same vService, a fair-use policy cannot be technically enforced.

Conventional traffic shaping typically requires complex queuing and class structures at the network level. That model is powerful, but it makes it difficult to work directly with L7 signals such as path, tenant, user, JWT claims or source IP at the application delivery layer. Debugging and change management also become heavier for operations teams.

The right approach is to apply a connection- and key-based bandwidth ceiling at the ADC level. Different limits should be definable for a vService, user, tenant or risky traffic group, and the upload and download directions should be manageable independently.

TR7 Per-vService Traffic Shaping and QoS controls bandwidth consumption at the application delivery layer with stream, per-key and shared modes.

Our approach

TR7 applies bandwidth control as a three-mode policy across vService, traffic key and high-availability sharing.

Upload and download ceilings are applied per vService

Upload traffic from clients to the backend and download traffic from the backend to clients can be limited separately for each vService. This puts application capacity under service-level control.

Stream mode gives each connection its own limit

In stream mode every connection operates under its own bandwidth ceiling. The resource consumption of a single connection can be bounded in long-running downloads, uploads or WebSocket-style streams.
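A per-connection ceiling of this kind is commonly modelled as a token bucket. The sketch below is illustrative only and is not TR7's implementation: the class name, the byte-based units and the burst parameter are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """Byte-based token bucket; one instance per connection in this sketch."""
    rate_bps: float      # refill rate in bytes per second (assumed unit)
    burst_bytes: float   # largest burst allowed before pacing starts
    tokens: float = 0.0
    last_refill: float = 0.0

    def __post_init__(self) -> None:
        # Start with a full bucket so a new connection can burst once.
        self.tokens = self.burst_bytes

    def consume(self, nbytes: int, now: float) -> float:
        """Account for nbytes sent at time `now`.

        Returns the number of seconds the sender should pause so that
        the long-run rate stays at or below rate_bps.
        """
        elapsed = max(0.0, now - self.last_refill)
        self.tokens = min(self.burst_bytes, self.tokens + elapsed * self.rate_bps)
        self.last_refill = now
        self.tokens -= nbytes
        if self.tokens >= 0:
            return 0.0
        # Negative balance: wait until the deficit has been refilled.
        return -self.tokens / self.rate_bps
```

Because each connection owns its own bucket, one slow or heavy connection does not draw down the allowance of another, which is the behaviour stream mode describes.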

Per-key mode applies quotas by user, IP or tenant

In per-key mode the limit is applied according to a key produced by an FX expression. Source IP, user, tenant ID or a JWT claim value can all serve as a bandwidth key.

Shared mode provides a common capacity budget across the cluster

In shared mode the bandwidth budget in a two-node setup is not confined to a single device. A total limit for the same tenant or service can be applied consistently across both nodes.

Capabilities

Traffic shaping turns per-connection, per-key and shared bandwidth limits into a vService-level policy.

Three limit modes cover different capacity control scenarios

Stream mode applies a separate ceiling to every connection. Per-key mode applies one limit to all traffic that shares a key, such as an IP, user, tenant or any other FX key. Shared mode applies the same limit across both nodes in a high-availability setup. A single feature therefore covers simple connection limiting, tenant quota management and cluster-wide shared capacity control.

Upload and download limits are defined independently

In some services download traffic is dominant; in others upload consumes critical capacity. TR7 can limit the upload direction from clients to the backend and the download direction from the backend to clients separately. For example, upload can be more tightly controlled on a file-upload service while download is tighter on a video service. This separation aligns bandwidth policy with real traffic behaviour.
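As a rough illustration of independent directions, the sketch below models a vService policy as two optional ceilings. The type names, field names and Kbps values are hypothetical and do not reflect TR7's configuration schema.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Direction(Enum):
    UPLOAD = "upload"      # client -> backend
    DOWNLOAD = "download"  # backend -> client


@dataclass(frozen=True)
class DirectionalLimits:
    """Per-vService ceilings in Kbps; None means that direction is uncapped."""
    upload_kbps: Optional[int]
    download_kbps: Optional[int]

    def ceiling(self, direction: Direction) -> Optional[int]:
        if direction is Direction.UPLOAD:
            return self.upload_kbps
        return self.download_kbps


# Hypothetical values: tight upload on a file-upload service,
# tight download on a video service.
FILE_UPLOAD_SERVICE = DirectionalLimits(upload_kbps=2_000, download_kbps=None)
VIDEO_SERVICE = DirectionalLimits(upload_kbps=None, download_kbps=5_000)
```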

FX key builder produces per-tenant and per-user limits

In per-key mode the bandwidth key is built using the FX expression engine. Source IP, JWT user information, tenant ID, a header value or any combination of these can serve as the key. For example, all users belonging to the same tenant can share a common capacity ceiling. This is a powerful mechanism for fair-use enforcement in multi-tenant SaaS models.
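The FX expression syntax is not shown on this page, so the following Python function only mirrors the idea of key building. The claim name `tenant_id`, the header `x-tenant-id` and the fallback order are assumptions chosen for the example.

```python
from typing import Mapping, Optional


def bandwidth_key(source_ip: str,
                  headers: Mapping[str, str],
                  jwt_claims: Optional[Mapping[str, str]] = None) -> str:
    """Return the most specific traffic owner available for a request.

    Preference order in this sketch: tenant, then user, then source IP.
    """
    claims = jwt_claims or {}
    tenant = claims.get("tenant_id") or headers.get("x-tenant-id")
    if tenant:
        return f"tenant:{tenant}"
    user = claims.get("sub")
    if user:
        return f"user:{user}"
    return f"ip:{source_ip}"
```

Two users of the same tenant map to the same key here, so they would draw from one shared ceiling, while anonymous traffic falls back to a per-IP key.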

The per-key table can track large numbers of users or tenants

In per-key mode each key is tracked as a separate usage state. Keys that have been silent for a defined period are removed; active keys remain subject to the limit policy. This model provides centralised capacity control for thousands of users or tenants. The operator applies limits against a logical traffic owner rather than individual connections.
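A minimal sketch of such a table follows. The defaults reuse the 100,000-key capacity and 3,600-second idle timeout quoted in the FAQ on this page; the class, its methods and the decision to refuse new keys when the table is full are assumptions, not documented TR7 behaviour.

```python
from typing import Dict


class KeyTable:
    """Tracks bytes used per key and drops keys that have gone idle."""

    def __init__(self, max_keys: int = 100_000, idle_timeout: float = 3600.0) -> None:
        self.max_keys = max_keys
        self.idle_timeout = idle_timeout
        self._bytes: Dict[str, int] = {}
        self._last_seen: Dict[str, float] = {}

    def evict_idle(self, now: float) -> int:
        """Remove keys silent for longer than idle_timeout; return how many."""
        stale = [k for k, seen in self._last_seen.items()
                 if now - seen > self.idle_timeout]
        for key in stale:
            del self._bytes[key]
            del self._last_seen[key]
        return len(stale)

    def record(self, key: str, nbytes: int, now: float) -> bool:
        """Add nbytes to the key's usage. Returns False if the table is full."""
        if key not in self._bytes:
            self.evict_idle(now)  # a full scan; acceptable for a sketch only
            if len(self._bytes) >= self.max_keys:
                return False
            self._bytes[key] = 0
        self._bytes[key] += nbytes
        self._last_seen[key] = now
        return True

    def usage(self, key: str) -> int:
        return self._bytes.get(key, 0)
```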

Shared mode preserves the total limit across a high-availability pair

In active-active or active-passive setups traffic can be distributed between two nodes. Shared mode applies the bandwidth budget to the service as a whole rather than to each node separately. When a tenant's traffic moves from one node to the other, the limit logic is not disrupted. This is particularly important for enterprise SLA and quota scenarios.
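One way to picture a shared budget is for each node to learn the other's recent demand at every synchronisation tick and take a proportional share of the total limit. This is a hypothetical model, not TR7's algorithm; only the 100 ms interval comes from this page, and the function name and splitting rule are assumptions.

```python
SYNC_INTERVAL_S = 0.1  # 100 ms between state exchanges, as stated on this page


def split_budget(limit_bps: float,
                 local_demand_bps: float,
                 peer_demand_bps: float) -> float:
    """Return this node's share of a two-node shared bandwidth budget.

    The share is proportional to the demand each node reported at the
    last sync. With no demand on either node the budget is split evenly.
    """
    total = local_demand_bps + peer_demand_bps
    if total <= 0:
        return limit_bps / 2
    return limit_bps * local_demand_bps / total
```

In this model the two shares always add up to the configured limit, so a tenant whose traffic shifts from one node to the other keeps the same total ceiling after the next sync. A node that reported no demand has no allowance until then, which a real implementation would need to handle.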

Conditional application separates premium and standard traffic

Traffic shaping rules can operate conditionally. A premium tenant can be unlimited, a free tenant capped at 100 Kbps and a suspicious IP restricted to 1 Mbps. Conditions can be built from path, user, header, JWT claim, source IP or any FX expression. Bandwidth policy is no longer a single global setting.
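The example in this paragraph can be sketched as an ordered rule list where the first matching rule wins. The rule order, the request field names and the sample IP address are assumptions; the page does not say how TR7 resolves overlapping conditions.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Optional, Sequence

Request = Mapping[str, str]

# Hypothetical threat list; 203.0.113.0/24 is a documentation-only range.
SUSPICIOUS_IPS = frozenset({"203.0.113.7"})


@dataclass(frozen=True)
class ShapingRule:
    name: str
    matches: Callable[[Request], bool]
    limit_kbps: Optional[int]  # None means unlimited


RULES: Sequence[ShapingRule] = (
    ShapingRule("suspicious-ip", lambda r: r.get("source_ip") in SUSPICIOUS_IPS, 1000),
    ShapingRule("premium", lambda r: r.get("tier") == "premium", None),
    ShapingRule("free", lambda r: r.get("tier") == "free", 100),
)


def select_rule(request: Request,
                rules: Sequence[ShapingRule] = RULES) -> Optional[ShapingRule]:
    """Return the first rule whose condition matches, or None if none match."""
    for rule in rules:
        if rule.matches(request):
            return rule
    return None
```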

Multiple limit rules can be defined within a vService and pool

Different traffic slices within the same vService can have different limit rules. For example, the `/download` path can have its own limit, `/api/export` another and standard API calls yet another. This breaks capacity control into segments aligned with application behaviour. The operator applies context-sensitive shaping rather than a single blunt ceiling.
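Path-based slicing can be illustrated with a longest-prefix lookup. The limit values, the default and the matching rule below are hypothetical; only the `/download` and `/api/export` paths come from the text above.

```python
from typing import Mapping

# Hypothetical ceilings in Kbps for specific paths.
PATH_LIMITS_KBPS = {"/download": 2000, "/api/export": 500}
DEFAULT_LIMIT_KBPS = 5000  # applied to standard API calls in this sketch


def limit_for_path(path: str,
                   table: Mapping[str, int] = PATH_LIMITS_KBPS,
                   default: int = DEFAULT_LIMIT_KBPS) -> int:
    """Return the limit of the longest prefix that matches on a segment boundary."""
    best = None
    for prefix in table:
        on_boundary = path == prefix or path.startswith(prefix.rstrip("/") + "/")
        if on_boundary and (best is None or len(prefix) > len(best)):
            best = prefix
    return table[best] if best is not None else default
```

Matching on segment boundaries means `/api/exporter` does not inherit the `/api/export` limit and falls through to the default.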

A connection-level ceiling is applied to long-lived streams

WebSocket, large-file download or long-duration streaming connections can bypass classic request-based limits. Stream mode applies a bandwidth ceiling to each such flow. A single long connection is prevented from exhausting service capacity. This model is important for media, file transfer and real-time streaming scenarios.

Limit changes can be applied without service interruption

Bandwidth limits can be updated through a configuration change. New limits are applied to production traffic in a controlled manner. This enables rapid response during campaigns, suspected DDoS events, tenant quota changes or temporary capacity restrictions. Operations teams do not need to wait for a separate network device change.

Monitoring shows which key is consuming its limit

In per-key mode the usage state for each user, IP or tenant can be observed. The operator can see which key is approaching its quota ceiling. This information is valuable for customer support, security analysis and capacity planning. Limit breach events can be connected to logs and SIEM pipelines.

A rate ceiling can be applied to suspicious traffic in DDoS mitigation

Not every suspicious flow must be blocked outright. TR7 can apply a low bandwidth limit to risky IPs, ASNs, paths or behaviour groups. This reduces the impact of an attack while avoiding a complete cutoff of legitimate users in false-positive situations. The approach fits a graduated defence model.

Works as a transparent and predictable connection-level flow control

This feature is not a complex network-level queuing system or hardware QoS mechanism. TR7 applies a bandwidth ceiling at the connection and flow level. This boundary is straightforward to manage at the vService, key and condition level. The operator defines clearly how much capacity each service or user receives.

Operational depth

Traffic shaping should be planned together with limit mode, key design, upload and download direction, long-lived flows, cluster sharing and audit visibility.

Limit mode selection

Stream mode suits per-connection limits, per-key mode suits per-user or per-tenant limits, and shared mode suits a common limit across the cluster. Choosing the wrong mode can break the expected capacity behaviour. The policy objective should be clarified first.

Key design

The key used in per-key mode determines who the limit actually belongs to. Source IP is sufficient in some environments; tenant or user information can provide a more accurate quota model. When multiple users sit behind the same NAT address, an IP-only key makes them share one limit, which may not be fair.

Upload and download separation

Upload and download directions have different resource impacts. Large file uploads consume backend ingress capacity; downloads consume egress capacity. These two directions should be limited separately.

Long-lived connections

WebSocket, stream and large file transfer connections can stay open for an extended period. Per-connection limits make resource consumption more predictable in these flows. Timeout and bandwidth limit settings should be evaluated together.

Cluster sharing

Shared mode can be used for common budget behaviour in two-node setups. The aim is for the limit policy to remain consistent as traffic distribution shifts between nodes. This behaviour is important for critical tenant SLAs.

Audit and alerting

Keys that reach a limit threshold can be logged. SIEM-side alerting can be configured for per-tenant, per-user or per-IP quota breaches. This information is useful for both security operations and customer support.
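A sketch of how threshold events might be turned into log lines for a SIEM pipeline follows. The event names, JSON field names and the 80% warning ratio are assumptions for illustration and are not TR7's log format.

```python
import json
from typing import List, Mapping


def quota_events(usage_bps: Mapping[str, float],
                 limit_bps: float,
                 warn_ratio: float = 0.8) -> List[str]:
    """Return one JSON log line per key at or above the warning threshold."""
    events = []
    for key in sorted(usage_bps):
        ratio = usage_bps[key] / limit_bps
        if ratio >= 1.0:
            event = "limit_reached"
        elif ratio >= warn_ratio:
            event = "approaching_limit"
        else:
            continue  # well below the ceiling, nothing to report
        events.append(json.dumps(
            {"key": key, "event": event,
             "usage_bps": usage_bps[key], "limit_bps": limit_bps},
            sort_keys=True))
    return events
```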

When to use it

Enforcing bandwidth quotas for SaaS tenants

In a multi-tenant SaaS environment a separate capacity ceiling can be defined for each tenant. The tenant ID is used as the key, and all users belonging to the same tenant share the common limit.

Delivering different speeds across premium and free tiers

A higher bandwidth limit can be given to premium customers and a lower one to free users. The tier difference is managed in the ADC policy without embedding logic in application code.

Applying quality-level rate limits in media streaming

Different quality levels in video or media services require different bandwidths. TR7 can apply a download ceiling based on path or user subscription tier.

Slowing suspicious traffic instead of blocking it

During a suspected DDoS or bot event, traffic can be placed under a low rate limit rather than being cut off entirely. This reduces attack impact while avoiding a complete disconnect for real users in false-positive cases.

Separate capacity policies for internal and external traffic

Internal API calls can be left unlimited while traffic from the internet is capped. The separation is made centrally using source IP, path or header conditions.

Throttling specific endpoints during promotional campaigns

Certain endpoints can receive sudden traffic spikes during e-commerce campaigns. TR7 applies a temporary bandwidth ceiling to checkout or campaign APIs to maintain service stability.

Frequently asked questions

How do I choose between stream, per-key and shared mode?

Stream mode applies a separate limit to every connection and is suited to long-lived flows such as WebSocket or large file transfers. Per-key mode applies one limit to all traffic sharing an FX key such as user, IP or tenant, and is preferred for multi-tenant quota scenarios. Shared mode applies the same budget across the two nodes of a high-availability pair. Clarify the policy objective first, because the wrong mode can break the expected capacity behaviour.

Can upload and download limits be defined separately?

Yes. TR7 can limit the upload direction from clients to the backend and the download direction from the backend to clients independently. For example, upload can be more tightly controlled on a file-upload service while download is tighter on a video service. Managing the two directions separately aligns bandwidth policy with real traffic behaviour.

How is a key built in per-key mode, and how many keys can be tracked?

The key is built using the FX expression engine. Source IP, JWT user information, tenant ID, a header value or any combination of these can serve as the key. The stick-table in per-key mode can track up to 100,000 active keys. Keys that have been silent for 3,600 seconds are cleaned up automatically.

How does shared mode work in a high-availability setup?

In shared mode the two nodes exchange bandwidth budget state via UDP synchronisation at 100 ms intervals. When a tenant's traffic moves between nodes, the limit logic is not disrupted. This model is particularly important for guaranteeing tenant isolation in enterprise SLA and quota scenarios.

Is this feature the same as Linux tc or hardware QoS systems?

No. TR7 traffic shaping is flow control that applies a bandwidth ceiling at the connection and flow level. It is not a kernel-level queuing architecture such as Linux tc or HTB, nor is it a hardware QoS mechanism. It works directly with L7 signals such as path, tenant, user or JWT claims at the application delivery layer, with significantly less complexity.

What happens when a limit is exceeded, and can these events be monitored?

When a connection or key reaches its limit it is throttled: traffic is not cut off but slowed to the bandwidth ceiling. Limit breach events can be logged and connected to a SIEM pipeline. In per-key mode the operator can observe in real time which tenant, IP or user is consuming its quota.
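To make the effect of throttling concrete, the arithmetic below estimates how long a transfer takes once it is held at a ceiling. It assumes that 1 Kbps means 1,000 bits per second and ignores protocol overhead and any initial burst; the function name is hypothetical.

```python
def transfer_seconds(size_bytes: int, limit_kbps: float) -> float:
    """Estimated duration of a transfer paced at limit_kbps (1 Kbps = 1000 bit/s)."""
    bits = size_bytes * 8
    return bits / (limit_kbps * 1000)
```

Under these assumptions a 1,000,000-byte download takes about 80 seconds at a 100 Kbps ceiling and about 8 seconds at 1 Mbps. The transfer completes in both cases, only more slowly.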

Control traffic capacity at the vService and tenant level

Move bandwidth management onto the ADC. Establish capacity policies with stream, per-key and shared modes, with no separate network appliance or complex queuing architecture required.