Capability

Connection Multiplexing

Carry client traffic to backends without mirroring every connection — fewer handshakes, lower latency.

TR7 Connection Multiplexing does not open a new backend TCP/TLS connection for every client request. Instead, it maintains a persistent connection pool on the backend side, so high client traffic generates far less connection-establishment overhead on the services that actually do the work. On the frontend, HTTP/2 and HTTP/3 carry multiple request streams over a single client connection. On the backend, keepalive and connection reuse are applied according to an operator-selected mode: never, safe, aggressive or always.

This approach matters most for API workloads, mobile applications, e-commerce, media and B2B services, where short, frequent requests are the norm. Backend services focus on real application logic instead of burning resources on repeated TCP and TLS handshakes.

The result: TR7 turns connection multiplexing from a hidden optimization into a manageable ADC policy, with modern protocols on the frontend, a persistent pool on the backend, and per-service reuse control throughout.

100K→1K: typical client-to-backend connection ratio with multiplexing

http-reuse modes: never, safe, aggressive, always

95%+: TLS handshake CPU reduction with session ticket resumption

When every client request opens a new backend connection, resources go to connection setup — not to serving the workload.

In the classic connection model, a large number of clients translates into an equally large number of TCP or TLS connections on the backend side. Every new connection brings a three-way TCP handshake, a TLS handshake, socket management and a teardown cost. As traffic grows, backends spend more time on connection overhead than on actual application logic.

Short, frequent API requests make the problem more visible. Mobile applications, B2B calls, microservices and checkout flows all generate many small requests. When each request becomes a new connection, CPU, memory, sockets and network resources are consumed unnecessarily.

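HTTP/1.1 behavior on the frontend adds another constraint. Each connection carries one request at a time, so a slow response blocks the requests queued behind it (head-of-line blocking), and parallelism is limited to the number of connections the client opens. HTTP/2 and HTTP/3 move modern client traffic more efficiently by multiplexing many streams over one connection, and backend connection management must keep pace.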

The right approach is not to mirror client connections one-to-one onto backends. It is to pool and reuse connections, and to tune multiplexing behavior per service type. Non-idempotent operations call for safer reuse modes; high-volume APIs benefit from more aggressive ones.

TR7 Connection Multiplexing delivers this model: modern stream multiplexing on the frontend, a keepalive pool on the backend, and a manageable per-service connection reuse policy.

Our approach

TR7 implements connection multiplexing through a backend keepalive pool, a per-service reuse mode, HTTP/2 and HTTP/3 protocol support, and TLS session resumption.

A backend connection pool eliminates repeated connection-establishment cost

Connections opened to the backend are not closed immediately when a request completes — they are returned to the pool for reuse. This cuts TCP and TLS handshake overhead on the backend side.

The reuse mode is chosen to match service behavior

Connection reuse can be managed through four modes: never, safe, aggressive and always. Operators select the right behavior for each service by balancing security requirements, idempotency constraints and throughput goals.

HTTP/2 stream multiplexing delivers parallelism over a single connection

With HTTP/2 negotiated through ALPN, multiple parallel streams are carried over a single connection. This is especially effective for large numbers of short API requests, because clients need fewer connections to reach the same level of parallelism.

TLS session resumption lowers handshake cost for returning clients

TLS session resumption allows returning clients to skip the full handshake. In TLS-heavy services this reduces CPU consumption and connection-setup latency.

Capabilities

Connection Multiplexing reduces backend connection overhead through per-service reuse, keepalive, HTTP/2, HTTP/3 and timeout profiles.

http-reuse mode puts connection reuse under operator control

TR7 manages connection reuse as a pool-level action. The never mode behaves close to a new-connection-per-request model. safe provides more controlled reuse. aggressive and always target heavier connection savings for high-volume services. Operators pick the mode that reflects the performance and safety needs of each service.

Backend keepalive reduces connection open and close overhead

With keepalive enabled on the backend side, connections return to the pool when a request completes. The next request picks up a ready connection instead of opening a new one. This significantly reduces connection-setup cost for short requests and gives backends a more stable, predictable connection profile.
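The return-to-pool behavior described above can be illustrated with a minimal Python sketch. It is a toy model, not TR7's implementation: the class and function names (BackendConnection, KeepalivePool, serve) are hypothetical, and a "connection" here is only a counter rather than a real socket.

```python
from collections import deque


class BackendConnection:
    """Stand-in for a real TCP/TLS connection; only counts how often it is used."""

    def __init__(self, conn_id: int) -> None:
        self.conn_id = conn_id
        self.requests_served = 0


class KeepalivePool:
    """Toy keepalive pool: hands out idle connections before opening new ones."""

    def __init__(self) -> None:
        self._idle = deque()
        self.handshakes = 0  # number of times a new connection had to be opened

    def acquire(self) -> BackendConnection:
        if self._idle:
            return self._idle.pop()  # reuse the most recently returned connection
        self.handshakes += 1         # no idle connection: pay the setup cost
        return BackendConnection(self.handshakes)

    def release(self, conn: BackendConnection) -> None:
        self._idle.append(conn)      # keep the connection open for the next request


def serve(pool: KeepalivePool, request_count: int, concurrency: int) -> int:
    """Serve requests in waves of `concurrency`; return handshakes performed."""
    remaining = request_count
    while remaining > 0:
        wave = min(concurrency, remaining)
        conns = [pool.acquire() for _ in range(wave)]
        for conn in conns:
            conn.requests_served += 1
            pool.release(conn)
        remaining -= wave
    return pool.handshakes
```

In this model, serving 1,000 requests at a concurrency of 10 opens 10 backend connections instead of 1,000, because every wave after the first finds idle connections waiting in the pool.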

HTTP/2 ALPN supports modern client streams on the frontend

TR7 negotiates HTTP/2 with clients through ALPN and carries multiple streams over a single connection. This reduces the number of connections browsers and mobile clients need to open. Latency and resource consumption become more predictable. HTTP/2 support is the baseline performance layer for modern web and API traffic.

Backend HTTP/2 is enabled per service with a toggle

HTTP/2 on the backend side is opt-in at the service level. With the toggle enabled, ALPN selects h2 when the backend supports it and http/1.1 when it does not, so services without HTTP/2 support fall back to HTTP/1.1 automatically. This lets modern backends benefit from HTTP/2 without breaking legacy services.

HTTP/3 on the frontend improves connection behavior on lossy networks

TR7 carries modern client traffic over HTTP/3/QUIC on the frontend. On mobile and lossy networks, connection setup and stream continuity improve. HTTP/2 fallback preserves backward compatibility. The backend side is managed independently based on its own protocol capabilities.

Safe mode preserves data integrity for non-idempotent operations

The safe reuse mode applies more conservative behavior for risky or non-idempotent operations. In banking, payment or write-heavy APIs, performance optimization must not come at the cost of data integrity. This mode keeps reuse within safe boundaries. Operators can select safe instead of aggressive for high-sensitivity services.

TLS session resumption reduces CPU load for returning clients

TLS session reuse allows the same client to reconnect without repeating the full handshake. TLS 1.2 session tickets and TLS 1.3 PSK mechanisms both support this behavior. Under heavy HTTPS traffic, ADC CPU usage is significantly reduced. This is especially valuable in mobile and API scenarios with many short-lived connections.

Timeout profiles fine-tune pool behavior precisely

HTTP keepalive, client-fin, server-fin, tunnel, connect, server, client and queue timeout values all shape how the connection pool behaves. Timeouts that are too short drain the pool and reduce reuse. Timeouts that are too long raise idle connection count and memory consumption. TR7 makes this balance manageable per service profile.

maxconn limits set the upper bound on pool size

Connection limits can be defined at pool and backend level. These limits help protect backend services from sudden connection bursts. They are especially important for applications with smaller or licensed connection capacity. When combined with connection multiplexing, maxconn limits provide more predictable capacity behavior.
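A connection cap with queuing can be sketched as follows. This is an illustrative Python model under simple assumptions, not TR7's scheduler: MaxconnGate and its methods are hypothetical names, and real queues would also be bounded by a queue timeout.

```python
from collections import deque


class MaxconnGate:
    """Toy admission gate: at most `maxconn` active connections, the rest queue."""

    def __init__(self, maxconn: int) -> None:
        self.maxconn = maxconn
        self.active = 0
        self.queue = deque()

    def request(self, request_id: str) -> bool:
        """Return True if the request gets a connection now, False if it queues."""
        if self.active < self.maxconn:
            self.active += 1
            return True
        self.queue.append(request_id)
        return False

    def finish(self):
        """Release one connection; hand it to the oldest queued request, if any."""
        if self.queue:
            return self.queue.popleft()  # the slot passes directly to the waiter
        self.active = max(0, self.active - 1)
        return None
```

With a limit of 2 and five simultaneous requests, two are admitted and three wait in arrival order. The backend never sees more than two connections, however large the burst.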

Connection rate limiting controls new-connection bursts

The rate of new connections per second can be bounded by an upper limit. This prevents backend services from being overwhelmed by bot waves, mobile reconnect storms or sudden traffic spikes. The keepalive pool handles reuse while the new-connection rate is managed separately. Operations teams can constrain connection behavior not just by total count but by rate.
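One common way to bound a per-second rate is a token bucket. The Python sketch below shows the idea only; the source does not say which algorithm TR7 uses, and the class name, parameters and the caller-supplied clock are assumptions made for illustration.

```python
class ConnectionRateLimiter:
    """Token bucket bounding new connections per second (illustrative only)."""

    def __init__(self, rate_per_sec: float, burst: int) -> None:
        self.rate = rate_per_sec      # tokens added per second
        self.capacity = float(burst)  # maximum tokens that can accumulate
        self.tokens = float(burst)    # start with a full bucket
        self.last = 0.0               # timestamp of the previous decision

    def allow(self, now: float) -> bool:
        """Return True if a new connection may be opened at time `now` (seconds)."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

With a rate of 10 per second and a burst of 5, eight connection attempts arriving at the same instant result in five accepted and three rejected. One second later the bucket has refilled and new connections are accepted again.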

TCP keepalive reduces the risk of NAT and firewall timeouts

TCP keepalive probes on both the client and backend sides keep idle connections visibly alive, so intermediate network devices are less likely to close them silently. Long-lived connections are otherwise vulnerable to firewall and NAT idle timeouts. This matters most for services with long sessions or low-frequency traffic.

Soft reload applies new configuration without dropping connections

When configuration changes, existing connections are drained while new connections are accepted by the new worker under the updated configuration. This prevents abrupt disruption to connection pools. Operators can change timeout values, reuse modes or ALPN settings while preserving service continuity. Production changes carry lower operational risk.

Operational depth

Operating connection multiplexing well means paying attention to keepalive timeout balance, drain behavior, TLS cache sizing, stream concurrency, protocol bridging and monitoring metrics.

Keepalive timeout balance

If the keepalive timeout is too short, connections close before they can be reused and pool utilization drops. If it is too long, idle connection count grows and memory consumption rises. The value should be tuned to match traffic density and backend capacity.
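The trade-off can be made concrete with a small Python sketch. It assumes a deliberately simplified model (one pooled connection, instantaneous requests) and a hypothetical arrival pattern; the function name and the numbers are illustrative, not measurements.

```python
def count_new_connections(arrival_times, idle_timeout):
    """Count connections opened when one pooled connection serves serial requests.

    A request reuses the pooled connection only if it arrives within
    `idle_timeout` seconds of the previous request; otherwise the idle
    connection has already been closed and a new one must be opened.
    """
    opened = 0
    last_seen = None
    for t in sorted(arrival_times):
        if last_seen is None or t - last_seen > idle_timeout:
            opened += 1
        last_seen = t
    return opened


# Hypothetical arrival pattern: bursts of requests separated by quiet gaps.
arrivals = [0, 1, 2, 30, 31, 32, 100, 101, 102]
short_timeout = count_new_connections(arrivals, idle_timeout=0.5)   # 9 connections
medium_timeout = count_new_connections(arrivals, idle_timeout=5)    # 3 connections
long_timeout = count_new_connections(arrivals, idle_timeout=120)    # 1 connection
```

The shortest timeout opens a connection for every request, the medium value reuses connections within each burst, and the longest keeps a single connection alive across the quiet gaps at the cost of holding it idle for most of the period.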

Connection drain on reload

During a soft reload, the old worker drains existing connections in a controlled way while the new worker accepts connections under the new configuration. This is critical for zero-disruption changes in services that rely on connection multiplexing. Drain duration should be planned separately for long-lived connections.
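The drain sequence can be modeled in a few lines of Python. This is a conceptual sketch only: the Worker class and soft_reload function are hypothetical and do not represent TR7's process model.

```python
class Worker:
    """Toy worker that tracks its open connections and whether it still accepts."""

    def __init__(self, config_version: str) -> None:
        self.config_version = config_version
        self.accepting = True
        self.connections = set()

    def accept(self, conn_id: str) -> None:
        if not self.accepting:
            raise RuntimeError("worker is draining and accepts no new connections")
        self.connections.add(conn_id)

    def close(self, conn_id: str) -> None:
        self.connections.discard(conn_id)

    @property
    def drained(self) -> bool:
        """True once the worker has stopped accepting and has no connections left."""
        return not self.accepting and not self.connections


def soft_reload(old_worker: Worker, new_version: str) -> Worker:
    """Stop the old worker from accepting; return a new worker with the new config."""
    old_worker.accepting = False  # existing connections stay open and finish naturally
    return Worker(new_version)
```

The old worker only counts as drained after its last connection closes, which is why long-lived connections need their own drain-duration planning.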

TLS session cache

TLS session cache size matters for clients that reconnect frequently under heavy HTTPS traffic. A cache that is too small lowers the resumption rate. A cache that is very large needs to be accounted for in memory planning.
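The effect of cache size on the resumption rate can be illustrated with a simulated cache. The Python sketch below assumes least-recently-used eviction and a hypothetical round-robin workload; the source does not state TR7's eviction policy, so treat both as assumptions.

```python
from collections import OrderedDict


def resumption_rate(client_sequence, cache_size):
    """Fraction of handshakes that find a cached session in an LRU cache."""
    if not client_sequence:
        return 0.0
    cache = OrderedDict()
    resumed = 0
    for client in client_sequence:
        if client in cache:
            resumed += 1
            cache.move_to_end(client)      # mark as most recently used
        else:
            cache[client] = True           # full handshake, then store the session
            if len(cache) > cache_size:
                cache.popitem(last=False)  # evict the least recently used session
    return resumed / len(client_sequence)


# Hypothetical workload: 100 clients reconnecting round-robin, 10 times each.
workload = [f"client-{i % 100}" for i in range(1000)]
small_cache = resumption_rate(workload, cache_size=50)   # 0.0: every session is evicted before reuse
large_cache = resumption_rate(workload, cache_size=100)  # 0.9: only the first handshake per client is full
```

This workload is a worst case for an undersized cache: with room for half the clients, no session survives until its owner returns. Real traffic is less regular, but the direction of the effect is the same.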

TCP keepalive behavior

TCP keepalive at the OS level signals to intermediate layers that the connection is still alive. This reduces the chance that NAT devices, firewalls or stateful security appliances close idle connections prematurely. The setting is most valuable for long-lived connections.
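At the socket level, OS keepalive is a per-connection option. The following Python sketch shows how an application could enable it with the standard socket module. The helper name and timer values are illustrative assumptions, and the per-socket timer options are applied only where the platform exposes them.

```python
import socket


def enable_tcp_keepalive(sock, idle_s=60, interval_s=10, probes=5):
    """Turn on TCP keepalive and, where the platform exposes them, tune the timers."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The per-socket timer options below are platform-specific (available on Linux),
    # so each is applied only if this Python build exposes the constant.
    for name, value in (
        ("TCP_KEEPIDLE", idle_s),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval_s),  # seconds between probes
        ("TCP_KEEPCNT", probes),        # failed probes before the connection is dropped
    ):
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
    return sock
```

The idle interval only helps if it is shorter than the idle timeout of the NAT device or firewall in the path, so the values should be chosen against the strictest device between the two endpoints.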

HTTP/2 stream concurrency

The number of parallel streams over a single HTTP/2 connection can be capped. Too low a value reduces multiplexing benefit; too high a value risks overloading a single connection. The right setting depends on the traffic mix.
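A stream cap behaves like a counting semaphore on the connection. The Python sketch below simulates that with asyncio; the function name, the stream counts and the sleep standing in for request work are all assumptions made for illustration.

```python
import asyncio


async def run_streams(stream_count, max_concurrent_streams, work_seconds=0.01):
    """Run `stream_count` simulated streams under a concurrency cap; return the peak."""
    slots = asyncio.Semaphore(max_concurrent_streams)
    in_flight = 0
    peak = 0

    async def stream():
        nonlocal in_flight, peak
        async with slots:                      # wait for a free stream slot
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(work_seconds)  # stand-in for request processing
            in_flight -= 1

    await asyncio.gather(*(stream() for _ in range(stream_count)))
    return peak


peak = asyncio.run(run_streams(stream_count=50, max_concurrent_streams=8))
```

With 50 streams and a cap of 8, the observed peak never exceeds 8, and the remaining streams wait for a slot. Lowering the cap lengthens that wait; raising it puts more simultaneous work on one connection.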

HTTP/2 frontend, HTTP/1.1 backend

When the client side uses HTTP/2 but the backend runs HTTP/1.1, the backend side cannot mirror the frontend's stream parallelism. Each HTTP/1.1 connection carries one request at a time, so concurrent streams are spread across pooled connections, and some requests may be serialized according to the backend connection model. If the backend supports HTTP/2, enabling the relevant toggle should be considered.

TLS termination CPU impact

When TLS is terminated by TR7 rather than the backend, the backend's TLS handshake load disappears. The ADC takes on the TLS processing cost instead. TLS session resumption and the connection pool together help offset that cost.

Monitoring metrics

Request totals, queue depth, connection count, reuse behavior and error metrics reflect connection pool health. Low reuse warrants a review of timeout settings or backend behavior. A growing queue signals that backend connection capacity or maxconn limits may need adjustment.
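The two review signals described above can be derived from basic counters. The Python sketch below is illustrative only: the field names, the PoolSample type and the threshold defaults are assumptions, not TR7 metric names or recommended values.

```python
from dataclasses import dataclass


@dataclass
class PoolSample:
    requests: int         # requests forwarded to the backend in the interval
    new_connections: int  # backend connections opened in the interval
    queued: int           # requests waiting for a backend connection


def reuse_ratio(sample: PoolSample) -> float:
    """Share of requests served over an already-open backend connection."""
    if sample.requests == 0:
        return 0.0
    return max(0.0, 1.0 - sample.new_connections / sample.requests)


def review_hints(sample: PoolSample, min_reuse: float = 0.9, max_queue: int = 0):
    """Return human-readable hints; both thresholds are illustrative defaults."""
    hints = []
    if sample.requests and reuse_ratio(sample) < min_reuse:
        hints.append("low reuse: review timeout settings or backend behavior")
    if sample.queued > max_queue:
        hints.append("growing queue: review backend capacity or maxconn limits")
    return hints
```

For example, a sample with 1,000 requests and 10 new connections gives a reuse ratio of about 0.99 and no hints, while 1,000 requests with 600 new connections and a non-empty queue triggers both.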

When to use it

High-QPS API gateway connection savings

SaaS APIs generate high volumes of short requests. Connection multiplexing reduces the number of backend connections and avoids repeated TCP/TLS establishment cost.

Reducing reconnect latency for mobile API traffic

Mobile clients frequently open and close connections. Keepalive, HTTP/2 and TLS resumption reduce the cost of reconnection on the client side.

HTTP/2 backend use in microservice environments

For backends that support HTTP/2, the per-service backend HTTP/2 toggle can be enabled to take advantage of stream multiplexing. This yields more efficient connection usage across high-frequency inter-service calls.

Persistent connection pool for media origin traffic

Traffic from edge nodes or dense client populations can draw from a pool of existing connections to the origin backend instead of opening new ones continuously. This reduces origin CPU and socket pressure.

Safe reuse mode for banking transactions

For non-idempotent financial operations, safe mode is preferred over aggressive reuse. This pursues performance gains while preserving transaction integrity.

Reducing TLS overhead for B2B API calls

B2B services often carry high-value, low-frequency requests that still incur significant TLS cost. A connection pool and session resumption reduce secure connection establishment overhead.

Frequently asked questions

What is the difference between the http-reuse modes?

TR7 offers four reuse modes. never behaves close to a new-connection-per-request model. safe applies reuse only to idempotent operations such as GET and HEAD — it is the preferred choice for banking and payment APIs. aggressive extends reuse to methods such as POST and PUT for heavier savings. always selects the most aggressive behavior and suits high-volume, low-risk services. Operators choose the mode that reflects the performance and safety profile of each service.

How is HTTP/2 enabled on the backend side?

Backend HTTP/2 is enabled per service through a toggle. With the toggle enabled, ALPN selects h2 when the backend supports it and http/1.1 when it does not. Services that do not support HTTP/2 fall back to HTTP/1.1 without any change to their configuration. This lets modern backends benefit from HTTP/2 while legacy services remain unaffected.

How does TLS session resumption work and how much CPU does it save?

TLS session resumption lets a returning client reconnect using cached session material from a previous handshake, skipping the full TLS negotiation. TLS 1.2 session tickets and TLS 1.3 PSK mechanisms both support this. Under heavy HTTPS traffic, the CPU cost of TLS handshakes on the ADC can drop by 95% or more; the actual figure depends on traffic profile and reconnect frequency.

Is HTTP/3 frontend support production-ready?

Yes. HTTP/3/QUIC frontend support is production-active. It reduces connection-setup latency for mobile clients and improves stream continuity on lossy networks. HTTP/2 fallback preserves backward compatibility for clients that do not support HTTP/3. The backend side is managed separately according to its own protocol capabilities; QUIC on the backend side is on the roadmap.

How should the keepalive timeout value be set?

A keepalive timeout that is too short causes connections to close before they can be reused, reducing pool efficiency. One that is too long raises idle connection count and memory consumption. The right value depends on traffic density and backend capacity; a starting range of 60–120 seconds is common. TR7 allows timeout profiles to be managed independently per service.

Are existing connections dropped during a configuration change?

No. During a soft reload, the old worker drains existing connections in a controlled way while the new worker accepts incoming connections under the updated configuration. Connection pools are not disrupted abruptly. Operators can change timeout values, reuse modes or ALPN settings while preserving service continuity, which lowers the operational risk of production changes.

Free your backends from connection overhead

Keepalive pool, http-reuse modes, HTTP/2, HTTP/3 and TLS session resumption — all connection optimization in a single ADC policy. Let's walk through a live setup on your own services.