
When you run AI agents, almost everything they do crosses one piece of infrastructure. The tools an agent calls, the model requests behind its reasoning, the messages it sends to other agents — all of it flows through a gateway. That gateway can see what your agents do and decide what they are allowed to do, which makes it the most sensitive control point in an agentic system. So the practical question is where it runs, and who operates it.

If you already run KubeLB, you have an answer in place. KubeLB v1.4 deploys and manages an agent gateway for you, on your own clusters, using the open-source agentgateway project as the data plane.

What KubeLB does

KubeLB is Kubermatic’s Kubernetes-native, multi-tenant load balancer. It centralizes Layer 4 and Layer 7 load balancing for a fleet of clusters using a hub-and-spoke model. A management cluster runs the data plane (Envoy) and a central control plane. Each tenant cluster runs a lightweight agent that propagates its load-balancing config up to the management cluster through KubeLB’s CRDs. Tenants never touch the management cluster directly — they declare what they need, and KubeLB provisions and configures the data plane for them.

KubeLB is built on the Kubernetes Gateway API. An agent gateway is really just a Gateway API data plane that also understands AI traffic, which is why it fits into KubeLB so cleanly.

What the agent gateway is

agentgateway is an open-source data plane, written in Rust, that implements the Kubernetes Gateway API and adds first-class support for three kinds of agentic traffic:

  • LLM traffic — routing model requests to providers (OpenAI, Anthropic, Google, Mistral, or local models), with the auth, rate limiting, and observability you would expect from a gateway.
  • MCP servers — federating one or more Model Context Protocol servers behind a single endpoint, so an agent reaches your tools through one governed entry point.
  • Agent-to-agent (A2A) — proxying traffic between agents through the same gateway, so agent-to-agent calls get the same policy and visibility as everything else.

It recently joined the Linux Foundation’s Agentic AI Foundation (AAIF), the same independent home as MCP. So the data plane carrying your agents’ authority to act is open source and governed by a neutral foundation.

For the concept on its own (why agentic traffic needs its own gateway, and the main use cases), see “What Is an Agent Gateway?”

How it works in KubeLB

In KubeLB v1.4, the agent gateway ships as an addon in the KubeLB manager chart — the same kubelb-addons mechanism KubeLB uses for components like cert-manager and the Gateway API. You enable it in the manager chart values, and the management cluster deploys and runs it:

kubelb:
  enableGatewayAPI: true
kubelb-addons:
  enabled: true
  agentgateway:
    enabled: true

From there you configure it with standard Gateway API resources plus one agentgateway CRD. First, a Gateway that uses the agentgateway GatewayClass:

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: agentgateway-proxy
  namespace: kubelb
spec:
  gatewayClassName: agentgateway
  listeners:
    - name: http
      protocol: HTTP
      port: 8080
      allowedRoutes:
        namespaces:
          from: All

Then an AgentgatewayBackend that describes where the traffic actually goes — here, a self-hosted open model exposed through an OpenAI-compatible API, with its credentials pulled from a secret:

apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayBackend
metadata:
  name: local-llm
  namespace: kubelb
spec:
  ai:
    provider:
      # Point at any OpenAI-compatible server. Here it is a self-hosted open
      # model (llama.cpp, vLLM, or Ollama) running in the cluster.
      host: llama-cpp.llm.svc.cluster.local
      port: 8080
      openai:
        model: gemma-4-12b-it
  policies:
    auth:
      secretRef:
        name: llm-secret

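And an HTTPRoute that attaches to that Gateway and sends its traffic to the backend. With no matches set, the rule applies to every path on the listener: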

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: local-llm
  namespace: kubelb
spec:
  parentRefs:
    - name: agentgateway-proxy
      namespace: kubelb
  rules:
    - backendRefs:
        - name: local-llm
          namespace: kubelb
          group: agentgateway.dev
          kind: AgentgatewayBackend

Those three resources are the whole pattern. The Gateway listens, the AgentgatewayBackend defines the LLM target and its credentials, and the HTTPRoute connects them. Agents send their requests to the gateway, which handles auth, routing, and policy before forwarding them on.
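
If one gateway should front more than one backend, the route can be narrowed with a standard Gateway API path match. The sketch below is a variant of the local-llm route. The /v1 prefix is an assumption, chosen because OpenAI-compatible servers conventionally serve under it, and is not a value KubeLB requires:

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: local-llm
  namespace: kubelb
spec:
  parentRefs:
    - name: agentgateway-proxy
      namespace: kubelb
  rules:
    # Only requests whose path starts with /v1 reach the LLM backend;
    # other paths stay free for additional routes on the same Gateway.
    - matches:
        - path:
            type: PathPrefix
            value: /v1          # assumed prefix, adjust to your server
      backendRefs:
        - name: local-llm
          namespace: kubelb
          group: agentgateway.dev
          kind: AgentgatewayBackend
```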

The MCP and A2A cases follow the same pattern. For MCP, the backend describes a set of mcp.targets — several MCP servers federated behind one endpoint — and a route exposes them under an /mcp path. For A2A, agentgateway proxies traffic between agents through the same gateway, so those calls inherit the same auth and observability. The full, tested walkthrough — including a request test, rate limiting, and MCP setup — is in the KubeLB AI & MCP Gateway tutorial.
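
As a rough sketch of the MCP case, a backend and route might look like the following. Only mcp.targets and the /mcp path come from the description above. The resource names, the static block with its host, port, and protocol fields, and the mcp-tools Service are assumptions for illustration, so treat the tutorial as the authoritative schema:

```yaml
# Hypothetical MCP backend: one target pointing at an in-cluster MCP server.
# Field names under "targets" are assumed, not confirmed by this article.
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayBackend
metadata:
  name: mcp-tools
  namespace: kubelb
spec:
  mcp:
    targets:
      - name: tools
        static:
          host: mcp-tools.tools.svc.cluster.local   # assumed Service
          port: 80
          protocol: SSE
---
# Route that exposes the federated MCP servers under /mcp on the same Gateway.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: mcp-tools
  namespace: kubelb
spec:
  parentRefs:
    - name: agentgateway-proxy
      namespace: kubelb
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /mcp
      backendRefs:
        - name: mcp-tools
          namespace: kubelb
          group: agentgateway.dev
          kind: AgentgatewayBackend
```

Adding further entries under targets is how several MCP servers would sit behind the one /mcp endpoint.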

Why running it on KubeLB matters

Because the gateway is a KubeLB addon, two things come for free.

First, it is multi-tenant by design. The same management cluster that load-balances your fleet now provisions agent gateways across it, with the tenant isolation KubeLB already enforces. You are not standing up and operating a separate gateway per team or per cluster by hand.

Second, it runs on infrastructure you operate. The fastest path to an agent gateway is usually the managed one a cloud provider offers, but that puts the control point for your agents inside a platform you don’t run. Running it on KubeLB keeps it on your own clusters, on a data plane that is open source and governed by a neutral foundation. That placement is hard to change later — once every agent in the company depends on a gateway, moving it is a migration — so it is worth choosing on purpose while it is still easy.

Takeaways

  • The agent gateway is the control point for agentic systems — every tool call, model request, and agent-to-agent message passes through it.
  • KubeLB v1.4 ships one as a managed addon, using the open-source agentgateway data plane (Rust, Gateway API, now in the LF Agentic AI Foundation).
  • It covers LLM routing, MCP federation, and A2A through standard Gateway API resources plus the AgentgatewayBackend CRD.
  • Because it is part of KubeLB, you get it multi-tenant across your fleet and on infrastructure you control.

Try it: the KubeLB AI & MCP Gateway tutorial walks through the full setup. To build an agent gateway from the ground up, the AI Infrastructure track on Kubermatic Learn covers it step by step.

Abubakar Siddiq Ango

Senior Developer Advocate
