
Availability

Edition: Community & Enterprise
Deployment Type: Self-Managed, Hybrid
The Edge Gateway operates as an independent, dedicated AI proxy. It processes AI requests, enforces policies, and reports analytics to the control plane. It is optimized for high performance and resilience in production.
Note: The Edge Gateway is sometimes referred to as the Microgateway in configuration files; Edge Gateway is the preferred terminology.

Key responsibilities:
  • Unified Access Point: Provides a single, consistent endpoint for applications to interact with various LLMs.
  • Security Enforcement: Handles authentication, authorization, and applies security policies.
  • Policy Management: Enforces rules related to budget limits, model access, and applies custom Filters.
  • Observability: Logs detailed analytics data for each request, feeding the Analytics & Monitoring system.
  • Vendor Abstraction: Hides the complexities of different LLM provider APIs, especially through the OpenAI-compatible endpoint.

Edge Gateway Variants

There are two Edge Gateway variants:
Embedded Gateway
  • Where it runs: Inside AI Studio
  • Capabilities: LLM proxying, tool calling (REST + MCP), datasource querying. No filters, middleware, or plugins.
  • Use case: Testing LLM configurations, powering the Chat interface

Edge Gateway
  • Where it runs: Standalone binary, deployed at the edge
  • Capabilities: Full middleware pipeline: authentication, filters, plugins, analytics, budget enforcement, tool calling (REST + MCP), datasource querying
  • Use case: Production data plane in hub-and-spoke deployments
Both variants rely on the core proxy library and access control mechanisms. The embedded gateway is lightweight, while the Edge Gateway offers the full feature set. Tools, datasources, and OAuth state are synced to edge gateways through the hub-spoke configuration system.

Core Features

  1. Request Routing: Incoming requests include an llmSlug in their path (e.g., /llm/call/{llmSlug}/...). The Proxy uses this slug (auto-generated from the LLM configuration name) to identify the target LLM Configuration and route the request accordingly.
  2. Authentication & Authorization:
    • Validates the API key provided by the client application.
    • Identifies the associated Application and User.
    • Checks if the Application/team has permission to access the requested LLM Configuration based on RBAC rules.
  3. Policy Enforcement: Before forwarding the request to the backend LLM, the Proxy enforces policies defined in the LLM Configuration or globally:
    • Budget Checks: Checks whether the estimated cost would exceed the configured Budgets for the App or LLM.
    • Model Access: Ensures the requested model is allowed for the specific LLM configuration.
    • Filters: Applies configured request Filters to modify the incoming request payload.
  4. Analytics Logging: After receiving the response from the backend LLM (and potentially applying response Filters), the Proxy logs detailed information about the interaction (user, app, model, tokens used, cost, latency, etc.) to the Analytics database.
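The four stages above can be pictured as a minimal middleware pipeline. The sketch below is purely illustrative: every name in it (the registry, budget, and analytics structures) is a hypothetical stand-in, not the gateway's actual internals.

```python
# Illustrative sketch of the Edge Gateway request lifecycle: routing,
# auth, policy enforcement, then analytics logging. All names are hypothetical.

def handle_request(path, api_key, payload, registry, budgets, analytics):
    # 1. Request routing: extract the llmSlug from /llm/call/{llmSlug}/...
    parts = path.strip("/").split("/")
    if parts[:2] != ["llm", "call"] or len(parts) < 3:
        return {"status": 404, "error": "unknown route"}
    slug = parts[2]
    llm_config = registry.get(slug)
    if llm_config is None:
        return {"status": 404, "error": f"no LLM configuration for slug {slug!r}"}

    # 2. Authentication & authorization: resolve the app behind the API key
    app = llm_config["allowed_keys"].get(api_key)
    if app is None:
        return {"status": 401, "error": "invalid API key"}

    # 3. Policy enforcement: budget check and model allow-list
    if budgets.get(app, 0) <= 0:
        return {"status": 429, "error": "budget exceeded"}
    if payload.get("model") not in llm_config["allowed_models"]:
        return {"status": 403, "error": "model not allowed"}

    # Forward to the backend LLM (stubbed here), then...
    response = {"status": 200, "output": "..."}

    # 4. Analytics logging: record the interaction for the Analytics system
    analytics.append({"app": app, "slug": slug, "model": payload.get("model")})
    return response
```

The ordering matters: policies are enforced before any request reaches the backend LLM, and analytics is written only after a response comes back.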

Proxy Modes

The Edge Gateway (and embedded gateway) offer two ways to proxy LLM traffic:
SDK-Compatible (Unified)
  • Endpoint: /llm/call/{slug}/...
  • Description: Pass-through to the vendor’s native API format. No request manipulation beyond analytics/budget tracking.
  • Tradeoff: Full feature access, resilient to vendor API changes. Best for users working directly with a vendor’s SDK.

OpenAI-Compatible
  • Endpoint: /llm/call/{slug}/v1/chat/completions
  • Description: Accepts only OpenAI-format input and translates it to the upstream vendor’s API format.
  • Tradeoff: Maximum client-side compatibility (one format for all vendors), but reduced access to vendor-specific capabilities.
Both modes support streaming and non-streaming responses. Two legacy endpoints (/llm/rest/{slug}/... and /llm/stream/{slug}/...) predate the unified endpoint; although end users no longer call them directly, the proxy still uses their underlying code internally to handle each response style.
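For a given slug, the two modes differ only in the path (and payload format) the client targets. A small sketch; the base URL is a placeholder, and the vendor path is just an example:

```python
BASE = "https://gateway.example.com"  # placeholder gateway address

def sdk_compatible_url(slug, vendor_path):
    # SDK-compatible (unified) mode: pass-through to the vendor's native path
    return f"{BASE}/llm/call/{slug}/{vendor_path.lstrip('/')}"

def openai_compatible_url(slug):
    # OpenAI-compatible mode: fixed OpenAI-style chat completions path
    return f"{BASE}/llm/call/{slug}/v1/chat/completions"
```

For example, `sdk_compatible_url("my-anthropic-config", "/v1/messages")` yields `https://gateway.example.com/llm/call/my-anthropic-config/v1/messages`, while the OpenAI-compatible URL for the same slug always ends in `/v1/chat/completions` regardless of the upstream vendor.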

LLM Slug

The {llmSlug} in the endpoint path is automatically generated from the LLM configuration name when you create it. For example, an LLM named “My OpenAI Config” would have a slug like my-openai-config.
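A slug of this shape can be produced with a simple normalization. This illustrates the naming convention only; it is not necessarily the gateway's exact algorithm:

```python
import re

def llm_slug(name):
    # Lowercase, collapse runs of non-alphanumerics into hyphens, trim the ends
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
```

So `llm_slug("My OpenAI Config")` gives `my-openai-config`, matching the example above.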

Model Router

The Edge Gateway also includes a Model Router component that enables intelligent routing of requests across multiple LLM providers based on cost, performance, or availability. This allows you to build resilient AI applications that automatically failover or load balance between different models. For more information on configuring and using this feature, see the Model Router documentation.
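Failover routing of the kind the Model Router provides can be pictured as trying an ordered list of targets until one succeeds. The names and error handling below are purely illustrative; see the Model Router documentation for the real configuration:

```python
def route_with_failover(targets, call):
    # targets: ordered list of model/provider identifiers (e.g. cheapest first)
    # call: function that sends the request to one target; raises on failure
    last_error = None
    for target in targets:
        try:
            return target, call(target)
        except Exception as exc:  # provider down, rate-limited, timed out, ...
            last_error = exc
    raise RuntimeError("all targets failed") from last_error
```

Load balancing follows the same shape with a different target-selection policy (e.g. weighted random instead of a fixed order).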

Configuration Reference

For detailed documentation on all Edge Gateway environment variables, see the Configuration Reference.

Troubleshooting

If an edge gateway shows as disconnected:
  • Check network connectivity between the edge and control plane
  • Verify the edge gateway is running and healthy
  • Check edge gateway logs for connection errors
  • Ensure firewall rules allow gRPC traffic (default port 50051)
  • Wait a few seconds for the heartbeat cycle to complete

If configuration is not loading on an edge:
  • Check if the edge is connected (not disconnected)
  • Verify the edge gateway logs for configuration load errors
  • Check if the edge has sufficient permissions to fetch configuration
  • Try pushing configuration again
  • Check for configuration validation errors in edge logs
  • Verify the edge and control plane are running compatible versions
  • Check for database replication lag if using PostgreSQL replication

If a configuration push does not appear to take effect:
  • Verify all edges have successfully loaded the new configuration
  • Check for any disconnected edges that can’t receive updates
  • Refresh the page to ensure the latest status is displayed
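The firewall check for gRPC traffic (default port 50051) can be scripted as a plain TCP probe; a gRPC server will accept the connection at this layer. The hostname below is a placeholder:

```python
import socket

def port_reachable(host, port, timeout=2.0):
    # Attempt a plain TCP connection; returns False on refusal or timeout
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `port_reachable("edge-gateway.internal", 50051)` (hypothetical host) should return True from the control plane's network if the firewall allows the traffic.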