Availability
| Edition | Deployment Type |
|---|---|
| Community & Enterprise | Self-Managed, Hybrid |
Note: The Edge Gateway is sometimes referred to as the Microgateway in configuration files. However, Edge Gateway is the preferred terminology.
- Unified Access Point: Provides a single, consistent endpoint for applications to interact with various LLMs.
- Security Enforcement: Handles authentication, authorization, and applies security policies.
- Policy Management: Enforces rules related to budget limits, model access, and applies custom Filters.
- Observability: Logs detailed analytics data for each request, feeding the Analytics & Monitoring system.
- Vendor Abstraction: Hides the complexities of different LLM provider APIs, especially through the OpenAI-compatible endpoint.
Edge Gateway Variants
There are two Edge Gateway variants:

| Variant | Where it runs | Capabilities | Use case |
|---|---|---|---|
| Embedded Gateway | Inside AI Studio | LLM proxying, tool calling (REST + MCP), datasource querying. No filters, no middleware, no plugins. | Testing LLM configurations, powering the Chat interface |
| Edge Gateway | Standalone binary, deployed at edge | Full middleware pipeline: authentication, filters, plugins, analytics, budget enforcement, tool calling (REST + MCP), datasource querying | Production data plane in hub-and-spoke deployments |
Core Features
- Request Routing: Incoming requests include an `llmSlug` in their path (e.g., `/llm/call/{llmSlug}/...`). The Proxy uses this slug (auto-generated from the LLM configuration name) to identify the target LLM Configuration and route the request accordingly.
- Authentication & Authorization:
  - Validates the API key provided by the client application.
  - Identifies the associated Application and User.
  - Checks if the Application/team has permission to access the requested LLM Configuration based on RBAC rules.
- Policy Enforcement: Before forwarding the request to the backend LLM, the Proxy enforces policies defined in the LLM Configuration or globally, such as budget limits, model access rules, and custom Filters.
- Analytics Logging: After receiving the response from the backend LLM (and potentially applying response Filters), the Proxy logs detailed information about the interaction (user, app, model, tokens used, cost, latency, etc.) to the Analytics database.
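The per-request flow above can be sketched as a simple pipeline. This is a minimal illustration with hypothetical names and in-memory stand-ins, not the gateway's actual implementation:

```python
# Sketch of the Edge Gateway request pipeline: authenticate, authorize,
# forward, then log analytics. All names and structures are illustrative.
from dataclasses import dataclass, field

@dataclass
class Gateway:
    api_keys: dict                      # api_key -> application name
    permissions: dict                   # application -> set of allowed llmSlugs
    analytics_log: list = field(default_factory=list)

    def handle(self, api_key: str, llm_slug: str, payload: dict) -> dict:
        # 1. Authentication: resolve the calling Application from its API key.
        app = self.api_keys.get(api_key)
        if app is None:
            return {"status": 401, "error": "invalid API key"}
        # 2. Authorization: RBAC-style check against the target LLM Configuration.
        if llm_slug not in self.permissions.get(app, set()):
            return {"status": 403, "error": "no access to this LLM Configuration"}
        # 3. Forward to the backend LLM (stubbed here as an echo).
        response = {"status": 200, "model": llm_slug, "echo": payload}
        # 4. Analytics: log the interaction after the response is received.
        self.analytics_log.append({"app": app, "llm": llm_slug})
        return response

gw = Gateway(api_keys={"key-123": "demo-app"},
             permissions={"demo-app": {"my-openai-config"}})
resp = gw.handle("key-123", "my-openai-config", {"messages": []})
```

Note that the analytics entry is only recorded for requests that pass both checks, mirroring the "log after response" ordering described above.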
Proxy Modes
The Edge Gateway (and embedded gateway) offers two ways to proxy LLM traffic:

| Mode | Endpoint | Description | Tradeoff |
|---|---|---|---|
| SDK-Compatible (Unified) | /llm/call/{slug}/... | Pass-through to the vendor’s native API format. No request manipulation beyond analytics/budget tracking. | Full feature access, resilient to vendor API changes. Best for users working directly with a vendor’s SDK. |
| OpenAI-Compatible | /llm/call/{slug}/v1/chat/completions | Accepts only OpenAI-format input and translates to the upstream vendor’s API format. | Maximum client-side compatibility (one format for all vendors), but reduced feature access for vendor-specific capabilities. |
Two legacy endpoints (`/llm/rest/{slug}/...` and `/llm/stream/{slug}/...`) remain from before the unified endpoint existed. While not actively used by end users, the underlying code is still used internally by the proxy to handle each response style.
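A client targeting the OpenAI-compatible mode only needs to build the endpoint URL from the slug and send an OpenAI-format body. A minimal sketch; the base URL and the `Authorization: Bearer` scheme are assumptions for illustration:

```python
# Build a request for the OpenAI-compatible endpoint (illustrative only).
import json

def build_chat_request(base_url: str, slug: str, api_key: str, user_message: str):
    # Path shape from the Proxy Modes table: /llm/call/{slug}/v1/chat/completions
    url = f"{base_url}/llm/call/{slug}/v1/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",   # assumed auth header scheme
        "Content-Type": "application/json",
    }
    # OpenAI chat-completions format; the gateway translates this to the
    # upstream vendor's native API format.
    body = json.dumps({"messages": [{"role": "user", "content": user_message}]})
    return url, headers, body

url, headers, body = build_chat_request(
    "https://gateway.example.com", "my-openai-config", "key-123", "Hello")
```

Because every vendor is reached through the same format, swapping models is just a matter of changing the slug in the path.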
LLM Slug
The `{llmSlug}` in the endpoint path is automatically generated from the LLM configuration name when you create it. For example, an LLM named "My OpenAI Config" would have a slug like `my-openai-config`.
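Slug generation presumably follows a standard slugify scheme; a minimal sketch consistent with the example above (the platform's exact rules are an assumption):

```python
import re

def slugify(name: str) -> str:
    # Lowercase, collapse runs of non-alphanumeric characters into a single
    # hyphen, and trim leading/trailing hyphens.
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")

slug = slugify("My OpenAI Config")  # → "my-openai-config"
```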
Model Router
The Edge Gateway also includes a Model Router component that enables intelligent routing of requests across multiple LLM providers based on cost, performance, or availability. This allows you to build resilient AI applications that automatically fail over or load balance between different models. For more information on configuring and using this feature, see the Model Router documentation.

Configuration Reference
To learn more about configuring Edge Gateways, see the Configuration Reference for detailed documentation on all environment variables.

Troubleshooting
Edge Shows "Disconnected"
- Check network connectivity between the edge and control plane
- Verify the edge gateway is running and healthy
- Check edge gateway logs for connection errors
- Ensure firewall rules allow gRPC traffic (default port 50051)
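To rule out a blocked gRPC port quickly, a plain TCP connect to the edge's port (50051 by default) is a useful first check. The hostname below is a placeholder:

```python
import socket

def port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    # A successful TCP handshake doesn't prove gRPC itself works, but a
    # failure almost certainly means a firewall or routing problem.
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example: port_reachable("edge-host.internal", 50051)
```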
Edge Shows "Pending" After Push
- Wait a few seconds for the heartbeat cycle to complete
- Check if the edge is connected (not disconnected)
- Verify the edge gateway logs for configuration load errors
- Check if the edge has sufficient permissions to fetch configuration
Checksum Mismatch Persists
- Try pushing configuration again
- Check for configuration validation errors in edge logs
- Verify the edge and control plane are running compatible versions
- Check for database replication lag if using PostgreSQL replication
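Conceptually, sync status compares a checksum of the configuration held by the control plane against the edge's copy. A toy illustration of why canonicalization matters before hashing; the real algorithm and payload shape are assumptions:

```python
# Illustrative config checksum: hash a canonical serialization so that both
# sides produce identical bytes for identical content.
import hashlib
import json

def config_checksum(config: dict) -> str:
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

control_plane = {"llms": ["my-openai-config"], "version": 7}
edge_copy = {"version": 7, "llms": ["my-openai-config"]}  # same content, different key order
```

With canonical serialization the two hashes match; a persistent mismatch therefore points at genuinely different content, e.g. a failed configuration load or replication lag.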
Sync Status Banner Doesn’t Disappear