WebLLM

Gateway Access Modes

Understanding the three types of gateway access: Public, Token-Gated, and API-Key

WebLLM gateways support three access modes. Choose one based on your security requirements and deployment scenario.

  • Public (No Auth) - Open access with global rate limits. Good for demos.
  • Token-Gated (Access Token) - Per-token quotas and domain restrictions. Production-ready.
  • API-Key (Server Key) - Full access for server-to-server calls. Suited to internal tools.

Public Gateway

How it works

No authentication required. Requests are accepted from any origin with only global rate limits to prevent abuse.

Use cases

  • Demos and experiments
  • Testing during development
  • Low-stakes applications
  • Community/open access

Configuration

{
  "accessMode": "public",
  "rateLimits": {
    "requestsPerMinute": 10,
    "tokensPerDay": 10000
  }
}

Warning

Public gateways can burn through API credits quickly. Use sparingly and implement global rate limits at the gateway level.
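
The two limits in the configuration above could be enforced with a fixed-window counter shared by all callers. This is a minimal sketch, not WebLLM's actual implementation: the class `GlobalRateLimiter` and its `tryConsume` method are hypothetical, and only the `requestsPerMinute` and `tokensPerDay` field names come from the configuration shown.

```typescript
// Hypothetical sketch: a fixed-window limiter applied to ALL callers at once.
// Field names mirror the "rateLimits" object in the public gateway config.
interface RateLimits {
  requestsPerMinute: number;
  tokensPerDay: number;
}

class GlobalRateLimiter {
  private limits: RateLimits;
  private minuteWindow = 0;
  private requestsThisMinute = 0;
  private dayWindow = 0;
  private tokensToday = 0;

  constructor(limits: RateLimits) {
    this.limits = limits;
  }

  // Returns true and records the usage if the request fits both limits.
  // Rejected requests do not count against either limit.
  tryConsume(tokens: number, nowMs: number = Date.now()): boolean {
    const minute = Math.floor(nowMs / 60_000);
    const day = Math.floor(nowMs / 86_400_000);

    // Reset the counters whenever a new window starts.
    if (minute !== this.minuteWindow) {
      this.minuteWindow = minute;
      this.requestsThisMinute = 0;
    }
    if (day !== this.dayWindow) {
      this.dayWindow = day;
      this.tokensToday = 0;
    }

    if (this.requestsThisMinute >= this.limits.requestsPerMinute) return false;
    if (this.tokensToday + tokens > this.limits.tokensPerDay) return false;

    this.requestsThisMinute += 1;
    this.tokensToday += tokens;
    return true;
  }
}
```

A fixed window is the simplest option; it allows short bursts at window boundaries, so a sliding window or token bucket may be a better fit where smoother limiting matters.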

Token-Gated Gateway

The recommended mode for production websites. Each access token contains embedded permissions, quotas, and domain restrictions.

Authentication Flow

  1. Your Server - Generates token
  2. User's Browser - Receives token
  3. Gateway - Validates & executes

Token Contains

  • tid - Unique token ID
  • gid - Gateway ID
  • quota - Usage limits (requests/tokens per period)
  • domains - Allowed origins with wildcards
  • exp - Expiration timestamp
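
The claims listed above can be pictured as a typed payload, together with a check of the `domains` claim against a request origin. This is a sketch under stated assumptions: the claim names come from the list, but the field types, the unit of `exp`, and the helpers `matchesDomain` and `isOriginAllowed` are hypothetical. It also assumes that `*.myapp.com` covers subdomains only, which would explain why the generation example lists `myapp.com` separately.

```typescript
// Assumed payload shape; claim names are from the list above,
// field types are guesses based on the token generation example.
interface GatewayTokenClaims {
  tid: string;
  gid: string;
  quota: { type: string; limit: number; period: string };
  domains: string[];
  exp: number; // assumption: seconds since the Unix epoch, as in standard JWTs
}

// Hypothetical helper: does a hostname match a pattern such as "*.myapp.com"?
function matchesDomain(pattern: string, hostname: string): boolean {
  const p = pattern.toLowerCase();
  const h = hostname.toLowerCase();
  if (p.startsWith("*.")) {
    const suffix = p.slice(1); // ".myapp.com"
    // Require at least one character before the suffix, so the bare
    // domain "myapp.com" does NOT match "*.myapp.com".
    return h.endsWith(suffix) && h.length > suffix.length;
  }
  return p === h;
}

// Hypothetical helper: check an Origin header value against the token's domains.
function isOriginAllowed(claims: GatewayTokenClaims, origin: string): boolean {
  let hostname: string;
  try {
    hostname = new URL(origin).hostname;
  } catch {
    return false; // missing or malformed Origin header
  }
  return claims.domains.some((pattern) => matchesDomain(pattern, hostname));
}
```

Matching on the suffix with a leading dot avoids accepting look-alike hosts such as `evilmyapp.com`.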

Security Features

  • JWT signature verification (HS256)
  • Origin header validation
  • Automatic quota enforcement
  • Token expiration
  • Revocation support
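
To make the first and fourth items concrete, here is a sketch of HS256 signature verification and an expiry check using Node's built-in `crypto` module. It is an illustration of the general JWT technique, not the gateway's own code: `signHs256` and `verifyHs256` are hypothetical names, and `exp` is assumed to be in seconds, following the JWT convention.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical helper used only to produce a token for the example.
function signHs256(payload: object, secret: string): string {
  const enc = (o: object) => Buffer.from(JSON.stringify(o)).toString("base64url");
  const signingInput = `${enc({ alg: "HS256", typ: "JWT" })}.${enc(payload)}`;
  const sig = createHmac("sha256", secret).update(signingInput).digest("base64url");
  return `${signingInput}.${sig}`;
}

// Hypothetical verifier: returns the claims if the signature is valid and the
// token has not expired, otherwise null.
function verifyHs256(
  token: string,
  secret: string,
  nowSec: number = Math.floor(Date.now() / 1000),
): Record<string, unknown> | null {
  const parts = token.split(".");
  if (parts.length !== 3) return null;
  const [header, payload, signature] = parts;

  // Recompute the HMAC over "header.payload" and compare in constant time.
  const expected = createHmac("sha256", secret).update(`${header}.${payload}`).digest();
  const actual = Buffer.from(signature, "base64url");
  if (actual.length !== expected.length || !timingSafeEqual(actual, expected)) return null;

  try {
    const head = JSON.parse(Buffer.from(header, "base64url").toString("utf8"));
    if (head.alg !== "HS256") return null; // refuse any other algorithm
    const claims = JSON.parse(Buffer.from(payload, "base64url").toString("utf8"));
    const exp = claims.exp;
    if (typeof exp !== "number" || exp <= nowSec) return null; // expired or missing exp
    return claims;
  } catch {
    return null; // header or payload was not valid JSON
  }
}
```

Quota enforcement and revocation need server-side state keyed by `tid` and are not covered by this sketch.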

Example: Server-Side Token Generation

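import { generateGatewayToken } from 'webllm';

// Run this on your server only; the secret key must never reach the browser.
const token = await generateGatewayToken({
  secretKey: process.env.GATEWAY_SECRET_KEY, // signing secret
  gatewayId: 'your-gateway-id',
  quota: {
    type: 'requests', // count requests
    limit: 1000,      // at most 1000 of them
    period: 'month'   // per month
  },
  domains: ['myapp.com', '*.myapp.com'], // allowed origins, with a wildcard
  expiresIn: 30 * 24 * 60 * 60 * 1000, // 30 days, in milliseconds
});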
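
On the browser side, the token issued by your server is sent to the gateway as a bearer token. The sketch below only builds the request options; `buildTokenRequest` is a hypothetical helper, and the endpoint paths in the usage comment are placeholders rather than real WebLLM URLs.

```typescript
// Hypothetical helper: build fetch options that carry the access token
// in the Authorization header.
function buildTokenRequest(
  token: string,
  body: unknown,
): { method: string; headers: Record<string, string>; body: string } {
  if (!token) throw new Error("missing gateway access token");
  return {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${token}`,
    },
    body: JSON.stringify(body),
  };
}

// Usage in the browser (both URLs are placeholders):
//   const { token } = await (await fetch("/api/gateway-token")).json();
//   await fetch("https://gateway.example.com/...", buildTokenRequest(token, { prompt: "Hi" }));
```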

API-Key Gateway

Full access mode for trusted server-to-server communication. Quotas are not enforced because the caller is fully trusted.

Use cases

  • Backend services calling the gateway
  • CI/CD pipelines
  • Internal tooling
  • Admin operations

Authentication

// Request header
X-Gateway-Key: sk-webllm-gateway-...

Security Notice

Never expose API keys in client-side code. They should only be used in server-side applications where the key cannot be extracted.
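
One way to follow this advice is to read the key from the server's environment and refuse to run where a browser `window` object exists. This is a sketch only: `buildApiKeyHeaders` and the variable name `WEBLLM_GATEWAY_KEY` are hypothetical; the `X-Gateway-Key` header name is taken from the example above.

```typescript
// Hypothetical helper: build the API-key header from an environment map.
function buildApiKeyHeaders(
  env: Record<string, string | undefined>,
): Record<string, string> {
  // Guard against accidental use in client-side bundles.
  if (typeof (globalThis as { window?: unknown }).window !== "undefined") {
    throw new Error("API keys must not be used in browser code");
  }
  const key = env.WEBLLM_GATEWAY_KEY; // hypothetical variable name
  if (!key) throw new Error("WEBLLM_GATEWAY_KEY is not set");
  return { "X-Gateway-Key": key };
}

// Usage on the server:
//   const headers = buildApiKeyHeaders(process.env);
```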

Comparison

Feature              Public        Token-Gated    API-Key
Authentication       None          Bearer token   X-Gateway-Key
Quota Enforcement    Global only   Per-token      None
Domain Restrictions  No            Yes            No
Client-Side Safe     Yes           Yes            No
Best For             Demos         Production     Backend

Hybrid Mode

Gateways can support multiple authentication methods simultaneously. The gateway checks authentication in this priority order:

  1. X-Gateway-Key header → Full access (server mode)
  2. Authorization: Bearer header → Token validation + quotas
  3. No authentication → Public rate limits (if enabled)
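
The priority order above can be sketched as a single dispatch function. The names `AccessDecision` and `resolveAccessMode` are hypothetical, and the function only selects which path applies; the key or token must still be validated afterwards. What the gateway does when public access is disabled is an assumption here (the request is rejected).

```typescript
// Hypothetical result type for the dispatch step.
type AccessDecision =
  | { mode: "api-key"; key: string }
  | { mode: "token"; token: string }
  | { mode: "public" }
  | { mode: "rejected" };

// Pick the access path from the request headers, in priority order.
function resolveAccessMode(
  headers: Record<string, string | undefined>,
  publicEnabled: boolean,
): AccessDecision {
  // HTTP header names are case-insensitive, so normalize them first.
  const h: Record<string, string> = {};
  for (const [name, value] of Object.entries(headers)) {
    if (value !== undefined) h[name.toLowerCase()] = value;
  }

  // 1. X-Gateway-Key wins over everything else.
  const key = h["x-gateway-key"];
  if (key) return { mode: "api-key", key };

  // 2. Otherwise look for a bearer token.
  const auth = h["authorization"];
  if (auth && auth.startsWith("Bearer ")) {
    return { mode: "token", token: auth.slice("Bearer ".length).trim() };
  }

  // 3. Fall back to public access only if the gateway enables it.
  return publicEnabled ? { mode: "public" } : { mode: "rejected" };
}
```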