WebLLM

Gateway System

An open protocol for proxying LLM requests. Self-host, or use WebLLM's hosted gateway.

Open Gateway Protocol

The gateway protocol is open and anyone can run a gateway. WebLLM Gateway is our hosted implementation, but you can self-host for full control over your data and infrastructure.

Gateways are optional; most users connect directly to providers via the browser extension or their own API keys. Gateways are useful when you want to provide AI access to website visitors without requiring them to configure anything.

Architecture Overview
[Diagram: Developer → Gateway Server → User's Browser. The developer holds the secret key and issues an access token; the user's browser sends requests with that token.]
1. Dev configures the gateway with a secret key + API keys
2. Dev generates an access token locked to the origin domain
3. User app sends requests with the token for inference
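Step 3 can be sketched from the client's side. This is a sketch, not the official client: the endpoint path (/v1/chat/completions) and the Bearer header are assumptions about a typical gateway, while the protocol only requires that the token accompany the request.

```typescript
// Minimal sketch of what the user app sends (step 3 above).
function buildGatewayRequest(
  gatewayUrl: string,
  accessToken: string,
  body: unknown
): { url: string; method: string; headers: Record<string, string>; body: string } {
  return {
    url: `${gatewayUrl}/v1/chat/completions`, // assumed endpoint path
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      // The gateway identifies the caller by this token; the browser
      // attaches the Origin header automatically, which the gateway checks.
      Authorization: `Bearer ${accessToken}`,
    },
    body: JSON.stringify(body),
  };
}
```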

Token-Origin Binding

Each access token is cryptographically bound to specific origins. When a request arrives:

1. Gateway extracts the token: wlm-abc123.eyJ...
2. Verifies the JWT signature with the secret key
3. Checks the Origin header against the token's domains[]
4. Validates that the quota is not exceeded
5. Proxies the request to the provider (OpenAI, Anthropic, etc.)

Allowed

Token domains: ["myapp.com", "*.myapp.com"]

Requests from myapp.com or staging.myapp.com

Rejected

Origin: evil-site.com

Token stolen and used from a different domain
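The allowed/rejected behavior above can be sketched as a small origin matcher. This assumes domains[] entries are either exact hostnames or "*."-prefixed wildcards, as in the examples:

```typescript
// Origin check sketch (step 3 of the verification flow). Assumes
// domains[] entries are exact hostnames or "*." wildcards.
function originAllowed(originHeader: string, domains: string[]): boolean {
  // An empty domains[] allows all origins (not recommended).
  if (domains.length === 0) return true;
  // Reduce e.g. "https://staging.myapp.com:443" to the bare hostname.
  const host = originHeader.replace(/^https?:\/\//, "").split(":")[0];
  return domains.some((pattern) =>
    pattern.startsWith("*.")
      ? host.endsWith(pattern.slice(1)) // "*.myapp.com": any subdomain
      : host === pattern                // exact match
  );
}
```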

Security Parameters

Quota Limits
| Parameter | Options | Description |
| --- | --- | --- |
| quota.type | requests \| tokens | Count API requests or LLM tokens consumed |
| quota.limit | number | Maximum allowed per period (e.g., 1000) |
| quota.period | hour \| day \| month \| lifetime | When the quota resets |
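The quota parameters can be sketched as a counter check with a rolling reset. Field names mirror the table above; the fixed 30-day month and the per-token counter model are simplifying assumptions, not part of the protocol:

```typescript
// Quota check sketch. How the gateway stores counters is an assumption.
type Quota = {
  type: "requests" | "tokens";
  limit: number;
  period: "hour" | "day" | "month" | "lifetime";
};

// Period lengths in milliseconds; "month" is simplified to 30 days.
const PERIOD_MS: Record<Exclude<Quota["period"], "lifetime">, number> = {
  hour: 3_600_000,
  day: 86_400_000,
  month: 2_592_000_000,
};

// True when the current counter should roll over to a fresh window.
function shouldReset(quota: Quota, windowStartedAt: number, now: number): boolean {
  if (quota.period === "lifetime") return false; // lifetime quotas never reset
  return now - windowStartedAt >= PERIOD_MS[quota.period];
}

// cost is 1 per call for quota.type "requests", or the call's token
// count for quota.type "tokens".
function withinQuota(quota: Quota, usedInPeriod: number, cost: number): boolean {
  return usedInPeriod + cost <= quota.limit;
}
```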
Domain Restrictions
| Parameter | Example | Description |
| --- | --- | --- |
| domains | ["myapp.com"] | Exact domain match |
| domains | ["*.myapp.com"] | Wildcard: any subdomain |
| domains | [] (empty) | Allow all origins (not recommended) |
Token Expiration
| Parameter | Type | Description |
| --- | --- | --- |
| exp | Unix timestamp (ms) | Token expires after this time |
| expiresIn | milliseconds | Convenience: expires N ms from creation |
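Putting the three parameter groups together, a token payload might look like the following sketch. exp follows the table's millisecond convention; the rule that an explicit exp wins over expiresIn is an assumption for illustration:

```typescript
// Sketch of a token payload built from the parameters above. The tables
// specify exp as a Unix timestamp in milliseconds; expiresIn is the
// convenience form, resolved to exp at creation time.
type TokenClaims = {
  domains: string[];
  quota: { type: "requests" | "tokens"; limit: number; period: string };
  exp: number; // Unix timestamp (ms), per the table above
};

function makeClaims(
  domains: string[],
  quota: TokenClaims["quota"],
  opts: { exp?: number; expiresIn?: number },
  now: number = Date.now()
): TokenClaims {
  // Assumption: an explicit exp takes precedence over expiresIn.
  const exp = opts.exp ?? now + (opts.expiresIn ?? 0);
  return { domains, quota, exp };
}
```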

Usage Example

Self-Hosted Gateway

Run your own gateway server for full control over data and infrastructure.

1. Clone the repository

   git clone https://github.com/webllm-org/webllm

2. Configure your server

   Set up API keys and gateway settings in your environment.

3. Deploy and register

   Add your gateway URL to the available gateway services.

Self-hosted benefits:

- Full data sovereignty: requests never leave your infrastructure
- Custom rate limiting and logging
- Integration with internal auth systems
- No dependency on external gateway services

Learn More

Access Modes

Understanding Public, Token-Gated, and API-Key access modes

Learn about access modes
Federation

Connect gateways together for distributed inference

Explore federation
Self-Hosting

Deploy your own gateway for full control

View deployment guide

Get Started

Create a Gateway

Use the console to create and configure your gateway

Open Console
API Reference

Full documentation for gateway token API

View API Docs