Architecture
Browser-native LLM integration, modeled on standard web APIs
Think of it as a "Geolocation API for AI": install once, use your AI everywhere on the web
Vision: AI to the People
• Your AI, everywhere: Install once, use on any website
• Stop paying twice: Use ChatGPT Plus/Claude Pro across all sites
• Data control: Local models, per-site permissions, full transparency
• Choice: Switch between local, cloud, or company-provided models
• Free infrastructure: Zero API costs, no key management
• 3 lines of code: navigator.llm.generate()
• Universal compatibility: Works with any model, future-proof
• Privacy built-in: GDPR/HIPAA ready, user-controlled
Three-Layer Architecture
1. Client SDK
Browser polyfill that adds the navigator.llm API
• Auto-detects transport (Extension → Daemon → Gateway)
• Provides simple API: generateText(), streamText(), etc.
• Works in any browser once extension is installed
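As a sketch of the developer experience (generateText() and streamText() are named above; the option and result shapes here are assumptions):

// Minimal usage sketch. The exact request/response shapes are illustrative.
const { text } = await navigator.llm.generateText({
  prompt: 'Summarize this article in two sentences.'
});

// Streaming variant (assuming an async-iterable stream)
for await (const chunk of navigator.llm.streamText({ prompt: 'Hello!' })) {
  render(chunk); // app-defined rendering helper
}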
2. Server (Orchestration)
Core orchestration running in the extension service worker or a Node.js daemon
• Request coordination and progress tracking
• Provider selection and routing (@webllm/router)
• Permission management (per-origin)
• Storage (IndexedDB) and usage tracking
3. Providers
16+ AI providers execute requests
• Cloud APIs: OpenAI, Anthropic, Google, Azure, DeepSeek, etc.
• Gateways: OpenRouter, Portkey, Developer Gateways
• Local: Ollama, LM Studio, browser-based models
Transport Modes
The client SDK automatically detects the best transport in priority order:
Extension - Priority 1 (highest)
• When: Chrome extension installed
• Best for: Desktop users
• User brings their own AI, zero cost to the developer
Daemon - Priority 2
• When: localhost:54321 responds
• Best for: Development
• Local testing with hot reload
Gateway - Priority 3 (fallback)
• When: Developer provides a token
• Best for: Mobile, 100% coverage
• Works on iOS/Android without installation
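A sketch of how that detection might look (the helper functions are hypothetical; the priority order and port come from the list above):

// Stand-in helpers; the real SDK's detection internals are not documented here.
declare function extensionInstalled(): Promise<boolean>;
declare function ping(url: string): Promise<boolean>;

// Illustrative transport detection following the priority order above.
async function detectTransport(gatewayToken?: string) {
  if (await extensionInstalled()) return 'extension';        // 1: Chrome extension
  if (await ping('http://localhost:54321')) return 'daemon'; // 2: local daemon
  if (gatewayToken) return 'gateway';                        // 3: hosted gateway fallback
  throw new Error('No WebLLM transport available');
}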
Developer Gateway Platform
Critical for mobile support during the transition period. Developers create hosted gateways with their own API keys and distribute limited tokens to users.
1. Developer creates gateway
At gateway.webllm.org, the developer inputs their API keys (OpenAI, Anthropic, etc.)
2. Get an encoded token
Set limits: requests/day, tokens/month, expiration. Example: webllm-gateway-abc123-5k-limit
3. Users make requests
Works on mobile! No extension or daemon needed
4. Gateway proxies to the provider
Enforces limits, tracks usage, returns the response
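On the client, using a gateway token might look like this (the configure() call is an assumption, not a documented API; the token format matches the example above):

// Hypothetical client-side setup with a developer-issued gateway token.
navigator.llm.configure({ gatewayToken: 'webllm-gateway-abc123-5k-limit' });

// Requests now route through the hosted gateway, so this works on
// iOS/Android with no extension or daemon installed.
const { text } = await navigator.llm.generateText({ prompt: 'Hi!' });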
✓ Works on iOS/Android
✓ Zero installation
✓ Same API everywhere
✓ No server needed
✓ Usage control per token
✓ Monitor and revoke
✓ No backend secrets
✓ Client-side tools
✓ Direct execution
Client-Side Tool Execution
One of the most powerful architectural benefits: AI tools execute directly in the browser with zero server round-trips.
✓ No Server Round-Trips: Tools execute immediately in the browser
✓ No Secret Handling: Developer's API keys stay in gateway, not in backend code
✓ Simpler Architecture: Modern apps just need client-side code
✓ Better UX: Instant UI updates, no latency from server relay
✓ Rich Interactions: AI can directly manipulate DOM, play sounds, trigger animations
✓ Only Fetch What's Needed: Get AI response from gateway, execute tools locally
• Interactive UIs: Theme changes, layout adjustments
• Media Control: Play sounds, show images, control video
• Form Interactions: Auto-fill, validate, show/hide fields
• Data Visualization: Update charts, graphs, tables
• Game Logic: AI-driven state changes
• Accessibility: Dynamic ARIA updates
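For example, a site can register tools that execute directly in the page when the model calls them: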
tools: [
  {
    name: 'change_theme',
    // Runs directly in the page: no server round-trip
    execute: (params) => {
      changeTheme(params.theme); // app-defined helper
      return { success: true };
    }
  },
  {
    name: 'play_sound',
    execute: () => {
      new Audio('/notify.mp3').play();
      return { success: true };
    }
  }
]
Request Flow
1. Permission Check (0-20%)
Verify the site has user permission (like the geolocation API)
2. Router Selection (20-40%)
@webllm/router scores models by 16 criteria and builds a fallback chain
3. Provider Execution (40-80%)
Try the primary provider, auto-fallback on failure
4. Storage (80-90%)
Save the conversation to IndexedDB
5. Usage Tracking (90-100%)
Record tokens, cost, and the provider used
• RequestCoordinator: Orchestrates the full pipeline with progress tracking
• ProviderManager: Provider selection and routing using the factory pattern
• RouterManager: Integrates @webllm/router for intelligent model selection
• PermissionManager: Per-origin permission system (like the geolocation API)
• UsageTracker: Tracks token consumption and cost
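A simplified sketch of how these components might fit together (the interfaces are stand-ins; only the component names and progress ranges come from the docs above):

// Stand-in interfaces for the components listed above.
interface LLMRequest { origin: string; prompt: string; }
interface PermissionManager { check(origin: string): Promise<void>; }
interface RouterManager { buildFallbackChain(req: LLMRequest): string[]; }
interface ProviderManager { execute(chain: string[], req: LLMRequest): Promise<string>; }

// Illustrative coordinator mirroring the stages and progress ranges above.
async function coordinate(
  req: LLMRequest,
  deps: { permissions: PermissionManager; router: RouterManager; providers: ProviderManager },
  onProgress: (pct: number) => void
): Promise<string> {
  await deps.permissions.check(req.origin);                  // 0-20%: per-origin permission
  onProgress(20);
  const chain = deps.router.buildFallbackChain(req);         // 20-40%: score models, build chain
  onProgress(40);
  const result = await deps.providers.execute(chain, req);   // 40-80%: primary + fallbacks
  onProgress(80);
  // 80-90%: persist to IndexedDB; 90-100%: record tokens/cost (omitted in this sketch)
  onProgress(100);
  return result;
}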
Provider Architecture
16+ providers are supported via a factory pattern; adding a new one takes ~30 lines of code.
• Anthropic (Claude), OpenAI (GPT)
• Google Generative AI, Vertex AI
• Azure OpenAI, Mistral AI
• DeepSeek, Groq, Fireworks AI
• Together.ai, Cohere
• OpenRouter (100+ models)
• Portkey (unified API)
• Cloudflare Workers AI
• Ollama, LM Studio
• Local Browser (WebGPU/WASM)
1. Define Provider Metadata
In @webllm/data registry: ID, name, type, category, tier, config fields
2. Create Provider Class (~30 lines)
Extend APIProvider for API providers, or BaseProvider for custom
3. Register Factory
One line in register-providers.ts
4. Done!
UI automatically shows correct category, tier, config form, connection testing
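For illustration, a new provider might look like this (APIProvider and register-providers.ts are named above; the class shape, method names, and registration signature are assumptions):

// Stand-in declarations; the real base class and registry live in the WebLLM packages.
declare abstract class APIProvider {
  constructor(config: Record<string, string>);
  protected abstract baseUrl: string;
}
declare function registerProvider(
  id: string,
  factory: (config: Record<string, string>) => APIProvider
): void;

// Hypothetical provider in roughly the advertised ~30 lines.
class ExampleProvider extends APIProvider {
  protected baseUrl = 'https://api.example.ai/v1';

  // Map a WebLLM request to the provider's chat endpoint (shape assumed).
  buildRequest(model: string, messages: unknown[], apiKey: string) {
    return {
      url: `${this.baseUrl}/chat/completions`,
      headers: { Authorization: `Bearer ${apiKey}` },
      body: { model, messages }
    };
  }
}

// Step 3: one line in register-providers.ts (signature assumed)
registerProvider('example', (config) => new ExampleProvider(config));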
Data Storage
All data is stored locally in IndexedDB; no external telemetry leaves the browser.
Philosophy & Principles
Key Takeaways
- ✓ WebLLM is like the Geolocation API, but for AI - Standardized browser API for LLM access
- ✓ Three-layer architecture: Client SDK → Server (orchestration) → Providers
- ✓ Three transport modes: Extension (desktop) → Daemon (dev) → Gateway (mobile/fallback)
- ✓ Developer Gateway Platform: Solves mobile problem with hosted gateways (critical for transition!)
- ✓ Client-side tool execution: AI tools run in browser - instant UI updates, no server round-trips
- ✓ Intelligent routing: @webllm/router selects best model based on 16 criteria
- ✓ Privacy-first: Local models, per-site permissions, transparent logging
- ✓ Developer-friendly: 3 lines of code, zero infrastructure, no backend secrets needed
- ✓ User control: Choose AI, approve sites, switch models, track usage
- ✓ 16+ providers supported via factory pattern (30 lines to add new)
- ✓ Automatic fallback: Graceful provider switching on failure
- ✓ Heading to W3C: Extension is phase 1, native browser API is the goal
Learn More
• Get started using WebLLM in the browser: View Guide →
• Learn about supported AI providers: View Provider Docs →
• Set up developer gateways for mobile support: Configure Gateway →
• Run the Node.js daemon for development: View Daemon Guide →
• Understand WebLLM's security model: View Security Docs →
• Test WebLLM features in the playground: Open Playground →