You've decided to add AI to your web app. Now what?
There are multiple approaches, each with different tradeoffs. This guide helps you choose.
The 5 Approaches
1. Direct API Integration
Call AI providers (OpenAI, Anthropic, etc.) directly from your backend.
// Your backend
import OpenAI from 'openai';
const openai = new OpenAI({ apiKey: process.env.OPENAI_KEY });
app.post('/api/chat', async (req, res) => {
const response = await openai.chat.completions.create({
model: 'gpt-4',
messages: req.body.messages
});
res.json(response);
});
Best for: Quick prototypes, single-provider apps
2. AI Gateway Services
Use managed services like Vercel AI SDK, Cloudflare AI Gateway, or AWS Bedrock.
// Using Vercel AI SDK
import { generateText } from 'ai';
import { openai } from '@ai-sdk/openai';
const result = await generateText({
model: openai('gpt-4'),
prompt: 'Hello'
});
Best for: Production apps needing reliability, caching, analytics
3. Backend Proxy
Build your own abstraction layer over multiple providers.
// Your custom proxy
class AIProxy {
async generate(prompt) {
try {
return await this.openai.generate(prompt);
} catch {
return await this.anthropic.generate(prompt); // Fallback
}
}
}
Best for: Custom routing logic, multi-provider fallback, cost optimization
4. Browser-Native (User-Powered)
Users bring their own AI via browser extension or native browser support.
// Frontend - no backend needed (assumes a browser- or extension-provided AI API)
if ('llm' in navigator) {
const response = await navigator.llm.prompt('Hello');
}
Best for: Privacy-focused apps, cost-sensitive projects, user control
5. Local Models
Run models on-device using Ollama, LM Studio, or in-browser via WebGPU.
// Connect to local Ollama (stream: false returns a single JSON object instead of a stream)
const response = await fetch('http://localhost:11434/api/generate', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ model: 'llama3', prompt: 'Hello', stream: false })
});
const { response: text } = await response.json();
Best for: Offline capability, full privacy, air-gapped environments
Comparison Matrix
| Factor | Direct API | AI Gateway | Backend Proxy | Browser-Native | Local Models |
|---|---|---|---|---|---|
| Setup complexity | Low | Medium | High | Low | Medium |
| Cost to you | High | Medium | High | Zero | Zero |
| Privacy | Low | Low | Medium | High | Highest |
| User control | None | None | None | Full | Full |
| Provider flexibility | Low | Medium | High | High | Medium |
| Offline support | No | No | No | Possible | Yes |
| Reliability | Provider-dependent | High (managed) | You manage | User-dependent | You manage |
Decision Tree
Do you need offline support?
├── Yes → Local Models (5)
└── No → Do users need to control their AI provider?
    ├── Yes → Browser-Native (4)
    └── No → Do you need multi-provider fallback?
        ├── Yes → Build or buy?
        │   ├── Build → Backend Proxy (3)
        │   └── Buy → AI Gateway (2)
        └── No → Direct API (1)
Approach 1: Direct API Integration
When to Use
- Rapid prototyping
- Single provider is acceptable
- Simple use case
- Small scale
Implementation
// Backend (Node.js/Express)
import express from 'express';
import Anthropic from '@anthropic-ai/sdk';

const app = express();
app.use(express.json());

// The SDK reads ANTHROPIC_API_KEY from the environment by default
const anthropic = new Anthropic();

app.post('/api/chat', async (req, res) => {
  const message = await anthropic.messages.create({
    model: 'claude-3-sonnet-20240229',
    max_tokens: 1024,
    messages: [{ role: 'user', content: req.body.prompt }]
  });
  res.json({ response: message.content[0].text });
});
Pros
- Simplest to implement
- Direct access to provider features
- Good documentation
Cons
- Locked to one provider
- No fallback
- You manage everything: retries, rate limits, error handling (see the sketch below)
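Because nothing sits between your code and the provider, transient failures and 429 rate limits are yours to handle. Below is a minimal sketch of a retry helper with exponential backoff wrapped around the route handler above; the attempt count and delays are arbitrary illustrative values, and it assumes the SDK error exposes an HTTP status, as the official Node SDKs do.
// Retry transient provider errors (429 / 5xx) with exponential backoff
async function withRetries(fn, { attempts = 3, baseDelayMs = 500 } = {}) {
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      const retriable = err.status === 429 || err.status >= 500;
      if (!retriable || i === attempts - 1) throw err;
      await new Promise(r => setTimeout(r, baseDelayMs * 2 ** i)); // 500ms, 1s, 2s, ...
    }
  }
}

// Usage inside the /api/chat handler
const message = await withRetries(() => anthropic.messages.create({
  model: 'claude-3-sonnet-20240229',
  max_tokens: 1024,
  messages: [{ role: 'user', content: req.body.prompt }]
}));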
Approach 2: AI Gateway Services
When to Use
- Production applications
- Need caching, rate limiting, analytics
- Want managed reliability
- Multiple models/providers
Implementation
// Using the Vercel AI SDK
import { streamText } from 'ai';
import { openai } from '@ai-sdk/openai';
import { anthropic } from '@ai-sdk/anthropic';

// Easy provider switching behind one interface
const useOpenAI = process.env.AI_PROVIDER === 'openai';
const model = useOpenAI ? openai('gpt-4') : anthropic('claude-3-sonnet-20240229');

const result = await streamText({
  model,
  prompt: input // the user's prompt
});

// Streaming built in
for await (const chunk of result.textStream) {
  process.stdout.write(chunk);
}
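Writing to stdout is only for illustration; in a web app you would usually forward those chunks to the browser as they arrive. A rough sketch of that, assuming an Express app, the model selected above, and the same textStream; the route path and request shape are illustrative.
// Stream model output to the client chunk by chunk
app.post('/api/chat', async (req, res) => {
  const result = await streamText({ model, prompt: req.body.prompt });

  res.setHeader('Content-Type', 'text/plain; charset=utf-8');
  for await (const chunk of result.textStream) {
    res.write(chunk); // flush each piece of text as the model produces it
  }
  res.end();
});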
Pros
- Managed infrastructure
- Built-in streaming, caching
- Provider abstraction
- Good for production
Cons
- Additional service dependency
- May have usage limits
- Less control than custom proxy
Approach 3: Backend Proxy
When to Use
- Custom routing logic needed
- Cost optimization across providers
- Specific compliance requirements
- Full control required
Implementation
// Custom AI proxy with ordered fallback
import Groq from 'groq-sdk';
import OpenAI from 'openai';
import Anthropic from '@anthropic-ai/sdk';

class AIService {
  constructor() {
    // Lower priority number = tried first (e.g., cheapest or fastest provider)
    this.providers = [
      { name: 'groq', client: new Groq(), priority: 1 },
      { name: 'openai', client: new OpenAI(), priority: 2 },
      { name: 'anthropic', client: new Anthropic(), priority: 3 }
    ];
  }

  async generate(prompt, options = {}) {
    const sorted = [...this.providers].sort((a, b) => a.priority - b.priority);
    for (const provider of sorted) {
      try {
        return await this.callProvider(provider, prompt, options);
      } catch (error) {
        console.log(`${provider.name} failed, trying next...`);
      }
    }
    throw new Error('All providers failed');
  }

  async callProvider(provider, prompt, options) {
    // Provider-specific implementations (sketched below)
    switch (provider.name) {
      case 'groq':
        return this.callGroq(provider.client, prompt, options);
      case 'openai':
        return this.callOpenAI(provider.client, prompt, options);
      case 'anthropic':
        return this.callAnthropic(provider.client, prompt, options);
    }
  }
}
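The provider-specific methods are where each SDK's request and response shape differs. A sketch of two of them, meant to live inside the AIService class above and normalizing every provider's output to a plain string; the model ids and max_tokens default are illustrative, not requirements.
// Inside AIService: normalize each SDK's request/response shape to a plain string
async callOpenAI(client, prompt, options) {
  const completion = await client.chat.completions.create({
    model: options.model ?? 'gpt-4',
    messages: [{ role: 'user', content: prompt }]
  });
  return completion.choices[0].message.content;
}

async callAnthropic(client, prompt, options) {
  const message = await client.messages.create({
    model: options.model ?? 'claude-3-sonnet-20240229',
    max_tokens: options.maxTokens ?? 1024,
    messages: [{ role: 'user', content: prompt }]
  });
  return message.content[0].text;
}
callGroq would look much like callOpenAI, since Groq's SDK mirrors the OpenAI chat-completions shape.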
Pros
- Full control
- Custom fallback logic
- Cost optimization
- Compliance flexibility
Cons
- Significant engineering effort
- You maintain everything
- Server-side API costs still fall on you
Approach 4: Browser-Native (User-Powered)
When to Use
- Privacy is important
- Don't want to pay AI costs
- Users likely have AI subscriptions
- Simple AI features (enhancement, not core)
Implementation
// Frontend - no backend AI needed
class BrowserAI {
constructor() {
this.available = 'llm' in navigator;
}
async generate(prompt, fallback = null) {
if (this.available) {
return await navigator.llm.prompt(prompt);
}
if (fallback) {
return await fallback(prompt);
}
throw new Error('AI not available');
}
async stream(prompt) {
if (!this.available) {
throw new Error('AI not available');
}
return navigator.llm.streamPrompt(prompt);
}
}
// Usage
const ai = new BrowserAI();
if (ai.available) {
const response = await ai.generate('Improve this text: ' + text);
}
Pros
- Zero cost to you
- User controls privacy
- Simpler architecture
- No API key management
Cons
- Users must have AI configured
- Less control over model
- Graceful degradation needed when no AI is available (see the sketch below)
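A common way to handle that degradation is to feature-detect up front and hide or disable the AI-powered UI when nothing is available, so the core flow never depends on it. A minimal sketch using the BrowserAI wrapper above; the button id and textarea selector are hypothetical placeholders.
// Only wire up AI enhancement when a browser-provided model is detected
const ai = new BrowserAI();
const enhanceButton = document.getElementById('ai-enhance-button'); // hypothetical UI element
const textarea = document.querySelector('textarea');

if (ai.available) {
  enhanceButton.addEventListener('click', async () => {
    textarea.value = await ai.generate('Improve this text: ' + textarea.value);
  });
} else {
  // No browser AI: hide the enhancement, keep the core feature working
  enhanceButton.hidden = true;
}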
Approach 5: Local Models
When to Use
- Offline capability required
- Maximum privacy needed
- Air-gapped environments
- Specific model requirements
Implementation
// Using Ollama's local HTTP API
class LocalAI {
  constructor(baseUrl = 'http://localhost:11434') {
    this.baseUrl = baseUrl;
  }

  async generate(prompt, model = 'llama3') {
    const response = await fetch(`${this.baseUrl}/api/generate`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ model, prompt, stream: false })
    });
    const data = await response.json();
    return data.response;
  }

  async *stream(prompt, model = 'llama3') {
    const response = await fetch(`${this.baseUrl}/api/generate`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ model, prompt, stream: true })
    });

    // Ollama streams newline-delimited JSON; buffer partial lines across chunks
    const reader = response.body.getReader();
    const decoder = new TextDecoder();
    let buffer = '';
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      buffer += decoder.decode(value, { stream: true });
      const lines = buffer.split('\n');
      buffer = lines.pop(); // keep any incomplete trailing line for the next chunk
      for (const line of lines) {
        if (line.trim()) {
          const data = JSON.parse(line);
          if (data.response) yield data.response;
        }
      }
    }
  }
}
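The overview also mentions running models in-browser via WebGPU. One option there is the WebLLM library; the sketch below assumes its OpenAI-style chat API and one of its prebuilt model ids, so check the WebLLM docs for currently supported models and browser requirements.
// In-browser inference over WebGPU with WebLLM; no local server needed
import { CreateMLCEngine } from '@mlc-ai/web-llm';

// Downloads and caches the model weights in the browser on first use
const engine = await CreateMLCEngine('Llama-3-8B-Instruct-q4f32_1-MLC');

const reply = await engine.chat.completions.create({
  messages: [{ role: 'user', content: 'Hello' }]
});
console.log(reply.choices[0].message.content);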
Pros
- Works offline
- Full privacy
- No API costs
- No rate limits
Cons
- User needs to set up Ollama
- Hardware-dependent performance
- Limited model selection
- No cloud fallback
Hybrid Approaches
Real applications often combine approaches:
Browser-First with Backend Fallback
async function getAIResponse(prompt) {
  // Try browser AI first (free for you)
  if ('llm' in navigator) {
    try {
      return await navigator.llm.prompt(prompt);
    } catch {
      // Fall through to the backend
    }
  }
  // Fall back to your backend, which returns { response: string }
  const res = await fetch('/api/ai', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ prompt })
  });
  const data = await res.json();
  return data.response;
}
Local-First with Cloud Fallback
// Assumes localAI (the LocalAI class above) and cloudAI expose the same generate() interface
async function generate(prompt) {
  // Try local Ollama first (see below for one way to implement isOllamaRunning)
  if (await isOllamaRunning()) {
    return await localAI.generate(prompt);
  }
  // Fall back to the cloud
  return await cloudAI.generate(prompt);
}
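isOllamaRunning isn't defined above; one reasonable implementation is a quick health check against the local server, for example probing Ollama's /api/tags endpoint (which lists installed models) with a short timeout so the UI never hangs when nothing is listening.
// Health check: Ollama answers GET /api/tags when it is running
async function isOllamaRunning(baseUrl = 'http://localhost:11434') {
  try {
    const res = await fetch(`${baseUrl}/api/tags`, {
      signal: AbortSignal.timeout(1000) // give up after one second
    });
    return res.ok;
  } catch {
    return false;
  }
}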
Recommendations by Use Case
| Use Case | Recommended Approach |
|---|---|
| Prototype | Direct API |
| Production SaaS | AI Gateway |
| Privacy-focused app | Browser-Native + Local |
| Cost-sensitive | Browser-Native |
| Enterprise/compliance | Backend Proxy |
| Offline-first | Local Models |
| Open source project | Browser-Native |
Conclusion
There's no single "best" approach. The right choice depends on:
- Privacy requirements → Browser-Native or Local
- Cost constraints → Browser-Native
- Control needs → Backend Proxy
- Simplicity needs → Direct API or AI Gateway
- Offline needs → Local Models
Most production apps benefit from hybrid approaches—browser-native for enhancement features, backend for core functionality.