How Browser APIs Become Standards

The journey from Geolocation to WebGPU—and what it tells us about the future of browser AI

WebLLM Team

In 2008, your phone knew where you were, but your browser didn't. Today, you can get turn-by-turn directions in a web app, run GPU-accelerated machine learning in a tab, and have a video call without installing anything. None of this happened by accident.

Every powerful browser capability you use today—from navigator.geolocation to WebRTC to WebGPU—followed a pattern. Understanding that pattern reveals where the web platform is heading next, and why certain APIs become universal while others fade away.

This is the story of how browser APIs become standards, told through the APIs that changed what's possible on the web.

The Problem: Browsers Were Sandboxes, Not Platforms

In the early 2000s, browsers were document viewers that happened to run JavaScript. If you wanted to:

  • Know a user's location → Build a native app
  • Access the camera → Build a native app
  • Store data offline → Build a native app
  • Run complex graphics → Build a native app
  • Have real-time communication → Build a native app

The web was powerful for documents and basic interactivity, but anything requiring device capabilities meant going native. This created a fundamental tension: the web's reach (instant access, no install, cross-platform) versus native's capabilities.

The solution wasn't to make browsers less secure. It was to create permission-gated APIs that exposed device capabilities while keeping users in control.

Act 1: Geolocation (2008-2012) — The Permission Model Is Born

The problem: Mobile web was taking off, but web apps couldn't do the one thing mobile users expected—know where they were.

The breakthrough: The W3C Geolocation API, first implemented in Firefox 3.5 (2009) and quickly adopted by others.

// This simple API changed everything
navigator.geolocation.getCurrentPosition(
  (position) => {
    console.log(position.coords.latitude, position.coords.longitude);
  },
  (error) => {
    console.log('Permission denied or unavailable');
  }
);

Why it mattered:

  1. Permission-first design: Users explicitly grant access. No silent tracking.
  2. Graceful degradation: Sites work without location; it's an enhancement.
  3. Simple API surface: One object, two main methods. Easy to implement.
  4. Browser-mediated trust: The browser, not the website, asks for permission.

The pattern established:

  • Capability exists on device (GPS)
  • Browser exposes it via JavaScript API
  • User grants permission through browser UI
  • Site uses capability with user's informed consent

This pattern—device capability + permission prompt + simple API—became the template for everything that followed.
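The template is concrete enough to sketch in code. The Promise wrapper and the `describeGeoError` helper below are assumptions for illustration, but the error codes they map (1 = PERMISSION_DENIED, 2 = POSITION_UNAVAILABLE, 3 = TIMEOUT) come from the spec's GeolocationPositionError interface:

```javascript
// Sketch: the permission-gated pattern as a Promise wrapper.
// Maps GeolocationPositionError codes to human-readable reasons.
function describeGeoError(error) {
  const reasons = {
    1: 'Permission denied',
    2: 'Position unavailable',
    3: 'Request timed out',
  };
  return reasons[error.code] || 'Unknown error';
}

function getPosition(options = { timeout: 10000 }) {
  return new Promise((resolve, reject) => {
    if (typeof navigator === 'undefined' || !('geolocation' in navigator)) {
      // The capability is an enhancement, not a requirement.
      reject(new Error('Geolocation not supported'));
      return;
    }
    navigator.geolocation.getCurrentPosition(resolve, reject, options);
  });
}
```

The browser, not this code, decides whether the user ever sees a prompt; the page only learns the outcome.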

Timeline:

  • 2008: W3C working draft
  • 2009: Firefox 3.5 implements
  • 2010: Chrome, Safari, Opera follow
  • 2012: IE9 adds support
  • 2016: W3C Recommendation (official standard)

From draft to universal support: 4 years. From draft to formal standard: 8 years. The implementations ran ahead of the formal specification—browsers shipped, developers used it, and the standard codified what worked.

Act 2: getUserMedia (2011-2017) — Camera and Microphone Access

The problem: Video chat required plugins (Flash, Skype plugins) or native apps. No web-native way to access camera/microphone.

The breakthrough: The MediaDevices API, particularly getUserMedia().

// Access camera and microphone
const stream = await navigator.mediaDevices.getUserMedia({
  video: true,
  audio: true,
});

// Attach to video element
videoElement.srcObject = stream;

What made it work:

  1. Clear permission UX: Browser shows camera icon, user sees themselves
  2. Granular control: Request video only, audio only, or both
  3. Revocable: Users can stop sharing anytime via browser UI
  4. Constraints system: Request specific resolutions, frame rates, devices

// Advanced constraints
const stream = await navigator.mediaDevices.getUserMedia({
  video: {
    width: { ideal: 1920 },
    height: { ideal: 1080 },
    facingMode: 'user', // Front camera on mobile
  },
  audio: {
    echoCancellation: true,
    noiseSuppression: true,
  },
});
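Revocability works from the page's side too. A sketch, assuming a `stream` obtained as above: stopping every track releases the devices and turns off the browser's camera indicator.

```javascript
// Sketch: release camera and microphone by stopping every track.
// After stop(), a track's readyState becomes 'ended'.
function stopStream(stream) {
  for (const track of stream.getTracks()) {
    track.stop();
  }
}
```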

The ecosystem it enabled:

  • WebRTC for peer-to-peer video (Google Meet, Discord web)
  • Browser-based streaming (OBS web alternatives)
  • AR/VR experiences (camera pass-through)
  • Accessibility tools (screen readers with camera input)

Timeline:

  • 2011: Initial proposals
  • 2012: Chrome experiments
  • 2013: Firefox, Opera implement
  • 2015: Edge adds support
  • 2017: Safari finally joins (the last major holdout)

Safari's delayed implementation (five years after Chrome's first experiments) illustrates a key point: standards need all major browsers to truly succeed. Until Safari supported getUserMedia, developers couldn't rely on it for general audiences.

Act 3: Web Notifications (2012-2015) — Permission Fatigue Emerges

The problem: Websites couldn't notify users when something happened (new message, price drop, breaking news) unless the tab was open.

The solution: The Notifications API.

// Request permission
const permission = await Notification.requestPermission();

if (permission === 'granted') {
  new Notification('New message', {
    body: 'You have a new message from Sarah',
    icon: '/icons/message.png',
  });
}

What it got right:

  • Works even when tab is backgrounded (with Service Workers)
  • Rich notifications with images, actions, badges
  • User controls at browser and OS level

What went wrong:

  • Permission spam: Every site asked immediately on load
  • Notification abuse: Sites sent too many low-value notifications
  • User fatigue: People started auto-denying all notification requests

The lesson learned:

Just because users can grant permission doesn't mean they will. The Notifications API taught the industry that:

  1. Timing matters: Ask when the user has context for why
  2. Value must be clear: "Notify me when my order ships" vs. "Allow notifications"
  3. Abuse ruins it for everyone: Bad actors make users distrust the whole category

Browsers responded with increasingly strict rules:

  • Chrome requires user gesture before permission prompt
  • Safari blocks notification requests from iframes
  • Firefox shows fewer prompts, remembers denials longer

This experience shaped how future permission APIs were designed.
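Those rules translate into a simple client-side discipline. The `notificationAction` helper below is an assumption for illustration, but the three `Notification.permission` values it branches on ('granted', 'denied', 'default') are from the spec:

```javascript
// Pure helper: decide what to do for a given Notification.permission value.
// 'granted' → show the notification now; 'default' → safe to prompt
// (ideally from a user gesture); 'denied' → stay silent and respect the choice.
function notificationAction(permission) {
  if (permission === 'granted') return 'notify';
  if (permission === 'default') return 'prompt';
  return 'skip';
}

// Assumed browser wiring, inside a click handler so a user gesture exists:
// const action = notificationAction(Notification.permission);
// if (action === 'prompt' && (await Notification.requestPermission()) !== 'granted') return;
// if (action !== 'skip') new Notification('Your order shipped');
```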

Act 4: Service Workers (2014-2018) — The Progressive Web App Foundation

The problem: Web apps couldn't work offline. No connection = no app.

The breakthrough: Service Workers—a programmable network proxy that runs in the background.

// service-worker.js
self.addEventListener('fetch', (event) => {
  event.respondWith(
    caches.match(event.request).then((response) => response || fetch(event.request))
  );
});

Why it was revolutionary:

  1. Offline-first possible: Cache critical resources, work without network
  2. Background sync: Queue actions when offline, sync when online
  3. Push notifications: Receive server pushes even when site is closed
  4. Update control: Developers control when new versions activate

No permission prompt required—Service Workers don't access sensitive device capabilities. But they introduced a new concept: progressive enhancement at the platform level.

A site could be:

  • A regular website (no Service Worker)
  • An installable PWA (with Service Worker + manifest)
  • An offline-capable app (with caching strategies)

All the same codebase, progressively enhanced based on browser support.
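Registration itself follows that model. A sketch: the `supportsServiceWorker` helper is an assumption, while the registration call is the standard API.

```javascript
// Feature detection is the whole trick: unsupported browsers skip the
// upgrade and keep working as a plain website.
function supportsServiceWorker(nav) {
  return !!nav && 'serviceWorker' in nav;
}

async function registerSW(scriptUrl = '/service-worker.js') {
  if (typeof navigator === 'undefined' || !supportsServiceWorker(navigator)) {
    return null; // no offline support, but the site still works
  }
  return navigator.serviceWorker.register(scriptUrl);
}
```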

Timeline:

  • 2014: Chrome experiments
  • 2015: Firefox implements
  • 2016: Chrome, Firefox stable
  • 2018: Safari (finally) adds partial support
  • 2019: Full Safari support

Service Workers enabled the "Progressive Web App" category—web apps that could be installed, work offline, and receive push notifications. Google, Twitter, Starbucks, and others shipped PWAs that matched or exceeded their native app experiences.

Act 5: WebGL → WebGPU (2011-2023) — Raw Power Comes to the Browser

The problem: Browsers couldn't do serious graphics or compute. Games, 3D visualization, and simulations required native code.

The evolution:

WebGL (2011)

Based on OpenGL ES 2.0, WebGL brought hardware-accelerated 3D graphics to the browser.

// WebGL is verbose but powerful
const gl = canvas.getContext('webgl');
gl.clearColor(0.0, 0.0, 0.0, 1.0);
gl.clear(gl.COLOR_BUFFER_BIT);
// ... hundreds more lines for anything useful

Impact: Games like HexGL, data visualizations, Google Maps 3D, A-Frame VR experiences.

Limitations: Based on 15-year-old graphics concepts, difficult API, limited compute capabilities.

WebGPU (2023)

The modern successor, based on Vulkan/Metal/DX12 concepts.

// WebGPU - more explicit, more powerful
const adapter = await navigator.gpu.requestAdapter();
const device = await adapter.requestDevice();

// Compute shaders for ML inference, physics, etc.
const computePipeline = device.createComputePipeline({
  layout: 'auto', // required by the finalized spec
  compute: {
    module: device.createShaderModule({ code: shaderCode }),
    entryPoint: 'main',
  },
});

Why WebGPU matters:

  1. Compute shaders: Run parallel computations on GPU (ML inference!)
  2. Modern architecture: Multi-threaded, lower overhead than WebGL
  3. Cross-platform abstraction: Works on Vulkan, Metal, DX12 backends
  4. Explicit resource management: Better performance predictability

No permission needed—GPU access doesn't expose sensitive user data. But like Service Workers, it's a capability that enables entirely new application categories.
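A dispatch sketch makes the compute-shader point concrete. The WGSL below (which doubles each element of a buffer) and the names in it are assumed examples; the ceil division is how workgroup counts are derived for a shader declared with @workgroup_size(64):

```javascript
// Assumed example WGSL: doubles each element of a storage buffer.
const shaderCode = `
  @group(0) @binding(0) var<storage, read_write> data: array<f32>;

  @compute @workgroup_size(64)
  fn main(@builtin(global_invocation_id) id: vec3<u32>) {
    data[id.x] = data[id.x] * 2.0;
  }
`;

// With @workgroup_size(64), dispatch ceil(n / 64) workgroups to cover n elements.
function workgroupCount(elements, workgroupSize = 64) {
  return Math.ceil(elements / workgroupSize);
}

// e.g. pass.dispatchWorkgroups(workgroupCount(1_000_000)); // 15625 workgroups
```

Swap the multiply for a matrix product and this is the skeleton of GPU-accelerated inference in a tab.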

Timeline:

  • 2017: Initial proposals (Apple's WebGPU, Google's NXT)
  • 2019: Unified WebGPU specification work
  • 2022: Chrome Origin Trial
  • 2023: Chrome ships, Firefox and Safari in development
  • 2024+: Broader browser support

WebGPU is why running LLMs in the browser is becoming practical. The same GPU that renders your game can now run neural network inference.

The Pattern: How APIs Become Standards

Looking across these examples, a clear pattern emerges:

Stage 1: Need Identification

A capability exists on devices that web apps can't access. Developers build native apps instead. The gap becomes painful.

Stage 2: Proposal and Experimentation

Browser vendors (often Google, Mozilla, or Apple) propose an API. They implement it behind flags. Brave developers experiment.

Stage 3: Multi-Browser Implementation

Other browsers implement their versions. Incompatibilities emerge. Developers complain.

Stage 4: Standardization

The W3C, WHATWG, or other body creates a formal specification. Implementations converge. Edge cases get defined.

Stage 5: Universal Adoption

All major browsers ship stable implementations. Developers can rely on the API. Tutorials, libraries, and tooling emerge.

Stage 6: Ecosystem Maturity

The API becomes "boring infrastructure." Nobody thinks about whether fetch() works—it just does.

Typical timeline: 4-8 years from proposal to universal support.

What Makes an API Succeed?

Not every proposed API becomes a standard. The successful ones share characteristics:

1. Clear Use Case

Geolocation: "I want to show users things near them." getUserMedia: "I want video chat in the browser." WebGPU: "I want fast graphics and compute."

Vague value propositions die in committee.

2. Permission Model That Works

  • Ask at the right time: When user intent is clear
  • Granular options: Camera vs. microphone, coarse vs. fine location
  • Revocable: Users can change their mind
  • Visible state: Users know when an API is active

3. Graceful Degradation

Good APIs work when:

  • Permission is denied
  • Capability isn't available
  • Browser doesn't support the API

Sites should enhance with capabilities, not require them.
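In code, that starts with capability detection. The `detectCapabilities` helper is an assumption for illustration; the property checks inside it are the standard feature-detection idiom:

```javascript
// Each check guards a feature the site enhances with but never requires.
// Pass a navigator-like object; defaults to the real one in a browser.
function detectCapabilities(nav = typeof navigator !== 'undefined' ? navigator : {}) {
  return {
    geolocation: 'geolocation' in nav,
    camera: !!nav.mediaDevices && 'getUserMedia' in nav.mediaDevices,
    serviceWorker: 'serviceWorker' in nav,
    webgpu: 'gpu' in nav,
  };
}
```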

4. Implementation Feasibility

APIs that require massive browser changes or pose security risks don't get implemented. Successful APIs find the "narrow waist"—maximum capability with minimum attack surface.

5. Multi-Stakeholder Support

An API needs at least two major browsers committed to implementation. Single-browser APIs become "vendor extensions" that developers can't rely on.

The APIs That Didn't Make It

Learning from failures is instructive:

Web SQL Database: Essentially a thin wrapper over SQLite, implemented in Chrome and Safari but rejected by Mozilla. IndexedDB won because it had broader support.

Battery Status API: Pulled from Firefox after researchers showed it enabled user fingerprinting. Privacy implications killed its path to a standard.

Ambient Light Events: Same story—too much fingerprinting risk for limited benefit.

Web Bluetooth (partially): Implemented in Chrome; Apple and Mozilla declined to implement it, citing privacy and security concerns. Useful but not universal.

The lesson: APIs must balance capability against privacy and security. The bar keeps rising.

Where This Is Heading

The pattern continues. Here's what's in various stages:

Shipping Now

  • WebGPU: GPU compute in browsers
  • WebTransport: Low-latency networking
  • WebCodecs: Direct video/audio codec access

In Development

  • WebNN: Neural network inference acceleration
  • File System Access: Read/write local files (with permission)
  • Multi-Screen Window Placement: Control windows across displays

Proposed / Experimental

  • Web Serial: Talk to serial devices
  • WebHID: Access to Human Interface Devices
  • WebXR Depth API: Depth sensing for AR

The Obvious Gap

  • AI/LLM Access: No standard way to invoke AI capabilities

Every major platform (iOS, Android, Windows, macOS) is adding AI APIs. The browser—the universal platform—has nothing. Users can access location, camera, microphone, GPU, notifications, local files... but not AI.

That gap won't last. The pattern predicts it:

  1. Need: Developers want AI in web apps without server round-trips
  2. Experimentation: WebLLM-style polyfills prove the concept
  3. Proposals: navigator.llm or similar APIs get proposed
  4. Implementations: Browsers add AI capabilities
  5. Standard: W3C or WHATWG codifies the API
  6. Universal: AI becomes a standard browser capability
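To be explicit about how speculative step 3 is: nothing like this exists yet, and every name below is invented. But the detect-then-polyfill shape would mirror the APIs above:

```javascript
// 'llm' is a hypothetical property; no browser ships it today.
function hasBrowserAI(nav = typeof navigator !== 'undefined' ? navigator : {}) {
  return 'llm' in nav;
}

// Hypothetical usage, mirroring earlier permission-gated APIs:
// const engine = hasBrowserAI(navigator)
//   ? navigator.llm                 // future built-in capability
//   : await loadWebLLMPolyfill();   // WebLLM-style fallback today
```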

Lessons for the Future

From 15+ years of browser API evolution:

  1. Permission models work. Users can handle power when they have control.

  2. Progressive enhancement wins. APIs that enhance without requiring work best.

  3. Multi-browser support is essential. Single-vendor APIs aren't reliable.

  4. Privacy concerns can kill APIs. The bar rises with each abuse.

  5. Use cases drive adoption. Theoretical capability needs concrete applications.

  6. Patience is required. Good APIs take 4-8 years to become universal.

The web platform keeps expanding because this process works. Each new API makes the web more capable while (mostly) maintaining the security and privacy that make it trustworthy.

The next chapter is being written now. AI in the browser isn't a question of "if"—it's a question of what the API looks like and who implements it first.

