LocalMode
Privacy-first AI utilities. Run embeddings, vector search, RAG, classification, vision, and LLMs - all locally in the browser.
Built for the Modern Web
AI in the Browser
Run embeddings, classification, vision, and LLMs directly in the browser with WebGPU/WASM.
Privacy-First
Zero telemetry. No data leaves your device. All processing happens locally.
Zero Dependencies
Core package has no external dependencies. Built on native Web APIs.
Offline-Ready
Models cached in IndexedDB. Works without network after first load.
Packages
Modular architecture - use only what you need.
The core package provides the primitives; provider packages plug in backends such as Transformers.js, WebLLM, and PDF.js.
@localmode/core
Vector DB, embeddings, RAG utilities, storage, security, and all core functions.
@localmode/transformers
HuggingFace Transformers.js provider for ML models in the browser.
@localmode/webllm
WebLLM provider for local LLM inference with 4-bit quantized models.
@localmode/pdfjs
PDF text extraction using PDF.js for document processing pipelines.
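As a sketch of how a provider composes with the core utilities, the snippet below extracts text from a user-selected PDF and chunks it for embedding. The extractText import is an assumed @localmode/pdfjs export shown only for illustration; chunk comes from @localmode/core as in the example further down.

// Minimal sketch, assuming an extractText export in @localmode/pdfjs
import { chunk } from '@localmode/core';
import { extractText } from '@localmode/pdfjs'; // assumed API, not documented above

// Take a PDF picked by the user
const input = document.querySelector<HTMLInputElement>('input[type="file"]')!;
const file = input.files![0];

// Pull out the text (assumed signature) and chunk it for embedding
const { text } = await extractText(await file.arrayBuffer());
const chunks = chunk(text, { strategy: 'recursive', size: 512 });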
Simple, Powerful API
Function-first design with TypeScript. All operations return structured results.
Embeddings & Vector Search
$ pnpm add @localmode/core @localmode/transformers
import { createVectorDB, embed, embedMany, chunk } from '@localmode/core';
import { transformers } from '@localmode/transformers';

// Create embedding model
const model = transformers.embedding('Xenova/all-MiniLM-L6-v2');

// Create vector database
const db = await createVectorDB({
  name: 'documents',
  dimensions: 384,
});

// Chunk and embed document
const chunks = chunk(documentText, { strategy: 'recursive', size: 512 });
const { embeddings } = await embedMany({
  model,
  values: chunks.map(c => c.text),
});

// Store in database
await db.addMany(
  chunks.map((c, i) => ({
    id: `chunk-${i}`,
    vector: embeddings[i],
    metadata: { text: c.text },
  }))
);

// Search
const { embedding: queryVector } = await embed({ model, value: 'What is AI?' });
const results = await db.search(queryVector, { k: 5 });
Local LLM Inference
$ pnpm add @localmode/core @localmode/webllm
import { generateText, streamText } from '@localmode/core';
import { webllm } from '@localmode/webllm';

// Create a WebLLM model instance
const model = webllm.chat('Llama-3.2-3B-Instruct-q4f16_1-MLC');

// Generate text (non-streaming)
const { text } = await generateText({
  model,
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'What is machine learning?' },
  ],
});
console.log(text);

// Stream text for real-time responses
const { textStream } = await streamText({
  model,
  messages: [
    { role: 'user', content: 'Explain quantum computing in simple terms.' },
  ],
});

for await (const chunk of textStream) {
  // Render each chunk as it arrives (append it to the DOM in a real app)
  console.log(chunk);
}
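Retrieval-Augmented Generation
Putting the two examples together gives a simple RAG flow: embed the question, search the vector database, and pass the retrieved chunks to the chat model as context. This is a minimal sketch built from the calls shown above; the assumption that each search result carries the metadata stored via addMany (h.metadata.text) is not documented here.

import { createVectorDB, embed, generateText } from '@localmode/core';
import { transformers } from '@localmode/transformers';
import { webllm } from '@localmode/webllm';

const embeddingModel = transformers.embedding('Xenova/all-MiniLM-L6-v2');
const chatModel = webllm.chat('Llama-3.2-3B-Instruct-q4f16_1-MLC');
const db = await createVectorDB({ name: 'documents', dimensions: 384 });

// Embed the question and retrieve the closest chunks
const question = 'What is machine learning?';
const { embedding } = await embed({ model: embeddingModel, value: question });
const hits = await db.search(embedding, { k: 5 });

// Assumption: each hit exposes the metadata stored with db.addMany
const context = hits.map(h => h.metadata.text).join('\n---\n');

// Ground the answer in the retrieved context
const { text } = await generateText({
  model: chatModel,
  messages: [
    { role: 'system', content: 'Answer using only the provided context.' },
    { role: 'user', content: `Context:\n${context}\n\nQuestion: ${question}` },
  ],
});
console.log(text);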
Ready to Build?
Start building local-first AI applications with comprehensive documentation, examples, and guides.
Read the Documentation