v1.0 — Local-First AI for the Web

LocalMode

Privacy-first AI utilities. Run embeddings, vector search, RAG, classification, vision, and LLMs, all locally in the browser.

Built for the Modern Web

AI in the Browser

Run embeddings, classification, vision, and LLMs directly in the browser with WebGPU/WASM (backend detection is sketched after this list).

Privacy-First

Zero telemetry. No data leaves your device. All processing happens locally.

Zero Dependencies

Core package has no external dependencies. Built on native Web APIs.

Offline-Ready

Models cached in IndexedDB. Works without network after the first load (the caching pattern is sketched below).
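
Before loading a model, WebGPU support can be feature-detected with the standard navigator.gpu entry point. A minimal sketch; the backend labels and logging are illustrative, not LocalMode API:

backend-check.ts
// Feature-detect WebGPU; fall back to WASM when it's unavailable.
// A fuller check would also await navigator.gpu.requestAdapter().
const backend = 'gpu' in navigator ? 'webgpu' : 'wasm';
console.log(`Running inference on the ${backend} backend`);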
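
LocalMode's cache internals aren't shown on this page; the sketch below is a generic version of the pattern using the native IndexedDB API: look the weights up by URL, and hit the network only on a miss. The database, store, and function names are hypothetical.

model-cache.ts
// Generic cache-on-first-load sketch (not LocalMode's internal code)
function openCache(): Promise<IDBDatabase> {
  return new Promise((resolve, reject) => {
    const req = indexedDB.open('model-cache', 1);
    req.onupgradeneeded = () => req.result.createObjectStore('models');
    req.onsuccess = () => resolve(req.result);
    req.onerror = () => reject(req.error);
  });
}

async function loadModel(url: string): Promise<ArrayBuffer> {
  const db = await openCache();
  const cached = await new Promise<ArrayBuffer | undefined>((resolve) => {
    const req = db.transaction('models').objectStore('models').get(url);
    req.onsuccess = () => resolve(req.result as ArrayBuffer | undefined);
    req.onerror = () => resolve(undefined); // treat errors as a cache miss
  });
  if (cached) return cached; // offline-ready: no network after first load

  const weights = await (await fetch(url)).arrayBuffer(); // first load only
  db.transaction('models', 'readwrite').objectStore('models').put(weights, url);
  return weights;
}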

Simple, Powerful API

Function-first design with TypeScript. All operations return structured results.

Embeddings & Vector Search

Terminal
$ pnpm add @localmode/core @localmode/transformers
embeddings.ts
import { createVectorDB, embed, embedMany, chunk } from '@localmode/core';
import { transformers } from '@localmode/transformers';

// Source document to index (placeholder text for this example)
const documentText = 'LocalMode runs embeddings, vector search, and LLM inference entirely in the browser.';

// Create the embedding model (all-MiniLM-L6-v2 produces 384-dim vectors)
const model = transformers.embedding('Xenova/all-MiniLM-L6-v2');

// Create vector database
const db = await createVectorDB({
  name: 'documents',
  dimensions: 384,
});

// Chunk and embed document
const chunks = chunk(documentText, { strategy: 'recursive', size: 512 });
const { embeddings } = await embedMany({ model, values: chunks.map(c => c.text) });

// Store in database
await db.addMany(
  chunks.map((c, i) => ({
    id: `chunk-${i}`,
    vector: embeddings[i],
    metadata: { text: c.text },
  }))
);

// Search
const { embedding: queryVector } = await embed({ model, value: 'What is AI?' });
const results = await db.search(queryVector, { k: 5 });
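
The snippet doesn't show the shape of a search hit. Assuming each result carries its id, a similarity score, and the stored metadata, consuming the matches (continuing embeddings.ts) looks like this:

// Field names follow the records inserted above; `score` is an assumption
for (const hit of results) {
  console.log(hit.score.toFixed(3), hit.metadata.text);
}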

Local LLM Inference

Terminal
$ pnpm add @localmode/core @localmode/webllm
chat.ts
import { generateText, streamText } from '@localmode/core';
import { webllm } from '@localmode/webllm';

// Create a WebLLM model instance
const model = webllm.chat('Llama-3.2-3B-Instruct-q4f16_1-MLC');

// Generate text (non-streaming)
const { text } = await generateText({
  model,
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'What is machine learning?' },
  ],
});

console.log(text);

// Stream text for real-time responses
const { textStream } = await streamText({
  model,
  messages: [
    { role: 'user', content: 'Explain quantum computing in simple terms.' },
  ],
});

// Render tokens as they arrive. process.stdout is Node-only; in the
// browser, append to an element on your page (the selector is illustrative).
const output = document.querySelector('#output')!;
for await (const chunk of textStream) {
  output.append(chunk);
}
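
The two examples above compose into the RAG workflow mentioned at the top of this page: retrieve the best-matching chunks, then pass them to the LLM as grounding context. A minimal sketch; the prompt wording, the answer helper, and the assumption that named vector databases persist between sessions are illustrative, not a prescribed LocalMode API:

rag.ts
import { createVectorDB, embed, generateText } from '@localmode/core';
import { transformers } from '@localmode/transformers';
import { webllm } from '@localmode/webllm';

const embeddingModel = transformers.embedding('Xenova/all-MiniLM-L6-v2');
const chatModel = webllm.chat('Llama-3.2-3B-Instruct-q4f16_1-MLC');

// Re-open the 'documents' database populated in the embeddings example
// (assumes named databases persist between sessions)
const db = await createVectorDB({ name: 'documents', dimensions: 384 });

async function answer(question: string): Promise<string> {
  // Retrieve the chunks most similar to the question
  const { embedding } = await embed({ model: embeddingModel, value: question });
  const hits = await db.search(embedding, { k: 3 });
  const context = hits.map((h) => h.metadata.text).join('\n---\n');

  // Ground the model's reply in the retrieved context
  const { text } = await generateText({
    model: chatModel,
    messages: [
      { role: 'system', content: `Answer using only this context:\n${context}` },
      { role: 'user', content: question },
    ],
  });
  return text;
}

console.log(await answer('What is machine learning?'));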

Ready to Build?

Start building local-first AI applications with comprehensive documentation, examples, and guides.

Read the Documentation