Run the SmolLM2 language model directly in your web browser using WebAssembly - no server processing required!
- 100% Client-Side: All AI inference happens in your browser - no data sent to servers
- WebAssembly Powered: Rust compiled to WASM for near-native performance
- Web Worker: Model runs in background thread for responsive UI
- React Chat UI: Modern chat interface using @chatscope/chat-ui-kit-react
- SmolLM2-135M: Compact 135M parameter model optimized for edge deployment
- GitHub Pages Ready: Deploy as a static site with no backend required
Visit the live demo to try the model in your browser.
Note: First load downloads ~270MB of model weights. The model is cached by your browser for subsequent visits.
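Exactly how the weights are cached depends on the fetch path and HTTP headers; one explicit approach (shown here only as a sketch, not necessarily what this app does) is the standard Cache API, where the URL and cache name below are illustrative:

```ts
// Sketch only: cache the safetensors weights explicitly so repeat visits
// skip the ~270MB download. URL and cache name are illustrative.
const WEIGHTS_URL =
  "https://huggingface.co/HuggingFaceTB/SmolLM2-135M-Instruct/resolve/main/model.safetensors";

async function fetchWeights(): Promise<Uint8Array> {
  const cache = await caches.open("smollm2-weights-v1");
  let response = await cache.match(WEIGHTS_URL);
  if (!response) {
    response = await fetch(WEIGHTS_URL);
    if (!response.ok) throw new Error(`Download failed: ${response.status}`);
    // Store a clone so the body can still be read below.
    await cache.put(WEIGHTS_URL, response.clone());
  }
  return new Uint8Array(await response.arrayBuffer());
}
```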
┌─────────────────────────────────────────────────────────────┐
│ Browser │
│ ┌─────────────────┐ ┌─────────────────────────────────┐ │
│ │ React Chat │ │ Web Worker │ │
│ │ UI (Main │◄──►│ ┌─────────────────────────┐ │ │
│ │ Thread) │ │ │ WASM Module │ │ │
│ │ │ │ │ ┌─────────────────┐ │ │ │
│ │ @chatscope/ │ │ │ │ Candle (Rust) │ │ │ │
│ │ chat-ui-kit │ │ │ │ SmolLM2-135M │ │ │ │
│ └─────────────────┘ │ │ └─────────────────┘ │ │ │
│ │ └─────────────────────────┘ │ │
│ └─────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
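In code, the boundary between the React UI and the Web Worker above is an ordinary postMessage channel. Here is a minimal sketch of the main-thread side; the message shapes are assumptions for illustration, not the actual contract in web/src/worker.ts:

```ts
// Sketch of the main-thread half of the diagram: the React UI exchanges plain
// postMessage events with the worker. Message shapes here are illustrative.
type WorkerMsg =
  | { type: "loaded" }
  | { type: "token"; token: string }
  | { type: "done" }
  | { type: "error"; error: string };

const worker = new Worker(new URL("./worker.ts", import.meta.url), {
  type: "module",
});

worker.onmessage = (e: MessageEvent<WorkerMsg>) => {
  switch (e.data.type) {
    case "loaded":
      // Enable the chat input once the model is ready.
      break;
    case "token":
      // Append the streamed token to the assistant message being rendered.
      break;
    case "done":
    case "error":
      break;
  }
};

// Ask the worker to load the model, then request a completion.
worker.postMessage({ type: "load" });
worker.postMessage({ type: "generate", prompt: "Hello!" });
```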
# Clone the repository
git clone https://github.com/link-assistant/model-in-browser.git
cd model-in-browser
# Build the WASM package
./scripts/build-wasm.sh
# Install web dependencies and start dev server
cd web
npm install
npm run dev
Open http://localhost:5173 in your browser.
# Build WASM
cd wasm && wasm-pack build --target web --out-dir ../web/src/pkg
# Build web app
cd web && npm run build
# Serve with the Rust server
cargo run --manifest-path server/Cargo.toml -- --dir web/dist
.
├── wasm/ # Rust WASM library for model inference
│ ├── src/lib.rs # SmolLM2 WASM bindings
│ └── Cargo.toml # WASM package config
├── web/ # React web application
│ ├── src/
│ │ ├── App.tsx # Main chat component
│ │ ├── worker.ts # Web Worker for inference
│ │ └── pkg/ # Built WASM package
│ ├── package.json
│ └── vite.config.ts
├── server/ # Local development server
│ └── src/main.rs # Axum server with CORS
├── .github/workflows/
│ ├── release.yml # CI/CD pipeline
│ └── deploy.yml # GitHub Pages deployment
└── scripts/
├── build-wasm.sh # Build WASM package
└── dev.sh # Start development environment
- Model Loading: When you click "Load Model", the web app downloads:
  - Model weights (~270MB safetensors file)
  - Tokenizer configuration
  - Model configuration
- Web Worker: The WASM module runs in a Web Worker to keep the UI responsive during inference.
- Text Generation: The model uses the LLaMA architecture implemented in Candle, HuggingFace's minimalist ML framework for Rust.
- Streaming Output: Tokens are generated one at a time and streamed to the chat UI for real-time response display (see the worker sketch after this list).
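Below is a sketch of the worker side of that load-and-stream loop. The `Model` class, its callback-based `generate` method, the import path, and the message shapes are assumptions about the wasm-bindgen surface exported from wasm/src/lib.rs, not its actual API:

```ts
// worker.ts sketch: run the WASM model off the main thread and stream tokens
// back as they are produced. The `Model` binding, import path, and message
// shapes are hypothetical; the real exports live in web/src/pkg.
import init, { Model } from "./pkg/model_wasm";

let model: Model | null = null;

self.onmessage = async (e: MessageEvent) => {
  const msg = e.data;
  try {
    if (msg.type === "load") {
      await init(); // Initialize the wasm-bindgen module.
      const resp = await fetch(msg.weightsUrl);
      const weights = new Uint8Array(await resp.arrayBuffer());
      model = new Model(weights);
      self.postMessage({ type: "loaded" });
    } else if (msg.type === "generate" && model) {
      // Emit one token at a time so the chat UI can render partial output.
      model.generate(msg.prompt, (token: string) => {
        self.postMessage({ type: "token", token });
      });
      self.postMessage({ type: "done" });
    }
  } catch (err) {
    self.postMessage({ type: "error", error: String(err) });
  }
};
```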
- Inference Engine: Candle - Rust ML framework with WASM support
- Model: SmolLM2-135M-Instruct
- Frontend: React 18 with TypeScript
- Chat UI: @chatscope/chat-ui-kit-react
- Build Tool: Vite
- WASM Toolchain: wasm-pack, wasm-bindgen
- Modern browser with WebAssembly support (a quick feature check is sketched below)
- ~512MB of free memory for model loading
- Chrome, Firefox, Safari, or Edge (latest versions)
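To fail gracefully on older browsers, a simple feature check (purely illustrative, not part of this project) could look like:

```ts
// Illustrative guard: bail out early if WebAssembly or Web Workers are missing.
const supported =
  typeof WebAssembly === "object" &&
  typeof WebAssembly.instantiate === "function" &&
  typeof Worker === "function";

if (!supported) {
  console.error(
    "This demo needs a browser with WebAssembly and Web Worker support."
  );
}
```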
# Rust tests
cargo test
# Web tests
cd web && npm test
# Format Rust code
cargo fmt
# Run Clippy lints
cargo clippy --all-targets --all-features
# Lint web code
cd web && npm run lint
Contributions are welcome! Please see CONTRIBUTING.md for guidelines.
- Fork the repository
- Create a feature branch
- Make your changes with tests
- Add a changelog fragment in changelog.d/
- Submit a pull request
Unlicense - Public Domain
- HuggingFace for SmolLM2 and Candle
- Candle team for the WASM-compatible ML framework
- chatscope for the React chat UI components