umassos/faillite

FailLite

FailLite is a failure-resilient model serving system for resource-constrained edge environments. This repository contains the source code of the system implementation, built on NVIDIA Triton Inference Server, and the source code of a simulator for large-scale evaluation.

Repository Organization

FailLite/
├── src/                       # System implementation
│  ├── controller/             # FailLite controller (failure detection + two-step failover approach)
│  ├── model_manager/          # FailLite Agent to coordinate the model loading/unloading on worker nodes
│  ├── monitoring_daemon/      # Collect heartbeat and system metrics from worker nodes
│  ├── model_profiler/         # Model profiling
│  └── inference_client/       # Model inference client that receives failover notifications
├── simulator/                 # Simulator for large-scale evaluation
├── scripts/                   # Scripts for running failover experiments on edge testbeds
├── analysis/                  # Scripts for results analysis and visualization
└── doc/                       # Documentation files (e.g., FailLite's architecture)
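
The layout above pairs a monitoring daemon that collects heartbeats with a controller that performs failure detection. As a rough illustration of how heartbeat-based detection can work in general, here is a minimal sketch. It is not FailLite's actual code: the `HeartbeatMonitor` class, its method names, the node names, and the 3-second timeout are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass, field


@dataclass
class HeartbeatMonitor:
    """Tracks the last heartbeat per worker and flags workers that went silent.

    Hypothetical helper for illustration; not part of the FailLite code base.
    """

    timeout_s: float = 3.0  # assumed threshold, not a FailLite default
    last_seen: dict[str, float] = field(default_factory=dict)

    def record(self, node: str, now: float) -> None:
        # Store the time at which the latest heartbeat from `node` arrived.
        self.last_seen[node] = now

    def failed_nodes(self, now: float) -> list[str]:
        # A node is suspected to have failed once its last heartbeat
        # is older than the timeout.
        return sorted(
            node
            for node, seen in self.last_seen.items()
            if now - seen > self.timeout_s
        )


monitor = HeartbeatMonitor(timeout_s=3.0)
monitor.record("edge-1", now=0.0)
monitor.record("edge-2", now=0.0)
monitor.record("edge-1", now=4.0)  # edge-1 keeps reporting, edge-2 goes silent
suspected = monitor.failed_nodes(now=5.0)
print(suspected)
```

Timestamps are passed in explicitly rather than read from a clock, which keeps the sketch deterministic and easy to test. In a real deployment, the controller would presumably react to a suspected node by starting its failover procedure.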

Getting Started

Prerequisites

Step 1: Clone the repository

Step 2: Install Dependencies

Step 3: Configure Edge Servers and Applications

Step 4: Run FailLite 🏃

Simulator

Research

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contact
