Sentinel is a high-performance risk assessment platform that allows real-time switching between Cloud-scale inference (Cerebras) and Private Local execution (NVIDIA GPU).
To run the Sentinel UI locally on your machine:
- Prerequisites: Ensure you have Node.js installed (which includes npm).
- Initialize Project:
```bash
# Create a new directory and enter it
mkdir sentinel-app && cd sentinel-app

# Initialize a Vite project (React + TypeScript)
npm create vite@latest . -- --template react-ts

# Install dependencies used in this app
npm install lucide-react recharts @google/genai
```
- Copy Files: Place `App.tsx`, `index.html`, `types.ts`, and the `components/` & `services/` folders into your local project structure.
- Run Development Server:
```bash
npm run dev
```
- Access the App: Open your browser to `http://localhost:5173`.
To use the NVIDIA Local Mode, you must have a local LLM server running on your machine via Ollama.
- Install Ollama: Download and install from ollama.com.
- Pull a Risk Model: Open your terminal and pull a high-reasoning model:
```bash
ollama pull llama3
```
- Ensure API Accessibility: Ollama serves an OpenAI-compatible API by default at `http://localhost:11434/v1`; a request sketch is shown after this list.
- Hardware Optimization: Ensure your NVIDIA GPU (RTX 3090/4090 recommended) is correctly recognized by Ollama. You can check this by running `ollama run llama3` and monitoring your GPU usage in Task Manager or `nvidia-smi`.
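For reference, here is a minimal sketch of a request against Ollama's OpenAI-compatible endpoint. The `queryLocalModel` name and the prompt are illustrative; `llama3` matches the pull command above.

```ts
// Minimal sketch: query the local Ollama server through its
// OpenAI-compatible /v1/chat/completions endpoint.
// Assumes `ollama pull llama3` has already been run.
async function queryLocalModel(prompt: string): Promise<string> {
  const response = await fetch("http://localhost:11434/v1/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "llama3",
      messages: [{ role: "user", content: prompt }],
    }),
  });
  if (!response.ok) {
    throw new Error(`Ollama request failed: ${response.status}`);
  }
  const data = await response.json();
  return data.choices[0].message.content;
}
```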
- Cloud Engine (Cerebras): Routes requests via the Cerebras SDK for sub-second inference on Wafer-Scale Engines. Best for high-throughput, non-private data.
- Local Engine (NVIDIA/Ollama): Routes requests via the local loopback to your GPU. Best for sensitive data privacy, offline testing, and cost-free execution. A routing sketch follows this list.
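To illustrate the dual-engine design, here is a minimal routing sketch. It is not the app's actual `services/` code: the `EngineMode` type, model names, and environment-based key handling are assumptions for this example. Both Cerebras and Ollama expose OpenAI-compatible chat-completions endpoints, so only the URL, key, and model differ.

```ts
// Minimal sketch of Sentinel-style engine routing (illustrative only).
type EngineMode = "cloud" | "local";

// Cerebras Inference API and the local Ollama server, both OpenAI-compatible.
const ENDPOINTS: Record<EngineMode, string> = {
  cloud: "https://api.cerebras.ai/v1/chat/completions",
  local: "http://localhost:11434/v1/chat/completions",
};

// Assumed model names; adjust to what each engine actually serves.
const MODELS: Record<EngineMode, string> = {
  cloud: "llama3.1-8b",
  local: "llama3",
};

async function assessRisk(mode: EngineMode, prompt: string): Promise<string> {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (mode === "cloud") {
    // Cerebras requires an API key; local Ollama does not.
    // Key sourcing is an assumption; wire it to your app's secret handling.
    headers["Authorization"] = `Bearer ${process.env.CEREBRAS_API_KEY}`;
  }
  const response = await fetch(ENDPOINTS[mode], {
    method: "POST",
    headers,
    body: JSON.stringify({
      model: MODELS[mode],
      messages: [{ role: "user", content: prompt }],
    }),
  });
  const data = await response.json();
  return data.choices[0].message.content;
}
```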
- NVIDIA RTX GPU: Pascal architecture or newer (Ampere/Ada Lovelace preferred for FP16/BF16 performance).
- VRAM:
  - 8GB for Llama3 8B (Quantized)
  - 24GB for Llama3 70B (Quantized) or larger models
- Drivers: Latest NVIDIA Game Ready or Studio drivers.
The system tracks and compares:
- Network Latency: Round-trip time for a request to reach the Cerebras Cloud and return.
- VRAM Latency: Time for the local GPU to load weights and generate tokens via Ollama. A measurement sketch follows this list.
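As an illustration, either latency can be approximated by timing an identical request against each engine. This sketch reuses the hypothetical `assessRisk` from the routing sketch above.

```ts
// Minimal sketch: compare cloud vs. local latency by timing identical
// requests through the hypothetical assessRisk() routing function.
async function measureLatency(
  mode: "cloud" | "local",
  prompt: string
): Promise<number> {
  const start = performance.now();
  await assessRisk(mode, prompt);
  return performance.now() - start; // elapsed milliseconds
}

// Example comparison run
async function compareEngines(prompt: string): Promise<void> {
  const cloudMs = await measureLatency("cloud", prompt);
  const localMs = await measureLatency("local", prompt);
  console.log(`Cerebras Cloud: ${cloudMs.toFixed(0)} ms`);
  console.log(`NVIDIA Local:  ${localMs.toFixed(0)} ms`);
}
```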