A production-grade fraud detection system designed to operate exactly where real-world banks and fintechs need it.
Financial institutions face three hard problems in fraud:
- Labels are delayed and incomplete — you can’t rely purely on supervised learning.
- Inference must be deterministic — regulators don’t accept "notebook logic".
- Every decision must be explainable — internal audit, compliance, and customers demand clarity.
FRIS is built as the system that would sit between transaction processing and risk decisioning in a real bank/fintech:
- ingest a transaction
- compute engineered signals + model outputs
- combine supervised + unsupervised signals
- return a risk score and explanation
- forward the decision to fraud ops, rule engines, or customer flows
It acts as the fraud intelligence layer, not just a classifier.
A full-stack, production-oriented fraud detection system. Built end-to-end with deterministic preprocessing, stacked ML inference, SHAP explainability, FastAPI serving, Docker packaging, and a Streamlit UI.
This repository contains the complete system exactly as it runs in production.
Below is the live version of the exact system described:
Backend (FastAPI): https://fraud-risk-intelligence-system-api.onrender.com/
Frontend (Streamlit): https://fraud-risk-intelligence-system.streamlit.app/
Below is the text-form pipeline diagram:
Raw Transaction (JSON)
↓
Input Validation (Pydantic)
↓
Frozen Feature Pipeline
↓
Base Models
- XGBoost
- Autoencoder (PyTorch)
- Isolation Forest
- MLP
↓
Meta-Feature Builder
↓
Stacked Ensemble (Logistic Regression)
↓
Risk Score + Label
↓
SHAP Explanation (Inference-Aligned)
↓
FastAPI → Streamlit UI
One pipeline. One truth. Training = Inference = Explainability.
These are the core engineering skills demonstrated by FRIS:
- training/inference parity (deterministic, frozen preprocessing)
- stacked ensembles with OOF predictions
- SHAP explainability at inference (no notebook drift)
- FastAPI backend with lifecycle-safe loading
- Dockerized deployment for reproducibility
- CI-ready repository structure with tests
- strict schema validation and boundary contracts
- Streamlit frontend correctly separated from ML logic
All preprocessing is frozen:
- numerical transforms
- temporal features
- frequency encodings
- aggregation stats
- missingness flags
Stored as:
feature_columns.json
preprocessors.joblib
Inference never refits or re-derives these transforms; it only applies the stored artifacts.
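The fit-once, apply-many pattern behind frozen preprocessing can be sketched as follows. This is a minimal illustration, not the FRIS implementation: it assumes `preprocessors.joblib` holds a fitted scikit-learn transformer and `feature_columns.json` holds the ordered column list, and it uses `StandardScaler` as a stand-in for the real transforms. The helper names `freeze`, `load_frozen`, and `transform_one` are hypothetical.

```python
import json
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.preprocessing import StandardScaler


def freeze(X_train, columns, out_dir):
    """Training time: fit once, then persist the transform and column order."""
    scaler = StandardScaler().fit(X_train)  # stand-in for the real transforms
    joblib.dump(scaler, out_dir / "preprocessors.joblib")
    (out_dir / "feature_columns.json").write_text(json.dumps(columns))


def load_frozen(out_dir):
    """Inference startup: load the artifacts, never refit them."""
    columns = json.loads((out_dir / "feature_columns.json").read_text())
    scaler = joblib.load(out_dir / "preprocessors.joblib")
    return columns, scaler


def transform_one(txn, columns, scaler):
    """Order features by the frozen column list and call transform() only."""
    missing = [c for c in columns if c not in txn]
    if missing:
        raise KeyError(f"missing features: {missing}")
    row = np.array([[float(txn[c]) for c in columns]])
    return scaler.transform(row)


# Toy data: two columns only, purely for illustration.
artifact_dir = Path(tempfile.mkdtemp())
columns = ["Time", "Amount"]
X_train = np.array([[0.0, 10.0], [100.0, 30.0], [200.0, 50.0]])
freeze(X_train, columns, artifact_dir)

frozen_columns, frozen_scaler = load_frozen(artifact_dir)
# Key order in the payload does not matter; the frozen column list decides.
features = transform_one({"Amount": 30.0, "Time": 100.0}, frozen_columns, frozen_scaler)
```

Because the column order is read from the stored list rather than from the incoming payload, a reordered or extended request cannot silently shift features into the wrong positions.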
FRIS combines multiple weak signals:
- XGBoost (supervised)
- Autoencoder (reconstruction-based anomaly signal)
- Isolation Forest (unsupervised)
- MLP (nonlinear auxiliary signal)
These feed into a Logistic Regression stacker trained only on OOF predictions.
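Training the stacker only on out-of-fold (OOF) predictions might look like the sketch below. It is an illustration under stated assumptions, not the FRIS training code: the data is synthetic, `GradientBoostingClassifier` stands in for XGBoost to avoid an extra dependency, only two of the four base signals are shown, and `risk_score` is a hypothetical helper.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, IsolationForest
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold

# Synthetic, imbalanced stand-in data.
X, y = make_classification(
    n_samples=600, n_features=10, weights=[0.9, 0.1], random_state=0
)
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Column 0: supervised probability, column 1: unsupervised anomaly score.
oof = np.zeros((len(X), 2))
for train_idx, val_idx in skf.split(X, y):
    clf = GradientBoostingClassifier(random_state=0).fit(X[train_idx], y[train_idx])
    oof[val_idx, 0] = clf.predict_proba(X[val_idx])[:, 1]

    iso = IsolationForest(random_state=0).fit(X[train_idx])
    # score_samples is lower for anomalies, so negate: higher = more anomalous.
    oof[val_idx, 1] = -iso.score_samples(X[val_idx])

# The stacker only ever sees predictions made on rows the base models
# did not train on, which keeps label leakage out of the meta-features.
stacker = LogisticRegression(max_iter=1000).fit(oof, y)

# For serving, refit the base models on all data and reuse the stacker.
base_clf = GradientBoostingClassifier(random_state=0).fit(X, y)
base_iso = IsolationForest(random_state=0).fit(X)


def risk_score(rows):
    """Build meta-features in the same column order used for training."""
    meta = np.column_stack(
        [base_clf.predict_proba(rows)[:, 1], -base_iso.score_samples(rows)]
    )
    return stacker.predict_proba(meta)[:, 1]


scores = risk_score(X[:5])
```

Each row's meta-features are filled exactly once, by the fold in which that row was held out.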
FRIS implements:
- global importance
- local per-transaction attributions
- top-K feature drivers
- inference-aligned explanations
No notebook recomputation. No drift.
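To make "local attributions" and "top-K feature drivers" concrete, here is a small sketch for the simplest case: a linear model on the log-odds scale with features treated as independent, where each feature's SHAP value reduces to `w_i * (x_i - mean_i)`. This is not the FRIS explainer. The feature names, weights, and background means are invented for illustration, and `linear_attributions` and `top_k_drivers` are hypothetical helpers.

```python
import numpy as np


def linear_attributions(weights, x, background_mean):
    """Per-feature contribution to the log-odds, relative to the background mean."""
    return np.asarray(weights) * (np.asarray(x) - np.asarray(background_mean))


def top_k_drivers(names, attributions, k=3):
    """Return the k features with the largest absolute contribution."""
    order = np.argsort(-np.abs(attributions))[:k]
    return [(names[i], float(attributions[i])) for i in order]


# Illustrative values only; none of these come from the FRIS artifacts.
names = ["xgb_score", "ae_recon_error", "iforest_score", "mlp_score"]
weights = np.array([2.0, 1.0, 0.5, 1.5])
background_mean = np.array([0.1, 0.2, 0.3, 0.1])
x = np.array([0.9, 0.2, 0.7, 0.3])

phi = linear_attributions(weights, x, background_mean)
drivers = top_k_drivers(names, phi, k=2)
```

The attributions sum to the difference between the model's log-odds for this transaction and for the background mean. That additivity check is cheap to run at inference time and is one way to confirm the explanation matches the score actually served.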
FastAPI backend with:
GET /health
POST /predict
POST /explain
Includes:
- lifecycle-safe model loading
- frozen contracts
- schema validation
- deterministic behavior across environments
- end-to-end tests
- Dockerized backend
- artifacts baked in
- pinned requirements
- deployed on Render (API)
- deployed on Streamlit Cloud (UI)
Minimal, narrative-style interface:
- enter a transaction
- get fraud probability + label
- see SHAP explanation
- transparency panel
UI contains zero ML logic — everything flows through the API.
FRIS/
├── src/
│ ├── api/ # FastAPI backend
│ ├── features/ # Frozen feature pipeline
│ ├── models/ # Model loaders + artifacts
│ ├── pipeline/ # Inference spine
│ ├── explain/ # SHAP explain layer
│ └── utils/ # Helpers
│
├── app/ # Streamlit frontend
│ └── streamlit/
│
├── data/
│ ├── processed/
│ └── artifacts/ # models, encoders, scalers, explainer
│
├── tests/ # API + inference tests
│
├── docs
├── notebooks
├── Dockerfile
├── requirements.txt
└── README.md
pip install -r requirements.txt
uvicorn src.api.main:app --reload
POST /predict
{
  "Time": 10000,
  "V1": -1.35,
  ...
  "Amount": 92.10
}
Response:
{
  "score": 0.87,
  "label": "fraud"
}
- 284,807 transactions
- 0.17% fraud rate
- PCA-derived features V1–V28
- Columns: Time, Amount, V1–V28, Class
Raw data not included.
- training/inference parity
- frozen preprocessing
- leakage detection
- stacked model design
- honest SHAP explainability
- API-first ML engineering
- Docker deployment
- UI separation of concerns
- real-world ML constraints
This is not a model.
This is a complete ML system.
FRIS v1.0 — Complete, deployed, stable.


