LLM Safety Labs — AI Safety Demos

A pin-ready repo that demonstrates AI safety concepts for LLMs: hallucination detection, jailbreak prevention, and bias checks.

What this project demonstrates

Hallucination checks and adversarial prompts
Jailbreak defenses and a layered safety wrapper
Basic bias/validation hooks and production-minded patterns
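
To make the idea of a layered safety wrapper concrete, here is a minimal sketch. It is not the repository's implementation: the pattern list, `SafetyResult`, `input_filter`, `output_validator`, `safe_generate`, and `echo_model` are all hypothetical names chosen for illustration, and the model call is replaced by an offline stand-in.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical patterns for illustration only; a real deny-list would be broader.
JAILBREAK_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"pretend (you are|to be) .* without (any )?restrictions", re.IGNORECASE),
]


@dataclass
class SafetyResult:
    allowed: bool
    text: str
    reason: Optional[str] = None


def input_filter(prompt: str) -> Optional[str]:
    """Layer 1: return a refusal reason if the prompt matches a known pattern."""
    for pattern in JAILBREAK_PATTERNS:
        if pattern.search(prompt):
            return f"prompt matched jailbreak pattern: {pattern.pattern}"
    return None


def output_validator(answer: str) -> Optional[str]:
    """Layer 2: a basic validation hook that rejects empty answers."""
    if not answer.strip():
        return "model returned an empty answer"
    return None


def safe_generate(prompt: str, model: Callable[[str], str]) -> SafetyResult:
    """Call the model only if layer 1 passes, then validate its output."""
    reason = input_filter(prompt)
    if reason is not None:
        return SafetyResult(allowed=False, text="", reason=reason)
    answer = model(prompt)
    reason = output_validator(answer)
    if reason is not None:
        return SafetyResult(allowed=False, text="", reason=reason)
    return SafetyResult(allowed=True, text=answer)


def echo_model(prompt: str) -> str:
    """Stand-in for a real LLM call so the sketch runs offline."""
    return f"echo: {prompt}"
```

Because each layer is a plain function that returns either a reason or `None`, further checks (for example a bias check on the output) could be added as extra layers without changing `safe_generate`'s callers.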

Quickstart

conda create -n proj8venv python=3.10.8 -y
conda activate proj8venv
pip install --upgrade pip wheel setuptools
pip install -r requirements.txt
python -m ipykernel install --user --name proj8venv --display-name "Python (proj8venv)"

Create a .env file with OPENAI_API_KEY=...
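
As a rough illustration of how a script might pick up that key, the sketch below parses a .env file using only the standard library. The helper names `parse_env`, `load_env_file`, and `require_api_key` are hypothetical; the project's scripts may well use a package such as python-dotenv instead, since requirements.txt is not shown here.

```python
import os
from typing import Dict


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks, comments, and malformed lines."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_env_file(path: str = ".env") -> Dict[str, str]:
    """Read `path` if present and copy its keys into os.environ without overwriting."""
    if not os.path.exists(path):
        return {}
    with open(path, encoding="utf-8") as handle:
        values = parse_env(handle.read())
    for key, value in values.items():
        os.environ.setdefault(key, value)
    return values


def require_api_key() -> str:
    """Fail early with a clear message when the key is missing."""
    key = os.environ.get("OPENAI_API_KEY", "")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; add it to .env or the environment")
    return key
```

Using `os.environ.setdefault` means a key that is already exported in the shell takes precedence over the file.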

Example commands

python ai_safety_demos.py
python chatgpt_api_safety_demo.py
python additional_hallucination_tests.py
python better_hallucination_tests.py
python advanced_jailbreak_tests.py
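
The commands above can also be run in sequence from Python. The following is only a sketch: `DEMO_SCRIPTS`, `build_commands`, `run_all`, and `run_subprocess` are hypothetical names, and the repository's own demo_runner.py may already do something similar (its contents are not shown here).

```python
import subprocess
import sys
from typing import Callable, List, Sequence

# Script names taken from the example commands above.
DEMO_SCRIPTS = [
    "ai_safety_demos.py",
    "chatgpt_api_safety_demo.py",
    "additional_hallucination_tests.py",
    "better_hallucination_tests.py",
    "advanced_jailbreak_tests.py",
]


def build_commands(
    scripts: Sequence[str] = DEMO_SCRIPTS, python: str = sys.executable
) -> List[List[str]]:
    """Build one argv list per script, using the current interpreter by default."""
    return [[python, script] for script in scripts]


def run_all(
    commands: Sequence[List[str]], run: Callable[[List[str]], int]
) -> List[str]:
    """Run every command in order and return the scripts that exited non-zero."""
    failed = []
    for command in commands:
        if run(command) != 0:
            failed.append(command[-1])
    return failed


def run_subprocess(command: List[str]) -> int:
    """Execute a command and return its exit code."""
    return subprocess.run(command).returncode
```

Calling `run_all(build_commands(), run_subprocess)` from the repository root would execute the five scripts with the active interpreter and report which ones failed.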

Suggested repo names

llm-safety-labs
llm-safety-demos

🔒 Technical Verification

This repository showcases the documentation and structure of my AI Safety project.

The complete functional implementation (including runnable demos, model wrappers, and evaluation scripts) is available upon request. The project has been successfully tested locally with:

RTX 4070 SUPER GPU (CUDA 12.8)
PyTorch 2.5 + cu121
OpenAI API v1.0+
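
To compare a local environment against the versions listed above, a small check like the following could be used. It is a sketch under two assumptions: that "OpenAI API v1.0+" refers to the `openai` Python package, and that PyTorch is installed under the distribution name `torch`. The helper names `installed_versions` and `meets_minimum` are hypothetical.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional, Tuple


def installed_versions(
    packages: Iterable[str] = ("torch", "openai"),
) -> Dict[str, Optional[str]]:
    """Map each package name to its installed version, or None if it is absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def meets_minimum(version: Optional[str], minimum: Tuple[int, ...]) -> bool:
    """Compare the leading numeric components of a version against a minimum."""
    if version is None:
        return False
    parts = []
    # Ignore local build tags such as "+cu121" and stop at the first non-numeric piece.
    for piece in version.split("+")[0].split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts) >= minimum
```

For example, `meets_minimum(installed_versions()["openai"], (1, 0))` reports whether the installed `openai` package is at least 1.0. This check covers package versions only; GPU availability would need a separate call such as `torch.cuda.is_available()`.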

For verification, please refer to:

UML diagram (docs/uml_overview.svg)
Recorded execution screenshots and terminal logs in /docs/demo_evidence/ (to be posted)

Folders and files

.github/workflows
docs
tests
.gitignore
LICENSE
README.md
additional_hallucination_tests.py
advanced_jailbreak_tests.py
ai_safety_demos.py
better_hallucination_tests.py
chatgpt_api_safety_demo.py
demo_runner.py
requirements.txt
setup_demo.py

FlosMume/LLM-Safety-Labs-Starter
