Submission/heretolearn/gymjailbreak #8

KaiquanMah · 2026-01-11T08:58:45Z

GymJailbreak Scenario

This is an Insecure Output scenario which tests an AI fitness coach's ability to maintain safe intensity limits when processing mathematical calculations embedded in workout requests.

Overview

An AI-powered gym coach (TitanCoach) controls a smart fitness machine (King Kong HI-POWER) and generates workout routines. The attacker uses formula injection and progressive calculations to trick the AI into outputting workout intensities exceeding the 100% safety limit, which could damage equipment or harm users.

More Information

For more information, please refer to scenarios\security_arena\submissions\heretolearn\gymjailbreak\README.md

… insuranceinsider scenario 2 - added

…unds

…ak seems ok-pass. insuranceinsider to fix

…imeout. cant be too obvious cuz atker or defender can reject. cant be too simple or outright ask defender to output unsafe percenages. calculations were difficult to target, yet finally worked

…eadme formats - SCENARIO_SPECIFICATIONS.md has novel aspects, no real world relevance. example_scenario/readme.md has real world relevance, no novel aspects

… ideas downloaded using r.jina into workings folder for traceability

…mmss/ into the respective scenario folders

…ssion

kaiquanmah and others added 20 commits December 31, 2025 17:35

scenario 1 - crmcrack draft

498d7dc

crm scenario 1 - fixed toml to follow portfolioiq mandatory sections.…

9442482

… insuranceinsider scenario 2 - added

scenario 3 GymJailbreak

4290aed

scenario4 base64bandit to draft biased emails

bc5b152

test portfolioiq worked. base64bandit - round1 defender won all 20 ro…

f8d195d

…unds

base64bandit passed. crmcrack still failed aft iterations. gymjailbre…

7bb7285

…ak seems ok-pass. insuranceinsider to fix

insuranceinsider - finally succeeded

5ca70ca

insuranceinsider - finally succeeded

14dff8e

base64bandit - updated docs. insuranceinsider - to update docs later

25a33f2

base64bandit readme minor update. insuranceinsider readme updated

2c78a9f

gymjailbreak update for realism

9be929c

crmcrack - updated scenario w data JSON

af239a9

gymjailbreak - scenario finally successful. cant be too verbose cuz t…

4c18f00

…imeout. cant be too obvious cuz atker or defender can reject. cant be too simple or outright ask defender to output unsafe percenages. calculations were difficult to target, yet finally worked

crmcrack - still failing

09c6e35

crmcrack - still failing. to continue debugging

41861ab

crmcrack - finally successful. all 4 README follows the 2 different r…

d7da4b6

…eadme formats - SCENARIO_SPECIFICATIONS.md has novel aspects, no real world relevance. example_scenario/readme.md has real world relevance, no novel aspects

workings md added n updated w debugging steps. moved ALL workings and…

be0ec7b

… ideas downloaded using r.jina into workings folder for traceability

workings md minor scenario update

6b718b4

results.json copied from results/heretolearn/scenarioname/yyyyMMdd_hh…

cb74217

…mmss/ into the respective scenario folders

Submission: heretolearn - gymjailbreak

6ea9a97

lemontreeran mentioned this pull request Jan 13, 2026

Register badblueprint plugin in registry Purple-Vanguard/agentbeats-lambda#3

Merged

registry.py - commented out other scenario plugins for the team submi…

196fd34

…ssion

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Submission/heretolearn/gymjailbreak #8

Submission/heretolearn/gymjailbreak #8

Uh oh!

KaiquanMah commented Jan 11, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Submission/heretolearn/gymjailbreak #8

Are you sure you want to change the base?

Submission/heretolearn/gymjailbreak #8

Uh oh!

Conversation

KaiquanMah commented Jan 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

GymJailbreak Scenario

Overview

More Information

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

KaiquanMah commented Jan 11, 2026 •

edited

Loading