Commit dd3b76a

Merge pull request #11 from RamGcia/main
Changed reliance from Regex to presidio library and removed hardcoded report
2 parents 63eb9e2 + d4a2144 commit dd3b76a

13 files changed: +467 -306 lines changed

.github/ETHICS_QUESTIONNAIRE.MD

Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
+**Ethics & Regulatory Questionnaire**
+*This PR cannot be merged until this form is completed.*
+
+Please reply to this comment and answer all questions below (you can copy-paste and fill it).
+
+1. Does this change involve any of the following? (check all that apply)
+- [ ] Training or fine-tuning of AI/ML models
+- [ ] Inference/serving of AI/ML models in production
+- [ ] Processing of personal data (PII, health, biometric, financial, children’s data, etc.)
+- [ ] Dual-use or military-applicable technology
+- [ ] Safety-critical systems (medical device, aviation, automotive, etc.)
+- [ ] High-impact algorithmic decision-making (credit, hiring, criminal justice, etc.)
+- [ ] None of the above (pure docs, tests, CI, formatting, etc.)
+
+2. Estimated risk level (your honest assessment)
+- [ ] Low – no ethical or regulatory impact
+- [ ] Medium – possible fairness/privacy concerns
+- [ ] High – potential for serious harm or legal non-compliance
+
+3. Brief description of any ethical/regulatory impact (or write “None”)
+
+>
+
+4. Relevant regulations / standards considered (e.g., EU AI Act, GDPR, HIPAA, NIST AI RMF, export controls, etc.)
+List them or write “N/A”
+
+>
+
+5. Have mitigation measures been implemented (bias testing, data minimization, consent flows, etc.)?
+- [ ] Yes → describe below
+- [ ] No
+- [ ] Not applicable
+
+>
+
+Thank you! The ethics gate will evaluate your answers automatically.
Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
+name: Ethics & Regulatory Questionnaire
+description: Required for all PRs with potential ethical/regulatory impact
+title: "[Ethics Review] <PR title>"
+body:
+  - type: checkboxes
+    attributes:
+      label: Scope of Change
+      options:
+        - label: Involves training or inference of AI/ML models
+        - label: Processes personal data (PII, health, financial, etc.)
+        - label: Dual-use potential (could be used in weapons/autonomous systems)
+        - label: Affects safety-critical systems
+        - label: Purely documentation / tests / CI changes (safe)
+
+  - type: textarea
+    attributes:
+      label: Description of ethical/regulatory impact (if any)
+      placeholder: Explain who might be harmed, fairness implications, compliance requirements, etc.
+
+  - type: dropdown
+    attributes:
+      label: Have you consulted the relevant regulatory framework?
+      options: ["Yes", "No", "Not applicable"]

.github/workflows/docker-build-deploy.yaml

Lines changed: 10 additions & 6 deletions
@@ -12,6 +12,9 @@ jobs:
   scan:
     runs-on: ubuntu-latest
     steps:
+      - name: Install presidio-analyzer
+        run: pip install presidio-analyzer
+
       - name: Checkout repository
         uses: actions/checkout@v2
 
@@ -24,15 +27,16 @@ jobs:
         run: |
           python -m venv venv
           source venv/bin/activate
+          pip install --upgrade pip
           pip install -r requirements.txt
-
+
       - name: Run scan
         run: |
           source venv/bin/activate
           python main.py
 
-      - name: Upload scan report
-        uses: actions/upload-artifact@v2
-        with:
-          name: scan_report
-          path: reports/scan_report.json
+      ##- name: Upload scan report
+      ## uses: actions/upload-artifact@v4
+      ## with:
+      ## name: scan_report
+      ## path: reports/scan_report.json
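
The workflow above installs `presidio-analyzer`, and the commit message says the scanner moved from regex matching to Presidio. `main.py` is not shown in this diff, so the following is only a minimal sketch of how such a scan might call the library. The names `Finding`, `build_analyzer`, `scan_text`, and the `min_score` threshold are hypothetical. `AnalyzerEngine.analyze(text=..., language=...)` and the `entity_type`, `start`, `end`, `score` fields of its results are part of the Presidio API; the default engine also needs a spaCy English model available at runtime.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class Finding:
    """One detected PII span (hypothetical container, not from the commit)."""
    entity_type: str
    start: int
    end: int
    score: float


def build_analyzer() -> Any:
    # Imported lazily so this module still loads where presidio is absent.
    from presidio_analyzer import AnalyzerEngine
    return AnalyzerEngine()


def scan_text(text: str, analyzer: Any, min_score: float = 0.5) -> List[Finding]:
    """Run the analyzer over text and keep results at or above min_score."""
    results = analyzer.analyze(text=text, language="en")
    return [
        Finding(r.entity_type, r.start, r.end, r.score)
        for r in results
        if r.score >= min_score
    ]
```

A caller would typically build the engine once with `build_analyzer()` and reuse it across files, since engine construction loads the NLP model.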

.github/workflows/ethics-gate.yaml

Lines changed: 148 additions & 0 deletions
@@ -0,0 +1,148 @@
+on:
+  pull_request_target:
+    types: [opened, reopened, synchronize]
+  issue_comment:
+    types: [created]
+
+permissions:
+  contents: read # needed for checkout
+  pull-requests: write # needed for commenting & reviews (gh) when running in pull_request_target
+  checks: write # needed to create check runs
+
+jobs:
+  # Job that posts the questionnaire (runs in the trusted pull_request_target context).
+  post-questionnaire:
+    if: github.event_name == 'pull_request_target' && github.event.pull_request.draft == false
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout base repo (safe; do NOT checkout PR head here)
+        uses: actions/checkout@v4
+        with:
+          ref: ${{ github.event.pull_request.base.sha }}
+          fetch-depth: 0
+
+      - name: Authenticate gh CLI with GITHUB_TOKEN
+        run: |
+          echo "${{ secrets.GITHUB_TOKEN }}" | gh auth login --with-token
+
+      - name: Check if questionnaire already answered
+        id: check
+        run: |
+          PR_NUMBER=${{ github.event.pull_request.number }}
+          # Collect PR comments (robust to empty output)
+          RESPONSES=$(gh pr view "$PR_NUMBER" --json comments --jq '.comments[].body' 2>/dev/null | grep -i "Ethics & Regulatory Questionnaire" -A 20 || true)
+          if [[ -z "$RESPONSES" ]]; then
+            echo "status=missing" >> $GITHUB_OUTPUT
+          else
+            echo "status=answered" >> $GITHUB_OUTPUT
+          fi
+
+      - name: Post questionnaire if missing
+        if: steps.check.outputs.status == 'missing'
+        run: |
+          # ensure file exists in the base repo checkout (case-sensitive)
+          if [[ ! -f .github/ETHICS_QUESTIONNAIRE.MD ]]; then
+            echo ".github/ETHICS_QUESTIONNAIRE.MD not found in base repo; aborting." >&2
+            exit 1
+          fi
+          gh pr comment ${{ github.event.pull_request.number }} --body-file .github/ETHICS_QUESTIONNAIRE.MD
+          echo "Posted ethics questionnaire to PR #${{ github.event.pull_request.number }}."
+
+  # Ethics engine: collects comments, runs evaluation, posts a check, and requests changes for HIGH risk.
+  # This job runs in the trusted context for pull_request_target and also on issue_comment (untrusted).
+  # For untrusted issue_comment runs, write actions (requesting changes) may be skipped if permissions are restricted.
+  ethics-engine:
+    runs-on: ubuntu-latest
+    needs: post-questionnaire
+    steps:
+      - name: Checkout base repo (we run parser from base repo)
+        uses: actions/checkout@v4
+        with:
+          ref: ${{ github.event.pull_request.base.sha || github.ref }}
+          fetch-depth: 0
+
+      - name: Authenticate gh CLI with GITHUB_TOKEN
+        run: |
+          echo "${{ secrets.GITHUB_TOKEN }}" | gh auth login --with-token
+
+      - name: Determine PR number
+        id: prnumber
+        run: |
+          # Determine PR number whether triggered by pull_request_target or issue_comment
+          PR_NUMBER=$(jq -r 'if .pull_request then .pull_request.number elif .issue then .issue.number else empty end' "$GITHUB_EVENT_PATH")
+          if [[ -z "$PR_NUMBER" ]]; then
+            echo "No PR number found in event payload; exiting."
+            echo "risk=UNKNOWN" >> $GITHUB_OUTPUT
+            exit 0
+          fi
+          echo "pr_number=$PR_NUMBER" >> $GITHUB_OUTPUT
+
+      - name: Collect comments
+        id: collect
+        run: |
+          PR=${{ steps.prnumber.outputs.pr_number }}
+          # Gather all PR comments into a single string (robust to empty)
+          ANSWERS=$(gh pr view "$PR" --json comments --jq '[.comments[].body] | join("\n\n")' 2>/dev/null || true)
+          echo "$ANSWERS" > answers.txt
+          # Expose the answers (trim to avoid huge output)
+          echo "answers=$(echo "$ANSWERS" | head -c 32768 | sed -e 's/"/'"'"'"/g')" >> $GITHUB_OUTPUT
+
+      - name: Run ethics parser & evaluator (safe runs code from base repo)
+        id: run_engine
+        env:
+          PR_NUMBER: ${{ steps.prnumber.outputs.pr_number }}
+        run: |
+          # Ensure parser exists
+          if [[ ! -f .github/workflows/parse_and_evaluate.py ]]; then
+            echo "Parser .github/workflows/parse_and_evaluate.py not found in base repo; aborting."
+            echo "RISK_LEVEL=UNKNOWN" > result.txt
+          else
+            python3 .github/workflows/parse_and_evaluate.py "$(cat answers.txt)" > result.txt || true
+          fi
+          cat result.txt
+          # Extract RISK_LEVEL=XYZ from result.txt if present
+          RISK=$(grep -m1 '^RISK_LEVEL=' result.txt | cut -d= -f2 || echo "LOW")
+          echo "risk=$RISK" >> $GITHUB_OUTPUT
+
+      - name: Create/update "Ethics Review" check run
+        uses: actions/github-script@v7
+        with:
+          github-token: ${{ secrets.GITHUB_TOKEN }}
+          script: |
+            const risk = "${{ steps.run_engine.outputs.risk }}".trim();
+            const conclusions = {
+              "LOW": "success",
+              "MEDIUM": "action_required",
+              "HIGH": "failure"
+            };
+            const conclusion = conclusions[risk] || "failure";
+            const head_sha = (context.payload.pull_request && context.payload.pull_request.head && context.payload.pull_request.head.sha) || (context.payload.issue && context.payload.issue.pull_request && context.payload.issue.number ? undefined : undefined) || github.event.pull_request?.head?.sha;
+            await github.rest.checks.create({
+              owner: context.repo.owner,
+              repo: context.repo.repo,
+              name: "Ethics Review",
+              head_sha: head_sha || context.sha,
+              status: "completed",
+              conclusion,
+              output: {
+                title: risk === "LOW" ? "Ethics cleared" : `Ethics review: ${risk}`,
+                summary: risk === "LOW" ? "Low risk – automatically approved" : `Risk level ${risk} – review required`
+              }
+            });
+
+      - name: Request changes on HIGH risk (trusted-only; skip on untrusted events)
+        if: steps.run_engine.outputs.risk == 'HIGH'
+        run: |
+          PR=${{ steps.prnumber.outputs.pr_number }}
+          # Only attempt to request changes when running in pull_request_target context (trusted).
+          if [[ "${GITHUB_EVENT_NAME}" != "pull_request_target" ]]; then
+            echo "Not in pull_request_target context; skipping request-changes (insufficient permissions for fork PRs)."
+            exit 0
+          fi
+          # Request changes using gh (GITHUB_TOKEN from pull_request_target has write rights)
+          gh pr review "$PR" --request-changes -b "@ethics-team Required manual review for high-risk change"
+          echo "Requested changes on PR #$PR due to HIGH risk."
+
+      - name: Final status message
+        run: |
+          echo "Ethics engine completed. Risk level: ${{ steps.run_engine.outputs.risk }}"
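
The "Run ethics parser" step extracts `RISK_LEVEL=` from `result.txt` with `grep | cut || echo "LOW"`, and the check-run step maps the risk to a conclusion. In a shell pipeline without `pipefail`, the exit status is that of `cut`, which succeeds on empty input, so the `LOW` fallback would likely never fire and an empty risk would fall through to `failure`. The sketch below restates both steps in Python to make the intended behaviour explicit. The function names and the `UNKNOWN` default are assumptions; the conclusion table is copied from the workflow.

```python
# Risk-to-conclusion table, as written in the github-script step.
CONCLUSIONS = {
    "LOW": "success",
    "MEDIUM": "action_required",
    "HIGH": "failure",
}


def extract_risk(result_text: str, default: str = "UNKNOWN") -> str:
    """Return the value of the first RISK_LEVEL= line, or default if absent or empty."""
    for line in result_text.splitlines():
        if line.startswith("RISK_LEVEL="):
            value = line.split("=", 1)[1].strip().upper()
            return value or default
    return default


def check_conclusion(risk: str) -> str:
    """Map a risk level to a check-run conclusion; unknown levels fail closed."""
    return CONCLUSIONS.get(risk, "failure")
```

Defaulting to `UNKNOWN` instead of `LOW` keeps a missing or malformed parser result from being reported as an automatic approval.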

.github/workflows/redeengine.py

Lines changed: 38 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,38 @@
1+
import os
2+
import json
3+
import sys
4+
5+
def evaluate_risk(answers):
6+
risk_score = 0
7+
flags = []
8+
9+
if answers.get("involves_ai", False):
10+
risk_score += 3
11+
flags.append("AI/ML component")
12+
if answers.get("processes_pii", False):
13+
risk_score += 5
14+
flags.append("Personal data")
15+
if answers.get("dual_use", False):
16+
risk_score += 10
17+
flags.append("🚨 Dual-use technology")
18+
if answers.get("safety_critical", False):
19+
risk_score += 8
20+
flags.append("Safety-critical")
21+
22+
if "purely documentation" in answers.get("safe_changes", []):
23+
return "LOW", "No ethical concerns detected."
24+
25+
if risk_score >= 10:
26+
return "HIGH", " | ".join(flags)
27+
elif risk_score >= 5:
28+
return "MEDIUM", " | ".join(flags)
29+
else:
30+
return "LOW", "Minor changes"
31+
32+
# Parse comment or form submission here (simplified)
33+
# In real use, you'd parse the actual comment body
34+
answers = json.loads(sys.argv[1]) # passed from workflow
35+
level, reason = evaluate_risk(answers)
36+
37+
print(f"RISK_LEVEL={level}")
38+
print(f"REASON={reason}")
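
The script above expects a JSON object in `sys.argv[1]`, while the workflow passes raw comment text, and the in-file comment notes that real use would parse the comment body. A minimal sketch of that parsing step follows, assuming the reply keeps the Markdown checkbox wording from `.github/ETHICS_QUESTIONNAIRE.MD`. The keyword table, the `parse_questionnaire` name, and the mapping of "None of the above" to `safe_changes` are all assumptions, not part of the commit.

```python
import re
from typing import Any, Dict

# Hypothetical mapping from questionnaire wording to the keys evaluate_risk reads.
KEYWORDS = {
    "involves_ai": ("training or fine-tuning", "inference/serving"),
    "processes_pii": ("processing of personal data",),
    "dual_use": ("dual-use",),
    "safety_critical": ("safety-critical",),
}

# Matches a ticked Markdown task-list item such as "- [x] Some label".
CHECKED = re.compile(r"^\s*[-*]\s*\[[xX]\]\s*(.+)$", re.MULTILINE)


def parse_questionnaire(comment: str) -> Dict[str, Any]:
    """Turn ticked checkboxes in a comment body into an answers dict."""
    checked = [m.group(1).strip().lower() for m in CHECKED.finditer(comment)]
    answers: Dict[str, Any] = {
        key: any(kw in line for line in checked for kw in kws)
        for key, kws in KEYWORDS.items()
    }
    # Assumption: "None of the above" stands in for the "purely documentation" flag.
    is_safe = any(line.startswith("none of the above") for line in checked)
    answers["safe_changes"] = ["purely documentation"] if is_safe else []
    return answers
```

The resulting dict could be serialized with `json.dumps` and passed to the script as its first argument, which matches the `json.loads(sys.argv[1])` call above.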

README.md

87 Bytes
Binary file not shown.

asset-scanner/.gitignore

Lines changed: 20 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,20 @@
1+
# PII & Secrets Scanner - do not commit files
2+
scan_report.json
3+
scan_report.local.json
4+
scan_report.shareable.json
5+
*.local.json
6+
local_scan_*.json
7+
temp_report_*.json
8+
9+
# ignore any backup or temp reports
10+
*.json.bak
11+
*.json.tmp
12+
13+
# Optional: ignore the raw findings before enrichment (if you ever dump them)
14+
raw_findings.json
15+
debug_scan.json
16+
17+
# OS / editor
18+
.DS_Store
19+
Thumbs.db
20+
*.log
