fix(gh-fix-ci): Prefer job-specific logs over run-level logs to prevent cross-job contamination by yaooqinn · Pull Request #139 · openai/skills

yaooqinn · 2026-02-09T14:57:04Z

Problem

When a workflow run contains multiple jobs, fetch_check_log fetches the entire run log (gh run view --log), which combines output from all jobs. It falls back to job-specific logs only when the run log is unavailable (pending).

This causes cross-job log contamination: the failure snippet for one job can contain errors from a completely different job, leading to incorrect CI failure diagnosis.
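To make the contamination concrete, here is a minimal sketch. The job names, log text, and the naive_failure_snippet heuristic are all invented for illustration; it is a stand-in and not the real extract_failure_snippet. It only shows why a combined blob can yield the same snippet for every job.

```python
# Hypothetical per-job logs; names and text are invented for illustration.
JOB_LOGS = {
    "aggregate-fuzzer": "step ok\nerror: aggregation result mismatch\n",
    "memory-fuzzer": "step ok\nerror: memory arbitration failed\n",
}

# A run-level log combines every job's output into a single blob.
RUN_LOG = "".join(JOB_LOGS.values())


def naive_failure_snippet(log_text: str) -> str:
    """Return the last line containing 'error' (a stand-in heuristic)."""
    matches = [line for line in log_text.splitlines() if "error" in line.lower()]
    return matches[-1] if matches else ""


# Every job gets the same snippet when the combined log is searched.
run_level = {job: naive_failure_snippet(RUN_LOG) for job in JOB_LOGS}

# Each job gets its own error when only its own log is searched.
job_level = {job: naive_failure_snippet(text) for job, text in JOB_LOGS.items()}

print(run_level)
print(job_level)
```

With the run-level blob, aggregate-fuzzer is reported with the memory-fuzzer error; with per-job logs, each job reports its own failure.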

Example

In a workflow with 22 jobs (e.g., Velox Fuzzer Jobs), all failing checks get the same mixed log blob. extract_failure_snippet then picks up errors from the wrong job:

"Spark Aggregate Fuzzer" snippet showed Memory Arbitration errors instead of its own aggregation mismatch
"Ubuntu debug" snippet showed "Linux release with adapters" errors

Fix

Prioritize fetch_job_log(job_id) via the GitHub API (/repos/.../actions/jobs/{job_id}/logs) when a job_id is available. Fall back to run-level logs (gh run view --log) only when:

No job_id is available, OR
The job-specific log fetch fails (non-pending error)

Before

def fetch_check_log(run_id, job_id, repo_root):
    log_text, log_error = fetch_run_log(run_id, repo_root)  # ALL jobs
    if not log_error:
        return log_text, "", "ok"  # ignores job_id
    # only tries job log when run log is PENDING

After

def fetch_check_log(run_id, job_id, repo_root):
    if job_id:
        job_log, job_error = fetch_job_log(job_id, repo_root)  # specific job
        if job_log:
            return job_log, "", "ok"
    # falls back to run-level log
    log_text, log_error = fetch_run_log(run_id, repo_root)
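The After snippet is abbreviated, so here is a self-contained sketch of the same ordering that can be run without gh or network access. It makes several assumptions that are not part of this PR: the fetchers are passed in as parameters, repo_root is dropped, the fake fetchers are hypothetical, and the "error" status string is invented (the PR only shows "ok").

```python
from typing import Callable, Optional, Tuple

# Hypothetical fetcher type: takes an id, returns (log_text, error_message).
Fetcher = Callable[[str], Tuple[str, str]]


def fetch_check_log(
    run_id: str,
    job_id: Optional[str],
    fetch_job_log: Fetcher,
    fetch_run_log: Fetcher,
) -> Tuple[str, str, str]:
    """Return (log_text, error, status), preferring the job-specific log."""
    if job_id:
        job_log, _job_error = fetch_job_log(job_id)
        if job_log:
            return job_log, "", "ok"
    # No job_id, or the job-specific fetch produced nothing: use the run log.
    log_text, log_error = fetch_run_log(run_id)
    if log_error:
        return "", log_error, "error"  # "error" status is an assumption
    return log_text, "", "ok"


def fake_job_log(job_id: str) -> Tuple[str, str]:
    """Stand-in for the job log fetch; only job "42" has a log."""
    if job_id == "42":
        return f"log for job {job_id}", ""
    return "", "not found"


def fake_run_log(run_id: str) -> Tuple[str, str]:
    """Stand-in for the run-level fetch; always succeeds."""
    return f"combined log for run {run_id}", ""


print(fetch_check_log("7", "42", fake_job_log, fake_run_log))
print(fetch_check_log("7", "99", fake_job_log, fake_run_log))
print(fetch_check_log("7", None, fake_job_log, fake_run_log))
```

The first call returns the job-specific log; the other two fall back to the run-level log, because the job fetch fails or no job_id is given.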

Testing

Tested against facebookincubator/velox#16308, which has a Fuzzer Jobs workflow with 22 jobs (1 failing). After the fix, each failing check correctly shows its own job-specific errors.

yaooqinn requested a review from a team February 9, 2026 14:57

yaooqinn wants to merge 1 commit into openai:main from yaooqinn:fix/gh-fix-ci-job-specific-logs
