mug2mag/CreditEval

CreditEval is a financial credit reasoning evaluation system. It provides:

  • A batch evaluation pipeline (main.py) for offline scoring of chain-of-thought (CoT) outputs.
  • A Flask backend API (app.py) exposing locate / evaluate / prompt-config endpoints.
  • A web UI (Risk-COT-Tagger/) for meta-evaluation, prompt editing and few-shot example management.

Installation

pip install -r requirements.txt

Command-line evaluation

main.py runs the full multi-dimension evaluation pipeline (accuracy, logical consistency, compliance) over a CoT JSONL file (--cot_file) and its accompanying user metadata (--cust_desc_file):

python main.py \
  --cust_desc_file data/dag_4.3.2/cust_desc_file_sample.json \
  --cot_file data/dag_4.3.2/cot_file_sample.jsonl \
  --output results \
  --max_workers 30
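
The same invocation can be scripted, for example to run several CoT files in sequence. The following is a minimal sketch that only uses the four flags shown above; the helper names build_eval_command and run_evaluation are hypothetical and not part of CreditEval, and the defaults simply mirror the sample command.

```python
import subprocess
import sys


def build_eval_command(cust_desc_file, cot_file, output="results", max_workers=30):
    """Build the argv list for the main.py command shown above.

    Only the flags documented in this README are used; the default
    values mirror the sample command and are not confirmed defaults
    of main.py itself.
    """
    return [
        sys.executable, "main.py",
        "--cust_desc_file", str(cust_desc_file),
        "--cot_file", str(cot_file),
        "--output", str(output),
        "--max_workers", str(max_workers),
    ]


def run_evaluation(cust_desc_file, cot_file, output="results", max_workers=30):
    """Run main.py from the repository root and raise if it exits non-zero."""
    cmd = build_eval_command(cust_desc_file, cot_file, output, max_workers)
    return subprocess.run(cmd, check=True)
```

Calling run_evaluation("data/dag_4.3.2/cust_desc_file_sample.json", "data/dag_4.3.2/cot_file_sample.jsonl") from the repository root would then be equivalent to the shell command above.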

Backend API + Web UI

  1. Start the Flask backend (locate / evaluate / meta-evaluate):

    python app.py

    This exposes endpoints such as:

    • GET /api/prompts/<dimension>/<stage>: read prompt config from common/config/*_prompt_config.yaml.
    • PUT /api/prompts/<dimension>/<stage>: update prompt content.
    • POST /api/locate: single-user feature extraction (locate stage).
    • POST /api/locate/batch: batch locate over files.
    • POST /api/evaluate/batch: batch evaluation over files.
  2. Start the front-end HTTP server for Risk-COT-Tagger:

    cd Risk-COT-Tagger
    python start_server.py

    Then open the printed URL (typically http://localhost:8000/start.html) to access:

    • Meta-evaluation workbench for reviewing and curating errors.
    • Prompt configuration pages for Locate and Evaluate stages.
    • Few-shot example repository and instruction builder.
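
The prompt-config endpoints listed under step 1 can also be called from a script instead of the web UI. The sketch below uses only the Python standard library and rests on several assumptions that this README does not confirm: the backend listens on http://localhost:5000 (Flask's default development port), the endpoints exchange JSON, and "accuracy" / "locate" are valid values for <dimension> and <stage>. The helper names and the payload shape are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Assumption: app.py serves on Flask's default development address.
BASE_URL = "http://localhost:5000"


def prompt_url(dimension, stage, base_url=BASE_URL):
    """Build the /api/prompts/<dimension>/<stage> URL from the endpoint list."""
    return "{}/api/prompts/{}/{}".format(
        base_url.rstrip("/"), quote(dimension, safe=""), quote(stage, safe="")
    )


def build_update_request(dimension, stage, payload, base_url=BASE_URL):
    """Prepare (but do not send) a PUT request carrying a JSON body.

    The payload schema is an assumption; check app.py for the fields
    the backend actually expects.
    """
    return urllib.request.Request(
        prompt_url(dimension, stage, base_url),
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def fetch_prompt(dimension, stage, base_url=BASE_URL):
    """GET the prompt config, assuming the backend answers with JSON."""
    with urllib.request.urlopen(prompt_url(dimension, stage, base_url)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the backend running, fetch_prompt("accuracy", "locate") would read a prompt config, and urllib.request.urlopen(build_update_request("accuracy", "locate", {...})) would send an update; both depend on the assumptions stated above.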
