Explore chart types for posterior evolution visualization

## Summary

Exploratory issue to evaluate different chart types for visualizing how posterior beliefs evolve over time during experiments. The goal is to prototype multiple approaches so stakeholders can judge their usefulness firsthand.

## Context

With the event log feature (#16), we'll have historical snapshot data available:
- `turns` and `rewards` per arm at various trial points
- `total_experiment_turns` for x-axis positioning
- `created` timestamp for optional date-based display

From this data we can compute:
- Posterior mean (estimated conversion rate)
- Credible intervals (uncertainty bounds)
- Full Beta distribution shape
- Probability each arm is "best"
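
As a rough illustration of the first two items, here is a minimal sketch of deriving a posterior mean and an equal-tailed credible interval from one snapshot. It assumes a uniform Beta(1, 1) prior and that `rewards` counts successes out of `turns` trials; neither is confirmed by the event log design in #16. The function name `posterior_summary` and the sampling-based interval are illustrative choices only.

```python
import random


def posterior_summary(turns, rewards, level=0.95, draws=20000, seed=0):
    """Summarize one arm's Beta posterior from snapshot counts."""
    # Assumption: uniform Beta(1, 1) prior; rewards = successes, turns = trials.
    alpha = 1 + rewards
    beta = 1 + (turns - rewards)
    mean = alpha / (alpha + beta)

    # Estimate an equal-tailed credible interval from sorted posterior draws.
    rng = random.Random(seed)
    samples = sorted(rng.betavariate(alpha, beta) for _ in range(draws))
    tail = (1 - level) / 2
    lower = samples[int(tail * draws)]
    upper = samples[int((1 - tail) * draws) - 1]
    return {"alpha": alpha, "beta": beta, "mean": mean, "lower": lower, "upper": upper}


# Example snapshot: 30 rewards in 200 turns.
summary = posterior_summary(turns=200, rewards=30)
```

An exact quantile function (for example from a statistics library) could replace the sampling step; sampling is used here only to keep the sketch dependency-free.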

## Experiment Types to Consider

The module serves two distinct use cases with different visualization needs:

| Type | Arms | Example | Visualization challenge |
|------|------|---------|------------------------|
| A/B tests | 2-5 | Landing page variants | Show detail per arm |
| Recommendations | 100-500+ | Blog post rankings | Summarize many arms |

Each chart should either work for both types, or we need a different default chart per type.

---

## Chart Types to Prototype

### 1. Line Chart with Credible Interval Bands

**Best for:** A/B tests (2-5 arms)

```
Rate
 │    ╭──────────────────── Arm A (shaded CI)
0.15─│───╱─────────────────────────────
     │  ╱   ╭─────────────── Arm B  
0.10─│─╱───╱────────────────────────────
     │╱   ╱
0.05─│───╱──────────────────────────────
     └────────────────────────────────→ trials
```

- X-axis: total experiment turns (or datetime)
- Y-axis: conversion rate estimate
- Lines: one per arm with distinct colors
- Bands: 95% credible interval shaded around each line

**Pros:** Intuitive, shows estimate + uncertainty, familiar format
**Cons:** Gets cluttered with many arms, overlapping bands hard to read

**Prototype tasks:**
- [ ] Basic line chart with Chart.js or similar
- [ ] Add shaded credible interval bands
- [ ] Test with 2, 5, and 10 arms
- [ ] Toggle between trials and datetime x-axis

---

### 2. Probability of Winning (Stacked Area)

**Best for:** A/B tests, decision-focused view

```
P(best)
  1.0─│████████████████▓▓▓▓▓▓▓▓▓▓░░░░░░
     │████████████████▓▓▓▓▓▓▓▓▓▓░░░░░░
  0.5─│████ Arm A █████▓▓ Arm B ▓░░░░░░
     │████████████████▓▓▓▓▓▓▓▓▓▓░ C ░░
  0.0─│████████████████▓▓▓▓▓▓▓▓▓▓░░░░░░
     └────────────────────────────────→ trials
```

- X-axis: trials or datetime
- Y-axis: probability (0-1, stacked to 100%)
- Each arm is a colored band

**Pros:** Answers "which should I pick?", always sums to 100%, intuitive competition view
**Cons:** Doesn't show actual conversion rates; P(best) usually has to be estimated by Monte Carlo sampling or numerical integration

**Prototype tasks:**
- [ ] Stacked area chart implementation
- [ ] P(best) calculation from Beta distributions
- [ ] Test with 2, 5, and 10 arms
- [ ] Consider animation showing bands shifting
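
For the P(best) calculation, one possible Monte Carlo sketch is shown below. It assumes a uniform Beta(1, 1) prior and a hypothetical snapshot shape of `{arm: (turns, rewards)}`; the function name `prob_best` is illustrative, not an existing API.

```python
import random


def prob_best(arms, draws=10000, seed=0):
    """Estimate P(arm is best) by sampling each arm's Beta posterior."""
    rng = random.Random(seed)
    wins = {name: 0 for name in arms}
    for _ in range(draws):
        # One joint draw: sample every arm, then credit the highest sampled rate.
        # Assumption: Beta(1, 1) prior, so the posterior is
        # Beta(1 + rewards, 1 + turns - rewards).
        sampled = {
            name: rng.betavariate(1 + rewards, 1 + turns - rewards)
            for name, (turns, rewards) in arms.items()
        }
        wins[max(sampled, key=sampled.get)] += 1
    return {name: count / draws for name, count in wins.items()}


# Hypothetical snapshot: arm -> (turns, rewards)
snapshot = {"A": (500, 75), "B": (500, 60), "C": (500, 40)}
p_best = prob_best(snapshot)
```

Because every draw credits exactly one arm, the estimates sum to 1, which is what the stacked area needs. Running this once per snapshot gives one column of the chart.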

---

### 3. Heatmap (Arms × Time)

**Best for:** Large recommendation experiments (100+ arms)

```
         Trials →
        10   100  1000  10000
Arm 1   ░░   ▒▒   ▓▓    ██
Arm 2   ░░   ▒▒   ▓▓    ██
Arm 3   ░░   ░░   ▒▒    ▓▓
...
Arm 99  ░░   ░░   ░░    ▒▒

Color = conversion rate (darker = higher)
```

- X-axis: trial progression
- Y-axis: arms (sortable by current performance)
- Color intensity: conversion rate or P(best)

**Pros:** Scales to hundreds of arms, shows patterns across whole experiment
**Cons:** Less precise than line charts, requires good color scale design

**Prototype tasks:**
- [ ] Heatmap grid implementation
- [ ] Sortable rows (by name, current rate, total trials)
- [ ] Color scale selection (sequential vs diverging)
- [ ] Test with 50, 100, 500 arms
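
A minimal sketch of how the heatmap grid could be assembled is below. The `history` shape (per-arm lists of `(total_experiment_turns, turns, rewards)` tuples), the `heatmap_matrix` name, and the Beta(1, 1) prior are all assumptions for illustration; the real event log format may differ.

```python
def heatmap_matrix(history, checkpoints):
    """Build rows of posterior means, one row per arm, one column per checkpoint.

    history: {arm: [(total_experiment_turns, turns, rewards), ...]} in
    chronological order (hypothetical shape).
    """

    def rate_at(points, checkpoint):
        # Use the latest snapshot at or before the checkpoint; None if no data yet.
        latest = None
        for total, turns, rewards in points:
            if total <= checkpoint:
                latest = (turns, rewards)
        if latest is None:
            return None
        turns, rewards = latest
        # Posterior mean under an assumed Beta(1, 1) prior.
        return (1 + rewards) / (2 + turns)

    rows = {arm: [rate_at(points, c) for c in checkpoints] for arm, points in history.items()}
    # Sort arms by their most recent rate, highest first.
    order = sorted(rows, key=lambda arm: rows[arm][-1] or 0.0, reverse=True)
    return order, [rows[arm] for arm in order]


history = {
    "post1": [(10, 4, 0), (100, 30, 3), (1000, 300, 45)],
    "post2": [(10, 6, 1), (100, 70, 5), (1000, 700, 63)],
}
order, matrix = heatmap_matrix(history, checkpoints=[10, 100, 1000])
```

Cells with no data yet come back as `None`, so the chart layer can render them as blank rather than as a zero rate.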

---

### 4. Ranking Chart (Bump Chart)

**Best for:** Recommendations, seeing position changes

```
Rank
  1─│    ╲    ╱───────── post1
  2─│─────╲╱─────╲────── post2
  3─│───────────╱─╲───── post3
  4─│──────────────╲──── post4
    └─────────────────────→ trials
```

- X-axis: trials or datetime
- Y-axis: rank position
- Lines: one per arm showing rank over time

**Pros:** Clear view of competition, works for many arms (show top N)
**Cons:** Doesn't show magnitude of differences, can get tangled

**Prototype tasks:**
- [ ] Bump chart implementation
- [ ] Show top N arms (configurable, default 10-20)
- [ ] Highlight lines on hover
- [ ] Click to see arm details
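
The rank series behind a bump chart could be derived roughly as follows. This sketch ranks arms by posterior mean under an assumed Beta(1, 1) prior and keeps the arms that are in the top N at the final snapshot; the snapshot shape and the `rank_series` name are hypothetical.

```python
def rank_series(snapshots, top_n=10):
    """Return {arm: [rank at each snapshot]} for the top-N arms at the end.

    snapshots: chronological list of {arm: (turns, rewards)} (hypothetical shape).
    """

    def mean(turns, rewards):
        # Posterior mean under an assumed Beta(1, 1) prior.
        return (1 + rewards) / (2 + turns)

    ranks = []
    for snap in snapshots:
        ordered = sorted(snap, key=lambda arm: mean(*snap[arm]), reverse=True)
        ranks.append({arm: position for position, arm in enumerate(ordered, start=1)})

    # Keep only arms ranked within top_n at the final snapshot, best first.
    final = sorted(ranks[-1].items(), key=lambda item: item[1])
    keep = [arm for arm, rank in final if rank <= top_n]
    return {arm: [r.get(arm) for r in ranks] for arm in keep}


snapshots = [
    {"post1": (10, 1), "post2": (10, 3), "post3": (10, 2)},
    {"post1": (100, 20), "post2": (100, 15), "post3": (100, 5)},
]
series = rank_series(snapshots, top_n=2)
```

Filtering on the final snapshot is only one option; filtering on "ever in the top N" would keep arms that dropped out, at the cost of more lines.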

---

### 5. Distribution Evolution (Ridge/Joy Plot)

**Best for:** Educational view, single arm deep-dive

```
Trial 1000 ───────╱╲───────────────
Trial 500  ────╱──╲────────────────
Trial 100  ──╱────╲────────────────
Trial 10   ╱───────╲───────────────
           0.0    0.1    0.2    Rate
```

- X-axis: conversion rate
- Y-axis: stacked by trial number
- Shape: actual Beta distribution PDF

**Pros:** Beautiful, shows full uncertainty evolution, educational
**Cons:** Only practical for 1-3 arms, complex to read

**Prototype tasks:**
- [ ] Ridge plot implementation
- [ ] Beta PDF calculation and rendering
- [ ] Animation option (morphing distribution)
- [ ] Use as detail view when clicking an arm
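
For the Beta PDF calculation, a dependency-free sketch using log-gamma is shown below. The `beta_pdf` and `ridge_curves` names, the `(turns, rewards)` snapshot shape, and the Beta(1, 1) prior are assumptions for illustration.

```python
import math


def beta_pdf(x, alpha, beta):
    """Beta density at x, computed in log space to avoid overflow."""
    # Simplification: endpoints return 0, which is exact only when alpha, beta > 1.
    if x <= 0.0 or x >= 1.0:
        return 0.0
    log_norm = math.lgamma(alpha + beta) - math.lgamma(alpha) - math.lgamma(beta)
    return math.exp(log_norm + (alpha - 1) * math.log(x) + (beta - 1) * math.log(1 - x))


def ridge_curves(snapshots, points=201):
    """Return a shared x grid and one density curve per (turns, rewards) snapshot."""
    grid = [i / (points - 1) for i in range(points)]
    # Assumption: Beta(1, 1) prior, so the posterior is Beta(1 + r, 1 + t - r).
    curves = [[beta_pdf(x, 1 + r, 1 + t - r) for x in grid] for t, r in snapshots]
    return grid, curves


# One arm at two points in time: 1/10 and then 10/100 rewards per turns.
grid, curves = ridge_curves([(10, 1), (100, 10)])
```

Each curve becomes one ridge; later snapshots produce narrower, taller curves as the posterior concentrates.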

---

### 6. Convergence Indicator

**Best for:** Quick status check, dashboard widget

```
CI Width
     │╲
 0.3─│ ╲
     │  ╲____
 0.1─│       ╲________
     └────────────────→ trials
     
     [███████████░░░░] 78% confident
```

- Simple line showing uncertainty shrinking over time
- Or: progress bar showing "confidence level"

**Pros:** At-a-glance experiment maturity, answers "can we decide yet?"
**Cons:** Supplementary only, doesn't show which arm is winning

**Prototype tasks:**
- [ ] CI width over time line chart
- [ ] Confidence progress bar widget
- [ ] Threshold indicator (e.g., "95% confident A beats B")
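
Both widgets could be driven by two small calculations, sketched below. The interval width uses a normal approximation to the Beta posterior (an approximation chosen for brevity, not an exact credible interval), and the threshold check uses Monte Carlo sampling. The function names, the `(turns, rewards)` tuples, and the Beta(1, 1) prior are assumptions.

```python
import math
import random


def ci_width(turns, rewards, z=1.96):
    """Approximate 95% interval width from the Beta posterior's variance."""
    # Assumption: Beta(1, 1) prior; normal approximation to the posterior.
    a, b = 1 + rewards, 1 + turns - rewards
    variance = a * b / ((a + b) ** 2 * (a + b + 1))
    return 2 * z * math.sqrt(variance)


def prob_a_beats_b(arm_a, arm_b, draws=20000, seed=0):
    """Estimate P(rate_A > rate_B) from paired posterior draws."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        rate_a = rng.betavariate(1 + arm_a[1], 1 + arm_a[0] - arm_a[1])
        rate_b = rng.betavariate(1 + arm_b[1], 1 + arm_b[0] - arm_b[1])
        wins += rate_a > rate_b
    return wins / draws


# Width shrinking across three snapshots of one arm: (turns, rewards)
widths = [ci_width(t, r) for t, r in [(10, 1), (100, 12), (1000, 118)]]

# Threshold indicator: is A ahead of B with at least 95% probability?
p = prob_a_beats_b((1000, 150), (1000, 100))
decided = p >= 0.95
```

The normal approximation is weakest at very small sample sizes or rates near 0 or 1, so an exact interval may be preferable for the earliest snapshots.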

---

## Recommended Approach

### Phase 1: Core Charts
1. Line chart with CI bands (primary for A/B tests)
2. Heatmap (primary for large experiments)

### Phase 2: Decision Support
3. P(best) stacked area
4. Convergence indicator

### Phase 3: Advanced
5. Ranking chart
6. Ridge plot for deep-dive

---

## Technical Considerations

- **Library:** Chart.js, D3.js, or Apache ECharts
- **Rendering:** Client-side JavaScript, data via JSON endpoint
- **Responsive:** Charts should work on mobile
- **Accessibility:** Color-blind friendly palettes, screen reader support
- **Performance:** Lazy load charts, paginate large experiments

---

## Deliverables

- [ ] Prototype each chart type with sample data
- [ ] Screenshot/demo of each for stakeholder review
- [ ] Recommendation for default chart per experiment type
- [ ] Performance testing with large datasets

## Dependencies

- #16 (Event log for historical data)
