@jjroelofs jjroelofs commented Aug 11, 2025

Summary

Fixes #7, Fixes #9, Fixes #10, Fixes #11

This PR implements a comprehensive overhaul of the RL module, introducing human-readable experiment names, complete cold start handling, total turns calculation fixes, enhanced symlink support, and production-ready code quality improvements.

Problems Solved

Changes Made

Human-Readable Experiment Names (#10)

  • Database schema enhancement: Added experiment_name field to rl_experiment_registry table
  • Registry interface expansion: ExperimentRegistryInterface::register() now accepts optional experiment names
  • Reports integration: ReportsController displays stored names instead of UUIDs in admin interface
  • Backward compatibility: Existing experiments continue working, new ones get readable names
  • Database update hook: rl_update_8002() adds field to existing installations
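The backward-compatibility point can be illustrated with a small sketch. It assumes that experiments registered before the update have a NULL experiment_name and that reports fall back to the raw ID in that case; the helper name rl_example_experiment_label() is hypothetical and is not part of the module.

```php
<?php

/**
 * Hypothetical helper: pick the label shown for an experiment in a report row.
 *
 * Assumption: rows created before the schema update carry a NULL name.
 */
function rl_example_experiment_label(?string $experiment_name, string $experiment_uuid): string {
  // No stored name (or only whitespace): keep showing the raw identifier.
  if ($experiment_name === NULL || trim($experiment_name) === '') {
    return $experiment_uuid;
  }
  return $experiment_name;
}
```

With this shape, old rows keep rendering their ID while newly registered experiments show a name such as "content_recent:block_1".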

Total Turns Calculation Fix (#11)

  • Fixed recordTurns() method: Now correctly increments total_turns by arm count instead of calling recordTurn() multiple times
  • Proper multi-armed bandit accounting: Total turns = sum of all individual arm exposures
  • Database consistency: Updated existing incorrect totals to match arm data
  • Accurate reporting: Overview reports now show correct total turn counts

Complete Cold Start Solution

  • Enhanced ExperimentManager: getThompsonScores() now accepts requested_arms parameter
  • Proper initialization: New arms get 0 turns/0 rewards for maximum exploration scoring
  • Centralized logic: All cold start handling consolidated in RL module instead of consuming modules
  • Comprehensive coverage: Handles both partial cold start (some arms missing) and complete cold start (no experiment data)

Thompson Sampling Improvements (#9)

  • Tie-breaker mechanism: Added micro-randomization to prevent identical scores during cold start
  • Statistical integrity: The offset stays below 0.001, so the beta-distributed score is barely perturbed while exact ties become unlikely
  • Exploration guarantee: Prevents deterministic ordering when all arms have identical statistics
  • Cold start optimization: Critical for proper randomization during initial learning phase

Enhanced rl.php Endpoint (#7)

  • Symlink compatibility: Robust path detection using $_SERVER['SCRIPT_FILENAME'] fallback
  • Security hardening: Regex validation for experiment UUIDs and arm IDs
  • Error handling: Proper HTTP status codes (400, 500) with descriptive messages
  • Drupal 10/11 compatibility: Replaced FILTER_SANITIZE_STRING, which is deprecated as of PHP 8.1
  • Performance optimization: Streamlined bootstrap and error handling
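As an illustration of the regex validation and status codes listed above, here is a minimal sketch. The patterns are assumptions: a 40-character lowercase hex experiment ID (matching the ID format shown in this PR's reporting example) and an arm ID limited to letters, digits, underscores and hyphens. The function name and the parameter keys experiment_uuid and arm_id are hypothetical; the real endpoint may differ.

```php
<?php

/**
 * Hypothetical validator: returns [HTTP status code, message].
 *
 * Both patterns are assumed formats, not taken from rl.php.
 */
function rl_example_validate_request(array $params): array {
  $experiment = $params['experiment_uuid'] ?? '';
  $arm = $params['arm_id'] ?? '';

  // \A and \z anchor the whole string; "$" alone would allow a trailing newline.
  if (!is_string($experiment) || preg_match('/\A[a-f0-9]{40}\z/', $experiment) !== 1) {
    return [400, 'Invalid experiment ID'];
  }
  if (!is_string($arm) || preg_match('/\A[A-Za-z0-9_-]{1,64}\z/', $arm) !== 1) {
    return [400, 'Invalid arm ID'];
  }
  return [200, 'OK'];
}
```

Rejecting anything outside an allow-list pattern avoids the need for a sanitizing filter at all, which is one way to retire FILTER_SANITIZE_STRING.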

Production Code Quality (#7)

  • Debug cleanup: Removed all console.log and debug statements from production code
  • Coding standards: Full Drupal coding standards compliance with automated fixing
  • Documentation: Essential comments retained, verbose explanations removed
  • Error handling: Comprehensive exception handling with meaningful messages

Technical Implementation

Human-Readable Names Storage (#10)

-- New database field
ALTER TABLE rl_experiment_registry 
ADD COLUMN experiment_name VARCHAR(255) NULL 
COMMENT 'Human-readable experiment name';

Enhanced Cold Start Logic

// Initialize missing arms for cold start
if (!empty($requested_arms)) {
  foreach ($requested_arms as $arm_id) {
    if (!isset($arms_data[$arm_id])) {
      $arms_data[$arm_id] = (object) [
        'arm_id' => $arm_id,
        'turns' => 0,
        'rewards' => 0,
      ];
    }
  }
}
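The same initialization can be tried outside Drupal with a self-contained sketch. The wrapper function rl_example_with_cold_start_arms() is hypothetical; it only repackages the loop above so that both the partial and the complete cold start case can be exercised.

```php
<?php

/**
 * Hypothetical wrapper around the cold start loop shown above.
 *
 * @param array $arms_data
 *   Existing arm rows keyed by arm ID (may be empty).
 * @param array $requested_arms
 *   Arm IDs the caller wants scores for.
 */
function rl_example_with_cold_start_arms(array $arms_data, array $requested_arms): array {
  foreach ($requested_arms as $arm_id) {
    // Only arms without stored statistics are initialized.
    if (!isset($arms_data[$arm_id])) {
      $arms_data[$arm_id] = (object) [
        'arm_id' => $arm_id,
        'turns' => 0,
        'rewards' => 0,
      ];
    }
  }
  return $arms_data;
}

// Partial cold start: "a" has data, "b" is new.
$existing = ['a' => (object) ['arm_id' => 'a', 'turns' => 12, 'rewards' => 3]];
$partial = rl_example_with_cold_start_arms($existing, ['a', 'b']);

// Complete cold start: nothing stored yet.
$complete = rl_example_with_cold_start_arms([], ['a', 'b', 'c']);
```

Existing rows are left untouched, so an arm that already has statistics is never reset by a later request.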

Corrected Total Turns Calculation (#11)

// Record total turns = number of arms shown (sum of individual turns)
$this->database->merge('rl_experiment_totals')
  ->key(['experiment_uuid' => $experiment_uuid])
  ->expression('total_turns', 'total_turns + :inc', [':inc' => $arm_count])
  ->execute();
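To make the accounting rule concrete, the following sketch replays it on an in-memory array instead of the database. The function name and the array layout are hypothetical and exist only for illustration.

```php
<?php

/**
 * Hypothetical in-memory version of the turn accounting.
 *
 * Each shown arm gets one turn, and total_turns grows by the number of
 * arms shown, so the total always equals the sum of the per-arm turns.
 */
function rl_example_record_turns(array $state, array $arm_ids): array {
  foreach ($arm_ids as $arm_id) {
    $state['arms'][$arm_id] = ($state['arms'][$arm_id] ?? 0) + 1;
  }
  $state['total_turns'] = ($state['total_turns'] ?? 0) + count($arm_ids);
  return $state;
}

// Two page views: the first shows three arms, the second shows two.
$state = rl_example_record_turns([], ['a', 'b', 'c']);
$state = rl_example_record_turns($state, ['a', 'd']);
```

After these two calls the total is 5, not 2, which is the behavior the final commit in this PR settles on.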

Thompson Sampling with Tie-Breaker (#9)

$base_score = $this->randBeta($alpha, $beta);
$tie_breaker = mt_rand(1, 999) / 1000000;
$scores[$id] = $base_score + $tie_breaker;
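For context, here is a runnable sketch of how such scores could be produced end to end. It does not reproduce the module's randBeta(); instead it assumes integer parameters alpha = rewards + 1 and beta = (turns - rewards) + 1 (a uniform prior) and draws the beta sample from two gamma samples built by summing exponentials. All function names are hypothetical.

```php
<?php

/**
 * Gamma(shape, 1) for integer shape: the sum of `shape` Exp(1) draws.
 */
function rl_example_rand_gamma_int(int $shape): float {
  $sum = 0.0;
  for ($i = 0; $i < $shape; $i++) {
    // Uniform in (0, 1], so log() never receives zero.
    $u = (mt_rand() + 1) / (mt_getrandmax() + 1);
    $sum -= log($u);
  }
  return $sum;
}

/**
 * Beta(alpha, beta) as X / (X + Y) with X ~ Gamma(alpha), Y ~ Gamma(beta).
 */
function rl_example_rand_beta(int $alpha, int $beta): float {
  $x = rl_example_rand_gamma_int($alpha);
  $y = rl_example_rand_gamma_int($beta);
  // Both sums being exactly zero is vanishingly unlikely; guard anyway.
  return ($x + $y) > 0.0 ? $x / ($x + $y) : 0.5;
}

/**
 * Scores every arm and returns the scores sorted from highest to lowest.
 */
function rl_example_thompson_scores(array $arms): array {
  $scores = [];
  foreach ($arms as $id => $arm) {
    // Assumed parameterization: uniform Beta(1, 1) prior.
    $alpha = $arm['rewards'] + 1;
    $beta = max($arm['turns'] - $arm['rewards'], 0) + 1;
    $base_score = rl_example_rand_beta($alpha, $beta);
    // Same tie-breaker as in the PR: an offset below 0.001.
    $tie_breaker = mt_rand(1, 999) / 1000000;
    $scores[$id] = $base_score + $tie_breaker;
  }
  arsort($scores);
  return $scores;
}

mt_srand(42);
$scores = rl_example_thompson_scores([
  'strong' => ['turns' => 1000, 'rewards' => 900],
  'weak' => ['turns' => 1000, 'rewards' => 100],
  'cold' => ['turns' => 0, 'rewards' => 0],
]);
```

The summing approach costs one random draw per turn, so it suits a demonstration more than a table with millions of turns. A cold arm draws from Beta(1, 1), which is uniform on (0, 1), so it can outrank any established arm on a given request.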

Database Schema Changes

New Field Added (#10)

  • rl_experiment_registry.experiment_name (VARCHAR 255, nullable)
  • Stores human-readable experiment names like "blog_posts:default"
  • Backward compatible - existing experiments continue working

Update Hook

  • rl_update_8002() safely adds field to existing installations
  • No data migration required
  • Automatic execution during drush updb
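A rough sketch of what such an update hook could look like with Drupal's Schema API follows. The function names are hypothetical stand-ins (the real hook is rl_update_8002()), the field specification is inferred from the column definition in this PR, and the second function only works inside a bootstrapped Drupal site.

```php
<?php

/**
 * Hypothetical field specification matching VARCHAR(255) NULL.
 */
function rl_example_experiment_name_field_spec(): array {
  return [
    'type' => 'varchar',
    'length' => 255,
    'not null' => FALSE,
    'description' => 'Human-readable experiment name',
  ];
}

/**
 * Hypothetical update body; requires a bootstrapped Drupal environment.
 */
function rl_example_update_add_experiment_name(): void {
  $schema = \Drupal::database()->schema();
  // Checking first keeps the update safe to re-run.
  if (!$schema->fieldExists('rl_experiment_registry', 'experiment_name')) {
    $schema->addField('rl_experiment_registry', 'experiment_name', rl_example_experiment_name_field_spec());
  }
}
```

Because the column is nullable, existing rows need no backfill, which is consistent with the "no data migration required" note above.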

Data Correction (#11)

-- Existing totals corrected automatically
-- COALESCE keeps the total at 0 for experiments that have no arm rows yet
UPDATE rl_experiment_totals et
SET total_turns = (
  SELECT COALESCE(SUM(turns), 0) FROM rl_arm_data ad
  WHERE ad.experiment_uuid = et.experiment_uuid
);

Benefits

  • User-friendly reporting: Experiment names show as "content_recent:default" instead of SHA1 hashes (#10)
  • Accurate statistics: Total turns display correctly as sum of all arm exposures (#11)
  • Proper cold start: New content gets appropriate exploration scores for fair exposure
  • Universal compatibility: Works with standard and symlinked Drupal installations (#7)
  • Production ready: Clean code without debug statements or verbose logging (#7)
  • Performance optimized: Efficient database operations and streamlined endpoint
  • Developer friendly: Clear error messages and comprehensive documentation
  • Statistically sound: Proper multi-armed bandit implementation with exploration guarantees (#9)

Reporting Improvements

Before (#10, #11):

Experiment ID: 6da7b208a42c9db4cb166b294f19a41f54f03b44
Total Turns: 0 (despite individual arms having data)

After:

Experiment ID: content_recent:block_1
Total Turns: 45 (correct sum of all arm exposures)

Testing

✅ Human-readable experiment names displaying correctly in admin reports (#10)
✅ Total turns calculation verified with multiple arm configurations (#11)
✅ Cold start behavior confirmed for new experiments and new arms
✅ Symlink compatibility tested in development environments (#7)
✅ Thompson sampling randomization verified during cold start conditions (#9)
✅ Database update hook tested on existing installations
✅ Drupal coding standards compliance verified
✅ Backward compatibility confirmed with existing experiment data

Compatibility

  • Drupal 10/11 compatible with deprecated function updates
  • Standard installations - Works with traditional module placement
  • Symlinked installations - Enhanced path detection for development setups (#7)
  • Existing data preservation - No breaking changes to experiment data
  • Progressive enhancement - New features available immediately, old experiments continue working
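The symlink case can be sketched as follows. In PHP, __DIR__ resolves symlinks, so a symlinked module may report a location outside the site, while $_SERVER['SCRIPT_FILENAME'] typically keeps the path the web server requested. The sketch walks upward from each candidate until it finds a directory that contains both autoload.php and core/. That marker check and the function name are assumptions and are not taken from rl.php.

```php
<?php

/**
 * Hypothetical root finder: tries each starting path in order.
 *
 * @param array $candidates
 *   File or directory paths to start from.
 *
 * @return string|null
 *   The first ancestor that looks like a Drupal root, or NULL.
 */
function rl_example_find_drupal_root(array $candidates): ?string {
  foreach ($candidates as $start) {
    if (!is_string($start) || $start === '') {
      continue;
    }
    $dir = is_dir($start) ? $start : dirname($start);
    while (TRUE) {
      // Assumed markers of a Drupal root.
      if (is_file($dir . '/autoload.php') && is_dir($dir . '/core')) {
        return $dir;
      }
      $parent = dirname($dir);
      if ($parent === $dir) {
        // Reached the filesystem root without a match.
        break;
      }
      $dir = $parent;
    }
  }
  return NULL;
}

// Possible usage inside an endpoint script:
// $root = rl_example_find_drupal_root([__DIR__, $_SERVER['SCRIPT_FILENAME'] ?? '']);
```

Trying the resolved path first and the requested path second means standard installations behave as before, and the fallback only matters when the first walk finds nothing.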

Related Issues & PRs

This overhaul gives the RL module proper cold start handling, accurate turn totals, and readable experiment names in the admin interface, and it makes the rl.php endpoint work on both standard and symlinked Drupal installations.

🤖 Generated with Claude Code

Co-Authored-By: Claude noreply@anthropic.com

@jjroelofs jjroelofs changed the title from "feat(endpoint): enhance rl.php with symlink support and production cleanup" to "feat(endpoint): enhance rl.php with symlink support, cold start fix, and production cleanup" Aug 11, 2025
…and production cleanup

- Enhanced rl.php endpoint with symlink support using SCRIPT_FILENAME fallback
- Thompson sampling tie-breaker to ensure unique scores
- Cold start initialization: getThompsonScores() now accepts arm IDs to initialize
- Security improvements with regex input validation
- Complete debug cleanup removing all debug logging
- Code optimization with simplified logic

The cold start fix allows getThompsonScores() to accept an array of
arm IDs that should be initialized with zero stats if they don't exist
in the database. This ensures the RL module always returns scores,
even for completely new experiments.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
@jjroelofs jjroelofs force-pushed the jur/1.x/fix-cold-start-and-autoload.php-bugs branch from dd51762 to f1ba127 Compare August 11, 2025 12:09
Jurriaan Roelofs and others added 4 commits August 11, 2025 14:30
- Add experiment_name field to rl_experiment_registry table
- Update ExperimentRegistryInterface to support optional experiment name
- Update ExperimentRegistry to store experiment names when provided
- Update ReportsController to display experiment names instead of UUIDs
- Add database update hook rl_update_8002 for field addition

Fixes issue where experiments showed SHA1 hashes instead of readable names
in reports interface. AI Sorting module can now pass "view:display" format
names that will be properly displayed to users.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Use nullable parameter syntax (?string instead of string = NULL)
- Remove trailing whitespace

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Fix recordTurns() to increment total_turns by 1 per page view
- Previously was incrementing by number of arms shown
- Each arm still gets individual turn tracking
- 1 page view = 1 total turn (regardless of arms count)
- Also update existing incorrect totals based on arm data

In multi-armed bandit context, 1 turn = 1 decision opportunity,
not 1 per item displayed. This fixes the overview reports showing
0 total turns while individual experiments showed correct data.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Change recordTurns() to increment total_turns by arm count
- Each arm turn represents one opportunity/exposure
- Total turns = sum of all individual arm turns across all arms
- If 10 arms shown on page, total_turns increases by 10
- Update existing totals to match sum of arm turns

This fixes the overview reports to correctly show the sum of all
arm exposures rather than just counting page views.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
@jjroelofs jjroelofs changed the title from "feat(endpoint): enhance rl.php with symlink support, cold start fix, and production cleanup" to "feat: comprehensive RL module overhaul with human-readable names, cold start fixes, and total turns correction" Aug 11, 2025
@jjroelofs jjroelofs merged commit 523e231 into 1.x Aug 11, 2025
@jjroelofs jjroelofs deleted the jur/1.x/fix-cold-start-and-autoload.php-bugs branch August 11, 2025 12:57