24 changes: 17 additions & 7 deletions README.md
@@ -1,6 +1,7 @@
# Reinforcement Learning (RL)

Multi-armed bandit experiments in Drupal using Thompson Sampling algorithm for efficient A/B testing that minimizes lost conversions.
Multi-armed bandit experiments in Drupal using Thompson Sampling algorithm for
efficient A/B testing that minimizes lost conversions.

## Features

@@ -12,9 +13,14 @@ Multi-armed bandit experiments in Drupal using Thompson Sampling algorithm for e

## How Thompson Sampling Works

Thompson Sampling is a learning-while-doing method. Each visitor triggers the algorithm to "roll the dice" based on learned performance. High-performing variants get larger numbers and show more often, while weak variants still get chances to prove themselves.
Thompson Sampling is a learning-while-doing method. Each visitor triggers the
algorithm to "roll the dice" based on learned performance. High-performing
variants get larger numbers and show more often, while weak variants still get
chances to prove themselves.

Traditional A/B tests waste conversions by showing losing variants for fixed durations. Thompson Sampling shifts traffic to better variants as soon as evidence emerges, saving conversions and reducing testing time.
Traditional A/B tests waste conversions by showing losing variants for fixed
durations. Thompson Sampling shifts traffic to better variants as soon as
evidence emerges, saving conversions and reducing testing time.
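
The "roll the dice" step described above can be illustrated with a minimal sketch. It assumes binary rewards (a visitor converts or does not) and a uniform Beta(1, 1) prior. Every name in it (`sampleGammaInt`, `sampleBeta`, `selectArm`, `simulate`) is hypothetical and not taken from the module's source; the module's own calculator may work differently.

```javascript
// Sketch of Thompson Sampling for binary rewards (conversion or no conversion).
// All function names are illustrative assumptions, not the RL module's API.

// Draw from Gamma(shape, 1) for a positive integer shape, as the sum of
// `shape` exponential draws. O(shape) per draw, which is fine for a demo.
function sampleGammaInt(shape, rng) {
  let sum = 0;
  for (let i = 0; i < shape; i++) {
    sum -= Math.log(1 - rng()); // 1 - rng() lies in (0, 1], so the log is finite
  }
  return sum;
}

// Draw from Beta(a, b) for positive integers a and b, using two gamma draws.
function sampleBeta(a, b, rng) {
  const x = sampleGammaInt(a, rng);
  const y = sampleGammaInt(b, rng);
  return x / (x + y);
}

// arms: { armId: { turns, rewards } }. Each arm "rolls" a score from its
// posterior; the arm with the highest roll is shown to this visitor.
function selectArm(arms, rng = Math.random) {
  let bestId = null;
  let bestScore = -Infinity;
  for (const [id, { turns, rewards }] of Object.entries(arms)) {
    // Beta(1, 1) prior: successes + 1, failures + 1.
    const score = sampleBeta(rewards + 1, turns - rewards + 1, rng);
    if (score > bestScore) {
      bestScore = score;
      bestId = id;
    }
  }
  return bestId;
}

// Simulate `visitors` page views against assumed true conversion rates.
function simulate(trueRates, visitors, rng = Math.random) {
  const arms = {};
  for (const id of Object.keys(trueRates)) {
    arms[id] = { turns: 0, rewards: 0 };
  }
  for (let i = 0; i < visitors; i++) {
    const id = selectArm(arms, rng);
    arms[id].turns += 1; // the variant was shown
    if (rng() < trueRates[id]) {
      arms[id].rewards += 1; // the visitor converted
    }
  }
  return arms;
}
```

Calling `simulate({ a: 0.05, b: 0.30 }, 1500)` should, in most runs, give arm `b` the large majority of turns while arm `a` still receives occasional traffic, which is the behaviour the two paragraphs above describe.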

## Use Cases

@@ -55,7 +61,8 @@ $best_option = $ts_calculator->selectBestArm($scores);
## HTTP Endpoints

### rl.php - High-Performance Endpoint (Recommended)
**For high-volume, low-latency applications, use the direct rl.php endpoint:**
**For high-volume, low-latency applications, use the direct rl.php
endpoint:**

```javascript
// Record turns (trials) - when content is viewed
@@ -87,7 +94,8 @@ navigator.sendBeacon('/modules/contrib/rl/rl.php', rewardData);

## Related Modules

- [AI Sorting](https://www.drupal.org/project/ai_sorting) - Intelligent content ordering for Drupal Views
- [AI Sorting](https://www.drupal.org/project/ai_sorting) - Intelligent content
ordering for Drupal Views

## Technical Implementation

@@ -96,6 +104,8 @@ Full algorithm details available in source code:

## Resources

- [Multi-Armed Bandit Problem](https://en.wikipedia.org/wiki/Multi-armed_bandit) - Wikipedia overview
- [Multi-Armed Bandit Problem](https://en.wikipedia.org/wiki/Multi-armed_bandit) -
Wikipedia overview
- [Thompson Sampling Paper](https://www.jstor.org/stable/2332286) - Original research
- [Finite-time Analysis](https://homes.di.unimi.it/~cesa-bianchi/Pubblicazioni/ml-02.pdf) - Mathematical foundations
- [Finite-time Analysis](https://homes.di.unimi.it/~cesa-bianchi/Pubblicazioni/ml-02.pdf) -
Mathematical foundations
6 changes: 3 additions & 3 deletions docker-compose.yml
@@ -6,7 +6,7 @@ services:
working_dir: /src
command: bash -c "./scripts/run-drupal-lint.sh"
environment:
TARGET_DRUPAL_CORE_VERSION: 10
TARGET_DRUPAL_CORE_VERSION: 11
volumes:
- .:/src

@@ -16,7 +16,7 @@ services:
working_dir: /src
command: bash -c "./scripts/run-drupal-lint-auto-fix.sh"
environment:
TARGET_DRUPAL_CORE_VERSION: 10
TARGET_DRUPAL_CORE_VERSION: 11
volumes:
- .:/src

@@ -27,6 +27,6 @@ services:
command: bash -c "/src/scripts/run-drupal-check.sh"
tty: true
environment:
DRUPAL_RECOMMENDED_PROJECT: 10.3.x-dev
DRUPAL_RECOMMENDED_PROJECT: 11.2.x-dev
volumes:
- .:/src
9 changes: 8 additions & 1 deletion docs/rl_project_desc.html
@@ -2,7 +2,8 @@

<blockquote>The <strong>Reinforcement Learning (RL)</strong> module implements A/B testing in the most efficient and effective way possible, minimizing lost conversions using machine learning.</blockquote>

<p><strong>Thompson Sampling</strong> is a learning-while-doing method. Each time a visitor lands on your site the algorithm “rolls the dice” based on what it has learned so far. Variants that have performed well roll larger numbers, so they are shown more often, while weak copies still get a small chance to prove themselves. This simple trick means the system can discover winners very quickly without stopping normal traffic.</p>

<p><strong>Thompson Sampling</strong> is a learning-while-doing method. Each time a visitor lands on your site the algorithm "rolls the dice" based on what it has learned so far. Variants that have performed well roll larger numbers, so they are shown more often, while weak copies still get a small chance to prove themselves. This simple trick means the system can discover winners very quickly without stopping normal traffic.</p>

<p>Traditional A/B tests run for a fixed horizon—say two weeks—during which half your visitors keep seeing the weaker version. Thompson Sampling avoids this waste. As soon as the algorithm has even a little evidence it quietly shifts most traffic to the better variant, saving conversions and shortening the wait for useful insights.</p>

@@ -47,6 +48,12 @@ <h3>Thompson Sampling</h3>
<li><strong>Bayesian approach</strong> - Incorporates uncertainty</li>
</ul>

<div class="note-tip">
<h4>Prefer a turnkey demo site?</h4>
<p>Spin up <strong>DXPR CMS</strong>—Drupal pre-configured with DXPR Builder, DXPR Theme, RL (Reinforcement Learning) module, and security best practices.</p>
<p><a href="https://www.drupal.org/project/dxpr_cms" title="DXPR CMS platform">Get DXPR CMS »</a></p>
</div>

<h3>API</h3>

<pre><code>// Get the experiment manager
2 changes: 1 addition & 1 deletion rl.info.yml
@@ -4,4 +4,4 @@ description: 'Core API module for tracking multi-armed bandit experiments using
core_version_requirement: ^10.3 | ^11
package: Custom
dependencies:
- drupal:system
- drupal:system
10 changes: 4 additions & 6 deletions rl.install
@@ -5,7 +5,6 @@
* Install, update and uninstall functions for the Reinforcement Learning module.
*/

use Drupal\Core\Database\Database;
use Drupal\Core\Database\SchemaObjectExistsException;
use Drupal\Core\Database\SchemaObjectDoesNotExistException;

@@ -57,7 +56,7 @@ function rl_install() {
'experiment_unique' => ['experiment_uuid'],
],
'indexes' => [
'experiment_uuid' => ['experiment_uuid'],
'experiment_updated' => ['experiment_uuid', 'updated'],
],
];

@@ -124,7 +123,7 @@ function rl_install() {
'experiment_arm' => ['experiment_uuid', 'arm_id'],
],
'indexes' => [
'experiment_uuid' => ['experiment_uuid'],
'covering_time_window' => ['experiment_uuid', 'updated', 'arm_id', 'turns', 'rewards', 'created'],
],
];

@@ -250,7 +249,7 @@ function rl_schema() {
'experiment_unique' => ['experiment_uuid'],
],
'indexes' => [
'experiment_uuid' => ['experiment_uuid'],
'experiment_updated' => ['experiment_uuid', 'updated'],
],
];

@@ -308,7 +307,7 @@ function rl_schema() {
'experiment_arm' => ['experiment_uuid', 'arm_id'],
],
'indexes' => [
'experiment_uuid' => ['experiment_uuid'],
'covering_time_window' => ['experiment_uuid', 'updated', 'arm_id', 'turns', 'rewards', 'created'],
],
];

@@ -342,4 +341,3 @@ function rl_schema() {

return $schema;
}

2 changes: 1 addition & 1 deletion rl.links.menu.yml
@@ -3,4 +3,4 @@ rl.reports.experiments:
description: 'View reinforcement learning experiments and their statistics.'
route_name: rl.reports.experiments
parent: system.admin_reports
weight: 10
weight: 10
2 changes: 1 addition & 1 deletion rl.module
@@ -26,4 +26,4 @@ function rl_help($route_name, RouteMatchInterface $route_match) {
$output .= '</ul>';
return $output;
}
}
}
13 changes: 7 additions & 6 deletions rl.permissions.yml
@@ -1,7 +1,8 @@
administer reinforcement learning:
title: 'Administer Reinforcement Learning'
description: 'Manage RL experiments and view analytics'
administer rl experiments:
title: 'Administer RL experiments'
description: 'Create, update, and delete reinforcement learning experiments.'
restrict access: true

view reinforcement learning data:
title: 'View Reinforcement Learning data'
description: 'Access experiment scores and analytics'
view rl reports:
title: 'View RL reports'
description: 'View reinforcement learning experiment reports and statistics.'
35 changes: 17 additions & 18 deletions rl.php
@@ -3,20 +3,20 @@
/**
* @file
* Handles RL experiment tracking via AJAX with minimal bootstrap.
*
*
* Following the statistics.php architecture for optimal performance.
* Updated for Drupal 10/11 compatibility.
*/

use Drupal\Core\DrupalKernel;
use Symfony\Component\HttpFoundation\Request;

// CRITICAL: Only accept POST requests for security and caching reasons
// CRITICAL: Only accept POST requests for security and caching reasons.
$action = filter_input(INPUT_POST, 'action', FILTER_SANITIZE_FULL_SPECIAL_CHARS);
$experiment_uuid = filter_input(INPUT_POST, 'experiment_uuid', FILTER_SANITIZE_FULL_SPECIAL_CHARS);
$arm_id = filter_input(INPUT_POST, 'arm_id', FILTER_SANITIZE_FULL_SPECIAL_CHARS);

// Validate inputs more strictly
// Validate inputs more strictly.
if (!$action || !$experiment_uuid || !in_array($action, ['turn', 'turns', 'reward'])) {
http_response_code(400);
exit();
@@ -28,7 +28,7 @@
exit();
}

// Catch exceptions when site is not configured or storage fails
// Catch exceptions when site is not configured or storage fails.
try {
// Assumes module in modules/contrib/rl, so three levels below root.
chdir('../../..');
@@ -40,58 +40,57 @@
$kernel->boot();
$container = $kernel->getContainer();

// Check if experiment is registered
// Check if experiment is registered.
$registry = $container->get('rl.experiment_registry');
if (!$registry->isRegistered($experiment_uuid)) {
// Silently ignore unregistered experiments like statistics module
// Silently ignore unregistered experiments like statistics module.
exit();
}

// Get the experiment data storage service
// Get the experiment data storage service.
$storage = $container->get('rl.experiment_data_storage');

// Handle the different actions
// Handle the different actions.
switch ($action) {
case 'turn':
// Validate arm_id for single turn
// Validate arm_id for single turn.
if ($arm_id && preg_match('/^[a-zA-Z0-9_-]+$/', $arm_id)) {
$storage->recordTurn($experiment_uuid, $arm_id);
}
break;

case 'turns':
// Handle multiple turns with better validation
// Handle multiple turns with better validation.
$arm_ids = filter_input(INPUT_POST, 'arm_ids', FILTER_SANITIZE_FULL_SPECIAL_CHARS);
if ($arm_ids) {
$arm_ids_array = explode(',', $arm_ids);
$arm_ids_array = array_map('trim', $arm_ids_array);
// Validate each arm_id

// Validate each arm_id.
$valid_arm_ids = [];
foreach ($arm_ids_array as $aid) {
if (preg_match('/^[a-zA-Z0-9_-]+$/', $aid)) {
$valid_arm_ids[] = $aid;
}
}

if (!empty($valid_arm_ids)) {
$storage->recordTurns($experiment_uuid, $valid_arm_ids);
}
}
break;

case 'reward':
// Validate arm_id for reward
// Validate arm_id for reward.
if ($arm_id && preg_match('/^[a-zA-Z0-9_-]+$/', $arm_id)) {
$storage->recordReward($experiment_uuid, $arm_id);
}
break;
}
// Send success response

// Send success response.
http_response_code(200);

}
catch (\Exception $e) {
// Do nothing if there is PDO Exception or other failure.
}
}
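
For reference, a client could build requests that match the checks in `rl.php` roughly as follows. This is a sketch under stated assumptions: the field names (`action`, `experiment_uuid`, `arm_id`, `arm_ids`), the three actions, the arm ID pattern, and the endpoint path are taken from this diff, while the helper names `buildPayload` and `send` are hypothetical and not part of the module.

```javascript
// Hypothetical client-side helpers for the rl.php endpoint.
// The pattern and action list mirror the server-side checks in this diff.
const ARM_ID_PATTERN = /^[a-zA-Z0-9_-]+$/;
const ACTIONS = ['turn', 'turns', 'reward'];

// Build a form-encoded payload. `armIds` may be one ID or an array of IDs.
function buildPayload(action, experimentUuid, armIds) {
  if (!ACTIONS.includes(action)) {
    throw new Error(`Unsupported action: ${action}`);
  }
  if (!experimentUuid) {
    throw new Error('experiment_uuid is required');
  }
  const ids = (Array.isArray(armIds) ? armIds : [armIds])
    .map((id) => String(id).trim());
  // Drop IDs the server would reject anyway.
  const valid = ids.filter((id) => ARM_ID_PATTERN.test(id));
  if (valid.length === 0) {
    throw new Error('No valid arm IDs');
  }
  const params = new URLSearchParams({
    action,
    experiment_uuid: experimentUuid,
  });
  if (action === 'turns') {
    params.set('arm_ids', valid.join(',')); // comma-separated, as rl.php splits it
  } else {
    params.set('arm_id', valid[0]); // 'turn' and 'reward' take a single arm
  }
  return params;
}

// Send the payload as a POST. Returns false outside a browser, where
// navigator.sendBeacon is not available.
function send(payload, endpoint = '/modules/contrib/rl/rl.php') {
  if (typeof navigator !== 'undefined' &&
      typeof navigator.sendBeacon === 'function') {
    return navigator.sendBeacon(endpoint, payload);
  }
  return false;
}
```

A call such as `send(buildPayload('turns', uuid, ['variant_a', 'variant_b']))` would record one turn per listed arm, assuming `uuid` belongs to a registered experiment. Validating on the client only avoids pointless requests; the server-side checks remain the actual safeguard.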
30 changes: 28 additions & 2 deletions rl.routing.yml
@@ -40,13 +40,39 @@ rl.reports.experiments:
_controller: '\Drupal\rl\Controller\ReportsController::experimentsOverview'
_title: 'RL Experiments'
requirements:
_permission: 'administer rl experiments'
_permission: 'view rl reports'

rl.reports.experiment_detail:
path: '/admin/reports/rl/experiment/{experiment_uuid}'
defaults:
_controller: '\Drupal\rl\Controller\ReportsController::experimentDetail'
_title: 'RL Experiment Detail'
requirements:
_permission: 'view rl reports'
experiment_uuid: '.+'

rl.experiment.add:
path: '/admin/reports/rl/add'
defaults:
_form: '\Drupal\rl\Form\ExperimentForm'
_title: 'Add RL Experiment'
requirements:
_permission: 'administer rl experiments'
experiment_uuid: '.+'

rl.experiment.edit:
path: '/admin/reports/rl/experiment/{experiment_uuid}/edit'
defaults:
_form: '\Drupal\rl\Form\ExperimentForm'
_title: 'Edit RL Experiment'
requirements:
_permission: 'administer rl experiments'
experiment_uuid: '.+'

rl.experiment.delete:
path: '/admin/reports/rl/experiment/{experiment_uuid}/delete'
defaults:
_form: '\Drupal\rl\Form\ExperimentDeleteForm'
_title: 'Delete RL Experiment'
requirements:
_permission: 'administer rl experiments'
experiment_uuid: '.+'
2 changes: 1 addition & 1 deletion rl.services.yml
@@ -18,4 +18,4 @@ services:
rl.experiment_decorator_manager:
class: Drupal\rl\Decorator\ExperimentDecoratorManager
tags:
- { name: service_collector, tag: rl_experiment_decorator, call: addDecorator }
- { name: service_collector, tag: rl_experiment_decorator, call: addDecorator }
4 changes: 2 additions & 2 deletions scripts/prepare-drupal-lint.sh
@@ -2,8 +2,8 @@
set -e

if [ -z "$TARGET_DRUPAL_CORE_VERSION" ]; then
# default to target Drupal 8, you can override this by setting the secrets value on your github repo
TARGET_DRUPAL_CORE_VERSION=10
# default to target Drupal 11, you can override this by setting the secrets value on your github repo
TARGET_DRUPAL_CORE_VERSION=11
fi

echo "php --version"
7 changes: 3 additions & 4 deletions src/Controller/ExperimentController.php
@@ -12,7 +12,6 @@
* Controller for RL experiment operations.
*/
class ExperimentController extends ControllerBase {

/**
* The experiment manager service.
*
@@ -35,8 +34,8 @@ public function __construct(ExperimentManagerInterface $experiment_manager) {
*/
public static function create(ContainerInterface $container) {
return new static(
$container->get('rl.experiment_manager')
);
$container->get('rl.experiment_manager')
);
}

/**
@@ -144,4 +143,4 @@ public function getThompsonScores(Request $request, $experiment_uuid) {
}
}

}
}