12 changes: 2 additions & 10 deletions docs/AB_TESTING.md
@@ -3,31 +3,23 @@
A/B testing in Codex shows two translation suggestions side‑by‑side once in a while so you can pick the better one. This helps us learn which retrieval and prompting strategies work best without slowing you down.

## How it works
-- Triggering: Tests run at random with a small probability (default 15%).
+- Triggering: Tests run at random with a hardcoded probability of 1% (1 in 100). This is defined by `AB_TEST_PROBABILITY` in `src/utils/abTestingRegistry.ts`.
- Variants: When triggered, two candidates are generated in parallel.
- Auto‑apply: If the two results are effectively identical, we apply one automatically and no modal is shown.
- Choosing: If they differ, a simple chooser appears; click the option that reads best. Dismissing the modal after choosing just closes it.
- Frequency control: In the chooser, “See less/See more” nudges how often you’ll be asked in the future.
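
A minimal sketch of the trigger and auto-apply steps described in this list, not the extension's actual code. `AB_TEST_PROBABILITY` and its value come from this change; `shouldTriggerABTest`, `resolveVariants`, `normalize`, and the whitespace-based notion of "effectively identical" are hypothetical names and assumptions made for illustration.

```typescript
// Value taken from this change; everything else below is a hypothetical illustration.
const AB_TEST_PROBABILITY = 0.01;

type Resolution =
    | { kind: "auto-apply"; text: string }            // variants matched, no modal shown
    | { kind: "choose"; variants: [string, string] }; // variants differ, show the chooser

// Assumption: "effectively identical" means equal after collapsing whitespace.
function normalize(text: string): string {
    return text.trim().replace(/\s+/g, " ");
}

// Gate: never during batch operations, otherwise with fixed probability.
// The random source is injectable so the gate can be exercised deterministically.
function shouldTriggerABTest(
    isBatchOperation: boolean,
    random: () => number = Math.random
): boolean {
    return !isBatchOperation && random() < AB_TEST_PROBABILITY;
}

// Once two candidates exist, decide whether the user needs to be asked at all.
function resolveVariants(a: string, b: string): Resolution {
    if (normalize(a) === normalize(b)) {
        return { kind: "auto-apply", text: a };
    }
    return { kind: "choose", variants: [a, b] };
}
```

Passing the random source as a parameter is one way to force or suppress a test in unit tests once the probability is no longer read from settings.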

## What’s being compared
- Search algorithm for few‑shot retrieval: `fts5-bm25` vs `sbs`.
- Few‑shot example format: `source-and-target` vs `target-only`.
(Model comparisons are disabled by default.)

-## Settings
-- `codex-editor-extension.abTestingEnabled`: turn A/B testing on/off.
-- `codex-editor-extension.abTestingProbability`: probability (0–1) for running a true A/B test. Default: `0.15` (15%).
-
-Change these in VS Code Settings → Extensions → Codex Editor.

## Results & privacy
- Local log: Each choice is appended to `files/ab-test-results.jsonl` in your workspace (newline‑delimited JSON).
- Win rates: The editor may compute simple win‑rates by variant label and show them in the chooser.
- Network: If analytics posting is enabled in code, the extension may attempt to send anonymized A/B summaries to a configured endpoint. If your environment blocks network access, the extension continues without error.
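
As a rough illustration of how win rates by variant label could be derived from a newline-delimited JSON log, here is a sketch. The record shape (`names`, `selectedIndex`) is an assumption made for this example; the actual fields written to `files/ab-test-results.jsonl` are not shown in this change, and `computeWinRates` is a hypothetical helper.

```typescript
// Hypothetical record shape; the real log fields may differ.
interface ABTestRecord {
    names: string[];       // variant labels shown, e.g. ["fts5-bm25", "sbs"]
    selectedIndex: number; // index of the variant the user picked
}

// Parse one JSONL line; return null for blank, malformed, or unexpected lines.
function parseRecord(line: string): ABTestRecord | null {
    if (line.trim() === "") return null;
    try {
        const value = JSON.parse(line);
        if (value && Array.isArray(value.names) && typeof value.selectedIndex === "number") {
            return value as ABTestRecord;
        }
        return null;
    } catch {
        return null;
    }
}

// Win rate per label = times chosen / times shown.
function computeWinRates(jsonl: string): Record<string, number> {
    const wins: Record<string, number> = {};
    const shown: Record<string, number> = {};
    for (const line of jsonl.split("\n")) {
        const record = parseRecord(line);
        if (record === null) continue;
        record.names.forEach((name, index) => {
            shown[name] = (shown[name] ?? 0) + 1;
            if (index === record.selectedIndex) {
                wins[name] = (wins[name] ?? 0) + 1;
            }
        });
    }
    const rates: Record<string, number> = {};
    for (const name of Object.keys(shown)) {
        rates[name] = (wins[name] ?? 0) / shown[name];
    }
    return rates;
}
```

Skipping malformed lines rather than throwing keeps a single corrupted append from hiding the rest of the log.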

## Disable A/B testing
-- Set `codex-editor-extension.abTestingEnabled` to `false`, or
-- Set `codex-editor-extension.abTestingProbability` to `0`.
+Set `AB_TEST_PROBABILITY` to `0` in `src/utils/abTestingRegistry.ts`.

## Developer pointers (optional)
- Registry and helpers: `src/utils/abTestingRegistry.ts`, `src/utils/abTestingSetup.ts`.
14 changes: 0 additions & 14 deletions package.json
@@ -931,20 +931,6 @@
"default": false,
"description": "When enabled, AI will only use translation pairs that have been validated by users as examples for few-shot prompting. This ensures higher quality examples but may reduce the number of available examples."
},
-"codex-editor-extension.abTestingEnabled": {
-"title": "Enable A/B Testing",
-"type": "boolean",
-"default": true,
-"description": "Enables lightweight A/B tests during completions. Tests are globally gated by probability and each comparison shows exactly two options."
-},
-"codex-editor-extension.abTestingProbability": {
-"title": "A/B Testing Probability",
-"type": "number",
-"default": 0.15,
-"minimum": 0,
-"maximum": 1,
-"description": "Probability (0-1) that any eligible event will run a true A/B test. When not triggered, the system returns identical variants to keep UX consistent without doubling compute."
-},
"codex-editor-extension.searchAlgorithm": {
"title": "Search Algorithm",
"type": "string",
1 change: 0 additions & 1 deletion src/copilotSettings/copilotSettings.ts
@@ -340,7 +340,6 @@ export async function generateChatSystemMessage(
numberOfFewShotExamples: 0,
debugMode: false,
useOnlyValidatedExamples: false,
-abTestingEnabled: false,
allowHtmlPredictions: allowHtmlPredictions,
fewShotExampleFormat: "source-and-target",
};
3 changes: 0 additions & 3 deletions src/projectManager/utils/migrationUtils.ts
@@ -1186,9 +1186,6 @@ export const migration_lineNumbersSettings = async (context?: vscode.ExtensionCo
}
};

-// Gently migrate A/B testing probability from older explicit 25% to 5% with user consent
-// (removed) migration_abTestingProbabilityDefault — intentionally deleted for now
-
async function analyzeFileForLineNumbers(fileUri: vscode.Uri): Promise<boolean> {
try {
// Read the file content using serializer for proper deserialization
16 changes: 7 additions & 9 deletions src/providers/translationSuggestions/llmCompletion.ts
@@ -7,7 +7,7 @@ import { CodexCellTypes } from "../../../types/enums";
import { getAutoCompleteStatusBarItem } from "../../extension";
import { tokenizeText } from "../../utils/nlpUtils";
import { buildFewShotExamplesText, buildMessages, fetchFewShotExamples, getPrecedingTranslationPairs } from "./shared";
-import { abTestingRegistry } from "../../utils/abTestingRegistry";
+import { abTestingRegistry, AB_TEST_PROBABILITY } from "../../utils/abTestingRegistry";

// Helper function to build A/B test context object
function buildABTestContext(
@@ -249,23 +249,21 @@ export async function llmCompletion(
// A/B testing is disabled during batch operations (chapter autocomplete, batch transcription)
// to avoid interrupting the user with variant selection UI
const extConfig = vscode.workspace.getConfiguration("codex-editor-extension");
-const abEnabled = Boolean(extConfig.get("abTestingEnabled") ?? true) && !isBatchOperation;
-const abProbabilityRaw = extConfig.get<number>("abTestingProbability");
-const abProbability = Math.max(0, Math.min(1, typeof abProbabilityRaw === "number" ? abProbabilityRaw : 0.15));
+// A/B testing is always enabled but skipped during batch operations.
+// Probability is hardcoded in AB_TEST_PROBABILITY (single source of truth).
+const abEnabled = !isBatchOperation;
const randomValue = Math.random();
-const triggerAB = abEnabled && randomValue < abProbability;
+const triggerAB = abEnabled && randomValue < AB_TEST_PROBABILITY;

if (completionConfig.debugMode) {
-console.debug(`[llmCompletion] A/B testing: enabled=${abEnabled}, isBatchOperation=${isBatchOperation}, probability=${abProbability}, random=${randomValue.toFixed(3)}, trigger=${triggerAB}`);
+console.debug(`[llmCompletion] A/B testing: enabled=${abEnabled}, isBatchOperation=${isBatchOperation}, probability=${AB_TEST_PROBABILITY}, random=${randomValue.toFixed(3)}, trigger=${triggerAB}`);
}

if (!triggerAB && completionConfig.debugMode) {
if (isBatchOperation) {
console.debug(`[llmCompletion] A/B testing disabled during batch operation`);
-} else if (!abEnabled) {
-console.debug(`[llmCompletion] A/B testing disabled in settings`);
} else {
-console.debug(`[llmCompletion] A/B test not triggered (random ${randomValue.toFixed(3)} >= probability ${abProbability})`);
+console.debug(`[llmCompletion] A/B test not triggered (random ${randomValue.toFixed(3)} >= probability ${AB_TEST_PROBABILITY})`);
}
}

2 changes: 0 additions & 2 deletions src/test/suite/validatedOnlyExamples.test.ts
@@ -117,8 +117,6 @@ suite("Validated-only examples behavior", () => {
if (section === "codex-editor-extension") {
return {
get: (key: string) => {
-if (key === "abTestingEnabled") return true;
-if (key === "abTestingProbability") return 1; // force
if (key === "useOnlyValidatedExamples") return true;
if (key === "searchAlgorithm") return "sbs";
return (cfg as any)?.get?.(key);
6 changes: 6 additions & 0 deletions src/utils/abTestingRegistry.ts
@@ -1,3 +1,9 @@
+/**
+ * Probability (0–1) that any eligible completion triggers a local A/B test.
+ * 0.01 = 1 in 100. Change this single constant to adjust frequency everywhere.
+ */
+export const AB_TEST_PROBABILITY = 0.01;
+
type ABTestResultPayload<TVariant> = TVariant[] | {
variants: TVariant[];
isAttentionCheck?: boolean;
3 changes: 0 additions & 3 deletions src/utils/llmUtils.ts
@@ -340,7 +340,6 @@ export interface CompletionConfig {
numberOfFewShotExamples: number;
debugMode: boolean;
useOnlyValidatedExamples: boolean;
-abTestingEnabled: boolean; // legacy flag; kept for type compatibility
allowHtmlPredictions?: boolean; // whether to preserve HTML in examples and predictions
fewShotExampleFormat: string; // format for few-shot examples: 'source-and-target' or 'target-only'
}
@@ -369,8 +368,6 @@ export async function fetchCompletionConfig(): Promise<CompletionConfig> {
numberOfFewShotExamples: (config.get("numberOfFewShotExamples") as number) || 30,
debugMode: config.get("debugMode") === true || config.get("debugMode") === "true",
useOnlyValidatedExamples: useOnlyValidatedExamples as boolean,
-// A/B testing flag kept for compatibility; registry handles gating
-abTestingEnabled: (config.get("abTestingEnabled") as boolean) ?? true,
allowHtmlPredictions: (config.get("allowHtmlPredictions") as boolean) || false,
fewShotExampleFormat: (config.get("fewShotExampleFormat") as string) || "source-and-target",
};
3 changes: 1 addition & 2 deletions types/index.d.ts
@@ -2163,8 +2163,7 @@ type EditorReceiveMessages =
}
| { type: "providerUpdatesTextDirection"; textDirection: "ltr" | "rtl"; }
| { type: "providerSendsLLMCompletionResponse"; content: { completion: string; cellId: string; }; }
-| { type: "providerSendsABTestVariants"; content: { variants: string[]; cellId: string; testId: string; testName?: string; names?: string[]; abProbability?: number; }; }
-| { type: "abTestingProbabilityUpdated"; content: { value: number; }; }
+| { type: "providerSendsABTestVariants"; content: { variants: string[]; cellId: string; testId: string; testName?: string; names?: string[]; }; }
| { type: "jumpToSection"; content: string; }
| { type: "providerUpdatesNotebookMetadataForWebview"; content: CustomNotebookMetadata; }
| { type: "updateVideoUrlInWebview"; content: string; }
@@ -355,7 +355,6 @@ const CodexCellEditor: React.FC = () => {
testId: string;
testName?: string;
names?: string[];
-abProbability?: number;
}>({
isActive: false,
variants: [],
@@ -1373,7 +1372,7 @@ const CodexCellEditor: React.FC = () => {
},
setAudioAttachments: setAudioAttachments,
showABTestVariants: (data) => {
-const { variants, cellId, testId, testName, names, abProbability } = data as any;
+const { variants, cellId, testId, testName, names } = data as any;
const count = Array.isArray(variants) ? variants.length : 0;
debug("ab-test", "Received A/B test variants:", { cellId, count });

@@ -1423,7 +1422,6 @@ const CodexCellEditor: React.FC = () => {
testId,
testName,
names,
-abProbability,
});
return;
}