initial experiments #2

Merged
abirharrasse merged 5 commits into main from abir-explorations
Dec 10, 2025
Conversation

@abirharrasse
Collaborator

No description provided.

Copilot AI review requested due to automatic review settings December 9, 2025 22:12
Contributor

Copilot AI left a comment

Pull request overview

This PR introduces an experimental pipeline for analyzing causal relationships in chain-of-thought (CoT) reasoning. The code generates CoT reasoning traces from a language model, computes sentence-level causal influence matrices using attention masking, extracts high-importance "thought anchors," and trains classifiers to categorize reasoning steps into 8 semantic classes (e.g., problem setup, active computation, self-checking).

  • Implements causal tracing via attention masking to measure how masking source sentences affects target sentence predictions (KL divergence)
  • Classifies reasoning sentences into 8 anchor classes and selects high-importance sentences based on causal outgoing influence
  • Trains Logistic Regression and MLP classifiers on hidden states to predict anchor classes
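
The influence measure described in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: it assumes the next-token distributions for a target sentence are already available as plain lists (one from the unmasked run, one from the run with a source sentence masked), and the names `kl_divergence`, `sentence_influence`, and `outgoing_influence` are hypothetical. The KL direction and the mean/sum aggregation are also assumptions.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as equal-length lists."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def sentence_influence(base_dists, masked_dists):
    """Mean per-token KL between the unmasked and masked runs.

    base_dists / masked_dists: one probability distribution per token of the
    target sentence (assumed layout, not taken from the PR).
    """
    if len(base_dists) != len(masked_dists):
        raise ValueError("both runs must cover the same target tokens")
    if not base_dists:
        return 0.0
    total = sum(kl_divergence(b, m) for b, m in zip(base_dists, masked_dists))
    return total / len(base_dists)


def outgoing_influence(matrix):
    """Row sums of a source-by-target influence matrix.

    matrix[i][j] is the influence of masking sentence i on sentence j.
    """
    return [sum(row) for row in matrix]
```

Sentences with the largest outgoing values would then be the candidates for "thought anchors"; the selection threshold is left open here because the PR summary does not state one.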

Comment on lines +336 to +337
text_before = problem + " " + " ".join(sentences[:idx])
hidden_state = get_hidden_state(text_before)

Copilot AI Dec 9, 2025

The feature extraction concatenates problem + " " + " ".join(sentences[:idx]) to get context before each anchor sentence. This reconstructs text from split sentences, which may not match the original CoT text due to lost formatting, punctuation, or whitespace. This inconsistency could affect the hidden state extraction. Consider storing the original character positions of sentences and slicing the original CoT text instead of reconstructing from split sentences.
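
One way to follow this suggestion, sketched under assumptions: record each sentence's character span when splitting, then slice the original CoT text up to the start of the anchor sentence. The regex splitter, `sentence_spans`, and `context_before` are hypothetical stand-ins; the PR's actual sentence splitter is not shown in this thread.

```python
import re

# Hypothetical splitter: a sentence ends at '.', '!' or '?' when followed by
# whitespace or the end of the text, so decimals such as 3.5 are not split.
_SENTENCE_RE = re.compile(r"\S.*?(?:[.!?](?=\s|$)|$)", re.DOTALL)


def sentence_spans(text):
    """Return (start, end) character offsets of each sentence in text."""
    return [m.span() for m in _SENTENCE_RE.finditer(text)]


def context_before(problem, cot_text, spans, idx):
    """Problem plus the untouched CoT text preceding sentence idx.

    Slicing cot_text keeps the original whitespace and punctuation, unlike
    re-joining the split sentences with single spaces.
    """
    start = spans[idx][0]
    return problem + " " + cot_text[:start]


cot = "Let x = 3.5. Then 2x = 7!\nSo the answer is 7."
spans = sentence_spans(cot)
sentences = [cot[s:e] for s, e in spans]
```

With spans stored next to each sentence, `get_hidden_state(context_before(problem, cot, spans, idx))` would see the same characters the model originally generated, including the newline before the last sentence in this example.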

Copilot uses AI. Check for mistakes.
Collaborator Author

@copilot open a new pull request to apply changes based on this feedback


scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, test_size=0.2, random_state=42, stratify=y)

Copilot AI Dec 9, 2025

The train/test split uses stratify=y which is good practice, but with small class counts (min_samples=2), some classes may have only 2-3 samples total. Stratification with very small classes can fail or result in insufficient test samples for meaningful evaluation. Consider increasing min_samples to at least 5-10, or using cross-validation instead of a single train/test split for more robust evaluation with limited data.
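
A small sketch of the data-side half of this advice, using only the standard library: drop classes below a minimum count, then pick a fold count that the smallest remaining class can support. `drop_rare_classes` and `safe_n_splits` are hypothetical helpers, and the thresholds are illustrative rather than values from the PR.

```python
from collections import Counter


def drop_rare_classes(X, y, min_samples=5):
    """Keep only samples whose class appears at least min_samples times."""
    counts = Counter(y)
    keep = [i for i, label in enumerate(y) if counts[label] >= min_samples]
    return [X[i] for i in keep], [y[i] for i in keep]


def safe_n_splits(y, max_splits=5):
    """Largest fold count, capped at max_splits, that every class can fill."""
    smallest = min(Counter(y).values())
    n_splits = min(max_splits, smallest)
    if n_splits < 2:
        raise ValueError("need at least 2 samples per class for cross-validation")
    return n_splits
```

The resulting fold count could then be passed as `n_splits` to scikit-learn's `StratifiedKFold` in place of the single `train_test_split` call, so that every class can appear in every fold.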

# EXTRACT ANCHORS

print("="*80)
print("PHASE 2: EXTRACTING THOUGHT ANCHORS")

Copilot AI Dec 9, 2025

This is labeled as "PHASE 2" but Phase 2 was already used at line 207. The phases are misnumbered. This should be Phase 3, and subsequent phases should be renumbered (current Phase 3 → Phase 4, current Phase 4 → Phase 5).

Suggested change
print("PHASE 2: EXTRACTING THOUGHT ANCHORS")
print("PHASE 3: EXTRACTING THOUGHT ANCHORS")
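
Beyond the one-line fix, hard-coded phase numbers could be avoided altogether by letting a counter assign them. The `PhaseLogger` class below is a hypothetical sketch, not part of the PR; it only assumes the banner format visible in the snippet above (a line of `=` followed by the phase title).

```python
class PhaseLogger:
    """Prints phase banners and numbers them automatically."""

    def __init__(self, width=80):
        self.width = width
        self.count = 0

    def start(self, title):
        """Print the banner for the next phase and return its header line."""
        self.count += 1
        header = f"PHASE {self.count}: {title}"
        print("=" * self.width)
        print(header)
        return header
```

Each phase then calls `log.start("EXTRACTING THOUGHT ANCHORS")` on a shared `log = PhaseLogger()` instance, so inserting or reordering phases cannot leave two of them with the same number.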

'outgoing': outgoing_feature
})

pickle.dump(all_features, open(ckpt_features, 'wb'))

Copilot AI Dec 9, 2025

The file returned by open() is never explicitly closed. Use a with block so the handle is closed even if pickle.dump raises.

Suggested change
pickle.dump(all_features, open(ckpt_features, 'wb'))
with open(ckpt_features, 'wb') as f:
    pickle.dump(all_features, f)

Comment on lines 449 to 452
pickle.dump(clf_lr, open(f"{checkpoint_dir}/classifier_lr.pkl", 'wb'))
pickle.dump(clf_mlp, open(f"{checkpoint_dir}/classifier_mlp.pkl", 'wb'))
pickle.dump(scaler, open(f"{checkpoint_dir}/scaler.pkl", 'wb'))
pickle.dump(class_to_idx, open(f"{checkpoint_dir}/class_to_idx.pkl", 'wb'))

Copilot AI Dec 9, 2025

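The files returned by open() are never explicitly closed. Use a with block for each so the handle is closed even if pickle.dump raises.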

Suggested change
pickle.dump(clf_lr, open(f"{checkpoint_dir}/classifier_lr.pkl", 'wb'))
pickle.dump(clf_mlp, open(f"{checkpoint_dir}/classifier_mlp.pkl", 'wb'))
pickle.dump(scaler, open(f"{checkpoint_dir}/scaler.pkl", 'wb'))
pickle.dump(class_to_idx, open(f"{checkpoint_dir}/class_to_idx.pkl", 'wb'))
with open(f"{checkpoint_dir}/classifier_lr.pkl", 'wb') as f:
    pickle.dump(clf_lr, f)
with open(f"{checkpoint_dir}/classifier_mlp.pkl", 'wb') as f:
    pickle.dump(clf_mlp, f)
with open(f"{checkpoint_dir}/scaler.pkl", 'wb') as f:
    pickle.dump(scaler, f)
with open(f"{checkpoint_dir}/class_to_idx.pkl", 'wb') as f:
    pickle.dump(class_to_idx, f)
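
Since the same open-and-dump pattern repeats four times, it could also be factored into a helper. The sketch below is one possible shape, with hypothetical names `save_pickle` and `save_artifacts`; the file naming (`<name>.pkl` inside the checkpoint directory) mirrors the paths in the snippet above, and creating the directory if it is missing is an added assumption.

```python
import os
import pickle


def save_pickle(obj, path):
    """Pickle obj to path, closing the file even if pickle.dump raises."""
    with open(path, "wb") as f:
        pickle.dump(obj, f)


def save_artifacts(artifacts, checkpoint_dir):
    """Write each named object to <checkpoint_dir>/<name>.pkl.

    Returns a dict mapping each name to the path it was written to.
    """
    os.makedirs(checkpoint_dir, exist_ok=True)
    paths = {}
    for name, obj in artifacts.items():
        path = os.path.join(checkpoint_dir, f"{name}.pkl")
        save_pickle(obj, path)
        paths[name] = path
    return paths
```

The four dumps would then collapse to a single call such as `save_artifacts({"classifier_lr": clf_lr, "classifier_mlp": clf_mlp, "scaler": scaler, "class_to_idx": class_to_idx}, checkpoint_dir)`.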

@wlg1 wlg1 self-assigned this Dec 9, 2025
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Copilot AI commented Dec 10, 2025

@abirharrasse I've opened a new pull request, #3, to work on those changes. Once the pull request is ready, I'll request review from you.

abirharrasse and others added 3 commits December 10, 2025 10:15
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@abirharrasse abirharrasse merged commit b49f87f into main Dec 10, 2025

3 participants