
Conversation

@srikrishnavansi
Contributor

@srikrishnavansi srikrishnavansi commented Jan 12, 2026

Summary

Add comprehensive AI-powered tutor for package education with:

  • Interactive tutoring using Plan→Act→Reflect LangGraph workflow
  • LLM-powered lessons, code examples, and Q&A (Claude API via LangChain)
  • SQLite-based progress tracking and student profiling
  • CLI integration via cortex tutor <package>

Related Issue

Closes #131

Type of Change

  • New feature

Demo

Demo.mov

AI Disclosure

  • AI/IDE/Agents used (please describe below)

Used Claude (Anthropic) via Claude Code CLI for:

  • Code implementation and architecture design
  • PR review fixes (addressing Gemini/CodeRabbit comments)
  • Documentation generation

Testing

  • 266 tests pass with 86% coverage
  • Run: pytest cortex/tutor/tests/ -v
  • Manual testing: cortex tutor docker --fresh

Checklist

Summary by CodeRabbit

  • New Features

    • Intelligent Tutor: interactive, menu-driven lessons, Q&A, conversational follow-ups, runnable examples, caching and offline fallbacks.
    • Tutor CLI: new "tutor" command supporting teach, question, list, progress, reset, fresh, verbose, and progress controls with branded terminal UI.
  • Tools & Storage

    • Added LLM-backed lesson generator, examples provider, QA tool, deterministic lesson loader, progress tracker, validators, and SQLite-backed persistence.
  • Branding

    • Rich-based terminal UI components for banners, tables, code, progress, and prompts.
  • Documentation

    • Comprehensive AI_TUTOR guide and prompt/tool docs.
  • Tests

    • Extensive unit and integration tests for tutor flows, tools, CLI, and storage.

✏️ Tip: You can customize this high-level summary in your review settings.

Add comprehensive AI tutor for package education with:
- Interactive tutoring with Plan→Act→Reflect LangGraph workflow
- LLM-powered lessons, code examples, and Q&A (Claude API via LangChain)
- SQLite-based progress tracking with topic completion stats
- Best practices and step-by-step tutorials
- Rich terminal UI with branded output

Features:
- cortex tutor <package> - Start interactive session
- cortex tutor --list - Show studied packages
- cortex tutor --progress - View learning progress
- cortex tutor --reset - Reset progress

Technical:
- 266 tests with 85.61% coverage
- Lazy imports for non-LLM operations (no API key needed for --list/--progress)
- Pydantic models for type safety
- 7-layer prompt architecture

Closes cortexlinux#131
@coderabbitai
Contributor

coderabbitai bot commented Jan 12, 2026

📝 Walkthrough

Adds an AI-powered "tutor" subsystem: CLI wiring, a LangGraph Plan→Act→Reflect tutor agent, deterministic and agentic tools (LLM-backed lesson/QA/examples), SQLite persistence, Rich terminal branding, configuration, prompts, docs, and extensive tests.

Changes

  • CLI Integration (cortex/cli.py, cortex/tutor/cli.py): Adds tutor subcommand wiring and a full tutor CLI (parser + handlers). Review argument parsing, lazy imports, and interrupt/error handling.
  • Package Init & Branding (cortex/tutor/__init__.py, cortex/tutor/branding.py): New package exports (Config, console, tutor_print, __version__) and Rich-based UI helpers (banners, menus, panels, input).
  • Agents & Graph (cortex/tutor/agents/..., .../tutor_agent/graph.py, .../state.py, .../tutor_agent/tutor_agent.py, agents/__init__.py, tutor_agent/__init__.py): Introduces TutorAgent, the InteractiveTutor UI, typed TutorAgentState schemas, the Plan→Act→Reflect LangGraph graph, and node routing and state mutation helpers. Focus on routing, state consistency, checkpoints, and error handling.
  • Config & Contracts (cortex/tutor/config.py, cortex/tutor/contracts/*): Pydantic Config (env-driven, db path, API-key helpers) and Pydantic contracts for lesson/progress/quiz payloads (CodeExample, TutorialStep, LessonContext, ProgressContext, QuizContext).
  • Persistence / Memory (cortex/tutor/memory/sqlite_store.py, cortex/tutor/memory/__init__.py): Thread-safe SQLiteStore with schemas/APIs for progress, quizzes, student profiles, lesson cache (TTL), and utilities (reset, list). Review schema, JSON serialization, transactions, and locking.
  • Deterministic Tools (cortex/tutor/tools/deterministic/*): Adds LessonLoaderTool (cache + fallback lessons), ProgressTrackerTool (DB-backed actions), and validators (package/topic/question/score sanitizers). Verify cache semantics and validation rules.
  • Agentic Tools (cortex/tutor/tools/agentic/*, cortex/tutor/tools/__init__.py): Adds LLM-backed LessonGeneratorTool, ExamplesProviderTool, QAHandlerTool (with ConversationHandler), prompt builders, response structuring, sync/async paths, and cost reporting. Check prompt schemas and parser alignment.
  • Prompts & Policies (cortex/tutor/prompts/...): New multi-layer system prompt docs for the tutor agent and tools (anti-hallucination rules, JSON output schemas). Ensure schema/code alignment.
  • Docs & Tests (docs/AI_TUTOR.md, cortex/tutor/tests/__init__.py, tests/tutor/*): Adds comprehensive documentation and extensive unit/integration tests covering agents, tools, CLI, branding, persistence, validators, and the interactive UI. Review test isolation, mocks, and I/O assumptions.
  • Dependencies (requirements.txt): Appends langchain, langchain-anthropic, langgraph, and pydantic (v2) entries. Review dependency surface and version constraints.

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User as User/CLI
    participant CLI as Cortex CLI
    participant Tutor as TutorAgent
    participant Graph as LangGraph
    participant Loader as LessonLoader
    participant Generator as LessonGenerator
    participant Store as SQLiteStore

    User->>CLI: cortex tutor <command>
    CLI->>Tutor: teach(package, force_fresh?) / ask(package, question)
    activate Tutor
    Tutor->>Graph: execute(state)
    Graph->>Graph: plan_node()
    alt plan -> use_cache
        Graph->>Loader: load cached lesson
        Loader->>Store: get_cached_lesson(package)
        Store-->>Loader: cached lesson
        Loader-->>Graph: attach cached_data
    else plan -> generate_full
        Graph->>Generator: generate lesson (LLM)
        Generator->>Generator: call LLM chain & parse JSON
        Generator->>Store: cache_lesson(package, lesson)
        Generator-->>Graph: attach generated lesson
    else plan -> qa
        Graph->>Tutor: route to QA node
    end
    Graph->>Graph: reflect_node() -> validate & score
    Graph->>Store: update progress / stats
    Graph-->>Tutor: final result payload
    Tutor-->>CLI: render lesson / answer / progress via branding
    deactivate Tutor
```
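To complement the diagram, here is a minimal, runnable sketch of how a Plan→Act→Reflect graph like this could be wired with LangGraph. Node and router names follow the walkthrough above, but the state schema, the stub node bodies, and the routing keys are assumptions, not the PR's actual graph.py:

```python
from typing import Any, Dict, TypedDict

from langgraph.graph import END, StateGraph


class TutorAgentState(TypedDict, total=False):
    """Assumed shape; the PR's real schema lives in state.py."""
    package_name: str
    plan: str
    lesson: Dict[str, Any]


# Stub nodes standing in for the PR's real implementations in graph.py.
def plan_node(state: TutorAgentState) -> Dict[str, Any]:
    return {"plan": "generate_full"}  # could also be "use_cache" or "qa"

def load_cache_node(state: TutorAgentState) -> Dict[str, Any]:
    return {"lesson": {"source": "cache"}}

def generate_lesson_node(state: TutorAgentState) -> Dict[str, Any]:
    return {"lesson": {"source": "llm"}}

def qa_node(state: TutorAgentState) -> Dict[str, Any]:
    return {"lesson": {"source": "qa"}}

def reflect_node(state: TutorAgentState) -> Dict[str, Any]:
    return {}  # the real node validates and scores the result

def route_after_plan(state: TutorAgentState) -> str:
    return state["plan"]


graph = StateGraph(TutorAgentState)
for name, fn in [
    ("plan", plan_node),
    ("load_cache", load_cache_node),
    ("generate_lesson", generate_lesson_node),
    ("qa", qa_node),
    ("reflect", reflect_node),
]:
    graph.add_node(name, fn)

graph.set_entry_point("plan")
graph.add_conditional_edges(
    "plan",
    route_after_plan,
    {"use_cache": "load_cache", "generate_full": "generate_lesson", "qa": "qa"},
)
for name in ("load_cache", "generate_lesson", "qa"):
    graph.add_edge(name, "reflect")
graph.add_edge("reflect", END)

app = graph.compile()
print(app.invoke({"package_name": "docker"}))
```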

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

Suggested labels

enhancement

Suggested reviewers

  • mikejmorgan-ai
  • Suyashd999
  • Sahilbhatane

Poem

🐰
I hopped through prompts and SQLite rows,
Built lessons, menus, and helpful flows,
LangGraphs planned, the tutor taught,
Progress saved for every thought,
Hooray — learners bloom where curiosity grows!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)

  • Title check: ✅ Passed. The title '[tutor] Add AI-Powered Installation Tutor (Issue #131)' is concise, clearly identifies the main change (the AI-powered tutor feature), and is specific enough for scanning commit history.
  • Description check: ✅ Passed. The PR description includes all required sections: Summary, Related Issue (#131), Type of Change, AI Disclosure, Testing details, and Checklist completion. All major elements from the template are addressed.
  • Linked Issues check: ✅ Passed. The implementation satisfies all core requirements from issue #131: LLM-powered explanations via Claude, an interactive Q&A flow, code examples, tutorials, progress tracking via SQLite, >80% test coverage (86% reported), and documentation (docs/AI_TUTOR.md). CLI integration via 'cortex tutor' matches the usage example.
  • Out of Scope Changes check: ✅ Passed. All code changes are focused on implementing the AI-powered tutor feature. The new modules (cortex/tutor/*), added dependencies (langchain, pydantic), tests, and documentation align with the requirements in issue #131. No unrelated changes detected.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 99.86%, which is sufficient; the required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions

github-actions bot commented Jan 12, 2026

CLA Verification Passed

All contributors have signed the CLA.

Contributor Signed As
@srikrishnavansi @srikrishnavansi
@Anshgrover23 @Anshgrover23

@gemini-code-assist
Contributor

Summary of Changes

Hello @srikrishnavansi, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request integrates an advanced AI-powered tutor into Cortex Linux, providing users with an interactive and personalized way to learn about various software packages. The system intelligently generates educational content, tracks learning progress, and offers quick answers to questions, enhancing the user's ability to understand and utilize system tools effectively.

Highlights

  • New AI-Powered Installation Tutor: Introduced a comprehensive AI tutor feature (cortex tutor) for interactive learning about software packages, leveraging Claude AI via LangChain.
  • Interactive Learning Workflow: Implemented a Plan→Act→Reflect LangGraph workflow for dynamic and adaptive tutoring sessions, including LLM-powered lessons, code examples, and Q&A.
  • Progress Tracking and Caching: Added SQLite-based progress tracking for topics and packages, along with smart caching of generated lessons to reduce API costs and improve performance.
  • Rich Terminal UI: Developed a rich terminal user interface using the Rich library, featuring branded output, syntax-highlighted code, progress bars, and interactive menus.
  • CLI Commands: New CLI commands are available: cortex tutor <package> for interactive lessons, cortex tutor -q 'question' for quick Q&A, cortex tutor --list to view studied packages, cortex tutor --progress to check learning progress, and cortex tutor --reset to clear progress.
  • Technical Enhancements: The implementation includes lazy imports for non-LLM operations (allowing --list/--progress without an API key; a sketch of the pattern follows this list), Pydantic models for type safety, and a 7-layer prompt architecture for robust AI agent behavior.
  • Comprehensive Documentation and Testing: A detailed AI_TUTOR.md document outlines the feature's architecture, technical design, API reference, and usage. The new codebase is covered by 266 tests with 85.61% coverage.
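On the lazy-import point above, here is a minimal sketch of the pattern. The handler bodies are simplified illustrations, not the PR's exact code, and it assumes SQLiteStore can be constructed with its default database path:

```python
def cmd_list_packages() -> int:
    """List studied packages; works offline, no API key required."""
    # The heavy LLM stack (langchain, anthropic) is never imported on this
    # path; only the local SQLite store is touched.
    from cortex.tutor.memory.sqlite_store import SQLiteStore  # lazy import

    store = SQLiteStore()
    for package in store.get_packages_studied():
        print(package)
    return 0


def cmd_teach(package: str) -> int:
    """Start a lesson; this path genuinely needs the LLM stack."""
    # Imported only when teaching, so a missing API key surfaces here,
    # not at module import time.
    from cortex.tutor.agents.tutor_agent import InteractiveTutor

    InteractiveTutor(package).start()
    return 0
```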
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

  • Code Review: /gemini review. Performs a code review for the current pull request in its current state.
  • Pull Request Summary: /gemini summary. Provides a summary of the current pull request in its current state.
  • Comment: @gemini-code-assist. Responds in comments when explicitly tagged, both in pull request comments and review comments.
  • Help: /gemini help. Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, and code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This is an excellent and comprehensive pull request that adds a powerful AI Tutor feature. The architecture is well-designed, leveraging modern tools like LangGraph and Pydantic for a robust and maintainable implementation. The code is clean, well-documented, and includes a thorough test suite, which is fantastic to see. The 7-layer prompt architecture is particularly impressive and shows a deep understanding of prompt engineering best practices.

My review focuses on a few minor areas for improvement to enhance maintainability and robustness, such as centralizing CLI argument definitions and removing hardcoded values. Overall, this is a very high-quality contribution.

```python
generator = LessonGeneratorTool()
result = generator._run(
    package_name=package_name,
    student_level="beginner",  # Could be dynamic based on profile
```
Contributor


Severity: medium

The student_level is hardcoded to 'beginner'. The comment correctly notes that this could be dynamic. To make the tutor more adaptive, you could determine the student's level from their profile. For instance, you could add a simple function to assess the number of mastered_concepts vs weak_concepts and map that to a 'beginner', 'intermediate', or 'advanced' level. This would make the generated lessons more personalized.
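A minimal sketch of what such a mapping could look like; the function name and thresholds here are hypothetical, not part of the PR:

```python
def assess_student_level(mastered_concepts: list[str], weak_concepts: list[str]) -> str:
    """Map a student profile to a coarse difficulty level (hypothetical helper)."""
    net = len(mastered_concepts) - len(weak_concepts)
    if net >= 10:
        return "advanced"
    if net >= 3:
        return "intermediate"
    return "beginner"


# At the call site, instead of the hardcoded value:
# student_level=assess_student_level(profile.mastered_concepts, profile.weak_concepts)
```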

Collaborator


@srikrishnavansi Please address this one.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 9

🤖 Fix all issues with AI agents
In @cortex/tutor/cli.py:
- Around line 110-150: The cmd_teach function accepts a fresh flag but never
passes it to the InteractiveTutor; update cmd_teach to forward the fresh flag
when constructing InteractiveTutor (e.g., InteractiveTutor(package,
force_fresh=fresh) or matching param name), and update InteractiveTutor.__init__
in cortex.tutor.agents.tutor_agent to accept and store the corresponding
parameter (force_fresh or fresh) so the interactive session honors the flag.

In @cortex/tutor/config.py:
- Line 44: The db_path field is declared as Path but defaults to None causing a
type mismatch; update the annotation to Optional[Path] (and add from typing
import Optional if missing) for the db_path Field declaration so it accurately
allows None prior to model_post_init, keeping the Field(..., description="Path
to SQLite database") usage and leaving model_post_init behavior unchanged.
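For reference, a minimal sketch of the corrected declaration; the surrounding model fields and the model_post_init body are elided, and the Field usage follows the comment above:

```python
from pathlib import Path
from typing import Optional

from pydantic import BaseModel, Field


class Config(BaseModel):
    # Optional[Path] matches the None default; model_post_init is expected
    # to fill in the real path afterwards, as in the PR.
    db_path: Optional[Path] = Field(None, description="Path to SQLite database")
```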

In @cortex/tutor/memory/sqlite_store.py:
- Around line 340-377: The singleton profile logic can race when two threads
insert simultaneously; add a UNIQUE constraint to the student_profile table
(e.g., enforce a single row via id PRIMARY KEY or a UNIQUE constant key) and
change _create_default_profile() to perform an atomic upsert (SQLite: INSERT OR
IGNORE or INSERT ... ON CONFLICT DO NOTHING) instead of blindly inserting, then
re-query the table and return the actual stored row; update
get_student_profile() to, after attempting creation, SELECT the profile again
(don’t recurse) so whichever thread succeeded returns the canonical row and
duplicates/constraint errors are avoided.
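For reference, a minimal sketch of the race-safe insert; the lock attribute, connection handling, and column names are assumptions about the store's internals, not the PR's actual schema:

```python
def _create_default_profile(self) -> None:
    """Insert the singleton profile row if it does not exist (race-safe)."""
    with self._lock:
        self.conn.execute(
            # id is fixed to 1; the PRIMARY KEY makes concurrent inserts
            # collide (and be ignored) instead of creating duplicates.
            "INSERT OR IGNORE INTO student_profile (id, learning_style) "
            "VALUES (1, 'reading')"
        )
        self.conn.commit()
```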

In @cortex/tutor/tests/test_cli.py:
- Around line 103-115: The test mocks the wrong import location for
InteractiveTutor so cmd_teach still uses the real class; update the patch target
to where cmd_teach looks up the name by replacing
@patch("cortex.tutor.agents.tutor_agent.InteractiveTutor") with
@patch("cortex.tutor.cli.InteractiveTutor") in the test (test_successful_teach)
so the mocked InteractiveTutor instance is injected when cmd_teach("docker")
constructs and calls InteractiveTutor.start.

In @cortex/tutor/tests/test_progress_tracker.py:
- Around line 290-304: The three test stubs in TestConvenienceFunctions
(test_get_learning_progress, test_mark_topic_completed, test_get_package_stats)
should be implemented or removed; to fix, implement them using the temp_db
fixture to create minimal sample data, import and call the actual convenience
functions (get_learning_progress, mark_topic_completed, get_package_stats), then
assert expected behavior: for get_learning_progress verify returned structure
and percentages for a package/topic based on created progress rows, for
mark_topic_completed verify it creates/updates a progress record marking the
topic completed and sets completed_at or equivalent flag, and for
get_package_stats verify aggregate stats (total topics, completed count,
completion rate) match the test data; use the function/class names above to
locate targets and clean up any leftover pass placeholders if you prefer
removing unused tests.

In @cortex/tutor/tools/deterministic/lesson_loader.py:
- Around line 131-149: The clear_cache function's behavior doesn't match its
docstring: clear_cache(package_name) currently overwrites the cache entry via
self.store.cache_lesson(package_name, {}, ttl_hours=0) (marking it
expired/unretrievable to get_cached_lesson) rather than deleting it, and
clear_cache(None) calls self.store.clear_expired_cache() which only removes
already-expired entries. Update the clear_cache docstring to say that passing a
package_name marks that package's cache as expired (using cache_lesson with
ttl_hours=0) and that the function returns 1 on successful marking or 0 on
error; also clarify that calling clear_cache(None) only removes entries whose
expires_at <= now and that the return value is whatever
self.store.clear_expired_cache() returns (number of entries removed). Ensure
references to the symbols clear_cache, self.store.cache_lesson,
self.store.clear_expired_cache, and get_cached_lesson are used in the docstring
to make behavior explicit.

In @docs/AI_TUTOR.md:
- Line 753: The documentation line "Current coverage: **87%** (266 tests)" is
stale; update that exact string to reflect the current test coverage percentage
and test count (replace both the percentage and the test count with the latest
values), and scan for other occurrences of the same string in the docs to keep
coverage numbers consistent; commit the updated line so the README/doc shows the
accurate "Current coverage: **X%** (Y tests)" value.
- Line 6: The coverage badge in the README currently shows 87% ("[![Test
Coverage](https://img.shields.io/badge/coverage-87%25-brightgreen.svg)](https://github.com/cortexlinux/cortex)"),
which conflicts with the PR's reported 85.61%; update the badge value to match
the actual coverage (replace "87%25" with "85.61%25") or switch to a dynamic
coverage badge that reflects the real-time metric so the percentage stays
accurate.

In @requirements.txt:
- Around line 27-31: The dependency version ranges in requirements (langchain,
langchain-anthropic, langgraph) are too loose and risk pulling breaking major
releases; update the entries for langchain, langchain-anthropic, and langgraph
to use compatible release specifiers (e.g., change langchain>=0.3.0 to an
approximate pin like langchain~=1.2.0, langchain-anthropic>=0.3.0 to
langchain-anthropic~=1.3.0, and langgraph>=0.2.0 to langgraph~=1.0.0) so you
allow patch/minor updates but prevent unintended major version jumps while
leaving pydantic as-is unless you want to similarly tighten it.
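Applied to requirements.txt, the suggested entries would read as follows; the versions are the reviewer's examples, so verify them against current releases before pinning:

```text
langchain~=1.2.0
langchain-anthropic~=1.3.0
langgraph~=1.0.0
```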
🧹 Nitpick comments (42)
cortex/tutor/tools/agentic/examples_provider.py (2)

7-7: Unused import.

Path from pathlib is imported but not used in this module.

🧹 Suggested fix
-from pathlib import Path
 from typing import Any, Dict, List, Optional

59-157: Consider reducing duplication between _run and _arun.

Both methods share identical prompt construction and response structuring logic. Extracting the common setup into a helper method would improve maintainability.

♻️ Optional refactor to reduce duplication
def _build_chain(self):
    """Build the prompt-LLM-parser chain."""
    prompt = ChatPromptTemplate.from_messages(
        [
            ("system", self._get_system_prompt()),
            ("human", self._get_generation_prompt()),
        ]
    )
    return prompt | self.llm | JsonOutputParser()

def _build_inputs(self, package_name, topic, difficulty, learning_style, existing_knowledge):
    """Build input dict for chain invocation."""
    return {
        "package_name": package_name,
        "topic": topic,
        "difficulty": difficulty,
        "learning_style": learning_style,
        "existing_knowledge": ", ".join(existing_knowledge or []) or "basics",
    }

def _run(self, package_name: str, topic: str, ...) -> Dict[str, Any]:
    try:
        chain = self._build_chain()
        inputs = self._build_inputs(package_name, topic, difficulty, learning_style, existing_knowledge)
        result = chain.invoke(inputs)
        examples = self._structure_response(result, package_name, topic)
        return {"success": True, "examples": examples, "cost_gbp": 0.01}
    except Exception as e:
        return {"success": False, "error": str(e), "examples": None}
requirements.txt (1)

25-26: Pre-existing duplicate PyYAML entries noted.

There are three PyYAML specifications in this file (lines 9, 21, and 25) with conflicting constraints (>=6.0.0 vs ==6.0.3). While not introduced by this PR, this could cause dependency resolution issues. Consider consolidating in a follow-up.

cortex/tutor/tests/test_branding.py (3)

8-9: Remove unused imports.

MagicMock and StringIO are imported but never used in this test module.

Proposed fix
-from unittest.mock import Mock, patch, MagicMock
-from io import StringIO
+from unittest.mock import Mock, patch

46-70: Consider adding output assertions for stronger tests.

These tests accept capsys but don't use it to verify the output. While they serve as smoke tests ensuring no exceptions are raised, they don't validate the actual output content.

For improved test quality, consider capturing and asserting on key output patterns:

def test_tutor_print_success(self, capsys):
    tutor_print("Test message", "success")
    captured = capsys.readouterr()
    assert "Test message" in captured.out

243-248: Clarify test intent and strengthen assertion.

The test name test_get_user_input_strips suggests testing strip behavior, but the comment says "get_user_input should return raw input" (no stripping). The assertion assert "spaced" in result passes regardless of whether input is stripped or not.

Clarify the intended behavior and use a precise assertion:

Proposed fix for raw input (no stripping)
     @patch("builtins.input", return_value="  spaced  ")
-    def test_get_user_input_strips(self, mock_input):
-        """Test input stripping is not done (raw input)."""
+    def test_get_user_input_returns_raw(self, mock_input):
+        """Test get_user_input returns raw input without stripping."""
         result = get_user_input("Enter value")
-        # Note: get_user_input should return raw input
-        assert "spaced" in result
+        assert result == "  spaced  "
cortex/tutor/tests/test_tutor_agent.py (3)

9-10: Remove unused imports.

tempfile and Path are imported but never used in this test module.

Proposed fix
-import tempfile
-from pathlib import Path

78-84: Floating-point comparison may be fragile.

Direct equality comparison state["cost_gbp"] == 0.03 on floats can fail due to precision issues; in IEEE 754 arithmetic 0.02 + 0.01 evaluates to 0.030000000000000002, not 0.03, so use pytest.approx for robustness:

assert state["cost_gbp"] == pytest.approx(0.03)

303-317: Consider splitting multiple exception assertions into separate tests.

This test validates two different input validation scenarios (empty package name and empty question) in a single test. If the first assertion fails, the second is never executed, potentially hiding additional issues.

Consider splitting into test_ask_validates_package_name and test_ask_validates_question for better test isolation.

cortex/tutor/tests/test_interactive_tutor.py (2)

8-8: Remove unused import.

call is imported from unittest.mock but never used in this test module.

Proposed fix
-from unittest.mock import Mock, patch, MagicMock, call
+from unittest.mock import Mock, patch, MagicMock

246-262: Consider adding behavior assertions for None lesson handling.

The test verifies that methods don't raise exceptions when lesson is None, which is valuable. However, adding assertions about the expected behavior (e.g., verifying that appropriate "No content" messages are printed or that methods return early) would strengthen the test.

cortex/tutor/prompts/tools/lesson_generator.md (2)

66-74: Add language identifier to fenced code block.

Per static analysis (markdownlint MD040), this code block should specify a language. Since it's JSON, add json after the opening fence.

-```
+```json
 {
   "package_name": "name of the package to teach",

104-132: Consider adding a language identifier for the workflow code block.

This fenced code block (starting at Line 104) lacks a language specification. Since it's pseudo-code/plaintext describing the workflow, you could use text or plaintext as the identifier to satisfy linting rules.

cortex/tutor/tests/test_integration.py (1)

207-216: Consider adding minimal assertions or removing unused capsys.

The capsys fixture is captured but not used in the assertions. If verifying Rich console output is impractical, consider removing the capture or adding a comment explaining the intentional no-op check.

♻️ Suggested refactor
     def test_tutor_print_success(self, capsys):
         """Test tutor_print with success status."""
         tutor_print("Test message", "success")
-        captured = capsys.readouterr()
-        # Rich console output is complex, just ensure no errors
+        # Rich console output is complex; this test verifies no exceptions are raised

     def test_tutor_print_error(self, capsys):
         """Test tutor_print with error status."""
         tutor_print("Error message", "error")
-        captured = capsys.readouterr()
+        # Verifies no exceptions are raised with error status
cortex/tutor/tests/test_validators.py (2)

216-219: Type hint inconsistency: sanitize_input accepts None but typed as str.

The test correctly verifies that sanitize_input(None) returns "", but the function signature in validators.py is typed as sanitize_input(input_text: str). Consider updating the type hint to Optional[str] for consistency, or document this behavior.


240-243: Same type consideration for extract_package_name(None).

Similar to sanitize_input, this tests None input while the function signature expects str.
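A minimal sketch of the loosened signature; only the None-handling is shown, and the real sanitizer in validators.py does more than strip:

```python
from typing import Optional


def sanitize_input(input_text: Optional[str]) -> str:
    """Normalize user input; None is tolerated and becomes the empty string."""
    if input_text is None:
        return ""
    return input_text.strip()  # the PR's version applies further sanitization
```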

cortex/tutor/tests/test_deterministic_tools.py (1)

99-108: Consider using tmp_path fixture instead of custom temp_db.

Pytest provides a built-in tmp_path fixture that handles cleanup automatically. The current approach works, but using tmp_path would be slightly more idiomatic.

♻️ Optional refactor
-    @pytest.fixture
-    def temp_db(self):
-        """Create a temporary database."""
-        with tempfile.TemporaryDirectory() as tmpdir:
-            yield Path(tmpdir) / "test.db"
+    @pytest.fixture
+    def temp_db(self, tmp_path):
+        """Create a temporary database."""
+        return tmp_path / "test.db"
cortex/tutor/tools/__init__.py (1)

7-11: Consider adding LessonLoaderTool to parent package exports for API consistency.

The cortex/tutor/tools/deterministic/__init__.py exports LessonLoaderTool, but this parent package's __all__ does not include it. Since other deterministic tools (ProgressTrackerTool, validate_package_name, validate_input) are re-exported here, LessonLoaderTool should also be added for a complete and consistent public API. This allows users to import it as from cortex.tutor.tools import LessonLoaderTool alongside other deterministic tools.
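A minimal sketch of the consistent re-export; only the deterministic names mentioned in the comment are shown, and the agentic exports already present in the file are elided:

```python
from cortex.tutor.tools.deterministic import (
    LessonLoaderTool,  # newly re-exported for API consistency
    ProgressTrackerTool,
    validate_input,
    validate_package_name,
)

__all__ = [
    "LessonLoaderTool",
    "ProgressTrackerTool",
    "validate_input",
    "validate_package_name",
    # ... agentic tool exports unchanged
]
```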

cortex/tutor/tests/test_progress_tracker.py (1)

27-44: Consider adding return type hints to fixtures.

Per coding guidelines, type hints are required. The fixtures should specify return types; note that temp_db yields, so its annotation is Iterator[Path] rather than Path.

♻️ Suggested improvement
+from typing import Iterator
+
 @pytest.fixture
-def temp_db():
+def temp_db() -> Iterator[Path]:
     """Create a temporary database for testing."""
     with tempfile.TemporaryDirectory() as tmpdir:
         db_path = Path(tmpdir) / "test_progress.db"
         yield db_path


 @pytest.fixture
-def store(temp_db):
+def store(temp_db: Path) -> SQLiteStore:
     """Create a SQLite store with temp database."""
     return SQLiteStore(temp_db)


 @pytest.fixture
-def tracker(temp_db):
+def tracker(temp_db: Path) -> ProgressTrackerTool:
     """Create a progress tracker with temp database."""
     return ProgressTrackerTool(temp_db)
cortex/tutor/tests/test_agent_methods.py (1)

7-10: Remove unused imports.

tempfile and Path are imported but never used in this test file.

♻️ Proposed fix
 import pytest
 from unittest.mock import Mock, patch, MagicMock
-import tempfile
-from pathlib import Path
cortex/tutor/tests/test_cli.py (1)

10-10: Remove unused import.

StringIO is imported but never used in this file.

♻️ Proposed fix
-from io import StringIO
cortex/tutor/tests/test_tools.py (2)

8-8: Remove unused import.

MagicMock is imported but never used; only Mock is used.

♻️ Proposed fix
-from unittest.mock import Mock, patch, MagicMock
+from unittest.mock import Mock, patch

276-307: Fragile test pattern using __new__ to bypass __init__.

Using ConversationHandler.__new__() and manually setting attributes bypasses the constructor, which can lead to tests that don't catch initialization bugs. Consider properly mocking the config dependency and instantiating normally.

♻️ Suggested improvement
     def test_build_context_empty(self):
         """Test context building with empty history."""
-        with patch("cortex.tutor.tools.agentic.qa_handler.get_config"):
-            handler = ConversationHandler.__new__(ConversationHandler)
-            handler.history = []
+        with patch("cortex.tutor.tools.agentic.qa_handler.get_config") as mock_config:
+            mock_config.return_value = Mock(
+                anthropic_api_key="test_key",
+                model="claude-sonnet-4-20250514",
+            )
+            with patch("cortex.tutor.tools.agentic.qa_handler.ChatAnthropic"):
+                handler = ConversationHandler()
+                handler.history = []  # Reset for test
 
             context = handler._build_context()
             assert "Starting fresh" in context
cortex/tutor/agents/tutor_agent/tutor_agent.py (2)

248-258: Unused current_step attribute and missing force_fresh support.

The current_step attribute (line 258) is initialized but never used in the class. Additionally, the CLI's --fresh flag (passed to cmd_teach) is not propagated to InteractiveTutor, so users cannot force fresh content generation in interactive mode.

Consider either:

  1. Removing current_step if not needed
  2. Adding force_fresh: bool = False parameter to __init__ and passing it to self.agent.teach()
♻️ Suggested fix to support force_fresh
-    def __init__(self, package_name: str) -> None:
+    def __init__(self, package_name: str, force_fresh: bool = False) -> None:
         """
         Initialize interactive tutor for a package.

         Args:
             package_name: Package to learn.
+            force_fresh: Skip cache and generate fresh content.
         """
         self.package_name = package_name
         self.agent = TutorAgent(verbose=False)
         self.lesson: Optional[Dict[str, Any]] = None
-        self.current_step = 0
+        self.force_fresh = force_fresh

439-446: Consider extracting the magic number 5 as a constant.

The default topic count of 5 (line 443) is used as a fallback when stats.get("total", 0) is 0. While the inline comment explains it, extracting this as a named constant would improve clarity.

cortex/tutor/tools/deterministic/lesson_loader.py (1)

100-106: Consider async SQLite operations in the future.

The async method currently delegates to the synchronous implementation. This is acceptable for fast SQLite operations but could be improved with aiosqlite if async performance becomes important.

cortex/tutor/tools/agentic/lesson_generator.py (3)

1-19: Unused imports from lesson_context.

LessonContext, CodeExample, and TutorialStep (line 18) are imported but not used in this module. The _structure_response method returns a plain Dict rather than using these Pydantic models for validation.

Consider either:

  1. Removing unused imports
  2. Using LessonContext.model_validate() in _structure_response to leverage Pydantic validation

21-36: _load_prompt_template is defined but never called.

The function loads a prompt template from a file but is never invoked anywhere in this module. The actual prompts are generated inline by _get_system_prompt() and _get_generation_prompt().

♻️ Consider removing unused function
-# Load prompt template
-def _load_prompt_template() -> str:
-    """Load the lesson generator prompt from file."""
-    prompt_path = Path(__file__).parent.parent.parent / "prompts" / "tools" / "lesson_generator.md"
-    if prompt_path.exists():
-        return prompt_path.read_text()
-    # Fallback inline prompt
-    return """You are a lesson content generator. Generate comprehensive educational content
-    for the package: {package_name}
-
-    Student level: {student_level}
-    Learning style: {learning_style}
-    Focus areas: {focus_areas}
-
-    Return a JSON object with: summary, explanation, use_cases, best_practices,
-    code_examples, tutorial_steps, installation_command, confidence."""
-
-

142-184: Consider extracting shared prompt construction logic.

The _arun method duplicates the prompt construction from _run (lines 105-110 vs 152-157). Extracting this to a helper method would reduce duplication.

cortex/tutor/contracts/progress_context.py (1)

98-100: datetime.utcnow() is deprecated; use timezone-aware datetime.

datetime.utcnow() is deprecated since Python 3.12. Use datetime.now(timezone.utc) for timezone-aware UTC timestamps.

♻️ Suggested fix
+from datetime import datetime, timezone
-from datetime import datetime
 ...
     last_updated: datetime = Field(
-        default_factory=datetime.utcnow, description="Last update timestamp"
+        default_factory=lambda: datetime.now(timezone.utc), description="Last update timestamp (UTC)"
     )
cortex/tutor/agents/tutor_agent/graph.py (1)

1-25: Unused import: ExamplesProviderTool.

ExamplesProviderTool (line 24) is imported but never used in this module.

♻️ Remove unused import
 from cortex.tutor.tools.agentic.qa_handler import QAHandlerTool
-from cortex.tutor.tools.agentic.examples_provider import ExamplesProviderTool
cortex/tutor/contracts/lesson_context.py (1)

99-101: datetime.utcnow() is deprecated.

Same issue as in progress_context.py. Use datetime.now(timezone.utc) for timezone-aware UTC timestamps.

cortex/tutor/tools/deterministic/validators.py (1)

40-49: Unused constant: KNOWN_PACKAGE_CATEGORIES.

This constant is defined but never referenced anywhere in the module. Consider removing it or documenting its intended future use.

🔧 Suggested fix

Either remove the unused constant:

-# Common package categories for validation hints
-KNOWN_PACKAGE_CATEGORIES = [
-    "system",  # apt, systemctl, journalctl
-    "development",  # git, docker, npm, pip
-    "database",  # postgresql, mysql, redis, mongodb
-    "web",  # nginx, apache, curl, wget
-    "security",  # ufw, fail2ban, openssl
-    "networking",  # ssh, netstat, iptables
-    "utilities",  # vim, tmux, htop, grep
-]

Or add a TODO comment explaining its intended use:

+# TODO: Use for package categorization hints in future validation enhancements
 KNOWN_PACKAGE_CATEGORIES = [
cortex/tutor/tools/deterministic/progress_tracker.py (4)

7-10: Unused import: Path.

The Path type is imported but never used in this module.

🔧 Suggested fix
 from datetime import datetime
-from pathlib import Path
 from typing import Any, Dict, List, Optional

115-117: Async method delegates to sync - no actual async benefit.

The _arun method simply calls the synchronous _run, which blocks the event loop. This is acceptable if the underlying SQLite operations are fast, but consider noting this limitation in the docstring.

📝 Suggested docstring improvement
     async def _arun(self, *args: Any, **kwargs: Any) -> Dict[str, Any]:
-        """Async version - delegates to sync implementation."""
+        """Async version - delegates to sync implementation.
+
+        Note: SQLite operations are synchronous. For true async, consider
+        aiosqlite in future iterations.
+        """
         return self._run(*args, **kwargs)

258-280: Potential parameter shadowing issue.

The line concept = kwargs.get("concept") or concept is dead code: a concept keyword argument binds to the named parameter, never to **kwargs, so kwargs.get("concept") is always None.

🔧 Suggested simplification
     def _add_mastered_concept(
         self,
         concept: Optional[str] = None,
         **kwargs: Any,
     ) -> Dict[str, Any]:
         """Add a mastered concept to student profile."""
-        concept = kwargs.get("concept") or concept
         if not concept:
             return {"success": False, "error": "concept required"}
         self.store.add_mastered_concept(concept)
         return {"success": True, "message": f"Added mastered concept: {concept}"}

     def _add_weak_concept(
         self,
         concept: Optional[str] = None,
         **kwargs: Any,
     ) -> Dict[str, Any]:
         """Add a weak concept to student profile."""
-        concept = kwargs.get("concept") or concept
         if not concept:
             return {"success": False, "error": "concept required"}
         self.store.add_weak_concept(concept)
         return {"success": True, "message": f"Added weak concept: {concept}"}

301-346: Convenience functions create new tool instances on each call.

Each call instantiates a new ProgressTrackerTool, which opens a new database connection. For repeated operations, this is inefficient. Consider documenting this trade-off or providing a module-level singleton option.

📝 Suggested improvement

Add a note about usage patterns:

 # Convenience functions for direct usage
+# Note: Each function creates a new tool instance. For batch operations,
+# instantiate ProgressTrackerTool once and reuse it.


 def get_learning_progress(package_name: str, topic: str) -> Optional[Dict[str, Any]]:

Or provide a singleton pattern:

_default_tool: Optional[ProgressTrackerTool] = None

def _get_tool() -> ProgressTrackerTool:
    global _default_tool
    if _default_tool is None:
        _default_tool = ProgressTrackerTool()
    return _default_tool
cortex/tutor/tools/agentic/qa_handler.py (4)

7-8: Unused import: Path.

The Path type is imported but never used in this module.

🔧 Suggested fix
-from pathlib import Path
 from typing import Any, Dict, List, Optional

59-162: Significant code duplication between _run and _arun.

The sync and async methods share nearly identical logic. Extract the common setup into a helper method.

♻️ Suggested refactor
+    def _build_chain(self):
+        """Build the prompt chain for Q&A."""
+        prompt = ChatPromptTemplate.from_messages(
+            [
+                ("system", self._get_system_prompt()),
+                ("human", self._get_qa_prompt()),
+            ]
+        )
+        return prompt | self.llm | JsonOutputParser()
+
+    def _build_invoke_params(
+        self,
+        package_name: str,
+        question: str,
+        learning_style: str,
+        mastered_concepts: Optional[List[str]],
+        weak_concepts: Optional[List[str]],
+        lesson_context: Optional[str],
+    ) -> Dict[str, Any]:
+        """Build parameters for chain invocation."""
+        return {
+            "package_name": package_name,
+            "question": question,
+            "learning_style": learning_style,
+            "mastered_concepts": ", ".join(mastered_concepts or []) or "none specified",
+            "weak_concepts": ", ".join(weak_concepts or []) or "none specified",
+            "lesson_context": lesson_context or "starting fresh",
+        }
+
     def _run(self, ...) -> Dict[str, Any]:
         try:
-            prompt = ChatPromptTemplate.from_messages(...)
-            chain = prompt | self.llm | JsonOutputParser()
-            result = chain.invoke({...})
+            chain = self._build_chain()
+            params = self._build_invoke_params(...)
+            result = chain.invoke(params)
             # ... rest unchanged

105-109: Hardcoded cost estimate may become inaccurate.

The cost_gbp: 0.02 is static and may not reflect actual API costs as pricing changes.

📝 Suggested improvement

Consider computing based on token usage or making it configurable:

+            # Note: Estimated cost; actual varies by token usage
             return {
                 "success": True,
                 "answer": answer,
-                "cost_gbp": 0.02,
+                "cost_estimate_gbp": 0.02,  # Approximate based on ~2K tokens
             }

268-344: Unbounded conversation history growth.

The history list grows indefinitely during long sessions. While _build_context only uses the last 3 exchanges, the full history is retained in memory.

🔧 Suggested fix to cap history size
     def ask(
         self,
         question: str,
         ...
     ) -> Dict[str, Any]:
         ...
         # Update history
         if result.get("success"):
             self.history.append(
                 {
                     "question": question,
                     "answer": result["answer"].get("answer", ""),
                 }
             )
+            # Keep history bounded to prevent memory growth
+            max_history = 20  # or make configurable
+            if len(self.history) > max_history:
+                self.history = self.history[-max_history:]

         return result
cortex/tutor/memory/sqlite_store.py (1)

41-48: Replace mutable default values with Field(default_factory=list).

Pydantic recommends using Field(default_factory=list) for list defaults to ensure each instance gets a fresh list and to maintain consistency across Pydantic versions. While Pydantic does deepcopy mutable defaults, explicitly using default_factory is the clearer, future-proof approach.

🔧 Suggested fix
+from pydantic import BaseModel, Field
-from pydantic import BaseModel


 class StudentProfile(BaseModel):
     """Model for student profile."""

     id: Optional[int] = None
-    mastered_concepts: List[str] = []
-    weak_concepts: List[str] = []
+    mastered_concepts: List[str] = Field(default_factory=list)
+    weak_concepts: List[str] = Field(default_factory=list)
     learning_style: str = "reading"  # visual, reading, hands-on
     last_session: Optional[str] = None
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 337522a and 1fbed6a.

📒 Files selected for processing (42)
  • cortex/cli.py
  • cortex/tutor/__init__.py
  • cortex/tutor/agents/__init__.py
  • cortex/tutor/agents/tutor_agent/__init__.py
  • cortex/tutor/agents/tutor_agent/graph.py
  • cortex/tutor/agents/tutor_agent/state.py
  • cortex/tutor/agents/tutor_agent/tutor_agent.py
  • cortex/tutor/branding.py
  • cortex/tutor/cli.py
  • cortex/tutor/config.py
  • cortex/tutor/contracts/__init__.py
  • cortex/tutor/contracts/lesson_context.py
  • cortex/tutor/contracts/progress_context.py
  • cortex/tutor/memory/__init__.py
  • cortex/tutor/memory/sqlite_store.py
  • cortex/tutor/prompts/agents/tutor/system.md
  • cortex/tutor/prompts/tools/examples_provider.md
  • cortex/tutor/prompts/tools/lesson_generator.md
  • cortex/tutor/prompts/tools/qa_handler.md
  • cortex/tutor/tests/__init__.py
  • cortex/tutor/tests/test_agent_methods.py
  • cortex/tutor/tests/test_agentic_tools.py
  • cortex/tutor/tests/test_branding.py
  • cortex/tutor/tests/test_cli.py
  • cortex/tutor/tests/test_deterministic_tools.py
  • cortex/tutor/tests/test_integration.py
  • cortex/tutor/tests/test_interactive_tutor.py
  • cortex/tutor/tests/test_progress_tracker.py
  • cortex/tutor/tests/test_tools.py
  • cortex/tutor/tests/test_tutor_agent.py
  • cortex/tutor/tests/test_validators.py
  • cortex/tutor/tools/__init__.py
  • cortex/tutor/tools/agentic/__init__.py
  • cortex/tutor/tools/agentic/examples_provider.py
  • cortex/tutor/tools/agentic/lesson_generator.py
  • cortex/tutor/tools/agentic/qa_handler.py
  • cortex/tutor/tools/deterministic/__init__.py
  • cortex/tutor/tools/deterministic/lesson_loader.py
  • cortex/tutor/tools/deterministic/progress_tracker.py
  • cortex/tutor/tools/deterministic/validators.py
  • docs/AI_TUTOR.md
  • requirements.txt
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Follow PEP 8 style guide
Type hints required in Python code
Docstrings required for all public APIs

Files:

  • cortex/tutor/agents/__init__.py
  • cortex/tutor/agents/tutor_agent/state.py
  • cortex/tutor/agents/tutor_agent/__init__.py
  • cortex/tutor/tests/__init__.py
  • cortex/tutor/memory/__init__.py
  • cortex/tutor/tools/agentic/__init__.py
  • cortex/tutor/tests/test_branding.py
  • cortex/tutor/tests/test_validators.py
  • cortex/tutor/tests/test_agent_methods.py
  • cortex/tutor/tests/test_cli.py
  • cortex/tutor/config.py
  • cortex/tutor/tools/__init__.py
  • cortex/tutor/tests/test_tools.py
  • cortex/tutor/tools/deterministic/__init__.py
  • cortex/cli.py
  • cortex/tutor/tests/test_tutor_agent.py
  • cortex/tutor/cli.py
  • cortex/tutor/tests/test_interactive_tutor.py
  • cortex/tutor/tests/test_progress_tracker.py
  • cortex/tutor/contracts/progress_context.py
  • cortex/tutor/contracts/lesson_context.py
  • cortex/tutor/tests/test_agentic_tools.py
  • cortex/tutor/tests/test_integration.py
  • cortex/tutor/branding.py
  • cortex/tutor/agents/tutor_agent/graph.py
  • cortex/tutor/tools/agentic/examples_provider.py
  • cortex/tutor/tools/agentic/qa_handler.py
  • cortex/tutor/tools/deterministic/lesson_loader.py
  • cortex/tutor/agents/tutor_agent/tutor_agent.py
  • cortex/tutor/tools/deterministic/progress_tracker.py
  • cortex/tutor/tests/test_deterministic_tools.py
  • cortex/tutor/contracts/__init__.py
  • cortex/tutor/__init__.py
  • cortex/tutor/tools/agentic/lesson_generator.py
  • cortex/tutor/tools/deterministic/validators.py
  • cortex/tutor/memory/sqlite_store.py
{setup.py,setup.cfg,pyproject.toml,**/__init__.py}

📄 CodeRabbit inference engine (AGENTS.md)

Use Python 3.10 or higher as the minimum supported version

Files:

  • cortex/tutor/agents/__init__.py
  • cortex/tutor/agents/tutor_agent/__init__.py
  • cortex/tutor/tests/__init__.py
  • cortex/tutor/memory/__init__.py
  • cortex/tutor/tools/agentic/__init__.py
  • cortex/tutor/tools/__init__.py
  • cortex/tutor/tools/deterministic/__init__.py
  • cortex/tutor/contracts/__init__.py
  • cortex/tutor/__init__.py
🧠 Learnings (2)
📚 Learning: 2025-12-11T12:03:24.071Z
Learnt from: CR
Repo: cortexlinux/cortex PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-11T12:03:24.071Z
Learning: Applies to **/*.py : Type hints required in Python code

Applied to files:

  • requirements.txt
📚 Learning: 2025-12-11T12:03:24.071Z
Learnt from: CR
Repo: cortexlinux/cortex PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-11T12:03:24.071Z
Learning: Applies to tests/**/*.py : Maintain >80% test coverage for pull requests

Applied to files:

  • cortex/tutor/tests/__init__.py
🧬 Code graph analysis (20)
cortex/tutor/agents/__init__.py (2)
cortex/cli.py (1)
  • tutor (1016-1074)
cortex/tutor/agents/tutor_agent/tutor_agent.py (1)
  • TutorAgent (20-238)
cortex/tutor/agents/tutor_agent/__init__.py (2)
cortex/tutor/agents/tutor_agent/state.py (1)
  • TutorAgentState (50-104)
cortex/tutor/agents/tutor_agent/tutor_agent.py (2)
  • TutorAgent (20-238)
  • InteractiveTutor (241-447)
cortex/tutor/memory/__init__.py (1)
cortex/tutor/memory/sqlite_store.py (1)
  • SQLiteStore (51-531)
cortex/tutor/tools/agentic/__init__.py (3)
cortex/tutor/tools/agentic/lesson_generator.py (1)
  • LessonGeneratorTool (39-303)
cortex/tutor/tools/agentic/examples_provider.py (1)
  • ExamplesProviderTool (19-236)
cortex/tutor/tools/agentic/qa_handler.py (1)
  • QAHandlerTool (19-241)
cortex/tutor/tests/test_validators.py (1)
cortex/tutor/tools/deterministic/validators.py (11)
  • validate_package_name (52-94)
  • validate_input (97-136)
  • validate_question (139-149)
  • validate_topic (152-174)
  • validate_score (177-193)
  • validate_learning_style (196-211)
  • sanitize_input (214-237)
  • extract_package_name (240-278)
  • get_validation_errors (281-321)
  • validate_all (349-368)
  • ValidationResult (324-346)
cortex/tutor/tests/test_agent_methods.py (2)
cortex/tutor/agents/tutor_agent/graph.py (10)
  • plan_node (30-87)
  • load_cache_node (90-109)
  • generate_lesson_node (112-154)
  • qa_node (157-199)
  • reflect_node (202-251)
  • fail_node (254-275)
  • route_after_plan (281-297)
  • route_after_act (300-311)
  • create_tutor_graph (317-373)
  • get_tutor_graph (380-390)
cortex/tutor/agents/tutor_agent/tutor_agent.py (8)
  • TutorAgent (20-238)
  • teach (47-94)
  • ask (96-139)
  • get_profile (155-162)
  • update_learning_style (164-175)
  • mark_completed (177-195)
  • reset_progress (197-208)
  • get_packages_studied (210-218)
cortex/tutor/tests/test_cli.py (1)
cortex/tutor/cli.py (7)
  • create_parser (33-107)
  • cmd_teach (110-150)
  • cmd_question (153-208)
  • cmd_list_packages (211-238)
  • cmd_progress (241-311)
  • cmd_reset (314-345)
  • main (348-386)
cortex/tutor/config.py (6)
cortex/tutor/tools/agentic/examples_provider.py (1)
  • Config (39-40)
cortex/tutor/tools/agentic/lesson_generator.py (1)
  • Config (62-63)
cortex/tutor/tools/agentic/qa_handler.py (1)
  • Config (39-40)
cortex/tutor/tools/deterministic/lesson_loader.py (1)
  • Config (35-36)
cortex/tutor/tools/deterministic/progress_tracker.py (1)
  • Config (45-46)
cortex/logging_system.py (1)
  • debug (196-198)
cortex/tutor/tools/__init__.py (5)
cortex/tutor/tools/deterministic/progress_tracker.py (1)
  • ProgressTrackerTool (23-295)
cortex/tutor/tools/deterministic/validators.py (2)
  • validate_package_name (52-94)
  • validate_input (97-136)
cortex/tutor/tools/agentic/lesson_generator.py (1)
  • LessonGeneratorTool (39-303)
cortex/tutor/tools/agentic/examples_provider.py (1)
  • ExamplesProviderTool (19-236)
cortex/tutor/tools/agentic/qa_handler.py (1)
  • QAHandlerTool (19-241)
cortex/tutor/tests/test_tools.py (1)
cortex/tutor/tools/deterministic/lesson_loader.py (4)
  • LessonLoaderTool (18-149)
  • get_fallback_lesson (227-237)
  • load_lesson_with_fallback (240-276)
  • _run (51-98)
cortex/tutor/tools/deterministic/__init__.py (3)
cortex/tutor/tools/deterministic/progress_tracker.py (1)
  • ProgressTrackerTool (23-295)
cortex/tutor/tools/deterministic/validators.py (2)
  • validate_package_name (52-94)
  • validate_input (97-136)
cortex/tutor/tools/deterministic/lesson_loader.py (1)
  • LessonLoaderTool (18-149)
cortex/tutor/tests/test_tutor_agent.py (3)
cortex/tutor/tools/agentic/qa_handler.py (1)
  • _run (59-116)
cortex/tutor/tools/deterministic/lesson_loader.py (1)
  • _run (51-98)
cortex/tutor/config.py (1)
  • reset_config (144-147)
cortex/tutor/cli.py (5)
cortex/tutor/branding.py (8)
  • print_banner (53-69)
  • tutor_print (35-50)
  • print_table (162-180)
  • print_progress_summary (143-159)
  • print_error_panel (238-251)
  • print_success_panel (254-267)
  • get_user_input (194-216)
  • print_code_example (102-115)
cortex/tutor/tools/deterministic/validators.py (1)
  • validate_package_name (52-94)
cortex/tutor/config.py (3)
  • Config (19-121)
  • from_env (60-89)
  • get_db_path (101-109)
cortex/tutor/memory/sqlite_store.py (5)
  • SQLiteStore (51-531)
  • get_packages_studied (520-531)
  • get_completion_stats (257-286)
  • get_all_progress (171-203)
  • reset_progress (500-518)
cortex/tutor/agents/tutor_agent/tutor_agent.py (6)
  • InteractiveTutor (241-447)
  • start (260-326)
  • TutorAgent (20-238)
  • ask (96-139)
  • get_packages_studied (210-218)
  • reset_progress (197-208)
cortex/tutor/tests/test_interactive_tutor.py (2)
cortex/tutor/tools/agentic/qa_handler.py (1)
  • ask (286-327)
cortex/tutor/memory/sqlite_store.py (1)
  • get_progress (142-169)
cortex/tutor/tests/test_agentic_tools.py (3)
cortex/tutor/tools/agentic/lesson_generator.py (2)
  • LessonGeneratorTool (39-303)
  • _structure_response (251-303)
cortex/tutor/tools/agentic/examples_provider.py (1)
  • _structure_response (208-236)
cortex/tutor/tools/agentic/qa_handler.py (5)
  • _structure_response (215-241)
  • QAHandlerTool (19-241)
  • ConversationHandler (268-344)
  • _build_context (329-340)
  • clear_history (342-344)
cortex/tutor/tests/test_integration.py (5)
cortex/tutor/config.py (5)
  • Config (19-121)
  • get_config (128-141)
  • reset_config (144-147)
  • from_env (60-89)
  • ensure_data_dir (91-99)
cortex/tutor/branding.py (2)
  • tutor_print (35-50)
  • print_banner (53-69)
cortex/tutor/contracts/lesson_context.py (6)
  • LessonContext (36-132)
  • CodeExample (13-21)
  • TutorialStep (24-33)
  • to_json (103-105)
  • from_json (108-110)
  • to_display_dict (120-132)
cortex/tutor/contracts/progress_context.py (8)
  • ProgressContext (66-143)
  • PackageProgress (23-63)
  • TopicProgress (13-20)
  • completion_percentage (33-38)
  • average_score (42-46)
  • is_complete (61-63)
  • get_next_topic (54-59)
  • get_recommendations (115-131)
cortex/tutor/cli.py (1)
  • create_parser (33-107)
cortex/tutor/contracts/__init__.py (2)
cortex/tutor/contracts/lesson_context.py (1)
  • LessonContext (36-132)
cortex/tutor/contracts/progress_context.py (1)
  • ProgressContext (66-143)
cortex/tutor/__init__.py (2)
cortex/tutor/config.py (1)
  • Config (19-121)
cortex/tutor/branding.py (1)
  • tutor_print (35-50)
cortex/tutor/tools/agentic/lesson_generator.py (2)
cortex/tutor/config.py (2)
  • get_config (128-141)
  • Config (19-121)
cortex/tutor/contracts/lesson_context.py (3)
  • LessonContext (36-132)
  • CodeExample (13-21)
  • TutorialStep (24-33)
cortex/tutor/memory/sqlite_store.py (5)
cortex/tutor/contracts/progress_context.py (1)
  • total_time_seconds (50-52)
cortex/tutor/agents/tutor_agent/tutor_agent.py (3)
  • get_progress (141-153)
  • reset_progress (197-208)
  • get_packages_studied (210-218)
cortex/tutor/tools/deterministic/progress_tracker.py (1)
  • mark_topic_completed (317-331)
cortex/hwprofiler.py (1)
  • profile (421-448)
cortex/tutor/tools/deterministic/lesson_loader.py (1)
  • cache_lesson (108-129)
🪛 markdownlint-cli2 (0.18.1)
docs/AI_TUTOR.md

92-92: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


250-250: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


310-310: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


362-362: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


401-401: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


443-443: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

cortex/tutor/prompts/tools/lesson_generator.md

66-66: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


97-97: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

Comment on lines 110 to 150
```python
def cmd_teach(package: str, verbose: bool = False, fresh: bool = False) -> int:
    """
    Start an interactive tutoring session.

    Args:
        package: Package name to teach.
        verbose: Enable verbose output.
        fresh: Skip cache.

    Returns:
        Exit code (0 for success, 1 for failure).
    """
    # Validate package name
    is_valid, error = validate_package_name(package)
    if not is_valid:
        print_error_panel(f"Invalid package name: {error}")
        return 1

    try:
        # Lazy import - only load when needed (requires API key)
        from cortex.tutor.agents.tutor_agent import InteractiveTutor

        # Start interactive tutor
        interactive = InteractiveTutor(package)
        interactive.start()
        return 0

    except ValueError as e:
        print_error_panel(str(e))
        return 1
    except KeyboardInterrupt:
        console.print()
        tutor_print("Session ended", "info")
        return 0
    except Exception as e:
        print_error_panel(f"An error occurred: {e}")
        if verbose:
            import traceback

            console.print_exception()
        return 1
```

⚠️ Potential issue | 🟡 Minor

--fresh flag is not propagated to InteractiveTutor.

The fresh parameter (line 110) is accepted but never passed to InteractiveTutor (line 133). The flag has no effect in interactive mode.

This connects to the earlier observation that InteractiveTutor.__init__ doesn't accept a force_fresh parameter.

🔧 Suggested fix
 def cmd_teach(package: str, verbose: bool = False, fresh: bool = False) -> int:
     ...
     try:
         from cortex.tutor.agents.tutor_agent import InteractiveTutor

-        interactive = InteractiveTutor(package)
+        interactive = InteractiveTutor(package, force_fresh=fresh)
         interactive.start()

This requires updating InteractiveTutor.__init__ as noted in the tutor_agent.py review.
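
💡 Hypothetical sketch of the matching __init__ change

For reference, a minimal sketch of how InteractiveTutor.__init__ could accept the flag; attribute names other than force_fresh are assumptions, and the real initializer does more setup:

class InteractiveTutor:
    def __init__(self, package_name: str, force_fresh: bool = False) -> None:
        self.package_name = package_name
        # Assumption: lesson loading later checks this flag to bypass the cache
        self.force_fresh = force_fresh
        # ... remaining setup (agent, lesson state) unchanged ...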

🤖 Prompt for AI Agents
In @cortex/tutor/cli.py around lines 110 - 150, The cmd_teach function accepts a
fresh flag but never passes it to the InteractiveTutor; update cmd_teach to
forward the fresh flag when constructing InteractiveTutor (e.g.,
InteractiveTutor(package, force_fresh=fresh) or matching param name), and update
InteractiveTutor.__init__ in cortex.tutor.agents.tutor_agent to accept and store
the corresponding parameter (force_fresh or fresh) so the interactive session
honors the flag.

Comment on lines 103 to 115
@patch.dict("os.environ", {"ANTHROPIC_API_KEY": "sk-ant-test-key"})
@patch("cortex.tutor.agents.tutor_agent.InteractiveTutor")
def test_successful_teach(self, mock_tutor_class):
"""Test successful teach session."""
from cortex.tutor.config import reset_config
reset_config() # Reset config singleton

mock_tutor = Mock()
mock_tutor_class.return_value = mock_tutor

result = cmd_teach("docker")
assert result == 0
mock_tutor.start.assert_called_once()
@coderabbitai coderabbitai bot Jan 12, 2026
⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check how InteractiveTutor is imported/used in cli.py
rg -n "InteractiveTutor" cortex/tutor/cli.py -A 2 -B 2

Repository: cortexlinux/cortex

Length of output: 364


🏁 Script executed:

#!/bin/bash
# Check the implementation of cmd_teach function
rg -n "def cmd_teach" cortex/tutor/cli.py -A 20

Repository: cortexlinux/cortex

Length of output: 789


🏁 Script executed:

#!/bin/bash
# Check the exact patch decorator in test_cli.py
sed -n '103,115p' cortex/tutor/tests/test_cli.py

Repository: cortexlinux/cortex

Length of output: 589


Adjust patch target for InteractiveTutor mocking.

Since InteractiveTutor is imported locally within the cmd_teach function (line 130 of cortex/tutor/cli.py), the patch must target where the name is looked up, not where the class is defined. Change the patch from @patch("cortex.tutor.agents.tutor_agent.InteractiveTutor") to @patch("cortex.tutor.cli.InteractiveTutor") to correctly mock the class instance used by cmd_teach.
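
💡 General illustration (hypothetical module, not the cortex code)

A minimal sketch of the patch-where-it-is-looked-up rule, assuming a hypothetical app.py with a module-level from lib import Thing:

# app.py (hypothetical)
from lib import Thing

def use_thing() -> str:
    return Thing().go()

# test_app.py - patch the name in the namespace where app resolves it
from unittest.mock import patch

import app

@patch("app.Thing")
def test_use_thing(mock_thing):
    mock_thing.return_value.go.return_value = "ok"
    assert app.use_thing() == "ok"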

🤖 Prompt for AI Agents
In @cortex/tutor/tests/test_cli.py around lines 103 - 115, The test mocks the
wrong import location for InteractiveTutor so cmd_teach still uses the real
class; update the patch target to where cmd_teach looks up the name by replacing
@patch("cortex.tutor.agents.tutor_agent.InteractiveTutor") with
@patch("cortex.tutor.cli.InteractiveTutor") in the test (test_successful_teach)
so the mocked InteractiveTutor instance is injected when cmd_teach("docker")
constructs and calls InteractiveTutor.start.

Collaborator

@srikrishnavansi Please address this one as well.

Contributor

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

@srikrishnavansi
Contributor Author

recheck

Collaborator

@Anshgrover23 Anshgrover23 left a comment

@srikrishnavansi Please follow the contributing.md guidelines (i.e., add a demonstrating video to the PR description, add documentation for the feature you're creating, and add tests and an AI Usage disclosure as well), thanks.
Also, kindly address the CodeRabbit comments.

@Anshgrover23 Anshgrover23 marked this pull request as draft January 12, 2026 05:47
@srikrishnavansi srikrishnavansi marked this pull request as ready for review January 12, 2026 05:56
@srikrishnavansi
Contributor Author

I will update the PR with the changes suggested by CodeRabbit.

@Anshgrover23 Anshgrover23 marked this pull request as draft January 12, 2026 06:47
@srikrishnavansi srikrishnavansi changed the title [tutor] Add AI-Powered Installation Tutor (Issue #131) [tutor] Add AI-Powered Installation Tutor (Issue #131) Jan 12, 2026
srikrishnavansi and others added 7 commits January 12, 2026 12:21
Fixes identified in PR cortexlinux#566 code review:

- Fix coverage badge discrepancy (87% → 85.6%) in docs/AI_TUTOR.md
- Fix db_path type annotation to Optional[Path] in config.py
- Remove unused ExamplesProviderTool import from graph.py
- Add DEFAULT_TUTOR_TOPICS constant to avoid magic numbers
- Fix --fresh flag propagation to InteractiveTutor
- Fix race condition in profile creation with INSERT OR IGNORE
- Tighten version constraints in requirements.txt (<2.0.0 bounds)
- Update clear_cache docstring to match actual behavior
- Implement empty test methods in TestConvenienceFunctions

All 266 tests pass with 86% coverage.
- Fix import sorting (I001) with ruff format
- Modernize type annotations for Python 3.10+:
  - Replace Dict/List/Tuple with dict/list/tuple
  - Replace Optional[X] with X | None
- Remove deprecated typing imports
- Fix f-string without placeholders

All 266 tests pass.
The tutor tests were in cortex/tutor/tests/ but CI runs
pytest tests/ - causing 0% coverage on tutor code.

Copied tests to tests/tutor/ so CI discovers them and
includes tutor code in coverage metrics.

All 266 tutor tests pass from new location.
@srikrishnavansi srikrishnavansi marked this pull request as ready for review January 12, 2026 07:20
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

🤖 Fix all issues with AI agents
In @cortex/tutor/contracts/lesson_context.py:
- Around line 97-99: The Field default_factory for generated_at currently uses
datetime.utcnow (in the generated_at: datetime = Field(...
default_factory=datetime.utcnow ...)), which is deprecated; replace it with a
timezone-aware factory using datetime.now(timezone.utc) and ensure timezone is
imported from datetime (or use datetime.timezone) so generated_at remains
UTC-aware; update the default_factory reference in the generated_at Field and
add the necessary import for timezone at the top of the module.

In @cortex/tutor/contracts/progress_context.py:
- Line 157: The QuizContext.timestamp Field currently uses datetime.utcnow() (in
the default_factory) which is deprecated; change the default_factory for
QuizContext.timestamp to produce a timezone-aware UTC datetime (e.g., use
datetime.now(timezone.utc) or an equivalent lambda), and update imports if
needed to include timezone so the Field default is timezone-aware while keeping
the type as datetime.
- Around line 98-100: The Field default_factory for last_updated in
progress_context.py uses deprecated datetime.utcnow(); update the
default_factory to use datetime.now(timezone.utc) instead and ensure timezone is
imported from datetime (mirror the fix applied in lesson_context.py); modify the
default_factory on the last_updated Field (and any similar uses in this module)
to call datetime.now(timezone.utc) to produce an aware UTC timestamp.

In @cortex/tutor/memory/sqlite_store.py:
- Around line 206-239: The upsert_progress method currently returns
cursor.lastrowid which is 0 when the INSERT triggers the ON CONFLICT ... DO
UPDATE branch; change upsert_progress (learning_progress upsert) to detect that
case and return the actual row id by querying the table for the record after the
upsert (e.g., SELECT id or rowid WHERE package_name = ? AND topic = ?) when
cursor.lastrowid is 0, ensuring the function always returns the real row id for
the given progress.package_name and progress.topic.
- Around line 419-444: The functions add_mastered_concept and add_weak_concept
perform a read-modify-write via get_student_profile() and
update_student_profile(), which opens a TOCTOU race; replace the Python-side
mutation with an atomic database operation (or wrap the operation in a single
transaction) so concurrent updates cannot be lost: implement a single SQL UPDATE
that uses JSON functions to append/remove the concept from the mastered_concepts
or weak_concepts JSON array (and remove it from the opposite array when
promoting/demoting), or execute the read and write inside a DB transaction with
appropriate locking/isolation in the same code paths where get_student_profile()
and update_student_profile() are called so add_mastered_concept and
add_weak_concept no longer rely on separate read and write calls that can
interleave.
🧹 Nitpick comments (46)
cortex/tutor/tests/test_validators.py (2)

7-7: Unused pytest import.

The pytest module is imported but never used in this test file. No pytest.fixture, pytest.raises, or other pytest-specific features are utilized.

🔧 Suggested fix
-import pytest
-
 from cortex.tutor.tools.deterministic.validators import (

165-180: Consider adding boundary value tests for score validation.

The tests cover valid scores and out-of-range values, but consider adding explicit boundary tests for exactly 0.0 and 1.0 to ensure inclusive boundary handling is verified.

🧪 Additional boundary test
def test_boundary_scores(self):
    """Test exact boundary values are valid."""
    # Ensure 0.0 and 1.0 are explicitly accepted
    is_valid, error = validate_score(0.0)
    assert is_valid, "0.0 should be valid"
    is_valid, error = validate_score(1.0)
    assert is_valid, "1.0 should be valid"
cortex/tutor/memory/sqlite_store.py (2)

52-57: Docstring claims "connection pooling" but implementation creates new connections.

The class docstring mentions "connection pooling," but _get_connection creates a new connection on each call and closes it after use. This is connection-per-request, not pooling.

📝 Suggested docstring fix
 class SQLiteStore:
     """
     SQLite-based storage for learning progress and student data.
 
-    Thread-safe implementation with connection pooling.
+    Thread-safe implementation with per-request connections.
 
     Attributes:
         db_path: Path to the SQLite database file.
     """

42-50: Use Field(default_factory=list) for explicit mutable defaults.

While Pydantic v2 safely deep-copies empty list defaults for each instance, the idiomatic approach is Field(default_factory=list) to explicitly signal that each instance gets its own list. This also guards against issues if the codebase ever migrates to older Pydantic versions where mutable defaults were not safely handled. Apply to mastered_concepts and weak_concepts fields.
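
💡 Minimal sketch of the suggested pattern

The surrounding model (here called StudentProfile) is an assumption; only the field definitions illustrate the change:

from pydantic import BaseModel, Field

class StudentProfile(BaseModel):
    # Each instance gets its own list object, stated explicitly
    mastered_concepts: list[str] = Field(default_factory=list)
    weak_concepts: list[str] = Field(default_factory=list)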

cortex/tutor/tests/test_deterministic_tools.py (2)

9-9: Unused imports from unittest.mock.

Mock and patch are imported but never used in this test file.

🔧 Suggested fix
 import tempfile
 from pathlib import Path
-from unittest.mock import Mock, patch
 
 import pytest

100-109: Duplicate temp_db fixture definition.

The temp_db fixture is defined identically in both TestLessonLoaderTool (lines 26-30) and TestProgressTrackerTool (lines 100-104). Consider moving it to a conftest.py file for reuse.

♻️ Suggested refactor

Create cortex/tutor/tests/conftest.py:

import tempfile
from pathlib import Path

import pytest

@pytest.fixture
def temp_db():
    """Create a temporary database."""
    with tempfile.TemporaryDirectory() as tmpdir:
        yield Path(tmpdir) / "test.db"

Then remove the duplicate fixtures from both test classes.

tests/tutor/test_agentic_tools.py (1)

7-7: Remove unused import MagicMock.

MagicMock is imported but never used in this file; only Mock is utilized.

Suggested fix
-from unittest.mock import MagicMock, Mock, patch
+from unittest.mock import Mock, patch
cortex/tutor/tests/test_agent_methods.py (1)

7-11: Remove unused imports.

tempfile and MagicMock are imported but not used in this file.

Suggested fix
-import tempfile
 from pathlib import Path
-from unittest.mock import MagicMock, Mock, patch
+from unittest.mock import Mock, patch
cortex/tutor/tests/test_branding.py (2)

7-10: Remove unused imports.

StringIO and MagicMock are imported but never used.

Suggested fix
-from io import StringIO
-from unittest.mock import MagicMock, Mock, patch
+from unittest.mock import Mock, patch

244-249: Clarify test assertion for stripping behavior.

The test name suggests verifying stripping behavior, but assert "spaced" in result passes whether result is " spaced " or "spaced". Consider asserting the exact expected value.

Suggested fix for explicit verification
     @patch("builtins.input", return_value="  spaced  ")
     def test_get_user_input_strips(self, mock_input):
         """Test input stripping is not done (raw input)."""
         result = get_user_input("Enter value")
-        # Note: get_user_input should return raw input
-        assert "spaced" in result
+        # Verify raw input is returned without stripping
+        assert result == "  spaced  "
tests/tutor/test_tutor_agent.py (1)

7-11: Remove unused imports.

tempfile and MagicMock are imported but not used.

Suggested fix
-import tempfile
 from pathlib import Path
-from unittest.mock import MagicMock, Mock, patch
+from unittest.mock import Mock, patch
cortex/tutor/tests/test_integration.py (2)

10-10: Remove unused import.

MagicMock is imported but not used.

Suggested fix
-from unittest.mock import MagicMock, Mock, patch
+from unittest.mock import Mock, patch

205-222: Consider removing duplicate branding tests.

These tests duplicate coverage from tests/tutor/test_branding.py. Consider removing them to reduce maintenance overhead, or keep them intentionally as integration-level smoke tests.

tests/tutor/test_interactive_tutor.py (2)

1-11: Unused import detected.

The call import from unittest.mock is not used anywhere in this file.

🧹 Remove unused import
-from unittest.mock import MagicMock, Mock, call, patch
+from unittest.mock import MagicMock, Mock, patch

247-264: Test coverage for None lesson is incomplete.

This test only exercises 4 of the 6 helper methods (_show_concepts, _show_examples, _run_tutorial, _show_best_practices). Consider adding _ask_question and _show_progress for completeness.

Additionally, the test only verifies no exceptions are raised but doesn't assert expected behavior (e.g., that appropriate "no data" messages are printed).

🧪 Extended test with assertions
     def test_methods_with_no_lesson(self, mock_agent_class):
         """Test methods handle None lesson gracefully."""
         from cortex.tutor.agents.tutor_agent import InteractiveTutor

         tutor = InteractiveTutor("docker")
         tutor.lesson = None
         tutor.agent = Mock()
+        tutor.agent.get_progress.return_value = {"success": False}

         # These should not raise errors
-        tutor._show_concepts()
-        tutor._show_examples()
-        tutor._run_tutorial()
-        tutor._show_best_practices()
+        with patch("cortex.tutor.agents.tutor_agent.tutor_agent.tutor_print") as mock_print:
+            tutor._show_concepts()
+            tutor._show_examples()
+            tutor._run_tutorial()
+            tutor._show_best_practices()
+            tutor._show_progress()
tests/tutor/test_agent_methods.py (1)

411-441: State helpers well-tested, but has_critical_error is missing a direct test.

The has_critical_error function is imported (line 33) but not directly tested in TestStateHelpers. It's indirectly tested via TestRouting.test_route_after_plan_fail_on_error, but a direct unit test would improve coverage.

🧪 Add direct test for has_critical_error
def test_has_critical_error_true(self):
    """Test has_critical_error returns True for non-recoverable errors."""
    state = create_initial_state("docker")
    add_error(state, "test", "Critical error", recoverable=False)
    assert has_critical_error(state) is True

def test_has_critical_error_false(self):
    """Test has_critical_error returns False for recoverable errors."""
    state = create_initial_state("docker")
    add_error(state, "test", "Minor error", recoverable=True)
    assert has_critical_error(state) is False
tests/tutor/test_integration.py (2)

205-222: Branding tests verify no errors but lack output assertions.

The capsys.readouterr() result is captured but not verified. While Rich console output is complex, consider at minimum asserting the output is non-empty for success cases, or use Rich's Console with a StringIO for testable output.

🧪 Add minimal assertions
     def test_tutor_print_success(self, capsys):
         """Test tutor_print with success status."""
         tutor_print("Test message", "success")
         captured = capsys.readouterr()
-        # Rich console output is complex, just ensure no errors
+        # Rich console output is complex, verify something was printed
+        assert len(captured.out) > 0 or len(captured.err) > 0

298-365: Good end-to-end test for cache-hit workflow.

The test properly exercises the Plan→Load→Reflect flow with mocked tools and verifies state transitions. The note about removing real API tests is appropriate for CI.

Consider adding a similar test for the cache-miss path (Plan→Generate→Reflect) to cover both main workflows.

cortex/tutor/tests/test_progress_tracker.py (1)

290-351: Redundant imports inside test methods.

Mock and patch are imported inside each test method. Since these are already available in the module scope (line 9), the local imports are unnecessary.

🧹 Remove redundant imports
     def test_get_learning_progress(self, temp_db):
         """Test get_learning_progress function."""
-        from unittest.mock import Mock, patch
-
         # Mock the global config to use temp_db
         mock_config = Mock()

Apply similar changes to test_mark_topic_completed and test_get_package_stats.

cortex/tutor/tests/test_tools.py (1)

277-310: ConversationHandler tests use __new__ to bypass initialization.

While using __new__ to create uninitialized instances works for testing, it's fragile and couples tests to implementation details. Consider adding a factory method or making the handler more testable.

# Alternative: Initialize properly with mocked config
with patch("cortex.tutor.tools.agentic.qa_handler.get_config") as mock_config:
    mock_config.return_value = Mock(anthropic_api_key="test", model="test")
    with patch("cortex.tutor.tools.agentic.qa_handler.ChatAnthropic"):
        handler = ConversationHandler()
        handler.history = []  # Reset history for test
cortex/tutor/config.py (2)

45-49: Direct mutation in model_post_init works but consider model_validator.

Pydantic v2 allows mutation in model_post_init, but using a model_validator with mode='after' is more idiomatic for computed defaults.

♻️ Alternative using model_validator
from pydantic import model_validator

@model_validator(mode='after')
def set_db_path_default(self) -> 'Config':
    """Set default db_path if not provided."""
    if self.db_path is None:
        object.__setattr__(self, 'db_path', self.data_dir / "tutor_progress.db")
    return self

100-108: Return type inconsistency with field type.

get_db_path() returns Path but db_path field is typed as Path | None. While model_post_init ensures it's always set, the type checker may flag this. Consider adding an assertion or adjusting the type.

🔧 Add runtime assertion for type safety
     def get_db_path(self) -> Path:
         """
         Get the full path to the SQLite database.

         Returns:
             Path: Full path to tutor_progress.db
         """
         self.ensure_data_dir()
+        assert self.db_path is not None, "db_path should be set by model_post_init"
         return self.db_path
cortex/tutor/branding.py (2)

7-7: Remove unused Optional import.

The code uses modern union syntax (str | None) throughout, so Optional is not needed.

Proposed fix
-from typing import Literal, Optional
+from typing import Literal

218-234: Unused description parameter.

The description parameter is accepted but not used. If the intent is to set a default task description, it should be returned with a pre-added task or documented that callers should use progress.add_task(description).

Option 1: Remove unused parameter
-def create_progress_bar(description: str = "Processing") -> Progress:
+def create_progress_bar() -> Progress:
     """
     Create a Rich progress bar for long-running operations.

-    Args:
-        description: Description text for the progress bar.
-
     Returns:
         Progress: Configured Rich Progress instance.
     """
Option 2: Use the parameter
+from typing import Tuple
+
-def create_progress_bar(description: str = "Processing") -> Progress:
+def create_progress_bar(description: str = "Processing") -> Tuple[Progress, int]:
     """
     Create a Rich progress bar for long-running operations.

     Args:
         description: Description text for the progress bar.

     Returns:
-        Progress: Configured Rich Progress instance.
+        Tuple of (Progress instance, task_id).
     """
-    return Progress(
+    progress = Progress(
         SpinnerColumn(),
         TextColumn("[progress.description]{task.description}"),
         BarColumn(),
         TaskProgressColumn(),
         console=console,
     )
+    task_id = progress.add_task(description, total=100)
+    return progress, task_id
tests/tutor/test_progress_tracker.py (1)

7-7: Remove unused os import.

The os module is imported but not used in this file.

Proposed fix
-import os
 import tempfile
cortex/tutor/agents/tutor_agent/tutor_agent.py (1)

86-92: Calling private _run method directly on LangChain tools.

The _run method is a private API of LangChain's BaseTool. For tool invocation, the public API is run() or invoke(). Using private methods may break if the LangChain library changes its internal implementation.

♻️ Suggested refactor
-            self.progress_tool._run(
-                "update_progress",
-                package_name=package_name,
-                topic="overview",
-            )
+            self.progress_tool.run(
+                "update_progress",
+                package_name=package_name,
+                topic="overview",
+            )

Apply the same pattern to all other ._run() calls in this file (Lines 155, 156, 165, 177, 192-197, 210, 220).

cortex/tutor/tools/deterministic/lesson_loader.py (2)

100-106: Async method delegates to synchronous implementation.

The _arun method simply calls the sync _run method, which could block the event loop if the SQLite operations are slow. For a deterministic cache lookup, this is likely acceptable, but consider documenting this limitation.
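
💡 One way to document the limitation inline (sketch; return type assumed)

async def _arun(self, *args: Any, **kwargs: Any) -> Any:
    """Async entry point.

    Note: delegates to the synchronous implementation, so SQLite I/O
    runs on the event loop thread. Acceptable for fast cache lookups.
    """
    return self._run(*args, **kwargs)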


125-129: Silent exception swallowing in cache_lesson.

Catching all exceptions and returning False hides the root cause of failures. Consider logging the exception for debugging purposes.

♻️ Suggested improvement
+import logging
+
+logger = logging.getLogger(__name__)
+
     def cache_lesson(
         self,
         package_name: str,
         lesson: dict[str, Any],
         ttl_hours: int = 24,
     ) -> bool:
         ...
         try:
             self.store.cache_lesson(package_name, lesson, ttl_hours)
             return True
-        except Exception:
+        except Exception as e:
+            logger.warning("Failed to cache lesson for %s: %s", package_name, e)
             return False
cortex/tutor/tools/agentic/lesson_generator.py (2)

21-36: Unused function _load_prompt_template.

This function is defined but never called. The prompts are instead hardcoded in _get_system_prompt and _get_generation_prompt methods. Either use this function or remove it.

♻️ Option 1: Remove unused code
-# Load prompt template
-def _load_prompt_template() -> str:
-    """Load the lesson generator prompt from file."""
-    prompt_path = Path(__file__).parent.parent.parent / "prompts" / "tools" / "lesson_generator.md"
-    if prompt_path.exists():
-        return prompt_path.read_text()
-    # Fallback inline prompt
-    return """You are a lesson content generator. Generate comprehensive educational content
-    for the package: {package_name}
-
-    Student level: {student_level}
-    Learning style: {learning_style}
-    Focus areas: {focus_areas}
-
-    Return a JSON object with: summary, explanation, use_cases, best_practices,
-    code_examples, tutorial_steps, installation_command, confidence."""
-

82-140: DRY violation: _run and _arun have duplicated logic.

The async _arun method duplicates most of the logic from _run, including prompt construction, chain setup, and error handling. Consider extracting shared logic.

♻️ Suggested refactor
def _build_chain(self):
    """Build the LLM chain for lesson generation."""
    prompt = ChatPromptTemplate.from_messages(
        [
            ("system", self._get_system_prompt()),
            ("human", self._get_generation_prompt()),
        ]
    )
    return prompt | self.llm | JsonOutputParser()

def _build_invoke_params(self, package_name, student_level, learning_style, focus_areas, skip_areas):
    """Build parameters for chain invocation."""
    return {
        "package_name": package_name,
        "student_level": student_level,
        "learning_style": learning_style,
        "focus_areas": ", ".join(focus_areas or []) or "all topics",
        "skip_areas": ", ".join(skip_areas or []) or "none",
    }

def _run(self, package_name, student_level="beginner", ...):
    try:
        chain = self._build_chain()
        params = self._build_invoke_params(...)
        result = chain.invoke(params)
        # ...

Also applies to: 142-184

tests/tutor/test_tools.py (1)

280-309: Fragile test setup using __new__ to bypass initialization.

Using __new__ to create objects without calling __init__ is fragile and can break if the class implementation changes. Consider using proper mocking or fixtures.

♻️ Suggested improvement
 class TestConversationHandler:
     """Tests for ConversationHandler."""

     def test_build_context_empty(self):
         """Test context building with empty history."""
-        with patch("cortex.tutor.tools.agentic.qa_handler.get_config"):
-            handler = ConversationHandler.__new__(ConversationHandler)
-            handler.history = []
-
+        with patch("cortex.tutor.tools.agentic.qa_handler.get_config") as mock_config:
+            mock_config.return_value = Mock(
+                anthropic_api_key="test_key",
+                model="claude-sonnet-4-20250514",
+            )
+            with patch("cortex.tutor.tools.agentic.qa_handler.ChatAnthropic"):
+                handler = ConversationHandler(package_name="test")
+                handler.history = []
+
             context = handler._build_context()
             assert "Starting fresh" in context
cortex/tutor/tools/agentic/examples_provider.py (2)

7-8: Unused import: Path.

The Path import from pathlib is not used anywhere in this file.

♻️ Suggested fix
-from pathlib import Path
 from typing import Any

59-113: DRY violation: _run and _arun duplicate logic.

Same issue as in lesson_generator.py. The async method duplicates the sync implementation. Consider extracting shared chain-building logic.

Also applies to: 115-157

cortex/tutor/agents/tutor_agent/graph.py (2)

374-388: Singleton graph creation is not thread-safe.

The get_tutor_graph() function uses a global variable without synchronization. In a multi-threaded environment, multiple threads could create the graph simultaneously.

♻️ Suggested thread-safe implementation
+import threading
+
 # Create singleton graph instance
 _graph = None
+_graph_lock = threading.Lock()


 def get_tutor_graph() -> StateGraph:
     """
     Get the singleton Tutor Agent graph.

     Returns:
         Compiled StateGraph.
     """
     global _graph
-    if _graph is None:
-        _graph = create_tutor_graph()
+    if _graph is None:
+        with _graph_lock:
+            if _graph is None:  # Double-check locking
+                _graph = create_tutor_graph()
     return _graph

44-45: Using private _run method on LangChain tools throughout graph nodes.

Multiple nodes call tool._run() which is a private API. Consider using the public run() or invoke() methods for better compatibility with future LangChain versions.

Also applies to: 62-63, 122-123, 173-174

cortex/tutor/cli.py (3)

155-210: Consider handling missing validation_passed key more defensively.

If agent.ask() returns a malformed response without the validation_passed key, .get("validation_passed") returns None, which is falsy and falls through to the error case. This is reasonable behavior, but you may want to log or handle unexpected response formats explicitly for debugging purposes.
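
💡 Sketch of explicit handling (the module-level logger is hypothetical)

result = agent.ask(question)
if "validation_passed" not in result:
    # Surface the unexpected response shape before treating it as a failure
    logger.debug("Unexpected ask() response shape: %r", result)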


213-240: Docstring is missing the verbose parameter.

The cmd_list_packages function accepts a verbose parameter but the docstring doesn't document it.

📝 Suggested fix
 def cmd_list_packages(verbose: bool = False) -> int:
     """
     List packages that have been studied.
 
+    Args:
+        verbose: Enable verbose output.
+
     Returns:
         Exit code.
     """

243-313: Docstring is missing the verbose parameter documentation.

Similar to cmd_list_packages, the verbose parameter is not documented in the docstring.

📝 Suggested fix
 def cmd_progress(package: str | None = None, verbose: bool = False) -> int:
     """
     Show learning progress.
 
     Args:
         package: Optional package filter.
+        verbose: Enable verbose output.
 
     Returns:
         Exit code.
     """
cortex/tutor/tools/deterministic/progress_tracker.py (2)

115-117: Async implementation blocks the event loop.

The _arun method calls the synchronous _run method directly, which will block the event loop if _run performs I/O (SQLite operations). For a truly async implementation, consider using asyncio.to_thread() or an async-compatible SQLite library.

♻️ Suggested fix to avoid blocking the event loop
     async def _arun(self, *args: Any, **kwargs: Any) -> dict[str, Any]:
         """Async version - delegates to sync implementation."""
-        return self._run(*args, **kwargs)
+        import asyncio
+        return await asyncio.to_thread(self._run, *args, **kwargs)

301-346: Convenience functions create new tool instances per call.

Each convenience function (get_learning_progress, mark_topic_completed, get_package_stats) instantiates a new ProgressTrackerTool and thus a new SQLiteStore. This is inefficient for repeated calls and may have connection overhead. Consider caching or accepting an optional tool/store parameter.
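
💡 Possible mitigation (sketch; action and parameter names assumed from the handlers above)

from functools import lru_cache

@lru_cache(maxsize=1)
def _shared_tracker() -> ProgressTrackerTool:
    # One tool (and one SQLiteStore) reused across the convenience wrappers
    return ProgressTrackerTool()

def get_learning_progress(package_name: str) -> dict[str, Any]:
    # Matches the existing wrappers' call style
    return _shared_tracker()._run("get_progress", package_name=package_name)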

cortex/tutor/tools/agentic/qa_handler.py (3)

1-8: Unused import: Path is imported but never used.

The Path import from pathlib is not used in this file.

🧹 Remove unused import
-from pathlib import Path
 from typing import Any

118-162: Code duplication between _run and _arun.

The sync and async methods have nearly identical logic. Consider extracting the common prompt/chain setup into a helper method.

♻️ Suggested refactor to reduce duplication
def _build_chain(self):
    """Build the prompt chain for Q&A."""
    prompt = ChatPromptTemplate.from_messages(
        [
            ("system", self._get_system_prompt()),
            ("human", self._get_qa_prompt()),
        ]
    )
    return prompt | self.llm | JsonOutputParser()

def _build_invoke_params(self, package_name, question, learning_style, 
                          mastered_concepts, weak_concepts, lesson_context):
    """Build parameters for chain invocation."""
    return {
        "package_name": package_name,
        "question": question,
        "learning_style": learning_style,
        "mastered_concepts": ", ".join(mastered_concepts or []) or "none specified",
        "weak_concepts": ", ".join(weak_concepts or []) or "none specified",
        "lesson_context": lesson_context or "starting fresh",
    }

268-344: Unbounded history growth in ConversationHandler.

The history list grows unboundedly. While _build_context only uses the last 3 exchanges, the full history is retained. Consider limiting history size to prevent memory growth in long sessions.

♻️ Suggested fix to limit history size
+    MAX_HISTORY_SIZE = 10  # Keep last 10 exchanges
+
     def ask(
         self,
         question: str,
         # ... params
     ) -> dict[str, Any]:
         # ... existing code ...
         
         # Update history
         if result.get("success"):
             self.history.append(
                 {
                     "question": question,
                     "answer": result["answer"].get("answer", ""),
                 }
             )
+            # Trim history to prevent unbounded growth
+            if len(self.history) > self.MAX_HISTORY_SIZE:
+                self.history = self.history[-self.MAX_HISTORY_SIZE:]
 
         return result
tests/tutor/test_branding.py (3)

7-8: Unused imports: StringIO, MagicMock, and Mock are imported but never used.

These imports are not utilized in the test file.

🧹 Remove unused imports
-from io import StringIO
-from unittest.mock import MagicMock, Mock, patch
+from unittest.mock import patch

44-70: Tests lack assertions to verify output content.

The TestTutorPrint tests call the functions but don't assert anything about the output. While smoke testing is valuable, consider adding at least basic assertions using capsys.readouterr() to verify expected output patterns.

💡 Example of adding assertions
def test_tutor_print_success(self, capsys):
    """Test success status print."""
    tutor_print("Test message", "success")
    captured = capsys.readouterr()
    assert "Test message" in captured.out or len(captured.out) > 0

73-91: Tests lack meaningful assertions.

Similar to TestTutorPrint, these tests only verify that functions don't raise exceptions. Consider adding assertions to verify output contains expected content.
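
🧪 Example assertion (sketch)

def test_print_banner_produces_output(self, capsys):
    """Banner should emit something visible."""
    print_banner()
    captured = capsys.readouterr()
    assert captured.out.strip()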

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1fbed6a and a799d27.

📒 Files selected for processing (46)
  • cortex/cli.py
  • cortex/tutor/__init__.py
  • cortex/tutor/agents/tutor_agent/__init__.py
  • cortex/tutor/agents/tutor_agent/graph.py
  • cortex/tutor/agents/tutor_agent/state.py
  • cortex/tutor/agents/tutor_agent/tutor_agent.py
  • cortex/tutor/branding.py
  • cortex/tutor/cli.py
  • cortex/tutor/config.py
  • cortex/tutor/contracts/lesson_context.py
  • cortex/tutor/contracts/progress_context.py
  • cortex/tutor/memory/sqlite_store.py
  • cortex/tutor/tests/test_agent_methods.py
  • cortex/tutor/tests/test_agentic_tools.py
  • cortex/tutor/tests/test_branding.py
  • cortex/tutor/tests/test_cli.py
  • cortex/tutor/tests/test_deterministic_tools.py
  • cortex/tutor/tests/test_integration.py
  • cortex/tutor/tests/test_interactive_tutor.py
  • cortex/tutor/tests/test_progress_tracker.py
  • cortex/tutor/tests/test_tools.py
  • cortex/tutor/tests/test_tutor_agent.py
  • cortex/tutor/tests/test_validators.py
  • cortex/tutor/tools/__init__.py
  • cortex/tutor/tools/agentic/__init__.py
  • cortex/tutor/tools/agentic/examples_provider.py
  • cortex/tutor/tools/agentic/lesson_generator.py
  • cortex/tutor/tools/agentic/qa_handler.py
  • cortex/tutor/tools/deterministic/__init__.py
  • cortex/tutor/tools/deterministic/lesson_loader.py
  • cortex/tutor/tools/deterministic/progress_tracker.py
  • cortex/tutor/tools/deterministic/validators.py
  • docs/AI_TUTOR.md
  • requirements.txt
  • tests/tutor/__init__.py
  • tests/tutor/test_agent_methods.py
  • tests/tutor/test_agentic_tools.py
  • tests/tutor/test_branding.py
  • tests/tutor/test_cli.py
  • tests/tutor/test_deterministic_tools.py
  • tests/tutor/test_integration.py
  • tests/tutor/test_interactive_tutor.py
  • tests/tutor/test_progress_tracker.py
  • tests/tutor/test_tools.py
  • tests/tutor/test_tutor_agent.py
  • tests/tutor/test_validators.py
✅ Files skipped from review due to trivial changes (1)
  • tests/tutor/__init__.py
🚧 Files skipped from review as they are similar to previous changes (9)
  • docs/AI_TUTOR.md
  • cortex/tutor/tests/test_cli.py
  • cortex/tutor/tests/test_agentic_tools.py
  • cortex/tutor/tests/test_tutor_agent.py
  • requirements.txt
  • cortex/tutor/__init__.py
  • cortex/tutor/agents/tutor_agent/__init__.py
  • cortex/tutor/tests/test_interactive_tutor.py
  • cortex/cli.py
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Follow PEP 8 style guide
Type hints required in Python code
Docstrings required for all public APIs

Files:

  • tests/tutor/test_agentic_tools.py
  • cortex/tutor/tests/test_integration.py
  • tests/tutor/test_tutor_agent.py
  • cortex/tutor/tools/agentic/__init__.py
  • tests/tutor/test_cli.py
  • cortex/tutor/tests/test_agent_methods.py
  • tests/tutor/test_interactive_tutor.py
  • cortex/tutor/tests/test_progress_tracker.py
  • cortex/tutor/tools/deterministic/__init__.py
  • cortex/tutor/config.py
  • tests/tutor/test_deterministic_tools.py
  • cortex/tutor/tools/__init__.py
  • tests/tutor/test_tools.py
  • cortex/tutor/contracts/progress_context.py
  • cortex/tutor/tests/test_tools.py
  • cortex/tutor/branding.py
  • tests/tutor/test_integration.py
  • tests/tutor/test_progress_tracker.py
  • tests/tutor/test_agent_methods.py
  • cortex/tutor/tests/test_validators.py
  • cortex/tutor/agents/tutor_agent/tutor_agent.py
  • cortex/tutor/memory/sqlite_store.py
  • cortex/tutor/agents/tutor_agent/state.py
  • cortex/tutor/tools/deterministic/progress_tracker.py
  • cortex/tutor/agents/tutor_agent/graph.py
  • cortex/tutor/contracts/lesson_context.py
  • cortex/tutor/tools/deterministic/validators.py
  • cortex/tutor/tools/agentic/lesson_generator.py
  • cortex/tutor/tools/deterministic/lesson_loader.py
  • cortex/tutor/tools/agentic/examples_provider.py
  • cortex/tutor/tests/test_deterministic_tools.py
  • cortex/tutor/cli.py
  • tests/tutor/test_validators.py
  • cortex/tutor/tools/agentic/qa_handler.py
  • tests/tutor/test_branding.py
  • cortex/tutor/tests/test_branding.py
tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Maintain >80% test coverage for pull requests

Files:

  • tests/tutor/test_agentic_tools.py
  • tests/tutor/test_tutor_agent.py
  • tests/tutor/test_cli.py
  • tests/tutor/test_interactive_tutor.py
  • tests/tutor/test_deterministic_tools.py
  • tests/tutor/test_tools.py
  • tests/tutor/test_integration.py
  • tests/tutor/test_progress_tracker.py
  • tests/tutor/test_agent_methods.py
  • tests/tutor/test_validators.py
  • tests/tutor/test_branding.py
{setup.py,setup.cfg,pyproject.toml,**/__init__.py}

📄 CodeRabbit inference engine (AGENTS.md)

Use Python 3.10 or higher as the minimum supported version

Files:

  • cortex/tutor/tools/agentic/__init__.py
  • cortex/tutor/tools/deterministic/__init__.py
  • cortex/tutor/tools/__init__.py
🧠 Learnings (1)
📚 Learning: 2025-12-11T12:03:24.071Z
Learnt from: CR
Repo: cortexlinux/cortex PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-11T12:03:24.071Z
Learning: Applies to tests/**/*.py : Maintain >80% test coverage for pull requests

Applied to files:

  • cortex/tutor/tests/test_progress_tracker.py
🔇 Additional comments (77)
cortex/tutor/tests/test_validators.py (2)

26-90: Thorough package name validation tests.

Good coverage of edge cases including valid names, empty input, whitespace, length limits, blocked patterns, and invalid characters. The blocked pattern tests appropriately cover dangerous shell commands.


266-302: Well-structured integration tests for validation workflow.

The TestGetValidationErrors and TestValidateAll classes properly test the aggregation and combined validation flows. Good use of multiple validation types in combination.

cortex/tutor/memory/sqlite_store.py (2)

362-390: Race condition handling in _create_default_profile is well implemented.

Good use of INSERT OR IGNORE followed by a re-fetch to handle concurrent profile creation attempts. The re-entrant lock (RLock) also helps with nested calls.


474-496: Lesson cache expiration logic is correct.

The TTL-based cache properly uses UTC timestamps and the expiration check in get_cached_lesson correctly filters expired entries.

cortex/tutor/tools/agentic/__init__.py (1)

1-17: Clean package initialization with proper exports.

Well-structured __init__.py with a clear docstring explaining the purpose of agentic tools. The __all__ properly defines the public API surface.

cortex/tutor/tests/test_deterministic_tools.py (3)

23-58: Good cache behavior tests for LessonLoaderTool.

Tests properly verify cache miss, cache storage/retrieval, and force_fresh bypass. The use of a temporary database fixture ensures test isolation.


174-179: Good coverage of error handling for invalid actions.

The test properly verifies that invalid actions return a failure response with an error message.


89-95: Ensure fallback validation is exhaustive.

The test iterates over FALLBACK_LESSONS but only checks for package_name and summary fields. Consider verifying all required fields that downstream code depends on.
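
🧪 Sketch of an exhaustive check (the required-field set is an assumption; align it with what downstream code actually reads)

REQUIRED_FIELDS = {"package_name", "summary", "explanation", "code_examples"}

def test_fallback_lessons_required_fields():
    for name, lesson in FALLBACK_LESSONS.items():
        missing = REQUIRED_FIELDS - set(lesson)
        assert not missing, f"{name} fallback lesson missing {missing}"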

tests/tutor/test_validators.py (1)

1-303: Verify whether duplicate test files exist.

This review raises a concern about potential duplicate test files. Before acting on this claim:

  1. Confirm that both cortex/tutor/tests/test_validators.py and tests/tutor/test_validators.py exist in the current state
  2. If both exist, compare them to verify they are identical
  3. If confirmed as duplicates, determine which should be removed based on your refactoring intent
tests/tutor/test_agentic_tools.py (1)

12-197: Test structure and coverage look solid.

The tests properly mock external dependencies (get_config, ChatAnthropic) and validate _structure_response behavior across full and minimal response scenarios for each agentic tool. The ConversationHandler context-building tests cover both empty and populated history states.

cortex/tutor/tools/deterministic/__init__.py (1)

1-17: Clean package initialization with appropriate exports.

The module docstring clearly explains that these are deterministic (non-LLM) tools, and __all__ properly exposes the public API.

cortex/tutor/tests/test_agent_methods.py (1)

37-441: Comprehensive test coverage for TutorAgent orchestration.

Tests thoroughly cover agent methods, graph node behaviors (success/failure/exception paths), routing decisions, and state helpers. The mocking strategy properly isolates units from external dependencies.

cortex/tutor/tests/test_branding.py (1)

30-227: UI smoke tests provide reasonable coverage.

The tests verify that branding functions execute without errors across various input scenarios. For Rich console output, smoke testing is a practical approach since exact output assertions would be brittle.

tests/tutor/test_tutor_agent.py (1)

33-321: Well-structured tests covering state management, graph nodes, and routing.

Tests comprehensively validate state creation, error/checkpoint/cost accumulation, critical error detection, and routing logic. The integration tests properly verify input validation.

tests/tutor/test_deterministic_tools.py (1)

1-179: Excellent test coverage for deterministic tools.

Tests properly isolate database operations using temporary directories, cover all ProgressTrackerTool actions including error handling for invalid actions, and verify fallback lesson content including case-insensitivity and required fields. The fixture pattern ensures clean test isolation.

cortex/tutor/tests/test_integration.py (1)

298-364: Solid end-to-end workflow test.

The test validates the complete cache-hit lesson workflow through plan_node → load_cache_node → reflect_node, verifying state transitions and final output validation. Good use of mocking to isolate from actual LLM calls.

tests/tutor/test_interactive_tutor.py (1)

84-123: Well-structured test fixture.

The mock_tutor fixture provides comprehensive mock data covering all lesson components (examples, tutorial steps, best practices) with properly configured agent mocks. This enables clean, focused test methods.

tests/tutor/test_agent_methods.py (2)

37-198: Good coverage of TutorAgent public API.

The tests comprehensively cover all public methods (teach, ask, get_profile, update_learning_style, mark_completed, reset_progress, get_packages_studied) with proper mocking of dependencies.


200-259: Comprehensive node testing with error scenarios.

The generate_lesson_node tests cover success, API failure, and exception handling paths. The caching behavior on success is properly verified.

tests/tutor/test_integration.py (1)

32-81: Config tests are comprehensive.

Good coverage of environment loading, API key validation, path expansion, and directory creation with proper use of monkeypatch and tempfile for isolation.

tests/tutor/test_cli.py (3)

24-87: Parser tests are thorough.

All CLI flags and arguments are tested including edge cases like --reset with and without package argument.


89-149: Good security test for blocked patterns.

The test_blocked_package_name test verifies that potentially dangerous input like "rm -rf /" is rejected, which is important for security.


401-453: Main entry point routing is well-tested.

Tests verify that CLI arguments are correctly routed to their respective command handlers.

cortex/tutor/tests/test_progress_tracker.py (2)

47-212: Excellent SQLiteStore test coverage.

Comprehensive testing of all store operations including CRUD, statistics, profile management, lesson caching with TTL, and reset functionality.


214-288: Good action-based testing of ProgressTrackerTool.

All supported actions are tested, including error cases for unknown actions and missing required parameters.

cortex/tutor/tests/test_tools.py (1)

31-122: Good caching and fallback behavior testing.

Tests cover cache hit/miss scenarios, force-refresh, fallback templates, and the priority order (cache → fallback → none).

cortex/tutor/config.py (2)

127-141: Docstring Raises is accurate but indirect.

The docstring correctly notes ValueError can be raised, which happens via Config.from_env(). This is accurate.


1-16: Well-structured configuration module.

Good use of Pydantic for validation, proper docstrings, type hints, and secure defaults (0o700 permissions for data directory). The environment-based configuration with optional API key requirement enables the lazy import pattern mentioned in the PR objectives.

cortex/tutor/tools/__init__.py (1)

1-20: LGTM!

Clean package initialization with proper docstring and centralized exports. The __all__ list correctly exposes the public API for both deterministic and agentic tools.

cortex/tutor/contracts/lesson_context.py (2)

101-131: Well-designed serialization and utility methods.

The to_json(), from_json(), and to_display_dict() methods provide a clean API for caching and presentation. Good use of Pydantic's built-in serialization.


56-75: Pydantic v2 max_length on list fields correctly constrains list size.

In Pydantic v2, Field(max_length=N) on a list field constrains the number of items in the list, not the length of individual string elements. The code is using this correctly and requires no changes.

Likely an incorrect or invalid review comment.

cortex/tutor/branding.py (2)

34-50: Well-designed status printing utility.

Clean implementation with proper fallback to default config. The status-emoji mapping provides a consistent UI experience.


193-216: Good defensive error handling in get_user_input.

Properly handles EOFError and KeyboardInterrupt to gracefully cancel operations without crashing.

tests/tutor/test_progress_tracker.py (3)

27-44: Well-structured test fixtures.

Good use of tempfile.TemporaryDirectory context manager with yield for automatic cleanup. The fixture chain (temp_dbstore/tracker) provides proper test isolation.


278-288: Good error case coverage.

Tests for unknown actions and missing required parameters ensure the tool gracefully handles invalid inputs.


293-350: Proper mocking pattern for convenience functions.

Good use of patch to substitute config, ensuring tests are isolated from the actual database path. The tests verify end-to-end behavior through the public API.

cortex/tutor/contracts/progress_context.py (2)

31-63: Well-designed computed fields with proper edge case handling.

The @computed_field decorators correctly implement derived properties. The is_complete() method properly returns False for empty topic lists, preventing false positives.


159-176: Clean factory method pattern.

from_results properly guards against division by zero and provides sensible defaults. The 70% pass threshold is a reasonable default for quiz assessment.

cortex/tutor/agents/tutor_agent/tutor_agent.py (3)

265-275: Lazy imports inside start() method are appropriate.

The PR objectives mention "lazy imports to allow listing/progress commands without an API key." This approach is intentional and correctly defers heavy imports until interactive mode is needed.


309-313: Consider validating menu option range before conversion.

The input handling catches ValueError for non-numeric input, but the range check happens after successful conversion. This is acceptable but could provide clearer feedback.
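
💡 Sketch of pre-conversion validation (num_options and the surrounding menu loop are assumptions)

choice = get_user_input("Select an option")
if not choice or not choice.isdigit() or not 1 <= int(choice) <= num_options:
    tutor_print(f"Please enter a number between 1 and {num_options}", "warning")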


23-48: TutorAgent class is well-structured with proper docstrings and type hints.

The class follows PEP 8 style, includes comprehensive docstrings with examples, and uses type hints throughout. The initialization pattern is clean.

cortex/tutor/tools/deterministic/lesson_loader.py (2)

163-232: FALLBACK_LESSONS provides good offline resilience.

The predefined fallback lessons for common packages (docker, git, nginx) ensure the tutor can function without API access. The lower confidence score (0.7) appropriately signals these are fallback content.


18-49: LessonLoaderTool class is well-documented with proper type hints.

The class follows PEP 8, has comprehensive docstrings, and uses Pydantic fields appropriately. The Config class enables arbitrary types for the SQLiteStore field.

cortex/tutor/tools/agentic/lesson_generator.py (2)

262-276: Good defensive coding in _structure_response.

The method properly handles missing fields with defaults, clamps confidence to valid range (0.0-1.0), and limits list sizes to prevent unbounded responses. This is well-implemented.


186-202: System prompt has appropriate safety rules.

The critical rules against fabricating features, URLs, or versions are important for educational content integrity. Good security posture.

tests/tutor/test_tools.py (2)

24-28: Good use of pytest fixture for test isolation.

The temp_db fixture properly creates an isolated temporary database for each test, ensuring tests don't interfere with each other.


127-181: Test mocking is thorough and at the correct level.

The tests mock ChatAnthropic and get_config at the appropriate module level, and verify the tool's response structure without requiring actual LLM calls.

cortex/tutor/tools/agentic/examples_provider.py (2)

159-174: System prompt enforces good safety practices.

The critical rules against inventing flags, using real credentials, and flagging dangerous commands are appropriate for educational content generation.


208-236: Good defensive coding in _structure_response.

The method properly validates example structure, limits list sizes, and provides sensible defaults. The confidence clamping is correctly implemented.

cortex/tutor/agents/tutor_agent/graph.py (2)

200-249: Reflect node has well-designed validation logic.

The confidence calculation accounting for errors and cache staleness, combined with comprehensive output preparation, demonstrates good design. The validation_passed flag provides clear success/failure indication.


315-371: Graph structure is well-organized with clear entry point and edges.

The workflow correctly implements the Plan→Act→Reflect pattern with proper conditional routing and terminal states. The conditional edges provide flexibility for different session types.

cortex/tutor/agents/tutor_agent/state.py (4)

50-105: Well-designed TypedDict state schema.

The TutorAgentState provides a clear contract for the workflow state with comprehensive documentation. Using total=False allows optional fields while maintaining type safety.


107-150: Factory function provides clean state initialization.

create_initial_state sets sensible defaults for all fields, ensuring the state is always in a valid initial configuration. The explicit initialization prevents undefined state issues.


153-171: Helper functions mutate state in-place.

The add_error, add_checkpoint, and add_cost functions modify the state dictionary directly. This is appropriate for LangGraph workflows where state is passed through nodes and accumulated.


42-47: ErrorState uses total=True (default) enforcing required fields.

Unlike other state TypedDicts, ErrorState requires all fields (node, error, recoverable), which is appropriate since an error entry should always have complete information.

cortex/tutor/cli.py (5)

1-33: LGTM! Well-structured module docstring and imports.

The module docstring clearly documents CLI usage patterns. Lazy imports strategy for API-key-dependent modules is a good design choice. The DEFAULT_TUTOR_TOPICS constant is appropriately defined at module level.


35-109: LGTM! Argument parser is well-configured.

The parser correctly handles all required CLI options including version, verbose mode, package name, list, progress, reset, fresh, and question flags. The --reset with nargs="?" and const="__all__" is a nice pattern for optional argument handling.


112-152: LGTM! Solid error handling in teach command.

The function properly validates input before expensive imports, handles ValueError, KeyboardInterrupt, and general exceptions with appropriate exit codes. The lazy import pattern ensures API key is only required when actually starting a session.


316-347: LGTM! Reset command with proper confirmation flow.

The confirmation prompt before destructive operations is good UX practice. The scope calculation handles both None and "__all__" correctly.


350-392: LGTM! Clean command dispatch logic.

The main function correctly handles argument parsing, conditional banner display, and command routing. The fallback to parser.print_help() when no command is specified is appropriate.

cortex/tutor/tools/deterministic/validators.py (7)

15-34: LGTM! Comprehensive security patterns for input validation.

The blocked patterns list covers common injection vectors (shell commands, fork bombs, Python code injection). This is a solid defense-in-depth approach for a CLI tool that might pass user input to an LLM.


36-48: LGTM! Well-defined validation patterns and categories.

The regex pattern for valid package names is appropriately restrictive while allowing common naming conventions (dots, hyphens, underscores). The package categories list provides good context for documentation.


51-93: LGTM! Thorough package name validation with clear error messages.

The function correctly handles empty input, length limits, blocked patterns, and format validation. The docstring includes helpful examples demonstrating expected behavior.


96-135: LGTM! Generic input validation with configurable parameters.

The function properly handles the allow_empty flag and applies length limits before checking blocked patterns.


138-210: LGTM! Specialized validators with appropriate constraints.

The validate_question, validate_topic, validate_score, and validate_learning_style functions properly delegate to base validators or implement specific rules. The score validation correctly handles both int and float types.


213-277: LGTM! Input sanitization and package name extraction utilities.

The sanitize_input function removes null bytes and enforces length limits. The extract_package_name function uses a reasonable set of patterns to extract package names from natural language input.


280-367: LGTM! Clean aggregation utilities with ValidationResult class.

The get_validation_errors and validate_all functions provide convenient batch validation. The ValidationResult class implements __bool__ and __str__ for ergonomic usage.

cortex/tutor/tools/deterministic/progress_tracker.py (5)

23-59: LGTM! Well-documented tool class with proper initialization.

The class docstring clearly explains the tool's purpose and constraints (no LLM calls). The initialization correctly handles optional db_path with fallback to config default.


61-113: LGTM! Clean action dispatcher with error handling.

The dispatcher pattern is well-implemented with a dictionary mapping actions to handlers. Unknown actions return helpful error messages with valid action lists.
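
The dispatcher shape being praised, sketched with two stand-in handlers (action names are illustrative):

from typing import Any, Callable


def sketch_run(action: str, **kwargs: Any) -> dict[str, Any]:
    handlers: dict[str, Callable[..., dict[str, Any]]] = {
        "get_progress": lambda **kw: {"success": True, "progress": None},
        "mark_completed": lambda **kw: {"success": True},
    }
    handler = handlers.get(action)
    if handler is None:
        # Unknown actions fail with a structured, self-documenting error
        return {
            "success": False,
            "error": f"Unknown action: {action}",
            "valid_actions": sorted(handlers),
        }
    try:
        return handler(**kwargs)
    except Exception as exc:
        # Failures stay structured instead of propagating to the caller
        return {"success": False, "error": str(exc)}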


119-164: LGTM! Progress retrieval handlers with proper validation.

Both _get_progress and _get_all_progress correctly handle missing required parameters and return structured responses.


166-213: LGTM! Progress update handlers with proper data preservation.

The _mark_completed and _update_progress handlers correctly preserve existing values when updating progress. The time accumulation logic in _update_progress is correctly implemented.


215-295: LGTM! Stats and profile management handlers.

The handlers correctly implement statistics retrieval, profile management, and progress reset functionality. Error handling is consistent across all methods.

cortex/tutor/tools/agentic/qa_handler.py (4)

59-116: LGTM! Robust error handling and response structuring.

The _run method properly wraps LLM calls in try-except, structures responses consistently, and includes cost estimation. The chain composition with prompt | self.llm | JsonOutputParser() is idiomatic LangChain.
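
For context, the LCEL composition looks roughly like this as a standalone sketch (the prompt text here is invented; the real prompts live in the tool, and running this requires ANTHROPIC_API_KEY in the environment):

from langchain_anthropic import ChatAnthropic
from langchain_core.output_parsers import JsonOutputParser
from langchain_core.prompts import ChatPromptTemplate

llm = ChatAnthropic(model="claude-sonnet-4-20250514", max_tokens=2048)
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You teach {package_name}. Answer as JSON with an 'answer' key."),
        ("human", "{question}"),
    ]
)

# The pipe operator chains runnables: prompt -> model -> parser,
# and the parser turns the model's JSON text into a Python dict
chain = prompt | llm | JsonOutputParser()
result = chain.invoke({"package_name": "docker", "question": "What is a layer?"})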


164-213: LGTM! Well-designed system and Q&A prompts.

The system prompt includes critical safety rules (no fabrication, no fake URLs, confidence expression). The Q&A prompt template defines a clear JSON schema for structured responses.


215-241: LGTM! Response structuring with validation.

The _structure_response method properly clamps confidence to [0.0, 1.0], limits array lengths for related topics and suggestions, and safely handles missing or malformed code examples.
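
A minimal sketch of that defensive shaping (field names and caps are assumptions based on this review, not the exact implementation):

from typing import Any


def sketch_structure_response(response: dict[str, Any]) -> dict[str, Any]:
    # Clamp confidence into [0.0, 1.0] and cap list lengths so a
    # malformed LLM reply cannot break downstream consumers
    confidence = float(response.get("confidence", 0.7))
    return {
        "answer": response.get("answer", ""),
        "confidence": max(0.0, min(1.0, confidence)),
        "related_topics": list(response.get("related_topics", []))[:5],
        "follow_up_suggestions": list(response.get("follow_up_suggestions", []))[:3],
    }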


36-57: Model name is valid and pricing estimate is reasonable.

The model identifier claude-sonnet-4-20250514 is confirmed as a valid current Anthropic model. Current pricing is $3.00 per 1M input tokens and $15.00 per 1M output tokens. With max_tokens set to 2048, the estimated cost of ~$0.02 per question is reasonable for typical usage patterns.

tests/tutor/test_branding.py (1)

229-249: LGTM! User input tests with proper mocking and assertions.

These tests correctly mock builtins.input and verify the expected behavior. The test at lines 245-249 verifies that raw input is returned, checking for "spaced" in the result rather than an exact match.

@srikrishnavansi
Contributor Author

@Anshgrover23 All checks are now passing! ✅

Summary of updates:

  • ✅ CLA signed (PR chore(cla): add Doddi Sri Krishna Vamsi to signers #570 merged)
  • ✅ Demo video added to PR description
  • ✅ AI Usage disclosure added
  • ✅ All CodeRabbit/Gemini review comments addressed
  • ✅ Ruff lint errors fixed (modern Python 3.10+ type annotations)
  • ✅ Tests moved to tests/tutor/ for CI coverage
  • ✅ All 1280+ tests passing (266 tutor + 1014 existing)
  • ✅ Coverage threshold met

Ready for review. Please take a look when you get a chance. Thanks!

Collaborator

@Anshgrover23 Anshgrover23 left a comment


@srikrishnavansi Your SonarQubeCloud Quality Gate is failing; kindly address those issues. Also address the CodeRabbit comments that are still open.

@Anshgrover23 Anshgrover23 marked this pull request as draft January 12, 2026 07:37
srikrishnavansi and others added 6 commits January 12, 2026 13:09
- Replace deprecated datetime.utcnow() with datetime.now(timezone.utc)
  in lesson_context.py and progress_context.py
- Fix upsert_progress to return actual row ID when lastrowid is 0
  (happens on ON CONFLICT UPDATE)
- Fix TOCTOU race condition in add_mastered_concept and add_weak_concept
  by using atomic read-modify-write within single connection

All 266 tests pass.
Fixes SonarQube security hotspot S5443 (publicly writable directories).

Changed all instances of hardcoded "/tmp/test.db" mock paths to use
`Path(tempfile.gettempdir()) / "test.db"` for secure temporary file
handling.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Fixes for 86 SonarQube issues:

Critical fixes:
- Extract SQL constant for duplicated "SELECT * FROM student_profile" query
- Extract constants for duplicated "none specified" and error messages
- Reduce cognitive complexity in cmd_progress() by extracting helper functions
- Fix type hints for sanitize_input() and extract_package_name() to accept None

Major fixes:
- Use pytest.approx() for all floating point equality checks
- Prefix unused function parameters with underscore (_verbose, _description)
- Prefix unused local variables with underscore (_result, _captured, _store)
- Replace unused error variables with underscore in test assertions

Files changed:
- cortex/tutor/memory/sqlite_store.py: Add _SELECT_PROFILE constant
- cortex/tutor/tools/agentic/qa_handler.py: Add _NONE_SPECIFIED constant
- cortex/tutor/tools/deterministic/progress_tracker.py: Add _ERR_PKG_TOPIC_REQUIRED
- cortex/tutor/tools/deterministic/validators.py: Fix type hints
- cortex/tutor/cli.py: Reduce cognitive complexity, fix unused parameters
- cortex/tutor/branding.py: Fix unused parameter
- All test files: Fix floating point comparisons and unused variables
Tests are now only in tests/tutor/ where CI runs them.
This fixes SonarQube's ~99% duplicated lines warning.
@srikrishnavansi srikrishnavansi marked this pull request as ready for review January 12, 2026 09:38
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @cortex/tutor/contracts/progress_context.py:
- Around line 163-179: The from_results constructor allows total=0 which will
violate the Pydantic constraint questions_total (ge=1) and cause a runtime
validation error; update QuizContext.from_results to validate the input total
(e.g., if total < 1: raise ValueError("total must be >= 1") or similar) before
computing score and instantiating the class so callers get a clear error instead
of a Pydantic validation exception.
🧹 Nitpick comments (16)
tests/tutor/test_tutor_agent.py (2)

187-194: Consider making the error assertion more specific.

The assertion len(result["errors"]) > 0 verifies that errors exist but doesn't validate the error content. Consider asserting on the specific error message or type.

💡 Suggested improvement
         result = load_cache_node(state)

-        assert len(result["errors"]) > 0
+        assert len(result["errors"]) == 1
+        assert "cache" in result["errors"][0]["error"].lower() or "missing" in result["errors"][0]["error"].lower()

7-9: Unused import detected.

MagicMock is imported but not used in this test file. Only Mock and patch are utilized.

🧹 Remove unused import
 import tempfile
 from pathlib import Path
-from unittest.mock import MagicMock, Mock, patch
+from unittest.mock import Mock, patch
cortex/tutor/branding.py (1)

101-114: Minor type hint style inconsistency.

The title parameter uses str | None (PEP 604 style), but Optional is imported from typing on line 7. While both are functionally equivalent, consider using consistent style throughout. Since the Optional import remains for get_user_input, this is fine but noted for awareness.

cortex/tutor/tools/deterministic/progress_tracker.py (2)

261-283: Dual parameter access pattern for concept.

The kwargs.get("concept") or concept pattern allows flexibility in how the concept is passed but may be confusing. Consider documenting this in the docstring.


304-349: Convenience functions create new tool instances per call.

Each convenience function instantiates a new ProgressTrackerTool, which creates a new SQLiteStore connection. This is acceptable for infrequent calls but could be inefficient for batch operations. Consider documenting this behavior or offering a shared instance pattern for performance-critical code paths.

cortex/tutor/tools/agentic/qa_handler.py (2)

1-17: Clean imports with proper library separation.

Imports are well-organized, separating LangChain, Pydantic, and internal modules. Note that Path from pathlib is imported but not used in this file.

🧹 Remove unused import
-from pathlib import Path
 from typing import Any

62-165: Consider extracting shared prompt/chain setup to reduce duplication.

The _run and _arun methods share identical prompt construction and response structuring logic. While this duplication is common in LangChain tools, extracting the shared parts could improve maintainability.

♻️ Suggested refactor
def _build_chain(self):
    """Build the QA chain (shared by sync and async)."""
    prompt = ChatPromptTemplate.from_messages(
        [
            ("system", self._get_system_prompt()),
            ("human", self._get_qa_prompt()),
        ]
    )
    return prompt | self.llm | JsonOutputParser()

def _build_invoke_params(self, package_name, question, learning_style, 
                         mastered_concepts, weak_concepts, lesson_context):
    """Build invocation parameters."""
    return {
        "package_name": package_name,
        "question": question,
        "learning_style": learning_style,
        "mastered_concepts": ", ".join(mastered_concepts or []) or self._NONE_SPECIFIED,
        "weak_concepts": ", ".join(weak_concepts or []) or self._NONE_SPECIFIED,
        "lesson_context": lesson_context or "starting fresh",
    }
tests/tutor/test_agentic_tools.py (2)

7-9: Unused import.

MagicMock is imported but not used; only Mock and patch are utilized in this test file.

🧹 Remove unused import
-from unittest.mock import MagicMock, Mock, patch
+from unittest.mock import Mock, patch

9-9: Unused import.

pytest is imported but not used in this test file. Tests rely on plain assertions.

🧹 Remove unused import
-import pytest
tests/tutor/test_validators.py (1)

165-180: Consider adding test for boolean edge case.

The tests cover numeric edge cases well, but given that Python's bool is a subclass of int, consider adding a test to document the expected behavior for validate_score(True) and validate_score(False).

💡 Suggested addition
def test_boolean_input(self):
    """Test boolean input behavior (bool is subclass of int)."""
    # Document expected behavior: booleans pass as 1 and 0
    is_valid_true, _ = validate_score(True)   # True == 1
    is_valid_false, _ = validate_score(False)  # False == 0
    # Adjust assertions based on intended behavior
    assert is_valid_true  # or not, if booleans should be rejected
    assert is_valid_false
tests/tutor/test_integration.py (2)

205-221: Consider adding basic output assertions.

The branding tests only verify that no exceptions are raised. While the comment explains Rich output complexity, consider adding at least one assertion that the output contains expected substrings (e.g., the message text) to catch regressions.

♻️ Suggested improvement
     def test_tutor_print_success(self, capsys):
         """Test tutor_print with success status."""
         tutor_print("Test message", "success")
-        _captured = capsys.readouterr()
-        # Rich console output is complex, just ensure no errors
+        captured = capsys.readouterr()
+        # Rich adds formatting, but message text should be present
+        assert "Test message" in captured.out or len(captured.out) > 0

7-10: Unused import: os.

The os module is imported but doesn't appear to be used in this file.

♻️ Suggested fix
-import os
 import tempfile
 from pathlib import Path
 from unittest.mock import MagicMock, Mock, patch
tests/tutor/test_progress_tracker.py (2)

290-350: Consider extracting repeated mock setup to a fixture.

The mock config pattern is duplicated across all three test methods. This could be extracted to a pytest fixture for DRYer tests.

♻️ Suggested improvement
@pytest.fixture
def mock_config_with_temp_db(temp_db):
    """Create a mocked config pointing to temp_db."""
    from unittest.mock import Mock, patch
    
    mock_config = Mock()
    mock_config.get_db_path.return_value = temp_db
    
    with patch(
        "cortex.tutor.tools.deterministic.progress_tracker.get_config",
        return_value=mock_config
    ):
        yield mock_config


class TestConvenienceFunctions:
    def test_get_learning_progress(self, mock_config_with_temp_db):
        """Test get_learning_progress function."""
        result = mark_topic_completed("docker", "basics", 0.85)
        assert result is True
        # ...

7-9: Unused import: os.

The os module is imported but not used in this file.

♻️ Suggested fix
-import os
 import tempfile
 from pathlib import Path
tests/tutor/test_tools.py (1)

277-309: The __new__ pattern is fragile for testing.

Using __new__ to bypass __init__ and manually setting history works but could break if ConversationHandler's initialization changes. Consider mocking the config at __init__ time instead for more robust tests.

♻️ Alternative approach
def test_build_context_empty(self):
    """Test context building with empty history."""
    with patch("cortex.tutor.tools.agentic.qa_handler.get_config") as mock_config:
        mock_config.return_value = Mock(
            anthropic_api_key="test_key",
            model="claude-sonnet-4-20250514",
        )
        with patch("cortex.tutor.tools.agentic.qa_handler.ChatAnthropic"):
            handler = ConversationHandler()
            handler.history = []  # Reset history after init
            
            context = handler._build_context()
            assert "Starting fresh" in context
cortex/tutor/memory/sqlite_store.py (1)

42-49: Consider using Field(default_factory=list) for list defaults.

While Pydantic handles = [] correctly in BaseModel fields (unlike regular Python classes), using Field(default_factory=list) is more explicit and consistent with the patterns used elsewhere in the codebase (e.g., in progress_context.py).

♻️ Suggested improvement
+from pydantic import BaseModel, Field
+
+
 class StudentProfile(BaseModel):
     """Model for student profile."""

     id: int | None = None
-    mastered_concepts: list[str] = []
-    weak_concepts: list[str] = []
+    mastered_concepts: list[str] = Field(default_factory=list)
+    weak_concepts: list[str] = Field(default_factory=list)
     learning_style: str = "reading"  # visual, reading, hands-on
     last_session: str | None = None
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a799d27 and 4ff4eb6.

📒 Files selected for processing (15)
  • cortex/tutor/branding.py
  • cortex/tutor/cli.py
  • cortex/tutor/contracts/lesson_context.py
  • cortex/tutor/contracts/progress_context.py
  • cortex/tutor/memory/sqlite_store.py
  • cortex/tutor/tools/agentic/qa_handler.py
  • cortex/tutor/tools/deterministic/progress_tracker.py
  • cortex/tutor/tools/deterministic/validators.py
  • tests/tutor/test_agentic_tools.py
  • tests/tutor/test_cli.py
  • tests/tutor/test_integration.py
  • tests/tutor/test_progress_tracker.py
  • tests/tutor/test_tools.py
  • tests/tutor/test_tutor_agent.py
  • tests/tutor/test_validators.py
🚧 Files skipped from review as they are similar to previous changes (3)
  • tests/tutor/test_cli.py
  • cortex/tutor/contracts/lesson_context.py
  • cortex/tutor/cli.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Follow PEP 8 style guide
Type hints required in Python code
Docstrings required for all public APIs

Files:

  • tests/tutor/test_validators.py
  • tests/tutor/test_integration.py
  • cortex/tutor/tools/deterministic/validators.py
  • cortex/tutor/contracts/progress_context.py
  • tests/tutor/test_agentic_tools.py
  • cortex/tutor/tools/agentic/qa_handler.py
  • cortex/tutor/tools/deterministic/progress_tracker.py
  • cortex/tutor/memory/sqlite_store.py
  • tests/tutor/test_progress_tracker.py
  • tests/tutor/test_tools.py
  • tests/tutor/test_tutor_agent.py
  • cortex/tutor/branding.py
tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Maintain >80% test coverage for pull requests

Files:

  • tests/tutor/test_validators.py
  • tests/tutor/test_integration.py
  • tests/tutor/test_agentic_tools.py
  • tests/tutor/test_progress_tracker.py
  • tests/tutor/test_tools.py
  • tests/tutor/test_tutor_agent.py
🧬 Code graph analysis (7)
tests/tutor/test_validators.py (1)
cortex/tutor/tools/deterministic/validators.py (11)
  • ValidationResult (323-345)
  • extract_package_name (239-277)
  • get_validation_errors (280-320)
  • sanitize_input (213-236)
  • validate_all (348-367)
  • validate_input (96-135)
  • validate_learning_style (195-210)
  • validate_package_name (51-93)
  • validate_question (138-148)
  • validate_score (176-192)
  • validate_topic (151-173)
tests/tutor/test_agentic_tools.py (3)
cortex/tutor/tools/agentic/lesson_generator.py (1)
  • LessonGeneratorTool (39-303)
cortex/tutor/tools/agentic/qa_handler.py (5)
  • _structure_response (218-244)
  • QAHandlerTool (19-244)
  • ConversationHandler (271-347)
  • _build_context (332-343)
  • clear_history (345-347)
cortex/tutor/tools/agentic/examples_provider.py (1)
  • ExamplesProviderTool (19-236)
cortex/tutor/tools/agentic/qa_handler.py (3)
cortex/tutor/config.py (1)
  • Config (18-120)
cortex/tutor/tools/agentic/lesson_generator.py (1)
  • Config (62-63)
cortex/tutor/agents/tutor_agent/tutor_agent.py (1)
  • ask (99-142)
cortex/tutor/tools/deterministic/progress_tracker.py (1)
cortex/tutor/memory/sqlite_store.py (14)
  • LearningProgress (19-28)
  • SQLiteStore (52-595)
  • StudentProfile (42-49)
  • get_progress (146-173)
  • get_all_progress (175-207)
  • mark_topic_completed (252-267)
  • upsert_progress (209-250)
  • get_completion_stats (269-298)
  • get_student_profile (352-371)
  • update_student_profile (403-428)
  • add_mastered_concept (430-464)
  • add_weak_concept (466-495)
  • reset_progress (564-582)
  • get_packages_studied (584-595)
cortex/tutor/memory/sqlite_store.py (5)
cortex/tutor/contracts/progress_context.py (1)
  • total_time_seconds (50-52)
cortex/tutor/agents/tutor_agent/tutor_agent.py (3)
  • get_progress (144-156)
  • reset_progress (200-211)
  • get_packages_studied (213-221)
cortex/tutor/tools/deterministic/progress_tracker.py (1)
  • mark_topic_completed (320-334)
cortex/hwprofiler.py (1)
  • profile (421-448)
cortex/tutor/tools/deterministic/lesson_loader.py (1)
  • cache_lesson (108-129)
tests/tutor/test_progress_tracker.py (3)
cortex/tutor/memory/sqlite_store.py (19)
  • LearningProgress (19-28)
  • QuizResult (31-39)
  • SQLiteStore (52-595)
  • StudentProfile (42-49)
  • mark_topic_completed (252-267)
  • upsert_progress (209-250)
  • get_progress (146-173)
  • get_all_progress (175-207)
  • get_completion_stats (269-298)
  • add_quiz_result (302-321)
  • get_quiz_results (323-348)
  • get_student_profile (352-371)
  • update_student_profile (403-428)
  • add_mastered_concept (430-464)
  • add_weak_concept (466-495)
  • cache_lesson (499-523)
  • get_cached_lesson (525-547)
  • reset_progress (564-582)
  • get_packages_studied (584-595)
cortex/tutor/tools/deterministic/progress_tracker.py (5)
  • ProgressTrackerTool (23-298)
  • get_learning_progress (304-317)
  • get_package_stats (337-349)
  • mark_topic_completed (320-334)
  • _run (64-116)
cortex/tutor/config.py (1)
  • get_db_path (100-108)
tests/tutor/test_tutor_agent.py (2)
cortex/tutor/tools/deterministic/progress_tracker.py (1)
  • _run (64-116)
cortex/tutor/config.py (1)
  • reset_config (143-146)
🔇 Additional comments (54)
tests/tutor/test_tutor_agent.py (4)

1-31: Well-structured test module with comprehensive imports.

The module is properly organized with clear docstrings and appropriate imports from the tutor agent modules.


33-109: Good coverage of state utility functions.

The tests properly cover the state management utilities with appropriate use of pytest.approx for floating-point comparisons. The test cases for critical vs recoverable errors ensure proper error classification.


235-284: Comprehensive routing tests covering all decision paths.

The routing tests properly verify each branch of the routing logic including cache, generation, Q&A, error, and success paths.


287-320: Good test isolation with config reset.

The integration tests properly use reset_config() for test isolation and validate both teach and ask input validation paths. Based on the relevant code snippet, reset_config() sets the global _config to None ensuring a clean slate between tests.

cortex/tutor/branding.py (6)

1-31: Well-organized module with clear Rich library integration.

The module properly sets up the Rich console instance and defines status configurations with appropriate emoji and color mappings. Type hints are used consistently.


34-68: Robust status handling with graceful fallback.

tutor_print properly falls back to "info" styling for unknown status types, preventing runtime errors.


142-158: Correct progress bar calculation with proper edge case handling.

The division by zero is properly handled, and the progress bar calculation correctly maps percentages to a 20-character visual representation.
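
The calculation in question, as a hedged sketch (width and glyphs are illustrative):

def sketch_progress_bar(completed: int, total: int, width: int = 20) -> str:
    # Guard the zero-total case before dividing
    pct = (completed / total * 100) if total else 0.0
    filled = int(width * pct / 100)
    return f"[{'█' * filled}{'░' * (width - filled)}] {pct:.0f}%"


print(sketch_progress_bar(5, 10))  # [██████████░░░░░░░░░░] 50%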


193-215: Robust input handling with proper interrupt management.

The function correctly handles EOFError and KeyboardInterrupt, providing a graceful cancellation experience for CLI users.


218-234: Reserved parameter correctly marked with underscore prefix.

The _description parameter is appropriately prefixed with an underscore to indicate it's reserved for future use, following Python conventions for intentionally unused parameters.


237-266: Consistent panel styling for error and success messages.

Both panel functions follow the same pattern with appropriate color theming for their respective purposes.

cortex/tutor/tools/deterministic/validators.py (5)

1-48: Comprehensive security-focused validation constants.

The blocked patterns provide good coverage against common injection and destructive command patterns. The package name pattern is appropriately restrictive.


51-93: Well-structured validation with correct ordering.

The validation follows a logical order: empty check, length check, security patterns, then format validation. The docstring includes helpful examples.


176-192: Potential edge case: boolean values pass the type check.

In Python, bool is a subclass of int, so isinstance(True, int) returns True. This means validate_score(True) would pass validation (treated as 1.0) and validate_score(False) would pass (treated as 0.0). Consider whether this is intended behavior.

If boolean inputs should be rejected:

💡 Suggested fix
 def validate_score(score: float) -> tuple[bool, str | None]:
-    if not isinstance(score, (int, float)):
+    if isinstance(score, bool) or not isinstance(score, (int, float)):
         return False, "Score must be a number"

239-277: Robust package name extraction with validation.

The function uses multiple patterns to handle various natural language inputs and properly validates candidates before returning. The lowercasing ensures consistent output.


323-367: Clean validation result API with intuitive boolean semantics.

The ValidationResult class provides a clean interface with __bool__ for easy conditional checks and __str__ for debugging/logging.

cortex/tutor/tools/deterministic/progress_tracker.py (3)

1-21: Proper imports with clear module dependencies.

The imports are well-organized, separating standard library, LangChain, Pydantic, and internal modules.


23-62: Well-structured tool class with proper Pydantic configuration.

The Field(exclude=True) correctly prevents the SQLiteStore from being serialized, and arbitrary_types_allowed enables the custom type usage.


64-116: Clean action dispatcher with comprehensive error handling.

The dispatcher pattern with a dictionary lookup is efficient and maintainable. The generic exception handler ensures failures return structured error responses.

cortex/tutor/tools/agentic/qa_handler.py (3)

45-60: Appropriate LLM configuration for Q&A use case.

The temperature of 0.1 provides slight creativity while maintaining consistency, suitable for educational Q&A responses.


167-216: Well-designed prompts with safety constraints and clear output structure.

The system prompt includes important guardrails against fabrication, and the expected JSON structure is clearly defined for consistent parsing.


332-343: Appropriate context window with truncation.

The sliding window of 3 exchanges with 100-character truncation prevents context from growing unbounded while maintaining conversational continuity.
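
The windowing idea, sketched with hypothetical history keys (the real entry structure may differ):

def sketch_build_context(
    history: list[dict[str, str]], window: int = 3, max_len: int = 100
) -> str:
    # Keep only the last `window` exchanges, truncating each side so
    # the context string cannot grow without bound
    if not history:
        return "Starting fresh"
    lines: list[str] = []
    for turn in history[-window:]:
        lines.append(f"Q: {turn['question'][:max_len]}")
        lines.append(f"A: {turn['answer'][:max_len]}")
    return "\n".join(lines)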

tests/tutor/test_agentic_tools.py (2)

12-73: Good coverage of LessonGeneratorTool structure methods.

Tests verify both full and minimal response scenarios, ensuring defaults are correctly applied when fields are missing.


140-197: Comprehensive ConversationHandler tests.

Tests cover empty history, history with entries, and clearing history. The mock setup properly isolates the tests from LLM calls.

tests/tutor/test_validators.py (4)

1-23: Well-organized test module with comprehensive imports.

All necessary validators and constants are imported for thorough testing.


26-89: Thorough package name validation tests with good edge case coverage.

Tests cover valid formats, boundary conditions, security patterns, and invalid characters. The descriptive assertion messages aid debugging.


199-243: Good sanitization and extraction tests.

Tests verify whitespace handling, null byte removal, truncation, and pattern-based package name extraction from natural language.


266-302: Integration tests for combined validation flows.

Tests verify both get_validation_errors aggregation and validate_all result packaging, ensuring the high-level APIs work correctly.

tests/tutor/test_integration.py (6)

24-29: LGTM!

Good use of autouse=True fixture to ensure clean state between tests. The reset before and after pattern properly isolates tests.


32-80: LGTM!

Good coverage of configuration edge cases including environment loading, API key validation, path expansion, and directory creation. The use of tempfile.TemporaryDirectory properly isolates filesystem tests.


83-154: LGTM!

Thorough contract tests for LessonContext. Good use of pytest.approx for floating-point comparisons. Tests cover creation, nested objects (CodeExample), serialization round-trip, and display formatting.


157-202: LGTM!

Good coverage of ProgressContext and PackageProgress contracts including computed fields (completion_percentage, average_score), helper methods (is_complete, get_next_topic), and recommendation generation.


224-295: LGTM!

Comprehensive CLI parser tests covering package arguments and all flags (-q, --list, --progress, --reset). The SystemExit assertions for --help and --version correctly validate argparse behavior.


298-364: LGTM!

Well-structured end-to-end test that validates the Plan→Act→Reflect workflow with mocked tools. Good practice to mock external dependencies while testing state transitions. The comment about manual API testing is appropriate.

cortex/tutor/contracts/progress_context.py (4)

13-20: LGTM!

Clean Pydantic model with appropriate field constraints (ge, le) and type hints. Good use of datetime | None for optional timestamps.


23-63: LGTM!

Well-implemented model with computed fields. Division-by-zero cases are properly guarded with if not self.topics checks. The is_complete() method correctly returns False for empty topic lists.


66-144: LGTM!

Comprehensive progress context with proper defaults and timezone-aware timestamps. The get_recommendations() method has sensible limits (3 weak concepts, 1 package suggestion). The to_summary_dict() provides useful display formatting.


1-10: LGTM!

Clean module structure with appropriate docstring and imports. Good use of Pydantic v2's computed_field decorator.

tests/tutor/test_progress_tracker.py (3)

27-44: LGTM!

Well-structured fixtures with proper cleanup via tempfile.TemporaryDirectory. The fixture chain (temp_dbstore/tracker) provides good test isolation.


47-211: LGTM!

Excellent test coverage for SQLiteStore including edge cases like expired cache (ttl_hours=0), upsert updates, and reset operations. Good use of pytest.approx for floating-point comparisons throughout.


214-287: LGTM!

Good coverage of the ProgressTrackerTool action-based API including success paths and error handling (unknown action, missing parameters).

tests/tutor/test_tools.py (6)

31-64: LGTM!

Good coverage of LessonLoaderTool cache behavior including miss, force fresh, and cache hit scenarios.


67-95: LGTM!

Thorough fallback lesson tests covering known packages, unknown packages, and case-insensitivity.


98-121: LGTM!

Good tests for the lesson loading hierarchy: cache → fallback → none.


124-204: LGTM!

Well-structured tests for LessonGeneratorTool with proper LLM mocking. Testing _structure_response directly is a good approach for verifying output formatting without requiring actual API calls.


207-241: LGTM!

Consistent test pattern for ExamplesProviderTool structure response formatting.


244-274: LGTM!

Good test coverage for QAHandlerTool response structure including code examples and related topics.

cortex/tutor/memory/sqlite_store.py (8)

62-103: LGTM!

Well-designed schema with appropriate UNIQUE constraints for upsert operations and indexes on frequently queried columns (package_name).


128-142: LGTM!

Thread-safe connection management using RLock. Creating a new connection per operation is appropriate for SQLite's usage patterns. The context manager ensures proper cleanup.
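
The connection pattern being described, as a sketch (class and schema details omitted; this is not the store's actual code):

import sqlite3
import threading
from contextlib import contextmanager


class SketchStore:
    def __init__(self, db_path: str) -> None:
        self.db_path = db_path
        # RLock is reentrant, so methods holding the lock can call
        # other locked methods without deadlocking
        self._lock = threading.RLock()

    @contextmanager
    def _connect(self):
        # One short-lived connection per operation suits SQLite;
        # the lock serializes in-process writers
        with self._lock:
            conn = sqlite3.connect(self.db_path)
            try:
                yield conn
                conn.commit()
            finally:
                conn.close()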


209-250: LGTM!

Solid upsert implementation with proper handling of the lastrowid=0 case when updating existing records. Parameterized queries prevent SQL injection.
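
A sketch of the lastrowid=0 handling (table and column names here are invented; it assumes a UNIQUE constraint on (package_name, topic)):

import sqlite3

_UPSERT = """
INSERT INTO progress (package_name, topic, score)
VALUES (?, ?, ?)
ON CONFLICT(package_name, topic) DO UPDATE SET score = excluded.score
"""


def sketch_upsert(conn: sqlite3.Connection, package: str, topic: str, score: float) -> int:
    cur = conn.execute(_UPSERT, (package, topic, score))
    if cur.lastrowid:  # INSERT branch: SQLite reports the new row id
        return cur.lastrowid
    # UPDATE branch: lastrowid is 0, so look the row up explicitly
    row = conn.execute(
        "SELECT id FROM progress WHERE package_name = ? AND topic = ?",
        (package, topic),
    ).fetchone()
    return row[0]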


302-348: LGTM!

Simple and correct quiz result storage and retrieval.


430-495: LGTM!

Atomic concept management with proper race condition handling. Good use of RLock to allow nested calls (e.g., _create_default_profile within add_mastered_concept). The logic correctly prevents a concept from being both mastered and weak.


499-560: LGTM!

Well-implemented TTL-based cache with proper expiration handling. ISO format timestamps work correctly for lexicographic comparison in SQLite.
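
The TTL check being described, sketched against a hypothetical cache table (it assumes all timestamps are written in the same UTC ISO format):

import sqlite3
from datetime import datetime, timedelta, timezone


def sketch_get_cached(conn: sqlite3.Connection, package: str, ttl_hours: int = 24):
    # ISO-8601 timestamps in a uniform UTC format sort lexicographically,
    # so a plain string comparison filters expired rows
    cutoff = (datetime.now(timezone.utc) - timedelta(hours=ttl_hours)).isoformat()
    row = conn.execute(
        "SELECT content FROM lesson_cache WHERE package_name = ? AND created_at >= ?",
        (package, cutoff),
    ).fetchone()
    return row[0] if row else None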


564-595: LGTM!

Clean utility methods for progress reset and package listing.


1-16: LGTM!

Well-structured module with comprehensive docstrings, type hints, and proper imports. The implementation follows PEP 8 and the coding guidelines.

@Anshgrover23
Copy link
Collaborator

@srikrishnavansi Why use LangChain?

PyYAML==6.0.3

# AI Tutor Dependencies (Issue #131)
langchain>=0.3.0,<2.0.0
Collaborator

Why do we need a LangChain dependency? I think we can do without it.

Contributor Author

Why LangChain was used:

LangChain was chosen initially for:

  1. LangGraph - For the Plan→Act→Reflect workflow pattern with state management
  2. ChatAnthropic integration - Structured prompts and output parsing
  3. Tool abstractions - BaseTool class for consistent tool interfaces

- Make student_level dynamic in graph.py based on mastered concepts
- Add validation for total >= 1 in QuizContext.from_results
- Move reset_config import to top of test_cli.py
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In @cortex/tutor/agents/tutor_agent/graph.py:
- Around line 330-403: The functions create_tutor_graph and get_tutor_graph
currently annotate return type as StateGraph but they actually return the
compiled graph; change their return annotations to
CompiledStateGraph[TutorAgentState, InputType] (or the exact generic
CompiledStateGraph type LangGraph exposes), import CompiledStateGraph from the
LangGraph package, and update the module-level _graph variable typing to
Optional[CompiledStateGraph[TutorAgentState, InputType]] so the types align with
the value returned by graph.compile().

In @cortex/tutor/contracts/progress_context.py:
- Around line 163-182: The from_results factory in QuizContext allows correct >
total which can yield score_percentage > 100 and trigger Pydantic errors; update
the method (QuizContext.from_results) to validate the inputs by checking that
correct is between 0 and total (e.g., if correct < 0 or correct > total: raise
ValueError("correct must be between 0 and total")) before computing score so the
error is raised early and yields a clear message.
🧹 Nitpick comments (3)
tests/tutor/test_cli.py (2)

263-339: Consider adding error handling test for consistency.

The TestCmdProgress class covers success paths well but lacks an error handling test similar to test_list_with_error in TestCmdListPackages. Adding a test that verifies cmd_progress returns 1 when Config.from_env raises an exception would improve consistency and coverage.

💡 Suggested test for error handling
@patch("cortex.tutor.cli.Config")
def test_progress_with_error(self, mock_config_class):
    """Test progress handles errors."""
    mock_config_class.from_env.side_effect = Exception("Test error")

    with patch("cortex.tutor.cli.print_error_panel"):
        result = cmd_progress()
        assert result == 1

429-434: Consider verifying arguments passed to mocked commands.

The test verifies mock_question.assert_called_once() but doesn't validate the arguments. Adding argument verification would strengthen the test.

💡 Suggested improvement
 @patch("cortex.tutor.cli.cmd_question")
 def test_question_command(self, mock_question):
     """Test question command."""
     mock_question.return_value = 0
     _result = main(["docker", "-q", "What is Docker?"])
-    mock_question.assert_called_once()
+    mock_question.assert_called_once_with("docker", "What is Docker?")
cortex/tutor/agents/tutor_agent/graph.py (1)

170-212: Minor: Missing checkpoint on exception path.

The success and error-result paths add a qa_complete checkpoint (line 205) or an error entry, but the exception handler (lines 209-210) only records the error without a corresponding checkpoint. This could make debugging harder when tracing workflow execution.

Proposed consistency fix
     except Exception as e:
         add_error(state, "qa", str(e))
+        add_checkpoint(state, "qa_complete", "error", str(e))

     return state
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4ff4eb6 and 2fab6a5.

📒 Files selected for processing (3)
  • cortex/tutor/agents/tutor_agent/graph.py
  • cortex/tutor/contracts/progress_context.py
  • tests/tutor/test_cli.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Follow PEP 8 style guide
Type hints required in Python code
Docstrings required for all public APIs

Files:

  • tests/tutor/test_cli.py
  • cortex/tutor/agents/tutor_agent/graph.py
  • cortex/tutor/contracts/progress_context.py
tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Maintain >80% test coverage for pull requests

Files:

  • tests/tutor/test_cli.py
🧬 Code graph analysis (1)
tests/tutor/test_cli.py (3)
cortex/tutor/cli.py (7)
  • cmd_list_packages (213-243)
  • cmd_progress (289-313)
  • cmd_question (155-210)
  • cmd_reset (316-348)
  • cmd_teach (112-152)
  • create_parser (35-109)
  • main (351-389)
cortex/tutor/config.py (2)
  • reset_config (143-146)
  • from_env (59-88)
cortex/tutor/memory/sqlite_store.py (4)
  • get_packages_studied (584-595)
  • get_completion_stats (269-298)
  • get_all_progress (175-207)
  • reset_progress (564-582)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Test (Python 3.12)
  • GitHub Check: test (3.12)
  • GitHub Check: test (3.10)
  • GitHub Check: test (3.11)
🔇 Additional comments (19)
tests/tutor/test_cli.py (7)

1-25: Clean module setup with appropriate imports.

The module structure is well-organized with clear separation of imports and a descriptive docstring. The import of reset_config ensures proper test isolation between tests that manipulate the config singleton.


27-90: Comprehensive parser tests with good coverage.

The TestCreateParser class thoroughly validates all CLI arguments including edge cases like --reset defaulting to "__all__" when used without a package. Each test is focused and tests a single behavior.


92-146: Good coverage of teach command error paths.

The tests correctly handle the lazy import pattern by patching at the definition site (cortex.tutor.agents.tutor_agent.InteractiveTutor). The coverage of ValueError and KeyboardInterrupt scenarios ensures graceful error handling is verified.


218-261: Well-structured tests for list packages command.

The mock setup correctly patches SQLiteStore and Config at the import site in cli.py. The tests cover the empty state, populated state, and error handling paths appropriately.


341-390: Good coverage of reset confirmation flow.

The tests properly verify both cancellation and confirmation paths, including the per-package reset scenario. The mock setup for get_user_input effectively tests the user confirmation flow.


436-443: LGTM!

The teach command test correctly verifies both the command dispatch and banner display for interactive mode. The use of _result appropriately signals the return value is intentionally unused.


198-201: No action required. The mock patch target is correct.

The patch targets cortex.tutor.branding.print_code_example at its definition site, which is the correct approach here. The print_code_example function is imported locally within a conditional block inside the cmd_question function (line 189), not at module level. When the patch is applied before the test calls cmd_question(), the local import that occurs later retrieves the already-patched version from cortex.tutor.branding, so the mock intercepts the call correctly.

cortex/tutor/contracts/progress_context.py (4)

1-10: LGTM!

Module docstring is present and imports are properly organized. Type hints use modern Python 3.10+ syntax (| union, list[], dict[]).


13-20: LGTM!

Well-defined model with appropriate field constraints (ge, le) and descriptive documentation.


23-63: LGTM!

Computed fields are correctly implemented with @computed_field and @property decorators. Edge cases (empty topics list) are handled appropriately. The is_complete() method correctly returns False for empty topic lists, which is a sensible semantic choice.


66-144: LGTM!

Well-structured progress context with comprehensive helper methods. The get_recommendations() method wisely limits suggestions to 3 weak concepts and one incomplete package for actionable output.

cortex/tutor/agents/tutor_agent/graph.py (8)

1-24: LGTM!

Module docstring and imports are well-organized. The import of state helpers (add_checkpoint, add_cost, add_error, etc.) promotes clean separation of concerns.


28-85: LGTM!

The planning logic correctly prioritizes: Q&A mode → cache lookup → fresh generation. The deterministic cache-first approach with force_fresh escape hatch is a sensible cost-optimization pattern.


88-107: LGTM!

Cache loading is straightforward. The recoverable=True error flag appropriately indicates this isn't a fatal failure.


110-119: LGTM!

Clean helper function with sensible thresholds for inferring student level from mastered concepts.


122-167: LGTM!

Good implementation with dynamic student level inference (as mentioned in commit message). The exception handling properly records errors for downstream processing.


215-264: LGTM!

Well-structured reflection phase. The confidence scoring logic (error penalty + cache staleness cap) provides useful signal for downstream consumers.


267-288: LGTM!

Error output structure mirrors the success output from reflect_node, ensuring consistent downstream handling.


294-324: LGTM!

Routing logic is clear with proper error-first checks. The Literal return type hints provide good documentation of valid routes.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

🤖 Fix all issues with AI agents
In @cortex/tutor/tools/agentic/qa_handler.py:
- Around line 217-243: In _structure_response, validate and coerce fields coming
from response before using them: ensure related_topics is a list (if it's a
string or other type, wrap into [str] or use empty list) before slicing, ensure
follow_up_suggestions is a list likewise, and ensure confidence is converted to
a float with a fallback (parse numeric strings, clamp to [0.0,1.0], default 0.7)
to avoid TypeError from min/max; also harden handling of code_example so that if
response["code_example"] is not a dict but a string or list you normalize it
into a dict with keys "code", "language" (default "bash"), and "description",
and always defensive-get values (use isinstance checks and sensible defaults)
before assigning into structured.
- Around line 270-346: ConversationHandler.__init__ eagerly instantiates
QAHandlerTool() which can fail if API keys aren't set; change to lazy-init by
replacing the immediate creation in __init__ with self.qa_tool = None (or
similar) and instantiate QAHandlerTool() inside ask() if self.qa_tool is falsy
(use QAHandlerTool() right before calling _run); additionally bound the
conversation history inside ask() or _build_context() (e.g., truncate
self.history to the last N entries after appending) to prevent unbounded growth;
update references to self.qa_tool in ask() to handle the lazy-initialized
instance.
- Around line 61-165: Both `_run` and `_arun` duplicate the same logic and
return a `cost_gbp` field while the docstring and value imply USD; extract the
shared logic into a single helper (e.g., a private method like
`_handle_question_common` or `_invoke_qa_chain`) that: validates inputs with
validate_package_name() and validate_question() from
cortex.tutor.tools.deterministic.validators, builds the prompt
(ChatPromptTemplate.from_messages with self._get_system_prompt and
self._get_qa_prompt), creates the chain (prompt | self.llm |
JsonOutputParser()), invokes the chain using a passed-in invoker (sync invoke vs
await chain.ainvoke), calls self._structure_response(result, package_name,
question), and returns the structured dict; then make `_run` call the sync
invoker helper and `_arun` await the same helper or a thin async wrapper. Also
rename the returned field `cost_gbp` to `cost_usd` in all return paths (both
success and error responses where applicable).
🧹 Nitpick comments (10)
tests/tutor/test_progress_tracker.py (3)

266-269: Consider verifying the concept was actually stored.

The test only asserts result["success"] but doesn't verify that "docker basics" was actually added to the mastered concepts. Consider adding a follow-up assertion to confirm persistence.

♻️ Suggested improvement
     def test_add_mastered_concept_action(self, tracker):
         """Test add_mastered action."""
         result = tracker._run("add_mastered", concept="docker basics")
         assert result["success"]
+
+        # Verify concept was actually stored
+        profile = tracker._run("get_profile")
+        assert "docker basics" in profile["profile"]["mastered_concepts"]

271-275: Consider verifying the reset actually cleared the progress.

Similar to the previous comment, the test asserts success but doesn't confirm the progress was actually removed.

♻️ Suggested improvement
     def test_reset_action(self, tracker):
         """Test reset action."""
         tracker._run("mark_completed", package_name="docker", topic="basics")
         result = tracker._run("reset", package_name="docker")
         assert result["success"]
+
+        # Verify progress was actually cleared
+        progress = tracker._run("get_progress", package_name="docker", topic="basics")
+        assert progress["progress"] is None

292-349: Consider hoisting repeated imports to the top of the class or file.

The from unittest.mock import Mock, patch import is repeated in each test method (lines 294, 315, 333). While functional, hoisting to module level would improve readability and slightly reduce overhead.

♻️ Suggested improvement

Add to the top-level imports:

from unittest.mock import Mock, patch

Then remove the duplicate imports from each test method.

tests/tutor/test_tutor_agent.py (5)

1-31: Add minimal type hints to comply with repo guidelines (**/*.py: type hints required).
Right now most test methods have no return type annotation; consider at least -> None for each test method (and optionally Any for mocks where needed).

Proposed diff (illustrative)
 class TestTutorAgentState:
     """Tests for TutorAgentState and state utilities."""

-    def test_create_initial_state(self):
+    def test_create_initial_state(self) -> None:
         """Test creating initial state."""
         state = create_initial_state(
             package_name="docker",
             session_type="lesson",
         )

115-173: Strengthen plan_node cache-hit assertions to ensure the cache path can actually succeed.
route_after_plan() will choose "load_cache" when strategy is "use_cache", but this test doesn’t verify that plan_node() also persisted the cached lesson into state["plan"]["cached_data"] (which load_cache_node() expects).

Proposed diff
         result = plan_node(state)

         assert result["plan"]["strategy"] == "use_cache"
         assert result["cache_hit"] is True
+        assert result["plan"].get("cached_data") == {"summary": "Cached lesson"}

187-195: Assert the specific error shape for missing cache data (not just “some error exists”).
This test will pass even if the wrong node emits the error or it’s marked non-recoverable; tightening it makes regressions easier to diagnose.

Proposed diff
         result = load_cache_node(state)

         assert len(result["errors"]) > 0
+        assert result["errors"][-1]["node"] == "load_cache"
+        assert result["errors"][-1]["recoverable"] is True

270-277: Use a non-empty results["content"] for the “success” routing case (or rename the test).
Right now content is {} (which reflect_node() treats as “No content generated”), so the “success” here is only “has a results object”, not “has valid results”. Consider using a minimal non-empty payload to better match intent.

Proposed diff
         state = create_initial_state("docker")
-        state["results"] = {"type": "lesson", "content": {}}
+        state["results"] = {"type": "lesson", "content": {"summary": "ok"}}

290-320: Make validation tests explicitly prove “fail-fast before graph usage”.
Since you patch get_tutor_graph, you can assert it was not called when inputs are invalid; this both hardens the contract and prevents future flakiness if graph init becomes heavier.

Proposed diff
     def test_teach_validation(self, mock_graph):
         """Test teach validates package name."""
         from cortex.tutor.agents.tutor_agent import TutorAgent
         from cortex.tutor.config import reset_config

         reset_config()

         with pytest.raises(ValueError) as exc_info:
             agent = TutorAgent()
             agent.teach("")

         assert "Invalid package name" in str(exc_info.value)
+        mock_graph.assert_not_called()
cortex/tutor/tools/agentic/qa_handler.py (2)

246-267: Avoid calling the private _run directly in answer_question.

Prefer the public BaseTool entrypoint (invoke/run, depending on the LangChain version) so callbacks/tracing/config behave consistently.


166-216: Load prompts from the existing source-of-truth file to avoid divergence.

The file cortex/tutor/prompts/tools/qa_handler.md exists and contains a more comprehensive, layered prompt structure than the inlined versions in _get_system_prompt() and _get_qa_prompt(). Adopt the pattern already used in lesson_generator.py by loading prompts from the file with a fallback, similar to:

def _load_prompt_template(filename: str) -> str:
    """Load prompt from file with fallback."""
    prompt_path = Path(__file__).parent.parent.parent / "prompts" / "tools" / filename
    if prompt_path.exists():
        return prompt_path.read_text()
    return "fallback inline prompt"

This reduces maintenance burden and prevents the inlined prompts from drifting out of sync with the source-of-truth file.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2fab6a5 and 79bc930.

📒 Files selected for processing (5)
  • cortex/tutor/tools/agentic/qa_handler.py
  • tests/tutor/test_agentic_tools.py
  • tests/tutor/test_integration.py
  • tests/tutor/test_progress_tracker.py
  • tests/tutor/test_tutor_agent.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • tests/tutor/test_agentic_tools.py
  • tests/tutor/test_integration.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Follow PEP 8 style guide
Type hints required in Python code
Docstrings required for all public APIs

Files:

  • tests/tutor/test_tutor_agent.py
  • tests/tutor/test_progress_tracker.py
  • cortex/tutor/tools/agentic/qa_handler.py
tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Maintain >80% test coverage for pull requests

Files:

  • tests/tutor/test_tutor_agent.py
  • tests/tutor/test_progress_tracker.py
🧬 Code graph analysis (2)
tests/tutor/test_tutor_agent.py (2)
cortex/tutor/agents/tutor_agent/graph.py (5)
  • fail_node (267-288)
  • load_cache_node (88-107)
  • reflect_node (215-264)
  • route_after_act (313-324)
  • route_after_plan (294-310)
cortex/tutor/agents/tutor_agent/state.py (8)
  • TutorAgentState (50-104)
  • add_checkpoint (174-192)
  • add_cost (195-204)
  • add_error (153-171)
  • create_initial_state (107-150)
  • get_package_name (234-244)
  • get_session_type (221-231)
  • has_critical_error (207-218)
cortex/tutor/tools/agentic/qa_handler.py (2)
cortex/tutor/config.py (2)
  • get_config (127-140)
  • Config (18-120)
cortex/tutor/agents/tutor_agent/tutor_agent.py (1)
  • ask (99-142)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Build Package
  • GitHub Check: test (3.11)
  • GitHub Check: test (3.10)
  • GitHub Check: test (3.12)
🔇 Additional comments (5)
tests/tutor/test_progress_tracker.py (3)

1-23: LGTM!

Imports are well-organized and the module docstring adequately describes the test file's purpose.


26-43: LGTM!

Fixtures are well-designed with proper cleanup via tempfile.TemporaryDirectory context manager. The separation of concerns between temp_db, store, and tracker fixtures allows flexible test composition.


46-210: LGTM!

Comprehensive test coverage for SQLiteStore with clear test method names and appropriate assertions. The approach of using ttl_hours=0 for testing expired cache behavior (line 186) is elegant. All test methods include docstrings as required.

tests/tutor/test_tutor_agent.py (1)

115-173: No action needed - mocks are correctly aligned with implementation.

The tests properly stub mock_tool._run(), which matches how plan_node() actually invokes both tools:

  • Line 45: progress_tool._run("get_profile")
  • Line 63: loader._run(package_name)

The mocks will be exercised as expected.

cortex/tutor/tools/agentic/qa_handler.py (1)

7-16: The top-level LangChain imports in this module are not a concern. The cmd_list_packages() and cmd_progress() functions do not import cortex.tutor.tools or qa_handler, so there is no risk of import failures during "list/progress" operations. Additionally, LangChain is listed as a core dependency in requirements.txt, not an optional dependency. The existing code already uses lazy imports for TutorAgent and InteractiveTutor, which are the only entry points that transitively import this module.

Likely an incorrect or invalid review comment.

Comment on lines 61 to 165
    def _run(
        self,
        package_name: str,
        question: str,
        learning_style: str = "reading",
        mastered_concepts: list[str] | None = None,
        weak_concepts: list[str] | None = None,
        lesson_context: str | None = None,
    ) -> dict[str, Any]:
        """
        Answer a user question about a package.

        Args:
            package_name: Current package context.
            question: The user's question.
            learning_style: User's learning preference.
            mastered_concepts: Concepts user has mastered.
            weak_concepts: Concepts user struggles with.
            lesson_context: What they've learned so far.

        Returns:
            Dict containing the answer and related info.
        """
        try:
            prompt = ChatPromptTemplate.from_messages(
                [
                    ("system", self._get_system_prompt()),
                    ("human", self._get_qa_prompt()),
                ]
            )

            chain = prompt | self.llm | JsonOutputParser()

            result = chain.invoke(
                {
                    "package_name": package_name,
                    "question": question,
                    "learning_style": learning_style,
                    "mastered_concepts": ", ".join(mastered_concepts or []) or self._NONE_SPECIFIED,
                    "weak_concepts": ", ".join(weak_concepts or []) or self._NONE_SPECIFIED,
                    "lesson_context": lesson_context or "starting fresh",
                }
            )

            answer = self._structure_response(result, package_name, question)

            return {
                "success": True,
                "answer": answer,
                "cost_gbp": 0.02,
            }

        except Exception as e:
            return {
                "success": False,
                "error": str(e),
                "answer": None,
            }

    async def _arun(
        self,
        package_name: str,
        question: str,
        learning_style: str = "reading",
        mastered_concepts: list[str] | None = None,
        weak_concepts: list[str] | None = None,
        lesson_context: str | None = None,
    ) -> dict[str, Any]:
        """Async version of Q&A handling."""
        try:
            prompt = ChatPromptTemplate.from_messages(
                [
                    ("system", self._get_system_prompt()),
                    ("human", self._get_qa_prompt()),
                ]
            )

            chain = prompt | self.llm | JsonOutputParser()

            result = await chain.ainvoke(
                {
                    "package_name": package_name,
                    "question": question,
                    "learning_style": learning_style,
                    "mastered_concepts": ", ".join(mastered_concepts or []) or self._NONE_SPECIFIED,
                    "weak_concepts": ", ".join(weak_concepts or []) or self._NONE_SPECIFIED,
                    "lesson_context": lesson_context or "starting fresh",
                }
            )

            answer = self._structure_response(result, package_name, question)

            return {
                "success": True,
                "answer": answer,
                "cost_gbp": 0.02,
            }

        except Exception as e:
            return {
                "success": False,
                "error": str(e),
                "answer": None,
            }

Contributor

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's locate and examine the qa_handler.py file
find . -name "qa_handler.py" -type f

Repository: cortexlinux/cortex

Length of output: 104


🏁 Script executed:

# Check the structure of the cortex directory
git ls-files | grep -E "(qa_handler|tutor_agent)" | head -20

Repository: cortexlinux/cortex

Length of output: 348


🏁 Script executed:

# Get the full content of qa_handler.py to see docstrings and actual code
cat -n cortex/tutor/tools/agentic/qa_handler.py | head -200

Repository: cortexlinux/cortex

Length of output: 7925


🏁 Script executed:

# Look for the tutor_agent.py file and check for validation functions
find . -path "*tutor_agent*" -name "*.py" -type f

Repository: cortexlinux/cortex

Length of output: 276


🏁 Script executed:

# Check the tutor_agent.py file around lines 98-141 for validation functions
cat -n cortex/tutor/agents/tutor_agent/tutor_agent.py | sed -n '90,150p'

Repository: cortexlinux/cortex

Length of output: 2113


🏁 Script executed:

# Search for validation functions in tutor_agent.py
rg "def validate" cortex/tutor/agents/tutor_agent/tutor_agent.py -A 5

Repository: cortexlinux/cortex

Length of output: 44


🏁 Script executed:

# Check if qa_handler.py uses any validation functions currently
rg "validate" cortex/tutor/tools/agentic/qa_handler.py

Repository: cortexlinux/cortex

Length of output: 116


🏁 Script executed:

# Check the complete _run and _arun methods to see current validation
cat -n cortex/tutor/tools/agentic/qa_handler.py | sed -n '61,165p'

Repository: cortexlinux/cortex

Length of output: 4262


🏁 Script executed:

# Find where validate_package_name and validate_question are defined
rg "def validate_package_name|def validate_question" -A 3

Repository: cortexlinux/cortex

Length of output: 897


🏁 Script executed:

# Check if these functions are exported from a utils module
git ls-files | xargs grep -l "validate_package_name\|validate_question" | head -10

Repository: cortexlinux/cortex

Length of output: 303


🏁 Script executed:

# Check what validators.py contains and verify the validation functions
cat -n cortex/tutor/tools/deterministic/validators.py

Repository: cortexlinux/cortex

Length of output: 12997


🏁 Script executed:

# Check if tutor_agent.py imports these validators
rg "from.*validators|import.*validate" cortex/tutor/agents/tutor_agent/tutor_agent.py

Repository: cortexlinux/cortex

Length of output: 119


🏁 Script executed:

# Check if qa_handler.py has any imports for validators
cat -n cortex/tutor/tools/agentic/qa_handler.py | head -30

Repository: cortexlinux/cortex

Length of output: 1053


🏁 Script executed:

# Check if there are any other usages of cost_gbp or cost_usd in the codebase
rg "cost_gbp|cost_usd" --type py

Repository: cortexlinux/cortex

Length of output: 5180


Deduplicate _run/_arun, add input validation, and fix cost currency notation.

  • _run and _arun are nearly identical; extract the common logic into a helper to prevent drift.
  • Add input validation using validate_package_name() and validate_question() from cortex.tutor.tools.deterministic.validators to match the pattern used in tutor_agent.py (lines 118-124) and ensure consistent behavior.
  • The docstring states "Cost: ~$0.02" (USD notation) but the returned field is cost_gbp (GBP notation). This is inconsistent. Change the field to cost_usd to align with the docstring.
Currency field correction
@@
-                "cost_gbp": 0.02,
+                "cost_usd": 0.02,
@@
-                "cost_gbp": 0.02,
+                "cost_usd": 0.02,
🤖 Prompt for AI Agents
In @cortex/tutor/tools/agentic/qa_handler.py around lines 61-165, both `_run`
and `_arun` duplicate the same logic and return a `cost_gbp` field while the
docstring and value imply USD; extract the shared logic into a single helper
(e.g., a private method like `_handle_question_common` or `_invoke_qa_chain`)
that: validates inputs with validate_package_name() and validate_question() from
cortex.tutor.tools.deterministic.validators, builds the prompt
(ChatPromptTemplate.from_messages with self._get_system_prompt and
self._get_qa_prompt), creates the chain (prompt | self.llm |
JsonOutputParser()), invokes the chain using a passed-in invoker (sync invoke vs
await chain.ainvoke), calls self._structure_response(result, package_name,
question), and returns the structured dict; then make `_run` call the sync
invoker helper and `_arun` await the same helper or a thin async wrapper. Also
rename the returned field `cost_gbp` to `cost_usd` in all return paths (both
success and error responses where applicable).

- Use Pydantic v2 model_post_init pattern instead of __init__
- Implement lazy LLM initialization with _get_llm()
- Add input validation using validate_package_name and validate_question
- Extract common logic into _build_chain and _build_invoke_params
- Harden _structure_response to handle non-conforming LLM JSON types
- Lazy-init QAHandlerTool in ConversationHandler
- Bound conversation history to prevent unbounded growth
- Update tests to reflect new lazy initialization pattern

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
tests/tutor/test_tools.py (1)

277-309: LGTM! ConversationHandler context building tested effectively.

Using __new__ to bypass __init__ is a valid pattern for testing internal methods in isolation. The tests cover empty history, populated history, and clearing behavior.

Note: The get_config patches aren't strictly necessary for these tests since _build_context doesn't access the config, but they don't cause harm.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 79bc930 and 24bd578.

📒 Files selected for processing (5)
  • cortex/tutor/agents/tutor_agent/graph.py
  • cortex/tutor/contracts/progress_context.py
  • cortex/tutor/tools/agentic/qa_handler.py
  • tests/tutor/test_agentic_tools.py
  • tests/tutor/test_tools.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/tutor/test_agentic_tools.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Follow PEP 8 style guide
Type hints required in Python code
Docstrings required for all public APIs

Files:

  • cortex/tutor/contracts/progress_context.py
  • cortex/tutor/tools/agentic/qa_handler.py
  • cortex/tutor/agents/tutor_agent/graph.py
  • tests/tutor/test_tools.py
tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Maintain >80% test coverage for pull requests

Files:

  • tests/tutor/test_tools.py
🧬 Code graph analysis (2)
cortex/tutor/tools/agentic/qa_handler.py (2)
cortex/tutor/config.py (2)
  • get_config (127-140)
  • Config (18-120)
cortex/tutor/tools/deterministic/validators.py (2)
  • validate_package_name (51-93)
  • validate_question (138-148)
tests/tutor/test_tools.py (2)
cortex/tutor/tools/agentic/qa_handler.py (5)
  • ConversationHandler (351-433)
  • QAHandlerTool (24-320)
  • _structure_response (262-320)
  • _build_context (418-429)
  • clear_history (431-433)
cortex/tutor/tools/deterministic/lesson_loader.py (3)
  • LessonLoaderTool (18-157)
  • get_fallback_lesson (235-245)
  • load_lesson_with_fallback (248-284)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: test (3.10)
  • GitHub Check: security
  • GitHub Check: test (3.12)
  • GitHub Check: test (3.11)
  • GitHub Check: Security Scan
🔇 Additional comments (25)
cortex/tutor/contracts/progress_context.py (4)

1-21: LGTM! Well-structured Pydantic models with proper validation.

The TopicProgress model correctly uses Field constraints (ge, le) for validation, and the union type datetime | None follows modern Python 3.10+ conventions. Type hints and docstrings are present as required.


23-64: LGTM! Computed fields handle edge cases correctly.

The PackageProgress model properly guards against division by zero in computed fields, and the is_complete() method correctly returns False for empty topic lists. The implementation is clean and well-documented.


66-144: LGTM! Comprehensive progress tracking with useful accessors.

The ProgressContext model provides well-designed methods for progress aggregation and recommendations. The get_recommendations() method sensibly limits output to avoid overwhelming users. Type hints and docstrings are complete.


147-186: LGTM! Factory method with proper cross-field validation.

The from_results classmethod correctly validates the relationship between correct and total parameters, which cannot be enforced by Field constraints alone. The 70% passing threshold and default feedback generation are sensible defaults.

cortex/tutor/tools/agentic/qa_handler.py (7)

24-68: LGTM! Lazy LLM initialization pattern is well-implemented.

The lazy initialization in _get_llm() correctly defers API key requirements until actual usage, supporting the PR's goal of allowing listing/progress commands without an API key. The model_post_init hook properly respects user-specified model names while applying config defaults.


70-112: LGTM! Clean separation between validation and chain building.

Input validation correctly delegates to the deterministic validators module, maintaining consistency across the codebase. The chain building follows standard LangChain patterns with proper prompt template construction.


114-209: LGTM! Consistent sync/async implementations with proper error handling.

Both _run and _arun methods follow the same validation-first pattern and return consistent response structures. The error handling provides a stable API surface while capturing exception details for debugging.


211-260: LGTM! Well-designed prompts with safety guardrails.

The system prompt includes appropriate constraints against fabrication and fake URLs. The JSON schema in the QA prompt provides clear structure for response parsing. The double-brace escaping for literal JSON in the template is correct.
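
As a side note for readers, the escaping rule being praised works like this — a standalone illustration, not the actual prompt from this PR:

from langchain_core.prompts import ChatPromptTemplate

# Single braces mark template variables; doubled braces emit literal braces,
# which is how a JSON schema survives inside a LangChain prompt template.
prompt = ChatPromptTemplate.from_messages([
    ("system", "Answer in JSON."),
    ("human", 'Q: {question}\nRespond as: {{"answer": "...", "confidence": 0.0}}'),
])

msg = prompt.format_messages(question="What does docker ps do?")[1]
print(msg.content)
# Q: What does docker ps do?
# Respond as: {"answer": "...", "confidence": 0.0}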


262-320: LGTM! Robust response structuring with comprehensive type validation.

The _structure_response method demonstrates excellent defensive programming: type checking before operations, truncation to prevent unbounded lists, confidence clamping, and graceful handling of both dict and string code examples. This ensures stable output regardless of LLM response variations.


347-433: LGTM! ConversationHandler with proper history bounding.

The _MAX_HISTORY_SIZE = 50 limit and truncation to last 3 exchanges for context building effectively prevent unbounded memory growth and token bloat. Lazy initialization of qa_tool aligns with the pattern used throughout the module.
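
In isolation, the bounding pattern described above amounts to something like this sketch (the constant matches the reported limit; the class and method names are simplified stand-ins, not the real ConversationHandler):

_MAX_HISTORY_SIZE = 50  # the limit reported above

class BoundedHistory:
    """Simplified stand-in for ConversationHandler's history handling."""

    def __init__(self) -> None:
        self.history: list[dict[str, str]] = []

    def record(self, question: str, answer: str) -> None:
        self.history.append({"question": question, "answer": answer})
        # Trim to the newest N entries so memory never grows without bound.
        self.history = self.history[-_MAX_HISTORY_SIZE:]

    def context(self) -> str:
        # Only the last 3 exchanges feed the prompt, keeping token usage flat.
        recent = self.history[-3:]
        return "\n".join(f"Q: {h['question']}\nA: {h['answer'][:200]}" for h in recent)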


323-344: LGTM! Clean convenience function.

The answer_question helper provides a simplified interface for one-off questions. Creating a new tool instance per call is appropriate for this stateless use case.

cortex/tutor/agents/tutor_agent/graph.py (8)

29-86: LGTM! Well-structured planning phase with clear strategy selection.

The plan_node correctly implements the hybrid approach: Q&A mode bypasses caching, cache hits avoid LLM costs, and fresh generation is the fallback. The checkpoint logging provides good observability.


89-108: LGTM! Simple cache loading with recoverable error handling.

The load_cache_node correctly marks missing cache data as a recoverable error, allowing the workflow to potentially fall back to generation.


111-168: LGTM! Lesson generation with caching and flexible level inference.

The _infer_student_level helper provides sensible defaults based on mastery count. The generate_lesson_node properly caches generated content to reduce future API costs and handles both API errors and exceptions.
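
The exact cutoffs live in graph.py; the thresholds below are illustrative only, but they show the shape of the mastery-count heuristic:

def infer_student_level(mastered_concepts: list[str]) -> str:
    """Map mastery count to a coarse level; cutoffs here are made up."""
    count = len(mastered_concepts)
    if count == 0:
        return "beginner"
    if count < 5:
        return "intermediate"
    return "advanced"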


171-213: LGTM! Q&A node with proper input validation.

The qa_node correctly validates question presence before invoking the LLM. The non-recoverable error for missing questions is appropriate since Q&A mode requires a question.


216-265: LGTM! Comprehensive reflection phase with validation and confidence scoring.

The reflect_node properly validates results and calculates confidence with appropriate penalties for errors and cache hits. The confidence is correctly clamped to [0.0, 1.0] via max(confidence, 0.0) combined with the subtraction logic.


268-289: LGTM! Clean failure handling with consistent output structure.

The fail_node produces an output structure consistent with reflect_node, ensuring downstream consumers can handle both success and failure cases uniformly.


295-325: LGTM! Clear routing logic with fail-fast error handling.

Both routing functions correctly prioritize critical error checking before proceeding with normal flow. The strategy-based routing in route_after_plan covers all expected cases with a sensible default.


331-404: LGTM! Well-structured LangGraph workflow with singleton pattern.

The graph construction correctly wires the Plan→Act→Reflect workflow with appropriate conditional routing. The singleton get_tutor_graph() avoids repeated compilation overhead for the workflow definition.
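
The singleton-plus-compile pattern approved here generally reduces to something like this sketch (node bodies are placeholders; the real graph uses the typed TutorAgentState and the conditional routing reviewed above):

from functools import lru_cache
from typing import TypedDict

from langgraph.graph import END, StateGraph

class State(TypedDict, total=False):
    output: str

@lru_cache(maxsize=1)
def get_tutor_graph():
    """Compile the workflow once and reuse the compiled app."""
    graph = StateGraph(State)
    graph.add_node("plan", lambda s: s)
    graph.add_node("act", lambda s: {"output": "done"})
    graph.add_node("reflect", lambda s: s)
    graph.set_entry_point("plan")
    graph.add_edge("plan", "act")
    graph.add_edge("act", "reflect")
    graph.add_edge("reflect", END)
    return graph.compile()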

tests/tutor/test_tools.py (6)

24-64: LGTM! Good coverage of LessonLoaderTool caching behavior.

The tests properly cover the three main scenarios: cache miss, force refresh bypass, and successful cache retrieval. The temporary database fixture ensures test isolation.


67-96: LGTM! Thorough fallback template testing.

The tests verify both the existence of expected fallback templates and the case-insensitive lookup behavior, which is important for user-facing package names.


98-121: LGTM! Complete coverage of load_lesson_with_fallback decision tree.

The tests verify all three source outcomes: cache, fallback_template, and none. The needs_generation flag assertion confirms proper signaling for unknown packages.


124-205: LGTM! Proper mocking and edge case testing for LessonGeneratorTool.

The tests correctly mock the LLM chain and directly test _structure_response to verify output formatting. The missing fields test ensures robust handling of incomplete LLM responses.


207-242: LGTM! Core functionality tested for ExamplesProviderTool.

The test verifies proper structuring of LLM responses with examples, tips, and confidence values. Consider adding edge case tests for malformed responses in a follow-up.


244-275: LGTM! QAHandlerTool response structuring verified.

The test correctly leverages the lazy LLM initialization pattern, only mocking get_config. The assertions verify the key response fields including the nested code_example structure.

- Change llm field type from "ChatAnthropic | None" to Any to fix
  Pydantic model_rebuild() error when ChatAnthropic is not imported
- Update _structure_response type hint to accept Any for response param
  to match actual runtime behavior (handles None gracefully)
- Remove unnecessary type: ignore comment from test
- Add docstring with Args/Returns to _structure_response method

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @cortex/tutor/tools/agentic/qa_handler.py:
- Around lines 413-424: Summary: Accessing result["answer"].get(...) is unsafe if
result["answer"] is not a dict; instead guard and extract safely. Fix: inside
the existing if result.get("success") block, read answer_obj =
result.get("answer"), check isinstance(answer_obj, dict) and then set
answer_text = answer_obj.get("answer", "") (otherwise answer_text = ""), then
append {"question": question, "answer": answer_text} to self.history and still
trim with self.history = self.history[-_MAX_HISTORY_SIZE:]; references: result,
answer (field), history, _MAX_HISTORY_SIZE.
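
Spelled out, the guard that prompt describes is roughly this (a sketch against the names above, not the merged diff):

if result.get("success"):
    answer_obj = result.get("answer")
    # The answer may come back as a non-dict on malformed LLM output; guard first.
    answer_text = answer_obj.get("answer", "") if isinstance(answer_obj, dict) else ""
    self.history.append({"question": question, "answer": answer_text})
    self.history = self.history[-_MAX_HISTORY_SIZE:]
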
🧹 Nitpick comments (5)
tests/tutor/test_agentic_tools.py (3)

12-73: LGTM! Consider additional edge case tests.

The tests adequately cover full and minimal response scenarios. For completeness, you might consider adding tests for:

  • code_examples items that lack the "code" key (should be filtered out per the implementation)
  • Confidence values outside [0, 1] range (clamping behavior)

This is optional since coverage is reported at ~86%.


75-106: Consider adding a minimal response test for consistency.

Unlike TestLessonGeneratorStructure, this class only has one test for the full response case. Adding a test_structure_response_minimal test would improve coverage and maintain consistency with the other test classes.

💡 Suggested test to add
@patch("cortex.tutor.tools.agentic.examples_provider.get_config")
@patch("cortex.tutor.tools.agentic.examples_provider.ChatAnthropic")
def test_structure_response_minimal(self, mock_llm_class, mock_config):
    """Test structure_response with minimal response."""
    from cortex.tutor.tools.agentic.examples_provider import ExamplesProviderTool

    mock_config.return_value = Mock(
        anthropic_api_key="test_key",
        model="claude-sonnet-4-20250514",
    )
    mock_llm_class.return_value = Mock()

    tool = ExamplesProviderTool()
    response = {"package_name": "test", "topic": "basics"}

    result = tool._structure_response(response, "test", "basics")

    assert result["package_name"] == "test"
    assert result["examples"] == []
    assert result["tips"] == []

225-292: Good coverage of ConversationHandler.

Tests adequately verify context building and lazy initialization. Minor note: line 240 (handler.history = []) is redundant since history is already initialized to an empty list in __init__.

Optional: Consider adding a test for the history bounding behavior when exceeding _MAX_HISTORY_SIZE (defined as 50 in the implementation).
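
A sketch of that optional test, using the same __new__ construction as the existing ones — the stubbed qa_tool and the ask() method name are assumptions about the handler's API, not its actual surface:

from unittest.mock import Mock

def test_history_is_bounded():
    """History should be trimmed to the last 50 entries after each exchange."""
    from cortex.tutor.tools.agentic.qa_handler import ConversationHandler

    handler = ConversationHandler.__new__(ConversationHandler)  # bypass __init__
    handler.history = [{"question": f"q{i}", "answer": f"a{i}"} for i in range(60)]
    handler.qa_tool = Mock()  # hypothetical: stub out the LLM-backed tool
    handler.qa_tool._run.return_value = {"success": True, "answer": {"answer": "a"}}
    handler.ask("docker", "one more question")  # method name assumed
    assert len(handler.history) <= 50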

cortex/tutor/tools/agentic/qa_handler.py (2)

58-70: Consider validating API key before LLM initialization.

The _get_llm method passes the API key directly to ChatAnthropic without checking if it's None or empty. If the key is missing, this will likely fail with a less informative error at LLM invocation time.

💡 Suggested validation
     def _get_llm(self) -> "ChatAnthropic":
         """Lazily initialize and return the LLM."""
         if self.llm is None:
             from langchain_anthropic import ChatAnthropic

             config = get_config()
+            if not config.anthropic_api_key:
+                raise ValueError("Anthropic API key not configured. Set ANTHROPIC_API_KEY environment variable.")
             self.llm = ChatAnthropic(
                 model=self.model_name,
                 api_key=config.anthropic_api_key,
                 temperature=0.1,
                 max_tokens=2048,
             )
         return self.llm

113-208: Code duplication between _run and _arun.

The sync and async implementations are nearly identical except for invoke vs ainvoke. While acceptable for clarity, you could extract the shared validation and parameter building into a helper if the methods grow more complex.

Also, cost_gbp is hardcoded to 0.02 in both places. Consider extracting this to a class constant for maintainability.

💡 Optional: Extract cost constant
+    _COST_GBP: float = 0.02
     _NONE_SPECIFIED: str = "none specified"

     # In _run and _arun:
-            return {
-                "success": True,
-                "answer": answer,
-                "cost_gbp": 0.02,
-            }
+            return {
+                "success": True,
+                "answer": answer,
+                "cost_gbp": self._COST_GBP,
+            }
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 24bd578 and 0d0b495.

📒 Files selected for processing (2)
  • cortex/tutor/tools/agentic/qa_handler.py
  • tests/tutor/test_agentic_tools.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Follow PEP 8 style guide
Type hints required in Python code
Docstrings required for all public APIs

Files:

  • cortex/tutor/tools/agentic/qa_handler.py
  • tests/tutor/test_agentic_tools.py
tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Maintain >80% test coverage for pull requests

Files:

  • tests/tutor/test_agentic_tools.py
🧬 Code graph analysis (2)
cortex/tutor/tools/agentic/qa_handler.py (1)
cortex/tutor/tools/deterministic/validators.py (2)
  • validate_package_name (51-93)
  • validate_question (138-148)
tests/tutor/test_agentic_tools.py (1)
cortex/tutor/tools/agentic/qa_handler.py (3)
  • _structure_response (261-328)
  • ConversationHandler (359-441)
  • _build_context (426-437)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Test (Python 3.12)
  • GitHub Check: Test (Python 3.10)
  • GitHub Check: Test (Python 3.11)
  • GitHub Check: test (3.12)
  • GitHub Check: test (3.11)
  • GitHub Check: test (3.10)
🔇 Additional comments (7)
tests/tutor/test_agentic_tools.py (2)

1-10: LGTM!

Module docstring adequately describes the test purpose, and imports are minimal and appropriate. The inline imports inside test methods are a reasonable pattern for lazy loading in tests.


108-223: Excellent test coverage for QAHandlerTool structure methods.

The tests comprehensively cover:

  • Full response handling
  • Non-dict input graceful handling
  • Invalid confidence type handling
  • Confidence clamping (both > 1 and < 0)
  • String code_example normalization

The comment at lines 121-122 explaining the lazy init pattern is helpful.

cortex/tutor/tools/agentic/qa_handler.py (5)

1-22: LGTM!

Good use of TYPE_CHECKING to avoid runtime import of ChatAnthropic while still enabling type hints. Imports are well-organized and all appear necessary.


72-92: LGTM!

Input validation is properly delegated to centralized validators, and the chain composition follows standard LangChain patterns. Type hints are present and correct.


261-328: Excellent defensive response handling.

The _structure_response method robustly handles:

  • Non-dict LLM responses (line 275-276)
  • Invalid related_topics and follow_up_suggestions types
  • Invalid confidence values with try/except and clamping
  • Code examples as either dict or string

This defensive approach is essential for handling unpredictable LLM outputs.


331-352: LGTM!

Clean convenience function with proper type hints and docstring. Creating a new tool instance per call is appropriate for this stateless use case.


439-441: Missing docstring on public method.

Per coding guidelines, docstrings are required for all public APIs. While clear_history is self-explanatory, a brief docstring would maintain consistency.

💡 Add docstring
     def clear_history(self) -> None:
-        """Clear conversation history."""
+        """Clear conversation history.
+
+        Resets the conversation state, allowing a fresh start
+        without creating a new handler instance.
+        """
         self.history = []

Actually, I see there is already a docstring at line 440. Disregard this comment.

Guard against potential KeyError/TypeError when accessing answer data
by checking isinstance before calling .get().
@sonarqubecloud

@srikrishnavansi
Contributor Author

@coderabbitai

@coderabbitai
Contributor

coderabbitai bot commented Jan 12, 2026

Tip

For best results, initiate chat on the files or code changes.

@srikrishnavansi, how can I help you?

I noticed @Anshgrover23 asked about the LangChain choice. Would you like me to help explain the rationale for using LangChain in this PR? Or is there something else you'd like me to review or assist with?

For context, if you'd like me to address the LangChain question, I can explain:

  • LangChain provides a standardized framework for orchestrating LLM calls with built-in prompt management, output parsing, and tool abstractions
  • LangGraph enables the Plan→Act→Reflect workflow with state management and conditional routing between nodes
  • LangChain-Anthropic offers a clean integration with Claude's API with proper error handling and async support

These libraries significantly reduce boilerplate code for the agent-based tutoring system compared to direct API calls. Let me know if you'd like a more detailed explanation or have other questions!
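
To make the routing point concrete, here is a minimal Plan→Act→Reflect sketch — the node bodies are stand-ins, not this PR's actual implementation:

from typing import TypedDict

from langgraph.graph import END, StateGraph

class TutorState(TypedDict, total=False):
    mode: str    # "teach" or "qa"
    output: str

def plan(state: TutorState) -> TutorState:
    return state  # decide a strategy here

def teach(state: TutorState) -> TutorState:
    return {"output": "lesson"}

def qa(state: TutorState) -> TutorState:
    return {"output": "answer"}

def reflect(state: TutorState) -> TutorState:
    return state  # validate and score the result here

def route(state: TutorState) -> str:
    # Conditional routing: the planner's decision picks the acting node.
    return "qa" if state.get("mode") == "qa" else "teach"

graph = StateGraph(TutorState)
graph.add_node("plan", plan)
graph.add_node("teach", teach)
graph.add_node("qa", qa)
graph.add_node("reflect", reflect)
graph.set_entry_point("plan")
graph.add_conditional_edges("plan", route, {"teach": "teach", "qa": "qa"})
graph.add_edge("teach", "reflect")
graph.add_edge("qa", "reflect")
graph.add_edge("reflect", END)
app = graph.compile()

print(app.invoke({"mode": "qa"}))  # plan -> qa -> reflect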

@srikrishnavansi
Contributor Author

@coderabbitai review

@coderabbitai
Contributor

coderabbitai bot commented Jan 12, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@srikrishnavansi
Contributor Author

@Anshgrover23 Can you review again?

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (4)
cortex/tutor/tools/agentic/qa_handler.py (4)

49-51: Pydantic v2 deprecates inner Config class.

In Pydantic v2, class Config is deprecated in favor of model_config = ConfigDict(...). While this still works, it will generate deprecation warnings.

♻️ Suggested refactor
-    class Config:
-        arbitrary_types_allowed = True
+    model_config = ConfigDict(arbitrary_types_allowed=True)

You'll also need to add the import:

from pydantic import ConfigDict, Field

58-70: Consider validating API key before LLM initialization.

If config.anthropic_api_key is None or empty, the error will surface later during API calls with a less clear message. Consider adding validation or at least documenting this assumption.

♻️ Optional defensive check
     def _get_llm(self) -> "ChatAnthropic":
         """Lazily initialize and return the LLM."""
         if self.llm is None:
             from langchain_anthropic import ChatAnthropic

             config = get_config()
+            if not config.anthropic_api_key:
+                raise ValueError("ANTHROPIC_API_KEY not configured")
             self.llm = ChatAnthropic(
                 model=self.model_name,
                 api_key=config.anthropic_api_key,
                 temperature=0.1,
                 max_tokens=2048,
             )
         return self.llm

161-166: Consider sanitizing exception messages in error responses.

Returning str(e) directly could expose internal details (API keys in error messages, internal paths, etc.) to callers. Consider logging the full exception internally and returning a generic message.

♻️ Suggested improvement
         except Exception as e:
+            # Log full exception for debugging
+            # logger.exception("QA handler error")
             return {
                 "success": False,
-                "error": str(e),
+                "error": f"Failed to process question: {type(e).__name__}",
                 "answer": None,
             }

155-159: Consider adding logging for observability.

The hardcoded cost_gbp: 0.02 may not reflect actual API costs which vary by model and token count. Consider either calculating from actual token usage or documenting this as an estimate. Adding logging for Q&A invocations would also aid debugging and usage tracking.

Also applies to: 197-201
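
If the flat estimate ever needs replacing, langchain-core attaches token counts to the returned message; a hedged sketch (per-token prices below are placeholders, and the model must be invoked directly, since JsonOutputParser drops message metadata):

# Placeholder prices — look up current Anthropic rates before using.
INPUT_USD_PER_MTOK = 3.0
OUTPUT_USD_PER_MTOK = 15.0

def estimate_cost_usd(message) -> float:
    """Estimate spend from an AIMessage's usage_metadata, if present."""
    usage = getattr(message, "usage_metadata", None) or {}
    return (
        usage.get("input_tokens", 0) * INPUT_USD_PER_MTOK
        + usage.get("output_tokens", 0) * OUTPUT_USD_PER_MTOK
    ) / 1_000_000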

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0d0b495 and 814018b.

📒 Files selected for processing (1)
  • cortex/tutor/tools/agentic/qa_handler.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Follow PEP 8 style guide
Type hints required in Python code
Docstrings required for all public APIs

Files:

  • cortex/tutor/tools/agentic/qa_handler.py
🧬 Code graph analysis (1)
cortex/tutor/tools/agentic/qa_handler.py (2)
cortex/tutor/config.py (2)
  • get_config (127-140)
  • Config (18-120)
cortex/tutor/tools/deterministic/validators.py (1)
  • validate_package_name (51-93)
🔇 Additional comments (11)
cortex/tutor/tools/agentic/qa_handler.py (11)

1-22: LGTM!

Clean imports with appropriate use of TYPE_CHECKING for the lazily-imported ChatAnthropic. Module docstring is present and descriptive.


52-56: LGTM!

The model_post_init hook correctly uses the Pydantic v2 signature and preserves user-specified model names while applying config defaults otherwise.


72-82: LGTM!

Input validation properly delegates to the deterministic validators, maintaining separation of concerns.


84-92: LGTM!

Standard LangChain Expression Language (LCEL) chain construction.


94-111: LGTM!

Clean parameter building with appropriate fallbacks for optional values.


168-208: LGTM, but consider reducing duplication with _run.

The async method correctly mirrors the sync implementation. The duplication is acceptable given the different execution models, though a shared helper for validation and response structuring could reduce maintenance burden.


210-226: LGTM!

Well-structured system prompt with appropriate guardrails against fabrication and clear educational guidance.


228-259: LGTM!

Clear prompt template with well-defined JSON schema. Good handling of edge cases (unclear questions, unknown answers) in the prompt instructions.


261-328: LGTM!

Excellent defensive programming with robust type checking and validation. Handles malformed LLM responses gracefully with sensible defaults and proper bounds clamping for confidence scores.


331-352: LGTM!

Clean convenience wrapper. Note that this creates a new QAHandlerTool instance per call, which is fine for simple use cases but ConversationHandler is more appropriate for repeated interactions.


359-443: LGTM!

Well-implemented conversation handler with:

  • Lazy tool initialization
  • Bounded history (50 entries max) to prevent unbounded memory growth
  • Safe access pattern for extracting answers (per the commit message fix)
  • Reasonable context summarization (last 3 exchanges, truncated)

@Anshgrover23
Collaborator

@srikrishnavansi, thanks for putting this together! The feature works well and tests look solid.

My main question: why LangGraph for this? We already have an LLM interpreter in cortex/llm/ that handles Claude, OpenAI, and Ollama. Adding LangChain + LangGraph feels like overkill for a CLI tutor that's basically prompt → LLM → response.

A few other things I noticed:

  • Model name (claude-sonnet-4-20250514) is hardcoded in 4 different files
  • DEFAULT_TUTOR_TOPICS is duplicated in both cli.py and config.py
  • Tests are split between tests/tutor/ and tests/unit/tutor/ for some reason

I think we could slim this down to just tutor.py + cli.py + db.py using our existing LLM code. Would cut out ~4 dependencies and a lot of boilerplate.

If there's a reason for the LangGraph approach I'm missing (future agentic features?), happy to hear it. Just want to make sure we're not over-engineering this.
