diff --git a/.gitignore b/.gitignore
index c4ad990a..2afa91a7 100644
--- a/.gitignore
+++ b/.gitignore
@@ -8,5 +8,10 @@ __private__
 VERSION
 *.DS_Store
 .env*
-.serena/cache
-.specify/
\ No newline at end of file
+.serena
+.specify
+
+# In Memoria
+in-memoria.db
+.in-memoria/cache/
+.in-memoria/.env
diff --git a/.serena/.gitignore b/.serena/.gitignore
deleted file mode 100644
index 14d86ad6..00000000
--- a/.serena/.gitignore
+++ /dev/null
@@ -1 +0,0 @@
-/cache
diff --git a/.serena/memories/analyzer_architecture.md b/.serena/memories/analyzer_architecture.md
deleted file mode 100644
index 849a0e1c..00000000
--- a/.serena/memories/analyzer_architecture.md
+++ /dev/null
@@ -1,69 +0,0 @@
-# Analyzer Architecture
-
-## Overview
-
-The analyzer system is the core content domain of Mango Tango CLI, designed for modularity and extensibility.
-
-## Analyzer Types
-
-### Primary Analyzers
-
-- **Purpose**: Core data processing and analysis
-- **Input**: Raw imported data (CSV/Excel → Parquet)
-- **Output**: Normalized, non-duplicated analysis results
-- **Context**: Receives input file path, preprocessing method, output path
-- **Examples**: hashtags, ngrams, temporal, time_coordination
-
-### Secondary Analyzers
-
-- **Purpose**: Transform primary outputs into user-friendly formats
-- **Input**: Primary analyzer outputs
-- **Output**: User-consumable tables/reports
-- **Context**: Receives primary output path, provides secondary output path
-- **Examples**: ngram_stats (processes ngrams output)
-
-### Web Presenters
-
-- **Purpose**: Interactive dashboards and visualizations
-- **Input**: Primary + Secondary analyzer outputs
-- **Framework**: Dash or Shiny for Python
-- **Context**: Receives all relevant output paths + Dash/Shiny app object
-- **Examples**: hashtags_web, ngram_web, temporal_barplot
-
-## Interface Pattern
-
-Each analyzer defines an interface in `interface.py`:
-
-```python
-interface = AnalyzerInterface(
-    input=AnalyzerInput(...),  # Define required columns/semantics
-    params=[...],              # User-configurable parameters
-    outputs=[...],             # Output table schemas
-    kind="primary"             # or "secondary"/"web"
-)
-```
-
-## Context Pattern
-
-All analyzers receive context objects providing:
-
-- File paths (input/output)
-- Preprocessing methods
-- Application hooks (for web presenters)
-- Configuration parameters
-
-## Data Flow
-
-1. **Import**: CSV/Excel → Parquet via importers
-2. **Preprocess**: Semantic preprocessing applies column mappings
-3. **Primary**: Raw data → structured analysis results
-4. **Secondary**: Primary results → user-friendly outputs
-5. **Web**: All outputs → interactive dashboards
-6. **Export**: Results → user-selected formats (XLSX, CSV, etc.)
-
-## Key Components
-
-- `analyzer_interface/` - Base interface definitions
-- `analyzers/suite` - Registry of all available analyzers
-- Context objects for dependency injection
-- Parquet-based data persistence between stages
diff --git a/.serena/memories/code_structure.md b/.serena/memories/code_structure.md
deleted file mode 100644
index 2ea3351e..00000000
--- a/.serena/memories/code_structure.md
+++ /dev/null
@@ -1,60 +0,0 @@
-# Code Structure
-
-## Entry Point
-
-- `mangotango.py` - Main entry point, bootstraps the application with Storage, App, and terminal components
-
-## Core Modules
-
-### App (`app/`)
-
-- `app.py` - Main App class with workspace capabilities
-- `app_context.py` - AppContext class for dependency injection
-- `project_context.py` - ProjectContext for project-specific operations
-- `analysis_context.py` - AnalysisContext and AnalysisRunProgressEvent for analysis execution
-- `analysis_output_context.py` - Context for handling analysis outputs
-- `analysis_webserver_context.py` - Context for web server operations
-- `settings_context.py` - SettingsContext for configuration management
-
-### Components (`components/`)
-
-Terminal UI components using inquirer for interactive flows:
-
-- `main_menu.py` - Main application menu
-- `splash.py` - Application splash screen
-- `new_project.py` - Project creation flow
-- `select_project.py` - Project selection interface
-- `project_main.py` - Project main menu
-- `new_analysis.py` - Analysis creation flow
-- `select_analysis.py` - Analysis selection interface
-- `analysis_main.py` - Analysis main menu
-- `analysis_params.py` - Parameter customization interface
-- `analysis_web_server.py` - Web server management
-- `export_outputs.py` - Output export functionality
-- `context.py` - ViewContext class for UI state
-
-### Storage (`storage/`)
-
-- `__init__.py` - Storage class, models (ProjectModel, AnalysisModel, etc.)
-- `file_selector.py` - File selection state management
-
-### Analyzers (`analyzers/`)
-
-Modular analysis system:
-
-- `__init__.py` - Main analyzer suite registration
-- `example/` - Example analyzer implementation
-- `hashtags/` - Hashtag analysis (primary analyzer)
-- `hashtags_web/` - Hashtag web dashboard (web presenter)
-- `ngrams/` - N-gram analysis (primary analyzer)
-- `ngram_stats/` - N-gram statistics (secondary analyzer)
-- `ngram_web/` - N-gram web dashboard (web presenter)
-- `temporal/` - Temporal analysis (primary analyzer)
-- `temporal_barplot/` - Temporal visualization (web presenter)
-- `time_coordination/` - Time coordination analysis
-
-### Importing (`importing/`)
-
-- `importer.py` - Base Importer and ImporterSession classes
-- `csv.py` - CSV import implementation
-- `excel.py` - Excel import implementation
diff --git a/.serena/memories/code_style_conventions.md b/.serena/memories/code_style_conventions.md
deleted file mode 100644
index bb4306c0..00000000
--- a/.serena/memories/code_style_conventions.md
+++ /dev/null
@@ -1,58 +0,0 @@
-# Code Style and Conventions
-
-## Formatting Tools
-
-- **Black**: Code formatter (automatically configured)
-- **isort**: Import sorter with black profile
-- **Pre-commit hooks**: Automatically format code on commit
-
-## Code Style Requirements
-
-- Python 3.12 syntax and features
-- Black-formatted code (line length, spacing, etc.)
-- isort-organized imports with black profile
-- Type hints using modern Python syntax (`list[str]` not `List[str]`)
-- Pydantic models for data validation
-
-## Project Conventions
-
-### File Organization
-
-- Modules organized by domain (app, components, analyzers, storage, importing)
-- Each analyzer has its own subdirectory with interface.py, main.py, and optional web components
-- Test files follow naming: `test_*.py` in same directory as code being tested
-- Interface files define data schemas and parameters
-
-### Naming Conventions
-
-- Snake_case for functions, variables, modules
-- PascalCase for classes
-- UPPER_CASE for constants
-- Descriptive names reflecting domain concepts
-
-### Architecture Patterns
-
-- Context pattern for dependency injection (AppContext, ViewContext, etc.)
-- Interface pattern for analyzer definitions
-- Factory pattern for web presenters
-- Modular analyzer system with clear separation of concerns
-
-### Import Patterns
-
-- Relative imports within modules
-- Clear separation between core, edge, and content domains
-- Dependencies injected through context objects
-
-### Data Handling
-
-- Parquet files for data persistence
-- Polars for data processing (preferred over pandas)
-- TinyDB for lightweight metadata storage
-- Type-safe data models using Pydantic
-
-### Testing Patterns
-
-- pytest framework
-- Test data stored in test_data/ subdirectories
-- Integration tests for analyzer workflows
-- CI/CD runs formatting checks and tests on PRs
diff --git a/.serena/memories/project_overview.md b/.serena/memories/project_overview.md
deleted file mode 100644
index 0ee4b4e0..00000000
--- a/.serena/memories/project_overview.md
+++ /dev/null
@@ -1,35 +0,0 @@
-# Project Overview: Mango Tango CLI
-
-## Purpose
-
-CIB 🥭 (Mango Tango CLI) is a Python terminal-based tool for performing data analysis and visualization, specifically designed for social media data analysis. It provides a modular and extensible architecture that allows developers to contribute new analysis modules while maintaining a consistent user experience.
-
-## Core Problem
-
-The tool addresses the common pain point of moving from private data analysis scripts to shareable tools. It prevents inconsistent UX across analyses, code duplication, and bugs by providing a clear separation between core application logic and analysis modules.
-
-## Key Features
-
-- Terminal-based interface for data analysis workflows
-- Modular analyzer system (Primary, Secondary, Web Presenters)
-- Built-in data import/export capabilities
-- Interactive web dashboards using Dash and Shiny
-- Support for various data formats (CSV, Excel, Parquet)
-- Hashtag analysis, n-gram analysis, temporal analysis
-- Multi-tenancy support
-
-## Tech Stack
-
-- **Language**: Python 3.12
-- **Data Processing**: Polars, Pandas, PyArrow
-- **Web Framework**: Dash, Shiny for Python
-- **CLI**: Inquirer for interactive prompts
-- **Data Storage**: TinyDB, Parquet files
-- **Visualization**: Plotly
-- **Export**: XlsxWriter for Excel output
-
-## Architecture Domains
-
-1. **Core Domain**: Application logic, Terminal Components, Storage IO
-2. **Edge Domain**: Data import/export, Semantic Preprocessor
-3. **Content Domain**: Analyzers (Primary/Secondary), Web Presenters
diff --git a/.serena/memories/suggested_commands.md b/.serena/memories/suggested_commands.md
deleted file mode 100644
index e9b6b2e2..00000000
--- a/.serena/memories/suggested_commands.md
+++ /dev/null
@@ -1,79 +0,0 @@
-# Suggested Commands
-
-## Development Environment
-
-```bash
-# Setup virtual environment (first time)
-python -m venv venv
-
-# Activate environment and install dependencies
-./bootstrap.sh # macOS/Linux
-./bootstrap.ps1 # Windows PowerShell
-```
-
-## Running the Application
-
-```bash
-# Start the application
-python -m mangotango
-
-# Run with no-op flag (for testing)
-python -m mangotango --noop
-```
-
-## Development Commands
-
-```bash
-# Code formatting (must be run before commits)
-isort .
-black .
-
-# Run both formatters together
-isort . && black .
-
-# Run tests
-pytest
-
-# Run specific test
-pytest analyzers/hashtags/test_hashtags_analyzer.py
-
-# Install development dependencies
-pip install -r requirements-dev.txt
-
-# Install production dependencies only
-pip install -r requirements.txt
-```
-
-## Git Workflow
-
-```bash
-# Create feature branch from develop
-git checkout develop
-git pull origin develop
-git checkout -b feature/new-feature
-
-# Make changes and commit
-git add .
-git commit -m "Description of changes"
-git push origin feature/new-feature
-
-# Create PR to develop branch (not main)
-```
-
-## Build Commands
-
-```bash
-# Build executable (from GitHub Actions)
-pyinstaller pyinstaller.spec
-```
-
-## System Commands (macOS)
-
-```bash
-# Standard Unix commands work on macOS
-ls, cd, find, grep, git
-
-# Use these for file operations
-find . -name "*.py" -type f
-grep -r "pattern" --include="*.py" .
-```
diff --git a/.serena/memories/task_completion_checklist.md b/.serena/memories/task_completion_checklist.md
deleted file mode 100644
index 67bc70f6..00000000
--- a/.serena/memories/task_completion_checklist.md
+++ /dev/null
@@ -1,77 +0,0 @@
-# Task Completion Checklist
-
-## Before Committing Code
-
-### 1. Code Formatting (Required)
-
-```bash
-isort .
-black .
-```
-
-**Critical**: Pre-commit hooks will run these automatically, but manually running ensures no surprises.
-
-### 2. Testing
-
-```bash
-# Run all tests
-pytest
-
-# Run specific analyzer tests if modified
-pytest analyzers/[analyzer_name]/test_*.py
-```
-
-### 3. Code Quality Validation
-
-- Ensure no new linting errors
-- Check that type hints are present for new functions
-- Verify imports are properly organized
-
-## For New Analyzers
-
-### 1. Required Files
-
-- `interface.py` - Define analyzer interface with input/output schemas
-- `main.py` - Implement analyzer logic
-- `__init__.py` - Export analyzer module
-- `test_*.py` - Add tests for the analyzer
-
-### 2. Registration
-
-- Add analyzer to `analyzers/__init__.py` suite
-- Ensure interface follows AnalyzerInterface pattern
-
-### 3. Testing
-
-- Create test data in `test_data/` directory
-- Test with sample data to ensure parquet output works
-- Verify web presenter integration if applicable
-
-## Git Workflow Checklist
-
-### 1. Branch Management
-
-- Always branch from `develop` (not `main`)
-- Use descriptive branch names: `feature/name` or `bugfix/name`
-
-### 2. Commit Requirements
-
-- Clear commit messages describing the change
-- Code must pass formatting checks (isort + black)
-- All tests must pass
-
-### 3. Pull Request
-
-- Target `develop` branch
-- Use the template file `.github/PULL_REQUEST_TEMPALTE.md`
-- Include description of changes
-- Wait for CI/CD checks to pass
-- Address any review feedback
-
-## CI/CD Requirements
-
-All PRs must pass:
-
-- Code formatting checks (isort + black)
-- PyTest suite
-- Build verification
diff --git a/.serena/project.yml b/.serena/project.yml
deleted file mode 100644
index bde5e791..00000000
--- a/.serena/project.yml
+++ /dev/null
@@ -1,68 +0,0 @@
-# language of the project (csharp, python, rust, java, typescript, go, cpp, or ruby)
-# * For C, use cpp
-# * For JavaScript, use typescript
-# Special requirements:
-# * csharp: Requires the presence of a .sln file in the project folder.
-language: python
-
-# whether to use the project's gitignore file to ignore files
-# Added on 2025-04-07
-ignore_all_files_in_gitignore: true
-# list of additional paths to ignore
-# same syntax as gitignore, so you can use * and **
-# Was previously called `ignored_dirs`, please update your config if you are using that.
-# Added (renamed)on 2025-04-07
-ignored_paths: []
-
-# whether the project is in read-only mode
-# If set to true, all editing tools will be disabled and attempts to use them will result in an error
-# Added on 2025-04-18
-read_only: false
-
-
-# list of tool names to exclude. We recommend not excluding any tools, see the readme for more details.
-# Below is the complete list of tools for convenience.
-# To make sure you have the latest list of tools, and to view their descriptions,
-# execute `uv run scripts/print_tool_overview.py`.
-#
-# * `activate_project`: Activates a project by name.
-# * `check_onboarding_performed`: Checks whether project onboarding was already performed.
-# * `create_text_file`: Creates/overwrites a file in the project directory.
-# * `delete_lines`: Deletes a range of lines within a file.
-# * `delete_memory`: Deletes a memory from Serena's project-specific memory store.
-# * `execute_shell_command`: Executes a shell command.
-# * `find_referencing_code_snippets`: Finds code snippets in which the symbol at the given location is referenced.
-# * `find_referencing_symbols`: Finds symbols that reference the symbol at the given location (optionally filtered by type).
-# * `find_symbol`: Performs a global (or local) search for symbols with/containing a given name/substring (optionally filtered by type).
-# * `get_current_config`: Prints the current configuration of the agent, including the active and available projects, tools, contexts, and modes.
-# * `get_symbols_overview`: Gets an overview of the top-level symbols defined in a given file or directory.
-# * `initial_instructions`: Gets the initial instructions for the current project.
-# Should only be used in settings where the system prompt cannot be set,
-# e.g. in clients you have no control over, like Claude Desktop.
-# * `insert_after_symbol`: Inserts content after the end of the definition of a given symbol.
-# * `insert_at_line`: Inserts content at a given line in a file.
-# * `insert_before_symbol`: Inserts content before the beginning of the definition of a given symbol.
-# * `list_dir`: Lists files and directories in the given directory (optionally with recursion).
-# * `list_memories`: Lists memories in Serena's project-specific memory store.
-# * `onboarding`: Performs onboarding (identifying the project structure and essential tasks, e.g. for testing or building).
-# * `prepare_for_new_conversation`: Provides instructions for preparing for a new conversation (in order to continue with the necessary context).
-# * `read_file`: Reads a file within the project directory.
-# * `read_memory`: Reads the memory with the given name from Serena's project-specific memory store.
-# * `remove_project`: Removes a project from the Serena configuration.
-# * `replace_lines`: Replaces a range of lines within a file with new content.
-# * `replace_symbol_body`: Replaces the full definition of a symbol.
-# * `restart_language_server`: Restarts the language server, may be necessary when edits not through Serena happen.
-# * `search_for_pattern`: Performs a search for a pattern in the project.
-# * `summarize_changes`: Provides instructions for summarizing the changes made to the codebase.
-# * `switch_modes`: Activates modes by providing a list of their names
-# * `think_about_collected_information`: Thinking tool for pondering the completeness of collected information.
-# * `think_about_task_adherence`: Thinking tool for determining whether the agent is still on track with the current task.
-# * `think_about_whether_you_are_done`: Thinking tool for determining whether the task is truly completed.
-# * `write_memory`: Writes a named memory (for future reference) to Serena's project-specific memory store.
-excluded_tools: []
-
-# initial prompt for the project. It will always be given to the LLM upon activating the project
-# (contrary to the memories, which are loaded on demand).
-initial_prompt: ""
-
-project_name: "mango-tango-cli"
diff --git a/CLAUDE.md b/CLAUDE.md
index 20335b5c..48d8b909 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -4,99 +4,125 @@
 ### Core Documentation

-- **Repository Overview**: @.ai-context/README.md
-- **Architecture Deep Dive**: @.ai-context/architecture-overview.md
-- **Symbol Reference**: @.ai-context/symbol-reference.md
-- **Setup Guide**: @.ai-context/setup-guide.md
-- **Development Guide**: @docs/dev-guide.md
+- **Repository Overview**: `.ai-context/README.md`
+- **Architecture Deep Dive**: `.ai-context/architecture-overview.md`
+- **Symbol Reference**: `.ai-context/symbol-reference.md`
+- **Setup Guide**: `.ai-context/setup-guide.md`
+- **Development Guide**: `docs/dev-guide.md`

 ### Quick Context Loading

 ```markdown
 # Start with this for comprehensive context
-@.ai-context/README.md
+`.ai-context/README.md`

 # For architectural understanding
-@.ai-context/architecture-overview.md
+`.ai-context/architecture-overview.md`

 # For precise symbol navigation
-@.ai-context/symbol-reference.md
+`.ai-context/symbol-reference.md`
 ```

-## Serena MCP Integration
+## Knowledge Graph Integration

-### Essential Serena Usage
+### Essential Knowledge Graph Usage

-**Symbol-Level Development**:
+**Entity-Based Project Knowledge**:

 ```markdown
-- Use `get_symbols_overview` for high-level code structure
-- Use `find_symbol` for specific class/function discovery
-- Use `find_referencing_symbols` for dependency analysis
-- Prefer symbolic operations over reading entire files
+- Use `search_nodes(query)` to find entities by query
+- Use `open_nodes([names])` to retrieve specific entities
+- Use `read_graph()` for comprehensive project overview
+- Use `create_entities([...])` to capture new insights
+- Use `add_observations([...])` to enhance existing knowledge
+- Use `create_relations([...])` to link related concepts
 ```

-**Memory System**:
-
-```markdown
-- Use `list_memories` to see available project knowledge
-- Use `read_memory` for specific domain knowledge
-- Use `write_memory` for new insights worth preserving
-```
+### Knowledge Structure
+
+**Entity Types**:
+
+- `Module` - Core application modules (app, storage, analyzers)
+- `Analyzer` - Specific analyzer implementations
+- `Component` - UI and terminal components
+- `Service` - Shared services (tokenizer, logging)
+- `Pattern` - Architectural patterns and conventions
+- `Concept` - Domain concepts and abstractions
+- `Workflow` - Development workflows and processes
+
+**Relation Types**:
+
+- `implements` - Entity implements interface/pattern
+- `uses` - Entity depends on or uses another
+- `part_of` - Entity is component of another
+- `extends` - Entity extends/inherits from another
+- `related_to` - General relationship

-### Serena Semantic Analysis
+### When to Use Knowledge Graph

-**When to Use Semantic Tools**:
+**Capture Knowledge When**:

-- Understanding code architecture and relationships
-- Finding specific functions, classes, or components
-- Tracing dependencies and references
-- Getting project overviews without reading full files
+- Discovering non-obvious architectural patterns
+- Understanding complex dependencies
+- Learning analyzer-specific implementation details
+- Identifying gotchas or edge cases
+- Documenting workflow improvements
+
+**Retrieve Knowledge When**:
+
+- Starting work on unfamiliar modules
+- Understanding analyzer ecosystem
+- Looking for similar implementations
+- Debugging complex interactions
+- Planning architectural changes

 **When NOT to Use**:

-- Reading specific known file paths (use Read tool)
-- Simple file operations (use standard tools)
-- When you already have the full file content
+- Reading specific known file paths (use Read)
+- Simple code lookups (use Grep/Glob)
+- When manual docs suffice

-## Tool Usage Patterns
+## Code Navigation Patterns

-### Symbol Discovery Workflow
+### Finding Code with Standard Tools

 ```markdown
-1. get_symbols_overview("target_directory")
-2. find_symbol("TargetClass", include_body=False, depth=1)
-3. find_symbol("TargetClass/method", include_body=True)
-4. find_referencing_symbols("TargetClass/method", "file.py")
-```
-
-### Analysis Integration Workflow
-
-```markdown
-1. find_symbol("AnalyzerInterface") # Find base interface
-2. get_symbols_overview("analyzers/") # See all analyzers
-3. find_symbol("specific_analyzer/main") # Get implementation
-4. find_referencing_symbols() # See usage patterns
+# Find files by pattern
+Glob: "**/*analyzer*.py"
+Glob: "app/**/*.py"
+
+# Find class definitions
+Grep: "^class AnalyzerInterface" --type py
+
+# Find function definitions
+Grep: "^def main\(" --type py
+
+# Find usage/references
+Grep: "from app.logger import" --type py
+Grep: "AnalysisContext" --type py
+
+# Read specific files
+Read: app/app.py
+Read: analyzers/hashtags/main.py
 ```

-### Context-Aware Development
+### Code Exploration Workflow

-```python
-# Always understand the context pattern first
-find_symbol("AnalysisContext", include_body=True)
-find_symbol("ViewContext", include_body=True)
-find_symbol("AppContext", include_body=True)
+```markdown
+1. Glob to find relevant files
+2. Grep to locate specific symbols
+3. Read to understand implementation
+4. Query knowledge graph for architectural context
 ```

 ## Development Guidelines

 ### Session Startup Checklist

-1. ✅ **Call `initial_instructions`**
-2. ✅ Load @.ai-context/README.md for project overview
-3. ✅ Check `.serena/memories/` for deep insights if needed
-4. ✅ Use semantic tools for code exploration
-5. ✅ Maintain context throughout development
+1. ✅ Load `.ai-context/README.md` for project overview
+2. ✅ Query knowledge graph for relevant domain knowledge
+3. ✅ Use Grep/Glob for code exploration
+4. ✅ Maintain context throughout development

 ### Code Development Standards
@@ -108,53 +134,142 @@ logger = get_logger(__name__)
 logger.info("Operation started", extra={"context": "value"})
 ```

-Use structured logging throughout development for debugging and monitoring. See @docs/dev-guide.md#logging for complete usage patterns.
+Use structured logging throughout development. See `docs/dev-guide.md#logging` for complete patterns.
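+As a minimal sketch of the pattern in practice (the function and the `extra` keys are
+illustrative, assuming `get_logger` returns a standard `logging.Logger`):
+
+```python
+from app.logger import get_logger
+
+logger = get_logger(__name__)
+
+def run_analysis(project_id: str, analyzer_name: str) -> None:
+    # Attach structured context so logs can be filtered by project/analyzer.
+    context = {"project_id": project_id, "analyzer": analyzer_name}
+    logger.info("Analysis started", extra=context)
+    try:
+        ...  # analyzer work happens here
+    except Exception:
+        # logger.exception records the traceback alongside the same context.
+        logger.exception("Analysis failed", extra=context)
+        raise
+    logger.info("Analysis completed", extra=context)
+```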

 ### Task-Specific Patterns

 **New Analyzer Development**:

 ```markdown
-1. get_symbols_overview("analyzers/example/")
-2. find_symbol("AnalyzerInterface", include_body=True)
-3. read_memory("analyzer_architecture")
-4. Use symbolic tools to create new analyzer
+1. Glob: "analyzers/example/**/*.py" # Find example analyzer
+2. Read: analyzers/example/interface.py
+3. search_nodes("analyzer architecture") # Understand patterns
+4. Read: analyzers/example/main.py
+5. Use knowledge graph insights to implement (see the interface sketch below)
 ```

 **Bug Investigation**:

 ```markdown
-1. find_symbol("problematic_function", include_body=True)
-2. find_referencing_symbols("problematic_function", "file.py")
-3. Use semantic analysis to trace execution flow
+1. Grep: "problematic_function" --type py -n
+2. Read file with function implementation
+3. Grep: "problematic_function" (find all usages)
+4. search_nodes("related pattern") # Context
+5. Use knowledge graph to trace execution flow
 ```

 **Code Refactoring**:

 ```markdown
-1. find_referencing_symbols("target_symbol", "file.py")
-2. get_symbols_overview() to understand impact
-3. Use replace_symbol_body for precise changes
+1. Grep: "target_symbol" --type py (find all references)
+2. Read each file to understand usage
+3. open_nodes(["RelatedPattern"]) # Understand constraints
+4. Make changes with full context
 ```
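+For orientation when following the analyzer workflow above, the interface declaration
+each analyzer provides follows this shape (a sketch only; the import path and exact
+constructor arguments are assumptions, so treat `analyzers/example/interface.py` as
+the authoritative version):
+
+```python
+from analyzer_interface import AnalyzerInput, AnalyzerInterface
+
+interface = AnalyzerInterface(
+    input=AnalyzerInput(...),  # required columns and their semantic types
+    params=[...],              # user-configurable parameters
+    outputs=[...],             # output table schemas
+    kind="primary",            # or "secondary" / "web"
+)
+```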

-### Memory System Usage
+## Knowledge Graph Usage

-**Available Memories**:
+### Entity Structure Examples

-- `project_overview` - High-level project understanding
-- `code_structure` - Module organization and responsibilities
-- `analyzer_architecture` - Analyzer system deep dive
-- `suggested_commands` - Development and testing commands
-- `code_style_conventions` - Style guides and patterns
-- `task_completion_checklist` - Pre-commit requirements
+**Analyzer Entity**:

-**Memory Loading Pattern**:
+```markdown
+{
+  name: "HashtagAnalyzer",
+  entityType: "Analyzer",
+  observations: [
+    "Primary analyzer for hashtag extraction and analysis",
+    "Located in analyzers/hashtags/main.py",
+    "Uses regex patterns to extract hashtags from text columns",
+    "Outputs: hashtag frequency, co-occurrence, temporal patterns",
+    "Gotcha: Handles Unicode hashtags correctly via preprocessing"
+  ]
+}
+```
+
+**Pattern Entity**:
+
+```markdown
+{
+  name: "AnalyzerInterface",
+  entityType: "Pattern",
+  observations: [
+    "Declarative interface definition for all analyzers",
+    "Defines inputs (columns + semantic types), outputs, parameters",
+    "Three stages: Primary → Secondary → Web Presenter",
+    "Context pattern used for dependency injection",
+    "See .ai-context/architecture-overview.md for details"
+  ]
+}
+```
+
+**Service Entity**:
+
+```markdown
+{
+  name: "TokenizerService",
+  entityType: "Service",
+  observations: [
+    "Located in services/tokenizer/",
+    "AbstractTokenizer base, BasicTokenizer implementation",
+    "Handles scriptio continua (CJK, Thai, Lao, Myanmar, Khmer)",
+    "Space-separated tokenization (Latin, Arabic)",
+    "Social media entity preservation (URLs, @mentions, #hashtags)",
+    "Thread-safe, stateless API with optional streaming"
+  ]
+}
+```
+
+### Relation Examples

 ```markdown
-# Load relevant memory for current task
-read_memory("analyzer_architecture") # For analyzer work
-read_memory("suggested_commands") # For development setup
-read_memory("task_completion_checklist") # Before committing
+create_relations([
+  {from: "HashtagAnalyzer", to: "AnalyzerInterface", relationType: "implements"},
+  {from: "NGramAnalyzer", to: "TokenizerService", relationType: "uses"},
+  {from: "TokenizerService", to: "Service", relationType: "part_of"},
+  {from: "DashPresenter", to: "WebPresenterPattern", relationType: "implements"}
+])
 ```
+
+### Common Query Patterns
+
+```markdown
+# Find analyzer-related knowledge
+search_nodes("analyzer architecture")
+
+# Get tokenizer service details
+open_nodes(["TokenizerService", "BasicTokenizer"])
+
+# Understand context pattern
+search_nodes("context dependency injection")
+
+# Find all analyzers
+search_nodes("analyzer") # Filter by entityType: Analyzer
+
+# Explore web presenter patterns
+search_nodes("dash shiny web presenter")
+```
+
+### Capturing New Knowledge
+
+```markdown
+# After discovering architectural patterns
+create_entities([{
+  name: "ProgressTrackingPattern",
+  entityType: "Pattern",
+  observations: [
+    "Used in AnalysisContext for long-running operations",
+    "Callback-based with AnalysisRunProgressEvent",
+    "Supports both terminal and web UI progress reporting"
+  ]
+}])
+
+# Link related concepts
+create_relations([{
+  from: "ProgressTrackingPattern",
+  to: "AnalysisContext",
+  relationType: "part_of"
+}])
+```
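+If it helps to think of these records as typed data, the entity and relation shapes
+used above map onto something like the following (illustrative only; the MCP tools
+exchange plain JSON, and these class names are not part of any real API):
+
+```python
+from dataclasses import dataclass, field
+
+@dataclass
+class Entity:
+    name: str              # e.g. "HashtagAnalyzer"
+    entity_type: str       # Module, Analyzer, Component, Service, Pattern, Concept, Workflow
+    observations: list[str] = field(default_factory=list)
+
+@dataclass
+class Relation:
+    from_entity: str       # source entity name, e.g. "HashtagAnalyzer"
+    to_entity: str         # target entity name, e.g. "AnalyzerInterface"
+    relation_type: str     # implements, uses, part_of, extends, related_to
+```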

 ## Context Management

 ### Progressive Context Loading

 ```markdown
 # Core context (always load)
-@.ai-context/README.md
+`.ai-context/README.md`

 # Task-specific context
-@.ai-context/symbol-reference.md # For code navigation
-@.ai-context/architecture-overview.md # For system design
-@.ai-context/setup-guide.md # For environment issues
+`.ai-context/symbol-reference.md` # For code navigation
+`.ai-context/architecture-overview.md` # For system design
+`.ai-context/setup-guide.md` # For environment issues

 # Deep domain knowledge
-@.serena/memories/analyzer_architecture.md # For analyzer work
-@.serena/memories/code_style_conventions.md # For style questions
+search_nodes("analyzer architecture") # For analyzer work
+search_nodes("code style conventions") # For style questions
+search_nodes("tokenizer patterns") # For text processing
 ```

-### Symbol Navigation Examples
+### Code Navigation Examples

 ```markdown
 # Find app entry point
-find_symbol("main", relative_path="mangotango.py")
+Grep: "^def main" --path mangotango.py

 # Explore analyzer system
-get_symbols_overview("analyzers/")
-find_symbol("suite", relative_path="analyzers/__init__.py")
+Glob: "analyzers/**/__init__.py"
+Read: analyzers/__init__.py

 # Understand storage layer
-find_symbol("Storage", relative_path="storage/__init__.py", depth=1)
+Grep: "^class Storage" --type py
+Read: storage/__init__.py

 # Trace UI components
-get_symbols_overview("components/")
-find_symbol("main_menu", include_body=True)
+Glob: "components/**/*.py"
+Grep: "^def main_menu" --type py
+```
+
+### Context Switching Strategy
+
+```markdown
+1. Start with manual docs for overview
+2. Use knowledge graph for domain-specific deep dives
+3. Use Grep/Glob for precise code navigation
+4. Reference symbol guide for quick lookups
 ```

 ## Reference Links

 ### Documentation Files

 - **AI Context**: `.ai-context/` - Token-efficient documentation
 - **Development**: `docs/dev-guide.md` - Comprehensive development guide
-- **Serena Memories**: `.serena/memories/` - Semantic project knowledge
+- **Knowledge Graph**: Entity-based semantic project knowledge

 ### Key Architecture References
@@ -216,22 +342,15 @@ find_symbol("main_menu", include_body=True)

 - **Web Dashboards**: Dash and Shiny framework integration
 - **Export System**: Multi-format output generation

-## Memory System Integration
+## Documentation Integration Strategy

-### Serena + Manual Documentation Bridge
+### Knowledge Graph + Manual Documentation Bridge

 - **Manual docs** (`.ai-context/`) provide structured overviews
-- **Serena memories** (`.serena/memories/`) provide deep semantic insights
+- **Knowledge graph** provides deep semantic insights and relationships
 - **Both systems** complement each other for comprehensive understanding
 - **Symbol reference** links to actual code locations for navigation

-### Context Switching Strategy
-
-```markdown
-1. Start with manual docs for overview
-2. Use Serena memories for domain-specific deep dives
-3. Use semantic tools for precise code navigation
-4. Reference symbol guide for quick lookups
-```
+### Context Hybrid Approach

-**Note**: This hybrid approach ensures both human-readable documentation and AI-powered semantic understanding are available for maximum development efficiency.
+This approach ensures both human-readable documentation and AI-powered semantic understanding through the knowledge graph for maximum development efficiency.