
120 add option to remove translation keys that only exists on kanta translations table #131

Merged
vincentvanbush merged 9 commits into develop from
120-add-option-to-remove-translation-keys-that-only-exists-on-kanta__translations-table
Oct 24, 2025

Conversation

Contributor

@jk-lamb jk-lamb commented Oct 20, 2025

Pull Request

Description

Add stale translation detection system with fuzzy matching and merge functionality. This feature allows users to identify translations that exist in the database but are no longer in PO files, and provides tools to either merge them with similar messages (preserving translations) or delete them entirely through the UI.

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Performance improvement
  • Code refactoring
  • Test improvements
  • CI/CD improvements

Related Issues

Closes #120, closes #127, closes #130

Changes Made

Core Functionality

  • Stale detection system: added the StaleDetection service, which finds messages that exist in the database but no longer appear in PO files
  • Fuzzy matching algorithm: finds active messages similar to stale ones, using a Jaro distance threshold
  • Message merging: Implemented Kanta.Translations.merge_messages/2 to preserve translations when merging stale messages with active ones

UI Enhancements

  • Dashboard updates: Added stale and mergeable message count cards with dedicated views
  • Bulk operations: Added bulk merge and bulk delete functionality for stale translations
  • Messages table: Enhanced table with merge/delete action buttons for individual stale messages
  • Icons: Added new icon components for merge operations

Testing

  • Added comprehensive test suite for StaleDetection service (Kanta.PoFiles.Services.StaleDetectionTest)
  • Added extensive test coverage for message merging functionality (425 new test lines)
  • Tests cover edge cases, plural translations, contexts, and error scenarios

Testing

Test Environment

  • Elixir version: 1.19.0
  • OTP version: 28
  • Phoenix version: 1.8.1
  • Database: PostgreSQL (via Ecto.Adapters.Postgres)
  • Gettext version: >= 0.26.0

Test Cases

  • All existing tests pass (99 tests, 0 failures)
  • New tests added for new functionality at appropriate levels
  • Manual testing performed

Test Commands Run

mix test
# Output: 99 tests, 0 failures

Documentation

  • Updated README.md (if applicable)
  • Updated documentation comments (with examples for new features)
  • Updated CHANGELOG.md (if applicable)

Code Quality

  • Code follows the existing style conventions
  • Self-review of the code has been performed
  • Code has been commented, particularly in hard-to-understand areas
  • No new linting warnings introduced
  • No new Dialyzer warnings introduced

Backward Compatibility

  • This change is backward compatible
  • This change includes breaking changes (please describe below)
  • Migration guide provided for breaking changes

Breaking Changes

None. This is a purely additive feature that doesn't modify existing APIs or database schema.

Performance Impact

  • No performance impact
  • Performance improvement
  • Potential performance regression (please describe)

Performance Notes

The fuzzy matching algorithm computes the Jaro distance for each candidate pair, which costs O(n*m) for two msgids of lengths n and m. It is only triggered on demand when users navigate to the stale translations view, and the results are presented in the UI for user action rather than applied automatically. Candidate pairs are scoped by domain and context to reduce the comparison space.
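
To illustrate why scoping helps, here is a rough sketch that counts how many msgid pairs would be compared with and without the domain/context filter. The module name and the map shape (`domain`, `context` keys) are assumptions made for this example, not code from the PR.

```elixir
defmodule ComparisonCountSketch do
  # Hypothetical helper, not part of Kanta.

  # Pairs compared when every stale message is checked against every active one.
  def unscoped(stale, active), do: length(stale) * length(active)

  # Pairs compared when only messages sharing {domain, context} are checked.
  def scoped(stale, active) do
    active_per_scope = Enum.frequencies_by(active, &{&1.domain, &1.context})

    stale
    |> Enum.map(&Map.get(active_per_scope, {&1.domain, &1.context}, 0))
    |> Enum.sum()
  end
end
```

With two stale messages in one domain, one in another, and three and two active messages in those domains respectively, the unscoped count is 15 pairs and the scoped count is 8.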

Translation Management Impact

  • No impact on existing translations
  • Affects translation extraction process
  • Affects translation storage/retrieval
  • Affects Kanta UI/dashboard
  • Affects plugin system
  • Database schema changes

Translation Impact Notes

  • Adds new UI functionality for managing stale translations (messages in DB but not in PO files)
  • Provides merge functionality to preserve translations when msgids change
  • No changes to existing translation workflows or storage mechanisms
  • Purely additive feature that enhances translation management capabilities

Security Considerations

  • No security impact
  • Security improvement
  • Potential security impact (please describe)

Additional Notes

Key Implementation Details

Fuzzy Matching Algorithm:
The fuzzy matching uses Jaro distance (String.jaro_distance/2) to calculate similarity between messages. The threshold is configurable (default 0.8 similarity ratio), allowing users to control how strict the matching should be. Matches are scoped by domain and context for relevance.
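
A minimal sketch of the matching step described above, assuming messages are plain maps with `id`, `msgid`, `domain`, and `context` keys. `FuzzyMatchSketch` and `best_match/3` are hypothetical names for illustration; the actual service may structure this differently.

```elixir
defmodule FuzzyMatchSketch do
  # Hypothetical stand-in for the matching step; not Kanta's actual code.
  @default_threshold 0.8

  # Returns {active_message, score} for the most similar active message
  # at or above the threshold, or nil when nothing qualifies.
  def best_match(stale, active_messages, threshold \\ @default_threshold) do
    candidates =
      active_messages
      # Scope candidates to the same domain and context as the stale message.
      |> Enum.filter(&(&1.domain == stale.domain and &1.context == stale.context))
      |> Enum.map(&{&1, String.jaro_distance(stale.msgid, &1.msgid)})
      |> Enum.filter(fn {_message, score} -> score >= threshold end)

    case candidates do
      [] -> nil
      _ -> Enum.max_by(candidates, fn {_message, score} -> score end)
    end
  end
end
```

Raising the threshold toward 1.0 makes matching stricter, so fewer stale messages are reported as mergeable.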

Modular Architecture:

  • Kanta.PoFiles.Services.StaleDetection - Main service module with call/1 function
  • Kanta.PoFiles.Services.StaleDetection.FuzzyMatch - Struct representing a fuzzy match result
  • Kanta.PoFiles.Services.StaleDetection.Result - Structured result type for detection operations
  • Kanta.PoFiles.MessagesExtractorAgent - GenServer that caches stale detection results for UI performance

Caching Strategy:
The MessagesExtractorAgent runs stale detection on startup and caches the result. The UI calls get_stale_detection_result() to get cached results (fast) or get_stale_detection_result(true) to force recalculation after merge/delete operations.
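
The cache-or-recompute pattern can be sketched roughly as follows. This uses a plain `Agent` and an injected zero-arity `detect_fun` purely for illustration; `StaleCacheSketch` and `get_result/2` are hypothetical names, and the real MessagesExtractorAgent is a GenServer with its own API.

```elixir
defmodule StaleCacheSketch do
  # Hypothetical illustration of the caching idea; names are not Kanta's.
  use Agent

  # `detect_fun` is a zero-arity function standing in for the detection call.
  def start_link(detect_fun) when is_function(detect_fun, 0) do
    # Run detection once at startup and keep the result next to the function.
    Agent.start_link(fn -> %{detect: detect_fun, result: detect_fun.()} end)
  end

  # Cached read by default; pass `true` to recompute and replace the cache.
  def get_result(pid, force? \\ false)

  def get_result(pid, false), do: Agent.get(pid, & &1.result)

  def get_result(pid, true) do
    Agent.get_and_update(pid, fn state ->
      fresh = state.detect.()
      {fresh, %{state | result: fresh}}
    end)
  end
end
```

Forcing a refresh after a merge or delete keeps the dashboard counts in step with the database, while ordinary page loads only read the cached value.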

UI Workflow:

  1. Dashboard shows counts of stale and mergeable messages
  2. Users can view detailed list of stale translations
  3. For each stale message, users can:
    • Delete it (removes message and all translations)
    • Merge it with a similar active message (preserves translations)
  4. Bulk operations available for efficiency

Screenshots/Examples

Stale Detection API Usage

Recommended: Using MessagesExtractorAgent (with caching)

# Get cached stale detection result (recommended for UI)
result = Kanta.PoFiles.MessagesExtractorAgent.get_stale_detection_result()

# Force recalculation of stale detection
result = Kanta.PoFiles.MessagesExtractorAgent.get_stale_detection_result(true)

Direct Service Call (lower-level API)

# Detect stale translations with fuzzy matching (default threshold 0.8)
{:ok, result} = Kanta.PoFiles.Services.StaleDetection.call(fuzzy_threshold: 0.8)

Result Structure

# Both APIs yield a %StaleDetection.Result{} struct (the direct call wraps it in {:ok, result}) containing:
# - stale_message_ids: MapSet of message IDs that are stale
# - fuzzy_matches_map: Map of stale_message_id => %FuzzyMatch{}
# - stale_count: Total number of stale messages
# - mergeable_count: Number of stale messages with fuzzy matches above threshold

# Each FuzzyMatch struct contains:
# - stale_message_id: ID of the stale message
# - matched_message_id: ID of the active message that matches
# - matched_msgid: The msgid string of the matched message
# - similarity_score: Jaro distance score (0.0-1.0)
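
As a rough example of consuming such a result, the sketch below splits stale IDs into merge candidates and delete-only IDs. The structs here are local stand-ins that mirror the documented field names; `ResultSketch` and `plan/1` are hypothetical and not part of the PR.

```elixir
defmodule ResultSketch do
  # Local stand-ins mirroring the documented fields; not Kanta's modules.
  defmodule FuzzyMatch do
    defstruct [:stale_message_id, :matched_message_id, :matched_msgid, :similarity_score]
  end

  defmodule Result do
    defstruct stale_message_ids: MapSet.new(),
              fuzzy_matches_map: %{},
              stale_count: 0,
              mergeable_count: 0
  end

  # Splits stale IDs into {stale_id, matched_id} merge pairs and IDs with no match.
  def plan(%Result{stale_message_ids: ids, fuzzy_matches_map: matches}) do
    {mergeable, unmatched} = Enum.split_with(ids, &Map.has_key?(matches, &1))

    merges =
      mergeable
      |> Enum.sort()
      |> Enum.map(fn id -> {id, matches[id].matched_message_id} end)

    %{merge: merges, delete: Enum.sort(unmatched)}
  end
end
```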

Message Merging Example

# Merge a stale message into an active one (moves all translations)
{:ok, target_message} = Kanta.Translations.merge_messages(
  from_message_id,  # Source message (will be deleted)
  to_message_id     # Target message (will receive all translations)
)

# This operation:
# 1. Deletes all existing translations from the target message
# 2. Moves all translations (singular & plural) from source to target
# 3. Deletes the source message
# 4. Returns the target message
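
The four steps can be modelled in memory as below. This is only a sketch of the semantics: the store shape, the `MergeSketch` module, the three-element return tuple, and the `{:error, :invalid_merge}` value are assumptions for illustration, whereas the real function operates on database rows.

```elixir
defmodule MergeSketch do
  # In-memory model of the documented steps; not Kanta's implementation.
  # `store` is %{messages: %{id => msgid}, translations: [%{message_id: ...}]}.
  def merge_messages(store, from_id, to_id) do
    with true <- from_id != to_id,
         true <- Map.has_key?(store.messages, from_id),
         true <- Map.has_key?(store.messages, to_id) do
      translations =
        store.translations
        # 1. Drop translations that already belong to the target.
        |> Enum.reject(&(&1.message_id == to_id))
        # 2. Re-point the source's translations at the target.
        |> Enum.map(fn
          %{message_id: ^from_id} = translation -> %{translation | message_id: to_id}
          translation -> translation
        end)

      # 3. Remove the source message.
      messages = Map.delete(store.messages, from_id)

      # 4. Return the target together with the updated store.
      {:ok, messages[to_id], %{store | messages: messages, translations: translations}}
    else
      false -> {:error, :invalid_merge}
    end
  end
end
```

Note that step 1 discards whatever the target already had, so a merge is best reserved for cases where the stale message carries the translations worth keeping.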

Dashboard Integration

  • Dashboard now displays:
    • "Stale Messages" count card (links to stale messages view)
    • "Mergeable Messages" count card (stale messages with fuzzy matches)
  • Translations table shows merge/delete buttons for stale messages
  • Bulk operations support selecting multiple messages for merge or delete

Checklist

  • I have read the Contributing Guidelines
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published

Reviewer Notes

Please pay special attention to:

  1. Fuzzy matching algorithm in lib/kanta/po_files/services/stale_detection.ex - verify the Jaro distance implementation and domain/context scoping
  2. Message merging logic in lib/kanta/translations/messages/messages.ex (merge_messages/2 function) - verify that translations are correctly moved and edge cases are handled
  3. UI bulk operations - test the bulk merge/delete functionality with various selections in the dashboard
  4. Test coverage - review the comprehensive test suites in test/kanta/po_files/services/stale_detection_test.exs and test/kanta/translations/messages/merge_test.exs

…translation-keys-that-only-exists-on-kanta__translations-table

Merge with latest gettext version
…ality (#120)

Add stale translation detection system with a modular architecture:
- New service structure for stale messages detection
- Added fuzzy matching algorithm to detect similar messages based on Jaro distance
- Implemented message merging to preserve translations from stale messages

Enhanced UI with bulk operations:
- Dashboard now shows stale and mergeable message counts
- Added bulk merge and delete operations for stale translations
- Updated translations table with merge/delete actions

Added comprehensive test coverage for stale detection and message merging.
…remove-translation-keys-that-only-exists-on-kanta__translations-table
Contributor Author

@jk-lamb jk-lamb left a comment

All of the requested changes applied.

- Replace Keyword.merge chain on empty list with direct keyword list syntax
- Move require Logger to module level and remove debug logging from event handlers
- Unify MessagesExtractor.call() API to return {:ok, result} for consistency with StaleDetection.call()
- Fix unused variable warnings by prefixing with underscore
@vincentvanbush vincentvanbush merged commit b502791 into develop Oct 24, 2025
4 checks passed

2 participants