feat(reporting): add passive voice detection to TipTap editor #796

marcpfuller · 2026-01-12T19:15:06Z

Issue

N/A

Description of the Change

Added server-side passive voice detection to the TipTap rich text editor using the spaCy NLP library. Created the ghostwriter/modules/passive_voice/ module with a PassiveVoiceDetector singleton class that loads spaCy's en_core_web_sm model once per Django worker. The detector identifies passive voice by finding auxiliary passive dependencies (auxpass) combined with past participle verb tags (VBN). Added a REST API endpoint, POST /api/v1/passive-voice/detect, protected by @login_required, that accepts text and returns the character ranges of passive sentences. Created a TipTap PassiveVoiceMark extension with inclusive: false so that highlights do not extend to newly typed text. Added a React button component with scan/clear functionality; highlights use a yellow background with a red wavy underline.
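A minimal sketch of the detection rule described above, not the PR's actual detector code. It assumes "combined with" means an auxpass token whose head is tagged VBN, and it returns (start, end) character ranges per passive sentence. The Token and Sentence classes are hypothetical stand-ins that mirror the spaCy attributes the rule reads (dep_, tag_, head, start_char, end_char), so the example runs without the model installed; with spaCy, the same functions could be fed doc.sents.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class Token:
    """Hypothetical stand-in for a spaCy Token (only the fields the rule reads)."""
    text: str
    dep_: str  # dependency label, e.g. "auxpass"
    tag_: str  # fine-grained POS tag, e.g. "VBN"
    head: Optional["Token"] = None


@dataclass
class Sentence:
    """Hypothetical stand-in for a spaCy sentence Span."""
    start_char: int
    end_char: int
    tokens: List[Token]

    def __iter__(self):
        return iter(self.tokens)


def is_passive(sentence: Iterable[Token]) -> bool:
    """True if any passive auxiliary is attached to a past participle."""
    return any(
        tok.dep_ == "auxpass" and tok.head is not None and tok.head.tag_ == "VBN"
        for tok in sentence
    )


def passive_ranges(sentences: Iterable[Sentence]) -> List[Tuple[int, int]]:
    """Character ranges (start, end) of the sentences flagged as passive."""
    return [(s.start_char, s.end_char) for s in sentences if is_passive(s)]


# Hand-annotated example; a real run would get these labels from the parser.
EXAMPLE_TEXT = "The report was written by the team. The team wrote the report."
_written = Token("written", "ROOT", "VBN")
_was = Token("was", "auxpass", "VBD", head=_written)
_wrote = Token("wrote", "ROOT", "VBD")
EXAMPLE_SENTENCES = [
    Sentence(0, 35, [_was, _written]),
    Sentence(36, 62, [_wrote]),
]
```

Here passive_ranges(EXAMPLE_SENTENCES) flags only the first sentence, and the range can be used to slice EXAMPLE_TEXT directly.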

Alternate Designs

Client-Side JavaScript NLP: Rejected due to lower accuracy than spaCy and the added client-side computational load.

Real-Time Detection on Typing: Rejected due to excessive API traffic, distracting UX with constantly updating highlights, and network dependency during editing.

Background Task Processing: Rejected as unnecessary complexity for sub-2-second operations on typical report text.

Selected: On-demand server-side detection for accuracy, lean bundle size, user-controlled timing, and simple architecture.

Possible Drawbacks

Memory: The spaCy model adds ~50MB per Django worker (100-200MB total for typical 2-4 worker deployments).

Build Time: Adds ~30 seconds to Docker image builds for the model download.

Text Limits: The default 100KB maximum may restrict extremely long reports (configurable via SPACY_MAX_TEXT_LENGTH).

Accuracy: NLP detection is not 100% accurate, but that is acceptable for a writing aid that is not used for validation.
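The text limit could be enforced with a check along these lines. This is a sketch only: the helper name check_text_length is hypothetical, and reading "100KB" as 100 * 1024 characters (rather than bytes) is an assumption. Only the SPACY_MAX_TEXT_LENGTH setting name comes from this PR.

```python
# Assumption: "100KB" is read as 100 * 1024 characters. In the PR the limit
# is configurable through the SPACY_MAX_TEXT_LENGTH setting.
DEFAULT_MAX_TEXT_LENGTH = 100 * 1024


def check_text_length(text: str, max_length: int = DEFAULT_MAX_TEXT_LENGTH) -> None:
    """Raise ValueError when the text exceeds the limit (hypothetical helper)."""
    if len(text) > max_length:
        raise ValueError(
            f"Text is {len(text)} characters long; the limit is {max_length}."
        )
```

Rejecting oversized input before it reaches the NLP pipeline keeps request time bounded, at the cost of requiring very long reports to be scanned in sections or with a raised limit.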

Verification Process

Unit Tests: Created 22 tests (11 detector logic, 11 API endpoint) covering singleton pattern, passive/active detection, authentication, validation, and error handling. All tests pass with 100% detector coverage, 95% API coverage.

Manual Testing: Verified basic detection in finding editor, clear highlights functionality, mixed active/passive text handling, 50KB document performance (<2 seconds), empty editor handling, network failure error messages, authentication enforcement, and unicode character support. Tested in Chrome 131, Firefox 133, Safari 18 on Mac.

Regression: Full test suite passes (1,247 tests, +3s runtime). No regression in existing modules.

Docker: Verified model download in both local and production Dockerfiles, confirmed model loads at startup.

Release Notes

Added passive voice detection to the rich text editor with one-click scanning and visual highlighting of passive voice sentences using spaCy NLP


ColonelThirtyTwo

I'm concerned about adding in an entire ML model to the server, both from a performance angle and in regards to how much this will be used. Obviously there's push from SpecterOps to reduce passive voice in reports but I'm not sure how much other GW users care about it.

ColonelThirtyTwo · 2026-01-13T13:50:39Z

javascript/src/services/passive_voice_api.ts

+    return data.ranges.map(([start, end]) => ({ start, end }));
+}
+
+function getCsrfToken(): string {


Should be in a shared module - I'm pretty sure we already have other JS code that gets the CSRF token.

I moved it into a shared module. The other JS code gets the token inline, and it is currently used in only a few places, so if no future API or code changes need the CSRF token, a shared module may be overkill. Should I remove the function and keep it inline for now?


ColonelThirtyTwo · 2026-01-13T13:59:12Z

javascript/src/tiptap_gw/passive_voice_mark.ts

+ * Note: This mark is non-inclusive, meaning typing at the boundaries
+ * won't extend the mark. Editing within the mark will remove it.
+ */
+export const PassiveVoiceMark = Mark.create({


Ideally this would be a client-side decoration rather than an actual mark so that it doesn't affect the document, but that's likely harder. Closest thing I can see is the InvisibleCharacters extension.

Barring that, we should have an option to remove all of the marks so that appropriate uses of passive voice don't end up in the final document.

Make sure this doesn't conflict with the regular highlighting mark. Adjust the mark's priority if so.

Might also want to detect the class in the report writer and just ignore the mark. Right now I think it will interpret any of these sections as a highlight.

I changed the approach here. Instead of using a mark, I created an extension that only highlights the text visually, so nothing is added to the document.


marcpfuller · 2026-01-14T00:07:01Z

> I'm concerned about adding in an entire ML model to the server, both from a performance angle and in regards to how much this will be used. Obviously there's push from SpecterOps to reduce passive voice in reports but I'm not sure how much other GW users care about it.

During Robby's Roundup, it appeared that many users would care about it. As for the ML model, the model itself is small (~50MB). I also disabled certain spaCy functions to keep the overhead to a minimum. The only other solution I can see is creating an entirely new service, separate from the Django service, so that the Django service's performance would be less of a concern.
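For illustration, a load-once wrapper of the kind discussed here might look like the sketch below. It is not the PR's PassiveVoiceDetector: the LazyModel class name is hypothetical, and the loader is injected so the pattern can be shown without the model installed. Which spaCy components the PR actually disables is not stated in this thread; the "ner" and "lemmatizer" names in the comment are assumptions, chosen because the auxpass/VBN rule only needs the tagger and parser.

```python
import threading
from typing import Any, Callable, Optional


class LazyModel:
    """Load an expensive model once per process and reuse it (hypothetical helper)."""

    def __init__(self, loader: Callable[[], Any]) -> None:
        self._loader = loader
        self._model: Optional[Any] = None
        self._lock = threading.Lock()

    def get(self) -> Any:
        # Double-checked locking: the fast path skips the lock once loaded,
        # and the second check stops two threads from both loading.
        if self._model is None:
            with self._lock:
                if self._model is None:
                    self._model = self._loader()
        return self._model


# With spaCy installed, the loader could be something like (assumed component
# names, not taken from the PR):
#   LazyModel(lambda: spacy.load("en_core_web_sm", disable=["ner", "lemmatizer"]))
```

Each Django worker process would hold its own instance, which matches the per-worker memory cost noted in the PR description.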


Copilot

Pull request overview

Adds server-side passive voice detection to the TipTap rich text editor using spaCy NLP. Users can scan editor content via a button that highlights passive voice sentences with yellow background and wavy red underline. Detection uses spaCy's en_core_web_sm model loaded once per Django worker via singleton pattern.

Changes:

Added spaCy dependency and ghostwriter/modules/passive_voice/ module with NLP-based passive voice detection
Created REST API endpoint POST /api/v1/passive-voice/detect with authentication and validation
Implemented TipTap extension with React button component for scan/clear functionality and visual decorations

Reviewed changes

Copilot reviewed 19 out of 20 changed files in this pull request and generated 1 comment.


File	Description
requirements/base.txt	Adds spaCy 3.8.11 dependency
javascript/src/tiptap_gw/passive_voice_decoration.ts	TipTap extension for passive voice decorations with hover effects
javascript/src/tiptap_gw/index.ts	Registers PassiveVoiceDecoration extension
javascript/src/services/passive_voice_api.ts	API service for passive voice detection
javascript/src/services/csrf.ts	CSRF token utility extracted for reuse
javascript/src/frontend/collab_forms/rich_text_editor/passive_voice.tsx	React button component for scanning and clearing highlights
javascript/src/frontend/collab_forms/rich_text_editor/index.tsx	Adds PassiveVoiceButton to toolbar
javascript/src/frontend/collab_forms/rich_text_editor/evidence/upload.tsx	Refactored to use shared getCsrfToken utility
javascript/src/frontend/collab_forms/editor.scss	Styles for passive voice highlights
ghostwriter/modules/passive_voice/tests/test_detector.py	Unit tests for PassiveVoiceDetector class
ghostwriter/modules/passive_voice/tests/test_api.py	Unit tests for API endpoint
ghostwriter/modules/passive_voice/tests/__init__.py	Test module initialization
ghostwriter/modules/passive_voice/detector.py	Singleton detector class using spaCy
ghostwriter/modules/passive_voice/__init__.py	Module initialization
ghostwriter/api/views.py	Adds detect_passive_voice endpoint
ghostwriter/api/urls.py	Registers passive voice detection URL
config/settings/base.py	Adds spaCy configuration settings
compose/production/django/Dockerfile	Downloads spaCy model during build
compose/local/django/Dockerfile	Downloads spaCy model during build

Files not reviewed (1)

javascript/package-lock.json: Language not supported


feat(reporting): add passive voice detection to TipTap editor

cd4fcfa

Implement server-side passive voice detection using spaCy NLP library with visual highlighting in the TipTap rich text editor. Signed-off-by: marc fuller <gogita99@gmail.com>

github-advanced-security bot found potential problems Jan 12, 2026


ghostwriter/api/views.py (fixed)

marcpfuller requested review from ColonelThirtyTwo and chrismaddalena January 12, 2026 19:16

marcpfuller self-assigned this Jan 12, 2026

marcpfuller requested review from chrismaddalena and removed request for chrismaddalena January 12, 2026 22:37

marcpfuller added 2 commits January 12, 2026 17:27

fix: resolve issures with tests

e64858d

Signed-off-by: marc fuller <gogita99@gmail.com>

fix: resolve pylint issues

c4c4f67

Signed-off-by: marc fuller <gogita99@gmail.com>

marcpfuller force-pushed the passive_voice_detection branch from 6547886 to c4c4f67 Compare January 13, 2026 01:51

fix: resolve github security stack trace issue

ac0545b

Signed-off-by: marc fuller <gogita99@gmail.com>

marcpfuller force-pushed the passive_voice_detection branch from dfd0385 to ac0545b Compare January 13, 2026 01:57

Merge branch 'master' into passive_voice_detection

0d67486

ColonelThirtyTwo requested changes Jan 13, 2026


marcpfuller requested a review from ColonelThirtyTwo January 14, 2026 00:07

fix: resolve comments made by @ColonelThirtyTwo in PR

05061fb

Signed-off-by: marc fuller <gogita99@gmail.com>

Copilot AI review requested due to automatic review settings January 14, 2026 00:08

Copilot AI reviewed Jan 14, 2026


javascript/src/frontend/collab_forms/rich_text_editor/evidence/upload.tsx (resolved)

marcpfuller added 2 commits January 14, 2026 09:02

fix: add null check to upload.tsx when getting csrf token

a78cf05

Signed-off-by: marc fuller <gogita99@gmail.com>

Merge branch 'master' into passive_voice_detection

d2f10f0
