Textifying Speaking is a local full-stack web application that transcribes audio and video files and can optionally generate summaries of the resulting transcriptions. It is aimed at students, professionals, and anyone who needs to convert multimedia content into text and summarize it.
- User Registration (US-01): Secure user registration with email and password
- Username validation (3-30 characters)
- Email format validation
- Password strength enforcement (minimum 8 characters)
- Secure password hashing using bcrypt
- Duplicate email/username detection
- Success modal with navigation options
- User Login (US-02): Secure JWT-based authentication
- Email and password validation
- Bcrypt password verification
- JWT token generation (1-hour expiration)
- Client-side token storage
- Invalid credentials handling
- Success notification with automatic redirect
- Media File Upload (US-03): Upload audio/video files for transcription
- Supported formats: MP3, WAV, MP4, M4A
- Maximum file size: 100MB
- Drag-and-drop interface
- Real-time upload progress tracking
- Client-side file validation
- Server-side file type and size validation
- JWT-protected endpoint (authentication required)
- Multipart form data support with Axios
- Dashboard & File Management (US-04): View and manage uploaded files
- List all user's uploaded files in a responsive grid layout
- File cards display: filename, type, size, upload date, status
- Color-coded status badges (uploaded, processing, completed, failed)
- File type icons for audio/video files
- View detailed file information in modal
- Delete files with confirmation prompt
- Ownership validation (users can only delete their own files)
- Dynamic UI updates after deletion
- Empty state with upload CTA
- JWT-protected endpoints (authentication required)
- Real-Time File Status Updates (US-05): Monitor file processing in real-time
- WebSocket-based real-time status updates using Socket.IO
- Status indicators: uploading, ready, processing, completed, error
- Progress tracking (0-100%) for files being processed
- Connection status indicator in dashboard
- Automatic UI updates without page refresh
- Status badges with animated icons:
- Uploading: Purple with spinner
- Ready: Green
- Processing: Yellow with spinning cog + progress percentage
- Completed: Blue with checkmark
- Error: Red with alert icon
- Progress bar visualization in file details modal
- Error message display for failed operations
- Toast notifications for completed files and errors
- JWT-secured WebSocket connections
- User-scoped updates (only see your own file updates)
- Status update API endpoint for testing/integration
- Audio/Video Transcription (US-06): Transcribe media files to text
- One-click transcription initiation from dashboard
- "Transcribe" button for files in 'ready' status
- Async transcription processing (non-blocking)
- HuggingFace Whisper (medium) model for speech recognition
- Python-based transcription service (Flask + Transformers)
- Real-time status updates via WebSocket
- Status progression: ready → processing → completed/error
- Transcribed text stored in database
- View transcribed text in modal with copy-to-clipboard
- File ownership validation (users can only transcribe their own files)
- File type validation (audio/video only)
- Duplicate transcription prevention
- Error handling with descriptive messages
- Support for multiple audio formats (MP3, WAV, M4A, MP4)
- GPU acceleration support when available
- Containerized transcription service with Docker
- JWT-protected transcription endpoint
- Background Real-time Transcription Progress (US-07): Process transcriptions asynchronously with real-time feedback
- BullMQ job queue for background transcription processing
- Redis as message broker for distributed job queue
- Non-blocking transcription (returns immediately after job enqueue)
- Background worker processes jobs with configurable concurrency (2 jobs simultaneously)
- Automatic retry mechanism (up to 3 attempts with exponential backoff)
- Real-time progress updates every 5% (5%, 10%, 15%, ..., 90%, 95%, 100%)
- WebSocket broadcasts of progress to authenticated users
- Google Drive-style floating progress indicator in UI:
- Fixed bottom-right corner position
- Shows all files currently processing
- Real-time progress bars with percentage
- Auto-hides when no files processing
- Enhanced toast notifications with rich content:
- Transcription started (info with spinner icon)
- Transcription completed (success with checkmark)
- Transcription failed (error with details)
- User can navigate freely while transcription runs in background
- Failed jobs retained for debugging and monitoring
- Dashboard updates automatically without page refresh
- Scalable architecture for handling multiple concurrent transcriptions
- View Transcription (US-08): Access and review transcribed text
- Dedicated GET /media/:id/transcription endpoint for secure access
- View transcribed text in responsive modal dialog
- Copy-to-clipboard functionality for easy text extraction
- Displays file metadata (filename, file ID) with transcription
- Status-aware responses:
- Completed: Returns full transcription text
- Processing: Shows progress and "in progress" message
- Error: Displays error message from failed transcription
- Ready/Uploading: Indicates transcription hasn't started
- Ownership validation (users can only view their own transcriptions)
- Real-time UI updates when transcription completes via WebSocket
- Scrollable container for long transcriptions
- Whitespace-preserved text formatting
- Error handling for missing or incomplete transcriptions
- JWT-protected endpoint ensuring data privacy
- Integrates with existing Dashboard file cards
- "View Text" button appears for completed files with transcription
- Summarize Transcription (US-09): Generate AI-powered summaries of completed transcriptions
- Dedicated POST /media/:id/summarize endpoint with JWT authentication
- Uses mT5_multilingual_XLSum model for multilingual abstractive text summarization
- Supports 45+ languages for comprehensive international coverage
- Asynchronous processing via BullMQ job queue (non-blocking API responses)
- Real-time WebSocket updates for summarization progress and completion
- Dashboard features:
- "Summarize" button for completed files with transcriptions
- "View Summary" button displays after successful summarization
- Summary modal shows both summary and original transcription
- Copy-to-clipboard for both summary and transcription
- Color-coded UI (purple theme) to distinguish from transcription
- Validation and error handling:
- Only files with completed transcriptions can be summarized
- File ownership validation (403 if unauthorized)
- Prevents duplicate summarization when already processing
- Descriptive error messages for failures
- Progress tracking:
- Files tracked with summaryStatus field (pending, processing, completed, error)
- Real-time toast notifications (start, completion, error)
- FloatingProgressIndicator shows summarization progress
- Visual indicators with animated icons
- Background processing:
- Automatic chunking for texts longer than 512 words
- Configurable summary length (30-150 tokens)
- Retry mechanism (up to 3 attempts with exponential backoff)
- Failed jobs retained for debugging
- Data persistence:
- Summary stored in MongoDB (summaryText field)
- Error messages captured (summaryErrorMessage field)
- Summary status tracked separately from transcription status
- Independent service architecture:
- Python Flask service on port 5001
- GPU acceleration support via CUDA
- Scalable and isolated from transcription service
- WebSocket events for real-time updates:
- summaryStatusUpdate broadcasts to user when status changes
- Automatic UI synchronization without page refresh
- Background Real-time Summary Status (US-10): Process summarizations asynchronously with real-time feedback
- BullMQ job queue for background summarization processing
- Redis as message broker for distributed job queue
- Non-blocking summarization (returns immediately after job enqueue)
- Background worker processes jobs with configurable concurrency (2 jobs simultaneously)
- Automatic retry mechanism (up to 3 attempts with exponential backoff)
- Real-time status updates via WebSocket broadcasts to authenticated users
- Google Drive-style floating progress indicator in UI:
- Shows all files currently processing (transcription and summarization)
- Purple gradient theme for summarization progress
- Real-time progress updates
- Auto-hides when no files processing
- Multiple files displayed with scroll support
- Enhanced toast notifications with rich content:
- Summarization started (info with spinner icon)
- Summarization completed (success with checkmark)
- Summarization failed (error with details and error message)
- User can navigate freely while summarization runs in background
- Failed jobs retained for debugging and monitoring
- Dashboard updates automatically without page refresh
- Duplicate job prevention (cannot start multiple summarizations for same file)
- Scalable architecture for handling multiple concurrent summarizations
- Independent from transcription processing (parallel processing support)
- Status management via summaryStatus field (pending, processing, completed, error)
- Error messages captured and displayed to users
- Complete integration with existing real-time update infrastructure
- View Summary Alongside Transcription (US-11): Compare and review transcription with summary side-by-side
- Enhanced modal dialog with side-by-side layout for transcription and summary
- GET /media/:id/transcription endpoint returns both transcription and summary data
- Side-by-side comparison view:
- Left panel: Full transcription with indigo theme
- Right panel: AI-generated summary with purple theme
- Independent scrolling for each panel
- Copy-to-clipboard buttons for both transcription and summary
- Status-aware summary display:
- Completed: Shows generated summary with copy functionality
- Processing: Animated spinner with "Summary in progress..." message
- Pending: Clock icon with suggestion to click "Summarize" button
- Error: Alert icon with error message details
- Not started: Placeholder with instructions to generate summary
- Real-time updates when summary becomes ready:
- WebSocket integration automatically updates modal
- No page refresh required
- Toast notifications on status changes
- Smooth UI transitions between states
- Enhanced API response format:
{
"status": "completed",
"transcription": "Full transcribed text...",
"summaryText": "AI-generated summary...",
"summaryStatus": "completed",
"fileId": "...",
"originalFilename": "..."
}
- Responsive design:
- Desktop: Two-column side-by-side layout
- Tablet/Mobile: Single-column stacked layout (future enhancement)
- Maximum viewport height (90vh) with scrollable content
- File ownership validation (403 if unauthorized)
- Integrated with existing Dashboard:
- "View Text" button opens new side-by-side modal
- Summary modal deprecated in favor of unified view
- Consistent styling with existing UI components
- Enhanced user experience:
- Clear visual distinction between transcription and summary
- Status indicators prevent confusion about summary availability
- Automatic updates keep displayed content synchronized
- File metadata displayed in modal header
- Supports all transcription/summary states:
- File not yet transcribed: Shows appropriate message
- Transcription in progress: Displays progress information
- Transcription completed, summary not started: Shows transcription with prompt to summarize
- Both completed: Full side-by-side comparison view
- Performance optimized:
- Lazy loading of summary text
- Efficient WebSocket message handling
- Minimal re-renders on status updates
Backend:
- Framework: NestJS 10.x (TypeScript)
- Database: MongoDB with Mongoose ODM
- Cache/Queue: Redis 7.0 for BullMQ job queue
- Job Queue: BullMQ with @nestjs/bullmq for background job processing
- Authentication: bcrypt for password hashing, JWT for session management
- File Upload: Multer for multipart form data handling
- Real-Time Communication: Socket.IO with @nestjs/websockets & @nestjs/platform-socket.io
- HTTP Client: axios for service-to-service communication
- Validation: class-validator & class-transformer
- Configuration: @nestjs/config for environment variables
- Testing: Jest for unit and E2E tests
Transcription service:
- Language: Python 3.11
- Framework: Flask 3.0
- AI Model: OpenAI Whisper (medium) via HuggingFace Transformers
- ML Libraries: PyTorch, Transformers, Accelerate
- Audio Processing: FFmpeg
- Container: Python 3.11-slim Docker image
- Optimization: Lazy loading (model loads on first request)
Summarization service:
- Language: Python 3.11
- Framework: Flask 3.0
- AI Model: mT5_multilingual_XLSum via HuggingFace Transformers (45+ languages)
- ML Libraries: PyTorch, Transformers, Accelerate, SentencePiece
- Container: Python 3.11-slim Docker image
- Optimization: Lazy loading (model loads on first request)
Both AI services use lazy loading to optimize GPU memory usage:
- Models load on-demand (on first request) rather than at service startup
- Services start in seconds without loading heavy AI models
- Both services can coexist on same GPU without memory conflicts
- Health endpoint includes a model_loaded field to verify model status
- Implementation: get_pipeline() in transcription, get_summarizer() in summarization
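The lazy-loading pattern described above can be sketched as follows. This is a minimal illustration, not the services' actual code: `load_model` is a hypothetical stand-in for the expensive model load, and the `"status": "ok"` field of the health payload is an assumption (only `model_loaded` is named in this README).

```python
import threading

_pipeline = None           # cached model object; stays None until first use
_lock = threading.Lock()   # prevents two concurrent requests from loading twice


def load_model():
    """Hypothetical placeholder for the slow, memory-heavy model load."""
    return lambda text: f"processed: {text}"


def get_pipeline():
    """Return the cached pipeline, loading it on the first call only."""
    global _pipeline
    if _pipeline is None:
        with _lock:
            if _pipeline is None:   # re-check after acquiring the lock
                _pipeline = load_model()
    return _pipeline


def health():
    """Health payload reporting whether the model is already in memory."""
    return {"status": "ok", "model_loaded": _pipeline is not None}
```

Because nothing is loaded at import time, the service can start and answer health checks immediately; the first real request pays the loading cost.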
Frontend:
- Framework: React 19 + Vite 7
- Routing: React Router DOM 7
- Styling: TailwindCSS 4
- HTTP Client: Axios for API requests & file uploads
- Real-Time Communication: Socket.IO client (socket.io-client)
- Form Validation: Yup for schema-based validation
- State Management: Zustand with persist middleware
- Notifications: React Toastify
- Icons: Iconify React
Infrastructure:
- Containerization: Docker & Docker Compose
- Services: Frontend (port 5173), Backend (port 3001), Transcription (port 5000), Summarization (port 5001), MongoDB (port 27017), Redis (port 6379)
- Build Tool: Makefile for common tasks
Prerequisites:
- Docker and Docker Compose V2
- Make (optional, for using Makefile commands)
# Clone the repository
git clone <repository-url>
cd textifying-speaking
# Build and start all services
make quickstart
# OR
docker compose up --build -d
# Access the application
# Frontend: http://localhost:5173
# Backend: http://localhost:3001
# Transcription: http://localhost:5000
# MongoDB: mongodb://localhost:27017
# Redis: redis://localhost:6379
# View all available commands
make help
# Build containers
make build
# Start services
make up
# Stop services
make down
# View logs
make logs
make logs-backend
make logs-frontend
make logs-transcription
make logs-redis
# Run tests
make test-backend
# Access container shells
make shell-backend
make shell-frontend
make shell-transcription
make shell-redis
make redis-cli
make db-shell
# Clean up (removes volumes)
make clean
POST /auth/register
Register a new user account.
Request Body:
{
"username": "johndoe",
"email": "john@example.com",
"password": "securePassword123"
}
Success Response (201):
{
"message": "User registered successfully",
"user": {
"id": "507f1f77bcf86cd799439011",
"username": "johndoe",
"email": "john@example.com",
"createdAt": "2025-11-19T20:00:00.000Z"
}
}
Error Responses:
- 400 Bad Request: Validation errors (invalid email, weak password, etc.)
- 409 Conflict: Email or username already exists
Validation Rules:
- username: Required, 3-30 characters
- email: Required, valid email format
- password: Required, minimum 8 characters
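The validation rules above can be expressed as a small check. This is an illustrative sketch only: the backend uses class-validator, and the function name, error strings, and the deliberately simple email pattern here are assumptions.

```python
import re

# Simple illustrative pattern; production validators are stricter.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def validate_registration(username, email, password):
    """Return a list of validation errors; an empty list means the input passes."""
    errors = []
    if not isinstance(username, str) or not 3 <= len(username) <= 30:
        errors.append("username must be 3-30 characters")
    if not isinstance(email, str) or not EMAIL_RE.match(email):
        errors.append("email must be a valid email address")
    if not isinstance(password, str) or len(password) < 8:
        errors.append("password must be at least 8 characters")
    return errors
```

A non-empty result corresponds to the 400 Bad Request case; the 409 Conflict case needs a database lookup and is not covered here.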
POST /auth/login
Authenticate a user and receive a JWT token.
Request Body:
{
"email": "john@example.com",
"password": "securePassword123"
}
Success Response (200):
{
"message": "Login successful",
"accessToken": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...",
"user": {
"id": "507f1f77bcf86cd799439011",
"username": "johndoe",
"email": "john@example.com"
}
}
Error Responses:
- 400 Bad Request: Validation errors (missing fields, invalid email format)
- 401 Unauthorized: Invalid credentials (wrong email or password)
Validation Rules:
- email: Required, valid email format
- password: Required
JWT Token:
- Expiration: 1 hour
- Payload includes: user ID (sub), email, username
- Store in localStorage on the client for subsequent authenticated requests
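To make the token layout concrete, here is a standard-library sketch of an HS256 JWT carrying the payload fields listed above with a one-hour lifetime. It is not the backend's implementation (which is in NestJS); the function names are hypothetical, and the `iat`/`exp` claim names follow the JWT specification rather than anything stated in this README.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Restore the stripped padding, then decode."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_token(user, secret, now=None, ttl_seconds=3600):
    """Build a signed token with sub, email, username and a 1-hour expiry."""
    issued_at = int(time.time()) if now is None else int(now)
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {
        "sub": user["id"],
        "email": user["email"],
        "username": user["username"],
        "iat": issued_at,
        "exp": issued_at + ttl_seconds,
    }
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(secret.encode("utf-8"), signing_input.encode("ascii"),
                         hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def verify_token(token, secret, now=None):
    """Return the payload if the signature matches and the token is unexpired."""
    current = int(time.time()) if now is None else int(now)
    try:
        signing_input, signature = token.rsplit(".", 1)
        expected = hmac.new(secret.encode("utf-8"), signing_input.encode("ascii"),
                            hashlib.sha256).digest()
        if not hmac.compare_digest(_b64url_decode(signature), expected):
            return None
        payload = json.loads(_b64url_decode(signing_input.split(".")[1]))
    except (ValueError, IndexError):
        return None
    return payload if payload.get("exp", 0) > current else None
```

A client would send the returned string in the `Authorization: Bearer ...` header; once `exp` has passed, the user has to log in again because refresh tokens are not implemented yet.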
POST /media/upload
Upload an audio or video file for transcription.
Authentication: Required (Bearer JWT token)
Request:
- Content-Type: multipart/form-data
- Body: Form data with file field
Example (curl):
curl -X POST http://localhost:3001/media/upload \
-H "Authorization: Bearer YOUR_JWT_TOKEN" \
-F "file=@/path/to/audio.mp3"
Success Response (201):
{
"message": "File uploaded successfully",
"file": {
"id": "507f1f77bcf86cd799439011",
"filename": "file-1637258400000-123456789.mp3",
"originalFilename": "audio.mp3",
"mimetype": "audio/mpeg",
"size": 2048576,
"uploadDate": "2025-11-19T20:00:00.000Z",
"status": "uploaded"
}
}
Error Responses:
- 400 Bad Request: No file uploaded, invalid file type, or file too large
- 401 Unauthorized: Missing or invalid JWT token
Validation Rules:
- Allowed MIME types: audio/mpeg, audio/wav, audio/x-wav, video/mp4, audio/mp4, audio/x-m4a
- Maximum file size: 100MB
- Allowed extensions: .mp3, .wav, .mp4, .m4a
Storage:
- Files stored in: MEDIA_STORAGE_PATH (default: ./uploads)
- Filename format: file-{timestamp}-{random}.{ext}
- Metadata stored in MongoDB
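The upload checks and the stored-filename format might look like the sketch below. Several details are assumptions: that "100MB" means 100 × 1024 × 1024 bytes, that the timestamp is in milliseconds, that the random part is a number below 10^9, and the function names themselves.

```python
import os
import secrets
import time

ALLOWED_MIME_TYPES = {
    "audio/mpeg", "audio/wav", "audio/x-wav",
    "video/mp4", "audio/mp4", "audio/x-m4a",
}
ALLOWED_EXTENSIONS = {".mp3", ".wav", ".mp4", ".m4a"}
MAX_FILE_SIZE = 100 * 1024 * 1024  # assumption: 100MB interpreted as 100 MiB


def validate_upload(original_filename, mimetype, size):
    """Return an error message, or None when the file passes every check."""
    ext = os.path.splitext(original_filename)[1].lower()
    if mimetype not in ALLOWED_MIME_TYPES or ext not in ALLOWED_EXTENSIONS:
        return "invalid file type"
    if size > MAX_FILE_SIZE:
        return "file too large"
    return None


def stored_filename(original_filename):
    """Build a name of the form file-{timestamp}-{random}.{ext}."""
    ext = os.path.splitext(original_filename)[1].lower()
    timestamp_ms = int(time.time() * 1000)
    return f"file-{timestamp_ms}-{secrets.randbelow(10**9)}{ext}"
```

Checking both the MIME type and the extension guards against a client that mislabels one of the two; either failure maps to the 400 Bad Request response.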
GET /media
List all uploaded files for the authenticated user.
Authentication: Required (Bearer JWT token)
Example (curl):
curl -X GET http://localhost:3001/media \
-H "Authorization: Bearer YOUR_JWT_TOKEN"
Success Response (200):
{
"files": [
{
"id": "507f1f77bcf86cd799439011",
"filename": "file-1637258400000-123456789.mp3",
"originalFilename": "audio.mp3",
"mimetype": "audio/mpeg",
"size": 2048576,
"uploadDate": "2025-11-19T20:00:00.000Z",
"status": "uploaded"
}
]
}
Error Responses:
- 401 Unauthorized: Missing or invalid JWT token
Response Fields:
- id: Unique file identifier
- filename: Stored filename on server
- originalFilename: Original filename uploaded by user
- mimetype: File MIME type
- size: File size in bytes
- uploadDate: Upload timestamp (ISO 8601)
- status: Processing status (uploaded, processing, completed, failed)
DELETE /media/:id
Delete a specific file by ID.
Authentication: Required (Bearer JWT token)
Path Parameters:
id: File ID (MongoDB ObjectId)
Example (curl):
curl -X DELETE http://localhost:3001/media/507f1f77bcf86cd799439011 \
-H "Authorization: Bearer YOUR_JWT_TOKEN"
Success Response (200):
{
"message": "File deleted successfully"
}
Error Responses:
- 401 Unauthorized: Missing or invalid JWT token
- 403 Forbidden: User does not own the file
- 404 Not Found: File not found
Behavior:
- Validates file ownership before deletion
- Deletes both physical file from storage and database record
- If physical file is missing, continues with database deletion (logs error)
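The delete behavior above can be sketched as a single function. This is an illustration under stated assumptions: `files` is a plain dict standing in for the MongoDB collection, the `userId` and `filename` field names are hypothetical, and the order of the 404 and 403 checks is a guess.

```python
import os


def delete_file(files, file_id, user_id, storage_dir):
    """Return an HTTP-style status code for a delete request."""
    record = files.get(file_id)
    if record is None:
        return 404                      # unknown file ID
    if record["userId"] != user_id:
        return 403                      # file belongs to another user
    try:
        os.remove(os.path.join(storage_dir, record["filename"]))
    except FileNotFoundError:
        # Physical file already gone: log it and still remove the record.
        print(f"warning: {record['filename']} missing on disk")
    del files[file_id]
    return 200
```

Tolerating a missing physical file keeps the database from accumulating records that can never be deleted.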
PATCH /media/:id/status
Update the processing status of a file.
Authentication: Required (Bearer JWT token)
Path Parameters:
id: File ID (MongoDB ObjectId)
Request Body:
{
"status": "processing",
"progress": 50,
"errorMessage": "Optional error message"
}
Example (curl):
curl -X PATCH http://localhost:3001/media/507f1f77bcf86cd799439011/status \
-H "Authorization: Bearer YOUR_JWT_TOKEN" \
-H "Content-Type: application/json" \
-d '{"status":"processing","progress":50}'
Success Response (200):
{
"message": "File status updated successfully",
"file": {
"id": "507f1f77bcf86cd799439011",
"status": "processing",
"progress": 50
}
}
Error Responses:
- 400 Bad Request: Invalid status value
- 401 Unauthorized: Missing or invalid JWT token
- 403 Forbidden: User does not own the file
- 404 Not Found: File not found
Valid Status Values:
- uploading: File is being uploaded
- ready: File is ready for processing
- processing: File is being transcribed
- completed: Transcription completed successfully
- error: An error occurred during processing
Behavior:
- Validates file ownership before updating
- Emits real-time WebSocket update to user
- Updates status, progress, and optional error message
POST /media/:id/transcribe
Initiate transcription for an audio/video file.
Authentication: Required (Bearer JWT token)
Path Parameters:
id: File ID (MongoDB ObjectId)
Example (curl):
curl -X POST http://localhost:3001/media/507f1f77bcf86cd799439011/transcribe \
-H "Authorization: Bearer YOUR_JWT_TOKEN"
Success Response (200):
{
"message": "Transcription started",
"file": {
"id": "507f1f77bcf86cd799439011",
"status": "processing"
}
}
Error Responses:
- 400 Bad Request: File is not an audio/video file
- 401 Unauthorized: Missing or invalid JWT token
- 403 Forbidden: User does not own the file
- 404 Not Found: File not found
- 500 Internal Server Error: File is already processing or completed, transcription service error
Behavior:
- Validates file ownership and type (audio/video only)
- Checks file status (rejects if already processing or completed)
- Updates status to processing immediately
- Sends file to Python transcription service asynchronously
- Emits real-time WebSocket updates during processing
- On success: stores transcribed text, updates status to completed
- On failure: updates status to error with error message
- Returns immediately (transcription happens in background)
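The pre-flight checks in the behavior list map onto the documented status codes roughly as follows. This is a sketch, not the controller's code: the record field names and the order of the checks are assumptions.

```python
MEDIA_PREFIXES = ("audio/", "video/")


def check_transcribe_request(record, user_id):
    """Return the documented status code and mark the file as processing on success."""
    if record is None:
        return 404                                  # file not found
    if record["userId"] != user_id:
        return 403                                  # not the owner
    if not record["mimetype"].startswith(MEDIA_PREFIXES):
        return 400                                  # not audio/video
    if record["status"] in ("processing", "completed"):
        return 500                                  # duplicate transcription
    record["status"] = "processing"                 # flipped before the job runs
    return 200
```

Setting the status before the background job starts is what makes the duplicate check effective when the button is clicked twice in quick succession.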
Supported MIME Types:
- audio/mpeg (MP3)
- audio/wav (WAV)
- audio/mp4 (M4A)
- video/mp4 (MP4)
- audio/x-m4a (M4A)
Transcription Service:
- Uses OpenAI Whisper (medium) model via HuggingFace
- Supports GPU acceleration when available
- Timeout: 5 minutes per file
- Automatic chunking for long audio (30-second chunks)
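As a rough picture of 30-second chunking, the helper below splits a recording's duration into consecutive windows. It assumes plain non-overlapping windows; the actual service delegates chunking to the HuggingFace pipeline, which may overlap adjacent chunks, and `chunk_spans` is a hypothetical name.

```python
def chunk_spans(duration_s, chunk_s=30.0):
    """Split [0, duration_s) into consecutive windows of at most chunk_s seconds."""
    spans = []
    start = 0.0
    while start < duration_s:
        end = min(start + chunk_s, duration_s)
        spans.append((start, end))
        start = end
    return spans
```

For a 75-second file this yields three windows, the last one shorter than the rest.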
GET /media/:id/transcription
Retrieve the transcription text and summary (if available) for a transcribed file.
Authentication: Required (Bearer JWT token)
Path Parameters:
id: File ID (MongoDB ObjectId)
Example (curl):
curl -X GET http://localhost:3001/media/507f1f77bcf86cd799439011/transcription \
-H "Authorization: Bearer YOUR_JWT_TOKEN"
Success Response (200) - Completed File with Summary:
{
"status": "completed",
"transcription": "This is the transcribed text from the audio file...",
"fileId": "507f1f77bcf86cd799439011",
"originalFilename": "audio.mp3",
"summaryText": "AI-generated summary of the transcription...",
"summaryStatus": "completed"
}
Success Response (200) - Completed File with Pending Summary:
{
"status": "completed",
"transcription": "This is the transcribed text from the audio file...",
"fileId": "507f1f77bcf86cd799439011",
"originalFilename": "audio.mp3",
"summaryStatus": "pending"
}
Success Response (200) - Completed File with Processing Summary:
{
"status": "completed",
"transcription": "This is the transcribed text from the audio file...",
"fileId": "507f1f77bcf86cd799439011",
"originalFilename": "audio.mp3",
"summaryStatus": "processing"
}
Success Response (200) - Completed File with Summary Error:
{
"status": "completed",
"transcription": "This is the transcribed text from the audio file...",
"fileId": "507f1f77bcf86cd799439011",
"originalFilename": "audio.mp3",
"summaryStatus": "error",
"summaryErrorMessage": "Summarization service unavailable"
}
Success Response (200) - Processing File:
{
"status": "processing",
"progress": 75,
"message": "Transcription is in progress"
}
Success Response (200) - Error File:
{
"status": "error",
"message": "Transcription failed: Service unavailable"
}
Success Response (200) - Ready/Uploading File:
{
"status": "ready",
"message": "Transcription has not been started yet"
}
Error Responses:
- 401 Unauthorized: Missing or invalid JWT token
- 403 Forbidden: User does not own the file
- 404 Not Found: File not found
- 500 Internal Server Error: Completed file has no transcription text (data inconsistency)
Behavior:
- Validates file ownership before returning transcription
- Returns different responses based on file status:
- completed: Returns full transcription text with file metadata and summary data (if available)
- processing: Returns progress percentage and status message
- error: Returns error message from failed transcription
- ready/uploading: Returns message indicating transcription hasn't started
- Summary fields included in response when transcription is completed:
- summaryText: The generated summary (only if summaryStatus is 'completed')
- summaryStatus: Current status of summarization (pending, processing, completed, error, or undefined)
- summaryErrorMessage: Error details if summarization failed
- Only the file owner can retrieve transcription and summary (enforces data privacy)
Use Cases:
- Display transcribed text and summary side-by-side in UI after completion
- Check transcription and summary status without fetching full file details
- Implement "View Transcription & Summary" feature in frontend (US-11)
- Poll for completion status (though WebSockets are preferred for real-time updates)
Best Practices:
- Use WebSocket events for real-time status updates instead of polling this endpoint
- This endpoint is ideal for retrieving transcription and summary after page reload or direct navigation
- Handle all status responses gracefully in UI (show spinner for processing, error message for errors, etc.)
- Display summary status indicators (pending, processing, completed, error) when showing transcription
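One way to handle every status response gracefully is to map each response shape to a small view description before rendering. The sketch below shows the branching in Python for brevity (the real frontend is React); the `view` labels and message strings are hypothetical.

```python
def describe_transcription(resp):
    """Map a GET /media/:id/transcription response to a simple view description."""
    status = resp.get("status")
    if status == "completed":
        summary_status = resp.get("summaryStatus")
        if summary_status == "completed":
            summary = resp.get("summaryText", "")
        elif summary_status == "error":
            summary = "Summary failed: " + resp.get("summaryErrorMessage", "unknown error")
        elif summary_status in ("pending", "processing"):
            summary = f"Summary {summary_status}"
        else:
            summary = "No summary yet"          # summaryStatus undefined
        return {"view": "text", "transcription": resp["transcription"], "summary": summary}
    if status == "processing":
        return {"view": "spinner", "message": f"{resp.get('progress', 0)}% done"}
    if status == "error":
        return {"view": "error", "message": resp.get("message", "Transcription failed")}
    # ready / uploading: nothing to show yet
    return {"view": "idle", "message": resp.get("message", "Transcription has not been started yet")}
```

Keeping this mapping in one place means a WebSocket update and a page-reload fetch can share the same rendering path.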
textifying-speaking/
├── ts-back/ # NestJS Backend
│ ├── src/
│ │ ├── auth/ # Authentication module
│ │ │ ├── dto/ # Data Transfer Objects
│ │ │ ├── guards/ # JWT AuthGuard
│ │ │ ├── strategies/ # JWT Strategy
│ │ │ ├── auth.controller.ts
│ │ │ ├── auth.service.ts
│ │ │ └── auth.module.ts
│ │ ├── users/ # Users module
│ │ │ ├── schemas/ # MongoDB schemas (User)
│ │ │ ├── users.service.ts
│ │ │ └── users.module.ts
│ │ ├── media/ # Media upload & transcription module
│ │ │ ├── schemas/ # MongoDB schemas (MediaFile)
│ │ │ ├── filters/ # Exception filters
│ │ │ ├── media.controller.ts
│ │ │ ├── media.service.ts
│ │ │ ├── media.gateway.ts # WebSocket gateway
│ │ │ └── media.module.ts
│ │ ├── app.module.ts # Main application module
│ │ └── main.ts # Application entry point
│ ├── test/ # E2E tests
│ ├── uploads/ # Uploaded files storage
│ ├── Dockerfile
│ └── package.json
├── ts-front/ # React Frontend
│ ├── src/
│ │ ├── components/
│ │ │ └── Navbar.jsx # Navigation with auth UI & upload/dashboard links
│ │ ├── hooks/
│ │ │ └── useFileStatus.js # WebSocket hook for real-time updates
│ │ ├── pages/
│ │ │ ├── Register.jsx # Registration page
│ │ │ ├── Login.jsx # Login page
│ │ │ ├── Upload.jsx # File upload page
│ │ │ ├── Dashboard.jsx # File management & transcription dashboard
│ │ │ └── HealthCheck.jsx
│ │ ├── store/
│ │ │ └── authStore.js # Zustand auth state
│ │ ├── App.jsx # Main app component & routes
│ │ └── main.jsx # Application entry point
│ ├── Dockerfile
│ └── package.json
├── ts-transcription/ # Python Transcription Service
│ ├── app.py # Flask application with Whisper model
│ ├── requirements.txt # Python dependencies
│ ├── Dockerfile # Container with PyTorch & Transformers
│ └── README.md # Service documentation
├── docker-compose.yml # Docker services configuration
├── Makefile # Development commands
└── README.md # This file
# Run all unit tests
make test-backend
# OR
docker exec ts-backend npm test
# Run E2E tests
make test-backend-e2e
# OR
docker exec ts-backend npm run test:e2e
# Run tests with coverage
make test-backend-cov
# OR
docker exec ts-backend npm run test:cov
# Run specific test suites
npm test -- media.service.spec.ts # MediaService unit tests
npm test -- summarization.processor.spec.ts # Summarization unit tests
npm test -- media-summarize.e2e-spec.ts # Summarization E2E tests
# Build frontend (validates JSX/imports)
cd ts-front && npm run build
# Run ESLint
cd ts-front && npm run lint
An automated test script validates the complete summarization feature:
# Run comprehensive US-09 tests
./test-us-09.sh
This script validates:
- ✅ Backend unit tests (MediaService + SummarizationProcessor)
- ✅ Frontend build (validates JSX/imports)
- ✅ Frontend linting (ESLint)
- ✅ Docker Compose configuration (service presence)
Test Results Summary:
- Backend: 43/43 tests passing
- MediaService: 24 tests
- SummarizationProcessor: 6 tests
- Other modules: 13 tests
- Frontend: Builds successfully with no errors
- Frontend: ESLint passes with no warnings
To test the complete summarization workflow with services running:
# 1. Start all services
make quickstart
# 2. Register a user
curl -X POST http://localhost:3001/auth/register \
-H 'Content-Type: application/json' \
-d '{"username":"testuser","email":"test@example.com","password":"password123"}'
# 3. Login to get token
TOKEN=$(curl -s -X POST http://localhost:3001/auth/login \
-H 'Content-Type: application/json' \
-d '{"email":"test@example.com","password":"password123"}' \
| jq -r '.accessToken')
# 4. Upload a media file
FILE_ID=$(curl -s -X POST http://localhost:3001/media/upload \
-H "Authorization: Bearer $TOKEN" \
-F "file=@/path/to/audio.mp3" \
| jq -r '.file.id')
# 5. Start transcription
curl -X POST http://localhost:3001/media/$FILE_ID/transcribe \
-H "Authorization: Bearer $TOKEN"
# 6. Wait for transcription to complete (check status via WebSocket or polling)
# 7. Start summarization
curl -X POST http://localhost:3001/media/$FILE_ID/summarize \
-H "Authorization: Bearer $TOKEN"
# 8. Check summary status
curl -X GET http://localhost:3001/media \
-H "Authorization: Bearer $TOKEN" \
| jq '.files[] | select(.id=="'$FILE_ID'") | {summaryStatus, summaryText}'
- Open http://localhost:5173 in browser
- Register/login as user
- Upload an audio/video file
- Click 'Transcribe' button
- Wait for transcription to complete (watch floating progress indicator)
- Click 'Summarize' button (purple button appears after transcription)
- Wait for summarization to complete (watch floating progress indicator)
- Click 'View Summary' button to see modal with summary and transcription
- Test copy-to-clipboard buttons in modal
- Verify real-time updates work (status badges update automatically)
US-10 testing is integrated into US-09 tests since US-10 represents the background real-time aspects already implemented in US-09.
Automated Tests:
# Run backend unit tests (includes SummarizationProcessor tests)
make test-backend
# Run all E2E tests
make test-backend-e2e
Test Coverage:
- ✅ BullMQ job queue integration (SummarizationProcessor)
- ✅ Background job processing with concurrency control
- ✅ Automatic retry mechanism with exponential backoff
- ✅ WebSocket real-time status broadcasts
- ✅ Error handling and recovery
- ✅ Multiple simultaneous job processing
- ✅ Job completion and failure scenarios
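The retry behavior covered by these tests (up to 3 attempts with exponential backoff) can be pictured with the sketch below. The real mechanism is BullMQ's job options in the NestJS backend; the 1000 ms base delay and both function names are assumptions for illustration.

```python
import time


def retry_delays(max_attempts=3, base_delay_ms=1000):
    """Delay before each retry: the base delay doubled for every earlier retry."""
    return [base_delay_ms * 2 ** i for i in range(max_attempts - 1)]


def run_with_retries(job, max_attempts=3, base_delay_ms=1000, sleep=time.sleep):
    """Call job() until it succeeds or max_attempts is exhausted."""
    delays = retry_delays(max_attempts, base_delay_ms)
    for attempt in range(1, max_attempts + 1):
        try:
            return job()
        except Exception:
            if attempt == max_attempts:
                raise                      # final failure: surface the error
            sleep(delays[attempt - 1] / 1000)
```

With three attempts there are at most two waits, and a job that still fails on the last attempt propagates its error, which corresponds to the "failed jobs retained for debugging" case.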
Manual Real-Time Testing:
- Start all services: make quickstart
- Open browser at http://localhost:5173
- Register/login as user
- Upload multiple audio/video files
- Start transcription on multiple files simultaneously
- Observe floating progress indicator showing all processing files
- Navigate to different pages - progress continues in background
- After transcriptions complete, start summarization on multiple files
- Observe:
- Purple gradient progress indicator for summarization
- Real-time toast notifications (started, completed, failed)
- Dashboard status badges update automatically
- User can continue browsing while processing occurs
- Verify summarization completes successfully
- Click "View Summary" to see generated summary
- Test error handling by stopping summarization service:
docker stop ts-summarization
# Try to summarize a file - should show error toast
docker start ts-summarization
WebSocket Event Testing:
# Monitor WebSocket events in browser console:
# 1. Open browser DevTools → Console
# 2. Look for "Summary status update received:" messages
# 3. Verify events contain: fileId, summaryStatus, summaryText, summaryErrorMessage
# Manual WebSocket connection test:
TOKEN=$(curl -s -X POST http://localhost:3001/auth/login \
-H 'Content-Type: application/json' \
-d '{"email":"test@example.com","password":"password123"}' \
| jq -r '.accessToken')
# Connect via Socket.IO client and listen to 'summaryStatusUpdate' events
Background Processing Verification:
# Check Redis for queued jobs
make redis-cli
> KEYS *summarization*
> LLEN bull:summarization:wait
> LLEN bull:summarization:active
> ZCARD bull:summarization:completed
> ZCARD bull:summarization:failed
> exit
# Check summarization service logs
make logs-summarization
# Check backend processor logs
make logs-backend | grep SummarizationProcessor
MONGODB_URI=mongodb://mongodb:27017/textifying-speaking
JWT_SECRET=your-super-secret-jwt-key-change-in-production
PORT=3001
MEDIA_STORAGE_PATH=./uploads
REDIS_HOST=redis
REDIS_PORT=6379
TRANSCRIPTION_SERVICE_URL=http://transcription:5000
SUMMARIZATION_SERVICE_URL=http://summarization:5001
Transcription service:
PORT=5000
Summarization service:
PORT=5001
Environment variables are configured in docker-compose.yml for containerized deployments.
- ✅ Password hashing with bcrypt (salt rounds: 10)
- ✅ JWT-based authentication with 1-hour token expiration
- ✅ Input validation (client-side and server-side)
- ✅ Email uniqueness enforcement
- ✅ Username uniqueness enforcement
- ✅ File ownership validation (users can only delete their own files)
- ✅ Status update ownership validation (users can only update their own files)
- ✅ Protected routes (authentication required for sensitive operations)
- ✅ JWT-secured WebSocket connections (authentication required for real-time updates)
- ✅ User-scoped WebSocket broadcasts (users only receive updates for their own files)
- ✅ File ownership validation for transcription (users can only transcribe their own files)
- ✅ File type validation for transcription (audio/video only)
- ✅ File ownership validation for summarization (users can only summarize their own files)
- ✅ Transcription completion validation for summarization (prevents summarization of unfinished transcriptions)
- ✅ CORS enabled for frontend communication
- ✅ MongoDB connection security
- ✅ Secure credential verification (constant-time comparison via bcrypt)
- 🔜 Rate limiting (future enhancement)
- 🔜 Email verification (future enhancement)
- 🔜 Refresh tokens (future enhancement)
- Frontend uses JavaScript (JSX), not TypeScript
- Backend uses TypeScript with strict validation
- MongoDB uses Mongoose ODM for schema definition
- All passwords are hashed before storage
- Docker Compose manages service orchestration
- Hot reload enabled for development mode
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.