KP-OCR is a production-ready document intelligence platform designed with cloud constraints, cost efficiency and reliability in mind.
https://kp-doc-intelligence-gfdabpaqg6hbcybz.canadacentral-01.azurewebsites.net/
Note: Some features require authentication.
- CNN-based document classification
- OCR-driven text extraction
- Secure PDF invoice generation
- Subscription-based billing
- GST-compliant invoices
- Role-based access control
- Cloud-native deployment (Azure App Service)
Frontend
- Jinja2 templates
- Bootstrap UI
Backend
- Flask (Python)
- MongoDB (Atlas)
- ONNX Runtime for inference
Document Processing
- CNN-based classifier
- OCR extraction pipeline
- ReportLab-based PDF generation
PDF Generation
- ReportLab (pure Python, no native deps)
Deployment
- Azure App Service (Linux)
- Gunicorn WSGI server
- Replaced TensorFlow with ONNX Runtime in production to significantly reduce memory footprint and cold-start latency.
- Selected ReportLab over HTML-to-PDF tools to avoid native Cairo dependencies that often fail on managed cloud platforms.
- Deferred digital PDF signing (pyHanko) intentionally due to crypto and OpenSSL dependency instability on low-tier instances.
- Designed invoice generation to be fully in-memory, eliminating filesystem dependencies and improving security.
- All billing routes are authenticated
- Invoice access is user-scoped and validated server-side
- No sensitive keys or credentials are stored in the repository
- PDF generation is stateless and ephemeral
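
As a rough sketch of what user-scoped, server-side invoice validation can look like, the example below checks ownership before returning an invoice. Everything in it is hypothetical: the `Invoice` dataclass, the `INVOICES` dictionary (standing in for a database collection), and the `get_invoice_for_user` helper are illustrative names, not the project's real code.

```python
from dataclasses import dataclass


class InvoiceAccessError(Exception):
    """Raised when an invoice is missing or belongs to another user."""


@dataclass(frozen=True)
class Invoice:
    invoice_id: str
    owner_id: str
    total: str


# Hypothetical in-memory store standing in for the real database.
INVOICES = {
    "inv-1": Invoice("inv-1", "user-a", "118.00"),
    "inv-2": Invoice("inv-2", "user-b", "59.00"),
}


def get_invoice_for_user(invoice_id: str, current_user_id: str) -> Invoice:
    """Return the invoice only if it belongs to the authenticated user."""
    invoice = INVOICES.get(invoice_id)
    # Use the same error for "missing" and "not yours" so that callers
    # cannot probe which invoice IDs exist.
    if invoice is None or invoice.owner_id != current_user_id:
        raise InvoiceAccessError("invoice not found")
    return invoice


if __name__ == "__main__":
    print(get_invoice_for_user("inv-1", "user-a"))
    try:
        get_invoice_for_user("inv-2", "user-a")
    except InvoiceAccessError as exc:
        print("denied:", exc)
```

The key point is that the user ID comes from the authenticated session on the server, never from a request parameter supplied by the client.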
- HTML/CSS-based PDF rendering was avoided; this trades pixel-perfect visual styling for reliability.
- Free-tier cloud constraints influenced dependency choices and led to certain enterprise features being deferred.
- The current OCR pipeline prioritizes accuracy and validation over raw throughput.
These trade-offs were made intentionally to ensure stability, cost control and predictable deployments.
The following screenshots demonstrate key user flows from document upload to billing and invoice generation.
Actively developed and deployed in a production environment.
Planned enhancements include:
- Digital PDF signatures
- Enterprise compliance workflows
- Advanced analytics and audit logging






