Structura AI is a web-based image analysis tool that generates a structural mask by combining monocular depth estimation and texture-based feature detection. The system integrates MiDaS (DPT-Hybrid) for depth prediction with OpenCV-based edge, corner, and texture fusion to produce a detailed, high-frequency structural representation of any input image.
This mask is useful for segmentation workflows, diffusion model conditioning, preprocessing for generative AI, architectural analysis, and general computer vision pipelines.
- MiDaS Depth Estimation: High-quality depth prediction using the DPT-Hybrid model.
- Combined Structural Masking: Fusion of depth gradients, edges (Canny), Laplacian detail, and Harris corner responses.
- FastAPI Backend: Clean API endpoint (POST /mask/) returning PNG masks with inference time metadata.
- Web UI: A simple and modern interface built with HTML, CSS, and JavaScript allowing image upload, preview, overlay visualization, and mask download.
- Docker Support: Fully dockerized and ready for deployment on services such as Hugging Face Spaces.
Live Demo on Hugging Face Spaces: Structura AI Demo
This section is reserved for demo images and demo videos. Add examples such as:
- Original input image:

- Generated structural mask

- Overlay visualization

- A short screen recording demonstrating the UI https://github.com/user-attachments/assets/28e5fdc0-cac2-4c31-a34c-36fff82f1851
- Backend: FastAPI, Uvicorn
- AI / Computer Vision: PyTorch, MiDaS, OpenCV (headless), NumPy
- Frontend: HTML, CSS, JavaScript
- Deployment: Docker, Hugging Face Spaces
structura-ai/
├── app.py
├── depth_texture_mask.py
├── requirements.txt
├── Dockerfile
├── templates/
│   └── index.html
└── static/
    ├── styles.css
    └── app.js
git clone https://github.com/PritamTheCoder/midas-depth-texture-mask-api.git
cd midas-depth-texture-mask-api
python -m venv venv
# Windows
venv\Scripts\activate
# macOS / Linux
source venv/bin/activate
pip install -r requirements.txt
uvicorn app:app --reload --host 0.0.0.0 --port 8000
Open the browser and navigate to: http://127.0.0.1:8000/
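With the server running locally, the POST /mask/ endpoint can also be called from a script instead of the web UI. The sketch below uses only the Python standard library. The multipart field name "file" is an assumption (check the upload parameter in app.py), and because this README does not say where the inference time metadata is reported, the helper simply returns all response headers for inspection.

```python
import os
import urllib.request
import uuid


def build_mask_request(image_bytes, filename="input.png",
                       url="http://127.0.0.1:8000/mask/", field="file"):
    """Build a multipart/form-data POST request carrying one image.

    The field name "file" is a guess, not confirmed by the project.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; '
        f'filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return urllib.request.Request(
        url,
        data=head + image_bytes + tail,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


def fetch_mask(image_path, out_path="mask.png"):
    """Send an image to the running server and save the returned PNG."""
    with open(image_path, "rb") as f:
        req = build_mask_request(f.read(), os.path.basename(image_path))
    with urllib.request.urlopen(req) as resp:
        headers = dict(resp.headers)
        with open(out_path, "wb") as out:
            out.write(resp.read())
    return headers  # inspect these for any timing metadata
```

For example, `fetch_mask("photo.jpg")` would write the mask to mask.png in the current directory, provided the server is running and accepts the assumed field name.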
The MiDaS model weights will automatically download on the first startup.
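For readers curious what that first-run download involves, here is a minimal sketch of loading DPT-Hybrid through PyTorch Hub, which fetches and caches the weights on first use. Whether this project loads MiDaS through torch.hub is an assumption, and the function names are hypothetical; a real server would also load the model once at startup rather than on every call.

```python
import numpy as np


def estimate_depth(rgb):
    """Run MiDaS DPT-Hybrid on an HxWx3 uint8 RGB array.

    Returns an HxW float32 map of relative inverse depth.
    """
    import torch  # imported lazily so depth_to_uint8 works without PyTorch

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    # The first call downloads the repo and weights into the torch hub cache.
    model = torch.hub.load("intel-isl/MiDaS", "DPT_Hybrid").to(device).eval()
    transform = torch.hub.load("intel-isl/MiDaS", "transforms").dpt_transform

    batch = transform(rgb).to(device)
    with torch.no_grad():
        pred = model(batch)
        # Resize the prediction back to the input resolution.
        pred = torch.nn.functional.interpolate(
            pred.unsqueeze(1),
            size=rgb.shape[:2],
            mode="bicubic",
            align_corners=False,
        ).squeeze()
    return pred.cpu().numpy().astype(np.float32)


def depth_to_uint8(depth):
    """Scale a depth map to 0..255 for saving or display."""
    depth = np.asarray(depth, dtype=np.float32)
    span = float(depth.max() - depth.min())
    if span < 1e-8:
        return np.zeros(depth.shape, dtype=np.uint8)
    return ((depth - depth.min()) / span * 255.0).astype(np.uint8)
```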
The project includes a Dockerfile configured for seamless deployment on Hugging Face Spaces.
Ensure the following files are present at the repository root:
Dockerfile
app.py
depth_texture_mask.py
requirements.txt
templates/
static/
- Create a new Hugging Face Space.
- Select "Docker" as the runtime.
- Upload or commit all repository files.
- Hugging Face will automatically:
  - Build the Docker image
  - Install dependencies
  - Expose the correct port
  - Start the FastAPI server
The application will then be available as a hosted interactive web demo.
This project is licensed under the MIT License. See the LICENSE file for details.
- MiDaS by Intel-ISL
- FastAPI for backend framework
- OpenCV for feature detection and image processing
- PyTorch for deep learning inference support
- Hugging Face for deployment infrastructure