VisionaryQA is a Visual Question Answering application that lets users upload an image, pose natural-language questions about it, and receive context-aware answers driven by vision-language models like BLIP-2.
- Upload & Ask: Drag & drop or browse an image, then type any question.
- Model Backend: FastAPI paired with Hugging Face `transformers`, using BLIP-2 (or an alternative vision-language model).
- Interactive UI: Modern, responsive interface built with Streamlit (or React + Tailwind).
- Aesthetic Touches: Embedded images, styling, and live feedback animations.
- Containerized: Dockerfile for seamless deployment.
```
visual-vqa/
├── app/
│   ├── main.py              # FastAPI backend
│   ├── vqa.py               # Inference logic (BLIP-2 pipeline)
│   ├── utils.py             # Image preprocessing
│   └── models/
│       └── blip2_model.py   # Model loading wrapper
└── frontend/
    ├── streamlit_app.py     # Streamlit UI
    └── assets/
        └── logo.png         # Application logo
```
```bash
git clone https://github.com/your-username/visual-vqa.git
cd visual-vqa
pip install -r requirements.txt
```

Start the FastAPI backend:

```bash
uvicorn app.main:app --reload
```

In a new terminal, launch the Streamlit frontend:

```bash
streamlit run frontend/streamlit_app.py
```

The UI provides:

- Image Upload
- Question Input
- Answer Display with typing animation
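You can also query the API directly, without the UI. Here is a minimal client sketch using `requests` (the `sample.jpg` filename and the question are placeholders):

```python
import requests

# Placeholder image path; any local JPEG/PNG works.
with open("sample.jpg", "rb") as f:
    res = requests.post(
        "http://localhost:8000/vqa/",
        # Send filename, file object, and MIME type as a multipart upload.
        files={"image": ("sample.jpg", f, "image/jpeg")},
        data={"question": "What is in this picture?"},
    )
print(res.json()["answer"])
```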
app/main.py:

```python
from fastapi import FastAPI, File, Form, HTTPException, UploadFile

from app.vqa import answer_question

app = FastAPI()


@app.post("/vqa/")
async def vqa_api(image: UploadFile = File(...), question: str = Form("")):
    # `question` is declared as a Form field so it is read from the multipart
    # form data; an empty question is rejected with 400 (as the tests expect).
    if not question:
        raise HTTPException(status_code=400, detail="A question is required")
    content = await image.read()
    answer = answer_question(content, question)
    return {"answer": answer}
```

app/vqa.py:

```python
from transformers import Blip2Processor, Blip2ForConditionalGeneration
from PIL import Image
import io
processor = Blip2Processor.from_pretrained("Salesforce/blip2-opt-2.7b")
model = Blip2ForConditionalGeneration.from_pretrained("Salesforce/blip2-opt-2.7b")
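# Tip: the 2.7B checkpoint needs several GB of memory. On a GPU you could
# instead pass torch_dtype=torch.float16 and device_map="auto" to
# from_pretrained (this requires the `accelerate` package).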

def answer_question(image_bytes: bytes, question: str) -> str:
    image = Image.open(io.BytesIO(image_bytes)).convert("RGB")
    inputs = processor(images=image, text=question, return_tensors="pt")
    outputs = model.generate(**inputs)
    return processor.decode(outputs[0], skip_special_tokens=True)
```

frontend/streamlit_app.py:

```python
import streamlit as st
import requests
st.title("🖼️ Visual Q&A")
image = st.file_uploader("Upload an image", type=["png","jpg","jpeg"])
question = st.text_input("Ask a question about the image...")
if st.button("Get Answer") and image and question:
    # Send filename, bytes, and MIME type so FastAPI parses the upload correctly.
    files = {"image": (image.name, image.getvalue(), image.type)}
    data = {"question": question}
    res = requests.post("http://localhost:8000/vqa/", files=files, data=data)
    st.success(res.json().get("answer"))
```

Dockerfile:

```dockerfile
FROM python:3.10-slim
WORKDIR /app
COPY . .
RUN pip install --no-cache-dir -r requirements.txt
EXPOSE 8501 8000
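# Run both services from one container: uvicorn (FastAPI) on port 8000 and
# Streamlit on 8501; one container per service would also work.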
CMD ["/bin/sh", "-c", "uvicorn app.main:app --host 0.0.0.0 & streamlit run frontend/streamlit_app.py --server.port 8501"]
```

Assets:

- logo.png: Application logo
- screenshot.png: UI mockup
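To try it out, you might build the image with `docker build -t visual-vqa .` (the `visual-vqa` tag is just an example) and start it with `docker run -p 8000:8000 -p 8501:8501 visual-vqa`.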
This project uses GitHub Actions to automate testing, linting, and deployment.
... (existing CI/CD content) ...
We use pytest to validate the core functionality:
```
visual-vqa/
└── tests/
    ├── test_vqa_api.py
    └── test_answer_question.py
```
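They can be run locally with `pytest tests/`; CI runs the same suite on every push.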
tests/test_vqa_api.py:

```python
import io

import pytest
from fastapi.testclient import TestClient
from PIL import Image

from app.main import app

client = TestClient(app)

def test_vqa_endpoint_no_question():
    # Uploading an image without a question should return 400.
    response = client.post(
        "/vqa/",
        files={"image": ("test.png", b"fakebytes", "image/png")},
        data={"question": ""},
    )
    assert response.status_code == 400

@pytest.mark.parametrize("question,expected_status", [
    ("What is this?", 200),
])
def test_vqa_endpoint_valid(question, expected_status):
    # Use a real in-memory PNG; arbitrary fake bytes would fail PIL decoding.
    buf = io.BytesIO()
    Image.new("RGB", (64, 64), color=(155, 0, 0)).save(buf, format="PNG")
    response = client.post(
        "/vqa/",
        files={"image": ("test.png", buf.getvalue(), "image/png")},
        data={"question": question},
    )
    assert response.status_code == expected_status
    json_data = response.json()
    assert "answer" in json_data
```

tests/test_answer_question.py:

```python
from app.vqa import answer_question
from PIL import Image

def create_test_image(path):
    # Create a simple solid-red RGB image.
    img = Image.new("RGB", (64, 64), color=(155, 0, 0))
    img.save(path, format="PNG")


def test_answer_question_runs_without_error(tmp_path):
    img_path = tmp_path / "test.png"
    create_test_image(img_path)
    # answer_question expects raw bytes, so read the saved file back in.
    answer = answer_question(img_path.read_bytes(), "Is it red?")
    # Ensure it returns a string.
    assert isinstance(answer, str)
```

These tests run in CI, catching regressions in the API and inference logic.
Implementing CI/CD brings multiple advantages to the Visual VQA project:
- Automated Quality Assurance: Every code change triggers automated tests and linting, catching bugs and style issues early.
- Faster Feedback Loop: Developers receive immediate feedback on code quality and functionality before merging.
- Consistent Builds: Ensures that the application builds and runs correctly across different environments.
- Easy Rollbacks & Deployments: Automated deployment pipelines can quickly roll out new features or revert problematic releases.
- Improved Collaboration: Contributors can focus on feature development, trusting that CI/CD enforces standards.
- Continuous Integration (CI): Automatically builds and tests every change pushed to the repository. It includes:
  - Testing: Runs unit and integration tests (via `pytest`) to confirm that new changes don't break existing functionality.
  - Linting: Uses tools like `flake8` to enforce code style and catch syntax errors or anti-patterns.
- Continuous Deployment (CD): Automates packaging and releasing the application. It includes:
  - Building: Creates artifacts such as Docker images in a reproducible manner.
  - Publishing: Pushes the Docker image to a registry (e.g., Docker Hub) or deploys to hosting platforms.
By integrating CI/CD, the Visual VQA system remains stable, maintainable, and ready for rapid iteration.
Feel free to open issues or pull requests.
[Your Name] • [your@email.com]