🧠 Multi-Modal Crime Scene Interpreter

This project is a multi-modal system that interprets crime-related images and textual observations to produce contextual insights, similarity-based reasoning, and a narrated video summarizing the incident. It combines image captioning (BLIP-2), semantic similarity (SBERT), text-to-speech (gTTS), and video generation (MoviePy).

📁 Project Structure

├── main.py                  # Main script to run the interpreter
├── Facts.csv                # CSV file containing factual statements and reasoning
├── crime_story.mp4          # Output video narration (generated)
├── narration.mp3            # Audio file generated from narrative
└── README.md                # This file

✅ Prerequisites

📦 Python Libraries

Install the following Python packages:

pip install transformers sentence-transformers pandas torch gtts moviepy pillow

🧠 Models Required

Salesforce/blip2-opt-2.7b via Hugging Face Transformers
all-MiniLM-L6-v2 via Sentence Transformers

These will be downloaded automatically the first time you run the script.

🧾 CSV File

You must provide a Facts.csv file with the following columns:

fact: factual observation
reasoning: reasoning behind the fact
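
A quick check that Facts.csv has the expected columns can prevent a confusing failure later in the run. The sketch below is only an illustration: it uses the standard-library csv module (the project installs pandas, and pandas.read_csv would work equally well), and the helper name load_facts is hypothetical, not part of the project.

```python
import csv

REQUIRED_COLUMNS = {"fact", "reasoning"}  # column names listed in this README


def load_facts(path="Facts.csv"):
    """Read the facts file and return a list of {'fact', 'reasoning'} dicts.

    Hypothetical helper: raises ValueError if a required column is missing.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
        if missing:
            raise ValueError(f"{path} is missing columns: {sorted(missing)}")
        # Keep only rows that actually contain a fact.
        return [
            {
                "fact": row["fact"].strip(),
                "reasoning": (row["reasoning"] or "").strip(),
            }
            for row in reader
            if row["fact"] and row["fact"].strip()
        ]
```

Calling load_facts() with no arguments reads Facts.csv from the current directory.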

📸 Images

Ensure images are in .jpg, .jpeg, or .png formats.

🧰 Dependencies

📍 ImageMagick

Download and install ImageMagick, then make sure the executable path in the script matches your installation:

mpy_config.change_settings({"IMAGEMAGICK_BINARY": r"C:\Program Files\ImageMagick-7.1.1-Q16-HDRI\magick.exe"})
os.environ["IMAGEMAGICK_PATH"] = r"C:\Program Files\ImageMagick-7.1.1-Q16-HDRI\magick.exe"

Adjust both paths to match your ImageMagick version and install location.
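
If you would rather not hard-code the path, one option is to look for the magick executable on PATH first and only then fall back to a fixed location. This is a minimal sketch under that assumption; find_imagemagick and DEFAULT_MAGICK are hypothetical names, and the fallback value is simply the example path from this README.

```python
import os
import shutil

# Fallback taken from the example path in this README; adjust for your system.
DEFAULT_MAGICK = r"C:\Program Files\ImageMagick-7.1.1-Q16-HDRI\magick.exe"


def find_imagemagick(fallback=DEFAULT_MAGICK):
    """Return a usable path to the ImageMagick executable, or None.

    Hypothetical helper: prefers a `magick` binary on PATH, then the fallback.
    """
    on_path = shutil.which("magick")
    if on_path:
        return on_path
    if fallback and os.path.isfile(fallback):
        return fallback
    return None


if __name__ == "__main__":
    print(find_imagemagick() or "ImageMagick not found; install it or set the path manually")
```

The returned value could then be supplied as the IMAGEMAGICK_BINARY setting in place of the literal string.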

🔤 Fonts

Ensure arial.ttf or any .ttf font file is accessible for PIL. Modify this line if needed:

font = ImageFont.truetype("arial.ttf", 40)
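
On systems without arial.ttf, it may help to search the usual font directories for a substitute. The sketch below is an illustration using only the standard library; find_ttf and FONT_DIRS are hypothetical names, and the directory list covers common defaults rather than every possible install.

```python
from pathlib import Path

# Common system font directories (Windows, macOS, Linux).
FONT_DIRS = [
    Path(r"C:\Windows\Fonts"),
    Path("/Library/Fonts"),
    Path("/System/Library/Fonts"),
    Path("/usr/share/fonts"),
    Path.home() / ".fonts",
]


def find_ttf(preferred="arial.ttf", font_dirs=FONT_DIRS):
    """Return the path of a .ttf font as a string, or None if none is found.

    Hypothetical helper: looks for `preferred` first (case-insensitive),
    then settles for any other .ttf file in the given directories.
    """
    fallback = None
    for directory in font_dirs:
        directory = Path(directory)
        if not directory.is_dir():
            continue
        for candidate in sorted(directory.rglob("*")):
            if candidate.suffix.lower() != ".ttf" or not candidate.is_file():
                continue
            if candidate.name.lower() == preferred.lower():
                return str(candidate)
            fallback = fallback or str(candidate)
    return fallback
```

The result can be passed to ImageFont.truetype as its first argument, for example ImageFont.truetype(find_ttf() or "arial.ttf", 40).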

How to Run

Run the script:

python main.py

Enter image paths and/or text observations, separated by commas. Example:

image1.jpg, image2.jpg, There was blood on the floor
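
One plausible way to separate such a line into images and text is to check each entry's file extension against the supported image formats. This is a sketch of that idea, not the project's actual parser; split_inputs and IMAGE_EXTENSIONS are hypothetical names.

```python
import os

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}  # formats listed under Images


def split_inputs(raw):
    """Split a comma-separated line into (image_paths, text_observations)."""
    images, texts = [], []
    for item in raw.split(","):
        item = item.strip()
        if not item:
            continue  # ignore empty entries such as a trailing comma
        extension = os.path.splitext(item)[1].lower()
        if extension in IMAGE_EXTENSIONS:
            images.append(item)
        else:
            texts.append(item)
    return images, texts


images, texts = split_inputs("image1.jpg, image2.jpg, There was blood on the floor")
print(images)  # ['image1.jpg', 'image2.jpg']
print(texts)   # ['There was blood on the floor']
```

Note that with this approach an observation containing a comma is split into several separate observations.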

The system will:
- Caption images
- Find the most similar fact and reasoning
- Print the analysis
- Generate a crime story narrative
- Produce a narrated video: crime_story.mp4
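
The "most similar fact" step can be pictured as a nearest-neighbour search over sentence embeddings. The sketch below illustrates that idea with plain cosine similarity and is not taken from the project's code: most_similar_fact is a hypothetical helper, and the embed argument is assumed to be any function that maps a list of strings to a list of vectors, such as the encode method of a SentenceTransformer("all-MiniLM-L6-v2") instance.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(float(x) * float(y) for x, y in zip(a, b))
    norm_a = math.sqrt(sum(float(x) ** 2 for x in a))
    norm_b = math.sqrt(sum(float(y) ** 2 for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar_fact(query, facts, embed):
    """Return (best_row, score) for the fact closest to `query`.

    `facts` is a list of dicts with 'fact' and 'reasoning' keys.
    `embed` maps a list of strings to a list of vectors (assumed interface).
    """
    if not facts:
        raise ValueError("facts must not be empty")
    # Embed the query together with every fact in a single call.
    vectors = embed([query] + [row["fact"] for row in facts])
    query_vec, fact_vecs = vectors[0], vectors[1:]
    scores = [cosine(query_vec, vec) for vec in fact_vecs]
    best = max(range(len(facts)), key=scores.__getitem__)
    return facts[best], scores[best]
```

The query here would be either an image caption or a text observation; the returned row carries the matching reasoning alongside the fact.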

Output

crime_story.mp4: a narrated video built from the combined text and image reasoning
narration.mp3: audio narration of the crime story
Console output for each step

🧠 Authors

Rohan Raghav – Full Stack Developer & Machine Learning Enthusiast

Feel free to modify this project to suit forensic, educational, or AI storytelling needs!
