GIF and image moderation using AWS Rekognition or local HuggingFace models.
PyFrame uses temporal segmentation to optimize moderation. Instead of processing every frame, it divides the animation into equal time windows ("buckets") and calculates the inter-frame difference for each frame. It then applies motion-based keyframe selection to extract the single most significant frame from each bucket, guaranteeing diverse scene coverage and capturing peak motion events across the entire GIF. PyFrame supports both AWS Rekognition and local HuggingFace models for classification.
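The selection step is easy to picture. Below is a minimal sketch of bucket-based keyframe selection using PIL and NumPy; `select_keyframes` is a hypothetical helper for illustration, not PyFrame's actual implementation in `frame_processor.py`:

```python
import numpy as np
from PIL import Image, ImageSequence

def select_keyframes(gif_path: str, max_frames: int = 10) -> list[int]:
    """Return one peak-motion frame index per equal-sized time bucket."""
    with Image.open(gif_path) as gif:
        frames = [np.asarray(f.convert("L"), dtype=np.float32)
                  for f in ImageSequence.Iterator(gif)]

    # Inter-frame difference: mean absolute pixel change vs. the previous frame.
    diffs = [0.0] + [float(np.abs(b - a).mean()) for a, b in zip(frames, frames[1:])]

    # Split the timeline into equal buckets and keep the single most
    # significant (highest-motion) frame from each one.
    buckets = np.array_split(np.arange(len(frames)), min(max_frames, len(frames)))
    return [int(idx[np.argmax([diffs[i] for i in idx])]) for idx in buckets]

print(select_keyframes("content/gifs/your-gif.gif", max_frames=10))
```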
AWS Rekognition charges $1.00 per 1,000 images processed. A typical 5-second GIF (150 frames at 30 FPS) costs $0.15 to moderate when processing every frame, making comprehensive moderation expensive at scale.
PyFrame analyzes the same 150-frame GIF using just 10 intelligently selected frames, reducing the cost to $0.01 per GIF (a 93% savings) while maintaining detection accuracy. Alternatively, run the same frame extraction with a local HuggingFace model for zero cost (less accurate than AWS), optionally combined with Rekognition in a two-pass approach.
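The arithmetic, spelled out using only the numbers above:

```python
PRICE_PER_IMAGE = 1.00 / 1000        # Rekognition: $1.00 per 1,000 images
all_frames = 150 * PRICE_PER_IMAGE   # $0.15 for a 5-second, 30 FPS GIF
pyframe = 10 * PRICE_PER_IMAGE       # $0.01 with 10 selected frames
print(f"savings: {1 - pyframe / all_frames:.0%}")  # savings: 93%
```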
- Install dependencies:
```bash
pip install -r requirements.txt
```

- Install AWS CLI (if not already installed):

```bash
brew install awscli
```

- Configure AWS credentials:

```bash
aws configure
```

Extract key frames from a GIF and moderate them:
```python
from lib.aws.pipe import Pipe

pipe = Pipe("content/gifs/your-gif.gif", max_frames=10, min_confidence=80.0)
results = pipe.run()
```

Parameters:

- `max_frames` - Number of frames to extract (default: 10)
- `min_confidence` - Minimum detection confidence (default: 80.0)
- `use_merged` - Merge frames before moderating (default: False)
- `frames_per_batch` - Frames per merged image (default: 2)
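For example, the documented `use_merged` and `frames_per_batch` options can tile multiple frames into a single image before moderation, which should reduce the number of billed Rekognition calls (with 10 frames and 2 per batch, presumably 5 calls instead of 10):

```python
from lib.aws.pipe import Pipe

# Tile pairs of extracted frames into single images before sending them
# to Rekognition (use_merged / frames_per_batch as documented above).
pipe = Pipe(
    "content/gifs/your-gif.gif",
    max_frames=10,
    min_confidence=80.0,
    use_merged=True,
    frames_per_batch=2,
)
results = pipe.run()
```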
Bring your own model from HuggingFace instead of using AWS. Runs entirely locally; no API keys or AWS config needed. Defaults to `AdamCodd/vit-base-nsfw-detector`, but you can pass any HuggingFace image-classification model. Not as accurate as AWS Rekognition, but it works well as a free alternative, or use both together for a two-pass approach (sketched below).
```python
from lib.local.local_pipe import LocalPipe

# default model
pipe = LocalPipe("content/gifs/your-gif.gif", max_frames=10)
results = pipe.run()

# custom model
pipe = LocalPipe("content/gifs/your-gif.gif", max_frames=10, model="Falconsai/nsfw_image_detection")
results = pipe.run()
```

Parameters:

- `max_frames` - Number of frames to extract (default: 5)
- `model` - HuggingFace model ID (default: `AdamCodd/vit-base-nsfw-detector`)
- `use_merged` - Merge frames before classifying (default: False)
- `frames_per_batch` - Frames per merged image (default: 2)
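The two-pass approach mentioned above can be as simple as screening every GIF with the free local model and only paying for Rekognition on flagged content. A sketch, assuming `run()` returns per-frame label/score dicts; the actual return shape is not documented here, so adapt the hypothetical `flagged` check to what your pipes return:

```python
from lib.local.local_pipe import LocalPipe
from lib.aws.pipe import Pipe

GIF = "content/gifs/your-gif.gif"

# Pass 1: free local screening with the default HuggingFace model.
local_results = LocalPipe(GIF, max_frames=10).run()

def flagged(results) -> bool:
    # Hypothetical check: assumes per-frame dicts with "label" and "score".
    return any(r.get("label") == "nsfw" and r.get("score", 0) > 0.5 for r in results)

# Pass 2: only spend on AWS Rekognition when the local pass flags content.
if flagged(local_results):
    aws_results = Pipe(GIF, max_frames=10, min_confidence=80.0).run()
```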
Requires transformers and torch:

```bash
pip install transformers torch
```

Activate the virtual environment and run the pipeline:

```bash
source .venv/bin/activate && python main.py
```

Project structure:

- `content/` - All input/output files
- `lib/` - Core functionality
  - `aws/` - AWS Rekognition pipeline
    - `pipe.py` - Rekognition pipe
    - `rekognition_moderator.py` - Rekognition API wrapper
  - `local/` - Local HuggingFace pipeline
    - `local_pipe.py` - Local pipe
    - `local_detector.py` - HuggingFace model wrapper
  - `frame_processor.py` - Frame extraction
  - `image_utils.py` - Shared image helpers
  - `video_converter.py` - Video to GIF conversion
| Method | Frames Analyzed per GIF | Cost per GIF | GIFs Moderated per $10 | Cost Savings |
|---|---|---|---|---|
| Standard Method (All Frames) | 150 frames | $0.15 | 66 GIFs | Baseline |
| PyFrame (10 Buckets) | 10 frames | $0.01 | 1,000 GIFs | 93% reduction |