A Python tool that analyzes a Reddit user's posts and comments to generate a user persona with AI. The bot scrapes the user's content, processes it with Google's Gemini AI, and renders a PDF report that includes an AI-generated avatar.
- Reddit Content Scraping: Extracts posts and comments from Reddit user profiles
- AI-Powered Analysis: Uses Google Gemini AI to analyze user behavior and generate detailed personas
- Avatar Generation: Creates custom avatars based on persona descriptions using AI image generation
- PDF Report Generation: Creates professional-looking PDF reports with personality visualizations
- Personality Metrics: Visual representation of personality traits and tech savviness
- Structured Data: Organizes persona data into categories like demographics, interests, values, and motivations
persona_bot/
├── main.py                      # Main entry point for the application
├── scrape_reddit.py             # Reddit content scraping functionality
├── generate_persona.py          # AI persona generation and image creation
├── generate_pdf.py              # PDF report generation
├── requirements.txt             # Python dependencies
├── templates/
│   └── persona_template.html    # HTML template for PDF generation
├── persona_outputs/             # Generated personas and avatars
│   ├── [username]_persona.txt
│   ├── [username]_persona.pdf
│   └── [username]_avatar.png
└── .env                         # Environment variables (create this file)
- Python 3.8+ - Make sure Python is installed on your system
- wkhtmltopdf - Required for PDF generation
  - Windows: Download from wkhtmltopdf.org
  - Install to C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe
- Reddit API Access - Create a Reddit app at reddit.com/prefs/apps
- Google Gemini API Key - Get one from Google AI Studio
- Image Generation API - Get Claid.ai API key for avatar generation
- Clone or download the project

      git clone <repository-url>
      cd persona_bot

- Install Python dependencies

      pip install -r requirements.txt

- Create environment file

  Create a .env file in the project root with the following variables:

      # Reddit API Credentials
      REDDIT_CLIENT_ID=your_reddit_client_id
      REDDIT_CLIENT_SECRET=your_reddit_client_secret
      REDDIT_USER_AGENT=PersonaBot/1.0 by YourUsername

      # Google Gemini AI API Key
      GEMINI_API_KEY=your_gemini_api_key

      # Image Generation API (Claid.ai)
      IMAGE_API_KEY=your_claid_api_key

- Verify wkhtmltopdf installation
  - Ensure wkhtmltopdf is installed at C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe
  - Or update the path in generate_pdf.py if installed elsewhere
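The setup steps above can be checked before the first run with a small preflight script. The following is a minimal sketch, not part of the project: the helper names `missing_env_vars` and `find_wkhtmltopdf` are hypothetical, the variable names are taken from the .env example, and the fallback to the system PATH is an assumption for non-Windows installs. It inspects the process environment only, so the .env file has to be loaded first (for example with python-dotenv's `load_dotenv()`).

```python
import os
import shutil
from pathlib import Path

# Variable names taken from the .env example in this README.
REQUIRED_VARS = [
    "REDDIT_CLIENT_ID",
    "REDDIT_CLIENT_SECRET",
    "REDDIT_USER_AGENT",
    "GEMINI_API_KEY",
    "IMAGE_API_KEY",
]

# Default Windows install location named in this README.
DEFAULT_WKHTMLTOPDF = r"C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe"


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


def find_wkhtmltopdf(default_path=DEFAULT_WKHTMLTOPDF):
    """Return a usable wkhtmltopdf path, or None if none is found."""
    if Path(default_path).is_file():
        return default_path
    # Assumption: fall back to whatever is on PATH (custom or non-Windows install).
    return shutil.which("wkhtmltopdf")


if __name__ == "__main__":
    print("Missing variables:", missing_env_vars() or "none")
    print("wkhtmltopdf:", find_wkhtmltopdf() or "not found")
```

If `find_wkhtmltopdf()` returns a path other than the default, that is the value to put in generate_pdf.py.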
Run the main script to analyze predefined Reddit users:

    python main.py

Edit the reddit_urls list in main.py to target specific Reddit users:

    reddit_urls = [
        "https://www.reddit.com/user/your_target_user/",
        "https://www.reddit.com/user/another_user/",
    ]

After generating a persona, create a PDF report:

    python generate_pdf.py

- Scrape Reddit data only:

      from scrape_reddit import get_user_content
      posts, comments = get_user_content("username")

- Generate persona from existing data:

      from generate_persona import generate_user_persona
      persona = generate_user_persona(posts, comments, "username")
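The reddit_urls list holds full profile URLs, while get_user_content takes a bare username. This README does not show how main.py converts one into the other, so the sketch below is only one plausible way to do it. The function name `username_from_url` is hypothetical, and accepting the short `/u/` form is an assumption.

```python
from urllib.parse import urlparse


def username_from_url(url):
    """Extract the username from a Reddit profile URL.

    Accepts /user/<name>/ and /u/<name>/ paths; raises ValueError otherwise.
    """
    # Split the URL path into its non-empty segments, e.g. ["user", "name"].
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) >= 2 and parts[0] in ("user", "u"):
        return parts[1]
    raise ValueError(f"Not a Reddit profile URL: {url!r}")
```

Rejecting anything that is not a profile path keeps a mistyped subreddit or post URL from being passed on as if it were a username.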
For each analyzed user, the bot creates:
- Text Persona (username_persona.txt): Detailed analysis including:
  - Demographics (hypothetical)
  - Interests and hobbies
  - Personality traits
  - Values and motivations
  - Tech savviness level
  - Communication tone
  - Opinions and perspectives
- PDF Report (username_persona.pdf): Professional visual report with:
  - User avatar
  - Personality trait bars
  - Tech savviness circular chart
  - Organized sections for easy reading
- Avatar Image (username_avatar.png): AI-generated profile picture
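The three output files follow one naming pattern inside persona_outputs/. As a small illustration, the paths for a given user could be built as follows. The helper `output_paths` is hypothetical and not taken from the project code.

```python
from pathlib import Path

# Output folder name taken from the project layout in this README.
OUTPUT_DIR = Path("persona_outputs")


def output_paths(username, output_dir=OUTPUT_DIR):
    """Return the expected output file paths for one analyzed user."""
    return {
        "text": output_dir / f"{username}_persona.txt",
        "pdf": output_dir / f"{username}_persona.pdf",
        "avatar": output_dir / f"{username}_avatar.png",
    }
```

A script can use this to check which outputs already exist, for example `output_paths("alex")["pdf"].exists()`.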
## User Persona: Alex
**1. Demographics (Hypothetical):**
* Age: 30-35
* Gender: Male
* Location: New York City
* Occupation: Software Engineer
**2. Interests:**
* Technology and Apple products
* Video games
* Japanese culture
* Environmental sustainability
**3. Personality:**
Curious, analytical, tech-savvy, introspective, socially aware
**4. Tech Savviness:**
High - Comfortable with advanced technology and development tools
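Persona text in the format shown above can be split back into its numbered sections for further processing. The sketch below assumes every section starts with a bold numbered heading such as `**1. Demographics (Hypothetical):**`. Gemini output may not always follow that shape, and the parser in generate_pdf.py may work differently. The function name `parse_persona` is hypothetical.

```python
import re

# Matches bold numbered headings such as "**2. Interests:**".
SECTION_RE = re.compile(r"^\*\*\d+\.\s*(.+?):\*\*\s*$")


def parse_persona(text):
    """Map each section title to the list of lines that follow it."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        match = SECTION_RE.match(line)
        if match:
            current = match.group(1)
            sections[current] = []
        elif current and line and not line.startswith("#"):
            # Drop a leading "* " or "- " bullet marker, keep the rest.
            sections[current].append(re.sub(r"^[*-]\s+", "", line))
    return sections
```

Lines that appear before the first numbered heading, such as the `## User Persona: Alex` title, are ignored.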
Edit prompts in generate_persona.py to change analysis focus:
- Modify the persona generation prompt for different analysis styles
- Adjust post/comment limits in scrape_reddit.py
- Update personality traits in generate_pdf.py
Customize the visual appearance in templates/persona_template.html:
- Modify CSS styles for colors, fonts, and layout
- Add new sections or remove existing ones
- Change personality trait categories
- wkhtmltopdf not found
  - Ensure wkhtmltopdf is properly installed
  - Update the path in generate_pdf.py line 59
- Reddit API errors
  - Verify your Reddit API credentials in .env
  - Check if the target user's profile is public
  - Ensure you're not hitting rate limits
- Gemini API errors
  - Verify your Gemini API key is valid
  - Check your API quota and billing status
- Missing dependencies
  - Run pip install -r requirements.txt again
  - Use a virtual environment to avoid conflicts
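For the rate-limit and quota items in the list above, one common mitigation is to retry a failed call after an exponentially growing delay. The helper below is a generic sketch and not something the project is known to include. The name `retry_with_backoff` and its parameters are hypothetical.

```python
import time


def retry_with_backoff(func, retries=4, base_delay=1.0, sleep=time.sleep,
                       retry_on=(Exception,)):
    """Call func(); on failure wait base_delay * 2**attempt, then try again."""
    for attempt in range(retries):
        try:
            return func()
        except retry_on:
            if attempt == retries - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)
```

In practice, `retry_on` should be narrowed to the exception type the Reddit or Gemini client raises for rate limiting, so that credential errors fail immediately instead of being retried. A call could then look like `retry_with_backoff(lambda: get_user_content("username"))`.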
Enable detailed logging by modifying the logging level in scrape_reddit.py:

    logging.basicConfig(level=logging.DEBUG)

- requests - HTTP requests
- google-generativeai - Google Gemini AI integration
- python-dotenv - Environment variable management
- Pillow - Image processing
- jinja2 - Template rendering
- pdfkit - PDF generation
- praw - Reddit API wrapper
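Several of these packages are imported under a different name than the one used with pip, which can make a "missing dependency" error harder to trace. The sketch below reports which of the listed packages cannot be found in the current environment. The helper `missing_packages` is hypothetical, and the mapping covers only the packages named in this README.

```python
import importlib.util

# Package name on PyPI -> module name used in import statements.
IMPORT_NAMES = {
    "requests": "requests",
    "google-generativeai": "google.generativeai",
    "python-dotenv": "dotenv",
    "Pillow": "PIL",
    "jinja2": "jinja2",
    "pdfkit": "pdfkit",
    "praw": "praw",
}


def missing_packages(import_names=IMPORT_NAMES):
    """Return the PyPI names of packages whose modules cannot be located."""
    missing = []
    for package, module in import_names.items():
        try:
            found = importlib.util.find_spec(module) is not None
        except (ImportError, ValueError):
            # Raised when a parent package (e.g. "google") is absent or broken.
            found = False
        if not found:
            missing.append(package)
    return missing
```

Running `print(missing_packages())` inside the project's virtual environment lists what still needs to be installed.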
- Public Data Only: This tool only accesses publicly available Reddit content
- Hypothetical Analysis: Generated personas are AI interpretations, not factual profiles
This project is for educational and research purposes. Please use responsibly and in accordance with Reddit's terms of service and API usage guidelines.
Note: This tool analyzes public Reddit content and generates hypothetical personas for educational purposes. Always use responsibly and respect user privacy.