📊 CSV Insight Assistant (with DeepSeek) Author: YILUO Date: 2025-03-27
A Streamlit-powered AI tool for automatically analyzing and visualizing any CSV file — including basic statistics, missing values heatmaps, correlation maps, and boxplots. It also leverages a local DeepSeek language model (via Ollama) to generate contextual insights and analytical suggestions based on your uploaded dataset.
⸻
🚀 Features • Upload any .csv file • Auto-analysis includes: • Dataset structure (.info()) • Summary statistics (.describe()) • Missing values count and heatmap • Correlation matrix (including target) • Visualizations: • Missing value heatmap • Boxplots for all numeric columns • Boxplots by target column (if detected) • AI Summary via DeepSeek (locally run): • Description of the dataset • 3 analysis/modeling suggestions
⸻
🛠️ Setup Instructions
Step 1: Install Python dependencies
pip install streamlit pandas matplotlib seaborn requests
Step 2: Install and run DeepSeek locally with Ollama
Install Ollama (if not already)
brew install ollama
Pull the DeepSeek model
ollama pull deepseek-coder:6.7b
Run DeepSeek model (keep this running in a separate terminal)
ollama run deepseek-coder:6.7b
Step 3: Launch the app
streamlit run 20250327_CSV_Insight_Assistant_wDS.py
⸻
📄 Example Dataset
This app was tested using the Spaceship Titanic dataset from Kaggle: https://www.kaggle.com/competitions/spaceship-titanic/data