A powerful natural language interface for querying Saudi Stock Exchange (TASI) financial data. Ask questions in plain English and get instant insights powered by AI.
- Natural Language Queries: Ask questions like "Show me the top 10 most profitable companies" or "Compare sector performance by ROE"
- AI-Powered SQL Generation: Converts natural language to optimized PostgreSQL queries using Gemini Flash 2.5
- Fresh 2024 Data: Newly extracted 2024 annual financial data for 36 TASI companies
- 6 Banks (Riyad Bank, Bank Aljazira, Saudi Investment Bank, Saudi Awwal Bank, Bank Albilad, Alinma Bank)
- 15 Industrial companies (cement manufacturers and industrial groups)
- 7 Consumer & Retail companies (Almarai, Jarir, Nahdi, BinDawood, etc.)
- 3 Finance companies
- Plus Real Estate, Media, Telecom, and Healthcare sectors
- Real-Time Financial Data: Access comprehensive TASI financial metrics including:
- Revenue, Net Profit, Assets, Equity
- ROE, Net Margin, Current Ratio, Debt-to-Equity
- Sector classifications and performance categories
- Interactive Data Tables: View, sort, and filter results with ease
- Export Functionality: Download results as CSV or Excel files
- Saudi-Themed UI: Beautiful dark theme with Saudi green and gold branding
Main dashboard with natural language query interface
Example query results with data visualization
- Python 3.10+
- PostgreSQL database with TASI financial data
- OpenRouter API key (for AI features)
-
Clone the repository
git clone https://github.com/yourusername/TASI-Financial-AI-v2.git cd TASI-Financial-AI-v2 -
Create a virtual environment
python -m venv venv source venv/bin/activate # Linux/Mac # or venv\Scripts\activate # Windows
-
Install dependencies
pip install -r requirements.txt
-
Set up environment variables
cp .env.example .env # Edit .env with your database URL and API keys -
Run the application
streamlit run streamlit_app.py
-
Open in browser Navigate to
http://localhost:8501
- Install PostgreSQL
- Create a database:
CREATE DATABASE tasi_financials; CREATE USER tasi WITH PASSWORD 'your_password'; GRANT ALL PRIVILEGES ON DATABASE tasi_financials TO tasi;
- Run the schema setup:
python setup_database.py
- Migrate the data:
python migrate_data.py
We recommend Neon for serverless PostgreSQL:
- Create a free Neon account
- Create a new project and database
- Copy your connection string
- Use the connection string in your environment variables
Ensure your repository has:
streamlit_app.py(entry point)requirements.txt.streamlit/config.toml
- Create a Neon PostgreSQL database
- Run the schema and data migration scripts against your cloud database
- Note your connection string
- Go to share.streamlit.io
- Connect your GitHub repository
- Select
streamlit_app.pyas the main file - Add secrets in the Streamlit Cloud dashboard:
[database]
url = "postgresql://user:password@ep-xxx.region.aws.neon.tech/tasi_financials?sslmode=require"
[openrouter]
api_key = "sk-or-v1-your-api-key"- Click Deploy!
- Check the app loads correctly
- Verify database connection (green status in sidebar)
- Test a sample query
| Variable | Description | Required |
|---|---|---|
DATABASE_URL |
PostgreSQL connection string | Yes |
OPENROUTER_API_KEY |
OpenRouter API key for LLM | Yes |
VANNA_API_KEY |
Vanna.ai API key (optional) | No |
For Streamlit Cloud deployment, configure secrets in the dashboard:
[database]
url = "your-database-url"
[openrouter]
api_key = "your-api-key"venna-ai/
├── streamlit_app.py # Main Streamlit application
├── vanna_app.py # AI agent and SQL generation
├── requirements.txt # Python dependencies
├── components/ # UI components
│ ├── __init__.py
│ ├── chat.py # Chat interface
│ ├── sidebar.py # Sidebar controls
│ └── example_questions.py
├── styles/ # CSS styling
│ ├── __init__.py
│ ├── variables.py # Design tokens
│ └── css.py # CSS generation
├── schema/ # Database schema
│ ├── 01_schema.sql # Table definitions
│ └── ...
├── .streamlit/ # Streamlit configuration
│ ├── config.toml
│ └── secrets.toml.example
└── data/ # Data files
The platform includes automated extraction capabilities to pull fresh financial data from company reports. The latest extraction was performed on February 3, 2026, adding 2024 annual financial data for 36 companies.
Banks (6):
- Riyad Bank (1010)
- Bank Aljazira (1020)
- Saudi Investment Bank (1030)
- Saudi Awwal Bank (1040)
- Bank Albilad (1140)
- Alinma Bank (1150)
Industrial Companies (15):
- Arabian Cement Co (3020)
- Yamama Cement Co (3090)
- Yanbu Cement Co (3060)
- City Cement Co (3003)
- Southern Province Cement Co (3050)
- Umm Al-Qura Cement Co (3040)
- Qassim Cement Co (3010)
- Riyadh Cement Co (3010)
- Eastern Province Cement Co (3080)
- Arabian Pipes Co (1304)
- Zamil Industrial Investment Co (2240)
- Saudi Industrial Investment Group (2250)
- Astra Industrial Group (1212)
- Bawan Co (2090)
- United Wire Factories Co (2180)
Consumer & Retail (7):
- Almarai Co (2280)
- Jarir Marketing Co (4190)
- Nahdi Medical Co (4164)
- BinDawood Holding Co (4161)
- Leejam Sports Co (4231)
- Aldrees Petroleum and Transport Services Co (4200)
- Almunajem Foods Co (2271)
Other Sectors (8):
- Amlak International Finance Co (4310)
- Nayifat Finance Co (4080)
- SHL Finance Co (4130)
- Emaar The Economic City (4220)
- MBC Group Co (4210)
- Etihad Atheeb Telecommunication Co (7040)
- Canadian Medical Center Co (4004)
- Saudi Tadawul Group Holding Co (1111)
To extract new financial data or update existing data:
- Prepare source files: Place Excel/PDF financial reports in the
data/extracted/directory - Process and validate: Extract data from financial reports and save to
data/extracted/INSERT_READY.csv - Run the insertion script to load data into the database:
python scripts/insert_extracted_data.py
- Verify the results: The script will provide a detailed summary of:
- New records inserted
- Existing records updated
- Duplicate records skipped
- Final database record count
The insertion script automatically:
- Calculates derived financial metrics (ROE, margins, ratios)
- Validates data integrity
- Prevents duplicate entries
- Updates records with newer extraction dates
- Generates detailed logs in
data/insertion_log_[date].txt
All extracted data undergoes rigorous validation:
- ✅ Completeness checks: Verifying all key financial metrics are present
- ✅ Data type validation: Ensuring numerical values are properly formatted
- ✅ Range validation: Flagging outliers and unusual values
- ✅ Cross-referencing: Validating against known company information
- ✅ Confidence scoring: Each extraction is scored for reliability
Try these natural language queries:
2024 Data Queries:
- "Show me bank performance in 2024"
- "Compare Riyad Bank vs Alinma Bank 2024"
- "Which industrial companies have 2024 data?"
- "Show me the top 5 cement companies by revenue in 2024"
- "List all companies with 2024 annual data"
General Queries:
- "Show me the top 10 most profitable companies"
- "Which companies have the highest ROE?"
- "Compare sector performance by average revenue"
- "Show companies with losses in the insurance sector"
- "What is the total revenue by sector?"
- "List companies with high debt-to-equity ratio"
- "Show year-over-year financial summary"
- Frontend: Streamlit
- Database: PostgreSQL
- AI/LLM: Gemini Flash 2.5 via OpenRouter
- Styling: Custom CSS with Saudi green/gold theme
- Data Processing: Pandas, NumPy
- Fork the repository
- Create a feature branch (
git checkout -b feature/amazing-feature) - Commit your changes (
git commit -m 'Add amazing feature') - Push to the branch (
git push origin feature/amazing-feature) - Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Saudi Stock Exchange (TASI) for financial data
- OpenRouter for AI API access
- Streamlit for the amazing web framework
- Neon for serverless PostgreSQL
Built with love for the Saudi financial community