A comprehensive AI-powered code quality analyzer with multi-language support. This tool analyzes code quality across multiple programming languages using:
- Multi-language parsing to extract code metrics from Python, JavaScript, TypeScript, Java, C++, Go, Rust, Ruby, and PHP
- Rule-based detectors for code smells (long functions, unused imports, deep nesting, complexity)
- Language-specific linters integration (flake8, pylint, ESLint, Checkstyle, cppcheck, golangci-lint, clippy, RuboCop, PHP_CodeSniffer)
- ML classifier using scikit-learn to classify code snippets as "good" or "bad"
- A CLI for analyzing files, plus a Flask web UI for uploading and analyzing code with auto-fix options
- Sample dataset and comprehensive unit tests
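The scikit-learn classifier is trained via the `train` CLI command, with the real features and model defined in `ml_classifier.py`. As a rough illustration of the approach only (the training snippets, labels, and feature choices below are hypothetical, not the project's), a minimal "good"/"bad" snippet classifier might look like:

```python
# Illustrative sketch: TF-IDF over code text feeding a scikit-learn classifier.
# The tiny dataset and feature setup here are NOT the project's actual pipeline.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

snippets = [
    "def add(a, b):\n    return a + b",                                    # short and clear
    "def f(x):\n    if x:\n        if x:\n            if x:\n                return 1",  # deeply nested
    "import os\n\ndef main():\n    print('hi')",
    "def g(a,b,c,d,e,f,g,h):\n    pass",                                   # too many parameters
]
labels = ["good", "bad", "good", "bad"]

model = make_pipeline(
    # Character n-grams tolerate mixed languages better than word tokens
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    LogisticRegression(max_iter=1000),
)
model.fit(snippets, labels)

prediction = model.predict(["def mul(a, b):\n    return a * b"])[0]
print(prediction)
```

In practice a real dataset (such as the included `datasets/synthetic_dataset.csv`) and the project's own feature extraction would replace the toy inputs above.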
| Language | Parser Type | Linter Integration | Status |
|---|---|---|---|
| Python | AST | flake8, pylint | ✅ Full |
| JavaScript/TS | Regex | ESLint | ✅ Full |
| Java | Regex | Checkstyle, PMD | ✅ Full |
| C/C++ | Regex | cppcheck, clang-tidy | ✅ Full |
| Go | Regex | golangci-lint | ✅ Full |
| Rust | Regex | clippy | ✅ Full |
| Ruby | Regex | RuboCop | ✅ Full |
| PHP | Regex | PHP_CodeSniffer | ✅ Full |
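As the table shows, Python is the one language parsed with a true AST rather than regexes, which makes metrics like function length exact. A self-contained sketch of the idea (this mirrors what `parser.py` does conceptually but is not its actual code):

```python
import ast

def function_lengths(source: str) -> dict[str, int]:
    """Map each function name to its length in lines, using the AST."""
    tree = ast.parse(source)
    lengths = {}
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # end_lineno is available on Python 3.8+
            lengths[node.name] = node.end_lineno - node.lineno + 1
    return lengths

sample = """\
def short():
    return 1

def longer():
    a = 1
    b = 2
    return a + b
"""
print(function_lengths(sample))  # {'short': 2, 'longer': 4}
```

Regex-based extraction for the other languages trades this precision for simplicity, which is why Tree-sitter integration is listed as a future improvement below.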
- Create and activate a Python environment (>= 3.9 recommended).
- Install requirements:
```bash
pip install -r requirements.txt
```

- (Optional) Install language-specific linters for enhanced analysis:

```bash
# JavaScript/TypeScript
npm install -g eslint

# Java (Ubuntu/Debian)
apt-get install checkstyle

# C/C++ (Ubuntu/Debian)
apt-get install cppcheck

# Go
go install github.com/golangci/golangci-lint/cmd/golangci-lint@latest

# Rust
rustup component add clippy

# Ruby
gem install rubocop

# PHP
composer global require squizlabs/php_codesniffer
```

- Train a small demo model (optional):

```bash
python -m code_quality_analyzer.cli train --dataset datasets/synthetic_dataset.csv --model-out models/code_quality_model.joblib
```

- Analyze any supported file:
```bash
# Python
python -m code_quality_analyzer.cli analyze --file examples/bad_example.py --model models/code_quality_model.joblib

# JavaScript
python -m code_quality_analyzer.cli analyze --file app.js --model models/code_quality_model.joblib

# Java
python -m code_quality_analyzer.cli analyze --file Main.java --model models/code_quality_model.joblib

# Any supported language (auto-detected by extension)
python -m code_quality_analyzer.cli analyze --file <your-file> --model models/code_quality_model.joblib
```

- Run the web app:
```bash
python -m code_quality_analyzer.webapp
```

Project structure:

- `code_quality_analyzer/` - main package
  - `parser.py` - multi-language parsing and feature extraction (Python AST; regex-based for other languages)
  - `detectors.py` - rule-based detectors and linter integrations for all supported languages
  - `ml_classifier.py` - model training and prediction helpers
  - `suggestion_engine.py` - auto-fix suggestions
  - `cli.py` - CLI wrapper with multi-language support
  - `webapp.py` - Flask demo for uploading and analyzing code
- `datasets/` - small synthetic datasets
- `examples/` - sample files in various languages
- `tests/` - unit tests
- `requirements.txt` - required Python packages
```python
from code_quality_analyzer.parser import extract_features_from_file
from code_quality_analyzer.detectors import RuleBasedDetector

detector = RuleBasedDetector()

# Python
features = extract_features_from_file('script.py')
issues = detector.detect_all_languages(source_code, 'python')

# JavaScript
features = extract_features_from_file('app.js')
issues = detector.detect_all_languages(source_code, 'javascript')

# Java
features = extract_features_from_file('Main.java')
issues = detector.detect_all_languages(source_code, 'java')
```

Language detection is automatic based on file extension.
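Since detection keys off the file extension, the mapping is straightforward to picture. A minimal self-contained sketch (the extension table below is an assumption for illustration, not the project's actual mapping):

```python
from pathlib import Path

# Hypothetical extension-to-language map; the real table lives in the project's parser.
EXT_TO_LANG = {
    ".py": "python", ".js": "javascript", ".ts": "typescript",
    ".java": "java", ".cpp": "cpp", ".cc": "cpp", ".go": "go",
    ".rs": "rust", ".rb": "ruby", ".php": "php",
}

def detect_language(path: str) -> str:
    """Return the language name for a file path, based on its extension."""
    ext = Path(path).suffix.lower()
    try:
        return EXT_TO_LANG[ext]
    except KeyError:
        raise ValueError(f"Unsupported file extension: {ext!r}")

print(detect_language("Main.java"))  # java
```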
This project supports comprehensive multi-language analysis. You can extend it with:
- Tree-sitter integration for more precise AST parsing across all languages
- Better ML dataset with language-specific embeddings
- Enhanced auto-fix engine using language-specific AST rewriters
- CI integration for scanning across multi-language repositories
- Custom rule definitions per language
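As a sketch of what per-language custom rule definitions could look like (the decorator-based registry below is hypothetical, not part of the current codebase):

```python
import re
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    name: str
    language: str
    check: Callable[[str], list]  # source code -> list of findings

RULES: list = []

def rule(name: str, language: str):
    """Decorator registering a custom rule for one language (illustrative API)."""
    def wrap(fn):
        RULES.append(Rule(name, language, fn))
        return fn
    return wrap

@rule("no-print", "python")
def no_print(source: str) -> list:
    # Flag bare print() calls, line by line
    return [f"print call on line {i}"
            for i, line in enumerate(source.splitlines(), 1)
            if re.match(r"\s*print\(", line)]

def run_rules(source: str, language: str) -> list:
    findings = []
    for r in RULES:
        if r.language == language:
            findings.extend(r.check(source))
    return findings

print(run_rules("x = 1\nprint(x)\n", "python"))  # ['print call on line 2']
```

A registry like this would let rules be contributed per language without touching the core detectors.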
Contributions & improvements are welcome.
This repo contains a Dockerfile and docker-compose.yml to run the Flask web app behind Gunicorn. The container exposes port 5000.
Build and run locally with Docker:
```bash
docker build -t code-quality-analyzer:latest .
docker run --rm -p 5000:5000 -v ${PWD}/models:/app/models -e MODEL_PATH=/app/models/code_quality_model.joblib code-quality-analyzer:latest
```

Using Docker Compose:

```bash
docker-compose up --build
```

Heroku / PaaS (example): create an app, set a buildpack or use the Docker container; the included Procfile runs Gunicorn for you.

```bash
heroku create
git push heroku main
heroku ps:scale web=1
```

Note: ensure the model file `models/code_quality_model.joblib` is included, or set the `MODEL_PATH` environment variable to point at the model artifact.
Docker is the recommended deployment method: no size limits, full multi-language support, and ML features included.
The project includes automated CI/CD that builds and publishes Docker images to GitHub Container Registry (GHCR) on every push to main:
GitHub Actions automatically:

- Builds a multi-architecture Docker image (linux/amd64, linux/arm64)
- Pushes to `ghcr.io/<YOUR_USERNAME>/code-quality-analyzer:latest`
- Optionally uploads ML models to S3 (if configured)
- Uses a build cache for faster subsequent builds
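The steps above roughly correspond to a workflow job like the following sketch, built on the standard `docker/build-push-action`; the repo's actual workflow file may differ in names, triggers, and versions:

```yaml
# Illustrative build-and-push job; step names and triggers are assumptions.
name: docker-publish
on:
  push:
    branches: [main]
jobs:
  build:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
    steps:
      - uses: actions/checkout@v4
      - uses: docker/setup-qemu-action@v3      # emulation for arm64 builds
      - uses: docker/setup-buildx-action@v3
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - uses: docker/build-push-action@v6
        with:
          platforms: linux/amd64,linux/arm64
          push: true
          tags: ghcr.io/${{ github.repository_owner }}/code-quality-analyzer:latest
          cache-from: type=gha
          cache-to: type=gha,mode=max
```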
Pull and run the pre-built image:

```bash
# Pull latest image from GHCR
docker pull ghcr.io/shahinshac/code-quality-analyzer:latest

# Run the container
docker run -d -p 5000:5000 \
  -e MODEL_PATH=/app/models/code_quality_model.joblib \
  ghcr.io/shahinshac/code-quality-analyzer:latest
```

Deploy to any cloud platform:
- Railway: Connect GitHub repo, auto-deploys from GHCR
- Render: Deploy from Docker registry
- Fly.io: `fly launch` from the Dockerfile
- Google Cloud Run: Deploy from container registry
- AWS ECS/Fargate: Use GHCR image
- Azure Container Instances: Pull from GHCR
```bash
# Build locally
docker build -t ghcr.io/<YOUR_USERNAME>/code-quality-analyzer:latest .

# Login to GHCR
echo <GITHUB_TOKEN> | docker login ghcr.io -u <YOUR_USERNAME> --password-stdin

# Push to GHCR
docker push ghcr.io/<YOUR_USERNAME>/code-quality-analyzer:latest
```

Configure these GitHub repository secrets/variables to enable S3 model upload:
Secrets:

- `AWS_ACCESS_KEY_ID` - AWS access key
- `AWS_SECRET_ACCESS_KEY` - AWS secret key

Variables:

- `AWS_REGION` - AWS region (default: us-east-1)
- `AWS_S3_BUCKET` - S3 bucket name for model storage
The workflow will automatically upload models to S3 and generate presigned URLs for download.