
RefractorIQ

AI-Powered Code Analysis and Technical Debt Detection

Overview

RefractorIQ is a comprehensive code analysis platform that combines static analysis, dependency mapping, duplication detection, semantic search, and AI-powered refactoring suggestions. It helps development teams understand their codebase structure, identify technical debt, and receive intelligent recommendations for code improvements.

The platform analyzes repositories using:

  • Tree-sitter AST parsing for multi-language code analysis
  • NetworkX for dependency graph construction
  • MinHash/SimHash for code duplication detection
  • FAISS + SentenceTransformers for semantic code search
  • Google Gemini AI for intelligent refactoring suggestions

The entire system runs asynchronously with Celery workers, providing real-time progress updates through a modern React dashboard.
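
As a rough illustration of this async flow, here is a minimal Celery task sketch. The module path backend.celery_app and the Redis URLs match the Quick Start and Environment Configuration sections below, but the task name, signature, and body are illustrative assumptions, not the project's actual code.

# Hypothetical analysis task; progress states feed the dashboard's polling.
from celery import Celery

app = Celery(
    "refractoriq",
    broker="redis://localhost:6379/0",
    backend="redis://localhost:6379/0",
)

@app.task(bind=True)
def analyze_repository(self, repo_url: str) -> dict:
    # Publish intermediate state so the API can report real-time progress.
    self.update_state(state="PROGRESS", meta={"step": "cloning"})
    # ... clone, parse, build graphs, index, call the LLM ...
    return {"repo_url": repo_url, "status": "COMPLETED"}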

Quick Start

Prerequisites

  • Python 3.10+
  • Node.js 16+
  • Redis (for task queue)
  • PostgreSQL (for database)

Using Docker Compose

# Start supporting services
docker-compose up -d

# Install Python dependencies
pip install -r requirements.txt

# Start Celery worker
celery -A backend.celery_app worker --loglevel=info

# Start FastAPI server
uvicorn backend.main:app --reload --host 0.0.0.0 --port 8000

Start the Frontend

cd frontend
npm install
npm run dev

Access the Application

Once the backend and frontend are running, open the dashboard at the Vite dev server URL printed by npm run dev (http://localhost:5173 by default) and the API at http://localhost:8000; interactive OpenAPI docs are available at http://localhost:8000/docs.

API Usage

Start Repository Analysis

Endpoint: GET /analyze/full

Parameters:

  • repo_url - GitHub repository URL
  • exclude_third_party - Skip third-party libraries (default: true)
  • exclude_tests - Skip test files (default: true)

Example:

curl -X GET "http://localhost:8000/analyze/full?repo_url=https://github.com/user/repo.git&exclude_third_party=true&exclude_tests=true"

Response:

{
  "message": "Analysis started",
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status_url": "/analyze/status/550e8400-e29b-41d4-a716-446655440000"
}
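
The same request can be issued from Python. This is a hypothetical client snippet using the requests library; only the endpoint and parameters above come from the API itself.

# Start a full analysis and capture the job id from the response.
import requests

resp = requests.get(
    "http://localhost:8000/analyze/full",
    params={
        "repo_url": "https://github.com/user/repo.git",
        "exclude_third_party": "true",
        "exclude_tests": "true",
    },
)
job = resp.json()
print(job["job_id"], job["status_url"])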

Check Analysis Status

Endpoint: GET /analyze/status/{job_id}

Example:

curl http://localhost:8000/analyze/status/550e8400-e29b-41d4-a716-446655440000

Response:

{
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "COMPLETED",
  "repo_url": "https://github.com/user/repo.git",
  "results_url": "/analyze/results/550e8400-e29b-41d4-a716-446655440000",
  "summary": {
    "loc": 15234,
    "debt_score": 87.5,
    "avg_complexity": 5.2,
    "duplicate_pairs": 12
  }
}
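
Continuing the snippet above, a client would typically poll this endpoint until the job reaches COMPLETED. This is a sketch; the polling interval and retry limit are arbitrary, and any failure states are not documented here.

# Poll the status endpoint until the analysis finishes.
import time
import requests

status_url = "http://localhost:8000" + job["status_url"]
for _ in range(120):          # give up after roughly 10 minutes
    status = requests.get(status_url).json()
    if status["status"] == "COMPLETED":
        break
    time.sleep(5)
print(status.get("summary"))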

Get Analysis Results

Endpoint: GET /analyze/results/{job_id}

Example:

curl http://localhost:8000/analyze/results/550e8400-e29b-41d4-a716-446655440000

Response:

{
  "repository": "https://github.com/user/repo.git",
  "code_metrics": {
    "LOC": 15234,
    "TODOs_FIXME_HACK": 42,
    "AvgCyclomaticComplexity": 5.2,
    "DebtScore": 87.5,
    "TotalFunctions": 487,
    "MaxComplexity": 28,
    "ComplexityDistribution": {
      "low": 312,
      "medium": 143,
      "high": 28,
      "very_high": 4
    }
  },
  "dependency_metrics": {
    "total_files": 156,
    "total_edges": 423,
    "circular_dependencies": 2,
    "most_dependent_files": [...],
    "graph_json": {...}
  },
  "duplication_metrics": {
    "duplicate_pairs_found": 12,
    "similarity_threshold": 0.85,
    "duplicates": [...]
  },
  "llm_suggestions": [...]
}

Semantic Code Search

Endpoint: GET /analyze/results/{job_id}/search

Parameters:

  • q - Search query (natural language)
  • k - Number of results (default: 10)

Example:

curl "http://localhost:8000/analyze/results/550e8400-e29b-41d4-a716-446655440000/search?q=authentication+logic&k=5"

Response:

{
  "results": [
    {
      "path": "backend/auth/login.py",
      "score": 0.8923
    },
    {
      "path": "backend/middleware/auth.py",
      "score": 0.8456
    }
  ]
}

How It Works

  1. Repository Cloning - Shallow clone of the target repository into a temporary directory (see the sketch after this list)
  2. Static Analysis - Tree-sitter parses code files to extract metrics (LOC, complexity, TODOs)
  3. Dependency Graph - NetworkX builds a directed graph of file dependencies and imports
  4. Duplication Detection - MinHash fingerprinting identifies similar code blocks
  5. Semantic Indexing - SentenceTransformers embeddings are stored in a FAISS vector index for search
  6. AI Analysis - Google Gemini generates refactoring suggestions for complex functions
  7. Report Assembly - All metrics are aggregated into a JSON report and stored locally
  8. Status Updates - Real-time job status polling via FastAPI endpoints
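
A minimal sketch of step 1, shallow-cloning the repository into a temporary directory (the function name and layout are illustrative, not the project's actual code):

# Shallow clone keeps the checkout small: only the latest commit is fetched.
import subprocess
import tempfile

def shallow_clone(repo_url: str) -> str:
    workdir = tempfile.mkdtemp(prefix="refractoriq_")
    subprocess.run(["git", "clone", "--depth", "1", repo_url, workdir], check=True)
    return workdir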

Features

Code Quality Metrics

  • Lines of code (excluding comments/blanks)
  • Cyclomatic complexity analysis
  • Technical debt scoring
  • TODO/FIXME/HACK detection
  • Complexity distribution visualization
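
For intuition, here is a simplified complexity estimate using Python's built-in ast module; the project itself parses with Tree-sitter across multiple languages, so treat this as an illustrative stand-in rather than the real analyzer.

# Approximate cyclomatic complexity: 1 per function plus 1 per decision point.
import ast

DECISION_NODES = (ast.If, ast.For, ast.While, ast.Try, ast.ExceptHandler, ast.BoolOp)

def function_complexity(source: str) -> dict:
    scores = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            branches = sum(isinstance(n, DECISION_NODES) for n in ast.walk(node))
            scores[node.name] = 1 + branches
    return scores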

Dependency Analysis

  • File-level dependency graphs
  • Import relationship mapping
  • Circular dependency detection
  • Most dependent/depended-on file identification
  • Interactive ReactFlow visualization
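
The circular-dependency check can be sketched with NetworkX as follows, assuming an edge (a, b) means "file a imports file b"; the file names are hypothetical.

# Detect import cycles and rank the most depended-on files.
import networkx as nx

edges = [
    ("app/main.py", "app/db.py"),
    ("app/db.py", "app/models.py"),
    ("app/models.py", "app/main.py"),
]

graph = nx.DiGraph(edges)
cycles = list(nx.simple_cycles(graph))                                    # circular dependencies
fan_in = sorted(graph.in_degree, key=lambda item: item[1], reverse=True)  # most depended-on files
print(cycles, fan_in[:3])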

Code Duplication

  • MinHash-based similarity detection
  • Configurable similarity thresholds
  • Side-by-side duplicate comparison
  • Jaccard similarity scoring
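
A minimal duplication check with MinHash, assuming the datasketch library; tokens here are whitespace-split for brevity, whereas the real pipeline likely fingerprints normalized code blocks.

# Estimate Jaccard similarity between two code snippets via MinHash signatures.
from datasketch import MinHash

def fingerprint(code: str, num_perm: int = 128) -> MinHash:
    m = MinHash(num_perm=num_perm)
    for token in code.split():
        m.update(token.encode("utf8"))
    return m

a = fingerprint("def add(a, b): return a + b")
b = fingerprint("def add(x, y): return x + y")
print(a.jaccard(b))   # compare against the 0.85 similarity threshold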

Semantic Search

  • Natural language code search
  • Vector similarity using FAISS
  • Fast retrieval across entire codebase
  • Context-aware file ranking
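
The indexing and query path can be sketched with SentenceTransformers and FAISS; the model name all-MiniLM-L6-v2 and the file contents are assumptions, since the README does not state which embedding model the project uses.

# Embed file contents, index them, and search with a natural-language query.
import faiss
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
files = {
    "backend/auth/login.py": "def login(user, password): ...",
    "backend/middleware/auth.py": "def verify_token(token): ...",
}

vectors = model.encode(list(files.values()), normalize_embeddings=True)
index = faiss.IndexFlatIP(int(vectors.shape[1]))   # inner product == cosine on normalized vectors
index.add(vectors)

query = model.encode(["authentication logic"], normalize_embeddings=True)
scores, ids = index.search(query, 2)
paths = list(files.keys())
print([(paths[i], float(s)) for i, s in zip(ids[0], scores[0])])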

AI Refactoring

  • Automatic detection of complex functions
  • Google Gemini-powered suggestions
  • Side-by-side code comparison
  • Complexity-driven prioritization
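
A sketch of the LLM call with the google-generativeai SDK; the model name and prompt wording are assumptions, not the project's exact configuration.

# Ask Gemini for a refactoring suggestion for a high-complexity function.
import os
import google.generativeai as genai

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
model = genai.GenerativeModel("gemini-1.5-flash")

def suggest_refactor(source: str, complexity: int) -> str:
    prompt = (
        f"This function has cyclomatic complexity {complexity}. "
        f"Suggest a simpler, equivalent refactoring:\n\n{source}"
    )
    return model.generate_content(prompt).text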

Project Purpose

RefractorIQ was built to demonstrate a production-ready code analysis platform that combines traditional static analysis with modern AI capabilities. The project showcases:

  • Async Architecture - Celery task queue with Redis backend for scalable analysis
  • Multi-Language Support - Tree-sitter parsers for Python, JavaScript/TypeScript, and Java
  • Graph Algorithms - NetworkX for dependency analysis and circular dependency detection
  • Vector Search - FAISS indexing for semantic code search
  • LLM Integration - Google Gemini API for intelligent refactoring suggestions
  • Modern Frontend - React with TailwindCSS and ReactFlow for interactive visualizations
  • REST API - FastAPI with OpenAPI documentation and async endpoints
  • Database Management - SQLAlchemy ORM with PostgreSQL for job persistence
  • Deployment Ready - Docker, Railway, and Nixpacks configuration included

It serves as a comprehensive example of building and deploying a full-stack ML-powered application with real-world complexity.

Tech Stack

  • Backend: FastAPI, Celery, SQLAlchemy, PostgreSQL, Redis
  • Analysis: Tree-sitter, NetworkX, MinHash, SimHash
  • AI/ML: SentenceTransformers, FAISS, Google Generative AI
  • Frontend: React 18, Vite, TailwindCSS, ReactFlow
  • Deployment: Docker, Railway, Uvicorn

Environment Configuration

Create a .env file:

# Database
DATABASE_URL=postgresql://user:password@localhost:5432/refractor_db

# Celery
CELERY_BROKER_URL=redis://localhost:6379/0
CELERY_RESULT_BACKEND=redis://localhost:6379/0

# GitHub (optional)
GITHUB_CLIENT_ID=your_client_id
GITHUB_CLIENT_SECRET=your_client_secret
GITHUB_PAT=your_personal_access_token

# Google AI
GOOGLE_API_KEY=your_google_api_key

# Application
BACKEND_URL=http://localhost:8000
REPORT_STORAGE_PATH=./analysis_reports
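
At startup these values can be read from the environment, for example with python-dotenv (an assumption; the project may load its settings differently):

# Load .env from the working directory and read individual settings.
import os
from dotenv import load_dotenv

load_dotenv()
DATABASE_URL = os.getenv("DATABASE_URL")
CELERY_BROKER_URL = os.getenv("CELERY_BROKER_URL", "redis://localhost:6379/0")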

License

MIT

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request
