cloudscan-orchestrator

Core orchestration service for the CloudScan platform. It manages the scan lifecycle, dispatches Kubernetes Jobs, and coordinates all scanning activities.


🎯 Overview

The cloudscan-orchestrator is the heart of the CloudScan platform. It:

  • 📋 Manages scan lifecycle (create, queue, execute, complete)
  • 🚀 Dispatches Kubernetes Jobs for scanner runners
  • 🗄️ Persists scan metadata and findings in PostgreSQL
  • 🔄 Runs background workers (sweeper, cleaner, notifier)
  • 🔐 Handles multi-tenant data isolation
  • 📡 Exposes gRPC and HTTP APIs
  • 📊 Collects and exposes Prometheus metrics

🏗️ Architecture

Service Interactions

┌─────────────────┐
│   API Gateway   │
│   (REST/gRPC)   │
└────────┬────────┘
         │ CreateScan(artifact_id)
         │ GetScan(), ListScans()
         ▼
┌──────────────────────────────────────────────────┐
│          Orchestrator Service (This)             │
│                                                  │
│  ┌────────────┐  ┌──────────┐  ┌─────────────┐   │
│  │ gRPC Server│  │ Sweeper  │  │  Cleaner    │   │
│  │ (Port 9999)│  │ (Worker) │  │  (Worker)   │   │
│  └─────┬──────┘  └────┬─────┘  └──────┬──────┘   │
│        │              │               │          │
│        ▼              ▼               ▼          │
│  ┌──────────────────────────────────────────┐    │
│  │      Job Dispatcher (K8s client-go)      │    │
│  └──────────────┬───────────────────────────┘    │
└─────────────────┼────────────────────────────────┘
                  │
         ┌────────┼────────┐
         │        │        │
         │        ▼        │
         │  ┌──────────────────┐
         │  │ Storage Service  │
         │  │ (gRPC)           │
         │  │ GetArtifact()    │
         │  └──────────────────┘
         │        │
         │        │ presigned URL
         │        ▼
         │  ┌──────────────────┐
         │  │  S3/MinIO/GCS    │
         │  │  Object Storage  │
         │  └──────────────────┘
         │
         ▼
┌─────────────────────┐
│ Kubernetes Cluster  │
│                     │
│  ┌───────────────┐  │
│  │ Runner Job    │  │
│  │ (Pod)         │  │
│  │               │  │
│  │ - Downloads   │──┼──→ S3 (presigned URL)
│  │   source      │  │
│  │ - Runs        │  │
│  │   scanners    │  │
│  │ - Reports     │  │
│  │   findings    │──┼──→ Orchestrator.CreateFindings()
│  └───────────────┘  │    Orchestrator.UpdateScan()
└─────────────────────┘
         │
         ▼
┌─────────────────────┐
│   PostgreSQL        │
│   - Scans           │
│   - Findings        │
│   - Projects        │
└─────────────────────┘

Code Structure

cloudscan-orchestrator
├── cmd/
│   └── main.go                    # Application entrypoint
├── pkg/
│   ├── controller/                # Main controller & component manager
│   │   ├── controller.go
│   │   ├── grpc.go               # gRPC server component
│   │   └── http.go               # HTTP server component
│   ├── dispatcher/                # Kubernetes job dispatcher
│   │   ├── dispatcher.go         # Job creation & dispatch logic
│   │   └── job_spec.go           # K8s Job spec builder
│   ├── handlers/
│   │   ├── grpc/                 # gRPC service implementations
│   │   │   ├── scans.go         # Scan management
│   │   │   ├── jobs.go          # Job operations
│   │   │   └── health.go        # Health checks
│   │   └── http/                 # HTTP handlers (optional REST API)
│   │       ├── scans.go
│   │       └── middleware.go
│   ├── persistence/               # Database layer
│   │   ├── scans.go              # Scan CRUD operations
│   │   ├── findings.go           # Findings storage
│   │   ├── projects.go           # Project management
│   │   └── users.go              # User management
│   ├── sweeper/                   # Background worker: job status monitor
│   │   └── sweeper.go
│   ├── cleaner/                   # Background worker: retention cleanup
│   │   └── cleaner.go
│   ├── authentication/            # Auth providers (JWT, mTLS)
│   │   ├── jwt.go
│   │   └── mtls.go
│   ├── metrics/                   # Prometheus metrics
│   │   └── metrics.go
│   └── config/                    # Configuration management
│       └── config.go
├── proto/                         # Protocol buffers definitions
│   └── scans.proto               # Scan service gRPC API
├── migrations/                    # Database migrations (SQL)
│   ├── 001_initial_schema.up.sql
│   └── 001_initial_schema.down.sql
├── Dockerfile
├── go.mod
├── go.sum
└── README.md

🚀 Quick Start

Prerequisites

  • Go 1.23+
  • PostgreSQL 15+
  • Kubernetes cluster (for job dispatching)
  • kubectl configured

Development Setup

# Clone the repository, then enter the checkout
cd cloudscan-orchestrator

# Install dependencies
go mod download

# Run PostgreSQL locally (via Docker)
docker run --name postgres \
  -e POSTGRES_PASSWORD=postgres \
  -e POSTGRES_DB=cloudscan \
  -p 5432:5432 \
  -d postgres:15

# Run database migrations
# TODO: Add migration tool (e.g., golang-migrate)

# Run the service
go run cmd/main.go \
  --db-host=localhost \
  --db-port=5432 \
  --db-name=cloudscan \
  --db-user=postgres \
  --db-password=postgres \
  --grpc-port=9999 \
  --http-port=8081

Configuration

Configuration can be provided via:

  1. Environment variables (recommended for production)
  2. Config file (config.yaml)
  3. Command-line flags

Environment Variables:

# Database
export DB_HOST=localhost
export DB_PORT=5432
export DB_NAME=cloudscan
export DB_USER=postgres
export DB_PASSWORD=postgres

# Kubernetes
export KUBE_NAMESPACE=cloudscan
export KUBE_IN_CLUSTER=false  # Set to true when running in K8s

# Ports
export GRPC_PORT=9999
export HTTP_PORT=8081

# Observability
export PROMETHEUS_PORT=9090
export JAEGER_URL=http://jaeger:14268/api/traces

# Storage Service
export STORAGE_SERVICE_URL=cloudscan-storage:8082

📡 API

gRPC API

The orchestrator exposes gRPC services defined in proto/scans.proto:

Key RPCs:

  • CreateScan - Start a new security scan
  • GetScan - Retrieve scan status and results
  • ListScans - List scans with filters
  • CancelScan - Cancel a running scan
  • GetFindings - Get security findings for a scan
  • UpdateScan - Update scan metadata

Example gRPC call:

grpcurl -plaintext \
  -d '{"project_id": "proj-123", "scan_types": ["sast", "sca"]}' \
  localhost:9999 \
  cloudscan.ScanService.CreateScan

HTTP API (Optional)

If enabled, provides RESTful endpoints:

POST   /api/v1/scans              # Create scan
GET    /api/v1/scans/:id          # Get scan details
GET    /api/v1/scans              # List scans
DELETE /api/v1/scans/:id          # Cancel scan
GET    /api/v1/scans/:id/findings # Get findings

🔄 Background Workers

Sweeper

Monitors Kubernetes Jobs and updates scan status:

  • Polls K8s Job status every 30 seconds
  • Updates scan state: queued → running → completed/failed
  • Handles job failures and retries
  • Cleans up completed jobs after retention period
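
The state transitions the sweeper applies on each poll can be sketched as a pure function. This is a hedged illustration, not the logic in pkg/sweeper/sweeper.go: `JobCounts` is a hypothetical stand-in for the Active, Succeeded, and Failed pod counters that a Kubernetes JobStatus reports, and `maxRetries` is an assumed retry budget.

```go
package main

import "fmt"

// ScanStatus holds the scan states named in this README.
type ScanStatus string

const (
	StatusQueued    ScanStatus = "queued"
	StatusRunning   ScanStatus = "running"
	StatusCompleted ScanStatus = "completed"
	StatusFailed    ScanStatus = "failed"
)

// JobCounts is a hypothetical stand-in for the pod counters of a Kubernetes JobStatus.
type JobCounts struct {
	Active, Succeeded, Failed int32
}

// nextStatus derives the scan status from the current status and the latest
// job counters. maxRetries is an assumed retry budget, not a documented setting.
func nextStatus(current ScanStatus, job JobCounts, maxRetries int32) ScanStatus {
	// Terminal states never change.
	if current == StatusCompleted || current == StatusFailed {
		return current
	}
	switch {
	case job.Succeeded > 0:
		return StatusCompleted
	case job.Failed > maxRetries:
		return StatusFailed
	case job.Active > 0 || job.Failed > 0:
		// A pod is running, or a failed pod is waiting to be retried.
		return StatusRunning
	default:
		// No pod has started yet, so the scan stays where it is.
		return current
	}
}

func main() {
	status := StatusQueued
	polls := []JobCounts{{}, {Active: 1}, {Failed: 1}, {Active: 1, Failed: 1}, {Succeeded: 1, Failed: 1}}
	for _, p := range polls {
		status = nextStatus(status, p, 2)
		fmt.Println(status)
	}
}
```

Keeping the transition logic free of Kubernetes and database calls makes it easy to unit test; the polling loop would only fetch counters and persist the result.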

Cleaner

Enforces data retention policies:

  • Deletes scans older than configured retention (default: 90 days)
  • Removes associated findings and artifacts
  • Runs daily at midnight
  • Configurable retention per organization

🗄️ Database Schema

Key tables:

organizations - Multi-tenant isolation

CREATE TABLE organizations (
    id UUID PRIMARY KEY,
    name TEXT NOT NULL,
    slug TEXT UNIQUE NOT NULL,
    created_at TIMESTAMP DEFAULT NOW()
);

scans - Scan metadata (partitioned by date)

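CREATE TABLE scans (
    id UUID NOT NULL,
    organization_id UUID NOT NULL,
    project_id UUID REFERENCES projects(id),
    status TEXT NOT NULL, -- queued, running, completed, failed
    scan_types TEXT[] NOT NULL,
    created_at TIMESTAMP NOT NULL DEFAULT NOW(),
    -- PostgreSQL requires the partition key in the primary key of a partitioned table
    PRIMARY KEY (id, created_at)
) PARTITION BY RANGE (created_at);

findings - Security vulnerabilities

CREATE TABLE findings (
    id UUID PRIMARY KEY,
    scan_id UUID,
    scan_created_at TIMESTAMP, -- illustrative column: needed to reference the composite key on scans
    severity TEXT NOT NULL, -- critical, high, medium, low
    title TEXT NOT NULL,
    file_path TEXT,
    line_number INT,
    FOREIGN KEY (scan_id, scan_created_at) REFERENCES scans (id, created_at)
);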

See migrations/ for full schema.


🔐 Authentication

Supports multiple authentication methods:

  1. JWT (recommended for UI/API access)

    • Tokens issued by API gateway
    • Validated on each request
  2. mTLS (for service-to-service)

    • Client certificates validated
    • Used by scanner runners
  3. Disabled (development only)

    • No authentication

📊 Metrics

Prometheus metrics exposed on /metrics:

Key metrics:

  • cloudscan_scans_total{status="completed|failed"} - Total scans
  • cloudscan_scan_duration_seconds - Scan duration histogram
  • cloudscan_queue_depth - Number of queued scans
  • cloudscan_findings_total{severity="critical|high|medium|low"} - Total findings
  • cloudscan_k8s_jobs_active - Active Kubernetes jobs

🧪 Testing

# Run unit tests
go test ./pkg/...

# Run with coverage
go test -cover ./pkg/...

# Run integration tests (requires PostgreSQL)
go test -tags=integration ./pkg/...

# Run specific package tests
go test ./pkg/dispatcher/...

🐳 Docker

Build

# Build for the host platform (pass --platform linux/amd64 to target amd64 explicitly)
docker build -t cloudscan/orchestrator:latest .

# Multi-platform build
docker buildx build \
  --platform linux/amd64,linux/arm64 \
  -t cloudscan/orchestrator:latest \
  --push .

Run

docker run -p 9999:9999 -p 8081:8081 \
  -e DB_HOST=postgres \
  -e DB_PASSWORD=secret \
  cloudscan/orchestrator:latest

🚒 Deployment

Kubernetes

apiVersion: apps/v1
kind: Deployment
metadata:
  name: cloudscan-orchestrator
spec:
  replicas: 3
  selector:
    matchLabels:
      app: cloudscan-orchestrator
  template:
    metadata:
      labels:
        app: cloudscan-orchestrator
    spec:
      serviceAccountName: cloudscan-orchestrator
      containers:
      - name: orchestrator
        image: cloudscan/orchestrator:latest
        ports:
        - containerPort: 9999  # gRPC
        - containerPort: 8081  # HTTP
        - containerPort: 9090  # Metrics
        env:
        - name: DB_HOST
          value: postgres
        - name: KUBE_IN_CLUSTER
          value: "true"
        resources:
          requests:
            cpu: 500m
            memory: 512Mi
          limits:
            cpu: 2000m
            memory: 2Gi

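Service Account: The cloudscan-orchestrator service account referenced above needs RBAC permissions to create and manage Kubernetes Jobs (the batch API group) in the namespace where runner Jobs are dispatched.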

See cloudscan-umbrella for complete Helm deployment.


🤝 Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make changes with tests
  4. Submit a pull request

Code style:

  • Follow Go conventions
  • Run gofmt and golint (golint is deprecated upstream; go vet and staticcheck are maintained alternatives)
  • Add unit tests for new features

📄 License

Apache 2.0 - See LICENSE
