diff --git a/docs/.ice.yaml b/docs/.ice.yaml
new file mode 100644
index 0000000..a0e9c16
--- /dev/null
+++ b/docs/.ice.yaml
@@ -0,0 +1,7 @@
+uri: http://localhost:8181
+s3:
+  endpoint: http://localhost:9000
+  pathStyleAccess: true
+  accessKeyID: minio
+  secretAccessKey: minio123
+  region: us-east-1
\ No newline at end of file
diff --git a/docs/architecture.md b/docs/architecture.md
new file mode 100644
index 0000000..b9833eb
--- /dev/null
+++ b/docs/architecture.md
@@ -0,0 +1,70 @@
+# ICE REST Catalog Architecture
+
+![ICE REST Catalog Architecture](ice-rest-catalog-architecture.drawio.png)
+
+## Components
+
+- **ice-rest-catalog**: Stateless REST API service (Kubernetes Deployment)
+- **etcd**: Distributed key-value store for catalog state (Kubernetes StatefulSet)
+- **Object Storage**: S3-compatible storage for data files
+- **Clients**: ClickHouse or other Iceberg-compatible engines
+
+## Design Principles
+
+### Stateless Catalog
+
+The `ice-rest-catalog` is completely stateless and deployed as a Kubernetes Deployment with multiple replicas.
+It can be scaled horizontally without coordination. The catalog does not store any state locally; all metadata is persisted in etcd.
+
+### State Management
+
+All catalog state (namespaces, tables, schemas, snapshots, etc.) is maintained in **etcd**, a distributed, consistent key-value store.
+Each etcd instance runs as a StatefulSet pod with persistent storage, ensuring data durability across restarts.
+
+### Service Discovery
+
+`ice-rest-catalog` reaches the etcd cluster through its Kubernetes Service and uses the [jetcd](https://github.com/etcd-io/jetcd) library to interact with etcd.
+Within the etcd cluster, data is replicated to every member, and client requests are distributed across the members in round-robin fashion.
+
+### High Availability
+
+- Multiple `ice-rest-catalog` replicas behind a load balancer
+- Multi-member etcd cluster with quorum-based consensus
+- Persistent volumes for etcd data
+- S3 for durable object storage
+
+## Backup/Recovery
+
+All state information for the catalog is maintained in etcd. To back up the ICE REST Catalog state, you can use standard etcd snapshot tools. The official etcd documentation provides guidance on [snapshotting and recovery](https://etcd.io/docs/v3.5/op-guide/recovery/).
+
+**Backup etcd Example**:
+```shell
+etcdctl --endpoints=<ENDPOINTS> \
+  --cacert=<CA_CERT> \
+  --cert=<CLIENT_CERT> \
+  --key=<CLIENT_KEY> \
+  snapshot save /path/to/backup.db
+```
+
+Replace the placeholder arguments as appropriate for your deployment (for example, endpoints, authentication, and TLS options); a snapshot-verification example follows at the end of this section.
+
+**Restore etcd Example**:
+```shell
+etcdctl snapshot restore /path/to/backup.db \
+  --data-dir /var/lib/etcd
+```
+
+The ICE REST Catalog is designed such that if you restore etcd and point the catalog services at the restored etcd cluster, all catalog state (databases, tables, schemas, snapshots) will be recovered automatically.
+
+**Note:** Data files themselves (table/parquet data) are stored in Object Storage (e.g., S3, MinIO) and should be backed up or protected in accordance with your object storage vendor's recommendations.
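+
+To sanity-check a backup before you need it, you can inspect the snapshot file. A minimal example, assuming etcd v3.5 tooling (newer releases expose the same command via `etcdutl`):
+
+```shell
+# Prints the snapshot's hash, revision, total key count, and size
+etcdctl snapshot status /path/to/backup.db -w table
+```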
+
+### k8s Manifest Files
+
+Kubernetes deployment manifests and configuration files are available in the [`examples/eks`](../examples/eks/) folder:
+
+- [`etcd.eks.yaml`](../examples/eks/etcd.eks.yaml) - etcd StatefulSet deployment
+- [`ice-rest-catalog.eks.envsubst.yaml`](../examples/eks/ice-rest-catalog.eks.envsubst.yaml) - ice-rest-catalog Deployment (requires envsubst)
+- [`eks.envsubst.yaml`](../examples/eks/eks.envsubst.yaml) - Combined EKS deployment template
+
+See the [EKS README](../examples/eks/README.md) for detailed setup instructions.
\ No newline at end of file
diff --git a/docs/ice-rest-catalog-architecture.drawio b/docs/ice-rest-catalog-architecture.drawio
new file mode 100644
index 0000000..8c2430e
--- /dev/null
+++ b/docs/ice-rest-catalog-architecture.drawio
@@ -0,0 +1,244 @@
+[draw.io XML content not shown in this view]
diff --git a/docs/ice-rest-catalog-architecture.drawio.png b/docs/ice-rest-catalog-architecture.drawio.png
new file mode 100644
index 0000000..0775acb
Binary files /dev/null and b/docs/ice-rest-catalog-architecture.drawio.png differ
diff --git a/docs/ice-rest-catalog-k8s.yaml b/docs/ice-rest-catalog-k8s.yaml
new file mode 100644
index 0000000..e0c0781
--- /dev/null
+++ b/docs/ice-rest-catalog-k8s.yaml
@@ -0,0 +1,590 @@
+# =============================================================================
+# ice-rest-catalog Kubernetes Manifests
+# =============================================================================
+# Deploy with: kubectl apply -f ice-rest-catalog-k8s.yaml
+#
+# For kind cluster access (run these commands to access services locally):
+#   kubectl port-forward svc/ice-rest-catalog 8181:8181 -n iceberg-system &
+#   kubectl port-forward svc/minio 9000:9000 -n iceberg-system &
+#   kubectl port-forward svc/minio 9001:9001 -n iceberg-system &
+#
+# Access URLs:
+#   - ice-rest-catalog: http://localhost:8181
+#   - minio API:        http://localhost:9000
+#   - minio console:    http://localhost:9001
+#
+# For production use, consider using LoadBalancer or Ingress instead of NodePort
+# =============================================================================
+
+---
+# Namespace
+apiVersion: v1
+kind: Namespace
+metadata:
+  name: iceberg-system
+  labels:
+    app.kubernetes.io/name: iceberg-system
+    app.kubernetes.io/part-of: ice-rest-catalog
+
+---
+# =============================================================================
+# SECRETS & CONFIGMAPS
+# =============================================================================
+
+# MinIO Credentials Secret
+apiVersion: v1
+kind: Secret
+metadata:
+  name: minio-credentials
+  namespace: iceberg-system
+  labels:
+    app.kubernetes.io/name: minio
+    app.kubernetes.io/part-of: ice-rest-catalog
+type: Opaque
+stringData:
+  MINIO_ROOT_USER: "minio"
+  MINIO_ROOT_PASSWORD: "minio123"
+
+---
+# ice-rest-catalog Configuration
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: ice-rest-catalog-config
+  namespace: iceberg-system
+  labels:
+    app.kubernetes.io/name: ice-rest-catalog
+data:
+  config.yaml: |
+    # etcd connection - uses DNS SRV discovery via headless service
+    uri: etcd:http://etcd.iceberg-system.svc.cluster.local:2379
+
+    # S3/MinIO warehouse configuration
+    warehouse: s3://warehouse
+
+    # S3 settings for MinIO
+    s3:
+      endpoint: http://minio.iceberg-system.svc.cluster.local:9000
+      pathStyleAccess: true
+      accessKeyID: minio
+      secretAccessKey: minio123
+      region: us-east-1
+
+    # Server address
+    addr: 0.0.0.0:8181
+
+    # Anonymous access for development/testing
+    anonymousAccess:
+      enabled: true
+      accessConfig: {}
+
+---
+# ice-rest-catalog Secrets
+apiVersion: v1
+kind: Secret
+metadata:
+  name: ice-rest-catalog-secrets
+  namespace: iceberg-system
+  labels:
+    app.kubernetes.io/name: ice-rest-catalog
+type: Opaque
+stringData:
+  S3_ACCESS_KEY_ID: "minio"
+  S3_SECRET_ACCESS_KEY: "minio123"
+
+---
+# =============================================================================
+# ETCD CLUSTER
+# =============================================================================
+
+# etcd Headless Service for DNS SRV discovery
+apiVersion: v1
+kind: Service
+metadata:
+  name: etcd
+  namespace: iceberg-system
+  labels:
+    app.kubernetes.io/name: etcd
+    app.kubernetes.io/part-of: ice-rest-catalog
+spec:
+  clusterIP: None
+  publishNotReadyAddresses: true
+  ports:
+    - name: client
+      port: 2379
+      targetPort: 2379
+    - name: peer
+      port: 2380
+      targetPort: 2380
+  selector:
+    app.kubernetes.io/name: etcd
+
+---
+# etcd StatefulSet
+apiVersion: apps/v1
+kind: StatefulSet
+metadata:
+  name: etcd
+  namespace: iceberg-system
+  labels:
+    app.kubernetes.io/name: etcd
+    app.kubernetes.io/part-of: ice-rest-catalog
+spec:
+  serviceName: etcd
+  replicas: 3
+  podManagementPolicy: Parallel
+  selector:
+    matchLabels:
+      app.kubernetes.io/name: etcd
+  template:
+    metadata:
+      labels:
+        app.kubernetes.io/name: etcd
+        app.kubernetes.io/part-of: ice-rest-catalog
+    spec:
+      terminationGracePeriodSeconds: 30
+      containers:
+        - name: etcd
+          image: quay.io/coreos/etcd:v3.5.12
+          ports:
+            - name: client
+              containerPort: 2379
+            - name: peer
+              containerPort: 2380
+          env:
+            - name: POD_NAME
+              valueFrom:
+                fieldRef:
+                  fieldPath: metadata.name
+            - name: POD_NAMESPACE
+              valueFrom:
+                fieldRef:
+                  fieldPath: metadata.namespace
+            - name: ETCD_NAME
+              valueFrom:
+                fieldRef:
+                  fieldPath: metadata.name
+            - name: ETCD_DATA_DIR
+              value: /var/lib/etcd
+            - name: ETCD_INITIAL_CLUSTER_STATE
+              value: new
+            - name: ETCD_INITIAL_CLUSTER_TOKEN
+              value: etcd-cluster-iceberg
+            - name: ETCD_LISTEN_PEER_URLS
+              value: http://0.0.0.0:2380
+            - name: ETCD_LISTEN_CLIENT_URLS
+              value: http://0.0.0.0:2379
+            - name: ETCD_ADVERTISE_CLIENT_URLS
+              value: http://$(POD_NAME).etcd.$(POD_NAMESPACE).svc.cluster.local:2379
+            - name: ETCD_INITIAL_ADVERTISE_PEER_URLS
+              value: http://$(POD_NAME).etcd.$(POD_NAMESPACE).svc.cluster.local:2380
+            - name: ETCD_INITIAL_CLUSTER
+              value: etcd-0=http://etcd-0.etcd.iceberg-system.svc.cluster.local:2380,etcd-1=http://etcd-1.etcd.iceberg-system.svc.cluster.local:2380,etcd-2=http://etcd-2.etcd.iceberg-system.svc.cluster.local:2380
+          volumeMounts:
+            - name: etcd-data
+              mountPath: /var/lib/etcd
+          resources:
+            requests:
+              cpu: 100m
+              memory: 256Mi
+            limits:
+              cpu: 500m
+              memory: 512Mi
+          livenessProbe:
+            httpGet:
+              path: /health
+              port: 2379
+            initialDelaySeconds: 15
+            periodSeconds: 10
+            timeoutSeconds: 5
+            failureThreshold: 3
+          readinessProbe:
+            httpGet:
+              path: /health
+              port: 2379
+            initialDelaySeconds: 5
+            periodSeconds: 5
+            timeoutSeconds: 3
+            failureThreshold: 3
+  volumeClaimTemplates:
+    - metadata:
+        name: etcd-data
+        labels:
+          app.kubernetes.io/name: etcd
+      spec:
+        accessModes:
+          - ReadWriteOnce
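+        # Assumption: 10Gi per etcd member is a starting point; size PVCs to expected catalog metadata growth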
+        resources:
+          requests:
+            storage: 10Gi
+
+---
+# =============================================================================
+# MINIO (S3-Compatible Storage)
+# =============================================================================
+
+# MinIO NodePort Service for external access
+apiVersion: v1
+kind: Service
+metadata:
+  name: minio
+  namespace: iceberg-system
+  labels:
+    app.kubernetes.io/name: minio
+    app.kubernetes.io/part-of: ice-rest-catalog
+spec:
+  type: NodePort
+  ports:
+    - name: api
+      port: 9000
+      targetPort: 9000
+      nodePort: 30900
+    - name: console
+      port: 9001
+      targetPort: 9001
+      nodePort: 30901
+  selector:
+    app.kubernetes.io/name: minio
+
+---
+# MinIO Headless Service
+apiVersion: v1
+kind: Service
+metadata:
+  name: minio-headless
+  namespace: iceberg-system
+  labels:
+    app.kubernetes.io/name: minio
+spec:
+  clusterIP: None
+  ports:
+    - name: api
+      port: 9000
+      targetPort: 9000
+  selector:
+    app.kubernetes.io/name: minio
+
+---
+# MinIO StatefulSet
+apiVersion: apps/v1
+kind: StatefulSet
+metadata:
+  name: minio
+  namespace: iceberg-system
+  labels:
+    app.kubernetes.io/name: minio
+    app.kubernetes.io/part-of: ice-rest-catalog
+spec:
+  serviceName: minio-headless
+  replicas: 1
+  podManagementPolicy: Parallel
+  selector:
+    matchLabels:
+      app.kubernetes.io/name: minio
+  template:
+    metadata:
+      labels:
+        app.kubernetes.io/name: minio
+        app.kubernetes.io/part-of: ice-rest-catalog
+    spec:
+      terminationGracePeriodSeconds: 30
+      containers:
+        - name: minio
+          image: minio/minio:RELEASE.2024-01-31T20-20-33Z
+          args:
+            - server
+            - /data
+            - --console-address
+            - ":9001"
+          ports:
+            - name: api
+              containerPort: 9000
+            - name: console
+              containerPort: 9001
+          env:
+            - name: MINIO_ROOT_USER
+              valueFrom:
+                secretKeyRef:
+                  name: minio-credentials
+                  key: MINIO_ROOT_USER
+            - name: MINIO_ROOT_PASSWORD
+              valueFrom:
+                secretKeyRef:
+                  name: minio-credentials
+                  key: MINIO_ROOT_PASSWORD
+          volumeMounts:
+            - name: minio-data
+              mountPath: /data
+          resources:
+            requests:
+              cpu: 100m
+              memory: 256Mi
+            limits:
+              cpu: 1000m
+              memory: 1Gi
+          livenessProbe:
+            httpGet:
+              path: /minio/health/live
+              port: 9000
+            initialDelaySeconds: 30
+            periodSeconds: 20
+            timeoutSeconds: 10
+          readinessProbe:
+            httpGet:
+              path: /minio/health/ready
+              port: 9000
+            initialDelaySeconds: 10
+            periodSeconds: 10
+            timeoutSeconds: 5
+  volumeClaimTemplates:
+    - metadata:
+        name: minio-data
+        labels:
+          app.kubernetes.io/name: minio
+      spec:
+        accessModes:
+          - ReadWriteOnce
+        resources:
+          requests:
+            storage: 50Gi
+
+---
+# MinIO Bucket Setup Job
+apiVersion: batch/v1
+kind: Job
+metadata:
+  name: minio-bucket-setup
+  namespace: iceberg-system
+  labels:
+    app.kubernetes.io/name: minio-setup
+    app.kubernetes.io/part-of: ice-rest-catalog
+spec:
+  ttlSecondsAfterFinished: 300
+  template:
+    metadata:
+      labels:
+        app.kubernetes.io/name: minio-setup
+    spec:
+      restartPolicy: OnFailure
+      initContainers:
+        - name: wait-for-minio
+          image: busybox:1.36
+          command:
+            - sh
+            - -c
+            - |
+              echo "Waiting for MinIO to be ready..."
+              until wget -q --spider http://minio.iceberg-system.svc.cluster.local:9000/minio/health/ready; do
+                echo "MinIO not ready, waiting..."
+                sleep 5
+              done
+              echo "MinIO is ready!"
+      containers:
+        - name: mc
+          image: minio/mc:RELEASE.2024-01-31T08-59-40Z
+          command:
+            - sh
+            - -c
+            - |
+              mc alias set myminio http://minio.iceberg-system.svc.cluster.local:9000 $MINIO_ROOT_USER $MINIO_ROOT_PASSWORD
+              mc mb --ignore-existing myminio/warehouse
+              echo "Bucket 'warehouse' created successfully!"
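+          # mc reads its credentials from the env vars below, injected from the minio-credentials Secret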
+          env:
+            - name: MINIO_ROOT_USER
+              valueFrom:
+                secretKeyRef:
+                  name: minio-credentials
+                  key: MINIO_ROOT_USER
+            - name: MINIO_ROOT_PASSWORD
+              valueFrom:
+                secretKeyRef:
+                  name: minio-credentials
+                  key: MINIO_ROOT_PASSWORD
+
+---
+# =============================================================================
+# ICE-REST-CATALOG
+# =============================================================================
+
+# ServiceAccount
+apiVersion: v1
+kind: ServiceAccount
+metadata:
+  name: ice-rest-catalog
+  namespace: iceberg-system
+  labels:
+    app.kubernetes.io/name: ice-rest-catalog
+
+---
+# ice-rest-catalog NodePort Service for external access
+apiVersion: v1
+kind: Service
+metadata:
+  name: ice-rest-catalog
+  namespace: iceberg-system
+  labels:
+    app.kubernetes.io/name: ice-rest-catalog
+    app.kubernetes.io/part-of: ice-rest-catalog
+spec:
+  type: NodePort
+  ports:
+    - name: http
+      port: 8181
+      targetPort: 8181
+      nodePort: 30181
+      protocol: TCP
+  selector:
+    app.kubernetes.io/name: ice-rest-catalog
+
+---
+# ice-rest-catalog Deployment
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: ice-rest-catalog
+  namespace: iceberg-system
+  labels:
+    app.kubernetes.io/name: ice-rest-catalog
+    app.kubernetes.io/part-of: ice-rest-catalog
+spec:
+  replicas: 3
+  strategy:
+    type: RollingUpdate
+    rollingUpdate:
+      maxSurge: 1
+      maxUnavailable: 0
+  selector:
+    matchLabels:
+      app.kubernetes.io/name: ice-rest-catalog
+  template:
+    metadata:
+      labels:
+        app.kubernetes.io/name: ice-rest-catalog
+        app.kubernetes.io/part-of: ice-rest-catalog
+    spec:
+      terminationGracePeriodSeconds: 30
+      serviceAccountName: ice-rest-catalog
+      initContainers:
+        - name: wait-for-etcd
+          image: busybox:1.36
+          command:
+            - sh
+            - -c
+            - |
+              echo "Waiting for etcd cluster to be ready..."
+              until wget -q --spider http://etcd.iceberg-system.svc.cluster.local:2379/health; do
+                echo "etcd not ready, waiting..."
+                sleep 5
+              done
+              echo "etcd is ready!"
+        - name: wait-for-minio
+          image: busybox:1.36
+          command:
+            - sh
+            - -c
+            - |
+              echo "Waiting for MinIO to be ready..."
+              until wget -q --spider http://minio.iceberg-system.svc.cluster.local:9000/minio/health/ready; do
+                echo "MinIO not ready, waiting..."
+                sleep 5
+              done
+              echo "MinIO is ready!"
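+      # Main catalog container: serves the Iceberg REST API on 8181, reading config.yaml from the ConfigMap volume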
+      containers:
+        - name: ice-rest-catalog
+          image: altinity/ice-rest-catalog:latest
+          ports:
+            - name: http
+              containerPort: 8181
+              protocol: TCP
+          args:
+            - "-c"
+            - "/etc/ice-rest-catalog/config.yaml"
+          env:
+            - name: AWS_ACCESS_KEY_ID
+              valueFrom:
+                secretKeyRef:
+                  name: ice-rest-catalog-secrets
+                  key: S3_ACCESS_KEY_ID
+            - name: AWS_SECRET_ACCESS_KEY
+              valueFrom:
+                secretKeyRef:
+                  name: ice-rest-catalog-secrets
+                  key: S3_SECRET_ACCESS_KEY
+            - name: AWS_REGION
+              value: "us-east-1"
+          volumeMounts:
+            - name: config
+              mountPath: /etc/ice-rest-catalog
+              readOnly: true
+          resources:
+            requests:
+              cpu: 200m
+              memory: 512Mi
+            limits:
+              cpu: 1000m
+              memory: 1Gi
+          livenessProbe:
+            httpGet:
+              path: /v1/config
+              port: 8181
+            initialDelaySeconds: 30
+            periodSeconds: 15
+            timeoutSeconds: 5
+            failureThreshold: 3
+          readinessProbe:
+            httpGet:
+              path: /v1/config
+              port: 8181
+            initialDelaySeconds: 10
+            periodSeconds: 10
+            timeoutSeconds: 5
+            failureThreshold: 3
+      volumes:
+        - name: config
+          configMap:
+            name: ice-rest-catalog-config
+
+---
+# PodDisruptionBudget
+apiVersion: policy/v1
+kind: PodDisruptionBudget
+metadata:
+  name: ice-rest-catalog
+  namespace: iceberg-system
+  labels:
+    app.kubernetes.io/name: ice-rest-catalog
+spec:
+  minAvailable: 2
+  selector:
+    matchLabels:
+      app.kubernetes.io/name: ice-rest-catalog
+
+---
+# HorizontalPodAutoscaler
+apiVersion: autoscaling/v2
+kind: HorizontalPodAutoscaler
+metadata:
+  name: ice-rest-catalog
+  namespace: iceberg-system
+  labels:
+    app.kubernetes.io/name: ice-rest-catalog
+spec:
+  scaleTargetRef:
+    apiVersion: apps/v1
+    kind: Deployment
+    name: ice-rest-catalog
+  minReplicas: 3
+  maxReplicas: 10
+  metrics:
+    - type: Resource
+      resource:
+        name: cpu
+        target:
+          type: Utilization
+          averageUtilization: 70
+    - type: Resource
+      resource:
+        name: memory
+        target:
+          type: Utilization
+          averageUtilization: 80
diff --git a/docs/k8s_setup.md b/docs/k8s_setup.md
new file mode 100644
index 0000000..7722437
--- /dev/null
+++ b/docs/k8s_setup.md
@@ -0,0 +1,110 @@
+### k8s setup
+
+The file `ice-rest-catalog-k8s.yaml` contains the following components:
+
+| Component | K8s Resource Type | Replicas | Purpose |
+|-----------|-------------------|----------|---------|
+| ice-rest-catalog | Deployment | 3 | Stateless REST catalog service (horizontally scalable) |
+| etcd | StatefulSet | 3 | Distributed key-value store for catalog metadata |
+| minio | StatefulSet | 1 | S3-compatible object storage for Iceberg data |
+
+```
+kubectl get pods -n iceberg-system
+NAME                               READY   STATUS    RESTARTS   AGE
+etcd-0                             1/1     Running   0          19h
+etcd-1                             1/1     Running   0          19h
+etcd-2                             1/1     Running   0          19h
+ice-rest-catalog-dcdd9cb99-6gd8h   1/1     Running   0          15h
+ice-rest-catalog-dcdd9cb99-bh7kt   1/1     Running   0          15h
+ice-rest-catalog-dcdd9cb99-hdx8c   1/1     Running   0          15h
+minio-0                            1/1     Running   0          19h
+```
+
+---
+
+### Replacing MinIO with AWS S3
+
+For production deployments, you can replace MinIO with AWS S3. Follow these steps:
+
+#### 1. Remove MinIO Resources
+
+Delete or comment out these sections from `ice-rest-catalog-k8s.yaml`:
+- `minio-credentials` Secret
+- `minio` Service (NodePort)
+- `minio-headless` Service
+- `minio` StatefulSet
+- `minio-bucket-setup` Job
+
+#### 2. Update the ConfigMap
+
+Replace the `ice-rest-catalog-config` ConfigMap with S3 settings:
+
+```yaml
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: ice-rest-catalog-config
+  namespace: iceberg-system
+data:
+  config.yaml: |
+    uri: etcd:http://etcd.iceberg-system.svc.cluster.local:2379
+
+    # Use your S3 bucket path
+    warehouse: s3://your-bucket-name/warehouse
+
+    # S3 settings (remove endpoint and pathStyleAccess for real S3)
+    s3:
+      region: us-east-1
+
+    addr: 0.0.0.0:8181
+
+    anonymousAccess:
+      enabled: true
+      accessConfig: {}
+```
+
+#### 3. Update the Secret with AWS Credentials
+
+```yaml
+apiVersion: v1
+kind: Secret
+metadata:
+  name: ice-rest-catalog-secrets
+  namespace: iceberg-system
+type: Opaque
+stringData:
+  S3_ACCESS_KEY_ID: "<your-aws-access-key-id>"
+  S3_SECRET_ACCESS_KEY: "<your-aws-secret-access-key>"
+```
+
+#### 4. Remove MinIO Init Container
+
+In the `ice-rest-catalog` Deployment, remove the `wait-for-minio` init container:
+
+```yaml
+initContainers:
+  - name: wait-for-etcd
+    # ... keep this one
+  # Remove the wait-for-minio init container
+```
+
+#### 5. Create the S3 Bucket
+
+Ensure your S3 bucket exists before deploying:
+
+```bash
+aws s3 mb s3://your-bucket-name --region us-east-1
+```
+
+#### Summary of Changes
+
+| Resource | Action |
+|----------|--------|
+| `minio-credentials` Secret | Remove |
+| `minio` Service | Remove |
+| `minio-headless` Service | Remove |
+| `minio` StatefulSet | Remove |
+| `minio-bucket-setup` Job | Remove |
+| `ice-rest-catalog-config` ConfigMap | Update (remove endpoint, pathStyleAccess) |
+| `ice-rest-catalog-secrets` Secret | Update (use AWS credentials) |
+| `ice-rest-catalog` Deployment | Remove `wait-for-minio` init container |
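+
+#### Verifying the Deployment
+
+Once the changes are applied, a quick smoke test (a sketch, assuming `kubectl` access and the `aws` CLI) is to query the catalog's `/v1/config` endpoint, the same path the manifests use for liveness/readiness probes, and confirm the warehouse prefix is reachable:
+
+```bash
+# Wait for the rollout, then port-forward and query the REST config endpoint
+kubectl rollout status deployment/ice-rest-catalog -n iceberg-system
+kubectl port-forward svc/ice-rest-catalog 8181:8181 -n iceberg-system &
+curl -s http://localhost:8181/v1/config
+
+# Iceberg data/metadata will appear under the warehouse prefix once tables are created
+aws s3 ls s3://your-bucket-name/warehouse/ --recursive | head
+```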