Welcome to my personal journey exploring Apache Iceberg, an open table format for large-scale analytics datasets. This repository tracks my experiments, findings, setup steps, and integrations with other tools such as Spark, Nessie, MinIO, Zeppelin, and Dremio.
Apache Iceberg is an open table format designed for huge analytic datasets. It brings SQL table-like features to data lakes: ACID transactions, schema evolution, time travel, partition evolution, and hidden partitioning.
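As a taste of what that means in practice, here is a minimal PySpark sketch of those features. It assumes a SparkSession `spark` already wired to an Iceberg catalog registered as `nessie` (see the configuration sketch further down); the namespace and table names are illustrative:

```python
# Minimal sketch of Iceberg's table features from a notebook.
# Assumes an existing SparkSession `spark` with an Iceberg catalog
# registered as "nessie"; namespace/table names are placeholders.
spark.sql("CREATE NAMESPACE IF NOT EXISTS nessie.demo")
spark.sql("CREATE TABLE IF NOT EXISTS nessie.demo.events (id BIGINT, msg STRING) USING iceberg")
spark.sql("INSERT INTO nessie.demo.events VALUES (1, 'hello iceberg')")

# Schema evolution: add a column without rewriting existing data files.
spark.sql("ALTER TABLE nessie.demo.events ADD COLUMN country STRING")

# Time travel: inspect snapshots, then query the table as of one of them.
spark.sql("SELECT snapshot_id, committed_at FROM nessie.demo.events.snapshots").show()
# Replace <snapshot_id> with a value returned by the query above:
# spark.sql("SELECT * FROM nessie.demo.events VERSION AS OF <snapshot_id>").show()
```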
- Apache Spark (data processing)
- Apache Iceberg (table format)
- Project Nessie (catalog service for versioned data)
- MinIO (S3-compatible object storage)
- Docker Compose (local orchestration)
- Zeppelin (interactive notebooks)
- Dremio (lakehouse platform)
```
.
├── docker-compose.yml    # Docker Compose setup
├── zeppelin_notebooks/   # JupyterLab / Zeppelin notebooks
├── zeppelin_conf/        # Zeppelin interpreter conf
├── spark/                # Spark bin
└── spark-jars/           # necessary Spark bundles
```

This guide sets up a local Apache Iceberg environment using Docker Compose. It includes Spark, Nessie (for catalog/versioning), MinIO (as S3-compatible storage), Dremio, Zeppelin, and Spark with Jupyter notebooks.
Make sure you have the following installed:
- Docker
- Docker Compose
- Git
- Python
```
git clone git@github.com:riju18/apache-iceberg-kickstart.git
cd apache-iceberg-kickstart
```

- Download via this link and place it into the root dir.
```
docker compose up
```

or, in detached mode:

```
docker compose up -d
```
This will start:
- Zeppelin: localhost:8090
- Nessie: localhost:19120
- MinIO: localhost:9001
- Dremio: localhost:9047
- JupyterLab: localhost:8888
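If you want to talk to the stack from your own script rather than the bundled notebooks, the SparkSession can be wired up roughly as below. This is a sketch under assumptions: the catalog name `nessie`, the `main` branch, the `admin`/`password` credentials shown later in this README, and MinIO's S3 API listening on port 9000 (9001 above is the web console); the Iceberg and Nessie jars are expected on the classpath, e.g. from `spark-jars/`.

```python
from pyspark.sql import SparkSession

# Sketch of a SparkSession wired to the Nessie catalog and MinIO storage
# started by docker compose. Endpoints, credentials, and the catalog name
# are assumptions matching the defaults in this README.
spark = (
    SparkSession.builder.appName("iceberg-kickstart")
    # Nessie as the Iceberg catalog, on the "main" branch
    .config("spark.sql.catalog.nessie", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.nessie.catalog-impl", "org.apache.iceberg.nessie.NessieCatalog")
    .config("spark.sql.catalog.nessie.uri", "http://localhost:19120/api/v1")
    .config("spark.sql.catalog.nessie.ref", "main")
    .config("spark.sql.catalog.nessie.warehouse", "s3a://warehouse/")
    # MinIO as S3-compatible storage (S3 API on 9000; 9001 is the console)
    .config("spark.hadoop.fs.s3a.endpoint", "http://localhost:9000")
    .config("spark.hadoop.fs.s3a.access.key", "admin")
    .config("spark.hadoop.fs.s3a.secret.key", "password")
    .config("spark.hadoop.fs.s3a.path.style.access", "true")
    .getOrCreate()
)
```

Pointing the catalog's `warehouse` property at the pre-created `warehouse` bucket (listed below) keeps table data and metadata files in one place.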
MinIO comes up with 4 pre-initialized buckets:
- datalake
- datalakehouse
- seed
- warehouse
Log in using:
- Username: admin
- Password: password
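Assuming these are the MinIO credentials, you can sanity-check storage access from Python with a quick bucket listing against the S3 API; a sketch, assuming `boto3` is installed and the API is on port 9000:

```python
import boto3

# List the pre-initialized MinIO buckets through the S3 API.
# Endpoint and credentials are assumptions taken from the defaults above.
s3 = boto3.client(
    "s3",
    endpoint_url="http://localhost:9000",
    aws_access_key_id="admin",
    aws_secret_access_key="password",
)
print([b["Name"] for b in s3.list_buckets()["Buckets"]])
# Expected: ['datalake', 'datalakehouse', 'seed', 'warehouse']
```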