# Interim
```sh
# start the cluster
docker-compose up

# submit the Spark application in client mode
docker exec irah-spark-master spark-submit \
  --master spark://spark-master:7077 \
  --deploy-mode client \
  ./apps/main.py

# tear everything down
docker-compose down --volumes --remove-orphans
```
- Add pre-commit hooks (lint, formatting)
- Complete the Makefile
- Include cluster mode if possible (YARN)
- Create a requirements.in file and use uv to compile to requirements.txt
  `uv pip compile requirements.in`
- Dockerfile
    - Install all the OS packages required
    - Set the PATH for Hadoop and Spark
    - Pull the Spark base image
    - Create a spark-defaults.conf file to set common configuration
    - Create an entrypoint.sh that helps submit the Spark application
    - With this setup we can run the Spark app in client mode
        - Client mode means the driver runs on the machine that submits the application
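The Dockerfile steps above could be sketched roughly as follows. This is a sketch only: the base image tag, package list, config paths, and file names (`conf/spark-defaults.conf`, `entrypoint.sh`) are assumptions, not taken from the repo.

```dockerfile
# Sketch only: base image, versions, and paths are assumptions.
FROM apache/spark:3.5.0

USER root
# OS-level packages the app might need (assumed list)
RUN apt-get update && apt-get install -y --no-install-recommends \
        python3 python3-pip curl && \
    rm -rf /var/lib/apt/lists/*

# Make the Spark tools available on PATH
ENV SPARK_HOME=/opt/spark
ENV PATH="$SPARK_HOME/bin:$SPARK_HOME/sbin:$PATH"

# Project-specific Spark configuration and entrypoint
COPY conf/spark-defaults.conf $SPARK_HOME/conf/spark-defaults.conf
COPY entrypoint.sh /opt/entrypoint.sh
RUN chmod +x /opt/entrypoint.sh

# Python dependencies compiled with uv
COPY requirements.txt /tmp/requirements.txt
RUN pip3 install --no-cache-dir -r /tmp/requirements.txt

ENTRYPOINT ["/opt/entrypoint.sh"]
```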
- Docker compose
    - spark-master
        - image name
        - entrypoint
        - In volumes, mount the repo folder that holds all the scripts
            - For access failures (SELinux), append `:z` to the volume mount
        - Reference the env file for the Spark configuration
    - spark-history-server
    - spark-worker
        - Points at spark-master
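A minimal docker-compose.yml matching the notes above might look like this. Sketch only: the image name, env file, ports, and entrypoint arguments are assumptions; only the container name `irah-spark-master` and the master URL come from the commands earlier in this file.

```yaml
# Sketch only: image name, env file, and ports are assumptions.
services:
  spark-master:
    image: irah-spark            # same image for all three services
    container_name: irah-spark-master
    entrypoint: ["/opt/entrypoint.sh", "master"]
    env_file: .env.spark
    ports:
      - "8080:8080"              # master web UI
    volumes:
      - ./apps:/opt/spark/apps:z # :z avoids SELinux access failures

  spark-history-server:
    image: irah-spark
    entrypoint: ["/opt/entrypoint.sh", "history"]
    env_file: .env.spark
    ports:
      - "18080:18080"            # history server UI

  spark-worker:
    image: irah-spark
    entrypoint: ["/opt/entrypoint.sh", "worker"]
    env_file: .env.spark
    environment:
      - SPARK_MASTER=spark://spark-master:7077
    volumes:
      - ./apps:/opt/spark/apps:z
    depends_on:
      - spark-master
```

Workers can then be scaled out with `docker-compose up --scale spark-worker=3`.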
- Makefile
    - Create a Makefile that wraps all the command-line operations (up, submit, down, etc.)
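The Makefile could wrap the commands from the top of this file, roughly as below; the target names are assumptions, the recipes are the commands already listed above.

```makefile
# Sketch only: target names are assumptions; recipes mirror the commands above.
.PHONY: up submit down

up:
	docker-compose up -d

submit:
	docker exec irah-spark-master spark-submit \
		--master spark://spark-master:7077 \
		--deploy-mode client \
		./apps/main.py

down:
	docker-compose down --volumes --remove-orphans
```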
- Advantages of this setup
    - Workers can be scaled out independently
    - The same image is used for all three containers
    - Everything is controllable through the Makefile