In this project, we have created applications that extract, transform, and load data from the totesys database into a data lake and warehouse hosted in AWS. We have used Terraform to manage the AWS infrastructure. The pipeline will:
- Extract raw data from the totesys database using a Lambda function and store it in an S3 bucket (a rough sketch of this step follows the list)
- Clean and standardise the raw data using a Lambda function, and store the cleaned data in another S3 bucket
- Transform the data into a 'star' schema of fact and dimension tables using another Lambda function (also sketched below)
- Store the transformed data as Parquet files in another S3 bucket
- Create a database with the transformed data to act as the data warehouse (which will contain a full history of all updates to the fact tables)
- Automate the pipeline by creating a Step Function and adding a job scheduler using EventBridge to trigger the Lambda functions
- Log the progress of the pipeline using CloudWatch
- Create a visual presentation that allows users to view useful data in the warehouse
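
As a rough illustration of the extract step (not the project's actual code), a Lambda handler along these lines would pull rows from a totesys table and drop them into the raw-data bucket as JSON. The table name, bucket name, and use of psycopg2 here are assumptions made for the sketch.

```python
import json
import os
from datetime import datetime, timezone

import boto3
import psycopg2  # assumption: any Postgres driver bundled as a Lambda layer would do


def lambda_handler(event, context):
    """Pull every row from one totesys table and store it as raw JSON in S3."""
    conn = psycopg2.connect(
        host=os.environ["PG_HOST"],
        port=os.environ["PG_PORT"],
        dbname=os.environ["PG_DATABASE"],
        user=os.environ["PG_USER"],
        password=os.environ["PG_PASSWORD"],
    )
    s3 = boto3.client("s3")
    table = event.get("table", "sales_order")  # hypothetical default table name
    with conn, conn.cursor() as cur:
        cur.execute(f"SELECT * FROM {table};")  # sketch only; a real query should not interpolate input
        columns = [col[0] for col in cur.description]
        rows = [dict(zip(columns, row)) for row in cur.fetchall()]
    timestamp = datetime.now(timezone.utc).isoformat()
    key = f"{table}/{timestamp}.json"
    s3.put_object(
        Bucket="totesys-raw-data",  # hypothetical bucket name
        Key=key,
        Body=json.dumps(rows, default=str),  # default=str handles dates and decimals
    )
    return {"table": table, "rows": len(rows), "key": key}
```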
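
The transform step could similarly be sketched with pandas: reshape the cleaned data into dimension and fact tables, then write each table to the processed bucket as Parquet. The column names, bucket name, and function names below are illustrative assumptions, and `to_parquet` relies on pyarrow (or fastparquet) being available.

```python
import io

import boto3
import pandas as pd


def build_star_schema(sales_df: pd.DataFrame, staff_df: pd.DataFrame) -> dict:
    """Split cleaned data into one dimension table and one fact table (illustrative columns)."""
    dim_staff = staff_df[["staff_id", "first_name", "last_name", "department"]]
    fact_sales_order = sales_df[
        ["sales_order_id", "staff_id", "units_sold", "unit_price", "created_date"]
    ]
    return {"dim_staff": dim_staff, "fact_sales_order": fact_sales_order}


def write_parquet_to_s3(tables: dict, bucket: str = "totesys-processed-data"):
    """Serialise each table to Parquet in memory and upload it to the processed bucket."""
    s3 = boto3.client("s3")
    for name, df in tables.items():
        buffer = io.BytesIO()
        df.to_parquet(buffer, index=False)
        s3.put_object(Bucket=bucket, Key=f"{name}.parquet", Body=buffer.getvalue())
```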
To set up and deploy the project:

- Clone the repo
- Run `make all` to create and activate a virtual environment and download the required packages and libraries
- Set up a `.env` file with the following variables:
  - PG_USER={your_pg_username}
  - PG_PASSWORD=test_pass
  - PG_DATABASE=test_db
  - PG_HOST=localhost
  - PG_PORT=5432
  - ENV=dev
- Run `make run-pytest` and check all tests pass
- Run `make run-script` to create the S3 buckets and necessary layers (a rough sketch of bucket creation follows this list)
- Go into the terraform folder in your terminal and run `terraform init`
- Run `terraform plan`
- Run `terraform apply`
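
For context on the `make run-script` step, a bucket-creation helper might look roughly like the following; the bucket names, region, and function name are assumptions rather than the project's actual script.

```python
import boto3


def create_buckets(bucket_names, region="eu-west-2"):
    """Create the S3 buckets the pipeline writes to."""
    s3 = boto3.client("s3", region_name=region)
    for name in bucket_names:
        s3.create_bucket(
            Bucket=name,
            # LocationConstraint is required for any region other than us-east-1
            CreateBucketConfiguration={"LocationConstraint": region},
        )


if __name__ == "__main__":
    # Hypothetical bucket names for the raw and processed data
    create_buckets(["totesys-raw-data", "totesys-processed-data"])
```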