This repository contains projects completed during my Udacity Nanodegree course.
Built a database schema and ETL pipeline for Sparkify's analytics. I tested both by running queries provided by Sparkify's analytics team and comparing my results with their expected results.
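
A minimal sketch of this kind of test, not the project's actual code. It uses an in-memory SQLite database so it runs anywhere. The table names (`users`, `songs`, `songplays`), the sample events, and the query are all hypothetical stand-ins for the real schema and the analytics team's queries.

```python
import sqlite3

# Hypothetical raw event records; the real Sparkify data is not shown here.
RAW_EVENTS = [
    {"user_id": 1, "name": "Ada", "song": "Song A", "artist": "Artist X"},
    {"user_id": 2, "name": "Lin", "song": "Song A", "artist": "Artist X"},
    {"user_id": 1, "name": "Ada", "song": "Song B", "artist": "Artist Y"},
]


def create_schema(conn):
    """Create an assumed star schema: two dimension tables and one fact table."""
    conn.executescript(
        """
        CREATE TABLE users (
            user_id INTEGER PRIMARY KEY,
            name    TEXT NOT NULL
        );
        CREATE TABLE songs (
            song_id INTEGER PRIMARY KEY,
            title   TEXT NOT NULL,
            artist  TEXT NOT NULL,
            UNIQUE (title, artist)
        );
        CREATE TABLE songplays (
            songplay_id INTEGER PRIMARY KEY,
            user_id     INTEGER NOT NULL REFERENCES users (user_id),
            song_id     INTEGER NOT NULL REFERENCES songs (song_id)
        );
        """
    )


def run_etl(conn, events):
    """Load each raw event into the dimension tables, then the fact table."""
    for e in events:
        # INSERT OR IGNORE keeps the dimension rows unique across repeated events.
        conn.execute(
            "INSERT OR IGNORE INTO users (user_id, name) VALUES (?, ?)",
            (e["user_id"], e["name"]),
        )
        conn.execute(
            "INSERT OR IGNORE INTO songs (title, artist) VALUES (?, ?)",
            (e["song"], e["artist"]),
        )
        song_id = conn.execute(
            "SELECT song_id FROM songs WHERE title = ? AND artist = ?",
            (e["song"], e["artist"]),
        ).fetchone()[0]
        conn.execute(
            "INSERT INTO songplays (user_id, song_id) VALUES (?, ?)",
            (e["user_id"], song_id),
        )
    conn.commit()


def plays_per_song(conn):
    """Example analytics query: number of plays per song title."""
    return conn.execute(
        """
        SELECT s.title, COUNT(*) AS plays
        FROM songplays sp
        JOIN songs s ON s.song_id = sp.song_id
        GROUP BY s.title
        ORDER BY plays DESC, s.title
        """
    ).fetchall()


conn = sqlite3.connect(":memory:")
create_schema(conn)
run_etl(conn, RAW_EVENTS)

# Compare the query result with the (hypothetical) expected result.
expected = [("Song A", 2), ("Song B", 1)]
assert plays_per_song(conn) == expected
```

The same pattern of loading the data, running the query, and comparing against the expected rows should carry over to other databases by swapping the connection and SQL dialect.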
Built an AWS Redshift database with tables designed to optimize queries.
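
One common way to design Redshift tables for query performance is to choose distribution and sort keys. Whether this project used them is an assumption. The sketch below is a small, hypothetical helper (`redshift_create_table` is not part of any library) that assembles a `CREATE TABLE` statement with optional `DISTKEY` and `SORTKEY` clauses; the table and column names are illustrative only.

```python
def redshift_create_table(name, columns, distkey=None, sortkeys=()):
    """Build a Redshift CREATE TABLE statement as a string.

    columns  -- list of (column_name, sql_type) pairs
    distkey  -- optional column used to distribute rows across slices
    sortkeys -- optional columns that define the on-disk sort order
    """
    col_names = [col for col, _ in columns]
    # Reject key columns that are not defined in the table.
    for key in ([distkey] if distkey else []) + list(sortkeys):
        if key not in col_names:
            raise ValueError(f"unknown key column: {key}")

    cols = ",\n    ".join(f"{col} {sql_type}" for col, sql_type in columns)
    ddl = f"CREATE TABLE IF NOT EXISTS {name} (\n    {cols}\n)"
    if distkey:
        ddl += f"\nDISTSTYLE KEY\nDISTKEY ({distkey})"
    if sortkeys:
        ddl += f"\nSORTKEY ({', '.join(sortkeys)})"
    return ddl + ";"


# Hypothetical fact table: distribute on the join column, sort on the time column.
ddl = redshift_create_table(
    "songplays",
    [
        ("songplay_id", "BIGINT"),
        ("start_time", "TIMESTAMP"),
        ("user_id", "INTEGER"),
        ("song_id", "VARCHAR(32)"),
    ],
    distkey="song_id",
    sortkeys=("start_time",),
)
print(ddl)
```

Distributing a fact table on a column it is frequently joined on can reduce data movement between nodes, and sorting on a column used in range filters can reduce the blocks scanned. The right choice depends on the actual query workload.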
Created ETL pipelines that extract data from AWS S3, process it using Spark, and load it back into S3 as a set of dimensional tables.
Built ETL pipelines using Airflow.
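
An Airflow pipeline is a DAG of tasks that run in dependency order. The sketch below illustrates only that core idea, using Python's standard-library `graphlib` rather than Airflow itself, so it runs without an Airflow installation. The task names and the `run_pipeline` helper are hypothetical and do not reflect the project's actual DAGs.

```python
from graphlib import TopologicalSorter


def run_pipeline(tasks, upstream):
    """Run task callables in an order that respects their dependencies.

    tasks    -- dict mapping task id to a zero-argument callable
    upstream -- dict mapping task id to the set of task ids it depends on
    """
    # static_order() yields every task after all of its upstream tasks.
    order = list(TopologicalSorter(upstream).static_order())
    results = [tasks[task_id]() for task_id in order]
    return order, results


# Hypothetical ETL steps; each callable stands in for real work.
tasks = {
    "stage_events": lambda: "staged",
    "load_fact": lambda: "fact loaded",
    "load_dimensions": lambda: "dimensions loaded",
    "quality_check": lambda: "checked",
}

# Both load steps need staged data; the check runs after both loads.
upstream = {
    "load_fact": {"stage_events"},
    "load_dimensions": {"stage_events"},
    "quality_check": {"load_fact", "load_dimensions"},
}

order, results = run_pipeline(tasks, upstream)
print(order)
```

In Airflow the same dependencies would be declared between operators, and the scheduler would add retries, scheduling, and parallel execution, none of which this sketch attempts.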