I'm Roy, a U.S. senior Data Engineer with 5+ years of experience. I architect, modernize, and operate batch and streaming data platforms on AWS-native tooling for clients in the manufacturing and financial services domains.
I've worked with:
- Batch ETL (Spark, Redshift)
- Real-time Streaming (Kafka, Flink, Aurora)
- Lakehouse Patterns (S3, Hudi, Athena)
- Data Modeling (PySpark, BigQuery, dbt)
- Platform IaC (Airflow, Kubernetes, Terraform)
- Governance (Lake Formation, KMS, IAM, CloudWatch)
My work combines independent ownership with measurable business impact: cost savings, latency reductions, operational scaling, and tech debt reduction.
Outside of work, I enjoy building personal projects in batch ETL and stream processing, and entering LLM-related coding competitions. I recently won a Silver Medal in a featured Kaggle LLM competition.
Here are some of the architectures I've worked with:
| Automated Data Marketplace | Hot/Cold Realtime Streaming |
|---|---|
| ![]() | ![]() |
| Batch ETL for Retail Analytics | Stream Processing - NASA's data |
| ![]() | ![]() |
I'm currently open to Data Engineering, DataOps, and MLOps roles.
- roy.ma9@gmail.com
- LinkedIn Profile
- I'm a U.S. citizen, permanently authorized to work in the U.S. with no visa sponsorship needed, and open to relocation.







