⛅
The Light Behind Your Cloud
Data @ Google
Stars
This is a repo with links to everything you'd ever want to learn about data engineering.
A technical explainer by @kognise of how your computer runs programs, from start to finish.
Spark: The Definitive Guide's Code Repository
Implementing best practices for PySpark ETL jobs and applications.
This blog post explains a solution architecture for handling fast-changing reference data stored in DynamoDB through an AWS Glue streaming job.
Pipeline for migrating Lichess data into PostgreSQL.
🎓 A collection of interactive courses for the swirl R package.