Build production-grade data pipelines that scale, from your first ETL workflow to distributed systems handling real-time data under heavy load.
Data engineering isn’t about scripts that work once. It’s about systems that process massive volumes of data continuously, survive failures, and deliver results when it matters. As datasets grow and systems become distributed, the real challenge is no longer writing code — it’s designing pipelines that scale, perform, and remain reliable in production.
This book takes you from zero to production-ready data systems with a practical, no-nonsense approach. You’ll start by understanding how distributed processing actually works — why single machines fail at scale, and how parallelism, latency, and throughput define system performance. Then you’ll build a complete pipeline from scratch, implementing extraction, transformation, and loading while adding logging, monitoring, and debugging practices used in real-world systems.
As your pipeline grows, you’ll move beyond basics into the problems that break most systems. You’ll learn how to partition large datasets correctly, eliminate bottlenecks caused by skewed data, and process streaming data in real time. You’ll integrate message brokers to decouple services and build pipelines that don’t collapse under load.
You’ll design systems that tolerate failure by default, implement checkpointing and recovery mechanisms, and optimise performance using profiling and resource tuning. Security is treated as a core requirement, not an afterthought, with practical approaches to encryption, access control, and audit logging.
You’ll then step into operating data systems at scale — building monitoring and observability pipelines, setting up alerting, managing infrastructure costs, and testing systems under real-world conditions. The book concludes with deployment strategies using CI/CD, zero-downtime updates, and advanced architectures like Lambda, Kappa, and event-driven systems used in modern data platforms.
Key Features
From: California Books, Miami, FL, U.S.A.
Condition: New. Print on Demand. Item code: I-9798195654290
Quantity: More than 20 available