
Building Trusted Data Platforms with Azure Databricks and GenAI: A Hands-On Guide to Creating Governed Data Products in a Lakehouse
Author(s): Manoj Kukreja
- Publisher: Packt Publishing - ebooks Account
- Publication Date: September 9, 2026
- Language: English
- Print length: 412 pages
- ISBN-10: 1806679779
- ISBN-13: 9781806679775
Book Description
A practical guide to building a modern, GenAI-powered data platform with a Lakehouse foundation, covering MDM, data mesh, AI enablement, streaming pipelines, observability, and cloud-driven architectures for trusted analytics.
Key Features
- Discover the characteristics of future-ready platforms: data mesh, automation, and observability
- Design trustworthy data products with contracts, federated governance, and decentralized ownership
- Understand how GenAI accelerates Lakehouse development and enables self‑service analytics
Book Description
Discover the defining hallmarks of future‑ready data platforms, including data mesh architectures, intelligent automation, and end‑to‑end data observability. Learn how to design and deliver trusted data products through data contracts, federated governance, decentralized domain ownership, and endorsed datasets. The book explores modern Lakehouse patterns with a strong focus on the medallion architecture, explaining how the bronze, silver, and gold layers transform raw data into analytics‑ready assets governed through Unity Catalog. You’ll gain practical guidance on MDM linkages, survivorship rules, and entity resolution to keep master data consistent across domains, and on real‑time and streaming pipelines that integrate with the Lakehouse. The book also focuses on self‑service analytics, showing how governed data products let business users explore, analyze, and derive insights independently. Finally, you’ll see how GenAI accelerates platform development through automated code generation with tools such as Claude Code and Databricks Genie Code, enabling faster pipeline creation, governance, and analytics delivery.
What you will learn
- Identify the hallmarks of future‑ready platforms: data mesh, automation, and observability
- Design trusted data products with contracts and governance
- Build Lakehouses with medallion architecture: bronze, silver, gold
- Apply Unity Catalog for governance and endorsed datasets
- Implement MDM using linkages, survivorship, and entity resolution
- Develop real‑time and streaming pipelines at scale
- Enable governed self‑service analytics for business users
- Use GenAI to generate code with Claude and Databricks Genie
Who this book is for
This book is crafted for aspiring data and AI/ML architects, engineers and analysts starting their data engineering journey and seeking a practical, hands‑on guide to building scalable, cloud‑driven data platforms. It’s ideal for professionals familiar with PySpark who want to design modern Lakehouse architectures using Delta Lake, while learning MDM, data mesh, AI enablement, streaming pipelines, automation, and data observability. A working knowledge of Python, Spark, and SQL is expected.
Table of Contents
- The Story of Data Engineering and Analytics
- Discovering Storage and Compute in Lakehouses
- Data Engineering on Microsoft Azure
- Designing Future Data Platforms
- Databricks, Medallion Architecture & Delta Lake
- Understanding Modern Data Pipelines
- Data Collection Stage – The Bronze Layer
- Data Curation Stage – The Silver Layer
- Data Aggregation Stage – The Gold Layer
- Next-Gen Data Analytics with Generative AI
- Data Observability
- Data Governance
Editorial Reviews
About the Author
Manoj Kukreja is a Principal Architect at Northbay Solutions who specializes in creating complex Data Lakes and Data Analytics Pipelines for large-scale organizations such as banks, insurance companies, universities, and US/Canadian government agencies. Previously, he worked for Pythian, a large managed service provider, where he led the MySQL and MongoDB DBA group and supported large-scale data infrastructure for enterprises across the globe. With over 25 years of IT experience, he has delivered Data Lake solutions on all major cloud providers, including AWS, Azure, GCP, and Alibaba Cloud. On weekends, he trains groups of aspiring Data Engineers and Data Scientists on Hadoop, Spark, Kafka, and Data Analytics on AWS and Azure Cloud.