Name: Architecting Generative AI Applications: Build, deploy, and scale production-ready GenAI systems with LLMOps best practices
Author: Leonid Kuligin (Author)
ISBN: 9781806678655

Publisher: Packt Publishing
Publication Date: March 30, 2026
Language: English
Print length: 278 pages
ISBN-10: 1806678659
ISBN-13: 9781806678655

Book Description

Take generative AI applications from prototype to production by mastering LLM architectures, evaluation strategies, LLMOps workflows, and deployment pipelines, using proven approaches to build reliable, secure, and scalable systems

Free with your book: DRM-free PDF version + access to Packt's next-gen Reader*

Key Features

Learn how to take generative AI apps from prototype to production
Apply evaluation, LLMOps, and SRE practices for reliable systems
Design scalable architectures using modern AI engineering patterns

Book Description

Build production-ready generative AI applications by moving beyond prototypes and applying proven engineering principles. This book shows you how to design, evaluate, deploy, and scale AI systems that remain reliable, secure, and maintainable in real-world environments.

Vibe-coding tools and coding assistants make it easy to create prototypes, but taking them into production is where most teams struggle. Written by a Staff AI Engineer at Google, this book guides you through scoping use cases, aligning them with business goals, and scaling generative AI adoption. You’ll learn how to evaluate LLMs using offline metrics, human-in-the-loop approaches, and statistical testing, as well as how to design architectures such as RAG, vector databases, agents, and memory systems.

You’ll also understand how to operationalize these systems with production-grade code, testing practices, and DevOps, MLOps, and LLMOps workflows. The book covers deployment, scaling, and key considerations for security, Responsible AI, observability, and reliability.

By the end of this book, you will be able to design, deploy, and maintain scalable generative AI applications, run A/B tests to measure impact, and apply durable engineering principles so your systems succeed beyond the prototype stage.

*Email sign-up and proof of purchase required

What you will learn

Design end-to-end generative AI product workflows
Build and evaluate AI systems with robust metrics
Implement production-ready code and testing practices
Apply LLMOps and automation for AI deployments
Architect scalable systems using modern AI patterns
Improve reliability with observability and SRE practices
Run A/B tests to measure product impact effectively

Who this book is for

Technical leaders, AI engineers, data scientists, software engineers, and architects building generative AI applications. Engineering managers, product leaders, and decision-makers seeking to deploy, scale, and maintain production-grade AI systems will also benefit.

Table of Contents

Building a Prototype
Evaluation
Key Architectures
From Prototype to Production
Moving from DevOps and MLOps to LLMOps
Deploying Your Application
Ethics and Security
Observability and Reliability
Maintaining Your Application
A/B Testing and Online Experiments

Editorial Reviews

Review

“Prototyping agents with new tools and coding assistants is easier than ever. But if we’re not careful, those prototypes never make it to production or fail once real users depend on them. This book focuses on closing that gap.

It provides practical guidance on evaluation, architecture, LLMOps, deployment, and reliability, treating production readiness as a core design requirement, not an afterthought. The emphasis on observability, testing, and scalability reflects what teams need to build systems that last.”

Roya Kandalan, Adjunct Professor, Northeastern University

“This book takes a broader view of designing real-world GenAI systems, evaluating them, tuning them, and setting up ongoing LLMOps. It's one thing to build AI prototypes, but another to build a system that can scale and be maintained, is observable and reliable, and is ethical and secure. There are ideas in here that were new to me - like using MapReduce (yes, MapReduce!) as a tool for managing large context windows.

Definitely a worthwhile read for anyone looking to deploy AI in the real world, at scale.”

Frank Kane, Founder, Sundon Software LLC

About the Author

Leonid Kuligin is a staff AI engineer at Google Cloud, working on generative AI and classical machine learning solutions (such as agentic AI, demand forecasting, and optimization problems). Leonid is also an associate researcher at TUM University Hospital, Technical University of Munich. With over two decades of experience, Leonid has a track record of building B2C and B2B applications and solving users' problems in domains such as search, maps, knowledge extraction, and investment management in industry-leading German and Russian technological, financial, and retail companies.
