Architecting Generative AI Applications: Build, deploy, and scale production-ready GenAI systems with LLMOps best practices

Author: Leonid Kuligin

  • Publisher: Packt Publishing
  • Publication Date: March 30, 2026
  • Language: English
  • Print length: 278 pages
  • ISBN-10: 1806678659
  • ISBN-13: 9781806678655

Book Description

Take generative AI applications from prototype to production by mastering LLM architectures, evaluation strategies, LLMOps workflows, and deployment pipelines, using proven approaches to build reliable, secure, and scalable systems.

Free with your book: DRM-free PDF version + access to Packt's next-gen Reader*

Key Features

  • Learn how to take generative AI apps from prototype to production
  • Apply evaluation, LLMOps, and SRE practices for reliable systems
  • Design scalable architectures using modern AI engineering patterns

Book Description

Build production-ready generative AI applications by moving beyond prototypes and applying proven engineering principles. This book shows you how to design, evaluate, deploy, and scale AI systems that remain reliable, secure, and maintainable in real-world environments.

Vibe-coding tools and coding assistants make it easy to create prototypes, but taking them into production is where most teams struggle. Written by a Staff AI Engineer at Google, this book guides you through scoping use cases, aligning them with business goals, and scaling generative AI adoption. You’ll learn how to evaluate LLMs using offline metrics, human-in-the-loop approaches, and statistical testing, as well as how to design architectures such as RAG, vector databases, agents, and memory systems.

You’ll also understand how to operationalize these systems with production-grade code, testing practices, and DevOps, MLOps, and LLMOps workflows. The book covers deployment, scaling, and key considerations for security, Responsible AI, observability, and reliability.

By the end of this book, you will be able to design, deploy, and maintain scalable generative AI applications, run A/B tests to measure impact, and apply durable engineering principles so your systems succeed beyond the prototype stage.

*Email sign-up and proof of purchase required

What you will learn

  • Design end-to-end generative AI product workflows
  • Build and evaluate AI systems with robust metrics
  • Implement production-ready code and testing practices
  • Apply LLMOps and automation for AI deployments
  • Architect scalable systems using modern AI patterns
  • Improve reliability with observability and SRE practices
  • Run A/B tests to measure product impact effectively

Who this book is for

Technical leaders, AI engineers, data scientists, software engineers, and architects building generative AI applications. Engineering managers, product leaders, and decision-makers seeking to deploy, scale, and maintain production-grade AI systems will also benefit.

Table of Contents

  1. Building a Prototype
  2. Evaluation
  3. Key Architectures
  4. From Prototype to Production
  5. Moving from DevOps and MLOps to LLMOps
  6. Deploying Your Application
  7. Ethics and Security
  8. Observability and Reliability
  9. Maintaining Your Application
  10. A/B Testing and Online Experiments

Editorial Reviews

Review

“Prototyping agents with new tools and coding assistants is easier than ever. But if we’re not careful, those prototypes never make it to production or fail once real users depend on them. This book focuses on closing that gap.

It provides practical guidance on evaluation, architecture, LLMOps, deployment, and reliability, treating production readiness as a core design requirement, not an afterthought. The emphasis on observability, testing, and scalability reflects what teams need to build systems that last.”

Roya Kandalan, Adjunct Professor, Northeastern University

“This book goes more broad into designing real-world GenAI systems, evaluating them, tuning them, and setting up ongoing LLMOps. It's one thing to build AI prototypes, but another to build a system that can scale, be maintained, are observable and reliable, and are ethical and secure. There are ideas in here that were new to me - like using MapReduce (yes, MapReduce!) as a tool for managing large context windows.

Definitely a worthwhile read for anyone looking to deploy AI in the real world, at scale.”

Frank Kane, Founder, Sundon Software LLC

About the Author

Leonid Kuligin is a staff AI engineer at Google Cloud, working on generative AI and classical machine learning solutions (such as agentic AI, demand forecasting, and optimization problems). Leonid is also an associate researcher at TUM University Hospital, Technical University of Munich. With over two decades of experience, Leonid has a track record of building B2C and B2B applications and solving users' problems in domains such as search, maps, knowledge extraction, and investment management at industry-leading German and Russian technology, financial, and retail companies.
