Title: Ultimate Multimodal Transformer Models: Master LLMs, Vision Transformers, RAG, AI Agents, Fine-Tuning, and Multimodal AI Systems with PyTorch and Hugging Face
Author: S. Mahesh Anand
Publisher: Orange Education Pvt Ltd, AVA
Year: 2026
Pages: 459
Language: English
Format: epub (true)
Size: 14.4 MB
One Architecture. Infinite Intelligence. Transformer architectures have become the unified foundation of modern AI, powering language models, computer vision systems, and multimodal applications that process text, images, and speech together. Ultimate Multimodal Transformer Models is a comprehensive, hands-on guide to every major Transformer variant, from foundational encoder-decoder architectures to cutting-edge vision-language models and production GenAI systems.

You begin with the core building blocks of the Transformer architecture and text data preparation, then advance progressively through encoder-only models, generative LLMs, RAG, agentic workflows, and efficient fine-tuning with PEFT, LoRA, and QLoRA. By the end of the book, you will be able to build, fine-tune, and deploy Transformer-based AI systems across text, vision, and multimodal domains with confidence, applying the right architecture and strategy for each real-world use case.

This book is written for Data Scientists, ML Engineers, AI Researchers, and Computer Vision Engineers who want to build and deploy Transformer-based AI applications. A working knowledge of Python, basic linear algebra, and fundamental deep learning concepts is expected; no prior Transformer experience is required.