Author: Anthony Sarkis
Publisher: O’Reilly Media, Inc.
Year: 2024
Pages: 332
Language: English
Format: True PDF, True EPUB (Retail Copy)
Size: 21.3 MB, 13.2 MB
Your training data has as much to do with the success of your data project as the algorithms themselves because most failures in AI systems relate to training data. But while training data is the foundation for successful AI and machine learning, there are few comprehensive resources to help you ace the process.
In this hands-on guide, author Anthony Sarkis—lead engineer for the Diffgram AI training data software—shows technical professionals, managers, and subject matter experts how to work with and scale training data, while illuminating the human side of supervising machines. Engineering leaders, data engineers, and data science professionals alike will gain a solid understanding of the concepts, tools, and processes they need to succeed with training data.
Data is all around us—videos, images, text, documents, as well as geospatial, multi-dimensional data, and more. Yet, in its raw form, this data is of little use to supervised machine learning (ML) and artificial intelligence (AI). How do we make use of this data? How do we record our intelligence so it can be reproduced through ML and AI? The answer is the art of training data—the discipline of making raw data useful.
With this book, you'll learn how to:
• Work effectively with training data, including schemas, raw data, and annotations
• Transform your work, team, or organization to be more AI/ML data-centric
• Clearly explain training data concepts to other staff, team members, and stakeholders
• Design, deploy, and ship training data for production-grade AI applications
• Recognize and correct new training-data-based failure modes such as data bias
• Confidently use automation to more effectively create training data
• Successfully maintain, operate, and improve training data systems of record
Who Should Read This Book?
This book is a foundational overview of training data. It’s ideally suited to those who are totally new to training data or just getting started with it. For intermediate practitioners, the later chapters provide unique value and insights that can’t be found anywhere else; in a nutshell, insider knowledge. I will highlight specific areas of interest for subject matter experts, workflow managers, directors of training data, data engineers, and data scientists. Computer science (CS) knowledge is not required, although knowing CS, machine learning, or data science will make more sections of the book accessible. I strive to make this book maximally accessible to data annotators, including subject matter experts, because they play a key part in training data, including supervising the system.
Download Training Data for Machine Learning: Human Supervision from Annotation to Data Science (Final)