Authors: Wee Hyong Tok, Amit Bahree, Senja Filipi
Publisher: O’Reilly Media, Inc.
Year: 2021
Pages: 200
Language: English
Format: epub
Size: 10.2 MB
Build products using deep learning, weakly supervised learning, and natural language processing without collecting millions of training records. This practical book explains how, and provides a how-to guide for actually shipping deep learning models, since most of these projects never leave the lab. Deep networks have enabled a proliferation of new applications that use unstructured data, but much of the work means collecting millions of records as well as labeled datasets. Author Russell Jurney from Data Syndrome helps machine-learning engineers, software engineers, deep learning engineers, and data scientists learn practical applications using several weakly supervised learning methods.
You'll explore:
• Semi-supervised learning: Combine a small amount of labeled data with a large amount of unlabeled data to train an improved final model.
• Transfer learning: Re-train existing models from a related domain using training data from the problem domain.
• Distant supervision: Combine low-quality labels from databases and other sources to create high-quality labels for the entire dataset.
• Model versioning and management: Start with a small labeled dataset and create a production-grade model from concept through deployment.
Who Should Read This Book:
The primary audience of the book will be professional and citizen data scientists who are already working on machine learning projects and face the typical challenges of getting good-quality labeled data for these projects. They will have working knowledge of the programming language Python and be familiar with machine learning libraries and tools.
Navigating This Book:
This book is organized roughly as follows:
• Chapter 1 provides a basic introduction to the field of weak supervision and how data scientists and machine learning engineers can use it as part of the data science process.
• Chapter 2 discusses how to get started with using Snorkel for weak supervision and introduces concepts in using Snorkel for data programming.
• Chapter 3 describes how to use Snorkel for labeling and provides code examples showing how to use Snorkel to label text and image datasets.
• Chapters 4 and 5 give practitioners an end-to-end understanding of how to use a weakly labeled dataset for text and image classification.
• Chapter 6 discusses the practical considerations for using Snorkel with large datasets and how to use Spark clusters to scale labeling.
Download Practical Weak Supervision: Doing More with Less Data (Final)