Authors: Khaled El Emam, Lucy Mosquera
Publisher: O’Reilly Media, Inc.
Year: 2020
Language: English
Format: EPUB
Size: 10.1 MB
One challenge with big data and other secondary analytics initiatives is getting access to large and diverse data. Secondary analytics yields insights beyond the questions the data was originally collected to answer. This practical book introduces techniques for generating synthetic data (fake data generated from real data) that can support secondary analytics to help you understand customer behaviors, develop new products, or generate new revenue.
CTOs, CIOs, and directors of analytics will learn how synthetic data generation provides a way to make such data broadly available for secondary purposes while addressing many privacy concerns. Analysts will learn the principles and steps of synthetic data generation from real data sets. Business leaders will examine how synthetic data can help accelerate time to a solution.
Interest in synthetic data has been growing rapidly over the last few years, driven by two simultaneous trends. The first is the demand for large amounts of data to train and build artificial intelligence and machine learning (AIML) models. The second is recent work that has demonstrated effective methods to generate high-quality synthetic data. Both have resulted in the recognition that synthetic data can solve some difficult problems quite effectively, especially within the AIML community. Groups and businesses within companies like NVIDIA, IBM, and Alphabet, as well as agencies such as the US Census Bureau, have adopted different types of data synthesis methodologies to support model building, application development, and data dissemination.
Download Practical Synthetic Data Generation: Balancing Privacy and the Broad Availability of Data (Early Release)