Authors: Age K. Smilde, Tormod Næs, Kristian Hovde Liland
Publisher: Wiley
Year: 2022
Pages: 418
Language: English
Format: PDF (true)
Size: 22.97 MB
Multiblock Data Fusion in Statistics and Machine Learning
Explore the advantages and shortcomings of various forms of multiblock analysis, and the relationships between them, with this expert guide.
Arising out of fusion problems that exist in a variety of fields in the natural and life sciences, the methods available to fuse multiple data sets have expanded dramatically in recent years. Older methods, rooted in psychometrics and chemometrics, also exist. Multiblock Data Fusion in Statistics and Machine Learning: Applications in the Natural and Life Sciences is a detailed overview of all relevant multiblock data analysis methods for fusing multiple data sets. It focuses on methods based on components and latent variables, including both well-known and lesser-known methods with potential applications in different types of problems.
Many of the included methods are illustrated by practical examples and are accompanied by a freely available R-package. The distinguished authors have created an accessible and useful guide to help readers fuse data, develop new data fusion models, discover how the involved algorithms and models work, and understand the advantages and shortcomings of various approaches.
A large proportion of the multiblock methods found in this book are available as software implementations, both in open source software and in commercial packages. A large, but non-exhaustive, selection is included in Section 11.10. As of 2022, the largest collection of methods is found in the programming language R, though spread over many packages with various interfaces. Our attempt at unifying a large number of methods and extending them with more is described below and exemplified in Section 11.2. MATLAB is also a popular programming environment for multiblock packages, with many open source contributions (see Section 11.10.2) in addition to the popular commercial PLS_Toolbox by Eigenvector. Further, Python is gaining popularity in multiblock analyses (see Section 11.10.3), and Excel still has a large user base among data analysts, e.g., using the XLSTAT add-on from Addinsoft. The rest of this chapter contains six sections describing the R package multiblock accompanying this book and a final section summarising a large proportion of the multiblock software available through other sources.
This book includes:
A thorough introduction to the different options available for the fusion of multiple data sets, including methods originating in psychometrics and chemometrics
Practical discussions of well-known and lesser-known methods with applications in a wide variety of data problems
Functional R code for applying many of the discussed methods
Perfect for graduate students studying data analysis in the context of the natural and life sciences, including bioinformatics, sensometrics, and chemometrics, Multiblock Data Fusion in Statistics and Machine Learning: Applications in the Natural and Life Sciences is also an indispensable resource for developers and users of the results of multiblock methods.