Title: Programming Big Data Applications: Scalable Tools and Frameworks for Your Needs
Authors: Domenico Talia, Paolo Trunfio, Fabrizio Marozzo, Loris Belcastro
Publisher: World Scientific Publishing
Year: 2024
Pages: 296
Language: English
Format: PDF (true)
Size: 11.3 MB
In the age of the Internet of Things and social media platforms, huge amounts of digital data are generated by and collected from many sources, including sensors, mobile devices, wearable trackers and security cameras. These data, commonly referred to as Big Data, are challenging current storage, processing and analysis capabilities. New models, languages, systems and algorithms continue to be developed to effectively collect, store, analyze and learn from Big Data.
Programming Big Data Applications introduces and discusses models, programming frameworks and algorithms to process and analyze large amounts of data. In particular, the book provides an in-depth description of the properties and mechanisms of the main programming paradigms for Big Data analysis, including MapReduce, workflow, BSP, message passing, and SQL-like. Through programming examples it also describes the most used frameworks for Big Data analysis like Hadoop, Spark, MPI, Hive and Storm. Each of the different systems is discussed and compared, highlighting their main features, their diffusion (both within their community of developers and among users), and their main advantages and disadvantages in implementing Big Data analysis applications.
This book describes and reviews parallel and distributed paradigms, languages, and systems used today to analyze and learn from Big Data on scalable computers. In particular, the book provides a detailed description of the properties and mechanisms of the main parallel programming paradigms, and through programming examples, it illustrates the most widely used frameworks for Big Data analysis. The final goal of this volume is to help designers and developers in programming Big Data applications by identifying and selecting the best or most appropriate programming tool(s) based on their skills, hardware availability, application domains, and purposes, while also considering the support provided by the developer community.