Title: Data Science: An Emerging Trend in Engineering, Science & Technology
Authors: Mrunalini Moon, Ankita Shende, Monika Walde, Devki Nandgaye
Publisher: Independently published
Year: 2024
Pages: 432
Language: English
Format: epub
Size: 36.3 MB
Data Science is the in-depth study of massive amounts of data. It involves extracting meaningful insights from raw, structured, and unstructured data by applying the scientific method, a range of technologies, and algorithms. It is a multidisciplinary field that uses tools and techniques to manipulate data so that you can find something new and meaningful. Some of the most popular data science tools are Python, Hadoop, Spark, R, TensorFlow, BigML, MATLAB, and Excel.

NumPy stands for "Numerical Python". It is a Python package for scientific computing, built around a powerful N-dimensional array object. Plain Python lists can serve as arrays, but they are slow to process for numerical work; NumPy arrays are much faster. The package also offers linear algebra, Fourier transform, and random number capabilities, along with tools for integrating C and C++ code. NumPy has fast built-in aggregate and statistical functions for working on arrays, and a good working knowledge of these functions makes array manipulation straightforward.

Pandas is an open-source data analysis and data manipulation library written in Python. It provides data structures and functions for working with structured data seamlessly. The name Pandas derives from "panel data", a term for structured datasets. Pandas has two main classes to work with, DataFrame and Series.

Data visualization is the process of presenting data in the form of graphs or charts. Matplotlib is a low-level Python library used for data visualization. It is easy to use and emulates MATLAB-like graphs and visualizations. The library is built on top of NumPy arrays and supports several plot types, such as line charts, bar charts, and histograms. It provides a lot of flexibility, but at the cost of writing more code.
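As a rough illustration of the NumPy and Pandas features described above, the following minimal sketch builds an N-dimensional array, applies a few built-in aggregate functions, and then wraps the same values in a Series and a DataFrame. It is not taken from the book: the sample values and the names `readings`, `grid`, `frame`, and `high` are assumptions invented for this example.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, invented for illustration only.
readings = [2.0, 4.0, 6.0, 8.0]

# A NumPy array supports element-wise arithmetic without an explicit loop.
arr = np.array(readings)
doubled = arr * 2

# Built-in aggregate and statistical functions.
mean_value = arr.mean()
std_value = arr.std()  # population standard deviation (ddof=0)

# NumPy arrays are N-dimensional: reshape the four values into a 2x2 grid
# and sum down each column.
grid = arr.reshape(2, 2)
column_sums = grid.sum(axis=0)

# Pandas Series: a one-dimensional array with labels.
series = pd.Series(readings, index=["a", "b", "c", "d"])

# Pandas DataFrame: a table of labeled columns, filtered with a boolean mask.
frame = pd.DataFrame({"reading": readings, "doubled": doubled})
high = frame[frame["reading"] > 4.0]

print(mean_value, std_value, column_sums)
print(series["c"])
print(high)
```

The element-wise `arr * 2` is where NumPy differs most visibly from a plain list, where `readings * 2` would instead repeat the list.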