Programming for Corpus Linguistics with Python and Dataframes

Posted by: literator on 26-05-2024, 14:35

Title: Programming for Corpus Linguistics with Python and Dataframes
Author: Daniel Keller
Publisher: Cambridge University Press
Year: 2024
Pages: 114
Language: English
Format: pdf (true), epub
Size: 10.1 MB

This Element offers intermediate or experienced programmers algorithms for Corpus Linguistic (CL) programming in the Python language using dataframes that provide a fast, efficient, intuitive set of methods for working with large, complex datasets such as corpora. This Element demonstrates principles of dataframe programming applied to CL analyses, as well as complete algorithms for creating concordances; producing lists of collocates, keywords, and lexical bundles; and performing key feature analysis. An additional algorithm for creating dataframe corpora is presented including methods for tokenizing, part-of-speech tagging, and lemmatizing using spaCy. This Element provides a set of core skills that can be applied to a range of CL research questions, as well as to original analyses not possible with existing corpus software.
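To give a rough sense of what a concordance stored in a dataframe can look like, here is a minimal sketch. It is not the Element's algorithm: the function name `concordance`, the `window` parameter, the column names, and the toy sentence are all assumptions made for illustration, and the search uses a plain Python loop rather than any vectorized dataframe method.

```python
import pandas as pd


def concordance(tokens, node, window=2):
    """Return one row per occurrence of `node`, with left and right context.

    Hypothetical helper for illustration only; `window` is the number of
    tokens kept on each side of the node word.
    """
    rows = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            rows.append({"left": left, "node": tok, "right": right})
    # Passing `columns` keeps the layout stable even when nothing matches.
    return pd.DataFrame(rows, columns=["left", "node", "right"])


# Toy token list (invented data, split on whitespace).
tokens = "the cat sat on the mat and the dog sat too".split()
kwic = concordance(tokens, "sat", window=2)
print(kwic)
```

Because the result is an ordinary DataFrame, it can be sorted, filtered, or saved like any other table.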

Programming often involves manipulating data. In CL, our data are samples of language, and our operations are things like counting word types, calculating association strength, measuring dispersion, and so on. To accomplish these things, we need to be able to hold and reference data in a computer’s memory, often in discrete chunks. We do this with variables. To perform operations on these variables, we write instructions (code) that the Python interpreter understands how to carry out. We can group sets of instructions and save them to be reused later. These are called functions. Often, we will use functions written by other people to save time and guarantee replicability.
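The ideas in the paragraph above (variables, instructions, and reusable functions) can be sketched with nothing but the Python standard library. The function name `count_word_types` and the sample sentence are assumptions for illustration, and lowercasing plus whitespace splitting stands in for real tokenization.

```python
from collections import Counter


def count_word_types(text):
    """Return a Counter mapping each word type to its frequency.

    Illustrative only: tokens are lowercased and split on whitespace,
    which is far cruder than a real tokenizer.
    """
    tokens = text.lower().split()
    return Counter(tokens)


# A variable holding a small sample of language (invented data).
sample = "The cat sat on the mat"

# Calling the function stores its result in another variable.
freqs = count_word_types(sample)

n_types = len(freqs)            # number of distinct word types
n_tokens = sum(freqs.values())  # total number of tokens

print(freqs.most_common(1), n_types, n_tokens)
```

Saving the counting logic in a function means the same instructions can be applied unchanged to any other text.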

This section introduces Pandas DataFrame and Series classes, methods for loading and saving them to disk, and methods and functions for counting values, grouping rows, and combining values. These form a core set of tools that can be used to accomplish a range of CL tasks. The focus in this section is on explaining these elements generally, while Section 4 describes algorithms that use these procedures to complete CL analyses specifically. We will use two data types extensively in this Element, DataFrames and Series. These are not core data types in Python and must be imported through the Pandas package. However, once imported, we will be able to leverage the powerful methods built into them to do corpus linguistic tasks quickly, reliably, and with minimal hardware resources.
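As a minimal sketch of the operations named above, the following example builds a small DataFrame, pulls out a Series, counts values, groups rows, combines two columns, and round-trips the table through CSV. The one-row-per-token layout, the column names, and the data are assumptions for illustration and may differ from the corpus format the Element uses. An in-memory buffer stands in for a file on disk so that the example leaves nothing behind.

```python
import io

import pandas as pd

# Hypothetical toy corpus: one row per token (invented data).
corpus = pd.DataFrame({
    "text_id": ["t1", "t1", "t1", "t2", "t2", "t2"],
    "token": ["the", "cat", "sat", "the", "dog", "sat"],
    "pos": ["DET", "NOUN", "VERB", "DET", "NOUN", "VERB"],
})

# Selecting a single column yields a Series.
tokens = corpus["token"]

# Counting values: frequency of each token across the whole corpus.
token_freqs = tokens.value_counts()

# Grouping rows: number of tokens in each text.
tokens_per_text = corpus.groupby("text_id").size()

# Combining values: join token and tag into one string per row.
corpus["token_pos"] = corpus["token"] + "_" + corpus["pos"]

# Saving and loading: write CSV to a buffer, then read it back.
# With a real file, a path would be passed instead of the buffer.
buffer = io.StringIO()
corpus.to_csv(buffer, index=False)
buffer.seek(0)
reloaded = pd.read_csv(buffer)

print(token_freqs)
print(tokens_per_text)
```

Each of these calls returns a new Series or DataFrame, so results can be chained or stored for later steps of an analysis.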

