Using Python for Text Analysis in Accounting Research provides faculty and PhD students in the social sciences with an interactive, step-by-step framework for analyzing spoken or written language. The goal is to demonstrate how textual analysis can enhance research by automatically extracting previously unknown information from voluminous disclosures, news articles, and social media posts. Materials are presented in a way that allows the reader to learn about a textual analysis concept or technique and then replicate it hands-on. The monograph begins by showing how to install and use Python, a popular general-purpose programming language, and reviews Python's basic syntax, operators, data types, functions, and so on, so that readers can first familiarize themselves with the programming environment. It discusses the Jupyter notebook, an open-source web application for creating, running, and testing Python code interactively. The monograph also introduces the Pandas package for working with tabular data, which aids researchers as they convert unstructured textual data into structured, tabular data. The authors then introduce regular expressions, which represent patterns for matching different elements in text, and proceed to discuss and code different textual analysis methods used in accounting and finance studies. Finally, the monograph provides an overview of web scraping and file processing features in Python, with a focus on downloading EDGAR filings and identifying specific sections in them. Taken together, the first five chapters of this monograph will help readers get started with Python and prepare to write their own code.