These are the first three chapters of a textbook for data scientists who want to improve how they work with, analyze, and extract information from data. The focus of the textbook is how to appropriately apply statistical methods, both simple and sophisticated, to 21st century data and problems. This installment contains the book's front matter and the first three chapters: "Introduction -- Data Science and Statistics," "Descriptive Statistics," and "Data Visualization." Subsequent chapters will be published in 3- to 5-chapter sets as they become available. The textbook is intended for current and future data scientists, and for anyone interested in deriving information from data. It requires some mathematical sophistication on the part of the reader, as well as comfort using computers and statistical software. Data science is a new field that has arisen to exploit the proliferation of data in the modern world. Mathematical statistics dates back to the mid-18th century, when the field began as the systematic collection of population and economic data by nations. The modern practice of statistics, which includes the collection, summarization, and analysis of data, dates to the early 20th century. Today statistical methods are widely used by governments, businesses, and other organizations, as well as by all scientific disciplines. It has been said that a data scientist must have a better grasp of statistics than the average computer scientist and a better grasp of programming than the average statistician. This book will give data scientists a firm foundation in statistics.