Book. Hardcover
2nd edition. 2025
496 pp. 172 b/w illustrations, 172 b/w line drawings, 56 b/w tables.
In English
Taylor & Francis Ltd. ISBN 978-1-03-272451-5
Format (W x L): 17.8 x 25.4 cm
Weight: 453 g
Product description
Statistical Inference via Data Science: A ModernDive into R and the Tidyverse, Second Edition offers a comprehensive guide to learning statistical inference with data science tools widely used in industry, academia, and government. The first part of this book introduces the tidyverse suite of R packages, including ggplot2 for data visualization and dplyr for data wrangling. The second part introduces data modeling via simple and multiple linear regression. The third part presents statistical inference using simulation-based methods within a general framework implemented in R via the infer package, a suitable complement to the tidyverse. By working with these methods, readers can implement effective exploratory data analyses, conduct statistical modeling with data, and carry out statistical inference via confidence intervals and hypothesis testing. All of these tasks are carried out with a strong emphasis on data visualization.
Key Features in the Second Edition:
- Minimal Prerequisites: no prior calculus or coding experience is needed, making the content accessible to a wide audience.
- Real-World Data: learn with real-world datasets, including all domestic flights leaving New York City in 2023, the Gapminder project, FiveThirtyEight.com data, and new datasets on health, global development, music, coffee quality, and geyser eruptions.
- Simulation-Based Inference: statistical inference through simulation-based methods.
- Expanded Theoretical Discussions: includes deeper coverage of theory-based approaches, their connection with simulation-based approaches, and a presentation of intuitive and formal aspects of these methods.
- Enhanced Use of the infer Package: leverages the infer package for “tidy” and transparent statistical inference, enabling readers to construct confidence intervals and conduct hypothesis tests through multiple linear regression and beyond.
- Dynamic Online Resources: all code and output are embedded in the text, with additional interactive exercises, discussions, and solutions available online at the book website.
- Broadened Applications: suitable for undergraduate and graduate courses, including statistics, data science, and courses emphasizing reproducible research.
The first edition of the book has been used in many different ways: in courses on statistical inference, statistical programming, business analytics, and data science for social policy, and by professionals in many other fields. Ideal for those new to statistics or looking to deepen their knowledge, this edition provides a clear entry point into data science and modern statistical methods.