Keywords: Statistics, data analysis, R, data mining
Title: R in Action: Data Analysis and Graphics with R
Author: Robert Kabacoff
Publisher: Manning
ISBN: 978-1935182399
Media: Book
Level: Introductory R, but some stats background required
Verdict: A solid companion text to using R
R is a programming language and software environment for statistical computation and graphics. It is, by a long shot, the leading open source stats package in the world. It is also, not surprisingly, incredibly powerful, flexible and not a little bit scary to the beginner. That leaves plenty of scope for good supporting materials to augment the wealth of online information available to users. 'R in Action' aims to provide a tutorial on getting the best out of R for real world data analysis. The assumption here is that as a user you know one end of a data set from the other, and that you have some familiarity with statistical analysis. What is not assumed is much familiarity with the R language or environment.
The opening part of the book starts pretty much from the basics: downloading and installing R, using the environment interactively to create and explore some data, creating simple charts and doing some introductory data management. This leads naturally to the second part of the book, where we get a couple of chapters on basic graphs and descriptive statistics.
The next section of the book really gets into the meat of statistical analysis, covering regression (the single biggest topic in the book), analysis of variance, power analysis, resampling and bootstrapping, and more on graphics.
The final part of the book (aside from the numerous appendices) looks at generalised linear models, principal components and factors, and dealing with missing data, and ends with advanced graphics.
There's no faulting the choice of topics: unless you're an academic or a specialist, this really does cover a good range of the data analysis techniques that you'll come across in practice. The author liberally uses real examples and data sets that can be downloaded and played with, along with sample code. It's also good that there are references and pointers to further reading for those who want to know more about some of the underlying techniques being used. Also good are the caveats and the close attention to the assumptions that have to be met before this technique or that can be used.
At the same time, it's also important to be clear about what this book doesn't do. It's not a statistics primer. Sure, it's possible to pick things up by following along with the text, but the author hasn't set out to write a book about statistical methods. Methodology yes, methods no. It's also clear that this isn't a programmer's manual. The language side of R doesn't really get covered to any great extent, so, for example, you won't see any coverage of writing your own R packages or extensions.
However, for readers who really want to get to grips with using R, particularly those who are familiar with stats and data analysis, this is a very readable and useful companion to have beside you as you work your way into the environment.