book review


Data Analysis and Graphics Using R:
An Example-Based Approach, 2nd ed.

J. Maindonald and J. Braun
Cambridge University Press, 2007, 502 pages + 10 color plates



Reviewed by

Statistics packages, even those that use a graphical user interface, are notoriously clumsy and unintuitive to use, and are often poorly documented. This book uses examples to teach R, a free but powerful command-line based statistics package. A familiarity with basic statistics is assumed.

This book doesn't waste time telling you how to install or configure R. It goes right to its topic, teaching R by example. The teach-by-example method can be very effective. An excellent example is Statistical Analysis: A Decision-Making Approach by Robert Parsons. Making this sort of book useful requires discipline and organization. Data Analysis and Graphics Using R is fairly well organized. It has an extensive index of R functions and statistical topics.

However, there is one big problem: where are all the examples? It turns out that by 'examples', the authors mean not 'examples of R code' but sample statistical problems, as distinct from a theory-based approach. There is surprisingly little R code in this book. The commands for linear modeling (which is R's term for linear regression), for instance, are scattered across several chapters, making it hard for the reader to piece together the correct syntax. This could have been avoided by including parts (or all) of the authors' R scripts (such as lm-tests.R). This well-written file makes it immediately obvious how to run a linear model. These are the "examples" that should have been included in the text. I eventually discovered that it was much easier to learn R by reading the help pages within R than to guess the correct syntax from the text.
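
To show the kind of short, self-contained example that would have helped, here is a minimal sketch of a linear model in R. It is not taken from the book or from the authors' lm-tests.R; it assumes the cars data set that ships with R (stopping distance versus speed), and the names fit, new_speeds, and pred are made up for illustration.

```r
# Fit a simple linear regression of stopping distance on speed.
# `cars` is a data frame built into R; it is an assumption of this
# sketch, not a data set discussed in the book.
fit <- lm(dist ~ speed, data = cars)

# Coefficient table, residual standard error, R-squared, and F-test.
print(summary(fit))

# Predicted stopping distances for two hypothetical speeds.
new_speeds <- data.frame(speed = c(10, 20))
pred <- predict(fit, newdata = new_speeds)
print(pred)
```

The formula dist ~ speed reads as "dist modeled as a function of speed"; further predictors can be added on the right-hand side with +.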

A related problem, at least in the early sections, is that some of the examples don't make sense unless you install the authors' DAAG data package. The book also provides relatively little insight into how R processes the data internally.

On the positive side, topics such as time series analysis and tree-based classification, which are missing from many other books, are thoroughly covered. The authors try to teach some statistics along with R and give many warnings about whether a particular model is appropriate. As might be expected, little mathematical background is provided for the statistical methods.


May 12, 2007