Data Analysis and Graphics Using R:

An Example-Based Approach, 2nd ed.

J. Maindonald and J. Braun

Cambridge University Press, 2007, 502 pages + 10 color plates

Statistics packages, even those that use a graphical user interface, are notoriously clumsy and unintuitive to use, and are often poorly documented. This book uses examples to teach R, a free but powerful command-line-based statistics package. A familiarity with basic statistics is assumed.

This book doesn't waste time telling you how to install or configure
R. It goes right to its topic, teaching R by example. The teach-by-example
method can be very effective. An excellent example is **Statistical
Analysis: A Decision-Making Approach** by Robert Parsons. Making this
sort of book useful requires discipline and organization. **Data Analysis
and Graphics Using R** is fairly well organized. It has an extensive
index of R functions and statistical topics.

However, there is one big problem: Where are all the examples? It turns
out that by 'examples', the authors don't mean 'examples of R code,' but
sample statistical problems, as distinct from a theory-based approach.
There is surprisingly little R code in this book. The commands for linear
modeling (which is R's term for linear regression), for instance, are
scattered across several chapters, making it
hard for the reader to piece together the correct syntax. This could have
been avoided by including parts (or all) of the authors' R scripts (such
as `lm-tests.R`). This well-written file makes
it immediately obvious how to run a linear model. These are the "examples"
that should have been included in the text. I eventually discovered that
it was much easier to learn R by reading the help pages within R
than to guess the correct syntax from the text.
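
To give a sense of the kind of compact code example that would have helped, here is a minimal sketch of a linear model in R. It is not taken from the book or from the authors' `lm-tests.R`; it assumes only base R and uses the built-in `cars` data set (stopping distance versus speed), chosen here purely for illustration. The variable names `fit`, `new_speeds`, and `pred` are my own.

```r
# Minimal linear regression sketch in base R (no extra packages).
# Assumption: the built-in `cars` data set stands in for real data.
fit <- lm(dist ~ speed, data = cars)   # model: dist = a + b * speed

print(coef(fit))      # estimated intercept and slope
print(summary(fit))   # standard errors, t-tests, R-squared

# Predict stopping distance for two new speeds.
new_speeds <- data.frame(speed = c(10, 20))
pred <- predict(fit, newdata = new_speeds)
print(pred)
```

The formula `dist ~ speed` reads as "model `dist` as a function of `speed`"; typing `?lm` at the R prompt opens the help page that documents the remaining arguments.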

A related problem, at least in the early sections, is that some of the examples don't make sense unless you install the authors' DAAG data package. The book also provides relatively little insight into how R processes the data internally.

On the positive side, topics such as time series analysis and tree-based classification, which are missing from many other books, are thoroughly covered. The authors try to teach some statistics along with R and give many warnings about whether a particular model is appropriate. As might be expected, little mathematical background is provided for the statistical methods.
