The Elements of Statistical Learning: Data Mining, Inference, and Prediction.
Trevor HASTIE, Robert TIBSHIRANI, and Jerome FRIEDMAN. New York: Springer-Verlag, 2001. ISBN 0-387-95284-5. viii + 533 pp. $74.95 (H).
In the words of the authors, the goal of this book was to “bring together many of the important new ideas in learning, and explain them in a statistical framework.” The authors have been quite successful in achieving this objective, and their work is a welcome addition to the statistics and learning literatures. Statistics has always been interdisciplinary, borrowing ideas from diverse fields and repaying the debt with contributions, both theoretical and practical, to the other intellectual disciplines. For statistical learning, this cross-fertilization is especially noticeable. This book is a valuable resource, both for the statistician needing an introduction to machine learning and related fields and for the computer scientist wishing to learn more about statistics. Statisticians will especially appreciate that it is written in their own language.

The level of the book is roughly that of a second-year doctoral student in statistics, and it will be useful as a textbook for such students. In a stimulating article, Breiman (2001) argued that statistics has been focused too much on a “data modeling culture,” where the model is paramount. Breiman argued instead for an “algorithmic modeling culture,” with emphasis on black-box types of prediction. Breiman’s article is controversial, and in his discussion, Efron objects that “prediction is certainly an interesting subject, but Leo’s paper overstates both its role and our profession’s lack of interest in it.” Although I mostly agree with Efron, I worry that the courses offered by most statistics departments include little, if any, treatment of statistical learning and prediction. (Stanford, where Efron and the authors of this book teach, is an exception.) Graduate students in statistics certainly need to know more than they do now about prediction, machine learning, statistical learning, and data mining (not disjoint subjects). I hope that graduate courses covering the topics of this book will become more common in statistics curricula.
Most of the book is focused on supervised learning, where one has inputs and outputs from some system and wishes to predict unknown outputs corresponding to known inputs. The methods discussed for supervised learning include linear and logistic regression; basis expansions, such as splines and wavelets; kernel techniques, such as local regression, local likelihood, and radial basis functions; neural networks; additive models; decision trees based on recursive partitioning, such as CART; and support vector machines.
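As a purely illustrative sketch of this setup, not an example from the book, supervised prediction amounts to fitting a model on known input–output pairs and applying it to new inputs; the data below are synthetic and the choice of a CART-style tree is arbitrary.

```python
# Illustrative sketch only: fit a CART-style regression tree on known
# input/output pairs, then predict outputs for new, unseen inputs.
# The data are synthetic and the model choice is arbitrary.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X_train = rng.uniform(0, 10, size=(200, 2))                    # known inputs
y_train = np.sin(X_train[:, 0]) + 0.1 * rng.normal(size=200)   # known outputs

model = DecisionTreeRegressor(max_depth=4).fit(X_train, y_train)

X_new = rng.uniform(0, 10, size=(5, 2))   # inputs whose outputs are unknown
print(model.predict(X_new))               # predicted outputs
```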
There is a final chapter on unsupervised learning, including association rules, cluster analysis, self-organizing maps, principal components and curves, and independent component analysis. Many statisticians will be unfamiliar with at least some of these algorithms. Association rules are popular for mining commercial data in what is called “market basket analysis.” The aim is to discover types of products often purchased together. Such knowledge can be used to develop marketing strategies, such as store or catalog layouts. Self-organizing maps (SOMs) involve essentially constrained k-means clustering, where prototypes are mapped to a two-dimensional curved coordinate system. Independent components analysis is similar to principal components analysis and factor analysis, but it uses higher-order moments to achieve independence, not merely zero correlation between components.
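As a purely illustrative sketch, with invented transactions rather than anything drawn from the book, the market basket idea reduces to counting how often items co-occur: the support of an item pair and the confidence of the rule A -> B can be computed directly from a handful of baskets.

```python
# Illustrative sketch only: support of item pairs (how often two products are
# purchased together) and confidence of the rule A -> B.
# The transactions below are invented for the example.
from itertools import combinations
from collections import Counter

baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"beer", "chips"},
    {"bread", "butter"},
    {"milk", "chips", "bread"},
]

n = len(baskets)
item_counts = Counter(item for b in baskets for item in b)
pair_counts = Counter(pair for b in baskets for pair in combinations(sorted(b), 2))

for (a, b), count in pair_counts.most_common(3):
    support = count / n                  # P(A and B)
    confidence = count / item_counts[a]  # P(B | A)
    print(f"{a} -> {b}: support={support:.2f}, confidence={confidence:.2f}")
```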
A strength of the book is the attempt to organize a plethora of methods into a coherent whole. The relationships among the methods are emphasized. I know of no other book that covers so much ground. Of course, with such broad coverage, it is not possible to cover any single topic in great depth, so this book will encourage further reading. Fortunately, each chapter includes bibliographic notes surveying the recent literature. These notes and the extensive references provide a good introduction to the learning literature, including much outside of statistics. The book might be more suitable as a textbook if less material were covered in greater depth; however, such a change would compromise the book’s usefulness as a reference, and so I am happier with the book as it was written.

Overall, I think that this is a great book that is well worth its price.

David RUPPERT
Cornell University

REFERENCE

Breiman, L. (2001), “Statistical Modeling: The Two Cultures” (with discussion), Statistical Science, 16, 199–231.

Statistical Process Adjustment Methods for Quality Control.
Enrique DEL CASTILLO. New York: Wiley, 2002. ISBN 0-471-43574-0. xviii + 357 pp. $99.95 (H).

This book addresses the core issues of integration between statistical process control (SPC) and engineering process control (EPC). Traditionally, SPC techniques have been developed to monitor variables and, through a (usually off-line) cycle of diagnosing and correcting special causes, to reduce process variability. In contrast, EPC techniques have been developed to directly reduce process variability by adjusting or controlling input variables based on each (usually real-time) observation of the output variables. The area of SPC–EPC integration certainly owes its development to George E. P. Box and his collaborators. Box and Luceño (1997) concluded that:

    To augment the monitoring aspects of statistical process control with appropriate techniques for process adjustment has long been an evident need. Some 35 years ago, in response to a paper that attempted such enhancement a discussant [Prof. J. H. Westcott in the discussion of “Some Statistical Aspects of Adaptive Optimization and Control” by Box and Jenkins, 1962] remarked, “I welcome this flirtation between control engineering and statistics. I doubt, however, whether they can yet be said to be going steady.”

Box and Luceño went on to suggest that their book brings about the desired marriage, but I do not think that is completely true. With all due respect to their contribution, I think that perhaps we can celebrate an engagement, but there is a lot of room for improving the bridge between control and monitoring. I think that the best approach is to focus on the industrial statistics audience to foster an appreciation of control engineering (as opposed to focusing on control engineers to develop their statistical appreciation). With that in mind, I believe that this book goes a long way toward achieving this end. He states that the objective of his book is to “present process adjustment techniques based on EPC methods and to discuss them from the point of view of controlling the quality of a product.” This product quality focus is a good point of connection. The book goes on to truly synthesize several sources across time series, statistics, and control theory, with a clear focus on quality control outcomes.
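To make the SPC–EPC distinction described above concrete, a small simulation can contrast monitoring an output against control limits with adjusting an input after each observation. The sketch below is purely illustrative, with an invented drift process and an arbitrary adjustment gain; it is not a method taken from the book.

```python
# Schematic illustration only (not a method from the book): contrast SPC-style
# monitoring of a process output with EPC-style adjustment of a process input.
# The drifting process model and the adjustment gain below are invented.
import numpy as np

rng = np.random.default_rng(1)
T, target, sigma = 100, 0.0, 1.0
drift = 0.05 * np.arange(T)            # slow mean shift (a "special cause")
noise = rng.normal(0.0, sigma, size=T)

# SPC: monitor the output and flag observations beyond 3-sigma control limits.
monitored = target + drift + noise
out_of_control = np.flatnonzero(np.abs(monitored - target) > 3 * sigma)

# EPC: adjust the input each period by a fraction of the observed deviation
# (a simple integral-type feedback rule), directly reducing output variability.
adjusted = np.empty(T)
u = 0.0                                # cumulative input adjustment
for t in range(T):
    adjusted[t] = target + drift[t] + noise[t] + u
    u -= 0.2 * (adjusted[t] - target)  # hypothetical gain of 0.2

print("SPC first alarm at t =", out_of_control[0] if out_of_control.size else None)
print("output variance without adjustment:", round(monitored.var(), 2))
print("output variance with adjustment:   ", round(adjusted.var(), 2))
```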
The book’s organization makes a natural progression from process monitoring basics (Chap. 1), to stochastic-dynamic process modeling (Chaps. 2–4), to process control techniques (Chaps. 5–9). In the first chapter, Figure 1.23 provides a very nice flowchart guide to the use of the EPC and SPC techniques discussed in the book. SAS procedures that support aspects of the modeling and analysis are discussed in sufficient detail within the text. There are several examples of SAS code and output. The graphical user interface of MATLAB’s system ID toolbox is presented and discussed briefly. Minitab’s STAT functions are also frequently used for data analysis and plotting.

Although the author suggests that the book could be used in an undergraduate course, I think its level demands a certain amount of statistical and mathematical sophistication that would be beyond all but the very top undergraduate students. However, first-year or second-year graduate students would be very well prepared in the area by using this text. As a text for course instruction, this book certainly excels on the basis of exercises and real datasets. There are about 15 problems at the end of each chapter (a solutions manual was prepared by Rong Pan) and 18 data files and spreadsheets that serve to illuminate topics in each chapter. (In comparison, the Box and Luceño book has only one or two problems in most chapters and only three datasets.) The author’s website, www.ie.psu.edu/faculty/castillo/castillo.htm, contains the electronic files, solutions manual, and errata in the first printing.

One of the past criticisms of the quality area is the perspective that quality is free, or that quality objectives should be pursued for purely intrinsic reasons. Six Sigma, of course, has sought to work against this misconception with a focus on bottom-line profitability of quality improvement. This book, with its strong focus on controlling the quality of products and processes, underscores the high relevance of quality control to industrial practice. To further this idea, I would have liked to see some strategic-level consideration of how statistical process adjustment may factor into a company’s financial strength by creating opportunities that it may not have otherwise had.

Most of the text focuses on univariate system analysis, and it will help the reader appreciate the fundamentals. The final chapter gives a brief introduction to multivariate system analysis and suggests other avenues of future research in the SPC–EPC area. For those working in the manufacturing area, from either an academic or an industry point of view, this book is a valuable resource.

Harriet Black NEMBHARD
University of Wisconsin, Madison