

J**T
You will need some time but it is worth the investment
To get the most out of this book, you need to be a statistician or an AI professional, or be willing to invest some time. But if you commit yourself, this book goes a long way toward substituting for a graduate-level course on data mining. Don't get me wrong: it is not written with an academic audience in mind; as a matter of fact, it is unusually rich in application examples. But there is a lot to digest conceptually, and many of the examples are quite involved. As such, it sits at the opposite end of the spectrum from the O'Reilly series of how-to books. This one gets you up to speed with one of the best software packages for data mining, if not the best, in all its many facets. With Weka and 'R', you have the tools to tackle many of the world's problems, and this book is the best introduction to one part of the duo.
R**J
Great book
One of the best books I have read on the subject thus far. There seems to be so much hype on "data science" these days, when actuaries were doing this stuff with slide rules decades ago. This book removes the mystery and explains it clearly. An understanding of data architecture and some math would be helpful, but I think anyone with a technical background would benefit from it.
J**D
Two Paths to Prediction
This is a good text on machine learning techniques from both the statistics and the machine learning perspectives. The authors note that these fields have developed in parallel with many researchers and practitioners working in each, but few familiar with the full range of techniques in both disciplines. Some procedures, such as tree induction and nearest neighbor clustering techniques, have been developed independently in both fields. However, for the most part statistics has focused on hypothesis testing and machine learning has tried to optimize search through the space of possible hypotheses. This book presents techniques from both traditions.
The organizational structure of the book supports its use as either a comprehensive text or a modular reference. The first section's five chapters introduce the foundations of data mining. In addition to concepts and definitions, there are simple example data sets and accessible descriptions of how both raw data and final analyses are used in this field. A particularly well-written fifth chapter discusses how to evaluate data mining models. It discusses the rationale for holdout samples, the use of cross-validation procedures, and how to avoid over-fitting models. Machine learning texts frequently lack depth on this topic while statistics texts often fail to communicate the consequences of poorly-fitted models. This integration of perspectives is a good one.
Chapters in the second section build on this foundation. Chapter 6 describes how to use ten different techniques to detect and describe patterns in large data sets. This section also describes how to prepare data for data mining, how to combine and transform variables to increase model accuracy, and how to improve prediction by combining different model types using bagging, boosting, and other aggregation techniques. A final chapter outlines directions of current and future research expanding our toolbox of techniques.
The eight chapters of the third and final section are a detailed tutorial covering the Weka workbench of machine learning algorithms and data transformation tools.
This book has several communication strengths. The scope is broad for an introductory text. The Further Readings collections at each chapter's end are reasonably brief and point to current and in-depth sources. The text itself contains numerous example analyses and follows the useful strategy of analyzing the same data with several techniques. Its review of algorithms and formulas focuses on explaining how they work rather than on deriving them from general principles. A key strength is the book's close integration with Weka. This ensures that readers can step through analysis procedures, experiment with variations from default paths, and compare the performance of different formulations of the same research problems.
I recommend the book for readers introducing themselves to machine learning. It will take some of your time to learn the techniques and practice using them in Weka, but it will be time well spent. Don't skip the Weka section!
J**S
Nice book, relatively easy to read
Nice book, relatively easy to read, although sometimes less instructive in the details of particular algorithms than might be desired. I did skip Part II of the book, so that might be the issue. A great resource for cutting your teeth on data mining.
T**.
Good Content, Poor Electronic Format
I was really looking forward to the new edition of this book, but I've found that the Kindle version has unreadable tables and figures that are too small to make out. Many of the equations are almost completely illegible. I should have purchased the hardcopy.
The content, on the other hand, is very well written and accessible. Great book; terrible Kindle edition.
Note: I have a first-edition Kindle, which could account for some of my problems, and the Windows desktop version works great. It's just too bad that I have to run a Windows virtual machine just to see equations from a book I purchased.
E**V
Must have
The book is a must-have if you'd like to know how things work under the hood. It describes in detail neural nets, decision trees, association rules, and others. The cool thing, which I like most, is that this book could be really helpful for beginners. It's slightly over-talkative, but for this particular book that's probably a positive characteristic.
D**L
Great book
I am way too lazy to write a long-winded review about this book, but I recently took a graduate class at my university as an undergrad, and the content was easy to understand. That course landed me an internship involving data mining, so this book works. Caters to the plebeian beginner as well as the seasoned data scientist.
C**S
Avoid
Reading some of these reviews, I feel like I must have gotten another book. I really didn't think the book was worth the time or money investment.
My main issues were:
1. 50% of the book covers WEKA.
- But who is really going to use WEKA over a product like R?
- The WEKA coverage is mind-numbingly bad: lists of algorithm names without explanations of those algorithms, and no real practical advice or examples using the program.
2. The 50% of the book that covers general data mining is not really that good at all. It is meant to be an easily accessible overview without technical details, but manages to be so breezy an overview as to be totally useless.
3. The "Data Mining with Rattle and R" (as a practical introduction) is so much better in almost every area that I can't understand why people are still recommending this book.