From the course: Predictive Analytics Essential Training: Data Mining

Understanding the data preparation law

- [Instructor] Most of us have heard the quote that data scientists spend 70 or even 80% of their time doing data prep. It's usually said to inspire the kind of comradery you get from complaining together. You hear it at a conference, or maybe just chatting at the water cooler. The third law, the data preparation law, gets us thinking about why this is so. "Data preparation is more than half of every data mining process." Frankly, this is my favorite law. I quote it all the time. Tom is making a strong claim here. He's saying that it's always more than half. He even implies that it always will be. So why is it always more than half? And why do I feel so strongly about it? Well, it's very popular these days to talk about self-service analytics and the notion that either software can make data prep go away or that having IT do data prep and cleaning at the enterprise level for the whole organization can make data…
