Thursday, 30 August 2018

Coming To Terms With Analytics Terms

Have you ever wondered about, or been confused by, the terminology in our industry? Over the years, we have applied many labels to describe what we do. The origins of some of those labels are particularly interesting.

Traditionally, statistical analysis followed a standard process:

1) Identify a problem.
2) Develop a hypothesis.
3) Gather the data.
4) Test the hypothesis against the data to support or reject it.

Serious researchers considered looking through data without a hypothesis to be beneath their dignity. In fact, the practice was so unacceptable that it earned the label "data dredging."

Analysis of big data emerged in the late 1980s and early 1990s. Growing computing power was the real driver. My own experience is a good example.

In 1993, a credit card bank hired me to leverage credit bureau data to build acquisition-targeting models. The bank gave me a PC with a 600-megabyte hard drive. With sampled data of about 45,000 records -- a lot of data back then -- running one logistic regression model took 27 hours. While the process was running, I couldn't use my computer for anything else. So I would spend all week preparing the variables, then start the model processing on Friday afternoon and pray that it would not crash over the weekend. A year later, we got a Unix server with one gigabyte of space for the whole bank. Running a logistic model then took only two hours. We were ecstatic. We thought we would never run out of space.

Around 1995, the term "data mining" started entering the conversation. I remember thinking, "Finally, I have a name for what I do."

It turns out that those dastardly data dredgers were starting to uncover patterns that proved to be quite valuable. They discovered some "nuggets" of information that companies could use to boost profits. Given the newfound value of just looking through data without a hypothesis or test design, the term data mining replaced data dredging. So, in its purest form, data mining is the act of exploring data to find valuable nuggets of information.
