In the hype surrounding big data, too many people seem to be forgetting to ask, “What is the question?”
Whether you are selecting a vendor, planning a marketing campaign, or shaping your customer-service approach, knowing the “what” and the “why” is the starting point. Consider this: in a survey*, 54 percent of people prefer kickball over dodgeball. But among those who prefer chemistry to physics, 73 percent prefer kickball to dodgeball. And thus…what?
Too many people get distracted by the hype of a big data tool rather than focusing on the hypothesis to test. Is it enough to know a correlation (which, by itself, says nothing about causality), or does a decision-maker need to know the actual cause? For example, should a sales prediction for snow shovels be based on sales of winter coats or on the weather forecast? Once the question is clear, an appropriate technique can be selected.
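The snow-shovel example can be sketched as a toy simulation (all numbers here are invented for illustration): cold weather drives both coat sales and shovel sales, so the two series correlate strongly with each other even though neither causes the other.

```python
import random
import statistics

random.seed(0)

def pearson(x, y):
    """Population Pearson correlation of two equal-length series."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / (len(x) * statistics.pstdev(x) * statistics.pstdev(y))

# Simulate 52 weeks: coldness is the COMMON CAUSE of both series.
cold = [random.uniform(0, 10) for _ in range(52)]       # weekly "coldness" index
coats = [5 * c + random.gauss(0, 3) for c in cold]      # caused by weather
shovels = [3 * c + random.gauss(0, 3) for c in cold]    # also caused by weather

# Coats and shovels correlate strongly -- but only because weather
# drives both. A shovel forecast built on coat sales breaks the moment
# the underlying cause (the weather) behaves differently.
print(round(pearson(coats, shovels), 2))
print(round(pearson(cold, shovels), 2))
```

The point of the sketch is that a high correlation between two series tells you nothing about which one (if either) to act on; only knowledge of the common cause does.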
In another survey, 78 percent of people say they always tip their servers, even if the service is poor. But among those who prefer their gravy thin and soupy rather than thick and gloopy, 94 percent always give a server a tip, even if the service is poor. (If you have seen Men in Black 3, you know about causality and tipping!)
Using only associations when causality is needed is why marketers get ugly surprises when underlying causes change. (Consider the rapidly changing tastes in social media platforms.) It is a problem, for example, when IT planners run correlations on web-server loads without knowing about new consumer promotions causing traffic increases.
Avoiding problems like this requires employing the right technique to answer the right question. Too often, big lazy data comes from analysts using the software tools and/or datasets that are most easily available rather than the ones that are right for the question. This is seen when financial-news commentators give poor advice because they don’t dig into the stories behind the numbers.
IT departments also fall into this trap when reporting easily available infrastructure-monitoring data (e.g., processor utilization) to internal customers, rather than more meaningful measures, such as user-response time.
IT leaders should consider four opportunities to partner with business users to:
- Understand data SOURCE quality. Many errors spring from misunderstanding the data source (e.g., nature of questions asked, respondent characteristics, randomness of sample, strata, response rate, point-in-time). This is Statistics 101 stuff that gets lost in the minds of busy people.
- Improve data SET quality. Not just the basics (e.g., truncated data passed from system to system), but also the ability to more easily align time series and transform data points (e.g., math or logic operations).
- Understand the STAGE of research. To determine what to study for causality, a correlation might be a helpful initial filter, but it is not the robust answer itself.
- Understand the cost SAVINGS from using the right tool for the job. For example, rather than spending mountains of time and money on big-number crunching for correlations, focus-group interviews might offer more insight on why a new software product is not selling. For example, why would 46 percent of people rather be called a nerd than a geek, but among those who would rather be a portrait photographer than a landscape photographer, 67 percent would rather be called a nerd? The data associations aren’t actionable without knowing why. The “why” is probed with focus groups and similar approaches.
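The “initial filter” idea in the STAGE point above can be sketched in a few lines: rank candidate drivers by the strength of their correlation with an outcome, then spend the expensive causal work (experiments, focus groups) only on the leaders. The series names and numbers below are hypothetical.

```python
import statistics

def pearson(x, y):
    """Population Pearson correlation of two equal-length series."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / (len(x) * statistics.pstdev(x) * statistics.pstdev(y))

# Hypothetical weekly sales and candidate driver series.
sales = [12, 15, 14, 20, 22, 25, 24, 30]
candidates = {
    "promo_emails":  [1, 2, 2, 4, 4, 5, 5, 7],   # tracks sales closely
    "support_calls": [9, 7, 8, 6, 7, 5, 6, 4],   # moves inversely to sales
    "office_visits": [3, 8, 2, 7, 4, 6, 3, 9],   # mostly noise
}

# Rank by |correlation| to decide what merits a causal study.
# This is the filter, not the answer: the top-ranked driver still
# needs a "why" from an experiment or focus group before acting on it.
ranked = sorted(candidates,
                key=lambda k: abs(pearson(sales, candidates[k])),
                reverse=True)
print(ranked)  # → ['promo_emails', 'support_calls', 'office_visits']
```

A cheap pass like this narrows the field; the cost savings come from not running the heavyweight analysis (or the big-number crunching) on every candidate.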
So how do you implement this approach? You can cross-train business and IT pros to serve as expert resources for the organization. Basic tools in the hands of experts are usually better than advanced tools in the hands of the confused. When basic tools are fully utilized, the business case for more—and more advanced—tools gets easier. I think 100% of people would agree with that.
*The crazy correlations included here are courtesy of Correlated.
Principal, ValueBridge Advisors
Continue the conversation in the Big Data topic within ISACA’s Knowledge Center.