The term "big data" gets thrown around a lot these days, but what does it actually mean? Why should we care?
To find out, we asked two Bentley alumni doing exciting things in data analytics for their take on big data and the trends they see emerging in this fascinating but, to most people, still foggy area of business.
What Is Big Data?
"Big data is the amount of data that puts pressure and stress on the technology available at that point in time — it's the amount of data that's going to break the way we're currently used to handling it," says Joe Dery, MS in Marketing Analytics and current doctoral candidate.
"Big data is nothing new. My first computer had 64GB of memory, which my father told me I'd never be able to fill up in my lifetime. Now my phone has more storage than that."
What makes data "big" is relative to the time period and to the user. "Big data" to digital behemoths like Amazon, Google, Twitter and Facebook is very different from what "big data" would mean to a mom-and-pop small business or to the average family.
According to IBM, 90 percent of the world's data has been created in the last two years alone. It's no wonder, then, that demand for workers able to analyze all that data has risen 82 percent in the past five years, with more than 270,000 jobs currently available that involve "data analysis."
As a VP/Director of Analytics for Hill Holliday, one of the top advertising and marketing agencies in the world, Lindsay Starner MBA '13 spends a lot of time convincing clients and colleagues that more data isn't always better: just because you have access to all of the data all of the time and everything can be counted doesn't mean that it should be.
"Measurement has to be strategic," Starner says. "Just because data is available in real time doesn't mean that you should be looking at hour-over-hour data . . . or that after you've already decided on certain KPIs (key performance indicators) you can throw them out because, ‘Oh, look over here, this click-through rate is bad.’"
Now that big data has become such big business, here are six big trends Dery and Starner anticipate seeing in the workplace in the near future:
1. More Unicorns
When managers talk about hiring data scientists, what they've been looking for to date has been something of a "unicorn": a classically trained computer scientist or statistician with all the hard technical skills in programming and modeling tools (SAP, SQL, etc.), who also has the soft skills to think creatively, weave the story behind the numbers and communicate it effectively across a cross-functional organization.
As data analytics became more buzzworthy, many eager hard-science professionals changed their LinkedIn profiles overnight to say "data scientist." But hiring managers will find that may not be what they're actually getting. True data scientists have to use both left and right brain, so to speak, and be passionate about living in a sort of limbo where hard and soft are constantly at odds, being inflexible and flexible at the same time.
"When you're traditionally trained in mathematics, creativity is not something that's always encouraged," Dery shares. "You have to try things that are unconventional, try hybrid approaches and bend the rules in certain places where you can, which can be extremely uncomfortable for some people. Data analytics has its highs and lows, and you have to really be OK with failure. You’re working with some of the hardest business problems that your company has ever seen, and you’re not going to come to the answer right away, or possibly even ever."
Analytics programs (like the master’s programs in audit analytics, business analytics and marketing analytics offered at Bentley) are now specifically designed to create more "unicorns." They teach people to take the business aspect (whether it's marketing, sales, operations, manufacturing, engineering, etc.) and marry it with mathematics and programming in out-of-the-box ways to get innovative results. They also train them to explain those unconventional results in easy-to-understand ways.
And undergraduate programs are catching up, too.
"I spend a lot of time looking at résumés, and it's crazy to me the number of people learning different programming languages," Starner says. "The technical skill sets coming out of undergrad are insane, across almost every capability we would have."
2. Standardization and Accessibility
Since dealing with big data is still a relatively new concept for most organizations, it can be a very subjective task, and one that's still challenging for even veteran data analytics professionals.
"We're in this weird place with some of these new and emerging channels, where it's still the Wild West," Starner says. "We have partners who are coming out with their own proprietary measurement techniques. There is very little standardization in nomenclature or methodology. And there are very few third-party options. So you're kind of relying on the partners to measure themselves."
That lack of standardization in nomenclature, methodology and measurement is something that innovative third-party vendors are working hard to change by creating "software as a service" (SaaS) packages that make analytics more accessible. As those vendors grow and adoption of their software rises, they are also creating new job specialties and skill sets around data analytics.
3. Data Visualization
For Joe Dery, one of the coolest trends in analytics is the growing number of data visualization tools available, like Tableau and SAS Visual Analytics. They let analysts turn large quantities of data into visualizations within seconds, then manipulate those visualizations to tell a clearer story and drive the actions they are targeting.
"It's incredibly powerful," Dery says. "Data science is actually taking on more of an art form than a science. If the techniques you use are not very clear and how you visualize the data is not very clear, then the actions they are going to drive are not very clear. It's really up to you as the data scientist to take on that artistic creativity and draw out the statue within the block of marble or concrete, like Michelangelo — which again plays up those soft skills to a level where they have never been before."
4. Privacy and Mobile
As Lindsay Starner pointed out above, one of the major challenges is teaching clients that more data isn't always better, and that just because you can collect a zillion data points doesn't mean you should. So where do you draw the ethical lines and take consumer privacy into account, especially in the world of B2C?
"There are a lot of questions around privacy and multiple devices," she says. "So, how do you measure if I'm on my phone and then I switch to my desktop or my tablet, without violating my privacy? Because you need to get a device ID from my phone and map it to an IP address, which is in my home, to get to my desktop. And all of that just seems very personal to people."
Starner spoke at the mobile marketing conference FutureM, which she says was a really interesting experience. It reminded her of how far there still is to go with analytics and mobile technologies, and how much room for innovation still exists.
"One of the questions that came up was around privacy and ad blocking. It’s the exact same question we had to face [as an industry] when you were allowed to block cookies on your desktop computer," Starner said. "You can go into incognito mode in your browser or you can block all of the tracking. There was 'Cookiegate' for desktop browsing, where everyone thought it was the end of online advertising because nobody would ever be able to be served an ad again. But it never happened. Mobile channels are going through the same thing right now."
Mobile is currently used as both a channel and a measurement source: the same data that makes digital media targeting more relevant can also be extremely valuable to a brand, because it maps unique (but anonymized) users to locations over time.
"We can now begin to understand the physical journey cohorts take before and after visiting our retail stores or restaurants or branches, while also better understanding loyalty (did they visit a competitor as well?) and using actual location information to verify what has typically been self-reported data," says Starner.
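To make the loyalty question in Starner's example concrete, here is a minimal Python sketch. It is an illustration only, not how Hill Holliday or any vendor actually measures this. It assumes a hypothetical table of anonymized location pings (the user IDs, place names and dates are invented) and computes what share of visitors to "our" store were also seen at a competitor, plus the ordered journey of a single anonymized user.

```python
from datetime import date

# Hypothetical anonymized pings, invented for illustration:
# (anonymized user ID, place visited, date of visit)
pings = [
    ("u1", "our_store", date(2016, 3, 1)),
    ("u1", "competitor", date(2016, 3, 4)),
    ("u2", "our_store", date(2016, 3, 2)),
    ("u3", "competitor", date(2016, 3, 2)),
    ("u4", "our_store", date(2016, 3, 5)),
    ("u4", "competitor", date(2016, 3, 6)),
    ("u4", "our_store", date(2016, 3, 9)),
]


def competitor_overlap(rows, ours="our_store", theirs="competitor"):
    """Share of our visitors who were also seen at the competitor."""
    our_visitors = {user for user, place, _ in rows if place == ours}
    their_visitors = {user for user, place, _ in rows if place == theirs}
    if not our_visitors:
        return 0.0
    return len(our_visitors & their_visitors) / len(our_visitors)


def journey(rows, user):
    """Places one anonymized user visited, in date order."""
    user_rows = sorted((r for r in rows if r[0] == user), key=lambda r: r[2])
    return [place for _, place, _ in user_rows]


overlap = competitor_overlap(pings)
print(f"{overlap:.0%} of our visitors also visited the competitor")
print(journey(pings, "u4"))
```

Real location data would be far larger and noisier, and the privacy questions raised above apply to every step. The sketch only shows the shape of the calculation.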
5. Predictive Analytics
When big data first became a challenge for modern businesses, human beings were doing the majority of the number-crunching and analysis that is now automated. The issue was figuring out how to use what we know from the past to fix the problem at hand, or what Dery calls "data analytics triage."
"Today, business problems are more forward looking. They're geared more toward leading indicators," he says. "Preventative health looks to the future and solves for the problem of 'How can we keep this from happening?' or 'How can we know when there are warning signs that something is going to happen?'"
What Dery calls preventative health has recently emerged as a set of tools, strategies and software known as predictive analytics, which the SAS Institute defines as "the use of data, statistical algorithms and machine-learning techniques to identify the likelihood of future outcomes based on historical data."
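As a rough illustration of that definition, the Python sketch below fits a simple logistic regression to a handful of historical records and then estimates the likelihood of a future outcome. Everything in it is an assumption made for the example: the data is invented (store visits last quarter, and whether the customer bought again), and logistic regression is just one common statistical technique, not a method the article's sources describe using.

```python
import math

# Hypothetical historical data, invented for illustration:
# (store visits last quarter, 1 if the customer bought again, else 0)
history = [(0, 0), (1, 0), (2, 0), (3, 0), (3, 1),
           (4, 1), (5, 1), (6, 1), (7, 1)]


def sigmoid(z):
    """Squash any real number into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(data, learning_rate=0.05, epochs=5000):
    """Fit p(y=1 | x) = sigmoid(w * x + b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(epochs):
        # Average gradient of the log loss over the historical records.
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in data) / n
        grad_b = sum((sigmoid(w * x + b) - y) for x, y in data) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def predict_likelihood(model, visits):
    """Estimated probability that a customer with this many visits buys again."""
    w, b = model
    return sigmoid(w * visits + b)


model = fit_logistic(history)
for visits in (1, 3, 6):
    print(visits, round(predict_likelihood(model, visits), 2))
```

The point is the direction of the question: the model looks at what already happened in order to put a number on what is likely to happen next.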
"Predictive analytics will fuel an increase in requests for continuous and future-looking analyses and statistics, versus the historical analyses and reporting that have always been our 'bread and butter,'” Starner says. "Clients will still want to see what happened, but will focus more time and energy on what is about to happen and how we can shape the future."
6. Artificial Intelligence
Centralized data and business intelligence were novel concepts a decade ago. In the same way, expect artificial intelligence (AI) to become commonplace as the field grows, automating some of the functions where data scientists are currently needed.
"Watson (from IBM), in particular, is something I’m really interested in because of its use of AI to conduct analyses," Starner says. "You simply type in a question and Watson mines the data, adds on some math and voila! You have an answer. Personally, I like to see the details of exactly which variables are being considered in the output of an analysis, but this is the type of AI that could really save time for our teams as they run standard queries."