Mark Flaherty: So when I look at analytics, and I hear what people are talking about, I think there are two strands. Sometimes people are talking about the first wave of analytics, which is exploration and analysis. And sometimes people are talking about the second wave of analytics, which is prediction and optimization. And these two are different, as you can imagine and as we have discussed already.
So with exploration and analysis we are really navigating through historical sets of data. It’s much more of a top-down type of analysis where we see the results. And then we have a hypothesis, a mental model perhaps, of what caused that result, and then we go explore the data, drilling down through hierarchies and across dimensions to find the root cause. So in that respect it is a very deductive type of reasoning that we employ in that type of analysis. And typically we are using query tools and OLAP tools to do that type of analysis.
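To make the drill-down idea concrete, here is a minimal sketch in plain Python rather than a real OLAP tool. The `SALES` records, the region and product names, and the helper functions `roll_up` and `drill_down` are all hypothetical, invented only for illustration. We start from a summarized result (revenue by region), notice the weak region, and then drill into that region by product to look for the cause.

```python
from collections import defaultdict

# Hypothetical transaction records: (region, product, revenue).
SALES = [
    ("East", "Widgets", 120.0),
    ("East", "Gadgets", 80.0),
    ("West", "Widgets", 40.0),
    ("West", "Gadgets", 30.0),
    ("West", "Gizmos", 10.0),
]


def roll_up(rows, key_index):
    """Sum revenue grouped by the column at key_index."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key_index]] += row[2]
    return dict(totals)


def drill_down(rows, region):
    """Restrict to one region, then break its revenue out by product."""
    subset = [row for row in rows if row[0] == region]
    return roll_up(subset, 1)


# Top-down: look at the summarized result first.
by_region = roll_up(SALES, 0)

# Hypothesis: the weakest region explains the shortfall, so drill into it.
weakest = min(by_region, key=by_region.get)
detail = drill_down(SALES, weakest)

print(by_region)
print(weakest, detail)
```

The analyst, not the data, decides where to look next, which is what makes this style of analysis hypothesis-driven.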
With prediction and optimization the process is almost reversed. It’s much more automated. We typically start with the data itself, and let the data, for the most part, tell us what patterns, relationships, and other valuable things to know are in it.
So in that respect, prediction and optimization is much more inductive, and it uses modeling tools and optimization tools to do that. In reality nothing is truly inductive. Even when creating a predictive model, you need to start with a hypothesis of what you want to find, in order for the model to be somewhat effective and come up with some reasonable patterns that reflect reality.
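As a rough illustration of letting the data reveal the pattern, the sketch below fits a one-level decision tree (often called a decision stump) in plain Python. The `CUSTOMERS` records, the feature names, and the functions `errors_for_split` and `fit_stump` are hypothetical and made up for this example. Note that the hypothesis has not gone away: we still had to decide up front that churn is the thing we want to predict.

```python
# Hypothetical customer records: (monthly_spend, support_calls, churned).
CUSTOMERS = [
    (20.0, 5, True),
    (25.0, 4, True),
    (30.0, 6, True),
    (60.0, 1, False),
    (70.0, 0, False),
    (80.0, 2, False),
    (65.0, 5, True),
    (35.0, 1, False),
]
FEATURES = ["monthly_spend", "support_calls"]


def errors_for_split(rows, feature, threshold):
    """Count misclassifications for the rule 'churn if value > threshold'
    and for the reverse rule, and return whichever fits better."""
    wrong_if_above = sum((row[feature] > threshold) != row[2] for row in rows)
    wrong_if_below = len(rows) - wrong_if_above
    if wrong_if_above <= wrong_if_below:
        return wrong_if_above, "above"
    return wrong_if_below, "below"


def fit_stump(rows):
    """Try every feature and every observed value as a threshold,
    and keep the split with the fewest errors."""
    best = None
    for feature, name in enumerate(FEATURES):
        for threshold in sorted({row[feature] for row in rows}):
            errors, direction = errors_for_split(rows, feature, threshold)
            if best is None or errors < best[3]:
                best = (name, threshold, direction, errors)
    return best


# (feature name, threshold, which side predicts churn, training errors)
stump = fit_stump(CUSTOMERS)
print(stump)
```

Nobody told the search which feature matters. It tried every candidate split and reported the one that best separates the churners in this toy data.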
Today we are going to talk about exploration and analysis. Later we will focus on prediction and optimization. We also have to ask why analytics is hot now. Why are you on this webinar? Why are people going to these seminars that we are doing about analytics? Well, I think there are four reasons, and in the live seminars people have added a couple more. Number one is that we have matured in the BI and data warehousing industry.
It has taken 15 years of people slogging in the trenches to try to figure out how to build these data repositories. They pull information from across the organization from a variety of systems. But those systems store the data in different formats and at different levels of granularity, and the data has lots of errors in it. That makes business analytics difficult. And then there is the question of how to deploy BI tools on top of that so that they are usable by various groups of users.
It has taken us a long time to get our hands around just delivering basic interactive reports. But I think we are at the point now in the industry where we are ready for something more. We have built these rather large repositories, and we want to ensure that we deliver as much value to the business through them as possible.
Second, as time goes on we have accumulated lots and lots of data, years’ worth of detailed, granular transaction data in most companies. And this data is exploding. It’s growing at ever-accelerating rates as we add more detail over more years. And we add stuff like unstructured data: text, images, and audio. We add sensor data from RFID chips, GPS signal data, smart metering data, and the list goes on and on of information that companies now want to capture, store, and analyze. So that’s what we call Big Data.
And third, we have the analytical techniques to analyze Big Data. We call them machine-learning algorithms. Lots of these, such as neural networks and decision trees, were created explicitly to build models against large volumes of data. If you have smaller volumes of data, it’s easy to tease out the patterns. You can almost do that manually or visually. Or you can use an OLAP tool to figure out what some of the patterns are. These techniques, as I said earlier, have been around for a long time, but they have been perfected over the years.
And finally, fourth, we now have the horsepower to run these deep analytics against this Big Data in the form of massively parallel processing machines and common databases, where the price/performance ratio of CPUs, memory, and disk storage has just gone through the roof, making things possible in a cost-effective way that we couldn’t even dream of doing a few years ago. So this is why I think analytics is hot now. Folks from our live seminar series added that people want to use analytics as a competitive weapon. And I think they have probably read or heard about books like “Competing on Analytics.”