Eric Kavanagh: That’s good, so we won’t lose our jobs anytime soon.
Jim Ericson: Less and less likely, I am thinking.
Eric Kavanagh: Yeah, that’s it. What are some of the biggest mistakes you’ve seen people make with data visualization, such as drawing false conclusions or focusing on false positives? What are some of the errors to avoid?
Mark Flaherty: Definitely forgetting that correlation is not causation. Just because we see a visual trend of all the dots going in one line doesn’t mean that whatever we plotted them against is actually driving the Y axis.
So, you do have to understand the logic and see if it’s really possible. You need to do things like regression analysis, where you bring in multiple variables and actually test: did they really drive that Y axis, statistically?
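The idea of bringing in multiple variables and actually testing can be sketched in Python. This is a minimal illustration on synthetic data, not a method described in the conversation: the hidden driver `z`, the helper names `pearson` and `residuals`, and all the numbers are assumptions made for the example. Two series `x` and `y` are both driven by `z`, so they line up nicely on a scatter plot, yet once `z` is accounted for the relationship between them largely disappears.

```python
import random

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / (var_a * var_b) ** 0.5

def residuals(target, control):
    """What remains of `target` after a least-squares line fit on `control`."""
    n = len(target)
    mean_t = sum(target) / n
    mean_c = sum(control) / n
    slope = (sum((c - mean_c) * (t - mean_t) for c, t in zip(control, target))
             / sum((c - mean_c) ** 2 for c in control))
    intercept = mean_t - slope * mean_c
    return [t - (intercept + slope * c) for t, c in zip(target, control)]

# Synthetic data (assumption): z drives both x and y; x does not drive y.
rng = random.Random(0)
n = 5000
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + 0.5 * rng.gauss(0, 1) for zi in z]
y = [zi + 0.5 * rng.gauss(0, 1) for zi in z]

raw_corr = pearson(x, y)                                  # what the chart suggests
partial_corr = pearson(residuals(x, z), residuals(y, z))  # after controlling for z

print(f"correlation of x and y:           {raw_corr:.2f}")
print(f"correlation after controlling z:  {partial_corr:.2f}")
```

The first number is high and the second is close to zero, which is the pattern a third variable driving both axes would be expected to produce. With real data the candidate third variables have to come from understanding the business logic, and a full regression with several variables would be the more usual tool.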
Eric Kavanagh: Yeah, and obviously we have to watch whether the data is clean. If there is some problem with the data, you’re going to get some wacky visualizations, so you always have to apply the reality check, right?
Mark Flaherty: Correct.
Eric Kavanagh: And then, I am sure, you also look at the raw data to see some patterns. Then, the next step is probably, “Okay, let me get underneath this and take a look at the data sets.” That’s when you can figure out that some field was askew or, you know, something else happened along the way to render nonsense.
I was working just yesterday and something happened with the formula, because the little graph couldn’t rationalize itself. It just kept dancing around on the screen. I was like, “Mm-hmm, I think there is a problem with data quality in there somewhere.”
So you do have to watch out for just basic mistakes and basic errors to make sure that what you’re seeing is really what you’re getting, right?
Mark Flaherty: Yes, certainly for any new data set that you bring in. You are always going to go through that first step of trying to make sure it’s clean.
Where the software works best is in helping you profile the data so you can see it. Say you plot some data as points, and some of them are negative when the rest are positive. That probably shows you that there is some data input error. It’s a common problem.
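The profiling check described here, spotting a few negative values among otherwise positive ones, can be sketched in a few lines of Python. This is a hedged illustration only: the function name `flag_sign_outliers` and the sample `order_amounts` values are hypothetical and are not taken from any product discussed in the conversation.

```python
def flag_sign_outliers(values):
    """Return indices of values whose sign disagrees with the majority sign."""
    positives = [i for i, v in enumerate(values) if v > 0]
    negatives = [i for i, v in enumerate(values) if v < 0]
    # With no clear majority there is nothing to flag.
    if len(positives) == len(negatives):
        return []
    # The smaller group is treated as suspect.
    return negatives if len(negatives) < len(positives) else positives

# Hypothetical column of order amounts with two suspicious entries.
order_amounts = [120.0, 89.5, -45.0, 210.0, 15.25, -3.0, 99.0]
suspect = flag_sign_outliers(order_amounts)
print(suspect)  # [2, 5]
```

The flagged positions are candidates for review, not confirmed errors. Whether a negative value is a typo or something legitimate still needs the reality check against the source data.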