InetSoft Webinar: Healthcare Analytics in the Cloud
Below is the continuation of the transcript of a Webinar hosted by InetSoft on the topic of Machine Learning Big Data Analytics in Healthcare. The presenter is Abhishek Gupta, Product Manager at InetSoft, and the guest is Jim Reynolds, CTO at Health Analytica.
Abhishek: Now Jim, there are also sensitive data and privacy issues that impact healthcare analytics in the cloud. There are regulations and potential audits involved. How do you manage to protect the data even as you go through all of these cleansing and joining steps across different formats, types, and even sources of data?
Jim: So there's actually lots of encryption involved at various places along the pipeline, and we keep the data in our archives in an encrypted fashion. When we move data from one part of the pipeline to another, we keep control of the environment by having really good controls on each of the stages.
This is where Vertica actually helps us out quite a bit, because we can go in and assign roles and put protections in place. That was one of the things we were looking for in a data store: some ability to have controls over the data.
So as the data moves along the pipeline we keep these controls in place. Everywhere we have an attack surface, we have to keep the data protected either by restricting network access or by encryption, and the infrastructure that we build has to deal with that.
Abhishek: Okay, you mentioned Hewlett-Packard Enterprise Vertica as part of your overall solution. Tell us a little bit about what you're using Vertica for specifically, and perhaps tell us about the journey that led you to Vertica.
Jim: Sure, so Vertica is a very fast, very easy to manage, and very cost-effective column store for our problem. The reason that's an important idea is that traditional relational databases work really well when you're dealing with things in a row-wise fashion. They're really good for online transaction processing, with updating your bank account as the classic example, where you want all of that data to be transactionally secure and consistent.
But when you're dealing with large-scale analytics you really need two things out of the environment: the ability to move analytics to the data in a cost-effective fashion, and the ability to do data scans really fast. That was the big breakthrough with column stores. Vertica has been a very solid platform for us on which we can build highly embeddable analytics, and it allows us to perform lots of complicated analytics in the database while also satisfying a lot of the interactive analytics use cases that we have for our customers.
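As a rough illustration of the row store versus column store distinction Jim describes, here is a minimal Python sketch. It is not Vertica's implementation or API; the record fields (`patient_id`, `provider`, `cost`), the sample values, and the function names are all hypothetical and exist only to show why a metric scan touches less data when each column is stored contiguously.

```python
from array import array

# Hypothetical claim records; field names and values are illustrative only.
rows = [
    {"patient_id": 1, "provider": "A", "cost": 120.0},
    {"patient_id": 2, "provider": "B", "cost": 80.0},
    {"patient_id": 3, "provider": "A", "cost": 200.0},
]


def total_cost_row_store(records):
    """Row-wise layout: every whole record is visited to read one field."""
    return sum(record["cost"] for record in records)


# Column-wise layout: each field is stored contiguously on its own.
columns = {
    "patient_id": array("q", (r["patient_id"] for r in rows)),
    "provider": [r["provider"] for r in rows],
    "cost": array("d", (r["cost"] for r in rows)),
}


def total_cost_column_store(cols):
    """Column-wise layout: only the 'cost' column is scanned."""
    return sum(cols["cost"])
```

Both functions return the same total. The difference is in how much data has to be read: the row-wise version walks every full record, while the column-wise version reads a single contiguous column, which is the access pattern that metric calculations over large tables rely on.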
Abhishek: Just so I understand, where does Vertica fit in your machine learning analytics solution? Are you using it for data acquisition and management and also for analytics, or for one or the other? How do your solution and platform relate to the Vertica technology?
Jim: Right, so we divide our data analytics pipeline into three large segments. One is our data curation and staging, which is also where we perform the activities of publishing, and we use Vertica both for the staging and the publishing.
Then we have a next stage of our pipeline which is our large scale compute, and in that stage we also use Vertica because we do a lot of metric scans, metric calculations, and that requires a lot of column scans, and so for our compute environment we use Vertica for that.
But it also does really well for interactive data filtering and interactive metric calculations, and so for our user presentation layer we also use Vertica for that, and it serves all of those parts of our pipeline quite well.