Virtualizing Access to All Data Sources

Below is the continuation of the transcript of a Webinar hosted by InetSoft on the topic of Agile Data Access and Agile Business Intelligence. The presenter is Mark Flaherty, CMO at InetSoft.

Mark Flaherty (MF): This is the anti-silo approach. Clearly, this needs to be a unified and extensible set of applications that build on common, reusable components. Often you are driving toward real-time or near-real-time data access, which calls for a cross-platform, silo-agnostic infrastructure for multi-latency integration: batch, near real time, and real time can all be accommodated as different delivery models within an end-to-end fabric.

Then the semantic abstraction layer and registry are, of course, key to rolling up unified access and unified administration across all of these disparate data repositories throughout your infrastructure. If you build a semantic abstraction layer and registry, think of it in a broader sense: it needs to be a unified infrastructure where you maintain all of the important artifacts, meaning the metadata, the schema definitions, the business rules, the predictive and statistical models, and the full range of report definitions, all of the artifacts that are absolutely essential for maximum reuse across disparate BI efforts.
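To make the registry idea concrete, here is a minimal sketch in Python of a catalog that tracks those artifact types for reuse. The class and field names are invented for illustration and are not InetSoft's API:

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Artifact:
    """One reusable BI artifact tracked by the registry."""
    name: str
    kind: str          # e.g. "schema", "business_rule", "predictive_model", "report"
    source: str        # which repository or system owns it
    definition: dict   # the artifact body: column types, rule logic, model params...

@dataclass
class SemanticRegistry:
    """Unified catalog of artifacts shared across BI efforts."""
    artifacts: Dict[str, Artifact] = field(default_factory=dict)

    def register(self, artifact: Artifact) -> None:
        self.artifacts[artifact.name] = artifact

    def find_by_kind(self, kind: str) -> List[Artifact]:
        return [a for a in self.artifacts.values() if a.kind == kind]

# Registering a schema definition and a business rule for later reuse
registry = SemanticRegistry()
registry.register(Artifact("customer", "schema", "crm_db",
                           {"id": "int", "name": "str", "region": "str"}))
registry.register(Artifact("gold_tier", "business_rule", "sales_ops",
                           {"condition": "annual_revenue > 100000"}))
print([a.name for a in registry.find_by_kind("schema")])  # ['customer']
```

The point of centralizing these definitions is that a dashboard, a report, and a predictive model can all look up the same schema and business rule rather than each redefining them.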

And so when it comes to front-end BI in an agile data access world, the front end won't work radically differently from what end users are used to. You will still be building out reports, dashboards, scorecards, time series analyses, and forecasts, and presenting them with, hopefully, ever more sophisticated visualization and interaction capabilities. But fundamentally those remain the dominant types of applications that you will continue to build out and deliver to support all manner of business rules and processes.

But fundamentally, the back end is all going to change, moving toward an increasingly agile, distributed, federated, cloud-oriented approach to unifying and virtualizing access to all data sources. This is agile data access. It all sounds ambitious, but when you are trying to justify going down this path, where do you start? On what sorts of projects, and with what sorts of requirements?

Well, really, any analytics effort, any BI effort, any reporting initiative where you need rapid, on-demand access to information anywhere, at any time, for any purpose. Agile data access is the enabler of that vision: it lets you rapidly discover any information internal or external to your company, then access, share, invoke, reuse, and publish it. You can aggregate, visualize, cleanse, and build predictive models very quickly, with as little new coding as possible, because much of the underlying functionality has already been built for you.

What Is the Difference Between Data Mashup and Virtualization?

Data Mashup:

Data mashup involves combining data from multiple sources, often disparate ones, to create a unified view or dataset. This process typically involves merging data from various sources such as databases, spreadsheets, web services, APIs, and other data repositories. The goal of data mashup is to create a comprehensive dataset that provides a holistic view of the information.

One common example of data mashup is in business intelligence (BI) applications, where data from different departments or systems within an organization are combined to generate insights and make informed decisions. For instance, combining sales data from a CRM system with financial data from an ERP system to analyze revenue trends.
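As an illustrative sketch of that CRM-plus-ERP example (the table and column names here are hypothetical), a mashup in Python with pandas might merge the two extracts on a shared order key:

```python
import pandas as pd

# Hypothetical extracts: sales orders from a CRM, invoiced revenue from an ERP
crm_sales = pd.DataFrame({
    "order_id": [101, 102, 103],
    "customer": ["Acme", "Globex", "Initech"],
    "amount":   [1200.0, 850.0, 430.0],
})
erp_finance = pd.DataFrame({
    "order_id": [101, 102, 104],
    "invoiced": [1200.0, 800.0, 990.0],
    "quarter":  ["Q1", "Q1", "Q2"],
})

# Physically combine both sources into one unified dataset for analysis
revenue_view = crm_sales.merge(erp_finance, on="order_id", how="outer")
print(revenue_view)
```

The outer join keeps orders that appear in only one system, which is typical when source systems are out of sync.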

Data mashup often involves data transformation and cleansing to ensure consistency and accuracy across the integrated dataset. This may include standardizing data formats, resolving inconsistencies, and handling missing or incomplete data.
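A minimal cleansing sketch, continuing with pandas and invented column names, might standardize text, unify date formats, and fill in missing values:

```python
import pandas as pd

raw = pd.DataFrame({
    "customer":   ["acme ", "ACME", "Globex", None],
    "order_date": ["2023-01-05", "05/01/2023", "2023-02-10", "2023-03-01"],
    "amount":     [1200.0, None, 850.0, 430.0],
})

clean = raw.copy()
clean["customer"] = clean["customer"].str.strip().str.title()       # standardize name casing
clean["order_date"] = pd.to_datetime(clean["order_date"],
                                     format="mixed")                # unify date formats (pandas >= 2.0)
clean["amount"] = clean["amount"].fillna(clean["amount"].median())  # impute missing amounts
clean = clean.dropna(subset=["customer"])                           # drop rows with no customer
```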

Data Virtualization:

Data virtualization, on the other hand, is a technique used to provide a unified and integrated view of data without physically consolidating it into a single repository. Instead of copying or moving data, data virtualization creates a layer of abstraction that allows users to access and query data from multiple sources as if it were stored in a single location.

In data virtualization, data remains in its original source systems, and virtualization software retrieves and presents the data to users in real-time, on-demand, or as needed. This approach provides flexibility and agility, as it allows organizations to access and analyze data from disparate sources without the need for extensive data movement or replication.
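As a minimal sketch of the idea, SQLite's ATTACH statement can stand in for a federation engine: the join below runs across two separate source databases at query time, and no rows are copied into a central store. The file names are hypothetical, and a real virtualization platform federates heterogeneous systems, not just SQLite files:

```python
import sqlite3

# Two independent "source systems", each with its own database
crm = sqlite3.connect("crm.db")
crm.execute("CREATE TABLE IF NOT EXISTS orders (id INTEGER, customer TEXT, amount REAL)")
crm.execute("INSERT INTO orders VALUES (101, 'Acme', 1200.0)")
crm.commit(); crm.close()

erp = sqlite3.connect("erp.db")
erp.execute("CREATE TABLE IF NOT EXISTS invoices (order_id INTEGER, invoiced REAL)")
erp.execute("INSERT INTO invoices VALUES (101, 1150.0)")
erp.commit(); erp.close()

# The virtual layer: attach both sources and query them as one schema.
# Data stays in place; the join is resolved against the sources on demand.
hub = sqlite3.connect(":memory:")
hub.execute("ATTACH DATABASE 'crm.db' AS crm")
hub.execute("ATTACH DATABASE 'erp.db' AS erp")
rows = hub.execute("""
    SELECT o.customer, o.amount, i.invoiced
    FROM crm.orders o JOIN erp.invoices i ON o.id = i.order_id
""").fetchall()
print(rows)  # [('Acme', 1200.0, 1150.0)]
```

Because the query is dispatched to the sources each time it runs, results reflect whatever the source systems currently hold, which is the real-time property described above.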

Data virtualization can be particularly beneficial in scenarios where data is distributed across multiple systems or stored in cloud-based environments. It enables organizations to leverage existing data assets without the complexity and overhead of traditional data integration approaches.

Key Differences:

  1. Data Integration Approach: Data mashup involves physically combining data from multiple sources into a unified dataset, whereas data virtualization creates a virtual layer that allows access to data from multiple sources without physically consolidating it.

  2. Data Movement: Data mashup often requires copying or moving data from source systems to a centralized location, while data virtualization enables access to data in its original location without the need for data movement.

  3. Real-time Access: Data virtualization provides real-time access to data from disparate sources, allowing users to query and analyze the most up-to-date information without delays caused by data replication or synchronization processes.

  4. Flexibility: Data virtualization offers greater flexibility, as it allows organizations to access and analyze data from diverse sources without the constraints of physical data integration. Data mashup, on the other hand, involves creating a static dataset that may require periodic updates to remain relevant.
