Virtualized Data Warehousing

Below is the continuation of the transcript of a Webinar hosted by InetSoft on the topic of Agile Data Access and Agile Business Intelligence. The presenter is Mark Flaherty, CMO at InetSoft.

Mark Flaherty (MF): And that’s where you need to think through your metadata and semantic abstraction layers. What do you roll out that can unify the metadata for structured, semi-structured, and unstructured information? With that in place you can, for example, access data not just at the record level. If the sources happen to be text, how can you access them through semantic approaches that let you determine the concepts and taxonomies, and then search and manipulate the information using that sort of metadata?
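
As a rough illustration of the unified metadata idea, here is a minimal Python sketch of a catalog that tags structured, semi-structured, and unstructured sources with shared concepts so they can be found the same way. The class names, source names, locations, and concept tags are all hypothetical and are not taken from InetSoft's product.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CatalogEntry:
    """One source described by the shared metadata layer (hypothetical shape)."""
    name: str
    kind: str       # "structured", "semi-structured", or "unstructured"
    location: str   # illustrative locator only, not a real connection string
    concepts: frozenset = field(default_factory=frozenset)


class MetadataCatalog:
    """Registers sources of any format and looks them up by business concept."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        self._entries[entry.name] = entry

    def find_by_concept(self, concept):
        # Return source names, sorted, whose concept tags include `concept`.
        return sorted(
            e.name for e in self._entries.values() if concept in e.concepts
        )


catalog = MetadataCatalog()
catalog.register(CatalogEntry(
    "crm_accounts", "structured", "db://crm/accounts",
    frozenset({"customer", "account"})))
catalog.register(CatalogEntry(
    "support_tickets", "semi-structured", "file://tickets.json",
    frozenset({"customer", "complaint"})))
catalog.register(CatalogEntry(
    "call_transcripts", "unstructured", "file://transcripts/",
    frozenset({"customer", "sentiment"})))
catalog.register(CatalogEntry(
    "gl_ledger", "structured", "db://finance/ledger",
    frozenset({"revenue"})))

# One concept search spans records, JSON documents, and free text alike.
customer_sources = catalog.find_by_concept("customer")
print(customer_sources)
```

In a real deployment the concept tags for text sources would come from some taxonomy or text-analysis step; here they are simply assigned by hand.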

So in order to start, you need to think through the semantic abstraction and the approaches that help you bring this all together. And in terms of normalizing it and reusing it in a broad range of applications like BI, you need to think through the policies and rules governing how this information, in its various formats and schemas, can be used: what policies and rules are relevant to orchestrating the integrations, the transformations, the calculations, and so forth.

You need to think through the discrete rules that will govern the ongoing reuse of these data blocks throughout your architecture. Think through the requirements of your customer datasets and customer-facing applications for keeping track of who the customers are and presenting a 360-degree view of them to drive and feed into customer service, sales, marketing, and the like.

Agile Data Access Strategy

So as you are working through your agile data access strategy, you need to organize it and think it through by the subject areas of the underlying data sets, applications, and business processes. Think it through in roughly the same way you traditionally would in building out a multi-domain, subject-oriented data warehousing environment. Think of it as almost virtualized data warehousing, and I am using that term in the loosest sense.

Building a multi-domain subject-oriented data warehousing environment involves several key steps to ensure that the data warehouse meets the diverse analytical needs of an organization across different domains. Here are the steps involved in constructing such a data warehousing environment:

  1. Requirement Analysis:
    • The first step is to conduct a comprehensive analysis of the organization's business requirements and data needs across multiple domains or business areas. This involves gathering input from various stakeholders to identify the key subject areas, data sources, and analytical requirements that the data warehouse must support.
  2. Data Modeling:
    • Once the requirements are understood, the next step is to design a flexible and scalable data model that can accommodate data from multiple domains while maintaining a subject-oriented approach. This involves defining dimensional models or star schemas for each subject area, identifying the key dimensions and facts, and establishing relationships between them.
  3. Data Integration:
    • Data integration is a critical step in building a multi-domain data warehousing environment. It involves extracting data from disparate sources across different domains, transforming it into a consistent format, and loading it into the data warehouse. This may require the use of ETL (Extract, Transform, Load) tools to cleanse, enrich, and harmonize data from diverse sources before loading it into the warehouse.
  4. Domain-Specific Data Marts:
    • To meet the specific analytical needs of different domains or business areas, it is often advisable to create domain-specific data marts within the data warehouse. These data marts are subsets of the overall data warehouse that focus on specific subject areas or domains, such as sales, marketing, finance, or operations. Each data mart is optimized for querying and reporting on data relevant to its respective domain.
  5. Metadata Management:
    • Effective metadata management is essential for maintaining data consistency, lineage, and governance in a multi-domain data warehousing environment. This involves documenting the structure, meaning, and relationships of data elements within the data warehouse, as well as tracking the lineage of data from source systems to the warehouse. Metadata repositories and data catalog tools can help manage and govern metadata effectively.
  6. Security and Access Control:
    • Security is paramount in a multi-domain data warehousing environment, especially when dealing with sensitive or confidential data across different domains. Access control mechanisms should be implemented to ensure that users only have access to the data they are authorized to view, based on their role and permissions. This may involve implementing role-based access control (RBAC), encryption, and auditing mechanisms to protect data privacy and integrity.
  7. Performance Optimization:
    • Finally, performance optimization is crucial to ensure that the data warehouse can handle the analytical workload across multiple domains efficiently. This may involve indexing key columns, partitioning large tables, optimizing query execution plans, and implementing caching mechanisms to improve query performance and response times.
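
Several of the steps above can be sketched in a few lines of Python using the standard-library `sqlite3` module: a small star schema with one dimension shared by two subject areas (step 2), a simple cleanse-and-load pass (step 3), a domain-specific mart exposed as a view (step 4), a role check (step 6), and an index on the join key (step 7). This is a toy sketch under stated assumptions, not a production design; every table name, role name, and data row is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Step 2: one customer dimension shared by two subject-area fact tables.
# Step 4: a sales mart as a view. Step 7: an index on the join key.
conn.executescript("""
CREATE TABLE dim_customer (
    customer_key  INTEGER PRIMARY KEY,
    customer_name TEXT NOT NULL,
    region        TEXT NOT NULL
);
CREATE TABLE fact_sales (
    customer_key INTEGER REFERENCES dim_customer(customer_key),
    amount       REAL NOT NULL
);
CREATE TABLE fact_support (
    customer_key INTEGER REFERENCES dim_customer(customer_key),
    tickets      INTEGER NOT NULL
);
CREATE INDEX idx_sales_customer ON fact_sales(customer_key);
CREATE VIEW mart_sales_by_region AS
    SELECT d.region, SUM(f.amount) AS total_amount
    FROM fact_sales f JOIN dim_customer d USING (customer_key)
    GROUP BY d.region;
""")

# Step 3: extract (hard-coded rows stand in for source systems),
# transform (trim, normalize case, cast types), then load.
raw_customers = [(1, " acme corp ", "east"), (2, "Globex", "WEST")]
raw_sales = [(1, "100.50"), (2, "200"), (1, "49.50")]
raw_support = [(1, 3), (2, 1)]

customers = [(k, n.strip().title(), r.strip().upper())
             for k, n, r in raw_customers]
sales = [(k, float(a)) for k, a in raw_sales]

conn.executemany("INSERT INTO dim_customer VALUES (?, ?, ?)", customers)
conn.executemany("INSERT INTO fact_sales VALUES (?, ?)", sales)
conn.executemany("INSERT INTO fact_support VALUES (?, ?)", raw_support)
conn.commit()

# Step 6: a minimal role-based check in front of the marts.
ROLE_MARTS = {"sales_analyst": {"mart_sales_by_region"}}


def query_mart(role, mart):
    """Return all rows of `mart` if `role` is allowed to read it."""
    if mart not in ROLE_MARTS.get(role, set()):
        raise PermissionError(f"role {role!r} may not read {mart!r}")
    # Safe to interpolate: `mart` was just checked against a fixed allowlist.
    return conn.execute(f"SELECT * FROM {mart}").fetchall()


sales_by_region = dict(query_mart("sales_analyst", "mart_sales_by_region"))
print(sales_by_region)
```

Because both fact tables reference the same `dim_customer` rows, a support mart built later would report on exactly the same customers and regions as the sales mart.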

It’s very important as you build out these subject areas that all of them leverage common data, common schemas, common hierarchies, common calculations, and so forth, so that none of these disparate subject areas become silos. They all leverage a common pool of agile data access artifacts and models, and fundamentally they all leverage a common set of source connectors and source applications. That is how you move towards agile data access.
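
One way to picture a common pool of calculations and hierarchies is a small shared registry that each subject area references by name instead of redefining. The Python sketch below is only an illustration; the measure name, the hierarchy, and the `SubjectArea` class are hypothetical.

```python
from dataclasses import dataclass

# Shared artifacts, defined once and reused by every subject area.
SHARED_CALCULATIONS = {
    "gross_margin_pct": lambda row: round(
        100 * (row["revenue"] - row["cost"]) / row["revenue"], 2),
}
SHARED_HIERARCHIES = {
    "geography": ["region", "country", "city"],
}


@dataclass
class SubjectArea:
    """A subject area that points at shared artifacts by name."""
    name: str
    measures: list
    hierarchy: str

    def evaluate(self, row):
        # Look up each measure in the shared pool; no local redefinition.
        return {m: SHARED_CALCULATIONS[m](row) for m in self.measures}

    def drill_path(self):
        return SHARED_HIERARCHIES[self.hierarchy]


sales = SubjectArea("sales", ["gross_margin_pct"], "geography")
finance = SubjectArea("finance", ["gross_margin_pct"], "geography")

row = {"revenue": 200.0, "cost": 150.0}
sales_result = sales.evaluate(row)
finance_result = finance.evaluate(row)
print(sales_result, finance_result)
```

Since both subject areas resolve `gross_margin_pct` from the same definition, a change to the formula shows up everywhere at once, which is the opposite of each area carrying its own copy.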
