InetSoft's BI software employs a combination of in-memory database technology and disk-based access to scale up for big data applications using commodity-priced servers. InetSoft's proprietary term for this approach is 'Data Grid Cache.'
Optimized, compressed indexes are loaded into memory while the data can either remain on the hard disk or be loaded in chunks into memory based on available memory and the data needed for a given dashboard or visualization.
The data can either be accessed in real-time from a data warehouse, operational data store, or a mashup of several sources, or it can be configured to be cached on disk by InetSoft's Style Intelligence server application at specified time intervals. Incremental updates can be added on a scheduled basis or on demand, and cache re-writes can be scheduled for off-peak times.
This approach offers maximum flexibility and leaves the choice up to the enterprise. Data timeliness and performance requirements vary from case to case, and InetSoft provides the agility to handle them all.

Data Grid Cache is InetSoft's proprietary data querying technology. It is a columnar data store that uses in-memory technology to enable highly scalable big data analytics on a real-time map-reduce (Hadoop-like) architecture.
In other words, it is a data accelerator: it stores data in columnar format and uses in-memory technology to speed up processing, while the map-reduce data cluster enables virtually unlimited scalability.
The data grid cache is deployed optionally, when performance requirements call for it, whether to support big data, massive concurrency, or high reliability, or to avoid overtaxing the operational data stores.

A data grid cache intelligently queries and saves the data necessary to support a given dashboard or data visualization, including all filtering and drill-down levels. When a cluster of commodity-priced servers runs a data grid cache, the data file is automatically split into chunks and distributed to the nodes.
One data chunk may be copied to multiple nodes, so the cluster can continue operating in most cases even if some nodes fail. On each node, the dimension columns are always loaded into memory when a query executes, while measure columns are loaded into memory in chunks.
If available memory allows, the entire measure column is loaded into memory as well. The advantage of not always loading measures into memory is greater speed for most interactive operations. Often a user's interaction initiates a query that filters the rows and calculates the measure on only a small subset of them. Having the dimensions in memory allows the filtering to be done quickly; after the filtering, only the rows needed for the final calculation are loaded into memory.
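The filter-then-load strategy described above can be illustrated with a minimal sketch. The `ColumnarChunk` class below is hypothetical, not InetSoft's actual implementation: dimension columns are held in memory for fast filtering, and only the matching measure rows are then fetched for aggregation.

```python
from typing import Callable

# Hypothetical columnar store for illustration: dimension columns stay
# memory-resident, while the measure dict stands in for on-disk columns
# that are read only for rows surviving the filter.
class ColumnarChunk:
    def __init__(self, dimensions: dict, measures_on_disk: dict):
        self.dimensions = dimensions        # always in memory
        self._measures = measures_on_disk   # stand-in for on-disk storage

    def query(self, dim: str, predicate: Callable, measure: str) -> float:
        # Step 1: filter rows using only the in-memory dimension column.
        rows = [i for i, v in enumerate(self.dimensions[dim]) if predicate(v)]
        # Step 2: "load" just the matching measure values, then aggregate.
        return sum(self._measures[measure][i] for i in rows)

chunk = ColumnarChunk(
    dimensions={"region": ["east", "west", "east", "south"]},
    measures_on_disk={"sales": [100.0, 250.0, 75.0, 40.0]},
)
print(chunk.query("region", lambda r: r == "east", "sales"))  # → 175.0
```

The key point is that the measure column is never scanned in full: only the two "east" rows are touched, which is why keeping dimensions memory-resident pays off for interactive filtering.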
If there is enough memory available to hold the entire data grid cache, it is loaded fully into memory for processing.

A pure in-memory BI solution limits the size of the database you can analyze or report on to the size of memory in the BI server. Yes, memory costs keep coming down, but terabyte-memory machines still cost multiples of gigabyte-memory machines. The data grid cache suffers from no such limit and can scale up as performance demands, usage concurrency, and database size grow, simply by adding memory and/or commodity-priced servers to a cluster.
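The chunk splitting and replication described earlier can be sketched as follows. The round-robin placement scheme is an assumption for illustration, not InetSoft's actual algorithm; the point is that placing each chunk on more than one node lets the cluster survive a node failure.

```python
# Illustrative sketch (assumed placement scheme): split data into chunks
# and put each chunk on `copies` distinct nodes so that losing one node
# does not lose any chunk.
def distribute(chunks: list, nodes: list, copies: int = 2) -> dict:
    placement = {node: [] for node in nodes}
    for i, chunk in enumerate(chunks):
        for r in range(copies):
            # Round-robin primary placement, replicas offset by r.
            node = nodes[(i + r) % len(nodes)]
            placement[node].append(chunk)
    return placement

layout = distribute(["c0", "c1", "c2", "c3"], ["nodeA", "nodeB", "nodeC"])
# Every chunk now lives on two of the three nodes, so if any single
# node fails, each chunk is still reachable on a surviving node.
```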
Therefore, a pure in-memory approach is not the best option for scalable, multi-user BI applications. 64-bit computing on column-based technologies can provide a better alternative for hefty OLAP projects.

Data Grid Cache technology, as implemented by InetSoft in its StyleBI platform, is often highlighted as a key feature for enabling scalable, high-performance business intelligence (BI) solutions, particularly for big data applications. However, the question of whether this technology is unique to InetSoft requires a closer examination of the broader landscape of in-memory data management, caching mechanisms, and distributed computing. This article explores the concept of Data Grid Cache, its implementation in StyleBI, and whether it represents a truly unique innovation or a specialized adaptation of existing paradigms.
InetSoft describes its Data Grid Cache as a proprietary data querying technology designed to enable highly scalable big data analytics. It operates as a columnar data store that leverages in-memory technology to accelerate data processing, drawing inspiration from Hadoop and MapReduce architectures. The system uses optimized, compressed indexes loaded into memory, while data can either remain on disk or be loaded in chunks based on available memory and query requirements. This approach allows StyleBI to handle large datasets—potentially terabytes—while maintaining low-latency query performance, supporting massive concurrency, and avoiding overtaxing operational data stores.
The technology supports real-time data access from multiple sources, such as data warehouses or operational databases, and can cache data on disk at specified intervals with incremental updates. By distributing processing across a cluster of commodity servers, it achieves scalability and fault tolerance, making it suitable for big data environments where volume, velocity, and variety are critical challenges.
While InetSoft brands its Data Grid Cache as a proprietary solution, the underlying principles—columnar storage, in-memory processing, and distributed computing—are not unique to InetSoft. These concepts are rooted in the broader domain of in-memory data grids (IMDGs) and distributed caching technologies, which have been developed and implemented by various vendors and open-source projects since the early 2000s. To determine the uniqueness of InetSoft’s approach, it’s essential to compare it with similar technologies.
In-memory data grids (IMDGs) are distributed computing systems that pool the RAM of multiple servers to store and process data, providing low-latency access and high scalability. Popular IMDGs, such as Apache Ignite, Hazelcast, Oracle Coherence, and Red Hat JBoss Data Grid, share several characteristics with InetSoft’s Data Grid Cache. For instance, they use in-memory storage to accelerate data access, support distributed processing across nodes, and often incorporate features like data partitioning, replication, and co-located computation to minimize network latency.
Like InetSoft’s Data Grid Cache, IMDGs can operate as a caching layer on top of databases, using read-through/write-through patterns to synchronize data with underlying stores. They also support scalability by adding nodes to a cluster and can handle real-time analytics, transactional processing, and large-scale data workloads. For example, Apache Ignite enables co-location of data and computations, similar to how InetSoft’s solution distributes queries across a cluster for parallel execution.
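The read-through pattern mentioned above can be sketched generically. The `ReadThroughCache` class is hypothetical (real IMDGs such as Apache Ignite expose analogous loader hooks): on a cache miss, the cache itself fetches from the backing store, remembers the result, and serves subsequent reads from memory.

```python
# Minimal read-through cache sketch (generic pattern, not any specific
# product's API): the cache is configured with a loader function and
# consults the backing store only on a miss.
class ReadThroughCache:
    def __init__(self, loader):
        self._loader = loader   # e.g., a function issuing a database query
        self._store = {}

    def get(self, key):
        if key not in self._store:                 # cache miss
            self._store[key] = self._loader(key)   # read through to the store
        return self._store[key]

db = {"order:1": {"total": 99.5}}
cache = ReadThroughCache(lambda k: db[k])
first = cache.get("order:1")    # miss: loads from the backing store
db["order:1"] = {"total": 0.0}  # the backing store changes...
second = cache.get("order:1")   # ...but the cached copy is served
```

This also shows why such caches need refresh or write-through policies: once loaded, the cached copy can drift from the underlying store until it is invalidated or rewritten.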
However, there are distinctions. InetSoft’s Data Grid Cache is tightly integrated into its StyleBI platform, optimized specifically for BI tasks like data visualization, dashboarding, and ad-hoc reporting. This focus on BI differentiates it from general-purpose IMDGs, which are often designed for broader use cases, such as caching session data, real-time analytics in financial systems, or IoT data processing. InetSoft’s implementation emphasizes a columnar data store and a MapReduce-inspired approach tailored for analytics, which may provide specific optimizations for query performance in BI workflows.
Distributed caching, as seen in tools like Redis, Memcached, and EHCache, also shares similarities with Data Grid Cache. These systems store frequently accessed data in memory across multiple nodes to reduce database load and improve application performance. However, distributed caches typically focus on simple key-value storage and lack the advanced computational capabilities of IMDGs or InetSoft’s solution, such as distributed SQL queries or co-located processing.
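The limitation described above can be shown with plain data structures (illustrative only; no real client libraries are used): a key-value cache returns stored values as-is, so any aggregation happens client-side after every value crosses the network.

```python
# A key-value cache can only hand back stored blobs; computing a total
# means fetching every matching entry to the client first.
kv_cache = {"sales:east": 175.0, "sales:west": 250.0, "sales:south": 40.0}

# Client-side aggregation: all three values leave the cache before
# the sum is computed.
total = sum(v for k, v in kv_cache.items() if k.startswith("sales:"))
print(total)  # → 465.0
```

An IMDG-style grid, by contrast, can run the filter-and-sum where the data lives and return only the scalar result, which is the compute capability the next paragraph attributes to InetSoft's Data Grid Cache.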
InetSoft’s Data Grid Cache goes beyond basic caching by incorporating compute capabilities, allowing it to process complex queries and aggregations directly on the cached data. This aligns it more closely with IMDGs than traditional distributed caches. However, the concept of combining caching with distributed processing is not exclusive to InetSoft, as IMDGs like GridGain and ScaleOut StateServer also provide these features.
While the foundational technologies are not unique, InetSoft’s Data Grid Cache has distinctive features that enhance its suitability for BI. The platform’s drag-and-drop interface, backed by a sophisticated data mashup engine, allows seamless integration of disparate data sources without extensive ETL processes. This user-friendly approach, combined with the ability to handle massive datasets (e.g., 67 million rows on a laptop, as cited in InetSoft’s webinars), makes it accessible to business users while retaining the scalability needed for technical environments.
Additionally, InetSoft’s solution is designed to work with commodity hardware, reducing costs compared to pure in-memory solutions that require expensive, high-RAM servers. The hybrid approach—using in-memory indexes and disk-based storage—provides flexibility, allowing organizations to balance performance and cost based on their infrastructure. This hybrid model, while not unique (as seen in some IMDGs), is optimized for BI workflows, enabling real-time analytics and visualizations without the memory limitations of pure in-memory databases.
The evolution of in-memory computing, driven by falling DRAM prices and the rise of big data, has led to widespread adoption of IMDGs and distributed caches. Since the early 2000s, solutions like Oracle Coherence (2001) and Apache Ignite (2014) have established the paradigm of distributed, in-memory data processing. These tools, along with others like Hazelcast and Infinispan, offer robust alternatives to InetSoft’s Data Grid Cache, with features like ACID transactions, distributed SQL, and fault tolerance.
Unlike general-purpose IMDGs, InetSoft’s Data Grid Cache is purpose-built for StyleBI, providing a seamless integration that enhances the platform’s visualization and reporting capabilities. While other BI platforms, such as Tableau or Power BI, rely on in-memory engines or external databases, they typically do not emphasize a distributed, MapReduce-inspired caching layer as a core component. This focus on BI-specific optimizations gives InetSoft an edge in certain use cases, though the core technology draws heavily on established paradigms.