InetSoft's BI software employs a combination of in-memory database technology and disk-based access to scale up for big data applications using commodity-priced servers. InetSoft's proprietary term for this approach is 'Data Grid Cache.'
Optimized, compressed indexes are loaded into memory, while the underlying data either remains on disk or is loaded into memory in chunks, depending on available memory and on the data needed for a given dashboard or visualization.
The data can either be accessed in real time from a data warehouse, operational data store, or a mashup of several sources, or it can be cached on disk by InetSoft's Style Intelligence server application at specified time intervals. Incremental updates can be applied on a scheduled basis or on demand, and cache rewrites can be scheduled for off-peak times.
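To make the caching idea concrete, the Python sketch below shows one way a disk cache with scheduled incremental updates could be organized. The DiskCache class, the file layout, and the refresh_loop function are hypothetical illustrations of the pattern, not InetSoft's actual API.

    # Illustrative sketch only: the names here are hypothetical, not InetSoft's API.
    # It shows the general pattern of caching query results on disk, appending
    # incremental updates on a schedule, and reserving full rewrites for off-peak runs.
    import pickle
    import time
    from pathlib import Path

    class DiskCache:
        def __init__(self, cache_dir):
            self.cache_dir = Path(cache_dir)
            self.cache_dir.mkdir(parents=True, exist_ok=True)

        def full_rewrite(self, name, rows):
            # Full cache rewrite, typically scheduled for off-peak hours.
            with open(self.cache_dir / f"{name}.cache", "wb") as f:
                pickle.dump(rows, f)

        def incremental_update(self, name, new_rows):
            # Append only the rows added since the last refresh.
            path = self.cache_dir / f"{name}.cache"
            rows = []
            if path.exists():
                with open(path, "rb") as f:
                    rows = pickle.load(f)
            rows.extend(new_rows)
            with open(path, "wb") as f:
                pickle.dump(rows, f)

    def refresh_loop(cache, fetch_new_rows, interval_seconds=3600):
        # Poll the data source at a fixed interval and apply incremental updates.
        while True:
            cache.incremental_update("sales_dashboard", fetch_new_rows())
            time.sleep(interval_seconds)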
Data Grid Cache is InetSoft's proprietary data querying technology. It is a columnar data store that uses in-memory technology to enable highly scalable big data analytics on a real-time map-reduce (Hadoop-like) architecture.
In other words, it is a data accelerator that stores data in columnar format and uses in-memory technology to speed up data processing, while the map-reduce data cluster lets capacity scale out simply by adding nodes.
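The columnar idea itself is easy to picture. In the rough Python sketch below (the data is made up for illustration), each column is stored as its own array, so a query that touches only two of many columns reads only those two arrays:

    # Row layout: every field of a row is stored together.
    rows = [
        {"region": "East", "product": "A", "sales": 120.0},
        {"region": "West", "product": "B", "sales": 95.5},
        {"region": "East", "product": "B", "sales": 80.0},
    ]

    # Columnar layout: one array per column.
    columns = {
        "region":  ["East", "West", "East"],
        "product": ["A", "B", "B"],
        "sales":   [120.0, 95.5, 80.0],
    }

    # Summing sales by region touches only the two relevant columns.
    totals = {}
    for region, amount in zip(columns["region"], columns["sales"]):
        totals[region] = totals.get(region, 0.0) + amount
    print(totals)  # {'East': 200.0, 'West': 95.5}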
A data grid cache intelligently queries and saves the data necessary to support a given dashboard or data visualization, including all filtering and drill-down levels. When a cluster of commodity-priced servers is used to run a data grid cache, the data file is automatically split into chunks and distributed to the nodes.
One data chunk may be copied to multiple nodes, so the cluster can keep working in most cases even if some nodes fail. On each node, the dimension columns are always loaded into memory when a query executes, while measure columns are loaded into memory in chunks.
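As a rough sketch of that chunk distribution (the chunk count, node names, and replication factor are made-up examples, not InetSoft's actual placement policy), each chunk can be assigned to more than one node so that the loss of a single node does not make the data unavailable:

    # Hypothetical round-robin placement of data chunks on a cluster.
    def assign_chunks(num_chunks, nodes, replication=2):
        placement = {}
        for chunk_id in range(num_chunks):
            # Put each chunk on `replication` distinct nodes.
            placement[chunk_id] = [nodes[(chunk_id + r) % len(nodes)]
                                   for r in range(replication)]
        return placement

    nodes = ["node-1", "node-2", "node-3", "node-4"]
    print(assign_chunks(num_chunks=6, nodes=nodes, replication=2))
    # chunk 0 -> ['node-1', 'node-2'], chunk 1 -> ['node-2', 'node-3'], ...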
If available memory allows, the entire measure column is loaded into memory as well. The advantage of not always loading measures into memory is greater speed for most interactive operations. Often a user's interaction initiates a query that filters the rows and calculates the measure over only a small subset of them. Having the dimensions in memory allows the filtering to be done quickly, and once the filtering is done, only the rows needed for the final calculation are loaded into memory.
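The following Python sketch illustrates that filter-then-load pattern under simplified assumptions (a tiny in-memory dimension column, a measure column pre-split into chunks, and a read_measure_chunk stand-in for a disk read); it is not InetSoft's implementation:

    CHUNK_SIZE = 2
    # Dimension column kept entirely in memory.
    region = ["East", "West", "East", "West", "East", "West"]
    # Measure column stored on disk in fixed-size chunks.
    sales_on_disk = [[120.0, 95.5], [80.0, 60.0], [44.0, 31.0]]

    def read_measure_chunk(chunk_id):
        # Stand-in for reading one chunk of the measure column from disk.
        return sales_on_disk[chunk_id]

    # Step 1: filter entirely on the in-memory dimension.
    matching_rows = [i for i, r in enumerate(region) if r == "East"]

    # Step 2: load only the measure chunks that contain matching rows.
    loaded, total = {}, 0.0
    for i in matching_rows:
        chunk_id, offset = divmod(i, CHUNK_SIZE)
        if chunk_id not in loaded:
            loaded[chunk_id] = read_measure_chunk(chunk_id)
        total += loaded[chunk_id][offset]
    print(total)  # 244.0 = 120.0 + 80.0 + 44.0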
A pure in-memory BI solution limits the size of the database you can analyze or report on to the amount of memory in the BI server. Yes, memory costs keep coming down, but terabyte-memory machines still cost many times more than gigabyte-memory machines. The data grid cache approach suffers from no such limit and can scale as performance demands, user concurrency, and database size grow, simply by adding memory and/or commodity-priced servers to the cluster.
Copyright © 2024, InetSoft Technology Corp.