This is the transcript of a podcast hosted by InetSoft in May 2018 entitled "A Hot Topic in OLAP: R." The speaker is Mark Flaherty, CMO at InetSoft.
Today’s subject is a hot one in the field of OLAP. If you’re in the business intelligence field, you’re hearing a lot about R.
R is an open source statistical package that includes a language and a runtime computing environment. It provides a wide array of statistical components such as data preparation functions, exploration functions, linear and nonlinear modeling, and a wide variety of graphics packages.
It started in academia and research areas. However, we see that a lot of companies, as they embrace advanced analytics and predictive analytics, are looking for a cost-effective solution. So a lot of the mid-sized companies are embracing R because it is free and open source.
But we also see a lot of the larger companies embracing R because of the emerging algorithms and technology available. Because it is open source, there are a lot of contributors and over a thousand different packages, which makes it really appealing to innovative companies.
Now you’re starting to see R add-ons for different data warehouse and business intelligence solutions. For instance, a package will allow users to stay within the R console or the R environment and still call the data warehouse or OLAP cube for big data processing. For example, there is a function that allows you to easily connect to the database and establish a data frame, which is an R data structure.
With a pointer that points to the data warehouse, you can call over forty-five different analytic functions from R that are actually performed in the database. In addition to the forty-five functions, there are also over twenty-one functions that provide the linkage to the infrastructure.
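To make the pointer idea concrete, here is a minimal sketch, written in Python rather than R and using an in-memory SQLite database as a stand-in for the data warehouse. The `TableRef` class and its `mean_by` method are hypothetical names invented for this illustration; they are not part of any package mentioned in the podcast. The point is only that the client holds a reference to the table, the aggregation runs inside the database, and just the summary rows come back.

```python
import sqlite3

# In-memory SQLite stands in for the data warehouse (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 100.0), ("east", 300.0), ("west", 50.0), ("west", 150.0)],
)


class TableRef:
    """Hypothetical lightweight 'pointer' to a table; it holds no rows itself."""

    def __init__(self, conn, table):
        self.conn = conn
        self.table = table

    def mean_by(self, group_col, value_col):
        # The aggregation runs inside the database; only summary rows return.
        # Identifiers are interpolated directly, so they must be trusted names.
        sql = (
            f"SELECT {group_col}, AVG({value_col}) FROM {self.table} "
            f"GROUP BY {group_col} ORDER BY {group_col}"
        )
        return dict(self.conn.execute(sql).fetchall())


sales = TableRef(conn, "sales")
region_means = sales.mean_by("region", "amount")
print(region_means)
```

With a real warehouse the same pattern matters more, because the raw rows never have to travel to the analyst's machine.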
In addition to that, there are products like data warehouse miner which enable in-database data mining. With data warehouse miner you can create predictive variables or create an entire analytic data set and export it as an analytic service that runs in the database. Of course, there is a function in R that enables you to call these in-database functions. So it enables analysts to customize their in-database analytics from within R.
You also see ETL and BI tools for advanced analytics environments that are wrapped together and tightly integrated. Their advanced analytics components are based on R, and they can export R programs as custom services, better known as UDFs (user-defined functions), that run in an OLAP cube. They also support PMML, so they can export the model as PMML, and the database can consume the PMML model and convert it to SQL to run in parallel in the database.
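The model-to-SQL step can be sketched roughly as follows. This is an illustration in Python under stated assumptions, not how any particular product does it: the `model` dictionary is a hypothetical stand-in for the coefficients a PMML file would carry, `linear_model_to_sql` is a made-up helper, no actual PMML parsing takes place, and SQLite again plays the role of the database.

```python
import sqlite3

# Hypothetical fitted linear model; in practice these values would be read
# from a PMML file exported by the modeling tool.
model = {"intercept": 1.0, "coefficients": {"age": 0.5, "income": 0.25}}


def linear_model_to_sql(model, table):
    """Render the model as a SQL scoring query (column names are trusted here)."""
    terms = [repr(float(model["intercept"]))]
    terms += [f"{float(w)!r} * {col}" for col, w in model["coefficients"].items()]
    return f"SELECT id, {' + '.join(terms)} AS score FROM {table} ORDER BY id"


# In-memory SQLite stands in for the database that would score in parallel.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, age REAL, income REAL)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, 40.0, 8.0), (2, 20.0, 4.0)],
)

query = linear_model_to_sql(model, "customers")
scores = conn.execute(query).fetchall()
print(query)
print(scores)
```

Because the model ends up as an ordinary SQL expression, the database can apply it to every row with its own execution engine, which is what makes parallel in-database scoring possible.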
Yes, people are absolutely still using R. It remains a strong, active choice in many domains even as Python and other languages have grown in popularity.
R is deeply entrenched in academia, statistics departments, and social sciences. Its extensive ecosystem of statistical packages and powerful visualization tools make it the go-to for many researchers. Universities frequently teach R for statistics and data analysis, so new graduates often enter the workforce with R skills and workflows.
Fields such as genomics, clinical trials, epidemiology, and bioinformatics rely heavily on R. Pharmaceutical companies and regulatory work often favor R for reproducible analysis and reporting because of its mature, validated packages and a strong tradition of rigorous statistical methodology.
R remains popular in quantitative finance, actuarial science, and risk modeling. Its libraries for time series analysis, econometrics, and portfolio modeling are mature and trusted by practitioners who need precise statistical tools.
Some enterprises integrate R into analytics workflows using tools like RStudio (Posit) Connect, Shiny applications, and interfaces to databases or Spark. R is especially useful when teams need reproducible reports, ad-hoc analyses, or interactive dashboards rather than end-to-end ML deployment pipelines.
Python has become dominant for machine learning and general-purpose development, driven by libraries such as TensorFlow, PyTorch, and scikit-learn and by easier integration with production systems. That said, R still excels in specialized statistical analysis, exploratory work, and high-quality visualizations.
The R community remains active: CRAN and package development continue, and tooling (RStudio / Posit, packages like tidyverse, ggplot2, and Shiny) is robust. Conferences and user groups still draw strong participation, and enterprise support options exist for production environments.