Below is the continuation of the transcript of a Webinar hosted by InetSoft on the topic of What Machine Learning Means for Company Analytics. The presenter is Abhishek Gupta, Chief Data Scientist at InetSoft.
I think it's just really important to think about before you start a machine learning project or an analytics project, how are you going to tell if this is making sense, if you're saving money, if you're creating revenue, if you're finding knowledge? Before you get involved with one of these projects you need to think about how you're going to assess it.
That varies a lot between businesses, but you need a feedback loop that tells you how well your machine learning project did, and you need to think about that from the beginning. How am I going to work that into my machine learning solution? What are my assessment criteria going to be? Am I trying to create revenue? Am I trying to find savings? Am I trying to generate knowledge? Just be aware of that, it's a hugely important part of the process, but we're running out of time and we're just going to go to questions.
All right so we have time for one to two questions. Let's lead off this one: do you have any practical examples in the area of manufacturing?
Yes, but unfortunately I can't talk that much in detail about it for confidentiality reasons since this is a real customer use case. We work with a large manufacturer of high tech devices that are used in computers and cell phones. It's an older company. They have their manufacturing process nailed down just perfectly, but they want to keep pushing that. They want to keep improving it.
They are using very sophisticated machine learning techniques to try to find defects or problems in their manufacturing process that wouldn't be visible through more traditional sort of manufacturing quality assessment methods.
And then the closing question: how will the job description of data scientist evolve as these tools get more sophisticated and also simpler to use?
I think that question could go a couple of different ways. There could be a wider variety of people who start getting involved in what is typically the domain of data scientists today. I think we have to be sensible and think about what kind of training those people need to get up to speed quickly.
I think the demand for data scientists is going to grow with the amount of data that's streaming in and we have to fill that gap with bodies. How can we sensibly get those people up to speed? One way is making it more accessible in the beginning. Does this mean that all data scientists still need to be able to code? I'm a little bit old school on that, I think coding is still a very essential part of more advanced data science.
I don't dare to make a prognosis on that because I think technology has shown if anything it can go beyond our expectations. I think we're going to have to think broader about how we train the next generation of data scientists. Is the next generation who's grown up with a tablet in their hand since they were little, are they still going to be willing to code or find it interesting to code? I don't know.
Should we be teaching coding in high schools? I don't know. It's a very good question to end the debate with. I do think we're going to have to think about new ways of training the next generation of data scientists, and we'll have to be creative in doing that. Only time is going to tell how important it is for a data scientist to be a real machine learning expert, or how important it is for a data scientist to know more about business.
Only time is going to tell what's more important. I think we'll see continued specialization within data science: data visualization specialists, machine learning experts, data engineers. That's my prediction. But as we know, making predictions is a dangerous game in the technology industry.
Before beginning any machine learning initiative, it is equally important to evaluate the organizational readiness required to support such a project. Many teams focus heavily on algorithms and tools but overlook whether the business has the processes, staffing, and cultural alignment needed for long‑term success. Establishing clear ownership, defining who will maintain the models, and ensuring that business stakeholders understand their role in the lifecycle all help prevent the common pattern of abandoned prototypes. A machine learning project is not just a technical exercise—it is an operational commitment that must be supported beyond the initial build.
Another critical consideration is the stability and accessibility of the data pipeline. Machine learning models depend on consistent, high‑quality data, yet many organizations attempt to build models before addressing upstream issues such as missing fields, inconsistent formats, or unreliable refresh schedules. Investing time in data profiling, lineage mapping, and validation rules ensures that the model will not degrade once deployed. This preparation also helps teams identify which data sources are truly predictive and which are simply adding noise or unnecessary complexity.
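As a rough illustration of the validation rules described above, the following Python sketch checks a single record for missing fields, inconsistent formats, and stale timestamps. The schema (`device_id`, `measured_at`, `reading`), the timestamp format, and the one-day freshness limit are all hypothetical choices made for the example, not details from the webinar or from any particular product.

```python
from datetime import datetime, timedelta

# Hypothetical schema, chosen only for illustration.
REQUIRED_FIELDS = ("device_id", "measured_at", "reading")
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed canonical format


def validate_record(record, now, max_age=timedelta(days=1)):
    """Return a list of human-readable problems found in one record."""
    problems = []

    # Rule 1: every required field must be present and non-empty.
    for field in REQUIRED_FIELDS:
        if record.get(field) in (None, ""):
            problems.append(f"missing field: {field}")

    # Rule 2: the timestamp must parse in the expected format.
    raw_ts = record.get("measured_at")
    if raw_ts:
        try:
            ts = datetime.strptime(raw_ts, TIMESTAMP_FORMAT)
        except (ValueError, TypeError):
            problems.append("inconsistent format: measured_at")
        else:
            # Rule 3: data older than max_age suggests a failed refresh.
            if now - ts > max_age:
                problems.append("stale record: measured_at older than max_age")

    # Rule 4: the reading must be numeric.
    reading = record.get("reading")
    if reading not in (None, "") and not isinstance(reading, (int, float)):
        problems.append("inconsistent format: reading is not numeric")

    return problems
```

Running a check like this over each incoming batch, and counting how often each problem appears, gives a simple profile of upstream data quality before any model is trained on it.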
It is also essential to define the operational environment in which the model will run. Decisions about batch versus real‑time scoring, integration with existing applications, and the expected latency of predictions all influence the architecture of the solution. Without this clarity, teams risk building models that perform well in isolation but cannot be deployed effectively. Considering infrastructure constraints early—such as compute availability, security requirements, and monitoring capabilities—helps ensure that the final solution is both scalable and maintainable.
Equally important is establishing a clear communication plan for how results will be interpreted and used. Machine learning outputs often require context, especially when they influence high‑impact decisions. Creating dashboards, visual explanations, and documentation that translate model behavior into business‑friendly insights helps build trust and encourages adoption. This communication layer should not be an afterthought; it is a core component of making machine learning actionable and ensuring that stakeholders understand both the strengths and limitations of the system.
Finally, teams should plan for continuous evaluation and improvement. Machine learning models are not static—they drift as business conditions, customer behavior, or operational processes change. Establishing a monitoring framework that tracks model accuracy, data drift, and performance over time allows teams to detect issues early and retrain models before they become unreliable. This ongoing feedback loop transforms machine learning from a one‑time project into a sustainable capability that evolves with the organization’s needs.
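One common way to track data drift for a single numeric feature is the Population Stability Index (PSI), which compares the distribution seen at training time with the distribution seen in production. The sketch below is a minimal pure-Python version under stated assumptions: equal-width bins taken from the baseline range, a small floor to avoid dividing by zero, and non-empty inputs. The function name and parameters are illustrative, and PSI is just one possible drift measure, not one named in the webinar.

```python
import math


def population_stability_index(baseline, current, bins=10):
    """Compare two samples of one numeric feature; 0.0 means identical shares."""
    lo, hi = min(baseline), max(baseline)
    # Equal-width bins over the baseline range; fall back to 1.0 if it is flat.
    width = (hi - lo) / bins or 1.0

    def bucket_shares(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Values outside the baseline range go into the edge buckets.
            idx = min(max(idx, 0), bins - 1)
            counts[idx] += 1
        eps = 1e-6  # floor so that empty buckets do not cause log(0)
        return [max(c / len(values), eps) for c in counts]

    base = bucket_shares(baseline)
    curr = bucket_shares(current)
    # Each term is non-negative, so the index grows as the shares diverge.
    return sum((c - b) * math.log(c / b) for b, c in zip(base, curr))


# Example: a feature whose values have shifted upward since training.
training_values = list(range(100))
production_values = [v + 50 for v in range(100)]
drift = population_stability_index(training_values, production_values)
```

A frequently quoted rule of thumb treats a PSI above roughly 0.25 as a significant shift worth investigating, but that threshold is a convention rather than a guarantee, and each team should calibrate alert levels against its own data and retraining costs.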