What data is needed for machine learning tools to detect and predict churn?

This post is the second in a series discussing a machine learning use case for a mobile app provider. The link to the full case study can be found at the end of the post. The first post can be found at https://www.inetsoft.com/blog/machine-learning-concepts-defining-churn-predictive-metrics/

What data is needed for machine learning to detect and predict churn?

The use case we are discussing used 60 days of user activity data before a 30-day no-use window.

Sometimes raw data can be used straight from an organization’s operational data stores, but often it needs to be transformed or cleansed before machine learning modeling. For this activity-based use case, the raw data clearly must be aggregated to create new metrics.
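To make the aggregation step concrete, here is a minimal Python sketch that rolls a raw event log up into per-user metrics over a 60-day activity window. The event layout (user ID, date, session minutes), the sample rows, and the metric names are assumptions made for illustration; the case study does not spell out its actual schema.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical raw event log: (user_id, event_date, session_minutes).
RAW_EVENTS = [
    ("u1", date(2017, 1, 5), 12.0),
    ("u1", date(2017, 1, 20), 8.0),
    ("u1", date(2017, 2, 25), 30.0),
    ("u2", date(2017, 1, 10), 3.0),
    ("u2", date(2017, 3, 15), 5.0),  # falls outside the 60-day window below
]


def aggregate_activity(events, window_start, window_days=60):
    """Roll raw events up into per-user metrics for one activity window."""
    window_end = window_start + timedelta(days=window_days)
    per_user = defaultdict(
        lambda: {"sessions": 0, "total_minutes": 0.0, "active_days": set()}
    )
    for user_id, event_date, minutes in events:
        # Keep only events inside [window_start, window_end).
        if window_start <= event_date < window_end:
            stats = per_user[user_id]
            stats["sessions"] += 1
            stats["total_minutes"] += minutes
            stats["active_days"].add(event_date)
    # Replace the set of dates with a simple count of distinct active days.
    return {
        user_id: {
            "sessions": s["sessions"],
            "total_minutes": s["total_minutes"],
            "active_days": len(s["active_days"]),
        }
        for user_id, s in per_user.items()
    }


features = aggregate_activity(RAW_EVENTS, window_start=date(2017, 1, 1))
print(features)
```

Each resulting row is one user’s aggregated activity, which is the kind of derived metric a model can consume more easily than individual events.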

User activity data and any other data items associated with a user that the machine learning model will use as inputs are called “features.” Examples for a B2B cloud-based solution provider would be subscription period and number of support cases. Correspondingly, each user is also marked as “churned” or “not-churned,” which is called a “label.” In other words, each user has a set of associated features as inputs, and the label is the output those inputs are expected to predict.
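As a rough illustration of how features and a label fit together, the sketch below marks a user as “churned” when no activity falls in the 30 days that follow the 60-day activity window. The function names, the dictionary layout, and the sample dates are hypothetical, and the exact labeling rule is an assumption based on the 60-day/30-day setup described earlier.

```python
from datetime import date, timedelta


def label_user(event_dates, window_start, activity_days=60, no_use_days=30):
    """Return 'churned' if the user has no events in the no-use window."""
    no_use_start = window_start + timedelta(days=activity_days)
    no_use_end = no_use_start + timedelta(days=no_use_days)
    used_after = any(no_use_start <= d < no_use_end for d in event_dates)
    return "not-churned" if used_after else "churned"


def build_record(user_id, features, event_dates, window_start):
    """Pair a user's features (inputs) with the label (output)."""
    return {
        "user_id": user_id,
        "features": features,
        "label": label_user(event_dates, window_start),
    }


start = date(2017, 1, 1)
record_a = build_record(
    "u1",
    {"sessions": 3, "active_days": 3},
    [date(2017, 1, 5), date(2017, 3, 10)],  # active again after the window
    start,
)
record_b = build_record(
    "u2",
    {"sessions": 1, "active_days": 1},
    [date(2017, 1, 10)],  # no activity after the window
    start,
)
print(record_a["label"], record_b["label"])
```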

Each labeled user, in this case, is called an “observation.” Machine learning uses existing observations to study the relationship between features and the label. The goal is to produce a machine learning model that can assign a label given a set of features about a user.
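The idea of learning from existing observations can be sketched with a small logistic regression trained by gradient descent. This is only an illustrative stand-in: the post does not say which algorithm the case study used, and the training rows, learning rate, and epoch count below are invented for the example. The two features are assumed to be already scaled to the 0 to 1 range.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(rows, labels, learning_rate=0.5, epochs=2000):
    """Fit weights and a bias with batch gradient descent on log loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            # Prediction error for this observation.
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


def predict_label(model, x, threshold=0.5):
    """Assign a label to a new user from that user's features."""
    weights, bias = model
    p_churn = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
    return "churned" if p_churn >= threshold else "not-churned"


# Made-up observations: [scaled sessions, scaled active days]; 1 = churned.
X = [[0.1, 0.2], [0.2, 0.1], [0.15, 0.3], [0.8, 0.9], [0.9, 0.7], [0.7, 0.8]]
y = [1, 1, 1, 0, 0, 0]

model = train_logistic_regression(X, y)
low_activity = predict_label(model, [0.05, 0.1])
high_activity = predict_label(model, [0.95, 0.9])
print(low_activity, high_activity)
```

In practice a library implementation would normally be used instead of hand-written gradient descent; the point here is only the shape of the workflow: labeled observations go in, and a function that maps features to a label comes out.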

Some features are apparent and readily available. More often, though, this step requires intimate business knowledge to pick out the data most likely to be correlated with, or to cause, the outcome. In the real world, this will probably also be an iterative process of experimenting and examining the machine learning model’s test results.

This is also a collaborative process with the technologist, because machine learning requires data to be prepared in certain ways. For example, when two numerical features are on very different scales, the feature with the larger values can have a disproportionate influence on the model. Such features must be normalized so that their scale does not distort the learning model.
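One common way to normalize is min-max scaling, which maps each feature onto the range 0 to 1. The sketch below applies it to the two example features mentioned earlier, subscription period and number of support cases; the sample values are invented, and min-max scaling is just one option (standardizing to zero mean and unit variance is another).

```python
def min_max_scale(values):
    """Rescale a list of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical values on very different scales.
subscription_days = [30, 365, 730, 90]
support_cases = [0, 2, 5, 1]

scaled_days = min_max_scale(subscription_days)
scaled_cases = min_max_scale(support_cases)
print(scaled_days)
print(scaled_cases)
```

After scaling, both features span the same range, so neither dominates simply because it is measured in larger units.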

The full case study can be found at https://insidebigdata.com/2017/04/14/predicting-mobile-app-user-churn-training-scaling-machine-learning-model/

If you’d like to read about our software application, Style Intelligence, see https://www.inetsoft.com/products/StyleIntelligence/

The next post in the series is at: https://www.inetsoft.com/blog/is-machine-learning-useful-for-a-small-dataset/