What is Predictive Analytics?

What it means, why it matters, and how it works. This guide provides definitions and practical advice to help you understand modern predictive analytics.

Predictive analytics refers to the use of statistical modeling, data mining techniques and machine learning to make predictions about future outcomes based on historical and current data. These predictions help guide your decision making to mitigate risk, improve efficiency, and identify opportunities.

Four Types of Analytics

Predictive analytics builds upon descriptive and diagnostic analytics (which describe what happened and why) and provides a foundation for prescriptive analytics (which recommends a specific course of action).

Type           Question Answered
Descriptive    What happened?
Diagnostic     Why did it happen?
Predictive     What will happen?
Prescriptive   What should we do?

Predictive analytics brings key benefits.
Your organization is likely flooded by big data: large, complex, high-velocity datasets from many sources. Predictive analytics helps you use all this information to make better, data-driven decisions that can improve your business performance. It can guide decisions about any aspect of your business, such as increasing revenue, improving operational efficiency, and reducing fraud. Importantly, it lowers the risk of human bias or error because your decisions are driven by data, not instinct. It also lets you focus on executing your plan rather than laboring over decisions.

Predictive analytics is growing rapidly.
Until the recent rise of self-service tools, predictive analytics required data scientists to develop custom machine learning algorithms. You also had to make significant investments in hardware and in data engineers to integrate, store, and manage the data. Modern AutoML (automated machine learning) now makes it easier for you to build, train, and deploy custom ML models yourself, and a cloud data warehouse can provide the storage, processing power, and speed you need.

Predictive Analytics Examples

A wide range of industries and job roles leverage predictive analytics techniques. Here are some common examples of how different industries use predictive analytics.

  • Insurance companies analyze policy applications based on the risk pool of similar policyholders to predict the probability of future claims.

  • Financial services firms predict the likelihood of loan default, detect and reduce fraud, and forecast future price movements of securities.

  • Retailers and CPG companies analyze the effectiveness of previous promotional activity to forecast which offers will be most effective.

  • Healthcare companies better manage patient care by forecasting patient admissions and readmissions.

  • Energy and utility companies mitigate safety risks by analyzing historical equipment failures, and they predict future energy needs based on previous demand cycles.

  • Life sciences organizations develop patient personas and predict the probability of nonadherence to treatment plans.

  • Manufacturing and supply chain operations forecast demand to better manage inventory, and identify factors which result in production failures.

  • The public sector analyzes population trends to plan infrastructure investments and other public works projects.

How Predictive Analytics Works

Predictive analytics uses statistical analysis, deep learning, and machine learning algorithms to identify and analyze patterns in historical and current data and then forecast the likelihood that those patterns will appear again. Your specific workflow will depend on the types of data you’re working with and the details of your specific use case(s) but here’s an overview to get you started.

Diagram of the predictive model showing how datasets are processed into actionable insights and application events.

1) Define your project. First you need to clearly define the business question you’d like to answer or the problem you’re trying to solve. In other words, what do you want to be able to predict? Being clear on the ideal project outcome will inform your data requirements and allow your predictive model to generate an actionable output.

2) Build the right team. While new tools make it much easier to perform predictive data analytics, you should still consider having these five key players on your team:

  • An executive sponsor who will ensure funding and prioritization of the project.
  • A line-of-business manager who deeply understands the business problem you’re trying to solve.
  • A data wrangler or someone with data management expertise who can clean, prepare, and integrate the data (although some modern analytics and BI tools include data integration capabilities).
  • An IT manager to implement the proper analytics infrastructure.
  • A data scientist to build, refine and deploy the models (although AutoML tools now allow data analysts to do this).

3) Collect and integrate your data. Now you’re ready to gather the data you need and prepare your dataset. Bring in data representing every factor you can think of to provide a complete view of the situation and make your model more accurate. You’ll probably bring in both structured data, which is highly organized and formatted (such as sales history and demographic information), and unstructured data (such as social media content, customer service notes, and web logs). Prepping data requires you to do the following:

  • Correctly label and format your dataset.
  • Ensure data integrity by cleaning up incomplete, missing, or inconsistent data.
  • Avoid data leakage and training-serving skew.
  • After importing, review your dataset to ensure accuracy.
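
As a rough illustration of the preparation steps above, the sketch below uses only the Python standard library. The field names (signup_date, monthly_spend, churned), the sample rows, and the cutoff date are all hypothetical. It drops unlabeled rows, splits the data by time so that no future information leaks into training, and learns the fill value for missing data from the training rows only, then applies the same transformation to both sets to limit training-serving skew.

```python
from datetime import date
from statistics import median

# Hypothetical raw records; None marks a missing value.
raw_rows = [
    {"signup_date": date(2023, 1, 5), "monthly_spend": 42.0, "churned": 0},
    {"signup_date": date(2023, 2, 11), "monthly_spend": None, "churned": 1},
    {"signup_date": date(2023, 3, 20), "monthly_spend": 58.5, "churned": 0},
    {"signup_date": date(2023, 4, 2), "monthly_spend": 61.0, "churned": None},
    {"signup_date": date(2023, 5, 17), "monthly_spend": 39.5, "churned": 1},
]


def split_by_date(rows, cutoff):
    """Rows before the cutoff go to training; rows on or after it go to testing."""
    train_rows = [r for r in rows if r["signup_date"] < cutoff]
    test_rows = [r for r in rows if r["signup_date"] >= cutoff]
    return train_rows, test_rows


def fill_missing_spend(rows, fill_value):
    """Return copies of the rows with missing monthly_spend replaced by fill_value."""
    return [
        {**r, "monthly_spend": fill_value if r["monthly_spend"] is None else r["monthly_spend"]}
        for r in rows
    ]


# 1. Drop rows without a label: they cannot be used to train or evaluate.
labeled = [r for r in raw_rows if r["churned"] is not None]

# 2. Split by time so the model is tested only on data from after the training period.
train, test = split_by_date(labeled, cutoff=date(2023, 4, 1))

# 3. Learn the fill value from the training rows only (avoids leakage from test data).
train_median = median(r["monthly_spend"] for r in train if r["monthly_spend"] is not None)

# 4. Apply the identical transformation to both sets.
train = fill_missing_spend(train, train_median)
test = fill_missing_spend(test, train_median)
```

In a real project the same fill value would also be stored and reused when the deployed model scores new records.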

You’ll be working with big data, and possibly even real-time streaming data, so you’ll need to find the right tools. As noted above, cloud data warehouses can now cost-effectively provide the storage, power, and speed you need.

4) Develop and validate your model. The next stage involves building, training, and evaluating your predictive model before you deploy it. There are two ways to go about this: you can hire a data scientist to develop a model, or you can use an AutoML tool to develop one yourself. There are also two main types of algorithmic models, classification and regression, which we describe in the next section. These algorithms ultimately place a numerical value, weight, or score on the likelihood of a particular event happening in the future. You’ll need to test and refine your model multiple times to find the best performer: the model whose predictions most closely match known outcomes.
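
One simple way to picture the "test and refine" loop is to score several candidate models on the same held-out data and keep the best one. The sketch below is a minimal illustration under stated assumptions: the hold-out examples are invented, and the candidates are hypothetical threshold rules rather than trained models.

```python
# Hypothetical hold-out set: (days_since_last_purchase, churned) pairs
# that were not used to build the candidate models.
holdout = [(3, 0), (10, 0), (25, 0), (40, 1), (45, 1), (60, 1), (12, 1), (50, 0)]


def make_threshold_model(threshold):
    """Build a toy classifier that predicts churn (1) at or above the threshold."""
    def predict(days):
        return 1 if days >= threshold else 0
    return predict


def accuracy(model, examples):
    """Share of examples where the prediction equals the known outcome."""
    correct = sum(1 for value, label in examples if model(value) == label)
    return correct / len(examples)


# Candidate models to compare (names and thresholds are illustrative).
candidates = {
    "threshold_20": make_threshold_model(20),
    "threshold_30": make_threshold_model(30),
    "threshold_55": make_threshold_model(55),
}

# Evaluate every candidate on the same hold-out data and keep the best.
scores = {name: accuracy(model, holdout) for name, model in candidates.items()}
best_name = max(scores, key=scores.get)
```

Accuracy is only one possible metric. For imbalanced problems such as fraud detection, measures like precision and recall are often more informative.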

5) Deploy your model. Finally, you can put your model to work on your chosen dataset. You can use the results for one-time or ongoing decision making, or you can automate actions by integrating the output into other systems. (Application automation triggering action is one of the top 10 BI trends this year.) Ideally, your model should automatically adjust as new data is added over time, as this will improve the accuracy of the predictions.
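
To make the idea of automating actions from model output concrete, here is a small sketch that maps a predicted probability to a follow-up action. The function name, the action labels, the thresholds, and the customer IDs are all hypothetical; a real integration would hand the chosen action to another system rather than store it in a dictionary.

```python
def choose_action(churn_probability, high=0.8, medium=0.5):
    """Map a predicted churn probability to a follow-up action (illustrative thresholds)."""
    if not 0.0 <= churn_probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if churn_probability >= high:
        return "call_customer"
    if churn_probability >= medium:
        return "send_offer_email"
    return "no_action"


# Hypothetical model output: customer ID -> predicted churn probability.
scored_customers = {"cust_001": 0.91, "cust_002": 0.55, "cust_003": 0.12}

# Decide one action per customer based on the score.
actions = {cid: choose_action(p) for cid, p in scored_customers.items()}
```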

6) Monitor and refine your model. Keep a close eye on the outputs of your model to make sure it continues to provide results you expect. You’ll likely need to tweak the model as new variables emerge. You can also improve your model’s predictions by applying data mining techniques such as clustering, sampling, and decision trees to data collected over time.
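
Monitoring can start with something as simple as comparing recent accuracy against the accuracy measured at validation time. The sketch below assumes you log each prediction alongside the outcome that was later observed; the function name, the baseline figure, and the tolerance are illustrative assumptions.

```python
def needs_review(baseline_accuracy, recent_outcomes, tolerance=0.05):
    """Flag the model when recent accuracy falls more than `tolerance` below baseline.

    recent_outcomes is a list of (predicted, actual) pairs.
    """
    if not recent_outcomes:
        return False  # nothing observed yet, so nothing to flag
    hits = sum(1 for predicted, actual in recent_outcomes if predicted == actual)
    recent_accuracy = hits / len(recent_outcomes)
    return recent_accuracy < baseline_accuracy - tolerance


# Hypothetical logs: 7 of 10 correct in the first window, 9 of 10 in the second.
degraded_window = [(1, 1)] * 7 + [(1, 0)] * 3
healthy_window = [(0, 0)] * 9 + [(0, 1)]

flag_degraded = needs_review(0.85, degraded_window)
flag_healthy = needs_review(0.85, healthy_window)
```

A flagged model is a prompt to investigate, for example by checking whether new variables have emerged, before retraining.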

Predictive Analytics Models and Techniques

There are a wide variety of predictive data models available. Here we briefly describe the most popular model types and techniques.

Decision trees are the most widely used classification model. They partition your data into subsets based on categories of the input variables. As the name suggests, a decision tree helps you understand a customer’s decision path by visually representing each choice a user could make as a branch of a tree. Each classification or decision is represented as a leaf. At each step, the model identifies the variable that most cleanly splits the data into different groups.
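
The splitting step can be sketched for a single numeric variable. The example below searches for the threshold that gives the lowest weighted Gini impurity, one common splitting criterion (others, such as entropy, are also used). The data and the function names are hypothetical, and a full decision tree would repeat this search over every variable and every resulting subset.

```python
def gini(labels):
    """Gini impurity for 0/1 labels: 0.0 means the group is perfectly pure."""
    if not labels:
        return 0.0
    share_positive = sum(labels) / len(labels)
    return 2 * share_positive * (1 - share_positive)


def best_split(examples):
    """Find the threshold on one variable with the lowest weighted impurity.

    examples is a list of (value, label) pairs with 0/1 labels.
    Returns (threshold, weighted_gini), or None if there is only one distinct value.
    """
    values = sorted({value for value, _ in examples})
    best = None
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2  # midpoint between neighboring values
        left = [label for value, label in examples if value <= threshold]
        right = [label for value, label in examples if value > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(examples)
        if best is None or score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical data: (customer_age, bought_product).
purchases = [(18, 0), (22, 0), (30, 0), (41, 1), (47, 1), (55, 1)]
split = best_split(purchases)
```

Here the search settles on a threshold of 35.5, which separates the two groups perfectly in this toy dataset.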

Regression analysis is a popular statistical method that identifies meaningful patterns in data and estimates relationships among variables. Linear regression describes the degree to which one or more factors influence, or can be used to predict, the activity of another factor. Logistic regression also uses an equation in which input values are combined linearly using weights, or coefficients, to predict an output value. In logistic regression, however, the output being modeled is categorical: it can take only a limited number of values, often a binary value such as 0 or 1, rather than any numeric value.
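
The difference between the two can be sketched in a few lines. The first function fits a straight line to one input variable using ordinary least squares; the second passes a linear combination through the logistic (sigmoid) function to produce a probability between 0 and 1. The sample data is invented, and the weight and bias passed to the logistic function are placeholders rather than fitted values.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for one input: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def logistic_probability(x, weight, bias):
    """Squash the linear combination weight * x + bias into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-(weight * x + bias)))


# Linear regression: the output can be any number.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
predicted_value = slope * 5 + intercept

# Logistic regression: the output is a probability, turned into a 0/1 class.
probability = logistic_probability(2.0, weight=1.5, bias=-1.0)
predicted_class = 1 if probability >= 0.5 else 0
```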

Another popular predictive modeling technique is neural networks. This pattern recognition technique is well suited to cases where you don’t have a mathematical formula that relates inputs to outputs. Neural networks also work well when you have a lot of training data, because they are flexible enough to model highly complex, nonlinear relationships in your data.
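
A tiny example shows what "nonlinear" means here. The exclusive-or (XOR) pattern, where the output is 1 only when exactly one input is 1, cannot be captured by a single linear equation, but a network with one small hidden layer can represent it. The weights below are hand-picked for illustration; a real neural network would learn its weights from training data.

```python
def relu(z):
    """Rectified linear unit: passes positive values through and zeroes out the rest."""
    return max(0.0, z)


def tiny_network(x1, x2):
    """Two hidden units followed by a linear output, with hand-picked weights."""
    h1 = relu(x1 + x2)        # active when at least one input is on
    h2 = relu(x1 + x2 - 1.0)  # active only when both inputs are on
    return h1 - 2.0 * h2      # subtract the "both on" case to get XOR


# Evaluate the network on all four input combinations.
outputs = {(a, b): tiny_network(a, b) for a in (0, 1) for b in (0, 1)}
```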

As stated above, there are many predictive analytics models you can leverage. Other popular techniques include Bayesian analysis, gradient boosting, k-nearest neighbors (KNN), ensemble models, memory-based reasoning, incremental response, partial least squares, principal component analysis, support vector machines, and time series data mining.
