What it is, why you need it, and best practices. This guide provides definitions and practical advice to help you understand and practice modern automated machine learning.
AutoML (short for automated machine learning) refers to the tools and processes that make it easy to build, train, deploy and serve custom machine learning models. AutoML gives both ML experts and citizen data scientists a simple, code-free way to generate models, make predictions, and test business scenarios, so you can quickly apply machine learning across your organization.
You can use automated machine learning in a variety of applications, such as natural language processing, voice recognition, and recommendation engines. It can also support your BI and analytics needs by using models to analyze historical data, find key drivers and patterns across large sets of business metrics, and make business predictions based on those patterns.
Citizen data scientists benefit from AutoML tools and processes by quickly and easily developing baseline models and acting on the results of those models. ML experts benefit by avoiding the traditional trial-and-error workflow and instead putting their time and effort toward customizing models and notebooks.
Here are the high-level benefits of automated machine learning which apply to both types of users:
Quickly apply machine learning across your organization. AutoML lets non-experts use machine learning models, and it helps experienced developers and data scientists more quickly produce solutions that are often simpler than hand-coded models and can even perform better.
Focus effort on higher impact work. AutoML eliminates time-intensive and monotonous coding throughout the machine learning workflow, from preprocessing and cleaning the data to selecting the algorithm to optimizing and monitoring model parameters. Also, training a computer to identify content can reduce errors and save countless hours of manually curating tables, text, images and videos.
Improve business performance. AutoML makes it faster and easier to give your analytics teams the power of predictive analytics, which can significantly improve business performance. Specific applications include detecting fraud, giving consumers more personalized experiences, and better managing inventory through improved demand forecasting.
Automated machine learning typically maps to the traditional machine learning workflow. As with other data science or data analytics projects, you should first clearly define the question you’re trying to answer or the problem you’d like to solve. This critical step will inform your data requirements.
The details of the AutoML process vary with your specific use case and type of data (structured, image, video, or language), but the high-level overview below will get you started.
Dataset. First you gather the appropriate data and prepare your dataset. Key actions include:
Train and Evaluate. Now you’re ready to train your model. Most AutoML tools will get you going with a default model, but you should consider adjusting parameters to better fit your particular use case. Your platform should have a simple graphical interface that lets you build custom models.
Deploy and Serve. Once you’re confident in the performance of your model, you can make it available for use. That use may be a one-time project or part of an ongoing production process.
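To make the three steps above concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not how any particular AutoML platform works: the data is invented, the "model" is a single cut-off on one numeric feature, and the function names (prepare_dataset, train, accuracy) are hypothetical.

```python
import random
from typing import Callable, List, Tuple

Example = Tuple[float, int]  # (single numeric feature, binary label)


def prepare_dataset(rows: List[Example], holdout: float = 0.25, seed: int = 0):
    """Dataset step: shuffle, then split into training and evaluation sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - holdout))
    return shuffled[:cut], shuffled[cut:]


def accuracy(predict: Callable[[float], int], rows: List[Example]) -> float:
    """Evaluate step: share of rows where the prediction matches the label."""
    return sum(predict(x) == y for x, y in rows) / len(rows)


def train(rows: List[Example]) -> Callable[[float], int]:
    """Train step: try every observed value as a cut-off and keep the best."""
    best_threshold = max(
        sorted({x for x, _ in rows}),
        key=lambda t: accuracy(lambda x: int(x >= t), rows),
    )
    # "Deploy and serve" here simply means handing back a callable predictor.
    return lambda x: int(x >= best_threshold)


# Invented data: the label is 1 whenever the feature is 50 or more.
data = [(float(v), int(v >= 50)) for v in range(0, 100, 5)]

train_rows, eval_rows = prepare_dataset(data)
model = train(train_rows)
eval_accuracy = accuracy(model, eval_rows)

print(f"held-out accuracy: {eval_accuracy:.2f}")
print("prediction for 80:", model(80.0))
```

Evaluating on rows the model never saw during training is what gives you a fair read on performance before you make the model available.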
The four main types of AutoML models correspond to the type of data you’re analyzing: structured, image, video, and text.
Structured Data
An example of structured (or tabular) data might be historical sales data for a company. You can train your automated machine learning model to:
Image Data
Image data refers to pictures stored in a database. Manually categorizing large volumes of images can lead to errors and be very time-consuming. Your model can be trained to either:
Video Data
As with image data, manually categorizing large volumes of videos can lead to errors and take countless hours. Your model can be trained to:
Text Data
Text data, including natural language, refers to any type of message your company might have, from social media posts to the help section on your site. Your model can analyze this text data to decipher its meaning and structure:
To illustrate AutoML in action, let’s imagine you’re running a SaaS company selling monthly subscriptions to an online platform. Below we look at how you can use automated machine learning in a BI tool to evaluate customer behavior.
Evaluating customer churn
AutoML tables can help you understand the patterns and drivers that affected customer churn in the past. They can also use those same patterns to predict which current customers have the highest risk of leaving in the future.
The first 12 rows of a dataset of historic customers might look like this:
Each row in the table above represents a unique, historic customer. Each column represents an attribute about the customer.
Some attributes became clear the moment that person became a customer, for example CustomerID, Gender, Age, Zip, and Plan_Type. Other attributes became available later in the customer journey, for example Logins_1M (the number of times a customer logged into the site during month one), Avg_min_log_1M (the average time, in minutes, that a customer spent on the site during month one), and Churn_1Y (whether or not the customer quit the platform within a year of becoming a customer). Churn_1Y is the column of interest because you want to be able to predict whether a given customer is likely to leave the platform during the first 12 months.
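As a rough sketch of how such a dataset is laid out for modeling, the Python below separates the feature columns from the target column. The column names come from the example; every value in the rows is invented for illustration, and the helper name split_features_and_target is hypothetical. Leaving CustomerID out of the features is an assumption made here, on the grounds that an identifier carries no behavioral signal.

```python
from typing import Any, Dict, List, Tuple

# Column names come from the churn example. Every value below is invented.
ID_COLUMN = "CustomerID"
TARGET_COLUMN = "Churn_1Y"

customers: List[Dict[str, Any]] = [
    {"CustomerID": 1, "Gender": "F", "Age": 34, "Zip": "00001",
     "Plan_Type": "Basic", "Logins_1M": 2, "Avg_min_log_1M": 3.5,
     "Churn_1Y": True},
    {"CustomerID": 2, "Gender": "M", "Age": 51, "Zip": "00002",
     "Plan_Type": "Premium", "Logins_1M": 14, "Avg_min_log_1M": 12.0,
     "Churn_1Y": False},
    {"CustomerID": 3, "Gender": "F", "Age": 27, "Zip": "00003",
     "Plan_Type": "Basic", "Logins_1M": 9, "Avg_min_log_1M": 8.0,
     "Churn_1Y": False},
]


def split_features_and_target(
    rows: List[Dict[str, Any]],
    target: str = TARGET_COLUMN,
    ignore: Tuple[str, ...] = (ID_COLUMN,),
) -> Tuple[List[Dict[str, Any]], List[bool]]:
    """Return (feature rows, labels), dropping the target and ignored columns."""
    features = [
        {name: value for name, value in row.items()
         if name != target and name not in ignore}
        for row in rows
    ]
    labels = [bool(row[target]) for row in rows]
    return features, labels


features, labels = split_features_and_target(customers)
print(sorted(features[0]))  # the feature column names
print(labels)               # one churn label per customer
```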
Close inspection of AutoML tables reveals three key patterns in the dataset:
Automated machine learning excels at finding patterns such as these. It can even discover significantly more complex patterns over a large number of columns to predict how combinations of values in the feature columns will affect the values in the target columns.
AutoML takes a dataset (like the one in the example) and lets you specify a target field (e.g., Churn_1Y). It then finds key drivers and patterns in the data that are often impossible for a human to visualize or detect. You can then refine and finalize the model and use it to make predictions, both for forward-looking data and for scenario planning.
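The idea of naming a target field and letting the tool look for key drivers can be sketched in a few lines of Python. This is a deliberately tiny stand-in, not the method any AutoML product uses: it scores each numeric column by how well a single cut-off on that column predicts the target, then ranks the columns. The rows are invented, and best_rule and rank_drivers are hypothetical helper names.

```python
from typing import Dict, List, Optional, Tuple

Rule = Tuple[float, Optional[float], str]  # (accuracy, threshold, direction)

# Invented rows using column names from the churn example (1 = churned).
rows: List[Dict[str, float]] = [
    {"Age": 25, "Logins_1M": 1, "Avg_min_log_1M": 2.0, "Churn_1Y": 1},
    {"Age": 52, "Logins_1M": 2, "Avg_min_log_1M": 9.0, "Churn_1Y": 1},
    {"Age": 38, "Logins_1M": 3, "Avg_min_log_1M": 4.0, "Churn_1Y": 1},
    {"Age": 61, "Logins_1M": 4, "Avg_min_log_1M": 12.0, "Churn_1Y": 1},
    {"Age": 29, "Logins_1M": 8, "Avg_min_log_1M": 11.0, "Churn_1Y": 0},
    {"Age": 47, "Logins_1M": 12, "Avg_min_log_1M": 3.0, "Churn_1Y": 0},
    {"Age": 33, "Logins_1M": 15, "Avg_min_log_1M": 14.0, "Churn_1Y": 0},
    {"Age": 58, "Logins_1M": 20, "Avg_min_log_1M": 6.0, "Churn_1Y": 0},
]


def best_rule(rows: List[Dict[str, float]], feature: str, target: str) -> Rule:
    """Find the single cut-off on `feature` that best predicts `target`."""
    best: Rule = (0.0, None, "")
    for t in sorted({r[feature] for r in rows}):
        for direction in ("below", "at_or_above"):
            hits = sum(
                ((r[feature] < t) if direction == "below" else (r[feature] >= t))
                == bool(r[target])
                for r in rows
            )
            acc = hits / len(rows)
            if acc > best[0]:
                best = (acc, t, direction)
    return best


def rank_drivers(rows: List[Dict[str, float]], target: str) -> List[Tuple[str, Rule]]:
    """Score every non-target column and sort from strongest to weakest."""
    features = [name for name in rows[0] if name != target]
    scored = {f: best_rule(rows, f, target) for f in features}
    return sorted(scored.items(), key=lambda item: item[1][0], reverse=True)


ranking = rank_drivers(rows, target="Churn_1Y")
for name, (acc, threshold, direction) in ranking:
    print(f"{name}: churn if {direction} {threshold} (accuracy {acc:.2f})")
```

In this invented data, low login counts in month one separate churners from non-churners cleanly, so Logins_1M comes out on top. Real tools search far richer model families across many columns at once, but the shape of the task is the same: you pick the target and the search finds the columns that predict it.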
Taking action on your AutoML predictions
Once you learn which customers are likely to churn, you can employ customer retention strategies. You can also begin to ask questions about your more valuable customers: their purchasing patterns, demographics, and other characteristics that make them the most successful.
The knowledge you gain from the patterns that automated machine learning uncovers can directly impact business conversations related to your company's long-term growth strategy and financial performance.
The best machine learning automation platforms include the following key capabilities: