What Is Machine Learning?

Machine learning is all about statistical models.

You probably know about parametric models: when you’re calculating something like the mass of the moon, you have a formula. If you know the variables, you can calculate the answer by plugging them in and doing the math.

Sometimes you don’t have a formula but you have a ton of data, and you want to find patterns or make predictions. In this case, you’d use nonparametric machine learning models.

I’m Loren Shure, and I’m a scientist who has been at MathWorks for over 30 years. I’m going to walk you through the three types of machine learning: clustering, classification, and regression.

Suppose I give you a stack of cards with pictures on them, and I ask you to sort the cards into groups. Different people group these cards in different ways.

What is on these cards that leads to different groupings? Well, they are pictures of dogs, cats, and birds.

Some of you say, “Aha! I see three different groups here: clearly, dogs, cats, and birds.”

Some of you see four-legged animals vs. two-legged animals, and you put the cards into two piles.

And those of you who put them into one pile might say, “They’re all animals!”

Well, who’s right? You all are, because the instructions just said to put the cards into groups.

This is clustering: Clustering helps you segment a collection of things into groups with distinct attributes.
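To make the card-sorting idea concrete, here is a minimal sketch of one common clustering method, k-means, in plain Python. Everything in it is an assumption for illustration: the two features (number of legs and body weight in kilograms), the toy values, and the choice to start from the first k cards as cluster centers. Notice that you choose the number of piles, k, just like the people sorting the cards did.

```python
import math

def kmeans(points, k, iterations=20):
    """Group points into k clusters with a basic k-means loop."""
    # Assumption: use the first k points as starting centers (simple and deterministic).
    centroids = [tuple(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old center if a cluster ends up empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assignments, centroids

# Hypothetical cards as (legs, weight_kg): dog, cat, bird, dog, cat, bird.
cards = [(4, 30.0), (4, 4.0), (2, 0.2), (4, 25.0), (4, 5.0), (2, 0.3)]

for k in (1, 2, 3):
    piles, _ = kmeans(cards, k)
    print(k, "pile(s):", piles)
```

With these toy numbers, three piles separate dogs, cats, and birds, while two piles split the cards into large and small animals rather than by leg count. The groups depend on the features and on k, and no grouping is the single right answer.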

Now let’s move on to classification.

You have the same cards, but now each one is labeled with one of three categories: dog, cat, or bird.

You need to determine the features that help distinguish between the different animals.

You use these features to train a model, which will determine whether something gets labeled as a dog, a cat, or a bird.

Now I give you a new image. What category does it belong to? Well, let’s run it through the model to figure it out.

This model is good at classifying dogs, cats, and birds, but it clearly wasn’t developed for anything else. It did the best it could with the horse.
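Here is a minimal sketch of that train-then-classify workflow in plain Python. It uses a nearest-centroid classifier, which is just one simple choice and not necessarily the kind of model shown here. The features (number of legs and body weight in kilograms) and all of the values are made up for illustration.

```python
import math

def train(examples):
    """Average the feature vectors of each label to get one centroid per class."""
    grouped = {}
    for features, label in examples:
        grouped.setdefault(label, []).append(features)
    return {
        label: tuple(sum(dim) / len(rows) for dim in zip(*rows))
        for label, rows in grouped.items()
    }

def classify(model, features):
    """Return the label whose centroid is closest to the new feature vector."""
    return min(model, key=lambda label: math.dist(features, model[label]))

# Hypothetical labeled cards: ((legs, weight_kg), label).
labeled_cards = [
    ((4, 30.0), "dog"), ((4, 25.0), "dog"),
    ((4, 4.0), "cat"), ((4, 5.0), "cat"),
    ((2, 0.2), "bird"), ((2, 0.3), "bird"),
]
model = train(labeled_cards)

print(classify(model, (4, 28.0)))   # a new dog-sized, four-legged animal
print(classify(model, (4, 500.0)))  # a horse: the model can only answer dog, cat, or bird
```

Because the model only knows three labels, the horse has to land in one of them. With these toy numbers it comes out as a dog, the closest match the model has.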

This is classification, and you’d use it for things like object detection in images, predictive maintenance, and spam detection.

The third type of machine learning is regression, where instead of classifying into a finite number of outputs, we’re trying to find an answer on a continuum – like the maximum running speed of an animal.

To build a model that will predict speed, we do what we did before – select features that may be relevant. For example, let’s try the weight of an animal and how long its legs are.

The model uses these features to estimate where the animal lands on that speed continuum.
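As a rough sketch of what that could look like, the plain-Python example below fits a linear model by least squares. The linear form is an assumption; a real regression model might be something quite different. The data is synthetic: the speeds are generated from a made-up rule, not measured from real animals, so that you can see the fit recover the rule.

```python
def fit_linear(rows, targets):
    """Least-squares fit of targets ~= b0 + b1*x1 + b2*x2 + ... via the normal equations."""
    X = [[1.0, *row] for row in rows]  # prepend 1.0 so b0 acts as the intercept
    n = len(X[0])
    # Build the augmented system [X^T X | X^T y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(n)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]

def predict(coeffs, row):
    """Evaluate the fitted linear model on one feature row."""
    return coeffs[0] + sum(c * x for c, x in zip(coeffs[1:], row))

# Hypothetical animals as (weight_kg, leg_length_m).
animals = [(4, 0.25), (30, 0.6), (70, 0.9), (10, 0.3), (50, 1.2)]
# Synthetic speeds (km/h) from a made-up rule, for illustration only.
speeds = [10 + 0.05 * w + 40 * leg for w, leg in animals]

coeffs = fit_linear(animals, speeds)
print(coeffs)                      # intercept, weight coefficient, leg-length coefficient
print(predict(coeffs, (20, 0.5)))  # estimated speed for a new animal
```

Unlike the classifier, the output here is a number on a continuum rather than one of a fixed set of labels.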

That’s regression. Regression models are used in many applications – like forecasting electricity usage or stock prices.

So those are the three different kinds of machine learning.

Machine learning is an incredibly complex topic, and I’ve just skimmed the surface here. You may have heard of deep learning, which is a type of machine learning where you don’t manually select the features. Instead, the features are learned as part of the model training process, but that typically requires a lot more data.