Why Machine Learning Terminology is So Confusing
And definitions for the most important terms you should know
The greatest part about writing about machine learning is the excitement around the field. My writing has led to the amazing opportunity to meet many readers I otherwise would never have known. My favorite conversations are with those just starting to learn about ML and how to use it. It's amazing to get to know them, hear their fresh perspectives, and get them off on the right foot.
The topic this week has been driven by these conversations. I get a lot of messages asking questions that don't make sense because the terminology is misunderstood. This isn't the fault of the person asking the question; it is the fault of the field. Technical communication is already difficult enough without terminology being muddied and used incorrectly.
This week, let’s clarify some terminology. We’re going to:
Clarify terminology I see many people misunderstand.
Define some machine learning algorithms you should know.
Explain why machine learning terminology is so difficult to understand.
Artificial Intelligence Versus Machine Learning
Let's start off with the most basic distinction many people get wrong: AI vs ML. Artificial intelligence and machine learning are often, mistakenly, used interchangeably. At its core, Artificial Intelligence is a broad field of computer science focused on creating systems capable of performing tasks that typically require human intelligence. These tasks include learning, reasoning, problem-solving, and understanding language.
Machine Learning, on the other hand, is a subset of AI that involves training machines to learn from data, improving their accuracy over time without being explicitly programmed for each task. In machine learning, we allow a system to figure out rules for itself from data we provide it. In traditional artificial intelligence, we provide these rules for the system. In order to create a system capable of performing tasks that generally require human intelligence, these rules we give traditional AI tend to be quite complex.
To think about AI outside of ML, think about video games. The concept of 'AI' in video games has been around for decades. AI in video games follows pre-programmed rules or behaviors, which is a form of AI but not necessarily ML. This is what you're experiencing when you feel like a boss is reading your inputs and reacting to them.
Why are AI and ML used interchangeably? There are two main reasons I can think of:
Machine learning is what has caused AI to really take off. Remember how I said the rules we have to feed into traditional AI systems tend to be quite complex for them to achieve a level of human-like problem solving? This makes it difficult to create and scale these algorithms. So for many people, ML is what they know about AI.
AI has become a buzzword and is thrown into everything. While many new companies are using machine learning to solve problems, it's much catchier to say "We're using AI". So machine learning is what people associate with the term "artificial intelligence".
It's important to distinguish between machine learning and artificial intelligence because there are a lot of methods of traditional AI that aren't machine learning. Understanding this makes communication and learning about AI and ML easier. Here are a few examples of non-ML subsets of AI:
Rule-Based Systems: These systems follow a set of predefined rules to make decisions or solve problems. A chatbot that navigates through a decision tree to answer user queries is a good example.
Reactive Machines: These systems, like IBM's Deep Blue chess program, make decisions based on the current situation without learning from past experiences.
Expert Systems: Expert systems are AI systems that emulate the decision-making ability of a human expert in a specific domain. They are built using a knowledge base and an inference engine to provide advice or make decisions.
There are many more than just these, but these provide some good examples of not needing ML to create an AI system. Part of what makes understanding this so difficult is there are other subsets of Artificial Intelligence that can use machine learning but don't have to. A good example is natural language processing (NLP) which can perform tasks such as language modeling, text classification, and named entity recognition using traditional rule-based systems, but has recently shifted toward doing these tasks with machine learning.
The important distinction between AI and ML:
AI is a field of computer science in which computer systems solve problems that typically require human intelligence. In traditional AI, this is done by giving the system the rules for solving problems.
ML is a subset of AI that uses data to allow a system to determine rules for itself.
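To make that distinction concrete, here's a minimal Python sketch. The fever-detection task, the function names, and the data are all made up for illustration: the first function uses a rule a human wrote, while the second works out its own cutoff from labeled examples.

```python
# Hypothetical toy task: flag a temperature reading (in Celsius) as a fever.

def rule_based_is_fever(temp_c: float) -> bool:
    """Traditional AI: a human supplies the rule (the 38.0 cutoff is hand-picked)."""
    return temp_c >= 38.0


def learn_threshold(examples: list[tuple[float, bool]]) -> float:
    """ML: choose the cutoff that makes the fewest mistakes on labeled examples."""
    candidates = sorted(temp for temp, _ in examples)
    best_cutoff, best_errors = candidates[0], len(examples) + 1
    for cutoff in candidates:
        # Count examples where the prediction (temp >= cutoff) disagrees with the label.
        errors = sum((temp >= cutoff) != label for temp, label in examples)
        if errors < best_errors:
            best_cutoff, best_errors = cutoff, errors
    return best_cutoff


# Made-up labeled data: (temperature, was it a fever?)
training_data = [
    (36.5, False), (37.0, False), (37.4, False),
    (38.2, True), (38.9, True), (39.5, True),
]

learned_cutoff = learn_threshold(training_data)  # 38.2 for this data


def learned_is_fever(temp_c: float) -> bool:
    """Same kind of rule as above, but the cutoff came from data."""
    return temp_c >= learned_cutoff
```

Both functions end up with a rule of the same shape. The difference is where the rule came from: a person in the first case, the data in the second.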
Deep Learning != Machine Learning
Many of the recent advancements within the field of machine learning have taken place within a subset of machine learning called deep learning. I often see those who have just gotten into ML conflate the two. Deep Learning is a subset of ML that uses neural networks with many layers (hence "deep") to analyze various forms of data. It’s taken off in recent years due to its ability to process large amounts of data and achieve high levels of accuracy on many different tasks.
Deep learning also requires a lot of computational power and high-quality data to learn a task. This means it isn’t always the optimal solution for a task. There are other subsets of machine learning that don’t suffer from these downsides that can work just as well. If you’re a modeler or machine learning engineer, try not to default to deep learning.
Understanding ML Algorithms
I find it helpful to sort the terminology around machine learning algorithms along two dimensions:
Styles of Learning: This separates machine learning algorithms into how they use data to learn.
Types of Algorithms: This separates the algorithms used within each style of learning.
Styles of learning and types of algorithms are often clumped into lists together, which makes it difficult to understand them. Algorithms and styles of learning work together—not separately. The definitions for styles of learning are all generally straightforward while algorithms are less clear.
It’s crucial for everyone to have a firm grasp of the different styles of learning because data is a machine learning engineer’s most valuable asset. With regard to algorithms, knowing each individual algorithm isn’t as important as knowing which problem to use them for (I’ll have to discuss this in a later article).
Styles of Learning
There are four different overarching styles of machine learning:
Supervised Learning: This is learning from labeled data, making predictions based on input-output pairs.
Unsupervised Learning: This works on identifying patterns or groupings without prior knowledge of outcomes.
Semi-supervised Learning: This uses a combination of labeled and unlabeled training data to train a model.
Reinforcement Learning: This category involves algorithms that learn to make decisions through trial and error, using feedback from their environment to achieve specific goals.
These subsets of learning are differentiated by how they use data to help machines learn. The most important aspect of machine learning is your data and how it's used. If you want to learn more about these, I've written an overview about each in a way that anyone can understand.
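Here's a rough Python sketch of how the first two styles differ in the data they get. The measurements, labels, and function names are made up for illustration, and semi-supervised and reinforcement learning are left out to keep it short. The supervised half sees a label for every point; the unsupervised half sees only the points and has to find the groups itself.

```python
# Made-up 1-D measurements (think heights in cm); all values are illustrative.
points = [150.0, 152.0, 155.0, 180.0, 183.0, 185.0]

# Supervised learning: every point comes with a label.
labels = ["short", "short", "short", "tall", "tall", "tall"]


def fit_centroids(points, labels):
    """Average the points that share a label (a nearest-centroid classifier)."""
    groups = {}
    for point, label in zip(points, labels):
        groups.setdefault(label, []).append(point)
    return {label: sum(ps) / len(ps) for label, ps in groups.items()}


def predict(centroids, x):
    """Return the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))


# Unsupervised learning: no labels, so the algorithm finds the groups itself.
def two_means(points, iterations=10):
    """1-D k-means with k=2. Assumes at least two distinct values in points."""
    low, high = min(points), max(points)
    for _ in range(iterations):
        # Assign each point to the nearer center, then move each center
        # to the mean of the points assigned to it.
        low_group = [p for p in points if abs(p - low) <= abs(p - high)]
        high_group = [p for p in points if abs(p - low) > abs(p - high)]
        low = sum(low_group) / len(low_group)
        high = sum(high_group) / len(high_group)
    return low, high


centroids = fit_centroids(points, labels)
low_center, high_center = two_means(points)
```

On this toy data both approaches land on the same two group centers, but only the supervised version knows that one group is called "short" and the other "tall".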
Types of Algorithms
Algorithms are where I see the most confusion in terminology. This is because most algorithms are taught in a list and portrayed as though they’re entirely separate algorithms. In reality, different algorithms are combined and work together to help machines learn. Instead of being a list, machine learning algorithms should really be portrayed in a graph. Let me explain this with an example.
If you look up machine learning algorithms or ask an LLM to tell you what different types of machine learning algorithms exist, you'll likely get a list similar to this:
Deep Learning
Decision Trees
Linear Regression
Supervised Learning
Reinforcement Learning
These are the points of confusion that can be found in a list like this:
Two styles of learning are listed as algorithms (supervised and reinforcement learning). While this isn’t technically wrong, it makes it seem like they’re distinct from the other algorithms on the list. In reality, deep learning, decision trees, and linear regression all use styles of learning to achieve their goals. For example, deep learning can be used for both supervised and unsupervised learning.
This list shows linear regression, decision trees, and deep learning as separate entries. Both deep learning and decision trees can use linear regression as part of their algorithm. Linear regression can also be used outside of decision trees and deep learning (such as ordinary least squares [1]), but it’s often used as the statistical model other machine learning algorithms fit to.
Linear regression is always considered supervised learning. Deep learning and decision trees can be both unsupervised and supervised. While deep learning and decision trees can use linear regression, they also use other statistical models for fitting data.
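For a sense of what linear regression on its own looks like, here's a minimal ordinary least squares sketch in plain Python for a single input variable. The function name and the data are made up for illustration. It also shows why linear regression counts as supervised learning: it needs the known output for every input.

```python
def fit_ols(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Return (slope, intercept) of the line that minimizes squared error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form solution for one input variable:
    # slope = covariance(x, y) / variance(x)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up labeled data that lies exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

slope, intercept = fit_ols(xs, ys)
prediction = slope * 4.0 + intercept  # prediction for a new input, x = 4
```

Real data won't sit exactly on a line, so the fitted line will only approximate it, but the fitting step is the same.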
This is just what I can pull from the example list above. When you add 50 more algorithms to the list, this gets exponentially more complex. Oversimplification of machine learning algorithms makes them difficult to comprehend. Here are the most important takeaways from this section:
Machine learning algorithms aren’t mutually exclusive and often work in conjunction to solve a task.
Machine learning algorithms fit into styles of learning, which are how algorithms use data to learn. Separating these styles of learning out makes it easier to understand data use, which is the most important aspect for comprehending machine learning.
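One way to picture the graph idea is as a pair of small lookup tables. This is only a sketch: the edges mirror the relationships described for the example list above, and the dictionary and function names are made up for illustration.

```python
# Which styles of learning each algorithm can use, per the discussion above.
USES_STYLE = {
    "deep learning": {"supervised", "unsupervised"},
    "decision trees": {"supervised", "unsupervised"},
    "linear regression": {"supervised"},
}

# Which other algorithms each algorithm can use as a building block.
BUILDS_ON = {
    "deep learning": {"linear regression"},
    "decision trees": {"linear regression"},
    "linear regression": set(),
}


def algorithms_for_style(style: str) -> list[str]:
    """List the algorithms that can be used with a given style of learning."""
    return sorted(name for name, styles in USES_STYLE.items() if style in styles)


def used_by(component: str) -> list[str]:
    """List the algorithms that can build on a given component."""
    return sorted(name for name, parts in BUILDS_ON.items() if component in parts)


unsupervised_options = algorithms_for_style("unsupervised")
linear_regression_users = used_by("linear regression")
```

A flat list can't express either question. The graph answers "what can I use for unsupervised learning?" and "what builds on linear regression?" from the same data.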
Common Machine Learning Algorithms You Should Know
Deep Learning: A subset of machine learning that uses neural networks with multiple layers to learn from data. It has been particularly successful in tasks such as image and speech recognition, natural language processing, and reinforcement learning.
Decision Trees: A type of learning algorithm used for both classification and regression tasks. Decision trees recursively split data into subsets based on the most significant attribute, creating a tree-like structure to make decisions.
Linear Regression: A fundamental statistical method used to model the relationship between a continuous target variable and one or more input variables. It aims to fit a linear equation to the observed data, allowing for predictions based on new values of the input variables.
Logistic Regression: A statistical model used for binary classification tasks. It estimates the probability that a given input belongs to a certain category.
Support Vector Machines (SVM): A supervised learning algorithm used for both classification and regression tasks. It finds the optimal hyperplane that best separates data points into different classes in a high-dimensional space.
K-Nearest Neighbors (KNN): A simple and effective supervised learning algorithm. It works by identifying the K-nearest neighbors of a given query point and assigning a class label based on the majority class of those neighbors.
Random Forest: An ensemble learning method that trains multiple decision trees and combines them. For classification, it predicts with the majority vote of the individual trees; for regression, it averages their predictions.
K-means: An unsupervised learning algorithm used for clustering tasks. It partitions N data points into K clusters based on their features, aiming to minimize the variance within each cluster.
Hidden Markov Model (HMM): An unsupervised machine learning algorithm based on the idea that there is an underlying process with hidden states that all contribute to the outputs we observe. It uses patterns in data to predict things that can’t be seen directly.
There are many more than this, but it would be too much to fit them all in here. I'll be adding a comprehensive list to my Machine Learning Road Map, so make sure to star it on GitHub if you want more information on different types of machine learning algorithms.
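To give a feel for how little code some of these need, here's a minimal K-Nearest Neighbors sketch in plain Python. The points, labels, and function name are made up for illustration, and a real project would more likely reach for a library implementation.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Label a query point by majority vote among its k nearest training points.

    train is a list of ((x, y), label) pairs; distance is Euclidean.
    """
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Made-up labeled points forming two well-separated groups.
train = [
    ((1.0, 1.0), "red"), ((1.5, 2.0), "red"), ((2.0, 1.0), "red"),
    ((6.0, 6.0), "blue"), ((7.0, 6.5), "blue"), ((6.5, 7.0), "blue"),
]

near_red = knn_predict(train, (1.2, 1.4))
near_blue = knn_predict(train, (6.4, 6.2))
```

There's no separate training step here: the "model" is just the stored labeled data, which is part of why KNN is considered simple.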
As an aside, I’m just realizing how little an overview of each of these types of algorithms actually tells you about what they are, how they work, and what they’re good for. In the realm of machine learning, we have a lot of overviews like this, but not a lot of deep dives into how things actually work. Ironically, as I’m writing this article about why machine learning terminology is so confusing I’m likely contributing to what makes it that way. If you’d like a deep dive into how any of these algorithms actually work, let me know in a comment.
If you're interested in a breakdown of the different types of machine learning algorithms grouped by similarity, check out this article on machinelearningmastery.com. Remember as you're reading it that the algorithms aren't mutually exclusive.
When to use each ML algorithm will require an article of its own. For now, check out the handy diagram on scikit-learn.org that breaks down common machine learning algorithms based on the task at hand.
Takeaways
This is by far the most difficult article I've written. Between buzzwords, oversimplification, differences in opinion, and terminology evolving over time, ML terminology has become muddied and keeping it all straight has become difficult. This makes comprehending machine learning especially difficult for those just getting started. Here are the takeaways I’d like everyone to know:
Machine learning is a subset of AI that lets systems determine rules via data.
Deep learning is a subset of machine learning based on using neural networks to help machines learn.
There are four distinct styles machines use to learn.
Machine learning algorithms aren’t mutually exclusive and are often used together.
Thanks for reading! If you disagree with anything here, let me know. If you have questions for me, reach out on X. I’ve been trying to chat with as many new machine learners as possible to help them get started. I was lucky enough to have people to help me learn and it’s something I took entirely for granted.
If you’re interested in machine learning and the engineering that goes on behind the scenes to the products you use, join Society’s Backend to get articles like this in your inbox once a week (it’s free!).
[1] To further show how complex ML terminology is, it’s debated whether OLS counts as machine learning at all. You can add “differing opinions” to the list of factors that make ML terminology difficult to understand.
Super cool your article
Was referred this by a friend who knows well my mania for 'The Definition Problem' in AI.
My experience maps onto yours pretty directly in respect of the definitional confusions I encounter within applied AI - which naturally get worse the further out you go from a centre filled with engineers on more intimate terms with the tech - but I think this is also a much wider issue in theoretical AI too.
Just as there's a lack of terminological definition in ML vs. DL, to use your for-instance, there's a lack of terminological consensus in AI research about what, for instance 'intelligence' (/'learning'/'consciousness' etc.) actually means. In most respects, participants in these sorts of conversations don't even realise that their definitions a. differ and b. are not at all rigorous in and of themselves. Shy of such essentials, the nearness-at-hand of the pathbreaking work that some of the most optimistic (and most pessimistic) specialists expect seems highly improbable.
All in all, I think this is the hallmark of a very nascent field only just beginning to set its proper foundations, and one where some of the town planners are so excited about debating the value of building a monorail in our small village that they've failed to notice that we haven't invented fire yet and every time there's a strong wind all our stick huts fall down.
Really good piece.