When it comes to machine learning, there are some broad concepts and terms that everyone in search should know. We should all know where machine learning is used, and the different types of machine learning that exist.
Read on to gain a better grasp of how machine learning impacts search, what the search engines are doing and how to recognize machine learning at work. Let’s start with a few definitions. Then we’ll get into machine learning algorithms and models.
What follows are definitions of some important machine learning terms, most of which will be discussed at some point in the article. This is not intended to be a comprehensive glossary of every machine learning term. If you want that, Google provides a good one here.
Often we hear the words artificial intelligence and machine learning used interchangeably. They are not exactly the same.
Artificial intelligence is the field of making machines mimic intelligence, whereas machine learning is the pursuit of systems that can learn without being explicitly programmed for a task.
Visually, you can think of it like this:
All the major search engines use machine learning in one or many ways. In fact, Microsoft is producing some significant breakthroughs. So are social networks like Facebook through Meta AI with models such as WebFormer.
But our focus here is SEO. And while Bing is a search engine, with a 6.61% U.S. market share, we won’t focus on it in this article as we explore popular and important search-related technologies.
Google uses a plethora of machine learning algorithms. There is literally no way you, I, or likely any individual Google engineer could know them all. On top of that, many are simply unsung heroes of search, and we don't need to explore them fully, as they simply make other systems work better.
For context, these would include algorithms and models like:
None of these directly impact ranking or layouts. But they impact how successful Google is.
So now let’s look at the core algorithms and models involved with Google rankings.
This is where it all started, the introduction of machine learning into Google’s algorithms.
Introduced in 2015, the RankBrain algorithm was applied to queries that Google had not seen before (accounting for 15% of them). By June 2016 it was expanded to include all queries.
Following huge advances like Hummingbird and the Knowledge Graph, RankBrain helped Google move from viewing the world as strings (keywords and sets of words and characters) to things (entities). For example, prior to this, Google would essentially see the city I live in (Victoria, BC) as two words that regularly co-occur, but that also regularly occur separately and can (but don't always) mean something different when they do.
After RankBrain they saw Victoria, BC as an entity – perhaps the machine ID (/m/07ypt) – and so even if they hit just the word “Victoria,” if they could establish the context they would treat it as the same entity as Victoria, BC.
With this, they "see" beyond mere keywords to meaning, just as our brains do. After all, when you read "pizza near me," do you understand that in terms of three individual words, or do you have a visual in your head of pizza and an understanding of yourself in the location you're in?
In short, RankBrain helps the algorithms apply their signals to things instead of keywords.
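To make the "strings to things" idea concrete, here is a toy sketch of resolving a query term to an entity ID using the surrounding query words as context. The machine ID /m/07ypt for Victoria, BC comes from the article; every other entity ID, name and context term below is invented for illustration, and real entity resolution is vastly more sophisticated than set overlap.

```python
# Toy "strings to things" lookup: resolve a query term to an entity ID
# when surrounding context supports it. All data here is illustrative.
ENTITY_INDEX = {
    "victoria": [
        {"id": "/m/07ypt", "name": "Victoria, BC",
         "context": {"bc", "canada", "island", "ferry"}},
        {"id": "/m/0hyp0", "name": "Queen Victoria",  # hypothetical ID
         "context": {"queen", "monarch", "england"}},
    ],
}

def resolve_entity(term, query_tokens):
    """Pick the candidate entity whose context terms best overlap the query."""
    best, best_score = None, 0
    for candidate in ENTITY_INDEX.get(term.lower(), []):
        score = len(candidate["context"] & set(query_tokens))
        if score > best_score:
            best, best_score = candidate, score
    return best["id"] if best else None

print(resolve_entity("Victoria", {"ferry", "to", "victoria", "bc"}))  # /m/07ypt
```

With no disambiguating context, the sketch returns nothing; a real system would fall back on other signals, such as the searcher's location.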
BERT (Bidirectional Encoder Representations from Transformers).
With the introduction of a BERT model into Google's algorithms in 2019, Google shifted from a unidirectional understanding of concepts to a bidirectional one.
This was not a mundane change.
The visual Google included when it announced the open-sourcing of the BERT model in 2018 helps paint the picture:
Without getting into detail on how tokens and transformers work in machine learning, it’s sufficient for our needs here to simply look at the three images and the arrows and think about how in the BERT version, each of the words gains information from the ones on either side, including those multiple words away.
Where previously a model could only apply insight from the words in one direction, now they gain a contextual understanding based on words in both directions.
A simple example might be “the car is red”.
Only after BERT could the model properly understand that red is the color of the car; before, the word "red" came after the word "car," and that information was never passed back.
As an aside, if you’d like to play with BERT, various models are available on GitHub.
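The "the car is red" example can be sketched in a few lines. This is a simplification, assuming we only care about which tokens are visible when building each word's representation; real transformers use learned attention weights, not simple lists, but the visibility difference is the point.

```python
# Sketch of left-to-right vs. bidirectional context. Each token's
# "context" is the set of other tokens the model can see for it.
def left_context(tokens):
    # Unidirectional: a token only sees the tokens before it.
    return {t: tokens[:i] for i, t in enumerate(tokens)}

def bidirectional_context(tokens):
    # Bidirectional: a token sees everything before AND after it.
    return {t: tokens[:i] + tokens[i + 1:] for i, t in enumerate(tokens)}

tokens = ["the", "car", "is", "red"]
print(left_context(tokens)["car"])           # ['the'] -- "red" is invisible
print(bidirectional_context(tokens)["car"])  # ['the', 'is', 'red']
```

In the unidirectional version, "car" is represented with no knowledge that "red" follows; in the bidirectional version, the color information is available.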
LaMDA (Language Model for Dialogue Applications) was first announced at Google I/O in May 2021 and has not yet been deployed in the wild.
To clarify, when I write "has not yet been deployed," I mean "to the best of my knowledge." After all, we found out about RankBrain months after it was deployed into the algorithms. That said, when LaMDA is deployed, it will be revolutionary.
LaMDA is a conversational language model that appears to outstrip the current state of the art.
The focus with LaMDA is basically two-fold:
A sample conversation from Google is:
We can see it adapting far better than one would expect from a chatbot.
I see LaMDA being implemented in the Google Assistant. But if we think about it, enhanced capabilities in understanding how a flow of queries works on an individual level would certainly help in both the tailoring of search result layouts, and the presentation of additional topics and queries to the user.
Basically, I’m pretty sure we’ll see technologies inspired by LaMDA permeate non-chat areas of search.
Above, when we were discussing RankBrain, we touched on machine IDs and entities. Well, KELM, which was announced in May 2021, takes it to a whole new level.
KELM was born from the effort to reduce bias and toxic information in search. Because it is based on trusted information (Wikidata), it can be used well for this purpose.
Rather than being a model, KELM is more like a dataset. Basically, it is training data for machine learning models. More interesting for our purposes here, is that it tells us about an approach Google takes to data.
In a nutshell, Google took the English Wikidata Knowledge Graph, which is a collection of triples (subject entity, relationship, object entity; e.g., car, color, red), turned it into various entity subgraphs and verbalized them. This is most easily explained in an image:
In this image we see:
This is then usable by other models to help train them to recognize facts and filter toxic information.
Google has open-sourced the corpus, and it is available on GitHub. Looking at their description will help you understand how it works and its structure, if you’d like more information.
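As a minimal sketch of the verbalization step, here is a function that turns (subject, relationship, object) triples into simple sentences. KELM uses a trained text-to-text model to produce fluent verbalizations of whole subgraphs, not string templates, so treat this purely as an illustration of the input/output shape.

```python
# Sketch of verbalizing knowledge-graph triples into natural-language
# training sentences, in the spirit of KELM. The template is a big
# simplification of what the real model produces.
def verbalize(triples):
    """Turn (subject, relationship, object) triples into simple sentences."""
    return [f"The {relation} of {subject} is {obj}."
            for subject, relation, obj in triples]

subgraph = [
    ("the car", "color", "red"),
    ("Victoria", "province", "British Columbia"),
]
for sentence in verbalize(subgraph):
    print(sentence)
# The color of the car is red.
# The province of Victoria is British Columbia.
```

Sentences like these are what downstream models train on, which is why a trusted source graph matters: the facts in the text are only as good as the triples behind them.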
MUM was also announced at Google I/O in May 2021.
While it is revolutionary, it's deceptively simple to describe.
MUM stands for Multitask Unified Model, and it is multimodal. This means it "understands" different content formats like text, images and video. This gives it the power to gain information from multiple modalities, as well as to respond in them.
Aside: This is not the first use of the MultiModel architecture. It was first presented by Google in 2017.
Additionally, because MUM functions in things and not strings, it can collect information across languages and then provide an answer in the user’s own. This opens the door to vast improvements in information access, especially for those who speak languages that are not catered to on the Internet, but even English speakers will benefit directly.
The example Google uses is a hiker wanting to climb Mt Fuji. Some of the best tips and information may be written in Japanese and completely unavailable to the user as they won’t know how to surface it even if they could translate it.
An important note on MUM is that the model not only understands content, but can produce it. So rather than passively sending a user to a result, it can facilitate the collection of data from multiple sources and provide the feedback (page, voice, etc.) itself.
This may also be a concerning aspect of this technology for many, myself included.
We’ve only touched on some of the key algorithms you’ll have heard of and that I believe are having a significant impact on organic search. But this is far from the totality of where machine learning is used.
For instance, we can also ask:
All of these questions and hundreds if not many thousands more all have the same answer:
Machine learning.
Now let’s walk through two supervision levels of machine learning algorithms and models – supervised and unsupervised learning. Understanding the type of algorithm we’re looking at, and where to look for them, is important.
Simply put, with supervised learning the algorithm is handed fully labeled training and test data.
This is to say, someone has gone through the effort of labeling thousands (or millions) of examples to train a model on reliable data. For example, labeling red shirts in x number of photos of people wearing red shirts.
Supervised learning is useful in classification and regression problems. Classification problems are fairly straightforward. Determining if something is or is not a part of a group.
An easy example is Google Photos.
Google has classified photos of me, as well as of stages. It has not manually labeled each of these pictures, but the model will have been trained on manually labeled data for stages. And anyone who has used Google Photos knows it periodically asks you to confirm photos and the people in them. We are the manual labelers.
Ever used ReCAPTCHA? Guess what you’re doing? That’s right. You regularly help train machine learning models.
Regression problems, on the other hand, deal with problems where there is a set of inputs that need to be mapped to an output value.
A simple example is to think of a system for estimating the sale price of a house with the input of square feet, number of bedrooms, number of bathrooms, distance from the ocean, etc.
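The house-price example can be sketched with a single feature and ordinary least squares. The square footage and prices below are made up, and real estate (or ranking) systems combine many more signals through far more complex models; this just shows labeled inputs being mapped to an output value.

```python
# Minimal supervised regression sketch: fit price = slope * sqft + intercept
# by ordinary least squares on labeled examples. Data is invented.
def fit_line(xs, ys):
    """Closed-form least-squares fit for a single feature."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

sqft = [1000, 1500, 2000, 2500]            # labeled training inputs
price = [200_000, 300_000, 400_000, 500_000]  # labeled training outputs
a, b = fit_line(sqft, price)
print(round(a * 1800 + b))  # predicted price for an 1800 sq ft house: 360000
```

Swap "square feet" for "link signals" and "price" for "ranking score" and you have, in caricature, the shape of the regression problem search engines face.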
Can you think of any other systems that might take in a wide array of features/signals and then need to assign a value to the entity (/site) in question?
While certainly more complex and taking in an enormous array of individual algorithms serving various functions, regression is likely one of the algorithm types that drives the core functions of search.
I suspect we’re moving into semi-supervised models here – with manual labeling (think quality raters) being done at some stages and system-collected signals determining the satisfaction of users with the result sets being used to adjust and craft the models at play.
In unsupervised learning, a system is given a set of unlabeled data and left to determine for itself what to do with it.
No end goal is specified. The system may cluster similar items together, look for outliers, find correlations, etc.
Unsupervised learning is used when you have a lot of data, and you can’t or don’t know in advance how it should be used.
A good example might be Google News.
Google clusters similar news stories and also surfaces news stories that didn’t previously exist (thus, they are news).
These tasks are best performed mainly (though not exclusively) by unsupervised models: models that have "seen" how well previous clustering or surfacing went, but that cannot fully apply that experience to the current data, which, like the previous news, is unlabeled, and must make decisions anyway.
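As a toy illustration of unsupervised clustering, here is a sketch that groups headlines by word overlap (Jaccard similarity) with no labels telling the system what any story is about. The headlines and the threshold are invented, and Google News uses far richer representations than word sets, but the absence of labels is the defining feature.

```python
# Toy unsupervised clustering: group headlines whose word overlap
# (Jaccard similarity) exceeds a threshold. No labels are involved.
def jaccard(a, b):
    """Word-set overlap between two headlines, from 0.0 to 1.0."""
    a, b = set(a.lower().split()), set(b.lower().split())
    return len(a & b) / len(a | b)

def cluster(headlines, threshold=0.3):
    """Greedily assign each headline to the first similar-enough cluster."""
    clusters = []
    for h in headlines:
        for c in clusters:
            if jaccard(h, c[0]) >= threshold:
                c.append(h)
                break
        else:
            clusters.append([h])  # no match: start a new cluster
    return clusters

headlines = [
    "Rocket launch succeeds after delay",
    "Rocket launch delayed then succeeds",
    "Local bakery wins national award",
]
for c in cluster(headlines):
    print(c)  # the two rocket stories group together; the bakery stands alone
```

Nothing told the system what a "rocket story" is; the structure emerged from the data, which is the essence of unsupervised learning.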
It’s an incredibly important area of machine learning as it relates to search, especially as things expand.
Google Translate is another good example. Not the one-to-one translation that used to exist, where the system was trained to understand that word x in English equals word y in Spanish, but newer techniques that seek out patterns in the usage of both languages: improving translation through semi-supervised learning (some data labeled, much of it not) and using unsupervised learning to translate from one language into one completely unknown to the system.
We saw this with MUM above, but it exists in other papers and models as well.
Hopefully, this has provided a baseline for machine learning and how it’s used in search.
My future articles won’t just be about how and where machine learning can be found (though some will). We’ll also dive into practical applications of machine learning that you can use to be a better SEO. Don’t worry, in those cases I’ll have done the coding for you and generally provide an easy-to-use Google Colab to follow along with, helping you answer some important SEO and business questions.
For instance, you can use direct machine learning models to evolve your understanding of your sites, content, traffic and more. My next article will show you how. Teaser: time series forecasting.
Opinions expressed in this article are those of the guest author and not necessarily Search Engine Land. Staff authors are listed here.