
Looking Natural vs. Being Natural: What a Cre8site Discussion on Links from 2005 Can Tell Us About Search Today

By Ethan Lazuk


Hamsterdam History Lesson #2 with True Detective backdrop.

Welcome to the second installment of Hamsterdam History!

We’ve had some renewed discussions about SEOs doing things for search engines versus for users lately.

In March 2024, multiple news stories covered Danny Sullivan advising site owners on X not to do things for Google.

This included a Search Engine Roundtable article that even used the words “Google Begs.”

Google Begs SEOs To Stop Showing Google Things; Show Visitors Things Instead Search Engine Roundtable story.

I mean, is there anything more depressing than a demoralized robot?

Seeing as this is a history article, though, wouldn’t it be fun to look at an earlier version of this topic from nearly 20 years ago?

We know Google has referenced giving us people-first guidance since 2002.

But how far back does the “optimize for users” discussion really go?

In this lesson, we’ll look at one example from a Cre8site Forums discussion on links from 2005, and then we’ll discuss what its themes mean in the context of Google’s AI-driven ranking systems today.

In other words, we’ll be talking about links specifically and optimizing for users more generally.

Setting the stage: Cre8site conversation about links in 2005

Cre8site Forums was described by Search Engine Journal’s Loren Baker in 2005 as “the usability and SEO community governed by Kim Kraus Berg, Bill Slawski, Barry Welford, Barry Schwartz and Rand Fishkin (to name a few).”

The domain cre8asiteforums.com has snapshots on the Wayback Machine from 2002 onward.

I randomly clicked into 2005 to explore and went to the first thread that caught my eye: “Home > Search Engine Optimization (SEO) > Best linking stratergies.”

The misspelling of “strategies” was an added incentive to check it out.

After all, this was the George W. Bush era:

George W. Bush Stategery SNL gif.

Here’s the user’s question, which I find funny for its “hypothetical” nature. 😉

Question by newsniche about links on Cre8site.

If that’s hard to read on mobile, here’s the full text:

“Hi,

I hope someone more knowledgable than me can offer me advice on linking. My query is if there are several sites all covering a similar topic which would bethe best way to link between them for maximum SE benefit and maximum traffic flow between the sites.

I do not actually have any sites I am just asking theoretically. The sites would be on different servers.

Thanks.”

In particular, I homed in on two comments.

The first was where someone said to make the links “look natural,” or more specifically, “if you concentrate on making your linking look natural you shouldn’t go too wrong.”

The next person responded quite passionately (with their own quaint typo):

kensplace answer on Cre8site about links.

Again, here’s the text if you’re on mobile:

“I would say, just link how you want to link, ie naturally, dont “make it look” natural, actually link genuinely, dont fake anything, dont have a plan, just link when needed, as needed. If you are what to link to and what not to link to based on search engines then your links are fake, and I would not trust the links, as they are optimised for search engines. Some of the best links are little known ones, and with fake linking of any kind, even basing linking on pr etc it means that little known links stand zero chance of gaining there true potential.

Oddly, considering, from what I have read from seo types, the site I run, lsblogs, has a good pr, and I dont care about linking strategies, ALL sites on lsblogs are linked to through real links, no hidden agenda anywhere. The only rule I have is If I dont think the site fits, or is taking the p*ss then I dont accept it. No harm to my pr.

Go with what you want to do, dont base it on any type of trying to fool a search engine. It only takes on person to report you, or one person at the s/e site to have a look and you can be in trouble.”

This person, kensplace, doesn’t appear to be an SEO but rather someone who ran a search site for blogs.

So I take it they were passionate about content.

It’s also interesting how, not being an SEO (presumably), they speak about “pr,” or PageRank.

That’s a metric we don’t necessarily talk (or think) about these days.

But what we do still talk about is optimizing for users versus for search engines.

Based on the context of the forum discussion, we could describe this as the difference between “being natural” and “looking natural.”

What it means to “be natural” in 2024, the age of AI

The more I learn about machine learning in the context of search engines today, the more I understand why sites that try to “look natural,” or superficially appear helpful and relevant, may get impacted by updates eventually.

Just to get our vocabulary straight before going on: artificial intelligence (AI) is the overarching concept of getting machines to think like humans. Machine learning (ML) is where computers learn without explicit human programming, using algorithms to spot patterns and make predictions. Neural networks (NNs) are a type of machine learning that uses layers of processing, where each layer performs calculations on data from the previous layer. And stacking multiple layers of NNs creates deep learning (DL) models.
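To picture what “layers of processing” means in practice, here’s a minimal sketch of a forward pass through a tiny neural network. The feature values, layer sizes, and random weights are invented purely for illustration; this is nothing like an actual ranking model.

```python
import numpy as np

# Each layer takes the previous layer's output, applies a weighted
# calculation plus a non-linear activation, and passes the result forward.
rng = np.random.default_rng(42)

def layer(inputs, weights, biases):
    """One layer: weighted sum of the previous layer's output, then ReLU."""
    return np.maximum(0, inputs @ weights + biases)

# A hypothetical 4-number representation of a query/page pair
# (topicality, freshness, etc. -- purely made up for the example).
features = np.array([0.8, 0.2, 0.5, 0.1])

# Two hidden layers feeding a single output score.
w1, b1 = rng.normal(size=(4, 8)), np.zeros(8)
w2, b2 = rng.normal(size=(8, 4)), np.zeros(4)
w3, b3 = rng.normal(size=(4, 1)), np.zeros(1)

h1 = layer(features, w1, b1)   # layer 1 computes on the raw features
h2 = layer(h1, w2, b2)         # layer 2 computes on layer 1's output
score = (h2 @ w3 + b3)[0]      # final layer outputs one relevance-like number

print(f"Predicted score: {score:.3f}")
```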

We’ll discuss more about Google’s use of neural networks below, and tie this back to our discussion of not only links but also “looking natural” versus “being natural.”

But first, it’s important to set the context for what AI developments mean for ranking search results.

Before neural networks, ranking systems were coded by people.

Check out these excerpts from Google’s Alan Kent during a September 2022 discussion on YouTube with Marie Haynes (about ecommerce sites), where I’ve highlighted key points, including the last one in orange:

What the machine learning is trying to do is to try and spot patterns of how important are these different things to real people and their results. And then when you get into AI and machine learning, there’s no algorithm that says, “If the image is original then give them a 50 point boost.” No, no, no. Machine learning is just trying to look at what all the good examples was and tries to work out … [what’s] possibly more interesting to them.

Fascinating, a human actually can’t tell you its decision process. … Machine learning is used by Google … [and] it’s one of the reasons you can’t actually answer some of the questions at times because no human knows. These are the sorts of signals that feed into these processes. But it’s not like this signal if you put original pictures into it that’s it and you’re going to hit the top of the rankings, right. No, it contributes and it’s not even necessarily a constant contribution. …

We’re not saying these are ranking signals. We’re saying these are the characteristics that we see of good results. …

When you get into machine learning, the whole idea is it’s meant to spot patterns that the human didn’t spot because the old expert systems was you’d actually write code. You’d write out all the rules, but then you had to work out all the rules. Well, the more modern machine learning is you teach it, “Here’s good and here’s bad” and you work out what the rules are and it’s all numbers and weighted. And so no human sort of sits down and says, “Oh this one is 0.3729 more important.” And, you know, it’s just not how it works.

Another important concept for this discussion, specific to neural network training and fine-tuning, is backpropagation: the algorithm used to adjust a neural network’s parameters during training. Retraining or fine-tuning a ranking model this way could happen periodically (offline) or even in near real-time (online), such as based on user interaction data.

Consider this Gemini explanation about backpropagation and search ranking systems (keeping in mind LLMs can be factually inaccurate):

Google Gemini answer about backpropagation's role in search ranking systems.

Here it is for mobile users:

Initial Training:

  • Offline Dataset: Before going live, a search ranking model is trained on a massive dataset containing queries, web pages, and user interaction data (clicks, dwell-time, etc.).
  • Backpropagation’s Role: It optimizes the neural network’s parameters to learn complex patterns and predict relevance scores more accurately.

Fine-Tuning in the Real World:

  • Adapting to New Data: Once a ranking model is deployed, it’s continuously exposed to new queries, content, and user behavior. Search engines can leverage two approaches involving backpropagation:
    1. Periodic Retraining:
      • Fresh data is collected and aggregated with the original dataset.
      • The neural network is retrained (using backpropagation again) on this expanded dataset, leading to updated parameters that reflect recent trends and changes.
    2. Online Learning:
      • This is a more advanced approach where the model adjusts in real-time based on live user interactions.
      • Backpropagation can still play a role, though it might be used in a modified form suitable for continuous updates (e.g., variants of gradient descent).

Key Points

  • Speed Considerations: Live fine-tuning using backpropagation can be computationally intensive. Search engines often use strategies like selective updates, smaller batches of data, or optimized algorithms to balance accuracy and speed.
  • The Goal: Both training and fine-tuning aim to continually improve the search ranking model’s ability to understand complex queries and deliver the most relevant results for users.
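To make that distinction a bit more concrete, here’s a structural sketch, in my own words rather than anything Google has published, of how the two fine-tuning regimes differ. The backprop_update() helper is a hypothetical placeholder for a full backpropagation pass:

```python
# Structural sketch of the two fine-tuning regimes described above.
# backprop_update() is a hypothetical stand-in for a real training pass.

def backprop_update(model, batch):
    # In a real system this would compute gradients and nudge the weights;
    # here it just reports what it was given.
    print(f"Backprop pass over {len(batch)} example(s)")
    return model

model = {"weights": "..."}  # stand-in for a ranking model's parameters

historical = [("query", "page", "clicked")] * 1000   # original training data
fresh = [("new query", "new page", "skipped")] * 50  # newly collected interactions

# 1. Periodic retraining: fresh data is aggregated with the historical
#    dataset, and the model is retrained in bulk every so often.
model = backprop_update(model, historical + fresh)

# 2. Online learning: the model is adjusted a little after each live
#    interaction (or small batch), e.g. via a variant of gradient descent.
for interaction in fresh:
    model = backprop_update(model, [interaction])
```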

In the realm of SEO, we have announced updates, unannounced but suspected updates, and real-time ranking fluctuations.

These could all be influenced by external factors, as well as by periodic or near real-time fine-tuning of ranking models through backpropagation.

For example, a neural network like RankBrain can make a prediction, such as the relevancy score of a web page for a query. That prediction is then compared against the desired result, which is either learned from training data (offline) or potentially gleaned from live user interaction data such as clicks or time spent on the page (online). The gap between the prediction and the desired result is measured by a cost function.

Backpropagation is then the algorithm that works out how to adjust the neural network’s parameters (weights and biases) so it predicts more accurate relevancy scores for documents against queries (in other words, minimizes the cost function). Gradients represent how much, and in which direction, each parameter needs to change.
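As a toy illustration of that predict-compare-adjust loop, here’s the arithmetic for a single made-up parameter; real ranking models obviously involve vastly more parameters and signals, but the principle of each gradient step is the same.

```python
# One-weight "relevance predictor": predict, measure the cost, compute the
# gradient, and nudge the weight to reduce the cost. All numbers are invented.

feature = 0.8        # hypothetical input, e.g. how well the page covers the query topic
target = 0.9         # desired relevance score (from training data or user feedback)
weight = 0.2         # the model's single parameter, starting off badly
learning_rate = 0.5

for step in range(10):
    prediction = weight * feature                    # forward pass: predicted relevance
    cost = (prediction - target) ** 2                # cost function: squared error
    gradient = 2 * (prediction - target) * feature   # d(cost)/d(weight)
    weight -= learning_rate * gradient               # update: move the weight downhill
    print(f"step {step}: prediction={prediction:.3f} cost={cost:.4f} weight={weight:.3f}")
```

In a full neural network, backpropagation applies this same chain-rule bookkeeping layer by layer across millions of parameters, but each weight is still just being nudged in whatever direction reduces the cost.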

Back in 2015, Danny Sullivan (then at Search Engine Land) reported that RankBrain was trained offline with past searches, implying it was periodically refreshed.

However, Google Gemini today suggests advancements in Google’s AI technology mean RankBrain may have online components or at least be supported by additional machine learning models that use online learning.

Google Gemini answer about online learning and RankBrain.

Again, here’s the text for mobile users:

“You bring up an excellent point! Here’s what we need to consider when looking at that article from 2015 and considering online learning in Google Search today:

  • Continuous Evolution: Machine learning and AI-powered search have progressed significantly since 2015. While RankBrain might have started with primarily offline training, it’s very likely that Google has incorporated online learning techniques to continually refine and improve its search results.
  • Implicit Signals as Online Learning: Here’s a possible scenario where online learning occurs even if core RankBrain components maintain offline training cycles:
    • Observing User Behavior: Search engines gather huge amounts of live data about how people interact with search results. Things like clicks, the time spent on pages, immediate back-clicks, and query reformulations serve as strong implicit signals about result quality.
    • Adaptive Models: It’s possible that even if the core RankBrain model is updated less frequently, additional machine learning models are used to analyze these live signals. These models might adjust the ranking presented to users (or personalize results) based on this near-real-time feedback.

Possible Techniques for Online Learning:

  • Reinforcement Learning: Search engines could leverage models that learn by maximizing “rewards” (positive user interactions with results). This fits well with implicit feedback mechanisms.
  • Fine-tuning Pre-trained Models: Online learning might involve selectively updating pre-trained models (like BERT-based models) based on fresh data and observed patterns.
  • Hybrid Approaches: Google likely employs a complex blend of offline learning, online learning, and other adaptive techniques to keep the system up-to-date.”

Google has also announced further AI advancements since RankBrain.

Neural Matching in 2018 likely plays a role in natural language processing (NLP) and in understanding entity relationships to match queries with documents. In other words, it can find points of relevance beyond exact keyword matches on pages.

BERT, announced for Search in 2019, has a similar impact, enabling a more accurate understanding of query intents and how document content aligns with them. In other words, it can better understand naturally written queries and the content that answers them.

And MUM in 2021 likely supports Google’s ability to connect unique image or video content with satisfying search intents. In other words, think of Alan’s point about original product images in the transcript above.
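None of this tells us what those models actually look like inside, but the general intuition behind matching “beyond exact keywords” is often illustrated with embeddings: represent the query and each document as vectors of numbers and score how close they are. Here’s a minimal sketch with made-up vectors; in practice they would come from a trained language model:

```python
import numpy as np

# Cosine similarity between a query vector and two document vectors.
# The three-dimensional vectors below are invented for illustration only.

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

query_vec = np.array([0.9, 0.1, 0.3])      # e.g. "how do I fix a leaky tap"
doc_plumbing = np.array([0.8, 0.2, 0.4])   # page about repairing faucets
doc_recipes = np.array([0.1, 0.9, 0.2])    # page about infused-water recipes

print("plumbing page:", round(cosine_similarity(query_vec, doc_plumbing), 3))
print("recipes page: ", round(cosine_similarity(query_vec, doc_recipes), 3))
# The plumbing page scores higher even though "faucet" never literally
# matches "tap" -- that's the intuition behind semantic matching.
```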

How does this all apply when we’re talking about linking and “looking natural”?

Well, let’s first discuss linking.

In that 2015 SEL article, it’s also mentioned that, at that time, RankBrain was a top 3 ranking signal, with links and content as first and second.

However, in 2023, Google’s Gary Illyes, speaking at Pubcon Pro in Austin, said, “I don’t agree it’s in the top three. It hasn’t been for some time.” (“It” meaning links.)

Furthermore, in Google’s post-trial debrief (document #833) we see excerpts about how user interaction data has become less important compared to natural language processing.

When I asked Gemini why this might be the case (keeping in mind it’s an LLM, not a fact checker), it explained this could be because user interaction data is messy, while models like BERT and MUM help understand queries and content on deeper levels without direct feedback.

It also appears that E-E-A-T alignment (as represented algorithmically) could act as a proxy for the predicted user satisfaction of content based on its source.

Gemini answer about user interaction data and NLP.

Here’s the text of the reply:

  • Implicit Signals are Tricky: Clicks, dwell time, etc., are valuable, but they’re imperfect indicators of quality. Users might click on the top result out of habit, or spend time on a page that’s not actually helpful.
  • The Saturation Point: Google has amassed such a huge dataset of user interactions that the marginal benefit of each additional data point decreases. It’s likely reached a point where this data alone doesn’t lead to massive leaps in quality.

What Replaces It?

  1. Advances in NLP & Machine Learning:
    • Understanding Intent: Models like BERT and MUM achieve a deeper understanding of query meaning and the content of web pages. This allows ranking based on a better match between what the user needs and what the page offers, even without direct feedback.
    • Zero-Click Searches: Search engines can now directly answer complex questions (through featured snippets), reducing the need to rely on whether users clicked on any result.
  2. Focusing on Content Quality:
    • Evaluating Expertise, Authority, Trustworthiness (E-A-T): Google could increasingly rely on algorithms designed to analyze and score the quality of content independent of user clicks.
    • Semantic Understanding: Understanding synonyms, related concepts, and overall structure of content allows search engines to rank pages even without a perfect keyword match, which was harder to achieve without sophisticated AI models.
  3. A Hybrid Approach (Most Likely):
    • User data is still valuable: While it may have diminishing returns, user interaction data will likely still play a role, potentially in a more refined form.
    • Reinforcement Learning: Search engines might employ AI techniques that find a balance between immediate rewards (user clicks) and long-term exploration (trying new results).

Going back to Alan’s excerpt in the transcript, there can be all sorts of signals at play.

This gives us a lot to think about, but really it speaks to the evolution of the underlying technologies that power search results.

Links (like user interaction data) are likely not as critical today as they were in 2015, let alone in 2005, because more advanced AI systems (than, say, PageRank) can more accurately predict a webpage’s relevance to a query.

But what about the broader topic of optimizing for users and not search engines, or the difference between “looking natural” and “being natural”?

If Google Search can adjust its search results through backpropagation (either online or offline) of neural networks based on user interaction data, then initial signs of relevance that satisfy the AI models, or “looking natural,” likely only go so far.

In practical terms, the H2s, links, and keywords or entities used to optimize a page’s relevance for Search only hold water long term if the users who visit the page agree those elements help solve their needs.

This is the point about optimizing for users.

We can also bet that neural networks can be trained to act as proxies even for user interactions.

Here’s a diagram from a recent paper by Google Research about ScreenAI:

ScreenAI gif from Google Research.
Source: Google Research

Just imagine the applications this sort of technology could have for predicting user satisfaction or page “helpfulness” (such as Page Experience insights).

Here are some predictions from Gemini (just food for thought) on how a technology like ScreenAI could be used for Search purposes:

Google Gemini hypothesizing about ScreenAI technology for Search.

And here’s the text version:

Key Strengths of ScreenAI:

  • Understanding UI Elements: ScreenAI excels at parsing complex visual interfaces, identifying icons, buttons, text, and their relationships. This extends beyond web pages to app interfaces too.
  • Multimodal Capability: It processes images and text together, giving it a nuanced understanding of the context surrounding content within a page layout.
  • Question Answering, Navigation, & Summarization: ScreenAI is trained to not only answer questions about the screen’s content but also navigate the interface and provide summaries. This demonstrates its versatile language understanding ability.

How ScreenAI Could Help Google:

  1. Beyond Text Analysis: Traditional search engines primarily focus on textual content. ScreenAI could supplement this by:
    • Rich Media & Infographics: Analyzing images, diagrams, and charts within a page, extracting meaning and answering questions even if they aren’t explicitly described in the text.
    • Page Structure: Understanding how content is laid out can indicate importance, relationships between elements, and overall clarity for users.
  2. “Reading” the User Experience: ScreenAI’s UI understanding could help Google assess factors contributing to a positive user experience:
    • Navigation: Are key elements easy to find? Is the layout intuitive?
    • Visual Clarity: Are images appropriately integrated, and do they complement the information being presented?
  3. Automated Quality Signals: ScreenAI’s capabilities could be harnessed to generate signals used in ranking:
    • Automatic Summaries: ScreenAI could create concise summaries of complex pages, giving search engines a quick reference point.
    • Answering Hypothetical Questions: Asking ScreenAI “What does this chart show?” or “How do I find [item] on this page?” could test comprehensiveness and ease of use.

When we hear Google say, “optimize for users,” it’s because users may help determine relevance in a variety of ways.

“Looking natural” might get a website ahead for a while, but only until Google’s systems (informed by users) catch up.

And it could work both ways. Perhaps a lot of the collateral damage we’re seeing, where site owners who created original content were still impacted by the third HCU or the March 2024 core update, happened because they followed the examples of other sites that were only trying to “look natural.”

It’s not hard to spot templated layouts or SEO-first decisions with the naked eye. Imagine if you had thousands or millions of examples to train neural networks on.

“Being natural” is therefore the timeless answer. We saw such people-first optimizations being endorsed back in 2005 in Cre8site, as kensplace pointed out, and it’s still a focus today.

The only question is whether your natural content is also helpful and relevant. 😉

“In the future”

Thanks for checking out Lesson #2 of Hamsterdam History!

Check back next week for a new history lesson, or subscribe to the newsletter for this and weekly news recaps.

Until next time, enjoy the vibes:

Thanks for reading. Happy optimizing! 🙂


Need a hand with a brand audit or marketing strategy?

I’m an independent brand strategist and marketing consultant. Learn about my services or contact me for more information!
