Ethan Lazuk

SEO & marketing professional.


Hamsterdam Part 59: Weekly SEO & AI News Recap (5/20 to 5/26, 2024)

By Ethan Lazuk

Last updated:

A weekly look-back at SEO & AI news, tips, and other content shared on social media & beyond.

Hamsterdam Part 59 SEO News Recap with Google spokesperson quote.
Source: NYT

Opening notes:

Feel free to jump to the main recap below, or continue reading for some vocab of the week, “This week in SEO history,” plus an introduction and summary of the week’s news!

Want these weekly Hamsterdam recaps delivered? Subscribe to the free newsletter here!


Marketing word of the week: “B2B”

B2B is short for “business-to-business,” which is when the target audience is other businesses or organizations, as opposed to B2C (business-to-consumer) or D2C (direct-to-consumer).

In general, B2B marketing has a longer sales cycle that involves multiple decision makers. Web searches (and SEO) can play a role throughout various points, especially in the upper- and middle-funnel stages but also in lower-funnel product research.

The B2B conversion point for digital marketing is often a demo or quote request, though some products or services are also sold directly online.

B2B Buyer's Journey illustrated by Gartner.

I’ve created SEO strategies for different types of B2B companies, and one thing I’ve learned is that unless you have a robust website with rich data in Google Search Console, keyword research likely won’t get you too far. Instead, work with internal experts to map buyer’s journeys for various personas, identify pain points, and focus on all decision makers.

I also find it’s helpful to produce thought leadership content with a distinct voice and perspective, especially in this gen-AI era where basic glossary definitions or guides will be handled by AI Overviews, chatbots, etc.

That said, internal search features or gen-AI customer agents grounded with company documents on a B2B website could help address long-form questions concisely.

Examples of B2B companies include technology companies (software, SaaS (software-as-a-service), or cloud services), business financial services (analytics or payroll), manufacturing (construction or industrial equipment), or marketing (CRMs, web developers, or SEO consultants). 😉

Some businesses are exclusively B2B, while others have both B2B and B2C product or service lines.


AI word of the week: “backpropagation”

Backpropagation is an algorithm in neural networks that implements gradient descent, an iterative process by which the system minimizes its loss function by adjusting the weights of neurons in the hidden layers. The goal is to get as close to the global minimum (perfect alignment between examples and outputs) as possible, though neural networks commonly settle into local minima.

In short, backpropagation is a really important part of training neural networks to make accurate predictions. It calculates the gradient of the loss function (a measure of prediction error), which is used to update the weights of neurons and improve the network’s performance.

It’s called backpropagation because the weights are adjusted moving backward through the system, from the final layer to the initial layer.

Backpropagation comes after a forward pass, in which a batch of examples (the input) is processed through the neural network and the loss is computed.
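As a minimal, hedged sketch of that loop (my own illustration, not from any source above — the architecture, learning rate, and toy XOR data are arbitrary choices), here is a forward pass, backward pass, and gradient-descent update for a tiny two-layer network in NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: XOR-style inputs and targets (illustrative only).
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])

# Small random weights: one hidden layer (ReLU) and an output layer.
W1 = rng.normal(scale=0.5, size=(2, 8))
W2 = rng.normal(scale=0.5, size=(8, 1))

lr = 0.05
losses = []
for step in range(2000):
    # Forward pass: process the batch and compute the loss.
    h = np.maximum(0.0, X @ W1)        # hidden activations (ReLU)
    y_hat = h @ W2                     # network output
    loss = np.mean((y_hat - y) ** 2)   # mean squared error
    losses.append(loss)

    # Backward pass: propagate error derivatives from output back to input.
    d_yhat = 2.0 * (y_hat - y) / len(X)  # dLoss/dOutput
    dW2 = h.T @ d_yhat                   # gradient for output weights
    dh = d_yhat @ W2.T                   # error flowing back into hidden layer
    dh[h <= 0.0] = 0.0                   # ReLU gradient: zero where inactive
    dW1 = X.T @ dh                       # gradient for input-to-hidden weights

    # Gradient descent: nudge each weight against its gradient.
    W1 -= lr * dW1
    W2 -= lr * dW2

print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

The point isn’t the specific numbers; it’s the shape of the loop: forward pass, loss, gradients flowing backward layer by layer, then a small weight update.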

Each neuron can contribute to the overall loss in different ways, so backpropagation determines how much to increase or decrease the weight applied to each neuron. This is also influenced by the activation function used. For example, sigmoid functions can suffer from the vanishing gradient problem, where early-layer gradients become very small, especially in deeper networks, while functions like ReLU (which we saw used in the Anthropic paper) can help mitigate that issue.
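To see why sigmoid gradients can vanish: the sigmoid’s derivative never exceeds 0.25, and backpropagation multiplies those derivatives layer after layer, so the signal reaching early layers shrinks geometrically with depth. A quick illustrative check (my own numbers, not from the article):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)

# The sigmoid derivative peaks at x = 0 with a value of 0.25.
peak = sigmoid_grad(0.0)

# Each sigmoid layer multiplies the backpropagated gradient by at most 0.25,
# so a 10-layer stack scales the earliest layer's gradient by <= 0.25 ** 10.
depth = 10
worst_case = peak ** depth

print(f"peak sigmoid gradient: {peak}")
print(f"upper bound after {depth} layers: {worst_case:.2e}")
```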

Artem Kirsanov did a fantastic YouTube video on backpropagation recently, where he called it “the most important algorithm in machine learning.”

Here are some screenshots:

Fitting a curve to represent data on an XY plane.
Calculating the loss for each data point (for the loss function).
Iterative adjustments to minimize the loss function.
Finding minima with gradient descent.
Iterative adjustments (learning) through backpropagation.

For more on this topic, I included an introductory guide to backpropagation in the AI section of Hamsterdam Part 56.

While it’s a foundational process for training neural networks, backpropagation isn’t always used.

Geoffrey Hinton’s research has been instrumental in the field of deep learning. Along with David E. Rumelhart and Ronald J. Williams, he published the 1986 paper, Learning representations by back-propagating errors, which provided some of the first evidence that a multi-layer network (perceptron) could be trained with backpropagation to solve problems.

Hinton has recently proposed the forward-forward algorithm:

“As a model of how cortex learns, backpropagation remains implausible despite considerable effort to invent ways in which it could be implemented by real neurons (Lillicrap et al., 2020; Richards and Lillicrap, 2019; Guerguiev et al., 2017; Scellier and Bengio, 2017). There is no convincing evidence that cortex explicitly propagates error derivatives or stores neural activities for use in a subsequent backward pass. …

The Forward-Forward algorithm (FF) is comparable in speed to backpropagation but has the advantage that it can be used when the precise details of the forward computation are unknown. It also has the advantage that it can learn while pipelining sequential data through a neural network without ever storing the neural activities or stopping to propagate error derivatives. …

The two areas in which the forward-forward algorithm may be superior to backpropagation are as a model of learning in cortex and as a way of making use of very low-power analog hardware without resorting to reinforcement learning (Jabri and Flower, 1992).”

– The Forward-Forward Algorithm: Some Preliminary Investigations (2022), Geoffrey Hinton, Google Brain

Instead of one forward pass and a backward pass (backpropagation), the FF algorithm uses two forward passes.

The first pass uses real (positive) data while the second uses generated or corrupted (negative) data. Each layer in the network has a “goodness” function, where the goal is to maximize the goodness for positive data and minimize it for negative data.
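As a loose sketch of that idea (simplified from Hinton’s description — I use raw goodness with plain gradient steps rather than his logistic objective, and the “positive” and “negative” vectors are made up), a single FF layer nudges its weights up for positive examples and down for negative ones:

```python
import numpy as np

rng = np.random.default_rng(1)

# Made-up "positive" (real) and "negative" (corrupted) input vectors.
x_pos = np.array([1.0, 1.0, 0.5])
x_neg = np.array([1.0, -1.0, -0.5])

# One ReLU layer; its "goodness" is the sum of squared activations.
W = 0.1 + 0.02 * rng.normal(size=(4, 3))

def goodness(W, x):
    h = np.maximum(0.0, W @ x)
    return float(np.sum(h ** 2))

lr = 0.01
for _ in range(200):
    # First forward pass (positive data): push goodness up.
    h_pos = np.maximum(0.0, W @ x_pos)
    W += lr * 2.0 * np.outer(h_pos, x_pos)  # ascend the goodness gradient

    # Second forward pass (negative data): push goodness down.
    h_neg = np.maximum(0.0, W @ x_neg)
    W -= lr * 2.0 * np.outer(h_neg, x_neg)  # descend the goodness gradient

print(goodness(W, x_pos), goodness(W, x_neg))
```

After training, the layer reports high goodness for positive data and near-zero goodness for negative data, and notably there is no backward pass: each layer learns from its own two forward passes.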

The FF algorithm is considered more biologically plausible (what our brains might do) than backpropagation. You can watch more in a presentation by Hinton in Hamsterdam Part 55.


This week in SEO history: “ALIWEB search engine” (1994)

In May of 1994, ALIWEB (Archie-Like Indexing for the Web) was introduced to the public at the first international WWW conference (WWW94).

This month marks the 30th anniversary of its launch.

ALIWEB was designed in 1993 by Martijn Koster to work specifically for the World Wide Web service.

This arguably makes it the first search engine, preceding WebCrawler (see Hamsterdam Part 54) by a few months.

Interestingly, according to Martijn’s personal website, he worked at WebCrawler, as well.

Martijn Koster website homepage.

Fortunately for us, Martijn’s website also has a history page, which mentions ALIWEB and links to a PDF explaining more about it.

How did the first search engine work?

Here’s a summary I had Gemini 1.5 Flash create based on the 1994 PDF Martijn linked on his site, called “ALIWEB – Archie-Like Indexing in the WEB”:

“Here’s how it worked:

  • Distributed Indexing: Instead of a single central database, ALIWEB relied on individual web server administrators to create and maintain index files for their own content. These files, in a specific format based on IAFA templates, described the server’s resources using attributes and values.
  • Automatic Collection: ALIWEB had a ‘harvesting’ process that periodically retrieved these index files from registered servers and combined them into a single, searchable database.
  • Simple Search Interface: Users could search the database using a simple search engine that looked for keywords within the index files.

Key features:

  • Low Maintenance: The distributed indexing model minimized the burden on a single entity to maintain the entire index, shifting responsibility to the individual server owners.
  • Up-to-Date: Because servers controlled their own index files, the information was generally up-to-date.
  • Contextual Results: The index file format allowed for descriptions and keywords, providing users with more relevant information about the resources found.”
– Gemini 1.5 Flash summary of ALIWEB PDF (1994)

What I find super interesting about ALIWEB is that users had a lot of control.

SEOs may be intrigued to know that website owners provided their own keywords.

Here’s an example of the index files users submitted:

Index file example for ALIWEB.
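For anyone who can’t see the screenshot, a reconstructed (not verbatim) example of an IAFA-style ALIWEB record might have looked roughly like this — the field values here are hypothetical:

```
Template-Type: SERVICE
Title:         Example Department Server
URI:           /example/
Description:   Information about the example department and its publications.
Keywords:      example, department, publications
```

Note how the owner supplies the title, description, and keywords themselves, which is exactly the user control discussed above.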

They could also choose which parts of their website to index.

On that point, Koster was also instrumental in creating the Robots Exclusion Protocol (REP) the same year ALIWEB was introduced publicly (1994).

Google Search Central acknowledges him specifically in their Formalizing the Robots Exclusion Protocol Specification blog post:

“For 25 years, the Robots Exclusion Protocol (REP) has been one of the most basic and critical components of the web. It allows website owners to exclude automated clients, for example web crawlers, from accessing their sites – either partially or completely.

In 1994, Martijn Koster (a webmaster himself) created the initial standard after crawlers were overwhelming his site. With more input from other webmasters, the REP was born, and it was adopted by search engines to help website owners manage their server resources easier.”

– Google Search Central Blog (2019)
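The protocol Koster started is still simple enough to demonstrate in a few lines. As a quick illustration (the robots.txt contents, bot name, and URLs here are made up), Python’s standard-library robotparser evaluates REP rules the way a well-behaved crawler would:

```python
from urllib import robotparser

# A made-up robots.txt in the format Koster's REP standardized.
robots_txt = """\
User-agent: *
Disallow: /private/

User-agent: ExampleBot
Disallow: /
"""

rp = robotparser.RobotFileParser()
rp.parse(robots_txt.splitlines())

# A generic crawler may fetch public pages but not anything under /private/.
print(rp.can_fetch("*", "http://example.com/index.html"))
print(rp.can_fetch("*", "http://example.com/private/data.html"))

# The hypothetical ExampleBot is excluded from the whole site.
print(rp.can_fetch("ExampleBot", "http://example.com/index.html"))
```

Thirty years on, the same partial-or-complete exclusion model Koster designed still governs how crawlers are expected to behave.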

Koster was working at NEXOR when he created ALIWEB.

Wikipedia links to a screenshot from 1997 via the Wayback Machine, showing it hosted on the NEXOR domain:

ALIWEB on Nexor domain via WayBack Machine circa 1997.

I saw ALIWEB is still returned in search results today:

ALIWEB Search Engine query on Google desktop.

However, it is on an unsecured (HTTP) domain, and it looks quite spammy:

ALIWEB homepage in 2024.

As it turns out, Koster isn’t affiliated with the site anymore, and shares similar feelings:

Quote from Koster's website about ALIWEB.com association.

It sucks to see an important part of search history cannibalized like that.

Nonetheless, in light of the 30th anniversary of its launch, I think we can look back fondly at ALIWEB’s contributions to the formative days of web search. 🙂

I look forward to learning and sharing more about the early days of ALIWEB and its contributions to search today in future articles.

Speaking of search today …

Let’s get to our introduction this week, talking about a difference of perspective.


Introduction to week 59: “a difference of perspective”

What I Told you was True from a Certain Point of View Star Wars Glue on Pizza.

I’ve been following developments in Google’s AI Overviews since the early days of it being a Search Labs experiment called SGE.

By and large, the SEO community has been skeptical of it, at least in my estimation.

Some people likely feel anxiety over potential lost traffic, and maybe a diminished ability to influence organic visibility through the methods they learned years prior.

I know a lot of SEOs also speak enthusiastically about the potential.

Google says AI Overviews create higher rates of clicks. I could see that.

That doesn’t mean more clicks, of course.

Bing has long spoken about “qualified clicks” with Copilot answers. In practice, that means spending more time in the generative AI summary, but having higher intent upon visiting a website.

And increasingly, the links or citations aren’t webpages, but videos, social media posts, and the like.

I could see the clicks to results in AI Overviews being more qualified, because they may be highly contextualized and not necessarily representative of the same search results below them.

And traffic isn’t always that valuable, either.

I’ve seen countless websites create popular blog posts that earned lots of organic traffic but didn’t convert well, because the content wasn’t relevant to their customers’ buyer’s journey.

Even rankings often feel like a misrepresentation to me, considering the degree of personalization or alternative surfaces for organic visibility beyond traditional search results.

Frankly, I’m not really concerned with losing traffic to AI Overviews.

This will mean having to explain to website owners a lot more context around what clicks or rankings mean — and what they miss.

Whether or not SEO strategies drive traffic like they did before, they still play an instrumental role in sustaining businesses’ visibility in the search and generative AI-based ecosystems.

The nature of SEO strategies will need to change, especially around content and going into channels beyond the website itself, but that’s always been the reality of SEO.

I’ve been using AI Overviews from the first week they rolled out as SGE last year, and I find them helpful.

I regularly reference them before checking web results, and sometimes they’re all I need.

And when it comes to doing SEO in the Gemini era of AI Overviews, I ain’t skurred.

Frankly, I’m excited to play in this environment as a holistic strategist.

But there’s also the other part of this equation.

The development of AI-driven search goes back a long ways.

RankBrain was foundational in 2015, but the year 2017 introduced a game-changer with transformers, kicking off a paradigm shift.

We had BERT in 2018, along with neural matching and Google’s Medic update. Then MUM came in 2021, followed by notable ranking changes from the reviews and helpful content systems and the hidden gems improvements.

Then by March of 2024, we’d reached a point where analyzing updates feels like it doesn’t do much good, at least at the level of individual factors.

It’s patterns and aggregate data.

Those aren’t informed by lists of tactics, but rather alignment conceptually with helpfulness and E-E-A-T criteria, as inferred through user behavior.

That’s why I started Hamsterdam Research, Hamsterdam Marketing, and Hamsterdam History as side projects to these recaps.

The goal is to contextualize what it means to optimize for users first in a way machines understand, and recognize how these developments are part of an interconnected web of affairs: slice and dice vs. explore, synonyms and semantics, neural networks, graph clustering, activated features. It’s all tiny pixels along a linear landscape that moves through time and encompasses a range of perspectives …

But what happens when those perspectives clash?

Reddit can be helpful.

Its jokes can be funny.

But what happens when those jokes become confidently stated answers about pizza glue in generative AI summaries?

People do put glue on pizza. They’re called advertisers.

Is that the right advice to give someone making a pizza, though?

The talk of the town this week has been Google’s AI Overviews producing such weird, untrue, or risky answers.

Remember back when we had spam stories in early 2024? This feels like an updated version of that.

It’s been featured in several outlets this week (a few of which you can see in the Articles section), most notably the New York Times:

“The new technology has since generated a litany of untruths and errors — including recommending glue as part of a pizza recipe and the ingesting of rocks for nutrients — giving a black eye to Google and causing a furor online.

The incorrect answers in the feature, called AI Overview, have undermined trust in a search engine that more than two billion people turn to for authoritative information.”

– Nico Grant, NYT

The story included this quote from Google spokesperson Lara Levin:

“‘Many of the examples we’ve seen have been uncommon queries, and we’ve also seen examples that were doctored or that we couldn’t reproduce,’ she added. The company will use ‘isolated examples’ of problematic answers to refine its system.”

Another spokesperson said the same thing about “uncommon queries” (via an article in The Verge).

Compare that to this excerpt from an interview Google’s CEO, Sundar Pichai, gave to Nilay Patel for The Verge:

“[Sighs] The thing with Search — we handle billions of queries. You can absolutely find a query and hand it to me and say, ‘Could we have done better on that query?’ Yes, for sure. But in many cases, part of what is making people respond positively to AI Overviews is that the summary we are providing clearly adds value and helps them look at things they may not have otherwise thought about. If you’re adding value at that level, I think people notice it over time, and I think that’s the bar you’re trying to meet. Our data would show, over 25 years, if you aren’t doing something that users find valuable or enjoyable, they let us know right away. Over and over again we see that.”

– Sundar Pichai, The Verge interview

In that interview, Sundar used the word “aggregate” three times.

That’s including this exchange, referencing sites impacted by ranking system adjustments (reviews/HCU/hidden gems/etc.):

Nilay: “A bunch of small players are feeling the hurt. Loudly, they’re saying it: ‘Our businesses are going away.’ And that’s the thing you’re saying: ‘We’re engaging, we’re talking.’ But this thing is happening very clearly.” 

Sundar: “It’s not clear to me if that’s a uniform trend. I have to look at data on an aggregate [basis], so anecdotally, there are always times when people have come in an area and said, ‘Me, as a specific site, I have done worse.’ But it’s like an individual restaurant saying, ‘I’ve started getting fewer customers this year. People have stopped eating food,’ or whatever it is. It’s not necessarily true. Some other restaurant might have opened next door that’s doing very well. So it’s tough to say.”

“I have to look at data on an aggregate [basis.]”

That to me is at the crux of this discussion.

We’ll have to see how often AI Overviews trigger, but let’s say for argument’s sake, it’s 45% of the time, based on this pre-I/O SEL article in May.

I’ve seen or heard referenced dozens if not a few hundred weird AI overview examples so far. (One popular thread had 13.)

What percentage is that of the total? Not sure, but …

Back when we were talking about spam in early 2024, the topic was prevalent in the news. Was the problem actually widespread, though, or were the spam results largely out of sight for most users?

In some ways, it doesn’t matter.

Perception is reality.

Let’s also caveat that doing AI Overviews at Google’s scale is a massive engineering challenge.

Consider this quote from an interview with Perplexity’s head of search (featured in last week’s Hamsterdam recap, Part 58):

“Building an index as extensive as Google’s requires considerable time and resources. Instead, we are focusing on topics that our users frequently inquire about on Perplexity. It turns out that the majority of our users utilize Perplexity as a work/research assistant, and many queries seek high-quality, trusted, and helpful parts of the web. This is a power law distribution, where you can achieve significant results with an 80/20 approach. Based on these insights, we were able to build a much more compact index optimized for quality and truthfulness. Currently, we spend less time chasing the tail, but as we scale our infrastructure, we will also pursue the tail.”

– Alexandr Yarats, Perplexity

I doubt it’s a Google problem.

After all, a suggestion to put glue on pizza still comes from somewhere — a joke informed by a real business practice.

But it’s not a good answer.

We’ll talk more about grounding in the AI section below, but how do you ground generative AI summaries with sources like Reddit (because that’s what people want) when those same sources can introduce weird answers?

As I said last week, AI-organized result pages feel like the real future of Search to me. AI Overviews feels like a competitive response and likely a primer for Gemini proper circa 2025. That’s just my guess, though.

In the meantime, does it matter if those weird Overview answers are a rarity for “uncommon queries,” or even doctored in some cases?

Here’s the perception of AI Overviews in Top Stories today:

Google AI Overviews top stories.

Sure, they may be helpful in the aggregate based on data (Google’s perspective), but the question is, what does everyone else perceive?

At least SEO remains fun. 🙂

Buckle up for a full week’s recap, and enjoy the vibes (Neil Young is back on Spotify! So here’s a rarity for you):

Thank you for supporting Hamsterdam and the cause of SEO & AI learning.

Missed last week? Don’t worry, I got you! Read Part 58 to catch up.

Other great sources of weekly SEO news:


Now, time for our weekly review of SEO social posts, articles, & more …

The Big Lebowski is this your homework Larry scene.

Quick summary

  • Sundar Pichai gave a notable interview with The Verge, spoke of “aggregate” data.
  • Microsoft Build introduced GPT-4o in a Copilot context.
  • Google’s site reputation abuse policy kicked off on May 5th but no algorithmic penalties have occurred yet.
  • Reddit gets special rich result treatment in a test; contributes to some AI Overview quirks, which are now tracked at the GoogEnough account.
  • Olaf Kopp’s shopping graph optimization SEL post is my pick of the week; my sneaky pick of the week is Anthropic’s research into activated features in Claude (which I wrote a bit about).
  • And much more!

Jump to a section of this week’s recap:

Or keep scrolling to see it all.

Ok, time to step inside the white flags of Hamsterdam …

Hamsterdam scene from The Wire with Carver pointing at the white flags.

SEO news, Google updates, & SERP tests

Notable updates or news related to Google Search or related SEO topics.

Context: Hashtags tied to search results.
Google Profiles Mastodon post.
Source

SEO tips & tidbits

Actionable tips, cool tidbits, and other findings and observations that can be teaching moments.

SEO (and AI) fundamentals & resources

Essential information, concepts, or resources to learn about SEO or AI.

Articles, videos, case studies & more

Longer-form content pieces shared on social, in newsletters, and elsewhere.

Excerpt: “I’ve talked to Sundar quite a bit over the past few years, and this was the most fired up I’ve ever seen him. You can really tell that there is a deep tension between the vision Google has for the future — where AI magically makes us smarter, more productive, and more artistic — and the very real fears and anxieties creators and website owners are feeling right now about how search has changed and how AI might swallow the internet forever. Sundar is wrestling with that tension.”

Google scrambles to manually remove weird AI answers in search – Kylie Robison, The Verge

Excerpt: “Google continues to say that its AI Overview product largely outputs ‘high quality information’ to users. ‘Many of the examples we’ve seen have been uncommon queries, and we’ve also seen examples that were doctored or that we couldn’t reproduce,’ Google spokesperson Meghann Farnsworth said in an email to The Verge. Farnsworth also confirmed that the company is ‘taking swift action’ to remove AI Overviews on certain queries ‘where appropriate under our content policies, and using these examples to develop broader improvements to our systems, some of which have already started to roll out.’”
Excerpt: “The way ecommerce SEO has worked so far will evolve due to changes in research behavior brought about by generative AI such as AI Overviews, ChatGPT and Copilot. Shop category pages will attract less and less organic traffic and users will increasingly be introduced to products through generative AI or LLMs. The extent to which this shift will occur is unclear.”
Excerpt: “But if you want to do better than anyone else, using AI-based tools for SEO, then an entity-based approach is, in our view, a stronger place to start.”

Technical SEO

Everything from basics to advanced moves (and also tools).

Gary Illyes LinkedIn reply to question about duplicate pages with different content.
Source

Content marketing

From what is helpful content to user journeys and beyond.

Data analysis & reporting

Showing that what you’re doing is helping.

AI, machine learning, & LLMs

Excerpt (Anthropic blog post): “Today we report a significant advance in understanding the inner workings of AI models. We have identified how millions of concepts are represented inside Claude Sonnet, one of our deployed large language models. This is the first ever detailed look inside a modern, production-grade large language model. This interpretability discovery could, in future, help us make AI models safer.”

Why it matters: I covered this in Hamsterdam Research this week. For AI researchers, it helps shed light on how activated features in LLMs may influence behavior, overcoming “black box” issues of model interpretability. For us SEOs, I think it holds interesting implications for semantic SEO strategies, like around content topic research, site architectures, and buyer’s journey modeling.

Excerpt: “Prior research on improving grounding mostly follows two prominent paradigms. One is to add citations post-hoc using an additional natural language inference (NLI) model. This approach heavily relies on the knowledge within an LLM’s embeddings and does not extend well to facts beyond that. Another common method for grounding is to leverage the instruction-following and in-context learning capabilities of LLMs. With this second approach, LLMs are required to learn grounding just from a few demonstration prompts, which, in practice, does not lead to the best grounding quality. Our new framework, AGREE, takes a holistic approach to adapt LLMs for better grounding and citation generation, combining both learning-based adaptation and test-time adaptation (TTA). Different from prior prompting-based approaches, AGREE fine-tunes LLMs, enabling them to self-ground the claims in their responses and provide accurate citations. This tuning on top of the pre-trained LLMs requires well-grounded responses (with citations), for which we introduce a method that can automatically construct such data from unlabeled queries. The self-grounding capability of tuned LLMs further grants them a TTA capability that can iteratively improve their responses.”

Why it matters: Hallucinations contribute to the unreliability problem with LLMs. Grounding links an LLM’s claims back to reliable sources to improve trustworthiness of answers. AGREE is a new framework from Google Research that uses a combination of fine-tuning (using synthetic data created from unlabeled queries) and test-time adaptation (seeking additional information if needed). It outperforms previous methods in terms of grounding and citation accuracy and can be extended to datasets outside of the training domain. We may presume Search or AI Overviews here, but I think this’ll be related to customer agents, especially, given that the researchers are associated with Google Cloud AI. (Full PDF – we’ll likely cover this in Hamsterdam Research this week!)

Excerpt: “These impressive results highlight the potential of sparse MoE architectures combined with co-upcycling to develop more capable yet efficient multimodal AI assistants. As the researchers have open-sourced their work, CuMo could pave the way for a new generation of AI systems that can seamlessly understand and reason about text, images, and beyond.”

Why it matters: CuMo can enable efficient scaling of multimodal LLMs and lead to the development of more interactive AI assistants that accept text and visual inputs. The paper includes researchers from ByteDance.

Why it matters: We mentioned Model Explorer last week. It offers a neat look inside ML models to analyze and debug them.

Humor

Subjectively funny content.

General marketing & miscellaneous

This is for great content that isn’t necessarily SEO or marketing-specific. PPC, PR, dev, design, and social friends, check it out!

Positive news Mastodon post by Simon Cox.
Source

Older stuff that’s good!

Not everything I find worth sharing is new as of this week, so these are gems I came across published in the past.

Great job making it to the end. You rock!

Want help with your SEO strategy?

I’m an independent SEO consultant based in Orlando, Florida, focusing on custom audits and strategies for brands. Don’t hesitate to reach out, or visit my about page for more information.

Let’s connect!

Hit me up anytime via text or call at 813-557-9745 or on social or email:

Cheers!

Editorial history:

Created by Ethan Lazuk on:

Last updated:

Need a hand with a brand audit or marketing strategy?

I’m an independent brand strategist and marketing consultant. Learn about my services or contact me for more information!
