Everything You Need to Know (Or A Lot of It, Probably) from Google Search Central’s SOTR Podcasts (as Told Through Gemini Prompted with 75 Transcript PDFs)
By Ethan Lazuk
Last updated:

“Okay, you’ve hit the SEO sweet spot!” Gemini said to me.
What a nice feeling.
That’s Gemini 1.5 Pro with a 2 million token context window.
I recently got access to it in Google AI Studio:
What I’m referring to with that colloquial Twitter-English above is that Gemini and I recently had a long chat.
It began when I went to the web and downloaded several free textbooks about deep learning, information retrieval, digital marketing, web design, and CRO.
I also gathered several Google patents and AI research papers that were in my recent downloads and the text of a few SEO blog articles currently open in my browser.
I put ALL of that information into prompts in Gemini with a goal of creating a “holistic SEO” outline, one on the scale of a textbook — or a blog series. 😉
I didn’t put it all in at once, though. It was part of a long conversation in a “Chat prompt.”
Along the way, I’d ask for takeaways, like vocabulary lists or document summaries.
In total, my “Token Count” was: 1,285,238 / 2,097,152.
In other words, I used 61.3% of the available context window.
Two million tokens isn’t quite an infinite context window (although Google is working on that, too), but it feels close.
You’d really have to try hard to hit 2 million tokens, probably with a lot of video.
What’s the advantage of putting these materials into Gemini prompts, though? Doesn’t it have that knowledge already?
It probably does. Nothing I added was proprietary, and it all came from the open web.
There are practical advantages, though.
You might be familiar with retrieval augmented generation (RAG), where LLMs are grounded with external data sources to prevent hallucinations. (AI Overviews is one example.)
This is kind of like that, except rather than connecting the model to a database or the web for context (the ground truth or epistemic certainty), I merely put the data in the prompt.
Here’s a description (from a VentureBeat article about infinite context windows) that describes the benefits of prompts like this:
“Currently, customizing LLMs for specific applications requires techniques such as fine-tuning or retrieval-augmented generation (RAG). While those techniques are very useful, they require challenging engineering efforts.
An LLM with infinite context could, theoretically, enable you to insert all of your documents into the prompt and let the model pick the most relevant parts for each query. It could also enable you to customize the model by providing it with a long list of examples to improve its performance on specific tasks without the need to fine-tune it.”
– Ben Dickson, VentureBeat (2024)
That’s a good way to think about it — we’re customizing Gemini.
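The idea can be sketched in a few lines of Python (my own illustration, not Gemini's actual API): rather than wiring up a retrieval pipeline, you concatenate the source documents straight into one long prompt and let the model sort out what's relevant.

```python
def build_stuffed_prompt(documents, question):
    """Long-context 'customization': paste every source document into
    the prompt and let the model pick the relevant parts, instead of
    retrieving passages from a database first (as RAG would)."""
    parts = ["Answer using only the documents below.\n"]
    for i, (title, text) in enumerate(documents, start=1):
        parts.append(f"--- Document {i}: {title} ---\n{text}\n")
    parts.append(f"Question: {question}")
    return "\n".join(parts)

docs = [
    ("IR textbook excerpt", "Tokenization is the task of chopping text into pieces..."),
    ("SEO blog post", "Holistic SEO considers content, UX, and technical health..."),
]
prompt = build_stuffed_prompt(docs, "Outline a holistic SEO strategy.")
```

The trade-off is exactly the one VentureBeat describes: no fine-tuning or retrieval engineering, at the cost of burning context-window tokens on every request.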
At the end of that conversation, I asked Gemini to make an outline for a holistic SEO strategy — it was a topic I’d just written about in Hamsterdam History, so it was top of mind.
What came back was a thorough outline:

And that’s just a fraction of the full response.
I pasted the answer into a Google Doc for editing, and here was the final word count:

On that topic, you might be wondering: how many words are 2 million tokens?
It first helps to understand what a token is.
If you listen when Google engineers like Gary Illyes speak, they’ll sometimes use the word “token.”
Here’s how Stanford’s page on “Tokenization” — which is on an nlp.stanford.edu subdomain and in an /IR-book/ subfolder, so we know it’s relevant context 😉 — describes a token:
“Given a character sequence and a defined document unit, tokenization is the task of chopping it up into pieces, called tokens, perhaps at the same time throwing away certain characters, such as punctuation. Here is an example of tokenization:
Input: Friends, Romans, Countrymen, lend me your ears;
Output: Friends | Romans | Countrymen | lend | me | your | ears
These tokens are often loosely referred to as terms or words, but it is sometimes important to make a type/token distinction. A token is an instance of a sequence of characters in some particular document that are grouped together as a useful semantic unit for processing.”
– Tokenization (Stanford NLP)
This is an important distinction: tokens can be words, but they don’t have to be.
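Stanford's toy scheme, chopping on punctuation and whitespace, takes only a few lines of Python (real LLM tokenizers use learned subword vocabularies, not this simple rule):

```python
import re

def tokenize(text):
    """Chop a character sequence into tokens, throwing away punctuation,
    per the toy scheme in the Stanford IR-book example."""
    return re.findall(r"[A-Za-z]+", text)

print(tokenize("Friends, Romans, Countrymen, lend me your ears;"))
# ['Friends', 'Romans', 'Countrymen', 'lend', 'me', 'your', 'ears']

# The type/token distinction: tokens are instances, types are distinct forms.
tokens = tokenize("to be or not to be")
print(len(tokens), len(set(tokens)))  # 6 tokens, 4 types
```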
Here’s a screenshot from an awesome 3Blue1Brown video on attention in transformers that demonstrates this:

Transformers are also a huge part of NLP and LLMs: they’re the “T” in BERT, GPT, and T5, and they likely play a role in MUM as well.
As for the word-to-token ratio, it varies based on the model and how its tokenization process works.
OpenAI’s FAQs explain that 100 tokens equals around 75 words:

Using that math, 2 million tokens in Gemini’s context window would equate to a prompt of around 1.5 million words.
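That conversion is only a rule of thumb (the 0.75 ratio is OpenAI's estimate; Gemini's tokenizer may differ), but it's handy as a one-line helper:

```python
# OpenAI's rule of thumb: 100 tokens ~ 75 words, i.e. 0.75 words per token.
WORDS_PER_TOKEN = 0.75

def tokens_to_words(n_tokens):
    return round(n_tokens * WORDS_PER_TOKEN)

print(tokens_to_words(2_097_152))  # 1572864 -- roughly 1.5 million words
```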
OpenAI also has this fun tokenizer tool we can play with. How many tokens do you think we’ve written above so far (pre-final edit, that is)?

Colorful language, wouldn’t you say? 🙂
Now, I think I’ve got a great idea!
As part of this topic, I began digging into Google Search Central’s “Search Off the Record” podcast transcripts to find a reference where Gary or someone else mentioned “token,” but that got me thinking …
I was originally planning to have this article be a “holistic SEO guide,” based on a reproduction of the prompt I did with Gemini from earlier, except now putting all of the resources in a single prompt.
Plot twist …
What if instead, we plug all of the PDF transcripts from Google’s Search Off the Record (SOTR) podcast episodes into Gemini and … asked some questions?
That sounds fun, actually.
I still want to create a holistic SEO guide, but the more I think about it, it might be better to spend time collecting the best sources first — as opposed to using a random mishmash.
I’ll work on that holistic SEO guide for next week and add a link here when it’s available. (Please don’t rage click that “dead link” now, though, or Clarity will know.)
Ok, let’s create some SEO-related FAQs using Google Search Central’s SOTR Podcast transcripts!
At the time of this writing, there have been 75 episodes of the SOTR podcast.
The main page lists 30 recent ones:

To get the full list, I’ll need to visit YouTube.
Or so I thought …
At first, I went to the YouTube videos, one by one.
The first episode didn’t have a transcript, so I copied that one from YouTube’s transcription and made my own PDF.
The next eight videos all had transcripts linked in their descriptions.
Each time I opened a video, though, it would autoplay.
Now, I enjoy the SOTR theme music, but soon it was living rent-free in my head. 🙂
That’s when I realized the breadcrumb links …

… and a page that lists all episodes with PDF transcript links.
Also, if you’re not familiar with SOTR, here are the main hosts:

Many episodes feature guests, as well.
I’m going to get to work now downloading the transcripts.
Feel free to enjoy some music while you wait:
Ok, I’m back.
It took a while — I also got distracted updating another post — but I’ve got the goods:

I wasn’t sure offhand which FAQs to ask Gemini, as these podcast episodes cover a lot of ground.
For the sake of simplicity, I had Gemini create a list of questions with answers.
I didn’t want to steer the model toward any particular episodes, so I followed Google’s prompting tips and used the following prompt:
“I’m creating blog content for SEO professionals and website owners. Given the uploaded transcripts from podcast episodes, please give me a comprehensive list of FAQs with answers about the most important topics related to SEO. These can include actionable advice for SEO professionals and website owners or helpful information about information retrieval, ranking systems, helpful content, machine learning, and related topics. Each FAQ should be listed as a numbered heading with the answer as paragraph text. The FAQs should also be subdivided by topic.”
I also didn’t ask for citations because I wanted the FAQs to come from all of the episodes holistically, like one big universe of information.
Here’s how the prompt looked with the transcript PDFs uploaded:

If you had to guess, how many tokens do you think our prompt used?
Our written portion without the PDFs was 96 tokens, per OpenAI’s Tokenizer tool.
So we can deduct around 100 from this total:

That means our 75 SOTR podcast transcript PDFs equated to around 850,450 tokens in Gemini’s context window.
That’s about 11.3k tokens per episode. Since each episode is around 30 minutes, that’s about 380 tokens per minute or 6 tokens per second.
Knowing that we have around 2.1 million tokens available, we could put approximately 185 podcast episode transcripts in one Gemini prompt.
The first episode was recorded on May 8th, 2020. That means roughly 1.5 episodes of SOTR come out every month. We need 110 more to hit our token limit, which will happen in around 73 months.
We should max out around June or July of 2030.
That will be a sad day.
Maybe we’ll have an infinite context window by then (and GSC data for AI Overviews). 😉
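For the curious, that back-of-envelope forecast can be sketched as follows (token totals taken from the prompt counter above; the release rate is approximate):

```python
# Back-of-envelope: how many SOTR transcripts fit in a 2M-token window,
# and how long until new episodes fill it?
transcript_tokens = 850_450   # the 75 episode PDFs, per the prompt counter
episodes = 75
window = 2_097_152

per_episode = transcript_tokens / episodes      # ~11,339 tokens/episode
capacity = round(window / per_episode)          # ~185 episodes fit
remaining = capacity - episodes                 # ~110 more to go

months_elapsed = 49                             # May 2020 through June 2024
rate = episodes / months_elapsed                # ~1.5 episodes/month
months_to_fill = remaining / rate               # ~72 months
```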

Without further ado! Here’s everything you need to know (or a lot of it, probably) from the first 75 episodes of Google’s Search Off the Record podcast.
These FAQs cover all episodes from May 2020 to June 2024. I’ll add new FAQs when future podcasts are released.
Also, an AI content caveat: I wrote everything you’ve read so far, but Gemini wrote much of what’s to come in the following sections (meaning it’s AI-generated content).
All content from Gemini responses will be in IBM Plex Mono font, so you can tell it apart.
I changed anything that was outdated (like live Q&A office hours). While I’ve tried to proof everything, please double-check any AI-generated information with an authoritative human source (which is my conclusion, spoiler alert!). 😉
General SEO
1. Is SEO dead?
No. SEO is not dead. The core principles of SEO, like creating high-quality content for users, remain relevant and essential for website visibility and success. While specific techniques and algorithms evolve, the need for optimizing websites to be easily found and understood by search engines persists.
2. Do I need to know HTML as an SEO professional?
Yes, understanding HTML remains crucial for SEO professionals. HTML is the foundation of the web and underpins how search engines understand and process web pages. Even with the rise of CMS platforms and visual editors, knowing HTML allows for advanced customization, optimization of elements like title tags and meta descriptions, and troubleshooting technical SEO issues.
3. Should I build my own website to learn SEO?
Yes, building your own website is a highly recommended way to learn and practice SEO. It provides a risk-free environment to experiment with different techniques, test the impact of changes, and gain a practical understanding of SEO principles. Free platforms like WordPress offer easy setup and allow you to focus on content optimization and other key aspects.
4. Should I be worried about every Google ranking update?
No. While Google makes frequent updates to its ranking systems, most of these changes are focused on refining existing algorithms and improving the quality and relevance of search results. Instead of reacting to every update, focus on consistently creating high-quality, helpful content that serves your target audience. If you notice a change in your website’s performance after an update, review Google’s guidance on the specific ranking system to ensure your content aligns with best practices.
5. What is the best way to get help with my SEO issues?
If you’re facing specific SEO issues with your website, the best approach is to utilize public channels to seek help:
- Google Search Central Help Community: Post your questions in the forum to get help from experienced SEO professionals and Google product experts.
- Social Media: Tag relevant Google Search team members on X or LinkedIn with your questions to engage in public conversations and discussions.
Avoid reaching out through private channels like direct messages or emails, as this goes against Google’s commitment to “Honest Results,” which ensures a level playing field for all website owners regardless of their relationship with Google.
Technical SEO
1. What is the difference between pre-rendering, server-side rendering, and dynamic rendering?
- Pre-rendering: Generating a static HTML version of a dynamic website when content changes are predictable. This is suitable for sites like blogs where you know when content will be updated.
- Server-side rendering: Running JavaScript on the server to generate HTML every time a user requests a page. This is ideal for websites with dynamic content that changes frequently based on user interactions.
- Dynamic rendering: Serving a pre-rendered, non-JavaScript version of a page to search engine bots and the full JavaScript version to users. While this was previously recommended, it’s no longer considered a best practice due to its complexity.
Choosing the right approach depends on your website’s needs and content. If you’re unsure, consult with a developer experienced in SEO.
2. Should I use JavaScript on my website?
JavaScript can enhance website interactivity and user experience, but it can also pose challenges for SEO if not implemented carefully. Consider the following:
- Need for Interactivity: Only use JavaScript when necessary for specific functionality, like dynamic content or user interactions.
- Performance: JavaScript can significantly impact page speed and loading times. Optimize your code, minimize bundle sizes, and leverage server-side rendering or hydration to improve performance.
- Crawlability and Indexation: Ensure Googlebot can access and render your JavaScript content correctly. Use tools like the Rich Results Test and the URL Inspection tool in Search Console to check for issues.
- Third-Party Scripts: Be mindful of third-party JavaScript libraries, as their performance and accessibility can impact your website’s SEO.
3. Should I bundle my JavaScript code into one file or multiple files?
While bundling JavaScript code into one file can reduce HTTP requests, splitting the bundle into smaller chunks based on page or view functionality can improve caching and performance, especially for large websites with dynamic content. Explore code splitting techniques and tools like Webpack or Parcel.
4. Should I be worried about crawl budget?
Crawl budget refers to the number of URLs Googlebot can and is willing to crawl on your website within a given timeframe. Most websites don’t need to worry about crawl budget unless they have millions of pages or experience issues with slow crawling or indexing.
To optimize crawl budget:
- Ensure your server is fast and responsive.
- Avoid sending unnecessary “back off” signals like 429 or 500 status codes.
- Remove low-quality or duplicate content that might waste crawl budget.
- Submit sitemaps to inform Google about important URLs.
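(A quick aside from me, not Gemini: the “back off” point above is easy to monitor from your server logs. Here’s a toy Python sketch, my own illustration rather than anything from the transcripts; 503 is included alongside 429 and 500 because it sends a similar signal.)

```python
# Toy log audit: what share of responses sent crawlers a "back off" signal?
BACK_OFF = {429, 500, 503}

def crawl_health(status_counts):
    """status_counts maps HTTP status code -> number of responses served."""
    total = sum(status_counts.values())
    backing_off = sum(n for code, n in status_counts.items() if code in BACK_OFF)
    return backing_off / total if total else 0.0

share = crawl_health({200: 9400, 404: 300, 429: 200, 500: 100})
print(f"{share:.1%} of responses told crawlers to back off")
```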
5. Is it better to keep AMP pages with issues or remove them completely?
If your AMP pages have validation errors or are not functioning correctly, it’s generally better to either fix them or remove them entirely. Broken AMP pages won’t provide the intended speed and performance benefits, and they might even harm your website’s overall user experience. If you have a functional non-AMP version of the page, removing the broken AMP version is usually the best course of action.
Ranking Systems
1. How does Google decide on new ranking signals?
Google considers various factors before introducing a new ranking signal, including:
- Impact on Search Quality: Does the signal contribute to delivering more relevant and useful results to users?
- Weighting: Will the signal have a significant impact on rankings, or will it serve primarily as a tiebreaker?
- User Feedback: How do users interact with search results when the signal is applied?
- Rater Evaluation: How do human raters assess the quality and relevance of search results with the new signal?
- Feasibility of Spam: Is the signal easily manipulated or abused by spammers?
Rigorous testing and analysis are crucial before any new ranking signal is launched.
2. What is the purpose of core updates?
Core updates are significant changes to Google’s ranking algorithms designed to improve the overall quality and relevance of search results. They are not punitive measures but rather reflect Google’s ongoing efforts to better understand user intent and surface the most helpful and authoritative content.
3. Are links still an important ranking signal?
Yes, links remain a valuable signal for Google to discover new content and assess the authority and trustworthiness of websites. However, the weight and influence of links have evolved over time, and other factors like content quality, user experience, and topical relevance now play a more significant role in ranking.
4. What is the best way to handle duplicate content?
- Canonicalization: Utilize the rel=canonical tag to indicate the preferred version of a page when you have multiple URLs with similar content.
- Content Diversification: Avoid creating pages with near-identical content, especially for different language or regional versions. Focus on providing unique and valuable information for each target audience.
- 301 Redirects: Implement permanent redirects to consolidate content and signals when a page has permanently moved to a new URL.
5. Should I try to get my content indexed faster using the “Request Indexing” tool in Search Console?
For most websites with new content, there’s no need to manually request indexing through Search Console. Google’s crawling and indexing systems are designed to discover and process fresh content automatically. Focus on creating high-quality content and building natural links to your site, which will attract Googlebot’s attention. Use the “Request Indexing” tool only for time-sensitive content or to test if Google can access and index a specific URL.
Working With Developers
1. How can I better collaborate with developers on SEO?
Effective communication and collaboration between SEOs and developers is crucial for website success.
- Early Involvement: Engage developers early in the website development process to address SEO considerations from the outset.
- Shared Understanding: Educate developers on basic SEO principles and explain the impact of their decisions on website performance and visibility in search results.
- Clear Recommendations: Provide specific, actionable SEO recommendations that are relevant to the website’s technology and architecture.
- Open Communication: Foster an open dialogue, listen to developers’ perspectives, and find solutions that balance SEO best practices with development constraints.
- Regular Communication: Maintain ongoing communication throughout the development lifecycle to address any emerging SEO issues or opportunities.
2. Should I ask developers to change their JavaScript implementation for SEO?
Not necessarily. While JavaScript can pose challenges for SEO, most modern JavaScript frameworks are designed to work reasonably well with search engines. Focus on optimizing JavaScript performance and ensuring content is accessible to Googlebot. Only request changes to JavaScript implementation if you see clear evidence of SEO issues related to rendering, crawling, or indexing.
Search Behind The Scenes
1. How does Google handle language in Search?
Google uses a variety of signals to determine the language of a query and serve the most relevant results. These include:
- Browser Language Settings: The Accept-Language header sent by the user’s browser.
- Search Settings: The user’s preferred language setting in Google Search.
- Query Language: The characters and words used in the search query.
- User Location: The geographical location of the user, which can help disambiguate language and regional variations.
2. What is the role of machine learning in Search?
Machine learning plays a significant role in various aspects of Google Search, including:
- Understanding Content: Identifying the topic, sentiment, and entities present in web pages and other content.
- Query Parsing and Understanding: Expanding queries with synonyms, stemming words, removing stop words, and detecting concepts.
- Ranking: Assessing the quality and relevance of web pages based on a wide range of signals, including user behavior, content freshness, and topical authority.
- Spam Detection: Identifying and filtering low-quality or malicious content from search results.
3. How does Google handle outages?
Google has robust systems and teams in place to monitor and address any outages or issues that might affect search. In recent years, there’s been an increased effort to communicate more transparently about outages to the web community through:
- Search Status Dashboard: Provides timely updates on any known issues affecting crawling, indexing, or serving of search results.
- Social Media: Search Relations team members will often post updates on X (Twitter) about any ongoing outages or issues.
- Webmaster Help Forum: Users can report any suspected outages or issues in the forum to bring them to Google’s attention.
While outages are inevitable with any complex system, Google strives to minimize their impact and keep users informed.
4. How does Google test new features and algorithms?
Google employs rigorous testing methodologies before launching any new features or ranking algorithms. These include:
- Live Experiments: Rolling out changes to a small percentage of users to observe their impact on search behavior and metrics.
- Human Raters: Employing a global network of trained human raters to assess the quality and relevance of search results with new features or algorithms.
- Offline Testing: Utilizing internal testing environments to simulate different user scenarios and evaluate the performance and impact of new changes.
Data and feedback from these various testing methods are crucial for making informed decisions about launching new features or algorithms.
Ok, it’s me again …
I think there’s a lot of good stuff in there, in general.
Still, I think we can do better. 😉
What happens if I take several blog posts about Bing and ask for a comparison?
For this exercise, I’ll adjust the prompt and directly paste in the content from a handful of articles on Microsoft Bing Blogs and rerun the model:
“I’m creating blog content for SEO professionals and website owners. Compare the uploaded transcripts from the Google podcast episodes with the content of Bing blog posts to give me a comprehensive list of similarities and differences. Here is the Bing blog content for comparison: [Bing blog content]”
I chose 15 blog posts going back to 2020 from Bing’s Search and Webmaster blog areas. Just whatever felt relevant.
They added about 100k tokens to our prompt:

It’s not a totally fair comparison — podcast transcripts to blog posts — but just for the sake of experimentation, let’s see what Gemini came up with.
(Note that I removed stuff that was obvious, like “Bing appears to place a stronger emphasis on Microsoft-specific solutions.”)
1. Focus on User Experience
Both Google and Bing emphasize creating a positive user experience as the key to SEO success. They acknowledge that users want fast-loading, accessible, and engaging content, regardless of the underlying technology.
2. Importance of Content Quality
Both platforms stress creating valuable, informative content that satisfies user intent. They advise against tactics like keyword stuffing and thin content, advocating for a natural, user-centric writing style.
3. Transparency and Education
Both Google and Bing are increasingly transparent about their search processes, offering detailed documentation, tools, and events to educate webmasters and SEOs.
4. Adapting to Evolving Web
Both recognize the web’s constant evolution, acknowledging the need to adapt algorithms and guidelines to new technologies and user behaviors. They are actively exploring AI-powered solutions and user feedback to improve search experiences.
5. Addressing Spam and Abuse
Both platforms actively combat spam, using machine learning and manual actions to protect users and maintain search quality. They encourage user reporting and offer resources for addressing issues like hacking and negative SEO.
6. Embracing New Technologies
Both are receptive to new technologies, such as PWAs and WebAssembly, while acknowledging the need to address potential challenges related to performance and SEO.
7. Value of Data and Analytics
Both companies recognize the importance of data and analytics for understanding user behavior and optimizing websites. They offer tools like Search Console, Bing Webmaster Tools, and Microsoft Clarity to help webmasters gather insights and make informed decisions.
8. Approach to Crawl Budget
While both acknowledge crawl budget as a consideration for large websites, Bing seems to provide more detailed guidance on managing crawl budget and offers tools for optimizing crawl efficiency. Google downplays crawl budget concerns for most websites.
Overall, both Google and Bing prioritize content quality, user experience, and staying ahead of the curve in a constantly evolving web landscape.
That’s helpful, certainly.
It speaks to universal aspects of SEO, as opposed to focusing on this or that search engine.
I still think we can do better. 😉
What happens if I paste in recent SEO blog posts and ask for holistic takeaways?
I’ll repeat the above exercise, but I’ll use 15 recent SEO-related blog posts, chosen from the last 3 weeks of Hamsterdam recaps.
These posts were selected at random. Some pertain to Google, others to SEO tactics, and a few to related topics from the world of Search, AI, and digital marketing.
This timeframe did include the week of the API Content Warehouse leak, so the initial posts from Mike King, Rand Fishkin, and some related news were included. (Please reference their work for takeaways rather than any AI summaries below. Again, my conclusion, spoiler!) 😉
Instead of asking Gemini for a comparison, like we did with Bing’s blog, I’ll ask for key areas to learn more about using this prompt:
“I’m creating blog content for SEO professionals and website owners. Compare the uploaded transcripts from the Google podcast episodes with the content of SEO-related blog posts to give me a comprehensive list of important takeaways to know or learn more about. Here is the SEO-related blog content for comparison: [SEO-related blog post content]”
This brought our token count to its highest yet, adding over 500k tokens to the original:

Still, we didn’t even hit the million mark, yet!
As for the model’s response to our prompt, now we’re getting somewhere interesting …
Please don’t take this information at face value, though. It’s just an AI-generated summary of sources done for experimentation and may include hallucinations or information taken out of context. Verify with experts. (Again, my conclusion.) 😉
That said, I’ve updated any statements below with words like “may” to ensure the nuance is there and removed anything that sounded out of context.
Transparency vs. Secrecy
Google must balance transparency about its algorithms and protecting its systems from manipulation. The leak of Google’s internal Content Warehouse API documentation in May 2024 may have exposed this tension, potentially revealing features and data points previously denied by Google representatives.
Importance of User Experience
Google repeatedly emphasizes the importance of creating high-quality, user-centric content and websites. This is likely reflected in ranking algorithms, the focus on Core Web Vitals, and the shift towards more natural language processing in Search.
Evolution of Search
Google Search is constantly evolving. New technologies present both challenges and opportunities for SEOs. Understanding how these technologies work and impact Search is crucial for staying ahead.
SEO is a Long Game
Traditional SEO tactics like keyword stuffing and link schemes are increasingly ineffective. Building a strong brand, creating high-quality content, and focusing on user experience are the keys to long-term success in Search.
There were also some specific takeaways.
1. Google Does Use Clicks & User Data for Ranking
- NavBoost and Glue: The leaked API documentation confirms the existence of NavBoost and Glue, two ranking systems that may rely on user clickstream data.
- Clickstream Data from Chrome and Android: Google collects and analyzes clickstream data from Chrome and Android users, which may inform its ranking algorithms.
- Metrics like “Good Clicks”, “Bad Clicks” and “Last Longest Clicks”: These metrics indicate how Google may be measuring user engagement and satisfaction with search results.
- SEO Implication: Focus on creating engaging content that keeps users on the page. Longer click durations may signal relevance and satisfaction to Google.
2. Domain and Site Authority Signals Exist
- “siteAuthority” Attribute: Google may explicitly store a “siteAuthority” score for websites, potentially contradicting previous statements denying the existence of sitewide authority metrics.
- Homepage PageRank: Google may also consider the PageRank of a site’s homepage when evaluating individual pages, especially for newer websites.
- SEO Implication: Building a strong, authoritative website with high-quality content and backlinks from trusted sources is crucial.
3. Topical Authority Matters
- siteFocusScore and siteRadius: Google may measure how focused a website is on a particular topic and penalize pages that deviate too far from the core topic.
- Page and Site Embeddings: Google may use vector embeddings to analyze the topical relevance of pages and websites.
- SEO Implication: Create content clusters around specific themes and ensure pages are topically relevant to the overall site.
4. Content Quality is Crucial
- EffortScore and OriginalContentScore: Google may measure the effort put into creating content and the originality of short content.
- Keyword Stuffing Score and Gibberish Scores: Google may penalize websites for keyword stuffing and using gibberish content.
- SEO Implication: Focus on creating high-quality, original content that provides value to users. Avoid using AI content generators to churn out low-effort content.
5. Links Still Matter, But Quality is Key
- sourceType: Google may assign a quality score to links based on the indexing tier of the linking page. Links from high-quality, frequently updated pages are likely more valuable.
- Link Spam Velocity: Google may measure the velocity of spammy anchor text to identify and penalize link schemes.
- No mention of Disavow: The leaked documentation does not mention the disavow tool, suggesting it may be decoupled from the core ranking systems.
- SEO Implication: Focus on building high-quality backlinks from relevant, authoritative websites. Avoid buying links or participating in link schemes.
6. Technical SEO Remains Important
- Robots.txt: Google uses robots.txt to determine which pages to crawl, but it’s not a foolproof method for blocking sensitive content.
- AMP: Google still supports AMP, but it’s not a ranking factor. Fix any errors or consider removing AMP pages if they are not providing value.
- Page Experience Signals: Core Web Vitals are a set of user experience metrics that impact rankings.
- SEO Implication: Ensure your website is technically sound, mobile-friendly, and provides a good user experience. Use tools like Search Console and Lighthouse to diagnose issues.
7. New Technologies Will Continue to Shape SEO
- JavaScript: JavaScript is increasingly used to build websites, but it can present challenges for crawling and rendering. Developers should be mindful of performance and accessibility considerations.
- AI-generated Content: Google is aware of the increasing use of AI to generate content and is working to ensure it doesn’t negatively impact search quality.
- SEO Implication: Stay up-to-date on new technologies and understand how they impact Search.
Conclusion:
The leaked Google API documentation may provide valuable insights into the complexities of Google Search. It may confirm long-held beliefs about the importance of clicks, authority, and topicality, while also potentially revealing previously unknown factors.
The overall takeaway is that SEO is more than ever a long-term game that requires a holistic approach.
I definitely agree with that last sentence. 😉
Before taking anything you just read as fact, especially related to the API leak, I’d suggest visiting the original source.
John Shehata created a nice summary:
To that point …
So, what did we learn here? 🙂
Well, this has been an interesting exercise!
I had to work my tail off editing some of that last section, which tells me a few things.
In general, I think we learned AI models can be a good way to extract basic information. The takeaways in the first two sections were good, yet quite straightforward.
Once we got into the nuances of multiple viewpoints and opinions, the model tended to reference one side or another, with limited context. (Relatedly, I recently wrote about the challenges of epistemic vs. aleatoric uncertainty in LLMs.)
In short, I don’t believe the summaries above are as valuable as reading the original sources.
It’s certainly helpful to use generative AI to create efficiencies in SEO workflows, but it’s no substitute for hard work.
Even with a 2 million token context window, we didn’t get all of the context.
The information distilled above — whether it came from Google Search Central’s team members in a podcast or SEO professionals based on their own research and writing from experience — is best understood directly from the voice and perspective of the author(s).
That’s just my opinion, and it doesn’t mean AI-generated summaries lack value. 🙂
We just reviewed information from nearly 3 million tokens worth of prompts. That’s efficiency.
Let’s try one last prompt, though:
“Find an example from the podcast transcripts of someone using the word ‘token’ in a sentence.”
Well, that didn’t work:

On the other hand, it took about 5 seconds on Google Search to find this article from Search Engine Roundtable:

And there we have it:

To quote the article: “‘Basically, we tokenize them because we don’t need all the fluff that comes with the HTML,’ Gary explained.”
If there’s one thought I can leave you with:
Listen to the perspectives of human experts. 😉
Now are you ready for a meta moment?
I put this blog post’s full text back into our same Gemini prompt.
It came back with several good suggestions, including this one:
- Highlight Contradictions and Nuances: The podcast transcripts likely contain differing opinions and perspectives on various SEO topics. It would be interesting to highlight these contradictions and explore the nuances of different approaches.
Maybe summarization isn’t always the final benefit, but rather where that summarization can lead us to explore further.
Gemini just gave me an idea for round 2.
Outro
Thanks for making it through this article!
It was an experimental endeavor, but I hope you learned something; I know I did!
To hear more expert voices, head over to Hamsterdam for weekly SEO and AI news recaps, or visit my full blog for more articles.
Until next time, enjoy the vibes:
Thanks for reading. Happy optimizing! 🙂
Leave a Reply