How Does Perplexity Work? A Summary from an SEO’s Perspective, Based on Recent Interviews
By Ethan Lazuk

Pay attention to Perplexity.
The more I use it, the more impressed I am, and I’m sure others are, too.
That said, how Perplexity works is an interesting topic, especially from an SEO perspective.
Perplexity describes itself as an “answer engine,” not a search engine.
What does that mean, exactly? 🤔
Let’s discuss. 🙌
First off, this definitely isn’t everything there is to know about how Perplexity works.
It’s just good info that I’ve extracted from interviews given by their CEO, CTO, and head of search:
- Alexandr Yarats, Head of Search at Perplexity – Interview Series – Unite.AI (May 2024)
- Transforming Search with Perplexity AI’s CTO Denis Yarats – Gradient Dissent, Weights & Biases (June 2024)
- Transcript for Aravind Srinivas: Perplexity CEO on Future of AI, Search & the Internet – Lex Fridman, Podcast #434 (June 2024)
I’ve linked to other relevant sources, where needed, and I also plan to expand this over time.
While I’ve reviewed these sources in depth, I also used Gemini 1.5 Pro to summarize parts of the transcripts during my research. I did my best to ensure the information is accurate.
Still, I always encourage you to visit the original sources, as well. 😇
Based on the team’s recent interviews above, here’s what I found important to know about how the Perplexity answer engine works:
I’ve also concluded with a few thoughts and takeaways. (And a tune. 🎵)
Let’s jump in! 🤘
1. Query processing 🤖
Perplexity uses LLMs to interpret queries. These large language models go beyond simple lexical matches (keywords). They also identify words’ semantic relationships, or their underlying meanings and context.
This is why their search product works even when a question is worded naturally or informally, or is poorly structured:

Their LLMs classify queries based on their complexity and required resources (though I’m not sure whether this is current functionality or a future goal).
This allows (or would allow) the systems to allocate compute dynamically.
Perplexity also routes queries to the most appropriate models (and Pro users can choose):
Alexandr Yarats: “We engineered our system where tens of LLMs (ranging from big to small) work in parallel to handle one user request quickly and cost-efficiently. We also built a training and inference infrastructure that allows us to train LLMs together with search end-to-end, so they are tightly integrated. This significantly reduces hallucinations and improves the helpfulness of our answers.”
– Unite.AI Interview Series
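To make the routing idea concrete, here’s a minimal sketch of what complexity-based model routing could look like. This is purely illustrative — the tiers, thresholds, and model names are all hypothetical, since Perplexity hasn’t published its routing logic:

```python
# Hypothetical model tiers -- Perplexity hasn't published its routing logic,
# so every name and threshold here is invented for illustration.
MODEL_TIERS = {
    "simple": "small-fast-llm",       # short factual lookups
    "moderate": "mid-size-llm",       # multi-part questions
    "complex": "large-reasoning-llm", # research-style, multi-step queries
}

def classify_query(query: str) -> str:
    """Toy complexity heuristic; a production system would use a trained classifier."""
    words = query.split()
    if len(words) <= 6:
        return "simple"
    if len(words) <= 20:
        return "moderate"
    return "complex"

def route(query: str) -> str:
    """Pick a model tier so compute is allocated to match query complexity."""
    return MODEL_TIERS[classify_query(query)]

print(route("Who founded Perplexity?"))  # -> small-fast-llm
```

In practice, many such models run in parallel (as Alexandr describes), but even this toy version shows how dynamic allocation saves compute on easy queries.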
Perplexity initially relied heavily on OpenAI’s APIs. (This was around the era of GPT-3.)
Aravind Srinivas also referenced BERT in his interview with Lex Fridman, albeit in the context of Google (the “Now Boost” he mentions is almost certainly Google’s Navboost system):
Aravind Srinivas: “Google has this whole system called Now Boost (sic) that extracts the relevant metadata and relevant content from each raw URL content.”
Lex Fridman: “Is that a fully machine learning system with embedding into some kind of vector space?”
Aravind Srinivas: “It’s not purely vector space. It’s not like once the content is fetched, there is some bird m- … once the content is fetched, there’s some BERT model that runs on all of it and puts it into a big, gigantic vector database which you retrieve from. It’s not like that, because packing all the knowledge about a webpage into one vector space representation is very, very difficult.”
– Lex Fridman Podcast #434 (2:03:40)
However, Perplexity also has its own model called Sonar. It’s a post-trained version of Llama 3 70B designed for summarization, citation referencing, and maintaining context.
2. Retrieval 🎾
Perplexity’s evolution from relying on existing resources to building in-house infrastructure is also evident on the retrieval end of the answer engine.
In the early stages (2022), Perplexity relied on Bing as their search engine (in combination with OpenAI models for answer generation).
As the company grew, Perplexity began investing in a custom crawler (PerplexityBot) as well as an indexer and ranking algorithm.
[History side note: MSN originally used third-party crawlers, like Inktomi, but after that company was acquired by Yahoo! in 2003, and given competition from Google, Microsoft developed an in-house crawler for MSN and deployed it in 2005.]
PerplexityBot follows links and fetches content, abiding by robots.txt. However, Perplexity also uses third-party web crawlers (like the one that got them in the news recently).
Aravind has also spoken, in general terms, about the limitations of vector embeddings and the enduring value of traditional information retrieval methods, like the BM25 algorithm (a more sophisticated descendant of TF-IDF), as well as n-gram-based retrieval and domain authority signals (like PageRank).
For the record, term frequency (TF) measures how often a word appears in a document, while inverse document frequency (IDF) measures how common (or rare) a word is across all documents in a collection.
BM25 (Best Matching 25) addresses TF-IDF’s shortcomings: it normalizes term frequencies for document length, applies a non-linear saturation function so repeated occurrences of a word yield diminishing returns, and exposes tunable parameters (k1 and b) that can be adjusted for different document collections.
N-grams are characters, words, or syllables that documents and search queries can be broken into to find similarities, helping with issues like spelling errors or partial word matches.
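For a concrete sense of the math, here’s a minimal, self-contained BM25 scorer. This is the standard Okapi BM25 formula with the usual k1 (saturation) and b (length normalization) parameters — generic IR textbook material, not Perplexity’s implementation:

```python
import math
from collections import Counter

def bm25_score(query_terms, doc_terms, doc_freqs, num_docs, avg_doc_len,
               k1=1.5, b=0.75):
    """Standard Okapi BM25: higher k1 delays term-frequency saturation;
    higher b penalizes long documents more."""
    tf = Counter(doc_terms)
    doc_len = len(doc_terms)
    score = 0.0
    for term in query_terms:
        df = doc_freqs.get(term, 0)  # number of docs containing the term
        if df == 0:
            continue
        idf = math.log(1 + (num_docs - df + 0.5) / (df + 0.5))
        saturation = (tf[term] * (k1 + 1)) / (
            tf[term] + k1 * (1 - b + b * doc_len / avg_doc_len))
        score += idf * saturation
    return score

docs = [["perplexity", "is", "an", "answer", "engine"],
        ["google", "is", "a", "search", "engine"]]
doc_freqs = Counter(t for d in docs for t in set(d))
avg_len = sum(len(d) for d in docs) / len(docs)
print(bm25_score(["answer", "engine"], docs[0], doc_freqs, len(docs), avg_len))
```

Note how a rare term like “answer” earns a higher IDF weight than one that appears in every document — exactly the TF-IDF intuition that BM25 refines.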
Perplexity’s search index is smaller than Google’s. As a result, they’re focused on the head of the distribution curve, meaning the most popular and high-quality content from the sources most likely to be relevant and trustworthy.

They are less focused on the tail of the distribution, meaning less common or niche queries:
Alexandr Yarats: “It turns out that the majority of our users utilize Perplexity as a work/research assistant, and many queries seek high-quality, trusted, and helpful parts of the web. This is a power law distribution, where you can achieve significant results with an 80/20 approach. Based on these insights, we were able to build a much more compact index optimized for quality and truthfulness. Currently, we spend less time chasing the tail, but as we scale our infrastructure, we will also pursue the tail.”
– Unite.AI Interview Series
As a fallback, Perplexity leverages language models to generate possible answers or explanations, or to suggest rephrasing the query.
In short, if the retrieval system struggles to find relevant documents, the LLM can analyze less relevant ones and attempt to synthesize an answer:
Aravind Srinivas: “So what LLMs add is even if your initial retrieval doesn’t have a amazing set of documents, like it has really good recall but not as high a precision, LLMs can still find a needle in the haystack and traditional search cannot, because they’re all about precision and recall simultaneously. … You get the right link maybe in the 10th or ninth. You feed it in the model. It can still know that that was more relevant than the first. So that flexibility allows you to rethink where to put your resources in terms of whether you want to keep making the model better or whether you want to make the retrieval stage better.”
– Lex Fridman Podcast #434 (2:08:56)
However, unlike a traditional RAG (retrieval-augmented generation) setup, Perplexity restricts the LLM from contributing anything beyond the retrieved sources, to achieve better grounding:
Aravind Srinivas: “The principle in Perplexity is you’re not supposed to say anything that you don’t retrieve, which is even more powerful than RAG because RAG just says, ‘Okay, use this additional context and write an answer.’ But we say, ‘Don’t use anything more than that too.’ That way we ensure a factual grounding. ‘And if you don’t have enough information from documents you retrieve, just say, ‘We don’t have enough search resource to give you a good answer.’”
– Lex Fridman Podcast #434 (2:08:56)
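Here’s a toy illustration of that stricter-than-RAG constraint. The prompt wording is entirely invented — Perplexity hasn’t published its prompts — but it shows the principle: confine the model to retrieved snippets and give it an explicit “refuse” path:

```python
def build_grounded_prompt(query: str, snippets: list[str]) -> str:
    """Assemble a prompt that confines the model to the retrieved snippets.
    The instruction text is invented for illustration, not Perplexity's prompt."""
    sources = "\n\n".join(f"[{i + 1}] {s}" for i, s in enumerate(snippets))
    return (
        "Answer the question using ONLY the numbered sources below. "
        "Do not add any information that is not in the sources. "
        "If the sources are insufficient, reply: 'We don't have enough "
        "search results to give you a good answer.'\n\n"
        f"Sources:\n{sources}\n\nQuestion: {query}\nAnswer:"
    )
```

The refusal path is the key difference from vanilla RAG: the model is told what to do when retrieval fails, rather than being left to improvise.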
Perplexity also emphasizes high-quality scraping and parsing of webpages (not just indexing links) to extract relevant paragraphs for feeding the LLM and generating answers.
Unlike traditional search engines that focus on click probability, Perplexity prioritizes ranking content based on its helpfulness in answering the user’s query.

They also use a trust score for domains and webpages to help filter out low-quality content (and “search” spam). 😊
Recrawl frequency for maintaining index freshness is likely prioritized by query category and content source, among other factors.
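Since that prioritization is speculation on my part, here’s an equally speculative sketch of what category-based recrawl scheduling might look like. Every category and interval below is invented:

```python
from datetime import datetime, timedelta

# Hypothetical recrawl intervals -- Perplexity hasn't published its freshness
# policy, so these categories and numbers are invented for illustration.
RECRAWL_INTERVALS = {
    "news": timedelta(hours=1),
    "finance": timedelta(hours=6),
    "reference": timedelta(days=7),
    "evergreen": timedelta(days=30),
}

def next_crawl_due(category: str, last_crawled: datetime) -> datetime:
    """Schedule the next fetch based on how quickly a category goes stale."""
    return last_crawled + RECRAWL_INTERVALS.get(category, timedelta(days=14))
```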
3. Answer generation 📝
Once relevant documents are retrieved, Perplexity’s systems extract relevant snippets (paragraphs or sentences) using LLMs and embedding techniques to identify the most contextually relevant sections.
These extracted snippets are fed into an LLM, along with the original query. The LLM generates a concise and formatted answer based on the retrieved snippets.
Citations are included for sentences to demonstrate factual grounding and transparency.
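As a concrete example of the embedding side of that extraction step, here’s a generic cosine-similarity ranker that picks the snippets closest to the query in vector space. The embedding model is left unspecified — this is the general technique, not Perplexity’s pipeline:

```python
import numpy as np

def top_snippets(query_vec, snippet_vecs, snippets, k=5):
    """Rank candidate snippets by cosine similarity to the query embedding.
    query_vec: (d,) array; snippet_vecs: (n, d) array; snippets: n strings."""
    q = query_vec / np.linalg.norm(query_vec)
    S = snippet_vecs / np.linalg.norm(snippet_vecs, axis=1, keepdims=True)
    sims = S @ q  # cosine similarity of each snippet to the query
    top = np.argsort(-sims)[:k]
    return [(snippets[i], float(sims[i])) for i in top]
```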

The system also dynamically selects the best model for the query, unless the user specifies a preferred model (Pro version).
I’m not sure exactly how Perplexity’s citation process works. For comparison, Google is working on its AGREE framework, where citations are generated as part of the response.
I’d imagine Perplexity’s process is kind of in between that and post-hoc citations.
Maybe they keep track of source documents for retrieved snippets (e.g., by attaching token-level metadata or using an LLM). Then, after an answer is generated, they likely use sentence boundary detection and align each sentence with its most relevant source snippets (again, through metadata or perhaps semantic similarity).
But I’m not positive on that.
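To show what I mean, here’s a sketch of the post-hoc half of that guess: split the answer into sentences, then attach each one to its most similar retrieved snippet. Again, this is pure speculation on my part, not Perplexity’s method:

```python
import numpy as np

def align_citations(answer_sentences, sentence_vecs, snippet_vecs, threshold=0.6):
    """Speculative post-hoc citation alignment. All embeddings are assumed
    precomputed and L2-normalized; sentences below the similarity threshold
    get no citation rather than a forced (wrong) one."""
    citations = []
    for sentence, vec in zip(answer_sentences, sentence_vecs):
        sims = snippet_vecs @ vec  # cosine similarity to each source snippet
        best = int(np.argmax(sims))
        citations.append((sentence, best if sims[best] >= threshold else None))
    return citations
```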
In my opinion, the sources in Perplexity are there more for transparency than traffic referral. Hence the “answer engine” label, rather than “search engine.”
Clicks from Perplexity also don’t show as organic traffic in GA4, as far as I’m aware.
I see their source/medium reported either as “perplexity / (not set)” or “perplexity.ai / referral”:

Brave new world. Maybe they’ll hook us up with some webmaster tools. 🙏
4. Post-processing and refinement 🧑‍🏫
Good stuff in, good stuff out.
Perplexity focuses on retrieving accurate and relevant snippets for answers while training models to recognize inconsistencies and avoid generating unsupported claims.
Users can report inaccurate answers or other issues, as well.
Perplexity also leverages RLHF (reinforcement learning from human feedback), including both user feedback and their team of annotators (LLM teachers), who evaluate answer quality and retrain the models.
They also use active learning techniques (where models learn from selective data) to identify challenging queries and refine systems based on real-world usage patterns.
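A hand-wavy sketch of what that selection step could look like — the criteria here are invented, since Perplexity hasn’t published how it picks queries for review:

```python
def select_for_annotation(answers, budget=100):
    """answers: list of (query, answer, confidence) tuples from production logs.
    Active-learning style sampling: route the lowest-confidence answers to
    human annotators, whose ratings feed RLHF-style retraining.
    Illustrative only -- the selection criteria are invented."""
    return sorted(answers, key=lambda item: item[2])[:budget]
```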
Frequent retraining of models, updates to the index, and refinement of algorithms based on user feedback and usage data all contribute to iterative improvements for Perplexity.
A few closing thoughts on Perplexity’s evolution (and why “SEO spam” is really “search spam”). 😇
Perplexity has shifted from relying on third-party APIs to building its own internal search engine infrastructure and LLMs.
Early on, the company’s strategy was to leverage more existing resources (or what some call a “wrapper” strategy).
As they’ve gained market share, they’ve gradually committed to building more in-house resources.
Frankly, I’m impressed by their direction. After all, the answer engine figured this one out:

Perplexity also has acquired top-tier talent and has support from notable investors.
Their Series B funding blog post in January 2024 even called out “SEO spam.” 😭
For the record, if someone creates web spam, I wouldn’t consider that SEO.
SEOs focus on satisfying users to support business goals.
In a way, Perplexity and we do similar things, just from different angles. 🤜 🎆 🤛
I’d argue we should call it “search” spam, instead.
As for optimizing for Perplexity, the answer engine doesn’t rank results based on clickstream data. Its systems focus on the helpfulness and accuracy of answers, relying on trust scores for domains and webpages.
Their index is also smaller than Google’s currently, so they’re more focused on the head of the distribution curve (fewer niche queries).
In short, if you want your content to appear in Perplexity, focus on quality with a great source and brand behind it.
Just don’t expect to see “organic traffic” from the answer engine, as that’s not how Perplexity’s referrals get reported in GA4.
You may, however, receive some qualified clicks. 🙌
Outro
Thanks for checking out my overview of how Perplexity works, from an SEO’s perspective.
This is just a first attempt. I’ll revisit it and update it with more information over time.
Feel free to leave a comment or contact me with feedback, or check out related posts below.
Until next time, enjoy the vibes:
Thanks for reading. Happy optimizing! 🙂
Related posts
People Tell Me What to Say: Creating Helpful, Reliable, People-First Content for Google Search in 2024 & Beyond (An SEO Deep Dive)
Learn why current SEO tactics may be creating search engine-first content that’s hurting your visibility on Google Search in light of AI-based ranking systems like…
“Hello, World”: Exploring the History of Microsoft’s Search Engines, from MSN to Bing Sources in Today’s AI Chats (Hamsterdam History)
This Hamsterdam History lesson examines the history of search engines from Microsoft, from MSN Search in 1998 to Copilot with Bing and other AI chats…
USER-LLM: Contextualizing LLMs with User Embeddings for Enhanced Personalization (via Google Research), & Why SEOs Should Care (Maybe)
This rendition of Hamsterdam Research explores USER-LLM, a novel framework from Google Research for contextualizing LLMs with user embeddings built from user interaction data.