🐹 Hamsterdam part two. (9/29 to 10/4, 2024)
A weekly marketing and AI content recap.

Welcome to the second week of the new Hamsterdam! 🐹
We have a new format here: rather than rounding up content from my socials, emails, and beyond, the focus is now limited to stories I find through Google Discover. 🤳 🌏
I’m trying to focus on publications or stories you might not see in your ordinary travels through SEO or marketing circles.
Why the change?
One reason is I’ve been off socials for a while for mental health reasons.
Here’s a quick introduction explaining more …
I was diagnosed with bipolar disorder three years ago.
A couple of months ago, I had an episode.
When these happen, my fight-or-flight kicks in, and I start to see or hear things that aren’t real. It can last a while.
It’s not scary, to be honest, but it’s embarrassing once you come out the other side.
There are usually weird social posts, emails, or website activity associated with the episodes, most of which I don’t remember, though I tried to limit that this time.
I did delete nearly all of my website’s content … but that’s something I’ll likely keep and rebuild from instead. (More on that later.)
Usually what sets off my bipolar episodes — as best I can tell now after having two of them — is feeling isolated from a social group while also having some success (and pressure) professionally.
I loved doing SEO, especially as a consultant, yet I found it hard to break through and make peer connections.
As Kendrick Lamar says in the song DNA, I’m an “antisocial extrovert.” 📢 🚪
I’m thankfully on the other side of the episode now, getting back to my old self …
But it also gave me insight into my “new” self and my future goals.
That’s why I’m broadening my consulting career beyond SEO to focus more on holistic marketing audits and brand-focused campaigns.
You’ll also see this reflected in my new blog content (though I’m not 100% back to my normal writing self yet, I’m working hard to get there).
I’m also working on a film script that discusses some of these themes, which I hope to follow through on in the future. 📼
But it’s for these reasons that I’m not on socials much …
You’ll see me share these Hamsterdams on X and LinkedIn on Fridays or Saturdays, but you probably won’t see me doing much else there for a while.
Thanks for understanding, especially if you saw any weirdness from my social accounts or emails the past two months.
The world needs more empathy, in my opinion. 🤗
With that all said, and without further ado, let’s dig in!
Excerpts go to the authors. Bolding is mine.
🧨 My pick of the week is No. 13, about semantic search and more accurate embeddings.
1. Researchers at UC Berkeley Developed DocETL: An Open-Source Low-Code AI System for LLM-Powered Data Processing – Pragati Jhunjhunwala, MarkTech Post

Excerpt:
“Handling unstructured data is challenging due to its inherent lack of structure and consistency. …
Current document processing methods often rely on manual techniques or basic automation that need more sophistication to handle unstructured data effectively. Natural language processing (NLP) tools may offer some capabilities but fall short when processing complex documents that require higher-level understanding. Researchers from UC Berkeley introduced DocETL, a more advanced, low-code solution powered by large language models (LLMs) to address the challenge of processing complex, unstructured documents. …
DocETL operates by ingesting documents and following a multi-step pipeline that includes document preprocessing, feature extraction, and LLM-based operations for in-depth analysis. The LLMs used within the system can handle tasks like summarizing long documents, classifying them into categories, answering user queries, and identifying key entities such as people or organizations. …
The tool’s efficiency heavily relies on the capabilities of the integrated LLMs, the design of the processing pipeline, and the quality of the input data, all of which contribute to its ability to automate complex workflows.”
2. Seeing the Forest: Using Graph RAG for Information Architecture – Jorge Arango

Excerpt:
“When designing an information architecture, you must approach the system — a website, product, book, whatever — as a whole. …
I’ve written previously about how AIs can augment information architecture work. Alas, out-of-the-box, LLMs work more effectively with individual content items than with larger sets, such as entire websites. …
What if we wanted to work on an entire website rather than on one post at a time? That requires different techniques. This post explains the use of one such technique: retrieval augmented generation (RAG) using knowledge graphs. …
One approach to overcoming these limitations is a technique called retrieval augmented generation, or RAG. The basic idea is searching within a predefined content repository for appropriate data to inject into prompts to make them more relevant. …
The additional context provided by this extra text makes a big difference, focusing responses on the material available in the repository. By default, this material consists of unstructured text. But text can also be structured to express semantic meanings, reducing ambiguity. These relations are captured in a knowledge graph, a structure that captures relationships between ideas. …
For information architecture, an obvious use case is producing a first draft for an organization scheme. Graph RAG allows me to take in all of the system’s content to look for patterns. The technique is much more effective than my previous experiment in this direction.”
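Arango walks through the concept at a high level; a toy version of the retrieval step makes it concrete. This is a minimal sketch, not his implementation, and the site map, triples, and helper names are all invented for illustration:

```python
# Minimal Graph RAG sketch: a knowledge graph as (subject, relation, object)
# triples, plus a retrieval step that pulls the facts connected to a query
# entity and formats them as extra context for an LLM prompt.

TRIPLES = [
    ("About page", "links_to", "Contact page"),
    ("About page", "covers_topic", "company history"),
    ("Blog", "covers_topic", "product updates"),
    ("Blog", "links_to", "About page"),
]

def retrieve(entity: str) -> list[tuple[str, str, str]]:
    """Return every triple that mentions the entity as subject or object."""
    return [t for t in TRIPLES if entity in (t[0], t[2])]

def build_context(entity: str) -> str:
    """Flatten the retrieved triples into text to inject into a prompt."""
    facts = [f"{s} {r.replace('_', ' ')} {o}" for s, r, o in retrieve(entity)]
    return "Known facts:\n" + "\n".join(f"- {f}" for f in facts)

print(build_context("About page"))
```

Instead of stuffing every page into the prompt, only the handful of facts connected to the entity in question gets injected, which is what lets the technique scale to a whole site.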
3. Why Your Pretty New Homepage Is Probably a Waste of Money – Jessica Stillman, Inc.

Excerpt:
“Thanks to an explosion of image generators and other fancy new tools, the web has recently become chock-full of extremely pretty websites that do a terrible job of actually selling anything, he argued on Medium recently. …
Over the course of her career, she has seen more than her fair share of failed homepage redesigns. ‘What usually happens?’ writes Verna. ‘A multi-month effort ensues, involving everyone and their mother’s opinions, and the result often doesn’t just fail to lift sign-ups — it can crash them.’ The problem, she explains, is two-fold. One, many potential customers make up their mind about a purchase long before they go Googling homepages, meaning the expense and effort of these huge redesign projects often fails to move any meaningful needle. But also, when companies redesign their websites, they generally want to make them more beautiful and fancy-looking. Verna’s word for this is ‘aspirational,’ and she claims it rarely works out. Nice-looking words and images usually just end up confusing those looking to make a purchase. …
But according to Malewicz’s detailed Medium post, the problem is more acute these days for two reasons: technology and design trends. Like Verna, he too has observed that business owners have long fallen into the trap of prioritizing good looks over results.”
4. How Perplexity’s AI is reinventing search – Jeremy Caplan, Fast Company

Excerpt:
“Tip: quick searches are fine when you’re just looking for a simple fact (e.g. when did Jordan retire). Pro searches are best for more intricate queries like the ones below. …
Perplexity breaks down complex queries into steps. It shows you the phrases it uses to conduct your search. …
Tip: Use a domain limiter to narrow your search to a particular site. Type domain:.gov to focus only on government sites. Or just use natural language to limit Perplexity to certain kinds of sites.”
5. Generative AI adoption surpasses early PC and internet usage, study finds – Michael Nuñez, Venture Beat

Excerpt:
“According to the paper, The Rapid Adoption of Generative AI, the technology has taken hold faster than previous transformative technologies like the personal computer (PC) or the internet. …
Generative AI is spreading faster than anyone could have predicted. Just two years after the public release of ChatGPT, 39.4% of Americans aged 18-64 reported using generative AI, with 28% using it at work. To put that in perspective, it took three years for PCs to hit a 20% adoption rate. … the research shows that adoption is widespread across industries. In fact, one in five ‘blue-collar’ workers—those in construction, installation, repair, and transportation—regularly use generative AI on the job. …
The study found that younger, more educated, and higher-income workers are more likely to use AI on the job. Notably, workers with a bachelor’s degree or higher are twice as likely to use AI as those without one (40% vs. 20%). …
The most common uses of AI at work include writing, administrative tasks, and interpreting text or data. In fact, 57% of those using AI at work reported using it to help with writing tasks, and 49% said they used it for searching for information.”
6. People who can express themselves better through writing than speaking usually have these 8 unique traits – Lachlan Brown, Small Business Bonfire

Excerpt:
“For some, the act of speaking can be a daunting task. The pressure to instantly respond, the fear of saying something wrong, it can all be a little overwhelming. But when they write? Magic happens. Thoughts are laid out with precision, emotions captured flawlessly – it’s their preferred communication highway.
People who express themselves better through writing than talking often have unique traits that set them apart. … They’re deep thinkers … They appreciate solitude … They are detailed observers … They are patient … They use writing as a form of therapy … They find comfort in structure … They are avid readers … They value authenticity … For those who find solace in expressing themselves through writing, it’s their unique blend of traits that makes their communication style so powerful.”
7. Perplexity AI: 7 of the Best Everyday Ways to Use This LLM – Micha Sulit, Acer

Excerpt:
“Unlike traditional search engines that simply provide a list of links, this AI tool synthesizes information from multiple sources and offers direct answers along with citations. …
One of the core strengths of Perplexity AI is its ability to engage in conversational interactions. After answering a prompt, the tool automatically suggests follow-up questions so users can refine their queries for more precise results. …
Perplexity AI offers Focus modes to further streamline your search. For example, you can ask it to search exclusively for academic sources, or offer only video results. By combining the functionalities of a search engine with the conversational abilities of a chatbot, Perplexity AI has positioned itself as a powerful and versatile research assistant not only for academics and work, but also for everyday life.”
8. Model Distillation in the API – OpenAI

Excerpt:
“Model distillation involves fine-tuning smaller, cost-efficient models using outputs from more capable models, allowing them to match the performance of advanced models on specific tasks at a much lower cost. Until now, distillation has been a multi-step, error-prone process, which required developers to manually orchestrate multiple operations across disconnected tools, from generating datasets to fine-tuning models and measuring performance improvements. Since distillation is inherently iterative, developers needed to repeatedly run each step, adding significant effort and complexity.”
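The announcement describes the workflow abstractly. As a rough illustration of the dataset half of that workflow, here is a generic sketch of turning teacher-model outputs into a chat-style fine-tuning file. The record format is a common convention, not a claim about OpenAI’s exact API, and the teacher outputs below are hard-coded stand-ins for real model responses:

```python
import json

# Sketch of the dataset step in model distillation: collect (prompt, output)
# pairs from a larger "teacher" model and serialize them in the chat-style
# JSONL format that fine-tuning endpoints typically expect, so a smaller
# "student" model can be tuned to mimic the teacher on this task.

teacher_pairs = [
    ("Summarize: sales rose 10% in Q3.", "Q3 sales grew 10%."),
    ("Classify sentiment: 'Great product!'", "positive"),
]

def to_finetune_record(prompt: str, completion: str) -> str:
    """One JSONL line per training example."""
    return json.dumps({
        "messages": [
            {"role": "user", "content": prompt},
            {"role": "assistant", "content": completion},
        ]
    })

lines = [to_finetune_record(p, c) for p, c in teacher_pairs]
print("\n".join(lines))
```

The point OpenAI is making is that this collect-then-tune loop used to be manual and iterative; their API now captures the teacher outputs for you.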
9. Liquid AI debuts new LFM-based models that seem to outperform most traditional large language models – Mike Wheatley, Silicon Angle

Excerpt:
“Artificial intelligence startup and MIT spinoff Liquid AI Inc. today launched its first set of generative AI models, and they’re notably different from competing models because they’re built on a fundamentally new architecture. The new models are being called “Liquid Foundation Models,” or LFMs, and they’re said to deliver impressive performance that’s on a par with, or even superior to, some of the best large language models available today. …
They use minimal system memory while delivering exceptional computing power, the company explains. They’re grounded in dynamical systems, numerical linear algebra and signal processing. That makes them ideal for handling various types of sequential data, including text, audio, images, video and signals. …
Whereas traditional deep learning models need thousands of neurons to perform computing tasks, LNNs can achieve the same performance with significantly fewer. It does this by combining those neurons with innovative mathematical formulations, enabling it to do much more with less. …
The startup says its LFMs retain this adaptable and efficient capability, which enables them to perform real-time adjustments during inference without the enormous computational overheads associated with traditional LLMs. As a result, they can handle up to 1 million tokens efficiently without any noticeable impact on memory usage.”
10. ChatGPT is changing the way we write. Here’s how – and why it’s a problem – Ritesh Chugh, The Conversation

Excerpt:
“Have you noticed certain words and phrases popping up everywhere lately? Phrases such as ‘delve into’ and ‘navigate the landscape’ seem to feature in everything from social media posts to news articles and academic publications. They may sound fancy, but their overuse can make a text feel monotonous and repetitive. This trend may be linked to the increasing use of generative artificial intelligence (AI) tools such as ChatGPT and other large language models (LLMs). …
Generative AI tools are trained on vast amounts of text from various sources. As such, they tend to favour the most common words and phrases in their outputs. …
The overuse of certain words and phrases leads to writing losing its personal touch. It becomes harder to distinguish between individual voices and perspectives and everything takes on a robotic undertone. …
ChatGPT can be a helpful starting point for writing many different types of text, but editing its outputs remains important. By reviewing and changing certain words and phrases, you can still add your own voice to the output.”
11. Ask questions in new ways with AI in Search – Liz Reid, Google

Excerpt:
“We previewed our video understanding capabilities at I/O, and now you can use Lens to search by taking a video, and asking questions about the moving objects that you see. … Open Lens in the Google app and hold down the shutter button to record while asking your question out loud, like, ‘why are they swimming together?’ Our systems will make sense of the video and your question together to produce an AI Overview, along with helpful resources from across the web. …
We’re also making it easier to shop the world around you with Lens. … starting this week, you’ll now see a dramatically more helpful results page that shows key information about the product you’re looking for, including reviews, price info across retailers and where to buy. … Just take a photo and Lens will bring together our advanced AI models and Google’s Shopping Graph — which has information on more than 45 billion products — to identify the exact item. So you can learn more about whatever catches your eye, and start shopping right in the moment. …
This week, we’re rolling out search results pages organized with AI in the U.S. — beginning with recipes and meal inspiration on mobile. You’ll now see a full-page experience, with relevant results organized just for you. You can easily explore content and perspectives from across the web including articles, videos, forums and more — all in one place. …
We’ve been testing a new design for AI Overviews that adds prominent links to supporting webpages directly within the text of an AI Overview. In our tests, we’ve seen that this improved experience has driven an increase in traffic to supporting websites compared to the previous design, and people are finding it easier to visit sites that interest them. Based on this positive reception, starting today, we’re rolling out this experience globally to all countries where AI Overviews are available.”
12. Go Behind the Scenes of Award-Winning Conversationally Website With the B2C Content Marketer of the Year – Ann Gynn, CMI

Excerpt:
“This story may sound familiar. A small team writes and publishes content for a corporate blog. The company doesn’t know what, if any, impact the content has on visitors. The team just keeps creating and publishing content, and no one knows if, let alone how, it ultimately drives action. …
First-party data reveals opportunities ‘We’re in such a unique situation where (our customers are) literally telling us what they’re saving for, what they’re striving for with their life goals,’ Jim explains. … He and another Ally employee dug into the first-party data to understand what people were spending money on, what they saved for, etc. They overlaid search trends, social conversations, and third-party research to create a lens from which to think about content. …
Audiences think uses, not products A lot of financial sites explain what a mortgage is, but the Conversationally team sees its role differently. ‘People don’t go, ‘Oh man, I can’t wait to get a mortgage,’’ Jim says. ‘They’re like, ‘I’m saving for a home.’ So, Ally sees customers’ goals as savings buckets for homes, weddings, travel, etc. They also work as broad themes for Conversationally. ‘There is a lot of power in personalization and being able to put content closer to our products,’ Jim says. The personalization he refers to isn’t just, for example, content about home ownership targeted to people looking for mortgages. Instead, Ally tailors the content to the journey. … Ally calls this the ‘nurture nature of content’ — Conversationally nurtures its readers to become customers. Someone who reads a piece on the data-driven content hub and visits an Ally product page is two times more likely to convert into a customer than someone who just goes to a product page. …
Potential visitors aren’t just eagerly waiting for a corporate blog update, so the Conversationally team needed a strong distribution plan to get in front of people when they might be thinking about money and major life milestones. Ally went to the places where those audiences were already consuming relevant content by partnering with media brands. …
‘Too often brands are chasing SEO terms that might not align ultimately to what they want, what is beneficial to a consumer, or what they provide as a service,’ Jim says. ‘Our North Star is always to provide personalization whenever we can so it’s relevant to the audience. But, ultimately, it puts somebody on the path to the right product or solution, so it’s mutually beneficial to us, too.’”
13. Improved RAG: More effective Semantic Search with content transformations – Sebastian Gingter, Think Tecture

Excerpt:
“Semantic Search – what is it? There are basically only four steps involved in semantic search. …
Step 1: Preparation You take your documents, and put them through a bit of tooling to prepare the data. For the most cases, this involves splitting large documents up into several smaller chunks, as too large pieces of content tend to cover too many topics. This makes it harder to determine the most relevant piece of information in large documents. Also, you might later want to use a Large Language Model (LLM) to generate an answer based on the found documents, and these models can only process a certain amount of information (they have a maximum size of their so-called context window), so this is an additional reason to limit the size of the chunks. …
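Step 1 is easy to sketch. The chunker below sizes chunks in characters purely for illustration; production pipelines typically count tokens and prefer sentence or paragraph boundaries:

```python
def chunk_text(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into fixed-size chunks; the overlap preserves context
    that would otherwise be cut off at a chunk boundary."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks

doc = "A long internal document... " * 50  # 1,400 characters
pieces = chunk_text(doc)
print(len(pieces), "chunks")
```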
Step 2: Indexing, or embedding creation The smaller chunks of content are transformed into ’embeddings’. To do this, the content is passed to a specialized AI model that is trained to convert the input into an output value, which encodes the semantic meaning of the content into a computer-readable representation. Since a computer only works on numbers, the meaning is represented by several numbers, like an array of positions on several axes, or dimensions. This is called a vector. … Getting all these measures will position this document somewhere ’embedded in the brain’ of the used embedding model. This ‘brain’ is a multidimensional space. Depending on the model you have, there are hundreds if not several thousands of different dimensions. The result of this step is a ‘coordinate’ of that chunk of data ‘within the embedding model’s brain’. This array of floating point values is a mathematical vector, and in our context this vector is called an embedding. …
Step 3: Storage The embedding is stored in a database, together with the ID and also often the complete content of the document it belongs to. … there are specialized databases that are optimized for this kind of search. They are called vector databases, or vector stores. Vector databases build indices based on the embedding values, and they also implement algorithms that can search for similar vectors in the index very efficiently. Depending on the type of database, you can also store additional data (like the full chunk of the content) and also additional metadata (like the title of the document, the author, the date of creation, and so on). …
Step 4: Search The query is passed to the same embedding model, which was used to create the embeddings. The result of this is a vector that represents the meaning of the query. It is very important to use the exact same model … This vector is then passed to the vector database, and the database will calculate the mathematical distance between the query vector and the stored embedding vectors. This distance value, in a mathematical sense, is low, when the vectors point in the same direction (near together, or very similar meaning) and large, when the vectors point into different directions (far away, or a big difference in the meaning). …
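Steps 2 through 4 fit in a few lines if a toy bag-of-words vector stands in for the embedding model; every name and number below is invented for the example, and a real system would call an actual embedding model and a vector database:

```python
import math
from collections import Counter

# Toy stand-in for steps 2-4: count-vector "embeddings", an in-memory
# "vector store", and a search that ranks by cosine distance.

VOCAB = ["invoice", "refund", "shipping", "policy", "contract"]

def embed(text: str) -> list[float]:
    """Map text to a vector of word counts over a tiny fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [float(counts[w]) for w in VOCAB]

def cosine_distance(a: list[float], b: list[float]) -> float:
    """Low when vectors point the same way (similar meaning), high otherwise."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / norm if norm else 1.0

# Step 3: store each embedding alongside the chunk it belongs to.
docs = ["refund policy for damaged goods", "shipping contract terms", "invoice template"]
store = [(embed(d), d) for d in docs]

def search(query: str) -> str:
    # Step 4: embed the query with the SAME model, return the nearest chunk.
    q = embed(query)
    return min(store, key=lambda item: cosine_distance(q, item[0]))[1]

print(search("what is the refund policy"))
```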
In a lot of cases, the full text of the found document and the question of the user will be passed to a Large Language Model (LLM) like GPT to formulate a full response that is then returned to the user. This is called the RAG pattern (Retrieval Augmented Generation). But as said, there is no need for that except to produce a more readable answer. You could also simply display the list of found documents ranked by their similarity (or distance), and have the user look for the piece of information in the document. …
The main problem with that approach is, like pretty much everything in the field of AI, the quality of the input data. … In most businesses, your documents have to do with your business. Not with astronomy, not with animals or plants. They all have to do with your business problems in your field of expertise. Probably all written in a very formal style. By experts in their field. And to not confuse a reader, most authors will try and reduce their vocabulary to your domain-specific language and be very specific. So, naturally, when you put your documents in the huge multidimensional brain of an AI embeddings model, they will probably all be very close to each other. And when you search for a document, you will get a lot of documents that are very similar to the query, but not necessarily the one you are looking for. … The main idea is to create the embeddings from content that is closer to our questions. And what is closer to a question about a specific topic than a document about this topic? … So we need a way to get questions for our documents to create the embeddings for these. Luckily, that is also what language models excel in. Understanding our content, and crafting questions about it that are answered in this specific piece of our document, is exactly what we need. …
If your semantic search results are not as good as you hoped for, it might be because the contents of the documents and the questions asked to retrieve them are not as similar as they could be. In this case, you might want to try and transform your documents into questions instead, and use these transformed documents to create the embeddings for your search. This can make your search results a lot more accurate and specific for your specific use-case.”
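The article’s core transformation can be sketched like this; `generate_questions` is a hard-coded stub standing in for the LLM call Gingter describes, and the chunks and questions are invented for the example:

```python
# Sketch of the content-transformation idea: embed the QUESTIONS a chunk
# answers rather than the chunk itself, so stored vectors land closer to
# the queries users actually type.

def generate_questions(chunk: str) -> list[str]:
    """Stub for an LLM call that reads a chunk and writes questions
    the chunk answers."""
    canned = {
        "Employees accrue 25 vacation days per year.":
            ["How many vacation days do employees get?"],
        "Expense reports are due by the 5th of each month.":
            ["When are expense reports due?"],
    }
    return canned.get(chunk, [])

chunks = [
    "Employees accrue 25 vacation days per year.",
    "Expense reports are due by the 5th of each month.",
]

# Each index entry pairs a question (the text that gets embedded) with the
# original chunk that retrieval should return.
index = []
for chunk in chunks:
    for question in generate_questions(chunk):
        index.append((question, chunk))

for question, source in index:
    print(f"{question} -> {source}")
```

A user query like “how much vacation do I get?” is far more similar to the stored question than to the formal source sentence, which is the whole point of the transformation.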
14. The definitive guide to semantic search engines – Algolia

Excerpt:
“Semantic search is a breakthrough technology that can grasp the true meaning and context of words and phrases typed in user search queries, as opposed to just matching up keywords with corresponding content on web pages. It does this thanks to natural language processing (NLP), machine learning, and other AI techniques. In terms of ecommerce, by semantically analyzing words and phrases, a semantic search engine infers a shopper’s authentic intent and responds with accurate search results. …
To process queries, semantic search engines employ a suite of advanced techniques, including: …
With vector search, text is converted into vectors — numerical representations of data. Then, the K-Nearest Neighbor (KNN) algorithm matches the vectors that represent items most similar in content or meaning to the query. …
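That matching step can be shown with plain Euclidean distance over toy three-dimensional vectors; real engines use approximate-nearest-neighbor indexes over much higher-dimensional embeddings, and the catalog and numbers here are invented:

```python
import math

# k-nearest-neighbor matching over toy 3-dimensional "embeddings" of
# product names; the query vector plays the role of an embedded search.

CATALOG = {
    "wireless headphones": [0.9, 0.1, 0.0],
    "noise-cancelling earbuds": [0.8, 0.25, 0.1],
    "phone charger": [0.0, 0.9, 0.3],
    "headphone case": [0.7, 0.0, 0.4],
}

def knn(query_vec: list[float], k: int = 2) -> list[str]:
    """Return the k catalog items whose vectors lie closest to the query."""
    return sorted(CATALOG, key=lambda name: math.dist(query_vec, CATALOG[name]))[:k]

print(knn([0.85, 0.15, 0.05]))
```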
Entity recognition involves identifying and classifying the key components of a query or text in predefined categories such as people, places, organizations, products, and dates. Entity recognition allows semantic search engines to understand the specific subject of a query …
Contextual understanding refers to ascertaining meaning and intent in a query. It looks at the words used, the shopper’s search history, and trends. …
Semantic search engines use NLP … The process involves breaking sentences down into words and phrases, interpreting the language being used, and generating appropriate natural-language responses. …
Ecommerce sites big and small are leveraging semantic search to improve the online shopping experience. By assimilating the semantics of a search query like ‘wireless headphones with noise cancel’, Amazon.com, for instance, can not just infer specific product features the shopper might want but suggest alternatives, accessories, and bundles that reflect the intent. …
Semantic search’s ability to comprehend the context and intent behind queries means delivering search results that align closely with what shoppers are looking for, potentially increasing conversion rates and satisfaction. …
Semantic search engines go a step further for ecommerce vendors by analyzing customer behavior and preferences to deliver personalized recommendations. … Personalized recommendations in any industry mean shoppers are more likely to find products and services that resonate with them. …
The shift toward semantic search also impacts SEO content creation, as content can be optimized to go beyond keywords to reflect deeper intent and context in queries. Businesses and content creators can align with the ways semantic search engines interpret and prioritize content, developing material that genuinely addresses user needs and questions, which can lead to better visibility and ranking in search results.”
Thanks for checking out the new Hamsterdam! 🐹
Until next time, enjoy the vibes!
Thanks for reading. Happy marketing! 🤗