The headings describe nothing.
When we review editorial headings, we typically find issues of clarity and uniqueness. For example, headings like "Overview," "Our Approach," "Learn More," and "Get Started" could probably appear on any page of any website.
A heading's job is to tell the reader what the section covers. But when a heading says "Our Approach," it tells the reader nothing (and it tells an AI tool even less). AI tools lean on a page's structure — including its headings — as one signal for working out what a section covers, so a heading like "How a home equity loan works and what it costs" gives them something to work with, while "Overview" doesn't.
What's more, page headings, especially the first-level (H1) headings that usually map to the page title, require more than clarity and context. They also need to be distinct from the headings on every other page of the site. This work clarifies the site for GEO, and it also helps internal site search and a general understanding of the site's structure.
One recent audit we performed flagged more than 40 pages with generic H1s, which means more than 40 pages were missing a unique and powerful point of differentiation. The main title of each page did not communicate what the page actually covered.
This is one of the easiest fixes on the list. It doesn't require new content or a technical overhaul — just a pass through your existing pages with a simple question: does this heading tell someone what they're about to read?
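If you'd rather not make that pass by hand, a short script can do the first sweep. What follows is a minimal sketch, not a finished audit tool: it assumes you already have each page's HTML as a string, and the names `GENERIC_HEADINGS`, `find_generic_headings`, and `find_duplicate_h1s` are ours, invented for illustration. The list of generic headings comes straight from the examples above and will need extending for a real site.

```python
from collections import defaultdict
from html.parser import HTMLParser

# Assumption: a starter list taken from the examples above. Extend it with
# whatever generic headings your own audit turns up.
GENERIC_HEADINGS = {"overview", "our approach", "learn more", "get started"}
HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingCollector(HTMLParser):
    """Collects (tag, text) pairs for every h1-h6 element in a page."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._open_tag = None
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self._open_tag = tag
            self._parts = []

    def handle_data(self, data):
        if self._open_tag is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == self._open_tag:
            # Collapse runs of whitespace so "Learn   More" matches "Learn More".
            text = " ".join("".join(self._parts).split())
            self.headings.append((tag, text))
            self._open_tag = None


def collect_headings(html):
    """Returns every heading in the page as a list of (tag, text) pairs."""
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()
    return collector.headings


def find_generic_headings(html):
    """Returns the headings whose text matches the generic list exactly."""
    return [
        (tag, text)
        for tag, text in collect_headings(html)
        if text.lower() in GENERIC_HEADINGS
    ]


def find_duplicate_h1s(pages):
    """pages maps URL -> HTML. Returns lowercased H1 text used by more than one URL."""
    urls_by_h1 = defaultdict(list)
    for url, html in pages.items():
        for tag, text in collect_headings(html):
            if tag == "h1":
                urls_by_h1[text.lower()].append(url)
    return {text: urls for text, urls in urls_by_h1.items() if len(urls) > 1}
```

Anything this flags is only a candidate. A person still has to decide what the heading should say instead.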
Nobody knows who wrote the articles.
If you hope to be recognized as a thought leader in your industry, you have to earn your audience's trust. That means telling readers who wrote your articles and why that person should be trusted. This is a persistent problem, especially at organizations where content is written by committee, produced by an outside agency, or published under a generic brand voice with no individual attached.
This matters because AI search, at its core, tries to predict trustworthy answers to real questions. A page with a named author, a title, and a few lines establishing their experience on the topic carries a stronger credibility signal than the same content published anonymously. It's the difference between "here's what an expert at this organization thinks" and "here's what a website says."
The fix isn't complicated: put a name on your content and add a short bio. This may require updates to the content model itself (to add an author field, or to create a more complex content type that includes the full author bio) but in short, your editorial team simply needs to make it clear that a real person with relevant knowledge is responsible for what's on the page.
This is especially important for topics where expertise matters — financial guidance, healthcare information, legal explanations, or anything where the reader needs a reason to trust the source.
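One common way to make authorship readable by machines as well as people is schema.org markup. The sketch below is an assumption on our part about how a content model's author field might be exposed, not a required format: it builds an `Article` object with a nested `Person` author using schema.org's `name`, `jobTitle`, and `description` properties. The function name `article_jsonld` and the sample author are hypothetical placeholders.

```python
import json


def article_jsonld(headline, author_name, author_title, author_bio):
    """Builds schema.org Article markup with a named Person as the author."""
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {
            "@type": "Person",
            "name": author_name,
            "jobTitle": author_title,
            # A few lines establishing the author's experience on the topic.
            "description": author_bio,
        },
    }
    return json.dumps(data, indent=2)


# Placeholder values for illustration only; swap in your real author data.
print(
    article_jsonld(
        headline="How a home equity loan works and what it costs",
        author_name="Jane Example",
        author_title="Senior Loan Officer",
        author_bio="Jane has spent years helping borrowers compare home equity options.",
    )
)
```

The resulting JSON would typically sit inside a `<script type="application/ld+json">` element on the page, alongside a visible byline and bio that say the same thing.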
The content doesn't answer anyone's actual question.
The old content strategist in me slowly pumps his fist with this one.
After years of pleading with the world as a whole to begin thinking about the why behind each and every page — to spend time critically assessing why a page exists and whether it's answering the questions a site user might be asking — GEO swings in and whispers into our ears: "I only want content that answers a real question."
It makes sense: answer engines aren't finding things like a traditional search engine might. They're answering questions.
A lot of the pages we audit are full of positioning language: "We're committed to providing innovative solutions for our members," or "Our team delivers best-in-class service." These sentences fill space; they are empty calories posing as LinkedIn posts. Someone searching for "how does a home equity line of credit work" doesn't need to know that you're committed to innovation. They need to know how a HELOC works, what it costs, and whether it's a good fit for their situation.
Google's GEO guidance highlights a concept they call "non-commodity content" — writing that provides real insight, specific information, and a point of view that couldn't come from just anyone. The generic "7 Tips for First-Time Homebuyers" article that exists on ten thousand websites is commodity content. A piece that walks through a specific decision your team helped a real borrower make, with real numbers and a real outcome, is not.
Frustratingly for us content strategists, we find that a site's most important pages (the ones that describe mainline products or areas of expertise) are often the least specific. These are pages that passed through dozens of hands until every sharp edge was sanded into oblivion. The content has been sanitized, and as a result it no longer answers any real questions.
This is an editorial concern, and it takes time. It means going back to your most important pages and figuring out the purpose of each section, each paragraph, and each definition. That's way harder than adding schema, fixing a heading, or slapping an FAQ on the bottom of the page, but it's also where the biggest gains are.
The site is accidentally blocking AI search crawlers.
When concerns about AI training data picked up over the past couple of years, a lot of organizations added blanket blocks in their robots.txt files to keep AI crawlers out.
This is okay! It made sense at the time — we really didn't know what to expect, and erred on the side of security and protecting our assets.
Fortunately, the crawlers that train AI models and the crawlers that power AI search are different bots — and the major platforms now document them separately. OpenAI runs three: GPTBot for model training, OAI-SearchBot for indexing the content that surfaces in ChatGPT's search results, and ChatGPT-User for fetching a page live when someone asks a question. Anthropic mirrors this with ClaudeBot for training, Claude-SearchBot for search indexing, and Claude-User for live, user-triggered fetches. Perplexity keeps it to two: PerplexityBot for indexing and Perplexity-User for live retrieval.
If you want to block the training crawlers, you can do that. But block the search crawlers along with them, and you've cut your content out of those platforms' AI answers entirely. The platform might still show a bare link or your page title, but the content itself won't be there to quote or summarize.
The fix is relatively easy: check your robots.txt file and make sure you're blocking only what you intend to block. For many organizations, this is the single highest-impact change they can make, because everything else on this list is irrelevant if the crawlers can't reach your pages in the first place.
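Here is one way to run that check, sketched with Python's standard-library `urllib.robotparser`. The user-agent names are the ones listed above; the sample robots.txt, the `AI_CRAWLERS` table, and the `audit_robots` function are ours, made up for illustration. Treat the result as a rough guide: each crawler applies its own matching rules, and the standard-library parser is only an approximation of them.

```python
from urllib.robotparser import RobotFileParser

# The crawlers named above, grouped by what each one is used for.
AI_CRAWLERS = {
    "GPTBot": "training",
    "OAI-SearchBot": "search indexing",
    "ChatGPT-User": "live fetch",
    "ClaudeBot": "training",
    "Claude-SearchBot": "search indexing",
    "Claude-User": "live fetch",
    "PerplexityBot": "search indexing",
    "Perplexity-User": "live fetch",
}

# Hypothetical robots.txt that blocks the training crawlers only.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: *
Allow: /
"""


def audit_robots(robots_txt, url="https://example.com/"):
    """Returns {crawler name: True if this robots.txt lets it fetch url}."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, url) for bot in AI_CRAWLERS}


if __name__ == "__main__":
    for bot, allowed in audit_robots(SAMPLE_ROBOTS_TXT).items():
        status = "allowed" if allowed else "BLOCKED"
        print(f"{bot:18} {AI_CRAWLERS[bot]:16} {status}")
```

To check a live site instead of a pasted string, `RobotFileParser` can load the file itself: call `set_url()` with the address of your robots.txt, then `read()`. Any search or live-fetch crawler that shows up as blocked is worth a second look.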
The common thread.
There's a common thread here, obviously: we are all looking for clarity from people we trust. Schema and headings and specificity and clear authorship are the basics of usable and trustworthy content. These things have always mattered for search, and they continue to matter in search's next phase. In fact, they probably matter even more.
We're no longer asking for a search engine to point us toward the cake mixes. We're asking a search engine to go grab the cake mix for us. Which means it's even more important to understand which cake mixes are on the shelf in the first place.
Putting that delicious metaphor aside, things are clearer now about what good content looks like: structured, authored, and clear.
If you'd like to run through a checklist yourself, you are in luck. We have put together two self-audit checklists — one for editorial teams and one for technical teams — that cover these findings and more. They're free, they're practical, and they'll give you a solid starting point for understanding where your site stands.
Good luck. It all boils down to one thing: start with content worth trusting, and the machines will trust it too.