The Best AEO Agency May Not Be the One With the Best Pitch

How to evaluate AI visibility expertise, separate diagnosis from salesmanship, and choose an agency that earns the right to recommend.

Every agency has updated its website. The language is everywhere: AEO, GEO, AI visibility, generative search optimization, and answer engine strategy. The terminology is new. In many cases, the thinking behind it is not.

That is the problem.

AI and LLM search account for 17% of all search traffic in 2026, and AI platforms are driving approximately 1.13 billion referral visits per month, up 357% from 2024. Meanwhile, 37% of consumers now start their searches in AI tools rather than a traditional search engine. The shift is real and accelerating.

The market’s demand for AI visibility expertise is outpacing the maturation of genuine expertise.

That gap is where bad agency decisions happen.

Over the years, I have noticed a pattern that extends far beyond AI visibility.

The strongest advisors earn the right to recommend.

They spend time understanding before prescribing.

They diagnose before proposing.

They ask questions before presenting answers.

The weakest advisors often do the opposite.

That principle applies whether you are selecting an AI visibility agency, a software platform, a technology vendor, or a strategic consultant.

Many agencies are learning. Mine included. That is not a weakness. In a market evolving this fast, every agency, consultant, platform vendor, and practitioner is learning continuously. Nobody has finished. Organizations that stop learning become obsolete faster than those that never started. Continuous learning is no longer a competitive advantage. It is the minimum requirement for remaining relevant.

The distinction is not between agencies that are learning and agencies that are not. The distinction is between agencies that are transparent about where they are in that journey and agencies that present assumptions as established expertise. One earns trust through honesty. The other borrows it through confidence.

The problem is that buyers often cannot tell the two groups apart from a sales presentation. Both use the same vocabulary. Both have case studies. Both sound certain.

This article will not explain what AEO or GEO is.

It will show you how to evaluate whether an agency has actually earned the right to advise you on AI visibility, before you sign anything.

Diagnosis Before Tactics Is the First Filter

Most buyers walk into an agency evaluation asking the wrong question.

“What will you do for us?”

It feels like the right question. It is not. The better question is: “How did you determine what needs to be done?”

Those are not the same question. One asks for a deliverable list. The other asks for evidence of thinking.

If an agency responds to your first conversation with a recommendation to publish more content, create FAQ pages, add schema markup, or launch a digital PR campaign, pay attention. That agency is prescribing before it has diagnosed anything. It is guessing confidently.

Strong consulting begins with diagnosis. Weak consulting begins with recommendations.

Here is the paradox.

In emerging markets, the most confident advisor is not always the most qualified.

The best AI visibility agency may not be the one with the best pitch.

It may be the one asking the best questions.

Expertise reveals itself through curiosity long before it reveals itself through recommendations.

Here is what that difference looks like in practice:

The best AEO agencies start with evidence, benchmarks, and gap analysis before recommending a strategy.

If I were evaluating an agency, the first thing I would ask for is a documented baseline: where your brand currently appears in AI-generated responses, which engines were tested, which prompts were used, and what the reporting cadence looks like. Not a rough estimate. A structured starting point.

If that baseline does not exist before the proposal, the proposal is not evidence-based.

The quality of recommendations is always limited by the quality of diagnosis.

Over the years, I have evaluated agencies, software platforms, ecommerce systems, implementation partners, and technology vendors across multiple industries and markets. The pattern is remarkably consistent. The strongest partners spend more time understanding the problem than presenting the solution. The weakest partners often do the opposite.

That principle applies equally to medicine, engineering, and digital strategy.

What I Would Expect an Agency to Audit First

Before strategy comes visibility. Before visibility comes diagnosis.

In my experience, the agencies worth trusting arrive at discovery with a structured audit framework rather than a generic questionnaire. Here is what that framework should cover.

1. Current AI Citation Visibility

The agency should be able to answer: where does your brand currently appear in AI-generated responses, across which engines, for which prompts, and how does that compare to your competitors?

Useful benchmarks include citation frequency, prompt coverage, and representation quality across engines like ChatGPT, Perplexity, Claude, and Google AI Overviews. Without this baseline, there is nothing to improve against.

You cannot optimize what you have not measured.
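
To make the idea of a baseline concrete, here is a minimal Python sketch of how citation frequency and prompt coverage might be calculated from a hand-recorded test log. The sample rows, the field names, and the exact metric definitions are my own assumptions for illustration. They are not an industry standard, and a real agency may define these benchmarks differently.

```python
from collections import defaultdict

# Hypothetical hand-recorded observations: one row per (engine, prompt) test.
# "cited" means the brand's own site appeared as a source in the response.
observations = [
    {"engine": "ChatGPT", "prompt": "best crm for startups", "cited": True},
    {"engine": "ChatGPT", "prompt": "crm pricing comparison", "cited": False},
    {"engine": "Perplexity", "prompt": "best crm for startups", "cited": True},
    {"engine": "Perplexity", "prompt": "crm pricing comparison", "cited": True},
    {"engine": "Claude", "prompt": "best crm for startups", "cited": False},
    {"engine": "Claude", "prompt": "crm pricing comparison", "cited": False},
]


def citation_frequency_by_engine(rows):
    """Share of tested prompts, per engine, in which the brand was cited."""
    tested = defaultdict(int)
    cited = defaultdict(int)
    for row in rows:
        tested[row["engine"]] += 1
        if row["cited"]:
            cited[row["engine"]] += 1
    return {engine: cited[engine] / tested[engine] for engine in tested}


def prompt_coverage(rows):
    """Share of distinct prompts for which at least one engine cited the brand."""
    prompts = {row["prompt"] for row in rows}
    covered = {row["prompt"] for row in rows if row["cited"]}
    return len(covered) / len(prompts)


baseline = {
    "citation_frequency": citation_frequency_by_engine(observations),
    "prompt_coverage": prompt_coverage(observations),
}
print(baseline)
```

The point is not the arithmetic. It is that every number traces back to a named engine and a named prompt, so the same test set can be rerun later and compared.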

2. Entity Presence and Validation

AI systems do not just find your content. They form an understanding of who you are based on signals across the web. That understanding is called your entity footprint.

I would expect an agency to assess whether your brand is clearly understood as a distinct entity, whether authority signals are consistent across platforms, and whether trusted third-party sources reinforce the same identity. This includes review platforms, industry directories, press mentions, and knowledge panel data.

Inconsistent or thin entity signals are one of the most common reasons brands are underrepresented in AI-generated responses, even when their content is technically sound. The agency should be able to identify specific gaps: missing or conflicting information across platforms, weak third-party corroboration, or brand descriptions that vary enough to create ambiguity for AI systems.

Entity work is not glamorous. It is often the fastest path to improved citation visibility.

Visibility follows trust. Trust begins with validation.
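
As a rough illustration of what an entity gap check could look like, the Python sketch below compares brand details transcribed from several platforms and flags the fields that disagree or are missing. The platform names, the fields, and the sample brand are hypothetical, and the comparison is deliberately simple: it ignores only case and spacing differences.

```python
# Hypothetical brand profiles as they might be transcribed from different platforms.
profiles = {
    "website": {"name": "Acme Analytics", "category": "Software", "city": "Austin"},
    "review_site": {"name": "Acme Analytics Inc.", "category": "Software", "city": "Austin"},
    "directory": {"name": "Acme Analytics", "category": "Consulting", "city": "Austin"},
}


def normalize(value):
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(value.lower().split())


def find_conflicts(profiles):
    """Return fields whose values differ across platforms or are missing on some."""
    fields = set()
    for profile in profiles.values():
        fields.update(profile)

    conflicts = {}
    for field in sorted(fields):
        values = {platform: profile.get(field) for platform, profile in profiles.items()}
        distinct = {normalize(v) for v in values.values() if v is not None}
        missing = [platform for platform, v in values.items() if v is None]
        if len(distinct) > 1 or missing:
            conflicts[field] = {"values": values, "missing_on": missing}
    return conflicts


conflicts = find_conflicts(profiles)
for field, detail in conflicts.items():
    print(field, detail)
```

In this sample, the name and category fields are flagged while the city is consistent. That is the kind of specific, checkable finding I would expect an entity audit to produce.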

3. Technical Infrastructure

The technical foundations of AI discoverability still include crawlability, indexation, site architecture, structured data implementation, and page performance. These are not legacy concerns. They are the infrastructure layer that determines whether AI systems can reliably interpret and trust your content.

If machines cannot efficiently parse your content, they will not consistently surface it. The goal is to ensure content can be parsed, understood, and trusted by LLMs and AI ranking systems.
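
One small, checkable piece of that infrastructure layer is whether robots.txt blocks AI crawlers at all. The sketch below uses Python's standard `urllib.robotparser` module against an inline sample file. The robots.txt content and the URL are made up, and the crawler names in `AI_CRAWLERS` are an assumed list that should be verified against each vendor's current documentation.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt content; a real check would load the site's own file.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /admin/
"""

# Assumed list of AI-related crawler user agents to test.
AI_CRAWLERS = ["GPTBot", "PerplexityBot", "ClaudeBot"]


def crawler_access(robots_txt, url, user_agents):
    """Map each user agent to whether robots.txt allows it to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in user_agents}


access = crawler_access(ROBOTS_TXT, "https://example.com/pricing", AI_CRAWLERS)
print(access)
```

With this sample file, GPTBot is blocked from the pricing page while the other two crawlers fall through to the general rule and are allowed. A crawl audit covers far more than this, but a check like it is a reasonable first question to ask.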

These same principles apply whenever you are evaluating expertise in an emerging market, whether for technology vendors, software platforms, consultants, or implementation partners. AEO and GEO are the current case study. The underlying lesson is about decision quality in conditions where no one has all the answers yet.

An agency that skips any of these three areas in discovery is not running a real audit. It is running a sales conversation.

Two Proof Points That Separate Real Expertise from Relabeled SEO

Once you have seen how an agency approaches diagnosis, two additional tests will tell you most of what you need to know.

Proof Point 1: Technical SEO Depth

AEO and GEO are not replacements for technical SEO. They are extensions built on top of it. Any agency that presents AI visibility as a separate discipline, disconnected from crawlability, indexation control, structured data, content architecture, and internal linking, does not fully understand how AI systems discover and evaluate content.

Ask them to walk you through a technical SEO issue they identified and resolved for a client. Ask how they think about structured data beyond basic schema types. Ask what they look for in a crawl audit.

If the answers are vague, the AI visibility pitch is fragile.

In my experience, agencies that struggle with technical SEO tend to struggle with AI visibility as well. It is the foundation on which everything else depends.

Proof Point 2: Measurement Maturity

This is where most agencies fall short, and it is the clearest truth test available to buyers.

Only 14% of SEO professionals currently track AI and LLM citation visibility, according to Ahrefs. Yet 43% consider AI optimization a core strategy for 2026, per Search Engine Land. That gap between intent and measurement is where weak agencies hide.

Traditional SEO metrics explain only 4-7% of AI citation behavior. Traffic reports and ranking positions do not tell you whether your brand is being cited, how accurately it is being represented, or which prompts are generating visibility for competitors.

An agency worth considering should be able to define its reporting framework before work begins. That means citation frequency, answer inclusion rate, prompt coverage, and representation quality, not just sessions and impressions.

If they cannot describe what they will measure and how, they cannot prove what they changed.

Five Questions to Ask Before You Sign a Contract

Use these questions in your next agency conversation. The answers matter. How specific those answers are matters even more.

1. Can you show me where my brand currently appears in AI-generated responses?

This reveals whether the agency begins with evidence or assumptions. A strong answer includes a documented baseline across multiple engines, a defined set of prompts, and a methodology for tracking changes over time. A weak answer is a vague promise to “audit your AI presence” with no specifics.

2. How do you measure success beyond traffic?

Traditional SEO metrics alone are no longer sufficient for evaluating AI visibility work. The agency should be able to name specific AI-native KPIs: citation frequency, answer inclusion rate, prompt coverage, representation quality, and hallucination monitoring. If the answer circles back to sessions, rankings, and impressions, the reporting framework has not evolved.

3. What is the difference between being mentioned and being cited?

This question separates surface-level familiarity from genuine understanding. Mentions create awareness. Citations build trust and signal that AI systems are actively using your content as a source. An agency that cannot articulate this distinction clearly is not operating at the level of depth the work requires.
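
For readers who want the distinction in operational terms, here is a minimal Python sketch. It assumes a working definition that is mine, not a standard: a brand is "cited" when its own domain appears among the sources an AI response lists, and "mentioned" when only its name appears in the answer text. The response record, brand name, and domain are hypothetical.

```python
from urllib.parse import urlparse

# Hypothetical record of one AI response: the answer text plus the source URLs it listed.
response = {
    "answer": "Acme Analytics and two other vendors offer usage-based pricing.",
    "sources": [
        "https://www.example-reviews.com/best-analytics-tools",
        "https://acme-analytics.example/pricing",
    ],
}


def classify(response, brand_name, brand_domain):
    """Label a response as 'cited', 'mentioned', or 'absent' for one brand."""
    hosts = [urlparse(url).hostname or "" for url in response["sources"]]
    # Cited: the brand's domain (or a subdomain of it) is among the listed sources.
    cited = any(h == brand_domain or h.endswith("." + brand_domain) for h in hosts)
    # Mentioned: the brand name appears in the answer text, regardless of sources.
    mentioned = brand_name.lower() in response["answer"].lower()
    if cited:
        return "cited"
    if mentioned:
        return "mentioned"
    return "absent"


print(classify(response, "Acme Analytics", "acme-analytics.example"))
```

An agency does not need to use this exact rule. It does need to have a rule it can state this plainly.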

4. What technical issues could prevent AI systems from trusting our content?

This exposes whether the agency understands infrastructure. A strong answer covers structured data gaps, crawlability issues, inconsistent entity signals, thin authority footprints, and content that is not machine-interpretable. A weak answer stays at the surface level of “content quality” and “relevance.”

5. What would you need to audit before recommending tactics?

Experienced advisors diagnose first. This question forces the agency to describe its discovery process. If the answer is a proposal with tactics already outlined, a diagnosis never happened. If the answer is a structured audit covering citation visibility, entity footprint, and technical infrastructure, you are talking to someone who understands the work.

What a Well-Run Engagement Actually Looks Like

Many agencies invert the engagement process. They lead with deliverables, then build a rationale around them afterward. A legitimate AI visibility engagement follows a different sequence.

  1. Discovery: understand the business, competitive landscape, existing digital infrastructure, and where AI visibility sits relative to business goals
  2. Current-state audit: establish baseline citation visibility across engines, map entity footprint, and assess technical health before any strategy is proposed
  3. Gap analysis: identify where visibility is missing, inconsistent, or underperforming relative to competitors, and why
  4. Measurement framework: define KPIs, reporting cadence, engines tracked, and prompt sets before any work begins, not after the first month of activity
  5. Prioritization: sequence opportunities by impact and feasibility, not by what is easiest to execute or fastest to invoice
  6. Execution: implement recommendations against a defined strategy tied to the gap analysis, not a generic checklist recycled from the last client
  7. Reporting: report against the baseline established in step two, not against vanity metrics that have no connection to AI citation behavior
  8. Iteration: refine based on what citation data and visibility tracking actually show, not on assumptions carried over from traditional SEO campaigns

The order matters. Agencies that skip steps two, three, and four are not running a strategy. They are running an activity.
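
The link between the audit in step two and the reporting in step seven can be sketched in a few lines of Python. The metric names come from this article, but the snapshot values below are invented, and expressing change in percentage points is simply one reasonable reporting choice.

```python
# Hypothetical metric snapshots; values are shares between 0 and 1.
baseline = {"citation_frequency": 0.20, "answer_inclusion_rate": 0.35, "prompt_coverage": 0.50}
current = {"citation_frequency": 0.30, "answer_inclusion_rate": 0.35, "prompt_coverage": 0.45}


def change_vs_baseline(baseline, current):
    """Return the percentage-point change for each metric defined in the baseline."""
    report = {}
    for metric, start in baseline.items():
        if metric not in current:
            # A metric that was baselined but is no longer tracked is a reporting gap.
            raise KeyError(f"metric missing from current snapshot: {metric}")
        report[metric] = round((current[metric] - start) * 100, 1)
    return report


report = change_vs_baseline(baseline, current)
for metric, delta in report.items():
    print(f"{metric}: {delta:+.1f} pts")
```

Without the baseline dictionary there is nothing to subtract from, which is the practical reason the audit has to come before execution.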

Execution is not the first step. Understanding is.

When an agency cannot describe this sequence, or reverses it by starting with tactics, that is a structural signal about how the entire engagement will be managed.

Earn the Right to Recommend

The strongest advisors are often less certain in the first conversation.

Not because they know less. Because they are still diagnosing.

The weakest advisors are often the most confident because they feel compelled to prescribe before they fully understand the situation. Early certainty is not always expertise. Sometimes it is a sign that the diagnosis never happened.

The real question is not which AEO or GEO agency to hire. The real question is which agency has earned the right to advise you on your AI visibility strategy.

Those are not the same question.

The companies that succeed in AI search over the next several years will not necessarily be the ones that move first. They will be the ones that choose better advisors, build on a diagnosed foundation, and measure what actually changed.

The question is not whether your agency has all the answers. Nobody does. The question is whether they know how to ask the right questions before proposing solutions.

Before you commit budget to an agency relationship, establish your current state.

Now It’s Your Turn.

  • Do you know where your brand currently appears in AI-generated responses?
  • Can your current or prospective agency show you a citation baseline before proposing tactics?
  • Does their reporting framework go beyond traffic and rankings?
  • Have they described what they need to audit before recommending anything?

If the answers are unclear, that is where to start.

AI visibility will continue to evolve.

New platforms will emerge.

Measurement frameworks will mature.

Best practices will change.

What should not change is the discipline of diagnosis before recommendation.

Whether you ultimately work with my team, another agency, or build capabilities internally, ask the same question:

Has this advisor earned the right to recommend?

Request a current-state AI visibility audit before signing anything, or schedule a free consultation to discuss what a diagnostic-first engagement looks like for your business.


Frequently Asked Questions:

How long does a proper AI visibility audit take before an agency should propose tactics?

A thorough current-state audit typically takes two to four weeks, depending on the size of your digital footprint and the number of engines being tracked. Any agency proposing a full strategy in the first conversation has not run a real audit. Speed in the discovery phase is a warning sign, not an efficiency signal.

Should I expect an agency to audit my competitors as well as my own brand?

Yes. A baseline that only measures your own citation visibility is incomplete. Understanding where competitors appear, for which prompts, and at what frequency is what gives the gap analysis meaning. Without a competitive context, prioritization is guesswork.

What does a good AI visibility case study actually look like?

It shows a before-and-after citation baseline across specific engines and prompt sets, not just a traffic chart. It names the audit areas addressed, the sequence of work, and what changed in citation frequency, answer inclusion rate, or representation quality. If the case study only shows traffic growth or ranking improvements, it is a traditional SEO case study with new language on top.

How should I evaluate an agency’s pricing for AEO or GEO work?

Pricing alone tells you very little. What matters is what is included in the engagement structure. A retainer that begins with a current-state audit, a defined measurement framework, and a gap analysis before any execution is structurally sounder than a cheaper retainer that starts with content production or schema implementation on day one. Pay for the thinking, not just the doing.

What should I do if an agency resists running a current-state audit before proposing work?

Treat the resistance as a structural signal about how the engagement will be managed. If a baseline does not exist before the proposal, the proposal is not evidence-based.

Can an agency deliver AI visibility results without technical SEO depth?

Technically yes. Practically, the claim is fragile. AI systems rely on the same infrastructure signals that technical SEO governs: crawlability, structured data, content architecture, and indexation control. An agency without demonstrated technical SEO depth is building an AI visibility strategy on an untested foundation. Ask for specific examples of technical SEO problems they have identified and resolved, not just AI visibility wins.

How do I evaluate whether an agency’s reporting will actually tell me something useful?

Ask to see a sample report before signing. A useful AI visibility report shows citation frequency by engine, answer inclusion rate across tracked prompts, representation quality, and changes relative to the baseline established at the start of the engagement. If the sample report shows sessions, rankings, and impressions with an AI section added at the bottom, the reporting framework has not evolved beyond traditional SEO measurement.

What is the difference between AI visibility and AI search traffic?

AI visibility measures whether your brand appears in AI-generated responses, how accurately it is represented, and how often it is cited as a source. AI search traffic measures the volume of visits arriving from AI platforms. Visibility is a leading indicator. Traffic is a lagging outcome. An agency focused only on traffic from AI sources is measuring the wrong thing first.

How do I know if an agency is applying a generic playbook or building a strategy specific to my situation?

Ask them to describe two recent clients with different starting points and how their strategies differed. Generic playbook agencies will describe similar approaches under different brand names. Strategy-first agencies will describe meaningfully different priorities based on what the audit revealed. The diagnostic findings should visibly shape the work. If they do not, the audit was performative.

What should I expect from the first 90 days of a well-run AI visibility engagement?

The first 30 days should be almost entirely diagnostic: a current-state audit, an entity footprint mapping, a technical infrastructure review, and a measurement framework definition. Days 31 to 60 should produce a prioritized gap analysis and a strategy tied to specific findings. Execution should not begin in earnest until day 61 at the earliest. Any agency that has produced significant deliverables in the first two weeks has skipped the work that makes those deliverables meaningful.

