The Discovery Problem Is Bigger Than Search
“At a recent Identity Salon meeting, I found myself stuck on the word ‘discovery.’”
It came up when talking about accounts. It came up again when talking about AI. It came up yet again when talking about information.
That’s a lot of discovery.
When I lived more fully in the world of research and education federation, discovery usually meant helping an end user find their identity provider. The user wanted access to a service, the service needed to know where to send them to authenticate, and the discovery process helped bridge that gap. In scholarly publishing, discovery often means something else: helping the reader get to the content they are looking for. Same word. Different problem.
Or at least, that is how I had been treating it.
The more I sat with it, the more I realized I had been thinking about discovery inside localized problem spaces. Federation discovery. Content discovery. Account discovery. Credential discovery. Agent discovery. Each area has its own vocabulary, assumptions, tooling, and governance model. That makes sense. People solve the problem in front of them. Of course they do. The right answer depends on what you are trying to find.
Mostly.
The part I needed to remember is that this is not just about me, or the groups I work with, or the systems we try to make easier to find, navigate, or trust. Discovery is a much broader problem. People are trying to find and track information scattered across accounts, platforms, protocols, archives, feeds, wallets, inboxes, and systems they may not even remember using. The localized problems are real. They are also pieces of a much larger pattern.
The Internet has a discovery problem.
Not because search is broken. Search still works quite well for many things. If the information is public, indexed, reasonably well-described, and close enough to the words you know to use, search remains astonishingly useful. I am not here to complain that search engines no longer work because they sometimes insist on showing me twelve product roundups, three Reddit threads, and a suspiciously cheerful AI summary before I get to the thing I actually wanted. Though, for the record, I am not delighted by that either.
The problem is that search is only one form of discovery. And many of the things we now need to discover do not behave like web pages.
They may be private accounts. They may be credentials in wallets. They may be identity providers. They may be claims. They may be institutional relationships. They may be software endpoints. They may be AI agents. They may be pieces of information buried in Slack, Discord, Teams, a customer portal, a password manager, a cloud drive, an email archive, a browser profile, or some system you used exactly once in 2018 and have not thought about since.
Search helps when the thing can be searched.
That is a much narrower condition than we sometimes admit.
More information is not the same as better discovery
One reason discovery feels harder is obvious: there is simply too much information.
That statement is so familiar that it risks becoming meaningless. “Information overload” sounds like one of those problems everyone recognizes and then promptly ignores because the answer is supposed to be something like “manage your inbox better” or “turn off notifications.” Fine advice, as far as it goes. Not exactly a structural solution.
A recent systematic review of strategies for managing information overload defines the problem as cognitive strain caused when the amount of information exceeds a person’s ability to process it. The review found that responses to overload fall into several categories: personal strategies such as filtering and avoidance, organizational and technical solutions such as dashboards and recommender systems, educational approaches such as information literacy, and communication changes such as simplifying what gets shared and how. The point is not that one of these solves the problem. The point is that overload is both a human problem and a systems problem.
That matters for discovery because we often talk as if finding information is only a matter of giving people better tools. Better search. Better filters. Better tags. Better folders. Better bookmarks. Better AI. Really, just be organized and all your discovery problems will go away, right?
Finding the match is not enough
All of those can help. None of them changes the basic problem: more information creates more burden unless the system also helps people understand what exists, what matters, what is current, what is authoritative, and what they can safely ignore.
This is where the discovery problem starts to separate itself from the search problem.
Search asks, “Can I find something that matches this query?”
Discovery asks a more complicated set of questions. What exists? Where is it? Who controls it? Is it the right thing? Is it still valid? Is it trustworthy? Why was this result shown instead of another? What is missing? What should remain hidden?
A good discovery system is not always one that makes everything easy to find. Sometimes the right answer is that something should be findable only by the right party, under the right conditions, for the right purpose. That is already true in identity systems. It is also true in personal data, enterprise systems, digital estate planning, and AI agent ecosystems. Discovery isn’t always a purely technical challenge.
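To make "the right party, under the right conditions, for the right purpose" a little more concrete, here is a minimal Python sketch. Everything in it is hypothetical: the `Record` and `Request` types, their field names, and the idea of reducing "conditions" to a single authentication flag are illustrations only, not a description of how any real identity system works.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """A discoverable thing plus the rules for who may find it (hypothetical)."""
    name: str
    allowed_parties: frozenset[str]
    allowed_purposes: frozenset[str]
    requires_authenticated: bool = True


@dataclass(frozen=True)
class Request:
    """Who is asking, why, and under what conditions (hypothetical)."""
    party: str
    purpose: str
    authenticated: bool


def is_discoverable(record: Record, request: Request) -> bool:
    """Return True only if party, conditions, and purpose all line up."""
    right_party = request.party in record.allowed_parties
    # "Conditions" is simplified here to one flag; real systems check far more.
    right_conditions = request.authenticated or not record.requires_authenticated
    right_purpose = request.purpose in record.allowed_purposes
    return right_party and right_conditions and right_purpose


def discover(records: list[Record], request: Request) -> list[str]:
    """List only the records this request is allowed to know exist."""
    return [r.name for r in records if is_discoverable(r, request)]
```

The detail worth noticing is that `discover` filters before it returns anything. A request that fails the check does not get a "permission denied" for the record; it simply never learns the record exists.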
But before getting there, it is worth sitting with the more basic issue: people are trying to manage more information than their own habits can reasonably support.
Saving everything just moves the problem
One of the more relatable responses to information overload is to save everything.
Bookmark it. Download it. Put it in a folder. Send it to yourself. Leave the browser tab open (a guaranteed way to never find the information again). Save the PDF. Subscribe to the newsletter. Follow the account. Preserve the trail.
I say this with great affection for my own “I might need this someday” systems, several of which are best understood as private archaeological sites.
Saving information can be useful and necessary. It can also be a way of relocating the discovery problem from the public Internet into your own badly indexed attic.
Research on information hoarding is useful here because it complicates the comforting idea that saving more is always better. One recent study describes information hoarding as continuously saving and accumulating digital information while having difficulty deleting it. The authors connect this behavior to information overload, selective exposure, and identity bubble reinforcement, using an information ecology model that treats information, the information environment, and the person processing the information as interacting parts of the same system.
That is a useful reminder. The problem is not only that there is too much information “out there.” There is also too much information “in here”: in our saved folders, inboxes, notes apps, cloud drives, password managers, and mental lists of things we are absolutely going to read “someday.”
The information paradox
The same study found that information hoarding was positively related to selective exposure, with information overload acting as one mediator. In plainer terms, accumulating more information may not make us more open or better informed. It may increase the burden of filtering and make us more likely to return to the familiar, the confirmatory, and the already-saved.
That should make us pause.
A personal archive is not the same as a discovery system. A password manager is not the same as a complete account inventory. A bookmarks folder is not the same as knowledge management. A giant pile of saved PDFs is not the same as a research library, no matter how much I would personally like that to be true.
Saving can preserve information. It does not necessarily make that information findable, usable, or trustworthy later.
Filtering is necessary. Filtering is also dangerous.
When people face too much information, they filter. Of course they do. Filtering is not a moral failure. It is a survival mechanism.
The question is what kind of filtering happens and who controls it.
Some filtering is intentional. We choose sources we trust. We unsubscribe. We ignore some topics. We skim. We rely on experts. We use search terms, alerts, folders, feeds, and saved queries.
Some filtering is infrastructural. Platforms rank what we see. Search engines decide what appears first. Recommendation systems infer what might keep us engaged. Paywalls determine what we can access. Private communities move knowledge out of the indexable web. Enterprise systems hide information behind permissions. Standards groups, vendors, governments, publishers, and platforms all create different visibility rules. None of that filtering is neutral, even when it is useful.
And some filtering is psychological. Under overload, people naturally reduce effort. They rely on familiar sources, familiar frames, familiar communities, and familiar interpretations. The information hoarding study explicitly connects overload with selective exposure and discusses how individuals may process information in ways that reduce complexity, including focusing on information that aligns with their interests, needs, or cognitive bias.
Governance is not a naughty word, but it is a complicated one
This is where “just filter better” becomes a weak answer.
Filtering is unavoidable. But filtering is also where discovery starts to become governance.
Who decides what is visible? Who decides what is recommended? Who decides what is hidden? Who benefits from the ranking? Who is excluded from the index? Who can opt out? Who can correct errors? Who can tell whether the result was selected because it was authoritative, popular, sponsored, personalized, recent, or merely good at playing the ranking game?
Those questions are not new. Librarians, archivists, publishers, search engineers, standards people, and federation operators have been dealing with versions of them for years, sometimes centuries. (Sometimes it just feels like centuries.)
What feels different now is how many domains are being forced to deal with them at once.
AI feels like an answer because discovery is exhausting
This is where generative AI enters the story.
People are not turning to AI intermediaries only because they are dazzled by the technology. Some are, certainly. There is no shortage of novelty-chasing in this industry, and apparently we must all spend some portion of our lives pretending every chatbot is either the end of civilization or the start of a productivity utopia.
But there is a more practical reason people reach for AI: it reduces the apparent effort of discovery.
Instead of constructing the right search query, opening ten tabs, skimming pages, comparing sources, sorting dates, checking provenance, and deciding what to trust, a user can ask a question in plain language and get a synthesized answer. That is deeply appealing. It turns discovery into conversation.
The appeal becomes even stronger under information overload. A recent study of generative AI chatbots in e-commerce looked at how human-like features affect users’ willingness to adopt GenAI as a decision aid. The study is specific to online shopping, so I would not treat it as a universal model for discovery. But it is useful here because it points to a broader pattern: when users face complex search and evaluation tasks, information overload becomes a significant factor in decision-making, and people may become more willing to rely on algorithmic evaluation as task difficulty increases.
“Helpful” discovery
The same study uses the elaboration likelihood model, distinguishing between central cues that require more careful evaluation and peripheral cues that require less cognitive effort. In overloaded conditions, people may rely more on simpler cues when making judgments. The authors discuss how anthropomorphic features such as warmth, empathy, and perceived competence can influence user confidence and willingness to adopt AI chatbot recommendations as a decision aid.
That is useful, and a little unsettling. We used to rely on librarians, editors, indexers, and other human intermediaries for parts of this work. They brought professional norms with them. AI intermediaries bring a different set of incentives, failure modes, and accountability gaps.
Basically, the more human and helpful an AI intermediary feels, the easier it may be to confuse a reduced-effort interaction with a reliable discovery process. Those are not the same thing.
An AI assistant can help summarize, translate, compare, and brainstorm. It can also hallucinate, omit sources, flatten disagreement, present stale information with confidence, or hide the path between source and answer. A conversational interface can make discovery feel easier without making the result more trustworthy.
That does not mean AI has no role in discovery. It almost certainly does. But if AI becomes an intermediary between people and information, then provenance, source quality, recency, ranking, uncertainty, and accountability matter even more.
A confident answer is not the same as a findable source.
A fluent summary is not the same as a trustworthy discovery mechanism.
Discovery is not just retrieval
This is the foundation for the rest of the series: discovery is not solely the act of finding something.
Discovery is the process by which people and systems determine what exists, where it lives, whether it is relevant, whether it is trustworthy, whether it is available to the requester, and whether it should be visible in the first place.
That definition is intentionally broader than search.
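One way to see the gap between the two is to write them side by side. The sketch below is an illustration under heavy assumptions: the `Candidate` type and its boolean fields are invented for this example, and in practice each of those answers would come from a separate system, policy, or person rather than a stored flag.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Something that might be discovered (all fields are hypothetical)."""
    name: str
    location: str | None  # where it lives, if anyone knows
    relevant: bool
    trustworthy: bool
    available_to_requester: bool
    should_be_visible: bool


def search(candidates: list[Candidate], query: str) -> list[str]:
    """Search: does anything match the query text?"""
    return [c.name for c in candidates if query.lower() in c.name.lower()]


def unresolved_questions(c: Candidate) -> list[str]:
    """Discovery: which parts of the definition does this candidate fail?"""
    checks = {
        "where it lives": c.location is not None,
        "whether it is relevant": c.relevant,
        "whether it is trustworthy": c.trustworthy,
        "whether it is available to the requester": c.available_to_requester,
        "whether it should be visible": c.should_be_visible,
    }
    return [question for question, ok in checks.items() if not ok]


def discover(candidates: list[Candidate], query: str) -> list[str]:
    """Keep only matches with no unresolved questions."""
    matches = [c for c in candidates if query.lower() in c.name.lower()]
    return [c.name for c in matches if not unresolved_questions(c)]
```

In this toy version, `search` is the first line of `discover` and nothing more. Everything after the match is where the trust, availability, and visibility questions live.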
Search is one discovery pattern. So are directories, registries, federation metadata, wallet-mediated matching, browser prompts, recommender systems, trust lists, DNS lookups, enterprise catalogs, account recovery workflows, and private inventories. Each pattern has different assumptions. Each creates different failure modes. Each shifts power in different ways.
This is why the word “discovery” keeps resurfacing in digital identity.
It is an architecture problem. It is a trust problem. It is a user experience problem. It is a governance problem. And, because the universe has a sense of humor, it is often all of those at the same time.
The question underneath the question
When someone says “we need discovery,” the first response should not be “what technology should we use?”
The first response should be: “What kind of discovery do we mean?”
Are we trying to find public information? A private account? A login path? A credential? A claim? A wallet? An endpoint? An agent? A legal authority? A source of truth? A thing the user forgot? A thing the user never knew existed? A thing that should be discoverable to one party but invisible to another?
Those are different problems.
Treating them as one problem leads to bad design. It also leads to a very familiar industry reflex: taking infrastructure that already exists and asking it to carry more meaning than it was designed to hold.
We will get to that. Because yes, this is where people start saying, “Couldn’t we just use DNS?”
I understand the appeal. DNS is ubiquitous, delegated, operationally familiar, and already part of the Internet’s nervous system. It is also already doing quite a lot, and using it as the default answer to every discovery problem risks overloading a system that is robust in some ways and fragile in others.
But before arguing about infrastructure, we need to understand the problem space.
That starts with identity discovery.
Because once the thing being discovered is tied to identity, the question is no longer just “can I find it?” It becomes “who is allowed to find it, under what authority, with what proof, and with what consequences?”
That is where the next post begins.
Transcript
Welcome back to another discussion on digital identity, trust, and the systems that shape how we interact online.
Recently, during an Identity Salon conversation, one word kept appearing across very different topics:
Discovery.
It surfaced in discussions about:
- Accounts
- Artificial intelligence
- Information management
- Identity systems
- Trust relationships
At first, that seemed unremarkable. However, the more I thought about it, the more I realized that discovery is far broader than any single technology problem.
In fact, discovery may be one of the defining challenges of our digital world.
Discovery Means Different Things to Different Communities
When I worked more closely with research and education federations, discovery usually referred to helping users find their identity provider.
The process looked something like this:
- A user wants access to a service
- The service needs authentication
- The user must find the correct identity provider
- Discovery bridges that gap
In scholarly publishing, however, the same word means something entirely different.
There, discovery focuses on helping readers find:
- Research articles
- Publications
- Authors
- Relevant content
Same word.
Different problem.
Or so it seems.
A Larger Pattern Emerges
The more I considered these examples, the more I realized that we often think about discovery within isolated problem spaces.
We talk about:
- Federation discovery
- Content discovery
- Account discovery
- Credential discovery
- Agent discovery
Each area develops its own:
- Vocabulary
- Assumptions
- Governance models
- Technical solutions
That makes sense because people naturally solve the problems directly in front of them.
However, all of these examples are part of a much larger pattern.
The Internet Is a Discovery Problem
At its core, the modern internet is a giant discovery challenge.
People are constantly trying to locate information scattered across:
- Websites
- Applications
- Accounts
- Platforms
- Wallets
- Inboxes
- Identity providers
- Services they may have forgotten they even use
This isn’t simply a search problem.
It’s a discovery problem.
And those are not the same thing.
Search Still Works
To be clear, search is not broken.
Search engines remain remarkably effective when information is:
- Public
- Indexed
- Well described
- Close to the language users know to search for
For many situations, search continues to work extremely well.
The challenge is that many of the things we now need to discover do not behave like web pages.
They may be:
- Private accounts
- Digital credentials
- Wallets
- Institutional relationships
- Software endpoints
- AI agents
- Enterprise resources
- Information buried inside collaboration tools
Search can only help when the thing you’re looking for is searchable.
That turns out to be a surprisingly narrow condition.
Information Overload Changes Everything
One obvious reason discovery feels harder today is the sheer volume of information.
We often describe this as information overload.
However, that phrase has become so common that it almost loses meaning.
Research generally defines information overload as:
Cognitive strain caused when the amount of information exceeds a person’s ability to process it.
Importantly, researchers have identified several responses to overload:
- Personal filtering strategies
- Organizational solutions
- Technical tools
- Educational approaches
- Communication improvements
No single approach solves the problem entirely.
Because information overload is both:
- A human problem
- A systems problem
And that distinction matters.
Better Tools Are Not Enough
We often assume that discovery improves if we simply provide better tools.
For example:
- Better search
- Better filters
- Better folders
- Better tagging
- Better recommendations
- Better AI
These tools certainly help.
However, they do not solve the underlying challenge.
Discovery also requires helping people understand:
- What exists
- What matters
- What is current
- What is authoritative
- What can safely be ignored
This is where discovery becomes something much larger than search.
Search and Discovery Are Not the Same
Search asks a relatively simple question:
Can I find something that matches this query?
Discovery asks a much broader set of questions:
- What exists?
- Where is it?
- Who controls it?
- Is it trustworthy?
- Is it still valid?
- Why was this shown?
- What is missing?
- What should remain hidden?
These questions introduce entirely different concerns.
And some of those concerns are not technical at all.
Sometimes Information Should Not Be Easy to Find
A good discovery system is not always one that exposes everything.
Sometimes the correct outcome is that information remains visible only:
- To the right person
- Under the right conditions
- For the right purpose
This principle already applies to:
- Identity systems
- Enterprise environments
- Personal data
- Digital estate planning
- AI ecosystems
As a result, discovery quickly becomes a governance issue as much as a technical one.
The Information Hoarding Problem
One common response to overload is simple:
Save everything.
People:
- Bookmark pages
- Download files
- Create folders
- Leave browser tabs open
- Save notes for later
Most of us have some version of this strategy.
Unfortunately, saving information often relocates the discovery problem rather than solving it.
Instead of losing information on the internet, we lose it in our own digital archives.
Saving More Does Not Mean Understanding More
Research on information hoarding offers an important insight.
Saving information is not the same as understanding it.
Studies have connected information hoarding to:
- Information overload
- Selective exposure
- Cognitive filtering
- Reinforcement of existing perspectives
In other words, accumulating more information can actually increase the burden of managing it.
The result is often predictable.
People return to:
- Familiar sources
- Familiar ideas
- Familiar interpretations
Not necessarily because those sources are best, but because they are easiest to find again.
Filtering Is Inevitable
When people encounter too much information, they filter.
That is not a flaw.
It is a survival mechanism.
Some filtering is intentional:
- Choosing trusted sources
- Ignoring certain topics
- Unsubscribing from content
Other filtering is built into infrastructure:
- Search rankings
- Recommendation engines
- Permissions systems
- Platform algorithms
And some filtering is psychological.
People naturally simplify complex information environments by relying on what feels familiar.
Discovery Becomes a Governance Question
Once filtering enters the picture, discovery becomes a governance challenge.
Questions suddenly emerge:
- Who decides what is visible?
- Who determines ranking?
- What gets hidden?
- What gets promoted?
- Who benefits from those decisions?
These questions have existed for a long time.
Libraries, archives, publishers, and search systems have always dealt with them.
Today, however, far more domains are confronting these challenges simultaneously.
Why AI Is Becoming Part of Discovery
This brings us to generative AI.
People are not embracing AI solely because it is new.
Many are adopting it because it reduces the effort required to discover information.
Instead of:
- Constructing search queries
- Opening multiple tabs
- Comparing sources
Users can simply ask a question and receive a synthesized response.
That convenience is incredibly appealing.
Especially during periods of information overload.
The Risk of Conversational Discovery
Research suggests that when tasks become more difficult, people become more willing to rely on algorithmic assistance.
That should not surprise anyone.
When work becomes harder, tools that reduce effort become more attractive.
However, there is a tradeoff.
A conversational interface can make discovery feel easier without necessarily making it more trustworthy.
AI systems can:
- Summarize information
- Compare sources
- Translate content
- Brainstorm ideas
But they can also:
- Hallucinate
- Omit sources
- Present outdated information
- Flatten disagreement
- Hide how conclusions were reached
As a result, convenience and trustworthiness are not always aligned.
Discovery Is More Than Finding Something
This realization forms the foundation for everything that follows.
Discovery is not simply the act of finding information.
Discovery is the process by which people and systems determine:
- What exists
- Where it exists
- Whether it is relevant
- Whether it is trustworthy
- Whether it is available
- Whether it should be visible at all
That definition extends far beyond search.
Discovery Appears Everywhere
Search is only one discovery pattern.
Others include:
- Directories
- Registries
- Federation metadata
- Wallet-mediated matching
- Browser prompts
- Recommendation systems
- DNS lookups
Each approach:
- Makes different assumptions
- Creates different risks
- Shifts power in different ways
And each introduces unique governance questions.
Why Discovery Keeps Returning in Digital Identity
Discovery repeatedly appears in digital identity discussions because it touches so many dimensions at once.
It is:
- An architecture problem
- A trust problem
- A user experience problem
- A governance problem
And frequently all four at the same time.
That complexity explains why the topic continues to resurface across identity conversations.
Looking Ahead
When someone says, “We need discovery,” the first question should not be:
What technology should we use?
Instead, ask:
What kind of discovery problem are we trying to solve?
Are we trying to find:
- Public information?
- A private account?
- A login path?
- A credential?
- A wallet?
- An AI agent?
- A source of truth?
These are fundamentally different challenges.
Treating them as the same problem often leads to poor design decisions.
Final Thoughts
Discovery is far bigger than search.
It influences how people locate information, establish trust, manage identities, navigate systems, and make decisions.
As digital ecosystems become more complex, discovery will become increasingly important, not only as a technical capability but as a question of governance, authority, visibility, and trust.
Understanding the problem correctly is the first step.
Because once identity enters the picture, discovery is no longer simply about finding something.
It becomes about determining who is allowed to find it, under what authority, and with what consequences.
Conclusion
This discussion lays the foundation for a broader exploration of discovery in digital identity.
The next step is identity discovery itself.
Because once the object being discovered is tied to identity, entirely new questions emerge about trust, permission, authority, and control.
And that is where the conversation becomes even more interesting.
