Enterprise Search has always been a hard problem – one that lacks any “silver bullet” solution. A decade or more ago, in the trenches at my old company – migrating from the Google Search Appliance to SharePoint 2010 with FAST Search, and then to SharePoint 2013’s integrated search – I learned that effective search is less about magic algorithms and more about doing the unglamorous work: curating high-quality content, tuning relevance, and continuously aligning results with what users truly need. In fact, despite all the advances in technology, enterprise search satisfaction remains stubbornly low, and organizations still underestimate what it takes to deliver a great search experience. The central lesson from my journey is simple: there are no shortcuts – good enterprise search requires deliberate curation of authoritative content and de-prioritization of the irrelevant or noisy. Let’s explore why this is so, through both personal anecdotes and the broader evolution of the field.

[Header illustration: a large magnifying glass pointing from a search interface toward a silver bullet, with abstract clouds and dots on a beige background.]

From Google Search Appliance to FAST: More Power, More Complexity

Back in the day, the Google Search Appliance (GSA) promised a plug-and-play enterprise search – essentially Google-in-a-box for your intranet. It indexed content across your organization and gave us a familiar search experience with minimal configuration. But as our content grew and requirements expanded, cracks began to show. We needed finer control over relevance, custom metadata handling, and deeper security integration than the GSA’s out-of-the-box setup could easily provide. This led us to migrate to SharePoint 2010 with FAST Search, a move that felt like trading a convenient black box for a high-performance race car that required a skilled driver and pit crew.

FAST Search for SharePoint 2010 was a powerful engine – one that offered capabilities far beyond the basic SharePoint search at the time. FAST could index content from a variety of sources and supported “early-binding” security trimming (processing access controls at index time) for faster, security-trimmed results. It let us customize ranking algorithms, define our own relevance rules, and leverage metadata in ways the GSA couldn’t. FAST’s advanced refiners and document previews were a revelation – features that, in 2010, were only available with this specialized (and expensive) add-on. In short, we gained the extensibility and sophistication we were seeking. However, we also inherited a great deal of complexity. Deploying FAST was no small feat: it required a separate server farm, careful configuration of connectors, and mastery of its pipeline architecture, which in some cases meant writing custom pipeline extensions in C#. We spent countless hours mapping “crawled” properties to “managed” properties (to make metadata searchable), tuning the index schema, and debugging crawl issues. We discovered that simply switching to a more powerful search platform didn’t magically improve relevance – we had to roll up our sleeves and tune it for our content and users.

By the time we upgraded to SharePoint 2013, Microsoft had folded FAST’s capabilities into the core SharePoint Search. This simplified our infrastructure (no more separate FAST farm) and made advanced search features available to all SharePoint Enterprise users. We hoped this integration would finally make life easy: one unified search service with the best of both worlds. Indeed, the 2013 search brought improvements – the query engine was more advanced, and features like result refiners were now built-in. But even with “FAST goodness baked in,” we found that search excellence still didn’t come out-of-the-box. We had to re-apply many of the lessons we’d learned: carefully configuring content sources, re-tuning relevance rankings, and continuously monitoring search analytics to see where results fell short. The technology was improving, but the heavy lifting behind the scenes remained our responsibility.

The Security Tightrope: Misconfiguration Mishaps

One particularly hair-raising lesson we learned was that if you get security wrong in enterprise search, the consequences can be severe. Early on, we suffered from an embarrassing misconfiguration related to how the search index handled file share permissions. We had indexed content from a network file share, and we expected our search crawler to respect the file share’s SMB permissions so that users only saw results they were allowed to access. Unfortunately, the crawler relied solely on NTFS file system permissions and ignored the stricter SMB share permissions. The result? For a brief window, a user could discover documents in search that they couldn’t actually open (and shouldn’t have even known existed). Thankfully, we caught and rectified this before the search rollout reached the broader company, but it was nonetheless a face-palm moment that underscored how unforgiving the security aspect of search can be.

This incident taught us that search security is truly a tightrope. In enterprise search, every document must be accompanied by the correct security metadata (ACLs, group permissions, etc.) so that search results are security-trimmed – i.e., you only see what you have rights to see. The complexity is immense: different repositories (file shares, SharePoint sites, databases) have different security models, and the search engine’s connectors have to faithfully mirror each of them. If you take a shortcut or slip up in syncing permissions, users either see too much (a security leak) or see too little (missing content they should find). There’s a reason experts call search security one of the most challenging problems in the field, with no silver bullet solution. It often requires a combination of approaches – index-time ACL enforcement, query-time checks, content partitioning, etc. – and constant vigilance. Our takeaway was sobering: you can spend months perfecting relevance and user experience, only to have a minor security oversight erode all trust in the search tool. We learned to treat security as a first-class requirement in every search project, double-checking configurations and involving our security team early and often. The security profile of enterprise search bears no resemblance to broad web search like Google, where the indexed content is inherently public and accessible (with some exceptions – you may need a subscription to get to the New York Times, for example).
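To make the security-trimming idea concrete, here is a minimal Python sketch. The document structure, group names, and matching logic are all hypothetical – real connectors resolve ACLs against the source systems (NTFS and SMB permissions, SharePoint groups, and so on) – but the principle is the same: a result only comes back if the user’s effective permissions allow access.

```python
from dataclasses import dataclass, field

@dataclass
class Document:
    """A hypothetical indexed document carrying its ACL as security metadata."""
    doc_id: str
    title: str
    allowed_groups: set = field(default_factory=set)  # groups permitted to read it

# Toy index; in practice the crawler captures these ACLs from the source system at index time.
INDEX = [
    Document("1", "HR Policy Handbook", {"all-employees"}),
    Document("2", "M&A Due Diligence Memo", {"legal", "executives"}),
    Document("3", "Formulation Trial Results", {"research"}),
]

def security_trimmed_search(query: str, user_groups: set) -> list:
    """Return only the matching documents the user is actually allowed to see.
    The keyword match is deliberately naive; the point is that the ACL check
    happens for every candidate result, every time."""
    hits = [d for d in INDEX if query.lower() in d.title.lower()]
    return [d for d in hits if d.allowed_groups & user_groups]

# A researcher searching for "results" sees the trial data...
print([d.title for d in security_trimmed_search("results", {"research", "all-employees"})])
# ...while someone outside the research group does not.
print([d.title for d in security_trimmed_search("results", {"all-employees"})])
```

Skip or fudge that final filter anywhere in the chain and you get exactly the kind of leak we stumbled into.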

One Size Doesn’t Fit All: Tailoring Search to Diverse Needs

Another hard-won insight was that no single search interface or ranking formula can satisfy everyone in a large organization. Different teams have different information needs – and often, drastically different ways of phrasing queries or judging relevance. In our case, the research org wanted search to perform highly specialized queries to surface technical documents around chemical formulations, often based on their own nomenclature and desired search facets. The rest of the company, on the other hand, cared about quickly finding knowledge base articles and policy documents by plain-language queries. Beyond that, other orgs wanted their own tailored search over the content they cared about. We quickly realized that if we tried to funnel all users through one generic “Google-like” search page, many would end up frustrated with irrelevant results or too much noise. The solution was to create custom search experiences tailored to major user segments – experiences that targeted subsets of content sources and provided relevant search facets (aka filters) based on metadata suited to those use cases. (Interestingly, these same lessons tend to apply well for scoped RAG agents, too – but that’s a discussion for another blog post.)

Using SharePoint’s capabilities, we built out multiple search verticals – essentially, separate search result pages or views, each tuned for a specific context. For example, we made a “formulation search” page for researchers that defaulted to searching the relevant content sources, with refiners attuned to that domain. In effect, we learned to treat enterprise search less as a single monolithic tool and more as a flexible toolkit that could be adapted to each business scenario. This approach echoes a broader industry insight: successful enterprise search is often delivered as a series of targeted “search-based applications” that meet specific business needs, rather than one interface trying to do it all. By narrowing the scope and understanding each group’s priorities, we were able to improve relevance and satisfaction dramatically. The experience reinforced that deep engagement with users – interviews, observing search behavior, gathering feedback – was crucial to designing search that actually helps people get their jobs done. You can have your cake (broad search) and eat it, too (scoped search experiences / applications)!
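To sketch what “scoped” means in practice, here is a small Python illustration. The vertical names, content sources, and refiners are all invented, and SharePoint expresses this through result sources and search verticals rather than code – but conceptually, each vertical is just a named slice of the corpus plus the facets that matter to that audience.

```python
# Hypothetical search verticals: a content scope plus the refiners (facets)
# that matter to each audience.
VERTICALS = {
    "formulation-search": {
        "content_sources": ["research-fileshare", "lab-notebooks"],
        "refiners": ["ChemicalFamily", "ProjectCode", "TrialPhase"],
    },
    "policy-search": {
        "content_sources": ["hr-portal", "policy-library"],
        "refiners": ["Department", "PolicyStatus", "LastReviewed"],
    },
    "everything": {
        "content_sources": ["*"],  # the broad, Google-like fallback
        "refiners": ["ContentType", "Author", "Modified"],
    },
}

def build_query(vertical: str, terms: str) -> dict:
    """Assemble a scoped query: same terms, different slice of the corpus."""
    v = VERTICALS[vertical]
    return {"q": terms, "sources": v["content_sources"], "refiners": v["refiners"]}

# Researchers and HR can type very similar words yet search very different scopes.
print(build_query("formulation-search", "solvent stability"))
print(build_query("policy-search", "parental leave"))
```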

When Search Was a Curated Discipline (and Why It Still Matters)

Looking back, I’m struck by how much human curation and information architecture work went into our search solutions – and how essential that work was to success. In the early days of enterprise search (think 2000s), it was common to have information architects or knowledge managers on the intranet team. These folks would create taxonomies, define metadata standards, and even manually curate search results for popular queries. (Who else remembers setting up “Best Bets” or “Key Results” for certain keywords? I kid you not, at one point our old intranet had something akin to AOL keywords: you would type a word, hit “Enter,” and it would take you directly to a certain page – and people were upset when we replaced that with “Best Bets” in enterprise search because it involved one more click on a top result. 🙃 Retro indeed. And also: first world problems. But I digress…) This highly curated approach to search ensured that content was well-organized and annotated, which in turn made search more effective. Over time, however, I observed a decline in this discipline. Perhaps due to tighter budgets or a misplaced faith that “the system will figure it out automatically,” many organizations stepped back from active curation. The result? A lot of mediocre content and “information junk” piling up, and search indexes bloated with non-authoritative, untrustworthy, and irrelevant data.

Our journey taught us the hard way that content quality is the bedrock of search quality. As Martin White put it in his still highly relevant 2012 book Enterprise Search, “The quality of the search experience is directly related to the quality of the content.” In other words, garbage in – garbage out. We encountered this repeatedly. For example, when we first indexed a large file share, the results were underwhelming: many documents had no titles, no authors, and cryptic filenames, so it was hard for any search engine to rank them meaningfully. Users complained that results were “nonsense” – not because the search tech was bad, but because the content itself was poorly managed. We responded by launching a clean-up project: adding titles to key documents, persuading content owners to fill in missing metadata, and deleting truly outdated material. In places where we couldn’t clean up, we raised the ranking of authoritative sources and demoted the non-authoritative, low-quality ones. These efforts immediately boosted search relevance. It was a vivid reminder that someone has to take responsibility for information quality and organization, or your fancy search engine will simply index and retrieve a mess.
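As an illustration of the “boost the authoritative, demote the noisy” tactic, here is a hedged Python sketch. The source weights are invented, and real engines express this through ranking models or query rules rather than post-hoc rescoring, but the effect is the same: identical text relevance, very different final ordering.

```python
# Hypothetical authority weights per content source. In SharePoint/FAST we
# expressed this through ranking configuration; the arithmetic below is just
# the simplest way to show the effect.
SOURCE_AUTHORITY = {
    "policy-portal": 1.5,      # curated, owned, kept current
    "team-sites": 1.0,         # generally fine
    "legacy-fileshare": 0.5,   # noisy, rarely maintained
}

def rescore(results: list) -> list:
    """Multiply each result's text-relevance score by its source authority."""
    for r in results:
        r["final_score"] = r["score"] * SOURCE_AUTHORITY.get(r["source"], 1.0)
    return sorted(results, key=lambda r: r["final_score"], reverse=True)

results = [
    {"title": "Travel Policy (2009 draft)", "source": "legacy-fileshare", "score": 0.92},
    {"title": "Travel Policy", "source": "policy-portal", "score": 0.85},
]

# The current policy now outranks the stale draft despite a lower raw score.
for r in rescore(results):
    print(f'{r["final_score"]:.2f}  {r["title"]}')
```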

In the past, many companies also invested in taxonomy design – controlled vocabularies and category structures for their content. That practice, too, has faded in some places, but its importance is returning as we grapple with ever-growing information, and especially as we try to inform AI about our content. (And in an interesting turn of events, LLMs may actually be able to help us apply these labels retroactively.) A good taxonomy (with a governance process to maintain it) helps ensure that, for instance, all documents about “HR Policies” are actually labeled consistently, or that a search for “CEO” also finds pages tagged “Executive Management.” Tagging and classification might sound old-fashioned, but they significantly improve findability. In fact, content management experts note that taxonomy and metadata are essential to making content searchable and retrievable in context. One regret from my time on search is that we never got around to populating SharePoint’s Term Store (aka managed metadata) with key business terms and synonyms. That would have effectively taught the search engine our company’s lingo, and it would have given content creators a way to tag and classify content in line with an agreed set of terms.
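As a small illustration of what a term store buys you, here is a Python sketch of query-time synonym expansion. The term map is made up, and SharePoint’s managed metadata and query rules work differently under the hood – but the principle is that the official term and the internal lingo both resolve to the same content.

```python
# Hypothetical slice of a company term store: preferred labels and the
# synonyms / internal lingo that should resolve to them.
TERM_STORE = {
    "executive management": ["ceo", "c-suite", "leadership team"],
    "hr policies": ["people policies", "employee handbook"],
}

def expand_query(terms: str) -> set:
    """Expand a query with taxonomy synonyms so official and informal
    vocabulary both reach the same content."""
    q = terms.lower()
    expanded = {q}
    for preferred, synonyms in TERM_STORE.items():
        if q == preferred or q in synonyms:
            expanded.add(preferred)
            expanded.update(synonyms)
    return expanded

# A search for "CEO" also matches pages tagged "Executive Management".
print(expand_query("CEO"))
```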

The broader point is that the need for curation hasn’t truly gone away – it’s just taken on new forms. Even if we don’t have full-time “search librarians” on staff, the work still needs to be done by someone. In our case it was an ad-hoc mix of IT, content owners, and enthusiastic volunteers who cared about findability. Together we played the role of the old information architects: pruning ROT (redundant, outdated, trivial content), organizing content repositories, and creating contextual search scopes that guided users. These efforts underscore an enduring truth: you can’t have great search without investing in your information architecture. Technology can assist, but it will never replace the need to curate, organize, and maintain your knowledge base. This need to curate even extends to addressing over-sharing in the enterprise, per the security discussion above: even when the content produced isn’t sensitive, over-sharing inevitably pushes irrelevant content into the index for more people to see, and mechanisms like the term store help raise the bar for what qualifies as high-quality content worth boosting in search results.

New Tech, Same Lessons: AI and RAG Won’t Eliminate the Basics

Fast forward to today, and the buzz in enterprise search is all about Artificial Intelligence and RAG (Retrieval-Augmented Generation). The promise is enticing: users ask questions in natural language and an AI (like a GPT-style chatbot) will fetch information from internal documents and generate a helpful answer on the fly. Having spent years wrestling with search algorithms and frustrated users, I understand the appeal – it sounds like the ultimate silver bullet. Feed all your documents to an AI, and anyone can get perfect answers, right? Well, not so fast. We’re using some of these AI-driven tools now, backed by search, and guess what we found? The old lessons still apply in full force.

RAG is essentially a two-step dance: first retrieve relevant content from your corpus (which typically has been chunked and vectorized with embeddings into some type of vector search database for semantic retrieval – the details of which I won’t delve into here), then generate an answer using that content. That first step is just enterprise search by another name – which means if your content is disorganized or low-quality, the AI will retrieve garbage and then eloquently tell you something based on garbage. We’ve seen an AI system confidently answer a query with outdated policy information, because our underlying content repository still had an old manual that hadn’t been properly marked as superseded. The fancy AI had no way to know which content was authoritative or current – that’s a curation problem, not an algorithm problem. It reinforced the notion that an AI is only as good as the knowledge you feed it. Industry observers voice the same caution: an enterprise search (or RAG) experience is only as good as the sources it’s based on. In fact, successful use of RAG demands carefully curated, up-to-date data so that the AI’s answers are accurate and trustworthy.
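To make the retrieve-then-generate dependency concrete, here is a minimal Python sketch. The chunk structure, the similarity stand-in, and the corpus are all hypothetical (real systems use embeddings and a vector store, and the prompt would go to an LLM), but the point is the filter: if superseded or non-authoritative content is never excluded at retrieval time, the generation step will happily build a confident answer on top of it.

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    """A hypothetical indexed chunk with the curation metadata RAG depends on."""
    text: str
    source: str
    is_current: bool      # superseded content should never reach the model
    authoritative: bool   # does it come from an owned, trusted repository?

def similarity(query: str, text: str) -> float:
    """Stand-in for embedding similarity; real systems use vector search."""
    return len(set(query.lower().split()) & set(text.lower().split()))

def retrieve(query: str, chunks: list, top_k: int = 3) -> list:
    """Step 1 of RAG: retrieve candidates, filtering out stale or
    non-authoritative content before anything is handed to the model."""
    eligible = [c for c in chunks if c.is_current and c.authoritative]
    return sorted(eligible, key=lambda c: similarity(query, c.text), reverse=True)[:top_k]

def build_prompt(query: str, chunks: list) -> str:
    """Step 2 would send this prompt to an LLM; shown here as plain text."""
    context = "\n---\n".join(c.text for c in retrieve(query, chunks))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

corpus = [
    Chunk("Remote work policy updated 2024: three office days per week.",
          "policy-portal", is_current=True, authoritative=True),
    Chunk("Remote work policy 2015: remote work requires VP approval.",
          "legacy-fileshare", is_current=False, authoritative=False),
]

# Only the current, authoritative policy makes it into the prompt.
print(build_prompt("What is the remote work policy?", corpus))
```

Strip out that `is_current` / `authoritative` metadata – i.e., skip the curation – and the 2015 manual is right back in the context window.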

This is not to pour cold water on AI’s potential – I’m actually excited about combining search with generation to simplify information access. But it’s telling that even in this cutting-edge domain, we’re circling back to the fundamentals. If we want the AI to surface the right answer, we must ensure the right content is discoverable. That means doing all the things we’ve discussed: managing content life cycles (so obsolete info is archived), adding metadata or annotations that give context to documents, and defining authority (so the HR policy portal outranks some random older file). In a way, RAG can actually incentivize better curation, because the cost of bad or mis-indexed content becomes immediately obvious when an AI spews out a flawed answer. The bottom line is that the core principles of enterprise search still apply, even as technology evolves. Whether a user is reading a search results page or a chatbot’s answer, the behind-the-scenes work to make that answer useful is largely the same. This is also why RAG solutions that focus on a small scope of high quality information tend to perform very well! But if you broaden that to the entire corpus of your company, the water gets muddier, and so do the results of RAG.

Conclusion: No Silver Bullet, Just Hard-Won Excellence

Reflecting on this journey – from GSA to FAST to modern AI-driven search – the clear takeaway is that enterprise search is a marathon, not a sprint. There will always be new tools and clever algorithms, but none of them absolve us from understanding our content and our users. There is no one-size-fits-all fix, no single purchase or upgrade that will suddenly make all information findable. (No, not even Glean.) The organizations that succeed in enterprise search are the ones that treat it as an ongoing discipline: part technology, part content strategy, part user education. In my experience, success has come from a blend of the right technology choices, diligent content curation, careful security and governance, and continuous tuning based on user feedback. When all those pieces come together, the outcome is immensely impactful – employees actually trust the search box to do its job, and that in turn boosts productivity and knowledge sharing.

It’s a bit poetic that today’s AI renaissance is bringing us back to lessons we should never have forgotten. To make AI answers useful, we need well-organized information – the same mandate behind good old enterprise search. As we forge ahead, I’m more convinced than ever that great search is designed, not just deployed. It requires human insight to prioritize what’s “authoritative,” to weed out what isn’t, and to present results in context. These are hard-won lessons, earned through the frustrations of users and the late-night problem-solving of IT teams over many years. Enterprise search has enduring complexity, but embracing that truth is the first step to taming it. In the end, we can’t rely on silver bullets – we have to load our own custom ammunition: high-quality content, thoughtful architecture, and user-focused refinement.

The reward is worth the effort: when someone finds exactly the information they need in seconds rather than hours, you realize that all the curation and tweaking translates directly into business value and happier, more effective teams. And that, to me, is the enduring promise of enterprise search – one that we can finally start to fulfill by applying these hard-won insights, old and new.