- We have fundamentally changed Facebook group search to help people more reliably discover, sort, and validate the community content that is most relevant to them.
- We introduced a new hybrid retrieval architecture and implemented automated model-based evaluation to eliminate the biggest friction points people face when searching for community content.
- Under this new framework, we have achieved noticeable improvements in search engagement and relevance without increasing error rates.
People around the world rely on Facebook groups every day to find valuable information, but given the sheer volume of content, the journey is not always easy. As we connect people across shared interests, it is equally important to give them a path through the wide range of conversations so they can find, as accurately as possible, the content they are looking for. We have published a paper describing how we addressed this by redesigning Facebook group scoped search: by moving beyond traditional keyword matching to a hybrid retrieval architecture and implementing automated model-based evaluation, we are fundamentally changing the way people discover, consume, and validate community content.
Addressing friction points in community knowledge
When searching for answers in community content, people run into three friction points: discovery, consumption, and validation.
Discovery: Lost in translation
In the past, discovery relied on keyword-based (lexical) systems. These systems search for exact words, creating a gap between a person’s natural-language intent and the available content. For example, imagine a person searching for “small individual cakes with icing.” A traditional keyword system may return nothing if the community uses the word “cupcakes” instead. Because the specific wording doesn’t match, the person misses out on highly relevant advice.
We needed a system where a search for an “Italian coffee drink” would effectively match a post about “cappuccino,” even if the word “coffee” was never explicitly mentioned.
Consumption: The effort tax
Even when people find the right content, they face an “effort tax.” They often have to scroll and sort through many comments before reaching a consensus. Imagine someone searches for “tips for caring for snake plants.” To get a clear answer, they have to read dozens of comments and put together a watering plan.
Validation: Decision making with community knowledge
People often need to verify a decision or validate a potential purchase using trusted community expertise. For example, imagine a buyer viewing a Facebook Marketplace listing for a high-value item, such as a vintage Corvette. They want authentic opinions and advice about the product before purchasing, but that wisdom is usually buried in isolated group discussions. The buyer needs to harness the collective wisdom of specialized groups to evaluate the product effectively, yet manually searching for these validation signals is far from easy.
The solution: A modernized hybrid retrieval architecture
We developed a hybrid retrieval architecture that powers the discussions module in Facebook search. The system runs parallel pipelines that combine the precision of inverted indexes with the conceptual understanding of dense vector representations. We addressed the limitations of legacy search by restructuring three key components of our infrastructure.
The following workflow demonstrates how we modernized the stack to process natural-language intent:
Parallel retrieval strategy
We modernized the retrieval phase by decoupling query processing into two parallel paths, ensuring we capture both precise terms and broad concepts:
Query preprocessing: Before retrieval, user queries undergo tokenization, normalization, and rewriting. This ensures clean inputs for both the inverted index and the embedding model.
The lexical path (Unicorn): We use Facebook’s Unicorn inverted index to retrieve posts that contain exact or very similar terms. This ensures high precision for queries with proper names or quoted phrases.
The semantic path (SSR): In parallel, the query is passed to our Search Semantic Retriever (SSR), a 12-layer, 200-million-parameter model that encodes the user’s natural-language input into a dense vector representation. We then perform an approximate nearest neighbor (ANN) search over a precomputed Faiss vector index of group posts. This enables content retrieval based on high-dimensional conceptual similarity, regardless of keyword overlap.
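The two-path flow above can be sketched in a few lines. This is a minimal toy, not the production system: `lexical_retrieve` stands in for Unicorn’s inverted index, `semantic_retrieve` does a brute-force cosine search in place of Faiss ANN, and the post texts and 2-D embeddings are invented for illustration (the real SSR model emits high-dimensional vectors).

```python
import math

# Toy corpus of group posts; in production these live in Unicorn's
# inverted index and a precomputed Faiss vector index.
POSTS = {
    1: "best cappuccino recipe for home espresso machines",
    2: "how often to water a snake plant",
    3: "italian coffee drink recommendations",
}

# Hypothetical precomputed embeddings (2-D for illustration only).
EMBEDDINGS = {1: [0.9, 0.1], 2: [0.05, 0.95], 3: [0.85, 0.2]}

def lexical_retrieve(query, k=10):
    """Exact-term matching, standing in for the Unicorn inverted index."""
    terms = set(query.lower().split())
    hits = [(pid, len(terms & set(text.lower().split())))
            for pid, text in POSTS.items()]
    return [pid for pid, overlap in sorted(hits, key=lambda h: -h[1])
            if overlap > 0][:k]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

def semantic_retrieve(query_vec, k=10):
    """Brute-force nearest neighbors, standing in for Faiss ANN search."""
    scored = sorted(EMBEDDINGS.items(),
                    key=lambda kv: -cosine(query_vec, kv[1]))
    return [pid for pid, _ in scored[:k]]

def hybrid_retrieve(query, query_vec, k=10):
    # Run both paths (sequentially here for clarity) and union the
    # candidate sets; the merged pool goes to the L2 ranking stage.
    return list(dict.fromkeys(lexical_retrieve(query, k) +
                              semantic_retrieve(query_vec, k)))
```

Note how a query like “italian coffee drink” pulls in the “cappuccino” post via the semantic path even though the lexical path alone would never match it; that is exactly the keyword gap the hybrid design closes.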
L2 ranking with a multi-task multi-label (MTML) architecture
Merging results from two fundamentally different paradigms, sparse lexical features and dense semantic features, required a sophisticated ranking strategy. Candidates from the keyword and embedding systems are brought together in the ranking phase, where the model consumes both semantic features (cosine similarity scores) and lexical features (such as TF-IDF and BM25 scores).
Next, we moved from single-objective models to an MTML super-model architecture. This allows the system to be optimized jointly for multiple engagement objectives, in particular clicks, shares, and comments, while maintaining plug-and-play modularity. By weighting these signals, the model ensures that the results we surface are not only topically relevant but also likely to generate meaningful community interaction.
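The weighted-signal idea can be illustrated with a deliberately simplified sketch: several task heads score the same shared features (BM25, TF-IDF, cosine similarity), and their per-task probabilities are blended into one ranking score. All weights here are hand-picked for illustration; the real MTML model learns them, and its heads are neural networks rather than single linear layers.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Per-task weights over the shared features (bm25, tfidf, cosine).
# Illustrative values only, not trained parameters.
TASK_HEADS = {
    "click":   [0.6, 0.2, 0.9],
    "share":   [0.3, 0.1, 0.7],
    "comment": [0.4, 0.2, 0.8],
}
# Blend weights deciding how much each engagement objective counts.
TASK_MIX = {"click": 0.5, "share": 0.2, "comment": 0.3}

def mtml_score(features):
    """Score one candidate with every task head over shared features,
    then blend the per-task probabilities into a single ranking score."""
    blended = 0.0
    for task, weights in TASK_HEADS.items():
        logit = sum(w * f for w, f in zip(weights, features))
        blended += TASK_MIX[task] * sigmoid(logit)
    return blended

def rank(candidates):
    """candidates: {post_id: (bm25, tfidf, cosine)} -> ids best-first."""
    return sorted(candidates, key=lambda pid: -mtml_score(candidates[pid]))
```

Because all heads share the same feature trunk, adding or removing an objective is a matter of editing the head and mix tables, which is the plug-and-play modularity described above.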
Automated offline evaluation
Adopting semantic search brings a validation challenge: similarity scores in high-dimensional vector space are not always intuitive. To validate quality at scale without the bottleneck of human labeling, we integrated an automated evaluation framework into our Build Verification Test (BVT) process.
We use Llama 3 with multimodal capabilities as an automated judge to evaluate search results against their queries. Unlike binary “good/bad” labels, our grading prompts are designed to recognize nuance. We explicitly prompt the system to recognize a “somewhat relevant” category, defined as cases where the query and result share a common domain or topic (e.g., a different sport is still relevant in the general sports context). This lets us measure improvements in result diversity and conceptual consistency.
Implications and future work
Adopting this hybrid architecture has led to measurable improvements in our quality metrics and confirms that combining lexical precision with neural understanding is superior to keyword-only methods. In our offline evaluations, the new L2 model with embedding-based retrieval (EBR) outperformed the baseline on overall search engagement, measured by the daily number of people who search on Facebook.
These results confirm that by incorporating semantic retrieval, we can surface more relevant content without sacrificing the precision that users expect. While modernizing the retrieval stack is an important milestone, it is only the beginning of unlocking community knowledge. Our roadmap focuses on deepening the integration of advanced models into the search experience:
- LLMs in ranking: We plan to apply LLMs directly in the ranking phase. By processing the content of posts during ranking, we aim to refine the relevance score beyond vector similarity.
- Adaptive retrieval: We are exploring LLM-driven adaptive retrieval strategies that dynamically adjust retrieval parameters based on the complexity of the user’s query.
Read the paper
Modernizing Facebook Scoped Search: Keyword and Embedding Hybrid Retrieval with LLM Assessment
