@margaretjgu commented on Nov 12, 2025

Elasticsearch Hybrid Search Implementation

Summary

This PR adds hybrid search support to the Elasticsearch vector store integration in LangChain.js. Hybrid search runs semantic (kNN vector) search and lexical (BM25 full-text) search for the same query and merges the two ranked result lists with Reciprocal Rank Fusion (RRF), which can improve relevance over either method alone.

Related documentation PR: langchain-ai/docs#1466

Features

1. Hybrid Search Strategy

  • New HybridRetrievalStrategy class for configuring hybrid search
  • Combines kNN vector search with BM25 full-text search
  • Uses Elasticsearch's built-in RRF (Reciprocal Rank Fusion) for result merging
  • Fusion runs server-side in Elasticsearch, so the client sends a single search request and does no result merging itself
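
As a rough illustration of what RRF does with the two result lists: each document's fused score is the sum of 1 / (rank_constant + rank) over every list in which it appears, with 1-based ranks. The sketch below reimplements that formula client-side purely for explanation. In this PR the fusion is performed by Elasticsearch itself; the function name and the sample document IDs are invented for the example.

```typescript
// Reciprocal Rank Fusion over several ranked lists of document IDs.
// Each list is ordered best-first; ranks are 1-based.
function reciprocalRankFusion(
  rankedLists: string[][],
  rankConstant = 60,
  rankWindowSize = 100
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const list of rankedLists) {
    // Only the top rankWindowSize hits of each list take part in the fusion.
    list.slice(0, rankWindowSize).forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (rankConstant + rank));
    });
  }
  // Highest fused score first.
  return Array.from(scores.entries())
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Hypothetical result lists for one query.
const bm25Hits = ["doc-a", "doc-b", "doc-c"];
const knnHits = ["doc-b", "doc-d", "doc-a"];
const fused = reciprocalRankFusion([bm25Hits, knnHits]);
console.log(fused.map((hit) => hit.id));
```

Documents that appear near the top of both lists (here doc-a and doc-b) outrank documents that appear in only one, which is the behavior hybrid search relies on. A larger rankConstant flattens the difference between adjacent ranks.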

2. Backward Compatible

  • Existing pure vector search behavior unchanged
  • Hybrid search is opt-in via strategy parameter
  • No breaking changes to existing APIs

Implementation Details

New Classes

HybridRetrievalStrategyConfig

Configuration interface for hybrid search:

interface HybridRetrievalStrategyConfig {
  rankWindowSize?: number;        // Default: 100
  rankConstant?: number;           // Default: 60
  textField?: string;              // Default: "text"
  includeSourceVectors?: boolean;  // For ES 9.2+
}

HybridRetrievalStrategy

Strategy class implementing hybrid search:

const strategy = new HybridRetrievalStrategy({
  rankWindowSize: 100,
  rankConstant: 60,
  textField: "text",
  includeSourceVectors: true  // For ES 9.2+
});
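
Since every field in the config is optional, the defaults listed in the interface comments have to be filled in somewhere. One plausible way to do that is sketched below. The helper `resolveHybridConfig` is hypothetical and is not taken from the PR; the interface is repeated only so the snippet is self-contained, and `includeSourceVectors` is passed through unchanged because the description states no default for it.

```typescript
// Repeated from the PR description so the snippet stands alone.
interface HybridRetrievalStrategyConfig {
  rankWindowSize?: number;
  rankConstant?: number;
  textField?: string;
  includeSourceVectors?: boolean;
}

// Hypothetical helper (not part of the PR): applies the documented defaults.
function resolveHybridConfig(config: HybridRetrievalStrategyConfig = {}) {
  return {
    rankWindowSize: config.rankWindowSize ?? 100,
    rankConstant: config.rankConstant ?? 60,
    textField: config.textField ?? "text",
    // No default is documented for this flag, so it is left as provided.
    includeSourceVectors: config.includeSourceVectors,
  };
}

// Overriding a single field keeps the other defaults.
const resolved = resolveHybridConfig({ rankConstant: 20 });
console.log(resolved);
```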

Modified Methods

similaritySearch()

  • Captures query text for hybrid search
  • Routes to hybrid or vector search based on strategy

similaritySearchVectorWithScore()

  • Routes to hybridSearchVectorWithScore() when strategy is present
  • Falls back to pure kNN search otherwise
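
The reason similaritySearch() needs to capture the query text is that the vector-level method only receives an embedding, while the BM25 half of a hybrid query needs the original string. The toy class below sketches that routing under stated assumptions: it is synchronous, the embedding is fake, the class and field names are invented, and the rule "use hybrid only if a strategy is set and query text was captured" is a guess at the behavior rather than the PR's actual code.

```typescript
type SearchMode = "hybrid" | "knn";

interface RoutedSearch {
  mode: SearchMode;
  queryText?: string;
  vectorLength: number;
}

// Toy stand-in for the vector store; only the routing decision is modeled.
class ToyVectorStore {
  private readonly hasHybridStrategy: boolean;
  private lastQueryText: string | undefined;

  constructor(hasHybridStrategy: boolean) {
    this.hasHybridStrategy = hasHybridStrategy;
  }

  similaritySearch(query: string): RoutedSearch {
    // Remember the raw text so the vector-level method can use it for BM25.
    this.lastQueryText = query;
    return this.similaritySearchVectorWithScore(this.fakeEmbed(query));
  }

  similaritySearchVectorWithScore(vector: number[]): RoutedSearch {
    // Assumption: hybrid needs both a strategy and captured query text.
    if (this.hasHybridStrategy && this.lastQueryText !== undefined) {
      return {
        mode: "hybrid",
        queryText: this.lastQueryText,
        vectorLength: vector.length,
      };
    }
    return { mode: "knn", vectorLength: vector.length };
  }

  // Fake embedding: a real store would call an embeddings model here.
  private fakeEmbed(text: string): number[] {
    return [text.length, 0, 0];
  }
}

console.log(new ToyVectorStore(true).similaritySearch("muscle soreness").mode);
console.log(new ToyVectorStore(false).similaritySearch("muscle soreness").mode);
```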

New: hybridSearchVectorWithScore()

  • Private method implementing hybrid search
  • Uses Elasticsearch retriever API with RRF
  • Combines two retrievers:
    1. Standard retriever: BM25 full-text search
    2. kNN retriever: Vector similarity search
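
For orientation, a request that combines the two retrievers under an `rrf` retriever could look roughly like the object built below. This is a sketch of the Elasticsearch retriever request shape, not the PR's code: the builder function, the `vectorField` default of "embedding", and the choice of `num_candidates` are assumptions made for the example.

```typescript
interface HybridQueryOptions {
  queryText: string;
  queryVector: number[];
  k: number;
  textField?: string;
  vectorField?: string; // assumption: name of the dense_vector field
  rankWindowSize?: number;
  rankConstant?: number;
}

// Hypothetical builder for an RRF retriever search request.
function buildHybridRequest(indexName: string, options: HybridQueryOptions) {
  const {
    queryText,
    queryVector,
    k,
    textField = "text",
    vectorField = "embedding",
    rankWindowSize = 100,
    rankConstant = 60,
  } = options;

  return {
    index: indexName,
    size: k,
    retriever: {
      rrf: {
        retrievers: [
          // Standard retriever: BM25 full-text match on the text field.
          { standard: { query: { match: { [textField]: queryText } } } },
          // kNN retriever: vector similarity on the embedding field.
          {
            knn: {
              field: vectorField,
              query_vector: queryVector,
              k,
              // Assumption: consider at least the fusion window per shard.
              num_candidates: Math.max(k, rankWindowSize),
            },
          },
        ],
        rank_window_size: rankWindowSize,
        rank_constant: rankConstant,
      },
    },
  };
}

const request = buildHybridRequest("my-index", {
  queryText: "how to prevent muscle soreness",
  queryVector: [0.1, 0.2, 0.3],
  k: 5,
});
console.log(JSON.stringify(request, null, 2));
```

An object of this shape would be passed to the Elasticsearch client's search call; filters and source-field handling are left out to keep the sketch short.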

Usage

Basic Vector Search (No Change)

import { Client } from "@elastic/elasticsearch";
import { OpenAIEmbeddings } from "@langchain/openai";
import { ElasticVectorSearch } from "@langchain/community/vectorstores/elasticsearch";

const vectorStore = new ElasticVectorSearch(
  new OpenAIEmbeddings(),
  {
    client: new Client({ node: "http://localhost:9200" }),
    indexName: "my-index"
  }
);

const results = await vectorStore.similaritySearch("query", 5);

Hybrid Search (New)

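import { Client } from "@elastic/elasticsearch";
import { OpenAIEmbeddings } from "@langchain/openai";
import {
  ElasticVectorSearch,
  HybridRetrievalStrategy,
} from "@langchain/community/vectorstores/elasticsearch";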

const vectorStore = new ElasticVectorSearch(
  new OpenAIEmbeddings(),
  {
    client: new Client({ node: "http://localhost:9200" }),
    indexName: "my-index",
    strategy: new HybridRetrievalStrategy({
      rankWindowSize: 100,
      rankConstant: 60,
      textField: "text"
    })
  }
);

// Same API, but now uses hybrid search internally
const results = await vectorStore.similaritySearch(
  "how to prevent muscle soreness",
  5
);

Core Implementation

  • libs/langchain-community/src/vectorstores/elasticsearch.ts (+145 lines)
    • Added HybridRetrievalStrategyConfig interface
    • Added HybridRetrievalStrategy class
    • Updated ElasticClientArgs interface
    • Added hybridSearchVectorWithScore() method
    • Updated similaritySearch() and similaritySearchVectorWithScore()
    • Enhanced JSDoc documentation

@margaretjgu marked this pull request as draft on November 12, 2025, 21:58

changeset-bot bot commented Nov 12, 2025

🦋 Changeset detected

Latest commit: 139374b

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name Type
@langchain/community Patch

@github-actions bot added the community (Issues related to `@langchain/community`) and pkg:@langchain/community labels on Nov 12, 2025
@margaretjgu changed the title from "Enable Hybrid Search for Langchain.js" to "Enable Elasticsearch Hybrid Search for Langchain.js" on Nov 12, 2025
@margaretjgu changed the title from "Enable Elasticsearch Hybrid Search for Langchain.js" to "feat(): add elasticsearch hybrid search" on Nov 13, 2025
@margaretjgu marked this pull request as ready for review on November 19, 2025, 15:19
@margaretjgu changed the title from "feat(): add elasticsearch hybrid search" to "feat(community): add elasticsearch hybrid search" on Nov 26, 2025
@margaretjgu

cc @nayrttam for vis

@hntrl merged commit 163614e into langchain-ai:main on Dec 7, 2025
33 checks passed