@SagarJariwalaScaleTeam

Description

This PR adds support for the taskType and title parameters in the VertexAI embeddings constructor, enabling users to optimize embeddings for specific downstream applications such as retrieval, semantic similarity, and classification.

Closes #9371


Motivation

Currently, the taskType parameter can only be provided per request. Users asked for a way to set it once at initialization so that every embedding produced by an instance is optimized for the same task.
Setting it in the constructor removes the need to repeat it on each call and brings the constructor in line with options the Vertex AI embeddings API already exposes.


Changes

Type Definitions (@langchain/google-common)

  • Added optional taskType?: GoogleEmbeddingsTaskType to BaseGoogleEmbeddingsParams
  • Added optional title?: string to BaseGoogleEmbeddingsParams
  • Added JSDoc documentation for available task types
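
As a rough illustration of the type changes listed above, the sketch below declares a simplified parameter interface with the two new optional fields. The names TaskTypeSketch and EmbeddingsParamsSketch are hypothetical stand-ins; the actual GoogleEmbeddingsTaskType and BaseGoogleEmbeddingsParams definitions in @langchain/google-common may be shaped differently.

```typescript
// Hypothetical mirror of the task types listed in this PR; the real
// GoogleEmbeddingsTaskType in @langchain/google-common may differ.
type TaskTypeSketch =
  | "RETRIEVAL_QUERY"
  | "RETRIEVAL_DOCUMENT"
  | "SEMANTIC_SIMILARITY"
  | "CLASSIFICATION"
  | "CLUSTERING"
  | "QUESTION_ANSWERING"
  | "FACT_VERIFICATION"
  | "CODE_RETRIEVAL_QUERY";

// Hypothetical stand-in for BaseGoogleEmbeddingsParams, reduced to the
// fields discussed in this PR.
interface EmbeddingsParamsSketch {
  model: string;
  dimensions?: number;
  /** Task the embeddings should be optimized for (optional). */
  taskType?: TaskTypeSketch;
  /** Optional title associated with the embedded text. */
  title?: string;
}

// Both fields are optional, so existing configurations keep type-checking.
const withoutTask: EmbeddingsParamsSketch = { model: "text-embedding-005" };

const withTask: EmbeddingsParamsSketch = {
  model: "text-embedding-005",
  taskType: "RETRIEVAL_DOCUMENT",
  title: "Release notes",
};
```

Because both fields are optional, callers that never set them are unaffected, which is what keeps the change backward compatible at the type level.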

Implementation (@langchain/google-common)

  • Added taskType and title as class properties in BaseGoogleEmbeddings
  • Updated constructor to accept and persist these parameters
  • Modified embedDocuments() to forward taskType and title to each request instance when they are specified
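
The implementation steps above can be pictured with a small stand-alone sketch: store the settings in the constructor, then attach them to each instance only when they were provided. EmbeddingsSketch, InstanceSketch, and buildInstances are hypothetical names used for illustration; this is not the BaseGoogleEmbeddings code, and the instance field names are assumptions rather than the confirmed request format.

```typescript
// Hypothetical shape of one embedding request instance; the field names are
// assumptions for illustration, not the library's confirmed wire format.
interface InstanceSketch {
  content: string;
  taskType?: string;
  title?: string;
}

// Hypothetical stand-in for the embeddings class, not BaseGoogleEmbeddings.
class EmbeddingsSketch {
  taskType?: string;

  title?: string;

  constructor(params: { taskType?: string; title?: string } = {}) {
    // Persist constructor-level settings so every later call can reuse them.
    this.taskType = params.taskType;
    this.title = params.title;
  }

  // Build one instance per document, attaching taskType and title only
  // when they were provided at construction time.
  buildInstances(documents: string[]): InstanceSketch[] {
    return documents.map((content) => ({
      content,
      ...(this.taskType !== undefined ? { taskType: this.taskType } : {}),
      ...(this.title !== undefined ? { title: this.title } : {}),
    }));
  }
}

const plain = new EmbeddingsSketch().buildInstances(["hello"]);
const tagged = new EmbeddingsSketch({
  taskType: "RETRIEVAL_DOCUMENT",
  title: "Guide",
}).buildInstances(["hello", "world"]);
```

Omitting the keys entirely when they are unset, rather than sending undefined values, keeps requests from instances that never configure a task type identical to what they were before.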

Tests (@langchain/google-vertexai)

  • Added unit tests to validate parameter acceptance
  • Added integration tests for:
    • taskType parameter
    • taskType combined with dimensions
    • Deprecated outputDimensionality parameter (backward compatibility)
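
For the backward-compatibility case, one plausible way to reconcile the two dimension parameters is sketched below. The helper resolveDimensions and the rule that dimensions takes precedence over the deprecated outputDimensionality are assumptions made for illustration; the PR text does not state how the library resolves a conflict.

```typescript
// Hypothetical parameter shape covering only the two dimension settings.
interface DimensionParamsSketch {
  dimensions?: number;
  /** Deprecated alias, accepted only for backward compatibility. */
  outputDimensionality?: number;
}

// Assumption: prefer the newer `dimensions` field and fall back to the
// deprecated one, returning undefined when neither is set.
function resolveDimensions(params: DimensionParamsSketch): number | undefined {
  return params.dimensions ?? params.outputDimensionality;
}

const legacyOnly = resolveDimensions({ outputDimensionality: 256 });
const bothSet = resolveDimensions({ dimensions: 768, outputDimensionality: 256 });
const neitherSet = resolveDimensions({});
```

A test in this spirit would check that a configuration using only the deprecated field still yields a usable dimension value.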

Usage Example

Before

const embeddings = new VertexAIEmbeddings({
  location: "us-central1",
  model: "text-embedding-005",
  dimensions: 768,
});

After

import { VertexAIEmbeddings } from "@langchain/google-vertexai";

const embeddings = new VertexAIEmbeddings({
  location: "us-central1",
  model: "text-embedding-005",
  taskType: "RETRIEVAL_QUERY", // new: set once at construction
  dimensions: 768,
});
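
The example above does not use the new title parameter. In the Vertex AI embeddings API, title is only valid together with the RETRIEVAL_DOCUMENT task type, so a caller may want to check the combination before constructing the embeddings object. The helper checkTitleUsage below is hypothetical and is not part of this PR or of LangChain.

```typescript
// Hypothetical pre-flight check, not part of this PR: returns a warning
// message when `title` is combined with a task type other than
// RETRIEVAL_DOCUMENT, and null when the combination looks fine.
function checkTitleUsage(
  taskType: string | undefined,
  title: string | undefined
): string | null {
  if (title !== undefined && taskType !== "RETRIEVAL_DOCUMENT") {
    return `title is only valid with taskType "RETRIEVAL_DOCUMENT" (got ${
      taskType ?? "none"
    })`;
  }
  return null;
}

const okWithTitle = checkTitleUsage("RETRIEVAL_DOCUMENT", "Guide");
const okWithoutTitle = checkTitleUsage("RETRIEVAL_QUERY", undefined);
const mismatch = checkTitleUsage("RETRIEVAL_QUERY", "Guide");
```

Returning a message instead of throwing leaves it to the caller to decide whether a mismatch should be logged or treated as an error.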

Available Task Types

The taskType parameter allows you to optimize embeddings for different downstream applications.
Below are the supported values and their typical use cases:

  • RETRIEVAL_QUERY: Optimize embeddings for search or query text.
  • RETRIEVAL_DOCUMENT: Optimize embeddings for documents in a retrieval corpus.
  • SEMANTIC_SIMILARITY: Generate embeddings for measuring semantic similarity between texts.
  • CLASSIFICATION: Optimize embeddings for text classification tasks.
  • CLUSTERING: Generate embeddings suitable for clustering or grouping similar content.
  • QUESTION_ANSWERING: Optimize embeddings for queries phrased as questions in question-answering systems.
  • FACT_VERIFICATION: Optimize embeddings for statements to be checked against a corpus in fact-verification pipelines.
  • CODE_RETRIEVAL_QUERY: Optimize embeddings for natural-language queries used to retrieve code.

💡 Each task type informs the model about the embedding context, helping it produce vectors better aligned with your intended downstream task.
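
To make the retrieval pairing concrete: a common pattern is to embed the corpus with RETRIEVAL_DOCUMENT, embed the search text with RETRIEVAL_QUERY, and rank documents by cosine similarity. The sketch below shows only the ranking step on made-up vectors. The helpers cosineSimilarity and rankDocuments are illustrative and not part of this PR; in practice the vectors would come from calls such as embedDocuments() and embedQuery().

```typescript
// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i += 1) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Treat a zero vector as having no similarity to anything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return document indices ordered from most to least similar to the query.
function rankDocuments(query: number[], docs: number[][]): number[] {
  return docs
    .map((vector, index) => ({ index, score: cosineSimilarity(query, vector) }))
    .sort((x, y) => y.score - x.score)
    .map((entry) => entry.index);
}

// Toy two-dimensional vectors standing in for real embeddings.
const ranking = rankDocuments(
  [1, 0],
  [
    [0, 1],
    [1, 0],
    [1, 1],
  ]
);
```

Here the second toy document points in the same direction as the query and ranks first, the third is partially aligned, and the first is orthogonal and ranks last.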

@changeset-bot

changeset-bot bot commented Nov 11, 2025

⚠️ No Changeset found

Latest commit: fbc15a9

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

@SagarJariwalaScaleTeam
Author

@hntrl @christian-bromann
Could you please review this PR when you have a moment?

Development

Successfully merging this pull request may close these issues.

VertexAIEmbeddings does not accept a taskType field