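**Compatibility:** Only available on Node.js. You can still create API routes that use MongoDB with Next.js by setting the `runtime` variable to `nodejs`, like so: `export const runtime = "nodejs";`. You can read more about Edge runtimes in the Next.js documentation.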

This guide provides a quick overview for getting started with MongoDB Atlas [vector stores](/oss/javascript/integrations/vectorstores). For detailed documentation of all `MongoDBAtlasVectorSearch` features and configurations head to the [API reference](https://api.js.langchain.com/classes/langchain_mongodb.MongoDBAtlasVectorSearch.html).

## Overview

### Integration details

| Class | Package | [PY support](https://python.langchain.ac.cn/docs/integrations/vectorstores/mongodb_atlas/) |  Version |
| :--- | :--- | :---: | :---: |
| [`MongoDBAtlasVectorSearch`](https://api.js.langchain.com/classes/langchain_mongodb.MongoDBAtlasVectorSearch.html) | [`@langchain/mongodb`](https://npmjs.net.cn/package/@langchain/mongodb) | ✅ | ![NPM - Version](https://img.shields.io/npm/v/@langchain/mongodb?style=flat-square&label=%20&) |

## Setup

To use MongoDB Atlas vector stores, you'll need to configure a MongoDB Atlas cluster and install the `@langchain/mongodb` integration package.

### Initial Cluster Configuration

To create a MongoDB Atlas cluster, navigate to the [MongoDB Atlas website](https://mongodb.ac.cn/products/platform/atlas-database) and create an account if you don't already have one.

Create and name a cluster when prompted, then find it under `Database`. Select `Browse Collections` and create either a blank collection or one from the provided sample data.

**Note:** The cluster created must be MongoDB 7.0 or higher.

### Creating an Index

After configuring your cluster, you'll need to create an index on the collection field you want to search over.

Switch to the `Atlas Search` tab and click `Create Search Index`. From there, make sure you select `Atlas Vector Search - JSON Editor`, then select the appropriate database and collection and paste the following into the textbox:

```json
{
  "fields": [
    {
      "numDimensions": 1536,
      "path": "embedding",
      "similarity": "euclidean",
      "type": "vector"
    }
  ]
}
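```

Note that the `numDimensions` property must match the dimensionality of the embeddings you are using. For example, Cohere embeddings have 1024 dimensions, while OpenAI embeddings have 1536 by default.

**Note:** By default, the vector store expects an index name of `default`, an indexed collection field name of `embedding`, and a raw text field name of `text`. Initialize the vector store with field names that match your index name and collection schema, as shown in the instantiation example later in this guide.

Finally, proceed to build the index.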
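
Generating the index definition in code can help keep `numDimensions` in sync with the embedding model. The following is a minimal sketch and not part of `@langchain/mongodb`: `buildVectorIndexDefinition` and the `EMBEDDING_DIMENSIONS` table are hypothetical names introduced for illustration, and the dimension values are the ones quoted in this guide (1536 for OpenAI's default, 1024 for Cohere).

```typescript
// Hypothetical helper (not part of @langchain/mongodb): builds the JSON
// definition you would paste into the Atlas Vector Search JSON editor.
type Similarity = "euclidean" | "cosine" | "dotProduct";

interface VectorIndexDefinition {
  fields: Array<{
    type: "vector";
    path: string;
    numDimensions: number;
    similarity: Similarity;
  }>;
}

// Dimensions quoted in this guide; extend the table for other models.
const EMBEDDING_DIMENSIONS: Record<string, number | undefined> = {
  openai: 1536,
  cohere: 1024,
};

function buildVectorIndexDefinition(
  provider: string,
  path: string = "embedding",
  similarity: Similarity = "euclidean"
): VectorIndexDefinition {
  const numDimensions = EMBEDDING_DIMENSIONS[provider];
  if (numDimensions === undefined) {
    throw new Error(`Unknown embedding provider: ${provider}`);
  }
  return { fields: [{ type: "vector", path, numDimensions, similarity }] };
}

// Print the definition for OpenAI embeddings, ready to paste into Atlas.
console.log(JSON.stringify(buildVectorIndexDefinition("openai"), null, 2));
```

Failing on an unknown provider is a deliberate choice here: an index built with the wrong dimension count would not match the stored vectors, so it is safer to stop early.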

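### Embeddings

This guide also uses OpenAI embeddings, which require you to install the `@langchain/openai` integration package. You can use other supported embeddings models if you prefer.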

### Installation

Install the following packages:

```bash
npm install @langchain/mongodb mongodb @langchain/openai @langchain/core
```

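### Credentials

Once you've done the above, set the `MONGODB_ATLAS_URI` environment variable using the connection string from the `Connect` button in the Mongo dashboard. You'll also need your database name and collection name:

```typescript
process.env.MONGODB_ATLAS_URI = "your-atlas-url";
process.env.MONGODB_ATLAS_COLLECTION_NAME = "your-atlas-collection-name";
process.env.MONGODB_ATLAS_DB_NAME = "your-atlas-db-name";
```

If you are using OpenAI embeddings for this guide, you'll also need to set your OpenAI key:

```typescript
process.env.OPENAI_API_KEY = "YOUR_API_KEY";
```

If you want automated tracing of your model calls, you can also set your LangSmith API key by uncommenting the lines below:

```typescript
// process.env.LANGSMITH_TRACING="true"
// process.env.LANGSMITH_API_KEY="your-api-key"
```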
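
Because a missing variable otherwise surfaces later as a confusing connection error, it can help to validate the settings up front. The sketch below is one possible approach: `readAtlasConfig` and the `AtlasConfig` shape are hypothetical names, not part of any library. It takes the environment as a parameter so it can run without real credentials.

```typescript
// Hypothetical helper: collects the settings this guide relies on and fails
// fast with a readable message when one is missing or empty.
interface AtlasConfig {
  uri: string;
  dbName: string;
  collectionName: string;
}

function readAtlasConfig(env: Record<string, string | undefined>): AtlasConfig {
  const required = [
    "MONGODB_ATLAS_URI",
    "MONGODB_ATLAS_DB_NAME",
    "MONGODB_ATLAS_COLLECTION_NAME",
  ] as const;

  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }

  return {
    uri: env["MONGODB_ATLAS_URI"] as string,
    dbName: env["MONGODB_ATLAS_DB_NAME"] as string,
    collectionName: env["MONGODB_ATLAS_COLLECTION_NAME"] as string,
  };
}

// In an application you would pass `process.env`; a literal is used here so
// the snippet runs without any real credentials.
const config = readAtlasConfig({
  MONGODB_ATLAS_URI: "your-atlas-url",
  MONGODB_ATLAS_DB_NAME: "your-atlas-db-name",
  MONGODB_ATLAS_COLLECTION_NAME: "your-atlas-collection-name",
});
console.log(config.dbName, config.collectionName);
```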

## Instantiation

Once you've set up your cluster as shown above, you can initialize your vector store as follows:

```typescript
import { MongoDBAtlasVectorSearch } from "@langchain/mongodb";
import { OpenAIEmbeddings } from "@langchain/openai";
import { MongoClient } from "mongodb";

const client = new MongoClient(process.env.MONGODB_ATLAS_URI || "");
const collection = client.db(process.env.MONGODB_ATLAS_DB_NAME)
  .collection(process.env.MONGODB_ATLAS_COLLECTION_NAME);

const embeddings = new OpenAIEmbeddings({
  model: "text-embedding-3-small",
});

const vectorStore = new MongoDBAtlasVectorSearch(embeddings, {
  collection: collection,
  indexName: "vector_index", // The name of the Atlas search index. Defaults to "default"
  textKey: "text", // The name of the collection field containing the raw content. Defaults to "text"
  embeddingKey: "embedding", // The name of the collection field containing the embedded text. Defaults to "embedding"
});
```

## Manage vector store

### Add items to vector store

You can now add documents to your vector store:

```typescript
import type { Document } from "@langchain/core/documents";

const document1: Document = {
  pageContent: "The powerhouse of the cell is the mitochondria",
  metadata: { source: "https://example.com" }
};

const document2: Document = {
  pageContent: "Buildings are made out of brick",
  metadata: { source: "https://example.com" }
};

const document3: Document = {
  pageContent: "Mitochondria are made out of lipids",
  metadata: { source: "https://example.com" }
};

const document4: Document = {
  pageContent: "The 2024 Olympics are in Paris",
  metadata: { source: "https://example.com" }
};

const documents = [document1, document2, document3, document4];

await vectorStore.addDocuments(documents, { ids: ["1", "2", "3", "4"] });
```

```text
[ '1', '2', '3', '4' ]
```

**Note:** After adding documents, there is a slight delay before they become queryable. Adding a document with the same `id` as an existing document will update the existing document.
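
To make the update-on-same-id behavior concrete, here is a small in-memory sketch. It does not call MongoDB and is not the library's implementation: `toRecord`, `upsert`, and the `Map`-based store are hypothetical stand-ins. The record layout (text under `textKey`, vector under `embeddingKey`, metadata fields alongside, `_id` taken from the supplied id) is an assumption based on the defaults and the query output shown in this guide, and the two-element vectors are placeholders for real embeddings.

```typescript
interface SimpleDocument {
  pageContent: string;
  metadata: Record<string, unknown>;
}

type AtlasRecord = Record<string, unknown> & { _id: string };

// Hypothetical: mirrors how a document might be laid out in the collection.
function toRecord(
  id: string,
  doc: SimpleDocument,
  embedding: number[],
  textKey: string = "text",
  embeddingKey: string = "embedding"
): AtlasRecord {
  return {
    ...doc.metadata,
    [textKey]: doc.pageContent,
    [embeddingKey]: embedding,
    _id: id,
  };
}

// A Map keyed by _id stands in for the collection: writing an existing key
// replaces the stored record instead of adding a second one.
const store = new Map<string, AtlasRecord>();

function upsert(record: AtlasRecord): void {
  store.set(record._id, record);
}

const source = "https://example.com";
upsert(
  toRecord("4", { pageContent: "The 2024 Olympics are in Paris", metadata: { source } }, [0.1, 0.2])
);
// Same id again: the earlier record is overwritten, not duplicated.
upsert(
  toRecord("4", { pageContent: "The 2024 Olympics were held in Paris", metadata: { source } }, [0.1, 0.3])
);

console.log(store.size);
console.log(store.get("4")?.["text"]);
```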

### Delete items from vector store

You can delete items from a vector store by ID:

```typescript
await vectorStore.delete({ ids: ["4"] });
```

## Query vector store

Once your vector store has been created and the relevant documents have been added, you will most likely want to query it while running your chain or agent.

### Query directly

A simple similarity search can be performed as follows:

```typescript
const similaritySearchResults = await vectorStore.similaritySearch("biology", 2);

for (const doc of similaritySearchResults) {
  console.log(`* ${doc.pageContent} [${JSON.stringify(doc.metadata, null)}]`);
}
```

```text
* The powerhouse of the cell is the mitochondria [{"_id":"1","source":"https://example.com"}]
* Mitochondria are made out of lipids [{"_id":"3","source":"https://example.com"}]
```

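### Filtering

MongoDB Atlas supports pre-filtering of results on other fields. This requires you to define which metadata fields you plan to filter on by updating the index you created initially. Here's an example:

```json
{
  "fields": [
    {
      "numDimensions": 1536,
      "path": "embedding",
      "similarity": "euclidean",
      "type": "vector"
    },
    {
      "path": "source",
      "type": "filter"
    }
  ]
}
```

Above, the first item in `fields` is the vector index, and the second item is the metadata property you want to filter on. The name of the property is the value of the `path` key, so the index above allows filtering on a metadata field named `source`.

Then, in your code, you can use MQL query operators for filtering. The example below illustrates this:

```typescript
const filter = {
  preFilter: {
    source: {
      $eq: "https://example.com",
    },
  },
};

const filteredResults = await vectorStore.similaritySearch("biology", 2, filter);

for (const doc of filteredResults) {
  console.log(`* ${doc.pageContent} [${JSON.stringify(doc.metadata, null)}]`);
}
```

```text
* The powerhouse of the cell is the mitochondria [{"_id":"1","source":"https://example.com"}]
* Mitochondria are made out of lipids [{"_id":"3","source":"https://example.com"}]
```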
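
Filters can also be assembled programmatically. The sketch below builds a `preFilter` that matches one or several `source` values, using `$eq` for a single value and the MQL `$in` operator otherwise. `buildSourceFilter` is a hypothetical helper, and it assumes `source` has been declared as a `filter` field in the index.

```typescript
interface SourcePreFilter {
  preFilter: {
    source: { $eq: string } | { $in: string[] };
  };
}

// Hypothetical helper: returns an object shaped like the `filter` used with
// similaritySearch in this guide.
function buildSourceFilter(sources: string[]): SourcePreFilter {
  const [first, ...rest] = sources;
  if (first === undefined) {
    throw new Error("Provide at least one source to filter on");
  }
  // A single value keeps the simpler $eq form.
  if (rest.length === 0) {
    return { preFilter: { source: { $eq: first } } };
  }
  // Several values: match documents whose source is any of them.
  return { preFilter: { source: { $in: sources } } };
}

console.log(JSON.stringify(buildSourceFilter(["https://example.com"])));
console.log(
  JSON.stringify(buildSourceFilter(["https://example.com", "https://example.org"]))
);
```

The returned object can be passed as the third argument to `similaritySearch`, in the same position as a hand-written filter.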

### Returning scores

If you want to execute a similarity search and receive the corresponding scores, you can run:

```typescript
const similaritySearchWithScoreResults = await vectorStore.similaritySearchWithScore("biology", 2, filter);

for (const [doc, score] of similaritySearchWithScoreResults) {
  console.log(`* [SIM=${score.toFixed(3)}] ${doc.pageContent} [${JSON.stringify(doc.metadata)}]`);
}
```

```text
* [SIM=0.374] The powerhouse of the cell is the mitochondria [{"_id":"1","source":"https://example.com"}]
* [SIM=0.370] Mitochondria are made out of lipids [{"_id":"3","source":"https://example.com"}]
```
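
Scores are useful for discarding weak matches. The following sketch keeps only results at or above a cutoff. `keepAboveScore` and the `0.372` threshold are illustrative assumptions rather than library features, and the sample tuples reuse the scores from this guide's example output. What counts as a good score depends on the similarity function and the embedding model, so treat the threshold as a value to tune.

```typescript
interface ScoredDocument {
  pageContent: string;
  metadata: Record<string, unknown>;
}

// Same [document, score] tuple shape that similaritySearchWithScore yields.
type ScoredResult = [ScoredDocument, number];

// Hypothetical helper: drops results whose score is below the cutoff.
function keepAboveScore(results: ScoredResult[], minScore: number): ScoredResult[] {
  return results.filter(([, score]) => score >= minScore);
}

// Sample data standing in for real search results.
const sampleResults: ScoredResult[] = [
  [
    { pageContent: "The powerhouse of the cell is the mitochondria", metadata: { _id: "1" } },
    0.374,
  ],
  [
    { pageContent: "Mitochondria are made out of lipids", metadata: { _id: "3" } },
    0.37,
  ],
];

for (const [doc, score] of keepAboveScore(sampleResults, 0.372)) {
  console.log(`* [SIM=${score.toFixed(3)}] ${doc.pageContent}`);
}
```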

### Query by turning into retriever

You can also transform the vector store into a retriever for easier use in your chains.

```typescript
const retriever = vectorStore.asRetriever({
  // Optional filter
  filter: filter,
  k: 2,
});
await retriever.invoke("biology");
```

```text
[
  Document {
    pageContent: 'The powerhouse of the cell is the mitochondria',
    metadata: { _id: '1', source: 'https://example.com' },
    id: undefined
  },
  Document {
    pageContent: 'Mitochondria are made out of lipids',
    metadata: { _id: '3', source: 'https://example.com' },
    id: undefined
  }
]
```

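### Usage for retrieval-augmented generation

For guides on how to use this vector store for retrieval-augmented generation (RAG), see the RAG tutorials and how-to guides in the LangChain documentation.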
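
As a rough illustration of the retrieval step in RAG, the sketch below turns retrieved documents into a numbered context string and places it in a prompt. `formatDocsAsContext` and `buildPrompt` are hypothetical helpers, the prompt wording is an assumption, and the sample documents stand in for what a retriever would return.

```typescript
interface RetrievedDocument {
  pageContent: string;
  metadata: Record<string, unknown>;
}

// Hypothetical helper: numbers each document so an answer can cite it.
function formatDocsAsContext(docs: RetrievedDocument[]): string {
  return docs.map((doc, i) => `[${i + 1}] ${doc.pageContent}`).join("\n");
}

// Hypothetical helper: combines the context and the question into one prompt.
function buildPrompt(question: string, docs: RetrievedDocument[]): string {
  return [
    "Answer the question using only the context below.",
    "",
    "Context:",
    formatDocsAsContext(docs),
    "",
    `Question: ${question}`,
  ].join("\n");
}

// Sample documents standing in for retriever output.
const retrievedDocs: RetrievedDocument[] = [
  {
    pageContent: "The powerhouse of the cell is the mitochondria",
    metadata: { _id: "1", source: "https://example.com" },
  },
  {
    pageContent: "Mitochondria are made out of lipids",
    metadata: { _id: "3", source: "https://example.com" },
  },
];

console.log(buildPrompt("What is the powerhouse of the cell?", retrievedDocs));
```

The resulting string would then be sent to a chat model of your choice; that call is left out here because it depends on the model integration you use.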

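## Closing connections

Make sure you close the client instance when you are finished to avoid excessive resource consumption:

```typescript
await client.close();
```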
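
One way to make sure the connection is closed even when a query throws is to wrap the work in `try`/`finally`. This is a generic sketch: `withClient` is a hypothetical helper, and `Closable` describes only a `close()` method that returns a promise, which is how `client.close()` is used in this guide. A fake client is used so the snippet runs without a database.

```typescript
interface Closable {
  close(): Promise<void>;
}

// Hypothetical helper: runs `work` and always closes the client afterwards,
// whether `work` resolves or rejects.
async function withClient<C extends Closable, T>(
  client: C,
  work: (client: C) => Promise<T>
): Promise<T> {
  try {
    return await work(client);
  } finally {
    await client.close();
  }
}

// Stand-in for a real client so the example needs no database.
let closed = false;
const fakeClient: Closable = {
  close: async () => {
    closed = true;
  },
};

withClient(fakeClient, async () => "query result").then((result) => {
  console.log(result, closed);
});
```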

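## API reference

For detailed documentation of all `MongoDBAtlasVectorSearch` features and configurations, head to the [API reference](https://api.js.langchain.com/classes/langchain_mongodb.MongoDBAtlasVectorSearch.html).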