Moorcheh

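Moorcheh is a lightning-fast semantic search engine and vector store. Rather than relying on simple distance metrics such as L2 or cosine, Moorcheh uses Maximally Informative Binarization (MIB) and an Information-Theoretic Score (ITS) to retrieve accurate document chunks. This tutorial shows how to use Moorcheh with LangChain to upload and store text documents and vector embeddings, and how to retrieve the relevant chunks for your queries.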
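For reference, the two baseline metrics named above (L2 and cosine) can be computed in a few lines of plain Python. This is only a sketch of those baselines for comparison. It does not implement MIB or ITS, whose details are not covered in this tutorial, and the function names are illustrative rather than part of any Moorcheh API.

```python
import math
from typing import Sequence


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean (L2) distance: smaller means more similar."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors: larger means more similar."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy 2-dimensional "embeddings", made up for illustration.
query = [1.0, 0.0]
doc_same_direction = [2.0, 0.0]
doc_orthogonal = [0.0, 1.0]

print(l2_distance(query, doc_same_direction))        # 1.0
print(cosine_similarity(query, doc_same_direction))  # 1.0
print(cosine_similarity(query, doc_orthogonal))      # 0.0
```

Note that the two metrics can disagree: the first document is at L2 distance 1.0 from the query yet has a perfect cosine similarity, because cosine ignores vector length.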

Setup

First, install the necessary package:

pip install langchain-moorcheh

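Initialization

Get started with Moorcheh

Sign up or log in at the Moorcheh console.
Go to the "API Keys" tab and generate an API key.
Save the key as an environment variable named MOORCHEH_API_KEY (you will use it below).
Create a namespace to store your data:
- In the console, open the "Namespaces" tab and click "Create namespace"; or
- Initialize it programmatically with the vector store code in the "Code setup" section below.
Use your API key to create namespaces, upload documents, and retrieve answers.

For more information on the Moorcheh SDK functions, see the GitHub repository.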

Importing packages

Import the following packages:

from langchain_moorcheh import MoorchehVectorStore
from moorcheh_sdk import MoorchehClient

import logging
import os
from uuid import uuid4
import asyncio
from typing import Any, List, Optional, Literal, Tuple, Type, TypeVar, Sequence
from langchain_core.documents import Document
from langchain_core.embeddings import Embeddings
from langchain_core.vectorstores import VectorStore
# Only available inside Google Colab; uncomment if you run this notebook there.
# from google.colab import userdata

Code setup

Read your Moorcheh API key from the environment variable you set earlier:

MOORCHEH_API_KEY = os.environ['MOORCHEH_API_KEY']
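Indexing `os.environ` directly raises a bare `KeyError` when the variable is missing. A minimal sketch of a friendlier lookup follows; `load_api_key` is a hypothetical helper written for this tutorial, not part of `langchain_moorcheh` or `moorcheh_sdk`.

```python
import os


def load_api_key(var_name: str = "MOORCHEH_API_KEY") -> str:
    """Return the API key from the environment, or raise a descriptive error.

    Hypothetical helper: it only wraps os.environ and does not contact Moorcheh.
    """
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"Environment variable {var_name} is not set. "
            "Generate a key in the Moorcheh console and export it first."
        )
    return key
```

With this in place, `MOORCHEH_API_KEY = load_api_key()` can stand in for the direct lookup above.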

Set your namespace name and type, then create the vector store:

namespace = "your_namespace_name"
namespace_type = "text"  # or "vector"
store = MoorchehVectorStore(
    api_key=MOORCHEH_API_KEY,
    namespace=namespace,
    namespace_type=namespace_type,
)
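Because the namespace type is passed as a plain string, a typo would only surface once the store is created. The sketch below checks the two values mentioned above ("text" and "vector") before the store is constructed. `check_namespace_settings` is a hypothetical helper, and Moorcheh may enforce further naming rules that are not checked here.

```python
from typing import Tuple

# The two namespace types mentioned in this tutorial.
ALLOWED_NAMESPACE_TYPES = ("text", "vector")


def check_namespace_settings(namespace: str, namespace_type: str) -> Tuple[str, str]:
    """Validate the settings locally and return them, with the name stripped."""
    name = namespace.strip()
    if not name:
        raise ValueError("namespace must be a non-empty string")
    if namespace_type not in ALLOWED_NAMESPACE_TYPES:
        raise ValueError(
            f"namespace_type must be one of {ALLOWED_NAMESPACE_TYPES}, "
            f"got {namespace_type!r}"
        )
    return name, namespace_type


namespace, namespace_type = check_namespace_settings("your_namespace_name", "text")
print(namespace, namespace_type)  # your_namespace_name text
```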

Adding documents

document_1 = Document(
    page_content="Brewed a fresh cup of Ethiopian coffee and paired it with a warm croissant.",
    metadata={"source": "blog"},
)

document_2 = Document(
    page_content="Tomorrow's weather will be sunny with light winds, reaching a high of 78°F.",
    metadata={"source": "news"},
)

document_3 = Document(
    page_content="Experimenting with LangChain for an AI-powered note-taking assistant!",
    metadata={"source": "tweet"},
)

document_4 = Document(
    page_content="Local bakery donates 500 loaves of bread to the community food bank.",
    metadata={"source": "news"},
)

document_5 = Document(
    page_content="That concert last night was absolutely unforgettable—what a performance!",
    metadata={"source": "tweet"},
)

document_6 = Document(
    page_content="Check out our latest article: 5 ways to boost productivity while working from home.",
    metadata={"source": "website"},
)

document_7 = Document(
    page_content="The ultimate guide to mastering homemade pizza dough.",
    metadata={"source": "website"},
)

document_8 = Document(
    page_content="LangGraph just made multi-agent workflows way easier—seriously impressive!",
    metadata={"source": "tweet"},
)

document_9 = Document(
    page_content="Oil prices rose 3% today after unexpected supply cuts from major exporters.",
    metadata={"source": "news"},
)

document_10 = Document(
    page_content="I really hope this post doesn't vanish into the digital void…",
    metadata={"source": "tweet"},
)

documents = [
    document_1,
    document_2,
    document_3,
    document_4,
    document_5,
    document_6,
    document_7,
    document_8,
    document_9,
    document_10,
]

uuids = [str(uuid4()) for _ in range(len(documents))]

store.add_documents(documents=documents, ids=uuids)

Deleting documents

store.delete(ids=["chunk_id_here"])
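Deletion works on chunk IDs, so the IDs chosen at upload time need to be recoverable later. One possible approach, sketched below with the standard library only, is to derive a deterministic ID from each document's source and text instead of a random `uuid4`, and to keep a small local record of them. The `ID_NAMESPACE` value, `stable_id`, and `ids_for_source` are assumptions made for illustration and are not part of the Moorcheh or LangChain APIs.

```python
import uuid
from typing import Dict, List

# Hypothetical application-level namespace used to derive IDs; any fixed UUID works.
ID_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "example.invalid")


def stable_id(source: str, text: str) -> str:
    """Derive the same UUID every time for the same (source, text) pair."""
    return str(uuid.uuid5(ID_NAMESPACE, f"{source}\n{text}"))


def ids_for_source(records: List[Dict[str, str]], source: str) -> List[str]:
    """Collect the IDs of all recorded chunks that came from one source."""
    return [record["id"] for record in records if record["source"] == source]


# (source, text) pairs taken from the example documents above.
texts = [
    ("blog", "Brewed a fresh cup of Ethiopian coffee and paired it with a warm croissant."),
    ("tweet", "Experimenting with LangChain for an AI-powered note-taking assistant!"),
    ("tweet", "I really hope this post doesn't vanish into the digital void…"),
]

records = [{"id": stable_id(s, t), "source": s, "text": t} for s, t in texts]
tweet_ids = ids_for_source(records, "tweet")
print(len(tweet_ids))  # 2
```

The recorded IDs would then be passed as `ids=[r["id"] for r in records]` to `store.add_documents`, and a subset such as `tweet_ids` to `store.delete(ids=tweet_ids)`.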

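Query engine

Once your namespace is created and your documents are uploaded to it, you can query the documents directly through the vector store. Set the query and the LLM you want to answer it. For more information on supported LLMs, visit our GitHub page.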

query = "Give me a brief summary of the provided documents"
answer = store.generative_answer(query, ai_model="anthropic.claude-sonnet-4-5-20250929-v1:0")
print(answer)
