GreenNode is a global AI solutions provider and an NVIDIA Preferred Partner, delivering full-stack AI capabilities from infrastructure to applications for enterprises across the US, MENA, and APAC regions. Backed by world-class infrastructure (LEED Gold, TIA-942, Uptime Tier III), GreenNode offers a comprehensive suite of AI services to enterprises, startups, and researchers.
This guide will help you get started with GreenNodeEmbeddings. By generating high-quality vector representations of text, it enables semantic document search across a variety of built-in connectors or your own custom data sources.

Overview

Integration details

Provider: GreenNode · Package: langchain-greennode

Setup

To access GreenNode embedding models, you'll need to create a GreenNode account, get an API key, and install the langchain-greennode integration package.

Credentials

GreenNode requires an API key for authentication, which can be provided either as the api_key parameter during initialization or set as the environment variable GREENNODE_API_KEY. You can obtain an API key by registering for an account on GreenNode Serverless AI.
import getpass
import os

if not os.getenv("GREENNODE_API_KEY"):
    os.environ["GREENNODE_API_KEY"] = getpass.getpass("Enter your GreenNode API key: ")
If you want automated tracing of your model calls, you can also set your LangSmith API key by uncommenting the lines below:
os.environ["LANGSMITH_TRACING"] = "true"
os.environ["LANGSMITH_API_KEY"] = getpass.getpass("Enter your LangSmith API key: ")

Installation

The LangChain GreenNode integration lives in the langchain-greennode package:
pip install -qU langchain-greennode
Note: you may need to restart the kernel to use updated packages.

Instantiation

The GreenNodeEmbeddings class can be instantiated with optional parameters for the API key and model name:
from langchain_greennode import GreenNodeEmbeddings

# Initialize the embeddings model
embeddings = GreenNodeEmbeddings(
    # api_key="YOUR_API_KEY",  # You can pass the API key directly
    model="BAAI/bge-m3"  # The default embedding model
)
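
If no api_key is passed, the key is read from the GREENNODE_API_KEY environment variable set in the Credentials step. Below is a minimal sketch of the fully explicit form; the model name is simply the default from above, and any embedding model GreenNode serves could be substituted.

import os

from langchain_greennode import GreenNodeEmbeddings

# Explicit form (sketch): pass the key from the environment and name the model
embeddings_explicit = GreenNodeEmbeddings(
    api_key=os.environ["GREENNODE_API_KEY"],
    model="BAAI/bge-m3",
)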

Indexing and Retrieval

Embedding models play a key role in retrieval-augmented generation (RAG) workflows by enabling both the indexing of content and its efficient retrieval. Below, we show how to index and retrieve data using the embeddings object we initialized above. In this example, we will index and retrieve a sample document in the InMemoryVectorStore.
# Create a vector store with a sample text
from langchain_core.vectorstores import InMemoryVectorStore

text = "LangChain is the framework for building context-aware reasoning applications"

vectorstore = InMemoryVectorStore.from_texts(
    [text],
    embedding=embeddings,
)

# Use the vectorstore as a retriever
retriever = vectorstore.as_retriever()

# Retrieve the most similar text
retrieved_documents = retriever.invoke("What is LangChain?")

# show the retrieved document's content
retrieved_documents[0].page_content
'LangChain is the framework for building context-aware reasoning applications'
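
You can also query the vector store directly and inspect similarity scores. The sketch below reuses the vectorstore created above; add_texts and similarity_search_with_score come from the standard vector store interface, and the exact scores will depend on the embedding model.

# Add a second document to the existing store (sketch, reusing `vectorstore`)
vectorstore.add_texts(
    ["LangGraph is a library for building stateful, multi-actor applications with LLMs"]
)

# Retrieve the top 2 matches together with their similarity scores
for doc, score in vectorstore.similarity_search_with_score("What is LangChain?", k=2):
    print(f"{score:.4f}  {doc.page_content}")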

Direct Usage

The GreenNodeEmbeddings class can be used on its own to generate text embeddings without a vector store. This is useful for tasks such as similarity scoring, clustering, or custom processing pipelines.

Embed single texts

You can embed single texts or documents with embed_query:
single_vector = embeddings.embed_query(text)
print(str(single_vector)[:100])  # Show the first 100 characters of the vector
[-0.01104736328125, -0.0281982421875, 0.0035858154296875, -0.0311279296875, -0.0106201171875, -0.039

Embed multiple texts

You can embed multiple texts with embed_documents:
text2 = (
    "LangGraph is a library for building stateful, multi-actor applications with LLMs"
)
two_vectors = embeddings.embed_documents([text, text2])
for vector in two_vectors:
    print(str(vector)[:100])  # Show the first 100 characters of the vector
[-0.01104736328125, -0.0281982421875, 0.0035858154296875, -0.0311279296875, -0.0106201171875, -0.039
[-0.07177734375, -0.00017452239990234375, -0.002044677734375, -0.0299072265625, -0.0184326171875, -0
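
Each embedding is a plain list of floats. A quick sanity check on shape, reusing two_vectors from above; the 1024 dimension matches the async example output below for the default BAAI/bge-m3 model.

# One vector per input text; each vector has the model's embedding dimension
print(len(two_vectors))     # 2
print(len(two_vectors[0]))  # 1024 for BAAI/bge-m3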

Async Support

GreenNodeEmbeddings supports asynchronous operations:
import asyncio


async def generate_embeddings_async():
    # Embed a single query
    query_result = await embeddings.aembed_query("What is the capital of France?")
    print(f"Async query embedding dimension: {len(query_result)}")

    # Embed multiple documents
    docs = [
        "Paris is the capital of France",
        "Berlin is the capital of Germany",
        "Rome is the capital of Italy",
    ]
    docs_result = await embeddings.aembed_documents(docs)
    print(f"Async document embeddings count: {len(docs_result)}")


await generate_embeddings_async()
Async query embedding dimension: 1024
Async document embeddings count: 3
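
The bare await above works in a notebook, where an event loop is already running. In a standalone script, drive the coroutine yourself, for example:

# In a plain Python script (no running event loop), use asyncio.run instead of `await`
if __name__ == "__main__":
    asyncio.run(generate_embeddings_async())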

Document Similarity Example

import numpy as np
from scipy.spatial.distance import cosine

# Create some documents
documents = [
    "Machine learning algorithms build mathematical models based on sample data",
    "Deep learning uses neural networks with many layers",
    "Climate change is a major global environmental challenge",
    "Neural networks are inspired by the human brain's structure",
]

# Embed the documents
embeddings_list = embeddings.embed_documents(documents)


# Function to calculate similarity
def calculate_similarity(embedding1, embedding2):
    return 1 - cosine(embedding1, embedding2)


# Print similarity matrix
print("Document Similarity Matrix:")
for i, emb_i in enumerate(embeddings_list):
    similarities = []
    for j, emb_j in enumerate(embeddings_list):
        similarity = calculate_similarity(emb_i, emb_j)
        similarities.append(f"{similarity:.4f}")
    print(f"Document {i + 1}: {similarities}")
Document Similarity Matrix:
Document 1: ['1.0000', '0.6005', '0.3542', '0.5788']
Document 2: ['0.6005', '1.0000', '0.4154', '0.6170']
Document 3: ['0.3542', '0.4154', '1.0000', '0.3528']
Document 4: ['0.5788', '0.6170', '0.3528', '1.0000']
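
The nested loop above is easy to follow but computes each pair separately in Python. Since numpy is already imported, the same matrix can be computed in one vectorized step; this is a sketch reusing embeddings_list from above.

# Vectorized alternative: L2-normalize the embeddings, then take a single dot product
emb = np.array(embeddings_list)
emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
print(np.round(emb @ emb.T, 4))  # same similarity matrix as above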

API Reference

For more details about the GreenNode Serverless AI API, visit the GreenNode Serverless AI documentation.