Titan Takeoff

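TitanML helps businesses build and deploy better, smaller, cheaper, and faster NLP models through our training, compression, and inference optimization platform. Our inference server, Titan Takeoff, lets you deploy LLMs locally on your own hardware with a single command. Most embedding models are supported out of the box; if you run into trouble with a specific model, please let us know at hello@titanml.co.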

Example usage

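Here are some helpful examples to get started with the Titan Takeoff server. Before running these commands, make sure the Takeoff server has been started in the background. For more information, see the documentation page for launching Takeoff.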

import time

from langchain_community.embeddings import TitanTakeoffEmbed

Example 1

Basic usage, assuming Takeoff is running on your machine on its default port (i.e. localhost:3000).

embed = TitanTakeoffEmbed()
output = embed.embed_query(
    "What is the weather in London in August?", consumer_group="embed"
)
print(output)

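Example 2

Starting readers with the TitanTakeoffEmbed Python wrapper. If you did not create any readers when you first launched Takeoff, or you want to add another, you can do so when you initialize the TitanTakeoffEmbed object: pass the list of models you want to start as the models parameter. To embed several documents at once, use embed.embed_documents. It expects a list of strings, rather than the single string expected by embed_query.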

# Model config for the embedding model, where you can specify the following parameters:
#   model_name (str): The name of the model to use
#   device (str): The device to use for inference, cuda or cpu
#   consumer_group (str): The consumer group to place the reader into
embedding_model = {
    "model_name": "BAAI/bge-large-en-v1.5",
    "device": "cpu",
    "consumer_group": "embed",
}
embed = TitanTakeoffEmbed(models=[embedding_model])

# The model needs time to spin up; how long depends on the model size and your network connection speed
time.sleep(60)

prompt = "What is the capital of France?"
# The reader was placed in the "embed" consumer group, so send the request to that same group to reach this embedding model and not another
output = embed.embed_query(prompt, consumer_group="embed")
print(output)
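
The fixed 60-second sleep in Example 2 is only a guess at how long the model takes to load, and the example never shows the multi-document call. The sketch below covers both: it retries embed_documents until the reader answers or a timeout expires. The helper embed_when_ready and its timeout and interval parameters are hypothetical and not part of LangChain or Takeoff. It also assumes that a request sent before the reader is ready raises an exception, and that embed_documents accepts a consumer_group keyword in the same way embed_query does.

```python
import time
from typing import Any, List, Optional


def embed_when_ready(
    embedder: Any,
    texts: List[str],
    consumer_group: str = "embed",
    timeout: float = 120.0,
    interval: float = 5.0,
) -> List[List[float]]:
    """Retry embed_documents until it succeeds or the timeout expires.

    Hypothetical helper for illustration; `embedder` is any object that
    exposes embed_documents(texts, consumer_group=...).
    """
    deadline = time.monotonic() + timeout
    last_error: Optional[Exception] = None
    while True:
        try:
            # One call embeds the whole list of strings.
            return embedder.embed_documents(texts, consumer_group=consumer_group)
        except Exception as exc:
            # Assumption: a reader that is still loading surfaces as an exception.
            last_error = exc
        # Give up if another wait would run past the deadline.
        if time.monotonic() + interval > deadline:
            raise TimeoutError(
                f"reader not ready after {timeout} seconds"
            ) from last_error
        time.sleep(interval)
```

With the embed object from Example 2, a call such as embed_when_ready(embed, ["What is the capital of France?", "What is the weather in London in August?"]) could stand in for the time.sleep(60) line and would return one vector per input string. Catching every exception keeps the sketch short; in practice you would likely narrow it to connection errors so that genuine configuration mistakes fail fast.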
