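Middleware

Middleware gives you tighter control over what happens inside an agent. The core agent loop calls a model, lets it choose tools to execute, and finishes when the model stops calling tools.

Middleware exposes hooks before and after each of these steps.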
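
As a rough illustration of where those hooks sit, the loop can be sketched in plain TypeScript without LangChain. Everything here (`runLoop`, `LoopHooks`, `ModelReply`, and the fake model) is a hypothetical stand-in chosen for this sketch, not the library's implementation.

```typescript
// Hypothetical sketch of an agent loop with hook points; not LangChain's implementation.
type ModelReply = { text: string; toolCall?: string };
type LoopHooks = {
  beforeModel?: (messages: string[]) => void;
  afterModel?: (reply: ModelReply) => void;
};

function runLoop(
  model: (messages: string[]) => ModelReply,
  tools: Record<string, () => string>,
  hooks: LoopHooks,
  input: string
): string[] {
  const messages = [input];
  // Safety cap so the sketch always terminates.
  for (let step = 0; step < 10; step++) {
    hooks.beforeModel?.(messages);
    const reply = model(messages);
    hooks.afterModel?.(reply);
    messages.push(reply.text);
    // The loop ends as soon as the model stops requesting tools.
    if (reply.toolCall === undefined) break;
    const tool: (() => string) | undefined = tools[reply.toolCall];
    messages.push(tool ? tool() : `unknown tool: ${reply.toolCall}`);
  }
  return messages;
}

// A fake model: asks for the weather tool once, then answers.
const fakeModel = (messages: string[]): ModelReply =>
  messages.length < 3
    ? { text: "calling weather", toolCall: "weather" }
    : { text: "It is sunny." };

const hookEvents: string[] = [];
const transcript = runLoop(
  fakeModel,
  { weather: () => "sunny" },
  {
    beforeModel: (m) => hookEvents.push(`before:${m.length}`),
    afterModel: (r) => hookEvents.push(`after:${r.text}`),
  },
  "What's the weather?"
);
```

In this sketch the hooks fire twice each: once around the tool-requesting reply and once around the final answer.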

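What can middleware do?

Monitor

Track agent behavior with logging, analytics, and debugging

Modify

Transform prompts, tool selection, and output formatting

Control

Add retries, fallbacks, and early-termination logic

Enforce

Apply rate limits, guardrails, and PII detection

Add middleware by passing it to createAgent: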

import {
  createAgent,
  summarizationMiddleware,
  humanInTheLoopMiddleware,
} from "langchain";

const agent = createAgent({
  model: "gpt-4o",
  tools: [...],
  middleware: [summarizationMiddleware, humanInTheLoopMiddleware],
});

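Built-in middleware

LangChain provides prebuilt middleware for common use cases.

Summarization

Automatically summarizes the conversation history when it approaches the token limit.

Ideal for

Long-running conversations that exceed the context window
Multi-turn dialogues with extensive history
Applications where preserving the full conversation context matters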

import { createAgent, summarizationMiddleware } from "langchain";

const agent = createAgent({
  model: "gpt-4o",
  tools: [weatherTool, calculatorTool],
  middleware: [
    summarizationMiddleware({
      model: "gpt-4o-mini",
      maxTokensBeforeSummary: 4000, // Trigger summarization at 4000 tokens
      messagesToKeep: 20, // Keep last 20 messages after summary
      summaryPrompt: "Custom prompt for summarization...", // Optional
    }),
  ],
});

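Configuration options

model

string

Required

The model used to generate summaries

maxTokensBeforeSummary

number

Token threshold that triggers summarization

messagesToKeep

number

Default: 20

Number of most recent messages to keep

tokenCounter

function

Custom token-counting function. Defaults to character-based counting.

summaryPrompt

string

Custom prompt template. If not specified, the built-in template is used.

summaryPrefix

string

Default: "## Previous conversation summary:"

Prefix for the summary message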
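
To make the interplay of the threshold, the kept messages, and the prefix concrete, here is a minimal sketch in plain TypeScript. It is not LangChain's implementation: `maybeSummarize`, `countTokens`, and the four-characters-per-token estimate are assumptions made for illustration, and the `summarize` callback stands in for the summary model.

```typescript
// Hypothetical sketch of threshold-triggered summarization; not LangChain's implementation.
type Msg = { role: "system" | "user" | "assistant"; content: string };

// Assumption: a crude character-based estimate of roughly 4 characters per token.
const countTokens = (msgs: Msg[]): number =>
  Math.ceil(msgs.reduce((n, m) => n + m.content.length, 0) / 4);

function maybeSummarize(
  msgs: Msg[],
  maxTokensBeforeSummary: number,
  messagesToKeep: number,
  summarize: (older: Msg[]) => string,
  summaryPrefix = "## Previous conversation summary:"
): Msg[] {
  // Below the threshold, or nothing old enough to fold away: leave the history alone.
  if (countTokens(msgs) < maxTokensBeforeSummary || msgs.length <= messagesToKeep) {
    return msgs;
  }
  const cut = msgs.length - messagesToKeep;
  const older = msgs.slice(0, cut);
  const recent = msgs.slice(cut);
  const summary: Msg = { role: "system", content: `${summaryPrefix}\n${summarize(older)}` };
  return [summary, ...recent];
}

const chatLog: Msg[] = Array.from({ length: 6 }, (_, i): Msg => ({
  role: i % 2 === 0 ? "user" : "assistant",
  content: "x".repeat(40),
}));

// 240 characters is about 60 estimated tokens, so a threshold of 50 triggers a summary.
const compacted = maybeSummarize(chatLog, 50, 2, (older) => `${older.length} earlier messages`);
```

Here the four oldest messages collapse into one prefixed summary message and the two most recent messages are kept verbatim.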

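Human-in-the-loop

Pauses agent execution before a tool call runs so that a human can approve, edit, or reject it.

Ideal for

High-stakes operations that require human approval (database writes, financial transactions)
Compliance workflows where human oversight is mandatory
Long-running conversations where human feedback steers the agent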

import { createAgent, humanInTheLoopMiddleware } from "langchain";

const agent = createAgent({
  model: "gpt-4o",
  tools: [readEmailTool, sendEmailTool],
  middleware: [
    humanInTheLoopMiddleware({
      interruptOn: {
        // Require approval, editing, or rejection for sending emails
        send_email: {
          allowAccept: true,
          allowEdit: true,
          allowRespond: true,
        },
        // Auto-approve reading emails
        read_email: false,
      }
    })
  ]
});

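Configuration options

interruptOn

object

Required

Mapping of tool names to approval configurations

Tool approval configuration options

allowAccept

boolean

Default: false

Whether approval is allowed

allowEdit

boolean

Default: false

Whether editing is allowed

allowRespond

boolean

Default: false

Whether responding/rejecting is allowed

Important: Human-in-the-loop middleware requires a checkpointer to persist state across interruptions. For complete examples and integration patterns, see the human-in-the-loop documentation.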
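
One way to picture how an interruptOn mapping is consulted is the small lookup below. It is a sketch under assumptions, not the library's code: `allowedActions` and the action labels are hypothetical, and treating unlisted tools as auto-approved is a guess rather than documented behavior.

```typescript
// Hypothetical sketch of resolving an interruptOn map; not LangChain's implementation.
type ApprovalConfig = { allowAccept?: boolean; allowEdit?: boolean; allowRespond?: boolean };
type InterruptOn = Record<string, ApprovalConfig | false>;

// Returns the reviewer actions offered for a tool call, or null when it runs unattended.
function allowedActions(interruptOn: InterruptOn, toolName: string): string[] | null {
  const entry: ApprovalConfig | false | undefined = interruptOn[toolName];
  // Assumption: tools that are unlisted or mapped to false are auto-approved.
  if (entry === undefined || entry === false) return null;
  const actions: string[] = [];
  if (entry.allowAccept) actions.push("accept");
  if (entry.allowEdit) actions.push("edit");
  if (entry.allowRespond) actions.push("respond");
  return actions;
}

const policy: InterruptOn = {
  send_email: { allowAccept: true, allowEdit: true, allowRespond: true },
  read_email: false,
};

const sendActions = allowedActions(policy, "send_email"); // ["accept", "edit", "respond"]
const readActions = allowedActions(policy, "read_email"); // null
```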

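Anthropic prompt caching

Reduces cost by caching repeated prompt prefixes with Anthropic models.

Ideal for

Applications with long, repeated system prompts
Agents that reuse the same context across many invocations
Reducing API costs in high-throughput deployments

Learn more about Anthropic prompt caching strategies and limitations.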

import { createAgent, HumanMessage, anthropicPromptCachingMiddleware } from "langchain";

const LONG_PROMPT = `
Please be a helpful assistant.

<Lots more context ...>
`;

const agent = createAgent({
  model: "claude-sonnet-4-5-20250929",
  prompt: LONG_PROMPT,
  middleware: [anthropicPromptCachingMiddleware({ ttl: "5m" })],
});

// cache store
await agent.invoke({
  messages: [new HumanMessage("Hi, my name is Bob")]
});

// cache hit, system prompt is cached
const result = await agent.invoke({
  messages: [new HumanMessage("What's my name?")]
});

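Configuration options

ttl

string

Default: "5m"

Time to live for cached content. Valid values: "5m" or "1h"

Model call limit

Limits the number of model calls to prevent infinite loops or excessive cost.

Ideal for

Preventing runaway agents from making too many API calls
Enforcing cost controls in production deployments
Testing agent behavior within a specific call budget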

import { createAgent, modelCallLimitMiddleware } from "langchain";

const agent = createAgent({
  model: "gpt-4o",
  tools: [...],
  middleware: [
    modelCallLimitMiddleware({
      threadLimit: 10, // Max 10 calls per thread (across runs)
      runLimit: 5, // Max 5 calls per run (single invocation)
      exitBehavior: "end", // Or "error" to throw exception
    }),
  ],
});

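Configuration options

threadLimit

number

Maximum number of model calls across all runs in a thread. Defaults to no limit.

runLimit

number

Maximum number of model calls per single invocation. Defaults to no limit.

exitBehavior

string

Default: "end"

Behavior when a limit is reached. Options: "end" (graceful termination) or "error" (throw an exception)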
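
The difference between the thread budget and the run budget can be sketched with two counters, one that resets per invocation and one that does not. The `CallBudget` class below is a hypothetical illustration in plain TypeScript, not the middleware's actual implementation.

```typescript
// Hypothetical sketch of thread/run call budgets; not LangChain's implementation.
type LimitOptions = { threadLimit?: number; runLimit?: number; exitBehavior?: "end" | "error" };

class CallBudget {
  private threadCalls = 0;
  private runCalls = 0;
  private readonly opts: LimitOptions;

  constructor(opts: LimitOptions) {
    this.opts = opts;
  }

  // Call at the start of each invocation: the run counter resets, the thread counter persists.
  startRun(): void {
    this.runCalls = 0;
  }

  // Returns "continue" if another model call is allowed, "end" to stop gracefully.
  beforeCall(): "continue" | "end" {
    const { threadLimit = Infinity, runLimit = Infinity, exitBehavior = "end" } = this.opts;
    if (this.threadCalls >= threadLimit || this.runCalls >= runLimit) {
      if (exitBehavior === "error") throw new Error("Model call limit reached");
      return "end";
    }
    this.threadCalls++;
    this.runCalls++;
    return "continue";
  }
}

const budget = new CallBudget({ threadLimit: 3, runLimit: 2 });

budget.startRun();
// The third call in this run is stopped by runLimit.
const firstRun = [budget.beforeCall(), budget.beforeCall(), budget.beforeCall()];

budget.startRun();
// The run counter reset, but only one call remains in the thread budget.
const secondRun = [budget.beforeCall(), budget.beforeCall()];
```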

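Tool call limit

Limits the number of calls to a specific tool or to all tools.

Ideal for

Preventing excessive calls to expensive external APIs
Limiting web searches or database queries
Enforcing rate limits on specific tool usage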

import { createAgent, toolCallLimitMiddleware } from "langchain";

// Limit all tool calls
const globalLimiter = toolCallLimitMiddleware({ threadLimit: 20, runLimit: 10 });

// Limit specific tool
const searchLimiter = toolCallLimitMiddleware({
  toolName: "search",
  threadLimit: 5,
  runLimit: 3,
});

const agent = createAgent({
  model: "gpt-4o",
  tools: [...],
  middleware: [globalLimiter, searchLimiter],
});

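Configuration options

toolName

string

The specific tool to limit. If not provided, the limit applies to all tools.

threadLimit

number

Maximum number of tool calls across all runs in a thread. Defaults to no limit.

runLimit

number

Maximum number of tool calls per single invocation. Defaults to no limit.

exitBehavior

string

Default: "end"

Behavior when a limit is reached. Options: "end" (graceful termination) or "error" (throw an exception)

Model fallback

Automatically falls back to alternative models when the primary model fails.

Ideal for

Building resilient agents that tolerate model outages
Optimizing cost by falling back to cheaper models
Redundancy across providers such as OpenAI and Anthropic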

import { createAgent, modelFallbackMiddleware } from "langchain";

const agent = createAgent({
  model: "gpt-4o", // Primary model
  tools: [...],
  middleware: [
    modelFallbackMiddleware(
      "gpt-4o-mini", // Try first on error
      "claude-3-5-sonnet-20241022" // Then this
    ),
  ],
});

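Configuration options

The middleware accepts a variable number of string arguments naming the fallback models, in order.

...models

string[]

Required

One or more fallback model strings to try in order when the primary model fails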

modelFallbackMiddleware(
  "first-fallback-model",
  "second-fallback-model",
  // ... more models
)
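
The ordered-fallback idea can be sketched independently of LangChain as a loop over candidate models. In this sketch `callWithFallback` and `ModelFn` are hypothetical names, and the models are synchronous functions for brevity; real model calls would be awaited.

```typescript
// Hypothetical sketch of ordered fallback; not LangChain's implementation.
type ModelFn = (prompt: string) => string;

function callWithFallback(prompt: string, primary: ModelFn, ...fallbacks: ModelFn[]): string {
  let lastError: unknown;
  for (const model of [primary, ...fallbacks]) {
    try {
      return model(prompt);
    } catch (e) {
      lastError = e; // Remember the failure and move on to the next model.
    }
  }
  // Every model failed: surface the most recent error.
  throw lastError;
}

const attempts: string[] = [];
const failing = (label: string): ModelFn => () => {
  attempts.push(label);
  throw new Error(`${label} unavailable`);
};
const working = (label: string): ModelFn => (prompt) => {
  attempts.push(label);
  return `${label}: ${prompt}`;
};

const answer = callWithFallback("hi", failing("primary"), failing("fallback-1"), working("fallback-2"));
```

Each model is tried at most once, in the order given, and the first successful reply ends the search.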

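PII detection

Detects and handles personally identifiable information in conversations.

Ideal for

Healthcare and financial applications with compliance requirements
Customer-service agents that need sanitized logs
Any application that handles sensitive user data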

import { createAgent, piiRedactionMiddleware } from "langchain";

const agent = createAgent({
  model: "gpt-4o",
  tools: [...],
  middleware: [
    // Redact emails in user input
    piiRedactionMiddleware({
      piiType: "email",
      strategy: "redact",
      applyToInput: true,
    }),
    // Mask credit cards (show last 4 digits)
    piiRedactionMiddleware({
      piiType: "credit_card",
      strategy: "mask",
      applyToInput: true,
    }),
    // Custom PII type with regex
    piiRedactionMiddleware({
      piiType: "api_key",
      detector: /sk-[a-zA-Z0-9]{32}/,
      strategy: "block", // Throw error if detected
    }),
  ],
});

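Configuration options

piiType

string

Required

The type of PII to detect. Either a built-in type (email, credit_card, ip, mac_address, url) or a custom type name.

strategy

string

Default: "redact"

How to handle detected PII. Options:

"block" - Throw an error when detected
"redact" - Replace with [REDACTED_TYPE]
"mask" - Partially mask (for example, ****-****-****-1234)
"hash" - Replace with a deterministic hash

detector

RegExp

Custom detector regular expression. If not provided, the built-in detector for the PII type is used.

applyToInput

boolean

Default: true

Check user messages before the model call

applyToOutput

boolean

Default: false

Check AI messages after the model call

applyToToolResults

boolean

Default: false

Check tool result messages after execution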
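
The four strategies can be compared side by side with a small text-rewriting function. This is a sketch under assumptions: `applyStrategy`, the two regular expressions, the `djb2` stand-in hash, and the `<type_hash:...>` output format are all hypothetical and are not the detectors or formats the library uses.

```typescript
// Hypothetical sketch of the four strategies; not LangChain's implementation.
type Strategy = "block" | "redact" | "mask" | "hash";

// Stand-in deterministic hash (djb2, wrapped to 32 bits); an assumption, not the library's hash.
function djb2(text: string): string {
  let h = 5381;
  for (let i = 0; i < text.length; i++) {
    h = ((h << 5) + h + text.charCodeAt(i)) | 0; // h * 33 + char code
  }
  return (h >>> 0).toString(16);
}

function applyStrategy(text: string, piiType: string, detector: RegExp, strategy: Strategy): string {
  // Always scan globally so every occurrence is handled.
  const pattern = new RegExp(detector.source, detector.ignoreCase ? "gi" : "g");
  if (strategy === "block") {
    if (pattern.test(text)) throw new Error(`Detected ${piiType}`);
    return text;
  }
  return text.replace(pattern, (match) => {
    if (strategy === "redact") return `[REDACTED_${piiType.toUpperCase()}]`;
    if (strategy === "hash") return `<${piiType}_hash:${djb2(match)}>`;
    // "mask": hide every letter and digit except the last four characters, keeping separators.
    const head = match.slice(0, -4).replace(/[A-Za-z0-9]/g, "*");
    return head + match.slice(-4);
  });
}

// Simplified patterns for illustration only.
const emailPattern = /[\w.+-]+@[\w-]+\.[\w.]+/;
const cardPattern = /\b\d{4}-\d{4}-\d{4}-\d{4}\b/;

const redacted = applyStrategy("Mail bob@example.com now", "email", emailPattern, "redact");
// "Mail [REDACTED_EMAIL] now"
const masked = applyStrategy("Card 1234-5678-9012-3456", "credit_card", cardPattern, "mask");
// "Card ****-****-****-3456"
const hashed = applyStrategy("Mail bob@example.com now", "email", emailPattern, "hash");
```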

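Planning

Adds to-do list management for complex multi-step tasks.

This middleware automatically gives the agent a write_todos tool and a system prompt that guides effective task planning.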

import { createAgent, HumanMessage, todoListMiddleware } from "langchain";

const agent = createAgent({
  model: "gpt-4o",
  tools: [
    /* ... */
  ],
  middleware: [todoListMiddleware()] as const,
});

const result = await agent.invoke({
  messages: [new HumanMessage("Help me refactor my codebase")],
});
console.log(result.todos); // Array of todo items with status tracking

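Configuration options

No configuration options are available (defaults are used).

LLM tool selector

Uses an LLM to select the relevant tools before the main model is called.

Ideal for

Agents with many tools (10+) where only a few are relevant to each query
Reducing token usage by filtering out irrelevant tools
Improving model focus and accuracy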

import { createAgent, llmToolSelectorMiddleware } from "langchain";

const agent = createAgent({
  model: "gpt-4o",
  tools: [tool1, tool2, tool3, tool4, tool5, ...], // Many tools
  middleware: [
    llmToolSelectorMiddleware({
      model: "gpt-4o-mini", // Use cheaper model for selection
      maxTools: 3, // Limit to 3 most relevant tools
      alwaysInclude: ["search"], // Always include certain tools
    }),
  ],
});

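Configuration options

model

string

The model used for tool selection. Defaults to the agent's main model.

maxTools

number

Maximum number of tools to select. Defaults to no limit.

alwaysInclude

string[]

Array of tool names that are always included in the selection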
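
How maxTools and alwaysInclude might combine can be sketched as a filter over a ranked list. The function below is hypothetical: `selectTools` is not a LangChain API, the ranked list stands in for the selection model's answer, and the choice that pinned tools do not count against maxTools is an assumption rather than documented behavior.

```typescript
// Hypothetical sketch of tool pre-selection; not LangChain's implementation.
type SelectorOptions = { maxTools?: number; alwaysInclude?: string[] };

function selectTools(
  toolNames: string[],
  rankedByRelevance: string[], // Stand-in for the selection model's ranked answer.
  opts: SelectorOptions
): string[] {
  const { maxTools = Infinity, alwaysInclude = [] } = opts;
  // Ignore names that are not registered, and keep pinned tools out of the ranked budget.
  const ranked = rankedByRelevance.filter(
    (n) => toolNames.includes(n) && !alwaysInclude.includes(n)
  );
  // Assumption: pinned tools do not count against maxTools.
  const pinned = alwaysInclude.filter((n) => toolNames.includes(n));
  return [...pinned, ...ranked.slice(0, maxTools)];
}

const chosen = selectTools(
  ["search", "weather", "calculator", "calendar", "email"],
  ["calendar", "email", "weather", "made_up_tool", "search"],
  { maxTools: 2, alwaysInclude: ["search"] }
);
// ["search", "calendar", "email"]
```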

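Context editing

Manages conversation context by trimming, summarizing, or clearing tool uses.

Ideal for

Long-running conversations that need periodic context cleanup
Removing failed tool attempts from the context
Custom context-management strategies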

import { createAgent, contextEditingMiddleware, ClearToolUsesEdit } from "langchain";

const agent = createAgent({
  model: "gpt-4o",
  tools: [...],
  middleware: [
    contextEditingMiddleware({
      edits: [
        new ClearToolUsesEdit({ maxTokens: 1000 }), // Clear old tool uses
      ],
    }),
  ],
});

Configuration options

edits

ContextEdit[]

Default: [new ClearToolUsesEdit()]

Array of ContextEdit strategies to apply

ClearToolUsesEdit options

maxTokens

number

Default: 1000

Token count that triggers the edit
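
The idea of clearing older tool output once a token threshold is crossed can be sketched as follows. This is not how ClearToolUsesEdit is implemented: `clearToolUses`, the `keepLast` parameter, the "[cleared]" placeholder, and the four-characters-per-token estimate are all assumptions made for illustration.

```typescript
// Hypothetical sketch of clearing old tool results; not LangChain's implementation.
type CtxMsg = { role: "user" | "assistant" | "tool"; content: string };

// Assumption: roughly 4 characters per token.
const estimateTokens = (msgs: CtxMsg[]): number => {
  let chars = 0;
  for (const m of msgs) chars += m.content.length;
  return Math.ceil(chars / 4);
};

function clearToolUses(msgs: CtxMsg[], maxTokens: number, keepLast = 1): CtxMsg[] {
  // Under the threshold nothing is edited.
  if (estimateTokens(msgs) <= maxTokens) return msgs;
  // Indices of tool results, oldest first; the newest `keepLast` stay intact.
  const toolIdx: number[] = [];
  msgs.forEach((m, i) => {
    if (m.role === "tool") toolIdx.push(i);
  });
  const clearable = toolIdx.slice(0, Math.max(0, toolIdx.length - keepLast));
  return msgs.map((m, i): CtxMsg =>
    clearable.includes(i) ? { role: m.role, content: "[cleared]" } : m
  );
}

const convo: CtxMsg[] = [
  { role: "user", content: "Find flights" },
  { role: "tool", content: "x".repeat(600) },
  { role: "assistant", content: "Found some. Checking hotels." },
  { role: "tool", content: "y".repeat(600) },
];

// About 310 estimated tokens exceeds the threshold of 100, so the older tool result is cleared.
const trimmed = clearToolUses(convo, 100);
```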

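Custom middleware

Build custom middleware by implementing hooks that run at specific points in the agent's execution flow.

Class-based middleware

Two hook styles

Node-style hooks

Run sequentially at specific execution points. Use them for logging, validation, and state updates.

Wrap-style hooks

Intercept execution with full control over the handler call. Use them for retries, caching, and transformation.

Node-style hooks

Run at specific points in the execution flow:

beforeAgent - Before the agent starts (once per invocation)
beforeModel - Before each model call
afterModel - After each model response
afterAgent - After the agent finishes (at most once per invocation)

Example: logging middleware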

import { createMiddleware } from "langchain";

const loggingMiddleware = createMiddleware({
  name: "LoggingMiddleware",
  beforeModel: (state) => {
    console.log(`About to call model with ${state.messages.length} messages`);
    return;
  },
  afterModel: (state) => {
    const lastMessage = state.messages[state.messages.length - 1];
    console.log(`Model returned: ${lastMessage.content}`);
    return;
  },
});

Example: conversation length limit

import { createMiddleware, AIMessage } from "langchain";

const createMessageLimitMiddleware = (maxMessages: number = 50) => {
  return createMiddleware({
    name: "MessageLimitMiddleware",
    beforeModel: (state) => {
      if (state.messages.length >= maxMessages) {
        return {
          messages: [new AIMessage("Conversation limit reached.")],
          jumpTo: "end",
        };
      }
      return;
    },
  });
};

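Wrap-style hooks

Intercept execution and control when the handler is called:

wrapModelCall - Around each model call
wrapToolCall - Around each tool call

You decide whether the handler is called zero times (short-circuit), once (normal flow), or multiple times (retry logic).

Example: model retry middleware

import { createMiddleware } from "langchain";

const createRetryMiddleware = (maxRetries: number = 3) => {
  return createMiddleware({
    name: "RetryMiddleware",
    wrapModelCall: async (request, handler) => {
      for (let attempt = 0; attempt < maxRetries; attempt++) {
        try {
          // Await inside the try block so rejected promises are caught and retried
          return await handler(request);
        } catch (e) {
          // Out of attempts: rethrow the last error
          if (attempt === maxRetries - 1) {
            throw e;
          }
          console.log(`Retry ${attempt + 1}/${maxRetries} after error: ${e}`);
        }
      }
      throw new Error("Unreachable");
    },
  });
};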
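
The retry example covers the "multiple times" case. The "zero times" case, where a wrap hook short-circuits and never reaches the handler, can be sketched with a cache. This sketch is plain TypeScript with hypothetical names (`createCacheWrap`, `WrapHook`, `CallHandler`) and a synchronous handler for brevity; it does not use the LangChain request type.

```typescript
// Hypothetical sketch of a short-circuiting wrap hook; not LangChain's implementation.
type CallHandler = (prompt: string) => string;
type WrapHook = (prompt: string, handler: CallHandler) => string;

function createCacheWrap(): WrapHook {
  const cache = new Map<string, string>();
  return (prompt, handler) => {
    // Zero handler calls on a hit: the cached answer is returned directly.
    const hit = cache.get(prompt);
    if (hit !== undefined) return hit;
    // Exactly one handler call on a miss.
    const reply = handler(prompt);
    cache.set(prompt, reply);
    return reply;
  };
}

let modelCalls = 0;
const countingModel: CallHandler = (prompt) => {
  modelCalls++;
  return `echo: ${prompt}`;
};

const cachedCall = createCacheWrap();
const firstReply = cachedCall("hello", countingModel);
const secondReply = cachedCall("hello", countingModel); // Served from the cache.
```

After both calls the underlying model has run only once, because the second call never reaches the handler.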

Example: dynamic model selection

import { createMiddleware, initChatModel } from "langchain";

const dynamicModelMiddleware = createMiddleware({
  name: "DynamicModelMiddleware",
  wrapModelCall: async (request, handler) => {
    // Use a different model based on conversation length.
    // initChatModel returns a Promise, so it must be awaited.
    const modelName = request.messages.length > 10 ? "gpt-4o" : "gpt-4o-mini";
    const modifiedRequest = { ...request, model: await initChatModel(modelName) };
    return handler(modifiedRequest);
  },
});

Example: tool call monitoring

import { createMiddleware } from "langchain";

const toolMonitoringMiddleware = createMiddleware({
  name: "ToolMonitoringMiddleware",
  wrapToolCall: async (request, handler) => {
    console.log(`Executing tool: ${request.toolCall.name}`);
    console.log(`Arguments: ${JSON.stringify(request.toolCall.args)}`);

    try {
      // Await so that asynchronous tool failures reach the catch block
      const result = await handler(request);
      console.log("Tool completed successfully");
      return result;
    } catch (e) {
      console.log(`Tool failed: ${e}`);
      throw e;
    }
  },
});

Custom state schema

Middleware can extend the agent's state with custom properties. Define a custom state type and set it as the middleware's stateSchema:

import { createMiddleware, createAgent, HumanMessage } from "langchain";
import * as z from "zod";

// Middleware with custom state requirements
const callCounterMiddleware = createMiddleware({
  name: "CallCounterMiddleware",
  stateSchema: z.object({
    modelCallCount: z.number().default(0),
    userId: z.string().optional(),
  }),
  beforeModel: (state) => {
    // Access custom state properties
    if (state.modelCallCount > 10) {
      return { jumpTo: "end" };
    }
    return;
  },
  afterModel: (state) => {
    // Update custom state
    return { modelCallCount: state.modelCallCount + 1 };
  },
});

const agent = createAgent({
  model: "gpt-4o",
  tools: [...],
  middleware: [callCounterMiddleware] as const,
});

// TypeScript enforces required state properties
const result = await agent.invoke({
  messages: [new HumanMessage("Hello")],
  modelCallCount: 0, // Optional due to default value
  userId: "user-123", // Optional
});

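Context extension

Context properties are configuration values passed through the runnable config. Unlike state, context is read-only and is typically used for configuration that does not change during execution. Middleware can define context requirements that must be satisfied through the agent's configuration: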

import * as z from "zod";
import { createMiddleware, HumanMessage } from "langchain";

const rateLimitMiddleware = createMiddleware({
  name: "RateLimitMiddleware",
  contextSchema: z.object({
    maxRequestsPerMinute: z.number(),
    apiKey: z.string(),
  }),
  beforeModel: async (state, runtime) => {
    // Access context through runtime
    const { maxRequestsPerMinute, apiKey } = runtime.context;

    // Implement rate limiting logic
    const allowed = await checkRateLimit(apiKey, maxRequestsPerMinute);
    if (!allowed) {
      return { jumpTo: "end" };
    }

    return state;
  },
});

// Context is provided through config
await agent.invoke(
  { messages: [new HumanMessage("Process data")] },
  {
    context: {
      maxRequestsPerMinute: 60,
      apiKey: "api-key-123",
    },
  }
);

Execution order

When you use multiple middleware, it is important to understand the order in which they execute:

const agent = createAgent({
  model: "gpt-4o",
  middleware: [middleware1, middleware2, middleware3],
  tools: [...],
});

Execution flow

Before hooks run in order:

middleware1.beforeAgent()
middleware2.beforeAgent()
middleware3.beforeAgent()

Agent loop starts

middleware1.beforeModel()
middleware2.beforeModel()
middleware3.beforeModel()

Wrap hooks nest like function calls:

middleware1.wrapModelCall() → middleware2.wrapModelCall() → middleware3.wrapModelCall() → model

After hooks run in reverse order:

middleware3.afterModel()
middleware2.afterModel()
middleware1.afterModel()

Agent loop ends

middleware3.afterAgent()
middleware2.afterAgent()
middleware1.afterAgent()

Key rules

before* hooks: first to last
after* hooks: last to first (reverse)
wrap* hooks: nested (the first middleware wraps all the others)
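
These ordering rules can be checked with a small simulation of one model step. The sketch is plain TypeScript and does not use LangChain: `runModelStep`, `OrderHooks`, and `tracer` are hypothetical names, and each hook only records a label so the resulting order can be inspected.

```typescript
// Hypothetical sketch of hook ordering for one model step; not LangChain's implementation.
type OrderHooks = {
  beforeModel?: (log: string[]) => void;
  afterModel?: (log: string[]) => void;
  wrapModelCall?: (log: string[], next: () => void) => void;
};

function runModelStep(stack: OrderHooks[], log: string[]): void {
  // before hooks: first to last.
  for (const m of stack) m.beforeModel?.(log);

  // wrap hooks: build from the last middleware inward so the first ends up outermost.
  const reversed = [...stack].reverse();
  let call: () => void = () => {
    log.push("model");
  };
  for (const m of reversed) {
    const wrap = m.wrapModelCall;
    if (wrap) {
      const inner = call;
      call = () => wrap(log, inner);
    }
  }
  call();

  // after hooks: last to first.
  for (const m of reversed) m.afterModel?.(log);
}

// A middleware that records every hook it runs.
const tracer = (label: string): OrderHooks => ({
  beforeModel: (log) => {
    log.push(`${label}.before`);
  },
  afterModel: (log) => {
    log.push(`${label}.after`);
  },
  wrapModelCall: (log, next) => {
    log.push(`${label}.wrap:enter`);
    next();
    log.push(`${label}.wrap:exit`);
  },
});

const orderLog: string[] = [];
runModelStep([tracer("m1"), tracer("m2"), tracer("m3")], orderLog);
```

The recorded log lists the before entries for m1, m2, m3, then the wrap entries nested around "model" with m1 outermost, then the after entries for m3, m2, m1.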

Agent jumps

To exit early from middleware, return an object that contains jumpTo:

import { createMiddleware, AIMessage } from "langchain";

const earlyExitMiddleware = createMiddleware({
  name: "EarlyExitMiddleware",
  beforeModel: (state) => {
    // Check some condition
    if (shouldExit(state)) {
      return {
        messages: [new AIMessage("Exiting early due to condition.")],
        jumpTo: "end",
      };
    }
    return;
  },
});

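Available jump targets

"end": Jump to the end of agent execution
"tools": Jump to the tools node
"model": Jump to the model node (or the first beforeModel hook)

Important: When jumping from beforeModel or afterModel, jumping to "model" causes all beforeModel middleware to run again. To enable jumping in the Python API, decorate your hook with @hook_config(can_jump_to=[...]). A TypeScript hook that returns jumpTo looks like this: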

import { createMiddleware } from "langchain";

const conditionalMiddleware = createMiddleware({
  name: "ConditionalMiddleware",
  afterModel: (state) => {
    if (someCondition(state)) {
      return { jumpTo: "end" };
    }
    return;
  },
});

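Best practices

Keep middleware focused: each middleware should do one thing well
Handle errors gracefully: do not let middleware errors crash the agent
Use the appropriate hook type:
- Node-style for sequential logic (logging, validation)
- Wrap-style for control flow (retries, fallbacks, caching)
Clearly document any custom state properties
Unit test middleware in isolation before integrating it
Consider execution order: place critical middleware first in the list
Use built-in middleware whenever possible instead of reinventing the wheel :)

Examples

Dynamic tool selection

Select relevant tools at runtime to improve performance and accuracy.

Benefits

Shorter prompts - Reduce complexity by exposing only relevant tools
Higher accuracy - The model chooses correctly from fewer options
Permission control - Filter tools dynamically based on user access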

import { createAgent, createMiddleware } from "langchain";

const toolSelectorMiddleware = createMiddleware({
  name: "ToolSelector",
  wrapModelCall: (request, handler) => {
    // Select a small, relevant subset of tools based on state/context
    const relevantTools = selectRelevantTools(request.state, request.runtime);
    const modifiedRequest = { ...request, tools: relevantTools };
    return handler(modifiedRequest);
  },
});

const agent = createAgent({
  model: "gpt-4o",
  tools: allTools, // All available tools need to be registered upfront
  // Middleware can be used to select a smaller subset that's relevant for the given run.
  middleware: [toolSelectorMiddleware],
});

Extended example: GitHub vs. GitLab tool selection

import * as z from "zod";
import { createAgent, createMiddleware, tool, HumanMessage } from "langchain";

const githubCreateIssue = tool(
  async ({ repo, title }) => ({
    url: `https://github.com/${repo}/issues/1`,
    title,
  }),
  {
    name: "github_create_issue",
    description: "Create an issue in a GitHub repository",
    schema: z.object({ repo: z.string(), title: z.string() }),
  }
);

const gitlabCreateIssue = tool(
  async ({ project, title }) => ({
    url: `https://gitlab.com/${project}/-/issues/1`,
    title,
  }),
  {
    name: "gitlab_create_issue",
    description: "Create an issue in a GitLab project",
    schema: z.object({ project: z.string(), title: z.string() }),
  }
);

const allTools = [githubCreateIssue, gitlabCreateIssue];

const toolSelector = createMiddleware({
  name: "toolSelector",
  contextSchema: z.object({ provider: z.enum(["github", "gitlab"]) }),
  wrapModelCall: (request, handler) => {
    const provider = request.runtime.context.provider;
    const toolName = provider === "gitlab" ? "gitlab_create_issue" : "github_create_issue";
    const selectedTools = request.tools.filter((t) => t.name === toolName);
    const modifiedRequest = { ...request, tools: selectedTools };
    return handler(modifiedRequest);
  },
});

const agent = createAgent({
  model: "gpt-4o",
  tools: allTools,
  middleware: [toolSelector],
});

// Invoke with GitHub context
await agent.invoke(
  {
    messages: [
      new HumanMessage("Open an issue titled 'Bug: where are the cats' in the repository `its-a-cats-game`"),
    ],
  },
  {
    context: { provider: "github" },
  }
);

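Key points

Register all tools up front
Middleware selects the relevant subset per request
Use contextSchema to define configuration requirements

Additional resources

Middleware API reference - Complete guide to custom middleware
Human-in-the-loop - Add human review for sensitive operations
Testing agents - Strategies for testing safety mechanisms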
