Guardrails help you build safe, compliant AI applications by validating and filtering content at key points in agent execution. They can detect sensitive information, enforce content policies, validate outputs, and prevent unsafe behavior before problems occur. Common use cases include:
  • Preventing PII leaks
  • Detecting and blocking prompt injection attacks
  • Blocking inappropriate or harmful content
  • Enforcing business rules and compliance requirements
  • Validating output quality and accuracy
You can implement guardrails with middleware, which intercepts execution at strategic points: before the agent starts, after it completes, or around model and tool calls.
Middleware flow diagram
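A custom guardrail is simply a middleware class that overrides the hooks it needs. As a minimal skeleton (using only the before_agent and after_agent hooks demonstrated later on this page), it might look like the following; returning None from a hook leaves the agent state unchanged:
from typing import Any

from langchain.agents.middleware import AgentMiddleware, AgentState
from langgraph.runtime import Runtime


class NoOpGuardrail(AgentMiddleware):
    """Skeleton middleware marking the points where guardrail checks can run."""

    def before_agent(self, state: AgentState, runtime: Runtime) -> dict[str, Any] | None:
        # Runs once before the agent starts: validate or block the incoming request here
        return None

    def after_agent(self, state: AgentState, runtime: Runtime) -> dict[str, Any] | None:
        # Runs once after the agent finishes: inspect or replace the final response here
        return None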
Guardrails can be implemented using two complementary approaches:

Deterministic guardrails

Use rule-based logic such as regular expression patterns, keyword matching, or explicit checks. Fast, predictable, and inexpensive, but they may miss subtle violations.

Model-based guardrails

Use large language models (LLMs) or classifiers to evaluate content with semantic understanding. They catch nuanced issues that rules miss, but are slower and more expensive.
LangChain provides built-in guardrails (such as PII detection and human-in-the-loop) as well as a flexible middleware system for building custom guardrails with either approach.

Built-in guardrails

PII detection

LangChain provides built-in middleware for detecting and handling personally identifiable information (PII) in conversations. It can detect common PII types such as emails, credit cards, and IP addresses. The PII detection middleware is helpful for healthcare and financial applications with compliance requirements, customer service agents that need sanitized logs, and any application that handles sensitive user data. The middleware supports several strategies for handling detected PII:
| Strategy | Description | Example |
| --- | --- | --- |
| Redact | Replace with [REDACTED_TYPE] | [REDACTED_EMAIL] |
| Mask | Partially obscure (e.g., keep only the last 4 digits) | ****-****-****-1234 |
| Hash | Replace with a deterministic hash | a8f5f167... |
| Block | Raise an exception when detected | Raises an error |
from langchain.agents import create_agent
from langchain.agents.middleware import PIIMiddleware


agent = create_agent(
    model="gpt-4o",
    tools=[customer_service_tool, email_tool],
    middleware=[
        # Redact emails in user input before sending to model
        PIIMiddleware(
            "email",
            strategy="redact",
            apply_to_input=True,
        ),
        # Mask credit cards in user input
        PIIMiddleware(
            "credit_card",
            strategy="mask",
            apply_to_input=True,
        ),
        # Block API keys - raise error if detected
        PIIMiddleware(
            "api_key",
            detector=r"sk-[a-zA-Z0-9]{32}",
            strategy="block",
            apply_to_input=True,
        ),
    ],
)

# When user provides PII, it will be handled according to the strategy
result = agent.invoke({
    "messages": [{"role": "user", "content": "My email is john.doe@example.com and card is 4532-1234-5678-9010"}]
})
Built-in PII types
  • email - Email addresses
  • credit_card - Credit card numbers (validated with the Luhn algorithm)
  • ip - IP addresses
  • mac_address - MAC addresses
  • url - URLs
Configuration options
| Parameter | Description | Default |
| --- | --- | --- |
| pii_type | PII type to detect (built-in or custom) | Required |
| strategy | How to handle detected PII ("block", "redact", "mask", "hash") | "redact" |
| detector | Custom detector function or regex pattern | None (use the built-in detector) |
| apply_to_input | Check user messages before the model call | True |
| apply_to_output | Check AI messages after the model call | False |
| apply_to_tool_results | Check tool result messages after execution | False |
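The example above covers input checking; as a hedged sketch, the same middleware can also be pointed at model output and tool results using the apply_to_output and apply_to_tool_results options from this table (lookup_customer_tool is a hypothetical tool name used only for illustration):
from langchain.agents import create_agent
from langchain.agents.middleware import PIIMiddleware


agent = create_agent(
    model="gpt-4o",
    tools=[lookup_customer_tool],  # hypothetical tool, for illustration only
    middleware=[
        # Redact emails the model might echo back in its replies
        PIIMiddleware("email", strategy="redact", apply_to_output=True),
        # Hash IP addresses in tool result messages
        PIIMiddleware("ip", strategy="hash", apply_to_tool_results=True),
    ],
)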
For complete details on PII detection, see the middleware documentation.

Human-in-the-loop

LangChain provides built-in middleware for requiring human approval before sensitive operations are executed. This is one of the most effective guardrails for high-stakes decisions. The human-in-the-loop middleware is helpful for situations such as financial transactions and transfers, deleting or modifying production data, sending communications to external parties, and any action with significant business impact.
from langchain.agents import create_agent
from langchain.agents.middleware import HumanInTheLoopMiddleware
from langgraph.checkpoint.memory import InMemorySaver
from langgraph.types import Command


agent = create_agent(
    model="gpt-4o",
    tools=[search_tool, send_email_tool, delete_database_tool],
    middleware=[
        HumanInTheLoopMiddleware(
            interrupt_on={
                # Require approval for sensitive operations
                "send_email": True,
                "delete_database": True,
                # Auto-approve safe operations
                "search": False,
            }
        ),
    ],
    # Persist the state across interrupts
    checkpointer=InMemorySaver(),
)

# Human-in-the-loop requires a thread ID for persistence
config = {"configurable": {"thread_id": "some_id"}}

# Agent will pause and wait for approval before executing sensitive tools
result = agent.invoke(
    {"messages": [{"role": "user", "content": "Send an email to the team"}]},
    config=config
)

result = agent.invoke(
    Command(resume={"decisions": [{"type": "approve"}]}),
    config=config  # Same thread ID to resume the paused conversation
)
For complete details on implementing approval workflows, see the human-in-the-loop documentation.

Custom guardrails

For more sophisticated guardrails, you can create custom middleware that runs before or after agent execution. This gives you full control over validation logic, content filtering, and safety checks.

Before-agent guardrails

Use the before_agent hook to validate requests once at the start of each invocation. This is useful for session-level checks such as authentication, rate limiting, or blocking inappropriate requests before any processing begins.
from typing import Any

from langchain.agents.middleware import AgentMiddleware, AgentState, hook_config
from langgraph.runtime import Runtime

class ContentFilterMiddleware(AgentMiddleware):
    """Deterministic guardrail: Block requests containing banned keywords."""

    def __init__(self, banned_keywords: list[str]):
        super().__init__()
        self.banned_keywords = [kw.lower() for kw in banned_keywords]

    @hook_config(can_jump_to=["end"])
    def before_agent(self, state: AgentState, runtime: Runtime) -> dict[str, Any] | None:
        # Get the first user message
        if not state["messages"]:
            return None

        first_message = state["messages"][0]
        if first_message.type != "human":
            return None

        content = first_message.content.lower()

        # Check for banned keywords
        for keyword in self.banned_keywords:
            if keyword in content:
                # Block execution before any processing
                return {
                    "messages": [{
                        "role": "assistant",
                        "content": "I cannot process requests containing inappropriate content. Please rephrase your request."
                    }],
                    "jump_to": "end"
                }

        return None

# Use the custom guardrail
from langchain.agents import create_agent

agent = create_agent(
    model="gpt-4o",
    tools=[search_tool, calculator_tool],
    middleware=[
        ContentFilterMiddleware(
            banned_keywords=["hack", "exploit", "malware"]
        ),
    ],
)

# This request will be blocked before any processing
result = agent.invoke({
    "messages": [{"role": "user", "content": "How do I hack into a database?"}]
})
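
The before_agent hook can also implement the other session-level checks mentioned above, such as rate limiting. A minimal sketch (not a built-in middleware; the class name and limits are illustrative) of a global sliding-window rate limiter:
import time
from typing import Any

from langchain.agents.middleware import AgentMiddleware, AgentState, hook_config
from langgraph.runtime import Runtime

class RateLimitMiddleware(AgentMiddleware):
    """Deterministic guardrail: refuse new invocations once a simple global rate limit is hit."""

    def __init__(self, max_requests: int, window_seconds: float):
        super().__init__()
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._timestamps: list[float] = []

    @hook_config(can_jump_to=["end"])
    def before_agent(self, state: AgentState, runtime: Runtime) -> dict[str, Any] | None:
        now = time.monotonic()
        # Keep only the timestamps that fall inside the sliding window
        self._timestamps = [t for t in self._timestamps if now - t < self.window_seconds]

        if len(self._timestamps) >= self.max_requests:
            # Refuse the request before any model or tool calls run
            return {
                "messages": [{
                    "role": "assistant",
                    "content": "Too many requests right now. Please try again shortly."
                }],
                "jump_to": "end"
            }

        self._timestamps.append(now)
        return None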

After-agent guardrails

Use the after_agent hook to validate the final output once before it is returned to the user. This is useful for model-based safety checks, quality validation, or a final compliance scan of the complete agent response.
from langchain.agents.middleware import AgentMiddleware, AgentState, hook_config
from langgraph.runtime import Runtime
from langchain_core.messages import AIMessage
from langchain.chat_models import init_chat_model
from typing import Any

class SafetyGuardrailMiddleware(AgentMiddleware):
    """Model-based guardrail: Use an LLM to evaluate response safety."""

    def __init__(self):
        super().__init__()
        self.safety_model = init_chat_model("gpt-4o-mini")

    @hook_config(can_jump_to=["end"])
    def after_agent(self, state: AgentState, runtime: Runtime) -> dict[str, Any] | None:
        # Get the final AI response
        if not state["messages"]:
            return None

        last_message = state["messages"][-1]
        if not isinstance(last_message, AIMessage):
            return None

        # Use a model to evaluate safety
        safety_prompt = f"""Evaluate if this response is safe and appropriate.
        Respond with only 'SAFE' or 'UNSAFE'.

        Response: {last_message.content}"""

        result = self.safety_model.invoke([{"role": "user", "content": safety_prompt}])

        if "UNSAFE" in result.content:
            return {
                "messages": [{
                    "role": "assistant",
                    "content": "I cannot provide that response. Please rephrase your request."
                }],
                "jump_to": "end"
            }

        return None

# Use the safety guardrail
from langchain.agents import create_agent

agent = create_agent(
    model="gpt-4o",
    tools=[search_tool, calculator_tool],
    middleware=[SafetyGuardrailMiddleware()],
)

result = agent.invoke({
    "messages": [{"role": "user", "content": "How do I make explosives?"}]
})

Combining multiple guardrails

You can stack multiple guardrails by adding them to the middleware list. They execute in order, allowing you to build layered protection.
from langchain.agents import create_agent
from langchain.agents.middleware import PIIMiddleware, HumanInTheLoopMiddleware

agent = create_agent(
    model="gpt-4o",
    tools=[search_tool, send_email_tool],
    middleware=[
        # Layer 1: Deterministic input filter (before agent)
        ContentFilterMiddleware(banned_keywords=["hack", "exploit"]),

        # Layer 2: PII protection (before and after model)
        PIIMiddleware("email", strategy="redact", apply_to_input=True),
        PIIMiddleware("email", strategy="redact", apply_to_output=True),

        # Layer 3: Human approval for sensitive tools
        HumanInTheLoopMiddleware(interrupt_on={"send_email": True}),

        # Layer 4: Model-based safety check (after agent)
        SafetyGuardrailMiddleware(),
    ],
)
