我们已完成 1.25 亿美元的 B 轮融资,用于构建代理工程平台。阅读更多。
[ # 'key' is the metric name # 'score' is the value of a numerical metric {"key": string, "score": number}, # 'value' is the value of a categorical metric {"key": string, "value": string}, ... # You may log as many as you wish ]
{results: [{ key: string, score: number }, ...]};
langsmith>=0.2.0
langsmith@0.1.32
def multiple_scores(outputs: dict, reference_outputs: dict) -> list[dict]: # Replace with real evaluation logic. precision = 0.8 recall = 0.9 f1 = 0.85 return [ {"key": "precision", "score": precision}, {"key": "recall", "score": recall}, {"key": "f1", "score": f1}, ]
此页面有帮助吗?