Feature Tracking

Feature tracking: a structured JSON feature list that prevents one-shotting and premature victory

CONCEPT · FEATURE TRACKING · LONG-RUNNING AGENT STATE

Feature Tracking — an external task list that blocks two failure modes

An externalized, machine-readable task list — a JSON feature list the agent can read for global progress and update completion against. It blocks both “one-shotting” (doing it all in one go) and “premature victory” (declaring done too early).

JSON structure

category  description      steps  passes
auth      user login flow  3      false
auth      token refresh    2      true ✓
… 200+ entries …                  false

key constraints

passes only

the agent may only flip passes to true; it cannot delete or edit the description

JSON, not Markdown

models are less inclined to “casually” edit JSON structure, making it safer than Markdown

end-to-end required

a task is only marked passing after end-to-end test verification — self-judgment is forbidden

→ Long-Running Agents · Harness Engineering · Context Management

Anthropic (2024)

Feature Tracking

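Definition

Feature tracking is a structured mechanism for tracking task completion in long-running agents. At its core is an externalized, machine-readable task list: the agent reads it to see overall progress and updates it to record what it has completed.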

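Anthropic's practice

In Anthropic's harness design, feature tracking takes the form of a JSON feature list:

Each feature has four fields: category, description, steps, and passes (a boolean)
The initializer agent generates the full list (200+ entries) from the user's requirements, with every passes set to false
After the coding agent completes and verifies a feature, it sets that feature's passes to true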
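The list structure above can be sketched in Python. This is a minimal illustration, not Anthropic's actual implementation: the `Feature` class, `init_feature_list`, `to_json`, and the sample step strings are hypothetical, and `steps` is assumed to be a list of verification steps (the card shows only a step count).

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Feature:
    category: str
    description: str
    steps: list[str]       # assumption: a list of verification steps
    passes: bool = False   # every feature starts as not passing


def init_feature_list(requirements):
    """Build the initial list from (category, description, steps) tuples.

    `requirements` is a hypothetical stand-in for what the initializer
    agent derives from the user's request.
    """
    return [Feature(c, d, list(s)) for c, d, s in requirements]


def to_json(features):
    """Serialize the feature list to the JSON text the agents read and update."""
    return json.dumps([asdict(f) for f in features], indent=2)


# Illustrative entries only; a real list would hold 200+ features.
features = init_feature_list([
    ("auth", "user login flow",
     ["open login page", "submit credentials", "see dashboard"]),
    ("auth", "token refresh",
     ["let token expire", "call refresh endpoint"]),
])
print(to_json(features))
```

Keeping `passes` as the only field with a default value mirrors the rule that every entry begins as false and is flipped only later by the coding agent.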

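Key constraints

The agent may modify only the passes field; it must not delete or edit test descriptions
JSON is chosen over Markdown because models are less likely to "casually" edit JSON structure
A feature must pass end-to-end testing before it can be marked as passing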
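One way a harness could enforce these constraints is to compare the list before and after each agent session. The sketch below is an assumption about how such a guard might look, not a description of Anthropic's harness: `validate_update`, `mark_passing`, and the `run_e2e_test` callable are hypothetical names.

```python
import copy

PROTECTED_FIELDS = ("category", "description", "steps")


def validate_update(old, new):
    """Return a list of violations; an empty list means the update is allowed."""
    if len(old) != len(new):
        return ["features were added or deleted"]
    violations = []
    for i, (before, after) in enumerate(zip(old, new)):
        if set(before) != set(after):
            violations.append(f"feature {i}: fields were added or removed")
        for key in PROTECTED_FIELDS:
            if before.get(key) != after.get(key):
                violations.append(f"feature {i}: '{key}' was edited")
        if not isinstance(after.get("passes"), bool):
            violations.append(f"feature {i}: 'passes' must be a boolean")
    return violations


def mark_passing(feature, run_e2e_test):
    """Flip passes to True only when the end-to-end test callable succeeds."""
    if run_e2e_test(feature):
        feature["passes"] = True
    return feature["passes"]


old = [{"category": "auth", "description": "user login flow",
        "steps": ["open login page", "submit credentials"], "passes": False}]

ok = copy.deepcopy(old)
mark_passing(ok[0], lambda feature: True)  # stand-in for a real end-to-end test
print(validate_update(old, ok))            # []

tampered = copy.deepcopy(ok)
tampered[0]["description"] = "login (simplified)"
print(validate_update(old, tampered))      # reports the edited description
```

Routing every change to `passes` through a test callable keeps the agent's own judgment out of the decision, which is the point of the end-to-end requirement.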

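Design intent

Feature tracking addresses two failure modes of long-running agents at once:

Preventing one-shotting: an explicit feature list shows the agent how much work remains, so it advances one feature at a time
Preventing premature victory: features that have not passed are objective evidence, so the agent cannot stop just because the work "feels about done"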
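Both safeguards reduce to simple queries over the list. The helpers below are a hedged sketch with hypothetical names (`progress`, `next_pending`, `is_finished`): the first two support working one feature at a time, and the third makes "done" a fact about the data, not a judgment call.

```python
def progress(features):
    """Return (number passing, total) so the agent sees how much work remains."""
    done = sum(1 for f in features if f["passes"])
    return done, len(features)


def next_pending(features):
    """Return the first feature that has not passed, or None if none remain."""
    return next((f for f in features if not f["passes"]), None)


def is_finished(features):
    """Work is finished only when the list is non-empty and every feature passes."""
    return bool(features) and all(f["passes"] for f in features)


# Illustrative data only.
features = [
    {"category": "auth", "description": "user login flow", "passes": False},
    {"category": "auth", "description": "token refresh", "passes": True},
]

done, total = progress(features)
print(f"{done}/{total} features passing")    # 1/2 features passing
print(next_pending(features)["description"]) # user login flow
print(is_finished(features))                 # False
```

Treating an empty list as unfinished is a design choice made here so that a missing or wiped file cannot be mistaken for completed work.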

References

sources/anthropic_official/effective-harnesses-long-running-agents.md