Seven Mental · 心智七篇

Factory.ai

Factory.ai: AI coding-agent company and publisher of context-compression evaluation research
ENTITY · FACTORY.AI · AI CODING AGENT · EMPIRICAL CONTEXT-COMPRESSION STUDY

AI coding-agent company — a primary source of empirical work on context compression

Using 36,000+ messages from production sessions, Factory compared three compression strategies (Factory, OpenAI, Anthropic) and built a probe-based functional-quality evaluation framework that directly measures an agent’s ability to continue its task after compression. Headline finding: artifact tracking is a universal weakness across compression methods. Regardless of strategy, the ability to track code artifacts and file state drops sharply after compression.

Compression research contributions
Anchored iterative summarization
Structured sections and incremental merging prevent information loss — distinct from naive truncation
Probe-based evaluation
Functional-quality framework: measures whether a compressed agent can still complete its task, not just textual similarity
36K production messages
Real production data, not a synthetic test set — gives results practical engineering weight
Three-way comparison
Factory vs. OpenAI vs. Anthropic — surfaces real-world tradeoffs of each approach in production
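
The anchored iterative summarization idea listed above can be illustrated with a small sketch. This is not Factory's implementation: the section names, the `AnchoredSummary` class, and the `toy_summarize` stand-in (which replaces an LLM call) are all assumptions made for illustration. The point it shows is that each round summarizes only the newly dropped span and merges it into a persistent structured summary, instead of regenerating the whole summary from scratch.

```python
from dataclasses import dataclass, field

# Hypothetical section names; the source only says the summary is structured.
SECTIONS = ("intent", "files_modified", "decisions", "next_steps")


@dataclass
class AnchoredSummary:
    """Persistent summary plus the index of the last message it covers."""
    sections: dict = field(default_factory=lambda: {name: [] for name in SECTIONS})
    anchor: int = 0  # history[:anchor] has already been summarized

    def merge(self, update, new_anchor):
        """Merge a partial update; existing entries are never dropped."""
        for name, items in update.items():
            bucket = self.sections.setdefault(name, [])
            for item in items:
                if item not in bucket:  # skip exact duplicates
                    bucket.append(item)
        self.anchor = new_anchor


def toy_summarize(messages):
    """Stand-in for an LLM call: routes 'section: text' messages into sections."""
    update = {}
    for msg in messages:
        name, _, text = msg.partition(": ")
        if name in SECTIONS and text:
            update.setdefault(name, []).append(text)
    return update


def compress(summary, history, keep_last=2, summarize=toy_summarize):
    """Summarize only the span between the anchor and the retained tail.

    `history` is the full message list; the returned tail is what stays verbatim.
    """
    cutoff = max(summary.anchor, len(history) - keep_last)
    if cutoff > summary.anchor:
        summary.merge(summarize(history[summary.anchor:cutoff]), cutoff)
    return summary, history[cutoff:]
```

Calling `compress` repeatedly on a growing history only ever processes messages past the anchor, so facts captured in earlier rounds survive later rounds unchanged.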
Universal weakness uncovered
Artifact-tracking failure
Common to all methods: after compression, models lose grip on code artifacts and file state
Engineering implication
Externalized artifact tracking (feature tracking + progress files) is a necessary compensating mechanism at the harness layer
→ Context Compression · Harness Engineering · Anthropic
Factory (2025)
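
As a rough illustration of the externalized artifact tracking mentioned under "Engineering implication", the sketch below keeps a progress file on disk that the harness updates on every file operation and re-injects after compression. The `ArtifactLedger` class, its methods, and the JSON layout are hypothetical and are not taken from Factory's study.

```python
import json
from pathlib import Path


class ArtifactLedger:
    """Harness-side record of file state, kept outside the model's context."""

    def __init__(self, path):
        self.path = Path(path)
        # Reload earlier entries so the ledger survives restarts and compressions.
        self.entries = json.loads(self.path.read_text()) if self.path.exists() else {}

    def record(self, file, action, note=""):
        """Called by the harness whenever a tool call creates, edits, or deletes a file."""
        self.entries[file] = {"action": action, "note": note}
        self.path.write_text(json.dumps(self.entries, indent=2, sort_keys=True))

    def render(self):
        """Text block to re-inject into the prompt after each compression."""
        lines = [f"- {name}: {entry['action']} {entry['note']}".rstrip()
                 for name, entry in sorted(self.entries.items())]
        return "Tracked artifacts:\n" + "\n".join(lines)
```

Because the ledger lives in a file and not in the conversation, its contents do not depend on what the summarizer chose to keep.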

Factory.ai

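AI coding-agent company focused on software-engineering automation.

Relevance to this wiki

Factory has contributed important empirical research on evaluating context compression: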

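  • Built a probe-based functional-quality evaluation framework that directly measures an agent's ability to continue its task after compression
  • Proposed Anchored Iterative Summarization, which prevents information loss through structured sections and incremental merging
  • Compared three compression strategies (Factory, OpenAI, Anthropic) on 36,000+ production session messages
  • Showed that artifact tracking is a universal weakness across all compression methods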
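
To make the probe-based idea above concrete, here is a minimal sketch under simplifying assumptions: a probe is a question plus the facts a correct answer must mention, and scoring is plain substring matching. The `Probe` fields, category labels, `score_probes`, and the `echo_agent` stand-in are hypothetical. A real framework would query an actual agent and would likely grade answers more carefully.

```python
from dataclasses import dataclass


@dataclass
class Probe:
    category: str    # hypothetical labels, e.g. "artifact" or "continuation"
    question: str
    expected: tuple  # facts a correct answer must mention


def score_probes(answer_fn, context, probes):
    """Average, per category, the fraction of expected facts found in each answer."""
    totals = {}
    for probe in probes:
        answer = answer_fn(context, probe.question).lower()
        hits = sum(fact.lower() in answer for fact in probe.expected)
        totals.setdefault(probe.category, []).append(hits / len(probe.expected))
    return {cat: sum(vals) / len(vals) for cat, vals in totals.items()}


def echo_agent(context, question):
    """Stand-in for a real agent: it can only answer from what the context retains."""
    return context


probes = [
    Probe("artifact", "Which files were modified?", ("auth.py", "tokens.py")),
    Probe("continuation", "What is the next step?", ("add tests",)),
]
full = "Modified auth.py and tokens.py. Next step: add tests."
compressed = "Modified auth.py. Next step: add tests."

full_scores = score_probes(echo_agent, full, probes)
compressed_scores = score_probes(echo_agent, compressed, probes)
```

In this toy case the compressed context still answers the continuation probe but recovers only one of the two modified files, which is the kind of gap a functional probe exposes and a text-similarity score can hide.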

Related entities

  • OpenAI: one of the compression strategies compared
  • Anthropic: one of the compression strategies compared

References

  • sources/factory-evaluating-context-compression.md