Factory.ai

Factory.ai: AI coding-agent company and publisher of a context-compression evaluation study

ENTITY · FACTORY.AI · AI CODING AGENT · EMPIRICAL CONTEXT-COMPRESSION STUDY

AI coding-agent company — a primary source of empirical work on context compression

On 36,000+ production session messages, Factory compared three compression strategies (Factory, OpenAI, Anthropic) and built a probe-based functional-quality evaluation framework that directly measures an agent's ability to continue its task after compression. Headline finding: artifact tracking is a universal weakness. Under every strategy compared, the agent's ability to track code artifacts and file state drops sharply after compression.

Compression research contributions

Anchored iterative summarization

Structured sections and incremental merging prevent information loss — distinct from naive truncation
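The idea of structured sections plus incremental merging can be illustrated with a minimal sketch. Everything named here is an assumption for illustration: the section names, the `AnchoredSummary` class, and the `summarize_span` stand-in (which replaces what would be an LLM call) are hypothetical and are not taken from Factory's implementation.

```python
from dataclasses import dataclass, field

# Hypothetical section names; the source only says the summary is structured.
SECTIONS = ("intent", "files_touched", "decisions", "next_steps")


@dataclass
class AnchoredSummary:
    sections: dict = field(
        default_factory=lambda: {name: [] for name in SECTIONS}
    )
    anchor: int = 0  # index of the first message not yet summarized

    def merge(self, update: dict, new_anchor: int) -> None:
        """Append new facts per section; existing entries are never rewritten."""
        for name, items in update.items():
            bucket = self.sections.setdefault(name, [])
            for item in items:
                if item not in bucket:
                    bucket.append(item)
        self.anchor = new_anchor


def summarize_span(messages: list) -> dict:
    """Stand-in for an LLM call: routes 'section: text' lines to sections."""
    update = {}
    for msg in messages:
        name, sep, text = msg.partition(":")
        if sep and name.strip() in SECTIONS:
            update.setdefault(name.strip(), []).append(text.strip())
    return update


def compress(summary: AnchoredSummary, history: list) -> None:
    """Summarize only the messages past the anchor, then merge the result."""
    new_span = history[summary.anchor:]
    summary.merge(summarize_span(new_span), new_anchor=len(history))
```

The point of the sketch is the contrast with naive truncation: earlier sections survive each compression pass because only the new span is summarized and merged in.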

Probe-based evaluation

Functional-quality framework: measures whether a compressed agent can still complete its task, not just textual similarity
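A probe-based check might look roughly like the sketch below. The `Probe` type, the substring-based scoring, and the `ask` callback are assumptions made for illustration; the source does not describe how Factory scores probe answers, and a real framework would likely use a more robust grader.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Probe:
    question: str
    expected: tuple  # facts a correct answer must mention (hypothetical format)


def score_probe(answer: str, probe: Probe) -> float:
    """Fraction of expected facts found in the answer (case-insensitive)."""
    if not probe.expected:
        return 0.0
    lowered = answer.lower()
    hits = sum(1 for fact in probe.expected if fact.lower() in lowered)
    return hits / len(probe.expected)


def evaluate(
    compressed_context: str,
    probes: list,
    ask: Callable[[str, str], str],
) -> float:
    """Mean probe score for an agent that sees only the compressed context.

    `ask(context, question)` is a placeholder for querying the agent.
    """
    if not probes:
        return 0.0
    total = sum(
        score_probe(ask(compressed_context, p.question), p) for p in probes
    )
    return total / len(probes)
```

Because the probes ask about task state rather than comparing summary text to the original, a summary can be textually faithful and still score poorly if it drops the facts the agent needs to continue.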

36K production messages

Real production data, not a synthetic test set — gives results practical engineering weight

Three-way comparison

Factory vs. OpenAI vs. Anthropic — surfaces real-world tradeoffs of each approach in production

Universal weakness uncovered

Artifact-tracking failure

Common to all methods: after compression, models lose grip on code artifacts and file state

Engineering implication

Externalized artifact tracking (feature tracking + progress files) is a necessary compensating mechanism at the harness layer
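One way to externalize artifact tracking is a small progress file that the harness updates on every file change and re-injects after compression. The sketch below is a hypothetical illustration: the JSON layout and the function names `record_artifact` and `artifact_preamble` are assumptions, not a format described in the source.

```python
import json
from pathlib import Path


def record_artifact(
    progress_path: Path, file: str, status: str, note: str = ""
) -> dict:
    """Persist the latest known state of one file outside the context window."""
    if progress_path.exists():
        state = json.loads(progress_path.read_text())
    else:
        state = {"artifacts": {}}
    state["artifacts"][file] = {"status": status, "note": note}
    progress_path.write_text(json.dumps(state, indent=2, sort_keys=True))
    return state


def artifact_preamble(progress_path: Path) -> str:
    """Render tracked state as text to prepend after a compression pass."""
    if not progress_path.exists():
        return ""
    state = json.loads(progress_path.read_text())
    lines = []
    for file, info in sorted(state["artifacts"].items()):
        suffix = f" ({info['note']})" if info["note"] else ""
        lines.append(f"- {file}: {info['status']}{suffix}")
    return "\n".join(lines)
```

Since the file lives on disk, its contents do not depend on what the summarizer chose to keep, which is why this kind of mechanism sits at the harness layer rather than inside the compression method.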

→ Context Compression · Harness Engineering · Anthropic

Factory (2025)

Factory.ai

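An AI coding-agent company focused on automating software engineering.

Relevance to this wiki

Factory has contributed significant empirical research on evaluating context compression:

Built a probe-based functional-quality evaluation framework that directly measures an agent's ability to continue its task after compression
Proposed Anchored Iterative Summarization, which prevents information loss through structured sections and incremental merging
Compared three compression strategies (Factory, OpenAI, Anthropic) on 36,000+ production session messages
Showed that artifact tracking is a universal weakness across all compression methods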

References

sources/factory-evaluating-context-compression.md
