Seven Mental · 心智七篇

OpenHands

OpenHands: An Open-Source Coding Agent Framework
ENTITY · OPENHANDS · ALL HANDS AI · CODEACTAGENT · OPEN-SOURCE MULTI-AGENT CODING PLATFORM

All Hands AI’s open-source multi-agent coding platform, one of the two agent frameworks used in the SWE-EVO evaluation

The SWE-EVO evaluation, which runs OpenHands with its CodeActAgent architecture, surfaced a key finding: GLM-5 scores 37.5% on SWE-agent but only 8.33% on OpenHands. The model is the same, so the roughly 4.5× gap comes from the framework. This indicates that measured agent capability is a function of model × framework, not an intrinsic property of the model alone.
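The gap can be checked with a few lines of arithmetic. The sketch below stores the two GLM-5 scores quoted above, keyed by (model, framework), and computes their ratio; the `scores` table and the `framework_gap` helper are illustrative names, not part of SWE-EVO or either framework.

```python
# Scores (%) quoted above for GLM-5 in SWE-EVO, keyed by (model, framework).
scores = {
    ("GLM-5", "SWE-agent"): 37.5,
    ("GLM-5", "OpenHands"): 8.33,
}


def framework_gap(model: str, a: str, b: str) -> float:
    """Return how many times higher `model` scores on framework `a` than on `b`."""
    return scores[(model, a)] / scores[(model, b)]


ratio = framework_gap("GLM-5", "SWE-agent", "OpenHands")
print(f"{ratio:.1f}x")  # 37.5 / 8.33 is about 4.5
```

Keying the table by the (model, framework) pair rather than by model alone mirrors the point of the finding: a score is only meaningful together with the framework that produced it.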

Framework Comparison

| Framework | Architecture | Characteristics |
|-----------|--------------|-----------------|
| OpenHands | CodeActAgent | Multi-agent platform, unified action space |
| SWE-agent | Single agent | Emphasizes agent-computer interface design |
| Codex | Implicit loop | Cloud sandbox, bidirectional JSON-RPC |
| LangGraph | Explicit graph | StateGraph nodes and edges |

Key Insights
GLM-5: 37.5% vs. 8.33%
The same model differs by about 4.5× across frameworks; framework prompt style and interaction patterns strongly shape performance
Capability = Model × Framework
A benchmark score does not measure an “intrinsic agent ability” of a model; evaluation results are always framework-specific
Up to 100 Iterations
The iteration cap in the SWE-EVO setup, which bounds the resources available for long-horizon tasks
→ Implicit Loop Architecture · SWE-Bench · Codex · LangGraph
SWE-EVO arXiv:2512.18470

OpenHands

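Introduction

OpenHands is an open-source multi-agent coding platform built on the CodeActAgent architecture. It supports evaluating AI coding agents on multiple benchmarks and is developed and maintained by the All Hands AI team.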

Role in SWE-EVO

OpenHands is one of the two agent frameworks used in the SWE-EVO evaluation (the other is SWE-agent). It is configured as CodeActAgent with a maximum of 100 iterations.
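To make the effect of an iteration cap concrete, here is a minimal sketch of a budgeted agent loop. It is not OpenHands code: `run_agent`, `step`, and `MAX_ITERATIONS` are hypothetical names, and the only detail taken from the source is the cap of 100.

```python
from typing import Callable, Tuple

MAX_ITERATIONS = 100  # cap stated for the SWE-EVO setup


def run_agent(
    step: Callable[[int], bool],
    max_iterations: int = MAX_ITERATIONS,
) -> Tuple[bool, int]:
    """Call `step` until it reports completion or the budget runs out.

    `step` receives the 1-based iteration number and returns True when the
    task is finished. Returns (finished, iterations_used).
    """
    for i in range(1, max_iterations + 1):
        if step(i):
            return True, i
    return False, max_iterations


# A hypothetical task that would need 150 steps is cut off at the cap.
print(run_agent(lambda i: i >= 150))  # (False, 100)
# One that needs 42 steps finishes within the budget.
print(run_agent(lambda i: i >= 42))   # (True, 42)
```

Under this assumption, a long-horizon task that needs more steps than the cap allows is counted as unfinished regardless of how capable the model is, which is why the cap acts as a resource boundary.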

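One notable finding: some models perform very differently across frameworks. GLM-5 scores 37.5% on SWE-agent but only 8.33% on OpenHands. This suggests that agent capability is a function of model × framework: a framework's prompt style and interaction patterns can significantly affect model performance.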

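Comparison with Other Frameworks

| Framework | Architecture | Characteristics |
|-----------|--------------|-----------------|
| OpenHands | CodeActAgent | Multi-agent platform, unified action space |
| SWE-agent | Single agent | Emphasizes agent-computer interface design |
| Codex | Implicit loop | Cloud sandbox, bidirectional JSON-RPC |
| LangGraph | Explicit graph orchestration | StateGraph defines nodes and edges |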

Related Concepts

References

  • sources/arxiv_papers/2512.18470-swe-evo.md