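Below is an **agent-led coding agent harness architecture**. The core idea: **the agent decides the workflow dynamically; the harness only supplies safety boundaries, tool interfaces, persistence, verification, approval, and observability**. The human does not approve every step but acts as a **reviewer / approver** at key risk points.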

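## 0. Key takeaways from the research

Anthropic draws a clear line between a "workflow" and an "agent": a workflow follows predefined code paths, while an agent lets the LLM dynamically direct its own process and tool use. OpenAI's agent guide likewise stresses that an agent should manage the workflow, recognize when it is complete, correct itself when needed, and hand control back to a human. Your requirements call for the latter model: **agent owns workflow, harness owns constraints**. ([anthropic.com](https://www.anthropic.com/engineering/building-effective-agents?__from__=talkingdev))

The key to a coding agent is not opening up the whole shell but designing a good **Agent-Computer Interface (ACI)**. The lesson from SWE-agent is that a dedicated file viewer, code search, a controlled edit command, a linter that runs on edit, and an explicit message for empty output suit an agent better than raw bash. ([github.com](https://github.com/SWE-agent/SWE-agent/blob/main/docs/background/aci.md))

Human-in-the-loop should be implemented as a persistable interrupt / approval: a run can pause, save its state, wait for a human to approve or modify it, and then resume. Both the OpenAI Agents SDK and LangGraph support this pause, approve, resume pattern. The typical triggers for human involvement should be **high-risk actions** and **exceeding a failure threshold**, not every single action. ([openai.github.io](https://openai.github.io/openai-agents-python/human_in_the_loop/))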

---

## 1. Overall architecture

```mermaid
flowchart TD
  U[User / Issue / Ticket] --> Intake[Task Intake & Contract Builder]
  Intake --> Kernel[Agent Kernel: workflow self-director]

  Kernel --> State[Run State / Self-Reflection Ledger]
  Kernel --> ACI[Agent-Computer Interface]
  Kernel --> Critics[Specialist Agents as Tools<br/>Reviewer / Tester / Security / Docs]

  ACI --> ToolGateway[Tool Gateway + Policy Engine]
  ToolGateway -->|allow| Sandbox[Ephemeral Sandbox / Worktree / Container]
  ToolGateway -->|ask| HITL[Human Review / Approval Console]
  ToolGateway -->|deny| Kernel

  Sandbox --> Verifier[Verification Engine<br/>tests / lint / typecheck / build / security]
  Verifier --> Kernel

  Kernel --> Repair[Auto-Repair Loop]
  Repair --> ACI

  Kernel --> PR[Branch / Commit / Draft PR]
  PR --> CI[CI + Checks]
  CI --> HITL
  HITL -->|approve / request changes / edit plan / take over| Kernel
  HITL -->|final approve| Merge[Human Merge / Release Gate]

  State --> Obs[Trajectory Store / Audit / Metrics / Evals]
  Sandbox --> Obs
  ToolGateway --> Obs
  HITL --> Obs
```

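**Division of responsibilities at a glance:**

| Layer | Responsibility |
|---|---|
| Agent Kernel | Decides the next step itself: explore code, plan, edit, test, repair, request a reviewer, open a PR |
| Harness | Provides controlled tools, sandbox, permissions, checkpoints, verification, approval, logging |
| Human | Reviews key points only: high-risk plans, sensitive tools, the final PR, failure escalations |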

---

## 2. Core component design

### 2.1 Task Intake & Contract Builder

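Input can be an issue, a Jira or Linear ticket, a Slack message, or a manual prompt. The first step is not to write code but to produce a structured task contract: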

```ts
type TaskSpec = {
  task_id: string;
  goal: string;
  repo: string;
  branch_base: string;
  acceptance_criteria: string[];
  non_goals: string[];
  risk_hints: string[];
  required_checks: string[];
  human_contact?: string;
};
```

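**Conditions for continuing automatically:**

- the requirement is clear;
- the blast radius is small;
- the acceptance criteria are testable;
- the change does not touch security, permissions, payments, data migrations, or production configuration.

**Conditions for asking a human to clarify:**

- the requirement is ambiguous;
- the goal conflicts with existing behavior;
- the agent finds several viable approaches with significant trade-offs;
- a product decision is needed.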
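
The two condition lists above can be turned into a small deterministic triage step. The following is a minimal sketch, not a definitive implementation: the `SENSITIVE_HINTS` vocabulary, the `open_questions` field, and the `triage` function are all hypothetical names introduced for illustration, and only a subset of the `TaskSpec` fields is modeled.

```python
from dataclasses import dataclass, field

# Hypothetical vocabulary for the sensitive areas listed above (assumption).
SENSITIVE_HINTS = {"security", "permission", "payment", "data_migration", "prod_config"}


@dataclass
class TaskSpec:
    task_id: str
    goal: str
    acceptance_criteria: list[str] = field(default_factory=list)
    risk_hints: list[str] = field(default_factory=list)
    # Assumption: not part of the original TaskSpec; holds ambiguities found at intake.
    open_questions: list[str] = field(default_factory=list)


def triage(spec: TaskSpec) -> tuple[str, list[str]]:
    """Return ("continue" | "ask_human", reasons) for a freshly built task contract."""
    reasons = []
    if not spec.acceptance_criteria:
        reasons.append("no testable acceptance criteria")
    if spec.open_questions:
        reasons.append("unresolved questions: " + "; ".join(spec.open_questions))
    sensitive = sorted(SENSITIVE_HINTS & set(spec.risk_hints))
    if sensitive:
        reasons.append("touches sensitive areas: " + ", ".join(sensitive))
    return ("ask_human" if reasons else "continue", reasons)
```

Returning the reasons alongside the verdict lets the harness show the human exactly why the run paused.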

---

### 2.2 Agent Kernel: let the agent drive the workflow

The Agent Kernel should not be a fixed DAG but a constrained autonomous loop.

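```python
# Pseudocode sketch: agent, policy, tool_gateway, verifier, budget, etc.
# are harness components assumed to exist.
state = load_run_state()  # load before the loop so `state` exists for the first check

while not state.done and budget.remaining():
    # The agent, not the harness, picks the next action.
    action = agent.decide_next_action(
        task=state.task,
        repo_summary=state.repo_summary,
        plan=state.plan,
        failures=state.failure_ledger,
        tools=tool_registry.available_tools(),
        policy_summary=policy.visible_rules(),
    )

    # Every proposed action is evaluated by the policy engine first.
    decision = policy.evaluate(action, state)

    if decision.type == "deny":
        # Feed the refusal back as an observation so the agent can re-plan.
        observation = ToolObservation.denied(decision.reason)
    elif decision.type == "ask_human":
        # Persist before pausing so the run can resume after the human responds.
        checkpoint(state, action)
        human_decision = approval_service.interrupt(action, evidence_pack(state))
        observation = apply_human_decision(human_decision)
    else:
        observation = tool_gateway.execute(action)

    state = agent.reflect_and_update_state(state, action, observation)

    if verifier.should_run(action, state):
        report = verifier.run(state)
        state = agent.reflect_and_repair_or_continue(state, report)
```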

The key point: **the agent decides the next step**, but every action goes through the `Policy Engine` and the `Tool Gateway`.

---

### 2.3 Self-Reflection Ledger: introspective, but without storing raw chain-of-thought

Store an **auditable summary of the engineering reasoning**, not the full hidden chain of thought.

```ts
type AgentRunState = {
  phase:
    | "intake"
    | "discover"
    | "plan"
    | "implement"
    | "verify"
    | "repair"
    | "review"
    | "done"
    | "escalated";

  current_goal: string;

  plan: Array<{
    id: string;
    description: string;
    status: "todo" | "doing" | "done" | "blocked";
    evidence_refs: string[];
  }>;

  assumptions: Array<{
    text: string;
    confidence: "low" | "medium" | "high";
    validated_by?: string;
  }>;

  context_map: Array<{
    file: string;
    relevance: string;
    symbols?: string[];
  }>;

  failure_ledger: Array<{
    check: string;
    command: string;
    failure_summary: string;
    suspected_causes: string[];
    attempted_fixes: string[];
    next_experiment?: string;
  }>;

  risk: {
    score: "low" | "medium" | "high" | "critical";
    reasons: string[];
    requires_human_gate: boolean;
  };

  verification: {
    tests_added: string[];
    tests_run: string[];
    lint: "pass" | "fail" | "not_run";
    typecheck: "pass" | "fail" | "not_run";
    ci: "pass" | "fail" | "pending" | "not_run";
  };

  open_questions: string[];
  done_criteria_status: Record<string, "met" | "unmet" | "unknown">;
};
```

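After every significant observation, the agent must update:

1. where it currently believes the problem lies;
2. which evidence supports that belief;
3. which assumptions remain unverified;
4. what the next experiment is;
5. under what conditions it will stop and escalate to a human.

This is the "self-reflection", kept readable and auditable for engineers.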
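
One way to enforce the five-point update is to make it a typed record that the harness validates before appending it to the ledger. This is a minimal sketch under assumptions: the `Reflection` class, its field names, and `append_reflection` are hypothetical and only mirror the list above.

```python
from dataclasses import dataclass, asdict


@dataclass
class Reflection:
    diagnosis: str                     # 1. where the agent believes the problem lies
    supporting_evidence: list[str]     # 2. references to logs, diffs, test output
    unverified_assumptions: list[str]  # 3. assumptions not yet checked
    next_experiment: str               # 4. the next thing to try
    escalate_when: str                 # 5. stop condition for handing off to a human


def append_reflection(ledger: list[dict], entry: Reflection) -> list[dict]:
    """Return a new ledger with the entry appended; reject entries that cannot be audited."""
    if not entry.supporting_evidence:
        raise ValueError("a reflection must cite at least one piece of evidence")
    if not entry.escalate_when.strip():
        raise ValueError("a reflection must state when the agent will escalate")
    return ledger + [asdict(entry)]
```

Storing plain dictionaries keeps the ledger easy to serialize, for example into a JSON column.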

---

## 3. ACI / Tool layer design

Do not connect the agent directly to a full shell. Give it a set of purpose-built tools.

### Recommended tool set

| Tool | Purpose | Default permission |
|---|---|---|
| `repo_search(query)` | Search code, symbols, call sites | allow |
| `view_file(path, range)` | View a file in segments | allow |
| `semantic_search(query)` | Context retrieval based on embeddings / AST | allow |
| `dependency_graph(symbol)` | Inspect call relationships and dependencies | allow |
| `apply_patch(diff)` | Apply a patch | conditional allow |
| `edit_file(path, range, replacement)` | Precise edit | conditional allow |
| `run_test(selector)` | Run tests | allow |
| `run_lint()` | Lint | allow |
| `run_typecheck()` | Typecheck | allow |
| `safe_bash(cmd)` | Restricted shell | policy gated |
| `git_diff()` | View the diff | allow |
| `git_commit(message)` | Commit to the agent branch | conditional allow |
| `open_draft_pr()` | Create a draft PR | ask / allow by repo policy |
| `request_review()` | Request human review | allow |
| `mark_ready_for_review()` | Convert to a regular PR | ask |
| `merge_pr()` | Merge | deny; humans only |

SWE-agent's experience supports this direction: a dedicated ACI, a controlled editor, a linter, directory search, and a file viewer significantly improve an agent's ability to work with a codebase. ([github.com](https://github.com/SWE-agent/SWE-agent/blob/main/docs/background/aci.md))

---

## 4. Sandbox / Workspace

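Create an isolated workspace for each task:

- a separate git worktree or clone;
- a separate branch;
- a Docker / microVM sandbox;
- least-privilege tokens;
- no production secrets by default;
- a network egress allowlist;
- all command stdout / stderr persisted;
- every file change recorded as a diff.

The OpenHands documentation defines the sandbox as the environment where the agent runs commands, edits files, and starts services, and it recommends the Docker sandbox to isolate the host; GitHub Copilot cloud agent likewise explores code, makes changes, and runs tests and linters in an ephemeral development environment. ([docs.openhands.dev](https://docs.openhands.dev/openhands/usage/runtimes/overview))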
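
As a rough illustration of the per-task workspace, the sketch below only builds the command lines for a git worktree and a locked-down container; it does not execute them. The function names, the `agent/<task_id>` branch convention, the `/tmp/agent-workspaces` root, and the image name are assumptions. Note that `--network none` disables networking entirely, which is stricter than the egress allowlist described above; an allowlist would need a proxy or firewall layer that is out of scope here.

```python
def worktree_cmd(repo_dir: str, task_id: str, base: str = "main") -> tuple[list[str], str]:
    """Build the `git worktree add` command for an isolated per-task checkout."""
    branch = f"agent/{task_id}"                   # hypothetical branch naming scheme
    path = f"/tmp/agent-workspaces/{task_id}"     # hypothetical workspace root
    cmd = ["git", "-C", repo_dir, "worktree", "add", "-b", branch, path, base]
    return cmd, path


def sandbox_cmd(workspace: str, image: str, command: list[str]) -> list[str]:
    """Build a `docker run` command that mounts only the task workspace."""
    return [
        "docker", "run", "--rm",
        "--network", "none",      # no network at all (stricter than an allowlist)
        "--cap-drop", "ALL",      # drop Linux capabilities
        "-v", f"{workspace}:/workspace",
        "-w", "/workspace",
        image, *command,
    ]
```

A harness would pass these lists to `subprocess.run` and persist stdout / stderr, so every command the agent triggers stays auditable.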

---

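## 5. Policy Engine: ask the human only at key points

Implement a three-state policy: **allow / ask / deny**.

### Default policy matrix

| Action type | Default policy |
|---|---|
| Read files, search, view git diff | allow |
| Run local tests, lint, typecheck | allow |
| Modify ordinary business code | allow if sandboxed + small diff + not a protected path |
| Modify tests, docs | allow |
| Add a dependency | ask |
| Modify a lockfile | ask |
| Modify auth, crypto, payment, permission, tenant isolation | ask |
| Modify DB migration / schema | ask |
| Modify CI/CD, Dockerfile, deployment scripts | ask |
| Access the external network | ask; must state purpose and domain |
| Read secret / `.env` / credential | deny |
| Delete a large number of files | ask or deny |
| force push / rewrite history | deny |
| merge into a protected branch | deny |
| prod deploy / kubectl apply / terraform apply | deny |
| Disable tests or security scans, bypass CI | deny |

OpenAI's agent guide recommends building guardrails as layered defenses and triggering human intervention on high-risk actions or when a failure threshold is exceeded; the Agents SDK also supports pausing sensitive tool calls to wait for approval. ([openai.github.io](https://openai.github.io/openai-agents-python/guardrails/))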

### Policy-as-code example

```yaml
protected_paths:
  - "infra/**"
  - ".github/workflows/**"
  - "migrations/**"
  - "auth/**"
  - "payments/**"
  - "security/**"
  - "**/.env*"

tools:
  repo_search:
    default: allow

  view_file:
    default: allow
    deny_if_path_matches:
      - "**/.env*"
      - "**/secrets/**"

  apply_patch:
    default: allow
    ask_if:
      - path_matches: protected_paths
      - diff_lines_gt: 500
      - deletes_files: true
      - modifies_public_api: true
    deny_if:
      - path_matches: ["**/.env*", "**/private_keys/**"]

  safe_bash:
    allow_patterns:
      - "git status"
      - "git diff*"
      - "pytest*"
      - "npm test*"
      - "pnpm test*"
      - "ruff*"
      - "mypy*"
      - "tsc*"
    ask_patterns:
      - "npm install*"
      - "pnpm add*"
      - "pip install*"
      - "curl*"
      - "wget*"
      - "docker*"
    deny_patterns:
      - "rm -rf /*"
      - "git push --force*"
      - "kubectl *"
      - "terraform apply*"
      - "aws *"
      - "gcloud *"

  merge_pr:
    default: deny
```
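
A deterministic matcher for the `safe_bash` section could look like the sketch below. It is an illustration under stated assumptions, not a hardened implementation: the precedence order (deny, then ask, then allow), the fallback to `ask` for unmatched commands, and the escalation of any command containing shell metacharacters are all choices made here, not rules taken from the YAML. The metacharacter check matters because a plain glob such as `pytest*` would otherwise also match `pytest; curl ...`.

```python
from fnmatch import fnmatchcase

# A subset of the safe_bash patterns from the policy file above.
SAFE_BASH = {
    "deny": ["rm -rf /*", "git push --force*", "kubectl *", "terraform apply*", "aws *", "gcloud *"],
    "ask": ["npm install*", "pnpm add*", "pip install*", "curl*", "wget*", "docker*"],
    "allow": ["git status", "git diff*", "pytest*", "npm test*", "pnpm test*", "ruff*", "mypy*", "tsc*"],
}

# Characters that allow chaining, substitution, or redirection in a shell.
SHELL_METACHARACTERS = set(";|&$`><\n")


def evaluate_bash(cmd: str) -> str:
    """Return "allow", "ask", or "deny" for a proposed shell command."""
    cmd = cmd.strip()
    if any(fnmatchcase(cmd, pattern) for pattern in SAFE_BASH["deny"]):
        return "deny"
    if SHELL_METACHARACTERS & set(cmd):
        return "ask"  # assumption: chained or redirected commands need a human
    for verdict in ("ask", "allow"):
        if any(fnmatchcase(cmd, pattern) for pattern in SAFE_BASH[verdict]):
            return verdict
    return "ask"  # assumption: unmatched commands are escalated, not silently run
```

`fnmatchcase` is used instead of `fnmatch` so that matching stays case-sensitive on every platform.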

---

## 6. Auto-Repair Loop

Auto-repair should be a first-class capability of the harness, not "ask a human once something fails".

### Failure classification

```ts
type FailureKind =
  | "syntax"
  | "lint"
  | "typecheck"
  | "unit_test"
  | "integration_test"
  | "build"
  | "security_scan"
  | "dependency"
  | "flaky_test"
  | "merge_conflict"
  | "ambiguous_requirement"
  | "environment";
```

### Repair flow

```mermaid
flowchart TD
  F[Failure Report] --> C[Classify Failure]
  C --> R[Retrieve Relevant Context]
  R --> H[Generate Hypothesis]
  H --> P[Patch Candidate]
  P --> T[Run Targeted Check]
  T -->|pass| Full[Run Broader Verification]
  T -->|fail| Reflect[Reflect + Update Failure Ledger]
  Reflect --> Budget{Retry Budget Left?}
  Budget -->|yes| R
  Budget -->|no| Escalate[Human Escalation Pack]
  Full -->|pass| Continue[Continue Workflow]
  Full -->|fail| Reflect
```

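The AutoCodeRover paper uses a similar validate-and-retry approach: generate a patch, run the tests, and invoke the patch generation agent again on failure. ([zhiyufan.github.io](https://zhiyufan.github.io/files/ISSTA2024a.pdf))

### Default retry budget

| Scenario | Auto-repair attempts |
|---|---:|
| lint / format | 5 |
| syntax / typecheck | 4 |
| Unit test failure | 3 |
| Integration test failure | 2 |
| flaky / environment issue | 1, then mark as uncertain |
| Security scan failure | 1, then ask |
| schema / infra / auth related failure | ask |

### Conditions that must escalate to a human

- the same failure occurs 3 times in a row;
- the agent wants to expand the change scope beyond the original plan;
- a protected path must be modified;
- the acceptance criteria cannot be verified;
- the fix would change a public API;
- agent confidence is low;
- CI and local results disagree;
- a product or architecture decision is needed.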
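
The retry budget and the "same failure 3 times in a row" rule can be combined in one small tracker. This is a sketch under assumptions: the `RepairBudget` class and the idea of a failure `signature` (for example a failing test id or lint rule) are hypothetical, and failure kinds without a budget entry are escalated immediately, which approximates the "ask" row of the table.

```python
from collections import Counter

# Budgets taken from the table above, keyed by FailureKind values.
RETRY_BUDGET = {
    "lint": 5, "syntax": 4, "typecheck": 4, "unit_test": 3,
    "integration_test": 2, "flaky_test": 1, "environment": 1, "security_scan": 1,
}
SAME_FAILURE_LIMIT = 3


class RepairBudget:
    def __init__(self) -> None:
        self.failures = Counter()  # failed checks seen per failure kind
        self.streak_key = None     # (kind, signature) of the most recent failure
        self.streak = 0            # how many times in a row that failure repeated

    def record_failure(self, kind: str, signature: str) -> str:
        """Record a failed check and return "retry" or "escalate"."""
        key = (kind, signature)
        self.streak = self.streak + 1 if key == self.streak_key else 1
        self.streak_key = key
        self.failures[kind] += 1
        if self.streak >= SAME_FAILURE_LIMIT:
            return "escalate"
        # A budget of N allows N repair attempts, so failure N + 1 escalates.
        if self.failures[kind] > RETRY_BUDGET.get(kind, 0):
            return "escalate"
        return "retry"
```

Keeping the streak separate from the per-kind counter means five different lint errors can each be fixed, while one stubborn error still escalates after three tries.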

---

## 7. Human Review Console

The human interface is not a chat window but a **review cockpit**.

### Every ask-human request must include an Evidence Pack

```ts
type EvidencePack = {
  task_summary: string;
  current_phase: string;
  proposed_action: string;
  why_needed: string;
  risk_reasons: string[];
  files_touched: string[];
  diff_summary?: string;
  tests_run: string[];
  failing_checks?: string[];
  alternatives_considered: string[];
  rollback_plan: string;
  requested_decision:
    | "approve_tool"
    | "approve_plan"
    | "choose_option"
    | "clarify_requirement"
    | "review_pr"
    | "take_over";
};
```

### Actions available to the human

| Human action | Agent follow-up |
|---|---|
| Approve | Continue execution |
| Reject | Re-plan |
| Edit plan | Update state, continue |
| Add constraint | Write into the task contract |
| Request changes | Agent repairs automatically |
| Take over | Stop the agent, keep the branch |
| Final approve | Human merges or releases |

LangGraph's HITL supports approve/reject as well as reviewing and editing graph state before resuming execution, which fits the review cockpit described here. ([docs.langchain.com](https://docs.langchain.com/oss/python/langgraph/human-in-the-loop))
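
The action table can be read as a pure function from (run state, human decision) to a new run state. The sketch below is framework-independent and makes several assumptions: the dictionary shape of `state` and `decision`, the `status` values, and the action strings are hypothetical names chosen for illustration, not the LangGraph or Agents SDK API.

```python
def apply_human_decision(state: dict, decision: dict) -> dict:
    """Return a new run state that reflects the reviewer's decision."""
    action = decision["action"]
    new_state = dict(state)  # shallow copy; the caller's state is not mutated
    if action == "approve":
        new_state["status"] = "running"
    elif action == "reject":
        new_state.update(status="running", phase="plan")  # go back and re-plan
    elif action == "edit_plan":
        new_state.update(status="running", plan=list(decision["plan"]))
    elif action == "add_constraint":
        constraints = list(state.get("constraints", []))
        constraints.append(decision["constraint"])
        new_state.update(status="running", constraints=constraints)
    elif action == "request_changes":
        new_state.update(status="running", phase="repair")
    elif action == "take_over":
        new_state["status"] = "stopped"  # the agent halts; the branch is left as-is
    elif action == "final_approve":
        new_state.update(status="awaiting_human_merge", phase="done")
    else:
        raise ValueError(f"unknown human action: {action!r}")
    return new_state
```

Because the function never mutates its input, the checkpoint taken before the interrupt remains a faithful record of what the human was shown.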

---

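## 8. PR / Review workflow

Use a GitHub-style PR as the final delivery boundary:

1. the agent creates a branch automatically;
2. the agent commits changes in small steps;
3. the agent runs local checks;
4. the agent may create a draft PR automatically;
5. the agent writes the PR summary, test evidence, and risk notes;
6. a human reviews;
7. reviewer comments trigger agent repair;
8. once CI is fully green, a human merges.

GitHub Copilot cloud agent takes the same product shape: the agent researches the repo, makes a plan, changes code on a branch, runs tests and linters, and then hands the work to developers for review through a PR; the GitHub documentation also stresses that this approach makes each step visible through commits and logs. ([docs.github.com](https://docs.github.com/en/copilot/using-github-copilot/coding-agent/about-assigning-tasks-to-copilot))

### PR template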

```md
## Summary
- What changed
- Why

## Acceptance Criteria
- [x] ...
- [ ] ...

## Files Changed
- `src/foo.ts`: ...
- `tests/foo.test.ts`: ...

## Verification
- [x] unit tests: `...`
- [x] lint: `...`
- [x] typecheck: `...`
- [ ] integration tests: not run, reason: ...

## Agent Notes
- Assumptions:
- Risks:
- Areas needing human attention:

## Rollback
- Revert commit: ...
```

---

## 9. Repo Instruction / Agent Memory

Every repo should have an agent-facing instruction file, for example:

```md
# AGENTS.md

## Build
- pnpm install
- pnpm test
- pnpm typecheck

## Code style
- Use existing service pattern in `src/services`
- Do not introduce new state management library

## Testing
- Add unit tests for bug fixes
- Prefer integration tests for API route changes

## Protected areas
- Ask before editing `infra/**`
- Ask before editing `auth/**`
- Never read `.env*`

## PR etiquette
- Small commits
- Include verification output
- Draft PR first
```

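Claude Code's best practices likewise recommend putting bash commands, test runners, repo etiquette, architecture decisions, and similar details into the project instructions, and reviewing / pruning those instructions as you would code. ([anthropic.com](https://www.anthropic.com/engineering/claude-code-best-practices?_bhlid=6d35c25e04616eabb4469b78a6e8848b61863ba0))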

---

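## 10. Recommended rollout stages

### MVP v0: Review-first

- the agent can read the repo;
- the agent generates a plan;
- a human approves the plan;
- the agent generates a patch;
- a human reviews the diff;
- no automatic push.

Suitable for the initial trust-building phase.

### v1: Autonomous implementation

- automatic branch;
- automatic patch;
- automatic tests;
- automatic repair;
- ask after the failure threshold;
- automatic draft PR;
- human final review.

This is the main target form you described.

### v2: Risk-adaptive autonomy

- low-risk tasks run fully automatically up to the draft PR;
- medium-risk tasks get a plan gate;
- high-risk tasks get a tool gate + final review;
- reviewer comments are repaired automatically;
- metrics drive gradual loosening of the allowlist.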

---

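## 11. Key metrics

| Metric | Purpose |
|---|---|
| Human interrupts per task | Whether humans are interrupted too often |
| Auto-repair success rate | Self-repair capability |
| First CI pass rate | First-attempt quality |
| PR acceptance rate | Human review pass rate |
| Reviewer comments per PR | Code quality |
| Mean time to draft PR | Efficiency |
| Escalation reasons | Where tools or policies need changing |
| Denied tool calls | Whether the security policy is reasonable |
| Reverted agent PRs | Production quality |
| Protected-path edit attempts | Risky-behavior monitoring |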
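
Several of these metrics are simple ratios over per-run records. The sketch below computes three of them; the record fields (`human_interrupts`, `repairs_attempted`, `repairs_succeeded`, `first_ci_pass`) are hypothetical names for data the trajectory store would need to capture.

```python
def summarize_runs(runs: list[dict]) -> dict:
    """Aggregate per-run records into three of the metrics listed above."""

    def ratio(numerator: float, denominator: float):
        # Return None instead of dividing by zero when there is no data yet.
        return numerator / denominator if denominator else None

    total_runs = len(runs)
    repairs_attempted = sum(r["repairs_attempted"] for r in runs)
    return {
        "human_interrupts_per_task": ratio(sum(r["human_interrupts"] for r in runs), total_runs),
        "auto_repair_success_rate": ratio(sum(r["repairs_succeeded"] for r in runs), repairs_attempted),
        "first_ci_pass_rate": ratio(sum(1 for r in runs if r["first_ci_pass"]), total_runs),
    }


runs = [
    {"human_interrupts": 2, "repairs_attempted": 4, "repairs_succeeded": 3, "first_ci_pass": True},
    {"human_interrupts": 0, "repairs_attempted": 0, "repairs_succeeded": 0, "first_ci_pass": False},
]
print(summarize_runs(runs))
# {'human_interrupts_per_task': 1.0, 'auto_repair_success_rate': 0.75, 'first_ci_pass_rate': 0.5}
```

Returning `None` for an empty denominator keeps "no data" distinguishable from a genuine rate of zero on a dashboard.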

---

## 12. Minimal viable tech stack

If you want to build this quickly:

- **Orchestrator**: LangGraph / Temporal / custom event loop;
- **Checkpoint**: Postgres;
- **Sandbox**: Docker container + git worktree;
- **Tool Gateway**: Python/FastAPI service;
- **Policy Engine**: YAML + deterministic matcher;
- **Agent State**: JSONB + event sourcing;
- **Verification**: repo-specific command registry;
- **Review UI**: GitHub PR + lightweight approval dashboard;
- **Observability**: OpenTelemetry + trajectory viewer.

OpenHands' new SDK paper also stresses that production software agents need flexible implementation, reliable/secure execution, human interaction interfaces, sandboxed execution, lifecycle control, and security analysis; this largely matches the harness layering above. ([arxiv.org](https://arxiv.org/abs/2511.03690))

---

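## 13. The most important design principle

**Do not let the harness decide the workflow.**  
The harness should decide:

- which tools exist;
- which actions are safe;
- when to checkpoint;
- when to ask a human;
- how to verify;
- how to record evidence;
- how to roll back.

**The agent should decide:**

- where to look first;
- whether tests need to be written;
- which failure to fix first;
- whether more context is needed;
- when to call the reviewer/tester/security sub-agent;
- when the task is considered complete;
- when to escalate to a human proactively.

This meets your requirements: **the agent leads, reflects, and repairs automatically; the human stays in the loop, mainly to review rather than babysit every step.**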