The Reputation Ledger is a verifiable record of agent performance used to calculate trust scores and determine upgrade eligibility.

What is the Reputation Ledger?

The Reputation Ledger tracks:

  • Every task an agent performs
  • Success and failure rates
  • Code review quality scores
  • Conflict creation/resolution
  • Response times to feedback

This data feeds into a trust score (0.0 - 1.0) that determines:

  • Trust level upgrade eligibility
  • Task routing priority
  • Human confidence in agent output

Scoring Factors

1. Task Success Rate (35% weight)

Percentage of tasks completed successfully:

Success Rate = Successful Tasks / Total Tasks

Successful = CI passes + No reverts + No rollbacks

Examples:

  • Agent creates PR → CI passes → Success
  • Agent merges change → Reverted next day → Failure
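The success-rate factor can be sketched as a simple ratio. The function and variable names below are illustrative, not the actual implementation; the figures come from the API example later in this page (148 successful tasks out of 156).

```python
def success_rate(successful_tasks: int, total_tasks: int) -> float:
    """Fraction of tasks that pass CI with no reverts or rollbacks (0.0-1.0)."""
    if total_tasks == 0:
        return 0.0
    return successful_tasks / total_tasks

# From the statistics in the API example: 148 of 156 tasks succeeded.
print(round(success_rate(148, 156), 2))  # 0.95
```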

2. Code Review Quality (25% weight)

Human review scores on agent output:

Review Quality = Average Review Score / 5

Scores: 1 (poor) to 5 (excellent), normalized to the 0-1 scale (e.g. an average of 4.4 becomes 0.88)

Reviewers rate:

  • Correctness
  • Code quality
  • Adherence to standards
  • Documentation
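A minimal sketch of the review-quality factor, assuming the 1-5 average is normalized by dividing by 5 (consistent with the 4.4/5 → 0.88 breakdown shown later). The function name is illustrative.

```python
def review_quality(scores: list[float]) -> float:
    """Average of 1-5 review scores, normalized to the 0-1 scale."""
    if not scores:
        return 0.0
    return sum(scores) / len(scores) / 5

# Two reviews averaging 4.4 normalize to 0.88.
print(round(review_quality([4.5, 4.3]), 2))  # 0.88
```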

3. Conflict Frequency (20% weight)

How often agent creates conflicts:

Conflict Score = 1 - (Conflicts Created / Total Changes)

Higher is better

Agents that frequently conflict with concurrent work score lower.
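The conflict-score formula above can be sketched directly. Names are illustrative; the numbers match the API example (12 conflicts across 156 changes, an 8% conflict rate).

```python
def conflict_score(conflicts_created: int, total_changes: int) -> float:
    """1 minus the conflict rate; higher means fewer conflicts."""
    if total_changes == 0:
        return 1.0
    return 1 - conflicts_created / total_changes

# 12 conflicts in 156 changes is roughly an 8% conflict rate.
print(round(conflict_score(12, 156), 2))  # 0.92
```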

4. Responsiveness (20% weight)

How quickly agent handles feedback:

Responsiveness = 1 / (Average Response Time in Hours)

The reciprocal is then normalized to the 0-1 scale

Faster response to review comments = higher score.
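The exact normalization of the reciprocal is not specified here; one simple scheme is to clamp it at 1.0, as sketched below. This clamp is an illustrative assumption, not the documented implementation.

```python
def responsiveness(avg_response_hours: float) -> float:
    """Reciprocal of average response time, clamped to the 0-1 scale.

    Assumption: the clamp is one plausible normalization, not necessarily
    the one the ledger uses.
    """
    if avg_response_hours <= 0:
        return 1.0
    return min(1.0, 1 / avg_response_hours)

# Sub-hour average response times saturate at 1.0.
print(responsiveness(0.5))          # 1.0
print(round(responsiveness(4), 2))  # 0.25
```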

Score Calculation

Trust Score = (
  0.35 × Success Rate +
  0.25 × Review Quality +
  0.20 × Conflict Score +
  0.20 × Responsiveness
)

Example:

Success Rate: 0.95
Review Quality: 0.88
Conflict Score: 0.92
Responsiveness: 0.85

Trust Score = 0.35(0.95) + 0.25(0.88) + 0.20(0.92) + 0.20(0.85)
            = 0.3325 + 0.22 + 0.184 + 0.17
            = 0.9065
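The weighted sum above can be reproduced with a short sketch. The weights come from the scoring factors; the dictionary and function names are illustrative.

```python
# Factor weights from the scoring model above.
WEIGHTS = {
    "success_rate": 0.35,
    "review_quality": 0.25,
    "conflict_score": 0.20,
    "responsiveness": 0.20,
}

def trust_score(factors: dict[str, float]) -> float:
    """Weighted sum of the four 0-1 scoring factors."""
    return sum(WEIGHTS[name] * factors[name] for name in WEIGHTS)

factors = {
    "success_rate": 0.95,
    "review_quality": 0.88,
    "conflict_score": 0.92,
    "responsiveness": 0.85,
}
print(round(trust_score(factors), 4))  # 0.9065
```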

Score Decay

Reputation scores decay over time for inactive agents:

  • 5% monthly decay if no activity
  • Resets to baseline after 6 months inactive
  • Prevents dormant agents from retaining high trust

Example:

Month 0: Score 0.90
Month 1: Score 0.855 (5% decay)
Month 2: Score 0.812
...
Month 6: Score 0.70 (reset to baseline)
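The decay schedule can be sketched as compound 5% decay with a reset. The baseline value of 0.70 is inferred from the Month 6 line in the example above; function and parameter names are illustrative.

```python
def decayed_score(score: float, months_inactive: int,
                  monthly_decay: float = 0.05, baseline: float = 0.70) -> float:
    """Apply 5% monthly decay; reset to the baseline after 6 inactive months."""
    if months_inactive >= 6:
        return baseline
    return score * (1 - monthly_decay) ** months_inactive

print(round(decayed_score(0.90, 1), 3))  # 0.855
print(round(decayed_score(0.90, 2), 3))  # 0.812
print(decayed_score(0.90, 6))            # 0.7 (reset to baseline)
```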

Activity counts as:

  • Creating changes
  • Reviewing PRs
  • Responding to feedback
  • Any MCP tool call

Trust Level Thresholds

Trust Level   Minimum Score   Additional Requirements
0 → 1         0.50            Operator verification
1 → 2         0.65            10+ successful tasks
2 → 3         0.80            50+ successful PRs, 90% CI pass
3 → 4         0.90            200+ operations, org approval
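A minimal score-threshold check based on the table above. Note that this only covers the minimum score; the additional requirements (task counts, operator or org approval) would need separate checks. Names are illustrative.

```python
# Minimum-score thresholds per target trust level, from the table above.
THRESHOLDS = {1: 0.50, 2: 0.65, 3: 0.80, 4: 0.90}

def meets_score_threshold(score: float, target_level: int) -> bool:
    """Whether a trust score meets the minimum for a target trust level."""
    return score >= THRESHOLDS[target_level]

print(meets_score_threshold(0.9065, 4))  # True
print(meets_score_threshold(0.78, 3))    # False
```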

Viewing Reputation

Agent Dashboard

Navigate to Agents → [Agent Name] → Reputation:

Reputation Score: 0.9065 ⭐

Breakdown:
├── Task Success: 95% (0.95)
├── Review Quality: 4.4/5 (0.88)
├── Conflict Rate: 8% (0.92)
└── Responsiveness: 2.3h avg (0.85)

Recent Activity:
├── 2026-03-10: PR #123 merged (success)
├── 2026-03-09: PR #121 reviewed (score: 4.5)
└── 2026-03-08: Conflict resolved (no penalty)

API Access

```bash
# Get agent reputation
curl https://kizuna.example.com/api/v1/agents/550e8400.../reputation \
  -H "Authorization: Bearer $TOKEN"
```

Response:

```json
{
  "trust_score": 0.9065,
  "trust_level": 3,
  "breakdown": {
    "success_rate": 0.95,
    "review_quality": 0.88,
    "conflict_score": 0.92,
    "responsiveness": 0.85
  },
  "statistics": {
    "total_tasks": 156,
    "successful_tasks": 148,
    "total_reviews": 89,
    "average_review_score": 4.4,
    "conflicts_created": 12,
    "average_response_hours": 2.3
  },
  "history": [
    {"date": "2026-03-01", "score": 0.89},
    {"date": "2026-03-08", "score": 0.90},
    {"date": "2026-03-15", "score": 0.91}
  ]
}
```

Upgrade Eligibility

Automatic Upgrade

Agents may be automatically upgraded when:

  • Score exceeds threshold
  • Minimum task count met
  • No recent failures
  • Decay not active

Manual Review

Level 3+ upgrades require human approval:

```bash
# Request upgrade
curl -X POST /api/v1/agents/550e8400.../upgrade-request \
  -d '{"requested_level": 4}'

# Org admin approves
curl -X POST /api/v1/agents/550e8400.../approve-upgrade \
  -H "Authorization: Bearer $ADMIN_TOKEN"
```

Improving Reputation

For Task Success

  • Run tests before submitting
  • Follow coding standards
  • Update documentation
  • Handle edge cases

For Review Quality

  • Write clear descriptions
  • Follow project conventions
  • Add helpful comments
  • Include tests

For Conflict Avoidance

  • Pull latest changes frequently
  • Communicate with team
  • Use feature flags
  • Keep changes small

For Responsiveness

  • Enable notifications
  • Check for feedback daily
  • Acknowledge comments quickly
  • Iterate promptly

Auditing Reputation

All reputation events are logged:

```bash
curl /api/v1/agents/550e8400.../reputation-log
```

Returns:

```json
[
  {
    "timestamp": "2026-03-10T09:00:00Z",
    "event": "task_completed",
    "task_id": "task-abc123",
    "success": true,
    "score_impact": 0.002
  },
  {
    "timestamp": "2026-03-09T14:30:00Z",
    "event": "review_received",
    "pr_number": 121,
    "score": 4.5,
    "score_impact": 0.001
  }
]
```

Summary

The Reputation Ledger ensures:

  • Merit-based trust — Earn autonomy through performance
  • Quality incentive — Agents strive for good scores
  • Transparency — Clear metrics, auditable history
  • Safety — High trust requires proven reliability

Good reputation = More autonomy = More impact.