Cursor

Agent Trace

Version: 0.1.0-draft
Status: Proposal
Date: January 2026

Abstract

Agent Trace is an open specification for tracking AI-generated code. It provides a vendor-neutral framework for recording, storing, and displaying AI contributions alongside human authorship in version-controlled codebases.

Table of Contents

  1. Motivation
  2. Goals
  3. Non-Goals
  4. Terminology
  5. Architecture Overview
  6. Core Specification
  7. Storage Formats
  8. Integration
  9. Privacy Considerations
  10. Security Considerations
  11. Versioning and Extensibility
  12. Conformance Levels
  13. Appendix

1. Motivation

AI writes a lot of code now. This creates real problems.

Developers need to know what came from AI and what came from humans. Reviewers want to apply appropriate scrutiny to AI-generated sections. Regulated industries require documentation of AI involvement. Teams want to measure AI-assisted productivity without losing visibility into code origins.

Early efforts in this space have explored using Git notes and commit metadata to record AI involvement. These approaches work well for simple use cases. Large enterprises with scaled engineering teams face additional requirements: lock-free concurrent operations, flexible storage backends, and the ability to link code not just to a model, but to the actual conversation that produced it.

Agent Trace builds on this prior work to define an open, interoperable standard. It supports lightweight attribution for smaller projects while enabling full conversation-to-code traceability for organizations that need it.


2. Goals

  1. Vendor Neutrality: The specification favors no particular tool, IDE, or model provider.
  2. Interoperability: Any compliant tool can read and write attribution data.
  3. VCS Integration: Clean integration with existing version control, particularly Git.
  4. Granularity: Support attribution at file, hunk, line, and character-range levels.
  5. Extensibility: Vendors can add custom metadata without breaking compatibility.
  6. Privacy-Preserving: Configurable disclosure levels based on organizational policy.
  7. Human Readable: Core attribution data is readable without special tooling.

3. Non-Goals

  1. Code Ownership: Agent Trace does not track legal ownership or copyright.
  2. Training Data Provenance: Agent Trace does not track which training data influenced AI outputs.
  3. Quality Assessment: Agent Trace does not evaluate whether AI contributions are good or bad.
  4. Policy Enforcement: Agent Trace leaves enforcement to consuming tools.

4. Terminology

Term          Definition
------------  -----------------------------------------------------------
Contribution  A unit of code change (addition, modification, or deletion)
Contributor   The entity that produced a contribution (human or AI)
Trace Record  Metadata describing a contribution's origin
Session       A bounded interaction between a human and AI system
Provenance    The complete history of a code segment's origins
Attestation   A cryptographic proof of attribution claims

Contributor Types

Type             Code             Description
---------------  ---------------  --------------------------------------------------------------------------
Human            human            Code authored directly by a human developer
AI Autocomplete  ai:autocomplete  Short completions, typically inline, under 5 lines
AI Chat/Agent    ai:agent         Code generated via conversational AI or autonomous agents
AI Refactor      ai:refactor      AI-assisted refactoring of existing code
AI Fix           ai:fix           AI-generated bug fixes or error corrections
AI Automation    ai:automation    Code generated via AI-executed shell commands, build tools, or scaffolding
Mixed            mixed            Human-edited AI output or AI-edited human code
Unknown          unknown          Origin cannot be determined

When to use ai:automation: Use this type when an AI agent executes commands that generate code, such as npx create-react-app, rails generate, prisma generate, or any scaffolding tool. The AI initiated the generation but didn't author the template code itself.

When to use mixed: Human modification of AI-generated code triggers mixed. The threshold for what constitutes "modification" is implementation-defined: some tools may treat typo fixes as still AI-authored, while others apply stricter thresholds. Implementations SHOULD track the original contributor type in extensions when marking a record as mixed, to preserve the full history:

{
  "contributor": {
    "type": "mixed"
  },
  "extensions": {
    "com.example.ide": {
      "original_contributor_type": "ai:agent",
      "human_edit_percentage": 5
    }
  }
}
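As a minimal sketch of the reclassification described above, the helper below flips a record to mixed while copying the original type into a vendor extension. The function name mark_as_mixed and its parameters are hypothetical; the namespace and extension keys mirror the example and are not mandated by the specification.

```python
import copy


def mark_as_mixed(record: dict, namespace: str, human_edit_percentage: float) -> dict:
    """Return a copy of `record` reclassified as mixed, preserving the original type."""
    updated = copy.deepcopy(record)  # trace records are treated as immutable inputs
    original_type = updated["contributor"]["type"]
    ext = updated.setdefault("extensions", {}).setdefault(namespace, {})
    if original_type != "mixed":
        # Only the first transition records the original contributor type.
        updated["contributor"]["type"] = "mixed"
        ext["original_contributor_type"] = original_type
    ext["human_edit_percentage"] = human_edit_percentage
    return updated
```

How human_edit_percentage is measured is left to the tool, in line with the implementation-defined threshold.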

5. Architecture Overview

  ┌──────────────┐    ┌──────────────┐    ┌──────────────────────┐
  │  AI Coding   │    │  AI Coding   │    │     AI Coding        │
  │   Tool A     │    │   Tool B     │    │       Tool C         │
  └──────┬───────┘    └──────┬───────┘    └──────────┬───────────┘
         │                   │                       │
         ▼                   ▼                       ▼
┌─────────────────────────────────────────────────────────────────────┐
│                     Agent Trace Specification                       │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  ┌─────────────────────────────────────────────────────────────┐    │
│  │                    Trace Record Schema                      │    │
│  │              (JSON format for attribution data)             │    │
│  └─────────────────────────────────────────────────────────────┘    │
│                              │                                      │
│                              ▼                                      │
│  ┌─────────────────────────────────────────────────────────────┐    │
│  │                    Storage Layer                            │    │
│  │  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐  │    │
│  │  │.agent-trace/│  │ Git Notes   │  │  External Service   │  │    │
│  │  │  (local)    │  │ (vcs-bound) │  │    (optional)       │  │    │
│  │  └─────────────┘  └─────────────┘  └─────────────────────┘  │    │
│  └─────────────────────────────────────────────────────────────┘    │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘
         │                   │                       │
         ▼                   ▼                       ▼
  ┌──────────────┐    ┌──────────────┐    ┌──────────────────────┐
  │  IDE / CLI   │    │    CI/CD     │    │   Audit/Compliance   │
  │  (blame UI)  │    │  Pipelines   │    │       Tools          │
  └──────────────┘    └──────────────┘    └──────────────────────┘

5.1 Platform Independence

Agent Trace is a data specification, not a product. It defines how to record and store attribution data. This means:

No IDE required. The specification works anywhere Git works. A CLI tool can produce and consume traces just as easily as an IDE. The core operations are file writes and Git commands.

Visualization is optional. The architecture diagram shows IDEs with "blame UI" as one possible consumer, but this layer is optional. Organizations might:

Expect community tooling. Developers often use multiple coding agents. The standardized trace format makes it straightforward to build tools that aggregate and display attribution across all of them. We anticipate the ecosystem will produce visualization tools that work across any Agent Trace-compliant source.


6. Core Specification

6.1 Trace Record Schema

The fundamental unit of Agent Trace is the Trace Record, defined in JSON Schema:

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://agent-trace.dev/schemas/v1/trace-record.json",
  "title": "Agent Trace Record",
  "type": "object",
  "required": ["version", "id", "timestamp", "contributor", "scope"],
  "properties": {
    "version": {
      "type": "string",
      "pattern": "^[0-9]+\\.[0-9]+$",
      "description": "Agent Trace specification version (e.g., '1.0')"
    },
    "id": {
      "type": "string",
      "format": "uuid",
      "description": "Unique identifier for this trace record"
    },
    "timestamp": {
      "type": "string",
      "format": "date-time",
      "description": "ISO 8601 timestamp when trace was recorded"
    },
    "contributor": {
      "$ref": "#/$defs/contributor"
    },
    "scope": {
      "$ref": "#/$defs/scope"
    },
    "session": {
      "$ref": "#/$defs/session"
    },
    "confidence": {
      "type": "number",
      "minimum": 0,
      "maximum": 1,
      "description": "How certain the tool is about this attribution (1.0 = definitive, 0.0 = unknown). See section 6.4."
    },
    "operation": {
      "$ref": "#/$defs/operation",
      "description": "Links related trace records from a single atomic operation (e.g., multi-file refactor)"
    },
    "post_processing": {
      "type": "object",
      "properties": {
        "modified": {
          "type": "boolean",
          "description": "Whether the code was modified by post-processing tools"
        },
        "tools": {
          "type": "array",
          "items": { "type": "string" },
          "description": "Tools that modified the code (e.g., 'prettier@3.0.0', 'eslint --fix')"
        }
      },
      "description": "Tracks modifications by formatters, linters, or pre-commit hooks"
    },
    "partial": {
      "type": "boolean",
      "default": false,
      "description": "True if this trace represents in-progress attribution (e.g., streaming generation)"
    },
    "extensions": {
      "type": "object",
      "description": "Vendor-specific extensions (namespaced)"
    }
  },
  "$defs": {
    "contributor": {
      "type": "object",
      "required": ["type"],
      "properties": {
        "type": {
          "type": "string",
          "enum": [
            "human",
            "ai:autocomplete",
            "ai:agent",
            "ai:refactor",
            "ai:fix",
            "ai:automation",
            "mixed",
            "unknown"
          ]
        },
        "model": {
          "type": "object",
          "description": "Identifier for the AI model, following the models.dev convention",
          "properties": {
            "id": {
              "type": "string",
              "maxLength": 250,
              "description": "The model's unique identifier (e.g., 'anthropic/claude-3.5-sonnet-20240620')"
            }
          },
          "required": ["id"]
        },
        "tool": {
          "type": "object",
          "properties": {
            "name": { "type": "string" },
            "version": { "type": "string" }
          }
        },
        "human": {
          "type": "object",
          "properties": {
            "id": { "type": "string" },
            "email": { "type": "string", "format": "email" }
          }
        }
      }
    },
    "scope": {
      "type": "object",
      "required": ["file"],
      "properties": {
        "file": {
          "type": "string",
          "description": "Relative file path from repository root"
        },
        "ranges": {
          "type": "array",
          "items": {
            "$ref": "#/$defs/range"
          }
        },
        "content_hashes": {
          "type": "array",
          "items": { "type": "string" },
          "description": "Hashes of attributed content for position-independent tracking. See section 6.6."
        },
        "commit": {
          "type": "string",
          "pattern": "^[a-f0-9]{40}$",
          "description": "Git commit SHA where this trace applies"
        }
      }
    },
    "operation": {
      "type": "object",
      "properties": {
        "id": {
          "type": "string",
          "format": "uuid",
          "description": "Links related trace records from a single atomic operation"
        },
        "description": {
          "type": "string",
          "maxLength": 500,
          "description": "Human-readable description of the multi-file operation"
        }
      }
    },
    "range": {
      "type": "object",
      "required": ["start", "end", "operation"],
      "properties": {
        "start": {
          "type": "object",
          "required": ["line"],
          "properties": {
            "line": { "type": "integer", "minimum": 1 },
            "column": { "type": "integer", "minimum": 1 }
          }
        },
        "end": {
          "type": "object",
          "required": ["line"],
          "properties": {
            "line": { "type": "integer", "minimum": 1 },
            "column": { "type": "integer", "minimum": 1 }
          }
        },
        "operation": {
          "type": "string",
          "enum": ["add", "modify", "delete"],
          "description": "For delete, line numbers reference the parent commit (before deletion)"
        }
      }
    },
    "session": {
      "type": "object",
      "properties": {
        "id": {
          "type": "string",
          "description": "Unique session/conversation identifier"
        },
        "prompt_hash": {
          "type": "string",
          "description": "Hash of the prompt that generated this code (privacy-preserving)"
        },
        "summary": {
          "type": "string",
          "maxLength": 500,
          "description": "Human-readable summary of the AI interaction"
        }
      }
    }
  }
}

6.2 Commit Trace Summary

For commit-level attribution, Agent Trace defines a summary format:

{
  "$id": "https://agent-trace.dev/schemas/v1/commit-summary.json",
  "title": "Agent Trace Commit Summary",
  "type": "object",
  "required": ["version", "commit", "summary"],
  "properties": {
    "version": { "type": "string" },
    "commit": { "type": "string", "pattern": "^[a-f0-9]{40}$" },
    "summary": {
      "type": "object",
      "properties": {
        "total_lines_added": { "type": "integer" },
        "total_lines_deleted": { "type": "integer" },
        "by_contributor_type": {
          "type": "object",
          "additionalProperties": {
            "type": "object",
            "properties": {
              "lines_added": { "type": "integer" },
              "lines_deleted": { "type": "integer" },
              "percentage": { "type": "number" }
            }
          }
        }
      }
    },
    "files": {
      "type": "array",
      "items": {
        "$ref": "#/$defs/file_summary"
      }
    }
  },
  "$defs": {
    "file_summary": {
      "type": "object",
      "properties": {
        "path": { "type": "string" },
        "trace_records": {
          "type": "array",
          "items": { "type": "string", "format": "uuid" }
        }
      }
    }
  }
}
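One way a tool might derive such a summary from the trace records of a commit is sketched below. The function name summarize_commit is hypothetical, and two choices are assumptions rather than rules from the specification: "modify" ranges are not counted as added or deleted lines, and percentage is taken as each type's share of all added lines.

```python
from collections import defaultdict


def summarize_commit(commit_sha: str, records: list[dict], version: str = "1.0") -> dict:
    """Aggregate trace records for one commit into a commit summary document."""
    by_type = defaultdict(lambda: {"lines_added": 0, "lines_deleted": 0})
    files = defaultdict(list)
    for record in records:
        scope = record["scope"]
        files[scope["file"]].append(record["id"])
        counts = by_type[record["contributor"]["type"]]
        for rng in scope.get("ranges", []):
            line_count = rng["end"]["line"] - rng["start"]["line"] + 1
            if rng["operation"] == "add":
                counts["lines_added"] += line_count
            elif rng["operation"] == "delete":
                counts["lines_deleted"] += line_count
    total_added = sum(c["lines_added"] for c in by_type.values())
    total_deleted = sum(c["lines_deleted"] for c in by_type.values())
    for counts in by_type.values():
        # Assumption: percentage = share of all added lines in this commit.
        counts["percentage"] = round(100 * counts["lines_added"] / total_added, 1) if total_added else 0.0
    return {
        "version": version,
        "commit": commit_sha,
        "summary": {
            "total_lines_added": total_added,
            "total_lines_deleted": total_deleted,
            "by_contributor_type": dict(by_type),
        },
        "files": [{"path": path, "trace_records": ids} for path, ids in files.items()],
    }
```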

6.3 Example Trace Record

{
  "version": "1.0",
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "timestamp": "2026-01-23T14:30:00Z",
  "contributor": {
    "type": "ai:agent",
    "model": {
      "id": "anthropic/claude-opus-4-5-20251101"
    },
    "tool": {
      "name": "cursor",
      "version": "2.4.0"
    }
  },
  "scope": {
    "file": "src/utils/parser.ts",
    "commit": "a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0",
    "ranges": [
      {
        "start": { "line": 42, "column": 1 },
        "end": { "line": 67, "column": 2 },
        "operation": "add"
      }
    ],
    "content_hashes": ["sha256:a1b2c3d4e5f6..."]
  },
  "operation": {
    "id": "7c9e6679-7425-40de-944b-e07fc1f90ae7",
    "description": "Refactored parser module"
  },
  "session": {
    "id": "session-12345",
    "summary": "Implemented recursive descent parser for arithmetic expressions"
  },
  "post_processing": {
    "modified": true,
    "tools": ["prettier@3.0.0"]
  },
  "confidence": 0.95,
  "extensions": {
    "dev.cursor": {
      "conversation_url": "https://cursor.com/sessions/12345"
    }
  }
}

6.4 Confidence Scoring

The confidence field (0.0–1.0) indicates how certain the tool is about the attribution, not the quality of the code.

Score      Meaning
---------  ----------------------------------------------------------------------
1.0        Definitive: tool directly produced or observed the code origin
0.7–0.99   High: strong signals (e.g., code appeared immediately after AI response)
0.4–0.69   Medium: heuristic-based (e.g., pattern matching, timing inference)
0.01–0.39  Low: weak signals or ambiguous origin
0.0        Unknown: no basis for attribution

Tools should use 1.0 when they have direct knowledge (e.g., the IDE recorded the AI generating this exact code). Use lower values for retroactive analysis or heuristic detection. When aggregating traces from multiple tools, consumers should prefer higher-confidence records.
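A small sketch of how a consumer might apply these bands follows. The names confidence_band and prefer_highest are hypothetical. Two assumptions are made here: the bands are treated as contiguous (a score of 0.995 counts as high), and a record without a confidence field is ranked as 0.0.

```python
def confidence_band(score: float) -> str:
    """Map a confidence score to the band names used in the table above."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"confidence must be within [0, 1], got {score}")
    if score == 1.0:
        return "definitive"
    if score >= 0.7:
        return "high"
    if score >= 0.4:
        return "medium"
    if score > 0.0:
        return "low"
    return "unknown"


def prefer_highest(records: list[dict]) -> dict:
    """Pick the record with the highest confidence among competing traces."""
    # Assumption: a missing confidence field ranks lowest.
    return max(records, key=lambda record: record.get("confidence", 0.0))
```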

6.5 Trace Immutability

Trace records are immutable snapshots. Line numbers in a trace refer to positions at the recorded commit, not current positions. As code evolves, those lines are expected to shift.

Consumers querying "who wrote line 50?" must walk git history: find the commit that last touched that line, then look up the trace for that commit and file. This mirrors how git blame works. Tools should not attempt to "rebase" traces onto new commits.

6.6 Content Hashes

Line numbers break when code moves, is reformatted, or is refactored. For tools that need resilient attribution across these changes, Agent Trace supports optional content hashes.

{
  "scope": {
    "file": "src/parser.ts",
    "content_hashes": [
      "sha256:a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0a1b2",
      "sha256:b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0a1b2c3"
    ],
    "ranges": [
      { "start": { "line": 42 }, "end": { "line": 67 }, "operation": "add" }
    ]
  }
}

Hash computation: Hash the normalized content (whitespace-trimmed, optionally with comments stripped). Use SHA-256 with sha256: prefix. Each hash corresponds to a logical block of attributed code.
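The sketch below shows one possible normalization. The exact rules are an assumption, since the specification only says "whitespace-trimmed": here each line is stripped of leading and trailing whitespace, blank lines are dropped, and comments are left in place. The function name content_hash is hypothetical.

```python
import hashlib


def content_hash(block: str) -> str:
    """Hash a logical block of code after a simple whitespace normalization."""
    stripped = (line.strip() for line in block.splitlines())
    normalized = "\n".join(line for line in stripped if line)  # drop blank lines
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    return f"sha256:{digest}"
```

With this normalization, re-indenting a block leaves its hash unchanged, while any change to the tokens on a line produces a different hash. Producers and consumers would need to agree on the same normalization for hashes to match.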

Use cases:

6.7 Delete Operations

For delete operations, ranges reference line numbers in the parent commit (before deletion). The commit field points to the commit where deletion occurred.

Single deletion example:

{
  "scope": {
    "file": "src/utils.ts",
    "commit": "abc123...",
    "ranges": [
      {
        "start": { "line": 10 },
        "end": { "line": 15 },
        "operation": "delete"
      }
    ]
  }
}

This indicates lines 10–15 were deleted from src/utils.ts. Those line numbers existed in the parent commit.

Multi-hunk deletion example:

{
  "scope": {
    "file": "src/utils.ts",
    "commit": "abc123...",
    "ranges": [
      { "start": { "line": 5 }, "end": { "line": 8 }, "operation": "delete" },
      { "start": { "line": 20 }, "end": { "line": 25 }, "operation": "delete" },
      { "start": { "line": 50 }, "end": { "line": 52 }, "operation": "delete" }
    ]
  }
}

All line numbers reference positions in the parent commit. Ranges do not overlap and are typically ordered by line number.
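A producer could check the non-overlap property before writing a record. The helper below is a line-level sketch that ignores columns; the name has_overlap is hypothetical.

```python
def has_overlap(ranges: list[dict]) -> bool:
    """Return True if any two ranges share at least one line."""
    spans = sorted((r["start"]["line"], r["end"]["line"]) for r in ranges)
    # After sorting, only neighbouring spans need to be compared.
    return any(nxt[0] <= prev[1] for prev, nxt in zip(spans, spans[1:]))
```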

6.8 Merge Commits

Merge commits should not have trace records. Attribution belongs on the original commits where code was authored.

When querying attribution for lines introduced by a merge:

  1. Use git blame or similar to find the original commit
  2. Look up the trace for that commit

Tools merging branches with traces should preserve existing trace records from both branches without modification.

6.9 Model Identifier Format

Model identifiers follow the models.dev convention, using a single id string in provider/model-name format:

{
  "model": {
    "id": "anthropic/claude-opus-4-5-20251101"
  }
}
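Consumers that group statistics by provider may want to split the identifier. The sketch below assumes the provider is everything before the first slash; the function name split_model_id is hypothetical, and the 250-character limit comes from the schema in section 6.1.

```python
def split_model_id(model_id: str) -> tuple[str, str]:
    """Split 'provider/model-name' into its two parts."""
    if len(model_id) > 250:
        raise ValueError("model id exceeds the schema's maxLength of 250")
    provider, separator, name = model_id.partition("/")
    if not separator or not provider or not name:
        raise ValueError(f"expected provider/model-name, got {model_id!r}")
    return provider, name
```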

6.10 Column-Level Granularity

For inline completions, columns provide precise attribution:

{
  "scope": {
    "file": "src/app.ts",
    "ranges": [
      {
        "start": { "line": 42, "column": 15 },
        "end": { "line": 42, "column": 45 },
        "operation": "add"
      }
    ]
  }
}

This attributes columns 15–45 on line 42 to AI, while the rest of the line remains human-authored. Use column precision for:

6.11 Query Patterns

To answer "who wrote line N in file F?":

  1. Run git blame -L N,N -- F to find the commit that last modified line N
  2. Load the trace record for that commit and file:
    • Check .agent-trace/commits/{commit_prefix}.json for file list
    • Load referenced trace records from .agent-trace/records/
  3. Find the range containing line N
  4. Return the contributor from that range's trace record

6.12 Pre-Commit Tracking

For IDE blame views before code is committed, tools may maintain a staging trace file:

.agent-trace/
├── staging.json    # Pre-commit attribution (not version controlled)
└── ...

staging.json:

{
  "version": "1.0",
  "pending": [
    {
      "id": "temp-uuid",
      "file": "src/app.ts",
      "ranges": [
        { "start": { "line": 10 }, "end": { "line": 20 }, "operation": "add" }
      ],
      "contributor": { "type": "ai:agent" }
    }
  ]
}

On commit, staging traces are promoted to full trace records with the commit SHA. IDEs should read staging.json for real-time blame overlays on uncommitted changes.

The staging.json file should be added to .gitignore as it contains transient data.
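Promotion on commit might look like the sketch below. It assumes the temporary staging ID is replaced by a fresh UUID and that the timestamp is the promotion time; neither is mandated by the specification, and the name promote_pending is hypothetical.

```python
import uuid
from datetime import datetime, timezone


def promote_pending(staging: dict, commit_sha: str) -> list[dict]:
    """Turn pending staging entries into full trace records for a commit."""
    now = datetime.now(timezone.utc).isoformat(timespec="seconds").replace("+00:00", "Z")
    records = []
    for entry in staging.get("pending", []):
        records.append({
            "version": staging["version"],
            "id": str(uuid.uuid4()),  # assumption: temp IDs are not reused
            "timestamp": now,
            "contributor": entry["contributor"],
            "scope": {
                "file": entry["file"],
                "commit": commit_sha,
                "ranges": entry.get("ranges", []),
            },
        })
    return records
```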


7. Storage Formats

Agent Trace supports multiple storage mechanisms. Pick what works for your workflow.

7.1 Local file storage

The primary and recommended mechanism uses a .agent-trace/ directory at the repository root. Plain files avoid lock contention with git operations, support atomic concurrent writes, and work with standard tools.

.agent-trace/
├── config.json              # Repository-level configuration
├── commits/
│   ├── a1b2c3d4.json        # Commit summary (first 8 chars of SHA)
│   └── e5f6a7b8.json
└── records/
    ├── 2026/
    │   └── 01/
    │       ├── 550e8400-e29b-41d4-a716-446655440000.json
    │       └── 660f9500-f30c-52e5-b827-557766551111.json
    └── index.json           # Index for fast lookups
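One common way to get atomic writes with plain files is to write to a temporary file in the target directory and then rename it into place. The sketch below follows that pattern. It assumes the year and month directories are taken from the record's timestamp, which the layout above suggests but does not state; the name write_record is hypothetical.

```python
import json
import os
import tempfile
from pathlib import Path


def write_record(repo_root: Path, record: dict) -> Path:
    """Write a trace record under records/YYYY/MM/ using write-then-rename."""
    year, month = record["timestamp"][:4], record["timestamp"][5:7]
    target_dir = repo_root / ".agent-trace" / "records" / year / month
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / f"{record['id']}.json"
    fd, tmp_name = tempfile.mkstemp(dir=target_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(record, tmp, indent=2)
        os.replace(tmp_name, target)  # readers see the old file or the new one, never a partial write
    except Exception:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return target
```

Because each record has its own file named by its unique ID, concurrent writers do not contend for the same path.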

config.json:

{
  "version": "1.0",
  "policy": {
    "require_trace": true,
    "min_contributor_types": ["human", "ai:autocomplete", "ai:agent"],
    "privacy_level": "standard"
  },
  "hooks": {
    "pre-commit": true,
    "pre-push": false
  },
  "storage": {
    "commit_traces": true,
    "gitignore_staging": true
  }
}

index.json (for fast lookups):

{
  "version": "1.0",
  "last_updated": "2026-01-25T10:00:00Z",
  "by_file": {
    "src/parser.ts": [
      {
        "record_id": "550e8400-e29b-41d4-a716-446655440000",
        "commit": "a1b2c3d4",
        "ranges": [{ "start": 42, "end": 67 }]
      }
    ],
    "src/utils.ts": [
      {
        "record_id": "660f9500-f30c-52e5-b827-557766551111",
        "commit": "b2c3d4e5",
        "ranges": [{ "start": 1, "end": 30 }]
      }
    ]
  },
  "by_commit": {
    "a1b2c3d4": ["550e8400-e29b-41d4-a716-446655440000"],
    "b2c3d4e5": ["660f9500-f30c-52e5-b827-557766551111"]
  },
  "by_operation": {
    "7c9e6679-7425-40de-944b-e07fc1f90ae7": [
      "550e8400-e29b-41d4-a716-446655440000",
      "770a0600-a41d-63f6-c938-668877662222"
    ]
  },
  "statistics": {
    "total_records": 42,
    "by_contributor_type": {
      "ai:agent": 28,
      "ai:autocomplete": 10,
      "human": 4
    }
  }
}
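Since the index only duplicates information already present in the trace records, it can be rebuilt from them at any time. A rough sketch, with the hypothetical name build_index and the 8-character commit prefix used in the example above:

```python
from collections import Counter, defaultdict


def build_index(records: list[dict], last_updated: str, version: str = "1.0") -> dict:
    """Rebuild the lookup index from a list of trace records."""
    by_file, by_commit, by_operation = defaultdict(list), defaultdict(list), defaultdict(list)
    type_counts = Counter()
    for record in records:
        scope = record["scope"]
        prefix = scope.get("commit", "")[:8]
        by_file[scope["file"]].append({
            "record_id": record["id"],
            "commit": prefix,
            "ranges": [{"start": r["start"]["line"], "end": r["end"]["line"]}
                       for r in scope.get("ranges", [])],
        })
        if prefix:
            by_commit[prefix].append(record["id"])
        operation_id = record.get("operation", {}).get("id")
        if operation_id:
            by_operation[operation_id].append(record["id"])
        type_counts[record["contributor"]["type"]] += 1
    return {
        "version": version,
        "last_updated": last_updated,
        "by_file": dict(by_file),
        "by_commit": dict(by_commit),
        "by_operation": dict(by_operation),
        "statistics": {"total_records": len(records), "by_contributor_type": dict(type_counts)},
    }
```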

7.1.1 Gitignore Recommendations

Whether to commit .agent-trace/ depends on organizational needs:

Commit traces (recommended for most teams):

# Ignore only staging/temp files
.agent-trace/staging.json
.agent-trace/*.tmp

Private traces (for sensitive codebases):

# Ignore all trace data
.agent-trace/

Selective commit (hybrid approach):

# Commit config and indexes, ignore detailed records
.agent-trace/records/
.agent-trace/staging.json

Set the policy in config.json:

{
  "storage": {
    "commit_traces": true,
    "gitignore_staging": true,
    "retention_days": 365
  }
}

Here commit_traces controls whether trace records are committed to the repository, gitignore_staging keeps staging.json ignored, and the optional retention_days enables automatic cleanup of old records.

7.2 Git Notes

For tighter VCS integration, Agent Trace can use Git notes:

# Store trace in git notes
git notes --ref=agent-trace add -m '{"version":"1.0",...}' <commit-sha>

# Read trace from git notes
git notes --ref=agent-trace show <commit-sha>

# Push/pull notes
git push origin refs/notes/agent-trace
git fetch origin refs/notes/agent-trace:refs/notes/agent-trace

Tradeoffs. Git notes have known limitations at scale:

Use notes for interoperability with existing note-based tools. For most deployments, prefer local files or external services.

7.3 Git Trailers

For lightweight commit-level attribution:

commit a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0
Author: Developer <dev@example.com>
Date:   Thu Jan 23 14:30:00 2026 -0800

    Add arithmetic expression parser

    Implements recursive descent parsing for basic arithmetic.

    Agent-Trace-Version: 1.0
    Agent-Trace-AI-Contribution: 65%
    Agent-Trace-Contributors: ai:agent(anthropic/claude-opus-4-5-20251101), human
    Agent-Trace-Details: .agent-trace/commits/a1b2c3d4.json
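A consumer could extract these trailers with a few lines of string handling, as in the sketch below (the function name is hypothetical). It simply scans for lines whose key starts with Agent-Trace-; Git's own git interpret-trailers --parse command is an alternative for stricter parsing.

```python
def parse_agent_trace_trailers(message: str) -> dict[str, str]:
    """Collect Agent-Trace-* trailers from a commit message."""
    trailers = {}
    for line in message.splitlines():
        key, separator, value = line.strip().partition(": ")
        if separator and key.startswith("Agent-Trace-"):
            trailers[key] = value
    return trailers
```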

7.4 External services

For enterprise deployments, traces can be stored externally:

{
  "storage": {
    "type": "external",
    "endpoint": "https://trace.example.com/api/v1",
    "auth": {
      "type": "bearer",
      "token_env": "AGENT_TRACE_TOKEN"
    }
  }
}

7.5 Local SQLite Cache

For real-time tracking of ephemeral attribution data before commits, implementations should use a local SQLite database. This provides:

Recommended location:

~/.agent-trace/
├── cache.db           # SQLite database for ephemeral state
└── config.json        # User-level configuration

Or within the project:

.agent-trace/
├── .cache.db          # Project-specific cache (gitignored)
└── ...

Schema outline:

-- Pending attributions not yet committed
CREATE TABLE pending_traces (
    id TEXT PRIMARY KEY,
    file_path TEXT NOT NULL,
    start_line INTEGER,
    end_line INTEGER,
    contributor_type TEXT NOT NULL,
    model_id TEXT,
    session_id TEXT,
    created_at TEXT NOT NULL,
    content_hash TEXT
);

-- Index for fast file lookups
CREATE INDEX idx_pending_file ON pending_traces(file_path);

-- Mapping between local changes and eventual commits.
-- trace_id holds the ID of the promoted trace. It is not declared as a
-- foreign key because the pending_traces row is deleted after promotion.
CREATE TABLE commit_mappings (
    trace_id TEXT NOT NULL,
    commit_sha TEXT NOT NULL,
    committed_at TEXT NOT NULL
);

Workflow:

  1. On AI generation: Insert row into pending_traces
  2. On file edit: Update ranges in pending_traces as content shifts
  3. On commit: Promote pending traces to full trace records, insert into commit_mappings, write to .agent-trace/records/
  4. On successful write: Delete from pending_traces

This cache is purely local and never committed to version control. It serves as the source of truth for "blame" overlays on uncommitted changes and ensures no attribution data is lost between the moment code is generated and when it's committed.
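Steps 1, 3, and 4 of the workflow can be sketched with Python's built-in sqlite3 module. The function names and the write_record callback are hypothetical. The sketch omits a foreign key on commit_mappings, since pending rows are deleted once promoted, and it runs promotion in one transaction so that a failed write leaves the pending rows in place.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS pending_traces (
    id TEXT PRIMARY KEY, file_path TEXT NOT NULL,
    start_line INTEGER, end_line INTEGER,
    contributor_type TEXT NOT NULL, model_id TEXT, session_id TEXT,
    created_at TEXT NOT NULL, content_hash TEXT
);
CREATE INDEX IF NOT EXISTS idx_pending_file ON pending_traces(file_path);
CREATE TABLE IF NOT EXISTS commit_mappings (
    trace_id TEXT NOT NULL, commit_sha TEXT NOT NULL, committed_at TEXT NOT NULL
);
"""


def open_cache(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.row_factory = sqlite3.Row
    conn.executescript(SCHEMA)
    return conn


def record_generation(conn, trace_id, file_path, start_line, end_line, contributor_type, created_at):
    """Step 1: store a pending attribution when AI output is inserted."""
    with conn:
        conn.execute(
            "INSERT INTO pending_traces (id, file_path, start_line, end_line, contributor_type, created_at)"
            " VALUES (?, ?, ?, ?, ?, ?)",
            (trace_id, file_path, start_line, end_line, contributor_type, created_at),
        )


def promote(conn, files, commit_sha, committed_at, write_record):
    """Steps 3 and 4: write each pending trace, map it to the commit, then delete it."""
    if not files:
        return []
    placeholders = ",".join("?" for _ in files)
    with conn:  # rolls back if write_record raises
        rows = conn.execute(
            f"SELECT * FROM pending_traces WHERE file_path IN ({placeholders})", list(files)
        ).fetchall()
        for row in rows:
            write_record({**dict(row), "commit_sha": commit_sha})
            conn.execute("INSERT INTO commit_mappings VALUES (?, ?, ?)", (row["id"], commit_sha, committed_at))
            conn.execute("DELETE FROM pending_traces WHERE id = ?", (row["id"],))
    return [dict(row) for row in rows]
```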


8. Integration

8.1 Dependencies

Agent Trace has minimal runtime dependencies:

Dependency  Required     Purpose
----------  -----------  ------------------------------------------------------
Git CLI     Yes          Commit detection, blame integration, history traversal
SQLite      Recommended  Local caching of ephemeral attribution data

Git CLI requirements:

Verification:

# Check git is available
git --version

# Check user email is configured
git config user.email

# Check we're in a git repository
git rev-parse --is-inside-work-tree

Remote configuration: Agent Trace works with any Git remote configuration, including proxy addresses and custom SSH configurations. As long as the Git CLI is configured to work with the remote, Agent Trace will function correctly.

Implementations should fail gracefully when dependencies are missing, clearly indicating what's required.

8.2 IDE Integration

IDEs implementing Agent Trace should:

  1. Record traces as code is generated
  2. Show attribution in editor gutter and hover
  3. Integrate with existing blame UI
  4. Auto-generate traces on commit

8.3 Integration Guide

This section outlines how tools might implement Agent Trace end-to-end.

Phase 1: Capture

When AI generates code, immediately record attribution:

┌─────────────────┐      ┌──────────────────┐      ┌─────────────────┐
│  AI generates   │ ───► │  Write to local  │ ───► │  Update blame   │
│     code        │      │   SQLite cache   │      │     overlay     │
└─────────────────┘      └──────────────────┘      └─────────────────┘

  1. Intercept AI output as it's inserted into the editor
  2. Record contributor type, model, session, and line ranges
  3. Store in local SQLite cache with content hash
  4. Update real-time blame overlay in editor gutter

Phase 2: Track

As the user edits, maintain attribution accuracy:

┌─────────────────┐      ┌──────────────────┐      ┌─────────────────┐
│  User edits     │ ───► │  Adjust ranges   │ ───► │  Detect mixed   │
│     file        │      │   in cache       │      │   attribution   │
└─────────────────┘      └──────────────────┘      └─────────────────┘

  1. On text insertion/deletion, shift line numbers in pending traces
  2. If human edits AI-generated code, mark as mixed (threshold is implementation-defined)
  3. Track deletions of AI-generated code
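The range adjustment in step 1 and the mixed detection in step 2 can be sketched for the insertion case as follows. The field names follow the pending_traces columns from section 7.5, the function name apply_insertion is hypothetical, and treating any insertion inside an AI range as mixed is just one possible threshold.

```python
def apply_insertion(traces: list[dict], at_line: int, count: int) -> list[dict]:
    """Adjust pending traces after a human inserts `count` lines before 1-based `at_line`."""
    updated = []
    for trace in traces:
        start, end = trace["start_line"], trace["end_line"]
        new_trace = dict(trace)
        if at_line <= start:
            # Insertion above the range: the whole range moves down.
            new_trace["start_line"] = start + count
            new_trace["end_line"] = end + count
        elif at_line <= end:
            # Insertion inside the range: it grows and is no longer purely AI-authored.
            new_trace["end_line"] = end + count
            new_trace["contributor_type"] = "mixed"
        updated.append(new_trace)
    return updated
```

Deletions need the mirror-image logic, including removing traces whose lines are deleted entirely.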

Phase 3: Commit

When the user commits, promote pending traces:

┌─────────────────┐      ┌──────────────────┐      ┌─────────────────┐
│   git commit    │ ───► │  Read pending    │ ───► │  Write trace    │
│                 │      │   from cache     │      │    records      │
└─────────────────┘      └──────────────────┘      └─────────────────┘
                                                            │
                                                            ▼
                                                   ┌─────────────────┐
                                                   │  Update index,  │
                                                   │  clear cache    │
                                                   └─────────────────┘

  1. Hook into git commit (via IDE hooks or post-commit hook)
  2. Read pending traces from SQLite cache for committed files
  3. Assign the commit SHA to each trace record
  4. Write trace records to .agent-trace/records/
  5. Update the .agent-trace/commits/{commit_prefix}.json summary (named by the first 8 characters of the SHA)
  6. Update index file
  7. Clear committed entries from SQLite cache

Phase 4: Query

When displaying blame, merge Git and Agent Trace data:

┌─────────────────┐      ┌──────────────────┐      ┌─────────────────┐
│  User hovers    │ ───► │  git blame -L    │ ───► │  Load trace     │
│   on line       │      │   N,N -- file    │      │   for commit    │
└─────────────────┘      └──────────────────┘      └─────────────────┘
                                                            │
                                                            ▼
                                                   ┌─────────────────┐
                                                   │  Merge & show   │
                                                   │   attribution   │
                                                   └─────────────────┘

  1. Run git blame to find which commit last touched the line
  2. Look up trace record for that commit and file
  3. Find the range containing the queried line
  4. Display combined information: Git author + AI contributor (if any)
  5. For uncommitted lines, query SQLite cache directly

Fallback behavior:

When no trace data exists for a commit, fall back to standard git blame. This ensures graceful degradation for:


9. Privacy Considerations

9.1 Privacy Levels

Level     Description             Stored Data
--------  ----------------------  ------------------------------------------
minimal   Basic statistics only   Percentages, no line details
standard  Line-level attribution  File paths, line ranges, contributor types
detailed  Full context            Includes session summaries, model versions
full      Complete audit trail    Includes prompt hashes, timestamps

9.2 Data Minimization

Store hashes instead of full prompts. Use opaque session identifiers, not conversation content. Support aggregated-only modes for sensitive codebases.

Tools implementing Agent Trace should inform users when traces are recorded, provide opt-out mechanisms where legally required, and respect "do not track" signals.


10. Security Considerations

10.1 Attestation

For high-assurance environments, Agent Trace supports cryptographic attestation:

{
  "attestation": {
    "signature": "base64-encoded-signature",
    "algorithm": "Ed25519",
    "public_key": "base64-encoded-public-key",
    "certificate_chain": ["base64-cert-1", "base64-cert-2"]
  }
}

10.2 Integrity Protection

Trace records should be immutable once created. Modifications should create new records referencing the original. Git notes offer some tamper-evidence, because each change to a notes ref is recorded as a commit in that ref's history.

10.3 Access Control

Organizations should implement appropriate access controls for trace data. External services must support authentication and authorization.


11. Versioning and Extensibility

11.1 Specification Versioning

A major version increment indicates breaking changes to required fields. A minor version increment indicates additive changes, such as new optional fields.
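
Under this scheme, a consumer could decide whether it can read a record by comparing major versions only. The sketch below assumes version strings of the form major.minor, as in "1.0"; the function name is hypothetical.

```python
def is_readable(record_version: str, supported_major: int) -> bool:
    """Accept any record whose major version matches the consumer's."""
    major = int(record_version.split(".")[0])
    return major == supported_major


print(is_readable("1.7", supported_major=1))  # minor bumps are additive
print(is_readable("2.0", supported_major=1))  # major bumps may break required fields
```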

11.2 Extension Namespacing

Vendor extensions use reverse-domain notation:

{
  "extensions": {
    "com.github.copilot": {
      "suggestion_id": "abc123"
    },
    "dev.cursor": {
      "conversation_url": "https://cursor.com/..."
    },
    "com.jetbrains.ai": {
      "model_temperature": 0.7
    }
  }
}
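
A producer or linter could check extension keys against the reverse-domain convention before writing a record. The pattern below is an assumption about what counts as valid: lowercase labels of letters, digits, and hyphens, with at least two labels. The specification text here does not define the exact grammar.

```python
import re

# Assumed grammar: two or more lowercase dot-separated labels.
REVERSE_DOMAIN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*(?:\.[a-z0-9]+(?:-[a-z0-9]+)*)+")


def invalid_extension_keys(record: dict) -> list[str]:
    """Return extension keys that do not look like reverse-domain names."""
    return [
        key
        for key in record.get("extensions", {})
        if not REVERSE_DOMAIN.fullmatch(key)
    ]
```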

11.3 Unknown Field Handling

Consumers must ignore unknown fields to ensure forward compatibility.
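
In practice this means a parser keeps the fields it understands and drops the rest rather than failing. A minimal sketch, assuming a consumer that models only the five required fields; the TraceRecord class and parse_record function are illustrative names.

```python
from dataclasses import dataclass, fields


@dataclass
class TraceRecord:
    version: str
    id: str
    timestamp: str
    contributor: dict
    scope: dict


def parse_record(raw: dict) -> TraceRecord:
    """Build a TraceRecord from raw JSON data, skipping unrecognized keys."""
    known = {f.name for f in fields(TraceRecord)}
    return TraceRecord(**{key: value for key, value in raw.items() if key in known})
```

A record written by a newer producer with extra top-level fields still parses, while a record missing a required field raises TypeError.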


12. Conformance Levels

Agent Trace defines one conformance level. Implementations are either conformant or not.

12.1 Conformance Requirements

A conformant implementation MUST:

  1. Produce valid trace records with all required fields (version, id, timestamp, contributor, scope)
  2. Use valid contributor types from the defined enum
  3. Record file paths relative to repository root
  4. Generate unique IDs (UUIDv4 recommended)
  5. Use ISO 8601 timestamps

A conformant implementation SHOULD:

  1. Record traces at commit time (not just in staging)
  2. Support at least one storage format (local files recommended)
  3. Handle unknown fields gracefully when consuming

A conformant implementation MAY:

  1. Support additional storage formats
  2. Implement content hashes for resilient tracking
  3. Support pre-commit staging traces
  4. Include optional fields (confidence, session, operation, etc.)

12.2 Validation

Implementations should validate against the JSON schemas in section 6. A minimal valid trace record:

{
  "version": "1.0",
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "timestamp": "2026-01-25T10:00:00Z",
  "contributor": {
    "type": "ai:agent"
  },
  "scope": {
    "file": "src/app.ts"
  }
}
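
Where a JSON Schema validator is not available, the MUST requirements from 12.1 can be approximated with a few direct checks. This is a sketch under stated assumptions, not a replacement for the schemas: the contributor enum is passed in by the caller because it is defined elsewhere in the specification, scope.file is assumed to be present, and the timestamp check accepts only the ISO 8601 subset that Python's datetime.fromisoformat understands.

```python
from datetime import datetime
from pathlib import PurePosixPath

REQUIRED_FIELDS = ("version", "id", "timestamp", "contributor", "scope")


def validation_errors(record: dict, contributor_types: set[str]) -> list[str]:
    """Return a list of conformance problems; an empty list means none found."""
    errors = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in record]
    if errors:
        return errors

    if not isinstance(record["id"], str) or not record["id"]:
        errors.append("id must be a non-empty string")

    try:
        # Older Python versions reject a trailing "Z", so map it to an offset.
        datetime.fromisoformat(str(record["timestamp"]).replace("Z", "+00:00"))
    except ValueError:
        errors.append("timestamp is not ISO 8601")

    if record["contributor"].get("type") not in contributor_types:
        errors.append("unknown contributor type")

    path = PurePosixPath(record["scope"].get("file", ""))
    if not path.parts or path.is_absolute() or ".." in path.parts:
        errors.append("scope.file must be relative to the repository root")
    return errors
```

Uniqueness of IDs cannot be checked on a single record; that requires comparing against the store the record is written to.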

13. Appendix

A. MIME Types

Type            MIME Type
--------------  -----------------------------------------------
Trace Record    application/vnd.agent-trace.record+json
Commit Summary  application/vnd.agent-trace.commit-summary+json

B. URI Scheme

agent-trace://repo/commit/file#range

Examples:
agent-trace://github.com/org/repo/a1b2c3d4/src/parser.ts#L42-L67
agent-trace://local/HEAD/src/parser.ts#L42-L67
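
The template does not say how many path segments belong to the repository, so any parser has to pick a rule. The sketch below assumes one that fits both examples: the literal repo "local" is a single segment, and any other repo is host/org/name. That rule, the TraceLocation type, and the function name are assumptions, not part of the scheme.

```python
import re
from typing import NamedTuple

PREFIX = "agent-trace://"
LINE_RANGE = re.compile(r"L(\d+)(?:-L(\d+))?")


class TraceLocation(NamedTuple):
    repo: str
    commit: str
    file: str
    start_line: int
    end_line: int


def parse_trace_uri(uri: str) -> TraceLocation:
    """Split an agent-trace URI into repo, commit, file, and line range."""
    if not uri.startswith(PREFIX):
        raise ValueError("not an agent-trace URI")
    location, _, fragment = uri[len(PREFIX):].partition("#")
    parts = location.split("/")

    # Assumed rule: "local" is one segment, hosted repos are host/org/name.
    repo_len = 1 if parts[0] == "local" else 3
    if len(parts) < repo_len + 2:
        raise ValueError("URI is missing a commit or file path")

    match = LINE_RANGE.fullmatch(fragment)
    if match is None:
        raise ValueError("URI is missing a line range such as #L42-L67")
    start = int(match.group(1))
    end = int(match.group(2) or match.group(1))

    return TraceLocation(
        repo="/".join(parts[:repo_len]),
        commit=parts[repo_len],
        file="/".join(parts[repo_len + 1:]),
        start_line=start,
        end_line=end,
    )
```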

C. SARIF Export

Agent Trace supports export to SARIF for integration with security tools:

{
  "$schema": "https://raw.githubusercontent.com/oasis-tcs/sarif-spec/master/Schemata/sarif-schema-2.1.0.json",
  "version": "2.1.0",
  "runs": [
    {
      "tool": {
        "driver": {
          "name": "Agent Trace",
          "version": "1.0.0"
        }
      },
      "results": [
        {
          "ruleId": "ai-attribution",
          "message": { "text": "AI-generated code detected" },
          "locations": [
            {
              "physicalLocation": {
                "artifactLocation": { "uri": "src/parser.ts" },
                "region": { "startLine": 42, "endLine": 67 }
              }
            }
          ],
          "properties": {
            "agent-trace": {
              "contributor_type": "ai:agent",
              "confidence": 0.95
            }
          }
        }
      ]
    }
  ]
}
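
An exporter that produces the structure above could be as small as the following sketch. The function names and parameters are hypothetical; the output keys mirror the example and nothing else is added.

```python
from typing import Optional

SARIF_SCHEMA = (
    "https://raw.githubusercontent.com/oasis-tcs/sarif-spec/"
    "master/Schemata/sarif-schema-2.1.0.json"
)


def to_sarif_result(
    file: str,
    start_line: int,
    end_line: int,
    contributor_type: str,
    confidence: Optional[float] = None,
) -> dict:
    """Convert one attributed range into a SARIF result object."""
    properties = {"contributor_type": contributor_type}
    if confidence is not None:
        properties["confidence"] = confidence
    return {
        "ruleId": "ai-attribution",
        "message": {"text": "AI-generated code detected"},
        "locations": [
            {
                "physicalLocation": {
                    "artifactLocation": {"uri": file},
                    "region": {"startLine": start_line, "endLine": end_line},
                }
            }
        ],
        "properties": {"agent-trace": properties},
    }


def to_sarif_log(results: list[dict], tool_version: str = "1.0.0") -> dict:
    """Wrap results in a single-run SARIF 2.1.0 log."""
    return {
        "$schema": SARIF_SCHEMA,
        "version": "2.1.0",
        "runs": [
            {
                "tool": {"driver": {"name": "Agent Trace", "version": tool_version}},
                "results": results,
            }
        ],
    }
```

Serializing the return value of to_sarif_log with json.dumps yields a document of the same shape as the example.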

D. Comparison with Existing Systems

Feature                    Agent Trace  Git Blame  DCO  REUSE
-------------------------  -----------  ---------  ---  -----
AI Attribution             Yes          No         No   No
Line-level Granularity     Yes          Yes        No   No
Model Tracking             Yes          No         No   No
Session Context            Yes          No         No   No
VCS Integrated             Yes          Yes        Yes  No
Privacy Controls           Yes          No         No   No
Cryptographic Attestation  Yes          No         Yes  No

E. Adoption

For organizations adopting Agent Trace:

  1. Start with the minimal privacy level and expand gradually
  2. Use Git trailers for lightweight integration
  3. Add CI validation to ensure consistent attribution
  4. Train developers on tracing practices
  5. Define organizational policies for AI contribution thresholds

F. Frequently Asked Questions

Q: If the agent invokes a linter/prettier that reformats code, are those changes attributed to the agent?

A: Mostly, yes. Agent Trace uses normalized comparison to attribute formatting changes to the original AI contributor.

Changes that should remain attributed to the agent:

Change Type        Before     After        Still Matches?
-----------------  ---------  -----------  --------------
Whitespace         const x=1  const x = 1  Yes
Semicolons         return x   return x;    Yes
Trailing commas    [a, b]     [a, b,]      Yes
Quote style        'hello'    "hello"      Yes
Template literals  `hi`       "hi"         Yes (JS/TS)
Arrow parens       (x) =>     x =>         Yes (JS/TS)

Changes that should break attribution:

Change Type         Example                Still Matches?
------------------  ---------------------  --------------
Import reordering   Sorted imports         No
Variable renaming   foo → bar              No
Structural changes  Object key reordering  No
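
The specification does not define the normalization algorithm here, so the following is only a rough, regex-based approximation for JS/TS snippets that reproduces the outcomes listed in the two tables. It strips all whitespace, including inside string literals, which a production implementation would avoid by comparing tokens instead of raw text. The function names are hypothetical.

```python
import re


def normalize(code: str) -> str:
    """Reduce a JS/TS snippet to a form that ignores formatter-only changes."""
    code = re.sub(r"\s+", "", code)                # whitespace
    code = code.replace(";", "")                   # semicolons
    code = re.sub(r"['`]", '"', code)              # quote style, plain template literals
    code = re.sub(r",(?=[\])}])", "", code)        # trailing commas
    code = re.sub(r"\((\w+)\)=>", r"\1=>", code)   # parens around a single arrow parameter
    return code


def still_matches(before: str, after: str) -> bool:
    """True when two snippets differ only by the formatting rules above."""
    return normalize(before) == normalize(after)
```

Renames and reorderings change the normalized text itself, so they fail the comparison without any extra rule.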

When post-processing modifies code, implementations should record this in the post_processing field:

{
  "post_processing": {
    "modified": true,
    "tools": ["prettier@3.0.0", "eslint@9.0.0 --fix"]
  }
}

This preserves the audit trail while correctly attributing the code to its original author.


License

This specification is released under CC BY 4.0.

Contributing

Contributions welcome. See the contribution guidelines.