What an Agent-Readable Bug Report Actually Contains

May 19, 2026 · 5 min read

A coding agent reads your bug report the way a compiler reads source: literally, in order, once. If a field is missing, it guesses. If two fields contradict each other, it picks one and runs. The fix is to write reports with named parts the agent can map to actions. Below are the parts that matter, defined precisely, with notes on when each one decides whether the agent ships a fix or a wrong patch.

Identity and scope fields

Title. A single sentence naming the broken behavior and the surface it appears on. Example: "Checkout submit button does nothing on mobile Safari after applying a coupon." The title is the only thing an agent sees in a list of issues, so it has to disambiguate against every other open bug.

Surface. The route, component, or file where the bug lives. Something like /checkout, CartSummary.tsx, or "the settings modal opened from the avatar menu." Surface matters because agents that grep before they edit need a starting point. Without it, Claude Code or Cursor will open the wrong file and confidently change it.

Scope. The blast radius of the fix you are authorizing. "Only the mobile breakpoint," "do not touch the Stripe call," "frontend only." Scope prevents the agent from refactoring three unrelated files because it noticed they looked stale. See how to give feedback to an AI coding agent for the longer treatment.
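One way to make Scope checkable rather than advisory is to compare the files a patch touched against it. What follows is a minimal sketch, assuming scope is written down as glob patterns; the function name, the patterns, and the file paths are hypothetical and not part of any agent's API.

```python
from fnmatch import fnmatch


def out_of_scope(changed_paths, allowed_patterns):
    """Return the changed paths that match none of the allowed patterns.

    Note: fnmatch's "*" also matches "/" characters, so "src/checkout/*"
    covers nested files under that directory as well.
    """
    return [
        path
        for path in changed_paths
        if not any(fnmatch(path, pattern) for pattern in allowed_patterns)
    ]


# Hypothetical scope: the checkout code and one component, nothing else.
allowed = ["src/checkout/*", "src/components/CartSummary.tsx"]

# Hypothetical list of files the agent's patch modified.
changed = ["src/components/CartSummary.tsx", "src/payments/stripe.ts"]

violations = out_of_scope(changed, allowed)
print(violations)
```

An empty result means the patch stayed inside the authorized blast radius; anything else is a file to question before merging.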

Behavior fields

Repro. An ordered list of steps from a known starting state to the failure. Each step is one user action. "Open /products/42 in a fresh incognito window. Click Add to cart. Click the cart icon. Click Checkout." Repro is the field most often written badly: people skip the starting state and jump straight to the click that failed. The agent then cannot replay the sequence.

Expected. What should happen at the last step of the repro. One sentence. "The Pay button becomes enabled within 200ms." Expected is where you encode the acceptance criterion. If you leave it out, the agent invents one based on what the code currently does, which is exactly the wrong reference.

Actual. What happens instead. One sentence, matched to the same last step. "The Pay button stays disabled and the console logs Uncaught TypeError: Cannot read properties of undefined (reading 'total')." Whenever you have a captured error message, quote it verbatim alongside Actual.

Frequency. Always, sometimes, or once. If sometimes, name the condition you have observed. Agents handle "always on Safari 17, never on Chrome" very differently from "flaky." Flaky is a request for instrumentation, not a fix.
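The behavior fields above can be checked mechanically before a report reaches an agent. This is a sketch under stated assumptions: the `Behavior` class, its field names, and the problem messages are illustrative inventions, and the checks only catch missing fields, not badly written ones.

```python
from dataclasses import dataclass
from typing import List

ALLOWED_FREQUENCIES = {"always", "sometimes", "once"}


@dataclass
class Behavior:
    repro: List[str]      # ordered steps, one user action each
    expected: str         # acceptance criterion at the last step
    actual: str           # observed result at the same step
    frequency: str        # "always", "sometimes", or "once"
    condition: str = ""   # required when frequency is "sometimes"


def behavior_problems(b: Behavior) -> List[str]:
    """Return a list of problems; an empty list means nothing is missing."""
    problems = []
    if not b.repro:
        problems.append("repro: no steps given")
    if not b.expected.strip():
        problems.append("expected: missing acceptance criterion")
    if not b.actual.strip():
        problems.append("actual: missing observed result")
    if b.frequency not in ALLOWED_FREQUENCIES:
        problems.append("frequency: must be always, sometimes, or once")
    elif b.frequency == "sometimes" and not b.condition.strip():
        problems.append("frequency: 'sometimes' needs the observed condition")
    return problems


report = Behavior(
    repro=[
        "Open /products/42 in a fresh incognito window.",
        "Click Add to cart.",
        "Click the cart icon.",
        "Click Checkout.",
    ],
    expected="The Pay button becomes enabled within 200ms.",
    actual="The Pay button stays disabled.",
    frequency="sometimes",
)
print(behavior_problems(report))
```

The example report is flagged because it says "sometimes" without naming a condition, which is the flaky case that calls for instrumentation rather than a fix.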

Environment fields

Build. The commit SHA, branch, or deploy URL the bug was seen on. Example: staging.example.com at a1b2c3d. Without a build, the agent cannot tell whether the bug is already fixed on main.

Client. Browser and OS, plus viewport size if layout is involved. "Chrome 124 on macOS 14, viewport 1440x900." For mobile bugs, name the device model too; it often matters more than the OS version.

Account state. Who was logged in, on what plan, with which feature flags. "Logged in as a free-tier user with new_checkout flag on." Account state is a frequent silent cause of "cannot reproduce" closes.

Evidence fields

Screenshot. A still frame of the failure, cropped to the relevant region. Not a video. Most agents do not watch videos, and the ones that do extract frames anyway. A cropped still with a numbered pin on the broken element is what gets acted on. The case for stills over recordings is laid out in screen capture versus screen recording.

Pin. A numbered marker on the screenshot pointing at the specific pixel region the comment refers to. Pin 1 on the disabled button, pin 2 on the toast that should have appeared. Pins resolve the ambiguity of "the button" when the screen has eleven buttons.

Item. One screenshot plus its comment, or a free-floating comment with no image. A bug report is usually one item per distinct symptom. Keep items atomic so an agent can resolve them one at a time.

Console and network excerpts. The exact text of the error, the failing request line, the response status. Paste them as code, not as a screenshot of the devtools panel, so the agent can grep the message against the source.
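The relationship between items, screenshots, and pins can be sketched as a small data structure. Everything here is an assumption made for illustration: the `Pin` and `Item` classes, the pixel coordinates, and the convention that comments refer to pins as "pin N". The check catches a comment that mentions a pin nobody placed.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Pin:
    number: int
    x: int  # pixel position on the screenshot (hypothetical convention)
    y: int


@dataclass
class Item:
    comment: str
    screenshot: Optional[str] = None  # None for a free-floating comment
    pins: List[Pin] = field(default_factory=list)


def dangling_pin_refs(item: Item) -> List[int]:
    """Return pin numbers mentioned in the comment but not placed on the image."""
    placed = {pin.number for pin in item.pins}
    mentioned = {
        int(n)
        for n in re.findall(r"\bpin (\d+)\b", item.comment, flags=re.IGNORECASE)
    }
    return sorted(mentioned - placed)


item = Item(
    comment="Pin 1 is the disabled Pay button. Pin 2 marks where the toast should appear.",
    screenshot="checkout.png",
    pins=[Pin(number=1, x=312, y=540)],
)
print(dangling_pin_refs(item))
```

Here the comment refers to pin 2 but only pin 1 exists, so the item would send the agent looking for a marker that is not on the screenshot.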

Output format

Markdown export. The whole report rendered as plain markdown at a stable URL the agent can fetch. This is the format Cursor, Aider, Cline, and Claude Code read most reliably. Cobalt Capture publishes every review at /r/<slug>/markdown for exactly this reason; see how it is structured in the markdown bug report format reference, and the broader argument in what is agent-readable feedback.

Stable URL. A link the agent can re-fetch on later turns without authentication. If your bug tracker requires a login token, the agent loses access between sessions and starts hallucinating the contents.
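To make the markdown export concrete, here is a rough sketch of rendering a report as plain markdown with console text in a code fence. The layout, heading levels, and field names are assumptions chosen for the example; this is not the format any particular tool publishes.

```python
FENCE = "`" * 3  # built indirectly so the example can itself sit inside a fence


def render_report(title, surface, scope, items):
    """Render a report dict-of-fields as plain markdown text."""
    lines = [f"# {title}", "", f"- Surface: {surface}", f"- Scope: {scope}", ""]
    for number, item in enumerate(items, start=1):
        status = "resolved" if item.get("resolved") else "open"
        lines.append(f"## Item {number} ({status})")
        lines.append("")
        lines.append(item["comment"])
        if item.get("console"):
            # Console text goes in a fence so the agent can grep it verbatim.
            lines += ["", FENCE + "text", item["console"], FENCE]
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


markdown = render_report(
    title="Checkout submit button does nothing on mobile Safari after applying a coupon",
    surface="/checkout",
    scope="frontend only",
    items=[
        {
            "comment": "The Pay button stays disabled after the coupon is applied.",
            "console": "Uncaught TypeError: Cannot read properties of undefined (reading 'total')",
        }
    ],
)
print(markdown)
```

Because the output is plain text with predictable headings, an agent can fetch it once and locate each item and error string without parsing a UI.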

Workflow fields

Comment thread. Per-item replies from anyone with the link, used to add information after the report is filed. Useful when the first repro turns out to be incomplete.

Resolved flag. Per-item state set by the owner once the fix lands. The agent should be told to ignore resolved items on the next pass. This is the smallest possible workflow primitive, and it is enough. If you need swimlanes and assignees, you need a different tool.
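The resolved flag is simple enough to show in a few lines. A minimal sketch, assuming each item is a dict with an optional boolean `resolved` key (a hypothetical shape, not a defined schema):

```python
def open_items(items):
    """Keep only the items the agent should still work on."""
    return [item for item in items if not item.get("resolved", False)]


items = [
    {"comment": "Pay button stays disabled.", "resolved": True},
    {"comment": "Toast never appears after coupon is applied."},
]

worklist = open_items(items)
print([item["comment"] for item in worklist])
```

Items without the flag are treated as open, so a freshly filed item is never skipped by accident.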

Once the vocabulary above is in your reports, the agent stops asking clarifying questions and starts producing patches. Start a report at a fresh capture session and see how thin the friction can get, or read the QA bug report use case for a worked example end to end.