What is agent-readable feedback?

Agent-readable feedback is product feedback structured so an AI coding agent can act on it without human translation: markdown text, embedded screenshots with stable URLs, and explicit source references.

Agent-readable feedback is product feedback structured so an AI coding agent can act on it without human translation. The artifact is a markdown document with cropped screenshots at stable URLs, dictated or written context for each finding, and an explicit source URL pointing at the page or component under review. If you are shipping with Cursor, Claude Code, Windsurf, or any other AI-in-the-loop stack, this is the format that survives the handoff. The rest of the guides hub walks through related concepts; this page is the definition.

What you'll learn

  • The three properties every agent-readable artifact must have
  • Why video, chat screenshots, and free-form prose fall apart in agent workflows
  • What the format looks like in practice, with a worked example
  • Where CobaltCapture fits as the tool built for this category
  • Common questions answered for quick reference

Three properties of agent-readable feedback

Agent-readable feedback has three non-negotiable properties. Drop any one and the format breaks.

Plain text, structured as markdown. Coding agents read text. They do not watch video, parse Figma comment threads, or open .scribe files. Markdown is the universal input because every modern agent renders it natively: Cursor's composer, Claude Code's terminal, Windsurf's Cascade, Lovable and Bolt's chat boxes. Headings (##) separate findings, bullet points enumerate steps, fenced code blocks carry selectors and snippets, and inline links carry source URLs. The structure itself is what lets the agent know that finding 1 ends and finding 2 begins.

Stable image URLs, not bitmaps in attachments. A screenshot pasted into chat is a binary stuck inside a message; the agent cannot reliably fetch it from outside that context, and it cannot reference it back to you. An agent-readable screenshot lives at a public URL, served as a PNG over HTTPS, with the URL embedded in the markdown via standard image syntax: ![Caption](https://example.com/img/1.png). The agent fetches the image inline, processes it as visual context alongside the surrounding text, and patches against the component it shows.

Source URLs and explicit references. The agent has to know what it is looking at. A screenshot of a broken submit button is useful only if the document also says "this is /checkout at widths under 380px" or "this is the <SubmitButton> component on the cart page." Without a source URL or a component name, the agent has to guess which file to open, and guessing is where bad patches come from. Every finding in an agent-readable document carries either a URL, a route, or a component name. Ideally all three.

How non-agent-readable formats fail

The formats most teams use today were designed for human readers and break in predictable ways when an agent is the next consumer.

Video. Loom, Vidyard, and screen recordings are the default for "here is what is broken." They work for a human on the other side of an async handoff. They do not work for an agent because coding agents cannot watch video. Pasting a Loom URL into Cursor's composer gives Cursor the URL string and nothing else. The visual content of the recording, the spoken commentary, the timestamps where issues appear, all of it stays trapped in the video. The only path forward is transcribing the video by hand into text, at which point you have done the work twice.

Bitmaps in chat. A screenshot dropped in Slack or a Linear comment is technically an image, but it is not addressable. The image has no public URL the agent can fetch. The caption sits two messages above or below the image. The thread structure means context is scattered. When you paste the screenshot into Cursor, you get one image and no surrounding text. The agent has the picture but no idea what page it came from, what the user was trying to do, or which component to patch.

Free-form prose. A wall of text describing five issues is what most bug reports look like. The agent can read it, but ambiguity is high: which paragraph belongs to which screenshot, which selector applies where, what order to fix things in. Structure is not decoration. It is the resolution of ambiguity. A document with five ## headings, one per finding, with a screenshot and a source URL under each, lets the agent patch them in order without asking clarifying questions.

What the format looks like in practice

Here is what an agent-readable feedback document actually contains. This is the literal markdown you would paste into Cursor or save as feedback.md in your repo.

# Staging review, checkout flow

Source: https://staging.example.com/checkout

## Finding 1: Submit button overflows on mobile

Source: https://staging.example.com/checkout (viewport <= 380px)

![Submit button overflow](https://cobaltcapture.com/r/abc12345/img/1.png)

The submit button breaks out of its container at widths under 380px on
iOS Safari and Chrome Android. Looks like the container has a fixed
min-width or the button has an absolute width set. Should constrain to
the parent and let it wrap or shrink.

## Finding 2: Email validation fires too early

Source: https://staging.example.com/checkout#email

![Premature validation](https://cobaltcapture.com/r/abc12345/img/2.png)

Validation error shows up after the second keystroke, before the user
has finished typing. Should debounce by 300ms or only validate on blur.
The component is probably `<EmailField>` in the checkout form.

Note the structural elements. One H2 per finding, so the agent can iterate through them. A source URL on each finding, so the agent knows which route or component is in scope. A screenshot referenced by public URL, so the agent fetches it inline. Written context describing both what is wrong and a hypothesis about the fix, which gives the agent something concrete to act on rather than open-ended exploration.

How CobaltCapture fits in

CobaltCapture is the tool built for this category. It exists because writing agent-readable feedback by hand across ten findings is slow, and most existing capture tools produce the wrong output format.

The workflow is three steps. Open cobaltcapture.com, hit Capture screen, and pick the window you want feedback on. Drag a box around the broken part, then hit Dictate and talk through the problem. Repeat for each finding. Hit Publish and you get a public review URL plus a one-click markdown export.

Under the hood, CobaltCapture stamps each screenshot with the source URL automatically, hosts the cropped PNG at a stable public URL, and turns your dictated voice into editable markdown text. The output is the format described above, produced in a fraction of the time it takes to write it manually. Paste the URL into Cursor or save the export as feedback.md and reference it from any agent session. For the category-level positioning, see agent feedback.

Frequently asked questions

What does "agent-readable" mean?

Agent-readable means the artifact can be ingested by an AI coding agent directly, without a person summarizing or transcribing it first. In practice that means plain text, markdown structure, and image URLs the agent can fetch. Video, proprietary annotation files, and screenshots buried in chat threads do not qualify.

Why can't AI agents just read a Slack thread or Loom?

Slack threads scatter screenshots, replies, and links across multiple messages with no stable ordering or source-of-truth URL. Loom produces video, which coding agents cannot watch frame by frame. Both formats force a human to summarize before the agent gets useful input, which defeats the point of the handoff.

What format do AI coding agents actually consume best?

Markdown with embedded image links. Cursor, Claude Code, Windsurf, and most other coding agents render markdown natively, follow image URLs inline, and treat headings and code fences as structural cues. A single markdown document with one heading per finding is the closest thing to a universal input format.

Do I need a special tool, or can I write agent-readable feedback by hand?

You can absolutely write it by hand. The format is just markdown with screenshots hosted somewhere stable and source URLs noted per finding. A tool like CobaltCapture exists because doing this by hand across ten findings is slow: you have to capture, crop, upload, paste, label, and type out the context yourself.

What makes a screenshot agent-readable?

Three things. It lives at a stable public URL the agent can fetch, it is cropped tightly to the relevant region so the agent is not guessing where to look, and it sits next to written context that names the component or page it came from. A raw PNG dropped in chat without a URL or caption fails all three tests.

Is agent-readable feedback the same as a bug report?

Overlapping but not identical. A bug report describes one defect for a human triager. Agent-readable feedback can cover bugs, design changes, copy edits, or feature requests, and it is shaped for direct consumption by a coding agent rather than a ticket queue. The format is the difference, not the content.

Frequently asked questions

What does 'agent-readable' mean?

Agent-readable means the artifact can be ingested by an AI coding agent directly, without a person summarizing or transcribing it first. In practice that means plain text, markdown structure, and image URLs the agent can fetch. Video, proprietary annotation files, and screenshots buried in chat threads do not qualify.

Why can't AI agents just read a Slack thread or Loom?

Slack threads scatter screenshots, replies, and links across multiple messages with no stable ordering or source-of-truth URL. Loom produces video, which coding agents cannot watch frame by frame. Both formats force a human to summarize before the agent gets useful input, which defeats the point of the handoff.

What format do AI coding agents actually consume best?

Markdown with embedded image links. Cursor, Claude Code, Windsurf, and most other coding agents render markdown natively, follow image URLs inline, and treat headings and code fences as structural cues. A single markdown document with one heading per finding is the closest thing to a universal input format.

Do I need a special tool, or can I write agent-readable feedback by hand?

You can absolutely write it by hand. The format is just markdown with screenshots hosted somewhere stable and source URLs noted per finding. A tool like CobaltCapture exists because doing this by hand across ten findings is slow: you have to capture, crop, upload, paste, label, and type out the context yourself.

What makes a screenshot agent-readable?

Three things. It lives at a stable public URL the agent can fetch, it is cropped tightly to the relevant region so the agent is not guessing where to look, and it sits next to written context that names the component or page it came from. A raw PNG dropped in chat without a URL or caption fails all three tests.

Is agent-readable feedback the same as a bug report?

Overlapping but not identical. A bug report describes one defect for a human triager. Agent-readable feedback can cover bugs, design changes, copy edits, or feature requests, and it is shaped for direct consumption by a coding agent rather than a ticket queue. The format is the difference, not the content.

Capture your first review.

About a minute from open tab to a shareable URL your agent can ingest.

Start capturing