What I Learned Building a Cursor-Style AI Coding Panel in ~300 Lines

Everyone talks about AI coding agents like magic. I built a working Claude chat panel into VS Code as one ~300-line extension, and it taught me exactly how thin the demo layer is — and where the genuinely hard part (context and multi-file edits) actually begins.

Dhileep Kumar · Jun 13, 2026 · 7 min read

AI coding tools have quietly split into two things wearing similar names. There is autocomplete — the inline suggestions that finish your line — and there is the agent, which takes a written task, edits across files, runs commands, and comes back with a diff. Most people use Claude Code, Cursor and their cousins as the former and feel mildly let down. The teams getting real leverage use them as the latter. That is a genuinely different skill, and I understand it better now because I spent a weekend building a small version of one of these tools myself.

The project is called mini-cursor. It is a VS Code extension that drops a streaming, context-aware Claude chat panel into the sidebar — a deliberately un-forked, minimal take on what Cursor does. It is not a product and it is not finished. But building even the skeleton forced me to make every decision the polished tools make invisibly, and that is where the real education was.

The 'AI agent in your editor' is thinner than the demos suggest

Here is the honest headline: the whole extension is about 307 lines of source — 168 lines of TypeScript, 60 lines of webview JavaScript, and 79 lines of CSS. The bundled output, which includes the entire Anthropic SDK, is roughly 162 KB. That is the whole thing. A chat panel that gathers your editor context, sends it to Claude, and streams the answer back token by token fits in a single source file you can read over a coffee.

That should reframe how you think about these tools. The chat-with-your-editor layer — the part that feels like magic in a launch video — is thin. It is API glue plus a webview. The heavy engineering, the part that actually separates a toy from Cursor, lives in two places the demos gloss over: how you decide what context to send, and how you safely apply edits back into the codebase. My extension solves the first one crudely and does not attempt the second at all. Naming that boundary honestly is, I think, the most useful thing a first-hand build can offer.

The chat panel is 300 lines. Everything that makes an AI coding agent actually good — context selection and safe multi-file edits — is the part that isn't in those 300 lines.

Context is the whole game, and it's a token-budget problem

An agent is only as good as what it can see. But you cannot just shovel the entire repository into every request — you pay for every token in latency and money. So even in a v1, you are immediately forced into a policy decision: what slice of the editor do you actually send?

The rule I landed on is deliberately dumb and it works surprisingly well for a single-file assistant. If you have text selected, I send only the selected lines with their line numbers. If nothing is selected, I send the whole active file — but capped at 12,000 characters, with a truncation marker if it runs over. Selection always wins, because a selection is the strongest signal you can give about what you actually mean.

typescript

const selectedText = sel.isEmpty ? "" : doc.getText(sel);
const parts = ["Active file: " + rel + " (" + doc.languageId + ")"];
if (selectedText) {
  parts.push("Selected lines " + (sel.start.line + 1) + "-" + (sel.end.line + 1) + ":\n...");
} else {
  // No selection -> send the whole (small) file, capped to keep tokens sane.
  const full = doc.getText();
  const capped = full.length > 12000
    ? full.slice(0, 12000) + "\n...(truncated)"
    : full;
  parts.push("File contents:\n" + capped);
}

That 12,000-character cap is not a benchmark — I did not measure an optimal number. It is a pragmatic guardrail, chosen to keep token cost and latency bounded on a normal source file. But writing that one clamp is what made the next hard problem obvious. The moment your file is bigger than the cap, truncation means the model is reasoning about a document it can only half-see. That is exactly why retrieval — pulling the relevant chunks from across the repo instead of blindly sending one whole file — is the single biggest quality lever left on the table. In my roadmap notes I call codebase RAG the next layer for precisely this reason. The naive version teaches you why the sophisticated version exists.
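
To make the idea of pulling relevant chunks concrete, here is a minimal sketch of the crudest retrieval I can think of: split files into fixed-size line chunks, score each chunk by how many words it shares with the question, and fill the same 12,000-character budget with the best ones. None of this is in mini-cursor. The names `Chunk`, `chunkFile`, `terms` and `selectChunks`, the 40-line chunk size, and keyword overlap as the scoring method are all assumptions made for illustration; a real codebase RAG layer would more likely use embeddings or a proper index.

```typescript
// Hypothetical sketch, not mini-cursor's code: keyword-overlap retrieval over line chunks.
interface Chunk {
  file: string;
  startLine: number; // 1-based line where the chunk begins
  text: string;
}

// Split one file into consecutive chunks of `linesPerChunk` lines.
function chunkFile(file: string, contents: string, linesPerChunk = 40): Chunk[] {
  const lines = contents.split("\n");
  const chunks: Chunk[] = [];
  for (let i = 0; i < lines.length; i += linesPerChunk) {
    chunks.push({ file, startLine: i + 1, text: lines.slice(i, i + linesPerChunk).join("\n") });
  }
  return chunks;
}

// Lowercased, deduplicated identifier-like words in a piece of text.
function terms(text: string): string[] {
  const seen: Record<string, true> = Object.create(null);
  for (const word of text.toLowerCase().match(/[a-z_][a-z0-9_]*/g) || []) {
    seen[word] = true;
  }
  return Object.keys(seen);
}

// Rank chunks by how many query terms they contain, then greedily fill the budget.
function selectChunks(query: string, chunks: Chunk[], budgetChars = 12000): Chunk[] {
  const queryTerms = terms(query);
  const ranked = chunks
    .map((chunk) => {
      const chunkTerms = terms(chunk.text);
      const score = queryTerms.filter((w) => chunkTerms.indexOf(w) !== -1).length;
      return { chunk, score };
    })
    .filter((entry) => entry.score > 0)
    .sort((a, b) => b.score - a.score);

  const picked: Chunk[] = [];
  let used = 0;
  for (const { chunk } of ranked) {
    if (used + chunk.text.length > budgetChars) continue; // skip chunks that would overflow
    picked.push(chunk);
    used += chunk.text.length;
  }
  return picked;
}
```

Even this toy version changes the failure mode: instead of the model seeing the first half of one large file, it sees the few regions that mention what you asked about, each tagged with its file and starting line.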

Streaming, memory, and secrets: the unglamorous decisions

Three implementation choices ended up mattering more than I expected, and none of them are the part anyone demos.

The first is streaming. A reply that arrives all at once after a silent pause feels broken, even when the total wait is short. So the extension uses the Anthropic SDK's streaming interface and forwards each text delta to the webview the instant it arrives, appending it to the current assistant bubble. Separately, the full history is replayed to the model on every turn so the conversation actually remembers itself.

typescript

let full = ""; // accumulates the streamed reply so it can be stored in history
const stream = client.messages.stream({
  model,
  max_tokens: maxTokens,
  system: "You are Mini Cursor, a coding assistant embedded in VS Code. Be concise...",
  messages: this.history,
});

stream.on("text", (delta) => {
  full += delta;
  this.post({ type: "delta", text: delta });
});

await stream.finalMessage();
this.history.push({ role: "assistant", content: full });
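
The receiving end of those `delta` messages can stay just as small. The sketch below shows one way the webview side could fold incoming messages into a list of chat bubbles. Only the `{ type: "delta", text }` shape comes from the code above; the `start` and `done` messages, the `Bubble` type and the `reduce` function are hypothetical names introduced for illustration, and the real webview script may be organized quite differently.

```typescript
// Hypothetical sketch of the webview side: fold host messages into chat bubbles.
// Only the "delta" message shape appears in the article; "start" and "done" are assumed.
type HostMessage =
  | { type: "start" }
  | { type: "delta"; text: string }
  | { type: "done" };

interface Bubble {
  role: "user" | "assistant";
  text: string;
  streaming: boolean; // true while the assistant is still producing this bubble
}

function reduce(bubbles: Bubble[], msg: HostMessage): Bubble[] {
  const last = bubbles[bubbles.length - 1];
  switch (msg.type) {
    case "start":
      // Open an empty assistant bubble that later deltas append to.
      return [...bubbles, { role: "assistant", text: "", streaming: true }];
    case "delta":
      if (!last || !last.streaming) return bubbles; // ignore a stray delta
      return [...bubbles.slice(0, -1), { ...last, text: last.text + msg.text }];
    case "done":
      if (!last || !last.streaming) return bubbles;
      return [...bubbles.slice(0, -1), { ...last, streaming: false }];
  }
}
```

In a real webview this function would be called from a `window.addEventListener("message", ...)` handler, with the DOM re-rendered from the returned list.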

The second is where memory lives. I keep the conversation history in the extension host — the Node side — as a typed array of message params, not in the webview. That felt like an odd choice until I hit the reason: VS Code can tear down a webview whenever it is hidden. If the transcript lived in the webview, hiding the panel would wipe your session. Keeping it host-side (plus a flag to retain context when the view is hidden) is what gives the chat durable per-session memory. It is a small architectural decision that only reveals itself once you have watched state vanish.
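
A rough sketch of that split, under stated assumptions: the host owns the transcript, and the webview is treated as a disposable renderer that can be rebuilt from it at any time. The `ChatSession` class, the `ViewMessage` shape and the `reset` message are hypothetical names, not the extension's actual code. In a real extension the replayed messages would be sent with `webview.postMessage` when `resolveWebviewView` runs again.

```typescript
// Hypothetical sketch: the extension host owns the transcript, the webview only renders it.
type Role = "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

// Messages the host would post to the webview (names assumed for illustration).
type ViewMessage =
  | { type: "reset" }
  | { type: "message"; role: Role; text: string };

class ChatSession {
  private history: ChatMessage[] = [];

  // Record a finished turn; this array is also what gets replayed to the model.
  record(role: Role, content: string): void {
    this.history.push({ role, content });
  }

  // A copy of the history, suitable for passing as the `messages` of the next request.
  messages(): ChatMessage[] {
    return this.history.map((m) => ({ ...m }));
  }

  // Everything a freshly created webview needs in order to redraw the conversation.
  replayForView(): ViewMessage[] {
    const redraw = this.history.map(
      (m): ViewMessage => ({ type: "message", role: m.role, text: m.content }),
    );
    return [{ type: "reset" }, ...redraw];
  }
}
```

Because nothing in `ChatSession` depends on the webview existing, hiding or destroying the panel costs only a redraw.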

The third is secrets. The Anthropic API key never touches settings.json or the source. It goes into VS Code's SecretStorage via a password input box, and that is the only place it lives. This is the kind of thing that is trivial to get right when you build it yourself and genuinely worth insisting on: an API key committed to a dotfile is a real incident, not a hypothetical.
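
The flow is small enough to sketch. The version below is written against a tiny `SecretStore` interface that mirrors the `get` and `store` methods of VS Code's `SecretStorage`, so it runs without the editor. The interface, the `getApiKey` function and the key name are assumptions for illustration, not the extension's actual code.

```typescript
// Hypothetical sketch: look up the API key in secret storage, prompting once if absent.
// SecretStore mirrors the get/store methods of vscode.SecretStorage.
interface SecretStore {
  get(key: string): PromiseLike<string | undefined>;
  store(key: string, value: string): PromiseLike<void>;
}

const KEY_NAME = "miniCursor.anthropicApiKey"; // assumed key name, not from the source

async function getApiKey(
  secrets: SecretStore,
  prompt: () => PromiseLike<string | undefined>,
): Promise<string | undefined> {
  const existing = await secrets.get(KEY_NAME);
  if (existing) return existing;

  const answer = await prompt();
  const entered = answer ? answer.trim() : "";
  if (!entered) return undefined; // user cancelled or submitted nothing

  await secrets.store(KEY_NAME, entered);
  return entered;
}
```

Inside an extension, the call would look roughly like `getApiKey(context.secrets, () => vscode.window.showInputBox({ password: true, prompt: "Anthropic API key" }))`, where `context` is the `ExtensionContext` passed to `activate`.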

Why I didn't fork VS Code — and what's deliberately still missing

Cursor forked the entire VS Code editor. That is a huge undertaking, and the obvious question when you start a project like this is whether you have to do the same. My conclusion, which I wrote into the README as a design stance, is: not for a v1 or v2. You fork the editor only when the editor chrome itself is the bottleneck — when you need to change how the UI fundamentally behaves. For a chat panel plus context gathering, an extension is plenty. Cline and Continue are living proof that serious AI coding tools ship as extensions. Forking is a cost you pay later, if ever, not a prerequisite.

It is just as important to be clear about what mini-cursor does not do, because the gap is the whole point. This is a v1 skeleton. Here is what genuinely works today, and what is only a documented plan:

Works now: a sidebar chat panel, active-file and selection context, full-history conversations, and token-by-token streaming from Claude.
Roadmap, not built: codebase RAG so the agent can pull relevant chunks from the whole repo instead of one capped file.
Roadmap, not built: agentic multi-file edits — the model proposing changes and applying them across the project.
Roadmap, not built: inline tab-style autocomplete.

The one design decision I did make for that future agentic layer is worth stating, because it is a lesson I did not have to learn the hard way — I got to learn it from other people's failures first. When the extension eventually applies edits, it will do so with search-and-replace blocks, never line numbers. Line numbers drift the instant a file changes, and a model editing by line number is a classic way to corrupt code confidently. Anchoring edits to the actual text you want to replace is more robust. Pre-committing to that in the roadmap is the difference between designing for a known failure mode and rediscovering it in production.
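
Since that layer is not built yet, the following is only a sketch of what a text-anchored edit could look like. The `EditBlock` shape and the `applyEdit` function are hypothetical. The one rule it encodes is that an edit applies only when its search text matches exactly once, and fails loudly otherwise instead of guessing.

```typescript
// Hypothetical sketch: apply one search-and-replace edit, anchored to text, not line numbers.
interface EditBlock {
  search: string; // exact text expected to exist in the file
  replace: string; // text to put in its place
}

function applyEdit(source: string, edit: EditBlock): string {
  if (edit.search.length === 0) {
    throw new Error("search text must not be empty");
  }
  const first = source.indexOf(edit.search);
  if (first === -1) {
    throw new Error("search text not found; the file may have changed");
  }
  if (source.indexOf(edit.search, first + 1) !== -1) {
    throw new Error("search text matches more than once; the edit is ambiguous");
  }
  // Splice manually so that "$" sequences in the replacement are never interpreted.
  return source.slice(0, first) + edit.replace + source.slice(first + edit.search.length);
}
```

The same edit still applies after unrelated lines are added above the target, which is exactly the case where a line-number edit would land in the wrong place.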

What building the toy taught me about using the real ones

Using AI coding agents well is mostly about delegation: you stop asking 'how do I write this?' and start asking 'how do I describe this so something else gets it right?' Building one made that concrete. Every quality problem I ran into as a builder maps directly onto a habit that makes you better as a user.

Context is a budget, so scope your asks. My 12,000-character cap exists because you cannot send everything. When you use an agent, a tight selection or a narrow, well-bounded task is not a nicety; it is you doing the retrieval the tool cannot yet do for you.
Give it a standing context file. A CLAUDE.md, AGENTS.md, or .cursorrules is the same idea as my hardcoded system prompt, scaled up: state how your project works once so you are not re-explaining it every turn.
Verify every diff. An agent produces confident, plausible code; a passing build and passing tests are what turn a proposal into something you can trust. Having seen how thin the layer that generates that code is, I trust the raw output less and the review step more.
Keep the unit of work reviewable. A change that rewrites twenty files is impossible to supervise. The skill that is emerging is not writing code — it is specifying it, breaking work into pieces small enough to actually check.

That last point is the whole thing. An AI coding agent is a force multiplier on a clear plan and a magnifier of a vague one. It will not save a task you could not describe; it will just produce the wrong thing faster. Building a small one did not make me trust these tools more — it made me a better supervisor of them, because now I know exactly which part is easy glue and which part is the genuinely hard engineering they are quietly doing on my behalf.
