Pikachu: a Pair Programming Agent
How we built a real-time, voice and text enabled pair programming agent and won a hackathon

Pikachu ⚡ Building a Pair Programming Agent at Google AI Tinkerers Toronto
Pair programming is magic when it is available. During crunch moments it rarely is. That gap is what our team tried to compress into software.
Over one intense weekend in Toronto, we built Pikachu, a real-time, voice and text enabled pair programming agent. We qualified as finalists, learned a ton about what makes agents actually useful on a developer desktop, and shipped something that felt genuinely helpful.
Why an Agentic Pair Programmer
Developers do not just need answers. They need a partner that sees the same screen, tracks the same files, and reacts fast. At Shopify, pairing is part of the culture. In practice, mentors and teammates are not always free at the exact moment a bug melts your brain. Pikachu fills that gap.
Our design goal was simple: a sidekick that sits in the corner of your screen, watches the same context you do, and can help without breaking flow. If you highlight a function, it should understand that selection. If you are missing an import, it should suggest the right diff. If you ask a question out loud, it should respond before your frustration spikes.
The Hackathon in Three Acts
Act 1. Scaffolding the loop
We stood up a FastAPI backend and wired bidirectional WebSocket streaming for audio and text. The target was consistent sub-second turn response times, not benchmark bragging rights. Streaming shaved perceived latency and made the agent feel present.
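The core of that streaming loop is forwarding each model delta to the client the moment it arrives. A minimal sketch of the pattern in plain asyncio (the `model_tokens` stub and `send` callback are illustrative stand-ins; in the real system the source is the Gemini live API and `send` is the WebSocket):

```python
import asyncio
from typing import AsyncIterator

async def model_tokens() -> AsyncIterator[str]:
    """Stand-in for a streaming model response; the real source is
    the Gemini live API, this stub just yields word fragments."""
    for tok in ["Looks", " like", " a", " missing", " await."]:
        await asyncio.sleep(0)  # yield control, as a network read would
        yield tok

async def stream_deltas(send) -> None:
    """Forward each token as soon as it arrives, so the UI renders
    partial text instead of waiting for the full reply."""
    async for delta in model_tokens():
        await send({"type": "delta", "text": delta})
    await send({"type": "done"})
```

Inside a FastAPI WebSocket handler, `send` would simply be `websocket.send_json`.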
Act 2. Teaching Pikachu to look around
We added a permissions gated context layer: reading files in the active project, grabbing the current selection, and detecting the IDE state. Security scoped file access was a hard constraint. The rule was clear: only read what the user explicitly allows.
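The "only read what the user explicitly allows" rule boils down to resolving every requested path and checking it against an allowlist of approved roots. A minimal sketch, with hypothetical class and method names:

```python
from pathlib import Path

class ScopedReader:
    """Reads files only under user-approved roots."""

    def __init__(self):
        self._allowed: set[Path] = set()

    def grant(self, root: str) -> None:
        """Record explicit user consent for a directory."""
        self._allowed.add(Path(root).resolve())

    def is_allowed(self, path: str) -> bool:
        # resolve() collapses ../ segments and symlinks, so a
        # traversal like project/../../etc/passwd cannot escape scope
        p = Path(path).resolve()
        return any(p.is_relative_to(root) for root in self._allowed)

    def read(self, path: str) -> str:
        if not self.is_allowed(path):
            raise PermissionError(f"{path} is outside the approved scope")
        return Path(path).read_text()
```

Resolving before comparing is the important part; string-prefix checks on the raw path are trivially bypassed with `..` segments.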
Act 3. Putting it on the screen
We built an Electron overlay that is always on top and click through friendly. The UI sits on the edges of your screen with a playful Pikachu animation and thought bubbles. It avoids stealing focus. It injects help when asked or when a high confidence fix is obvious.
How Pikachu Works
Core Capabilities
- Contextual code intelligence: Reads the current file or a small set of user approved files, summarizes call sites, and tracks the selection.
- Web scale lookup: Runs live Google searches for docs, RFCs, and error strings, then distills results into short, source linked notes.
- Smart clipboard and prompts: Generates minimal prompts or diffs and can copy them to the clipboard on request.
- Session memory: Keeps short lived context in memory so you can jump between tasks without repeating yourself.
- Hands free voice: Wake word or push to talk, with real time partial transcription and barge in.
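The diff-to-clipboard capability can be sketched with Python's standard `difflib`; the function name and filenames are illustrative:

```python
import difflib

def propose_patch(original: str, fixed: str, filename: str) -> str:
    """Render a candidate fix as a minimal unified diff, ready to
    copy to the clipboard or apply with `git apply`."""
    return "".join(difflib.unified_diff(
        original.splitlines(keepends=True),
        fixed.splitlines(keepends=True),
        fromfile=f"a/{filename}",
        tofile=f"b/{filename}",
    ))
```

Emitting a standard unified diff keeps the output small and lets existing tools do the applying.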
Architecture at a Glance
- Inference: Gemini 2.0 Flash Live for ultra low latency conversation and tool output.
- Transport: Bidirectional WebSockets for both audio and text, with incremental deltas.
- Agent runtime: Google ADK for tool orchestration and state, plus a slim task router to decide between search, read file, or propose diff.
- Backend: FastAPI with async streaming endpoints and lightweight rate limiting.
- Desktop UI: Electron overlay that is always on top, supports click through, and renders subtle borders to show agent focus.
- Security: Security scoped file reads, no raw disk crawling, explicit user consent for any new path, strict allowlist on network calls.
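The slim task router mentioned above can be as simple as keyword heuristics over the request, with the model handling anything ambiguous. A sketch under that assumption (the rules and labels here are hypothetical, not the exact production set):

```python
import re

def route(request: str, has_selection: bool) -> str:
    """First-pass routing between search, read_file, and propose_diff;
    anything unmatched falls through to a clarifying question."""
    text = request.lower()
    if re.search(r"\b(docs?|rfc|error|search|look up)\b", text):
        return "search"
    if re.search(r"\b(fix|refactor|change|patch)\b", text) and has_selection:
        return "propose_diff"
    if re.search(r"\b(open|read|show)\b", text):
        return "read_file"
    return "clarify"
```

Cheap routing like this keeps the common cases off the model's critical path, which matters when the latency budget is sub-second.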
The Tooling Loop
- Detect selection or listen for a request.
- Decide: read file, search the web, propose a code change, or ask a clarifying question.
- Execute tool calls concurrently where safe.
- Stream a concise plan, then a candidate fix as a diff.
- Offer one click actions: copy patch, open file, run command in terminal.
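Step three of the loop, executing tool calls concurrently where safe, is where most of the latency savings live. A minimal sketch with stubbed tools (the tool bodies and `handle` signature are illustrative):

```python
import asyncio

async def search_web(query: str) -> str:
    await asyncio.sleep(0.01)  # stand-in for a live search call
    return f"notes for {query!r}"

async def read_file(path: str) -> str:
    await asyncio.sleep(0.01)  # stand-in for a scoped file read
    return f"contents of {path}"

async def handle(query: str, path: str) -> dict:
    """Independent tool calls run concurrently, then the results
    are assembled into a plan for the model to turn into a diff."""
    notes, source = await asyncio.gather(search_web(query), read_file(path))
    return {
        "plan": f"Cross-check {path} against the search notes",
        "context": [notes, source],
    }
```

Because the search and the file read do not depend on each other, `asyncio.gather` makes the wall-clock cost the slower of the two rather than their sum.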
The Moment It Clicked
Late at night, we highlighted a gnarly TypeScript error in a utility that wrapped fetch. Pikachu read only the file and its imports, found an await in a non async function, surfaced the exact TypeScript diagnostic from docs, and proposed a minimal diff that converted the caller. No multi paragraph essay. Just a plan and the patch. That interaction felt like real pairing.
What Was Hard
- Selection awareness: Mapping GUI selection to semantically relevant AST context is nontrivial. We built a quick parser backed by heuristics that favors stable tokens near the cursor.
- Latency budgets: Voice plus tools amplifies delays. Streaming everywhere, shrinking prompt templates, and parallelizing search with file reads kept the agent snappy.
- UI restraint: It is easy to spam the screen. We biased toward quiet defaults and only elevate when confidence is high.
What Surprised Us
- Small diffs beat long explanations.
- Keeping a tiny rolling context window is enough for most micro tasks.
- Developers value a clean exit. The best moment is when the agent gets out of the way fast after you accept a fix.
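That tiny rolling context window can be little more than a bounded queue of recent turns. A sketch, using a character budget as a stand-in for real token counting (class and budget are illustrative):

```python
from collections import deque

class RollingContext:
    """Session memory that keeps only the most recent exchanges
    within a rough character budget."""

    def __init__(self, budget: int = 2000):
        self.budget = budget
        self.turns: deque[str] = deque()

    def add(self, turn: str) -> None:
        self.turns.append(turn)
        # evict oldest turns first until we fit the budget again
        while sum(len(t) for t in self.turns) > self.budget:
            self.turns.popleft()

    def render(self) -> str:
        """Flatten the window into prompt context."""
        return "\n".join(self.turns)
```

For micro tasks this is enough: the agent remembers the last few exchanges and forgets the rest, which also keeps prompts small and turns fast.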
Impact and Utility
Pikachu does not try to replace your IDE or your teammate. It makes pairing accessible at odd hours and lowers the activation energy to ask for help. If you are debugging an API mismatch, pulling in a missing dependency, or trying to remember a flag's syntax, the agent is faster than a tab switch and kinder than a Stack Overflow rabbit hole.
What We Would Build Next
- Deeper IDE hooks: First class Cursor and VS Code integrations with native diff apply and test run.
- Stronger safety rails: Sandboxed commands with preview and auto rollback.
- Knowledge plugins: Local docs packs for common stacks that work offline.
- Team mode: Share a short session trace so a human teammate can review what the agent did.
Team
Built with friends from the Toronto AI community: Ethan Zemelman, Aayush Grover, Kevin Z, and me, Aryan. Huge thanks to the organizers and mentors who created the space and brought serious energy to the room.
Closing
I judge projects by a simple question: if I disappeared tomorrow, would this still help someone? Pikachu feels like a yes. It is not perfect. It is already useful.
If you want to try it or jam on agents for developer workflows, reach out. I am happy to share the repo, walk through the architecture, and keep building.
Check out our demo video here! 👉 YouTube video