I Rebuilt My AI Coding Stack
December 23, 2025
The week before Christmas is always a little slower at work, so I took the opportunity to rebuild my AI coding stack. I've been reading and hearing about this tool and that one, so I finally took the time to hook them all together. No one directed me to use this specific setup; I tried it and liked it.
Core Constraints
We work on Macs, so all code snippets in prompts are in bash. I use zsh as my shell and iTerm2 as my terminal (I like the tabbed panes). I also prefer Homebrew for installing tools when possible.
The core coding agent is Claude Code running Claude Sonnet 4.5 as the LLM, which means all my custom slash commands are set up for Claude to use. I turned the tool chain into three slash commands: one that sets up everything as an interactive prompt, a second that makes Claude act as a planning agent, and a third that starts multiple Claude Code sessions in parallel as coding agents.
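For context, custom slash commands in Claude Code are just markdown files under .claude/commands/ whose contents become the prompt. A minimal sketch of how a setup command gets wired up (the file name and prompt text here are placeholders, not my actual command):

    # Claude Code picks up project slash commands from .claude/commands/.
    mkdir -p .claude/commands
    cat > .claude/commands/setup.md <<'EOF'
    Check that my required CLI tools are installed, install anything
    missing with Homebrew, and walk me through the rest interactively.
    EOF
    # Inside Claude Code, this file now answers to /setup.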
I also set up the GitHub Copilot CLI, defaulted to the grok-code-fast-1 model, to be available for Claude Code to call as an oracle when it gets stuck. I've seen other devs set up their own oracle LLMs, but this is the one I have available through our corporate accounts.
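Calling the oracle is just a shell command from Claude's point of view. A sketch, assuming the standalone copilot binary and its -p flag for a one-shot programmatic prompt (verify against copilot --help for your version):

    # Ask the Copilot CLI for a second opinion without entering
    # its interactive mode. The -p flag is an assumption.
    copilot -p "Why would this zsh completion script hang on startup?"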
For source code management, my setup uses Git and GitHub. Claude uses the GitHub CLI tool for GitHub-specific interactions.
Prereqs
My setup slash command checks that some CLI tools are installed: ripgrep, fzf, lazygit, and ast-grep. It also installs some language runtimes and toolchains that other tools depend on: uv, go, rust+cargo, and bun.
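The check itself is a small Homebrew loop, roughly like this (formula names are from memory; a couple, like bun, may live in a third-party tap):

    # Install anything that's missing; skip what's already installed.
    for tool in ripgrep fzf lazygit ast-grep uv go rust bun; do
      if ! brew list "$tool" >/dev/null 2>&1; then
        echo "Installing $tool..."
        brew install "$tool"
      fi
    done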
Plan
My plan and work slash commands follow the basics of the software development process.
When I invoke the plan command, I give it either a Jira ticket or a feature description; in the latter case, Claude creates the Jira ticket for me. Either way, it uses the Atlassian MCP server to interact with Jira: https://github.com/atlassian/atlassian-mcp-server
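Hooking that server up is a one-time registration with Claude Code; the hosted SSE endpoint below is the one Atlassian documents (verify it against their README):

    # Register Atlassian's remote MCP server with Claude Code.
    claude mcp add --transport sse atlassian https://mcp.atlassian.com/v1/sse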
Then it uses the Superpowers skills (https://github.com/obra/superpowers) to brainstorm with me and interview me about the feature.
That expanded feature description gets handed off to Every's Compound Engineering plugin (https://github.com/EveryInc/compound-engineering-plugin/tree/main/plugins/compound-engineering) to be turned into a full-fledged development plan. Read more about Every's Compound Engineering philosophy.
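Both of those ship as Claude Code plugins, so installation is a couple of in-session commands. A sketch, where the marketplace and plugin names are my best guesses from each project's README (double-check there before running):

    # Inside a Claude Code session (not bash):
    /plugin marketplace add obra/superpowers-marketplace
    /plugin install superpowers@superpowers-marketplace
    /plugin marketplace add EveryInc/compound-engineering-plugin
    /plugin install compound-engineering@compound-engineering-plugin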
Claude breaks that plan into beads (https://github.com/steveyegge/beads/tree/main). I like tracking those beads with Chris Edwards's Abacus tool (https://github.com/ChrisEdwards/abacus).
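Beads gives the agents a dependency-aware work queue through its bd CLI. A minimal sketch of the lifecycle (the subcommand spellings are my recollection; see bd --help):

    # Initialize bead tracking in the repo, file a bead, then ask
    # which beads are ready (i.e., have no unresolved blockers).
    bd init
    bd create "Wire the new endpoint into the router"
    bd ready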
As a final bit of setup, the plan command makes a Git feature branch for the Jira ticket and records any learnings from the planning session in a file it can read later to augment its context.
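That step is plain git plus a note-taking convention; the branch name and learnings path below are hypothetical examples, not something the tools enforce:

    # Branch off for the ticket (PROJ-123 is a placeholder), and append
    # planning takeaways to a file the agents re-read each session.
    git switch -c feature/PROJ-123-add-widget-export
    echo "- The export job must be idempotent; retries are common." >> docs/learnings.md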
Work
I start several Claude Code instances running the work slash command. They coordinate with each other through beads and reserve files (semaphore-style locking) with mcp_agent_mail (https://github.com/Dicklesworthstone/mcp_agent_mail). This does a fairly good job of keeping them from colliding, and it lets them work on beads in parallel as those become unblocked. Speaking of which, the agents run beads_viewer (https://github.com/Dicklesworthstone/beads_viewer) with the --robot-plan argument to get a JSON report on the highest-priority unblocked bead to work on next.
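That polling step is scriptable because of the robot mode. Assuming the viewer installs as a bv binary and emits a list of ready beads (both assumptions; check its README for the real JSON shape):

    # Ask for a machine-readable plan and pull out the ID of the
    # top unblocked bead. The .plan[0].id path is a guess at the schema.
    bv --robot-plan | jq -r '.plan[0].id'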
Each work agent builds its initial context by reading the plan, searching past sessions with coding_agent_session_search (https://github.com/Dicklesworthstone/coding_agent_session_search) and cass_memory_system (https://github.com/Dicklesworthstone/cass_memory_system), and looking up documentation with the Context7 MCP server (https://context7.com/). It also reviews the content of its learnings document for the project.
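Context7 is another one-line MCP registration with Claude Code; the hosted endpoint below is the one documented on their site (verify it there):

    # Register the Context7 documentation server with Claude Code.
    claude mcp add --transport http context7 https://mcp.context7.com/mcp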
Once its context is loaded and it's found a bead to work on, it goes back to Superpowers for Test-Driven-Development-style execution.
It reviews its work with multiple review tools: Claude's built-in /review and /security-review slash commands, ultimate_bug_scanner (https://github.com/Dicklesworthstone/ultimate_bug_scanner), the Superpowers review, and Compound Engineering's review. The point of all these review passes isn't to be perfect, but to catch as much stupid stuff as possible before a human needs to look at the code.
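Chained together, the review pass is just sequential tool calls. A rough sketch only: the headless slash-command invocations and the ubs binary name are assumptions, not verified usage:

    # Run Claude's built-in reviews in headless (print) mode,
    # then the standalone bug scanner over the working tree.
    claude -p "/review"
    claude -p "/security-review"
    ubs .   # hypothetical binary name; see the project's README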
Last, it pushes everything up to the remote Git repo, opens a PR, and records any learnings.
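That hand-off is plain git plus the GitHub CLI:

    # Push the feature branch and open a PR whose title and body
    # are filled in from the branch's commits.
    git push -u origin HEAD
    gh pr create --fill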
Results
The work agents aren't fully autonomous. They need bash commands approved before running, occasional cajoling, answers to clarifying questions, or help when they get stuck. But they are really capable!
I tested this setup with a feature idea I had this morning. In 2023, it probably would have taken me a week to code by hand, with multiple rounds of revisions from human code review. This AI coding setup did the work in roughly 40 minutes, and the PR was approved by human reviewers with no change requests.
I can't wait to see what innovations 2026 brings!