The First Session With a New Codebase: What AXME Code Does Before the Agent Starts
Stack detection, structure scanning, glossary extraction, safety rule seeding. What happens in the first 30 seconds of a new AXME Code setup, why each step matters, and what happens if you skip it.
The most expensive moment in any tool’s life is the first time someone uses it on their own code. If setup takes 10 minutes and the immediate value is not obvious, most people never come back.
AXME Code had this problem in its first version. Setup was a CLI command that created a directory, and that was it. The first session felt identical to Claude Code without the plugin. Memory, decisions, safety rules: all empty. There was nothing useful to load. The user had to put in several sessions of real work before the plugin felt useful.
I rewrote setup to fix that. The new first-time setup is about 30 seconds of LLM work that produces a minimally useful context before the first real session. I want to walk through exactly what happens during those 30 seconds, because each step is there to solve a specific first-session failure mode.
Before anything runs: the situation
You just cloned a repo you haven’t touched. You run `axme-code setup` at the repo root. At this moment:
- The agent has no idea what the repo is.
- There is no `CLAUDE.md`. Or there is one, but it’s inconsistent with the code (common).
- You have not yet told the agent any conventions.
- You don’t know what the agent needs to know to be useful here.
If you start a session right now, the agent defaults to “generic helpful assistant” behavior. It doesn’t know that this repo uses `pnpm` and not `npm`. It doesn’t know that the test command is `npm run test:ci`, not `npm test`. It doesn’t know that `users.py` is legacy and all new code goes in `user_service.py`.
Every one of these facts is discoverable from the repo itself if you look, but the agent doesn’t look without being asked. The point of setup is to do the looking up front, once, and cache the results.
Step 1: Stack detection (3-5 seconds)
The setup command reads a handful of signal files: `package.json`, `pyproject.toml`, `Cargo.toml`, `go.mod`, `pom.xml`, `requirements.txt`, `Gemfile`. Each of these signals a language/runtime. The detection logic is mostly deterministic (not LLM-driven) because file names are reliable signals.
From these files, setup extracts:

- Language and version: `"language": "TypeScript", "runtime": "Node >=22"`
- Package manager: `"pkg_manager": "pnpm"` (from `pnpm-lock.yaml` or the `packageManager` field in `package.json`)
- Scripts: `"test": "vitest", "build": "astro build", "lint": "eslint"` (pulled from `package.json` scripts)
- Key dependencies: the top 10 dependencies, filtered for ones that signal frameworks (`astro`, `react`, `next`, `fastapi`, etc.)
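Sketched in Python, this deterministic detection might look like the following. The signal-file table, field names, and fallback order here are illustrative assumptions, not AXME Code’s actual implementation:

```python
import json
from pathlib import Path

# Illustrative signal-file table; the real detection table likely
# covers more ecosystems and versions.
SIGNAL_FILES = {
    "package.json": "JavaScript/TypeScript",
    "pyproject.toml": "Python",
    "requirements.txt": "Python",
    "Cargo.toml": "Rust",
    "go.mod": "Go",
    "pom.xml": "Java",
    "Gemfile": "Ruby",
}

def detect_stack(root: Path) -> dict:
    """Deterministic stack detection: look at file names, no LLM involved."""
    stack: dict = {"languages": []}
    for name, lang in SIGNAL_FILES.items():
        if (root / name).exists():
            stack["languages"].append(lang)

    pkg_json = root / "package.json"
    if pkg_json.exists():
        pkg = json.loads(pkg_json.read_text())
        # pnpm-lock.yaml or the packageManager field both signal pnpm.
        if (root / "pnpm-lock.yaml").exists() or pkg.get("packageManager", "").startswith("pnpm"):
            stack["pkg_manager"] = "pnpm"
        elif (root / "yarn.lock").exists():
            stack["pkg_manager"] = "yarn"
        else:
            stack["pkg_manager"] = "npm"
        stack["scripts"] = pkg.get("scripts", {})
    return stack
```

The extracted dict is then rendered into a short markdown summary that every later session loads.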
This gets written to `.axme-code/oracle/stack.md` as a short markdown file:

```markdown
# Stack

- Language: TypeScript (strict mode)
- Runtime: Node.js >=22.12
- Package manager: pnpm
- Test: `pnpm test` (Vitest)
- Build: `pnpm build` (Astro)
- Lint: `pnpm lint` (ESLint)
- Frameworks: Astro 6, Tailwind 4
```
From this point forward, every session loads this file. The agent does not have to guess whether to run `npm` or `pnpm`. It reads it.
This step saves the most common first-session annoyance: the agent runs the wrong package manager and you have to correct it.
Step 2: Structure scan (5-8 seconds)
The setup command uses a simple walk of the repo (respecting `.gitignore`) to build a shallow tree. For each top-level directory, it counts files and notes the dominant language. At depth 2, it also includes directory names.
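A minimal sketch of that walk. The `IGNORED` set is a stand-in for real `.gitignore` parsing, and the extension-to-language map is abbreviated:

```python
from collections import Counter
from pathlib import Path

# Abbreviated extension map; a real scanner would know many more.
EXT_LANG = {".ts": "TypeScript", ".astro": "Astro", ".md": "Markdown",
            ".css": "CSS", ".py": "Python"}
# Stand-in for .gitignore parsing.
IGNORED = {"node_modules", "dist", ".git"}

def scan_structure(root: Path, max_depth: int = 2) -> dict:
    """Shallow tree: per-directory file count and dominant language."""
    counts: dict[str, Counter] = {}
    for path in root.rglob("*"):
        rel = path.relative_to(root)
        if any(part in IGNORED for part in rel.parts):
            continue
        # A file at depth-2 has 3 path parts (dir/dir/file), hence +1.
        if path.is_file() and len(rel.parts) <= max_depth + 1:
            dir_key = str(rel.parent) if rel.parts[:-1] else "."
            counter = counts.setdefault(dir_key, Counter())
            counter[EXT_LANG.get(path.suffix, "other")] += 1
    return {d: {"files": sum(c.values()), "dominant": c.most_common(1)[0][0]}
            for d, c in counts.items()}
```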
The output is a compact structure map:
```markdown
# Structure

- `src/` — TypeScript (24 files)
- `src/pages/` — Astro components (4 files)
- `src/layouts/` — Astro layouts (1 file)
- `src/content/blog/` — Markdown content (21 files)
- `src/styles/` — CSS (1 file)
- `public/` — static assets (4 files)
- `dist/` — (gitignored)
- `node_modules/` — (gitignored)

## Entry points (heuristic)

- `src/pages/index.astro` — likely home route
- `astro.config.mjs` — framework config
```
Entry points are heuristic. For Astro, the `src/pages/index.*` convention is strong. For Node, we look at `main` in `package.json`. For Python, we look for `__main__.py` or the package name + `/main.py`. These are guesses, not guarantees, but they’re better than nothing.
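Those heuristics can be sketched in a few lines. Function name and exact rules here are illustrative, not the tool’s real code:

```python
import json
from pathlib import Path

def guess_entry_points(root: Path) -> list[str]:
    """Best-effort entry-point guesses; guesses, not guarantees."""
    guesses = []
    # Astro/Next-style page convention: src/pages/index.*
    for candidate in root.glob("src/pages/index.*"):
        guesses.append(f"{candidate.relative_to(root)} — likely home route")
    # Node: the `main` field in package.json
    pkg = root / "package.json"
    if pkg.exists():
        main = json.loads(pkg.read_text()).get("main")
        if main:
            guesses.append(f"{main} — package.json main")
    # Python: a shallow __main__.py signals a `python -m` entry
    for candidate in root.glob("*/__main__.py"):
        guesses.append(f"{candidate.relative_to(root)} — python -m entry")
    return guesses
```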
This step solves the “where do I start” problem. When you start a session and say “fix the bug in the home page,” the agent already knows the home page is src/pages/index.astro, because the structure scan told it.
Step 3: Pattern scan (10-15 seconds, LLM call)
This is the first LLM call in setup. The agent reads up to ~20 key source files (picked by heuristic: entry points, config files, the largest file per top-level source directory) and is asked to extract observable patterns from the code.
The prompt is roughly: “Here are some source files from a project. What conventions, preferences, and patterns do you notice that a new contributor would want to know? Format as a list of one-line observations.”
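The file-selection heuristic feeding that prompt could look roughly like this. The config-file list, extension set, and limit are assumptions for illustration:

```python
from pathlib import Path

# Assumed signal lists; the real heuristic may differ.
CONFIG_NAMES = {"package.json", "pyproject.toml", "astro.config.mjs", "tsconfig.json"}
SOURCE_EXTS = {".ts", ".tsx", ".py", ".astro", ".rs", ".go"}

def pick_pattern_files(root: Path, limit: int = 20) -> list[Path]:
    """Pick up to `limit` files for the pattern-scan prompt:
    config files, plus the largest source file per top-level directory."""
    picked = [root / n for n in sorted(CONFIG_NAMES) if (root / n).exists()]
    for top in sorted(p for p in root.iterdir()
                      if p.is_dir() and p.name != "node_modules"):
        sources = [f for f in top.rglob("*")
                   if f.is_file() and f.suffix in SOURCE_EXTS]
        if sources:
            # Largest file per directory biases toward signal, not noise.
            picked.append(max(sources, key=lambda f: f.stat().st_size))
    return picked[:limit]
```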
Examples of what this produces on a real project:
```markdown
# Patterns

- Uses ESM imports throughout, no CommonJS
- Strict TypeScript (no `any`, no `@ts-ignore`)
- All API handlers are `async def` — no sync handlers
- Frontmatter validation via Zod, schema defined in src/content.config.ts
- Tests colocated with source: `foo.ts` and `foo.test.ts` in same directory
- All logging goes through a single logger module, no direct `console.log`
- Error types are always `Result<T, E>` pattern, no thrown exceptions in hot paths
```
These are patterns the LLM observed, not patterns I prescribed. They get written to `.axme-code/oracle/patterns.md`. From the next session onward, the agent has them loaded and can follow them without being told.
There’s a real risk here: the LLM can be wrong. It can pattern-match to what similar projects usually do rather than what this project actually does. I mitigate by:
- Only using the top-N largest files (not a random sample), which biases toward signal
- Asking for observations phrased as “I noticed that X,” not prescriptions
- Letting the user review `patterns.md` after setup and delete anything wrong
In practice, about 70-80% of extracted patterns are correct. 10-20% need to be deleted or edited. The net is still positive because 80% of correct patterns loaded automatically is much better than 0% loaded.
Step 4: Glossary extraction (5-10 seconds, LLM call)
Same files, different prompt. The LLM is asked: “What domain-specific words or terms appear in this code that wouldn’t be obvious to a new contributor? List them with one-line definitions.”
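A sketch of how that prompt might be assembled. The wording is paraphrased from this article rather than the exact production prompt, and the per-file truncation budget is an assumption:

```python
def build_glossary_prompt(files: dict[str, str]) -> str:
    """Assemble the glossary-extraction prompt from file contents.

    `files` maps a path to its source text. Each file is truncated to
    an assumed 4000-character budget to keep the prompt bounded.
    """
    blobs = "\n\n".join(f"--- {path} ---\n{text[:4000]}"
                        for path, text in files.items())
    return (
        "What domain-specific words or terms appear in this code that "
        "wouldn't be obvious to a new contributor? "
        "List them with one-line definitions.\n\n" + blobs
    )
```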
On the AXME Code repo, this produces:
```markdown
# Glossary

- **Intent**: a work item sent to an agent, with a lifecycle (SUBMITTED, DELIVERED, IN_PROGRESS, COMPLETED, FAILED, TIMED_OUT, CANCELED)
- **Audit worker**: the background process that runs after a session ends and extracts memories from the transcript
- **Handoff**: a short markdown summary of a session's stopping state, generated by the audit worker
- **Oracle**: the stable metadata about a project (stack, structure, glossary) stored in `.axme-code/oracle/`
- **Enforce level**: a field on decisions indicating whether the rule is required or advisory
```
This is important because most projects have words that the code uses as technical terms but that don’t have entries in any public dictionary. The first session in a new codebase often wastes 5-10 minutes of back-and-forth clarifying “what do you mean by X.” The glossary front-loads that work.
Step 5: Safety rule seeding (2-3 seconds)
This step is deterministic, not LLM-driven. A set of default safety rules is dropped into `.axme-code/safety/rules.yaml`:

```yaml
protected_branches: [main, master, develop]
deny_commands:
  - pattern: "rm -rf /"
    reason: "Catastrophic"
  - pattern: "chmod 777"
    reason: "Security footgun"
  - pattern: "git push --force"
    reason: "Lost work at least once"
  - pattern: "curl | sh"
    reason: "Arbitrary code execution from network"
  - pattern: "curl | bash"
    reason: "Same as above"
```
These are defaults that apply to essentially every project. You get them for free. You can add more later as incidents accumulate. But the baseline is non-negotiable and is there from session 1.
This is the most underrated step. The other steps make the agent more useful. This step makes the agent less dangerous. The combination is the point.
Step 6: Write the handoff file
Setup ends by writing a placeholder handoff file:

```markdown
# Previous Session Handoff

This is a fresh setup. No previous session yet.

When you start your first session, tell the agent what you want to
work on, and at session close the audit worker will write a real
handoff here for the next time.
```
This is here so that the “load handoff at session start” code doesn’t need to handle the “handoff doesn’t exist yet” edge case separately. Every session, including the first one, finds a handoff file, even if that file says “fresh setup.”
Little thing. Saves a bug.
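The seeding-plus-loading idea, as a sketch (function names and the placeholder text's exact wording are illustrative):

```python
from pathlib import Path

FRESH = (
    "# Previous Session Handoff\n\n"
    "This is a fresh setup. No previous session yet.\n"
)

def write_initial_handoff(handoff: Path) -> None:
    """Seed the handoff at setup so it always exists, but never
    overwrite a real handoff from a previous session."""
    if not handoff.exists():
        handoff.write_text(FRESH)

def load_handoff(handoff: Path) -> str:
    # Because setup always seeds the file, session-start code needs
    # no 'handoff is missing' branch.
    return handoff.read_text()
```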
What the first real session looks like
After those 30 seconds of setup, you start your first session. Here’s what the agent says:
```text
Project: axme-landing
Stack: TypeScript, Astro 6, Tailwind 4, pnpm
Entry point: src/pages/index.astro
Patterns (extracted, unreviewed):
- Uses ESM imports, no CommonJS
- Strict TypeScript, no any
- Tests colocated with source
- [... 4 more]
Glossary: 6 terms loaded
Safety: 5 deny patterns active
Previous session: fresh setup

What would you like to work on?
```
Compare this to the baseline first session with a fresh Claude Code install:
```text
How can I help you today?
```
The 30 seconds of setup buys you a session where the agent already has context before the first message. You don’t have to explain the stack. You don’t have to warn about `rm -rf`. You don’t have to define “intent” or “audit worker.” The agent knows.
What happens if you skip setup
You can skip setup and start using AXME Code immediately. It will work. The first session will just feel like vanilla Claude Code, because there’s nothing loaded.
After a week of real use, the `.axme-code/` directory will start to fill up from audit-driven writes (memories accumulated, decisions captured, patterns discovered). Eventually you’ll get to something resembling what setup would have produced in 30 seconds.
The difference is weeks vs seconds. Setup compresses the first week of discovery into a one-time scan. You pay it once. You get the value from session 1 instead of session 15.
The one insight
The point of setup is not to be exhaustive. It is to provide a useful starting point that avoids the dumbest first-session failures.
Stack mismatch, structure confusion, domain vocabulary gaps, and trivially-catchable safety incidents are the four things that make the first session with a new codebase painful. Setup addresses all four in 30 seconds of work. That’s the whole design goal.
Everything else (richer patterns, more sophisticated decisions, safety rules tuned to your specific incidents) accumulates from real use. Setup doesn’t try to predict what you’ll need. It just makes sure the floor is not embarrassingly low.
The floor is what most tools fail at. They get the ceiling right (expert users love them) and leave the floor bare (new users give up before expert level). Setup fixes the floor. If you use AXME Code, the 30 seconds you spend on `axme-code setup` at the start of a new repo is the single highest-leverage half-minute in the tool’s whole workflow.