AFD — Agent-First Development

THE OPACITY PROBLEM

Your app is a black box to AI.

Traditional apps work fine for humans. Click a button, fill a form, read a screen. But an LLM looking at your codebase is a brilliant engineer peering through a keyhole. It can read your source. It can't use your product.

Keyhole diagram — an agent peering through a narrow opening in the UI wall

Capabilities hide behind visual interfaces. State lives in components. Features only fire through mouse events. The API you bolt on later copies the UI. Badly. Two systems, one always out of sync.

THE INVERSION

Define the command. Validate it in the terminal. Only then do you build the button.

"If it can't be done via CLI, the architecture is wrong."
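A minimal sketch of that order of work, using only the Python standard library rather than any AFD package. The command name `user_get`, the in-memory `USERS` table, and the `--email` flag are all assumptions made for illustration. The point is that the command is a plain function and the terminal entry point is a thin wrapper around it.

```python
import argparse
import json
from typing import Optional, Sequence

# Hypothetical in-memory data; a real command would query a store.
USERS = {"jdoe@example.com": {"id": 42, "role": "admin"}}


def user_get(email: str) -> dict:
    """Hypothetical command: look up a user by email and return a result dict."""
    user = USERS.get(email)
    if user is None:
        return {
            "success": False,
            "error": {
                "code": "NOT_FOUND",
                "message": f"No user with email '{email}'",
            },
        }
    return {"success": True, "data": user}


def main(argv: Optional[Sequence[str]] = None) -> int:
    """Thin CLI wrapper: parse arguments, run the command, print JSON."""
    parser = argparse.ArgumentParser(prog="user-get")
    parser.add_argument("--email", required=True)
    args = parser.parse_args(argv)
    result = user_get(args.email)
    print(json.dumps(result, indent=2))
    # Exit code 0 on success, 1 on a structured failure.
    return 0 if result["success"] else 1
```

Calling `main(["--email", "jdoe@example.com"])` exercises the command exactly as a terminal session would. A button built later would call `user_get` directly and skip the argument parsing.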

AGENTS GET JUDGMENT.

Most tools hand agents raw data and wish them luck. AFD gives agents the same trust signals that make human UX work: confidence scores, transparent reasoning, recovery paths when things break, and a plan for what to do next.

The agent doesn't have to guess whether a match is reliable. It doesn't have to invent a fallback when a lookup fails. The system already knows.

{
  "success": true,
  "data": { "id": 42, "role": "admin" },
  "confidence": 0.85,
  "reasoning": "Matched by email.",
  "warnings": [{
    "message": "Elevated privileges."
  }],
  "suggestions": [
    "Review role assignments"
  ]
}

confidence → Should I trust this?

reasoning → What happened and why?

suggestions → What do I do next?
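One way an agent loop might act on those fields, sketched with a plain dataclass. The field names mirror the JSON above. The `next_action` helper, its return labels, and the 0.8 threshold are assumptions for illustration, not part of AFD.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class CommandResult:
    """Local stand-in that mirrors the JSON fields shown above."""
    success: bool
    data: Any = None
    confidence: Optional[float] = None
    reasoning: Optional[str] = None
    warnings: List[dict] = field(default_factory=list)
    suggestions: List[str] = field(default_factory=list)


def next_action(result: CommandResult, threshold: float = 0.8) -> str:
    """Hypothetical policy: map trust signals to the agent's next move."""
    if not result.success:
        return "recover"   # follow the error's recovery path
    if result.confidence is not None and result.confidence < threshold:
        return "verify"    # low confidence: double-check before acting
    if result.warnings:
        return "review"    # trusted result, but flagged for attention
    return "proceed"


result = CommandResult(
    success=True,
    data={"id": 42, "role": "admin"},
    confidence=0.85,
    reasoning="Matched by email.",
    warnings=[{"message": "Elevated privileges."}],
    suggestions=["Review role assignments"],
)
print(next_action(result))  # prints "review"
```

The threshold is a policy choice for the caller. The result only has to carry the signal.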

ERRORS TALK BACK

When things break, agents shouldn't have to guess.

{
  "status": 404,
  "message": "Not found"
}

THE BLANK WALL

{
  "success": false,
  "error": {
    "code": "NOT_FOUND",
    "message": "No user with email 'jdoe@example.com'",
    "suggestion": "Try user-search with partial match"
  }
}

THE RECOVERY PATH

A bare 404 is a blank wall. A diagnosis with a next step is a recovery path. The agent doesn't stall or churn. It recovers.
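A sketch of what recovery could look like on the calling side, assuming two hypothetical commands, `user_get` and `user_search`, backed by an in-memory list. The fallback rule (retry a `NOT_FOUND` with a partial match on the local part of the email) is an assumption chosen to match the suggestion text in the example above.

```python
# Hypothetical data and commands, for illustration only.
USERS = [{"id": 42, "email": "jdoe@example.com"}]


def user_get(email: str) -> dict:
    """Exact lookup that fails with a structured, actionable error."""
    for user in USERS:
        if user["email"] == email:
            return {"success": True, "data": user}
    return {
        "success": False,
        "error": {
            "code": "NOT_FOUND",
            "message": f"No user with email '{email}'",
            "suggestion": "Try user-search with partial match",
        },
    }


def user_search(partial: str) -> dict:
    """Partial-match lookup used as the recovery path."""
    matches = [u for u in USERS if partial in u["email"]]
    return {"success": True, "data": matches}


def lookup_with_recovery(email: str) -> dict:
    """Try the exact lookup, then fall back when the error code allows it."""
    result = user_get(email)
    if result["success"]:
        return result
    if result["error"]["code"] == "NOT_FOUND":
        local_part = email.split("@")[0]
        return user_search(local_part)
    return result  # unknown failure: hand it back unchanged
```

A mistyped domain such as `jdoe@exmaple.com` misses the exact lookup, and the fallback search on `jdoe` still finds the user.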

DEFINE. VALIDATE. SURFACE.

DEFINE

Write the command. Typed schema. Explicit inputs. Structured error states. The schema is the contract between your code and every agent that will ever call it.
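A minimal sketch of a typed contract, written with standard-library dataclasses instead of the Zod or Pydantic schemas the toolkit mentions. The `TodoCreateInput` and `CommandError` shapes, the error codes, and the allowed priority values are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Union

# Assumed set of allowed values; the source only shows "high".
PRIORITIES = ("low", "medium", "high")


@dataclass(frozen=True)
class TodoCreateInput:
    """Explicit inputs for a hypothetical todo-create command."""
    title: str
    priority: str = "medium"


@dataclass(frozen=True)
class CommandError:
    """Structured error state: a code, a message, and a next step."""
    code: str
    message: str
    suggestion: str


def parse_todo_create(raw: dict) -> Union[TodoCreateInput, CommandError]:
    """Validate raw input against the schema before any logic runs."""
    title = raw.get("title")
    if not isinstance(title, str) or not title.strip():
        return CommandError(
            "INVALID_TITLE",
            "title must be a non-empty string",
            "Pass a title such as 'Buy groceries'",
        )
    priority = raw.get("priority", "medium")
    if priority not in PRIORITIES:
        return CommandError(
            "INVALID_PRIORITY",
            f"priority must be one of {', '.join(PRIORITIES)}",
            "Omit priority to use the default",
        )
    return TodoCreateInput(title=title.strip(), priority=priority)
```

Every caller, human or agent, gets the same answer for the same bad input, and the answer says what to do about it.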

VALIDATE

Run it in the terminal. Hit every edge case. Confirm error recovery paths. If the command breaks here, you just saved yourself from shipping a broken button.
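Edge cases can be listed as data and run in one pass. The sketch below assumes a hypothetical `todo_toggle` command and made-up error codes. It is a hand-rolled check, not the AFD CLI or test runner.

```python
def todo_toggle(store: dict, todo_id: str) -> dict:
    """Hypothetical command: flip the completed flag on one todo."""
    if not isinstance(todo_id, str) or not todo_id:
        return {
            "success": False,
            "error": {"code": "INVALID_ID", "message": "id must be a non-empty string"},
        }
    todo = store.get(todo_id)
    if todo is None:
        return {
            "success": False,
            "error": {"code": "NOT_FOUND", "message": f"No todo with id '{todo_id}'"},
        }
    todo["completed"] = not todo["completed"]
    return {"success": True, "data": dict(todo)}


# (label, input id, expected success, expected error code)
EDGE_CASES = [
    ("existing id", "t1", True, None),
    ("unknown id", "nope", False, "NOT_FOUND"),
    ("empty id", "", False, "INVALID_ID"),
]


def run_edge_cases() -> list:
    """Run every case against a fresh store; return labels of failing cases."""
    failures = []
    for label, todo_id, want_success, want_code in EDGE_CASES:
        store = {"t1": {"title": "Buy groceries", "completed": False}}
        result = todo_toggle(store, todo_id)
        got_code = result.get("error", {}).get("code")
        if result["success"] != want_success or got_code != want_code:
            failures.append(label)
    return failures
```

An empty list from `run_edge_cases()` means every listed case behaved as expected, with no browser involved.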

SURFACE

Now build the UI. It's a thin wrapper over proven logic. Agents plug into the same commands your React components call. Zero translation layer. Full parity.
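To make "thin wrapper" concrete, here is a sketch with one command registry and two callers. The registry, the `command` decorator, and both caller functions are hypothetical stand-ins. A real surface would be a React component or an MCP tool call, which this example does not attempt to reproduce.

```python
from typing import Callable, Dict

# One registry of commands, shared by every surface.
COMMANDS: Dict[str, Callable[[dict], dict]] = {}


def command(name: str):
    """Hypothetical decorator that registers a function under a command name."""
    def register(fn: Callable[[dict], dict]) -> Callable[[dict], dict]:
        COMMANDS[name] = fn
        return fn
    return register


@command("todo-create")
def todo_create(payload: dict) -> dict:
    return {"success": True, "data": {"title": payload["title"], "completed": False}}


def call(name: str, payload: dict) -> dict:
    """Single entry point that both surfaces go through."""
    handler = COMMANDS.get(name)
    if handler is None:
        return {
            "success": False,
            "error": {"code": "UNKNOWN_COMMAND", "message": f"No command '{name}'"},
        }
    return handler(payload)


def on_add_button_click(form_title: str) -> str:
    """Stand-in for a UI event handler: format the result, hold no logic."""
    result = call("todo-create", {"title": form_title})
    if result["success"]:
        return f"Added: {result['data']['title']}"
    return result["error"]["message"]


def agent_tool_call(name: str, arguments: dict) -> dict:
    """Stand-in for an agent invoking the same command by name."""
    return call(name, arguments)
```

Both callers reach `todo_create` through `call`, so replacing the UI handler leaves the command and its tests untouched.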

Paradigms change. Commands don't. CLI gave way to GUI. Conversational AI is replacing both. The command layer stays.

AGENTS BUILD APPS FASTER.

The abstraction isn't theoretical. It changes how fast you ship.

You define a command. Validate it in the terminal. The agent iterates on logic in seconds — no browser, no clicking, no waiting for renders. Once the command passes, the agent builds UI against it.

Then the feedback comes in. The layout is wrong. The flow needs rethinking. Rip the UI out. Rebuild it. Rewire it to the same commands that have been passing tests the entire time. When logic lives in the UI, agents shift a component and corrupt a data flow. Fix one thing, break three. Separate them, and the UI becomes disposable — rebuildable in minutes while the logic stays proven.

WITHOUT AFD

Agent edits UI → breaks business logic → cascade
Every iteration requires browser testing
Fixing one component breaks two others
State scattered across UI components
Compound debugging: fix → break → fix → break

WITH AFD

Commands pass tests the entire time
Agent iterates logic via CLI in seconds
UI is a thin layer — rip it out, rebuild, rewire
State changes verified through commands
Feedback loop: minutes, not hours

TEST THE JOB. NOT THE UI.

Jobs-to-Be-Done scenario testing. Built in.

Test suites check buttons and endpoints. AFD tests the job the user hired your software to do.

Write a YAML scenario that describes a user journey — create a todo, complete it, delete it. Each step calls a command, asserts the result, and passes data forward. No browser. No Selenium. No flaky CSS selectors. Just the job, validated end to end through the command layer.

JTBD scenarios test the jobs. Surface validation audits the command surface itself — naming collisions, schema drift, prompt injection. Together, they replace fragile E2E suites.

scenario:
  name: "Create and complete a todo"
  tags: ["smoke", "crud"]

steps:
  - name: "Create a new todo"
    command: todo-create
    input:
      title: "Buy groceries"
      priority: "high"
    expect:
      success: true
      data:
        title: "Buy groceries"
        completed: false

  - name: "Complete the todo"
    command: todo-toggle
    input:
      id: "${{ steps[0].data.id }}"
    expect:
      success: true
      data:
        completed: true

steps[0].data.id → Data flows forward automatically.

expect.success → Assert on CommandResult. Confidence, reasoning — all testable.

tags: ["smoke"] → Filter and run subsets. CI-friendly.
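A rough sketch of how a runner could execute the scenario above. It is not the @afd/testing implementation. The scenario is written as Python data to avoid a YAML dependency, and the two todo commands, the reference pattern, and the subset-matching rule for `expect` are all assumptions.

```python
import re
from itertools import count

# Hypothetical in-memory commands standing in for a real command layer.
_ids = count(1)
TODOS = {}


def todo_create(inp: dict) -> dict:
    todo_id = f"todo-{next(_ids)}"
    TODOS[todo_id] = {"id": todo_id, "title": inp["title"],
                      "priority": inp.get("priority", "medium"), "completed": False}
    return {"success": True, "data": dict(TODOS[todo_id])}


def todo_toggle(inp: dict) -> dict:
    todo = TODOS.get(inp["id"])
    if todo is None:
        return {"success": False, "error": {"code": "NOT_FOUND", "message": "No such todo"}}
    todo["completed"] = not todo["completed"]
    return {"success": True, "data": dict(todo)}


COMMANDS = {"todo-create": todo_create, "todo-toggle": todo_toggle}

# Matches references such as "${{ steps[0].data.id }}".
REF = re.compile(r"\$\{\{\s*steps\[(\d+)\]\.([\w.]+)\s*\}\}")


def resolve(value, results):
    """Replace step references in an input with values from earlier results."""
    if isinstance(value, dict):
        return {k: resolve(v, results) for k, v in value.items()}
    if isinstance(value, str):
        match = REF.fullmatch(value)
        if match:
            node = results[int(match.group(1))]
            for key in match.group(2).split("."):
                node = node[key]
            return node
    return value


def matches(expected, actual) -> bool:
    """Subset match: every expected key must be present with an equal value."""
    if isinstance(expected, dict):
        return isinstance(actual, dict) and all(
            k in actual and matches(v, actual[k]) for k, v in expected.items())
    return expected == actual


def run_scenario(steps: list) -> list:
    """Run steps in order; return (step name, passed) pairs, stopping at the first failure."""
    results, report = [], []
    for step in steps:
        result = COMMANDS[step["command"]](resolve(step["input"], results))
        passed = matches(step["expect"], result)
        report.append((step["name"], passed))
        if not passed:
            break
        results.append(result)
    return report


SCENARIO_STEPS = [
    {"name": "Create a new todo", "command": "todo-create",
     "input": {"title": "Buy groceries", "priority": "high"},
     "expect": {"success": True, "data": {"title": "Buy groceries", "completed": False}}},
    {"name": "Complete the todo", "command": "todo-toggle",
     "input": {"id": "${{ steps[0].data.id }}"},
     "expect": {"success": True, "data": {"completed": True}}},
]
```

Running `run_scenario(SCENARIO_STEPS)` walks the whole journey through the command layer. Subset matching keeps assertions short: a step states only the fields it cares about.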

THE TOOLKIT

TypeScript · Python · Rust

AFD ships as packages you install, not a platform you migrate to. Pick a language. Add the package. Define your first command.

npm install @lushly-dev/afd-core @lushly-dev/afd-server
pip install afd
cargo add afd

Stack diagram — Core block with TS, PY, RS, and MCP nodes

@afd/core

CommandResult, CommandError, batching, streaming.

@afd/server

Zod-based MCP server factory with middleware.

@afd/client

MCP client + DirectClient for ~0.03ms in-process calls.

@afd/cli

Connect, call, validate, explore commands.

@afd/auth

Provider-agnostic auth — middleware, session sync.

@afd/testing

JTBD scenario runner, surface validation.

@afd/adapters

Frontend adapters for rendering CommandResult.

afd (Python)

Pydantic CommandResult, FastMCP server.

afd (Rust)

CommandResult types, CommandRegistry, WASM.

COMMANDS ARE THE PRODUCT.

BOTCORE.
