Mastering Agentic AI Coding: A Practical Guide to Verification and Harness Engineering
Overview
In the rapidly evolving landscape of software development, the use of AI for coding has shifted from mere experimentation to a core practice. Chris Parsons' updated guide (his third update) provides concrete insights that resonate with the best advice available. This tutorial distills those insights into actionable steps, focusing on the critical shift from vibe coding—where you ignore the generated code—to agentic engineering, where you orchestrate AI agents with a robust verification framework. The key takeaway: raw building speed is no longer the differentiator; what matters now is how fast you can verify correctness. This guide will walk you through setting up a harness, training your AI, and scaling your impact.

Prerequisites
Before diving into agentic AI coding, ensure you have:
- Proficiency in at least one programming language (e.g., Python, JavaScript, TypeScript)
- Experience with version control (Git) and continuous integration (CI) tools
- Familiarity with AI coding assistants like Claude Code or Codex CLI
- Understanding of unit testing, type checking, and static analysis concepts
- Basic knowledge of YAML for CI pipeline configuration
Step-by-Step Instructions
Step 1: Adopt an Agentic Mindset
Distinguish between vibe coding and agentic engineering. Vibe coding involves accepting AI-generated code without review, which can lead to hidden bugs and technical debt. Agentic engineering treats AI as a junior developer that needs clear instructions, guardrails, and verification. Start each session by defining the task in small, testable chunks. For example, instead of asking “Build a login system,” break it into: “Generate a password hash function with error handling” and “Create a login endpoint with rate limiting.”
Step 2: Set Up Your Development Harness
The harness is the environment where AI agents operate and verify their output. Simon Willison distinguishes tools like Claude Code and Codex CLI by the inner harness they provide—built-in safety checks. You must augment this with external verification layers.
- Add automated tests: Write test suites (e.g., pytest) that run on every generated code change.
- Enable type checkers: Use mypy (Python) or TypeScript’s strict mode to catch type mismatches.
- Use static analysis: Integrate linters (ESLint, Pylint) to enforce code style and common pitfalls.
- Build CI gates: Configure a GitHub Actions workflow that runs tests, type checks, and linting before allowing merges.
Example GitHub Actions configuration:

```yaml
name: AI Code Verification
on: [push]
jobs:
  verify:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Run tests
        run: pytest
      - name: Type check
        run: mypy src/
```
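CI gives you the final gate, but the feedback loop is faster if the agent can run the same checks locally before pushing. Here is a small sketch of a local mirror of that gate; the tool names in the comment are assumptions—substitute whatever your project actually runs.

```python
import subprocess
import sys

# Typically the same commands as the CI workflow, e.g.:
#   [["pytest", "-q"], ["mypy", "src/"]]
def run_gate(checks: list[list[str]]) -> bool:
    """Run each check; return True only if every command exits 0."""
    ok = True
    for cmd in checks:
        if subprocess.run(cmd).returncode != 0:
            print("FAILED:", " ".join(cmd))
            ok = False  # keep going so the agent sees every failure at once
    return ok


# Smoke-test the gate itself with commands guaranteed to exist:
assert run_gate([[sys.executable, "-c", "pass"]]) is True
assert run_gate([[sys.executable, "-c", "import sys; sys.exit(1)"]]) is False
```

Reporting all failures in one pass, rather than stopping at the first, matters when an agent (not a human) consumes the output: one round trip fixes everything.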
Step 3: Leverage AI to Generate Multiple Approaches
Prompt the AI to produce several distinct solutions for the same problem. For example: “Give me three different algorithms for sorting this list, each with performance analysis.” Then use your harness to test all three simultaneously. The goal is not to pick the “best” immediately but to gather verification data. A team that can evaluate five approaches in an afternoon outpaces one that tests a single approach over a week.
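Evaluating several candidates at once only works if they all run through the same harness. This sketch (hypothetical function names, standard library only) shows one way to property-check three sorting implementations against a trusted oracle and collect the verification data as a dict:

```python
import random


def bubble_sort(xs):
    xs = list(xs)
    for i in range(len(xs)):
        for j in range(len(xs) - 1 - i):
            if xs[j] > xs[j + 1]:
                xs[j], xs[j + 1] = xs[j + 1], xs[j]
    return xs


def insertion_sort(xs):
    out = []
    for x in xs:
        i = len(out)
        while i > 0 and out[i - 1] > x:
            i -= 1
        out.insert(i, x)
    return out


def builtin_sort(xs):
    return sorted(xs)


def verify(fn, trials=100):
    """Property check: output must match sorted() on random inputs."""
    rng = random.Random(0)  # seeded, so failures are reproducible
    for _ in range(trials):
        data = [rng.randint(-50, 50) for _ in range(rng.randint(0, 25))]
        if fn(data) != sorted(data):
            return False
    return True


candidates = {"bubble": bubble_sort, "insertion": insertion_sort,
              "builtin": builtin_sort}
results = {name: verify(fn) for name, fn in candidates.items()}
```

The `results` dict is the point: you end the afternoon with pass/fail data on every approach, not an opinion about one.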
Step 4: Implement Verification First
Prioritize building review surfaces over crafting perfect prompts. As the original guide emphasizes: “Build better review surfaces, not better prompts.” Make feedback loops as short as possible. This means:
- Automate AI self-verification: Instruct the agent to run tests in a realistic environment before presenting results to you.
- Use “assertions” in prompts: Ask the AI to include unit tests within its output.
- Shift human review to only what requires judgment: business logic, UX, edge cases that automation can’t cover.
Example prompt: “Write a function to validate email addresses, and include at least three test cases that cover valid, invalid, and edge cases. Run the tests and fix any failures before finalizing.”
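A plausible shape for what that prompt should produce—the function plus the tests that travel with it—looks like this. The regex is deliberately simple and is an assumption, not a full RFC 5322 validator:

```python
import re

# Simplified pattern: local part, "@", domain with at least one dot and a TLD.
_EMAIL_RE = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$")


def is_valid_email(address: str) -> bool:
    """Cheap syntactic validation; deliverability still needs an MX check."""
    if not isinstance(address, str) or len(address) > 254:
        return False
    return _EMAIL_RE.fullmatch(address) is not None


# The "assertions in prompts" technique: tests ship with the code.
def test_valid():
    assert is_valid_email("user@example.com")


def test_invalid():
    assert not is_valid_email("not-an-email")


def test_edge_cases():
    assert not is_valid_email("")
    assert not is_valid_email("a@b")  # domain without a TLD
    assert not is_valid_email("x@" + "d" * 260 + ".com")  # over length limit
```

Because the tests arrive in the same diff, the harness can run them immediately—the agent verifies its own work before you ever see it.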
Step 5: Train the AI to Code Properly
The most valuable role of a senior engineer is training the AI, not approving every diff. Treat the AI as a coder that needs continuous feedback. Use these techniques:
- Provide detailed style guides: Add a `.claude-commands` or `.codex-rules` file with your team's conventions.
- Correct mistakes explicitly: When the AI produces a wrong pattern, show the correct one and explain why.
- Document patterns: Keep a living document of “what works” with AI for your team.
- Review diffs strategically: Focus your eyes only on high-risk changes. Automate everything else.
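A rules file of this kind is just plain text the agent reads at session start. The fragment below is a hypothetical example—the exact filename and format depend on your tool, and every rule shown is an invented placeholder to adapt:

```text
# Team conventions for AI agents (illustrative only)
- Use type hints on all public functions.
- Every new function ships with at least one test in tests/.
- Never catch a bare Exception; name the exception class.
- Run the local verification gate before declaring a task done.
- Prefer standard-library solutions; propose a new dependency, don't add one.
```

Rules phrased as verifiable behaviors ("ships with at least one test") work better than vague preferences, because the harness can check them.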
Step 6: Scale Your Skills by Teaching Others
Agentic engineering compounds when shared. Run pair-sessions with junior developers to show how you prompt, verify, and iterate. Establish a “harness team” that owns the verification infrastructure. As Chris Parsons notes: the way out of being a diff-checker is to make yourself the person who shapes the harness. Make that work visible and measurable—e.g., reduced review time, fewer production bugs.
Common Mistakes
- Relying on human review for everything: At agent-level throughput, human reviewers become the bottleneck. Automate all mechanical checks.
- Neglecting the harness: Without tests, type checkers, and CI, AI-generated code will degrade quality. Invest in these early.
- Forgetting to train the AI: Repeating the same mistakes without feedback will not improve the model’s output. Use examples and corrections.
- Mixing vibe coding with agentic engineering: You cannot occasionally ignore code and expect consistent quality. Be deliberate about when you trust AI.
- Ignoring computational sensors: Birgitta Böckeler’s article on harness engineering highlights the role of static analysis and tests as sensors. Neglect them, and you operate blind.
Summary
The age of AI coding demands a mindset shift from building speed to verification speed. By adopting agentic engineering—setting up a robust harness, generating and verifying multiple solutions, and training the AI through feedback—you can multiply your productivity while maintaining quality. Senior engineers must become harness builders, not diff approvers. This guide has provided a practical roadmap: start by distinguishing vibe coding from agentic work, build your verification infrastructure, leverage AI to explore alternatives, automate feedback loops, and teach others. The game has changed. Adapt your practices accordingly.