Method: every claim tracked, reviewed every 30–90 days, marked Holding, Partial, or Not holding. Drafted by Claude; signed off by Peter. How this works →
AM-103 · published 28 Apr 2026 · revised 28 Apr 2026 · 8 min read · in AI Implementation

Learning AI by doing AI: 90 days of measured rework across two ventures

Rework rate, measured as deletions over total churn, ran from 8.1% on Rhino-basketball to 13.5% on agentmodeai across the same 90-day window. The range is meaningfully lower than is typical for solo-developer projects but substantially higher than the 'AI codes it once correctly' marketing narrative implies. The data is the evidence, not the framing.

Holding · reviewed 28 Apr 2026 · next review +89d

The rework rate across the two ventures Peter built with Claude in Q1 2026 ranged from 8.1% on Rhino-basketball to 13.5% on agentmodeai over the 90-day window from 28 January to 28 April 2026. The data is the evidence; the framing is the question this piece is for.

This is the third article in the build-log series. The first (AM-101) named the structural feature this publication is built on. The second (AM-102) named the production model. This piece names the data, what 90 days of building with AI actually looks like, measured.

The methodology is the locked decision from the planning session on 28 April 2026: rework rate measured as git deletions divided by total git churn (insertions plus deletions). The metric is mechanical and reproducible. It is not the only metric that matters, but it is the metric a small team can compute on demand for any git-versioned project, and that is the discipline the rest of this piece sits on.

The numbers

Two ventures, same 90-day window, git log --shortstat --no-merges --since="2026-01-28" on each.
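The shortstat output can be rolled up into the rework rate with a short awk pass. A minimal sketch; the `sample` variable is a hypothetical three-commit stand-in for the real `git log --shortstat --no-merges --since="2026-01-28"` output:

```shell
# Hypothetical stand-in for real `git log --shortstat --no-merges` output.
sample=' 3 files changed, 120 insertions(+), 14 deletions(-)
 1 file changed, 9 insertions(+)
 2 files changed, 40 insertions(+), 6 deletions(-)'

# Sum insertions and deletions, then compute
# rework = deletions / (insertions + deletions).
rate=$(printf '%s\n' "$sample" | awk '
  { for (i = 1; i < NF; i++) {
      if ($(i+1) ~ /insertion/) ins += $i
      if ($(i+1) ~ /deletion/)  del += $i
    } }
  END { printf "%.1f", 100 * del / (ins + del) }')

echo "rework rate: ${rate}%"
```

On the sample above this prints a 10.6% rate (20 deletions over 189 lines of churn); piping real git output through the same awk program reproduces the figures in the tables below.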

agentmodeai, this publication’s content-and-infrastructure repository.

Metric         Value
Commits        187
Files changed  2,075
Insertions     92,385
Deletions      14,412
Total churn    106,797
Rework rate    13.5%

Rhino-basketball, the operations system Peter chairs for the basketball club in Oosterwolde.

Metric         Value
Commits        376
Files changed  3,237
Insertions     392,025
Deletions      34,401
Total churn    426,426
Rework rate    8.1%

DealVex, the B2B SaaS for auto dealers, built for Peter’s son-in-law’s business.

Metric       Value
Methodology  Not yet measurable
Reason       Project not git-versioned
Action       Methodology applies as soon as git history exists

Three projects, two with data, one acknowledged as a measurement gap. That is the honest version of the build-log so far.

What the numbers actually say

Two readings of the 8.1%-to-13.5% range, both defensible.

The lower end (Rhino-basketball, 8.1%) is consistent with a project whose specification was clear at the start. The basketball club’s operational reality is well-bounded: schedule training, manage trainer availability, track member attendance, handle club communication. The features the system needed were knowable from week one. Most code shipped close to first-draft because the spec did not change underneath it. The 8% rework reflects normal refactoring, dependency updates, and bug fixes, not the project rewriting its own thesis.

The higher end (agentmodeai, 13.5%) is consistent with a project whose specification evolved meaningfully during the build. The content pipeline shifted shape several times across the 90-day window. The Holding-up ledger system (the central feature) was not designed in advance; it emerged from the editorial process. The retractions register, the corrections page, the per-claim review cadence, the operators register, the affiliate firewall, each of these was added during the build, which means earlier code that assumed simpler structure had to be rewritten when the structure changed. The 13.5% rework reflects spec uncertainty during the build, not engineering inefficiency afterward.

Neither number is good or bad. They are the cost of the project’s spec clarity at start, measured. The lesson is not “Rhino-basketball was better engineered.” The lesson is that rework rate tells you what the project did not know at the beginning, expressed as code lines.

What the numbers do not support

Four framings the data does not support, and which would be tempting to assert without it.

The marketing narrative that AI-paired development eliminates rework. Both projects rewrote between 8% and 13.5% of their code lines across 90 days. The vendor framing of “AI codes it correctly the first time” is not what the data shows. It shows that AI-paired development still has spec uncertainty, still has refactor work, and still produces deletions. The advantage the AI offers, if any, is in the velocity of the rework cycle, not in eliminating the cycle.

A strong claim about productivity gain. The comparison that would support a productivity claim is rework rate against a non-AI baseline for the same projects, which neither project has. Without the counterfactual, the only honest statement is that the rework rate happened at this level on these projects in this window. Anything beyond that would be inference dressed as data.

A strong claim about which kinds of work AI assists best. The data is aggregated across the entire commit history, not segmented by task type (feature implementation, bug fix, refactor, documentation, infrastructure). The next iteration of the methodology should segment by commit-message convention, but the current numbers are too coarse to support task-level claims.
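A segmentation pass of that kind could pair each commit subject with its shortstat line and bucket churn by the prefix before the colon. A sketch over hypothetical log output; the real input would be `git log --no-merges --shortstat --format='%s'`, and the subject lines shown are invented examples of a conventional-commit style:

```shell
# Hypothetical output of: git log --no-merges --shortstat --format='%s'
# (each commit subject is followed by its shortstat line).
sample='feat: add ledger page
 3 files changed, 200 insertions(+), 10 deletions(-)
fix: correct review cadence
 1 file changed, 5 insertions(+), 5 deletions(-)
refactor: split pipeline module
 2 files changed, 80 insertions(+), 60 deletions(-)'

# Bucket insertions/deletions by commit-type prefix, then report a
# per-type rework rate.
printf '%s\n' "$sample" | awk '
  / files? changed/ {
    for (i = 1; i < NF; i++) {
      if ($(i+1) ~ /insertion/) ins[type] += $i
      if ($(i+1) ~ /deletion/)  del[type] += $i
    }
    next
  }
  { split($0, a, ":"); type = a[1] }
  END {
    for (t in ins)
      printf "%s rework=%.1f%%\n", t, 100 * del[t] / (ins[t] + del[t])
  }' | sort
```

On the sample this separates a low feature-work rate (4.8%) from high fix and refactor rates (50.0% and 42.9%), which is exactly the task-level resolution the aggregate numbers lack.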

Inference about other developer-AI workflows. These are two specific projects with one specific human-AI pairing pattern (Peter as primary developer, Claude as paired drafting partner, occasional code review by independent humans). Different pairing patterns (AI-as-junior, AI-as-pair-programmer, AI-as-reviewer-only) would produce different rework rates. The numbers here describe Peter’s pattern, not a general claim about agentic-AI development.

The point of these four limits is not to undermine the data. It is to mark the boundary of what the data supports, so the claim AM-103 is making does not exceed what the measurement can defend.

Why DealVex is missing and what it implies

DealVex is the third venture in the series. It is also the only one of the three that is not git-versioned. The build is happening on a project directory with backup snapshots but no git history, which means the rework-rate methodology cannot be applied retroactively. The build started before the publication’s measurement discipline existed, and the cost of retrofitting git versioning over the existing project state is non-trivial.

Two implications worth surfacing.

The first is that the build-log series cannot make a “three ventures, three data points” claim until DealVex catches up. The honest version is “two ventures with data, one acknowledged gap.” That is what AM-103 documents and what the 90-day review on 27 July will track.

The second is that the measurement discipline applies to projects from the moment they are versioned, not retroactively. A small-team operator considering whether to adopt the rework-rate metric should treat it as a forward-looking discipline. Backfilling it on an existing project is structurally similar to backfilling a claim ledger on a publication that was not built with one: possible, but the cost is high enough that the practical version is “start measuring from now.”

What an operator should take from this

This piece is in the enterprise register, but the operational lesson generalises. A small team building with AI in 2026 should:

Measure the rework rate from day one. git log --shortstat --no-merges --since="<start-date>" is one command. It runs in seconds. The metric is imperfect but reproducible. Skipping it for the first 90 days produces the DealVex problem: the data is gone and cannot be recovered without rebuilding the project history.

Treat the rework rate as a spec-clarity signal, not a productivity signal. A high rework rate means the project did not know what it was at the start. That is fine; many projects do not. A low rework rate means the spec held; that is also fine and tells you something about the project’s nature, not about the team’s quality. The metric is diagnostic, not evaluative.

Re-measure on a fixed cadence. The Q2 numbers for both ventures will be on the ledger on 27 July 2026. Whether the 8% and 13.5% rates hold, or shift, will be visible at the same URL this piece publishes from. That is the discipline the publication is built on, applied to its own development.
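The first discipline above can be exercised end to end in a throwaway repository. A sketch, assuming git is installed; the file contents and commit messages are hypothetical:

```shell
# Throwaway demo: two commits, then the one-command measurement.
tmp=$(mktemp -d) && cd "$tmp"
git init -q
git config user.email dev@example.com
git config user.name  dev

seq 1 10 > app.txt                                   # first draft: 10 lines
git add app.txt && git commit -qm "feat: initial draft"

{ echo changed; echo changed; seq 3 10; } > app.txt  # rewrite 2 of 10 lines
git commit -qam "fix: rewrite two lines"

# 12 insertions, 2 deletions => rework = 2 / 14 = 14.3%
git log --shortstat --no-merges | awk '
  { for (i = 1; i < NF; i++) {
      if ($(i+1) ~ /insertion/) ins += $i
      if ($(i+1) ~ /deletion/)  del += $i
    } }
  END { printf "rework rate: %.1f%%\n", 100 * del / (ins + del) }'
```

Swapping the demo repository for a real project root, and adding `--since="<start-date>"`, gives the same measurement the tables above report.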

The data is the evidence. The framing is the question. This piece commits the publication to surfacing the data on a quarterly rhythm and to taking the framing apart whenever the data does not support it.

What we are tracking

Claim AM-103 is logged with a 90-day review on 27 July 2026. The trackable assertion is in two parts:

  1. DealVex methodology lands by 27 July 2026 with comparable rework-rate data. Either DealVex is git-versioned by then with at least 60 days of history, or the gap is documented and the next milestone is set. Without this, the build-log series carries a permanent measurement hole.

  2. The agentmodeai and Rhino-basketball rework rates measured over the next 90 days stay within ±5 percentage points of the current figures, or the deviation is explained by a documented project-shape shift. If both rates land inside the 3.1–18.5% band that ±5 points around the current figures defines, the methodology is reproducible. If either breaks the band, the change either reflects real project-shape evolution (which the writeup will name) or methodology weakness (which the writeup will fix).

If both review checks land cleanly, claim Holds and the methodology is locked for Q3 2026. If DealVex lands but rates diverge, Partial and the divergence is the next piece. If DealVex does not land at all, Partial and the publication owes the reader an explanation of why three ventures became two.

The data is on the ledger. It will be re-measured in public on 27 July, and if the methodology does not hold, the correction will be on the same page.


Spotted an error? See corrections policy →

Disagree with this piece?

Reasoned disagreement is a first-class signal here. Every review cycle weighs documented dissent; material dissent becomes part of the article's change history. This is not a corrections form — use /corrections/ for factual errors.

