Accuracy

Buildable designs. Full coverage.
Footage you can defend.

Every MACRILON release is benchmarked against 174 real, human-engineered as-built FTTH designs across 3 US markets — the networks an experienced OSP engineering team actually built. We lead on the two things you can measure directly and absolutely — buildability and serviceable-home coverage — and we publish footage as an honest engineering estimate band. Here is the current result, including the misses, and exactly how it is measured.
70% of corpus designs score CONSTRUCTABLE on the buildability audit (C1–C8: in-ROW, offset-from-road, perpendicular crossings, no through-buildings)
97.1% of designs land within ±5% on serviceable homes passed vs the design of record
±15–20% footage delivered as an AACE estimate band benchmarked to the 174-design human-variance envelope, not to one engineer

Footage: an honest estimate band, not a single-engineer match

Total route footage is the dominant cost driver, so we keep it accurate for costing, BOM, and BEAD cost-per-location. We report it the way the construction industry reports estimates: as a tolerance band, benchmarked to the spread that real human designs show against each other, not to one engineer's line. Two professionals designing the same serving area routinely differ 15–20% in footage; we grade against that envelope.

Table: share of benchmark designs whose total footage falls within each AACE band of the human as-built (160 scoreable designs of 174; the remainder lack a comparable footage baseline). Deterministic run: re-running the benchmark reproduces these numbers exactly.
| Tolerance band | Designs passing | Rate | Framing |
|---|---|---|---|
| ±10% | 74 / 160 | 46.2% | Tighter than the self-consistency of the human corpus itself |
| ±15% | 105 / 160 | 65.6% | Estimate-grade: AACE 18R-97 Class 2 lower tolerance; near the floor of human-vs-human variance |
| ±20% | 115 / 160 | 71.9% | Estimate-grade: AACE 18R-97 Class 2 tolerance (−15/+20%) |
| ±25% | 125 / 160 | 78.1% | |
| ±30% | 132 / 160 | 82.5% | Budget-grade: AACE 18R-97 Class 3 tolerance (−20/+30%) |

These bands count boundary cases as failures (strict inequality). In the current release no design sits exactly on any band boundary, so strict and inclusive counting agree at every row. When a boundary case exists we publish the strict figure and footnote the inclusive one here rather than rounding the difference away. The ±15% row above reflects our imitation-learned research routing; the footage configuration shipped in delivered packages scores a touch more conservatively (~62% at ±15%) and is tuned to over-build rather than under-build (see the near-miss split below).
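
The strict-versus-inclusive counting rule can be illustrated with a minimal Python sketch. The function name `band_pass_rate` and the sample errors are hypothetical and are not taken from the benchmark pipeline or its data; the sketch only shows how a design sitting exactly on a boundary changes the count.

```python
def band_pass_rate(signed_errors_pct, tolerance_pct, inclusive=False):
    """Count designs inside a symmetric tolerance band.

    signed_errors_pct: signed footage errors in percent (engine vs human).
    inclusive=False applies strict inequality, so a design sitting exactly
    on the boundary counts as a failure.
    Returns (passing, total, rate_pct).
    """
    if not signed_errors_pct:
        raise ValueError("need at least one scored design")
    if inclusive:
        passing = sum(1 for e in signed_errors_pct if abs(e) <= tolerance_pct)
    else:
        passing = sum(1 for e in signed_errors_pct if abs(e) < tolerance_pct)
    total = len(signed_errors_pct)
    return passing, total, 100.0 * passing / total


# Hypothetical signed errors (percent), NOT benchmark data.
# The +15.0 entry sits exactly on the ±15% boundary.
sample_errors = [-19.8, -12.0, -4.5, 3.1, 9.9, 15.0, 18.3, 31.0]

strict = band_pass_rate(sample_errors, 15.0)
inclusive = band_pass_rate(sample_errors, 15.0, inclusive=True)
print(strict)     # (4, 8, 50.0)
print(inclusive)  # (5, 8, 62.5)
```

With no design exactly on a boundary, the two calls return the same tuple, which is the situation described for the current release.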

The near-miss band, published with signs

10 designs sit in the 15–20% band: 4 over-builds (+15.7% to +18.3%) and 6 under-builds (−16.2% to −19.8%). Median absolute footage error across all 160 scoreable designs is 11.2%. The current release's routing is imitation-learned from the human as-builts themselves; it pulled most of the previous release's over-build near-misses inside the ±15% gate, leaving a smaller, two-sided band. We publish the signed split rather than keep a directional claim that no longer holds.
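
As a rough sketch of how a signed near-miss split and a median absolute error could be derived from a list of signed footage errors, consider the Python below. The helper `near_miss_split` and the sample list are hypothetical; the list borrows the published range endpoints only as illustrative values and is not the benchmark data. The band edges assume the strict counting rule stated above: a design fails ±15% when |error| ≥ 15 and passes ±20% when |error| < 20.

```python
from statistics import median


def near_miss_split(signed_errors_pct, inner=15.0, outer=20.0):
    """Split designs in the near-miss band (inner <= |error| < outer) by sign.

    Returns (over_builds, under_builds), each sorted ascending.
    """
    over = sorted(e for e in signed_errors_pct if inner <= e < outer)
    under = sorted(e for e in signed_errors_pct if -outer < e <= -inner)
    return over, under


# Hypothetical signed errors (percent), NOT benchmark data.
sample_errors = [-19.8, -16.2, -8.0, 2.5, 11.2, 15.7, 18.3]

over_builds, under_builds = near_miss_split(sample_errors)
median_abs_error = median(abs(e) for e in sample_errors)

print(over_builds)       # [15.7, 18.3]
print(under_builds)      # [-19.8, -16.2]
print(median_abs_error)  # 15.7
```

Keeping the sign on each error until the final step is what makes a two-sided report possible; taking absolute values earlier would hide whether near-misses lean toward over-build or under-build.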

Why footage is a band, not a single-engineer match

Context for reading the table honestly — grading a design to one human's exact footage is, on its own, the wrong test.

Methodology

  • Benchmark corpus: 174 real subdivision-scale FTTH designs across 3 US markets — Holly Springs NC (80 designs), Chesterfield VA (57), and Hampton VA (37) — engineered and built by professional OSP teams. We disclose the concentration plainly: the corpus comes from a single engineering lineage (one firm's design standards), so published accuracy is measured on that distribution; a fourth market (Greensboro NC) is being onboarded and is not yet in the benchmark. For each design we hold the human design of record (routes, splitters, workbook) as ground truth. 160 designs have a comparable total-footage baseline and are scored on footage; all 174 are scored on homes passed and design rules.
  • Procedure: the engine receives only the boundary and the design standards — never the human answer. It produces its full package; we then compare total constructed footage, homes passed, and rule compliance (split architecture, port budgets, spare capacity) against the as-built.
  • Footage score: signed percent difference of total design footage vs the human design. A band "passes" when |difference| is within tolerance.
  • Homes passed: 97.1% of designs match the engineer-of-record serviceable-home count within ±5% tolerance, using only open authoritative data — no licensed location datasets. This is a direct, absolute measurement against ground truth — not a comparison to a single subjective line.
  • Buildability: every design is scored C1–C8 by our constructability audit — offset-from-road, in-ROW, no through-buildings, correct side, perpendicular crossings, coverage, access, documentation — producing a 0–100 score and a band (CONSTRUCTABLE / NEEDS-REVIEW / NOT-BUILDABLE). About 70% of the corpus scores CONSTRUCTABLE. This is an advisory field check, not yet a hard release gate, and the auditor itself is still maturing (a 2026-06-18 fix corrected an over-penalty that had wrongly flagged buildable jobs). We report it because it measures the thing that actually matters in the field — can a crew build this as drawn — without needing the human's answer key.
  • Cost: the costed workbook is built line-item from the customer's own disclosed unit rates (template rates are the fallback only when no rate card is present). On our calibrated benchmark build (Holly Springs) it landed within ~2% of the engineer-of-record estimate for the same scope. On our first never-seen-market verification run (Hampton VA), the customer-rates estimate priced +10.9% above the engineer of record ($334,013 vs $301,106, like-for-like scope including permits and provisioning), with the residual itemized — about half route-geometry difference, half BOM semantics on splitter hardware. We measure and publish that gap per market rather than quote the calibrated figure everywhere — rate-card calibration is part of onboarding.
  • Reproducibility: the benchmark pipeline is deterministic — the same release re-scored against the corpus produces byte-identical results. Numbers on this page update only when a release moves them, in either direction.
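
The signed percent difference used for the footage score, and the same arithmetic applied to the Hampton VA cost figures quoted above, can be sketched in a few lines of Python. The function names are hypothetical, the footage values in the second example are invented for illustration, and the sketch assumes the human (or engineer-of-record) value is the denominator.

```python
def signed_pct_diff(engine_value, reference_value):
    """Signed percent difference of the engine result vs the reference.

    Positive means the engine is above the reference (over-build or
    over-price); negative means it is below.
    """
    if reference_value == 0:
        raise ValueError("reference value must be non-zero")
    return 100.0 * (engine_value - reference_value) / reference_value


def passes_band(diff_pct, tolerance_pct):
    """Strict band check: a design exactly on the boundary fails."""
    return abs(diff_pct) < tolerance_pct


# Cost figures quoted in the Cost bullet (Hampton VA verification run).
hampton_gap = signed_pct_diff(334_013, 301_106)
print(f"{hampton_gap:+.1f}%")  # +10.9%

# Hypothetical footage values, NOT benchmark data.
footage_diff = signed_pct_diff(10_450, 10_000)
print(footage_diff, passes_band(footage_diff, 10.0))  # 4.5 True
```

The same function serves footage and cost because both are reported as signed gaps against a single reference value.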

Page last updated June 18, 2026, from the benchmark run of record for the current release.

Audit us against your own network

The benchmark proves we build to one team's standard. Send a boundary your team has built, and check the buildability, coverage, and footage band against yours.