Private beta

Give your codebase a fitness grade. Then coach it to an A.

BenchFit scores your repo A+ to F across six dimensions of code fitness, architecture gates, and Big Ball of Mud risk. Scan on your machine or in the cloud — then loop with your AI agent until the grade climbs.

Go, Rust, TypeScript & Python. No config. Your source is never retained in the cloud.

~/acme-api · benchfit -cloud

Grade B+ · 80.6/100 · BBoM risk 18 (healthy) · 98 files · 24 packages · 8.2s

R 94.2 · C 79.1 · O 72.5 · D 69.5 · T 72.4 · X 58.9

Gates 28 / 28
Legend: R Readability · O Operational · X Change Risk

The scale

A+ · A · B+ · B · C+ · C · D · F

Go · Rust · TypeScript · Python (new)

The workflow

From first scan to a grade you enforce

Five steps, one tight loop. Sign up, scan, improve with your AI agent, watch the trend, and hold the line in CI.

01 INSTALL & SIGN UP

Install, then sign up from your terminal

Install the CLI with one command, then run benchfit login, which opens your browser to authorize against GitHub. Approve, and you're in: your key is written to your machine. No dashboard signup, no YAML to fill in first.

$ curl -fsSL https://bench.fit/install.sh | sh

$ benchfit login
# opens a browser · authorize with GitHub · key stored locally

Free during the beta. Keys are per developer and tied to your GitHub handle. macOS and Linux supported.

02 SCAN

Scan your repo

Run benchfit for a local scan, or add -cloud to score server-side. Either way you get a grade, a score, the six dimension bars, and gate status — no configuration required.

$ cd your-repo && benchfit -cloud

benchfit -cloud

  Packaging your-repo for cloud scan…
  Uploading 98 files (1.2 MB compressed) to bench.fit…

BenchFit Code & Architecture Platform
=====================================

  Grade: B+      Score: 80.6/100     BBoM: 18 healthy

  Scanned 98 files across 24 packages in 8.2s

Dimensions
------------------------------------------------------------
  R  Readability       94.2/100  (20%)  ######################--
  C  Changeability     79.1/100  (15%)  ##################------
  O  Operational       72.5/100  (25%)  #################-------
  D  Dependency        69.5/100  (10%)  ################--------
  T  Test Quality      72.4/100  (15%)  #################-------
  X  Change Risk       58.9/100  (15%)  ##############----------

Gates (28 checked)
------------------------------------------------------------
  28 passed, 0 failed
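
If you want to script around the report, the summary line is easy to pull apart. The following is a minimal sketch that assumes the plain-text layout shown in the sample output above; the `Summary` type and `parse_summary` helper are hypothetical names, not part of BenchFit, and the real output may differ between versions.

```python
import re
from typing import NamedTuple


class Summary(NamedTuple):
    grade: str
    score: float
    bbom: int
    bbom_status: str


# Mirrors the "Grade: ... Score: ... BBoM: ..." line from the sample report.
# This pattern is an assumption based on that sample, not a documented format.
SUMMARY_RE = re.compile(
    r"Grade:\s*(?P<grade>[A-F][+-]?)\s+"
    r"Score:\s*(?P<score>\d+(?:\.\d+)?)/100\s+"
    r"BBoM:\s*(?P<bbom>\d+)\s+(?P<status>[\w-]+)"
)


def parse_summary(report: str) -> Summary:
    """Extract grade, score, and BBoM values from a text report."""
    match = SUMMARY_RE.search(report)
    if match is None:
        raise ValueError("no summary line found in report")
    return Summary(
        grade=match["grade"],
        score=float(match["score"]),
        bbom=int(match["bbom"]),
        bbom_status=match["status"],
    )


sample = "  Grade: B+      Score: 80.6/100     BBoM: 18 healthy"
print(parse_summary(sample))
```

A wrapper like this could feed the grade and score into a local pre-push check or a trend log.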

03 IMPROVE

Loop with Claude Code

This is where the grade actually moves. Ask benchfit next for the single highest-priority fix, hand it to your coding agent, let it make the change, then re-scan. BLOCKING gates come first, then WARNING gates, then the weakest dimension metrics.

$ benchfit next
# the one fix that moves your grade the most, right now
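
The ordering described above (BLOCKING gates, then WARNING gates, then the weakest dimension metrics) can be pictured as a simple sort. This is an illustrative sketch only: the `Finding` type, the `kind` labels, and the tie-break on dimension score are assumptions, not BenchFit's actual ranking code.

```python
from dataclasses import dataclass

# Lower rank sorts first: BLOCKING gates, then WARNING gates, then metrics.
KIND_RANK = {"blocking_gate": 0, "warning_gate": 1, "metric": 2}


@dataclass(frozen=True)
class Finding:
    kind: str                      # one of the KIND_RANK keys (hypothetical labels)
    name: str
    dimension_score: float = 0.0   # only meaningful for metric findings


def prioritize(findings):
    """Order findings by kind, then by weakest dimension score first."""
    return sorted(findings, key=lambda f: (KIND_RANK[f.kind], f.dimension_score))


findings = [
    Finding("metric", "assertion density", 72.4),
    Finding("warning_gate", "god file"),
    Finding("metric", "hotspots", 58.9),
    Finding("blocking_gate", "circular dependency"),
]
ordered = prioritize(findings)
print([f.name for f in ordered])
```

In this picture, `benchfit next` would correspond to the first element and `benchfit next --all` to the whole list.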

Drop this CLAUDE.md snippet in your repo and your agent knows the loop without being told each time:

CLAUDE.md

## Code fitness with BenchFit

This repo is graded by BenchFit.
Target grade: A or better.

When you finish a change, or when I ask you to improve code fitness:

1. Run `benchfit -cloud` to get the current grade, score, and the six
   dimension scores (R C O D T X).
2. Run `benchfit next` for the single highest-priority recommendation
   (or `benchfit next --all` for the full prioritized list). BenchFit
   ranks BLOCKING gates first, then WARNING gates, then the weakest
   dimension metrics.
3. Make the smallest change that addresses it. To focus on one axis, use
   `benchfit next --dimension <R|C|O|D|T|X>`.
4. Re-run `benchfit -cloud`, confirm the grade or score improved and no
   gate regressed, and repeat until the grade reaches the target.

Never disable a gate or exclude files to raise the score — fix the cause.

Worth learning — the commands you'll reach for most:

benchfit                       Local scan of the current directory
benchfit -cloud                Score server-side via the cloud API
benchfit next                  The single highest-priority recommendation
benchfit next --dimension R    Prioritized fixes for one dimension
benchfit web                   Open this repo's dashboard in a browser
benchfit -cloud -strict        Fail on blocking architecture gates

04 WATCH

Watch the trend on the web

benchfit web opens your repo's dashboard: grade over time, current gate status, and where the score is heading. Good for standups, good for proving the loop is working.

$ benchfit web
# opens your repo's performance page

benchfit web · acme-api · live

Grade B+
Grade over time · last 12 scans · ▲ +6.2
28 / 28 gates · BBoM healthy · 72.4 → 80.6

05 ENFORCE

Enforce the bar in CI

Set a floor and fail the build when the repo drops below it. -strict fails on blocking architecture gates; -min-grade and -min-score fail below a threshold you choose. Store your key as the repo secret BENCHFIT_KEY.

.github/workflows/fitness.yml

name: code-fitness
on:
  pull_request:
  push:
    branches: [main]

jobs:
  benchfit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v5
        with:
          go-version: stable
      - name: Install BenchFit
        run: curl -fsSL https://bench.fit/install.sh | sh
      - name: Grade code fitness
        env:
          BENCHFIT_KEY: ${{ secrets.BENCHFIT_KEY }}
        run: benchfit -cloud -strict -min-grade=B
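
The floor check behind -min-grade and -min-score can be thought of as two independent comparisons. The sketch below is an assumption about how such thresholds would behave, using the grade order from "The scale"; `meets_floor` is a hypothetical helper, not the CLI's implementation.

```python
# Grade order taken from "The scale", best to worst.
SCALE = ["A+", "A", "B+", "B", "C+", "C", "D", "F"]


def meets_floor(grade, score, min_grade=None, min_score=None):
    """Return True when the scan clears every floor that was set."""
    if min_grade is not None and SCALE.index(grade) > SCALE.index(min_grade):
        return False  # grade sits below the requested minimum grade
    if min_score is not None and score < min_score:
        return False  # numeric score sits below the requested minimum
    return True


print(meets_floor("B+", 80.6, min_grade="B"))
print(meets_floor("B+", 80.6, min_score=85))
```

With the sample scan (B+, 80.6), a floor of B passes while a score floor of 85 fails.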

What's measured

Six dimensions, weighted by what matters in production

62+ metrics roll up into six dimension scores. Operational quality carries the most weight — code that can't survive production doesn't get to be elegant.

R · 20% weight

Readability

Function length, cyclomatic complexity, naming, nesting, documentation, dead code.

C · 15% weight

Changeability

Cohesion, coupling, abstraction, API stability, value objects, connascence.

O · 25% weight

Operational

Error handling, logging, health checks, graceful shutdown, resilience, rollback readiness.

D · 10% weight

Dependency

Freshness, vulnerabilities, obsolescence, vendor abstraction, DevSecOps pipeline.

T · 15% weight

Test Quality

MC/DC condition coverage, isolation, assertion density, edge cases, contract tests.

X · 15% weight

Change Risk

Commit size, hotspots, knowledge distribution, breaking-change detection.
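
To make the weights concrete, here is a minimal sketch of a plain weighted mean over the six dimensions, using the percentages listed above. How BenchFit actually rolls dimensions into the overall score is not stated here, so treat the formula as an assumption: on the sample scan's dimension scores it gives about 75.5, while the sample report shows 80.6, which suggests the real aggregation is not this simple.

```python
# Weights as listed for each dimension; they sum to 100%.
WEIGHTS = {"R": 0.20, "C": 0.15, "O": 0.25, "D": 0.10, "T": 0.15, "X": 0.15}


def weighted_score(dimensions):
    """Plain weighted mean of the six dimension scores (assumed formula)."""
    assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9
    return sum(WEIGHTS[key] * dimensions[key] for key in WEIGHTS)


# Dimension scores from the sample scan.
sample = {"R": 94.2, "C": 79.1, "O": 72.5, "D": 69.5, "T": 72.4, "X": 58.9}
print(round(weighted_score(sample), 1))
```

The sketch is still useful for intuition: a five-point gain in Operational (25%) moves a weighted mean by 1.25 points, while the same gain in Dependency (10%) moves it by 0.5.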

Blocking gates

Hard structural limits

Circular dependencies, data-ownership violations, error leakage. A blocking failure forces the grade to F, no matter the score.

Warning gates

One step down each

God files, missing tests, temporal coupling. Each warning failure steps the grade down by one — visible, not fatal.
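
The two gate rules can be sketched as a small function. This assumes the grade order from "The scale" and that warning steps stop at F; `apply_gates` is a hypothetical name, and the real engine may combine gates and score differently.

```python
# Grade order taken from "The scale", best to worst.
SCALE = ["A+", "A", "B+", "B", "C+", "C", "D", "F"]


def apply_gates(base_grade, blocking_failures, warning_failures):
    """Adjust a score-derived grade for gate failures."""
    if blocking_failures > 0:
        return "F"  # any blocking failure forces F, whatever the score
    # Each warning failure moves one step down the scale, never past F.
    index = SCALE.index(base_grade) + warning_failures
    return SCALE[min(index, len(SCALE) - 1)]


print(apply_gates("A", 0, 2))
print(apply_gates("A+", 1, 0))
```

Under these assumptions, an A with two warning failures lands on B, and an A+ with a single blocking failure lands on F.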

BBoM risk

Big Ball of Mud score

A single number for structural entropy — healthy, at-risk, or mud — so you can catch a codebase sliding before it's stuck.

How your code is handled

Cloud scanning without the trust tax

Your source, handled carefully

In cloud mode the CLI uploads a filtered copy of your source and nothing more.

Dotfiles, secrets, and dependency & build directories are excluded before anything leaves your machine.
Your .git never leaves — git statistics are computed locally and only aggregates are sent.
The server deletes your source the moment scoring finishes. It keeps your report card — scores, gates, and the file paths of findings, visible only to you — never your source.
Reports are stored under a pseudonymous account id, and Delete my data in your dashboard removes every report, score, and share link on demand. Questions: privacy@bench.fit.
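
As a rough illustration of the client-side filtering described above, the sketch below decides whether a path would be packaged for upload. The directory names and secret filename patterns are assumptions chosen for the example; the CLI's real exclusion list is not given here.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# Hypothetical exclusion lists, for illustration only.
EXCLUDED_DIRS = {"node_modules", "vendor", "target", "dist", "build", "__pycache__"}
SECRET_PATTERNS = ["*.pem", "*.key", "id_rsa*", "credentials*"]


def should_upload(relative_path):
    """Return True if a repo-relative path passes the sketch's filters."""
    parts = PurePosixPath(relative_path).parts
    if not parts:
        return False
    # Dotfiles and dot-directories (.env, .git/...) never leave the machine.
    if any(part.startswith(".") for part in parts):
        return False
    # Dependency and build directories are skipped wholesale.
    if any(part in EXCLUDED_DIRS for part in parts[:-1]):
        return False
    # Filenames that look like secrets are skipped.
    return not any(fnmatch(parts[-1], pattern) for pattern in SECRET_PATTERNS)


for path in ["src/main.go", ".env", "node_modules/x/index.js", ".git/config"]:
    print(path, should_upload(path))
```

Filtering by path before packaging means excluded files are never read into the upload at all, which matches the "before anything leaves your machine" promise.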

Instrumented end to end

Observability isn't an afterthought here — it's a value we hold ourselves to.

The cloud service is traced end to end with OpenTelemetry — every scan is a span you could follow.
The same operational rigor BenchFit grades your code for, applied to BenchFit itself.
Live at bench.fit.

Plans

Three ways in

Free with a public profile, paid with anonymity, or Enterprise for companies. Every plan gets the full grading engine — plans differ in identity, rate limits, and support.

Community · Free forever

All core features, free. The price is your anonymity: your handle is your public profile. Your scores stay private until you choose to share them.

Local and cloud scanning across Go, Rust, TypeScript & Python
All six dimensions, architecture gates, BBoM risk, dashboards & CI enforcement
Shareable score links — your public scorecard, when you want one
Low-cost add-ons (scorecard galleries, design input) — early access by email

Get started

Vibe · Paid · early access

For the vibe coder shipping fast. Your profile is a handle of your choosing — anonymity preserved. Tiers buy higher scan rates and repo budgets.

Everything in Community, without the public handle
Vibe → Vibe Pro → Vibe Monitor: rising scan rates & repo budgets
Vibe Monitor: headroom to scan every repo on every push from CI — your dashboard tracks the trend
Move between Community and Vibe by email today — self-serve switching ships with public 1.0

Join early access

Enterprise · From $45K/yr

The plan for companies. All commercial and corporate use requires Enterprise — using a personal plan at work violates the terms of use.

$45K/yr — site license, onboarding, priority limits
$125K/yr — scales with use, same as every plan here
Call us — custom: on-prem/self-hosted, compliance packs, deep roadmap collaboration
Enterprise accounts stay Enterprise — pair one with a personal account on the same email

Talk to us

Terms in one line: individuals scan free (Community) or paid-and-anonymous (Vibe); any use by or for a company requires an Enterprise plan. Rate limits apply on every plan so the service stays fast for everyone. See the full Terms of Service and Privacy Policy.

Roadmap

Where BenchFit is headed

AI writes code faster than ever — but architecture degrades silently while every dashboard shows green. BenchFit is the bar, made explicit. Here's the direction.

Shipped

The fitness engine

Go, Rust, TypeScript & Python — one grade across your monorepo
Cloud scanning, GitHub sign-in, per-repo dashboards & portfolio rollup
CI enforcement gates & OpenTelemetry-grade observability

Breadth & the agent loop

More languages: Java, C# and the rest of the top ten
A native agent interface, so your AI drives the fix loop directly
Enterprise self-host, SSO & compliance packs for regulated teams

The horizon

The OS for AI-assisted engineering

A fitness grade every team — and every AI agent — optimizes toward, as fundamental to shipping software as CI/CD. The bar must be explicit, or there is no bar at all.

Get started

Grade your codebase in the next five minutes

$ benchfit login && benchfit -cloud
