Open source

The tools we built, in the open.

The eval and agent infrastructure behind Studio, released for the teams doing the same work. MIT-licensed, used in production.

public repos

3.4k total stars

contributors

MIT licensed

Flagship

Pinned · most-used

whitescroll/evalkit

The measurement spine for AI-native engineering teams — harness, fixtures, CI reporters, and regression gates. The same eval discipline we install on every engagement, in a library.

$ npm i @whitescroll/evalkit

Read the docs → · View on GitHub →

Stars: ★ 1,240

Language: TypeScript

License: MIT

Used in: every Studio build

Updated: 2 days ago

All repositories

Showing all 6 repositories.

whitescroll/evalkit · TypeScript

A spine for engineering-team eval suites — harness, fixtures, CI reporters.

★ 1.2k · ⑂ 96 · Updated 9d ago

whitescroll/fleet · Go

Orchestration primitives for long-running agentic ops fleets.

★ 840 · ⑂ 71 · Updated 12d ago

whitescroll/mendd · TypeScript

Self-healing post-deploy fix loop — detect, patch, eval, merge.

★ 610 · ⑂ 44 · Updated 2w ago

whitescroll/scorecard · Python

Turn eval runs into a readable scorecard — trends, regressions.

★ 420 · ⑂ 33 · Updated 3w ago

whitescroll/conventions · Markdown

The convention templates we install during Advisory.

★ 280 · ⑂ 52 · Updated 4w ago

whitescroll/harness-examples · TypeScript

Worked examples wiring evalkit into common CI systems.

★ 190 · ⑂ 27 · Updated 5w ago

Repository stats may be slightly stale — showing the last good snapshot.

Built something on these?

Issues, PRs, and questions are read directly. The same people who ship the Studio builds maintain the repos.

View the org on GitHub · Get in touch