CI and Release Qualification

Use this page when the goal is not just to diagnose one run, but to qualify Stormlog’s operational behavior before release or as part of CI.

Source checkout only. The commands on this page use the repository's examples/ package and the benchmark assets under docs/benchmarks/, so they do not work from a plain package install.

Audience: release owners, CI maintainers. Difficulty: intermediate.

Prerequisites

  • install the checkout with the needed extras first: pip install -e ".[dev,test,all]"

  • start from Installation, Examples Guide, and Benchmark Harness if the workflow is unfamiliar

  • a source checkout with the examples/ package available

  • benchmark asset files under docs/benchmarks/

  • enough time and artifact space for the chosen harness profile

Success signal:

  • the chosen smoke or benchmark command completes and writes its expected report artifact
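
For harness runs that write JSON to an --output path, the success signal above can be checked mechanically in CI. The following is a minimal sketch, not part of Stormlog: the helper name report_is_usable is hypothetical, and because this page does not document the report schema, the check only confirms that the file exists, is non-empty, and parses as JSON.

```python
import json
from pathlib import Path


def report_is_usable(path):
    """Return True if `path` exists, is non-empty, and parses as JSON.

    Hypothetical helper: it does not inspect the report's fields,
    because the report schema is not described on this page.
    """
    report = Path(path)
    if not report.is_file() or report.stat().st_size == 0:
        return False
    try:
        json.loads(report.read_text(encoding="utf-8"))
    except (json.JSONDecodeError, UnicodeDecodeError):
        return False
    return True
```

A CI step could call report_is_usable("artifacts/benchmarks/latest_v0.4.json") after the harness command and fail the job when it returns False.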

Choose the qualification path

If the main goal is…                      Start with…
fast CLI sanity signal                    examples.cli.quickstart
broad smoke coverage across surfaces      examples.cli.capability_matrix
always-on operability qualification       examples.cli.benchmark_harness --mode all
enforce regression or budget gates        benchmark harness --check modes

Fast smoke validation

python -m examples.cli.quickstart

Use this when you want one fast signal that the installed CLI entrypoints still behave as expected.

Cross-surface smoke validation

python -m examples.cli.capability_matrix --mode smoke --target both --oom-mode simulated

Use this when you want one command that touches the major PyTorch and TensorFlow validation paths.

Always-on operability qualification

python -m examples.cli.benchmark_harness \
  --profile pr \
  --mode all \
  --output artifacts/benchmarks/latest_v0.4.json

This is the fastest source-checkout qualification path for always-on tracking, bounded history, and retention behavior.
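
The three commands so far can be chained in one CI step, cheapest first, stopping at the first failure. The sketch below is one way to do that, under a stated assumption: it treats a non-zero exit code as failure, which this page does not confirm for every command. The names QUALIFICATION_STEPS and run_until_failure are hypothetical; the command arguments are copied from this page.

```python
import subprocess
import sys

# Commands copied from this page, ordered from cheapest to most expensive.
QUALIFICATION_STEPS = [
    [sys.executable, "-m", "examples.cli.quickstart"],
    [sys.executable, "-m", "examples.cli.capability_matrix",
     "--mode", "smoke", "--target", "both", "--oom-mode", "simulated"],
    [sys.executable, "-m", "examples.cli.benchmark_harness",
     "--profile", "pr", "--mode", "all",
     "--output", "artifacts/benchmarks/latest_v0.4.json"],
]


def run_until_failure(steps, run=subprocess.run):
    """Run each command in order and stop at the first failure.

    Returns the first non-zero exit code, or 0 if every step passed.
    Assumption: each command reports failure through its exit code.
    """
    for argv in steps:
        result = run(argv)
        if result.returncode != 0:
            return result.returncode
    return 0
```

From the repository root, a CI script could end with sys.exit(run_until_failure(QUALIFICATION_STEPS)). The run parameter exists only so the function can be exercised without launching real processes.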

Regression-gated benchmark run

python -m examples.cli.benchmark_harness \
  --check \
  --profile pr \
  --mode all \
  --gate-mode regression \
  --iterations 5000 \
  --baseline docs/benchmarks/v0.4_baseline.json \
  --tolerances docs/benchmarks/v0.4_tolerances.json \
  --output artifacts/benchmarks/latest_v0.4_regression.json

Budget-gated benchmark run

python -m examples.cli.benchmark_harness \
  --check \
  --profile pr \
  --mode all \
  --gate-mode budget \
  --iterations 5000 \
  --budgets docs/benchmarks/v0.4_operating_budget.json \
  --output artifacts/benchmarks/latest_v0.4_budget.json
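
The two gated commands differ only in their gate mode, their input files, and their output path. A CI script that runs both may find it convenient to build them from one function. The sketch below is illustrative: build_gate_command is a hypothetical helper, and every flag and path in it is copied from the two commands above rather than from the harness's own documentation.

```python
import sys


def build_gate_command(gate_mode, profile="pr", iterations=5000):
    """Build the argv list for a gated harness run as shown on this page.

    Hypothetical helper; `gate_mode` must be "regression" or "budget".
    """
    if gate_mode == "regression":
        gate_inputs = [
            "--baseline", "docs/benchmarks/v0.4_baseline.json",
            "--tolerances", "docs/benchmarks/v0.4_tolerances.json",
        ]
    elif gate_mode == "budget":
        gate_inputs = ["--budgets", "docs/benchmarks/v0.4_operating_budget.json"]
    else:
        raise ValueError(f"unsupported gate mode: {gate_mode!r}")

    return [
        sys.executable, "-m", "examples.cli.benchmark_harness",
        "--check",
        "--profile", profile,
        "--mode", "all",
        "--gate-mode", gate_mode,
        "--iterations", str(iterations),
        *gate_inputs,
        "--output", f"artifacts/benchmarks/latest_v0.4_{gate_mode}.json",
    ]
```

The returned list can be passed directly to subprocess.run from the repository root.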

What to archive from CI

  • benchmark harness JSON output

  • sink directories or diagnose bundles for failed runs

  • any saved analysis reports used during triage
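
The items above can be gathered into one directory before the CI system uploads it. The sketch below assumes nothing about Stormlog's layout beyond the artifacts/benchmarks/ output paths used on this page: collect_ci_artifacts is a hypothetical helper, and the caller supplies the sink directory, bundle, and report locations. Sources that do not exist are skipped, so the same list works for passing and failing runs.

```python
import shutil
from pathlib import Path


def collect_ci_artifacts(sources, dest):
    """Copy each existing file or directory in `sources` into `dest`.

    Hypothetical helper. Returns the names that were copied; missing
    sources are skipped rather than treated as errors.
    """
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in map(Path, sources):
        if src.is_dir():
            # Directories such as sink directories are copied recursively.
            shutil.copytree(src, dest / src.name, dirs_exist_ok=True)
        elif src.is_file():
            shutil.copy2(src, dest / src.name)
        else:
            continue
        copied.append(src.name)
    return copied
```

For example, collect_ci_artifacts(["artifacts/benchmarks/latest_v0.4.json"], "ci_upload") copies the harness report if the run produced one.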

What to do next

Troubleshooting

Symptom: a benchmark command is too heavy for local iteration

Likely cause: the current profile is intended for PR or nightly gating.

Fix: start with smoke validation or a single harness mode before running the full gate.

Verify: the shorter command completes and writes a usable report artifact.

