[← Back to main docs](index.md)

# Examples Guide

This page maps documented workflows to the example modules that are actually maintained in the repo.

> **Source checkout only below.** The example modules on this page live under
> `examples/` and are not included in the PyPI distribution. If you installed
> with `pip install stormlog`, use the CLI-only validation below and the Python
> snippets in the [Usage Guide](usage.md) instead.

## CLI-only validation for pip users

Use this when you installed from PyPI and do not have the `examples/` package:

```bash
gpumemprof info
gpumemprof track --duration 2 --interval 0.5 --output track.json --format json
gpumemprof analyze track.json --format txt --output analysis.txt
gpumemprof diagnose --duration 0 --output ./diag

tfmemprof info
tfmemprof diagnose --duration 0 --output ./tf_diag
```

This validates the installed CLI and produces artifacts you can load in the TUI
Diagnostics tab.

## Start here

### CLI smoke and environment validation

```bash
python -m examples.cli.quickstart
```

Use this when you want a fast signal that the installed console scripts work.

### Release-style capability sweep

```bash
python -m examples.cli.capability_matrix --mode smoke --target both --oom-mode simulated
```

Use this when you want one command that touches the major launch-validation flows.

## Python API examples

### PyTorch

```bash
python -m examples.basic.pytorch_demo
```

This demo shows:

- CUDA-gated `GPUMemoryProfiler` usage
- `profile_function`
- `profile_context`
- summary reporting

### CUDA allocator-history and attributed HTML

```bash
python -m examples.basic.cuda_native_history_demo --output ./diag_bundle_native_demo
```

This demo shows:

- `stormlog.cuda_native_debug.cuda_memory_history`
- `capture_cuda_snapshot_artifacts`
- the annotated `cuda_allocator_state_history_annotated.html` artifact
- a retained-allocation snapshot that populates the timeline, segment explorer, and active-memory table

### TensorFlow

```bash
python -m examples.basic.tensorflow_demo
```

This demo shows:

- `TFMemoryProfiler`
- context profiling
- TensorFlow result summaries
- snapshot-driven reporting

This example exercises TensorFlow's training-backed path. When you are bringing
up a new GPU stack, start with the workload-backed `/GPU:0` matmul recipe in
[TensorFlow Production Recipes](cookbook/tensorflow.md) before using this demo
as a deeper source-checkout example.

### Advanced tracking

```bash
python -m examples.advanced.tracking_demo
```

This demo shows:

- `MemoryTracker`
- alert callbacks
- watchdog cleanup flow
- exported CSV and JSON tracker events

### Structured phase tracking

```bash
python -m examples.advanced.phase_tracking_demo
```

This demo shows:

- tracker-scoped `phase(...)` context managers
- nested phase boundaries with structured metadata
- exported `phase_enter` / `phase_exit` records
- phase-aware telemetry you can reload in `gpumemprof analyze`

## Scenario modules

These are the closest examples to real operational workflows:

```bash
python -m examples.scenarios.cpu_telemetry_scenario
python -m examples.scenarios.mps_telemetry_scenario
python -m examples.scenarios.oom_flight_recorder_scenario --mode simulated
python -m examples.scenarios.tf_end_to_end_scenario
python -m examples.scenarios.wandb_training_smoke --device cuda --wandb-mode offline
python -m torch.distributed.run --nnodes=1 --nproc_per_node=2 -m examples.scenarios.torchrun_ddp_reference
```

### When to use them

- `cpu_telemetry_scenario`: validate CPU-only telemetry export
- `mps_telemetry_scenario`: validate Apple Silicon / MPS flows
- `oom_flight_recorder_scenario`: rehearse OOM artifact capture safely
- `tf_end_to_end_scenario`: validate TensorFlow monitor, track, analyze, and diagnose flow together
- `wandb_training_smoke`: run a short real PyTorch training loop that writes a
  summary bundle, an append-only sink, offline W&B files, and structured phase
  boundaries you can reload in `gpumemprof analyze` and the TUI
- `torchrun_ddp_reference`: run a reference single-node DDP training job
  derived from the official PyTorch `torchrun` tutorial pattern, with one
  telemetry sink per rank and a shared distributed summary

## Daily workflow mapping

### ML engineer

Run:

```bash
python -m examples.basic.pytorch_demo
```

For TensorFlow, start with the `/GPU:0` matmul recipe in
[TensorFlow Production Recipes](cookbook/tensorflow.md) when you are bringing
up a GPU runtime, then run `python -m examples.basic.tensorflow_demo` once the
training-backed path is aligned.

Then move to the [Usage Guide](usage.md) if you want the same patterns embedded
inside your own code.

### Researcher or debugger

Run:

```bash
python -m examples.advanced.tracking_demo
python -m examples.scenarios.oom_flight_recorder_scenario --mode simulated
```

Then move to the [TUI Guide](tui.md) and [Troubleshooting Guide](troubleshooting.md).

### CI or release owner

Run:

```bash
python -m examples.cli.quickstart
python -m examples.cli.capability_matrix --mode smoke --target both --oom-mode simulated
```

Then move to the [Testing and Validation Guide](testing.md) for the current CI mapping.

## Markdown-only test guides

The old executable guides were replaced by Markdown checklists:

- [Example Test Guides](examples/test_guides/README.md)

Those Markdown guides are also source-checkout only. Pip users should follow
the CLI-only validation above and the Python API snippets in the [Usage Guide](usage.md).

## Notes

- Example modules are preferred over large inline doc snippets whenever a maintained script already exists.
- Some examples are environment-gated. For example, `examples.basic.pytorch_demo` skips itself when CUDA is unavailable.

---

[← Back to main docs](index.md)