# System test scripts (hardware harnesses)

**Audience:** software test engineers writing **lab and bench automation** in this repo.

**Example:** **`scripts/system/pcie_hotswap_harness.py`**, a small, readable pattern you can copy.

**PCIe hot-swap setup (install, INI, JSON, commands):** **`docs/pcie-hotswap-setup.md`**.

---

## Pytest vs system scripts

| | **`tests/` + pytest** | **`scripts/system/`** |
|--|------------------------|------------------------|
| **Goal** | Fast feedback, CI, mocks, gated remote tests | Long runs, real power, cables, enumeration |
| **When it runs** | `python3 -m pytest tests/` on every change | When the bench is wired and someone invokes the script |
| **Failure meaning** | Regression in code or contract | Often **environment** (wrong port, flaky USB, SSH); design logs accordingly |
| **Concurrency** | Usually isolated tests | Often **many logical paths** sharing one USB tree or one SSH host |

Keep **pytest** strict and deterministic. Keep **system scripts** explicit about assumptions (CLI flags, env vars, dry-run) and safe defaults (no silent hardware actions).

---

## What the example script does

**`scripts/system/pcie_hotswap_harness.py`** models a **fronthaul (PCIe) hot-swap campaign**:

1. Build a **`Fabric`**: either load **`--fabric-json`** (**`FabricDefinition`** from disk → **`Fabric.rrhs`**, **`rrh_power_ports`**, fingerprint) or build **N placeholder** **`RadioHead`** instances (each with a **`FrontHaul`**) via **`--paths`** and wrap them in **`Fabric`** (optional concentrator **`ssh_node`**, **`power_lock`**).
2. For each **iteration**, run an **`asyncio.TaskGroup`**: every RRH runs **`one_cycle`** **concurrently** (stressing shared-resource design: one BrainStem, one rig SSH target, and so on).
3. Each cycle: **log** the remove/restore phases (under **`--dry-run`**) or run placeholders for future **`Power`** calls, then optionally **SSH** to the concentrator for a minimal **smoke** command (`uname`, sample `lspci` output).
4. Exit **non-zero** if the async campaign raises (including **`TaskGroup`** child failures), using **`except* Exception`** so **`ExceptionGroup`** surfaces every underlying error.

The script’s module docstring lists **DESIGN_GAPS**: known extension points, so harness scope stays explicit.

---

## Fabric JSON (discovery + bindings, one pass)

Full workflow (INI → discovery → prompts → JSON): **`docs/fabric-builder.md`**. Run **`pip install -e ".[power]"`** on the workstation that sees the Acroname hub.

1. **Fabric builder**: use **`build`** when a lab INI must be loaded first; **`bind`** is the same, except that the INI is optional when the default path is missing:

   ```bash
   python3 -m fiwicontrol.fabric build -o configs/my-fabric.json -c configs/default.ini
   python3 -m fiwicontrol.fabric bind -o configs/my-fabric.json -c configs/default.ini
   ```

2. **Check freshness**: exit **0** only if the on-disk fingerprint matches **live** USB discovery:

   ```bash
   python3 -m fiwicontrol.fabric status -f configs/my-fabric.json
   ```

3. **Harness**: load that graph (optional **`--strict-fabric-ready`** to require **`READY`** status):

   ```bash
   python3 scripts/system/pcie_hotswap_harness.py --fabric-json configs/my-fabric.json --dry-run
   ```

Types live under **`fiwicontrol.fabric`** (**`FabricDefinition`**, **`FabricRRHBinding`**, **`Fabric.binding_cache_status`**).

---

## Concentrator dump (`scripts/system/dump_concentrator.py`)

**Purpose:** capture **this machine’s** concentrator-relevant facts in one place: a CPU summary from **`/proc/cpuinfo`** and (by default) a **local host probe** consisting of **`lspci -tv`**, **`/sys/bus/pci/devices/*/current_link_width`** (and related link fields), and **`dmidecode -t baseboard`** when the binary succeeds (often after **`sudo`**, because SMBIOS is not always readable as a normal user).

**Default output is human text**, not JSON:

- a short CPU block;
- one line with the **total count** of sysfs PCI devices that expose negotiated link width/speed;
- a **Wi‑Fi / wireless-only** table (**`K of N`**) for PCI class **`0x028…`** (network + wireless) with **`w`/`W`** lanes, **GT/s** current/max, **`class`**, and a **chip** column from **`lspci -nn`** (preferred) or the sysfs **`vendor`** / **`device`** hex pair (long chip strings are truncated);
- a **peek** at the first **`--lspci-lines`** rows of **`lspci -tv`** (default **18**, remainder summarized);
- the **first 14 lines** of **`dmidecode -t baseboard`** when that command succeeds (often requires **`sudo`** on Fedora).

| Flag | Meaning |
|------|---------|
| **`--json`** | Emit the full **`ConcentratorPlatformSnapshot.to_json_dict()`** document (large): CPU fields, optional **`lspci_tree`**, compact **`pci_device_links`** as **`{"cols":[...],"rows":[...]}`** (columns **`bdf`**, **`w`**, **`W`**, **`s`**, **`S`**, **`c`** = lanes and GT/s tokens and class), optional **`dmidecode_baseboard`** string. |
| **`--no-host-probe`** | CPU-only; skip **`lspci`**, sysfs PCI enumeration, and **`dmidecode`**. |
| **`--pci-sysdir DIR`** | Override **`/sys/bus/pci/devices`** (testing or nonstandard roots). |
| **`--pci-all`** | After the Wi‑Fi table, append a second table of **other** “interesting” non-wireless links (wide ports / downgrades), still capped by **`--pci-max-rows`**. |
| **`--pci-max-rows N`** | Cap for the optional second table (default **40**). |
| **`--lspci-lines N`** | Lines of **`lspci -tv`** in human output (**0** = omit that block; default **18**). |
| **`--label NAME`** | Shown in the human header only. |
| **`--proc-cpuinfo PATH`** | Override **`/proc/cpuinfo`** (tests or chroots). |

**Examples:**

```bash
# Human summary (default); Wi‑Fi table + short lspci tree + DMI if allowed
python3 scripts/system/dump_concentrator.py

# Same with baseboard text (often needs root on Fedora)
sudo python3 scripts/system/dump_concentrator.py

# Machine JSON for tooling / CI artifacts
python3 scripts/system/dump_concentrator.py --json > /tmp/concentrator.json
```

**Python API:** **`fiwicontrol.concentrator.ConcentratorPlatform`**, **`ConcentratorPlatformSnapshot`**, **`PciDeviceLinkSnapshot`**, **`format_concentrator_platform_snapshot_human()`** (same layout as the script’s default text; optional **`lspci_nn_by_bdf=`** for tests). Implementation lives in **`src/fiwicontrol/concentrator/host.py`** (package **`fiwicontrol.concentrator`**: local workstation facts, parallel to **`fiwicontrol.radio`** for RRH aggregates; not part of fabric JSON).

When the harness (or your script) loads **`--fabric-json`**, it **merges lab INI by default** (same file as **`fiwicontrol.lab`**: **`FIWI_LAB_INI`**, else **`configs/default.ini`** if present). Pass **`--lab-ini PATH`** to point at another file. Merged keys include optional **`[fabric]`** (**`fabric_id`**, **`concentrator`** → **`[machine.*]`** SSH target) and optional **`[fabric.rrh.]`** to override Acroname port / patch panel / module serial for rows already present in the JSON. Use **`--no-lab-ini`** to skip. JSON supplies **`discovery_fingerprint`** and the RRH binding list (key **`rrhs`**; Python: **`FabricDefinition.rrhs`**) from **`fabric build`** / **`bind`** or **`fabric_realize.py --json`**.

---

## Acroname discovery smoke test (`scripts/system/test_acroname_usb_discovery.py`)

Runs BrainStem USB enumeration **per `[machine.*]` row** in the lab INI: **`usb=local`** on the workstation you run from, **`usb=remote`** over SSH (same interpreter contract as **`fiwicontrol.power --discovery-json`**).
Prints a short table per machine, **`brainstem_version`** from discovery JSON (with an SSH fallback pip probe when the remote build omits that field), and a **total module count** across hosts.

```bash
python3 scripts/system/test_acroname_usb_discovery.py
python3 scripts/system/test_acroname_usb_discovery.py -c configs/default.ini --json
python3 scripts/system/test_acroname_usb_discovery.py --local-only
```

Use **`--local-only`** to skip the INI and probe only this machine’s USB. See **`docs/power-control-and-inventory.md`** for INI fields.

---

## Fabric compose + realize (`scripts/system/fabric_realize.py --realize`)

Loads the lab INI, runs **local** Acroname discovery, **`compose_definition`**, builds **`Fabric`**, then **`await fab.realize()`** (strict fingerprint check against live USB). Default **stdout** is an **OK** line plus **`print(fabric)`** (human **`Fabric.__str__`** summary). Pass **`--json`** for **stdout**-only **`FabricDefinition`** JSON after a successful realize. **`-v`** adds discovery / pre-realize fabric lines on **stderr**; **`--no-strict`** passes **`strict=False`** into **`Fabric.realize()`**. **`--realize-discovery-timeout SEC`** bounds Acroname discovery during **`--realize`** (default **120**). **Exit codes** and FDIR semantics: **`docs/fdir.md`** and **`fabric_realize.py --help`** (epilog).

Without **`--realize`**, **`fabric_realize.py`** only composes the definition and prints a **human** workstation report (or **`--json`** / **`-o`** for definition JSON **without** calling **`Fabric.realize()`**). The human report can merge patch-panel labels into the Wi‑Fi PCIe table when **`--patch-panel-json PATH`** is set or when **`_panel.json`** exists beside the lab INI (see **`fiwicontrol.fabric.patch_panel_json`**).

---

## Prerequisites

1. **Editable install** from the repo root (see **`docs/install.md`**):

   ```bash
   cd ~/Code/FiWiControl
   python3 -m pip install -e ".[dev]"
   ```

2. **Python 3.11+**: the example uses **`asyncio.TaskGroup`** and **`except* Exception`**.
3. **Optional SSH to the rig**: same contract as elsewhere, passwordless **`root@`** for **`sshtype="ssh"`**. Optional **`FIWI_SSH_CONFIG`** is documented in **`docs/node-control-asyncio-design.md`**.
4. **Power / Acroname**: not wired in the example yet. When you add **`fiwicontrol.power`**, use **`pip install -e ".[power]"`** and follow **`docs/power-control-and-inventory.md`**.

---

## How to run the example

From the **repository root** (the script prepends **`src`** to **`sys.path`** if needed):

```bash
# Safe: no SSH, no hardware; exercises structure only
python3 scripts/system/pcie_hotswap_harness.py --dry-run --paths 2 --iterations 1

# With saved fabric JSON (after build/bind; merge lab INI at run time)
python3 scripts/system/pcie_hotswap_harness.py --fabric-json configs/my-fabric.json --lab-ini configs/default.ini --dry-run

# With SSH smoke on the concentrator (replace IP)
FIWI_REMOTE_IP=192.168.1.39 python3 scripts/system/pcie_hotswap_harness.py --dry-run --paths 2
# or
python3 scripts/system/pcie_hotswap_harness.py --dry-run --paths 2 --rig-ip 192.168.1.39
```

| Flag | Meaning |
|------|---------|
| **`--fabric-json PATH`** | Load **`FabricDefinition`** from JSON; sets **`Fabric.rrhs`** and **`rrh_power_ports`**. Without it, uses **`--paths`** placeholders. |
| **`--lab-ini PATH`** | Lab INI merged after JSON (default: **`FIWI_LAB_INI`**, else **`configs/default.ini`** if present). |
| **`--no-lab-ini`** | Skip INI merge; JSON only. |
| **`--strict-fabric-ready`** | Exit **2** unless **`Fabric.binding_cache_status`** is **`READY`** (requires live Acroname discovery). Only meaningful with **`--fabric-json`**. |
| **`--dry-run`** | Log only; no programmable power (none hooked up in this skeleton). |
| **`--paths N`** | Placeholder RRH count (ignored when **`--fabric-json`** is set). |
| **`--iterations M`** | Outer loop: run **`M`** sequential **`TaskGroup`** rounds. |
| **`--settle SEC`** | Sleep between conceptual phases inside **`one_cycle`**. |
| **`--rig-ip`** | SSH target; defaults to **`FIWI_REMOTE_IP`**. Overrides JSON concentrator when set. If unset and JSON has no IP, remote checks are skipped. |

---

## Patterns to reuse in your own harness

### 1. Thin `main()`: parse, configure logging, call `asyncio.run`

Keep **I/O policy** (flags, env) in **`main()`**. Keep **async** logic in **`async def`** functions so tests or imports can reuse the coroutines without a second event loop.

### 2. One coroutine per “story”: `one_cycle`, `run_campaign`

Name coroutines after **user-visible steps** (cycle, campaign, smoke). Pass **explicit** parameters (`dry_run`, `settle_s`, `label`) instead of hidden globals.

### 3. Concurrency with `TaskGroup`

When multiple RRHs run together, **`async with asyncio.TaskGroup() as tg:`** + **`tg.create_task(...)`** fails fast and bundles errors in an **`ExceptionGroup`**. Catch with **`except* Exception`** at the boundary that owns **`asyncio.run`**, log each sub-exception, and return a process exit code.

### 4. Dry-run first

Always provide a path that **does not touch hardware** so engineers can validate **logging, SSH, and timing** on a laptop. Real power transitions should be clearly gated (extra flag or explicit “I know this is live”).

### 5. Domain types from the library

Attach **`FrontHaul`** to **`RadioHead`** even when fields are **`None`**: it documents **intent** and keeps the harness aligned with production models. Pass a **`Fabric`** into the async campaign so **shared** resources (concentrator SSH, bench **`Power`**, **`asyncio.Lock`**, **`rrh_power_ports`**) have one home. Prefer **`--fabric-json`** (bound once via **`python3 -m fiwicontrol.fabric bind`**) over ad hoc placeholders; reserve **`--paths`** for laptop-only smoke.

### 6. Remote checks via `ssh_node`

Use **`await node.rexec(cmd="...", ...)`** for one-shot remote work.
For **periodic** sampling, prefer **`Command`** / **`CommandManager`** from **`fiwicontrol.commands`** (see **`docs/node-control-asyncio-design.md`**).

### 7. Document gaps in the script

A short **DESIGN_GAPS** or **TODO** block at the top of the harness documents how **enumeration**, **telemetry**, or **SPC** relate to this script.

---

## Checklist for a new system script

1. [ ] Lives under **`scripts/system/`** with a **`#!/usr/bin/env python3`** shebang.
2. [ ] **`argparse`** (or equivalent) documents every assumption; **`--help`** is accurate.
3. [ ] **`--dry-run`** (or equivalent) when hardware is involved.
4. [ ] **`logging`** at INFO for operator visibility; avoid **`print`** for control flow.
5. [ ] Async entry is **`async def`** + single **`asyncio.run(...)`** from **`main()`**.
6. [ ] Concurrent work uses **`TaskGroup`** (or **`gather`** with a documented error policy).
7. [ ] Non-zero exit on failure; **`ExceptionGroup`** handled if you use **`TaskGroup`**.
8. [ ] README or this doc updated if you add a **new** category of harness or dependency.

---

## Related docs

- **`docs/pcie-hotswap-setup.md`**: PCIe harness prerequisites and JSON generation.
- **`docs/fabric-builder.md`**: lab INI + **`python3 -m fiwicontrol.fabric build`** / **`bind`**.
- **`docs/install.md`**: workstation and rig setup, **`pip install -e`**.
- **`docs/node-control-asyncio-design.md`**: **`ssh_node`**, **`Command`**, timeouts, running tests.
- **`docs/power-control-and-inventory.md`**: Acroname / Monsoon, INI, **`--verify-inventory`**.
- **`docs/spc.md`**: when campaigns need statistical control charts after KPI extraction.
- **`README.md`**: **`scripts/system/`** vs **`tests/`** overview.