PowerIO Guide

PowerIO is compiler infrastructure for power system data. Source formats parse into typed models. Explicit, recorded passes normalize, validate, and lower them, and writers emit any supported target format. The .pio.json package records how a source was interpreted: model kind, provenance, source maps, structured diagnostics, validation, and lowering history. Sparse matrices and graph views are built from the same models for solver and analysis code. This guide records behavior, conventions, and release checks. Rustdoc covers API detail.

The rules these pages document:

same format write back preserves retained source text;
cross format conversion keeps the electrical core and reports losses as warnings;
lowering between model families is an explicit, recorded pass, never an implicit side effect;
matrix builders state sign, tap, shift, shunt, and reference bus conventions;
C, Python, and Julia bindings share the same Rust core.

Transmission readers cover MATPOWER, PSS/E revisions 33 through 35, PowerWorld AUX and PWB, PSLF EPC, PowerModels JSON, egret JSON, pandapower JSON, PyPSA CSV folders, GO Challenge 3 JSON, Surge JSON, GridFM Parquet datasets, and PowerIO JSON snapshots. PowerWorld PWD is a display artifact and uses the display API. Distribution readers and writers live in powerio-dist for OpenDSS, PowerModelsDistribution ENGINEERING JSON, and BMOPF JSON.

Where to look:

Compiler IR: the BalancedNetwork and MulticonductorNetwork model families and the .pio.json package.
PIO JSON schema: the .pio.json field reference, envelope and payload versioning, and row identity.
Format fidelity: numeric conventions, the validation oracles, known limits per format, and the missing generator cost policy.
Matrix outputs and the DC OPF bundle.
Language APIs and Python.
Performance and testing, plus release checks.
Julia bindings: https://github.com/eigenergy/PowerIO.jl.

Rendered API docs (rustdoc) for all crates: https://powerio.dev.

Crates

crate	responsibility
`powerio`	parsers, writers, `Network`, `IndexedNetwork`, normalization, format routing
`powerio-matrix`	sparse matrices, graph views, DC OPF bundle, GridFM datasets
`powerio-dist`	multiconductor distribution model and converters
`powerio-pkg`	`.pio.json` package envelope
`powerio-cli`	command line interface and TUI
`powerio-py`	PyO3 extension for the Python package
`powerio-capi`	C ABI for C, C++, Julia, and other foreign function interfaces

Adding a format means adding one reader or writer at the hub, not pairwise converters. IndexedNetwork is the dense $[0,n)$ analysis view derived from a balanced Network; matrix builders work from that view. Code that maps source bus ids to dense rows must use IndexedNetwork::bus_index; it must not clamp ids or assume 1 based contiguous ids.
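
The indexing rule can be illustrated with a minimal Python sketch. It does not call the PowerIO API: `build_bus_index` and the sample ids are hypothetical stand-ins for what IndexedNetwork::bus_index provides in the Rust core. The point is that dense rows come from an explicit lookup, never from arithmetic on the ids.

```python
def build_bus_index(bus_ids):
    """Map external bus ids to dense rows [0, n) in table order."""
    index = {}
    for row, bus_id in enumerate(bus_ids):
        if bus_id in index:
            raise ValueError(f"duplicate bus id {bus_id}")
        index[bus_id] = row
    return index


# External ids are sparse and unordered, so `bus_id - 1` would be wrong here.
bus_ids = [101, 7, 3004]
bus_index = build_bus_index(bus_ids)

# Branch endpoints are translated through the lookup, not clamped or shifted.
branches = [(101, 3004), (7, 101)]
dense_branches = [(bus_index[f], bus_index[t]) for f, t in branches]
```

An unknown id raises `KeyError` in this sketch, which is the behavior a caller wants: a silent clamp would attach the branch to the wrong row.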

Architecture

PowerIO treats case IO as a compiler pipeline: source formats parse into typed models, passes derive normalized or lowered views, and writers emit target artifacts.

Compiler IR: the IR layers, the BalancedNetwork and MulticonductorNetwork model families, and the NetworkPackage (.pio.json) envelope — explicit model kind, provenance, source maps, structured diagnostics, validation, operating points, and lowering.
PIO JSON schema: the .pio.json field reference and the stability policy. The envelope and the IR payload are versioned independently (schema_version vs payload_schema_version); the payload shape follows the Rust models.

The package is implemented in the powerio-pkg crate. GOC3 package construction uses operating_points to preserve the source time series while keeping the payload itself static.

The PowerIO compiler IR

PowerIO is organized as a compiler for power system data: frontends parse source formats into typed IR, passes normalize and lower it, and backends emit target artifacts. The IR boundaries and the .pio.json package are below. The field reference for the package is in the PIO JSON schema guide.

There is no flattened universal Network mega-struct. PowerIO keeps concrete model families separate. The package wraps one payload at a time with source, diagnostic, validation, and lowering metadata.

Model families

PowerIO keeps two concrete static-grid IR families distinct. They share conventions, not types; code that needs both holds a package, not a union struct.

`BalancedNetwork`

powerio::BalancedNetwork (an alias of powerio::Network) is the scalar positive-sequence model for transmission power flow, OPF, matrices, and graph analysis. Every electrical quantity is a single f64, with no phase or conductor dimension. External bus ids are not dense matrix indices; the dense solver view is derived separately and preserves external ids. Loads and shunts are first class records, not folded onto bus rows.

`MulticonductorNetwork`

powerio_dist::MulticonductorNetwork (an alias of powerio_dist::DistNetwork) is the wire-coordinate model for conductor-level distribution. Bus ids are strings; terminals are ordered string names; every element carries a terminal map; grounding is explicit; units are SI and radians. A neutral is not just another phase; it carries grounding and reduction semantics. Format defaults and inferred facts are tracked, and unsupported objects are preserved rather than dropped.

A balanced model cannot represent conductor-level asymmetry; a multiconductor model carries terminal and grounding data that has no place in a positive sequence struct. The two families never merge into one struct.

BMOPF JSON is the strict exchange format for the distribution family. The .pio.json package uses the same MulticonductorNetwork model and wraps it with compiler metadata: model kind, provenance, source maps, diagnostics, validation, and lowering history.

The compiler package (`.pio.json`)

powerio_pkg::NetworkPackage is the readable envelope. It is the object that records how a source was interpreted. Language bindings can pass the package without guessing whether it holds balanced or multiconductor data. Binary .pio is out of scope until the JSON package settles.

A package always carries:

schema (URL) and schema_version (semver);
producer metadata;
model_kind, explicit and authoritative;
model, the one typed payload, tagged by kind;
origin and sources;
source_maps;
diagnostics;
validation;
summary;
lowering_history;
optional operating_points;
optional derived metadata.

operating_points is a format neutral series of replayable field updates over the package’s single static payload. Materializing one point returns a static package with those updates applied and the series cleared. GO Challenge 3 package construction fills this block from time_series_input: the balanced payload holds the first interval, while every interval is available as an operating point.

For balanced payloads, NetworkPackage::attach_normalized_solver_table_metadata records the compact contract for powerio::Network::to_normalized_solver_tables(): pass name, units, row counts, dense bus ids, reference/component indices, branch to arc indices, and source row provenance. The package does not duplicate the full table rows; it records enough metadata for a compiler cache or sidecar artifact to verify table identity.

Explicit model kind

model_kind is a standalone field. A reader must never infer whether the payload is balanced or multiconductor from which field is present. The payload enum is also tagged by kind, so the payload is self-describing too; NetworkPackage::kind_is_consistent asserts the two agree, and a reader should reject a package where they do not.

Payload stability

The envelope and the payload are versioned independently. The nested balanced_network / multiconductor_network payloads are serde snapshots of the PowerIO Rust IR, declared by the package’s payload_schema / payload_schema_version fields: additive IR growth bumps the payload minor, moves or removals bump its major, and a reader rejects a foreign major before computing on payload fields. Payload rows carry stable uid identities that operating point updates resolve against. See the PIO JSON schema guide.

Provenance and source maps

Origin distinguishes an in-memory model, a single file (with or without retained source), a folder dataset, a partially decoded binary, a derived product, or a composite. A SourceMapEntry points from a payload field to its source with an element_path, a SourceRef into a declared source, a mapping_kind (exact, defaulted, inferred, converted_units, lowered, aggregated, split, synthetic, retained_extra), and a confidence. Balanced source_ref.field values use canonical payload field names. Parser bookkeeping that should not live in the IR payload (retained source text, default-materialization records) is lifted into this layer rather than the raw payload.

Structured diagnostics

Every finding carries a stable dotted code, a severity (debug, info, warning, error, fatal; worst-last so a set’s dominant severity is its max), the stage it came from, a human message, and where known an element_path, a source_ref, a details object, and a suggested_action. Human-readable warnings are rendered from these, not the other way around. Codes are namespaced by leading segment (PARSE, READ, IR, VALIDATE, FIDELITY, LOWER, EMIT, BINDING, PARTNER, PERF), with the conventional shape NAMESPACE.SOURCE_OR_TARGET.SPECIFIC.
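
Because severities are ordered worst-last, the dominant severity of a set is a plain maximum over that order. A minimal Python sketch, assuming diagnostics are available as dicts with `code` and `severity` keys; the two codes are hypothetical examples of the NAMESPACE.SOURCE_OR_TARGET.SPECIFIC shape, not codes PowerIO is known to emit:

```python
# Severity names from the text, ordered so that the worst comes last.
SEVERITY_ORDER = ["debug", "info", "warning", "error", "fatal"]


def dominant_severity(diagnostics):
    """Return the worst severity in a list of diagnostic dicts, or None if empty."""
    if not diagnostics:
        return None
    return max((d["severity"] for d in diagnostics), key=SEVERITY_ORDER.index)


def namespace(code):
    """Return the leading segment of a dotted diagnostic code."""
    return code.split(".", 1)[0]


diagnostics = [
    {"code": "PARSE.MATPOWER.EXAMPLE_A", "severity": "info"},      # hypothetical code
    {"code": "FIDELITY.PSSE.EXAMPLE_B", "severity": "warning"},    # hypothetical code
]
worst = dominant_severity(diagnostics)
```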

Lowering

Each pass that transforms one model into another appends a LoweringRecord (input and output kind, options, assumptions, approximations, dropped fields, diagnostics, validation status) to lowering_history. The record makes the transformation explicit.

powerio_pkg::lower_multiconductor_to_balanced lowers transparent three phase MulticonductorNetwork values into BalancedNetwork using the FortescuePowerInvariant sequence convention. Neutral conductors are Kron reduced before the sequence transform. One wire and two wire inputs, transformers, untyped objects, missing phase references, and closed switches return structured LOWER.MULTI_TO_BALANCED.* diagnostics. The package method NetworkPackage::lower_multiconductor_to_balanced returns a derived balanced package and appends the record. This pass is explicit only; readers, writers, matrix builders, bindings, and MCP operations do not run it implicitly.
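
For orientation, the two steps named above correspond to the standard textbook forms below. This is a sketch under assumptions: the partition subscripts ($p$ for phases, $n$ for neutrals), the operator $a$, and the matrix layout are illustrative and not taken from the PowerIO source, and the Kron step assumes the neutral voltage is held at zero.

$$
Z_{abc} = Z_{pp} - Z_{pn} Z_{nn}^{-1} Z_{np}
$$

$$
T = \frac{1}{\sqrt{3}} \begin{bmatrix} 1 & 1 & 1 \\ 1 & a^2 & a \\ 1 & a & a^2 \end{bmatrix}, \qquad a = e^{j 2\pi/3}, \qquad Z_{012} = T^{\mathsf{H}} Z_{abc} T
$$

With the $1/\sqrt{3}$ scaling, $T$ is unitary, so complex power is the same in phase and sequence coordinates. For a symmetric line with self impedance $Z_s$ and mutual impedance $Z_m$, the positive sequence entry reduces to $Z_1 = Z_s - Z_m$.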

Operating point materialization

NetworkPackage::materialize_operating_point(index) clones the package, applies one point’s field updates to the typed payload, clears operating_points, drops stale source maps and diagnostics for changed fields, recomputes validation, and records a LoweringRecord with pass = "materialize-operating-point". If the package already carried normalized solver table metadata, the metadata is rebuilt for the updated static payload.

Versioning

schema_version is semver. Optional additive envelope fields land without a version change (operating_points did); the minor bumps when a reader needs to depend on a field being present; field moves bump the major or ship a migration. Unknown future top-level fields are tolerated on read (ignored), so a package from a newer producer still deserializes when the schema_version major version matches. A different major version is rejected before payload use.

The `.pio.json` schema

.pio.json is the serialized form of powerio_pkg::NetworkPackage: a versioned envelope around one PowerIO IR payload. The envelope shape and stability policy are below. The crate is the implementation; compiler-ir.md is the architecture note.

Two stability tiers

A .pio.json file has two parts with different stability promises.

The envelope — every field except model. This is the versioned, documented surface: schema, schema_version, producer, model_kind, origin, sources, source_maps, diagnostics, validation, summary, lowering_history, operating_points, derived. Its shape changes only under the versioning policy below.
The payload — the model field’s balanced_network / multiconductor_network object: the serde snapshot of the PowerIO Rust IR (powerio::Network / powerio_dist::DistNetwork). The payload is a declared contract of its own, named by the top-level payload_schema URL and versioned by payload_schema_version. A consumer that computes on payload fields pins the payload version; a tool that routes or audits packages pins the envelope version and can keep treating the payload as opaque.

The two versions are independent because they change at different rates and break different consumers: the payload grows whenever the IR grows (a minor payload_schema_version bump), while the envelope bookkeeping barely moves.

The payload is PowerIO’s own IR schema for both model kinds. Interchange formats stay at the converter boundary: for distribution models the multiconductor payload is the same model the BMOPF reader and writer translate to and from, and powerio convert --to bmopf-json emits a standalone BMOPF exchange file when one is needed. The BMOPF schema itself (bmopf-report) never defines the payload.

Versioning policy (envelope)

schema_version is semver. The current value is 0.1.1; the schema URL is https://powerio.dev/schema/pio-package/0.1.
Optional additive envelope fields (a reader that ignores them loses nothing it relied on before) land without a version change; operating_points landed this way. The minor version bumps when a reader needs to depend on a field being present.
Envelope field moves or removals bump the major version, or ship a migration.
A reader tolerates unknown later top-level fields (they are ignored, not an error), so a package from a newer producer still loads. A later version can preserve them in an extras map instead of dropping them.
A reader accepts same major schema_version values and rejects a different major version before using the payload.
Every package states producer.version and schema_version.
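
The read-side rule of this policy can be sketched in a few lines of Python. This is an illustration under assumptions, not the powerio-pkg implementation: `accept_envelope` is a hypothetical helper, and the semver parser is simplified to plain `MAJOR.MINOR.PATCH` with no pre-release tags.

```python
READER_SCHEMA_VERSION = "0.1.1"  # current envelope version stated in the text


def parse_semver(text):
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of ints; raise ValueError otherwise."""
    parts = text.split(".")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"not a semver string: {text!r}")
    return tuple(int(p) for p in parts)


def accept_envelope(package, reader_version=READER_SCHEMA_VERSION):
    """Accept a same-major envelope and return its parsed version."""
    # A missing schema_version defaults to the current version on read.
    found = parse_semver(package.get("schema_version", reader_version))
    expected = parse_semver(reader_version)
    if found[0] != expected[0]:
        raise ValueError(f"unsupported envelope major version: {found[0]}")
    # Unknown top-level fields are never inspected, so they cannot cause an error.
    return found


newer = {"schema_version": "0.2.0", "some_future_field": {"x": 1}}
accepted = accept_envelope(newer)
```

The check runs before anything touches `model`, which matches the requirement that a different major is rejected before payload use.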

Versioning policy (payload)

payload_schema names the payload contract per model kind: https://powerio.dev/schema/pio-payload-balanced/1 and https://powerio.dev/schema/pio-payload-multiconductor/1. The URLs are identifiers, not fetch locations (the JSON Schema $id convention).
payload_schema_version is semver, currently 1.0.0 for both kinds. Additive optional fields bump the minor; field moves or removals bump the major.
A reader rejects a payload_schema_version with a different major (or one that does not parse as semver) before computing on payload fields. A package that omits payload_schema and payload_schema_version (every package written before 0.1.1) is accepted; such payloads predate the declared contract.
The payload field tables are the rustdoc of powerio::Network and powerio_dist::DistNetwork; the serde snapshot of those structs is the normative shape.
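
A minimal Python sketch of the payload-side check, assuming the package is already loaded as a dict. `check_payload_version` and its return labels are hypothetical, and the semver test is simplified to three numeric segments.

```python
READER_PAYLOAD_MAJOR = 1  # major of the current payload_schema_version, 1.0.0


def check_payload_version(package, reader_major=READER_PAYLOAD_MAJOR):
    """Return 'undeclared' or 'ok'; raise ValueError when the payload must be rejected."""
    version = package.get("payload_schema_version")
    if version is None:
        # Packages written before 0.1.1 carry no declared payload contract.
        return "undeclared"
    parts = version.split(".")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"payload_schema_version is not semver: {version!r}")
    if int(parts[0]) != reader_major:
        raise ValueError(f"foreign payload major version: {parts[0]}")
    return "ok"


status_old = check_payload_version({})
status_minor = check_payload_version({"payload_schema_version": "1.3.0"})
```

A consumer that only routes or audits packages can skip this check and treat the payload as opaque, as described in the two stability tiers.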

Row identity

Every row of the balanced payload tables (buses, loads, shunts, branches, switches, generators, storage, hvdc, transformers_3w) carries a uid string: the source record uid where the format defines one (GOC3), and a {table}:{row} value synthesized at package build otherwise. A synthesized uid records the row the element had when the package was built and sticks to the element from then on. Uids are unique per table; a duplicate is a validation error. Operating point updates resolve against these identities (below). Rows in packages written before 0.1.1 carry no uid, which is what keeps their row-addressed operating points valid.
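
The identity rule can be sketched in Python as follows. The helper names are hypothetical, and the sketch assumes the synthesized row number is zero based, which the text does not state explicitly.

```python
def assign_uids(table_name, rows):
    """Keep a source uid where present; otherwise synthesize '{table}:{row}'."""
    out = []
    for row_index, row in enumerate(rows):
        row = dict(row)  # copy so the caller's rows are left untouched
        row.setdefault("uid", f"{table_name}:{row_index}")  # zero based row is an assumption
        out.append(row)
    return out


def duplicate_uids(rows):
    """Return the sorted uids that appear more than once in one table."""
    seen, dupes = set(), set()
    for row in rows:
        uid = row["uid"]
        if uid in seen:
            dupes.add(uid)
        seen.add(uid)
    return sorted(dupes)


loads = assign_uids("loads", [{"p": 1.0}, {"uid": "device_1", "p": 2.0}])
uids = [row["uid"] for row in loads]
```

Because the uid is stored on the row, it stays with the element even if later edits reorder the table, which is what lets operating point updates resolve by identity.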

Explicit model kind

model_kind is a standalone top-level field and is authoritative. A reader must branch on it and must not infer the payload kind from which field is present. The payload is additionally self-describing: model is tagged by kind, so model.kind and model_kind carry the same value. NetworkPackage::kind_is_consistent asserts the two agree; a reader should reject a package where they disagree.

"model_kind": "balanced",
"model": { "kind": "balanced", "balanced_network": { "...": "..." } }

model_kind values: balanced, multiconductor (the enum is non-exhaustive; later families can be added).
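
A reader-side sketch of this rule in Python, using only the standard library. `payload_for` is a hypothetical helper and not a PowerIO binding; it branches on `model_kind`, checks the tag in `model.kind`, and only then reaches for the `<kind>_network` object.

```python
import json


def payload_for(package):
    """Return (kind, payload) after checking model_kind against model.kind."""
    kind = package["model_kind"]  # authoritative top-level field
    model = package["model"]
    if model.get("kind") != kind:
        raise ValueError(f"model_kind {kind!r} disagrees with model.kind {model.get('kind')!r}")
    key = f"{kind}_network"
    if key not in model:
        raise ValueError(f"model has no {key!r} payload")
    return kind, model[key]


text = (
    '{"model_kind": "balanced",'
    ' "model": {"kind": "balanced", "balanced_network": {"buses": []}}}'
)
kind, payload = payload_for(json.loads(text))
```

Note that the helper never probes for `balanced_network` or `multiconductor_network` to decide the kind; the payload key is derived from `model_kind`, not the other way around.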

Envelope reference

field	type	required	notes
`schema`	string (URL)	yes	identifies the package format; defaults to the current URL on read
`schema_version`	string (semver)	yes	envelope version; defaults to current on read
`producer`	object	yes	`{tool, version, git_commit?, features[]}`
`package_id`	string	no	stable content id, e.g. `"sha256:..."`; unset by the scaffold
`created_at`	string (RFC 3339)	no	unset by default for deterministic output
`model_kind`	enum	yes	`balanced` | `multiconductor`; authoritative
`payload_schema`	string (URL)	no	declared payload contract for `model_kind`; absent on pre-0.1.1 packages
`payload_schema_version`	string (semver)	no	payload version; a different major is rejected on read
`model`	object	yes	`{kind, <kind>_network}`; follows the Rust model payload
`origin`	object	yes	tagged by `kind`: `in_memory` | `file` | `folder` | `binary_file` | `derived` | `composite`
`sources`	array	no	declared source artifacts: `{id, kind, path?, format?, hash?}`
`source_maps`	array	no	`{element_path, source_ref, mapping_kind, confidence}`
`diagnostics`	array	no	structured findings (see below)
`validation`	object	yes	`{status, counts, passes[]}`
`summary`	object	yes	`{elements{}, topology?, units?}`
`lowering_history`	array	no	`LoweringRecord` per pass
`operating_points`	object	no	replayable updates over the one static payload
`derived`	object	no	optional matrix stats, normalized solver table metadata, and cache keys

Operating points

operating_points records a time axis and an ordered list of payload field updates. Each update in a point names a table, a row identity and/or a zero based row, and the fields to overwrite. Materializing a point clones the static payload, applies those field updates, and clears operating_points in the returned package.

Updates resolve by identity first. When the referenced table carries uid values, element.source_uid is authoritative: it selects the row, a present element.row must agree with the resolved row, and an unknown or duplicated uid is an error (reported by validation and fatal to materialization). A producer that knows the identity can omit row entirely. When the table carries no uids (packages written before 0.1.1), source_uid is advisory and row addresses the update alone. An update may not overwrite uid itself, and an element ref with neither row nor source_uid does not parse.

The block shape is:

field	type	notes
`time_axis.periods`	integer	number of available operating points
`time_axis.duration_hours`	array of numbers	optional per period duration
`time_axis.labels`	array of strings	optional labels, such as `"1"`, `"2"`, …
`points[]`	array	one replayable state
`points[].index`	integer	zero based period index; addresses `time_axis.duration_hours` and `time_axis.labels`
`points[].updates[]`	array	row field updates to apply for this point
`updates[].element.table`	string	payload table name, such as `generators`, `loads`, `branches`, or `hvdc`
`updates[].element.row`	integer	zero based row; optional when `source_uid` is present, then a consistency check
`updates[].element.source_uid`	string	the target row’s payload identity (`uid`); authoritative when the table carries uids
`updates[].fields`	object	field names and JSON values to overwrite
`metadata`	object	optional series or point metadata

GO Challenge 3 packages use this block for the scheduling time series. The static model reflects the first interval that can be represented by Network; operating_points carries replayable updates for every interval. NetworkPackage::materialize_operating_point(index) returns a new static package with origin.kind = "derived" and origin.pass = "materialize-operating-point".

"operating_points": {
  "time_axis": { "periods": 2, "duration_hours": [1.0, 1.0], "labels": ["1", "2"] },
  "points": [
    { "index": 0, "updates": [] },
    { "index": 1,
      "updates": [
        { "element": { "table": "loads", "row": 0, "source_uid": "device_1" },
          "fields": { "p": 12.5, "q": 3.2 } }
      ] }
  ],
  "metadata": { "source_format": "goc3-json" }
}
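
The resolution and materialization rules can be sketched in Python over a package held as a dict. This is an illustration only: `resolve_row` and `materialize` are hypothetical names, and the sketch leaves out the steps the real NetworkPackage::materialize_operating_point also performs (dropping stale source maps and diagnostics, recomputing validation, rewriting origin, and appending a LoweringRecord).

```python
import copy


def resolve_row(rows, element):
    """Resolve an element ref to a row index, by uid first and by row otherwise."""
    uid, row = element.get("source_uid"), element.get("row")
    if uid is None and row is None:
        raise ValueError("element ref needs row or source_uid")
    table_has_uids = any("uid" in r for r in rows)
    if table_has_uids and uid is not None:
        matches = [i for i, r in enumerate(rows) if r.get("uid") == uid]
        if len(matches) != 1:
            raise ValueError(f"unknown or duplicated uid {uid!r}")
        if row is not None and row != matches[0]:
            raise ValueError("row disagrees with the row resolved from source_uid")
        return matches[0]
    if row is None:
        raise ValueError("table carries no uids, so row is required")
    return row  # source_uid is advisory here


def materialize(package, index):
    """Return a static copy with one point applied and operating_points removed."""
    out = copy.deepcopy(package)
    series = out.pop("operating_points")
    point = next(p for p in series["points"] if p["index"] == index)
    payload = out["model"][f'{out["model_kind"]}_network']
    for update in point["updates"]:
        if "uid" in update["fields"]:
            raise ValueError("an update may not overwrite uid")
        rows = payload[update["element"]["table"]]
        rows[resolve_row(rows, update["element"])].update(update["fields"])
    return out


package = {
    "model_kind": "balanced",
    "model": {"kind": "balanced",
              "balanced_network": {"loads": [{"uid": "device_1", "p": 10.0, "q": 2.0}]}},
    "operating_points": {
        "time_axis": {"periods": 2},
        "points": [
            {"index": 0, "updates": []},
            {"index": 1, "updates": [
                {"element": {"table": "loads", "row": 0, "source_uid": "device_1"},
                 "fields": {"p": 12.5, "q": 3.2}}]},
        ],
    },
}
static = materialize(package, 1)
```

The deep copy keeps the original package and its series intact, so any other point can still be materialized from it afterwards.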

Derived metadata

derived.normalized_solver_tables records the compact identity metadata for powerio::Network::to_normalized_solver_tables() without embedding every table row in the package. The full tables are a derived artifact; this metadata lets a compiler cache prove it was built from the same lowering pass and row order.

The block carries:

pass: "balanced-to-normalized-solver-tables";
units: per unit power, per unit voltage, radian angles, per unit impedance and admittance, zero based dense indices;
row_counts: counts for buses, loads, shunts, branches, switches, arcs, generators, storage, and HVDC rows;
bus_ids, reference_bus_indices, and component_labels;
branch_from_arc_indices and branch_to_arc_indices;
source_rows: source row indices for rows that survived normalization, with null for synthetic rows such as 3-winding star buses and branches.

Diagnostics

Each diagnostic carries a stable dotted code, a severity (debug, info, warning, error, fatal; ordered worst-last), the stage it came from (parse, read, canonicalize, validate, lower, emit, bind, partner), a human message, and where known an element_path, a source_ref, a details object, a suggested_action, and a safe_to_ignore list. Code namespaces by leading segment: PARSE, READ, IR, VALIDATE, FIDELITY, LOWER, EMIT, BINDING, PARTNER, PERF.

Source maps

A source_map entry records where a canonical field came from: an element_path (a JSON pointer, or a best-effort locator in v0.1), a source_ref into a declared source, a mapping_kind (exact, defaulted, inferred, converted_units, lowered, aggregated, split, synthetic, retained_extra), and a confidence (exact, high, medium, low). Balanced packages emit source maps for stable bus, load, shunt, branch, and generator fields. Balanced source_ref.field values use the same canonical field names as the payload, so they can be compared directly with element_path. When a source format folds several canonical elements into one source row, the source map records that relation with another mapping kind; MATPOWER load and shunt fields use mapping_kind = split and point to the bus record while keeping fields such as p, q, g, and b. Values that the source format does not carry are not mapped as exact; MATPOWER base_frequency has no source map. When a multiconductor network is packaged, its defaulted fields lift into source maps with mapping_kind = defaulted, and its retained source becomes origin.retained_source. Validation diagnostics attach the matching source_ref when the package has a source map for the reported field.

NetworkPackage::lower_multiconductor_to_balanced(options) returns a new balanced package with origin.kind = "derived" and origin.pass = "multiconductor-to-balanced". It preserves the parent lowering_history and appends a LoweringRecord whose options, assumptions, approximations, dropped fields, diagnostics, and validation status describe the pass. Lowered balanced source maps use lowered, aggregated, converted_units, synthetic, and defaulted mapping kinds. The pass is never implicit during package readback, format conversion, matrix construction, bindings, or MCP operations.

Example

{
  "schema": "https://powerio.dev/schema/pio-package/0.1",
  "schema_version": "0.1.1",
  "producer": { "tool": "powerio", "version": "0.5.1" },
  "model_kind": "multiconductor",
  "payload_schema": "https://powerio.dev/schema/pio-payload-multiconductor/1",
  "payload_schema_version": "1.0.0",
  "model": {
    "kind": "multiconductor",
    "multiconductor_network": {
      "base_frequency": 60.0,
      "loads": [
        { "name": "l1", "bus": "b1", "configuration": "wye",
          "voltage_model": { "model": "zip", "v_nom": [230.0], "alpha_z": [0.5], "...": "..." } }
      ]
    }
  },
  "origin": { "kind": "file", "format": "dss", "retained_source": true },
  "sources": [ { "id": "src0", "kind": "file", "format": "dss" } ],
  "source_maps": [
    { "element_path": "/model/multiconductor_network/vsource.source#basekv",
      "source_ref": { "source_id": "src0", "field": "basekv" },
      "mapping_kind": "defaulted", "confidence": "high" }
  ],
  "validation": { "status": "ok", "counts": { "fatal": 0, "error": 0, "warning": 0, "info": 0, "debug": 0 } },
  "summary": { "elements": { "buses": 1, "loads": 1 }, "units": { "power": "W/var", "angle": "radians" } }
}

Format fidelity and validation

This document covers how powerio’s readers and writers are validated, the conventions they follow, and their known limits. The headline fidelity table is in the top level README; the conventions and the proof behind that table are below.

Conventions

powerio’s numeric conventions match MATPOWER and PowerModels.jl. The reference implementations and the matching powerio code:

Quantity	Convention	Reference	powerio
Bus type codes	$1 = \mathrm{PQ}$, $2 = \mathrm{PV}$, $3 = \mathrm{ref}$, $4 = \mathrm{isolated}$	MATPOWER `idx_bus`	`network::BusType`
Impedance, susceptance	per unit on `baseMVA`, never rescaled	MATPOWER `idx_brch` (`BR_B` already per unit)	`format::matpower`
Branch terminal admittance	MATPOWER `BR_B` splits half to each end; richer sources use canonical `g_fr`/`b_fr`/`g_to`/`b_to`; one-value targets receive the total susceptance projection	PowerModels `matpower.jl`; MATPOWER `idx_brch`	`network::BranchCharging`, `Branch::terminal_charging`
Tap ratio	`0` means a line (treated as `1`); nonzero is a transformer	MATPOWER `idx_brch` `TAP`	`Branch::effective_tap`
Phase shift, angle	degrees in the model; PowerModels JSON carries radians	PowerModels `make_per_unit!`	`format::powermodels`
Angle limits	`angmin`/`angmax` default ±360 (unconstrained)	MATPOWER `idx_brch` `ANGMIN`/`ANGMAX`	`Branch::has_angle_limits`
pandapower/PyPSA impedance	line `r/x` are converted between per unit and ohms with $Z_{\mathrm{base}} = V_{\mathrm{kV}}^2 / \mathrm{baseMVA}$; pandapower line charging is capacitance per km (`c_nf_per_km`, converted via $2\pi f \ell Z_{\mathrm{base}}$); PyPSA line `b` is siemens	pandapower PPC conversion, PyPSA static components	`format::pandapower`, `format::pypsa`
dcline `Pt`/`Qf`/`Qt`	sign flips vs MATPOWER	PowerModels `matpower.jl`	`format::powermodels`
Generator cost	$c_2 p^2 + c_1 p$ maps to $q = 2c_2$, $c = c_1$; coefficients high order first	MATPOWER `idx_cost`, egret `matpower_parser`	`GenCost::quadratic`
`source_id`	`["bus", id]` for bus-tied elements	PowerModels `matpower.jl`	`format::powermodels`
PSLF shunts	EPC `pu_mw`/`pu_mvar` are per unit on `sbase`; `Network::Shunt` stores MW/MVAr at $V = 1$	paired EPC/RAW case checks	`format::pslf`
GO Challenge 3 time series	`Network` stores the first interval as a static case; `.pio.json` packages carry replayable later intervals in `operating_points`	Rust GOC3 package tests	`format::goc3`, `powerio_pkg::operating`
Surge angles	Surge JSON carries voltage angles, phase shifts, and angle limits in radians; `Network` stores degrees	Rust Surge round trip tests	`format::surge`
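
Three of the conventions in the table reduce to short formulas. The Python sketch below works them through under stated assumptions: the function names are hypothetical and are not the powerio API, and the charging formula is the usual $b = 2\pi f \, C \, \ell \, Z_{\mathrm{base}}$ with the capacitance converted from nF to F.

```python
import math


def z_base_ohm(v_kv, base_mva):
    """Base impedance in ohms: V_kV^2 / baseMVA."""
    return v_kv ** 2 / base_mva


def ohm_to_pu(z_ohm, v_kv, base_mva):
    return z_ohm / z_base_ohm(v_kv, base_mva)


def pu_to_ohm(z_pu, v_kv, base_mva):
    return z_pu * z_base_ohm(v_kv, base_mva)


def charging_pu(c_nf_per_km, length_km, f_hz, v_kv, base_mva):
    """Total line charging in per unit from capacitance per km."""
    b_siemens = 2.0 * math.pi * f_hz * (c_nf_per_km * 1e-9) * length_km
    return b_siemens * z_base_ohm(v_kv, base_mva)


def effective_tap(tap):
    """MATPOWER convention: a stored tap of 0 means a line, treated as 1."""
    return 1.0 if tap == 0 else tap


# Illustrative numbers only: a 230 kV line on a 100 MVA base.
zb = z_base_ohm(230.0, 100.0)
x_pu = ohm_to_pu(52.9, 230.0, 100.0)
b_pu = charging_pu(10.0, 100.0, 50.0, 230.0, 100.0)
```

With these numbers the base impedance is 529 ohms, so a 52.9 ohm reactance is 0.1 per unit.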

egret’s own MATPOWER parser uses the same reductions (bus type as matpower_bustype, polynomial coefficients reversed to a {degree: coefficient} map, piecewise to [[mw, cost], ...], impedances left per unit), which is why a MATPOWER case taken through powerio to egret JSON matches egret’s direct import.
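
The coefficient reversal can be shown with a minimal Python sketch. The function names are hypothetical, and the quadratic mapping assumes the target objective has the form $\tfrac{1}{2} q p^2 + c p$, which is the reading under which $q = 2c_2$ and $c = c_1$ hold; neither helper is taken from powerio or egret.

```python
def poly_cost_to_degree_map(coefficients_high_first):
    """Turn MATPOWER-style [c_n, ..., c_1, c_0] into a {degree: coefficient} map."""
    n = len(coefficients_high_first)
    return {n - 1 - i: c for i, c in enumerate(coefficients_high_first)}


def quadratic_terms(c2, c1):
    """Return (q, c) such that c2*p^2 + c1*p == 0.5*q*p^2 + c*p."""
    return 2.0 * c2, c1


# Illustrative coefficients, high order first: 0.25 p^2 + 5 p + 150.
degree_map = poly_cost_to_degree_map([0.25, 5.0, 150.0])
q, c = quadratic_terms(0.25, 5.0)
```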

Validation

The harness script benchmarks/run_validation.sh checks powerio against five independent tools. Every classic text reader and writer runs under an oracle: the conversion matrix covers MATPOWER, PSS/E, and egret sources against all five legacy text targets, every PowerWorld output is read back and bridged to PowerModels JSON, and the PMread leg covers the PowerModels JSON read side. pandapower JSON and PyPSA CSV folders have dedicated import validators because pandapower has its own JSON schema and PyPSA is a directory format. Both validators cover the write direction only; the pandapower JSON and PyPSA readers have no external oracle. Those two readers, GO Challenge 3 JSON, Surge JSON, and the remaining source/target pairs (PowerModels JSON and PowerWorld sources into the non-PowerModels targets) rest on the Rust round trip suite.

PowerModels.jl (validate_powermodels.jl, validate_psse.jl, core_json.jl). Reads MATPOWER, PowerModels JSON, and PSS/E. The MATPOWER to PowerModels JSON path is checked field by field after per unit normalization; the others by element counts and demand/generation/shunt totals.
egret (validate_egret.py). The oracle for egret output, which PowerModels cannot read: it loads powerio’s egret JSON with egret.data.model_data.ModelData and compares counts, totals, and generator cost curves.
ExaPowerIO.jl (validate_exapowerio.jl). Reads MATPOWER through powerio’s C ABI and compares value for value.
pandapower (validate_pandapower.py, validate_pandapower_converter.py). Cross-checks MATPOWER parse/$Y_{\mathrm{bus}}$ and imports powerio’s pandapower JSON output back into pandapower, comparing counts and $Y_{\mathrm{bus}}$.
PyPSA (validate_pypsa.py). Imports powerio’s PyPSA CSV folder output and checks counts, totals, line r/x/b rebased from ohms on the bus0 voltage, and transformer r/x/tap_ratio/s_nom rebased from the transformer s_nom base; a line/transformer split mismatch fails the case.

The conversion matrix

benchmarks/validate_matrix.py converts each source to every legacy text target and checks the electrical core of the output (bus/branch/generator counts and the per unit demand, generation, and shunt totals) against the source’s own core, read by an independent oracle. The diagonal is checked byte exact: writing back to the source format reproduces the file. Sources use the real native files where they exist (the vendored PSS/E .raw and egret .json) and representative MATPOWER cases otherwise: basic (case9), shunts and transformers (case14, case30), size (case118, case2869pegase), HVDC with a mixed piecewise/polynomial gencost (t_case9_dcline), and a piecewise-cost case (pglib_opf_case5_pjm).

All 65 legacy text cells pass (13 source cases × 5 targets). The core is preserved by every writer regardless of fidelity tier, so it is the invariant checked across the whole matrix; cost, HVDC, and angle limits are tier specific and covered by the dedicated checks above and the Rust suite. The pandapower JSON and PyPSA CSV validators run alongside this matrix and are reported as separate legs.
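
The per-cell check can be pictured with a short Python sketch. It is not benchmarks/validate_matrix.py: the dictionary keys, the tolerances, and the sample totals are assumptions made for illustration. Counts must match exactly, and per unit totals must match within a tolerance.

```python
import math

COUNT_KEYS = ("buses", "branches", "generators")
TOTAL_KEYS = ("demand_pu", "generation_pu", "shunt_pu")


def core_mismatches(source_core, target_core, rel_tol=1e-9, abs_tol=1e-12):
    """Return the names of electrical-core quantities that differ."""
    mismatched = [k for k in COUNT_KEYS if source_core[k] != target_core[k]]
    mismatched += [
        k for k in TOTAL_KEYS
        if not math.isclose(source_core[k], target_core[k], rel_tol=rel_tol, abs_tol=abs_tol)
    ]
    return mismatched


# Made-up numbers for a small case; only the comparison logic matters here.
source = {"buses": 9, "branches": 9, "generators": 3,
          "demand_pu": 3.15, "generation_pu": 3.2, "shunt_pu": 0.0}
round_tripped = dict(source, demand_pu=3.15 + 1e-13)   # float noise only
drifted = dict(source, generators=2, shunt_pu=0.19)     # a real loss

ok_cell = core_mismatches(source, round_tripped)
bad_cell = core_mismatches(source, drifted)
```

A cell passes when the returned list is empty. Tier-specific data such as cost, HVDC, and angle limits is deliberately absent from the compared keys, mirroring the invariant described above.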

Running it

cargo build --release -p powerio-capi
python3.12 -m venv .venv
.venv/bin/python -m pip install --upgrade pip maturin -r benchmarks/requirements.txt
env VIRTUAL_ENV=$PWD/.venv .venv/bin/maturin develop --release
julia --project=benchmarks -e 'using Pkg; Pkg.instantiate()'
bash benchmarks/run_validation.sh

The oracle tools (PowerModels.jl, egret, ExaPowerIO.jl, pandapower, PyPSA) are benchmark scoped: they are declared in benchmarks/Project.toml and benchmarks/requirements.txt, never as dependencies of the powerio package. benchmarks/run_validation.sh requires the Python oracles to import in the selected Python 3.11+ environment; a missing PyPSA, pandapower, or egret import is a setup failure.

Known limits

Write side losses are reported in Conversion::warnings; the pandapower and PyPSA readers itemize what they ignore in Parsed::warnings (read_warnings in Python), naming the table and counting the affected rows. convert_file/convert_str fold the read warnings into Conversion::warnings.

PSS/E reads revisions 33, 34, and 35. 3-winding transformers are kept as typed records and star-lowered into $Y_{\mathrm{bus}}$/connectivity by the indexed view; two-terminal DC lines map to the neutral HVDC model. A switched shunt keeps its steady-state susceptance BINIT as the shunt b and carries its mode, voltage band, regulated bus, and step blocks. A 2-winding transformer’s magnetizing susceptance round-trips through MAG2 ($\mathrm{CM} = 1$). Impedances are assumed on the system base ($\mathrm{CZ} = \mathrm{CW} = 1$).
PowerWorld .aux is read and written. .pwb binary cases are read only, and .pwd display files parse through the separate display API. .aux carries no system base, so the reader defaults to 100 MVA. No third-party .aux reader exists, so that writer is validated by powerio’s own read back plus a PowerModels JSON bridge. The .pwb layouts are reverse engineered; the decode evidence and coverage matrix are maintainer notes at powerio/src/format/powerworld/FORMAT.md.
PSLF .epc is read and written. The reader maps the static power flow core: buses, lines, two- and three-winding transformers, generators, loads, fixed shunts, controlled shunts at initial g/b, and limited two-terminal DC records. Three-winding transformers are kept as typed records and star-lowered into $Y_{\mathrm{bus}}$/connectivity by the indexed view. Unsupported sections stay in the retained source text and emit warnings.
MATPOWER canonical output (for a case that did not originate as MATPOWER) omits dcline; the byte exact echo path keeps it when the case was read from MATPOWER. Storage is written as an mpc.storage block.
egret output drops HVDC and storage. The reader takes the power flow ModelData subset (numeric bus ids, scalar values); unit commitment cases (system.time_keys) are rejected.
pandapower JSON writes the power flow core as split oriented pandapowerNet tables. Line ohms are referred to the from bus voltage, as pandapower’s build_branch reads them; a bus with baseKV 0 writes vn_kv set to $1$ (warned) so the per unit impedances survive. A branch with a tap, a shift, or terminals on two voltage levels becomes a trafo row with tap_changer_type = "Ratio"; its MATPOWER charging b rides as one bus shunt per terminal (warned, $Y_{\mathrm{bus}}$ exact) because pandapower’s magnetizing model is inductive only. The file is labeled with f_hz set to $50$ and c_nf_per_km compensated, so a 60 Hz source keeps its exact $Y_{\mathrm{bus}}$. Reference buses without a generator get an ext_grid row, which reads back as a Ref generator. The writer also warns on dropped HVDC, storage, capability columns, angle limits, rate B/C, non-finite values (written as JSON null), and costs poly_cost cannot carry. The reader models ratio, ideal, and pandapower 2.x tap changers, off-nominal vn_hv_kv/vn_lv_kv, lv side taps, and shunt vn_kv scaling; ZIP load composition, line shunt conductance, magnetizing branches, tabular tap changers, reactive cost coefficients, and every other non-empty table warn with row counts.
PyPSA CSV folders are canonicalized directory outputs, not byte exact text conversions. Covered: static buses, generators, loads, lines (ohms on the bus0 voltage, as PyPSA computes them), transformers (rebased between the system base and the transformer s_nom), shunts, storage units, and base MVA. The reader maps links to HVDC with a warning, requires v_nom and balanced CSV quoting, and warns on stores, nonzero g, and every CSV it does not read (time series, carriers). The writer keys tables by bus name, falling back to the numeric id when names collide (warned), and warns on dropped HVDC, q limits, mbase, transformer angle limits, rate B/C, isolated buses, non-finite p limits, and slackless or normalized networks. Nonnumeric bus names read back as dense synthetic ids with the originals on Bus.name.
GO Challenge 3 JSON reads ARPA-E GO Competition Challenge 3 input data into the balanced transmission model. Network is static, so the reader maps the first time interval into generator/load bounds and status fields, keeps the original JSON for byte exact source echo, and warns about scheduling data left in the retained source. There is no canonical GOC3 writer from an arbitrary Network; TargetFormat::Goc3Json only succeeds as a same format source echo. When a GOC3 Network is wrapped in .pio.json, powerio-pkg extracts the full input time axis into operating_points. Materializing one point applies those updates to the static payload and clears the series.
Surge JSON reads and writes the versioned surge-json network document. The reader maps buses, loads, fixed shunts, branches, generators, storage, and HVDC links into Network, retains the original source for same format echo, and warns about source sections that stay only in the retained document. The writer emits a canonical Surge network body for the supported power flow core; richer MATPOWER generator capability or ramp columns and unsupported cost shapes are reported in Conversion::warnings.
gridfm (read, the gridfm feature in powerio-matrix) reconstructs a Network from the gridfm-datakit Parquet dataset. The read is lossy, but it recovers everything a power flow needs: bus types/voltages/limits, nodal load and shunt totals, generator dispatch and bounds, branch r/x/b/tap/shift/rate_a/angle limits, and baseMVA. It cannot recover original bus ids (synthesized 1..n), per element load/shunt granularity (folded into one synthetic element per bus), piecewise/cubic gen costs (read as none), or HVDC/storage. Because the writer stores the effective tap, a branch with unit tap and no phase shift is read back as a line (raw $\mathrm{tap} = 0$), so a unity ratio, zero shift transformer in the source also reads back as a line; the power flow is identical. The losses are returned as a warnings list on GridfmRead, mirroring Conversion::warnings. The same direction writer is documented in the top level README.
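The pandapower and PyPSA notes above both refer line ohms to a bus voltage. A minimal sketch of that conversion, assuming the standard impedance base $Z_{\mathrm{base}} = V_{\mathrm{kV}}^2 / S_{\mathrm{MVA}}$. line_ohms is a hypothetical helper, not a powerio function, and its baseKV 0 fallback only mirrors the pandapower writer note:

```python
def line_ohms(r_pu, x_pu, from_base_kv, base_mva):
    """Convert per unit series impedance to ohms on the from bus voltage.

    Hypothetical helper for illustration; not part of powerio.
    """
    # A bus with baseKV 0 falls back to 1 kV so the per unit values survive.
    vn_kv = from_base_kv if from_base_kv > 0 else 1.0
    z_base = vn_kv**2 / base_mva  # ohms, with kV and MVA bases
    return r_pu * z_base, x_pu * z_base, vn_kv


r_ohm, x_ohm, vn_kv = line_ohms(0.01, 0.085, 345.0, 100.0)

# Dividing by the same base recovers the per unit values.
z_base = vn_kv**2 / 100.0
r_back, x_back = r_ohm / z_base, x_ohm / z_base
```

A reader that divides by the same base gets the per unit impedance back, which is why the choice of reference voltage has to match on both sides.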

Missing generator costs

PSS/E .raw files carry no generator cost curves. Converting a PSS/E case to MATPOWER writes mpc.gen and omits mpc.gencost with a warning; powerio does not invent zero costs. A workflow that needs costs must pick an explicit policy:

powerio convert case.raw --from psse --to matpower --missing-gen-cost zero -o case.m
powerio dcopf case.m -o out --missing-gen-cost quadratic --default-gen-cost 0.01,2.0,0.0
powerio gridfm case.raw --from psse -o out --missing-gen-cost zero

preserve: leave missing costs absent (default for conversion and GridFM export);
require: fail on an in-service generator without cost (default for DC OPF export);
zero: fill missing rows with a MATPOWER polynomial cost [0, 0, 0];
quadratic: fill missing rows with --default-gen-cost C2,C1,C0.
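The four policies can be sketched as one function. This is an illustration under assumed data shapes, not powerio's implementation: the generator rows are plain dicts and apply_missing_cost_policy is a hypothetical name.

```python
POLICIES = ("preserve", "require", "zero", "quadratic")


def apply_missing_cost_policy(gens, policy="preserve", default_cost=None):
    """Fill or reject missing generator costs.

    gens is a list of dicts with 'in_service' (bool) and 'cost'
    ((c2, c1, c0) or None). This row shape is hypothetical.
    Returns (new_rows, synthesized_row_indices).
    """
    if policy not in POLICIES:
        raise ValueError(f"unknown policy {policy!r}")
    if policy == "quadratic" and default_cost is None:
        raise ValueError("quadratic needs a default (c2, c1, c0)")
    rows, synthesized = [], []
    for index, gen in enumerate(gens):
        gen = dict(gen)  # leave the caller's rows untouched
        if gen["cost"] is None:
            if policy == "require" and gen["in_service"]:
                raise ValueError(f"in-service generator row {index} has no cost")
            if policy == "zero":
                gen["cost"] = (0.0, 0.0, 0.0)
                synthesized.append(index)
            elif policy == "quadratic":
                gen["cost"] = tuple(default_cost)
                synthesized.append(index)
        rows.append(gen)
    return rows, synthesized


gens = [
    {"in_service": True, "cost": (0.1, 5.0, 0.0)},
    {"in_service": True, "cost": None},
]
filled, synthesized = apply_missing_cost_policy(gens, "quadratic", (0.01, 2.0, 0.0))
```

Returning the synthesized row indices alongside the rows keeps invented costs distinguishable from source costs.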

--gen-cost-csv overrides costs by generator row before the missing-cost policy runs. The header is gen_index,bus,c2,c1,c0,startup,shutdown: gen_index is zero based in the current generator table, bus must match that generator’s bus id (catching stale tables after reordering), and startup/shutdown default to zero. GridFM stores cp0/cp1/cp2 columns; missing or unsupported costs still write zero columns, and the manifest separates missing_cost_gens, unsupported_cost_gens, zeroed_cost_gens, and synthesized_gen_costs.
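A sketch of the row checks described above, using only the Python standard library. The strict header comparison, the treatment of empty startup/shutdown cells as zero, and the name read_cost_overrides are assumptions for illustration; the CLI may be more lenient.

```python
import csv
import io

HEADER = ["gen_index", "bus", "c2", "c1", "c0", "startup", "shutdown"]


def read_cost_overrides(text, gen_buses):
    """Parse override rows; gen_buses[i] is the bus id of generator row i."""
    reader = csv.DictReader(io.StringIO(text))
    if reader.fieldnames != HEADER:
        raise ValueError(f"expected header {','.join(HEADER)}")
    overrides = {}
    for rec in reader:
        index = int(rec["gen_index"])  # zero based row in the generator table
        if not 0 <= index < len(gen_buses):
            raise ValueError(f"gen_index {index} is out of range")
        if int(rec["bus"]) != gen_buses[index]:
            # A mismatch means the CSV was written against a reordered table.
            raise ValueError(f"row {index}: bus {rec['bus']} does not match")
        overrides[index] = {
            "cost": (float(rec["c2"]), float(rec["c1"]), float(rec["c0"])),
            "startup": float(rec["startup"] or 0.0),
            "shutdown": float(rec["shutdown"] or 0.0),
        }
    return overrides


text = "gen_index,bus,c2,c1,c0,startup,shutdown\n0,1,0.01,2.0,0.0,,\n"
overrides = read_cost_overrides(text, gen_buses=[1, 2])
```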

Matrix outputs and conventions

The powerio-matrix crate builds sparse matrices and graph outputs for common power system representations. The outputs are derived from a parsed Network. The builders take the densely indexed IndexedNetwork, which maps bus ids to the contiguous index range $[0,n)$.

The DC OPF bundle has its own schema in the DC OPF bundle guide. Per-builder API detail is in the crate docs.

Capabilities

matrix	shape	builder	notes
B’ (FDPF)	$n \times n$	`build_bprime`	singular positive Laplacian, $\operatorname{rank}(L) = n - 1$ for a connected network, shuntless
B’’ (FDPF)	$n \times n$	`build_bdoubleprime`	SDDM when bus shunts are present
$\Re(Y_{\mathrm{bus}})$, $-\Im(Y_{\mathrm{bus}})$	$n \times n$	`build_ybus`	full admittance, keeps taps and shifts
LACPF (linear AC power flow) block	$2n \times 2n$	`build_lacpf`	$\begin{bmatrix}G & -B \\ -B & -G\end{bmatrix}$, flat start, indefinite
signed incidence $A$	$n \times m$	`build_incidence`	column $e$ has $+1$ at from-bus, $-1$ at to-bus
weighted Laplacian $L$	$n \times n$	`build_weighted_laplacian`	$L = A \operatorname{diag}(w) A^\mathsf{T}$, `ground_at` removes a row/col
flow map $B A^\mathsf{T}$	$m \times n$	`build_flow_map`	$f = B A^\mathsf{T}\theta$
PTDF	$m \times n$	`build_ptdf`	dense; factors the Laplacian grounded at the reference buses
LODF	$m \times m$	`build_lodf`	dense DC line-outage factors
adjacency	$n \times n$	`build_adjacency`	sparse graph adjacency
petgraph graph	n/a	`IndexedNetwork::to_petgraph`	`UnGraph<bus_idx, branch_idx>`

Computing PTDF and LODF matrices requires a linear solve. Both factor the Laplacian with one row and column removed for each reference bus, using the dense Cholesky in matrix::sensitivity. Every connected component must contain at least one reference bus. PTDF is dense $m \times n$. The DC OPF instance bundle ($A$, $b$, $L$, costs, bounds, thermal limits, $C_g$) is documented in the DC OPF bundle guide.
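The PTDF construction described above can be sketched on a three bus triangle. This is a minimal illustration, assuming the PaperPure weight $b = 1/x$ and a single reference bus. It inverts the grounded Laplacian with Gauss-Jordan elimination for brevity, where powerio uses a dense Cholesky factorization; ptdf is a hypothetical function name.

```python
def ptdf(n, branches, ref):
    """Dense PTDF for branches [(from, to, x)] with one 0-based reference bus."""
    keep = [i for i in range(n) if i != ref]
    pos = {bus: k for k, bus in enumerate(keep)}
    size = len(keep)

    # Laplacian with the reference row and column removed.
    lg = [[0.0] * size for _ in range(size)]
    for f, t, x in branches:
        b = 1.0 / x
        for i, j, value in ((f, f, b), (t, t, b), (f, t, -b), (t, f, -b)):
            if i != ref and j != ref:
                lg[pos[i]][pos[j]] += value

    # Gauss-Jordan inverse with partial pivoting.
    aug = [row + [float(i == j) for j in range(size)] for i, row in enumerate(lg)]
    for c in range(size):
        p = max(range(c, size), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [v / pivot for v in aug[c]]
        for r in range(size):
            if r != c:
                factor = aug[r][c]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[c])]
    inv = [row[size:] for row in aug]

    # Row e is b_e * (theta_from - theta_to) per unit injection at each bus.
    rows = []
    for f, t, x in branches:
        b = 1.0 / x
        row = [0.0] * n  # the reference column stays zero
        for bus in keep:
            theta_f = inv[pos[f]][pos[bus]] if f != ref else 0.0
            theta_t = inv[pos[t]][pos[bus]] if t != ref else 0.0
            row[bus] = b * (theta_f - theta_t)
        rows.append(row)
    return rows


branches = [(0, 1, 0.1), (1, 2, 0.1), (0, 2, 0.1)]
factors = ptdf(3, branches, ref=0)
```

With equal reactances, an injection at bus 1 withdrawn at the reference splits two thirds over the direct branch and one third over the two branch path, which is what the first column pair of factors shows.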

GridFM datasets

The GridFM export is a Parquet dataset under <case>/raw/ with bus_data, gen_data, branch_data, and y_bus_data. A single parsed case writes one scenario. A scenario batch stacks the rows of snapshots that share the same element set and uses the scenario column as the key.

GridFM read is the ML to classical return path. It recovers bus types, voltages, limits, nodal load and shunt totals, generator dispatch and bounds, branch parameters, and base_mva. It cannot recover original bus ids, per element load and shunt granularity, piecewise and cubic costs, HVDC, or storage; those losses are returned as warnings.

Conventions

Positive Laplacian matrices. Off-diagonal entries are $\le 0$ and diagonal entries are $> 0$, with $L_{ii} = \sum_{j \neq i} \lvert L_{ij} \rvert$ for B’ susceptance matrices. This is the M-matrix form an SDDM (symmetric diagonally dominant M-matrix) or Cholesky solver expects; for buses $i$ and $j$ joined by a branch, a consumer can recover the edge weight as $-L_{ij} > 0$.
Bus indexing. Bus ids are 1-based and preserved on the model as a newtype (the Rust New Type Idiom). IndexedNetwork::bus_index(id) is the only mapping into the dense $[0,n)$; an id out of range is an Error::UnknownBus.
Taps and shifts. A stored $\mathrm{tap} = 0$ means an effective tap of $1$ (Branch::effective_tap). B’ ignores taps and shifts; B’’ keeps taps and zeros only shifts; $Y_{\mathrm{bus}}$ keeps both.
Branch shunt admittance is stored per unit. Branch::charging is the stored per terminal admittance when present: g_fr, b_fr, g_to, and b_to are already per unit on the system base. Branch::b is the legacy MATPOWER BR_B total projection for formats that carry only one charging value. Matrix builders use Branch::terminal_charging(), so terminal values feed $Y_{\mathrm{bus}}$ even when the legacy total is zero or stale.
B’ scheme. Scheme selects between the two fast decoupled load flow variants for B’: Xb weights a branch by $1/x$ (series resistance ignored), Bx (the default) by $x/(r^2 + x^2)$.
Zero impedance branches. BuildOptions::skip_zero_impedance controls the builders whose branch denominator can be zero. The default true skips the branch and records the skipped source branch rows in MatrixStats as skipped_zero_impedance and skipped_zero_impedance_branches; false returns Error::ZeroImpedance. Full AC admittance builders use $r^2 + x^2$; DC incidence and reactance only FDPF variants use $x$. The gridfm export still zeros its admittance and flow columns for these rows and records dropped_zero_impedance in gridfm_meta.json.
Reference coverage. IndexedNetwork::check_reference_coverage verifies that every in-service island has a reference bus.
Susceptance conventions for the DC approximation. DcConvention selects the branch weight the DC builders (incidence, weighted Laplacian, PTDF/LODF, the DC OPF bundle) use. The default PaperPure is the textbook DC power flow weight $b = 1/x$, taps and shifts ignored; the resulting $L = A \operatorname{diag}(b) A^\mathsf{T}$ equals B’ under Scheme::Xb. Matpower reproduces MATPOWER’s makeBdc: $b = 1/(x\tau)$ for a transformer with tap ratio $\tau$, plus the phase shift injection vector p_shift.
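The indexing, sign, and weight conventions above fit together as in the following sketch. It is an illustration with assumed names (branch_weight, laplacian, and the rule strings are hypothetical), using dense nested lists where powerio builds sparse matrices, and it ignores phase shifts.

```python
def branch_weight(r, x, tap=0.0, rule="paper_pure"):
    """Branch weight under one of the rules named in the conventions above."""
    tau = 1.0 if tap == 0.0 else tap  # stored tap 0 means effective tap 1
    if rule in ("paper_pure", "xb"):
        return 1.0 / x
    if rule == "bx":
        return x / (r * r + x * x)
    if rule == "matpower":
        return 1.0 / (x * tau)
    raise ValueError(f"unknown rule {rule!r}")


def laplacian(bus_ids, branches, rule="paper_pure"):
    """Positive Laplacian for branches given as (from_id, to_id, r, x, tap)."""
    index = {bus_id: k for k, bus_id in enumerate(bus_ids)}  # ids to dense [0, n)
    n = len(bus_ids)
    lap = [[0.0] * n for _ in range(n)]
    for from_id, to_id, r, x, tap in branches:
        f, t = index[from_id], index[to_id]  # KeyError on an unknown bus id
        w = branch_weight(r, x, tap, rule)
        lap[f][f] += w
        lap[t][t] += w
        lap[f][t] -= w
        lap[t][f] -= w
    return lap


bus_ids = [10, 20, 30]
branches = [
    (10, 20, 0.01, 0.1, 0.0),
    (20, 30, 0.02, 0.2, 0.0),
    (10, 30, 0.0, 0.25, 0.95),
]
pure = laplacian(bus_ids, branches)
matpower = laplacian(bus_ids, branches, rule="matpower")
```

Every row sums to zero under each rule, and the tapped branch between buses 10 and 30 is the only entry that differs between pure and matpower.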

Output

Matrices write as Matrix Market files or stay in memory. A symmetric matrix is stored as its lower triangle with the symmetric header and 1-based indices (io::mtx::write_mtx). The sensitivities and dcopf CLI subcommands bundle the relevant family with a JSON manifest.
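The symmetric layout can be sketched in a few lines. This is not io::mtx::write_mtx, only an illustration of the same Matrix Market conventions (symmetric header, lower triangle, 1-based indices); write_symmetric_mtx is a hypothetical name and takes a dense matrix for simplicity.

```python
def write_symmetric_mtx(matrix):
    """Return Matrix Market text for a dense symmetric matrix.

    Only entries on or below the diagonal are stored, with 1-based indices.
    """
    n = len(matrix)
    entries = [
        (i + 1, j + 1, matrix[i][j])
        for j in range(n)
        for i in range(j, n)  # lower triangle, column by column
        if matrix[i][j] != 0.0
    ]
    lines = [
        "%%MatrixMarket matrix coordinate real symmetric",
        f"{n} {n} {len(entries)}",
    ]
    lines += [f"{i} {j} {value!r}" for i, j, value in entries]
    return "\n".join(lines) + "\n"


lap = [
    [10.0, -10.0, 0.0],
    [-10.0, 15.0, -5.0],
    [0.0, -5.0, 5.0],
]
text = write_symmetric_mtx(lap)
```

A reader of the symmetric header mirrors each off-diagonal entry, so the stored count is the lower triangle count, not the full nonzero count.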

The standard case solver property fixture lives at powerio-matrix/tests/fixtures/solver_matrix_stats.json. It records B’, B’’, and ybus_imag stats for case9, case14, case30, case57, and case118: n, nnz, min diagonal, M-matrix sign pattern, diagonal dominance margin, zero impedance skips, row sum checks, SPD checks, and a condition estimate when the solver input is SPD.

IndexedNetwork::to_petgraph returns the network as an undirected petgraph graph, one node per bus and one edge per in-service branch. The connectivity report and the radial check are built on it. Use the returned graph directly for other petgraph algorithms.
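The kind of question the connectivity report and radial check answer can be sketched with a union-find pass over the in-service branches. This is an illustration only: it assumes radial means cycle free (so parallel branches count as a cycle), and connectivity is a hypothetical function, not the petgraph based implementation.

```python
def connectivity(n, edges):
    """Return (component_count, is_radial) for an undirected graph on 0..n-1."""
    parent = list(range(n))

    def find(i):
        # Follow parents to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    components, has_cycle = n, False
    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            has_cycle = True  # this edge closes a loop
        else:
            parent[root_u] = root_v
            components -= 1
    return components, not has_cycle


feeder = connectivity(4, [(0, 1), (1, 2), (1, 3)])
meshed = connectivity(4, [(0, 1), (1, 2), (1, 3), (2, 3)])
islands = connectivity(4, [(0, 1), (2, 3)])
```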

DC OPF Bundle Schema

powerio dcopf <case>.m -o <out> (or opf_pipeline::write_dcopf_bundle) writes <out>/<case>_dcopf/: a set of Matrix Market files plus dcopf_meta.json. Everything is a pure function of the case. The files and conventions are below.

Conventions

Format. Matrix Market. Matrices are coordinate real; square symmetric ones (L, L_grounded) use the symmetric header and store the lower triangle only. Vectors are array real general, one value per line.
Index base. .mtx row/column indices are 1-based (Matrix Market standard). reference_buses in the manifest are 0-based dense bus indices.
Sign convention. The Laplacians are the positive (M-matrix) form: diagonal $> 0$, off-diagonal $\le 0$, with $L_{ii} = \sum_{j \neq i} \lvert L_{ij} \rvert$ for $L$. An off-diagonal entry is $L_{ij} = -b_e$ for the branch between $i$ and $j$, so a consumer recovers the edge weight as $-L_{ij} > 0$.
Units. PerUnit by default: power divided by base_mva, cost scaled so it is a function of per unit power: $q \leftarrow 2c_2 \cdot \mathrm{base}^2$ and $c \leftarrow c_1 \cdot \mathrm{base}$. Native keeps MW / native cost. The choice is recorded in the manifest.
Generator costs. The default DC OPF export policy is require: an in-service generator without cost data is an error. Use --missing-gen-cost to explicitly fill missing rows for feasibility tests.
Reference buses. reference_buses in the manifest lists every grounded bus as a 0-based dense index. Each in-service island needs at least one reference. If several references lie in one island, the bundle fixes all of those voltage angles to zero; it is not a participation factor slack model.
DC convention. PaperPure by default ($b_e = 1/x$, taps and phase shifts ignored). Matpower uses $b_e = 1/(x \tau)$ plus the phase shift injection p_shift. Recorded in the manifest.
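The PerUnit cost scaling can be checked numerically. The sketch below assumes the scaled objective is read as $\tfrac{1}{2} q p^2 + c p$ in per unit power, which is what the factor 2 in $q$ suggests; the constant term is left out, and per_unit_cost is a hypothetical helper.

```python
def per_unit_cost(c2, c1, base_mva):
    """Scale MATPOWER style c2, c1 (per MW) to per unit coefficients q, c."""
    q = 2.0 * c2 * base_mva**2
    c = c1 * base_mva
    return q, c


c2, c1, base_mva = 0.01, 2.0, 100.0
q, c = per_unit_cost(c2, c1, base_mva)

p_mw = 150.0
p_pu = p_mw / base_mva

native = c2 * p_mw**2 + c1 * p_mw        # cost as a function of MW
scaled = 0.5 * q * p_pu**2 + c * p_pu    # same cost as a function of per unit power
```

Both expressions give the same cost for the same dispatch, so a solver working in per unit reports objective values in the native cost unit.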

Matrices

file	shape	what
`A.mtx`	$n \times m$	signed incidence; column $e$ has $+1$ at from-bus, $-1$ at to-bus
`L.mtx`	$n \times n$	generic Laplacian $L = A \operatorname{diag}(b) A^\mathsf{T}$, singular with $\operatorname{rank}(L) = n - 1$ for a connected network, $\mathbf{1} \in \ker L$
`L_grounded.mtx`	$(n-k) \times (n-k)$	$L$ with $k$ reference rows and columns removed; SPD when every island is grounded
`BAt.mtx`	$m \times n$	flow map $B A^\mathsf{T}$, where $f = B A^\mathsf{T} \theta$
`Cg.mtx`	$n \times n_{\mathrm{gen}}$	generator-to-bus incidence, one $1$ per column

Vectors

Bus-indexed (length $n$): pd (load), q/c (cost diag/linear), pmax/pmin (generation bounds), e_r (reference indicator: $1$ at every reference bus, else $0$), p_shift (phase shift injection, all zero unless Matpower + shifters).
Branch-indexed (length $m$): b (susceptances), fmax (thermal limits; $0$ means unlimited per MATPOWER).
Generator-space provenance (length $n_{\mathrm{gen}}$): q_gen, c_gen, pmax_gen, pmin_gen.

Manifest (`dcopf_meta.json`)

Schema powerio.dcopf version 0.1.0 writes Matrix Market files plus structured metadata:

dimensions: n_buses, n_source_branches, n_branch_columns, n_generators, n_reference_buses, and n_grounded_buses.
index_base: dense = 0 for manifest bus, branch, generator, and reference indices; matrix_market = 1 for .mtx coordinates.
dc_convention, units, build_options, and zero_impedance. The zero impedance block records the skip flag, denominator rule, skipped count, and skipped source branch rows.
grounding: reference buses, removed rows and columns, the grounded operator (L_grounded), and the reference selector (e_r).
operators[]: one entry per emitted operator with name, file, kind, rows, cols, index_space, and units.

The legacy aliases n, m, n_gen, reference_buses, and convention remain for current readers. cost_policy, synthesized_gen_costs, patched_gen_costs, files[], and powerio_version remain top level fields.

Solving with it

The grounded system is the one to factor: L_grounded is SPD when every island has a reference. For DC power flow $L\theta = p$ with net injection $p = g - d$, drop all reference_buses entries from $p$, solve $L_{\mathrm{grounded}}\theta_{\mathrm{red}} = p_{\mathrm{red}}$, and set each reference angle to $0$. e_r identifies the grounded buses without parsing the manifest. The full singular $L$ can be used instead with a consistent zero-sum RHS.
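A minimal sketch of that reduced solve, assuming a dense Laplacian held as nested lists and a small hand written Cholesky factorization in place of a real sparse solver. dc_power_flow and cholesky_solve are illustrative names, not powerio APIs.

```python
import math


def cholesky_solve(a, rhs):
    """Solve a x = rhs for a symmetric positive definite matrix a."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("not SPD; is every island grounded?")
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    y = [0.0] * n  # forward substitution
    for i in range(n):
        y[i] = (rhs[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i]
    x = [0.0] * n  # back substitution with the transpose
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


def dc_power_flow(lap, p, reference_buses):
    """Drop reference rows and columns, solve, and pin reference angles to 0."""
    grounded = set(reference_buses)
    keep = [i for i in range(len(p)) if i not in grounded]
    reduced = [[lap[i][j] for j in keep] for i in keep]
    theta_red = cholesky_solve(reduced, [p[i] for i in keep])
    theta = [0.0] * len(p)
    for k, i in enumerate(keep):
        theta[i] = theta_red[k]
    return theta


lap = [[20.0, -10.0, -10.0], [-10.0, 20.0, -10.0], [-10.0, -10.0, 20.0]]
p = [1.0, -0.4, -0.6]  # net injection g - d, sums to zero
theta = dc_power_flow(lap, p, reference_buses=[0])
```

Because the injections sum to zero, the recovered angles also satisfy the dropped reference row of the full system.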

An interior point DC OPF solver builds reweighted Laplacians each Newton step from the same A and b (only the edge weights change), so A is the durable operator to hand over.

Language APIs

PowerIO uses the same IO vocabulary across Rust, Python, Julia, and the C ABI, with language-specific spelling where needed. A new format or dataset should appear as a format string or convenience wrapper, not as a new naming scheme.

Verb taxonomy:

parse_*: bytes, paths, or text to typed parsed values. Transmission parsers return a balanced network handle; distribution parsers return a multiconductor network handle; display parsers return display data.
to_*: a Network to a new value (to_format, to_matpower, to_json, to_normalized, to_dense)
convert_file / convert_str: source path or text to target text in one call
write_*: filesystem outputs (write_gridfm, write_pypsa_csv_folder, write_dcopf_bundle); the Rust hub also keeps write_as and per-format write_* text builders, the internals behind to_format and the to_* writers, which the bindings do not mirror
read_*: filesystem dataset inputs (read_gridfm, read_pypsa_csv_folder), the inverse of write_*. Datasets are multi-file directories, so they read and write; single documents parse and serialize (parse_*/to_*)
export_*: handoff to external memory or interface protocols

Concept	Rust	Python	Julia	C ABI
Parse path	`parse_file(path, from)`	`parse_file(path, from_=None)`	`parse_file(path; from=nothing)`	`pio_parse_file`
Parse text	`parse_str(text, format)`	`parse_str(text, format)`	`parse_str(text, format)`	`pio_parse_str`
Parse display path	`parse_display_file(path, from)`	`parse_display_file(path, from_=None)`	planned	n/a
Parse display bytes	`parse_display_bytes(bytes, format)`	`parse_display_bytes(data, format)`	planned	n/a
Parse IO	n/a	file object later	`parse_file(io, format)`	n/a
JSON to Network	`Network::from_json`	`from_json`	`from_json`	`pio_parse_str` + `"powerio-json"`
File conversion	`convert_file(path, to, from)`	`convert_file(path, to, from_=None)`	`convert_file(path, to; from=nothing)`	`pio_convert_file`
Text conversion	`convert_str(text, to, format)`	`convert_str(text, to, format)`	`convert_str(text, to; from=format)`	`pio_convert_str`
Parsed conversion	`net.to_format(to)`	`net.to_format(to)`	`to_format(net, to)`	`pio_to_format`
MATPOWER text	`net.to_matpower()`	`net.to_matpower()`	`to_matpower(net)`	`pio_to_format` + `"matpower"`
JSON text	`net.to_json()`	`net.to_json()`	`to_json(net)`	`pio_to_format` + `"powerio-json"`
Package JSON	`NetworkPackage::to_json()`	`Package` class / package transport	`to_package` / `write_package`	`pio_package_*`
Package operating points	`pkg.operating_points()`	`pkg.operating_points()`	planned	`pio_package_operating_points_json`
Materialize operating point	`pkg.materialize_operating_point(i)`	`pkg.materialize_operating_point(i)`	planned	`pio_package_materialize_operating_point`
Normalized copy	`net.to_normalized()`	`net.to_normalized()`	`to_normalized(net)`	`pio_normalize`
Dense tables	typed table API	`to_dense`	`to_dense`	`pio_*` extractors
PyPSA CSV folder	`read_pypsa_csv_folder` / `write_pypsa_csv_folder`	`read_pypsa_csv_folder` / `net.write_pypsa_csv_folder`	`parse_file(dir; from="pypsa-csv")` / `write_pypsa_csv_folder`	`pio_parse_file` / `pio_write_dir` + `"pypsa-csv"`
gridfm write	`write_gridfm_dataset` / `write_gridfm_batch`	`net.write_gridfm` / `write_gridfm_batch`	planned	planned
gridfm read	`read_gridfm_dataset(dir, scenario)`	`read_gridfm(dir, scenario=0)`	`read_gridfm(dir; scenario=0)`	`pio_read_dir` + `"gridfm"`
Arrow handoff	internal/C ABI	later	`to_arrow`	`pio_to_arrow`

Note: the C ABI carries no per-format symbols: matpower, the powerio-json snapshot, PyPSA CSV directories, and gridfm datasets are all format strings into pio_to_format / pio_parse_str / pio_write_dir / pio_read_dir. The language APIs keep their per-format conveniences (to_matpower, from_json, …) as wrappers over the same paths.

C ABI and binding compatibility

The C ABI is the stable boundary for non Rust callers. Handles own parsed networks. PioPackage handles own .pio.json compiler packages. Callers free network handles with pio_network_free, package handles with pio_package_free, and returned text with pio_string_free. They size output buffers before filling them and treat every format name as a string routed through the same parser and writer hub.

C ABI review points:

null handles must return documented defaults or errors, not crash;
optional output buffers must be safe to pass as null; required output structs such as Arrow exports must report an error when null;
returned text and warning buffers must be NUL terminated when capacity permits;
reported lengths must let callers allocate exact buffers;
header declarations and exported Rust symbols must match;
feature gated exports such as Arrow, GridFM, distribution, and packages must be additive;
ownership rules must be documented in the header, README, and binding code.

Julia’s PowerIO.jl uses the C ABI for handles, dense extractors, Arrow, GridFM, PyPSA CSV folders, distribution conversion, and .pio.json package construction. Whole-network transport uses powerio-json, so the binding does not stitch together a separate model from individual table calls. The Julia binding checks pio_abi_version() against PIO_ABI_VERSION on first use. Distribution calls also check pio_dist_abi_version().

GOC3 package construction is the first package operating point path backed by a source format. The static balanced payload carries the first interval; the replayable series is exposed through the package APIs above.

During development, test the sibling Julia binding against the local C ABI instead of an artifact:

cargo build -p powerio-capi --release --features arrow,gridfm,dist,pkg
POWERIO_CAPI=$PWD/target/release/libpowerio_capi.dylib \
  julia --project=../PowerIO.jl -e 'using Pkg; Pkg.test()'

Binding compatibility checks:

surface	behavior
Python base import	`import powerio` does not import NumPy, SciPy, NetworkX, Polars, pandas, pyarrow, or the MCP SDK
Python optional paths	matrix, graph, GridFM inspection, pandas, MCP, and benchmark oracles live behind extras
C ABI	`pio_abi_version()` is the core compatibility check; optional symbols are additive and feature probed
Julia	`PowerIO.jl` checks the C ABI version before first use and checks `pio_dist_abi_version()` before distribution calls
Arrow	C returns Arrow C Data Interface structs; Julia’s default `to_arrow` copies to owned vectors, while `copy=false` keeps the wrapper alive for zero copy reads
GridFM	Julia and C read GridFM through `pio_read_dir` / `"gridfm"` and surface schema losses as warnings
Distribution	Python, Julia, Rust, and C use separate distribution handles; transmission and distribution conversion paths do not mix

Distribution surface (`powerio-dist`)

The multiconductor distribution model follows the same taxonomy under its own handle type; the two families do not mix. The C distribution surface ships behind the optional dist feature (PIO_DIST); a consumer probes it with pio_has_feature("dist"), then checks pio_dist_abi_version() against PIO_DIST_ABI_VERSION. PowerIO.jl uses the same runtime check before calling the distribution C conversion helpers.

Concept	Rust	Python	Julia	C ABI
Parse path	`powerio_dist::parse_file(path, from)`	`dist.parse_file(path, from_=None)`	`parse_file(DistNetwork, path; from=nothing)`	`pio_dist_parse_file`
Parse text	`powerio_dist::parse_str(text, format)`	`dist.parse_str(text, format)`	`parse_str(DistNetwork, text, format)`	`pio_dist_parse_str`
File conversion	`powerio_dist::convert_file(path, to, from)`	`dist.convert_file(path, to, from_=None)`	`convert_file(DistNetwork, path, to; from=nothing)`	`pio_dist_convert_file(path, from, to, ...)`
Target format type	`DistTargetFormat` (`FromStr`, `name()`)	format name strings	`DistNetwork` plus format strings	format name strings
Text conversion	`powerio_dist::convert_str(text, to, format)`	`dist.convert_str(text, to, format)`	`convert_str(DistNetwork, text, to, format)`	`pio_dist_convert_str(text, from, to, ...)`
Parsed conversion	`net.to_format(to)`	`case.to_format(to)`	`to_format(net, to)`	`pio_dist_to_format`
Parse warnings	`net.warnings`	`case.warnings`	`warnings(net)`	`pio_dist_warnings`

Python API

Install the base package for parsing, writing, JSON transport, and file conversion with zero dependencies:

pip install powerio

Install extras only for the outputs that need them:

pip install 'powerio[matrix]'   # numpy, scipy
pip install 'powerio[graph]'    # networkx
pip install 'powerio[gridfm]'   # polars
pip install 'powerio[pandas]'   # pandas and pyarrow compatibility reads (Python 3.10+)
pip install 'powerio[all]'      # matrix, graph, and gridfm reads

Importing powerio and calling parse_file, parse_str, convert_file, convert_str, to_matpower, or to_json does not import NumPy, SciPy, NetworkX, Polars, pandas, or pyarrow.

Transmission text and file format names accepted by parse_* and convert_* include matpower, psse, powerworld, pslf, powermodels-json, egret-json, pandapower-json, goc3-json, surge-json, and powerio-json, plus their documented aliases. PyPSA CSV folders and GridFM Parquet datasets are directory formats; use read_pypsa_csv_folder, Network.write_pypsa_csv_folder, read_gridfm, Network.write_gridfm, or the conversion/package helpers that take a path.

Canonical use

import powerio as pio

net = pio.parse_file("case9.m")
same_text = net.to_matpower()
json_text = net.to_json()
pm = net.to_format("powermodels-json")
pp = net.to_format("pandapower-json")
raw = pio.convert_file("case9.m", "psse")
aux = pio.convert_str(json_text, "powerworld", format="powermodels-json")
pypsa_out = net.write_pypsa_csv_folder("case9-pypsa")
display = pio.parse_display_file("case.pwd")
pkg = pio.Package.from_file("goc3_case.json", from_="goc3-json")
points = pkg.operating_points()
period_1 = pkg.materialize_operating_point(1)

normalized = net.to_normalized()
dense = net.to_dense()       # needs powerio[matrix]
bprime = net.bprime()        # needs powerio[matrix]
graph = net.to_networkx()    # needs powerio[graph]

Model names

powerio.Network is the existing balanced transmission handle. v0.4 also exports powerio.BalancedNetwork as the v1 family name for the same handle. The old powerio.Case compatibility alias was removed in v0.4.

For distribution models, use powerio.dist.MulticonductorNetwork or the existing powerio.dist.DistNetwork handle name. The old powerio.dist.DistCase alias was removed in v0.4.

parse_file(path, from_=None) reads network case files (inferred from the extension, or forced with from_); parse_str(text, format) reads in-memory case text. Display artifacts are not network cases, so they use the separate display API:

from pathlib import Path

import powerio as pio

display = pio.parse_display_file("case.pwd")
same = pio.parse_display_bytes(Path("case.pwd").read_bytes(), "pwd")

assert display.kind == "powerworld"
first = display.data.substations[0]
print(first.number, first.name, first.x, first.y)

For v0.2.2, display.data is a PwdDisplay with canvas_width, canvas_height, stamp, and substations.

PyPSA folders

PyPSA CSV folders are multi-file datasets, so they use explicit read and write helpers instead of Conversion.text.

import powerio as pio

case = pio.parse_file("case14.m")
out = case.write_pypsa_csv_folder("case14-pypsa")
round_trip = pio.read_pypsa_csv_folder(out["dir"])

The written folder can be imported with pypsa.Network().import_from_csv_folder(path). PyPSA itself is not a runtime dependency of powerio.

CSV folders are PyPSA’s native static component format and carry the network topology: buses, lines, transformers, generators, loads, shunts, storage units, and links (read as HVDC). Time series scenarios in NetCDF/HDF5 are out of scope for now; support is tracked in #107.

GridFM reads

The native wheel includes the GridFM Parquet writer and reader.

read_gridfm(dir, scenario=0) rebuilds a Network from a dataset, the inverse of Network.write_gridfm, returning a GridfmRead(network, scenario, warnings) namedtuple. The read is lossy but recovers everything a power flow needs; warnings lists what the gridfm schema couldn’t round-trip (synthesized bus ids, folded per bus load/shunt, dropped HVDC/storage, piecewise costs). read_gridfm_scenarios(dir) returns one GridfmRead per scenario. dir resolves the raw/ leaf, a <case>/ directory, or a parent with one */raw/ child.

import powerio as pio

out = pio.parse_file("case14.m").write_gridfm("out")
net, scenario, warnings = pio.read_gridfm(out["dir"])
text = net.to_matpower()                 # gridfm → any classical format

To inspect the raw Parquet tables instead, the preferred read extra is Polars:

import polars as pl

bus = pl.read_parquet(f"{out['dir']}/bus_data.parquet")

Use powerio[pandas] only for downstream code that expects pandas DataFrames.
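The directory rule for read_gridfm (the raw/ leaf, a <case>/ directory, or a parent with one */raw/ child) can be sketched with pathlib. resolve_raw_dir is a hypothetical helper written from that description; the order of the checks and the error on ambiguity are assumptions.

```python
from pathlib import Path


def resolve_raw_dir(path):
    """Map a user supplied directory to the raw/ leaf of a GridFM dataset."""
    path = Path(path)
    if path.name == "raw" and path.is_dir():
        return path  # already the raw/ leaf
    if (path / "raw").is_dir():
        return path / "raw"  # a <case>/ directory
    children = sorted(p for p in path.glob("*/raw") if p.is_dir())
    if len(children) == 1:
        return children[0]  # a parent with exactly one */raw/ child
    raise ValueError(f"{path}: expected one raw/ directory, found {len(children)}")
```

Refusing to guess when a parent holds several cases keeps a batch output directory from silently reading the wrong dataset.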

`.pio.json` packages

powerio.Package is the handle for .pio.json packages: it parses the envelope once and every accessor reuses the handle. Package.from_file and Package.from_str build packages from case input, Package.from_json reads envelope text, and Package.from_balanced / Package.from_multiconductor wrap existing networks. pkg.model_kind names the package family; pkg.as_balanced() / pkg.as_multiconductor() rebuild typed network handles from the payload.

pkg.operating_points() returns a Python dict for the replayable operating point series, or None. pkg.materialize_operating_point(i) returns a new static Package with one point applied; updates resolve by the payload rows’ uid identities, and an unknown identity or a row that contradicts one raises ValueError. GOC3 packages populate this series from the source time series while the static payload holds the first interval. Network table dicts (net.buses, net.loads, …) expose each row’s uid. pkg.validate(), pkg.validation(), and pkg.diagnostics() expose the package validation profile, and multiconductor packages lower through pkg.multiconductor_to_balanced_preflight() and pkg.lower_multiconductor_to_balanced().

pkg = pio.Package.from_file("goc3_case.json", from_="goc3-json")
series = pkg.operating_points()
static_pkg = pkg.materialize_operating_point(0)
net = static_pkg.as_balanced()
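Resolution of updates by uid can be pictured with plain dicts. The payload and point shapes below are hypothetical and are not the .pio.json schema; the sketch covers only the unknown identity error and omits the contradiction check.

```python
def materialize(payload, point):
    """Apply one operating point's updates to a copy of a static payload.

    payload maps table names to lists of rows, each with a 'uid'.
    point holds 'updates', each naming a 'uid' and the fields to 'set'.
    Both shapes are illustrative assumptions.
    """
    result = {table: [dict(row) for row in rows] for table, rows in payload.items()}
    by_uid = {row["uid"]: row for rows in result.values() for row in rows}
    for update in point["updates"]:
        row = by_uid.get(update["uid"])
        if row is None:
            raise ValueError(f"unknown uid {update['uid']!r}")
        row.update(update["set"])
    return result


payload = {
    "loads": [{"uid": "load:1", "pd": 1.0}],
    "gens": [{"uid": "gen:1", "pmax": 2.5}],
}
point = {"updates": [{"uid": "load:1", "set": {"pd": 1.2}}]}
static = materialize(payload, point)
```

Keying on uid instead of row position means an update still lands on the right row after a table is reordered.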

MCP path handling

MCP clients can request .pio.json package output from parse and pass that same value back to the other network tools:

parsed = parse(path="case9.m", transport="package")
pkg = parsed["package_json"]
summary(package_json=pkg)
matrix("bprime", package_json=pkg)
save(out_path="case9.raw", to_format="psse", package_json=pkg)
diagnostics(pkg)

summary, normalize, matrix, and save also auto-detect a package passed through the legacy json argument. The package envelope’s model_kind routes balanced and multiconductor payloads.

The optional MCP server accepts local filesystem paths and file:// URIs for path and out_path arguments. Remote URI schemes are rejected. Deployments that need filesystem containment can set POWERIO_MCP_ALLOWED_ROOTS to an os.pathsep separated list of directories; all MCP reads and writes must resolve under one of those roots. POWERIO_MCP_ROOT is accepted as a single root alias.
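A containment check along these lines can be sketched with the standard library. This is not the server's code: resolve_mcp_path is a hypothetical name, and the precedence of POWERIO_MCP_ALLOWED_ROOTS over POWERIO_MCP_ROOT is an assumption.

```python
import os
from pathlib import Path
from urllib.parse import unquote, urlparse


def resolve_mcp_path(value, env=None):
    """Resolve a path or file:// URI and enforce the allowed roots, if set."""
    env = os.environ if env is None else env
    if "://" in value:
        parsed = urlparse(value)
        if parsed.scheme != "file":
            raise ValueError(f"remote URI scheme {parsed.scheme!r} is rejected")
        value = unquote(parsed.path)
    resolved = Path(value).resolve()  # collapses .. and symlinks before the check

    roots_text = env.get("POWERIO_MCP_ALLOWED_ROOTS")
    if roots_text:
        roots = [Path(r).resolve() for r in roots_text.split(os.pathsep) if r]
    elif env.get("POWERIO_MCP_ROOT"):
        roots = [Path(env["POWERIO_MCP_ROOT"]).resolve()]
    else:
        return resolved  # no containment configured

    if not any(resolved == root or root in resolved.parents for root in roots):
        raise ValueError(f"{resolved} is outside the allowed roots")
    return resolved
```

Resolving before comparing is the important step: a path such as root/../other only fails the check once the parent reference has been collapsed.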

Performance

PowerIO has four benchmark tiers. Keep them separate when publishing numbers.

tier	command	what it answers
Rust microbenchmarks	`cargo bench -p powerio --bench parse`	parser, writer, and PowerWorld reader timing inside one process
Matrix microbenchmarks	`cargo bench -p powerio-matrix --bench matrix`	sparse matrix, DC OPF component, and dense sensitivity builder timing after parse/indexing
Cross tool parser comparison	`julia --project=benchmarks benchmarks/bench_julia.jl --json`	powerio through the C ABI against ExaPowerIO.jl and PowerModels.jl
Python parser comparison	`.venv/bin/python benchmarks/bench_parse.py --json <cases>`	Python package parse and matrix path against pandapower reader paths

The published table lives in the repository benchmark results, and this guide is the public reference for how those numbers are produced. Each refresh should update the snapshot environment there: machine model, chip, core count, memory, OS, Rust, C compiler, Julia, Python, and the package versions used by the comparison harnesses. Regenerate the JSON inputs first, then splice only the marked regions:

bash benchmarks/fetch_cases.sh
cargo build --release -p powerio-capi
python3.12 -m venv .venv
.venv/bin/python -m pip install --upgrade pip maturin -r benchmarks/requirements.txt
env VIRTUAL_ENV=$PWD/.venv .venv/bin/maturin develop --release
julia --project=benchmarks benchmarks/bench_julia.jl --json
.venv/bin/python benchmarks/bench_parse.py --json \
  tests/data/case2869pegase.m \
  tests/data/large/case9241pegase.m \
  tests/data/large/case13659pegase.m \
  tests/data/large/case193k.m
python3 benchmarks/render_tables.py
python3 benchmarks/render_tables.py --check

PowerWorld .pwb and .aux parse timings are measured by the Rust Criterion benchmarks. Fetch the public fixtures, run cargo bench -p powerio --bench parse -- "parse_aux_|parse_pwb_", then run python3 benchmarks/extract_powerworld_bench.py before rendering the tables. If the Texas7k local row is published, pass its aux and pwb paths through POWERIO_BENCH_AUX and POWERIO_BENCH_PWB during the Criterion run.

Matrix builder timings are separate from parse timings. The matrix benchmark parses each fixture once, builds IndexedNetwork once, and times only derived matrix construction. Its pipeline row measures Pipeline::run for the paired $Y_{\mathrm{bus}}$ export, including MTX, shunt, and metadata writes:

cargo bench -p powerio-matrix --bench matrix
python3 benchmarks/extract_matrix_bench.py
python3 benchmarks/render_tables.py

Use filtered runs while developing a focused change, for example:

cargo bench -p powerio-matrix --bench matrix -- 'matrix_bprime|matrix_ybus|dcopf_'

Criterion compares against the local target/criterion baseline. Treat a "Performance has regressed" line as a signal to investigate, not as a publishable claim by itself. A release note or benchmark page needs the commit, tree cleanliness, machine, toolchain, command, fixtures, and whether optional large cases were present.
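
One way to keep that list honest is to record it as data next to the numbers. The sketch below is a suggestion, not a format PowerIO defines: `snapshot_environment` and its field names are hypothetical, and the commit, toolchain, and cleanliness values are passed in by the caller instead of being detected.

```python
import json
import platform


def snapshot_environment(commit, tree_clean, toolchain, commands, fixtures,
                         optional_large_cases):
    """Build a provenance record for one benchmark run (hypothetical layout)."""
    return {
        "commit": commit,
        "tree_clean": tree_clean,
        "machine": platform.machine(),
        "os": f"{platform.system()} {platform.release()}",
        "python": platform.python_version(),
        "toolchain": toolchain,
        "commands": list(commands),
        "fixtures": list(fixtures),
        "optional_large_cases": optional_large_cases,
    }


record = snapshot_environment(
    commit="<commit sha>",              # placeholder, fill from git
    tree_clean=True,
    toolchain="<rustc version>",        # placeholder, fill from rustc --version
    commands=["cargo bench -p powerio-matrix --bench matrix"],
    fixtures=["tests/data/case2869pegase.m"],
    optional_large_cases=False,
)
print(json.dumps(record, indent=2))
```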

Testing and release checks

Keep changes reviewable. A numerical semantics change needs tests and a short reason in code or docs. A performance change needs before and after measurements. A documentation change should link to evidence instead of expanding the README into a second manual.

Baseline checks

These commands cover the Rust workspace, the Python extension build, the Python binding tests, and the book:

cargo fmt --all --check
cargo clippy --all-targets
cargo test
cargo test -p powerio-cli --test cli
cargo test -p powerio-capi
cargo build -p powerio-py
python3.12 -m venv .venv
.venv/bin/python -m pip install --upgrade pip maturin -r benchmarks/requirements.txt
env VIRTUAL_ENV=$PWD/.venv .venv/bin/maturin develop --release
.venv/bin/pytest python/tests
mdbook build docs
mdbook test docs

Route changes

Use the smallest gate set that covers the changed surface, then run the release gates before a release claim.

changed surface	extra gates
parser or writer semantics	`bash benchmarks/run_validation.sh`; format round trip tests; affected `cargo +nightly fuzz run <target> -- -runs=1` harnesses
rich model fields	`bash benchmarks/run_rich_validation.sh`
matrix builders or DC OPF bundles	`cargo test -p powerio-matrix`; `cargo bench -p powerio-matrix --bench matrix`
PowerWorld binary reader	PowerWorld parser tests plus `cargo bench -p powerio --bench parse -- "parse_aux_|parse_pwb_"`
C ABI	`scripts/capi-header-parity.sh`; `scripts/capi-smoke.sh`; `cargo test -p powerio-capi --no-default-features`; `cargo test -p powerio-capi --features arrow,gridfm,dist,pkg`; matching clippy runs
Python package metadata or extras	`maturin build --release --out /tmp/powerio-wheel-check`; inspect wheel `METADATA`
Julia binding compatibility	build `powerio-capi --features arrow,gridfm,dist,pkg`, then run `PowerIO.jl` tests with `POWERIO_CAPI`
shared surface with PowerIO.jl	push a same-named PowerIO.jl companion branch; the tandem CI job tests against it
CLI behavior	`cargo test -p powerio-cli --test cli`
documentation or website	`mdbook build docs`; `mdbook test docs`; check stale links to retired guide outputs

benchmarks/run_validation.sh requires the Python oracle stack in the same Python 3.11+ venv as the local wheel. Missing PyPSA, pandapower, or egret is a setup failure. benchmarks/run_rich_validation.sh treats the committed PowerModels rich oracle as strict; missing Julia is a setup failure.
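
A quick preflight can surface those setup failures before the harness runs. This is a minimal sketch and not part of the validation scripts: `missing_oracles` and `python_ok` are hypothetical helpers, and the import names `pypsa`, `pandapower`, and `egret` are the usual module names for those packages.

```python
import importlib.util
import sys


def missing_oracles(modules=("pypsa", "pandapower", "egret")):
    """Return the oracle modules that cannot be found in the active interpreter."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


def python_ok(version_info=None):
    """True when the interpreter satisfies the documented Python 3.11+ floor."""
    version_info = sys.version_info if version_info is None else version_info
    return tuple(version_info[:2]) >= (3, 11)


if __name__ == "__main__":
    missing = missing_oracles()
    if missing or not python_ok():
        # Treat either condition as a setup failure, matching the text above.
        sys.exit(f"setup failure: missing={missing}, python_ok={python_ok()}")
```

Run it with the venv interpreter (`.venv/bin/python`) so it checks the same environment that holds the local wheel.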

Release gates

Run the full set below, in addition to the baseline checks, before publishing a release claim:

cargo test -p powerio-capi --no-default-features
cargo test -p powerio-capi --features arrow,gridfm,dist,pkg
cargo clippy -p powerio-capi --all-targets --no-default-features -- -D warnings
cargo clippy -p powerio-capi --all-targets --features arrow,gridfm,dist,pkg -- -D warnings
cargo build -p powerio-capi --release --features arrow,gridfm,dist,pkg
scripts/capi-header-parity.sh
scripts/capi-smoke.sh
POWERIO_CAPI=$PWD/target/release/libpowerio_capi.dylib \
  julia --project=../PowerIO.jl -e 'using Pkg; Pkg.test()'
cargo bench -p powerio-matrix --bench matrix -- 'matrix_bprime|matrix_ybus|dcopf_'
(cd benchmarks/asv && ../../.venv/bin/asv check -E existing:../../.venv/bin/python)
(cd benchmarks/asv && ../../.venv/bin/asv run --quick --show-stderr -E existing:../../.venv/bin/python --dry-run)
for target in matpower psse pslf powerio_json powerworld_aux pwb pwd; do
  cargo +nightly fuzz run "$target" -- -runs=1
done
bash benchmarks/run_validation.sh
bash benchmarks/run_rich_validation.sh

run_validation.sh checks the classic transmission paths against PowerModels.jl, ExaPowerIO.jl, egret, pandapower, and the full legacy reader to writer matrix; run_rich_validation.sh covers fields outside the MATPOWER row shape (branch terminal admittance, switches, current ratings, solution values, HVDC costs, load voltage models). GOC3 and Surge have no external oracle in this harness; the Rust parser, writer, routing, package, and round trip tests cover them. What the oracle legs prove, per format, is in the format fidelity chapter.

The gates do not prove every source format field is lossless. Known losses are part of the public behavior and surface as warnings.

Benchmark updates

Regenerate benchmark JSON before changing published tables:

julia --project=benchmarks benchmarks/bench_julia.jl --json
.venv/bin/python benchmarks/bench_parse.py --json <cases>
cargo bench -p powerio --bench parse -- "parse_aux_|parse_pwb_"
python3 benchmarks/extract_powerworld_bench.py
cargo bench -p powerio-matrix --bench matrix
python3 benchmarks/extract_matrix_bench.py
python3 benchmarks/render_tables.py
python3 benchmarks/render_tables.py --check

The ASV suite tracks Python wheel parse and matrix timing across git history. For an uncommitted worktree, smoke test it against the local venv:

cd benchmarks/asv
../../.venv/bin/asv check -E existing:../../.venv/bin/python
../../.venv/bin/asv run --quick --show-stderr -E existing:../../.venv/bin/python --dry-run

Do not update generated benchmark tables by hand. Update the snapshot environment described in the performance guide when publishing new numbers: commit, tree cleanliness, machine, OS, toolchain, Python stack, Julia stack, commands, fixtures, and optional local data.

Broad local corpora stay local. Pass them through documented environment variables or --root flags, review the reports under benchmarks/results/, and do not commit corpus paths or generated outputs.

file	shape	what
`A.mtx`	\(n \times m\)	signed incidence; column \(e\) has \(+1\) at from-bus, \(-1\) at to-bus
`L.mtx`	\(n \times n\)	generic Laplacian \(L = A \operatorname{diag}(b) A^\mathsf{T}\), singular with \(\operatorname{rank}(L) = n - 1\), \(\mathbf{1} \in \ker L\)
`L_grounded.mtx`	\((n-k) \times (n-k)\)	\(L\) with \(k\) reference rows and columns removed; SPD when every island is grounded
`BAt.mtx`	\(m \times n\)	flow map \(B A^\mathsf{T}\), where \(f = B A^\mathsf{T} \theta\)
`Cg.mtx`	\(n \times n_{\mathrm{gen}}\)	generator-to-bus incidence, one \(1\) per column
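
The relationships in the table can be checked by hand on a three-bus triangle. The sketch below uses plain Python lists and is only an illustration of the definitions: the helper names, the zero-based bus indices, the susceptance values, and reading \(B\) as \(\operatorname{diag}(b)\) are assumptions, and nothing here reads or writes the `.mtx` files.

```python
def incidence(n, branches):
    """n x m signed incidence: +1 at the from-bus, -1 at the to-bus."""
    A = [[0.0] * len(branches) for _ in range(n)]
    for e, (f, t) in enumerate(branches):
        A[f][e] = 1.0
        A[t][e] = -1.0
    return A


def laplacian(A, b):
    """L = A diag(b) A^T, computed entry by entry."""
    n, m = len(A), len(b)
    return [[sum(A[i][e] * b[e] * A[j][e] for e in range(m)) for j in range(n)]
            for i in range(n)]


def grounded(L, refs):
    """Drop the reference rows and columns from L."""
    keep = [i for i in range(len(L)) if i not in refs]
    return [[L[i][j] for j in keep] for i in keep]


def flow_map(A, b):
    """m x n map B A^T with B = diag(b), so that f = B A^T theta."""
    return [[b[e] * A[j][e] for j in range(len(A))] for e in range(len(b))]


# Triangle network: branches 0-1, 1-2, 0-2 with illustrative susceptances.
branches = [(0, 1), (1, 2), (0, 2)]
b = [10.0, 5.0, 4.0]
A = incidence(3, branches)
L = laplacian(A, b)
L_grounded = grounded(L, {0})   # bus 0 as the single reference
BAt = flow_map(A, b)

# Every row of L sums to zero, which is the statement that 1 is in ker L.
row_sums = [sum(row) for row in L]
```

Removing the reference row and column leaves a 2 by 2 matrix with positive diagonal and positive determinant, which is the SPD property the table states for a grounded, connected network.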
