PowerIO Guide
PowerIO is compiler infrastructure for power system data. Source formats parse
into typed models. Explicit, recorded passes normalize, validate, and lower
them, and writers emit any supported target format. The .pio.json package
records how a source was interpreted: model kind, provenance, source maps,
structured diagnostics, validation, and lowering history. Sparse matrices and
graph views are built from the same models for solver and analysis code. This
guide records behavior, conventions, and release checks. Rustdoc covers API
detail.
The rules these pages document:
- same format write back preserves retained source text;
- cross format conversion keeps the electrical core and reports losses as warnings;
- lowering between model families is an explicit, recorded pass, never an implicit side effect;
- matrix builders state sign, tap, shift, shunt, and reference bus conventions;
- C, Python, and Julia bindings share the same Rust core.
Transmission readers cover MATPOWER, PSS/E revisions 33 through 35,
PowerWorld AUX and PWB, PSLF EPC, PowerModels JSON, egret JSON, pandapower JSON,
PyPSA CSV folders, GO Challenge 3 JSON, Surge JSON, GridFM Parquet datasets, and
PowerIO JSON snapshots. PowerWorld PWD is a display artifact and uses the
display API. Distribution readers and writers live in powerio-dist for
OpenDSS, PowerModelsDistribution ENGINEERING JSON, and BMOPF JSON.
Where to look:
- Compiler IR: the BalancedNetwork and MulticonductorNetwork model families and the .pio.json package.
- PIO JSON schema: the .pio.json field reference, envelope and payload versioning, and row identity.
- Format fidelity: numeric conventions, the validation oracles, known limits per format, and the missing generator cost policy.
- Matrix outputs and the DC OPF bundle.
- Language APIs and Python.
- Performance and testing and release checks.
- Julia bindings: https://github.com/eigenergy/PowerIO.jl.
Rendered API docs (rustdoc) for all crates: https://powerio.dev.
Crates
| crate | responsibility |
|---|---|
| powerio | parsers, writers, Network, IndexedNetwork, normalization, format routing |
| powerio-matrix | sparse matrices, graph views, DC OPF bundle, GridFM datasets |
| powerio-dist | multiconductor distribution model and converters |
| powerio-pkg | .pio.json package envelope |
| powerio-cli | command line interface and TUI |
| powerio-py | PyO3 extension for the Python package |
| powerio-capi | C ABI for C, C++, Julia, and other foreign function interfaces |
Adding a format means adding one reader or writer at the hub, not pairwise
converters. IndexedNetwork is the dense \([0,n)\) analysis view derived from
a balanced Network; matrix builders work from that view. Code that maps
source bus ids to dense rows must use IndexedNetwork::bus_index; it must not
clamp ids or assume 1 based contiguous ids.
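The id rule above can be illustrated with a minimal Python sketch. It does not call IndexedNetwork::bus_index; build_bus_index and bus_row are hypothetical stand-ins that only show the intended behavior: external ids map to dense rows through a lookup table, and an unknown id is an error and is never clamped.

```python
def build_bus_index(bus_ids):
    """Map each external bus id to a dense row in [0, n), in table order."""
    index = {}
    for row, bus_id in enumerate(bus_ids):
        if bus_id in index:
            raise ValueError(f"duplicate bus id {bus_id!r}")
        index[bus_id] = row
    return index


def bus_row(index, bus_id):
    """Return the dense row for an external id; unknown ids are errors."""
    if bus_id not in index:
        # No clamping and no `bus_id - 1` shortcut: fail loudly instead.
        raise KeyError(f"unknown bus id {bus_id!r}")
    return index[bus_id]


# External ids are sparse and unordered here, so `id - 1` would be wrong.
index = build_bus_index([101, 7, 3004])
branch_rows = (bus_row(index, 3004), bus_row(index, 101))
print(branch_rows)
```

A branch between buses 3004 and 101 lands on dense rows 2 and 0 here, which neither a 1 based offset nor clamping would produce.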
Architecture
PowerIO treats case IO as a compiler pipeline: source formats parse into typed models, passes derive normalized or lowered views, and writers emit target artifacts.
- Compiler IR: the IR layers, the BalancedNetwork and MulticonductorNetwork model families, and the NetworkPackage (.pio.json) envelope, which carries explicit model kind, provenance, source maps, structured diagnostics, validation, operating points, and lowering.
- PIO JSON schema: the .pio.json field reference and the stability policy. The envelope and the IR payload are versioned independently (schema_version vs payload_schema_version); the payload shape follows the Rust models.
The package is implemented in the powerio-pkg crate.
GOC3 package construction uses operating_points to preserve the source time
series while keeping the payload itself static.
The PowerIO compiler IR
PowerIO is organized as a compiler for power system data: frontends parse source
formats into typed IR, passes normalize and lower it, and backends emit target
artifacts. The IR boundaries and the .pio.json package are below. The field
reference for the package is in
the PIO JSON schema guide.
There is no flattened universal Network mega-struct. PowerIO keeps concrete
model families separate. The package wraps one payload at a time with source,
diagnostic, validation, and lowering metadata.
Model families
PowerIO keeps two concrete static-grid IR families distinct. They share conventions, not types; code that needs both holds a package, not a union struct.
BalancedNetwork
powerio::BalancedNetwork (an alias of powerio::Network) is the scalar
positive-sequence model for transmission power flow, OPF, matrices, and graph
analysis. Every electrical quantity is a single f64, with no phase or conductor
dimension. External bus ids are not dense matrix indices; the dense solver view
is derived separately and preserves external ids. Loads and shunts are
first class records, not folded onto bus rows.
MulticonductorNetwork
powerio_dist::MulticonductorNetwork (an alias of powerio_dist::DistNetwork)
is the wire-coordinate model for conductor-level distribution. Bus ids are
strings; terminals are ordered string names; every element carries a terminal
map; grounding is explicit; units are SI and radians. A neutral is not just
another phase; it carries grounding and reduction semantics. Format defaults and
inferred facts are tracked, and unsupported objects are preserved rather than
dropped.
A balanced model cannot represent conductor-level asymmetry; a multiconductor model carries terminal and grounding data that has no place in a positive sequence struct. The two families never merge into one struct.
BMOPF JSON is the strict exchange format for the distribution family. The
.pio.json package uses the same MulticonductorNetwork model and wraps it
with compiler metadata: model kind, provenance, source maps, diagnostics,
validation, and lowering history.
The compiler package (.pio.json)
powerio_pkg::NetworkPackage is the readable envelope. It is the object that
records how a source was interpreted. Language bindings can pass the package
without guessing whether it holds balanced or multiconductor data. Binary .pio
is out of scope until the JSON package settles.
A package always carries:
- schema (URL) and schema_version (semver);
- producer metadata;
- model_kind, explicit and authoritative;
- model, the one typed payload, tagged by kind;
- origin and sources;
- source_maps;
- diagnostics;
- validation;
- summary;
- lowering_history;
- optional operating_points;
- optional derived metadata.
operating_points is a format neutral series of replayable field updates over
the package’s single static payload. Materializing one point returns a static
package with those updates applied and the series cleared.
GO Challenge 3 package construction fills this block from time_series_input:
the balanced payload holds the first interval, while every interval is available
as an operating point.
For balanced payloads, NetworkPackage::attach_normalized_solver_table_metadata
records the compact contract for
powerio::Network::to_normalized_solver_tables(): pass name, units, row counts,
dense bus ids, reference/component indices, branch to arc indices, and source row
provenance. The package does not duplicate the full table rows; it records enough
metadata for a compiler cache or sidecar artifact to verify table identity.
Explicit model kind
model_kind is a standalone field. A reader must never infer whether the payload
is balanced or multiconductor from which field is present. The payload enum is
also tagged by kind, so the payload is self-describing too;
NetworkPackage::kind_is_consistent asserts the two agree, and a reader should
reject a package where they do not.
Payload stability
The envelope and the payload are versioned independently. The nested
balanced_network / multiconductor_network payloads are serde snapshots of
the PowerIO Rust IR, declared by the package’s payload_schema /
payload_schema_version fields: additive IR growth bumps the payload minor,
moves or removals bump its major, and a reader rejects a foreign major before
computing on payload fields. Payload rows carry stable uid identities that
operating point updates resolve against. See
the PIO JSON schema guide.
Provenance and source maps
Origin distinguishes an in-memory model, a single file (with or without
retained source), a folder dataset, a partially decoded binary, a derived
product, or a composite. A SourceMapEntry points from a payload field to its
source with an element_path, a SourceRef into a declared source, a
mapping_kind (exact, defaulted, inferred, converted_units, lowered,
aggregated, split, synthetic, retained_extra), and a confidence.
Balanced source_ref.field values use canonical payload field names. Parser
bookkeeping that should not live in the IR payload (retained source text,
default-materialization records) is lifted into this layer rather than the raw
payload.
Structured diagnostics
Every finding carries a stable dotted code, a severity (debug, info,
warning, error, fatal; worst-last so a set’s dominant severity is its max),
the stage it came from, a human message, and where known an element_path, a
source_ref, a details object, and a suggested_action. Human-readable
warnings are rendered from these, not the other way around. Codes are namespaced
by leading segment (PARSE, READ, IR, VALIDATE, FIDELITY, LOWER,
EMIT, BINDING, PARTNER, PERF), with the conventional shape
NAMESPACE.SOURCE_OR_TARGET.SPECIFIC.
Lowering
Each pass that transforms one model into another appends a LoweringRecord
(input and output kind, options, assumptions, approximations, dropped fields,
diagnostics, validation status) to lowering_history. The record makes the
transformation explicit.
powerio_pkg::lower_multiconductor_to_balanced lowers transparent three phase
MulticonductorNetwork values into BalancedNetwork using the
FortescuePowerInvariant sequence convention. Neutral conductors are Kron
reduced before the sequence transform. One wire and two wire inputs,
transformers, untyped objects, missing phase references, and closed switches
return structured LOWER.MULTI_TO_BALANCED.* diagnostics. The package method
NetworkPackage::lower_multiconductor_to_balanced returns a derived balanced
package and appends the record. This pass is explicit only; readers, writers,
matrix builders, bindings, and MCP operations do not run it implicitly.
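The two numeric steps named above, Kron reduction of a neutral followed by a sequence transform, can be sketched in plain Python. This is not the powerio_pkg implementation. It assumes that FortescuePowerInvariant means the unitary \(1/\sqrt{3}\) scaling of the Fortescue matrix, and it assumes a single grounded neutral; all function names are hypothetical.

```python
import cmath
import math

ROT = cmath.exp(2j * math.pi / 3)  # the 120 degree operator, usually written a
S = complex(1 / math.sqrt(3))      # unitary scaling (assumed meaning of "power invariant")

# Columns are the zero, positive, and negative sequence basis vectors.
A = [
    [S, S, S],
    [S, S * ROT**2, S * ROT],
    [S, S * ROT, S * ROT**2],
]


def matmul(x, y):
    """Plain nested-list matrix product."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
        for i in range(len(x))
    ]


def conj_transpose(m):
    return [[m[j][i].conjugate() for j in range(len(m))] for i in range(len(m[0]))]


def kron_reduce(z, neutral):
    """Eliminate one grounded neutral conductor (its voltage is taken as zero)."""
    keep = [i for i in range(len(z)) if i != neutral]
    return [
        [z[i][j] - z[i][neutral] * z[neutral][j] / z[neutral][neutral] for j in keep]
        for i in keep
    ]


def to_sequence(z_abc):
    """Z_012 = A^H Z_abc A; A is unitary, so A^H is its inverse."""
    return matmul(conj_transpose(A), matmul(z_abc, A))


# A transposed line: equal phase self terms, equal mutual terms, one neutral.
zs, zm, zpn, zn = 0.3 + 1.0j, 0.1 + 0.4j, 0.1 + 0.3j, 0.5 + 1.2j
z_abcn = [
    [zs, zm, zm, zpn],
    [zm, zs, zm, zpn],
    [zm, zm, zs, zpn],
    [zpn, zpn, zpn, zn],
]
z_seq = to_sequence(kron_reduce(z_abcn, neutral=3))
z_pos = z_seq[1][1]  # equals zs - zm for this symmetric case
```

For this symmetric example the neutral correction cancels out of the positive sequence term, so z_pos equals zs - zm, and the off-diagonal sequence couplings vanish. An untransposed line would leave nonzero couplings, which a balanced payload cannot carry.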
Operating point materialization
NetworkPackage::materialize_operating_point(index) clones the package, applies
one point’s field updates to the typed payload, clears operating_points, drops
stale source maps and diagnostics for changed fields, recomputes validation, and
records a LoweringRecord with pass = "materialize-operating-point". If the
package already carried normalized solver table metadata, the metadata is
rebuilt for the updated static payload.
Versioning
schema_version is semver. Optional additive envelope fields land without a
version change (operating_points did); the minor bumps when a reader needs to
depend on a field being present; field moves bump the major or ship a migration. Unknown future top-level fields are tolerated
on read (ignored), so a package from a newer producer still deserializes when
the schema_version major version matches. A different major version is
rejected before payload use.
The .pio.json schema
.pio.json is the serialized form of powerio_pkg::NetworkPackage: a versioned
envelope around one PowerIO IR payload. The envelope shape and stability policy
are below. The crate is the implementation; compiler-ir.md is the architecture
note.
Two stability tiers
A .pio.json file has two parts with different stability promises.
- The envelope is every field except model. This is the versioned, documented surface: schema, schema_version, producer, model_kind, origin, sources, source_maps, diagnostics, validation, summary, lowering_history, operating_points, derived. Its shape changes only under the versioning policy below.
- The payload is the model field’s balanced_network / multiconductor_network object: the serde snapshot of the PowerIO Rust IR (powerio::Network / powerio_dist::DistNetwork). The payload is a declared contract of its own, named by the top-level payload_schema URL and versioned by payload_schema_version. A consumer that computes on payload fields pins the payload version; a tool that routes or audits packages pins the envelope version and can keep treating the payload as opaque.
The two versions are independent because they change at different rates and
break different consumers: the payload grows whenever the IR grows (a minor
payload_schema_version bump), while the envelope bookkeeping barely moves.
The payload is PowerIO’s own IR schema for both model kinds. Interchange
formats stay at the converter boundary: for distribution models the
multiconductor payload is the same model the BMOPF reader and writer translate
to and from, and powerio convert --to bmopf-json emits a standalone BMOPF
exchange file when one is needed. The BMOPF schema itself (bmopf-report) never
defines the payload.
Versioning policy (envelope)
- schema_version is semver. The current value is 0.1.1; the schema URL is https://powerio.dev/schema/pio-package/0.1.
- Optional additive envelope fields (a reader that ignores them loses nothing it relied on before) land without a version change; operating_points landed this way. The minor version bumps when a reader needs to depend on a field being present.
- Envelope field moves or removals bump the major version, or ship a migration.
- A reader tolerates unknown later top-level fields (they are ignored, not an error), so a package from a newer producer still loads. A later version can preserve them in an extras map instead of dropping them.
- A reader accepts same major schema_version values and rejects a different major version before using the payload.
- Every package states producer.version and schema_version.
Versioning policy (payload)
- payload_schema names the payload contract per model kind: https://powerio.dev/schema/pio-payload-balanced/1 and https://powerio.dev/schema/pio-payload-multiconductor/1. The URLs are identifiers, not fetch locations (the JSON Schema $id convention).
- payload_schema_version is semver, currently 1.0.0 for both kinds. Additive optional fields bump the minor; field moves or removals bump the major.
- A reader rejects a payload_schema_version with a different major (or one that does not parse as semver) before computing on payload fields. Absent fields (every package written before 0.1.1) are accepted; such payloads predate the declared contract.
- The payload field tables are the rustdoc of powerio::Network and powerio_dist::DistNetwork; the serde snapshot of those structs is the normative shape.
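A reader-side version gate that follows the envelope and payload policies can be sketched as below. The code works on a package already parsed into a Python dict and is not the powerio reader. check_versions and accepts are hypothetical names, and the semver parser is deliberately simplified: it handles MAJOR.MINOR.PATCH only, without pre-release or build tags.

```python
READER_ENVELOPE_VERSION = "0.1.1"
READER_PAYLOAD_MAJOR = 1


def parse_semver(text):
    """Parse MAJOR.MINOR.PATCH; pre-release and build tags are not handled."""
    if not isinstance(text, str):
        raise ValueError(f"not a semver string: {text!r}")
    parts = text.split(".")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"not a semver string: {text!r}")
    return tuple(int(p) for p in parts)


def check_versions(package):
    """Raise ValueError when either version gate fails."""
    # An absent envelope version defaults to the current one on read.
    envelope = parse_semver(package.get("schema_version", READER_ENVELOPE_VERSION))
    if envelope[0] != parse_semver(READER_ENVELOPE_VERSION)[0]:
        raise ValueError(f"unsupported envelope major {envelope[0]}")
    payload_version = package.get("payload_schema_version")
    if payload_version is None:
        return  # written before 0.1.1: predates the declared payload contract
    if parse_semver(payload_version)[0] != READER_PAYLOAD_MAJOR:
        raise ValueError(f"unsupported payload version {payload_version!r}")


def accepts(package):
    try:
        check_versions(package)
    except ValueError:
        return False
    return True


# Unknown top-level fields are simply never looked at.
newer = {"schema_version": "0.2.0", "payload_schema_version": "1.3.0", "future_field": {}}
legacy = {"schema_version": "0.1.0"}
foreign = {"schema_version": "0.1.1", "payload_schema_version": "2.0.0"}
```

Both gates run before any payload field is touched, which is the ordering the policy asks for.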
Row identity
Every row of the balanced payload tables (buses, loads, shunts,
branches, switches, generators, storage, hvdc, transformers_3w)
carries a uid string: the source record uid where the format defines one
(GOC3), and a {table}:{row} value synthesized at package build otherwise. A
synthesized uid records the row the element had when the package was built and
sticks to the element from then on. Uids are unique per table; a duplicate is a
validation error. Operating point updates resolve against these identities
(below). Rows in packages written before 0.1.1 carry no uid, which is what
keeps their row-addressed operating points valid.
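The identity rule can be pictured with a small sketch over plain dict rows. It is an illustration, not the package builder: assign_uids and duplicate_uids are hypothetical helpers, the bus field is made up for the example, and the row number in the synthesized value is assumed to be zero based.

```python
def assign_uids(table, rows):
    """Fill in '{table}:{row}' for rows that lack a source uid (row assumed zero based)."""
    for row_index, row in enumerate(rows):
        row.setdefault("uid", f"{table}:{row_index}")
    return rows


def duplicate_uids(rows):
    """Return uids that appear more than once in one table."""
    seen, duplicates = set(), []
    for row in rows:
        if row["uid"] in seen:
            duplicates.append(row["uid"])
        seen.add(row["uid"])
    return duplicates


loads = assign_uids("loads", [{"bus": 101, "p": 10.0}, {"bus": 7, "p": 4.0}])
# Removing the first row later does not renumber the survivor.
remaining = loads[1:]
```

After the first row is dropped, the surviving element still carries loads:1 although it now sits at row 0, so an update addressed by uid keeps finding it.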
Explicit model kind
model_kind is a standalone top-level field and is authoritative. A reader
must branch on it and must not infer the payload kind from which field is
present. The payload is additionally self-describing: model is tagged by
kind, so model.kind and model_kind carry the same value.
NetworkPackage::kind_is_consistent asserts the two agree; a reader should
reject a package where they disagree.
"model_kind": "balanced",
"model": { "kind": "balanced", "balanced_network": { "...": "..." } }
model_kind values: balanced, multiconductor (the enum is non-exhaustive;
later families can be added).
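A reader can enforce this rule on parsed JSON before touching the payload. The sketch below is not NetworkPackage::kind_is_consistent; payload_for is a hypothetical helper that branches on model_kind, checks the payload tag against it, and only then returns the matching {kind}_network object.

```python
import json


def payload_for(package):
    """Return (model_kind, payload) after checking that both kind fields agree."""
    model_kind = package["model_kind"]  # authoritative; never inferred
    model = package["model"]
    if model.get("kind") != model_kind:
        raise ValueError(
            f"model.kind {model.get('kind')!r} disagrees with model_kind {model_kind!r}"
        )
    key = f"{model_kind}_network"
    if key not in model:
        raise ValueError(f"model has no {key!r} payload")
    return model_kind, model[key]


text = '{"model_kind": "balanced", "model": {"kind": "balanced", "balanced_network": {"buses": []}}}'
kind, network = payload_for(json.loads(text))
```

The lookup key is derived from model_kind, so the reader never discovers the payload kind by probing which field happens to be present.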
Envelope reference
| field | type | required | notes |
|---|---|---|---|
| schema | string (URL) | yes | identifies the package format; defaults to the current URL on read |
| schema_version | string (semver) | yes | envelope version; defaults to current on read |
| producer | object | yes | {tool, version, git_commit?, features[]} |
| package_id | string | no | stable content id, e.g. "sha256:..."; unset by the scaffold |
| created_at | string (RFC 3339) | no | unset by default for deterministic output |
| model_kind | enum | yes | balanced or multiconductor; authoritative |
| payload_schema | string (URL) | no | declared payload contract for model_kind; absent on pre-0.1.1 packages |
| payload_schema_version | string (semver) | no | payload version; a different major is rejected on read |
| model | object | yes | {kind, <kind>_network}; follows the Rust model payload |
| origin | object | yes | tagged by kind: in_memory, file, folder, binary_file, derived, or composite |
| sources | array | no | declared source artifacts: {id, kind, path?, format?, hash?} |
| source_maps | array | no | {element_path, source_ref, mapping_kind, confidence} |
| diagnostics | array | no | structured findings (see below) |
| validation | object | yes | {status, counts, passes[]} |
| summary | object | yes | {elements{}, topology?, units?} |
| lowering_history | array | no | LoweringRecord per pass |
| operating_points | object | no | replayable updates over the one static payload |
| derived | object | no | optional matrix stats, normalized solver table metadata, and cache keys |
Operating points
operating_points records a time axis and an ordered list of payload field
updates. A point names a table, a row identity and/or a zero based row, and the
fields to overwrite. Materializing a point clones the static payload, applies
those field updates, and clears operating_points in the returned package.
Updates resolve by identity first. When the referenced table carries uid
values, element.source_uid is authoritative: it selects the row, a present
element.row must agree with the resolved row, and an unknown or duplicated
uid is an error (reported by validation and fatal to materialization). A
producer that knows the identity can omit row entirely. When the table
carries no uids (packages written before 0.1.1), source_uid is advisory and
row addresses the update alone. An update may not overwrite uid itself, and
an element ref with neither row nor source_uid does not parse.
The block shape is:
| field | type | notes |
|---|---|---|
| time_axis.periods | integer | number of available operating points |
| time_axis.duration_hours | array of numbers | optional per period duration |
| time_axis.labels | array of strings | optional labels, such as "1", "2", … |
| points[] | array | one replayable state |
| points[].index | integer | zero based period index; addresses time_axis.duration_hours and time_axis.labels |
| points[].updates[] | array | row field updates to apply for this point |
| updates[].element.table | string | payload table name, such as generators, loads, branches, or hvdc |
| updates[].element.row | integer | zero based row; optional when source_uid is present, then a consistency check |
| updates[].element.source_uid | string | the target row’s payload identity (uid); authoritative when the table carries uids |
| updates[].fields | object | field names and JSON values to overwrite |
| metadata | object | optional series or point metadata |
GO Challenge 3 packages use this block for the scheduling time series. The
static model reflects the first interval that can be represented by
Network; operating_points carries replayable updates for every interval.
NetworkPackage::materialize_operating_point(index) returns a new static
package with origin.kind = "derived" and
origin.pass = "materialize-operating-point".
"operating_points": {
"time_axis": { "periods": 2, "duration_hours": [1.0, 1.0], "labels": ["1", "2"] },
"points": [
{ "index": 0, "updates": [] },
{ "index": 1,
"updates": [
{ "element": { "table": "loads", "row": 0, "source_uid": "device_1" },
"fields": { "p": 12.5, "q": 3.2 } }
] }
],
"metadata": { "source_format": "goc3-json" }
}
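The resolution and materialization rules can be traced with a sketch over a parsed package. It is a simplified stand-in for NetworkPackage::materialize_operating_point, with hypothetical function names: it resolves each update by identity first, applies the fields, clears the series, and marks the origin as derived. It does not prune source maps or diagnostics, recompute validation, or append a LoweringRecord.

```python
import copy


def resolve_row(rows, element):
    """Resolve an element ref to a zero based row, identity first."""
    uid, row = element.get("source_uid"), element.get("row")
    if uid is None and row is None:
        raise ValueError("element ref needs row or source_uid")
    if uid is not None and any("uid" in r for r in rows):
        matches = [i for i, r in enumerate(rows) if r.get("uid") == uid]
        if len(matches) != 1:
            raise ValueError(f"unknown or duplicated uid {uid!r}")
        if row is not None and row != matches[0]:
            raise ValueError(f"row {row} disagrees with uid {uid!r}")
        return matches[0]
    # No usable identity: the row addresses the update alone.
    if row is None or not 0 <= row < len(rows):
        raise ValueError("no usable row address")
    return row


def materialize(package, index):
    """Return a static copy with one point applied and the series cleared."""
    out = copy.deepcopy(package)
    series = out.pop("operating_points")
    point = next(p for p in series["points"] if p["index"] == index)
    network = out["model"][out["model_kind"] + "_network"]
    for update in point["updates"]:
        if "uid" in update["fields"]:
            raise ValueError("an update may not overwrite uid")
        rows = network[update["element"]["table"]]
        rows[resolve_row(rows, update["element"])].update(update["fields"])
    out["origin"] = {"kind": "derived", "pass": "materialize-operating-point"}
    return out


package = {
    "model_kind": "balanced",
    "model": {"kind": "balanced", "balanced_network": {
        "loads": [{"uid": "device_1", "p": 10.0, "q": 2.0}]}},
    "origin": {"kind": "file"},
    "operating_points": {
        "time_axis": {"periods": 2},
        "points": [
            {"index": 0, "updates": []},
            {"index": 1, "updates": [
                {"element": {"table": "loads", "row": 0, "source_uid": "device_1"},
                 "fields": {"p": 12.5, "q": 3.2}}]},
        ],
    },
}
static = materialize(package, 1)
```

The input package is left untouched, and the returned copy holds the second interval as an ordinary static payload.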
Derived metadata
derived.normalized_solver_tables records the compact identity metadata for
powerio::Network::to_normalized_solver_tables() without embedding every table
row in the package. The full tables are a derived artifact; this metadata lets a
compiler cache prove it was built from the same lowering pass and row order.
The block carries:
- pass: "balanced-to-normalized-solver-tables";
- units: per unit power, per unit voltage, radian angles, per unit impedance and admittance, zero based dense indices;
- row_counts: counts for buses, loads, shunts, branches, switches, arcs, generators, storage, and HVDC rows;
- bus_ids, reference_bus_indices, and component_labels;
- branch_from_arc_indices and branch_to_arc_indices;
- source_rows: source row indices for rows that survived normalization, with null for synthetic rows such as 3-winding star buses and branches.
Diagnostics
Each diagnostic carries a stable dotted code, a severity (debug, info,
warning, error, fatal; ordered worst-last), the stage it came from
(parse, read, canonicalize, validate, lower, emit, bind,
partner), a human message, and where known an element_path, a source_ref,
a details object, a suggested_action, and a safe_to_ignore list. Code
namespaces by leading segment: PARSE, READ, IR, VALIDATE, FIDELITY,
LOWER, EMIT, BINDING, PARTNER, PERF.
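The worst-last severity order makes the dominant severity of a set a plain maximum. A minimal sketch over diagnostic dicts follows; the helper names are hypothetical and the three diagnostic codes are made up for the example. They are not codes that PowerIO emits.

```python
from collections import Counter

SEVERITIES = ("debug", "info", "warning", "error", "fatal")  # ordered worst last
RANK = {name: rank for rank, name in enumerate(SEVERITIES)}


def dominant_severity(diagnostics):
    """Return the worst severity in the set, or None for an empty set."""
    if not diagnostics:
        return None
    return max((d["severity"] for d in diagnostics), key=RANK.__getitem__)


def severity_counts(diagnostics):
    """Count findings per severity, including zero entries."""
    counts = Counter(d["severity"] for d in diagnostics)
    return {name: counts.get(name, 0) for name in SEVERITIES}


def namespace(code):
    """Leading segment of a dotted code, e.g. the PARSE in PARSE.X.Y."""
    return code.split(".", 1)[0]


# Hypothetical codes, shaped as NAMESPACE.SOURCE_OR_TARGET.SPECIFIC.
findings = [
    {"code": "READ.EXAMPLE.IGNORED_TABLE", "severity": "warning", "stage": "read"},
    {"code": "VALIDATE.EXAMPLE.DUPLICATE_UID", "severity": "error", "stage": "validate"},
    {"code": "PARSE.EXAMPLE.NOTE", "severity": "info", "stage": "parse"},
]
worst = dominant_severity(findings)
```

The per-severity tally has the same shape as the counts object shown under validation in the example package.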
Source maps
A source_map entry records where a canonical field came from: an element_path
(a JSON pointer, or a best-effort locator in v0.1), a source_ref into a declared
source, a mapping_kind (exact, defaulted, inferred, converted_units,
lowered, aggregated, split, synthetic, retained_extra), and a
confidence (exact, high, medium, low). Balanced packages emit source
maps for stable bus, load, shunt, branch, and generator fields. Balanced
source_ref.field values use the same canonical field names as the payload, so
they can be compared directly with element_path. When a source format folds
several canonical elements into one source row, the source map records that
relation with another mapping kind; MATPOWER load and shunt fields use
mapping_kind = split and point to the bus record while keeping fields such as
p, q, g, and b. Values that the source format does not carry are not
mapped as exact; MATPOWER base_frequency has no source map. When a
multiconductor network is packaged, its defaulted fields lift into source maps
with mapping_kind = defaulted, and its retained source becomes
origin.retained_source. Validation diagnostics attach the matching source_ref
when the package has a source map for the reported field.
NetworkPackage::lower_multiconductor_to_balanced(options) returns a new
balanced package with origin.kind = derived and
origin.pass = "multiconductor-to-balanced". It preserves the parent
lowering_history and appends a LoweringRecord whose options, assumptions,
approximations, dropped fields, diagnostics, and validation status describe the
pass. Lowered balanced source maps use lowered, aggregated,
converted_units, synthetic, and defaulted mapping kinds. The pass is never
implicit during package readback, format conversion, matrix construction,
bindings, or MCP operations.
Example
{
"schema": "https://powerio.dev/schema/pio-package/0.1",
"schema_version": "0.1.1",
"producer": { "tool": "powerio", "version": "0.5.1" },
"model_kind": "multiconductor",
"payload_schema": "https://powerio.dev/schema/pio-payload-multiconductor/1",
"payload_schema_version": "1.0.0",
"model": {
"kind": "multiconductor",
"multiconductor_network": {
"base_frequency": 60.0,
"loads": [
{ "name": "l1", "bus": "b1", "configuration": "wye",
"voltage_model": { "model": "zip", "v_nom": [230.0], "alpha_z": [0.5], "...": "..." } }
]
}
},
"origin": { "kind": "file", "format": "dss", "retained_source": true },
"sources": [ { "id": "src0", "kind": "file", "format": "dss" } ],
"source_maps": [
{ "element_path": "/model/multiconductor_network/vsource.source#basekv",
"source_ref": { "source_id": "src0", "field": "basekv" },
"mapping_kind": "defaulted", "confidence": "high" }
],
"validation": { "status": "ok", "counts": { "fatal": 0, "error": 0, "warning": 0, "info": 0, "debug": 0 } },
"summary": { "elements": { "buses": 1, "loads": 1 }, "units": { "power": "W/var", "angle": "radians" } }
}
Format fidelity and validation
How powerio’s readers and writers are validated, the conventions they follow, and the known limits. The headline fidelity table is in the top level README; this document covers the conventions and the proof behind it.
Conventions
powerio’s numeric conventions match MATPOWER and PowerModels.jl. The reference implementations and the matching powerio code:
| Quantity | Convention | Reference | powerio |
|---|---|---|---|
| Bus type codes | \(1 = \mathrm{PQ}\), \(2 = \mathrm{PV}\), \(3 = \mathrm{ref}\), \(4 = \mathrm{isolated}\) | MATPOWER idx_bus | network::BusType |
| Impedance, susceptance | per unit on baseMVA, never rescaled | MATPOWER idx_brch (BR_B already per unit) | format::matpower |
| Branch terminal admittance | MATPOWER BR_B splits half to each end; richer sources use canonical g_fr/b_fr/g_to/b_to; one-value targets receive the total susceptance projection | PowerModels matpower.jl; MATPOWER idx_brch | network::BranchCharging, Branch::terminal_charging |
| Tap ratio | 0 means a line (treated as 1); nonzero is a transformer | MATPOWER idx_brch TAP | Branch::effective_tap |
| Phase shift, angle | degrees in the model; PowerModels JSON carries radians | PowerModels make_per_unit! | format::powermodels |
| Angle limits | angmin/angmax default ±360 (unconstrained) | MATPOWER idx_brch ANGMIN/ANGMAX | Branch::has_angle_limits |
| pandapower/PyPSA impedance | line r/x are converted between per unit and ohms with \(Z_{\mathrm{base}} = V_{\mathrm{kV}}^2 / \mathrm{baseMVA}\); pandapower line charging is capacitance per km (c_nf_per_km, converted via \(2\pi f \ell Z_{\mathrm{base}}\)); PyPSA line b is siemens | pandapower PPC conversion, PyPSA static components | format::pandapower, format::pypsa |
| dcline Pt/Qf/Qt | sign flips vs MATPOWER | PowerModels matpower.jl | format::powermodels |
| Generator cost | \(c_2 p^2 + c_1 p\) maps to \(q = 2c_2\), \(c = c_1\); coefficients high order first | MATPOWER idx_cost, egret matpower_parser | GenCost::quadratic |
| source_id | ["bus", id] for bus-tied elements | PowerModels matpower.jl | format::powermodels |
| PSLF shunts | EPC pu_mw/pu_mvar are per unit on sbase; Network::Shunt stores MW/MVAr at \(V = 1\) | paired EPC/RAW case checks | format::pslf |
| GO Challenge 3 time series | Network stores the first interval as a static case; .pio.json packages carry replayable later intervals in operating_points | Rust GOC3 package tests | format::goc3, powerio_pkg::operating |
| Surge angles | Surge JSON carries voltage angles, phase shifts, and angle limits in radians; Network stores degrees | Rust Surge round trip tests | format::surge |
egret’s own MATPOWER parser uses the same reductions (bus type as
matpower_bustype, polynomial coefficients reversed to a {degree: coefficient}
map, piecewise to [[mw, cost], ...], impedances left per unit), which is why a
MATPOWER case taken through powerio to egret JSON matches egret’s direct import.
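Three rows of the conventions table reduce to one-line formulas: the tap rule, the base impedance \(Z_{\mathrm{base}} = V_{\mathrm{kV}}^2 / \mathrm{baseMVA}\), and the quadratic cost mapping. The sketch below restates them in Python for checking by hand. The function names are hypothetical and are not the powerio API; Branch::effective_tap and GenCost::quadratic are only the documented counterparts.

```python
def effective_tap(tap):
    """A stored tap of 0 marks a line and is treated as ratio 1."""
    return 1.0 if tap == 0 else tap


def z_base_ohm(v_kv, base_mva):
    """Base impedance in ohms: V_kV^2 / baseMVA."""
    return v_kv ** 2 / base_mva


def pu_to_ohm(z_pu, v_kv, base_mva):
    return z_pu * z_base_ohm(v_kv, base_mva)


def ohm_to_pu(z_ohm, v_kv, base_mva):
    return z_ohm / z_base_ohm(v_kv, base_mva)


def quadratic_terms(c2, c1):
    """Map c2*p^2 + c1*p to (q, c) with q = 2*c2 and c = c1."""
    return 2.0 * c2, c1


# A 0.01 pu reactance on a 230 kV bus with a 100 MVA system base.
x_ohm = pu_to_ohm(0.01, 230.0, 100.0)
q, c = quadratic_terms(0.11, 5.0)
```

On a 230 kV bus with a 100 MVA base the base impedance is 529 ohms, so 0.01 pu is about 5.29 ohms. The factor of two in q is what a \(\tfrac{1}{2} q p^2 + c p\) objective form would need to reproduce \(c_2 p^2 + c_1 p\).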
Validation
The harness script benchmarks/run_validation.sh checks powerio against five independent
tools. Every classic text reader and writer runs under an oracle: the conversion
matrix covers MATPOWER, PSS/E, and egret sources against all five legacy text
targets, every PowerWorld output is read back and bridged to PowerModels JSON,
and the PMread leg covers the PowerModels JSON read side. pandapower JSON and
PyPSA CSV folders have dedicated import validators because pandapower has its
own JSON schema and PyPSA is a directory format. Both validators cover the write
direction only; the pandapower JSON and PyPSA readers have no external oracle.
Those two readers, GO Challenge 3 JSON, Surge JSON, and the remaining
source/target pairs (PowerModels JSON and PowerWorld sources into the
non-PowerModels targets) rest on the Rust round trip suite.
- PowerModels.jl (validate_powermodels.jl, validate_psse.jl, core_json.jl). Reads MATPOWER, PowerModels JSON, and PSS/E. The MATPOWER to PowerModels JSON path is checked field by field after per unit normalization; the others by element counts and demand/generation/shunt totals.
- egret (validate_egret.py). The oracle for egret output, which PowerModels cannot read: it loads powerio’s egret JSON with egret.data.model_data.ModelData and compares counts, totals, and generator cost curves.
- ExaPowerIO.jl (validate_exapowerio.jl). Reads MATPOWER through powerio’s C ABI and compares value for value.
- pandapower (validate_pandapower.py, validate_pandapower_converter.py). Cross-checks MATPOWER parse/\(Y_{\mathrm{bus}}\) and imports powerio’s pandapower JSON output back into pandapower, comparing counts and \(Y_{\mathrm{bus}}\).
- PyPSA (validate_pypsa.py). Imports powerio’s PyPSA CSV folder output and checks counts, totals, line r/x/b rebased from ohms on the bus0 voltage, and transformer r/x/tap_ratio/s_nom rebased from the transformer s_nom base; a line/transformer split mismatch fails the case.
The conversion matrix
benchmarks/validate_matrix.py converts each source to every legacy text target and checks
the electrical core of the output (bus/branch/generator counts and the per unit
demand, generation, and shunt totals) against the source’s own core, read by an
independent oracle. The diagonal is checked byte exact: writing back to the source
format reproduces the file. Sources use the real native files where they exist
(the vendored PSS/E .raw and egret .json) and representative MATPOWER cases
otherwise: basic (case9), shunts and transformers (case14, case30), size
(case118, case2869pegase), HVDC with a mixed piecewise/polynomial gencost
(t_case9_dcline), and a piecewise-cost case (pglib_opf_case5_pjm).
All 65 legacy text cells pass (13 source cases × 5 targets). The core is preserved by every writer regardless of fidelity tier, so it is the invariant checked across the whole matrix; cost, HVDC, and angle limits are tier specific and covered by the dedicated checks above and the Rust suite. The pandapower JSON and PyPSA CSV validators run alongside this matrix and are reported as separate legs.
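The invariant checked in every cell can be pictured as a comparison of two small summaries. The sketch below is not benchmarks/validate_matrix.py. The case layout (dicts with buses, branches, generators, loads, and shunts lists) and the field names inside the rows are assumptions made for the example.

```python
import math


def electrical_core(case):
    """Summarize a case as element counts plus per unit totals."""
    return {
        "counts": (len(case["buses"]), len(case["branches"]), len(case["generators"])),
        "totals": (
            sum(load["p"] for load in case["loads"]),      # demand
            sum(gen["p"] for gen in case["generators"]),   # generation
            sum(shunt["b"] for shunt in case["shunts"]),   # shunt susceptance
        ),
    }


def core_matches(source, converted, rel_tol=1e-9):
    """True when counts are equal and totals agree within tolerance."""
    a, b = electrical_core(source), electrical_core(converted)
    return a["counts"] == b["counts"] and all(
        math.isclose(x, y, rel_tol=rel_tol, abs_tol=1e-12)
        for x, y in zip(a["totals"], b["totals"])
    )


source = {
    "buses": [{}, {}, {}],
    "branches": [{}, {}],
    "generators": [{"p": 0.9}, {"p": 0.4}],
    "loads": [{"p": 0.5}, {"p": 0.75}],
    "shunts": [{"b": 0.19}],
}
# A converted case that reorders rows but keeps the same core.
converted = dict(source, loads=[{"p": 0.75}, {"p": 0.5}])
dropped = dict(source, generators=[{"p": 1.3}])
```

Reordered rows still match because only counts and totals are compared, while a writer that merged two generators into one would fail on the count even though the generation total is unchanged.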
Running it
cargo build --release -p powerio-capi
python3.12 -m venv .venv
.venv/bin/python -m pip install --upgrade pip maturin -r benchmarks/requirements.txt
env VIRTUAL_ENV=$PWD/.venv .venv/bin/maturin develop --release
julia --project=benchmarks -e 'using Pkg; Pkg.instantiate()'
bash benchmarks/run_validation.sh
The oracle tools (PowerModels.jl, egret, ExaPowerIO.jl, pandapower, PyPSA) are
benchmark scoped: they are declared in benchmarks/Project.toml and
benchmarks/requirements.txt, never as dependencies of the powerio package.
benchmarks/run_validation.sh requires the Python oracles to import in the
selected Python 3.11+ environment; a missing PyPSA, pandapower, or egret import
is a setup failure.
Known limits
Write side losses are reported in Conversion::warnings; the pandapower and
PyPSA readers itemize what they ignore in Parsed::warnings (read_warnings
in Python), naming the table and counting the affected rows.
convert_file/convert_str fold the read warnings into Conversion::warnings.
- PSS/E reads revisions 33, 34, and 35. 3-winding transformers are kept as typed records and star-lowered into \(Y_{\mathrm{bus}}\)/connectivity by the indexed view; two-terminal DC lines map to the neutral HVDC model. A switched shunt keeps its steady-state susceptance BINIT as the shunt b and carries its mode, voltage band, regulated bus, and step blocks. A 2-winding transformer’s magnetizing susceptance round-trips through MAG2 (\(\mathrm{CM} = 1\)). Impedances are assumed on the system base (\(\mathrm{CZ} = \mathrm{CW} = 1\)).
- PowerWorld .aux is read and written. .pwb binary cases are read only, and .pwd display files parse through the separate display API. .aux carries no system base, so the reader defaults to 100 MVA. No third-party .aux reader exists, so that writer is validated by powerio’s own read back plus a PowerModels JSON bridge. The .pwb layouts are reverse engineered; the decode evidence and coverage matrix are maintainer notes at powerio/src/format/powerworld/FORMAT.md.
- PSLF .epc is read and written. The reader maps the static power flow core: buses, lines, two- and three-winding transformers, generators, loads, fixed shunts, controlled shunts at initial g/b, and limited two-terminal DC records. Three-winding transformers are kept as typed records and star-lowered into \(Y_{\mathrm{bus}}\)/connectivity by the indexed view. Unsupported sections stay in the retained source text and emit warnings.
- MATPOWER canonical output (for a case that did not originate as MATPOWER) omits dcline; the byte exact echo path keeps it when the case was read from MATPOWER. Storage is written as an mpc.storage block.
- egret output drops HVDC and storage. The reader takes the power flow ModelData subset (numeric bus ids, scalar values); unit commitment cases (system.time_keys) are rejected.
- pandapower JSON writes the power flow core as split oriented pandapowerNet tables. Line ohms are referred to the from bus voltage, as pandapower’s build_branch reads them; a bus with baseKV 0 writes vn_kv set to \(1\) (warned) so the per unit impedances survive. A branch with a tap, a shift, or terminals on two voltage levels becomes a trafo row with tap_changer_type = "Ratio"; its MATPOWER charging b rides as one bus shunt per terminal (warned, \(Y_{\mathrm{bus}}\) exact) because pandapower’s magnetizing model is inductive only. The file is labeled with f_hz set to \(50\) and c_nf_per_km compensated, so a 60 Hz source keeps its exact \(Y_{\mathrm{bus}}\). Reference buses without a generator get an ext_grid row, which reads back as a Ref generator. The writer also warns on dropped HVDC, storage, capability columns, angle limits, rate B/C, non-finite values (written as JSON null), and costs poly_cost cannot carry. The reader models ratio, ideal, and pandapower 2.x tap changers, off-nominal vn_hv_kv/vn_lv_kv, lv side taps, and shunt vn_kv scaling; ZIP load composition, line shunt conductance, magnetizing branches, tabular tap changers, reactive cost coefficients, and every other non-empty table warn with row counts.
- PyPSA CSV folders are canonicalized directory outputs, not byte exact text conversions. Covered: static buses, generators, loads, lines (ohms on the bus0 voltage, as PyPSA computes them), transformers (rebased between the system base and the transformer s_nom), shunts, storage units, and base MVA. The reader maps links to HVDC with a warning, requires v_nom and balanced CSV quoting, and warns on stores, nonzero g, and every CSV it does not read (time series, carriers). The writer keys tables by bus name, falling back to the numeric id when names collide (warned), and warns on dropped HVDC, q limits, mbase, transformer angle limits, rate B/C, isolated buses, non-finite p limits, and slackless or normalized networks. Nonnumeric bus names read back as dense synthetic ids with the originals on Bus.name.
- GO Challenge 3 JSON reads ARPA-E GO Competition Challenge 3 input data into the balanced transmission model. Network is static, so the reader maps the first time interval into generator/load bounds and status fields, keeps the original JSON for byte exact source echo, and warns about scheduling data left in the retained source. There is no canonical GOC3 writer from an arbitrary Network; TargetFormat::Goc3Json only succeeds as a same format source echo. When a GOC3 Network is wrapped in .pio.json, powerio-pkg extracts the full input time axis into operating_points. Materializing one point applies those updates to the static payload and clears the series.
- Surge JSON reads and writes the versioned surge-json network document. The reader maps buses, loads, fixed shunts, branches, generators, storage, and HVDC links into Network, retains the original source for same format echo, and warns about source sections that stay only in the retained document. The writer emits a canonical Surge network body for the supported power flow core; richer MATPOWER generator capability or ramp columns and unsupported cost shapes are reported in Conversion::warnings.
- gridfm (read, the gridfm feature in powerio-matrix) reconstructs a Network from the gridfm-datakit Parquet dataset: lossy, but it recovers everything a power flow needs. That is bus types/voltages/limits, nodal load and shunt totals, generator dispatch and bounds, branch r/x/b/tap/shift/rate_a/angle limits, and baseMVA; it can’t recover original bus ids (synthesized 1..n), per element load/shunt granularity (folded one synthetic element per bus), piecewise/cubic gen costs (read as none), or HVDC/storage. Because the writer stores the effective tap, a branch with unit tap and no phase shift is read back as a line (raw \(\mathrm{tap} = 0\)); a unity ratio, zero shift transformer in the source is thus read as a line (the power flow is identical). The losses are returned as a warnings list on GridfmRead, mirroring Conversion::warnings. The same direction writer is documented in the top level README.
Missing generator costs
PSS/E .raw files carry no generator cost curves. Converting a PSS/E case to
MATPOWER writes mpc.gen and omits mpc.gencost with a warning; powerio does
not invent zero costs. A workflow that needs costs must pick an explicit policy:
powerio convert case.raw --from psse --to matpower --missing-gen-cost zero -o case.m
powerio dcopf case.m -o out --missing-gen-cost quadratic --default-gen-cost 0.01,2.0,0.0
powerio gridfm case.raw --from psse -o out --missing-gen-cost zero
- preserve: leave missing costs absent (default for conversion and GridFM export);
- require: fail on an in-service generator without cost (default for DC OPF export);
- zero: fill missing rows with a MATPOWER polynomial cost [0, 0, 0];
- quadratic: fill missing rows with --default-gen-cost C2,C1,C0.
--gen-cost-csv overrides costs by generator row before the missing-cost policy
runs. The header is gen_index,bus,c2,c1,c0,startup,shutdown: gen_index is
zero based in the current generator table, bus must match that generator’s bus
id (catching stale tables after reordering), and startup/shutdown default to
zero. GridFM stores cp0/cp1/cp2 columns; missing or unsupported costs still
write zero columns, and the manifest separates missing_cost_gens,
unsupported_cost_gens, zeroed_cost_gens, and synthesized_gen_costs.
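The four policies can be read as a small decision table. The sketch below is illustrative Python, not powerio's implementation: apply_missing_cost_policy and its row layout are hypothetical, costs are (c2, c1, c0) tuples, and None stands for a generator without cost data. Leaving an out-of-service generator untouched under require is an assumption drawn from the "in-service" wording above.

```python
def apply_missing_cost_policy(costs, in_service, policy, default=None):
    """Return a new cost list with missing rows handled per `policy`.

    costs      : list of (c2, c1, c0) tuples, or None where cost is missing
    in_service : list of bools, one per generator
    policy     : "preserve", "require", "zero", or "quadratic"
    default    : (c2, c1, c0) used only by "quadratic"
    """
    if policy not in {"preserve", "require", "zero", "quadratic"}:
        raise ValueError(f"unknown policy: {policy}")
    if policy == "quadratic" and default is None:
        raise ValueError("quadratic needs a default (c2, c1, c0)")
    out = []
    for i, (cost, active) in enumerate(zip(costs, in_service)):
        if cost is not None:
            out.append(cost)                  # existing costs are never touched
        elif policy == "preserve":
            out.append(None)                  # stay absent
        elif policy == "require":
            if active:
                raise ValueError(f"generator {i} is in service without cost")
            out.append(None)                  # assumption: out-of-service rows pass
        elif policy == "zero":
            out.append((0.0, 0.0, 0.0))
        else:                                 # "quadratic"
            out.append(default)
    return out


costs = [(0.02, 3.0, 0.0), None, None]
in_service = [True, True, False]
zero_filled = apply_missing_cost_policy(costs, in_service, "zero")
quad_filled = apply_missing_cost_policy(
    costs, in_service, "quadratic", default=(0.01, 2.0, 0.0)
)
```

With these inputs, require raises on generator 1, because it is in service and has no cost row.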
Matrix outputs and conventions
The powerio-matrix crate builds sparse matrices and graph outputs for common power system representations. The outputs are derived from a parsed Network. The builders take the densely indexed IndexedNetwork, which maps bus ids to a
contiguous \([0,n)\).
The DC OPF bundle has its own schema in the DC OPF bundle guide. Per-builder API detail is in the crate docs.
Capabilities
| matrix | shape | builder | notes |
|---|---|---|---|
| B’ (FDPF) | \(n \times n\) | build_bprime | singular positive Laplacian, \(\operatorname{rank}(L) = n - 1\), shuntless |
| B’’ (FDPF) | \(n \times n\) | build_bdoubleprime | SDDM when bus shunts are present |
| \(\Re(Y_{\mathrm{bus}})\), \(-\Im(Y_{\mathrm{bus}})\) | \(n \times n\) | build_ybus | full admittance, keeps taps and shifts |
| LACPF (linear AC power flow) block | \(2n \times 2n\) | build_lacpf | \(\begin{bmatrix}G & -B \\ -B & -G\end{bmatrix}\), flat start, indefinite |
| signed incidence \(A\) | \(n \times m\) | build_incidence | column \(e\) has \(+1\) at from-bus, \(-1\) at to-bus |
| weighted Laplacian \(L\) | \(n \times n\) | build_weighted_laplacian | \(L = A \operatorname{diag}(w) A^\mathsf{T}\), ground_at removes a row/col |
| flow map \(B A^\mathsf{T}\) | \(m \times n\) | build_flow_map | \(f = B A^\mathsf{T}\theta\) |
| PTDF | \(m \times n\) | build_ptdf | dense; factors the Laplacian grounded at the reference buses |
| LODF | \(m \times m\) | build_lodf | dense DC line-outage factors |
| adjacency | \(n \times n\) | build_adjacency | sparse graph adjacency |
| petgraph graph | n/a | IndexedNetwork::to_petgraph | UnGraph<bus_idx, branch_idx> |
Computing PTDF and LODF matrices requires a linear solve. Both factor the
Laplacian with one row and column removed for each reference bus, using the dense
Cholesky in
matrix::sensitivity. Every connected component must contain at least one
reference bus. PTDF is dense \(m \times n\). The DC OPF
instance bundle (\(A\), \(b\), \(L\), costs, bounds, thermal limits, \(C_g\)) is documented in
the DC OPF bundle guide.
GridFM datasets
The GridFM export is a Parquet dataset under <case>/raw/ with bus_data,
gen_data, branch_data, and y_bus_data. A single parsed case writes one
scenario. A scenario batch row stacks snapshots that share the same element set
and uses the scenario column as the key.
GridFM read is the ML to classical return path. It recovers bus types, voltages,
limits, nodal load and shunt totals, generator dispatch and bounds, branch
parameters, and base_mva. It cannot recover original bus ids, per element load
and shunt granularity, piecewise and cubic costs, HVDC, or storage; those losses
are returned as warnings.
Conventions
- Positive Laplacian matrices. Off-diagonal \(< 0\), diagonal \(> 0\), with \(L_{ii} = \sum_{j \ne i} \lvert L_{ij} \rvert\) for B’ susceptance matrices. This is the M-matrix form an SDDM (symmetric diagonally dominant M-matrix) or Cholesky solver expects; a consumer can recover an edge weight as \(-L_{ij} > 0\).
- Bus indexing. Bus ids are 1-based and preserved on the model as a newtype (the Rust New Type Idiom). IndexedNetwork::bus_index(id) is the only mapping into the dense \([0,n)\); an id out of range is an Error::UnknownBus.
- Taps and shifts. \(\mathrm{tap} = 0\) means \(\mathrm{tap} = 1\) (Branch::effective_tap). B’ ignores taps and shifts; B’’ keeps taps and zeros only shifts; \(Y_{\mathrm{bus}}\) keeps both.
- Branch shunt admittance is stored per unit. Branch::charging is the stored per terminal admittance when present: g_fr, b_fr, g_to, and b_to are already per unit on the system base. Branch::b is the legacy MATPOWER BR_B total projection for formats that carry only one charging value. Matrix builders use Branch::terminal_charging(), so terminal values feed \(Y_{\mathrm{bus}}\) even when the legacy total is zero or stale.
- B’ scheme. Scheme selects between the two fast decoupled load flow variants for B’: Xb weights a branch by \(1/x\) (series resistance ignored), Bx (the default) by \(x/(r^2 + x^2)\).
- Zero impedance branches. BuildOptions::skip_zero_impedance controls the builders whose branch denominator can be zero. The default true skips the branch and records the skipped source branch rows in MatrixStats as skipped_zero_impedance and skipped_zero_impedance_branches; false returns Error::ZeroImpedance. Full AC admittance builders use \(r^2 + x^2\); DC incidence and reactance only FDPF variants use \(x\). The gridfm export still zeros its admittance and flow columns for these rows and records dropped_zero_impedance in gridfm_meta.json.
- Reference coverage. IndexedNetwork::check_reference_coverage verifies that every in-service island has a reference bus.
- Susceptance conventions for the DC approximation. DcConvention selects the branch weight the DC builders (incidence, weighted Laplacian, PTDF/LODF, the DC OPF bundle) use. The default PaperPure is the textbook DC power flow weight \(b = 1/x\), taps and shifts ignored; the resulting \(L = A \operatorname{diag}(b) A^\mathsf{T}\) equals B’ under Scheme::Xb. Matpower reproduces MATPOWER’s makeBdc: \(b = 1/(x\tau)\) for a transformer with tap ratio \(\tau\), plus the phase shift injection vector p_shift.
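The sign and weight conventions above can be checked by hand on a three-bus triangle. The following is a minimal pure-Python sketch, not the powerio-matrix builders: weight and laplacian are hypothetical helper names, the branch data are made up, and only the two B’ weights named above are modeled.

```python
# (from, to, r, x) in per unit on dense 0-based bus indices; made-up data.
branches = [(0, 1, 0.01, 0.10), (1, 2, 0.02, 0.20), (0, 2, 0.00, 0.25)]
n = 3


def weight(r, x, scheme):
    """Branch weight: 1/x for "Xb", x / (r^2 + x^2) for "Bx"."""
    return 1.0 / x if scheme == "Xb" else x / (r * r + x * x)


def laplacian(branches, n, scheme="Xb"):
    """Dense positive Laplacian L = A diag(w) A^T, built branch by branch."""
    L = [[0.0] * n for _ in range(n)]
    for f, t, r, x in branches:
        w = weight(r, x, scheme)
        L[f][f] += w          # diagonal accumulates incident weights (> 0)
        L[t][t] += w
        L[f][t] -= w          # off-diagonal is the negated weight (< 0)
        L[t][f] -= w
    return L


L_xb = laplacian(branches, n, "Xb")
L_bx = laplacian(branches, n, "Bx")

# Edge weight recovered from the matrix, as a consumer would do it.
edge_weight_01 = -L_xb[0][1]
```

Each row of either matrix sums to zero, which is the singular, shuntless property listed for B’ in the capabilities table. The two schemes agree only on the branch with \(r = 0\).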
Output
Matrices write as Matrix Market files or stay in memory. A symmetric matrix is
stored as its lower triangle with the symmetric header and 1-based indices
(io::mtx::write_mtx). The sensitivities and dcopf CLI subcommands bundle
the relevant family with a JSON manifest.
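As an illustration of that layout, the sketch below serializes a small dense symmetric matrix as Matrix Market text with the symmetric header, lower triangle only, and 1-based indices. It is a stand-in for reading the format, not io::mtx::write_mtx; symmetric_mtx_text is a hypothetical name and the number formatting is an arbitrary choice.

```python
def symmetric_mtx_text(matrix):
    """Serialize a dense symmetric matrix as coordinate real symmetric text."""
    n = len(matrix)
    # Lower triangle only (row >= column), column by column, 1-based indices.
    entries = [
        (i + 1, j + 1, matrix[i][j])
        for j in range(n)
        for i in range(j, n)
        if matrix[i][j] != 0.0
    ]
    header = "%%MatrixMarket matrix coordinate real symmetric"
    lines = [header, f"{n} {n} {len(entries)}"]
    lines += [f"{i} {j} {value!r}" for i, j, value in entries]
    return "\n".join(lines) + "\n"


L = [[14.0, -10.0, -4.0], [-10.0, 15.0, -5.0], [-4.0, -5.0, 9.0]]
text = symmetric_mtx_text(L)
```

A reader of such a file has to mirror each off-diagonal entry to rebuild the full matrix; the size line counts stored entries, not the nonzeros of the full matrix.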
The standard case solver property fixture lives at
powerio-matrix/tests/fixtures/solver_matrix_stats.json. It records B’,
B’’, and ybus_imag stats for case9, case14, case30, case57, and
case118: n, nnz, min diagonal, M-matrix sign pattern, diagonal dominance
margin, zero impedance skips, row sum checks, SPD checks, and a condition
estimate when the solver input is SPD.
IndexedNetwork::to_petgraph returns the network as an undirected
petgraph graph, one node per bus and one edge per
in-service branch. The connectivity report and the radial check are built on
it. Use the returned graph directly for other petgraph algorithms.
DC OPF Bundle Schema
powerio dcopf <case>.m -o <out> (or opf_pipeline::write_dcopf_bundle) writes
<out>/<case>_dcopf/: a set of Matrix Market files plus dcopf_meta.json.
Everything is a pure function of the case. The files and conventions are below.
Conventions
- Format. Matrix Market. Matrices are coordinate real; square symmetric ones (L, L_grounded) use the symmetric header and store the lower triangle only. Vectors are array real general, one value per line.
- Index base. .mtx row/column indices are 1-based (Matrix Market standard). reference_buses in the manifest are 0-based dense bus indices.
- Sign convention. The Laplacians are the positive (M-matrix) form: diagonal \(> 0\), off-diagonal \(< 0\), with \(L_{ii} = \sum_{j \ne i} \lvert L_{ij} \rvert\) for \(L\). An off-diagonal entry is \(L_{ij} = -b_e\) for the branch between \(i\) and \(j\), so a consumer recovers the edge weight as \(-L_{ij} > 0\).
- Units. PerUnit by default: power divided by base_mva, cost scaled so it is a function of per unit power: \(q \leftarrow 2c_2 \cdot \mathrm{base}^2\) and \(c \leftarrow c_1 \cdot \mathrm{base}\). Native keeps MW / native cost. The choice is recorded in the manifest.
- Generator costs. The default DC OPF export policy is require: an in-service generator without cost data is an error. Use --missing-gen-cost to explicitly fill missing rows for feasibility tests.
- Reference buses. reference_buses in the manifest lists every grounded bus as a 0-based dense index. Each in-service island needs at least one reference. If several references lie in one island, the bundle fixes all of those voltage angles to zero; it is not a participation factor slack model.
- DC convention. PaperPure by default (\(b_e = 1/x\), taps and phase shifts ignored). Matpower uses \(b_e = 1/(x \tau)\) plus the phase shift injection p_shift. Recorded in the manifest.
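The per unit cost scaling in the Units item can be verified numerically. This sketch assumes the objective is written as \(\tfrac{1}{2} q p^2 + c p\) in per unit power, which is inferred from the factor of two in \(q\) and is not stated by the bundle docs; the helper name and the numbers are illustrative, and the constant term \(c_0\) is left out.

```python
def per_unit_cost(c2, c1, base_mva):
    """Scale MATPOWER-style MW cost coefficients to per unit power."""
    q = 2.0 * c2 * base_mva ** 2   # quadratic (diagonal) term
    c = c1 * base_mva              # linear term
    return q, c


base_mva = 100.0
c2, c1 = 0.01, 2.0
q, c = per_unit_cost(c2, c1, base_mva)

p_mw = 50.0
p_pu = p_mw / base_mva

# Same dispatch, costed in native units and in per unit.
native_cost = c2 * p_mw ** 2 + c1 * p_mw
scaled_cost = 0.5 * q * p_pu ** 2 + c * p_pu
```

Both expressions give the same cost for the same dispatch, so an optimizer working in per unit sees the objective the MW coefficients describe.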
Matrices
| file | shape | what |
|---|---|---|
| A.mtx | \(n \times m\) | signed incidence; column \(e\) has \(+1\) at from-bus, \(-1\) at to-bus |
| L.mtx | \(n \times n\) | generic Laplacian \(L = A \operatorname{diag}(b) A^\mathsf{T}\), singular with \(\operatorname{rank}(L) = n - 1\), \(\mathbf{1} \in \ker L\) |
| L_grounded.mtx | \((n-k) \times (n-k)\) | \(L\) with \(k\) reference rows and columns removed; SPD when every island is grounded |
| BAt.mtx | \(m \times n\) | flow map \(B A^\mathsf{T}\), where \(f = B A^\mathsf{T} \theta\) |
| Cg.mtx | \(n \times n_{\mathrm{gen}}\) | generator-to-bus incidence, one \(1\) per column |
Vectors
Bus-indexed (length \(n\)): pd (load), q/c (cost diag/linear), pmax/pmin
(generation bounds), e_r (reference indicator: \(1\) at every reference bus, else \(0\)),
p_shift (phase shift injection, all zero unless Matpower + shifters).
Branch-indexed (length \(m\)): b (susceptances), fmax (thermal limits; \(0\) means
unlimited per MATPOWER). Generator-space provenance (length \(n_{\mathrm{gen}}\)): q_gen,
c_gen, pmax_gen, pmin_gen.
Manifest (dcopf_meta.json)
Schema powerio.dcopf version 0.1.0 writes Matrix Market files plus
structured metadata:
- dimensions: n_buses, n_source_branches, n_branch_columns, n_generators, n_reference_buses, and n_grounded_buses.
- index_base: dense = 0 for manifest bus, branch, generator, and reference indices; matrix_market = 1 for .mtx coordinates.
- dc_convention, units, build_options, and zero_impedance. The zero impedance block records the skip flag, denominator rule, skipped count, and skipped source branch rows.
- grounding: reference buses, removed rows and columns, the grounded operator (L_grounded), and the reference selector (e_r).
- operators[]: one entry per emitted operator with name, file, kind, rows, cols, index_space, and units.
The legacy aliases n, m, n_gen, reference_buses, and convention remain
for current readers. cost_policy, synthesized_gen_costs,
patched_gen_costs, files[], and powerio_version remain top level fields.
Solving with it
The grounded system is the one to factor: L_grounded is SPD when every island
has a reference. For DC power flow \(L\theta = p\) with net injection
\(p = g - d\), drop all reference_buses entries from \(p\), solve
\(L_{\mathrm{grounded}}\theta_{\mathrm{red}} = p_{\mathrm{red}}\), and set each
reference angle to \(0\). e_r identifies the grounded buses without parsing the
manifest. The full singular \(L\) can be used instead with a consistent zero-sum
RHS.
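The grounded solve can be sketched on a three-bus case in pure Python. This is illustrative only: the matrix, injections, and helper names are made up, and a real consumer would load L_grounded.mtx and use a sparse Cholesky instead of the small dense elimination shown here.

```python
def solve(M, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(M)
    a = [row[:] + [rhs[i]] for i, row in enumerate(M)]
    for k in range(n):
        piv = max(range(k, n), key=lambda r: abs(a[r][k]))
        a[k], a[piv] = a[piv], a[k]
        for r in range(k + 1, n):
            factor = a[r][k] / a[k][k]
            for c in range(k, n + 1):
                a[r][c] -= factor * a[k][c]
    x = [0.0] * n
    for k in reversed(range(n)):
        s = sum(a[k][c] * x[c] for c in range(k + 1, n))
        x[k] = (a[k][n] - s) / a[k][k]
    return x


def dc_power_flow(L, p, reference_buses):
    """Drop reference rows/columns, solve, and put zeros back at references."""
    refs = set(reference_buses)
    keep = [i for i in range(len(L)) if i not in refs]
    L_red = [[L[i][j] for j in keep] for i in keep]
    p_red = [p[i] for i in keep]
    theta = [0.0] * len(L)
    for i, value in zip(keep, solve(L_red, p_red)):
        theta[i] = value
    return theta


# Branch weights 10, 5, 4 on branches 0-1, 1-2, 0-2; bus 0 is the reference.
L = [[14.0, -10.0, -4.0], [-10.0, 15.0, -5.0], [-4.0, -5.0, 9.0]]
p = [1.5, -1.0, -0.5]            # p = g - d, sums to zero
theta = dc_power_flow(L, p, reference_buses=[0])

# The dropped reference row is still satisfied because p sums to zero.
residual_at_ref = sum(L[0][j] * theta[j] for j in range(3)) - p[0]
```

The reference row is not part of the solve, yet its balance equation still holds, which is why the full singular \(L\) also works with a consistent zero-sum right-hand side.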
An interior point DC OPF solver builds reweighted Laplacians each Newton step
from the same A and b (only the edge weights change), so A is the durable
operator to hand over.
Language APIs
PowerIO uses the same IO vocabulary across Rust, Python, Julia, and the C ABI, with language-specific spelling where needed. A new format or dataset should appear as a format string or convenience wrapper, not as a new naming scheme.
Verb taxonomy:
- parse_*: bytes, paths, or text to typed parsed values. Transmission parsers return a balanced network handle; distribution parsers return a multiconductor network handle; display parsers return display data.
- to_*: Network to a new value
- convert_file: path to target text convenience
- write_*: filesystem outputs (write_gridfm, write_pypsa_csv_folder, write_dcopf_bundle); the Rust hub also keeps write_as and per-format write_* text builders, the internals behind to_format and the to_* writers, which the bindings do not mirror
- read_*: filesystem dataset inputs (read_gridfm, read_pypsa_csv_folder), the inverse of write_*. Datasets are multi-file directories, so they read and write; single documents parse and serialize (parse_*/to_*)
- export_*: handoff to external memory or interface protocols
| Concept | Rust | Python | Julia | C ABI |
|---|---|---|---|---|
| Parse path | parse_file(path, from) | parse_file(path, from_=None) | parse_file(path; from=nothing) | pio_parse_file |
| Parse text | parse_str(text, format) | parse_str(text, format) | parse_str(text, format) | pio_parse_str |
| Parse display path | parse_display_file(path, from) | parse_display_file(path, from_=None) | planned | n/a |
| Parse display bytes | parse_display_bytes(bytes, format) | parse_display_bytes(data, format) | planned | n/a |
| Parse IO | n/a | file object later | parse_file(io, format) | n/a |
| JSON to Network | Network::from_json | from_json | from_json | pio_parse_str + "powerio-json" |
| File conversion | convert_file(path, to, from) | convert_file(path, to, from_=None) | convert_file(path, to; from=nothing) | pio_convert_file |
| Text conversion | convert_str(text, to, format) | convert_str(text, to, format) | convert_str(text, to; from=format) | pio_convert_str |
| Parsed conversion | net.to_format(to) | net.to_format(to) | to_format(net, to) | pio_to_format |
| MATPOWER text | net.to_matpower() | net.to_matpower() | to_matpower(net) | pio_to_format + "matpower" |
| JSON text | net.to_json() | net.to_json() | to_json(net) | pio_to_format + "powerio-json" |
| Package JSON | NetworkPackage::to_json() | Package class / package transport | to_package / write_package | pio_package_* |
| Package operating points | pkg.operating_points() | pkg.operating_points() | planned | pio_package_operating_points_json |
| Materialize operating point | pkg.materialize_operating_point(i) | pkg.materialize_operating_point(i) | planned | pio_package_materialize_operating_point |
| Normalized copy | net.to_normalized() | net.to_normalized() | to_normalized(net) | pio_normalize |
| Dense tables | typed table API | to_dense | to_dense | pio_* extractors |
| PyPSA CSV folder | read_pypsa_csv_folder / write_pypsa_csv_folder | read_pypsa_csv_folder / net.write_pypsa_csv_folder | parse_file(dir; from="pypsa-csv") / write_pypsa_csv_folder | pio_parse_file / pio_write_dir + "pypsa-csv" |
| gridfm write | write_gridfm_dataset / write_gridfm_batch | net.write_gridfm / write_gridfm_batch | planned | planned |
| gridfm read | read_gridfm_dataset(dir, scenario) | read_gridfm(dir, scenario=0) | read_gridfm(dir; scenario=0) | pio_read_dir + "gridfm" |
| Arrow handoff | internal/C ABI | later | to_arrow | pio_to_arrow |
Note: the C ABI carries no per-format symbols: matpower, the powerio-json
snapshot, PyPSA CSV directories, and gridfm datasets are all format strings into
pio_to_format / pio_parse_str / pio_write_dir / pio_read_dir. The
language APIs keep their per-format conveniences (to_matpower, from_json,
…) as wrappers over the same paths.
C ABI and binding compatibility
The C ABI is the stable boundary for non Rust callers. Handles own parsed
networks, and PioPackage handles own .pio.json compiler packages. Callers free
network handles with pio_network_free, package handles with pio_package_free,
and returned text with pio_string_free. They size output buffers before filling
them and treat every format name as a string routed through the same parser and
writer hub.
C ABI review points:
- null handles must return documented defaults or errors, not crash;
- optional output buffers must be safe to pass as null; required output structs such as Arrow exports must report an error when null;
- returned text and warning buffers must be NUL terminated when capacity permits;
- reported lengths must let callers allocate exact buffers;
- header declarations and exported Rust symbols must match;
- feature gated exports such as Arrow, GridFM, distribution, and packages must be additive;
- ownership rules must be documented in the header, README, and binding code.
Julia’s PowerIO.jl uses the C ABI for handles, dense extractors, Arrow,
GridFM, PyPSA CSV folders, distribution conversion, and .pio.json package
construction. Whole-network transport uses powerio-json, so the binding does
not stitch together a separate model from individual table calls. The Julia
binding checks pio_abi_version() against PIO_ABI_VERSION on first use.
Distribution calls also check pio_dist_abi_version().
GOC3 package construction is the first package operating point path backed by a source format. The static balanced payload carries the first interval; the replayable series is exposed through the package APIs above.
During development, test the sibling Julia binding against the local C ABI instead of an artifact:
cargo build -p powerio-capi --release --features arrow,gridfm,dist,pkg
POWERIO_CAPI=$PWD/target/release/libpowerio_capi.dylib \
julia --project=../PowerIO.jl -e 'using Pkg; Pkg.test()'
Binding compatibility checks:
| surface | behavior |
|---|---|
| Python base import | import powerio does not import NumPy, SciPy, NetworkX, Polars, pandas, pyarrow, or the MCP SDK |
| Python optional paths | matrix, graph, GridFM inspection, pandas, MCP, and benchmark oracles live behind extras |
| C ABI | pio_abi_version() is the core compatibility check; optional symbols are additive and feature probed |
| Julia | PowerIO.jl checks the C ABI version before first use and checks pio_dist_abi_version() before distribution calls |
| Arrow | C returns Arrow C Data Interface structs; Julia’s default to_arrow copies to owned vectors, while copy=false keeps the wrapper alive for zero copy reads |
| GridFM | Julia and C read GridFM through pio_read_dir / "gridfm" and surface schema losses as warnings |
| Distribution | Python, Julia, Rust, and C use separate distribution handles; transmission and distribution conversion paths do not mix |
Distribution surface (powerio-dist)
The multiconductor distribution model follows the same taxonomy under its own
handle type; the two families do not mix. The C distribution surface ships
behind the optional dist feature (PIO_DIST); a consumer probes it with
pio_has_feature("dist"), then checks pio_dist_abi_version() against
PIO_DIST_ABI_VERSION. PowerIO.jl uses the same runtime check before calling
the distribution C conversion helpers.
| Concept | Rust | Python | Julia | C ABI |
|---|---|---|---|---|
| Parse path | powerio_dist::parse_file(path, from) | dist.parse_file(path, from_=None) | parse_file(DistNetwork, path; from=nothing) | pio_dist_parse_file |
| Parse text | powerio_dist::parse_str(text, format) | dist.parse_str(text, format) | parse_str(DistNetwork, text, format) | pio_dist_parse_str |
| File conversion | powerio_dist::convert_file(path, to, from) | dist.convert_file(path, to, from_=None) | convert_file(DistNetwork, path, to; from=nothing) | pio_dist_convert_file(path, from, to, ...) |
| Target format type | DistTargetFormat (FromStr, name()) | format name strings | DistNetwork plus format strings | format name strings |
| Text conversion | powerio_dist::convert_str(text, to, format) | dist.convert_str(text, to, format) | convert_str(DistNetwork, text, to, format) | pio_dist_convert_str(text, from, to, ...) |
| Parsed conversion | net.to_format(to) | case.to_format(to) | to_format(net, to) | pio_dist_to_format |
| Parse warnings | net.warnings | case.warnings | warnings(net) | pio_dist_warnings |
Python API
Install the base package for parsing, writing, JSON transport, and file conversion with zero dependencies:
pip install powerio
Install extras only for the outputs that need them:
pip install 'powerio[matrix]' # numpy, scipy
pip install 'powerio[graph]' # networkx
pip install 'powerio[gridfm]' # polars
pip install 'powerio[pandas]' # pandas and pyarrow compatibility reads (Python 3.10+)
pip install 'powerio[all]' # matrix, graph, and gridfm reads
import powerio, parse_file, parse_str, convert_file, convert_str,
to_matpower, and to_json do not import NumPy, SciPy, NetworkX, Polars,
pandas, or pyarrow.
Transmission text and file format names accepted by parse_* and convert_* include
matpower, psse, powerworld, pslf, powermodels-json, egret-json,
pandapower-json, goc3-json, surge-json, and powerio-json, plus their
documented aliases. PyPSA CSV folders and GridFM Parquet datasets are directory
formats; use read_pypsa_csv_folder, Network.write_pypsa_csv_folder,
read_gridfm, Network.write_gridfm, or the conversion/package helpers that
take a path.
Canonical use
import powerio as pio
net = pio.parse_file("case9.m")
same_text = net.to_matpower()
json_text = net.to_json()
pm = net.to_format("powermodels-json")
pp = net.to_format("pandapower-json")
raw = pio.convert_file("case9.m", "psse")
aux = pio.convert_str(json_text, "powerworld", format="powermodels-json")
pypsa_out = net.write_pypsa_csv_folder("case9-pypsa")
display = pio.parse_display_file("case.pwd")
pkg = pio.Package.from_file("goc3_case.json", from_="goc3-json")
points = pkg.operating_points()
period_1 = pkg.materialize_operating_point(1)
normalized = net.to_normalized()
dense = net.to_dense() # needs powerio[matrix]
bprime = net.bprime() # needs powerio[matrix]
graph = net.to_networkx() # needs powerio[graph]
Model names
powerio.Network is the existing balanced transmission handle. v0.4 also
exports powerio.BalancedNetwork as the v1 family name for the same handle.
The old powerio.Case compatibility alias was removed in v0.4.
For distribution models, use powerio.dist.MulticonductorNetwork or the
existing powerio.dist.DistNetwork handle name. The old
powerio.dist.DistCase alias was removed in v0.4.
parse_file(path, from_=None) reads network case files (inferred from the
extension, or forced with from_); parse_str(text, format) reads in-memory
case text. Display artifacts are not network cases, so they use the separate
display API:
from pathlib import Path
import powerio as pio
display = pio.parse_display_file("case.pwd")
same = pio.parse_display_bytes(Path("case.pwd").read_bytes(), "pwd")
assert display.kind == "powerworld"
first = display.data.substations[0]
print(first.number, first.name, first.x, first.y)
For v0.2.2, display.data is a PwdDisplay with canvas_width,
canvas_height, stamp, and substations.
PyPSA folders
PyPSA CSV folders are multi-file datasets, so they use explicit read and write
helpers instead of Conversion.text.
import powerio as pio
case = pio.parse_file("case14.m")
out = case.write_pypsa_csv_folder("case14-pypsa")
round_trip = pio.read_pypsa_csv_folder(out["dir"])
The written folder can be imported with
pypsa.Network().import_from_csv_folder(path). PyPSA itself is not a runtime
dependency of powerio.
CSV folders are PyPSA’s native static component format and carry the network topology: buses, lines, transformers, generators, loads, shunts, storage units, and links (read as HVDC). Time series scenarios in NetCDF/HDF5 are out of scope for now; support is tracked in #107.
GridFM reads
The native wheel includes the GridFM Parquet writer and reader.
read_gridfm(dir, scenario=0) rebuilds a Network from a dataset, the inverse
of Network.write_gridfm, returning a GridfmRead(network, scenario, warnings)
namedtuple. The read is lossy but recovers everything a power flow needs;
warnings lists what the gridfm schema couldn’t round-trip (synthesized bus
ids, folded per bus load/shunt, dropped HVDC/storage, piecewise costs).
read_gridfm_scenarios(dir) returns one GridfmRead per scenario. dir
resolves the raw/ leaf, a <case>/ directory, or a parent with one */raw/
child.
import powerio as pio
out = pio.parse_file("case14.m").write_gridfm("out")
net, scenario, warnings = pio.read_gridfm(out["dir"])
text = net.to_matpower() # gridfm → any classical format
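The three accepted spellings of dir can be pictured as a short resolution order. The sketch below is an assumption about that order written with pathlib; resolve_raw_dir is a hypothetical helper, not part of powerio, and it identifies the leaf purely by the directory name raw.

```python
import tempfile
from pathlib import Path


def resolve_raw_dir(path):
    """Return the raw/ leaf for any of the three accepted spellings."""
    path = Path(path)
    if path.name == "raw" and path.is_dir():
        return path                          # already the raw/ leaf
    if (path / "raw").is_dir():
        return path / "raw"                  # a <case>/ directory
    candidates = sorted(p for p in path.glob("*/raw") if p.is_dir())
    if len(candidates) == 1:
        return candidates[0]                 # parent with exactly one */raw/
    raise ValueError(f"{path}: expected raw/, <case>/, or one */raw/ child")


# Build a throwaway layout and resolve it from all three levels.
with tempfile.TemporaryDirectory() as tmp:
    raw = Path(tmp) / "case14" / "raw"
    raw.mkdir(parents=True)
    resolved = [resolve_raw_dir(raw), resolve_raw_dir(raw.parent), resolve_raw_dir(tmp)]
    all_same = all(r == raw for r in resolved)
```

A parent holding two case directories is ambiguous under this rule, so the sketch raises instead of guessing.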
To inspect the raw Parquet tables instead, the preferred read extra is Polars:
import polars as pl
bus = pl.read_parquet(f"{out['dir']}/bus_data.parquet")
Use powerio[pandas] only for downstream code that expects pandas DataFrames.
.pio.json packages
powerio.Package is the handle for .pio.json packages: it parses the
envelope once and every accessor reuses the handle. Package.from_file and
Package.from_str build packages from case input, Package.from_json reads
envelope text, and Package.from_balanced / Package.from_multiconductor wrap
existing networks. pkg.model_kind names the package family;
pkg.as_balanced() / pkg.as_multiconductor() rebuild typed network handles
from the payload.
pkg.operating_points() returns a Python dict for the replayable operating
point series, or None. pkg.materialize_operating_point(i) returns a new
static Package with one point applied; updates resolve by the payload rows’
uid identities, and an unknown identity or a row that contradicts one raises
ValueError. GOC3 packages populate this series from the source time series
while the static payload holds the first interval. Network table dicts
(net.buses, net.loads, …) expose each row’s uid.
pkg.validate(), pkg.validation(), and pkg.diagnostics() expose the
package validation profile, and multiconductor packages lower through
pkg.multiconductor_to_balanced_preflight() and
pkg.lower_multiconductor_to_balanced().
import powerio as pio
pkg = pio.Package.from_file("goc3_case.json", from_="goc3-json")
series = pkg.operating_points()
static_pkg = pkg.materialize_operating_point(0)
net = static_pkg.as_balanced()
MCP path handling
MCP clients can request .pio.json package output from parse and pass that
same value back to the other network tools:
parsed = parse(path="case9.m", transport="package")
pkg = parsed["package_json"]
summary(package_json=pkg)
matrix("bprime", package_json=pkg)
save(out_path="case9.raw", to_format="psse", package_json=pkg)
diagnostics(pkg)
summary, normalize, matrix, and save also auto-detect a package passed
through the legacy json argument. The package envelope’s model_kind routes
balanced and multiconductor payloads.
The optional MCP server accepts local filesystem paths and file:// URIs for
path and out_path arguments. Remote URI schemes are rejected. Deployments
that need filesystem containment can set POWERIO_MCP_ALLOWED_ROOTS to an
os.pathsep separated list of directories; all MCP reads and writes must
resolve under one of those roots. POWERIO_MCP_ROOT is accepted as a single
root alias.
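The containment rule can be sketched with the standard library. This is not the MCP server's code: resolve_local_path is a hypothetical helper, the error types are arbitrary, and passing the root list as an argument stands in for reading POWERIO_MCP_ALLOWED_ROOTS from the environment.

```python
import os
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import url2pathname


def resolve_local_path(value, allowed_roots=None):
    """Resolve a path or file:// URI and enforce optional root containment."""
    parsed = urlparse(value)
    if parsed.scheme == "file":
        if parsed.netloc not in ("", "localhost"):
            raise ValueError(f"remote file host rejected: {parsed.netloc}")
        path = Path(url2pathname(parsed.path))
    elif len(parsed.scheme) > 1:
        # Single letters are treated as Windows drive prefixes, not schemes.
        raise ValueError(f"remote URI scheme rejected: {parsed.scheme}")
    else:
        path = Path(value)
    resolved = path.resolve()                # collapses ".." and symlinks
    if allowed_roots:
        roots = [Path(r).resolve() for r in allowed_roots.split(os.pathsep) if r]
        inside = any(resolved == root or root in resolved.parents for root in roots)
        if not inside:
            raise PermissionError(f"{resolved} is outside the allowed roots")
    return resolved
```

A deployment-style call would pass os.environ.get("POWERIO_MCP_ALLOWED_ROOTS") as allowed_roots. Resolving before comparing is the point of the design: a path such as root/../elsewhere is checked by where it lands, not by how it is spelled.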
Performance
PowerIO has four benchmark tiers. Keep them separate when publishing numbers.
| tier | command | what it answers |
|---|---|---|
| Rust microbenchmarks | cargo bench -p powerio --bench parse | parser, writer, and PowerWorld reader timing inside one process |
| Matrix microbenchmarks | cargo bench -p powerio-matrix --bench matrix | sparse matrix, DC OPF component, and dense sensitivity builder timing after parse/indexing |
| Cross tool parser comparison | julia --project=benchmarks benchmarks/bench_julia.jl --json | powerio through the C ABI against ExaPowerIO.jl and PowerModels.jl |
| Python parser comparison | .venv/bin/python benchmarks/bench_parse.py --json <cases> | Python package parse and matrix path against pandapower reader paths |
The published table lives in the repository benchmark results, and this guide is the public reference for how those numbers are produced. Each refresh should update the snapshot environment there: machine model, chip, core count, memory, OS, Rust, C compiler, Julia, Python, and the package versions used by the comparison harnesses. Regenerate the JSON inputs first, then splice only the marked regions:
bash benchmarks/fetch_cases.sh
cargo build --release -p powerio-capi
python3.12 -m venv .venv
.venv/bin/python -m pip install --upgrade pip maturin -r benchmarks/requirements.txt
env VIRTUAL_ENV=$PWD/.venv .venv/bin/maturin develop --release
julia --project=benchmarks benchmarks/bench_julia.jl --json
.venv/bin/python benchmarks/bench_parse.py --json \
tests/data/case2869pegase.m \
tests/data/large/case9241pegase.m \
tests/data/large/case13659pegase.m \
tests/data/large/case193k.m
python3 benchmarks/render_tables.py
python3 benchmarks/render_tables.py --check
PowerWorld .pwb and .aux parse timings are measured by the Rust Criterion
benchmarks. Fetch the public fixtures, run
cargo bench -p powerio --bench parse -- "parse_aux_|parse_pwb_", then run
python3 benchmarks/extract_powerworld_bench.py before rendering the tables. If
the Texas7k local row is published, pass its aux and pwb paths through
POWERIO_BENCH_AUX and POWERIO_BENCH_PWB during the Criterion run.
Matrix builder timings are separate from parse timings. The matrix benchmark
parses each fixture once, builds IndexedNetwork once, and times only derived
matrix construction. Its pipeline row measures Pipeline::run for the paired
\(Y_{\mathrm{bus}}\) export, including MTX, shunt, and metadata writes:
cargo bench -p powerio-matrix --bench matrix
python3 benchmarks/extract_matrix_bench.py
python3 benchmarks/render_tables.py
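The separation described above (parse and index once, time only the derived build) is a general benchmarking pattern. The sketch below shows it in Python with hypothetical `setup` and `build` callables. It is not the Criterion harness, which samples statistically and does not report a best-of time.

```python
import time


def time_derived(setup, build, repeats=5):
    """Run setup once untimed, then return the best wall time of build(state)."""
    state = setup()  # stands in for parse + index; excluded from the timing
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        build(state)  # stands in for one derived matrix construction
        best = min(best, time.perf_counter() - start)
    return best


# Toy workload: the "network" is a list and the "matrix" is a list of squares.
seconds = time_derived(
    setup=lambda: list(range(1000)),
    build=lambda rows: [r * r for r in rows],
)
```

Keeping setup outside the timed loop is what keeps a parser change from showing up as a matrix builder regression.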
Use filtered runs while developing a focused change, for example:
cargo bench -p powerio-matrix --bench matrix -- 'matrix_bprime|matrix_ybus|dcopf_'
Criterion compares against the local target/criterion baseline. Treat a
"Performance has regressed" line as a signal to investigate, not as a publishable
claim by itself. A release note or benchmark page needs the commit, tree
cleanliness, machine, toolchain, command, fixtures, and whether optional large
cases were present.
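Most of that context can be captured by a script instead of by hand. The following is a minimal sketch under stated assumptions: `tool_output`, `snapshot`, and the record layout are hypothetical and are not part of the benchmark scripts in the repository. It uses only the standard library plus the `git` and `rustc` executables, and records None when either is unavailable.

```python
import platform
import subprocess
import sys


def tool_output(*cmd):
    """Return trimmed stdout of cmd, or None if the tool is missing or fails."""
    try:
        out = subprocess.run(list(cmd), capture_output=True, text=True, check=True)
    except (OSError, subprocess.CalledProcessError):
        return None
    return out.stdout.strip()


def snapshot(command, fixtures):
    """Collect the context a published timing needs (hypothetical layout)."""
    status = tool_output("git", "status", "--porcelain")
    return {
        "commit": tool_output("git", "rev-parse", "HEAD"),
        # Empty porcelain output means no staged, unstaged, or untracked changes.
        "tree_clean": None if status is None else status == "",
        "machine": platform.machine(),
        "os": platform.platform(),
        "rustc": tool_output("rustc", "--version"),
        "python": sys.version.split()[0],
        "command": command,
        "fixtures": list(fixtures),
    }
```

A call such as `snapshot("cargo bench -p powerio-matrix --bench matrix", ["tests/data/case2869pegase.m"])` returns a dict that can be stored next to the timing output. Whether optional large cases were present can be read from the fixtures list.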
Testing and release checks
Keep changes reviewable. A numerical semantics change needs tests and a short reason in code or docs. A performance change needs before and after measurements. A documentation change should link to evidence instead of expanding the README into a second manual.
Baseline checks
These commands cover the Rust workspace, the Python extension build, the Python binding tests, and the book:
cargo fmt --all --check
cargo clippy --all-targets
cargo test
cargo test -p powerio-cli --test cli
cargo test -p powerio-capi
cargo build -p powerio-py
python3.12 -m venv .venv
.venv/bin/python -m pip install --upgrade pip maturin -r benchmarks/requirements.txt
env VIRTUAL_ENV=$PWD/.venv .venv/bin/maturin develop --release
.venv/bin/pytest python/tests
mdbook build docs
mdbook test docs
Route changes
Use the smallest gate set that covers the changed surface, then run the release gates before a release claim.
| changed surface | extra gates |
|---|---|
| parser or writer semantics | bash benchmarks/run_validation.sh; format round trip tests; affected cargo +nightly fuzz run <target> -- -runs=1 harnesses |
| rich model fields | bash benchmarks/run_rich_validation.sh |
| matrix builders or DC OPF bundles | cargo test -p powerio-matrix; cargo bench -p powerio-matrix --bench matrix |
| PowerWorld binary reader | PowerWorld parser tests plus cargo bench -p powerio --bench parse -- "parse_aux_\|parse_pwb_" |
| C ABI | scripts/capi-header-parity.sh; scripts/capi-smoke.sh; cargo test -p powerio-capi --no-default-features; cargo test -p powerio-capi --features arrow,gridfm,dist,pkg; matching clippy runs |
| Python package metadata or extras | maturin build --release --out /tmp/powerio-wheel-check; inspect wheel METADATA |
| Julia binding compatibility | build powerio-capi --features arrow,gridfm,dist,pkg, then run PowerIO.jl tests with POWERIO_CAPI |
| shared surface with PowerIO.jl | push a same-named PowerIO.jl companion branch; the tandem CI job tests against it |
| CLI behavior | cargo test -p powerio-cli --test cli |
| documentation or website | mdbook build docs; mdbook test docs; check stale links to retired guide outputs |
benchmarks/run_validation.sh requires the Python oracle stack in the same
Python 3.11+ venv as the local wheel. Missing PyPSA, pandapower, or egret is a
setup failure. benchmarks/run_rich_validation.sh treats the committed
PowerModels rich oracle as strict; missing Julia is a setup failure.
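A quick preflight can confirm that the oracle stack is importable before the validation script runs. This sketch is not part of run_validation.sh. `missing_modules` is a hypothetical helper, and the import names `pypsa`, `pandapower`, and `egret` are the usual ones for those packages. Run it with the venv interpreter that holds the local wheel.

```python
import importlib.util
import sys

ORACLE_MODULES = ("pypsa", "pandapower", "egret")


def missing_modules(names=ORACLE_MODULES):
    """Return the names that cannot be found from the current interpreter."""
    # find_spec locates a top-level module without importing it.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        # Report a setup failure instead of letting validation fail midway.
        sys.exit("missing oracle packages: " + ", ".join(missing))
```

The check covers only the Python packages. The Julia requirement of the rich validation script needs its own check.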
Release gates
Run the full set below, in addition to the baseline checks, before publishing a release claim:
cargo test -p powerio-capi --no-default-features
cargo test -p powerio-capi --features arrow,gridfm,dist,pkg
cargo clippy -p powerio-capi --all-targets --no-default-features -- -D warnings
cargo clippy -p powerio-capi --all-targets --features arrow,gridfm,dist,pkg -- -D warnings
cargo build -p powerio-capi --release --features arrow,gridfm,dist,pkg
scripts/capi-header-parity.sh
scripts/capi-smoke.sh
POWERIO_CAPI=$PWD/target/release/libpowerio_capi.dylib \
julia --project=../PowerIO.jl -e 'using Pkg; Pkg.test()'
cargo bench -p powerio-matrix --bench matrix -- 'matrix_bprime|matrix_ybus|dcopf_'
(cd benchmarks/asv && ../../.venv/bin/asv check -E existing:../../.venv/bin/python)
(cd benchmarks/asv && ../../.venv/bin/asv run --quick --show-stderr -E existing:../../.venv/bin/python --dry-run)
for target in matpower psse pslf powerio_json powerworld_aux pwb pwd; do
cargo +nightly fuzz run "$target" -- -runs=1
done
bash benchmarks/run_validation.sh
bash benchmarks/run_rich_validation.sh
run_validation.sh checks the classic transmission paths against
PowerModels.jl, ExaPowerIO.jl, egret, pandapower, and the full legacy reader to
writer matrix; run_rich_validation.sh covers fields outside the MATPOWER row
shape (branch terminal admittance, switches, current ratings, solution values,
HVDC costs, load voltage models). GOC3 and Surge have no external oracle in
this harness; the Rust parser, writer, routing, package, and round trip tests
cover them. What the oracle legs prove, per format, is in the
format fidelity chapter.
The gates do not prove every source format field is lossless. Known losses are part of the public behavior and surface as warnings.
Benchmark updates
Regenerate benchmark JSON before changing published tables:
julia --project=benchmarks benchmarks/bench_julia.jl --json
.venv/bin/python benchmarks/bench_parse.py --json <cases>
cargo bench -p powerio --bench parse -- "parse_aux_|parse_pwb_"
python3 benchmarks/extract_powerworld_bench.py
cargo bench -p powerio-matrix --bench matrix
python3 benchmarks/extract_matrix_bench.py
python3 benchmarks/render_tables.py
python3 benchmarks/render_tables.py --check
The ASV suite tracks Python wheel parse and matrix timing across git history. For an uncommitted worktree, smoke test it against the local venv:
cd benchmarks/asv
../../.venv/bin/asv check -E existing:../../.venv/bin/python
../../.venv/bin/asv run --quick --show-stderr -E existing:../../.venv/bin/python --dry-run
Do not update generated benchmark tables by hand. Update the snapshot environment described in the performance guide when publishing new numbers: commit, tree cleanliness, machine, OS, toolchain, Python stack, Julia stack, commands, fixtures, and optional local data.
Broad local corpora stay local. Pass them through documented environment
variables or --root flags, review the reports under benchmarks/results/, and
do not commit corpus paths or generated outputs.