Configuration File
The generation of an AC-OPF dataset is controlled entirely through an input YAML options file, settings.yaml. As the example below shows, the options are organized into sections, each of which is explained in detail hereafter.
CASE:
  name : "test"
  grid : "pglib_opf_case5_pjm.m"
  ...
SAMPLING:
  delta_pd : 100.0
  ...
CASE options
This section defines global parameters that control how the AC-OPF dataset is generated.
| Key | Type | Description |
|---|---|---|
| name | String | Name of the folder where the dataset is saved |
| grid | String | Full name of the power system grid (MATPOWER format .m file) |
| uid | Bool | Whether to generate a unique identifier for each AC-OPF instance |
| append | Bool | Whether to append new results to an existing dataset |
| baseseed | Int64 | Random number generator seed to control reproducibility |
| num_samples | Int64 | Total number of AC-OPF samples to generate |
| num_batches | Int64 | Number of batches in which the total samples are processed |
| num_items | Int64 | Number of load samples generated for a single total load active power level |
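For orientation, a fully specified CASE section might look like the sketch below. The keys are those listed in the table; every value other than name and grid is an illustrative assumption, not a recommended default.

```yaml
CASE:
  name        : "test"                    # dataset folder name
  grid        : "pglib_opf_case5_pjm.m"   # MATPOWER grid file
  uid         : true                      # assumed value: generate unique identifiers
  append      : false                     # assumed value: raise an error if "test" already exists
  baseseed    : 42                        # assumed value: any Int64 seed
  num_samples : 10000                     # assumed value
  num_batches : 4                         # assumed value
  num_items   : 10                        # assumed value
```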
A few options deserve a more detailed explanation.
Behaviour of append
If append is set to false and a dataset already exists under name, an error is raised to avoid overwriting it. If, instead, this option is true, new results are appended to the existing files. In this case, sampling resumes from the previously saved Random Number Generator (RNG) state, stored as rng_state.bin in the results folder. This file is automatically saved at the end of the dataset generation.
If append is true and no RNG state file is available, remember to change the RNG seed baseseed to avoid regenerating the same dataset.
Appending to an existing dataset with uid set to true regenerates, and therefore overwrites, the unique identifier mapping that is used for splitting. To avoid this behaviour (e.g., when generating a held-out dataset for NN testing), set uid to false.
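A minimal sketch of a CASE section for extending an existing dataset without touching its identifier mapping could look as follows. The seed value is an illustrative assumption: it only needs to differ from the seed of the original run, and it is changed here as a precaution in case rng_state.bin is missing.

```yaml
CASE:
  name     : "test"   # existing dataset folder to extend
  append   : true     # append new results instead of raising an error
  uid      : false    # leave the existing unique identifier mapping untouched
  baseseed : 43       # assumed value, different from the original seed
  # remaining CASE keys omitted
```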
Behaviour of num_batches
By design, AC-OPF instances are obtained by first sampling all input load profiles, then solving the OPF problem for each. Clearly, this can significantly increase memory consumption when generating large datasets. Processing the total number of samples num_samples in batches mitigates this issue.
However, each batch should contain at least roughly 2000 to 2500 samples, because AC-OPF convergence is monitored for each total load active power sample. After the first batch has been processed, the sampling region in total load active power is trimmed to exclude areas at the extrema of the distribution in which the AC-OPF never converges. If a batch contains too few samples to cover the total active power region truly uniformly, regions that are feasible for the AC-OPF may be wrongly trimmed.
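As a worked illustration of the batch size guideline, the sketch below assumes that samples are split evenly across batches, so that the batch size is num_samples divided by num_batches. The numbers are illustrative only.

```yaml
CASE:
  num_samples : 10000   # illustrative total
  num_batches : 4       # 10000 / 4 = 2500 samples per batch, in line with the guideline
  # num_batches : 10    # would give 1000 samples per batch, below the guideline
  # remaining CASE keys omitted
```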
Behaviour of num_items
As detailed in the reference publication, load sampling is performed uniformly in terms of total active power. Specifically, each sample is generated by slicing the convex polytope around a certain value of total load active power and by sampling uniformly within this slice. The option num_items controls how many load samples are generated from a single polytope slice.
Sampling from a polytope slice first requires computing its Chebyshev center. Setting num_items to 1 means computing as many Chebyshev centers as num_samples, which can be computationally demanding for large-scale power systems and/or large datasets. Increasing num_items reduces the number of Chebyshev center computations.
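The trade-off can be illustrated with a small sketch. It assumes one Chebyshev center per polytope slice and roughly num_samples divided by num_items slices in total; the values are illustrative.

```yaml
CASE:
  num_samples : 10000   # illustrative total
  num_items   : 10      # roughly 10000 / 10 = 1000 Chebyshev centers
  # num_items : 1       # would require 10000 Chebyshev centers, one per sample
  # remaining CASE keys omitted
```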
SAMPLING options
These options control how the input sampling space is created. Currently, this space consists solely of a convex polytope defined in terms of load active and reactive power variables.
| Key | Type | Unit | Description |
|---|---|---|---|
| delta_pd | Float64 | [%] | Percentage variation in load active power around the nominal values |
| delta_qd | Float64 | [%] | Percentage variation in load reactive power around the nominal values |
| delta_pf | Float64 | [1] | Maximum reduction in power factor w.r.t. the nominal absolute value |
| max_pf | Float64 | [1] | Maximum allowable load power factor |
| min_pf | Float64 | [1] | Minimum allowable load power factor |
The value of delta_pd cannot exceed 100%: the active power of a load is not allowed to change sign, so that the sign relation between active and reactive power is preserved. To create a load with negative (positive) active power at a bus that already has a positive (negative) one, add a new load with negative (positive) nominal active power to the power system dictionary or .m file.
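A complete SAMPLING section might look like the sketch below. All values are illustrative assumptions, and the ranges in the comments assume that the percentage variation is applied symmetrically around the nominal value.

```yaml
SAMPLING:
  delta_pd : 50.0   # a 100 MW nominal load would range from 50 MW to 150 MW
  delta_qd : 50.0   # same variation applied to reactive power
  delta_pf : 0.1    # assumed value: power factor may drop by at most 0.1
  max_pf   : 1.0    # assumed value
  min_pf   : 0.8    # assumed value
```

With delta_pd set to its upper limit of 100.0, the same 100 MW load would range from 0 MW to 200 MW under this assumption, so its active power never changes sign.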
PARALLEL options
The PARALLEL section controls distributed computing settings to be applied throughout the simulation.
| Key | Type | Unit | Description |
|---|---|---|---|
| cpu_ratio | Float64 | [%] | Percentage of CPU threads to use w.r.t. Sys.CPU_THREADS count |
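As a small illustration, the value below is an assumption chosen so that the percentage maps to a whole number of threads; how non-integer results are rounded is not specified here.

```yaml
PARALLEL:
  cpu_ratio : 50.0   # e.g. 8 threads on a machine where Sys.CPU_THREADS is 16
```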
MODEL options
These options are related to the modified PowerModels OPF model with slack variables for active and reactive load power.
| Key | Type | Unit | Description |
|---|---|---|---|
| duals | Bool | – | Whether to record dual values for every primal AC-OPF variable |
| voll | Float64 | [€/MWh] | Value Of Lost Load |
When duals is set to true, HEDGeOPF by design automatically retrieves the dual values, if available, of every JuMP.VariableRef defined in the model, but does not look for any JuMP.ConstraintRef object. This choice stems from the fact that PowerModels mainly employs anonymous, non-containerized JuMP constraints in its model definitions, which makes their (dual) values difficult to retrieve when inspecting results.
To record the dual values of branch apparent power, setting duals to true also modifies the OPF model by adding variables for the square of the branch apparent power at the from and to buses. When accessing and using the primal values of branch apparent power in the dataset results, the user should remember that they are squared.
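A MODEL section that enables dual recording might look like the sketch below. The voll value is purely illustrative and not a recommended setting.

```yaml
MODEL:
  duals : true     # record dual values of primal variables; branch apparent power is stored squared
  voll  : 3000.0   # assumed Value Of Lost Load in €/MWh
```

With this setting, taking the square root of the stored branch apparent power values recovers the apparent power magnitude.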
PATH options
These options define the input and output file paths relative to basepath, the path of the input configuration YAML file. The grid file is looked up in the folder basepath/PATH.input/. Similarly, the dataset is saved in basepath/PATH.output/CASE.grid/CASE.name/.
| Key | Type | Description |
|---|---|---|
| input | String | Relative path to the folder containing the grid file |
| output | String | Relative path to the folder where the AC-OPF dataset is saved |
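The path composition can be illustrated with the sketch below. The folder names are hypothetical, and the placeholders in angle brackets stand for basepath and for the CASE options; they are not literal values.

```yaml
PATH:
  input  : "data/grids"      # grid file looked up in <basepath>/data/grids/
  output : "data/datasets"   # dataset saved in <basepath>/data/datasets/<CASE.grid>/<CASE.name>/
```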
SOLVER options
The SOLVER section specifies the LP and NLP solvers employed in HEDGeOPF.
| Key | Type | Description |
|---|---|---|
| lp | String | Julia package name of linear programming solver |
| nlp | String | Julia package name of nonlinear programming solver |
| lp_options | Pair | Key-value pairs of options for the linear programming solver |
| nlp_options | Pair | Key-value pairs of options for the nonlinear programming solver |
The following example shows how lp_options and nlp_options can be specified in the YAML file.
SOLVER:
  lp : "HiGHS"
  lp_options :
    solver : "ipm"
  nlp : "Ipopt"
  nlp_options :
    max_cpu_time : 1000.0
    mumps_mem_percent : 10
    print_level : 0
HEDGeOPF is designed to be independent of the specific choice of LP and NLP solvers (as long as they support the types of variables and constraints used in the optimization models). The user should be able to switch solvers simply by installing the relevant Julia packages and changing the SOLVER section accordingly. The options of a specific solver, such as those of HiGHS and Ipopt, are typically listed in the solver's main documentation.