Configuration File

The generation of an AC-OPF dataset is entirely controlled and configured through an input YAML options file, settings.yaml. As shown in the example below, the options are organized into sections, each of which is explained in detail in the following.

CASE:
    name : "test"
    grid : "pglib_opf_case5_pjm.m"
    ...

SAMPLING:
    delta_pd : 100.0
    ...

CASE options

This section defines global parameters that control how the AC-OPF dataset is generated.

| Key | Type | Description |
|---|---|---|
| name | String | Name of the folder where the dataset is saved |
| grid | String | Full name of the power system grid (MATPOWER format .m file) |
| uid | Bool | Whether to generate a unique identifier for each AC-OPF instance |
| append | Bool | Whether to append new results to an existing dataset |
| baseseed | Int64 | Random number generator seed to control reproducibility |
| num_samples | Int64 | Total number of AC-OPF samples to generate |
| num_batches | Int64 | Number of batches in which the total samples are processed |
| num_items | Int64 | Number of load samples generated for a single total load active power level |

A few options deserve a more detailed description.

Behaviour of append

If append is set to false and a dataset already exists under name, an error is raised to avoid overwriting it. If, instead, this option is true, new results are appended to the existing files. In this case, sampling resumes from the previously saved Random Number Generator (RNG) state, stored as rng_state.bin in the results folder. This file is automatically saved at the end of the dataset generation.

Warning

If append is true and no RNG state file is available, remember to change the RNG seed baseseed to avoid regenerating the same dataset.

Appending to an existing dataset with uid set to true implies re-generating/overwriting the unique identifier mapping that is used for splitting. To avoid this behaviour (e.g., when generating a held-out dataset for NN testing), switch this option to false.
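The append rules described in this section can be summarized as a small decision function. The following Python sketch is only an illustration of the documented behaviour, not HEDGeOPF code: the function name resolve_start, its return labels, and the test for an "existing" dataset (the folder exists) are assumptions made for this example.

```python
from pathlib import Path


def resolve_start(dataset_dir, append):
    """Return how sampling would start for a given dataset folder.

    Hypothetical helper: mirrors the documented rules, not the actual
    HEDGeOPF implementation.
    """
    dataset_dir = Path(dataset_dir)

    # Assumption: a dataset "exists" as soon as its folder exists.
    if dataset_dir.exists() and not append:
        raise FileExistsError(
            f"dataset {dataset_dir} already exists and append is false"
        )

    # With append = true, a saved RNG state lets sampling resume where it stopped.
    if append and (dataset_dir / "rng_state.bin").is_file():
        return "resume_rng_state"

    # Otherwise the RNG is seeded from baseseed; when appending, baseseed
    # should then be changed to avoid regenerating the same samples.
    return "seed_from_baseseed"
```

The last branch corresponds to the situation covered by the warning above: appending without an rng_state.bin file falls back to baseseed.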

Behaviour of num_batches

By design, AC-OPF instances are obtained by first sampling all input load profiles and then solving the OPF problem for each of them. This can significantly increase memory consumption when generating large datasets. Processing the total number of samples num_samples in num_batches batches mitigates this issue.

Warning

The batch size, i.e., the number of samples processed per batch, should not be smaller than 2000-2500 samples. AC-OPF convergence is monitored for each total load active power sample and, after the first batch has been processed, the sampling region in total load active power is trimmed to exclude the areas at the extrema of the distribution in which the AC-OPF never converges. If a batch contains too few samples to cover the total active power region truly uniformly, regions that are feasible for the AC-OPF may be wrongly trimmed.
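As a quick sanity check before launching a run, the batch size can be compared with the guideline above. This Python sketch assumes that the samples are split evenly across batches (batch size = num_samples divided by num_batches), which the documentation does not state explicitly. The helper names and the 2000 threshold constant are hypothetical.

```python
MIN_BATCH_SIZE = 2000  # lower end of the 2000-2500 guideline


def batch_size(num_samples, num_batches):
    """Samples per batch, assuming an even split (an assumption of this sketch)."""
    if num_batches < 1:
        raise ValueError("num_batches must be at least 1")
    return num_samples // num_batches


def batch_size_ok(num_samples, num_batches, minimum=MIN_BATCH_SIZE):
    """True if each batch holds at least `minimum` samples."""
    return batch_size(num_samples, num_batches) >= minimum
```

For example, 10000 samples in 4 batches give 2500 samples per batch, which satisfies the guideline, whereas 10 batches give only 1000 samples per batch.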

Behaviour of num_items

As detailed in the reference publication, load sampling is performed uniformly in terms of total active power. Specifically, each sample is generated by slicing the convex polytope around a certain value of total load active power and by sampling uniformly within this slice. The option num_items controls how many load samples are generated from a single polytope slice.

Note

Sampling from a polytope slice requires first computing its Chebyshev center. Setting num_items to 1 therefore means computing as many Chebyshev centers as num_samples, which can be computationally demanding for large-scale power systems and/or large datasets. Increasing num_items reduces the number of Chebyshev center computations.
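The trade-off can be made concrete with a rough count. The sketch below assumes that one Chebyshev center is computed per polytope slice and that each slice yields num_items samples, so the number of centers is about num_samples / num_items rounded up. This rounding is an assumption of the example, and the function name is hypothetical.

```python
def num_chebyshev_centers(num_samples, num_items):
    """Approximate number of Chebyshev center computations.

    Assumes one center per slice and num_items samples per slice
    (illustrative estimate, not taken from the HEDGeOPF source).
    """
    if num_items < 1:
        raise ValueError("num_items must be at least 1")
    # Integer ceiling division: a partially filled slice still needs a center.
    return -(-num_samples // num_items)
```

With num_samples equal to 10000, num_items = 1 requires 10000 center computations, while num_items = 10 reduces this to 1000.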

SAMPLING options

These options control how the input sampling space is created. Currently, this space consists solely of a convex polytope defined in terms of load active and reactive power variables.

| Key | Type | Unit | Description |
|---|---|---|---|
| delta_pd | Float64 | [%] | Percentage variation in load active power around the nominal values |
| delta_qd | Float64 | [%] | Percentage variation in load reactive power around the nominal values |
| delta_pf | Float64 | [1] | Maximum reduction in power factor w.r.t. the nominal absolute value |
| max_pf | Float64 | [1] | Maximum allowable load power factor |
| min_pf | Float64 | [1] | Minimum allowable load power factor |

Note

The value of delta_pd cannot exceed 100%: the active power of a load must keep the sign of its nominal value, so that the sign relation between active and reactive power is preserved. To create a load with negative (positive) active power at a bus that already has a positive (negative) one, add a new load with negative (positive) nominal active power to the power system dictionary or .m file.
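A short numerical illustration of why 100% is the limit: assuming that delta_pd defines a symmetric interval around the nominal active power (an assumption of this sketch, since the exact bound formula is not given here), a variation of 100% is the largest one for which the interval does not cross zero. The function name pd_bounds is hypothetical.

```python
def pd_bounds(pd_nominal, delta_pd):
    """Lower and upper active power bounds for one load.

    Assumes a symmetric variation of delta_pd percent around the nominal
    value (illustrative only, not the HEDGeOPF formula).
    """
    if not 0.0 <= delta_pd <= 100.0:
        raise ValueError("delta_pd must lie between 0 and 100 percent")
    frac = delta_pd / 100.0
    # sorted() keeps (lower, upper) ordering for negative nominal values too.
    lower, upper = sorted((pd_nominal * (1.0 - frac), pd_nominal * (1.0 + frac)))
    return lower, upper
```

For a nominal value of 3.0 and delta_pd = 100.0 the bounds are (0.0, 6.0), so the load never changes sign. For a nominal value of -2.0 and delta_pd = 50.0 the bounds are (-3.0, -1.0).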

PARALLEL options

The PARALLEL section controls distributed computing settings to be applied throughout the simulation.

| Key | Type | Unit | Description |
|---|---|---|---|
| cpu_ratio | Float64 | [%] | Percentage of CPU threads to use w.r.t. Sys.CPU_THREADS count |
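To get a feeling for what a given cpu_ratio means on a specific machine, the percentage can be converted into a thread count. The sketch below is only an estimate: rounding down and keeping at least one thread are assumptions of this example, not documented HEDGeOPF behaviour, and the function name is hypothetical. The argument cpu_threads plays the role of Sys.CPU_THREADS in Julia.

```python
def threads_to_use(cpu_ratio, cpu_threads):
    """Estimate the number of threads selected by cpu_ratio (in percent).

    Assumptions of this sketch: the result is rounded down and clamped
    to the range [1, cpu_threads].
    """
    if not 0.0 <= cpu_ratio <= 100.0:
        raise ValueError("cpu_ratio must lie between 0 and 100 percent")
    estimate = int(cpu_threads * cpu_ratio / 100.0)
    return max(1, min(cpu_threads, estimate))
```

On a machine with 8 threads, cpu_ratio = 50.0 corresponds to 4 threads under these assumptions.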

MODEL options

These options are related to the modified PowerModels OPF model with slack variables for active and reactive load power.

| Key | Type | Unit | Description |
|---|---|---|---|
| duals | Bool | | Whether to record dual values for every primal AC-OPF variable |
| voll | Float64 | [€/MWh] | Value Of Lost Load |

When duals is set to true, HEDGeOPF automatically retrieves the dual values, if available, of every JuMP.VariableRef defined in the model. By design, however, it does not look for any JuMP.ConstraintRef object. This choice stems from the fact that PowerModels mainly employs anonymous, non-containerized JuMP constraints in its model definitions, which makes it difficult to retrieve their (dual) values when inspecting results.

Warning

To record the dual value of branch apparent power, the OPF model is modified when duals is set to true by adding variables for the square of the branch apparent power at the from and to buses. When accessing and using the primal values for branch apparent power in the dataset results, the user should remember that they are squared.
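When post-processing a dataset generated with duals set to true, the apparent power magnitudes can be recovered by taking the square root of the stored values. The Python sketch below uses made-up numbers and a hypothetical helper name. Clamping slightly negative values to zero is an assumption added here to guard against solver round-off.

```python
import math


def apparent_power(squared_values):
    """Convert squared branch apparent power values back to magnitudes.

    Hypothetical post-processing helper. Values below zero are treated
    as numerical noise and clamped to zero before taking the root.
    """
    return [math.sqrt(max(value, 0.0)) for value in squared_values]


# Illustrative squared values (not real results).
magnitudes = apparent_power([4.0, 0.25, 0.0])
```

The same conversion applies to the values at both the from and the to bus of each branch.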

PATH options

These options define the input and output file paths relative to basepath, the location of the input YAML configuration file. The grid file is looked up in basepath/PATH.input/, while the dataset is saved in basepath/PATH.output/CASE.grid/CASE.name/.

| Key | Type | Description |
|---|---|---|
| input | String | Relative path to the folder containing the grid file |
| output | String | Relative path to the folder where the AC-OPF dataset is saved |
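The path composition described above can be reproduced with a few lines of Python. This is a sketch for illustration only: the helper names and the example basepath, input and output values are hypothetical. CASE.grid is used verbatim, since the text does not say whether the .m extension is stripped from the folder name.

```python
from pathlib import PurePosixPath


def grid_folder(basepath, path_input):
    """Folder containing the grid file: basepath/PATH.input/."""
    return PurePosixPath(basepath) / path_input


def dataset_folder(basepath, path_output, case_grid, case_name):
    """Dataset folder: basepath/PATH.output/CASE.grid/CASE.name/."""
    return PurePosixPath(basepath) / path_output / case_grid / case_name


# Hypothetical values, combined with the CASE example shown at the top.
base = "/home/user/project"
grids = grid_folder(base, "data/input")
dataset = dataset_folder(base, "data/output", "pglib_opf_case5_pjm.m", "test")
```

With these values, the grid file is searched in /home/user/project/data/input and the dataset is written under /home/user/project/data/output/pglib_opf_case5_pjm.m/test.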

SOLVER options

The SOLVER section specifies the LP and NLP solvers employed in HEDGeOPF.

| Key | Type | Description |
|---|---|---|
| lp | String | Julia package name of linear programming solver |
| nlp | String | Julia package name of nonlinear programming solver |
| lp_options | Pair | Key-value pairs of options for the linear programming solver |
| nlp_options | Pair | Key-value pairs of options for the nonlinear programming solver |

The following example shows how lp_options and nlp_options can be specified in the YAML file.

SOLVER:
    lp          : "HiGHS"
    lp_options  :
        solver            : "ipm"  
    nlp         : "Ipopt"
    nlp_options :
        max_cpu_time      : 1000.0
        mumps_mem_percent : 10
        print_level       : 0

HEDGeOPF is designed to be independent of the specific choice of LP and NLP solvers, as long as they support the types of variables and constraints used in the optimization models. The user should be able to switch solvers simply by installing the relevant Julia packages and changing the SOLVER section accordingly. The options of a specific solver, such as those of HiGHS and Ipopt, are typically listed in the solver's main documentation.

Note

Currently only HiGHS and Ipopt solvers have been tested with HEDGeOPF, with Ipopt being equipped with the default sequential MUMPS linear solver.