Configuration File

The generation of an AC-OPF dataset is controlled entirely through an input YAML options file, settings.yaml. As the excerpt below shows, the options are organized into sections, each of which is explained in detail hereafter.

CASE:
    name : "test"
    grid : "pglib_opf_case5_pjm.m"
    ...

SAMPLING:
    delta_pd : 100.0
    ...

`CASE` options

This section defines global parameters that control how the AC-OPF dataset is generated.

Key	Type	Description
`name`	`String`	Name of the folder where the dataset is saved
`grid`	`String`	Full name of the power system grid (MATPOWER format `.m` file)
`uid`	`Bool`	Whether to generate a unique identifier for each AC-OPF instance
`append`	`Bool`	Whether to append new results to an existing dataset
`baseseed`	`Int64`	Random number generator seed to control reproducibility
`num_samples`	`Int64`	Total number of AC-OPF samples to generate
`num_batches`	`Int64`	Number of batches in which the total samples are processed
`num_items`	`Int64`	Number of load samples generated for a single total load active power level

A few options deserve a more detailed explanation.

Behaviour of `append`

If append is set to false and a dataset already exists under name, an error is raised to avoid overwriting it. If, instead, this option is true, new results are appended to the existing files. In this case, sampling resumes from the previously saved Random Number Generator (RNG) state, stored as rng_state.bin in the results folder. This file is automatically saved at the end of the dataset generation.

Warning

If append is true and no RNG state file is available, remember to change the RNG seed baseseed to avoid regenerating the same dataset.

Appending to an existing dataset with uid set to true regenerates and overwrites the unique identifier mapping that is used for splitting. To avoid this behaviour (e.g., when generating a held-out dataset for NN testing), set uid to false.
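The rules for append described above can be summarized as a small decision function. This is only an illustrative sketch in Python, not HEDGeOPF code: the function name plan_generation and its return labels are hypothetical, and it assumes that "a dataset already exists" can be reduced to a boolean flag.

```python
def plan_generation(append: bool, dataset_exists: bool, rng_state_exists: bool) -> str:
    """Return the action implied by the append rules (hypothetical helper)."""
    if not dataset_exists:
        # Nothing to overwrite: a fresh dataset is created.
        return "create"
    if not append:
        # An existing dataset is never overwritten when append is false.
        raise FileExistsError("dataset already exists and append is false")
    if rng_state_exists:
        # rng_state.bin is available: sampling resumes from the saved RNG state.
        return "resume"
    # No saved RNG state: baseseed should be changed to avoid duplicate samples.
    return "reseed"
```

The "reseed" outcome corresponds to the case covered by the first warning: appending is allowed, but the seed has to be changed manually.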

Behaviour of `num_batches`

By design, AC-OPF instances are obtained by first sampling all input load profiles, then solving the OPF problem for each. Clearly, this can significantly increase memory consumption when generating large datasets. Processing the total number of samples num_samples in batches mitigates this issue.

Warning

However, the batch size should not be smaller than 2000-2500 samples, because AC-OPF convergence is monitored for each total load active power sample. After the first batch is processed, the sampling region in total load active power is trimmed to exclude areas at the extrema of the distribution in which the AC-OPF never converges. If a batch contains too few samples to cover the total active power region uniformly, regions that are feasible for the AC-OPF may be wrongly trimmed.
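A quick way to check a configuration against this guideline is to compute the resulting batch sizes before launching a run. The Python sketch below is illustrative only: it assumes that num_samples is split into num_batches nearly equal batches, which may differ from how HEDGeOPF actually partitions the samples, and both function names are hypothetical.

```python
from typing import List


def batch_sizes(num_samples: int, num_batches: int) -> List[int]:
    """Split num_samples into num_batches nearly equal parts (assumed rule)."""
    base, extra = divmod(num_samples, num_batches)
    # The first `extra` batches receive one additional sample.
    return [base + 1 if i < extra else base for i in range(num_batches)]


def smallest_batch_ok(num_samples: int, num_batches: int, minimum: int = 2000) -> bool:
    """Check the smallest batch against the lower end of the 2000-2500 guideline."""
    return min(batch_sizes(num_samples, num_batches)) >= minimum
```

For instance, under this assumption 10000 samples in 4 batches give batches of 2500 samples each, whereas 10 batches would give only 1000 samples per batch, below the recommended minimum.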

Behaviour of `num_items`

As detailed in the reference publication, load sampling is performed uniformly in terms of total active power. Specifically, each sample is generated by slicing the convex polytope around a certain value of total load active power and by sampling uniformly within this slice. The option num_items controls how many load samples are generated from a single polytope slice.

Note

Sampling from a polytope slice requires first computing its Chebyshev center. Setting num_items to 1 means computing as many Chebyshev centers as num_samples, which can be computationally demanding for large-scale power systems and/or large datasets. Increasing num_items reduces the number of Chebyshev center computations.
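The trade-off can be estimated with a one-line calculation. The sketch below assumes that one Chebyshev center is computed per polytope slice and that the number of slices is num_samples divided by num_items, rounded up; the rounding rule and the function name are assumptions, not taken from HEDGeOPF.

```python
import math


def num_chebyshev_centers(num_samples: int, num_items: int) -> int:
    """Estimate how many polytope slices (and centers) are needed (assumed rule)."""
    if num_items < 1:
        raise ValueError("num_items must be at least 1")
    # One center per slice; each slice yields up to num_items load samples.
    return math.ceil(num_samples / num_items)
```

With num_samples equal to 10000, this estimate drops from 10000 centers for num_items equal to 1 to 1000 centers for num_items equal to 10.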

`SAMPLING` options

These options control how the input sampling space is created. Currently, this space consists solely of a convex polytope defined in terms of load active and reactive power variables.

Key	Type	Unit	Description
`delta_pd`	`Float64`	[%]	Percentage variation in load active power around the nominal values
`delta_qd`	`Float64`	[%]	Percentage variation in load reactive power around the nominal values
`delta_pf`	`Float64`	[1]	Maximum reduction in power factor w.r.t. the nominal absolute value
`max_pf`	`Float64`	[1]	Maximum allowable load power factor
`min_pf`	`Float64`	[1]	Minimum allowable load power factor

Note

The value of delta_pd cannot exceed 100%: the active power of a load must keep the sign of its nominal value, so that the sign relation between active and reactive power is preserved. To create a load with negative (positive) active power at a bus that already has a positive (negative) one, add a new load with negative (positive) nominal active power to the power system dictionary or .m file.
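The 100% limit is easy to see from the bounds that delta_pd implies for a single load. The Python sketch below assumes a symmetric interval around the nominal value, which is an interpretation of "percentage variation around the nominal values" rather than a documented formula; the function name is hypothetical.

```python
from typing import Tuple


def active_power_bounds(pd_nominal: float, delta_pd: float) -> Tuple[float, float]:
    """Return (lower, upper) active power bounds for one load (assumed rule)."""
    if not 0.0 <= delta_pd <= 100.0:
        raise ValueError("delta_pd must lie between 0 and 100 percent")
    ratio = delta_pd / 100.0
    a = pd_nominal * (1.0 - ratio)
    b = pd_nominal * (1.0 + ratio)
    # Sort so the result is (lower, upper) for negative nominal values too.
    return (min(a, b), max(a, b))
```

At delta_pd equal to 100, a load with nominal active power 2.0 can range from 0.0 to 4.0. Any larger value would let the lower bound cross zero and flip the sign of the load.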

`PARALLEL` options

The PARALLEL section controls distributed computing settings to be applied throughout the simulation.

Key	Type	Unit	Description
`cpu_ratio`	`Float64`	[%]	Percentage of CPU threads to use w.r.t. `Sys.CPU_THREADS` count
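As a rough illustration of what cpu_ratio means, the Python sketch below converts a percentage into a thread count. It uses os.cpu_count() as a stand-in for Julia's Sys.CPU_THREADS, and it assumes rounding down with a minimum of one thread; HEDGeOPF may round differently, and the function name is hypothetical.

```python
import os
from typing import Optional


def worker_count(cpu_ratio: float, cpu_threads: Optional[int] = None) -> int:
    """Convert a percentage of CPU threads into a thread count (assumed rule)."""
    if cpu_threads is None:
        # Stand-in for Sys.CPU_THREADS; falls back to 1 if undetermined.
        cpu_threads = os.cpu_count() or 1
    # Round down, but never go below a single thread.
    return max(1, int(cpu_threads * cpu_ratio / 100.0))
```

Under these assumptions, a cpu_ratio of 50.0 on a machine with 8 threads corresponds to 4 threads.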

`MODEL` options

These options are related to the modified PowerModels OPF model with slack variables for active and reactive load power.

Key	Type	Unit	Description
`duals`	`Bool`	–	Whether to record dual values for every primal AC-OPF variable
`voll`	`Float64`	[€/MWh]	Value Of Lost Load

When duals is set to true, HEDGeOPF by design retrieves the dual values, where available, of every JuMP.VariableRef defined in the model, but does not look for any JuMP.ConstraintRef object. This choice stems from the fact that PowerModels mainly employs anonymous, non-containerized JuMP constraints in its model definitions, which makes it difficult to retrieve their (dual) values when inspecting results.

Warning

To record the dual value of branch apparent power, the OPF model is modified when duals is set to true by adding variables for the square of the branch apparent power at the from and to buses. When accessing and using the primal values for branch apparent power in the dataset results, the user should remember that they are squared.
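When post-processing such results, the apparent power magnitude can be recovered with a square root. The helper below is a minimal sketch with a hypothetical name; clamping small negative values to zero is a design choice made here to guard against solver round-off, not something prescribed by HEDGeOPF.

```python
import math


def apparent_power_magnitude(s_squared: float) -> float:
    """Recover |S| from a stored squared branch apparent power value."""
    # Clamp tiny negative values caused by numerical noise before the root.
    return math.sqrt(max(s_squared, 0.0))
```

For example, a stored value of 25.0 corresponds to an apparent power magnitude of 5.0 in the same base units.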

`PATH` options

These options define input and output paths relative to basepath, the path of the input configuration YAML file. The grid file is located in basepath/PATH.input/, and the dataset is saved in basepath/PATH.output/CASE.grid/CASE.name/.

Key	Type	Description
`input`	`String`	Relative path to the folder containing the grid file
`output`	`String`	Relative path to the folder where the AC-OPF dataset is saved
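The path composition can be sketched as follows. This Python snippet is illustrative only: the function name is hypothetical, the settings are passed as a plain dictionary, and CASE.grid is inserted into the dataset path verbatim, since the text does not say whether the .m extension is stripped.

```python
from pathlib import PurePosixPath


def resolve_paths(basepath: str, settings: dict):
    """Compose the grid file and dataset paths from basepath (assumed rule)."""
    base = PurePosixPath(basepath)
    case, path = settings["CASE"], settings["PATH"]
    # Grid file: basepath/PATH.input/ followed by the grid file name.
    grid_file = base / path["input"] / case["grid"]
    # Dataset folder: basepath/PATH.output/CASE.grid/CASE.name/
    dataset_dir = base / path["output"] / case["grid"] / case["name"]
    return grid_file, dataset_dir
```

The PATH values used to try this out, such as "data" and "results", are placeholders chosen for illustration and do not come from the package.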

`SOLVER` options

The SOLVER section specifies the LP and NLP solvers employed in HEDGeOPF.

Key	Type	Description
`lp`	`String`	Julia package name of linear programming solver
`nlp`	`String`	Julia package name of nonlinear programming solver
`lp_options`	`Pair`	Key-value pairs of options for the linear programming solver
`nlp_options`	`Pair`	Key-value pairs of options for the nonlinear programming solver

The following example shows how lp_options and nlp_options can be specified in the YAML file.

SOLVER:
    lp          : "HiGHS"
    lp_options  :
        solver            : "ipm"  
    nlp         : "Ipopt"
    nlp_options :
        max_cpu_time      : 1000.0
        mumps_mem_percent : 10
        print_level       : 0

HEDGeOPF is designed to be independent of the specific choice of LP and NLP solvers, as long as they support the types of variables and constraints used in the optimization models. The user should be able to switch solvers simply by installing the relevant Julia packages and changing the SOLVER section accordingly. The options of a specific solver, such as those of HiGHS and Ipopt, are typically listed in that solver's main documentation.

Note

Currently, only the HiGHS and Ipopt solvers have been tested with HEDGeOPF, with Ipopt using its default sequential MUMPS linear solver.