Dakota interface

Parsim can be used as an interface to the Dakota toolkit, developed by Sandia National Laboratories. For projects where you have already developed parameterized models with Parsim, this gives easy access to one of the most complete collections of methods and tools available for optimization, uncertainty quantification and parameter estimation.

What you need is a Dakota input file, which specifies which method to use and which variables to vary. You also need to add a couple of lines to your simulation script, so that it outputs the response variables that Dakota wants back. You then use the psm dakota command to have Parsim create an empty Study and start Dakota. Dakota uses a special analysis driver to create the cases needed for the analysis, based on your existing Parsim model template.

Dakota must be installed according to the Dakota installation instructions. The Dakota executable should be in the executable path of your OS environment, so that the single command dakota starts the program. To check your Dakota installation, you can run Dakota to output version information:

> dakota -v
Dakota version 6.5 released Nov 11 2016.
Repository revision f928f89 (2016-11-10) built Nov 11 2016 05:09:45.
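
If you prefer to make the same check from a Python script, the following is a minimal sketch. The function name dakota_version is our own and not part of Parsim or Dakota; it only assumes that the executable is called dakota and accepts the -v option shown above.

```python
import shutil
import subprocess


def dakota_version(executable="dakota"):
    """Return the first line printed by `<executable> -v`.

    Returns None if the executable is not found on the search path.
    Hypothetical helper, not part of Parsim or Dakota.
    """
    path = shutil.which(executable)
    if path is None:
        return None
    result = subprocess.run([path, "-v"], capture_output=True, text=True)
    lines = result.stdout.splitlines()
    return lines[0] if lines else ""


if __name__ == "__main__":
    version = dakota_version()
    if version is None:
        print("dakota was not found in the executable path")
    else:
        print(version)
```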

The Dakota input file

The Dakota input file contains a specification of the method to use, the model parameters to modify, how to execute the simulation and what output to expect. You need to study the Dakota user documentation and tutorials to learn how to use the functionality provided.

In the “variables” section of the input file, you need to make sure the variable “descriptors” match parameter names in your Parsim model template.
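
A mismatch between descriptors and parameter names is easy to overlook, so it can be worth checking before starting a long run. The sketch below is only an illustration: the helper names are hypothetical, and the parsing assumes that all descriptors are written on the same line as the descriptors keyword, as in the examples of this section. It is not a general parser for Dakota input files.

```python
import re


def dakota_descriptors(input_text):
    """Collect the quoted names that follow a 'descriptors' keyword.

    Assumption: the names are on the same line as the keyword.
    """
    names = []
    for line in input_text.splitlines():
        line = line.split("#", 1)[0]  # drop trailing comments
        match = re.match(r"\s*descriptors\b(.*)", line)
        if match:
            # Accept both single- and double-quoted names
            names.extend(re.findall(r"""['"]([^'"]+)['"]""", match.group(1)))
    return names


def unknown_descriptors(input_text, parameter_names):
    """Return descriptors that are not among the given parameter names."""
    return [n for n in dakota_descriptors(input_text) if n not in parameter_names]
```

You would call unknown_descriptors with the text of your Dakota input file and the set of parameter names defined in your model template; an empty list means that every descriptor has a matching parameter.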

In the “interface” section of the Dakota input file, you need to tell Dakota how to create a new case and execute the simulation. This section must have exactly this content:

interface
  analysis_driver = 'psm_dakota_driver'
    fork
    parameters_file = 'params.in'
    results_file = 'results.out'

Here psm_dakota_driver is a special script executable installed with Parsim.
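
To give an idea of what an analysis driver receives, the sketch below reads the variable values from a parameters file. It assumes the standard Dakota parameters file format, where the first line holds the number of variables followed by the word variables, and each of the following lines holds a value and its descriptor. The function read_dakota_variables is a hypothetical helper for illustration; it is not the code of psm_dakota_driver.

```python
def read_dakota_variables(text):
    """Return {descriptor: value} from the text of a Dakota parameters file.

    Assumes the standard (non-APREPRO) format:
        <n> variables
        <value> <descriptor>     (repeated n times)
    Anything after the variables block is ignored here.
    """
    lines = [line.split() for line in text.splitlines() if line.strip()]
    if not lines or len(lines[0]) != 2 or lines[0][1] != "variables":
        raise ValueError("unexpected first line in parameters file")
    n = int(lines[0][0])
    return {name: float(value) for value, name in lines[1:1 + n]}
```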

Output from the simulation

Your simulation script needs to be modified so that it writes the response variables in the proper format to an output file in the case directory, usually results.out. The name of the output file is specified by the “results_file” entry in the interface section of the Dakota input file (see above).
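
As a minimal sketch of such a modification, the helper below writes one response value per line, each followed by a text label. This assumes the standard Dakota results file format, in which the label after the value is optional. The function names and the label f are our own choices for illustration.

```python
def format_dakota_results(responses):
    """Format (value, label) pairs, one response per line."""
    return "".join("%.15e %s\n" % (value, label) for value, label in responses)


def write_dakota_results(responses, path="results.out"):
    """Write the responses to the results file in the case directory.

    The default file name must match the 'results_file' entry of the
    Dakota input file.
    """
    with open(path, "w") as fout:
        fout.write(format_dakota_results(responses))
```

At the end of a simulation script, a call such as write_dakota_results([(f, 'f')]) would then produce the file that Dakota reads back.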

Restarts

The Parsim interface supports the Dakota restart functionality. Dakota writes a binary restart file with results from the function evaluations. Parsim saves restart files in the Study directory, for successive Dakota restarts. The initial Dakota run has a run index of 0, and restarts are numbered 1, 2, etc. Parsim also saves copies of the Dakota input file used at each run, tagged by the same run index.

The psm dakota command has an option --restart to request a restart and specify the index of the restart file to use. The index number itself is optional; if no index is given, the last restart file is used for the restart. You still need to specify the Dakota input file and the simulation executable, as these may have changed.

Dakota has an option -stop_restart to specify how many saved records to read from the restart file. The psm dakota command has a corresponding option --stop_restart.

To understand how the Dakota restart functionality works, please consult the Dakota documentation.

Dakota execution phases and pre-run

Dakota has three execution phases: pre-run, run and post-run. Some Dakota methods are implemented so that the pre-run phase can be run separately, which means that a table of case specifications will be generated without actually launching a simulation executable. This functionality is typically supported for sampling, parameter study and DACE methods.

The psm dakota command has an option --pre_run to use Dakota in pre-run mode. If supported by the selected method, this will create the corresponding Parsim cases of the study. The user can then use this as any other study, running simulation scripts and other activities using the psm run command, collecting results with the psm collect command, etc.

Example: Rosenbrock, gradient optimization

As an example, let us look at the Rosenbrock problem of the Dakota user documentation, where a gradient optimization method is used to find the minimum of the Rosenbrock function.

Assume we have the following Python script, which computes the Rosenbrock function for a fixed point:

"""
Compute Rosenbrock function.
"""
from __future__ import print_function

def rb(x1, x2):
    return 100*(x2 - x1**2)**2 + (1 - x1)**2

# -----------------------------------
# Input data
# -----------------------------------

x1 = 1.5
x2 = 0.5

# -----------------------------------
# Calculation/simulation
# -----------------------------------

f = rb(x1, x2)

# -----------------------------------
# Results output
# -----------------------------------

print('x1', 'x2', 'f')
print(x1, x2, f)

We assume you already have a Parsim project to work with; otherwise, create one as explained in the tutorial.

Creating the model template

In the modelTemplates directory of your project, create a new model template called “rosenbrock”:

> cd modelTemplates
> mkdir rosenbrock

Inside the rosenbrock directory, create a parameterized version of the Python script above, and name it rb.py.macro:

"""
Compute Rosenbrock function.
"""
from __future__ import print_function

import json

def rb(x1, x2):
    return 100*(x2 - x1**2)**2 + (1 - x1)**2

#-----------------------------------
# Input data
#-----------------------------------

x1 = $(x1)
x2 = $(x2)

#-----------------------------------
# Calculation/simulation
#-----------------------------------

f = rb(x1, x2)

#-----------------------------------
# Results output
#-----------------------------------

#print('x1', 'x2', 'f')
#print(x1, x2, f)

with open('results.out', 'w') as fout:
    fout.write('%s\n' % str(f))

with open('results.json', 'w') as fjson:
    fjson.write(json.dumps({'f': f}))

The only changes we have made are to introduce the parameters x1 and x2 for the input data, and to write the computed function value to the output file results.out. We also chose to output the results in JSON format, as discussed in the Parsim tutorial, in case we want to collect the results in tabular format with the psm collect command.
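
To illustrate what the parameterization means, the sketch below replaces $(name) placeholders with values from a dictionary. This is only a toy illustration of the idea: it is not how Parsim processes macro files, it handles nothing but plain $(name) placeholders, and the function name substitute is hypothetical.

```python
import re


def substitute(template_text, parameters):
    """Replace each $(name) placeholder with the value of parameters[name].

    Toy illustration only; raises KeyError for an undefined parameter.
    """
    def replace(match):
        return str(parameters[match.group(1)])

    return re.sub(r"\$\((\w+)\)", replace, template_text)


if __name__ == "__main__":
    template = "x1 = $(x1)\nx2 = $(x2)\n"
    print(substitute(template, {"x1": -1.2, "x2": 1.0}))
```

With the values used here, the two template lines become ordinary Python assignments, so the resulting script can be executed like the original one.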

The model template must also have a default.parameters file, which defines default values for the parameters:

#==========================================================
# Model template "rosenbrock"
#
# Computes Rosenbrock problem.
# Outputs function value to 'results.out'
#==========================================================
x1  :   1.0
x2  :   0.5

Modifying the Dakota input file

Compared to the original example in the Dakota manual, the only change needed in the input file rosen_grad_opt.in is the specification of the analysis driver in the interface section:

# Dakota Input File: rosen_grad_opt.in
# Usage:
#   dakota -i rosen_grad_opt.in -o rosen_grad_opt.out > rosen_grad_opt.stdout

environment
  # graphics
  tabular_data
    tabular_data_file = 'rosen_grad_opt.dat'

method
  max_iterations = 100
  convergence_tolerance = 1e-4
  conmin_frcg

model
  single

variables
  continuous_design = 2
    initial_point    -1.2      1.0
    lower_bounds     -2.0     -2.0
    upper_bounds      2.0      2.0
    descriptors       'x1'     "x2"

interface
  analysis_driver = 'psm_dakota_driver'
    fork
    parameters_file = 'params.in'
    results_file = 'results.out'

responses
  objective_functions = 1
#  analytic_gradients
  numerical_gradients
    method_source dakota
    interval_type forward
    fd_gradient_step_size = 1.e-5
  no_hessians

We here need to use the special executable psm_dakota_driver.

Running Dakota with Parsim

To run the Dakota optimization, with our new Parsim model template, use the psm dakota command:

psm dakota --template rosenbrock --name rb1 rosen_grad_opt.in rb.py

The first positional argument is the Dakota input file, the second is the name of the simulation executable, in this case the simple parameterized Python script above.

The output from Dakota is found inside the study directory study_rb1. The standard output from Dakota (the execution history) is found in dakota.out. The Dakota input file also instructed Dakota to write tabular data to the file rosen_grad_opt.dat.
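
If you want to inspect the tabular data from Python, a sketch like the following may be enough. It assumes that the file starts with a header line of column names prefixed by %, followed by whitespace-separated rows, and that the objective column is named obj_fn. Both are assumptions about the Dakota output, so check the header of your own file. The helper names are hypothetical.

```python
def read_tabular(text):
    """Parse whitespace-separated tabular data into a list of dicts.

    Assumption: the first non-empty line is a header of column names,
    possibly prefixed by '%'. Values are kept as strings.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].lstrip("%").split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


def best_row(rows, objective="obj_fn"):
    """Return the row with the smallest value in the objective column."""
    return min(rows, key=lambda row: float(row[objective]))
```

Reading the contents of rosen_grad_opt.dat into a string and passing it to read_tabular gives one dictionary per function evaluation; best_row then picks the evaluation with the lowest objective value.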

Restarting a failed Dakota run

The example above generates 134 cases in the study. Assume that the process crashes after, say, 20 successful cases (for example because of a full disk). We would then want to restart the Dakota run, making use of the 20 successful function evaluations that already exist. The Dakota restart functionality makes this possible. In this example, you restart the Dakota execution with the command

psm dakota -t rosenbrock --name rb1 --restart 0 --stop_restart 20 rosen_grad_opt.in rb.py

We here explicitly selected the initial restart file (0), although this is the same as the last one generated in this example. Parsim stores successive restart files in the study directory, numbered by an integer run index, with 0 corresponding to the original run. We also explicitly told Dakota to use only the first 20 function evaluations of the restart file; by default it would use all the records it finds.

Note that we define the name of the Dakota input file and the simulation executable for the restart. This is because one may want to modify these, to avoid the problems experienced in the previous run.

Example: Generating cases with the Dakota pre-run functionality

For the optimization problem above, Dakota must select parameter values for each new case based on the result of the previous cases. For other methods, for example random sampling methods or traditional response surface designs, parameter values for all cases can be produced before starting any simulations. This is possible with the Dakota pre-run functionality.
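
The difference is easy to see in a small sketch: for a sampling method, the whole table of parameter values can be written down without evaluating the model a single time. The code below uses Python's own random number generator, so it does not reproduce the sample values Dakota would generate; the function name, sample count and seed are arbitrary choices for illustration.

```python
import random


def uniform_samples(n, bounds, seed):
    """Draw n independent uniform samples for each named parameter.

    bounds maps a parameter name to a (lower, upper) tuple.
    No simulation is run; only the case table is produced.
    """
    rng = random.Random(seed)
    return [
        {name: rng.uniform(lower, upper) for name, (lower, upper) in bounds.items()}
        for _ in range(n)
    ]


if __name__ == "__main__":
    cases = uniform_samples(5, {"x1": (-2.0, 2.0), "x2": (-2.0, 2.0)}, seed=1)
    for case in cases:
        print(case)
```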

Let us assume we want to create a complete study with 200 cases, based on the Dakota random sampling method, using the Dakota input file rosen_sampling.in:

# Dakota Input File: rosen_sampling.in
# Usage:
#   dakota -i rosen_sampling.in -o rosen_sampling.out > rosen_sampling.stdout

environment
  # graphics
  tabular_data
    tabular_data_file = 'rosen_sampling.dat'

method
  sampling
    sample_type random
    samples = 200
    seed = 17
    response_levels = 100.0

model
  single

variables
  uniform_uncertain = 2
    lower_bounds       -2.0  -2.0
    upper_bounds        2.0   2.0
    descriptors         'x1'  'x2'

interface
  analysis_driver = 'psm_dakota_driver'
    fork
    parameters_file = 'params.in'
    results_file = 'results.out'

responses
  response_functions = 1
  no_gradients
  no_hessians

This file is essentially the same as in the Dakota documentation. For consistency, we have modified the interface section to use the Parsim simulation driver, although the driver will never be executed in pre-run mode.

The Parsim study rb2 can now be created in Dakota pre-run mode,

psm dakota -t rosenbrock --name rb2 --pre_run rosen_sampling.in rb.py

With the command-line syntax currently implemented, we have to provide a name of a simulation executable, although it is not used.

Once the study and its cases are created, you interact with it as with any other Parsim study. For example, you would run the actual Rosenbrock “simulation” for all cases with the psm run command,

psm run rb2 rb.py

Since the rb.py script above also outputs the response in the json file results.json, we can use the psm collect command to collect all results into a table:

psm collect -i results.json rb2

By default, the results table is written in space-separated format to the file results.txt in the study directory.

Example: Polynomial Chaos Expansion on the Rosenbrock problem

As an additional example, we apply the Polynomial Chaos Expansion method (PCE) on the Rosenbrock function. This example is taken from Section 5.4.1.1 in the Dakota User’s Manual; the interested reader should read about these methods there, to fully appreciate what is going on.

Again, we modify the interface section of the input file found in the Dakota documentation:

# Dakota Input File: rosen_uq_pce.in

environment
  #graphics

method
  polynomial_chaos
    quadrature_order   = 5
    dimension_preference = 5 3
    samples_on_emulator = 10000
    seed = 12347 rng rnum2
    response_levels = .1 1. 50. 100. 500. 1000.
    variance_based_decomp #interaction_order = 1

variables
  uniform_uncertain = 2
    lower_bounds      = -2.  -2.
    upper_bounds      =  2.   2.
    descriptors       = 'x1' 'x2'

interface
  analysis_driver = 'psm_dakota_driver'
    fork
    parameters_file = 'params.in'
    results_file = 'results.out'

responses
  response_functions = 1
  no_gradients
  no_hessians

We then run Dakota through Parsim, as before,

psm dakota --template rosenbrock --name pce rosen_uq_pce.in rb.py

The results output by Dakota are found in the file dakota.out in the study directory study_pce.
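
As a sanity check on the statistics reported in dakota.out, the mean of the Rosenbrock function over the uniform variables can be computed independently. The sketch below is not a polynomial chaos expansion; it is a plain tensor-product Gauss-Legendre quadrature with three points per variable, which is exact here because the function is a polynomial of degree four in x1 and degree two in x2. The function name mean_uniform is hypothetical.

```python
import math


def rb(x1, x2):
    """Rosenbrock function, as in rb.py."""
    return 100*(x2 - x1**2)**2 + (1 - x1)**2


# 3-point Gauss-Legendre rule on [-1, 1], exact up to polynomial degree 5
NODES = (-math.sqrt(3.0/5.0), 0.0, math.sqrt(3.0/5.0))
WEIGHTS = (5.0/9.0, 8.0/9.0, 5.0/9.0)


def mean_uniform(func, lower, upper):
    """Mean of func(x1, x2) for x1, x2 independent and uniform on [lower, upper]."""
    half = 0.5*(upper - lower)
    mid = 0.5*(upper + lower)
    total = 0.0
    for t1, w1 in zip(NODES, WEIGHTS):
        for t2, w2 in zip(NODES, WEIGHTS):
            total += w1*w2*func(mid + half*t1, mid + half*t2)
    # The weights sum to 2 in each dimension, so divide by 2*2
    return total/4.0


if __name__ == "__main__":
    print(mean_uniform(rb, -2.0, 2.0))
```

The analytical value of this mean is 1367/3, or about 455.67, which follows from E[x^2] = 4/3 and E[x^4] = 16/5 for a variable uniform on [-2, 2]. The mean reported by Dakota for the expansion should be close to this number.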