Design Of Experiments (DOE)¶

Parsim integrates functionality for common Design Of Experiments (DOE) methods, both random sampling schemes and well-known factorial designs.

The implementation of all methods (except Monte Carlo sampling) rely on the pyDOE Python library; for details, see the pyDOE documentation.

The psm doe command is used to create a study, based on the specified DOE scheme and parameter definitions in a parameter file. Some parameters are typically given fixed values in the parameter file, as for an ordinary study. The difference from an ordinary study is that the parameter file also defines a statistical distribution for each “uncertain”, or varying, parameter. The distribution specifications must follow the syntax of distribution in the scipy.stats module. The cases created inside the study will then be sampled or mapped from these distributions according to the chosen DOE scheme.

All schemes implemented in the pyDOE2 package (and possibly others) will eventually be made accessible, but currently only the following schemes can be used:

Monte Carlo random sampling (MC)
Latin Hypercube Sampling (LHS)
Plackett-Burman (fraction factorial designs)
Two-level full factorial design
General full factorial sampling (for more than two levels)
Two-level fractional factorial sampling
Central Composite Design (CCD)
Generalized Subset Design (GSD)

To get help on using the psm doe command, you use the -h option, as usual:

psm doe -h

This also gives a list of the currently implemented DOE schemes. You can get detailed help on syntax and capabilities for each of these by adding the name of the scheme as argument to the -h option. For example, to get help on using the Plackett-Burman scheme,

psm doe -h pb

Example: Tutorial “box” model with full factorial design¶

We use the simplistic “box” model from the tutorial, assuming we want to investigate a full factorial design in the parameters length, width, height and density.

The lower and upper levels to use will be obtained by sampling the statistical distributions defined for the variability of the parameters. These distributions are defined in a parsim parameter file, and must match distribution classes defined in the scipy.stats library. In this example we assume uniform distributions centred around the nominal values in the default parameters file. The scipy.stats.uniform class takes the lower bound and the width as arguments. The parameter file box_uniform.par looks like this:

length:     uniform(10, 4)    # 12 [m]
width:      uniform(3, 2)      # 4 [m]
height:     uniform(1.3, 0.4)  # 1.5 [m]
density:    uniform(950, 100) # 1000 [kg/m3]

Running psm doe -h tells you that the two-level full factorial scheme is called ff2n. Let’s check for additional options:

$ psm doe -h ff2n

Two-level full factorial sampling.

Keyword arguments:
    mapping (str): Selection of method for mapping factor levels to actual values in the distribution.

               'int': (default) Use confidence interval with equal areas around the mean. Width of interval is given by parameter 'beta' (default: 0.9545)
    beta (float): Width of confidence interval for 'int' method (see above). Default: 0.9545 (+/- 2*sigma)

Considering all our distributions are uniform, it would be natural to use the lower and upper bound as our two test levels. This, however, would not work for a normal distribution, or any other unbounded distribution. The default method (and the only one currently implemented) for mapping levels to a given distribution is int, which means that we use a confidence interval of the distribution to define the lower and upper levels. The beta argument defines the width of this interval. If we set beta to 0.999, for example, then 0.01 % of the PDF is outside of this interval; half of it below the lower level, half above the upper level.

We create a study named “box_ff2n” for a two-level full factorial design:

This design creates 16 cases in the study. The actual simulations can now be run as in the tutorial:

Collect is also done as before: