NLopt Tutorial

In this tutorial, we illustrate the usage of NLopt in various languages via one or two trivial examples.

Example nonlinearly constrained problem

[Figure: Feasible region for a simple example optimization problem with two nonlinear (cubic) constraints.]

As a first example, we'll look at the nonlinearly constrained minimization problem:

\min_{\mathbf{x}\in\mathbb{R}^2} \sqrt{x_2}
\text{subject to } x_2 \geq 0, \quad x_2 \geq (a_1 x_1 + b_1)^3, \quad x_2 \geq (a_2 x_1 + b_2)^3

for parameters a1=2, b1=0, a2=-1, b2=1.

The feasible region defined by these constraints is plotted at right: x2 is constrained to lie above the maximum of two cubics, and the optimum point is located at the intersection (1/3, 8/27) where the objective function takes on the value \sqrt{8/27} \approx 0.5443310539518\ldots.

(This problem is especially trivial, because by formulating it in terms of the cube root of x2 you can turn it into a linear-programming problem, but we won't do that here.)
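
To sketch that transformation: substituting y = x_2^{1/3}, and noting that both the cube and the square root are monotonically increasing, the problem becomes a linear program in (x_1, y):

\min_{x_1, y} \; y
\text{subject to } y \geq 0, \quad y \geq a_1 x_1 + b_1, \quad y \geq a_2 x_1 + b_2

since x_2 \geq (a_i x_1 + b_i)^3 is equivalent to y \geq a_i x_1 + b_i, and minimizing \sqrt{x_2} = y^{3/2} over y \geq 0 is the same as minimizing y.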

In principle, we don't need the bound constraint x2≥0, since the nonlinear constraints already imply a positive-x2 feasible region. However, NLopt doesn't guarantee that, on the way to finding the optimum, it won't violate the nonlinear constraints at some intermediate steps, while it does guarantee that all intermediate steps will satisfy the bound constraints. So, we will explicitly impose x2≥0 in order to ensure that the √x2 in our objective is real.

Example in C/C++

To implement the above example in C or C++, we would first do:

#include <math.h>
#include <nlopt.h>

to include the NLopt header file as well as the standard math header file (needed for things like the sqrt function and the HUGE_VAL constant), then we would define our objective function as:

double myfunc(int n, const double *x, double *grad, void *my_func_data)
{
    if (grad) {
        grad[0] = 0.0;
        grad[1] = 0.5 / sqrt(x[1]);
    }
    return sqrt(x[1]);
}

There are several things to notice here. First, since this is C, our indices are zero-based, so we have x[0] and x[1] instead of x1 and x2. The return value of our function is the objective \sqrt{x_2}. Also, if the parameter grad is not NULL, then we set grad[0] and grad[1] to the partial derivatives of our objective with respect to x[0] and x[1]. The gradient is only needed for gradient-based algorithms; if you use a derivative-free optimization algorithm, grad will always be NULL and you need never compute any derivatives. Finally, we have an extra parameter my_func_data that can be used to pass additional data to myfunc, but no additional data is needed here so that parameter is unused.

For the constraints, on the other hand, we will have additional data. Each constraint is parameterized by two numbers a and b, so we will declare a data structure to hold this information:

typedef struct {
    double a, b;
} my_constraint_data;

Then, we implement our constraint function as follows. (You always create a single constraint function, even if you have several very different constraints—your constraint function can do different things depending on what is passed as the void *data parameter.)

double myconstraint(int n, const double *x, double *grad, void *data)
{
    my_constraint_data *d = (my_constraint_data *) data;
    double a = d->a, b = d->b;
    if (grad) {
        grad[0] = 3 * a * (a*x[0] + b) * (a*x[0] + b);
        grad[1] = -1.0;
    }
    return ((a*x[0] + b) * (a*x[0] + b) * (a*x[0] + b) - x[1]);
}

The form of the constraint function is the same as that of the objective function. Here, the data parameter will actually be a pointer to my_constraint_data (because this is the type that we will pass to nlopt_minimize_constrained below), so we use a typecast to get the constraint data. NLopt always expects constraints to be of the form myconstraint(x) ≤ 0, so we implement the constraint x2 ≥ (a x1 + b)^3 as the function (a x1 + b)^3 − x2. Again, we only compute the gradient if grad is non-NULL, which will never occur if we use a derivative-free optimization algorithm.
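
In other words, the constraint is simply rearranged so that zero appears on the right-hand side:

x_2 \geq (a x_1 + b)^3 \quad\Longleftrightarrow\quad (a x_1 + b)^3 - x_2 \leq 0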

Finally, we need to call nlopt_minimize_constrained from our program somewhere. We will, of course, need to declare arrays to hold the solution x, the upper and lower bounds, and the constraint data. This will look something like:

my_constraint_data data[2] = { {2,0}, {-1,1} };
double x[2] = { 1.234, 5.678 };  /* some initial guess */
double lb[2] = { -HUGE_VAL, 0 }, ub[2] = { HUGE_VAL, HUGE_VAL }; /* lower and upper bounds */
double minf; /* the minimum objective value, upon return */

if (nlopt_minimize_constrained(NLOPT_LD_MMA, 2, myfunc, NULL,
                               2, myconstraint, data, sizeof(my_constraint_data),
                               lb, ub, x, &minf,
                               -HUGE_VAL, 0.0, 0.0, 1e-4, NULL, 0, 0.0) < 0) {
    printf("nlopt failed!\n");
}
else {
    printf("found minimum at f(%g,%g) = %g\n", x[0], x[1], minf);
}

The arguments to nlopt_minimize_constrained are described in detail by the reference manual (on Unix, you can also run man nlopt_minimize_constrained). Most of them are straightforward. Notice that we pass data as the data for myconstraint: when myconstraint is called, it will be passed pointers to data[0] and data[1] for the two constraints. The last set of parameters (-HUGE_VAL, 0.0, 0.0, 1e-4, NULL, 0, 0.0) are termination criteria; the only one we are using is the relative tolerance on x, which we set to 10^-4, and the other conditions are set to defaults where they do nothing. Also notice that we set the lower bound on x2 to 0, but set the remaining lower and upper bounds to ±∞ (±HUGE_VAL), i.e. unbounded.

nlopt_minimize_constrained will return a negative result code on failure, but this usually only happens if you pass invalid parameters, it runs out of memory, or something like that. (However, if it returns the failure code NLOPT_ROUNDOFF_LIMITED, indicating a breakdown due to roundoff errors, the minimum found may still be useful and you may want to still use it.) Otherwise, we print out the minimum function value and the corresponding parameters x.
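
For example, if we wanted to accept a roundoff-limited result instead of discarding it, the error check could be written something like the sketch below (it assumes the nlopt_result return type and the NLOPT_ROUNDOFF_LIMITED code declared in nlopt.h; whether such a point is actually good enough depends on your problem):

nlopt_result ret = nlopt_minimize_constrained(NLOPT_LD_MMA, 2, myfunc, NULL,
                                              2, myconstraint, data, sizeof(my_constraint_data),
                                              lb, ub, x, &minf,
                                              -HUGE_VAL, 0.0, 0.0, 1e-4, NULL, 0, 0.0);
if (ret < 0 && ret != NLOPT_ROUNDOFF_LIMITED) {
    printf("nlopt failed!\n");   /* a genuine failure (bad arguments, out of memory, ...) */
}
else {
    /* success, or a roundoff breakdown whose best point may still be usable */
    printf("found minimum at f(%g,%g) = %g\n", x[0], x[1], minf);
}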

Assuming we save this in a file tutorial.c, we would compile and link (on Unix) with:

cc tutorial.c -o tutorial -lnlopt -lm

The result of running the program should then be something like:

found minimum at f(0.3333333313061715,0.2962962926308738) = 0.5443310505849118

That is, it found the correct parameters and minimum function value to about 8 significant digits. (This is better accuracy than we specified; it often happens because local optimization routines are usually conservative in estimating the error.)

Number of evaluations

Let's modify our program to print out the number of function evaluations that were required to obtain this result. First, we'll change our objective function to:

int count = 0;
double myfunc(int n, const double *x, double *grad, void *my_func_data)
{
    ++count;
    if (grad) {
        grad[0] = 0.0;
        grad[1] = 0.5 / sqrt(x[1]);
    }
    return sqrt(x[1]);
}

using a global variable count that is incremented for each function evaluation. (We could also pass a pointer to a counter variable as my_func_data, if we wanted to avoid global variables; a sketch of that variant appears at the end of this subsection.) Then, changing our printf statement to:

printf("found minimum after %d evaluations at f(%g,%g) = %g\n", count, x[0], x[1], minf);

we obtain:

found minimum after 11 evaluations at f(0.3333333313061715,0.2962962926308738) = 0.5443310505849118

For such a simple problem, a gradient-based local optimization algorithm like MMA can converge very quickly!
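
As an aside, here is a minimal sketch of the global-free variant mentioned above, in which the counter is passed through the my_func_data pointer instead (the int cast and the &count argument are our own illustration, not part of the tutorial program):

double myfunc(int n, const double *x, double *grad, void *my_func_data)
{
    int *count = (int *) my_func_data;  /* counter supplied by the caller */
    ++*count;
    if (grad) {
        grad[0] = 0.0;
        grad[1] = 0.5 / sqrt(x[1]);
    }
    return sqrt(x[1]);
}

int count = 0;
if (nlopt_minimize_constrained(NLOPT_LD_MMA, 2, myfunc, &count,
                               2, myconstraint, data, sizeof(my_constraint_data),
                               lb, ub, x, &minf,
                               -HUGE_VAL, 0.0, 0.0, 1e-4, NULL, 0, 0.0) < 0)
    printf("nlopt failed!\n");
else
    printf("found minimum after %d evaluations at f(%g,%g) = %g\n", count, x[0], x[1], minf);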

Switching to a derivative-free algorithm

We can also try a derivative-free algorithm. Looking at the NLopt Algorithms list, another algorithm in NLopt that handles nonlinear constraints is COBYLA, which is derivative-free. To use it, we just change NLOPT_LD_MMA ("LD" means local optimization, derivative/gradient-based) into NLOPT_LN_COBYLA ("LN" means local optimization, no derivatives), and obtain:

found minimum after 22 evaluations at f(0.3333328219501758,0.2962976600664823) = 0.5443323066532817

In such a low-dimensional problem, derivative-free algorithms usually work quite well; in this case, COBYLA only doubles the number of function evaluations. However, the comparison is not perfect because, for the same relative x tolerance of 10^-4, COBYLA is a bit less conservative and only finds the solution to 5 significant digits.

To do a fairer comparison of the two algorithms, we could set the x tolerance to zero and ask how many function evaluations each one requires to get the correct answer to three decimal places. We can specify this by using the minf_max termination criterion, which allows us to halt the process as soon as a feasible point attains an objective function value less than minf_max. In this case, we would set minf_max to \sqrt{8/27}+10^{-3} (instead of -HUGE_VAL before), corresponding to the last line of arguments to nlopt_minimize_constrained being sqrt(8./27.)+1e-3, 0.0, 0.0, 0.0, NULL, 0, 0.0. If we do this, we find that COBYLA requires 20 evaluations while MMA requires 10.
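
Concretely, the modified call would look something like the following sketch, where the only change from the earlier call is the last line of arguments:

if (nlopt_minimize_constrained(NLOPT_LD_MMA, 2, myfunc, NULL,
                               2, myconstraint, data, sizeof(my_constraint_data),
                               lb, ub, x, &minf,
                               sqrt(8./27.)+1e-3, 0.0, 0.0, 0.0, NULL, 0, 0.0) < 0) {
    printf("nlopt failed!\n");
}
else {
    printf("found minimum after %d evaluations at f(%g,%g) = %g\n", count, x[0], x[1], minf);
}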

The advantage of gradient-based algorithms over derivative-free algorithms typically grows for higher-dimensional problems. On the other hand, derivative-free algorithms are much easier to use because you don't need to worry about how to compute the gradient (which might be tricky if the function is very complicated).

Example in Matlab or GNU Octave

To implement this objective function in Matlab (or GNU Octave), we would write a file myfunc.m that looks like:

function [val, gradient] = myfunc(x)
    val = sqrt(x(2));
    if (nargout > 1)
        gradient = [0, 0.5 / val];
    end

Notice that we check the Matlab builtin variable nargout (the number of output arguments) to decide whether to compute the gradient. If we use a derivative-free optimization algorithm below, then nargout will always be 1 and the gradient need never be computed.

Our constraint function looks similar, except that it is parameterized by the coefficients a and b. We can just add these on as extra parameters, in a file myconstraint.m:


function [val, gradient] = myconstraint(x,a,b)
    val = (a*x(1) + b)^3 - x(2);
    if (nargout > 1)
        gradient = [ 3*a*(a*x(1) + b)^2, -1];
    end

Now, in Matlab, before we call nlopt_minimize_constrained, we should set our termination condition. As above, we'll use a relative x tolerance of 10^-4. To do this, we use a Matlab structure:

stop.xtol_rel = 1e-4;

We can also add other fields to the structure, corresponding to other termination conditions, but we don't need to—any fields that we omit are ignored.

Finally, we call nlopt_minimize_constrained (see also the reference manual, or type help nlopt_minimize_constrained in Matlab):

[xopt, fmin, retcode] = nlopt_minimize_constrained(NLOPT_LD_MMA, @myfunc, {}, {@myconstraint, @myconstraint}, {{2,0}, {-1,1}}, [-inf,0],[inf,inf], [1.234 5.678], stop)

NLOPT_LD_MMA is the optimization algorithm we are using, a gradient-based method that supports nonlinear constraints. @myfunc is a function handle, a reference to our myfunc function to say that this is our objective. The {} is a cell array of additional arguments needed for myfunc, which in this case is none. We have two constraints, so the next argument is a cell array of our constraint functions {@myconstraint, @myconstraint} (in this case, they are both the same function). Each of these needs two extra arguments (a, b), so we pass the extra arguments as a cell array {{2,0}, {-1,1}} of two cell arrays {2,0} and {-1,1} of the additional arguments to pass to myconstraint. [-inf,0] are the lower bounds on x, and [inf,inf] are the upper bounds—i.e., x is unbounded except for x2≥0. [1.234 5.678] is a starting guess for the local optimization. Finally, stop is our structure of termination conditions.

nlopt_minimize_constrained returns three things: xopt, the optimal parameters found; fmin, the corresponding value of the objective function; and retcode, a return code (positive on success and negative on failure).

The output of the above command is:

xopt =
 0.33333  0.29630
fmin = 0.54433
retcode = 4

Just as in the C example, it found the correct minimum (to about 8 significant digits). To switch to a derivative-free algorithm like COBYLA, we just change the first argument:


[xopt, fmin, retcode] = nlopt_minimize_constrained(NLOPT_LN_COBYLA, @myfunc, {}, {@myconstraint, @myconstraint}, {{2,0}, {-1,1}}, [-inf,0],[inf,inf], [1.234 5.678], stop)

and obtain the same result (albeit with only about five significant digits of accuracy if we look closely; the COBYLA algorithm interprets the x tolerance a little less conservatively than MMA, and converges more slowly).

Matlab verbose output

It is often useful to print out some status messages to see what is happening, especially if each function evaluation is slow or if a large number of evaluations are required (e.g. for global optimization). You can, of course, modify your function to print out whatever you want. As a shortcut, however, you can set a verbose option in NLopt's Matlab interface by:

stop.verbose = 1;

If we do this, then running the MMA algorithm as above yields:

nlopt_minimize_constrained eval #1: 2.38286
nlopt_minimize_constrained eval #2: 2.35613
nlopt_minimize_constrained eval #3: 2.24586
nlopt_minimize_constrained eval #4: 2.0191
nlopt_minimize_constrained eval #5: 1.74093
nlopt_minimize_constrained eval #6: 1.40421
nlopt_minimize_constrained eval #7: 1.0223
nlopt_minimize_constrained eval #8: 0.685203
nlopt_minimize_constrained eval #9: 0.552985
nlopt_minimize_constrained eval #10: 0.544354
nlopt_minimize_constrained eval #11: 0.544331

This shows the objective function values at each intermediate step of the optimization. As in the C example above, it converges in 11 steps. The COBYLA algorithm requires a few more iterations, because it doesn't exploit the gradient information:

nlopt_minimize_constrained eval #1: 2.38286
nlopt_minimize_constrained eval #2: 2.38286
nlopt_minimize_constrained eval #3: 3.36987
nlopt_minimize_constrained eval #4: 1.04133
nlopt_minimize_constrained eval #5: 1.03684
nlopt_minimize_constrained eval #6: 0.655582
nlopt_minimize_constrained eval #7: 0
nlopt_minimize_constrained eval #8: 0.539064
nlopt_minimize_constrained eval #9: 0.301109
nlopt_minimize_constrained eval #10: 0.253323
nlopt_minimize_constrained eval #11: 0
nlopt_minimize_constrained eval #12: 0.182837
nlopt_minimize_constrained eval #13: 0.452186
nlopt_minimize_constrained eval #14: 0.525476
nlopt_minimize_constrained eval #15: 0.507997
nlopt_minimize_constrained eval #16: 0.543067
nlopt_minimize_constrained eval #17: 0.538789
nlopt_minimize_constrained eval #18: 0.545382
nlopt_minimize_constrained eval #19: 0.545912
nlopt_minimize_constrained eval #20: 0.544332
nlopt_minimize_constrained eval #21: 0.544209
nlopt_minimize_constrained eval #22: 0.544103

Notice that some of the objective function values are below the minimum of 0.54433 — these are simply values of the objective function at infeasible points (violating the nonlinear constraints).
