NLopt Tutorial

In this tutorial, we illustrate the usage of NLopt in various languages via one or two trivial examples.

Example nonlinearly constrained problem

(Figure: feasible region for a simple example optimization problem with two nonlinear (cubic) constraints.)

As a first example, we'll look at the nonlinearly constrained minimization problem:

\min_{\mathbf{x}\in\mathbb{R}^2} \sqrt{x_2}
subject to x_2 \geq 0, x_2 \geq (a_1 x_1 + b_1)^3, and x_2 \geq (a_2 x_1 + b_2)^3

for parameters a1=2, b1=0, a2=-1, b2=1.

The feasible region defined by these constraints is plotted at right: x2 is constrained to lie above the maximum of two cubics, and the optimum point is located at the intersection (1/3, 8/27) where the objective function takes on the value \sqrt{8/27} \approx 0.5443310539518\ldots.
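To see where this point comes from (a quick check, not spelled out in the original text): the objective depends only on x2 and is increasing in it, so we want the smallest feasible x2, which lies on the lower boundary max((2x1)^3, (1 − x1)^3). The first cubic is increasing in x1 and the second is decreasing, so their maximum is minimized where the two cross:

(2 x_1)^3 = (1 - x_1)^3 \;\Longrightarrow\; 2 x_1 = 1 - x_1 \;\Longrightarrow\; x_1 = 1/3, \quad x_2 = (2/3)^3 = 8/27 .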

(This problem is especially trivial, because by formulating it in terms of the cube root of x2 you can turn it into a linear-programming problem, but we won't do that here.)

In principle, we don't need the bound constraint x2≥0, since the nonlinear constraints already imply a positive-x2 feasible region. However, NLopt doesn't guarantee that, on the way to finding the optimum, it won't violate the nonlinear constraints at some intermediate steps, while it does guarantee that all intermediate steps will satisfy the bound constraints. So, we will explicitly impose x2≥0 in order to ensure that the √x2 in our objective is real.

Example in C/C++

To implement the above example in C or C++, we would first do:

#include <math.h>
#include <nlopt.h>

to include the NLopt header file as well as the standard math header file (needed for things like the sqrt function and the HUGE_VAL constant), then we would define our objective function as:

double myfunc(int n, const double *x, double *grad, void *my_func_data)
{
    if (grad) {
        grad[0] = 0.0;
        grad[1] = 0.5 / sqrt(x[1]);
    }
    return sqrt(x[1]);
}

There are several things to notice here. First, since this is C, our indices are zero-based, so we have x[0] and x[1] instead of x1 and x2. The return value of our function is the objective \sqrt{x_2}. Also, if the parameter grad is not NULL, then we set grad[0] and grad[1] to the partial derivatives of our objective with respect to x[0] and x[1]. The gradient is only needed for gradient-based algorithms; if you use a derivative-free optimization algorithm, grad will always be NULL and you need never compute any derivatives. Finally, we have an extra parameter my_func_data that can be used to pass additional data to myfunc, but no additional data is needed here so that parameter is unused.

For the constraints, on the other hand, we will have additional data. Each constraint is parameterized by two numbers a and b, so we will declare a data structure to hold this information:

typedef struct {
    double a, b;
} my_constraint_data;

Then, we implement our constraint function as follows. (You always create a single constraint function, even if you have several very different constraints—your constraint function can do different things depending on what is passed as the void *data parameter.)

double myconstraint(int n, const double *x, double *grad, void *data)
{
    my_constraint_data *d = (my_constraint_data *) data;
    double a = d->a, b = d->b;
    if (grad) {
        grad[0] = 3 * a * (a*x[0] + b) * (a*x[0] + b);
        grad[1] = -1.0;
    }
    return ((a*x[0] + b) * (a*x[0] + b) * (a*x[0] + b) - x[1]);
}

The form of the constraint function is the same as that of the objective function. Here, the data parameter will actually be a pointer to my_constraint_data (because this is the type that we will pass to nlopt_minimize_constrained below), so we use a typecast to get the constraint data. NLopt always expects constraints to be of the form myconstraint(x) ≤ 0, so we implement the constraint x2 ≥ (a x1 + b)^3 as the function (a x1 + b)^3 − x2. Again, we only compute the gradient if grad is non-NULL, which will never occur if we use a derivative-free optimization algorithm.
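In other words, we simply rearrange the inequality so that zero is on the right-hand side,

x_2 \geq (a x_1 + b)^3 \quad\Longleftrightarrow\quad (a x_1 + b)^3 - x_2 \leq 0 ,

and return the left-hand side of the second form.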

Finally, we need to call nlopt_minimize_constrained from our program somewhere. We will, of course, need to declare arrays to hold the solution x, the upper and lower bounds, and the constraint data. This will look something like:

my_constraint_data data[2] = { {2,0}, {-1,1} };
double x[2] = { 1.234, 5.678 };  /* some initial guess */
double lb[2] = { -HUGE_VAL, 0 }, ub[2] = { HUGE_VAL, HUGE_VAL }; /* lower and upper bounds */
double minf; /* the minimum objective value, upon return */

if (nlopt_minimize_constrained(NLOPT_LD_MMA, 2, myfunc, NULL,
                               2, myconstraint, data, sizeof(my_constraint_data),
                               lb, ub, x, &minf,
                               -HUGE_VAL, 0.0, 0.0, 1e-4, NULL, 0, 0.0) < 0) {
    printf("nlopt failed!\n");
}
else {
    printf("found minimum at f(%g,%g) = %g\n", x[0], x[1], minf);
}

The arguments to nlopt_minimize_constrained are described in detail in the reference manual, and most of them are straightforward. Notice that we pass data as the data for myconstraint: when myconstraint is called, it will be passed pointers to data[0] and data[1] for the two constraints. The last set of parameters (-HUGE_VAL, 0.0, 0.0, 1e-4, NULL, 0, 0.0) are termination criteria; the only one we are actually using is the relative tolerance on x, set to 10−4, while the other conditions are left at defaults where they do nothing. Also notice that we set the lower bound on x2 to 0, but set the remaining bounds (the bounds on x1 and the upper bound on x2) to ±∞ (±HUGE_VAL), i.e. unbounded.

nlopt_minimize_constrained will return a negative result code on failure, but this usually only happens if you pass invalid parameters, it runs out of memory, or something like that. Otherwise, we print out the minimum function value and the corresponding parameters x.
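If we wanted to report why a failure occurred, we could inspect the return value more closely. The sketch below stores it in an nlopt_result (the return type declared in nlopt.h) and checks a couple of its negative codes; treat the particular codes checked here as illustrative rather than exhaustive:

nlopt_result result = nlopt_minimize_constrained(NLOPT_LD_MMA, 2, myfunc, NULL,
                               2, myconstraint, data, sizeof(my_constraint_data),
                               lb, ub, x, &minf,
                               -HUGE_VAL, 0.0, 0.0, 1e-4, NULL, 0, 0.0);
if (result == NLOPT_INVALID_ARGS)
    printf("nlopt failed: invalid arguments (e.g. inconsistent bounds)\n");
else if (result == NLOPT_OUT_OF_MEMORY)
    printf("nlopt failed: out of memory\n");
else if (result < 0)
    printf("nlopt failed with code %d\n", (int) result);
else
    printf("found minimum at f(%g,%g) = %g\n", x[0], x[1], minf);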

Assuming we save this in a file tutorial.c, we would compile and link (on Unix) with:

cc tutorial.c -o tutorial -lnlopt -lm

The result of running the program should then be something like:

found minimum at f(0.3333333313061715,0.2962962926308738) = 0.5443310505849118

That is, it found the correct parameters and minimum function value to about 8 significant digits. (This is better than we specified; this often occurs because the local optimization routines usually try to be conservative in estimating the error.)

Number of evaluations

Let's modify our program to print out the number of function evaluations that were required to obtain this result. First, we'll change our objective function to:

int count = 0;
double myfunc(int n, const double *x, double *grad, void *my_func_data)
{
    ++count;
    if (grad) {
        grad[0] = 0.0;
        grad[1] = 0.5 / sqrt(x[1]);
    }
    return sqrt(x[1]);
}

using a global variable count that is incremented for each function evaluation. (We could also pass a pointer to a counter variable as my_func_data, if we wanted to avoid global variables.) Then, changing our printf statement to:

printf("found minimum after %d evaluations at f(%g,%g) = %g\n", count, x[0], x[1], minf);

we obtain:

found minimum after 11 evaluations at f(0.3333333313061715,0.2962962926308738) = 0.5443310505849118

For such a simple problem, a gradient-based local optimization algorithm like MMA can converge very quickly!
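Incidentally, if we wanted to avoid the global count variable, we could instead pass a pointer to a counter via my_func_data, as suggested above. A minimal sketch of that variant (not part of the original example):

double myfunc(int n, const double *x, double *grad, void *my_func_data)
{
    int *count = (int *) my_func_data;  /* counter supplied by the caller */
    ++*count;
    if (grad) {
        grad[0] = 0.0;
        grad[1] = 0.5 / sqrt(x[1]);
    }
    return sqrt(x[1]);
}

The caller would then declare int count = 0; and pass &count in place of the NULL f_data argument in the nlopt_minimize_constrained call above.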

Switching to a derivative-free algorithm

We can also try a derivative-free algorithm. Looking at the NLopt Algorithms list, the only other algorithm in NLopt that handles nonlinear constraints is COBYLA, which is derivative-free. To use it, we just change NLOPT_LD_MMA ("LD" means local optimization, derivative/gradient-based) into NLOPT_LN_COBYLA ("LN" means local optimization, no derivatives), and obtain:

found minimum after 22 evaluations at f(0.3333328219501758,0.2962976600664823) = 0.5443323066532817

In such a low-dimensional problem, derivative-free algorithms usually work quite well—in this case, it only doubles the number of function evaluations. However, the comparison is not perfect because, for the same relative x tolerance of 10−4, COBYLA is a bit less conservative and only finds the solution to 5 significant digits.

To do a fairer comparison of the two algorithms, we could set the x tolerance to zero and ask how many function evaluations each one requires to get the correct answer to three decimal places. We can specify this by using the minf_max termination criterion, which allows us to halt the process as soon as a feasible point attains an objective function value less than minf_max. In this case, we would set minf_max to \sqrt{8/27}+10^{-3} (instead of -HUGE_VAL before), corresponding to the last line of arguments to nlopt_minimize_constrained being sqrt(8./27.)+1e-3, 0.0, 0.0, 0.0, NULL, 0, 0.0. If we do this, we find that COBYLA requires 20 evaluations while MMA requires 10.
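For concreteness, only the final line of termination arguments in the C call above changes; the modified call looks like:

if (nlopt_minimize_constrained(NLOPT_LD_MMA, 2, myfunc, NULL,
                               2, myconstraint, data, sizeof(my_constraint_data),
                               lb, ub, x, &minf,
                               sqrt(8./27.)+1e-3, 0.0, 0.0, 0.0, NULL, 0, 0.0) < 0) {
    printf("nlopt failed!\n");
}

with NLOPT_LN_COBYLA substituted for NLOPT_LD_MMA for the derivative-free run.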

The advantage of gradient-based algorithms over derivative-free algorithms typically grows for higher-dimensional problems. On the other hand, derivative-free algorithms are much easier to use because you don't need to worry about how to compute the gradient (which might be tricky if the function is very complicated).

Example in Matlab or GNU Octave

To implement this objective function in Matlab (or GNU Octave), we would write a file myfunc.m that looks like:

function [val, gradient] = myfunc(x)
    val = sqrt(x(2));
    if (nargout > 1)
        gradient = [0, 0.5 / val];
    end

Notice that we check the Matlab builtin variable nargout (the number of output arguments) to decide whether to compute the gradient. If we use a derivative-free optimization algorithm below, then nargout will always be 1 and the gradient need never be computed.

Our constraint function looks similar, except that it is parameterized by the coefficients a and b. We can just add these on as extra parameters, in a file myconstraint.m:


function [val, gradient] = myconstraint(x,a,b)
    val = (a*x(1) + b)^3 - x(2);
    if (nargout > 1)
        gradient = [3*a*(a*x(1) + b)^2, -1];
    end
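As a quick sanity check (a hypothetical usage example, not from the tutorial), note how nargout controls whether the gradient branch runs:

[v, g] = myconstraint([1.234 5.678], 2, 0)   % two outputs requested: gradient is computed
v = myconstraint([1.234 5.678], 2, 0)        % one output: the gradient branch is skipped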

Now, in Matlab, before we call nlopt_minimize_constrained, we should set our termination condition. As above, we'll use a relative x tolerance of 10−4. To do this, we use a Matlab structure:

stop.xtol_rel = 1e-4;

We can also add other fields to the structure, corresponding to the other termination conditions described in the NLopt Introduction, but we don't need to: any fields that we omit are ignored.
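For example, to also cap the number of function evaluations we might write the following; note that the field name maxeval is an assumption here, chosen to mirror the corresponding termination parameter in the C interface rather than taken from the text above:

stop.xtol_rel = 1e-4;
stop.maxeval = 100;   % assumed field name, mirroring the C maxeval termination parameter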
