Optimization

java.lang.Object
- keel.Algorithms.Statistical_Classifiers.Logistic.core.Optimization

All Implemented Interfaces:

TechnicalInformationHandler
```
public abstract class Optimization
extends java.lang.Object
implements TechnicalInformationHandler
```
Implementation of Active-sets method with BFGS update to solve optimization problem with only bounds constraints in multi-dimensions. In this implementation we consider both the lower and higher bound constraints.
Here is the sketch of our searching strategy, and the detailed description of the algorithm can be found in the Appendix of Xin Xu's MSc thesis:
Initialize everything, incl. initial value, direction, etc.
LOOP (main algorithm):
1. Perform the line search using the directions for free variables
1.1 Check all the bounds that are not "active" (i.e. binding variables) and compute the feasible step length to the bound for each of them
1.2 Pick up the least feasible step length, say \alpha, and set it as the upper bound of the current step length, i.e. 0<\lambda<=\alpha
1.3 Search for any possible step length<=\alpha that can result the "sufficient function decrease" (\alpha condition) AND "positive definite inverse Hessian" (\beta condition), if possible, using SAFEGUARDED polynomial interpolation. This step length is "safe" and thus is used to compute the next value of the free variables .
1.4 Fix the variable(s) that are newly bound to its constraint(s).
2. Check whether there is convergence of all variables or their gradients. If there is, check the possibilities to release any current bindings of the fixed variables to their bounds based on the "reliable" second-order Lagarange multipliers if available. If it's available and negative for one variable, then release it. If not available, use first-order Lagarange multiplier to test release. If there is any released variables, STOP the loop. Otherwise update the inverse of Hessian matrix and gradient for the newly released variables and CONTINUE LOOP.
3. Use BFGS formula to update the inverse of Hessian matrix. Note the already-fixed variables must have zeros in the corresponding entries in the inverse Hessian.
4. Compute the new (newton) search direction d=H^{-1}*g, where H^{-1} is the inverse Hessian and g is the Jacobian. Note that again, the already- fixed variables will have zero direction.
ENDLOOP
A typical usage of this class is to create your own subclass of this class and provide the objective function and gradients as follows:
```
 class MyOpt extends Optimization {
   // Provide the objective function
   protected double objectiveFunction(double[] x) {
       // How to calculate your objective function...
       // ...
   }

   // Provide the first derivatives
   protected double[] evaluateGradient(double[] x) {
       // How to calculate the gradient of the objective function...
       // ...
   }

   // If possible, provide the index^{th} row of the Hessian matrix
   protected double[] evaluateHessian(double[] x, int index) {
      // How to calculate the index^th variable's second derivative
      // ... 
   }
 }
 
```
When it's the time to use it, in some routine(s) of other class...
```
 MyOpt opt = new MyOpt();
 
 // Set up initial variable values and bound constraints
 double[] x = new double[numVariables];
 // Lower and upper bounds: 1st row is lower bounds, 2nd is upper
 double[] constraints = new double[2][numVariables];
 ...

 // Find the minimum, 200 iterations as default
 x = opt.findArgmin(x, constraints); 
 while(x == null){  // 200 iterations are not enough
    x = opt.getVarbValues();  // Try another 200 iterations
    x = opt.findArgmin(x, constraints);
 }

 // The minimal function value
 double minFunction = opt.getMinFunction();
 ...
 
```
It is recommended that Hessian values be provided so that the second-order Lagrangian multiplier estimate can be calcluated. However, if it is not provided, there is no need to override the evaluateHessian() function.
REFERENCES (see also the getTechnicalInformation() method):
The whole model algorithm is adapted from Chapter 5 and other related chapters in Gill, Murray and Wright(1981) "Practical Optimization", Academic Press. and Gill and Murray(1976) "Minimization Subject to Bounds on the Variables", NPL Report NAC72, while Chong and Zak(1996) "An Introduction to Optimization", John Wiley & Sons, Inc. provides us a brief but helpful introduction to the method.
Dennis and Schnabel(1983) "Numerical Methods for Unconstrained Optimization and Nonlinear Equations", Prentice-Hall Inc. and Press et al.(1992) "Numeric Recipe in C", Second Edition, Cambridge University Press. are consulted for the polynomial interpolation used in the line search implementation.
The Hessian modification in BFGS update uses Cholesky factorization and two rank-one modifications:
Bk+1 = Bk + (Gk*Gk')/(Gk'Dk) + (dGk*(dGk)'))/[alpha*(dGk)'*Dk].
where Gk is the gradient vector, Dk is the direction vector and alpha is the step rate.
This method is due to Gill, Golub, Murray and Saunders(1974) ``Methods for Modifying Matrix Factorizations'', Mathematics of Computation, Vol.28, No.126, pp 505-535.
Version:

$Revision: 1.1 $

Author:

Xin Xu (xx5@cs.waikato.ac.nz)

See Also:

getTechnicalInformation()

Field Summary

Fields
Modifier and Type	Field and Description
`protected double`	`m_ALF` Alpha value.
`protected double`	`m_BETA` Beta value.
`protected static boolean`	`m_Debug` Debug flag.
`protected static double`	`m_Epsilon` Compute machine precision
`protected double`	`m_f` function value
`protected int`	`m_MAXITS` Maximun iterations.
`protected double`	`m_STPMX` STPMX value.
`protected double`	`m_TOLX` ToleranceX value.
`protected static double`	`m_Zero` Compute machine precision

Constructor Summary

Constructors
Constructor and Description

Optimization()

Constructors
Constructor and Description
`Optimization()`

Method Summary

All Methods Static Methods Instance Methods Abstract Methods Concrete Methods
Modifier and Type	Method and Description
`protected abstract double[]`	`evaluateGradient(double[] x)` Subclass should implement this procedure to evaluate gradient of the objective function
`protected double[]`	`evaluateHessian(double[] x, int index)` Subclass is recommended to override this procedure to evaluate second-order gradient of the objective function.
`double[]`	`findArgmin(double[] initX, double[][] constraints)` Main algorithm.
`double`	`getMinFunction()` Get the minimal function value
`TechnicalInformation`	`getTechnicalInformation()` Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
`double[]`	`getVarbValues()` Get the variable values.
`double[]`	`lnsrch(double[] xold, double[] gradient, double[] direct, double stpmax, boolean[] isFixed, double[][] nwsBounds, keel.Algorithms.Statistical_Classifiers.Logistic.core.Optimization.DynamicIntArray wsBdsIndx)` Find a new point x in the direction p from a point xold at which the value of the function has decreased sufficiently, the positive definiteness of B matrix (approximation of the inverse of the Hessian) is preserved and no bound constraints are violated.
`protected abstract double`	`objectiveFunction(double[] x)` Subclass should implement this procedure to evaluate objective function to be minimized
`void`	`setDebug(boolean db)` Set whether in debug mode
`void`	`setMaxIteration(int it)` Set the maximal number of iterations in searching (Default 200)
`static double[]`	`solveTriangle(Matrix t, double[] b, boolean isLower, boolean[] isZero)` Solve the linear equation of TX=B where T is a triangle matrix It can be solved using back/forward substitution, with O(N^2) complexity
`protected void`	`updateCholeskyFactor(Matrix L, double[] D, double[] v, double coeff, boolean[] isFixed)` One rank update of the Cholesky factorization of B matrix in BFGS updates, i.e.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Field Detail
  - m_ALF
```
protected double m_ALF
```
    Alpha value.
  - m_BETA
```
protected double m_BETA
```
    Beta value.
  - m_TOLX
```
protected double m_TOLX
```
    ToleranceX value.
  - m_STPMX
```
protected double m_STPMX
```
    STPMX value.
  - m_MAXITS
```
protected int m_MAXITS
```
    Maximun iterations.
  - m_Debug
```
protected static boolean m_Debug
```
    Debug flag.
  - m_f
```
protected double m_f
```
    function value
  - m_Epsilon
```
protected static double m_Epsilon
```
    Compute machine precision
  - m_Zero
```
protected static double m_Zero
```
    Compute machine precision
- Constructor Detail
  - Optimization
```
public Optimization()
```
- Method Detail
  - getTechnicalInformation
```
public TechnicalInformation getTechnicalInformation()
```
    Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
    
    Specified by:
    
    getTechnicalInformation in interface TechnicalInformationHandler
    
    Returns:
    
    the technical information about this class
  - objectiveFunction
```
protected abstract double objectiveFunction(double[] x)
                                     throws java.lang.Exception
```
    Subclass should implement this procedure to evaluate objective function to be minimized
    
    Parameters:
    
    x - the variable values
    
    Returns:
    
    the objective function value
    
    Throws:
    
    java.lang.Exception - if something goes wrong
  - evaluateGradient
```
protected abstract double[] evaluateGradient(double[] x)
                                      throws java.lang.Exception
```
    Subclass should implement this procedure to evaluate gradient of the objective function
    
    Parameters:
    
    x - the variable values
    
    Returns:
    
    the gradient vector
    
    Throws:
    
    java.lang.Exception - if something goes wrong
  - evaluateHessian
```
protected double[] evaluateHessian(double[] x,
                                   int index)
                            throws java.lang.Exception
```
    Subclass is recommended to override this procedure to evaluate second-order gradient of the objective function. If it's not provided, it returns null.
    
    Parameters:
    
    x - the variables
    
    index - the row index in the Hessian matrix
    
    Returns:
    
    one row (the row #index) of the Hessian matrix, null as default
    
    Throws:
    
    java.lang.Exception - if something goes wrong
  - getMinFunction
```
public double getMinFunction()
```
    Get the minimal function value
    
    Returns:
    
    minimal function value found
  - setMaxIteration
```
public void setMaxIteration(int it)
```
    Set the maximal number of iterations in searching (Default 200)
    
    Parameters:
    
    it - the maximal number of iterations
  - setDebug
```
public void setDebug(boolean db)
```
    Set whether in debug mode
    
    Parameters:
    
    db - use debug or not
  - getVarbValues
```
public double[] getVarbValues()
```
    Get the variable values. Only needed when iterations exceeds the max threshold.
    
    Returns:
    
    the current variable values
  - lnsrch
```
public double[] lnsrch(double[] xold,
                       double[] gradient,
                       double[] direct,
                       double stpmax,
                       boolean[] isFixed,
                       double[][] nwsBounds,
                       keel.Algorithms.Statistical_Classifiers.Logistic.core.Optimization.DynamicIntArray wsBdsIndx)
                throws java.lang.Exception
```
    Find a new point x in the direction p from a point xold at which the value of the function has decreased sufficiently, the positive definiteness of B matrix (approximation of the inverse of the Hessian) is preserved and no bound constraints are violated. Details see "Numerical Methods for Unconstrained Optimization and Nonlinear Equations". "Numeric Recipes in C" was also consulted.
    
    Parameters:
    
    xold - old x value
    
    gradient - gradient at that point
    
    direct - direction vector
    
    stpmax - maximum step length
    
    isFixed - indicating whether a variable has been fixed
    
    nwsBounds - non-working set bounds. Means these variables are free and subject to the bound constraints in this step
    
    wsBdsIndx - index of variables that has working-set bounds. Means these variables are already fixed and no longer subject to the constraints
    
    Returns:
    
    new value along direction p from xold, null if no step was taken
    
    Throws:
    
    java.lang.Exception - if an error occurs
  - findArgmin
```
public double[] findArgmin(double[] initX,
                           double[][] constraints)
                    throws java.lang.Exception
```
    Main algorithm. Descriptions see "Practical Optimization"
    
    Parameters:
    
    initX - initial point of x, assuming no value's on the bound!
    
    constraints - the bound constraints of each variable constraints[0] is the lower bounds and constraints[1] is the upper bounds
    
    Returns:
    
    the solution of x, null if number of iterations not enough
    
    Throws:
    
    java.lang.Exception - if an error occurs
  - solveTriangle
```
public static double[] solveTriangle(Matrix t,
                                     double[] b,
                                     boolean isLower,
                                     boolean[] isZero)
```
    Solve the linear equation of TX=B where T is a triangle matrix It can be solved using back/forward substitution, with O(N^2) complexity
    
    Parameters:
    
    t - the matrix T
    
    b - the vector B
    
    isLower - whether T is a lower or higher triangle matrix
    
    isZero - which row(s) of T are not used when solving the equation. If it's null or all 'false', then every row is used.
    
    Returns:
    
    the solution of X
  - updateCholeskyFactor
```
protected void updateCholeskyFactor(Matrix L,
                                    double[] D,
                                    double[] v,
                                    double coeff,
                                    boolean[] isFixed)
                             throws java.lang.Exception
```
    One rank update of the Cholesky factorization of B matrix in BFGS updates, i.e. B = LDL', and B_{new} = LDL' + coeff*(vv') where L is a unit lower triangle matrix and D is a diagonal matrix, and v is a vector.
    When coeff > 0, we use C1 algorithm, and otherwise we use C2 algorithm described in ``Methods for Modifying Matrix Factorizations''
    
    Parameters:
    
    L - the unit triangle matrix L
    
    D - the diagonal matrix D
    
    v - the update vector v
    
    coeff - the coeffcient of update
    
    isFixed - which variables are not to be updated
    
    Throws:
    
    java.lang.Exception

Class Optimization

Field Summary

Constructor Summary

Method Summary

Methods inherited from class java.lang.Object

Field Detail

m_ALF

m_BETA

m_TOLX

m_STPMX

m_MAXITS

m_Debug

m_f

m_Epsilon

m_Zero

Constructor Detail

Optimization

Method Detail

getTechnicalInformation

objectiveFunction

evaluateGradient

evaluateHessian

getMinFunction

setMaxIteration

setDebug

getVarbValues

lnsrch

findArgmin

solveTriangle

updateCholeskyFactor