Optimization using MATLAB

Zakir Hussain Shaik · Sep 7, 2022

This tutorial is designed to help readers solve optimization problems in MATLAB through various examples and approaches. We will explore three widely used tools/interfaces: (i) MATLAB’s Optimization toolbox, (ii) YALMIP in conjunction with MATLAB, and (iii) CVX integrated with MATLAB. Additionally, the tutorial will delve into optimizing functions in the complex domain and demonstrate techniques for enhancing convergence rates by utilizing analytically computed gradients and Hessians.

Introduction

Optimization represents a fundamental mathematical concept involving the maximization or minimization of a specific utility function. The structure of this utility function varies based on the application’s context and the specific field of study. Once an optimization problem is formulated, the subsequent challenge lies in obtaining a solution. Unfortunately, deriving a closed-form (analytical) solution is often not feasible. In such cases, numerical optimization methods come into play.

Numerous solvers are accessible to tackle optimization problems, particularly if you are aware of the problem’s categorical type. This tutorial demonstrates several examples of how to model and solve optimization problems using MATLAB. The code snippets below are self-contained and can be directly used in a MATLAB script.

Optimization Problem Categories

This tutorial covers several important optimization problem classes:

Linear Programming (LP)

Minimizing a linear objective subject to linear constraints:

\begin{aligned}\min & \quad f^Tx \\ \text{s.t.} & \quad Ax\leq b, \quad A_{\rm eq}x=b_{\rm eq}\end{aligned}

Quadratic Programming (QP)

Minimizing a quadratic objective (with positive semidefinite Hessian) subject to linear constraints:

\begin{aligned}\min & \quad \frac{1}{2}x^THx + f^Tx \\ \text{s.t.} & \quad Ax\leq b, \quad A_{\rm eq}x=b_{\rm eq}\end{aligned}

Quadratically-Constrained Quadratic Programming (QCQP)

Minimizing a quadratic objective subject to quadratic constraints:

\begin{aligned}\min & \quad \frac{1}{2}x^TH_0x + f_0^Tx \\ \text{s.t.} & \quad \frac{1}{2}x^TH_ix + f_i^Tx\leq 1, \quad i=1,\ldots,N\end{aligned}

Second-Order Cone Programming (SOCP)

\begin{aligned}\min & \quad f^Tx \\ \text{s.t.} & \quad \|A_ix + b_i\|_2 \leq c_i^Tx + d_i, \quad i=1,\ldots,N\end{aligned}

General Nonlinear Convex Optimization

Minimizing any convex function subject to linear and nonlinear convex constraints.
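Problems in this last category are typically handled in MATLAB by fmincon, which is covered in more detail in the QCQP and gradient sections below. A minimal sketch with a made-up smooth convex objective and a linear constraint (the function and constraint here are purely illustrative):

% General convex problem (illustrative): minimize exp(x1) + x2^2 s.t. x1 + x2 >= 1
fun = @(x) exp(x(1)) + x(2)^2;      % smooth convex objective
A = [-1 -1];  b = -1;               % x1 + x2 >= 1 rewritten as -x1 - x2 <= -1
x0 = [0; 0];

x_opt = fmincon(fun, x0, A, b);
disp(x_opt)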

Linear Programming

The standard form of a linear program is:

\begin{aligned}\min & \quad f^Tx \\ \text{s.t.} & \quad Ax\leq b, \quad A_{\rm eq}x=b_{\rm eq}, \quad l \leq x \leq u\end{aligned}

We will solve this using three different interfaces: the MATLAB optimization toolbox using linprog, YALMIP, and CVX.

Example

\begin{aligned}\min & \quad -3x_1 - 4x_2 \\ \text{s.t.} & \quad 2x_1 + x_2 \leq 6 \\ & \quad x_1 + 2x_2 \leq 6 \\ & \quad x_1, x_2 \geq 0\end{aligned}

Using linprog:

% Linear programming example: maximize 3*x1 + 4*x2 => minimize -3*x1 - 4*x2
f = [-3 -4];                % Coefficients of objective
A = [2  1;  1  2];          % Coefficients of inequality constraints Ax <= b
b = [6; 6];                 % Right-hand side of inequality constraints
x_lb = [0; 0];              % Lower bounds
x_ub = [];                  % Upper bounds (empty means no upper bound)

[x_opt,f_opt,exitflag,output] = linprog(f,A,b,[],[],x_lb,x_ub);

disp(['Optimal solution (linprog): x1 = ', num2str(x_opt(1)), ', x2 = ', num2str(x_opt(2))]);
disp(['Objective value: ', num2str(-f_opt)]);

Using YALMIP:

% Define variable
x = sdpvar(2, 1);

% Objective
objective = -3*x(1) - 4*x(2);

% Constraints
constraints = [2*x(1) + x(2) <= 6, x(1) + 2*x(2) <= 6, x(1) >= 0, x(2) >= 0];

% Solve
options = sdpsettings('verbose', 1, 'solver', 'linprog');
optimize(constraints, objective, options);

disp(['Optimal solution (YALMIP): x1 = ', num2str(value(x(1))), ', x2 = ', num2str(value(x(2)))]);
disp(['Objective value: ', num2str(value(objective))]);

Using CVX:

cvx_begin
    variable x(2)
    minimize(-3*x(1) - 4*x(2))
    subject to
        2*x(1) + x(2) <= 6
        x(1) + 2*x(2) <= 6
        x(1) >= 0
        x(2) >= 0
cvx_end

disp(['Optimal solution (CVX): x1 = ', num2str(x(1)), ', x2 = ', num2str(x(2))]);
disp(['Objective value: ', num2str(cvx_optval)]);

Quadratic Programming

The standard form of a quadratic program is:

\begin{aligned}\min & \quad \frac{1}{2}x^THx + f^Tx \\ \text{s.t.} & \quad Ax\leq b, \quad A_{\rm eq}x=b_{\rm eq}, \quad l \leq x \leq u\end{aligned}

where H is a positive semidefinite matrix.
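Since quadprog expects H to be positive semidefinite, it can be worth verifying this numerically before calling the solver. A quick sketch (the tolerance is an arbitrary choice):

% Quick numerical check that H is (symmetric) positive semidefinite
H = [1 -1; -1 2];
Hs = (H + H')/2;                  % symmetrize to remove round-off asymmetry
isPSD = all(eig(Hs) >= -1e-10);   % allow a small numerical tolerance
disp(isPSD)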

Example

\begin{aligned}\min & \quad 0.5 (x_1^2 + 2x_2^2 - 2x_1x_2) - x_1 - 2x_2 \\ \text{s.t.} & \quad x_1 + x_2 \leq 1 \\ & \quad x_1, x_2 \geq 0\end{aligned}

Using quadprog:

H = [1 -1; -1 2];           % Hessian of objective (must be positive semidefinite)
f = [-1 -2]';               % Linear term in objective
A = [1 1];                  % Inequality constraint matrix (Ax <= b)
b = [1];                    % Right-hand side (b)
x_lb = [0; 0];              % Lower bounds
x_ub = [];                  % Upper bounds

[x_opt, f_opt, exitflag, output] = quadprog(H, f, A, b, [], [], x_lb, x_ub);

disp(['Optimal solution (quadprog): x1 = ', num2str(x_opt(1)), ', x2 = ', num2str(x_opt(2))]);
disp(['Objective value: ', num2str(f_opt)]);

Using YALMIP:

% Define variable
x = sdpvar(2, 1);

% Objective
objective = 0.5*(x(1)^2 + 2*x(2)^2 - 2*x(1)*x(2)) - x(1) - 2*x(2);

% Constraints
constraints = [x(1) + x(2) <= 1, x(1) >= 0, x(2) >= 0];

% Solve
options = sdpsettings('verbose', 1, 'solver', 'quadprog');
optimize(constraints, objective, options);

disp(['Optimal solution (YALMIP): x1 = ', num2str(value(x(1))), ', x2 = ', num2str(value(x(2)))]);
disp(['Objective value: ', num2str(value(objective))]);

Using CVX:

cvx_begin
    variable x(2)
    minimize(0.5*quad_form(x, [1 -1; -1 2]) - x(1) - 2*x(2))  % quad_form keeps the quadratic DCP-compliant
    subject to
        x(1) + x(2) <= 1
        x(1) >= 0
        x(2) >= 0
cvx_end

disp(['Optimal solution (CVX): x1 = ', num2str(x(1)), ', x2 = ', num2str(x(2))]);
disp(['Objective value: ', num2str(cvx_optval)]);

Quadratically-Constrained Quadratic Programming (QCQP)

A quadratically-constrained quadratic program (QCQP) involves minimizing a quadratic objective subject to quadratic constraints.

Example

\begin{aligned}\min & \quad x_1^2 + 2x_2^2 - x_1 - 2x_2 \\ \text{s.t.} & \quad x_1^2 + x_2^2 \leq 2 \\ & \quad x_1, x_2 \geq 0\end{aligned}

Using fmincon:

% Objective function
fun = @(x) x(1)^2 + 2*x(2)^2 - x(1) - 2*x(2);

% Nonlinear inequality constraint: x1^2 + x2^2 <= 2 => x1^2 + x2^2 - 2 <= 0
nonlincon = @(x) deal(x(1)^2 + x(2)^2 - 2, []);  % c(x) <= 0, ceq(x) = 0

x0 = [0.5; 0.5];
options = optimoptions('fmincon', 'Algorithm', 'interior-point');

[x_opt, f_opt, exitflag, output] = fmincon(fun, x0, [], [], [], [], [0; 0], [], nonlincon, options);

disp(['Optimal solution (fmincon): x1 = ', num2str(x_opt(1)), ', x2 = ', num2str(x_opt(2))]);
disp(['Objective value: ', num2str(f_opt)]);

Using YALMIP:

% Define variable
x = sdpvar(2, 1);

% Objective
objective = x(1)^2 + 2*x(2)^2 - x(1) - 2*x(2);

% Constraints
constraints = [x(1)^2 + x(2)^2 <= 2, x(1) >= 0, x(2) >= 0];

% Solve
options = sdpsettings('verbose', 1, 'solver', 'sedumi');
optimize(constraints, objective, options);

disp(['Optimal solution (YALMIP): x1 = ', num2str(value(x(1))), ', x2 = ', num2str(value(x(2)))]);
disp(['Objective value: ', num2str(value(objective))]);

Using CVX:

cvx_begin
    variable x(2)
    minimize(x(1)^2 + 2*x(2)^2 - x(1) - 2*x(2))
    subject to
        x(1)^2 + x(2)^2 <= 2
        x(1) >= 0
        x(2) >= 0
cvx_end

disp(['Optimal solution (CVX): x1 = ', num2str(x(1)), ', x2 = ', num2str(x(2))]);
disp(['Objective value: ', num2str(cvx_optval)]);

Second-Order Cone Programming (SOCP)

SOCP is a powerful class of optimization problems that can handle more general constraints than quadratic programming.

Example

\begin{aligned}\min & \quad -x_1 - 2x_2 \\ \text{s.t.} & \quad \|[2x_1 - x_2,\; 3x_1 + x_2]^T\|_2 \leq x_1 - 3x_2 + 4\end{aligned}

Using MATLAB’s coneprog (R2020b and later):

f = [-1; -2];               % Linear objective coefficients
% coneprog uses constraints of the form ||Asc*x - bsc||_2 <= dsc'*x - gamma,
% built with the secondordercone object.

Asc = [2 -1; 3 1];          % A matrix for the norm constraint
bsc = [0; 0];               % b vector
dsc = [1; -3];              % d vector
gamma = -4;                 % so that dsc'*x - gamma = x1 - 3*x2 + 4

socConstraint = secondordercone(Asc, bsc, dsc, gamma);
[x_opt, f_opt, exitflag] = coneprog(f, socConstraint);

disp(['Optimal solution (coneprog): x1 = ', num2str(x_opt(1)), ', x2 = ', num2str(x_opt(2))]);
disp(['Objective value: ', num2str(f_opt)]);

Using YALMIP:

% Define variable
x = sdpvar(2, 1);

% Objective
objective = -x(1) - 2*x(2);

% SOCP constraint: ||[2*x1 - x2, 3*x1 + x2]||_2 <= x1 - 3*x2 + 4
constraints = [norm([2*x(1) - x(2); 3*x(1) + x(2)], 2) <= x(1) - 3*x(2) + 4];

% Solve
options = sdpsettings('verbose', 1, 'solver', 'sedumi');
optimize(constraints, objective, options);

disp(['Optimal solution (YALMIP): x1 = ', num2str(value(x(1))), ', x2 = ', num2str(value(x(2)))]);
disp(['Objective value: ', num2str(value(objective))]);

Using CVX:

cvx_begin
    variable x(2)
    minimize(-x(1) - 2*x(2))
    subject to
        norm([2*x(1) - x(2); 3*x(1) + x(2)], 2) <= x(1) - 3*x(2) + 4
cvx_end

disp(['Optimal solution (CVX): x1 = ', num2str(x(1)), ', x2 = ', num2str(x(2))]);
disp(['Objective value: ', num2str(cvx_optval)]);

Optimizing Complex-Valued Functions

Optimizing complex-valued functions can be tricky with MATLAB's Optimization Toolbox, which requires re-modelling: a complex variable z = x + iy is represented as a real vector [x, y]^T, and the objective and constraints are reformulated accordingly. YALMIP and CVX, on the other hand, can work directly with complex variables without any re-modelling.

Example

\begin{aligned}\min & \quad |z|^2 = z^H z \\ \text{s.t.} & \quad |z - 1|^2 \leq 1\end{aligned}

where z = x + iy is a complex variable.

Using MATLAB fmincon (by re-modelling):

% Represent z = x + iy as a real vector w = [x; y]
fun = @(w) w(1)^2 + w(2)^2;  % |z|^2 = x^2 + y^2

% Nonlinear constraint: |z - 1|^2 <= 1
% (x-1)^2 + y^2 <= 1 => (x-1)^2 + y^2 - 1 <= 0
nonlincon = @(w) deal((w(1)-1)^2 + w(2)^2 - 1, []);

w0 = [0.5; 0.5];
options = optimoptions('fmincon', 'Algorithm', 'interior-point');

[w_opt, f_opt, exitflag, output] = fmincon(fun, w0, [], [], [], [], [], [], nonlincon, options);

z_opt = w_opt(1) + 1i*w_opt(2);
disp(['Optimal solution (fmincon): z = ', num2str(z_opt)]);
disp(['Objective value: ', num2str(f_opt)]);

Using YALMIP:

% YALMIP can handle complex variables directly
z = sdpvar(1, 1, 'full', 'complex');

% Objective
objective = z'*z;  % |z|^2

% Constraints
constraints = [norm(z - 1) <= 1];  % equivalent to |z - 1|^2 <= 1

% Solve
options = sdpsettings('verbose', 1, 'solver', 'sedumi');
optimize(constraints, objective, options);

disp(['Optimal solution (YALMIP): z = ', num2str(value(z))]);
disp(['Objective value: ', num2str(value(objective))]);

Using CVX:

cvx_begin
    variable z complex
    minimize(square_abs(z))          % |z|^2
    subject to
        square_abs(z - 1) <= 1       % |z - 1|^2 <= 1
cvx_end

disp(['Optimal solution (CVX): z = ', num2str(z)]);
disp(['Objective value: ', num2str(cvx_optval)]);

Providing Gradient and Hessian Information

For optimization problems with expensive objective functions or when high accuracy is required, providing analytical gradients and Hessians can significantly improve convergence. MATLAB’s fmincon supports this via the 'SpecifyObjectiveGradient' and 'SpecifyConstraintGradient' options.

Example with Analytical Gradient

Consider the optimization problem:

\begin{aligned}\min & \quad c(x_2 + x_1^2)^2 + (1-x_1)^2 \\ \text{s.t.} & \quad x_1^2 + (x_2-a)^2 \leq 2 \\ & \quad x_2 \geq 0\end{aligned}

where c and a are parameters.

Case 1: Providing Analytical Gradient Only

clc;
clear;
close all;

% Parameters of the problem
c = 100;
a = 3;

% Function handles for objective and constraint with gradient
fun = @(x)computeFunGrad(x,c);
noncol = @(x)computenonlinConsGrad(x,a);

% Linear inequality constraint: x2 >= 0, written as -x2 <= 0 (A*x <= b)
A = [0, -1];
b = 0;

x0 = [0; 0];
options = optimoptions('fmincon', 'Algorithm', 'interior-point', ...
    'SpecifyObjectiveGradient', true, 'SpecifyConstraintGradient', true);

[x, fval, exitflag, output] = fmincon(fun, x0, A, b, [], [], [], [], noncol, options);

disp(['Optimal solution: x = [', num2str(x(1)), ', ', num2str(x(2)), ']']);
disp(['Objective value: ', num2str(fval)]);

% Function to compute function value and gradient value at the point
function [f, gradf] = computeFunGrad(x, c)
% Calculate objective f
f = c*(x(2) + x(1)^2)^2 + (1-x(1))^2;

gradf = [4*c*(x(2)+x(1)^2)*x(1)-2*(1-x(1));
    2*c*(x(2)+x(1)^2)];

end

% Function to compute gradient for all equality and inequality constraints
function [c, ceq, gc, gceq] = computenonlinConsGrad(x, a)
c = (x(1)^2 + (x(2)-a)^2 - 2);
ceq = [];

gc = [2*x(1); 2*(x(2)-a)];
gceq = [];
end

Case 2: Providing Analytical Gradient and Full Hessian

clc;
clear;
close all;

% Parameters of the problem
c = 100;
a = 3;

% Function handles for fun, gradient and Hessian
fun = @(x)computeFunGrad(x,c);
noncol = @(x)computenonlinConsGrad(x,a);
HessFun = @(x,lambda)computeHessianfcn(x,lambda,c);

% Linear constraints
A = [0,-1];
b = 0;

x0 = [0;0];
options = optimoptions('fmincon','Algorithm','interior-point',...
    'SpecifyObjectiveGradient',true,'SpecifyConstraintGradient',true,...
    'HessianFcn',HessFun);

[x,fval,exitflag,output] = fmincon(fun,x0,A,b,[],[],[],[],noncol,options);

% Function to compute function value and gradient value at the point
function [f,gradf] = computeFunGrad(x,c)
% Calculate objective f
f = c*(x(2) + x(1)^2)^2 + (1-x(1))^2;

gradf = [4*c*(x(2)+x(1)^2)*x(1)-2*(1-x(1));
    2*c*(x(2)+x(1)^2)];

end

% Function to compute the Hessian of the Lagrangian at the point
function Hout = computeHessianfcn(x,lambda,c)
% Hessian of objective
H = [12*c*x(1)^2+4*c*x(2)+2, 4*c*x(1);
    4*c*x(1), 2*c];

% Hessian of nonlinear inequality constraint
Hg = 2*eye(2);
Hout = H + lambda.ineqnonlin*Hg;
end

% Function to compute gradient for all equality and inequality constraints
function [c,ceq,gc,gceq] = computenonlinConsGrad(x,a)
c = (x(1)^2 + (x(2)-a)^2 - 2);
ceq = [ ];

gc = [2*x(1);2*(x(2)-a)];
gceq = [];
end

Case 3: Providing Analytical Gradient and a Hessian Multiply Function

clc;
clear;
close all;

% Parameters of the problem
c = 100;
a = 3;

% Function handles for fun, gradient and Hessian
fun = @(x)computeFunGrad(x,c);
noncol = @(x)computenonlinConsGrad(x,a);
HessMult = @(x,lambda,v)computeHessMultFcn(x,lambda,v,c);

% Linear constraints
A = [0,-1];
b = 0;

x0 = [0;0];
options = optimoptions('fmincon','Algorithm','interior-point',...
    'SpecifyObjectiveGradient',true,'SpecifyConstraintGradient',true,...
    'SubproblemAlgorithm','cg','HessianMultiplyFcn',HessMult);

[x,fval,exitflag,output] = fmincon(fun,x0,A,b,[],[],[],[],noncol,options);

% Function to compute function value and gradient value at the point
function [f,gradf] = computeFunGrad(x,c)
% Calculate objective f
f = c*(x(2) + x(1)^2)^2 + (1-x(1))^2;

gradf = [4*c*(x(2)+x(1)^2)*x(1)-2*(1-x(1));
    2*c*(x(2)+x(1)^2)];

end

% Function to compute the Hessian-vector product of the Lagrangian at the point
function W = computeHessMultFcn(x,lambda,v,c)
% Hessian of objective
H = [12*c*x(1)^2+4*c*x(2)+2, 4*c*x(1);
    4*c*x(1), 2*c];

% Hessian of nonlinear inequality constraint
Hg = 2*eye(2);
Hout = H + lambda.ineqnonlin*Hg;

W = Hout*v;
end

% Function to compute gradient for all equality and inequality constraints
function [c,ceq,gc,gceq] = computenonlinConsGrad(x,a)
c = x(1)^2 + (x(2)-a)^2 - 2;
ceq = [ ];

gc = [2*x(1);2*(x(2)-a)];
gceq = [];
end

Case 3 is helpful especially when the Hessian is large and sparse or structured, since the full Hessian never has to be formed explicitly, and supplying this information speeds up convergence.

Note that when we compute the Hessian here, we are actually computing the Hessian of the Lagrangian, which is as follows:

\nabla^2 L(x,\lambda) = \nabla^2 f(x) + \sum_{i}\lambda_{\rm ineq}(i)\nabla^2 c_i(x) + \sum_{j}\lambda_{\rm eq}(j)\nabla^2 ceq_j(x)

where c_i(x), i = 1,\ldots,N are the inequality constraints, ceq_j(x), j = 1,\ldots,M are the equality constraints, and the \lambda's are the Lagrange multipliers. The notation may seem unconventional, but it is chosen to align with MATLAB's syntax for easier understanding. In the previous example we had only one inequality constraint; when dealing with multiple constraints, you can employ a loop to add their contributions. Here's a syntactical representation:

Hobj = ...;  % Hessian of the objective (computed beforehand)

% Adding Hessians of the nonlinear inequality constraints
for i = 1:N
    Hobj = Hobj + lambda.ineqnonlin(i)*Hc(:,:,i);  % Hc(:,:,i) is the Hessian of c_i(x)
end

% Adding Hessians of the nonlinear equality constraints
% (note: if an equality constraint is not affine, the problem is nonconvex)
for j = 1:M
    Hobj = Hobj + lambda.eqnonlin(j)*Hceq(:,:,j);  % Hceq(:,:,j) is the Hessian of ceq_j(x)
end

This approach enables efficient handling of multiple constraints in the optimization problem.

Computing the Gradient/Hessian using MATLAB's Symbolic Math Toolbox

To err is human, and hence there is a high probability that one may make small mistakes when computing gradients and Hessians with complicated functions. One way to verify the correctness of these expressions is by using the Symbolic Math Toolbox in MATLAB.

The following code computes symbolic gradients and Hessians, which we used in the example above:

clc;
clear;
close all;

% Defining data
x = sym('x',[2,1]);

% Parameter variable
syms c a;

% Defining function symbolically
fs = c*(x(2) + x(1)^2)^2 + (1-x(1))^2;

% Symbolic gradient
gradfs = gradient(fs,x);

% Symbolic hessian
hessfs = hessian(fs,x);

% Defining constraint symbolically
gs = x(1)^2 + (x(2)-a)^2 - 2;

% Symbolic gradient
gradgs = gradient(gs,x);

% Symbolic hessian
hessgs = hessian(gs,x);

Note that we have even parameterized c and a to demonstrate that we can use the symbolic toolbox in a more general scenario where parameters can vary. However, if desired, one can fix them to some real value. These symbolic expressions can be compared against the analytical expressions derived manually to ensure correctness.

One way to check is to display the expression in the command line and verify it by observation. Another method would be to evaluate the expression at several sample points and check, as follows:

% Sample checking over various values
M = 1e2;
tol = 1e-7;   % comparison tolerance (avoids shadowing MATLAB's built-in eps)
for i = 1:M
    x0 = randn(2,1);
    c0  = randn;
    a0  = randn;

    % Symbolic function value
    gradfsVal = double(subs(subs(gradfs,x,x0),c,c0));
    hessfsval = double(subs(subs(hessfs,x,x0),c,c0));

    % Computing theoretical
    %f = c*(x(2) + x(1)^2)^2 + (1-x(1))^2;

    gradfVal = [4*c0*(x0(2)+x0(1)^2)*x0(1)-2*(1-x0(1));
             2*c0*(x0(2)+x0(1)^2)];

    hessfVal = [12*c0*x0(1)^2+4*c0*x0(2)+2, 4*c0*x0(1);
        4*c0*x0(1), 2*c0];

    % Checking the accuracy
    if any(abs(gradfsVal - gradfVal) >= tol)
        error('Gradient is in error');
    end

    if sum(abs(hessfsval - hessfVal),'all') >= tol
        error('Hessian is in error');
    end

    % Symbolic constraint value
    gradgsVal = double(subs(subs(gradgs,x,x0),a,a0));
    hessgsval = double(subs(subs(hessgs,x,x0),a,a0));

    % Computing theoretical
    gradgVal = [2*x0(1);2*(x0(2)-a0)];

    hessgVal = 2*eye(2);

    % Checking the accuracy
    if any(abs(gradgsVal - gradgVal) >= tol)
        error('Gradient of constraint is in error');
    end

    if sum(abs(hessgsval - hessgVal),'all') >= tol
        error('Hessian of constraint is in error');
    end

end

Another way to check the derivatives is with fmincon itself: setting the 'CheckGradients' option to true makes fmincon compare the user-supplied gradients against finite-difference approximations at the start of the run.
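A rough sketch of such a check, reusing computeFunGrad and computenonlinConsGrad from the earlier cases (the option names are those of the Optimization Toolbox; the rest of the setup mirrors Case 1):

% Ask fmincon to compare the supplied derivatives with finite-difference estimates
c = 100;  a = 3;
fun    = @(x)computeFunGrad(x,c);          % returns [f, gradf] as defined earlier
noncol = @(x)computenonlinConsGrad(x,a);   % returns [c, ceq, gc, gceq] as defined earlier

options = optimoptions('fmincon','Algorithm','interior-point',...
    'SpecifyObjectiveGradient',true,'SpecifyConstraintGradient',true,...
    'CheckGradients',true);                % derivative check at the initial point

x0 = [0; 0];
fmincon(fun, x0, [0 -1], 0, [], [], [], [], noncol, options);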

Converting symbolically computed Gradient/Hessian to a MATLAB function

As a final step, it's worth noting that one doesn't need to type out the analytical expressions by hand at all. MATLAB can convert symbolically computed gradients and Hessians into function files, which can be used directly in your code. This prevents errors from creeping in when transcribing the expressions manually; however, the symbolic function itself must, of course, be defined correctly.

clc;
clear;
close all;

% Defining data
x = sym('x',[2,1]);

% Parameter variable
syms c;

% Defining function symbolically
fs = c*(x(2) + x(1)^2)^2 + (1-x(1))^2;

% Symbolic gradient
gradfs = gradient(fs,x);

% Symbolic hessian
hessfs = hessian(fs,x);

matlabFunction(gradfs,'File','myGrad','vars',{x,c});
matlabFunction(hessfs,'File','myHess','vars',{x,c});

The provided code creates two .m files in your folder: myGrad.m and myHess.m. These files take inputs x, c and output the gradient and Hessian, respectively, at the given point x and parameterized by c. This feature of MATLAB, which generates function files from symbolic expressions, is incredibly useful. It allows us to avoid repeatedly using the symbolic toolbox, which can be quite slow.
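A minimal usage sketch, assuming the generated myGrad.m and myHess.m files are on the MATLAB path:

% Evaluate the generated gradient/Hessian files at a point
c0 = 100;
x0 = [0.5; 0.5];
g = myGrad(x0, c0)   % 2x1 gradient of f at x0 for parameter value c0
H = myHess(x0, c0)   % 2x2 Hessian of f at x0 for parameter value c0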

Some general discussion

For maximization with the MATLAB Optimization Toolbox, simply multiply the objective function by -1 and minimize. CVX, on the other hand, supports maximization directly through its maximize() keyword (and in YALMIP one can equivalently minimize the negated objective).
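As an example, the LP from earlier can be written directly as a maximization in CVX:

cvx_begin
    variable x(2)
    maximize(3*x(1) + 4*x(2))
    subject to
        2*x(1) + x(2) <= 6
        x(1) + 2*x(2) <= 6
        x >= 0
cvx_end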

A few general comments on the tools

The three popular choices with MATLAB to solve an optimization problem are:

  • MATLAB Optimization Toolbox
    • Fast and efficient, as all functions are optimized for MATLAB.
    • Gives you more flexibility when debugging the code.
    • All MATLAB functions can be used while modelling, with no incompatibility issues.
    • The best aspect of fmincon is that it accepts analytical gradients and Hessians of the objective and constraints, which increases reliability, makes the results more robust, and reduces the number of iterations needed to converge.
    • Can handle most convex and non-convex problems.
    • Cannot handle real-valued functions of complex variables without re-modelling.
  • YALMIP to prototype optimization problems in MATLAB
    • The syntax is straightforward to follow when modelling a problem.
    • Almost all MATLAB functions are compatible with YALMIP.
    • Can handle convex and non-convex problems.
    • Can interface with most popular solvers such as MOSEK, SeDuMi, Gurobi, etc.
    • However, some of these solvers require a license, which is free for academics.
    • Does not have an option to supply analytically computed gradients and Hessians.
    • Can handle real-valued functions of complex variables without re-modelling.
  • CVX to prototype convex optimization problems in MATLAB
    • Handles only convex optimization problems.
    • One needs to adhere to the disciplined convex programming (DCP) ruleset; otherwise, CVX may not recognize the problem as convex and may reject the model even if it actually is convex.
    • If CVX accepts the model, one can be assured that the problem is convex.
    • Does not have an option to supply analytically computed gradients and Hessians.
    • Can handle real-valued convex functions of complex variables without re-modelling.
    • Can interface with most popular solvers; however, some require a license, similar to YALMIP.

YALMIP and CVX can also model many convex problem classes that are not directly supported by the MATLAB Optimization Toolbox, such as semidefinite programming (SDP). See the YALMIP and CVX documentation for details.
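For instance, a small semidefinite program can be written directly in CVX. The following is an illustrative sketch with made-up data, not an example taken from the sections above:

% Minimal SDP sketch in CVX: minimize trace(C*X) s.t. trace(A*X) == b, X PSD
n = 3;
C = eye(n);  A = ones(n);  b = 1;   % made-up problem data
cvx_begin sdp
    variable X(n,n) symmetric
    minimize(trace(C*X))
    subject to
        trace(A*X) == b
        X >= 0                      % in sdp mode, X >= 0 declares X positive semidefinite
cvx_end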

Some tips when solving optimization problems using MATLAB

  • Whenever the constraint values are very small or very large, precision issues may arise. To address such situations, one can artificially scale the constraints. For example, the following two constraints are equivalent:

    \begin{equation} f_i(x)\leq 0 \end{equation}

    and

    \begin{equation} c_i f_i(x)\leq 0 \end{equation}

    where c_i is some positive constant.

  • Likewise, one can scale the objective value before optimization and then rescale it after the optimization is solved. This ensures numerical stability and helps mitigate precision issues caused by extremely large or small objective values.

  • Sometimes, scaling the variable of interest itself can also help. For instance, substitute x = ay for some positive constant a, replace every function f_i(x) by g_i(y) = f_i(ay), solve for y, and then rescale the result back via x = ay (a minimal sketch is given after this list).

  • When dealing with logarithmic functions, the performance of YALMIP/CVX may vary depending on the specific function involved. These solvers typically rely on exponential cones, which can be slower compared to other standard convex problems. In such cases, based on my experience, using fmincon with analytically computed gradients and Hessians can offer significant benefits in terms of both accuracy and speed of convergence.
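To illustrate the variable-scaling tip above, here is a minimal sketch with a made-up, poorly scaled objective (the scale factor a and the function are purely illustrative):

% Variable scaling: optimize over y with x = a*y, then map the solution back to x
a = 1e3;                                        % assumed scale of the variable
fun_x = @(x) (x(1) - 2e3)^2 + (x(2) + 5e2)^2;   % made-up, poorly scaled objective
fun_y = @(y) fun_x(a*y);                        % same objective in the scaled variable y

y_opt = fminunc(fun_y, [0; 0]);                 % optimize over the well-scaled variable
x_opt = a*y_opt;                                % rescale to recover the original variable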