Optimization using MATLAB
This tutorial is designed to help readers solve optimization problems in MATLAB through various examples and approaches. We will explore three widely used tools/interfaces: (i) MATLAB's Optimization Toolbox, (ii) YALMIP in conjunction with MATLAB, and (iii) CVX integrated with MATLAB. Additionally, the tutorial will delve into optimizing functions in the complex domain and demonstrate techniques for improving convergence by supplying analytically computed gradients and Hessians. We will also learn an important trick: verifying the correctness of these calculations using MATLAB's Symbolic Math Toolbox. Please note that all the examples presented in this tutorial are based on MATLAB version R2021b. The tutorial's content includes:
Table of Contents
Introduction
Optimization is a fundamental mathematical concept involving the maximization or minimization of a specific utility function. The structure of this utility function varies with the application's context and the specific field of study. Once an optimization problem is formulated, the subsequent challenge lies in obtaining a solution. Unfortunately, deriving a closed-form (analytical) solution is often not feasible. In such cases, numerical optimization methods come into play. Numerous solvers are available to tackle optimization problems, particularly if you know the problem's category. This tutorial demonstrates several examples of how to model and solve optimization problems using MATLAB. It is essential to note that this post does not cover all available tools; instead, it focuses on a select few. Additionally, the tutorial touches on tools capable of handling nonconvex functions, even though they might yield suboptimal solutions.
Solving Linear Programming (LP)
The general LP is given by
$$ \begin{equation} \begin{aligned} \underset{x}{\rm{minimize}} \quad & f^Tx\\ \rm{s.t.} \quad & Ax\leq b\\ \quad & A_{\rm eq}x=b_{\rm eq} \end{aligned} \end{equation} $$
Now consider the following example of LP that we intend to solve: $$ \begin{equation} \begin{aligned} \underset{x}{\rm{minimize}} \quad & -x_1 - \frac{1}{3}x_2\\ \rm{s.t.} \quad & \begin{bmatrix} 1 & 1\\ 1 & \frac{1}{4}\\ 1 & -1\\ -\frac{1}{4} & -1\\ -1 & -1\\ -1 & 1 \end{bmatrix}x\leq \begin{bmatrix} 2\\ 1\\ 2\\ 1\\ -1\\ 2 \end{bmatrix}\\ \quad & x_1+\frac{1}{4}x_2= \frac{1}{2}\\ \quad &-1\leq x_1 \leq 1.5\\ \quad &-0.5 \leq x_2 \leq 1.25 \end{aligned} \end{equation} $$
First let us define the parameters of LP for the example described above
clc;
clear;
close all;
% Defining inequality constraint matrix
A = [1,1;
1,1/4;
1,-1;
-1/4, -1;
-1,-1;
-1,1];
% Defining inequality constraint vector
b = [2;1;2;1;-1;2];
% Defining equality constraint matrix and vector
Aeq = [1 1/4];
beq = 1/2;
% Defining lower bounds and upper bounds on the optimizing variables
lb = [-1,-0.5];
ub = [1.5,1.25];
Note that the lower and upper bounds on the variable can also be modeled as inequality constraints and included in the matrix $A$ and vector $b$.
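For instance, a minimal sketch of that equivalent formulation (using the A, b, lb and ub defined above) is:
% Appending the box bounds as extra inequality rows: x <= ub and -x <= -lb
A_aug = [A; eye(2); -eye(2)];
b_aug = [b; ub(:); -lb(:)];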
Example on how to model LP with linprog of MATLAB
We will use MATLAB’s linprog which is a dedicated solver for linear programming. We can model and solve the LP problem as follows:
% Weights of the objective
f = [-1;-1/3];
% Solving the LP
[x,fval] = linprog(f,A,b,Aeq,beq,lb,ub);
One can also solve the LP using the interior-point method, which can be selected via optimoptions as follows:
% Setting the options for linprog with interior-point method
options = optimoptions('linprog','Algorithm','interior-point');
% Solving the LP
[x,fval] = linprog(f,A,b,Aeq,beq,lb,ub,options);
The solution obtained with linprog for the example is (format of display is changed):
x = [0.1875;1.25]
fval = -0.6042
Example on how to model LP with YALMIP in MATLAB
We can solve the same LP problem by modeling with YALMIP as follows
% Defining the optimization variable
x = sdpvar(2,1);
% Defining the objective
obj = f'*x;
% Defining the constraints
cons = [A*x<=b,Aeq*x==beq,-1<=x(1)<=1.5,-0.5<=x(2)<=1.25];
% Setting up the options and setting verbose not to display
options = sdpsettings('verbose',0);
% Solving the optimization problem
diagnostics = optimize(cons,obj,options);
if diagnostics.problem == 0
disp(value(x)) % value function to display sdp variables value
disp(value(obj))
elseif diagnostics.problem == 1
disp('Solver thinks it is infeasible')
else
disp('Something else happened')
end
Example on how to model LP with CVX in MATLAB
Now we shall solve the same LP problem by modeling with CVX as follows
cvx_begin quiet
variable x(2,1)
minimize dot(f,x);
subject to
A*x<=b;
Aeq*x==beq;
-1<=x(1)<=1.5;
-0.5<=x(2)<=1.25;
cvx_end
Solving Quadratic Programming (QP)
The general QP is given by
$$ \begin{equation} \begin{aligned} \underset{x}{\rm{minimize}} \quad & \frac{1}{2}x^THx + f^Tx\\ \rm{s.t.} \quad & Ax\leq b\\ \quad & A_{\rm eq}x=b_{\rm eq} \end{aligned} \end{equation} $$
We restrict ourselves to the case where $H$ is at least positive semidefinite, i.e., we confine ourselves to convex QP. For the example presented next we consider a positive definite matrix, because the solution is then unique and easy to compare against the results of YALMIP and CVX. Note that MATLAB's quadprog and YALMIP can handle indefinite matrices as well, although they may then return a suboptimal solution. CVX, however, will not accept indefinite matrices, as that would make the problem nonconvex.
Consider the following example of QP that we intend to solve: $$ \begin{equation} \begin{aligned} \underset{x}{\rm{minimize}} \quad & \frac{1}{2}x_1^2+x_2^2-x_1x_2-2x_1-6x_2\\ \rm{s.t.} \quad & x_1 + x_2 \leq 2\\ & -x_1 + 2x_2 \leq 2\\ & 2x_1 + x_2 \leq 3\\ & x_1 + 2x_2 =0 \end{aligned} \end{equation} $$
One can re-formulate into matrix-vector representation and obtain the following QP parameters:
$$ \begin{equation} \begin{aligned} H &= \begin{bmatrix} 1 & -1\\ -1 & 2 \end{bmatrix}\\ f &= \begin{bmatrix} -2\\ -6 \end{bmatrix}\\ A &= \begin{bmatrix} 1 & 1\\ -1 & 2\\ 2 & 1 \end{bmatrix}\\ b &= \begin{bmatrix} 2\\ 2\\ 3 \end{bmatrix}\\ A_{\rm eq} &= \begin{bmatrix} 1 & 2 \end{bmatrix}\\ b_{\rm eq} &= 0 \end{aligned} \end{equation} $$
Now we can start modeling the QP in MATLAB. The first step is to define the fixed parameters of the QP problem.
clc;
clear;
close all;
H = [1 -1; -1 2];
f = [-2; -6];
A = [1 1; -1 2; 2 1];
b = [2; 2; 3];
Aeq = [1 2];
beq = 0;
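As a quick sanity check that this QP is convex, one can verify that $H$ is positive definite (all eigenvalues positive):
eig(H) % both eigenvalues (about 0.38 and 2.62) are positive, so H is positive definite and the QP is convex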
Example on how to model QP with quadprog of MATLAB
We will use MATLAB’s quadprog which is a dedicated solver for QP. We can solve the problem as follows
[x,fval] = quadprog(H,f,A,b,Aeq,beq);
The solution obtained with quadprog for the example is (format of display is changed):
x = [-0.4;0.2]
fval = -0.2
Note that quadprog can also accept bounds on the variable as inputs, as we did in the LP example.
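For instance, with hypothetical box bounds lb and ub, the call would look like:
% Hypothetical bounds, shown only to illustrate the calling syntax
lb = [-1; -1];
ub = [1; 1];
[x,fval] = quadprog(H,f,A,b,Aeq,beq,lb,ub);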
Example on how to model QP with YALMIP in MATLAB
We can solve the same QP problem by modeling with YALMIP as follows
% Define the variable
x = sdpvar(2,1);
% Objective
obj = 0.5*(x'*H*x)+f'*x;
% Constraints
cons = [A*x<=b,Aeq*x==beq];
options = sdpsettings('verbose',0);
diagnostics = optimize(cons,obj,options);
if diagnostics.problem == 0
x = value(x)
fval = value(obj)
elseif diagnostics.problem == 1
disp('Solver thinks it is infeasible')
else
disp('Something else happened')
end
Example on how to model QP with CVX in MATLAB
We can now solve the same QP problem by modeling with CVX as follows
cvx_begin quiet
cvx_solver mosek
variable x(2,1)
minimize 0.5*quad_form(x,H) + dot(f,x);
subject to
A*x<=b;
Aeq*x==beq;
cvx_end
This example also shows how one can invoke mosek solver with CVX to solve the problem.
Solving Quadratic Constrained Quadratic Programming (QCQP)
The general QCQP is given by
$$ \begin{equation} \begin{aligned} \underset{x}{\rm{minimize}} \quad & x^TQ_0x + 2r_0^Tx + c_0\\ \rm{s.t.} \quad & x^TQ_ix + 2r_i^Tx + c_i \leq 0,\ i = 1,\ldots, n \end{aligned} \end{equation} $$
We will focus on examples where the matrices are positive definite. However, the MATLAB function, YALMIP, and CVX can handle positive semidefinite matrices, and sometimes even nonconvex quadratic functions, though they may then return a suboptimal solution. CVX, however, will not accept a nonconvex model.
We shall randomly generate the fixed data of the QCQP for the example presented next, as follows:
clc;
clear;
close all;
% Set the seed for repeatability
rng(5);
% Generate sample data
M = 4; % Size of matrices
n = 4; % Number of constraints + objective
Q = zeros(M,M,n); % Including objective
r = zeros(M,n);
c = zeros(n,1);
% Generating all data at one place
% (this is to ensure data is same across all examples)
Qfull = randn(M,M,n);
rfull = randn(M,n);
cfull = randn(n);
% Defining objective parameters
Q(:,:,1) = Qfull(:,:,1)*Qfull(:,:,1)' + eye(M);
r(:,1) = rfull(:,1);
c(1) = cfull(1);
% Defining constraint parameters
for i = 2:n
Qtmp = Qfull(:,:,i);
Q(:,:,i) = Qtmp*Qtmp';
r(:,i) = rfull(:,i);
c(i) = cfull(i);
end
Example on how to model QCQP with fmincon of MATLAB
MATLAB does not have a dedicated solver for QCQP. However, we can use the interior-point method via the fmincon function, which is a powerful general-purpose solver.
As a first step, we need to define functions for the objective and the (nonlinear) constraints. We can define the corresponding function handles as follows:
% Defining the function as anonymous function
fun = @(x)(x'*Q(:,:,1)*x + 2*r(:,1)'*x + c(1));
% Defining non-linear constraints
noncol = @(x)consfun(x,Q,r,c,n);
Since the objective function is small, it has been defined as an anonymous (in-line) function, and fun is the objective function handle. For the constraints, however, there are several, so we simply defined the function handle noncol; its underlying function is given below:
% Function to compute nonlinear inequality and equality constraints
function [cons,conseq] = consfun(x,Q,r,c,n)
cons = zeros(n-1,1); % Number of constraints
for i = 2:n
cons(i-1) = x'*Q(:,:,i)*x + 2*r(:,i)'*x + c(i);
end
conseq = [ ];
end
One can define this user-defined function at the end of the script for readability. Now we shall see how to solve the problem.
% Defining the initial point
x0 = zeros(M,1);
% Setting the options for algorithm: interior-point method
options = optimoptions('fmincon','Algorithm','interior-point');
% Solving the optimization problem
[x,fval,exitflag,output] = fmincon(fun,x0,[],[],[],[],[],[],noncol,options);
The solution obtained with fmincon for the example is (format of display is changed):
x = [-0.013;-0.0689;0.1170;0.0579]
fval = -0.5080
exitflag and output contain other helpful information about the results, constraint violations, etc. Remember that fmincon can sometimes violate constraints slightly; the violation is usually minimal, but one should check these extra outputs to decide how reliable the results are for the problem at hand.
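For instance, a quick post-check of the returned diagnostics might look like:
% Inspecting the diagnostics returned by fmincon
disp(exitflag); % 1 indicates the first-order optimality conditions were satisfied
disp(output.constrviolation); % maximum constraint violation at the returned solution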
Note that fmincon can also accept linear inequality and equality constraints in the third, fourth, fifth and sixth arguments. Moreover, one can define lower and upper bounds on the variable in the seventh and eighth arguments, as shown in the sketch below. Also, it is worth noting that fmincon can accept nonlinear equality constraints; however, we did not consider that case in this example in order to focus on the convex QCQP problem.
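For completeness, a sketch of such a call (with hypothetical linear constraints and box bounds, added purely for illustration) is:
% Hypothetical linear inequality sum(x) <= 10 and box bounds -2 <= x <= 2
Alin = ones(1,M);
blin = 10;
lb = -2*ones(M,1);
ub = 2*ones(M,1);
[x,fval] = fmincon(fun,x0,Alin,blin,[],[],lb,ub,noncol,options);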
Example on how to model QCQP with YALMIP in MATLAB
We can now solve the same QCQP problem by modeling with YALMIP as follows:
% Defining the sdp variables
x = sdpvar(M,1);
% Defining the objective function
obj = (x'*Q(:,:,1)*x + 2*r(:,1)'*x + c(1));
% Defining all constraints iteratively
cons = [];
for i = 2:n
cons = [cons,x'*Q(:,:,i)*x + 2*r(:,i)'*x + c(i)<=0];
end
% Setting up the options
options = sdpsettings('verbose',0,'solver','mosek');
% Solving the problem
diagnostics = optimize(cons,obj,options);
if diagnostics.problem == 0
disp(value(x))
disp(value(obj))
elseif diagnostics.problem == 1
disp('Solver thinks it is infeasible')
else
disp('Something else happened')
end
Note that with YALMIP, we can use the code snippet check(cons) to inspect how well the constraints are satisfied. In this example, we also learned how to invoke the mosek solver with YALMIP.
Example on how to model QCQP with CVX in MATLAB
We can now solve the same QCQP problem by modeling with CVX as follows:
cvx_begin quiet
cvx_solver mosek
variable x(M,1)
minimize quad_form(x,Q(:,:,1)) + 2*dot(r(:,1),x)+ c(1);
subject to
for i = 2:n
quad_form(x,Q(:,:,i)) + 2*dot(r(:,i),x)+ c(i)<=0;
end
cvx_end
Solving Second Order Conic Programming (SOCP)
The general SOCP is given by
$$ \begin{equation} \begin{aligned} \underset{x}{\rm{minimize}} \quad & f^Tx\\ \rm{s.t.} \quad & \Vert A_ix + b_i \Vert \leq \gamma^T_i x + d_i,\ i = 1,\ldots, n\\ &Bx \leq q\\ & B_{\rm eq}x = q_{\rm eq} \end{aligned} \end{equation} $$
To provide an example, instead of creating a new one, we shall solve the QCQP problem above as an SOCP. After some mathematical manipulations, one can rewrite the QCQP problem as follows:
$$ \begin{equation} \begin{aligned} \underset{x,t}{\rm{minimize}} \quad & t\\ \rm{s.t.} \quad & \Vert Q_0^{\frac{1}{2}}x + Q_0^{-\frac{1}{2}}r_0 \Vert \leq t,\\ & \Vert Q_i^{\frac{1}{2}}x + Q_i^{-\frac{1}{2}}r_i \Vert \leq d_i, \ i = 1,\ldots, n\\ \end{aligned} \end{equation} $$ where $d_i = \sqrt{\Vert Q_i^{-\frac{1}{2}}r_i \Vert^2 - c_i}$.
Note that in the previous example we considered $Q_i, i=0,\ldots, n$ to be positive definite; however, the problem can also be rewritten for positive semidefinite matrices, refer: here. Also note that minimizing $t$ yields the same minimizer $x$ as the original QCQP, but the reported optimal value is $t^\star = \Vert Q_0^{\frac{1}{2}}x^\star + Q_0^{-\frac{1}{2}}r_0 \Vert$, so it will differ from the quadratic objective value reported by fmincon.
Also, note that for the general SOCP problem, there is no structure assumption on $A_i, i =1,\ldots, n$ matrices, i.e., they need not be PSD or PD.
Example on how to model SOCP with coneprog of MATLAB
MATLAB has a dedicated solver coneprog for conic programming. The first step would be to learn how one can model the SOC (second-order constraint) with MATLAB. One can use the following syntax:
socConstraints = secondordercone(A,-b,gamma,-d);
Note that the variable names here follow the way the MATLAB documentation defines them. Also, the signs of $b_i$ and $d_i$ are flipped because MATLAB defines the second-order cone constraint as $\Vert Ax - b \Vert \leq d^Tx - \gamma$; refer to the documentation of secondordercone.
Now let us solve the QCQP problem with coneprog of MATLAB. The first step would be to define the A,b, $\gamma$ and d variables:
% Defining parameters to store SOCP parameters
A = zeros(M,M+1,n); % Including objective
b = zeros(M,n);
d = zeros(n,1);
% Defining parameters SoCP of MATLAB
A(:,:,1) = [sqrtm(Q(:,:,1)),zeros(M,1)]; % Extra zero column for slack variable t
b(:,1) = sqrtm(Q(:,:,1))\r(:,1);
d(1) = sqrt(norm(b(:,1))^2 - c(1));
% Defining constraint parameters
for i = 2:n
A(:,:,i) = [sqrtm(Q(:,:,i)),zeros(M,1)];
b(:,i) = sqrtm(Q(:,:,i))\r(:,i);
d(i) = sqrt(norm(b(:,i))^2 - c(i));
if ~isreal(d(i))
error('Infeasible constraints');
end
end
Here, we have defined our SOCP problem with two variables, $x$ and $t$. However, with MATLAB, we can define only one optimization variable, so we treat the new variable as $x := [x;t]$ (purely for representation). Accordingly, we need an extra zero column in the matrices $A_i, i = 0,\ldots,n$. Also, one can verify mathematically that we need $d_i, i = 1,\ldots, n$ to be real and positive for feasibility.
Note that the variables $\gamma_i$ and $f$ are defined directly in the function calls since they are easy to formulate; one could of course define them beforehand as well. Now we are ready to solve the problem:
% Defining SoC constraints
% Objective
socConstraints(n) = secondordercone(A(:,:,1),-b(:,1),[zeros(M,1);1],0);
% Constraints
for i = 2:n
socConstraints(i-1) = secondordercone(A(:,:,i),-b(:,i),zeros(M+1,1),-d(i));
end
[x,fval] = coneprog([zeros(M,1);1],socConstraints,[],[],[],[],[],[]);
Note that coneprog can also accept linear inequality constraints in the third and fourth arguments, and equality constraints in the fifth and sixth arguments. Moreover, one can define lower and upper bounds on the variable in the seventh and eighth arguments.
The solution obtained with coneprog for the example is (format of display is changed):
x(1:M,1) = [0.0135;-0.0691;0.1175;0.0578]
x(M+1,1) = 2.0375
fval = 2.0375
Example on how to model SOCP with YALMIP in MATLAB
With YALMIP, one can define multiple optimization variables; hence, YALMIP is more flexible than the MATLAB Optimization Toolbox in that sense. The syntax looks precisely like the mathematical formulation. Let us first define the parameters of the SOCP:
% Defining parameters to store SOCP parameters
A = zeros(M,M,n); % Including objective
b = zeros(M,n);
d = zeros(n,1);
% Defining objective parameters
A(:,:,1) = sqrtm(Q(:,:,1));
b(:,1) = sqrtm(Q(:,:,1))\r(:,1);
d(1) = sqrt(norm(b(:,1))^2 - c(1));
% Defining constraint parameters
for i = 2:n
A(:,:,i) = sqrtm(Q(:,:,i));
b(:,i) = sqrtm(Q(:,:,i))\r(:,i);
d(i) = sqrt(norm(b(:,i))^2 - c(i));
if ~isreal(d(i))
error('Infeasible constraints');
end
end
Now solving the problem as follows
% Defining optimizing variables
x = sdpvar(M,1);
t = sdpvar(1,1);
% Defining the objective
obj = t;
% Defining the constraints
cons = cone(A(:,:,1)*x + b(:,1),t);
for i = 2:n
cons = [cons,cone(A(:,:,i)*x + b(:,i),d(i))];
end
options = sdpsettings('verbose',0,'solver','mosek');
diagnostics = optimize(cons,obj,options);
if diagnostics.problem == 0
disp(value(x))
disp(value(obj))
elseif diagnostics.problem == 1
disp('Solver thinks it is infeasible')
else
disp('Something else happened')
end
Example on how to model SOCP with CVX in MATLAB
Now we can solve the same problem with CVX. CVX is also flexible in defining the number of optimization variables. To define the parameters of the SOCP problem, one can reuse the same code as in the YALMIP version of the SOCP example. Once they are defined, we can solve the problem as follows:
cvx_begin quiet
cvx_solver mosek
variable x(M,1)
variable t
minimize t;
subject to
norm(A(:,:,1)*x + b(:,1))<=t;
for i = 2:n
norm(A(:,:,i)*x + b(:,i))<=d(i);
end
cvx_end
Solving QCQP/SOCP for real-valued complex domain functions
In signal processing and communications, we commonly deal with complex-valued variables; however, the utility functions of interest (e.g., error, rate) are real-valued. So, we will now see how to solve a real-valued QCQP problem in complex variables. A similar strategy can be employed for SOCP or any other real-valued function of complex variables.
To work through an example, let us first define some random data for a real-valued complex-variable convex QCQP problem.
clc;
clear;
close all;
rng(1); % Set the seed for repeatability
% Generate sample data
M = 4; % Size of matrices
n = 4; % Number of constraints + objective
Q = zeros(M,M,n); % Including objective
r = zeros(M,n);
c = zeros(n,1);
% Generating all data at one place
% (this is to ensure data is same across all examples)
Qfull = randn(M,M,n) + 1j*randn(M,M,n);
rfull = randn(M,n) + 1j* randn(M,n);
cfull = randn(n);
% Defining objective parameters
Q(:,:,1) = Qfull(:,:,1)*Qfull(:,:,1)' + eye(M);
r(:,1) = rfull(:,1);
c(1) = cfull(1);
% Defining constraint parameters
for i = 2:n
Qtmp = Qfull(:,:,i);
Q(:,:,i) = Qtmp*Qtmp';
r(:,i) = rfull(:,i);
c(i) = cfull(i);
end
We will first solve the problem with YALMIP and CVX because they support optimization with complex variables. Finally, we will learn how to use MATLAB’s optimization toolbox for such situations.
Example on how to solve convex QCQP problem with complex variables with YALMIP
% Defining the complex vector variable
z = sdpvar(M,1,'full','complex');
% Defining the objective
obj = (z'*Q(:,:,1)*z + 2*real(r(:,1)'*z) + c(1));
% Defining the constraints
cons = [];
for i = 2:n
cons = [cons,z'*Q(:,:,i)*z + 2*real(r(:,i)'*z) + c(i)<=0];
end
% Setting the options
options = sdpsettings('verbose',0);
% Solving the problem
diagnostics = optimize(cons,obj,options);
if diagnostics.problem == 0
disp(value(z))
disp(value(obj))
elseif diagnostics.problem == 1
disp('Solver thinks it is infeasible')
else
disp('Something else happened')
end
The solution obtained with YALMIP for the above example is (format of display is changed):
z = [-0.0131 + 0.0438i;0.0713 - 0.1038i;-0.0295 - 0.0003i;0.0868 + 0.0375i]
obj = -0.8042
Example on how to solve convex QCQP problem with complex variables with CVX
cvx_begin quiet
cvx_solver mosek
variable x(M,1) complex
minimize quad_form(x,Q(:,:,1)) + 2*real(dot(r(:,1),x))+ c(1);
subject to
for i = 2:n
quad_form(x,Q(:,:,i)) + 2*real(dot(r(:,i),x))+ c(i)<=0;
end
cvx_end
Example on how to solve convex QCQP problem with complex variables with MATLAB optimization toolbox
The MATLAB Optimization Toolbox does not directly support complex variables. However, one can reformulate any real-valued function of complex variables into an equivalent function of real variables.
For example, consider the general convex QCQP in complex variables
$$ \begin{equation} \begin{aligned} \underset{z\in \mathbb{C}^N}{\rm{minimize}} \quad & z^HQ_0z + 2\mathcal{R}\{r_0^Hz\} + c_0\\ \rm{s.t.} \quad & z^HQ_iz + 2\mathcal{R}\{r_i^Hz\} + c_i \leq 0,\ i = 1,\ldots, n \end{aligned} \end{equation} $$
One can re-write equivalently as follows:
$$ \begin{equation} \begin{aligned} \underset{x\in \mathbb{R}^{2N}}{\rm{minimize}} \quad & x^T\bar{Q}_0x + 2\bar{r}_0^Tx + c_0\\ \rm{s.t.} \quad & x^T\bar{Q}_ix + 2\bar{r}_i^Tx + c_i \leq 0,\ i = 1,\ldots, n \end{aligned} \end{equation} $$
where
$$ \begin{equation} \begin{aligned} x &= \begin{bmatrix} \mathcal{R}\{z\}\\ \mathcal{I}\{z\} \end{bmatrix}\\ \bar{Q}_i &= \begin{bmatrix} \mathcal{R}\{Q_i\} & -\mathcal{I}\{Q_i\}\\ \mathcal{I}\{Q_i\} & \mathcal{R}\{Q_i\} \end{bmatrix}\\ \bar{r}_i &= \begin{bmatrix} \mathcal{R}\{r_i\}\\ \mathcal{I}\{r_i\} \end{bmatrix} \end{aligned} \end{equation} $$ where $\mathcal{R}\{\cdot\}$ and $\mathcal{I}\{\cdot\}$ denote the real and imaginary parts of the argument. After solving, the objective value is the same as in the complex-variable model, and the optimal point can be recovered via $z := x(1:N) + \mathrm{i}\,x(N+1:2N)$ (again, just representational notation). Having understood what to do, we can now solve the problem.
% Reformulating to real variables
Qb = zeros(2*M,2*M,n);
rb = zeros(2*M,n);
for i = 1:n
Qr = real(Q(:,:,i));
Qi = imag(Q(:,:,i));
rr = real(r(:,i));
ri = imag(r(:,i));
Qb(:,:,i) = [Qr,-1*Qi;Qi,Qr];
rb(:,i) = [rr;ri];
end
% Defining the objective function
fun = @(x)(x'*Qb(:,:,1)*x + 2*rb(:,1)'*x + c(1));
% Defining the nonlinear constraints
noncol = @(x)consfun(x,Qb,rb,c,n);
% Defining the initial value
x0 = zeros(2*M,1);
% Setting up the options
options = optimoptions('fmincon','Algorithm','interior-point');
% Solving the optimization problem
[x,fval,exitflag,output] = fmincon(fun,x0,[],[],[],[],[],[],noncol,options);
% Obtaining the complex variable from x
z = x(1:M) + 1j*x(M+1:end);
% Function to compute nonlinear constraint
function [cons,conseq] = consfun(x,Qb,rb,c,n)
cons = zeros(n-1,1); % Number of constraints
for i = 2:n
cons(i-1) = x'*Qb(:,:,i)*x + 2*rb(:,i)'*x + c(i);
end
conseq = [ ];
end
General Convex Programming
We will now consider some general convex functions that do not fall directly under the above representations; of course, one can sometimes reformulate them into standard forms by introducing auxiliary variables. We will not get into that, as we want to keep our focus on how to solve optimization problems in MATLAB. For general convex optimization we can use any of three tools: fmincon of MATLAB, YALMIP, or CVX, whichever toolbox you are comfortable with. However, note that with CVX the objective should be convex on its entire domain (including the region outside the feasible set); otherwise, we cannot directly solve such problems with CVX, which strictly requires convex objectives and constraints. To solve such a problem in CVX, we would need to restructure the objective, which we will not deal with here. One big advantage of fmincon is that you can supply an analytical gradient and Hessian (both optional), which increases the reliability of the results. We will now present an example whose objective is not convex on its entire domain but is convex on the feasible region.
Consider the following example:
$$ \begin{equation} \begin{aligned} \underset{x}{\rm{minimize}} \quad & c(x_2 + x_1^2)^2 + (1 - x_1)^2\\ \rm{s.t.} \quad & x_1^2 + (x_2 - a)^2 \leq 2,\\ & x_2 \geq 0 \end{aligned} \end{equation} $$
for $c = 100, a = 3$. This problem is convex on the feasible region. Let us compute the gradient and Hessian of the objective, say $f(x)$, and of the constraint, say $f_1(x)$.
$$ \begin{equation} f(x) = c(x_2 + x_1^2)^2 + (1-x_1)^2 \end{equation}
\begin{equation} \nabla f(x) = \begin{bmatrix} 4c(x_2 + x_1^2)x_1 - 2(1-x_1)\\ 2c(x_2 + x_1^2) \end{bmatrix} \end{equation}
\begin{equation} \nabla^2 f(x) = \begin{bmatrix} 4c(x_2 + 3x_1^2) + 2 & 4cx_1\\ 4cx_1 & 2c \end{bmatrix} \end{equation} $$
The gradient and Hessian of the constraint are: $$ \begin{equation} \begin{aligned} f_1(x) &= x_1^2 + (x_2 - a)^2 - 2\\ \nabla f_1(x) &= \begin{bmatrix} 2x_1\\ 2(x_2 - a) \end{bmatrix}\\ \nabla^2 f_1(x) &= \begin{bmatrix} 2 & 0\\ 0 & 2 \end{bmatrix} \end{aligned} \end{equation} $$ One can verify from the Hessians that the problem is convex on the feasible region. We will solve the above problem with fmincon and YALMIP only, since CVX can handle only problems that are convex on their entire domain.
Solving the above example with YALMIP
clc;
clear;
close all;
% Parameters of the problem
c = 100;
a = 3;
% Defining the optimization variable
x = sdpvar(2,1);
% Defining the objective
obj = c*(x(2) + x(1)^2)^2 + (1-x(1))^2;
% Defining the constraints
cons = x(2)>=0;
cons = [cons,10*(x(1)^2 + (x(2)-a)^2)<=2*10]; % scaled by 10 (equivalent to x(1)^2 + (x(2)-a)^2 <= 2)
% Setting the options
options = sdpsettings('verbose',0);
% Solving the problem
diagnostics = optimize(cons,obj,options);
if diagnostics.problem == 0
disp(value(x))
disp(value(obj))
elseif diagnostics.problem == 1
disp('Solver thinks it is infeasible')
else
disp('Something else happened')
end
and the solution obtained is (format of display is changed)
x = [0.0023;1.5858]
obj = 252.469541
There are many other examples on the YALMIP website.
Let us now solve the same problem with MATLAB in three different ways: one directly, and two others by supplying the theoretical gradients and Hessians.
Case 1: Solving a general optimization problem
clc;
clear;
close all;
% Linear constraints
A = [0,-1];
b = 0;
% Initial value
x0 = [0;3];
% Setting the options
options = optimoptions('fmincon','Algorithm','interior-point');
% Solving the problem
[x,fval,exitflag,output] = fmincon(@computeFun,x0,A,b,[],[],[],[],@computenonlincons,options);
% Function to compute function value
function f = computeFun(x)
% Calculate objective f
f = 100*(x(2) + x(1)^2)^2 + (1-x(1))^2;
end
% Function to compute nonlinear equality and inequality constraints
function [c,ceq] = computenonlincons(x)
c = x(1)^2 + (x(2)-3)^2 - 2;
ceq = [ ];
end
Note that fmincon can handle only those problems where the objective and constraint functions are continuous and have continuous first derivatives; however, they may be nonconvex. On the other hand, YALMIP can handle convex functions whether smooth or not, and in some cases even nonconvex functions. In the nonconvex case, both fmincon and YALMIP will provide some (possibly suboptimal) local solution.
Optimization of smooth function with fmincon using theoretical gradient and Hessian
In some cases, supplying theoretically computed gradients and Hessians to the interior-point-based fmincon can improve its reliability, making the code more robust and increasing the algorithm's rate of convergence. This feature makes fmincon stand out from YALMIP and CVX, which can sometimes be extremely slow, especially when dealing with exponential cones.
Case 2: Solving the same previous problem using theoretical gradient and Hessians
clc;
clear;
close all;
% Parameters of the problem
c = 100;
a = 3;
% Function handles for fun, gradient and Hessian
fun = @(x)computeFunGrad(x,c);
noncol = @(x)computenonlinConsGrad(x,a);
Hess = @(x,lambda)computeHessianfcn(x,lambda,c);
% Linear constraints
A = [0,-1];
b = 0;
x0 = [0;0];
options = optimoptions('fmincon','Algorithm','interior-point',...
'SpecifyObjectiveGradient',true,'SpecifyConstraintGradient',true,...
'HessianFcn',Hess);
[x,fval,exitflag,output] = fmincon(fun,x0,A,b,[],[],[],[],noncol,options); % passing the linear constraint x(2) >= 0 defined above
% Function to compute the function value and gradient value at the point
function [f,gradf] = computeFunGrad(x,c)
% Calculate objective f
f = c*(x(2) + x(1)^2)^2 + (1-x(1))^2;
gradf = [4*c*(x(2)+x(1)^2)*x(1)-2*(1-x(1));
2*c*(x(2)+x(1)^2)];
end
% Function to compute the Hessian of the Lagrangian at the point
function Hout = computeHessianfcn(x,lambda,c)
% Hessian of objective
H = [12*c*x(1)^2+4*c*x(2)+2, 4*c*x(1);
4*c*x(1), 2*c];
% Hessian of nonlinear inequality constraint
Hg = 2*eye(2);
Hout = H + lambda.ineqnonlin*Hg;
end
% Function to compute gradient for all equality and inequality constraints
function [c,ceq,gc,gceq] = computenonlinConsGrad(x,a)
c = (x(1)^2 + (x(2)-a)^2 - 2);
ceq = [ ];
gc = [2*x(1);2*(x(2)-a)];
gceq = [];
end
Case 3: Solving with theoretical gradient and Hessians-Multiple
clc;
clear;
close all;
% Parameters of the problem
c = 100;
a = 3;
% Function handles for fun, gradient and Hessian
fun = @(x)computeFunGrad(x,c);
noncol = @(x)computenonlinConsGrad(x,a);
HessMult = @(x,lambda,v)computeHessMultFcn(x,lambda,v,c);
% Linear constraints
A = [0,-1];
b = 0;
x0 = [0;0];
options = optimoptions('fmincon','Algorithm','interior-point',...
'SpecifyObjectiveGradient',true,'SpecifyConstraintGradient',true,...
'SubproblemAlgorithm','cg','HessianMultiplyFcn',HessMult);
[x,fval,exitflag,output] = fmincon(fun,x0,A,b,[],[],[],[],noncol,options); % passing the linear constraint x(2) >= 0 defined above
% Function to compute the function value and gradient value at the point
function [f,gradf] = computeFunGrad(x,c)
% Calculate objective f
f = c*(x(2) + x(1)^2)^2 + (1-x(1))^2;
gradf = [4*c*(x(2)+x(1)^2)*x(1)-2*(1-x(1));
2*c*(x(2)+x(1)^2)];
end
% Function to compute the Hessian of the Lagrangian times a vector
function W = computeHessMultFcn(x,lambda,v,c)
% Hessian of objective
H = [12*c*x(1)^2+4*c*x(2)+2, 4*c*x(1);
4*c*x(1), 2*c];
% Hessian of nonlinear inequality constraint
Hg = 2*eye(2);
Hout = H + lambda.ineqnonlin*Hg;
W = Hout*v;
end
% Function to compute gradient for all equality and inequality constraints
function [c,ceq,gc,gceq] = computenonlinConsGrad(x,a)
c = x(1)^2 + (x(2)-a)^2 - 2;
ceq = [ ];
gc = [2*x(1);2*(x(2)-a)];
gceq = [];
end
Case 3 is helpful especially when you have a large, sparse (or structured) Hessian matrix: the solver never needs to form the Hessian explicitly, which saves memory and can speed up each iteration.
Note that when we compute the Hessian here, we are actually computing the Hessian of the Lagrangian, which is as follows:
$$ \nabla^2 L(x,\lambda) = \nabla^2 f(x) + \sum_{i}\lambda_{\rm ineq}(i)\nabla^2 c_i(x) + \sum_{j}\lambda_{\rm eq}(j)\nabla^2 ceq_j(x) $$
where $c_i(x), i =1,\ldots,N$ and $ceq_j(x), j =1,\ldots,M$ are the inequality and equality constraints, respectively, and the $\lambda$'s are the Lagrange multipliers. The notation is a little nonstandard, but it is chosen so that you can easily relate it to MATLAB's syntax. In the example above we had just one inequality constraint; however, if one has multiple constraints, one can loop over them and add them up as follows (syntactical sketch only):
Hobj % Hessian of the objective
% Adding Hessians of the nonlinear inequality constraints
for i = 1:N
Hobj = Hobj + lambda.ineqnonlin(i)*Hc(:,:,i); % where Hc(:,:,i) is the Hessian of c_i(x)
end
% Adding Hessians of the nonlinear equality constraints
% (note: if an equality constraint is not affine, the problem is nonconvex)
for j = 1:M
Hobj = Hobj + lambda.eqnonlin(j)*Hceq(:,:,j); % where Hceq(:,:,j) is the Hessian of ceq_j(x)
end
Computing Gradient/Hessian using symbolic toolbox of MATLAB
To err is human, and hence there is a high probability of small mistakes when computing the gradient and Hessian of complicated functions. One way to verify the correctness of the expressions is to use MATLAB's Symbolic Math Toolbox.
The following code computes the symbolic gradients and Hessians that we used in the example above:
clc;
clear;
close all;
% Defining data
x = sym('x',[2,1]);
% Parameter variable
syms c a;
% Defining function symbolically
fs = c*(x(2) + x(1)^2)^2 + (1-x(1))^2;
% Symbolic gradient
gradfs = gradient(fs,x);
% Symbolic hessian
hessfs = hessian(fs,x);
% Defining constraint symbolically
gs = x(1)^2 + (x(2)-a)^2 - 2;
% Symbolic gradient
gradgs = gradient(gs,x);
% Symbolic hessian
hessgs = hessian(gs,x);
Note that we have parametrized even c and a to show that the Symbolic Math Toolbox can be used in a more general scenario where the parameters vary; if one wants, one can of course fix them to specific values.
One way to check is to display the expressions at the command line and verify by inspection. Another way is to evaluate both the symbolic and the hand-derived expressions at several random sample points and compare, as follows:
% Sample checking over various values
M = 1e2;
eps = 1e-7;
for i = 1:M
x0 = randn(2,1);
c0 = randn;
a0 = randn;
% Symbolic function value
gradfsVal = double(subs(subs(gradfs,x,x0),c,c0));
hessfsval = double(subs(subs(hessfs,x,x0),c,c0));
% Computing theoretical
%f = c*(x(2) + x(1)^2)^2 + (1-x(1))^2;
gradfVal = [4*c0*(x0(2)+x0(1)^2)*x0(1)-2*(1-x0(1));
2*c0*(x0(2)+x0(1)^2)];
hessfVal = [12*c0*x0(1)^2+4*c0*x0(2)+2, 4*c0*x0(1);
4*c0*x0(1), 2*c0];
% Checking the accuracy
if any(abs(gradfsVal - gradfVal) >= eps)
error('Gradient is in error');
end
if sum(abs(hessfsval - hessfVal),'all')>=eps
error('Hessian is in error');
end
% Symbolic constraint value
gradgsVal = double(subs(subs(gradgs,x,x0),a,a0));
hessgsval = double(subs(subs(hessgs,x,x0),a,a0));
% Computing theoretical
gradgVal = [2*x0(1);2*(x0(2)-a0)];
hessgVal = 2*eye(2);
% Checking the accuracy
if any(abs(gradgsVal - gradgVal) >= eps)
error('Gradient of constraint is in error');
end
if sum(abs(hessgsval - hessgVal),'all')>=eps
error('Hessian of constraint is in error');
end
end
Another way to check with the help of fmincon is discussed here
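Presumably this refers to fmincon's built-in derivative check; a minimal sketch under that assumption (reusing computeFunGrad and computenonlinConsGrad from Case 2 above) is:
% Ask fmincon to compare the supplied derivatives against finite-difference estimates
c = 100; a = 3;
fun = @(x)computeFunGrad(x,c);
noncol = @(x)computenonlinConsGrad(x,a);
options = optimoptions('fmincon','Algorithm','interior-point',...
'SpecifyObjectiveGradient',true,'SpecifyConstraintGradient',true,...
'CheckGradients',true,'FiniteDifferenceType','central');
[x,fval] = fmincon(fun,[0;0],[],[],[],[],[],[],noncol,options);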
Converting symbolically computed Gradient/Hessian to a MATLAB function
As a final step, note that one need not even transcribe the analytical expressions by hand. MATLAB has a feature to convert symbolically computed gradients and Hessians into regular functions, which you can use directly in your code. This process ensures that no errors creep in when typing the gradient and Hessian expressions; of course, the symbolic function itself must be defined correctly.
clc;
clear;
close all;
% Defining data
x = sym('x',[2,1]);
% Parameter variable
syms c;
% Defining function symbolically
fs = c*(x(2) + x(1)^2)^2 + (1-x(1))^2;
% Symbolic gradient
gradfs = gradient(fs,x);
% Symbolic hessian
hessfs = hessian(fs,x);
matlabFunction(gradfs,'File','myGrad','vars',{x,c});
matlabFunction(hessfs,'File','myHess','vars',{x,c});
The above code creates two .m files in your folder, myGrad.m and myHess.m, which take inputs x and c and output the gradient and Hessian at the given point x, parameterized by c. Creating function files out of symbolic expressions is a very handy feature of MATLAB; because of it, we can avoid calling the symbolic toolbox repeatedly, which is generally very slow.
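For instance, a minimal usage sketch of the generated files is:
% Evaluating the generated gradient and Hessian at a sample point x0 for c = c0
x0 = [0.5; 1];
c0 = 100;
g = myGrad(x0,c0) % 2x1 gradient of f at x0
H = myHess(x0,c0) % 2x2 Hessian of f at x0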
Some general discussion
For maximizing an objective with the MATLAB Optimization Toolbox, just multiply the objective function by $-1$. CVX supports a maximize operation directly, while with YALMIP one simply passes the negated objective to optimize (which always minimizes).
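For instance, a minimal sketch of the sign flip (with a hypothetical concave objective) is:
% Maximize f(x) = -(x-2)^2 by minimizing -f(x)
f = @(x) -(x-2).^2; % hypothetical objective to maximize
[xopt,negfval] = fminunc(@(x) -f(x), 0);
fmax = -negfval; % maximum value of the original objective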
A few general comments on the tools
The three popular choices with MATLAB to solve an optimization problem are:
- MATLAB optimization toolbox
- Fast and efficient, as all functions are optimized for MATLAB.
- More flexibility in debugging the code.
- All MATLAB functions can be used while modelling, with no incompatibility issues.
- The best aspect of fmincon is that it can accept analytical gradients and Hessians of the objective and constraints, which increases reliability, makes the results more robust, and leads to convergence in fewer iterations.
- Can handle most convex and nonconvex problems.
- Cannot handle real-valued complex-domain functions without re-modelling.
- YALMIP to prototype optimization problems in MATLAB
- Syntax is straightforward to follow and makes it easy to model the problem.
- Almost all MATLAB functions are compatible with YALMIP.
- Can handle convex and non-convex problems.
- Can interface with most popular solvers like MOSEK, SeDuMi, Gurobi etc.
- However, some of these solvers require a license, which is typically free for academics.
- Does not have the option to input analytically computed gradients and Hessians.
- Can handle real-valued complex domain functions without re-modelling
- CVX to prototype convex optimization problems in MATLAB.
- Handles only convex optimization problems.
- One needs to adhere to the so-called disciplined convex programming (DCP) ruleset; otherwise, CVX may not recognize it as convex and may not accept the model even if the problem is convex.
- If CVX accepts the model, one can be assured that the problem is convex.
- CVX doesn’t have the option to input analytically computed gradients and Hessians.
- Can handle real-valued complex domain convex function without re-modelling.
- Can interface with most popular solvers; however, a license is required for some of them, similar to YALMIP.
YALMIP and CVX can solve many other convex problems that are not directly supported by MATLAB's Optimization Toolbox, such as semidefinite programming (SDP). See the YALMIP and CVX documentation for more information.
Some tips when solving optimization problems using MATLAB
- Whenever the constraint values are very small or very large, one may encounter precision issues. To deal with such situations, one can artificially scale the constraints; e.g., the following two constraints are equivalent: $$ \begin{equation} f_i(x)\leq 0 \end{equation} $$ and $$ \begin{equation} c_i f_i(x)\leq 0 \end{equation} $$ where $c_i$ is some positive constant.
- Similarly, one can scale the objective value before optimization and then rescale the optimal value once the problem is solved (see the short sketch at the end of this list).
- Sometimes, even scaling the variable of interest may help, i.e., one can substitute $x = ay$ for some positive constant $a$, optimize over $y$ with each $f_i(x)$ replaced by $g_i(y) = f_i(ay)$, and rescale the solution once it is obtained.
- When dealing with logarithmic functions, YALMIP/CVX can sometimes be slow, depending on the type of function at hand, because such problems are generally solved using exponential cones, which are slower than other standard conic forms. From my experience, in such situations using fmincon with an analytically computed gradient and Hessian pays off a lot in terms of accuracy and speed of convergence.
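To make the objective-scaling tip above concrete, here is a small sketch with a hypothetical badly scaled objective:
% Scale the objective before optimizing, then rescale the optimal value
s = 1e-6; % scale factor
f = @(x) 1e6*(x-3).^2; % hypothetical badly scaled objective
[xopt,fvalScaled] = fminunc(@(x) s*f(x), 0);
fval = fvalScaled/s; % optimal value of the original objective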