
Comprehensive Study of Least Square Estimation (Part 3)

Ordinary, Constrained, Multi-objective, and Nonlinear least square

Photo by Wil Stewart on Unsplash

This is the third part of the series "Comprehensive Study of Least Square Estimation". In the first two parts, I discussed the ordinary least-squares and multi-objective least-squares problems. If you have not already read the first two parts, you can check them out here:

Comprehensive Study of Least Square Estimation (Part 1)

Comprehensive Study of Least Square Estimation (Part 2)

In this article, I will talk about the constrained least-squares problem, which arises frequently in machine learning, control, signal processing, and statistics. As before, I encourage you to check out my YouTube channel for a complete series of videos on numerical linear algebra and optimization (I have multiple videos on least squares), random processes, and parameter estimation techniques.

MANIE TADAYON

Constrained Least Square

The constrained LS problem seeks the solution of the LS problem subject to linear constraints. The general form of the problem is as follows:

Figure 1: Constrained Least Squares

In figure 1, ||Ax-b||² is called the objective function and Cx = d is the set of linear constraints (as many as the number of rows of C). A is an m x n matrix and C is a p x n matrix. The vectors x, b, and d are n x 1, m x 1, and p x 1, respectively. The linear constraints can be written as p linear equations as follows:

Figure 2: Linear Constraints

We call a vector x optimal if it minimizes the objective function and it satisfies the constraint.
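In case the figures above do not render, the problem in figures 1 and 2 can be sketched in standard notation (a reconstruction of the figures' content, where cᵢᵀ denotes the i-th row of C):

```latex
\min_{x}\ \|Ax - b\|^{2}
\quad \text{subject to} \quad Cx = d
\qquad \Longleftrightarrow \qquad
c_{i}^{T} x = d_{i}, \quad i = 1, \dots, p
```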

The constrained LS problem can also be formulated as a multi-objective LS problem as follows:

Figure 3: Constrained LS as a Multi-objective LS

The logic behind the above equation is as follows: suppose λ is very large and our goal is to minimize the weighted sum of two objective functions, one of which is weighted by λ. Then the term multiplied by λ must be driven to zero. Therefore the norm of Cx - d should be zero, and hence Cx = d.
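The weighted formulation in figure 3 can be solved as an ordinary LS problem by stacking A on top of √λ·C and b on top of √λ·d. Below is a minimal numpy sketch of this idea; the matrices A, b, C, and d are made-up data for illustration:

```python
import numpy as np

# Hypothetical data for illustration only.
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 3))   # m x n objective matrix
b = rng.standard_normal(6)
C = np.array([[1.0, 1.0, 1.0]])   # p x n constraint: x1 + x2 + x3 = 1
d = np.array([1.0])

# Minimizing ||Ax-b||^2 + lam*||Cx-d||^2 is an ordinary LS problem
# on the stacked matrices; a very large lam forces Cx ≈ d.
lam = 1e8
A_stacked = np.vstack([A, np.sqrt(lam) * C])
b_stacked = np.concatenate([b, np.sqrt(lam) * d])
x_penalty, *_ = np.linalg.lstsq(A_stacked, b_stacked, rcond=None)

print(C @ x_penalty)  # should be very close to d
```

In practice this penalty approach only approximates the constraint, and very large λ can hurt numerical conditioning, which is one motivation for the exact Lagrange-multiplier approach discussed next.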

Optimality Condition (Lagrange Multiplier)

The Lagrange multiplier method is a well-known and powerful technique for solving constrained optimization problems. In this article, we use it to derive the optimality conditions for the constrained LS problem.

Suppose our goal is to solve a problem given in figure 1 using the Lagrange multiplier method.

  • First, construct the Lagrangian function as follows:
Figure 4: Lagrangian Function for Figure 1

where z is a p x 1 vector of Lagrange multipliers.

  • Differentiate L(x, z) with respect to x and z (the original variable and the Lagrange multipliers) and set the derivatives to zero.
Figure 5: Optimality Conditions
  • Finally, putting everything together in a more compact form results in the well-known optimality conditions for constrained least squares:
Figure 6: Optimality Conditions for Constrained LS

The optimality conditions are commonly known as the KKT conditions, and the matrix on the left-hand side of figure 6 is called the KKT matrix.
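Written out, the steps in figures 4 through 6 take the following standard form (a sketch using the usual convention; the factor of 2 comes from differentiating the squared norm, and signs may differ from the figures depending on how the Lagrangian is scaled):

```latex
L(x, z) = \|Ax - b\|^{2} + z^{T}(Cx - d)
\\[4pt]
\nabla_{x} L = 2A^{T}Ax - 2A^{T}b + C^{T}z = 0,
\qquad
\nabla_{z} L = Cx - d = 0
\\[4pt]
\begin{bmatrix} 2A^{T}A & C^{T} \\ C & 0 \end{bmatrix}
\begin{bmatrix} x \\ z \end{bmatrix}
=
\begin{bmatrix} 2A^{T}b \\ d \end{bmatrix}
```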

Solution of constrained LS

To find the optimal x, we solve the matrix-vector equation in figure 6. Therefore we need to find the conditions under which the KKT matrix is invertible. Let's take a look at the dimensions: A is an m x n matrix, C is a p x n matrix, d is a p x 1 vector, and b is an m x 1 vector. Therefore the KKT matrix is a square (p + n) x (p + n) matrix. To find the invertibility condition, we impose the condition of a trivial nullspace, i.e., linearly independent columns.

Figure 7

Multiply the top equation in figure 7 by x transpose and use the bottom equation to arrive at the following:

Figure 8: Conditions for invertibility of KKT Matrix

Putting equations in figures 7 and 8 together we arrive at the following two conditions for invertibility of the KKT matrix:

Figure 9: Conditions for invertibility of KKT Matrix

Therefore, the KKT matrix is invertible exactly when the stacked matrix of A and C has linearly independent columns and C has linearly independent rows (please refer to my YouTube videos and lectures on matrices and least squares for a complete grasp of these concepts).

  • If the above conditions are met then the solution can be found as follows:
Figure 10: Solution of Constrained LS
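Concretely, once the conditions hold, the KKT system from figure 6 can be solved directly with a linear solver. A minimal numpy sketch follows (the data is made up for illustration; the factor of 2 comes from the gradient of the squared objective):

```python
import numpy as np

# Hypothetical data for illustration only.
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 3))
b = rng.standard_normal(6)
C = np.array([[1.0, 1.0, 1.0]])   # constraint: x1 + x2 + x3 = 1
d = np.array([1.0])

n, p = A.shape[1], C.shape[0]

# Assemble the KKT matrix [[2A^T A, C^T], [C, 0]] and right-hand side.
KKT = np.block([[2 * A.T @ A, C.T],
                [C, np.zeros((p, p))]])
rhs = np.concatenate([2 * A.T @ b, d])

sol = np.linalg.solve(KKT, rhs)
x_hat, z_hat = sol[:n], sol[n:]   # optimal x and Lagrange multipliers

print(np.allclose(C @ x_hat, d))  # the constraint is satisfied exactly
```

Unlike the large-λ penalty approximation, this approach satisfies the constraint exactly (up to floating-point precision).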

Conclusion

In this article, I discussed the constrained least-squares problem. In the first two parts, I discussed the ordinary least-squares and multi-objective least-squares problems; you can find links to those articles at the beginning of this article.

In the next part, I will discuss the nonlinear least-squares problem. Please make sure to subscribe to my YouTube channel for more videos and more detailed explanations.

