Lagrange Multipliers

Lagrange multipliers use a couple of clever tricks to deliver constrained optimums over compact domains. In recent years, this methodology has been used to undergird the theory of linear programming and what I might term the fundamental theorem of machine learning. The problems in this section require substantial time and effort, but the rewards are immense.

1. The basic theory of lagrange multipliers

Lagrange multipliers provide a method to optimize the value of a function within a compact domain. We recall that, on a compact domain, the extreme value theorem guarantees the existence of both a maximum and minimum value over that domain. We run into a significant problem in applying this theorem in any dimension higher than 1, however. The number of endpoints are infinite so we cannot simply evaluate the function at each of them. Lagrange gives us a tool given some function $f(\bar{x})$ and a constrain function $g(\bar{x})$ to construct $\mathcal{L}(\bar{x}, \lambda) = f(\bar{x}) + \lambda g(\bar{x})$. If we examine this function, we note that we find the minima and maxima on the constraint $g(\bar{x}) = 0$ precisely when all of the partials (with respect to each variable: x, y, z, …, $\lambda$) are equal to zero.

We will generally limit ourselves to the consideration of a single constraint. What does this mean? The given surface you’re examining may be quite complicated like the example to the left.

We might consider the constraint that we wish to maximize values for this surface only within the unit disc (specified by $x^2 + y^2 \leq 1$). Our constraint gives us the boundary of evaluation and our partials allow us to search within that boundary provided our function is differentiable.

We notice as well, that we have graphed the function f(x, y) from example 2 and then we have restricted that plot to the portion carved out by the unit disc. It looks a bit like a Pringle!

Worked Problems for Lagrange Multipliers:

1. Find the max of $f(x, y) = 2xy$ bounded by the finite region described by $x^4 + y^4 = 2$.

By now, the procedure should be fairly familiar. We set up the partials

$f_x = 2y = 4\lambda x^3 = \lambda g_x$ (equation 1)
$f_y = 2x = 4\lambda y^3 = \lambda g_y$ (equation 2)
$x^4 + y^4 = 2$ (equation 3)

Now, with a little astute observation we can see that if $\lambda = 0$ then both x and y are forced to be 0 and this fails to satisfy our constraint equation. Now, we may safely assume that $\lambda \neq 0$. With a little bit more care we can see that if x or y is zero, it forces either the other variable to be zero as well or $\lambda = 0$ and we know this cannot happen. Now, we can multiply through equation 1 by x and equation 2 by y to obtain

$2yx = 4\lambda x^4$ (equation 1*)
$2xy = 4\lambda y^4$ (equation 2*)

We can set these equations equal to one another
$4\lambda x^4 = 4\lambda y^4 \rightarrow$
$x^4 = y^4 \rightarrow$ (equation 4)

And we can substitute equation 4 into equation 3 and solve for y and then subsequently x.
$y^4 + y^4 = 2y^4 = 2 \rightarrow y = \pm 1$ and thus $x = \pm 1$.

Checking these solutions, we find that the max of f(x, y) = 2 at (1, 1) or (-1, -1). We still need to consult the relevant derivatives and we find f has a saddle point at (0, 0) and that is not the maximum.

2. Consider the function $f(x, y) = sin(xy)$ subject to the constraint $x^2 + y^2 = 1$. Use the method of lagrange multipliers to determine the absolute minimum and maximum of the function over the unit disc.

We first obtain the relevant partials and set them equal yielding:
$f_x = ycos(xy) = 2\lambda x = \lambda g_x$ (equation 1)
$f_y = xcos(xy) = 2\lambda y = \lambda g_y$ (equation 2)
$x^2 + y^2 = 1 \rightarrow$ (equation 3)
$x = \pm \sqrt{1-y^2}$ (equation 3*)

Here, we notice we have 3 equations and 3 unknowns and substituting equation 3* into equation 1 we can obtain
$ycos(y \sqrt{1-y^2}) = 2\lambda \sqrt{1-y^2} \rightarrow$
$\frac{ycos(y \sqrt{1-y^2})}{2 \sqrt{1-y^2}} = \lambda$ (equation 4)

We will then substitute this into equation equation 2 along with equation 3* to obtain
$\sqrt{1-y^2}cos(\sqrt{1-y^2}y) = 2y\frac{ycos(y \sqrt{1-y^2})}{2 \sqrt{1-y^2}} \rightarrow$
$\sqrt{1-y^2}cos(\sqrt{1-y^2}y) = \frac{y^2cos(y \sqrt{1-y^2})}{\sqrt{1-y^2}} \rightarrow$
$\sqrt{1-y^2}cos(\sqrt{1-y^2}y) – \frac{y^2cos(y \sqrt{1-y^2})}{\sqrt{1-y^2}} = 0\rightarrow$
$cos(y\sqrt{1-y^2})(\sqrt{1-y^2} – \frac{y^2}{\sqrt{1-y^2}}) = 0\rightarrow y = \{0, 1, \frac{1}{\sqrt{2}}\}$
Noticing that we also obtain symmetrical values with substitution of the negative expression and with the same work for the x variable we see our full set of solutions is
$x = \{0, \pm1, \pm\frac{1}{\sqrt{2}}\}, y = \{0, \pm1, \pm\frac{1}{\sqrt{2}}\}$

We then examine the solutions pairwise to see which simultaneously solve equation 3 from above and we obtain the pairs $(0, 1), (0, -1), (1, 0), (-1, 0), (\pm\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}), (\frac{1}{\sqrt{2}}, \pm \frac{1}{\sqrt{2}})$

Evaluating all 8 of these in f(x, y) we find two maxes at $(\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}), (-\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}})$ and two mins at $(-\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}), (\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}})$.

We also need to examine the partials over the interior to the disc and we find $f_x = ycos(xy) = 0 \rightarrow y = 0$ and similarly, $f_y = 0 \rightarrow x = 0$. If we examine the relevant point we find that it is a saddle point on the interior and therefore cannot be a min or a max.

3. What is the maximum volume of a box subject the constraint that the longest diagonal is exactly 10 cm?

Now, we need to recall the volume a rectangular prism:
$v(x, y, z) = xyz$ (equation 1)

and the by repeated application of the pythagorean theorem and situation the near vertex on the origin and the far one in the positive octant, we have

$10 = \sqrt{x^2 + y^2 + z^2}$ (equation 2)
We will save ourselves some trouble by replacing equation 2 with the equivalent condition
$100 = x^2 + y^2 + z^2$ (equation 2*)

We now set up the relevant partials
$v_x = yz = 2\lambda x = \lambda g_x$ (equation 3)
$v_y = xz = 2\lambda y = \lambda g_y$ (equation 4)
$v_z = xy = 2\lambda z = \lambda g_z$ (equation 5)
$100 = x^2 + y^2 + z^2$ (equation 2*)

To solve this collection of equations most easily, we need to recall a sneaky fact about boxes: all their dimensions are positive so we can rewrite equations 3, 4 and 5 as:
$xyz = 2\lambda x^2$ (equation 3*)
$xyz = 2\lambda y^2$ (equation 4*)
$xyz = 2\lambda z^2$ (equation 5*)

Setting equation 3* and 4* equal to one another, we obtain
$2\lambda (x^2 – y^2) = 0 \rightarrow x = y$ since we cannot have $\lambda = 0$ in this instance (it forces one of the dimensions to be zero) and we must take the positive solution because our box needs positive dimensions. We can repeat this trick with equations 3* and 5* and obtain that $x = z$ and plugging into equation 2* we find

$100 = 3x^2 \rightarrow \frac{10}{\sqrt{3}} = x$

We note that a cube maximizes a rectangular prism given a fixed long diagonal and the resultant volume in this case is $v(x, x, x) = \frac{1000}{3\sqrt{3}}$.