## Ideas behind the Cauchy-Schwarz Inequality

If someone asked you for the most elementary inequality you know, I bet your answer would be AM-GM. In this series of posts, however, I will try to show you the power that Cauchy-Schwarz has over AM-GM. As an introduction, let us first state and prove the theorem.

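Cauchy-Schwarz Inequality: Let $a_1, a_2, \dots, a_n$ and $b_1, b_2, \dots, b_n$ be two sequences of real numbers. Then we have:

$\displaystyle \Big(\sum_{i=1}^n a_i b_i\Big)^2 \leq \Big(\sum_{i=1}^n a_i^2\Big)\Big(\sum_{i=1}^n b_i^2\Big)$

In particular, equality holds iff there exists $\lambda \in \mathbb{R}$ for which $a_i = \lambda b_i$ for $i = 1, 2, \dots, n$ (or all the $b_i$ are zero).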

Proof: We will present two proofs, one originating from analysis of the equality case, the other from wishful thinking on small cases of $n$.

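(i) Consider defining the following function $f$:

$\displaystyle f(x) = \sum_{i=1}^n (a_i - x b_i)^2$.

We will expand this to get:

$\displaystyle f(x) = \Big(\sum_{i=1}^n b_i^2\Big) x^2 - 2\Big(\sum_{i=1}^n a_i b_i\Big) x + \sum_{i=1}^n a_i^2$

From our first way of representing $f$, we can conclude that $f(x) \geq 0$ for all real $x$. If the $b_i$ are not all zero, $f$ is a quadratic whose discriminant must be non-positive, that is, $\displaystyle 4\Big(\sum_{i=1}^n a_i b_i\Big)^2 - 4\Big(\sum_{i=1}^n a_i^2\Big)\Big(\sum_{i=1}^n b_i^2\Big) \leq 0$, or

$\displaystyle \Big(\sum_{i=1}^n a_i b_i\Big)^2 \leq \Big(\sum_{i=1}^n a_i^2\Big)\Big(\sum_{i=1}^n b_i^2\Big)$

(If all the $b_i$ are zero, both sides are zero.) Equality holds iff the equation $f(x) = 0$ has one (repeated) root $x = \lambda$, which forces $a_i = \lambda b_i$ for every $i$. ■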
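As a quick sanity check of the discriminant argument, here is a minimal Python sketch. It assumes the quadratic is $f(x) = \sum_i (a_i - x b_i)^2$; the function names `quadratic_coefficients` and `discriminant` are hypothetical and chosen only for this illustration.

```python
def quadratic_coefficients(a, b):
    """Coefficients (A, B, C) of f(x) = sum((a_i - x*b_i)**2) = A*x**2 + B*x + C."""
    A = sum(bi * bi for bi in b)
    B = -2 * sum(ai * bi for ai, bi in zip(a, b))
    C = sum(ai * ai for ai in a)
    return A, B, C


def discriminant(a, b):
    """Discriminant B**2 - 4*A*C of the quadratic above."""
    A, B, C = quadratic_coefficients(a, b)
    return B * B - 4 * A * C


# f is a sum of squares, so the discriminant can never be positive.
print(discriminant([1, 2, 3], [4, -5, 6]))   # -3736
# Proportional sequences (a = 2*b) give a repeated root, i.e. discriminant 0.
print(discriminant([2, 4, 6], [1, 2, 3]))    # 0
```

With integer inputs the arithmetic is exact, so the equality case shows up as an exact zero rather than a small floating-point residue.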

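(ii) Just remark that:

$\displaystyle \Big(\sum_{i=1}^n a_i^2\Big)\Big(\sum_{i=1}^n b_i^2\Big) - \Big(\sum_{i=1}^n a_i b_i\Big)^2 = \sum_{1 \leq i < j \leq n} (a_i b_j - a_j b_i)^2 \geq 0$ ■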

Let us note the following positive things regarding Cauchy-Schwarz:

• It is effective in proving symmetric inequalities
• It works well whenever you can form squares
• It helps to clear up square roots

Look forward to the next few posts to see applications of this extremely elegant inequality!

http://onionesquereality.wordpress.com/2013/09/25/some-proofs-of-the-cauchy-schwarz-inequality/

## Some Proofs of the Cauchy-Schwarz Inequality

Over the past 4-5 months whenever there is some time to spare, I have been working through The Cauchy-Schwarz Master Class by J. Michael Steele. And, although I am still left with the last two chapters, I have been reflecting on the material already covered in order to get a better perspective on what I have been slowly learning over the months. This blog post is a small exercise in this direction.

Of course, there is nothing mysterious about proving the Cauchy-Schwarz inequality; it is fairly obvious and basic. But I thought it still might be instructive (mostly to myself) to reproduce some proofs that I know from memory (following a maxim of my supervisor on a white/blackboard blogpost). However, why Cauchy-Schwarz keeps appearing all the time, and what makes it so useful and fundamental, is indeed quite interesting and non-obvious. And as Gil Kalai notes, it is also unclear why it is Cauchy-Schwarz in particular that is mainly useful. I feel that Steele’s book has made me appreciate this importance somewhat more (compared to 4-5 months ago) by drawing connections to many concepts that link back to Cauchy-Schwarz.

Before getting to the post, a word on the book: This book is perhaps amongst the best mathematics books that I have seen in many years. True to its name, it is indeed a Master Class and also truly addictive. I could not put aside the book completely once I had picked it up and eventually decided to go all the way. Like most great books, the way it is organized makes it “very natural” to rediscover by yourself many substantial results (some of them named) that appear much later, provided you happen to ask the right questions. The emphasis on problem solving makes sure you make very good friends with some of the most interesting inequalities. The number of inequalities featured is also extensive. It starts off with the inequalities dealing with “natural” notions such as monotonicity and positivity and later moves on to somewhat less natural notions such as convexity. I can’t recommend this book enough!

Now getting to the proofs: Some of these proofs appear in Steele’s book, mostly as either challenge problems or as exercises. All of them were solvable after some hints.

________________

Proof 1: A Self-Generalizing proof

This proof is essentially due to Titu Andreescu and Bogdan Enescu and has now grown to be my favourite Cauchy-Schwarz proof.

We start with the trivial identity (for $a, b, x, y \in \mathbb{R}$):

Identity 1: $(ay - bx)^2 \geq 0$

Expanding we have

$a^2y^2 + b^2x^2 - 2abxy \geq 0$

Rearranging this (dividing by $xy$, which requires $x, y > 0$) we get:

$\displaystyle \frac{a^2y}{x} + \frac{b^2x}{y} \geq 2ab$

Further: $\displaystyle a^2 + b^2 + \frac{a^2y}{x} + \frac{b^2x}{y} \geq (a+b)^2$;

Rearranging this we get the following trivial Lemma:

Lemma 1: $\displaystyle \frac{(a+b)^2}{(x+y)} \leq \frac{a^2}{x} + \frac{b^2}{y}$

Notice that this Lemma is self generalizing in the following sense. Suppose we replace $b$ with $b + c$ and $y$ with $y + z$, then we have:

$\displaystyle \frac{(a+b+c)^2}{(x+y+z)} \leq \frac{a^2}{x} + \frac{(b+c)^2}{y+z}$

But we can apply Lemma 1 to the second term of the right hand side one more time. So we would get the following inequality:

$\displaystyle \frac{(a+b+c)^2}{(x+y+z)} \leq \frac{a^2}{x} + \frac{b^2}{y} + \frac{c^2}{z}$

Using the same principle $n$ times we get the following:

$\displaystyle \frac{(a_1+a_2+ \dots + a_n)^2}{(x_1+x_2+ \dots + x_n)} \leq \frac{a_1^2}{x_1} + \frac{a_2^2}{x_2} + \dots + \frac{a_n^2}{x_n}$

Now substitute $a_i = \alpha_i \beta_i$ and $x_i = \beta_i^2$ to get:

$\displaystyle ( \sum_{i=1}^n \alpha_i \beta_i )^2 \leq \sum_{i=1}^n ( \alpha_i )^2 \sum_{i=1}^n ( \beta_i )^2$

This is just the Cauchy-Schwarz Inequality, thus completing the proof.
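The substitution step can be checked mechanically. The following is a minimal sketch, assuming integer (or rational) inputs and all $x_i > 0$; the helper names `engel_sides` and `cauchy_schwarz_from_engel` are hypothetical. It evaluates both sides of the $n$-term lemma exactly and then applies $a_i = \alpha_i \beta_i$, $x_i = \beta_i^2$.

```python
from fractions import Fraction


def engel_sides(a, x):
    """Return (lhs, rhs) of (sum a_i)^2 / (sum x_i) <= sum(a_i^2 / x_i), for x_i > 0."""
    if any(xi <= 0 for xi in x):
        raise ValueError("all x_i must be positive")
    lhs = Fraction(sum(a)) ** 2 / Fraction(sum(x))
    rhs = sum(Fraction(ai) ** 2 / Fraction(xi) for ai, xi in zip(a, x))
    return lhs, rhs


def cauchy_schwarz_from_engel(alpha, beta):
    """Substitute a_i = alpha_i*beta_i, x_i = beta_i^2 (all beta_i nonzero),
    then clear the common denominator sum(beta_i^2)."""
    a = [p * q for p, q in zip(alpha, beta)]
    x = [q * q for q in beta]
    lhs, rhs = engel_sides(a, x)
    return lhs * sum(x), rhs * sum(x)


lhs, rhs = cauchy_schwarz_from_engel([1, 2, 3], [4, -5, 6])
print(lhs, rhs)  # 144 1078, i.e. (sum alpha*beta)^2 and sum(alpha^2)*sum(beta^2)
```

Using `Fraction` keeps every division exact, so the comparison `lhs <= rhs` is not affected by rounding.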

________________

Proof 2: By Induction

Again, the Cauchy-Schwarz Inequality is the following: for real numbers $a_1, \dots, a_n$ and $b_1, \dots, b_n$,

$\displaystyle ( \sum_{i=1}^n a_i b_i )^2 \leq \sum_{i=1}^n ( a_i )^2 \sum_{i=1}^n ( b_i )^2$

For a proof of the inequality by induction, the most important thing is starting with the right base case. Clearly the case $n = 1$ is trivially true, suggesting that it is perhaps not of much import. So we consider the case $n = 2$, which is:

$\displaystyle ( a_1 b_1 + a_2 b_2 )^2 \leq ( a_1^2 + a_2^2 ) (b_1^2 + b_2^2)$

To prove the base case, we simply expand the expressions to get:

$\displaystyle a_1^2 b_1^2 + a_2^2 b_2^2 + 2 a_1 b_1 a_2 b_2 \leq a_1^2 b_1^2 + a_1^2 b_2^2 + a_2^2 b_1^2 + a_2^2 b_2^2$

Which is just:

$\displaystyle a_1^2 b_2^2 + a_2^2 b_1^2 - 2 a_1 b_1 a_2 b_2 \geq 0$

Or:

$\displaystyle (a_1 b_2 - a_2 b_1 )^2 \geq 0$

Which proves the base case.

Moving ahead, we assume the following inequality to be true:

$\displaystyle ( \sum_{i=1}^k a_i b_i )^2 \leq \sum_{i=1}^k ( a_i )^2 \sum_{i=1}^k ( b_i )^2$

To establish Cauchy-Schwarz, we have to demonstrate, assuming the above, that

$\displaystyle ( \sum_{i=1}^{k+1} a_i b_i )^2 \leq \sum_{i=1}^{k+1} ( a_i )^2 \sum_{i=1}^{k+1} ( b_i )^2$

So, we start from $H(k)$:

$\displaystyle ( \sum_{i=1}^k a_i b_i )^2 \leq \sum_{i=1}^k ( a_i )^2 \sum_{i=1}^k ( b_i )^2$

we further have,

$\displaystyle \Big(\sum_{i=1}^k a_i b_i\Big) + a_{k+1}b_{k+1} \leq \Big(\sum_{i=1}^k ( a_i )^2\Big)^{1/2} \Big(\sum_{i=1}^k ( b_i )^2\Big)^{1/2} + a_{k+1}b_{k+1} \ \ \ \ (1)$

Now, we can apply the case for $n = 2$. Recall that: $\displaystyle a_1 b_1 + a_2 b_2 \leq (a_1^2 + a_2^2)^{1/2} (b_1^2 + b_2^2)^{1/2}$

Thus, using this in the R. H. S of $(1)$, we would now have:

$\displaystyle \Big(\sum_{i=1}^{k+1} a_i b_i\Big) \leq \Big(\sum_{i=1}^k ( a_i )^2 + a_{k+1}^2\Big)^{1/2} \Big(\sum_{i=1}^k ( b_i )^2 + b_{k+1}^2\Big)^{1/2}$

Or,

$\displaystyle \Big(\sum_{i=1}^{k+1} a_i b_i\Big) \leq \Big(\sum_{i=1}^{k+1} ( a_i )^2 \Big)^{1/2} \Big(\sum_{i=1}^{k+1} ( b_i )^2 \Big)^{1/2}$

This proves the case $H(k+1)$ on assuming $H(k)$, and hence the Cauchy-Schwarz inequality by the principle of mathematical induction.

________________

Proof 3: For Infinite Sequences using the Normalization Trick:

Problem: For real sequences $(a_i)$ and $(b_i)$, if $\displaystyle \Big(\sum_{i=1}^{\infty} a_i^2 \Big) < \infty$ and $\displaystyle \Big(\sum_{i=1}^{\infty} b_i^2 \Big) < \infty$, then is $\displaystyle \Big(\sum_{i=1}^{\infty} |a_i||b_i|\Big) < \infty$ ?

Note that this is easy to establish. We simply start with the trivial inequality $(x-y)^2 \geq 0$, which in turn gives us $\displaystyle xy \leq \frac{x^2}{2} + \frac{y^2}{2}$

Next, take $x = |a_i|$ and $y = |b_i|$; summing over all $i$ on both sides, we get the following:

$\displaystyle \sum_{i=1}^{\infty} |a_i||b_i| \leq \frac{1}{2}\Big(\sum_{i=1}^{\infty} a_i^2\Big) + \frac{1}{2} \Big(\sum_{i=1}^{\infty} b_i^2\Big)\ \ \ \ \ \ \ \ (2)$

From this it immediately follows that

$\displaystyle \Big(\sum_{i=1}^{\infty} |a_i||b_i|\Big) < \infty$

Now let (assuming neither sequence is identically zero)

$\displaystyle \hat{a}_i = \frac{a_i}{\Big(\sum_j a_j^2\Big)^{1/2}}$ and

$\displaystyle \hat{b}_i = \frac{b_i}{\Big(\sum_j b_j^2\Big)^{1/2}}$; substituting in $(2)$, we get:

$\displaystyle \sum_{i=1}^{\infty} |\hat{a}_i||\hat{b}_i| \leq \frac{1}{2}\Big(\sum_{i=1}^{\infty} \hat{a}_i^2\Big) + \frac{1}{2} \Big(\sum_{i=1}^{\infty} \hat{b}_i^2\Big)$ or,

$\displaystyle \sum_{i = 1}^{\infty}\frac{|a_i|}{\Big(\sum_j a_j^2\Big)^{1/2}}\frac{|b_i|}{\Big(\sum_j b_j^2\Big)^{1/2}} \leq \frac{1}{2} + \frac{1}{2} = 1$

Since $\displaystyle \Big|\sum_{i=1}^{\infty} a_i b_i\Big| \leq \sum_{i=1}^{\infty} |a_i||b_i|$, multiplying through by the two normalizing factors and squaring simply gives back Cauchy’s inequality for infinite sequences, thus completing the proof:

$\displaystyle \Big(\sum_{i=1}^{\infty} a_i b_i\Big)^2 \leq \Big(\sum_{i=1}^{\infty}a_i^2\Big) \Big(\sum_{i=1}^{\infty}b_i^2\Big)$

________________

Proof 4: Using Lagrange’s Identity

We first start with a polynomial which we denote by $\mathbf{Q}_n$:

$\mathbf{Q}_n = \big(a_1^2 + a_2^2 + \dots + a_n^2 \big) \big(b_1^2 + b_2^2 + \dots + b_n^2 \big) - \big(a_1b_1 + a_2b_2 + \dots + a_nb_n\big)^2$

The question to ask now is: is $\mathbf{Q}_n \geq 0$? To answer this question, we start off by re-writing $\mathbf{Q}_n$ in a “better” form.

$\displaystyle \mathbf{Q}_n = \sum_{i=1}^{n}\sum_{j=1}^{n} a_i^2 b_j^2 - \sum_{i=1}^{n}\sum_{j=1}^{n} a_ib_i a_jb_j$

Next, as J. Michael Steele puts it, we pursue symmetry and rewrite the above so as to make it apparent.

$\displaystyle \mathbf{Q}_n = \frac{1}{2} \sum_{i=1}^{n}\sum_{j=1}^{n} \big(a_i^2 b_j^2 + a_j^2 b_i^2\big) - \frac{2}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} a_ib_i a_jb_j$

Thus, we now have:

$\displaystyle \mathbf{Q}_n = \frac{1}{2} \sum_{i=1}^{n}\sum_{j=1}^{n} \big(a_i b_j - a_j b_i\big)^2$

This makes it clear that $\mathbf{Q}_n$ can be written as a sum of squares and hence is always non-negative. Let us write out the above completely:

$\displaystyle \sum_{i=1}^{n}\sum_{j=1}^{n} a_i^2 b_j^2 - \sum_{i=1}^{n}\sum_{j=1}^{n} a_ib_ia_jb_j = \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} \big(a_ib_j -a_jb_i\big)^2$

Now, reversing the step we took at the onset to write the L.H.S better, we simply have:

$\displaystyle \sum_{i=1}^{n} a_i^2 \sum_{i=1}^{n} b_i^2 - \big(\sum_{i=1}^n a_ib_i\big)^2 = \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} \big(a_ib_j -a_jb_i\big)^2$

This is called Lagrange’s Identity. Now, since the R.H.S. is always greater than or equal to zero, we get the following inequality as a corollary:

$\displaystyle \big(\sum_{i=1}^n a_ib_i\big)^2 \leq \sum_{i=1}^{n} a_i^2 \sum_{i=1}^{n} b_i^2$

This is just the Cauchy-Schwarz inequality, completing the proof.
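Lagrange’s Identity is easy to spot-check numerically. Below is a minimal sketch with hypothetical helper names `q_n` and `half_square_sum`; it computes $\mathbf{Q}_n$ directly and compares it with half the double sum of squares.

```python
def q_n(a, b):
    """Q_n = (sum a_i^2)(sum b_i^2) - (sum a_i b_i)^2."""
    return (sum(x * x for x in a) * sum(y * y for y in b)
            - sum(x * y for x, y in zip(a, b)) ** 2)


def half_square_sum(a, b):
    """(1/2) * sum over all i, j of (a_i b_j - a_j b_i)^2."""
    n = len(a)
    double_sum = sum((a[i] * b[j] - a[j] * b[i]) ** 2
                     for i in range(n) for j in range(n))
    return double_sum / 2


a, b = [1, 2, 3], [4, -5, 6]
print(q_n(a, b), half_square_sum(a, b))  # 934 934.0
```

The factor $\frac{1}{2}$ appears because each unordered pair $\{i, j\}$ is counted twice in the double sum, while the diagonal terms $i = j$ vanish.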

________________

Proof 5: Gram-Schmidt Process gives an automatic proof of Cauchy-Schwarz

First we quickly review the Gram-Schmidt Process: Given a set of linearly independent elements $\mathbf{x_1}, \mathbf{x_2}, \dots, \mathbf{x_n}$ of a real or complex inner product space $\big(V,\langle\cdot, \cdot\rangle\big)$, we can get an orthonormal set of $n$ elements $\mathbf{e_1}, \mathbf{e_2}, \dots, \mathbf{e_n}$ by the simple recursion (after setting $\displaystyle \mathbf{e_1} = \frac{\mathbf{x_1}}{\|\mathbf{x_1}\|}$):

$\displaystyle \mathbf{z_k} = \mathbf{x_k} - \sum_{j=1}^{k-1} \langle\mathbf{x_k},\mathbf{e_j}\rangle\mathbf{e_j}$ and then

$\displaystyle \mathbf{e_k} = \frac{\mathbf{z_k}}{\|\mathbf{z_k}\|}$

for $k = 2, 3, \dots, n$.

Keeping the above in mind, assume that $\| x\| = 1$. Now let $x = e_1$. Thus, we have:

$\mathbf{z} = \mathbf{y} - \langle \mathbf{y}, \mathbf{e_1}\rangle\mathbf{e_1}$

Giving: $\displaystyle \mathbf{e_2} = \frac{\mathbf{z}}{\|\mathbf{z}\|}$. Rearranging we have:

$\displaystyle \|\mathbf{z}\|\mathbf{e_2} = \mathbf{y} - \langle \mathbf{y}, \mathbf{e_1}\rangle\mathbf{e_1}$ or

$\displaystyle \mathbf{y} = \langle \mathbf{y},\mathbf{e_1}\rangle \mathbf{e_1} + \|\mathbf{z}\|\mathbf{e_2}$ or

$\displaystyle \mathbf{y} = c_1 \mathbf{e_1} + c_2\mathbf{e_2}$ where $c_1 = \langle \mathbf{y},\mathbf{e_1}\rangle$ and $c_2 = \|\mathbf{z}\|$ are constants.

Now note that: $\displaystyle |\langle\mathbf{x},\mathbf{y}\rangle| = |c_1|$ and

$\displaystyle \langle \mathbf{y},\mathbf{y}\rangle = |c_1|^2 + |c_2|^2$. The following bound is trivial:

$\displaystyle |c_1| \leq (|c_1|^2 + |c_2|^2)^{1/2}$. But note that this is simply $|\langle x,y \rangle| \leq \langle y,y \rangle^{1/2}$

Which is just the Cauchy-Schwarz inequality when $\|x\| = 1$.
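To make the two-vector Gram-Schmidt step concrete, here is a minimal sketch for real vectors given as Python lists. The helper names `dot` and `two_step_gram_schmidt` are hypothetical, and $x$ is normalized inside the function rather than assumed to be a unit vector.

```python
import math


def dot(u, v):
    """Standard real inner product on lists of equal length."""
    return sum(ui * vi for ui, vi in zip(u, v))


def two_step_gram_schmidt(x, y):
    """Return (c1, c2) such that y = c1*e1 + c2*e2, where e1 = x/||x||
    and e2 is the unit vector along z = y - <y, e1> e1 (if z is nonzero)."""
    norm_x = math.sqrt(dot(x, x))
    if norm_x == 0:
        raise ValueError("x must be nonzero")
    e1 = [xi / norm_x for xi in x]
    c1 = dot(y, e1)
    z = [yi - c1 * ei for yi, ei in zip(y, e1)]
    c2 = math.sqrt(dot(z, z))  # c2 = ||z||, which is 0 when y is parallel to x
    return c1, c2


c1, c2 = two_step_gram_schmidt([3.0, 4.0], [1.0, 2.0])
# <y, y> = c1^2 + c2^2, hence |c1| <= ||y||
print(c1 ** 2 + c2 ** 2, dot([1.0, 2.0], [1.0, 2.0]))
```

When $y$ is a multiple of $x$ the residual $z$ is zero, so $c_2 = 0$ and the bound $|c_1| \leq \|y\|$ holds with equality.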

________________

Proof 6: Proof of the Continuous version for d = 2; Schwarz’s Proof

For this case, the inequality may be stated as:

Suppose we have $S \subset \mathbb{R}^2$ and that $f: S \to \mathbb{R}$ and $g: S \to \mathbb{R}$. Then consider the double integrals:

$\displaystyle A = \iint_S f^2 dx dy$, $\displaystyle B = \iint_S fg dx dy$ and $\displaystyle C = \iint_S g^2 dx dy$. These double integrals must satisfy the following inequality:

$|B| \leq \sqrt{A} \cdot \sqrt{C}$.

The proof given by Schwarz as is reported in Steele’s book (and indeed in standard textbooks) is based on the following observation:

The real polynomial below is always non-negative:

$\displaystyle p(t) = \iint_S \Big( t f(x,y) + g(x,y) \Big)^2 dx dy = At^2 + 2Bt + C$

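$p(t) > 0$ unless $f$ and $g$ are proportional. Since $p(t) \geq 0$ for every real $t$, the quadratic $At^2 + 2Bt + C$ cannot have two distinct real roots, so by the quadratic (binomial) formula its discriminant $4B^2 - 4AC$ is non-positive. Thus $B^2 \leq AC$; moreover the inequality is strict unless $f$ and $g$ are proportional.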
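For a concrete feel, the three integrals can be approximated numerically. The sketch below makes several assumptions that are not in the original: $S$ is taken to be the unit square, the integrals are approximated by a midpoint Riemann sum, and the name `schwarz_integrals` and the example functions are purely illustrative.

```python
def schwarz_integrals(f, g, n=50):
    """Midpoint-rule approximations of A = iint f^2, B = iint f*g, C = iint g^2
    over the unit square [0, 1] x [0, 1], using an n-by-n grid."""
    h = 1.0 / n
    A = B = C = 0.0
    for i in range(n):
        for j in range(n):
            x = (i + 0.5) * h
            y = (j + 0.5) * h
            fv, gv = f(x, y), g(x, y)
            A += fv * fv * h * h
            B += fv * gv * h * h
            C += gv * gv * h * h
    return A, B, C


# Example choice (an assumption): f(x, y) = x + y and g(x, y) = x * y.
A, B, C = schwarz_integrals(lambda x, y: x + y, lambda x, y: x * y)
print(B * B <= A * C)  # True: the discriminant 4B^2 - 4AC is non-positive
```

For this example the exact values are $A = 7/6$, $B = 1/3$ and $C = 1/9$, so $B^2 = 1/9$ is indeed below $AC = 7/54$.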

________________

Proof 7: Proof using the Projection formula

Problem: Consider any point $x \neq 0$ in $\mathbb{R}^d$. Now consider the line that passes through this point and origin. Let us call this line $\mathcal{L} = \{ tx: t \in \mathbb{R}\}$. Find the point on the line closest to any point $v \in \mathbb{R}^d$.

If $P(v)$ is the point on the line that is closest to $v$, then it is given by the projection formula: $\displaystyle P(v) = x \frac{\langle x, v \rangle }{\langle x, x \rangle}$

This is fairly elementary to establish. To find the value of $t$, such that distance $\rho(v,tx)$ is minimized, we can simply consider the squared distance $\rho^2(v,tx)$ since it is easier to work with. Which by definition is:

$\displaystyle \rho^2(v,tx) = \langle v - tx, v - tx \rangle$

which is simply:

$\displaystyle \rho^2(v,tx) = \langle v, v \rangle - 2t \langle v, x \rangle + t^2 \langle x, x \rangle$

$\displaystyle = \langle x, x \rangle \bigg( t^2 -2t \frac{\langle v, x \rangle}{\langle x, x \rangle} + \frac{\langle v, v \rangle}{\langle x, x \rangle}\bigg)$

$\displaystyle = \langle x, x \rangle \bigg\{ \bigg(t - \frac{\langle v,x \rangle}{\langle x,x \rangle}\bigg)^2 - \frac{\langle v,x \rangle^2}{\langle x,x \rangle^2} + \frac{\langle v, v \rangle}{\langle x, x \rangle} \bigg\}$

$\displaystyle = \langle x, x \rangle \bigg\{ \bigg(t - \frac{\langle v,x \rangle}{\langle x,x \rangle}\bigg)^2 - \frac{\langle v,x \rangle^2}{\langle x,x \rangle^2} + \frac{\langle v, v \rangle \langle x, x \rangle}{\langle x, x \rangle^2} \bigg\}$

So, the value of $t$ for which the above is minimized is $\displaystyle \frac{\langle v,x \rangle}{\langle x,x \rangle}$. Note that this simply reproduces the projection formula.

Therefore, the minimum squared distance is given by the expression below:

$\displaystyle \min_{t \in \mathbb{R}} \rho^2(v, tx) = \frac{\langle v,v\rangle \langle x,x \rangle - \langle v,x \rangle^2}{\langle x,x \rangle}$

Note that the L. H. S is a squared distance and hence always non-negative. Therefore we have:

$\displaystyle \frac{\langle v,v\rangle \langle x,x \rangle - \langle v,x \rangle^2}{\langle x,x \rangle} \geq 0$

Rearranging, we have:

$\displaystyle \langle v,x \rangle^2 \leq \langle v,v\rangle \langle x,x \rangle$

Which is just Cauchy-Schwarz, thus proving the inequality.
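The projection formula and the minimum squared distance can be verified exactly on a small example. This is a minimal sketch assuming integer coordinates (so that `Fraction` keeps the arithmetic exact); `dot`, `project_onto_line` and `min_squared_distance` are hypothetical helper names.

```python
from fractions import Fraction


def dot(u, v):
    """Standard inner product on R^d."""
    return sum(ui * vi for ui, vi in zip(u, v))


def project_onto_line(x, v):
    """P(v) = x * <x, v> / <x, x>, the closest point to v on the line {t*x}."""
    t = Fraction(dot(x, v), dot(x, x))
    return [t * xi for xi in x]


def min_squared_distance(x, v):
    """Squared distance from v to its projection onto the line through x."""
    p = project_onto_line(x, v)
    diff = [vi - pi for vi, pi in zip(v, p)]
    return dot(diff, diff)


x, v = [1, 2, 2], [3, 0, 4]
closed_form = Fraction(dot(v, v) * dot(x, x) - dot(v, x) ** 2, dot(x, x))
print(min_squared_distance(x, v), closed_form)  # 104/9 104/9
```

Because the distance is non-negative, the numerator $\langle v,v\rangle \langle x,x \rangle - \langle v,x \rangle^2$ of the closed form cannot be negative, which is exactly the inequality being proved.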

________________

Proof 8: Proof using an identity

A variant of this proof is amongst the most common Cauchy-Schwarz proofs that are given in textbooks. It is also related to Proof 6 above. However, it still has some value in its own right, while also giving a useful expression for the “defect” in Cauchy-Schwarz, like the Lagrange Identity above.

Consider the polynomial $P(t) = \langle v - tw, v - tw \rangle$ for real vectors $v$ and $w \neq 0$. Clearly $P(t) \geq 0$.

To find the minimum of this polynomial, we take its derivative w.r.t. $t$ and set it to zero:

$P'(t) = 2t \langle w, w \rangle - 2 \langle v, w \rangle = 0$ giving:

$\displaystyle t_0 = \frac{\langle v, w \rangle}{\langle w, w \rangle}$

Clearly we have $P(t) \geq P(t_0) \geq 0$. We consider:

$P(t_0) \geq 0$, substituting $\displaystyle t_0 = \frac{\langle v, w \rangle}{\langle w, w \rangle}$ we have:

$\displaystyle \langle v,v \rangle - \frac{\langle v, w \rangle}{\langle w, w \rangle} \langle v,w \rangle - \frac{\langle v, w \rangle}{\langle w, w \rangle} \langle w,v \rangle + \frac{\langle v, w \rangle^2}{\langle w, w \rangle^2}\langle w,w \rangle \geq 0 \ \ \ \ \ \ \ \ (A)$

Just rearranging and simplifying (multiplying through by $\langle w, w \rangle$):

$\displaystyle \langle v,v \rangle \langle w,w \rangle - \langle v, w \rangle^2 \geq 0$

This proves Cauchy-Schwarz inequality.

Now suppose we are interested in an expression for the defect in Cauchy-Schwarz, i.e. the difference $\displaystyle \langle v,v \rangle \langle w,w \rangle - \langle v, w \rangle^2$. For this we can just consider the L.H.S of equation $(A)$, since multiplying it by $\langle w,w \rangle$ gives exactly $\displaystyle \langle v,v \rangle \langle w,w \rangle - \langle v, w \rangle^2$.

i.e. Defect =

$\displaystyle \langle w,w \rangle \bigg(\langle v,v \rangle - 2 \frac{\langle v,w \rangle^2}{\langle w,w \rangle} + \frac{\langle v,w \rangle^2}{\langle w,w \rangle}\bigg)$

Which is just:

$\displaystyle \langle w,w \rangle\bigg(\Big\langle v - \frac{\langle w,v \rangle}{\langle w,w \rangle}w,v - \frac{\langle w,v \rangle}{\langle w,w \rangle}w \Big\rangle\bigg)$

This defect term is much in the spirit of the defect term that we saw in Lagrange’s identity above, and it is instructive to compare them.

________________

Proof 9: Proof using the AM-GM inequality

Let us first recall the AM-GM inequality:

For non-negative reals $x_1, x_2, \dots, x_n$ we have the following basic inequality:

$\displaystyle \sqrt[n]{x_1 x_2 \dots x_n} \leq \Big(\frac{x_1 + x_2 + \dots + x_n}{n}\Big)$.

Now let us define $\displaystyle A = \sqrt{a_1^2 + a_2^2 + \dots + a_n^2}$ and $\displaystyle B = \sqrt{b_1^2 + b_2^2 + \dots + b_n^2}$

Now consider the trivial bound (which gives us the AM-GM): $(x-y)^2 \geq 0$, which is just $\displaystyle \frac{1}{2}\Big(x^2 + y^2\Big) \geq xy$. Note that AM-GM as stated above for $n = 2$ is immediate when we consider $x \to \sqrt{x}$ and $y \to \sqrt{y}$

Using the above with $x = a_i/A$ and $y = b_i/B$, we have:

$\displaystyle \frac{1}{2} \Big(\frac{a_i^2}{A^2} + \frac{b_i^2}{B^2}\Big) \geq \frac{a_ib_i}{AB}$

Summing over $i$, we have:

$\displaystyle \sum_{i=1}^n\frac{1}{2} \Big(\frac{a_i^2}{A^2} + \frac{b_i^2}{B^2}\Big) \geq \sum_{i=1}^n \frac{a_ib_i}{AB}$

But note that the L.H.S equals 1, therefore:

$\displaystyle \sum_{i=1}^n \frac{a_ib_i}{AB} \leq 1$ or $\displaystyle \sum_{i=1}^n a_ib_i \leq AB$

Writing out $A$ and $B$ as defined above, we have:

$\displaystyle \sum_{i=1}^n a_ib_i \leq \sqrt{\sum_{i=1}^na_i^2}\sqrt{\sum_{i=1}^nb_i^2}$.

Thus proving the Cauchy-Schwarz inequality.
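The normalization step can be illustrated term by term. The following is a minimal sketch in floating point, assuming neither sequence is identically zero; `normalized_terms` is a hypothetical helper name.

```python
import math


def normalized_terms(a, b):
    """For each i, return the pair of lists
    upper_i = (1/2) * (a_i^2/A^2 + b_i^2/B^2) and lower_i = a_i*b_i/(A*B),
    where A and B are the Euclidean norms of a and b (both assumed nonzero)."""
    A = math.sqrt(sum(x * x for x in a))
    B = math.sqrt(sum(y * y for y in b))
    upper = [0.5 * ((x / A) ** 2 + (y / B) ** 2) for x, y in zip(a, b)]
    lower = [(x * y) / (A * B) for x, y in zip(a, b)]
    return upper, lower


upper, lower = normalized_terms([1, 2, 3], [4, -5, 6])
print(sum(upper))                 # approximately 1, up to rounding
print(sum(lower) <= sum(upper))   # True
```

Dividing by $A$ and $B$ is what makes the upper terms sum to exactly $\frac{1}{2} + \frac{1}{2} = 1$, so the termwise two-variable bound turns into the full inequality.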

________________

Proof 10: Using Jensen’s Inequality

We begin by recalling Jensen’s Inequality:

Suppose that $f: [c, d] \to \mathbb{R}$ is a convex function. Also suppose that there are non-negative numbers $p_1, p_2, \dots, p_n$ such that $\displaystyle \sum_{i=1}^{n} p_i = 1$. Then for all $x_i \in [c, d]$ for $i = 1, 2, \dots, n$ one has:

$\displaystyle f\Big(\sum_{i=1}^{n}p_ix_i\Big) \leq \sum_{i=1}^{n}p_if(x_i)$.

Now we know that $f(x) = x^2$ is convex. Applying Jensen’s Inequality, we have:

$\displaystyle \Big(\sum_{i=1}^{n} p_i x_i \Big)^2 \leq \sum_{i=1}^{n} p_i x_i^2$

Now, for $b_i \neq 0$ for all $i = 1, 2, \dots, n$, let $\displaystyle x_i = \frac{a_i}{b_i}$ and let $\displaystyle p_i = \frac{b_i^2}{\sum_{j=1}^{n}b_j^2}$.

Which gives:

$\displaystyle \Big(\sum_{i=1}^n \frac{a_ib_i}{\sum_{j=1}^{n}b_j^2}\Big)^2 \leq \Big(\sum_{i=1}^{n}\frac{a_i^2}{\sum_{j=1}^{n}b_j^2}\Big)$

Rearranging this just gives the familiar form of Cauchy-Schwarz at once:

$\displaystyle \Big(\sum_{i=1}^{n} a_ib_i\Big)^2 \leq \Big(\sum_{i=1}^{n} a_i^2\Big)\Big(\sum_{i=1}^{n} b_i^2\Big)$
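The choice of weights and points can be checked exactly on a small example. This is a minimal sketch assuming integer inputs with every $b_i \neq 0$; `jensen_sides` is a hypothetical helper name.

```python
from fractions import Fraction


def jensen_sides(a, b):
    """With p_i = b_i^2 / sum(b_j^2) and x_i = a_i / b_i, return
    (lhs, rhs) = ((sum p_i x_i)^2, sum p_i x_i^2) for the convex map x -> x^2."""
    if any(bi == 0 for bi in b):
        raise ValueError("every b_i must be nonzero")
    total = sum(bi * bi for bi in b)
    p = [Fraction(bi * bi, total) for bi in b]      # weights, summing to 1
    x = [Fraction(ai, bi) for ai, bi in zip(a, b)]  # evaluation points
    lhs = sum(pi * xi for pi, xi in zip(p, x)) ** 2
    rhs = sum(pi * xi * xi for pi, xi in zip(p, x))
    return lhs, rhs


lhs, rhs = jensen_sides([1, 2, 3], [4, -5, 6])
print(lhs, rhs)  # 144/5929 2/11
```

Multiplying both sides by $\big(\sum_j b_j^2\big)^2 = 77^2$ turns $144/5929 \leq 2/11$ into $144 \leq 14 \cdot 77$, which is Cauchy-Schwarz for this example.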

________________

Proof 11: Pictorial Proof for d = 2

Here (page 4) is an attractive pictorial proof by means of tilings for the case $d = 2$ by Roger Nelsen.

http://www-stat.wharton.upenn.edu/~steele/MoreProblems.htm

# Cauchy Schwarz Inequality

## More Mathematical Inequalities — From Competitions and Beyond

The Cauchy-Schwarz Master Class has been in print for more than a year now, so, as if it were a law of nature, beautiful problems and proofs start turning up that I never noticed before.

### New Proofs of the Cauchy-Schwarz Inequality?

I certainly like this one that just showed up on my screen. It’s from 2003, but it is amazingly slick, and instructive. It begins with a simple fractional inequality without a name. How about the “baby fractional Cauchy Inequality”? For the whole piece, the most appropriate might be “The Cauchy-Schwarz Inequality and Convexity of the Linear/Quadratic quotient.” Still, that is a bit fancy for such a down-to-earth proof of Cauchy’s inequality.

There is also a real “proof from the book” of the case d=2 of Cauchy’s inequality by tilings! Do look at the lovely piece by Roger Nelsen that covers this (on page 4). The principle? Well, besides the usual (but nice) principle that “tilings of tilings” yield identities, there is the clever observation that a rhombus of given edge lengths has less area than a rectangle with the same edge lengths. Punch line? This “is” Cauchy, if you see the right rhombus!