Eric Chapdelaine
Student at Northeastern University Studying Computer Science.

MATH2331 Linear Algebra


The Class


4 quizzes (each open for more than 24 hours), opening on Thursdays

Exams are timed (65-70 minute tests)

Final Exam (not cumulative) on August 19th

1.1 Introduction to Linear Systems


Background

$\mathbb{R}$ = all real numbers, $(-\infty, \infty)$

$\mathbb{R}^{2}$ = $xy$-plane

$\mathbb{R}^{n}$ = vector space; all $(x_1, x_2, \ldots, x_n)$

Single variable Functions:

Linear: $f(x) = 5x$, $f(x) = ax$

Non-linear: $f(x) = x^{2} + \cos(x)$, $f(x) = e^{x}$, $f(x) = \tan^{-1}(x)$

Multi-variable Functions:

Linear: $f(x,\ y) = ax + by$, $f(x,\ y,\ z) = 5x + 3y + bz$

Non-linear: e.g. $f(x,\ y) = xy$, $f(x,\ y) = x^{2} + y^{2}$

Equations:

$5 = 4x$

A linear equation in the variables $x_1,\ x_2,\ x_3,\ \ldots,\ x_n$ is an equation of the form $a_1x_1 + a_2x_2 + a_3x_3 + \cdots + a_nx_n = b$ where $a_1,\ a_2,\ \ldots,\ a_n$ are real numbers.

A linear system (or system of linear equations) is a collection of linear equations in the same variables $x_1,\ x_2,\ x_3,\ \ldots, x_n$.

Example

$$\begin{vmatrix} x & +3y & = 1 \\ 2x & -y & =9 \end{vmatrix} \overset{L_2 = -2 L_1 + L_2}{\implies} \begin{vmatrix} x & +3y & =1 \\ & -7y & =7 \end{vmatrix} \overset{L_2 = -\frac{1}{7} L_2}{\implies} \begin{vmatrix} x & +3y & =1 \\ & y & =-1 \end{vmatrix}$$

$$\overset{L_1 = -3 L_2 + L_1}{\implies} \begin{vmatrix} x & = 4 \\ y & = -1 \end{vmatrix}$$

Example

$$\begin{vmatrix} x & + 3y & =2 \\ -2x & -6y & =-4 \end{vmatrix} \overset{L_2 = 2L_1 + L_2}{\implies} \begin{vmatrix} x & +3y & = 2 \\ & 0 & = 0 \end{vmatrix}$$

Solutions form the line $x+3y=2$. Infinitely many solutions.

Example

$$\begin{vmatrix} x & +y & & = 0 \\ 2x & -y & + 3z & = 3 \\ x & -2y & -z & =3 \end{vmatrix} \overset{\overset{L_2 = -2L_1 + L_2}{L_3 = -L_1 + L_3}}{\implies} \begin{vmatrix} x & +y & &=0 \\ & -3y & +3z & = 3 \\ & -3y & -z & =3 \end{vmatrix}$$

$$\overset{L_2 = -\frac{1}{3} L_2}{\implies} \begin{vmatrix} x & +y & & = 0 \\ & y & -z & =-1 \\ & -3y & -z & =3 \end{vmatrix} \overset{L_3 = 3L_2 + L_3}{\implies} \begin{vmatrix} x & +y & & =0 \\ & y & -z & =-1 \\ & & -4z & = 0 \end{vmatrix}$$

$$\overset{L_3 = -\frac{1}{4} L_3}{\implies} \begin{vmatrix} x & +y & & =0 \\ & y & -z & = -1 \\ & & z & =0 \end{vmatrix} \overset{L_2 = L_3 + L_2}{\implies} \begin{vmatrix} x & + y & & =0 \\ & y & & =-1 \\ & & z & =0 \end{vmatrix}$$

$$\overset{L_1 = L_1 - L_2}{\implies} \begin{vmatrix} x & =1 \\ y & =-1 \\ z &=0 \end{vmatrix}$$

Solution: $(x,\ y,\ z) = (1,\ -1,\ 0)$
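As a quick sanity check (my addition, not part of the lecture), the solution can be verified by substituting back into the original system:

```python
# Verify (x, y, z) = (1, -1, 0) against the original system:
#    x +  y       = 0
#   2x -  y + 3z  = 3
#    x - 2y -  z  = 3
x, y, z = 1, -1, 0
assert x + y == 0
assert 2*x - y + 3*z == 3
assert x - 2*y - z == 3
```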

Example

$$\begin{vmatrix} x & + y & + z & =2 \\ & y & +z & =1 \\ x & +2y & +2z & =3 \end{vmatrix} \overset{L_3 = -L_1 + L_3}{\implies} \begin{vmatrix} x & +y & +z & = 2 \\ & y & + z & =1 \\ & y & +z & =1 \end{vmatrix}$$

$$\overset{L_3 = -L_2 + L_3}{\implies} \begin{vmatrix} x & +y & +z & =2 \\ & y & +z & =1 \\ & & 0 & =0 \end{vmatrix} \overset{L_1 = -L_2 + L_1}{\implies} \begin{vmatrix} x & & & =1\\ & y & +z & =1\\ & & 0 & =0 \end{vmatrix}$$

This example has a free variable. Let $z=t$. Then $y + z = 1 \implies y = 1 - t$.

Solution: $(x,\ y,\ z) = (1,\ 1-t,\ t)$. Infinitely many solutions.

Example

$$\begin{vmatrix} x & + y & + z & =2 \\ & y & + z & =1 \\ & 2y & + 2z & =0 \end{vmatrix} \overset{L_3 = -2L_2 + L_3}{\implies} \begin{vmatrix} x & +y & +z & =2 \\ & y & + z & =1 \\ & & 0 & =-2 \end{vmatrix}$$

No solutions.

How many solutions are possible to a system of linear equations?

Answer:

  • 0 Solutions
  • 1 Solution
  • Infinitely many solutions

(This is because planes cannot curve)

Geometric Interpretation

A linear equation $ax + by = c$ defines a line in $\mathbb{R}^{2}$.

Solutions to a linear system are intersections of lines in $\mathbb{R}^{2}$:

  • 0 points (solutions)
  • 1 point (solution)
  • $\infty$ many points (solutions) if they are the same line

A linear equation $ax + by + cz = d$ defines a plane in $\mathbb{R}^{3}$.

Solutions to a linear system are intersections of (hyper)planes in $\mathbb{R}^{3}$:

  • 0 points (solutions)
  • 1 point (solution)
  • $\infty$ many points (solutions): all the planes contain a common line, or all the planes are the same plane

Example

Find all polynomials $f(t)$ of degree $\le 2$:

  • whose graphs run through $(1,\ 3)$ and $(2,\ 6)$, and
  • such that $f^{\prime}(1) = 1$
  • Use $f(t) = a + bt + ct^{2}$

We know

  • $f(1) = 3 \implies a + b + c = 3$
  • $f(2) = 6 \implies a + 2b + 4c = 6$
  • $f'(t) = b + 2ct$
  • $f'(1) = 1 \implies b + 2c = 1$

$$\begin{vmatrix} a & +b & + c & =3 \\ a & +2b & +4c & =6 \\ & b & +2c & =1 \end{vmatrix} \overset{L_2 = -L_1 + L_2}{\implies} \begin{vmatrix} a & +b & +c & =3\\ & b & +3c & =3 \\ & b & +2c & =1 \end{vmatrix}$$

$$\overset{L_3 = -L_2 + L_3}{\implies} \begin{vmatrix} a & +b & +c & =3 \\ & b & +3c & =3 \\ & & c & =2 \end{vmatrix} \overset{\overset{L_2 = -3L_3 + L_2}{L_1 = -L_3 + L_1}}{\implies} \begin{vmatrix} a & +b & =1\\ & b & =-3\\ & c & =2 \end{vmatrix}$$

$$\overset{L_1 = L_1 - L_2}{\implies} \begin{vmatrix} a & =4 \\ b & =-3 \\ c & =2 \end{vmatrix}$$

$f(t) = 4 - 3t + 2t^{2}$
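A quick Python check (my addition) that the recovered polynomial really satisfies all three conditions:

```python
# f(t) = 4 - 3t + 2t^2, from the row reduction above.
def f(t):
    return 4 - 3*t + 2*t**2

def fprime(t):          # f'(t) = -3 + 4t
    return -3 + 4*t

assert f(1) == 3        # graph passes through (1, 3)
assert f(2) == 6        # graph passes through (2, 6)
assert fprime(1) == 1   # slope condition f'(1) = 1
```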

1.2 Matrices, Vectors, and Gauss-Jordan Elimination


$$\begin{vmatrix} x & +2y & +3z & =1 \\ 2x & +4y & +7z & =2 \\ 3x & +7y & +11z & =8 \end{vmatrix}$$

We can store all the information in this linear system in a matrix, which is a rectangular array of numbers.

Augmented Matrix:

$$\begin{bmatrix} 1 & 2 & 3 & \bigm| & 1 \\ 2 & 4 & 7 & \bigm| & 2 \\ 3 & 7 & 11 & \bigm| & 8 \end{bmatrix}$$

3 rows and 4 columns = $3\times 4$ matrix

Coefficient Matrix:

$$\begin{bmatrix} 1 & 2 & 3 \\ 2 & 4 & 7 \\ 3 & 7 & 11 \end{bmatrix}$$

$3 \times 3$ matrix

Generally, we have

$$A = [a_{ij}] = \begin{bmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1m} \\ a_{21} & a_{22} & a_{23} & \cdots & a_{2m} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & a_{n3} & \cdots & a_{nm} \end{bmatrix}$$

Here, $A$ is $n\times m$ ($n$ rows and $m$ columns).

For square $n \times n$ matrices:

Diagonal: $a_{ij} = 0$ for $i \neq j$

Lower triangular: $a_{ij} = 0$ for $i < j$

Upper triangular: $a_{ij} = 0$ for $i > j$

Identity matrix $I_n$: the square $n\times n$ diagonal matrix ($a_{ij} = 0$ for $i \neq j$) with $a_{ii} = 1$ for $1 \le i \le n$

$$I_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

0 Matrix: any size; all entries are 0

$$\begin{bmatrix} 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$

Above is a $2\times 5$ 0-matrix

Columns of an $n \times m$ matrix form vectors in $\mathbb{R}^{n}$. Example:

$$\begin{bmatrix} 1 & 2 & 3 & \Bigm| & 1 \\ 2 & 4 & 7 & \Bigm| & 2 \\ 3 & 7 & 11 & \Bigm| & 8 \end{bmatrix}$$

We can represent the columns as vectors:

$$\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} , \begin{bmatrix} 2 \\ 4 \\ 7 \end{bmatrix} , \begin{bmatrix} 3 \\ 7 \\ 11 \end{bmatrix} , \begin{bmatrix} 1 \\ 2 \\ 8 \end{bmatrix} \text{ in } \mathbb{R}^3$$

This is the standard representation for a vector in $\mathbb{R}^{n}$: an arrow starting at the origin and ending at the corresponding point.

Consider the two vectors:

$$\vec{v} = \begin{bmatrix} 1 \\ 2 \end{bmatrix} , \vec{w} = \begin{bmatrix} 3 \\ 1 \end{bmatrix} \text{ in } \mathbb{R}^2$$

lec2-fig1

We may use 3 elementary row operations:

  1. Multiply/divide a row by a nonzero constant
  2. Add/subtract a multiple of one row to another
  3. Interchange two rows

Solving the system of linear equations:

Example

$$\begin{bmatrix} 1 & 2 & 3 & \bigm| & 1 \\ 2 & 4 & 7 & \bigm| & 2 \\ 3 & 7 & 11 & \bigm| & 8 \end{bmatrix} \overset{\overset{-2R_1 + R_2}{-3R_1 + R_3}}{\implies} \begin{bmatrix} 1 & 2 & 3 & \bigm| & 1 \\ 0 & 0 & 1 & \bigm| & 0 \\ 0 & 1 & 2 & \bigm| & 5 \end{bmatrix} \overset{R_2 \leftrightarrow R_3}{\implies} \begin{bmatrix} 1 & 2 & 3 & \bigm| & 1 \\ 0 & 1 & 2 & \bigm| & 5 \\ 0 & 0 & 1 & \bigm| & 0 \end{bmatrix}$$

$$\overset{\overset{-3R_3 + R_1}{-2R_3 + R_2}}{\implies} \begin{bmatrix} 1 & 2 & 0 & \bigm| & 1 \\ 0 & 1 & 0 & \bigm| & 5 \\ 0 & 0 & 1 & \bigm| & 0 \end{bmatrix} \overset{-2R_2 + R_1}{\implies} \begin{bmatrix} 1 & 0 & 0 & \bigm| & -9 \\ 0 & 1 & 0 & \bigm| & 5 \\ 0 & 0 & 1 & \bigm| & 0 \end{bmatrix} \text{ identity matrix}$$

$$\therefore \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} -9 \\ 5 \\ 0 \end{bmatrix}$$

Example

$$\begin{bmatrix} 1 & -1 & 1 & \bigm| & 0 \\ 1 & 0 & -2 & \bigm| & 2 \\ 2 & -1 & 1 & \bigm| & 4 \\ 0 & 2 & -5 & \bigm| & 4 \end{bmatrix} \overset{\overset{-R_1 + R_2}{-2R_1 + R_3}}{\implies} \begin{bmatrix} 1 & -1 & 1 & \bigm| & 0 \\ 0 & 1 & -3 & \bigm| & 2 \\ 0 & 1 & -1 & \bigm| & 4 \\ 0 & 2 & -5 & \bigm| & 4 \end{bmatrix}$$

$$\overset{\overset{-R_2 + R_3}{-2R_2 + R_4}}{\implies} \begin{bmatrix} 1 & -1 & 1 & \bigm| & 0 \\ 0 & 1 & -3 & \bigm| & 2 \\ 0 & 0 & 2 & \bigm| & 2 \\ 0 & 0 & 1 & \bigm| & 0 \end{bmatrix} \overset{R_3 \leftrightarrow R_4}{\implies} \begin{bmatrix} 1 & -1 & 1 & \bigm| & 0 \\ 0 & 1 & -3 & \bigm| & 2 \\ 0 & 0 & 1 & \bigm| & 0 \\ 0 & 0 & 2 & \bigm| & 2 \end{bmatrix}$$

$$\overset{-2R_3 + R_4}{\implies} \begin{bmatrix} 1 & -1 & 1 & \bigm| & 0 \\ 0 & 1 & -3 & \bigm| & 2 \\ 0 & 0 & 1 & \bigm| & 0 \\ 0 & 0 & 0 & \bigm| & 2 \end{bmatrix}$$

No solutions.

Example

$$\begin{bmatrix} x_1 & x_2 & x_3 & x_4 & x_5 \cdots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{bmatrix} = \begin{bmatrix} 1 & -7 & 0 & 0 & 1 & \bigm| & 3 \\ 0 & 0 & 1 & 0 & -2 & \bigm| & 2 \\ 0 & 0 & 0 & 1 & 1 & \bigm| & 1 \end{bmatrix}$$

This is already as far as we can go with row operations, but we have two free variables: $x_2$ and $x_5$.

We can say that

$x_2 = t$

$x_5 = s$

$x_1 = 3 + 7t - s$

$x_3 = 2 + 2s$

$x_4 = 1 - s$

$$\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{bmatrix} = \begin{bmatrix} 3 + 7t - s \\ t \\ 2 + 2s \\ 1 - s \\ s \end{bmatrix}$$

Example

$$\begin{bmatrix} 1 & 1 & 2 & \bigm| & 0 \\ 2 & -1 & 1 & \bigm| & 6 \\ 4 & 1 & 5 & \bigm| & 6 \end{bmatrix} \overset{\overset{-2R_1 + R_2}{-4R_1 + R_3}}{\implies} \begin{bmatrix} 1 & 1 & 2 & \bigm| & 0 \\ 0 & -3 & -3 & \bigm| & 6 \\ 0 & -3 & -3 & \bigm| & 6 \end{bmatrix}$$

$$\overset{\left( -\frac{1}{3} \right) R_2}{\implies} \begin{bmatrix} 1 & 1 & 2 & \bigm| & 0 \\ 0 & 1 & 1 & \bigm| & -2 \\ 0 & -3 & -3 & \bigm| & 6 \end{bmatrix} \overset{3R_2 + R_3}{\implies} \begin{bmatrix} 1 & 1 & 2 & \bigm| & 0 \\ 0 & 1 & 1 & \bigm| & -2 \\ 0 & 0 & 0 & \bigm| & 0 \end{bmatrix}$$

$$\overset{-R_2 + R_1}{\implies} \begin{bmatrix} 1 & 0 & 1 & \bigm| & 2 \\ 0 & 1 & 1 & \bigm| & -2 \\ 0 & 0 & 0 & \bigm| & 0 \end{bmatrix}$$

$z=t$ (free variable)

$x = 2-t$

$y = -2 - t$

$$\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 2 -t\\ -2-t\\ t \end{bmatrix}$$

Reduced Row Echelon Form (rref)

Definition: An $n\times m$ matrix is in reduced row echelon form (rref) provided:

  1. If a row has nonzero entries, the first nonzero entry is a 1, called the leading 1 or pivot.
  2. If a column contains a leading 1, then all other entries in that column are zero.
  3. If a row contains a leading 1, then each row above it has a leading 1 further to the left.

Examples of matrices in reduced row echelon form:

$$\begin{bmatrix} 1 & -7 & 0 & 0\\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix} , \begin{bmatrix} 1 & 0 & 5 & 2\\ 0 & 1 & 2 & 7 \\ 0 & 0 & 0 & 0 \end{bmatrix} , \begin{bmatrix} 1 & 2 & 5 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} , \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$

Row echelon form (ref)

Differences:

  • Leading entry (pivot position) in a row can be anything
  • No restriction on entries above a leading entry in a column
$$\begin{bmatrix} 5 & -7 & 2 & 8\\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & 0 & 0 \end{bmatrix} , \begin{bmatrix} 2 & 7 & 5 & 2\\ 0 & 6 & 2 & 7 \\ 0 & 0 & 0 & 0 \end{bmatrix} , \begin{bmatrix} 5 & 3 & 5 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} , \begin{bmatrix} 0 & 0 & 7 & 7 \\ 0 & 0 & 0 & 6 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$

Using the 3 elementary row operations, we may transform any matrix to one in rref (or ref). This method of solving a linear system is called Gauss-Jordan elimination.
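The procedure can be sketched in Python. This is my own minimal implementation (the function name `rref` is mine), using exact fractions and only the three elementary row operations:

```python
from fractions import Fraction

def rref(M):
    """Return the reduced row echelon form of M (a list of rows),
    using only the three elementary row operations."""
    A = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(A), len(A[0])
    pivot_row = 0
    for col in range(cols):
        # Find a row at or below pivot_row with a nonzero entry (interchange).
        pr = next((r for r in range(pivot_row, rows) if A[r][col] != 0), None)
        if pr is None:
            continue
        A[pivot_row], A[pr] = A[pr], A[pivot_row]
        # Scale the pivot row so the leading entry is 1.
        p = A[pivot_row][col]
        A[pivot_row] = [x / p for x in A[pivot_row]]
        # Clear every other entry in this column.
        for r in range(rows):
            if r != pivot_row and A[r][col] != 0:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[pivot_row])]
        pivot_row += 1
        if pivot_row == rows:
            break
    return A

# The augmented matrix from the worked example above:
M = [[1, 2, 3, 1], [2, 4, 7, 2], [3, 7, 11, 8]]
assert rref(M) == [[1, 0, 0, -9], [0, 1, 0, 5], [0, 0, 1, 0]]
```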

1.3 On the Solutions of Linear Systems: Matrix Algebra


Consider the augmented matrices:

ref with 1 unique solution: $\begin{bmatrix} 2 & 0 & 0 & \bigm| & -3 \\ 0 & 3 & 0 & \bigm| & 3 \\ 0 & 0 & 1 & \bigm| & 14\end{bmatrix}$

rref with infinitely many solutions: $\begin{bmatrix} 1 & 0 & 0 & 0 & 1 & \bigm| & -1 \\ 0 & 1 & 0 & 0 & 1 & \bigm| & 0 \\ 0 & 0 & 1 & 1 & 0 & \bigm| & 2\end{bmatrix}$

ref with 1 unique solution: $\begin{bmatrix} 1 & 0 & 0 & \bigm| & 4 \\ 0 & 1 & 2 & \bigm| & 4 \\ 0 & 0 & 3 & \bigm| & 6 \\ 0 & 0 & 0 & \bigm| & 0 \\ 0 & 0 & 0 & \bigm| & 0 \end{bmatrix}$

ref with no solutions: $\begin{bmatrix} 1 & 0 & 0 & \bigm| & 3 \\ 0 & 1 & 0 & \bigm| & -1 \\ 0 & 0 & 2 & \bigm| & 4 \\ 0 & 0 & 0 & \bigm| & 10 \end{bmatrix}$

A linear system is

  • consistent provided it has at least one solution
  • inconsistent provided it has no solutions

Theorem:

  • A linear system is inconsistent if and only if a row echelon form (ref) of its augmented matrix has a row $\begin{bmatrix} 0 & 0 & 0 & \cdots & 0 & \bigm| & c \end{bmatrix}$ where $c\neq 0$.
  • If a linear system is consistent, then we have either:
    • a unique solution, or
    • infinitely many solutions (at least one free variable)

Rank

The rank of a matrix $A$, denoted $\text{rank}(A)$, is the number of leading 1's in $\text{rref}(A)$ (the reduced row echelon form of $A$).

Example

ref:

$$\begin{bmatrix} 2 & 0 & 0\\ 0 & 3 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

Has rank 3 ($3\times 3$)

Example

rref:

$$\begin{bmatrix} 1 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 & 1 \\ 0 & 0 & 1 & 1 & 0 \end{bmatrix}$$

Has rank 3 ($3\times 5$)

Example

ref: $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 2 \\ 0 & 0 & 3 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$

Rank of 3 ($5\times 3$)

Example

rref: $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$

Rank of 1 ($3\times 3$)

Example

rref: $\begin{bmatrix} 0 & 0 & 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 & 4 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}$

Rank of 3 ($4\times 6$ matrix)

Example

$$\begin{bmatrix} 3 & 3 & 3 \\ 3 & 3 & 3 \end{bmatrix} \overset{\frac{1}{3} R_1}{\implies} \begin{bmatrix} 1 & 1 & 1 \\ 3 & 3 & 3 \end{bmatrix} \overset{-3R_1 + R_2}{\implies} \text{rref}: \begin{bmatrix} 1 & 1 & 1 \\ 0 & 0 & 0 \end{bmatrix}$$

This matrix has rank 1.

Example

$$\begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 3 \\ 1 & 3 & 6 \\ 0 & 0 & 0 \end{bmatrix} \overset{\overset{R_2 - R_1}{-R_1 + R_3}}{\implies} \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 2 \\ 0 & 2 & 5 \\ 0 & 0 & 0 \end{bmatrix} \overset{R_3 - 2R_2}{\implies} \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 2 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}$$

The rank of this matrix is 3.

Example

$$C = \begin{bmatrix} 0 & 1 & a \\ -1 & 0 & b \\ -a & -b & 0 \end{bmatrix} \overset{R_1 \leftrightarrow R_2}{\implies} \begin{bmatrix} -1 & 0 & b \\ 0 & 1 & a \\ -a & -b & 0 \end{bmatrix} \overset{-1 \times R_1}{\implies} \begin{bmatrix} 1 & 0 & -b \\ 0 & 1 & a \\ -a & -b & 0 \end{bmatrix}$$

$$\overset{aR_1 + R_3}{\implies} \begin{bmatrix} 1 & 0 & -b \\ 0 & 1 & a \\ 0 & -b & -ab \end{bmatrix} \overset{bR_2 + R_3}{\implies} \begin{bmatrix} 1 & 0 & -b \\ 0 & 1 & a \\ 0 & 0 & 0 \end{bmatrix}$$

Rank is 2.

Suppose we have an $n \times m$ coefficient matrix

$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nm} \end{bmatrix}$$

$\text{rank}(A) \le n$

$\text{rank}(A) \le m$

Number of free variables $= m - \text{rank}(A)$
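The rank and free-variable count can be computed directly. A small sketch of my own (the `rank` helper counts pivots found during forward elimination), applied to the $2\times 3$ all-threes example above:

```python
from fractions import Fraction

def rank(M):
    """Count the pivots found by forward elimination on M."""
    A = [[Fraction(x) for x in row] for row in M]
    rows, cols, r = len(M), len(M[0]), 0
    for c in range(cols):
        piv = next((i for i in range(r, rows) if A[i][c] != 0), None)
        if piv is None:
            continue                       # no pivot in this column
        A[r], A[piv] = A[piv], A[r]        # interchange rows
        for i in range(r + 1, rows):       # eliminate below the pivot
            f = A[i][c] / A[r][c]
            A[i] = [a - f * b for a, b in zip(A[i], A[r])]
        r += 1
    return r

A = [[3, 3, 3], [3, 3, 3]]                 # the 2x3 example above
m = len(A[0])
assert rank(A) == 1
assert m - rank(A) == 2                    # number of free variables
```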

If a linear system with coefficient matrix AA has:

  • exactly one solution, then we have no free variables. Therefore $\text{rank}(A) = m$.
  • no solutions, then a ref of the augmented matrix has a row $\begin{bmatrix} 0 & \cdots & 0 & \bigm| & b \end{bmatrix}$ where $b\neq 0$. Therefore $\text{rank}(A) < n$.
  • infinitely many solutions, then there is at least one free variable, so $\text{rank}(A) < m$.

Square Matrices: When a linear system has an $n \times n$ coefficient matrix $A$, there is exactly one solution…

if and only if $\text{rank}(A) = n$

if and only if $\text{rref}(A) = I_n$ (the $n \times n$ identity)

Matrix Algebra

Suppose $A = [a_{ij}]$ and $B = [b_{ij}]$ are both $n \times m$ and $c$ is in $\mathbb{R}$.

Matrix Sum: $A+B = [a_{ij} + b_{ij}]$ (add entry by entry)

Scalar Multiplication: $cA = [ca_{ij}]$

Example

$$\begin{bmatrix} 2 & 3 \\ 5 & -2 \\ -1 & 0 \end{bmatrix} + \begin{bmatrix} -1 & 6 \\ 3 & 0 \\ 0 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 9 \\ 8 & -2 \\ -1 & 2 \end{bmatrix}$$

Example

$$5 \begin{bmatrix} 2 & 3 & -1 \\ 1 & 3 & -3 \end{bmatrix} = \begin{bmatrix} 10 & 15 & -5 \\ 5 & 15 & -15 \end{bmatrix}$$

Example

Vector Sum and Scalar Multiplication

$$\vec{v} = \begin{bmatrix} 4\\ 3 \\ 1 \end{bmatrix}, \quad \vec{w} = \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix}, \quad \vec{v} + \vec{w} = \begin{bmatrix} 4 \\ 4 \\ 0 \end{bmatrix}$$

What about matrix/vector products?

  1. Dot product for 2 vectors in $\mathbb{R}^n$
  2. $A \vec{x}$ (matrix times vector)

Definition:

For vectors $\vec{v} = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix}$ and $\vec{w} = \begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_n \end{bmatrix}$ in $\mathbb{R}^n$, the dot product $\vec{v} \cdot \vec{w}$ is the scalar:

$$\vec{v} \cdot \vec{w} = v_1 w_1 + v_2 w_2 + \cdots + v_n w_n = \sum_{k=1}^{n} v_k w_k$$

Note: dot product does not distinguish between row vectors and column vectors.

Example

$$\begin{bmatrix} 5 \\ 2 \\ -3 \end{bmatrix} \cdot \begin{bmatrix} 1 \\ -1 \\ -1 \end{bmatrix} = 5(1) + 2(-1) + (-3)(-1) = 5 - 2 + 3 = 6$$
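The definition translates almost directly into code. A small sketch of my own, checked against the example above:

```python
def dot(v, w):
    """Dot product of two same-length vectors: sum of v_k * w_k."""
    assert len(v) == len(w)
    return sum(vk * wk for vk, wk in zip(v, w))

assert dot([5, 2, -3], [1, -1, -1]) == 6   # 5 - 2 + 3, as above
```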

An important way to think about the dot product:

$$\begin{bmatrix} 5 & 2 & -3 \end{bmatrix} \begin{bmatrix} 1 \\ -1 \\ -1 \end{bmatrix}$$

The product $A\vec{x}$: Suppose $A$ is $n\times m$ and $\vec{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{bmatrix}$

Size: $\left( n\times m \right) \left( m \times 1\right) \to n \times 1$

Way 1: Row Viewport

$$A = \begin{bmatrix} -- \vec{w_1} -- \\ -- \vec{w_2} -- \\ \vdots \\ -- \vec{w_n} -- \end{bmatrix}$$

Note: $\vec{w}_i \in \mathbb{R}^m$

$$A\vec{x} = \begin{bmatrix} \vec{w_1} \cdot \vec{x} \\ \vec{w_2} \cdot \vec{x} \\ \vdots \\ \vec{w_n} \cdot \vec{x} \end{bmatrix}$$

(Size $n \times 1$)

Way 2: Column Viewport

$$A = \begin{bmatrix} | & | & & | \\ \vec{v_1} & \vec{v_2} & \cdots & \vec{v_m} \\ | & | & & | \end{bmatrix}$$

Note: $\vec{v_j} \in \mathbb{R}^n$

$$A \vec{x} = x_1 \vec{v_1} + x_2 \vec{v_2} + \cdots + x_m \vec{v_m}$$

(Size $n \times 1$)

Example

$$\begin{bmatrix} 5 & -1 & 2 & 6 \\ 4 & 3 & 0 & 1 \\ -1 & 0 & 2 & -1 \end{bmatrix} \begin{bmatrix} 0 \\ 2 \\ -1 \\ 3 \end{bmatrix} = 0 \begin{bmatrix} 5 \\ 4 \\ -1 \end{bmatrix} + 2 \begin{bmatrix} -1 \\ 3 \\ 0 \end{bmatrix} - 1 \begin{bmatrix} 2 \\ 0 \\ 2 \end{bmatrix} + 3 \begin{bmatrix} 6 \\ 1 \\ -1 \end{bmatrix} = \begin{bmatrix} 14 \\ 9 \\ -5 \end{bmatrix}$$
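Both viewports give the same product, which can be checked in code. This is my own sketch (function names are mine), using the matrix and vector from the example above:

```python
def matvec_rows(A, x):
    """Row viewport: the ith entry is (row i of A) . x."""
    return [sum(a * xk for a, xk in zip(row, x)) for row in A]

def matvec_cols(A, x):
    """Column viewport: x_1*col_1 + x_2*col_2 + ... + x_m*col_m."""
    n, m = len(A), len(A[0])
    out = [0] * n
    for j in range(m):
        for i in range(n):
            out[i] += x[j] * A[i][j]
    return out

A = [[5, -1, 2, 6], [4, 3, 0, 1], [-1, 0, 2, -1]]
x = [0, 2, -1, 3]
assert matvec_rows(A, x) == matvec_cols(A, x) == [14, 9, -5]
```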

Example

$$\begin{bmatrix} 5 & -1 & 2 & 6 \\ 4 & 3 & 0 & 1 \\ -1 & 0 & 2 & -1 \end{bmatrix} \begin{bmatrix} 2 \\ 3 \\ 2 \end{bmatrix}$$

The product is not defined ($3 \times 4$ times $3 \times 1$: the sizes do not match).

Example

$$\begin{bmatrix} 5 & -2 \\ 3 & 1 \\ 1 & 4 \\ -1 & 0 \\ 0 & 6 \end{bmatrix} \begin{bmatrix} 2 \\ -1 \end{bmatrix} = \begin{bmatrix} 10 + 2 \\ 6 - 1 \\ 2 - 4 \\ -2 + 0 \\ 0 - 6 \end{bmatrix} = \begin{bmatrix} 12 \\ 5 \\ -2 \\ -2 \\ -6 \end{bmatrix}$$

Definition:

A vector $\vec{b}$ in $\mathbb{R}^n$ is a linear combination of $\vec{v_1},\ \vec{v_2},\ \cdots,\ \vec{v_m}$ in $\mathbb{R}^n$ provided there exist scalars $x_1,\ x_2,\ x_3,\ \cdots ,\ x_m$ with $\vec{b} = x_1 \vec{v_1} + x_2 \vec{v_2} + x_3 \vec{v_3} + \cdots + x_m \vec{v_m}$.

Example

$\begin{bmatrix} 4 \\ 10 \\ 2 \\ -3 \end{bmatrix}$ is a linear combination of $\begin{bmatrix} 0 \\ 2 \\ 0 \\ -1 \end{bmatrix}$ and $\begin{bmatrix} 2 \\ 0 \\ 1 \\ 1 \end{bmatrix}$:

$$\begin{bmatrix} 4 \\ 10 \\ 2 \\ -3 \end{bmatrix} = 5 \begin{bmatrix} 0 \\ 2 \\ 0 \\ -1 \end{bmatrix} + 2 \begin{bmatrix} 2 \\ 0 \\ 1 \\ 1 \end{bmatrix}$$

Example

$\begin{bmatrix} 4 \\ 10 \\ 2 \\ -3 \end{bmatrix}$ is a linear combination of $\vec{e_1} = \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}$, $\vec{e_2} = \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix}$, $\vec{e_3} = \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \end{bmatrix}$, and $\vec{e_4} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix}$.

In $\mathbb{R}^n$, for $1 \le i \le n$: $\vec{e_i}$ has a 1 in the $i$th spot and 0's elsewhere.

$$\begin{bmatrix} 4 \\ 10 \\ 2 \\ -3 \end{bmatrix} = 4 \vec{e_1} + 10 \vec{e_2} + 2 \vec{e_3} - 3 \vec{e_4}$$

Adding vectors with the parallelogram rule:

lec3-fig1

Example

$\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$ in $\mathbb{R}^3$ is not a linear combination of $\vec{e_1}$ and $\vec{e_2}$. Linear combinations of $\vec{e_1}$ and $\vec{e_2}$ only fill out the $xy$-plane; they can never leave it to reach the $z$-axis.

Example

Let $\vec{b} = \begin{bmatrix} 4 \\ 10 \\ 2 \\ -3 \end{bmatrix}$. Is $\vec{b}$ a linear combination of $\vec{v} = \begin{bmatrix} 4 \\ 2 \\ 1 \\ -1 \end{bmatrix}$ and $\vec{w} = \begin{bmatrix} 2 \\ -1 \\ 1 \\ 1 \end{bmatrix}$?

What we want: scalars $x_1$, $x_2$ with:

$$x_1 \begin{bmatrix} 4 \\ 2 \\ 1 \\ -1 \end{bmatrix} + x_2 \begin{bmatrix} 2 \\ -1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 4 \\ 10 \\ 2 \\ -3 \end{bmatrix}$$

(We will finish this next lecture)

Quiz 1 Preparation

Example

Solve the linear system by elementary row operations.

$$\begin{bmatrix} 1 & 6 & 2 & -5 & \bigm| & 3 \\ 0 & 0 & 2 & -8 & \bigm| & 2 \\ 1 & 6 & 1 & -1 & \bigm| & 2 \end{bmatrix} \overset{-R_1 + R_3}{\implies} \begin{bmatrix} 1 & 6 & 2 & -5 & \bigm| & 3 \\ 0 & 0 & 2 & -8 & \bigm| & 2 \\ 0 & 0 & -1 & 4 & \bigm| & -1 \end{bmatrix}$$

$$\overset{\frac{1}{2} R_2}{\implies} \begin{bmatrix} 1 & 6 & 2 & -5 & \bigm| & 3 \\ 0 & 0 & 1 & -4 & \bigm| & 1 \\ 0 & 0 & -1 & 4 & \bigm| & -1 \end{bmatrix} \overset{R_2 + R_3}{\implies} \begin{bmatrix} 1 & 6 & 2 & -5 & \bigm| & 3 \\ 0 & 0 & 1 & -4 & \bigm| & 1 \\ 0 & 0 & 0 & 0 & \bigm| & 0 \end{bmatrix}$$

$$\overset{-2R_2 + R_1}{\implies} \begin{bmatrix} 1 & 6 & 0 & 3 & \bigm| & 1 \\ 0 & 0 & 1 & -4 & \bigm| & 1 \\ 0 & 0 & 0 & 0 & \bigm| & 0 \end{bmatrix}$$

$x_2 = s$ (free variable)

$x_4 = t$ (free variable)

$x_1 = 1 - 6s - 3t$

$x_3 = 1 + 4t$

$$\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} = \begin{bmatrix} 1-6s-3t \\ s \\ 1+4t \\ t \end{bmatrix}$$

Example

Find all polynomials of the form $f(t) = a + bt + ct^2$ with the point $(1,\ 6)$ on the graph of $f$, such that $f'(2) = 9$ and $f''(8) = 4$.

$f'(t) = b + 2ct$

$f''(t) = 2c$

$f(1) = 6 \to a + b + c = 6$

$f'(2) = 9 \to b + 4c = 9$

$f''(8) = 4 \to 2c = 4$

$c = 2$

$b + 8 = 9 \implies b = 1$

$a + 1 + 2 = 6 \implies a=3$

$f(t) = 3 + t + 2t^2$
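Again the answer can be sanity-checked by substitution (my addition, not part of the quiz):

```python
# f(t) = 3 + t + 2t^2 with f'(t) = 1 + 4t and f''(t) = 4.
def f(t):
    return 3 + t + 2*t**2

def f1(t):           # first derivative
    return 1 + 4*t

def f2(t):           # second derivative (constant)
    return 4

assert f(1) == 6     # (1, 6) is on the graph
assert f1(2) == 9
assert f2(8) == 4
```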

Example

Find one value $c$ so that the augmented matrix below corresponds to an inconsistent linear system.

$$\begin{bmatrix} 1 & 2 & -1 & \bigm| & 3 \\ 2 & 4 & -2 & \bigm| & c \end{bmatrix}$$

Note that for an inconsistent linear system, you need a row of the form $\begin{bmatrix} 0 & 0 & 0 & \bigm| & b \end{bmatrix}$ with $b \neq 0$:

$$\begin{bmatrix} 1 & 2 & -1 & \bigm| & 3 \\ 2 & 4 & -2 & \bigm| & c \end{bmatrix} \overset{2R_1 - R_2}{\implies} \begin{bmatrix} 1 & 2 & -1 & \bigm| & 3 \\ 0 & 0 & 0 & \bigm| & 6 - c \end{bmatrix}$$

So the system is inconsistent when $c \neq 6$.

Example

Consider the matrices $A$, $B$, $C$, $D$ below.

$$A = \begin{bmatrix} 1 & 3 & 0 & -1 & 5 \\ 0 & 1 & 0 & 9 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 1 & 4 \end{bmatrix} \quad B = \begin{bmatrix} 0 & 1 & 6 & 0 & 3 & -1 \\ 0 & 0 & 0 & 1 & 2 & 2 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix} \quad C = \begin{bmatrix} 0 & 1 & 0 & 2 & 4 \end{bmatrix} \quad D = \begin{bmatrix} 0 \\ 1 \\ 0 \\ 2 \\ 4 \end{bmatrix}$$

a) Which of the matrices are in reduced row-echelon form (rref)?

Solution

B, C

b) List the rank of each matrix

Solution

rank($A$) = 3

rank($B$) = 2

rank($C$) = rank($D$) = 1

A linear system is consistent if and only if the rank of the coefficient matrix equals the rank of the augmented matrix. For example, this row would change the rank:

$$\begin{bmatrix} \vdots & \bigm| & \vdots \\ 0 & \bigm| & 1 \end{bmatrix}$$

Recall

$A \vec{x}$ for $A$ an $n \times m$ matrix and $\vec{x} = \begin{bmatrix} x_1 \\ \vdots \\ x_m \end{bmatrix}$

Row Viewport:

Suppose $\vec{w_1}, \vec{w_2}, \cdots, \vec{w_n}$ in $\mathbb{R}^m$ are the rows of $A$; then:

$$A\vec{x} = \begin{bmatrix} \vec{w_1} \cdot \vec{x} \\ \vec{w_2} \cdot \vec{x} \\ \vdots \\ \vec{w_n} \cdot \vec{x} \end{bmatrix}$$

The $i$th entry of $A \vec{x}$ is (row $i$ of $A$) $\cdot \vec{x}$

Column Viewport:

Suppose $\vec{v_1},\ \vec{v_2},\ \cdots ,\ \vec{v_m}$ in $\mathbb{R}^n$ are the columns of $A$, i.e. $A = \begin{bmatrix} | & | && | \\ \vec{v_1} & \vec{v_2} & \cdots & \vec{v_m} \\ | & | && | \end{bmatrix}$

Then $A \vec{x} = x_1 \vec{v_1} + x_2 \vec{v_2} + \cdots + x_m \vec{v_m}$

Properties of the product $A\vec{x}$: Suppose $A$ is $n\times m$, $\vec{x}$, $\vec{y}$ are in $\mathbb{R}^m$, and $k$ is a scalar.

  1. $A(\vec{x} + \vec{y}) = A\vec{x} + A\vec{y}$
  2. $A(k\vec{x}) = kA\vec{x}$

Justification of 2:

$$k\vec{x} = \begin{bmatrix} kx_1 \\ kx_2 \\ \vdots \\ kx_m \end{bmatrix}$$

$$A(k\vec{x}) = (kx_1) \vec{v_1} + (kx_2)\vec{v_2} + \cdots + (kx_m) \vec{v_m}$$

$$= k(x_1 \vec{v_1} + x_2 \vec{v_2} + \cdots + x_m \vec{v_m})$$

$$= kA\vec{x}$$
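Both properties can also be spot-checked numerically. A sketch of my own, with an arbitrarily chosen matrix and vectors (not from the lecture):

```python
def matvec(A, x):
    """Row-viewport matrix-vector product."""
    return [sum(a * xk for a, xk in zip(row, x)) for row in A]

A = [[5, 1, 3], [4, -1, 6], [2, 0, 7]]
x, y, k = [1, 2, 3], [0, -1, 4], 5

# Property 1: A(x + y) = Ax + Ay
lhs1 = matvec(A, [xi + yi for xi, yi in zip(x, y)])
rhs1 = [s + t for s, t in zip(matvec(A, x), matvec(A, y))]
assert lhs1 == rhs1

# Property 2: A(kx) = k(Ax)
lhs2 = matvec(A, [k * xi for xi in x])
rhs2 = [k * s for s in matvec(A, x)]
assert lhs2 == rhs2
```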

We continue with this question: is $\begin{bmatrix} 4 \\ 10 \\ 2 \\ -3 \end{bmatrix}$ a linear combination of $\begin{bmatrix} 4 \\ 2 \\ 1 \\ -1 \end{bmatrix}$ and $\begin{bmatrix} 2 \\ -1 \\ 1 \\ 1 \end{bmatrix}$?

Can we find scalars $x_1$, $x_2$ such that $x_1 \begin{bmatrix} 4 \\ 2 \\ 1 \\ -1 \end{bmatrix} + x_2 \begin{bmatrix} 2 \\ -1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 4 \\ 10 \\ 2 \\ -3 \end{bmatrix}$?

Is there a solution to the linear system $\begin{bmatrix} 4 & 2 & \bigm| & 4 \\ 2 & -1 & \bigm| & 10 \\ 1 & 1 & \bigm| & 2 \\ -1 & 1 & \bigm| & -3 \end{bmatrix}$?

$$\begin{bmatrix} 4 & 2 & \bigm| & 4 \\ 2 & -1 & \bigm| & 10 \\ 1 & 1 & \bigm| & 2 \\ -1 & 1 & \bigm| & -3 \end{bmatrix} \overset{R_1 \leftrightarrow R_3}{\implies} \begin{bmatrix} 1 & 1 & \bigm| & 2 \\ 2 & -1 & \bigm| & 10 \\ 4 & 2 & \bigm| & 4 \\ -1 & 1 & \bigm| & -3 \end{bmatrix} \implies \begin{bmatrix} 1 & 1 & \bigm| & 2 \\ 0 & -3 & \bigm| & 6 \\ 0 & -2 & \bigm| & -4 \\ 0 & 2 & \bigm| & -1 \end{bmatrix} \implies \begin{bmatrix} 1 & 1 & \bigm| & 2 \\ 0 & 1 & \bigm| & -2 \\ 0 & 0 & \bigm| & -8 \\ 0 & 0 & \bigm| & 3 \end{bmatrix}$$

This linear system is inconsistent, so: no, there is no solution.

We see

$$\begin{bmatrix} 4 & 2 & \bigm| & 4 \\ 2 & -1 & \bigm| & 10 \\ 1 & 1 & \bigm| & 2 \\ -1 & 1 & \bigm| & -3 \end{bmatrix} \leftrightarrow \begin{bmatrix} 4 & 2 \\ 2 & -1 \\ 1 & 1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 4 \\ 10 \\ 2 \\ -3 \end{bmatrix}$$

This correspondence works generally:

  • A linear system with augmented matrix $\begin{bmatrix} A & \bigm| & \vec{b} \end{bmatrix}$ can be written in matrix form as $A\vec{x} = \vec{b}$.

Moreover, this system is consistent if and only if $\vec{b}$ is a linear combination of the columns of $A$. (More in sections 3.1-3.3, 5.4)
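The consistency test from the theorem above can be automated: row reduce the augmented matrix and look for a row $(0 \cdots 0 \mid c)$ with $c \neq 0$. A sketch of my own (the helper names `rref` and `consistent` are mine), applied to the linear-combination question just worked:

```python
from fractions import Fraction

def rref(M):
    """Reduced row echelon form via Gauss-Jordan elimination."""
    A = [[Fraction(x) for x in row] for row in M]
    rows, cols, r = len(A), len(A[0]), 0
    for c in range(cols):
        piv = next((i for i in range(r, rows) if A[i][c] != 0), None)
        if piv is None:
            continue
        A[r], A[piv] = A[piv], A[r]
        A[r] = [v / A[r][c] for v in A[r]]
        for i in range(rows):
            if i != r and A[i][c] != 0:
                f = A[i][c]
                A[i] = [a - f * b for a, b in zip(A[i], A[r])]
        r += 1
    return A

def consistent(aug):
    """Inconsistent iff rref has a row (0 ... 0 | c) with c != 0."""
    for row in rref(aug):
        if all(v == 0 for v in row[:-1]) and row[-1] != 0:
            return False
    return True

# Augmented matrix for x1*(4,2,1,-1) + x2*(2,-1,1,1) = (4,10,2,-3):
aug = [[4, 2, 4], [2, -1, 10], [1, 1, 2], [-1, 1, -3]]
assert not consistent(aug)   # b is not a combination of the columns
```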

2.1 Introduction to Linear Transformation


Recall that a function $f : \mathbb{R}^m \to \mathbb{R}^n$ is a rule that assigns to each vector in $\mathbb{R}^m$ a unique vector in $\mathbb{R}^n$.

  • Domain: $\mathbb{R}^m$
  • Codomain/target space: $\mathbb{R}^n$
  • Image/range: $\{ f(\vec{x}) : \vec{x} \in \mathbb{R}^m \}$

Example

$f : \mathbb{R}^3 \to \mathbb{R}$ given by $f \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \sqrt{x_1^2 + x_2^2 + x_3^2}$

This gives the length of the vector.

Domain: $\mathbb{R}^3$

Range: $[0, \infty)$

Definition:

A function $T : \mathbb{R}^m \to \mathbb{R}^n$ is a linear transformation provided there exists an $n \times m$ matrix $A$ such that $T(\vec{x}) = A\vec{x}$ for all $\vec{x} \in \mathbb{R}^m$.

Comments:

  • “Linear functions” in calculus 1/2/3: graph is a line/plane/3-space

Examples:

$f(x) = 5x + 4$

$f(x,\ y) = 2x - 3y + 8$

But not all of these are linear transformations. These should be called affine.

  • For any $n\times m$ matrix $A$, $A\vec{0} = \vec{0}$; so for any linear transformation $T$: $T(\vec{0}) = \vec{0}$.

Example

For scalars $a$, $b$, $c$, the function $g(x,\ y,\ z) = ax + by + cz$ is a linear transformation.

$g : \mathbb{R}^3 \to \mathbb{R}$

$$g \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} a & b & c \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix}$$

The matrix of $g$ is: $\begin{bmatrix} a & b & c \end{bmatrix}$

Example

The function $f(x) = \begin{bmatrix} x \\ 5 \\ -x \end{bmatrix}$ is not a linear transformation.

$f : \mathbb{R} \to \mathbb{R}^3$

$f(0) = \begin{bmatrix} 0 \\ 5 \\ 0 \end{bmatrix} \neq \vec{0}$

Therefore $f$ is not a linear transformation.

Question: What is the linear transformation corresponding to I3=[100010001]I_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}?

[100010001][xyz]=x[100]+y[010]+z[001]=[xyz]\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = x \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + y \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} + z \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} x \\ y \\ z \end{bmatrix}

Answer: Identity map. It maps every vector to itself.

Consider T(x)=AxT(\vec{x}) = A\vec{x} where A=[513416207325]A = \begin{bmatrix} 5 & 1 & 3 \\ 4 & -1 & 6 \\ 2 & 0 & 7 \\ 3 & 2 & 5 \end{bmatrix}. Find T(e1)T(\vec{e_1}), T(e2)T(\vec{e_2}), and T(e3)T(\vec{e_3}).

Recall that e1=[100]\vec{e_1} = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, e2=[010]\vec{e_2} = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, e3=[001]\vec{e_3} = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}

Note: AA is 4×34\times 3. T:R3R4T : \mathbb{R}^3 \to \mathbb{R}^4

T(e1)=[5423]T(\vec{e_1}) = \begin{bmatrix} 5 \\ 4 \\ 2 \\ 3 \end{bmatrix} T(e2)=[1102]T(\vec{e_2}) = \begin{bmatrix} 1 \\ -1 \\ 0 \\ 2 \end{bmatrix} T(e3)=[3675]T(\vec{e_3}) = \begin{bmatrix} 3 \\ 6 \\ 7 \\ 5 \end{bmatrix}
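A quick numerical sketch of this example (assuming NumPy is available): multiplying A by each standard vector picks out the corresponding column of A.

```python
import numpy as np

# Matrix from the example above (4x3, so T : R^3 -> R^4)
A = np.array([[5, 1, 3],
              [4, -1, 6],
              [2, 0, 7],
              [3, 2, 5]])

E = np.eye(3)  # columns are the standard vectors e_1, e_2, e_3

for i in range(3):
    # T(e_i) = A e_i is exactly the i-th column of A
    assert np.array_equal(A @ E[:, i], A[:, i])
```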

Suppose T:RmRnT : \mathbb{R}^m \to \mathbb{R}^n is a linear transformation.

The matrix of TT is

[T(e1)T(e2)T(em)]\begin{bmatrix} | & | & & | \\ T(\vec{e_1}) & T(\vec{e_2}) & \cdots & T(\vec{e_m}) \\ | & | & & | \\ \end{bmatrix}

Where \vec{e_1},\ \vec{e_2},\ \cdots ,\ \vec{e_m} are the standard vectors in \mathbb{R}^m, i.e. \vec{e}_i has a 1 in the ith spot and 0’s elsewhere.

Example

Find the matrix of the transformation T:R4R2T : \mathbb{R}^4 \to \mathbb{R}^2 given by T[x1x2x3x4]=[x42x2]T \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} = \begin{bmatrix} x_4 \\ 2x_2 \end{bmatrix}.

T(e1)=[00]T(\vec{e_1}) = \begin{bmatrix} 0 \\ 0 \end{bmatrix}

T(e2)=[02]T(\vec{e_2}) = \begin{bmatrix} 0 \\ 2 \end{bmatrix}

T(e3)=[00]T(\vec{e_3}) = \begin{bmatrix} 0 \\ 0 \end{bmatrix}

T(e4)=[10]T(\vec{e_4}) = \begin{bmatrix} 1 \\ 0 \end{bmatrix}

A=[00010200]A = \begin{bmatrix} 0 & 0 & 0 & 1 \\ 0 & 2 & 0 & 0 \end{bmatrix}

Check:

[00010200][x1x2x3x4]=[x42x2]\begin{bmatrix} 0 & 0 & 0 & 1 \\ 0 & 2 & 0 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} = \begin{bmatrix} x_4 \\ 2x_2 \end{bmatrix}
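The same check can be run numerically (a NumPy sketch; the test vector x is chosen arbitrarily):

```python
import numpy as np

# Matrix found above for T(x1, x2, x3, x4) = (x4, 2*x2)
A = np.array([[0, 0, 0, 1],
              [0, 2, 0, 0]])

x = np.array([7, -3, 5, 2])  # an arbitrary test vector

# A x reproduces the formula T(x) = (x4, 2*x2)
assert np.array_equal(A @ x, np.array([x[3], 2 * x[1]]))
```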

Example

Find the matrix of this transformation from R2\mathbb{R}^2 to R4\mathbb{R}^4 given by y1=9x1+3x2y2=2x19x2y3=4x19x2y4=5x1+x2\begin{vmatrix} y_1 = 9x_1 + 3x_2 \\ y_2 = 2x_1 - 9x_2 \\ y_3 = 4x_1 - 9x_2 \\ y_4 = 5x_1 + x_2 \end{vmatrix}.

T\left( \vec{e}_1 \right) = T\left( \begin{bmatrix} 1 \\ 0 \end{bmatrix} \right) = \begin{bmatrix} 9 \\ 2 \\ 4 \\ 5 \end{bmatrix}

T\left( \vec{e}_2 \right) = T \left( \begin{bmatrix} 0 \\ 1 \end{bmatrix} \right) = \begin{bmatrix} 3 \\ -9 \\ -9 \\ 1 \end{bmatrix}

A=[93294951]A = \begin{bmatrix} 9 & 3 \\ 2 & -9 \\ 4 & -9 \\ 5 & 1 \end{bmatrix}

Theorem:

A function T:RmRnT : \mathbb{R}^m \to \mathbb{R}^n is a linear transformation if and only if TT satisfies:

  • T(\vec{v} + \vec{w}) = T(\vec{v}) + T(\vec{w}) for all \vec{v}, \vec{w} in \mathbb{R}^m
  • T(k\vec{v}) = kT(\vec{v}) for all \vec{v} in \mathbb{R}^m and scalars k.

Proof:

If T:RmRnT : \mathbb{R}^m \to \mathbb{R}^n is a linear transformation, there is an n×mn \times m matrix AA with T(x)=AxT(\vec{x}) = A\vec{x}. (1) and (2) hold from matrix properties.

Conversely, assume T : \mathbb{R}^m \to \mathbb{R}^n satisfies (1) and (2). We must find a matrix A with T(\vec{x}) = A\vec{x} for all \vec{x} in \mathbb{R}^m.

Let A=[T(e1)T(e2)T(em)]A = \begin{bmatrix} | & | & & | \\ T(\vec{e_1}) & T(\vec{e_2}) & \cdots & T(\vec{e_m}) \\ | & | & & | \end{bmatrix} . Let x=[x1xm]\vec{x} = \begin{bmatrix} x_1 \\ \vdots \\ x_m \end{bmatrix}

Ax=x1T(e1)+x2T(e2)++xmT(em)A \vec{x} = x_1 T(\vec{e_1}) + x_2 T(\vec{e_2}) + \cdots + x_m T(\vec{e_m})

Ax=T(x1e1)+T(x2e2)++T(xmem)A \vec{x} = T(x_1 \vec{e_1}) + T (x_2 \vec{e_2}) + \cdots + T (x_m \vec{e_m}) (property 2)

Ax=T(x1e1+x2e2++xmem)A \vec{x} = T(x_1 \vec{e_1} + x_2 \vec{e_2} + \cdots + x_m \vec{e_m}) (property 1)

Ax=T(x)A \vec{x} = T(\vec{x}) as x=x1e1+x2e2++xmem\vec{x} = x_1 \vec{e_1} + x_2 \vec{e_2} + \cdots + x_m \vec{e_m}

Example

Show the transformation T : \mathbb{R}^2 \to \mathbb{R}^2 is not linear, where T is given by:

y1=x12y_1 = x_1^2

y2=x1+x2y_2 = x_1 + x_2

f[x1x2]=[x12x1+x2]f \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} x_1^2 \\ x_1 + x_2 \end{bmatrix} f[11]=[121+1]=[12]f \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 1^2 \\ 1 + 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \end{bmatrix} f[11]=[(1)211]=[12][12]f \begin{bmatrix} -1 \\ -1 \end{bmatrix} = \begin{bmatrix} (-1)^2 \\ -1 -1 \end{bmatrix} = \begin{bmatrix} 1 \\ -2 \end{bmatrix} \neq - \begin{bmatrix} 1 \\ 2 \end{bmatrix}

More generally:

T([11])T([11])T \left( -\begin{bmatrix} 1 \\ 1 \end{bmatrix} \right) \neq - T \left( \begin{bmatrix} 1 \\ 1 \end{bmatrix} \right)

This fails property 2. Therefore, this is not a linear transformation.

Example

Recall the function f : \mathbb{R}^3 \to \mathbb{R} given by f \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \sqrt{x_1^2 + x_2^2 + x_3^2}. Show that f is not a linear transformation.

f[100]=(1)2+0+0=1f \begin{bmatrix} -1 \\ 0 \\ 0 \end{bmatrix} = \sqrt{\left( -1 \right) ^{2} + 0 + 0} = 1

1f[100]=11+0+0=1-1 f \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} = -1 \sqrt{1 + 0 + 0} =-1

f(e1)f(e1)f(-\vec{e_1}) \neq -f (\vec{e_1}) (fails property 2)

or

f(e1)=1f(\vec{e_1}) = 1

f(e2)=1f(\vec{e_2}) = 1

f(e1+e2)=f[110]=1+1+0=2f(\vec{e_1} + \vec{e_2}) = f \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} = \sqrt{1 + 1 + 0} = \sqrt{2}

f(e1+e2)f(e1)+f(e2)f(\vec{e_1} + \vec{e_2}) \neq f(\vec{e_1}) + f(\vec{e_2}) (fails property 1)
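Both failures above can be checked numerically (a NumPy sketch, using `np.linalg.norm` for f):

```python
import numpy as np

f = np.linalg.norm  # f(x) = sqrt(x1^2 + x2^2 + x3^2)

e1 = np.array([1.0, 0.0, 0.0])
e2 = np.array([0.0, 1.0, 0.0])

# Fails property 2: f(-e1) = 1, but -f(e1) = -1
assert f(-e1) != -f(e1)

# Fails property 1: f(e1 + e2) = sqrt(2), but f(e1) + f(e2) = 2
assert not np.isclose(f(e1 + e2), f(e1) + f(e2))
```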

2.2 Linear Transformations in Geometry

2.2 Linear Transformations in Geometry

Suppose T:R2R2T : \mathbb{R}^2 \to \mathbb{R}^2 is a linear transformation. Geometrically, we will discuss:

  • Orthogonal projection
  • Reflection
  • Scaling
  • Rotation
  • Horizontal or vertical shear

Background material (Geometry): See Appendix A in textbook

  • v\mid \mid \vec{v} \mid \mid length (magnitude, norm) of v\vec{v} in Rn\mathbb{R}^n

\mid \mid \vec{v} \mid \mid = \sqrt{\vec{v} \cdot \vec{v}} = \sqrt{v_1^2 + v_2^2 + \cdots + v_n^2}

v=[v1v2vn]\vec{v} = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix}

  • If cc is a scalar and vRn, cv=cv\vec{v} \in \mathbb{R}^n,\ \mid \mid c \vec{v} \mid \mid = \mid c \mid \mid \mid \vec{v} \mid \mid. Here c \mid c \mid is absolute value of cc.

  • A vector \vec{u} \in \mathbb{R}^n is a unit vector provided

u=1 \mid \mid \vec{u} \mid \mid =1

Example: \vec{e_1},\ \vec{e_2}, \cdots ,\ \vec{e_n} are all unit vectors.

  • Two vectors v, w\vec{v},\ \vec{w} in Rn\mathbb{R}^n are orthogonal (perpendicular, normal) provided

\vec{v} \cdot \vec{w} = 0 (the angle between \vec{v} and \vec{w} is a right angle)

  • Two nonzero vectors \vec{v}, \vec{w} in \mathbb{R}^n are parallel provided they are scalar multiples of each other

Example

Let v=[621]\vec{v} = \begin{bmatrix} 6 \\ -2 \\ -1 \end{bmatrix} and w=[252]\vec{w} = \begin{bmatrix} 2 \\ 5 \\2 \end{bmatrix}

1) Show v\vec{v} and w\vec{w} are perpendicular

vw=6(2)+(2)(5)+(1)(2)=0\vec{v} \cdot \vec{w} = 6(2) + (-2)(5) + (-1)(2) = 0

2) Find two unit vectors parallel to w\vec{w}

w=22+52+22=4+25+4=33\mid \mid \vec{w} \mid \mid = \sqrt{2^{2} + 5^{2} + 2^{2}} = \sqrt{4 + 25 + 4} = \sqrt{33}

ww=133[252]\frac{\vec{w}}{ \mid \mid \vec{w} \mid \mid } = \frac{1}{\sqrt{33}} \begin{bmatrix} 2 \\ 5 \\ 2 \end{bmatrix} (the length of unit vectors must be 1)

Sometimes this is called the normalization of w\vec{w} or the direction of w\vec{w}.

and 133[252]\frac{-1}{\sqrt{33}} \begin{bmatrix} 2 \\ 5 \\ 2 \end{bmatrix}
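The normalization step is easy to verify in code (a NumPy sketch of the example above):

```python
import numpy as np

w = np.array([2.0, 5.0, 2.0])
u = w / np.linalg.norm(w)  # the normalization w / ||w||

# u and -u are the two unit vectors parallel to w
assert np.isclose(np.linalg.norm(u), 1.0)
assert np.isclose(np.linalg.norm(-u), 1.0)
```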

Adding Vectors (triangle rule and parallelogram rule):

lec5-fig1

Properties of the Dot Product

Consider u,v,wRn\vec{u}, \vec{v}, \vec{w} \in \mathbb{R}^{n} and kk scalar.

  1. vw=wv\vec{v} \cdot \vec{w} = \vec{w} \cdot \vec{v}
  2. k(vw)=(kv)w=v(kw)k\left( \vec{v} \cdot \vec{w} \right) = \left( k \vec{v} \right) \cdot \vec{w} = \vec{v} \cdot \left( k \vec{w} \right)
  3. u(v+w)=uv+uw\vec{u} \cdot \left( \vec{v} + \vec{w} \right) = \vec{u} \cdot \vec{v} + \vec{u} \cdot \vec{w}

Orthogonal Projection onto a line LL

Suppose L is a line in \mathbb{R}^{n} and \vec{w} is a nonzero vector with L = \text{span}\{\vec{w}\}.

  • span means all multiples of w\vec{w}

Given x\vec{x} in Rn\mathbb{R}^{n}, we may write x=x+x\vec{x} = \vec{x^{\parallel}} + \vec{x^{\bot}}

x=projLx\vec{x}^{\parallel} = \text{proj}_{L} \vec{x}:

This is the orthogonal projection of x\vec{x} onto LL. Component of x\vec{x} parallel to LL.

lec5-fig2

x=xx\vec{x}^{\bot} = \vec{x} - \vec{x}^{\parallel}:

This is the component of x\vec{x} perpendicular to LL

We want: x=kw\vec{x}^{\parallel} = k \vec{w}. Find kk. We also want:

  • xw=0\vec{x}^{\bot} \cdot \vec{w} = 0

(xx)w=0\left( \vec{x} - \vec{x}^{\parallel} \right) \cdot \vec{w} = 0

(xkw)w=0\left( \vec{x} - k \vec{w}\right) \cdot \vec{w} = 0

xwk(ww)=0\vec{x} \cdot \vec{w} - k \left( \vec{w} \cdot \vec{w} \right) = 0

xw=k(ww)\vec{x} \cdot \vec{w} = k \left( \vec{w} \cdot \vec{w} \right)

k=xwwwk = \frac{\vec{x} \cdot \vec{w}}{\vec{w} \cdot \vec{w}}

The definition of a projection onto a line:

projLx=xwwww\text{proj}_{L} \vec{x} = \frac{\vec{x} \cdot \vec{w}}{\vec{w} \cdot \vec{w}} \vec{w}

Example

Let LL be the line in R3\mathbb{R}^{3} spanned by w=[102]\vec{w} = \begin{bmatrix} 1 \\ 0 \\ -2 \end{bmatrix}.

Find the orthogonal projection of \vec{x} = \begin{bmatrix} 2 \\ 1 \\ -1 \end{bmatrix} onto L and decompose \vec{x} as \vec{x}^{\parallel} + \vec{x}^{\bot}.

xw=2(1)+0+(2)(1)=4\vec{x} \cdot \vec{w} = 2(1) + 0 + (-2)(-1) = 4

\vec{w} \cdot \vec{w} = 1(1) + 0(0) + (-2)(-2) = 5

x=projLx=(xwww)w\vec{x}^{\parallel} = \text{proj}_{L} \vec{x} = \left( \frac{\vec{x} \cdot \vec{w}}{\vec{w} \cdot \vec{w}} \right) \vec{w}

\vec{x}^{\parallel} = \frac{4}{5} \begin{bmatrix} 1 \\ 0 \\ -2 \end{bmatrix} = \begin{bmatrix} \frac{4}{5} \\ 0 \\ -\frac{8}{5} \end{bmatrix}

x=xx=[211][45085]=[65135]\vec{x}^{\bot} = \vec{x} - \vec{x}^{\parallel} = \begin{bmatrix} 2 \\ 1 \\ -1 \end{bmatrix} - \begin{bmatrix} \frac{4}{5} \\ 0 \\ -\frac{8}{5} \end{bmatrix} = \begin{bmatrix} \frac{6}{5} \\ 1 \\ \frac{3}{5} \end{bmatrix}

Check:

\vec{x}^{\bot} \cdot \vec{w} = \begin{bmatrix} \frac{6}{5} \\ 1 \\ \frac{3}{5} \end{bmatrix} \cdot \begin{bmatrix} 1 \\ 0 \\ -2 \end{bmatrix} = \frac{6}{5} - \frac{6}{5} = 0
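The projection formula lends itself to a small helper function; this NumPy sketch reproduces the worked example:

```python
import numpy as np

def proj(x, w):
    """Orthogonal projection of x onto the line spanned by w."""
    return (x @ w) / (w @ w) * w

w = np.array([1.0, 0.0, -2.0])
x = np.array([2.0, 1.0, -1.0])

x_par = proj(x, w)    # (4/5, 0, -8/5)
x_perp = x - x_par    # (6/5, 1, 3/5)

assert np.allclose(x_par, [0.8, 0.0, -1.6])
assert np.isclose(x_perp @ w, 0.0)  # x_perp is orthogonal to w
```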

Linear transformations T:R2R2T : \mathbb{R}^{2} \to \mathbb{R}^{2} and geometry:

Suppose \vec{w} = \begin{bmatrix} w_1 \\ w_2 \end{bmatrix} is a nonzero vector in \mathbb{R}^{2} and L = \text{span}\{\vec{w}\}.

For x\vec{x} in R2\mathbb{R}^{2}, the map xprojL(x)\vec{x} \to \text{proj}_{L}\left( \vec{x} \right) is a linear transformation!

Let’s find the 2×22 \times 2 matrix of orthogonal projection.

projL(e1)=(e1www)w=w1w12+w22[w1w2]\text{proj} _{L} \left( \vec{e} _{1} \right) = \left( \frac{\vec{e} _{1}\cdot \vec{w}}{\vec{w} \cdot \vec{w}} \right) \vec{w} = \frac{w _{1}}{w _{1}^{2} + w _{2}^{2}} \begin{bmatrix} w _1 \\ w _2 \end{bmatrix}

projL(e2)=(e2www)w=w2w12+w22[w1w2]\text{proj} _{L} \left( \vec{e} _{2} \right) = \left( \frac{\vec{e} _{2}\cdot \vec{w}}{\vec{w} \cdot \vec{w}} \right) \vec{w} = \frac{w _{2}}{w _{1}^{2} + w _{2}^{2}} \begin{bmatrix} w _1 \\ w _2 \end{bmatrix}

Matrix: 1w12+w22[w12w1w2w1w2w22]\frac{1}{w_1^{2} + w_2^{2}} \begin{bmatrix} w_1^{2} & w_1w_2 \\ w_1w_2 & w_2 ^{2} \end{bmatrix}

Comment: if L=\text{span} \{ \begin{bmatrix} u_1 \\ u_2 \end{bmatrix} \} where \begin{bmatrix} u_{1} \\ u_2 \end{bmatrix} is a unit vector, i.e. u_1^{2} + u_2^{2} = 1, the matrix simplifies to \begin{bmatrix} u_1^{2} & u_1u_2 \\ u_1u_2 & u_2^{2} \end{bmatrix}.

Let’s verify T is a linear transformation. Let \vec{x} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}. Show \text{proj}_{L} \vec{x} = A \vec{x}:

1w12+w22[w12w1w2w1w2w22][x1x2]=1w12+w22[w12x1+w1w2x2w1w2x1+w22x2]\frac{1}{w_1^{2} + w_2^{2}} \begin{bmatrix} w_1^{2} & w_1w_2 \\ w_1w_2 & w_2 ^{2} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \frac{1}{w_1 ^{2} + w_2 ^{2}} \begin{bmatrix} w_1^{2}x_1 + w_1w_2x_2 \\ w_1w_2x_1 + w_2^{2}x_2 \end{bmatrix}

= \frac{1}{w_1^{2} + w_2^{2}} \begin{bmatrix} w_1 \left( w_1x_1 + w_2x_2 \right) \\ w_2 \left( w_1x_1 + w_2x_2 \right) \end{bmatrix} = \frac{\vec{w} \cdot \vec{x}}{\vec{w} \cdot \vec{w}} \begin{bmatrix} w_1 \\ w_2 \end{bmatrix}

Example

Find the matrix of orthogonal projection onto the line spanned by w=[12]\vec{w} = \begin{bmatrix} -1 \\ 2 \end{bmatrix}.

1(1)2+22[(1)22222]=15[1224]=[15252545]\frac{1}{\left( -1 \right) ^{2} + 2^{2}} \begin{bmatrix} \left( -1 \right) ^{2} & -2 \\ -2 & 2^{2} \end{bmatrix} = \frac{1}{5} \begin{bmatrix} 1 & -2 \\ -2 & 4 \end{bmatrix} = \begin{bmatrix} \frac{1}{5} & -\frac{2}{5} \\ -\frac{2}{5} & \frac{4}{5} \end{bmatrix}

Example

Find the matrix of orthogonal projection onto the line y=xy=x.

span{[11]}\text{span}\{ \begin{bmatrix} 1 \\ 1 \end{bmatrix} \}

\frac{1}{1^{2} + 1^{2}} \begin{bmatrix} 1^{2} & 1\cdot 1 \\ 1\cdot 1 & 1^{2} \end{bmatrix} = \begin{bmatrix} \frac{1}{2} & \frac{1}{2} \\ \frac{1}{2} & \frac{1}{2} \end{bmatrix}
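Both projection matrices above can be built from the outer-product form of the formula (a NumPy sketch):

```python
import numpy as np

def proj_matrix(w):
    """2x2 matrix of orthogonal projection onto span{w} (outer-product form)."""
    return np.outer(w, w) / (w @ w)

# Line spanned by (-1, 2): matches the matrix computed above
assert np.allclose(proj_matrix(np.array([-1.0, 2.0])),
                   [[0.2, -0.4], [-0.4, 0.8]])

# Line y = x, spanned by (1, 1)
assert np.allclose(proj_matrix(np.array([1.0, 1.0])),
                   [[0.5, 0.5], [0.5, 0.5]])
```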

Reflection: Let L = \text{span} \{ \vec{w} \} be a line in \mathbb{R} ^2.

We use x=xprojL(x)\vec{x}^{\bot} = \vec{x} - \text{proj}_L (\vec{x})

\text{ref} _{L}\left( \vec{x} \right) = \text{proj} _{L}\left( \vec{x} \right) - \vec{x}^{\bot}

=projL(x)(xprojL(x))= \text{proj} _{L} \left( \vec{x} \right) - \left( \vec{x} - \text{proj} _{L}\left( x \right) \right)

=2projL(x)x= 2 \text{proj}_{L} \left( \vec{x} \right) - \vec{x}

The matrix of reflection about line LL:

Two ways to compute:

1) Suppose L=span{[u1u2]}L = \text{span}\{ \begin{bmatrix} u_1 \\ u_2 \end{bmatrix} \}, where u12+u22=1u_1 ^{2} + u_2 ^{2} = 1

\text{ref} _{L}\left( \vec{x} \right) = 2 \text{proj} _{L} \left( \vec{x} \right) - \vec{x} \to 2 \begin{bmatrix} u _1^{2} & u _1u _2 \\ u _1u _2 & u _2^{2} \end{bmatrix} - I _2 = \begin{bmatrix} 2u _1^{2}-1 & 2u _1u _2 \\ 2u _1u _2 & 2u _2^{2} - 1 \end{bmatrix}

2) The matrix has the form [abba]\begin{bmatrix} a & b \\ b & -a \end{bmatrix} where a2+b2=1a^{2} + b^{2} = 1 and [ab]=refL(e1)\begin{bmatrix} a \\ b \end{bmatrix} = \text{ref}_{L}\left( \vec{e}_1 \right)

Example

Verify that reflection about the line y=x has matrix \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}.

2projL(x)x2 \text{proj}_{L}\left( \vec{x} \right) - \vec{x}

2[12121212][1001]=[111111]=[0110]2 \begin{bmatrix} \frac{1}{2} & \frac{1}{2} \\ \frac{1}{2} & \frac{1}{2} \end{bmatrix} - \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 1-1 & 1 \\ 1 & 1-1 \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}

Example

Let L be the y-axis, i.e. L = \text{span}\{ \begin{bmatrix} 0 \\ 1 \end{bmatrix} \}.

Find refL(e1)\text{ref}_{L}\left( \vec{e}_1 \right) and the matrix of reflection about the line LL.

refL(e1)=[ab]\text{ref}_{L} \left( \vec{e}_1 \right) = \begin{bmatrix} a \\ b \end{bmatrix}

Matrix: [abba]\begin{bmatrix} a & b \\ b & -a \end{bmatrix}

refL(e1)=2projL(e1)e1\text{ref} _{L}\left( \vec{e} _{1} \right) = 2 \text{proj} _{L} \left( \vec{e} _1 \right) - \vec{e} _1

=2(e1e2e2e2)e2e1=2(01)e2e1=[10]= 2 \left( \frac{\vec{e}_1 \cdot \vec{e}_2}{\vec{e}_2 \cdot \vec{e}_2} \right) \vec{e}_2 - \vec{e}_1 = 2 \left( \frac{0}{1} \right) \vec{e}_2 - \vec{e}_1 = \begin{bmatrix} -1 \\ 0 \end{bmatrix}

A=[1001]A = \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}
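The identity ref_L = 2 proj_L − I translates directly into code; this NumPy sketch reproduces both reflection examples:

```python
import numpy as np

def refl_matrix(w):
    """2x2 matrix of reflection about span{w}, via 2*proj_L - I."""
    P = np.outer(w, w) / (w @ w)
    return 2 * P - np.eye(2)

# Reflection about y = x swaps the coordinates
assert np.allclose(refl_matrix(np.array([1.0, 1.0])), [[0, 1], [1, 0]])

# Reflection about the y-axis negates the first coordinate
assert np.allclose(refl_matrix(np.array([0.0, 1.0])), [[-1, 0], [0, 1]])
```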

Scaling

For k>0, T(x)=kxk > 0,\ T(\vec{x}) = k \vec{x}.

[k00k]\begin{bmatrix} k & 0 \\ 0 & k \end{bmatrix}
  • k>1k > 1 : Dilation
  • 0<k<10 < k < 1 : Contraction

Question: Can we interpret the transformation T(x)=[0110]xT(\vec{x}) = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} \vec{x} geometrically?

Answer: Rotation counterclockwise by π2\frac{\pi}{2} or 90 degrees.

Rotation

Counterclockwise by angle θ\theta.

lec5-fig3

T(e1)=[cosθsinθ]T\left( \vec{e}_1 \right) = \begin{bmatrix} \cos \theta \\ \sin \theta \end{bmatrix}

T(e2)=[cos(θ+π2)sin(θ+π2)]=[sinθcosθ]T\left( \vec{e}_2 \right) = \begin{bmatrix} \cos \left( \theta + \frac{\pi}{2} \right) \\ \sin \left( \theta + \frac{\pi}{2} \right) \end{bmatrix} = \begin{bmatrix} -\sin \theta \\ \cos \theta \end{bmatrix}

A=[cosθsinθsinθcosθ]\therefore A = \begin{bmatrix} \cos \theta & - \sin \theta \\ \sin \theta & \cos \theta \end{bmatrix}
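A quick NumPy sketch of the rotation matrix, checking the 90-degree case discussed above:

```python
import numpy as np

def rot_matrix(theta):
    """2x2 matrix of counterclockwise rotation by angle theta."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

# Rotation by pi/2 sends e1 -> e2 and e2 -> -e1
R = rot_matrix(np.pi / 2)
assert np.allclose(R @ [1, 0], [0, 1])
assert np.allclose(R @ [0, 1], [-1, 0])
```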

Transformation Recap:

Transformation Matrix
Scaling (by kk) kI2=[k00k]kI_2 = \begin{bmatrix} k & 0 \\ 0 & k \end{bmatrix}
Orthogonal projection onto line LL [u12u1u2u1u2u22]\begin{bmatrix} u_1^2 & u_1u_2 \\ u_1u_2 & u_2^2 \end{bmatrix} , where [u1u2]\begin{bmatrix} u_1 \\ u_2 \end{bmatrix} is a unit vector parallel to LL
Reflection about a line [abba]\begin{bmatrix} a & b \\ b & -a \end{bmatrix}, where a2+b2=1a^2 + b^2 = 1
Rotation through angle θ\theta [cosθsinθsinθcosθ]\begin{bmatrix} \cos \theta & - \sin \theta \\ \sin \theta & \cos \theta \end{bmatrix} or [abba]\begin{bmatrix} a & -b \\ b & a \end{bmatrix} , where a2+b2=1a^2 + b^2 = 1
Rotation through angle θ\theta combined with scaling by rr [abba]=r[cosθsinθsinθcosθ]\begin{bmatrix} a & -b \\ b & a \end{bmatrix} = r \begin{bmatrix} \cos \theta & - \sin \theta \\ \sin \theta & \cos \theta \end{bmatrix}
Horizontal shear [1k01]\begin{bmatrix} 1 & k \\ 0 & 1 \end{bmatrix}
Vertical shear \begin{bmatrix} 1 & 0 \\ k & 1 \end{bmatrix}
2.3 Matrix Products

2.3 Matrix Products

Rotation combined with scaling. Suppose

  • T_1 : \mathbb{R}^2 \to \mathbb{R}^2 gives rotation counterclockwise by angle \theta
  • T_2 : \mathbb{R}^2 \to \mathbb{R}^2 scales by k > 0

This is in the form T2(T1(x))T_2 (T_1(\vec{x}))

T2T1:R2R2 function compositionT_2 T_1 : \mathbb{R}^2 \to \mathbb{R}^2 \text{ function composition} (T2T1)(x)=k[cosθsinθsinθcosθ]x(T_2 T_1)(\vec{x}) = k \begin{bmatrix} \cos \theta & - \sin \theta \\ \sin \theta & \cos \theta \end{bmatrix} \vec{x}

What is the matrix?

[kcosθksinθksinθkcosθ]=[k00k][cosθsinθsinθcosθ]\begin{bmatrix} k\cos \theta & -k \sin \theta \\ k \sin \theta & k \cos \theta \end{bmatrix} = \begin{bmatrix} k & 0 \\ 0 & k \end{bmatrix} \begin{bmatrix} \cos \theta & -\sin \theta \\ \sin \theta & \cos \theta \end{bmatrix} Composition of TransformationsMatrix Product\text{Composition of Transformations} \leftrightarrow \text{Matrix Product}

The matrix product BA: Suppose BB is an n×pn\times p matrix and AA is a p×mp \times m matrix.

Size of BABA: [n×p][p×m]n×m[n \times p] [p\times m] \to n\times m

Columns of the product BABA: Suppose A=[v1v2vm]A = \begin{bmatrix} | & | & & | \\ \vec{v}_1 & \vec{v}_2 & \cdots & \vec{v}_m \\ | & | & & | \end{bmatrix}

BA=[Bv1Bv2Bvm]BA = \begin{bmatrix} | & | & & | \\ B\vec{v}_1 & B\vec{v}_2 & \cdots & B\vec{v}_m \\ | & | & & | \\ \end{bmatrix}

Entries of BABA are dot products.

(i, j)-entry of BA = [row i of B] \cdot [column j of A]

Example

[131201][132001]=[7425]\begin{bmatrix} 1 & 3 & -1 \\ 2 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 3 \\ 2 & 0 \\ 0 & -1 \end{bmatrix} = \begin{bmatrix} 7 & 4 \\ 2 & 5 \end{bmatrix}
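A NumPy check of this product (the `@` operator performs exactly the row-dot-column rule described above):

```python
import numpy as np

B = np.array([[1, 3, -1],
              [2, 0, 1]])
A = np.array([[1, 3],
              [2, 0],
              [0, -1]])

# A [2x3][3x2] product is 2x2; each entry is a row of B dotted with a column of A
assert np.array_equal(B @ A, np.array([[7, 4], [2, 5]]))
```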

Rows of the product BA: [ith row of BA] = [ith row of B] A

Example

[201][132001]=[25]\begin{bmatrix} 2 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 3 \\ 2 & 0 \\ 0 & -1 \end{bmatrix} = \begin{bmatrix} 2 & 5 \end{bmatrix}

Example

Suppose A=[5320]A = \begin{bmatrix} 5 & 3 & 2 & 0 \end{bmatrix} and B=[1123]B = \begin{bmatrix} 1 \\ -1 \\ 2 \\ -3 \end{bmatrix}. Find ABAB and BABA.

AB=53+4+0=[6]AB = 5 - 3 + 4 + 0 = \begin{bmatrix} 6 \end{bmatrix} BA=[1123][5320]=[532053201064015960]BA = \begin{bmatrix} 1 \\ -1 \\ 2 \\ -3 \end{bmatrix} \begin{bmatrix} 5 & 3 & 2 & 0 \end{bmatrix} = \begin{bmatrix} 5 & 3 & 2 & 0 \\ -5 & -3 & -2 & 0 \\ 10 & 6 & 4 & 0 \\ -15 & -9 & -6 & 0 \end{bmatrix}

Notice by these examples that ABBAAB \neq BA (they are not even the same size).

Example

Let A=[2130]A = \begin{bmatrix} 2 & 1 \\ -3 & 0 \end{bmatrix}, B=[1010]B = \begin{bmatrix} 1 & 0 \\ 1 & 0 \end{bmatrix}, and C=[0110]C = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}. Show that A(B+C)=AB+ACA(B+C) = AB + AC

[2130]([1010]+[0110])=[2130][1100]=[2233]\begin{bmatrix} 2 & 1 \\ -3 & 0 \end{bmatrix} \left( \begin{bmatrix} 1 & 0 \\ 1 & 0 \end{bmatrix} + \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} \right) = \begin{bmatrix} 2 & 1 \\ -3 & 0 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} 2 & 2 \\ -3 & -3 \end{bmatrix}

Properties

  • A(B+C)=AB+ACA(B+C) = AB + AC and (C+D)A=CA+DA(C+D)A = CA + DA
  • InA=AIm=AI_nA = AI_m = A
  • k(AB) = (kA)B = A(kB) for any scalar k
  • A(BC)=(AB)CA(BC) = (AB)C

Be Careful!

  • ABBAAB \neq BA generally even if they are the same size
  • If AB=ACAB = AC, it does not generally follow that B=CB=C
  • If AB=0AB=0, it does not generally follow that A=0A=0 or B=0B=0

Example

[1010][4111]=[4141]\begin{bmatrix} 1 & 0 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 4 & 1 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} 4 & 1 \\ 4 & 1 \end{bmatrix}

and

[1010][4112]=[4141]\begin{bmatrix} 1 & 0 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 4 & 1 \\ -1 & 2 \end{bmatrix} = \begin{bmatrix} 4 & 1 \\ 4 & 1 \end{bmatrix}

Example

[4111][1010]=[5020]\begin{bmatrix} 4 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} 5 & 0 \\ 2 & 0 \end{bmatrix}

Example

[200040][0016]=[000000]\begin{bmatrix} 2 & 0 \\ 0 & 0 \\ -4 & 0 \end{bmatrix} \begin{bmatrix} 0 & 0 \\ 1 & 6 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}
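The cancellation warnings above are easy to demonstrate numerically (a NumPy sketch using the matrices from these examples):

```python
import numpy as np

A = np.array([[1, 0], [1, 0]])
B = np.array([[4, 1], [1, 1]])
C = np.array([[4, 1], [-1, 2]])

# AB = AC even though B != C: no cancellation
assert np.array_equal(A @ B, A @ C)
assert not np.array_equal(B, C)

# A product can be the zero matrix with both factors nonzero
P = np.array([[2, 0], [0, 0], [-4, 0]])
Q = np.array([[0, 0], [1, 6]])
assert np.array_equal(P @ Q, np.zeros((3, 2)))
```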

Definition: For matrices AA and BB, we say AA and BB commute provided AB=BAAB = BA. Note that both AA and BB must be n×nn \times n.

  • We see \begin{bmatrix} 1 & 0 \\ 1 & 0 \end{bmatrix} and \begin{bmatrix} 4 & 1 \\ 1 & 1 \end{bmatrix} do not commute.
  • InI_n commutes with any n×nn \times n matrix

Example

[abcd][1010]=[a+b0c+d0]\begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} a + b & 0 \\ c+d & 0 \end{bmatrix} [1010][abcd]=[abab]\begin{bmatrix} 1 & 0 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} a & b \\ a & b \end{bmatrix}

Comparing entries: a+b = a forces b = 0, and the (2,1)-entries give c+d = a. So the matrices that commute with \begin{bmatrix} 1 & 0 \\ 1 & 0 \end{bmatrix} are exactly those of the form

\begin{bmatrix} c+d & 0 \\ c & d \end{bmatrix}

Example

Find all matrices that commute with [200030004]\begin{bmatrix} 2 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 4 \end{bmatrix}

[200030004][abcdefghi]=[2a2b2c3d3e3f4g4h4i]\begin{bmatrix} 2 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 4 \end{bmatrix} \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} = \begin{bmatrix} 2a & 2b & 2c \\ 3d & 3e & 3f \\ 4g & 4h & 4i \end{bmatrix} [abcdefghi][200030004]=[2a3b4c2d3e4f2g3h4i]\begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} \begin{bmatrix} 2 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 4 \end{bmatrix} = \begin{bmatrix} 2a & 3b & 4c \\ 2d & 3e & 4f \\ 2g & 3h & 4i \end{bmatrix}

Comparing off-diagonal entries: 2b = 3b, 2c = 4c, 3d = 2d, 3f = 4f, 4g = 2g, 4h = 3h, so b = c = d = f = g = h = 0. The commuting matrices are exactly the diagonal matrices

\begin{bmatrix} a & 0 & 0 \\ 0 & e & 0 \\ 0 & 0 & i \end{bmatrix}
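A small NumPy sketch of this conclusion: diagonal matrices commute with diag(2, 3, 4), and a single off-diagonal entry breaks commutativity.

```python
import numpy as np

D = np.diag([2.0, 3.0, 4.0])

# Any diagonal matrix commutes with D
X = np.diag([5.0, -1.0, 7.0])
assert np.allclose(D @ X, X @ D)

# A single off-diagonal entry already breaks commutativity
Y = np.zeros((3, 3))
Y[0, 1] = 1.0
assert not np.allclose(D @ Y, Y @ D)
```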

Power of a Matrix

Suppose AA is n×nn \times n. For k1k \ge 1 integer, define the kkth power of AA.

Ak=AAAAAundefinedk timesA^k = \underbrace{AAAA \cdots A}_{k \text{ times}}

Properties:

  • ApAq=Ap+qA^pA^q = A^{p+q}
  • (Ap)q=Apq\left( A^{p} \right)^{q} = A^{pq}

Example

A=[012001000]A = \begin{bmatrix} 0 & 1 & 2 \\ 0 & 0 & -1 \\ 0 & 0 & 0 \end{bmatrix}. Find A2A^{2}, A3A^{3}. What is AkA^{k} for k>3k > 3?

A2=[012001000][012001000]=[001000000]A^2 = \begin{bmatrix} 0 & 1 & 2 \\ 0 & 0 & -1 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} 0 & 1 & 2 \\ 0 & 0 & -1 \\ 0 & 0 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 0 & -1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} A3=[012001000][001000000]=[000000000]A^3 = \begin{bmatrix} 0 & 1 & 2 \\ 0 & 0 & -1 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} 0 & 0 & -1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}

Note that A3=0A^3 = 0, but A0A \neq 0.

rank(A)=2\text{rank}\left( A \right) = 2 rank(A2)=1\text{rank}\left( A^{2} \right) = 1 rank(A3)=0\text{rank}\left( A^{3} \right) = 0
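The powers and ranks in this example check out numerically (a NumPy sketch):

```python
import numpy as np

A = np.array([[0, 1, 2],
              [0, 0, -1],
              [0, 0, 0]])

A2 = A @ A
A3 = A2 @ A

assert np.array_equal(A2, [[0, 0, -1], [0, 0, 0], [0, 0, 0]])
assert np.array_equal(A3, np.zeros((3, 3)))  # A^3 = 0 although A != 0

# The ranks drop with each power: 2, 1, 0
assert [np.linalg.matrix_rank(M) for M in (A, A2, A3)] == [2, 1, 0]
```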

Example

[a000b000c][a000b000c]=[a2000b2000c2]\begin{bmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{bmatrix} \begin{bmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{bmatrix} = \begin{bmatrix} a^2 & 0 & 0 \\ 0 & b^2 & 0 \\ 0 & 0 & c^2 \end{bmatrix}
[a000b000c]k=[ak000bk000ck]\begin{bmatrix} a & 0 & 0\\ 0 & b & 0 \\ 0 & 0 & c \end{bmatrix}^k = \begin{bmatrix} a^k & 0 & 0\\ 0 & b^k & 0 \\ 0 & 0 & c^k \end{bmatrix}

Exam 1

Will most likely have a “find all matrices that commute with” question

100 minutes


Practice Quiz 2

1) Compute the product AxA \vec{x} using paper and pencil: [13141001][12]\begin{bmatrix} 1 & 3 \\ 1 & 4 \\ -1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ -2 \end{bmatrix}.

1[1110]2[3401]=[5712]1 \begin{bmatrix} 1 \\ 1 \\ -1 \\ 0 \end{bmatrix} - 2 \begin{bmatrix} 3 \\ 4 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} -5 \\ -7 \\ -1 \\ -2 \end{bmatrix}

2) Let AA be a 6×36 \times 3 matrix. We are told that Ax=0A \vec{x} = \vec{0} has a unique solution.

a) What is the reduced row-echelon form of AA? b) Can Ax=bA\vec{x} = \vec{b} be an inconsistent system for some bR6\vec{b} \in \mathbb{R}^6? Justify your answer. c) Can Ax=bA\vec{x} = \vec{b} have infinitely many solutions for some bR6\vec{b} \in \mathbb{R}^6? Justify your answer.

Solution

a)

[100010001000000000]\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}

b) Yes; we can have a row \begin{bmatrix} 0 & 0 & 0 & \big| & c \end{bmatrix} with c\neq 0 in \text{rref}\begin{bmatrix} A & \big| & \vec{b} \end{bmatrix}.

c) No; there are no free variables

3) Let w=[2201]\vec{w} = \begin{bmatrix} -2 \\ 2 \\ 0 \\ 1 \end{bmatrix}, L=span(w)L = \text{span}\left( \vec{w} \right), and x=3e3R4\vec{x} = 3 \vec{e}_3 \in \mathbb{R}^4. Show your work

a) Find x=projL(x)\vec{x}^{\parallel} = \text{proj}_L \left( \vec{x} \right), the projection of x\vec{x} onto LL. b) Find x\vec{x}^{\bot}, the component of x\vec{x} orthogonal to LL.

Solution

a) projL(x)=(xwww)w\text{proj}_L \left( \vec{x} \right) = \left( \frac{\vec{x} \cdot \vec{w}}{\vec{w} \cdot \vec{w}} \right) \vec{w}

xw=0+6+0+0=6\vec{x} \cdot \vec{w} = 0 + 6 + 0 + 0 = 6

ww=4+4+0+1=9\vec{w} \cdot \vec{w} = 4 + 4 + 0 + 1 = 9

projL(x)=23[2201]=[4343023]\text{proj}_{L} \left( \vec{x} \right) = \frac{2}{3} \begin{bmatrix} -2 \\ 2 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} -\frac{4}{3} \\ \frac{4}{3} \\ 0 \\ \frac{2}{3} \end{bmatrix}

b) x=xprojL(x)\vec{x}^{\bot} = \vec{x} - \text{proj}_L \left( \vec{x} \right)

=[0300][4343023]=[4353023]= \begin{bmatrix} 0 \\ 3 \\ 0 \\ 0 \end{bmatrix} - \begin{bmatrix} -\frac{4}{3} \\ \frac{4}{3} \\ 0 \\ \frac{2}{3} \end{bmatrix} = \begin{bmatrix} \frac{4}{3} \\ \frac{5}{3} \\ 0 \\ -\frac{2}{3} \end{bmatrix}

4) Suppose T1:R2R3T_1 : \mathbb{R}^{2} \to \mathbb{R}^{3} is given by T1([xy])=[0xy3y]T_1 \left( \begin{bmatrix} x \\ y \end{bmatrix} \right) = \begin{bmatrix} 0 \\ x - y \\ 3y \end{bmatrix} and T2:R2R2T_2 : \mathbb{R}^{2} \to \mathbb{R}^{2} is a scaling transformation with T2([17])=[321]T_2 \left( \begin{bmatrix} 1 \\ 7 \end{bmatrix} \right) = \begin{bmatrix} 3 \\ 21 \end{bmatrix}. Show your work

a) Find the matrix of the transformation T1T_1. b) Find the matrix of the transformation T2T_2.

Solution

a) [T(e1)T(e2)]\begin{bmatrix} | & | \\ T\left( \vec{e}_1 \right) & T\left( \vec{e}_2 \right) \\ | & | \end{bmatrix}

T[10]=[010]T \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, T[01]=[013]T \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ -1 \\ 3 \end{bmatrix}

A=[001103]A = \begin{bmatrix} 0 & 0 \\ 1 & -1 \\ 0 & 3 \end{bmatrix}

b) Scaling by k=3k=3

[3003]\begin{bmatrix} 3 & 0 \\ 0 & 3 \end{bmatrix}

5) Let T:R2R3T : \mathbb{R}^{2} \to \mathbb{R}^{3} be a linear transformation such that T(2e1)=[222]T \left( 2 \vec{e}_1 \right) = \begin{bmatrix} 2 \\ 2 \\ 2 \end{bmatrix} and T(e1+e2)=[234]T \left( \vec{e}_1 + \vec{e}_2 \right) = \begin{bmatrix} 2 \\ 3 \\ 4 \end{bmatrix}. Find T(e1)T \left( \vec{e}_1 \right) and T(e2)T \left( \vec{e}_2 \right). Show your work.

T(2e1)=2T(e1)=[222]T \left( 2 \vec{e}_1 \right) = 2 T \left( \vec{e}_1 \right) = \begin{bmatrix} 2 \\ 2 \\ 2 \end{bmatrix}

T(e1)=[111]T \left( \vec{e}_1 \right) = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}

T(e1+e2)=T(e1)+T(e2)=[234]T \left( \vec{e}_1 + \vec{e}_2 \right) = T \left( \vec{e}_1 \right) + T \left( \vec{e}_2 \right) = \begin{bmatrix} 2 \\ 3 \\ 4 \end{bmatrix}

T(e2)=[234]T(e1)=[123]T \left( \vec{e}_2 \right) = \begin{bmatrix} 2 \\ 3 \\ 4 \end{bmatrix} - T \left( \vec{e}_1 \right) = \begin{bmatrix} 1 \\ 2 \\3 \end{bmatrix}

2.4 Inverse of a Linear Transformation

2.4 Inverse of a Linear Transformation

In Math1365 (or other courses), you see diagrams for f:XYf : X \to Y function.

lec7-fig1

Definition:

We say the function f : X \to Y is invertible provided for each y in Y, there is a unique x in X with f(x) = y. In that case, the inverse f^{-1} : Y \to X is the function defined by f^{-1}(y) = x, where x is the unique element with f(x) = y.

Same notation for linear transformation T:RnRnT : \mathbb{R}^{n} \to \mathbb{R}^{n}

A square n×nn \times n matrix AA is invertible provided the map T(x)=AxT \left( \vec{x} \right) = A \vec{x} is invertible. The matrix for T1T^{-1} is denoted A1A^{-1}.

Note:

  • TT1(y)=yT T^{-1} (\vec{y}) = \vec{y} for any y\vec{y} in Rn\mathbb{R}^{n}
  • T1T(x)=xT^{-1}T(\vec{x}) = \vec{x} for any x\vec{x} in Rn\mathbb{R}^{n}
  • AA1=InAA^{-1} = I_{n} and A1A=InA^{-1}A = I_{n}

AA invertible means Ax=bA\vec{x} = \vec{b} has a unique solution for every b\vec{b} in Rn\mathbb{R}^{n}.

  • The unique solution is x=A1b\vec{x} = A^{-1}\vec{b}

For our discussion of rank: AA is invertible is equivalent to…

  • rank(A)=n\text{rank}(A) = n
  • rref(A)=In\text{rref}(A) = I_n
  • The only solution to Ax=0A\vec{x} = \vec{0} is x=0\vec{x} = \vec{0}

How to find A^{-1} if A is n \times n:

  1. Form the n×(2n)n \times \left( 2n \right) matrix [AI]\begin{bmatrix} A & \big| & I \end{bmatrix}
  2. Perform elementary row operations to find rref[AI]\text{rref} \begin{bmatrix} A & \big| & I \end{bmatrix}

Then,

  • If rref[AI]=[IB]\text{rref} \begin{bmatrix} A & \big| & I \end{bmatrix} = \begin{bmatrix} I & \big| & B\end{bmatrix} then B=A1B = A^{-1}.
  • If rref[AI]\text{rref} \begin{bmatrix} A & \big| & I \end{bmatrix} is not of this form then AA is not invertible.

Example

A=[2311]A = \begin{bmatrix} 2 & 3 \\ 1 & 1 \end{bmatrix}. Find A1A^{-1}.

\begin{bmatrix} 2 & 3 & \big| & 1 & 0 \\ 1 & 1 & \big| & 0 & 1 \end{bmatrix} \to \begin{bmatrix} 1 & 1 & \big| & 0 & 1 \\ 2 & 3 & \big| & 1 & 0 \end{bmatrix} \to \begin{bmatrix} 1 & 1 & \big| & 0 & 1 \\ 0 & 1 & \big| & 1 & -2 \end{bmatrix} \to \begin{bmatrix} 1 & 0 & \big| & -1 & 3 \\ 0 & 1 & \big| & 1 & -2 \end{bmatrix} \qquad A^{-1} = \begin{bmatrix} -1 & 3 \\ 1 & -2 \end{bmatrix}
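The row reduction can be cross-checked with NumPy’s built-in inverse (a sketch, assuming NumPy is available):

```python
import numpy as np

A = np.array([[2.0, 3.0],
              [1.0, 1.0]])

A_inv = np.linalg.inv(A)

assert np.allclose(A_inv, [[-1, 3], [1, -2]])  # matches the row reduction above
assert np.allclose(A @ A_inv, np.eye(2))       # A A^{-1} = I_2
```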

Example

A = \begin{bmatrix} 2 & 2 \\ 1 & 1 \end{bmatrix}. Find A^{-1}.

\begin{bmatrix} 2 & 2 & \big| & 1 & 0 \\ 1 & 1 & \big| & 0 & 1 \end{bmatrix} \to \begin{bmatrix} 1 & 1 & \big| & 0 & 1 \\ 2 & 2 & \big| & 1 & 0 \end{bmatrix} \to \begin{bmatrix} 1 & 1 & \big| & 0 & 1 \\ 0 & 0 & \big| & 1 & -2 \end{bmatrix}

AA is not invertible

Example

A=[131141201]A = \begin{bmatrix} 1 & 3 & 1 \\ 1 & 4 & 1 \\ 2 & 0 & 1 \end{bmatrix}. Find A1A^{-1}.

\begin{bmatrix} 1 & 3 & 1 & \big| & 1 & 0 & 0 \\ 1 & 4 & 1 & \big| & 0 & 1 & 0 \\ 2 & 0 & 1 & \big| & 0 & 0 & 1 \end{bmatrix} \to \begin{bmatrix} 1 & 3 & 1 & \big| & 1 & 0 & 0 \\ 0 & 1 & 0 & \big| & -1 & 1 & 0 \\ 0 & -6 & -1 & \big| & -2 & 0 & 1 \end{bmatrix} \to \begin{bmatrix} 1 & 3 & 1 & \big| & 1 & 0 & 0 \\ 0 & 1 & 0 & \big| & -1 & 1 & 0 \\ 0 & 0 & -1 & \big| & -8 & 6 & 1 \end{bmatrix} \to \begin{bmatrix} 1 & 3 & 1 & \big| & 1 & 0 & 0 \\ 0 & 1 & 0 & \big| & -1 & 1 & 0 \\ 0 & 0 & 1 & \big| & 8 & -6 & -1 \end{bmatrix} \to \begin{bmatrix} 1 & 3 & 0 & \big| & -7 & 6 & 1 \\ 0 & 1 & 0 & \big| & -1 & 1 & 0 \\ 0 & 0 & 1 & \big| & 8 & -6 & -1 \end{bmatrix} \to \begin{bmatrix} 1 & 0 & 0 & \big| & -4 & 3 & 1 \\ 0 & 1 & 0 & \big| & -1 & 1 & 0 \\ 0 & 0 & 1 & \big| & 8 & -6 & -1 \end{bmatrix} \qquad A^{-1} = \begin{bmatrix} -4 & 3 & 1 \\ -1 & 1 & 0 \\ 8 & -6 & -1 \end{bmatrix}

Example

Find all solutions to the system Ax=bA\vec{x} = \vec{b} where A=[131141201]A = \begin{bmatrix} 1 & 3 & 1 \\ 1 & 4 & 1 \\ 2 & 0 & 1 \end{bmatrix} and b=[110]\vec{b} = \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix}

A1=[431110861]A^{-1} = \begin{bmatrix} -4 & 3 & 1 \\ -1 & 1 & 0 \\ 8 & -6 & -1 \end{bmatrix} x=A1b=[431110861][110]=[7214]\vec{x} = A^{-1}\vec{b} = \begin{bmatrix} -4 & 3 & 1 \\ -1 & 1 & 0 \\ 8 & -6 & -1 \end{bmatrix} \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix} = \begin{bmatrix} -7 \\ -2 \\ 14 \end{bmatrix}
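The \text{rref}\begin{bmatrix} A & \big| & I \end{bmatrix} procedure above can be sketched in Python with exact rational arithmetic (an illustrative sketch, not course material; `rref` and `invert` are made-up helper names):

```python
from fractions import Fraction

def rref(rows):
    """Reduced row-echelon form via Gauss-Jordan elimination."""
    M = [[Fraction(x) for x in row] for row in rows]
    nrows, ncols = len(M), len(M[0])
    pivot_row = 0
    for col in range(ncols):
        # find a row at or below pivot_row with a nonzero entry in this column
        pr = next((r for r in range(pivot_row, nrows) if M[r][col] != 0), None)
        if pr is None:
            continue
        M[pivot_row], M[pr] = M[pr], M[pivot_row]          # swap row into place
        pivot = M[pivot_row][col]
        M[pivot_row] = [x / pivot for x in M[pivot_row]]   # scale pivot to 1
        for r in range(nrows):                             # clear rest of column
            if r != pivot_row and M[r][col] != 0:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[pivot_row])]
        pivot_row += 1
        if pivot_row == nrows:
            break
    return M

def invert(A):
    """Row-reduce [A | I]; return the right half, or None if A is not invertible."""
    n = len(A)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(A)]
    R = rref(aug)
    if any(R[i][j] != (1 if i == j else 0) for i in range(n) for j in range(n)):
        return None   # left half is not I, so A has no inverse
    return [row[n:] for row in R]

A = [[1, 3, 1], [1, 4, 1], [2, 0, 1]]
print(invert(A))   # the inverse computed above
```

Running `invert` on the non-invertible matrix \begin{bmatrix} 2 & 2 \\ 1 & 1 \end{bmatrix} returns `None`, matching the earlier example.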

Theorem:

Let AA, BB be n×nn \times n matrices with BA=InBA = I_n then,

  1. AA, BB are both invertible
  2. A1=BA^{-1} = B and B1=AB^{-1} = A
  3. AB=InAB = I_n

Proof of 1) Assume A, B are n\times n matrices with BA = I_n. Suppose A\vec{x} = \vec{0}; we show \vec{x} = \vec{0}. Multiplying by B: BA\vec{x} = B\vec{0}, which rewrites as I\vec{x} = \vec{0}, meaning \vec{x} = \vec{0}. Thus A is invertible. Then BAA^{-1} = IA^{-1}, so B = A^{-1}, and B is invertible.

Using the theorem:

If AA, BB are n×nn\times n invertible matrices then so is BABA and (BA)1=A1B1\left( BA \right) ^{-1} = A^{-1}B^{-1}.

Proof: (BA)(A1B1)=B(AA1)B1=BIB1=BB1=I\left( BA \right) \left( A^{-1}B^{-1} \right) = B\left( A A^{-1} \right) B^{-1} = BIB^{-1} = B B^{-1} = I.

Exercise: Suppose AA is an n×nn\times n invertible matrix.

Is A2A^{2} invertible? If so, what is (A2)1\left( A^{2} \right) ^{-1}?

Yes; A1A1=(A1)2A^{-1}A^{-1} = \left( A^{-1} \right) ^{2}

Is A3A^{3} invertible? If so, what is (A3)1\left( A^{3} \right)^{-1}?

Yes; (A1)3\left( A^{-1} \right) ^{3}

(AAA)(A1A1A1)=AAA1A1=AA1=I\left( A AA \right) \left( A^{-1}A^{-1}A^{-1} \right) = A A A^{-1}A^{-1} = A A^{-1} = I

Back to 2×22\times 2 matrices: We saw

  • For A=[2311]A = \begin{bmatrix} 2 & 3 \\ 1 & 1 \end{bmatrix}, A1=[1312]A^{-1} = \begin{bmatrix} -1 & 3 \\ 1 & -2 \end{bmatrix}.
  • The matrix [2211]\begin{bmatrix} 2 & 2 \\ 1 & 1 \end{bmatrix} is not invertible

Theorem: Consider a 2×22\times 2 matrix A=[abcd]A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}.

AA is invertible if and only if adbc0ad - bc \neq 0

If AA is invertible, then A1A^{-1} = 1adbc[dbca]\frac{1}{ad-bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}

The number adbcad - bc is called the determinant of A=[abcd]A = \begin{bmatrix} a & b \\ c& d \end{bmatrix}.

Example

A=[4701]A = \begin{bmatrix} 4 & 7 \\ 0 & 1 \end{bmatrix}. Find det(A)\text{det}(A) and A1A^{-1}.

det(A)=40=4\text{det}(A) = 4 - 0 = 4

A1=14[1704]A^{-1} = \frac{1}{4} \begin{bmatrix} 1 & -7 \\ 0 & 4 \end{bmatrix}
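The 2\times 2 determinant formula is easy to spot-check in code (a small sketch using exact fractions; `inverse_2x2` is an illustrative name):

```python
from fractions import Fraction

def inverse_2x2(a, b, c, d):
    """Inverse of [[a, b], [c, d]] via (1/(ad-bc))[[d, -b], [-c, a]], or None."""
    det = Fraction(a * d - b * c)
    if det == 0:
        return None                      # ad - bc = 0: not invertible
    return [[d / det, -b / det],
            [-c / det, a / det]]

print(inverse_2x2(4, 7, 0, 1))           # equals (1/4)[[1, -7], [0, 4]]
```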

3.1 Image and Kernel of a Linear Transformation

3.1 Image and Kernel of a Linear Transformation

Definition:

Let T:RmRnT : \mathbb{R}^{m} \to \mathbb{R}^{n} be a linear transformation.

The Image of TT, denoted im(T)\text{im}\left( T \right) : im(T)={T(x):xRm}Rn\text{im}\left( T \right) = \{T \left( \vec{x} \right) : x \in \mathbb{R}^{m} \} \subseteq \mathbb{R}^{n}

The kernel of TT ker(T)\text{ker}\left( T \right) : ker(T)={xRm:T(x)=0}Rm\text{ker}\left( T \right) = \{ \vec{x} \in \mathbb{R}^{m} : T \left( \vec{x} \right) = \vec{0} \} \subseteq \mathbb{R}^{m}

Example

What is ker(T)\text{ker} \left( T \right) and im(T)\text{im}\left( T \right) when T:R2R2T : \mathbb{R}^{2} \to \mathbb{R}^{2} is

1) Projection onto the line y=xy = -x. 2) Reflection about the line y=xy = -x.

Solution

1)

w=[11]\vec{w} = \begin{bmatrix} -1 \\ 1 \end{bmatrix}

L=span([11])L = \text{span}\left( \begin{bmatrix} -1 \\ 1 \end{bmatrix} \right)

projL(x)=(xwww)w\text{proj}_{L} \left( \vec{x} \right) = \left( \frac{\vec{x} \cdot \vec{w}}{\vec{w} \cdot \vec{w}} \right) \vec{w}

x\vec{x} is in ker(T)\text{ker}\left( T \right) provided x[11]=0\vec{x} \cdot \begin{bmatrix} -1 \\ 1 \end{bmatrix} = 0

ker(T)={[x1x2]:x1+x2=0}\text{ker}\left( T \right) = \{ \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} : -x_1 + x_2 = 0 \}

im(T)=L\text{im}\left( T \right) = L

2) ker(T)={0}\text{ker}\left( T \right) = \{ \vec{0} \}

im(T)=R2\text{im}\left( T \right) = \mathbb{R}^{2}

Suppose T:RmRnT : \mathbb{R}^{m} \to \mathbb{R}^{n} is a linear transformation. There is an n×mn \times m matrix A=[a1a2am]A = \begin{bmatrix} | & | & & | \\ \vec{a}_1 & \vec{a}_2 & \cdots & \vec{a}_m \\ | & | & & | \end{bmatrix} such that T(x)=AxT \left( \vec{x} \right) = A \vec{x} for all x\vec{x} in Rm\mathbb{R}^{m}.

Image of TT (Also written im(A)\text{im}\left( A \right)):

\text{im}\left( T \right) = \{ A\vec{x} : \vec{x} \in \mathbb{R}^{m} \} = \{ x_1\vec{a}_1 + x_2\vec{a}_2 + \cdots + x_m\vec{a}_m : x_i \in \mathbb{R} \} = \{ \text{all linear combinations of } \vec{a}_1,\ \vec{a}_2,\ \cdots ,\ \vec{a}_m \} = \text{span}\left( \vec{a}_1,\ \vec{a}_2,\ \cdots ,\ \vec{a}_m \right)

Kernel of TT (Also written ker(A)\text{ker}\left( A \right)):

ker(T)={xRm:Ax=0}={all solutions to Ax=0}\text{ker}\left( T \right) = \{ \vec{x} \in \mathbb{R}^{m} : A\vec{x} = \vec{0} \} = \{ \text{all solutions to } A\vec{x} = \vec{0} \}

Example

Find vectors that span the kernel of [1339]\begin{bmatrix} 1 & -3 \\ -3 & 9 \end{bmatrix}.

[130390][130000]\begin{bmatrix} 1 & -3 & \big| & 0 \\ -3 & 9 & \big| & 0 \end{bmatrix} \to \begin{bmatrix} 1 & -3 & \big| & 0 \\ 0 & 0 & \big| & 0 \end{bmatrix}

x2=tx_2 = t

x13t=0x_1 - 3t = 0

x1=3tx_1 = 3t

[3tt]=t[31]\begin{bmatrix} 3t \\ t \end{bmatrix} = t \begin{bmatrix} 3 \\ 1 \end{bmatrix}

ker(A)=span{[31]}\text{ker}\left( A \right) = \text{span} \{ \begin{bmatrix} 3 \\ 1 \end{bmatrix} \}

Example

Find vectors that span the kernel of [130526116515025]\begin{bmatrix} 1 & 3 & 0 & 5 \\ 2 & 6 & 1 & 16 \\ 5 & 15 & 0 & 25 \end{bmatrix}.

[130526116515025][130500160000]\begin{bmatrix} 1 & 3 & 0 & 5 \\ 2 & 6 & 1 & 16 \\ 5 & 15 & 0 & 25 \end{bmatrix} \to \begin{bmatrix} 1 & 3 & 0 & 5 \\ 0 & 0 & 1 & 6 \\ 0 & 0 & 0 & 0 \end{bmatrix}

x2=tx_2 = t

x4=rx_4 = r

x1=3t5rx_1 = -3t - 5r

x3=6rx_3 = -6r

\begin{bmatrix} -3t - 5r \\ t \\ -6r \\ r \end{bmatrix} = t \begin{bmatrix} -3 \\ 1 \\ 0 \\ 0 \end{bmatrix} + r \begin{bmatrix} -5 \\ 0 \\ -6 \\ 1 \end{bmatrix}, so \text{ker}(A) = \text{span} \{ \begin{bmatrix} -3 \\ 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} -5 \\ 0 \\ -6 \\ 1 \end{bmatrix} \}
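The spanning vectors read off from the free variables can be sanity-checked by multiplying them back into A (a throwaway check; `matvec` is a made-up helper):

```python
A = [[1, 3, 0, 5],
     [2, 6, 1, 16],
     [5, 15, 0, 25]]

def matvec(M, v):
    """Matrix-vector product."""
    return [sum(a * x for a, x in zip(row, v)) for row in M]

# spanning vectors read off from the free variables x2 = t and x4 = r
v1 = [-3, 1, 0, 0]
v2 = [-5, 0, -6, 1]
for v in (v1, v2):
    print(matvec(A, v))   # [0, 0, 0]: both lie in ker(A)
```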

Example

Find vectors that span the kernel of [112112]\begin{bmatrix} 1 & 1 & -2 \\ -1 & -1 & 2 \end{bmatrix}

[112112][112000]\begin{bmatrix} 1 & 1 & -2 \\ -1 & -1 & 2 \end{bmatrix} \to \begin{bmatrix} 1 & 1 & -2 \\ 0 & 0 & 0 \end{bmatrix}

x1=r+2sx_1 = -r + 2s

x2=rx_2 = r

x3=sx_3 = s

[r+2srs]=r[110]+s[201]\begin{bmatrix} -r + 2s \\ r \\ s \end{bmatrix} = r \begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix} + s \begin{bmatrix} 2 \\ 0 \\ 1 \end{bmatrix} ker(A)=span{[110],[201]}\text{ker}(A) = \text{span} \{ \begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 2 \\ 0 \\1 \end{bmatrix} \}

Properties of the kernel:

  • 0ker(A)\vec{0} \in \text{ker}\left( A \right)
  • If v1\vec{v}_1, v2ker(A)\vec{v}_2 \in \text{ker}\left( A \right), then v1+v2ker(A)\vec{v}_1 + \vec{v}_2 \in \text{ker}\left( A \right). Closed under addition.
  • If vker(A)\vec{v} \in \text{ker}\left( A \right) then kvker(A)k\vec{v} \in \text{ker}\left( A \right). Closed under scalar multiplication

Proof:

  • A0=0A\vec{0} = \vec{0}
  • If Av1=0A\vec{v}_1 = \vec{0} and Av2=0A\vec{v}_2 = \vec{0}, then A(v1+v2)=Av1+Av2=0+0=0A \left( \vec{v}_1 + \vec{v}_2\right) = A\vec{v}_1 + A \vec{v}_2 = \vec{0} + \vec{0} = \vec{0}
  • If Av=0A\vec{v} = \vec{0}, then A(kv)=kAv=k0=0A\left( k\vec{v} \right) = kA\vec{v} = k\vec{0} = \vec{0}.

Give as few vectors as possible!!

Example

A=[1339]A = \begin{bmatrix} 1 & -3 \\ -3 & 9 \end{bmatrix}

rref(A)=[1300]\text{rref}(A) = \begin{bmatrix} 1 & -3 \\ 0 & 0 \end{bmatrix}

x[13]+y[39]=(x3y)[13]x \begin{bmatrix} 1 \\ -3 \end{bmatrix} + y \begin{bmatrix} -3 \\ 9 \end{bmatrix} = \left( x - 3y \right) \begin{bmatrix} 1 \\ -3 \end{bmatrix}

im(A)=span([13])\text{im}(A) = \text{span}\left( \begin{bmatrix} 1 \\ -3 \end{bmatrix} \right)

Example

A=[111222001131]A = \begin{bmatrix} 1 & -1 & 1 & 2 \\ -2 & 2 & 0 & 0 \\ -1 & 1 & 3 & 1 \end{bmatrix}

rref(A)=[110000100001]\text{rref}\left( A \right) = \begin{bmatrix} 1 & -1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}

im(A)=span{[121],[121],[103],[201]}\text{im}\left( A \right) = \text{span} \{ \begin{bmatrix} 1 \\ -2 \\ -1 \end{bmatrix}, \begin{bmatrix} -1 \\ 2 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \\ 3 \end{bmatrix}, \begin{bmatrix} 2 \\ 0 \\ 1 \end{bmatrix} \}

im(A)=span{[121],[103],[201]}\text{im}\left( A \right) = \text{span} \{ \begin{bmatrix} 1 \\ -2 \\ -1 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \\ 3 \end{bmatrix}, \begin{bmatrix} 2 \\ 0 \\ 1 \end{bmatrix} \}

Careful: Make sure you use columns in AA corresponding to leading 1’s in rref\text{rref}.
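That rule — take the columns of A sitting at the pivot positions of \text{rref}(A), not the columns of \text{rref}(A) itself — can be sketched as follows (an illustration; `rref` and `image_basis` are made-up helper names):

```python
from fractions import Fraction

def rref(rows):
    """Reduced row-echelon form via Gauss-Jordan elimination."""
    M = [[Fraction(x) for x in row] for row in rows]
    nrows, ncols = len(M), len(M[0])
    pivot_row = 0
    for col in range(ncols):
        pr = next((r for r in range(pivot_row, nrows) if M[r][col] != 0), None)
        if pr is None:
            continue
        M[pivot_row], M[pr] = M[pr], M[pivot_row]
        pivot = M[pivot_row][col]
        M[pivot_row] = [x / pivot for x in M[pivot_row]]
        for r in range(nrows):
            if r != pivot_row and M[r][col] != 0:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[pivot_row])]
        pivot_row += 1
        if pivot_row == nrows:
            break
    return M

def image_basis(A):
    """Columns of A at the pivot (leading 1) positions of rref(A)."""
    pivots = []
    for row in rref(A):
        j = next((j for j, x in enumerate(row) if x != 0), None)
        if j is not None:
            pivots.append(j)
    return [[row[j] for row in A] for j in pivots]

A = [[1, -1, 1, 2],
     [-2, 2, 0, 0],
     [-1, 1, 3, 1]]
print(image_basis(A))   # columns 1, 3, 4 of A, as in the example above
```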

Example

A=[123123123123]A = \begin{bmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \\ 1 & 2 & 3 \\ 1 & 2 & 3 \end{bmatrix}

rref(A)=[123000000000]\text{rref}\left( A \right) = \begin{bmatrix} 1 & 2 & 3 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}

im(A)=span{[1111]}span{[1000]}=im(rref(A))\text{im}\left( A \right) = \text{span}\{ \begin{bmatrix} 1 \\ 1 \\ 1\\ 1 \end{bmatrix} \} \neq \text{span} \{ \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix} \} = \text{im} \left( \text{rref} \left( A \right) \right)

Note: im(T)\text{im}\left( T \right) or im(A)\text{im}\left( A \right) is a subspace of Rn\mathbb{R}^{n}.

  • 0im(A)\vec{0} \in \text{im}\left( A \right)
  • Closed under addition and scalar multiplication

Exercise

I3=[100010001]I_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}. What is ker(I3)\text{ker}\left( I_3 \right) and im(I3)\text{im}\left( I_3 \right)?

ker(I3)={0}\text{ker}\left( I_3 \right) = \{ \vec{0} \}

im(I3)=R3\text{im}\left( I_3 \right) = \mathbb{R}^{3}

Generally, if AA is n×nn\times n matrix,

im(A)=Rn\text{im}\left( A \right) = \mathbb{R}^{n} if and only if ker(A)={0}\text{ker}\left( A \right) = \{ \vec{0} \} if and only if AA is invertible.

A linear transformation T:RnRnT : \mathbb{R}^{n} \to \mathbb{R}^{n} is invertible if and only if:

  1. The equation T(x)=bT \left( \vec{x} \right) = \vec{b} has a unique solution for any bRn\vec{b} \in \mathbb{R}^{n}.
  2. The corresponding matrix AA is invertible and (TA)1=TA1\left( T_A \right) ^{-1} = T_{A^{-1}}
  3. There is a matrix BB such that AB=InAB = I_n. Here B=A1B = A^{-1}
  4. There is a matrix CC such that CA=InCA = I_n. Here C=A1C = A^{-1}.
  5. The equation Ax=bA\vec{x} = \vec{b} has a unique solution for any bRn\vec{b}\in \mathbb{R}^{n}. The unique solution is given by x=A1b\vec{x} = A^{-1} \vec{b}.
  6. The equation Ax=0A\vec{x} = \vec{0} has only the zero solution.
  7. rref(A)=In\text{rref}\left( A \right) = I_n
  8. rank(A)=n\text{rank}\left( A \right) = n
  9. The image of the transformation TT is Rn\mathbb{R}^{n}.
  10. The transformation TT is one-to-one

Basis: Spanning set with as few vectors as possible

Example

For A=[120122435112230]A = \begin{bmatrix} 1 & 2 & 0 & 1 & 2 \\ 2 & 4 & 3 & 5 & 1 \\ 1 & 2 & 2 & 3 & 0 \end{bmatrix}, we are given rref(A)=[120120xy1100000]\text{rref}\left( A \right) = \begin{bmatrix} 1 & 2 & 0 & 1 & 2\\ 0 & x & y & 1 & -1 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}.

  1. Find xx and yy.
  2. Find a basis for im(A)\text{im}\left( A \right).
  3. Find a basis for ker(A)\text{ker}\left( A \right).

Solution

  1. x=0x=0, y=1y=1
  2. im(A)=span{[121],[032]}\text{im}\left( A \right) = \text{span} \{ \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ 3 \\ 2 \end{bmatrix} \}
  3. See below

x2=tx_2 = t

x4=rx_4 = r

x5=sx_5 = s

x1=2tr2sx_1 = -2t - r - 2s

x3=r+sx_3 = -r + s

[2tr2str+srs]=t[21000]+r[10110]+s[20101]\begin{bmatrix} -2t - r -2s \\ t \\ -r+s \\ r \\ s \end{bmatrix} = t\begin{bmatrix}-2 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix} + r \begin{bmatrix} -1 \\ 0 \\ -1 \\ 1 \\ 0 \end{bmatrix} + s \begin{bmatrix} -2 \\ 0 \\ 1 \\ 0 \\ 1 \end{bmatrix}

ker(A)=span{[21000],[10110],[20101]}\text{ker}\left( A \right) = \text{span}\{ \begin{bmatrix} -2 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix} , \begin{bmatrix} -1 \\ 0 \\ -1 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} -2 \\ 0 \\ 1 \\ 0 \\ 1 \end{bmatrix} \}

3.2 Subspaces of Rn\mathbb{R}^n: Bases and Linear Independence

3.2 Subspaces of Rn\mathbb{R}^n: Bases and Linear Independence

Definition:

For WRnW \subseteq \mathbb{R}^{n}, WW is a subspace of Rn\mathbb{R}^{n} provided

  1. 0W\vec{0} \in W
  2. If v1, v2W\vec{v}_1,\ \vec{v}_2 \in W then v1+v2W\vec{v}_1 + \vec{v}_2 \in W
  3. If vW\vec{v} \in W, then kvWk\vec{v} \in W for all scalars kk.

Which are subspaces of R3\mathbb{R}^{3}?

1) Vectors [xyz]\begin{bmatrix} x \\ y \\ z \end{bmatrix} with x=yx=y.

  • 0\vec{0} is in set
  • [tta]+[ssb]=[t+st+sa+b]\begin{bmatrix} t \\ t \\ a \end{bmatrix} + \begin{bmatrix} s \\ s \\ b \end{bmatrix} = \begin{bmatrix} t + s \\ t + s \\ a+b \end{bmatrix}
  • k[tta]=[ktktka]k \begin{bmatrix} t \\ t \\ a \end{bmatrix} = \begin{bmatrix} kt \\ kt \\ ka \end{bmatrix}

Yes!

2) Vectors [xyz]\begin{bmatrix} x \\ y \\ z \end{bmatrix} with x=1x=1.

  • [000]\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} not in set

No!

3) Vectors [xyz]\begin{bmatrix} x \\ y \\ z \end{bmatrix} with xyz=0xyz = 0.

  • [101]+[110]=[211]\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} + \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix} (not in set)

No; fails property 2.

Every subspace of Rn\mathbb{R}^{n} can be written as span(v1, v2, , vm)\text{span}\left( \vec{v}_1,\ \vec{v}_2,\ \cdots ,\ \vec{v}_m \right)

Example

A=[130526116515025]A = \begin{bmatrix} 1 & 3 & 0 & 5 \\ 2 & 6 & 1 & 16 \\ 5 & 15 & 0 & 25 \end{bmatrix}

rref(A)=[130500160000]\text{rref}\left( A \right) = \begin{bmatrix} 1 & 3 & 0 & 5 \\ 0 & 0 & 1 & 6 \\ 0 & 0 & 0 & 0 \end{bmatrix}

im(A)=span{[125],[3615],[010],[51625]}\text{im}\left( A \right) = \text{span}\{ \begin{bmatrix} 1 \\ 2 \\ 5 \end{bmatrix}, \begin{bmatrix} 3 \\ 6 \\ 15 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 5 \\ 16 \\ 25 \end{bmatrix} \}

As few vectors as possible: im(A)=span{[125],[010]}\text{im}\left( A \right) = \text{span}\{\begin{bmatrix}1 \\ 2 \\ 5 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \}

Definition:

Consider vectors v1\vec{v}_1, v2\vec{v}_2, \cdots, vm\vec{v}_m in Rn\mathbb{R}^{n}.

  • Vector vi\vec{v} _{i} is redundant provided it is a linear combination of v1\vec{v} _1, v2\vec{v} _2, …, vi1\vec{v} _{i-1}. (0\vec{0} is always redundant)
  • Vectors v1\vec{v}_{1}, v2\vec{v}_2, …, vm\vec{v}_m are linearly independent provided none of them is redundant.
  • Vectors v1\vec{v}_1, v2\vec{v}_2, …, vm\vec{v}_m are linearly dependent provided at least one vector vi\vec{v}_i is redundant.

Example

{[125],[010],[3615],[51625]}\{ \begin{bmatrix} 1 \\ 2 \\ 5 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} , \begin{bmatrix} 3 \\ 6 \\ 15 \end{bmatrix} , \begin{bmatrix} 5 \\ 16 \\ 25 \end{bmatrix} \} is a linearly dependent collection because v3=3v1\vec{v}_3 = 3 \vec{v}_1 and v4=5v1+6v2\vec{v}_4 = 5\vec{v}_1 + 6 \vec{v}_2.

Linear relations:

3v1+v3=0-3 \vec{v}_1 + \vec{v}_3 = \vec{0}

5v16v2+v4=0-5 \vec{v}_1 - 6 \vec{v}_2 + \vec{v}_4 = \vec{0}

Generally, we consider linear relation c1v1+c2v2++cmvm=0c_1\vec{v}_1 + c_2\vec{v}_2 + \cdots + c_m\vec{v}_m = \vec{0}.

  • We always have a trivial relation: c1=c2=c3==cm=0c_1 = c_2 = c_3 = \cdots = c_m = 0
  • nontrivial relation: When at least one cic_i is non-zero.

Note: v1\vec{v}_1, v2\vec{v}_2, …, vm\vec{v}_m are linearly dependent if and only if there exists a nontrivial relation among v1\vec{v}_1, v2\vec{v}_2, …, vm\vec{v}_m.

This is a trivial relation:

0[51625]+0[125]+0[010]=[000]0 \begin{bmatrix} 5 \\ 16 \\ 25 \end{bmatrix} + 0 \begin{bmatrix} 1 \\ 2 \\ 5 \end{bmatrix} + 0 \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}

This is a nontrivial relation:

1[51625]5[125]6[010]=[000]1 \begin{bmatrix} 5 \\ 16 \\ 25 \end{bmatrix} - 5 \begin{bmatrix} 1 \\ 2 \\ 5 \end{bmatrix} - 6 \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}

Example

The vectors {[16],[00]}\{ \begin{bmatrix} 1 \\ 6 \end{bmatrix} , \begin{bmatrix} 0 \\ 0 \end{bmatrix} \} are linearly dependent. (0\vec{0} is never part of a linearly independent set)

0\vec{0} is redundant:

0[16]=[00]0 \begin{bmatrix} 1 \\ 6 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}

Nontrivial relation:

0[16]+10[00]=00 \begin{bmatrix} 1 \\ 6 \end{bmatrix} + 10 \begin{bmatrix} 0 \\ 0 \end{bmatrix} = \vec{0}

Example

The vectors {[16],[10]}\{\begin{bmatrix} 1 \\ 6 \end{bmatrix} , \begin{bmatrix} 1 \\ 0 \end{bmatrix} \} are linearly independent.

There are no redundant vectors. Because if c1[16]+c2[10]=[00]c_1 \begin{bmatrix} 1 \\ 6 \end{bmatrix} + c_2 \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} then 6c1+0=0    c1=06c_1 + 0 = 0 \implies c_1 = 0 and 0+c2=0    c2=00 + c_2 = 0 \implies c_2 =0

Recall from 3.1: We found a basis for im(A)\text{im}\left( A \right) by listing all columns of AA and omitting redundant vectors.

Let’s interpret a linear relation c1v1+c2v2++cmvm=0c_1 \vec{v}_1 + c_2 \vec{v}_2 + \cdots + c_m \vec{v}_m = \vec{0} as a matrix equation.

Let A=[v1v2vm]A = \begin{bmatrix} | & | & & | \\ \vec{v}_1 & \vec{v}_2 & \cdots & \vec{v}_m \\ | & | & & | \end{bmatrix}

Linear relation: A \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_m \end{bmatrix} = \vec{0}

Question: What does it mean for v1\vec{v}_1, …, vm\vec{v}_m to be linearly independent, in terms of the matrix AA?

Answer:

  • Only solution to Ax=0A\vec{x}= \vec{0} is x=0\vec{x}= \vec{0}
  • ker(A)={0}\text{ker}\left( A \right) = \{ \vec{0} \} (no free variables)
  • rank(A)=m\text{rank}\left( A \right) = m
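The rank criterion in the last bullet translates directly into a test (a sketch; `rref` and `independent` are made-up helper names):

```python
from fractions import Fraction

def rref(rows):
    """Reduced row-echelon form via Gauss-Jordan elimination."""
    M = [[Fraction(x) for x in row] for row in rows]
    nrows, ncols = len(M), len(M[0])
    pivot_row = 0
    for col in range(ncols):
        pr = next((r for r in range(pivot_row, nrows) if M[r][col] != 0), None)
        if pr is None:
            continue
        M[pivot_row], M[pr] = M[pr], M[pivot_row]
        pivot = M[pivot_row][col]
        M[pivot_row] = [x / pivot for x in M[pivot_row]]
        for r in range(nrows):
            if r != pivot_row and M[r][col] != 0:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[pivot_row])]
        pivot_row += 1
        if pivot_row == nrows:
            break
    return M

def independent(vectors):
    """True iff the vectors are linearly independent (rank(A) == m)."""
    A = [list(col) for col in zip(*vectors)]   # vectors become the columns of A
    rank = sum(1 for row in rref(A) if any(x != 0 for x in row))
    return rank == len(vectors)

print(independent([[1, 6], [1, 0]]))                  # True
print(independent([[1, 1, 1], [1, 2, 3], [1, 4, 7]]))  # False
```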

Linearly Dependent Collections of Vectors

{[71],[142]}\{ \begin{bmatrix} 7 \\ 1 \end{bmatrix}, \begin{bmatrix} 14 \\ 2 \end{bmatrix} \} (2nd one is redundant)

{[121],[100],[333],[1117]}\{ \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} , \begin{bmatrix} 3 \\ 3 \\ 3 \end{bmatrix} , \begin{bmatrix} -1 \\ 11 \\ 7 \end{bmatrix} \} (4 vectors in R3\mathbb{R}^{3} are dependent)

{[0000]}\{ \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix} \} (0\vec{0} is in set)

{[3210],[3210],[10010]}\{ \begin{bmatrix} 3 \\ 2 \\ 1 \\ 0 \end{bmatrix} , \begin{bmatrix} -3 \\ -2 \\ -1 \\ 0 \end{bmatrix} , \begin{bmatrix} 1 \\ 0 \\ 0 \\ 10 \end{bmatrix} \} (2nd vector is redundant)

Linearly Independent Collections of Vectors

{[100],[120],[123]}\{ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} , \begin{bmatrix} 1 \\ 2 \\ 0 \end{bmatrix} , \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} \} (Because rank[111022003]=3\text{rank} \begin{bmatrix} 1 & 1 & 1 \\ 0 & 2 & 2 \\ 0 & 0 & 3 \end{bmatrix} = 3, it is independent)

{[4103]}\{ \begin{bmatrix} -4 \\ 1 \\ 0 \\3 \end{bmatrix} \} (No redundant vectors)

{[02103],[08713],[102106]}\{ \begin{bmatrix} 0 \\ 2 \\ 1 \\ 0 \\ 3 \end{bmatrix} , \begin{bmatrix} 0 \\ 8 \\ -7 \\ -1 \\ -3 \end{bmatrix} , \begin{bmatrix} 1 \\ 0 \\ 2 \\ 10 \\ 6 \end{bmatrix} \}

  • If c1v1+c2v2+c3v3=0c_1 \vec{v}_1 + c_2 \vec{v}_2 + c_3 \vec{v}_3 = \vec{0}, we have 0+0+c3=0    c3=00+0+c_3 = 0 \implies c_3 =0, 0c2+0=0    c2=00-c_2+0=0 \implies c_2=0, 2c1+0+0=0    c1=02c_1+0+0=0 \implies c_1=0

Example

Determine whether the vectors {[111],[123],[147]}\{ \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} , \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} , \begin{bmatrix} 1 \\ 4 \\ 7 \end{bmatrix} \} are linearly independent.

\begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 4 \\ 1 & 3 & 7 \end{bmatrix} \to \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 3 \\ 0 & 2 & 6 \end{bmatrix} \to \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 3 \\ 0 & 0 & 0 \end{bmatrix} \to \begin{bmatrix} 1 & 0 & -2 \\ 0 & 1 & 3 \\ 0 & 0 & 0 \end{bmatrix}

Therefore the rank is 2 (and therefore the vectors are linearly dependent)

[147]=2[111]+3[123]\begin{bmatrix} 1 \\ 4 \\ 7 \end{bmatrix} = -2 \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + 3 \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}

Remark: [521]=5e1+2e2+1e3\begin{bmatrix} 5 \\ 2 \\ 1 \end{bmatrix} = 5\vec{e}_1 + 2\vec{e}_2 + 1\vec{e}_3. This is the unique way of writing [521]\begin{bmatrix} 5 \\ 2 \\ 1 \end{bmatrix} in terms of basis {e1, e2, e3}\{ \vec{e}_1,\ \vec{e}_2,\ \vec{e}_3 \} of R3\mathbb{R}^{3}.

Theorem:

Suppose {v1, v2,, vm}\{ \vec{v}_1,\ \vec{v}_2, \cdots ,\ \vec{v}_m \} is a basis for a subspace WW of Rn\mathbb{R}^{n}. Then, for v\vec{v} in WW, v\vec{v} can be expressed uniquely as a linear combination of {v1, v2, , vm}\{ \vec{v}_1,\ \vec{v}_2,\ \cdots ,\ \vec{v}_m \}.

Proof: Suppose \{ \vec{v}_1 ,\cdots , \vec{v}_m \} is a basis for W and \vec{v} is in W. Since \{ \vec{v}_1 , \cdots , \vec{v}_m \} spans W, there exist c_1, c_2, \cdots , c_m with \vec{v} = c_1 \vec{v}_1 + \cdots + c_m \vec{v}_m. Suppose also \vec{v} = d_1 \vec{v}_1 + d_2 \vec{v}_2 + \cdots + d_m \vec{v}_m. We show d_i = c_i for 1 \le i \le m. Now \vec{0} = \vec{v} - \vec{v} = \left( d_1 - c_1 \right) \vec{v}_1 + \left( d_2 - c_2 \right) \vec{v}_2 + \cdots + \left( d_m - c_m \right) \vec{v}_m. As \vec{v}_1, \cdots , \vec{v}_m are linearly independent, d_1 - c_1 = d_2 - c_2 = \cdots = d_m - c_m = 0, meaning d_i = c_i for 1 \le i \le m. This shows uniqueness.

3.3 The Dimension of a Subspace of Rn\mathbb{R}^n

3.3 The Dimension of a Subspace of Rn\mathbb{R}^n

Theorem: Suppose v1, ,vp\vec{v}_1,\ \cdots , \vec{v}_p, w1,,wq\vec{w}_1 , \cdots , \vec{w}_q are vectors in a subspace WW of Rn\mathbb{R}^{n}. If

  • v1,,vp\vec{v}_1 , \cdots , \vec{v}_p are linearly independent and
  • w1,,wq\vec{w}_1 , \cdots , \vec{w}_q span WW, then pqp \le q.

Every basis for WW has the same number of vectors.

Definition: The dimension of a subspace WW, denoted dim(W)\text{dim}\left( W \right) , is the number of vectors in a basis for WW.

Example

dim(Rn)=n\text{dim}\left( \mathbb{R}^{n} \right) = n

Basis: {e1,e2,e3,,en}\{ \vec{e}_1, \vec{e}_2, \vec{e}_3, \cdots , \vec{e}_n \}

  • dim({0})=0\text{dim}\left( \{ \vec{0} \} \right) = 0 (By convention)

Example

Consider the subspace {z=0}\{ z = 0 \} in R3 \mathbb{R}^{3}. The dimension is 2 (because it’s a plane)

  • {[120]}\{ \begin{bmatrix} 1 \\ 2 \\ 0 \end{bmatrix} \} {[100],[120]}\{ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} , \begin{bmatrix} 1 \\ 2 \\ 0 \end{bmatrix} \} {[010]}\{ \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \} {[010],[510]}\{ \begin{bmatrix} 0 \\ 1\\ 0 \end{bmatrix}, \begin{bmatrix} 5 \\ 1 \\ 0 \end{bmatrix} \} (All linearly independent)

  • {[100],[710],[010]}\{ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} , \begin{bmatrix} 7 \\ -1 \\ 0 \end{bmatrix} , \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \} {[100],[010]}\{ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} , \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \} {[220],[110],[010],[000]}\{ \begin{bmatrix} 2 \\ 2 \\ 0 \end{bmatrix} , \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} , \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} , \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \} (All span subspace)

Generally, for a subspace WW of Rn\mathbb{R}^{n} with dim(W)=m\text{dim}\left( W \right) = m,

  1. We can find at most mm linearly independent vectors in WW.
  2. We need at least mm vectors to span WW.

Suppose we know dim(W)=m\text{dim}\left( W \right) = m,

  • Any collection of mm linearly independent vectors is a basis.
  • Any collection of mm vectors that span WW is a basis.

Example

Show the vectors {[1002],[0103],[0014],[2340]}\{ \begin{bmatrix} 1 \\ 0 \\ 0 \\ 2 \end{bmatrix} , \begin{bmatrix} 0 \\ 1 \\ 0 \\ 3 \end{bmatrix} , \begin{bmatrix} 0 \\ 0 \\ 1 \\ 4 \end{bmatrix} , \begin{bmatrix} 2 \\ 3 \\ 4 \\ 0 \end{bmatrix} \} form a basis for R4\mathbb{R}^{4}.

dim(R4)=4\text{dim}\left( \mathbb{R}^{4} \right) = 4

[1002010300142340][10020103001400029]\begin{bmatrix} 1 & 0 & 0 & 2\\ 0 & 1 & 0 & 3 \\ 0 & 0 & 1 & 4 \\ 2 & 3 & 4 & 0 \end{bmatrix} \to \begin{bmatrix} 1 & 0 & 0 & 2 \\ 0 & 1 & 0 & 3 \\ 0 & 0 & 1 & 4 \\ 0 & 0 & 0 & -29 \end{bmatrix}

rank(A)=4\text{rank}\left( A \right) = 4

Therefore vectors are independent and hence a basis.

We see in the above example: Vectors v1,,vn\vec{v}_1 , \cdots , \vec{v}_n form a basis for Rn\mathbb{R}^{n} if and only if:

[v1v2vn]\begin{bmatrix} | & | & & | \\ \vec{v}_1 & \vec{v}_2 & \cdots & \vec{v}_n \\ | & | & & | \end{bmatrix} is invertible.

  • This gives yet another characterization of invertible matrices

Rank-Nullity Theorem

Let AA be an n×mn\times m matrix.

dim(ker(A))+dim(im(A))=m\text{dim}\left( \text{ker}\left( A \right) \right) + \text{dim}\left( \text{im}\left( A \right) \right) = m

  • dim(ker(A))\text{dim}\left( \text{ker}\left( A \right) \right) is called the nullity of the matrix
  • dim(im(A))\text{dim}\left( \text{im}\left( A \right) \right) is rank of matrix

Restated: rank(A)+nullity(A)=m\text{rank}\left( A \right) + \text{nullity}\left( A \right) = m (Number of columns)

Recall: For A=[120122435112230]A = \begin{bmatrix} 1 & 2 & 0 & 1 & 2 \\ 2 & 4 & 3 & 5 & 1 \\ 1 & 2 & 2 & 3 & 0 \end{bmatrix},

  • Basis for im(A)\text{im}\left( A \right) : {[121],[032]}\{ \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} , \begin{bmatrix} 0 \\ 3 \\ 2 \end{bmatrix} \} (dim2\text{dim} 2)
  • Basis for ker(A)\text{ker}\left( A \right) : {[21000],[10110],[20101]}\{ \begin{bmatrix} -2 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix} , \begin{bmatrix} -1 \\ 0 \\ -1 \\ 1 \\ 0 \end{bmatrix} , \begin{bmatrix} -2 \\ 0 \\ 1 \\ 0 \\ 1 \end{bmatrix} \} (dim3\text{dim} 3)

2+3=52+3 = 5
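The recalled example can be verified numerically (a sketch; `rref` is an assumed helper, written here with exact fractions):

```python
from fractions import Fraction

def rref(rows):
    """Reduced row-echelon form via Gauss-Jordan elimination."""
    M = [[Fraction(x) for x in row] for row in rows]
    nrows, ncols = len(M), len(M[0])
    pivot_row = 0
    for col in range(ncols):
        pr = next((r for r in range(pivot_row, nrows) if M[r][col] != 0), None)
        if pr is None:
            continue
        M[pivot_row], M[pr] = M[pr], M[pivot_row]
        pivot = M[pivot_row][col]
        M[pivot_row] = [x / pivot for x in M[pivot_row]]
        for r in range(nrows):
            if r != pivot_row and M[r][col] != 0:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[pivot_row])]
        pivot_row += 1
        if pivot_row == nrows:
            break
    return M

A = [[1, 2, 0, 1, 2],
     [2, 4, 3, 5, 1],
     [1, 2, 2, 3, 0]]
m = len(A[0])                                            # number of columns
rank = sum(1 for row in rref(A) if any(x != 0 for x in row))
nullity = m - rank
print(rank, nullity, rank + nullity)                     # 2 3 5
```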

Example

Suppose we have a linear transformation T:R5R3T : \mathbb{R}^{5} \to \mathbb{R}^{3}.

What are possible values for dim(ker(T))\text{dim}\left( \text{ker}\left( T \right) \right) ?

AA is 3×53\times 5

rank(A)3\text{rank}\left( A \right) \le 3

rank(A)+dim(ker(T))=5\text{rank}\left( A \right) + \text{dim}\left( \text{ker}\left( T \right) \right) = 5

(Cannot be one-to-one)

Answer: 2, 3, 4, or 5

rank(A)  nullity(A)
0        5
1        4
2        3
3        2

Example

Suppose we have a linear transformation T:R4R7T : \mathbb{R}^{4} \to \mathbb{R}^{7}.

What are possible values for dim(im(T))\text{dim}\left( \text{im}\left( T \right) \right)?

AA is 7×47\times 4

rank(A)4\text{rank}\left( A \right) \le 4

Answer: 0, 1, 2, 3, 4

Test 1 Preparation

Test 1 Preparation

Sample Test 1

1) Suppose T1, T2:R2R2T_1,\ T_2 : \mathbb{R}^{2} \to \mathbb{R}^{2} are linear transformations such that

  • T1T_1 is orthogonal projection onto the line y=3xy=-3x.
  • T2T_2 is scaling by a factor of 5

a) Find the matrix AA of the transformation T2T1T_2T_1. Show your work

Solution

L=span{[13]}L = \text{span} \{ \begin{bmatrix} 1 \\ -3 \end{bmatrix} \}

5112+(3)2[1339]=12[1339]5 \frac{1}{1^2 + (-3)^2} \begin{bmatrix} 1 & -3 \\ -3 & 9 \end{bmatrix} = \frac{1}{2} \begin{bmatrix} 1 & -3 \\ -3 & 9 \end{bmatrix}

b) Determine whether or not the transformation T:R2R2T : \mathbb{R}^{2} \to \mathbb{R}^{2} given by T([xy])=[2xy2x]T \left( \begin{bmatrix} x \\ y \end{bmatrix} \right) = \begin{bmatrix} 2 - x \\ y - 2x \end{bmatrix} is a linear transformation. If so, find its associated matrix. If not, give a reason as to why not.

Solution

T(0)=[20]0T\left( \vec{0} \right) = \begin{bmatrix} 2 \\ 0 \end{bmatrix} \neq \vec{0}

Therefore TT is not a linear transformation.

2) For which values of a,b,c,da,b,c,d and ee is the following matrix in reduced row-echelon form? Choose an answer from 0, 1, any real number. No explanation needed

A=[1ab9070c010d0e0019]A = \begin{bmatrix} 1 & a & b & 9 & 0 & 7 \\ 0 & c & 0 & 1 & 0 & d \\ 0 & e & 0 & 0 & 1 & 9 \end{bmatrix}

Solution

a=0a = 0

b=b = any

c=1c=1

d=d = any

e=0e=0

3) Write b=[1002]\vec{b} = \begin{bmatrix} 10 \\ 0 \\ 2 \end{bmatrix} as a linear combination of v1=[121]\vec{v}_1 = \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} and v2=[432]\vec{v}_2 = \begin{bmatrix} 4 \\ 3 \\ 2 \end{bmatrix}. Show your work

Solution

Find x1x_1, x2x_2 with x1[121]+x2[432]=[1002]x_1 \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} + x_2 \begin{bmatrix} 4 \\ 3 \\ 2 \end{bmatrix} = \begin{bmatrix} 10 \\ 0 \\ 2 \end{bmatrix}.

[1410230122][14100520028]\begin{bmatrix} 1 & 4 & | & 10 \\ 2 & 3 & | & 0 \\ 1 & 2 & | & 2 \end{bmatrix} \to \begin{bmatrix} 1 & 4 & | & 10 \\ 0 & -5 & | & -20 \\ 0 & -2 & | & -8 \end{bmatrix} [1410014028][106014000]\to \begin{bmatrix} 1 & 4 & | & 10 \\ 0 & 1 & | & 4 \\ 0 & -2 & | & -8 \end{bmatrix} \to \begin{bmatrix} 1 & 0 & | & -6 \\ 0 & 1 & | & 4 \\ 0 & 0 & | & 0 \end{bmatrix}

x1=6x_1 = -6

x2=4x_2 = 4

b=6[121]+4[432]\vec{b} = -6 \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} + 4 \begin{bmatrix} 4 \\ 3 \\ 2 \end{bmatrix}

4) Find all upper triangular 3×33\times 3 matrices [abc0de00f]\begin{bmatrix} a & b & c \\ 0 & d & e \\ 0 & 0 & f \end{bmatrix} that commute with [001020100]\begin{bmatrix} 0 & 0 & -1 \\ 0 & 2 & 0 \\ 1 & 0 & 0 \end{bmatrix}. Show your work

Solution

\begin{bmatrix} a & b & c \\ 0 & d & e \\ 0 & 0 & f \end{bmatrix} \begin{bmatrix} 0 & 0 & -1 \\ 0 & 2 & 0 \\ 1 & 0 & 0 \end{bmatrix} = \begin{bmatrix} c & 2b & -a \\ e & 2d & 0 \\ f & 0 & 0 \end{bmatrix} and \begin{bmatrix} 0 & 0 & -1 \\ 0 & 2 & 0 \\ 1 & 0 & 0 \end{bmatrix} \begin{bmatrix} a & b & c \\ 0 & d & e \\ 0 & 0 & f \end{bmatrix} = \begin{bmatrix} 0 & 0 & -f \\ 0 & 2d & 2e \\ a & b & c \end{bmatrix}

b=c=e=0b=c=e=0

a=fa=f

d=dd=d

Answer: [a000d000a]\begin{bmatrix} a & 0 & 0 \\ 0 & d & 0 \\ 0 & 0 & a \end{bmatrix} a,dRa, d \in \mathbb{R}
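The answer can be spot-checked by multiplying both ways for sample values of a and d (a throwaway verification; a = 3, d = 5 are arbitrary choices):

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

B = [[0, 0, -1], [0, 2, 0], [1, 0, 0]]
a, d = 3, 5                               # arbitrary sample values
M = [[a, 0, 0], [0, d, 0], [0, 0, a]]
print(matmul(M, B) == matmul(B, M))       # True: M commutes with B

# a generic upper triangular matrix (here with b = 1) does not commute with B
N = [[1, 1, 0], [0, 1, 0], [0, 0, 1]]
print(matmul(N, B) == matmul(B, N))       # False
```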

5) Suppose AA is 2×32\times 3, BB is 3×33\times 3, CC is 3×23\times 2, and DD is 2×12\times 1. Which of the following matrix operations are defined? No justification needed

AC+B;CA;CB;BCD;A(B+C)AC+B; CA; CB; BCD; A(B+C)

Solution

CACA and BCDBCD are defined.

6) Let A=[1340010000020010]A = \begin{bmatrix} 1 & 3 & 4 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 2 \\ 0 & 0 & 1 & 0 \end{bmatrix}. Show your work

a) Use Elementary Row Operations to find A1A^{-1}.

Solution

A1=[13040100000100120]A^{-1} = \begin{bmatrix} 1 & -3 & 0 & -4 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & \frac{1}{2} & 0 \end{bmatrix}

b) Use part (a) to find all solutions to the linear system Ax=[0200]A\vec{x} = \begin{bmatrix} 0 \\ 2 \\ 0 \\ 0 \end{bmatrix}.

Solution

x=A1b\vec{x} = A^{-1}\vec{b}

[13040100000100120][0200]\begin{bmatrix} 1 & -3 & 0 & -4 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & \frac{1}{2} & 0 \end{bmatrix} \begin{bmatrix} 0 \\ 2\\ 0 \\ 0 \end{bmatrix}

x=[6200]\vec{x} = \begin{bmatrix} -6 \\ 2 \\ 0 \\ 0 \end{bmatrix}

7) Let A = \begin{bmatrix} 1 & 3 & 0 & 2 & 5 \\ 2 & 6 & 1 & -2 & 4 \\ 3 & 9 & 1 & 0 & 9 \end{bmatrix}. (Suppose we already know \text{rref}\left( A \right) = \begin{bmatrix} 1 & 3 & 0 & 2 & 5 \\ 0 & 0 & 1 & -6 & -6 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}).

a) Find vectors that span the kernel of AA. Show your work

Solution

x1=3t2r5sx_1 = -3t - 2r - 5s

x2=tx_2 = t

x3=6r+6sx_3 = 6r + 6s

x4=rx_4 = r

x5=sx_5 = s

[3t2r5st6r+6srs]=t[31000]+r[20610]+s[50601]\begin{bmatrix} -3t-2r-5s \\ t \\ 6r + 6s \\ r \\ s \end{bmatrix} = t \begin{bmatrix} -3 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix} + r \begin{bmatrix} -2 \\ 0 \\ 6 \\ 1 \\0 \end{bmatrix} + s \begin{bmatrix} -5 \\ 0 \\ 6 \\ 0 \\1 \end{bmatrix}

Answer: [31000]\begin{bmatrix} -3 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}, [20610]\begin{bmatrix} -2 \\ 0 \\ 6 \\ 1 \\ 0 \end{bmatrix}, [50601]\begin{bmatrix} -5 \\ 0 \\ 6 \\ 0 \\ 1 \end{bmatrix}

b) Find vectors that span the image of AA

Solution

[123]\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}, [011]\begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}

8) True or false.

a) If AA is an n×nn\times n matrix and A4=AA^{4} = A, then A3=InA^{3} = I_n.

Solution

A=[0000]A= \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}

A4=AA^{4} = A and A3I2A^{3}\neq I_2

False

b) If v\vec{v} and w\vec{w} in Rn\mathbb{R}^{n} are solutions of Ax=bA\vec{x}=\vec{b}, where b0\vec{b} \neq \vec{0}, then v+w\vec{v}+\vec{w} is also a solution of Ax=bA\vec{x}= \vec{b}.

Solution

Av=bA\vec{v} = \vec{b} and Aw=bA\vec{w}= \vec{b} where b0\vec{b}\neq \vec{0}

A\left( \vec{v} + \vec{w} \right) = A\vec{v} + A\vec{w} = \vec{b} + \vec{b} = 2\vec{b} \neq \vec{b} since \vec{b}\neq \vec{0}

False

d) There exists a rank 2 matrix AA with A[17]=[210]A \begin{bmatrix} 1 \\ -7 \end{bmatrix} = \begin{bmatrix} 2 \\ -1 \\ 0 \end{bmatrix}.

Solution

A[17]=[210]A \begin{bmatrix} 1 \\ -7 \end{bmatrix} = \begin{bmatrix} 2 \\ -1 \\ 0 \end{bmatrix}.

[2001700][17]=[210]\begin{bmatrix} 2 & 0 \\ 0 & \frac{1}{7}\\ 0 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ -7 \end{bmatrix} = \begin{bmatrix} 2 \\ -1 \\ 0 \end{bmatrix}

True

e) For any 6×26\times 2 matrix AA the system Ax=0A\vec{x} = \vec{0} is consistent

Solution

For any n×mn\times m matrix AA, Ax=0A\vec{x}=\vec{0} is consistent. (A0=0A\vec{0} = \vec{0} )

5.1 Orthogonal Projections and Orthonormal Bases


Recall: Geometry of Vectors

  • v\vec{v}, w\vec{w} in Rn\mathbb{R}^{n} are orthogonal provided they are perpendicular (vw=0\vec{v}\cdot \vec{w} = 0)
  • The length of v\vec{v} is v=vv \mid \mid \vec{v} \mid \mid = \sqrt{\vec{v} \cdot \vec{v}}. Note v2=vv \mid \mid \vec{v} \mid \mid ^{2} = \vec{v}\cdot \vec{v}
  • Distance between v\vec{v} and w\vec{w} in Rn\mathbb{R}^{n} is vw \mid \mid \vec{v} - \vec{w} \mid \mid (this is used in section 5.4).
  • Geometry and the dot product vw=vwcos(θ)\vec{v} \cdot \vec{w} = \mid \mid \vec{v} \mid \mid \cdot \mid \mid \vec{w} \mid \mid \cos \left( \theta \right) where θ\theta is the angle between v\vec{v} and w\vec{w} (0θπ0 \le \theta \le \pi).
  • For v\vec{v}, w\vec{w}, nonzero in Rn\mathbb{R}^{n}, the angle between v\vec{v} and w\vec{w} is θ=cos1(vwvw)\theta = \cos ^{-1} \left( \frac{\vec{v}\cdot \vec{w}}{ \mid \mid \vec{v} \mid \mid \mid \mid \vec{w} \mid \mid } \right) (Note that the range of cos1()\cos ^{-1} (\cdots ) is [0, π][0,\ \pi])

Example

v=[202]\vec{v} = \begin{bmatrix} 2 \\ 0 \\ 2 \end{bmatrix} and w=[110]\vec{w} = \begin{bmatrix} 1 \\ 1\\ 0 \end{bmatrix}

1) Find the angle between v\vec{v} and w\vec{w}. 2) Find the distance between v\vec{v} and w\vec{w}.

Solution

1) vw=2+0+0=2\vec{v} \cdot \vec{w} = 2 + 0 + 0 = 2

v=4+4=22 \mid \mid \vec{v} \mid \mid = \sqrt{4 + 4} = 2 \sqrt{2}

w=1+1=2 \mid \mid \vec{w} \mid \mid = \sqrt{1 + 1} = \sqrt{2}

θ=cos1(222(2))=cos1(12)\theta = \cos ^{-1} \left( \frac{2}{2\sqrt{2} \left( \sqrt{2} \right) } \right) = \cos ^{-1} \left( \frac{1}{2} \right)

θ=π3\therefore \theta = \frac{\pi}{3}

2)

vw=[112]\vec{v} - \vec{w} = \begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix}

vw=1+1+4=6 \mid \mid \vec{v} - \vec{w} \mid \mid = \sqrt{1 + 1 + 4} = \sqrt{6}
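Both parts of this example translate directly into code; a small sketch using only the standard math module:

```python
import math

def dot(u, w):
    return sum(a * b for a, b in zip(u, w))

def norm(u):
    return math.sqrt(dot(u, u))

v = [2, 0, 2]
w = [1, 1, 0]

# angle between v and w: theta = arccos(v.w / (||v|| ||w||))
theta = math.acos(dot(v, w) / (norm(v) * norm(w)))  # pi/3

# distance between v and w: ||v - w||
dist = norm([a - b for a, b in zip(v, w)])  # sqrt(6)
```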

Remark: For v\vec{v} and w\vec{w} in Rn\mathbb{R}^{n}, v\vec{v} and w\vec{w} are orthogonal if and only if v+w2=v2+w2 \mid \mid \vec{v} + \vec{w} \mid \mid ^{2} = \mid \mid \vec{v} \mid \mid ^{2} + \mid \mid \vec{w} \mid \mid ^{2}

c2=a2+b2c^{2} = a^{2} + b^{2}

lec10-fig1

Definition:

Vectors {u1,u2,,um}\{ \vec{u} _{1}, \vec{u} _{2}, \cdots , \vec{u} _{m} \} in Rn\mathbb{R}^{n} form an orthonormal collection of vectors provided

  1. Each vector \vec{u}_i is a unit vector: \mid \mid \vec{u}_i \mid \mid = 1, equivalently \vec{u}_i \cdot \vec{u}_i = 1
  2. Vectors are pairwise orthogonal

{u1,u2,,um}\{ \vec{u}_1, \vec{u}_2, \cdots , \vec{u}_m \} are orthonormal if and only if uiuj={0ij1i=j\vec{u}_i \cdot \vec{u}_j = \begin{cases} 0 & i \neq j \\ 1 & i =j\end{cases}

Example

In R3\mathbb{R}^{3}, {e1,e2,e3}\{ \vec{e}_1 , \vec{e}_2 , \vec{e}_3 \} and {[22022],[22022]}\{ \begin{bmatrix} \frac{\sqrt{2} }{2} \\ 0 \\ \frac{\sqrt{2} }{2} \end{bmatrix} , \begin{bmatrix} -\frac{\sqrt{2} }{2} \\ 0 \\ \frac{\sqrt{2} }{2} \end{bmatrix} \}

u1u2=0\vec{u}_1 \cdot \vec{u}_2 = 0

uiui=(22)2+(22)2=12+12=1\vec{u}_i \cdot \vec{u}_i = \left( \frac{\sqrt{2} }{2} \right) ^{2} + \left( \frac{\sqrt{2} }{2} \right) ^{2} = \frac{1}{2} + \frac{1}{2} = 1

Theorem: Orthonormal vectors are linearly independent.

Proof: Suppose \{ \vec{u}_1 , \vec{u}_2, \cdots , \vec{u}_m \} are orthonormal and c_1 \vec{u}_1 + c_2 \vec{u}_2 + \cdots + c_m \vec{u}_m = \vec{0}. Show c_1 = c_2 = \cdots = c_m = 0

Fix i: Show ci=0c_{i} = 0 : ui(c1u1+c2u2++cmum)=ui0=0\vec{u}_i \cdot \left( c_1 \vec{u}_1 + c_2 \vec{u}_2 + \cdots + c_m \vec{u}_m \right) = \vec{u}_i \cdot \vec{0} = 0

Rewrite LHS

c_1 \left( \vec{u}_i \cdot \vec{u}_1 \right) + c_2 \left( \vec{u}_i \cdot \vec{u}_2 \right) + \cdots + c_i \left( \vec{u}_i \cdot \vec{u}_i \right) + \cdots + c_m \left( \vec{u}_i \cdot \vec{u}_m \right) = 0

We get: ci1=0c_i \cdot 1 = 0. Therefore ci=0c_i = 0.

Therefore, c1=c2=c3==cm=0c_1 = c_2 = c_3 = \cdots = c_m = 0

Note: Really just needed orthogonal and nonzero.

A collection \{ \vec{u}_1 , \vec{u}_2 , \cdots , \vec{u}_n \} of n orthonormal vectors in \mathbb{R}^{n} forms a basis for \mathbb{R}^{n}.

dim(Rn)=n\text{dim}\left( \mathbb{R}^{n} \right) = n. nn linearly independent vectors are a basis. This is called an orthonormal basis.

Examples

  • The columns of the rotation matrix \begin{bmatrix} \frac{5}{13} & \frac{12}{13} \\ -\frac{12}{13} & \frac{5}{13} \end{bmatrix} form an orthonormal basis for \mathbb{R}^{2}.
  • The columns of the reflection matrix \begin{bmatrix} -\frac{7}{25} & -\frac{24}{25} \\ -\frac{24}{25} & \frac{7}{25} \end{bmatrix} form an orthonormal basis for \mathbb{R}^{2}.

Given an orthogonal basis, we may normalize the vectors to obtain an orthonormal basis.

Example

Normalize the basis for R3\mathbb{R}^{3}: {[121],[210],[3615]}\{ \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}, \begin{bmatrix} -2 \\ 1 \\ 0 \end{bmatrix} , \begin{bmatrix} 3 \\ 6 \\ -15 \end{bmatrix} \}.

v1=1+4+1=6 \mid \mid \vec{v}_1 \mid \mid = \sqrt{1+4+1} = \sqrt{6}

v2=4+1=5 \mid \mid \vec{v}_2 \mid \mid = \sqrt{4 + 1} = \sqrt{5}

v3=9+36+225=270=330 \mid \mid \vec{v}_3 \mid \mid = \sqrt{9 + 36 + 225} = \sqrt{270} = 3 \sqrt{30}

{[162616],[25150],[130230530]}\{ \begin{bmatrix} \frac{1}{\sqrt{6} } \\ \frac{2}{\sqrt{6} } \\ \frac{1}{\sqrt{6} } \end{bmatrix} , \begin{bmatrix} -\frac{2}{\sqrt{5} } \\ \frac{1}{\sqrt{5} }\\ 0 \end{bmatrix} , \begin{bmatrix} \frac{1}{\sqrt{30} }\\ \frac{2}{\sqrt{30} }\\ -\frac{5}{\sqrt{30} } \end{bmatrix} \}
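Normalizing a basis is a one-liner per vector; a minimal sketch in plain Python for this example:

```python
import math

def dot(u, w):
    return sum(a * b for a, b in zip(u, w))

def normalize(u):
    """Divide a vector by its length."""
    n = math.sqrt(dot(u, u))
    return [x / n for x in u]

basis = [[1, 2, 1], [-2, 1, 0], [3, 6, -15]]
unit_basis = [normalize(v) for v in basis]
# every vector is now unit and the pairs stay orthogonal
```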

Orthogonal Projections: Recall: If L=span{w}L = \text{span} \{ \vec{w} \} where w0\vec{w}\neq \vec{0} in Rn\mathbb{R}^{n}.

  • The orthogonal projection of x\vec{x} onto LL is projL(x)=x=(xwww)w\text{proj}_{L}\left( \vec{x} \right) = \vec{x}^{\parallel} = \left( \frac{\vec{x}\cdot \vec{w}}{\vec{w} \cdot \vec{w}} \right) \vec{w}
  • The component of x\vec{x} orthogonal to LL is x=xx=xprojL(x)\vec{x}^{\bot} = \vec{x} - \vec{x}^{\parallel} = \vec{x} - \text{proj}_{L} \left( \vec{x} \right)

Note: If L = \text{span}\{ \vec{u} \} where \vec{u} is unit, then \text{proj}_{L}\left( \vec{x} \right) = \left( \vec{x}\cdot \vec{u} \right) \vec{u}.

Orthogonal Projection onto a subspace VV of Rn\mathbb{R}^n.

Let x\vec{x} be in Rn\mathbb{R}^{n} and VV a subspace of Rn\mathbb{R}^{n}. We may write x=x+x\vec{x} = \vec{x}^{\bot} + \vec{x}^{\parallel} where x=projV(x)\vec{x}^{\parallel} = \text{proj}_V \left( \vec{x} \right) is in VV.

Suppose {u1,u2,,um}\{ \vec{u}_1, \vec{u}_2 , \cdots , \vec{u}_m \} is an orthonormal basis for VV then projV(x)=(xu1)u1+(xu2)u2++(xum)um\text{proj}_V \left( \vec{x} \right) = \left( \vec{x} \cdot \vec{u}_1 \right) \vec{u}_1 + \left( \vec{x} \cdot \vec{u}_2 \right) \vec{u}_2 + \cdots + \left( \vec{x} \cdot \vec{u}_m \right) \vec{u}_m

Example

Find the orthogonal projection of e1\vec{e}_1 onto the subspace VV of R4\mathbb{R}^{4} spanned by {[1111],[1111],[1111]}\{ \begin{bmatrix} 1 \\ 1\\ 1 \\ 1 \end{bmatrix} , \begin{bmatrix} 1 \\ 1 \\ -1 \\ -1 \end{bmatrix} , \begin{bmatrix} 1 \\ -1 \\ -1 \\ 1 \end{bmatrix} \}.

vi=1+1+1+1=2 \mid \mid \vec{v}_i \mid \mid = \sqrt{1 + 1 + 1 + 1} = 2

projV(e1)=(u1e1)u1+(u2e1)u2+(u3e1)u3\text{proj}_V \left( \vec{e}_1 \right) = \left( \vec{u}_1 \cdot \vec{e}_1 \right) \vec{u}_1 + \left( \vec{u}_2 \cdot \vec{e}_1 \right) \vec{u}_2 + \left( \vec{u}_3 \cdot \vec{e}_1 \right) \vec{u}_3

=12[12121212]+12[12121212]+12[12121212]=[34141414]= \frac{1}{2} \begin{bmatrix} \frac{1}{2} \\ \frac{1}{2} \\ \frac{1}{2} \\ \frac{1}{2} \end{bmatrix} + \frac{1}{2} \begin{bmatrix} \frac{1}{2} \\ \frac{1}{2} \\ -\frac{1}{2} \\ -\frac{1}{2} \end{bmatrix} + \frac{1}{2} \begin{bmatrix} \frac{1}{2} \\ -\frac{1}{2} \\ -\frac{1}{2} \\ \frac{1}{2} \end{bmatrix} = \begin{bmatrix} \frac{3}{4} \\ \frac{1}{4} \\ -\frac{1}{4} \\ \frac{1}{4} \end{bmatrix} in VV

Note: e1=e1projV(e1)\vec{e}_1^{\bot} = \vec{e}_1 - \text{proj}_V \left( \vec{e}_1 \right)

=[1000][34141414]=[14141414]= \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix} - \begin{bmatrix} \frac{3}{4} \\ \frac{1}{4} \\ -\frac{1}{4} \\ \frac{1}{4} \end{bmatrix} = \begin{bmatrix} \frac{1}{4} \\ -\frac{1}{4} \\ \frac{1}{4} \\ -\frac{1}{4} \end{bmatrix}

This is orthogonal to u1\vec{u}_1, u2\vec{u}_2, u3\vec{u}_3 and every vector in VV.
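The projection formula is easy to check numerically; a minimal sketch in plain Python (the 1/2 entries are written as 0.5):

```python
def dot(u, w):
    return sum(a * b for a, b in zip(u, w))

# orthonormal basis for V (each original spanning vector had length 2)
us = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, 0.5, -0.5, -0.5],
    [0.5, -0.5, -0.5, 0.5],
]
e1 = [1, 0, 0, 0]

# proj_V(x) = (x.u1)u1 + (x.u2)u2 + (x.u3)u3
proj = [0.0] * 4
for u in us:
    c = dot(e1, u)
    proj = [p + c * ui for p, ui in zip(proj, u)]
# proj == [0.75, 0.25, -0.25, 0.25]

# the leftover component is orthogonal to every basis vector of V
perp = [a - b for a, b in zip(e1, proj)]
```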

Note: if x\vec{x} is in VV then projV(x)=x\text{proj}_V \left( \vec{x} \right) = \vec{x}

Example

x=[1100]\vec{x} = \begin{bmatrix} 1 \\ 1 \\ 0 \\ 0 \end{bmatrix} is in V=span{[1111],[1111],[1111]}V = \text{span} \{ \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} , \begin{bmatrix} 1 \\ 1 \\ -1 \\ -1 \end{bmatrix} , \begin{bmatrix} 1 \\ -1 \\ - 1\\ 1 \end{bmatrix} \}. Show projV(x)=x\text{proj}_V \left( \vec{x} \right) = \vec{x}.

\text{proj}_V \left( \vec{x} \right) = \left( \vec{x} \cdot \vec{u}_1 \right) \vec{u}_1 + \left( \vec{x} \cdot \vec{u}_2 \right) \vec{u}_2 + \left( \vec{x} \cdot \vec{u}_3 \right) \vec{u}_3

=1[12121212]+1[12121212]+0[12121212]=[1100]= 1 \begin{bmatrix} \frac{1}{2} \\ \frac{1}{2} \\ \frac{1}{2} \\ \frac{1}{2} \end{bmatrix} + 1 \begin{bmatrix} \frac{1}{2}\\ \frac{1}{2} \\ \frac{-1}{2} \\ -\frac{1}{2} \end{bmatrix} + 0 \begin{bmatrix} \frac{1}{2} \\ -\frac{1}{2} \\ -\frac{1}{2} \\ \frac{1}{2} \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 0 \\ 0 \end{bmatrix}

{[12121212],[12121212],[12121212]}\{ \begin{bmatrix} \frac{1}{2} \\ \frac{1}{2} \\ \frac{1}{2} \\ \frac{1}{2} \end{bmatrix} , \begin{bmatrix} \frac{1}{2} \\ \frac{1}{2} \\ -\frac{1}{2} \\ -\frac{1}{2} \end{bmatrix} , \begin{bmatrix} \frac{1}{2} \\ -\frac{1}{2} \\ -\frac{1}{2} \\ \frac{1}{2} \end{bmatrix} \}

An Application of Orthogonal Projection: Recall: If \{ \vec{v}_1 , \vec{v}_2 , \cdots , \vec{v}_n \} is a basis for \mathbb{R}^{n} then any vector \vec{v} in \mathbb{R}^{n} can be expressed uniquely as a linear combination of \{ \vec{v}_1 , \vec{v}_2 , \cdots , \vec{v}_n \}.

When β={u1,u2,,un}\beta = \{ \vec{u}_1 , \vec{u}_2 , \cdots , \vec{u}_n \} is an orthonormal basis for Rn\mathbb{R}^{n}, we can easily write x\vec{x} as linear combination of {u1,,un}\{ \vec{u}_1 , \cdots , \vec{u}_n \}

x=(xu1)u1+(xu2)u2++(xun)un\vec{x} = \left( \vec{x} \cdot \vec{u}_1 \right) \vec{u}_1 + \left( \vec{x} \cdot \vec{u}_2 \right) \vec{u}_2 + \cdots + \left( \vec{x} \cdot \vec{u}_n \right) \vec{u}_n

The scalars \vec{x}\cdot \vec{u}_i are called the coordinates of \vec{x} relative to the basis \beta.

Example

\beta = \{ \begin{bmatrix} \frac{1}{\sqrt{6} } \\ \frac{2}{\sqrt{6} } \\ \frac{1}{\sqrt{6} } \end{bmatrix} , \begin{bmatrix} -\frac{2}{\sqrt{5} } \\ \frac{1}{\sqrt{5} } \\ 0 \end{bmatrix} , \begin{bmatrix} \frac{1}{\sqrt{30} } \\ \frac{2}{\sqrt{30} } \\ -\frac{5}{\sqrt{30} } \end{bmatrix} \}. Find the coordinates of \vec{x} = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} relative to \beta.

xu1=1+4+36=86\vec{x} \cdot \vec{u}_1 = \frac{1+4+3}{\sqrt{6} } = \frac{8}{\sqrt{6} }

xu2=2+25=0\vec{x} \cdot \vec{u}_2 = \frac{-2 + 2}{\sqrt{5} } = 0

xu3=1+41530=1030\vec{x}\cdot \vec{u}_3 = \frac{1+4 - 15}{\sqrt{30} } = -\frac{10}{\sqrt{30}}

x=86u11030u3\vec{x} = \frac{8}{\sqrt{6} } \vec{u}_1 - \frac{10}{\sqrt{30} } \vec{u}_3

Note: v1\vec{v}_1, v2\vec{v}_2, v3\vec{v}_3 form an orthonormal basis for R3\mathbb{R}^{3}

Exercise: Express x=[321]\vec{x} = \begin{bmatrix} 3\\ 2\\ 1 \end{bmatrix} as a linear combination of v1=[35450]\vec{v}_1 = \begin{bmatrix} -\frac{3}{5} \\ \frac{4}{5} \\ 0 \end{bmatrix} , v2=[45350]\vec{v}_2 = \begin{bmatrix} \frac{4}{5} \\ \frac{3}{5} \\ 0 \end{bmatrix} , and v3=[001]\vec{v}_3 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} .

xv1=9+85=15\vec{x}\cdot \vec{v}_1 = \frac{-9+8}{5} = -\frac{1}{5}

xv2=12+65=185\vec{x}\cdot \vec{v}_2 = \frac{12+6}{5} = \frac{18}{5}

xv3=0+0+1=1\vec{x}\cdot \vec{v}_3 = 0 + 0 + 1 = 1

x=15v1+185v2+v3\vec{x} = -\frac{1}{5} \vec{v}_1 + \frac{18}{5} \vec{v}_2 + \vec{v}_3
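The exercise above can be reproduced in a few lines; a small sketch in plain Python that also rebuilds x from its coordinates:

```python
def dot(u, w):
    return sum(a * b for a, b in zip(u, w))

v1 = [-3/5, 4/5, 0]
v2 = [4/5, 3/5, 0]
v3 = [0, 0, 1]
x = [3, 2, 1]

# coordinates of x relative to the orthonormal basis {v1, v2, v3}
coords = [dot(x, v) for v in (v1, v2, v3)]  # approximately [-1/5, 18/5, 1]

# rebuilding x from its coordinates recovers the original vector
recon = [sum(c * v[k] for c, v in zip(coords, (v1, v2, v3))) for k in range(3)]
```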

For a subspace VV of Rn\mathbb{R}^{n}, the map T:RnRnT : \mathbb{R}^{n} \to \mathbb{R}^{n} given by T(x)=projV(x)T\left( \vec{x} \right) = \text{proj}_{V}\left( \vec{x} \right) is a linear transformation!

What is im(T)\text{im}\left( T \right)? im(T)=V\text{im}\left( T \right) = V

What is \text{ker}\left( T \right)? \text{ker}\left( T \right) = \{ \vec{x} \in \mathbb{R}^{n} : \vec{x} \cdot \vec{v} = 0 for all \vec{v} \in V \}. This is called the orthogonal complement of V, denoted V^{\bot}.

Theorem: Let VV be a subspace of Rn\mathbb{R}^{n}. Then,

  1. VV^{\bot} is a subspace of Rn\mathbb{R}^{n}
  2. VV={0}V \cap V^{\bot} = \{ \vec{0} \}
  3. dim(V)+dim(V)=n\text{dim}\left( V \right) + \text{dim}\left( V^{\bot} \right) = n
  4. (V)=V\left( V^{\bot} \right)^{\bot} = V

Proof:

2) Suppose \vec{x} \in V and \vec{x} \in V^{\bot}. Then \vec{x} is orthogonal to itself, so \vec{x}\cdot \vec{x} = 0, which forces \vec{x} = \vec{0}.

3) Follows from rank nullity theorem

Example

Find a basis for V^{\bot} where V = \text{span} \{ \begin{bmatrix} 1 \\ 3 \\ 1 \\ -1 \end{bmatrix} \}.

[1311][x1x2x3x4]=0\begin{bmatrix} 1 & 3 & 1 & -1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} = 0

x1=3tr+sx_1 = -3t - r + s

x2=tx_2 = t

x3=rx_3 = r

x4=sx_4 = s

[3tr+strs]=t[3100]+r[1010]+s[1001]\begin{bmatrix} -3t - r + s \\ t \\ r \\ s \end{bmatrix} = t \begin{bmatrix} -3 \\ 1 \\ 0 \\ 0 \end{bmatrix} + r \begin{bmatrix} -1 \\ 0 \\ 1 \\ 0 \end{bmatrix} + s \begin{bmatrix} 1 \\ 0 \\ 0 \\ 1 \end{bmatrix}

Basis for VV^{\bot}: {[3100],[1010],[1001]}\{ \begin{bmatrix} -3 \\ 1 \\ 0 \\ 0 \end{bmatrix} , \begin{bmatrix} -1 \\ 0 \\ 1 \\ 0 \end{bmatrix} , \begin{bmatrix} 1 \\ 0 \\ 0 \\ 1 \end{bmatrix} \}

Example

Find a basis for VV^{\bot} where V=span{[124],[031]}V = \text{span} \{ \begin{bmatrix} -1 \\ 2 \\ 4 \end{bmatrix} , \begin{bmatrix} 0 \\ 3 \\ 1 \end{bmatrix} \}.

Notice x\vec{x} is in VV^{\bot} provided [124031][x1x2x3]=[00]\begin{bmatrix} -1 & 2 & 4 \\ 0 & 3 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}

Find a basis for ker[124031]\text{ker} \begin{bmatrix} -1 & 2 & 4 \\ 0 & 3 & 1 \end{bmatrix}

[124031][1240113]\begin{bmatrix} -1 & 2 & 4 \\ 0 & 3 & 1 \end{bmatrix} \to \begin{bmatrix} 1 & -2 & -4 \\ 0 & 1 & \frac{1}{3} \end{bmatrix} [101030113]\to \begin{bmatrix} 1 & 0 & \frac{-10}{3} \\ 0 & 1 & \frac{1}{3} \end{bmatrix}

x3=tx_3 = t

x1=103tx_1 = \frac{10}{3} t

x2=13tx_2 = -\frac{1}{3} t

[103t13tt]\begin{bmatrix} \frac{10}{3}t \\ -\frac{1}{3}t \\ t \end{bmatrix}

Basis: {[1013]}\{ \begin{bmatrix} 10 \\ -1 \\ 3 \end{bmatrix} \}
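A quick check: a vector lies in V-perp exactly when it is orthogonal to every spanning vector of V, which takes two dot products here:

```python
def dot(u, w):
    return sum(a * b for a, b in zip(u, w))

spanning_V = [[-1, 2, 4], [0, 3, 1]]
basis_perp = [10, -1, 3]

# both dot products must vanish for basis_perp to lie in V-perp
products = [dot(v, basis_perp) for v in spanning_V]
# products == [0, 0]
```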

Definition:

Comment: Suppose AA is n×mn \times m.

The row space of AA, denoted row(A)\text{row}\left( A \right) is the span of the rows of AA in Rm\mathbb{R}^{m}.

Our above examples illustrate: ker(A)=(row(A))\text{ker}\left( A \right) = \left( \text{row}\left( A \right) \right) ^{\bot}

Note: dim(row(A))=rank(A)\text{dim}\left( \text{row}\left( A \right) \right) = \text{rank}\left( A \right) .

Example

[123401370010]\begin{bmatrix} 1 & 2 & 3 & 4 \\ 0 & 1 & 3 & 7 \\ 0 & 0 & 1 & 0 \end{bmatrix}

\text{im}\left( A \right) \subseteq \mathbb{R}^{3}

span([100],[210],[331],[470])\text{span}\left( \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} , \begin{bmatrix} 2 \\ 1 \\ 0 \end{bmatrix} , \begin{bmatrix} 3 \\ 3 \\ 1 \end{bmatrix} , \begin{bmatrix} 4 \\ 7 \\ 0 \end{bmatrix} \right)

Basis: {[100],[210],[331]}\{ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} , \begin{bmatrix} 2 \\ 1 \\ 0 \end{bmatrix} , \begin{bmatrix} 3 \\ 3 \\ 1 \end{bmatrix} \}

Row Space:

\text{span} \{ \begin{bmatrix} 1 \\ 2 \\ 3 \\ 4 \end{bmatrix} , \begin{bmatrix} 0 \\ 1 \\ 3 \\ 7 \end{bmatrix} , \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \end{bmatrix} \} \subseteq \mathbb{R}^{4}

Basis: {[1234],[0137],[0010]}\{ \begin{bmatrix} 1 \\ 2 \\ 3 \\ 4 \end{bmatrix} , \begin{bmatrix} 0 \\ 1 \\ 3 \\ 7 \end{bmatrix} , \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \end{bmatrix} \}
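The relationship ker(A) = (row(A))-perp can be illustrated on this matrix; the kernel vector below is computed by hand via back-substitution (x3 = 0, x2 = -7t, x1 = 10t, x4 = t), which is an added step not shown in the notes:

```python
def dot(u, w):
    return sum(a * b for a, b in zip(u, w))

rows = [[1, 2, 3, 4], [0, 1, 3, 7], [0, 0, 1, 0]]

# hand-computed kernel vector of A (back-substitution: x3 = 0, x2 = -7t, x1 = 10t, x4 = t)
k = [10, -7, 0, 1]

# ker(A) = (row(A))-perp: k is orthogonal to every row of A
products = [dot(r, k) for r in rows]
# products == [0, 0, 0]
```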

5.2 Gram-Schmidt Process and QR Factorization


Last time: \text{Orthonormal Basis} \begin{cases} \text{Orthonormal Bases} \\ \text{Orthogonal Projection} \end{cases}

Today: Given a subspace WW with basis β\beta, find an orthonormal basis for WW.

Example

W = \text{span} \{ \begin{bmatrix} 4 \\ 0 \\ 3 \\ 0 \end{bmatrix} , \begin{bmatrix} 25 \\ 0 \\ -25 \\ 0 \end{bmatrix} \} \subseteq \mathbb{R}^{4}. We want a new basis for W that is orthonormal.

New basis: {u1,u2}\{ \vec{u}_1 , \vec{u}_2 \}

\vec{u}_1 = \frac{\vec{v}_1}{ \mid \mid \vec{v}_1 \mid \mid }

v1=16+9=5\mid \mid \vec{v}_1 \mid \mid = \sqrt{16 + 9} = 5

\text{proj}_{L} \left( \vec{v}_2 \right) = \left( \vec{u}_1 \cdot \vec{v}_2 \right) \vec{u}_1 where \vec{u}_1 \cdot \vec{v}_2 = 5

=5[450350]=[4030]= 5 \begin{bmatrix} \frac{4}{5} \\ 0 \\ \frac{3}{5} \\ 0 \end{bmatrix} = \begin{bmatrix} 4 \\ 0 \\ 3 \\ 0 \end{bmatrix}

u1=[450350]\therefore \vec{u}_1 = \begin{bmatrix} \frac{4}{5} \\ 0 \\ \frac{3}{5} \\ 0 \end{bmatrix}

u2=v2v2\vec{u}_2 = \frac{\vec{v}_2 ^{\bot}}{ \mid \mid \vec{v}_2 ^{\bot} \mid \mid }

\vec{v} _2 ^{\bot} = \vec{v} _2 - \text{proj} _{L} \left( \vec{v} _2 \right)

=[250250][4030]=[210280]= \begin{bmatrix} 25 \\ 0 \\ -25 \\ 0 \end{bmatrix} - \begin{bmatrix} 4 \\ 0 \\ 3 \\ 0 \end{bmatrix} = \begin{bmatrix} 21 \\ 0 \\ -28 \\ 0 \end{bmatrix}

v2=212+282=35 \mid \mid \vec{v}_2 ^{\bot} \mid \mid = \sqrt{21^{2} + 28^{2}} = 35

u2=[350450]\therefore \vec{u}_2 = \begin{bmatrix} \frac{3}{5} \\ 0 \\ -\frac{4}{5} \\ 0 \end{bmatrix}

Example

W = \text{span} \{ \begin{bmatrix} 4 \\ 0 \\ 3 \\ 0 \end{bmatrix} , \begin{bmatrix} 25 \\ 0 \\ -25 \\ 0 \end{bmatrix} , \begin{bmatrix} 0 \\ 1 \\ 1 \\ 1 \end{bmatrix} \} \subseteq \mathbb{R}^{4}.

Orthonormal Basis: {u1,u2,u3}\{ \vec{u}_1 , \vec{u}_2 , \vec{u}_3 \}

We begin the same way:

u1=v1v1\vec{u}_1 = \frac{\vec{v}_1}{ \mid \mid \vec{v}_1 \mid \mid }

L=span{u1}L = \text{span}\{ \vec{u}_1 \}

\vec{v}_2 ^{\bot} = \vec{v}_2 - \text{proj}_L \left( \vec{v}_2 \right)

u2=v2v2\vec{u}_2 = \frac{\vec{v}_2 ^{\bot}}{ \mid \mid \vec{v}_2 ^{\bot} \mid \mid }

u1=[450350]\vec{u}_1 = \begin{bmatrix} \frac{4}{5} \\ 0 \\ \frac{3}{5} \\ 0 \end{bmatrix}

u2=[350450]\vec{u}_2 = \begin{bmatrix} \frac{3}{5} \\ 0 \\ -\frac{4}{5} \\ 0 \end{bmatrix}

Let V=span{u1,u2}=span{v1,v2}V = \text{span}\{ \vec{u} _1 , \vec{u} _2 \} = \text{span} \{ \vec{v} _1 , \vec{v} _2 \}. We may write v3=projV(v3)+v3\vec{v} _3 = \text{proj} _{V} \left( \vec{v} _3 \right) + \vec{v} _3 ^{\bot}. Then u3=v3v3\vec{u} _3 = \frac{\vec{v} _3 ^{\bot}}{ \mid \mid \vec{v} _3 ^{\bot} \mid \mid }

projV(v3)=(u1v3)u1+(u2v3)u2\text{proj}_{V} \left( \vec{v}_3 \right) = \left( \vec{u}_1 \cdot \vec{v}_3 \right) \vec{u}_1 + \left( \vec{u}_2 \cdot \vec{v}_3 \right) \vec{u}_2 (Projection along subspace)

=35[450350]+(45)[350450]=[0025250]=[0010]= \frac{3}{5} \cdot \begin{bmatrix} \frac{4}{5} \\ 0 \\ \frac{3}{5} \\ 0 \end{bmatrix} + \left( -\frac{4}{5} \right) \cdot \begin{bmatrix} \frac{3}{5} \\ 0 \\ -\frac{4}{5} \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ \frac{25}{25} \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \end{bmatrix} (Projection of v3\vec{v}_3)

v3=[0111][0010]=[0101]\vec{v}_3 ^{\bot} = \begin{bmatrix} 0 \\ 1 \\ 1 \\ 1 \end{bmatrix} - \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \\ 0 \\ 1 \end{bmatrix}

v3=2 \mid \mid \vec{v}_3 ^{\bot} \mid \mid = \sqrt{2}

u3=[012012]\therefore \vec{u}_3 = \begin{bmatrix} 0 \\ \frac{1}{\sqrt{2} } \\ 0 \\ \frac{1}{\sqrt{2} } \end{bmatrix}

Gram-Schmidt Process: Let β={v1,v2,,vm}\beta = \{ \vec{v}_1 , \vec{v}_2 , \cdots , \vec{v}_m \} be a basis for a subspace WW of Rn\mathbb{R}^{n}.

We construct an orthonormal basis U={u1,u2,,um}\mathcal{U} = \{ \vec{u}_1 , \vec{u}_2 , \cdots , \vec{u}_m \} for WW as follows:

  • u1=v1v1\vec{u}_1 = \frac{\vec{v}_1}{ \mid \mid \vec{v}_1 \mid \mid }
  • u2=v2v2\vec{u} _2 = \frac{\vec{v} _2 ^{\bot}}{ \mid \mid \vec{v} _2 ^{\bot} \mid \mid } where v2=v2projL(v2)\vec{v} _2 ^{\bot} = \vec{v} _2 - \text{proj} _{L} \left( \vec{v} _2 \right) and L=span{v1}=span{u1}L = \text{span} \{ \vec{v} _1 \} = \text{span} \{ \vec{u} _1 \}

To get \vec{u} _j, project \vec{v} _j onto \text{span} \{ \vec{v} _1 , \vec{v} _2 , \cdots , \vec{v} _{j-1} \} = \text{span} \{ \vec{u} _1 , \vec{u} _2 , \cdots , \vec{u} _{j-1} \}

vj=vjprojV(vj)\vec{v}_j ^{\bot} = \vec{v}_j - \text{proj}_V \left( \vec{v}_j \right) gives the direction

  • uj=vjvj\vec{u}_j = \frac{\vec{v}_j ^{\bot}}{ \mid \mid \vec{v}_j ^{\bot} \mid \mid }

Note: vj=vj(u1vj)u1(u2vj)u2(uj1vj)uj1\vec{v} _j ^{\bot} = \vec{v} _j - \left( \vec{u} _1 \cdot \vec{v} _j \right) \vec{u} _1 - \left( \vec{u} _2 \cdot \vec{v} _j \right) \vec{u} _2 - \cdots - \left( \vec{u} _{j-1} \cdot \vec{v} _j \right) \vec{u} _{j-1}
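The process above can be written as a short function; a minimal sketch in plain Python, assuming the input vectors are linearly independent:

```python
import math

def dot(u, w):
    return sum(a * b for a, b in zip(u, w))

def gram_schmidt(vs):
    """Turn a list of linearly independent vectors into an orthonormal list."""
    us = []
    for v in vs:
        # v_perp = v - (u1.v)u1 - ... - (u_{j-1}.v)u_{j-1}
        w = list(v)
        for u in us:
            c = dot(u, v)
            w = [wi - c * ui for wi, ui in zip(w, u)]
        n = math.sqrt(dot(w, w))  # ||v_perp||
        us.append([wi / n for wi in w])
    return us

us = gram_schmidt([[1, 1, 1, 1], [2, 2, 3, 3]])
# us == [[0.5, 0.5, 0.5, 0.5], [-0.5, -0.5, 0.5, 0.5]]
```

This reproduces the exercise below: u1 has entries 1/2, and u2 = v2-perp since ||v2-perp|| = 1.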

Exercise: Perform the Gram-Schmidt process on {[1111],[2233]}\{ \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} , \begin{bmatrix} 2 \\ 2 \\ 3 \\ 3 \end{bmatrix} \}

v1=1+1+1+1=2 \mid \mid \vec{v}_1 \mid \mid = \sqrt{1 + 1 + 1 + 1} = 2

v2=v2(v2u1)u1\vec{v}_2 ^{\bot} = \vec{v}_2 - \left( \vec{v}_2 \cdot \vec{u}_1 \right) \vec{u}_1

= \begin{bmatrix} 2 \\ 2 \\ 3 \\ 3 \end{bmatrix} - \left( 1 + 1 + \frac{3}{2} + \frac{3}{2} \right) \begin{bmatrix} \frac{1}{2} \\ \frac{1}{2} \\ \frac{1}{2} \\ \frac{1}{2} \end{bmatrix} = \begin{bmatrix} -\frac{1}{2} \\ -\frac{1}{2} \\ \frac{1}{2} \\ \frac{1}{2} \end{bmatrix}

v2=14+14+14+14=1 \mid \mid \vec{v}_2 ^{\bot} \mid \mid = \sqrt{\frac{1}{4} + \frac{1}{4} + \frac{1}{4} + \frac{1}{4}} = 1

u1=[12121212]\vec{u}_1 = \begin{bmatrix} \frac{1}{2} \\ \frac{1}{2} \\ \frac{1}{2} \\ \frac{1}{2} \end{bmatrix}

u2=[12121212]\vec{u}_2 = \begin{bmatrix} -\frac{1}{2} \\ -\frac{1}{2} \\ \frac{1}{2} \\ \frac{1}{2} \end{bmatrix}

Let’s interpret this process via matrices

A=[12121313]A = \begin{bmatrix} 1 & 2 \\ 1 & 2 \\ 1 & 3 \\ 1 & 3 \end{bmatrix} has linearly independent columns. We want to write A=QRA = QR where QQ has orthonormal columns.

Suppose \begin{bmatrix} | & | \\ \vec{v}_1 & \vec{v}_2 \\ | & | \end{bmatrix} = \begin{bmatrix} | & | \\ \vec{u}_1 & \vec{u}_2 \\ | & | \end{bmatrix} R (A = QR)

R=[v1u1v20v2]R = \begin{bmatrix} \mid \mid \vec{v}_1 \mid \mid & \vec{u}_1 \cdot \vec{v}_2 \\ 0 & \mid \mid \vec{v}_2 ^{\bot} \mid \mid \end{bmatrix}

Check that this RR works:

  • First column of [u1u2]R\begin{bmatrix} | & | \\ \vec{u}_1 & \vec{u}_2 \\ | & | \end{bmatrix} R is [u1u2][v10]\begin{bmatrix} | & | \\ \vec{u}_1 & \vec{u}_2 \\ | & | \end{bmatrix} \begin{bmatrix} \mid \mid \vec{v}_1 \mid \mid \\ 0 \end{bmatrix}
    • v1u1=v1\mid \mid \vec{v}_1 \mid \mid \vec{u}_1 = \vec{v}_1
  • Second column of [u1u2]R\begin{bmatrix} | & | \\ \vec{u}_1 & \vec{u}_2 \\ | & | \end{bmatrix} R is [u1u2][u1v2v2]\begin{bmatrix} | & | \\ \vec{u}_1 & \vec{u}_2 \\ | & | \end{bmatrix} \begin{bmatrix} \vec{u}_1 \cdot \vec{v}_2 \\ \mid \mid \vec{v}_2 ^{\bot} \mid \mid \end{bmatrix}
    • =(u1v2)u1+v2u2=(u1v2)u1+v2= \left( \vec{u}_1 \cdot \vec{v}_2 \right) \vec{u}_1 + \mid \mid \vec{v}_2 ^{\bot} \mid \mid \vec{u}_2 = \left( \vec{u}_1 \cdot \vec{v}_2 \right) \vec{u}_1 + \vec{v}_2 ^{\bot}
    • =projL(v2)+v2=v2= \text{proj}_{L} \left( \vec{v}_2 \right) + \vec{v}_2 ^{\bot} = \vec{v}_2

Example

[12121313]=[1212121212121212]R\begin{bmatrix} 1 & 2 \\ 1 & 2 \\ 1 & 3 \\ 1 & 3 \end{bmatrix} = \begin{bmatrix} \frac{1}{2} & -\frac{1}{2} \\ \frac{1}{2} & -\frac{1}{2} \\ \frac{1}{2} & \frac{1}{2} \\ \frac{1}{2} & \frac{1}{2} \end{bmatrix} R

R=[v1u1v20v2]=[2501]R = \begin{bmatrix} \mid \mid \vec{v}_1 \mid \mid & \vec{u}_1 \cdot \vec{v}_2 \\ 0 & \mid \mid \vec{v}_2 ^{\bot} \mid \mid \end{bmatrix} = \begin{bmatrix} 2 & 5 \\ 0 & 1 \end{bmatrix}
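Multiplying Q and R back together should recover A; a small sketch in plain Python (entries 1/2 written as 0.5):

```python
def matmul(M, N):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*N))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in M]

Q = [[0.5, -0.5], [0.5, -0.5], [0.5, 0.5], [0.5, 0.5]]
R = [[2, 5], [0, 1]]

QR = matmul(Q, R)
# QR == [[1, 2], [1, 2], [1, 3], [1, 3]], i.e. the original A
```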

QR-Factorization

Consider an n×mn\times m matrix AA with linearly independent columns v1,v2,,vm\vec{v}_1 , \vec{v}_2 , \cdots , \vec{v}_m.

  • There exists an n×mn\times m matrix QQ with orthonormal columns u1,u2,,um\vec{u}_1 , \vec{u}_2 , \cdots , \vec{u}_m (comes from Gram-Schmidt) and
  • An upper triangular m×mm \times m matrix RR with positive diagonal entries such that A=QRA = QR

Moreover, for the matrix R = [r_{ij}], we have:

r11=v1r_{11} = \mid \mid \vec{v}_1 \mid \mid

rjj=vjr_{jj} = \mid \mid \vec{v}_{j} ^{\bot} \mid \mid

rij=uivjr_{ij} = \vec{u}_i \cdot \vec{v}_j for i<ji < j

Example

Find the QRQR-Factorization of A=[101771121771]A = \begin{bmatrix} 1 & 0 & 1 \\ 7 & 7 & 1 \\ 1 & 2 & -1 \\ 7 & 7 & -1 \end{bmatrix}.

R=[v1u1v2u1v30v2u2v300v3]R = \begin{bmatrix} \mid \mid \vec{v}_1 \mid \mid & \vec{u}_1 \cdot \vec{v}_2 & \vec{u}_1 \cdot \vec{v}_3 \\ 0 & \mid \mid \vec{v}_2 ^{\bot} \mid \mid & \vec{u}_2 \cdot \vec{v}_3 \\ 0 & 0 & \mid \mid \vec{v}_3 ^{\bot } \mid \mid \end{bmatrix}

Solution:

R=[10100022002]R = \begin{bmatrix} 10 & 10 & 0 \\ 0 & \sqrt{2} & -\sqrt{2} \\ 0 & 0 & \sqrt{2} \end{bmatrix}

v1=1+49+1+49=10\mid \mid \vec{v}_1 \mid \mid = \sqrt{1 + 49 + 1 + 49} = 10

u1=[110710110710]\vec{u}_1 = \begin{bmatrix} \frac{1}{10} \\ \frac{7}{10} \\ \frac{1}{10} \\ \frac{7}{10} \end{bmatrix}

v2=v2(u1v2)u1\vec{v}_2 ^{\bot} = \vec{v}_2 - \left( \vec{u}_1 \cdot \vec{v}_2 \right) \vec{u}_1

v2=[0727](10010)[110710110710]=[1010]\vec{v}_2 ^{\bot} = \begin{bmatrix} 0 \\ 7 \\ 2 \\ 7 \end{bmatrix} - \left( \frac{100}{10} \right) \begin{bmatrix} \frac{1}{10} \\ \frac{7}{10} \\ \frac{1}{10}\\ \frac{7}{10} \end{bmatrix} = \begin{bmatrix} -1 \\ 0 \\ 1 \\ 0 \end{bmatrix}

v2=2\mid \mid \vec{v}_2 ^{\bot} \mid \mid = \sqrt{2}

u2=[120120]\vec{u}_2 = \begin{bmatrix} -\frac{1}{\sqrt{2} } \\ 0 \\ \frac{1}{\sqrt{2} } \\ 0 \end{bmatrix}

v3=v3(u1v3)u1(u2v3)u2\vec{v}_3 ^{\bot} = \vec{v}_3 - \left( \vec{u}_1 \cdot \vec{v}_3 \right) \vec{u}_1 - \left( \vec{u}_2 \cdot \vec{v}_3 \right) \vec{u}_2

v3=[1111](8810)[110710110710](2)[120120]\vec{v}_3 ^{\bot} = \begin{bmatrix} 1 \\ 1 \\ - 1\\ -1 \end{bmatrix} - \left( \frac{8-8}{10} \right) \begin{bmatrix} \frac{1}{10} \\ \frac{7}{10} \\ \frac{1}{10} \\ \frac{7}{10} \end{bmatrix} - \left( -\sqrt{2} \right) \begin{bmatrix} -\frac{1}{\sqrt{2} }\\ 0 \\ \frac{1}{\sqrt{2} } \\ 0 \end{bmatrix}

v3=[1111]+[1010]=[0101]\vec{v}_3 ^{\bot} = \begin{bmatrix} 1 \\ 1 \\ -1 \\ -1 \end{bmatrix} + \begin{bmatrix} -1 \\ 0 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \\ 0 \\ -1 \end{bmatrix}

u3=v3v3=[012012]\vec{u}_3 = \frac{\vec{v}_3 ^{\bot}}{ \mid \mid \vec{v}_3 ^{\bot} \mid \mid } = \begin{bmatrix} 0 \\ \frac{1}{\sqrt{2} } \\ 0 \\ -\frac{1}{\sqrt{2} } \end{bmatrix}

Q=[110120710012110120710012]\therefore Q = \begin{bmatrix} \frac{1}{10} & -\frac{1}{\sqrt{2} } & 0 \\ \frac{7}{10} & 0 & \frac{1}{\sqrt{2} } \\ \frac{1}{10} & \frac{1}{\sqrt{2} } & 0 \\ \frac{7}{10} & 0 & -\frac{1}{\sqrt{2} } \end{bmatrix}

How else can we find RR?

  • Case 1: AA, QQ are n×nn \times n square.
    • AA and QQ have linearly independent columns which means that QQ is invertible.
    • Q1A=Q1QR    R=Q1AQ^{-1}A = Q^{-1}QR \implies R = Q^{-1} A
  • Case 2: Often AA, QQ are n×mn\times m with nmn \neq m (n > m)

Definition:

The transpose of QQ, denoted QTQ^{T}, has (i, j)-entry the (j, i)-entry of QQ.

When Q = \begin{bmatrix} | & | & & | \\ \vec{u}_1 & \vec{u}_2 & \cdots & \vec{u}_m \\ | & | & & | \end{bmatrix} with \{ \vec{u}_i \} orthonormal, Q^{T} = \begin{bmatrix} – & \vec{u}_1 ^{T} & – \\ – & \vec{u}_2 ^{T} & – \\ & \vdots & \\ – & \vec{u}_m ^{T} & – \end{bmatrix}.

Q^T Q = \begin{bmatrix} – & \vec{u}_1 ^{T} & – \\ – & \vec{u}_2 ^{T} & – \\ & \vdots & \\ – & \vec{u}_m ^{T} & – \end{bmatrix} \begin{bmatrix} | & & | \\ \vec{u}_1 & \cdots & \vec{u}_m \\ | & & | \end{bmatrix} = I_m

The product Q^{T}Q has (i, j)-entry

\vec{u}_i \cdot \vec{u}_j = \begin{cases} 1 & \text{if } i=j \\ 0 & \text{if } i\neq j \end{cases}

Way #2 of finding matrix RR :

We have QTQ=ImQ^{T}Q = I_m.

A=QR    QTA=QTQR    R=QTAA = QR \implies Q^{T}A = Q^{T}QR \implies R = Q^{T}A

Example

A=[101771121771]A = \begin{bmatrix} 1 & 0 & 1 \\ 7 & 7 & 1 \\ 1 & 2 & -1 \\ 7 & 7 & -1 \end{bmatrix} and Q=[110120710012110120710012]Q = \begin{bmatrix} \frac{1}{10} & -\frac{1}{\sqrt{2} } & 0 \\ \frac{7}{10} & 0 & \frac{1}{\sqrt{2} } \\ \frac{1}{10} & \frac{1}{\sqrt{2} } & 0 \\ \frac{7}{10} & 0 & -\frac{1}{\sqrt{2} } \end{bmatrix}

QTA=[110710110710120120012012][101771121771]=[10100022002]Q^{T}A = \begin{bmatrix} \frac{1}{10} & \frac{7}{10} & \frac{1}{10} & \frac{7}{10} \\ -\frac{1}{\sqrt{2} } & 0 & \frac{1}{\sqrt{2} } & 0 \\ 0 & \frac{1}{\sqrt{2} } & 0 & -\frac{1}{\sqrt{2} } \end{bmatrix} \begin{bmatrix} 1 & 0 & 1 \\ 7 & 7 & 1 \\ 1 & 2 & - 1\\ 7 & 7 & -1 \end{bmatrix} = \begin{bmatrix} 10 & 10 & 0 \\ 0 & \sqrt{2} & -\sqrt{2} \\ 0 & 0 & \sqrt{2} \end{bmatrix}
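This product can be verified numerically as well; a small sketch in plain Python with 1/sqrt(2) as a float:

```python
import math

s = 1 / math.sqrt(2)
A = [[1, 0, 1], [7, 7, 1], [1, 2, -1], [7, 7, -1]]
Q = [[0.1, -s, 0], [0.7, 0, s], [0.1, s, 0], [0.7, 0, -s]]

def matmul(M, N):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*N))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in M]

Qt = [list(col) for col in zip(*Q)]  # transpose of Q
R = matmul(Qt, A)
# R is (up to rounding) [[10, 10, 0], [0, sqrt(2), -sqrt(2)], [0, 0, sqrt(2)]]
```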

5.3 Orthogonal Transformations and Orthogonal Matrices


Orthogonal Transformations:

T:RnRnT : \mathbb{R}^{n} \to \mathbb{R}^{n}

Definition: T is an orthogonal transformation provided \mid \mid T \left( \vec{x} \right) \mid \mid = \mid \mid \vec{x} \mid \mid for all \vec{x} \in \mathbb{R}^{n}, i.e. T preserves lengths.

ker(T)={0}\text{ker}\left( T \right) = \{ \vec{0} \} (Any vector mapping to 0\vec{0} must have 0 length)

TT is invertible

T1T^{-1} is an orthogonal transformation

If T_1, T_2 : \mathbb{R}^{n} \to \mathbb{R}^{n} are orthogonal transformations, then T_1 \circ T_2 is an orthogonal transformation

Orthogonal Matrices: n×nn\times n matrix AA

Definition: A is an orthogonal matrix provided the transformation T \left( \vec{x} \right) = A \vec{x} is an orthogonal transformation.

Characterization: Columns of AA form an orthonormal basis for Rn\mathbb{R}^{n}.

A1A^{-1} is an orthogonal matrix.

If A1A_1 and A2A_2 are orthogonal matrices, A1A2A_1A_2 is an orthogonal matrix

Example

A=[22222222]A = \begin{bmatrix} \frac{\sqrt{2} }{2} & \frac{-\sqrt{2} }{2} \\ \frac{\sqrt{2} }{2} & \frac{\sqrt{2} }{2} \end{bmatrix}

The transformation T:R2R2T : \mathbb{R}^{2} \to \mathbb{R}^{2} given by T(x)=AxT\left( \vec{x} \right) = A\vec{x} is rotation counter-clockwise by θ=π4\theta = \frac{\pi}{4}.

Example

A=[0110]A = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}

The transformation T:R2R2T : \mathbb{R}^{2}\to \mathbb{R}^{2} given by T(x)=AxT\left( \vec{x} \right) = A \vec{x} is a reflection about the line y=xy=x

Non-Example: A=[1000]A = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} . The transformation T:R2R2T : \mathbb{R}^{2} \to \mathbb{R}^{2} given by T(x)=AxT \left( \vec{x} \right) = A \vec{x} is orthogonal projection onto xx-axis.

  • This is because it does not preserve length.
  • Orthogonal projection is not an orthogonal transformation

Remark: For any subspace VV of Rn\mathbb{R}^{n} and xRn\vec{x} \in \mathbb{R}^{n},

\mid \mid \text{proj}_V (\vec{x}) \mid \mid \le \mid \mid \vec{x} \mid \mid \text{ with equality if and only if } \vec{x} \in V

x=projV(x)+x\vec{x} = \text{proj}_V \left( \vec{x} \right) + \vec{x}^{\bot} where x\vec{x}^{\bot} is orthogonal to projV(x)\text{proj}_V \left( \vec{x} \right)

x2=projV(x)2+x2projV(x)2 \mid \mid \vec{x} \mid \mid ^{2} = \mid \mid \text{proj}_V \left( \vec{x} \right) \mid \mid ^{2} + \mid \mid \vec{x}^{\bot} \mid \mid ^{2} \ge \mid \mid \text{proj}_V \left( \vec{x} \right) \mid \mid ^{2}
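The contrast between the rotation example and the projection non-example can be seen numerically; a minimal sketch in plain Python:

```python
import math

def norm(u):
    return math.sqrt(sum(x * x for x in u))

def apply(M, x):
    """Apply a matrix (list of rows) to a vector."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

c = math.sqrt(2) / 2
rotation = [[c, -c], [c, c]]        # rotation by pi/4 (orthogonal matrix)
projection = [[1, 0], [0, 0]]       # projection onto the x-axis (not orthogonal)

x = [3, 4]  # ||x|| = 5
len_rot = norm(apply(rotation, x))     # stays 5
len_proj = norm(apply(projection, x))  # drops to 3
```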

Let’s justify the characterization: the columns of an n\times n orthogonal matrix form an orthonormal basis for \mathbb{R}^{n}.

Theorem: If T:RnRnT : \mathbb{R}^{n} \to \mathbb{R}^{n} is an orthogonal transformation and v\vec{v} and w\vec{w} are orthonormal, then T(v)T \left( \vec{v} \right) and T(w)T \left( \vec{w} \right) are orthonormal.

Proof:

1) Show T(v)T \left( \vec{v} \right) and T(w)T \left( \vec{w} \right) are orthogonal.

Assume \mid \mid \vec{v} + \vec{w} \mid \mid ^{2} = \mid \mid \vec{v} \mid \mid ^{2} + \mid \mid \vec{w} \mid \mid ^{2}. Show \mid \mid T \left( \vec{v} \right) + T \left( \vec{w} \right) \mid \mid ^{2} = \mid \mid T \left( \vec{v} \right) \mid \mid ^{2} + \mid \mid T \left( \vec{w} \right) \mid \mid ^{2}. We have \mid \mid T \left( \vec{v} \right) + T \left( \vec{w} \right) \mid \mid ^{2} = \mid \mid T \left( \vec{v} + \vec{w} \right) \mid \mid ^{2} (T is linear)

=v+w2= \mid \mid \vec{v} + \vec{w} \mid \mid ^{2} (T preserves length)

= \mid \mid \vec{v} \mid \mid ^{2} + \mid \mid \vec{w} \mid \mid ^{2} (\vec{v} and \vec{w} are orthogonal)

=T(v)2+T(w)2= \mid \mid T \left( \vec{v} \right) \mid \mid ^{2} + \mid \mid T \left( \vec{w} \right) \mid \mid ^{2}. (T preserves lengths)

2) Show T(v)T \left( \vec{v} \right) and T(w)T \left( \vec{w} \right) are unit.

\vec{v} and \vec{w} are unit and T preserves length, so T \left( \vec{v} \right) and T \left( \vec{w} \right) are unit.

  • T : \mathbb{R}^{n} \to \mathbb{R}^{n} is an orthogonal transformation if and only if \{ T \left( \vec{e}_1 \right) , T \left( \vec{e}_2 \right) , \cdots , T \left( \vec{e}_n \right) \} is an orthonormal basis for \mathbb{R}^{n}.
  • The columns of an n×nn\times n orthogonal matrix form an orthonormal basis for Rn\mathbb{R}^{n}.

Recall: QR Factorization. If A has linearly independent columns, we may write A=QR where Q has orthonormal columns and R = Q^{T}A.

Definition:

Consider an m×nm\times n matrix AA, the transpose ATA^{T} is the n×mn\times m matrix such that (i, j)-entry of ATA^{T} is the (j, i)-entry of AA.

In other words: interchange rows and columns

Example

A=[24701021]A = \begin{bmatrix} 2 & 4 \\ 7 & 0 \\ 1 & 0 \\ 2 & 1 \end{bmatrix} and B=[1332]B = \begin{bmatrix} 1 & 3 \\ 3 & 2 \end{bmatrix}. Find ATA^{T} and BTB^{T}.

AT=[27124001]A^{T} = \begin{bmatrix} 2 & 7 & 1 & 2 \\ 4 & 0 & 0 & 1 \end{bmatrix}

BT=[1332]=BB^{T} = \begin{bmatrix} 1 & 3 \\ 3 & 2 \end{bmatrix} = B

Note: for any AA, im(AT)=row (A)\text{im}\left( A^{T} \right) = \text{row }\left( A \right) (row space of AA)

Definition:

A square matrix AA is

  • symmetric provided AT=AA^{T} = A
  • skew-symmetric provided AT=AA^{T} = -A

Properties: (1, 2, 3 for any matrices such that operations are defined. 4 provided AA is n×nn\times n and invertible)

  1. (A+B)T=AT+BT\left( A +B \right) ^{T} = A^T + B^{T}
  2. (AB)T=BTAT\left( AB \right) ^{T} = B^{T}A^{T}
  3. rank(AT)=rank(A)\text{rank}\left( A^{T} \right) = \text{rank}\left( A \right)
  4. (A1)T=(AT)1\left( A^{-1} \right) ^{T} = \left( A^{T} \right) ^{-1}

Proof of 2) Suppose A is m\times p with A = \begin{bmatrix} – & \vec{w}_1 & – \\ & \vdots & \\ – & \vec{w}_m & – \end{bmatrix} and B is p\times n with B = \begin{bmatrix} | & & | \\ \vec{v}_1 & \cdots & \vec{v}_n \\ | & & | \end{bmatrix}.

BT=[v1Tv2TvnT]B^{T} = \begin{bmatrix} – & \vec{v}_1 ^{T} & – \\ – & \vec{v}_2 ^{T} & – \\ & \vdots & \\ – & \vec{v}_n ^{T} & – \end{bmatrix}

AT=[w1wm]A^{T} = \begin{bmatrix} | & & | \\ \vec{w}_1 & \cdots & \vec{w}_m \\ | & & | \end{bmatrix}

  • \left( i, j \right) -entry of (AB)^{T} : the \left( j, i \right) -entry of AB, namely \vec{w}_j \cdot \vec{v}_i
  • \left( i, j \right) -entry of B^{T}A^{T} : \vec{v}_i ^{T} \vec{w}_j = \vec{v}_i \cdot \vec{w}_j = \vec{w}_j \cdot \vec{v}_i

Dot product does not distinguish between rows and columns

Example

Verify that (A1)T=(AT)1\left( A^{-1} \right) ^{T} = \left( A^{T} \right) ^{-1} for the matrix A=[2101]A = \begin{bmatrix} 2 & 1 \\ 0 & -1 \end{bmatrix}.

Recall: [abcd]1=1adbc[dbca]\begin{bmatrix} a & b \\ c & d \end{bmatrix} ^{-1} = \frac{1}{ad-bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}

  • (A1)T=(12[1102])T=[121201]T=[120121]\left( A^{-1} \right) ^{T} = \left( \frac{1}{-2} \begin{bmatrix} -1 & -1 \\ 0 & 2 \end{bmatrix} \right) ^{T} = \begin{bmatrix} \frac{1}{2} & \frac{1}{2} \\ 0 & -1 \end{bmatrix} ^{T} = \begin{bmatrix} \frac{1}{2} & 0 \\ \frac{1}{2} & -1 \end{bmatrix}
  • (AT)1=[2011]1=12[1012]=[120121]\left( A^{T} \right) ^{-1} = \begin{bmatrix} 2 & 0 \\ 1 & -1 \end{bmatrix} ^{-1} = \frac{1}{-2} \begin{bmatrix} -1 & 0 \\ -1 & 2 \end{bmatrix} = \begin{bmatrix} \frac{1}{2} & 0 \\ \frac{1}{2} & -1 \end{bmatrix}

Note: det(A)=det(AT)\text{det}\left(A \right) = \text{det}\left( A^{T} \right)
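The four transpose properties (and the determinant note) can be spot-checked numerically; a small NumPy sketch using the matrices from the examples above:

```python
import numpy as np

# A is the matrix from the inverse-verification example, B the symmetric
# matrix from the transpose example above
A = np.array([[2.0, 1.0], [0.0, -1.0]])
B = np.array([[1.0, 3.0], [3.0, 2.0]])

sum_rule = np.allclose((A + B).T, A.T + B.T)                        # property 1
prod_rule = np.allclose((A @ B).T, B.T @ A.T)                       # property 2
rank_rule = np.linalg.matrix_rank(A.T) == np.linalg.matrix_rank(A)  # property 3
inv_rule = np.allclose(np.linalg.inv(A).T, np.linalg.inv(A.T))      # property 4
det_rule = np.isclose(np.linalg.det(A), np.linalg.det(A.T))         # note above
```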

Exercise: Suppose AA and BB are n×nn\times n orthogonal matrices, which of the following must be orthogonal?

2B,AB2,AB2B , AB^2 , A -B

2B: Columns are not unit

AB^{2}: Yes; B^{2} = BB is orthogonal (a product of orthogonal matrices), so AB^{2} is orthogonal

ABA-B: Columns are not unit

Suppose AA and BB are n×nn\times n symmetric matrices, which of the following must be symmetric?

2B,AB2,AB2B , AB^2 , A-B
  • (2B)T=2BT=2B(2B)^T = 2B^T = 2B Yes
  • (AB2)T=(B2)TAT=BTBTAT=B2A(AB^2)^T = \left( B^{2} \right) ^{T} A^{T} = B^{T}B^{T}A^{T} = B^{2}A No
  • (AB)T=ATBT=AB(A-B)^{T} = A^{T} - B^{T} = A-B Yes

Theorem: For an n×nn\times n matrix AA, AA is an orthogonal matrix:

  1. If and only if ATA=InA^{T}A = I_{n} and
  2. If and only if AT=A1A^{T} = A^{-1}

Note: (2) follows from (1) (criterion for invertibility)

Proof of (1): Suppose AA is n×nn\times n with A=[v1v2vn]A = \begin{bmatrix} | & | & & | \\ \vec{v}_1 & \vec{v}_2 & \cdots & \vec{v}_n \\ | & | & & | \end{bmatrix}.

A^{T}A has (i, j)-entry \vec{v}_i^{T} \vec{v}_j = \vec{v}_i \cdot \vec{v}_j

A^{T}A = I_{n} if and only if \vec{v}_i \cdot \vec{v}_j = \begin{cases} 1 & i=j \text{ (unit)} \\ 0 & i\neq j \text{ (perpendicular)} \end{cases}

Note: We can interpret the dot product as a matrix product. For x=[x1xn]\vec{x} = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} and xT=[x1xn]\vec{x}^T = \begin{bmatrix} x_1 & \cdots & x_n \end{bmatrix}

For x\vec{x} and y\vec{y} in Rn\mathbb{R}^{n}, xy=[x1xn][y1yn]=xTy\vec{x}\cdot \vec{y} = \begin{bmatrix} x_1 & \cdots & x_n \end{bmatrix} \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} = \vec{x}^{T} \vec{y}

Theorem: If TT is an orthogonal transformation then TT preserves dot product, i.e. T(x)T(y)=xyT\left( \vec{x} \right) \cdot T\left( \vec{y} \right) = \vec{x} \cdot \vec{y}.

Proof:

T(x)T(y)=AxAy=(Ax)TAy=xTATAy=xTy=xy\begin{align*} T(\vec{x}) \cdot T(\vec{y}) & = A\vec{x} \cdot A \vec{y} \\ & = (A\vec{x})^T A \vec{y} \\ & = \vec{x}^T A^T A \vec{y} \\ & = \vec{x}^T \vec{y} \\ & = \vec{x} \cdot \vec{y} \end{align*}

Example

Let v1=[1111]\vec{v}_1 = \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} , v2=[1111]\vec{v}_2 = \begin{bmatrix} 1 \\ 1 \\ 1 \\ -1 \end{bmatrix}, y1=[2000]\vec{y}_1 = \begin{bmatrix} 2 \\ 0 \\ 0 \\ 0 \end{bmatrix}, y2=[0200]\vec{y}_2 = \begin{bmatrix} 0 \\ 2 \\ 0 \\ 0 \end{bmatrix}. Show there is no orthogonal transformation T:R4R4T : \mathbb{R}^{4} \to \mathbb{R}^{4} such that T(v1)=y1T\left( \vec{v}_1 \right) = \vec{y}_1 and T(v2)=y2T\left( \vec{v}_2 \right) = \vec{y}_2.

We would need T(v1)T(v2)=v1v2T \left( \vec{v}_1 \right) \cdot T\left( \vec{v}_2 \right) = \vec{v}_1 \cdot \vec{v}_2.

v1v2=1+1+11=2\vec{v}_1 \cdot \vec{v}_2 = 1 + 1 + 1 - 1 =2

y1y2=02\vec{y}_1 \cdot \vec{y}_2 = 0 \neq 2

No such orthogonal transformation exists.

Suppose T:RnRnT : \mathbb{R}^{n}\to \mathbb{R}^{n} is an orthogonal transformation. Show TT preserves angles. That is, for any nonzero v\vec{v} and w\vec{w} in Rn\mathbb{R}^{n}, the angle between T(v)T\left( \vec{v} \right) and T(w)T\left( \vec{w} \right) equals the angle between v\vec{v} and w\vec{w}.

cos1(vwvw)=cos1(T(v)T(w)T(v)T(w))\cos ^{-1} \left( \frac{\vec{v}\cdot \vec{w}}{ \mid \mid \vec{v} \mid \mid \cdot \mid \mid \vec{w} \mid \mid } \right) = \cos ^{-1} \left( \frac{T\left( \vec{v} \right) \cdot T\left( \vec{w} \right) }{ \mid \mid T \left( \vec{v} \right) \mid \mid \cdot \mid \mid T \left( \vec{w} \right) \mid \mid } \right)

Question: Suppose T:RnRnT : \mathbb{R}^{n} \to \mathbb{R}^{n} preserves angles. Is TT necessarily an orthogonal transformation?

Answer: No! Scaling by kk preserves angle.
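A NumPy sketch of this point (my own `angle` helper, not from lecture): a rotation preserves angles, and so does scaling by 3, yet scaling fails A^T A = I and hence is not orthogonal.

```python
import numpy as np

def angle(v, w):
    # Angle between two nonzero vectors via the dot-product formula
    c = np.dot(v, w) / (np.linalg.norm(v) * np.linalg.norm(w))
    return np.arccos(np.clip(c, -1.0, 1.0))

R = np.array([[np.sqrt(2)/2, -np.sqrt(2)/2],
              [np.sqrt(2)/2,  np.sqrt(2)/2]])  # orthogonal (rotation by pi/4)
S = 3.0 * np.eye(2)                            # scaling by 3: not orthogonal

v = np.array([1.0, 0.0])
w = np.array([1.0, 1.0])

rotation_preserves_angle = np.isclose(angle(R @ v, R @ w), angle(v, w))
scaling_preserves_angle = np.isclose(angle(S @ v, S @ w), angle(v, w))
scaling_is_orthogonal = np.allclose(S.T @ S, np.eye(2))  # False
```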

Review of ideas/terminology from 3.2, 3.3, 5.1:

Question: What is the span of vectors in Rn\mathbb{R}^{n}?

Answer: All linear combinations

Question: What is a basis for a subspace WW of Rn\mathbb{R}^{n}?

Answer: A (finite) collection B\mathcal{B} of vectors in WW such that:

  • B\mathcal{B} is linearly independent
  • span(B)=W\text{span}\left( \mathcal{B} \right) = W

Example

Let W={[xyz]R3:x=0}W = \{ \begin{bmatrix} x \\ y \\ z \end{bmatrix} \in \mathbb{R}^{3} : x = 0 \}.

W=ker[100]W = \text{ker} \begin{bmatrix} 1 & 0 & 0 \end{bmatrix}

x=0x = 0

y=ty = t (free)

z=rz = r (free)

[0tr]=t[010]+r[001]\begin{bmatrix} 0 \\ t \\ r \end{bmatrix} = t \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} + r \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}

Basis: {[010],[001]}\{ \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} , \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \}

dim(W)=2\text{dim}\left( W \right) = 2

Note: This is not the only basis for WW.

Let \vec{w}_1 = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} and \vec{w}_2 = \begin{bmatrix} 0 \\ -1 \\ 1 \end{bmatrix}. Let’s verify \mathcal{B} = \{ \vec{w}_1 , \vec{w}_2 \} is a basis for W = \{ \begin{bmatrix} x \\ y \\ z \end{bmatrix} \in \mathbb{R}^{3} : x = 0 \}.

Using only the definition of basis (and not the theory we will review)

  • Linear Independence: w1\vec{w}_1 and w2\vec{w}_2 are nonzero. w2\vec{w}_2 is not a multiple of w1\vec{w}_1. No redundant vectors.
  • span(B)=W\text{span} \left( \mathcal{B} \right) = W
\begin{bmatrix} 0 \\ y \\ z \end{bmatrix} = a \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} + b \begin{bmatrix} 0 \\ -1 \\ 1 \end{bmatrix}, which is solved by \begin{bmatrix} 0 \\ y \\ z \end{bmatrix} = \frac{y+z}{2} \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} + \frac{z-y}{2} \begin{bmatrix} 0 \\ -1 \\ 1 \end{bmatrix}

Find aa and bb.

ab=ya - b = y

a+b=za + b = z

2a=y+z2a = y+z

a=y+z2a = \frac{y+z}{2}

b=zy+z2b = z - \frac{y+z}{2}

=zy2= \frac{z-y}{2}
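A quick NumPy check (with arbitrary sample values for y and z) that the coordinates just derived actually reconstruct the vector:

```python
import numpy as np

# Basis vectors from the example above
w1 = np.array([0.0, 1.0, 1.0])
w2 = np.array([0.0, -1.0, 1.0])

# For any vector (0, y, z), the derived coordinates a = (y+z)/2, b = (z-y)/2
# reconstruct it; y and z here are arbitrary sample values
y, z = 5.0, 3.0
a = (y + z) / 2
b = (z - y) / 2
matches = np.allclose(a * w1 + b * w2, np.array([0.0, y, z]))
```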

Some theory from 3.3

Suppose we know \text{dim} \left( W \right) = m and \mathcal{B}_1 , \mathcal{B}_2 \subseteq W. If \mathcal{B}_1 is linearly independent and \mathcal{B}_2 spans W, then \mid \mathcal{B}_1 \mid \le \mid \mathcal{B}_2 \mid .

  • Any collection of mm linearly independent vectors in WW is a basis for WW.
  • Any collection of mm vectors that span WW is a basis for WW.

Example

{[121],[310]}\{ \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} , \begin{bmatrix} 3 \\ 1 \\ 0 \end{bmatrix} \} is not a basis for R3\mathbb{R}^{3}.

Vectors are independent. 2 Vectors cannot span R3\mathbb{R}^{3}.

Example

{[121],[310],[500]}\{ \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} , \begin{bmatrix} 3 \\ 1 \\ 0 \end{bmatrix} , \begin{bmatrix} 5 \\ 0 \\ 0 \end{bmatrix} \} is a basis for R3\mathbb{R}^{3}.

c1[121]+c2[310]+c3[500]=0c_1 \begin{bmatrix} 1 \\ 2 \\1 \end{bmatrix} + c_2 \begin{bmatrix} 3 \\ 1 \\ 0 \end{bmatrix} + c_3 \begin{bmatrix} 5 \\ 0 \\ 0 \end{bmatrix} = \vec{0}

3rd line c1=0c_1 =0

2nd line c2=0c_2 = 0

1st line 5c3=0    c3=05c_3 = 0 \implies c_3 =0

  • Vectors are independent
  • dim(R3)=3\text{dim}\left( \mathbb{R}^{3} \right) = 3

Example

{[121],[310],[500],[111]}\{ \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} , \begin{bmatrix} 3 \\ 1 \\ 0 \end{bmatrix} , \begin{bmatrix} 5 \\ 0 \\ 0 \end{bmatrix} , \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \} is not a basis for R3\mathbb{R}^{3}.

Vectors span R3\mathbb{R}^{3}. 4 vectors cannot be independent in R3\mathbb{R}^{3}, however.

Question: How do we find the dimension of a subspace?

Answer: We can use Rank-Nullity Theorem. Suppose AA is n×mn\times m.

dim(im(A))+dim(ker(A))=m\text{dim} (\text{im} (A)) + \text{dim} (\text{ker} (A)) = m
  • If V=im(A)V = \text{im} \left( A \right) , then dim(V)=rank(A)\text{dim} \left( V \right) = \text{rank}\left( A \right)
  • If W=ker(A)W = \text{ker}\left( A \right) , then dim(W)=mrank(A)\text{dim}\left( W \right) = m - \text{rank}\left( A \right) .

Quiz 3 #2: For Z = \{ \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \in \mathbb{R}^{3} : x_1 = 0 \text{ and } x_2 + 5x_3 = 0 \}, \text{dim}\left( Z \right) =1.

Z=ker([100015])Z = \text{ker} \left( \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 5 \end{bmatrix} \right)

Matrix has rank 2. dim(Z)=32=1\text{dim}\left( Z \right) = 3 - 2 =1
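The rank-nullity computation above can be sketched in NumPy:

```python
import numpy as np

# Z is the kernel of this matrix (from Quiz 3 #2 above)
M = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 5.0]])

rank = int(np.linalg.matrix_rank(M))  # 2
dim_Z = M.shape[1] - rank             # rank-nullity: dim(ker) = 3 - 2 = 1
```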

Quiz 3 #1B: The dimension of span{[101],[000],[020],[444],[323]}\text{span} \{ \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} , \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} , \begin{bmatrix} 0 \\ 2 \\ 0 \end{bmatrix} , \begin{bmatrix} 4 \\ 4 \\ 4 \end{bmatrix} , \begin{bmatrix} 3 \\ -2 \\ 3 \end{bmatrix} \} is 2.

\begin{bmatrix} 1 & 0 & 0 & 4 & 3 \\ 0 & 0 & 2 & 4 & -2 \\ 1 & 0 & 0 & 4 & 3 \end{bmatrix} \to \begin{bmatrix} 1 & 0 & 0 & 4 & 3 \\ 0 & 0 & 2 & 4 & -2 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}

Rank is 2.

dim(im(A))=2<3\text{dim} \left( \text{im}\left( A \right) \right) = 2 < 3

im(A)R3\text{im}\left( A \right) \neq \mathbb{R}^{3}

Question: What is the orthogonal complement of a subspace VV of Rn\mathbb{R}^{n}?

V^{\bot} = \{ \vec{x} \in \mathbb{R}^{n} : \vec{x} \cdot \vec{v} = 0 \text{ for all } \vec{v} \in V \}

VV^{\bot} is a subspace of Rn\mathbb{R}^{n}.

  • VV={0}V \cap V^{\bot} = \{ \vec{0} \}
  • dim(V)+dim(V)=n\text{dim} \left( V \right) + \text{dim} \left( V^{\bot} \right) = n

In example: 2 + 1 = 3

Note:

  • x\vec{x} is in VV^{\bot} if and only if xv=0\vec{x} \cdot \vec{v} = 0 for v\vec{v} in a basis for VV.
  • For VV and WW subspaces, W=VW = V^{\bot} if and only if every vector in a basis for WW is perpendicular to every vector in a basis VV.

Four subspaces of a matrix.

A(n×m)A (n \times m)

  • im(A)Rn\text{im}\left( A \right) \subseteq \mathbb{R}^{n}
  • ker(A)Rm\text{ker}\left( A \right) \subseteq \mathbb{R}^{m}

AT(m×n)A^{T} (m \times n)

  • im(AT)Rm\text{im}\left( A^{T} \right) \subseteq \mathbb{R}^{m}
  • ker(AT)Rn\text{ker}\left( A^{T} \right) \subseteq \mathbb{R}^{n}

Properties:

  • dim(im(A))=k\text{dim} \left( \text{im}\left( A \right) \right) = k
  • dim(ker(A))=mk\text{dim}\left( \text{ker}\left( A \right) \right) = m -k

  • dim(im(AT))=k\text{dim}\left( \text{im}\left( A^{T} \right) \right) = k
  • dim(ker(AT))=nk\text{dim}\left( \text{ker}\left( A^{T} \right) \right) = n -k

Relationship:

ker(AT)=(im(A))\text{ker}\left( A^{T} \right) = \left( \text{im}\left( A \right) \right) ^{\bot} in Rn\mathbb{R}^{n} (we use in 5.4)

ker(A)=(im(AT))\text{ker}\left( A \right) = \left( \text{im}\left( A^{T} \right) \right) ^{\bot}

Example

A=[100010]A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}

AT=[100100]A^{T}= \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix}

Orthogonal complements

ker(AT)={0}\text{ker}\left( A^{T} \right) = \{ \vec{0} \}

im(A)=R2\text{im}\left( A \right) = \mathbb{R}^{2}

ker(A)=span{[001]}\text{ker}\left( A \right) = \text{span} \{ \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \}

im(AT)=span{[100],[010]}\text{im}\left( A^{T} \right) = \text{span} \{ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} , \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \}

In 5.4 we will use im(A)=ker(AT)\text{im}\left( A \right) ^{\bot} = \text{ker}\left( A^{T} \right)

Example

A=[100200010010]A = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 0 \end{bmatrix}. Verify that im(A)=ker(AT)\text{im}\left( A \right) ^{\bot} = \text{ker}\left( A^{T} \right).

AT=[120000110000]A^{T} = \begin{bmatrix} 1 & 2 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}

ker(AT)\text{ker}\left( A^{T} \right):

x2=tx_2 = t

x4=rx_4 = r

x1=2tx_1 = -2t

x3=rx_3 = -r

[2ttrr]=t[2100]+r[0011]\begin{bmatrix} -2t \\ t \\ -r \\ r \end{bmatrix} = t \begin{bmatrix} -2 \\ 1 \\ 0 \\ 0 \end{bmatrix} + r \begin{bmatrix} 0 \\ 0 \\ -1 \\ 1 \end{bmatrix}

Basis: {[2100],[0011]}\{ \begin{bmatrix} -2 \\ 1 \\ 0 \\ 0 \end{bmatrix} , \begin{bmatrix} 0 \\ 0 \\ -1 \\ 1 \end{bmatrix} \}

im(A)\text{im}\left( A \right) : Basis: {[1200],[0011]}\{ \begin{bmatrix} 1 \\ 2 \\ 0 \\ 0 \end{bmatrix} , \begin{bmatrix} 0 \\ 0 \\ 1 \\ 1 \end{bmatrix} \}

Notice: Each element in basis for im(A)\text{im}\left( A \right) is perpendicular to each element in a basis for ker(AT)\text{ker}\left( A^{T} \right).
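A NumPy sketch of this last verification, checking every pair of basis vectors:

```python
import numpy as np

# Bases found above for im(A) and ker(A^T)
im_basis = [np.array([1.0, 2.0, 0.0, 0.0]), np.array([0.0, 0.0, 1.0, 1.0])]
ker_basis = [np.array([-2.0, 1.0, 0.0, 0.0]), np.array([0.0, 0.0, -1.0, 1.0])]

# Every im(A) basis vector is perpendicular to every ker(A^T) basis vector
all_perpendicular = all(np.isclose(np.dot(u, v), 0.0)
                        for u in im_basis for v in ker_basis)
```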

5.4 Least Squares and Data Fitting

5.4 Least Squares and Data Fitting

Suppose A is an n\times m matrix. For \vec{b} in \mathbb{R}^{n}, the system A\vec{x} = \vec{b} may have no solution. That is, \vec{b} \not\in \text{im}\left( A \right).

Question: How do we find a vector in Rm\mathbb{R}^{m} that is “almost” a solution?

We want: xRm\vec{x}^{\star} \in \mathbb{R}^{m} that makes bAx \mid \mid \vec{b} - A \vec{x}^{\star} \mid \mid as small as possible.

projim(A)b=Ax\text{proj}_{\text{im}\left( A \right) } \vec{b} = A \vec{x}^{\star} for some x\vec{x}^{\star} in Rm\mathbb{R}^{m}. This x\vec{x}^{\star} is a least squares solution.

Without using any theory, there are too many steps involved:

  1. Find orthonormal basis for im(A)\text{im}\left( A \right) . Using Gram-Schmidt Process.
  2. Project b\vec{b} onto im(A)\text{im}\left( A \right). Using the orthonormal basis.
  3. Solve linear system Ax=projim(A)(b)A\vec{x} = \text{proj}_{\text{im}\left( A \right)} \left( \vec{b} \right). Using Gauss Jordan Elimination.

How to find x\vec{x}^{\star} : AxA\vec{x}^{\star} is the vector in im(A)\text{im}\left( A \right) closest to bAx=projim(A)(b)\vec{b} \leftrightarrow A\vec{x}^{\star} = \text{proj}_{\text{im}\left( A \right)}\left( \vec{b} \right).

  • bAx\vec{b} - A\vec{x}^{\star} is in (im(A))\left( \text{im}\left( A \right) \right) ^{\bot}
  • bAx\vec{b} - A \vec{x}^{\star} is in ker(AT)\text{ker}\left( A^{T} \right)
  • AT(bAx)=0ATbATAx=0A^{T}\left( \vec{b} - A\vec{x}^{\star} \right) = \vec{0} \leftrightarrow A^{T}\vec{b} - A^{T}A \vec{x}^{\star} = \vec{0}
  • (ATA)x=(ATb)\left( A^{T}A \right) \vec{x}^{\star} = \left( A^{T} \vec{b} \right)

Definition:

The least squares solutions of the system A\vec{x}=\vec{b} are the solutions to the system A^{T} A \vec{x} = A^T \vec{b} (called the normal equation of the system A\vec{x}= \vec{b})

Method of Least Squares: If Ax=bA\vec{x} = \vec{b} is inconsistent, multiply by ATA^{T} and solve: ATAx=ATbA^{T}A\vec{x} = A^{T}\vec{b}

Note: The normal equation is always consistent.

5.4 #20: Let A=[111001]A = \begin{bmatrix} 1 & 1 \\ 1 & 0 \\ 0 & 1 \end{bmatrix} and b=[333]\vec{b} = \begin{bmatrix} 3 \\ 3 \\ 3 \end{bmatrix}. Find the least squares solution x\vec{x}^{\star} of the system Ax=bA\vec{x}= \vec{b}.

Verify bAx\vec{b} - A\vec{x}^{\star} is perpendicular to the image of AA. ATAx=ATbA^{T}A \vec{x} = A^{T}\vec{b}

ATA=[110101][111001]=[2112]A^{T}A = \begin{bmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}

[2112][x1x2]=[66]\begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 6 \\ 6 \end{bmatrix}

ATb=[110101][333]=[66]A^{T}\vec{b} = \begin{bmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix} \begin{bmatrix} 3 \\ 3 \\ 3 \end{bmatrix} = \begin{bmatrix} 6 \\ 6 \end{bmatrix}

(ATA)1=141[2112]=[23131323]\left( A^{T}A \right) ^{-1} = \frac{1}{4-1} \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix} = \begin{bmatrix} \frac{2}{3} & -\frac{1}{3} \\ -\frac{1}{3} & \frac{2}{3} \end{bmatrix}

x=[23131323][66]=[22]\vec{x}^{\star} = \begin{bmatrix} \frac{2}{3} & -\frac{1}{3} \\ -\frac{1}{3} & \frac{2}{3} \end{bmatrix} \begin{bmatrix} 6 \\ 6 \end{bmatrix} = \begin{bmatrix} 2 \\ 2 \end{bmatrix} (Least squares solution)

bAx=[333][111001][22]=[333][422]=[111]\vec{b} - A\vec{x} = \begin{bmatrix} 3 \\ 3 \\ 3 \end{bmatrix} - \begin{bmatrix} 1 & 1 \\ 1 & 0\\ 0 & 1 \end{bmatrix} \begin{bmatrix} 2 \\ 2 \end{bmatrix} = \begin{bmatrix} 3 \\ 3 \\ 3 \end{bmatrix} - \begin{bmatrix} 4 \\ 2 \\ 2 \end{bmatrix} = \begin{bmatrix} -1 \\ 1 \\ 1 \end{bmatrix} (Notice this is orthogonal to each column of AA)
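The worked problem above can be sketched in NumPy; `np.linalg.lstsq` solves the same least squares problem directly, so it should agree with the normal-equation solution:

```python
import numpy as np

# Problem 5.4 #20 above
A = np.array([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
b = np.array([3.0, 3.0, 3.0])

# Solve the normal equation (A^T A) x* = A^T b
x_star = np.linalg.solve(A.T @ A, A.T @ b)
solution_ok = np.allclose(x_star, [2.0, 2.0])

# NumPy's built-in least squares agrees
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
lstsq_agrees = np.allclose(x_lstsq, x_star)

# The residual b - A x* is perpendicular to every column of A
perp_to_columns = np.allclose(A.T @ (b - A @ x_star), 0.0)
```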

Example

Find the closest line to points (-1, 6), (1, 0), (2, 4).

f(t)=c0+c1tf(t) = c_0 + c_1 t

6=c0c16 = c_0 - c_1

0=c0+c10 = c_0 + c_1

4=c0+2c14 = c_0 + 2c_1

Inconsistent Linear System: [111112][c0c1]=[604]\begin{bmatrix} 1 & -1 \\ 1 & 1 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} c_0 \\ c_1 \end{bmatrix} = \begin{bmatrix} 6 \\ 0 \\ 4 \end{bmatrix}

  • Solve ATAx=ATbA^{T}A\vec{x} = A^{T}\vec{b}

ATA=[111112][111112]=[3226]A^{T}A = \begin{bmatrix} 1 & 1 & 1 \\ -1 & 1 & 2 \end{bmatrix} \begin{bmatrix} 1 & -1 \\ 1 & 1 \\ 1 & 2 \end{bmatrix} = \begin{bmatrix} 3 & 2 \\ 2 & 6 \end{bmatrix}

ATb=[111112][604]=[102]A^{T}\vec{b} = \begin{bmatrix} 1 & 1 & 1 \\ -1 & 1 & 2 \end{bmatrix} \begin{bmatrix} 6 \\ 0 \\ 4 \end{bmatrix} = \begin{bmatrix} 10 \\ 2 \end{bmatrix}

(ATA)1=1184[6223]=[614214214314]\left( A^{T}A \right) ^{-1} = \frac{1}{18-4} \begin{bmatrix} 6 & -2 \\ -2 & 3 \end{bmatrix} = \begin{bmatrix} \frac{6}{14} & -\frac{2}{14} \\ -\frac{2}{14} & \frac{3}{14} \end{bmatrix}

x=[614214214314][102]=[41]\vec{x}^{\star} = \begin{bmatrix} \frac{6}{14} & -\frac{2}{14} \\ -\frac{2}{14} & \frac{3}{14} \end{bmatrix} \begin{bmatrix} 10 \\ 2 \end{bmatrix} = \begin{bmatrix} 4 \\ -1 \end{bmatrix}

f(t)=4tf(t) = 4 - t

Question: How close is b\vec{b} to AxA\vec{x}^{\star}?

bAx=[604][111112][41]=[604][532]=[132]\vec{b} - A\vec{x}^{\star} = \begin{bmatrix} 6 \\ 0 \\ 4 \end{bmatrix} - \begin{bmatrix} 1 & -1 \\ 1 & 1 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} 4 \\ -1 \end{bmatrix} = \begin{bmatrix} 6 \\ 0 \\ 4 \end{bmatrix} - \begin{bmatrix} 5 \\ 3 \\ 2 \end{bmatrix} = \begin{bmatrix} 1 \\ -3 \\ 2 \end{bmatrix} (Gives vertical “errors” from points)

Definition:

Using the least squares method, the error is bAx \mid \mid \vec{b} - A\vec{x}^{\star} \mid \mid.

In the above example: \mid \mid \vec{b} - A\vec{x}^{\star} \mid \mid = \mid \mid \begin{bmatrix} 1 \\ -3 \\ 2 \end{bmatrix} \mid \mid = \sqrt{1 + 9 + 4} = \sqrt{14}

Least squares method minimizes e_1^{2} + e_2^{2} + e_3^{2}
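A NumPy sketch of the line-fitting example above, recovering f(t) = 4 - t and the error \sqrt{14}:

```python
import numpy as np

# Fit f(t) = c0 + c1 t to (-1, 6), (1, 0), (2, 4) by least squares
t = np.array([-1.0, 1.0, 2.0])
y = np.array([6.0, 0.0, 4.0])
A = np.column_stack([np.ones_like(t), t])  # columns: [1, t]

c, *_ = np.linalg.lstsq(A, y, rcond=None)
coeffs_ok = np.allclose(c, [4.0, -1.0])    # f(t) = 4 - t
error_is_sqrt14 = np.isclose(np.linalg.norm(y - A @ c), np.sqrt(14.0))
```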

Exercise Given A=[111211]A = \begin{bmatrix} 1 & 1 \\ 1 & -2 \\ 1 & 1 \end{bmatrix} and b=[321]\vec{b} = \begin{bmatrix} 3 \\ 2 \\ 1 \end{bmatrix}. Find the least squares solution x\vec{x}^{\star} of the system Ax=bA\vec{x} = \vec{b}.

Solve ATAx=ATbA^{T}A\vec{x} = A^{T}\vec{b} (Normal equation)

ATA=[111121][111211]=[3006]A^{T}A = \begin{bmatrix} 1 & 1 & 1 \\ 1 & -2 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 1 & -2 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} 3 & 0 \\ 0 & 6 \end{bmatrix}

ATb=[111121][321]=[60]A^{T}\vec{b} = \begin{bmatrix} 1 & 1 & 1 \\ 1 & -2 & 1 \end{bmatrix} \begin{bmatrix} 3 \\ 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 6 \\ 0 \end{bmatrix}

(ATA)1=118[6003]=[130016]\left( A^{T}A \right) ^{-1} = \frac{1}{18} \begin{bmatrix} 6 & 0 \\ 0 & 3 \end{bmatrix} = \begin{bmatrix} \frac{1}{3} & 0 \\ 0 & \frac{1}{6} \end{bmatrix}

x=[130016][60]=[20]\vec{x}^{\star} = \begin{bmatrix} \frac{1}{3} & 0 \\ 0 & \frac{1}{6} \end{bmatrix} \begin{bmatrix} 6 \\ 0 \end{bmatrix} = \begin{bmatrix} 2 \\ 0 \end{bmatrix}

Remark: In examples so far, our matrix ATAA^{T}A was invertible and hence we had a unique least squares solution

ATAx=ATb and ATA invertible x=(ATA)1ATb.A^T A \vec{x}^{\star} = A^T \vec{b} \text{ and } A^T A \text{ invertible } \to \vec{x}^{\star} = \left( A^T A \right) ^{-1} A^T \vec{b}.

Generally, there need not be a unique least squares solution.

One can show: For an n×mn\times m matrix AA, ker(ATA)=ker(A)\text{ker}\left( A^{T}A \right) = \text{ker}\left( A \right)

  • When AA has linearly independent columns, ATAA^{T} A is invertible.
    • A^{T}A is m\times m with \text{ker}\left( A^{T}A \right) = \{ \vec{0} \}, so A^{T}A is invertible
  • When AA has linearly dependent columns, ATAA^{T}A is not invertible.
    • A^{T}A is m\times m with \text{rank}\left( A^{T}A \right) < m. The normal equation has at least one free variable (and is always consistent), so we have infinitely many least squares solutions.

Example

Find the least squares solutions to Ax=bA\vec{x} = \vec{b} where A=[2400]A = \begin{bmatrix} 2 & 4 \\ 0 & 0 \end{bmatrix} and b=[12]\vec{b} = \begin{bmatrix} 1 \\ 2 \end{bmatrix}.

ATA=[2040][2400]=[48816]A^{T}A = \begin{bmatrix} 2 & 0 \\ 4 & 0 \end{bmatrix} \begin{bmatrix} 2 & 4 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} 4 & 8 \\ 8 & 16 \end{bmatrix} (Not invertible)

ATb=[2040][12]=[24]A^{T} \vec{b} = \begin{bmatrix} 2 & 0 \\ 4 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ 2 \end{bmatrix} = \begin{bmatrix} 2 \\ 4 \end{bmatrix}

[4828164][482000][1212000]\begin{bmatrix} 4 & 8 & | & 2 \\ 8 & 16 & | & 4 \end{bmatrix} \to \begin{bmatrix} 4 & 8 & | & 2 \\ 0 & 0 & | & 0 \end{bmatrix} \to \begin{bmatrix} 1 & 2 & | & \frac{1}{2} \\ 0 & 0 & | & 0 \end{bmatrix}

x1=122tx_1 = \frac{1}{2} - 2t

x2=tx_2 = t

[122tt],tR\begin{bmatrix} \frac{1}{2} - 2t \\ t \end{bmatrix} , t \in \mathbb{R} (Least squares solutions)

Error:

\vec{b} - A\vec{x}^{\star} = \begin{bmatrix} 1 \\ 2 \end{bmatrix} - \begin{bmatrix} 2 & 4 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} \frac{1}{2} - 2t \\ t \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \end{bmatrix} - \begin{bmatrix} 1 - 4t + 4t \\ 0 \end{bmatrix}

=[12][10]=[02]= \begin{bmatrix} 1 \\ 2 \end{bmatrix} - \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 2 \end{bmatrix} (Error: 2)

In the above example, we can solve using our original discussion of least squares. Solve the linear system Ax=projim(A)(b)A\vec{x} = \text{proj}_{\text{im}\left( A \right) }\left( \vec{b} \right) (We’ll get the same answer):

A=[2400]A = \begin{bmatrix} 2 & 4 \\ 0 & 0 \end{bmatrix} and b=[12]\vec{b} = \begin{bmatrix} 1 \\ 2 \end{bmatrix}

im(A)=span{[10]}\text{im}\left( A \right) = \text{span} \{ \begin{bmatrix} 1 \\ 0 \end{bmatrix} \}

projim(A)(b)=[10]\text{proj}_{\text{im}\left( A \right) } \left( \vec{b} \right) = \begin{bmatrix} 1 \\ 0 \end{bmatrix}

[2400][x1x2]=[10]\begin{bmatrix} 2 & 4 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}

[241000][1212000]\begin{bmatrix} 2 & 4 & | & 1 \\ 0 & 0 & | & 0 \end{bmatrix} \to \begin{bmatrix} 1 & 2 & | & \frac{1}{2} \\ 0 & 0 & | & 0 \end{bmatrix}

x1=122tx_1 = \frac{1}{2} - 2t

x2=tx_2 = t (free)

[122tt],tR\begin{bmatrix} \frac{1}{2} - 2t \\ t \end{bmatrix} , t \in \mathbb{R}
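A NumPy sketch of this rank-deficient example: every member of the family has error 2, and `np.linalg.lstsq` still works when A^T A is not invertible (it returns one particular member, the one of minimum norm):

```python
import numpy as np

A = np.array([[2.0, 4.0], [0.0, 0.0]])
b = np.array([1.0, 2.0])

# Every member of the family (1/2 - 2t, t) leaves the same error 2
family_ok = all(np.isclose(np.linalg.norm(b - A @ np.array([0.5 - 2*t, t])), 2.0)
                for t in [0.0, 1.0, -3.5])

# lstsq returns one member of the family (the minimum-norm one)
x_min, *_ = np.linalg.lstsq(A, b, rcond=None)
is_in_family = np.isclose(x_min[0], 0.5 - 2 * x_min[1])
min_norm_error = np.isclose(np.linalg.norm(b - A @ x_min), 2.0)
```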

6.1/6.2 Determinants

6.1/6.2 Determinants

Suppose AA is n×nn\times n. The determinant of AA is a number such that AA is invertible if and only if det(A)0\text{det}\left( A \right) \neq 0.

Notation: det(A)\text{det}\left( A \right) or A \mid A \mid

The determinant of a 2×22 \times 2 matrix abcd=adbc\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad-bc.

Determinants have many properties that help us compute \mid A \mid for an n\times n matrix A.

  1. In=1 \mid I_{n} \mid =1 ; 1001=10=1\begin{vmatrix} 1 & 0 \\ 0 & 1 \end{vmatrix} = 1-0 =1
  2. Determinant changes sign when you interchange 2 rows.
    • cdab=cbad=(adbc)=abcd\begin{vmatrix} c & d \\ a & b \end{vmatrix} = c b - ad = - \left( ad - bc \right) = - \begin{vmatrix} a & b \\ c & d \end{vmatrix}

Example

\begin{vmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{vmatrix} = - \begin{vmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{vmatrix} = -1

  3. Determinant is linear in each row separately:
    • \begin{vmatrix} ka & kb \\ c & d \end{vmatrix} = kad - kbc = k \left( ad - bc \right) = k \begin{vmatrix} a & b \\ c & d \end{vmatrix}

a1+a2b1+b2cd=(a1+a2)d(b1+b2)c=a1db1c+a2db2c=a1b1cd+a2b2cd\begin{vmatrix} a_1 + a_2 & b_1 + b_2 \\ c & d \end{vmatrix} = \left( a_1 + a_2 \right) d - \left( b_1 + b_2 \right) c = a_1 d - b_1 c + a_2d - b_2 c = \begin{vmatrix} a_1 & b_1 \\ c & d \end{vmatrix} + \begin{vmatrix} a_2 & b_2 \\ c & d \end{vmatrix}

Example

[551015]=5[1123]\begin{bmatrix} 5 & 5 \\ 10 & 15 \end{bmatrix} = 5 \begin{bmatrix} 1 & 1 \\ 2 & 3 \end{bmatrix}. But 55101551123\begin{vmatrix} 5 & 5 \\ 10 & 15 \end{vmatrix} \neq 5 \begin{vmatrix} 1 & 1 \\ 2 & 3 \end{vmatrix}

551015=5(15)5(10)=5(5)=52\begin{vmatrix} 5 & 5 \\ 10 & 15 \end{vmatrix} = 5 \left( 15 \right) - 5 \left( 10 \right) = 5(5) = 5^{2}

1123=32=1\begin{vmatrix} 1 & 1 \\ 2 & 3 \end{vmatrix} = 3-2=1

Example: If A is 6\times 6, then \text{det}\left( 3A \right) = 3^{6} \text{det}\left( A \right) .

Example

001020100=100020001=100020001=2100010001=2\begin{vmatrix} 0 & 0 & 1 \\ 0 & 2 & 0 \\ -1 & 0 & 0 \end{vmatrix} = - \begin{vmatrix} -1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{vmatrix} = \begin{vmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{vmatrix} = 2 \begin{vmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{vmatrix} = 2

  4. If 2 rows of A are equal, then \text{det}\left( A \right) =0 (\begin{vmatrix} a & b \\ a & b \end{vmatrix} = ab - ab = 0)
  5. Adding a multiple of one row to another row does not change the determinant. (\begin{vmatrix} a & b \\ c+ka & d+kb \end{vmatrix} = \begin{vmatrix} a & b \\ c & d \end{vmatrix} + k \begin{vmatrix} a & b \\ a & b \end{vmatrix} = \begin{vmatrix} a & b \\ c & d \end{vmatrix} )

Example

abc1382a+12b+32c+8=abc138138=0\begin{vmatrix} a & b & c \\ 1 & 3 & 8 \\ 2a+1 & 2b+3 & 2c + 8 \end{vmatrix} = \begin{vmatrix} a & b & c \\ 1 & 3 & 8 \\ 1 & 3 & 8 \end{vmatrix} = 0

Note:

We see how elementary row operations affect the determinant.

  • Interchange two rows: Change the sign of the determinant
  • Multiply a row by a nonzero constant kk: multiplies the determinant by kk
  • Add a multiple of one row to another: does not change the determinant
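These three row-operation effects can be checked numerically; a NumPy sketch on an arbitrary 2×2 matrix (not from lecture):

```python
import numpy as np

# An arbitrary 2x2 example with det(A) = -1
A = np.array([[1.0, 2.0], [3.0, 5.0]])
d = np.linalg.det(A)

# Interchange two rows: sign flips
sign_flips = np.isclose(np.linalg.det(A[[1, 0], :]), -d)

# Multiply a row by k: determinant is multiplied by k
scaled = A.copy()
scaled[0] *= 7.0
row_scaling = np.isclose(np.linalg.det(scaled), 7.0 * d)

# Add a multiple of one row to another: determinant unchanged
sheared = A.copy()
sheared[1] += 10.0 * A[0]
shear_invariant = np.isclose(np.linalg.det(sheared), d)
```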

Example

Suppose A=[v1v2v3]A = \begin{bmatrix} - & \vec{v}_1 & - \\ - & \vec{v}_2 & - \\ - & \vec{v}_3 & - \end{bmatrix} is 3×33\times 3 with det(A)=6\text{det}\left( A \right) =6 then,

  • v2v1v3=6\begin{vmatrix} - & \vec{v}_2 & - \\ - & \vec{v}_1 & - \\ - & \vec{v}_3 & - \end{vmatrix} = -6
  • \begin{vmatrix} - & \vec{v}_1 & - \\ - & \vec{v}_2 & - \\ - & \vec{v}_1 + \vec{v}_2 + \vec{v}_3 & - \end{vmatrix} = 6

Example

111222333=111000000=0\begin{vmatrix} 1 & 1 & 1 \\ 2 & 2 & 2 \\ 3 & 3 & 3 \end{vmatrix} = \begin{vmatrix} 1 & 1 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{vmatrix} = 0

  6. If A has a row of 0’s, then \text{det}\left( A \right) = 0 (\begin{vmatrix} 0 & 0 \\ c & d \end{vmatrix} = 0d - 0c = 0)

Note: At this point, we can calculate any determinant. Moreover, we see that det(A)0\text{det}\left( A \right) \neq 0 if and only if AA is invertible.

  • Perform row operations to find rref(A)\text{rref}\left( A \right)
  • rref(A)=In\text{rref}\left( A \right) = I_{n} if and only if AA is invertible
  7. \text{det}\left( A \right) = \text{det}\left( A^{T} \right) (\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc = \begin{vmatrix} a & c \\ b & d \end{vmatrix} = ad - cb )

Example

10050103001200010=10000100001053210=10000100001000010=10I4=10\begin{vmatrix} 1 & 0 & 0 & 5 \\ 0 & 1 & 0 & 3 \\ 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 10 \end{vmatrix} = \begin{vmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 5 & 3 & 2 & 10 \end{vmatrix} = \begin{vmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 10 \end{vmatrix} = 10 \mid I_{4} \mid = 10

\text{det}\left( A \right) and \text{det}\left( \text{rref}\left( A \right) \right) always differ by a nonzero multiplier.

\text{det}\left( \text{rref}\left( A \right) \right) = \begin{cases} 0 & \text{if row of 0's} \\ 1 & \text{if } \text{rref} \left( A \right) = I_{n} \end{cases}

Exercise:

  • 1000030000100005=151000010000100001=15\begin{vmatrix} 1 & 0 & 0 & 0 \\ 0 & 3 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & 5 \end{vmatrix} = -15 \begin{vmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{vmatrix} = -15
  • A=1000230001107315=15A = \begin{vmatrix} 1 & 0 & 0 & 0 \\ 2 & 3 & 0 & 0 \\ 0 & 1 & -1 & 0 \\ 7 & 3 & 1 & 5 \end{vmatrix} = -15

How to compute using cofactors (The book has other methods):

Definition:

For an n×nn\times n matrix AA,

  • AijA_{ij} is (n1)×(n1)(n-1)\times (n-1) matrix obtained by removing row i and column j from matrix AA.
  • The determinant Aij\mid A_{ij} \mid is called the (i,j)(i,j) minor of AA.

Example

A23=[100010735]A_{23} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 7 & 3 & 5 \end{bmatrix}

Cofactor expansion for calculating det(A)\text{det}\left( A \right)

det(A)=a11det(A11)a12det(A12)++a1n(1)n+1det(A1n)\text{det}\left( A \right) = a_{11}\text{det}\left( A_{11} \right) - a_{12}\text{det}\left( A_{12} \right) + \cdots + a_{1n}\left( -1 \right) ^{n+1} \text{det}\left( A_{1n} \right)

=a11c11+a12c12+a13c13++a1nc1n= a_{11}c_{11} + a_{12}c_{12} + a_{13}c_{13} + \cdots + a_{1n}c_{1n}

Where Cij=(1)i+jAijC_{ij} = \left( -1 \right) ^{i+j} \mid A_{ij} \mid is called a cofactor.

For 3×33\times 3 matrix:

a11a12a13a21a22a23a31a32a33=a11a22a23a32a33a12a21a23a31a33+a13a21a22a31a32\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{11} \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} - a_{12} \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} + a_{13} \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}

Or another expansion:

a11a12a13a21a22a23a31a32a33=a12a21a23a31a33+a22a11a13a31a33a32a11a13a21a23\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = -a_{12} \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} + a_{22} \begin{vmatrix} a_{11} & a_{13} \\ a_{31} & a_{33} \end{vmatrix} - a_{32} \begin{vmatrix} a_{11} & a_{13} \\ a_{21} & a_{23} \end{vmatrix}
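These cofactor formulas translate directly into a short recursive routine. A minimal sketch in Python (the helper name `det` is my own, not from the notes):

```python
def det(M):
    """Determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j in range(len(M)):
        # Delete row 1 and column j to form the minor matrix M_{1j}
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        # Cofactor sign (-1)^{1+j} (0-indexed here, so (-1)^j)
        total += (-1) ** j * M[0][j] * det(minor)
    return total
```

For instance, `det([[5, 4, 3], [0, -1, 2], [0, 0, 6]])` returns -30, matching the triangular example below.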

Example

120410113=1101324013+04111\begin{vmatrix} 1 & 2 & 0 \\ 4 & 1 & 0 \\ 1 & -1 & 3 \end{vmatrix} = 1 \begin{vmatrix} 1 & 0 \\ -1 & 3 \end{vmatrix} - 2 \begin{vmatrix} 4 & 0\\ 1 & 3 \end{vmatrix} + 0 \begin{vmatrix} 4 & 1 \\ 1 & -1 \end{vmatrix}

=1(30)2(120)=21= 1 (3-0) - 2 (12-0) = -21

Example

0002100301020013=(1)1+42100010001=2\begin{vmatrix} 0 & 0 & 0 & 2 \\ 1 & 0 & 0 & 3 \\ 0 & 1 & 0 & 2 \\ 0 & 0 & 1 & 3 \end{vmatrix} = (-1)^{1+4} 2 \begin{vmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{vmatrix} = -2

Example

543012006=51206+0+0=5(1)(6)=30\begin{vmatrix} 5 & 4 & 3 \\ 0 & -1 & 2 \\ 0 & 0 & 6 \end{vmatrix} = 5 \begin{vmatrix} -1 & 2 \\ 0 & 6 \end{vmatrix} + 0 + 0 = 5(-1)(6) = -30

  1. If AA is upper triangular (or lower triangular), det(A)\text{det}\left( A \right) is product of diagonal entries.

Example

For which values of kk is the matrix [0k1234567]\begin{bmatrix} 0 & k & 1 \\ 2 & 3 & 4 \\ 5 & 6 & 7 \end{bmatrix} invertible?

0k1234567=k2457+12356\begin{vmatrix} 0 & k & 1 \\ 2 & 3 & 4 \\ 5 & 6 & 7 \end{vmatrix} = -k \begin{vmatrix} 2 & 4 \\ 5 & 7 \end{vmatrix} + 1 \begin{vmatrix} 2 & 3 \\ 5 & 6 \end{vmatrix}

=k(1420)+1(1215)= -k (14-20) + 1(12-15)

=6k3= 6k-3

Need: 6k306k-3 \neq 0

k12\therefore k\neq \frac{1}{2}

Exercise: For which values of λ\lambda is the matrix AλIA - \lambda I not invertible where A=[4227]A = \begin{bmatrix} 4 & 2 \\ 2 & 7 \end{bmatrix}?

AλI=[4λ227λ]A - \lambda I = \begin{bmatrix} 4-\lambda & 2 \\ 2 & 7-\lambda \end{bmatrix}

Want λ\lambda so that det(AλI)=0\text{det}\left( A-\lambda I \right) = 0

4λ227λ=(4λ)(7λ)4=2811λ+λ24\begin{vmatrix} 4-\lambda & 2 \\ 2 & 7 - \lambda \end{vmatrix} = (4-\lambda) (7-\lambda) -4 = 28 - 11\lambda + \lambda ^2 - 4

=λ211λ+24=(λ8)(λ3)= \lambda ^{2} - 11\lambda + 24 = (\lambda - 8) (\lambda - 3)

det(AλI)=0\text{det}(A-\lambda I) = 0 if and only if λ=8\lambda = 8 or λ=3\lambda = 3
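The claimed roots can be double-checked by evaluating the determinant above at each λ\lambda; a quick sketch (the function name is my own):

```python
def f(lam):
    # det(A - lambda*I) for A = [[4, 2], [2, 7]]
    return (4 - lam) * (7 - lam) - 2 * 2

# lambda = 8 and lambda = 3 are exactly the roots
assert f(8) == 0 and f(3) == 0
assert f(0) == 24   # f(0) = det(A), which is nonzero: A itself is invertible
```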

Example

Let A=[43210x7202344351]A = \begin{bmatrix} 4 & 3 & 2 & 1 \\ 0 & x & 7 & 2 \\ 0 & 2 & 3 & 4 \\4 & 3 & 5 & 1 \end{bmatrix}

  • Compute the determinant of AA

det(A)=43210x7202340030=4x72234030=4(3)x224\text{det}\left( A \right) = \begin{vmatrix} 4 & 3 & 2 & 1 \\ 0 & x & 7 & 2 \\ 0 & 2 & 3 & 4 \\ 0 & 0 & 3 & 0 \end{vmatrix} = 4 \begin{vmatrix} x & 7 & 2 \\ 2 & 3 & 4 \\ 0 & 3 & 0 \end{vmatrix} = -4 (3) \begin{vmatrix} x & 2 \\ 2 & 4 \end{vmatrix}

=12(4x4)=48x+48= -12 (4x -4) = -48x + 48

  • For which value of xx is the matrix AA not invertible?

x=1x=1

This is when det(A)=0\text{det}\left( A \right) = 0 or 48x+48=0-48x + 48 = 0
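The formula det(A) = -48x + 48 can be sanity-checked by direct cofactor expansion of the original matrix; a sketch using a small recursive determinant (helper names are my own):

```python
def det(M):
    # Cofactor expansion along the first row
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j]
               * det([r[:j] + r[j + 1:] for r in M[1:]])
               for j in range(len(M)))

def A(x):
    return [[4, 3, 2, 1], [0, x, 7, 2], [0, 2, 3, 4], [4, 3, 5, 1]]

for x in (0, 1, 2, 5):
    assert det(A(x)) == -48 * x + 48   # matches the derivation above
assert det(A(1)) == 0                  # A is singular exactly when x = 1
```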

Properties of Determinants: For an n×nn\times n matrix AA, the determinant of AA, A \mid A \mid or det(A)\text{det}\left( A \right), is a number satisfying:

  1. In=1 \mid I_{n} \mid = 1
  2. Determinant changes sign when 2 rows in matrix are exchanged
  3. Determinant is linear in each row separately (called multi linear).
  4. If 2 rows of AA are equal, then det(A)=0\text{det}\left( A \right) =0
  5. Adding a multiple of one row to another row does not change the determinant.
  6. If AA has a row of zeros, then det(A)=0\text{det}\left( A \right) = 0
  7. For any n×nn\times n matrix AA, det(A)=det(AT)\text{det}\left( A \right) = \text{det}\left( A^{T} \right).
  8. If AA is upper triangular (or lower triangular), then det(A)\text{det}\left( A \right) is the product of the diagonal entries
  9. If AA and BB are n×nn\times n matrices, then det(AB)=det(A)det(B)\text{det}\left( AB \right) = \text{det}\left( A \right) \text{det}(B)

Recall that det(A)0\text{det}\left( A \right) \neq 0 if and only if AA is invertible.

Illustrating Property #9 for 2×22\times 2 matrices

A=[abcd]A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}

B=[xyzw]B = \begin{bmatrix} x & y \\ z & w \end{bmatrix}

AB=[ax+bzay+bwcx+dzcy+dw]A\cdot B = \begin{bmatrix} ax+bz & ay+bw \\ cx+dz & cy+dw \end{bmatrix}

det(A)det(B)=(adbc)(wxyz)\text{det}\left( A \right) \cdot \text{det}\left( B \right) = (ad-bc) (wx-yz)

det(AB)=adwxadyzbcwx+bcyz\text{det}\left( A\cdot B \right) = adwx-adyz-bcwx+bcyz

det(AB)=det(A)det(B)\text{det}\left( AB \right) = \text{det}\left( A \right) \text{det}\left( B \right)
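Property #9 is also easy to spot-check numerically; a minimal sketch with sample 2×2 matrices (the entries are my own choice, not from the notes):

```python
def det2(m):
    (a, b), (c, d) = m
    return a * d - b * c

def matmul2(A, B):
    # 2x2 matrix product
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[1, 4], [2, 3]]    # sample matrices (assumption)
B = [[2, 0], [5, -1]]
assert det2(matmul2(A, B)) == det2(A) * det2(B)   # det(AB) = det(A) det(B)
```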

Example

A=[147022004]A = \begin{bmatrix} 1 & 4 & 7 \\ 0 & 2 & 2 \\ 0 & 0 & 4 \end{bmatrix}

Find A=1(2)(4)=8 \mid A \mid = 1 (2)(4) = 8

Find A3=AAA=AAA=83 \mid A^{3} \mid = \mid AAA \mid = \mid A \mid \mid A \mid \mid A \mid = 8^{3}

Find A1=18 \mid A^{-1} \mid = \frac{1}{8}

Example

Suppose MM and NN are 3×33\times 3 matrices with det(M)=4\text{det}\left( M \right) = 4 and det(N)=1\text{det}\left( N \right) = -1. Find the determinant of the matrix 2M1N2MT2M^{-1}N^{2}M^{T}.

231det(M)(det(N))2det(M)=2314(1)24=82^{3}\frac{1}{\text{det}\left( M \right) } \left( \text{det}\left( N \right) \right) ^{2} \text{det}\left( M \right) = 2^{3}\frac{1}{4} (-1)^{2} \cdot 4 = 8

Example

Suppose v1\vec{v}_1, v2\vec{v}_2, and v3\vec{v}_3 are row vectors in R3\mathbb{R}^{3} and A=[v1v2v3]A = \begin{bmatrix} - & \vec{v}_1 & - \\ - & \vec{v}_2 & - \\ - & \vec{v}_3 & - \end{bmatrix} satisfies det(A)=5\text{det}\left( A \right) = 5.

  • det(3A)=335\text{det}\left( 3A \right) = 3^{3}5
  • det(A)=(1)35=5\text{det}\left( -A \right) = (-1)^{3}5 = -5
  • 00401v1v23v30=(1)1+34det(AT)=4(5)=20\begin{vmatrix} 0 & 0 & 4 & 0 \\ | & | & 1 & | \\ \vec{v}_1 & \vec{v}_2 & 3 & \vec{v}_3 \\ | & | & 0 & | \end{vmatrix} = (-1)^{1+3}4 \text{det} \left( A^{T} \right) = 4(5)=20

Suppose AA is an orthogonal matrix. What can det(A)\text{det}\left( A \right) be?

Know: AA is invertible. det(A)0\text{det}\left( A \right) \neq 0

Use: ATA=In    det(AT)det(A)=1A^{T}A = I_{n} \implies \text{det}\left( A^{T} \right) \text{det}\left( A \right) =1 Property: det(AT)=det(A)    (det(A))2=1\text{det}\left( A^{T} \right) = \text{det}\left( A \right) \implies \left( \text{det}\left( A \right) \right) ^{2}=1

Answer: det(A)=1\text{det}\left( A \right) = 1 or det(A)=1\text{det}\left( A \right) = -1

7.1 Diagonalization

7.1 Diagonalization

Suppose D=[d1000d2000dn]D = \begin{bmatrix} d_1 & 0 & \cdots & 0 \\ 0 & d_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & d_n \end{bmatrix} is a n×nn\times n diagonal matrix. Then,

  • For kk positive integer, Dk=[d1k000d2k000dnk]D^{k} = \begin{bmatrix} d_1^{k} & 0 & \cdots & 0 \\ 0 & d_2^{k} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & d_n^{k} \end{bmatrix}
  • det(D)=d1d2d3dn\text{det}\left( D \right) = d_1d_2d_3 \cdots d_n
  • D1=[1d10001d20001dn]D^{-1} = \begin{bmatrix} \frac{1}{d_1} & 0 & \cdots & 0 \\ 0 & \frac{1}{d_2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \frac{1}{d_n} \end{bmatrix} (if d1,d2,d3,,dn0d_1, d_2, d_3, \cdots , d_n \neq 0)

Definition:

A square matrix AA is diagonalizable provided there exists an invertible matrix SS and diagonal matrix BB such that S1AS=BS^{-1}AS = B.

When we diagonalize a matrix AA, we find an invertible matrix SS and a diagonal matrix BB such that S1AS=BS^{-1}AS = B.

Notice: S1AS=BAS=SBA=SBS1S^{-1}AS = B \leftrightarrow AS = SB \leftrightarrow A = SBS^{-1}

  • det(A)=det(SBS1)=det(S)det(B)det(S1)=det(B)\text{det}\left( A \right) = \text{det}\left( SBS^{-1} \right) = \text{det}\left( S \right) \text{det}\left( B \right) \text{det}\left( S^{-1} \right) = \text{det}\left( B \right)
  • Ak=(SBS1)(SBS1)(SBS1)(SBS1)=SBkS1A^{k} = (SBS^{-1}) (SBS^{-1}) (SBS^{-1}) \cdots (SBS^{-1}) = SB^{k}S^{-1}
  • AA is invertible if and only if BB is invertible. A1=SB1S1A^{-1} = SB^{-1}S^{-1}

Check: A(SB1S1)=SBS1(SB1S1)=InA(SB^{-1}S^{-1}) = SBS^{-1}(SB^{-1}S^{-1}) = I_{n}

Example

Let A=[5115]A = \begin{bmatrix} 5 & 1 \\ 1 & 5 \end{bmatrix}. AA is diagonalizable with S=[1111]S = \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix} and B=[6004]B = \begin{bmatrix} 6 & 0 \\ 0 & 4 \end{bmatrix}.

Check

  • S1=12[1111]S^{-1} = \frac{1}{2}\begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix}
  • BS1=[6004][12121212]=[3322]BS^{-1} = \begin{bmatrix} 6 & 0 \\ 0 & 4 \end{bmatrix} \begin{bmatrix} \frac{1}{2} & \frac{1}{2} \\ -\frac{1}{2} & \frac{1}{2} \end{bmatrix} = \begin{bmatrix} 3 & 3 \\ -2 & 2 \end{bmatrix}
  • SBS1=[1111][3322]=[5115]SBS^{-1} = \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} 3 & 3 \\ -2 & 2 \end{bmatrix} = \begin{bmatrix} 5 & 1 \\ 1 & 5 \end{bmatrix}
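The check above, and the formula Ak=SBkS1A^{k} = SB^{k}S^{-1}, can be scripted with exact rational arithmetic; a sketch (helper names are my own):

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def inv2(m):
    # Inverse of a 2x2 matrix via the adjugate formula
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

S = [[1, -1], [1, 1]]
B = [[6, 0], [0, 4]]
A = matmul(matmul(S, B), inv2(S))     # S B S^{-1} recovers [[5, 1], [1, 5]]
B3 = [[6 ** 3, 0], [0, 4 ** 3]]       # powers of a diagonal matrix are easy
A3 = matmul(matmul(S, B3), inv2(S))   # equals A*A*A
```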

Question: What does diagonalizable mean?

Suppose B=[λ1000λ2000λn]B = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix}, S=[v1v2vn]S = \begin{bmatrix} | & | & & | \\ \vec{v}_1 & \vec{v}_2 & \cdots & \vec{v}_n \\ | & | & & | \end{bmatrix}, and S1AS=BS^{-1}AS = B. Then,

  • AS=A[v1v2vn]=[Av1Av2Avn]AS = A \begin{bmatrix} | & | & & | \\ \vec{v}_1 & \vec{v}_2 & \cdots & \vec{v}_n \\ | & | & & | \end{bmatrix} = \begin{bmatrix} | & | & & | \\ A\vec{v}_1 & A\vec{v}_2 & \cdots & A\vec{v}_n \\ | & | & & | \end{bmatrix}
  • SB=[v1v2vn][λ1000λ2000λn]=[λ1v1λ2v2λnvn]SB = \begin{bmatrix} | & | & & | \\ \vec{v}_1 & \vec{v}_2 & \cdots & \vec{v}_n \\ | & | & & | \end{bmatrix} \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix} = \begin{bmatrix} | & | & & | \\ \lambda_1 \vec{v}_1 & \lambda_2 \vec{v}_2 & \cdots & \lambda_n \vec{v}_n \\ | & | & & | \end{bmatrix}

Notice: AS=SBAS = SB if and only if Avi=λiviA\vec{v}_i = \lambda_i \vec{v}_i for 1in1\le i \le n.

Note: SS invertible. Columns of SS are independent and form a basis for Rn\mathbb{R}^{n}.

Answer: An n×nn\times n matrix AA is diagonalizable if and only if there exists a basis {v1,v2,,vn}\{ \vec{v}_1 , \vec{v}_2 , \cdots , \vec{v}_n \} for Rn\mathbb{R}^{n} and scalars λ1,λ2,,λn\lambda_1 , \lambda_2 , \cdots , \lambda_n with Avi=λiviA\vec{v}_i = \lambda_i \vec{v}_i for i=1,2,,ni=1,2,\cdots , n.

In our example: A=[5115]A = \begin{bmatrix} 5 & 1 \\ 1 & 5 \end{bmatrix}. We had S=[1111]S = \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix} and B=[6004]B = \begin{bmatrix} 6 & 0 \\ 0 & 4 \end{bmatrix}.

Basis for R2\mathbb{R}^{2} : {[11],[11]}\{ \begin{bmatrix} 1 \\ 1 \end{bmatrix} , \begin{bmatrix} -1 \\ 1 \end{bmatrix} \}

A[11]=[5115][11]=[66]=6[11]A \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 5 & 1 \\ 1 & 5 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 6 \\ 6 \end{bmatrix} = 6 \begin{bmatrix} 1 \\ 1 \end{bmatrix}

A[11]=[5115][11]=[44]=4[11]A \begin{bmatrix} -1 \\ 1 \end{bmatrix} = \begin{bmatrix} 5 & 1 \\ 1 & 5 \end{bmatrix} \begin{bmatrix} -1 \\ 1 \end{bmatrix} = \begin{bmatrix} -4 \\ 4 \end{bmatrix} = 4 \begin{bmatrix} -1 \\ 1 \end{bmatrix}

Definition:

  • A nonzero vector v\vec{v} in Rn\mathbb{R}^{n} is an eigenvector of AA with eigenvalue λ\lambda provided Av=λvA \vec{v} = \lambda \vec{v}. Note, AvA\vec{v} is parallel to v\vec{v}
  • A basis {v1,v2,,vn}\{ \vec{v}_1 , \vec{v}_2 , \cdots , \vec{v}_n \} for Rn\mathbb{R}^{n} is called an eigenbasis for AA provided there exist scalars λ1,,λn\lambda_1 , \cdots , \lambda_n with Avi=λiviA\vec{v}_i = \lambda_i \vec{v}_i for 1in1 \le i \le n.

Note: With this language, an n×nn\times n matrix AA is diagonalizable if and only if AA has an eigenbasis. (There exists a basis for Rn\mathbb{R}^{n} of eigenvectors for AA).

Example

Find all 2×22\times 2 matrices for which [11]\begin{bmatrix} 1 \\ 1 \end{bmatrix} is an eigenvector with eigenvalue λ=6\lambda = 6.

Want: [abcd][11]=6[11]\begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = 6 \begin{bmatrix} 1 \\ 1 \end{bmatrix}. Note [5115]\begin{bmatrix} 5 & 1 \\ 1 & 5 \end{bmatrix} is of this type.

[abcd][11]=[a+bc+d]\begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} a+b \\ c+d \end{bmatrix}

a+b=6    b=6aa+b = 6 \implies b = 6-a

c+d=6    d=6cc+d = 6 \implies d=6-c

[a6ac6c]a,cR\begin{bmatrix} a & 6-a \\ c & 6-c \end{bmatrix} a,c \in \mathbb{R}
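A quick check that every matrix of this family really sends [11]\begin{bmatrix} 1 \\ 1 \end{bmatrix} to 6 times itself (the sample values of aa and cc are my own):

```python
def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

for a, c in [(5, 1), (0, 6), (-2, 3)]:      # sample parameter values (assumption)
    A = [[a, 6 - a], [c, 6 - c]]
    assert matvec(A, [1, 1]) == [6, 6]      # A [1,1]^T = 6 [1,1]^T
```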

Example

Suppose AA is the 2×22\times 2 matrix of reflection about line y=2xy=2x. Is AA diagonalizable? If so, diagonalize AA.

Yes!

L=span{[12]}L = \text{span}\{ \begin{bmatrix} 1 \\ 2 \end{bmatrix} \}

refL(x)=2projL(x)x\text{ref}_{L}\left( \vec{x} \right) = 2 \text{proj}_{L}\left( \vec{x} \right) - \vec{x}

Matrix: 211+4[1224][1001]=[35454535]2 \cdot \frac{1}{1+4} \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix} - \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} -\frac{3}{5} & \frac{4}{5} \\ \frac{4}{5} & \frac{3}{5} \end{bmatrix}

refL[12]=[12]\text{ref}_{L} \begin{bmatrix} 1\\ 2 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \end{bmatrix} (λ=1\lambda = 1)

refL[21]=[21]\text{ref}_{L} \begin{bmatrix} 2 \\ -1 \end{bmatrix} = - \begin{bmatrix} 2 \\ -1 \end{bmatrix} (λ=1\lambda = -1)

S=[1221]S = \begin{bmatrix} 1 & 2 \\ 2 & -1 \end{bmatrix}

B=[1001]B = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}

Check: AS=SB=[1221]AS = SB = \begin{bmatrix} 1 & -2 \\ 2 & 1 \end{bmatrix}

Example

Suppose AA is the 2×22\times 2 matrix of projection onto the line L=span{[17]}L = \text{span}\{ \begin{bmatrix} -1 \\ 7 \end{bmatrix} \}. Diagonalize AA if you can.

projL[17]=1[17]\text{proj}_{L} \begin{bmatrix} -1 \\ 7 \end{bmatrix} = 1 \begin{bmatrix} -1 \\ 7 \end{bmatrix} (λ=1\lambda = 1)

projL[71]=0[71]\text{proj}_{L} \begin{bmatrix} 7 \\ 1 \end{bmatrix} = 0 \begin{bmatrix} 7 \\ 1 \end{bmatrix} (λ=0\lambda = 0)

S=[1771]S = \begin{bmatrix} -1 & 7 \\ 7 & 1 \end{bmatrix}

B=[1000]B = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}

Test 1: A=[1507507504950]A = \begin{bmatrix} \frac{1}{50} & -\frac{7}{50} \\ -\frac{7}{50} & \frac{49}{50} \end{bmatrix}

Check: AS=SB=[1070]AS = SB = \begin{bmatrix} -1 & 0 \\ 7 & 0 \end{bmatrix}

Example

Suppose AA is the 2×22\times 2 matrix of rotation counterclockwise by θ=π2\theta = \frac{\pi}{2}. Is AA diagonalizable?

No! For v0\vec{v} \neq \vec{0}, AvA \vec{v} is never parallel to v\vec{v}.

A=[0110]A = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}

No eigenvectors and no (real) eigenvalues.

Let VV be a subspace of Rn\mathbb{R}^{n}. Then, the matrix of projection projV:RnRn\text{proj}_{V} : \mathbb{R}^{n} \to \mathbb{R}^{n} is diagonalizable.

Say dim(V)=k\text{dim}\left( V \right) = k.

VV^{\bot} has dimension nkn-k.

Basis for VV: {v1,v2,v3,,vk}\{ \vec{v}_1 , \vec{v}_2 , \vec{v}_3 , \cdots , \vec{v}_k \}

projv(vi)=1vi\text{proj}_{v}\left( \vec{v}_i \right) = 1 \vec{v}_i for 1ik1\le i\le k.

Basis for VV^{\bot}: {wk+1,wk+2,,wn}\{ \vec{w}_{k+1} , \vec{w}_{k+2} , \cdots , \vec{w}_{n} \}

projv(wi)=0wi\text{proj}_{v}\left( \vec{w}_i \right) = 0 \vec{w}_i for k+1ink+1 \le i \le n

S=[v1vkwk+1wn]S = \begin{bmatrix} | & & | & | & & | \\ \vec{v} _1 & \cdots & \vec{v} _k & \vec{w} _{k+1} & \cdots & \vec{w} _n \\ | & & | & | & & | \end{bmatrix}

B=[100010000000]B = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 0 \end{bmatrix} (with kk ones on the diagonal)

Example

Suppose AA is n×nn\times n and v\vec{v} is an eigenvector for AA with eigenvalue λ=4\lambda = 4.

1) Is v\vec{v} an eigenvector for A2A^{2}?

A2v=AAv=A4v=4Av=44v=16vA^{2}\vec{v} = A\cdot A \vec{v} = A 4\vec{v} = 4A\vec{v} = 4\cdot 4 \vec{v} = 16 \vec{v}

Yes! Eigenvalue is λ=16\lambda = 16.

2) Is v\vec{v} an eigenvector for AInA - I_{n}?

(AIn)v=AvInv=4vv=3v\left( A - I_{n} \right) \vec{v} = A\vec{v} - I_{n}\vec{v} = 4\vec{v} - \vec{v} = 3\vec{v}

Yes! Eigenvalue is λ=3\lambda = 3.
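A concrete instance of both facts, using A=[5115]A = \begin{bmatrix} 5 & 1 \\ 1 & 5 \end{bmatrix} from earlier with eigenvector v=[11]\vec{v} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} and λ=6\lambda = 6 (here the eigenvalue is 6 rather than 4):

```python
def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(2)) for i in range(2)]

A = [[5, 1], [1, 5]]   # from the notes: eigenvector v = [1, 1], lambda = 6
v = [1, 1]
Av = matvec(A, v)
assert Av == [6, 6]                                  # A v = 6 v
assert matvec(A, Av) == [36, 36]                     # A^2 v = 6^2 v
assert [Av[i] - v[i] for i in range(2)] == [5, 5]    # (A - I) v = (6 - 1) v
```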

Question: Suppose AA is an n×nn\times n orthogonal matrix. What are possibilities for (real) eigenvalues for AA?

Note: We may not have any eigenvalue, e.g. the 2×22\times 2 (counterclockwise) rotation matrix with angle π2\frac{\pi}{2}.

Answer: λ=1\lambda = 1 or 1-1 only possibilities

Av=v \mid \mid A \vec{v} \mid \mid = \mid \mid \vec{v} \mid \mid

Suppose Av=λvA \vec{v} = \lambda \vec{v}. Then, λv=vλv=v \mid \mid \lambda \vec{v} \mid \mid = \mid \mid \vec{v} \mid \mid \to \mid \lambda \mid \mid \mid \vec{v} \mid \mid = \mid \mid \vec{v} \mid \mid ; v0\vec{v} \neq \vec{0}

λ=1 \mid \lambda \mid = 1

7.2 Finding Eigenvalues

7.2 Finding Eigenvalues

7.1 #7: If v\vec{v} is an eigenvector of the n×nn\times n matrix AA with associated eigenvalue λ\lambda,

1) What can you say about ker(AλIn)\text{ker}\left( A - \lambda I_{n} \right)?

We have Avλv=0A \vec{v} - \lambda \vec{v} = \vec{0}

Equivalently, (AλI)v=0\left( A - \lambda I \right) \vec{v} = \vec{0}.

ker(AλI)\text{ker}\left( A - \lambda I \right) has dimension at least 1.

2) Is the matrix AλInA - \lambda I_{n} invertible?

No! Nullity 1\ge 1. Rank < n

Notice: λ\lambda is an eigenvalue for AA if and only if det(AλI)=0\text{det}\left( A - \lambda I \right) = 0.

Definition:

The characteristic equation of a matrix AA:

det(AλI)=0\text{det} (A - \lambda I) = 0

Solutions λ\lambda to this equation are eigenvalues.

Question: When is 0 an eigenvalue for AA?

Answer:

Precisely when AA is not invertible. A0I=AA - 0I = A

Example

Find the eigenvalues of A=[1254]A = \begin{bmatrix} 1 & 2 \\ 5 & 4 \end{bmatrix} .

λI2=[λ00λ]\lambda I_{2}= \begin{bmatrix} \lambda & 0 \\ 0 & \lambda \end{bmatrix}

0=det(AλI)=1λ254λ=(1λ)(4λ)100 = \text{det}\left( A - \lambda I \right) = \begin{vmatrix} 1-\lambda & 2 \\ 5 & 4-\lambda \end{vmatrix} = \left( 1-\lambda \right) (4- \lambda ) - 10

=λ24λλ+410=λ25λ6=(λ6)(λ+1)= \lambda ^{2} - 4 \lambda - \lambda + 4 - 10 = \lambda ^{2} - 5 \lambda - 6 = (\lambda - 6 ) (\lambda + 1)

0=(λ6)(λ+1)0 = \left( \lambda -6 \right) \left( \lambda + 1 \right)

λ=6,1\lambda = 6, -1
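The steps above amount to solving λ2tr(A)λ+det(A)=0\lambda^2 - \text{tr}(A)\lambda + \text{det}(A) = 0 with the quadratic formula; a sketch for general 2×2 matrices (the function name is my own):

```python
import math

def eigenvalues_2x2(m):
    """Real eigenvalues of a 2x2 matrix via lambda^2 - tr*lambda + det = 0."""
    (a, b), (c, d) = m
    tr, det = a + d, a * d - b * c
    disc = tr * tr - 4 * det
    if disc < 0:
        return []                    # no real eigenvalues
    r = math.sqrt(disc)
    return sorted([(tr - r) / 2, (tr + r) / 2])

eigenvalues_2x2([[1, 2], [5, 4]])    # [-1.0, 6.0]
```

The rotation matrix below, [[0, -1], [1, 0]], returns the empty list: no real eigenvalues.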

Example

Find the eigenvalues of A=[1224]A = \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix}.

0=1λ224λ=(1λ)(4λ)4=λ25λ+44=λ(λ5)0 = \begin{vmatrix} 1-\lambda & 2 \\ 2 & 4-\lambda \end{vmatrix} = \left( 1- \lambda \right) \left( 4 - \lambda \right) - 4 = \lambda ^{2} - 5 \lambda + 4 - 4 = \lambda \left( \lambda - 5 \right)

λ=0,5\lambda = 0, 5

Notice:

  • Product: 05=det(A)0\cdot 5 = \text{det}\left( A \right)
  • Sum: 0+5= sum of diagonal entries. Trace of AA.

Example

A=[0110]A = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} (Matrix of rotation by counterclockwise θ=π2\theta = \frac{\pi}{2})

0=AλI=λ11λ=λ2+10 = \mid A - \lambda I \mid = \begin{vmatrix} -\lambda & -1 \\ 1 & -\lambda \end{vmatrix} = \lambda ^{2} + 1

No real eigenvalues

Generally for n×nn\times n matrix, λ1,λ2,,λn\lambda_1 , \lambda_2 , \cdots , \lambda _n

λ1λ2λ3λn=det(A)\lambda_1\lambda_2\lambda_3 \cdots \lambda_n = \text{det}\left( A \right)

λ1+λ2+λ3++λn=tr(A)\lambda_1 + \lambda_2 + \lambda_3 + \cdots + \lambda_n = \text{tr}\left( A \right) (Trace)

Moreover, for a general 2×22\times 2 matrix A=[abcd]A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}, we see

det(AλI)=aλbcdλ=(aλ)(dλ)bc=λ2aλdλ+adbc=λ2(a+d)λ+(adbc)=λ2tr(A)λ+det(A)\begin{align*} \text{det}(A - \lambda I) & = \begin{vmatrix} a-\lambda & b \\ c & d-\lambda \end{vmatrix} \\ &= (a - \lambda) (d - \lambda) - bc \\ &= \lambda^2 - a\lambda - d \lambda + ad - bc \\ &= \lambda^2 - (a+d)\lambda + (ad-bc) \\ &= \lambda^2 - \text{tr}(A) \lambda + \text{det}(A) \end{align*}

Example

Find eigenvalues for A=[134032001]A = \begin{bmatrix} 1 & 3 & 4 \\ 0 & 3 & 2 \\ 0 & 0 & -1 \end{bmatrix}.

det(AλI)=1λ3403λ2001λ=(1λ)(3λ)(1λ)\text{det}\left( A - \lambda I \right) = \begin{vmatrix} 1-\lambda & 3 & 4 \\ 0 & 3-\lambda & 2 \\ 0 & 0 & -1-\lambda \end{vmatrix} = \left( 1- \lambda \right) \left( 3 - \lambda \right) \left( -1-\lambda \right)

λ=1,3,1\lambda = 1, 3, -1

We see:

  1. When AA is upper triangular (or lower triangular), eigenvalues of AA are along diagonal
  2. Any matrix AA: det(AλI)\text{det}\left( A- \lambda I \right) is polynomial in λ\lambda. Called characteristic polynomial fA(λ)f_{A}\left( \lambda \right)

If AA is n×nn\times n, the characteristic polynomial of AA has degree nn and is of the form

fA(λ)=(λ)n+tr(A)(λ)n1++det(A)f_A (\lambda) = (-\lambda)^n + \text{tr}(A)(-\lambda)^{n-1} + \cdots + \text{det}(A) Eigenvalues of ARoots of characteristic polynomial\text{Eigenvalues of } A \leftrightarrow \text{Roots of characteristic polynomial}

Definition:

An eigenvalue λ0\lambda_{0} of an n×nn\times n matrix AA has algebraic multiplicity kk (notation: almu(λ0)=k\text{almu}\left( \lambda_{0} \right) = k ) provided

fA(λ)=det(AλI)=(λ0λ)kg(λ)f_{A}\left( \lambda \right) = \text{det}\left( A - \lambda I \right) = \left( \lambda _{0} - \lambda \right) ^{k} g(\lambda)

Where g(λ0)0g\left( \lambda_{0} \right) \neq 0.

Example

A=[500250125]A = \begin{bmatrix} 5 & 0 & 0 \\ 2 & 5 & 0 \\ 1 & 2 & 5 \end{bmatrix} has eigenvalue λ=5\lambda = 5 with…

almu(5)=3\text{almu} (5) = 3 as det(AλI)=(5λ)3\text{det}\left( A - \lambda I \right) = \left( 5 - \lambda \right) ^{3}

Example

Find eigenvalues with algebraic multiplicities for A=[703323301]A = \begin{bmatrix} 7 & 0 & 3 \\ -3 & 2 & -3 \\ -3 & 0 & 1 \end{bmatrix}.

7λ03 32λ3 301λ=(1)2+2(2λ)7λ3 31λ=(2λ)[(7λ)(1λ)+9]=(2λ)(λ28λ+7+9)=(2λ)(λ4)2\begin{align*} \begin{vmatrix} 7-\lambda & 0 & 3 \\\ -3 & 2-\lambda & -3 \\\ -3 & 0 & 1-\lambda \end{vmatrix} &= \left( -1 \right) ^{2+2} \left( 2-\lambda \right) \begin{vmatrix} 7-\lambda & 3 \\\ -3 & 1-\lambda \end{vmatrix} \\ &= (2-\lambda) [\left( 7- \lambda \right) \left( 1-\lambda \right) + 9 ] \\ &= (2-\lambda ) \left( \lambda ^{2} - 8\lambda + 7 + 9 \right) \\ &= (2-\lambda ) (\lambda - 4) ^{2} \\ \end{align*}

λ=2,4,4\lambda = 2, 4, 4

almu(4)=2\text{almu}(4) = 2

almu(2)=1\text{almu}(2) = 1

Exercise: Find eigenvalues with algebraic multiplicities for A=[210140533]A = \begin{bmatrix} 2 & 1 & 0 \\ -1 & 4 & 0 \\ 5 & 3 & 3 \end{bmatrix}.

2λ10 14λ0 533λ=(3λ)2λ1 14λ=(3λ)((2λ)(4λ)+1)=(3λ)(λ26λ+8+1)=(3λ)3\begin{align*} \begin{vmatrix} 2-\lambda & 1 & 0 \\\ -1 & 4-\lambda & 0 \\\ 5 & 3 & 3-\lambda \end{vmatrix} &= (3-\lambda ) \begin{vmatrix} 2-\lambda & 1 \\\ -1 & 4-\lambda \end{vmatrix} \\ &= (3-\lambda) ((2-\lambda ) (4-\lambda ) + 1) \\ &= (3-\lambda ) (\lambda ^2 - 6\lambda + 8 + 1) \\ &= (3-\lambda )^3 \end{align*}

λ=3,3,3\lambda = 3, 3, 3

almu(3)=3\text{almu}(3) = 3

Remarks: 1) A degree nn polynomial fA(λ)f_{A}\left( \lambda \right) has at most nn roots (counted with multiplicities).

  • An n×nn\times n matrix AA has no more than nn eigenvalues (counting algebraic multiplicities)

Example

Find (real) eigenvalues for matrix A=[8136021100040010]A = \begin{bmatrix} 8 & 1 & 3 & 6 \\ 0 & 2 & 1 & -1 \\ 0 & 0 & 0 & -4 \\ 0 & 0 & 1 & 0 \end{bmatrix}.

Note: rref(A)=I4\text{rref}\left( A \right) = I_{4}

8λ136 02λ11 00λ4 001λ=(8λ)2λ11 0λ4 01λ=(8λ)(2λ)λ4 1λ=(8λ)(2λ)(λ2+4)\begin{align*} \begin{vmatrix} 8-\lambda & 1 & 3 & 6 \\\ 0 & 2-\lambda & 1 & -1 \\\ 0 & 0 & -\lambda & -4 \\\ 0 & 0 & 1 & -\lambda \end{vmatrix} &= (8-\lambda ) \begin{vmatrix} 2-\lambda & 1 & -1 \\\ 0 & -\lambda & -4 \\\ 0 & 1 & -\lambda \end{vmatrix} \\ &= (8-\lambda ) (2-\lambda ) \begin{vmatrix} -\lambda & -4 \\\ 1 & -\lambda \end{vmatrix} \\ &= (8-\lambda ) (2-\lambda ) (\lambda ^2 + 4) \end{align*}

λ=8,2\lambda = 8, 2

almu(8)=1\text{almu}(8) = 1

almu(2)=1\text{almu}(2) = 1

2) If nn is odd and AA is an n×nn\times n matrix then AA has at least one eigenvalue.

Reason: Any odd degree polynomial has at least one root.

Example

Consider the matrix A=[1k11]A = \begin{bmatrix} 1 & k \\ 1 & 1 \end{bmatrix}.

1) For what value(s) of kk does AA have two distinct eigenvalues?

2) For what value(s) of kk does AA have no real eigenvalues?

Solution

Recall:

ax2+bx+c=0ax^{2} + bx + c =0

  • Roots: x=b±b24ac2ax = \frac{-b \pm \sqrt{b^{2} - 4ac} }{2a}

fA(λ)=λ2tr(A)λ+det(A)f_{A}( \lambda ) = \lambda ^{2} - \text{tr}\left( A \right) \lambda + \text{det}\left( A \right)

=λ22λ+(1k)= \lambda ^{2} - 2 \lambda + (1-k)

b24ac=44(1k){>02 distinct eigenvalues<0no eigenvaluesb^2 - 4ac = 4-4(1-k) \begin{cases} >0 & \text{2 distinct eigenvalues} \\ <0 & \text{no eigenvalues} \end{cases}

44(1k)=4k4-4(1-k) = 4k

No eigenvalues: k<0

2 Distinct eigenvalues: k>0
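The case analysis on the discriminant 4k4k can be confirmed with a few sample values of kk (my own choices):

```python
# For A = [[1, k], [1, 1]]: f_A(lambda) = lambda^2 - 2*lambda + (1 - k),
# so the discriminant is 4 - 4*(1 - k) = 4k.
def discriminant(k):
    tr, det = 2, 1 - k
    return tr * tr - 4 * det

assert discriminant(1) > 0     # k > 0: two distinct real eigenvalues
assert discriminant(0) == 0    # k = 0: one repeated eigenvalue
assert discriminant(-1) < 0    # k < 0: no real eigenvalues
```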

Exercise: For what value(s) of kk does the matrix A=[1k2437002]A = \begin{bmatrix} -1 & k & 2 \\ 4 & 3 & 7 \\ 0 & 0 & 2 \end{bmatrix} have λ=5\lambda = 5 as an eigenvalue?

Restated: For what kk is det(A5I)=0\text{det}\left( A - 5I \right) = 0

0=A5I=6k2427003=(1)3+3(3)6k42=3(124k)0 = \mid A - 5I \mid = \begin{vmatrix} -6 & k & 2 \\ 4 & -2 & 7 \\ 0 & 0 & -3 \end{vmatrix} = (-1)^{3+3}(-3) \begin{vmatrix} -6 & k \\ 4 & -2 \end{vmatrix} = -3 (12-4k)

4k=124k = 12

k=3k=3

Quiz Preparation

1) (a) Find the least-squares solutions to Ax=bA \vec{x} = \vec{b} where A=[120012]A = \begin{bmatrix} 1 & 2 \\ 0 & 0 \\ 1 & 2 \end{bmatrix} and b=[313]\vec{b} = \begin{bmatrix} 3 \\ 1 \\ 3 \end{bmatrix}.

Solution

ATA=[101202][120012]=[2448]A^{T}A = \begin{bmatrix} 1 & 0 & 1 \\ 2 & 0 & 2 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 0 & 0 \\ 1 & 2 \end{bmatrix} = \begin{bmatrix} 2 & 4 \\ 4 & 8 \end{bmatrix}

Normal Equation: [2448][x1x2]=[612]\begin{bmatrix} 2 & 4 \\ 4 & 8 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 6 \\ 12 \end{bmatrix}

ATb=[101202][313]=[612]A^{T}\vec{b} = \begin{bmatrix} 1 & 0 & 1 \\ 2 & 0 & 2 \end{bmatrix} \begin{bmatrix} 3 \\ 1 \\ 3 \end{bmatrix} = \begin{bmatrix} 6 \\ 12 \end{bmatrix}

[2464812][1234812]\begin{bmatrix} 2 & 4 & | & 6 \\ 4 & 8 & | & 12 \end{bmatrix} \to \begin{bmatrix} 1 & 2 & | & 3 \\ 4 & 8 & | & 12 \end{bmatrix} [123000]\to \begin{bmatrix} 1 & 2 & | & 3 \\ 0 & 0 & | & 0 \end{bmatrix}

x2=tx_2 = t free

x1=32tx_1 = 3-2t

x=[32tt]\vec{x}^{\star} = \begin{bmatrix} 3-2t \\ t \end{bmatrix}

(b) Compute the error bAx \mid \mid \vec{b} - A \vec{x}^{\star} \mid \mid . Show your work.

Solution

[313][120012][32tt]=[313][32t+2t032t+2t]\mid \mid \begin{bmatrix} 3 \\ 1 \\ 3 \end{bmatrix} - \begin{bmatrix} 1 & 2 \\ 0 & 0 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} 3-2t \\ t \end{bmatrix} \mid \mid = \mid \mid \begin{bmatrix} 3 \\ 1 \\ 3 \end{bmatrix} - \begin{bmatrix} 3-2t+2t \\ 0 \\ 3 - 2t + 2t \end{bmatrix} \mid \mid =[010]=0+12+0=1= \mid \mid \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \mid \mid = \sqrt{0 + 1^2 + 0} = 1
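The normal-equation setup and the error computation can be verified in a few lines; a sketch with integer arithmetic (helper names are my own, and `t = 0` is one choice from the solution family):

```python
A = [[1, 2], [0, 0], [1, 2]]
b = [3, 1, 3]

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

AtA = matmul(transpose(A), A)                                    # [[2, 4], [4, 8]]
Atb = [sum(r[i] * b[i] for i in range(3)) for r in transpose(A)]  # [6, 12]

# One least-squares solution from the family x* = (3 - 2t, t): take t = 0
x = [3, 0]
Ax = [sum(A[i][j] * x[j] for j in range(2)) for i in range(3)]
residual = [b[i] - Ax[i] for i in range(3)]   # [0, 1, 0], so the error is 1
```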

2) Suppose AA and BB are 3×33\times 3 matrices with det(A)=2\text{det}\left( A \right) = 2 and det(B)=3\text{det}\left( B \right) = 3. Calculate det(2A2BTA1)\text{det}\left( -2A^{2}B^{T}A^{-1} \right). Show your work.

Solution

(2)3(det(A))2det(B)1det(A)=84312=48(-2)^3 (\text{det}(A))^2 \text{det}(B) \cdot \frac{1}{\text{det}(A)} = -8 \cdot 4 \cdot 3 \cdot \frac{1}{2} = -48

3) Let A=[12340121246100365]A = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 0 & 1 & 2 & 1 \\ 2 & 4 & 6 & 10 \\ 0 & 3 & 6 & 5 \end{bmatrix}.

(a) Compute the determinant of AA. Show your work.

Solution

det(A)=1234012100020365=121002365=(1)2+321236=2(66)=0\begin{align*} \text{det}(A) &= \begin{vmatrix} 1 & 2 & 3 &4 \\ 0 & 1 & 2 & 1 \\ 0 & 0 & 0 & 2 \\ 0 & 3 & 6 & 5 \end{vmatrix} \\ &= \begin{vmatrix} 1 & 2 & 1 \\ 0 & 0 & 2 \\ 3 & 6 & 5 \end{vmatrix} \\ &= (-1)^{2+3} \cdot 2 \begin{vmatrix} 1 & 2 \\ 3 & 6 \end{vmatrix} \\ &= -2(6-6)\\ &= 0 \end{align*}

(b) For the above matrix AA, Select all that apply.

A: AA is invertible.

B: AA is not invertible.

C: AA is an orthogonal matrix.

D: det(A)=det(A)\text{det}\left( -A \right) = - \text{det}\left( A \right).

E: det(A1ATA)=det(A)\text{det}\left( A^{-1}A^{T}A \right) = \text{det}\left( A \right)

Solution

Because det(A)=0\text{det}\left( A \right) = 0, the matrix is not invertible.

Also recall that for an n×nn\times n orthogonal matrix, the following properties hold:

  1. Columns are orthonormal (unit and perpendicular)
  2. ATA=InA^{T}A = I_{n}
  3. Will be invertible
  4. det(A)=±1\text{det}\left( A \right) = \pm 1

Therefore, B and D are correct.

4) Justify your answers

(a) Suppose T:R2R2T : \mathbb{R}^{2} \to \mathbb{R}^{2} gives rotation through an angle of π3\frac{\pi}{3} in the counterclockwise direction. Let BB be the matrix of the transformation TT. Is BB diagonalizable?

Solution

No; BB has no eigenvectors, since for v0\vec{v} \neq \vec{0}, BvB\vec{v} is never a multiple of v\vec{v}.

(b) Let A=[113131311]A = \begin{bmatrix} 1 & 1 & 3 \\ 1 & 3 & 1 \\ 3 & 1 & 1 \end{bmatrix}. Is v=[121]\vec{v} = \begin{bmatrix} 1 \\ -2 \\ 1 \end{bmatrix} an eigenvector of AA? If so, what is the corresponding eigenvalue?

Solution

[113131311][121]=[242]=2[121]\begin{bmatrix} 1 & 1 & 3 \\ 1 & 3 & 1 \\ 3 & 1 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ -2 \\ 1 \end{bmatrix} = \begin{bmatrix} 2 \\ -4 \\ 2 \end{bmatrix} = 2 \begin{bmatrix} 1 \\ -2 \\ 1 \end{bmatrix}

Yes as AvA\vec{v} is a multiple of v\vec{v}. We see λ=2 \lambda = 2.

Example

A=[cos(θ)sin(θ)sin(θ)cos(θ)]A = \begin{bmatrix} \cos \left( \theta \right) & - \sin \left( \theta \right) \\ \sin \left( \theta \right) & \cos \left( \theta \right) \end{bmatrix}

Rotation counterclockwise by θ\theta.

tr(A)=2cos(θ)\text{tr}\left( A \right) = 2 \cos \left( \theta \right)

det(A)=cos2(θ)+sin2θ=1\text{det}\left( A \right) = \cos ^{2} (\theta) + \sin ^{2} \theta = 1

fA(λ)=λ2tr(A)λ+det(A)f_{A} \left( \lambda \right) = \lambda ^{2} - \text{tr}\left( A \right) \lambda + \text{det}\left( A \right)

=λ22cos(θ)λ+1= \lambda ^{2} - 2 \cos \left( \theta \right) \lambda + 1

b24acb^{2} - 4ac

4cos2(θ)404 \cos ^{2}\left( \theta \right) - 4 \ge 0

Only when cos2θ=1    cosθ=±1\cos ^{2} \theta = 1 \implies \cos \theta = \pm 1

[1001],[1001]\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} , \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}

The above matrices are the only rotation matrices with eigenvalues.

3) Suppose AA is an n×nn\times n matrix. Then, fA(λ)=fAT(λ)f_{A}\left( \lambda \right) = f_{A^{T}} \left( \lambda \right) .

Proof

Note: ATλI=(AλI)TA^{T} - \lambda I = \left( A - \lambda I \right) ^{T}

fA(λ)=det(AλI)=det((AλI)T)(Property of determinants)=det(ATλI)(Using note)=fAT(λ)\begin{align*} f_A (\lambda) &= \text{det}(A - \lambda I) & \\ &= \text{det}\left( \left( A - \lambda I \right) ^{T} \right) & \text{(Property of determinants)}\\ &= \text{det} \left( A^T - \lambda I \right) & \text{(Using note)} \\ &= f_{A^T} ( \lambda ) & \end{align*}

AA and ATA^{T} have same eigenvalues with algebraic multiplicities.

Note: AA and ATA^{T} do not necessarily have the same eigenvectors.

  • A=[0010]A = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} has eigenvector v=[01]\vec{v} = \begin{bmatrix} 0 \\ 1 \end{bmatrix} corresponding to eigenvalue λ=0 \lambda = 0. Av=0vA \vec{v} = 0 \vec{v}
  • AT=[0100]A^{T} = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}. ATv=[0100][01]=[10]A^{T}\vec{v} = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix} (Not a multiple of v\vec{v}). [01]\begin{bmatrix} 0 \\ 1 \end{bmatrix} is not an eigenvector for ATA^{T}.
7.3 Finding Eigenvectors

7.3 Finding Eigenvectors

Definition:

Let AA be an n×nn\times n matrix with eigenvalue λ \lambda . The eigenspace associated to λ \lambda is

Eλ=ker(AλI)={vRn:Av=λv}E_\lambda = \text{ker}(A - \lambda I) = \{ \vec{v} \in \mathbb{R}^n : A \vec{v} = \lambda \vec{v} \}

Note: Nonzero vectors in EλE_{ \lambda } are eigenvectors for AA with eigenvalue λ \lambda .

Example

A=[1254]A = \begin{bmatrix} 1 & 2 \\ 5 & 4 \end{bmatrix} has eigenvalues λ=1,6 \lambda = -1, 6. Find a basis for each eigenspace.

1) For λ=1:A+I=[2255] \lambda = -1 : A + I = \begin{bmatrix} 2 & 2 \\ 5 & 5 \end{bmatrix}

rref[1100]\overset{\text{rref}}{\to} \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix}

x2=tx_2 = t (free)

x1=tx_1 = -t

[tt]\begin{bmatrix} -t \\ t \end{bmatrix}

Basis: {[11]}\{ \begin{bmatrix} -1 \\ 1 \end{bmatrix} \}

2) For λ=6:A6I=[5252] \lambda = 6 : A - 6I = \begin{bmatrix} -5 & 2 \\ 5 & -2 \end{bmatrix}

rref[5200]\overset{\text{rref}}{\to} \begin{bmatrix} 5 & -2 \\ 0 & 0 \end{bmatrix}

x2=tx_2 = t

5x1=2t5x_1 = 2t

[25tt]\begin{bmatrix} \frac{2}{5}t \\ t \end{bmatrix}

Basis: {[25]}\{ \begin{bmatrix} 2 \\ 5 \end{bmatrix} \}
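These hand computations can be sanity-checked with numpy (a sketch; `np.linalg.eig` normalizes its eigenvectors, so we verify the hand-found basis vectors directly instead of comparing columns):

```python
import numpy as np

A = np.array([[1, 2], [5, 4]])
vals, vecs = np.linalg.eig(A)
# Eigenvalues should be -1 and 6 (in some order)
assert np.allclose(np.sort(vals), [-1.0, 6.0])
# The basis vectors found above are genuine eigenvectors:
assert np.allclose(A @ np.array([-1, 1]), -1 * np.array([-1, 1]))
assert np.allclose(A @ np.array([2, 5]), 6 * np.array([2, 5]))
```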

Previous class notes: We verified A=[5115]A = \begin{bmatrix} 5 & 1 \\ 1 & 5 \end{bmatrix} is diagonalizable with S=[1111]S = \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix} and B=[6004]B = \begin{bmatrix} 6 & 0 \\ 0 & 4 \end{bmatrix}.

Question: Where did matrix BB come from?

A: Diagonal entries are eigenvalues for AA.

fA(λ)=λ2tr(A)λ+det(A)=λ210λ+24=(λ6)(λ4)f_{A} \left( \lambda \right) = \lambda ^{2} - \text{tr}\left( A \right) \lambda + \text{det} \left( A \right) = \lambda ^{2} - 10 \lambda + 24 = \left( \lambda - 6 \right) \left( \lambda -4 \right) (Eigenvalues λ=6,4 \lambda = 6, 4)

Question: Where did matrix SS come from?

A: In order, columns are eigenvectors corresponding to eigenvalues.

  • For λ=6:A6I=[1111] \lambda = 6 : A - 6I = \begin{bmatrix} -1 & 1 \\ 1 & -1 \end{bmatrix}

rref[1100]\overset{\text{rref}}{\to} \begin{bmatrix} 1 & -1 \\ 0 & 0 \end{bmatrix}

x2=tx_2 = t

x1=tx_1 = t

[11]\begin{bmatrix} 1 \\ 1 \end{bmatrix} (1st column of SS)

  • For λ=4:A4I=[1111] \lambda = 4 : A - 4I = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}

rref[1100]\overset{\text{rref}}{\to} \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix}

x2=tx_2 = t

x1=tx_1 = -t

[11]\begin{bmatrix} -1 \\ 1 \end{bmatrix} (2nd column of SS)

Example

The matrix A=[406030604]A = \begin{bmatrix} 4 & 0 & 6 \\ 0 & 3 & 0 \\ 6 & 0 & 4 \end{bmatrix} has characteristic polynomial fA(λ)=(λ3)(λ10)(λ+2)f_{A} \left( \lambda \right) = - \left( \lambda -3 \right) \left( \lambda - 10 \right) \left( \lambda +2 \right). Find a basis for each eigenspace EλE_{ \lambda }. Diagonalize AA, if you can.

λ=3,10,2 \lambda = 3, 10, -2

λ=3 \lambda = 3 :

A3I=[106000601]A - 3I = \begin{bmatrix} 1 & 0 & 6 \\ 0 & 0 & 0 \\ 6 & 0 & 1 \end{bmatrix}

This matrix has rank 2, so its nullity is 1.

Basis: {[010]}\{ \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \}

λ=10 \lambda = 10:

A10I=[606070606]rref[101010000]A - 10 I = \begin{bmatrix} -6 & 0 & 6 \\ 0 & -7 & 0 \\ 6 & 0 & -6 \end{bmatrix} \overset{\text{rref}}{\to} \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}

x3=tx_3 = t

x2=0x_2 = 0

x1=tx_1 = t

Basis: {[101]}\{ \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} \}

λ=2 \lambda = -2 :

A+2I=[606050606]rref[101010000]A + 2I = \begin{bmatrix} 6 & 0 & 6 \\ 0 & 5 & 0 \\ 6 & 0 & 6 \end{bmatrix} \overset{\text{rref}}{\to} \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}

x3=tx_3 = t

x2=0x_2 = 0

x1=tx_1 = -t

Basis: {[101]}\{ \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} \}

{[010],[101],[101]}\{ \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} , \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} , \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} \}

Basis for R3\mathbb{R}^{3} and hence an eigenbasis for AA.

Yes, AA is diagonalizable.

S=[011100011]S = \begin{bmatrix} 0 & 1 & -1 \\ 1 & 0 & 0 \\ 0 & 1 & 1 \end{bmatrix}

B=[3000100002]B = \begin{bmatrix} 3 & 0 & 0 \\ 0 & 10 & 0 \\ 0 & 0 & -2 \end{bmatrix}
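A quick numpy check of this diagonalization (a sketch, not part of the original notes):

```python
import numpy as np

A = np.array([[4., 0., 6.], [0., 3., 0.], [6., 0., 4.]])
# Columns of S are the eigenbasis found above; B holds the eigenvalues in order
S = np.array([[0., 1., -1.], [1., 0., 0.], [0., 1., 1.]])
B = np.diag([3., 10., -2.])
assert np.allclose(A, S @ B @ np.linalg.inv(S))
```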

Theorem:

  1. Suppose v1\vec{v}_1, v2\vec{v}_2, …, vp\vec{v}_p are eigenvectors of an n×nn\times n matrix AA corresponding to distinct eigenvalues. Then, {v1,v2,,vp}\{ \vec{v}_1, \vec{v}_2, \cdots , \vec{v}_p \} is a linearly independent set.
  2. If an n×nn\times n matrix AA has nn distinct eigenvalues, then AA is diagonalizable.

Summary of Diagonalization

We diagonalize an n×nn\times n matrix AA by finding an invertible matrix SS and a diagonal matrix BB such that

A=SBS1A = SBS^{-1}

Note: Matrix AA is said to be similar to matrix BB

  • AA is diagonalizable if and only if AA has nn linearly independent eigenvectors {v1,v2,,vn}\{\vec{v}_1 , \vec{v}_2 , \cdots , \vec{v}_n \}.
    • {v1,v2,,vn}\{ \vec{v}_1 , \vec{v}_2 , \cdots , \vec{v}_n \} is called an eigenbasis for AA
    • Matrix SS has eigenvectors as columns. S=[v1v2vn]S = \begin{bmatrix} | & | & & | \\ \vec{v}_1 & \vec{v}_2 & \cdots & \vec{v}_n \\ | & | & & | \end{bmatrix}. B=[λ1000λ20000λn]B = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & & \ddots & 0 \\ 0 & \cdots & 0 & \lambda _n \end{bmatrix}
  • We saw 2×22\times 2 rotation matrices (other than ±I\pm I) are not diagonalizable, as they have no real eigenvalues and hence no eigenvectors.

Many other matrices are not diagonalizable. Reason: AA may not have enough linearly independent eigenvectors.
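The recipe above can be sketched in numpy (an illustrative helper, not from the notes; the rank test with a tolerance is a numerical heuristic for "nn linearly independent eigenvectors"):

```python
import numpy as np

def diagonalize(A, tol=1e-10):
    """Return (S, B) with A = S B S^{-1}, or None if no eigenbasis exists."""
    vals, vecs = np.linalg.eig(A)
    # A is diagonalizable iff its eigenvectors span R^n, i.e. S is invertible
    if np.linalg.matrix_rank(vecs, tol=tol) < A.shape[0]:
        return None
    return vecs, np.diag(vals)

S, B = diagonalize(np.array([[5., 1.], [1., 5.]]))
assert np.allclose(S @ B @ np.linalg.inv(S), [[5, 1], [1, 5]])
```

Applied to a matrix without enough independent eigenvectors (e.g. the 2×2 block [[2, 1], [0, 2]]), the helper returns `None`.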

Example

Find a basis for each eigenspace of A=[703323301]A = \begin{bmatrix} 7 & 0 & 3 \\ -3 & 2 & -3 \\ -3 & 0 & 1 \end{bmatrix}. Diagonalize AA if you can.

We found λ=2,4,4 \lambda = 2, 4, 4

λ=2 \lambda = 2:

A2IA - 2I

[503303301]\begin{bmatrix} 5 & 0 & 3 \\ -3 & 0 & -3 \\ -3 & 0 & -1 \end{bmatrix}

Rank is 2

dim(E2)=32=1\text{dim}\left( E_2 \right) = 3-2 = 1

Basis to E2:{[010]}E_2: \{ \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \}

λ=4 \lambda = 4: A4IA - 4I

[303323303]rref[101010000]\begin{bmatrix} 3 & 0 3 \\ -3 & -2 & -3 \\ -3 & 0 & -3 \end{bmatrix} \overset{\text{rref}}{\to} \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}

Rank is 2

dim(E4)=1\text{dim}\left( E_4 \right) = 1

x3=tx_3 = t

x2=0x_2 = 0

x1=tx_1 = -t

Basis for E4:{[101]}E_4: \{ \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} \}

AA is not diagonalizable. We only have 1 linearly independent eigenvector for λ=4 \lambda =4.

B=[200040004]B = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 4 \end{bmatrix}

S=[01?10?01?]S = \begin{bmatrix} 0 & -1 & ? \\ 1 & 0 & ? \\ 0 & 1 & ? \end{bmatrix}

No invertible SS that works.

Definition:

For an n×nn\times n matrix AA with eigenvalue λ \lambda , the geometric multiplicity of λ \lambda is the dimension of EλE _{ \lambda }:

gemu(λ)=dim(Eλ)=dim(ker(AλI))=nrank(AλI)\begin{align*} \text{gemu}( \lambda ) = \text{dim}(E_{ \lambda }) &= \text{dim}(\text{ker}(A - \lambda I)) \\ &= n - \text{rank}(A - \lambda I) \end{align*}

Last example: almu(2)=1=gemu(2)\text{almu}(2) = 1 = \text{gemu}(2)

almu(4)=2\text{almu}(4) = 2

gemu(4)=1\text{gemu}(4) = 1

Theorem:

An n×nn\times n matrix AA is diagonalizable if and only if the geometric multiplicities of eigenvalues add to nn.

Exercise: Show A=[210140533]A = \begin{bmatrix} 2 & 1 & 0 \\ -1 & 4 & 0 \\ 5 & 3 & 3 \end{bmatrix} with λ=3,3,3 \lambda = 3, 3, 3 is not diagonalizable.

A3I=[110110530]A - 3I = \begin{bmatrix} -1 & 1 & 0 \\ -1 & 1 & 0 \\ 5 & 3 & 0 \end{bmatrix}

rank(A3I)=2\text{rank}\left( A - 3I \right) = 2

gemu(3)=32=1<3\text{gemu}(3) = 3-2 = 1 < 3

We only have 1 linearly independent eigenvector.
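A numerical version of this computation (a sketch using numpy, not from the notes):

```python
import numpy as np

A = np.array([[2., 1., 0.], [-1., 4., 0.], [5., 3., 3.]])
lam = 3
# gemu(lambda) = n - rank(A - lambda*I)
gemu = A.shape[0] - np.linalg.matrix_rank(A - lam * np.eye(3))
assert gemu == 1    # less than almu(3) = 3, so A is not diagonalizable
```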

Example

The matrix A=[430210111]A = \begin{bmatrix} 4 & -3 & 0 \\ 2 & -1 & 0 \\ 1 & -1 & 1 \end{bmatrix} has characteristic polynomial fA(λ)=(1λ)2(2λ)f_{A} ( \lambda ) = (1 - \lambda )^2 (2- \lambda ). Diagonalize AA if you can.

λ=1 \lambda = 1:

AIA - I

[330220110]rref[110000000]\begin{bmatrix} 3 & -3 & 0 \\ 2 & -2 & 0 \\ 1 & -1 & 0 \end{bmatrix} \overset{\text{rref}}{\to} \begin{bmatrix} 1 & -1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}

Rank is 1

dim(E1)=2=almu(1)\text{dim}\left( E_1 \right) = 2 = \text{almu}(1)

x1=tx_1 = t

x2=tx_2 = t

x3=rx_3 = r

Basis for E1={[110],[001]}E_1 = \{ \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} , \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \}

λ=2 \lambda = 2 :

A2IA - 2I

[230230111][111230000]\begin{bmatrix} 2 & -3 & 0 \\ 2 & -3 & 0 \\ 1 & -1 & -1 \end{bmatrix} \to \begin{bmatrix} 1 & -1 & -1 \\ 2 & -3 & 0 \\ 0 & 0 & 0 \end{bmatrix} [111012000][103012000]\to \begin{bmatrix} 1 & -1 & -1 \\ 0 & -1 & 2 \\ 0 & 0 & 0 \end{bmatrix} \to \begin{bmatrix} 1 & 0 & -3 \\ 0 & 1 & -2 \\ 0 & 0 & 0 \end{bmatrix}

x3=tx_3 = t

x2=2tx_2 = 2t

x1=3tx_1 = 3t

Basis for E2={[321]}E_2 = \{ \begin{bmatrix} 3 \\ 2 \\ 1 \end{bmatrix} \}

Yes! It’s diagonalizable

B=[100010002]B = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}

S=[103102011]S = \begin{bmatrix} 1 & 0 & 3 \\ 1 & 0 & 2 \\ 0 & 1 & 1 \end{bmatrix}

Comment: if λ \lambda is an eigenvalue for AA then

1gemu(λ)almu(λ)1 \le \text{gemu}( \lambda ) \le \text{almu}( \lambda )

For any n1 n\ge 1, there exists a non-diagonalizable n×nn\times n matrix.

Proof for n=5n =5

Let A=[2100002100002100002100002]A = \begin{bmatrix} 2 & 1 & 0 & 0 & 0 \\ 0 & 2 & 1 & 0 & 0 \\ 0 & 0 & 2 & 1 & 0 \\ 0 & 0 & 0 & 2 & 1 \\ 0 & 0 & 0 & 0 & 2 \end{bmatrix}

Note: λ=2 \lambda = 2 is the only eigenvalue, with almu(2)=5\text{almu}(2) = 5, since

det(AλI)=(2λ)5\text{det}\left( A - \lambda I \right) = \left( 2- \lambda \right) ^{5}

A2I=[0100000100000100000100000]A - 2I = \begin{bmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}

has rank 4, so dim(E2)=1<5\text{dim}(E_2) = 1<5 and AA is not diagonalizable.
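Checking the rank claim numerically (sketch):

```python
import numpy as np

# 5x5 single Jordan-type block with eigenvalue 2: ones on the superdiagonal
A = 2 * np.eye(5) + np.eye(5, k=1)
# rank(A - 2I) = 4, so dim(E_2) = 5 - 4 = 1 < 5: not diagonalizable
assert np.linalg.matrix_rank(A - 2 * np.eye(5)) == 4
```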

8.1 Symmetric Matrices

8.1 Symmetric Matrices

Two “fantastic” things:

  • Orthonormal bases (The easiest bases to work with)
  • Diagonal matrices (The easiest matrices to work with)

Question: Which n×nn\times n matrices have an orthonormal eigenbasis?

{v1,,vn}\{ \vec{v}_1 , \cdots , \vec{v}_n \} eigenvectors for AA and are orthonormal.

Equivalently, for which n×nn\times n matrices AA can we find

  • An orthogonal matrix SS and
  • diagonal matrix BB with A=SBS1A = SBS^{-1}

Recall: An n×nn\times n matrix SS is orthogonal if and only if S1=STS^{-1} = S^{T}

  • AA has an orthonormal eigenbasis if and only if A=SBSTA = SBS^{T} where SS is an orthogonal matrix and BB is a diagonal matrix.

Definition: Matrix AA is said to be orthogonally diagonalizable.

Answer: (Spectral Theorem) An n×nn\times n matrix AA is orthogonally diagonalizable if and only if AA is symmetric.

Check: If A=SBSTA = SBS^{T} then AT=(SBST)T=(ST)TBTST=SBST=AA^{T} = \left( SBS^{T} \right) ^{T} = \left( S^{T} \right) ^{T} B^{T}S^{T} = SBS^{T} = A, using that BT=BB^{T} = B since BB is diagonal.

Properties of Symmetric Matrices:

All of this is part of the Spectral Theorem.

  1. A symmetric n×nn\times n matrix has nn (real) eigenvalues counted with algebraic multiplicities. Any eigenvalue for AA satisfies almu(λ)=gemu(λ)\text{almu}\left( \lambda \right) = \text{gemu} \left( \lambda \right)
  2. Any 2 eigenvectors corresponding to different eigenvalues of a symmetric matrix are perpendicular. (This is not true if AA is not symmetric)
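In numpy, `np.linalg.eigh` (for symmetric/Hermitian matrices) returns an orthonormal eigenbasis directly, consistent with the Spectral Theorem (a sketch with a randomly generated symmetric matrix, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
A = (M + M.T) / 2                       # symmetrize to get a symmetric A
vals, S = np.linalg.eigh(A)             # columns of S: orthonormal eigenvectors
assert np.allclose(S.T @ S, np.eye(4))          # S is orthogonal
assert np.allclose(A, S @ np.diag(vals) @ S.T)  # A = S B S^T
```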

Example

Let A=[2332]A = \begin{bmatrix} 2 & 3 \\ 3 & 2 \end{bmatrix}. Orthogonally diagonalize AA.

fA(λ)=λ2tr(A)λ+det(A)f_{A}\left( \lambda \right) = \lambda ^{2} - \text{tr}\left( A \right) \lambda + \text{det}\left( A \right)

fA(λ)=λ24λ5=(λ5)(λ+1)f_{A}\left( \lambda \right) = \lambda ^{2} - 4 \lambda -5 = \left( \lambda -5 \right) \left( \lambda +1 \right). λ=5,1 \lambda = 5, -1

λ=5 \lambda =5:

A5IA - 5I

[3333]rref[1100]\begin{bmatrix} -3 & 3 \\ 3 & -3 \end{bmatrix} \overset{\text{rref}}{\to} \begin{bmatrix} 1 & -1 \\ 0 & 0 \end{bmatrix}

x2=tx_2 = t

x1=tx_1 = t

Basis for E5:{[11]}E_5 : \{ \begin{bmatrix} 1 \\ 1 \end{bmatrix} \}

λ=1 \lambda = -1

A+IA + I

[3333]rref[1100]\begin{bmatrix} 3 & 3 \\ 3 & 3 \end{bmatrix} \overset{\text{rref}}{\to} \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix}

x2=tx_2 = t

x1=tx_1 = -t

Basis for E1:{[11]}E_{-1} : \{ \begin{bmatrix} -1 \\ 1 \end{bmatrix} \}

B=[5001]B = \begin{bmatrix} 5 & 0 \\ 0 & -1 \end{bmatrix} S=[1212 1212]S = \begin{bmatrix} \frac{1}{\sqrt{2} } & -\frac{1}{\sqrt{2} } \\\ \frac{1}{\sqrt{2} } & \frac{1}{\sqrt{2} } \end{bmatrix}

SS is orthogonal
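Verifying this orthogonal diagonalization numerically (a sketch, not from the notes):

```python
import numpy as np

A = np.array([[2., 3.], [3., 2.]])
s = 1 / np.sqrt(2)
S = np.array([[s, -s], [s, s]])   # normalized eigenvectors as columns
B = np.diag([5., -1.])
assert np.allclose(S.T @ S, np.eye(2))  # S is orthogonal
assert np.allclose(A, S @ B @ S.T)
```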

In the next example, we will use that if AA is an orthogonal matrix, then the only possible eigenvalues are λ=1 \lambda = 1 and λ=1 \lambda = -1

Reason:

Orthogonal matrix AA : Av=v \mid \mid A \vec{v} \mid \mid = \mid \mid \vec{v} \mid \mid for all v\vec{v} in Rn\mathbb{R}^{n}. If λ \lambda an eigenvalue v0\vec{v} \neq \vec{0} Av=λvA \vec{v} = \lambda \vec{v}.

λv=λv=vλ=1 \mid \lambda \mid \mid \mid \vec{v} \mid \mid = \mid \mid \lambda \vec{v} \mid \mid = \mid \mid \vec{v} \mid \mid \to \mid \lambda \mid = 1

λ=1,1 \lambda = 1, -1

Example

Let A=[0001001001001000]A = \begin{bmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{bmatrix}. Find an orthogonal matrix SS and a diagonal matrix BB with A=SBSTA = SBS^{T}. Hint: AA is orthogonal what can eigenvalues be? Only possibilities are λ=1,1 \lambda = 1, -1.

λ=1 \lambda = 1:

AIA - I

[1001011001101001]rref[1001011000000000]\begin{bmatrix} -1 & 0 & 0 & 1 \\ 0 & -1 & 1 & 0 \\ 0 & 1 & -1 & 0 \\ 1 & 0 & 0 & -1 \end{bmatrix} \overset{\text{rref}}{\to} \begin{bmatrix} 1 & 0 & 0 & -1 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}

x4=tx_4 = t

x3=rx_3 = r

x1=tx_1 = t

x2=rx_2 = r

[trrt]=t[1001]+r[0110]\begin{bmatrix} t \\ r \\ r \\ t \end{bmatrix} = t \begin{bmatrix} 1 \\ 0 \\ 0 \\ 1 \end{bmatrix} + r \begin{bmatrix} 0 \\ 1 \\ 1 \\ 0 \end{bmatrix}

Basis for E1={[1001],[0110]}E_{1} = \{ \begin{bmatrix} 1 \\ 0 \\ 0 \\ 1 \end{bmatrix} , \begin{bmatrix} 0 \\ 1 \\ 1 \\ 0 \end{bmatrix} \}

λ=1 \lambda = -1

A+IA + I

[1001011001101001]rref[1001011000000000]\begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 1 & 0 \\ 0 & 1 & 1 & 0 \\ 1 & 0 & 0 & 1 \end{bmatrix} \overset{\text{rref}}{\to} \begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}

x4=tx_4 = t

x3=rx_3 = r

x1=tx_1 = -t

x2=rx_2 = -r

[trrt]=t[1001]+r[0110]\begin{bmatrix} -t \\ -r \\ r \\ t \end{bmatrix} = t \begin{bmatrix} -1 \\ 0 \\ 0 \\ 1 \end{bmatrix} + r \begin{bmatrix} 0 \\ -1 \\ 1 \\ 0 \end{bmatrix}

Basis for E1={[1001],[0110]}E_{-1} = \{ \begin{bmatrix} -1 \\ 0 \\ 0 \\ 1 \end{bmatrix} , \begin{bmatrix} 0 \\ -1 \\ 1 \\ 0 \end{bmatrix} \}

vi=12+0+0+12=2 \mid \mid \vec{v}_i \mid \mid = \sqrt{1^{2} + 0 + 0 + 1^{2}} = \sqrt{2}

B=[1000010000100001]B = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{bmatrix} S=[120120012012012012120120]S = \begin{bmatrix} \frac{1}{\sqrt{2} } & 0 & -\frac{1}{\sqrt{2} } & 0 \\ 0 & \frac{1}{\sqrt{2} } & 0 & -\frac{1}{\sqrt{2} } \\ 0 & \frac{1}{\sqrt{2} } & 0 & \frac{1}{\sqrt{2} } \\ \frac{1}{\sqrt{2} } & 0 & \frac{1}{\sqrt{2} } & 0\end{bmatrix}
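A numerical check of this example (sketch; the matrix AA reverses coordinates, so it can be built by reversing the rows of the identity):

```python
import numpy as np

A = np.eye(4)[::-1]          # anti-diagonal permutation matrix from the example
s = 1 / np.sqrt(2)
S = np.array([[s, 0, -s, 0],
              [0, s, 0, -s],
              [0, s, 0,  s],
              [s, 0,  s, 0]])
B = np.diag([1., 1., -1., -1.])
assert np.allclose(S.T @ S, np.eye(4))  # S is orthogonal
assert np.allclose(A, S @ B @ S.T)
```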

Example

The matrix A=[222200200]A = \begin{bmatrix} 2 & 2 & 2 \\ 2 & 0 & 0 \\ 2 & 0 & 0 \end{bmatrix} has characteristic polynomial fA(λ)=λ(λ4)(λ+2)f_{A} \left( \lambda \right) = - \lambda \left( \lambda - 4 \right) \left( \lambda +2 \right). Orthogonally diagonalize AA.

λ=0 \lambda = 0

A0IA - 0I

[222200200]rref[100011000]\begin{bmatrix} 2 & 2 & 2 \\ 2 & 0 & 0 \\ 2 & 0 & 0 \end{bmatrix} \overset{\text{rref}}{\to} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{bmatrix}

x3=tx_3 = t

x2=tx_2 = -t

x1=0x_1 = 0

Basis for E0:{[011]}E_{0} : \{ \begin{bmatrix} 0 \\ -1 \\ 1 \end{bmatrix} \}

λ=4 \lambda = 4

A4IA - 4I

[222240204][102120111]\begin{bmatrix} -2 & 2 & 2 \\ 2 & -4 & 0 \\ 2 & 0 & -4 \end{bmatrix} \to \begin{bmatrix} 1 & 0 & -2 \\ 1 & -2 & 0 \\ -1 & 1 & 1 \end{bmatrix} [102022011][102011000]\to \begin{bmatrix} 1 & 0 & -2 \\ 0 & -2 & 2 \\ 0 & 1 & -1 \end{bmatrix} \to \begin{bmatrix} 1 & 0 & -2 \\ 0 & 1 & -1 \\ 0 & 0 & 0 \end{bmatrix}

x3=tx_3 = t

x2=tx_2 = t

x1=2tx_1 = 2t

Basis for E4:{[211]}E_{4} : \{ \begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix} \}

λ=2: \lambda = -2:

A+2IA + 2I

[422220202][101110211]\begin{bmatrix} 4 & 2 & 2 \\ 2 & 2 & 0 \\2 & 0 & 2 \end{bmatrix} \to \begin{bmatrix} 1 & 0 & 1 \\ 1 & 1 & 0 \\ 2 & 1 & 1 \end{bmatrix} [101011011][101011000]\to \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & -1 \\ 0 & 1 & -1 \end{bmatrix} \to \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & -1 \\ 0 & 0 & 0 \end{bmatrix}

x3=tx_3 = t

x2=tx_2 = t

x1=tx_1 = -t

Basis for E2:{[111]}E_{-2} : \{ \begin{bmatrix} -1 \\ 1 \\ 1 \end{bmatrix} \}

{[011],[211],[111]}\{ \begin{bmatrix} 0 \\ -1 \\ 1 \end{bmatrix} , \begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix} , \begin{bmatrix} -1 \\ 1 \\ 1 \end{bmatrix} \}

Eigenbasis

v1=1+1+0=2 \mid \mid \vec{v}_1 \mid \mid = \sqrt{1+1+0} = \sqrt{2}

v2=4+1+1=6 \mid \mid \vec{v}_2 \mid \mid = \sqrt{4 + 1 + 1} = \sqrt{6}

v3=1+1+1=3 \mid \mid \vec{v}_3 \mid \mid = \sqrt{1+1+1} = \sqrt{3}

S=[02613121613121613]S = \begin{bmatrix} 0 & \frac{2}{\sqrt{6} } & -\frac{1}{\sqrt{3} } \\ -\frac{1}{\sqrt{2} } & \frac{1}{\sqrt{6} } & \frac{1}{\sqrt{3} } \\ \frac{1}{\sqrt{2} } & \frac{1}{\sqrt{6} } & \frac{1}{\sqrt{3} } \end{bmatrix}

B=[000040002]B = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & -2 \end{bmatrix}
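Checking this orthogonal diagonalization with numpy (sketch):

```python
import numpy as np

A = np.array([[2., 2., 2.], [2., 0., 0.], [2., 0., 0.]])
# Normalized eigenvectors from above as columns, in eigenvalue order 0, 4, -2
S = np.column_stack([np.array([0., -1., 1.]) / np.sqrt(2),
                     np.array([2., 1., 1.]) / np.sqrt(6),
                     np.array([-1., 1., 1.]) / np.sqrt(3)])
B = np.diag([0., 4., -2.])
assert np.allclose(S.T @ S, np.eye(3))  # S is orthogonal
assert np.allclose(A, S @ B @ S.T)
```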

Notes:

  • Generally, for a symmetric matrix AA, if you have a repeated eigenvalue λ \lambda , i.e. almu(λ)>1\text{almu}\left( \lambda \right) > 1, one would perform Gram-Schmidt on a basis for EλE_{ \lambda } to obtain an orthonormal basis.
  • Two different concepts: in terms of chapter 7.

Diagonalizable: nn linearly independent eigenvectors

Invertible: 0 is not an eigenvalue.

Exercise Suppose AA is a 3×33\times 3 matrix with eigenbasis {[304],[806],[010]}\{ \begin{bmatrix} 3 \\ 0 \\ 4 \end{bmatrix} , \begin{bmatrix} -8 \\ 0 \\ 6 \end{bmatrix} , \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \}.

  • Is AA diagonalizable? Yes
  • Is AA symmetric? Yes (Can normalize each vector to get orthonormal eigenbasis)
  • Is AA invertible? Not enough information

Diagonalization

Example

Suppose AA has characteristic polynomial fA(λ)=λ2(1λ)(2λ)3f_{A}\left( \lambda \right) = \lambda ^{2} \left( 1- \lambda \right) \left( 2 - \lambda \right) ^{3}. Note: AA is 6×66\times 6

1) What are possible dimensions of the eigenspaces of AA?

E0E_{0}: dim 1 or 2 almu(0)=2\text{almu}(0) = 2

E1E_{1}: dim 1 almu(1)=1\text{almu}(1) = 1

E2E_{2}: dim 1, 2, 3 almu(2)=3\text{almu}(2) = 3

2) When is AA diagonalizable?

When dim(E0)=2\text{dim}(E_{0}) = 2 and dim(E2)=3\text{dim}(E_{2}) = 3.

Example

The matrix A=[202042223]A = \begin{bmatrix} 2 & 0 & 2 \\ 0 & 4 & 2 \\ 2 & 2 & 3 \end{bmatrix} has eigenvectors v1=[122]\vec{v}_1 = \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix}, v2=[221]\vec{v}_2 = \begin{bmatrix} 2 \\ -2 \\ 1 \end{bmatrix}, and v3=[212]\vec{v}_3 = \begin{bmatrix} 2 \\ 1 \\ -2 \end{bmatrix}.

AA is symmetric. We will orthogonally diagonalize AA.

[202042223][122]=[61212]=6[122]\begin{bmatrix} 2 & 0 & 2 \\ 0 & 4 & 2 \\ 2 & 2 & 3 \end{bmatrix} \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix} = \begin{bmatrix} 6 \\ 12 \\ 12 \end{bmatrix} = 6 \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix} [202042223][221]=[663]=3[221]\begin{bmatrix} 2 & 0 & 2 \\ 0 & 4 & 2 \\ 2 & 2 & 3 \end{bmatrix} \begin{bmatrix} 2 \\ -2 \\ 1 \end{bmatrix} = \begin{bmatrix} 6 \\ -6 \\ 3 \end{bmatrix} = 3 \begin{bmatrix} 2 \\ -2 \\ 1 \end{bmatrix} [202042223][212]=[000]=0[212]\begin{bmatrix} 2 & 0 & 2 \\ 0 & 4 & 2 \\ 2 & 2 & 3 \end{bmatrix} \begin{bmatrix} 2 \\ 1 \\ -2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} = 0 \begin{bmatrix} 2 \\ 1 \\ -2 \end{bmatrix} B=[600030000]B = \begin{bmatrix} 6 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 0 \end{bmatrix}

vi=4+4+1=3 \mid \mid \vec{v}_i \mid \mid = \sqrt{4 + 4 + 1} = 3

S=[132323232313231323]S = \begin{bmatrix} \frac{1}{3} & \frac{2}{3} & \frac{2}{3} \\ \frac{2}{3} & -\frac{2}{3} & \frac{1}{3} \\ \frac{2}{3} & \frac{1}{3} & -\frac{2}{3} \end{bmatrix}

Note: SS is orthogonal
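The full check in numpy (a sketch, not from the notes):

```python
import numpy as np

A = np.array([[2., 0., 2.], [0., 4., 2.], [2., 2., 3.]])
# Eigenvectors v1, v2, v3 normalized (each has length 3) as columns of S
S = np.array([[1., 2., 2.], [2., -2., 1.], [2., 1., -2.]]) / 3.0
B = np.diag([6., 3., 0.])
assert np.allclose(S @ S.T, np.eye(3))  # S is orthogonal
assert np.allclose(A, S @ B @ S.T)
```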

Example

Let A=[203133003]A = \begin{bmatrix} 2 & 0 & -3 \\ 1 & 3 & 3 \\ 0 & 0 & 3\end{bmatrix}. Find eigenvalues and a basis for each eigenspace. Diagonalize AA if you can.

AλI=2λ0313λ3003λ=(1)3+3(3λ)2λ013λ=(3λ)2(2λ) \mid A - \lambda I \mid = \begin{vmatrix} 2- \lambda & 0 & -3 \\ 1 & 3- \lambda & 3 \\ 0 & 0 & 3- \lambda \end{vmatrix} = (-1)^{3+3} (3 - \lambda ) \begin{vmatrix} 2- \lambda & 0 \\ 1 & 3- \lambda \end{vmatrix} = \left( 3- \lambda \right) ^{2} \left( 2 - \lambda \right) .

λ=3,3,2 \lambda = 3, 3, 2

λ=3 \lambda = 3

A3IA - 3I

[103103000]rref[103000000]\begin{bmatrix} -1 & 0 & -3 \\ 1 & 0 & 3 \\ 0 & 0 & 0 \end{bmatrix} \overset{\text{rref}}{\to} \begin{bmatrix} 1 & 0 & 3 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}

x3=tx_3 = t

x2=rx_2 = r

x1=3tx_1 = -3t

[3trt]=t[301]+r[010]\begin{bmatrix} -3t \\ r \\ t \end{bmatrix} = t \begin{bmatrix} -3 \\ 0 \\ 1 \end{bmatrix} + r \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}

Basis for E3:{[301],[010]}E_3 : \{ \begin{bmatrix} -3 \\ 0 \\ 1 \end{bmatrix} , \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \}

λ=2 \lambda = 2

A2IA - 2 I

[003113001]rref[110001000]\begin{bmatrix} 0 & 0 & -3 \\ 1 & 1 & 3 \\ 0 & 0 & 1 \end{bmatrix} \overset{\text{rref}}{\to} \begin{bmatrix} 1 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}

x3=0x_3 = 0

x2=tx_2 = t

x1=tx_1 = -t

Basis for E2:{[110]}E_2 : \{ \begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix} \}

S=[301011100]S = \begin{bmatrix} -3 & 0 & -1 \\ 0 & 1 & 1 \\ 1 & 0 & 0 \end{bmatrix} B=[300030002]B = \begin{bmatrix} 3 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 2 \end{bmatrix}

Yes, AA is diagonalizable.
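Verifying with numpy (sketch; here SS need not be orthogonal since AA is not symmetric):

```python
import numpy as np

A = np.array([[2., 0., -3.], [1., 3., 3.], [0., 0., 3.]])
S = np.array([[-3., 0., -1.], [0., 1., 1.], [1., 0., 0.]])
B = np.diag([3., 3., 2.])
assert np.allclose(A, S @ B @ np.linalg.inv(S))
```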

Example

Let A=[42321x005]A = \begin{bmatrix} 4 & 2 & 3 \\ 2 & 1 & x \\ 0 & 0 & 5 \end{bmatrix}

1) Find all eigenvalues for the matrix AA.

det(AλI)=4λ2321λx005λ=(1)3+3(5λ)4λ221λ=(5λ)[(4λ)(1λ)4]=(5λ)[λ25λ+44]=λ(5λ)2\begin{align*} \text{det}(A - \lambda I) = \begin{vmatrix} 4 - \lambda & 2 & 3 \\ 2 & 1 - \lambda & x \\0 & 0 & 5- \lambda \end{vmatrix} &= (-1)^{3+3} (5- \lambda ) \begin{vmatrix} 4 - \lambda & 2 \\ 2 & 1- \lambda \end{vmatrix} \\ &= (5 - \lambda ) [(4- \lambda ) (1 - \lambda ) - 4] \\ &= (5- \lambda ) [ \lambda ^2 - 5 \lambda +4 - 4] \\ &= - \lambda (5 - \lambda )^2 \end{align*}

λ=0,5,5 \lambda = 0, 5, 5

almu(5)=2\text{almu}(5) = 2

2) For which values of xx is the matrix AA diagonalizable?

AA is diagonalizable if and only if gemu(5)=2\text{gemu}(5) = 2

λ=5 \lambda =5

Need A5IA - 5I to have rank 1 / nullity 2.

[12324x000]2R1+R2[123006+x000]\begin{bmatrix} -1 & 2 & 3 \\ 2 & -4 & x \\ 0 & 0 & 0 \end{bmatrix} \overset{2R_1 + R_2}{\to} \begin{bmatrix} -1 & 2 & 3 \\ 0 & 0 & 6+x \\ 0 & 0 & 0 \end{bmatrix}

Need 6+x=0    x=66+x = 0 \implies x =-6

With x=6x = -6, AA is diagonalizable.
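A numerical confirmation (sketch):

```python
import numpy as np

x = -6
A = np.array([[4., 2., 3.], [2., 1., x], [0., 0., 5.]])
# With x = -6, A - 5I has rank 1, so gemu(5) = 3 - 1 = 2 = almu(5)
assert np.linalg.matrix_rank(A - 5 * np.eye(3)) == 1
```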