Notes on Matrix Algebra - Part 2
Math 5718 - Spring 2001
We have done a fairly exhaustive study of systems of linear
equations, with a particular focus on when solutions exist and
whether or not they are unique. There is still much more to be
learned about this problem. In this set of notes, we will explore
the fundamental subspaces associated with any matrix, their
orthogonality relations, and the question of inverses.
1
Definitions
We need some definitions to get started. First, recall that the
transpose of a matrix A, denoted AT, is the matrix whose
columns are the rows of A; that is, the (i,j) element of A
is the (j,i) element of AT. A special case is that of a
vector: if x is an
column vector, then xT is a
row vector. An important result concerns the
transpose of the product of two matrices: recall that
(AB)T =
BTAT and for a matrix-vector product,
(Ax)T=xTAT.
We must also introduce the notion of orthogonality. Let
and
be column
vectors in
.
Two such vectors are said to be
orthogonal if their scalar product (or inner product) is zero:
Note that in
and
,
orthogonal vectors are
geometrically perpendicular. A set of vectors
is
orthogonal if every vector in the set is orthogonal to every other
vector. Orthogonality is a stronger condition than linear
independence; that is, an orthogonal set is linearly independent
(for example, (1,0) and (0,1)), but not conversely (for
example, (1,1) and (0,1)).
Two subspaces, U and V, are orthogonal if every vector in U
is orthogonal to every vector in V; we write
.
For
example, the subspace
is orthogonal to
the subspace
.
Finally, all vectors in
that are orthogonal to a subspace V comprise the
orthogonal complement of V, denoted
.
For example, for
the subspaces
and
,
we have
and
.
The Four Fundamental Subspaces
Associated with any matrix A are four fundamental subspaces that
obey elegant othogonality relationships. The subspaces are the
null space of A, the row space of A, the column space of A,
and the null space of AT. We have already encountered two of
these subspaces; we will now consider them all in detail.
Null space of A. Recall that null A is comprised of
solutions of the homogeneous system Ax=0. Let U be the result
of reducing A to row echelon form. Because elementary row
operations do not change the solution of Ax=0, null A = null
U. As we saw in the last set of notes, U has r pivots that
correspond to the basic variables. The remaining n-r variables
are free variables. The system Ux=0 can be solved for the basic
variables in terms of the free variables to give a basis of null
A. Thus, dim null A = n-r.
Row space of A. The row space of A is the subspace
spanned by the rows of A. Elementary row operations do not
change the row space of A (because the rows of U are linear
combinations of the rows of A). Therefore, the row space of U
and the row space of A are identical. It is easy to find the row
space of U and a basis for it. Recall that U has r nonzero
rows, one for each pivot element (as shown in the row echelon
matrix below). Furthermore these nonzero rows are linearly
independent. This fact follows because each row has a different
number of leading zeros. If the rows are denoted vi, then the
linear combination
vanishes only if,
first, a1=0, then a2=0, and so forth, until all of the
ai's must be zero. Thus, the r nonzero rows of U form a
basis for the row space of A, and we have that the dimension of
the row space of A is r, which is the rank of A.
Column space of A. We have already encountered the column
space of A, also called the range of A; it is the subspace
spanned by the columns of A. The goal is to find a basis for the
column space of A. Unfortunately, we cannot use the row echelon
form U quite as easily as we did for the row space. The reason
is that elementary row operations do change the column space
of a matrix; so the column space of A is not the same as the
column space of U. But there is still information in U.
First, note that r columns of U contain a pivot element (as
indicated in the row echelon matrix above). Furthermore, these r
columns are linearly independent. This fact can be shown as we did
for the nonzero rows of U. The r pivot columns of U have a
different number of trailing zeros. Letting the pivot columns of
U be vi, then the linear combination
vanishes only if, first, ar=0, then ar-1=0, and so forth,
until all of the ai's must be zero. Thus, the r pivot columns
of U form a basis for the column space of U - but not for the
column space of A.
Here is how we find the column space of A from the column space
of U. It is a subtle argument that requires some thought: We
know that the systems Ax=0 and Ux=0 have the same solutions.
As we have already observed, a solution of Ax=0 gives the
weights in a linear combination of the columns of A. Therefore,
the system Ax=0 describes a linear dependence among the columns
of A. Because the same x is a solution of Ux=0, there is a
corresponding linear dependence among the columns of U. It
follows that if a set of columns of U is linearly independent,
then the corresponding set of columns of A is linearly
independent (and vice versa). Therefore, if the r pivot columns
of U are linearly independent (as we have shown), then the same
r columns are A are also linearly independent.
Therefore, we have a method for finding a basis for the column
space of A: we look for the pivot columns of U and take the
same columns of A as the basis for the column space. Notice that
a surprising result appears from these arguments. We have shown
that
dimension of the row space = dimension of the column space = r
(the rank of A).
Said differently, given any matrix, the number of linearly
independent rows equals the number of linearly independent
columns.
Null space of AT. The fourth subspace is not so obvious,
but as you will see, it completes the picture perfectly. Notice
that if A is an
matrix, then null A and the row
space of A are subspaces of
,
while the column space (or
range) of A is a subspace of
.
The last subspace, null
AT, is also a subspace of
.
This space is often called
the left null space of A, as its elements satisfy
ATy=0, or equivalently, yTA=0.
The dimension of null AT can be found by applying the subspace
dimension theorem (Theorem 3.4 of the text) to AT. Recall that
AT is an
matrix; it maps
into
.
Therefore,
But note that
Putting these two observations together, we have that
.
Finding a basis for null AT is not such an urgent problem. If
nothing else, one could reduce AT to row echelon form and solve
ATx=0 in terms of the m-r free variables (exactly as we found
a basis for null A).
Here is a summary of the four fundamental subspaces (sometimes
called the Fundamental Theorem of Linear Algebra):
- row space A: subspace of
,
dimension = r.
- null A: subspace of
,
dimension = n-r.
- column space A: subspace of
,
dimension = r.
- null AT: subspace of
,
dimension = m-r.
Orthogonality Relations for the Subspaces.
Knowing the dimensions of the four subspaces is important, but
equally important is the orthogonality relations between them. We
will prove that
- in
,
the null space of A and the row space of A are orthogonal complements;
that is,
- in
,
the null space of AT and the column space of A are orthogonal
complements; that is,
Notice that because the row space of A is the range of AT,
these two statements are symmetric: the second result is the first
result applied to AT.
Let's prove the first statement. Suppose
and
.
This implies
that v=ATx for some
.
We first show that w and
v are orthogonal. Note that
wTv = wT(ATx) = (wTAT)x = (Aw)Tx=0;
the last step was made because Aw=0. Therefore, every vector in
is orthogonal to every vector in
.
Thus,
is orthogonal to
.
A more visual way to see this relationship is to write the product
Aw=0 as
This product can be interpreted to mean that the scalar product of
w with each row of A is zero. Thus,
is
orthogonal to the rows of A, or
is orthogonal to
the row space of A.
It might seem that we are finished, but this does not establish
that
is the orthogonal complement of
.
There could be vectors orthogonal to
that are not in
,
and vice versa. So there is a
little more work to be done. First, suppose a vector outside the
null space were orthogonal to the rows of A. Looking at the
product Aw above, we see this orthogonality forces Aw=0, which
means
.
So a vector outside
cannot be orthogonal to the row space of A.
What about a vector v outside the row space of A, but
orthogonal to
? This vector is not a linear
combination of the rows of A, so imagine attaching it to A as
the last row of a new matrix A'. Then A' has the same number
of columns as A and it has the same null space as A, but its
row space has a greater dimension than the row space of A. This
change violates the subspace dimension theorem. Therefore, we
conclude that all vectors orthogonal to the row space of A
are in
and all vectors orthogonal to
are in the row space of A. This implies that
and the row space of A are orthogonal complements.
Replacing A by AT, we can also prove that
and the range of A are orthogonal complements. It now follows
that the domain space,
,
and the target space,
,
can
be represented as the following direct sums:
-
.
-
.
This subspace picture gives a useful interpretation of how a
linear map works. Suppose A maps
and
.
Then x can be decomposed (uniquely) into two
parts:
Now let A work on x; we have
We see that A maps its null space to zero and its row space
onto its column space. Because the row space and the column space
of A have the same dimension (r= rank of A), A is an
injective and surjective (hence invertible) map from the row space
to the column space.
Example. Let's exercise all of these ideas on an
matrix. Let
We see immediately that there are r=2 pivots; so the rank of A
is 2, and A has two linearly independent rows and columns. A
basis of the row space is
(the
nonzero rows of U), confirming that the row space has dimension
2. A basis for the null space can be found by solving Ux=0.
Notice that there are r=2 basic variables (x1 and x2) and
n-r=2 free variables (x3 and x4), so we know that null A
has dimension 2. The equations of Ux=0 are
Solving for x1 and x2 in terms of x3 and x4, we have
that
Thus, a basis for null A is
(found by setting the free variables, x3 and x4, to 0 and 1
alternately).
To find a basis for the column space of A, note that the first
and second columns of U are the pivot columns and are linearly
independent. Therefore, the first and second columns of A,
,
are linearly independent and form a
basis for the columns space of A. Finally, we already know that
the dimension of null AT is m-r=1. To find a basis, we could
start with AT and reduce it to row echelon form:
By solving Ux=0, it follows that a basis for null AT is
.
The orthogonality of these bases and subspaces can be checked. For
the row space of A and null A, it can be verified that
For the column space of A and null AT, it can be verified
that
The Inverse Story
We have already encountered the idea of the inverse of a linear
map. Recall that, by definition, if
is
invertible (has an inverse), then
is the
unique map such that
T T-1= T-1T = I. We also proved that a
map is invertible if and only if it is injective and surjective.
All of these ideas carry over to the case of matrices, but there
is more that we can say.
Let's begin with two definitions. Suppose that A is an
matrix. The left inverse of A is an
matrix
B such that BA=In, where In denotes the
identity matrix. The right inverse of A is an
matrix C such that AC=Im. A matrix many have none, one, or
both of these one-sided inverses. We will determine the conditions
under which these various cases arise. We can say one thing at the
outset: If a matrix has both a right and left inverse, then
they must be equal, as
B=B(AC)=(BA)C=C.
We will begin with the case that the range
;
then A
is surjective and
.
This means that
every vector b can be expressed as a linear combination of the
columns of A; hence, Ax=b has at least one solution for all
.
In particular, if we let ei be the usual ith unit
vector of
,
then the system Ax=ei has at least one
solution; call it xi. If we solve this system for
,
then we have solutions xi that we can represent
in matrix form as
In other words, the m vectors xi (of length n) comprise the
columns of a matrix C such that AC=Im. This
matrix C is a right inverse of A (perhaps one of many). We
have shown that if
,
then a right inverse of A exists.
The solution of the system Ax=b is x=Cb, because
Ax=A(Cb) =
(AC)b=b.
Now suppose that the columns of A are linearly independent; then
,
and A is injective. This means that
when Ax=b has a solution, it is unique. Here is a way to
demonstrate that a left inverse exists. Because
,
we can
add m-n more linearly independent columns to A to form a new
matrix
.
This matrix has m=n linearly independent rows
and columns and is invertible; thus,
.
We
can now discard the last m-n columns of
(to recover
A) and the last m-n rows of
to give us a matrix
B. You can check that the resulting product is simply BA=In.
The matrix B, obtained by discarding the last m-n rows of
,
is the left inverse of A. When a solution of
Ax=b exists, it is given by x=Bb, because
x =
(BA)x=B(Ax)=Bb.
Good Thought Question: Explain why a left inverse cannot
exist in the case that r=m <n and why a right inverse cannot
exist in the case that r=n<m.
In the special case that
,
we have a
square matrix whose rows and columns are linearly independent.
Now, the conclusions of both previous cases apply: A is
invertible and said to be non-singular; its left inverse
equals its right inverse, which we denote A-1; and the system
Ax=b has a unique solution for all
.
The following
table summarizes what we have established about inverses; as
usual, A is
,
,
and In is the
identity matrix.
| |
Inverse |
A? |
Ax=b |
Ax=0 |
| r=m<n |
AC=Im (right) |
A surjective |
Nonunique solutions for all b |
Nonunique solutions |
| r=n<m |
BA=In (left) |
A injective |
Unique solution for some b |
Only x=0 |
| r=n=m |
A-1A=AA-1= In |
A bijective (both) |
Unique solution for all b |
Only x=0 |