
Linear Algebra


Submitted By huangster94
SCHAUM’S outlines Linear Algebra
Fourth Edition

Seymour Lipschutz, Ph.D.
Temple University

Marc Lars Lipson, Ph.D.
University of Virginia

Schaum’s Outline Series

New York

Chicago San Francisco Lisbon London Madrid Mexico City Milan New Delhi San Juan Seoul Singapore Sydney Toronto

Copyright © 2009, 2001, 1991, 1968 by The McGraw-Hill Companies, Inc. All rights reserved. Except as permitted under the United States Copyright Act of 1976, no part of this publication may be reproduced or distributed in any form or by any means, or stored in a database or retrieval system, without the prior written permission of the publisher. ISBN: 978-0-07-154353-8 MHID: 0-07-154353-8 The material in this eBook also appears in the print version of this title: ISBN: 978-0-07-154352-1, MHID: 0-07-154352-X. All trademarks are trademarks of their respective owners. Rather than put a trademark symbol after every occurrence of a trademarked name, we use names in an editorial fashion only, and to the benefit of the trademark owner, with no intention of infringement of the trademark. Where such designations appear in this book, they have been printed with initial caps. McGraw-Hill eBooks are available at special quantity discounts to use as premiums and sales promotions, or for use in corporate training programs. To contact a representative please e-mail us at bulksales@mcgraw-hill.com. TERMS OF USE This is a copyrighted work and The McGraw-Hill Companies, Inc. (“McGraw-Hill”) and its licensors reserve all rights in and to the work. Use of this work is subject to these terms. Except as permitted under the Copyright Act of 1976 and the right to store and retrieve one copy of the work, you may not decompile, disassemble, reverse engineer, reproduce, modify, create derivative works based upon, transmit, distribute, disseminate, sell, publish or sublicense the work or any part of it without McGraw-Hill’s prior consent. You may use the work for your own noncommercial and personal use; any other use of the work is strictly prohibited. Your right to use the work may be terminated if you fail to comply with these terms. 
THE WORK IS PROVIDED “AS IS.” McGRAW-HILL AND ITS LICENSORS MAKE NO GUARANTEES OR WARRANTIES AS TO THE ACCURACY, ADEQUACY OR COMPLETENESS OF OR RESULTS TO BE OBTAINED FROM USING THE WORK, INCLUDING ANY INFORMATION THAT CAN BE ACCESSED THROUGH THE WORK VIA HYPERLINK OR OTHERWISE, AND EXPRESSLY DISCLAIM ANY WARRANTY, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. McGraw-Hill and its licensors do not warrant or guarantee that the functions contained in the work will meet your requirements or that its operation will be uninterrupted or error free. Neither McGraw-Hill nor its licensors shall be liable to you or anyone else for any inaccuracy, error or omission, regardless of cause, in the work or for any damages resulting therefrom. McGraw-Hill has no responsibility for the content of any information accessed through the work. Under no circumstances shall McGraw-Hill and/or its licensors be liable for any indirect, incidental, special, punitive, consequential or similar damages that result from the use of or inability to use the work, even if any of them has been advised of the possibility of such damages. This limitation of liability shall apply to any claim or cause whatsoever whether such claim or cause arises in contract, tort or otherwise.

Preface
Linear algebra has in recent years become an essential part of the mathematical background required by mathematicians and mathematics teachers, engineers, computer scientists, physicists, economists, and statisticians, among others. This requirement reflects the importance and wide applications of the subject matter. This book is designed for use as a textbook for a formal course in linear algebra or as a supplement to all current standard texts. It aims to present an introduction to linear algebra which will be found helpful to all readers regardless of their fields of specification. More material has been included than can be covered in most first courses. This has been done to make the book more flexible, to provide a useful book of reference, and to stimulate further interest in the subject. Each chapter begins with clear statements of pertinent definitions, principles, and theorems together with illustrative and other descriptive material. This is followed by graded sets of solved and supplementary problems. The solved problems serve to illustrate and amplify the theory, and to provide the repetition of basic principles so vital to effective learning. Numerous proofs, especially those of all essential theorems, are included among the solved problems. The supplementary problems serve as a complete review of the material of each chapter. The first three chapters treat vectors in Euclidean space, matrix algebra, and systems of linear equations. These chapters provide the motivation and basic computational tools for the abstract investigations of vector spaces and linear mappings which follow. After chapters on inner product spaces and orthogonality and on determinants, there is a detailed discussion of eigenvalues and eigenvectors giving conditions for representing a linear operator by a diagonal matrix. This naturally leads to the study of various canonical forms, specifically, the triangular, Jordan, and rational canonical forms. 
Later chapters cover linear functions and the dual space V*, and bilinear, quadratic, and Hermitian forms. The last chapter treats linear operators on inner product spaces. The main changes in the fourth edition have been in the appendices. First of all, we have expanded Appendix A on the tensor and exterior products of vector spaces where we have now included proofs on the existence and uniqueness of such products. We also added appendices covering algebraic structures, including modules, and polynomials over a field. Appendix D, ‘‘Odds and Ends,’’ includes the Moore–Penrose generalized inverse which appears in various applications, such as statistics. There are also many additional solved and supplementary problems. Finally, we wish to thank the staff of the McGraw-Hill Schaum’s Outline Series, especially Charles Wall, for their unfailing cooperation. SEYMOUR LIPSCHUTZ MARC LARS LIPSON


Contents
CHAPTER 1 Vectors in $\mathbf{R}^n$ and $\mathbf{C}^n$, Spatial Vectors
1.1 Introduction 1.2 Vectors in $\mathbf{R}^n$ 1.3 Vector Addition and Scalar Multiplication 1.4 Dot (Inner) Product 1.5 Located Vectors, Hyperplanes, Lines, Curves in $\mathbf{R}^n$ 1.6 Vectors in $\mathbf{R}^3$ (Spatial Vectors), ijk Notation 1.7 Complex Numbers 1.8 Vectors in $\mathbf{C}^n$

CHAPTER 2 Algebra of Matrices
2.1 Introduction 2.2 Matrices 2.3 Matrix Addition and Scalar Multiplication 2.4 Summation Symbol 2.5 Matrix Multiplication 2.6 Transpose of a Matrix 2.7 Square Matrices 2.8 Powers of Matrices, Polynomials in Matrices 2.9 Invertible (Nonsingular) Matrices 2.10 Special Types of Square Matrices 2.11 Complex Matrices 2.12 Block Matrices

CHAPTER 3 Systems of Linear Equations
3.1 Introduction 3.2 Basic Definitions, Solutions 3.3 Equivalent Systems, Elementary Operations 3.4 Small Square Systems of Linear Equations 3.5 Systems in Triangular and Echelon Forms 3.6 Gaussian Elimination 3.7 Echelon Matrices, Row Canonical Form, Row Equivalence 3.8 Gaussian Elimination, Matrix Formulation 3.9 Matrix Equation of a System of Linear Equations 3.10 Systems of Linear Equations and Linear Combinations of Vectors 3.11 Homogeneous Systems of Linear Equations 3.12 Elementary Matrices 3.13 LU Decomposition

CHAPTER 4 Vector Spaces
4.1 Introduction 4.2 Vector Spaces 4.3 Examples of Vector Spaces 4.4 Linear Combinations, Spanning Sets 4.5 Subspaces 4.6 Linear Spans, Row Space of a Matrix 4.7 Linear Dependence and Independence 4.8 Basis and Dimension 4.9 Application to Matrices, Rank of a Matrix 4.10 Sums and Direct Sums 4.11 Coordinates

CHAPTER 5 Linear Mappings
5.1 Introduction 5.2 Mappings, Functions 5.3 Linear Mappings (Linear Transformations) 5.4 Kernel and Image of a Linear Mapping 5.5 Singular and Nonsingular Linear Mappings, Isomorphisms 5.6 Operations with Linear Mappings 5.7 Algebra A(V) of Linear Operators

CHAPTER 6 Linear Mappings and Matrices
6.1 Introduction 6.2 Matrix Representation of a Linear Operator 6.3 Change of Basis 6.4 Similarity 6.5 Matrices and General Linear Mappings

CHAPTER 7 Inner Product Spaces, Orthogonality
7.1 Introduction 7.2 Inner Product Spaces 7.3 Examples of Inner Product Spaces 7.4 Cauchy–Schwarz Inequality, Applications 7.5 Orthogonality 7.6 Orthogonal Sets and Bases 7.7 Gram–Schmidt Orthogonalization Process 7.8 Orthogonal and Positive Definite Matrices 7.9 Complex Inner Product Spaces 7.10 Normed Vector Spaces (Optional)

CHAPTER 8 Determinants
8.1 Introduction 8.2 Determinants of Orders 1 and 2 8.3 Determinants of Order 3 8.4 Permutations 8.5 Determinants of Arbitrary Order 8.6 Properties of Determinants 8.7 Minors and Cofactors 8.8 Evaluation of Determinants 8.9 Classical Adjoint 8.10 Applications to Linear Equations, Cramer’s Rule 8.11 Submatrices, Minors, Principal Minors 8.12 Block Matrices and Determinants 8.13 Determinants and Volume 8.14 Determinant of a Linear Operator 8.15 Multilinearity and Determinants

CHAPTER 9 Diagonalization: Eigenvalues and Eigenvectors
9.1 Introduction 9.2 Polynomials of Matrices 9.3 Characteristic Polynomial, Cayley–Hamilton Theorem 9.4 Diagonalization, Eigenvalues and Eigenvectors 9.5 Computing Eigenvalues and Eigenvectors, Diagonalizing Matrices 9.6 Diagonalizing Real Symmetric Matrices and Quadratic Forms 9.7 Minimal Polynomial 9.8 Characteristic and Minimal Polynomials of Block Matrices

CHAPTER 10 Canonical Forms
10.1 Introduction 10.2 Triangular Form 10.3 Invariance 10.4 Invariant Direct-Sum Decompositions 10.5 Primary Decomposition 10.6 Nilpotent Operators 10.7 Jordan Canonical Form 10.8 Cyclic Subspaces 10.9 Rational Canonical Form 10.10 Quotient Spaces

CHAPTER 11 Linear Functionals and the Dual Space
11.1 Introduction 11.2 Linear Functionals and the Dual Space 11.3 Dual Basis 11.4 Second Dual Space 11.5 Annihilators 11.6 Transpose of a Linear Mapping

CHAPTER 12 Bilinear, Quadratic, and Hermitian Forms
12.1 Introduction 12.2 Bilinear Forms 12.3 Bilinear Forms and Matrices 12.4 Alternating Bilinear Forms 12.5 Symmetric Bilinear Forms, Quadratic Forms 12.6 Real Symmetric Bilinear Forms, Law of Inertia 12.7 Hermitian Forms

CHAPTER 13 Linear Operators on Inner Product Spaces
13.1 Introduction 13.2 Adjoint Operators 13.3 Analogy Between A(V) and C, Special Linear Operators 13.4 Self-Adjoint Operators 13.5 Orthogonal and Unitary Operators 13.6 Orthogonal and Unitary Matrices 13.7 Change of Orthonormal Basis 13.8 Positive Definite and Positive Operators 13.9 Diagonalization and Canonical Forms in Inner Product Spaces 13.10 Spectral Theorem

APPENDIX A Multilinear Products
APPENDIX B Algebraic Structures
APPENDIX C Polynomials over a Field
APPENDIX D Odds and Ends
List of Symbols
Index

CHAPTER 1

Vectors in $\mathbf{R}^n$ and $\mathbf{C}^n$, Spatial Vectors
1.1 Introduction
There are two ways to motivate the notion of a vector: one is by means of lists of numbers and subscripts, and the other is by means of certain objects in physics. We discuss these two ways below. Here we assume the reader is familiar with the elementary properties of the field of real numbers, denoted by R. On the other hand, we will review properties of the field of complex numbers, denoted by C. In the context of vectors, the elements of our number fields are called scalars. Although we will restrict ourselves in this chapter to vectors whose elements come from R and then from C, many of our operations also apply to vectors whose entries come from some arbitrary field K.

Lists of Numbers
Suppose the weights (in pounds) of eight students are listed as follows:

156, 125, 145, 134, 178, 145, 162, 193

One can denote all the values in the list using only one symbol, say $w$, but with different subscripts; that is,

$w_1, w_2, w_3, w_4, w_5, w_6, w_7, w_8$

Observe that each subscript denotes the position of the value in the list. For example,

$w_1 = 156$, the first number; $w_2 = 125$, the second number; ...

Such a list of values,

$w = (w_1, w_2, w_3, \ldots, w_8)$

is called a linear array or vector.

Vectors in Physics
Many physical quantities, such as temperature and speed, possess only "magnitude." These quantities can be represented by real numbers and are called scalars. On the other hand, there are also quantities, such as force and velocity, that possess both "magnitude" and "direction." These quantities, which can be represented by arrows having appropriate lengths and directions and emanating from some given reference point $O$, are called vectors.

Now we assume the reader is familiar with the space $\mathbf{R}^3$ where all the points in space are represented by ordered triples of real numbers. Suppose the origin of the axes in $\mathbf{R}^3$ is chosen as the reference point $O$ for the vectors discussed above. Then every vector is uniquely determined by the coordinates of its endpoint, and vice versa.

There are two important operations, vector addition and scalar multiplication, associated with vectors in physics. The definition of these operations and the relationship between these operations and the endpoints of the vectors are as follows.


Figure 1-1

(i) Vector Addition: The resultant $u + v$ of two vectors $u$ and $v$ is obtained by the parallelogram law; that is, $u + v$ is the diagonal of the parallelogram formed by $u$ and $v$. Furthermore, if $(a, b, c)$ and $(a', b', c')$ are the endpoints of the vectors $u$ and $v$, then $(a + a', b + b', c + c')$ is the endpoint of the vector $u + v$. These properties are pictured in Fig. 1-1(a).

(ii) Scalar Multiplication: The product $ku$ of a vector $u$ by a real number $k$ is obtained by multiplying the magnitude of $u$ by $|k|$ and retaining the same direction if $k > 0$ or the opposite direction if $k < 0$. Also, if $(a, b, c)$ is the endpoint of the vector $u$, then $(ka, kb, kc)$ is the endpoint of the vector $ku$. These properties are pictured in Fig. 1-1(b).

Mathematically, we identify the vector $u$ with its endpoint $(a, b, c)$ and write $u = (a, b, c)$. Moreover, we call the ordered triple $(a, b, c)$ of real numbers a point or vector depending upon its interpretation. We generalize this notion and call an $n$-tuple $(a_1, a_2, \ldots, a_n)$ of real numbers a vector. However, special notation may be used for the vectors in $\mathbf{R}^3$ called spatial vectors (Section 1.6).

1.2 Vectors in $\mathbf{R}^n$

The set of all $n$-tuples of real numbers, denoted by $\mathbf{R}^n$, is called $n$-space. A particular $n$-tuple in $\mathbf{R}^n$, say

$u = (a_1, a_2, \ldots, a_n)$

is called a point or vector. The numbers $a_i$ are called the coordinates, components, entries, or elements of $u$. Moreover, when discussing the space $\mathbf{R}^n$, we use the term scalar for the elements of $\mathbf{R}$.

Two vectors, $u$ and $v$, are equal, written $u = v$, if they have the same number of components and if the corresponding components are equal. Although the vectors $(1, 2, 3)$ and $(2, 3, 1)$ contain the same three numbers, these vectors are not equal because corresponding entries are not equal.

The vector $(0, 0, \ldots, 0)$ whose entries are all 0 is called the zero vector and is usually denoted by 0.
EXAMPLE 1.1

(a) The following are vectors:

$(2, -5)$, $(7, 9)$, $(0, 0, 0)$, $(3, 4, 5)$

The first two vectors belong to $\mathbf{R}^2$, whereas the last two belong to $\mathbf{R}^3$. The third is the zero vector in $\mathbf{R}^3$.

(b) Find $x, y, z$ such that $(x - y, x + y, z - 1) = (4, 2, 3)$. By definition of equality of vectors, corresponding entries must be equal. Thus,

$x - y = 4$, $x + y = 2$, $z - 1 = 3$

Solving the above system of equations yields $x = 3$, $y = -1$, $z = 4$.


Column Vectors
Sometimes a vector in $n$-space $\mathbf{R}^n$ is written vertically rather than horizontally. Such a vector is called a column vector, and, in this context, the horizontally written vectors in Example 1.1 are called row vectors. For example, the following are column vectors with 2, 2, 3, and 3 components, respectively:

$\begin{bmatrix} 1 \\ 2 \end{bmatrix}$, $\begin{bmatrix} 3 \\ -4 \end{bmatrix}$, $\begin{bmatrix} 1 \\ 5 \\ -6 \end{bmatrix}$, $\begin{bmatrix} 1.5 \\ \frac{2}{3} \\ -15 \end{bmatrix}$

We also note that any operation defined for row vectors is defined analogously for column vectors.

1.3 Vector Addition and Scalar Multiplication

Consider two vectors $u$ and $v$ in $\mathbf{R}^n$, say

$u = (a_1, a_2, \ldots, a_n)$ and $v = (b_1, b_2, \ldots, b_n)$

Their sum, written $u + v$, is the vector obtained by adding corresponding components from $u$ and $v$. That is,

$u + v = (a_1 + b_1, a_2 + b_2, \ldots, a_n + b_n)$

The scalar product or, simply, product, of the vector $u$ by a real number $k$, written $ku$, is the vector obtained by multiplying each component of $u$ by $k$. That is,

$ku = k(a_1, a_2, \ldots, a_n) = (ka_1, ka_2, \ldots, ka_n)$

Observe that $u + v$ and $ku$ are also vectors in $\mathbf{R}^n$. The sum of vectors with different numbers of components is not defined.

Negatives and subtraction are defined in $\mathbf{R}^n$ as follows:

$-u = (-1)u$ and $u - v = u + (-v)$

The vector $-u$ is called the negative of $u$, and $u - v$ is called the difference of $u$ and $v$.

Now suppose we are given vectors $u_1, u_2, \ldots, u_m$ in $\mathbf{R}^n$ and scalars $k_1, k_2, \ldots, k_m$ in $\mathbf{R}$. We can multiply the vectors by the corresponding scalars and then add the resultant scalar products to form the vector

$v = k_1 u_1 + k_2 u_2 + k_3 u_3 + \cdots + k_m u_m$

Such a vector $v$ is called a linear combination of the vectors $u_1, u_2, \ldots, u_m$.
EXAMPLE 1.2

(a) Let $u = (2, 4, -5)$ and $v = (1, -6, 9)$. Then

$u + v = (2 + 1, 4 + (-6), -5 + 9) = (3, -2, 4)$
$7u = (7(2), 7(4), 7(-5)) = (14, 28, -35)$
$-v = (-1)(1, -6, 9) = (-1, 6, -9)$
$3u - 5v = (6, 12, -15) + (-5, 30, -45) = (1, 42, -60)$

(b) The zero vector $0 = (0, 0, \ldots, 0)$ in $\mathbf{R}^n$ is similar to the scalar 0 in that, for any vector $u = (a_1, a_2, \ldots, a_n)$,

$u + 0 = (a_1 + 0, a_2 + 0, \ldots, a_n + 0) = (a_1, a_2, \ldots, a_n) = u$

(c) Let $u = \begin{bmatrix} 2 \\ 3 \\ -4 \end{bmatrix}$ and $v = \begin{bmatrix} 3 \\ -1 \\ -2 \end{bmatrix}$. Then $2u - 3v = \begin{bmatrix} 4 \\ 6 \\ -8 \end{bmatrix} + \begin{bmatrix} -9 \\ 3 \\ 6 \end{bmatrix} = \begin{bmatrix} -5 \\ 9 \\ -2 \end{bmatrix}$.
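The componentwise definitions of this section translate directly into code. The following is a minimal Python sketch, not part of the original text; the function names `add`, `scale`, and `linear_combination` are illustrative choices, and vectors are represented as plain tuples.

```python
def add(u, v):
    """Componentwise sum; undefined when the numbers of components differ."""
    if len(u) != len(v):
        raise ValueError("sum is not defined for vectors of different sizes")
    return tuple(a + b for a, b in zip(u, v))


def scale(k, u):
    """Product ku: multiply each component of u by the scalar k."""
    return tuple(k * a for a in u)


def linear_combination(scalars, vectors):
    """Return k1*u1 + k2*u2 + ... + km*um for m >= 1."""
    result = scale(scalars[0], vectors[0])
    for k, u in zip(scalars[1:], vectors[1:]):
        result = add(result, scale(k, u))
    return result


u = (2, 4, -5)
v = (1, -6, 9)
print(add(u, v))                             # (3, -2, 4)
print(scale(7, u))                           # (14, 28, -35)
print(linear_combination([3, -5], [u, v]))   # (1, 42, -60)
```

The printed values reproduce the sum, the product $7u$, and the combination $3u - 5v$ for the vectors of Example 1.2(a).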


Basic properties of vectors under the operations of vector addition and scalar multiplication are described in the following theorem.
THEOREM 1.1: For any vectors $u, v, w$ in $\mathbf{R}^n$ and any scalars $k, k'$ in $\mathbf{R}$,

(i) $(u + v) + w = u + (v + w)$,
(ii) $u + 0 = u$,
(iii) $u + (-u) = 0$,
(iv) $u + v = v + u$,
(v) $k(u + v) = ku + kv$,
(vi) $(k + k')u = ku + k'u$,
(vii) $(kk')u = k(k'u)$,
(viii) $1u = u$.

We postpone the proof of Theorem 1.1 until Chapter 2, where it appears in the context of matrices (Problem 2.3).

Suppose $u$ and $v$ are vectors in $\mathbf{R}^n$ for which $u = kv$ for some nonzero scalar $k$ in $\mathbf{R}$. Then $u$ is called a multiple of $v$. Also, $u$ is said to be in the same or opposite direction as $v$ according to whether $k > 0$ or $k < 0$.

1.4 Dot (Inner) Product

Consider arbitrary vectors $u$ and $v$ in $\mathbf{R}^n$; say,

$u = (a_1, a_2, \ldots, a_n)$ and $v = (b_1, b_2, \ldots, b_n)$

The dot product or inner product or scalar product of $u$ and $v$ is denoted and defined by

$u \cdot v = a_1 b_1 + a_2 b_2 + \cdots + a_n b_n$

That is, $u \cdot v$ is obtained by multiplying corresponding components and adding the resulting products. The vectors $u$ and $v$ are said to be orthogonal (or perpendicular) if their dot product is zero; that is, if $u \cdot v = 0$.
EXAMPLE 1.3

(a) Let $u = (1, -2, 3)$, $v = (4, 5, -1)$, $w = (2, 7, 4)$. Then,

$u \cdot v = 1(4) - 2(5) + 3(-1) = 4 - 10 - 3 = -9$
$u \cdot w = 2 - 14 + 12 = 0$, $v \cdot w = 8 + 35 - 4 = 39$

Thus, $u$ and $w$ are orthogonal.

(b) Let $u = \begin{bmatrix} 2 \\ 3 \\ -4 \end{bmatrix}$ and $v = \begin{bmatrix} 3 \\ -1 \\ -2 \end{bmatrix}$. Then $u \cdot v = 6 - 3 + 8 = 11$.

(c) Suppose $u = (1, 2, 3, 4)$ and $v = (6, k, -8, 2)$. Find $k$ so that $u$ and $v$ are orthogonal. First obtain $u \cdot v = 6 + 2k - 24 + 8 = -10 + 2k$. Then set $u \cdot v = 0$ and solve for $k$:

$-10 + 2k = 0$ or $2k = 10$ or $k = 5$
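The dot product and the orthogonality test can be sketched in a few lines of Python. This is an illustrative sketch rather than anything prescribed by the text; the helper names `dot` and `are_orthogonal` are assumptions, and the orthogonality test uses exact comparison, which is appropriate only for integer or rational components.

```python
def dot(u, v):
    """Multiply corresponding components and add the resulting products."""
    if len(u) != len(v):
        raise ValueError("dot product requires vectors of the same size")
    return sum(a * b for a, b in zip(u, v))


def are_orthogonal(u, v):
    """True when u . v = 0 (exact test, suitable for integer entries)."""
    return dot(u, v) == 0


u, v, w = (1, -2, 3), (4, 5, -1), (2, 7, 4)
print(dot(u, v))             # -9
print(dot(v, w))             # 39
print(are_orthogonal(u, w))  # True

# Part (c): with k = 5 the two vectors become orthogonal.
print(dot((1, 2, 3, 4), (6, 5, -8, 2)))  # 0
```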

Basic properties of the dot product in $\mathbf{R}^n$ (proved in Problem 1.13) follow.
THEOREM 1.2: For any vectors $u, v, w$ in $\mathbf{R}^n$ and any scalar $k$ in $\mathbf{R}$:

(i) $(u + v) \cdot w = u \cdot w + v \cdot w$,
(ii) $(ku) \cdot v = k(u \cdot v)$,
(iii) $u \cdot v = v \cdot u$,
(iv) $u \cdot u \ge 0$, and $u \cdot u = 0$ iff $u = 0$.

Note that (ii) says that we can "take $k$ out" from the first position in an inner product. By (iii) and (ii),

$u \cdot (kv) = (kv) \cdot u = k(v \cdot u) = k(u \cdot v)$

That is, we can also "take $k$ out" from the second position in an inner product.

The space $\mathbf{R}^n$ with the above operations of vector addition, scalar multiplication, and dot product is usually called Euclidean $n$-space.

Norm (Length) of a Vector
The norm or length of a vector $u$ in $\mathbf{R}^n$, denoted by $\|u\|$, is defined to be the nonnegative square root of $u \cdot u$. In particular, if $u = (a_1, a_2, \ldots, a_n)$, then

$\|u\| = \sqrt{u \cdot u} = \sqrt{a_1^2 + a_2^2 + \cdots + a_n^2}$

That is, $\|u\|$ is the square root of the sum of the squares of the components of $u$. Thus, $\|u\| \ge 0$, and $\|u\| = 0$ if and only if $u = 0$.

A vector $u$ is called a unit vector if $\|u\| = 1$ or, equivalently, if $u \cdot u = 1$. For any nonzero vector $v$ in $\mathbf{R}^n$, the vector

$\hat{v} = \frac{1}{\|v\|} v = \frac{v}{\|v\|}$

is the unique unit vector in the same direction as $v$. The process of finding $\hat{v}$ from $v$ is called normalizing $v$.
EXAMPLE 1.4

(a) Suppose $u = (1, -2, -4, 5, 3)$. To find $\|u\|$, we can first find $\|u\|^2 = u \cdot u$ by squaring each component of $u$ and adding, as follows:

$\|u\|^2 = 1^2 + (-2)^2 + (-4)^2 + 5^2 + 3^2 = 1 + 4 + 16 + 25 + 9 = 55$

Then $\|u\| = \sqrt{55}$.

(b) Let $v = (1, -3, 4, 2)$ and $w = (\frac{1}{2}, -\frac{1}{6}, \frac{5}{6}, \frac{1}{6})$. Then

$\|v\| = \sqrt{1 + 9 + 16 + 4} = \sqrt{30}$ and $\|w\| = \sqrt{\frac{9}{36} + \frac{1}{36} + \frac{25}{36} + \frac{1}{36}} = \sqrt{\frac{36}{36}} = \sqrt{1} = 1$

Thus $w$ is a unit vector, but $v$ is not a unit vector. However, we can normalize $v$ as follows:

$\hat{v} = \frac{v}{\|v\|} = \left( \frac{1}{\sqrt{30}}, \frac{-3}{\sqrt{30}}, \frac{4}{\sqrt{30}}, \frac{2}{\sqrt{30}} \right)$

This is the unique unit vector in the same direction as $v$.
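The norm and the normalization step can be checked numerically. The sketch below is illustrative only: the names `squared_norm`, `norm`, and `normalize` are assumptions, and `fractions.Fraction` is used so that the unit-vector test for $w$ is exact rather than subject to floating-point rounding.

```python
import math
from fractions import Fraction


def squared_norm(u):
    """Return u . u, the sum of the squares of the components."""
    return sum(a * a for a in u)


def norm(u):
    """Nonnegative square root of u . u (returned as a float)."""
    return math.sqrt(squared_norm(u))


def normalize(v):
    """Return the unit vector in the direction of a nonzero vector v."""
    length = norm(v)
    if length == 0:
        raise ValueError("the zero vector cannot be normalized")
    return tuple(a / length for a in v)


print(squared_norm((1, -2, -4, 5, 3)))  # 55

w = (Fraction(1, 2), Fraction(-1, 6), Fraction(5, 6), Fraction(1, 6))
print(squared_norm(w) == 1)             # True: w is a unit vector

v_hat = normalize((1, -3, 4, 2))
print(math.isclose(norm(v_hat), 1.0))   # True
```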

The following formula (proved in Problem 1.14) is known as the Schwarz inequality or Cauchy–Schwarz inequality. It is used in many branches of mathematics.

THEOREM 1.3 (Schwarz): For any vectors $u, v$ in $\mathbf{R}^n$, $|u \cdot v| \le \|u\| \|v\|$.

Using the above inequality, we also prove (Problem 1.15) the following result known as the "triangle inequality" or Minkowski's inequality.

THEOREM 1.4 (Minkowski): For any vectors $u, v$ in $\mathbf{R}^n$, $\|u + v\| \le \|u\| + \|v\|$.
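Both inequalities can be spot-checked on sample vectors. The following Python sketch is a numerical illustration, not a proof; the function names and the small tolerance `tol`, which absorbs floating-point rounding in the equality case, are assumptions of this sketch.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    return math.sqrt(dot(u, u))


def satisfies_schwarz(u, v, tol=1e-12):
    """Check |u . v| <= ||u|| ||v|| up to a rounding tolerance."""
    return abs(dot(u, v)) <= norm(u) * norm(v) + tol


def satisfies_minkowski(u, v, tol=1e-12):
    """Check ||u + v|| <= ||u|| + ||v|| up to a rounding tolerance."""
    total = tuple(a + b for a, b in zip(u, v))
    return norm(total) <= norm(u) + norm(v) + tol


pairs = [
    ((1, -2, 3), (2, 4, 5)),
    ((1, 2, 3, 4), (6, 5, -8, 2)),   # orthogonal pair
    ((1, -3, 4, 2), (2, -6, 8, 4)),  # v = 2u, the equality case
]
for u, v in pairs:
    print(satisfies_schwarz(u, v), satisfies_minkowski(u, v))
```

The third pair has $v = 2u$, for which both inequalities hold with equality.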

Distance, Angles, Projections
The distance between vectors $u = (a_1, a_2, \ldots, a_n)$ and $v = (b_1, b_2, \ldots, b_n)$ in $\mathbf{R}^n$ is denoted and defined by

$d(u, v) = \|u - v\| = \sqrt{(a_1 - b_1)^2 + (a_2 - b_2)^2 + \cdots + (a_n - b_n)^2}$

One can show that this definition agrees with the usual notion of distance in the Euclidean plane $\mathbf{R}^2$ or space $\mathbf{R}^3$.


The angle $\theta$ between nonzero vectors $u, v$ in $\mathbf{R}^n$ is defined by

$\cos\theta = \frac{u \cdot v}{\|u\| \|v\|}$

This definition is well defined, because, by the Schwarz inequality (Theorem 1.3),

$-1 \le \frac{u \cdot v}{\|u\| \|v\|} \le 1$

Note that if $u \cdot v = 0$, then $\theta = 90^\circ$ (or $\theta = \pi/2$). This then agrees with our previous definition of orthogonality.

The projection of a vector $u$ onto a nonzero vector $v$ is the vector denoted and defined by

$\mathrm{proj}(u, v) = \frac{u \cdot v}{\|v\|^2} v = \frac{u \cdot v}{v \cdot v} v$

We show below that this agrees with the usual notion of vector projection in physics.
EXAMPLE 1.5

(a) Suppose $u = (1, -2, 3)$ and $v = (2, 4, 5)$. Then

$d(u, v) = \sqrt{(1 - 2)^2 + (-2 - 4)^2 + (3 - 5)^2} = \sqrt{1 + 36 + 4} = \sqrt{41}$

To find $\cos\theta$, where $\theta$ is the angle between $u$ and $v$, we first find

$u \cdot v = 2 - 8 + 15 = 9$, $\|u\|^2 = 1 + 4 + 9 = 14$, $\|v\|^2 = 4 + 16 + 25 = 45$

Then

$\cos\theta = \frac{u \cdot v}{\|u\| \|v\|} = \frac{9}{\sqrt{14}\sqrt{45}}$

Also,

$\mathrm{proj}(u, v) = \frac{u \cdot v}{\|v\|^2} v = \frac{9}{45}(2, 4, 5) = \frac{1}{5}(2, 4, 5) = \left( \frac{2}{5}, \frac{4}{5}, 1 \right)$

(b) Consider the vectors $u$ and $v$ in Fig. 1-2(a) (with respective endpoints $A$ and $B$). The (perpendicular) projection of $u$ onto $v$ is the vector $u^*$ with magnitude

$\|u^*\| = \|u\| \cos\theta = \|u\| \frac{u \cdot v}{\|u\| \|v\|} = \frac{u \cdot v}{\|v\|}$

To obtain $u^*$, we multiply its magnitude by the unit vector in the direction of $v$, obtaining

$u^* = \|u^*\| \frac{v}{\|v\|} = \frac{u \cdot v}{\|v\|} \frac{v}{\|v\|} = \frac{u \cdot v}{\|v\|^2} v$

This is the same as the above definition of $\mathrm{proj}(u, v)$.
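The three quantities of this subsection can be computed as follows. This is a sketch under stated assumptions: the names `distance`, `cos_angle`, and `proj` are illustrative, and `proj` uses `fractions.Fraction`, so it assumes integer (or rational) components in order to return exact results.

```python
import math
from fractions import Fraction


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def distance(u, v):
    """d(u, v) = ||u - v||."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cos_angle(u, v):
    """cos(theta) = (u . v) / (||u|| ||v||) for nonzero u and v."""
    return dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))


def proj(u, v):
    """Projection of u onto a nonzero v: ((u . v) / (v . v)) v, exactly."""
    if not any(v):
        raise ValueError("cannot project onto the zero vector")
    c = Fraction(dot(u, v), dot(v, v))
    return tuple(c * b for b in v)


u, v = (1, -2, 3), (2, 4, 5)
print(distance(u, v) == math.sqrt(41))  # True
print(cos_angle(u, v))                  # 9 / (sqrt(14) * sqrt(45))
print(proj(u, v))                       # the vector (2/5, 4/5, 1) as Fractions
```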
Figure 1-2: (a) Projection $u^*$ of $u$ onto $v$; (b) $u = B - A$

1.5 Located Vectors, Hyperplanes, Lines, Curves in $\mathbf{R}^n$

This section distinguishes between an $n$-tuple $P(a_i) \equiv P(a_1, a_2, \ldots, a_n)$ viewed as a point in $\mathbf{R}^n$ and an $n$-tuple $u = [c_1, c_2, \ldots, c_n]$ viewed as a vector (arrow) from the origin $O$ to the point $C(c_1, c_2, \ldots, c_n)$.

Located Vectors
Any pair of points $A(a_i)$ and $B(b_i)$ in $\mathbf{R}^n$ defines the located vector or directed line segment from $A$ to $B$, written $\overrightarrow{AB}$. We identify $\overrightarrow{AB}$ with the vector

$u = B - A = [b_1 - a_1, b_2 - a_2, \ldots, b_n - a_n]$

because $\overrightarrow{AB}$ and $u$ have the same magnitude and direction. This is pictured in Fig. 1-2(b) for the points $A(a_1, a_2, a_3)$ and $B(b_1, b_2, b_3)$ in $\mathbf{R}^3$ and the vector $u = B - A$ which has the endpoint $P(b_1 - a_1, b_2 - a_2, b_3 - a_3)$.

Hyperplanes
A hyperplane $H$ in $\mathbf{R}^n$ is the set of points $(x_1, x_2, \ldots, x_n)$ that satisfy a linear equation

$a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = b$

where the vector $u = [a_1, a_2, \ldots, a_n]$ of coefficients is not zero. Thus a hyperplane $H$ in $\mathbf{R}^2$ is a line, and a hyperplane $H$ in $\mathbf{R}^3$ is a plane. We show below, as pictured in Fig. 1-3(a) for $\mathbf{R}^3$, that $u$ is orthogonal to any directed line segment $\overrightarrow{PQ}$, where $P(p_i)$ and $Q(q_i)$ are points in $H$. [For this reason, we say that $u$ is normal to $H$ and that $H$ is normal to $u$.]

Figure 1-3

Because $P(p_i)$ and $Q(q_i)$ belong to $H$, they satisfy the above hyperplane equation; that is,

$a_1 p_1 + a_2 p_2 + \cdots + a_n p_n = b$ and $a_1 q_1 + a_2 q_2 + \cdots + a_n q_n = b$

Let

$v = \overrightarrow{PQ} = Q - P = [q_1 - p_1, q_2 - p_2, \ldots, q_n - p_n]$

Then

$u \cdot v = a_1(q_1 - p_1) + a_2(q_2 - p_2) + \cdots + a_n(q_n - p_n)$
$= (a_1 q_1 + a_2 q_2 + \cdots + a_n q_n) - (a_1 p_1 + a_2 p_2 + \cdots + a_n p_n) = b - b = 0$

Thus $v = \overrightarrow{PQ}$ is orthogonal to $u$, as claimed.

Lines in $\mathbf{R}^n$

The line $L$ in $\mathbf{R}^n$ passing through the point $P(b_1, b_2, \ldots, b_n)$ and in the direction of a nonzero vector $u = [a_1, a_2, \ldots, a_n]$ consists of the points $X(x_1, x_2, \ldots, x_n)$ that satisfy

$X = P + tu$ or $\begin{cases} x_1 = a_1 t + b_1 \\ x_2 = a_2 t + b_2 \\ \cdots \\ x_n = a_n t + b_n \end{cases}$ or $L(t) = (a_i t + b_i)$

where the parameter $t$ takes on all real values. Such a line $L$ in $\mathbf{R}^3$ is pictured in Fig. 1-3(b).
EXAMPLE 1.6

(a) Let $H$ be the plane in $\mathbf{R}^3$ corresponding to the linear equation $2x - 5y + 7z = 4$. Observe that $P(1, 1, 1)$ and $Q(5, 4, 2)$ are solutions of the equation. Thus $P$ and $Q$ and the directed line segment

$v = \overrightarrow{PQ} = Q - P = [5 - 1, 4 - 1, 2 - 1] = [4, 3, 1]$

lie on the plane $H$. The vector $u = [2, -5, 7]$ is normal to $H$, and, as expected,

$u \cdot v = [2, -5, 7] \cdot [4, 3, 1] = 8 - 15 + 7 = 0$

That is, $u$ is orthogonal to $v$.

(b) Find an equation of the hyperplane $H$ in $\mathbf{R}^4$ that passes through the point $P(1, 3, -4, 2)$ and is normal to the vector $u = [4, -2, 5, 6]$.

The coefficients of the unknowns of an equation of $H$ are the components of the normal vector $u$; hence, the equation of $H$ must be of the form

$4x_1 - 2x_2 + 5x_3 + 6x_4 = k$

Substituting $P$ into this equation, we obtain

$4(1) - 2(3) + 5(-4) + 6(2) = k$ or $4 - 6 - 20 + 12 = k$ or $k = -10$

Thus, $4x_1 - 2x_2 + 5x_3 + 6x_4 = -10$ is the equation of $H$.

(c) Find the parametric representation of the line $L$ in $\mathbf{R}^4$ passing through the point $P(1, 2, 3, -4)$ and in the direction of $u = [5, 6, -7, 8]$. Also, find the point $Q$ on $L$ when $t = 1$.

Substitution in the above equation for $L$ yields the following parametric representation:

$x_1 = 5t + 1$, $x_2 = 6t + 2$, $x_3 = -7t + 3$, $x_4 = 8t - 4$

or, equivalently,

$L(t) = (5t + 1, 6t + 2, -7t + 3, 8t - 4)$

Note that $t = 0$ yields the point $P$ on $L$. Substitution of $t = 1$ yields the point $Q(6, 8, -4, 4)$ on $L$.
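The constructions in this example reduce to one dot product and one componentwise formula. The Python sketch below is illustrative; the function names `hyperplane_constant` and `line_point` are assumptions, not notation from the text.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def hyperplane_constant(point, normal):
    """Return k such that normal . x = k is the hyperplane through point."""
    return dot(normal, point)


def line_point(p, u, t):
    """Point L(t) = P + t u on the line through P with direction u."""
    return tuple(a * t + b for a, b in zip(u, p))


# (a) The segment PQ in the plane 2x - 5y + 7z = 4 is orthogonal to the normal.
P, Q = (1, 1, 1), (5, 4, 2)
v = tuple(q - p for p, q in zip(P, Q))
print(v, dot((2, -5, 7), v))   # (4, 3, 1) 0

# (b) Hyperplane through P(1, 3, -4, 2) with normal [4, -2, 5, 6].
print(hyperplane_constant((1, 3, -4, 2), (4, -2, 5, 6)))   # -10

# (c) Points on the line through P(1, 2, 3, -4) with direction [5, 6, -7, 8].
print(line_point((1, 2, 3, -4), (5, 6, -7, 8), 0))   # (1, 2, 3, -4)
print(line_point((1, 2, 3, -4), (5, 6, -7, 8), 1))   # (6, 8, -4, 4)
```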

Curves in Rn
Let $D$ be an interval (finite or infinite) on the real line $\mathbf{R}$. A continuous function $F: D \to \mathbf{R}^n$ is a curve in $\mathbf{R}^n$. Thus, to each point $t \in D$ there is assigned the following point in $\mathbf{R}^n$:

$F(t) = [F_1(t), F_2(t), \ldots, F_n(t)]$

Moreover, the derivative (if it exists) of $F(t)$ yields the vector

$V(t) = \frac{dF(t)}{dt} = \left[ \frac{dF_1(t)}{dt}, \frac{dF_2(t)}{dt}, \ldots, \frac{dF_n(t)}{dt} \right]$

which is tangent to the curve. Normalizing $V(t)$ yields

$\mathbf{T}(t) = \frac{V(t)}{\|V(t)\|}$

Thus, $\mathbf{T}(t)$ is the unit tangent vector to the curve. (Unit vectors with geometrical significance are often presented in bold type.)
EXAMPLE 1.7

Consider the curve $F(t) = [\sin t, \cos t, t]$ in $\mathbf{R}^3$. Taking the derivative of $F(t)$ [or each component of $F(t)$] yields

$V(t) = [\cos t, -\sin t, 1]$

which is a vector tangent to the curve. We normalize $V(t)$. First we obtain

$\|V(t)\|^2 = \cos^2 t + \sin^2 t + 1 = 1 + 1 = 2$

Then the unit tangent vector $\mathbf{T}(t)$ to the curve follows:

$\mathbf{T}(t) = \frac{V(t)}{\|V(t)\|} = \left[ \frac{\cos t}{\sqrt{2}}, \frac{-\sin t}{\sqrt{2}}, \frac{1}{\sqrt{2}} \right]$
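The closed form for $\mathbf{T}(t)$ can be compared against a numerical derivative. The sketch below approximates $V(t)$ by a central difference, which is a technique chosen for this illustration and not used in the text; the names `velocity` and `unit_tangent`, the step size `h`, and the sample parameter value 0.7 are all assumptions.

```python
import math


def F(t):
    """The curve F(t) = [sin t, cos t, t] of Example 1.7."""
    return (math.sin(t), math.cos(t), t)


def velocity(curve, t, h=1e-6):
    """Central-difference approximation to the derivative dF/dt at t."""
    ahead, behind = curve(t + h), curve(t - h)
    return tuple((a - b) / (2 * h) for a, b in zip(ahead, behind))


def unit_tangent(curve, t):
    """Normalize the (approximate) tangent vector V(t)."""
    v = velocity(curve, t)
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


t0 = 0.7
T = unit_tangent(F, t0)
root2 = math.sqrt(2)
expected = (math.cos(t0) / root2, -math.sin(t0) / root2, 1 / root2)

# The numerical and closed-form tangents agree to within the difference error.
print(all(abs(a - b) < 1e-6 for a, b in zip(T, expected)))
```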

1.6 Vectors in $\mathbf{R}^3$ (Spatial Vectors), ijk Notation

Vectors in $\mathbf{R}^3$, called spatial vectors, appear in many applications, especially in physics. In fact, a special notation is frequently used for such vectors as follows:

$\mathbf{i} = [1, 0, 0]$ denotes the unit vector in the $x$ direction.
$\mathbf{j} = [0, 1, 0]$ denotes the unit vector in the $y$ direction.
$\mathbf{k} = [0, 0, 1]$ denotes the unit vector in the $z$ direction.

Then any vector $u = [a, b, c]$ in $\mathbf{R}^3$ can be expressed uniquely in the form

$u = [a, b, c] = a\mathbf{i} + b\mathbf{j} + c\mathbf{k}$

Because the vectors $\mathbf{i}, \mathbf{j}, \mathbf{k}$ are unit vectors and are mutually orthogonal, we obtain the following dot products:

$\mathbf{i} \cdot \mathbf{i} = 1$, $\mathbf{j} \cdot \mathbf{j} = 1$, $\mathbf{k} \cdot \mathbf{k} = 1$ and $\mathbf{i} \cdot \mathbf{j} = 0$, $\mathbf{i} \cdot \mathbf{k} = 0$, $\mathbf{j} \cdot \mathbf{k} = 0$

Furthermore, the vector operations discussed above may be expressed in the ijk notation as follows. Suppose

$u = a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k}$ and $v = b_1\mathbf{i} + b_2\mathbf{j} + b_3\mathbf{k}$

Then

$u + v = (a_1 + b_1)\mathbf{i} + (a_2 + b_2)\mathbf{j} + (a_3 + b_3)\mathbf{k}$ and $cu = ca_1\mathbf{i} + ca_2\mathbf{j} + ca_3\mathbf{k}$

where $c$ is a scalar. Also,

$u \cdot v = a_1 b_1 + a_2 b_2 + a_3 b_3$ and $\|u\| = \sqrt{u \cdot u} = \sqrt{a_1^2 + a_2^2 + a_3^2}$

EXAMPLE 1.8

Suppose $u = 3\mathbf{i} + 5\mathbf{j} - 2\mathbf{k}$ and $v = 4\mathbf{i} - 8\mathbf{j} + 7\mathbf{k}$.

(a) To find $u + v$, add corresponding components, obtaining $u + v = 7\mathbf{i} - 3\mathbf{j} + 5\mathbf{k}$

(b) To find $3u - 2v$, first multiply by the scalars and then add:

$3u - 2v = (9\mathbf{i} + 15\mathbf{j} - 6\mathbf{k}) + (-8\mathbf{i} + 16\mathbf{j} - 14\mathbf{k}) = \mathbf{i} + 31\mathbf{j} - 20\mathbf{k}$

(c) To find $u \cdot v$, multiply corresponding components and then add:

$u \cdot v = 12 - 40 - 14 = -42$

(d) To find $\|u\|$, take the square root of the sum of the squares of the components:

$\|u\| = \sqrt{9 + 25 + 4} = \sqrt{38}$

Cross Product
There is a special operation for vectors $u$ and $v$ in $\mathbf{R}^3$ that is not defined in $\mathbf{R}^n$ for $n \ne 3$. This operation is called the cross product and is denoted by $u \times v$. One way to easily remember the formula for $u \times v$ is to use the determinant (of order two) and its negative, which are denoted and defined as follows:

$\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc$ and $-\begin{vmatrix} a & b \\ c & d \end{vmatrix} = bc - ad$

Here $a$ and $d$ are called the diagonal elements and $b$ and $c$ are the nondiagonal elements. Thus, the determinant is the product $ad$ of the diagonal elements minus the product $bc$ of the nondiagonal elements, but vice versa for the negative of the determinant.

Now suppose $u = a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k}$ and $v = b_1\mathbf{i} + b_2\mathbf{j} + b_3\mathbf{k}$. Then

$u \times v = (a_2 b_3 - a_3 b_2)\mathbf{i} + (a_3 b_1 - a_1 b_3)\mathbf{j} + (a_1 b_2 - a_2 b_1)\mathbf{k}$
$= \begin{vmatrix} a_2 & a_3 \\ b_2 & b_3 \end{vmatrix}\mathbf{i} - \begin{vmatrix} a_1 & a_3 \\ b_1 & b_3 \end{vmatrix}\mathbf{j} + \begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix}\mathbf{k}$

That is, the three components of $u \times v$ are obtained from the array

$\begin{bmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{bmatrix}$

(which contains the components of $u$ above the components of $v$) as follows:

(1) Cover the first column and take the determinant.
(2) Cover the second column and take the negative of the determinant.
(3) Cover the third column and take the determinant.

Note that $u \times v$ is a vector; hence, $u \times v$ is also called the vector product or outer product of $u$ and $v$.
EXAMPLE 1.9 Find $u \times v$ where: (a) $u = 4i + 3j + 6k$, $v = 2i + 5j - 3k$, (b) $u = [2, -1, 5]$, $v = [3, 7, 6]$.

(a) Use $\begin{bmatrix} 4 & 3 & 6 \\ 2 & 5 & -3 \end{bmatrix}$ to get $u \times v = (-9 - 30)i + (12 + 12)j + (20 - 6)k = -39i + 24j + 14k$

(b) Use $\begin{bmatrix} 2 & -1 & 5 \\ 3 & 7 & 6 \end{bmatrix}$ to get $u \times v = [-6 - 35,\ 15 - 12,\ 14 + 3] = [-41, 3, 17]$
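The "cover a column" rule can be sketched in a few lines of Python. This is only an illustration, assuming vectors in $\mathbf{R}^3$ are stored as 3-tuples; the function name `cross` is chosen here and does not come from the text.

```python
def cross(u, v):
    """Cross product of two vectors in R^3 via the 2x3 array rule."""
    a1, a2, a3 = u
    b1, b2, b3 = v
    return (
        a2 * b3 - a3 * b2,   # cover column 1, take the determinant
        a3 * b1 - a1 * b3,   # cover column 2, take the negative determinant
        a1 * b2 - a2 * b1,   # cover column 3, take the determinant
    )

print(cross((4, 3, 6), (2, 5, -3)))   # (-39, 24, 14)
print(cross((2, -1, 5), (3, 7, 6)))   # (-41, 3, 17)
```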

Remark: The cross products of the vectors $i, j, k$ are as follows:
$$i \times j = k, \qquad j \times k = i, \qquad k \times i = j$$
$$j \times i = -k, \qquad k \times j = -i, \qquad i \times k = -j$$

Thus, if we view the triple $(i, j, k)$ as a cyclic permutation, where $i$ follows $k$ and hence $k$ precedes $i$, then the product of two of them in the given direction is the third one, but the product of two of them in the opposite direction is the negative of the third one.

Two important properties of the cross product are contained in the following theorem.


Figure 1-4

THEOREM 1.5: Let $u, v, w$ be vectors in $\mathbf{R}^3$.
(a) The vector $u \times v$ is orthogonal to both $u$ and $v$.
(b) The absolute value of the "triple product" $u \cdot v \times w$ represents the volume of the parallelopiped formed by the vectors $u, v, w$. [See Fig. 1-4(a).]

We note that the vectors $u$, $v$, $u \times v$ form a right-handed system, and that the following formula gives the magnitude of $u \times v$:
$$\|u \times v\| = \|u\| \|v\| \sin\theta$$
where $\theta$ is the angle between $u$ and $v$.
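Both parts of Theorem 1.5 can be checked numerically. The sketch below is illustrative only: the helper names are assumptions, and the box spanned by $(1,0,0)$, $(0,2,0)$, $(0,0,3)$ is a made-up test case whose volume is obviously 6.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    a1, a2, a3 = u
    b1, b2, b3 = v
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)

def triple(u, v, w):
    """Triple product u . (v x w)."""
    return dot(u, cross(v, w))

# Part (a): u x v is orthogonal to both u and v (vectors of Example 1.9(a)).
u, v = (4, 3, 6), (2, 5, -3)
n = cross(u, v)
print(dot(n, u), dot(n, v))   # 0 0

# Part (b): |u . v x w| is the volume of the parallelopiped.
# Hypothetical test case: a 1 x 2 x 3 rectangular box.
print(abs(triple((1, 0, 0), (0, 2, 0), (0, 0, 3))))   # 6
```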

1.7 Complex Numbers

The set of complex numbers is denoted by $\mathbf{C}$. Formally, a complex number is an ordered pair $(a, b)$ of real numbers where equality, addition, and multiplication are defined as follows:
$$(a, b) = (c, d) \quad\text{if and only if}\quad a = c \text{ and } b = d$$
$$(a, b) + (c, d) = (a + c,\ b + d)$$
$$(a, b) \cdot (c, d) = (ac - bd,\ ad + bc)$$
We identify the real number $a$ with the complex number $(a, 0)$; that is,
$$a \leftrightarrow (a, 0)$$
This is possible because the operations of addition and multiplication of real numbers are preserved under the correspondence; that is,
$$(a, 0) + (b, 0) = (a + b, 0) \quad\text{and}\quad (a, 0) \cdot (b, 0) = (ab, 0)$$
Thus we view $\mathbf{R}$ as a subset of $\mathbf{C}$, and replace $(a, 0)$ by $a$ whenever convenient and possible.

We note that the set $\mathbf{C}$ of complex numbers with the above operations of addition and multiplication is a field of numbers, like the set $\mathbf{R}$ of real numbers and the set $\mathbf{Q}$ of rational numbers.

The complex number $(0, 1)$ is denoted by $i$. It has the important property that
$$i^2 = ii = (0, 1)(0, 1) = (-1, 0) = -1 \quad\text{or}\quad i = \sqrt{-1}$$
Accordingly, any complex number $z = (a, b)$ can be written in the form
$$z = (a, b) = (a, 0) + (0, b) = (a, 0) + (b, 0) \cdot (0, 1) = a + bi$$
The above notation $z = a + bi$, where $a \equiv \operatorname{Re} z$ and $b \equiv \operatorname{Im} z$ are called, respectively, the real and imaginary parts of $z$, is more convenient than $(a, b)$. In fact, the sum and product of complex numbers $z = a + bi$ and $w = c + di$ can be derived by simply using the commutative and distributive laws and $i^2 = -1$:
$$z + w = (a + bi) + (c + di) = a + c + bi + di = (a + c) + (b + d)i$$
$$zw = (a + bi)(c + di) = ac + bci + adi + bdi^2 = (ac - bd) + (bc + ad)i$$
We also define the negative of $z$ and subtraction in $\mathbf{C}$ by
$$-z = -1z \quad\text{and}\quad w - z = w + (-z)$$
Warning: The letter $i$ representing $\sqrt{-1}$ has no relationship whatsoever to the vector $i = [1, 0, 0]$ in Section 1.6.
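The ordered-pair definition can be tried out directly. The following Python sketch represents a complex number as a tuple `(a, b)`; the names `c_add` and `c_mul` are invented for this illustration. It confirms that $(0,1)(0,1) = (-1,0)$ and that the pair arithmetic agrees with Python's built-in `complex` type on one sample pair.

```python
def c_add(z, w):
    """(a, b) + (c, d) = (a + c, b + d)"""
    (a, b), (c, d) = z, w
    return (a + c, b + d)

def c_mul(z, w):
    """(a, b) . (c, d) = (ac - bd, ad + bc)"""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)

i = (0, 1)
print(c_mul(i, i))   # (-1, 0), that is, i^2 = -1

# Compare with the built-in complex type on a sample pair.
z, w = (2, 3), (5, -2)
print(c_add(z, w), complex(*z) + complex(*w))
print(c_mul(z, w), complex(*z) * complex(*w))
```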

Complex Conjugate, Absolute Value
Consider a complex number $z = a + bi$. The conjugate of $z$ is denoted and defined by
$$\bar{z} = \overline{a + bi} = a - bi$$
Then $z\bar{z} = (a + bi)(a - bi) = a^2 - b^2 i^2 = a^2 + b^2$. Note that $z$ is real if and only if $\bar{z} = z$.

The absolute value of $z$, denoted by $|z|$, is defined to be the nonnegative square root of $z\bar{z}$. Namely,
$$|z| = \sqrt{z\bar{z}} = \sqrt{a^2 + b^2}$$
Note that $|z|$ is equal to the norm of the vector $(a, b)$ in $\mathbf{R}^2$.

Suppose $z \neq 0$. Then the inverse $z^{-1}$ of $z$ and division in $\mathbf{C}$ of $w$ by $z$ are given, respectively, by
$$z^{-1} = \frac{\bar{z}}{z\bar{z}} = \frac{a}{a^2 + b^2} - \frac{b}{a^2 + b^2}\, i \quad\text{and}\quad \frac{w}{z} = \frac{w\bar{z}}{z\bar{z}} = wz^{-1}$$

EXAMPLE 1.10 Suppose $z = 2 + 3i$ and $w = 5 - 2i$. Then
$$z + w = (2 + 3i) + (5 - 2i) = 2 + 5 + 3i - 2i = 7 + i$$
$$zw = (2 + 3i)(5 - 2i) = 10 + 15i - 4i - 6i^2 = 16 + 11i$$
$$\bar{z} = \overline{2 + 3i} = 2 - 3i \quad\text{and}\quad \bar{w} = \overline{5 - 2i} = 5 + 2i$$
$$\frac{w}{z} = \frac{5 - 2i}{2 + 3i} = \frac{(5 - 2i)(2 - 3i)}{(2 + 3i)(2 - 3i)} = \frac{4 - 19i}{13} = \frac{4}{13} - \frac{19}{13}\, i$$
$$|z| = \sqrt{4 + 9} = \sqrt{13} \quad\text{and}\quad |w| = \sqrt{25 + 4} = \sqrt{29}$$
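The computations in this example can be reproduced with Python's built-in `complex` type, where `j` plays the role of $i$. The sketch below is illustrative; the variable names are assumptions, and the quotient is computed twice, once by direct division and once by the conjugate formula $w\bar{z}/(z\bar{z})$.

```python
import cmath

z = 2 + 3j
w = 5 - 2j

total = z + w
product = z * w
z_bar = z.conjugate()
w_bar = w.conjugate()

# Division two ways: built-in, and via the conjugate of the denominator.
quotient = w / z
quotient_by_conjugate = w * z_bar / (z * z_bar).real

print(total, product)
print(z_bar, w_bar)
print(quotient, quotient_by_conjugate)
print(abs(z), abs(w))   # sqrt(13) and sqrt(29), as floats
```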

Complex Plane
Recall that the real numbers $\mathbf{R}$ can be represented by points on a line. Analogously, the complex numbers $\mathbf{C}$ can be represented by points in the plane. Specifically, we let the point $(a, b)$ in the plane represent the complex number $a + bi$ as shown in Fig. 1-4(b). In such a case, $|z|$ is the distance from the origin $O$ to the point $z$. The plane with this representation is called the complex plane, just as the line representing $\mathbf{R}$ is called the real line.

1.8 Vectors in $\mathbf{C}^n$

The set of all $n$-tuples of complex numbers, denoted by $\mathbf{C}^n$, is called complex $n$-space. Just as in the real case, the elements of $\mathbf{C}^n$ are called points or vectors, the elements of $\mathbf{C}$ are called scalars, and vector addition in $\mathbf{C}^n$ and scalar multiplication on $\mathbf{C}^n$ are given by
$$[z_1, z_2, \ldots, z_n] + [w_1, w_2, \ldots, w_n] = [z_1 + w_1,\ z_2 + w_2,\ \ldots,\ z_n + w_n]$$
$$z[z_1, z_2, \ldots, z_n] = [zz_1, zz_2, \ldots, zz_n]$$
where the $z_i$, $w_i$, and $z$ belong to $\mathbf{C}$.
EXAMPLE 1.11 Consider vectors $u = [2 + 3i,\ 4 - i,\ 3]$ and $v = [3 - 2i,\ 5i,\ 4 - 6i]$ in $\mathbf{C}^3$. Then
$$u + v = [2 + 3i,\ 4 - i,\ 3] + [3 - 2i,\ 5i,\ 4 - 6i] = [5 + i,\ 4 + 4i,\ 7 - 6i]$$
$$(5 - 2i)u = [(5 - 2i)(2 + 3i),\ (5 - 2i)(4 - i),\ (5 - 2i)(3)] = [16 + 11i,\ 18 - 13i,\ 15 - 6i]$$

Dot (Inner) Product in Cn
Consider vectors $u = [z_1, z_2, \ldots, z_n]$ and $v = [w_1, w_2, \ldots, w_n]$ in $\mathbf{C}^n$. The dot or inner product of $u$ and $v$ is denoted and defined by
$$u \cdot v = z_1 \bar{w}_1 + z_2 \bar{w}_2 + \cdots + z_n \bar{w}_n$$
This definition reduces to the real case because $\bar{w}_i = w_i$ when $w_i$ is real. The norm of $u$ is defined by
$$\|u\| = \sqrt{u \cdot u} = \sqrt{z_1 \bar{z}_1 + z_2 \bar{z}_2 + \cdots + z_n \bar{z}_n} = \sqrt{|z_1|^2 + |z_2|^2 + \cdots + |z_n|^2}$$
We emphasize that $u \cdot u$ and so $\|u\|$ are real and positive when $u \neq 0$ and 0 when $u = 0$.
EXAMPLE 1.12 Consider vectors $u = [2 + 3i,\ 4 - i,\ 3 + 5i]$ and $v = [3 - 4i,\ 5i,\ 4 - 2i]$ in $\mathbf{C}^3$. Then
$$u \cdot v = (2 + 3i)\overline{(3 - 4i)} + (4 - i)\overline{(5i)} + (3 + 5i)\overline{(4 - 2i)}$$
$$= (2 + 3i)(3 + 4i) + (4 - i)(-5i) + (3 + 5i)(4 + 2i)$$
$$= (-6 + 17i) + (-5 - 20i) + (2 + 26i) = -9 + 23i$$
$$u \cdot u = |2 + 3i|^2 + |4 - i|^2 + |3 + 5i|^2 = 4 + 9 + 16 + 1 + 9 + 25 = 64$$
$$\|u\| = \sqrt{64} = 8$$

The space $\mathbf{C}^n$ with the above operations of vector addition, scalar multiplication, and dot product, is called complex Euclidean $n$-space. Theorem 1.2 for $\mathbf{R}^n$ also holds for $\mathbf{C}^n$ if we replace $u \cdot v = v \cdot u$ by
$$u \cdot v = \overline{v \cdot u}$$
On the other hand, the Schwarz inequality (Theorem 1.3) and Minkowski's inequality (Theorem 1.4) are true for $\mathbf{C}^n$ with no changes.
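The complex dot product conjugates the entries of the second vector, which is easy to get wrong by hand. Here is a minimal Python sketch, assuming vectors in $\mathbf{C}^n$ are stored as lists of built-in `complex` values; `cdot` and `cnorm` are names chosen for this illustration.

```python
import math

def cdot(u, v):
    """Dot product in C^n: sum of z_k times the conjugate of w_k."""
    return sum(z * w.conjugate() for z, w in zip(u, v))

def cnorm(u):
    """Norm in C^n; u . u is real, so only its real part is used."""
    return math.sqrt(cdot(u, u).real)

# The vectors of Example 1.12
u = [2 + 3j, 4 - 1j, 3 + 5j]
v = [3 - 4j, 5j, 4 - 2j]

print(cdot(u, v))
print(cdot(v, u))   # the conjugate of cdot(u, v)
print(cnorm(u))
```

Swapping the arguments conjugates the result, which is the replacement for the symmetry law stated above.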

SOLVED PROBLEMS

Vectors in $\mathbf{R}^n$

1.1. Determine which of the following vectors are equal:
$$u_1 = (1, 2, 3), \quad u_2 = (2, 3, 1), \quad u_3 = (1, 3, 2), \quad u_4 = (2, 3, 1)$$

Vectors are equal only when corresponding entries are equal; hence, only $u_2 = u_4$.

1.2. Let $u = (2, -7, 1)$, $v = (-3, 0, 4)$, $w = (0, 5, -8)$. Find: (a) $3u - 4v$, (b) $2u + 3v - 5w$.

First perform the scalar multiplication and then the vector addition.
(a) $3u - 4v = 3(2, -7, 1) - 4(-3, 0, 4) = (6, -21, 3) + (12, 0, -16) = (18, -21, -13)$
(b) $2u + 3v - 5w = (4, -14, 2) + (-9, 0, 12) + (0, -25, 40) = (-5, -39, 54)$

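1.3. Let $u = \begin{bmatrix} 5 \\ 3 \\ -4 \end{bmatrix}$, $v = \begin{bmatrix} -1 \\ 5 \\ 2 \end{bmatrix}$, $w = \begin{bmatrix} 3 \\ -1 \\ -2 \end{bmatrix}$. Find: (a) $5u - 2v$, (b) $-2u + 4v - 3w$.

First perform the scalar multiplication and then the vector addition:

(a) $5u - 2v = 5\begin{bmatrix} 5 \\ 3 \\ -4 \end{bmatrix} - 2\begin{bmatrix} -1 \\ 5 \\ 2 \end{bmatrix} = \begin{bmatrix} 25 \\ 15 \\ -20 \end{bmatrix} + \begin{bmatrix} 2 \\ -10 \\ -4 \end{bmatrix} = \begin{bmatrix} 27 \\ 5 \\ -24 \end{bmatrix}$

(b) $-2u + 4v - 3w = \begin{bmatrix} -10 \\ -6 \\ 8 \end{bmatrix} + \begin{bmatrix} -4 \\ 20 \\ 8 \end{bmatrix} + \begin{bmatrix} -9 \\ 3 \\ 6 \end{bmatrix} = \begin{bmatrix} -23 \\ 17 \\ 22 \end{bmatrix}$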

1.4. Find $x$ and $y$, where: (a) $(x, 3) = (2, x + y)$, (b) $(4, y) = x(2, 3)$.

(a) Because the vectors are equal, set the corresponding entries equal to each other, yielding
$$x = 2, \qquad 3 = x + y$$
Solve the linear equations, obtaining $x = 2$, $y = 1$.

(b) First multiply by the scalar $x$ to obtain $(4, y) = (2x, 3x)$. Then set corresponding entries equal to each other to obtain
$$4 = 2x, \qquad y = 3x$$
Solve the equations to yield $x = 2$, $y = 6$.

1.5. Write the vector $v = (1, -2, 5)$ as a linear combination of the vectors $u_1 = (1, 1, 1)$, $u_2 = (1, 2, 3)$, $u_3 = (2, -1, 1)$.

We want to express $v$ in the form $v = xu_1 + yu_2 + zu_3$ with $x, y, z$ as yet unknown. First we have
$$\begin{bmatrix} 1 \\ -2 \\ 5 \end{bmatrix} = x\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + y\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} + z\begin{bmatrix} 2 \\ -1 \\ 1 \end{bmatrix} = \begin{bmatrix} x + y + 2z \\ x + 2y - z \\ x + 3y + z \end{bmatrix}$$
(It is more convenient to write vectors as columns than as rows when forming linear combinations.) Set corresponding entries equal to each other to obtain
$$\begin{array}{r} x + y + 2z = 1 \\ x + 2y - z = -2 \\ x + 3y + z = 5 \end{array} \quad\text{or}\quad \begin{array}{r} x + y + 2z = 1 \\ y - 3z = -3 \\ 2y - z = 4 \end{array} \quad\text{or}\quad \begin{array}{r} x + y + 2z = 1 \\ y - 3z = -3 \\ 5z = 10 \end{array}$$

This unique solution of the triangular system is $x = -6$, $y = 3$, $z = 2$. Thus, $v = -6u_1 + 3u_2 + 2u_3$.
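The elimination carried out by hand above can be sketched as a small Gauss-Jordan routine. This is an illustrative implementation, not the book's method verbatim: it uses exact `Fraction` arithmetic, the function name `solve` is an assumption, and it returns `None` whenever a pivot is missing, which means only that the system has no unique solution.

```python
from fractions import Fraction

def solve(A, b):
    """Solve the square system A x = b by Gauss-Jordan elimination.

    Returns the solution as a list of Fractions, or None when some
    pivot is missing (the system has no unique solution).
    """
    n = len(A)
    # Augmented matrix [A | b] with exact rational entries.
    M = [[Fraction(x) for x in row] + [Fraction(rhs)]
         for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [x / M[col][col] for x in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [row[n] for row in M]

# Problem 1.5: the columns of A are u1, u2, u3; the right side is v.
A = [[1, 1, 2],
     [1, 2, -1],
     [1, 3, 1]]
v = [1, -2, 5]
print(solve(A, v))   # coefficients x, y, z

# Problem 1.6: no pivot in the third column, so no unique solution.
B = [[1, 2, 1],
     [-3, -4, -5],
     [2, -1, 7]]
print(solve(B, [2, -5, 3]))   # None
```

For Problem 1.6 the routine only reports the missing pivot; deciding that the system is inconsistent (rather than having infinitely many solutions) still requires inspecting the last equation, as the text does.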

1.6. Write $v = (2, -5, 3)$ as a linear combination of $u_1 = (1, -3, 2)$, $u_2 = (2, -4, -1)$, $u_3 = (1, -5, 7)$.

Find the equivalent system of linear equations and then solve. First,
$$\begin{bmatrix} 2 \\ -5 \\ 3 \end{bmatrix} = x\begin{bmatrix} 1 \\ -3 \\ 2 \end{bmatrix} + y\begin{bmatrix} 2 \\ -4 \\ -1 \end{bmatrix} + z\begin{bmatrix} 1 \\ -5 \\ 7 \end{bmatrix} = \begin{bmatrix} x + 2y + z \\ -3x - 4y - 5z \\ 2x - y + 7z \end{bmatrix}$$
Set the corresponding entries equal to each other to obtain
$$\begin{array}{r} x + 2y + z = 2 \\ -3x - 4y - 5z = -5 \\ 2x - y + 7z = 3 \end{array} \quad\text{or}\quad \begin{array}{r} x + 2y + z = 2 \\ 2y - 2z = 1 \\ -5y + 5z = -1 \end{array} \quad\text{or}\quad \begin{array}{r} x + 2y + z = 2 \\ 2y - 2z = 1 \\ 0 = 3 \end{array}$$

The third equation, $0x + 0y + 0z = 3$, indicates that the system has no solution. Thus, $v$ cannot be written as a linear combination of the vectors $u_1$, $u_2$, $u_3$.

Dot (Inner) Product, Orthogonality, Norm in $\mathbf{R}^n$

1.7. Find $u \cdot v$ where: (a) $u = (2, -5, 6)$ and $v = (8, 2, -3)$, (b) $u = (4, 2, -3, 5, -1)$ and $v = (2, 6, -1, -4, 8)$.

Multiply the corresponding components and add:
(a) $u \cdot v = 2(8) - 5(2) + 6(-3) = 16 - 10 - 18 = -12$
(b) $u \cdot v = 8 + 12 + 3 - 20 - 8 = -5$

1.8. Let $u = (5, 4, 1)$, $v = (3, -4, 1)$, $w = (1, -2, 3)$. Which pairs of vectors, if any, are perpendicular (orthogonal)?

Find the dot product of each pair of vectors:
$$u \cdot v = 15 - 16 + 1 = 0, \qquad v \cdot w = 3 + 8 + 3 = 14, \qquad u \cdot w = 5 - 8 + 3 = 0$$
Thus, $u$ and $v$ are orthogonal, $u$ and $w$ are orthogonal, but $v$ and $w$ are not.

1.9. Find $k$ so that $u$ and $v$ are orthogonal, where: (a) $u = (1, k, -3)$ and $v = (2, -5, 4)$, (b) $u = (2, 3k, -4, 1, 5)$ and $v = (6, -1, 3, 7, 2k)$.

Compute $u \cdot v$, set $u \cdot v$ equal to 0, and then solve for $k$:
(a) $u \cdot v = 1(2) + k(-5) - 3(4) = -5k - 10$. Then $-5k - 10 = 0$, or $k = -2$.
(b) $u \cdot v = 12 - 3k - 12 + 7 + 10k = 7k + 7$. Then $7k + 7 = 0$, or $k = -1$.

1.10. Find $\|u\|$, where: (a) $u = (3, -12, -4)$, (b) $u = (2, -3, 8, -7)$.

First find $\|u\|^2 = u \cdot u$ by squaring the entries and adding. Then $\|u\| = \sqrt{\|u\|^2}$.
(a) $\|u\|^2 = (3)^2 + (-12)^2 + (-4)^2 = 9 + 144 + 16 = 169$. Then $\|u\| = \sqrt{169} = 13$.
(b) $\|u\|^2 = 4 + 9 + 64 + 49 = 126$. Then $\|u\| = \sqrt{126}$.

1.11. Recall that normalizing a nonzero vector $v$ means finding the unique unit vector $\hat{v}$ in the same direction as $v$, where
$$\hat{v} = \frac{1}{\|v\|}\, v$$
Normalize: (a) $u = (3, -4)$, (b) $v = (4, -2, -3, 8)$, (c) $w = (\tfrac{1}{2}, \tfrac{2}{3}, -\tfrac{1}{4})$.

(a) First find $\|u\| = \sqrt{9 + 16} = \sqrt{25} = 5$. Then divide each entry of $u$ by 5, obtaining $\hat{u} = (\tfrac{3}{5}, -\tfrac{4}{5})$.

(b) Here $\|v\| = \sqrt{16 + 4 + 9 + 64} = \sqrt{93}$. Then
$$\hat{v} = \left( \frac{4}{\sqrt{93}},\ \frac{-2}{\sqrt{93}},\ \frac{-3}{\sqrt{93}},\ \frac{8}{\sqrt{93}} \right)$$

(c) Note that $w$ and any positive multiple of $w$ will have the same normalized form. Hence, first multiply $w$ by 12 to "clear fractions"; that is, first find $w' = 12w = (6, 8, -3)$. Then
$$\|w'\| = \sqrt{36 + 64 + 9} = \sqrt{109} \quad\text{and}\quad \hat{w} = \hat{w}' = \left( \frac{6}{\sqrt{109}},\ \frac{8}{\sqrt{109}},\ \frac{-3}{\sqrt{109}} \right)$$

1.12. Let $u = (1, -3, 4)$ and $v = (3, 4, 7)$. Find: (a) $\cos\theta$, where $\theta$ is the angle between $u$ and $v$; (b) $\operatorname{proj}(u, v)$, the projection of $u$ onto $v$; (c) $d(u, v)$, the distance between $u$ and $v$.

First find $u \cdot v = 3 - 12 + 28 = 19$, $\|u\|^2 = 1 + 9 + 16 = 26$, $\|v\|^2 = 9 + 16 + 49 = 74$. Then

(a) $\cos\theta = \dfrac{u \cdot v}{\|u\|\|v\|} = \dfrac{19}{\sqrt{26}\sqrt{74}}$

(b) $\operatorname{proj}(u, v) = \dfrac{u \cdot v}{\|v\|^2}\, v = \dfrac{19}{74}(3, 4, 7) = \left( \dfrac{57}{74}, \dfrac{76}{74}, \dfrac{133}{74} \right) = \left( \dfrac{57}{74}, \dfrac{38}{37}, \dfrac{133}{74} \right)$

(c) $d(u, v) = \|u - v\| = \|(-2, -7, -3)\| = \sqrt{4 + 49 + 9} = \sqrt{62}$
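The three quantities in Problem 1.12 all reduce to dot products. The following Python sketch is one way to compute them; the variable names are assumptions made for this illustration, and `Fraction` is used so that the projection comes out as exact rationals.

```python
import math
from fractions import Fraction

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

u = (1, -3, 4)
v = (3, 4, 7)

# (a) cosine of the angle between u and v
cos_theta = dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))

# (b) projection of u onto v: (u.v / v.v) v, kept as exact fractions
c = Fraction(dot(u, v), dot(v, v))
proj = tuple(c * b for b in v)

# (c) distance between u and v: the norm of u - v
dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

print(cos_theta)
print(proj)
print(dist)
```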



1.13. Prove Theorem 1.2: For any $u, v, w$ in $\mathbf{R}^n$ and $k$ in $\mathbf{R}$:
(i) $(u + v) \cdot w = u \cdot w + v \cdot w$, (ii) $(ku) \cdot v = k(u \cdot v)$, (iii) $u \cdot v = v \cdot u$, (iv) $u \cdot u \geq 0$, and $u \cdot u = 0$ iff $u = 0$.

Let $u = (u_1, u_2, \ldots, u_n)$, $v = (v_1, v_2, \ldots, v_n)$, $w = (w_1, w_2, \ldots, w_n)$.

(i) Because $u + v = (u_1 + v_1, u_2 + v_2, \ldots, u_n + v_n)$,
$$(u + v) \cdot w = (u_1 + v_1)w_1 + (u_2 + v_2)w_2 + \cdots + (u_n + v_n)w_n$$
$$= u_1 w_1 + v_1 w_1 + u_2 w_2 + v_2 w_2 + \cdots + u_n w_n + v_n w_n$$
$$= (u_1 w_1 + u_2 w_2 + \cdots + u_n w_n) + (v_1 w_1 + v_2 w_2 + \cdots + v_n w_n) = u \cdot w + v \cdot w$$

(ii) Because $ku = (ku_1, ku_2, \ldots, ku_n)$,
$$(ku) \cdot v = ku_1 v_1 + ku_2 v_2 + \cdots + ku_n v_n = k(u_1 v_1 + u_2 v_2 + \cdots + u_n v_n) = k(u \cdot v)$$

(iii) $u \cdot v = u_1 v_1 + u_2 v_2 + \cdots + u_n v_n = v_1 u_1 + v_2 u_2 + \cdots + v_n u_n = v \cdot u$

(iv) Because $u_i^2$ is nonnegative for each $i$, and because the sum of nonnegative real numbers is nonnegative,
$$u \cdot u = u_1^2 + u_2^2 + \cdots + u_n^2 \geq 0$$
Furthermore, $u \cdot u = 0$ iff $u_i = 0$ for each $i$, that is, iff $u = 0$.

1.14. Prove Theorem 1.3 (Schwarz): $|u \cdot v| \leq \|u\|\|v\|$.

For any real number $t$, and using Theorem 1.2, we have
$$0 \leq (tu + v) \cdot (tu + v) = t^2(u \cdot u) + 2t(u \cdot v) + (v \cdot v) = \|u\|^2 t^2 + 2(u \cdot v)t + \|v\|^2$$

Let $a = \|u\|^2$, $b = 2(u \cdot v)$, $c = \|v\|^2$. Then, for every value of $t$, $at^2 + bt + c \geq 0$. This means that the quadratic polynomial cannot have two distinct real roots. This implies that the discriminant $D = b^2 - 4ac \leq 0$ or, equivalently, $b^2 \leq 4ac$. Thus,
$$4(u \cdot v)^2 \leq 4\|u\|^2\|v\|^2$$
Dividing by 4 and taking the square root of both sides gives us our result.

1.15. Prove Theorem 1.4 (Minkowski): $\|u + v\| \leq \|u\| + \|v\|$.

By the Schwarz inequality and other properties of the dot product,
$$\|u + v\|^2 = (u + v) \cdot (u + v) = (u \cdot u) + 2(u \cdot v) + (v \cdot v) \leq \|u\|^2 + 2\|u\|\|v\| + \|v\|^2 = (\|u\| + \|v\|)^2$$

Taking the square root of both sides yields the desired inequality.

Points, Lines, Hyperplanes in $\mathbf{R}^n$

Here we distinguish between an $n$-tuple $P(a_1, a_2, \ldots, a_n)$ viewed as a point in $\mathbf{R}^n$ and an $n$-tuple $u = [c_1, c_2, \ldots, c_n]$ viewed as a vector (arrow) from the origin $O$ to the point $C(c_1, c_2, \ldots, c_n)$.

1.16. Find the vector $u$ identified with the directed line segment $\overrightarrow{PQ}$ for the points: (a) $P(1, -2, 4)$ and $Q(6, 1, -5)$ in $\mathbf{R}^3$, (b) $P(2, 3, -6, 5)$ and $Q(7, 1, 4, -8)$ in $\mathbf{R}^4$.

(a) $u = \overrightarrow{PQ} = Q - P = [6 - 1,\ 1 - (-2),\ -5 - 4] = [5, 3, -9]$
(b) $u = \overrightarrow{PQ} = Q - P = [7 - 2,\ 1 - 3,\ 4 + 6,\ -8 - 5] = [5, -2, 10, -13]$

1.17. Find an equation of the hyperplane $H$ in $\mathbf{R}^4$ that passes through $P(3, -4, 1, -2)$ and is normal to $u = [2, 5, -6, -3]$.

The coefficients of the unknowns of an equation of $H$ are the components of the normal vector $u$. Thus, an equation of $H$ is of the form $2x_1 + 5x_2 - 6x_3 - 3x_4 = k$. Substitute $P$ into this equation to obtain $k = 2(3) + 5(-4) - 6(1) - 3(-2) = 6 - 20 - 6 + 6 = -14$. Thus, an equation of $H$ is $2x_1 + 5x_2 - 6x_3 - 3x_4 = -14$.

1.18. Find an equation of the plane $H$ in $\mathbf{R}^3$ that contains $P(1, -3, -4)$ and is parallel to the plane $H'$ determined by the equation $3x - 6y + 5z = 2$.

The planes $H$ and $H'$ are parallel if and only if their normal directions are parallel or antiparallel (opposite direction). Hence, an equation of $H$ is of the form $3x - 6y + 5z = k$. Substitute $P$ into this equation to obtain $k = 3 + 18 - 20 = 1$. Then an equation of $H$ is $3x - 6y + 5z = 1$.

1.19. Find a parametric representation of the line $L$ in $\mathbf{R}^4$ passing through $P(4, -2, 3, 1)$ in the direction of $u = [2, 5, -7, 8]$.

Here $L$ consists of the points $X(x_i)$ that satisfy
$$X = P + tu \quad\text{or}\quad x_i = a_i t + b_i \quad\text{or}\quad L(t) = (a_i t + b_i)$$
where the parameter $t$ takes on all real values. Thus we obtain
$$x_1 = 4 + 2t, \quad x_2 = -2 + 5t, \quad x_3 = 3 - 7t, \quad x_4 = 1 + 8t \quad\text{or}\quad L(t) = (4 + 2t,\ -2 + 5t,\ 3 - 7t,\ 1 + 8t)$$
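The substitutions in Problems 1.17 through 1.19 are short enough to check mechanically. The sketch below is illustrative only; `hyperplane_constant` and `line_point` are hypothetical helper names, not notation from the text.

```python
def hyperplane_constant(normal, point):
    """Constant k in normal . x = k for the hyperplane through `point`."""
    return sum(n * p for n, p in zip(normal, point))

def line_point(P, u, t):
    """Point X = P + t u on the line through P in the direction u."""
    return tuple(p + t * d for p, d in zip(P, u))

# Problem 1.17: normal [2, 5, -6, -3] through P(3, -4, 1, -2)
print(hyperplane_constant((2, 5, -6, -3), (3, -4, 1, -2)))

# Problem 1.18: normal of 3x - 6y + 5z = 2, through P(1, -3, -4)
print(hyperplane_constant((3, -6, 5), (1, -3, -4)))

# Problem 1.19: a few points on the line through P(4, -2, 3, 1)
for t in (0, 1, 2):
    print(t, line_point((4, -2, 3, 1), (2, 5, -7, 8), t))
```

At $t = 0$ the line returns the base point $P$, which is a quick sanity check on any parametric representation.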

1.20. Let $C$ be the curve $F(t) = (t^2,\ 3t - 2,\ t^3,\ t^2 + 5)$ in $\mathbf{R}^4$, where $0 \leq t \leq 4$.
(a) Find the point $P$ on $C$ corresponding to $t = 2$.
(b) Find the initial point $Q$ and terminal point $Q'$ of $C$.
(c) Find the unit tangent vector $T$ to the curve $C$ when $t = 2$.

(a) Substitute $t = 2$ into $F(t)$ to get $P = F(2) = (4, 4, 8, 9)$.

(b) The parameter $t$ ranges from $t = 0$ to $t = 4$. Hence, $Q = F(0) = (0, -2, 0, 5)$ and $Q' = F(4) = (16, 10, 64, 21)$.

(c) Take the derivative of $F(t)$, that is, of each component of $F(t)$, to obtain a vector $V$ that is tangent to the curve:
$$V(t) = \frac{dF(t)}{dt} = [2t,\ 3,\ 3t^2,\ 2t]$$
Now find $V$ when $t = 2$; that is, substitute $t = 2$ in the equation for $V(t)$ to obtain $V = V(2) = [4, 3, 12, 4]$. Then normalize $V$ to obtain the desired unit tangent vector $T$. We have
$$\|V\| = \sqrt{16 + 9 + 144 + 16} = \sqrt{185} \quad\text{and}\quad T = \left[ \frac{4}{\sqrt{185}},\ \frac{3}{\sqrt{185}},\ \frac{12}{\sqrt{185}},\ \frac{4}{\sqrt{185}} \right]$$

Spatial Vectors (Vectors in $\mathbf{R}^3$), ijk Notation, Cross Product

1.21. Let $u = 2i - 3j + 4k$, $v = 3i + j - 2k$, $w = i + 5j + 3k$. Find: (a) $u + v$, (b) $2u - 3v + 4w$, (c) $u \cdot v$ and $u \cdot w$, (d) $\|u\|$ and $\|v\|$.

Treat the coefficients of $i$, $j$, $k$ just like the components of a vector in $\mathbf{R}^3$.

(a) Add corresponding coefficients to get $u + v = 5i - 2j + 2k$.

(b) First perform the scalar multiplication and then the vector addition:
$$2u - 3v + 4w = (4i - 6j + 8k) + (-9i - 3j + 6k) + (4i + 20j + 12k) = -i + 11j + 26k$$

(c) Multiply corresponding coefficients and then add:
$$u \cdot v = 6 - 3 - 8 = -5 \quad\text{and}\quad u \cdot w = 2 - 15 + 12 = -1$$

(d) The norm is the square root of the sum of the squares of the coefficients:
$$\|u\| = \sqrt{4 + 9 + 16} = \sqrt{29} \quad\text{and}\quad \|v\| = \sqrt{9 + 1 + 4} = \sqrt{14}$$

1.22. Find the (parametric) equation of the line $L$: (a) through the points $P(1, 3, 2)$ and $Q(2, 5, -6)$; (b) containing the point $P(1, -2, 4)$ and perpendicular to the plane $H$ given by the equation $3x + 5y + 7z = 15$.

(a) First find $v = \overrightarrow{PQ} = Q - P = [1, 2, -8] = i + 2j - 8k$. Then
$$L(t) = (t + 1,\ 2t + 3,\ -8t + 2) = (t + 1)i + (2t + 3)j + (-8t + 2)k$$

(b) Because $L$ is perpendicular to $H$, the line $L$ is in the same direction as the normal vector $N = 3i + 5j + 7k$ to $H$. Thus,
$$L(t) = (3t + 1,\ 5t - 2,\ 7t + 4) = (3t + 1)i + (5t - 2)j + (7t + 4)k$$

1.23. Let $S$ be the surface $xy^2 + 2yz = 16$ in $\mathbf{R}^3$. (a) Find the normal vector $N(x, y, z)$ to the surface $S$. (b) Find the tangent plane $H$ to $S$ at the point $P(1, 2, 3)$.

(a) The formula for the normal vector to a surface $F(x, y, z) = 0$ is
$$N(x, y, z) = F_x i + F_y j + F_z k$$
where $F_x$, $F_y$, $F_z$ are the partial derivatives. Using $F(x, y, z) = xy^2 + 2yz - 16$, we obtain
$$F_x = y^2, \qquad F_y = 2xy + 2z, \qquad F_z = 2y$$
Thus, $N(x, y, z) = y^2 i + (2xy + 2z)j + 2yk$.

(b) The normal to the surface $S$ at the point $P$ is
$$N(P) = N(1, 2, 3) = 4i + 10j + 4k$$
Hence, $N = 2i + 5j + 2k$ is also normal to $S$ at $P$. Thus an equation of $H$ has the form $2x + 5y + 2z = c$. Substitute $P$ in this equation to obtain $c = 18$. Thus the tangent plane $H$ to $S$ at $P$ is $2x + 5y + 2z = 18$.

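1.24. Evaluate the following determinants and negative of determinants of order two:

(a) (i) $\begin{vmatrix} 3 & 4 \\ 5 & 9 \end{vmatrix}$, (ii) $\begin{vmatrix} 2 & -1 \\ 4 & 3 \end{vmatrix}$, (iii) $\begin{vmatrix} 4 & -5 \\ 3 & -2 \end{vmatrix}$

(b) (i) $-\begin{vmatrix} 3 & 6 \\ 4 & 2 \end{vmatrix}$, (ii) $-\begin{vmatrix} 7 & -5 \\ 3 & 2 \end{vmatrix}$, (iii) $-\begin{vmatrix} 4 & -1 \\ 8 & -3 \end{vmatrix}$

Use $\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc$ and $-\begin{vmatrix} a & b \\ c & d \end{vmatrix} = bc - ad$. Thus,
(a) (i) $27 - 20 = 7$, (ii) $6 + 4 = 10$, (iii) $-8 + 15 = 7$.
(b) (i) $24 - 6 = 18$, (ii) $-15 - 14 = -29$, (iii) $-8 + 12 = 4$.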

1.25. Let $u = 2i - 3j + 4k$, $v = 3i + j - 2k$, $w = i + 5j + 3k$. Find: (a) $u \times v$, (b) $u \times w$.

(a) Use $\begin{bmatrix} 2 & -3 & 4 \\ 3 & 1 & -2 \end{bmatrix}$ to get $u \times v = (6 - 4)i + (12 + 4)j + (2 + 9)k = 2i + 16j + 11k$.

(b) Use $\begin{bmatrix} 2 & -3 & 4 \\ 1 & 5 & 3 \end{bmatrix}$ to get $u \times w = (-9 - 20)i + (4 - 6)j + (10 + 3)k = -29i - 2j + 13k$.

1.26. Find $u \times v$, where: (a) $u = (1, 2, 3)$, $v = (4, 5, 6)$; (b) $u = (-4, 7, 3)$, $v = (6, -5, 2)$.

(a) Use $\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}$ to get $u \times v = [12 - 15,\ 12 - 6,\ 5 - 8] = [-3, 6, -3]$.

(b) Use $\begin{bmatrix} -4 & 7 & 3 \\ 6 & -5 & 2 \end{bmatrix}$ to get $u \times v = [14 + 15,\ 18 + 8,\ 20 - 42] = [29, 26, -22]$.

1.27. Find a unit vector $u$ orthogonal to $v = [1, 3, 4]$ and $w = [2, -6, -5]$.

First find $v \times w$, which is orthogonal to $v$ and $w$.

The array $\begin{bmatrix} 1 & 3 & 4 \\ 2 & -6 & -5 \end{bmatrix}$ gives $v \times w = [-15 + 24,\ 8 + 5,\ -6 - 6] = [9, 13, -12]$.

Normalize $v \times w$ to get $u = [9/\sqrt{394},\ 13/\sqrt{394},\ -12/\sqrt{394}]$.

1.28. Let $u = (a_1, a_2, a_3)$ and $v = (b_1, b_2, b_3)$ so $u \times v = (a_2 b_3 - a_3 b_2,\ a_3 b_1 - a_1 b_3,\ a_1 b_2 - a_2 b_1)$. Prove:
(a) $u \times v$ is orthogonal to $u$ and $v$ [Theorem 1.5(a)].
(b) $\|u \times v\|^2 = (u \cdot u)(v \cdot v) - (u \cdot v)^2$ (Lagrange's identity).

(a) We have
$$u \cdot (u \times v) = a_1(a_2 b_3 - a_3 b_2) + a_2(a_3 b_1 - a_1 b_3) + a_3(a_1 b_2 - a_2 b_1)$$
$$= a_1 a_2 b_3 - a_1 a_3 b_2 + a_2 a_3 b_1 - a_1 a_2 b_3 + a_1 a_3 b_2 - a_2 a_3 b_1 = 0$$
Thus, $u \times v$ is orthogonal to $u$. Similarly, $u \times v$ is orthogonal to $v$.

(b) We have
$$\|u \times v\|^2 = (a_2 b_3 - a_3 b_2)^2 + (a_3 b_1 - a_1 b_3)^2 + (a_1 b_2 - a_2 b_1)^2 \qquad (1)$$
$$(u \cdot u)(v \cdot v) - (u \cdot v)^2 = (a_1^2 + a_2^2 + a_3^2)(b_1^2 + b_2^2 + b_3^2) - (a_1 b_1 + a_2 b_2 + a_3 b_3)^2 \qquad (2)$$

Expansion of the right-hand sides of (1) and (2) establishes the identity.

Complex Numbers, Vectors in $\mathbf{C}^n$

1.29. Suppose $z = 5 + 3i$ and $w = 2 - 4i$. Find: (a) $z + w$, (b) $z - w$, (c) $zw$.

Use the ordinary rules of algebra together with $i^2 = -1$ to obtain a result in the standard form $a + bi$.
(a) $z + w = (5 + 3i) + (2 - 4i) = 7 - i$
(b) $z - w = (5 + 3i) - (2 - 4i) = 5 + 3i - 2 + 4i = 3 + 7i$
(c) $zw = (5 + 3i)(2 - 4i) = 10 - 14i - 12i^2 = 10 - 14i + 12 = 22 - 14i$

1.30. Simplify: (a) $(5 + 3i)(2 - 7i)$, (b) $(4 - 3i)^2$, (c) $(1 + 2i)^3$.

(a) $(5 + 3i)(2 - 7i) = 10 + 6i - 35i - 21i^2 = 31 - 29i$
(b) $(4 - 3i)^2 = 16 - 24i + 9i^2 = 7 - 24i$
(c) $(1 + 2i)^3 = 1 + 6i + 12i^2 + 8i^3 = 1 + 6i - 12 - 8i = -11 - 2i$

1.31. Simplify: (a) $i^0, i^3, i^4$, (b) $i^5, i^6, i^7, i^8$, (c) $i^{39}, i^{174}, i^{252}, i^{317}$.

(a) $i^0 = 1$, $i^3 = i^2(i) = (-1)(i) = -i$, $i^4 = (i^2)(i^2) = (-1)(-1) = 1$
(b) $i^5 = (i^4)(i) = (1)(i) = i$, $i^6 = (i^4)(i^2) = (1)(i^2) = i^2 = -1$, $i^7 = i^3 = -i$, $i^8 = i^4 = 1$
(c) Using $i^4 = 1$ and $i^n = i^{4q + r} = (i^4)^q i^r = 1^q i^r = i^r$, divide the exponent $n$ by 4 to obtain the remainder $r$:
$$i^{39} = i^{4(9) + 3} = (i^4)^9 i^3 = 1^9 i^3 = i^3 = -i, \qquad i^{174} = i^2 = -1, \qquad i^{252} = i^0 = 1, \qquad i^{317} = i^1 = i$$
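The remainder rule in part (c) amounts to a four-entry lookup table. A minimal Python sketch, assuming nonnegative integer exponents (`i_power` is a name chosen for this illustration, and `1j` is Python's notation for $i$):

```python
def i_power(n):
    """Return i**n for an integer n >= 0 using the remainder of n mod 4."""
    return (1, 1j, -1, -1j)[n % 4]

for n in (39, 174, 252, 317):
    print(n, n % 4, i_power(n))
```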

1.32. Find the complex conjugate of each of the following: (a) $6 + 4i$, $7 - 5i$, $4 + i$, $-3 - i$, (b) $6$, $-3$, $4i$, $-9i$.

(a) $\overline{6 + 4i} = 6 - 4i$, $\overline{7 - 5i} = 7 + 5i$, $\overline{4 + i} = 4 - i$, $\overline{-3 - i} = -3 + i$
(b) $\bar{6} = 6$, $\overline{-3} = -3$, $\overline{4i} = -4i$, $\overline{-9i} = 9i$

(Note that the conjugate of a real number is the original number, but the conjugate of a pure imaginary number is the negative of the original number.)

1.33. Find $z\bar{z}$ and $|z|$ when $z = 3 + 4i$.

For $z = a + bi$, use $z\bar{z} = a^2 + b^2$ and $|z| = \sqrt{z\bar{z}} = \sqrt{a^2 + b^2}$:
$$z\bar{z} = 9 + 16 = 25, \qquad |z| = \sqrt{25} = 5$$

1.34. Simplify $\dfrac{2 - 7i}{5 + 3i}$.

To simplify a fraction $z/w$ of complex numbers, multiply both numerator and denominator by $\bar{w}$, the conjugate of the denominator:
$$\frac{2 - 7i}{5 + 3i} = \frac{(2 - 7i)(5 - 3i)}{(5 + 3i)(5 - 3i)} = \frac{-11 - 41i}{34} = -\frac{11}{34} - \frac{41}{34}\, i$$

1.35. Prove: For any complex numbers $z, w \in \mathbf{C}$, (i) $\overline{z + w} = \bar{z} + \bar{w}$, (ii) $\overline{zw} = \bar{z}\bar{w}$, (iii) $\bar{\bar{z}} = z$.

Suppose $z = a + bi$ and $w = c + di$ where $a, b, c, d \in \mathbf{R}$.

(i) $\overline{z + w} = \overline{(a + bi) + (c + di)} = \overline{(a + c) + (b + d)i} = (a + c) - (b + d)i = a + c - bi - di = (a - bi) + (c - di) = \bar{z} + \bar{w}$

(ii) $\overline{zw} = \overline{(a + bi)(c + di)} = \overline{(ac - bd) + (ad + bc)i} = (ac - bd) - (ad + bc)i = (a - bi)(c - di) = \bar{z}\bar{w}$

(iii) $\bar{\bar{z}} = \overline{\overline{a + bi}} = \overline{a - bi} = a - (-b)i = a + bi = z$

1.36. Prove: For any complex numbers $z, w \in \mathbf{C}$, $|zw| = |z||w|$.

By (ii) of Problem 1.35,
$$|zw|^2 = (zw)(\overline{zw}) = (zw)(\bar{z}\bar{w}) = (z\bar{z})(w\bar{w}) = |z|^2 |w|^2$$
The square root of both sides gives us the desired result.

1.37. Prove: For any complex numbers $z, w \in \mathbf{C}$, $|z + w| \leq |z| + |w|$.

Suppose $z = a + bi$ and $w = c + di$ where $a, b, c, d \in \mathbf{R}$. Consider the vectors $u = (a, b)$ and $v = (c, d)$ in $\mathbf{R}^2$. Note that
$$|z| = \sqrt{a^2 + b^2} = \|u\|, \qquad |w| = \sqrt{c^2 + d^2} = \|v\|$$
and
$$|z + w| = |(a + c) + (b + d)i| = \sqrt{(a + c)^2 + (b + d)^2} = \|(a + c,\ b + d)\| = \|u + v\|$$

By Minkowski's inequality (Problem 1.15), $\|u + v\| \leq \|u\| + \|v\|$, and so
$$|z + w| = \|u + v\| \leq \|u\| + \|v\| = |z| + |w|$$

1.38. Find the dot products $u \cdot v$ and $v \cdot u$ where: (a) $u = (1 - 2i,\ 3 + i)$, $v = (4 + 2i,\ 5 - 6i)$; (b) $u = (3 - 2i,\ 4i,\ 1 + 6i)$, $v = (5 + i,\ 2 - 3i,\ 7 + 2i)$.

Recall that conjugates of the second vector appear in the dot product
$$(z_1, \ldots, z_n) \cdot (w_1, \ldots, w_n) = z_1 \bar{w}_1 + \cdots + z_n \bar{w}_n$$

(a) $u \cdot v = (1 - 2i)\overline{(4 + 2i)} + (3 + i)\overline{(5 - 6i)} = (1 - 2i)(4 - 2i) + (3 + i)(5 + 6i) = -10i + 9 + 23i = 9 + 13i$

$v \cdot u = (4 + 2i)\overline{(1 - 2i)} + (5 - 6i)\overline{(3 + i)} = (4 + 2i)(1 + 2i) + (5 - 6i)(3 - i) = 10i + 9 - 23i = 9 - 13i$

(b) $u \cdot v = (3 - 2i)\overline{(5 + i)} + (4i)\overline{(2 - 3i)} + (1 + 6i)\overline{(7 + 2i)} = (3 - 2i)(5 - i) + (4i)(2 + 3i) + (1 + 6i)(7 - 2i) = 20 + 35i$

$v \cdot u = (5 + i)\overline{(3 - 2i)} + (2 - 3i)\overline{(4i)} + (7 + 2i)\overline{(1 + 6i)} = (5 + i)(3 + 2i) + (2 - 3i)(-4i) + (7 + 2i)(1 - 6i) = 20 - 35i$

In both cases, $v \cdot u = \overline{u \cdot v}$. This holds true in general, as seen in Problem 1.40.

1.39. Let $u = (7 - 2i,\ 2 + 5i)$ and $v = (1 + i,\ -3 - 6i)$. Find: (a) $u + v$, (b) $2iu$, (c) $(3 - i)v$, (d) $u \cdot v$, (e) $\|u\|$ and $\|v\|$.

(a) $u + v = (7 - 2i + 1 + i,\ 2 + 5i - 3 - 6i) = (8 - i,\ -1 - i)$
(b) $2iu = (14i - 4i^2,\ 4i + 10i^2) = (4 + 14i,\ -10 + 4i)$
(c) $(3 - i)v = (3 + 3i - i - i^2,\ -9 - 18i + 3i + 6i^2) = (4 + 2i,\ -15 - 15i)$
(d) $u \cdot v = (7 - 2i)\overline{(1 + i)} + (2 + 5i)\overline{(-3 - 6i)} = (7 - 2i)(1 - i) + (2 + 5i)(-3 + 6i) = 5 - 9i - 36 - 3i = -31 - 12i$
(e) $\|u\| = \sqrt{7^2 + (-2)^2 + 2^2 + 5^2} = \sqrt{82}$ and $\|v\| = \sqrt{1^2 + 1^2 + (-3)^2 + (-6)^2} = \sqrt{47}$

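1.40. Prove: For any vectors $u, v \in \mathbf{C}^n$ and any scalar $z \in \mathbf{C}$, (i) $u \cdot v = \overline{v \cdot u}$, (ii) $(zu) \cdot v = z(u \cdot v)$, (iii) $u \cdot (zv) = \bar{z}(u \cdot v)$.

Suppose $u = (z_1, z_2, \ldots, z_n)$ and $v = (w_1, w_2, \ldots, w_n)$.

(i) Using the properties of the conjugate,
$$\overline{v \cdot u} = \overline{w_1 \bar{z}_1 + w_2 \bar{z}_2 + \cdots + w_n \bar{z}_n} = \overline{w_1 \bar{z}_1} + \overline{w_2 \bar{z}_2} + \cdots + \overline{w_n \bar{z}_n}$$
$$= \bar{w}_1 z_1 + \bar{w}_2 z_2 + \cdots + \bar{w}_n z_n = z_1 \bar{w}_1 + z_2 \bar{w}_2 + \cdots + z_n \bar{w}_n = u \cdot v$$

(ii) Because $zu = (zz_1, zz_2, \ldots, zz_n)$,
$$(zu) \cdot v = zz_1 \bar{w}_1 + zz_2 \bar{w}_2 + \cdots + zz_n \bar{w}_n = z(z_1 \bar{w}_1 + z_2 \bar{w}_2 + \cdots + z_n \bar{w}_n) = z(u \cdot v)$$
(Compare with Theorem 1.2 on vectors in $\mathbf{R}^n$.)

(iii) Using (i) and (ii),
$$u \cdot (zv) = \overline{(zv) \cdot u} = \overline{z(v \cdot u)} = \bar{z}\,\overline{(v \cdot u)} = \bar{z}(u \cdot v)$$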

SUPPLEMENTARY PROBLEMS

Vectors in Rn
1.41. Let $u = (1, -2, 4)$, $v = (3, 5, 1)$, $w = (2, 1, -3)$. Find: (a) $3u - 2v$; (b) $5u + 3v - 4w$; (c) $u \cdot v$, $u \cdot w$, $v \cdot w$; (d) $\|u\|$, $\|v\|$; (e) $\cos\theta$, where $\theta$ is the angle between $u$ and $v$; (f) $d(u, v)$; (g) $\operatorname{proj}(u, v)$.

1.42. Repeat Problem 1.41 for vectors $u = \begin{bmatrix} 1 \\ 3 \\ -4 \end{bmatrix}$, $v = \begin{bmatrix} 2 \\ 1 \\ 5 \end{bmatrix}$, $w = \begin{bmatrix} 3 \\ -2 \\ 6 \end{bmatrix}$.

1.43. Let $u = (2, -5, 4, 6, -3)$ and $v = (5, -2, 1, -7, -4)$. Find: (a) $4u - 3v$; (b) $5u + 2v$; (c) $u \cdot v$; (d) $\|u\|$ and $\|v\|$; (e) $\operatorname{proj}(u, v)$; (f) $d(u, v)$.

1.44. Normalize each vector: (a) $u = (5, -7)$; (b) $v = (1, 2, -2, 4)$; (c) $w = \left( \tfrac{1}{2}, -\tfrac{1}{3}, \tfrac{3}{4} \right)$.

1.45. Let $u = (1, 2, -2)$, $v = (3, -12, 4)$, and $k = -3$. (a) Find $\|u\|$, $\|v\|$, $\|u + v\|$, $\|ku\|$. (b) Verify that $\|ku\| = |k|\|u\|$ and $\|u + v\| \leq \|u\| + \|v\|$.

1.46. Find $x$ and $y$ where: (a) $(x,\ y + 1) = (y - 2,\ 6)$; (b) $x(2, y) = y(1, -2)$.

1.47. Find $x, y, z$ where $(x,\ y + 1,\ y + z) = (2x + y,\ 4,\ 3z)$.

1.48. Write $v = (2, 5)$ as a linear combination of $u_1$ and $u_2$, where: (a) $u_1 = (1, 2)$ and $u_2 = (3, 5)$; (b) $u_1 = (3, -4)$ and $u_2 = (2, -3)$.

1.49. Write $v = \begin{bmatrix} 9 \\ -3 \\ 16 \end{bmatrix}$ as a linear combination of $u_1 = \begin{bmatrix} 1 \\ 3 \\ 3 \end{bmatrix}$, $u_2 = \begin{bmatrix} 2 \\ 5 \\ -1 \end{bmatrix}$, $u_3 = \begin{bmatrix} 4 \\ -2 \\ 3 \end{bmatrix}$.

1.50. Find $k$ so that $u$ and $v$ are orthogonal, where: (a) $u = (3, k, -2)$, $v = (6, -4, -3)$; (b) $u = (5, k, -4, 2)$, $v = (1, -3, 2, 2k)$; (c) $u = (1, 7, k + 2, -2)$, $v = (3, k, -3, k)$.

Located Vectors, Hyperplanes, Lines in Rn
1.51. Find the vector $v$ identified with the directed line segment $\overrightarrow{PQ}$ for the points: (a) $P(2, 3, -7)$ and $Q(1, -6, -5)$ in $\mathbf{R}^3$; (b) $P(1, -8, -4, 6)$ and $Q(3, -5, 2, -4)$ in $\mathbf{R}^4$.

1.52. Find an equation of the hyperplane $H$ in $\mathbf{R}^4$ that: (a) contains $P(1, 2, -3, 2)$ and is normal to $u = [2, 3, -5, 6]$; (b) contains $P(3, -1, 2, 5)$ and is parallel to $2x_1 - 3x_2 + 5x_3 - 7x_4 = 4$.

1.53. Find a parametric representation of the line in $\mathbf{R}^4$ that: (a) passes through the points $P(1, 2, 1, 2)$ and $Q(3, -5, 7, -9)$; (b) passes through $P(1, 1, 3, 3)$ and is perpendicular to the hyperplane $2x_1 + 4x_2 + 6x_3 - 8x_4 = 5$.

Spatial Vectors (Vectors in R3 ), ijk Notation
1.54. Given $u = 3i - 4j + 2k$, $v = 2i + 5j - 3k$, $w = 4i + 7j + 2k$. Find: (a) $2u - 3v$; (b) $3u + 4v - 2w$; (c) $u \cdot v$, $u \cdot w$, $v \cdot w$; (d) $\|u\|$, $\|v\|$, $\|w\|$.

1.55. Find the equation of the plane $H$: (a) with normal $N = 3i - 4j + 5k$ and containing the point $P(1, 2, -3)$; (b) parallel to $4x + 3y - 2z = 11$ and containing the point $Q(2, -1, 3)$.

1.56. Find the (parametric) equation of the line $L$: (a) through the point $P(2, 5, -3)$ and in the direction of $v = 4i - 5j + 7k$; (b) perpendicular to the plane $2x - 3y + 7z = 4$ and containing $P(1, -5, 7)$.

1.57. Consider the following curve $C$ in $\mathbf{R}^3$ where $0 \leq t \leq 5$:
$$F(t) = t^3 i - t^2 j + (2t - 3)k$$
(a) Find the point $P$ on $C$ corresponding to $t = 2$. (b) Find the initial point $Q$ and the terminal point $Q'$. (c) Find the unit tangent vector $T$ to the curve $C$ when $t = 2$.

1.58. Consider a moving body $B$ whose position at time $t$ is given by $R(t) = t^2 i + t^3 j + 3tk$. [Then $V(t) = dR(t)/dt$ and $A(t) = dV(t)/dt$ denote, respectively, the velocity and acceleration of $B$.] When $t = 1$, find for the body $B$: (a) position; (b) velocity $v$; (c) speed $s$; (d) acceleration $a$.

1.59. Find a normal vector $N$ and the tangent plane $H$ to each surface at the given point: (a) surface $x^2 y + 3yz = 20$ and point $P(1, 3, 2)$; (b) surface $x^2 + 3y^2 - 5z^2 = 160$ and point $P(3, -2, 1)$.

Cross Product
1.60. Evaluate the following determinants and negative of determinants of order two:        2 5   3 À6   À4 À2   ;  ;   (a)  3 6   1 À4   7 À3            6 4  ; À 1 À3 ; À 8 À3  (b) À  À6 À2  2 4 7 5 1.61. Given u ¼ 3i À 4j þ 2k, v ¼ 2i þ 5j À 3k, w ¼ 4i þ 7j þ 2k, find: (a) u  v, (b) u  w, (c) v  w.

1.62. Given u ¼ ½2; 1; 3Š, v ¼ ½4; À2; 2Š, w ¼ ½1; 1; 5Š, find: (a) u  v, (b) u  w, (c) v  w.

1.63. Find the volume V of the parallelopiped formed by the vectors u; v; w appearing in: (a) Problem 1.60 (b) Problem 1.61.

1.64. Find a unit vector $u$ orthogonal to: (a) $v = [1, 2, 3]$ and $w = [1, -1, 2]$; (b) $v = 3i - j + 2k$ and $w = 4i - 2j - k$.

1.65. Prove the following properties of the cross product:
(a) $u \times v = -(v \times u)$
(b) $u \times u = 0$ for any vector $u$
(c) $(ku) \times v = k(u \times v) = u \times (kv)$
(d) $u \times (v + w) = (u \times v) + (u \times w)$
(e) $(v + w) \times u = (v \times u) + (w \times u)$
(f) $(u \times v) \times w = (u \cdot w)v - (v \cdot w)u$

Complex Numbers
1.66. Simplify: (a) $(4 - 7i)(9 + 2i)$; (b) $(3 - 5i)^2$; (c) $\dfrac{1}{4 - 7i}$; (d) $\dfrac{9 + 2i}{3 - 5i}$; (e) $(1 - i)^3$.

1.67. Simplify: (a) $\dfrac{1}{2i}$; (b) $\dfrac{2 + 3i}{7 - 3i}$; (c) $i^{15}$, $i^{25}$, $i^{34}$; (d) $\left( \dfrac{1}{3 - i} \right)^2$.

1.68. Let $z = 2 - 5i$ and $w = 7 + 3i$. Find: (a) $z + w$; (b) $zw$; (c) $z/w$; (d) $\bar{z}$, $\bar{w}$; (e) $|z|$, $|w|$.

1.69. Show that for complex numbers z and w:
(a) Re z ¼ 1 ðz þ Þ, (b) z 2 Im z ¼ 1 ðz À z),  2 (c) zw ¼ 0 implies z ¼ 0 or w ¼ 0.

Vectors in Cn
1.70. Let $u = (1 + 7i, 2 - 6i)$ and $v = (5 - 2i, 3 - 4i)$. Find: (a) $u + v$; (b) $(3 + i)u$; (c) $2iu + (4 + 7i)v$; (d) $u \cdot v$; (e) $\|u\|$ and $\|v\|$.

1.71. Prove: For any vectors $u, v, w$ in $\mathbf{C}^n$: (a) $(u + v) \cdot w = u \cdot w + v \cdot w$, (b) $w \cdot (u + v) = w \cdot u + w \cdot v$.

1.72. Prove that the norm in $\mathbf{C}^n$ satisfies the following laws:
[N1] For any vector $u$, $\|u\| \ge 0$; and $\|u\| = 0$ if and only if $u = 0$.
[N2] For any vector $u$ and complex number $z$, $\|zu\| = |z|\,\|u\|$.
[N3] For any vectors $u$ and $v$, $\|u + v\| \le \|u\| + \|v\|$.

ANSWERS TO SUPPLEMENTARY PROBLEMS
1.41. (a) (e) ðÀ3; À16; 4Þ; pffiffiffiffiffipffiffiffiffiffi À3= 21 35; (b) (6,1,35); pffiffiffiffiffi (f) 62; pffiffiffiffiffi pffiffiffiffiffi pffiffiffiffiffi (c) À3; 12; 8; (d) 21, 35, 14; 3 9 3 (g) À 35 ð3; 5; 1Þ ¼ ðÀ 35, À 15, À 35) 35 ðÀ1; 26; À29Þ; pffiffiffiffiffi (f) 86; (c) À15; À27; 34; (g) À 15 v ¼ ðÀ1; À 1 ; À 5Þ 30 2 2 (c) À6; (d) pffiffiffiffiffi pffiffiffiffiffi 90; 95;

1.42. (Column vectors) (a) ðÀ1; 7; À22Þ; (b) pffiffiffiffiffi pffiffiffiffiffi pffiffiffiffiffipffiffiffiffiffi (d) 26, 30; (e) À15=ð 26 30Þ; 1.43. (a) (e) 1.44. (a) 1.45. (a) 1.46. (a) ðÀ13; À14; 13; 45; 0Þ; pffiffiffiffiffiffiffiffi 6 À 95 v; (f) 167 pffiffiffiffiffi pffiffiffiffiffi ð5= 76; 9= 76Þ; 3; 13; pffiffiffiffiffiffiffiffi 120; 9

(b) ð20; À29; 22; 16; À23Þ;

(b) ð1 ; 5

2 5;

À 2 ; 4Þ; 5 5

(c)

pffiffiffiffiffiffiffiffi pffiffiffiffiffiffiffiffi pffiffiffiffiffiffiffiffi ð6= 133; À4 133; 9 133Þ

x ¼ À3; y ¼ 5; y ¼ 3; z¼3 2

(b) x ¼ 0; y ¼ 0, and x ¼ 1; y ¼ 2

1.47. x ¼ À3; 1.48. (a)

v ¼ 5u1 À u2 ;

(b)

v ¼ 16u1 À 23u2

1.49. $v = 3u_1 - u_2 + 2u_3$

1.50. (a) $6$; (b) $3$; (c) $\frac{3}{2}$

1.51. (a) $v = [-1, -9, 2]$; (b) $[2, 3, 6, -10]$

1.52. (a) $2x_1 + 3x_2 - 5x_3 + 6x_4 = 35$; (b) $2x_1 - 3x_2 + 5x_3 - 7x_4 = -16$

1.53. (a) $[2t + 1, -7t + 2, 6t + 1, -11t + 2]$; (b) $[2t + 1, 4t + 1, 6t + 3, -8t + 3]$

1.54. (a) $-23\mathbf{j} + 13\mathbf{k}$; (b) $9\mathbf{i} - 6\mathbf{j} - 10\mathbf{k}$; (c) $-20$, $-12$, $37$; (d) $\sqrt{29}$, $\sqrt{38}$, $\sqrt{69}$

1.55. (a) $3x - 4y + 5z = -20$; (b) $4x + 3y - 2z = -1$

1.56. (a) $[4t + 2, -5t + 5, 7t - 3]$; (b) $[2t + 1, -3t - 5, 7t + 7]$

1.57. (a) $P = F(2) = 8\mathbf{i} - 4\mathbf{j} + \mathbf{k}$; (b) $Q = F(0) = -3\mathbf{k}$, $Q' = F(5) = 125\mathbf{i} - 25\mathbf{j} + 7\mathbf{k}$; (c) $T = (6\mathbf{i} - 2\mathbf{j} + \mathbf{k})/\sqrt{41}$

1.58. (a) $\mathbf{i} + \mathbf{j} + 2\mathbf{k}$; (b) $2\mathbf{i} + 3\mathbf{j} + 2\mathbf{k}$; (c) $\sqrt{17}$; (d) $2\mathbf{i} + 6\mathbf{j}$

1.59. (a) $N = 6\mathbf{i} + 7\mathbf{j} + 9\mathbf{k}$, $6x + 7y + 9z = 45$; (b) $N = 6\mathbf{i} - 12\mathbf{j} - 10\mathbf{k}$, $3x - 6y - 5z = 16$


1.60. (a) $-3$, $-6$, $26$; (b) $-2$, $-10$, $34$

1.61. (a) $2\mathbf{i} + 13\mathbf{j} + 23\mathbf{k}$; (b) $-22\mathbf{i} + 2\mathbf{j} + 37\mathbf{k}$; (c) $31\mathbf{i} - 16\mathbf{j} - 6\mathbf{k}$

1.62. (a) $[5, 8, -6]$; (b) $[2, -7, 1]$; (c) $[-7, -18, 5]$

1.63. (a) $145$; (b) $17$

1.64. (a) $(7, 1, -3)/\sqrt{59}$; (b) $(5\mathbf{i} + 11\mathbf{j} - 2\mathbf{k})/\sqrt{150}$

1.66. (a) $50 - 55i$; (b) $-16 - 30i$; (c) $\frac{1}{65}(4 + 7i)$; (d) $\frac{1}{2}(1 + 3i)$; (e) $-2 - 2i$

1.67. (a) $-\frac{1}{2}i$; (b) $\frac{1}{58}(5 + 27i)$; (c) $-i$, $i$, $-1$; (d) $\frac{1}{50}(4 + 3i)$

1.68. (a) $9 - 2i$; (b) $29 - 29i$; (c) $\frac{1}{58}(-1 - 41i)$; (d) $2 + 5i$, $7 - 3i$; (e) $\sqrt{29}$, $\sqrt{58}$

1.69. (c) Hint: If $zw = 0$, then $|zw| = |z|\,|w| = |0| = 0$

1.70. (a) $(6 + 5i, 5 - 10i)$; (b) $(-4 + 22i, 12 - 16i)$; (c) $(-8 - 41i, -4 - 33i)$; (d) $12 + 2i$; (e) $\sqrt{90}$, $\sqrt{54}$

CHAPTER 2

Algebra of Matrices
2.1 Introduction
This chapter investigates matrices and algebraic operations defined on them. These matrices may be viewed as rectangular arrays of elements where each entry depends on two subscripts (as compared with vectors, where each entry depended on only one subscript). Systems of linear equations and their solutions (Chapter 3) may be efficiently investigated using the language of matrices. Furthermore, certain abstract objects introduced in later chapters, such as "change of basis," "linear transformations," and "quadratic forms," can be represented by these matrices (rectangular arrays). On the other hand, the abstract treatment of linear algebra presented later on will give us new insight into the structure of these matrices.

The entries in our matrices will come from some arbitrary, but fixed, field $K$. The elements of $K$ are called numbers or scalars. Nothing essential is lost if the reader assumes that $K$ is the real field $\mathbf{R}$.

2.2 Matrices

A matrix $A$ over a field $K$ or, simply, a matrix $A$ (when $K$ is implicit) is a rectangular array of scalars usually presented in the following form:

\[
A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \cdots & \cdots & \cdots & \cdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}
\]

The rows of such a matrix $A$ are the $m$ horizontal lists of scalars:

\[ (a_{11}, a_{12}, \ldots, a_{1n}), \quad (a_{21}, a_{22}, \ldots, a_{2n}), \quad \ldots, \quad (a_{m1}, a_{m2}, \ldots, a_{mn}) \]

and the columns of $A$ are the $n$ vertical lists of scalars:

\[
\begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{bmatrix}, \quad
\begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{bmatrix}, \quad \ldots, \quad
\begin{bmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{bmatrix}
\]

Note that the element $a_{ij}$, called the ij-entry or ij-element, appears in row $i$ and column $j$. We frequently denote such a matrix by simply writing $A = [a_{ij}]$.

A matrix with $m$ rows and $n$ columns is called an $m$ by $n$ matrix, written $m \times n$. The pair of numbers $m$ and $n$ is called the size of the matrix. Two matrices $A$ and $B$ are equal, written $A = B$, if they have the same size and if corresponding elements are equal. Thus, the equality of two $m \times n$ matrices is equivalent to a system of $mn$ equalities, one for each corresponding pair of elements.

A matrix with only one row is called a row matrix or row vector, and a matrix with only one column is called a column matrix or column vector. A matrix whose entries are all zero is called a zero matrix and will usually be denoted by $0$.


Matrices whose entries are all real numbers are called real matrices and are said to be matrices over R. Analogously, matrices whose entries are all complex numbers are called complex matrices and are said to be matrices over C. This text will be mainly concerned with such real and complex matrices.
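EXAMPLE 2.1

(a) The rectangular array $A = \begin{bmatrix} 1 & -4 & 5 \\ 0 & 3 & -2 \end{bmatrix}$ is a $2 \times 3$ matrix. Its rows are $(1, -4, 5)$ and $(0, 3, -2)$, and its columns are

\[ \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad \begin{bmatrix} -4 \\ 3 \end{bmatrix}, \quad \begin{bmatrix} 5 \\ -2 \end{bmatrix} \]

(b) The $2 \times 4$ zero matrix is the matrix $0 = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$.

(c) Find $x, y, z, t$ such that

\[ \begin{bmatrix} x + y & 2z + t \\ x - y & z - t \end{bmatrix} = \begin{bmatrix} 3 & 7 \\ 1 & 5 \end{bmatrix} \]

By definition of equality of matrices, the four corresponding entries must be equal. Thus,

\[ x + y = 3, \quad x - y = 1, \quad 2z + t = 7, \quad z - t = 5 \]

Solving the above system of equations yields $x = 2$, $y = 1$, $z = 4$, $t = -1$.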

2.3 Matrix Addition and Scalar Multiplication

Let $A = [a_{ij}]$ and $B = [b_{ij}]$ be two matrices with the same size, say $m \times n$ matrices. The sum of $A$ and $B$, written $A + B$, is the matrix obtained by adding corresponding elements from $A$ and $B$. That is,

\[
A + B = \begin{bmatrix} a_{11} + b_{11} & a_{12} + b_{12} & \cdots & a_{1n} + b_{1n} \\ a_{21} + b_{21} & a_{22} + b_{22} & \cdots & a_{2n} + b_{2n} \\ \cdots & \cdots & \cdots & \cdots \\ a_{m1} + b_{m1} & a_{m2} + b_{m2} & \cdots & a_{mn} + b_{mn} \end{bmatrix}
\]

The product of the matrix $A$ by a scalar $k$, written $k \cdot A$ or simply $kA$, is the matrix obtained by multiplying each element of $A$ by $k$. That is,

\[
kA = \begin{bmatrix} ka_{11} & ka_{12} & \cdots & ka_{1n} \\ ka_{21} & ka_{22} & \cdots & ka_{2n} \\ \cdots & \cdots & \cdots & \cdots \\ ka_{m1} & ka_{m2} & \cdots & ka_{mn} \end{bmatrix}
\]

Observe that $A + B$ and $kA$ are also $m \times n$ matrices. We also define

\[ -A = (-1)A \quad \text{and} \quad A - B = A + (-B) \]

The matrix $-A$ is called the negative of the matrix $A$, and the matrix $A - B$ is called the difference of $A$ and $B$. The sum of matrices with different sizes is not defined.

EXAMPLE 2.2 Let $A = \begin{bmatrix} 1 & -2 & 3 \\ 0 & 4 & 5 \end{bmatrix}$ and $B = \begin{bmatrix} 4 & 6 & 8 \\ 1 & -3 & -7 \end{bmatrix}$. Then

\[
A + B = \begin{bmatrix} 1 + 4 & -2 + 6 & 3 + 8 \\ 0 + 1 & 4 + (-3) & 5 + (-7) \end{bmatrix} = \begin{bmatrix} 5 & 4 & 11 \\ 1 & 1 & -2 \end{bmatrix}
\]

\[
3A = \begin{bmatrix} 3(1) & 3(-2) & 3(3) \\ 3(0) & 3(4) & 3(5) \end{bmatrix} = \begin{bmatrix} 3 & -6 & 9 \\ 0 & 12 & 15 \end{bmatrix}
\]

\[
2A - 3B = \begin{bmatrix} 2 & -4 & 6 \\ 0 & 8 & 10 \end{bmatrix} + \begin{bmatrix} -12 & -18 & -24 \\ -3 & 9 & 21 \end{bmatrix} = \begin{bmatrix} -10 & -22 & -18 \\ -3 & 17 & 31 \end{bmatrix}
\]

The matrix $2A - 3B$ is called a linear combination of $A$ and $B$.

Basic properties of matrices under the operations of matrix addition and scalar multiplication follow.

THEOREM 2.1: Consider any matrices $A, B, C$ (with the same size) and any scalars $k$ and $k'$. Then

(i) $(A + B) + C = A + (B + C)$,
(ii) $A + 0 = 0 + A = A$,
(iii) $A + (-A) = (-A) + A = 0$,
(iv) $A + B = B + A$,
(v) $k(A + B) = kA + kB$,
(vi) $(k + k')A = kA + k'A$,
(vii) $(kk')A = k(k'A)$,
(viii) $1 \cdot A = A$.

Note first that the $0$ in (ii) and (iii) refers to the zero matrix. Also, by (i) and (iv), any sum of matrices

\[ A_1 + A_2 + \cdots + A_n \]

requires no parentheses, and the sum does not depend on the order of the matrices. Furthermore, using (vi) and (viii), we also have

\[ A + A = 2A, \quad A + A + A = 3A, \quad \ldots \]

and so on.

The proof of Theorem 2.1 reduces to showing that the ij-entries on both sides of each matrix equation are equal. (See Problem 2.3.)

Observe the similarity between Theorem 2.1 for matrices and Theorem 1.1 for vectors. In fact, the above operations for matrices may be viewed as generalizations of the corresponding operations for vectors.
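The entrywise definitions of this section translate directly into code. The following is a minimal Python sketch, not part of the text: it assumes matrices are stored as lists of rows, and the helper names `mat_add` and `scalar_mul` are hypothetical. It recomputes the sum and the linear combination of Example 2.2.

```python
# Assumption of this sketch: a matrix is a list of rows, each row a list of numbers.

def mat_add(A, B):
    """Entrywise sum A + B; defined only when A and B have the same size."""
    if len(A) != len(B) or any(len(ra) != len(rb) for ra, rb in zip(A, B)):
        raise ValueError("A + B is defined only for matrices of the same size")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scalar_mul(k, A):
    """Scalar multiple kA: every entry of A is multiplied by k."""
    return [[k * a for a in row] for row in A]

# The matrices of Example 2.2.
A = [[1, -2, 3], [0, 4, 5]]
B = [[4, 6, 8], [1, -3, -7]]

sum_AB = mat_add(A, B)                                   # A + B
lin_comb = mat_add(scalar_mul(2, A), scalar_mul(-3, B))  # 2A - 3B = 2A + (-3)B
print(sum_AB)
print(lin_comb)
```

Writing $2A - 3B$ as $2A + (-3)B$ mirrors the definition $A - B = A + (-B)$, so no separate subtraction routine is needed.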

2.4 Summation Symbol

Before we define matrix multiplication, it will be instructive to first introduce the summation symbol $\Sigma$ (the Greek capital letter sigma).

Suppose $f(k)$ is an algebraic expression involving the letter $k$. Then the expression

\[ \sum_{k=1}^{n} f(k) \]

has the following meaning. First we set $k = 1$ in $f(k)$, obtaining

\[ f(1) \]

Then we set $k = 2$ in $f(k)$, obtaining $f(2)$, and add this to $f(1)$, obtaining

\[ f(1) + f(2) \]

Then we set $k = 3$ in $f(k)$, obtaining $f(3)$, and add this to the previous sum, obtaining

\[ f(1) + f(2) + f(3) \]

We continue this process until we obtain the sum

\[ f(1) + f(2) + \cdots + f(n) \]

Observe that at each step we increase the value of $k$ by 1 until we reach $n$. The letter $k$ is called the index, and 1 and $n$ are called, respectively, the lower and upper limits. Other letters frequently used as indices are $i$ and $j$.

We also generalize our definition by allowing the sum to range from any integer $n_1$ to any integer $n_2$. That is, we define

\[ \sum_{k=n_1}^{n_2} f(k) = f(n_1) + f(n_1 + 1) + f(n_1 + 2) + \cdots + f(n_2) \]

EXAMPLE 2.3

(a) $\displaystyle\sum_{k=1}^{5} x_k = x_1 + x_2 + x_3 + x_4 + x_5$ and $\displaystyle\sum_{i=1}^{n} a_i b_i = a_1 b_1 + a_2 b_2 + \cdots + a_n b_n$

(b) $\displaystyle\sum_{j=2}^{5} j^2 = 2^2 + 3^2 + 4^2 + 5^2 = 54$ and $\displaystyle\sum_{i=0}^{n} a_i x^i = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$

(c) $\displaystyle\sum_{k=1}^{p} a_{ik} b_{kj} = a_{i1} b_{1j} + a_{i2} b_{2j} + a_{i3} b_{3j} + \cdots + a_{ip} b_{pj}$
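The step-by-step meaning of the summation symbol is exactly a loop over the index. Here is a small Python sketch under that reading; the function name `summation` and the sample lists `a` and `b` are illustrative assumptions, not notation from the text.

```python
def summation(f, lower, upper):
    """Return f(lower) + f(lower + 1) + ... + f(upper)."""
    total = 0
    for k in range(lower, upper + 1):  # the index k increases by 1 up to the upper limit
        total += f(k)
    return total

# Example 2.3(b): the sum of j^2 for j = 2, ..., 5.
squares = summation(lambda j: j ** 2, 2, 5)

# A sum of products a_i * b_i for i = 1, ..., n (sample values chosen for illustration).
a = [1, 2, 3]
b = [4, 5, 6]
# The text's index i starts at 1, while Python lists start at 0, hence the i - 1.
dot = summation(lambda i: a[i - 1] * b[i - 1], 1, len(a))

print(squares, dot)
```

The second sum has the same shape as the expression in Example 2.3(c), which is the pattern used for each entry of a matrix product.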

2.5 Matrix Multiplication

The product of matrices $A$ and $B$, written $AB$, is somewhat complicated. For this reason, we first begin with a special case.

The product $AB$ of a row matrix $A = [a_i]$ and a column matrix $B = [b_i]$ with the same number of elements is defined to be the scalar (or $1 \times 1$ matrix) obtained by multiplying corresponding entries and adding; that is,

\[
AB = [a_1, a_2, \ldots, a_n] \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix} = a_1 b_1 + a_2 b_2 + \cdots + a_n b_n = \sum_{k=1}^{n} a_k b_k
\]

We emphasize that $AB$ is a scalar (or a $1 \times 1$ matrix). The product $AB$ is not defined when $A$ and $B$ have different numbers of elements.

EXAMPLE 2.4

(a) $[7, -4, 5] \begin{bmatrix} 3 \\ 2 \\ -1 \end{bmatrix} = 7(3) + (-4)(2) + 5(-1) = 21 - 8 - 5 = 8$

(b) $[6, -1, 8, 3] \begin{bmatrix} 4 \\ -9 \\ -2 \\ 5 \end{bmatrix} = 24 + 9 - 16 + 15 = 32$

We are now ready to define matrix multiplication in general.

DEFINITION: Suppose $A = [a_{ik}]$ and $B = [b_{kj}]$ are matrices such that the number of columns of $A$ is equal to the number of rows of $B$; say, $A$ is an $m \times p$ matrix and $B$ is a $p \times n$ matrix. Then the product $AB$ is the $m \times n$ matrix whose ij-entry is obtained by multiplying the ith row of $A$ by the jth column of $B$. That is,

\[
\begin{bmatrix} a_{11} & \cdots & a_{1p} \\ \vdots & & \vdots \\ a_{i1} & \cdots & a_{ip} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mp} \end{bmatrix}
\begin{bmatrix} b_{11} & \cdots & b_{1j} & \cdots & b_{1n} \\ \vdots & & \vdots & & \vdots \\ b_{p1} & \cdots & b_{pj} & \cdots & b_{pn} \end{bmatrix}
=
\begin{bmatrix} c_{11} & \cdots & c_{1n} \\ \vdots & c_{ij} & \vdots \\ c_{m1} & \cdots & c_{mn} \end{bmatrix}
\]

where

\[ c_{ij} = a_{i1} b_{1j} + a_{i2} b_{2j} + \cdots + a_{ip} b_{pj} = \sum_{k=1}^{p} a_{ik} b_{kj} \]

The product $AB$ is not defined if $A$ is an $m \times p$ matrix and $B$ is a $q \times n$ matrix, where $p \ne q$.
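EXAMPLE 2.5

(a) Find $AB$ where $A = \begin{bmatrix} 1 & 3 \\ 2 & -1 \end{bmatrix}$ and $B = \begin{bmatrix} 2 & 0 & -4 \\ 5 & -2 & 6 \end{bmatrix}$.

Because $A$ is $2 \times 2$ and $B$ is $2 \times 3$, the product $AB$ is defined and $AB$ is a $2 \times 3$ matrix. To obtain the first row of the product matrix $AB$, multiply the first row $[1, 3]$ of $A$ by each column of $B$,

\[ \begin{bmatrix} 2 \\ 5 \end{bmatrix}, \quad \begin{bmatrix} 0 \\ -2 \end{bmatrix}, \quad \begin{bmatrix} -4 \\ 6 \end{bmatrix} \]

respectively. That is, the first row of $AB$ is

\[ [\,2 + 15, \;\; 0 - 6, \;\; -4 + 18\,] = [\,17, \;\; -6, \;\; 14\,] \]

To obtain the second row of $AB$, multiply the second row $[2, -1]$ of $A$ by each column of $B$. Thus,

\[
AB = \begin{bmatrix} 17 & -6 & 14 \\ 4 - 5 & 0 + 2 & -8 - 6 \end{bmatrix} = \begin{bmatrix} 17 & -6 & 14 \\ -1 & 2 & -14 \end{bmatrix}
\]

(b) Suppose $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$ and $B = \begin{bmatrix} 5 & 6 \\ 0 & -2 \end{bmatrix}$. Then

\[
AB = \begin{bmatrix} 5 + 0 & 6 - 4 \\ 15 + 0 & 18 - 8 \end{bmatrix} = \begin{bmatrix} 5 & 2 \\ 15 & 10 \end{bmatrix}
\quad \text{and} \quad
BA = \begin{bmatrix} 5 + 18 & 10 + 24 \\ 0 - 6 & 0 - 8 \end{bmatrix} = \begin{bmatrix} 23 & 34 \\ -6 & -8 \end{bmatrix}
\]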

The above example shows that matrix multiplication is not commutative; that is, in general, $AB \ne BA$. However, matrix multiplication does satisfy the following properties.

THEOREM 2.2: Let $A, B, C$ be matrices. Then, whenever the products and sums are defined,

(i) $(AB)C = A(BC)$ (associative law),
(ii) $A(B + C) = AB + AC$ (left distributive law),
(iii) $(B + C)A = BA + CA$ (right distributive law),
(iv) $k(AB) = (kA)B = A(kB)$, where $k$ is a scalar.

We note that $0A = 0$ and $B0 = 0$, where $0$ is the zero matrix.
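The definition of the product, with each entry given by $c_{ij} = \sum_k a_{ik} b_{kj}$, can be sketched in a few lines of Python. This is an illustrative sketch only: the list-of-rows representation and the name `mat_mul` are assumptions of the sketch. It reproduces both parts of Example 2.5, including the failure of commutativity.

```python
def mat_mul(A, B):
    """Product AB of an m x p matrix A and a p x n matrix B (lists of rows)."""
    p = len(B)
    if any(len(row) != p for row in A):
        raise ValueError("AB is defined only when columns of A = rows of B")
    n = len(B[0])
    # c_ij = a_i1*b_1j + a_i2*b_2j + ... + a_ip*b_pj
    return [[sum(A[i][k] * B[k][j] for k in range(p)) for j in range(n)]
            for i in range(len(A))]

# Example 2.5(a): a 2 x 2 matrix times a 2 x 3 matrix gives a 2 x 3 matrix.
A = [[1, 3], [2, -1]]
B = [[2, 0, -4], [5, -2, 6]]
AB = mat_mul(A, B)

# Example 2.5(b): both products exist but differ.
C = [[1, 2], [3, 4]]
D = [[5, 6], [0, -2]]
CD = mat_mul(C, D)
DC = mat_mul(D, C)

print(AB)
print(CD, DC)
```

Note that `mat_mul(B, A)` raises an error for the matrices of part (a), since $B$ has three columns while $A$ has two rows.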

2.6 Transpose of a Matrix

The transpose of a matrix $A$, written $A^T$, is the matrix obtained by writing the columns of $A$, in order, as rows. For example,

\[
\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}^T = \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix}
\quad \text{and} \quad
[1, -3, -5]^T = \begin{bmatrix} 1 \\ -3 \\ -5 \end{bmatrix}
\]

In other words, if $A = [a_{ij}]$ is an $m \times n$ matrix, then $A^T = [b_{ij}]$ is the $n \times m$ matrix where $b_{ij} = a_{ji}$.

Observe that the transpose of a row vector is a column vector. Similarly, the transpose of a column vector is a row vector.

The next theorem lists basic properties of the transpose operation.

THEOREM 2.3: Let $A$ and $B$ be matrices and let $k$ be a scalar. Then, whenever the sum and product are defined,

(i) $(A + B)^T = A^T + B^T$,
(ii) $(A^T)^T = A$,
(iii) $(kA)^T = kA^T$,
(iv) $(AB)^T = B^T A^T$.

We emphasize that, by (iv), the transpose of a product is the product of the transposes, but in the reverse order.
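Property (iv) is easy to check numerically on a small case. The Python sketch below assumes a list-of-rows representation; the helper names and the matrix `B` are hypothetical choices made for illustration, while `A` is the $2 \times 3$ matrix transposed in the text.

```python
def transpose(A):
    """Return A^T: entry (i, j) of the result is entry (j, i) of A."""
    return [[A[j][i] for j in range(len(A))] for i in range(len(A[0]))]

def mat_mul(A, B):
    """Product AB, assuming the number of columns of A equals the number of rows of B."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[1, 2, 3], [4, 5, 6]]        # 2 x 3, the matrix used in the text
B = [[1, 0], [2, -1], [0, 3]]     # 3 x 2, chosen only for this illustration

lhs = transpose(mat_mul(A, B))             # (AB)^T
rhs = mat_mul(transpose(B), transpose(A))  # B^T A^T, note the reversed order
print(transpose(A))
print(lhs == rhs)
```

The reversed order is forced by the sizes alone: $A^T$ is $3 \times 2$ and $B^T$ is $2 \times 3$, so $A^T B^T$ would be a $3 \times 3$ matrix, whereas $(AB)^T$ is $2 \times 2$.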

2.7 Square Matrices

A square matrix is a matrix with the same number of rows as columns. An $n \times n$ square matrix is said to be of order $n$ and is sometimes called an n-square matrix.

Recall that not every two matrices can be added or multiplied. However, if we only consider square matrices of some given order $n$, then this inconvenience disappears. Specifically, the operations of addition, multiplication, scalar multiplication, and transpose can be performed on any $n \times n$ matrices, and the result is again an $n \times n$ matrix.

EXAMPLE 2.6 The following are square matrices of order 3:

\[
A = \begin{bmatrix} 1 & 2 & 3 \\ -4 & -4 & -4 \\ 5 & 6 & 7 \end{bmatrix}
\quad \text{and} \quad
B = \begin{bmatrix} 2 & -5 & 1 \\ 0 & 3 & -2 \\ 1 & 2 & -4 \end{bmatrix}
\]

The following are also matrices of order 3:

\[
A + B = \begin{bmatrix} 3 & -3 & 4 \\ -4 & -1 & -6 \\ 6 & 8 & 3 \end{bmatrix}, \quad
2A = \begin{bmatrix} 2 & 4 & 6 \\ -8 & -8 & -8 \\ 10 & 12 & 14 \end{bmatrix}, \quad
A^T = \begin{bmatrix} 1 & -4 & 5 \\ 2 & -4 & 6 \\ 3 & -4 & 7 \end{bmatrix}
\]

\[
AB = \begin{bmatrix} 5 & 7 & -15 \\ -12 & 0 & 20 \\ 17 & 7 & -35 \end{bmatrix}, \quad
BA = \begin{bmatrix} 27 & 30 & 33 \\ -22 & -24 & -26 \\ -27 & -30 & -33 \end{bmatrix}
\]

Diagonal and Trace
Let $A = [a_{ij}]$ be an n-square matrix. The diagonal or main diagonal of $A$ consists of the elements with the same subscripts, that is,

\[ a_{11}, \; a_{22}, \; a_{33}, \; \ldots, \; a_{nn} \]

The trace of $A$, written $\operatorname{tr}(A)$, is the sum of the diagonal elements. Namely,

\[ \operatorname{tr}(A) = a_{11} + a_{22} + a_{33} + \cdots + a_{nn} \]

The following theorem applies.

THEOREM 2.4: Suppose $A = [a_{ij}]$ and $B = [b_{ij}]$ are n-square matrices and $k$ is a scalar. Then

(i) $\operatorname{tr}(A + B) = \operatorname{tr}(A) + \operatorname{tr}(B)$,
(ii) $\operatorname{tr}(kA) = k \operatorname{tr}(A)$,
(iii) $\operatorname{tr}(A^T) = \operatorname{tr}(A)$,
(iv) $\operatorname{tr}(AB) = \operatorname{tr}(BA)$.

EXAMPLE 2.7 Let $A$ and $B$ be the matrices $A$ and $B$ in Example 2.6. Then

diagonal of $A = \{1, -4, 7\}$ and $\operatorname{tr}(A) = 1 - 4 + 7 = 4$
diagonal of $B = \{2, 3, -4\}$ and $\operatorname{tr}(B) = 2 + 3 - 4 = 1$

Moreover,

\[ \operatorname{tr}(A + B) = 3 - 1 + 3 = 5, \quad \operatorname{tr}(2A) = 2 - 8 + 14 = 8, \quad \operatorname{tr}(A^T) = 1 - 4 + 7 = 4 \]

\[ \operatorname{tr}(AB) = 5 + 0 - 35 = -30, \quad \operatorname{tr}(BA) = 27 - 24 - 33 = -30 \]

As expected from Theorem 2.4,

\[ \operatorname{tr}(A + B) = \operatorname{tr}(A) + \operatorname{tr}(B), \quad \operatorname{tr}(A^T) = \operatorname{tr}(A), \quad \operatorname{tr}(2A) = 2 \operatorname{tr}(A) \]

Furthermore, although $AB \ne BA$, the traces are equal.
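The trace computations of Example 2.7 can be replayed with a short Python sketch. The helper names (`trace`, `mat_add`, `mat_mul`) and the list-of-rows representation are assumptions of this sketch, not notation from the text.

```python
def trace(M):
    """Sum of the diagonal entries of a square matrix M."""
    if any(len(row) != len(M) for row in M):
        raise ValueError("the trace is defined only for square matrices")
    return sum(M[i][i] for i in range(len(M)))

def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# The matrices A and B of Example 2.6.
A = [[1, 2, 3], [-4, -4, -4], [5, 6, 7]]
B = [[2, -5, 1], [0, 3, -2], [1, 2, -4]]

AB = mat_mul(A, B)
BA = mat_mul(B, A)
print(trace(A), trace(B), trace(mat_add(A, B)))
print(AB != BA, trace(AB), trace(BA))
```

The last line illustrates Theorem 2.4(iv) on this pair: the two products differ as matrices, yet their diagonal sums agree.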

Identity Matrix, Scalar Matrices
The n-square identity or unit matrix, denoted by $I_n$, or simply $I$, is the n-square matrix with 1's on the diagonal and 0's elsewhere. The identity matrix $I$ is similar to the scalar 1 in that, for any n-square matrix $A$,

\[ AI = IA = A \]

More generally, if $B$ is an $m \times n$ matrix, then $BI_n = I_m B = B$.

For any scalar $k$, the matrix $kI$ that contains $k$'s on the diagonal and 0's elsewhere is called the scalar matrix corresponding to the scalar $k$. Observe that

\[ (kI)A = k(IA) = kA \]

That is, multiplying a matrix $A$ by the scalar matrix $kI$ is equivalent to multiplying $A$ by the scalar $k$.

EXAMPLE 2.8 The following are the identity matrices of orders 3 and 4 and the corresponding scalar matrices for $k = 5$:

\[
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad
\begin{bmatrix} 1 & & & \\ & 1 & & \\ & & 1 & \\ & & & 1 \end{bmatrix}, \quad
\begin{bmatrix} 5 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 5 \end{bmatrix}, \quad
\begin{bmatrix} 5 & & & \\ & 5 & & \\ & & 5 & \\ & & & 5 \end{bmatrix}
\]

Remark 1: It is common practice to omit blocks or patterns of 0's when there is no ambiguity, as in the above second and fourth matrices.

Remark 2: The Kronecker delta function $\delta_{ij}$ is defined by

\[ \delta_{ij} = \begin{cases} 0 & \text{if } i \ne j \\ 1 & \text{if } i = j \end{cases} \]

Thus, the identity matrix may be defined by $I = [\delta_{ij}]$.

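2.8 Powers of Matrices, Polynomials in Matrices

Let $A$ be an n-square matrix over a field $K$. Powers of $A$ are defined as follows:

\[ A^2 = AA, \quad A^3 = A^2 A, \quad \ldots, \quad A^{n+1} = A^n A, \quad \ldots, \quad \text{and} \quad A^0 = I \]

Polynomials in the matrix $A$ are also defined. Specifically, for any polynomial

\[ f(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n \]

where the $a_i$ are scalars in $K$, $f(A)$ is defined to be the following matrix:

\[ f(A) = a_0 I + a_1 A + a_2 A^2 + \cdots + a_n A^n \]

[Note that $f(A)$ is obtained from $f(x)$ by substituting the matrix $A$ for the variable $x$ and substituting the scalar matrix $a_0 I$ for the scalar $a_0$.] If $f(A)$ is the zero matrix, then $A$ is called a zero or root of $f(x)$.

EXAMPLE 2.9 Suppose $A = \begin{bmatrix} 1 & 2 \\ 3 & -4 \end{bmatrix}$. Then

\[
A^2 = \begin{bmatrix} 1 & 2 \\ 3 & -4 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 3 & -4 \end{bmatrix} = \begin{bmatrix} 7 & -6 \\ -9 & 22 \end{bmatrix}
\quad \text{and} \quad
A^3 = A^2 A = \begin{bmatrix} 7 & -6 \\ -9 & 22 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 3 & -4 \end{bmatrix} = \begin{bmatrix} -11 & 38 \\ 57 & -106 \end{bmatrix}
\]

Suppose $f(x) = 2x^2 - 3x + 5$ and $g(x) = x^2 + 3x - 10$. Then

\[
f(A) = 2 \begin{bmatrix} 7 & -6 \\ -9 & 22 \end{bmatrix} - 3 \begin{bmatrix} 1 & 2 \\ 3 & -4 \end{bmatrix} + 5 \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 16 & -18 \\ -27 & 61 \end{bmatrix}
\]

\[
g(A) = \begin{bmatrix} 7 & -6 \\ -9 & 22 \end{bmatrix} + 3 \begin{bmatrix} 1 & 2 \\ 3 & -4 \end{bmatrix} - 10 \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}
\]

Thus, $A$ is a zero of the polynomial $g(x)$.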
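The definition of $f(A)$ can be followed literally in code by accumulating $a_i A^i$ while building up the powers of $A$. The Python sketch below does this for Example 2.9; the function names and the convention that `coeffs[i]` holds the coefficient $a_i$ are assumptions of the sketch.

```python
def identity(n):
    """The n-square identity matrix."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def poly_at(coeffs, A):
    """Evaluate f(A) = a0*I + a1*A + ... + an*A^n, where coeffs = [a0, a1, ..., an]."""
    n = len(A)
    result = [[0] * n for _ in range(n)]
    power = identity(n)                       # A^0 = I
    for a in coeffs:
        # add a_i * A^i to the running total, entry by entry
        result = [[result[r][c] + a * power[r][c] for c in range(n)]
                  for r in range(n)]
        power = mat_mul(power, A)             # A^(i+1) = A^i A
    return result

A = [[1, 2], [3, -4]]
f_A = poly_at([5, -3, 2], A)     # f(x) = 2x^2 - 3x + 5
g_A = poly_at([-10, 3, 1], A)    # g(x) = x^2 + 3x - 10
print(f_A)
print(g_A)
```

The constant term is multiplied by `identity(n)` rather than added as a bare number, which is the substitution of the scalar matrix $a_0 I$ for $a_0$ described in the text.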

2.9 Invertible (Nonsingular) Matrices

A square matrix $A$ is said to be invertible or nonsingular if there exists a matrix $B$ such that

\[ AB = BA = I \]

where $I$ is the identity matrix. Such a matrix $B$ is unique. That is, if $AB_1 = B_1 A = I$ and $AB_2 = B_2 A = I$, then

\[ B_1 = B_1 I = B_1 (AB_2) = (B_1 A) B_2 = I B_2 = B_2 \]

We call such a matrix $B$ the inverse of $A$ and denote it by $A^{-1}$. Observe that the above relation is symmetric; that is, if $B$ is the inverse of $A$, then $A$ is the inverse of $B$.

EXAMPLE 2.10 Suppose that $A = \begin{bmatrix} 2 & 5 \\ 1 & 3 \end{bmatrix}$ and $B = \begin{bmatrix} 3 & -5 \\ -1 & 2 \end{bmatrix}$. Then

\[
AB = \begin{bmatrix} 6 - 5 & -10 + 10 \\ 3 - 3 & -5 + 6 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\quad \text{and} \quad
BA = \begin{bmatrix} 6 - 5 & 15 - 15 \\ -2 + 2 & -5 + 6 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\]

Thus, $A$ and $B$ are inverses.

It is known (Theorem 3.16) that $AB = I$ if and only if $BA = I$. Thus, it is necessary to test only one product to determine whether or not two given matrices are inverses. (See Problem 2.17.)

Now suppose $A$ and $B$ are invertible. Then $AB$ is invertible and $(AB)^{-1} = B^{-1} A^{-1}$. More generally, if $A_1, A_2, \ldots, A_k$ are invertible, then their product is invertible and

\[ (A_1 A_2 \cdots A_k)^{-1} = A_k^{-1} \cdots A_2^{-1} A_1^{-1} \]

the product of the inverses in the reverse order.

Inverse of a 2 × 2 Matrix

Let $A$ be an arbitrary $2 \times 2$ matrix, say $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$. We want to derive a formula for $A^{-1}$, the inverse of $A$. Specifically, we seek $2^2 = 4$ scalars, say $x_1$, $y_1$, $x_2$, $y_2$, such that

\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} x_1 & x_2 \\ y_1 & y_2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\quad \text{or} \quad
\begin{bmatrix} ax_1 + by_1 & ax_2 + by_2 \\ cx_1 + dy_1 & cx_2 + dy_2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\]

Setting the four entries equal to the corresponding entries in the identity matrix yields four equations, which can be partitioned into two $2 \times 2$ systems as follows:

\[
\begin{aligned}
ax_1 + by_1 &= 1, & ax_2 + by_2 &= 0 \\
cx_1 + dy_1 &= 0, & cx_2 + dy_2 &= 1
\end{aligned}
\]

Suppose we let $|A| = ad - bc$ (called the determinant of $A$). Assuming $|A| \ne 0$, we can solve uniquely for the above unknowns $x_1$, $y_1$, $x_2$, $y_2$, obtaining

\[ x_1 = \frac{d}{|A|}, \quad y_1 = \frac{-c}{|A|}, \quad x_2 = \frac{-b}{|A|}, \quad y_2 = \frac{a}{|A|} \]

Accordingly,

\[
A^{-1} = \begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} = \begin{bmatrix} d/|A| & -b/|A| \\ -c/|A| & a/|A| \end{bmatrix} = \frac{1}{|A|} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}
\]

In other words, when $|A| \ne 0$, the inverse of a $2 \times 2$ matrix $A$ may be obtained from $A$ as follows:

(1) Interchange the two elements on the diagonal.
(2) Take the negatives of the other two elements.
(3) Multiply the resulting matrix by $1/|A|$ or, equivalently, divide each element by $|A|$.

In case $|A| = 0$, the matrix $A$ is not invertible.

EXAMPLE 2.11 Find the inverse of $A = \begin{bmatrix} 2 & 3 \\ 4 & 5 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 3 \\ 2 & 6 \end{bmatrix}$.

First evaluate $|A| = 2(5) - 3(4) = 10 - 12 = -2$. Because $|A| \ne 0$, the matrix $A$ is invertible and

\[
A^{-1} = \frac{1}{-2} \begin{bmatrix} 5 & -3 \\ -4 & 2 \end{bmatrix} = \begin{bmatrix} -\frac{5}{2} & \frac{3}{2} \\ 2 & -1 \end{bmatrix}
\]

Now evaluate $|B| = 1(6) - 3(2) = 6 - 6 = 0$. Because $|B| = 0$, the matrix $B$ has no inverse.

Remark: The above property that a matrix is invertible if and only if it has a nonzero determinant is true for square matrices of any order. (See Chapter 8.)
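The three-step recipe for the inverse of a $2 \times 2$ matrix can be sketched in Python as follows. This is an illustrative sketch: the function name `inverse_2x2` and the choice to return `None` for a non-invertible matrix are assumptions, and exact rational arithmetic is used (via the standard `fractions` module) so that entries such as $-5/2$ are not rounded.

```python
from fractions import Fraction

def inverse_2x2(M):
    """Inverse of [[a, b], [c, d]] by the determinant formula, or None if |M| = 0."""
    (a, b), (c, d) = M
    det = a * d - b * c                 # |M| = ad - bc
    if det == 0:
        return None                     # no inverse exists
    det = Fraction(det)
    # swap the diagonal entries, negate the other two, divide everything by |M|
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[2, 3], [4, 5]]    # Example 2.11: |A| = -2
B = [[1, 3], [2, 6]]    # Example 2.11: |B| = 0

A_inv = inverse_2x2(A)
B_inv = inverse_2x2(B)
print(A_inv)
print(B_inv)
```

Testing the determinant before dividing keeps the singular case explicit instead of letting it surface as a division error.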

Inverse of an n × n Matrix
Suppose $A$ is an arbitrary n-square matrix. Finding its inverse $A^{-1}$ reduces, as above, to finding the solution of a collection of $n \times n$ systems of linear equations. The solution of such systems and an efficient way of solving such a collection of systems is treated in Chapter 3.

2.10 Special Types of Square Matrices

This section describes a number of special kinds of square matrices.

Diagonal and Triangular Matrices
A square matrix $D = [d_{ij}]$ is diagonal if its nondiagonal entries are all zero. Such a matrix is sometimes denoted by

\[ D = \operatorname{diag}(d_{11}, d_{22}, \ldots, d_{nn}) \]

where some or all the $d_{ii}$ may be zero. For example,

\[
\begin{bmatrix} 3 & 0 & 0 \\ 0 & -7 & 0 \\ 0 & 0 & 2 \end{bmatrix}, \quad
\begin{bmatrix} 4 & 0 \\ 0 & -5 \end{bmatrix}, \quad
\begin{bmatrix} 6 & & & \\ & 0 & & \\ & & -9 & \\ & & & 8 \end{bmatrix}
\]

are diagonal matrices, which may be represented, respectively, by

\[ \operatorname{diag}(3, -7, 2), \quad \operatorname{diag}(4, -5), \quad \operatorname{diag}(6, 0, -9, 8) \]

(Observe that patterns of 0's in the third matrix have been omitted.)

A square matrix $A = [a_{ij}]$ is upper triangular or simply triangular if all entries below the (main) diagonal are equal to 0, that is, if $a_{ij} = 0$ for $i > j$. Generic upper triangular matrices of orders 2, 3, 4 are as follows:

\[
\begin{bmatrix} a_{11} & a_{12} \\ 0 & a_{22} \end{bmatrix}, \quad
\begin{bmatrix} b_{11} & b_{12} & b_{13} \\ & b_{22} & b_{23} \\ & & b_{33} \end{bmatrix}, \quad
\begin{bmatrix} c_{11} & c_{12} & c_{13} & c_{14} \\ & c_{22} & c_{23} & c_{24} \\ & & c_{33} & c_{34} \\ & & & c_{44} \end{bmatrix}
\]

(As with diagonal matrices, it is common practice to omit patterns of 0's.)

The following theorem applies.

THEOREM 2.5: Suppose $A = [a_{ij}]$ and $B = [b_{ij}]$ are $n \times n$ (upper) triangular matrices. Then

(i) $A + B$, $kA$, $AB$ are triangular with respective diagonals:

\[ (a_{11} + b_{11}, \ldots, a_{nn} + b_{nn}), \quad (ka_{11}, \ldots, ka_{nn}), \quad (a_{11} b_{11}, \ldots, a_{nn} b_{nn}) \]

(ii) For any polynomial $f(x)$, the matrix $f(A)$ is triangular with diagonal

\[ (f(a_{11}), f(a_{22}), \ldots, f(a_{nn})) \]

(iii) $A$ is invertible if and only if each diagonal element $a_{ii} \ne 0$, and when $A^{-1}$ exists it is also triangular.

A lower triangular matrix is a square matrix whose entries above the diagonal are all zero. We note that Theorem 2.5 is true if we replace "triangular" by either "lower triangular" or "diagonal."

Remark: A nonempty collection $\mathcal{A}$ of matrices is called an algebra (of matrices) if $\mathcal{A}$ is closed under the operations of matrix addition, scalar multiplication, and matrix multiplication. Clearly, the square matrices with a given order form an algebra of matrices, but so do the scalar, diagonal, triangular, and lower triangular matrices.

Special Real Square Matrices: Symmetric, Orthogonal, Normal [Optional until Chapter 12]
Suppose now $A$ is a square matrix with real entries, that is, a real square matrix. The relationship between $A$ and its transpose $A^T$ yields important kinds of matrices.

(a) Symmetric Matrices

A matrix $A$ is symmetric if $A^T = A$. Equivalently, $A = [a_{ij}]$ is symmetric if symmetric elements (mirror elements with respect to the diagonal) are equal, that is, if each $a_{ij} = a_{ji}$.

A matrix $A$ is skew-symmetric if $A^T = -A$ or, equivalently, if each $a_{ij} = -a_{ji}$. Clearly, the diagonal elements of such a matrix must be zero, because $a_{ii} = -a_{ii}$ implies $a_{ii} = 0$.

(Note that a matrix $A$ must be square if $A^T = A$ or $A^T = -A$.)

EXAMPLE 2.12 Let

\[
A = \begin{bmatrix} 2 & -3 & 5 \\ -3 & 6 & 7 \\ 5 & 7 & -8 \end{bmatrix}, \quad
B = \begin{bmatrix} 0 & 3 & -4 \\ -3 & 0 & 5 \\ 4 & -5 & 0 \end{bmatrix}, \quad
C = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]

(a) By inspection, the symmetric elements in $A$ are equal, or $A^T = A$. Thus, $A$ is symmetric.
(b) The diagonal elements of $B$ are 0 and symmetric elements are negatives of each other, or $B^T = -B$. Thus, $B$ is skew-symmetric.
(c) Because $C$ is not square, $C$ is neither symmetric nor skew-symmetric.

(b) Orthogonal Matrices

A real matrix $A$ is orthogonal if $A^T = A^{-1}$, that is, if $AA^T = A^T A = I$. Thus, $A$ must necessarily be square and invertible.

EXAMPLE 2.13 Let $A = \begin{bmatrix} \frac{1}{9} & \frac{8}{9} & -\frac{4}{9} \\ \frac{4}{9} & -\frac{4}{9} & -\frac{7}{9} \\ \frac{8}{9} & \frac{1}{9} & \frac{4}{9} \end{bmatrix}$. Multiplying $A$ by $A^T$ yields $I$; that is, $AA^T = I$. This means $A^T A = I$, as well. Thus, $A^T = A^{-1}$; that is, $A$ is orthogonal.

Now suppose $A$ is a real orthogonal $3 \times 3$ matrix with rows

\[ u_1 = (a_1, a_2, a_3), \quad u_2 = (b_1, b_2, b_3), \quad u_3 = (c_1, c_2, c_3) \]

Because $A$ is orthogonal, we must have $AA^T = I$. Namely,

\[
AA^T = \begin{bmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{bmatrix} \begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = I
\]

Multiplying $A$ by $A^T$ and setting each entry equal to the corresponding entry in $I$ yields the following nine equations:

\[
\begin{aligned}
a_1^2 + a_2^2 + a_3^2 &= 1, & a_1 b_1 + a_2 b_2 + a_3 b_3 &= 0, & a_1 c_1 + a_2 c_2 + a_3 c_3 &= 0 \\
b_1 a_1 + b_2 a_2 + b_3 a_3 &= 0, & b_1^2 + b_2^2 + b_3^2 &= 1, & b_1 c_1 + b_2 c_2 + b_3 c_3 &= 0 \\
c_1 a_1 + c_2 a_2 + c_3 a_3 &= 0, & c_1 b_1 + c_2 b_2 + c_3 b_3 &= 0, & c_1^2 + c_2^2 + c_3^2 &= 1
\end{aligned}
\]

Accordingly, $u_1 \cdot u_1 = 1$, $u_2 \cdot u_2 = 1$, $u_3 \cdot u_3 = 1$, and $u_i \cdot u_j = 0$ for $i \ne j$. Thus, the rows $u_1$, $u_2$, $u_3$ are unit vectors and are orthogonal to each other.

Generally speaking, vectors $u_1, u_2, \ldots, u_m$ in $\mathbf{R}^n$ are said to form an orthonormal set of vectors if the vectors are unit vectors and are orthogonal to each other; that is,

\[ u_i \cdot u_j = \begin{cases} 0 & \text{if } i \ne j \\ 1 & \text{if } i = j \end{cases} \]

In other words, $u_i \cdot u_j = \delta_{ij}$ where $\delta_{ij}$ is the Kronecker delta function.

We have shown that the condition $AA^T = I$ implies that the rows of $A$ form an orthonormal set of vectors. The condition $A^T A = I$ similarly implies that the columns of $A$ also form an orthonormal set of vectors. Furthermore, because each step is reversible, the converse is true.

The above results for $3 \times 3$ matrices are true in general. That is, the following theorem holds.

THEOREM 2.6: Let $A$ be a real matrix. Then the following are equivalent:

(a) $A$ is orthogonal.
(b) The rows of $A$ form an orthonormal set.
(c) The columns of $A$ form an orthonormal set.

For $n = 2$, we have the following result (proved in Problem 2.28).

THEOREM 2.7: Let $A$ be a real $2 \times 2$ orthogonal matrix. Then, for some real number $\theta$,

\[
A = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}
\quad \text{or} \quad
A = \begin{bmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{bmatrix}
\]
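The row criterion of Theorem 2.6 gives a direct numerical test for orthogonality: every pair of rows must have dot product $\delta_{ij}$. The Python sketch below applies it to the matrix of Example 2.13 (with exact fractions) and to one matrix of the first form in Theorem 2.7. The function name `is_orthogonal`, the tolerance `tol`, and the sample angle are assumptions of the sketch.

```python
import math
from fractions import Fraction

def is_orthogonal(A, tol=1e-12):
    """Check that the rows of the square matrix A form an orthonormal set."""
    n = len(A)
    if any(len(row) != n for row in A):
        return False                      # an orthogonal matrix must be square
    for i in range(n):
        for j in range(n):
            dot = sum(A[i][k] * A[j][k] for k in range(n))   # u_i . u_j = (A A^T)_ij
            expected = 1 if i == j else 0                    # Kronecker delta
            if abs(dot - expected) > tol:
                return False
    return True

# The matrix of Example 2.13, entered exactly as ninths.
A = [[Fraction(1, 9), Fraction(8, 9), Fraction(-4, 9)],
     [Fraction(4, 9), Fraction(-4, 9), Fraction(-7, 9)],
     [Fraction(8, 9), Fraction(1, 9), Fraction(4, 9)]]

# A 2 x 2 matrix of the first form in Theorem 2.7, for a sample angle.
theta = 0.3
R = [[math.cos(theta), math.sin(theta)],
     [-math.sin(theta), math.cos(theta)]]

print(is_orthogonal(A), is_orthogonal(R), is_orthogonal([[1, 1], [0, 1]]))
```

The tolerance matters only for floating-point input such as `R`; with `Fraction` entries the dot products are exact.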

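(c) Normal Matrices

A real matrix $A$ is normal if it commutes with its transpose $A^T$, that is, if $AA^T = A^T A$. If $A$ is symmetric, orthogonal, or skew-symmetric, then $A$ is normal. There are also other normal matrices.

EXAMPLE 2.14 Let $A = \begin{bmatrix} 6 & -3 \\ 3 & 6 \end{bmatrix}$. Then

\[
AA^T = \begin{bmatrix} 6 & -3 \\ 3 & 6 \end{bmatrix} \begin{bmatrix} 6 & 3 \\ -3 & 6 \end{bmatrix} = \begin{bmatrix} 45 & 0 \\ 0 & 45 \end{bmatrix}
\quad \text{and} \quad
A^T A = \begin{bmatrix} 6 & 3 \\ -3 & 6 \end{bmatrix} \begin{bmatrix} 6 & -3 \\ 3 & 6 \end{bmatrix} = \begin{bmatrix} 45 & 0 \\ 0 & 45 \end{bmatrix}
\]

Because $AA^T = A^T A$, the matrix $A$ is normal.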
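Normality of a real matrix is a single comparison of two products. The following Python sketch checks Example 2.14 and, for contrast, a matrix that fails the test. The helper names and the counterexample `N` are illustrative assumptions.

```python
def transpose(A):
    return [[A[j][i] for j in range(len(A))] for i in range(len(A[0]))]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def is_normal(A):
    """A real square matrix is normal when A A^T equals A^T A."""
    At = transpose(A)
    return mat_mul(A, At) == mat_mul(At, A)

A = [[6, -3], [3, 6]]     # Example 2.14
N = [[1, 1], [0, 1]]      # a matrix chosen here because it is not normal

print(mat_mul(A, transpose(A)), is_normal(A))
print(mat_mul(N, transpose(N)), mat_mul(transpose(N), N), is_normal(N))
```

Note that `A` here is neither symmetric, skew-symmetric, nor orthogonal (its rows have length $\sqrt{45}$), so it is one of the "other" normal matrices mentioned in the text.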

2.11 Complex Matrices

Let $A$ be a complex matrix, that is, a matrix with complex entries. Recall (Section 1.7) that if $z = a + bi$ is a complex number, then $\bar{z} = a - bi$ is its conjugate. The conjugate of a complex matrix $A$, written $\bar{A}$, is the matrix obtained from $A$ by taking the conjugate of each entry in $A$. That is, if $A = [a_{ij}]$, then $\bar{A} = [b_{ij}]$, where $b_{ij} = \bar{a}_{ij}$. (We denote this fact by writing $\bar{A} = [\bar{a}_{ij}]$.)

The two operations of transpose and conjugation commute for any complex matrix $A$, and the special notation $A^H$ is used for the conjugate transpose of $A$. That is,

\[ A^H = (\bar{A})^T = \overline{(A^T)} \]

Note that if $A$ is real, then $A^H = A^T$. [Some texts use $A^*$ instead of $A^H$.]

EXAMPLE 2.15 Let $A = \begin{bmatrix} 2 + 8i & 5 - 3i & 4 - 7i \\ 6i & 1 - 4i & 3 + 2i \end{bmatrix}$. Then $A^H = \begin{bmatrix} 2 - 8i & -6i \\ 5 + 3i & 1 + 4i \\ 4 + 7i & 3 - 2i \end{bmatrix}$.

Special Complex Matrices: Hermitian, Unitary, Normal [Optional until Chapter 12]
Consider a complex matrix $A$. The relationship between $A$ and its conjugate transpose $A^H$ yields important kinds of complex matrices (which are analogous to the kinds of real matrices described above).

A complex matrix $A$ is said to be Hermitian or skew-Hermitian according as to whether

\[ A^H = A \quad \text{or} \quad A^H = -A \]

Clearly, $A = [a_{ij}]$ is Hermitian if and only if symmetric elements are conjugate, that is, if each $a_{ij} = \bar{a}_{ji}$, in which case each diagonal element $a_{ii}$ must be real. Similarly, if $A$ is skew-Hermitian, then each $a_{ij} = -\bar{a}_{ji}$, so each diagonal element $a_{ii}$ satisfies $a_{ii} = -\bar{a}_{ii}$ and is therefore purely imaginary (possibly 0). (Note that $A$ must be square if $A^H = A$ or $A^H = -A$.)

A complex matrix $A$ is unitary if $A^H A = AA^H = I$, that is, if

\[ A^H = A^{-1} \]

Thus, $A$ must necessarily be square and invertible. We note that a complex matrix $A$ is unitary if and only if its rows (columns) form an orthonormal set relative to the dot product of complex vectors.

A complex matrix $A$ is said to be normal if it commutes with $A^H$, that is, if

\[ AA^H = A^H A \]

(Thus, $A$ must be a square matrix.) This definition reduces to that for real matrices when $A$ is real.

EXAMPLE 2.16 Consider the following complex matrices:

\[
A = \begin{bmatrix} 3 & 1 - 2i & 4 + 7i \\ 1 + 2i & -4 & -2i \\ 4 - 7i & 2i & 5 \end{bmatrix}, \quad
B = \frac{1}{2} \begin{bmatrix} 1 & -i & -1 + i \\ i & 1 & 1 + i \\ 1 + i & -1 + i & 0 \end{bmatrix}, \quad
C = \begin{bmatrix} 2 + 3i & 1 \\ i & 1 + 2i \end{bmatrix}
\]

(a) By inspection, the diagonal elements of $A$ are real, and the symmetric elements $1 - 2i$ and $1 + 2i$ are conjugate, $4 + 7i$ and $4 - 7i$ are conjugate, and $-2i$ and $2i$ are conjugate. Thus, $A$ is Hermitian.

(b) Multiplying $B$ by $B^H$ yields $I$; that is, $BB^H = I$. This implies $B^H B = I$, as well. Thus, $B^H = B^{-1}$, which means $B$ is unitary.

(c) To show $C$ is normal, we evaluate $CC^H$ and $C^H C$:

\[
CC^H = \begin{bmatrix} 2 + 3i & 1 \\ i & 1 + 2i \end{bmatrix} \begin{bmatrix} 2 - 3i & -i \\ 1 & 1 - 2i \end{bmatrix} = \begin{bmatrix} 14 & 4 - 4i \\ 4 + 4i & 6 \end{bmatrix}
\]

and similarly $C^H C = \begin{bmatrix} 14 & 4 - 4i \\ 4 + 4i & 6 \end{bmatrix}$. Because $CC^H = C^H C$, the complex matrix $C$ is normal.

We note that when a matrix $A$ is real, Hermitian is the same as symmetric, and unitary is the same as orthogonal.
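Python's built-in complex numbers (written with `j` for the imaginary unit) are enough to replay parts (a) and (c) of Example 2.16. The sketch below is illustrative only: the helper names `conj_transpose` and `mat_mul` are assumptions, and the comparisons are exact here only because every entry has small integer real and imaginary parts.

```python
def conj_transpose(A):
    """Return A^H: transpose A and conjugate every entry."""
    return [[A[j][i].conjugate() for j in range(len(A))] for i in range(len(A[0]))]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# Matrices A and C of Example 2.16.
A = [[3, 1 - 2j, 4 + 7j],
     [1 + 2j, -4, -2j],
     [4 - 7j, 2j, 5]]
C = [[2 + 3j, 1],
     [1j, 1 + 2j]]

A_is_hermitian = conj_transpose(A) == A          # A^H = A
C_CH = mat_mul(C, conj_transpose(C))             # C C^H
CH_C = mat_mul(conj_transpose(C), C)             # C^H C
C_is_normal = C_CH == CH_C

print(A_is_hermitian)
print(C_CH, C_is_normal)
```

For matrices with non-integer entries, an exact `==` comparison would be too strict under floating-point rounding, and a tolerance-based check would be the safer choice.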

2.12 Block Matrices

Using a system of horizontal and vertical (dashed) lines, we can partition a matrix $A$ into submatrices called blocks (or cells) of $A$. Clearly a given matrix may be divided into blocks in different ways. For example, the matrix

\[
\begin{bmatrix} 1 & -2 & 0 & 1 & 3 \\ 2 & 3 & 5 & 7 & -2 \\ 3 & 1 & 4 & 5 & 9 \\ 4 & 6 & -3 & 1 & 8 \end{bmatrix}
\]

can be partitioned in several ways, depending on where the horizontal and vertical lines are drawn.

The convenience of the partition of matrices, say $A$ and $B$, into blocks is that the result of operations on $A$ and $B$ can be obtained by carrying out the computation with the blocks, just as if they were the actual elements of the matrices. This is illustrated below, where the notation $A = [A_{ij}]$ will be used for a block matrix $A$ with blocks $A_{ij}$.

Suppose that $A = [A_{ij}]$ and $B = [B_{ij}]$ are block matrices with the same numbers of row and column blocks, and suppose that corresponding blocks have the same size. Then adding the corresponding blocks of $A$ and $B$ also adds the corresponding elements of $A$ and $B$, and multiplying each block of $A$ by a scalar $k$ multiplies each element of $A$ by $k$. Thus,

\[
A + B = \begin{bmatrix} A_{11} + B_{11} & A_{12} + B_{12} & \cdots & A_{1n} + B_{1n} \\ A_{21} + B_{21} & A_{22} + B_{22} & \cdots & A_{2n} + B_{2n} \\ \cdots & \cdots & \cdots & \cdots \\ A_{m1} + B_{m1} & A_{m2} + B_{m2} & \cdots & A_{mn} + B_{mn} \end{bmatrix}
\]

and

\[
kA = \begin{bmatrix} kA_{11} & kA_{12} & \cdots & kA_{1n} \\ kA_{21} & kA_{22} & \cdots & kA_{2n} \\ \cdots & \cdots & \cdots & \cdots \\ kA_{m1} & kA_{m2} & \cdots & kA_{mn} \end{bmatrix}
\]

The case of matrix multiplication is less obvious, but still true. That is, suppose that $U = [U_{ik}]$ and $V = [V_{kj}]$ are block matrices such that the number of columns of each block $U_{ik}$ is equal to the number of rows of each block $V_{kj}$. (Thus, each product $U_{ik} V_{kj}$ is defined.) Then

\[
UV = \begin{bmatrix} W_{11} & W_{12} & \cdots & W_{1n} \\ W_{21} & W_{22} & \cdots & W_{2n} \\ \cdots & \cdots & \cdots & \cdots \\ W_{m1} & W_{m2} & \cdots & W_{mn} \end{bmatrix},
\quad \text{where} \quad
W_{ij} = U_{i1} V_{1j} + U_{i2} V_{2j} + \cdots + U_{ip} V_{pj}
\]

The proof of the above formula for $UV$ is straightforward but detailed and lengthy. It is left as an exercise (Problem 2.85).
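The block formula $W_{ij} = \sum_k U_{ik} V_{kj}$ can be checked numerically on a small case. The Python sketch below partitions two $4 \times 4$ matrices into $2 \times 2$ grids of blocks, multiplies blockwise, and compares with the ordinary product. Everything here is an illustrative assumption: the matrices `U` and `V`, the cut positions, and the helper names `split` and `join`. The one requirement taken from the text is that the column cut of `U` matches the row cut of `V`.

```python
def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def split(M, r, c):
    """Partition M into a 2 x 2 grid of blocks, cutting after row r and column c."""
    return [[[row[:c] for row in M[:r]], [row[c:] for row in M[:r]]],
            [[row[:c] for row in M[r:]], [row[c:] for row in M[r:]]]]

def join(blocks):
    """Reassemble a grid of blocks into one ordinary matrix."""
    rows = []
    for block_row in blocks:
        for i in range(len(block_row[0])):           # each row inside this block row
            rows.append([x for blk in block_row for x in blk[i]])
    return rows

U = [[1, 2, 0, 1], [3, 4, 1, 0], [0, 1, 2, 3], [1, 0, 1, 1]]
V = [[2, 0, 1, 1], [1, 1, 0, 2], [0, 3, 1, 0], [1, 0, 2, 1]]

Ub = split(U, 1, 3)   # columns of U cut after column 3 ...
Vb = split(V, 3, 2)   # ... so rows of V must be cut after row 3

# W_ij = U_i1 V_1j + U_i2 V_2j, computed with whole blocks as the "entries"
Wb = [[mat_add(mat_mul(Ub[i][0], Vb[0][j]), mat_mul(Ub[i][1], Vb[1][j]))
       for j in range(2)] for i in range(2)]

print(join(Wb) == mat_mul(U, V))
```

The row cut of `U` and the column cut of `V` are unconstrained, which is why the blocks in this sketch are deliberately of unequal sizes.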

Square Block Matrices
Let $M$ be a block matrix. Then $M$ is called a square block matrix if

(i) $M$ is a square matrix.
(ii) The blocks form a square matrix.
(iii) The diagonal blocks are also square matrices.

The latter two conditions will occur if and only if there are the same number of horizontal and vertical lines and they are placed symmetrically.

Consider two block matrices $A$ and $B$ obtained by partitioning the same $5 \times 5$ array in two different ways:

\[
\begin{bmatrix} 1 & 2 & 3 & 4 & 5 \\ 1 & 1 & 1 & 1 & 1 \\ 9 & 8 & 7 & 6 & 5 \\ 4 & 4 & 4 & 4 & 4 \\ 3 & 5 & 3 & 5 & 3 \end{bmatrix}
\]

The block matrix $A$ is not a square block matrix, because the second and third diagonal blocks are not square. On the other hand, the block matrix $B$ is a square block matrix.

Block Diagonal Matrices
Let $M = [A_{ij}]$ be a square block matrix such that the nondiagonal blocks are all zero matrices; that is, $A_{ij} = 0$ when $i \neq j$. Then M is called a block diagonal matrix. We sometimes denote such a block diagonal matrix by writing

\[ M = \mathrm{diag}(A_{11}, A_{22}, \dots, A_{rr}) \qquad \text{or} \qquad M = A_{11} \oplus A_{22} \oplus \dots \oplus A_{rr} \]

The importance of block diagonal matrices is that the algebra of the block matrix is frequently reduced to the algebra of the individual blocks. Specifically, suppose f(x) is a polynomial and M is the above block diagonal matrix. Then f(M) is a block diagonal matrix, and

\[ f(M) = \mathrm{diag}(f(A_{11}), f(A_{22}), \dots, f(A_{rr})) \]

Also, M is invertible if and only if each $A_{ii}$ is invertible, and, in such a case, $M^{-1}$ is a block diagonal matrix, and

\[ M^{-1} = \mathrm{diag}(A_{11}^{-1}, A_{22}^{-1}, \dots, A_{rr}^{-1}) \]

Analogously, a square block matrix is called a block upper triangular matrix if the blocks below the diagonal are zero matrices and a block lower triangular matrix if the blocks above the diagonal are zero matrices.
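The statement about inverses of block diagonal matrices can be checked numerically. The following is a minimal sketch in plain Python with exact fractions; the blocks, their hand-computed inverses, and the helper names `matmul` and `block_diag` are illustrative assumptions, not part of the text.

```python
from fractions import Fraction

def matmul(X, Y):
    """Ordinary product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def block_diag(*blocks):
    """Assemble square blocks into one block diagonal matrix."""
    n = sum(len(b) for b in blocks)
    M = [[Fraction(0)] * n for _ in range(n)]
    offset = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, x in enumerate(row):
                M[offset + i][offset + j] = Fraction(x)
        offset += len(b)
    return M

# Illustrative diagonal blocks and their inverses (worked out by hand).
A11 = [[5, 3], [4, 2]]
A11_inv = [[-1, Fraction(3, 2)], [2, Fraction(-5, 2)]]
A22 = [[2]]
A22_inv = [[Fraction(1, 2)]]

M = block_diag(A11, A22)
M_inv = block_diag(A11_inv, A22_inv)      # diag of the block inverses
identity = [[Fraction(int(i == j)) for j in range(3)] for i in range(3)]
product = matmul(M, M_inv)                # should be the 3 x 3 identity
```

Multiplying in the other order gives the identity as well, which is what the formula for $M^{-1}$ predicts for this example.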

EXAMPLE 2.17 Determine which of the following square block matrices are upper triangular, lower triangular, or diagonal:

\[ A = \left[\begin{array}{cc|c} 1 & 2 & 0 \\ 3 & 4 & 5 \\ \hline 0 & 0 & 6 \end{array}\right], \quad B = \left[\begin{array}{c|cc|c} 1 & 0 & 0 & 0 \\ \hline 2 & 3 & 4 & 0 \\ 5 & 0 & 6 & 0 \\ \hline 0 & 7 & 8 & 9 \end{array}\right], \quad C = \left[\begin{array}{c|cc} 1 & 0 & 0 \\ \hline 0 & 2 & 3 \\ 0 & 4 & 5 \end{array}\right], \quad D = \left[\begin{array}{cc|c} 1 & 2 & 0 \\ 3 & 4 & 5 \\ \hline 0 & 6 & 7 \end{array}\right] \]

(a) A is upper triangular because the block below the diagonal is a zero block.
(b) B is lower triangular because all blocks above the diagonal are zero blocks.
(c) C is diagonal because the blocks above and below the diagonal are zero blocks.
(d) D is neither upper triangular nor lower triangular. Also, no other partitioning of D will make it into either a block upper triangular matrix or a block lower triangular matrix.

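SOLVED PROBLEMS

Matrix Addition and Scalar Multiplication

2.1. Given $A = \begin{bmatrix} 1 & -2 & 3 \\ 4 & 5 & -6 \end{bmatrix}$ and $B = \begin{bmatrix} 3 & 0 & 2 \\ -7 & 1 & 8 \end{bmatrix}$, find: (a) A + B, (b) 2A − 3B.

(a) Add the corresponding elements:

\[ A + B = \begin{bmatrix} 1 + 3 & -2 + 0 & 3 + 2 \\ 4 - 7 & 5 + 1 & -6 + 8 \end{bmatrix} = \begin{bmatrix} 4 & -2 & 5 \\ -3 & 6 & 2 \end{bmatrix} \]

(b) First perform the scalar multiplication and then a matrix addition:

\[ 2A - 3B = \begin{bmatrix} 2 & -4 & 6 \\ 8 & 10 & -12 \end{bmatrix} + \begin{bmatrix} -9 & 0 & -6 \\ 21 & -3 & -24 \end{bmatrix} = \begin{bmatrix} -7 & -4 & 0 \\ 29 & 7 & -36 \end{bmatrix} \]

(Note that we multiply B by −3 and then add, rather than multiplying B by 3 and subtracting. This usually prevents errors.)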

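2.2. Find x, y, z, t where

\[ 3\begin{bmatrix} x & y \\ z & t \end{bmatrix} = \begin{bmatrix} x & 6 \\ -1 & 2t \end{bmatrix} + \begin{bmatrix} 4 & x + y \\ z + t & 3 \end{bmatrix} \]

Write each side as a single matrix:

\[ \begin{bmatrix} 3x & 3y \\ 3z & 3t \end{bmatrix} = \begin{bmatrix} x + 4 & x + y + 6 \\ z + t - 1 & 2t + 3 \end{bmatrix} \]

Set corresponding entries equal to each other to obtain the following system of four equations:

\[ 3x = x + 4, \quad 3y = x + y + 6, \quad 3z = z + t - 1, \quad 3t = 2t + 3 \]

or

\[ 2x = 4, \quad 2y = 6 + x, \quad 2z = t - 1, \quad t = 3 \]

The solution is x = 2, y = 4, z = 1, t = 3.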

2.3. Prove Theorem 2.1 (i) and (v): (i) (A + B) + C = A + (B + C), (v) k(A + B) = kA + kB.

Suppose $A = [a_{ij}]$, $B = [b_{ij}]$, $C = [c_{ij}]$. The proof reduces to showing that corresponding ij-entries in each side of each matrix equation are equal. [We prove only (i) and (v), because the other parts of Theorem 2.1 are proved similarly.]

(i) The ij-entry of A + B is $a_{ij} + b_{ij}$; hence, the ij-entry of (A + B) + C is $(a_{ij} + b_{ij}) + c_{ij}$. On the other hand, the ij-entry of B + C is $b_{ij} + c_{ij}$; hence, the ij-entry of A + (B + C) is $a_{ij} + (b_{ij} + c_{ij})$. However, for scalars in K,

\[ (a_{ij} + b_{ij}) + c_{ij} = a_{ij} + (b_{ij} + c_{ij}) \]

Thus, (A + B) + C and A + (B + C) have identical ij-entries. Therefore, (A + B) + C = A + (B + C).

(v) The ij-entry of A + B is $a_{ij} + b_{ij}$; hence, $k(a_{ij} + b_{ij})$ is the ij-entry of k(A + B). On the other hand, the ij-entries of kA and kB are $ka_{ij}$ and $kb_{ij}$, respectively. Thus, $ka_{ij} + kb_{ij}$ is the ij-entry of kA + kB. However, for scalars in K,

\[ k(a_{ij} + b_{ij}) = ka_{ij} + kb_{ij} \]

Thus, k(A + B) and kA + kB have identical ij-entries. Therefore, k(A + B) = kA + kB.

Matrix Multiplication

2.4. Calculate:

\[ \text{(a)}\ [8, -4, 5]\begin{bmatrix} 3 \\ 2 \\ -1 \end{bmatrix}, \qquad \text{(b)}\ [6, -1, 7, 5]\begin{bmatrix} 4 \\ -9 \\ -3 \\ 2 \end{bmatrix}, \qquad \text{(c)}\ [3, 8, -2, 4]\begin{bmatrix} 5 \\ -1 \\ 6 \end{bmatrix} \]

(a) Multiply the corresponding entries and add:

\[ [8, -4, 5]\begin{bmatrix} 3 \\ 2 \\ -1 \end{bmatrix} = 8(3) + (-4)(2) + 5(-1) = 24 - 8 - 5 = 11 \]

(b) Multiply the corresponding entries and add:

\[ [6, -1, 7, 5]\begin{bmatrix} 4 \\ -9 \\ -3 \\ 2 \end{bmatrix} = 24 + 9 - 21 + 10 = 22 \]

(c) The product is not defined when the row matrix and the column matrix have different numbers of elements.

2.5. Let (r × s) denote an r × s matrix. Find the sizes of those matrix products that are defined:
(a) (2 × 3)(3 × 4), (b) (4 × 1)(1 × 2), (c) (1 × 2)(3 × 1), (d) (5 × 2)(2 × 3), (e) (4 × 4)(3 × 3), (f) (2 × 2)(2 × 4)

In each case, the product is defined if the inner numbers are equal, and then the product will have the size of the outer numbers in the given order.
(a) 2 × 4, (b) 4 × 2, (c) not defined, (d) 5 × 3, (e) not defined, (f) 2 × 4

2.6. Let $A = \begin{bmatrix} 1 & 3 \\ 2 & -1 \end{bmatrix}$ and $B = \begin{bmatrix} 2 & 0 & -4 \\ 3 & -2 & 6 \end{bmatrix}$. Find: (a) AB, (b) BA.

(a) Because A is a 2 × 2 matrix and B a 2 × 3 matrix, the product AB is defined and is a 2 × 3 matrix. To obtain the entries in the first row of AB, multiply the first row [1, 3] of A by the columns $\begin{bmatrix} 2 \\ 3 \end{bmatrix}$, $\begin{bmatrix} 0 \\ -2 \end{bmatrix}$, $\begin{bmatrix} -4 \\ 6 \end{bmatrix}$ of B, respectively:

\[ \text{first row of } AB = [\,2 + 9,\ 0 - 6,\ -4 + 18\,] = [\,11,\ -6,\ 14\,] \]

To obtain the entries in the second row of AB, multiply the second row [2, −1] of A by the columns of B:

\[ \text{second row of } AB = [\,4 - 3,\ 0 + 2,\ -8 - 6\,] = [\,1,\ 2,\ -14\,] \]

Thus,

\[ AB = \begin{bmatrix} 11 & -6 & 14 \\ 1 & 2 & -14 \end{bmatrix} \]

(b) The size of B is 2 × 3 and that of A is 2 × 2. The inner numbers 3 and 2 are not equal; hence, the product BA is not defined.
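The rule "inner numbers must agree" from Problems 2.5 and 2.6 can be written as a short routine. This is only a sketch in plain Python; the function name `matmul` and the choice to signal an undefined product with `ValueError` are assumptions made for illustration.

```python
def matmul(X, Y):
    """Product of an (r x s) matrix X and an (s x t) matrix Y, as lists of rows."""
    rows_x, cols_x = len(X), len(X[0])
    rows_y, cols_y = len(Y), len(Y[0])
    if cols_x != rows_y:
        # The inner numbers differ, so the product is not defined.
        raise ValueError(f"({rows_x} x {cols_x})({rows_y} x {cols_y}) is not defined")
    # Entry (i, j) is row i of X times column j of Y.
    return [[sum(X[i][k] * Y[k][j] for k in range(cols_x)) for j in range(cols_y)]
            for i in range(rows_x)]

A = [[1, 3], [2, -1]]
B = [[2, 0, -4], [3, -2, 6]]

AB = matmul(A, B)            # (2 x 2)(2 x 3) gives a 2 x 3 matrix

try:
    matmul(B, A)             # (2 x 3)(2 x 2): inner numbers 3 and 2 differ
    BA_defined = True
except ValueError:
    BA_defined = False
```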

2.7. Find AB, where $A = \begin{bmatrix} 2 & 3 & -1 \\ 4 & -2 & 5 \end{bmatrix}$ and $B = \begin{bmatrix} 2 & -1 & 0 & 6 \\ 1 & 3 & -5 & 1 \\ 4 & 1 & -2 & 2 \end{bmatrix}$.

Because A is a 2 × 3 matrix and B a 3 × 4 matrix, the product AB is defined and is a 2 × 4 matrix. Multiply the rows of A by the columns of B to obtain

\[ AB = \begin{bmatrix} 4 + 3 - 4 & -2 + 9 - 1 & 0 - 15 + 2 & 12 + 3 - 2 \\ 8 - 2 + 20 & -4 - 6 + 5 & 0 + 10 - 10 & 24 - 2 + 10 \end{bmatrix} = \begin{bmatrix} 3 & 6 & -13 & 13 \\ 26 & -5 & 0 & 32 \end{bmatrix} \]

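2.8. Find:

\[ \text{(a)}\ \begin{bmatrix} 1 & 6 \\ -3 & 5 \end{bmatrix}\begin{bmatrix} 2 \\ -7 \end{bmatrix}, \qquad \text{(b)}\ \begin{bmatrix} 2 \\ -7 \end{bmatrix}\begin{bmatrix} 1 & 6 \\ -3 & 5 \end{bmatrix}, \qquad \text{(c)}\ [2, -7]\begin{bmatrix} 1 & 6 \\ -3 & 5 \end{bmatrix} \]

(a) The first factor is 2 × 2 and the second is 2 × 1, so the product is defined as a 2 × 1 matrix:

\[ \begin{bmatrix} 1 & 6 \\ -3 & 5 \end{bmatrix}\begin{bmatrix} 2 \\ -7 \end{bmatrix} = \begin{bmatrix} 2 - 42 \\ -6 - 35 \end{bmatrix} = \begin{bmatrix} -40 \\ -41 \end{bmatrix} \]

(b) The product is not defined, because the first factor is 2 × 1 and the second factor is 2 × 2.

(c) The first factor is 1 × 2 and the second factor is 2 × 2, so the product is defined as a 1 × 2 (row) matrix:

\[ [2, -7]\begin{bmatrix} 1 & 6 \\ -3 & 5 \end{bmatrix} = [\,2 + 21,\ 12 - 35\,] = [\,23,\ -23\,] \]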

2.9. Clearly, 0A = 0 and A0 = 0, where the 0's are zero matrices (with possibly different sizes). Find matrices A and B with no zero entries such that AB = 0.

Let $A = \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix}$ and $B = \begin{bmatrix} 6 & 2 \\ -3 & -1 \end{bmatrix}$. Then $AB = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$.

2.10. Prove Theorem 2.2(i): (AB)C = A(BC).

Let $A = [a_{ij}]$, $B = [b_{jk}]$, $C = [c_{kl}]$, and let $AB = S = [s_{ik}]$, $BC = T = [t_{jl}]$. Then

\[ s_{ik} = \sum_{j=1}^{m} a_{ij}b_{jk} \qquad \text{and} \qquad t_{jl} = \sum_{k=1}^{n} b_{jk}c_{kl} \]

Multiplying S = AB by C, the il-entry of (AB)C is

\[ s_{i1}c_{1l} + s_{i2}c_{2l} + \dots + s_{in}c_{nl} = \sum_{k=1}^{n} s_{ik}c_{kl} = \sum_{k=1}^{n}\sum_{j=1}^{m} (a_{ij}b_{jk})c_{kl} \]

On the other hand, multiplying A by T = BC, the il-entry of A(BC) is

\[ a_{i1}t_{1l} + a_{i2}t_{2l} + \dots + a_{im}t_{ml} = \sum_{j=1}^{m} a_{ij}t_{jl} = \sum_{j=1}^{m}\sum_{k=1}^{n} a_{ij}(b_{jk}c_{kl}) \]

The above sums are equal; that is, corresponding elements in (AB)C and A(BC) are equal. Thus, (AB)C = A(BC).

2.11. Prove Theorem 2.2(ii): A(B + C) = AB + AC.

Let $A = [a_{ij}]$, $B = [b_{jk}]$, $C = [c_{jk}]$, and let $D = B + C = [d_{jk}]$, $E = AB = [e_{ik}]$, $F = AC = [f_{ik}]$. Then

\[ d_{jk} = b_{jk} + c_{jk}, \qquad e_{ik} = \sum_{j=1}^{m} a_{ij}b_{jk}, \qquad f_{ik} = \sum_{j=1}^{m} a_{ij}c_{jk} \]

Thus, the ik-entry of the matrix AB + AC is

\[ e_{ik} + f_{ik} = \sum_{j=1}^{m} a_{ij}b_{jk} + \sum_{j=1}^{m} a_{ij}c_{jk} = \sum_{j=1}^{m} a_{ij}(b_{jk} + c_{jk}) \]

On the other hand, the ik-entry of the matrix AD = A(B + C) is

\[ a_{i1}d_{1k} + a_{i2}d_{2k} + \dots + a_{im}d_{mk} = \sum_{j=1}^{m} a_{ij}d_{jk} = \sum_{j=1}^{m} a_{ij}(b_{jk} + c_{jk}) \]

Thus, A(B + C) = AB + AC, because the corresponding elements are equal.

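Transpose

2.12. Find the transpose of each matrix:

\[ A = \begin{bmatrix} 1 & -2 & 3 \\ 7 & 8 & -9 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 4 & 5 \\ 3 & 5 & 6 \end{bmatrix}, \quad C = [1, -3, 5, -7], \quad D = \begin{bmatrix} 2 \\ -4 \\ 6 \end{bmatrix} \]

Rewrite the rows of each matrix as columns to obtain the transpose of the matrix:

\[ A^T = \begin{bmatrix} 1 & 7 \\ -2 & 8 \\ 3 & -9 \end{bmatrix}, \quad B^T = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 4 & 5 \\ 3 & 5 & 6 \end{bmatrix}, \quad C^T = \begin{bmatrix} 1 \\ -3 \\ 5 \\ -7 \end{bmatrix}, \quad D^T = [2, -4, 6] \]

(Note that $B^T = B$; such a matrix is said to be symmetric. Note also that the transpose of the row vector C is a column vector, and the transpose of the column vector D is a row vector.)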

2.13. Prove Theorem 2.3(iv): $(AB)^T = B^T A^T$.

Let $A = [a_{ik}]$ and $B = [b_{kj}]$. Then the ij-entry of AB is

\[ a_{i1}b_{1j} + a_{i2}b_{2j} + \dots + a_{im}b_{mj} \]

This is the ji-entry (reverse order) of $(AB)^T$. Now column j of B becomes row j of $B^T$, and row i of A becomes column i of $A^T$. Thus, the ji-entry of $B^T A^T$ is

\[ [b_{1j}, b_{2j}, \dots, b_{mj}]\,[a_{i1}, a_{i2}, \dots, a_{im}]^T = b_{1j}a_{i1} + b_{2j}a_{i2} + \dots + b_{mj}a_{im} \]

Thus, $(AB)^T = B^T A^T$, because the corresponding entries are equal.
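The identity $(AB)^T = B^T A^T$ can be spot-checked on the matrices of Problem 2.7. The sketch below uses plain Python; the helper names `matmul` and `transpose` are illustrative, and a single numerical check is of course not a substitute for the proof.

```python
def matmul(X, Y):
    """Ordinary product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    """Rewrite the rows of X as columns."""
    return [list(col) for col in zip(*X)]

# Matrices from Problem 2.7: A is 2 x 3 and B is 3 x 4.
A = [[2, 3, -1], [4, -2, 5]]
B = [[2, -1, 0, 6], [1, 3, -5, 1], [4, 1, -2, 2]]

lhs = transpose(matmul(A, B))                 # (AB)^T, a 4 x 2 matrix
rhs = matmul(transpose(B), transpose(A))      # B^T A^T, also 4 x 2
```

Note that the product in the original order, $A^T B^T$, is not even defined here, because $A^T$ is 3 × 2 and $B^T$ is 4 × 3.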

Square Matrices

2.14. Find the diagonal and trace of each matrix:

\[ \text{(a)}\ A = \begin{bmatrix} 1 & 3 & 6 \\ 2 & -5 & 8 \\ 4 & -2 & 9 \end{bmatrix}, \qquad \text{(b)}\ B = \begin{bmatrix} 2 & 4 & 8 \\ 3 & -7 & 9 \\ -5 & 0 & 2 \end{bmatrix}, \qquad \text{(c)}\ C = \begin{bmatrix} 1 & 2 & -3 \\ 4 & -5 & 6 \end{bmatrix} \]

(a) The diagonal of A consists of the elements from the upper left corner of A to the lower right corner of A or, in other words, the elements $a_{11}$, $a_{22}$, $a_{33}$. Thus, the diagonal of A consists of the numbers 1, −5, and 9. The trace of A is the sum of the diagonal elements. Thus,

\[ \mathrm{tr}(A) = 1 - 5 + 9 = 5 \]

(b) The diagonal of B consists of the numbers 2, −7, and 2. Hence,

\[ \mathrm{tr}(B) = 2 - 7 + 2 = -3 \]

(c) The diagonal and trace are only defined for square matrices.

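2.15. Let $A = \begin{bmatrix} 1 & 2 \\ 4 & -3 \end{bmatrix}$, and let $f(x) = 2x^3 - 4x + 5$ and $g(x) = x^2 + 2x - 11$. Find (a) $A^2$, (b) $A^3$, (c) f(A), (d) g(A).

(a) \[ A^2 = AA = \begin{bmatrix} 1 & 2 \\ 4 & -3 \end{bmatrix}\begin{bmatrix} 1 & 2 \\ 4 & -3 \end{bmatrix} = \begin{bmatrix} 1 + 8 & 2 - 6 \\ 4 - 12 & 8 + 9 \end{bmatrix} = \begin{bmatrix} 9 & -4 \\ -8 & 17 \end{bmatrix} \]

(b) \[ A^3 = AA^2 = \begin{bmatrix} 1 & 2 \\ 4 & -3 \end{bmatrix}\begin{bmatrix} 9 & -4 \\ -8 & 17 \end{bmatrix} = \begin{bmatrix} 9 - 16 & -4 + 34 \\ 36 + 24 & -16 - 51 \end{bmatrix} = \begin{bmatrix} -7 & 30 \\ 60 & -67 \end{bmatrix} \]

(c) First substitute A for x and 5I for the constant in f(x), obtaining

\[ f(A) = 2A^3 - 4A + 5I = 2\begin{bmatrix} -7 & 30 \\ 60 & -67 \end{bmatrix} - 4\begin{bmatrix} 1 & 2 \\ 4 & -3 \end{bmatrix} + 5\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \]

Now perform the scalar multiplication and then the matrix addition:

\[ f(A) = \begin{bmatrix} -14 & 60 \\ 120 & -134 \end{bmatrix} + \begin{bmatrix} -4 & -8 \\ -16 & 12 \end{bmatrix} + \begin{bmatrix} 5 & 0 \\ 0 & 5 \end{bmatrix} = \begin{bmatrix} -13 & 52 \\ 104 & -117 \end{bmatrix} \]

(d) Substitute A for x and −11I for the constant in g(x), and then calculate as follows:

\[ g(A) = A^2 + 2A - 11I = \begin{bmatrix} 9 & -4 \\ -8 & 17 \end{bmatrix} + 2\begin{bmatrix} 1 & 2 \\ 4 & -3 \end{bmatrix} - 11\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 9 & -4 \\ -8 & 17 \end{bmatrix} + \begin{bmatrix} 2 & 4 \\ 8 & -6 \end{bmatrix} + \begin{bmatrix} -11 & 0 \\ 0 & -11 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \]

Because g(A) is the zero matrix, A is a root of the polynomial g(x).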
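The evaluation of a polynomial at a square matrix can also be sketched in code. The version below uses Horner's rule rather than the term-by-term computation of the solution above (an alternative technique, chosen here because it avoids forming the powers of A separately); the name `poly_at` and the convention that coefficients run from the highest power down to the constant term are assumptions of this sketch.

```python
def matmul(X, Y):
    """Ordinary product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def poly_at(coeffs, A):
    """Evaluate c0*A^d + c1*A^(d-1) + ... + cd*I by Horner's rule."""
    n = len(A)
    result = [[0] * n for _ in range(n)]
    for c in coeffs:
        result = matmul(result, A)       # multiply the running value by A
        for i in range(n):
            result[i][i] += c            # add c*I (the constant enters as a scalar matrix)
    return result

A = [[1, 2], [4, -3]]
f_A = poly_at([2, 0, -4, 5], A)          # f(x) = 2x^3 - 4x + 5
g_A = poly_at([1, 2, -11], A)            # g(x) = x^2 + 2x - 11
```

Here `g_A` comes out as the zero matrix, in agreement with the observation that A is a root of g(x).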

2.16. Let $A = \begin{bmatrix} 1 & 3 \\ 4 & -3 \end{bmatrix}$. (a) Find a nonzero column vector $u = \begin{bmatrix} x \\ y \end{bmatrix}$ such that Au = 3u. (b) Describe all such vectors.

(a) First set up the matrix equation Au = 3u, and then write each side as a single matrix (column vector) as follows:

\[ \begin{bmatrix} 1 & 3 \\ 4 & -3 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = 3\begin{bmatrix} x \\ y \end{bmatrix}, \qquad \text{and then} \qquad \begin{bmatrix} x + 3y \\ 4x - 3y \end{bmatrix} = \begin{bmatrix} 3x \\ 3y \end{bmatrix} \]

Set the corresponding elements equal to each other to obtain a system of equations:

\[ \begin{aligned} x + 3y &= 3x \\ 4x - 3y &= 3y \end{aligned} \qquad \text{or} \qquad \begin{aligned} 2x - 3y &= 0 \\ 4x - 6y &= 0 \end{aligned} \qquad \text{or} \qquad 2x - 3y = 0 \]

The system reduces to one nondegenerate linear equation in two unknowns, and so has an infinite number of solutions. To obtain a nonzero solution, let, say, y = 2; then x = 3. Thus, $u = (3, 2)^T$ is a desired nonzero vector.

(b) To find the general solution, set y = a, where a is a parameter. Substitute y = a into 2x − 3y = 0 to obtain $x = \tfrac{3}{2}a$. Thus, $u = (\tfrac{3}{2}a, a)^T$ represents all such solutions.
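As a quick check of the general solution in Problem 2.16, the following sketch verifies Au = 3u for a few parameter values using exact fractions. The helper names `matvec` and `solution` and the sample values of a are illustrative choices.

```python
from fractions import Fraction

def matvec(A, u):
    """Product of a matrix (list of rows) with a column vector (list)."""
    return [sum(a * x for a, x in zip(row, u)) for row in A]

A = [[1, 3], [4, -3]]

def solution(a):
    """The vector u = ((3/2)a, a) from part (b)."""
    return [Fraction(3, 2) * a, Fraction(a)]

# Au should equal 3u for every value of the parameter a.
checks = [matvec(A, solution(a)) == [3 * x for x in solution(a)]
          for a in (2, -1, 5, Fraction(1, 3))]
```

With a = 2 the function returns the particular vector (3, 2) found in part (a).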

Invertible Matrices, Inverses

2.17. Show that $A = \begin{bmatrix} 1 & 0 & 2 \\ 2 & -1 & 3 \\ 4 & 1 & 8 \end{bmatrix}$ and $B = \begin{bmatrix} -11 & 2 & 2 \\ -4 & 0 & 1 \\ 6 & -1 & -1 \end{bmatrix}$ are inverses.

Compute the product AB, obtaining

\[ AB = \begin{bmatrix} -11 + 0 + 12 & 2 + 0 - 2 & 2 + 0 - 2 \\ -22 + 4 + 18 & 4 + 0 - 3 & 4 - 1 - 3 \\ -44 - 4 + 48 & 8 + 0 - 8 & 8 + 1 - 8 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = I \]

Because AB = I, we can conclude (Theorem 3.16) that BA = I. Accordingly, A and B are inverses.

2.18. Find the inverse, if possible, of each matrix:

\[ \text{(a)}\ A = \begin{bmatrix} 5 & 3 \\ 4 & 2 \end{bmatrix}, \qquad \text{(b)}\ B = \begin{bmatrix} 2 & -3 \\ 1 & 3 \end{bmatrix}, \qquad \text{(c)}\ C = \begin{bmatrix} -2 & 6 \\ 3 & -9 \end{bmatrix} \]

Use the formula for the inverse of a 2 × 2 matrix appearing in Section 2.9.

(a) First find |A| = 5(2) − 3(4) = 10 − 12 = −2. Next interchange the diagonal elements, take the negatives of the nondiagonal elements, and multiply by 1/|A|:

\[ A^{-1} = -\frac{1}{2}\begin{bmatrix} 2 & -3 \\ -4 & 5 \end{bmatrix} = \begin{bmatrix} -1 & \tfrac{3}{2} \\ 2 & -\tfrac{5}{2} \end{bmatrix} \]

(b) First find |B| = 2(3) − (−3)(1) = 6 + 3 = 9. Next interchange the diagonal elements, take the negatives of the nondiagonal elements, and multiply by 1/|B|:

\[ B^{-1} = \frac{1}{9}\begin{bmatrix} 3 & 3 \\ -1 & 2 \end{bmatrix} = \begin{bmatrix} \tfrac{1}{3} & \tfrac{1}{3} \\ -\tfrac{1}{9} & \tfrac{2}{9} \end{bmatrix} \]

(c) First find |C| = −2(−9) − 6(3) = 18 − 18 = 0. Because |C| = 0, C has no inverse.
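The 2 × 2 recipe used above (interchange the diagonal elements, negate the nondiagonal elements, divide by the determinant) translates directly into code. This is a minimal sketch with exact fractions; the function name `inverse_2x2` and the convention of returning `None` for a singular matrix are assumptions made for illustration.

```python
from fractions import Fraction

def inverse_2x2(M):
    """Inverse of [[a, b], [c, d]] by the 2 x 2 formula, or None if |M| = 0."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        return None                       # no inverse exists
    # Swap the diagonal entries, negate the others, divide by the determinant.
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

A_inv = inverse_2x2([[5, 3], [4, 2]])
B_inv = inverse_2x2([[2, -3], [1, 3]])
C_inv = inverse_2x2([[-2, 6], [3, -9]])   # determinant is 0
```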

2.19. Let $A = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 2 \\ 1 & 2 & 4 \end{bmatrix}$. Find $A^{-1} = \begin{bmatrix} x_1 & x_2 & x_3 \\ y_1 & y_2 & y_3 \\ z_1 & z_2 & z_3 \end{bmatrix}$.

Multiplying A by $A^{-1}$ and setting the nine entries equal to the nine entries of the identity matrix I yields the following three systems of three equations in three of the unknowns:

\[ \begin{aligned} x_1 + y_1 + z_1 &= 1 \\ y_1 + 2z_1 &= 0 \\ x_1 + 2y_1 + 4z_1 &= 0 \end{aligned} \qquad \begin{aligned} x_2 + y_2 + z_2 &= 0 \\ y_2 + 2z_2 &= 1 \\ x_2 + 2y_2 + 4z_2 &= 0 \end{aligned} \qquad \begin{aligned} x_3 + y_3 + z_3 &= 0 \\ y_3 + 2z_3 &= 0 \\ x_3 + 2y_3 + 4z_3 &= 1 \end{aligned} \]

[Note that A is the coefficient matrix for all three systems.] Solving the three systems for the nine unknowns yields

\[ x_1 = 0,\ y_1 = 2,\ z_1 = -1; \qquad x_2 = -2,\ y_2 = 3,\ z_2 = -1; \qquad x_3 = 1,\ y_3 = -2,\ z_3 = 1 \]

Thus,

\[ A^{-1} = \begin{bmatrix} 0 & -2 & 1 \\ 2 & 3 & -2 \\ -1 & -1 & 1 \end{bmatrix} \]

(Remark: Chapter 3 gives an efficient way to solve the three systems.)
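Since all three systems share the coefficient matrix A, they can be solved together by row reducing the augmented array [A | I]. The sketch below is one possible implementation of that idea (Gauss-Jordan elimination with exact fractions); it is not taken from the text, and the function name `inverse` and the use of `ValueError` for a singular matrix are assumptions.

```python
from fractions import Fraction

def inverse(A):
    """Invert a square matrix by row reducing [A | I]; raise ValueError if singular."""
    n = len(A)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(A)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]          # scale the pivot row to get a 1
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]                   # right half is the inverse

A_inv = inverse([[1, 1, 1], [0, 1, 2], [1, 2, 4]])
```

The columns of the result are exactly the three solution vectors $(x_k, y_k, z_k)$ listed above.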

2.20. Let A and B be invertible matrices (with the same size). Show that AB is also invertible and $(AB)^{-1} = B^{-1}A^{-1}$. [Thus, by induction, $(A_1 A_2 \dots A_m)^{-1} = A_m^{-1} \dots A_2^{-1} A_1^{-1}$.]

Using the associativity of matrix multiplication, we get

\[ (AB)(B^{-1}A^{-1}) = A(BB^{-1})A^{-1} = AIA^{-1} = AA^{-1} = I \]
\[ (B^{-1}A^{-1})(AB) = B^{-1}(A^{-1}A)B = B^{-1}IB = B^{-1}B = I \]

Thus, $(AB)^{-1} = B^{-1}A^{-1}$.

Diagonal and Triangular Matrices

2.21. Write out the diagonal matrices A = diag(4, −3, 7), B = diag(2, −6), C = diag(3, −8, 0, 5).

Put the given scalars on the diagonal and 0's elsewhere:

\[ A = \begin{bmatrix} 4 & 0 & 0 \\ 0 & -3 & 0 \\ 0 & 0 & 7 \end{bmatrix}, \qquad B = \begin{bmatrix} 2 & 0 \\ 0 & -6 \end{bmatrix}, \qquad C = \begin{bmatrix} 3 & 0 & 0 & 0 \\ 0 & -8 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 5 \end{bmatrix} \]

2.22. Let A = diag(2, 3, 5) and B = diag(7, 0, −4). Find (a) AB, $A^2$, $B^2$; (b) f(A), where $f(x) = x^2 + 3x - 2$; (c) $A^{-1}$ and $B^{-1}$.

(a) The product matrix AB is a diagonal matrix obtained by multiplying corresponding diagonal entries; hence,

\[ AB = \mathrm{diag}(2(7),\ 3(0),\ 5(-4)) = \mathrm{diag}(14, 0, -20) \]

Thus, the squares $A^2$ and $B^2$ are obtained by squaring each diagonal entry; hence,

\[ A^2 = \mathrm{diag}(2^2, 3^2, 5^2) = \mathrm{diag}(4, 9, 25) \qquad \text{and} \qquad B^2 = \mathrm{diag}(49, 0, 16) \]

(b) f(A) is a diagonal matrix obtained by evaluating f(x) at each diagonal entry. We have

\[ f(2) = 4 + 6 - 2 = 8, \qquad f(3) = 9 + 9 - 2 = 16, \qquad f(5) = 25 + 15 - 2 = 38 \]

Thus, f(A) = diag(8, 16, 38).

(c) The inverse of a diagonal matrix is a diagonal matrix obtained by taking the inverse (reciprocal) of each diagonal entry. Thus, $A^{-1} = \mathrm{diag}(\tfrac{1}{2}, \tfrac{1}{3}, \tfrac{1}{5})$, but B has no inverse because there is a 0 on the diagonal.

2.23. Find a 2 × 2 matrix A such that $A^2$ is diagonal but A is not.

Let $A = \begin{bmatrix} 1 & 2 \\ 3 & -1 \end{bmatrix}$. Then $A^2 = \begin{bmatrix} 7 & 0 \\ 0 & 7 \end{bmatrix}$, which is diagonal.

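2.24. Find an upper triangular matrix A such that $A^3 = \begin{bmatrix} 8 & -57 \\ 0 & 27 \end{bmatrix}$.

Set $A = \begin{bmatrix} x & y \\ 0 & z \end{bmatrix}$. Then $x^3 = 8$, so x = 2; and $z^3 = 27$, so z = 3. Next calculate $A^3$ using x = 2 and z = 3:

\[ A^2 = \begin{bmatrix} 2 & y \\ 0 & 3 \end{bmatrix}\begin{bmatrix} 2 & y \\ 0 & 3 \end{bmatrix} = \begin{bmatrix} 4 & 5y \\ 0 & 9 \end{bmatrix} \qquad \text{and} \qquad A^3 = \begin{bmatrix} 2 & y \\ 0 & 3 \end{bmatrix}\begin{bmatrix} 4 & 5y \\ 0 & 9 \end{bmatrix} = \begin{bmatrix} 8 & 19y \\ 0 & 27 \end{bmatrix} \]

Thus, 19y = −57, or y = −3. Accordingly, $A = \begin{bmatrix} 2 & -3 \\ 0 & 3 \end{bmatrix}$.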

2.25. Let $A = [a_{ij}]$ and $B = [b_{ij}]$ be upper triangular matrices. Prove that AB is upper triangular with diagonal $a_{11}b_{11}, a_{22}b_{22}, \dots, a_{nn}b_{nn}$.

Let $AB = [c_{ij}]$. Then $c_{ij} = \sum_{k=1}^{n} a_{ik}b_{kj}$ and $c_{ii} = \sum_{k=1}^{n} a_{ik}b_{ki}$. Suppose i > j. Then, for any k, either i > k or k > j, so that either $a_{ik} = 0$ or $b_{kj} = 0$. Thus, $c_{ij} = 0$, and AB is upper triangular. Suppose i = j. Then, for k < i, we have $a_{ik} = 0$; and, for k > i, we have $b_{ki} = 0$. Hence, $c_{ii} = a_{ii}b_{ii}$, as claimed. [This proves one part of Theorem 2.5(i); the statements for A + B and kA are left as exercises.]

Special Real Matrices: Symmetric and Orthogonal

2.26. Determine whether or not each of the following matrices is symmetric (that is, $A^T = A$) or skew-symmetric (that is, $A^T = -A$):

\[ \text{(a)}\ A = \begin{bmatrix} 5 & -7 & 1 \\ -7 & 8 & 2 \\ 1 & 2 & -4 \end{bmatrix}, \qquad \text{(b)}\ B = \begin{bmatrix} 0 & 4 & -3 \\ -4 & 0 & 5 \\ 3 & -5 & 0 \end{bmatrix}, \qquad \text{(c)}\ C = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \]

(a) By inspection, the symmetric elements (mirror images in the diagonal) are −7 and −7, 1 and 1, 2 and 2. Thus, A is symmetric, because symmetric elements are equal.
(b) By inspection, the diagonal elements are all 0, and the symmetric elements, 4 and −4, −3 and 3, and 5 and −5, are negatives of each other. Hence, B is skew-symmetric.
(c) Because C is not square, C is neither symmetric nor skew-symmetric.
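The inspection in Problem 2.26 amounts to comparing a matrix with its transpose. A minimal sketch follows; the function name `classify` and its string labels are hypothetical. A square zero matrix satisfies both definitions, and this sketch simply reports it as symmetric.

```python
def transpose(X):
    """Rewrite the rows of X as columns."""
    return [list(col) for col in zip(*X)]

def classify(M):
    """Return 'symmetric', 'skew-symmetric', or 'neither' for a list-of-rows matrix."""
    if len(M) != len(M[0]):
        return "neither"                         # only square matrices qualify
    T = transpose(M)
    if T == M:
        return "symmetric"                       # A^T = A
    if T == [[-x for x in row] for row in M]:
        return "skew-symmetric"                  # A^T = -A
    return "neither"

A = [[5, -7, 1], [-7, 8, 2], [1, 2, -4]]
B = [[0, 4, -3], [-4, 0, 5], [3, -5, 0]]
C = [[0, 0, 0], [0, 0, 0]]
results = [classify(A), classify(B), classify(C)]
```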

2.27. Suppose $B = \begin{bmatrix} 4 & x + 2 \\ 2x - 3 & x + 1 \end{bmatrix}$ is symmetric. Find x and B.

Set the symmetric elements x + 2 and 2x − 3 equal to each other, obtaining 2x − 3 = x + 2 or x = 5. Hence, $B = \begin{bmatrix} 4 & 7 \\ 7 & 6 \end{bmatrix}$.

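2.28. Let A be an arbitrary 2 × 2 (real) orthogonal matrix.
(a) Prove: If (a, b) is the first row of A, then $a^2 + b^2 = 1$ and

\[ A = \begin{bmatrix} a & b \\ -b & a \end{bmatrix} \qquad \text{or} \qquad A = \begin{bmatrix} a & b \\ b & -a \end{bmatrix} \]

(b) Prove Theorem 2.7: For some real number θ,

\[ A = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} \qquad \text{or} \qquad A = \begin{bmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{bmatrix} \]

(a) Suppose (x, y) is the second row of A. Because the rows of A form an orthonormal set, we get

\[ a^2 + b^2 = 1, \qquad x^2 + y^2 = 1, \qquad ax + by = 0 \]

Similarly, the columns form an orthonormal set, so

\[ a^2 + x^2 = 1, \qquad b^2 + y^2 = 1, \qquad ab + xy = 0 \]

Therefore, $x^2 = 1 - a^2 = b^2$, whence x = ±b.
Case (i): x = b. Then b(a + y) = 0, so y = −a.
Case (ii): x = −b. Then b(y − a) = 0, so y = a.
This means, as claimed,

\[ A = \begin{bmatrix} a & b \\ -b & a \end{bmatrix} \qquad \text{or} \qquad A = \begin{bmatrix} a & b \\ b & -a \end{bmatrix} \]

(b) Because $a^2 + b^2 = 1$, we have −1 ≤ a ≤ 1. Let a = cos θ. Then $b^2 = 1 - \cos^2\theta$, so b = sin θ. This proves the theorem.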

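2.29. Find a 2 × 2 orthogonal matrix A whose first row is a (positive) multiple of (3, 4).

Normalize (3, 4) to get $(\tfrac{3}{5}, \tfrac{4}{5})$. Then, by Problem 2.28,

\[ A = \begin{bmatrix} \tfrac{3}{5} & \tfrac{4}{5} \\ -\tfrac{4}{5} & \tfrac{3}{5} \end{bmatrix} \qquad \text{or} \qquad A = \begin{bmatrix} \tfrac{3}{5} & \tfrac{4}{5} \\ \tfrac{4}{5} & -\tfrac{3}{5} \end{bmatrix} \]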

2.30. Find a 3 × 3 orthogonal matrix P whose first two rows are multiples of $u_1 = (1, 1, 1)$ and $u_2 = (0, -1, 1)$, respectively. (Note that, as required, $u_1$ and $u_2$ are orthogonal.)

First find a nonzero vector $u_3$ orthogonal to $u_1$ and $u_2$; say (cross product) $u_3 = u_1 \times u_2 = (2, -1, -1)$. Let A be the matrix whose rows are $u_1, u_2, u_3$; and let P be the matrix obtained from A by normalizing the rows of A. Thus,

\[ A = \begin{bmatrix} 1 & 1 & 1 \\ 0 & -1 & 1 \\ 2 & -1 & -1 \end{bmatrix} \qquad \text{and} \qquad P = \begin{bmatrix} 1/\sqrt{3} & 1/\sqrt{3} & 1/\sqrt{3} \\ 0 & -1/\sqrt{2} & 1/\sqrt{2} \\ 2/\sqrt{6} & -1/\sqrt{6} & -1/\sqrt{6} \end{bmatrix} \]
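The construction in Problem 2.30 (cross product for the third row, then normalization) can be sketched as follows. The helper names are illustrative, and because the entries involve square roots the check of $PP^T = I$ is done in floating point with a small tolerance.

```python
import math

def cross(u, v):
    """Cross product of two vectors in R^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def normalize(v):
    """Divide a vector by its length to obtain a unit vector."""
    length = math.sqrt(sum(x * x for x in v))
    return [x / length for x in v]

u1 = (1, 1, 1)
u2 = (0, -1, 1)
u3 = cross(u1, u2)                        # orthogonal to both u1 and u2

P = [normalize(u) for u in (u1, u2, u3)]

# Entry (i, j) of P P^T is the dot product of rows i and j of P.
gram = [[sum(a * b for a, b in zip(r, s)) for s in P] for r in P]
```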

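Complex Matrices: Hermitian and Unitary Matrices

2.31. Find $A^H$ where (a) $A = \begin{bmatrix} 3 - 5i & 2 + 4i \\ 6 + 7i & 1 + 8i \end{bmatrix}$, (b) $A = \begin{bmatrix} 2 - 3i & 5 + 8i \\ -4 & 3 - 7i \\ -6 - i & 5i \end{bmatrix}$.

Recall that $A^H = \bar{A}^T$, the conjugate transpose of A. Thus,

\[ \text{(a)}\ A^H = \begin{bmatrix} 3 + 5i & 6 - 7i \\ 2 - 4i & 1 - 8i \end{bmatrix}, \qquad \text{(b)}\ A^H = \begin{bmatrix} 2 + 3i & -4 & -6 + i \\ 5 - 8i & 3 + 7i & -5i \end{bmatrix} \]

2.32. Show that $A = \begin{bmatrix} \tfrac{1}{3} - \tfrac{2}{3}i & \tfrac{2}{3}i \\ -\tfrac{2}{3}i & -\tfrac{1}{3} - \tfrac{2}{3}i \end{bmatrix}$ is unitary.

The rows of A form an orthonormal set:

\[ \left(\tfrac{1}{3} - \tfrac{2}{3}i,\ \tfrac{2}{3}i\right) \cdot \left(\tfrac{1}{3} - \tfrac{2}{3}i,\ \tfrac{2}{3}i\right) = \left(\tfrac{1}{9} + \tfrac{4}{9}\right) + \tfrac{4}{9} = 1 \]
\[ \left(\tfrac{1}{3} - \tfrac{2}{3}i,\ \tfrac{2}{3}i\right) \cdot \left(-\tfrac{2}{3}i,\ -\tfrac{1}{3} - \tfrac{2}{3}i\right) = \left(\tfrac{2}{9}i + \tfrac{4}{9}\right) + \left(-\tfrac{2}{9}i - \tfrac{4}{9}\right) = 0 \]
\[ \left(-\tfrac{2}{3}i,\ -\tfrac{1}{3} - \tfrac{2}{3}i\right) \cdot \left(-\tfrac{2}{3}i,\ -\tfrac{1}{3} - \tfrac{2}{3}i\right) = \tfrac{4}{9} + \left(\tfrac{1}{9} + \tfrac{4}{9}\right) = 1 \]

Thus, A is unitary.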
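The same conclusion can be reached by forming $AA^H$ directly. The sketch below uses Python's built-in complex numbers, so the comparison with the identity is made up to a small floating-point tolerance; the helper names are illustrative.

```python
def matmul(X, Y):
    """Ordinary product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def conj_transpose(A):
    """Conjugate transpose A^H: entry (j, i) is the conjugate of entry (i, j)."""
    return [[A[i][j].conjugate() for i in range(len(A))] for j in range(len(A[0]))]

# The matrix of Problem 2.32.
A = [[complex(1, -2) / 3, complex(0, 2) / 3],
     [complex(0, -2) / 3, complex(-1, -2) / 3]]

product = matmul(A, conj_transpose(A))    # should be (numerically) the identity
```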

2.33. Prove the complex analogue of Theorem 2.6: Let A be a complex matrix. Then the following are equivalent: (i) A is unitary. (ii) The rows of A form an orthonormal set. (iii) The columns of A form an orthonormal set.
(The proof is almost identical to the proof on page 37 for the case when A is a 3 × 3 real matrix.)

First recall that the vectors $u_1, u_2, \dots, u_n$ in $\mathbf{C}^n$ form an orthonormal set if they are unit vectors and are orthogonal to each other, where the dot product in $\mathbf{C}^n$ is defined by

\[ (a_1, a_2, \dots, a_n) \cdot (b_1, b_2, \dots, b_n) = a_1\bar{b}_1 + a_2\bar{b}_2 + \dots + a_n\bar{b}_n \]

Suppose A is unitary, and $R_1, R_2, \dots, R_n$ are its rows. Then $\bar{R}_1^T, \bar{R}_2^T, \dots, \bar{R}_n^T$ are the columns of $A^H$. Let $AA^H = [c_{ij}]$. By matrix multiplication, $c_{ij} = R_i\bar{R}_j^T = R_i \cdot R_j$. Because A is unitary, we have $AA^H = I$. Multiplying A by $A^H$ and setting each entry $c_{ij}$ equal to the corresponding entry in I yields the following $n^2$ equations:

\[ R_1 \cdot R_1 = 1, \quad R_2 \cdot R_2 = 1, \quad \dots, \quad R_n \cdot R_n = 1, \quad \text{and} \quad R_i \cdot R_j = 0 \ \text{for}\ i \neq j \]

Thus, the rows of A are unit vectors and are orthogonal to each other; hence, they form an orthonormal set of vectors. The condition $A^H A = I$ similarly shows that the columns of A also form an orthonormal set of vectors. Furthermore, because each step is reversible, the converse is true. This proves the theorem.

Block Matrices

2.34. Consider the following block matrices (which are partitions of the same matrix):

\[ \text{(a)}\ \left[\begin{array}{cc|cc|c} 1 & -2 & 0 & 1 & 3 \\ 2 & 3 & 5 & 7 & -2 \\ \hline 3 & 1 & 4 & 5 & 9 \end{array}\right], \qquad \text{(b)}\ \left[\begin{array}{ccc|cc} 1 & -2 & 0 & 1 & 3 \\ \hline 2 & 3 & 5 & 7 & -2 \\ \hline 3 & 1 & 4 & 5 & 9 \end{array}\right] \]

Find the size of each block matrix and also the size of each block.

(a) The block matrix has two rows of matrices and three columns of matrices; hence, its size is 2 × 3. The block sizes are 2 × 2, 2 × 2, and 2 × 1 for the first row; and 1 × 2, 1 × 2, and 1 × 1 for the second row.
(b) The size of the block matrix is 3 × 2; and the block sizes are 1 × 3 and 1 × 2 for each of the three rows.

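2.35. Compute AB using block multiplication, where

\[ A = \left[\begin{array}{cc|c} 1 & 2 & 1 \\ 3 & 4 & 0 \\ \hline 0 & 0 & 2 \end{array}\right] \qquad \text{and} \qquad B = \left[\begin{array}{ccc|c} 1 & 2 & 3 & 1 \\ 4 & 5 & 6 & 1 \\ \hline 0 & 0 & 0 & 1 \end{array}\right] \]

Here $A = \begin{bmatrix} E & F \\ 0_{1 \times 2} & G \end{bmatrix}$ and $B = \begin{bmatrix} R & S \\ 0_{1 \times 3} & T \end{bmatrix}$, where E, F, G, R, S, T are the given blocks, and $0_{1 \times 2}$ and $0_{1 \times 3}$ are zero matrices of the indicated sizes. Hence,

\[ AB = \begin{bmatrix} ER & ES + FT \\ 0_{1 \times 3} & GT \end{bmatrix} = \begin{bmatrix} \begin{bmatrix} 9 & 12 & 15 \\ 19 & 26 & 33 \end{bmatrix} & \begin{bmatrix} 3 \\ 7 \end{bmatrix} + \begin{bmatrix} 1 \\ 0 \end{bmatrix} \\ \begin{bmatrix} 0 & 0 & 0 \end{bmatrix} & 2 \end{bmatrix} = \begin{bmatrix} 9 & 12 & 15 & 4 \\ 19 & 26 & 33 & 7 \\ 0 & 0 & 0 & 2 \end{bmatrix} \]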
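The block computation of Problem 2.35 can be compared with the ordinary product in a few lines. In this sketch the helpers `split`, `join`, and `block_matmul` are hypothetical names; the cut lists give the row and column boundaries of the partitions (rows and columns of A cut after position 2, rows of B after 2, columns of B after 3).

```python
def matmul(X, Y):
    """Ordinary product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def matadd(X, Y):
    return [[a + b for a, b in zip(r, s)] for r, s in zip(X, Y)]

def split(M, row_cuts, col_cuts):
    """Partition M into a grid of blocks; cuts list the boundaries, e.g. [0, 2, 3]."""
    return [[[row[c0:c1] for row in M[r0:r1]]
             for c0, c1 in zip(col_cuts, col_cuts[1:])]
            for r0, r1 in zip(row_cuts, row_cuts[1:])]

def join(blocks):
    """Reassemble a grid of blocks into a single matrix."""
    out = []
    for block_row in blocks:
        for i in range(len(block_row[0])):
            out.append([x for blk in block_row for x in blk[i]])
    return out

def block_matmul(U, V):
    """W_ij = U_i1 V_1j + U_i2 V_2j + ... + U_ip V_pj, computed block by block."""
    W = []
    for i in range(len(U)):
        W_row = []
        for j in range(len(V[0])):
            acc = matmul(U[i][0], V[0][j])
            for k in range(1, len(V)):
                acc = matadd(acc, matmul(U[i][k], V[k][j]))
            W_row.append(acc)
        W.append(W_row)
    return W

A = [[1, 2, 1], [3, 4, 0], [0, 0, 2]]
B = [[1, 2, 3, 1], [4, 5, 6, 1], [0, 0, 0, 1]]

A_blocks = split(A, [0, 2, 3], [0, 2, 3])
B_blocks = split(B, [0, 2, 3], [0, 3, 4])
AB_by_blocks = join(block_matmul(A_blocks, B_blocks))
```

The column cuts of A must match the row cuts of B, which is the requirement that each block product be defined.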

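2.36. Let M = diag(A, B, C), where $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$, B = [5], $C = \begin{bmatrix} 1 & 3 \\ 5 & 7 \end{bmatrix}$. Find $M^2$.

Because M is block diagonal, square each block:

\[ A^2 = \begin{bmatrix} 7 & 10 \\ 15 & 22 \end{bmatrix}, \qquad B^2 = [25], \qquad C^2 = \begin{bmatrix} 16 & 24 \\ 40 & 64 \end{bmatrix} \]

so

\[ M^2 = \begin{bmatrix} 7 & 10 & 0 & 0 & 0 \\ 15 & 22 & 0 & 0 & 0 \\ 0 & 0 & 25 & 0 & 0 \\ 0 & 0 & 0 & 16 & 24 \\ 0 & 0 & 0 & 40 & 64 \end{bmatrix} \]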

Miscellaneous Problem

2.37. Let f(x) and g(x) be polynomials and let A be a square matrix. Prove
(a) (f + g)(A) = f(A) + g(A),
(b) (f · g)(A) = f(A)g(A),
(c) f(A)g(A) = g(A)f(A).

Suppose $f(x) = \sum_{i=1}^{r} a_i x^i$ and $g(x) = \sum_{j=1}^{s} b_j x^j$.

(a) We can assume r = s = n by adding powers of x with 0 as their coefficients. Then

\[ f(x) + g(x) = \sum_{i=1}^{n} (a_i + b_i)x^i \]

Hence,

\[ (f + g)(A) = \sum_{i=1}^{n} (a_i + b_i)A^i = \sum_{i=1}^{n} a_i A^i + \sum_{i=1}^{n} b_i A^i = f(A) + g(A) \]

(b) We have $f(x)g(x) = \sum_{i,j} a_i b_j x^{i+j}$. Then

\[ f(A)g(A) = \Big(\sum_{i} a_i A^i\Big)\Big(\sum_{j} b_j A^j\Big) = \sum_{i,j} a_i b_j A^{i+j} = (fg)(A) \]

(c) Using f(x)g(x) = g(x)f(x), we have

\[ f(A)g(A) = (fg)(A) = (gf)(A) = g(A)f(A) \]

SUPPLEMENTARY PROBLEMS

Algebra of Matrices

Problems 2.38–2.41 refer to the following matrices:

\[ A = \begin{bmatrix} 1 & 2 \\ 3 & -4 \end{bmatrix}, \quad B = \begin{bmatrix} 5 & 0 \\ -6 & 7 \end{bmatrix}, \quad C = \begin{bmatrix} 1 & -3 & 4 \\ 2 & 6 & -5 \end{bmatrix}, \quad D = \begin{bmatrix} 3 & 7 & -1 \\ 4 & -8 & 9 \end{bmatrix} \]

2.38. Find (a) 5A − 2B, (b) 2A + 3B, (c) 2C − 3D.

2.39. Find (a) AB and (AB)C, (b) BC and A(BC). [Note that (AB)C = A(BC).]

2.40. Find (a) $A^2$ and $A^3$, (b) AD and BD, (c) CD.

2.41. Find (a) $A^T$, (b) $B^T$, (c) $(AB)^T$, (d) $A^T B^T$. [Note that $A^T B^T \neq (AB)^T$.]

Problems 2.42 and 2.43 refer to the following matrices:

\[ A = \begin{bmatrix} 1 & -1 & 2 \\ 0 & 3 & 4 \end{bmatrix}, \quad B = \begin{bmatrix} 4 & 0 & -3 \\ -1 & -2 & 3 \end{bmatrix}, \quad C = \begin{bmatrix} 2 & -3 & 0 & 1 \\ 5 & -1 & -4 & 2 \\ -1 & 0 & 0 & 3 \end{bmatrix}, \quad D = \begin{bmatrix} 2 \\ -1 \\ 3 \end{bmatrix} \]

2.42. Find (a) 3A − 4B, (b) AC, (c) BC, (d) AD, (e) BD, (f) CD.

2.43. Find (a) $A^T$, (b) $A^T B$, (c) $A^T C$.

2.44. Let $A = \begin{bmatrix} 1 & 2 \\ 3 & 6 \end{bmatrix}$. Find a 2 × 3 matrix B with distinct nonzero entries such that AB = 0.

2.45. Let $e_1 = [1, 0, 0]$, $e_2 = [0, 1, 0]$, $e_3 = [0, 0, 1]$, and $A = \begin{bmatrix} a_1 & a_2 & a_3 & a_4 \\ b_1 & b_2 & b_3 & b_4 \\ c_1 & c_2 & c_3 & c_4 \end{bmatrix}$. Find $e_1 A$, $e_2 A$, $e_3 A$.

2.46. Let $e_i = [0, \dots, 0, 1, 0, \dots, 0]$, where 1 is the ith entry. Show
(a) $e_i A = A_i$, ith row of A.
(b) $B e_j^T = B^j$, jth column of B.
(c) If $e_i A = e_i B$, for each i, then A = B.
(d) If $A e_j^T = B e_j^T$, for each j, then A = B.

2.47. Prove Theorem 2.2(iii) and (iv): (iii) (B + C)A = BA + CA, (iv) k(AB) = (kA)B = A(kB).

2.48. Prove Theorem 2.3: (i) $(A + B)^T = A^T + B^T$, (ii) $(A^T)^T = A$, (iii) $(kA)^T = kA^T$.

2.49. Show (a) If A has a zero row, then AB has a zero row. (b) If B has a zero column, then AB has a zero column.

Square Matrices, Inverses

2.50. Find the diagonal and trace of each of the following matrices:

\[ \text{(a)}\ A = \begin{bmatrix} 2 & -5 & 8 \\ 3 & -6 & -7 \\ 4 & 0 & -1 \end{bmatrix}, \qquad \text{(b)}\ B = \begin{bmatrix} 1 & 3 & -4 \\ 6 & 1 & 7 \\ 2 & -5 & -1 \end{bmatrix}, \qquad \text{(c)}\ C = \begin{bmatrix} 4 & 3 & -6 \\ 2 & -5 & 0 \end{bmatrix} \]

Problems 2.51–2.53 refer to $A = \begin{bmatrix} 2 & -5 \\ 3 & 1 \end{bmatrix}$, $B = \begin{bmatrix} 4 & -2 \\ 1 & -6 \end{bmatrix}$, $C = \begin{bmatrix} 6 & -4 \\ 3 & -2 \end{bmatrix}$.

2.51. Find (a) $A^2$ and $A^3$, (b) f(A) and g(A), where $f(x) = x^3 - 2x^2 - 5$, $g(x) = x^2 - 3x + 17$.

2.52. Find (a) $B^2$ and $B^3$, (b) f(B) and g(B), where $f(x) = x^2 + 2x - 22$, $g(x) = x^2 - 3x - 6$.

2.53. Find a nonzero column vector u such that Cu = 4u.

2.54. Find the inverse of each of the following matrices (if it exists):

\[ A = \begin{bmatrix} 7 & 4 \\ 5 & 3 \end{bmatrix}, \quad B = \begin{bmatrix} 2 & 3 \\ 4 & 5 \end{bmatrix}, \quad C = \begin{bmatrix} 4 & -6 \\ -2 & 3 \end{bmatrix}, \quad D = \begin{bmatrix} 5 & -2 \\ 6 & -3 \end{bmatrix} \]

2.55. Find the inverses of $A = \begin{bmatrix} 1 & 1 & 2 \\ 1 & 2 & 5 \\ 1 & 3 & 7 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & -1 & 1 \\ 0 & 1 & -1 \\ 1 & 3 & -2 \end{bmatrix}$. [Hint: See Problem 2.19.]

2.56. Suppose A is invertible. Show that if AB = AC, then B = C. Give an example of a nonzero matrix A such that AB = AC but B ≠ C.

2.57. Find 2 × 2 invertible matrices A and B such that A + B ≠ 0 and A + B is not invertible.

2.58. Show (a) A is invertible if and only if $A^T$ is invertible. (b) The operations of inversion and transpose commute; that is, $(A^T)^{-1} = (A^{-1})^T$. (c) If A has a zero row or zero column, then A is not invertible.

Diagonal and Triangular Matrices

2.59. Let A = diag(1, 2, −3) and B = diag(2, −5, 0). Find (a) AB, $A^2$, $B^2$; (b) f(A), where $f(x) = x^2 + 4x - 3$; (c) $A^{-1}$ and $B^{-1}$.

2.60. Let $A = \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix}$. (a) Find $A^n$. (b) Find $B^n$.

2.61. Find all real triangular matrices A such that $A^2 = B$, where (a) $B = \begin{bmatrix} 4 & 21 \\ 0 & 25 \end{bmatrix}$, (b) $B = \begin{bmatrix} 1 & 4 \\ 0 & -9 \end{bmatrix}$.

2.62. Let $A = \begin{bmatrix} 5 & 2 \\ 0 & k \end{bmatrix}$. Find all numbers k for which A is a root of the polynomial:
(a) $f(x) = x^2 - 7x + 10$, (b) $g(x) = x^2 - 25$, (c) $h(x) = x^2 - 4$.

2.63. Let $B = \begin{bmatrix} 1 & 0 \\ 26 & 27 \end{bmatrix}$. Find a matrix A such that $A^3 = B$.

2.64. Let $B = \begin{bmatrix} 1 & 8 & 5 \\ 0 & 9 & 5 \\ 0 & 0 & 4 \end{bmatrix}$. Find a triangular matrix A with positive diagonal entries such that $A^2 = B$.

2.65. Using only the elements 0 and 1, find the number of 3 × 3 matrices that are (a) diagonal, (b) upper triangular, (c) nonsingular and upper triangular. Generalize to n × n matrices.

2.66. Let $D_k = kI$, the scalar matrix belonging to the scalar k. Show
(a) $D_k A = kA$, (b) $BD_k = kB$, (c) $D_k + D_{k'} = D_{k+k'}$, (d) $D_k D_{k'} = D_{kk'}$

2.67. Suppose AB = C, where A and C are upper triangular. (a) Find 2 × 2 nonzero matrices A, B, C, where B is not upper triangular. (b) Suppose A is also invertible. Show that B must also be upper triangular.

Special Types of Real Matrices

2.68. Find x, y, z such that A is symmetric, where

\[ \text{(a)}\ A = \begin{bmatrix} 2 & x & 3 \\ 4 & 5 & y \\ z & 1 & 7 \end{bmatrix}, \qquad \text{(b)}\ A = \begin{bmatrix} 7 & -6 & 2x \\ y & z & -2 \\ x & -2 & 5 \end{bmatrix} \]

2.69. Suppose A is a square matrix. Show (a) $A + A^T$ is symmetric, (b) $A - A^T$ is skew-symmetric, (c) A = B + C, where B is symmetric and C is skew-symmetric.

2.70. Write $A = \begin{bmatrix} 4 & 5 \\ 1 & 3 \end{bmatrix}$ as the sum of a symmetric matrix B and a skew-symmetric matrix C.

2.71. Suppose A and B are symmetric. Show that the following are also symmetric: (a) A + B; (b) kA, for any scalar k; (c) $A^2$; (d) $A^n$, for n > 0; (e) f(A), for any polynomial f(x).

2.72. Find a 2 × 2 orthogonal matrix P whose first row is a multiple of (a) (3, −4), (b) (1, 2).

2.73. Find a 3 × 3 orthogonal matrix P whose first two rows are multiples of (a) (1, 2, 3) and (0, −2, 3), (b) (1, 3, 1) and (1, 0, −1).

2.74. Suppose A and B are orthogonal matrices. Show that $A^T$, $A^{-1}$, AB are also orthogonal.

2.75. Which of the following matrices are normal? $A = \begin{bmatrix} 3 & -4 \\ 4 & 3 \end{bmatrix}$, $B = \begin{bmatrix} 1 & -2 \\ 2 & 3 \end{bmatrix}$, $C = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix}$.

Complex Matrices

2.76. Find real numbers x, y, z such that A is Hermitian, where $A = \begin{bmatrix} 3 & x + 2i & yi \\ 3 - 2i & 0 & 1 + zi \\ yi & 1 - xi & -1 \end{bmatrix}$.

2.77. Suppose A is a complex matrix. Show that $AA^H$ and $A^H A$ are Hermitian.

2.78. Let A be a square matrix. Show that (a) $A + A^H$ is Hermitian, (b) $A - A^H$ is skew-Hermitian, (c) A = B + C, where B is Hermitian and C is skew-Hermitian.

2.79. Determine which of the following matrices are unitary:

\[ A = \begin{bmatrix} i/2 & -\sqrt{3}/2 \\ \sqrt{3}/2 & -i/2 \end{bmatrix}, \qquad B = \frac{1}{2}\begin{bmatrix} 1 + i & 1 - i \\ 1 - i & 1 + i \end{bmatrix}, \qquad C = \frac{1}{2}\begin{bmatrix} 1 & -i & -1 + i \\ i & 1 & 1 + i \\ 1 + i & -1 + i & 0 \end{bmatrix} \]

2.80. Suppose A and B are unitary. Show that $A^H$, $A^{-1}$, AB are unitary.

2.81. Determine which of the following matrices are normal: $A = \begin{bmatrix} 3 + 4i & 1 \\ i & 2 + 3i \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 0 \\ 1 - i & i \end{bmatrix}$.

Block Matrices

2.82. Let $U = \begin{bmatrix} 1 & 2 & 0 & 0 & 0 \\ 3 & 4 & 0 & 0 & 0 \\ 0 & 0 & 5 & 1 & 2 \\ 0 & 0 & 3 & 4 & 1 \end{bmatrix}$ and $V = \begin{bmatrix} 3 & -2 & 0 & 0 \\ 2 & 4 & 0 & 0 \\ 0 & 0 & 1 & 2 \\ 0 & 0 & 2 & -3 \\ 0 & 0 & -4 & 1 \end{bmatrix}$.
(a) Find UV using block multiplication. (b) Are U and V block diagonal matrices? (c) Is UV block diagonal?

2.83. Partition each of the following matrices so that it becomes a square block matrix with as many diagonal blocks as possible:

\[ A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 2 \\ 0 & 0 & 3 \end{bmatrix}, \qquad B = \begin{bmatrix} 1 & 2 & 0 & 0 & 0 \\ 3 & 0 & 0 & 0 & 0 \\ 0 & 0 & 4 & 0 & 0 \\ 0 & 0 & 5 & 0 & 0 \\ 0 & 0 & 0 & 0 & 6 \end{bmatrix}, \qquad C = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 2 & 0 & 0 \end{bmatrix} \]

2.84. Find $M^2$ and $M^3$ for (a) $M = \begin{bmatrix} 2 & 0 & 0 & 0 \\ 0 & 1 & 4 & 0 \\ 0 & 2 & 1 & 0 \\ 0 & 0 & 0 & 3 \end{bmatrix}$, (b) $M = \begin{bmatrix} 1 & 1 & 0 & 0 \\ 2 & 3 & 0 & 0 \\ 0 & 0 & 1 & 2 \\ 0 & 0 & 4 & 5 \end{bmatrix}$.

2.85. For each matrix M in Problem 2.84, find f(M) where $f(x) = x^2 + 4x - 5$.

2.86. Suppose $U = [U_{ik}]$ and $V = [V_{kj}]$ are block matrices for which UV is defined and the number of columns of each block $U_{ik}$ is equal to the number of rows of each block $V_{kj}$. Show that $UV = [W_{ij}]$, where $W_{ij} = \sum_k U_{ik}V_{kj}$.

2.87. Suppose M and N are block diagonal matrices where corresponding blocks have the same size, say $M = \mathrm{diag}(A_i)$ and $N = \mathrm{diag}(B_i)$. Show
(i) $M + N = \mathrm{diag}(A_i + B_i)$, (ii) $kM = \mathrm{diag}(kA_i)$, (iii) $MN = \mathrm{diag}(A_i B_i)$, (iv) $f(M) = \mathrm{diag}(f(A_i))$ for any polynomial f(x).

ANSWERS TO SUPPLEMENTARY PROBLEMS

Notation: A ¼ ½R1 ; 2.38. (a) ½À5; 10;

R2 ;

. . .Š denotes a matrix A with rows R1 ; R2 ; . . . . (b) ½17; 4; À12; 13Š, (c) ½À7; À27; 11; À8; 36; À37Š

27; À34Š,

2.39. (a) ½À7; 14; 39; À28Š, ½21; 105; À98; À17; À285; 296Š (b) ½5; À15; 20; 8; 60; À59Š, ½21; 105; À98; À17; À285; 296Š 2.40. (a) ½7; À6; À9; 22Š, ½À11; 38; 57; À106Š; (b) ½11; À9; 17; À7; 53; À39Š, ½15; 35; À5; 10; À98; 69Š; 2.41. (a) 2.42. (a) (c) ½1; 3; 2; À4Š, (b) ½5; À6; 0; 7Š, (c) ½À7; 39;

(c)

not defined (d) ½5; 15; 10; À40Š

14; À28Š;

½À13; À3; 18; 4; 17; 0Š, (b) ½11; À12; 0; À5; À15; 5; 8; 4Š,

½À5; À2; 4; 5; 11; À3; À12; 18Š, (d) ½9; 9Š, (e) ½À1; 9Š,

(f )

not defined

CHAPTER 2 Algebra of Matrices
2.43. (a) ½1; 0; À1; 3; 2; 4Š, (b) ½4; 0; À3; À7; À6; 12; 4; À8; 6], (c)

55 not defined

2.44. ½2; 4; 6; À1; À2; À3Š 2.45. ½a1 ; a2 ; a3 ; a4 Š, 2.50. (a) 2.51. (a) 2.52. (a) ½b1 ; b2 ; b3 ; b4 Š, ½c1 ; c2 ; c3 ; c4 Š (b) 1; 1; À1; trðBÞ ¼ 1, (c) not defined

2; À6; À1; trðAÞ ¼ À5, ½À11; À15; ½14; 4;

9; À14Š, ½À67; 40; À24; À59Š, 26; À200Š,

(b) ½À50; 70; À42; À36Š, gðAÞ ¼ 0 À5; 46Š

À2; 34Š, ½60; À52;

(b) f ðBÞ ¼ 0, ½À4; 10;

2.53. u ¼ ½2a; aŠT 2.54. ½3; À4; 2.55. ½1; 1; À1; À5; 7Š, ½À 5 ; 3; 2; À1Š, not defined, ½1; À 2; 2; À 5Š 2 2 3 3 2; À5; 3; À1; 2; À1Š, ½1; 1; 0; À1; À3; 1; À1; À4; 1Š 0; 0Š

2.56. A ¼ ½1; 2; 1; 2Š, B ¼ ½0; 0; 1; 1Š, C ¼ ½2; 2; 2.57. A ¼ ½1; 2; 0; 3Š; 2.58. (c) B ¼ ½4; 3; 3; 0Š

Hint: Use Problem 2.48

2.59. (a) AB ¼ diagð2; À10; 0Þ, A2 ¼ diagð1; 4; 9Þ, B2 ¼ diagð4; 25; 0Þ; (b) f ðAÞ ¼ diagð2; 9; À6Þ; (c) AÀ1 ¼ diagð1; 1 ; À 1Þ, C À1 does not exist 2 3 2.60. (a) 2.61. (a) 2.62. (a) 2.63. ½1; 0; ½1; 2n; 0; 1Š, (b) ½1; n; 1 nðn À 1Þ; 0; 1; n; 0; 0; 1Š 2 0; À5Š, ½À2; 7; 0; 5Š, (b) none

½2; 3; 0; 5Š, k ¼ 2, 2; 3Š

½À2; À3; 0; À5Š, ½2; À7; (c) none

(b) k ¼ À5,

2.64. ½1; 2; 1; 0; 3; 1; 0; 0; 2Š 2.65. All entries below the diagonal must be 0 to be upper triangular, and all diagonal entries must be 1 to be nonsingular. (a) 8 ð2n Þ, (b) 26 ð2nðnþ1Þ=2 Þ, (c) 23 ð2nðnÀ1Þ=2 Þ 2.67. (a) 2.68. (a) 2.69. (c) A ¼ ½1; 1; 0; 0Š, B ¼ ½1; 2; x ¼ 4, y ¼ 1, z ¼ 3; (b) 3; 4Š, C ¼ ½4; 6; 0; 0Š x ¼ 0, y ¼ À6, z any real number

Hint: Let B ¼ 1 ðA þ AT Þ and C ¼ 1 ðA À AT Þ: 2 2

2.70. B ¼ ½4; 3; 3; 3Š, C ¼ ½0; 2;

À2; 0Š pffiffiffi pffiffiffi pffiffiffi pffiffiffi 2.72. (a) ½3, À 4; 4, 3], (b) ½1= 5, 2= 5; 2= 5, À1= 5Š 5 5 5 5 pffiffiffiffiffi pffiffiffiffiffi pffiffiffiffiffi pffiffiffiffiffi pffiffiffiffiffi pffiffiffiffiffiffiffiffi pffiffiffiffiffiffiffiffi pffiffiffiffiffiffiffiffi 2.73. (a) ½1= p14, 2= p14, 3= p14; 0; À2= 13, 3= ffiffiffi13; 12= 157, pffiffiffiffiffi 157ffiffiffiffiffi À3= p , À2= 157Š ffiffiffiffiffi ffiffiffiffiffi ffiffiffiffiffi pffiffiffi p pffiffiffiffiffi (b) ½1= 11, 3= 11, 1= 11; 1= 2, 0; À1= 2; 3= 22, À2= 22, 3= 22Š 2.75. A; C

2.76. x = 3, y = 0, z = 3 2.78. (c) Hint: Let B = (1/2)(A + A^H) and C = (1/2)(A − A^H).

CHAPTER 2 Algebra of Matrices

2.79. A; B; C 2.81. A 2.82. (a) UV = diag([7; 6; 17; 10]; [−1; 9; 7; −5]); (b) no; (c) yes

2.83. A: line between first and second rows (columns); B: line between second and third rows (columns) and between fourth and fifth rows (columns); C: C itself; no further partitioning of C is possible. 2.84. (a) M^2 = diag([4], [9; 8; 4; 9], [9]), M^3 = diag([8]; [25; 44; 22; 25], [27]) (b) M^2 = diag([3; 4; 8; 11], [9; 12; 24; 33]) M^3 = diag([11; 15; 30; 41], [57; 78; 156; 213]) diag([7], [8; 24; 12; 8], [16]), (b) diag([2; 8; 16; 181], [8; 20; 40; 48])

2.85. (a)

CHAPTER 3

Systems of Linear Equations
3.1 Introduction
Systems of linear equations play an important and motivating role in the subject of linear algebra. In fact, many problems in linear algebra reduce to finding the solution of a system of linear equations. Thus, the techniques introduced in this chapter will be applicable to abstract ideas introduced later. On the other hand, some of the abstract results will give us new insights into the structure and properties of systems of linear equations. All our systems of linear equations involve scalars as both coefficients and constants, and such scalars may come from any number field K. There is almost no loss in generality if the reader assumes that all our scalars are real numbers—that is, that they come from the real field R.

3.2 Basic Definitions, Solutions

This section gives basic definitions connected with the solutions of systems of linear equations. The actual algorithms for finding such solutions will be treated later.

Linear Equation and Solutions
A linear equation in unknowns $x_1, x_2, \ldots, x_n$ is an equation that can be put in the standard form

\[ a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = b \tag{3.1} \]

where $a_1, a_2, \ldots, a_n$ and $b$ are constants. The constant $a_k$ is called the coefficient of $x_k$, and $b$ is called the constant term of the equation. A solution of the linear equation (3.1) is a list of values for the unknowns or, equivalently, a vector $u$ in $K^n$, say

\[ x_1 = k_1,\quad x_2 = k_2,\quad \ldots,\quad x_n = k_n \qquad\text{or}\qquad u = (k_1, k_2, \ldots, k_n) \]

such that the following statement (obtained by substituting $k_i$ for $x_i$ in the equation) is true:

\[ a_1 k_1 + a_2 k_2 + \cdots + a_n k_n = b \]

In such a case we say that $u$ satisfies the equation.

Remark: Equation (3.1) implicitly assumes there is an ordering of the unknowns. In order to avoid subscripts, we will usually use $x, y$ for two unknowns; $x, y, z$ for three unknowns; and $x, y, z, t$ for four unknowns; they will be ordered as shown.


EXAMPLE 3.1 Consider the following linear equation in three unknowns $x, y, z$:

\[ x + 2y - 3z = 6 \]

We note that $x = 5$, $y = 2$, $z = 1$, or, equivalently, the vector $u = (5, 2, 1)$ is a solution of the equation. That is,

\[ 5 + 2(2) - 3(1) = 6 \quad\text{or}\quad 5 + 4 - 3 = 6 \quad\text{or}\quad 6 = 6 \]

On the other hand, $w = (1, 2, 3)$ is not a solution, because on substitution, we do not get a true statement:

\[ 1 + 2(2) - 3(3) = 6 \quad\text{or}\quad 1 + 4 - 9 = 6 \quad\text{or}\quad -4 = 6 \]

System of Linear Equations
A system of linear equations is a list of linear equations with the same unknowns. In particular, a system of $m$ linear equations $L_1, L_2, \ldots, L_m$ in $n$ unknowns $x_1, x_2, \ldots, x_n$ can be put in the standard form

\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\
&\;\;\vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m
\end{aligned}
\tag{3.2}
\]

where the $a_{ij}$ and $b_i$ are constants. The number $a_{ij}$ is the coefficient of the unknown $x_j$ in the equation $L_i$, and the number $b_i$ is the constant of the equation $L_i$.

The system (3.2) is called an $m \times n$ (read: $m$ by $n$) system. It is called a square system if $m = n$, that is, if the number $m$ of equations is equal to the number $n$ of unknowns.

The system (3.2) is said to be homogeneous if all the constant terms are zero, that is, if $b_1 = 0$, $b_2 = 0$, $\ldots$, $b_m = 0$. Otherwise the system is said to be nonhomogeneous.

A solution (or a particular solution) of the system (3.2) is a list of values for the unknowns or, equivalently, a vector $u$ in $K^n$, which is a solution of each of the equations in the system. The set of all solutions of the system is called the solution set or the general solution of the system.

EXAMPLE 3.2 Consider the following system of linear equations:

\[
\begin{aligned}
x_1 + x_2 + 4x_3 + 3x_4 &= 5 \\
2x_1 + 3x_2 + x_3 - 2x_4 &= 1 \\
x_1 + 2x_2 - 5x_3 + 4x_4 &= 3
\end{aligned}
\]
It is a $3 \times 4$ system because it has three equations in four unknowns. Determine whether (a) $u = (-8, 6, 1, 1)$ and (b) $v = (-10, 5, 1, 2)$ are solutions of the system.

(a) Substitute the values of $u$ in each equation, obtaining

\[
\begin{aligned}
-8 + 6 + 4(1) + 3(1) &= 5 &&\text{or}& -8 + 6 + 4 + 3 &= 5 &&\text{or}& 5 &= 5 \\
2(-8) + 3(6) + 1 - 2(1) &= 1 &&\text{or}& -16 + 18 + 1 - 2 &= 1 &&\text{or}& 1 &= 1 \\
-8 + 2(6) - 5(1) + 4(1) &= 3 &&\text{or}& -8 + 12 - 5 + 4 &= 3 &&\text{or}& 3 &= 3
\end{aligned}
\]

Yes, $u$ is a solution of the system because it is a solution of each equation.

(b) Substitute the values of $v$ into each successive equation, obtaining

\[
\begin{aligned}
-10 + 5 + 4(1) + 3(2) &= 5 &&\text{or}& -10 + 5 + 4 + 6 &= 5 &&\text{or}& 5 &= 5 \\
2(-10) + 3(5) + 1 - 2(2) &= 1 &&\text{or}& -20 + 15 + 1 - 4 &= 1 &&\text{or}& -8 &= 1
\end{aligned}
\]

No, $v$ is not a solution of the system, because it is not a solution of the second equation. (We do not need to substitute $v$ into the third equation.)
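The substitution test used in Examples 3.1 and 3.2 is mechanical, so it can be sketched in a few lines of Python. The function names `satisfies` and `is_solution` and the representation of an equation as a `(coefficients, constant)` pair are illustrative assumptions, not notation from the text; exact rational arithmetic is used so that equality tests are reliable.

```python
from fractions import Fraction

def satisfies(coeffs, constant, u):
    """Return True if the vector u satisfies a1*x1 + ... + an*xn = b."""
    if len(coeffs) != len(u):
        raise ValueError("u must supply one value per unknown")
    lhs = sum(Fraction(a) * Fraction(k) for a, k in zip(coeffs, u))
    return lhs == Fraction(constant)

def is_solution(system, u):
    """A vector solves a system only if it satisfies every equation."""
    return all(satisfies(coeffs, b, u) for coeffs, b in system)

# The 3 x 4 system of Example 3.2, one (coefficients, constant) pair per equation.
system = [
    ([1, 1, 4, 3], 5),
    ([2, 3, 1, -2], 1),
    ([1, 2, -5, 4], 3),
]

print(is_solution(system, (-8, 6, 1, 1)))    # True
print(is_solution(system, (-10, 5, 1, 2)))   # False
```

Because `all` stops at the first failing equation, the check for $v$ never reaches the third equation, just as in part (b) of the example.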


The system (3.2) of linear equations is said to be consistent if it has one or more solutions, and it is said to be inconsistent if it has no solution. If the field $K$ of scalars is infinite, such as when $K$ is the real field $\mathbf{R}$ or the complex field $\mathbf{C}$, then we have the following important result.

THEOREM 3.1: Suppose the field $K$ is infinite. Then any system $\mathcal{L}$ of linear equations has (i) a unique solution, (ii) no solution, or (iii) an infinite number of solutions.

This situation is pictured in Fig. 3-1. The three cases have a geometrical description when the system $\mathcal{L}$ consists of two equations in two unknowns (Section 3.4).

Figure 3-1

Augmented and Coefficient Matrices of a System
Consider again the general system (3.2) of $m$ equations in $n$ unknowns. Such a system has associated with it the following two matrices:

\[
M = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} & b_1 \\
a_{21} & a_{22} & \cdots & a_{2n} & b_2 \\
\vdots & \vdots & & \vdots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn} & b_m
\end{bmatrix}
\qquad\text{and}\qquad
A = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}
\]

The first matrix $M$ is called the augmented matrix of the system, and the second matrix $A$ is called the coefficient matrix. The coefficient matrix $A$ is simply the matrix of coefficients, which is the augmented matrix $M$ without the last column of constants. Some texts write $M = [A, B]$ to emphasize the two parts of $M$, where $B$ denotes the column vector of constants. The augmented matrix $M$ and the coefficient matrix $A$ of the system in Example 3.2 are as follows:

\[
M = \begin{bmatrix}
1 & 1 & 4 & 3 & 5 \\
2 & 3 & 1 & -2 & 1 \\
1 & 2 & -5 & 4 & 3
\end{bmatrix}
\qquad\text{and}\qquad
A = \begin{bmatrix}
1 & 1 & 4 & 3 \\
2 & 3 & 1 & -2 \\
1 & 2 & -5 & 4
\end{bmatrix}
\]

As expected, $A$ consists of all the columns of $M$ except the last, which is the column of constants.

Clearly, a system of linear equations is completely determined by its augmented matrix $M$, and vice versa. Specifically, each row of $M$ corresponds to an equation of the system, and each column of $M$ corresponds to the coefficients of an unknown, except for the last column, which corresponds to the constants of the system.
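The relation between $M$, $A$, and the column of constants can be illustrated with a short Python sketch. The helper names `augmented` and `split_augmented`, and the choice of nested lists for matrices, are assumptions made here for illustration only.

```python
def augmented(A, b):
    """Append the column of constants b to the coefficient matrix A."""
    if len(A) != len(b):
        raise ValueError("need exactly one constant per equation")
    return [list(row) + [bi] for row, bi in zip(A, b)]

def split_augmented(M):
    """Recover (A, b) from M: A is M without its last column, b is that column."""
    A = [list(row[:-1]) for row in M]
    b = [row[-1] for row in M]
    return A, b

# Coefficient matrix and constants of the system in Example 3.2.
A = [[1, 1, 4, 3],
     [2, 3, 1, -2],
     [1, 2, -5, 4]]
b = [5, 1, 3]

M = augmented(A, b)
for row in M:
    print(row)
```

Splitting `M` again returns the original `A` and `b`, which mirrors the remark that a system and its augmented matrix determine each other.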

Degenerate Linear Equations
A linear equation is said to be degenerate if all the coefficients are zero, that is, if it has the form

\[ 0x_1 + 0x_2 + \cdots + 0x_n = b \tag{3.3} \]


The solution of such an equation depends only on the value of the constant $b$. Specifically,

(i) If $b \neq 0$, then the equation has no solution.
(ii) If $b = 0$, then every vector $u = (k_1, k_2, \ldots, k_n)$ in $K^n$ is a solution.

The following theorem applies.

THEOREM 3.2: Let $\mathcal{L}$ be a system of linear equations that contains a degenerate equation $L$, say with constant $b$.

(i) If $b \neq 0$, then the system $\mathcal{L}$ has no solution.
(ii) If $b = 0$, then $L$ may be deleted from the system without changing the solution set of the system.

Part (i) comes from the fact that the degenerate equation has no solution, so the system has no solution. Part (ii) comes from the fact that every element in $K^n$ is a solution of the degenerate equation.
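Theorem 3.2 translates directly into a small filtering routine. The sketch below assumes the system is stored as an augmented matrix (a list of rows, constants last); the name `prune_degenerate` and its return convention are hypothetical choices made for this illustration.

```python
def prune_degenerate(M):
    """Apply Theorem 3.2 to the rows of an augmented matrix M.

    Returns (consistent, rows). If some row reads 0x1 + ... + 0xn = b with
    b != 0, the system has no solution and (False, []) is returned.
    Rows of the form 0 = 0 are deleted; all other rows are kept unchanged.
    """
    kept = []
    for row in M:
        coeffs, b = row[:-1], row[-1]
        if all(c == 0 for c in coeffs):
            if b != 0:
                return False, []      # Theorem 3.2(i): inconsistent system
            continue                  # Theorem 3.2(ii): drop the equation
        kept.append(list(row))
    return True, kept

print(prune_degenerate([[1, 2, 3], [0, 0, 0], [4, 5, 6]]))
print(prune_degenerate([[1, 2, 3], [0, 0, 7]]))
```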

Leading Unknown in a Nondegenerate Linear Equation
Now let $L$ be a nondegenerate linear equation. This means one or more of the coefficients of $L$ are not zero. By the leading unknown of $L$, we mean the first unknown in $L$ with a nonzero coefficient. For example, $x_3$ and $y$ are the leading unknowns, respectively, in the equations

\[ 0x_1 + 0x_2 + 5x_3 + 6x_4 + 0x_5 + 8x_6 = 7 \qquad\text{and}\qquad 0x + 2y - 4z = 5 \]

We frequently omit terms with zero coefficients, so the above equations would be written as

\[ 5x_3 + 6x_4 + 8x_6 = 7 \qquad\text{and}\qquad 2y - 4z = 5 \]

In such a case, the leading unknown appears first.

3.3 Equivalent Systems, Elementary Operations

Consider the system (3.2) of $m$ linear equations in $n$ unknowns. Let $L$ be the linear equation obtained by multiplying the $m$ equations by constants $c_1, c_2, \ldots, c_m$, respectively, and then adding the resulting equations. Specifically, let $L$ be the following linear equation:

\[ (c_1 a_{11} + \cdots + c_m a_{m1})x_1 + \cdots + (c_1 a_{1n} + \cdots + c_m a_{mn})x_n = c_1 b_1 + \cdots + c_m b_m \]

Then $L$ is called a linear combination of the equations in the system. One can easily show (Problem 3.43) that any solution of the system (3.2) is also a solution of the linear combination $L$.

EXAMPLE 3.3 Let $L_1$, $L_2$, $L_3$ denote, respectively, the three equations in Example 3.2. Let $L$ be the equation obtained by multiplying $L_1$, $L_2$, $L_3$ by $3$, $-2$, $4$, respectively, and then adding. Namely,

\[
\begin{array}{rrcl}
3L_1: & 3x_1 + 3x_2 + 12x_3 + 9x_4 &=& 15 \\
-2L_2: & -4x_1 - 6x_2 - 2x_3 + 4x_4 &=& -2 \\
4L_3: & 4x_1 + 8x_2 - 20x_3 + 16x_4 &=& 12 \\
\hline
(\text{Sum})\ L: & 3x_1 + 5x_2 - 10x_3 + 29x_4 &=& 25
\end{array}
\]


Then $L$ is a linear combination of $L_1$, $L_2$, $L_3$. As expected, the solution $u = (-8, 6, 1, 1)$ of the system is also a solution of $L$. That is, substituting $u$ in $L$, we obtain a true statement:

\[ 3(-8) + 5(6) - 10(1) + 29(1) = 25 \quad\text{or}\quad -24 + 30 - 10 + 29 = 25 \quad\text{or}\quad 25 = 25 \]

The following theorem holds.

THEOREM 3.3: Two systems of linear equations have the same solutions if and only if each equation in each system is a linear combination of the equations in the other system.

Two systems of linear equations are said to be equivalent if they have the same solutions. The next subsection shows one way to obtain equivalent systems of linear equations.
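The linear combination of Example 3.3 can be recomputed with a short Python sketch. It assumes the equations are stored as rows of the augmented matrix; the function name `linear_combination` is an illustrative choice, not notation from the text.

```python
def linear_combination(M, c):
    """Return the row c1*L1 + ... + cm*Lm, where the Li are the rows of M."""
    if len(M) != len(c):
        raise ValueError("need exactly one multiplier per equation")
    width = len(M[0])
    return [sum(ci * row[j] for ci, row in zip(c, M)) for j in range(width)]

# Augmented matrix of the system in Example 3.2.
M = [[1, 1, 4, 3, 5],
     [2, 3, 1, -2, 1],
     [1, 2, -5, 4, 3]]

L = linear_combination(M, [3, -2, 4])
print(L)                     # [3, 5, -10, 29, 25]

# The solution u of the system also satisfies the combined equation L.
u = (-8, 6, 1, 1)
lhs = sum(a * k for a, k in zip(L[:-1], u))
print(lhs == L[-1])          # True
```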

Elementary Operations
The following operations on a system of linear equations $L_1, L_2, \ldots, L_m$ are called elementary operations.

[E1] Interchange two of the equations. We indicate that the equations $L_i$ and $L_j$ are interchanged by writing:
"Interchange $L_i$ and $L_j$" or "$L_i \leftrightarrow L_j$"

[E2] Replace an equation by a nonzero multiple of itself. We indicate that equation $L_i$ is replaced by $kL_i$ (where $k \neq 0$) by writing
"Replace $L_i$ by $kL_i$" or "$kL_i \to L_i$"

[E3] Replace an equation by the sum of a multiple of another equation and itself. We indicate that equation $L_j$ is replaced by the sum of $kL_i$ and $L_j$ by writing
"Replace $L_j$ by $kL_i + L_j$" or "$kL_i + L_j \to L_j$"

The arrow $\to$ in [E2] and [E3] may be read as "replaces."

The main property of the above elementary operations is contained in the following theorem (proved in Problem 3.45).

THEOREM 3.4: Suppose a system $\mathcal{M}$ of linear equations is obtained from a system $\mathcal{L}$ of linear equations by a finite sequence of elementary operations. Then $\mathcal{M}$ and $\mathcal{L}$ have the same solutions.

Remark: Sometimes (say to avoid fractions when all the given scalars are integers) we may apply [E2] and [E3] in one step; that is, we may apply the following operation:

[E] Replace equation $L_j$ by the sum of $kL_i$ and $k'L_j$ (where $k' \neq 0$), written
"Replace $L_j$ by $kL_i + k'L_j$" or "$kL_i + k'L_j \to L_j$"

We emphasize that in operations [E3] and [E], only equation $L_j$ is changed.

Gaussian elimination, our main method for finding the solution of a given system of linear equations, consists of using the above operations to transform a given system into an equivalent system whose solution can be easily obtained. The details of Gaussian elimination are discussed in subsequent sections.
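The three elementary operations can be sketched as functions acting on the rows of an augmented matrix. The names `interchange`, `scale`, and `add_multiple`, the 0-based row indices, and the decision to return a modified copy rather than change the input are all assumptions of this illustration.

```python
def _copy(M):
    return [list(row) for row in M]

def interchange(M, i, j):
    """[E1] Interchange L_i and L_j (rows are indexed from 0 here)."""
    out = _copy(M)
    out[i], out[j] = out[j], out[i]
    return out

def scale(M, i, k):
    """[E2] Replace L_i by k*L_i, where k must be nonzero."""
    if k == 0:
        raise ValueError("k must be nonzero")
    out = _copy(M)
    out[i] = [k * a for a in out[i]]
    return out

def add_multiple(M, i, j, k):
    """[E3] Replace L_j by k*L_i + L_j; only row j changes."""
    if i == j:
        raise ValueError("i and j must be different rows")
    out = _copy(M)
    out[j] = [k * a + b for a, b in zip(out[i], out[j])]
    return out

# Augmented matrix of Example 3.2; u = (-8, 6, 1, 1) is a solution.
M = [[1, 1, 4, 3, 5],
     [2, 3, 1, -2, 1],
     [1, 2, -5, 4, 3]]
u = (-8, 6, 1, 1)

N = add_multiple(M, 0, 1, -2)      # "Replace L2 by -2L1 + L2"
print(N[1])                        # [0, 1, -7, -8, -9]

# Consistent with Theorem 3.4, u still satisfies every equation of N.
print(all(sum(a * k for a, k in zip(r[:-1], u)) == r[-1] for r in N))   # True
```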

3.4 Small Square Systems of Linear Equations

This section considers the special case of one equation in one unknown, and two equations in two unknowns. These simple systems are treated separately because their solution sets can be described geometrically, and their properties motivate the general case.

Linear Equation in One Unknown

The following simple basic result is proved in Problem 3.5.

THEOREM 3.5: Consider the linear equation $ax = b$.

(i) If $a \neq 0$, then $x = b/a$ is the unique solution of $ax = b$.
(ii) If $a = 0$, but $b \neq 0$, then $ax = b$ has no solution.
(iii) If $a = 0$ and $b = 0$, then every scalar $k$ is a solution of $ax = b$.

EXAMPLE 3.4 Solve (a) $4x - 1 = x + 6$, (b) $2x - 5 - x = x + 3$, (c) $4 + x - 3 = 2x + 1 - x$.

(a) Rewrite the equation in standard form, obtaining $3x = 7$. Then $x = \frac{7}{3}$ is the unique solution [Theorem 3.5(i)].
(b) Rewrite the equation in standard form, obtaining $0x = 8$. The equation has no solution [Theorem 3.5(ii)].
(c) Rewrite the equation in standard form, obtaining $0x = 0$. Then every scalar $k$ is a solution [Theorem 3.5(iii)].
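The three cases of Theorem 3.5 amount to a two-level test on $a$ and $b$. The following sketch assumes the equation is already in the standard form $ax = b$; the function name `solve_one_unknown` and the string labels it returns are illustrative choices.

```python
from fractions import Fraction

def solve_one_unknown(a, b):
    """Classify the solutions of a*x = b following Theorem 3.5."""
    if a != 0:
        return "unique", Fraction(b) / Fraction(a)   # case (i): x = b/a
    if b != 0:
        return "none", None                          # case (ii): 0x = b, b nonzero
    return "all", None                               # case (iii): 0x = 0

# Standard forms obtained in Example 3.4.
print(solve_one_unknown(3, 7))   # ('unique', Fraction(7, 3))
print(solve_one_unknown(0, 8))   # ('none', None)
print(solve_one_unknown(0, 0))   # ('all', None)
```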

System of Two Linear Equations in Two Unknowns ($2 \times 2$ System)
Consider a system of two nondegenerate linear equations in two unknowns $x$ and $y$, which can be put in the standard form

\[
\begin{aligned}
A_1 x + B_1 y &= C_1 \\
A_2 x + B_2 y &= C_2
\end{aligned}
\tag{3.4}
\]

Because the equations are nondegenerate, $A_1$ and $B_1$ are not both zero, and $A_2$ and $B_2$ are not both zero.

The general solution of the system (3.4) belongs to one of three types as indicated in Fig. 3-1. If $\mathbf{R}$ is the field of scalars, then the graph of each equation is a line in the plane $\mathbf{R}^2$ and the three types may be described geometrically as pictured in Fig. 3-2. Specifically,

(1) The system has exactly one solution. Here the two lines intersect in one point [Fig. 3-2(a)]. This occurs when the lines have distinct slopes or, equivalently, when the coefficients of $x$ and $y$ are not proportional:

\[ \frac{A_1}{A_2} \neq \frac{B_1}{B_2} \qquad\text{or, equivalently,}\qquad A_1 B_2 - A_2 B_1 \neq 0 \]

For example, in Fig. 3-2(a), $1/3 \neq -1/2$.

Figure 3-2. Graphs of the lines $L_1$ and $L_2$ in the $xy$-plane for three systems: (a) $L_1$: $x - y = -1$, $L_2$: $3x + 2y = 12$; (b) $L_1$: $x + 3y = 3$, $L_2$: $2x + 6y = -8$; (c) $L_1$: $x + 2y = 4$, $L_2$: $2x + 4y = 8$.

(2) The system has no solution. Here the two lines are parallel [Fig. 3-2(b)]. This occurs when the lines have the same slopes but different $y$ intercepts, or when

\[ \frac{A_1}{A_2} = \frac{B_1}{B_2} \neq \frac{C_1}{C_2} \]

For example, in Fig. 3-2(b), $1/2 = 3/6 \neq -3/8$.

(3) The system has an infinite number of solutions. Here the two lines coincide [Fig. 3-2(c)]. This occurs when the lines have the same slopes and same $y$ intercepts, or when the coefficients and constants are proportional,

\[ \frac{A_1}{A_2} = \frac{B_1}{B_2} = \frac{C_1}{C_2} \]

For example, in Fig. 3-2(c), $1/2 = 2/4 = 4/8$.

Remark: The following expression and its value is called a determinant of order two:

\[ \begin{vmatrix} A_1 & B_1 \\ A_2 & B_2 \end{vmatrix} = A_1 B_2 - A_2 B_1 \]

Determinants will be studied in Chapter 8. Thus, the system (3.4) has a unique solution if and only if the determinant of its coefficients is not zero. (We show later that this statement is true for any square system of linear equations.)
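The three types can be told apart without dividing, by using cross-products in place of the ratios above. The sketch below assumes both equations are nondegenerate, as in (3.4); the function name `classify_2x2` and its string results are hypothetical.

```python
def classify_2x2(A1, B1, C1, A2, B2, C2):
    """Classify A1*x + B1*y = C1, A2*x + B2*y = C2 (both nondegenerate)."""
    det = A1 * B2 - A2 * B1          # determinant of the coefficients
    if det != 0:
        return "unique"              # lines intersect in one point
    # Coefficients are proportional: the lines are parallel or coincide.
    # They coincide exactly when the constants are proportional as well.
    if A1 * C2 - A2 * C1 == 0 and B1 * C2 - B2 * C1 == 0:
        return "infinite"
    return "none"

# The three systems pictured in Fig. 3-2.
print(classify_2x2(1, -1, -1, 3, 2, 12))   # unique
print(classify_2x2(1, 3, 3, 2, 6, -8))     # none
print(classify_2x2(1, 2, 4, 2, 4, 8))      # infinite
```

Cross-products are used instead of the ratios $A_1/A_2$, $B_1/B_2$, $C_1/C_2$ so that a zero coefficient in the second equation does not cause a division by zero.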

Elimination Algorithm
The solution to system (3.4) can be obtained by the process of elimination, whereby we reduce the system to a single equation in only one unknown. Assuming the system has a unique solution, this elimination algorithm has two parts.

ALGORITHM 3.1: The input consists of two nondegenerate linear equations $L_1$ and $L_2$ in two unknowns with a unique solution.

Part A. (Forward Elimination) Multiply each equation by a constant so that the resulting coefficients of one unknown are negatives of each other, and then add the two equations to obtain a new equation $L$ that has only one unknown.

Part B. (Back-Substitution) Solve for the unknown in the new equation $L$ (which contains only one unknown), substitute this value of the unknown into one of the original equations, and then solve to obtain the value of the other unknown.

Part A of Algorithm 3.1 can be applied to any system even if the system does not have a unique solution. In such a case, the new equation $L$ will be degenerate and Part B will not apply.
EXAMPLE 3.5 (Unique Case). Solve the system

\[
\begin{aligned}
L_1:&\quad 2x - 3y = -8 \\
L_2:&\quad 3x + 4y = 5
\end{aligned}
\]

The unknown $x$ is eliminated from the equations by forming the new equation $L = -3L_1 + 2L_2$. That is, we multiply $L_1$ by $-3$ and $L_2$ by $2$ and add the resulting equations as follows:

\[
\begin{array}{rrcl}
-3L_1: & -6x + 9y &=& 24 \\
2L_2: & 6x + 8y &=& 10 \\
\hline
\text{Addition}: & 17y &=& 34
\end{array}
\]


We now solve the new equation for $y$, obtaining $y = 2$. We substitute $y = 2$ into one of the original equations, say $L_1$, and solve for the other unknown $x$, obtaining

\[ 2x - 3(2) = -8 \quad\text{or}\quad 2x - 6 = -8 \quad\text{or}\quad 2x = -2 \quad\text{or}\quad x = -1 \]

Thus, $x = -1$, $y = 2$, or the pair $u = (-1, 2)$ is the unique solution of the system. The unique solution is expected, because $2/3 \neq -3/4$. [Geometrically, the lines corresponding to the equations intersect at the point $(-1, 2)$.]

EXAMPLE 3.6 (Nonunique Cases)

(a) Solve the system

\[
\begin{aligned}
L_1:&\quad x - 3y = 4 \\
L_2:&\quad -2x + 6y = 5
\end{aligned}
\]

We eliminate $x$ from the equations by multiplying $L_1$ by $2$ and adding it to $L_2$, that is, by forming the new equation $L = 2L_1 + L_2$. This yields the degenerate equation

\[ 0x + 0y = 13 \]

which has a nonzero constant $b = 13$. Thus, this equation and the system have no solution. This is expected, because $1/(-2) = -3/6 \neq 4/5$. (Geometrically, the lines corresponding to the equations are parallel.)

(b) Solve the system

\[
\begin{aligned}
L_1:&\quad x - 3y = 4 \\
L_2:&\quad -2x + 6y = -8
\end{aligned}
\]

We eliminate $x$ from the equations by multiplying $L_1$ by $2$ and adding it to $L_2$, that is, by forming the new equation $L = 2L_1 + L_2$. This yields the degenerate equation

\[ 0x + 0y = 0 \]

where the constant term is also zero. Thus, the system has an infinite number of solutions, which correspond to the solutions of either equation. This is expected, because $1/(-2) = -3/6 = 4/(-8)$. (Geometrically, the lines corresponding to the equations coincide.)

To find the general solution, let $y = a$, and substitute into $L_1$ to obtain

\[ x - 3a = 4 \quad\text{or}\quad x = 3a + 4 \]

Thus, the general solution of the system is

\[ x = 3a + 4,\quad y = a \qquad\text{or}\qquad u = (3a + 4,\ a) \]

where $a$ (called a parameter) is any scalar.
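Algorithm 3.1 and the nonunique cases of Example 3.6 can be combined in one Python sketch. It assumes each equation is given as a triple `(A, B, C)` for $Ax + By = C$ and always eliminates $x$ with the multipliers $-A_2$ and $A_1$; the function name `solve_2x2_by_elimination` and the returned labels are illustrative assumptions.

```python
from fractions import Fraction

def solve_2x2_by_elimination(L1, L2):
    """Solve two nondegenerate equations (A, B, C), meaning A*x + B*y = C."""
    A1, B1, C1 = (Fraction(v) for v in L1)
    A2, B2, C2 = (Fraction(v) for v in L2)

    # Part A: form L = -A2*L1 + A1*L2, whose x-coefficient is zero.
    y_coeff = -A2 * B1 + A1 * B2
    const = -A2 * C1 + A1 * C2

    if y_coeff == 0:
        # L is degenerate, so Part B does not apply.
        # The second cross-product also covers the case A1 = A2 = 0.
        if const != 0 or B1 * C2 - B2 * C1 != 0:
            return "none", None
        return "infinite", None

    # Part B: solve L for y, then substitute into an original equation
    # in which x has a nonzero coefficient.
    y = const / y_coeff
    if A1 != 0:
        x = (C1 - B1 * y) / A1
    else:
        x = (C2 - B2 * y) / A2
    return "unique", (x, y)

print(solve_2x2_by_elimination((2, -3, -8), (3, 4, 5)))    # Example 3.5
print(solve_2x2_by_elimination((1, -3, 4), (-2, 6, 5)))    # Example 3.6(a)
print(solve_2x2_by_elimination((1, -3, 4), (-2, 6, -8)))   # Example 3.6(b)
```

For Example 3.5 the multipliers are $-3$ and $2$, and for Example 3.6 they are $2$ and $1$, which agrees with the combinations formed in the text.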

3.5 Systems in Triangular and Echelon Forms

The main method for solving systems of linear equations, Gaussian elimination, is treated in Section 3.6. Here we consider two simple types of systems of linear equations: systems in triangular form and the more general systems in echelon form.

Triangular Form
Consider the following system of linear equations, which is in triangular form:

\[
\begin{aligned}
2x_1 - 3x_2 + 5x_3 - 2x_4 &= 9 \\
5x_2 - x_3 + 3x_4 &= 1 \\
7x_3 - x_4 &= 3 \\
2x_4 &= 8
\end{aligned}
\]


That is, the first unknown $x_1$ is the leading unknown in the first equation, the second unknown $x_2$ is the leading unknown in the second equation, and so on. Thus, in particular, the system is square and each leading unknown is directly to the right of the leading unknown in the preceding equation.

Such a triangular system always has a unique solution, which may be obtained by back-substitution. That is,

(1) First solve the last equation for the last unknown to get $x_4 = 4$.

(2) Then substitute this value $x_4 = 4$ in the next-to-last equation, and solve for the next-to-last unknown $x_3$ as follows:

\[ 7x_3 - 4 = 3 \quad\text{or}\quad 7x_3 = 7 \quad\text{or}\quad x_3 = 1 \]

(3) Now substitute $x_3 = 1$ and $x_4 = 4$ in the second equation, and solve for the second unknown $x_2$ as follows:

\[ 5x_2 - 1 + 12 = 1 \quad\text{or}\quad 5x_2 + 11 = 1 \quad\text{or}\quad 5x_2 = -10 \quad\text{or}\quad x_2 = -2 \]

(4) Finally, substitute $x_2 = -2$, $x_3 = 1$, $x_4 = 4$ in the first equation, and solve for the first unknown $x_1$ as follows:

\[ 2x_1 + 6 + 5 - 8 = 9 \quad\text{or}\quad 2x_1 + 3 = 9 \quad\text{or}\quad 2x_1 = 6 \quad\text{or}\quad x_1 = 3 \]

Thus, $x_1 = 3$, $x_2 = -2$, $x_3 = 1$, $x_4 = 4$, or, equivalently, the vector $u = (3, -2, 1, 4)$ is the unique solution of the system.

Remark: There is an alternative form for back-substitution (which will be used when solving a system using the matrix format). Namely, after first finding the value of the last unknown, we substitute this value for the last unknown in all the preceding equations before solving for the next-to-last unknown. This yields a triangular system with one less equation and one less unknown. For example, in the above triangular system, we substitute $x_4 = 4$ in all the preceding equations to obtain the triangular system

\[
\begin{aligned}
2x_1 - 3x_2 + 5x_3 &= 17 \\
5x_2 - x_3 &= -11 \\
7x_3 &= 7
\end{aligned}
\]

We then repeat the process using the new last equation. And so on.
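Back-substitution on a triangular system can be sketched as a single loop that runs from the last equation to the first. The sketch assumes the system is stored as an $n \times (n+1)$ augmented matrix with nonzero diagonal entries; the function name `back_substitute` is an illustrative choice.

```python
from fractions import Fraction

def back_substitute(M):
    """Solve a triangular system given by its n x (n+1) augmented matrix M."""
    n = len(M)
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):           # last equation first
        pivot = Fraction(M[i][i])
        if pivot == 0:
            raise ValueError("diagonal entries must be nonzero")
        # Contribution of the unknowns that have already been solved for.
        known = sum(Fraction(M[i][j]) * x[j] for j in range(i + 1, n))
        x[i] = (Fraction(M[i][n]) - known) / pivot
    return x

# The triangular system discussed above.
M = [[2, -3, 5, -2, 9],
     [0, 5, -1, 3, 1],
     [0, 0, 7, -1, 3],
     [0, 0, 0, 2, 8]]

print([int(v) for v in back_substitute(M)])   # [3, -2, 1, 4]
```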

Echelon Form, Pivot and Free Variables
The following system of linear equations is said to be in echelon form:

\[
\begin{aligned}
2x_1 + 6x_2 - x_3 + 4x_4 - 2x_5 &= 15 \\
x_3 + 2x_4 + 2x_5 &= 5 \\
3x_4 - 9x_5 &= 6
\end{aligned}
\]

That is, no equation is degenerate and the leading unknown in each equation other than the first is to the right of the leading unknown in the preceding equation. The leading unknowns in the system, $x_1$, $x_3$, $x_4$, are called pivot variables, and the other unknowns, $x_2$ and $x_5$, are called free variables.

Generally speaking, an echelon system or a system in echelon form has the following form:

\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + a_{14}x_4 + \cdots + a_{1n}x_n &= b_1 \\
a_{2j_2}x_{j_2} + a_{2,j_2+1}x_{j_2+1} + \cdots + a_{2n}x_n &= b_2 \\
&\;\;\vdots \\
a_{rj_r}x_{j_r} + \cdots + a_{rn}x_n &= b_r
\end{aligned}
\tag{3.5}
\]

where $1 < j_2 < \cdots < j_r$ and $a_{11}, a_{2j_2}, \ldots, a_{rj_r}$ are not zero. The pivot variables are $x_1, x_{j_2}, \ldots, x_{j_r}$. Note that $r \leq n$.

The solution set of any echelon system is described in the following theorem (proved in Problem 3.10).

THEOREM 3.6: Consider a system of linear equations in echelon form, say with $r$ equations in $n$ unknowns. There are two cases:

(i) $r = n$. That is, there are as many equations as unknowns (triangular form). Then the system has a unique solution.
(ii) $r < n$. That is, there are more unknowns than equations. Then we can arbitrarily assign values to the $n - r$ free variables and solve uniquely for the $r$ pivot variables, obtaining a solution of the system.

Suppose an echelon system contains more unknowns than equations. Assuming the field $K$ is infinite, the system has an infinite number of solutions, because each of the $n - r$ free variables may be assigned any scalar.

The general solution of a system with free variables may be described in either of two equivalent ways, which we illustrate using the above echelon system where there are $r = 3$ equations and $n = 5$ unknowns. One description is called the "Parametric Form" of the solution, and the other description is called the "Free-Variable Form."

Parametric Form
Assign arbitrary values, called parameters, to the free variables $x_2$ and $x_5$, say $x_2 = a$ and $x_5 = b$, and then use back-substitution to obtain values for the pivot variables $x_1$, $x_3$, $x_4$ in terms of the parameters $a$ and $b$. Specifically,

(1) Substitute $x_5 = b$ in the last equation, and solve for $x_4$:

\[ 3x_4 - 9b = 6 \quad\text{or}\quad 3x_4 = 6 + 9b \quad\text{or}\quad x_4 = 2 + 3b \]

(2) Substitute $x_4 = 2 + 3b$ and $x_5 = b$ into the second equation, and solve for $x_3$:

\[ x_3 + 2(2 + 3b) + 2b = 5 \quad\text{or}\quad x_3 + 4 + 8b = 5 \quad\text{or}\quad x_3 = 1 - 8b \]

(3) Substitute $x_2 = a$, $x_3 = 1 - 8b$, $x_4 = 2 + 3b$, $x_5 = b$ into the first equation, and solve for $x_1$:

\[ 2x_1 + 6a - (1 - 8b) + 4(2 + 3b) - 2b = 15 \quad\text{or}\quad x_1 = 4 - 3a - 9b \]

Accordingly, the general solution in parametric form is

\[ x_1 = 4 - 3a - 9b,\quad x_2 = a,\quad x_3 = 1 - 8b,\quad x_4 = 2 + 3b,\quad x_5 = b \]

or, equivalently, $v = (4 - 3a - 9b,\ a,\ 1 - 8b,\ 2 + 3b,\ b)$, where $a$ and $b$ are arbitrary numbers.

Free-Variable Form
Use back-substitution to solve for the pivot variables $x_1$, $x_3$, $x_4$ directly in terms of the free variables $x_2$ and $x_5$. That is, the last equation gives $x_4 = 2 + 3x_5$. Substitution in the second equation yields $x_3 = 1 - 8x_5$, and then substitution in the first equation yields $x_1 = 4 - 3x_2 - 9x_5$. Accordingly,

\[ x_1 = 4 - 3x_2 - 9x_5,\quad x_2 = \text{free variable},\quad x_3 = 1 - 8x_5,\quad x_4 = 2 + 3x_5,\quad x_5 = \text{free variable} \]

or, equivalently,

\[ v = (4 - 3x_2 - 9x_5,\ x_2,\ 1 - 8x_5,\ 2 + 3x_5,\ x_5) \]

is the free-variable form for the general solution of the system.

We emphasize that there is no difference between the above two forms of the general solution, and the use of one or the other to represent the general solution is simply a matter of taste.

Remark: A particular solution of the above system can be found by assigning any values to the free variables and then solving for the pivot variables by back-substitution. For example, setting $x_2 = 1$ and $x_5 = 1$, we obtain

\[ x_4 = 2 + 3 = 5,\quad x_3 = 1 - 8 = -7,\quad x_1 = 4 - 3 - 9 = -8 \]

Thus, $u = (-8, 1, -7, 5, 1)$ is the particular solution corresponding to $x_2 = 1$ and $x_5 = 1$.
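The recipe "assign values to the free variables, then back-substitute for the pivot variables" can be sketched as follows. The sketch assumes the echelon system is stored as an augmented matrix with no zero rows and that free variables are passed as a dictionary keyed by 0-based column index; the name `solve_echelon` and this calling convention are hypothetical.

```python
from fractions import Fraction

def solve_echelon(M, free_values):
    """Return one solution of an echelon system for given free-variable values.

    M is the augmented matrix (no zero rows); free_values maps the 0-based
    index of every free variable to the value assigned to it.
    """
    n = len(M[0]) - 1
    # The pivot column of a row is the position of its leading nonzero entry.
    pivots = [next(j for j in range(n) if row[j] != 0) for row in M]
    free = set(range(n)) - set(pivots)
    if set(free_values) != free:
        raise ValueError("supply a value for each free variable and no others")

    x = [None] * n
    for j, value in free_values.items():
        x[j] = Fraction(value)
    # Back-substitution: last equation first, solving for its pivot variable.
    for row, p in zip(reversed(M), reversed(pivots)):
        rest = sum(Fraction(row[j]) * x[j] for j in range(p + 1, n))
        x[p] = (Fraction(row[n]) - rest) / Fraction(row[p])
    return x

# The echelon system above; x2 and x5 are free (indices 1 and 4).
M = [[2, 6, -1, 4, -2, 15],
     [0, 0, 1, 2, 2, 5],
     [0, 0, 0, 3, -9, 6]]

print([int(v) for v in solve_echelon(M, {1: 1, 4: 1})])   # [-8, 1, -7, 5, 1]
```

Setting both free variables to zero in the same call gives the particular solution $(4, 0, 1, 2, 0)$, which matches the parametric form with $a = b = 0$.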

3.6 Gaussian Elimination

The main method for solving the general system (3.2) of linear equations is called Gaussian elimination. It essentially consists of two parts:

Part A. (Forward Elimination) Step-by-step reduction of the system yielding either a degenerate equation with no solution (which indicates the system has no solution) or an equivalent simpler system in triangular or echelon form.

Part B. (Backward Elimination) Step-by-step back-substitution to find the solution of the simpler system.

Part B has already been investigated in Section 3.5. Accordingly, we need only give the algorithm for Part A, which is as follows.

ALGORITHM 3.2 (for Part A): Input: The $m \times n$ system (3.2) of linear equations.

ELIMINATION STEP: Find the first unknown in the system with a nonzero coefficient (which now must be $x_1$).

(a) Arrange so that $a_{11} \neq 0$. That is, if necessary, interchange equations so that the first unknown $x_1$ appears with a nonzero coefficient in the first equation.

(b) Use $a_{11}$ as a pivot to eliminate $x_1$ from all equations except the first equation. That is, for $i > 1$:
(1) Set $m = -a_{i1}/a_{11}$; (2) Replace $L_i$ by $mL_1 + L_i$.
The system now has the following form:

\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + \cdots + a_{1n}x_n &= b_1 \\
a_{2j_2}x_{j_2} + \cdots + a_{2n}x_n &= b_2 \\
&\;\;\vdots \\
a_{mj_2}x_{j_2} + \cdots + a_{mn}x_n &= b_m
\end{aligned}
\]

where $x_1$ does not appear in any equation except the first, $a_{11} \neq 0$, and $x_{j_2}$ denotes the first unknown with a nonzero coefficient in any equation other than the first.

(c) Examine each new equation $L$.
(1) If $L$ has the form $0x_1 + 0x_2 + \cdots + 0x_n = b$ with $b \neq 0$, then STOP. The system is inconsistent and has no solution.
(2) If $L$ has the form $0x_1 + 0x_2 + \cdots + 0x_n = 0$ or if $L$ is a multiple of another equation, then delete $L$ from the system.

RECURSION STEP: Repeat the Elimination Step with each new "smaller" subsystem formed by all the equations excluding the first equation.

OUTPUT: Finally, the system is reduced to triangular or echelon form, or a degenerate equation with no solution is obtained indicating an inconsistent system.

The next remarks refer to the Elimination Step in Algorithm 3.2.

(1) The following number $m$ in (b) is called the multiplier:

\[ m = -\frac{a_{i1}}{a_{11}} = -\frac{\text{coefficient to be deleted}}{\text{pivot}} \]

(2) One could alternatively apply the following operation in (b):

\[ \text{Replace } L_i \text{ by } -a_{i1}L_1 + a_{11}L_i \]

This would avoid fractions if all the scalars were originally integers.
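Part A can be sketched in Python on the augmented matrix of the system. This is a simplified reading of Algorithm 3.2 under stated assumptions: rows are processed iteratively rather than recursively, exact fractions are used, and an equation that is a multiple of another is not removed on sight but turns into $0 = 0$ at the next elimination step and is deleted then. The function name `forward_eliminate` and its return convention are hypothetical.

```python
from fractions import Fraction

def forward_eliminate(M):
    """Part A of Gaussian elimination on an augmented matrix M.

    Returns (True, rows) with rows in triangular or echelon form, or
    (False, []) if a degenerate equation 0 = b with b != 0 appears.
    """
    rows = [[Fraction(v) for v in row] for row in M]
    n = len(rows[0]) - 1
    finished = []                       # equations already in final position
    while rows:
        # Step (c): stop on 0 = b with b != 0, delete equations 0 = 0.
        remaining = []
        for r in rows:
            if all(c == 0 for c in r[:n]):
                if r[n] != 0:
                    return False, []
                continue
            remaining.append(r)
        rows = remaining
        if not rows:
            break
        # First unknown with a nonzero coefficient in the current subsystem.
        col = min(next(j for j in range(n) if r[j] != 0) for r in rows)
        # Step (a): interchange so that the first equation contains it.
        p = next(i for i, r in enumerate(rows) if r[col] != 0)
        rows[0], rows[p] = rows[p], rows[0]
        first = rows[0]
        # Step (b): replace L_i by m*L_1 + L_i with m = -a_i/pivot.
        reduced = []
        for r in rows[1:]:
            m = -r[col] / first[col]
            reduced.append([m * a + b for a, b in zip(first, r)])
        finished.append(first)
        rows = reduced                  # continue with the smaller subsystem
    return True, finished

ok, echelon = forward_eliminate([[1, 2, -3, 1], [2, 5, -8, 4], [3, 8, -13, 7]])
print(ok, [[int(v) for v in row] for row in echelon])
# True [[1, 2, -3, 1], [0, 1, -2, 2]]
```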

Gaussian Elimination Example

Here we illustrate in detail Gaussian elimination using the following system of linear equations:

\[
\begin{aligned}
L_1:&\quad x - 3y - 2z = 6 \\
L_2:&\quad 2x - 4y - 3z = 8 \\
L_3:&\quad -3x + 6y + 8z = -5
\end{aligned}
\]

Part A. We use the coefficient 1 of $x$ in the first equation $L_1$ as the pivot in order to eliminate $x$ from the second equation $L_2$ and from the third equation $L_3$. This is accomplished as follows:

(1) Multiply $L_1$ by the multiplier $m = -2$ and add it to $L_2$; that is, "Replace $L_2$ by $-2L_1 + L_2$."
(2) Multiply $L_1$ by the multiplier $m = 3$ and add it to $L_3$; that is, "Replace $L_3$ by $3L_1 + L_3$."

These steps yield

\[
\begin{array}{rrcl}
(-2)L_1: & -2x + 6y + 4z &=& -12 \\
L_2: & 2x - 4y - 3z &=& 8 \\
\hline
\text{New } L_2: & 2y + z &=& -4
\end{array}
\qquad
\begin{array}{rrcl}
3L_1: & 3x - 9y - 6z &=& 18 \\
L_3: & -3x + 6y + 8z &=& -5 \\
\hline
\text{New } L_3: & -3y + 2z &=& 13
\end{array}
\]

Thus, the original system is replaced by the following system:

\[
\begin{aligned}
L_1:&\quad x - 3y - 2z = 6 \\
L_2:&\quad 2y + z = -4 \\
L_3:&\quad -3y + 2z = 13
\end{aligned}
\]

(Note that the equations $L_2$ and $L_3$ form a subsystem with one less equation and one less unknown than the original system.)

Next we use the coefficient 2 of $y$ in the (new) second equation $L_2$ as the pivot in order to eliminate $y$ from the (new) third equation $L_3$. This is accomplished as follows:

(3) Multiply $L_2$ by the multiplier $m = \frac{3}{2}$ and add it to $L_3$; that is, "Replace $L_3$ by $\frac{3}{2}L_2 + L_3$." (Alternately, "Replace $L_3$ by $3L_2 + 2L_3$," which will avoid fractions.)

This step yields

\[
\begin{array}{rrcl}
\tfrac{3}{2}L_2: & 3y + \tfrac{3}{2}z &=& -6 \\
L_3: & -3y + 2z &=& 13 \\
\hline
\text{New } L_3: & \tfrac{7}{2}z &=& 7
\end{array}
\qquad\text{or}\qquad
\begin{array}{rrcl}
3L_2: & 6y + 3z &=& -12 \\
2L_3: & -6y + 4z &=& 26 \\
\hline
\text{New } L_3: & 7z &=& 14
\end{array}
\]

Thus, our system is replaced by the following system:

\[
\begin{aligned}
L_1:&\quad x - 3y - 2z = 6 \\
L_2:&\quad 2y + z = -4 \\
L_3:&\quad 7z = 14 \quad (\text{or } \tfrac{7}{2}z = 7)
\end{aligned}
\]

The system is now in triangular form, so Part A is completed.

Part B. The values for the unknowns are obtained in reverse order, $z, y, x$, by back-substitution. Specifically,

(1) Solve for $z$ in $L_3$ to get $z = 2$.
(2) Substitute $z = 2$ in $L_2$, and solve for $y$ to get $y = -3$.
(3) Substitute $y = -3$ and $z = 2$ in $L_1$, and solve for $x$ to get $x = 1$.

Thus, the solution of the triangular system and hence the original system is as follows:

\[ x = 1,\quad y = -3,\quad z = 2 \qquad\text{or, equivalently,}\qquad u = (1, -3, 2) \]


Condensed Format
The Gaussian elimination algorithm involves rewriting systems of linear equations. Sometimes we can avoid excessive recopying of some of the equations by adopting a "condensed format." This format for the solution of the above system follows:

\[
\begin{array}{lrl}
\text{Number} & \text{Equation} & \text{Operation} \\
(1) & x - 3y - 2z = 6 & \\
(2) & 2x - 4y - 3z = 8 & \\
(3) & -3x + 6y + 8z = -5 & \\
(2') & 2y + z = -4 & \text{Replace } L_2 \text{ by } -2L_1 + L_2 \\
(3') & -3y + 2z = 13 & \text{Replace } L_3 \text{ by } 3L_1 + L_3 \\
(3'') & 7z = 14 & \text{Replace } L_3 \text{ by } 3L_2 + 2L_3
\end{array}
\]

That is, first we write down the number of each of the original equations. As we apply the Gaussian elimination algorithm to the system, we only write down the new equations, and we label each new equation using the same number as the original corresponding equation, but with an added prime. (After each new equation, we will indicate, for instructional purposes, the elementary operation that yielded the new equation.)

The system in triangular form consists of equations $(1)$, $(2')$, and $(3'')$, the numbers with the largest number of primes. Applying back-substitution to these equations again yields $x = 1$, $y = -3$, $z = 2$.

Remark: If two equations need to be interchanged, say to obtain a nonzero coefficient as a pivot, then this is easily accomplished in the format by simply renumbering the two equations rather than changing their positions.
EXAMPLE 3.7 Solve the following system:

\[
\begin{aligned}
x + 2y - 3z &= 1 \\
2x + 5y - 8z &= 4 \\
3x + 8y - 13z &= 7
\end{aligned}
\]

We solve the system by Gaussian elimination.

Part A. (Forward Elimination) We use the coefficient 1 of $x$ in the first equation $L_1$ as the pivot in order to eliminate $x$ from the second equation $L_2$ and from the third equation $L_3$. This is accomplished as follows:

(1) Multiply $L_1$ by the multiplier $m = -2$ and add it to $L_2$; that is, "Replace $L_2$ by $-2L_1 + L_2$."
(2) Multiply $L_1$ by the multiplier $m = -3$ and add it to $L_3$; that is, "Replace $L_3$ by $-3L_1 + L_3$."

The two steps yield

\[
\begin{aligned}
x + 2y - 3z &= 1 \\
y - 2z &= 2 \\
2y - 4z &= 4
\end{aligned}
\qquad\text{or}\qquad
\begin{aligned}
x + 2y - 3z &= 1 \\
y - 2z &= 2
\end{aligned}
\]

(The third equation is deleted, because it is a multiple of the second equation.) The system is now in echelon form with free variable $z$.

Part B. (Backward Elimination) To obtain the general solution, let the free variable $z = a$, and solve for $x$ and $y$ by back-substitution. Substitute $z = a$ in the second equation to obtain $y = 2 + 2a$. Then substitute $z = a$ and $y = 2 + 2a$ into the first equation to obtain

\[ x + 2(2 + 2a) - 3a = 1 \quad\text{or}\quad x + 4 + 4a - 3a = 1 \quad\text{or}\quad x = -3 - a \]

Thus, the following is the general solution where $a$ is a parameter:

\[ x = -3 - a,\quad y = 2 + 2a,\quad z = a \qquad\text{or}\qquad u = (-3 - a,\ 2 + 2a,\ a) \]

EXAMPLE 3.8 Solve the following system:

\[
\begin{aligned}
x_1 + 3x_2 - 2x_3 + 5x_4 &= 4 \\
2x_1 + 8x_2 - x_3 + 9x_4 &= 9 \\
3x_1 + 5x_2 - 12x_3 + 17x_4 &= 7
\end{aligned}
\]

We use Gaussian elimination.

Part A. (Forward Elimination) We use the coefficient 1 of $x_1$ in the first equation $L_1$ as the pivot in order to eliminate $x_1$ from the second equation $L_2$ and from the third equation $L_3$. This is accomplished by the following operations:

(1) "Replace $L_2$ by $-2L_1 + L_2$" and (2) "Replace $L_3$ by $-3L_1 + L_3$"

These yield:

\[
\begin{aligned}
x_1 + 3x_2 - 2x_3 + 5x_4 &= 4 \\
2x_2 + 3x_3 - x_4 &= 1 \\
-4x_2 - 6x_3 + 2x_4 &= -5
\end{aligned}
\]

We now use the coefficient 2 of $x_2$ in the second equation $L_2$ as the pivot and the multiplier $m = 2$ in order to eliminate $x_2$ from the third equation $L_3$. This is accomplished by the operation "Replace $L_3$ by $2L_2 + L_3$," which then yields the degenerate equation

\[ 0x_1 + 0x_2 + 0x_3 + 0x_4 = -3 \]

This equation and, hence, the original system have no solution:

DO NOT CONTINUE

Remark 1: As in the above examples, Part A of Gaussian elimination tells us whether or not the system has a solution, that is, whether or not the system is consistent. Accordingly, Part B need never be applied when a system has no solution.

Remark 2: If a system of linear equations has more than four unknowns and four equations, then it may be more convenient to use the matrix format for solving the system. This matrix format is discussed later.

3.7 Echelon Matrices, Row Canonical Form, Row Equivalence

One way to solve a system of linear equations is by working with its augmented matrix M rather than the system itself. This section introduces the necessary matrix concepts for such a discussion. These concepts, such as echelon matrices and elementary row operations, are also of independent interest.

Echelon Matrices
A matrix $A$ is called an echelon matrix, or is said to be in echelon form, if the following two conditions hold (where a leading nonzero element of a row of $A$ is the first nonzero element in the row):

(1) All zero rows, if any, are at the bottom of the matrix.
(2) Each leading nonzero entry in a row is to the right of the leading nonzero entry in the preceding row.

That is, $A = [a_{ij}]$ is an echelon matrix if there exist nonzero entries

\[
a_{1j_1},\ a_{2j_2},\ \ldots,\ a_{rj_r}, \qquad\text{where } j_1 < j_2 < \cdots < j_r
\]

with the property that

\[
a_{ij} = 0 \quad\text{for}\quad
\begin{cases}
\text{(i)} & i \le r,\ j < j_i \\
\text{(ii)} & i > r
\end{cases}
\]

The entries $a_{1j_1}, a_{2j_2}, \ldots, a_{rj_r}$, which are the leading nonzero elements in their respective rows, are called the pivots of the echelon matrix.
EXAMPLE 3.9 The following is an echelon matrix whose pivots have been boxed:

\[
A = \begin{bmatrix}
0 & \boxed{2} & 3 & 4 & 5 & 9 & 0 & 7 \\
0 & 0 & 0 & \boxed{3} & 4 & 1 & 2 & 5 \\
0 & 0 & 0 & 0 & 0 & \boxed{5} & 7 & 2 \\
0 & 0 & 0 & 0 & 0 & 0 & \boxed{8} & 6 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]

Observe that the pivots are in columns $C_2, C_4, C_6, C_7$, and each is to the right of the one above. Using the above notation, the pivots are

\[
a_{1j_1} = 2, \quad a_{2j_2} = 3, \quad a_{3j_3} = 5, \quad a_{4j_4} = 8
\]

where $j_1 = 2$, $j_2 = 4$, $j_3 = 6$, $j_4 = 7$. Here $r = 4$.

Row Canonical Form
A matrix $A$ is said to be in row canonical form (or row-reduced echelon form) if it is an echelon matrix, that is, if it satisfies the above properties (1) and (2), and if it satisfies the following additional two properties:

(3) Each pivot (leading nonzero entry) is equal to 1.
(4) Each pivot is the only nonzero entry in its column.

The major difference between an echelon matrix and a matrix in row canonical form is that in an echelon matrix there must be zeros below the pivots [Properties (1) and (2)], but in a matrix in row canonical form, each pivot must also equal 1 [Property (3)] and there must also be zeros above the pivots [Property (4)]. The zero matrix 0 of any size and the identity matrix $I$ of any size are important special examples of matrices in row canonical form.
EXAMPLE 3.10 The following are echelon matrices whose pivots have been boxed:

\[
\begin{bmatrix}
\boxed{2} & 3 & 2 & 0 & 4 & 5 & -6 \\
0 & 0 & 0 & \boxed{1} & -3 & 2 & 0 \\
0 & 0 & 0 & 0 & 0 & \boxed{6} & 2 \\
0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix},
\qquad
\begin{bmatrix}
\boxed{1} & 2 & 3 \\
0 & 0 & \boxed{1} \\
0 & 0 & 0
\end{bmatrix},
\qquad
\begin{bmatrix}
0 & \boxed{1} & 3 & 0 & 0 & 4 \\
0 & 0 & 0 & \boxed{1} & 0 & -3 \\
0 & 0 & 0 & 0 & \boxed{1} & 2
\end{bmatrix}
\]

The third matrix is also an example of a matrix in row canonical form. The second matrix is not in row canonical form, because it does not satisfy property (4); that is, there is a nonzero entry above the second pivot in the third column. The first matrix is not in row canonical form, because it satisfies neither property (3) nor property (4); that is, some pivots are not equal to 1 and there are nonzero entries above the pivots.

Elementary Row Operations

Suppose $A$ is a matrix with rows $R_1, R_2, \ldots, R_m$. The following operations on $A$ are called elementary row operations.

[E1] (Row Interchange): Interchange rows $R_i$ and $R_j$. This may be written as "Interchange $R_i$ and $R_j$" or "$R_i \leftrightarrow R_j$".

[E2] (Row Scaling): Replace row $R_i$ by a nonzero multiple $kR_i$ of itself. This may be written as "Replace $R_i$ by $kR_i$ ($k \ne 0$)" or "$kR_i \to R_i$".

[E3] (Row Addition): Replace row $R_j$ by the sum of a multiple $kR_i$ of a row $R_i$ and itself. This may be written as "Replace $R_j$ by $kR_i + R_j$" or "$kR_i + R_j \to R_j$".

The arrow $\to$ in E2 and E3 may be read as "replaces." Sometimes (say to avoid fractions when all the given scalars are integers) we may apply [E2] and [E3] in one step; that is, we may apply the following operation:

[E] Replace $R_j$ by the sum of a multiple $kR_i$ of a row $R_i$ and a nonzero multiple $k'R_j$ of itself. This may be written as "Replace $R_j$ by $kR_i + k'R_j$ ($k' \ne 0$)" or "$kR_i + k'R_j \to R_j$".

We emphasize that in operations [E3] and [E] only row $R_j$ is changed.
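The three elementary row operations can be sketched as small functions. This is an illustrative sketch only: the function names `interchange`, `scale`, and `add_multiple` are hypothetical, a matrix is assumed to be a list of rows, and row indices are 0-based (so the text's $R_1$ is index 0). Each function returns a new matrix and leaves its input unchanged.

```python
from fractions import Fraction

def interchange(A, i, j):
    """[E1] Interchange rows R_i and R_j (0-based indices)."""
    B = [row[:] for row in A]
    B[i], B[j] = B[j], B[i]
    return B

def scale(A, i, k):
    """[E2] Replace R_i by k*R_i, where k must be nonzero."""
    if k == 0:
        raise ValueError("k must be nonzero")
    B = [row[:] for row in A]
    B[i] = [k * x for x in B[i]]
    return B

def add_multiple(A, i, j, k):
    """[E3] Replace R_j by k*R_i + R_j; only row R_j changes."""
    B = [row[:] for row in A]
    B[j] = [k * x + y for x, y in zip(B[i], B[j])]
    return B

A = [[Fraction(1), Fraction(2)], [Fraction(3), Fraction(4)]]
print(add_multiple(A, 0, 1, -3))  # replaces the second row by -3*R_1 + R_2
```

Using `Fraction` entries keeps a scaling such as multiplication by 1/2 exact.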

Row Equivalence, Rank of a Matrix
A matrix $A$ is said to be row equivalent to a matrix $B$, written

\[
A \sim B
\]

if $B$ can be obtained from $A$ by a sequence of elementary row operations. In the case that $B$ is also an echelon matrix, $B$ is called an echelon form of $A$. The following are two basic results on row equivalence.

THEOREM 3.7: Suppose $A = [a_{ij}]$ and $B = [b_{ij}]$ are row equivalent echelon matrices with respective pivot entries

\[
a_{1j_1}, a_{2j_2}, \ldots, a_{rj_r} \qquad\text{and}\qquad b_{1k_1}, b_{2k_2}, \ldots, b_{sk_s}
\]

Then $A$ and $B$ have the same number of nonzero rows, that is, $r = s$, and the pivot entries are in the same positions, that is, $j_1 = k_1$, $j_2 = k_2$, $\ldots$, $j_r = k_r$.

THEOREM 3.8: Every matrix $A$ is row equivalent to a unique matrix in row canonical form.

The proofs of the above theorems will be postponed to Chapter 4. The unique matrix in Theorem 3.8 is called the row canonical form of $A$. Using the above theorems, we can now give our first definition of the rank of a matrix.

DEFINITION: The rank of a matrix $A$, written $\operatorname{rank}(A)$, is equal to the number of pivots in an echelon form of $A$.

The rank is a very important property of a matrix and, depending on the context in which the matrix is used, it will be defined in many different ways. Of course, all the definitions lead to the same number. The next section gives the matrix format of Gaussian elimination, which finds an echelon form of any matrix $A$ (and hence the rank of $A$), and also finds the row canonical form of $A$.

One can show that row equivalence is an equivalence relation. That is,

(1) $A \sim A$ for any matrix $A$.
(2) If $A \sim B$, then $B \sim A$.
(3) If $A \sim B$ and $B \sim C$, then $A \sim C$.

Property (2) comes from the fact that each elementary row operation has an inverse operation of the same type. Namely,

(i) "Interchange $R_i$ and $R_j$" is its own inverse.
(ii) "Replace $R_i$ by $kR_i$" and "Replace $R_i$ by $(1/k)R_i$" are inverses.
(iii) "Replace $R_j$ by $kR_i + R_j$" and "Replace $R_j$ by $-kR_i + R_j$" are inverses.

There is a similar result for operation [E] (Problem 3.73).

3.8 Gaussian Elimination, Matrix Formulation

This section gives two matrix algorithms that accomplish the following:

(1) Algorithm 3.3 transforms any matrix $A$ into an echelon form.
(2) Algorithm 3.4 transforms the echelon matrix into its row canonical form.

These algorithms, which use the elementary row operations, are simply restatements of Gaussian elimination as applied to matrices rather than to linear equations. (The term "row reduce" or simply "reduce" will mean to transform a matrix by the elementary row operations.)

ALGORITHM 3.3 (Forward Elimination): The input is any matrix $A$. (The algorithm puts 0's below each pivot, working from the "top-down.") The output is an echelon form of $A$.

Step 1. Find the first column with a nonzero entry. Let $j_1$ denote this column.
(a) Arrange so that $a_{1j_1} \ne 0$. That is, if necessary, interchange rows so that a nonzero entry appears in the first row in column $j_1$.
(b) Use $a_{1j_1}$ as a pivot to obtain 0's below $a_{1j_1}$. Specifically, for $i > 1$:
(1) Set $m = -a_{ij_1}/a_{1j_1}$; (2) Replace $R_i$ by $mR_1 + R_i$.
[That is, apply the operation $-(a_{ij_1}/a_{1j_1})R_1 + R_i \to R_i$.]

Step 2. Repeat Step 1 with the submatrix formed by all the rows excluding the first row. Here we let $j_2$ denote the first column in the subsystem with a nonzero entry. Hence, at the end of Step 2, we have $a_{2j_2} \ne 0$.

Steps 3 to r. Continue the above process until a submatrix has only zero rows.

We emphasize that at the end of the algorithm, the pivots will be

\[
a_{1j_1},\ a_{2j_2},\ \ldots,\ a_{rj_r}
\]

where $r$ denotes the number of nonzero rows in the final echelon matrix.

Remark 1: The following number $m$ in Step 1(b) is called the multiplier:

\[
m = -\frac{a_{ij_1}}{a_{1j_1}} = -\frac{\text{entry to be deleted}}{\text{pivot}}
\]

Remark 2: One could replace the operation in Step 1(b) by the following, which would avoid fractions if all the scalars were originally integers:

Replace $R_i$ by $-a_{ij_1}R_1 + a_{1j_1}R_i$.
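The forward elimination steps can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the function name `forward_elimination` is hypothetical, the matrix is a list of rows, indices are 0-based, and exact arithmetic via `fractions.Fraction` stands in for the hand computation with the multiplier $m$ of Remark 1.

```python
from fractions import Fraction

def forward_elimination(A):
    """Return an echelon form of A and the list of pivot columns (0-based)."""
    M = [[Fraction(x) for x in row] for row in A]
    rows = len(M)
    cols = len(M[0]) if M else 0
    pivot_cols = []
    r = 0  # index of the row that will hold the next pivot
    for j in range(cols):
        # Look for a nonzero entry in column j at or below row r.
        p = next((i for i in range(r, rows) if M[i][j] != 0), None)
        if p is None:
            continue  # no pivot in this column of the remaining submatrix
        M[r], M[p] = M[p], M[r]  # Step 1(a): interchange rows if necessary
        for i in range(r + 1, rows):
            m = -M[i][j] / M[r][j]  # multiplier: -(entry to be deleted)/pivot
            M[i] = [m * a + b for a, b in zip(M[r], M[i])]  # R_i <- m*R_r + R_i
        pivot_cols.append(j)
        r += 1
        if r == rows:
            break
    return M, pivot_cols

# Sample input: the matrix A of Example 3.11.
echelon, pivots = forward_elimination([[1, 2, -3, 1, 2],
                                       [2, 4, -4, 6, 10],
                                       [3, 6, -6, 9, 13]])
print(pivots)
```

The number of pivot columns returned is the number of nonzero rows of the echelon form, which is the rank of the input matrix.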
ALGORITHM 3.4 (Backward Elimination): The input is a matrix $A = [a_{ij}]$ in echelon form with pivot entries

\[
a_{1j_1},\ a_{2j_2},\ \ldots,\ a_{rj_r}
\]

The output is the row canonical form of $A$.

Step 1. (a) (Use row scaling so the last pivot equals 1.) Multiply the last nonzero row $R_r$ by $1/a_{rj_r}$.
(b) (Use $a_{rj_r} = 1$ to obtain 0's above the pivot.) For $i = r-1, r-2, \ldots, 2, 1$:
(1) Set $m = -a_{ij_r}$; (2) Replace $R_i$ by $mR_r + R_i$.
(That is, apply the operations $-a_{ij_r}R_r + R_i \to R_i$.)

Steps 2 to r−1. Repeat Step 1 for rows $R_{r-1}, R_{r-2}, \ldots, R_2$.

Step r. (Use row scaling so the first pivot equals 1.) Multiply $R_1$ by $1/a_{1j_1}$.

There is an alternative form of Algorithm 3.4, which we describe here in words. The formal description of this algorithm is left to the reader as a supplementary problem.
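Backward elimination can be sketched in the same style. The sketch below assumes its input is already in echelon form; the function name `backward_elimination` is hypothetical, rows are 0-based, and `fractions.Fraction` keeps the row scaling exact.

```python
from fractions import Fraction

def backward_elimination(E):
    """Reduce an echelon matrix E to its row canonical form."""
    M = [[Fraction(x) for x in row] for row in E]
    # Locate the pivot (first nonzero entry) of each nonzero row.
    pivots = []
    for i, row in enumerate(M):
        j = next((c for c, x in enumerate(row) if x != 0), None)
        if j is not None:
            pivots.append((i, j))
    # Work from the last pivot up to the first.
    for i, j in reversed(pivots):
        pivot = M[i][j]
        M[i] = [x / pivot for x in M[i]]  # row scaling so the pivot equals 1
        for k in range(i):  # rows above the pivot row
            m = -M[k][j]
            M[k] = [m * a + b for a, b in zip(M[i], M[k])]  # R_k <- m*R_i + R_k
    return M

# Sample input: an echelon form of the matrix A of Example 3.11.
canonical = backward_elimination([[1, 2, -3, 1, 2],
                                  [0, 0, 2, 4, 6],
                                  [0, 0, 0, 0, -2]])
print(canonical)
```

Zero rows have no pivot and are simply skipped, so they stay at the bottom of the result.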
ALTERNATIVE ALGORITHM 3.4 Puts 0's above the pivots row by row from the bottom up (rather than column by column from right to left).

The alternative algorithm, when applied to an augmented matrix $M$ of a system of linear equations, is essentially the same as solving for the pivot unknowns one after the other from the bottom up.

Remark: We emphasize that Gaussian elimination is a two-stage process. Specifically,

Stage A (Algorithm 3.3). Puts 0's below each pivot, working from the top row $R_1$ down.
Stage B (Algorithm 3.4). Puts 0's above each pivot, working from the bottom row $R_r$ up.

There is another algorithm, called Gauss–Jordan, that also row reduces a matrix to its row canonical form. The difference is that Gauss–Jordan puts 0's both below and above each pivot as it works its way from the top row $R_1$ down. Although Gauss–Jordan may be easier to state and understand, it is much less efficient than the two-stage Gaussian elimination algorithm.

EXAMPLE 3.11 Consider the matrix

\[
A = \begin{bmatrix}
1 & 2 & -3 & 1 & 2 \\
2 & 4 & -4 & 6 & 10 \\
3 & 6 & -6 & 9 & 13
\end{bmatrix}.
\]

(a) Use Algorithm 3.3 to reduce $A$ to an echelon form.
(b) Use Algorithm 3.4 to further reduce $A$ to its row canonical form.

(a) First use $a_{11} = 1$ as a pivot to obtain 0's below $a_{11}$; that is, apply the operations "Replace $R_2$ by $-2R_1 + R_2$" and "Replace $R_3$ by $-3R_1 + R_3$." Then use $a_{23} = 2$ as a pivot to obtain 0 below $a_{23}$; that is, apply the operation "Replace $R_3$ by $-\tfrac{3}{2}R_2 + R_3$." This yields

\[
A \sim \begin{bmatrix}
1 & 2 & -3 & 1 & 2 \\
0 & 0 & 2 & 4 & 6 \\
0 & 0 & 3 & 6 & 7
\end{bmatrix}
\sim \begin{bmatrix}
1 & 2 & -3 & 1 & 2 \\
0 & 0 & 2 & 4 & 6 \\
0 & 0 & 0 & 0 & -2
\end{bmatrix}
\]

The matrix is now in echelon form.

(b) Multiply $R_3$ by $-\tfrac{1}{2}$ so the pivot entry $a_{35} = 1$, and then use $a_{35} = 1$ as a pivot to obtain 0's above it by the operations "Replace $R_2$ by $-6R_3 + R_2$" and then "Replace $R_1$ by $-2R_3 + R_1$." This yields

\[
A \sim \begin{bmatrix}
1 & 2 & -3 & 1 & 2 \\
0 & 0 & 2 & 4 & 6 \\
0 & 0 & 0 & 0 & 1
\end{bmatrix}
\sim \begin{bmatrix}
1 & 2 & -3 & 1 & 0 \\
0 & 0 & 2 & 4 & 0 \\
0 & 0 & 0 & 0 & 1
\end{bmatrix}.
\]

Multiply $R_2$ by $\tfrac{1}{2}$ so the pivot entry $a_{23} = 1$, and then use $a_{23} = 1$ as a pivot to obtain 0's above it by the operation "Replace $R_1$ by $3R_2 + R_1$." This yields

\[
A \sim \begin{bmatrix}
1 & 2 & -3 & 1 & 0 \\
0 & 0 & 1 & 2 & 0 \\
0 & 0 & 0 & 0 & 1
\end{bmatrix}
\sim \begin{bmatrix}
1 & 2 & 0 & 7 & 0 \\
0 & 0 & 1 & 2 & 0 \\
0 & 0 & 0 & 0 & 1
\end{bmatrix}.
\]

The last matrix is the row canonical form of $A$.

Application to Systems of Linear Equations
One way to solve a system of linear equations is by working with its augmented matrix $M$ rather than the equations themselves. Specifically, we reduce $M$ to echelon form (which tells us whether the system has a solution), and then further reduce $M$ to its row canonical form (which essentially gives the solution of the original system of linear equations). The justification for this process comes from the following facts:

(1) Any elementary row operation on the augmented matrix $M$ of the system is equivalent to applying the corresponding operation on the system itself.
(2) The system has a solution if and only if the echelon form of the augmented matrix $M$ does not have a row of the form $(0, 0, \ldots, 0, b)$ with $b \ne 0$.
(3) In the row canonical form of the augmented matrix $M$ (excluding zero rows), the coefficient of each basic variable is a pivot entry equal to 1, and it is the only nonzero entry in its respective column; hence, the free-variable form of the solution of the system of linear equations is obtained by simply transferring the free variables to the other side.

This process is illustrated below.
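EXAMPLE 3.12 Solve each of the following systems:

\[
\text{(a)}\ \begin{aligned}
x_1 + x_2 - 2x_3 + 4x_4 &= 5 \\
2x_1 + 2x_2 - 3x_3 + x_4 &= 3 \\
3x_1 + 3x_2 - 4x_3 - 2x_4 &= 1
\end{aligned}
\qquad
\text{(b)}\ \begin{aligned}
x_1 + x_2 - 2x_3 + 3x_4 &= 4 \\
2x_1 + 3x_2 + 3x_3 - x_4 &= 3 \\
5x_1 + 7x_2 + 4x_3 + x_4 &= 5
\end{aligned}
\qquad
\text{(c)}\ \begin{aligned}
x + 2y + z &= 3 \\
2x + 5y - z &= -4 \\
3x - 2y - z &= 5
\end{aligned}
\]

(a) Reduce its augmented matrix $M$ to echelon form and then to row canonical form as follows:

\[
M = \begin{bmatrix}
1 & 1 & -2 & 4 & 5 \\
2 & 2 & -3 & 1 & 3 \\
3 & 3 & -4 & -2 & 1
\end{bmatrix}
\sim \begin{bmatrix}
1 & 1 & -2 & 4 & 5 \\
0 & 0 & 1 & -7 & -7 \\
0 & 0 & 2 & -14 & -14
\end{bmatrix}
\sim \begin{bmatrix}
1 & 1 & 0 & -10 & -9 \\
0 & 0 & 1 & -7 & -7 \\
0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]

Rewrite the row canonical form in terms of a system of linear equations to obtain the free-variable form of the solution. That is,

\[
\begin{aligned}
x_1 + x_2 - 10x_4 &= -9 \\
x_3 - 7x_4 &= -7
\end{aligned}
\qquad\text{or}\qquad
\begin{aligned}
x_1 &= -9 - x_2 + 10x_4 \\
x_3 &= -7 + 7x_4
\end{aligned}
\]

(The zero row is omitted in the solution.) Observe that $x_1$ and $x_3$ are the pivot variables, and $x_2$ and $x_4$ are the free variables.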

(b) First reduce its augmented matrix $M$ to echelon form as follows:

\[
M = \begin{bmatrix}
1 & 1 & -2 & 3 & 4 \\
2 & 3 & 3 & -1 & 3 \\
5 & 7 & 4 & 1 & 5
\end{bmatrix}
\sim \begin{bmatrix}
1 & 1 & -2 & 3 & 4 \\
0 & 1 & 7 & -7 & -5 \\
0 & 2 & 14 & -14 & -15
\end{bmatrix}
\sim \begin{bmatrix}
1 & 1 & -2 & 3 & 4 \\
0 & 1 & 7 & -7 & -5 \\
0 & 0 & 0 & 0 & -5
\end{bmatrix}
\]

There is no need to continue to find the row canonical form of $M$, because the echelon form already tells us that the system has no solution. Specifically, the third row of the echelon matrix corresponds to the degenerate equation

\[
0x_1 + 0x_2 + 0x_3 + 0x_4 = -5
\]

which has no solution. Thus, the system has no solution.

(c) Reduce its augmented matrix $M$ to echelon form and then to row canonical form as follows:

\[
M = \begin{bmatrix}
1 & 2 & 1 & 3 \\
2 & 5 & -1 & -4 \\
3 & -2 & -1 & 5
\end{bmatrix}
\sim \begin{bmatrix}
1 & 2 & 1 & 3 \\
0 & 1 & -3 & -10 \\
0 & -8 & -4 & -4
\end{bmatrix}
\sim \begin{bmatrix}
1 & 2 & 1 & 3 \\
0 & 1 & -3 & -10 \\
0 & 0 & -28 & -84
\end{bmatrix}
\]

\[
\sim \begin{bmatrix}
1 & 2 & 1 & 3 \\
0 & 1 & -3 & -10 \\
0 & 0 & 1 & 3
\end{bmatrix}
\sim \begin{bmatrix}
1 & 2 & 0 & 0 \\
0 & 1 & 0 & -1 \\
0 & 0 & 1 & 3
\end{bmatrix}
\sim \begin{bmatrix}
1 & 0 & 0 & 2 \\
0 & 1 & 0 & -1 \\
0 & 0 & 1 & 3
\end{bmatrix}
\]

Thus, the system has the unique solution $x = 2$, $y = -1$, $z = 3$, or, equivalently, the vector $u = (2, -1, 3)$. We note that the echelon form of $M$ already indicated that the solution was unique, because it corresponded to a triangular system.

Application to Existence and Uniqueness Theorems
This subsection gives theoretical conditions for the existence and uniqueness of a solution of a system of linear equations using the notion of the rank of a matrix.
THEOREM 3.9: Consider a system of linear equations in $n$ unknowns with augmented matrix $M = [A, B]$. Then,

(a) The system has a solution if and only if $\operatorname{rank}(A) = \operatorname{rank}(M)$.
(b) The solution is unique if and only if $\operatorname{rank}(A) = \operatorname{rank}(M) = n$.

Proof of (a). The system has a solution if and only if an echelon form of $M = [A, B]$ does not have a row of the form

\[
(0, 0, \ldots, 0, b), \qquad\text{with } b \ne 0
\]

If an echelon form of $M$ does have such a row, then $b$ is a pivot of $M$ but not of $A$, and hence, $\operatorname{rank}(M) > \operatorname{rank}(A)$. Otherwise, the echelon forms of $A$ and $M$ have the same pivots, and hence, $\operatorname{rank}(A) = \operatorname{rank}(M)$. This proves (a).

Proof of (b). The system has a unique solution if and only if an echelon form has no free variable. This means there is a pivot for each unknown. Accordingly, $n = \operatorname{rank}(A) = \operatorname{rank}(M)$. This proves (b).

The above proof uses the fact (Problem 3.74) that an echelon form of the augmented matrix $M = [A, B]$ also automatically yields an echelon form of $A$.
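Theorem 3.9 translates directly into a rank comparison. The following is a sketch under stated assumptions: `rank` and `classify` are hypothetical helper names, the rank is computed by counting pivots during forward elimination with exact `Fraction` arithmetic, and the three sample systems are those of Example 3.12.

```python
from fractions import Fraction

def rank(A):
    """Number of pivots in an echelon form of A (forward elimination)."""
    M = [[Fraction(x) for x in row] for row in A]
    r = 0
    for j in range(len(M[0]) if M else 0):
        p = next((i for i in range(r, len(M)) if M[i][j] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        for i in range(r + 1, len(M)):
            m = -M[i][j] / M[r][j]
            M[i] = [m * a + b for a, b in zip(M[r], M[i])]
        r += 1
        if r == len(M):
            break
    return r

def classify(A, B):
    """Apply Theorem 3.9 to AX = B, where n is the number of unknowns."""
    n = len(A[0])
    M = [row + [b] for row, b in zip(A, B)]  # augmented matrix [A, B]
    rA, rM = rank(A), rank(M)
    if rA != rM:
        return "no solution"
    return "unique solution" if rA == n else "more than one solution"

result_a = classify([[1, 1, -2, 4], [2, 2, -3, 1], [3, 3, -4, -2]], [5, 3, 1])
result_b = classify([[1, 1, -2, 3], [2, 3, 3, -1], [5, 7, 4, 1]], [4, 3, 5])
result_c = classify([[1, 2, 1], [2, 5, -1], [3, -2, -1]], [3, -4, 5])
print(result_a, result_b, result_c, sep=" | ")
```

The sketch reports "more than one solution" rather than "infinitely many" because the count of solutions depends on the field of scalars.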

3.9 Matrix Equation of a System of Linear Equations

The general system (3.2) of $m$ linear equations in $n$ unknowns is equivalent to the matrix equation

\[
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ \vdots \\ x_n \end{bmatrix}
=
\begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}
\qquad\text{or}\qquad AX = B
\]

where $A = [a_{ij}]$ is the coefficient matrix, $X = [x_j]$ is the column vector of unknowns, and $B = [b_i]$ is the column vector of constants. (Some texts write $Ax = b$ rather than $AX = B$, in order to emphasize that $x$ and $b$ are simply column vectors.) The statement that the system of linear equations and the matrix equation are equivalent means that any vector solution of the system is a solution of the matrix equation, and vice versa.

EXAMPLE 3.13 The following system of linear equations and matrix equation are equivalent:

\[
\begin{aligned}
x_1 + 2x_2 - 4x_3 + 7x_4 &= 4 \\
3x_1 - 5x_2 + 6x_3 - 8x_4 &= 8 \\
4x_1 - 3x_2 - 2x_3 + 6x_4 &= 11
\end{aligned}
\qquad\text{and}\qquad
\begin{bmatrix}
1 & 2 & -4 & 7 \\
3 & -5 & 6 & -8 \\
4 & -3 & -2 & 6
\end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}
=
\begin{bmatrix} 4 \\ 8 \\ 11 \end{bmatrix}
\]

We note that $x_1 = 3$, $x_2 = 1$, $x_3 = 2$, $x_4 = 1$, or, in other words, the vector $u = [3, 1, 2, 1]$ is a solution of the system. Thus, the (column) vector $u$ is also a solution of the matrix equation.

The matrix form $AX = B$ of a system of linear equations is notationally very convenient when discussing and proving properties of systems of linear equations. This is illustrated with our first theorem (described in Fig. 3-1), which we restate for easy reference.
THEOREM 3.1: Suppose the field $K$ is infinite. Then the system $AX = B$ has: (a) a unique solution, (b) no solution, or (c) an infinite number of solutions.

Proof. It suffices to show that if $AX = B$ has more than one solution, then it has infinitely many. Suppose $u$ and $v$ are distinct solutions of $AX = B$; that is, $Au = B$ and $Av = B$. Then, for any $k \in K$,

\[
A[u + k(u - v)] = Au + k(Au - Av) = B + k(B - B) = B
\]

Thus, for each $k \in K$, the vector $u + k(u - v)$ is a solution of $AX = B$. Because all such solutions are distinct (Problem 3.47), $AX = B$ has an infinite number of solutions.

Observe that the above theorem is true when $K$ is the real field $\mathbf{R}$ (or the complex field $\mathbf{C}$). Section 3.3 shows that the theorem has a geometrical description when the system consists of two equations in two unknowns, where each equation represents a line in $\mathbf{R}^2$. The theorem also has a geometrical description when the system consists of three nondegenerate equations in three unknowns, where the three equations correspond to planes $H_1$, $H_2$, $H_3$ in $\mathbf{R}^3$. That is,

(a) Unique solution: Here the three planes intersect in exactly one point.
(b) No solution: Here the planes may intersect pairwise but with no common point of intersection, or two of the planes may be parallel.
(c) Infinite number of solutions: Here the three planes may intersect in a line (one free variable), or they may coincide (two free variables).

These three cases are pictured in Fig. 3-3.

Matrix Equation of a Square System of Linear Equations
A system $AX = B$ of linear equations is square if and only if the matrix $A$ of coefficients is square. In such a case, we have the following important result.

[Figure 3-3: Configurations of three planes $H_1$, $H_2$, $H_3$ in $\mathbf{R}^3$. (a) Unique solution. (b) No solutions, cases (i) to (iv). (c) Infinite number of solutions, cases (i) to (iii).]

THEOREM 3.10: A square system $AX = B$ of linear equations has a unique solution if and only if the matrix $A$ is invertible. In such a case, $A^{-1}B$ is the unique solution of the system.

We only prove here that if $A$ is invertible, then $A^{-1}B$ is a unique solution. If $A$ is invertible, then

\[
A(A^{-1}B) = (AA^{-1})B = IB = B
\]

and hence, $A^{-1}B$ is a solution. Now suppose $v$ is any solution, so $Av = B$. Then

\[
v = Iv = (A^{-1}A)v = A^{-1}(Av) = A^{-1}B
\]

Thus, the solution $A^{-1}B$ is unique.
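EXAMPLE 3.14 Consider the following system of linear equations, whose coefficient matrix $A$ and inverse $A^{-1}$ are also given:

\[
\begin{aligned}
x + 2y + 3z &= 1 \\
x + 3y + 6z &= 3 \\
2x + 6y + 13z &= 5
\end{aligned},
\qquad
A = \begin{bmatrix}
1 & 2 & 3 \\
1 & 3 & 6 \\
2 & 6 & 13
\end{bmatrix},
\qquad
A^{-1} = \begin{bmatrix}
3 & -8 & 3 \\
-1 & 7 & -3 \\
0 & -2 & 1
\end{bmatrix}
\]

By Theorem 3.10, the unique solution of the system is

\[
A^{-1}B = \begin{bmatrix}
3 & -8 & 3 \\
-1 & 7 & -3 \\
0 & -2 & 1
\end{bmatrix}
\begin{bmatrix} 1 \\ 3 \\ 5 \end{bmatrix}
=
\begin{bmatrix} -6 \\ 5 \\ -1 \end{bmatrix}
\]

That is, $x = -6$, $y = 5$, $z = -1$.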
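The arithmetic of this example is easy to confirm by machine. The sketch below uses plain Python lists; the helper names `matmul` and `matvec` are hypothetical. It checks that the stated inverse really satisfies $AA^{-1} = I$ and that $A^{-1}B$ solves the system.

```python
A = [[1, 2, 3], [1, 3, 6], [2, 6, 13]]
A_inv = [[3, -8, 3], [-1, 7, -3], [0, -2, 1]]
B = [1, 3, 5]

def matmul(P, Q):
    """Matrix product PQ for matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def matvec(P, v):
    """Product of the matrix P with the column vector v."""
    return [sum(p * x for p, x in zip(row, v)) for row in P]

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
solution = matvec(A_inv, B)

print(matmul(A, A_inv) == identity)  # the given matrix is the inverse of A
print(solution)                      # x, y, z
print(matvec(A, solution) == B)      # the solution satisfies AX = B
```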

Remark: We emphasize that Theorem 3.10 does not usually help us to find the solution of a square system. That is, finding the inverse of a coefficient matrix A is not usually any easier than solving the system directly. Thus, unless we are given the inverse of a coefficient matrix A, as in Example 3.14, we usually solve a square system by Gaussian elimination (or some iterative method whose discussion lies beyond the scope of this text).

3.10 Systems of Linear Equations and Linear Combinations of Vectors

The general system (3.2) of linear equations may be rewritten as the following vector equation:

\[
x_1 \begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{bmatrix}
+ x_2 \begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{bmatrix}
+ \cdots
+ x_n \begin{bmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{bmatrix}
= \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}
\]

Recall that a vector $v$ in $K^n$ is said to be a linear combination of vectors $u_1, u_2, \ldots, u_m$ in $K^n$ if there exist scalars $a_1, a_2, \ldots, a_m$ in $K$ such that

\[
v = a_1u_1 + a_2u_2 + \cdots + a_mu_m
\]

Accordingly, the general system (3.2) of linear equations and the above equivalent vector equation have a solution if and only if the column vector of constants is a linear combination of the columns of the coefficient matrix. We state this observation formally.

THEOREM 3.11: A system $AX = B$ of linear equations has a solution if and only if $B$ is a linear combination of the columns of the coefficient matrix $A$.

Thus, the answer to the problem of expressing a given vector $v$ in $K^n$ as a linear combination of vectors $u_1, u_2, \ldots, u_m$ in $K^n$ reduces to solving a system of linear equations.

Linear Combination Example
Suppose we want to write the vector $v = (1, -2, 5)$ as a linear combination of the vectors

\[
u_1 = (1, 1, 1), \qquad u_2 = (1, 2, 3), \qquad u_3 = (2, -1, 1)
\]

First we write $v = xu_1 + yu_2 + zu_3$ with unknowns $x, y, z$, and then we find the equivalent system of linear equations, which we solve. Specifically, we first write

\[
\begin{bmatrix} 1 \\ -2 \\ 5 \end{bmatrix}
= x\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}
+ y\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}
+ z\begin{bmatrix} 2 \\ -1 \\ 1 \end{bmatrix}
\qquad (*)
\]

Then

\[
\begin{bmatrix} 1 \\ -2 \\ 5 \end{bmatrix}
= \begin{bmatrix} x \\ x \\ x \end{bmatrix}
+ \begin{bmatrix} y \\ 2y \\ 3y \end{bmatrix}
+ \begin{bmatrix} 2z \\ -z \\ z \end{bmatrix}
= \begin{bmatrix} x + y + 2z \\ x + 2y - z \\ x + 3y + z \end{bmatrix}
\]

Setting corresponding entries equal to each other yields the following equivalent system:

\[
\begin{aligned}
x + y + 2z &= 1 \\
x + 2y - z &= -2 \\
x + 3y + z &= 5
\end{aligned}
\qquad (**)
\]

For notational convenience, we have written the vectors in $\mathbf{R}^n$ as columns, because it is then easier to find the equivalent system of linear equations. In fact, one can easily go from the vector equation (*) directly to the system (**). Now we solve the equivalent system of linear equations by reducing the system to echelon form. This yields

\[
\begin{aligned}
x + y + 2z &= 1 \\
y - 3z &= -3 \\
2y - z &= 4
\end{aligned}
\qquad\text{and then}\qquad
\begin{aligned}
x + y + 2z &= 1 \\
y - 3z &= -3 \\
5z &= 10
\end{aligned}
\]

Back-substitution yields the solution $x = -6$, $y = 3$, $z = 2$. Thus, $v = -6u_1 + 3u_2 + 2u_3$.
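The reduction from a linear-combination problem to a linear system can be sketched as code. Note the assumptions: the function name `coefficients` is hypothetical, the sketch uses a Gauss–Jordan style reduction of the augmented matrix (zeros above and below each pivot in one pass) instead of the two-stage procedure used in the text, and it only handles the case in which the coefficients, if they exist, are unique.

```python
from fractions import Fraction

def coefficients(vectors, v):
    """Solve x1*u1 + ... + xm*um = v.

    Returns the list of coefficients, or None if v is not a linear
    combination of the given vectors. Raises ValueError if the
    coefficients are not unique (a case this sketch does not handle).
    """
    n, m = len(v), len(vectors)
    # Row i of the augmented matrix holds the i-th entries of u1..um, then v.
    M = [[Fraction(u[i]) for u in vectors] + [Fraction(v[i])] for i in range(n)]
    r = 0
    for j in range(m):
        p = next((i for i in range(r, n) if M[i][j] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        pivot = M[r][j]
        M[r] = [x / pivot for x in M[r]]
        for i in range(n):
            if i != r and M[i][j] != 0:
                factor = M[i][j]
                M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
    # A row (0, ..., 0, b) with b nonzero means the system is inconsistent.
    if any(all(x == 0 for x in row[:m]) and row[m] != 0 for row in M):
        return None
    if r < m:
        raise ValueError("coefficients are not unique")
    return [M[i][m] for i in range(m)]

coeffs = coefficients([(1, 1, 1), (1, 2, 3), (2, -1, 1)], (1, -2, 5))
print(coeffs)
```

For the vectors of the worked example, the returned coefficients are the values of $x$, $y$, $z$ found by back-substitution.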

EXAMPLE 3.15

(a) Write the vector $v = (4, 9, 19)$ as a linear combination of

\[
u_1 = (1, -2, 3), \qquad u_2 = (3, -7, 10), \qquad u_3 = (2, 1, 9).
\]

Find the equivalent system of linear equations by writing $v = xu_1 + yu_2 + zu_3$, and reduce the system to an echelon form. We have

\[
\begin{aligned}
x + 3y + 2z &= 4 \\
-2x - 7y + z &= 9 \\
3x + 10y + 9z &= 19
\end{aligned}
\quad\text{or}\quad
\begin{aligned}
x + 3y + 2z &= 4 \\
-y + 5z &= 17 \\
y + 3z &= 7
\end{aligned}
\quad\text{or}\quad
\begin{aligned}
x + 3y + 2z &= 4 \\
-y + 5z &= 17 \\
8z &= 24
\end{aligned}
\]

Back-substitution yields the solution $x = 4$, $y = -2$, $z = 3$. Thus, $v$ is a linear combination of $u_1, u_2, u_3$. Specifically, $v = 4u_1 - 2u_2 + 3u_3$.

(b) Write the vector $v = (2, 3, -5)$ as a linear combination of

\[
u_1 = (1, 2, -3), \qquad u_2 = (2, 3, -4), \qquad u_3 = (1, 3, -5)
\]

Find the equivalent system of linear equations by writing $v = xu_1 + yu_2 + zu_3$, and reduce the system to an echelon form. We have

\[
\begin{aligned}
x + 2y + z &= 2 \\
2x + 3y + 3z &= 3 \\
-3x - 4y - 5z &= -5
\end{aligned}
\quad\text{or}\quad
\begin{aligned}
x + 2y + z &= 2 \\
-y + z &= -1 \\
2y - 2z &= 1
\end{aligned}
\quad\text{or}\quad
\begin{aligned}
x + 2y + z &= 2 \\
-y + z &= -1 \\
0 &= -1
\end{aligned}
\]

The last system contains the degenerate equation $0 = -1$ (obtained by replacing $L_3$ by $2L_2 + L_3$), so the system has no solution. Thus, it is impossible to write $v$ as a linear combination of $u_1, u_2, u_3$.

Linear Combinations of Orthogonal Vectors, Fourier Coefficients
Recall first (Section 1.4) that the dot (inner) product $u \cdot v$ of vectors $u = (a_1, \ldots, a_n)$ and $v = (b_1, \ldots, b_n)$ in $\mathbf{R}^n$ is defined by

\[
u \cdot v = a_1b_1 + a_2b_2 + \cdots + a_nb_n
\]

Furthermore, vectors $u$ and $v$ are said to be orthogonal if their dot product $u \cdot v = 0$.

Suppose that $u_1, u_2, \ldots, u_n$ in $\mathbf{R}^n$ are $n$ nonzero pairwise orthogonal vectors. This means

\[
\text{(i)}\ u_i \cdot u_j = 0 \ \text{for } i \ne j \qquad\text{and}\qquad \text{(ii)}\ u_i \cdot u_i \ne 0 \ \text{for each } i
\]

Then, for any vector $v$ in $\mathbf{R}^n$, there is an easy way to write $v$ as a linear combination of $u_1, u_2, \ldots, u_n$, which is illustrated in the next example.
EXAMPLE 3.16 Consider the following three vectors in $\mathbf{R}^3$:

\[
u_1 = (1, 1, 1), \qquad u_2 = (1, -3, 2), \qquad u_3 = (5, -1, -4)
\]

These vectors are pairwise orthogonal; that is,

\[
u_1 \cdot u_2 = 1 - 3 + 2 = 0, \qquad u_1 \cdot u_3 = 5 - 1 - 4 = 0, \qquad u_2 \cdot u_3 = 5 + 3 - 8 = 0
\]

Suppose we want to write $v = (4, 14, -9)$ as a linear combination of $u_1, u_2, u_3$.

Method 1. Find the equivalent system of linear equations as in Example 3.15 and then solve, obtaining $v = 3u_1 - 4u_2 + u_3$.

Method 2. (This method uses the fact that the vectors $u_1, u_2, u_3$ are mutually orthogonal, and hence, the arithmetic is much simpler.) Set $v$ as a linear combination of $u_1, u_2, u_3$ using unknown scalars $x, y, z$ as follows:

\[
(4, 14, -9) = x(1, 1, 1) + y(1, -3, 2) + z(5, -1, -4) \qquad (*)
\]

Take the dot product of (*) with respect to $u_1$ to get

\[
(4, 14, -9) \cdot (1, 1, 1) = x(1, 1, 1) \cdot (1, 1, 1) \quad\text{or}\quad 9 = 3x \quad\text{or}\quad x = 3
\]

(The last two terms drop out, because $u_1$ is orthogonal to $u_2$ and to $u_3$.) Next take the dot product of (*) with respect to $u_2$ to obtain

\[
(4, 14, -9) \cdot (1, -3, 2) = y(1, -3, 2) \cdot (1, -3, 2) \quad\text{or}\quad -56 = 14y \quad\text{or}\quad y = -4
\]

Finally, take the dot product of (*) with respect to $u_3$ to get

\[
(4, 14, -9) \cdot (5, -1, -4) = z(5, -1, -4) \cdot (5, -1, -4) \quad\text{or}\quad 42 = 42z \quad\text{or}\quad z = 1
\]

Thus, $v = 3u_1 - 4u_2 + u_3$.

The procedure in Method 2 in Example 3.16 is valid in general. Namely,

THEOREM 3.12: Suppose $u_1, u_2, \ldots, u_n$ are nonzero mutually orthogonal vectors in $\mathbf{R}^n$. Then, for any vector $v$ in $\mathbf{R}^n$,

\[
v = \frac{v \cdot u_1}{u_1 \cdot u_1}\,u_1 + \frac{v \cdot u_2}{u_2 \cdot u_2}\,u_2 + \cdots + \frac{v \cdot u_n}{u_n \cdot u_n}\,u_n
\]

We emphasize that there must be $n$ such orthogonal vectors $u_i$ in $\mathbf{R}^n$ for the formula to be used. Note also that each $u_i \cdot u_i \ne 0$, because each $u_i$ is a nonzero vector.

Remark: The following scalar $k_i$ (appearing in Theorem 3.12) is called the Fourier coefficient of $v$ with respect to $u_i$:

\[
k_i = \frac{v \cdot u_i}{u_i \cdot u_i} = \frac{v \cdot u_i}{\|u_i\|^2}
\]

It is analogous to a coefficient in the celebrated Fourier series of a function.
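The formula of Theorem 3.12 can be sketched in a few lines. The function names `dot` and `fourier_coefficients` are hypothetical; the sketch assumes integer or rational entries so that `fractions.Fraction` gives exact quotients, and it rejects inputs that are not pairwise orthogonal. The sample data are the vectors of Example 3.16.

```python
from fractions import Fraction

def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

def fourier_coefficients(v, basis):
    """Return k_i = (v . u_i) / (u_i . u_i) for pairwise orthogonal nonzero u_i."""
    for i, u in enumerate(basis):
        for w in basis[i + 1:]:
            if dot(u, w) != 0:
                raise ValueError("vectors are not pairwise orthogonal")
    return [Fraction(dot(v, u), dot(u, u)) for u in basis]

us = [(1, 1, 1), (1, -3, 2), (5, -1, -4)]
v = (4, 14, -9)
ks = fourier_coefficients(v, us)

# Rebuild v from the coefficients as a consistency check.
rebuilt = tuple(sum(k * u[i] for k, u in zip(ks, us)) for i in range(len(v)))
print(ks, rebuilt == v)
```

No system of equations is solved here; each coefficient comes from two dot products, which is why Method 2 is simpler than Method 1.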

3.11 Homogeneous Systems of Linear Equations

A system of linear equations is said to be homogeneous if all the constant terms are zero. Thus, a homogeneous system has the form $AX = 0$. Clearly, such a system always has the zero vector $0 = (0, 0, \ldots, 0)$ as a solution, called the zero or trivial solution. Accordingly, we are usually interested in whether or not the system has a nonzero solution.

Because a homogeneous system $AX = 0$ has at least the zero solution, it can always be put in an echelon form, say

\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + a_{14}x_4 + \cdots + a_{1n}x_n &= 0 \\
a_{2j_2}x_{j_2} + a_{2,j_2+1}x_{j_2+1} + \cdots + a_{2n}x_n &= 0 \\
\cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots & \\
a_{rj_r}x_{j_r} + \cdots + a_{rn}x_n &= 0
\end{aligned}
\]

Here $r$ denotes the number of equations in echelon form and $n$ denotes the number of unknowns. Thus, the echelon system has $n - r$ free variables.

The question of nonzero solutions reduces to the following two cases:

(i) $r = n$. The system has only the zero solution.
(ii) $r < n$. The system has a nonzero solution.

Accordingly, if we begin with fewer equations than unknowns, then, in echelon form, $r < n$, and the system has a nonzero solution. This proves the following important result.

THEOREM 3.13: A homogeneous system $AX = 0$ with more unknowns than equations has a nonzero solution.

EXAMPLE 3.17 Determine whether or not each of the following homogeneous systems has a nonzero solution:

\[
\text{(a)}\ \begin{aligned}
x + y - z &= 0 \\
2x - 3y + z &= 0 \\
x - 4y + 2z &= 0
\end{aligned}
\qquad
\text{(b)}\ \begin{aligned}
x + y - z &= 0 \\
2x + 4y - z &= 0 \\
3x + 2y + 2z &= 0
\end{aligned}
\qquad
\text{(c)}\ \begin{aligned}
x_1 + 2x_2 - 3x_3 + 4x_4 &= 0 \\
2x_1 - 3x_2 + 5x_3 - 7x_4 &= 0 \\
5x_1 + 6x_2 - 9x_3 + 8x_4 &= 0
\end{aligned}
\]

(a) Reduce the system to echelon form as follows:

\[
\begin{aligned}
x + y - z &= 0 \\
-5y + 3z &= 0 \\
-5y + 3z &= 0
\end{aligned}
\qquad\text{and then}\qquad
\begin{aligned}
x + y - z &= 0 \\
-5y + 3z &= 0
\end{aligned}
\]

The system has a nonzero solution, because there are only two equations in the three unknowns in echelon form. Here $z$ is a free variable. Let us, say, set $z = 5$. Then, by back-substitution, $y = 3$ and $x = 2$. Thus, the vector $u = (2, 3, 5)$ is a particular nonzero solution.

(b) Reduce the system to echelon form as follows:

\[
\begin{aligned}
x + y - z &= 0 \\
2y + z &= 0 \\
-y + 5z &= 0
\end{aligned}
\qquad\text{and then}\qquad
\begin{aligned}
x + y - z &= 0 \\
2y + z &= 0 \\
11z &= 0
\end{aligned}
\]

In echelon form, there are three equations in three unknowns. Thus, the system has only the zero solution.

(c) The system must have a nonzero solution (Theorem 3.13), because there are four unknowns but only three equations. (Here we do not need to reduce the system to echelon form.)

Basis for the General Solution of a Homogeneous System
Let $W$ denote the general solution of a homogeneous system $AX = 0$. A list of nonzero solution vectors $u_1, u_2, \ldots, u_s$ of the system is said to be a basis for $W$ if each solution vector $w \in W$ can be expressed uniquely as a linear combination of the vectors $u_1, u_2, \ldots, u_s$; that is, there exist unique scalars $a_1, a_2, \ldots, a_s$ such that

\[
w = a_1u_1 + a_2u_2 + \cdots + a_su_s
\]

The number $s$ of such basis vectors is equal to the number of free variables. This number $s$ is called the dimension of $W$, written as $\dim W = s$. When $W = \{0\}$, that is, when the system has only the zero solution, we define $\dim W = 0$.

The following theorem, proved in Chapter 5, page 171, tells us how to find such a basis.

THEOREM 3.14: Let $W$ be the general solution of a homogeneous system $AX = 0$, and suppose that the echelon form of the homogeneous system has $s$ free variables. Let $u_1, u_2, \ldots, u_s$ be the solutions obtained by setting one of the free variables equal to 1 (or any nonzero constant) and the remaining free variables equal to 0. Then $\dim W = s$, and the vectors $u_1, u_2, \ldots, u_s$ form a basis of $W$.

We emphasize that the general solution $W$ may have many bases, and that Theorem 3.14 only gives us one such basis.
EXAMPLE 3.18 Find the dimension and a basis for the general solution $W$ of the homogeneous system

\[
\begin{aligned}
x_1 + 2x_2 - 3x_3 + 2x_4 - 4x_5 &= 0 \\
2x_1 + 4x_2 - 5x_3 + x_4 - 6x_5 &= 0 \\
5x_1 + 10x_2 - 13x_3 + 4x_4 - 16x_5 &= 0
\end{aligned}
\]

First reduce the system to echelon form. Apply the following operations:

"Replace $L_2$ by $-2L_1 + L_2$" and "Replace $L_3$ by $-5L_1 + L_3$," and then "Replace $L_3$ by $-2L_2 + L_3$"

These operations yield

\[
\begin{aligned}
x_1 + 2x_2 - 3x_3 + 2x_4 - 4x_5 &= 0 \\
x_3 - 3x_4 + 2x_5 &= 0 \\
2x_3 - 6x_4 + 4x_5 &= 0
\end{aligned}
\qquad\text{and}\qquad
\begin{aligned}
x_1 + 2x_2 - 3x_3 + 2x_4 - 4x_5 &= 0 \\
x_3 - 3x_4 + 2x_5 &= 0
\end{aligned}
\]

The system in echelon form has three free variables, $x_2, x_4, x_5$; hence, $\dim W = 3$. Three solution vectors that form a basis for $W$ are obtained as follows:

(1) Set $x_2 = 1$, $x_4 = 0$, $x_5 = 0$. Back-substitution yields the solution $u_1 = (-2, 1, 0, 0, 0)$.
(2) Set $x_2 = 0$, $x_4 = 1$, $x_5 = 0$. Back-substitution yields the solution $u_2 = (7, 0, 3, 1, 0)$.
(3) Set $x_2 = 0$, $x_4 = 0$, $x_5 = 1$. Back-substitution yields the solution $u_3 = (-2, 0, -2, 0, 1)$.

The vectors $u_1 = (-2, 1, 0, 0, 0)$, $u_2 = (7, 0, 3, 1, 0)$, $u_3 = (-2, 0, -2, 0, 1)$ form a basis for $W$.

Remark: Any solution of the system in Example 3.18 can be written in the form

\[ a u_1 + b u_2 + c u_3 = a(-2, 1, 0, 0, 0) + b(7, 0, 3, 1, 0) + c(-2, 0, -2, 0, 1) = (-2a + 7b - 2c,\; a,\; 3b - 2c,\; b,\; c) \]

or

\[ x_1 = -2a + 7b - 2c, \quad x_2 = a, \quad x_3 = 3b - 2c, \quad x_4 = b, \quad x_5 = c \]

where $a, b, c$ are arbitrary constants. Observe that this representation is nothing more than the parametric form of the general solution under the choice of parameters $x_2 = a$, $x_4 = b$, $x_5 = c$.
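The claim in Example 3.18 can be checked numerically. The following is a minimal sketch, assuming plain Python with exact rational arithmetic; the helper names `apply` and `combine` and the particular parameter values are illustrative choices, not part of the text.

```python
from fractions import Fraction

# Coefficient matrix of the homogeneous system in Example 3.18.
A = [
    [1, 2, -3, 2, -4],
    [2, 4, -5, 1, -6],
    [5, 10, -13, 4, -16],
]

# Basis vectors obtained by setting one free variable to 1 and the others to 0.
u1 = [-2, 1, 0, 0, 0]
u2 = [7, 0, 3, 1, 0]
u3 = [-2, 0, -2, 0, 1]


def apply(matrix, vector):
    """Return the product matrix * vector as a list."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def combine(a, b, c):
    """Return the linear combination a*u1 + b*u2 + c*u3."""
    return [a * p + b * q + c * r for p, q, r in zip(u1, u2, u3)]


# Each basis vector solves AX = 0.
residuals = [apply(A, u) for u in (u1, u2, u3)]

# An arbitrary combination is again a solution (parameter values are illustrative).
w = combine(Fraction(1, 2), -3, 4)
```

Note that the free-variable positions of `w` (entries 2, 4, and 5) simply reproduce the parameters $a$, $b$, $c$, which is why the representation is unique.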

Nonhomogeneous and Associated Homogeneous Systems
Let $AX = B$ be a nonhomogeneous system of linear equations. Then $AX = 0$ is called the associated homogeneous system. For example,

\[ \begin{aligned} x + 2y - 4z &= 7 \\ 3x - 5y + 6z &= 8 \end{aligned} \qquad \text{and} \qquad \begin{aligned} x + 2y - 4z &= 0 \\ 3x - 5y + 6z &= 0 \end{aligned} \]

show a nonhomogeneous system and its associated homogeneous system. The relationship between the solution $U$ of a nonhomogeneous system $AX = B$ and the solution $W$ of its associated homogeneous system $AX = 0$ is contained in the following theorem.

THEOREM 3.15: Let $v_0$ be a particular solution of $AX = B$ and let $W$ be the general solution of $AX = 0$. Then the following is the general solution of $AX = B$:

\[ U = v_0 + W = \{ v_0 + w : w \in W \} \]

That is, $U = v_0 + W$ is obtained by adding $v_0$ to each element in $W$. We note that this theorem has a geometrical interpretation in $\mathbf{R}^3$. Specifically, suppose $W$ is a line through the origin $O$. Then, as pictured in Fig. 3-4, $U = v_0 + W$ is the line parallel to $W$ obtained by adding $v_0$ to each element of $W$. Similarly, whenever $W$ is a plane through the origin $O$, then $U = v_0 + W$ is a plane parallel to $W$.
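Theorem 3.15 can be illustrated with the two-equation system shown above. The sketch below is only a check under stated assumptions: the particular solution `v0` (found by setting $z = 0$) and the homogeneous solution `w` were computed for this illustration and do not appear in the text.

```python
from fractions import Fraction as F

# The nonhomogeneous system x + 2y - 4z = 7, 3x - 5y + 6z = 8.
A = [[1, 2, -4], [3, -5, 6]]
B = [7, 8]

# Assumed for illustration: one particular solution (obtained with z = 0)
# and one nonzero solution of the associated homogeneous system.
v0 = [F(51, 11), F(13, 11), F(0)]
w = [8, 18, 11]


def apply(matrix, vector):
    """Return the product matrix * vector as a list."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def shifted(t):
    """Return the point v0 + t*w on the line U = v0 + W."""
    return [p + t * q for p, q in zip(v0, w)]


# Every point v0 + t*w should solve AX = B.
samples = [apply(A, shifted(t)) for t in (-2, 0, F(1, 3), 5)]
```

Here $W$ is a line through the origin, so $U$ is the parallel line through `v0`, matching the geometric picture described in the text.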

[Figure 3-4: a line $W$ through the origin $O$ and the parallel line $U = v_0 + W$]

3.12 Elementary Matrices

Let $e$ denote an elementary row operation and let $e(A)$ denote the result of applying the operation $e$ to a matrix $A$. Now let $E$ be the matrix obtained by applying $e$ to the identity matrix $I$; that is,

\[ E = e(I) \]

Then $E$ is called the elementary matrix corresponding to the elementary row operation $e$. Note that $E$ is always a square matrix.
EXAMPLE 3.19 Consider the following three elementary row operations:

(1) Interchange $R_2$ and $R_3$. (2) Replace $R_2$ by $-6R_2$. (3) Replace $R_3$ by $-4R_1 + R_3$.

The $3 \times 3$ elementary matrices corresponding to the above elementary row operations are as follows:

\[ E_1 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}, \qquad E_2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -6 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \qquad E_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -4 & 0 & 1 \end{bmatrix} \]

The following theorem, proved in Problem 3.34, holds.
THEOREM 3.16: Let $e$ be an elementary row operation and let $E$ be the corresponding $m \times m$ elementary matrix. Then

\[ e(A) = EA \]

where $A$ is any $m \times n$ matrix.

In other words, the result of applying an elementary row operation $e$ to a matrix $A$ can be obtained by premultiplying $A$ by the corresponding elementary matrix $E$. Now suppose $e'$ is the inverse of an elementary row operation $e$, and let $E'$ and $E$ be the corresponding matrices. We note (Problem 3.33) that $E$ is invertible and $E'$ is its inverse. This means, in particular, that any product $P = E_k \cdots E_2 E_1$ of elementary matrices is invertible.
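Theorem 3.16 is easy to check by direct computation. The following sketch uses the three elementary matrices of Example 3.19; the test matrix `A` and the helper functions are hypothetical, chosen only to compare "apply the row operation" with "premultiply by $E$".

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


E1 = [[1, 0, 0], [0, 0, 1], [0, 1, 0]]    # interchange R2 and R3
E2 = [[1, 0, 0], [0, -6, 0], [0, 0, 1]]   # replace R2 by -6 R2
E3 = [[1, 0, 0], [0, 1, 0], [-4, 0, 1]]   # replace R3 by -4 R1 + R3

# An arbitrary 3 x 4 test matrix (not from the text).
A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]


def swap_r2_r3(M):
    """Apply operation (1) directly to the rows of M."""
    return [M[0][:], M[2][:], M[1][:]]


def scale_r2(M):
    """Apply operation (2) directly to the rows of M."""
    return [M[0][:], [-6 * x for x in M[1]], M[2][:]]


def add_to_r3(M):
    """Apply operation (3) directly to the rows of M."""
    return [M[0][:], M[1][:], [-4 * a + c for a, c in zip(M[0], M[2])]]
```

In each case `matmul(E, A)` should agree with the result of the corresponding row operation applied to `A`.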


Applications of Elementary Matrices
Using Theorem 3.16, we are able to prove (Problem 3.35) the following important properties of matrices.
THEOREM 3.17: Let $A$ be a square matrix. Then the following are equivalent:

(a) $A$ is invertible (nonsingular).
(b) $A$ is row equivalent to the identity matrix $I$.
(c) $A$ is a product of elementary matrices.

Recall that square matrices $A$ and $B$ are inverses if $AB = BA = I$. The next theorem (proved in Problem 3.36) demonstrates that we need only show that one of the products is the identity, say $AB = I$, to prove that the matrices are inverses.

THEOREM 3.18: Suppose $AB = I$. Then $BA = I$, and hence, $B = A^{-1}$.

Row equivalence can also be defined in terms of matrix multiplication. Specifically, we will prove (Problem 3.37) the following.

THEOREM 3.19: $B$ is row equivalent to $A$ if and only if there exists a nonsingular matrix $P$ such that $B = PA$.

Application to Finding the Inverse of an $n \times n$ Matrix
The following algorithm finds the inverse of a matrix.

ALGORITHM 3.5: The input is a square matrix $A$. The output is the inverse of $A$ or that the inverse does not exist.

Step 1. Form the $n \times 2n$ (block) matrix $M = [A, I]$, where $A$ is the left half of $M$ and the identity matrix $I$ is the right half of $M$.
Step 2. Row reduce $M$ to echelon form. If the process generates a zero row in the $A$ half of $M$, then STOP: $A$ has no inverse. (Otherwise $A$ is in triangular form.)
Step 3. Further row reduce $M$ to its row canonical form $M \sim [I, B]$, where the identity matrix $I$ has replaced $A$ in the left half of $M$.
Step 4. Set $A^{-1} = B$, the matrix that is now in the right half of $M$.

The justification for the above algorithm is as follows. Suppose $A$ is invertible and, say, the sequence of elementary row operations $e_1, e_2, \ldots, e_q$ applied to $M = [A, I]$ reduces the left half of $M$, which is $A$, to the identity matrix $I$. Let $E_i$ be the elementary matrix corresponding to the operation $e_i$. Then, by applying Theorem 3.16, we get

\[ E_q \cdots E_2 E_1 A = I \quad \text{or} \quad (E_q \cdots E_2 E_1 I) A = I, \quad \text{so} \quad A^{-1} = E_q \cdots E_2 E_1 I \]

That is, $A^{-1}$ can be obtained by applying the elementary row operations $e_1, e_2, \ldots, e_q$ to the identity matrix $I$, which appears in the right half of $M$. Thus, $B = A^{-1}$, as claimed.
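EXAMPLE 3.20 Find the inverse of the matrix $A = \begin{bmatrix} 1 & 0 & 2 \\ 2 & -1 & 3 \\ 4 & 1 & 8 \end{bmatrix}$.

First form the (block) matrix $M = [A, I]$ and row reduce $M$ to an echelon form:

\[ M = \left[\begin{array}{ccc|ccc} 1 & 0 & 2 & 1 & 0 & 0 \\ 2 & -1 & 3 & 0 & 1 & 0 \\ 4 & 1 & 8 & 0 & 0 & 1 \end{array}\right] \sim \left[\begin{array}{ccc|ccc} 1 & 0 & 2 & 1 & 0 & 0 \\ 0 & -1 & -1 & -2 & 1 & 0 \\ 0 & 1 & 0 & -4 & 0 & 1 \end{array}\right] \sim \left[\begin{array}{ccc|ccc} 1 & 0 & 2 & 1 & 0 & 0 \\ 0 & -1 & -1 & -2 & 1 & 0 \\ 0 & 0 & -1 & -6 & 1 & 1 \end{array}\right] \]

In echelon form, the left half of $M$ is in triangular form; hence, $A$ has an inverse. Next we further row reduce $M$ to its row canonical form:

\[ M \sim \left[\begin{array}{ccc|ccc} 1 & 0 & 0 & -11 & 2 & 2 \\ 0 & -1 & 0 & 4 & 0 & -1 \\ 0 & 0 & 1 & 6 & -1 & -1 \end{array}\right] \sim \left[\begin{array}{ccc|ccc} 1 & 0 & 0 & -11 & 2 & 2 \\ 0 & 1 & 0 & -4 & 0 & 1 \\ 0 & 0 & 1 & 6 & -1 & -1 \end{array}\right] \]

The identity matrix is now in the left half of the final matrix; hence, the right half is $A^{-1}$. In other words,

\[ A^{-1} = \begin{bmatrix} -11 & 2 & 2 \\ -4 & 0 & 1 \\ 6 & -1 & -1 \end{bmatrix} \]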
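Algorithm 3.5 can be sketched in a few lines of Python. This is one possible implementation rather than the book's own code: it uses exact fractions, and it merges Steps 2 and 3 by clearing each pivot column above and below the pivot in a single pass (a Gauss–Jordan variant). The function name `inverse` is an illustrative choice.

```python
from fractions import Fraction


def inverse(A):
    """Return the inverse of the square matrix A, or None if A is singular."""
    n = len(A)
    # Step 1: form the block matrix M = [A, I].
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None  # the A half would acquire a zero row: no inverse
        M[col], M[pivot] = M[pivot], M[col]
        # Scale the pivot row so that the pivot equals 1.
        p = M[col][col]
        M[col] = [x / p for x in M[col]]
        # Clear the rest of the column (Steps 2 and 3 merged).
        for r in range(n):
            if r != col and M[r][col] != 0:
                k = M[r][col]
                M[r] = [x - k * y for x, y in zip(M[r], M[col])]
    # Step 4: the right half of M is now the inverse.
    return [row[n:] for row in M]


A = [[1, 0, 2], [2, -1, 3], [4, 1, 8]]   # the matrix of Example 3.20
A_inv = inverse(A)
```

For the matrix of Example 3.20, `A_inv` should agree with the inverse found by hand; for a singular input such as `[[1, 2], [2, 4]]` the function returns `None`.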

Elementary Column Operations
Now let $A$ be a matrix with columns $C_1, C_2, \ldots, C_n$. The following operations on $A$, analogous to the elementary row operations, are called elementary column operations:

$[F_1]$ (Column Interchange): Interchange columns $C_i$ and $C_j$.
$[F_2]$ (Column Scaling): Replace $C_i$ by $kC_i$ (where $k \neq 0$).
$[F_3]$ (Column Addition): Replace $C_j$ by $kC_i + C_j$.

We may indicate each of the column operations by writing, respectively,

\[ (1)\; C_i \leftrightarrow C_j, \qquad (2)\; kC_i \to C_i, \qquad (3)\; (kC_i + C_j) \to C_j \]

Moreover, each column operation has an inverse operation of the same type, just like the corresponding row operation. Now let $f$ denote an elementary column operation, and let $F$ be the matrix obtained by applying $f$ to the identity matrix $I$; that is,

\[ F = f(I) \]

Then $F$ is called the elementary matrix corresponding to the elementary column operation $f$. Note that $F$ is always a square matrix.
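EXAMPLE 3.21 Consider the following elementary column operations:

(1) Interchange $C_1$ and $C_3$; (2) Replace $C_3$ by $-2C_3$; (3) Replace $C_3$ by $-3C_2 + C_3$

The corresponding three $3 \times 3$ elementary matrices are as follows:

\[ F_1 = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix}, \qquad F_2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -2 \end{bmatrix}, \qquad F_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & -3 \\ 0 & 0 & 1 \end{bmatrix} \]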

The following theorem is analogous to Theorem 3.16 for the elementary row operations.
THEOREM 3.20: For any matrix $A$, $f(A) = AF$.

That is, the result of applying an elementary column operation $f$ on a matrix $A$ can be obtained by postmultiplying $A$ by the corresponding elementary matrix $F$.


Matrix Equivalence
A matrix $B$ is equivalent to a matrix $A$ if $B$ can be obtained from $A$ by a sequence of row and column operations. Alternatively, $B$ is equivalent to $A$ if there exist nonsingular matrices $P$ and $Q$ such that $B = PAQ$. Just like row equivalence, equivalence of matrices is an equivalence relation. The main result of this subsection (proved in Problem 3.38) is as follows.

THEOREM 3.21: Every $m \times n$ matrix $A$ is equivalent to a unique block matrix of the form

\[ \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix} \]

where $I_r$ is the $r$-square identity matrix.

The following definition applies.

DEFINITION: The nonnegative integer $r$ in Theorem 3.21 is called the rank of $A$, written $\operatorname{rank}(A)$.

Note that this definition agrees with the previous definition of the rank of a matrix.

3.13 LU DECOMPOSITION

Suppose $A$ is a nonsingular matrix that can be brought into (upper) triangular form $U$ using only row-addition operations; that is, suppose $A$ can be triangularized by the following algorithm, which we write using computer notation.

ALGORITHM 3.6: The input is a matrix $A$ and the output is a triangular matrix $U$.

Step 1. Repeat for $i = 1, 2, \ldots, n - 1$:
Step 2. Repeat for $j = i + 1, i + 2, \ldots, n$:
(a) Set $m_{ji} := -a_{ji}/a_{ii}$.
(b) Set $R_j := m_{ji} R_i + R_j$.
[End of Step 2 inner loop.]
[End of Step 1 outer loop.]

Here $m_{ji}$ is the multiplier used to clear the entry in row $j$ below the pivot $a_{ii}$. The numbers $m_{ji}$ are called multipliers. Sometimes we keep track of these multipliers by means of the following lower triangular matrix $L$:

\[ L = \begin{bmatrix} 1 & 0 & 0 & \cdots & 0 & 0 \\ -m_{21} & 1 & 0 & \cdots & 0 & 0 \\ -m_{31} & -m_{32} & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ -m_{n1} & -m_{n2} & -m_{n3} & \cdots & -m_{n,n-1} & 1 \end{bmatrix} \]

That is, $L$ has 1's on the diagonal, 0's above the diagonal, and the negative of the multiplier $m_{ji}$ as its $ji$-entry below the diagonal. The above matrix $L$ and the triangular matrix $U$ obtained in Algorithm 3.6 give us the classical LU factorization of such a matrix $A$. Namely,

THEOREM 3.22: Let $A$ be a nonsingular matrix that can be brought into triangular form $U$ using only row-addition operations. Then $A = LU$, where $L$ is the above lower triangular matrix with 1's on the diagonal, and $U$ is an upper triangular matrix with no 0's on the diagonal.

EXAMPLE 3.22 Suppose $A = \begin{bmatrix} 1 & 2 & -3 \\ -3 & -4 & 13 \\ 2 & 1 & -5 \end{bmatrix}$. We note that $A$ may be reduced to triangular form by the operations

"Replace $R_2$ by $3R_1 + R_2$"; "Replace $R_3$ by $-2R_1 + R_3$"; and then "Replace $R_3$ by $\tfrac{3}{2}R_2 + R_3$"

That is,

\[ A \sim \begin{bmatrix} 1 & 2 & -3 \\ 0 & 2 & 4 \\ 0 & -3 & 1 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & -3 \\ 0 & 2 & 4 \\ 0 & 0 & 7 \end{bmatrix} \]

This gives us the classical factorization $A = LU$, where

\[ L = \begin{bmatrix} 1 & 0 & 0 \\ -3 & 1 & 0 \\ 2 & -\tfrac{3}{2} & 1 \end{bmatrix} \qquad \text{and} \qquad U = \begin{bmatrix} 1 & 2 & -3 \\ 0 & 2 & 4 \\ 0 & 0 & 7 \end{bmatrix} \]

We emphasize:
(1) The entries $-3, 2, -\tfrac{3}{2}$ in $L$ are the negatives of the multipliers in the above elementary row operations.
(2) $U$ is the triangular form of $A$.
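Algorithm 3.6 together with the bookkeeping for $L$ can be sketched as follows. This is a minimal illustration, assuming exact rational arithmetic and zero-based indices; the function name `lu_decompose` is hypothetical, and the code makes no attempt at row interchanges, so it fails on a zero pivot.

```python
from fractions import Fraction


def lu_decompose(A):
    """Return (L, U) with A = L U, using only row-addition operations.

    Raises ZeroDivisionError if a zero pivot appears, that is, if A cannot
    be triangularized without row interchanges.
    """
    n = len(A)
    U = [[Fraction(x) for x in row] for row in A]
    L = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for i in range(n - 1):                   # Step 1: pivot row i
        for j in range(i + 1, n):            # Step 2: each row j below it
            m = -U[j][i] / U[i][i]           # multiplier for row j
            U[j] = [m * p + x for p, x in zip(U[i], U[j])]
            L[j][i] = -m                     # L stores the negative multiplier
    return L, U


A = [[1, 2, -3], [-3, -4, 13], [2, 1, -5]]   # the matrix of Example 3.22
L, U = lu_decompose(A)
```

For the matrix of Example 3.22, the multipliers are $3$, $-2$, and $\tfrac{3}{2}$, so the entries stored below the diagonal of `L` are $-3$, $2$, and $-\tfrac{3}{2}$.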

Application to Systems of Linear Equations
Consider a computer algorithm $M$. Let $C(n)$ denote the running time of the algorithm as a function of the size $n$ of the input data. [The function $C(n)$ is sometimes called the time complexity or simply the complexity of the algorithm $M$.] Frequently, $C(n)$ simply counts the number of multiplications and divisions executed by $M$, but does not count the number of additions and subtractions because they take much less time to execute.

Now consider a square system of linear equations $AX = B$, where

\[ A = [a_{ij}], \qquad X = [x_1, \ldots, x_n]^T, \qquad B = [b_1, \ldots, b_n]^T \]

and suppose $A$ has an LU factorization. Then the system can be brought into triangular form (in order to apply back-substitution) by applying Algorithm 3.6 to the augmented matrix $M = [A, B]$ of the system. The time complexity of Algorithm 3.6 and back-substitution are, respectively,

\[ C(n) \approx \tfrac{1}{2} n^3 \qquad \text{and} \qquad C(n) \approx \tfrac{1}{2} n^2 \]

where $n$ is the number of equations.

On the other hand, suppose we already have the factorization $A = LU$. Then, to triangularize the system, we need only apply the row operations in the algorithm (retained by the matrix $L$) to the column vector $B$. In this case, the time complexity is

\[ C(n) \approx \tfrac{1}{2} n^2 \]

Of course, to obtain the factorization $A = LU$ requires the original algorithm where $C(n) \approx \tfrac{1}{2} n^3$. Thus, nothing may be gained by first finding the LU factorization when a single system is involved. However, there are situations, illustrated below, where the LU factorization is useful.

Suppose, for a given matrix $A$, we need to solve the system

\[ AX = B \]

repeatedly for a sequence of different constant vectors, say $B_1, B_2, \ldots, B_k$. Also, suppose some of the $B_i$ depend upon the solution of the system obtained while using preceding vectors $B_j$. In such a case, it is more efficient to first find the LU factorization of $A$, and then to use this factorization to solve the system for each new $B$.
EXAMPLE 3.23 Consider the following system of linear equations:

\[ \begin{aligned} x + 2y + z &= k_1 \\ 2x + 3y + 3z &= k_2 \\ -3x + 10y + 2z &= k_3 \end{aligned} \qquad \text{or} \qquad AX = B, \quad \text{where } A = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 3 & 3 \\ -3 & 10 & 2 \end{bmatrix} \text{ and } B = \begin{bmatrix} k_1 \\ k_2 \\ k_3 \end{bmatrix} \]

Suppose we want to solve the system three times where $B$ is equal, say, to $B_1, B_2, B_3$. Furthermore, suppose $B_1 = [1, 1, 1]^T$, and suppose

\[ B_{j+1} = B_j + X_j \qquad (\text{for } j = 1, 2) \]

where $X_j$ is the solution of $AX = B_j$. Here it is more efficient to first obtain the LU factorization of $A$ and then use the LU factorization to solve the system for each of the $B$'s. (This is done in Problem 3.42.)
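The scheme of Example 3.23 can be sketched as follows: factor $A$ once, then reuse $L$ and $U$ for every right-hand side. This is an illustrative sketch, not the worked solution of Problem 3.42; the names `lu_decompose`, `lu_solve`, and `history` are assumptions, and applying the stored row operations to $B$ is expressed here as the equivalent forward substitution $LY = B$.

```python
from fractions import Fraction

A = [[1, 2, 1], [2, 3, 3], [-3, 10, 2]]


def lu_decompose(A):
    """Return (L, U) with A = L U, using only row-addition operations."""
    n = len(A)
    U = [[Fraction(x) for x in row] for row in A]
    L = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for i in range(n - 1):
        for j in range(i + 1, n):
            m = -U[j][i] / U[i][i]
            U[j] = [m * p + x for p, x in zip(U[i], U[j])]
            L[j][i] = -m
    return L, U


def lu_solve(L, U, B):
    """Solve A X = B given A = L U, in roughly n^2 operations."""
    n = len(B)
    # Forward substitution: solve L Y = B (L has 1's on the diagonal).
    Y = []
    for i in range(n):
        Y.append(Fraction(B[i]) - sum(L[i][k] * Y[k] for k in range(i)))
    # Back-substitution: solve U X = Y.
    X = [Fraction(0)] * n
    for i in reversed(range(n)):
        s = sum(U[i][k] * X[k] for k in range(i + 1, n))
        X[i] = (Y[i] - s) / U[i][i]
    return X


L, U = lu_decompose(A)                    # the n^3 work is done once
B = [Fraction(1)] * 3                     # B1 = [1, 1, 1]^T
history = []
for _ in range(3):
    X = lu_solve(L, U, B)                 # each solve reuses L and U
    history.append((B, X))
    B = [b + x for b, x in zip(B, X)]     # B_{j+1} = B_j + X_j
```

Each pair in `history` holds a right-hand side $B_j$ and its solution $X_j$; the factorization itself is never recomputed.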

SOLVED PROBLEMS

Linear Equations, Solutions, $2 \times 2$ Systems

3.1. Determine whether each of the following equations is linear:
(a) $5x + 7y - 8yz = 16$, (b) $x + \pi y + ez = \log 5$, (c) $3x + ky - 8z = 16$

(a) No, because the product $yz$ of two unknowns is of second degree.
(b) Yes, because $\pi$, $e$, and $\log 5$ are constants.
(c) As it stands, there are four unknowns: $x, y, z, k$. Because of the term $ky$ it is not a linear equation. However, assuming $k$ is a constant, the equation is linear in the unknowns $x, y, z$.

3.2. Determine whether the following vectors are solutions of $x_1 + 2x_2 - 4x_3 + 3x_4 = 15$: (a) $u = (3, 2, 1, 4)$ and (b) $v = (1, 2, 4, 5)$.

(a) Substitute to obtain $3 + 2(2) - 4(1) + 3(4) = 15$, or $15 = 15$; yes, it is a solution.
(b) Substitute to obtain $1 + 2(2) - 4(4) + 3(5) = 15$, or $4 = 15$; no, it is not a solution.

3.3. Solve (a) $ex = \pi$, (b) $3x - 4 - x = 2x + 3$, (c) $7 + 2x - 4 = 3x + 3 - x$

(a) Because $e \neq 0$, multiply by $1/e$ to obtain $x = \pi/e$.
(b) Rewrite in standard form, obtaining $0x = 7$. The equation has no solution.
(c) Rewrite in standard form, obtaining $0x = 0$. Every scalar $k$ is a solution.

3.4. Prove Theorem 3.4: Consider the equation $ax = b$.
(i) If $a \neq 0$, then $x = b/a$ is a unique solution of $ax = b$.
(ii) If $a = 0$ but $b \neq 0$, then $ax = b$ has no solution.
(iii) If $a = 0$ and $b = 0$, then every scalar $k$ is a solution of $ax = b$.

Suppose $a \neq 0$. Then the scalar $b/a$ exists. Substituting $b/a$ in $ax = b$ yields $a(b/a) = b$, or $b = b$; hence, $b/a$ is a solution. On the other hand, suppose $x_0$ is a solution to $ax = b$, so that $ax_0 = b$. Multiplying both sides by $1/a$ yields $x_0 = b/a$. Hence, $b/a$ is the unique solution of $ax = b$. Thus, (i) is proved.

On the other hand, suppose $a = 0$. Then, for any scalar $k$, we have $ak = 0k = 0$. If $b \neq 0$, then $ak \neq b$. Accordingly, $k$ is not a solution of $ax = b$, and so (ii) is proved. If $b = 0$, then $ak = b$. That is, any scalar $k$ is a solution of $ax = b$, and so (iii) is proved.
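The three cases of Theorem 3.4 translate directly into a small case analysis. The sketch below is an illustration only; the function name and the string labels are hypothetical, and it assumes rational (for example, integer) inputs so that exact arithmetic applies.

```python
from fractions import Fraction


def solve_ax_eq_b(a, b):
    """Classify the solutions of a*x = b for rational scalars a and b.

    Returns ("unique", x), ("none", None), or ("all", None).
    """
    if a != 0:
        return ("unique", Fraction(b) / Fraction(a))   # case (i): x = b/a
    if b != 0:
        return ("none", None)                          # case (ii): 0x = b, b != 0
    return ("all", None)                               # case (iii): 0x = 0


# The standard forms 0x = 7 and 0x = 0 from Problem 3.3(b) and (c).
case_b = solve_ax_eq_b(0, 7)
case_c = solve_ax_eq_b(0, 0)
```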

3.5. Solve each of the following systems:

\[ \text{(a)}\; \begin{aligned} 2x - 5y &= 11 \\ 3x + 4y &= 5 \end{aligned} \qquad \text{(b)}\; \begin{aligned} 2x - 3y &= 8 \\ -6x + 9y &= 6 \end{aligned} \qquad \text{(c)}\; \begin{aligned} 2x - 3y &= 8 \\ -4x + 6y &= -16 \end{aligned} \]

(a) Eliminate $x$ from the equations by forming the new equation $L = -3L_1 + 2L_2$. This yields the equation

\[ 23y = -23, \quad \text{and so} \quad y = -1 \]

Substitute $y = -1$ in one of the original equations, say $L_1$, to get

\[ 2x - 5(-1) = 11 \quad \text{or} \quad 2x + 5 = 11 \quad \text{or} \quad 2x = 6 \quad \text{or} \quad x = 3 \]

Thus, $x = 3$, $y = -1$ or the pair $u = (3, -1)$ is the unique solution of the system.

(b) Eliminate $x$ from the equations by forming the new equation $L = 3L_1 + L_2$. This yields the equation

\[ 0x + 0y = 30 \]

This is a degenerate equation with a nonzero constant; hence, this equation and the system have no solution. (Geometrically, the lines corresponding to the equations are parallel.)

(c) Eliminate $x$ from the equations by forming the new equation $L = 2L_1 + L_2$. This yields the equation

\[ 0x + 0y = 0 \]

This is a degenerate equation where the constant term is also zero. Thus, the system has an infinite number of solutions, which correspond to the solution of either equation. (Geometrically, the lines corresponding to the equations coincide.) To find the general solution, set $y = a$ and substitute in $L_1$ to obtain

\[ 2x - 3a = 8 \quad \text{or} \quad 2x = 3a + 8 \quad \text{or} \quad x = \tfrac{3}{2}a + 4 \]

Thus, the general solution is

\[ x = \tfrac{3}{2}a + 4, \quad y = a \qquad \text{or} \qquad u = \left( \tfrac{3}{2}a + 4,\; a \right) \]

where $a$ is any scalar.

3.6. Consider the system

\[ \begin{aligned} x + ay &= 4 \\ ax + 9y &= b \end{aligned} \]

(a) For which values of $a$ does the system have a unique solution?
(b) Find those pairs of values $(a, b)$ for which the system has more than one solution.

(a) Eliminate $x$ from the equations by forming the new equation $L = -aL_1 + L_2$. This yields the equation

\[ (9 - a^2)y = b - 4a \qquad (1) \]

The system has a unique solution if and only if the coefficient of $y$ in (1) is not zero; that is, if $9 - a^2 \neq 0$ or if $a \neq \pm 3$.

(b) The system has more than one solution if both sides of (1) are zero. The left-hand side is zero when $a = \pm 3$. When $a = 3$, the right-hand side is zero when $b - 12 = 0$ or $b = 12$. When $a = -3$, the right-hand side is zero when $b + 12 = 0$ or $b = -12$. Thus, $(3, 12)$ and $(-3, -12)$ are the pairs for which the system has more than one solution.

Systems in Triangular and Echelon Form

3.7. Determine the pivot and free variables in each of the following systems:

\[ \text{(a)}\; \begin{aligned} 2x_1 - 3x_2 - 6x_3 - 5x_4 + 2x_5 &= 7 \\ x_3 + 3x_4 - 7x_5 &= 6 \\ x_4 - 2x_5 &= 1 \end{aligned} \qquad \text{(b)}\; \begin{aligned} 2x - 6y + 7z &= 1 \\ 4y + 3z &= 8 \\ 2z &= 4 \end{aligned} \qquad \text{(c)}\; \begin{aligned} x + 2y - 3z &= 2 \\ 2x + 3y + z &= 4 \\ 3x + 4y + 5z &= 8 \end{aligned} \]

(a) In echelon form, the leading unknowns are the pivot variables, and the others are the free variables. Here $x_1$, $x_3$, $x_4$ are the pivot variables, and $x_2$ and $x_5$ are the free variables.
(b) The leading unknowns are $x, y, z$, so they are the pivot variables. There are no free variables (as in any triangular system).
(c) The notion of pivot and free variables applies only to a system in echelon form.

3.8. Solve the triangular system in Problem 3.7(b).

Because it is a triangular system, solve by back-substitution.
(i) The last equation gives $z = 2$.
(ii) Substitute $z = 2$ in the second equation to get $4y + 6 = 8$ or $y = \tfrac{1}{2}$.
(iii) Substitute $z = 2$ and $y = \tfrac{1}{2}$ in the first equation to get

\[ 2x - 6\left(\tfrac{1}{2}\right) + 7(2) = 1 \quad \text{or} \quad 2x + 11 = 1 \quad \text{or} \quad x = -5 \]

Thus, $x = -5$, $y = \tfrac{1}{2}$, $z = 2$ or $u = (-5, \tfrac{1}{2}, 2)$ is the unique solution to the system.
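Back-substitution for a triangular system, as used in Problem 3.8, can be sketched as below. The function name `back_substitute` is an illustrative choice, and the code assumes an upper triangular coefficient matrix with nonzero diagonal entries.

```python
from fractions import Fraction


def back_substitute(U, b):
    """Solve the triangular system U x = b, where U is upper triangular
    with nonzero diagonal entries."""
    n = len(b)
    x = [Fraction(0)] * n
    for i in reversed(range(n)):          # start with the last equation
        known = sum(U[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (Fraction(b[i]) - known) / U[i][i]
    return x


# The triangular system of Problem 3.7(b).
U = [[2, -6, 7], [0, 4, 3], [0, 0, 2]]
b = [1, 8, 4]
solution = back_substitute(U, b)
```

The loop mirrors steps (i) to (iii) of the hand computation: the last unknown is found first, and each earlier equation is solved using the values already known.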

3.9. Solve the echelon system in Problem 3.7(a).

Assign parameters to the free variables, say $x_2 = a$ and $x_5 = b$, and solve for the pivot variables by back-substitution.
(i) Substitute $x_5 = b$ in the last equation to get $x_4 - 2b = 1$ or $x_4 = 2b + 1$.
(ii) Substitute $x_5 = b$ and $x_4 = 2b + 1$ in the second equation to get

\[ x_3 + 3(2b + 1) - 7b = 6 \quad \text{or} \quad x_3 - b + 3 = 6 \quad \text{or} \quad x_3 = b + 3 \]

(iii) Substitute $x_5 = b$, $x_4 = 2b + 1$, $x_3 = b + 3$, $x_2 = a$ in the first equation to get

\[ 2x_1 - 3a - 6(b + 3) - 5(2b + 1) + 2b = 7 \quad \text{or} \quad 2x_1 - 3a - 14b - 23 = 7 \quad \text{or} \quad x_1 = \tfrac{3}{2}a + 7b + 15 \]

Thus,

\[ x_1 = \tfrac{3}{2}a + 7b + 15, \quad x_2 = a, \quad x_3 = b + 3, \quad x_4 = 2b + 1, \quad x_5 = b \]

or

\[ u = \left( \tfrac{3}{2}a + 7b + 15,\; a,\; b + 3,\; 2b + 1,\; b \right) \]

is the parametric form of the general solution.

Alternatively, solving for the pivot variables $x_1, x_3, x_4$ in terms of the free variables $x_2$ and $x_5$ yields the following free-variable form of the general solution:

\[ x_1 = \tfrac{3}{2}x_2 + 7x_5 + 15, \qquad x_3 = x_5 + 3, \qquad x_4 = 2x_5 + 1 \]

3.10. Prove Theorem 3.6. Consider the system (3.4) of linear equations in echelon form with $r$ equations and $n$ unknowns.
(i) If $r = n$, then the system has a unique solution.
(ii) If $r < n$, then we can arbitrarily assign values to the $n - r$ free variables and solve uniquely for the $r$ pivot variables, obtaining a solution of the system.

(i) Suppose $r = n$. Then we have a square system $AX = B$ where the matrix $A$ of coefficients is (upper) triangular with nonzero diagonal elements. Thus, $A$ is invertible. By Theorem 3.10, the system has a unique solution.
(ii) Assigning values to the $n - r$ free variables yields a triangular system in the pivot variables, which, by (i), has a unique solution.

Gaussian Elimination

3.11. Solve each of the following systems:

\[ \text{(a)}\; \begin{aligned} x + 2y - 4z &= -4 \\ 2x + 5y - 9z &= -10 \\ 3x - 2y + 3z &= 11 \end{aligned} \qquad \text{(b)}\; \begin{aligned} x + 2y - 3z &= -1 \\ -3x + y - 2z &= -7 \\ 5x + 3y - 4z &= 2 \end{aligned} \qquad \text{(c)}\; \begin{aligned} x + 2y - 3z &= 1 \\ 2x + 5y - 8z &= 4 \\ 3x + 8y - 13z &= 7 \end{aligned} \]

Reduce each system to triangular or echelon form using Gaussian elimination:

(a) Apply "Replace $L_2$ by $-2L_1 + L_2$" and "Replace $L_3$ by $-3L_1 + L_3$" to eliminate $x$ from the second and third equations, and then apply "Replace $L_3$ by $8L_2 + L_3$" to eliminate $y$ from the third equation. These operations yield

\[ \begin{aligned} x + 2y - 4z &= -4 \\ y - z &= -2 \\ -8y + 15z &= 23 \end{aligned} \qquad \text{and then} \qquad \begin{aligned} x + 2y - 4z &= -4 \\ y - z &= -2 \\ 7z &= 7 \end{aligned} \]

The system is in triangular form. Solve by back-substitution to obtain the unique solution $u = (2, -1, 1)$.

(b) Eliminate $x$ from the second and third equations by the operations "Replace $L_2$ by $3L_1 + L_2$" and "Replace $L_3$ by $-5L_1 + L_3$." This gives the equivalent system

\[ \begin{aligned} x + 2y - 3z &= -1 \\ 7y - 11z &= -10 \\ -7y + 11z &= 7 \end{aligned} \]

The operation "Replace $L_3$ by $L_2 + L_3$" yields the following degenerate equation with a nonzero constant:

\[ 0x + 0y + 0z = -3 \]

This equation and hence the system have no solution.

(c) Eliminate $x$ from the second and third equations by the operations "Replace $L_2$ by $-2L_1 + L_2$" and "Replace $L_3$ by $-3L_1 + L_3$." This yields the new system

\[ \begin{aligned} x + 2y - 3z &= 1 \\ y - 2z &= 2 \\ 2y - 4z &= 4 \end{aligned} \qquad \text{or} \qquad \begin{aligned} x + 2y - 3z &= 1 \\ y - 2z &= 2 \end{aligned} \]

(The third equation is deleted, because it is a multiple of the second equation.) The system is in echelon form with pivot variables $x$ and $y$ and free variable $z$.

To find the parametric form of the general solution, set $z = a$ and solve for $x$ and $y$ by back-substitution. Substitute $z = a$ in the second equation to get $y = 2 + 2a$. Then substitute $z = a$ and $y = 2 + 2a$ in the first equation to get

\[ x + 2(2 + 2a) - 3a = 1 \quad \text{or} \quad x + 4 + a = 1 \quad \text{or} \quad x = -3 - a \]

Thus, the general solution is

\[ x = -3 - a, \quad y = 2 + 2a, \quad z = a \qquad \text{or} \qquad u = (-3 - a,\; 2 + 2a,\; a) \]

where $a$ is a parameter.
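The three outcomes in Problem 3.11 (unique solution, no solution, infinitely many solutions) can be read off the echelon form of the augmented matrix. The following is a minimal sketch under stated assumptions: the function names `echelon` and `classify` and the returned strings are hypothetical, and the elimination allows row interchanges when a pivot position holds a zero.

```python
from fractions import Fraction


def echelon(M):
    """Return an echelon form of M (a list of rows), dropping zero rows."""
    rows = [[Fraction(x) for x in row] for row in M]
    top = 0
    for col in range(len(rows[0])):
        # Look for a row at or below `top` with a nonzero entry in this column.
        r = next((i for i in range(top, len(rows)) if rows[i][col] != 0), None)
        if r is None:
            continue
        rows[top], rows[r] = rows[r], rows[top]
        for i in range(top + 1, len(rows)):
            k = rows[i][col] / rows[top][col]
            rows[i] = [x - k * p for x, p in zip(rows[i], rows[top])]
        top += 1
        if top == len(rows):
            break
    return [row for row in rows if any(row)]


def classify(M):
    """Classify the system with augmented matrix M = [A, B]."""
    E = echelon(M)
    n = len(M[0]) - 1                       # number of unknowns
    if any(not any(row[:-1]) for row in E):
        return "no solution"                # a degenerate row 0 = c with c != 0
    return "unique solution" if len(E) == n else "infinitely many solutions"


system_a = [[1, 2, -4, -4], [2, 5, -9, -10], [3, -2, 3, 11]]
system_b = [[1, 2, -3, -1], [-3, 1, -2, -7], [5, 3, -4, 2]]
system_c = [[1, 2, -3, 1], [2, 5, -8, 4], [3, 8, -13, 7]]
```

A consistent system has a unique solution exactly when the number of nonzero rows in echelon form equals the number of unknowns; otherwise the remaining unknowns are free variables.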

3.12. Solve each of the following systems:

\[ \text{(a)}\; \begin{aligned} x_1 - 3x_2 + 2x_3 - x_4 + 2x_5 &= 2 \\ 3x_1 - 9x_2 + 7x_3 - x_4 + 3x_5 &= 7 \\ 2x_1 - 6x_2 + 7x_3 + 4x_4 - 5x_5 &= 7 \end{aligned} \qquad \text{(b)}\; \begin{aligned} x_1 + 2x_2 - 3x_3 + 4x_4 &= 2 \\ 2x_1 + 5x_2 - 2x_3 + x_4 &= 1 \\ 5x_1 + 12x_2 - 7x_3 + 6x_4 &= 3 \end{aligned} \]

Reduce each system to echelon form using Gaussian elimination:

(a) Apply "Replace $L_2$ by $-3L_1 + L_2$" and "Replace $L_3$ by $-2L_1 + L_3$" to eliminate $x_1$ from the second and third equations. This yields

\[ \begin{aligned} x_1 - 3x_2 + 2x_3 - x_4 + 2x_5 &= 2 \\ x_3 + 2x_4 - 3x_5 &= 1 \\ 3x_3 + 6x_4 - 9x_5 &= 3 \end{aligned} \qquad \text{or} \qquad \begin{aligned} x_1 - 3x_2 + 2x_3 - x_4 + 2x_5 &= 2 \\ x_3 + 2x_4 - 3x_5 &= 1 \end{aligned} \]

(We delete $L_3$, because it is a multiple of $L_2$.) The system is in echelon form with pivot variables $x_1$ and $x_3$ and free variables $x_2, x_4, x_5$.

To find the parametric form of the general solution, set $x_2 = a$, $x_4 = b$, $x_5 = c$, where $a, b, c$ are parameters. Back-substitution yields $x_3 = 1 - 2b + 3c$ and $x_1 = 3a + 5b - 8c$. The general solution is

\[ x_1 = 3a + 5b - 8c, \quad x_2 = a, \quad x_3 = 1 - 2b + 3c, \quad x_4 = b, \quad x_5 = c \]

or, equivalently, $u = (3a + 5b - 8c,\; a,\; 1 - 2b + 3c,\; b,\; c)$.

(b) Eliminate $x_1$ from the second and third equations by the operations "Replace $L_2$ by $-2L_1 + L_2$" and "Replace $L_3$ by $-5L_1 + L_3$." This yields the system

\[ \begin{aligned} x_1 + 2x_2 - 3x_3 + 4x_4 &= 2 \\ x_2 + 4x_3 - 7x_4 &= -3 \\ 2x_2 + 8x_3 - 14x_4 &= -7 \end{aligned} \]

The operation "Replace $L_3$ by $-2L_2 + L_3$" yields the degenerate equation $0 = -1$. Thus, the system has no solution (even though the system has more unknowns than equations).

3.13. Solve using the condensed format:

\[ \begin{aligned} 2y + 3z &= 3 \\ x + y + z &= 4 \\ 4x + 8y - 3z &= 35 \end{aligned} \]

The condensed format follows:

\[ \begin{array}{lrl} \text{Number} & \text{Equation} & \text{Operation} \\ \hline (2) \text{ [originally } (1)\text{]} & 2y + 3z = 3 & L_1 \leftrightarrow L_2 \\ (1) \text{ [originally } (2)\text{]} & x + y + z = 4 & L_1 \leftrightarrow L_2 \\ (3) & 4x + 8y - 3z = 35 & \\ (3') & 4y - 7z = 19 & \text{Replace } L_3 \text{ by } -4L_1 + L_3 \\ (3'') & -13z = 13 & \text{Replace } L_3 \text{ by } -2L_2 + L_3 \end{array} \]

Here (1), (2), and (3'') form a triangular system. (We emphasize that the interchange of $L_1$ and $L_2$ is accomplished by simply renumbering $L_1$ and $L_2$ as above.)

Using back-substitution with the triangular system yields $z = -1$ from $L_3$, $y = 3$ from $L_2$, and $x = 2$ from $L_1$. Thus, the unique solution of the system is $x = 2$, $y = 3$, $z = -1$ or the triple $u = (2, 3, -1)$.

3.14. Consider the system

\[ \begin{aligned} x + 2y + z &= 3 \\ ay + 5z &= 10 \\ 2x + 7y + az &= b \end{aligned} \]

(a) Find those values of $a$ for which the system has a unique solution.
(b) Find those pairs of values $(a, b)$ for which the system has more than one solution.

Reduce the system to echelon form. That is, eliminate $x$ from the third equation by the operation "Replace $L_3$ by $-2L_1 + L_3$" and then eliminate $y$ from the third equation by the operation "Replace $L_3$ by $-3L_2 + aL_3$." This yields

\[ \begin{aligned} x + 2y + z &= 3 \\ ay + 5z &= 10 \\ 3y + (a - 2)z &= b - 6 \end{aligned} \qquad \text{and then} \qquad \begin{aligned} x + 2y + z &= 3 \\ ay + 5z &= 10 \\ (a^2 - 2a - 15)z &= ab - 6a - 30 \end{aligned} \]

Examine the last equation $(a^2 - 2a - 15)z = ab - 6a - 30$.

(a) The system has a unique solution if and only if the coefficient of $z$ is not zero; that is, if

\[ a^2 - 2a - 15 = (a - 5)(a + 3) \neq 0 \quad \text{or} \quad a \neq 5 \text{ and } a \neq -3 \]

(b) The system has more than one solution if both sides are zero. The left-hand side is zero when $a = 5$ or $a = -3$. When $a = 5$, the right-hand side is zero when $5b - 60 = 0$, or $b = 12$. When $a = -3$, the right-hand side is zero when $-3b - 12 = 0$, or $b = -4$. Thus, $(5, 12)$ and $(-3, -4)$ are the pairs for which the system has more than one solution.

Echelon Matrices, Row Equivalence, Row Canonical Form

3.15. Row reduce each of the following matrices to echelon form:

\[ \text{(a)}\; A = \begin{bmatrix} 1 & 2 & -3 & 0 \\ 2 & 4 & -2 & 2 \\ 3 & 6 & -4 & 3 \end{bmatrix}; \qquad \text{(b)}\; B = \begin{bmatrix} -4 & 1 & -6 \\ 1 & 2 & -5 \\ 6 & 3 & -4 \end{bmatrix} \]

(a) Use $a_{11} = 1$ as a pivot to obtain 0's below $a_{11}$; that is, apply the row operations "Replace $R_2$ by $-2R_1 + R_2$" and "Replace $R_3$ by $-3R_1 + R_3$." Then use $a_{23} = 4$ as a pivot to obtain a 0 below $a_{23}$; that is, apply the row operation "Replace $R_3$ by $-5R_2 + 4R_3$." These operations yield

\[ A \sim \begin{bmatrix} 1 & 2 & -3 & 0 \\ 0 & 0 & 4 & 2 \\ 0 & 0 & 5 & 3 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & -3 & 0 \\ 0 & 0 & 4 & 2 \\ 0 & 0 & 0 & 2 \end{bmatrix} \]

The matrix is now in echelon form.

(b) Hand calculations are usually simpler if the pivot element equals 1. Therefore, first interchange $R_1$ and $R_2$. Next apply the operations "Replace $R_2$ by $4R_1 + R_2$" and "Replace $R_3$ by $-6R_1 + R_3$"; and then apply the operation "Replace $R_3$ by $R_2 + R_3$." These operations yield

\[ B \sim \begin{bmatrix} 1 & 2 & -5 \\ -4 & 1 & -6 \\ 6 & 3 & -4 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & -5 \\ 0 & 9 & -26 \\ 0 & -9 & 26 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & -5 \\ 0 & 9 & -26 \\ 0 & 0 & 0 \end{bmatrix} \]

The matrix is now in echelon form.

3.16. Describe the pivoting row-reduction algorithm. Also describe the advantages, if any, of using this pivoting algorithm.

The row-reduction algorithm becomes a pivoting algorithm if the entry in column $j$ of greatest absolute value is chosen as the pivot $a_{1j_1}$ and if one uses the row operation

\[ (-a_{ij_1}/a_{1j_1})R_1 + R_i \to R_i \]

The main advantage of the pivoting algorithm is that the above row operation involves division by the (current) pivot $a_{1j_1}$, and, on the computer, roundoff errors may be substantially reduced when one divides by a number as large in absolute value as possible.

3.17. Let $A = \begin{bmatrix} 2 & -2 & 2 & 1 \\ -3 & 6 & 0 & -1 \\ 1 & -7 & 10 & 2 \end{bmatrix}$. Reduce $A$ to echelon form using the pivoting algorithm.

First interchange $R_1$ and $R_2$ so that $-3$ can be used as the pivot, and then apply the operations "Replace $R_2$ by $\tfrac{2}{3}R_1 + R_2$" and "Replace $R_3$ by $\tfrac{1}{3}R_1 + R_3$." These operations yield

\[ A \sim \begin{bmatrix} -3 & 6 & 0 & -1 \\ 2 & -2 & 2 & 1 \\ 1 & -7 & 10 & 2 \end{bmatrix} \sim \begin{bmatrix} -3 & 6 & 0 & -1 \\ 0 & 2 & 2 & \tfrac{1}{3} \\ 0 & -5 & 10 & \tfrac{5}{3} \end{bmatrix} \]

Now interchange $R_2$ and $R_3$ so that $-5$ can be used as the pivot, and then apply the operation "Replace $R_3$ by $\tfrac{2}{5}R_2 + R_3$." We obtain

\[ A \sim \begin{bmatrix} -3 & 6 & 0 & -1 \\ 0 & -5 & 10 & \tfrac{5}{3} \\ 0 & 2 & 2 & \tfrac{1}{3} \end{bmatrix} \sim \begin{bmatrix} -3 & 6 & 0 & -1 \\ 0 & -5 & 10 & \tfrac{5}{3} \\ 0 & 0 & 6 & 1 \end{bmatrix} \]

The matrix has been brought to echelon form using partial pivoting.
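The pivoting algorithm of Problems 3.16 and 3.17 can be sketched as follows. The function name is hypothetical. The sketch uses exact fractions so that the result can be compared with the hand computation; the roundoff benefit described in Problem 3.16 only matters in floating-point arithmetic, which this sketch deliberately avoids.

```python
from fractions import Fraction


def echelon_partial_pivoting(A):
    """Row reduce A to echelon form, choosing in each column the entry of
    greatest absolute value (at or below the current row) as the pivot."""
    M = [[Fraction(x) for x in row] for row in A]
    rows, cols = len(M), len(M[0])
    top = 0
    for col in range(cols):
        if top == rows:
            break
        # Row index of the entry of greatest absolute value in this column.
        best = max(range(top, rows), key=lambda r: abs(M[r][col]))
        if M[best][col] == 0:
            continue                        # no pivot in this column
        M[top], M[best] = M[best], M[top]   # bring the pivot row up
        for r in range(top + 1, rows):
            k = -M[r][col] / M[top][col]    # multiplier: -(entry)/(pivot)
            M[r] = [k * p + x for p, x in zip(M[top], M[r])]
        top += 1
    return M


A = [[2, -2, 2, 1], [-3, 6, 0, -1], [1, -7, 10, 2]]   # matrix of Problem 3.17
E = echelon_partial_pivoting(A)
```

Because every multiplier is a quotient whose denominator is the largest available entry, each multiplier has absolute value at most 1, as with the values $\tfrac{2}{3}$, $\tfrac{1}{3}$, and $\tfrac{2}{5}$ in Problem 3.17.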

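3.18. Reduce each of the following matrices to row canonical form:

\[ \text{(a)}\; A = \begin{bmatrix} 2 & 2 & -1 & 6 & 4 \\ 4 & 4 & 1 & 10 & 13 \\ 8 & 8 & -1 & 26 & 23 \end{bmatrix}; \qquad \text{(b)}\; B = \begin{bmatrix} 5 & -9 & 6 \\ 0 & 2 & 3 \\ 0 & 0 & 7 \end{bmatrix} \]

(a) First reduce $A$ to echelon form by applying the operations "Replace $R_2$ by $-2R_1 + R_2$" and "Replace $R_3$ by $-4R_1 + R_3$," and then applying the operation "Replace $R_3$ by $-R_2 + R_3$." These operations yield

\[ A \sim \begin{bmatrix} 2 & 2 & -1 & 6 & 4 \\ 0 & 0 & 3 & -2 & 5 \\ 0 & 0 & 3 & 2 & 7 \end{bmatrix} \sim \begin{bmatrix} 2 & 2 & -1 & 6 & 4 \\ 0 & 0 & 3 & -2 & 5 \\ 0 & 0 & 0 & 4 & 2 \end{bmatrix} \]

Now use back-substitution on the echelon matrix to obtain the row canonical form of $A$. Specifically, first multiply $R_3$ by $\tfrac{1}{4}$ to obtain the pivot $a_{34} = 1$, and then apply the operations "Replace $R_2$ by $2R_3 + R_2$" and "Replace $R_1$ by $-6R_3 + R_1$." These operations yield

\[ A \sim \begin{bmatrix} 2 & 2 & -1 & 6 & 4 \\ 0 & 0 & 3 & -2 & 5 \\ 0 & 0 & 0 & 1 & \tfrac{1}{2} \end{bmatrix} \sim \begin{bmatrix} 2 & 2 & -1 & 0 & 1 \\ 0 & 0 & 3 & 0 & 6 \\ 0 & 0 & 0 & 1 & \tfrac{1}{2} \end{bmatrix} \]

Now multiply $R_2$ by $\tfrac{1}{3}$, making the pivot $a_{23} = 1$, and then apply "Replace $R_1$ by $R_2 + R_1$," yielding

\[ A \sim \begin{bmatrix} 2 & 2 & -1 & 0 & 1 \\ 0 & 0 & 1 & 0 & 2 \\ 0 & 0 & 0 & 1 & \tfrac{1}{2} \end{bmatrix} \sim \begin{bmatrix} 2 & 2 & 0 & 0 & 3 \\ 0 & 0 & 1 & 0 & 2 \\ 0 & 0 & 0 & 1 & \tfrac{1}{2} \end{bmatrix} \]

Finally, multiply $R_1$ by $\tfrac{1}{2}$, so the pivot $a_{11} = 1$. Thus, we obtain the following row canonical form of $A$:

\[ A \sim \begin{bmatrix} 1 & 1 & 0 & 0 & \tfrac{3}{2} \\ 0 & 0 & 1 & 0 & 2 \\ 0 & 0 & 0 & 1 & \tfrac{1}{2} \end{bmatrix} \]

(b) Because $B$ is in echelon form, use back-substitution to obtain

\[ B \sim \begin{bmatrix} 5 & -9 & 6 \\ 0 & 2 & 3 \\ 0 & 0 & 1 \end{bmatrix} \sim \begin{bmatrix} 5 & -9 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{bmatrix} \sim \begin{bmatrix} 5 & -9 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \sim \begin{bmatrix} 5 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \sim \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]

The last matrix, which is the identity matrix $I$, is the row canonical form of $B$. (This is expected, because $B$ is invertible, and so its row canonical form must be $I$.)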

3.19. Describe the Gauss–Jordan elimination algorithm, which also row reduces an arbitrary matrix $A$ to its row canonical form.

The Gauss–Jordan algorithm is similar in some ways to the Gaussian elimination algorithm, except that here each pivot is used to place 0's both below and above the pivot, not just below the pivot, before working with the next pivot. Also, one variation of the algorithm first normalizes each row (that is, obtains a unit pivot) before it is used to produce 0's in the other rows, rather than normalizing the rows at the end of the algorithm.

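3.20. Let $A = \begin{bmatrix} 1 & -2 & 3 & 1 & 2 \\ 1 & 1 & 4 & -1 & 3 \\ 2 & 5 & 9 & -2 & 8 \end{bmatrix}$. Use Gauss–Jordan to find the row canonical form of $A$.

Use $a_{11} = 1$ as a pivot to obtain 0's below $a_{11}$ by applying the operations "Replace $R_2$ by $-R_1 + R_2$" and "Replace $R_3$ by $-2R_1 + R_3$." This yields

\[ A \sim \begin{bmatrix} 1 & -2 & 3 & 1 & 2 \\ 0 & 3 & 1 & -2 & 1 \\ 0 & 9 & 3 & -4 & 4 \end{bmatrix} \]

Multiply $R_2$ by $\tfrac{1}{3}$ to make the pivot $a_{22} = 1$, and then produce 0's below and above $a_{22}$ by applying the operations "Replace $R_3$ by $-9R_2 + R_3$" and "Replace $R_1$ by $2R_2 + R_1$." These operations yield

\[ A \sim \begin{bmatrix} 1 & -2 & 3 & 1 & 2 \\ 0 & 1 & \tfrac{1}{3} & -\tfrac{2}{3} & \tfrac{1}{3} \\ 0 & 9 & 3 & -4 & 4 \end{bmatrix} \sim \begin{bmatrix} 1 & 0 & \tfrac{11}{3} & -\tfrac{1}{3} & \tfrac{8}{3} \\ 0 & 1 & \tfrac{1}{3} & -\tfrac{2}{3} & \tfrac{1}{3} \\ 0 & 0 & 0 & 2 & 1 \end{bmatrix} \]

Finally, multiply $R_3$ by $\tfrac{1}{2}$ to make the pivot $a_{34} = 1$, and then produce 0's above $a_{34}$ by applying the operations "Replace $R_2$ by $\tfrac{2}{3}R_3 + R_2$" and "Replace $R_1$ by $\tfrac{1}{3}R_3 + R_1$." These operations yield

\[ A \sim \begin{bmatrix} 1 & 0 & \tfrac{11}{3} & -\tfrac{1}{3} & \tfrac{8}{3} \\ 0 & 1 & \tfrac{1}{3} & -\tfrac{2}{3} & \tfrac{1}{3} \\ 0 & 0 & 0 & 1 & \tfrac{1}{2} \end{bmatrix} \sim \begin{bmatrix} 1 & 0 & \tfrac{11}{3} & 0 & \tfrac{17}{6} \\ 0 & 1 & \tfrac{1}{3} & 0 & \tfrac{2}{3} \\ 0 & 0 & 0 & 1 & \tfrac{1}{2} \end{bmatrix} \]

which is the row canonical form of $A$.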
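The Gauss–Jordan procedure of Problems 3.19 and 3.20 can be sketched as below. This is one possible rendering, using the variation that normalizes each pivot row before clearing its column; the function name `row_canonical_form` is an illustrative choice.

```python
from fractions import Fraction


def row_canonical_form(A):
    """Return the row canonical form of A using Gauss-Jordan elimination."""
    M = [[Fraction(x) for x in row] for row in A]
    rows, cols = len(M), len(M[0])
    top = 0
    for col in range(cols):
        if top == rows:
            break
        # Find a row at or below `top` with a nonzero entry in this column.
        r = next((i for i in range(top, rows) if M[i][col] != 0), None)
        if r is None:
            continue
        M[top], M[r] = M[r], M[top]
        # Normalize the pivot row so that the pivot equals 1.
        p = M[top][col]
        M[top] = [x / p for x in M[top]]
        # Produce 0's both below and above the pivot.
        for i in range(rows):
            if i != top and M[i][col] != 0:
                k = M[i][col]
                M[i] = [x - k * y for x, y in zip(M[i], M[top])]
        top += 1
    return M


A = [[1, -2, 3, 1, 2], [1, 1, 4, -1, 3], [2, 5, 9, -2, 8]]   # Problem 3.20
R = row_canonical_form(A)
```

Since the row canonical form of a matrix is unique, the result does not depend on the order in which the row operations are carried out, so `R` can be compared directly with the hand computation.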

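Systems of Linear Equations in Matrix Form

3.21. Find the augmented matrix $M$ and the coefficient matrix $A$ of the following system:

\[ \begin{aligned} x + 2y - 3z &= 4 \\ 3y - 4z + 7x &= 5 \\ 6z + 8x - 9y &= 1 \end{aligned} \]

First align the unknowns in the system, and then use the aligned system to obtain $M$ and $A$. We have

\[ \begin{aligned} x + 2y - 3z &= 4 \\ 7x + 3y - 4z &= 5 \\ 8x - 9y + 6z &= 1 \end{aligned}; \qquad \text{then} \quad M = \begin{bmatrix} 1 & 2 & -3 & 4 \\ 7 & 3 & -4 & 5 \\ 8 & -9 & 6 & 1 \end{bmatrix} \quad \text{and} \quad A = \begin{bmatrix} 1 & 2 & -3 \\ 7 & 3 & -4 \\ 8 & -9 & 6 \end{bmatrix} \]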

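3.22. Solve each of the following systems using its augmented matrix $M$:

\[ \text{(a)}\; \begin{aligned} x + 2y - z &= 3 \\ x + 3y + z &= 5 \\ 3x + 8y + 4z &= 17 \end{aligned} \qquad \text{(b)}\; \begin{aligned} x - 2y + 4z &= 2 \\ 2x - 3y + 5z &= 3 \\ 3x - 4y + 6z &= 7 \end{aligned} \qquad \text{(c)}\; \begin{aligned} x + y + 3z &= 1 \\ 2x + 3y - z &= 3 \\ 5x + 7y + z &= 7 \end{aligned} \]

(a) Reduce the augmented matrix $M$ to echelon form as follows:

\[ M = \begin{bmatrix} 1 & 2 & -1 & 3 \\ 1 & 3 & 1 & 5 \\ 3 & 8 & 4 & 17 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & -1 & 3 \\ 0 & 1 & 2 & 2 \\ 0 & 2 & 7 & 8 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & -1 & 3 \\ 0 & 1 & 2 & 2 \\ 0 & 0 & 3 & 4 \end{bmatrix} \]

Now write down the corresponding triangular system

\[ \begin{aligned} x + 2y - z &= 3 \\ y + 2z &= 2 \\ 3z &= 4 \end{aligned} \]

and solve by back-substitution to obtain the unique solution

\[ x = \tfrac{17}{3}, \quad y = -\tfrac{2}{3}, \quad z = \tfrac{4}{3} \qquad \text{or} \qquad u = \left( \tfrac{17}{3},\; -\tfrac{2}{3},\; \tfrac{4}{3} \right) \]

Alternately, reduce the echelon form of $M$ to row canonical form, obtaining

\[ M \sim \begin{bmatrix} 1 & 2 & -1 & 3 \\ 0 & 1 & 2 & 2 \\ 0 & 0 & 1 & \tfrac{4}{3} \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 0 & \tfrac{13}{3} \\ 0 & 1 & 0 & -\tfrac{2}{3} \\ 0 & 0 & 1 & \tfrac{4}{3} \end{bmatrix} \sim \begin{bmatrix} 1 & 0 & 0 & \tfrac{17}{3} \\ 0 & 1 & 0 & -\tfrac{2}{3} \\ 0 & 0 & 1 & \tfrac{4}{3} \end{bmatrix} \]

This also corresponds to the above solution.

(b) First reduce the augmented matrix $M$ to echelon form as follows:

\[ M = \begin{bmatrix} 1 & -2 & 4 & 2 \\ 2 & -3 & 5 & 3 \\ 3 & -4 & 6 & 7 \end{bmatrix} \sim \begin{bmatrix} 1 & -2 & 4 & 2 \\ 0 & 1 & -3 & -1 \\ 0 & 2 & -6 & 1 \end{bmatrix} \sim \begin{bmatrix} 1 & -2 & 4 & 2 \\ 0 & 1 & -3 & -1 \\ 0 & 0 & 0 & 3 \end{bmatrix} \]

The third row corresponds to the degenerate equation $0x + 0y + 0z = 3$, which has no solution. Thus, "DO NOT CONTINUE." The original system also has no solution. (Note that the echelon form indicates whether or not the system has a solution.)

(c) Reduce the augmented matrix $M$ to echelon form and then to row canonical form:

\[ M = \begin{bmatrix} 1 & 1 & 3 & 1 \\ 2 & 3 & -1 & 3 \\ 5 & 7 & 1 & 7 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 3 & 1 \\ 0 & 1 & -7 & 1 \\ 0 & 2 & -14 & 2 \end{bmatrix} \sim \begin{bmatrix} 1 & 0 & 10 & 0 \\ 0 & 1 & -7 & 1 \end{bmatrix} \]

(The third row of the second matrix is deleted, because it is a multiple of the second row and will result in a zero row.) Write down the system corresponding to the row canonical form of $M$ and then transfer the free variables to the other side to obtain the free-variable form of the solution:

\[ \begin{aligned} x + 10z &= 0 \\ y - 7z &= 1 \end{aligned} \qquad \text{and} \qquad \begin{aligned} x &= -10z \\ y &= 1 + 7z \end{aligned} \]

Here $z$ is the only free variable. The parametric solution, using $z = a$, is as follows:

\[ x = -10a, \quad y = 1 + 7a, \quad z = a \qquad \text{or} \qquad u = (-10a,\; 1 + 7a,\; a) \]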

3.23. Solve the following system using its augmented matrix $M$:

\[ \begin{aligned} x_1 + 2x_2 - 3x_3 - 2x_4 + 4x_5 &= 1 \\ 2x_1 + 5x_2 - 8x_3 - x_4 + 6x_5 &= 4 \\ x_1 + 4x_2 - 7x_3 + 5x_4 + 2x_5 &= 8 \end{aligned} \]

Reduce the augmented matrix $M$ to echelon form and then to row canonical form:

\[ M = \begin{bmatrix} 1 & 2 & -3 & -2 & 4 & 1 \\ 2 & 5 & -8 & -1 & 6 & 4 \\ 1 & 4 & -7 & 5 & 2 & 8 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & -3 & -2 & 4 & 1 \\ 0 & 1 & -2 & 3 & -2 & 2 \\ 0 & 2 & -4 & 7 & -2 & 7 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & -3 & -2 & 4 & 1 \\ 0 & 1 & -2 & 3 & -2 & 2 \\ 0 & 0 & 0 & 1 & 2 & 3 \end{bmatrix} \]

\[ \sim \begin{bmatrix} 1 & 2 & -3 & 0 & 8 & 7 \\ 0 & 1 & -2 & 0 & -8 & -7 \\ 0 & 0 & 0 & 1 & 2 & 3 \end{bmatrix} \sim \begin{bmatrix} 1 & 0 & 1 & 0 & 24 & 21 \\ 0 & 1 & -2 & 0 & -8 & -7 \\ 0 & 0 & 0 & 1 & 2 & 3 \end{bmatrix} \]

Write down the system corresponding to the row canonical form of $M$ and then transfer the free variables to the other side to obtain the free-variable form of the solution:

\[ \begin{aligned} x_1 + x_3 + 24x_5 &= 21 \\ x_2 - 2x_3 - 8x_5 &= -7 \\ x_4 + 2x_5 &= 3 \end{aligned} \qquad \text{and} \qquad \begin{aligned} x_1 &= 21 - x_3 - 24x_5 \\ x_2 &= -7 + 2x_3 + 8x_5 \\ x_4 &= 3 - 2x_5 \end{aligned} \]

Here $x_1, x_2, x_4$ are the pivot variables and $x_3$ and $x_5$ are the free variables. Recall that the parametric form of the solution can be obtained from the free-variable form of the solution by simply setting the free variables equal to parameters, say $x_3 = a$, $x_5 = b$. This process yields

\[ x_1 = 21 - a - 24b, \quad x_2 = -7 + 2a + 8b, \quad x_3 = a, \quad x_4 = 3 - 2b, \quad x_5 = b \]

or

\[ u = (21 - a - 24b, \; -7 + 2a + 8b, \; a, \; 3 - 2b, \; b) \]

which is another form of the solution.
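A parametric solution is easy to sanity-check by substitution. The sketch below plugs the parametric form from Problem 3.23 back into the three original equations for a small grid of parameter values; the function names `u` and `residuals` are illustrative only.

```python
def u(a, b):
    """Parametric solution of Problem 3.23 with x3 = a and x5 = b."""
    return (21 - a - 24 * b, -7 + 2 * a + 8 * b, a, 3 - 2 * b, b)

def residuals(x):
    """Left-hand side minus right-hand side for each of the three equations."""
    x1, x2, x3, x4, x5 = x
    return (
        x1 + 2 * x2 - 3 * x3 - 2 * x4 + 4 * x5 - 1,
        2 * x1 + 5 * x2 - 8 * x3 - x4 + 6 * x5 - 4,
        x1 + 4 * x2 - 7 * x3 + 5 * x4 + 2 * x5 - 8,
    )

# Every choice of the parameters should leave all three residuals at zero.
all_zero = all(
    residuals(u(a, b)) == (0, 0, 0)
    for a in range(-3, 4)
    for b in range(-3, 4)
)
print(all_zero)
```

A grid check is not a proof, but since both sides are linear in $a$ and $b$, agreement at three non-collinear parameter points already forces agreement everywhere.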

Linear Combinations, Homogeneous Systems

3.24. Write $v$ as a linear combination of $u_1, u_2, u_3$, where

(a) $v = (3, 10, 7)$ and $u_1 = (1, 3, -2)$, $u_2 = (1, 4, 2)$, $u_3 = (2, 8, 1)$;
(b) $v = (2, 7, 10)$ and $u_1 = (1, 2, 3)$, $u_2 = (1, 3, 5)$, $u_3 = (1, 5, 9)$;
(c) $v = (1, 5, 4)$ and $u_1 = (1, 3, -2)$, $u_2 = (2, 7, -1)$, $u_3 = (1, 6, 7)$.

Find the equivalent system of linear equations by writing $v = xu_1 + yu_2 + zu_3$. Alternatively, use the augmented matrix $M$ of the equivalent system, where $M = [u_1, u_2, u_3, v]$. (Here $u_1, u_2, u_3, v$ are the columns of $M$.)

(a) The vector equation $v = xu_1 + yu_2 + zu_3$ for the given vectors is as follows:

\[ \begin{bmatrix} 3 \\ 10 \\ 7 \end{bmatrix} = x \begin{bmatrix} 1 \\ 3 \\ -2 \end{bmatrix} + y \begin{bmatrix} 1 \\ 4 \\ 2 \end{bmatrix} + z \begin{bmatrix} 2 \\ 8 \\ 1 \end{bmatrix} = \begin{bmatrix} x + y + 2z \\ 3x + 4y + 8z \\ -2x + 2y + z \end{bmatrix} \]

Form the equivalent system of linear equations by setting corresponding entries equal to each other, and then reduce the system to echelon form:

\[ \begin{aligned} x + y + 2z &= 3 \\ 3x + 4y + 8z &= 10 \\ -2x + 2y + z &= 7 \end{aligned} \qquad \text{or} \qquad \begin{aligned} x + y + 2z &= 3 \\ y + 2z &= 1 \\ 4y + 5z &= 13 \end{aligned} \qquad \text{or} \qquad \begin{aligned} x + y + 2z &= 3 \\ y + 2z &= 1 \\ -3z &= 9 \end{aligned} \]

The system is in triangular form. Back-substitution yields the unique solution $x = 2$, $y = 7$, $z = -3$. Thus, $v = 2u_1 + 7u_2 - 3u_3$.

Alternatively, form the augmented matrix $M = [u_1, u_2, u_3, v]$ of the equivalent system, and reduce $M$ to echelon form:

\[ M = \begin{bmatrix} 1 & 1 & 2 & 3 \\ 3 & 4 & 8 & 10 \\ -2 & 2 & 1 & 7 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 2 & 3 \\ 0 & 1 & 2 & 1 \\ 0 & 4 & 5 & 13 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 2 & 3 \\ 0 & 1 & 2 & 1 \\ 0 & 0 & -3 & 9 \end{bmatrix} \]

The last matrix corresponds to a triangular system that has a unique solution. Back-substitution yields the solution $x = 2$, $y = 7$, $z = -3$. Thus, $v = 2u_1 + 7u_2 - 3u_3$.

(b) Form the augmented matrix $M = [u_1, u_2, u_3, v]$ of the equivalent system, and reduce $M$ to echelon form:

\[ M = \begin{bmatrix} 1 & 1 & 1 & 2 \\ 2 & 3 & 5 & 7 \\ 3 & 5 & 9 & 10 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 1 & 2 \\ 0 & 1 & 3 & 3 \\ 0 & 2 & 6 & 4 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 1 & 2 \\ 0 & 1 & 3 & 3 \\ 0 & 0 & 0 & -2 \end{bmatrix} \]

The third row corresponds to the degenerate equation $0x + 0y + 0z = -2$, which has no solution. Thus, the system also has no solution, and $v$ cannot be written as a linear combination of $u_1, u_2, u_3$.

(c) Form the augmented matrix $M = [u_1, u_2, u_3, v]$ of the equivalent system, and reduce $M$ to echelon form:

\[ M = \begin{bmatrix} 1 & 2 & 1 & 1 \\ 3 & 7 & 6 & 5 \\ -2 & -1 & 7 & 4 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 1 & 1 \\ 0 & 1 & 3 & 2 \\ 0 & 3 & 9 & 6 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 1 & 1 \\ 0 & 1 & 3 & 2 \\ 0 & 0 & 0 & 0 \end{bmatrix} \]

The last matrix corresponds to the following system with free variable $z$:

\[ x + 2y + z = 1, \qquad y + 3z = 2 \]

Thus, $v$ can be written as a linear combination of $u_1, u_2, u_3$ in many ways. For example, let the free variable $z = 1$, and, by back-substitution, we get $y = 2 - 3z = -1$ and $x = 1 - 2y - z = 2$. Thus, $v = 2u_1 - u_2 + u_3$.
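A candidate linear combination can always be checked by multiplying it out. The sketch below does this for parts (a) and (c) of Problem 3.24; the helper `combine` is a hypothetical name, and the one-parameter family used for part (c) comes from solving $y + 3z = 2$ and $x + 2y + z = 1$ with $z = t$, which gives $y = 2 - 3t$ and $x = -3 + 5t$.

```python
def combine(coeffs, vectors):
    """Return the linear combination sum(c * u) as a tuple."""
    return tuple(
        sum(c * u[i] for c, u in zip(coeffs, vectors))
        for i in range(len(vectors[0]))
    )

# Part (a): unique representation.
us_a = [(1, 3, -2), (1, 4, 2), (2, 8, 1)]
v_a = combine((2, 7, -3), us_a)

# Part (c): one representation for every value of the free variable z = t.
us_c = [(1, 3, -2), (2, 7, -1), (1, 6, 7)]
family_c = [combine((-3 + 5 * t, 2 - 3 * t, t), us_c) for t in range(-2, 3)]

print(v_a, family_c)
```

Taking $t = 1$ in the family gives the coefficients $(2, -1, 1)$.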

3.25. Let $u_1 = (1, 2, 4)$, $u_2 = (2, -3, 1)$, $u_3 = (2, 1, -1)$ in $\mathbf{R}^3$. Show that $u_1, u_2, u_3$ are orthogonal, and write $v$ as a linear combination of $u_1, u_2, u_3$, where (a) $v = (7, 16, 6)$, (b) $v = (3, 5, 2)$.

Take the dot product of pairs of vectors to get

\[ u_1 \cdot u_2 = 2 - 6 + 4 = 0, \qquad u_1 \cdot u_3 = 2 + 2 - 4 = 0, \qquad u_2 \cdot u_3 = 4 - 3 - 1 = 0 \]

Thus, the three vectors in $\mathbf{R}^3$ are orthogonal, and hence Fourier coefficients can be used. That is, $v = xu_1 + yu_2 + zu_3$, where

\[ x = \frac{v \cdot u_1}{u_1 \cdot u_1}, \qquad y = \frac{v \cdot u_2}{u_2 \cdot u_2}, \qquad z = \frac{v \cdot u_3}{u_3 \cdot u_3} \]

(a) We have

\[ x = \frac{7 + 32 + 24}{1 + 4 + 16} = \frac{63}{21} = 3, \qquad y = \frac{14 - 48 + 6}{4 + 9 + 1} = \frac{-28}{14} = -2, \qquad z = \frac{14 + 16 - 6}{4 + 1 + 1} = \frac{24}{6} = 4 \]

Thus, $v = 3u_1 - 2u_2 + 4u_3$.

(b) We have

\[ x = \frac{3 + 10 + 8}{1 + 4 + 16} = \frac{21}{21} = 1, \qquad y = \frac{6 - 15 + 2}{4 + 9 + 1} = \frac{-7}{14} = -\frac{1}{2}, \qquad z = \frac{6 + 5 - 2}{4 + 1 + 1} = \frac{9}{6} = \frac{3}{2} \]

Thus, $v = u_1 - \tfrac{1}{2}u_2 + \tfrac{3}{2}u_3$.
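The Fourier-coefficient formula translates directly into a few lines of code. This is a small sketch under the assumption that the vectors are mutually orthogonal and nonzero (the function does not check this); `fourier_coefficients` is an illustrative name.

```python
from fractions import Fraction

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def fourier_coefficients(v, orthogonal_vectors):
    """Coefficient of each u is (v . u) / (u . u); valid only for orthogonal u's."""
    return [Fraction(dot(v, u), dot(u, u)) for u in orthogonal_vectors]

us = [(1, 2, 4), (2, -3, 1), (2, 1, -1)]

# Pairwise dot products, in the order (u1.u2, u1.u3, u2.u3).
pairwise = [dot(us[0], us[1]), dot(us[0], us[2]), dot(us[1], us[2])]

coeffs_a = fourier_coefficients((7, 16, 6), us)
coeffs_b = fourier_coefficients((3, 5, 2), us)
print(pairwise, coeffs_a, coeffs_b)
```

No elimination is needed here, which is the practical payoff of orthogonality: each coefficient is computed independently of the others.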

3.26. Find the dimension and a basis for the general solution $W$ of each of the following homogeneous systems:

(a)
\[ \begin{aligned} 2x_1 + 4x_2 - 5x_3 + 3x_4 &= 0 \\ 3x_1 + 6x_2 - 7x_3 + 4x_4 &= 0 \\ 5x_1 + 10x_2 - 11x_3 + 6x_4 &= 0 \end{aligned} \]

(b)
\[ \begin{aligned} x - 2y - 3z &= 0 \\ 2x + y + 3z &= 0 \\ 3x - 4y - 2z &= 0 \end{aligned} \]

(a) Reduce the system to echelon form using the operations “Replace $L_2$ by $-3L_1 + 2L_2$,” “Replace $L_3$ by $-5L_1 + 2L_3$,” and then “Replace $L_3$ by $-3L_2 + L_3$.” These operations yield

\[ \begin{aligned} 2x_1 + 4x_2 - 5x_3 + 3x_4 &= 0 \\ x_3 - x_4 &= 0 \\ 3x_3 - 3x_4 &= 0 \end{aligned} \qquad \text{and} \qquad \begin{aligned} 2x_1 + 4x_2 - 5x_3 + 3x_4 &= 0 \\ x_3 - x_4 &= 0 \end{aligned} \]

The system in echelon form has two free variables, $x_2$ and $x_4$, so $\dim W = 2$. A basis $[u_1, u_2]$ for $W$ may be obtained as follows:

(1) Set $x_2 = 1$, $x_4 = 0$. Back-substitution yields $x_3 = 0$, and then $x_1 = -2$. Thus, $u_1 = (-2, 1, 0, 0)$.
(2) Set $x_2 = 0$, $x_4 = 1$. Back-substitution yields $x_3 = 1$, and then $x_1 = 1$. Thus, $u_2 = (1, 0, 1, 1)$.

(b) Reduce the system to echelon form, obtaining

\[ \begin{aligned} x - 2y - 3z &= 0 \\ 5y + 9z &= 0 \\ 2y + 7z &= 0 \end{aligned} \qquad \text{and} \qquad \begin{aligned} x - 2y - 3z &= 0 \\ 5y + 9z &= 0 \\ 17z &= 0 \end{aligned} \]

There are no free variables (the system is in triangular form). Hence, $\dim W = 0$, and $W$ has no basis. Specifically, $W$ consists only of the zero solution; that is, $W = \{0\}$.

3.27. Find the dimension and a basis for the general solution $W$ of the following homogeneous system using matrix notation:

\[ \begin{aligned} x_1 + 2x_2 + 3x_3 - 2x_4 + 4x_5 &= 0 \\ 2x_1 + 4x_2 + 8x_3 + x_4 + 9x_5 &= 0 \\ 3x_1 + 6x_2 + 13x_3 + 4x_4 + 14x_5 &= 0 \end{aligned} \]

Show how the basis gives the parametric form of the general solution of the system.

When a system is homogeneous, we represent the system by its coefficient matrix $A$ rather than by its augmented matrix $M$, because the last column of the augmented matrix $M$ is a zero column, and it will remain a zero column during any row-reduction process.

Reduce the coefficient matrix $A$ to echelon form, obtaining

\[ A = \begin{bmatrix} 1 & 2 & 3 & -2 & 4 \\ 2 & 4 & 8 & 1 & 9 \\ 3 & 6 & 13 & 4 & 14 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 3 & -2 & 4 \\ 0 & 0 & 2 & 5 & 1 \\ 0 & 0 & 4 & 10 & 2 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 3 & -2 & 4 \\ 0 & 0 & 2 & 5 & 1 \end{bmatrix} \]

(The third row of the second matrix is deleted, because it is a multiple of the second row and will result in a zero row.) We can now proceed in one of two ways.

(a) Write down the corresponding homogeneous system in echelon form:

\[ \begin{aligned} x_1 + 2x_2 + 3x_3 - 2x_4 + 4x_5 &= 0 \\ 2x_3 + 5x_4 + x_5 &= 0 \end{aligned} \]

The system in echelon form has three free variables, $x_2, x_4, x_5$, so $\dim W = 3$. A basis $[u_1, u_2, u_3]$ for $W$ may be obtained as follows:

(1) Set $x_2 = 1$, $x_4 = 0$, $x_5 = 0$. Back-substitution yields $x_3 = 0$, and then $x_1 = -2$. Thus, $u_1 = (-2, 1, 0, 0, 0)$.
(2) Set $x_2 = 0$, $x_4 = 1$, $x_5 = 0$. Back-substitution yields $x_3 = -\tfrac{5}{2}$, and then $x_1 = \tfrac{19}{2}$. Thus, $u_2 = (\tfrac{19}{2}, 0, -\tfrac{5}{2}, 1, 0)$.
(3) Set $x_2 = 0$, $x_4 = 0$, $x_5 = 1$. Back-substitution yields $x_3 = -\tfrac{1}{2}$, and then $x_1 = -\tfrac{5}{2}$. Thus, $u_3 = (-\tfrac{5}{2}, 0, -\tfrac{1}{2}, 0, 1)$.

[One could avoid fractions in the basis by choosing $x_4 = 2$ in (2) and $x_5 = 2$ in (3), which yields multiples of $u_2$ and $u_3$.] The parametric form of the general solution is obtained from the following linear combination of the basis vectors using parameters $a, b, c$:

\[ au_1 + bu_2 + cu_3 = \left(-2a + \tfrac{19}{2}b - \tfrac{5}{2}c, \; a, \; -\tfrac{5}{2}b - \tfrac{1}{2}c, \; b, \; c\right) \]

(b) Reduce the echelon form of $A$ to row canonical form:

\[ A \sim \begin{bmatrix} 1 & 2 & 3 & -2 & 4 \\ 0 & 0 & 1 & \tfrac{5}{2} & \tfrac{1}{2} \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 0 & -\tfrac{19}{2} & \tfrac{5}{2} \\ 0 & 0 & 1 & \tfrac{5}{2} & \tfrac{1}{2} \end{bmatrix} \]

Write down the corresponding free-variable solution:

\[ \begin{aligned} x_1 &= -2x_2 + \tfrac{19}{2}x_4 - \tfrac{5}{2}x_5 \\ x_3 &= -\tfrac{5}{2}x_4 - \tfrac{1}{2}x_5 \end{aligned} \]

Using these equations for the pivot variables $x_1$ and $x_3$, repeat the above process to obtain a basis $[u_1, u_2, u_3]$ for $W$. That is, set $x_2 = 1$, $x_4 = 0$, $x_5 = 0$ to get $u_1$; set $x_2 = 0$, $x_4 = 1$, $x_5 = 0$ to get $u_2$; and set $x_2 = 0$, $x_4 = 0$, $x_5 = 1$ to get $u_3$.
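Method (b) of Problem 3.27 lends itself to automation: reduce to row canonical form, then set one free variable to 1 and the others to 0. The following is a sketch of that idea with hypothetical helper names (`rref_with_pivots`, `null_space_basis`) and exact rational arithmetic.

```python
from fractions import Fraction

def rref_with_pivots(rows):
    """Row canonical form together with the list of pivot column indices."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for col in range(len(m[0])):
        if r == len(m):
            break
        src = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if src is None:
            continue
        m[r], m[src] = m[src], m[r]
        p = m[r][col]
        m[r] = [x / p for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                f = m[i][col]
                m[i] = [x - f * y for x, y in zip(m[i], m[r])]
        pivots.append(col)
        r += 1
    return m, pivots

def null_space_basis(rows):
    """One basis vector per free variable: that variable is 1, other free ones 0."""
    m, pivots = rref_with_pivots(rows)
    n = len(rows[0])
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        vec = [Fraction(0)] * n
        vec[free] = Fraction(1)
        # Each pivot variable equals minus the entry in the free column of its row.
        for row_index, pivot_col in enumerate(pivots):
            vec[pivot_col] = -m[row_index][free]
        basis.append(vec)
    return basis

A = [[1, 2, 3, -2, 4], [2, 4, 8, 1, 9], [3, 6, 13, 4, 14]]
basis = null_space_basis(A)
# Each basis vector should be sent to zero by every row of A.
products = [[sum(a * x for a, x in zip(row, u)) for row in A] for u in basis]
print(basis, products)
```

The number of vectors returned equals the number of free variables, which is the dimension of $W$.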

3.28. Prove Theorem 3.15. Let $v_0$ be a particular solution of $AX = B$, and let $W$ be the general solution of $AX = 0$. Then $U = v_0 + W = \{v_0 + w : w \in W\}$ is the general solution of $AX = B$.

Let $w$ be a solution of $AX = 0$. Then

\[ A(v_0 + w) = Av_0 + Aw = B + 0 = B \]

Thus, the sum $v_0 + w$ is a solution of $AX = B$. On the other hand, suppose $v$ is also a solution of $AX = B$. Then

\[ A(v - v_0) = Av - Av_0 = B - B = 0 \]

Therefore, $v - v_0$ belongs to $W$. Because $v = v_0 + (v - v_0)$, we find that any solution of $AX = B$ can be obtained by adding a solution of $AX = 0$ to the particular solution $v_0$ of $AX = B$. Thus, the theorem is proved.

Elementary Matrices, Applications

3.29. Let $e_1, e_2, e_3$ denote, respectively, the elementary row operations

“Interchange rows $R_1$ and $R_2$,” “Replace $R_3$ by $7R_3$,” “Replace $R_2$ by $-3R_1 + R_2$”

Find the corresponding three-square elementary matrices $E_1, E_2, E_3$.

Apply each operation to the $3 \times 3$ identity matrix $I_3$ to obtain

\[ E_1 = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \qquad E_2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 7 \end{bmatrix}, \qquad E_3 = \begin{bmatrix} 1 & 0 & 0 \\ -3 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]

3.30. Consider the elementary row operations in Problem 3.29.

(a) Describe the inverse operations $e_1^{-1}, e_2^{-1}, e_3^{-1}$.
(b) Find the corresponding three-square elementary matrices $E_1', E_2', E_3'$.
(c) What is the relationship between the matrices $E_1', E_2', E_3'$ and the matrices $E_1, E_2, E_3$?

(a) The inverses of $e_1, e_2, e_3$ are, respectively,

“Interchange rows $R_1$ and $R_2$,” “Replace $R_3$ by $\tfrac{1}{7}R_3$,” “Replace $R_2$ by $3R_1 + R_2$.”

(b) Apply each inverse operation to the $3 \times 3$ identity matrix $I_3$ to obtain

\[ E_1' = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \qquad E_2' = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & \tfrac{1}{7} \end{bmatrix}, \qquad E_3' = \begin{bmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]

(c) The matrices $E_1', E_2', E_3'$ are, respectively, the inverses of the matrices $E_1, E_2, E_3$.
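The relationship in part (c) can be confirmed by building each elementary matrix from the identity and multiplying it by the matrix of the inverse operation. The sketch below uses 0-based row indices and hypothetical helper names; it is an illustration of Problems 3.29 and 3.30, not a general library.

```python
from fractions import Fraction

def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def interchange(m, i, j):
    """Interchange rows i and j (0-based)."""
    out = [row[:] for row in m]
    out[i], out[j] = out[j], out[i]
    return out

def scale(m, i, k):
    """Replace row i by k times row i."""
    out = [row[:] for row in m]
    out[i] = [k * x for x in out[i]]
    return out

def add_multiple(m, i, k, j):
    """Replace row i by k times row j plus row i."""
    out = [row[:] for row in m]
    out[i] = [k * y + x for x, y in zip(out[i], out[j])]
    return out

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

I3 = identity(3)
E1, E1_inv = interchange(I3, 0, 1), interchange(I3, 0, 1)
E2, E2_inv = scale(I3, 2, Fraction(7)), scale(I3, 2, Fraction(1, 7))
E3, E3_inv = add_multiple(I3, 1, Fraction(-3), 0), add_multiple(I3, 1, Fraction(3), 0)

# Each product E' E should be the identity matrix.
products = [matmul(Ep, E) for E, Ep in [(E1, E1_inv), (E2, E2_inv), (E3, E3_inv)]]
print(products)
```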

3.31. Write each of the following matrices as a product of elementary matrices:

\[ \text{(a)} \; A = \begin{bmatrix} 1 & -3 \\ -2 & 4 \end{bmatrix}, \qquad \text{(b)} \; B = \begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 4 \\ 0 & 0 & 1 \end{bmatrix}, \qquad \text{(c)} \; C = \begin{bmatrix} 1 & 1 & 2 \\ 2 & 3 & 8 \\ -3 & -1 & 2 \end{bmatrix} \]

The following three steps write a matrix $M$ as a product of elementary matrices:

Step 1. Row reduce $M$ to the identity matrix $I$, keeping track of the elementary row operations.
Step 2. Write down the inverse row operations.
Step 3. Write $M$ as the product of the elementary matrices corresponding to the inverse operations. This gives the desired result.

If a zero row appears in Step 1, then $M$ is not row equivalent to the identity matrix $I$, and $M$ cannot be written as a product of elementary matrices.

(a) (1) We have

\[ A = \begin{bmatrix} 1 & -3 \\ -2 & 4 \end{bmatrix} \sim \begin{bmatrix} 1 & -3 \\ 0 & -2 \end{bmatrix} \sim \begin{bmatrix} 1 & -3 \\ 0 & 1 \end{bmatrix} \sim \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I \]

where the row operations are, respectively,

“Replace $R_2$ by $2R_1 + R_2$,” “Replace $R_2$ by $-\tfrac{1}{2}R_2$,” “Replace $R_1$ by $3R_2 + R_1$”

(2) Inverse operations:

“Replace $R_2$ by $-2R_1 + R_2$,” “Replace $R_2$ by $-2R_2$,” “Replace $R_1$ by $-3R_2 + R_1$”

(3) \[ A = \begin{bmatrix} 1 & 0 \\ -2 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & -2 \end{bmatrix} \begin{bmatrix} 1 & -3 \\ 0 & 1 \end{bmatrix} \]

(b) (1) We have

\[ B = \begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 4 \\ 0 & 0 & 1 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \sim \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = I \]

where the row operations are, respectively,

“Replace $R_2$ by $-4R_3 + R_2$,” “Replace $R_1$ by $-3R_3 + R_1$,” “Replace $R_1$ by $-2R_2 + R_1$”

(2) Inverse operations:

“Replace $R_2$ by $4R_3 + R_2$,” “Replace $R_1$ by $3R_3 + R_1$,” “Replace $R_1$ by $2R_2 + R_1$”

(3) \[ B = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 4 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 3 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 2 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]

(c) (1) First row reduce $C$ to echelon form. We have

\[ C = \begin{bmatrix} 1 & 1 & 2 \\ 2 & 3 & 8 \\ -3 & -1 & 2 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 2 \\ 0 & 1 & 4 \\ 0 & 2 & 8 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 2 \\ 0 & 1 & 4 \\ 0 & 0 & 0 \end{bmatrix} \]

In echelon form, $C$ has a zero row. “STOP.” The matrix $C$ cannot be row reduced to the identity matrix $I$, and $C$ cannot be written as a product of elementary matrices. (We note, in particular, that $C$ has no inverse.)
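As a check on Step 3, the elementary factors found for $A$ and $B$ in Problem 3.31 can be multiplied back together. The sketch below does that with plain integer lists; the `det3` helper (cofactor expansion along the first row) is an added illustration, not part of the three-step method, and is used only to confirm that $C$ is singular.

```python
from functools import reduce

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion along row 0."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# Elementary matrices of the inverse operations, in the order they are multiplied.
A_factors = [[[1, 0], [-2, 1]], [[1, 0], [0, -2]], [[1, -3], [0, 1]]]
B_factors = [
    [[1, 0, 0], [0, 1, 4], [0, 0, 1]],
    [[1, 0, 3], [0, 1, 0], [0, 0, 1]],
    [[1, 2, 0], [0, 1, 0], [0, 0, 1]],
]

A = reduce(matmul, A_factors)
B = reduce(matmul, B_factors)
det_C = det3([[1, 1, 2], [2, 3, 8], [-3, -1, 2]])
print(A, B, det_C)
```

The order of the factors matters: the matrix of the first inverse operation goes on the left.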

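3.32. Find the inverse of

\[ \text{(a)} \; A = \begin{bmatrix} 1 & 2 & -4 \\ -1 & -1 & 5 \\ 2 & 7 & -3 \end{bmatrix}, \qquad \text{(b)} \; B = \begin{bmatrix} 1 & 3 & -4 \\ 1 & 5 & -1 \\ 3 & 13 & -6 \end{bmatrix} \]

(a) Form the matrix $M = [A, I]$ and row reduce $M$ to echelon form:

\[ M = \left[\begin{array}{ccc|ccc} 1 & 2 & -4 & 1 & 0 & 0 \\ -1 & -1 & 5 & 0 & 1 & 0 \\ 2 & 7 & -3 & 0 & 0 & 1 \end{array}\right] \sim \left[\begin{array}{ccc|ccc} 1 & 2 & -4 & 1 & 0 & 0 \\ 0 & 1 & 1 & 1 & 1 & 0 \\ 0 & 3 & 5 & -2 & 0 & 1 \end{array}\right] \sim \left[\begin{array}{ccc|ccc} 1 & 2 & -4 & 1 & 0 & 0 \\ 0 & 1 & 1 & 1 & 1 & 0 \\ 0 & 0 & 2 & -5 & -3 & 1 \end{array}\right] \]

In echelon form, the left half of $M$ is in triangular form; hence, $A$ has an inverse. Further reduce $M$ to row canonical form:

\[ M \sim \left[\begin{array}{ccc|ccc} 1 & 2 & 0 & -9 & -6 & 2 \\ 0 & 1 & 0 & \tfrac{7}{2} & \tfrac{5}{2} & -\tfrac{1}{2} \\ 0 & 0 & 1 & -\tfrac{5}{2} & -\tfrac{3}{2} & \tfrac{1}{2} \end{array}\right] \sim \left[\begin{array}{ccc|ccc} 1 & 0 & 0 & -16 & -11 & 3 \\ 0 & 1 & 0 & \tfrac{7}{2} & \tfrac{5}{2} & -\tfrac{1}{2} \\ 0 & 0 & 1 & -\tfrac{5}{2} & -\tfrac{3}{2} & \tfrac{1}{2} \end{array}\right] \]

The final matrix has the form $[I, A^{-1}]$; that is, $A^{-1}$ is the right half of the last matrix. Thus,

\[ A^{-1} = \begin{bmatrix} -16 & -11 & 3 \\ \tfrac{7}{2} & \tfrac{5}{2} & -\tfrac{1}{2} \\ -\tfrac{5}{2} & -\tfrac{3}{2} & \tfrac{1}{2} \end{bmatrix} \]

(b) Form the matrix $M = [B, I]$ and row reduce $M$ to echelon form:

\[ M = \left[\begin{array}{ccc|ccc} 1 & 3 & -4 & 1 & 0 & 0 \\ 1 & 5 & -1 & 0 & 1 & 0 \\ 3 & 13 & -6 & 0 & 0 & 1 \end{array}\right] \sim \left[\begin{array}{ccc|ccc} 1 & 3 & -4 & 1 & 0 & 0 \\ 0 & 2 & 3 & -1 & 1 & 0 \\ 0 & 4 & 6 & -3 & 0 & 1 \end{array}\right] \sim \left[\begin{array}{ccc|ccc} 1 & 3 & -4 & 1 & 0 & 0 \\ 0 & 2 & 3 & -1 & 1 & 0 \\ 0 & 0 & 0 & -1 & -2 & 1 \end{array}\right] \]

In echelon form, $M$ has a zero row in its left half; that is, $B$ is not row reducible to triangular form. Accordingly, $B$ has no inverse.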
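The $[A, I] \to [I, A^{-1}]$ procedure of Problem 3.32 is short to code. The following sketch assumes a square input and returns `None` when a pivot cannot be found, mirroring the zero row in part (b); the function name `inverse` and the use of `Fraction` are choices made for this illustration.

```python
from fractions import Fraction

def inverse(a):
    """Gauss-Jordan inversion via the block matrix [A, I]; None if A is singular."""
    n = len(a)
    m = [
        [Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
        for i, row in enumerate(a)
    ]
    for col in range(n):
        src = next((r for r in range(col, n) if m[r][col] != 0), None)
        if src is None:
            return None  # left half cannot be brought to triangular form
        m[col], m[src] = m[src], m[col]
        p = m[col][col]
        m[col] = [x / p for x in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                f = m[r][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [row[n:] for row in m]  # right half is the inverse

A = [[1, 2, -4], [-1, -1, 5], [2, 7, -3]]
B = [[1, 3, -4], [1, 5, -1], [3, 13, -6]]
A_inv = inverse(A)
B_inv = inverse(B)
print(A_inv, B_inv)
```

Unlike the hand computation, this version clears entries above and below each pivot in one pass, so it goes straight to row canonical form without stopping at echelon form.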


3.33. Show that every elementary matrix $E$ is invertible, and its inverse is an elementary matrix.

Let $E$ be the elementary matrix corresponding to the elementary operation $e$; that is, $e(I) = E$. Let $e'$ be the inverse operation of $e$ and let $E'$ be the corresponding elementary matrix; that is, $e'(I) = E'$. Then

\[ I = e'(e(I)) = e'(E) = E'E \qquad \text{and} \qquad I = e(e'(I)) = e(E') = EE' \]

Therefore, $E'$ is the inverse of $E$.

3.34. Prove Theorem 3.16: Let $e$ be an elementary row operation and let $E$ be the corresponding $m$-square elementary matrix; that is, $E = e(I)$. Then $e(A) = EA$, where $A$ is any $m \times n$ matrix.

Let $R_i$ be row $i$ of $A$; we denote this by writing $A = [R_1, \ldots, R_m]$. If $B$ is a matrix for which $AB$ is defined, then $AB = [R_1B, \ldots, R_mB]$. We also let

\[ e_i = (0, \ldots, 0, 1, 0, \ldots, 0), \qquad \text{where the 1 is the $i$th entry.} \]

One can show (Problem 2.45) that $e_iA = R_i$. We also note that $I = [e_1, e_2, \ldots, e_m]$ is the identity matrix.

(i) Let $e$ be the elementary row operation “Interchange rows $R_i$ and $R_j$.” Then, writing $e_j$ in position $i$ and $e_i$ in position $j$,

\[ E = e(I) = [e_1, \ldots, e_j, \ldots, e_i, \ldots, e_m] \qquad \text{and} \qquad e(A) = [R_1, \ldots, R_j, \ldots, R_i, \ldots, R_m] \]

Thus,

\[ EA = [e_1A, \ldots, e_jA, \ldots, e_iA, \ldots, e_mA] = [R_1, \ldots, R_j, \ldots, R_i, \ldots, R_m] = e(A) \]

(ii) Let $e$ be the elementary row operation “Replace $R_i$ by $kR_i$ $(k \neq 0)$.” Then, with $ke_i$ and $kR_i$ in position $i$,

\[ E = e(I) = [e_1, \ldots, ke_i, \ldots, e_m] \qquad \text{and} \qquad e(A) = [R_1, \ldots, kR_i, \ldots, R_m] \]

Thus,

\[ EA = [e_1A, \ldots, ke_iA, \ldots, e_mA] = [R_1, \ldots, kR_i, \ldots, R_m] = e(A) \]

(iii) Let $e$ be the elementary row operation “Replace $R_i$ by $kR_j + R_i$.” Then, with $ke_j + e_i$ and $kR_j + R_i$ in position $i$,

\[ E = e(I) = [e_1, \ldots, ke_j + e_i, \ldots, e_m] \qquad \text{and} \qquad e(A) = [R_1, \ldots, kR_j + R_i, \ldots, R_m] \]

Using $(ke_j + e_i)A = k(e_jA) + e_iA = kR_j + R_i$, we have

\[ EA = [e_1A, \ldots, (ke_j + e_i)A, \ldots, e_mA] = [R_1, \ldots, kR_j + R_i, \ldots, R_m] = e(A) \]
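Theorem 3.16 can be spot-checked numerically: apply a row operation to a matrix directly, then compare with left multiplication by the elementary matrix obtained from the identity. The matrix `A` and the three sample operations below are arbitrary choices made for this sketch (0-based row indices), not values from the text.

```python
def interchange(m, i, j):
    out = [row[:] for row in m]
    out[i], out[j] = out[j], out[i]
    return out

def scale(m, i, k):
    out = [row[:] for row in m]
    out[i] = [k * x for x in out[i]]
    return out

def add_multiple(m, i, k, j):
    """Replace row i by k times row j plus row i."""
    out = [row[:] for row in m]
    out[i] = [k * y + x for x, y in zip(out[i], out[j])]
    return out

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]  # any 3 x n matrix will do

operations = [
    lambda m: interchange(m, 0, 2),       # type (i)
    lambda m: scale(m, 1, 5),             # type (ii)
    lambda m: add_multiple(m, 2, -4, 0),  # type (iii)
]

# For each operation e, compare e(A) with E A where E = e(I).
checks = [e(A) == matmul(e(I3), A) for e in operations]
print(checks)
```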

3.35. Prove Theorem 3.17: Let $A$ be a square matrix. Then the following are equivalent:

(a) $A$ is invertible (nonsingular).
(b) $A$ is row equivalent to the identity matrix $I$.
(c) $A$ is a product of elementary matrices.

Suppose $A$ is invertible and suppose $A$ is row equivalent to matrix $B$ in row canonical form. Then there exist elementary matrices $E_1, E_2, \ldots, E_s$ such that $E_s \cdots E_2E_1A = B$. Because $A$ is invertible and each elementary matrix is invertible, $B$ is also invertible. But if $B \neq I$, then $B$ has a zero row; whence $B$ is not invertible. Thus, $B = I$, and (a) implies (b).

If (b) holds, then there exist elementary matrices $E_1, E_2, \ldots, E_s$ such that $E_s \cdots E_2E_1A = I$. Hence, $A = (E_s \cdots E_2E_1)^{-1} = E_1^{-1}E_2^{-1} \cdots E_s^{-1}$. But the $E_i^{-1}$ are also elementary matrices. Thus (b) implies (c).

If (c) holds, then $A = E_1E_2 \cdots E_s$. The $E_i$ are invertible matrices; hence, their product $A$ is also invertible. Thus, (c) implies (a). Accordingly, the theorem is proved.

3.36. Prove Theorem 3.18: If $AB = I$, then $BA = I$, and hence $B = A^{-1}$.

Suppose $A$ is not invertible. Then $A$ is not row equivalent to the identity matrix $I$, and so $A$ is row equivalent to a matrix with a zero row. In other words, there exist elementary matrices $E_1, \ldots, E_s$ such that $E_s \cdots E_2E_1A$ has a zero row. Hence, $E_s \cdots E_2E_1AB = E_s \cdots E_2E_1$, an invertible matrix, also has a zero row. But invertible matrices cannot have zero rows; hence $A$ is invertible, with inverse $A^{-1}$. Then also,

\[ B = IB = (A^{-1}A)B = A^{-1}(AB) = A^{-1}I = A^{-1} \]

3.37. Prove Theorem 3.19: $B$ is row equivalent to $A$ (written $B \sim A$) if and only if there exists a nonsingular matrix $P$ such that $B = PA$.

If $B \sim A$, then $B = e_s(\ldots(e_2(e_1(A)))\ldots) = E_s \cdots E_2E_1A = PA$, where $P = E_s \cdots E_2E_1$ is nonsingular. Conversely, suppose $B = PA$, where $P$ is nonsingular. By Theorem 3.17, $P$ is a product of elementary matrices, and so $B$ can be obtained from $A$ by a sequence of elementary row operations; that is, $B \sim A$. Thus, the theorem is proved.

3.38. Prove Theorem 3.21: Every $m \times n$ matrix $A$ is equivalent to a unique block matrix of the form $\begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}$, where $I_r$ is the $r \times r$ identity matrix.

The proof is constructive, in the form of an algorithm.

Step 1. Row reduce $A$ to row canonical form, with leading nonzero entries $a_{1j_1}, a_{2j_2}, \ldots, a_{rj_r}$.
Step 2. Interchange $C_1$ and $C_{j_1}$, interchange $C_2$ and $C_{j_2}$, $\ldots$, and interchange $C_r$ and $C_{j_r}$. This gives a matrix in the form $\begin{bmatrix} I_r & B \\ 0 & 0 \end{bmatrix}$, with leading nonzero entries $a_{11}, a_{22}, \ldots, a_{rr}$.
Step 3. Use column operations, with the $a_{ii}$ as pivots, to replace each entry in $B$ with a zero; that is, for $i = 1, 2, \ldots, r$ and $j = r + 1, r + 2, \ldots, n$, apply the operation $-b_{ij}C_i + C_j \to C_j$.

The final matrix has the desired form $\begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}$.

LU Factorization

3.39. Find the LU factorization of

\[ \text{(a)} \; A = \begin{bmatrix} 1 & -3 & 5 \\ 2 & -4 & 7 \\ -1 & -2 & 1 \end{bmatrix}, \qquad \text{(b)} \; B = \begin{bmatrix} 1 & 4 & -3 \\ 2 & 8 & 1 \\ -5 & -9 & 7 \end{bmatrix} \]

(a) Reduce $A$ to triangular form by the following operations:

“Replace $R_2$ by $-2R_1 + R_2$,” “Replace $R_3$ by $R_1 + R_3$,” and then “Replace $R_3$ by $\tfrac{5}{2}R_2 + R_3$”

These operations yield the following, where the triangular form is $U$:

\[ A \sim \begin{bmatrix} 1 & -3 & 5 \\ 0 & 2 & -3 \\ 0 & -5 & 6 \end{bmatrix} \sim \begin{bmatrix} 1 & -3 & 5 \\ 0 & 2 & -3 \\ 0 & 0 & -\tfrac{3}{2} \end{bmatrix} = U \qquad \text{and} \qquad L = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ -1 & -\tfrac{5}{2} & 1 \end{bmatrix} \]

The entries $2, -1, -\tfrac{5}{2}$ in $L$ are the negatives of the multipliers $-2, 1, \tfrac{5}{2}$ in the above row operations. (As a check, multiply $L$ and $U$ to verify $A = LU$.)

(b) Reduce $B$ to triangular form by first applying the operations “Replace $R_2$ by $-2R_1 + R_2$” and “Replace $R_3$ by $5R_1 + R_3$.” These operations yield

\[ B \sim \begin{bmatrix} 1 & 4 & -3 \\ 0 & 0 & 7 \\ 0 & 11 & -8 \end{bmatrix} \]

Observe that the second diagonal entry is 0. Thus, $B$ cannot be brought into triangular form without row interchange operations. Accordingly, $B$ is not LU-factorable. (There does exist a PLU factorization of such a matrix $B$, where $P$ is a permutation matrix, but such a factorization lies beyond the scope of this text.)
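The bookkeeping in Problem 3.39 (record the negative of each multiplier in $L$ while reducing to $U$) can be sketched as follows. This is an illustrative version without row interchanges; the function name `lu_no_pivoting` is hypothetical, and it returns `None` when a zero pivot sits above a nonzero entry, which is the situation in part (b).

```python
from fractions import Fraction

def lu_no_pivoting(a):
    """Return (L, U) with unit lower triangular L, or None if an interchange is needed."""
    n = len(a)
    u = [[Fraction(x) for x in row] for row in a]
    l = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for col in range(n):
        if u[col][col] == 0:
            if any(u[r][col] != 0 for r in range(col + 1, n)):
                return None  # would require a row interchange
            continue
        for r in range(col + 1, n):
            # The row operation is "Replace R_r by -factor * R_col + R_r";
            # L stores the negative of that multiplier, namely factor.
            factor = u[r][col] / u[col][col]
            l[r][col] = factor
            u[r] = [x - factor * y for x, y in zip(u[r], u[col])]
    return l, u

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

A = [[1, -3, 5], [2, -4, 7], [-1, -2, 1]]
B = [[1, 4, -3], [2, 8, 1], [-5, -9, 7]]

L, U = lu_no_pivoting(A)
result_B = lu_no_pivoting(B)
print(L, U, result_B)
```

Multiplying `L` by `U` with `matmul` reproduces `A`, which is the check suggested at the end of part (a).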

3.40. Find the LDU factorization of the matrix $A$ in Problem 3.39.

The $A = LDU$ factorization refers to the situation where $L$ is a lower triangular matrix with 1's on the diagonal (as in the LU factorization of $A$), $D$ is a diagonal matrix, and $U$ is an upper triangular matrix with 1's on the diagonal. Thus, simply factor out the diagonal entries in the matrix $U$ in the above LU factorization of $A$ to obtain $D$ and the new $U$. That is,

\[ L = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ -1 & -\tfrac{5}{2} & 1 \end{bmatrix}, \qquad D = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & -\tfrac{3}{2} \end{bmatrix}, \qquad U = \begin{bmatrix} 1 & -3 & 5 \\ 0 & 1 & -\tfrac{3}{2} \\ 0 & 0 & 1 \end{bmatrix} \]
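Factoring the diagonal out of $U$ amounts to dividing each row of $U$ by its diagonal entry. The sketch below starts from the $L$ and $U$ of Problem 3.39(a), entered by hand, and assumes every diagonal entry of $U$ is nonzero; `split_diagonal` is a hypothetical helper name.

```python
from fractions import Fraction

def split_diagonal(u):
    """Write upper triangular U as D * U1, with D diagonal and U1 unit upper triangular."""
    n = len(u)
    d = [u[i][i] for i in range(n)]
    D = [[d[i] if i == j else Fraction(0) for j in range(n)] for i in range(n)]
    U1 = [[x / d[i] for x in row] for i, row in enumerate(u)]
    return D, U1

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

L = [[Fraction(1), Fraction(0), Fraction(0)],
     [Fraction(2), Fraction(1), Fraction(0)],
     [Fraction(-1), Fraction(-5, 2), Fraction(1)]]
U = [[Fraction(1), Fraction(-3), Fraction(5)],
     [Fraction(0), Fraction(2), Fraction(-3)],
     [Fraction(0), Fraction(0), Fraction(-3, 2)]]

D, U1 = split_diagonal(U)
A_rebuilt = matmul(matmul(L, D), U1)
print(D, U1, A_rebuilt)
```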

3.41. Find the LU factorization of the matrix

\[ A = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 3 & 3 \\ -3 & -10 & 2 \end{bmatrix} \]

Reduce $A$ to triangular form by the following operations:

(1) “Replace $R_2$ by $-2R_1 + R_2$,” (2) “Replace $R_3$ by $3R_1 + R_3$,” (3) “Replace $R_3$ by $-4R_2 + R_3$”

These operations yield the following, where the triangular form is $U$:

\[ A \sim \begin{bmatrix} 1 & 2 & 1 \\ 0 & -1 & 1 \\ 0 & -4 & 5 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 1 \\ 0 & -1 & 1 \\ 0 & 0 & 1 \end{bmatrix} = U \qquad \text{and} \qquad L = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ -3 & 4 & 1 \end{bmatrix} \]

The entries $2, -3, 4$ in $L$ are the negatives of the multipliers $-2, 3, -4$ in the above row operations. (As a check, multiply $L$ and $U$ to verify $A = LU$.)

3.42. Let $A$ be the matrix in Problem 3.41. Find $X_1, X_2, X_3$, where $X_i$ is the solution of $AX = B_i$ for (a) $B_1 = (1, 1, 1)$, (b) $B_2 = B_1 + X_1$, (c) $B_3 = B_2 + X_2$.

(a) Find $L^{-1}B_1$ by applying the row operations (1), (2), and then (3) in Problem 3.41 to $B_1$:

\[ B_1 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \xrightarrow{(1)\text{ and }(2)} \begin{bmatrix} 1 \\ -1 \\ 4 \end{bmatrix} \xrightarrow{(3)} \begin{bmatrix} 1 \\ -1 \\ 8 \end{bmatrix} \]

Solve $UX = B$ for $B = (1, -1, 8)$ by back-substitution to obtain $X_1 = (-25, 9, 8)$.

(b) First find $B_2 = B_1 + X_1 = (1, 1, 1) + (-25, 9, 8) = (-24, 10, 9)$. Then, as above,

\[ B_2 = [-24, 10, 9]^T \xrightarrow{(1)\text{ and }(2)} [-24, 58, -63]^T \xrightarrow{(3)} [-24, 58, -295]^T \]

Solve $UX = B$ for $B = (-24, 58, -295)$ by back-substitution: $z = -295$, then $-y + z = 58$ gives $y = -353$, and $x + 2y + z = -24$ gives $x = 977$. Thus, $X_2 = (977, -353, -295)$.

(c) First find $B_3 = B_2 + X_2 = (-24, 10, 9) + (977, -353, -295) = (953, -343, -286)$. Then, as above,

\[ B_3 = [953, -343, -286]^T \xrightarrow{(1)\text{ and }(2)} [953, -2249, 2573]^T \xrightarrow{(3)} [953, -2249, 11\,569]^T \]

Solve $UX = B$ for $B = (953, -2249, 11\,569)$ by back-substitution to obtain $X_3 = (-38\,252, 13\,818, 11\,569)$.
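Once $L$ and $U$ are known, each new right-hand side costs only a forward substitution ($LY = B$) followed by a back-substitution ($UX = Y$). The sketch below recomputes the sequence of Problem 3.42 this way, using the $L$ and $U$ from Problem 3.41; the function names are illustrative, and forward substitution with unit-diagonal $L$ plays the role of applying operations (1), (2), (3) to $B$.

```python
from fractions import Fraction

L = [[1, 0, 0], [2, 1, 0], [-3, 4, 1]]
U = [[1, 2, 1], [0, -1, 1], [0, 0, 1]]
A = [[1, 2, 1], [2, 3, 3], [-3, -10, 2]]

def forward_substitution(l, b):
    """Solve L y = b for unit lower triangular L."""
    y = []
    for i in range(len(b)):
        y.append(b[i] - sum(l[i][j] * y[j] for j in range(i)))
    return y

def back_substitution(u, y):
    """Solve U x = y for upper triangular U with nonzero diagonal."""
    n = len(y)
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        s = sum(u[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - s) / Fraction(u[i][i])
    return x

def solve(b):
    return back_substitution(U, forward_substitution(L, b))

def apply(m, x):
    return [sum(a * t for a, t in zip(row, x)) for row in m]

# Build B1, B2, B3 and X1, X2, X3 with B_{k+1} = B_k + X_k.
rhs, solutions = [[1, 1, 1]], []
for _ in range(3):
    x = solve(rhs[-1])
    solutions.append(x)
    rhs.append([b + t for b, t in zip(rhs[-1], x)])

# Substituting each X_k back into A should reproduce B_k.
residual_ok = all(apply(A, x) == b for x, b in zip(solutions, rhs))
print(solutions, residual_ok)
```

The final substitution check against the original matrix $A$ is independent of the factorization, so it guards against arithmetic slips in either substitution pass.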


Miscellaneous Problems

3.43. Let $L$ be a linear combination of the $m$ equations in $n$ unknowns in the system (3.2). Say $L$ is the equation

\[ (c_1a_{11} + \cdots + c_ma_{m1})x_1 + \cdots + (c_1a_{1n} + \cdots + c_ma_{mn})x_n = c_1b_1 + \cdots + c_mb_m \tag{1} \]

Show that any solution of the system (3.2) is also a solution of $L$.

Let $u = (k_1, \ldots, k_n)$ be a solution of (3.2). Then

\[ a_{i1}k_1 + a_{i2}k_2 + \cdots + a_{in}k_n = b_i \qquad (i = 1, 2, \ldots, m) \tag{2} \]

Substituting $u$ in the left-hand side of (1) and using (2), we get

\[ \begin{aligned} (c_1a_{11} + \cdots + c_ma_{m1})k_1 + \cdots + (c_1a_{1n} + \cdots + c_ma_{mn})k_n &= c_1(a_{11}k_1 + \cdots + a_{1n}k_n) + \cdots + c_m(a_{m1}k_1 + \cdots + a_{mn}k_n) \\ &= c_1b_1 + \cdots + c_mb_m \end{aligned} \]

This is the right-hand side of (1); hence, $u$ is a solution of (1).

3.44. Suppose a system $\mathcal{M}$ of linear equations is obtained from a system $\mathcal{L}$ by applying an elementary operation (page 64). Show that $\mathcal{M}$ and $\mathcal{L}$ have the same solutions.

Each equation $L$ in $\mathcal{M}$ is a linear combination of equations in $\mathcal{L}$. Hence, by Problem 3.43, any solution of $\mathcal{L}$ will also be a solution of $\mathcal{M}$. On the other hand, each elementary operation has an inverse elementary operation, so $\mathcal{L}$ can be obtained from $\mathcal{M}$ by an elementary operation. This means that any solution of $\mathcal{M}$ is a solution of $\mathcal{L}$. Thus, $\mathcal{L}$ and $\mathcal{M}$ have the same solutions.

3.45. Prove Theorem 3.4: Suppose a system $\mathcal{M}$ of linear equations is obtained from a system $\mathcal{L}$ by a sequence of elementary operations. Then $\mathcal{M}$ and $\mathcal{L}$ have the same solutions.

Each step of the sequence does not change the solution set (Problem 3.44). Thus, the original system $\mathcal{L}$ and the final system $\mathcal{M}$ (and any system in between) have the same solutions.

3.46. A system $\mathcal{L}$ of linear equations is said to be consistent if no linear combination of its equations is a degenerate equation $L$ with a nonzero constant. Show that $\mathcal{L}$ is consistent if and only if $\mathcal{L}$ is reducible to echelon form.

Suppose $\mathcal{L}$ is reducible to echelon form. Then $\mathcal{L}$ has a solution, which must also be a solution of every linear combination of its equations. Thus, $L$, which has no solution, cannot be a linear combination of the equations in $\mathcal{L}$. Thus, $\mathcal{L}$ is consistent.

On the other hand, suppose $\mathcal{L}$ is not reducible to echelon form. Then, in the reduction process, it must yield a degenerate equation $L$ with a nonzero constant, which is a linear combination of the equations in $\mathcal{L}$. Therefore, $\mathcal{L}$ is not consistent; that is, $\mathcal{L}$ is inconsistent.

3.47. Suppose $u$ and $v$ are distinct vectors. Show that, for distinct scalars $k$, the vectors $u + k(u - v)$ are distinct.

Suppose $u + k_1(u - v) = u + k_2(u - v)$. We need only show that $k_1 = k_2$. We have

\[ k_1(u - v) = k_2(u - v), \qquad \text{and so} \qquad (k_1 - k_2)(u - v) = 0 \]

Because $u$ and $v$ are distinct, $u - v \neq 0$. Hence, $k_1 - k_2 = 0$, and so $k_1 = k_2$.

3.48. Suppose $AB$ is defined. Prove

(a) Suppose $A$ has a zero row. Then $AB$ has a zero row.
(b) Suppose $B$ has a zero column. Then $AB$ has a zero column.

(a) Let $R_i$ be the zero row of $A$, and $C_1, \ldots, C_n$ the columns of $B$. Then the $i$th row of $AB$ is

\[ (R_iC_1, R_iC_2, \ldots, R_iC_n) = (0, 0, \ldots, 0) \]

(b) $B^T$ has a zero row, and so $B^TA^T = (AB)^T$ has a zero row. Hence, $AB$ has a zero column.

SUPPLEMENTARY PROBLEMS

Linear Equations, 2 × 2 Systems

3.49. Determine whether each of the following systems is linear: (a) $3x - 4y + 2yz = 8$, (b) $ex + 3y = \pi$, (c) $2x - 3y + kz = 4$

3.50. Solve (a) $\pi x = 2$, (b) $3x + 2 = 5x + 7 - 2x$, (c) $6x + 2 - 4x = 5 + 2x - 3$

3.51. Solve each of the following systems:

(a) $2x + 3y = 1$, $\; 5x + 7y = 3$
(b) $4x - 2y = 5$, $\; -6x + 3y = 1$
(c) $2x - 4 = 3y$, $\; 5y - x = 5$
(d) $2x - 4y = 10$, $\; 3x - 6y = 15$

3.52. Consider each of the following systems in unknowns $x$ and $y$:

(a) $x - ay = 1$, $\; ax - 4y = b$
(b) $ax + 3y = 2$, $\; 12x + ay = b$
(c) $x + ay = 3$, $\; 2x + 5y = b$

For which values of $a$ does each system have a unique solution, and for which pairs of values $(a, b)$ does each system have more than one solution?

General Systems of Linear Equations

3.53. Solve

(a) $x + y + 2z = 4$, $\; 2x + 3y + 6z = 10$, $\; 3x + 6y + 10z = 17$
(b) $x - 2y + 3z = 2$, $\; 2x - 3y + 8z = 7$, $\; 3x - 4y + 13z = 8$
(c) $x + 2y + 3z = 3$, $\; 2x + 3y + 8z = 4$, $\; 5x + 8y + 19z = 11$

3.54. Solve

(a) $x - 2y = 5$, $\; 2x + 3y = 3$, $\; 3x + 2y = 7$
(b) $x + 2y - 3z + 2t = 2$, $\; 2x + 5y - 8z + 6t = 5$, $\; 3x + 4y - 5z + 2t = 4$
(c) $x + 2y + 4z - 5t = 3$, $\; 3x - y + 5z + 2t = 4$, $\; 5x - 4y + 6z + 9t = 2$

3.55. Solve

(a) $2x - y - 4z = 2$, $\; 4x - 2y - 6z = 5$, $\; 6x - 3y - 8z = 8$
(b) $x + 2y - z + 3t = 3$, $\; 2x + 4y + 4z + 3t = 9$, $\; 3x + 6y - z + 8t = 10$

3.56. Consider each of the following systems in unknowns $x, y, z$:

(a) $x - 2y = 1$, $\; x - y + az = 2$, $\; ay + 9z = b$
(b) $x + 2y + 2z = 1$, $\; x + ay + 3z = 3$, $\; x + 11y + az = b$
(c) $x + y + az = 1$, $\; x + ay + z = 4$, $\; ax + y + z = b$

For which values of $a$ does the system have a unique solution, and for which pairs of values $(a, b)$ does the system have more than one solution? The value of $b$ does not have any effect on whether the system has a unique solution. Why?

Linear Combinations, Homogeneous Systems

3.57. Write $v$ as a linear combination of $u_1, u_2, u_3$, where

(a) $v = (4, -9, 2)$, $u_1 = (1, 2, -1)$, $u_2 = (1, 4, 2)$, $u_3 = (1, -3, 2)$;
(b) $v = (1, 3, 2)$, $u_1 = (1, 2, 1)$, $u_2 = (2, 6, 5)$, $u_3 = (1, 7, 8)$;
(c) $v = (1, 4, 6)$, $u_1 = (1, 1, 2)$, $u_2 = (2, 3, 5)$, $u_3 = (3, 5, 8)$.

3.58. Let $u_1 = (1, 1, 2)$, $u_2 = (1, 3, -2)$, $u_3 = (4, -2, -1)$ in $\mathbf{R}^3$. Show that $u_1, u_2, u_3$ are orthogonal, and write $v$ as a linear combination of $u_1, u_2, u_3$, where (a) $v = (5, -5, 9)$, (b) $v = (1, -3, 3)$, (c) $v = (1, 1, 1)$. (Hint: Use Fourier coefficients.)

3.59. Find the dimension and a basis of the general solution $W$ of each of the following homogeneous systems:

(a) $x - y + 2z = 0$, $\; 2x + y + z = 0$, $\; 5x + y + 4z = 0$
(b) $x + 2y - 3z = 0$, $\; 2x + 5y + 2z = 0$, $\; 3x - y - 4z = 0$
(c) $x + 2y + 3z + t = 0$, $\; 2x + 4y + 7z + 4t = 0$, $\; 3x + 6y + 10z + 5t = 0$

3.60. Find the dimension and a basis of the general solution $W$ of each of the following systems:

(a) $x_1 + 3x_2 + 2x_3 - x_4 - x_5 = 0$, $\; 2x_1 + 6x_2 + 5x_3 + x_4 - x_5 = 0$, $\; 5x_1 + 15x_2 + 12x_3 + x_4 - 3x_5 = 0$
(b) $2x_1 - 4x_2 + 3x_3 - x_4 + 2x_5 = 0$, $\; 3x_1 - 6x_2 + 5x_3 - 2x_4 + 4x_5 = 0$, $\; 5x_1 - 10x_2 + 7x_3 - 3x_4 + 18x_5 = 0$

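Echelon Matrices, Row Canonical Form

3.61. Reduce each of the following matrices to echelon form and then to row canonical form:

\[ \text{(a)} \; \begin{bmatrix} 1 & 1 & 2 \\ 2 & 4 & 9 \\ 1 & 5 & 12 \end{bmatrix}, \qquad \text{(b)} \; \begin{bmatrix} 1 & 2 & -1 & 2 & 1 \\ 2 & 4 & 1 & -2 & 5 \\ 3 & 6 & 3 & -7 & 7 \end{bmatrix}, \qquad \text{(c)} \; \begin{bmatrix} 2 & 4 & 2 & -2 & 5 & 1 \\ 3 & 6 & 2 & 2 & 0 & 4 \\ 4 & 8 & 2 & 6 & -5 & 7 \end{bmatrix} \]

3.62. Reduce each of the following matrices to echelon form and then to row canonical form:

\[ \text{(a)} \; \begin{bmatrix} 1 & 2 & 1 & 2 & 1 & 2 \\ 2 & 4 & 3 & 5 & 5 & 7 \\ 3 & 6 & 4 & 9 & 10 & 11 \\ 1 & 2 & 4 & 3 & 6 & 9 \end{bmatrix}, \qquad \text{(b)} \; \begin{bmatrix} 0 & 1 & 2 & 3 \\ 0 & 3 & 8 & 12 \\ 0 & 0 & 4 & 6 \\ 0 & 2 & 7 & 10 \end{bmatrix}, \qquad \text{(c)} \; \begin{bmatrix} 1 & 3 & 1 & 3 \\ 2 & 8 & 5 & 10 \\ 1 & 7 & 7 & 11 \\ 3 & 11 & 7 & 15 \end{bmatrix} \]

3.63. Using only 0's and 1's, list all possible $2 \times 2$ matrices in row canonical form.

3.64. Using only 0's and 1's, find the number $n$ of possible $3 \times 3$ matrices in row canonical form.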

Elementary Matrices, Applications

3.65. Let $e_1, e_2, e_3$ denote, respectively, the following elementary row operations:

“Interchange $R_2$ and $R_3$,” “Replace $R_2$ by $3R_2$,” “Replace $R_1$ by $2R_3 + R_1$”

(a) Find the corresponding elementary matrices $E_1, E_2, E_3$.
(b) Find the inverse operations $e_1^{-1}, e_2^{-1}, e_3^{-1}$; their corresponding elementary matrices $E_1', E_2', E_3'$; and the relationship between them and $E_1, E_2, E_3$.
(c) Describe the corresponding elementary column operations $f_1, f_2, f_3$.
(d) Find elementary matrices $F_1, F_2, F_3$ corresponding to $f_1, f_2, f_3$, and the relationship between them and $E_1, E_2, E_3$.

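3.66. Express each of the following matrices as a product of elementary matrices:

\[ A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}, \qquad B = \begin{bmatrix} 3 & -6 \\ -2 & 4 \end{bmatrix}, \qquad C = \begin{bmatrix} 2 & 6 \\ -3 & -7 \end{bmatrix}, \qquad D = \begin{bmatrix} 1 & 2 & 0 \\ 0 & 1 & 3 \\ 3 & 8 & 7 \end{bmatrix} \]

3.67. Find the inverse of each of the following matrices (if it exists):

\[ A = \begin{bmatrix} 1 & -2 & -1 \\ 2 & -3 & 1 \\ 3 & -4 & 4 \end{bmatrix}, \qquad B = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 6 & 1 \\ 3 & 10 & -1 \end{bmatrix}, \qquad C = \begin{bmatrix} 1 & 3 & -2 \\ 2 & 8 & -3 \\ 1 & 7 & 1 \end{bmatrix}, \qquad D = \begin{bmatrix} 2 & 1 & -1 \\ 5 & 2 & -3 \\ 0 & 2 & 1 \end{bmatrix} \]

3.68. Find the inverse of each of the following $n \times n$ matrices:

(a) $A$ has 1's on the diagonal and superdiagonal (entries directly above the diagonal) and 0's elsewhere.
(b) $B$ has 1's on and above the diagonal, and 0's below the diagonal.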

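LU Factorization

3.69. Find the LU factorization of each of the following matrices:

\[ \text{(a)} \; \begin{bmatrix} 1 & -1 & -1 \\ 3 & -4 & -2 \\ 2 & -3 & -2 \end{bmatrix}, \qquad \text{(b)} \; \begin{bmatrix} 1 & 3 & -1 \\ 2 & 5 & 1 \\ 3 & 4 & 2 \end{bmatrix}, \qquad \text{(c)} \; \begin{bmatrix} 2 & 3 & 6 \\ 4 & 7 & 9 \\ 3 & 5 & 4 \end{bmatrix}, \qquad \text{(d)} \; \begin{bmatrix} 1 & 2 & 3 \\ 2 & 4 & 7 \\ 3 & 7 & 10 \end{bmatrix} \]

3.70. Let $A$ be the matrix in Problem 3.69(a). Find $X_1, X_2, X_3, X_4$, where

(a) $X_1$ is the solution of $AX = B_1$, where $B_1 = (1, 1, 1)^T$.
(b) For $k > 1$, $X_k$ is the solution of $AX = B_k$, where $B_k = B_{k-1} + X_{k-1}$.

3.71. Let $B$ be the matrix in Problem 3.69(b). Find the LDU factorization of $B$.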

Miscellaneous Problems

3.72. Consider the following systems in unknowns $x$ and $y$:

(a) $ax + by = 1$, $\; cx + dy = 0$
(b) $ax + by = 0$, $\; cx + dy = 1$

Suppose $D = ad - bc \neq 0$. Show that each system has the unique solution: (a) $x = d/D$, $y = -c/D$, (b) $x = -b/D$, $y = a/D$.

3.73. Find the inverse of the row operation “Replace $R_i$ by $kR_j + k'R_i$ $(k' \neq 0)$.”

3.74. Prove that deleting the last column of an echelon form (respectively, the row canonical form) of an augmented matrix $M = [A, B]$ yields an echelon form (respectively, the row canonical form) of $A$.

3.75. Let $e$ be an elementary row operation and $E$ its elementary matrix, and let $f$ be the corresponding elementary column operation and $F$ its elementary matrix. Prove (a) $f(A) = (e(A^T))^T$, (b) $F = E^T$, (c) $f(A) = AF$.

3.76. Matrix $A$ is equivalent to matrix $B$, written $A \approx B$, if there exist nonsingular matrices $P$ and $Q$ such that $B = PAQ$. Prove that $\approx$ is an equivalence relation; that is, (a) $A \approx A$, (b) If $A \approx B$, then $B \approx A$, (c) If $A \approx B$ and $B \approx C$, then $A \approx C$.

110
ANSWERS TO SUPPLEMENTARY PROBLEMS

CHAPTER 3 Systems of Linear Equations

Notation: A ¼ ½R1 ; R2 ; . . .Š denotes the matrix A with rows R1 ; R2 ; . . . . The elements in each row are separated by commas (which may be omitted with single digits), the rows are separated by semicolons, and 0 denotes a zero row. For example, 2 3 1 2 3 4 A ¼ ½1; 2; 3; 4; 5; À6; 7; À8; 0Š ¼ 4 5 À6 7 À8 5 0 0 0 0 3.49. (a) 3.50. (a) 3.51. (a) 3.52. (a) 3.53. (a) 3.54. (a) 3.55. (a) 3.56. (a) (c) 3.57. (a) 3.58. (a) 3.59. (a) (c) no, (b) yes, (c) linear in x; y; z, not linear in x; y; z; k (c) (c) every scalar k is a solution ð5; 2Þ, (d) ð5 À 2a; aÞ ð6; 4Þ; ðÀ6; À4Þ, (c) a 6¼ 5 ; 2 ð5 ; 6Þ 2

x ¼ 2=p, ð2; À1Þ, a 6¼ Æ2; ð2; 1; 1Þ, 2 ð3; À1Þ, u ¼ ð1 a þ 2; 2

(b) no solution, (b) no solution, ð2; 2Þ; ðÀ2; À2Þ,

(b) a 6¼ Æ6; (c)

(b) no solution,

u ¼ ðÀ7a À 1; 2a þ 2; aÞ. (c) a;
1 2 ð1

(b) u ¼ ðÀa þ 2b; 1 þ 2a À 2b; a; bÞ, a;
1 2Þ,

no solution þ bÞ; bÞ ðÀ1; À5Þ,

(b) u ¼ ð1 ð7 À 5b À 4aÞ; 2

a 6¼ Æ3; ð3; 3Þ; ðÀ3; À3Þ, a 6¼ 1 and a 6¼ À2; ðÀ2; 5Þ 2; À1; 3, 3; À2; 1, dim W ¼ 1; dim W ¼ 2; (b) 6; À3; 1, (b)
2 1 3 ; À1; 3,

(b) a 6¼ 5 and a 6¼ À1;

ð5; 7Þ;

(c) not possible (c)
2 1 1 3 ; 7 ; 21

u1 ¼ ðÀ1; 1; 1Þ, (b) dim W ¼ 0, no basis, u1 ¼ ðÀ2; 1; 0; 0Þ; u2 ¼ ð5; 0; À2; 1Þ u3 ¼ ð3; 0; À1; 0; 1Þ,

3.60. (a) dim W ¼ 3; u1 ¼ ðÀ3; 1; 0; 0; 0Þ, u2 ¼ ð7; 0; À3; 1; 0Þ, (b) dim W ¼ 2, u1 ¼ ð2; 1; 0; 0; 0Þ, u2 ¼ ð5; 0; À5; À3; 1Þ 3.61. (a) (c) ½1; 0; À 1 ; 0; 1; 5 ; 0Š, (b) ½1; 2; 0; 0; 2; 2 2 ½1; 2; 0; 4; À5; 3; 0; 0; 1; À5; 15 ; À 5 ; 0Š 2 2 0; 0; 1; 0; 5;

0; 0; 0; 1; 2Š,

3.62. (a) ½1; 2; 0; 0; À4; À2; 0; 0; 1; 0; 1; 2; 0; 0; 0; 1; 2; 1; 0Š, (b) ½0; 1; 0; 0; 0; 0; 1; 0; 0; 0; 0; 1; 0Š, (c) ½1; 0; 0; 4; 3.63. 5: ½1; 0; 3.64. 16 0; 1Š, ½1; 1; 0; 0Š, ½1; 0; 0; 0Š, ½0; 1; 0; 0Š; 0

0; 1; 0; À1;

0; 0; 1; 2;



3.65. (a) ½1; 0; 0; 0; 0; 1; 0; 1; 0Š, ½1; 0; 0; 0; 3; 0; 0; 0; 1Š, ½1; 0; 2; 0; 1; 0; À2R3 þ R1 ! R1 ; each Ei0 ¼ EiÀ1 , (b) R2 $ R3 ; 1 R2 ! R2 ; 3 (d) each Fi ¼ EiT . (c) C2 $ C3 ; 3C2 ! C2 ; 2C3 þ C1 ! C1 , 3.66. A ¼ ½1; 0; 3; 1Š½1; 0; 0; À2Š½1; 2; 0; 1Š, B is not invertible, C ¼ ½1; 0; À 3 ; 1Š½1; 0; 0; 2Š½1; 6; 0; 1Š½2; 0; 0; 1Š, 2 D ¼ ½100; 010; 301Š½100; 010; 021Š½100; 013; 001Š½120; 3.67. AÀ1 ¼ ½À8; 12; À5; À5; 7; À3; C À1 ¼ ½29 ; À 17 ; 7 ; À 5 ; 3 ; À 1 ; 2 2 2 2 2 2 1; À2; 1Š, 3; À2; 1Š; B has no inverse, DÀ1 ¼ ½8; À3; À1;

0; 0; 1Š,

010;

001Š 10; À4; À1Š

À5; 2; 1;

3.68. AÀ1 ¼ ½1; À1; 1; À1; . . . ; 0; 1; À1; 1; À1; . . . ; 0; 0; 1; À1; 1; À1; 1; . . . ; BÀ1 has 1’s on diagonal, À1’s on superdiagonal, and 0’s elsewhere. 3.69. (a) ½100; 310; 211Š½1; À1; À1; 0; À1; 1; 0; 0; À1Š, (b) ½100; 210; 351Š½1; 3; À1; 0; À1; 3; 0; 0; À10Š, (c) ½100; 210; 3 ; 1 ; 1Š½2; 3; 6; 0; 1; À3; 0; 0; À 7Š, 2 2 2 (d) There is no LU decomposition. 3.70. X1 ¼ ½1; 1; À1ŠT ; B2 ¼ ½2; 2; 0ŠT , X2 ¼ ½6; 4; 0ŠT , B3 ¼ ½8; 6; 0ŠT , B4 ¼ ½30; 22; À2ŠT , X4 ¼ ½86; 62; À6ŠT 3.71. B ¼ ½100; 210; 351Š diagð1; À1; À10Þ ½1; 3; À1; 0; 1; 3; 0; 0; 1Š X3 ¼ ½22; 16; À2ŠT , ...; ...;

0; . . . 0; 1Š

3.73. Replace $R_i$ by $-(k/k')R_j + (1/k')R_i$.

3.75. (c) $f(A) = (e(A^T))^T = (EA^T)^T = (A^T)^T E^T = AF$.

3.76. (a) $A = IAI$. (b) If $A = PBQ$, then $B = P^{-1}AQ^{-1}$. (c) If $A = PBQ$ and $B = P'CQ'$, then $A = (PP')C(Q'Q)$.

CHAPTER 4

Vector Spaces
4.1 Introduction
This chapter introduces the underlying structure of linear algebra, that of a finite-dimensional vector space. The definition of a vector space $V$, whose elements are called vectors, involves an arbitrary field $K$, whose elements are called scalars. The following notation will be used (unless otherwise stated or implied):

$V$                     the given vector space
$u, v, w$               vectors in $V$
$K$                     the given number field
$a, b, c,$ or $k$       scalars in $K$

Almost nothing essential is lost if the reader assumes that $K$ is the real field $\mathbf{R}$ or the complex field $\mathbf{C}$. The reader might suspect that the real line $\mathbf{R}$ has "dimension" one, the cartesian plane $\mathbf{R}^2$ has "dimension" two, and the space $\mathbf{R}^3$ has "dimension" three. This chapter formalizes the notion of "dimension," and this definition will agree with the reader's intuition. Throughout this text, we will use the following set notation:

$a \in A$               Element $a$ belongs to set $A$
$a, b \in A$            Elements $a$ and $b$ belong to $A$
$\forall x \in A$       For every $x$ in $A$
$\exists x \in A$       There exists an $x$ in $A$
$A \subseteq B$         $A$ is a subset of $B$
$A \cap B$              Intersection of $A$ and $B$
$A \cup B$              Union of $A$ and $B$
$\emptyset$             Empty set

4.2 Vector Spaces

The following defines the notion of a vector space V where K is the field of scalars.
DEFINITION:

Let $V$ be a nonempty set with two operations:
(i) Vector Addition: This assigns to any $u, v \in V$ a sum $u + v$ in $V$.
(ii) Scalar Multiplication: This assigns to any $u \in V$, $k \in K$ a product $ku \in V$.
Then $V$ is called a vector space (over the field $K$) if the following axioms hold for any vectors $u, v, w \in V$:

[A1] $(u + v) + w = u + (v + w)$.
[A2] There is a vector in $V$, denoted by 0 and called the zero vector, such that, for any $u \in V$, $u + 0 = 0 + u = u$.
[A3] For each $u \in V$, there is a vector in $V$, denoted by $-u$ and called the negative of $u$, such that $u + (-u) = (-u) + u = 0$.
[A4] $u + v = v + u$.
[M1] $k(u + v) = ku + kv$, for any scalar $k \in K$.
[M2] $(a + b)u = au + bu$, for any scalars $a, b \in K$.
[M3] $(ab)u = a(bu)$, for any scalars $a, b \in K$.
[M4] $1u = u$, for the unit scalar $1 \in K$.

The above axioms naturally split into two sets (as indicated by the labeling of the axioms). The first four are concerned only with the additive structure of $V$ and can be summarized by saying $V$ is a commutative group under addition. This means:
(a) Any sum $v_1 + v_2 + \cdots + v_m$ of vectors requires no parentheses and does not depend on the order of the summands.
(b) The zero vector 0 is unique, and the negative $-u$ of a vector $u$ is unique.
(c) (Cancellation Law) If $u + w = v + w$, then $u = v$.
Also, subtraction in $V$ is defined by $u - v = u + (-v)$, where $-v$ is the unique negative of $v$.
On the other hand, the remaining four axioms are concerned with the "action" of the field $K$ of scalars on the vector space $V$. Using these additional axioms, we prove (Problem 4.2) the following simple properties of a vector space.
THEOREM 4.1:

Let $V$ be a vector space over a field $K$.
(i) For any scalar $k \in K$ and $0 \in V$, $k0 = 0$.
(ii) For $0 \in K$ and any vector $u \in V$, $0u = 0$.
(iii) If $ku = 0$, where $k \in K$ and $u \in V$, then $k = 0$ or $u = 0$.
(iv) For any $k \in K$ and any $u \in V$, $(-k)u = k(-u) = -ku$.
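To illustrate how the axioms are used, here is one possible derivation of part (i) of Theorem 4.1. It is a sketch under the axioms [A1]-[A3] and [M1] only, and is not necessarily the argument given in Problem 4.2.

```latex
\begin{align*}
k0 &= k(0 + 0)  && \text{by [A2], since } 0 + 0 = 0 \\
   &= k0 + k0   && \text{by [M1]} \\
\Longrightarrow\quad 0 &= k0 && \text{adding } -(k0) \text{ to both sides and using [A3], [A1], [A2]}
\end{align*}
```

Part (ii) can be handled the same way, starting from $0u = (0 + 0)u = 0u + 0u$ by [M2].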

4.3 Examples of Vector Spaces

This section lists important examples of vector spaces that will be used throughout the text.

Space $K^n$
Let $K$ be an arbitrary field. The notation $K^n$ is frequently used to denote the set of all $n$-tuples of elements in $K$. Here $K^n$ is a vector space over $K$ using the following operations:
(i) Vector Addition: $(a_1, a_2, \ldots, a_n) + (b_1, b_2, \ldots, b_n) = (a_1 + b_1, a_2 + b_2, \ldots, a_n + b_n)$
(ii) Scalar Multiplication: $k(a_1, a_2, \ldots, a_n) = (ka_1, ka_2, \ldots, ka_n)$
The zero vector in $K^n$ is the $n$-tuple of zeros, $0 = (0, 0, \ldots, 0)$, and the negative of a vector is defined by $-(a_1, a_2, \ldots, a_n) = (-a_1, -a_2, \ldots, -a_n)$.
Observe that these are the same as the operations defined for $\mathbf{R}^n$ in Chapter 1. The proof that $K^n$ is a vector space is identical to the proof of Theorem 1.1, which we now regard as stating that $\mathbf{R}^n$ with the operations defined there is a vector space over $\mathbf{R}$.
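The two operations on $K^n$ can be tried out directly. The following minimal sketch takes $K$ to be the rational numbers (represented by Python's `Fraction`); the helper names `vec_add` and `scalar_mul` are illustrative and do not come from the text.

```python
from fractions import Fraction

def vec_add(u, v):
    """Componentwise sum of two n-tuples of the same length."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return tuple(a + b for a, b in zip(u, v))

def scalar_mul(k, u):
    """Multiply every component of the n-tuple u by the scalar k."""
    return tuple(k * a for a in u)

# Sample vectors in Q^3 (the values are chosen only for illustration).
u = (Fraction(1), Fraction(-2), Fraction(3))
v = (Fraction(4), Fraction(0), Fraction(-1))
zero = (Fraction(0),) * 3
k = Fraction(1, 2)

total = vec_add(u, v)        # componentwise sum
neg_u = scalar_mul(-1, u)    # the negative of u
```

With these samples, `vec_add(u, neg_u)` returns the zero tuple, and scaling a sum agrees with summing the scaled vectors, as axioms [A3] and [M1] require.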

Polynomial Space $P(t)$
Let $P(t)$ denote the set of all polynomials of the form
\[ p(t) = a_0 + a_1 t + a_2 t^2 + \cdots + a_s t^s \qquad (s = 1, 2, \ldots) \]
where the coefficients $a_i$ belong to a field $K$. Then $P(t)$ is a vector space over $K$ using the following operations:
(i) Vector Addition: Here $p(t) + q(t)$ in $P(t)$ is the usual operation of addition of polynomials.
(ii) Scalar Multiplication: Here $kp(t)$ in $P(t)$ is the usual operation of the product of a scalar $k$ and a polynomial $p(t)$.
The zero polynomial 0 is the zero vector in $P(t)$.

Polynomial Space $P_n(t)$
Let $P_n(t)$ denote the set of all polynomials $p(t)$ over a field $K$, where the degree of $p(t)$ is less than or equal to $n$; that is,
\[ p(t) = a_0 + a_1 t + a_2 t^2 + \cdots + a_s t^s \]
where $s \le n$. Then $P_n(t)$ is a vector space over $K$ with respect to the usual operations of addition of polynomials and of multiplication of a polynomial by a constant (just like the vector space $P(t)$ above). We include the zero polynomial 0 as an element of $P_n(t)$, even though its degree is undefined.

Matrix Space $M_{m,n}$
The notation $M_{m,n}$, or simply $M$, will be used to denote the set of all $m \times n$ matrices with entries in a field $K$. Then $M_{m,n}$ is a vector space over $K$ with respect to the usual operations of matrix addition and scalar multiplication of matrices, as indicated by Theorem 2.1.

Function Space $F(X)$
Let $X$ be a nonempty set and let $K$ be an arbitrary field. Let $F(X)$ denote the set of all functions of $X$ into $K$. [Note that $F(X)$ is nonempty, because $X$ is nonempty.] Then $F(X)$ is a vector space over $K$ with respect to the following operations:
(i) Vector Addition: The sum of two functions $f$ and $g$ in $F(X)$ is the function $f + g$ in $F(X)$ defined by
\[ (f + g)(x) = f(x) + g(x) \qquad \forall x \in X \]
(ii) Scalar Multiplication: The product of a scalar $k \in K$ and a function $f$ in $F(X)$ is the function $kf$ in $F(X)$ defined by
\[ (kf)(x) = kf(x) \qquad \forall x \in X \]
The zero vector in $F(X)$ is the zero function 0, which maps every $x \in X$ into the zero element $0 \in K$:
\[ 0(x) = 0 \qquad \forall x \in X \]
Also, for any function $f$ in $F(X)$, the negative of $f$ is the function $-f$ in $F(X)$ defined by
\[ (-f)(x) = -f(x) \qquad \forall x \in X \]

Fields and Subfields
Suppose a field $E$ is an extension of a field $K$; that is, suppose $E$ is a field that contains $K$ as a subfield. Then $E$ may be viewed as a vector space over $K$ using the following operations:
(i) Vector Addition: Here $u + v$ in $E$ is the usual addition in $E$.
(ii) Scalar Multiplication: Here $ku$ in $E$, where $k \in K$ and $u \in E$, is the usual product of $k$ and $u$ as elements of $E$.
That is, the eight axioms of a vector space are satisfied by $E$ and its subfield $K$ with respect to the above two operations.


4.4 Linear Combinations, Spanning Sets

Let $V$ be a vector space over a field $K$. A vector $v$ in $V$ is a linear combination of vectors $u_1, u_2, \ldots, u_m$ in $V$ if there exist scalars $a_1, a_2, \ldots, a_m$ in $K$ such that
\[ v = a_1 u_1 + a_2 u_2 + \cdots + a_m u_m \]
Alternatively, $v$ is a linear combination of $u_1, u_2, \ldots, u_m$ if there is a solution to the vector equation
\[ v = x_1 u_1 + x_2 u_2 + \cdots + x_m u_m \]
where $x_1, x_2, \ldots, x_m$ are unknown scalars.
EXAMPLE 4.1 (Linear Combinations in $\mathbf{R}^n$) Suppose we want to express $v = (3, 7, -4)$ in $\mathbf{R}^3$ as a linear combination of the vectors
\[ u_1 = (1, 2, 3), \qquad u_2 = (2, 3, 7), \qquad u_3 = (3, 5, 6) \]
We seek scalars $x$, $y$, $z$ such that $v = xu_1 + yu_2 + zu_3$; that is,
\[ \begin{bmatrix} 3 \\ 7 \\ -4 \end{bmatrix} = x\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} + y\begin{bmatrix} 2 \\ 3 \\ 7 \end{bmatrix} + z\begin{bmatrix} 3 \\ 5 \\ 6 \end{bmatrix} \qquad\text{or}\qquad \begin{aligned} x + 2y + 3z &= 3 \\ 2x + 3y + 5z &= 7 \\ 3x + 7y + 6z &= -4 \end{aligned} \]
(For notational convenience, we have written the vectors in $\mathbf{R}^3$ as columns, because it is then easier to find the equivalent system of linear equations.) Reducing the system to echelon form yields
\[ \begin{aligned} x + 2y + 3z &= 3 \\ -y - z &= 1 \\ y - 3z &= -13 \end{aligned} \qquad\text{and then}\qquad \begin{aligned} x + 2y + 3z &= 3 \\ -y - z &= 1 \\ -4z &= -12 \end{aligned} \]
Back-substitution yields the solution $x = 2$, $y = -4$, $z = 3$. Thus, $v = 2u_1 - 4u_2 + 3u_3$.

Remark: Generally speaking, the question of expressing a given vector $v$ in $K^n$ as a linear combination of vectors $u_1, u_2, \ldots, u_m$ in $K^n$ is equivalent to solving a system $AX = B$ of linear equations, where $v$ is the column $B$ of constants, and the $u$'s are the columns of the coefficient matrix $A$. Such a system may have a unique solution (as above), many solutions, or no solution. The last case, no solution, means that $v$ cannot be written as a linear combination of the $u$'s.
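The Remark can be sketched in code: place the $u$'s as the columns of $A$, put $v$ in the augmented column, and row reduce. The function below is a minimal illustration using exact rational arithmetic. The name `solve_unique`, and its convention of returning `None` when there is no unique solution, are assumptions made for this sketch.

```python
from fractions import Fraction

def solve_unique(columns, target):
    """Solve x1*u1 + ... + xm*um = target, where columns = [u1, ..., um].

    Returns the list [x1, ..., xm] when the solution is unique, and None
    when the system has no solution or more than one solution.
    """
    n, m = len(target), len(columns)
    # Augmented matrix [A | B]: the u's are the columns of A, target is B.
    rows = [[Fraction(columns[j][i]) for j in range(m)] + [Fraction(target[i])]
            for i in range(n)]
    pivot_row = 0
    for col in range(m):
        # Look for a row at or below pivot_row with a nonzero entry in this column.
        pr = next((r for r in range(pivot_row, n) if rows[r][col] != 0), None)
        if pr is None:
            return None  # no pivot in this column, so no unique solution
        rows[pivot_row], rows[pr] = rows[pr], rows[pivot_row]
        pivot = rows[pivot_row][col]
        rows[pivot_row] = [x / pivot for x in rows[pivot_row]]
        # Clear the column in every other row.
        for r in range(n):
            if r != pivot_row and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
    # Any remaining row reads 0 = c; the system is inconsistent if c != 0.
    if any(row[-1] != 0 for row in rows[pivot_row:]):
        return None
    return [rows[i][-1] for i in range(m)]

# Data of Example 4.1.
coefficients = solve_unique([(1, 2, 3), (2, 3, 7), (3, 5, 6)], (3, 7, -4))
```

For the data of Example 4.1 this returns the coefficients 2, -4, 3, in agreement with $v = 2u_1 - 4u_2 + 3u_3$.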
EXAMPLE 4.2 (Linear combinations in $P(t)$) Suppose we want to express the polynomial $v = 3t^2 + 5t - 5$ as a linear combination of the polynomials
\[ p_1 = t^2 + 2t + 1, \qquad p_2 = 2t^2 + 5t + 4, \qquad p_3 = t^2 + 3t + 6 \]
We seek scalars $x$, $y$, $z$ such that $v = xp_1 + yp_2 + zp_3$; that is,
\[ 3t^2 + 5t - 5 = x(t^2 + 2t + 1) + y(2t^2 + 5t + 4) + z(t^2 + 3t + 6) \qquad (*) \]
There are two ways to proceed from here.

(1) Expand the right-hand side of (*), obtaining:
\[ \begin{aligned} 3t^2 + 5t - 5 &= xt^2 + 2xt + x + 2yt^2 + 5yt + 4y + zt^2 + 3zt + 6z \\ &= (x + 2y + z)t^2 + (2x + 5y + 3z)t + (x + 4y + 6z) \end{aligned} \]
Set coefficients of the same powers of $t$ equal to each other, and reduce the system to echelon form:
\[ \begin{aligned} x + 2y + z &= 3 \\ 2x + 5y + 3z &= 5 \\ x + 4y + 6z &= -5 \end{aligned} \qquad\text{or}\qquad \begin{aligned} x + 2y + z &= 3 \\ y + z &= -1 \\ 2y + 5z &= -8 \end{aligned} \qquad\text{or}\qquad \begin{aligned} x + 2y + z &= 3 \\ y + z &= -1 \\ 3z &= -6 \end{aligned} \]
The system is in triangular form and has a solution. Back-substitution yields the solution $x = 3$, $y = 1$, $z = -2$. Thus,
\[ v = 3p_1 + p_2 - 2p_3 \]

(2) The equation (*) is actually an identity in the variable $t$; that is, the equation holds for any value of $t$. We can obtain three equations in the unknowns $x$, $y$, $z$ by setting $t$ equal to any three values. For example,

Set $t = 0$ in (*) to obtain:     $x + 4y + 6z = -5$
Set $t = 1$ in (*) to obtain:     $4x + 11y + 10z = 3$
Set $t = -1$ in (*) to obtain:    $y + 4z = -7$

Reducing this system to echelon form and solving by back-substitution again yields the solution $x = 3$, $y = 1$, $z = -2$. Thus (again), $v = 3p_1 + p_2 - 2p_3$.
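The result of Example 4.2 can be checked mechanically. In the sketch below a polynomial is stored as a list of coefficients, constant term first; this representation and the helper names (`poly_add`, `poly_scale`, `poly_eval`) are assumptions made for illustration only.

```python
def poly_add(p, q):
    """Add two polynomials given as coefficient lists (constant term first)."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def poly_scale(k, p):
    """Multiply the polynomial p by the scalar k."""
    return [k * a for a in p]

def poly_eval(p, t):
    """Evaluate p at t by Horner's rule."""
    result = 0
    for a in reversed(p):
        result = result * t + a
    return result

p1 = [1, 2, 1]    # t^2 + 2t + 1
p2 = [4, 5, 2]    # 2t^2 + 5t + 4
p3 = [6, 3, 1]    # t^2 + 3t + 6
v = [-5, 5, 3]    # 3t^2 + 5t - 5

# Method (1): compare coefficients of 3*p1 + p2 - 2*p3 with those of v.
combo = poly_add(poly_add(poly_scale(3, p1), p2), poly_scale(-2, p3))

# Method (2): the right-hand sides obtained by setting t = 0, 1, -1.
values = [poly_eval(v, t) for t in (0, 1, -1)]
```

Here `combo` has the same coefficients as `v`, and `values` reproduces the constants $-5$, $3$, $-7$ of the three equations in method (2).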

Spanning Sets
Let $V$ be a vector space over $K$. Vectors $u_1, u_2, \ldots, u_m$ in $V$ are said to span $V$ or to form a spanning set of $V$ if every $v$ in $V$ is a linear combination of the vectors $u_1, u_2, \ldots, u_m$; that is, if there exist scalars $a_1, a_2, \ldots, a_m$ in $K$ such that
\[ v = a_1 u_1 + a_2 u_2 + \cdots + a_m u_m \]
The following remarks follow directly from the definition.

Remark 1: Suppose $u_1, u_2, \ldots, u_m$ span $V$. Then, for any vector $w$, the set $w, u_1, u_2, \ldots, u_m$ also spans $V$.

Remark 2: Suppose $u_1, u_2, \ldots, u_m$ span $V$ and suppose $u_k$ is a linear combination of some of the other $u$'s. Then the $u$'s without $u_k$ also span $V$.

Remark 3: Suppose $u_1, u_2, \ldots, u_m$ span $V$ and suppose one of the $u$'s is the zero vector. Then the $u$'s without the zero vector also span $V$.
EXAMPLE 4.3 Consider the vector space $V = \mathbf{R}^3$.

(a) We claim that the following vectors form a spanning set of $\mathbf{R}^3$:
\[ e_1 = (1, 0, 0), \qquad e_2 = (0, 1, 0), \qquad e_3 = (0, 0, 1) \]
Specifically, if $v = (a, b, c)$ is any vector in $\mathbf{R}^3$, then
\[ v = ae_1 + be_2 + ce_3 \]
For example, $v = (5, -6, 2) = 5e_1 - 6e_2 + 2e_3$.

(b) We claim that the following vectors also form a spanning set of $\mathbf{R}^3$:
\[ w_1 = (1, 1, 1), \qquad w_2 = (1, 1, 0), \qquad w_3 = (1, 0, 0) \]
Specifically, if $v = (a, b, c)$ is any vector in $\mathbf{R}^3$, then (Problem 4.62)
\[ v = (a, b, c) = cw_1 + (b - c)w_2 + (a - b)w_3 \]
For example, $v = (5, -6, 2) = 2w_1 - 8w_2 + 11w_3$.

(c) One can show (Problem 3.24) that $v = (2, 7, 8)$ cannot be written as a linear combination of the vectors
\[ u_1 = (1, 2, 3), \qquad u_2 = (1, 3, 5), \qquad u_3 = (1, 5, 9) \]
Accordingly, $u_1$, $u_2$, $u_3$ do not span $\mathbf{R}^3$.
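The formula in part (b) of Example 4.3 can be checked by recombining the vectors. The sketch below is illustrative only; `coords_in_w` and `combine` are hypothetical helper names, and integer inputs are assumed.

```python
W1, W2, W3 = (1, 1, 1), (1, 1, 0), (1, 0, 0)

def coords_in_w(v):
    """Coefficients (x, y, z) with v = x*w1 + y*w2 + z*w3, per Example 4.3(b)."""
    a, b, c = v
    return (c, b - c, a - b)

def combine(coeffs, vectors):
    """Form the linear combination sum(coeff * vector), component by component."""
    return tuple(sum(k * vec[i] for k, vec in zip(coeffs, vectors))
                 for i in range(len(vectors[0])))

v = (5, -6, 2)
coeffs = coords_in_w(v)                    # coefficients of w1, w2, w3
rebuilt = combine(coeffs, (W1, W2, W3))    # should reproduce v
```

For $v = (5, -6, 2)$ the coefficients are $2$, $-8$, $11$, and recombining them gives back $v$, as stated in the example.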

EXAMPLE 4.4 Consider the vector space $V = P_n(t)$ consisting of all polynomials of degree $\le n$.

(a) Clearly every polynomial in $P_n(t)$ can be expressed as a linear combination of the $n + 1$ polynomials
\[ 1, \quad t, \quad t^2, \quad t^3, \quad \ldots, \quad t^n \]
Thus, these powers of $t$ (where $1 = t^0$) form a spanning set for $P_n(t)$.

(b) One can also show that, for any scalar $c$, the following $n + 1$ powers of $t - c$,
\[ 1, \quad t - c, \quad (t - c)^2, \quad (t - c)^3, \quad \ldots, \quad (t - c)^n \]
(where $(t - c)^0 = 1$), also form a spanning set for $P_n(t)$.

EXAMPLE 4.5 Consider the vector space $M = M_{2,2}$ consisting of all $2 \times 2$ matrices, and consider the following four matrices in $M$:
\[ E_{11} = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \qquad E_{12} = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \qquad E_{21} = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}, \qquad E_{22} = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \]
Then clearly any matrix $A$ in $M$ can be written as a linear combination of the four matrices. For example,
\[ A = \begin{bmatrix} 5 & -6 \\ 7 & 8 \end{bmatrix} = 5E_{11} - 6E_{12} + 7E_{21} + 8E_{22} \]
Accordingly, the four matrices $E_{11}$, $E_{12}$, $E_{21}$, $E_{22}$ span $M$.

4.5 Subspaces

This section introduces the important notion of a subspace. Let V be a vector space over a field K and let W be a subset of V. Then W is a subspace of V if W is itself a vector space over K with respect to the operations of vector addition and scalar multiplication on V. The way in which one shows that any set W is a vector space is to show that W satisfies the eight axioms of a vector space. However, if W is a subset of a vector space V, then some of the axioms automatically hold in W, because they already hold in V. Simple criteria for identifying subspaces follow.
THEOREM 4.2:

Suppose $W$ is a subset of a vector space $V$. Then $W$ is a subspace of $V$ if the following two conditions hold:
(a) The zero vector 0 belongs to $W$.
(b) For every $u, v \in W$, $k \in K$: (i) the sum $u + v \in W$; (ii) the multiple $ku \in W$.

Property (i) in (b) states that $W$ is closed under vector addition, and property (ii) in (b) states that $W$ is closed under scalar multiplication. Both properties may be combined into the following equivalent single statement:
(b') For every $u, v \in W$, $a, b \in K$, the linear combination $au + bv \in W$.
Now let $V$ be any vector space. Then $V$ automatically contains two subspaces: the set $\{0\}$ consisting of the zero vector alone and the whole space $V$ itself. These are sometimes called the trivial subspaces of $V$. Examples of nontrivial subspaces follow.
EXAMPLE 4.6 Consider the vector space $V = \mathbf{R}^3$.

(a) Let $U$ consist of all vectors in $\mathbf{R}^3$ whose entries are equal; that is,
\[ U = \{(a, b, c) : a = b = c\} \]
For example, $(1, 1, 1)$, $(-3, -3, -3)$, $(7, 7, 7)$, $(-2, -2, -2)$ are vectors in $U$. Geometrically, $U$ is the line through the origin $O$ and the point $(1, 1, 1)$ as shown in Fig. 4-1(a). Clearly $0 = (0, 0, 0)$ belongs to $U$, because all entries in 0 are equal. Further, suppose $u$ and $v$ are arbitrary vectors in $U$, say, $u = (a, a, a)$ and $v = (b, b, b)$. Then, for any scalar $k \in \mathbf{R}$, the following are also vectors in $U$:
\[ u + v = (a + b, a + b, a + b) \qquad\text{and}\qquad ku = (ka, ka, ka) \]
Thus, $U$ is a subspace of $\mathbf{R}^3$.

(b) Let $W$ be any plane in $\mathbf{R}^3$ passing through the origin, as pictured in Fig. 4-1(b). Then $0 = (0, 0, 0)$ belongs to $W$, because we assumed $W$ passes through the origin $O$. Further, suppose $u$ and $v$ are vectors in $W$. Then $u$ and $v$ may be viewed as arrows in the plane $W$ emanating from the origin $O$, as in Fig. 4-1(b). The sum $u + v$ and any multiple $ku$ of $u$ also lie in the plane $W$. Thus, $W$ is a subspace of $\mathbf{R}^3$.

Figure 4-1

EXAMPLE 4.7

(a) Let $V = M_{n,n}$, the vector space of $n \times n$ matrices. Let $W_1$ be the subset of all (upper) triangular matrices and let $W_2$ be the subset of all symmetric matrices. Then $W_1$ is a subspace of $V$, because $W_1$ contains the zero matrix 0 and $W_1$ is closed under matrix addition and scalar multiplication; that is, the sum and scalar multiple of such triangular matrices are also triangular. Similarly, $W_2$ is a subspace of $V$.

(b) Let $V = P(t)$, the vector space of polynomials. Then the space $P_n(t)$ of polynomials of degree at most $n$ may be viewed as a subspace of $P(t)$. Let $Q(t)$ be the collection of polynomials with only even powers of $t$. For example, the following are polynomials in $Q(t)$:
\[ p_1 = 3 + 4t^2 - 5t^6 \qquad\text{and}\qquad p_2 = 6 - 7t^4 + 9t^6 + 3t^{12} \]
(We assume that any constant $k = kt^0$ is an even power of $t$.) Then $Q(t)$ is a subspace of $P(t)$.

(c) Let $V$ be the vector space of real-valued functions. Then the collection $W_1$ of continuous functions and the collection $W_2$ of differentiable functions are subspaces of $V$.

Intersection of Subspaces
Let $U$ and $W$ be subspaces of a vector space $V$. We show that the intersection $U \cap W$ is also a subspace of $V$. Clearly, $0 \in U$ and $0 \in W$, because $U$ and $W$ are subspaces; whence $0 \in U \cap W$. Now suppose $u$ and $v$ belong to the intersection $U \cap W$. Then $u, v \in U$ and $u, v \in W$. Further, because $U$ and $W$ are subspaces, for any scalars $a, b \in K$,
\[ au + bv \in U \qquad\text{and}\qquad au + bv \in W \]
Thus, $au + bv \in U \cap W$. Therefore, $U \cap W$ is a subspace of $V$. The above result generalizes as follows.
THEOREM 4.3:

The intersection of any number of subspaces of a vector space V is a subspace of V.


Solution Space of a Homogeneous System
Consider a system $AX = B$ of linear equations in $n$ unknowns. Then every solution $u$ may be viewed as a vector in $K^n$. Thus, the solution set of such a system is a subset of $K^n$. Now suppose the system is homogeneous; that is, suppose the system has the form $AX = 0$. Let $W$ be its solution set. Because $A0 = 0$, the zero vector $0 \in W$. Moreover, suppose $u$ and $v$ belong to $W$. Then $u$ and $v$ are solutions of $AX = 0$, or, in other words, $Au = 0$ and $Av = 0$. Therefore, for any scalars $a$ and $b$, we have
\[ A(au + bv) = aAu + bAv = a0 + b0 = 0 + 0 = 0 \]
Thus, $au + bv$ belongs to $W$, because it is a solution of $AX = 0$. Accordingly, $W$ is a subspace of $K^n$. We state the above result formally.
THEOREM 4.4:

The solution set $W$ of a homogeneous system $AX = 0$ in $n$ unknowns is a subspace of $K^n$.

We emphasize that the solution set of a nonhomogeneous system $AX = B$ is not a subspace of $K^n$. In fact, the zero vector 0 does not belong to its solution set.
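The closure argument can be checked numerically on a small case. The matrix `A`, the solutions `u` and `v`, and the right-hand side `B` below are made-up illustrative data, not taken from the text, and the helper names are hypothetical.

```python
def mat_vec(A, x):
    """Product of the matrix A (a list of rows) with the vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def lin_comb(a, u, b, v):
    """The linear combination a*u + b*v of two vectors of equal length."""
    return [a * ui + b * vi for ui, vi in zip(u, v)]

# Illustrative homogeneous system AX = 0 in three unknowns.
A = [[1, 2, -1],
     [2, 4, -2]]
u = [1, 0, 1]      # A u = 0
v = [-2, 1, 0]     # A v = 0

w = lin_comb(3, u, -5, v)     # an arbitrary linear combination of solutions
image_of_w = mat_vec(A, w)    # the zero vector again

# For a nonhomogeneous system AX = B with B != 0, the zero vector fails.
B = [1, 2]
zero_is_solution = mat_vec(A, [0, 0, 0]) == B
```

The combination `w` is again a solution of $AX = 0$, while the zero vector does not solve $AX = B$ for the nonzero `B`, which is why that solution set is not a subspace.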

4.6 Linear Spans, Row Space of a Matrix

Suppose $u_1, u_2, \ldots, u_m$ are any vectors in a vector space $V$. Recall (Section 4.4) that any vector of the form $a_1 u_1 + a_2 u_2 + \cdots + a_m u_m$, where the $a_i$ are scalars, is called a linear combination of $u_1, u_2, \ldots, u_m$. The collection of all such linear combinations, denoted by
\[ \operatorname{span}(u_1, u_2, \ldots, u_m) \qquad\text{or}\qquad \operatorname{span}(u_i) \]
is called the linear span of $u_1, u_2, \ldots, u_m$. Clearly the zero vector 0 belongs to $\operatorname{span}(u_i)$, because
\[ 0 = 0u_1 + 0u_2 + \cdots + 0u_m \]
Furthermore, suppose $v$ and $v'$ belong to $\operatorname{span}(u_i)$, say,
\[ v = a_1 u_1 + a_2 u_2 + \cdots + a_m u_m \qquad\text{and}\qquad v' = b_1 u_1 + b_2 u_2 + \cdots + b_m u_m \]
Then,
\[ v + v' = (a_1 + b_1)u_1 + (a_2 + b_2)u_2 + \cdots + (a_m + b_m)u_m \]
and, for any scalar $k \in K$,
\[ kv = ka_1 u_1 + ka_2 u_2 + \cdots + ka_m u_m \]
Thus, $v + v'$ and $kv$ also belong to $\operatorname{span}(u_i)$. Accordingly, $\operatorname{span}(u_i)$ is a subspace of $V$. More generally, for any subset $S$ of $V$, $\operatorname{span}(S)$ consists of all linear combinations of vectors in $S$ or, when $S = \emptyset$, $\operatorname{span}(S) = \{0\}$. Thus, in particular, $S$ is a spanning set (Section 4.4) of $\operatorname{span}(S)$. The following theorem, which was partially proved above, holds.
THEOREM 4.5:

Let $S$ be a subset of a vector space $V$.
(i) Then $\operatorname{span}(S)$ is a subspace of $V$ that contains $S$.
(ii) If $W$ is a subspace of $V$ containing $S$, then $\operatorname{span}(S) \subseteq W$.

Condition (ii) in Theorem 4.5 may be interpreted as saying that $\operatorname{span}(S)$ is the "smallest" subspace of $V$ containing $S$.
EXAMPLE 4.8 Consider the vector space $V = \mathbf{R}^3$.

(a) Let $u$ be any nonzero vector in $\mathbf{R}^3$. Then $\operatorname{span}(u)$ consists of all scalar multiples of $u$. Geometrically, $\operatorname{span}(u)$ is the line through the origin $O$ and the endpoint of $u$, as shown in Fig. 4-2(a).

Figure 4-2

(b) Let $u$ and $v$ be vectors in $\mathbf{R}^3$ that are not multiples of each other. Then $\operatorname{span}(u, v)$ is the plane through the origin $O$ and the endpoints of $u$ and $v$ as shown in Fig. 4-2(b).

(c) Consider the vectors $e_1 = (1, 0, 0)$, $e_2 = (0, 1, 0)$, $e_3 = (0, 0, 1)$ in $\mathbf{R}^3$. Recall [Example 4.3(a)] that every vector in $\mathbf{R}^3$ is a linear combination of $e_1$, $e_2$, $e_3$. That is, $e_1$, $e_2$, $e_3$ form a spanning set of $\mathbf{R}^3$. Accordingly, $\operatorname{span}(e_1, e_2, e_3) = \mathbf{R}^3$.

Row Space of a Matrix
Let $A = [a_{ij}]$ be an arbitrary $m \times n$ matrix over a field $K$. The rows of $A$,
\[ R_1 = (a_{11}, a_{12}, \ldots, a_{1n}), \quad R_2 = (a_{21}, a_{22}, \ldots, a_{2n}), \quad \ldots, \quad R_m = (a_{m1}, a_{m2}, \ldots, a_{mn}) \]
may be viewed as vectors in $K^n$; hence, they span a subspace of $K^n$ called the row space of $A$ and denoted by rowsp($A$). That is,
\[ \operatorname{rowsp}(A) = \operatorname{span}(R_1, R_2, \ldots, R_m) \]
Analogously, the columns of $A$ may be viewed as vectors in $K^m$; they span a subspace of $K^m$ called the column space of $A$ and denoted by colsp($A$). Observe that $\operatorname{colsp}(A) = \operatorname{rowsp}(A^T)$.
Recall that matrices $A$ and $B$ are row equivalent, written $A \sim B$, if $B$ can be obtained from $A$ by a sequence of elementary row operations. Now suppose $M$ is the matrix obtained by applying one of the following elementary row operations on a matrix $A$:
(1) Interchange $R_i$ and $R_j$. (2) Replace $R_i$ by $kR_i$. (3) Replace $R_j$ by $kR_i + R_j$.

Then each row of M is a row of A or a linear combination of rows of A. Hence, the row space of M is contained in the row space of A. On the other hand, we can apply the inverse elementary row operation on M to obtain A; hence, the row space of A is contained in the row space of M. Accordingly, A and M have the same row space. This will be true each time we apply an elementary row operation. Thus, we have proved the following theorem.
THEOREM 4.6:

Row equivalent matrices have the same row space.

We are now able to prove (Problems 4.45–4.47) basic results on row equivalence (which first appeared as Theorems 3.7 and 3.8 in Chapter 3).
THEOREM 4.7:

Suppose $A = [a_{ij}]$ and $B = [b_{ij}]$ are row equivalent echelon matrices with respective pivot entries
\[ a_{1j_1}, a_{2j_2}, \ldots, a_{rj_r} \qquad\text{and}\qquad b_{1k_1}, b_{2k_2}, \ldots, b_{sk_s} \]
Then $A$ and $B$ have the same number of nonzero rows (that is, $r = s$) and their pivot entries are in the same positions (that is, $j_1 = k_1$, $j_2 = k_2$, $\ldots$, $j_r = k_r$).

THEOREM 4.8:

Suppose A and B are row canonical matrices. Then A and B have the same row space if and only if they have the same nonzero rows.

COROLLARY 4.9:

Every matrix $A$ is row equivalent to a unique matrix in row canonical form.

We apply the above results in the next example.
EXAMPLE 4.9 Consider the following two sets of vectors in $\mathbf{R}^4$:
\[ \begin{aligned} u_1 &= (1, 2, -1, 3), & u_2 &= (2, 4, 1, -2), & u_3 &= (3, 6, 3, -7) \\ w_1 &= (1, 2, -4, 11), & w_2 &= (2, 4, -5, 14) \end{aligned} \]
Let $U = \operatorname{span}(u_i)$ and $W = \operatorname{span}(w_i)$. There are two ways to show that $U = W$.

(a) Show that each $u_i$ is a linear combination of $w_1$ and $w_2$, and show that each $w_i$ is a linear combination of $u_1$, $u_2$, $u_3$. Observe that we have to show that five systems of linear equations are consistent (three for the $u_i$ and two for the $w_i$).

(b) Form the matrix $A$ whose rows are $u_1$, $u_2$, $u_3$ and row reduce $A$ to row canonical form, and form the matrix $B$ whose rows are $w_1$ and $w_2$ and row reduce $B$ to row canonical form:
\[ A = \begin{bmatrix} 1 & 2 & -1 & 3 \\ 2 & 4 & 1 & -2 \\ 3 & 6 & 3 & -7 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & -1 & 3 \\ 0 & 0 & 3 & -8 \\ 0 & 0 & 6 & -16 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 0 & \tfrac{1}{3} \\ 0 & 0 & 1 & -\tfrac{8}{3} \\ 0 & 0 & 0 & 0 \end{bmatrix} \]
\[ B = \begin{bmatrix} 1 & 2 & -4 & 11 \\ 2 & 4 & -5 & 14 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & -4 & 11 \\ 0 & 0 & 3 & -8 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 0 & \tfrac{1}{3} \\ 0 & 0 & 1 & -\tfrac{8}{3} \end{bmatrix} \]
Because the nonzero rows of the matrices in row canonical form are identical, the row spaces of $A$ and $B$ are equal. Therefore, $U = W$. Clearly, the method in (b) is more efficient than the method in (a).
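Method (b) lends itself to a short program: reduce both matrices to row canonical form and compare the nonzero rows (Theorem 4.8). The following is a minimal sketch using exact rational arithmetic; the function names `rref`, `nonzero_rows`, and `same_row_space` are illustrative assumptions.

```python
from fractions import Fraction

def rref(matrix):
    """Return the row canonical form of matrix (a list of rows)."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    pivot_row = 0
    for col in range(n_cols):
        if pivot_row == n_rows:
            break
        # Find a row at or below pivot_row with a nonzero entry in this column.
        pr = next((r for r in range(pivot_row, n_rows) if rows[r][col] != 0), None)
        if pr is None:
            continue  # no pivot in this column
        rows[pivot_row], rows[pr] = rows[pr], rows[pivot_row]
        pivot = rows[pivot_row][col]
        rows[pivot_row] = [x / pivot for x in rows[pivot_row]]  # make the pivot 1
        # Clear the pivot column in every other row.
        for r in range(n_rows):
            if r != pivot_row and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
    return rows

def nonzero_rows(matrix):
    """Nonzero rows of the row canonical form of matrix."""
    return [row for row in rref(matrix) if any(x != 0 for x in row)]

def same_row_space(m1, m2):
    """True when m1 and m2 have identical nonzero rows in row canonical form."""
    return nonzero_rows(m1) == nonzero_rows(m2)

# Data of Example 4.9.
A = [[1, 2, -1, 3], [2, 4, 1, -2], [3, 6, 3, -7]]
B = [[1, 2, -4, 11], [2, 4, -5, 14]]
equal_spans = same_row_space(A, B)
```

For the matrices of Example 4.9, both reductions leave the rows $(1, 2, 0, \tfrac{1}{3})$ and $(0, 0, 1, -\tfrac{8}{3})$, so `equal_spans` is `True`.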

4.7 Linear Dependence and Independence

Let V be a vector space over a field K. The following defines the notion of linear dependence and independence of vectors over K. (One usually suppresses mentioning K when the field is understood.) This concept plays an essential role in the theory of linear algebra and in mathematics in general.
DEFINITION:

We say that the vectors $v_1, v_2, \ldots, v_m$ in $V$ are linearly dependent if there exist scalars $a_1, a_2, \ldots, a_m$ in $K$, not all of them 0, such that
\[ a_1 v_1 + a_2 v_2 + \cdots + a_m v_m = 0 \]
Otherwise, we say that the vectors are linearly independent.

The above definition may be restated as follows. Consider the vector equation
\[ x_1 v_1 + x_2 v_2 + \cdots + x_m v_m = 0 \qquad (*) \]
where the $x$'s are unknown scalars. This equation always has the zero solution $x_1 = 0, x_2 = 0, \ldots, x_m = 0$. Suppose this is the only solution; that is, suppose we can show:
\[ x_1 v_1 + x_2 v_2 + \cdots + x_m v_m = 0 \qquad\text{implies}\qquad x_1 = 0,\ x_2 = 0,\ \ldots,\ x_m = 0 \]
Then the vectors $v_1, v_2, \ldots, v_m$ are linearly independent. On the other hand, suppose the equation (*) has a nonzero solution; then the vectors are linearly dependent.
A set $S = \{v_1, v_2, \ldots, v_m\}$ of vectors in $V$ is linearly dependent or independent according to whether the vectors $v_1, v_2, \ldots, v_m$ are linearly dependent or independent. An infinite set $S$ of vectors is linearly dependent or independent according to whether there do or do not exist vectors $v_1, v_2, \ldots, v_k$ in $S$ that are linearly dependent.
Warning: The set $S = \{v_1, v_2, \ldots, v_m\}$ above represents a list or, in other words, a finite sequence of vectors where the vectors are ordered and repetition is permitted.

The following remarks follow directly from the above definition.

Remark 1: Suppose 0 is one of the vectors $v_1, v_2, \ldots, v_m$, say $v_1 = 0$. Then the vectors must be linearly dependent, because we have the following linear combination where the coefficient of $v_1$ is nonzero:
\[ 1v_1 + 0v_2 + \cdots + 0v_m = 1 \cdot 0 + 0 + \cdots + 0 = 0 \]

Remark 2: Suppose $v$ is a nonzero vector. Then $v$, by itself, is linearly independent, because
\[ kv = 0,\ v \neq 0 \qquad\text{implies}\qquad k = 0 \]

Remark 3: Suppose two of the vectors $v_1, v_2, \ldots, v_m$ are equal or one is a scalar multiple of the other, say $v_1 = kv_2$. Then the vectors must be linearly dependent, because we have the following linear combination where the coefficient of $v_1$ is nonzero:
\[ v_1 - kv_2 + 0v_3 + \cdots + 0v_m = 0 \]

Remark 4: Two vectors $v_1$ and $v_2$ are linearly dependent if and only if one of them is a multiple of the other.

Remark 5: If the set $\{v_1, \ldots, v_m\}$ is linearly independent, then any rearrangement of the vectors $\{v_{i_1}, v_{i_2}, \ldots, v_{i_m}\}$ is also linearly independent.

Remark 6: If a set $S$ of vectors is linearly independent, then any subset of $S$ is linearly independent. Alternatively, if $S$ contains a linearly dependent subset, then $S$ is linearly dependent.
EXAMPLE 4.10

(a) Let $u = (1, 1, 0)$, $v = (1, 3, 2)$, $w = (4, 9, 5)$. Then $u$, $v$, $w$ are linearly dependent, because
\[ 3u + 5v - 2w = 3(1, 1, 0) + 5(1, 3, 2) - 2(4, 9, 5) = (0, 0, 0) = 0 \]

(b) We show that the vectors $u = (1, 2, 3)$, $v = (2, 5, 7)$, $w = (1, 3, 5)$ are linearly independent. We form the vector equation $xu + yv + zw = 0$, where $x$, $y$, $z$ are unknown scalars. This yields
\[ x\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} + y\begin{bmatrix} 2 \\ 5 \\ 7 \end{bmatrix} + z\begin{bmatrix} 1 \\ 3 \\ 5 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \qquad\text{or}\qquad \begin{aligned} x + 2y + z &= 0 \\ 2x + 5y + 3z &= 0 \\ 3x + 7y + 5z &= 0 \end{aligned} \qquad\text{or}\qquad \begin{aligned} x + 2y + z &= 0 \\ y + z &= 0 \\ z &= 0 \end{aligned} \]
Back-substitution yields $x = 0$, $y = 0$, $z = 0$. We have shown that
\[ xu + yv + zw = 0 \qquad\text{implies}\qquad x = 0,\ y = 0,\ z = 0 \]
Accordingly, $u$, $v$, $w$ are linearly independent.

(c) Let $V$ be the vector space of functions from $\mathbf{R}$ into $\mathbf{R}$. We show that the functions $f(t) = \sin t$, $g(t) = e^t$, $h(t) = t^2$ are linearly independent. We form the vector (function) equation $xf + yg + zh = 0$, where $x$, $y$, $z$ are unknown scalars. This function equation means that, for every value of $t$,
\[ x \sin t + ye^t + zt^2 = 0 \]
Thus, in this equation, we choose appropriate values of $t$ to easily get $x = 0$, $y = 0$, $z = 0$. For example,

(i) Substitute $t = 0$ to obtain $x(0) + y(1) + z(0) = 0$, or $y = 0$.
(ii) Substitute $t = \pi$ to obtain $x(0) + 0(e^{\pi}) + z(\pi^2) = 0$, or $z = 0$.
(iii) Substitute $t = \pi/2$ to obtain $x(1) + 0(e^{\pi/2}) + 0(\pi^2/4) = 0$, or $x = 0$.

We have shown
\[ xf + yg + zh = 0 \qquad\text{implies}\qquad x = 0,\ y = 0,\ z = 0 \]
Accordingly, $f$, $g$, $h$ are linearly independent.


Linear Dependence in R3
Linear dependence in the vector space $V = \mathbf{R}^3$ can be described geometrically as follows:
(a) Any two vectors $u$ and $v$ in $\mathbf{R}^3$ are linearly dependent if and only if they lie on the same line through the origin $O$, as shown in Fig. 4-3(a).
(b) Any three vectors $u$, $v$, $w$ in $\mathbf{R}^3$ are linearly dependent if and only if they lie on the same plane through the origin $O$, as shown in Fig. 4-3(b).
Later, we will be able to show that any four or more vectors in $\mathbf{R}^3$ are automatically linearly dependent.

Figure 4-3

Linear Dependence and Linear Combinations
The notions of linear dependence and linear combinations are closely related. Specifically, for more than one vector, we show that the vectors $v_1, v_2, \ldots, v_m$ are linearly dependent if and only if one of them is a linear combination of the others.
Suppose, say, $v_i$ is a linear combination of the others,
\[ v_i = a_1 v_1 + \cdots + a_{i-1} v_{i-1} + a_{i+1} v_{i+1} + \cdots + a_m v_m \]
Then by adding $-v_i$ to both sides, we obtain
\[ a_1 v_1 + \cdots + a_{i-1} v_{i-1} - v_i + a_{i+1} v_{i+1} + \cdots + a_m v_m = 0 \]
where the coefficient of $v_i$ is not 0. Hence, the vectors are linearly dependent. Conversely, suppose the vectors are linearly dependent, say,
\[ b_1 v_1 + \cdots + b_j v_j + \cdots + b_m v_m = 0, \qquad\text{where}\qquad b_j \neq 0 \]
Then we can solve for $v_j$, obtaining
\[ v_j = -b_j^{-1} b_1 v_1 - \cdots - b_j^{-1} b_{j-1} v_{j-1} - b_j^{-1} b_{j+1} v_{j+1} - \cdots - b_j^{-1} b_m v_m \]
and so $v_j$ is a linear combination of the other vectors.
We now state a slightly stronger statement than the one above. This result has many important consequences.
LEMMA 4.10:

Suppose two or more nonzero vectors $v_1, v_2, \ldots, v_m$ are linearly dependent. Then one of the vectors is a linear combination of the preceding vectors; that is, there exists $k > 1$ such that
\[ v_k = c_1 v_1 + c_2 v_2 + \cdots + c_{k-1} v_{k-1} \]

Linear Dependence and Echelon Matrices

Consider the following echelon matrix $A$, whose pivots have been boxed:
\[ A = \begin{bmatrix} 0 & \boxed{2} & 3 & 4 & 5 & 6 & 7 \\ 0 & 0 & \boxed{4} & 3 & 2 & 3 & 4 \\ 0 & 0 & 0 & 0 & \boxed{7} & 8 & 9 \\ 0 & 0 & 0 & 0 & 0 & \boxed{6} & 7 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix} \]
Observe that the rows $R_2$, $R_3$, $R_4$ have 0's in the second column below the nonzero pivot in $R_1$, and hence any linear combination of $R_2$, $R_3$, $R_4$ must have 0 as its second entry. Thus, $R_1$ cannot be a linear combination of the rows below it. Similarly, the rows $R_3$ and $R_4$ have 0's in the third column below the nonzero pivot in $R_2$, and hence $R_2$ cannot be a linear combination of the rows below it. Finally, $R_3$ cannot be a multiple of $R_4$, because $R_4$ has a 0 in the fifth column below the nonzero pivot in $R_3$. Viewing the nonzero rows from the bottom up, $R_4$, $R_3$, $R_2$, $R_1$, no row is a linear combination of the preceding rows. Thus, the rows are linearly independent by Lemma 4.10.
The argument used with the above echelon matrix $A$ can be used for the nonzero rows of any echelon matrix. Thus, we have the following very useful result.
THEOREM 4.11:

The nonzero rows of a matrix in echelon form are linearly independent.
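Theorems 4.6 and 4.11 suggest a practical test for independence. Row reduce the matrix whose rows are the given vectors: a zero row appears in the echelon form exactly when the vectors are dependent. The sketch below assumes that standard fact; the names `echelon_form` and `is_linearly_independent` are hypothetical.

```python
from fractions import Fraction

def echelon_form(vectors):
    """Reduce the matrix whose rows are the given vectors to an echelon form."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    pivot_row = 0
    for col in range(n_cols):
        if pivot_row == n_rows:
            break
        # Find a row at or below pivot_row with a nonzero entry in this column.
        pr = next((r for r in range(pivot_row, n_rows) if rows[r][col] != 0), None)
        if pr is None:
            continue  # no pivot in this column
        rows[pivot_row], rows[pr] = rows[pr], rows[pivot_row]
        # Eliminate the entries below the pivot.
        for r in range(pivot_row + 1, n_rows):
            factor = rows[r][col] / rows[pivot_row][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
    return rows

def is_linearly_independent(vectors):
    """True when no row of the echelon form is a zero row."""
    return all(any(x != 0 for x in row) for row in echelon_form(vectors))

# Data of Example 4.10, parts (a) and (b).
dependent_case = is_linearly_independent([(1, 1, 0), (1, 3, 2), (4, 9, 5)])
independent_case = is_linearly_independent([(1, 2, 3), (2, 5, 7), (1, 3, 5)])
```

Applied to Example 4.10, the test reports the vectors of part (a) as dependent and those of part (b) as independent.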

4.8 Basis and Dimension

First we state two equivalent ways to define a basis of a vector space V. (The equivalence is proved in Problem 4.28.)
DEFINITION A:

A set $S = \{u_1, u_2, \ldots, u_n\}$ of vectors is a basis of $V$ if it has the following two properties: (1) $S$ is linearly independent. (2) $S$ spans $V$.

DEFINITION B:

A set $S = \{u_1, u_2, \ldots, u_n\}$ of vectors is a basis of $V$ if every $v \in V$ can be written uniquely as a linear combination of the basis vectors.

The following is a fundamental result in linear algebra.
THEOREM 4.12:

Let $V$ be a vector space such that one basis has $m$ elements and another basis has $n$ elements. Then $m = n$.

A vector space $V$ is said to be of finite dimension $n$ or $n$-dimensional, written $\dim V = n$, if $V$ has a basis with $n$ elements. Theorem 4.12 tells us that all bases of $V$ have the same number of elements, so the dimension is well defined. The vector space $\{0\}$ is defined to have dimension 0. Suppose a vector space $V$ does not have a finite basis. Then $V$ is said to be of infinite dimension or to be infinite-dimensional.

The above fundamental Theorem 4.12 is a consequence of the following "replacement lemma" (proved in Problem 4.35).
LEMMA 4.13:

Suppose $\{v_1, v_2, \ldots, v_n\}$ spans $V$, and suppose $\{w_1, w_2, \ldots, w_m\}$ is linearly independent. Then $m \le n$, and $V$ is spanned by a set of the form
$$\{w_1, w_2, \ldots, w_m, v_{i_1}, v_{i_2}, \ldots, v_{i_{n-m}}\}$$

Thus, in particular, $n + 1$ or more vectors in $V$ are linearly dependent. Observe in the above lemma that we have replaced $m$ of the vectors in the spanning set of $V$ by the $m$ independent vectors and still retained a spanning set.


Examples of Bases
This subsection presents important examples of bases of some of the main vector spaces appearing in this text.

(a) Vector space $K^n$: Consider the following $n$ vectors in $K^n$:
$$e_1 = (1, 0, 0, 0, \ldots, 0, 0), \quad e_2 = (0, 1, 0, 0, \ldots, 0, 0), \quad \ldots, \quad e_n = (0, 0, 0, 0, \ldots, 0, 1)$$
These vectors are linearly independent. (For example, they form a matrix in echelon form.) Furthermore, any vector $u = (a_1, a_2, \ldots, a_n)$ in $K^n$ can be written as a linear combination of the above vectors. Specifically,
$$u = a_1 e_1 + a_2 e_2 + \cdots + a_n e_n$$
Accordingly, the vectors form a basis of $K^n$ called the usual or standard basis of $K^n$. Thus (as one might expect), $K^n$ has dimension $n$. In particular, any other basis of $K^n$ has $n$ elements.

(b) Vector space $M = M_{r,s}$ of all $r \times s$ matrices: The following six matrices form a basis of the vector space $M_{2,3}$ of all $2 \times 3$ matrices over $K$:
$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \ \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \ \begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}, \ \begin{bmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix}, \ \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}, \ \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$
More generally, in the vector space $M = M_{r,s}$ of all $r \times s$ matrices, let $E_{ij}$ be the matrix with $ij$-entry 1 and 0's elsewhere. Then all such matrices form a basis of $M_{r,s}$ called the usual or standard basis of $M_{r,s}$. Accordingly, $\dim M_{r,s} = rs$.

(c) Vector space $P_n(t)$ of all polynomials of degree $\le n$: The set $S = \{1, t, t^2, t^3, \ldots, t^n\}$ of $n + 1$ polynomials is a basis of $P_n(t)$. Specifically, any polynomial $f(t)$ of degree $\le n$ can be expressed as a linear combination of these powers of $t$, and one can show that these polynomials are linearly independent. Therefore, $\dim P_n(t) = n + 1$.

(d) Vector space $P(t)$ of all polynomials: Consider any finite set $S = \{f_1(t), f_2(t), \ldots, f_m(t)\}$ of polynomials in $P(t)$, and let $d$ denote the largest of the degrees of the polynomials. Then any polynomial $g(t)$ of degree exceeding $d$ cannot be expressed as a linear combination of the elements of $S$. Thus, $S$ cannot be a basis of $P(t)$. This means that the dimension of $P(t)$ is infinite.
We note that the infinite set $S' = \{1, t, t^2, t^3, \ldots\}$, consisting of all the powers of $t$, spans $P(t)$ and is linearly independent. Accordingly, $S'$ is an infinite basis of $P(t)$.

Theorems on Bases
The following three theorems (proved in Problems 4.37, 4.38, and 4.39) will be used frequently.
THEOREM 4.14:

Let $V$ be a vector space of finite dimension $n$. Then:
(i) Any $n + 1$ or more vectors in $V$ are linearly dependent.
(ii) Any linearly independent set $S = \{u_1, u_2, \ldots, u_n\}$ with $n$ elements is a basis of $V$.
(iii) Any spanning set $T = \{v_1, v_2, \ldots, v_n\}$ of $V$ with $n$ elements is a basis of $V$.

THEOREM 4.15:

Suppose S spans a vector space V. Then: (i) Any maximum number of linearly independent vectors in S form a basis of V. (ii) Suppose one deletes from S every vector that is a linear combination of preceding vectors in S. Then the remaining vectors form a basis of V.

THEOREM 4.16:

Let $V$ be a vector space of finite dimension and let $S = \{u_1, u_2, \ldots, u_r\}$ be a set of linearly independent vectors in $V$. Then $S$ is part of a basis of $V$; that is, $S$ may be extended to a basis of $V$.

EXAMPLE 4.11
(a) The following four vectors in $\mathbf{R}^4$ form a matrix in echelon form:
$$(1, 1, 1, 1), \quad (0, 1, 1, 1), \quad (0, 0, 1, 1), \quad (0, 0, 0, 1)$$
Thus, the vectors are linearly independent, and, because $\dim \mathbf{R}^4 = 4$, the four vectors form a basis of $\mathbf{R}^4$.
(b) The following $n + 1$ polynomials in $P_n(t)$ are of increasing degree:
$$1, \quad t - 1, \quad (t - 1)^2, \quad \ldots, \quad (t - 1)^n$$
Therefore, no polynomial is a linear combination of preceding polynomials; hence, the polynomials are linearly independent. Furthermore, they form a basis of $P_n(t)$, because $\dim P_n(t) = n + 1$.
(c) Consider any four vectors in $\mathbf{R}^3$, say
$$(257, -132, 58), \quad (43, 0, -17), \quad (521, -317, 94), \quad (328, -512, -731)$$
By Theorem 4.14(i), the four vectors must be linearly dependent, because they come from the three-dimensional vector space $\mathbf{R}^3$.

Dimension and Subspaces
The following theorem (proved in Problem 4.40) gives the basic relationship between the dimension of a vector space and the dimension of a subspace.
THEOREM 4.17:

Let $W$ be a subspace of an $n$-dimensional vector space $V$. Then $\dim W \le n$. In particular, if $\dim W = n$, then $W = V$.

EXAMPLE 4.12 Let $W$ be a subspace of the real space $\mathbf{R}^3$. Note that $\dim \mathbf{R}^3 = 3$. Theorem 4.17 tells us that the dimension of $W$ can only be 0, 1, 2, or 3. The following cases apply:
(a) If $\dim W = 0$, then $W = \{0\}$, a point.
(b) If $\dim W = 1$, then $W$ is a line through the origin 0.
(c) If $\dim W = 2$, then $W$ is a plane through the origin 0.
(d) If $\dim W = 3$, then $W$ is the entire space $\mathbf{R}^3$.

4.9

Application to Matrices, Rank of a Matrix

Let $A$ be any $m \times n$ matrix over a field $K$. Recall that the rows of $A$ may be viewed as vectors in $K^n$ and that the row space of $A$, written rowsp($A$), is the subspace of $K^n$ spanned by the rows of $A$. The following definition applies.
DEFINITION:

The rank of a matrix $A$, written rank($A$), is equal to the maximum number of linearly independent rows of $A$ or, equivalently, the dimension of the row space of $A$.

Recall, on the other hand, that the columns of an $m \times n$ matrix $A$ may be viewed as vectors in $K^m$ and that the column space of $A$, written colsp($A$), is the subspace of $K^m$ spanned by the columns of $A$. Although $m$ may not be equal to $n$ (that is, the rows and columns of $A$ may belong to different vector spaces), we have the following fundamental result.
THEOREM 4.18:

The maximum number of linearly independent rows of any matrix A is equal to the maximum number of linearly independent columns of A. Thus, the dimension of the row space of A is equal to the dimension of the column space of A.

Accordingly, one could restate the above definition of the rank of A using columns instead of rows.


Basis-Finding Problems
This subsection shows how an echelon form of any matrix $A$ gives us the solution to certain problems about $A$ itself. Specifically, let $A$ and $B$ be the following matrices, where the echelon matrix $B$ (whose pivots are boxed) is an echelon form of $A$:
$$A = \begin{bmatrix} 1 & 2 & 1 & 3 & 1 & 2 \\ 2 & 5 & 5 & 7 & 4 & 5 \\ 3 & 7 & 6 & 11 & 6 & 9 \\ 1 & 5 & 10 & 8 & 9 & 9 \\ 2 & 6 & 8 & 11 & 9 & 12 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} \boxed{1} & 2 & 1 & 3 & 1 & 2 \\ 0 & \boxed{1} & 3 & 1 & 2 & 1 \\ 0 & 0 & 0 & \boxed{1} & 1 & 2 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$

We solve the following four problems about the matrix $A$, where $C_1, C_2, \ldots, C_6$ denote its columns:
(a) Find a basis of the row space of $A$.
(b) Find each column $C_k$ of $A$ that is a linear combination of preceding columns of $A$.
(c) Find a basis of the column space of $A$.
(d) Find the rank of $A$.

(a) We are given that $A$ and $B$ are row equivalent, so they have the same row space. Moreover, $B$ is in echelon form, so its nonzero rows are linearly independent and hence form a basis of the row space of $B$. Thus, they also form a basis of the row space of $A$. That is,
$$\text{basis of rowsp}(A): \quad (1, 2, 1, 3, 1, 2), \quad (0, 1, 3, 1, 2, 1), \quad (0, 0, 0, 1, 1, 2)$$

(b) Let $M_k = [C_1, C_2, \ldots, C_k]$, the submatrix of $A$ consisting of the first $k$ columns of $A$. Then $M_{k-1}$ and $M_k$ are, respectively, the coefficient matrix and augmented matrix of the vector equation
$$x_1 C_1 + x_2 C_2 + \cdots + x_{k-1} C_{k-1} = C_k$$
Theorem 3.9 tells us that the system has a solution, or, equivalently, $C_k$ is a linear combination of the preceding columns of $A$, if and only if $\operatorname{rank}(M_k) = \operatorname{rank}(M_{k-1})$, where $\operatorname{rank}(M_k)$ means the number of pivots in an echelon form of $M_k$. Now the first $k$ columns of the echelon matrix $B$ also form an echelon form of $M_k$. Accordingly,
$$\operatorname{rank}(M_2) = \operatorname{rank}(M_3) = 2 \quad \text{and} \quad \operatorname{rank}(M_4) = \operatorname{rank}(M_5) = \operatorname{rank}(M_6) = 3$$
Thus, $C_3$, $C_5$, $C_6$ are each a linear combination of the preceding columns of $A$.

(c) The fact that the remaining columns $C_1$, $C_2$, $C_4$ are not linear combinations of their respective preceding columns also tells us that they are linearly independent. Thus, they form a basis of the column space of $A$. That is,
$$\text{basis of colsp}(A): \quad [1, 2, 3, 1, 2]^T, \quad [2, 5, 7, 5, 6]^T, \quad [3, 7, 11, 8, 11]^T$$
Observe that $C_1$, $C_2$, $C_4$ may also be characterized as those columns of $A$ that contain the pivots in any echelon form of $A$.

(d) Here we see that three possible definitions of the rank of $A$ yield the same value.
(i) There are three pivots in $B$, which is an echelon form of $A$.
(ii) The three pivots in $B$ correspond to the nonzero rows of $B$, which form a basis of the row space of $A$.
(iii) The three pivots in $B$ correspond to the columns of $A$, which form a basis of the column space of $A$.
Thus, $\operatorname{rank}(A) = 3$.
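The criterion in part (b), that $C_k$ is a combination of preceding columns exactly when $\operatorname{rank}(M_k) = \operatorname{rank}(M_{k-1})$, can be sketched in a few lines. This is an illustration under stated assumptions: the helper `pivot_columns` is a hypothetical name, and the matrix below is taken to be the A of this subsection with second row (2, 5, 5, 7, 4, 5), the reading that makes A row equivalent to the echelon matrix B.

```python
from fractions import Fraction

def pivot_columns(rows):
    """Row reduce a copy of `rows` to echelon form; return the pivot column indices."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return pivots

def rank(rows):
    return len(pivot_columns(rows))

# Assumed entries of A (see the lead-in).
A = [
    [1, 2, 1, 3, 1, 2],
    [2, 5, 5, 7, 4, 5],
    [3, 7, 6, 11, 6, 9],
    [1, 5, 10, 8, 9, 9],
    [2, 6, 8, 11, 9, 12],
]

# rank(M_k) for k = 1..6, where M_k keeps the first k columns of A.
prefix_ranks = [rank([row[:k] for row in A]) for k in range(1, 7)]

# Column C_k (1-based) depends on the preceding columns when the rank does not grow.
dependent = [k for k in range(2, 7) if prefix_ranks[k - 1] == prefix_ranks[k - 2]]

print(prefix_ranks)
print(dependent)
```

The columns not listed in `dependent` are the pivot columns, which is the same information `pivot_columns(A)` returns directly (with 0-based indices).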

Application to Finding a Basis for $W = \operatorname{span}(u_1, u_2, \ldots, u_r)$

Frequently, we are given a list $S = \{u_1, u_2, \ldots, u_r\}$ of vectors in $K^n$ and we want to find a basis for the subspace $W$ of $K^n$ spanned by the given vectors; that is, a basis of
$$W = \operatorname{span}(S) = \operatorname{span}(u_1, u_2, \ldots, u_r)$$
The following two algorithms, which are essentially described in the above subsection, find such a basis (and hence the dimension) of $W$.

Algorithm 4.1 (Row space algorithm)
Step 1. Form the matrix $M$ whose rows are the given vectors.
Step 2. Row reduce $M$ to echelon form.
Step 3. Output the nonzero rows of the echelon matrix.

Sometimes we want to find a basis that only comes from the original given vectors. The next algorithm accomplishes this task.

Algorithm 4.2 (Casting-out algorithm)
Step 1. Form the matrix $M$ whose columns are the given vectors.
Step 2. Row reduce $M$ to echelon form.
Step 3. For each column $C_k$ in the echelon matrix without a pivot, delete (cast out) the vector $u_k$ from the list $S$ of given vectors.
Step 4. Output the remaining vectors in $S$ (which correspond to columns with pivots).

We emphasize that in the first algorithm we form a matrix whose rows are the given vectors, whereas in the second algorithm we form a matrix whose columns are the given vectors.
EXAMPLE 4.13 Let $W$ be the subspace of $\mathbf{R}^5$ spanned by the following vectors:
$$u_1 = (1, 2, 1, 3, 2), \quad u_2 = (1, 3, 3, 5, 3), \quad u_3 = (3, 8, 7, 13, 8),$$
$$u_4 = (1, 4, 6, 9, 7), \quad u_5 = (5, 13, 13, 25, 19)$$
Find a basis of $W$ consisting of the original given vectors, and find $\dim W$.

Form the matrix $M$ whose columns are the given vectors, and reduce $M$ to echelon form:
$$M = \begin{bmatrix} 1 & 1 & 3 & 1 & 5 \\ 2 & 3 & 8 & 4 & 13 \\ 1 & 3 & 7 & 6 & 13 \\ 3 & 5 & 13 & 9 & 25 \\ 2 & 3 & 8 & 7 & 19 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 3 & 1 & 5 \\ 0 & 1 & 2 & 2 & 3 \\ 0 & 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$

The pivots in the echelon matrix appear in columns $C_1$, $C_2$, $C_4$. Accordingly, we "cast out" the vectors $u_3$ and $u_5$ from the original five vectors. The remaining vectors $u_1$, $u_2$, $u_4$, which correspond to the columns in the echelon matrix with pivots, form a basis of $W$. Thus, in particular, $\dim W = 3$.

Remark: The justification of the casting-out algorithm is essentially described above, but we repeat it here for emphasis. The fact that column $C_3$ in the echelon matrix in Example 4.13 does not have a pivot means that the vector equation $x u_1 + y u_2 = u_3$ has a solution, and hence $u_3$ is a linear combination of $u_1$ and $u_2$. Similarly, the fact that $C_5$ does not have a pivot means that $u_5$ is a linear combination of the preceding vectors. We have deleted each vector in the original spanning set that is a linear combination of preceding vectors. Thus, the remaining vectors are linearly independent and form a basis of $W$.
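The casting-out algorithm (Algorithm 4.2) can be sketched as follows, using the five vectors of Example 4.13. The function names `pivot_columns` and `cast_out` are illustrative, not part of the text, and exact rational arithmetic is assumed.

```python
from fractions import Fraction

def pivot_columns(rows):
    """Row reduce a copy of `rows` to echelon form; return the pivot column indices."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return pivots

def cast_out(vectors):
    """Return the sublist of `vectors` that forms a basis of their span."""
    # Step 1: the given vectors become the columns of M.
    M = [list(row) for row in zip(*vectors)]
    # Steps 2-3: columns without a pivot are cast out.
    pivots = pivot_columns(M)
    # Step 4: keep the vectors whose columns carry pivots.
    return [vectors[j] for j in pivots]

vectors = [
    (1, 2, 1, 3, 2),      # u1
    (1, 3, 3, 5, 3),      # u2
    (3, 8, 7, 13, 8),     # u3
    (1, 4, 6, 9, 7),      # u4
    (5, 13, 13, 25, 19),  # u5
]

basis = cast_out(vectors)
print(basis)       # the surviving vectors, taken from the original list
print(len(basis))  # dim W
```

Passing the vectors as rows instead (that is, calling `pivot_columns(vectors)` and reading off the nonzero echelon rows) would correspond to Algorithm 4.1, whose output generally does not consist of the original vectors.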


Application to Homogeneous Systems of Linear Equations
Consider again a homogeneous system $AX = 0$ of linear equations over $K$ with $n$ unknowns. By Theorem 4.4, the solution set $W$ of such a system is a subspace of $K^n$, and hence $W$ has a dimension. The following theorem, whose proof is postponed until Chapter 5, holds.
THEOREM 4.19:

The dimension of the solution space $W$ of a homogeneous system $AX = 0$ is $n - r$, where $n$ is the number of unknowns and $r$ is the rank of the coefficient matrix $A$.

In the case where the system $AX = 0$ is in echelon form, it has precisely $n - r$ free variables, say $x_{i_1}, x_{i_2}, \ldots, x_{i_{n-r}}$. Let $v_j$ be the solution obtained by setting $x_{i_j} = 1$ (or any nonzero constant) and the remaining free variables equal to 0. We show (Problem 4.50) that the solutions $v_1, v_2, \ldots, v_{n-r}$ are linearly independent; hence, they form a basis of the solution space $W$.

We have already used the above process to find a basis of the solution space $W$ of a homogeneous system $AX = 0$ in Section 3.11. Problem 4.48 gives three other examples.
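As a rough illustration of Theorem 4.19 and of the free-variable construction just described, the sketch below builds one solution per free variable and checks that there are $n - r$ of them. The coefficient matrix is a made-up example, not one from the text, and the names `rref` and `null_space_basis` are hypothetical. The code reduces to row canonical form rather than only echelon form, which makes back-substitution unnecessary.

```python
from fractions import Fraction

def rref(rows):
    """Return (row canonical form, pivot column indices) using exact arithmetic."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        pivot = m[r][c]
        m[r] = [a / pivot for a in m[r]]       # scale the pivot to 1
        for i in range(len(m)):                # clear the rest of the column
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

def null_space_basis(rows):
    """One solution of AX = 0 per free variable (that variable 1, the other free ones 0)."""
    m, pivots = rref(rows)
    n = len(rows[0])
    free = [c for c in range(n) if c not in pivots]
    basis = []
    for f in free:
        v = [Fraction(0)] * n
        v[f] = Fraction(1)
        for i, pc in enumerate(pivots):
            v[pc] = -m[i][f]   # pivot variables are determined by the free one
        basis.append(v)
    return basis

# Hypothetical coefficient matrix: n = 5 unknowns, rank r = 2.
A = [
    [1, 2, -3, 2, -4],
    [2, 4, -5, 1, -6],
    [5, 10, -13, 4, -16],
]

basis = null_space_basis(A)
r = len(rref(A)[1])
print(len(basis) == len(A[0]) - r)   # dimension of the solution space is n - r
```

Each returned vector has a 1 in its own free position and 0 in the other free positions, which is why the vectors are linearly independent.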

4.10

Sums and Direct Sums

Let $U$ and $W$ be subsets of a vector space $V$. The sum of $U$ and $W$, written $U + W$, consists of all sums $u + w$ where $u \in U$ and $w \in W$. That is,
$$U + W = \{v : v = u + w, \text{ where } u \in U \text{ and } w \in W\}$$
Now suppose $U$ and $W$ are subspaces of $V$. Then one can easily show (Problem 4.53) that $U + W$ is a subspace of $V$. Recall that $U \cap W$ is also a subspace of $V$. The following theorem (proved in Problem 4.58) relates the dimensions of these subspaces.
THEOREM 4.20:

Suppose $U$ and $W$ are finite-dimensional subspaces of a vector space $V$. Then $U + W$ has finite dimension and
$$\dim(U + W) = \dim U + \dim W - \dim(U \cap W)$$
EXAMPLE 4.14 Let $V = M_{2,2}$, the vector space of $2 \times 2$ matrices. Let $U$ consist of those matrices whose second row is zero, and let $W$ consist of those matrices whose second column is zero. Then
$$U = \left\{ \begin{bmatrix} a & b \\ 0 & 0 \end{bmatrix} \right\}, \quad W = \left\{ \begin{bmatrix} a & 0 \\ c & 0 \end{bmatrix} \right\} \quad \text{and} \quad U + W = \left\{ \begin{bmatrix} a & b \\ c & 0 \end{bmatrix} \right\}, \quad U \cap W = \left\{ \begin{bmatrix} a & 0 \\ 0 & 0 \end{bmatrix} \right\}$$

That is, $U + W$ consists of those matrices whose lower right entry is 0, and $U \cap W$ consists of those matrices whose second row and second column are zero. Note that $\dim U = 2$, $\dim W = 2$, $\dim(U \cap W) = 1$. Also, $\dim(U + W) = 3$, which is expected from Theorem 4.20. That is,
$$\dim(U + W) = \dim U + \dim W - \dim(U \cap W) = 2 + 2 - 1 = 3$$

Direct Sums

The vector space $V$ is said to be the direct sum of its subspaces $U$ and $W$, denoted by
$$V = U \oplus W$$
if every $v \in V$ can be written in one and only one way as $v = u + w$ where $u \in U$ and $w \in W$. The following theorem (proved in Problem 4.59) characterizes such a decomposition.
THEOREM 4.21:

The vector space $V$ is the direct sum of its subspaces $U$ and $W$ if and only if: (i) $V = U + W$, (ii) $U \cap W = \{0\}$.

EXAMPLE 4.15 Consider the vector space $V = \mathbf{R}^3$:

(a) Let $U$ be the $xy$-plane and let $W$ be the $yz$-plane; that is,
$$U = \{(a, b, 0) : a, b \in \mathbf{R}\} \quad \text{and} \quad W = \{(0, b, c) : b, c \in \mathbf{R}\}$$
Then $\mathbf{R}^3 = U + W$, because every vector in $\mathbf{R}^3$ is the sum of a vector in $U$ and a vector in $W$. However, $\mathbf{R}^3$ is not the direct sum of $U$ and $W$, because such sums are not unique. For example,
$$(3, 5, 7) = (3, 1, 0) + (0, 4, 7) \quad \text{and also} \quad (3, 5, 7) = (3, -4, 0) + (0, 9, 7)$$

(b) Let $U$ be the $xy$-plane and let $W$ be the $z$-axis; that is,
$$U = \{(a, b, 0) : a, b \in \mathbf{R}\} \quad \text{and} \quad W = \{(0, 0, c) : c \in \mathbf{R}\}$$
Now any vector $(a, b, c) \in \mathbf{R}^3$ can be written as the sum of a vector in $U$ and a vector in $W$ in one and only one way:
$$(a, b, c) = (a, b, 0) + (0, 0, c)$$
Accordingly, $\mathbf{R}^3$ is the direct sum of $U$ and $W$; that is, $\mathbf{R}^3 = U \oplus W$.
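Theorems 4.20 and 4.21 suggest a simple numerical test for the two cases of Example 4.15: compute $\dim(U + W)$ as the rank of the combined spanning vectors, and obtain $\dim(U \cap W)$ from the dimension formula. The sketch below assumes each subspace is given by a spanning list; the names `rank` and `sum_and_intersection_dims` are illustrative.

```python
from fractions import Fraction

def rank(rows):
    """Number of pivots in an echelon form of the matrix with the given rows."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

def sum_and_intersection_dims(U, W):
    """Return (dim(U + W), dim(U ∩ W)) for subspaces given by spanning lists."""
    dim_sum = rank(list(U) + list(W))            # U + W is spanned by both lists together
    dim_cap = rank(U) + rank(W) - dim_sum        # Theorem 4.20, rearranged
    return dim_sum, dim_cap

xy_plane = [(1, 0, 0), (0, 1, 0)]
yz_plane = [(0, 1, 0), (0, 0, 1)]
z_axis = [(0, 0, 1)]

case_a = sum_and_intersection_dims(xy_plane, yz_plane)
case_b = sum_and_intersection_dims(xy_plane, z_axis)

# R^3 is the direct sum exactly when dim(U + W) = 3 and dim(U ∩ W) = 0 (Theorem 4.21).
print(case_a, case_a == (3, 0))
print(case_b, case_b == (3, 0))
```

In case (a) the intersection is the $y$-axis, of dimension 1, which is why decompositions there are not unique.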

General Direct Sums
The notion of a direct sum is extended to more than one factor in the obvious way. That is, $V$ is the direct sum of subspaces $W_1, W_2, \ldots, W_r$, written
$$V = W_1 \oplus W_2 \oplus \cdots \oplus W_r$$
if every vector $v \in V$ can be written in one and only one way as
$$v = w_1 + w_2 + \cdots + w_r$$
where $w_1 \in W_1, w_2 \in W_2, \ldots, w_r \in W_r$. The following theorems hold.
THEOREM 4.22:

Suppose $V = W_1 \oplus W_2 \oplus \cdots \oplus W_r$. Also, for each $k$, suppose $S_k$ is a linearly independent subset of $W_k$. Then
(a) The union $S = \bigcup_k S_k$ is linearly independent in $V$.
(b) If each $S_k$ is a basis of $W_k$, then $\bigcup_k S_k$ is a basis of $V$.
(c) $\dim V = \dim W_1 + \dim W_2 + \cdots + \dim W_r$.

THEOREM 4.23:

Suppose $V = W_1 + W_2 + \cdots + W_r$ and $\dim V = \sum_k \dim W_k$. Then
$$V = W_1 \oplus W_2 \oplus \cdots \oplus W_r$$

4.11

Coordinates

Let $V$ be an $n$-dimensional vector space over $K$ with basis $S = \{u_1, u_2, \ldots, u_n\}$. Then any vector $v \in V$ can be expressed uniquely as a linear combination of the basis vectors in $S$, say
$$v = a_1 u_1 + a_2 u_2 + \cdots + a_n u_n$$
These $n$ scalars $a_1, a_2, \ldots, a_n$ are called the coordinates of $v$ relative to the basis $S$, and they form a vector $[a_1, a_2, \ldots, a_n]$ in $K^n$ called the coordinate vector of $v$ relative to $S$. We denote this vector by $[v]_S$, or simply $[v]$, when $S$ is understood. Thus,
$$[v]_S = [a_1, a_2, \ldots, a_n]$$
For notational convenience, brackets $[\ldots]$, rather than parentheses $(\ldots)$, are used to denote the coordinate vector.

Remark: The above $n$ scalars $a_1, a_2, \ldots, a_n$ also form the coordinate column vector $[a_1, a_2, \ldots, a_n]^T$ of $v$ relative to $S$. The choice of the column vector rather than the row vector to represent $v$ depends on the context in which it is used. The use of such column vectors will become clear later in Chapter 6.
EXAMPLE 4.16 Consider the vector space $P_2(t)$ of polynomials of degree $\le 2$. The polynomials
$$p_1 = t + 1, \quad p_2 = t - 1, \quad p_3 = (t - 1)^2 = t^2 - 2t + 1$$
form a basis $S$ of $P_2(t)$. The coordinate vector $[v]$ of $v = 2t^2 - 5t + 9$ relative to $S$ is obtained as follows.

Set $v = x p_1 + y p_2 + z p_3$ using unknown scalars $x$, $y$, $z$, and simplify:
$$\begin{aligned} 2t^2 - 5t + 9 &= x(t + 1) + y(t - 1) + z(t^2 - 2t + 1) \\ &= xt + x + yt - y + zt^2 - 2zt + z \\ &= zt^2 + (x + y - 2z)t + (x - y + z) \end{aligned}$$
Then set the coefficients of the same powers of $t$ equal to each other to obtain the system
$$z = 2, \quad x + y - 2z = -5, \quad x - y + z = 9$$
The solution of the system is $x = 3$, $y = -4$, $z = 2$. Thus,
$$v = 3p_1 - 4p_2 + 2p_3, \quad \text{and hence} \quad [v] = [3, -4, 2]$$

EXAMPLE 4.17 Consider real space $\mathbf{R}^3$. The following vectors form a basis $S$ of $\mathbf{R}^3$:
$$u_1 = (1, -1, 0), \quad u_2 = (1, 1, 0), \quad u_3 = (0, 1, 1)$$
The coordinates of $v = (5, 3, 4)$ relative to the basis $S$ are obtained as follows.

Set $v = x u_1 + y u_2 + z u_3$; that is, set $v$ as a linear combination of the basis vectors using unknown scalars $x$, $y$, $z$. This yields
$$\begin{bmatrix} 5 \\ 3 \\ 4 \end{bmatrix} = x \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix} + y \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} + z \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}$$
The equivalent system of linear equations is as follows:
$$x + y = 5, \quad -x + y + z = 3, \quad z = 4$$
The solution of the system is $x = 3$, $y = 2$, $z = 4$. Thus,
$$v = 3u_1 + 2u_2 + 4u_3, \quad \text{and so} \quad [v]_S = [3, 2, 4]$$
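Finding a coordinate vector amounts to solving a square linear system whose coefficient columns are the basis vectors. The following sketch does this for Example 4.17 with exact fractions; the names `rref` and `coordinates` are hypothetical, and the code assumes the given vectors really do form a basis (so every coefficient column has a pivot).

```python
from fractions import Fraction

def rref(rows):
    """Return (row canonical form, pivot column indices) using exact arithmetic."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        pivot = m[r][c]
        m[r] = [a / pivot for a in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

def coordinates(v, basis):
    """Coordinates of v relative to `basis` (assumed to be a basis of K^n)."""
    n = len(basis)
    # Augmented matrix [u1 u2 ... un | v], with the basis vectors as columns.
    aug = [[u[i] for u in basis] + [v[i]] for i in range(n)]
    reduced, pivots = rref(aug)
    if pivots != list(range(n)):
        raise ValueError("the given vectors do not form a basis")
    return [reduced[i][-1] for i in range(n)]

S = [(1, -1, 0), (1, 1, 0), (0, 1, 1)]   # u1, u2, u3 of Example 4.17
v = (5, 3, 4)

coords = coordinates(v, S)
print([int(c) for c in coords])
```

The same function applies to Example 4.16 once each polynomial is replaced by its list of coefficients relative to $1, t, t^2$.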

Remark 1: There is a geometrical interpretation of the coordinates of a vector $v$ relative to a basis $S$ for the real space $\mathbf{R}^n$, which we illustrate using the basis $S$ of $\mathbf{R}^3$ in Example 4.17. First consider the space $\mathbf{R}^3$ with the usual $x$, $y$, $z$ axes. Then the basis vectors determine a new coordinate system of $\mathbf{R}^3$, say with $x'$, $y'$, $z'$ axes, as shown in Fig. 4-4. That is,
(1) The $x'$-axis is in the direction of $u_1$ with unit length $\|u_1\|$.
(2) The $y'$-axis is in the direction of $u_2$ with unit length $\|u_2\|$.
(3) The $z'$-axis is in the direction of $u_3$ with unit length $\|u_3\|$.
Then each vector $v = (a, b, c)$ or, equivalently, the point $P(a, b, c)$ in $\mathbf{R}^3$ will have new coordinates with respect to the new $x'$, $y'$, $z'$ axes. These new coordinates are precisely $[v]_S$, the coordinates of $v$ with respect to the basis $S$. Thus, as shown in Example 4.17, the coordinates of the point $P(5, 3, 4)$ with the new axes form the vector $[3, 2, 4]$.

Figure 4-4

Remark 2: Consider the usual basis $E = \{e_1, e_2, \ldots, e_n\}$ of $K^n$ defined by
$$e_1 = (1, 0, 0, \ldots, 0, 0), \quad e_2 = (0, 1, 0, \ldots, 0, 0), \quad \ldots, \quad e_n = (0, 0, 0, \ldots, 0, 1)$$
Let $v = (a_1, a_2, \ldots, a_n)$ be any vector in $K^n$. Then one can easily show that
$$v = a_1 e_1 + a_2 e_2 + \cdots + a_n e_n, \quad \text{and so} \quad [v]_E = [a_1, a_2, \ldots, a_n]$$
That is, the coordinate vector $[v]_E$ of any vector $v$ relative to the usual basis $E$ of $K^n$ is identical to the original vector $v$.

Isomorphism of V and K n
Let $V$ be a vector space of dimension $n$ over $K$, and suppose $S = \{u_1, u_2, \ldots, u_n\}$ is a basis of $V$. Then each vector $v \in V$ corresponds to a unique $n$-tuple $[v]_S$ in $K^n$. On the other hand, each $n$-tuple $[c_1, c_2, \ldots, c_n]$ in $K^n$ corresponds to a unique vector $c_1 u_1 + c_2 u_2 + \cdots + c_n u_n$ in $V$. Thus, the basis $S$ induces a one-to-one correspondence between $V$ and $K^n$. Furthermore, suppose
$$v = a_1 u_1 + a_2 u_2 + \cdots + a_n u_n \quad \text{and} \quad w = b_1 u_1 + b_2 u_2 + \cdots + b_n u_n$$
Then
$$\begin{aligned} v + w &= (a_1 + b_1) u_1 + (a_2 + b_2) u_2 + \cdots + (a_n + b_n) u_n \\ kv &= (k a_1) u_1 + (k a_2) u_2 + \cdots + (k a_n) u_n \end{aligned}$$
where $k$ is a scalar. Accordingly,
$$\begin{aligned} [v + w]_S &= [a_1 + b_1, \ldots, a_n + b_n] = [a_1, \ldots, a_n] + [b_1, \ldots, b_n] = [v]_S + [w]_S \\ [kv]_S &= [k a_1, k a_2, \ldots, k a_n] = k[a_1, a_2, \ldots, a_n] = k[v]_S \end{aligned}$$
Thus, the above one-to-one correspondence between $V$ and $K^n$ preserves the vector space operations of vector addition and scalar multiplication. We then say that $V$ and $K^n$ are isomorphic, written
$$V \cong K^n$$
We state this result formally.
THEOREM 4.24:

Let $V$ be an $n$-dimensional vector space over a field $K$. Then $V$ and $K^n$ are isomorphic.

The next example gives a practical application of the above result.
EXAMPLE 4.18 Suppose we want to determine whether or not the following matrices in $V = M_{2,3}$ are linearly dependent:
$$A = \begin{bmatrix} 1 & 2 & -3 \\ 4 & 0 & 1 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & 3 & -4 \\ 6 & 5 & 4 \end{bmatrix}, \quad C = \begin{bmatrix} 3 & 8 & -11 \\ 16 & 10 & 9 \end{bmatrix}$$
The coordinate vectors of the matrices in the usual basis of $M_{2,3}$ are as follows:
$$[A] = [1, 2, -3, 4, 0, 1], \quad [B] = [1, 3, -4, 6, 5, 4], \quad [C] = [3, 8, -11, 16, 10, 9]$$
Form the matrix $M$ whose rows are the above coordinate vectors and reduce $M$ to an echelon form:
$$M = \begin{bmatrix} 1 & 2 & -3 & 4 & 0 & 1 \\ 1 & 3 & -4 & 6 & 5 & 4 \\ 3 & 8 & -11 & 16 & 10 & 9 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & -3 & 4 & 0 & 1 \\ 0 & 1 & -1 & 2 & 5 & 3 \\ 0 & 2 & -2 & 4 & 10 & 6 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & -3 & 4 & 0 & 1 \\ 0 & 1 & -1 & 2 & 5 & 3 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$
Because the echelon matrix has only two nonzero rows, the coordinate vectors $[A]$, $[B]$, $[C]$ span a subspace of dimension 2 and so are linearly dependent. Accordingly, the original matrices $A$, $B$, $C$ are linearly dependent.
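The reduction in Example 4.18 can be reproduced by flattening each matrix into its coordinate vector and computing a rank. This is a minimal sketch; `flatten` and `rank` are illustrative helper names, and flattening row by row is assumed to match the ordering of the usual basis of $M_{2,3}$ used in the example.

```python
from fractions import Fraction

def rank(rows):
    """Number of pivots in an echelon form of the matrix with the given rows."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

def flatten(matrix):
    """Coordinate vector of a matrix relative to the usual basis (row by row)."""
    return [entry for row in matrix for entry in row]

A = [[1, 2, -3], [4, 0, 1]]
B = [[1, 3, -4], [6, 5, 4]]
C = [[3, 8, -11], [16, 10, 9]]

coordinate_vectors = [flatten(X) for X in (A, B, C)]
r = rank(coordinate_vectors)

# The matrices are linearly dependent exactly when the rank is below their number.
print(r, r < len(coordinate_vectors))
```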

SOLVED PROBLEMS

Vector Spaces, Linear Combinations
4.1. Suppose $u$ and $v$ belong to a vector space $V$. Simplify each of the following expressions:
(a) $E_1 = 3(2u - 4v) + 5u + 7v$,
(b) $E_2 = 3u - 6(3u - 5v) + 7u$,
(c) $E_3 = 2uv + 3(2u + 4v)$,
(d) $E_4 = 5u - \dfrac{3}{v} + 5u$

Multiply out and collect terms:
(a) $E_1 = 6u - 12v + 5u + 7v = 11u - 5v$
(b) $E_2 = 3u - 18u + 30v + 7u = -8u + 30v$
(c) $E_3$ is not defined because the product $uv$ of vectors is not defined.
(d) $E_4$ is not defined because division by a vector is not defined.

4.2.

Prove Theorem 4.1: Let $V$ be a vector space over a field $K$. (i) $k0 = 0$. (ii) $0u = 0$. (iii) If $ku = 0$, then $k = 0$ or $u = 0$. (iv) $(-k)u = k(-u) = -ku$.
(i) By Axiom [A2] with $u = 0$, we have $0 + 0 = 0$. Hence, by Axiom [M1], we have
$$k0 = k(0 + 0) = k0 + k0$$
Adding $-k0$ to both sides gives the desired result.
(ii) For scalars, $0 + 0 = 0$. Hence, by Axiom [M2], we have
$$0u = (0 + 0)u = 0u + 0u$$
Adding $-0u$ to both sides gives the desired result.
(iii) Suppose $ku = 0$ and $k \neq 0$. Then there exists a scalar $k^{-1}$ such that $k^{-1}k = 1$. Thus,
$$u = 1u = (k^{-1}k)u = k^{-1}(ku) = k^{-1}0 = 0$$
(iv) Using $u + (-u) = 0$ and $k + (-k) = 0$ yields
$$0 = k0 = k[u + (-u)] = ku + k(-u) \quad \text{and} \quad 0 = 0u = [k + (-k)]u = ku + (-k)u$$
Adding $-ku$ to both sides of the first equation gives $-ku = k(-u)$, and adding $-ku$ to both sides of the second equation gives $-ku = (-k)u$. Thus, $(-k)u = k(-u) = -ku$.

4.3. Show that (a) $k(u - v) = ku - kv$, (b) $u + u = 2u$.
(a) Using the definition of subtraction, that $u - v = u + (-v)$, and Theorem 4.1(iv), that $k(-v) = -kv$, we have
$$k(u - v) = k[u + (-v)] = ku + k(-v) = ku + (-kv) = ku - kv$$
(b) Using Axiom [M4] and then Axiom [M2], we have
$$u + u = 1u + 1u = (1 + 1)u = 2u$$

4.4.

Express $v = (1, -2, 5)$ in $\mathbf{R}^3$ as a linear combination of the vectors
$$u_1 = (1, 1, 1), \quad u_2 = (1, 2, 3), \quad u_3 = (2, -1, 1)$$
We seek scalars $x$, $y$, $z$, as yet unknown, such that $v = x u_1 + y u_2 + z u_3$. Thus, we require
$$\begin{bmatrix} 1 \\ -2 \\ 5 \end{bmatrix} = x \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + y \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} + z \begin{bmatrix} 2 \\ -1 \\ 1 \end{bmatrix} \quad \text{or} \quad \begin{aligned} x + y + 2z &= 1 \\ x + 2y - z &= -2 \\ x + 3y + z &= 5 \end{aligned}$$
(For notational convenience, we write the vectors in $\mathbf{R}^3$ as columns, because it is then easier to find the equivalent system of linear equations.) Reducing the system to echelon form yields the triangular system
$$x + y + 2z = 1, \quad y - 3z = -3, \quad 5z = 10$$
The system is consistent and has a solution. Solving by back-substitution yields the solution $x = -6$, $y = 3$, $z = 2$. Thus, $v = -6u_1 + 3u_2 + 2u_3$.

Alternatively, write down the augmented matrix $M$ of the equivalent system of linear equations, where $u_1$, $u_2$, $u_3$ are the first three columns of $M$ and $v$ is the last column, and then reduce $M$ to echelon form:
$$M = \begin{bmatrix} 1 & 1 & 2 & 1 \\ 1 & 2 & -1 & -2 \\ 1 & 3 & 1 & 5 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 2 & 1 \\ 0 & 1 & -3 & -3 \\ 0 & 2 & -1 & 4 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 2 & 1 \\ 0 & 1 & -3 & -3 \\ 0 & 0 & 5 & 10 \end{bmatrix}$$
The last matrix corresponds to a triangular system, which has a solution. Solving the triangular system by back-substitution yields the solution $x = -6$, $y = 3$, $z = 2$. Thus, $v = -6u_1 + 3u_2 + 2u_3$.

4.5.

Express $v = (2, -5, 3)$ in $\mathbf{R}^3$ as a linear combination of the vectors
$$u_1 = (1, -3, 2), \quad u_2 = (2, -4, -1), \quad u_3 = (1, -5, 7)$$
We seek scalars $x$, $y$, $z$, as yet unknown, such that $v = x u_1 + y u_2 + z u_3$. Thus, we require
$$\begin{bmatrix} 2 \\ -5 \\ 3 \end{bmatrix} = x \begin{bmatrix} 1 \\ -3 \\ 2 \end{bmatrix} + y \begin{bmatrix} 2 \\ -4 \\ -1 \end{bmatrix} + z \begin{bmatrix} 1 \\ -5 \\ 7 \end{bmatrix} \quad \text{or} \quad \begin{aligned} x + 2y + z &= 2 \\ -3x - 4y - 5z &= -5 \\ 2x - y + 7z &= 3 \end{aligned}$$
Reducing the system to echelon form yields the system
$$x + 2y + z = 2, \quad 2y - 2z = 1, \quad 0 = 3$$
The system is inconsistent and so has no solution. Thus, $v$ cannot be written as a linear combination of $u_1$, $u_2$, $u_3$.

4.6.

Express the polynomial $v = t^2 + 4t - 3$ in $P(t)$ as a linear combination of the polynomials
$$p_1 = t^2 - 2t + 5, \quad p_2 = 2t^2 - 3t, \quad p_3 = t + 3$$
Set $v$ as a linear combination of $p_1$, $p_2$, $p_3$ using unknowns $x$, $y$, $z$ to obtain
$$t^2 + 4t - 3 = x(t^2 - 2t + 5) + y(2t^2 - 3t) + z(t + 3) \qquad (*)$$
We can proceed in two ways.

Method 1. Expand the right side of (*) and express it in terms of powers of $t$ as follows:
$$\begin{aligned} t^2 + 4t - 3 &= xt^2 - 2xt + 5x + 2yt^2 - 3yt + zt + 3z \\ &= (x + 2y)t^2 + (-2x - 3y + z)t + (5x + 3z) \end{aligned}$$
Set coefficients of the same powers of $t$ equal to each other, and reduce the system to echelon form. This yields
$$\begin{aligned} x + 2y &= 1 \\ -2x - 3y + z &= 4 \\ 5x + 3z &= -3 \end{aligned} \quad \text{or} \quad \begin{aligned} x + 2y &= 1 \\ y + z &= 6 \\ -10y + 3z &= -8 \end{aligned} \quad \text{or} \quad \begin{aligned} x + 2y &= 1 \\ y + z &= 6 \\ 13z &= 52 \end{aligned}$$
The system is consistent and has a solution. Solving by back-substitution yields the solution $x = -3$, $y = 2$, $z = 4$. Thus, $v = -3p_1 + 2p_2 + 4p_3$.

Method 2. The equation (*) is an identity in $t$; that is, the equation holds for any value of $t$. Thus, we can set $t$ equal to any numbers to obtain equations in the unknowns.
(a) Set $t = 0$ in (*) to obtain the equation $-3 = 5x + 3z$.
(b) Set $t = 1$ in (*) to obtain the equation $2 = 4x - y + 4z$.
(c) Set $t = -1$ in (*) to obtain the equation $-6 = 8x + 5y + 2z$.
Solve the system of the three equations to again obtain the solution $x = -3$, $y = 2$, $z = 4$. Thus, $v = -3p_1 + 2p_2 + 4p_3$.
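Method 2 lends itself to a short computation: evaluate both sides of the identity at three sample points and solve the resulting system. The sketch below assumes $p_3 = t + 3$, the choice consistent with the system $5x + 3z = -3$ and the stated solution; the helper `rref` and the sample points are illustrative choices.

```python
from fractions import Fraction

def rref(rows):
    """Return (row canonical form, pivot column indices) using exact arithmetic."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        pivot = m[r][c]
        m[r] = [a / pivot for a in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

# The polynomials of Problem 4.6, with p3 = t + 3 (an assumption, see the lead-in).
v = lambda t: t**2 + 4*t - 3
p1 = lambda t: t**2 - 2*t + 5
p2 = lambda t: 2*t**2 - 3*t
p3 = lambda t: t + 3

# One equation x*p1(t) + y*p2(t) + z*p3(t) = v(t) per sample point t.
points = [0, 1, -1]
augmented = [[p1(t), p2(t), p3(t), v(t)] for t in points]

reduced, pivots = rref(augmented)
coeffs = [reduced[i][-1] for i in range(3)]   # valid because all three columns have pivots
print([int(c) for c in coeffs])
```

Because all polynomials involved have degree at most 2, agreement at three distinct points is enough to guarantee the identity for every $t$.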

4.7.

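Express $M$ as a linear combination of the matrices $A$, $B$, $C$, where
$$M = \begin{bmatrix} 4 & 7 \\ 7 & 9 \end{bmatrix}, \quad \text{and} \quad A = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}, \quad C = \begin{bmatrix} 1 & 1 \\ 4 & 5 \end{bmatrix}$$
Set $M$ as a linear combination of $A$, $B$, $C$ using unknown scalars $x$, $y$, $z$; that is, set $M = xA + yB + zC$. This yields
$$\begin{bmatrix} 4 & 7 \\ 7 & 9 \end{bmatrix} = x \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} + y \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} + z \begin{bmatrix} 1 & 1 \\ 4 & 5 \end{bmatrix} = \begin{bmatrix} x + y + z & x + 2y + z \\ x + 3y + 4z & x + 4y + 5z \end{bmatrix}$$
Form the equivalent system of equations by setting corresponding entries equal to each other:
$$x + y + z = 4, \quad x + 2y + z = 7, \quad x + 3y + 4z = 7, \quad x + 4y + 5z = 9$$
Reducing the system to echelon form yields
$$x + y + z = 4, \quad y = 3, \quad 3z = -3, \quad 4z = -4$$
The last equation drops out. Solving the system by back-substitution yields $z = -1$, $y = 3$, $x = 2$. Thus, $M = 2A + 3B - C$.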

Subspaces
4.8. Prove Theorem 4.2: $W$ is a subspace of $V$ if the following two conditions hold:
(a) $0 \in W$. (b) If $u, v \in W$, then $u + v$, $ku \in W$.
By (a), $W$ is nonempty, and, by (b), the operations of vector addition and scalar multiplication are well defined for $W$. Axioms [A1], [A4], [M1], [M2], [M3], [M4] hold in $W$ because the vectors in $W$ belong to $V$. Thus, we need only show that [A2] and [A3] also hold in $W$. Now [A2] holds because the zero vector in $V$ belongs to $W$ by (a). Finally, if $v \in W$, then $(-1)v = -v \in W$, and $v + (-v) = 0$. Thus [A3] holds.

4.9.

Let $V = \mathbf{R}^3$. Show that $W$ is not a subspace of $V$, where
(a) $W = \{(a, b, c) : a \ge 0\}$, (b) $W = \{(a, b, c) : a^2 + b^2 + c^2 \le 1\}$.
In each case, show that Theorem 4.2 does not hold.

(a) $W$ consists of those vectors whose first entry is nonnegative. Thus, $v = (1, 2, 3)$ belongs to $W$. Let $k = -3$. Then $kv = (-3, -6, -9)$ does not belong to $W$, because $-3$ is negative. Thus, $W$ is not a subspace of $V$.
(b) $W$ consists of vectors whose length does not exceed 1. Hence, $u = (1, 0, 0)$ and $v = (0, 1, 0)$ belong to $W$, but $u + v = (1, 1, 0)$ does not belong to $W$, because $1^2 + 1^2 + 0^2 = 2 > 1$. Thus, $W$ is not a subspace of $V$.

4.10. Let $V = P(t)$, the vector space of real polynomials. Determine whether or not $W$ is a subspace of $V$, where
(a) $W$ consists of all polynomials with integral coefficients.
(b) $W$ consists of all polynomials with degree $\le 6$ and the zero polynomial.
(c) $W$ consists of all polynomials with only even powers of $t$.

(a) No, because scalar multiples of polynomials in $W$ do not always belong to $W$. For example,
$$f(t) = 3 + 6t + 7t^2 \in W \quad \text{but} \quad \tfrac{1}{2} f(t) = \tfrac{3}{2} + 3t + \tfrac{7}{2} t^2 \notin W$$
(b and c) Yes. In each case, $W$ contains the zero polynomial, and sums and scalar multiples of polynomials in $W$ belong to $W$.

4.11. Let $V$ be the vector space of functions $f : \mathbf{R} \to \mathbf{R}$. Show that $W$ is a subspace of $V$, where
(a) $W = \{f(x) : f(1) = 0\}$, all functions whose value at 1 is 0.
(b) $W = \{f(x) : f(3) = f(1)\}$, all functions assigning the same value to 3 and 1.
(c) $W = \{f(x) : f(-x) = -f(x)\}$, all odd functions.

Let $\hat{0}$ denote the zero function, so $\hat{0}(x) = 0$ for every value of $x$.
(a) $\hat{0} \in W$, because $\hat{0}(1) = 0$. Suppose $f, g \in W$. Then $f(1) = 0$ and $g(1) = 0$. Also, for scalars $a$ and $b$, we have
$$(af + bg)(1) = af(1) + bg(1) = a0 + b0 = 0$$
Thus, $af + bg \in W$, and hence $W$ is a subspace.
(b) $\hat{0} \in W$, because $\hat{0}(3) = 0 = \hat{0}(1)$. Suppose $f, g \in W$. Then $f(3) = f(1)$ and $g(3) = g(1)$. Thus, for any scalars $a$ and $b$, we have
$$(af + bg)(3) = af(3) + bg(3) = af(1) + bg(1) = (af + bg)(1)$$
Thus, $af + bg \in W$, and hence $W$ is a subspace.
(c) $\hat{0} \in W$, because $\hat{0}(-x) = 0 = -0 = -\hat{0}(x)$. Suppose $f, g \in W$. Then $f(-x) = -f(x)$ and $g(-x) = -g(x)$. Also, for scalars $a$ and $b$,
$$(af + bg)(-x) = af(-x) + bg(-x) = -af(x) - bg(x) = -(af + bg)(x)$$
Thus, $af + bg \in W$, and hence $W$ is a subspace of $V$.

4.12. Prove Theorem 4.3: The intersection of any number of subspaces of $V$ is a subspace of $V$.
Let $\{W_i : i \in I\}$ be a collection of subspaces of $V$ and let $W = \bigcap (W_i : i \in I)$. Because each $W_i$ is a subspace of $V$, we have $0 \in W_i$, for every $i \in I$. Hence, $0 \in W$. Suppose $u, v \in W$. Then $u, v \in W_i$, for every $i \in I$. Because each $W_i$ is a subspace, $au + bv \in W_i$, for every $i \in I$. Hence, $au + bv \in W$. Thus, $W$ is a subspace of $V$.

Linear Spans

4.13. Show that the vectors $u_1 = (1, 1, 1)$, $u_2 = (1, 2, 3)$, $u_3 = (1, 5, 8)$ span $\mathbf{R}^3$.
We need to show that an arbitrary vector $v = (a, b, c)$ in $\mathbf{R}^3$ is a linear combination of $u_1$, $u_2$, $u_3$. Set $v = xu_1 + yu_2 + zu_3$; that is, set
\[ (a, b, c) = x(1, 1, 1) + y(1, 2, 3) + z(1, 5, 8) = (x + y + z,\; x + 2y + 5z,\; x + 3y + 8z) \]
Form the equivalent system and reduce it to echelon form:
\[
\begin{aligned} x + y + z &= a \\ x + 2y + 5z &= b \\ x + 3y + 8z &= c \end{aligned}
\quad\text{or}\quad
\begin{aligned} x + y + z &= a \\ y + 4z &= b - a \\ 2y + 7z &= c - a \end{aligned}
\quad\text{or}\quad
\begin{aligned} x + y + z &= a \\ y + 4z &= b - a \\ -z &= c - 2b + a \end{aligned}
\]
The above system is in echelon form and is consistent; in fact,
\[ x = -a + 5b - 3c, \quad y = 3a - 7b + 4c, \quad z = -a + 2b - c \]
is a solution. Thus, $u_1$, $u_2$, $u_3$ span $\mathbf{R}^3$.
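The back-substitution in Problem 4.13 can be checked numerically. The following is a minimal sketch in plain Python; the helper names `coefficients` and `combine` are illustrative assumptions, not part of the text. It uses the solution formulas $x = -a + 5b - 3c$, $y = 3a - 7b + 4c$, $z = -a + 2b - c$ and confirms that they rebuild the target vector.

```python
# Vectors from Problem 4.13.
u1, u2, u3 = (1, 1, 1), (1, 2, 3), (1, 5, 8)

def coefficients(a, b, c):
    """Coefficients (x, y, z) obtained by back-substitution in the echelon system."""
    x = -a + 5 * b - 3 * c
    y = 3 * a - 7 * b + 4 * c
    z = -a + 2 * b - c
    return x, y, z

def combine(x, y, z):
    """Return the linear combination x*u1 + y*u2 + z*u3."""
    return tuple(x * p + y * q + z * r for p, q, r in zip(u1, u2, u3))

# Each sample vector is recovered exactly from its coefficients.
samples = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (2, -7, 5)]
recovered = [combine(*coefficients(*v)) for v in samples]
```

Because the three standard basis vectors are recovered, every vector of $\mathbf{R}^3$ is reachable, which is the spanning claim.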

4.14. Find conditions on $a$, $b$, $c$ so that $v = (a, b, c)$ in $\mathbf{R}^3$ belongs to $W = \operatorname{span}(u_1, u_2, u_3)$, where
\[ u_1 = (1, 2, 0), \quad u_2 = (-1, 1, 2), \quad u_3 = (3, 0, -4) \]
Set $v$ as a linear combination of $u_1$, $u_2$, $u_3$ using unknowns $x$, $y$, $z$; that is, set $v = xu_1 + yu_2 + zu_3$. This yields
\[ (a, b, c) = x(1, 2, 0) + y(-1, 1, 2) + z(3, 0, -4) = (x - y + 3z,\; 2x + y,\; 2y - 4z) \]
Form the equivalent system of linear equations and reduce it to echelon form:
\[
\begin{aligned} x - y + 3z &= a \\ 2x + y &= b \\ 2y - 4z &= c \end{aligned}
\quad\text{or}\quad
\begin{aligned} x - y + 3z &= a \\ 3y - 6z &= b - 2a \\ 2y - 4z &= c \end{aligned}
\quad\text{or}\quad
\begin{aligned} x - y + 3z &= a \\ 3y - 6z &= b - 2a \\ 0 &= 4a - 2b + 3c \end{aligned}
\]
The vector $v = (a, b, c)$ belongs to $W$ if and only if the system is consistent, and it is consistent if and only if $4a - 2b + 3c = 0$. Note, in particular, that $u_1$, $u_2$, $u_3$ do not span the whole space $\mathbf{R}^3$.
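As a quick sanity check on the condition $4a - 2b + 3c = 0$ from Problem 4.14, the sketch below (plain Python; the function name `satisfies_condition` is a hypothetical helper) confirms that each spanning vector, and an arbitrary integer combination of them, satisfies the condition, while a vector such as $(1, 0, 0)$ does not.

```python
# Spanning vectors from Problem 4.14.
u1, u2, u3 = (1, 2, 0), (-1, 1, 2), (3, 0, -4)

def satisfies_condition(v):
    """True when v = (a, b, c) meets the consistency condition 4a - 2b + 3c = 0."""
    a, b, c = v
    return 4 * a - 2 * b + 3 * c == 0

# Every generator lies in W, so each must satisfy the condition.
generators_ok = all(satisfies_condition(u) for u in (u1, u2, u3))

# An arbitrary combination 2*u1 - 5*u2 + 7*u3 also lies in W.
combo = tuple(2 * p - 5 * q + 7 * r for p, q, r in zip(u1, u2, u3))
combo_ok = satisfies_condition(combo)

# (1, 0, 0) fails the condition, so W is a proper subspace of R^3.
outside = satisfies_condition((1, 0, 0))
```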

4.15. Show that the vector space V ¼ PðtÞ of real polynomials cannot be spanned by a finite number of polynomials.
Any finite set $S$ of polynomials contains a polynomial of maximum degree, say $m$. Then the linear span $\operatorname{span}(S)$ of $S$ cannot contain a polynomial of degree greater than $m$. Thus, $\operatorname{span}(S) \neq V$ for any finite set $S$.

4.16. Prove Theorem 4.5: Let $S$ be a subset of $V$. (i) Then $\operatorname{span}(S)$ is a subspace of $V$ containing $S$. (ii) If $W$ is a subspace of $V$ containing $S$, then $\operatorname{span}(S) \subseteq W$.
(i) Suppose $S$ is empty. By definition, $\operatorname{span}(S) = \{0\}$. Hence $\operatorname{span}(S) = \{0\}$ is a subspace of $V$ and $S \subseteq \operatorname{span}(S)$. Suppose $S$ is not empty and $v \in S$. Then $v = 1v \in \operatorname{span}(S)$; hence, $S \subseteq \operatorname{span}(S)$. Also $0 = 0v \in \operatorname{span}(S)$. Now suppose $u, w \in \operatorname{span}(S)$, say
\[ u = a_1 u_1 + \cdots + a_r u_r = \sum_i a_i u_i \quad\text{and}\quad w = b_1 w_1 + \cdots + b_s w_s = \sum_j b_j w_j \]
where $u_i, w_j \in S$ and $a_i, b_j \in K$. Then
\[ u + w = \sum_i a_i u_i + \sum_j b_j w_j \quad\text{and}\quad ku = k\Big(\sum_i a_i u_i\Big) = \sum_i k a_i u_i \]
belong to $\operatorname{span}(S)$ because each is a linear combination of vectors in $S$. Thus, $\operatorname{span}(S)$ is a subspace of $V$.
(ii) Suppose $u_1, u_2, \ldots, u_r \in S$. Then all the $u_i$ belong to $W$. Thus, all multiples $a_1 u_1, a_2 u_2, \ldots, a_r u_r \in W$, and so the sum $a_1 u_1 + a_2 u_2 + \cdots + a_r u_r \in W$. That is, $W$ contains all linear combinations of elements in $S$, or, in other words, $\operatorname{span}(S) \subseteq W$, as claimed.

Linear Dependence

4.17. Determine whether or not $u$ and $v$ are linearly dependent, where
(a) $u = (1, 2)$, $v = (3, -5)$,
(b) $u = (1, -3)$, $v = (-2, 6)$,
(c) $u = (1, 2, -3)$, $v = (4, 5, -6)$,
(d) $u = (2, 4, -8)$, $v = (3, 6, -12)$.

Two vectors $u$ and $v$ are linearly dependent if and only if one is a multiple of the other.
(a) No. (b) Yes; for $v = -2u$. (c) No. (d) Yes, for $v = \tfrac{3}{2}u$.


4.18. Determine whether or not $u$ and $v$ are linearly dependent, where
(a) $u = 2t^2 + 4t - 3$, $v = 4t^2 + 8t - 6$,
(b) $u = 2t^2 - 3t + 4$, $v = 4t^2 - 3t + 2$,
(c) $u = \begin{bmatrix} 1 & 3 & -4 \\ 5 & 0 & -1 \end{bmatrix}$, $v = \begin{bmatrix} -4 & -12 & 16 \\ -20 & 0 & 4 \end{bmatrix}$,
(d) $u = \begin{bmatrix} 1 & 1 & 1 \\ 2 & 2 & 2 \end{bmatrix}$, $v = \begin{bmatrix} 2 & 2 & 2 \\ 3 & 3 & 3 \end{bmatrix}$.

Two vectors $u$ and $v$ are linearly dependent if and only if one is a multiple of the other.
(a) Yes; for $v = 2u$. (b) No. (c) Yes, for $v = -4u$. (d) No.

4.19. Determine whether or not the vectors $u = (1, 1, 2)$, $v = (2, 3, 1)$, $w = (4, 5, 5)$ in $\mathbf{R}^3$ are linearly dependent.
Method 1. Set a linear combination of $u$, $v$, $w$ equal to the zero vector using unknowns $x$, $y$, $z$ to obtain the equivalent homogeneous system of linear equations and then reduce the system to echelon form. This yields
\[
x\begin{bmatrix} 1 \\ 1 \\ 2 \end{bmatrix} + y\begin{bmatrix} 2 \\ 3 \\ 1 \end{bmatrix} + z\begin{bmatrix} 4 \\ 5 \\ 5 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
\quad\text{or}\quad
\begin{aligned} x + 2y + 4z &= 0 \\ x + 3y + 5z &= 0 \\ 2x + y + 5z &= 0 \end{aligned}
\quad\text{or}\quad
\begin{aligned} x + 2y + 4z &= 0 \\ y + z &= 0 \end{aligned}
\]
The echelon system has only two nonzero equations in three unknowns; hence, it has a free variable and a nonzero solution. Thus, $u$, $v$, $w$ are linearly dependent.
Method 2. Form the matrix $A$ whose columns are $u$, $v$, $w$ and reduce to echelon form:
\[
A = \begin{bmatrix} 1 & 2 & 4 \\ 1 & 3 & 5 \\ 2 & 1 & 5 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 4 \\ 0 & 1 & 1 \\ 0 & -3 & -3 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 4 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{bmatrix}
\]
The third column does not have a pivot; hence, the third vector $w$ is a linear combination of the first two vectors $u$ and $v$. Thus, the vectors are linearly dependent. (Observe that the matrix $A$ is also the coefficient matrix in Method 1. In other words, this method is essentially the same as the first method.)
Method 3. Form the matrix $B$ whose rows are $u$, $v$, $w$, and reduce to echelon form:
\[
B = \begin{bmatrix} 1 & 1 & 2 \\ 2 & 3 & 1 \\ 4 & 5 & 5 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 2 \\ 0 & 1 & -3 \\ 0 & 1 & -3 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 2 \\ 0 & 1 & -3 \\ 0 & 0 & 0 \end{bmatrix}
\]
Because the echelon matrix has only two nonzero rows, the three vectors are linearly dependent. (The three given vectors span a space of dimension 2.)
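Method 3 amounts to counting the nonzero rows of an echelon form. The sketch below is one possible way to do that with exact rational arithmetic from the standard library; the function name `rank` and its pivoting strategy are assumptions made for illustration, not the book's algorithm verbatim.

```python
from fractions import Fraction

def rank(rows):
    """Number of nonzero rows after forward elimination (exact arithmetic)."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0  # index of the next pivot row
    for c in range(len(m[0])):
        # Look for a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        # Clear the entries below the pivot.
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

u, v, w = (1, 1, 2), (2, 3, 1), (4, 5, 5)
r = rank([u, v, w])          # rows are the given vectors, as in Method 3
dependent = r < 3            # fewer pivots than vectors means dependence
# One explicit dependence relation: w = 2u + v.
relation_holds = tuple(2 * a + b for a, b in zip(u, v)) == w
```

The same function reports rank 3 for the vectors of Problem 4.20(b), in line with the conclusion that they are linearly independent.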

4.20. Determine whether or not each of the following lists of vectors in $\mathbf{R}^3$ is linearly dependent:
(a) $u_1 = (1, 2, 5)$, $u_2 = (1, 3, 1)$, $u_3 = (2, 5, 7)$, $u_4 = (3, 1, 4)$,
(b) $u = (1, 2, 5)$, $v = (2, 5, 1)$, $w = (1, 5, 2)$,
(c) $u = (1, 2, 3)$, $v = (0, 0, 0)$, $w = (1, 5, 6)$.

(a) Yes, because any four vectors in $\mathbf{R}^3$ are linearly dependent.
(b) Use Method 2 above; that is, form the matrix $A$ whose columns are the given vectors, and reduce the matrix to echelon form:
\[
A = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 5 & 5 \\ 5 & 1 & 2 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 1 \\ 0 & 1 & 3 \\ 0 & -9 & -3 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 1 \\ 0 & 1 & 3 \\ 0 & 0 & 24 \end{bmatrix}
\]
Every column has a pivot entry; hence, no vector is a linear combination of the previous vectors. Thus, the vectors are linearly independent.
(c) Because $0 = (0, 0, 0)$ is one of the vectors, the vectors are linearly dependent.


4.21. Show that the functions $f(t) = \sin t$, $g(t) = \cos t$, $h(t) = t$ from $\mathbf{R}$ into $\mathbf{R}$ are linearly independent.
Set a linear combination of the functions equal to the zero function $0$ using unknown scalars $x$, $y$, $z$; that is, set $xf + yg + zh = 0$. Then show $x = 0$, $y = 0$, $z = 0$. We emphasize that $xf + yg + zh = 0$ means that, for every value of $t$, we have $xf(t) + yg(t) + zh(t) = 0$. Thus, in the equation $x \sin t + y \cos t + zt = 0$:
(i) Set $t = 0$ to obtain $x(0) + y(1) + z(0) = 0$, or $y = 0$.
(ii) Set $t = \pi/2$ to obtain $x(1) + y(0) + z\pi/2 = 0$, or $x + \pi z/2 = 0$.
(iii) Set $t = \pi$ to obtain $x(0) + y(-1) + z(\pi) = 0$, or $-y + \pi z = 0$.

The three equations have only the zero solution; that is, $x = 0$, $y = 0$, $z = 0$. Thus, $f$, $g$, $h$ are linearly independent.

4.22. Suppose the vectors $u$, $v$, $w$ are linearly independent. Show that the vectors $u + v$, $u - v$, $u - 2v + w$ are also linearly independent.
Suppose $x(u + v) + y(u - v) + z(u - 2v + w) = 0$. Then
\[ xu + xv + yu - yv + zu - 2zv + zw = 0 \quad\text{or}\quad (x + y + z)u + (x - y - 2z)v + zw = 0 \]
Because $u$, $v$, $w$ are linearly independent, the coefficients in the above equation are each 0; hence,
\[ x + y + z = 0, \quad x - y - 2z = 0, \quad z = 0 \]
The only solution to the above homogeneous system is $x = 0$, $y = 0$, $z = 0$. Thus, $u + v$, $u - v$, $u - 2v + w$ are linearly independent.

4.23. Show that the vectors $u = (1 + i, 2i)$ and $w = (1, 1 + i)$ in $\mathbf{C}^2$ are linearly dependent over the complex field $\mathbf{C}$ but linearly independent over the real field $\mathbf{R}$.
Recall that two vectors are linearly dependent (over a field $K$) if and only if one of them is a multiple of the other (by an element in $K$). Because
\[ (1 + i)w = (1 + i)(1, 1 + i) = (1 + i, 2i) = u \]
$u$ and $w$ are linearly dependent over $\mathbf{C}$. On the other hand, $u$ and $w$ are linearly independent over $\mathbf{R}$, as no real multiple of $w$ can equal $u$. Specifically, when $k$ is real, the first component of $kw = (k, k + ki)$ must be real, and it can never equal the first component $1 + i$ of $u$, which is complex.
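Python's built-in complex numbers make both halves of Problem 4.23 easy to check. The sketch below is illustrative only; the helper `as_real` (which rewrites a vector of $\mathbf{C}^2$ as a vector of $\mathbf{R}^4$) is an assumption introduced here. It relies on the fact that two vectors are dependent exactly when every 2-by-2 minor of the matrix they form vanishes.

```python
u = (1 + 1j, 2j)
w = (1, 1 + 1j)

# Over C: the complex scalar k = 1 + i carries w to u.
k = 1 + 1j
dependent_over_C = tuple(k * c for c in w) == u

def as_real(vec):
    """View a vector of C^2 as a vector of R^4: (Re z1, Im z1, Re z2, Im z2)."""
    return [part for c in vec for part in (complex(c).real, complex(c).imag)]

ur, wr = as_real(u), as_real(w)

# Over R: u and w are dependent only if all 2x2 minors of [ur; wr] are zero.
minors = [ur[i] * wr[j] - ur[j] * wr[i] for i in range(4) for j in range(i + 1, 4)]
independent_over_R = any(m != 0 for m in minors)
```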

Basis and Dimension

4.24. Determine whether or not each of the following form a basis of $\mathbf{R}^3$:
(a) $(1, 1, 1)$, $(1, 0, 1)$;
(b) $(1, 2, 3)$, $(1, 3, 5)$, $(1, 0, 1)$, $(2, 3, 0)$;
(c) $(1, 1, 1)$, $(1, 2, 3)$, $(2, -1, 1)$;
(d) $(1, 1, 2)$, $(1, 2, 5)$, $(5, 3, 4)$.

(a and b) No, because a basis of $\mathbf{R}^3$ must contain exactly three elements, as $\dim \mathbf{R}^3 = 3$.
(c) The three vectors form a basis if and only if they are linearly independent. Thus, form the matrix whose rows are the given vectors, and row reduce the matrix to echelon form:
\[
\begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 3 \\ 2 & -1 & 1 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 2 \\ 0 & -3 & -1 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 2 \\ 0 & 0 & 5 \end{bmatrix}
\]
The echelon matrix has no zero rows; hence, the three vectors are linearly independent, and so they do form a basis of $\mathbf{R}^3$.
(d) Form the matrix whose rows are the given vectors, and row reduce the matrix to echelon form:
\[
\begin{bmatrix} 1 & 1 & 2 \\ 1 & 2 & 5 \\ 5 & 3 & 4 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 2 \\ 0 & 1 & 3 \\ 0 & -2 & -6 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 2 \\ 0 & 1 & 3 \\ 0 & 0 & 0 \end{bmatrix}
\]
The echelon matrix has a zero row; hence, the three vectors are linearly dependent, and so they do not form a basis of $\mathbf{R}^3$.

4.25. Determine whether $(1, 1, 1, 1)$, $(1, 2, 3, 2)$, $(2, 5, 6, 4)$, $(2, 6, 8, 5)$ form a basis of $\mathbf{R}^4$. If not, find the dimension of the subspace they span.
Form the matrix whose rows are the given vectors, and row reduce to echelon form:
\[
B = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 2 & 3 & 2 \\ 2 & 5 & 6 & 4 \\ 2 & 6 & 8 & 5 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 1 \\ 0 & 3 & 4 & 2 \\ 0 & 4 & 6 & 3 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 1 \\ 0 & 0 & -2 & -1 \\ 0 & 0 & -2 & -1 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 1 \\ 0 & 0 & 2 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}
\]
The echelon matrix has a zero row. Hence, the four vectors are linearly dependent and do not form a basis of $\mathbf{R}^4$. Because the echelon matrix has three nonzero rows, the four vectors span a subspace of dimension 3.

4.26. Extend $\{u_1 = (1, 1, 1, 1),\; u_2 = (2, 2, 3, 4)\}$ to a basis of $\mathbf{R}^4$.
First form the matrix with rows $u_1$ and $u_2$, and reduce to echelon form:
\[
\begin{bmatrix} 1 & 1 & 1 & 1 \\ 2 & 2 & 3 & 4 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 0 & 1 & 2 \end{bmatrix}
\]
Then $w_1 = (1, 1, 1, 1)$ and $w_2 = (0, 0, 1, 2)$ span the same set of vectors as spanned by $u_1$ and $u_2$. Let $u_3 = (0, 1, 0, 0)$ and $u_4 = (0, 0, 0, 1)$. Then $w_1$, $u_3$, $w_2$, $u_4$ form a matrix in echelon form. Thus, they are linearly independent, and they form a basis of $\mathbf{R}^4$. Hence, $u_1$, $u_2$, $u_3$, $u_4$ also form a basis of $\mathbf{R}^4$.

4.27. Consider the complex field $\mathbf{C}$, which contains the real field $\mathbf{R}$, which contains the rational field $\mathbf{Q}$. (Thus, $\mathbf{C}$ is a vector space over $\mathbf{R}$, and $\mathbf{R}$ is a vector space over $\mathbf{Q}$.)
(a) Show that $\{1, i\}$ is a basis of $\mathbf{C}$ over $\mathbf{R}$; hence, $\mathbf{C}$ is a vector space of dimension 2 over $\mathbf{R}$.
(b) Show that $\mathbf{R}$ is a vector space of infinite dimension over $\mathbf{Q}$.

(a) For any $v \in \mathbf{C}$, we have $v = a + bi = a(1) + b(i)$, where $a, b \in \mathbf{R}$. Hence, $\{1, i\}$ spans $\mathbf{C}$ over $\mathbf{R}$. Furthermore, if $x(1) + y(i) = 0$ or $x + yi = 0$, where $x, y \in \mathbf{R}$, then $x = 0$ and $y = 0$. Hence, $\{1, i\}$ is linearly independent over $\mathbf{R}$. Thus, $\{1, i\}$ is a basis for $\mathbf{C}$ over $\mathbf{R}$.
(b) It can be shown that $\pi$ is a transcendental number; that is, $\pi$ is not a root of any polynomial over $\mathbf{Q}$. Thus, for any $n$, the $n + 1$ real numbers $1, \pi, \pi^2, \ldots, \pi^n$ are linearly independent over $\mathbf{Q}$. Hence, $\mathbf{R}$ cannot be of dimension $n$ over $\mathbf{Q}$ for any $n$. Accordingly, $\mathbf{R}$ is of infinite dimension over $\mathbf{Q}$.

4.28. Suppose $S = \{u_1, u_2, \ldots, u_n\}$ is a subset of $V$. Show that the following Definitions A and B of a basis of $V$ are equivalent:
(A) $S$ is linearly independent and spans $V$.
(B) Every $v \in V$ is a unique linear combination of vectors in $S$.

Suppose (A) holds. Because $S$ spans $V$, the vector $v$ is a linear combination of the $u_i$, say
\[ v = a_1 u_1 + a_2 u_2 + \cdots + a_n u_n \]
Suppose we also have
\[ v = b_1 u_1 + b_2 u_2 + \cdots + b_n u_n \]
Subtracting, we get
\[ 0 = v - v = (a_1 - b_1)u_1 + (a_2 - b_2)u_2 + \cdots + (a_n - b_n)u_n \]
But the $u_i$ are linearly independent. Hence, the coefficients in the above relation are each 0:
\[ a_1 - b_1 = 0, \quad a_2 - b_2 = 0, \quad \ldots, \quad a_n - b_n = 0 \]
Therefore, $a_1 = b_1$, $a_2 = b_2$, $\ldots$, $a_n = b_n$. Hence, the representation of $v$ as a linear combination of the $u_i$ is unique. Thus, (A) implies (B).
Suppose (B) holds. Then $S$ spans $V$. Suppose
\[ 0 = c_1 u_1 + c_2 u_2 + \cdots + c_n u_n \]
However, we do have
\[ 0 = 0u_1 + 0u_2 + \cdots + 0u_n \]
By hypothesis, the representation of 0 as a linear combination of the $u_i$ is unique. Hence, each $c_i = 0$ and the $u_i$ are linearly independent. Thus, (B) implies (A).

Dimension and Subspaces

4.29. Find a basis and dimension of the subspace $W$ of $\mathbf{R}^3$ where
(a) $W = \{(a, b, c) : a + b + c = 0\}$, (b) $W = \{(a, b, c) : a = b = c\}$.

(a) Note that $W \neq \mathbf{R}^3$, because, for example, $(1, 2, 3) \notin W$. Thus, $\dim W < 3$. Note that $u_1 = (1, 0, -1)$ and $u_2 = (0, 1, -1)$ are two independent vectors in $W$. Thus, $\dim W = 2$, and so $u_1$ and $u_2$ form a basis of $W$.
(b) The vector $u = (1, 1, 1) \in W$. Any vector $w \in W$ has the form $w = (k, k, k)$. Hence, $w = ku$. Thus, $u$ spans $W$ and $\dim W = 1$.

4.30. Let $W$ be the subspace of $\mathbf{R}^4$ spanned by the vectors
\[ u_1 = (1, -2, 5, -3), \quad u_2 = (2, 3, 1, -4), \quad u_3 = (3, 8, -3, -5) \]
(a) Find a basis and dimension of $W$. (b) Extend the basis of $W$ to a basis of $\mathbf{R}^4$.

(a) Apply Algorithm 4.1, the row space algorithm. Form the matrix whose rows are the given vectors, and reduce it to echelon form:
\[
A = \begin{bmatrix} 1 & -2 & 5 & -3 \\ 2 & 3 & 1 & -4 \\ 3 & 8 & -3 & -5 \end{bmatrix} \sim \begin{bmatrix} 1 & -2 & 5 & -3 \\ 0 & 7 & -9 & 2 \\ 0 & 14 & -18 & 4 \end{bmatrix} \sim \begin{bmatrix} 1 & -2 & 5 & -3 \\ 0 & 7 & -9 & 2 \\ 0 & 0 & 0 & 0 \end{bmatrix}
\]
The nonzero rows $(1, -2, 5, -3)$ and $(0, 7, -9, 2)$ of the echelon matrix form a basis of the row space of $A$ and hence of $W$. Thus, in particular, $\dim W = 2$.
(b) We seek four linearly independent vectors, which include the above two vectors. The four vectors $(1, -2, 5, -3)$, $(0, 7, -9, 2)$, $(0, 0, 1, 0)$, and $(0, 0, 0, 1)$ are linearly independent (because they form an echelon matrix), and so they form a basis of $\mathbf{R}^4$, which is an extension of the basis of $W$.

4.31. Let $W$ be the subspace of $\mathbf{R}^5$ spanned by $u_1 = (1, 2, -1, 3, 4)$, $u_2 = (2, 4, -2, 6, 8)$, $u_3 = (1, 3, 2, 2, 6)$, $u_4 = (1, 4, 5, 1, 8)$, $u_5 = (2, 7, 3, 3, 9)$. Find a subset of the vectors that form a basis of $W$.
Here we use Algorithm 4.2, the casting-out algorithm. Form the matrix $M$ whose columns (not rows) are the given vectors, and reduce it to echelon form:
\[
M = \begin{bmatrix} 1 & 2 & 1 & 1 & 2 \\ 2 & 4 & 3 & 4 & 7 \\ -1 & -2 & 2 & 5 & 3 \\ 3 & 6 & 2 & 1 & 3 \\ 4 & 8 & 6 & 8 & 9 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 1 & 1 & 2 \\ 0 & 0 & 1 & 2 & 3 \\ 0 & 0 & 3 & 6 & 5 \\ 0 & 0 & -1 & -2 & -3 \\ 0 & 0 & 2 & 4 & 1 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 1 & 1 & 2 \\ 0 & 0 & 1 & 2 & 3 \\ 0 & 0 & 0 & 0 & -4 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}
\]
The pivot positions are in columns $C_1$, $C_3$, $C_5$. Hence, the corresponding vectors $u_1$, $u_3$, $u_5$ form a basis of $W$, and $\dim W = 3$.
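The casting-out step can be sketched in code: place the vectors as columns, eliminate, and keep the vectors whose columns receive a pivot. The following uses only the standard library; the function name `pivot_columns` is an assumption for illustration, and it returns 0-based column indices.

```python
from fractions import Fraction

def pivot_columns(vectors):
    """0-based indices of the vectors whose columns contain a pivot."""
    # Build the matrix whose columns (not rows) are the given vectors.
    m = [[Fraction(v[i]) for v in vectors] for i in range(len(vectors[0]))]
    pivots, r = [], 0
    for c in range(len(vectors)):
        # Find a usable pivot in column c at or below row r.
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue  # no pivot: this vector depends on the earlier ones
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    return pivots

vectors = [
    (1, 2, -1, 3, 4),   # u1
    (2, 4, -2, 6, 8),   # u2
    (1, 3, 2, 2, 6),    # u3
    (1, 4, 5, 1, 8),    # u4
    (2, 7, 3, 3, 9),    # u5
]
kept = pivot_columns(vectors)
basis = [vectors[i] for i in kept]
```

With the vectors of Problem 4.31 the kept indices correspond to $u_1$, $u_3$, $u_5$.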


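4.32. Let $V$ be the vector space of $2 \times 2$ matrices over $K$. Let $W$ be the subspace of symmetric matrices. Show that $\dim W = 3$ by finding a basis of $W$.
Recall that a matrix $A = [a_{ij}]$ is symmetric if $A^T = A$, or, equivalently, each $a_{ij} = a_{ji}$. Thus, $A = \begin{bmatrix} a & b \\ b & d \end{bmatrix}$ denotes an arbitrary $2 \times 2$ symmetric matrix. Setting (i) $a = 1$, $b = 0$, $d = 0$; (ii) $a = 0$, $b = 1$, $d = 0$; (iii) $a = 0$, $b = 0$, $d = 1$, we obtain the respective matrices:
\[
E_1 = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \quad E_2 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad E_3 = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}
\]
We claim that $S = \{E_1, E_2, E_3\}$ is a basis of $W$; that is, (a) $S$ spans $W$ and (b) $S$ is linearly independent.
(a) The above matrix $A = \begin{bmatrix} a & b \\ b & d \end{bmatrix} = aE_1 + bE_2 + dE_3$. Thus, $S$ spans $W$.
(b) Suppose $xE_1 + yE_2 + zE_3 = 0$, where $x$, $y$, $z$ are unknown scalars. That is, suppose
\[
x\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} + y\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} + z\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}
\quad\text{or}\quad
\begin{bmatrix} x & y \\ y & z \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}
\]
Setting corresponding entries equal to each other yields $x = 0$, $y = 0$, $z = 0$. Thus, $S$ is linearly independent. Therefore, $S$ is a basis of $W$, as claimed.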

Theorems on Linear Dependence, Basis, and Dimension

4.33. Prove Lemma 4.10: Suppose two or more nonzero vectors $v_1, v_2, \ldots, v_m$ are linearly dependent. Then one of them is a linear combination of the preceding vectors.
Because the $v_i$ are linearly dependent, there exist scalars $a_1, \ldots, a_m$, not all 0, such that $a_1 v_1 + \cdots + a_m v_m = 0$. Let $k$ be the largest integer such that $a_k \neq 0$. Then
\[ a_1 v_1 + \cdots + a_k v_k + 0v_{k+1} + \cdots + 0v_m = 0 \quad\text{or}\quad a_1 v_1 + \cdots + a_k v_k = 0 \]
Suppose $k = 1$; then $a_1 v_1 = 0$, $a_1 \neq 0$, and so $v_1 = 0$. But the $v_i$ are nonzero vectors. Hence, $k > 1$ and
\[ v_k = -a_k^{-1} a_1 v_1 - \cdots - a_k^{-1} a_{k-1} v_{k-1} \]
That is, $v_k$ is a linear combination of the preceding vectors.

4.34. Suppose $S = \{v_1, v_2, \ldots, v_m\}$ spans a vector space $V$.
(a) If $w \in V$, then $\{w, v_1, \ldots, v_m\}$ is linearly dependent and spans $V$.
(b) If $v_i$ is a linear combination of $v_1, \ldots, v_{i-1}$, then $S$ without $v_i$ spans $V$.

(a) The vector $w$ is a linear combination of the $v_i$, because $\{v_i\}$ spans $V$. Accordingly, $\{w, v_1, \ldots, v_m\}$ is linearly dependent. Clearly, $w$ with the $v_i$ span $V$, as the $v_i$ by themselves span $V$; that is, $\{w, v_1, \ldots, v_m\}$ spans $V$.
(b) Suppose $v_i = k_1 v_1 + \cdots + k_{i-1} v_{i-1}$. Let $u \in V$. Because $\{v_i\}$ spans $V$, $u$ is a linear combination of the $v_j$'s, say $u = a_1 v_1 + \cdots + a_m v_m$. Substituting for $v_i$, we obtain
\[
\begin{aligned}
u &= a_1 v_1 + \cdots + a_{i-1} v_{i-1} + a_i(k_1 v_1 + \cdots + k_{i-1} v_{i-1}) + a_{i+1} v_{i+1} + \cdots + a_m v_m \\
  &= (a_1 + a_i k_1)v_1 + \cdots + (a_{i-1} + a_i k_{i-1})v_{i-1} + a_{i+1} v_{i+1} + \cdots + a_m v_m
\end{aligned}
\]
Thus, $\{v_1, \ldots, v_{i-1}, v_{i+1}, \ldots, v_m\}$ spans $V$. In other words, we can delete $v_i$ from the spanning set and still retain a spanning set.

4.35. Prove Lemma 4.13: Suppose $\{v_1, v_2, \ldots, v_n\}$ spans $V$, and suppose $\{w_1, w_2, \ldots, w_m\}$ is linearly independent. Then $m \le n$, and $V$ is spanned by a set of the form
\[ \{w_1, w_2, \ldots, w_m, v_{i_1}, v_{i_2}, \ldots, v_{i_{n-m}}\} \]
Thus, any $n + 1$ or more vectors in $V$ are linearly dependent.

It suffices to prove the lemma in the case that the $v_i$ are all nonzero. (Prove!) Because $\{v_i\}$ spans $V$, we have by Problem 4.34 that
\[ \{w_1, v_1, \ldots, v_n\} \tag{1} \]
is linearly dependent and also spans $V$. By Lemma 4.10, one of the vectors in (1) is a linear combination of the preceding vectors. This vector cannot be $w_1$, so it must be one of the $v$'s, say $v_j$. Thus, by Problem 4.34, we can delete $v_j$ from the spanning set (1) and obtain the spanning set
\[ \{w_1, v_1, \ldots, v_{j-1}, v_{j+1}, \ldots, v_n\} \tag{2} \]
Now we repeat the argument with the vector $w_2$. That is, because (2) spans $V$, the set
\[ \{w_1, w_2, v_1, \ldots, v_{j-1}, v_{j+1}, \ldots, v_n\} \tag{3} \]
is linearly dependent and also spans $V$. Again by Lemma 4.10, one of the vectors in (3) is a linear combination of the preceding vectors. We emphasize that this vector cannot be $w_1$ or $w_2$, because $\{w_1, \ldots, w_m\}$ is independent; hence, it must be one of the $v$'s, say $v_k$. Thus, by Problem 4.34, we can delete $v_k$ from the spanning set (3) and obtain the spanning set
\[ \{w_1, w_2, v_1, \ldots, v_{j-1}, v_{j+1}, \ldots, v_{k-1}, v_{k+1}, \ldots, v_n\} \]
We repeat the argument with $w_3$, and so forth. At each step, we are able to add one of the $w$'s and delete one of the $v$'s in the spanning set. If $m \le n$, then we finally obtain a spanning set of the required form:
\[ \{w_1, \ldots, w_m, v_{i_1}, \ldots, v_{i_{n-m}}\} \]
Finally, we show that $m > n$ is not possible. Otherwise, after $n$ of the above steps, we obtain the spanning set $\{w_1, \ldots, w_n\}$. This implies that $w_{n+1}$ is a linear combination of $w_1, \ldots, w_n$, which contradicts the hypothesis that $\{w_i\}$ is linearly independent.

4.36. Prove Theorem 4.12: Every basis of a vector space $V$ has the same number of elements.
Suppose $\{u_1, u_2, \ldots, u_n\}$ is a basis of $V$, and suppose $\{v_1, v_2, \ldots\}$ is another basis of $V$. Because $\{u_i\}$ spans $V$, the basis $\{v_1, v_2, \ldots\}$ must contain $n$ or fewer vectors, or else it is linearly dependent by Problem 4.35 (Lemma 4.13). On the other hand, if the basis $\{v_1, v_2, \ldots\}$ contains fewer than $n$ elements, then $\{u_1, u_2, \ldots, u_n\}$ is linearly dependent by Problem 4.35. Thus, the basis $\{v_1, v_2, \ldots\}$ contains exactly $n$ vectors, and so the theorem is true.

4.37. Prove Theorem 4.14: Let $V$ be a vector space of finite dimension $n$. Then
(i) Any $n + 1$ or more vectors must be linearly dependent.
(ii) Any linearly independent set $S = \{u_1, u_2, \ldots, u_n\}$ with $n$ elements is a basis of $V$.
(iii) Any spanning set $T = \{v_1, v_2, \ldots, v_n\}$ of $V$ with $n$ elements is a basis of $V$.

Suppose $B = \{w_1, w_2, \ldots, w_n\}$ is a basis of $V$.
(i) Because $B$ spans $V$, any $n + 1$ or more vectors are linearly dependent by Lemma 4.13.
(ii) By Lemma 4.13, elements from $B$ can be adjoined to $S$ to form a spanning set of $V$ with $n$ elements. Because $S$ already has $n$ elements, $S$ itself is a spanning set of $V$. Thus, $S$ is a basis of $V$.
(iii) Suppose $T$ is linearly dependent. Then some $v_i$ is a linear combination of the preceding vectors. By Problem 4.34, $V$ is spanned by the vectors in $T$ without $v_i$, and there are $n - 1$ of them. By Lemma 4.13, the independent set $B$ cannot have more than $n - 1$ elements. This contradicts the fact that $B$ has $n$ elements. Thus, $T$ is linearly independent, and hence $T$ is a basis of $V$.

4.38. Prove Theorem 4.15: Suppose $S$ spans a vector space $V$. Then
(i) Any maximum number of linearly independent vectors in $S$ form a basis of $V$.
(ii) Suppose one deletes from $S$ every vector that is a linear combination of preceding vectors in $S$. Then the remaining vectors form a basis of $V$.

(i) Suppose $\{v_1, \ldots, v_m\}$ is a maximum linearly independent subset of $S$, and suppose $w \in S$. Accordingly, $\{v_1, \ldots, v_m, w\}$ is linearly dependent. No $v_k$ can be a linear combination of preceding vectors. Hence, $w$ is a linear combination of the $v_i$. Thus, $w \in \operatorname{span}(v_i)$, and hence $S \subseteq \operatorname{span}(v_i)$. This leads to
\[ V = \operatorname{span}(S) \subseteq \operatorname{span}(v_i) \subseteq V \]
Thus, $\{v_i\}$ spans $V$, and, as it is linearly independent, it is a basis of $V$.
(ii) The remaining vectors form a maximum linearly independent subset of $S$; hence, by (i), it is a basis of $V$.

4.39. Prove Theorem 4.16: Let $V$ be a vector space of finite dimension and let $S = \{u_1, u_2, \ldots, u_r\}$ be a set of linearly independent vectors in $V$. Then $S$ is part of a basis of $V$; that is, $S$ may be extended to a basis of $V$.
Suppose $B = \{w_1, w_2, \ldots, w_n\}$ is a basis of $V$. Then $B$ spans $V$, and hence $V$ is spanned by
\[ S \cup B = \{u_1, u_2, \ldots, u_r, w_1, w_2, \ldots, w_n\} \]
By Theorem 4.15, we can delete from $S \cup B$ each vector that is a linear combination of preceding vectors to obtain a basis $B'$ for $V$. Because $S$ is linearly independent, no $u_k$ is a linear combination of preceding vectors. Thus, $B'$ contains every vector in $S$, and $S$ is part of the basis $B'$ for $V$.

4.40. Prove Theorem 4.17: Let $W$ be a subspace of an $n$-dimensional vector space $V$. Then $\dim W \le n$. In particular, if $\dim W = n$, then $W = V$.
Because $V$ is of dimension $n$, any $n + 1$ or more vectors are linearly dependent. Furthermore, because a basis of $W$ consists of linearly independent vectors, it cannot contain more than $n$ elements. Accordingly, $\dim W \le n$. In particular, if $\{w_1, \ldots, w_n\}$ is a basis of $W$, then, because it is an independent set with $n$ elements, it is also a basis of $V$. Thus, $W = V$ when $\dim W = n$.

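Rank of a Matrix, Row and Column Spaces

4.41. Find the rank and basis of the row space of each of the following matrices:
\[
\text{(a)}\; A = \begin{bmatrix} 1 & 2 & 0 & -1 \\ 2 & 6 & -3 & -3 \\ 3 & 10 & -6 & -5 \end{bmatrix}, \qquad
\text{(b)}\; B = \begin{bmatrix} 1 & 3 & 1 & -2 & -3 \\ 1 & 4 & 3 & -1 & -4 \\ 2 & 3 & -4 & -7 & -3 \\ 3 & 8 & 1 & -7 & -8 \end{bmatrix}.
\]
(a) Row reduce $A$ to echelon form:
\[
A \sim \begin{bmatrix} 1 & 2 & 0 & -1 \\ 0 & 2 & -3 & -1 \\ 0 & 4 & -6 & -2 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 0 & -1 \\ 0 & 2 & -3 & -1 \\ 0 & 0 & 0 & 0 \end{bmatrix}
\]
The two nonzero rows $(1, 2, 0, -1)$ and $(0, 2, -3, -1)$ of the echelon form of $A$ form a basis for $\operatorname{rowsp}(A)$. In particular, $\operatorname{rank}(A) = 2$.
(b) Row reduce $B$ to echelon form:
\[
B \sim \begin{bmatrix} 1 & 3 & 1 & -2 & -3 \\ 0 & 1 & 2 & 1 & -1 \\ 0 & -3 & -6 & -3 & 3 \\ 0 & -1 & -2 & -1 & 1 \end{bmatrix} \sim \begin{bmatrix} 1 & 3 & 1 & -2 & -3 \\ 0 & 1 & 2 & 1 & -1 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}
\]
The two nonzero rows $(1, 3, 1, -2, -3)$ and $(0, 1, 2, 1, -1)$ of the echelon form of $B$ form a basis for $\operatorname{rowsp}(B)$. In particular, $\operatorname{rank}(B) = 2$.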

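4.42. Show that $U = W$, where $U$ and $W$ are the following subspaces of $\mathbf{R}^3$:
\[
\begin{aligned}
U &= \operatorname{span}(u_1, u_2, u_3) = \operatorname{span}\{(1, 1, -1),\; (2, 3, -1),\; (3, 1, -5)\} \\
W &= \operatorname{span}(w_1, w_2, w_3) = \operatorname{span}\{(1, -1, -3),\; (3, -2, -8),\; (2, 1, -3)\}
\end{aligned}
\]
Form the matrix $A$ whose rows are the $u_i$, and row reduce $A$ to row canonical form:
\[
A = \begin{bmatrix} 1 & 1 & -1 \\ 2 & 3 & -1 \\ 3 & 1 & -5 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & -1 \\ 0 & 1 & 1 \\ 0 & -2 & -2 \end{bmatrix} \sim \begin{bmatrix} 1 & 0 & -2 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{bmatrix}
\]
Next form the matrix $B$ whose rows are the $w_j$, and row reduce $B$ to row canonical form:
\[
B = \begin{bmatrix} 1 & -1 & -3 \\ 3 & -2 & -8 \\ 2 & 1 & -3 \end{bmatrix} \sim \begin{bmatrix} 1 & -1 & -3 \\ 0 & 1 & 1 \\ 0 & 3 & 3 \end{bmatrix} \sim \begin{bmatrix} 1 & 0 & -2 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{bmatrix}
\]
Because $A$ and $B$ have the same row canonical form, the row spaces of $A$ and $B$ are equal, and so $U = W$.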
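The comparison of row canonical forms can be automated. Below is a minimal sketch using exact fractions; the function names `rref` and `nonzero_rows` are assumptions for illustration. Two spanning sets generate the same subspace when the nonzero rows of their row canonical forms coincide.

```python
from fractions import Fraction

def rref(rows):
    """Row canonical (reduced row echelon) form, computed exactly."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        piv = m[r][c]
        m[r] = [x / piv for x in m[r]]          # scale the pivot to 1
        for i in range(len(m)):
            if i != r and m[i][c] != 0:          # clear the rest of the column
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return m

def nonzero_rows(m):
    """Drop the zero rows of a matrix."""
    return [row for row in m if any(x != 0 for x in row)]

A = [[1, 1, -1], [2, 3, -1], [3, 1, -5]]      # rows u1, u2, u3
B = [[1, -1, -3], [3, -2, -8], [2, 1, -3]]    # rows w1, w2, w3
same_span = nonzero_rows(rref(A)) == nonzero_rows(rref(B))
```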

4.43. Let $A = \begin{bmatrix} 1 & 2 & 1 & 2 & 3 & 1 \\ 2 & 4 & 3 & 7 & 7 & 4 \\ 1 & 2 & 2 & 5 & 5 & 6 \\ 3 & 6 & 6 & 15 & 14 & 15 \end{bmatrix}$.
(a) Find $\operatorname{rank}(M_k)$, for $k = 1, 2, \ldots, 6$, where $M_k$ is the submatrix of $A$ consisting of the first $k$ columns $C_1, C_2, \ldots, C_k$ of $A$.
(b) Which columns $C_{k+1}$ are linear combinations of preceding columns $C_1, \ldots, C_k$?
(c) Find columns of $A$ that form a basis for the column space of $A$.
(d) Express column $C_4$ as a linear combination of the columns in part (c).

(a) Row reduce $A$ to echelon form:
\[
A \sim \begin{bmatrix} 1 & 2 & 1 & 2 & 3 & 1 \\ 0 & 0 & 1 & 3 & 1 & 2 \\ 0 & 0 & 1 & 3 & 2 & 5 \\ 0 & 0 & 3 & 9 & 5 & 12 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 1 & 2 & 3 & 1 \\ 0 & 0 & 1 & 3 & 1 & 2 \\ 0 & 0 & 0 & 0 & 1 & 3 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}
\]
Observe that this simultaneously reduces all the matrices $M_k$ to echelon form; for example, the first four columns of the echelon form of $A$ are an echelon form of $M_4$. We know that $\operatorname{rank}(M_k)$ is equal to the number of pivots or, equivalently, the number of nonzero rows in an echelon form of $M_k$. Thus,
\[ \operatorname{rank}(M_1) = \operatorname{rank}(M_2) = 1, \quad \operatorname{rank}(M_3) = \operatorname{rank}(M_4) = 2, \quad \operatorname{rank}(M_5) = \operatorname{rank}(M_6) = 3 \]
(b) The vector equation $x_1 C_1 + x_2 C_2 + \cdots + x_k C_k = C_{k+1}$ yields the system with coefficient matrix $M_k$ and augmented matrix $M_{k+1}$. Thus, $C_{k+1}$ is a linear combination of $C_1, \ldots, C_k$ if and only if $\operatorname{rank}(M_k) = \operatorname{rank}(M_{k+1})$ or, equivalently, if $C_{k+1}$ does not contain a pivot. Thus, each of $C_2$, $C_4$, $C_6$ is a linear combination of preceding columns.
(c) In the echelon form of $A$, the pivots are in the first, third, and fifth columns. Thus, columns $C_1$, $C_3$, $C_5$ of $A$ form a basis for the column space of $A$. Alternatively, deleting columns $C_2$, $C_4$, $C_6$ from the spanning set of columns (they are linear combinations of other columns), we obtain, again, $C_1$, $C_3$, $C_5$.
(d) The echelon matrix tells us that $C_4$ is a linear combination of columns $C_1$ and $C_3$. The augmented matrix $M$ of the vector equation $C_4 = xC_1 + yC_3$ consists of the columns $C_1$, $C_3$, $C_4$ of $A$ which, when reduced to echelon form, yields the matrix (omitting zero rows)
\[
\begin{bmatrix} 1 & 1 & 2 \\ 0 & 1 & 3 \end{bmatrix}
\quad\text{or}\quad
\begin{aligned} x + y &= 2 \\ y &= 3 \end{aligned}
\quad\text{or}\quad
x = -1, \; y = 3
\]
Thus, $C_4 = -C_1 + 3C_3 = -C_1 + 3C_3 + 0C_5$.
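The rank pattern of the leading submatrices $M_k$ can be reproduced directly. The sketch below recomputes each rank independently with exact arithmetic; the function name `rank` is an assumption for illustration, and a more economical implementation would reuse one reduction of $A$, as the solution observes.

```python
from fractions import Fraction

def rank(rows):
    """Number of pivots found by forward elimination (exact arithmetic)."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

A = [
    [1, 2, 1, 2, 3, 1],
    [2, 4, 3, 7, 7, 4],
    [1, 2, 2, 5, 5, 6],
    [3, 6, 6, 15, 14, 15],
]

# rank(M_k) for k = 1..6, where M_k keeps the first k columns of A.
ranks = [rank([row[:k] for row in A]) for k in range(1, 7)]

# Column C_{k+1} depends on C_1..C_k exactly when the rank does not grow.
dependent_cols = [k + 1 for k in range(1, 6) if ranks[k] == ranks[k - 1]]

# Part (d): check C4 = -C1 + 3*C3 entry by entry.
cols = list(zip(*A))
c4_check = cols[3] == tuple(-a + 3 * b for a, b in zip(cols[0], cols[2]))
```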

4.44. Suppose $u = (a_1, a_2, \ldots, a_n)$ is a linear combination of the rows $R_1, R_2, \ldots, R_m$ of a matrix $B = [b_{ij}]$, say $u = k_1 R_1 + k_2 R_2 + \cdots + k_m R_m$. Prove that
\[ a_i = k_1 b_{1i} + k_2 b_{2i} + \cdots + k_m b_{mi}, \quad i = 1, 2, \ldots, n \]
where $b_{1i}, b_{2i}, \ldots, b_{mi}$ are the entries in the $i$th column of $B$.
We are given that $u = k_1 R_1 + k_2 R_2 + \cdots + k_m R_m$. Hence,
\[
\begin{aligned}
(a_1, a_2, \ldots, a_n) &= k_1(b_{11}, \ldots, b_{1n}) + \cdots + k_m(b_{m1}, \ldots, b_{mn}) \\
&= (k_1 b_{11} + \cdots + k_m b_{m1},\; \ldots,\; k_1 b_{1n} + \cdots + k_m b_{mn})
\end{aligned}
\]
Setting corresponding components equal to each other, we obtain the desired result.

4.45. Prove Theorem 4.7: Suppose $A = [a_{ij}]$ and $B = [b_{ij}]$ are row equivalent echelon matrices with respective pivot entries
\[ a_{1j_1}, a_{2j_2}, \ldots, a_{rj_r} \quad\text{and}\quad b_{1k_1}, b_{2k_2}, \ldots, b_{sk_s} \]
(pictured in Fig. 4-5). Then $A$ and $B$ have the same number of nonzero rows (that is, $r = s$) and their pivot entries are in the same positions; that is, $j_1 = k_1$, $j_2 = k_2$, $\ldots$, $j_r = k_r$.
\[
A = \begin{bmatrix} a_{1j_1} & * & * & * & * & * & * \\ & & a_{2j_2} & * & * & * & * \\ \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots \\ & & & & a_{rj_r} & * & * \end{bmatrix}, \qquad
B = \begin{bmatrix} b_{1k_1} & * & * & * & * & * & * \\ & & b_{2k_2} & * & * & * & * \\ \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots \\ & & & & b_{sk_s} & * & * \end{bmatrix}
\]
Figure 4-5

Clearly $A = 0$ if and only if $B = 0$, and so we need only prove the theorem when $r \ge 1$ and $s \ge 1$. We first show that $j_1 = k_1$. Suppose $j_1 < k_1$. Then the $j_1$th column of $B$ is zero. Because the first row $R^*$ of $A$ is in the row space of $B$, we have $R^* = c_1 R_1 + c_2 R_2 + \cdots + c_m R_m$, where the $R_i$ are the rows of $B$. Because the $j_1$th column of $B$ is zero, we have
\[ a_{1j_1} = c_1 0 + c_2 0 + \cdots + c_m 0 = 0 \]
But this contradicts the fact that the pivot entry $a_{1j_1} \neq 0$. Hence, $j_1 \ge k_1$ and, similarly, $k_1 \ge j_1$. Thus $j_1 = k_1$.
Now let $A'$ be the submatrix of $A$ obtained by deleting the first row of $A$, and let $B'$ be the submatrix of $B$ obtained by deleting the first row of $B$. We prove that $A'$ and $B'$ have the same row space. The theorem will then follow by induction, because $A'$ and $B'$ are also echelon matrices.
Let $R = (a_1, a_2, \ldots, a_n)$ be any row of $A'$ and let $R_1, \ldots, R_m$ be the rows of $B$. Because $R$ is in the row space of $B$, there exist scalars $d_1, \ldots, d_m$ such that $R = d_1 R_1 + d_2 R_2 + \cdots + d_m R_m$. Because $A$ is in echelon form and $R$ is not the first row of $A$, the $j_1$th entry of $R$ is zero: $a_i = 0$ for $i = j_1 = k_1$. Furthermore, because $B$ is in echelon form, all the entries in the $k_1$th column of $B$ are 0 except the first: $b_{1k_1} \neq 0$, but $b_{2k_1} = 0, \ldots, b_{mk_1} = 0$. Thus,
\[ 0 = a_{k_1} = d_1 b_{1k_1} + d_2 0 + \cdots + d_m 0 = d_1 b_{1k_1} \]
Now $b_{1k_1} \neq 0$ and so $d_1 = 0$. Thus, $R$ is a linear combination of $R_2, \ldots, R_m$ and so is in the row space of $B'$. Because $R$ was any row of $A'$, the row space of $A'$ is contained in the row space of $B'$. Similarly, the row space of $B'$ is contained in the row space of $A'$. Thus, $A'$ and $B'$ have the same row space, and so the theorem is proved.

4.46. Prove Theorem 4.8: Suppose $A$ and $B$ are row canonical matrices. Then $A$ and $B$ have the same row space if and only if they have the same nonzero rows.
Obviously, if $A$ and $B$ have the same nonzero rows, then they have the same row space. Thus we only have to prove the converse.
Suppose $A$ and $B$ have the same row space, and suppose $R \neq 0$ is the $i$th row of $A$. Then there exist scalars $c_1, \ldots, c_s$ such that
\[ R = c_1 R_1 + c_2 R_2 + \cdots + c_s R_s \tag{1} \]
where the $R_i$ are the nonzero rows of $B$. The theorem is proved if we show that $R = R_i$; that is, that $c_i = 1$ but $c_k = 0$ for $k \neq i$.
Let $a_{ij_i}$ be the pivot entry in $R$, that is, the first nonzero entry of $R$. By (1) and Problem 4.44,
\[ a_{ij_i} = c_1 b_{1j_i} + c_2 b_{2j_i} + \cdots + c_s b_{sj_i} \tag{2} \]
But, by Problem 4.45, $b_{ij_i}$ is a pivot entry of $B$, and, as $B$ is row reduced, it is the only nonzero entry in the $j_i$th column of $B$. Thus, from (2), we obtain $a_{ij_i} = c_i b_{ij_i}$. However, $a_{ij_i} = 1$ and $b_{ij_i} = 1$, because $A$ and $B$ are row reduced; hence, $c_i = 1$.
Now suppose $k \neq i$, and $b_{kj_k}$ is the pivot entry in $R_k$. By (1) and Problem 4.44,
\[ a_{ij_k} = c_1 b_{1j_k} + c_2 b_{2j_k} + \cdots + c_s b_{sj_k} \tag{3} \]
Because $B$ is row reduced, $b_{kj_k}$ is the only nonzero entry in the $j_k$th column of $B$. Hence, by (3), $a_{ij_k} = c_k b_{kj_k}$. Furthermore, by Problem 4.45, $a_{kj_k}$ is a pivot entry of $A$, and because $A$ is row reduced, $a_{ij_k} = 0$. Thus, $c_k b_{kj_k} = 0$, and as $b_{kj_k} = 1$, $c_k = 0$. Accordingly $R = R_i$, and the theorem is proved.

4.47. Prove Corollary 4.9: Every matrix $A$ is row equivalent to a unique matrix in row canonical form.
Suppose $A$ is row equivalent to matrices $A_1$ and $A_2$, where $A_1$ and $A_2$ are in row canonical form. Then $\operatorname{rowsp}(A) = \operatorname{rowsp}(A_1)$ and $\operatorname{rowsp}(A) = \operatorname{rowsp}(A_2)$. Hence, $\operatorname{rowsp}(A_1) = \operatorname{rowsp}(A_2)$. Because $A_1$ and $A_2$ are in row canonical form, $A_1 = A_2$ by Theorem 4.8. Thus, the corollary is proved.

4.48. Suppose $RB$ and $AB$ are defined, where $R$ is a row vector and $A$ and $B$ are matrices. Prove
(a) $RB$ is a linear combination of the rows of $B$.
(b) The row space of $AB$ is contained in the row space of $B$.
(c) The column space of $AB$ is contained in the column space of $A$.
(d) If $C$ is a column vector and $AC$ is defined, then $AC$ is a linear combination of the columns of $A$.
(e) $\operatorname{rank}(AB) \le \operatorname{rank}(B)$ and $\operatorname{rank}(AB) \le \operatorname{rank}(A)$.

(a) Suppose $R = (a_1, a_2, \ldots, a_m)$ and $B = [b_{ij}]$. Let $B_1, \ldots, B_m$ denote the rows of $B$ and $B^1, \ldots, B^n$ its columns. Then
\[
\begin{aligned}
RB &= (RB^1, RB^2, \ldots, RB^n) \\
&= (a_1 b_{11} + a_2 b_{21} + \cdots + a_m b_{m1},\; \ldots,\; a_1 b_{1n} + a_2 b_{2n} + \cdots + a_m b_{mn}) \\
&= a_1(b_{11}, b_{12}, \ldots, b_{1n}) + a_2(b_{21}, b_{22}, \ldots, b_{2n}) + \cdots + a_m(b_{m1}, b_{m2}, \ldots, b_{mn}) \\
&= a_1 B_1 + a_2 B_2 + \cdots + a_m B_m
\end{aligned}
\]
Thus, $RB$ is a linear combination of the rows of $B$, as claimed.
(b) The rows of $AB$ are $R_i B$, where $R_i$ is the $i$th row of $A$. Thus, by part (a), each row of $AB$ is in the row space of $B$. Thus, $\operatorname{rowsp}(AB) \subseteq \operatorname{rowsp}(B)$, as claimed.
(c) Using part (b), we have
\[ \operatorname{colsp}(AB) = \operatorname{rowsp}((AB)^T) = \operatorname{rowsp}(B^T A^T) \subseteq \operatorname{rowsp}(A^T) = \operatorname{colsp}(A) \]
(d) Follows from (c) where $C$ replaces $B$.
(e) The row space of $AB$ is contained in the row space of $B$; hence, $\operatorname{rank}(AB) \le \operatorname{rank}(B)$. Furthermore, the column space of $AB$ is contained in the column space of $A$; hence, $\operatorname{rank}(AB) \le \operatorname{rank}(A)$.

4.49. Let $A$ be an $n$-square matrix. Show that $A$ is invertible if and only if $\operatorname{rank}(A) = n$.
Note that the rows of the $n$-square identity matrix $I_n$ are linearly independent, because $I_n$ is in echelon form; hence, $\operatorname{rank}(I_n) = n$. Now if $A$ is invertible, then $A$ is row equivalent to $I_n$; hence, $\operatorname{rank}(A) = n$. But if $A$ is not invertible, then $A$ is row equivalent to a matrix with a zero row; hence, $\operatorname{rank}(A) < n$. That is, $A$ is invertible if and only if $\operatorname{rank}(A) = n$.

148
Applications to Linear Equations

CHAPTER 4 Vector Spaces

4.50. Find the dimension and a basis of the solution space $W$ of each homogeneous system:
(a) $x + 2y + 2z - s + 3t = 0$, $x + 2y + 3z + s + t = 0$, $3x + 6y + 8z + s + 5t = 0$
(b) $x + 2y + z - 2t = 0$, $2x + 4y + 4z - 3t = 0$, $3x + 6y + 7z - 4t = 0$
(c) $x + y + 2z = 0$, $2x + 3y + 3z = 0$, $x + 3y + 5z = 0$

(a) Reduce the system to echelon form:
\[
\begin{aligned}
x + 2y + 2z - s + 3t &= 0 \\
z + 2s - 2t &= 0 \\
2z + 4s - 4t &= 0
\end{aligned}
\qquad\text{or}\qquad
\begin{aligned}
x + 2y + 2z - s + 3t &= 0 \\
z + 2s - 2t &= 0
\end{aligned}
\]
The system in echelon form has two (nonzero) equations in five unknowns. Hence, the system has $5 - 2 = 3$ free variables, which are $y$, $s$, $t$. Thus, $\dim W = 3$. We obtain a basis for $W$:
(1) Set $y = 1$, $s = 0$, $t = 0$ to obtain the solution $v_1 = (-2, 1, 0, 0, 0)$.
(2) Set $y = 0$, $s = 1$, $t = 0$ to obtain the solution $v_2 = (5, 0, -2, 1, 0)$.
(3) Set $y = 0$, $s = 0$, $t = 1$ to obtain the solution $v_3 = (-7, 0, 2, 0, 1)$.
The set $\{v_1, v_2, v_3\}$ is a basis of the solution space $W$.
(b) (Here we use the matrix format of our homogeneous system.) Reduce the coefficient matrix $A$ to echelon form:
\[
A = \begin{bmatrix} 1 & 2 & 1 & -2 \\ 2 & 4 & 4 & -3 \\ 3 & 6 & 7 & -4 \end{bmatrix}
\sim \begin{bmatrix} 1 & 2 & 1 & -2 \\ 0 & 0 & 2 & 1 \\ 0 & 0 & 4 & 2 \end{bmatrix}
\sim \begin{bmatrix} 1 & 2 & 1 & -2 \\ 0 & 0 & 2 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}
\]
This corresponds to the system
\[
x + 2y + z - 2t = 0, \qquad 2z + t = 0
\]
The free variables are $y$ and $t$, and $\dim W = 2$.
(i) Set $y = 1$, $t = 0$ to obtain the solution $u_1 = (-2, 1, 0, 0)$.
(ii) Set $y = 0$, $t = 2$ to obtain the solution $u_2 = (5, 0, -1, 2)$.
Then $\{u_1, u_2\}$ is a basis of $W$.
(c) Reduce the coefficient matrix $A$ to echelon form:
\[
A = \begin{bmatrix} 1 & 1 & 2 \\ 2 & 3 & 3 \\ 1 & 3 & 5 \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 2 \\ 0 & 1 & -1 \\ 0 & 2 & 3 \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 2 \\ 0 & 1 & -1 \\ 0 & 0 & 5 \end{bmatrix}
\]
This corresponds to a triangular system with no free variables. Thus, 0 is the only solution; that is, $W = \{0\}$. Hence, $\dim W = 0$.
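The free-variable method used in this problem can be sketched in code. The following is a minimal illustration, not the book's procedure verbatim: it reduces the coefficient matrix all the way to row canonical form (rather than stopping at echelon form) and then sets each free variable to 1 in turn, with the others 0. The helper names `rref` and `solution_basis` are our own.

```python
from fractions import Fraction

def rref(rows):
    """Return (row canonical form, list of pivot columns), using exact fractions."""
    M = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(M[0])):
        # Look for a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        M[r] = [x / M[r][c] for x in M[r]]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c]
                M[i] = [x - f * y for x, y in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
        if r == len(M):
            break
    return M, pivots

def solution_basis(rows):
    """One basis vector per free variable: that variable is 1, other free variables are 0."""
    M, pivots = rref(rows)
    n = len(rows[0])
    free = [c for c in range(n) if c not in pivots]
    basis = []
    for f in free:
        v = [Fraction(0)] * n
        v[f] = Fraction(1)
        for r, p in enumerate(pivots):
            v[p] = -M[r][f]  # pivot variable determined by the chosen free variable
        basis.append(v)
    return basis

# Coefficient matrix of system (a); unknowns ordered x, y, z, s, t.
system_a = [[1, 2, 2, -1, 3],
            [1, 2, 3, 1, 1],
            [3, 6, 8, 1, 5]]
basis_a = solution_basis(system_a)
for v in basis_a:
    print([str(x) for x in v])
```

For system (a) the free variables are $y$, $s$, $t$, so the sketch returns three vectors, one for each choice in steps (1) to (3).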

4.51. Find a homogeneous system whose solution set $W$ is spanned by
\[
\{u_1, u_2, u_3\} = \{(1, -2, 0, 3), \; (1, -1, -1, 4), \; (1, 0, -2, 5)\}
\]
Let $v = (x, y, z, t)$. Then $v \in W$ if and only if $v$ is a linear combination of the vectors $u_1$, $u_2$, $u_3$ that span $W$. Thus, form the matrix $M$ whose first columns are $u_1$, $u_2$, $u_3$ and whose last column is $v$, and then row reduce $M$ to echelon form. This yields
\[
M = \begin{bmatrix} 1 & 1 & 1 & x \\ -2 & -1 & 0 & y \\ 0 & -1 & -2 & z \\ 3 & 4 & 5 & t \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 1 & x \\ 0 & 1 & 2 & 2x + y \\ 0 & -1 & -2 & z \\ 0 & 1 & 2 & -3x + t \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 1 & x \\ 0 & 1 & 2 & 2x + y \\ 0 & 0 & 0 & 2x + y + z \\ 0 & 0 & 0 & -5x - y + t \end{bmatrix}
\]
Then $v$ is a linear combination of $u_1$, $u_2$, $u_3$ if and only if $\operatorname{rank}(M) = \operatorname{rank}(A)$, where $A$ is the submatrix without column $v$. Thus, set the last two entries in the fourth column on the right equal to zero to obtain the required homogeneous system:
\[
2x + y + z = 0, \qquad 5x + y - t = 0
\]
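As a quick sanity check, each spanning vector should satisfy both equations of the derived system. The sketch below only verifies that $W$ lies inside the solution space; equality then follows from a dimension count (two independent equations in four unknowns give a solution space of dimension 2, and the $u_i$ span a space of dimension 2). The names `spanning`, `system`, and `residuals` are our own.

```python
# Spanning vectors from the problem, written as (x, y, z, t).
spanning = [(1, -2, 0, 3), (1, -1, -1, 4), (1, 0, -2, 5)]

# Coefficient rows of the derived system: 2x + y + z = 0 and 5x + y - t = 0.
system = [(2, 1, 1, 0), (5, 1, 0, -1)]

def residuals(v):
    """Left-hand sides of the system evaluated at v; all zeros means v is a solution."""
    return [sum(c * x for c, x in zip(row, v)) for row in system]

for u in spanning:
    print(u, residuals(u))  # each residual list is [0, 0]
```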

4.52. Let $x_{i_1}, x_{i_2}, \ldots, x_{i_k}$ be the free variables of a homogeneous system of linear equations with $n$ unknowns. Let $v_j$ be the solution for which $x_{i_j} = 1$, and all other free variables equal 0. Show that the solutions $v_1, v_2, \ldots, v_k$ are linearly independent.
Let $A$ be the matrix whose rows are the $v_i$. We interchange column 1 and column $i_1$, then column 2 and column $i_2$, ..., then column $k$ and column $i_k$, and we obtain the $k \times n$ matrix
\[
B = [I, C] = \begin{bmatrix}
1 & 0 & 0 & \cdots & 0 & 0 & c_{1,k+1} & \cdots & c_{1n} \\
0 & 1 & 0 & \cdots & 0 & 0 & c_{2,k+1} & \cdots & c_{2n} \\
\vdots & & & & & & & & \vdots \\
0 & 0 & 0 & \cdots & 0 & 1 & c_{k,k+1} & \cdots & c_{kn}
\end{bmatrix}
\]
The above matrix $B$ is in echelon form, and so its rows are independent; hence, $\operatorname{rank}(B) = k$. Because $A$ and $B$ are column equivalent, they have the same rank, so $\operatorname{rank}(A) = k$. But $A$ has $k$ rows; hence, these rows (i.e., the $v_i$) are linearly independent, as claimed.

Sums, Direct Sums, Intersections
4.53. Let $U$ and $W$ be subspaces of a vector space $V$. Show that
(a) $U + W$ is a subspace of $V$.
(b) $U$ and $W$ are contained in $U + W$.
(c) $U + W$ is the smallest subspace containing $U$ and $W$; that is, $U + W = \operatorname{span}(U, W)$.
(d) $W + W = W$.

(a) Because $U$ and $W$ are subspaces, $0 \in U$ and $0 \in W$. Hence, $0 = 0 + 0$ belongs to $U + W$. Now suppose $v, v' \in U + W$. Then $v = u + w$ and $v' = u' + w'$, where $u, u' \in U$ and $w, w' \in W$. Then
\[
av + bv' = (au + bu') + (aw + bw') \in U + W
\]
Thus, $U + W$ is a subspace of $V$.
(b) Let $u \in U$. Because $W$ is a subspace, $0 \in W$. Hence, $u = u + 0$ belongs to $U + W$. Thus, $U \subseteq U + W$. Similarly, $W \subseteq U + W$.
(c) Because $U + W$ is a subspace of $V$ containing $U$ and $W$, it must also contain the linear span of $U$ and $W$. That is, $\operatorname{span}(U, W) \subseteq U + W$. On the other hand, if $v \in U + W$, then $v = u + w = 1u + 1w$, where $u \in U$ and $w \in W$. Thus, $v$ is a linear combination of elements in $U \cup W$, and so $v \in \operatorname{span}(U, W)$. Hence, $U + W \subseteq \operatorname{span}(U, W)$. The two inclusion relations give the desired result.
(d) Because $W$ is a subspace of $V$, we have that $W$ is closed under vector addition; hence, $W + W \subseteq W$. By part (b), $W \subseteq W + W$. Hence, $W + W = W$.

4.54. Consider the following subspaces of $\mathbf{R}^5$:
\[
\begin{aligned}
U &= \operatorname{span}(u_1, u_2, u_3) = \operatorname{span}\{(1, 3, -2, 2, 3), \; (1, 4, -3, 4, 2), \; (2, 3, -1, -2, 9)\} \\
W &= \operatorname{span}(w_1, w_2, w_3) = \operatorname{span}\{(1, 3, 0, 2, 1), \; (1, 5, -6, 6, 3), \; (2, 5, 3, 2, 1)\}
\end{aligned}
\]
Find a basis and the dimension of (a) $U + W$, (b) $U \cap W$.

(a) $U + W$ is the space spanned by all six vectors. Hence, form the matrix whose rows are the given six vectors, and then row reduce to echelon form:
\[
\begin{bmatrix} 1 & 3 & -2 & 2 & 3 \\ 1 & 4 & -3 & 4 & 2 \\ 2 & 3 & -1 & -2 & 9 \\ 1 & 3 & 0 & 2 & 1 \\ 1 & 5 & -6 & 6 & 3 \\ 2 & 5 & 3 & 2 & 1 \end{bmatrix}
\sim \begin{bmatrix} 1 & 3 & -2 & 2 & 3 \\ 0 & 1 & -1 & 2 & -1 \\ 0 & -3 & 3 & -6 & 3 \\ 0 & 0 & 2 & 0 & -2 \\ 0 & 2 & -4 & 4 & 0 \\ 0 & -1 & 7 & -2 & -5 \end{bmatrix}
\sim \begin{bmatrix} 1 & 3 & -2 & 2 & 3 \\ 0 & 1 & -1 & 2 & -1 \\ 0 & 0 & 1 & 0 & -1 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}
\]
The following three nonzero rows of the echelon matrix form a basis of $U + W$:
\[
(1, 3, -2, 2, 3), \quad (0, 1, -1, 2, -1), \quad (0, 0, 1, 0, -1)
\]
Thus, $\dim(U + W) = 3$.
(b) Let $v = (x, y, z, s, t)$ denote an arbitrary element in $\mathbf{R}^5$. First find, say as in Problem 4.51, homogeneous systems whose solution sets are $U$ and $W$, respectively. Let $M$ be the matrix whose columns are the $u_i$ and $v$, and reduce $M$ to echelon form:
\[
M = \begin{bmatrix} 1 & 1 & 2 & x \\ 3 & 4 & 3 & y \\ -2 & -3 & -1 & z \\ 2 & 4 & -2 & s \\ 3 & 2 & 9 & t \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 2 & x \\ 0 & 1 & -3 & -3x + y \\ 0 & 0 & 0 & -x + y + z \\ 0 & 0 & 0 & 4x - 2y + s \\ 0 & 0 & 0 & -6x + y + t \end{bmatrix}
\]
Set the last three entries in the last column equal to zero to obtain the following homogeneous system whose solution set is $U$:
\[
-x + y + z = 0, \qquad 4x - 2y + s = 0, \qquad -6x + y + t = 0
\]
Now let $M'$ be the matrix whose columns are the $w_i$ and $v$, and reduce $M'$ to echelon form:
\[
M' = \begin{bmatrix} 1 & 1 & 2 & x \\ 3 & 5 & 5 & y \\ 0 & -6 & 3 & z \\ 2 & 6 & 2 & s \\ 1 & 3 & 1 & t \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 2 & x \\ 0 & 2 & -1 & -3x + y \\ 0 & 0 & 0 & -9x + 3y + z \\ 0 & 0 & 0 & 4x - 2y + s \\ 0 & 0 & 0 & 2x - y + t \end{bmatrix}
\]
Again set the last three entries in the last column equal to zero to obtain the following homogeneous system whose solution set is $W$:
\[
-9x + 3y + z = 0, \qquad 4x - 2y + s = 0, \qquad 2x - y + t = 0
\]
Combine both of the above systems to obtain a homogeneous system whose solution space is $U \cap W$, and reduce the system to echelon form, yielding
\[
\begin{aligned}
-x + y + z &= 0 \\
2y + 4z + s &= 0 \\
8z + 5s + 2t &= 0 \\
s - 2t &= 0
\end{aligned}
\]
There is one free variable, which is $t$; hence, $\dim(U \cap W) = 1$. Setting $t = 2$, we obtain the solution $u = (1, 4, -3, 4, 2)$, which forms our required basis of $U \cap W$.
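The dimensions found here can be cross-checked with the formula $\dim(U \cap W) = \dim U + \dim W - \dim(U + W)$ of Theorem 4.20. The sketch below is an independent check rather than the method of the solution: it computes each dimension as the rank of the matrix whose rows are the spanning vectors. The helper `rank` is our own.

```python
from fractions import Fraction

def rank(rows):
    """Rank by Gaussian elimination with exact rational arithmetic."""
    M = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(M[0])):
        p = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][c] / M[r][c]
            M[i] = [x - f * y for x, y in zip(M[i], M[r])]
        r += 1
        if r == len(M):
            break
    return r

U = [(1, 3, -2, 2, 3), (1, 4, -3, 4, 2), (2, 3, -1, -2, 9)]
W = [(1, 3, 0, 2, 1), (1, 5, -6, 6, 3), (2, 5, 3, 2, 1)]

dim_U, dim_W = rank(U), rank(W)
dim_sum = rank(U + W)                 # rows of U followed by rows of W span U + W
dim_intersection = dim_U + dim_W - dim_sum
print(dim_U, dim_W, dim_sum, dim_intersection)  # 2 2 3 1
```

Note that $\dim U = \dim W = 2$ even though each space is given by three spanning vectors, which matches the three zero rows in each of the reduced matrices above.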

4.55. Suppose $U$ and $W$ are distinct four-dimensional subspaces of a vector space $V$, where $\dim V = 6$. Find the possible dimensions of $U \cap W$.
Because $U$ and $W$ are distinct, $U + W$ properly contains $U$ and $W$; consequently, $\dim(U + W) > 4$. But $\dim(U + W)$ cannot be greater than 6, as $\dim V = 6$. Hence, we have two possibilities: (a) $\dim(U + W) = 5$ or (b) $\dim(U + W) = 6$. By Theorem 4.20,
\[
\dim(U \cap W) = \dim U + \dim W - \dim(U + W) = 8 - \dim(U + W)
\]
Thus (a) $\dim(U \cap W) = 3$ or (b) $\dim(U \cap W) = 2$.

4.56. Let $U$ and $W$ be the following subspaces of $\mathbf{R}^3$:
\[
U = \{(a, b, c) : a = b = c\} \qquad\text{and}\qquad W = \{(0, b, c)\}
\]
(Note that $W$ is the $yz$-plane.) Show that $\mathbf{R}^3 = U \oplus W$.
First we show that $U \cap W = \{0\}$. Suppose $v = (a, b, c) \in U \cap W$. Then $a = b = c$ and $a = 0$. Hence, $a = 0$, $b = 0$, $c = 0$. Thus, $v = 0 = (0, 0, 0)$.
Next we show that $\mathbf{R}^3 = U + W$. For, if $v = (a, b, c) \in \mathbf{R}^3$, then
\[
v = (a, a, a) + (0, b - a, c - a) \qquad\text{where } (a, a, a) \in U \text{ and } (0, b - a, c - a) \in W
\]
Both conditions, $U \cap W = \{0\}$ and $U + W = \mathbf{R}^3$, imply that $\mathbf{R}^3 = U \oplus W$.
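The decomposition used in this proof is explicit enough to write down as a function. The following minimal sketch (the name `decompose` and the sample vector are our own) splits an arbitrary vector of $\mathbf{R}^3$ into its component in $U$ and its component in $W$.

```python
def decompose(v):
    """Split v = (a, b, c) as u + w with u in U (all entries equal) and w in W (first entry 0)."""
    a, b, c = v
    u = (a, a, a)
    w = (0, b - a, c - a)
    return u, w

u, w = decompose((2, 5, -1))
print(u, w)  # (2, 2, 2) (0, 3, -3)
```

Because the sum is direct, this is the only such splitting: the first coordinate of $v$ forces $u$, and then $w = v - u$.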

4.57. Suppose that $U$ and $W$ are subspaces of a vector space $V$ and that $S = \{u_i\}$ spans $U$ and $S' = \{w_j\}$ spans $W$. Show that $S \cup S'$ spans $U + W$. (Accordingly, by induction, if $S_i$ spans $W_i$, for $i = 1, 2, \ldots, n$, then $S_1 \cup \cdots \cup S_n$ spans $W_1 + \cdots + W_n$.)
Let $v \in U + W$. Then $v = u + w$, where $u \in U$ and $w \in W$. Because $S$ spans $U$, $u$ is a linear combination of the $u_i$, and as $S'$ spans $W$, $w$ is a linear combination of the $w_j$; say
\[
u = a_1 u_{i_1} + a_2 u_{i_2} + \cdots + a_r u_{i_r} \qquad\text{and}\qquad w = b_1 w_{j_1} + b_2 w_{j_2} + \cdots + b_s w_{j_s}
\]
where $a_i, b_j \in K$. Then
\[
v = u + w = a_1 u_{i_1} + a_2 u_{i_2} + \cdots + a_r u_{i_r} + b_1 w_{j_1} + b_2 w_{j_2} + \cdots + b_s w_{j_s}
\]
Accordingly, $S \cup S' = \{u_i, w_j\}$ spans $U + W$.

4.58. Prove Theorem 4.20: Suppose $U$ and $W$ are finite-dimensional subspaces of a vector space $V$. Then $U + W$ has finite dimension and
\[
\dim(U + W) = \dim U + \dim W - \dim(U \cap W)
\]
Observe that $U \cap W$ is a subspace of both $U$ and $W$. Suppose $\dim U = m$, $\dim W = n$, $\dim(U \cap W) = r$. Suppose $\{v_1, \ldots, v_r\}$ is a basis of $U \cap W$. By Theorem 4.16, we can extend $\{v_i\}$ to a basis of $U$ and to a basis of $W$; say
\[
\{v_1, \ldots, v_r, u_1, \ldots, u_{m-r}\} \qquad\text{and}\qquad \{v_1, \ldots, v_r, w_1, \ldots, w_{n-r}\}
\]
are bases of $U$ and $W$, respectively. Let
\[
B = \{v_1, \ldots, v_r, u_1, \ldots, u_{m-r}, w_1, \ldots, w_{n-r}\}
\]
Note that $B$ has exactly $m + n - r$ elements. Thus, the theorem is proved if we can show that $B$ is a basis of $U + W$. Because $\{v_i, u_j\}$ spans $U$ and $\{v_i, w_k\}$ spans $W$, the union $B = \{v_i, u_j, w_k\}$ spans $U + W$. Thus, it suffices to show that $B$ is independent. Suppose
\[
a_1 v_1 + \cdots + a_r v_r + b_1 u_1 + \cdots + b_{m-r} u_{m-r} + c_1 w_1 + \cdots + c_{n-r} w_{n-r} = 0 \qquad (1)
\]
where $a_i$, $b_j$, $c_k$ are scalars. Let
\[
v = a_1 v_1 + \cdots + a_r v_r + b_1 u_1 + \cdots + b_{m-r} u_{m-r} \qquad (2)
\]
By (1), we also have
\[
v = -c_1 w_1 - \cdots - c_{n-r} w_{n-r} \qquad (3)
\]
Because $\{v_i, u_j\} \subseteq U$, $v \in U$ by (2); and as $\{w_k\} \subseteq W$, $v \in W$ by (3). Accordingly, $v \in U \cap W$. Now $\{v_i\}$ is a basis of $U \cap W$, and so there exist scalars $d_1, \ldots, d_r$ for which $v = d_1 v_1 + \cdots + d_r v_r$. Thus, by (3), we have
\[
d_1 v_1 + \cdots + d_r v_r + c_1 w_1 + \cdots + c_{n-r} w_{n-r} = 0
\]
But $\{v_i, w_k\}$ is a basis of $W$, and so is independent. Hence, the above equation forces $c_1 = 0, \ldots, c_{n-r} = 0$. Substituting this into (1), we obtain
\[
a_1 v_1 + \cdots + a_r v_r + b_1 u_1 + \cdots + b_{m-r} u_{m-r} = 0
\]
But $\{v_i, u_j\}$ is a basis of $U$, and so is independent. Hence, the above equation forces $a_1 = 0, \ldots, a_r = 0$, $b_1 = 0, \ldots, b_{m-r} = 0$.
Because (1) implies that the $a_i$, $b_j$, $c_k$ are all 0, $B = \{v_i, u_j, w_k\}$ is independent, and the theorem is proved.


4.59. Prove Theorem 4.21: $V = U \oplus W$ if and only if (i) $V = U + W$, (ii) $U \cap W = \{0\}$.
Suppose $V = U \oplus W$. Then any $v \in V$ can be uniquely written in the form $v = u + w$, where $u \in U$ and $w \in W$. Thus, in particular, $V = U + W$. Now suppose $v \in U \cap W$. Then
(1) $v = v + 0$, where $v \in U$, $0 \in W$;
(2) $v = 0 + v$, where $0 \in U$, $v \in W$.
Because such a sum for $v$ is unique, $v = 0$. Thus, $v = 0 + 0 = 0$ and $U \cap W = \{0\}$.
On the other hand, suppose $V = U + W$ and $U \cap W = \{0\}$. Let $v \in V$. Because $V = U + W$, there exist $u \in U$ and $w \in W$ such that $v = u + w$. We need to show that such a sum is unique. Suppose also that $v = u' + w'$, where $u' \in U$ and $w' \in W$. Then
\[
u + w = u' + w', \qquad\text{and so}\qquad u - u' = w' - w
\]
But $u - u' \in U$ and $w' - w \in W$; hence, by $U \cap W = \{0\}$,
\[
u - u' = 0, \quad w' - w = 0, \qquad\text{and so}\qquad u = u', \quad w = w'
\]
Thus, such a sum for $v \in V$ is unique, and $V = U \oplus W$.

4.60. Prove Theorem 4.22 (for two factors): Suppose $V = U \oplus W$. Also, suppose $S = \{u_1, \ldots, u_m\}$ and $S' = \{w_1, \ldots, w_n\}$ are linearly independent subsets of $U$ and $W$, respectively. Then
(a) The union $S \cup S'$ is linearly independent in $V$.
(b) If $S$ and $S'$ are bases of $U$ and $W$, respectively, then $S \cup S'$ is a basis of $V$.
(c) $\dim V = \dim U + \dim W$.

(a) Suppose $a_1 u_1 + \cdots + a_m u_m + b_1 w_1 + \cdots + b_n w_n = 0$, where $a_i$, $b_j$ are scalars. Then
\[
(a_1 u_1 + \cdots + a_m u_m) + (b_1 w_1 + \cdots + b_n w_n) = 0 = 0 + 0
\]
where $0, \; a_1 u_1 + \cdots + a_m u_m \in U$ and $0, \; b_1 w_1 + \cdots + b_n w_n \in W$. Because such a sum for 0 is unique, this leads to
\[
a_1 u_1 + \cdots + a_m u_m = 0 \qquad\text{and}\qquad b_1 w_1 + \cdots + b_n w_n = 0
\]
Because $S$ is linearly independent, each $a_i = 0$, and because $S'$ is linearly independent, each $b_j = 0$. Thus, $S \cup S'$ is linearly independent.
(b) By part (a), $S \cup S'$ is linearly independent, and, by Problem 4.57, $S \cup S'$ spans $V = U + W$. Thus, $S \cup S'$ is a basis of $V$.
(c) This follows directly from part (b).

Coordinates
4.61. Relative to the basis $S = \{u_1, u_2\} = \{(1, 1), (2, 3)\}$ of $\mathbf{R}^2$, find the coordinate vector of $v$, where (a) $v = (4, -3)$, (b) $v = (a, b)$.
In each case, set
\[
v = x u_1 + y u_2 = x(1, 1) + y(2, 3) = (x + 2y, \; x + 3y)
\]
and then solve for $x$ and $y$.
(a) We have $(4, -3) = (x + 2y, \; x + 3y)$, or $x + 2y = 4$, $x + 3y = -3$. The solution is $x = 18$, $y = -7$. Hence, $[v] = [18, -7]$.
(b) We have $(a, b) = (x + 2y, \; x + 3y)$, or $x + 2y = a$, $x + 3y = b$. The solution is $x = 3a - 2b$, $y = -a + b$. Hence, $[v] = [3a - 2b, \; -a + b]$.
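Finding a coordinate vector in $\mathbf{R}^2$ amounts to solving a $2 \times 2$ linear system. The sketch below solves it by Cramer's rule, which is a swapped-in technique (the solution above simply solves the two equations directly); the function name `coordinates_2d` is our own.

```python
from fractions import Fraction

def coordinates_2d(u1, u2, v):
    """Solve x*u1 + y*u2 = v by Cramer's rule; u1, u2 must form a basis of R^2."""
    det = u1[0] * u2[1] - u2[0] * u1[1]
    if det == 0:
        raise ValueError("u1 and u2 are linearly dependent")
    x = Fraction(v[0] * u2[1] - u2[0] * v[1], det)
    y = Fraction(u1[0] * v[1] - v[0] * u1[1], det)
    return x, y

x, y = coordinates_2d((1, 1), (2, 3), (4, -3))
print(x, y)  # 18 -7
```

For this basis the determinant is 1, which is why the general formulas in part (b), $x = 3a - 2b$ and $y = -a + b$, have integer coefficients.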

4.62. Find the coordinate vector of $v = (a, b, c)$ in $\mathbf{R}^3$ relative to
(a) the usual basis $E = \{(1, 0, 0), (0, 1, 0), (0, 0, 1)\}$,
(b) the basis $S = \{u_1, u_2, u_3\} = \{(1, 1, 1), (1, 1, 0), (1, 0, 0)\}$.

(a) Relative to the usual basis $E$, the coordinates of $v$ are the same as the components of $v$. That is, $[v]_E = [a, b, c]$.
(b) Set $v$ as a linear combination of $u_1$, $u_2$, $u_3$ using unknown scalars $x$, $y$, $z$. This yields
\[
\begin{bmatrix} a \\ b \\ c \end{bmatrix} = x \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + y \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} + z \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}
\qquad\text{or}\qquad
\begin{aligned} x + y + z &= a \\ x + y &= b \\ x &= c \end{aligned}
\]
Solving the system yields $x = c$, $y = b - c$, $z = a - b$. Thus, $[v]_S = [c, \; b - c, \; a - b]$.

4.63. Consider the vector space $P_3(t)$ of polynomials of degree $\le 3$.
(a) Show that $S = \{(t - 1)^3, (t - 1)^2, t - 1, 1\}$ is a basis of $P_3(t)$.
(b) Find the coordinate vector $[v]$ of $v = 3t^3 + 4t^2 + 2t - 5$ relative to $S$.

(a) The degree of $(t - 1)^k$ is $k$; writing the polynomials of $S$ in reverse order, we see that no polynomial is a linear combination of preceding polynomials. Thus, the polynomials are linearly independent, and, because $\dim P_3(t) = 4$, they form a basis of $P_3(t)$.
(b) Set $v$ as a linear combination of the basis vectors using unknown scalars $x$, $y$, $z$, $s$. We have
\[
\begin{aligned}
v = 3t^3 + 4t^2 + 2t - 5 &= x(t - 1)^3 + y(t - 1)^2 + z(t - 1) + s(1) \\
&= x(t^3 - 3t^2 + 3t - 1) + y(t^2 - 2t + 1) + z(t - 1) + s(1) \\
&= xt^3 - 3xt^2 + 3xt - x + yt^2 - 2yt + y + zt - z + s \\
&= xt^3 + (-3x + y)t^2 + (3x - 2y + z)t + (-x + y - z + s)
\end{aligned}
\]
Then set coefficients of the same powers of $t$ equal to each other to obtain
\[
x = 3, \qquad -3x + y = 4, \qquad 3x - 2y + z = 2, \qquad -x + y - z + s = -5
\]
Solving the system yields $x = 3$, $y = 13$, $z = 19$, $s = 4$. Thus, $[v] = [3, 13, 19, 4]$.
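Coordinates relative to powers of $(t - 1)$ can also be obtained without solving a linear system, by dividing repeatedly by $(t - 1)$ and collecting the remainders. This is a different technique from the one used in the solution; the sketch below applies it to the polynomial $3t^3 + 4t^2 + 2t - 5$ expanded in part (b) above. The function name `shift_coefficients` is our own.

```python
def shift_coefficients(coeffs, c):
    """Rewrite a polynomial in powers of (t - c).

    coeffs lists the coefficients from the highest power of t down to the constant.
    Returns the coefficients of (t - c)^n, ..., (t - c), 1 in that order.
    """
    work = list(coeffs)
    result = []
    while work:
        # Synthetic division by (t - c): afterwards the last entry is the remainder
        # and the earlier entries are the coefficients of the quotient.
        for i in range(1, len(work)):
            work[i] += c * work[i - 1]
        result.append(work.pop())
    return result[::-1]

coords = shift_coefficients([3, 4, 2, -5], 1)
print(coords)  # [3, 13, 19, 4]
```

The successive remainders are the coefficients of $1$, $(t - 1)$, $(t - 1)^2$, $(t - 1)^3$, so reversing them gives the coordinate vector in the order of the basis $S$.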

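4.64. Find the coordinate vector of $A = \begin{bmatrix} 2 & 3 \\ 4 & -7 \end{bmatrix}$ in the real vector space $M = M_{2,2}$ relative to
(a) the basis $S = \left\{ \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}, \begin{bmatrix} 1 & -1 \\ 1 & 0 \end{bmatrix}, \begin{bmatrix} 1 & -1 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \right\}$,
(b) the usual basis $E = \left\{ \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \right\}$.

(a) Set $A$ as a linear combination of the basis vectors using unknown scalars $x$, $y$, $z$, $t$ as follows:
\[
A = \begin{bmatrix} 2 & 3 \\ 4 & -7 \end{bmatrix}
= x \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} + y \begin{bmatrix} 1 & -1 \\ 1 & 0 \end{bmatrix} + z \begin{bmatrix} 1 & -1 \\ 0 & 0 \end{bmatrix} + t \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}
= \begin{bmatrix} x + y + z + t & x - y - z \\ x + y & x \end{bmatrix}
\]
Set corresponding entries equal to each other to obtain the system
\[
x + y + z + t = 2, \qquad x - y - z = 3, \qquad x + y = 4, \qquad x = -7
\]
Solving the system yields $x = -7$, $y = 11$, $z = -21$, $t = 19$. Thus, $[A]_S = [-7, 11, -21, 19]$. (Note that the coordinate vector of $A$ is a vector in $\mathbf{R}^4$, because $\dim M = 4$.)
(b) Expressing $A$ as a linear combination of the basis matrices yields
\[
\begin{bmatrix} 2 & 3 \\ 4 & -7 \end{bmatrix}
= x \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} + y \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} + z \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} + t \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}
= \begin{bmatrix} x & y \\ z & t \end{bmatrix}
\]
Thus, $x = 2$, $y = 3$, $z = 4$, $t = -7$. Hence, $[A] = [2, 3, 4, -7]$, whose components are the elements of $A$ written row by row.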
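A coordinate vector is easy to verify by substituting it back into the linear combination. The sketch below assumes the four basis matrices of $S$ as they appear in the linear combination of part (a); it solves the four entry equations from the last one upward and then rebuilds the matrix from the coordinates. The helper names are our own.

```python
# Basis S as read from the linear combination in part (a) (an assumption about the
# intended basis), each matrix stored as a list of rows.
S = [
    [[1, 1], [1, 1]],
    [[1, -1], [1, 0]],
    [[1, -1], [0, 0]],
    [[1, 0], [0, 0]],
]

def coordinates_in_S(A):
    """Solve x*S1 + y*S2 + z*S3 + t*S4 = A entry by entry, bottom-right entry first."""
    x = A[1][1]                 # bottom-right entry:  x
    y = A[1][0] - x             # bottom-left entry:   x + y
    z = x - y - A[0][1]         # top-right entry:     x - y - z
    t = A[0][0] - (x + y + z)   # top-left entry:      x + y + z + t
    return [x, y, z, t]

def combine(coeffs, matrices):
    """Form the linear combination sum(c_k * M_k) of equally sized matrices."""
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [[sum(c * M[i][j] for c, M in zip(coeffs, matrices)) for j in range(cols)]
            for i in range(rows)]

A = [[2, 3], [4, -7]]
coords = coordinates_in_S(A)
print(coords, combine(coords, S) == A)
```

Substituting back is the decisive test here: a candidate coordinate vector is correct exactly when `combine` reproduces every entry of $A$.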

Remark: This result is true in general; that is, if $A$ is any $m \times n$ matrix in $M = M_{m,n}$, then the coordinates of $A$ relative to the usual basis of $M$ are the elements of $A$ written row by row.

4.65. In the space $M = M_{2,3}$, determine whether or not the following matrices are linearly dependent:
\[
A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 0 & 5 \end{bmatrix}, \qquad
B = \begin{bmatrix} 2 & 4 & 7 \\ 10 & 1 & 13 \end{bmatrix}, \qquad
C = \begin{bmatrix} 1 & 2 & 5 \\ 8 & 2 & 11 \end{bmatrix}
\]
If the matrices are linearly dependent, find the dimension and a basis of the subspace $W$ of $M$ spanned by the matrices.
The coordinate vectors of the above matrices relative to the usual basis of $M$ are as follows:
\[
[A] = [1, 2, 3, 4, 0, 5], \qquad [B] = [2, 4, 7, 10, 1, 13], \qquad [C] = [1, 2, 5, 8, 2, 11]
\]
Form the matrix $M$ whose rows are the above coordinate vectors, and reduce $M$ to echelon form:
\[
M = \begin{bmatrix} 1 & 2 & 3 & 4 & 0 & 5 \\ 2 & 4 & 7 & 10 & 1 & 13 \\ 1 & 2 & 5 & 8 & 2 & 11 \end{bmatrix}
\sim \begin{bmatrix} 1 & 2 & 3 & 4 & 0 & 5 \\ 0 & 0 & 1 & 2 & 1 & 3 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}
\]
Because the echelon matrix has only two nonzero rows, the coordinate vectors $[A]$, $[B]$, $[C]$ span a space of dimension two, and so they are linearly dependent. Thus, $A$, $B$, $C$ are linearly dependent. Furthermore, $\dim W = 2$, and the matrices
\[
w_1 = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 0 & 5 \end{bmatrix} \qquad\text{and}\qquad w_2 = \begin{bmatrix} 0 & 0 & 1 \\ 2 & 1 & 3 \end{bmatrix}
\]
corresponding to the nonzero rows of the echelon matrix form a basis of $W$.
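The dependence can be made explicit. In the row reduction, subtracting twice the first row from the second and the first row from the third leaves a third row that is twice the second, which suggests the relation $C = 2B - 3A$; this relation is our own reading of the reduction and is not stated in the text. The sketch below flattens each matrix row by row into its coordinate vector and checks the relation.

```python
def coordinate_vector(matrix):
    """Coordinates relative to the usual basis: the entries written row by row."""
    return [entry for row in matrix for entry in row]

A = [[1, 2, 3], [4, 0, 5]]
B = [[2, 4, 7], [10, 1, 13]]
C = [[1, 2, 5], [8, 2, 11]]

a, b, c = (coordinate_vector(M) for M in (A, B, C))
candidate = [2 * bi - 3 * ai for ai, bi in zip(a, b)]
print(candidate == c)  # True, so C = 2B - 3A
```

Any nontrivial relation of this kind shows the three matrices are linearly dependent, in agreement with the two nonzero rows of the echelon matrix.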

Miscellaneous Problems
4.66. Consider a finite sequence of vectors $S = \{v_1, v_2, \ldots, v_n\}$. Let $T$ be the sequence of vectors obtained from $S$ by one of the following "elementary operations": (i) interchange two vectors, (ii) multiply a vector by a nonzero scalar, (iii) add a multiple of one vector to another. Show that $S$ and $T$ span the same space $W$. Also show that $T$ is independent if and only if $S$ is independent.
Observe that, for each operation, the vectors in $T$ are linear combinations of vectors in $S$. On the other hand, each operation has an inverse of the same type (Prove!); hence, the vectors in $S$ are linear combinations of vectors in $T$. Thus $S$ and $T$ span the same space $W$. Also, $T$ is independent if and only if $\dim W = n$, and this is true if and only if $S$ is also independent.

4.67. Let $A = [a_{ij}]$ and $B = [b_{ij}]$ be row equivalent $m \times n$ matrices over a field $K$, and let $v_1, \ldots, v_n$ be any vectors in a vector space $V$ over $K$. Let
\[
\begin{aligned}
u_1 &= a_{11} v_1 + a_{12} v_2 + \cdots + a_{1n} v_n & \qquad w_1 &= b_{11} v_1 + b_{12} v_2 + \cdots + b_{1n} v_n \\
u_2 &= a_{21} v_1 + a_{22} v_2 + \cdots + a_{2n} v_n & \qquad w_2 &= b_{21} v_1 + b_{22} v_2 + \cdots + b_{2n} v_n \\
&\;\;\vdots & &\;\;\vdots \\
u_m &= a_{m1} v_1 + a_{m2} v_2 + \cdots + a_{mn} v_n & \qquad w_m &= b_{m1} v_1 + b_{m2} v_2 + \cdots + b_{mn} v_n
\end{aligned}
\]
Show that $\{u_i\}$ and $\{w_i\}$ span the same space.
Applying an "elementary operation" of Problem 4.66 to $\{u_i\}$ is equivalent to applying an elementary row operation to the matrix $A$. Because $A$ and $B$ are row equivalent, $B$ can be obtained from $A$ by a sequence of elementary row operations; hence, $\{w_i\}$ can be obtained from $\{u_i\}$ by the corresponding sequence of operations. Accordingly, $\{u_i\}$ and $\{w_i\}$ span the same space.

4.68. Let $v_1, \ldots, v_n$ belong to a vector space $V$ over $K$, and let $P = [a_{ij}]$ be an $n$-square matrix over $K$. Let
\[
w_1 = a_{11} v_1 + a_{12} v_2 + \cdots + a_{1n} v_n, \quad \ldots, \quad w_n = a_{n1} v_1 + a_{n2} v_2 + \cdots + a_{nn} v_n
\]
(a) Suppose $P$ is invertible. Show that $\{w_i\}$ and $\{v_i\}$ span the same space; hence, $\{w_i\}$ is independent if and only if $\{v_i\}$ is independent.
(b) Suppose $P$ is not invertible. Show that $\{w_i\}$ is dependent.
(c) Suppose $\{w_i\}$ is independent. Show that $P$ is invertible.

(a) Because $P$ is invertible, it is row equivalent to the identity matrix $I$. Hence, by Problem 4.67, $\{w_i\}$ and $\{v_i\}$ span the same space. Thus, one is independent if and only if the other is.
(b) Because $P$ is not invertible, it is row equivalent to a matrix with a zero row. This means that $\{w_i\}$ spans a space that has a spanning set of fewer than $n$ elements. Thus, $\{w_i\}$ is dependent.
(c) This is the contrapositive of the statement of (b), and so it follows from (b).

4.69. Suppose that $A_1, A_2, \ldots$ are linearly independent sets of vectors, and that $A_1 \subseteq A_2 \subseteq \cdots$. Show that the union $A = A_1 \cup A_2 \cup \cdots$ is also linearly independent.
Suppose $A$ is linearly dependent. Then there exist vectors $v_1, \ldots, v_n \in A$ and scalars $a_1, \ldots, a_n \in K$, not all of them 0, such that
\[
a_1 v_1 + a_2 v_2 + \cdots + a_n v_n = 0 \qquad (1)
\]
Because $A = \bigcup A_i$ and the $v_i \in A$, there exist sets $A_{i_1}, \ldots, A_{i_n}$ such that
\[
v_1 \in A_{i_1}, \quad v_2 \in A_{i_2}, \quad \ldots, \quad v_n \in A_{i_n}
\]
Let $k$ be the maximum index of the sets $A_{i_j}$: $k = \max(i_1, \ldots, i_n)$. It follows then, as $A_1 \subseteq A_2 \subseteq \cdots$, that each $A_{i_j}$ is contained in $A_k$. Hence, $v_1, v_2, \ldots, v_n \in A_k$, and so, by (1), $A_k$ is linearly dependent, which contradicts our hypothesis. Thus, $A$ is linearly independent.

4.70. Let $K$ be a subfield of a field $L$, and let $L$ be a subfield of a field $E$. (Thus, $K \subseteq L \subseteq E$, and $K$ is a subfield of $E$.) Suppose $E$ is of dimension $n$ over $L$, and $L$ is of dimension $m$ over $K$. Show that $E$ is of dimension $mn$ over $K$.
Suppose $\{v_1, \ldots, v_n\}$ is a basis of $E$ over $L$ and $\{a_1, \ldots, a_m\}$ is a basis of $L$ over $K$. We claim that $\{a_i v_j : i = 1, \ldots, m, \; j = 1, \ldots, n\}$ is a basis of $E$ over $K$. Note that $\{a_i v_j\}$ contains $mn$ elements.
Let $w$ be any arbitrary element in $E$. Because $\{v_1, \ldots, v_n\}$ spans $E$ over $L$, $w$ is a linear combination of the $v_i$ with coefficients in $L$:
\[
w = b_1 v_1 + b_2 v_2 + \cdots + b_n v_n, \qquad b_i \in L \qquad (1)
\]
Because $\{a_1, \ldots, a_m\}$ spans $L$ over $K$, each $b_i \in L$ is a linear combination of the $a_j$ with coefficients in $K$:
\[
\begin{aligned}
b_1 &= k_{11} a_1 + k_{12} a_2 + \cdots + k_{1m} a_m \\
b_2 &= k_{21} a_1 + k_{22} a_2 + \cdots + k_{2m} a_m \\
&\;\;\vdots \\
b_n &= k_{n1} a_1 + k_{n2} a_2 + \cdots + k_{nm} a_m
\end{aligned}
\]
where $k_{ij} \in K$. Substituting in (1), we obtain
\[
\begin{aligned}
w &= (k_{11} a_1 + \cdots + k_{1m} a_m) v_1 + (k_{21} a_1 + \cdots + k_{2m} a_m) v_2 + \cdots + (k_{n1} a_1 + \cdots + k_{nm} a_m) v_n \\
&= k_{11} a_1 v_1 + \cdots + k_{1m} a_m v_1 + k_{21} a_1 v_2 + \cdots + k_{2m} a_m v_2 + \cdots + k_{n1} a_1 v_n + \cdots + k_{nm} a_m v_n \\
&= \sum_{i,j} k_{ji} (a_i v_j)
\end{aligned}
\]
where $k_{ji} \in K$. Thus, $w$ is a linear combination of the $a_i v_j$ with coefficients in $K$; hence, $\{a_i v_j\}$ spans $E$ over $K$.
The proof is complete if we show that $\{a_i v_j\}$ is linearly independent over $K$. Suppose, for scalars $x_{ji} \in K$, we have $\sum_{i,j} x_{ji} (a_i v_j) = 0$; that is,
\[
(x_{11} a_1 v_1 + x_{12} a_2 v_1 + \cdots + x_{1m} a_m v_1) + \cdots + (x_{n1} a_1 v_n + x_{n2} a_2 v_n + \cdots + x_{nm} a_m v_n) = 0
\]
or
\[
(x_{11} a_1 + x_{12} a_2 + \cdots + x_{1m} a_m) v_1 + \cdots + (x_{n1} a_1 + x_{n2} a_2 + \cdots + x_{nm} a_m) v_n = 0
\]
Because $\{v_1, \ldots, v_n\}$ is linearly independent over $L$ and the above coefficients of the $v_i$ belong to $L$, each coefficient must be 0:
\[
x_{11} a_1 + x_{12} a_2 + \cdots + x_{1m} a_m = 0, \quad \ldots, \quad x_{n1} a_1 + x_{n2} a_2 + \cdots + x_{nm} a_m = 0
\]
But $\{a_1, \ldots, a_m\}$ is linearly independent over $K$; hence, because the $x_{ji} \in K$,
\[
x_{11} = 0, \; x_{12} = 0, \; \ldots, \; x_{1m} = 0, \quad \ldots, \quad x_{n1} = 0, \; x_{n2} = 0, \; \ldots, \; x_{nm} = 0
\]
Accordingly, $\{a_i v_j\}$ is linearly independent over $K$, and the theorem is proved.

SUPPLEMENTARY PROBLEMS
Vector Spaces
4.71. Suppose $u$ and $v$ belong to a vector space $V$. Simplify each of the following expressions:
(a) $E_1 = 4(5u - 6v) + 2(3u + v)$, (b) $E_2 = 5(2u - 3v) + 4(7v + 8)$, (c) $E_3 = 6(3u + 2v) + 5u - 7v$, (d) $E_4 = 3(5u + 2/v)$.

4.72. Let $V$ be the set of ordered pairs $(a, b)$ of real numbers with addition in $V$ and scalar multiplication on $V$ defined by
\[
(a, b) + (c, d) = (a + c, \; b + d) \qquad\text{and}\qquad k(a, b) = (ka, 0)
\]
Show that $V$ satisfies all the axioms of a vector space except [M4], that is, except $1u = u$. Hence, [M4] is not a consequence of the other axioms.

4.73. Show that Axiom [A4] of a vector space $V$ (that $u + v = v + u$) can be derived from the other axioms for $V$.

4.74. Let $V$ be the set of ordered pairs $(a, b)$ of real numbers. Show that $V$ is not a vector space over $\mathbf{R}$ with addition and scalar multiplication defined by
(i) $(a, b) + (c, d) = (a + d, \; b + c)$ and $k(a, b) = (ka, kb)$,
(ii) $(a, b) + (c, d) = (a + c, \; b + d)$ and $k(a, b) = (a, b)$,
(iii) $(a, b) + (c, d) = (0, 0)$ and $k(a, b) = (ka, kb)$,
(iv) $(a, b) + (c, d) = (ac, bd)$ and $k(a, b) = (ka, kb)$.

4.75. Let $V$ be the set of infinite sequences $(a_1, a_2, \ldots)$ in a field $K$. Show that $V$ is a vector space over $K$ with addition and scalar multiplication defined by
\[
(a_1, a_2, \ldots) + (b_1, b_2, \ldots) = (a_1 + b_1, \; a_2 + b_2, \; \ldots) \qquad\text{and}\qquad k(a_1, a_2, \ldots) = (ka_1, ka_2, \ldots)
\]

4.76. Let $U$ and $W$ be vector spaces over a field $K$. Let $V$ be the set of ordered pairs $(u, w)$ where $u \in U$ and $w \in W$. Show that $V$ is a vector space over $K$ with addition in $V$ and scalar multiplication on $V$ defined by
\[
(u, w) + (u', w') = (u + u', \; w + w') \qquad\text{and}\qquad k(u, w) = (ku, kw)
\]
(This space $V$ is called the external direct product of $U$ and $W$.)

Subspaces
4.77. Determine whether or not $W$ is a subspace of $\mathbf{R}^3$ where $W$ consists of all vectors $(a, b, c)$ in $\mathbf{R}^3$ such that (a) $a = 3b$, (b) $a \le b \le c$, (c) $ab = 0$, (d) $a + b + c = 0$, (e) $b = a^2$, (f) $a = 2b = 3c$.

4.78. Let $V$ be the vector space of $n$-square matrices over a field $K$. Show that $W$ is a subspace of $V$ if $W$ consists of all matrices $A = [a_{ij}]$ that are (a) symmetric ($A^T = A$ or $a_{ij} = a_{ji}$), (b) (upper) triangular, (c) diagonal, (d) scalar.

4.79. Let $AX = B$ be a nonhomogeneous system of linear equations in $n$ unknowns; that is, $B \ne 0$. Show that the solution set is not a subspace of $K^n$.

4.80. Suppose $U$ and $W$ are subspaces of $V$ for which $U \cup W$ is a subspace. Show that $U \subseteq W$ or $W \subseteq U$.

4.81. Let $V$ be the vector space of all functions from the real field $\mathbf{R}$ into $\mathbf{R}$. Show that $W$ is a subspace of $V$ where $W$ consists of all: (a) bounded functions, (b) even functions. [Recall that $f : \mathbf{R} \to \mathbf{R}$ is bounded if $\exists M \in \mathbf{R}$ such that $\forall x \in \mathbf{R}$, we have $|f(x)| \le M$; and $f(x)$ is even if $f(-x) = f(x)$, $\forall x \in \mathbf{R}$.]

4.82. Let $V$ be the vector space (Problem 4.75) of infinite sequences $(a_1, a_2, \ldots)$ in a field $K$. Show that $W$ is a subspace of $V$ if $W$ consists of all sequences with (a) 0 as the first element, (b) only a finite number of nonzero elements.

Linear Combinations, Linear Spans
4.83. Consider the vectors $u = (1, 2, 3)$ and $v = (2, 3, 1)$ in $\mathbf{R}^3$.
(a) Write $w = (1, 3, 8)$ as a linear combination of $u$ and $v$.
(b) Write $w = (2, 4, 5)$ as a linear combination of $u$ and $v$.
(c) Find $k$ so that $w = (1, k, 4)$ is a linear combination of $u$ and $v$.
(d) Find conditions on $a$, $b$, $c$ so that $w = (a, b, c)$ is a linear combination of $u$ and $v$.

4.84. Write the polynomial $f(t) = at^2 + bt + c$ as a linear combination of the polynomials $p_1 = (t - 1)^2$, $p_2 = t - 1$, $p_3 = 1$. [Thus, $p_1$, $p_2$, $p_3$ span the space $P_2(t)$ of polynomials of degree $\le 2$.]

4.85. Find one vector in $\mathbf{R}^3$ that spans the intersection of $U$ and $W$ where $U$ is the $xy$-plane, that is, $U = \{(a, b, 0)\}$, and $W$ is the space spanned by the vectors $(1, 1, 1)$ and $(1, 2, 3)$.

4.86. Prove that $\operatorname{span}(S)$ is the intersection of all subspaces of $V$ containing $S$.

4.87. Show that $\operatorname{span}(S) = \operatorname{span}(S \cup \{0\})$. That is, by joining or deleting the zero vector from a set, we do not change the space spanned by the set.

4.88. Show that (a) If $S \subseteq T$, then $\operatorname{span}(S) \subseteq \operatorname{span}(T)$. (b) $\operatorname{span}[\operatorname{span}(S)] = \operatorname{span}(S)$.

Linear Dependence and Linear Independence
4.89. Determine whether the following vectors in $\mathbf{R}^4$ are linearly dependent or independent:
(a) $(1, 2, -3, 1)$, $(3, 7, 1, -2)$, $(1, 3, 7, -4)$; (b) $(1, 3, 1, -2)$, $(2, 5, -1, 3)$, $(1, 3, 7, -2)$.

4.90. Determine whether the following polynomials $u$, $v$, $w$ in $P(t)$ are linearly dependent or independent:
(a) $u = t^3 - 4t^2 + 3t + 3$, $v = t^3 + 2t^2 + 4t - 1$, $w = 2t^3 - t^2 - 3t + 5$;
(b) $u = t^3 - 5t^2 - 2t + 3$, $v = t^3 - 4t^2 - 3t + 4$, $w = 2t^3 - 17t^2 - 7t + 9$.

4.91. Show that the following functions $f$, $g$, $h$ are linearly independent:
(a) $f(t) = e^t$, $g(t) = \sin t$, $h(t) = t^2$; (b) $f(t) = e^t$, $g(t) = e^{2t}$, $h(t) = t$.

4.92. Show that $u = (a, b)$ and $v = (c, d)$ in $K^2$ are linearly dependent if and only if $ad - bc = 0$.

4.93. Suppose $u$, $v$, $w$ are linearly independent vectors. Determine whether $S$ is linearly independent, where
(a) $S = \{u + v - 2w, \; u - v - w, \; u + w\}$; (b) $S = \{u + v - 3w, \; u + 3v - w, \; v + w\}$.

4.94. Suppose $\{u_1, \ldots, u_r, w_1, \ldots, w_s\}$ is a linearly independent subset of $V$. Show that
\[
\operatorname{span}(u_i) \cap \operatorname{span}(w_j) = \{0\}
\]

4.95. Suppose $v_1, v_2, \ldots, v_n$ are linearly independent. Prove that $S$ is linearly independent where
(a) $S = \{a_1 v_1, a_2 v_2, \ldots, a_n v_n\}$ and each $a_i \ne 0$.
(b) $S = \{v_1, \ldots, v_{k-1}, w, v_{k+1}, \ldots, v_n\}$ and $w = \sum_i b_i v_i$ and $b_k \ne 0$.

4.96. Suppose $(a_{11}, \ldots, a_{1n}), (a_{21}, \ldots, a_{2n}), \ldots, (a_{m1}, \ldots, a_{mn})$ are linearly independent vectors in $K^n$, and suppose $v_1, v_2, \ldots, v_n$ are linearly independent vectors in a vector space $V$ over $K$. Show that the following vectors are also linearly independent:
\[
w_1 = a_{11} v_1 + \cdots + a_{1n} v_n, \quad w_2 = a_{21} v_1 + \cdots + a_{2n} v_n, \quad \ldots, \quad w_m = a_{m1} v_1 + \cdots + a_{mn} v_n
\]

Basis and Dimension
4.97. Find a subset of $u_1$, $u_2$, $u_3$, $u_4$ that gives a basis for $W = \operatorname{span}(u_i)$ of $\mathbf{R}^5$, where
(a) $u_1 = (1, 1, 1, 2, 3)$, $u_2 = (1, 2, -1, -2, 1)$, $u_3 = (3, 5, -1, -2, 5)$, $u_4 = (1, 2, 1, -1, 4)$
(b) $u_1 = (1, -2, 1, 3, -1)$, $u_2 = (-2, 4, -2, -6, 2)$, $u_3 = (1, -3, 1, 2, 1)$, $u_4 = (3, -7, 3, 8, -1)$
(c) $u_1 = (1, 0, 1, 0, 1)$, $u_2 = (1, 1, 2, 1, 0)$, $u_3 = (2, 1, 3, 1, 1)$, $u_4 = (1, 2, 1, 1, 1)$
(d) $u_1 = (1, 0, 1, 1, 1)$, $u_2 = (2, 1, 2, 0, 1)$, $u_3 = (1, 1, 2, 3, 4)$, $u_4 = (4, 2, 5, 4, 6)$

4.98. Consider the subspaces $U = \{(a, b, c, d) : b - 2c + d = 0\}$ and $W = \{(a, b, c, d) : a = d, \; b = 2c\}$ of $\mathbf{R}^4$. Find a basis and the dimension of (a) $U$, (b) $W$, (c) $U \cap W$.

4.99. Find a basis and the dimension of the solution space $W$ of each of the following homogeneous systems:
(a) $x + 2y - 2z + 2s - t = 0$, $x + 2y - z + 3s - 2t = 0$, $2x + 4y - 7z + s + t = 0$
(b) $x + 2y - z + 3s - 4t = 0$, $2x + 4y - 2z - s + 5t = 0$, $2x + 4y - 2z + 4s - 2t = 0$

4.100. Find a homogeneous system whose solution space is spanned by the following sets of three vectors:
(a) $(1, -2, 0, 3, -1)$, $(2, -3, 2, 5, -3)$, $(1, -2, 1, 2, -2)$; (b) $(1, 1, 2, 1, 1)$, $(1, 2, 1, 4, 3)$, $(3, 5, 4, 9, 7)$.

4.101. Determine whether each of the following is a basis of the vector space $P_n(t)$:
(a) $\{1, \; 1 + t, \; 1 + t + t^2, \; 1 + t + t^2 + t^3, \; \ldots, \; 1 + t + t^2 + \cdots + t^{n-1} + t^n\}$;
(b) $\{1 + t, \; t + t^2, \; t^2 + t^3, \; \ldots, \; t^{n-2} + t^{n-1}, \; t^{n-1} + t^n\}$.

4.102. Find a basis and the dimension of the subspace $W$ of $P(t)$ spanned by
(a) $u = t^3 + 2t^2 - 2t + 1$, $v = t^3 + 3t^2 - 3t + 4$, $w = 2t^3 + t^2 - 7t - 7$,
(b) $u = t^3 + t^2 - 3t + 2$, $v = 2t^3 + t^2 + t - 4$, $w = 4t^3 + 3t^2 - 5t + 2$.

4.103. Find a basis and the dimension of the subspace $W$ of $V = M_{2,2}$ spanned by
\[
A = \begin{bmatrix} 1 & -5 \\ -4 & 2 \end{bmatrix}, \quad
B = \begin{bmatrix} 1 & 1 \\ -1 & 5 \end{bmatrix}, \quad
C = \begin{bmatrix} 2 & -4 \\ -5 & 7 \end{bmatrix}, \quad
D = \begin{bmatrix} 1 & -7 \\ -5 & 1 \end{bmatrix}
\]

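Rank of a Matrix, Row and Column Spaces
4.104. Find the rank of each of the following matrices:
\[
\text{(a)} \begin{bmatrix} 1 & 3 & -2 & 5 & 4 \\ 1 & 4 & 1 & 3 & 5 \\ 1 & 4 & 2 & 4 & 3 \\ 2 & 7 & -3 & 6 & 13 \end{bmatrix}, \qquad
\text{(b)} \begin{bmatrix} 1 & 2 & -3 & -2 \\ 1 & 3 & -2 & 0 \\ 3 & 8 & -7 & -2 \\ 2 & 1 & -9 & -10 \end{bmatrix}, \qquad
\text{(c)} \begin{bmatrix} 1 & 1 & 2 \\ 4 & 5 & 5 \\ 5 & 8 & 1 \\ -1 & -2 & 2 \end{bmatrix}
\]

4.105. For $k = 1, 2, \ldots, 5$, find the number $n_k$ of linearly independent subsets consisting of $k$ columns for each of the following matrices:
\[
\text{(a)} \; A = \begin{bmatrix} 1 & 1 & 0 & 2 & 3 \\ 1 & 2 & 0 & 2 & 5 \\ 1 & 3 & 0 & 2 & 7 \end{bmatrix}, \qquad
\text{(b)} \; B = \begin{bmatrix} 1 & 2 & 1 & 0 & 2 \\ 1 & 2 & 3 & 0 & 4 \\ 1 & 1 & 5 & 0 & 6 \end{bmatrix}
\]

4.106. Let
\[
\text{(a)} \; A = \begin{bmatrix} 1 & 2 & 1 & 3 & 1 & 6 \\ 2 & 4 & 3 & 8 & 3 & 15 \\ 1 & 2 & 2 & 5 & 3 & 11 \\ 4 & 8 & 6 & 16 & 7 & 32 \end{bmatrix}, \qquad
\text{(b)} \; B = \begin{bmatrix} 1 & 2 & 2 & 1 & 2 & 1 \\ 2 & 4 & 5 & 4 & 5 & 5 \\ 1 & 2 & 3 & 4 & 4 & 6 \\ 3 & 6 & 7 & 7 & 9 & 10 \end{bmatrix}
\]
For each matrix (where $C_1, \ldots, C_6$ denote its columns):
(i) Find its row canonical form $M$.
(ii) Find the columns that are linear combinations of preceding columns.
(iii) Find columns (excluding $C_6$) that form a basis for the column space.
(iv) Express $C_6$ as a linear combination of the basis vectors obtained in (iii).

4.107. Determine which of the following matrices have the same row space:
\[
A = \begin{bmatrix} 1 & -2 & -1 \\ 3 & -4 & 5 \end{bmatrix}, \qquad
B = \begin{bmatrix} 1 & -1 & 2 \\ 2 & 3 & -1 \end{bmatrix}, \qquad
C = \begin{bmatrix} 1 & -1 & 3 \\ 2 & -1 & 10 \\ 3 & -5 & 1 \end{bmatrix}
\]

4.108. Determine which of the following subspaces of $\mathbf{R}^3$ are identical:
$U_1 = \operatorname{span}[(1, 1, -1), (2, 3, -1), (3, 1, -5)]$, $U_2 = \operatorname{span}[(1, -1, -3), (3, -2, -8), (2, 1, -3)]$, $U_3 = \operatorname{span}[(1, 1, 1), (1, -1, 3), (3, -1, 7)]$

4.109. Determine which of the following subspaces of $\mathbf{R}^4$ are identical:
$U_1 = \operatorname{span}[(1, 2, 1, 4), (2, 4, 1, 5), (3, 6, 2, 9)]$, $U_2 = \operatorname{span}[(1, 2, 1, 2), (2, 4, 1, 3)]$, $U_3 = \operatorname{span}[(1, 2, 3, 10), (2, 4, 3, 11)]$

4.110. Find a basis for (i) the row space and (ii) the column space of each matrix $M$:
\[
\text{(a)} \; M = \begin{bmatrix} 0 & 0 & 3 & 1 & 4 \\ 1 & 3 & 1 & 2 & 1 \\ 3 & 9 & 4 & 5 & 2 \\ 4 & 12 & 8 & 8 & 7 \end{bmatrix}, \qquad
\text{(b)} \; M = \begin{bmatrix} 1 & 2 & 1 & 0 & 1 \\ 1 & 2 & 2 & 1 & 3 \\ 3 & 6 & 5 & 2 & 7 \\ 2 & 4 & 1 & -1 & 0 \end{bmatrix}
\]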

4.111. Show that if any row is deleted from a matrix in echelon (respectively, row canonical) form, then the resulting matrix is still in echelon (respectively, row canonical) form.

4.112. Let $A$ and $B$ be arbitrary $m \times n$ matrices. Show that $\operatorname{rank}(A + B) \le \operatorname{rank}(A) + \operatorname{rank}(B)$.

4.113. Let $r = \operatorname{rank}(A + B)$. Find $2 \times 2$ matrices $A$ and $B$ such that (a) $r < \operatorname{rank}(A), \operatorname{rank}(B)$; (b) $r = \operatorname{rank}(A) = \operatorname{rank}(B)$; (c) $r > \operatorname{rank}(A), \operatorname{rank}(B)$.

Sums, Direct Sums, Intersections
4.114. Suppose U and W are two-dimensional subspaces of K 3 . Show that U \ W ¼ f0g. 6 4.115. Suppose U and W are subspaces of V such that dim U ¼ 4, dim W ¼ 5, and dim V ¼ 7. Find the possible dimensions of U \ W. 4.116. Let U and W be subspaces of R3 for which dim U ¼ 1, dim W ¼ 2, and U 6 W. Show that R3 ¼ U È W. 4.117. Consider the following subspaces of R5 : U ¼ span½ð1; À1; À1; À2; 0Þ; W ¼ span½ð1; À2; À3; 0; À2Þ; ð1; À2; À2; 0; À3Þ; ð1; À1; À3; 2; À4Þ; ð1; À1; À2; À2; 1ފ ð1; À1; À2; 2; À5ފ


CHAPTER 4 Vector Spaces
(a) Find two homogeneous systems whose solution spaces are $U$ and $W$, respectively. (b) Find a basis and the dimension of $U \cap W$.

4.118. Let $U_1$, $U_2$, $U_3$ be the following subspaces of $\mathbf{R}^3$:
\[ U_1 = \{(a, b, c) : a = c\}, \qquad U_2 = \{(a, b, c) : a + b + c = 0\}, \qquad U_3 = \{(0, 0, c)\} \]
Show that (a) $\mathbf{R}^3 = U_1 + U_2$, (b) $\mathbf{R}^3 = U_2 + U_3$, (c) $\mathbf{R}^3 = U_1 + U_3$. When is the sum direct?

4.119. Suppose $U$, $W_1$, $W_2$ are subspaces of a vector space $V$. Show that
\[ (U \cap W_1) + (U \cap W_2) \subseteq U \cap (W_1 + W_2) \]
Find subspaces of $\mathbf{R}^2$ for which equality does not hold.

4.120. Suppose $W_1, W_2, \ldots, W_r$ are subspaces of a vector space $V$. Show that
(a) $\operatorname{span}(W_1, W_2, \ldots, W_r) = W_1 + W_2 + \cdots + W_r$.
(b) If $S_i$ spans $W_i$ for $i = 1, \ldots, r$, then $S_1 \cup S_2 \cup \cdots \cup S_r$ spans $W_1 + W_2 + \cdots + W_r$.

4.121. Suppose $V = U \oplus W$. Show that $\dim V = \dim U + \dim W$.

4.122. Let $S$ and $T$ be arbitrary nonempty subsets (not necessarily subspaces) of a vector space $V$ and let $k$ be a scalar. The sum $S + T$ and the scalar product $kS$ are defined by
\[ S + T = \{u + v : u \in S,\ v \in T\}, \qquad kS = \{ku : u \in S\} \]
[We also write $w + S$ for $\{w\} + S$.] Let
\[ S = \{(1, 2),\ (2, 3)\}, \quad T = \{(1, 4),\ (1, 5),\ (2, 5)\}, \quad w = (1, 1), \quad k = 3 \]
Find: (a) $S + T$, (b) $w + S$, (c) $kS$, (d) $kT$, (e) $kS + kT$, (f) $k(S + T)$.

4.123. Show that the above operations of $S + T$ and $kS$ satisfy
(a) Commutative law: $S + T = T + S$.
(b) Associative law: $(S_1 + S_2) + S_3 = S_1 + (S_2 + S_3)$.
(c) Distributive law: $k(S + T) = kS + kT$.
(d) $S + \{0\} = \{0\} + S = S$ and $S + V = V + S = V$.

4.124. Let $V$ be the vector space of $n$-square matrices. Let $U$ be the subspace of upper triangular matrices, and let $W$ be the subspace of lower triangular matrices. Find (a) $U \cap W$, (b) $U + W$.

4.125. Let $V$ be the external direct sum of vector spaces $U$ and $W$ over a field $K$. (See Problem 4.76.) Let
\[ \hat{U} = \{(u, 0) : u \in U\} \quad\text{and}\quad \hat{W} = \{(0, w) : w \in W\} \]
Show that (a) $\hat{U}$ and $\hat{W}$ are subspaces of $V$, (b) $V = \hat{U} \oplus \hat{W}$.

4.126. Suppose $V = U + W$. Let $\hat{V}$ be the external direct sum of $U$ and $W$. Show that $V$ is isomorphic to $\hat{V}$ under the correspondence $v = u + w \leftrightarrow (u, w)$.

4.127. Use induction to prove (a) Theorem 4.22, (b) Theorem 4.23.

Coordinates
4.128. The vectors $u_1 = (1, -2)$ and $u_2 = (4, -7)$ form a basis $S$ of $\mathbf{R}^2$. Find the coordinate vector $[v]$ of $v$ relative to $S$ where (a) $v = (5, 3)$, (b) $v = (a, b)$.

4.129. The vectors $u_1 = (1, 2, 0)$, $u_2 = (1, 3, 2)$, $u_3 = (0, 1, 3)$ form a basis $S$ of $\mathbf{R}^3$. Find the coordinate vector $[v]$ of $v$ relative to $S$ where (a) $v = (2, 7, -4)$, (b) $v = (a, b, c)$.


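4.130. $S = \{t^3 + t^2,\ t^2 + t,\ t + 1,\ 1\}$ is a basis of $P_3(t)$. Find the coordinate vector $[v]$ of $v$ relative to $S$ where (a) $v = 2t^3 + t^2 - 4t + 2$, (b) $v = at^3 + bt^2 + ct + d$.

4.131. Let $V = M_{2,2}$. Find the coordinate vector $[A]$ of $A$ relative to $S$ where
\[ S = \left\{ \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix},\ \begin{bmatrix} 1 & -1 \\ 1 & 0 \end{bmatrix},\ \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix},\ \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \right\} \quad\text{and}\quad \text{(a)}\ A = \begin{bmatrix} 3 & -5 \\ 6 & 7 \end{bmatrix}, \quad \text{(b)}\ A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \]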

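4.132. Find the dimension and a basis of the subspace $W$ of $P_3(t)$ spanned by
\[ u = t^3 + 2t^2 - 3t + 4, \qquad v = 2t^3 + 5t^2 - 4t + 7, \qquad w = t^3 + 4t^2 + t + 2 \]

4.133. Find the dimension and a basis of the subspace $W$ of $M = M_{2,3}$ spanned by
\[ A = \begin{bmatrix} 1 & 2 & 1 \\ 3 & 1 & 2 \end{bmatrix}, \qquad B = \begin{bmatrix} 2 & 4 & 3 \\ 7 & 5 & 6 \end{bmatrix}, \qquad C = \begin{bmatrix} 1 & 2 & 3 \\ 5 & 7 & 6 \end{bmatrix} \]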

Miscellaneous Problems
4.134. Answer true or false. If false, prove it with a counterexample.
(a) If $u_1$, $u_2$, $u_3$ span $V$, then $\dim V = 3$.
(b) If $A$ is a $4 \times 8$ matrix, then any six columns are linearly dependent.
(c) If $u_1$, $u_2$, $u_3$ are linearly independent, then $u_1$, $u_2$, $u_3$, $w$ are linearly dependent.
(d) If $u_1$, $u_2$, $u_3$, $u_4$ are linearly independent, then $\dim V \ge 4$.
(e) If $u_1$, $u_2$, $u_3$ span $V$, then $w$, $u_1$, $u_2$, $u_3$ span $V$.
(f) If $u_1$, $u_2$, $u_3$, $u_4$ are linearly independent, then $u_1$, $u_2$, $u_3$ are linearly independent.

4.135. Answer true or false. If false, prove it with a counterexample. (a) If any column is deleted from a matrix in echelon form, then the resulting matrix is still in echelon form. (b) If any column is deleted from a matrix in row canonical form, then the resulting matrix is still in row canonical form. (c) If any column without a pivot is deleted from a matrix in row canonical form, then the resulting matrix is in row canonical form. 4.136. Determine the dimension of the vector space W of the following n-square matrices: (a) symmetric matrices, (d) diagonal matrices, (b) antisymmetric matrices, (c) scalar matrices.

4.137. Let $t_1, t_2, \ldots, t_n$ be symbols, and let $K$ be any field. Let $V$ be the following set of expressions where $a_i \in K$:
\[ a_1 t_1 + a_2 t_2 + \cdots + a_n t_n \]
Define addition in $V$ and scalar multiplication on $V$ by
\[ (a_1 t_1 + \cdots + a_n t_n) + (b_1 t_1 + \cdots + b_n t_n) = (a_1 + b_1) t_1 + \cdots + (a_n + b_n) t_n \]
\[ k(a_1 t_1 + a_2 t_2 + \cdots + a_n t_n) = k a_1 t_1 + k a_2 t_2 + \cdots + k a_n t_n \]
Show that $V$ is a vector space over $K$ with the above operations. Also, show that $\{t_1, \ldots, t_n\}$ is a basis of $V$, where
\[ t_j = 0 t_1 + \cdots + 0 t_{j-1} + 1 t_j + 0 t_{j+1} + \cdots + 0 t_n \]

ANSWERS TO SUPPLEMENTARY PROBLEMS
[Some answers, such as bases, need not be unique.] 4.71. (a) (c) E1 ¼ 26u À 22v; E3 ¼ 23u þ 5v;


(b) The sum 7v þ 8 is not defined, so E2 is not defined; (d) Division by v is not defined, so E4 is not defined.

4.77.

(a) Yes; (b) No; e.g., ð1; 2; 3Þ 2 W but À2ð1; 2; 3Þ 62 W; (c) No; e.g., ð1; 0; 0Þ; ð0; 1; 0Þ 2 W, but not their sum; (d) Yes; (e) No; e.g., ð1; 1; 1Þ 2 W, but 2ð1; 1; 1Þ 62 W; (f ) Yes The zero vector 0 is not a solution. (a) w ¼ 3u1 À u2 , (b) Impossible, (c) k ¼ 11, 5 (d) 7a À 5b þ c ¼ 0

4.79. 4.83. 4.84. 4.85. 4.89. 4.90. 4.97. 4.98. 4.99.

Using f ¼ xp1 þ yp2 þ zp3 , we get x ¼ a, y ¼ 2a þ b, z ¼ a þ b þ c v ¼ ð2; 1; 0Þ (a) (a) (a) (a) Dependent, Independent, u1 , u2 , u4 ; dim U ¼ 3, (b) (b) (b) (b) Independent Dependent (c) u1 , u2 , u4 ; (d) u1 , u2 , u3

u1 , u2 , u3 ; dim W ¼ 2,

(c)

dimðU \ W Þ ¼ 1

(a) Basis: fð2; À1; 0; 0; 0Þ; (b) Basis: fð2; À1; 0; 0; 0Þ;

ð4; 0; 1; À1; 0Þ; ð3; 0; 1; 0; 1Þg; dim W ¼ 3; ð1; 0; 1; 0; 0Þg; dim W ¼ 2

4.100. (a) 5x þ y À z À s ¼ 0; x þ y À z À t ¼ 0; (b) 3x À y À z ¼ 0; 2x À 3y þ s ¼ 0; x À 2y þ t ¼ 0 4.101. (a) 4.102. (a) Yes, (b) No, because dim Pn ðtÞ ¼ n þ 1, but the set contains only n elements. (b) dim W ¼ 3

dim W ¼ 2,

4.103. dim W ¼ 2 4.104. (a) 4.105. (a) 4.106. (a) 3, (b) 2, n2 ¼ 5; (c) 3 n3 ¼ n4 ¼ n5 ¼ 0; (b) n1 ¼ 4; n2 ¼ 6; n3 ¼ 3; n4 ¼ n5 ¼ 0

n1 ¼ 4;

(i) M ¼ ½1; 2; 0; 1; 0; 3; 0; 0; 1; 2; 0; 1; 0; 0; 0; 0; 1; 2; 0Š; (iii) C1 , C3 , C5 ; (iv) C6 ¼ 3C1 þ C3 þ 2C5 . (ii) C2 , C4 , C6 ; (b) (i) M ¼ ½1; 2; 0; 0; 3; 1; 0; 0; 1; 0; À1; À1; 0; 0; 0; 1; 1; 2; 0Š; (iii) C1 , C3 , C4 ; (iv) C6 ¼ C1 À C3 þ 2C4 (ii) C2 , C5 , C6 ; ! 1 0 7 4.107. A and C are row equivalent to , but not B 0 1 4 ! 1 0 À2 , but not U3 4.108. U1 and U2 are row equivalent to 0 1 1 ! 1 2 0 1 4.109. U1 and U3 are row equivalent to ; but not U2 0 0 1 3 4.110. (a) (i) ð1; 3; 1; 2; 1Þ, ð0; 0; 1; À1; À1Þ, ð0; 0; 0; 4; 7Þ; (ii) (b) (i) ð1; 2; 1; 0; 1Þ, ð0; 0; 1; 1; 2Þ; (ii) C1 , C3 C1 , C3 , C4 ;

4.113. (a) A¼ ! ! À1 À1 1 1 ; ; B¼ 0 0 0 0 (b) A ¼ ! 0 1 0 ; B¼ 0 0 0 ! 2 ; 0


! ! 0 0 1 0 ; B¼ (c) A ¼ 0 1 0 0 4.115. dimðU \ W Þ ¼ 2, 3, or 4 4.117. (a) (i) 3x þ 4y À z À t ¼ 0 4x þ 2y þ s ¼ 0 (ii) 4x þ 2y À s ¼ 0 ; 9x þ 2y þ z þ t ¼ 0

(b) Basis: fð1; À2; À5; 0; 0Þ; 4.118. The sum is direct in (b) and (c).

ð0; 0; 1; 0; À1Þg; dimðU \ W Þ ¼ 2

4.119. In R2 , let U , V, W be, respectively, the line y ¼ x, the x-axis, the y-axis. 4.122. (a) fð2; 6Þ; ð2; 7Þ; ð3; 7Þ; ð3; 8Þ; ð4; 8Þg; (b) fð2; 3Þ; (c) fð3; 6Þ; ð6; 9Þg; (d) fð3; 12Þ; ð3; 15Þ; ð6; 15Þg; (e and f ) fð6; 18Þ; ð6; 21Þ; ð9; 21Þ; ð9; 24Þ; ð12; 24Þg 4.124. (a) 4.128. (a) 4.129. (a) 4.130. (a) 4.131. (a) Diagonal matrices, [À41; 11], (b) (b) V 2a þ b] Àc þ 3b À 6a; c À 2b þ 4a] ð3; 4Þg;

[À7a À 4b;

[À11; 13; À10], [2; À1; À2; 2], [7; À1; À13; 10],

(b) [c À 3b þ 7a; (b) [a; (b) [d; b À c; c À d;

c À b þ a;

d À c þ b À a] a À b À 2c þ 2d]

b þ c À 2d;

4.132. dim W ¼ 2; basis: ft3 þ 2t2 À 3t þ 4; 4.133. dim W ¼ 2; basis: f½1; 2; 1; 3; 1; 2Š;

t2 þ 2t À 1g ½0; 0; 1; 1; 3; 2Šg

(b) True; 4.134. (a) False; (1, 1), (1, 2), (2, 1) span R2 ; (c) False; (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), w ¼ ð0; 0; 0; 1Þ; (d) True; (e) True; (f ) True ! 1 0 3 ; (c) 4.135. (a) True; (b) False; e.g. delete C2 from 0 1 2 4.136. (a)
1 2 nðn

True

þ 1Þ, (b)

1 2 nðn

À 1Þ, (c) n, (d) 1

CHAPTER 5

Linear Mappings
5.1 Introduction
The main subject matter of linear algebra is the study of linear mappings and their representation by means of matrices. This chapter introduces us to these linear maps and Chapter 6 shows how they can be represented by matrices. First, however, we begin with a study of mappings in general.

5.2 Mappings, Functions

Let $A$ and $B$ be arbitrary nonempty sets. Suppose to each element $a \in A$ there is assigned a unique element of $B$, called the image of $a$. The collection $f$ of such assignments is called a mapping (or map) from $A$ into $B$, and it is denoted by
\[ f : A \to B \]
The set $A$ is called the domain of the mapping, and $B$ is called the target set. We write $f(a)$, read "$f$ of $a$," for the unique element of $B$ that $f$ assigns to $a \in A$. One may also view a mapping $f : A \to B$ as a computer that, for each input value $a \in A$, produces a unique output $f(a) \in B$.

Remark: The term function is used synonymously with the word mapping, although some texts reserve the word "function" for a real-valued or complex-valued mapping.

Consider a mapping $f : A \to B$. If $A'$ is any subset of $A$, then $f(A')$ denotes the set of images of elements of $A'$; and if $B'$ is any subset of $B$, then $f^{-1}(B')$ denotes the set of elements of $A$ whose images lie in $B'$. That is,
\[ f(A') = \{f(a) : a \in A'\} \quad\text{and}\quad f^{-1}(B') = \{a \in A : f(a) \in B'\} \]
We call $f(A')$ the image of $A'$ and $f^{-1}(B')$ the inverse image or preimage of $B'$. In particular, the set of all images (i.e., $f(A)$) is called the image or range of $f$.

To each mapping $f : A \to B$ there corresponds the subset of $A \times B$ given by $\{(a, f(a)) : a \in A\}$. We call this set the graph of $f$. Two mappings $f : A \to B$ and $g : A \to B$ are defined to be equal, written $f = g$, if $f(a) = g(a)$ for every $a \in A$, that is, if they have the same graph. Thus, we do not distinguish between a function and its graph. The negation of $f = g$ is written $f \ne g$ and is the statement:

There exists an $a \in A$ for which $f(a) \ne g(a)$.

Sometimes the "barred" arrow $\mapsto$ is used to denote the image of an arbitrary element $x \in A$ under a mapping $f : A \to B$ by writing
\[ x \mapsto f(x) \]
This is illustrated in the following example.

EXAMPLE 5.1

(a) Let $f : \mathbf{R} \to \mathbf{R}$ be the function that assigns to each real number $x$ its square $x^2$. We can denote this function by writing
\[ f(x) = x^2 \quad\text{or}\quad x \mapsto x^2 \]
Here the image of $-3$ is 9, so we may write $f(-3) = 9$. However, $f^{-1}(9) = \{3, -3\}$. Also, $f(\mathbf{R}) = [0, \infty) = \{x : x \ge 0\}$ is the image of $f$.

(b) Let $A = \{a, b, c, d\}$ and $B = \{x, y, z, t\}$. Then the following defines a mapping $f : A \to B$:
\[ f(a) = y,\ f(b) = x,\ f(c) = z,\ f(d) = y \quad\text{or}\quad f = \{(a, y),\ (b, x),\ (c, z),\ (d, y)\} \]
The first defines the mapping explicitly, and the second defines the mapping by its graph. Here,
\[ f(\{a, b, d\}) = \{f(a), f(b), f(d)\} = \{y, x, y\} = \{x, y\} \]
Furthermore, $f(A) = \{x, y, z\}$ is the image of $f$.
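The finite mapping of part (b) can be sketched in code. This is only an illustration: storing the mapping as a Python dict, and the helper names image and preimage, are choices made here and are not notation from the text.

```python
# The finite mapping f : A -> B of Example 5.1(b), stored as a dict
# (an illustrative representation, not notation from the text).
f = {"a": "y", "b": "x", "c": "z", "d": "y"}

def image(mapping, subset):
    """Return f(A') = {f(a) : a in A'}."""
    return {mapping[a] for a in subset}

def preimage(mapping, subset):
    """Return f^{-1}(B') = {a in A : f(a) in B'}."""
    return {a for a, b in mapping.items() if b in subset}

graph = set(f.items())                 # the graph {(a, f(a)) : a in A}
img_abd = image(f, {"a", "b", "d"})    # the set {x, y}
img_A = image(f, f.keys())             # the image of f: {x, y, z}
pre_y = preimage(f, {"y"})             # {a, d}
pre_t = preimage(f, {"t"})             # empty: t is not the image of any element
```

The dict plays the role of the graph of f: two such mappings are equal exactly when they contain the same pairs.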
EXAMPLE 5.2

Let $V$ be the vector space of polynomials over $\mathbf{R}$, and let $p(t) = 3t^2 - 5t + 2$.

(a) The derivative defines a mapping $D : V \to V$ where, for any polynomial $f(t)$, we have $D(f) = df/dt$. Thus,
\[ D(p) = D(3t^2 - 5t + 2) = 6t - 5 \]
(b) The integral, say from 0 to 1, defines a mapping $J : V \to \mathbf{R}$. That is, for any polynomial $f(t)$,
\[ J(f) = \int_0^1 f(t)\,dt, \quad\text{and so}\quad J(p) = \int_0^1 (3t^2 - 5t + 2)\,dt = \tfrac{1}{2} \]
Observe that the mapping in (b) is from the vector space $V$ into the scalar field $\mathbf{R}$, whereas the mapping in (a) is from the vector space $V$ into itself.
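As a rough check of the two computations in this example, the mappings D and J can be sketched on coefficient lists. The representation of a polynomial as the list [a0, a1, a2, ...] of its coefficients, with integer entries, is an assumption made for this sketch.

```python
from fractions import Fraction

def D(coeffs):
    """Derivative of a0 + a1 t + a2 t^2 + ..., given as [a0, a1, a2, ...]."""
    return [k * c for k, c in enumerate(coeffs)][1:] or [0]

def J(coeffs):
    """Integral from 0 to 1 of the polynomial: the sum of a_k / (k + 1)."""
    return sum(Fraction(c, k + 1) for k, c in enumerate(coeffs))

p = [2, -5, 3]   # p(t) = 3t^2 - 5t + 2, lowest degree first
Dp = D(p)        # coefficients of 6t - 5
Jp = J(p)        # the exact rational value of the integral
```

Note that D returns another coefficient list (an element of V), while J returns a single scalar, matching the remark that closes the example.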

Matrix Mappings
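Let $A$ be any $m \times n$ matrix over $K$. Then $A$ determines a mapping $F_A : K^n \to K^m$ by
\[ F_A(u) = Au \]
where the vectors in $K^n$ and $K^m$ are written as columns. For example, suppose
\[ A = \begin{bmatrix} 1 & -4 & 5 \\ 2 & 3 & -6 \end{bmatrix} \quad\text{and}\quad u = \begin{bmatrix} 1 \\ 3 \\ -5 \end{bmatrix} \]
then
\[ F_A(u) = Au = \begin{bmatrix} 1 & -4 & 5 \\ 2 & 3 & -6 \end{bmatrix} \begin{bmatrix} 1 \\ 3 \\ -5 \end{bmatrix} = \begin{bmatrix} -36 \\ 41 \end{bmatrix} \]

Remark: For notational convenience, we will frequently denote the mapping $F_A$ by the letter $A$, the same symbol as used for the matrix.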
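The matrix mapping above can be sketched as a small function. The helper name mat_vec and the list-of-rows representation are assumptions of this sketch, not part of the text.

```python
def mat_vec(A, u):
    """Return the product Au for a matrix A (a list of rows) and a vector u."""
    if any(len(row) != len(u) for row in A):
        raise ValueError("each row of A must have as many entries as u")
    return [sum(a * x for a, x in zip(row, u)) for row in A]

A = [[1, -4, 5],
     [2, 3, -6]]
u = [1, 3, -5]
Au = mat_vec(A, u)   # the image F_A(u), a vector in K^2
```

Here A has 2 rows and 3 columns, so the mapping sends vectors with three entries to vectors with two entries.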

Composition of Mappings
Consider two mappings $f : A \to B$ and $g : B \to C$, illustrated below:
\[ A \xrightarrow{\ f\ } B \xrightarrow{\ g\ } C \]
The composition of $f$ and $g$, denoted by $g \circ f$, is the mapping $g \circ f : A \to C$ defined by
\[ (g \circ f)(a) \equiv g(f(a)) \]
That is, first we apply $f$ to $a \in A$, and then we apply $g$ to $f(a) \in B$ to get $g(f(a)) \in C$. Viewing $f$ and $g$ as "computers," the composition means we first input $a \in A$ to get the output $f(a) \in B$ using $f$, and then we input $f(a)$ to get the output $g(f(a)) \in C$ using $g$. Our first theorem tells us that the composition of mappings satisfies the associative law.
THEOREM 5.1: Let $f : A \to B$, $g : B \to C$, $h : C \to D$. Then
\[ h \circ (g \circ f) = (h \circ g) \circ f \]
We prove this theorem here. Let $a \in A$. Then
\[ (h \circ (g \circ f))(a) = h((g \circ f)(a)) = h(g(f(a))) \]
\[ ((h \circ g) \circ f)(a) = (h \circ g)(f(a)) = h(g(f(a))) \]
Thus, $(h \circ (g \circ f))(a) = ((h \circ g) \circ f)(a)$ for every $a \in A$, and so $h \circ (g \circ f) = (h \circ g) \circ f$.
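A small sketch of composition and of the associative law follows. The three sample functions f, g, h are hypothetical choices made for illustration; agreement on a handful of sample inputs illustrates the theorem but does not replace the proof above.

```python
def compose(g, f):
    """Return g o f, the mapping a -> g(f(a)): apply f first, then g."""
    return lambda a: g(f(a))

# Hypothetical sample mappings on the integers.
def f(x):
    return x + 1

def g(x):
    return 2 * x

def h(x):
    return x ** 2

left = compose(h, compose(g, f))     # h o (g o f)
right = compose(compose(h, g), f)    # (h o g) o f
agree = all(left(a) == right(a) for a in range(-5, 6))
```

Both sides reduce to h(g(f(a))) for every input, which is exactly the computation in the proof.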

One-to-One and Onto Mappings
We formally introduce some special types of mappings.
DEFINITION: A mapping $f : A \to B$ is said to be one-to-one (or 1-1 or injective) if different elements of $A$ have distinct images; that is, if $f(a) = f(a')$, then $a = a'$.

DEFINITION: A mapping $f : A \to B$ is said to be onto (or $f$ maps $A$ onto $B$, or surjective) if every $b \in B$ is the image of at least one $a \in A$.

DEFINITION: A mapping $f : A \to B$ is said to be a one-to-one correspondence between $A$ and $B$ (or bijective) if $f$ is both one-to-one and onto.

EXAMPLE 5.3

Let $f : \mathbf{R} \to \mathbf{R}$, $g : \mathbf{R} \to \mathbf{R}$, $h : \mathbf{R} \to \mathbf{R}$ be defined by
\[ f(x) = 2^x, \qquad g(x) = x^3 - x, \qquad h(x) = x^2 \]
The graphs of these functions are shown in Fig. 5-1. The function $f$ is one-to-one. Geometrically, this means that each horizontal line does not contain more than one point of $f$. The function $g$ is onto. Geometrically, this means that each horizontal line contains at least one point of $g$. The function $h$ is neither one-to-one nor onto. For example, both 2 and $-2$ have the same image 4, and $-16$ has no preimage.

Figure 5-1

Identity and Inverse Mappings
Let $A$ be any nonempty set. The mapping $f : A \to A$ defined by $f(a) = a$ (that is, the function that assigns to each element in $A$ itself) is called the identity mapping. It is usually denoted by $1_A$ or $1$ or $I$. Thus, for any $a \in A$, we have $1_A(a) = a$.

Now let $f : A \to B$. We call $g : B \to A$ the inverse of $f$, written $f^{-1}$, if
\[ f \circ g = 1_B \quad\text{and}\quad g \circ f = 1_A \]
We emphasize that $f$ has an inverse if and only if $f$ is a one-to-one correspondence between $A$ and $B$; that is, $f$ is one-to-one and onto (Problem 5.7). Also, if $b \in B$, then $f^{-1}(b) = a$, where $a$ is the unique element of $A$ for which $f(a) = b$.

5.3 Linear Mappings (Linear Transformations)

We begin with a definition.
DEFINITION: Let $V$ and $U$ be vector spaces over the same field $K$. A mapping $F : V \to U$ is called a linear mapping or linear transformation if it satisfies the following two conditions:
(1) For any vectors $v, w \in V$, $F(v + w) = F(v) + F(w)$.
(2) For any scalar $k$ and vector $v \in V$, $F(kv) = kF(v)$.

Namely, $F : V \to U$ is linear if it "preserves" the two basic operations of a vector space, that of vector addition and that of scalar multiplication. Substituting $k = 0$ into condition (2), we obtain $F(0) = 0$. Thus, every linear mapping takes the zero vector into the zero vector. Now for any scalars $a, b \in K$ and any vectors $v, w \in V$, we obtain
\[ F(av + bw) = F(av) + F(bw) = aF(v) + bF(w) \]
More generally, for any scalars $a_i \in K$ and any vectors $v_i \in V$, we obtain the following basic property of linear mappings:
\[ F(a_1 v_1 + a_2 v_2 + \cdots + a_m v_m) = a_1 F(v_1) + a_2 F(v_2) + \cdots + a_m F(v_m) \]
Remark 1: A linear mapping $F : V \to U$ is completely characterized by the condition
\[ F(av + bw) = aF(v) + bF(w) \tag{$*$} \]
and so this condition is sometimes used as its definition.

Remark 2: The term linear transformation rather than linear mapping is frequently used for linear mappings of the form $F : \mathbf{R}^n \to \mathbf{R}^m$.
EXAMPLE 5.4

(a) Let $F : \mathbf{R}^3 \to \mathbf{R}^3$ be the "projection" mapping into the $xy$-plane; that is, $F$ is the mapping defined by $F(x, y, z) = (x, y, 0)$. We show that $F$ is linear. Let $v = (a, b, c)$ and $w = (a', b', c')$. Then
\[ F(v + w) = F(a + a', b + b', c + c') = (a + a', b + b', 0) = (a, b, 0) + (a', b', 0) = F(v) + F(w) \]
and, for any scalar $k$,
\[ F(kv) = F(ka, kb, kc) = (ka, kb, 0) = k(a, b, 0) = kF(v) \]
Thus, $F$ is linear.

(b) Let $G : \mathbf{R}^2 \to \mathbf{R}^2$ be the "translation" mapping defined by $G(x, y) = (x + 1, y + 2)$. [That is, $G$ adds the vector $(1, 2)$ to any vector $v = (x, y)$ in $\mathbf{R}^2$.] Note that
\[ G(0) = G(0, 0) = (1, 2) \ne 0 \]
Thus, the zero vector is not mapped into the zero vector. Hence, $G$ is not linear.
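The two linearity conditions can be spot-checked numerically for the projection and the translation of this example. This is a sketch under stated assumptions: the helper names and the sample vectors are hypothetical, and a passing check on one sample only illustrates linearity (the algebraic argument above is what proves it), whereas a single failing sample does show that a mapping is not linear.

```python
def F(v):
    """Projection into the xy-plane: (x, y, z) -> (x, y, 0)."""
    x, y, z = v
    return (x, y, 0)

def G(v):
    """Translation by (1, 2): (x, y) -> (x + 1, y + 2)."""
    x, y = v
    return (x + 1, y + 2)

def add(v, w):
    return tuple(a + b for a, b in zip(v, w))

def scale(k, v):
    return tuple(k * a for a in v)

def respects_linearity(T, v, w, k):
    """Check T(v + w) = T(v) + T(w) and T(kv) = kT(v) for one choice of v, w, k."""
    return T(add(v, w)) == add(T(v), T(w)) and T(scale(k, v)) == scale(k, T(v))

F_ok = respects_linearity(F, (1, 2, 3), (4, -5, 6), 7)   # conditions hold for F
G_ok = respects_linearity(G, (1, 2), (3, 4), 5)          # conditions fail for G
G_zero = G((0, 0))                                       # the zero vector is moved
```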

EXAMPLE 5.5

(Derivative and Integral Mappings) Consider the vector space $V = P(t)$ of polynomials over the real field $\mathbf{R}$. Let $u(t)$ and $v(t)$ be any polynomials in $V$ and let $k$ be any scalar.

(a) Let $D : V \to V$ be the derivative mapping. One proves in calculus that
\[ \frac{d(u + v)}{dt} = \frac{du}{dt} + \frac{dv}{dt} \quad\text{and}\quad \frac{d(ku)}{dt} = k\,\frac{du}{dt} \]
That is, $D(u + v) = D(u) + D(v)$ and $D(ku) = kD(u)$. Thus, the derivative mapping is linear.

(b) Let $J : V \to \mathbf{R}$ be an integral mapping, say
\[ J(f(t)) = \int_0^1 f(t)\,dt \]
One also proves in calculus that
\[ \int_0^1 [u(t) + v(t)]\,dt = \int_0^1 u(t)\,dt + \int_0^1 v(t)\,dt \quad\text{and}\quad \int_0^1 k\,u(t)\,dt = k \int_0^1 u(t)\,dt \]
That is, $J(u + v) = J(u) + J(v)$ and $J(ku) = kJ(u)$. Thus, the integral mapping is linear.
EXAMPLE 5.6

(Zero and Identity Mappings)

(a) Let $F : V \to U$ be the mapping that assigns the zero vector $0 \in U$ to every vector $v \in V$. Then, for any vectors $v, w \in V$ and any scalar $k \in K$, we have
\[ F(v + w) = 0 = 0 + 0 = F(v) + F(w) \quad\text{and}\quad F(kv) = 0 = k0 = kF(v) \]
Thus, $F$ is linear. We call $F$ the zero mapping, and we usually denote it by 0.

(b) Consider the identity mapping $I : V \to V$, which maps each $v \in V$ into itself. Then, for any vectors $v, w \in V$ and any scalars $a, b \in K$, we have
\[ I(av + bw) = av + bw = aI(v) + bI(w) \]
Thus, $I$ is linear.

Our next theorem (proved in Problem 5.13) gives us an abundance of examples of linear mappings. In particular, it tells us that a linear mapping is completely determined by its values on the elements of a basis.
THEOREM 5.2: Let $V$ and $U$ be vector spaces over a field $K$. Let $\{v_1, v_2, \ldots, v_n\}$ be a basis of $V$ and let $u_1, u_2, \ldots, u_n$ be any vectors in $U$. Then there exists a unique linear mapping $F : V \to U$ such that $F(v_1) = u_1$, $F(v_2) = u_2$, ..., $F(v_n) = u_n$.

We emphasize that the vectors $u_1, u_2, \ldots, u_n$ in Theorem 5.2 are completely arbitrary; they may be linearly dependent or they may even be equal to each other.
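The construction behind Theorem 5.2 can be sketched in the special case where the basis is the usual basis e_1, ..., e_n of K^n, so that the coordinates of a vector are its entries. That restriction, the function name, and the sample images are assumptions made for this illustration.

```python
def linear_map_from_basis_images(images):
    """Given u_1, ..., u_n (the chosen images of the usual basis e_1, ..., e_n),
    return F with F(a_1 e_1 + ... + a_n e_n) = a_1 u_1 + ... + a_n u_n."""
    m = len(images[0])

    def F(v):
        if len(v) != len(images):
            raise ValueError("v must have one coordinate per basis vector")
        return tuple(sum(a * u[i] for a, u in zip(v, images)) for i in range(m))

    return F

# The images may be dependent, repeated, or zero, as the text emphasizes.
F = linear_map_from_basis_images([(1, 2), (1, 2), (0, 0)])
e1_image = F((1, 0, 0))   # the prescribed image of e_1
e2_image = F((0, 1, 0))   # the prescribed image of e_2 (equal to that of e_1)
value = F((3, -1, 5))     # 3*(1, 2) - 1*(1, 2) + 5*(0, 0)
```

Once the images of the basis vectors are fixed, the value of F on every other vector is forced by linearity, which is the uniqueness part of the theorem.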

Matrices as Linear Mappings
Let $A$ be any $m \times n$ matrix over $K$. Recall that $A$ determines a mapping $F_A : K^n \to K^m$ by $F_A(u) = Au$ (where the vectors in $K^n$ and $K^m$ are written as columns). We show $F_A$ is linear. By matrix multiplication,
\[ F_A(v + w) = A(v + w) = Av + Aw = F_A(v) + F_A(w) \]
\[ F_A(kv) = A(kv) = k(Av) = kF_A(v) \]
In other words, using $A$ to represent the mapping, we have
\[ A(v + w) = Av + Aw \quad\text{and}\quad A(kv) = k(Av) \]
Thus, the matrix mapping $A$ is linear.


Vector Space Isomorphism
The notion of two vector spaces being isomorphic was defined in Chapter 4 when we investigated the coordinates of a vector relative to a basis. We now redefine this concept.
DEFINITION: Two vector spaces $V$ and $U$ over $K$ are isomorphic, written $V \cong U$, if there exists a bijective (one-to-one and onto) linear mapping $F : V \to U$. The mapping $F$ is then called an isomorphism between $V$ and $U$.

Consider any vector space $V$ of dimension $n$ and let $S$ be any basis of $V$. Then the mapping
\[ v \mapsto [v]_S \]
which maps each vector $v \in V$ into its coordinate vector $[v]_S$, is an isomorphism between $V$ and $K^n$.

5.4 Kernel and Image of a Linear Mapping

We begin by defining two concepts.
DEFINITION: Let $F : V \to U$ be a linear mapping. The kernel of $F$, written $\operatorname{Ker} F$, is the set of elements in $V$ that map into the zero vector 0 in $U$; that is,
\[ \operatorname{Ker} F = \{v \in V : F(v) = 0\} \]
The image (or range) of $F$, written $\operatorname{Im} F$, is the set of image points in $U$; that is,
\[ \operatorname{Im} F = \{u \in U : \text{there exists } v \in V \text{ for which } F(v) = u\} \]

The following theorem is easily proved (Problem 5.22).

THEOREM 5.3: Let $F : V \to U$ be a linear mapping. Then the kernel of $F$ is a subspace of $V$ and the image of $F$ is a subspace of $U$.

Now suppose that $v_1, v_2, \ldots, v_m$ span a vector space $V$ and that $F : V \to U$ is linear. We show that $F(v_1), F(v_2), \ldots, F(v_m)$ span $\operatorname{Im} F$. Let $u \in \operatorname{Im} F$. Then there exists $v \in V$ such that $F(v) = u$. Because the $v_i$'s span $V$ and $v \in V$, there exist scalars $a_1, a_2, \ldots, a_m$ for which
\[ v = a_1 v_1 + a_2 v_2 + \cdots + a_m v_m \]
Therefore,
\[ u = F(v) = F(a_1 v_1 + a_2 v_2 + \cdots + a_m v_m) = a_1 F(v_1) + a_2 F(v_2) + \cdots + a_m F(v_m) \]
Thus, the vectors $F(v_1), F(v_2), \ldots, F(v_m)$ span $\operatorname{Im} F$. We formally state the above result.

PROPOSITION 5.4: Suppose $v_1, v_2, \ldots, v_m$ span a vector space $V$, and suppose $F : V \to U$ is linear. Then $F(v_1), F(v_2), \ldots, F(v_m)$ span $\operatorname{Im} F$.

EXAMPLE 5.7

(a) Let $F : \mathbf{R}^3 \to \mathbf{R}^3$ be the projection of a vector $v$ into the $xy$-plane [as pictured in Fig. 5-2(a)]; that is,
\[ F(x, y, z) = (x, y, 0) \]
Clearly the image of $F$ is the entire $xy$-plane, that is, points of the form $(x, y, 0)$. Moreover, the kernel of $F$ is the $z$-axis, that is, points of the form $(0, 0, c)$. That is,
\[ \operatorname{Im} F = \{(a, b, c) : c = 0\} = xy\text{-plane} \quad\text{and}\quad \operatorname{Ker} F = \{(a, b, c) : a = 0,\ b = 0\} = z\text{-axis} \]

(b) Let $G : \mathbf{R}^3 \to \mathbf{R}^3$ be the linear mapping that rotates a vector $v$ about the $z$-axis through an angle $\theta$ [as pictured in Fig. 5-2(b)]; that is,
\[ G(x, y, z) = (x \cos\theta - y \sin\theta,\ x \sin\theta + y \cos\theta,\ z) \]
Observe that the distance of a vector $v$ from the origin $O$ does not change under the rotation, and so only the zero vector 0 is mapped into the zero vector 0. Thus, $\operatorname{Ker} G = \{0\}$. On the other hand, every vector $u$ in $\mathbf{R}^3$ is the image of a vector $v$ in $\mathbf{R}^3$ that can be obtained by rotating $u$ back by an angle of $\theta$. Thus, $\operatorname{Im} G = \mathbf{R}^3$, the entire space.

Figure 5-2

EXAMPLE 5.8

Consider the vector space $V = P(t)$ of polynomials over the real field $\mathbf{R}$, and let $H : V \to V$ be the third-derivative operator; that is, $H[f(t)] = d^3 f/dt^3$. [Sometimes the notation $D^3$ is used for $H$, where $D$ is the derivative operator.] We claim that
\[ \operatorname{Ker} H = \{\text{polynomials of degree} \le 2\} = P_2(t) \quad\text{and}\quad \operatorname{Im} H = V \]
The first comes from the fact that $H(at^2 + bt + c) = 0$ but $H(t^n) \ne 0$ for $n \ge 3$. The second comes from the fact that every polynomial $g(t)$ in $V$ is the third derivative of some polynomial $f(t)$ (which can be obtained by taking the antiderivative of $g(t)$ three times).

Kernel and Image of Matrix Mappings
Consider, say, a $3 \times 4$ matrix $A$ and the usual basis $\{e_1, e_2, e_3, e_4\}$ of $K^4$ (written as columns):
\[ A = \begin{bmatrix} a_1 & a_2 & a_3 & a_4 \\ b_1 & b_2 & b_3 & b_4 \\ c_1 & c_2 & c_3 & c_4 \end{bmatrix}; \qquad e_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \quad e_2 = \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix}, \quad e_3 = \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \end{bmatrix}, \quad e_4 = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix} \]

Recall that $A$ may be viewed as a linear mapping $A : K^4 \to K^3$, where the vectors in $K^4$ and $K^3$ are viewed as column vectors. Now the usual basis vectors span $K^4$, so their images $Ae_1$, $Ae_2$, $Ae_3$, $Ae_4$ span the image of $A$. But the vectors $Ae_1$, $Ae_2$, $Ae_3$, $Ae_4$ are precisely the columns of $A$:
\[ Ae_1 = [a_1, b_1, c_1]^T, \quad Ae_2 = [a_2, b_2, c_2]^T, \quad Ae_3 = [a_3, b_3, c_3]^T, \quad Ae_4 = [a_4, b_4, c_4]^T \]

Thus, the image of $A$ is precisely the column space of $A$. On the other hand, the kernel of $A$ consists of all vectors $v$ for which $Av = 0$. This means that the kernel of $A$ is the solution space of the homogeneous system $AX = 0$, called the null space of $A$. We state the above results formally.
PROPOSITION 5.5: Let $A$ be any $m \times n$ matrix over a field $K$ viewed as a linear map $A : K^n \to K^m$. Then
\[ \operatorname{Ker} A = \operatorname{nullsp}(A) \quad\text{and}\quad \operatorname{Im} A = \operatorname{colsp}(A) \]

Here colsp($A$) denotes the column space of $A$, and nullsp($A$) denotes the null space of $A$.
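The observation that the images of the usual basis vectors are the columns of the matrix can be checked on a concrete case. The 3 x 4 matrix below is an arbitrary illustrative choice, not one taken from the text, and mat_vec is a hypothetical helper.

```python
def mat_vec(A, u):
    """Return Au for a matrix A (a list of rows) and a vector u."""
    return [sum(a * x for a, x in zip(row, u)) for row in A]

# An arbitrary 3 x 4 example matrix (an assumption of this sketch).
A = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12]]

n = len(A[0])
usual_basis = [[1 if i == j else 0 for i in range(n)] for j in range(n)]
images = [mat_vec(A, e) for e in usual_basis]          # A e_1, ..., A e_4
columns = [[row[j] for row in A] for j in range(n)]    # the columns of A

# A vector in the kernel (null space) of this particular A.
v = [1, -2, 1, 0]
Av = mat_vec(A, v)
```

Since the images A e_j coincide with the columns, any spanning or basis computation for Im A can be carried out on the columns directly.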


Rank and Nullity of a Linear Mapping
Let $F : V \to U$ be a linear mapping. The rank of $F$ is defined to be the dimension of its image, and the nullity of $F$ is defined to be the dimension of its kernel; namely,
\[ \operatorname{rank}(F) = \dim(\operatorname{Im} F) \quad\text{and}\quad \operatorname{nullity}(F) = \dim(\operatorname{Ker} F) \]
The following important theorem (proved in Problem 5.23) holds.

THEOREM 5.6: Let $V$ be of finite dimension, and let $F : V \to U$ be linear. Then
\[ \dim V = \dim(\operatorname{Ker} F) + \dim(\operatorname{Im} F) = \operatorname{nullity}(F) + \operatorname{rank}(F) \]

Recall that the rank of a matrix $A$ was also defined to be the dimension of its column space and row space. If we now view $A$ as a linear mapping, then both definitions correspond, because the image of $A$ is precisely its column space.
EXAMPLE 5.9

Let $F : \mathbf{R}^4 \to \mathbf{R}^3$ be the linear mapping defined by
\[ F(x, y, z, t) = (x - y + z + t,\ \ 2x - 2y + 3z + 4t,\ \ 3x - 3y + 4z + 5t) \]

(a) Find a basis and the dimension of the image of $F$. First find the images of the usual basis vectors of $\mathbf{R}^4$:
\[ F(1, 0, 0, 0) = (1, 2, 3), \qquad F(0, 0, 1, 0) = (1, 3, 4) \]
\[ F(0, 1, 0, 0) = (-1, -2, -3), \qquad F(0, 0, 0, 1) = (1, 4, 5) \]
By Proposition 5.4, the image vectors span $\operatorname{Im} F$. Hence, form the matrix $M$ whose rows are these image vectors and row reduce to echelon form:
\[ M = \begin{bmatrix} 1 & 2 & 3 \\ -1 & -2 & -3 \\ 1 & 3 & 4 \\ 1 & 4 & 5 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 3 \\ 0 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 2 & 2 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \]
Thus, $(1, 2, 3)$ and $(0, 1, 1)$ form a basis of $\operatorname{Im} F$. Hence, $\dim(\operatorname{Im} F) = 2$ and $\operatorname{rank}(F) = 2$.

(b) Find a basis and the dimension of the kernel of the map $F$. Set $F(v) = 0$, where $v = (x, y, z, t)$:
\[ F(x, y, z, t) = (x - y + z + t,\ \ 2x - 2y + 3z + 4t,\ \ 3x - 3y + 4z + 5t) = (0, 0, 0) \]
Set corresponding components equal to each other to form the following homogeneous system whose solution space is $\operatorname{Ker} F$:
\[ \begin{aligned} x - y + z + t &= 0 \\ 2x - 2y + 3z + 4t &= 0 \\ 3x - 3y + 4z + 5t &= 0 \end{aligned} \quad\text{or}\quad \begin{aligned} x - y + z + t &= 0 \\ z + 2t &= 0 \\ z + 2t &= 0 \end{aligned} \quad\text{or}\quad \begin{aligned} x - y + z + t &= 0 \\ z + 2t &= 0 \end{aligned} \]
The free variables are $y$ and $t$. Hence, $\dim(\operatorname{Ker} F) = 2$ or $\operatorname{nullity}(F) = 2$.
(i) Set $y = 1$, $t = 0$ to obtain the solution $(1, 1, 0, 0)$.
(ii) Set $y = 0$, $t = 1$ to obtain the solution $(1, 0, -2, 1)$.
Thus, $(1, 1, 0, 0)$ and $(1, 0, -2, 1)$ form a basis for $\operatorname{Ker} F$. As expected from Theorem 5.6, $\dim(\operatorname{Im} F) + \dim(\operatorname{Ker} F) = 4 = \dim \mathbf{R}^4$.
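The computations of this example can be double-checked with a short sketch. The rank helper below is a plain Gaussian elimination over the rationals written for this illustration; it is one way to obtain the rank, not the only one. The kernel vectors tested are (1, 1, 0, 0), which is what y = 1, t = 0 gives in the reduced system x - y + z + t = 0, z + 2t = 0, and (1, 0, -2, 1).

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (a list of rows) by Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            factor = M[i][c] / M[r][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == len(M):
            break
    return r

def F(v):
    """The mapping F : R^4 -> R^3 of the example."""
    x, y, z, t = v
    return (x - y + z + t, 2*x - 2*y + 3*z + 4*t, 3*x - 3*y + 4*z + 5*t)

usual_basis = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]
M = [F(e) for e in usual_basis]     # rows are the images of the basis vectors
rank_F = rank(M)                    # dim(Im F)
kernel_basis = [(1, 1, 0, 0), (1, 0, -2, 1)]
in_kernel = all(F(v) == (0, 0, 0) for v in kernel_basis)
nullity_F = rank(kernel_basis)      # the two kernel vectors are independent
```

The final check rank_F + nullity_F == 4 is the statement of Theorem 5.6 for this mapping.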

Application to Systems of Linear Equations
Let $AX = B$ denote the matrix form of a system of $m$ linear equations in $n$ unknowns. Now the matrix $A$ may be viewed as a linear mapping
\[ A : K^n \to K^m \]
Thus, the solution of the equation $AX = B$ may be viewed as the preimage of the vector $B \in K^m$ under the linear mapping $A$. Furthermore, the solution of the associated homogeneous system
\[ AX = 0 \]
may be viewed as the kernel of the linear mapping $A$. Applying Theorem 5.6 to this homogeneous system yields
\[ \dim(\operatorname{Ker} A) = \dim K^n - \dim(\operatorname{Im} A) = n - \operatorname{rank} A \]
But $n$ is exactly the number of unknowns in the homogeneous system $AX = 0$. Thus, we have proved the following theorem of Chapter 4.

THEOREM 4.19: The dimension of the solution space $W$ of a homogeneous system $AX = 0$ of linear equations is $s = n - r$, where $n$ is the number of unknowns and $r$ is the rank of the coefficient matrix $A$.

Observe that $r$ is also the number of pivot variables in an echelon form of $AX = 0$, so $s = n - r$ is also the number of free variables. Furthermore, the $s$ solution vectors of $AX = 0$ described in Theorem 3.14 are linearly independent (Problem 4.52). Accordingly, because $\dim W = s$, they form a basis for the solution space $W$. Thus, we have also proved Theorem 3.14.

5.5 Singular and Nonsingular Linear Mappings, Isomorphisms

Let $F : V \to U$ be a linear mapping. Recall that $F(0) = 0$. $F$ is said to be singular if the image of some nonzero vector $v$ is 0, that is, if there exists $v \ne 0$ such that $F(v) = 0$. Thus, $F : V \to U$ is nonsingular if the zero vector 0 is the only vector whose image under $F$ is 0 or, in other words, if $\operatorname{Ker} F = \{0\}$.

EXAMPLE 5.10 Consider the projection map $F : \mathbf{R}^3 \to \mathbf{R}^3$ and the rotation map $G : \mathbf{R}^3 \to \mathbf{R}^3$ appearing in Fig. 5-2. (See Example 5.7.) Because the kernel of $F$ is the $z$-axis, $F$ is singular. On the other hand, the kernel of $G$ consists only of the zero vector 0. Thus, $G$ is nonsingular.

Nonsingular linear mappings may also be characterized as those mappings that carry independent sets into independent sets. Specifically, we prove (Problem 5.28) the following theorem.

THEOREM 5.7: Let $F : V \to U$ be a nonsingular linear mapping. Then the image of any linearly independent set is linearly independent.

Isomorphisms
Suppose a linear mapping $F : V \to U$ is one-to-one. Then only $0 \in V$ can map into $0 \in U$, and so $F$ is nonsingular. The converse is also true. For suppose $F$ is nonsingular and $F(v) = F(w)$; then $F(v - w) = F(v) - F(w) = 0$, and hence, $v - w = 0$ or $v = w$. Thus, $F(v) = F(w)$ implies $v = w$; that is, $F$ is one-to-one. We have proved the following proposition.

PROPOSITION 5.8: A linear mapping $F : V \to U$ is one-to-one if and only if $F$ is nonsingular.

Recall that a mapping $F : V \to U$ is called an isomorphism if $F$ is linear and if $F$ is bijective (i.e., if $F$ is one-to-one and onto). Also, recall that a vector space $V$ is said to be isomorphic to a vector space $U$, written $V \cong U$, if there is an isomorphism $F : V \to U$. The following theorem (proved in Problem 5.29) applies.

THEOREM 5.9: Suppose $V$ has finite dimension and $\dim V = \dim U$. Suppose $F : V \to U$ is linear. Then $F$ is an isomorphism if and only if $F$ is nonsingular.


5.6 Operations with Linear Mappings

We are able to combine linear mappings in various ways to obtain new linear mappings. These operations are very important and will be used throughout the text.

Let $F : V \to U$ and $G : V \to U$ be linear mappings over a field $K$. The sum $F + G$ and the scalar product $kF$, where $k \in K$, are defined to be the following mappings from $V$ into $U$:
\[ (F + G)(v) \equiv F(v) + G(v) \quad\text{and}\quad (kF)(v) \equiv kF(v) \]
We now show that if $F$ and $G$ are linear, then $F + G$ and $kF$ are also linear. Specifically, for any vectors $v, w \in V$ and any scalars $a, b \in K$,
\[ \begin{aligned} (F + G)(av + bw) &= F(av + bw) + G(av + bw) \\ &= aF(v) + bF(w) + aG(v) + bG(w) \\ &= a[F(v) + G(v)] + b[F(w) + G(w)] \\ &= a(F + G)(v) + b(F + G)(w) \end{aligned} \]
and
\[ \begin{aligned} (kF)(av + bw) &= kF(av + bw) = k[aF(v) + bF(w)] \\ &= akF(v) + bkF(w) = a(kF)(v) + b(kF)(w) \end{aligned} \]
Thus, $F + G$ and $kF$ are linear. The following theorem holds.

THEOREM 5.10: Let $V$ and $U$ be vector spaces over a field $K$. Then the collection of all linear mappings from $V$ into $U$ with the above operations of addition and scalar multiplication forms a vector space over $K$.

The vector space of linear mappings in Theorem 5.10 is usually denoted by
\[ \operatorname{Hom}(V, U) \]
Here Hom comes from the word "homomorphism." We emphasize that the proof of Theorem 5.10 reduces to showing that $\operatorname{Hom}(V, U)$ does satisfy the eight axioms of a vector space. The zero element of $\operatorname{Hom}(V, U)$ is the zero mapping from $V$ into $U$, denoted by 0 and defined by
\[ 0(v) = 0 \]
for every vector $v \in V$. Suppose $V$ and $U$ are of finite dimension. Then we have the following theorem.

THEOREM 5.11: Suppose $\dim V = m$ and $\dim U = n$. Then $\dim[\operatorname{Hom}(V, U)] = mn$.

Composition of Linear Mappings
Now suppose $V$, $U$, and $W$ are vector spaces over the same field $K$, and suppose $F : V \to U$ and $G : U \to W$ are linear mappings. We picture these mappings as follows:
\[ V \xrightarrow{\ F\ } U \xrightarrow{\ G\ } W \]
Recall that the composition function $G \circ F$ is the mapping from $V$ into $W$ defined by $(G \circ F)(v) = G(F(v))$. We show that $G \circ F$ is linear whenever $F$ and $G$ are linear. Specifically, for any vectors $v, w \in V$ and any scalars $a, b \in K$, we have
\[ \begin{aligned} (G \circ F)(av + bw) &= G(F(av + bw)) = G(aF(v) + bF(w)) \\ &= aG(F(v)) + bG(F(w)) = a(G \circ F)(v) + b(G \circ F)(w) \end{aligned} \]
Thus, $G \circ F$ is linear. The composition of linear mappings and the operations of addition and scalar multiplication are related as follows.

THEOREM 5.12: Let $V$, $U$, $W$ be vector spaces over $K$. Suppose the following mappings are linear:
\[ F : V \to U, \quad F' : V \to U \quad\text{and}\quad G : U \to W, \quad G' : U \to W \]
Then, for any scalar $k \in K$:
(i) $G \circ (F + F') = G \circ F + G \circ F'$.
(ii) $(G + G') \circ F = G \circ F + G' \circ F$.
(iii) $k(G \circ F) = (kG) \circ F = G \circ (kF)$.

5.7 Algebra A(V) of Linear Operators

Let V be a vector space over a field K. This section considers the special case of linear mappings from the vector space V into itself—that is, linear mappings of the form F : V ! V. They are also called linear operators or linear transformations on V. We will write AðV Þ, instead of HomðV; V Þ, for the space of all such mappings. Now AðV Þ is a vector space over K (Theorem 5.8), and, if dim V ¼ n, then dim AðV Þ ¼ n2 . Moreover, for any mappings F; G 2 AðV Þ, the composition G  F exists and also belongs to AðV Þ. Thus, we have a ‘‘multiplication’’ defined in AðV Þ. [We sometimes write FG instead of G  F in the space AðV Þ.] Remark: An algebra A over a field K is a vector space over K in which an operation of multiplication is defined satisfying, for every F; G; H 2 A and every k 2 K: (i) FðG þ HÞ ¼ FG þ FH, (ii) ðG þ HÞF ¼ GF þ HF, (iii) kðGFÞ ¼ ðkGÞF ¼ GðkFÞ. The algebra is said to be associative if, in addition, ðFGÞH ¼ FðGHÞ. The above definition of an algebra and previous theorems give us the following result.
THEOREM 5.13: Let $V$ be a vector space over $K$. Then $A(V)$ is an associative algebra over $K$ with respect to composition of mappings. If $\dim V = n$, then $\dim A(V) = n^2$.

This is why $A(V)$ is called the algebra of linear operators on $V$.

Polynomials and Linear Operators
Observe that the identity mapping $I : V \to V$ belongs to $A(V)$. Also, for any linear operator $F$ in $A(V)$, we have $FI = IF = F$. We can also form "powers" of $F$. Namely, we define

$$F^0 = I, \quad F^2 = F \circ F, \quad F^3 = F^2 \circ F = F \circ F \circ F, \quad F^4 = F^3 \circ F, \quad \ldots$$

Furthermore, for any polynomial $p(t)$ over $K$, say,

$$p(t) = a_0 + a_1 t + a_2 t^2 + \cdots + a_s t^s$$

we can form the linear operator $p(F)$ defined by

$$p(F) = a_0 I + a_1 F + a_2 F^2 + \cdots + a_s F^s$$

(For any scalar $k$, the operator $kI$ is sometimes denoted simply by $k$.) In particular, we say $F$ is a zero of the polynomial $p(t)$ if $p(F) = 0$.
EXAMPLE 5.11 Let $F : K^3 \to K^3$ be defined by $F(x, y, z) = (0, x, y)$. For any $(a, b, c) \in K^3$,

$$(F + I)(a, b, c) = (0, a, b) + (a, b, c) = (a, a + b, b + c)$$
$$F^3(a, b, c) = F^2(0, a, b) = F(0, 0, a) = (0, 0, 0)$$

Thus, $F^3 = 0$, the zero mapping in $A(V)$. This means $F$ is a zero of the polynomial $p(t) = t^3$.
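The definitions of $F^n$ and $p(F)$ translate directly into code. The sketch below uses the operator of Example 5.11 over the integers; the helper names `power` and `poly_of_operator` are assumptions introduced here for illustration, not notation from the text.

```python
def F(v):
    """The operator of Example 5.11: F(x, y, z) = (0, x, y)."""
    x, y, z = v
    return (0, x, y)

def power(op, n):
    """Return op composed with itself n times (n = 0 gives the identity I)."""
    def result(v):
        for _ in range(n):
            v = op(v)
        return v
    return result

def poly_of_operator(coeffs, op):
    """Return p(op) for p(t) = coeffs[0] + coeffs[1]*t + coeffs[2]*t^2 + ..."""
    def result(v):
        total = tuple(0 for _ in v)
        for k, a in enumerate(coeffs):
            image = power(op, k)(v)
            total = tuple(t + a * c for t, c in zip(total, image))
        return total
    return result

v = (5, 7, 11)
f_plus_i = poly_of_operator([1, 1], F)(v)        # (F + I)(v), i.e. p(t) = 1 + t
f_cubed = poly_of_operator([0, 0, 0, 1], F)(v)   # p(F)(v) for p(t) = t^3
print(f_plus_i, f_cubed)
```

With $v = (5, 7, 11)$, the first value agrees with the formula $(a, a + b, b + c)$ and the second is the zero vector, consistent with $F$ being a zero of $t^3$.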


Square Matrices as Linear Operators
Let $M = M_{n,n}$ be the vector space of all square $n \times n$ matrices over $K$. Then any matrix $A$ in $M$ defines a linear mapping $F_A : K^n \to K^n$ by $F_A(u) = Au$ (where the vectors in $K^n$ are written as columns). Because the mapping is from $K^n$ into itself, the square matrix $A$ is a linear operator, not simply a linear mapping.

Suppose $A$ and $B$ are matrices in $M$. Then the matrix product $AB$ is defined. Furthermore, for any (column) vector $u$ in $K^n$,

$$F_{AB}(u) = (AB)u = A(Bu) = A(F_B(u)) = F_A(F_B(u)) = (F_A \circ F_B)(u)$$

In other words, the matrix product $AB$ corresponds to the composition of $A$ and $B$ as linear mappings. Similarly, the matrix sum $A + B$ corresponds to the sum of $A$ and $B$ as linear mappings, and the scalar product $kA$ corresponds to the scalar product of $A$ as a linear mapping.
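The identity $F_{AB} = F_A \circ F_B$ can be illustrated with a small computation. This is a minimal sketch in plain Python; the matrices `A`, `B` and the vector `u` are arbitrary sample values assumed for the illustration, and `mat_vec` and `mat_mul` are helper names introduced here.

```python
def mat_vec(A, u):
    """Return the product Au of a matrix (list of rows) and a vector."""
    return [sum(a * x for a, x in zip(row, u)) for row in A]

def mat_mul(A, B):
    """Return the matrix product AB."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

# Sample data (assumed for illustration only).
A = [[1, 2], [3, 4]]
B = [[0, 1], [5, -1]]
u = [2, 7]

via_product = mat_vec(mat_mul(A, B), u)        # F_AB(u) = (AB)u
via_composition = mat_vec(A, mat_vec(B, u))    # F_A(F_B(u)) = A(Bu)
print(via_product, via_composition)
```

Both routes give the same vector, which is the associativity $(AB)u = A(Bu)$ used in the displayed chain of equalities.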

Invertible Operators in $A(V)$
Let $F : V \to V$ be a linear operator. $F$ is said to be invertible if it has an inverse, that is, if there exists $F^{-1}$ in $A(V)$ such that $FF^{-1} = F^{-1}F = I$. On the other hand, $F$ is invertible as a mapping if $F$ is both one-to-one and onto. In such a case, $F^{-1}$ is also linear and $F^{-1}$ is the inverse of $F$ as a linear operator (proved in Problem 5.15).

Suppose $F$ is invertible. Then only $0 \in V$ can map into $0$, and so $F$ is nonsingular. The converse is not true, as seen by the following example.
EXAMPLE 5.12 Let $V = P(t)$, the vector space of polynomials over $K$. Let $F$ be the mapping on $V$ that increases by 1 the exponent of $t$ in each term of a polynomial; that is,

$$F(a_0 + a_1 t + a_2 t^2 + \cdots + a_s t^s) = a_0 t + a_1 t^2 + a_2 t^3 + \cdots + a_s t^{s+1}$$

Then $F$ is a linear mapping and $F$ is nonsingular. However, $F$ is not onto, and so $F$ is not invertible.

The vector space $V = P(t)$ in the above example has infinite dimension. The situation changes significantly when $V$ has finite dimension. Namely, the following theorem applies.
THEOREM 5.14: Let $F$ be a linear operator on a finite-dimensional vector space $V$. Then the following four conditions are equivalent.
(i) $F$ is nonsingular: $\operatorname{Ker} F = \{0\}$.
(ii) $F$ is one-to-one.
(iii) $F$ is an onto mapping.
(iv) $F$ is invertible.

The proof of the above theorem mainly follows from Theorem 5.6, which tells us that

$$\dim V = \dim(\operatorname{Ker} F) + \dim(\operatorname{Im} F)$$

By Proposition 5.8, (i) and (ii) are equivalent. Note that (iv) is equivalent to (ii) and (iii). Thus, to prove the theorem, we need only show that (i) and (iii) are equivalent. This we do below.
(a) Suppose (i) holds. Then $\dim(\operatorname{Ker} F) = 0$, and so the above equation tells us that $\dim V = \dim(\operatorname{Im} F)$. This means $V = \operatorname{Im} F$ or, in other words, $F$ is an onto mapping. Thus, (i) implies (iii).
(b) Suppose (iii) holds. Then $V = \operatorname{Im} F$, and so $\dim V = \dim(\operatorname{Im} F)$. Therefore, the above equation tells us that $\dim(\operatorname{Ker} F) = 0$, and so $F$ is nonsingular. Therefore, (iii) implies (i).
Accordingly, all four conditions are equivalent.

Remark: Suppose $A$ is a square $n \times n$ matrix over $K$. Then $A$ may be viewed as a linear operator on $K^n$. Because $K^n$ has finite dimension, Theorem 5.14 holds for the square matrix $A$. This is why the terms "nonsingular" and "invertible" are used interchangeably when applied to square matrices.
EXAMPLE 5.13 Let $F$ be the linear operator on $\mathbf{R}^2$ defined by $F(x, y) = (2x + y, 3x + 2y)$.

(a) To show that $F$ is invertible, we need only show that $F$ is nonsingular. Set $F(x, y) = (0, 0)$ to obtain the homogeneous system

$$2x + y = 0 \quad \text{and} \quad 3x + 2y = 0$$

Solve for $x$ and $y$ to get $x = 0$, $y = 0$. Hence, $F$ is nonsingular and so invertible.

(b) To find a formula for $F^{-1}$, we set $F(x, y) = (s, t)$ and so $F^{-1}(s, t) = (x, y)$. We have

$$(2x + y, 3x + 2y) = (s, t) \quad \text{or} \quad 2x + y = s, \quad 3x + 2y = t$$

Solve for $x$ and $y$ in terms of $s$ and $t$ to obtain $x = 2s - t$, $y = -3s + 2t$. Thus,

$$F^{-1}(s, t) = (2s - t, -3s + 2t) \quad \text{or} \quad F^{-1}(x, y) = (2x - y, -3x + 2y)$$

where we rewrite the formula for $F^{-1}$ using $x$ and $y$ instead of $s$ and $t$.
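A quick way to gain confidence in an inverse formula such as the one in Example 5.13 is to check that both compositions act as the identity on a few vectors. The sketch below does this; the sample vectors are arbitrary choices made for the illustration, and passing the check on samples is evidence, not a proof.

```python
def F(v):
    """F(x, y) = (2x + y, 3x + 2y), as in Example 5.13."""
    x, y = v
    return (2 * x + y, 3 * x + 2 * y)

def F_inv(v):
    """Candidate inverse: F^{-1}(s, t) = (2s - t, -3s + 2t)."""
    s, t = v
    return (2 * s - t, -3 * s + 2 * t)

# Arbitrary sample vectors (assumed for illustration).
samples = [(0, 0), (1, 0), (0, 1), (3, -4), (-2, 5)]

round_trips_ok = all(F_inv(F(v)) == v and F(F_inv(v)) == v for v in samples)
print(round_trips_ok)
```

Because both maps are linear, agreement on the basis vectors $(1, 0)$ and $(0, 1)$ already forces $F^{-1} \circ F = F \circ F^{-1} = I$ on all of $\mathbf{R}^2$.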

SOLVED PROBLEMS

Mappings

5.1. State whether each diagram in Fig. 5-3 defines a mapping from $A = \{a, b, c\}$ into $B = \{x, y, z\}$.
(a) No. There is nothing assigned to the element $b \in A$.
(b) No. Two elements, $x$ and $z$, are assigned to $c \in A$.
(c) Yes.

Figure 5-3

5.2. Let $f : A \to B$ and $g : B \to C$ be defined by Fig. 5-4.
(a) Find the composition mapping $(g \circ f) : A \to C$.
(b) Find the images of the mappings $f$, $g$, $g \circ f$.

Figure 5-4

(a) Use the definition of the composition mapping to compute

$$(g \circ f)(a) = g(f(a)) = g(y) = t, \quad (g \circ f)(b) = g(f(b)) = g(x) = s, \quad (g \circ f)(c) = g(f(c)) = g(y) = t$$

Observe that we arrive at the same answer if we "follow the arrows" in Fig. 5-4:

$$a \to y \to t, \quad b \to x \to s, \quad c \to y \to t$$

(b) By Fig. 5-4, the image values under the mapping $f$ are $x$ and $y$, and the image values under $g$ are $r$, $s$, $t$. Hence,

$$\operatorname{Im} f = \{x, y\} \quad \text{and} \quad \operatorname{Im} g = \{r, s, t\}$$

Also, by part (a), the image values under the composition mapping $g \circ f$ are $t$ and $s$; accordingly, $\operatorname{Im} g \circ f = \{s, t\}$. Note that the images of $g$ and $g \circ f$ are different.

5.3. Consider the mapping $F : \mathbf{R}^3 \to \mathbf{R}^2$ defined by $F(x, y, z) = (yz, x^2)$. Find (a) $F(2, 3, 4)$; (b) $F(5, -2, 7)$; (c) $F^{-1}(0, 0)$, that is, all $v \in \mathbf{R}^3$ such that $F(v) = 0$.
(a) Substitute in the formula for $F$ to get $F(2, 3, 4) = (3 \cdot 4, 2^2) = (12, 4)$.
(b) $F(5, -2, 7) = (-2 \cdot 7, 5^2) = (-14, 25)$.
(c) Set $F(v) = 0$, where $v = (x, y, z)$, and then solve for $x$, $y$, $z$:

$$F(x, y, z) = (yz, x^2) = (0, 0) \quad \text{or} \quad yz = 0, \quad x^2 = 0$$

Thus, $x = 0$ and either $y = 0$ or $z = 0$. In other words, $x = 0$, $y = 0$ or $x = 0$, $z = 0$; that is, the $z$-axis and the $y$-axis.

5.4. Consider the mapping $F : \mathbf{R}^2 \to \mathbf{R}^2$ defined by $F(x, y) = (3y, 2x)$. Let $S$ be the unit circle in $\mathbf{R}^2$, that is, the solution set of $x^2 + y^2 = 1$. (a) Describe $F(S)$. (b) Find $F^{-1}(S)$.
(a) Let $(a, b)$ be an element of $F(S)$. Then there exists $(x, y) \in S$ such that $F(x, y) = (a, b)$. Hence,

$$(3y, 2x) = (a, b) \quad \text{or} \quad 3y = a, \; 2x = b \quad \text{or} \quad y = \frac{a}{3}, \; x = \frac{b}{2}$$

Because $(x, y) \in S$, that is, $x^2 + y^2 = 1$, we have

$$\left(\frac{b}{2}\right)^2 + \left(\frac{a}{3}\right)^2 = 1 \quad \text{or} \quad \frac{a^2}{9} + \frac{b^2}{4} = 1$$

Thus, $F(S)$ is an ellipse.
(b) Let $F(x, y) = (a, b)$, where $(a, b) \in S$. Then $(3y, 2x) = (a, b)$ or $3y = a$, $2x = b$. Because $(a, b) \in S$, we have $a^2 + b^2 = 1$. Thus, $(3y)^2 + (2x)^2 = 1$. Accordingly, $F^{-1}(S)$ is the ellipse $4x^2 + 9y^2 = 1$.
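Both answers in Problem 5.4 can be sanity-checked by sampling. The sketch below pushes twelve points of the unit circle through $F$ and tests the ellipse equation, then takes points of the ellipse $4x^2 + 9y^2 = 1$ and tests that their images land on the unit circle. The number of sample points and the parametrizations are assumptions made for the illustration; floating-point comparisons use `math.isclose`.

```python
import math

def F(v):
    """F(x, y) = (3y, 2x), as in Problem 5.4."""
    x, y = v
    return (3 * y, 2 * x)

angles = [2 * math.pi * k / 12 for k in range(12)]

# (a) Points (x, y) on the unit circle S; their images should satisfy a^2/9 + b^2/4 = 1.
circle = [(math.cos(t), math.sin(t)) for t in angles]
image_on_ellipse = all(
    math.isclose(a * a / 9 + b * b / 4, 1.0) for a, b in map(F, circle)
)

# (b) Points (x, y) with 4x^2 + 9y^2 = 1; their images should lie on S.
ellipse = [(math.cos(t) / 2, math.sin(t) / 3) for t in angles]
preimage_maps_into_circle = all(
    math.isclose(a * a + b * b, 1.0) for a, b in map(F, ellipse)
)
print(image_on_ellipse, preimage_maps_into_circle)
```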

5.5. Let the mappings $f : A \to B$, $g : B \to C$, $h : C \to D$ be defined by Fig. 5-5. Determine whether or not each function is (a) one-to-one; (b) onto; (c) invertible (i.e., has an inverse).
(a) The mapping $f : A \to B$ is one-to-one, as each element of $A$ has a different image. The mapping $g : B \to C$ is not one-to-one, because $x$ and $z$ both have the same image 4. The mapping $h : C \to D$ is one-to-one.
(b) The mapping $f : A \to B$ is not onto, because $z \in B$ is not the image of any element of $A$. The mapping $g : B \to C$ is onto, as each element of $C$ is the image of some element of $B$. The mapping $h : C \to D$ is also onto.
(c) A mapping has an inverse if and only if it is one-to-one and onto. Hence, only $h$ has an inverse.

Figure 5-5 (arrow diagram of $f : A \to B$, $g : B \to C$, $h : C \to D$, where $A = \{1, 2, 3\}$, $B = \{x, y, z, w\}$, $C = \{4, 5, 6\}$, $D = \{a, b, c\}$)

5.6. Suppose $f : A \to B$ and $g : B \to C$. Hence, $(g \circ f) : A \to C$ exists. Prove
(a) If $f$ and $g$ are one-to-one, then $g \circ f$ is one-to-one.
(b) If $f$ and $g$ are onto mappings, then $g \circ f$ is an onto mapping.
(c) If $g \circ f$ is one-to-one, then $f$ is one-to-one.
(d) If $g \circ f$ is an onto mapping, then $g$ is an onto mapping.

(a) Suppose $(g \circ f)(x) = (g \circ f)(y)$. Then $g(f(x)) = g(f(y))$. Because $g$ is one-to-one, $f(x) = f(y)$. Because $f$ is one-to-one, $x = y$. We have proven that $(g \circ f)(x) = (g \circ f)(y)$ implies $x = y$; hence $g \circ f$ is one-to-one.
(b) Suppose $c \in C$. Because $g$ is onto, there exists $b \in B$ for which $g(b) = c$. Because $f$ is onto, there exists $a \in A$ for which $f(a) = b$. Thus, $(g \circ f)(a) = g(f(a)) = g(b) = c$. Hence, $g \circ f$ is onto.
(c) Suppose $f$ is not one-to-one. Then there exist distinct elements $x, y \in A$ for which $f(x) = f(y)$. Thus, $(g \circ f)(x) = g(f(x)) = g(f(y)) = (g \circ f)(y)$. Hence, $g \circ f$ is not one-to-one. Therefore, if $g \circ f$ is one-to-one, then $f$ must be one-to-one.
(d) If $a \in A$, then $(g \circ f)(a) = g(f(a)) \in g(B)$. Hence, $(g \circ f)(A) \subseteq g(B)$. Suppose $g$ is not onto. Then $g(B)$ is properly contained in $C$ and so $(g \circ f)(A)$ is properly contained in $C$; thus, $g \circ f$ is not onto. Accordingly, if $g \circ f$ is onto, then $g$ must be onto.

5.7. Prove that $f : A \to B$ has an inverse if and only if $f$ is one-to-one and onto.
Suppose $f$ has an inverse, that is, there exists a function $f^{-1} : B \to A$ for which $f^{-1} \circ f = 1_A$ and $f \circ f^{-1} = 1_B$. Because $1_A$ is one-to-one, $f$ is one-to-one by Problem 5.6(c), and because $1_B$ is onto, $f$ is onto by Problem 5.6(d); that is, $f$ is both one-to-one and onto.
Now suppose $f$ is both one-to-one and onto. Then each $b \in B$ is the image of a unique element in $A$, say $b^*$. Thus, if $f(a) = b$, then $a = b^*$; hence, $f(b^*) = b$. Now let $g$ denote the mapping from $B$ to $A$ defined by $b \mapsto b^*$. We have
(i) $(g \circ f)(a) = g(f(a)) = g(b) = b^* = a$ for every $a \in A$; hence, $g \circ f = 1_A$.
(ii) $(f \circ g)(b) = f(g(b)) = f(b^*) = b$ for every $b \in B$; hence, $f \circ g = 1_B$.

Accordingly, $f$ has an inverse. Its inverse is the mapping $g$.

5.8. Let $f : \mathbf{R} \to \mathbf{R}$ be defined by $f(x) = 2x - 3$. Now $f$ is one-to-one and onto; hence, $f$ has an inverse mapping $f^{-1}$. Find a formula for $f^{-1}$.
Let $y$ be the image of $x$ under the mapping $f$; that is, $y = f(x) = 2x - 3$. Hence, $x$ will be the image of $y$ under the inverse mapping $f^{-1}$. Thus, solve for $x$ in terms of $y$ in the above equation to obtain $x = \frac{1}{2}(y + 3)$. Then the formula defining the inverse function is $f^{-1}(y) = \frac{1}{2}(y + 3)$, or, using $x$ instead of $y$, $f^{-1}(x) = \frac{1}{2}(x + 3)$.

Linear Mappings

5.9. Suppose the mapping $F : \mathbf{R}^2 \to \mathbf{R}^2$ is defined by $F(x, y) = (x + y, x)$. Show that $F$ is linear.
We need to show that $F(v + w) = F(v) + F(w)$ and $F(kv) = kF(v)$, where $v$ and $w$ are any elements of $\mathbf{R}^2$ and $k$ is any scalar. Let $v = (a, b)$ and $w = (a', b')$. Then

$$v + w = (a + a', b + b') \quad \text{and} \quad kv = (ka, kb)$$

We have $F(v) = (a + b, a)$ and $F(w) = (a' + b', a')$. Thus,

$$F(v + w) = F(a + a', b + b') = (a + a' + b + b', a + a') = (a + b, a) + (a' + b', a') = F(v) + F(w)$$

and

$$F(kv) = F(ka, kb) = (ka + kb, ka) = k(a + b, a) = kF(v)$$

Because $v$, $w$, $k$ were arbitrary, $F$ is linear.

5.10. Suppose $F : \mathbf{R}^3 \to \mathbf{R}^2$ is defined by $F(x, y, z) = (x + y + z, 2x - 3y + 4z)$. Show that $F$ is linear.
We argue via matrices. Writing vectors as columns, the mapping $F$ may be written in the form $F(v) = Av$, where $v = [x, y, z]^T$ and

$$A = \begin{bmatrix} 1 & 1 & 1 \\ 2 & -3 & 4 \end{bmatrix}$$

Then, using properties of matrices, we have

$$F(v + w) = A(v + w) = Av + Aw = F(v) + F(w) \quad \text{and} \quad F(kv) = A(kv) = k(Av) = kF(v)$$

Thus, $F$ is linear.

5.11. Show that the following mappings are not linear:
(a) $F : \mathbf{R}^2 \to \mathbf{R}^2$ defined by $F(x, y) = (xy, x)$
(b) $F : \mathbf{R}^2 \to \mathbf{R}^3$ defined by $F(x, y) = (x + 3, 2y, x + y)$
(c) $F : \mathbf{R}^3 \to \mathbf{R}^2$ defined by $F(x, y, z) = (|x|, y + z)$

(a) Let $v = (1, 2)$ and $w = (3, 4)$; then $v + w = (4, 6)$. Also,

$$F(v) = (1(2), 1) = (2, 1) \quad \text{and} \quad F(w) = (3(4), 3) = (12, 3)$$

Hence, $F(v + w) = (4(6), 4) = (24, 4) \neq (14, 4) = F(v) + F(w)$.
(b) Because $F(0, 0) = (3, 0, 0) \neq (0, 0, 0)$, $F$ cannot be linear.
(c) Let $v = (1, 2, 3)$ and $k = -3$. Then $kv = (-3, -6, -9)$. We have

$$F(v) = (1, 5) \quad \text{and} \quad kF(v) = -3(1, 5) = (-3, -15)$$

Thus, $F(kv) = F(-3, -6, -9) = (3, -15) \neq kF(v)$. Accordingly, $F$ is not linear.
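Each counterexample in Problem 5.11 is a single failed identity, so it is easy to reproduce by direct evaluation. The sketch below uses the same vectors and scalar as the solution; the function names `Fa`, `Fb`, `Fc` and the helpers `add` and `scale` are labels introduced here for illustration.

```python
def add(v, w):
    """Componentwise sum v + w."""
    return tuple(a + b for a, b in zip(v, w))

def scale(k, v):
    """Scalar multiple k*v."""
    return tuple(k * a for a in v)

def Fa(v):                      # (a) F(x, y) = (xy, x)
    x, y = v
    return (x * y, x)

def Fb(v):                      # (b) F(x, y) = (x + 3, 2y, x + y)
    x, y = v
    return (x + 3, 2 * y, x + y)

def Fc(v):                      # (c) F(x, y, z) = (|x|, y + z)
    x, y, z = v
    return (abs(x), y + z)

# (a) additivity fails for v = (1, 2), w = (3, 4)
v, w = (1, 2), (3, 4)
a_lhs, a_rhs = Fa(add(v, w)), add(Fa(v), Fa(w))

# (b) a linear map must send the zero vector to the zero vector
b_image_of_zero = Fb((0, 0))

# (c) homogeneity fails for v = (1, 2, 3), k = -3
u, k = (1, 2, 3), -3
c_lhs, c_rhs = Fc(scale(k, u)), scale(k, Fc(u))

print(a_lhs, a_rhs, b_image_of_zero, c_lhs, c_rhs)
```

One failing instance is enough to rule out linearity, whereas no finite number of passing instances would establish it.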

5.12. Let $V$ be the vector space of $n$-square real matrices. Let $M$ be an arbitrary but fixed matrix in $V$. Let $F : V \to V$ be defined by $F(A) = AM + MA$, where $A$ is any matrix in $V$. Show that $F$ is linear.
For any matrices $A$ and $B$ in $V$ and any scalar $k$, we have

$$F(A + B) = (A + B)M + M(A + B) = AM + BM + MA + MB = (AM + MA) + (BM + MB) = F(A) + F(B)$$

and

$$F(kA) = (kA)M + M(kA) = k(AM) + k(MA) = k(AM + MA) = kF(A)$$

Thus, $F$ is linear.

5.13. Prove Theorem 5.2: Let $V$ and $U$ be vector spaces over a field $K$. Let $\{v_1, v_2, \ldots, v_n\}$ be a basis of $V$ and let $u_1, u_2, \ldots, u_n$ be any vectors in $U$. Then there exists a unique linear mapping $F : V \to U$ such that $F(v_1) = u_1, F(v_2) = u_2, \ldots, F(v_n) = u_n$.
There are three steps to the proof of the theorem: (1) Define the mapping $F : V \to U$ such that $F(v_i) = u_i$, $i = 1, \ldots, n$. (2) Show that $F$ is linear. (3) Show that $F$ is unique.

Step 1. Let $v \in V$. Because $\{v_1, \ldots, v_n\}$ is a basis of $V$, there exist unique scalars $a_1, \ldots, a_n \in K$ for which $v = a_1 v_1 + a_2 v_2 + \cdots + a_n v_n$. We define $F : V \to U$ by

$$F(v) = a_1 u_1 + a_2 u_2 + \cdots + a_n u_n$$

(Because the $a_i$ are unique, the mapping $F$ is well defined.) Now, for $i = 1, \ldots, n$,

$$v_i = 0v_1 + \cdots + 1v_i + \cdots + 0v_n$$

Hence,

$$F(v_i) = 0u_1 + \cdots + 1u_i + \cdots + 0u_n = u_i$$

Thus, the first step of the proof is complete.

Step 2. Suppose $v = a_1 v_1 + a_2 v_2 + \cdots + a_n v_n$ and $w = b_1 v_1 + b_2 v_2 + \cdots + b_n v_n$. Then

$$v + w = (a_1 + b_1)v_1 + (a_2 + b_2)v_2 + \cdots + (a_n + b_n)v_n$$

and, for any $k \in K$, $kv = ka_1 v_1 + ka_2 v_2 + \cdots + ka_n v_n$. By definition of the mapping $F$,

$$F(v) = a_1 u_1 + a_2 u_2 + \cdots + a_n u_n \quad \text{and} \quad F(w) = b_1 u_1 + b_2 u_2 + \cdots + b_n u_n$$

Hence,

$$F(v + w) = (a_1 + b_1)u_1 + (a_2 + b_2)u_2 + \cdots + (a_n + b_n)u_n = (a_1 u_1 + a_2 u_2 + \cdots + a_n u_n) + (b_1 u_1 + b_2 u_2 + \cdots + b_n u_n) = F(v) + F(w)$$

and

$$F(kv) = ka_1 u_1 + ka_2 u_2 + \cdots + ka_n u_n = k(a_1 u_1 + a_2 u_2 + \cdots + a_n u_n) = kF(v)$$

Thus, $F$ is linear.

Step 3. Suppose $G : V \to U$ is linear and $G(v_i) = u_i$, $i = 1, \ldots, n$. Let

$$v = a_1 v_1 + a_2 v_2 + \cdots + a_n v_n$$

Then

$$G(v) = G(a_1 v_1 + a_2 v_2 + \cdots + a_n v_n) = a_1 G(v_1) + a_2 G(v_2) + \cdots + a_n G(v_n) = a_1 u_1 + a_2 u_2 + \cdots + a_n u_n = F(v)$$

Because $G(v) = F(v)$ for every $v \in V$, $G = F$. Thus, $F$ is unique and the theorem is proved.

5.14. Let $F : \mathbf{R}^2 \to \mathbf{R}^2$ be the linear mapping for which $F(1, 2) = (2, 3)$ and $F(0, 1) = (1, 4)$. [Note that $\{(1, 2), (0, 1)\}$ is a basis of $\mathbf{R}^2$, so such a linear map $F$ exists and is unique by Theorem 5.2.] Find a formula for $F$; that is, find $F(a, b)$.
Write $(a, b)$ as a linear combination of $(1, 2)$ and $(0, 1)$ using unknowns $x$ and $y$,

$$(a, b) = x(1, 2) + y(0, 1) = (x, 2x + y), \quad \text{so} \quad a = x, \quad b = 2x + y$$

Solve for $x$ and $y$ in terms of $a$ and $b$ to get $x = a$, $y = -2a + b$. Then

$$F(a, b) = xF(1, 2) + yF(0, 1) = a(2, 3) + (-2a + b)(1, 4) = (b, -5a + 4b)$$
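The construction in Problem 5.14 (express the input in the given basis, then apply the same coefficients to the prescribed images) can be written out as a short function and compared with the closed formula. This is a sketch for this specific basis only; the names `F` and `closed_form` and the test points are assumptions made for the illustration.

```python
def F(a, b):
    """Extend F(1, 2) = (2, 3), F(0, 1) = (1, 4) linearly to all of R^2."""
    x = a            # coefficient of the basis vector (1, 2)
    y = b - 2 * a    # coefficient of the basis vector (0, 1)
    # F(a, b) = x * F(1, 2) + y * F(0, 1)
    return (x * 2 + y * 1, x * 3 + y * 4)

def closed_form(a, b):
    """The formula derived in the solution: F(a, b) = (b, -5a + 4b)."""
    return (b, -5 * a + 4 * b)

points = [(1, 2), (0, 1), (3, 5), (-4, 7)]
agree = all(F(a, b) == closed_form(a, b) for a, b in points)
print(F(1, 2), F(0, 1), agree)
```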

5.15. Suppose a linear mapping $F : V \to U$ is one-to-one and onto. Show that the inverse mapping $F^{-1} : U \to V$ is also linear.
Suppose $u, u' \in U$. Because $F$ is one-to-one and onto, there exist unique vectors $v, v' \in V$ for which $F(v) = u$ and $F(v') = u'$. Because $F$ is linear, we also have

$$F(v + v') = F(v) + F(v') = u + u' \quad \text{and} \quad F(kv) = kF(v) = ku$$

By definition of the inverse mapping,

$$F^{-1}(u) = v, \quad F^{-1}(u') = v', \quad F^{-1}(u + u') = v + v', \quad F^{-1}(ku) = kv$$

Then

$$F^{-1}(u + u') = v + v' = F^{-1}(u) + F^{-1}(u') \quad \text{and} \quad F^{-1}(ku) = kv = kF^{-1}(u)$$

Thus, $F^{-1}$ is linear.

Kernel and Image of Linear Mappings

5.16. Let $F : \mathbf{R}^4 \to \mathbf{R}^3$ be the linear mapping defined by

$$F(x, y, z, t) = (x - y + z + t, \; x + 2z - t, \; x + y + 3z - 3t)$$

Find a basis and the dimension of (a) the image of $F$; (b) the kernel of $F$.
(a) Find the images of the usual basis of $\mathbf{R}^4$:

$$F(1, 0, 0, 0) = (1, 1, 1), \quad F(0, 1, 0, 0) = (-1, 0, 1), \quad F(0, 0, 1, 0) = (1, 2, 3), \quad F(0, 0, 0, 1) = (1, -1, -3)$$

By Proposition 5.4, the image vectors span $\operatorname{Im} F$. Hence, form the matrix whose rows are these image vectors, and row reduce to echelon form:

$$\begin{bmatrix} 1 & 1 & 1 \\ -1 & 0 & 1 \\ 1 & 2 & 3 \\ 1 & -1 & -3 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 2 \\ 0 & 1 & 2 \\ 0 & -2 & -4 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$

Thus, $(1, 1, 1)$ and $(0, 1, 2)$ form a basis for $\operatorname{Im} F$; hence, $\dim(\operatorname{Im} F) = 2$.
(b) Set $F(v) = 0$, where $v = (x, y, z, t)$; that is, set

$$F(x, y, z, t) = (x - y + z + t, \; x + 2z - t, \; x + y + 3z - 3t) = (0, 0, 0)$$

Set corresponding entries equal to each other to form the following homogeneous system whose solution space is $\operatorname{Ker} F$:

$$\begin{aligned} x - y + z + t &= 0 \\ x + 2z - t &= 0 \\ x + y + 3z - 3t &= 0 \end{aligned} \quad \text{or} \quad \begin{aligned} x - y + z + t &= 0 \\ y + z - 2t &= 0 \\ 2y + 2z - 4t &= 0 \end{aligned} \quad \text{or} \quad \begin{aligned} x - y + z + t &= 0 \\ y + z - 2t &= 0 \end{aligned}$$

The free variables are $z$ and $t$. Hence, $\dim(\operatorname{Ker} F) = 2$.
(i) Set $z = -1$, $t = 0$ to obtain the solution $(2, 1, -1, 0)$.
(ii) Set $z = 0$, $t = 1$ to obtain the solution $(1, 2, 0, 1)$.
Thus, $(2, 1, -1, 0)$ and $(1, 2, 0, 1)$ form a basis of $\operatorname{Ker} F$. [As expected, $\dim(\operatorname{Im} F) + \dim(\operatorname{Ker} F) = 2 + 2 = 4 = \dim \mathbf{R}^4$, the domain of $F$.]
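The answers to Problem 5.16 can be checked by evaluating $F$ directly. The sketch below confirms that the two kernel basis vectors map to zero and that the image of every standard basis vector lies in the span of $(1, 1, 1)$ and $(0, 1, 2)$. The membership test `in_span_of_image_basis` is a helper written for this particular pair of vectors and is an assumption of the illustration, not a general method.

```python
def F(v):
    """F(x, y, z, t) = (x - y + z + t, x + 2z - t, x + y + 3z - 3t)."""
    x, y, z, t = v
    return (x - y + z + t, x + 2 * z - t, x + y + 3 * z - 3 * t)

standard_basis = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]
images = [F(e) for e in standard_basis]

def in_span_of_image_basis(u):
    """True if u = p*(1, 1, 1) + s*(0, 1, 2) for some scalars p, s."""
    p, q, r = u
    s = q - p                      # forced by the first two coordinates
    return r == p + 2 * s          # third coordinate must then match

kernel_basis = [(2, 1, -1, 0), (1, 2, 0, 1)]
kernel_ok = all(F(v) == (0, 0, 0) for v in kernel_basis)
image_ok = all(in_span_of_image_basis(u) for u in images)
print(images, kernel_ok, image_ok)
```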

5.17. Let $G : \mathbf{R}^3 \to \mathbf{R}^3$ be the linear mapping defined by

$$G(x, y, z) = (x + 2y - z, \; y + z, \; x + y - 2z)$$

Find a basis and the dimension of (a) the image of $G$, (b) the kernel of $G$.
(a) Find the images of the usual basis of $\mathbf{R}^3$:

$$G(1, 0, 0) = (1, 0, 1), \quad G(0, 1, 0) = (2, 1, 1), \quad G(0, 0, 1) = (-1, 1, -2)$$

By Proposition 5.4, the image vectors span $\operatorname{Im} G$. Hence, form the matrix $M$ whose rows are these image vectors, and row reduce to echelon form:

$$M = \begin{bmatrix} 1 & 0 & 1 \\ 2 & 1 & 1 \\ -1 & 1 & -2 \end{bmatrix} \sim \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & -1 \\ 0 & 1 & -1 \end{bmatrix} \sim \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & -1 \\ 0 & 0 & 0 \end{bmatrix}$$

Thus, $(1, 0, 1)$ and $(0, 1, -1)$ form a basis for $\operatorname{Im} G$; hence, $\dim(\operatorname{Im} G) = 2$.
(b) Set $G(v) = 0$, where $v = (x, y, z)$; that is,

$$G(x, y, z) = (x + 2y - z, \; y + z, \; x + y - 2z) = (0, 0, 0)$$

Set corresponding entries equal to each other to form the following homogeneous system whose solution space is $\operatorname{Ker} G$:

$$\begin{aligned} x + 2y - z &= 0 \\ y + z &= 0 \\ x + y - 2z &= 0 \end{aligned} \quad \text{or} \quad \begin{aligned} x + 2y - z &= 0 \\ y + z &= 0 \\ -y - z &= 0 \end{aligned} \quad \text{or} \quad \begin{aligned} x + 2y - z &= 0 \\ y + z &= 0 \end{aligned}$$

The only free variable is $z$; hence, $\dim(\operatorname{Ker} G) = 1$. Set $z = 1$; then $y = -1$ and $x = 3$. Thus, $(3, -1, 1)$ forms a basis of $\operatorname{Ker} G$. [As expected, $\dim(\operatorname{Im} G) + \dim(\operatorname{Ker} G) = 2 + 1 = 3 = \dim \mathbf{R}^3$, the domain of $G$.]

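5.18. Consider the matrix mapping $A : \mathbf{R}^4 \to \mathbf{R}^3$, where

$$A = \begin{bmatrix} 1 & 2 & 3 & 1 \\ 1 & 3 & 5 & -2 \\ 3 & 8 & 13 & -3 \end{bmatrix}$$

Find a basis and the dimension of (a) the image of $A$, (b) the kernel of $A$.

(a) The column space of $A$ is equal to $\operatorname{Im} A$. Now reduce $A^T$ to echelon form:

$$A^T = \begin{bmatrix} 1 & 1 & 3 \\ 2 & 3 & 8 \\ 3 & 5 & 13 \\ 1 & -2 & -3 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 3 \\ 0 & 1 & 2 \\ 0 & 2 & 4 \\ 0 & -3 & -6 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 3 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$

Thus, $\{(1, 1, 3), (0, 1, 2)\}$ is a basis of $\operatorname{Im} A$, and $\dim(\operatorname{Im} A) = 2$.
(b) Here $\operatorname{Ker} A$ is the solution space of the homogeneous system $AX = 0$, where $X = (x, y, z, t)^T$. Thus, reduce the matrix $A$ of coefficients to echelon form:

$$A \sim \begin{bmatrix} 1 & 2 & 3 & 1 \\ 0 & 1 & 2 & -3 \\ 0 & 2 & 4 & -6 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 3 & 1 \\ 0 & 1 & 2 & -3 \\ 0 & 0 & 0 & 0 \end{bmatrix} \quad \text{or} \quad \begin{aligned} x + 2y + 3z + t &= 0 \\ y + 2z - 3t &= 0 \end{aligned}$$

The free variables are $z$ and $t$. Thus, $\dim(\operatorname{Ker} A) = 2$.
(i) Set $z = 1$, $t = 0$ to get the solution $(1, -2, 1, 0)$.
(ii) Set $z = 0$, $t = 1$ to get the solution $(-7, 3, 0, 1)$.
Thus, $(1, -2, 1, 0)$ and $(-7, 3, 0, 1)$ form a basis for $\operatorname{Ker} A$.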
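The dimensions found in Problem 5.18 can be cross-checked by computing the rank of $A$ with exact rational arithmetic and verifying that the two kernel vectors satisfy $AX = 0$. The `rank` routine below is a plain forward-elimination sketch written for this illustration (it is not taken from the text), using Python's standard `fractions` module to avoid rounding.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0                                   # number of pivot rows found so far
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue                        # no pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

def mat_vec(A, u):
    """Return the product Au."""
    return [sum(a * x for a, x in zip(row, u)) for row in A]

A = [[1, 2, 3, 1],
     [1, 3, 5, -2],
     [3, 8, 13, -3]]

dim_image = rank(A)
dim_kernel = len(A[0]) - dim_image          # Theorem 5.6: dim V = nullity + rank
kernel_basis = [(1, -2, 1, 0), (-7, 3, 0, 1)]
kernel_ok = all(mat_vec(A, v) == [0, 0, 0] for v in kernel_basis)
print(dim_image, dim_kernel, kernel_ok)
```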

5.19. Find a linear map $F : \mathbf{R}^3 \to \mathbf{R}^4$ whose image is spanned by $(1, 2, 0, -4)$ and $(2, 0, -1, -3)$.
Form a $4 \times 3$ matrix whose columns consist only of the given vectors, say

$$A = \begin{bmatrix} 1 & 2 & 2 \\ 2 & 0 & 0 \\ 0 & -1 & -1 \\ -4 & -3 & -3 \end{bmatrix}$$

Recall that $A$ determines a linear map $A : \mathbf{R}^3 \to \mathbf{R}^4$ whose image is spanned by the columns of $A$. Thus, $A$ satisfies the required condition.

5.20. Suppose $f : V \to U$ is linear with kernel $W$, and that $f(v) = u$. Show that the "coset" $v + W = \{v + w : w \in W\}$ is the preimage of $u$; that is, $f^{-1}(u) = v + W$.
We must prove that (i) $f^{-1}(u) \subseteq v + W$ and (ii) $v + W \subseteq f^{-1}(u)$.
We first prove (i). Suppose $v' \in f^{-1}(u)$. Then $f(v') = u$, and so

$$f(v' - v) = f(v') - f(v) = u - u = 0$$

that is, $v' - v \in W$. Thus, $v' = v + (v' - v) \in v + W$, and hence $f^{-1}(u) \subseteq v + W$.
Now we prove (ii). Suppose $v' \in v + W$. Then $v' = v + w$, where $w \in W$. Because $W$ is the kernel of $f$, we have $f(w) = 0$. Accordingly,

$$f(v') = f(v + w) = f(v) + f(w) = f(v) + 0 = f(v) = u$$

Thus, $v' \in f^{-1}(u)$, and so $v + W \subseteq f^{-1}(u)$.
Both inclusions imply $f^{-1}(u) = v + W$.

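5.21. Suppose $F : V \to U$ and $G : U \to W$ are linear. Prove (a) $\operatorname{rank}(G \circ F) \le \operatorname{rank}(G)$, (b) $\operatorname{rank}(G \circ F) \le \operatorname{rank}(F)$.

(a) Because $F(V) \subseteq U$, we also have $G(F(V)) \subseteq G(U)$, and so $\dim[G(F(V))] \le \dim[G(U)]$. Then

$$\operatorname{rank}(G \circ F) = \dim[(G \circ F)(V)] = \dim[G(F(V))] \le \dim[G(U)] = \operatorname{rank}(G)$$

(b) We have $\dim[G(F(V))] \le \dim[F(V)]$. Hence,

$$\operatorname{rank}(G \circ F) = \dim[(G \circ F)(V)] = \dim[G(F(V))] \le \dim[F(V)] = \operatorname{rank}(F)$$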

5.22. Prove Theorem 5.3: Let $F : V \to U$ be linear. Then, (a) $\operatorname{Im} F$ is a subspace of $U$, (b) $\operatorname{Ker} F$ is a subspace of $V$.
(a) Because $F(0) = 0$, we have $0 \in \operatorname{Im} F$. Now suppose $u, u' \in \operatorname{Im} F$ and $a, b \in K$. Because $u$ and $u'$ belong to the image of $F$, there exist vectors $v, v' \in V$ such that $F(v) = u$ and $F(v') = u'$. Then

$$F(av + bv') = aF(v) + bF(v') = au + bu' \in \operatorname{Im} F$$

Thus, the image of $F$ is a subspace of $U$.
(b) Because $F(0) = 0$, we have $0 \in \operatorname{Ker} F$. Now suppose $v, w \in \operatorname{Ker} F$ and $a, b \in K$. Because $v$ and $w$ belong to the kernel of $F$, $F(v) = 0$ and $F(w) = 0$. Thus,

$$F(av + bw) = aF(v) + bF(w) = a0 + b0 = 0 + 0 = 0, \quad \text{and so} \quad av + bw \in \operatorname{Ker} F$$

Thus, the kernel of $F$ is a subspace of $V$.

5.23. Prove Theorem 5.6: Suppose $V$ has finite dimension and $F : V \to U$ is linear. Then

$$\dim V = \dim(\operatorname{Ker} F) + \dim(\operatorname{Im} F) = \operatorname{nullity}(F) + \operatorname{rank}(F)$$

Suppose $\dim(\operatorname{Ker} F) = r$ and $\{w_1, \ldots, w_r\}$ is a basis of $\operatorname{Ker} F$, and suppose $\dim(\operatorname{Im} F) = s$ and $\{u_1, \ldots, u_s\}$ is a basis of $\operatorname{Im} F$. (By Proposition 5.4, $\operatorname{Im} F$ has finite dimension.) Because every $u_j \in \operatorname{Im} F$, there exist vectors $v_1, \ldots, v_s$ in $V$ such that $F(v_1) = u_1, \ldots, F(v_s) = u_s$. We claim that the set

$$B = \{w_1, \ldots, w_r, v_1, \ldots, v_s\}$$

is a basis of $V$; that is, (i) $B$ spans $V$, and (ii) $B$ is linearly independent. Once we prove (i) and (ii), then $\dim V = r + s = \dim(\operatorname{Ker} F) + \dim(\operatorname{Im} F)$.

(i) $B$ spans $V$. Let $v \in V$. Then $F(v) \in \operatorname{Im} F$. Because the $u_j$ span $\operatorname{Im} F$, there exist scalars $a_1, \ldots, a_s$ such that $F(v) = a_1 u_1 + \cdots + a_s u_s$. Set $\hat{v} = a_1 v_1 + \cdots + a_s v_s - v$. Then

$$F(\hat{v}) = F(a_1 v_1 + \cdots + a_s v_s - v) = a_1 F(v_1) + \cdots + a_s F(v_s) - F(v) = a_1 u_1 + \cdots + a_s u_s - F(v) = 0$$

Thus, $\hat{v} \in \operatorname{Ker} F$. Because the $w_i$ span $\operatorname{Ker} F$, there exist scalars $b_1, \ldots, b_r$ such that

$$\hat{v} = b_1 w_1 + \cdots + b_r w_r = a_1 v_1 + \cdots + a_s v_s - v$$

Accordingly,

$$v = a_1 v_1 + \cdots + a_s v_s - b_1 w_1 - \cdots - b_r w_r$$

Thus, $B$ spans $V$.

(ii) $B$ is linearly independent. Suppose

$$x_1 w_1 + \cdots + x_r w_r + y_1 v_1 + \cdots + y_s v_s = 0 \qquad (1)$$

where $x_i, y_j \in K$. Then

$$0 = F(0) = F(x_1 w_1 + \cdots + x_r w_r + y_1 v_1 + \cdots + y_s v_s) = x_1 F(w_1) + \cdots + x_r F(w_r) + y_1 F(v_1) + \cdots + y_s F(v_s) \qquad (2)$$

But $F(w_i) = 0$, since $w_i \in \operatorname{Ker} F$, and $F(v_j) = u_j$. Substituting into (2), we obtain $y_1 u_1 + \cdots + y_s u_s = 0$. Since the $u_j$ are linearly independent, each $y_j = 0$. Substitution into (1) gives $x_1 w_1 + \cdots + x_r w_r = 0$. Since the $w_i$ are linearly independent, each $x_i = 0$. Thus $B$ is linearly independent.

Singular and Nonsingular Linear Maps, Isomorphisms

5.24. Determine whether or not each of the following linear maps is nonsingular. If not, find a nonzero vector $v$ whose image is 0.
(a) $F : \mathbf{R}^2 \to \mathbf{R}^2$ defined by $F(x, y) = (x - y, x - 2y)$.
(b) $G : \mathbf{R}^2 \to \mathbf{R}^2$ defined by $G(x, y) = (2x - 4y, 3x - 6y)$.

(a) Find $\operatorname{Ker} F$ by setting $F(v) = 0$, where $v = (x, y)$,

$$(x - y, x - 2y) = (0, 0) \quad \text{or} \quad \begin{aligned} x - y &= 0 \\ x - 2y &= 0 \end{aligned} \quad \text{or} \quad \begin{aligned} x - y &= 0 \\ -y &= 0 \end{aligned}$$

The only solution is $x = 0$, $y = 0$. Hence, $F$ is nonsingular.
(b) Set $G(x, y) = (0, 0)$ to find $\operatorname{Ker} G$:

$$(2x - 4y, 3x - 6y) = (0, 0) \quad \text{or} \quad \begin{aligned} 2x - 4y &= 0 \\ 3x - 6y &= 0 \end{aligned} \quad \text{or} \quad x - 2y = 0$$

The system has nonzero solutions, because $y$ is a free variable. Hence, $G$ is singular. Let $y = 1$ to obtain the solution $v = (2, 1)$, which is a nonzero vector such that $G(v) = 0$.

5.25. The linear map $F : \mathbf{R}^2 \to \mathbf{R}^2$ defined by $F(x, y) = (x - y, x - 2y)$ is nonsingular by the previous Problem 5.24. Find a formula for $F^{-1}$.
Set $F(x, y) = (a, b)$, so that $F^{-1}(a, b) = (x, y)$. We have

$$(x - y, x - 2y) = (a, b) \quad \text{or} \quad \begin{aligned} x - y &= a \\ x - 2y &= b \end{aligned} \quad \text{or} \quad \begin{aligned} x - y &= a \\ y &= a - b \end{aligned}$$

Solve for $x$ and $y$ in terms of $a$ and $b$ to get $x = 2a - b$, $y = a - b$. Thus,

$$F^{-1}(a, b) = (2a - b, a - b) \quad \text{or} \quad F^{-1}(x, y) = (2x - y, x - y)$$

(The second equation is obtained by replacing $a$ and $b$ by $x$ and $y$, respectively.)

5.26. Let $G : \mathbf{R}^2 \to \mathbf{R}^3$ be defined by $G(x, y) = (x + y, x - 2y, 3x + y)$. (a) Show that $G$ is nonsingular. (b) Find a formula for $G^{-1}$.
(a) Set $G(x, y) = (0, 0, 0)$ to find $\operatorname{Ker} G$. We have

$$(x + y, x - 2y, 3x + y) = (0, 0, 0) \quad \text{or} \quad x + y = 0, \quad x - 2y = 0, \quad 3x + y = 0$$

The only solution is $x = 0$, $y = 0$; hence, $G$ is nonsingular.
(b) Although $G$ is nonsingular, it is not invertible, because $\mathbf{R}^2$ and $\mathbf{R}^3$ have different dimensions. (Thus, Theorem 5.9 does not apply.) Accordingly, $G^{-1}$ does not exist.

5.27. Suppose that $F : V \to U$ is linear and that $V$ is of finite dimension. Show that $V$ and the image of $F$ have the same dimension if and only if $F$ is nonsingular. Determine all nonsingular linear mappings $T : \mathbf{R}^4 \to \mathbf{R}^3$.
By Theorem 5.6, $\dim V = \dim(\operatorname{Im} F) + \dim(\operatorname{Ker} F)$. Hence, $V$ and $\operatorname{Im} F$ have the same dimension if and only if $\dim(\operatorname{Ker} F) = 0$ or $\operatorname{Ker} F = \{0\}$ (i.e., if and only if $F$ is nonsingular).
Because $\dim \mathbf{R}^3$ is less than $\dim \mathbf{R}^4$, we have that $\dim(\operatorname{Im} T)$ is less than the dimension of the domain $\mathbf{R}^4$ of $T$. Accordingly, no linear mapping $T : \mathbf{R}^4 \to \mathbf{R}^3$ can be nonsingular.

5.28. Prove Theorem 5.7: Let $F : V \to U$ be a nonsingular linear mapping. Then the image of any linearly independent set is linearly independent.
Suppose $v_1, v_2, \ldots, v_n$ are linearly independent vectors in $V$. We claim that $F(v_1), F(v_2), \ldots, F(v_n)$ are also linearly independent. Suppose $a_1 F(v_1) + a_2 F(v_2) + \cdots + a_n F(v_n) = 0$, where $a_i \in K$. Because $F$ is linear, $F(a_1 v_1 + a_2 v_2 + \cdots + a_n v_n) = 0$. Hence,

$$a_1 v_1 + a_2 v_2 + \cdots + a_n v_n \in \operatorname{Ker} F$$

But $F$ is nonsingular, that is, $\operatorname{Ker} F = \{0\}$. Hence, $a_1 v_1 + a_2 v_2 + \cdots + a_n v_n = 0$. Because the $v_i$ are linearly independent, all the $a_i$ are 0. Accordingly, the $F(v_i)$ are linearly independent. Thus, the theorem is proved.

5.29. Prove Theorem 5.9: Suppose $V$ has finite dimension and $\dim V = \dim U$. Suppose $F : V \to U$ is linear. Then $F$ is an isomorphism if and only if $F$ is nonsingular.
If $F$ is an isomorphism, then only 0 maps to 0; hence, $F$ is nonsingular. Conversely, suppose $F$ is nonsingular. Then $\dim(\operatorname{Ker} F) = 0$. By Theorem 5.6, $\dim V = \dim(\operatorname{Ker} F) + \dim(\operatorname{Im} F)$. Thus,

$$\dim U = \dim V = \dim(\operatorname{Im} F)$$

Because $U$ has finite dimension, $\operatorname{Im} F = U$. This means $F$ maps $V$ onto $U$. Thus, $F$ is one-to-one and onto; that is, $F$ is an isomorphism.

Operations with Linear Maps

5.30. Define $F : \mathbf{R}^3 \to \mathbf{R}^2$ and $G : \mathbf{R}^3 \to \mathbf{R}^2$ by $F(x, y, z) = (2x, y + z)$ and $G(x, y, z) = (x - z, y)$. Find formulas defining the maps: (a) $F + G$, (b) $3F$, (c) $2F - 5G$.
(a) $(F + G)(x, y, z) = F(x, y, z) + G(x, y, z) = (2x, y + z) + (x - z, y) = (3x - z, 2y + z)$
(b) $(3F)(x, y, z) = 3F(x, y, z) = 3(2x, y + z) = (6x, 3y + 3z)$
(c) $(2F - 5G)(x, y, z) = 2F(x, y, z) - 5G(x, y, z) = 2(2x, y + z) - 5(x - z, y) = (4x, 2y + 2z) + (-5x + 5z, -5y) = (-x + 5z, -3y + 2z)$

5.31. Let $F : \mathbf{R}^3 \to \mathbf{R}^2$ and $G : \mathbf{R}^2 \to \mathbf{R}^2$ be defined by $F(x, y, z) = (2x, y + z)$ and $G(x, y) = (y, x)$. Derive formulas defining the mappings: (a) $G \circ F$, (b) $F \circ G$.
(a) $(G \circ F)(x, y, z) = G(F(x, y, z)) = G(2x, y + z) = (y + z, 2x)$
(b) The mapping $F \circ G$ is not defined, because the image of $G$ is not contained in the domain of $F$.

5.32. Prove: (a) The zero mapping $\mathbf{0}$, defined by $\mathbf{0}(v) = 0 \in U$ for every $v \in V$, is the zero element of $\mathrm{Hom}(V, U)$. (b) The negative of $F \in \mathrm{Hom}(V, U)$ is the mapping $(-1)F$, that is, $-F = (-1)F$.
Let $F \in \mathrm{Hom}(V, U)$. Then, for every $v \in V$:
(a) $(F + \mathbf{0})(v) = F(v) + \mathbf{0}(v) = F(v) + 0 = F(v)$
Because $(F + \mathbf{0})(v) = F(v)$ for every $v \in V$, we have $F + \mathbf{0} = F$. Similarly, $\mathbf{0} + F = F$.
(b) $(F + (-1)F)(v) = F(v) + (-1)F(v) = F(v) - F(v) = 0 = \mathbf{0}(v)$
Thus, $F + (-1)F = \mathbf{0}$. Similarly, $(-1)F + F = \mathbf{0}$. Hence, $-F = (-1)F$.

5.33. Suppose $F_1, F_2, \ldots, F_n$ are linear maps from $V$ into $U$. Show that, for any scalars $a_1, a_2, \ldots, a_n$, and for any $v \in V$,

$$(a_1 F_1 + a_2 F_2 + \cdots + a_n F_n)(v) = a_1 F_1(v) + a_2 F_2(v) + \cdots + a_n F_n(v)$$

The mapping $a_1 F_1$ is defined by $(a_1 F_1)(v) = a_1 F_1(v)$. Hence, the theorem holds for $n = 1$. Accordingly, by induction,

$$(a_1 F_1 + a_2 F_2 + \cdots + a_n F_n)(v) = (a_1 F_1)(v) + (a_2 F_2 + \cdots + a_n F_n)(v) = a_1 F_1(v) + a_2 F_2(v) + \cdots + a_n F_n(v)$$

5.34. Consider linear mappings $F : \mathbf{R}^3 \to \mathbf{R}^2$, $G : \mathbf{R}^3 \to \mathbf{R}^2$, $H : \mathbf{R}^3 \to \mathbf{R}^2$ defined by

$$F(x, y, z) = (x + y + z, x + y), \quad G(x, y, z) = (2x + z, x + y), \quad H(x, y, z) = (2y, x)$$

Show that $F$, $G$, $H$ are linearly independent [as elements of $\mathrm{Hom}(\mathbf{R}^3, \mathbf{R}^2)$].
Suppose, for scalars $a, b, c \in K$,

$$aF + bG + cH = \mathbf{0} \qquad (1)$$

(Here $\mathbf{0}$ is the zero mapping.) For $e_1 = (1, 0, 0) \in \mathbf{R}^3$, we have $\mathbf{0}(e_1) = (0, 0)$ and

$$(aF + bG + cH)(e_1) = aF(1, 0, 0) + bG(1, 0, 0) + cH(1, 0, 0) = a(1, 1) + b(2, 1) + c(0, 1) = (a + 2b, \; a + b + c)$$

Thus by (1), $(a + 2b, a + b + c) = (0, 0)$ and so

$$a + 2b = 0 \quad \text{and} \quad a + b + c = 0 \qquad (2)$$

Similarly for $e_2 = (0, 1, 0) \in \mathbf{R}^3$, we have $\mathbf{0}(e_2) = (0, 0)$ and

$$(aF + bG + cH)(e_2) = aF(0, 1, 0) + bG(0, 1, 0) + cH(0, 1, 0) = a(1, 1) + b(0, 1) + c(2, 0) = (a + 2c, \; a + b)$$

Thus,

$$a + 2c = 0 \quad \text{and} \quad a + b = 0 \qquad (3)$$

Using (2) and (3), we obtain

$$a = 0, \quad b = 0, \quad c = 0 \qquad (4)$$

Because (1) implies (4), the mappings $F$, $G$, $H$ are linearly independent.

5.35. Let $k$ be a nonzero scalar. Show that a linear map $T$ is singular if and only if $kT$ is singular. Hence, $T$ is singular if and only if $-T$ is singular.
Suppose $T$ is singular. Then $T(v) = 0$ for some vector $v \neq 0$. Hence,

$$(kT)(v) = kT(v) = k0 = 0$$

and so $kT$ is singular.
Now suppose $kT$ is singular. Then $(kT)(w) = 0$ for some vector $w \neq 0$. Hence,

$$T(kw) = kT(w) = (kT)(w) = 0$$

But $k \neq 0$ and $w \neq 0$ implies $kw \neq 0$. Thus, $T$ is also singular.

5.36. Find the dimension $d$ of:
(a) $\mathrm{Hom}(\mathbf{R}^3, \mathbf{R}^4)$, (b) $\mathrm{Hom}(\mathbf{R}^5, \mathbf{R}^3)$, (c) $\mathrm{Hom}(P_3(t), \mathbf{R}^2)$, (d) $\mathrm{Hom}(M_{2,3}, \mathbf{R}^4)$.
Use $\dim[\mathrm{Hom}(V, U)] = mn$, where $\dim V = m$ and $\dim U = n$.
(a) $d = 3(4) = 12$.
(b) $d = 5(3) = 15$.
(c) Because $\dim P_3(t) = 4$, $d = 4(2) = 8$.
(d) Because $\dim M_{2,3} = 6$, $d = 6(4) = 24$.

5.37. Prove Theorem 5.11. Suppose $\dim V = m$ and $\dim U = n$. Then $\dim[\mathrm{Hom}(V, U)] = mn$.

Suppose $\{v_1, \ldots, v_m\}$ is a basis of $V$ and $\{u_1, \ldots, u_n\}$ is a basis of $U$. By Theorem 5.2, a linear mapping in $\mathrm{Hom}(V, U)$ is uniquely determined by arbitrarily assigning elements of $U$ to the basis elements $v_i$ of $V$. We define

$$F_{ij} \in \mathrm{Hom}(V, U), \quad i = 1, \ldots, m, \quad j = 1, \ldots, n$$

to be the linear mapping for which $F_{ij}(v_i) = u_j$, and $F_{ij}(v_k) = 0$ for $k \neq i$. That is, $F_{ij}$ maps $v_i$ into $u_j$ and the other $v$'s into 0. Observe that $\{F_{ij}\}$ contains exactly $mn$ elements; hence, the theorem is proved if we show that it is a basis of $\mathrm{Hom}(V, U)$.

Proof that $\{F_{ij}\}$ generates $\mathrm{Hom}(V, U)$. Consider an arbitrary function $F \in \mathrm{Hom}(V, U)$. Suppose $F(v_1) = w_1, F(v_2) = w_2, \ldots, F(v_m) = w_m$. Because $w_k \in U$, it is a linear combination of the $u$'s; say,

$$w_k = a_{k1} u_1 + a_{k2} u_2 + \cdots + a_{kn} u_n, \quad k = 1, \ldots, m, \quad a_{ij} \in K \qquad (1)$$

Consider the linear mapping $G = \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij} F_{ij}$. Because $G$ is a linear combination of the $F_{ij}$, the proof that $\{F_{ij}\}$ generates $\mathrm{Hom}(V, U)$ is complete if we show that $F = G$.
We now compute $G(v_k)$, $k = 1, \ldots, m$. Because $F_{ij}(v_k) = 0$ for $k \neq i$ and $F_{kj}(v_k) = u_j$,

$$G(v_k) = \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij} F_{ij}(v_k) = \sum_{j=1}^{n} a_{kj} F_{kj}(v_k) = \sum_{j=1}^{n} a_{kj} u_j = a_{k1} u_1 + a_{k2} u_2 + \cdots + a_{kn} u_n$$

Thus, by (1), $G(v_k) = w_k$ for each $k$. But $F(v_k) = w_k$ for each $k$. Accordingly, by Theorem 5.2, $F = G$; hence, $\{F_{ij}\}$ generates $\mathrm{Hom}(V, U)$.

Proof that $\{F_{ij}\}$ is linearly independent. Suppose, for scalars $c_{ij} \in K$,

$$\sum_{i=1}^{m} \sum_{j=1}^{n} c_{ij} F_{ij} = \mathbf{0}$$

For $v_k$, $k = 1, \ldots, m$,

$$0 = \mathbf{0}(v_k) = \sum_{i=1}^{m} \sum_{j=1}^{n} c_{ij} F_{ij}(v_k) = \sum_{j=1}^{n} c_{kj} F_{kj}(v_k) = \sum_{j=1}^{n} c_{kj} u_j = c_{k1} u_1 + c_{k2} u_2 + \cdots + c_{kn} u_n$$

But the $u_i$ are linearly independent; hence, for $k = 1, \ldots, m$, we have $c_{k1} = 0, c_{k2} = 0, \ldots, c_{kn} = 0$. In other words, all the $c_{ij} = 0$, and so $\{F_{ij}\}$ is linearly independent.

5.38. Prove Theorem 5.12: (i) $G \circ (F + F') = G \circ F + G \circ F'$. (ii) $(G + G') \circ F = G \circ F + G' \circ F$. (iii) $k(G \circ F) = (kG) \circ F = G \circ (kF)$.

(i) For every $v \in V$,
$$(G \circ (F + F'))(v) = G((F + F')(v)) = G(F(v) + F'(v)) = G(F(v)) + G(F'(v)) = (G \circ F)(v) + (G \circ F')(v) = (G \circ F + G \circ F')(v)$$
Thus, $G \circ (F + F') = G \circ F + G \circ F'$.

(ii) For every $v \in V$,
$$((G + G') \circ F)(v) = (G + G')(F(v)) = G(F(v)) + G'(F(v)) = (G \circ F)(v) + (G' \circ F)(v) = (G \circ F + G' \circ F)(v)$$
Thus, $(G + G') \circ F = G \circ F + G' \circ F$.

(iii) For every $v \in V$,
$$(k(G \circ F))(v) = k(G \circ F)(v) = k(G(F(v))) = (kG)(F(v)) = (kG \circ F)(v)$$
and
$$(k(G \circ F))(v) = k(G \circ F)(v) = k(G(F(v))) = G(kF(v)) = G((kF)(v)) = (G \circ kF)(v)$$
Accordingly, $k(G \circ F) = (kG) \circ F = G \circ (kF)$. (We emphasize that two mappings are shown to be equal by showing that each of them assigns the same image to each point in the domain.)

Algebra of Linear Maps

5.39. Let $F$ and $G$ be the linear operators on $\mathbf{R}^2$ defined by $F(x, y) = (y, x)$ and $G(x, y) = (0, x)$. Find formulas defining the following operators: (a) $F + G$, (b) $2F - 3G$, (c) $FG$, (d) $GF$, (e) $F^2$, (f) $G^2$.
(a) $(F + G)(x, y) = F(x, y) + G(x, y) = (y, x) + (0, x) = (y, 2x)$.
(b) $(2F - 3G)(x, y) = 2F(x, y) - 3G(x, y) = 2(y, x) - 3(0, x) = (2y, -x)$.
(c) $(FG)(x, y) = F(G(x, y)) = F(0, x) = (x, 0)$.
(d) $(GF)(x, y) = G(F(x, y)) = G(y, x) = (0, y)$.
(e) $F^2(x, y) = F(F(x, y)) = F(y, x) = (x, y)$. (Note that $F^2 = I$, the identity mapping.)
(f) $G^2(x, y) = G(G(x, y)) = G(0, x) = (0, 0)$. (Note that $G^2 = 0$, the zero mapping.)

5.40. Consider the linear operator $T$ on $\mathbf{R}^3$ defined by $T(x, y, z) = (2x, 4x - y, 2x + 3y - z)$. (a) Show that $T$ is invertible. Find formulas for (b) $T^{-1}$, (c) $T^2$, (d) $T^{-2}$.

(a) Let $W = \mathrm{Ker}\, T$. We need only show that $T$ is nonsingular (i.e., that $W = \{0\}$). Set $T(x, y, z) = (0, 0, 0)$, which yields
$$T(x, y, z) = (2x, 4x - y, 2x + 3y - z) = (0, 0, 0)$$
Thus, $W$ is the solution space of the homogeneous system
$$2x = 0, \qquad 4x - y = 0, \qquad 2x + 3y - z = 0$$
which has only the trivial solution $(0, 0, 0)$. Thus, $W = \{0\}$. Hence, $T$ is nonsingular, and so $T$ is invertible.

(b) Set $T(x, y, z) = (r, s, t)$ [and so $T^{-1}(r, s, t) = (x, y, z)$]. We have
$$(2x, 4x - y, 2x + 3y - z) = (r, s, t) \qquad \text{or} \qquad 2x = r, \quad 4x - y = s, \quad 2x + 3y - z = t$$
Solve for $x, y, z$ in terms of $r, s, t$ to get $x = \tfrac{1}{2}r$, $y = 2r - s$, $z = 7r - 3s - t$. Thus,
$$T^{-1}(r, s, t) = (\tfrac{1}{2}r, \; 2r - s, \; 7r - 3s - t) \qquad \text{or} \qquad T^{-1}(x, y, z) = (\tfrac{1}{2}x, \; 2x - y, \; 7x - 3y - z)$$

(c) Apply $T$ twice to get
$$T^2(x, y, z) = T(2x, 4x - y, 2x + 3y - z) = [4x, \; 4(2x) - (4x - y), \; 2(2x) + 3(4x - y) - (2x + 3y - z)] = (4x, \; 4x + y, \; 14x - 6y + z)$$

(d) Apply $T^{-1}$ twice to get
$$T^{-2}(x, y, z) = T^{-1}(\tfrac{1}{2}x, \; 2x - y, \; 7x - 3y - z) = [\tfrac{1}{4}x, \; 2(\tfrac{1}{2}x) - (2x - y), \; 7(\tfrac{1}{2}x) - 3(2x - y) - (7x - 3y - z)] = (\tfrac{1}{4}x, \; -x + y, \; -\tfrac{19}{2}x + 6y + z)$$
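The formulas of Problem 5.40 can be spot-checked numerically. The following is a minimal sketch, assuming exact rational arithmetic through Python's `fractions` module; the function names `T`, `T_inv`, and `T_inv2` and the sample vector are illustrative choices, not part of the text.

```python
from fractions import Fraction

def T(v):
    """The operator T(x, y, z) = (2x, 4x - y, 2x + 3y - z) of Problem 5.40."""
    x, y, z = v
    return (2 * x, 4 * x - y, 2 * x + 3 * y - z)

def T_inv(v):
    """Inverse found in part (b); pass Fractions so that r / 2 stays exact."""
    r, s, t = v
    return (r / 2, 2 * r - s, 7 * r - 3 * s - t)

def T_inv2(v):
    """Closed formula for T^{-2} found in part (d)."""
    x, y, z = v
    return (x / 4, -x + y, Fraction(-19, 2) * x + 6 * y + z)

# An arbitrary sample vector (hypothetical test data).
v = (Fraction(3), Fraction(-1), Fraction(5))
print(T_inv(T(v)) == v)               # T^{-1} undoes T on this vector
print(T_inv2(v) == T_inv(T_inv(v)))   # part (d) agrees with applying T^{-1} twice
```

A check on one vector is not a proof, but for linear maps on a 3-dimensional space agreement on three linearly independent vectors would be.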


5.41. Let $V$ be of finite dimension and let $T$ be a linear operator on $V$ for which $TR = I$, for some operator $R$ on $V$. (We call $R$ a right inverse of $T$.)
(a) Show that $T$ is invertible. (b) Show that $R = T^{-1}$. (c) Give an example showing that the above need not hold if $V$ is of infinite dimension.

(a) Let $\dim V = n$. By Theorem 5.14, $T$ is invertible if and only if $T$ is onto; hence, $T$ is invertible if and only if $\mathrm{rank}(T) = n$. We have $n = \mathrm{rank}(I) = \mathrm{rank}(TR) \le \mathrm{rank}(T) \le n$. Hence, $\mathrm{rank}(T) = n$ and $T$ is invertible.

(b) $TT^{-1} = T^{-1}T = I$. Then $R = IR = (T^{-1}T)R = T^{-1}(TR) = T^{-1}I = T^{-1}$.

(c) Let $V$ be the space of polynomials in $t$ over $K$; say, $p(t) = a_0 + a_1t + a_2t^2 + \cdots + a_st^s$. Let $T$ and $R$ be the operators on $V$ defined by
$$T(p(t)) = 0 + a_1 + a_2t + \cdots + a_st^{s-1} \qquad \text{and} \qquad R(p(t)) = a_0t + a_1t^2 + \cdots + a_st^{s+1}$$
We have
$$(TR)(p(t)) = T(R(p(t))) = T(a_0t + a_1t^2 + \cdots + a_st^{s+1}) = a_0 + a_1t + \cdots + a_st^s = p(t)$$
and so $TR = I$, the identity mapping. On the other hand, if $k \in K$ and $k \ne 0$, then
$$(RT)(k) = R(T(k)) = R(0) = 0 \ne k$$
Accordingly, $RT \ne I$.

5.42. Let $F$ and $G$ be linear operators on $\mathbf{R}^2$ defined by $F(x, y) = (0, x)$ and $G(x, y) = (x, 0)$. Show that (a) $GF = 0$, the zero mapping, but $FG \ne 0$. (b) $G^2 = G$.
(a) $(GF)(x, y) = G(F(x, y)) = G(0, x) = (0, 0)$. Because $GF$ assigns $0 = (0, 0)$ to every vector $(x, y)$ in $\mathbf{R}^2$, it is the zero mapping; that is, $GF = 0$. On the other hand, $(FG)(x, y) = F(G(x, y)) = F(x, 0) = (0, x)$. For example, $(FG)(2, 3) = (0, 2)$. Thus, $FG \ne 0$, as it does not assign $0 = (0, 0)$ to every vector in $\mathbf{R}^2$.
(b) For any vector $(x, y)$ in $\mathbf{R}^2$, we have $G^2(x, y) = G(G(x, y)) = G(x, 0) = (x, 0) = G(x, y)$. Hence, $G^2 = G$.
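The computations in Problem 5.42 amount to composing two functions in both orders. A small sketch, assuming vectors are represented as Python tuples; the helper `compose` is a hypothetical name introduced only for this illustration.

```python
def F(v):
    """F(x, y) = (0, x)."""
    x, y = v
    return (0, x)

def G(v):
    """G(x, y) = (x, 0)."""
    x, y = v
    return (x, 0)

def compose(outer, inner):
    """Return the map v -> outer(inner(v)), i.e. the product 'outer inner'."""
    return lambda v: outer(inner(v))

GF = compose(G, F)
FG = compose(F, G)

print(GF((2, 3)))   # image of (2, 3) under GF
print(FG((2, 3)))   # image of (2, 3) under FG
```

Evaluating on the usual basis vectors $(1, 0)$ and $(0, 1)$ is enough to decide whether a linear operator on $\mathbf{R}^2$ is the zero mapping.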

5.43. Find the dimension of (a) $A(\mathbf{R}^4)$, (b) $A(\mathbf{P}_2(t))$, (c) $A(\mathbf{M}_{2,3})$.
Use $\dim[A(V)] = n^2$ where $\dim V = n$. Hence, (a) $\dim[A(\mathbf{R}^4)] = 4^2 = 16$, (b) $\dim[A(\mathbf{P}_2(t))] = 3^2 = 9$, (c) $\dim[A(\mathbf{M}_{2,3})] = 6^2 = 36$.

5.44. Let $E$ be a linear operator on $V$ for which $E^2 = E$. (Such an operator is called a projection.) Let $U$ be the image of $E$, and let $W$ be the kernel. Prove
(a) If $u \in U$, then $E(u) = u$ (i.e., $E$ is the identity mapping on $U$).
(b) If $E \ne I$, then $E$ is singular; that is, $E(v) = 0$ for some $v \ne 0$.
(c) $V = U \oplus W$.

(a) If $u \in U$, the image of $E$, then $E(v) = u$ for some $v \in V$. Hence, using $E^2 = E$, we have
$$u = E(v) = E^2(v) = E(E(v)) = E(u)$$

(b) If $E \ne I$, then for some $v \in V$, $E(v) = u$, where $v \ne u$. By (a), $E(u) = u$. Thus,
$$E(v - u) = E(v) - E(u) = u - u = 0, \qquad \text{where } v - u \ne 0$$

(c) We first show that $V = U + W$. Let $v \in V$. Set $u = E(v)$ and $w = v - E(v)$. Then
$$v = E(v) + v - E(v) = u + w$$
By definition, $u = E(v) \in U$, the image of $E$. We now show that $w \in W$, the kernel of $E$:
$$E(w) = E(v - E(v)) = E(v) - E^2(v) = E(v) - E(v) = 0$$
and thus $w \in W$. Hence, $V = U + W$.
We next show that $U \cap W = \{0\}$. Let $v \in U \cap W$. Because $v \in U$, $E(v) = v$ by part (a). Because $v \in W$, $E(v) = 0$. Thus, $v = E(v) = 0$ and so $U \cap W = \{0\}$.
The above two properties imply that $V = U \oplus W$.
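The decomposition $v = E(v) + (v - E(v))$ used in part (c) can be tried on a concrete projection. The sketch below assumes the operator $E(x, y) = (x + y, 0)$ on $\mathbf{R}^2$, which is an example chosen here for illustration and does not come from the problem; `decompose` is likewise a hypothetical helper name.

```python
def E(v):
    """An illustrative projection on R^2: E(x, y) = (x + y, 0), so E(E(v)) = E(v)."""
    x, y = v
    return (x + y, 0)

def decompose(v):
    """Split v as u + w with u = E(v) in Im E and w = v - E(v) in Ker E."""
    u = E(v)
    w = (v[0] - u[0], v[1] - u[1])
    return u, w

v = (3, 4)
u, w = decompose(v)
print(u, w)          # the two components of v
print(E(u) == u)     # E acts as the identity on its image, as in part (a)
print(E(w))          # w lies in the kernel of E
```

Here $\mathrm{Im}\, E$ is the $x$-axis and $\mathrm{Ker}\, E$ is the line $x + y = 0$, so the two subspaces meet only in the zero vector.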

SUPPLEMENTARY PROBLEMS

Mappings
5.45. Determine the number of different mappings from (a) $\{1, 2\}$ into $\{1, 2, 3\}$; (b) $\{1, 2, \ldots, r\}$ into $\{1, 2, \ldots, s\}$.
5.46. Let $f: \mathbf{R} \to \mathbf{R}$ and $g: \mathbf{R} \to \mathbf{R}$ be defined by $f(x) = x^2 + 3x + 1$ and $g(x) = 2x - 3$. Find formulas defining the composition mappings: (a) $f \circ g$; (b) $g \circ f$; (c) $g \circ g$; (d) $f \circ f$.
5.47. For each mapping $f: \mathbf{R} \to \mathbf{R}$, find a formula for its inverse: (a) $f(x) = 3x - 7$, (b) $f(x) = x^3 + 2$.
5.48. For any mapping $f: A \to B$, show that $1_B \circ f = f = f \circ 1_A$.

Linear Mappings
5.49. Show that the following mappings are linear:
(a) $F: \mathbf{R}^3 \to \mathbf{R}^2$ defined by $F(x, y, z) = (x + 2y - 3z, 4x - 5y + 6z)$.
(b) $F: \mathbf{R}^2 \to \mathbf{R}^2$ defined by $F(x, y) = (ax + by, cx + dy)$, where $a, b, c, d$ belong to $\mathbf{R}$.
5.50. Show that the following mappings are not linear:
(a) $F: \mathbf{R}^2 \to \mathbf{R}^2$ defined by $F(x, y) = (x^2, y^2)$.
(b) $F: \mathbf{R}^3 \to \mathbf{R}^2$ defined by $F(x, y, z) = (x + 1, y + z)$.
(c) $F: \mathbf{R}^2 \to \mathbf{R}^2$ defined by $F(x, y) = (xy, y)$.
(d) $F: \mathbf{R}^3 \to \mathbf{R}^2$ defined by $F(x, y, z) = (|x|, y + z)$.
5.51. Find $F(a, b)$, where the linear map $F: \mathbf{R}^2 \to \mathbf{R}^2$ is defined by $F(1, 2) = (3, -1)$ and $F(0, 1) = (2, 1)$.
5.52. Find a $2 \times 2$ matrix $A$ that maps
(a) $(1, 3)^T$ and $(1, 4)^T$ into $(-2, 5)^T$ and $(3, -1)^T$, respectively.
(b) $(2, -4)^T$ and $(-1, 2)^T$ into $(1, 1)^T$ and $(1, 3)^T$, respectively.
5.53. Find a $2 \times 2$ singular matrix $B$ that maps $(1, 1)^T$ into $(1, 3)^T$.
5.54. Let $V$ be the vector space of real $n$-square matrices, and let $M$ be a fixed nonzero matrix in $V$. Show that the first two of the following mappings $T: V \to V$ are linear, but the third is not: (a) $T(A) = MA$, (b) $T(A) = AM + MA$, (c) $T(A) = M + A$.
5.55. Give an example of a nonlinear map $F: \mathbf{R}^2 \to \mathbf{R}^2$ such that $F^{-1}(0) = \{0\}$ but $F$ is not one-to-one.
5.56. Let $F: \mathbf{R}^2 \to \mathbf{R}^2$ be defined by $F(x, y) = (3x + 5y, 2x + 3y)$, and let $S$ be the unit circle in $\mathbf{R}^2$. ($S$ consists of all points satisfying $x^2 + y^2 = 1$.) Find (a) the image $F(S)$, (b) the preimage $F^{-1}(S)$.
5.57. Consider the linear map $G: \mathbf{R}^3 \to \mathbf{R}^3$ defined by $G(x, y, z) = (x + y + z, y - 2z, y - 3z)$ and the unit sphere $S_2$ in $\mathbf{R}^3$, which consists of the points satisfying $x^2 + y^2 + z^2 = 1$. Find (a) $G(S_2)$, (b) $G^{-1}(S_2)$.
5.58. Let $H$ be the plane $x + 2y - 3z = 4$ in $\mathbf{R}^3$ and let $G$ be the linear map in Problem 5.57. Find (a) $G(H)$, (b) $G^{-1}(H)$.
5.59. Let $W$ be a subspace of $V$. The inclusion map, denoted by $i: W \hookrightarrow V$, is defined by $i(w) = w$ for every $w \in W$. Show that the inclusion map is linear.
5.60. Suppose $F: V \to U$ is linear. Show that $F(-v) = -F(v)$.

Kernel and Image of Linear Mappings
5.61. For each linear map $F$ find a basis and the dimension of the kernel and the image of $F$:
(a) $F: \mathbf{R}^3 \to \mathbf{R}^3$ defined by $F(x, y, z) = (x + 2y - 3z, 2x + 5y - 4z, x + 4y + z)$,
(b) $F: \mathbf{R}^4 \to \mathbf{R}^3$ defined by $F(x, y, z, t) = (x + 2y + 3z + 2t, 2x + 4y + 7z + 5t, x + 2y + 6z + 5t)$.
5.62. For each linear map $G$, find a basis and the dimension of the kernel and the image of $G$:
(a) $G: \mathbf{R}^3 \to \mathbf{R}^2$ defined by $G(x, y, z) = (x + y + z, 2x + 2y + 2z)$,
(b) $G: \mathbf{R}^3 \to \mathbf{R}^2$ defined by $G(x, y, z) = (x + y, y + z)$,
(c) $G: \mathbf{R}^5 \to \mathbf{R}^3$ defined by $G(x, y, z, s, t) = (x + 2y + 2z + s + t, \; x + 2y + 3z + 2s - t, \; 3x + 6y + 8z + 5s - t)$.
5.63. Each of the following matrices determines a linear map from $\mathbf{R}^4$ into $\mathbf{R}^3$:
(a) $A = \begin{bmatrix} 1 & 2 & 0 & 1 \\ 2 & -1 & 2 & -1 \\ 1 & -3 & 2 & -2 \end{bmatrix}$, (b) $B = \begin{bmatrix} 1 & 0 & 2 & -1 \\ 2 & 3 & -1 & 1 \\ -2 & 0 & -5 & 3 \end{bmatrix}$.
Find a basis as well as the dimension of the kernel and the image of each linear map.
5.64. Find a linear mapping $F: \mathbf{R}^3 \to \mathbf{R}^3$ whose image is spanned by $(1, 2, 3)$ and $(4, 5, 6)$.
5.65. Find a linear mapping $G: \mathbf{R}^4 \to \mathbf{R}^3$ whose kernel is spanned by $(1, 2, 3, 4)$ and $(0, 1, 1, 1)$.
5.66. Let $V = \mathbf{P}_{10}(t)$, the vector space of polynomials of degree $\le 10$. Consider the linear map $D^4: V \to V$, where $D^4$ denotes the fourth derivative $d^4(f)/dt^4$. Find a basis and the dimension of (a) the image of $D^4$; (b) the kernel of $D^4$.
5.67. Suppose $F: V \to U$ is linear. Show that (a) the image of any subspace of $V$ is a subspace of $U$; (b) the preimage of any subspace of $U$ is a subspace of $V$.
5.68. Show that if $F: V \to U$ is onto, then $\dim U \le \dim V$. Determine all linear maps $F: \mathbf{R}^3 \to \mathbf{R}^4$ that are onto.
5.69. Consider the zero mapping $0: V \to U$ defined by $0(v) = 0$, $\forall v \in V$. Find the kernel and the image of 0.

Operations with Linear Mappings
5.70. Let $F: \mathbf{R}^3 \to \mathbf{R}^2$ and $G: \mathbf{R}^3 \to \mathbf{R}^2$ be defined by $F(x, y, z) = (y, x + z)$ and $G(x, y, z) = (2z, x - y)$. Find formulas defining the mappings $F + G$ and $3F - 2G$.
5.71. Let $H: \mathbf{R}^2 \to \mathbf{R}^2$ be defined by $H(x, y) = (y, 2x)$. Using the maps $F$ and $G$ in Problem 5.70, find formulas defining the mappings: (a) $H \circ F$ and $H \circ G$, (b) $F \circ H$ and $G \circ H$, (c) $H \circ (F + G)$ and $H \circ F + H \circ G$.
5.72. Show that the following mappings $F, G, H$ are linearly independent:
(a) $F, G, H \in \mathrm{Hom}(\mathbf{R}^2, \mathbf{R}^2)$ defined by $F(x, y) = (x, 2y)$, $G(x, y) = (y, x + y)$, $H(x, y) = (0, x)$,
(b) $F, G, H \in \mathrm{Hom}(\mathbf{R}^3, \mathbf{R})$ defined by $F(x, y, z) = x + y + z$, $G(x, y, z) = y + z$, $H(x, y, z) = x - z$.
5.73. For $F, G \in \mathrm{Hom}(V, U)$, show that $\mathrm{rank}(F + G) \le \mathrm{rank}(F) + \mathrm{rank}(G)$. (Here $V$ has finite dimension.)
5.74. Let $F: V \to U$ and $G: U \to V$ be linear. Show that if $F$ and $G$ are nonsingular, then $G \circ F$ is nonsingular. Give an example where $G \circ F$ is nonsingular but $G$ is not. [Hint: Let $\dim V < \dim U$.]
5.75. Find the dimension $d$ of (a) $\mathrm{Hom}(\mathbf{R}^2, \mathbf{R}^8)$, (b) $\mathrm{Hom}(\mathbf{P}_4(t), \mathbf{R}^3)$, (c) $\mathrm{Hom}(\mathbf{M}_{2,4}, \mathbf{P}_2(t))$.
5.76. Determine whether or not each of the following linear maps is nonsingular. If not, find a nonzero vector $v$ whose image is 0; otherwise find a formula for the inverse map:
(a) $F: \mathbf{R}^3 \to \mathbf{R}^3$ defined by $F(x, y, z) = (x + y + z, 2x + 3y + 5z, x + 3y + 7z)$,
(b) $G: \mathbf{R}^3 \to \mathbf{P}_2(t)$ defined by $G(x, y, z) = (x + y)t^2 + (x + 2y + 2z)t + y + z$,
(c) $H: \mathbf{R}^2 \to \mathbf{P}_2(t)$ defined by $H(x, y) = (x + 2y)t^2 + (x - y)t + x + y$.
5.77. When can $\dim[\mathrm{Hom}(V, U)] = \dim V$?

Algebra of Linear Operators
5.78. Let $F$ and $G$ be the linear operators on $\mathbf{R}^2$ defined by $F(x, y) = (x + y, 0)$ and $G(x, y) = (-y, x)$. Find formulas defining the linear operators: (a) $F + G$, (b) $5F - 3G$, (c) $FG$, (d) $GF$, (e) $F^2$, (f) $G^2$.
5.79. Show that each linear operator $T$ on $\mathbf{R}^2$ is nonsingular and find a formula for $T^{-1}$, where (a) $T(x, y) = (x + 2y, 2x + 3y)$, (b) $T(x, y) = (2x - 3y, 3x - 4y)$.
5.80. Show that each of the following linear operators $T$ on $\mathbf{R}^3$ is nonsingular and find a formula for $T^{-1}$, where (a) $T(x, y, z) = (x - 3y - 2z, y - 4z, z)$; (b) $T(x, y, z) = (x + z, x - y, y)$.
5.81. Find the dimension of $A(V)$, where (a) $V = \mathbf{R}^7$, (b) $V = \mathbf{P}_5(t)$, (c) $V = \mathbf{M}_{3,4}$.
5.82. Which of the following integers can be the dimension of an algebra $A(V)$ of linear maps: 5, 9, 12, 25, 28, 36, 45, 64, 88, 100?
5.83. Let $T$ be the linear operator on $\mathbf{R}^2$ defined by $T(x, y) = (x + 2y, 3x + 4y)$. Find a formula for $f(T)$, where (a) $f(t) = t^2 + 2t - 3$, (b) $f(t) = t^2 - 5t - 2$.

Miscellaneous Problems
5.84. Suppose $F: V \to U$ is linear and $k$ is a nonzero scalar. Prove that the maps $F$ and $kF$ have the same kernel and the same image.
5.85. Suppose $F$ and $G$ are linear operators on $V$ and that $F$ is nonsingular. Assume that $V$ has finite dimension. Show that $\mathrm{rank}(FG) = \mathrm{rank}(GF) = \mathrm{rank}(G)$.
5.86. Suppose $V$ has finite dimension. Suppose $T$ is a linear operator on $V$ such that $\mathrm{rank}(T^2) = \mathrm{rank}(T)$. Show that $\mathrm{Ker}\, T \cap \mathrm{Im}\, T = \{0\}$.
5.87. Suppose $V = U \oplus W$. Let $E_1$ and $E_2$ be the linear operators on $V$ defined by $E_1(v) = u$, $E_2(v) = w$, where $v = u + w$, $u \in U$, $w \in W$. Show that (a) $E_1^2 = E_1$ and $E_2^2 = E_2$ (i.e., that $E_1$ and $E_2$ are projections); (b) $E_1 + E_2 = I$, the identity mapping; (c) $E_1E_2 = 0$ and $E_2E_1 = 0$.
5.88. Let $E_1$ and $E_2$ be linear operators on $V$ satisfying parts (a), (b), (c) of Problem 5.87. Prove $V = \mathrm{Im}\, E_1 \oplus \mathrm{Im}\, E_2$.
5.89. Let $v$ and $w$ be elements of a real vector space $V$. The line segment $L$ from $v$ to $v + w$ is defined to be the set of vectors $v + tw$ for $0 \le t \le 1$. (See Fig. 5-6.)
(a) Show that the line segment $L$ between vectors $v$ and $u$ consists of the points: (i) $(1 - t)v + tu$ for $0 \le t \le 1$, (ii) $t_1v + t_2u$ for $t_1 + t_2 = 1$, $t_1 \ge 0$, $t_2 \ge 0$.
(b) Let $F: V \to U$ be linear. Show that the image $F(L)$ of a line segment $L$ in $V$ is a line segment in $U$.

Figure 5-6

5.90. Let $F: V \to U$ be linear and let $W$ be a subspace of $V$. The restriction of $F$ to $W$ is the map $F|W: W \to U$ defined by $F|W(v) = F(v)$ for every $v$ in $W$. Prove the following: (a) $F|W$ is linear; (b) $\mathrm{Ker}(F|W) = (\mathrm{Ker}\, F) \cap W$; (c) $\mathrm{Im}(F|W) = F(W)$.
5.91. A subset $X$ of a vector space $V$ is said to be convex if the line segment $L$ between any two points (vectors) $P, Q \in X$ is contained in $X$. (a) Show that the intersection of convex sets is convex; (b) suppose $F: V \to U$ is linear and $X$ is convex. Show that $F(X)$ is convex.

ANSWERS TO SUPPLEMENTARY PROBLEMS
5.45. (a) $3^2 = 9$; (b) $s^r$
5.46. (a) $(f \circ g)(x) = 4x^2 - 6x + 1$, (b) $(g \circ f)(x) = 2x^2 + 6x - 1$, (c) $(g \circ g)(x) = 4x - 9$, (d) $(f \circ f)(x) = x^4 + 6x^3 + 14x^2 + 15x + 5$
5.47. (a) $f^{-1}(x) = \tfrac{1}{3}(x + 7)$, (b) $f^{-1}(x) = \sqrt[3]{x - 2}$
5.49. $F(x, y, z) = A(x, y, z)^T$, where (a) $A = \begin{bmatrix} 1 & 2 & -3 \\ 4 & -5 & 6 \end{bmatrix}$, (b) $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$
5.50. (a) $u = (2, 2)$, $k = 3$; then $F(ku) = (36, 36)$ but $kF(u) = (12, 12)$; (b) $F(0) \ne 0$; (c) $u = (1, 2)$, $v = (3, 4)$; then $F(u + v) = (24, 6)$ but $F(u) + F(v) = (14, 6)$; (d) $u = (1, 2, 3)$, $k = -2$; then $F(ku) = (2, -10)$ but $kF(u) = (-2, -10)$.
5.51. $F(a, b) = (-a + 2b, -3a + b)$
5.52. (a) $A = \begin{bmatrix} -17 & 5 \\ 23 & -6 \end{bmatrix}$; (b) None. $(2, -4)$ and $(-1, 2)$ are linearly dependent but not $(1, 1)$ and $(1, 3)$.
5.53. $B = \begin{bmatrix} 1 & 0 \\ 3 & 0 \end{bmatrix}$ [Hint: Send $(0, 1)^T$ into $(0, 0)^T$.]
5.55. $F(x, y) = (x^2, y^2)$
5.56. (a) $13x^2 - 42xy + 34y^2 = 1$, (b) $13x^2 + 42xy + 34y^2 = 1$
5.57. (a) $x^2 - 8xy + 26y^2 + 6xz - 38yz + 14z^2 = 1$, (b) $x^2 + 2xy + 3y^2 + 2xz - 8yz + 14z^2 = 1$
5.58. (a) $x - y + 2z = 4$, (b) $x + 6z = 4$
5.61. (a) $\dim(\mathrm{Ker}\, F) = 1$, $\{(7, -2, 1)\}$; $\dim(\mathrm{Im}\, F) = 2$, $\{(1, 2, 1), (0, 1, 2)\}$; (b) $\dim(\mathrm{Ker}\, F) = 2$, $\{(-2, 1, 0, 0), (1, 0, -1, 1)\}$; $\dim(\mathrm{Im}\, F) = 2$, $\{(1, 2, 1), (0, 1, 3)\}$
5.62. (a) $\dim(\mathrm{Ker}\, G) = 2$, $\{(1, 0, -1), (1, -1, 0)\}$; $\dim(\mathrm{Im}\, G) = 1$, $\{(1, 2)\}$; (b) $\dim(\mathrm{Ker}\, G) = 1$, $\{(1, -1, 1)\}$; $\mathrm{Im}\, G = \mathbf{R}^2$, $\{(1, 0), (0, 1)\}$; (c) $\dim(\mathrm{Ker}\, G) = 3$, $\{(-2, 1, 0, 0, 0), (1, 0, -1, 1, 0), (-5, 0, 2, 0, 1)\}$; $\dim(\mathrm{Im}\, G) = 2$, $\{(1, 1, 3), (0, 1, 2)\}$
5.63. (a) $\dim(\mathrm{Ker}\, A) = 2$, $\{(4, -2, -5, 0), (1, -3, 0, 5)\}$; $\dim(\mathrm{Im}\, A) = 2$, $\{(1, 2, 1), (0, 1, 1)\}$; (b) $\dim(\mathrm{Ker}\, B) = 1$, $\{(-1, \tfrac{2}{3}, 1, 1)\}$; $\mathrm{Im}\, B = \mathbf{R}^3$
5.64. $F(x, y, z) = (x + 4y, 2x + 5y, 3x + 6y)$
5.65. $F(x, y, z, t) = (x + y - z, 2x + y - t, 0)$
5.66. (a) $\{1, t, t^2, \ldots, t^6\}$, (b) $\{1, t, t^2, t^3\}$
5.68. None, because $\dim \mathbf{R}^4 > \dim \mathbf{R}^3$.
5.69. $\mathrm{Ker}\, 0 = V$, $\mathrm{Im}\, 0 = \{0\}$
5.70. $(F + G)(x, y, z) = (y + 2z, 2x - y + z)$, $(3F - 2G)(x, y, z) = (3y - 4z, x + 2y + 3z)$
5.71. (a) $(H \circ F)(x, y, z) = (x + z, 2y)$, $(H \circ G)(x, y, z) = (x - y, 4z)$; (b) not defined; (c) $(H \circ (F + G))(x, y, z) = (H \circ F + H \circ G)(x, y, z) = (2x - y + z, 2y + 4z)$
5.74. $F(x, y) = (x, y, y)$; $G(x, y, z) = (x, y)$
5.75. (a) 16, (b) 15, (c) 24
5.76. (a) $v = (2, -3, 1)$; (b) $G^{-1}(at^2 + bt + c) = (b - 2c, a - b + 2c, -a + b - c)$; (c) $H$ is nonsingular, but not invertible, because $\dim \mathbf{P}_2(t) > \dim \mathbf{R}^2$.
5.77. $\dim U = 1$; that is, $U = K$.
5.78. (a) $(F + G)(x, y) = (x, x)$; (b) $(5F - 3G)(x, y) = (5x + 8y, -3x)$; (c) $(FG)(x, y) = (x - y, 0)$; (d) $(GF)(x, y) = (0, x + y)$; (e) $F^2(x, y) = (x + y, 0)$ (note that $F^2 = F$); (f) $G^2(x, y) = (-x, -y)$. [Note that $G^2 + I = 0$; hence, $G$ is a zero of $f(t) = t^2 + 1$.]
5.79. (a) $T^{-1}(x, y) = (-3x + 2y, 2x - y)$, (b) $T^{-1}(x, y) = (-4x + 3y, -3x + 2y)$
5.80. (a) $T^{-1}(x, y, z) = (x + 3y + 14z, y + 4z, z)$, (b) $T^{-1}(x, y, z) = (y + z, z, x - y - z)$
5.81. (a) 49, (b) 36, (c) 144
5.82. Squares: 9, 25, 36, 64, 100
5.83. (a) $f(T)(x, y) = (6x + 14y, 21x + 27y)$; (b) $f(T)(x, y) = (0, 0)$; that is, $f(T) = 0$

CHAPTER 6

Linear Mappings and Matrices
6.1 Introduction

Consider a basis $S = \{u_1, u_2, \ldots, u_n\}$ of a vector space $V$ over a field $K$. For any vector $v \in V$, suppose
$$v = a_1u_1 + a_2u_2 + \cdots + a_nu_n$$
Then the coordinate vector of $v$ relative to the basis $S$, which we assume to be a column vector (unless otherwise stated or implied), is denoted and defined by
$$[v]_S = [a_1, a_2, \ldots, a_n]^T$$
Recall (Section 4.11) that the mapping $v \mapsto [v]_S$, determined by the basis $S$, is an isomorphism between $V$ and $K^n$.
This chapter shows that there is also an isomorphism, determined by the basis $S$, between the algebra $A(V)$ of linear operators on $V$ and the algebra $M$ of $n$-square matrices over $K$. Thus, every linear mapping $F: V \to V$ will correspond to an $n$-square matrix $[F]_S$ determined by the basis $S$. We will also show how our matrix representation changes when we choose another basis.

6.2 Matrix Representation of a Linear Operator

Let $T$ be a linear operator (transformation) from a vector space $V$ into itself, and suppose $S = \{u_1, u_2, \ldots, u_n\}$ is a basis of $V$. Now $T(u_1), T(u_2), \ldots, T(u_n)$ are vectors in $V$, and so each is a linear combination of the vectors in the basis $S$; say,
$$\begin{aligned} T(u_1) &= a_{11}u_1 + a_{12}u_2 + \cdots + a_{1n}u_n \\ T(u_2) &= a_{21}u_1 + a_{22}u_2 + \cdots + a_{2n}u_n \\ &\;\;\vdots \\ T(u_n) &= a_{n1}u_1 + a_{n2}u_2 + \cdots + a_{nn}u_n \end{aligned}$$
The following definition applies.

DEFINITION: The transpose of the above matrix of coefficients, denoted by $m_S(T)$ or $[T]_S$, is called the matrix representation of $T$ relative to the basis $S$, or simply the matrix of $T$ in the basis $S$. (The subscript $S$ may be omitted if the basis $S$ is understood.)

Using the coordinate (column) vector notation, the matrix representation of $T$ may be written in the form
$$m_S(T) = [T]_S = \big[\, [T(u_1)]_S, \; [T(u_2)]_S, \; \ldots, \; [T(u_n)]_S \,\big]$$
That is, the columns of $m(T)$ are the coordinate vectors of $T(u_1), T(u_2), \ldots, T(u_n)$, respectively.

EXAMPLE 6.1 Let $F: \mathbf{R}^2 \to \mathbf{R}^2$ be the linear operator defined by $F(x, y) = (2x + 3y, 4x - 5y)$.

(a) Find the matrix representation of $F$ relative to the basis $S = \{u_1, u_2\} = \{(1, 2), (2, 5)\}$.

(1) First find $F(u_1)$, and then write it as a linear combination of the basis vectors $u_1$ and $u_2$. (For notational convenience, we use column vectors.) We have
$$F(u_1) = F\begin{bmatrix} 1 \\ 2 \end{bmatrix} = \begin{bmatrix} 8 \\ -6 \end{bmatrix} = x\begin{bmatrix} 1 \\ 2 \end{bmatrix} + y\begin{bmatrix} 2 \\ 5 \end{bmatrix} \qquad \text{and} \qquad \begin{aligned} x + 2y &= 8 \\ 2x + 5y &= -6 \end{aligned}$$
Solve the system to obtain $x = 52$, $y = -22$. Hence, $F(u_1) = 52u_1 - 22u_2$.

(2) Next find $F(u_2)$, and then write it as a linear combination of $u_1$ and $u_2$:
$$F(u_2) = F\begin{bmatrix} 2 \\ 5 \end{bmatrix} = \begin{bmatrix} 19 \\ -17 \end{bmatrix} = x\begin{bmatrix} 1 \\ 2 \end{bmatrix} + y\begin{bmatrix} 2 \\ 5 \end{bmatrix} \qquad \text{and} \qquad \begin{aligned} x + 2y &= 19 \\ 2x + 5y &= -17 \end{aligned}$$
Solve the system to get $x = 129$, $y = -55$. Thus, $F(u_2) = 129u_1 - 55u_2$.

Now write the coordinates of $F(u_1)$ and $F(u_2)$ as columns to obtain the matrix
$$[F]_S = \begin{bmatrix} 52 & 129 \\ -22 & -55 \end{bmatrix}$$

(b) Find the matrix representation of $F$ relative to the (usual) basis $E = \{e_1, e_2\} = \{(1, 0), (0, 1)\}$.

Find $F(e_1)$ and write it as a linear combination of the usual basis vectors $e_1$ and $e_2$, and then find $F(e_2)$ and write it as a linear combination of $e_1$ and $e_2$. We have
$$\begin{aligned} F(e_1) &= F(1, 0) = (2, 4) = 2e_1 + 4e_2 \\ F(e_2) &= F(0, 1) = (3, -5) = 3e_1 - 5e_2 \end{aligned} \qquad \text{and so} \qquad [F]_E = \begin{bmatrix} 2 & 3 \\ 4 & -5 \end{bmatrix}$$
Note that the coordinates of $F(e_1)$ and $F(e_2)$ form the columns, not the rows, of $[F]_E$. Also, note that the arithmetic is much simpler using the usual basis of $\mathbf{R}^2$.

EXAMPLE 6.2 Let $V$ be the vector space of functions with basis $S = \{\sin t, \cos t, e^{3t}\}$, and let $D: V \to V$ be the differential operator defined by $D(f(t)) = d(f(t))/dt$. We compute the matrix representing $D$ in the basis $S$:
$$\begin{aligned} D(\sin t) &= \cos t = 0(\sin t) + 1(\cos t) + 0(e^{3t}) \\ D(\cos t) &= -\sin t = -1(\sin t) + 0(\cos t) + 0(e^{3t}) \\ D(e^{3t}) &= 3e^{3t} = 0(\sin t) + 0(\cos t) + 3(e^{3t}) \end{aligned} \qquad \text{and so} \qquad [D] = \begin{bmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 3 \end{bmatrix}$$
Note that the coordinates of $D(\sin t)$, $D(\cos t)$, $D(e^{3t})$ form the columns, not the rows, of $[D]$.

Matrix Mappings and Their Matrix Representation
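Consider the following matrix $A$, which may be viewed as a linear operator on $\mathbf{R}^2$, and basis $S$ of $\mathbf{R}^2$:
$$A = \begin{bmatrix} 3 & -2 \\ 4 & -5 \end{bmatrix} \qquad \text{and} \qquad S = \{u_1, u_2\} = \left\{ \begin{bmatrix} 1 \\ 2 \end{bmatrix}, \begin{bmatrix} 2 \\ 5 \end{bmatrix} \right\}$$
(We write vectors as columns, because our map is a matrix.) We find the matrix representation of $A$ relative to the basis $S$.

(1) First we write $A(u_1)$ as a linear combination of $u_1$ and $u_2$. We have
$$A(u_1) = \begin{bmatrix} 3 & -2 \\ 4 & -5 \end{bmatrix}\begin{bmatrix} 1 \\ 2 \end{bmatrix} = \begin{bmatrix} -1 \\ -6 \end{bmatrix} = x\begin{bmatrix} 1 \\ 2 \end{bmatrix} + y\begin{bmatrix} 2 \\ 5 \end{bmatrix} \qquad \text{and so} \qquad \begin{aligned} x + 2y &= -1 \\ 2x + 5y &= -6 \end{aligned}$$
Solving the system yields $x = 7$, $y = -4$. Thus, $A(u_1) = 7u_1 - 4u_2$.

(2) Next we write $A(u_2)$ as a linear combination of $u_1$ and $u_2$. We have
$$A(u_2) = \begin{bmatrix} 3 & -2 \\ 4 & -5 \end{bmatrix}\begin{bmatrix} 2 \\ 5 \end{bmatrix} = \begin{bmatrix} -4 \\ -17 \end{bmatrix} = x\begin{bmatrix} 1 \\ 2 \end{bmatrix} + y\begin{bmatrix} 2 \\ 5 \end{bmatrix} \qquad \text{and so} \qquad \begin{aligned} x + 2y &= -4 \\ 2x + 5y &= -17 \end{aligned}$$
Solving the system yields $x = 14$, $y = -9$. Thus, $A(u_2) = 14u_1 - 9u_2$.

Writing the coordinates of $A(u_1)$ and $A(u_2)$ as columns gives us the following matrix representation of $A$:
$$[A]_S = \begin{bmatrix} 7 & 14 \\ -4 & -9 \end{bmatrix}$$

Remark: Suppose we want to find the matrix representation of $A$ relative to the usual basis $E = \{e_1, e_2\} = \{[1, 0]^T, [0, 1]^T\}$ of $\mathbf{R}^2$. We have
$$\begin{aligned} A(e_1) &= \begin{bmatrix} 3 & -2 \\ 4 & -5 \end{bmatrix}\begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 3 \\ 4 \end{bmatrix} = 3e_1 + 4e_2 \\ A(e_2) &= \begin{bmatrix} 3 & -2 \\ 4 & -5 \end{bmatrix}\begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} -2 \\ -5 \end{bmatrix} = -2e_1 - 5e_2 \end{aligned} \qquad \text{and so} \qquad [A]_E = \begin{bmatrix} 3 & -2 \\ 4 & -5 \end{bmatrix}$$
Note that $[A]_E$ is the original matrix $A$. This result is true in general:

The matrix representation of any $n \times n$ square matrix $A$ over a field $K$ relative to the usual basis $E$ of $K^n$ is the matrix $A$ itself; that is, $[A]_E = A$.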

Algorithm for Finding Matrix Representations
Next follows an algorithm for finding matrix representations. The first step, Step 0, is optional. It may be useful in Step 1(b), which is repeated for each basis vector.

ALGORITHM 6.1: The input is a linear operator $T$ on a vector space $V$ and a basis $S = \{u_1, u_2, \ldots, u_n\}$ of $V$. The output is the matrix representation $[T]_S$.

Step 0. Find a formula for the coordinates of an arbitrary vector $v$ relative to the basis $S$.
Step 1. Repeat for each basis vector $u_k$ in $S$:
(a) Find $T(u_k)$.
(b) Write $T(u_k)$ as a linear combination of the basis vectors $u_1, u_2, \ldots, u_n$.
Step 2. Form the matrix $[T]_S$ whose columns are the coordinate vectors in Step 1(b).

EXAMPLE 6.3 Let $F: \mathbf{R}^2 \to \mathbf{R}^2$ be defined by $F(x, y) = (2x + 3y, 4x - 5y)$. Find the matrix representation $[F]_S$ of $F$ relative to the basis $S = \{u_1, u_2\} = \{(1, -2), (2, -5)\}$.

(Step 0) First find the coordinates of $(a, b) \in \mathbf{R}^2$ relative to the basis $S$. We have
$$\begin{bmatrix} a \\ b \end{bmatrix} = x\begin{bmatrix} 1 \\ -2 \end{bmatrix} + y\begin{bmatrix} 2 \\ -5 \end{bmatrix} \qquad \text{or} \qquad \begin{aligned} x + 2y &= a \\ -2x - 5y &= b \end{aligned} \qquad \text{or} \qquad \begin{aligned} x + 2y &= a \\ -y &= 2a + b \end{aligned}$$
Solving for $x$ and $y$ in terms of $a$ and $b$ yields $x = 5a + 2b$, $y = -2a - b$. Thus,
$$(a, b) = (5a + 2b)u_1 + (-2a - b)u_2$$

(Step 1) Now we find $F(u_1)$ and write it as a linear combination of $u_1$ and $u_2$ using the above formula for $(a, b)$, and then we repeat the process for $F(u_2)$. We have
$$\begin{aligned} F(u_1) &= F(1, -2) = (-4, 14) = 8u_1 - 6u_2 \\ F(u_2) &= F(2, -5) = (-11, 33) = 11u_1 - 11u_2 \end{aligned}$$

(Step 2) Finally, we write the coordinates of $F(u_1)$ and $F(u_2)$ as columns to obtain the required matrix:
$$[F]_S = \begin{bmatrix} 8 & 11 \\ -6 & -11 \end{bmatrix}$$
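Algorithm 6.1 can be sketched in a few lines for operators on $\mathbf{R}^2$. The version below is an illustration under stated assumptions: it is restricted to two dimensions, it performs Step 1(b) with Cramer's rule instead of a precomputed coordinate formula, and the names `coords_2d` and `matrix_representation` are hypothetical helpers introduced here.

```python
from fractions import Fraction

def coords_2d(basis, v):
    """Coordinates (x, y) of v relative to the basis {u1, u2}, by Cramer's rule."""
    (a, c), (b, d) = basis            # u1 = (a, c), u2 = (b, d)
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("the two vectors do not form a basis")
    x = (v[0] * d - b * v[1]) / det   # solves a*x + b*y = v[0]
    y = (a * v[1] - c * v[0]) / det   #        c*x + d*y = v[1]
    return (x, y)

def matrix_representation(T, basis):
    """Algorithm 6.1: the columns of [T]_S are the S-coordinates of T(u_k)."""
    cols = [coords_2d(basis, T(u)) for u in basis]            # Step 1
    return [[cols[j][i] for j in range(2)] for i in range(2)]  # Step 2

def F(v):
    """The operator F(x, y) = (2x + 3y, 4x - 5y) of Examples 6.1 and 6.3."""
    x, y = v
    return (2 * x + 3 * y, 4 * x - 5 * y)

S = [(1, -2), (2, -5)]
F_S = matrix_representation(F, S)
print(F_S)   # rows of [F]_S as Fractions
```

Calling `matrix_representation(F, [(1, 2), (2, 5)])` or passing the usual basis `[(1, 0), (0, 1)]` gives the matrices of Example 6.1 in the same way.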

Properties of Matrix Representations
This subsection gives the main properties of the matrix representations of linear operators $T$ on a vector space $V$. We emphasize that we are always given a particular basis $S$ of $V$. Our first theorem, proved in Problem 6.9, tells us that the "action" of a linear operator $T$ on a vector $v$ is preserved by its matrix representation.

THEOREM 6.1: Let $T: V \to V$ be a linear operator, and let $S$ be a (finite) basis of $V$. Then, for any vector $v$ in $V$, $[T]_S[v]_S = [T(v)]_S$.

EXAMPLE 6.4 Consider the linear operator $F$ on $\mathbf{R}^2$ and the basis $S$ of Example 6.3; that is,
$$F(x, y) = (2x + 3y, 4x - 5y) \qquad \text{and} \qquad S = \{u_1, u_2\} = \{(1, -2), (2, -5)\}$$
Let
$$v = (5, -7), \qquad \text{and so} \qquad F(v) = (-11, 55)$$
Using the formula from Example 6.3, we get
$$[v] = [11, -3]^T \qquad \text{and} \qquad [F(v)] = [55, -33]^T$$
We verify Theorem 6.1 for this vector $v$ (where $[F]$ is obtained from Example 6.3):
$$[F][v] = \begin{bmatrix} 8 & 11 \\ -6 & -11 \end{bmatrix}\begin{bmatrix} 11 \\ -3 \end{bmatrix} = \begin{bmatrix} 55 \\ -33 \end{bmatrix} = [F(v)]$$
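The verification in Example 6.4 can be repeated for other vectors with a short script. This is a sketch that hard-codes the coordinate formula and the matrix $[F]_S$ from Example 6.3; `coords` and `mat_vec` are illustrative helper names.

```python
def F(v):
    """F(x, y) = (2x + 3y, 4x - 5y)."""
    x, y = v
    return (2 * x + 3 * y, 4 * x - 5 * y)

def coords(v):
    """S-coordinates of (a, b), using the formula from Example 6.3."""
    a, b = v
    return (5 * a + 2 * b, -2 * a - b)

def mat_vec(M, c):
    """Product of a matrix (list of rows) with a coordinate vector."""
    return tuple(sum(M[i][j] * c[j] for j in range(len(c))) for i in range(len(M)))

F_S = [[8, 11], [-6, -11]]    # [F]_S from Example 6.3

v = (5, -7)
lhs = mat_vec(F_S, coords(v))  # [F]_S [v]_S
rhs = coords(F(v))             # [F(v)]_S
print(lhs, rhs)                # Theorem 6.1 says these agree
```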

Given a basis $S$ of a vector space $V$, we have associated a matrix $[T]$ to each linear operator $T$ in the algebra $A(V)$ of linear operators on $V$. Theorem 6.1 tells us that the "action" of an individual linear operator $T$ is preserved by this representation. The next two theorems (proved in Problems 6.10 and 6.11) tell us that the three basic operations in $A(V)$ with these operators, namely (i) addition, (ii) scalar multiplication, and (iii) composition, are also preserved.

THEOREM 6.2: Let $V$ be an $n$-dimensional vector space over $K$, let $S$ be a basis of $V$, and let $M$ be the algebra of $n \times n$ matrices over $K$. Then the mapping
$$m: A(V) \to M \qquad \text{defined by} \qquad m(T) = [T]_S$$
is a vector space isomorphism. That is, for any $F, G \in A(V)$ and any $k \in K$,
(i) $m(F + G) = m(F) + m(G)$ or $[F + G] = [F] + [G]$
(ii) $m(kF) = km(F)$ or $[kF] = k[F]$
(iii) $m$ is bijective (one-to-one and onto).

THEOREM 6.3: For any linear operators $F, G \in A(V)$,
$$m(G \circ F) = m(G)m(F) \qquad \text{or} \qquad [G \circ F] = [G][F]$$
(Here $G \circ F$ denotes the composition of the maps $G$ and $F$.)
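Theorem 6.3 can be illustrated relative to the usual basis of $\mathbf{R}^2$. The sketch below assumes the two operators $F(x, y) = (y, x)$ and $G(x, y) = (0, x)$, borrowed from Problem 5.39 purely as sample data; `standard_matrix` and `mat_mul` are hypothetical helper names.

```python
def F(v):
    x, y = v
    return (y, x)

def G(v):
    x, y = v
    return (0, x)

def standard_matrix(T):
    """Matrix of T in the usual basis: its columns are T(e1) and T(e2)."""
    cols = [T((1, 0)), T((0, 1))]
    return [[cols[j][i] for j in range(2)] for i in range(2)]

def mat_mul(A, B):
    """Ordinary matrix product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

GF = standard_matrix(lambda v: G(F(v)))                    # [G o F]
product = mat_mul(standard_matrix(G), standard_matrix(F))  # [G][F]
print(GF, product)   # Theorem 6.3 says these agree
```

Note the order: the matrix of "first $F$, then $G$" is $[G][F]$, not $[F][G]$.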

6.3 Change of Basis

Let $V$ be an $n$-dimensional vector space over a field $K$. We have shown that once we have selected a basis $S$ of $V$, every vector $v \in V$ can be represented by means of an $n$-tuple $[v]_S$ in $K^n$, and every linear operator $T$ in $A(V)$ can be represented by an $n \times n$ matrix over $K$. We ask the following natural question:

How do our representations change if we select another basis?

In order to answer this question, we first need a definition.

DEFINITION: Let $S = \{u_1, u_2, \ldots, u_n\}$ be a basis of a vector space $V$, and let $S' = \{v_1, v_2, \ldots, v_n\}$ be another basis. (For reference, we will call $S$ the "old" basis and $S'$ the "new" basis.) Because $S$ is a basis, each vector in the "new" basis $S'$ can be written uniquely as a linear combination of the vectors in $S$; say,
$$\begin{aligned} v_1 &= a_{11}u_1 + a_{12}u_2 + \cdots + a_{1n}u_n \\ v_2 &= a_{21}u_1 + a_{22}u_2 + \cdots + a_{2n}u_n \\ &\;\;\vdots \\ v_n &= a_{n1}u_1 + a_{n2}u_2 + \cdots + a_{nn}u_n \end{aligned}$$
Let $P$ be the transpose of the above matrix of coefficients; that is, let $P = [p_{ij}]$, where $p_{ij} = a_{ji}$. Then $P$ is called the change-of-basis matrix (or transition matrix) from the "old" basis $S$ to the "new" basis $S'$.

The following remarks are in order.

Remark 1: The above change-of-basis matrix $P$ may also be viewed as the matrix whose columns are, respectively, the coordinate column vectors of the "new" basis vectors $v_i$ relative to the "old" basis $S$; namely,
$$P = \big[\, [v_1]_S, \; [v_2]_S, \; \ldots, \; [v_n]_S \,\big]$$

Remark 2: Analogously, there is a change-of-basis matrix $Q$ from the "new" basis $S'$ to the "old" basis $S$. Similarly, $Q$ may be viewed as the matrix whose columns are, respectively, the coordinate column vectors of the "old" basis vectors $u_i$ relative to the "new" basis $S'$; namely,
$$Q = \big[\, [u_1]_{S'}, \; [u_2]_{S'}, \; \ldots, \; [u_n]_{S'} \,\big]$$

Remark 3: Because the vectors $v_1, v_2, \ldots, v_n$ in the new basis $S'$ are linearly independent, the matrix $P$ is invertible (Problem 6.18). Similarly, $Q$ is invertible. In fact, we have the following proposition (proved in Problem 6.18).

PROPOSITION 6.4: Let $P$ and $Q$ be the above change-of-basis matrices. Then $Q = P^{-1}$.

Now suppose $S = \{u_1, u_2, \ldots, u_n\}$ is a basis of a vector space $V$, and suppose $P = [p_{ij}]$ is any nonsingular matrix. Then the $n$ vectors
$$v_i = p_{1i}u_1 + p_{2i}u_2 + \cdots + p_{ni}u_n, \qquad i = 1, 2, \ldots, n$$
corresponding to the columns of $P$, are linearly independent [Problem 6.21(a)]. Thus, they form another basis $S'$ of $V$. Moreover, $P$ will be the change-of-basis matrix from $S$ to the new basis $S'$.

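EXAMPLE 6.5 Consider the following two bases of $\mathbf{R}^2$:
$$S = \{u_1, u_2\} = \{(1, 2), (3, 5)\} \qquad \text{and} \qquad S' = \{v_1, v_2\} = \{(1, -1), (1, -2)\}$$

(a) Find the change-of-basis matrix $P$ from $S$ to the "new" basis $S'$.

Write each of the new basis vectors of $S'$ as a linear combination of the original basis vectors $u_1$ and $u_2$ of $S$. We have
$$\begin{bmatrix} 1 \\ -1 \end{bmatrix} = x\begin{bmatrix} 1 \\ 2 \end{bmatrix} + y\begin{bmatrix} 3 \\ 5 \end{bmatrix} \quad \text{or} \quad \begin{aligned} x + 3y &= 1 \\ 2x + 5y &= -1 \end{aligned} \quad \text{yielding} \quad x = -8, \; y = 3$$
$$\begin{bmatrix} 1 \\ -2 \end{bmatrix} = x\begin{bmatrix} 1 \\ 2 \end{bmatrix} + y\begin{bmatrix} 3 \\ 5 \end{bmatrix} \quad \text{or} \quad \begin{aligned} x + 3y &= 1 \\ 2x + 5y &= -2 \end{aligned} \quad \text{yielding} \quad x = -11, \; y = 4$$
Thus,
$$\begin{aligned} v_1 &= -8u_1 + 3u_2 \\ v_2 &= -11u_1 + 4u_2 \end{aligned} \qquad \text{and hence,} \qquad P = \begin{bmatrix} -8 & -11 \\ 3 & 4 \end{bmatrix}$$
Note that the coordinates of $v_1$ and $v_2$ are the columns, not rows, of the change-of-basis matrix $P$.

(b) Find the change-of-basis matrix $Q$ from the "new" basis $S'$ back to the "old" basis $S$.

Here we write each of the "old" basis vectors $u_1$ and $u_2$ of $S$ as a linear combination of the "new" basis vectors $v_1$ and $v_2$ of $S'$. This yields
$$\begin{aligned} u_1 &= 4v_1 - 3v_2 \\ u_2 &= 11v_1 - 8v_2 \end{aligned} \qquad \text{and hence,} \qquad Q = \begin{bmatrix} 4 & 11 \\ -3 & -8 \end{bmatrix}$$
As expected from Proposition 6.4, $Q = P^{-1}$. (In fact, we could have obtained $Q$ by simply finding $P^{-1}$.)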
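The two change-of-basis matrices of Example 6.5 can be computed mechanically by finding coordinates column by column. The following sketch is limited to $\mathbf{R}^2$ and uses Cramer's rule; `coords_2d`, `change_of_basis`, and `mat_mul` are hypothetical helper names introduced for the illustration.

```python
from fractions import Fraction

def coords_2d(basis, v):
    """Coordinates of v relative to the basis {u1, u2}, by Cramer's rule."""
    (a, c), (b, d) = basis
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("the two vectors do not form a basis")
    return ((v[0] * d - b * v[1]) / det, (a * v[1] - c * v[0]) / det)

def change_of_basis(old, new):
    """Matrix whose columns are the old-basis coordinates of the new basis vectors."""
    cols = [coords_2d(old, v) for v in new]
    return [[cols[j][i] for j in range(2)] for i in range(2)]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

S = [(1, 2), (3, 5)]          # "old" basis
S_new = [(1, -1), (1, -2)]    # "new" basis

P = change_of_basis(S, S_new)   # from S to S'
Q = change_of_basis(S_new, S)   # from S' back to S
print(P, Q, mat_mul(Q, P))      # Proposition 6.4: QP should be the identity
```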
EXAMPLE 6.6 Consider the following two bases of $\mathbf{R}^3$:
$$E = \{e_1, e_2, e_3\} = \{(1, 0, 0), (0, 1, 0), (0, 0, 1)\} \qquad \text{and} \qquad S = \{u_1, u_2, u_3\} = \{(1, 0, 1), (2, 1, 2), (1, 2, 2)\}$$

(a) Find the change-of-basis matrix $P$ from the basis $E$ to the basis $S$.

Because $E$ is the usual basis, we can immediately write each basis element of $S$ as a linear combination of the basis elements of $E$. Specifically,
$$\begin{aligned} u_1 &= (1, 0, 1) = e_1 + e_3 \\ u_2 &= (2, 1, 2) = 2e_1 + e_2 + 2e_3 \\ u_3 &= (1, 2, 2) = e_1 + 2e_2 + 2e_3 \end{aligned} \qquad \text{and hence,} \qquad P = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 1 & 2 \\ 1 & 2 & 2 \end{bmatrix}$$
Again, the coordinates of $u_1, u_2, u_3$ appear as the columns in $P$. Observe that $P$ is simply the matrix whose columns are the basis vectors of $S$. This is true only because the original basis was the usual basis $E$.

(b) Find the change-of-basis matrix $Q$ from the basis $S$ to the basis $E$.

The definition of the change-of-basis matrix $Q$ tells us to write each of the (usual) basis vectors in $E$ as a linear combination of the basis elements of $S$. This yields
$$\begin{aligned} e_1 &= (1, 0, 0) = -2u_1 + 2u_2 - u_3 \\ e_2 &= (0, 1, 0) = -2u_1 + u_2 \\ e_3 &= (0, 0, 1) = 3u_1 - 2u_2 + u_3 \end{aligned} \qquad \text{and hence,} \qquad Q = \begin{bmatrix} -2 & -2 & 3 \\ 2 & 1 & -2 \\ -1 & 0 & 1 \end{bmatrix}$$
We emphasize that to find $Q$, we need to solve three $3 \times 3$ systems of linear equations, one $3 \times 3$ system for each of $e_1, e_2, e_3$.

Alternatively, we can find $Q = P^{-1}$ by forming the matrix $M = [P, I]$ and row reducing $M$ to row canonical form:
$$M = \left[\begin{array}{ccc|ccc} 1 & 2 & 1 & 1 & 0 & 0 \\ 0 & 1 & 2 & 0 & 1 & 0 \\ 1 & 2 & 2 & 0 & 0 & 1 \end{array}\right] \sim \left[\begin{array}{ccc|ccc} 1 & 0 & 0 & -2 & -2 & 3 \\ 0 & 1 & 0 & 2 & 1 & -2 \\ 0 & 0 & 1 & -1 & 0 & 1 \end{array}\right] = [I, P^{-1}]$$
thus,
$$Q = P^{-1} = \begin{bmatrix} -2 & -2 & 3 \\ 2 & 1 & -2 \\ -1 & 0 & 1 \end{bmatrix}$$
(Here we have used the fact that $Q$ is the inverse of $P$.)
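The row reduction of $M = [P, I]$ to $[I, P^{-1}]$ can be sketched as a small Gauss-Jordan routine. This is a minimal illustration using exact fractions, not an optimized or numerically tuned implementation; the function name `inverse` is an assumption made here.

```python
from fractions import Fraction

def inverse(P):
    """Invert a square matrix by row reducing the augmented matrix [P, I]."""
    n = len(P)
    # Build M = [P, I] with exact rational entries.
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(P)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [x / p for x in M[col]]            # scale the pivot row to get a leading 1
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]  # clear the column
    return [row[n:] for row in M]                   # right half is the inverse

P = [[1, 2, 1], [0, 1, 2], [1, 2, 2]]   # change-of-basis matrix from E to S
Q = inverse(P)
print(Q)
```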

The result in Example 6.6(a) is true in general. We state this result formally, because it occurs often.

PROPOSITION 6.5: The change-of-basis matrix from the usual basis $E$ of $K^n$ to any basis $S$ of $K^n$ is the matrix $P$ whose columns are, respectively, the basis vectors of $S$.

Applications of Change-of-Basis Matrix
First we show how a change of basis affects the coordinates of a vector in a vector space $V$. The following theorem is proved in Problem 6.22.
THEOREM 6.6:

Let $P$ be the change-of-basis matrix from a basis $S$ to a basis $S'$ in a vector space $V$. Then, for any vector $v \in V$, we have

$$P[v]_{S'} = [v]_S \qquad\text{and hence}\qquad P^{-1}[v]_S = [v]_{S'}$$

Namely, if we multiply the coordinates of $v$ in the original basis $S$ by $P^{-1}$, we get the coordinates of $v$ in the new basis $S'$.

Remark 1: Although $P$ is called the change-of-basis matrix from the old basis $S$ to the new basis $S'$, we emphasize that $P^{-1}$ transforms the coordinates of $v$ in the original basis $S$ into the coordinates of $v$ in the new basis $S'$.

Remark 2: Because of the above theorem, many texts call $Q = P^{-1}$, not $P$, the transition matrix from the old basis $S$ to the new basis $S'$. Some texts also refer to $Q$ as the change-of-coordinates matrix.

We now give the proof of the above theorem for the special case that $\dim V = 3$. Suppose $P$ is the change-of-basis matrix from the basis $S = \{u_1, u_2, u_3\}$ to the basis $S' = \{v_1, v_2, v_3\}$; say,

$$\begin{aligned} v_1 &= a_1u_1 + a_2u_2 + a_3u_3 \\ v_2 &= b_1u_1 + b_2u_2 + b_3u_3 \\ v_3 &= c_1u_1 + c_2u_2 + c_3u_3 \end{aligned} \qquad\text{and hence}\qquad P = \begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{bmatrix}$$

Now suppose $v \in V$ and, say, $v = k_1v_1 + k_2v_2 + k_3v_3$. Then, substituting for $v_1, v_2, v_3$ from above, we obtain

$$\begin{aligned} v &= k_1(a_1u_1 + a_2u_2 + a_3u_3) + k_2(b_1u_1 + b_2u_2 + b_3u_3) + k_3(c_1u_1 + c_2u_2 + c_3u_3) \\ &= (a_1k_1 + b_1k_2 + c_1k_3)u_1 + (a_2k_1 + b_2k_2 + c_2k_3)u_2 + (a_3k_1 + b_3k_2 + c_3k_3)u_3 \end{aligned}$$

Thus,

$$[v]_{S'} = \begin{bmatrix} k_1 \\ k_2 \\ k_3 \end{bmatrix} \qquad\text{and}\qquad [v]_S = \begin{bmatrix} a_1k_1 + b_1k_2 + c_1k_3 \\ a_2k_1 + b_2k_2 + c_2k_3 \\ a_3k_1 + b_3k_2 + c_3k_3 \end{bmatrix}$$

Accordingly,

$$P[v]_{S'} = \begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{bmatrix} \begin{bmatrix} k_1 \\ k_2 \\ k_3 \end{bmatrix} = \begin{bmatrix} a_1k_1 + b_1k_2 + c_1k_3 \\ a_2k_1 + b_2k_2 + c_2k_3 \\ a_3k_1 + b_3k_2 + c_3k_3 \end{bmatrix} = [v]_S$$

Finally, multiplying the equation $[v]_S = P[v]_{S'}$ by $P^{-1}$, we get

$$P^{-1}[v]_S = P^{-1}P[v]_{S'} = I[v]_{S'} = [v]_{S'}$$

The next theorem (proved in Problem 6.26) shows how a change of basis affects the matrix representation of a linear operator.
THEOREM 6.7:

Let $P$ be the change-of-basis matrix from a basis $S$ to a basis $S'$ in a vector space $V$. Then, for any linear operator $T$ on $V$,

$$[T]_{S'} = P^{-1}[T]_S P$$

That is, if $A$ and $B$ are the matrix representations of $T$ relative, respectively, to $S$ and $S'$, then $B = P^{-1}AP$.

EXAMPLE 6.7

Consider the following two bases of $\mathbf{R}^3$:

$$E = \{e_1, e_2, e_3\} = \{(1, 0, 0),\ (0, 1, 0),\ (0, 0, 1)\} \qquad\text{and}\qquad S = \{u_1, u_2, u_3\} = \{(1, 0, 1),\ (2, 1, 2),\ (1, 2, 2)\}$$

The change-of-basis matrix $P$ from $E$ to $S$ and its inverse $P^{-1}$ were obtained in Example 6.6.

(a) Write $v = (1, 3, 5)$ as a linear combination of $u_1, u_2, u_3$, or, equivalently, find $[v]_S$.

One way to do this is to directly solve the vector equation $v = xu_1 + yu_2 + zu_3$; that is,

$$\begin{bmatrix} 1 \\ 3 \\ 5 \end{bmatrix} = x\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} + y\begin{bmatrix} 2 \\ 1 \\ 2 \end{bmatrix} + z\begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix} \qquad\text{or}\qquad \begin{aligned} x + 2y + z &= 1 \\ y + 2z &= 3 \\ x + 2y + 2z &= 5 \end{aligned}$$

The solution is $x = 7$, $y = -5$, $z = 4$, so $v = 7u_1 - 5u_2 + 4u_3$.

On the other hand, we know that $[v]_E = [1, 3, 5]^T$, because $E$ is the usual basis, and we already know $P^{-1}$. Therefore, by Theorem 6.6,

$$[v]_S = P^{-1}[v]_E = \begin{bmatrix} -2 & -2 & 3 \\ 2 & 1 & -2 \\ -1 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 3 \\ 5 \end{bmatrix} = \begin{bmatrix} 7 \\ -5 \\ 4 \end{bmatrix}$$
Thus, again, $v = 7u_1 - 5u_2 + 4u_3$.

(b) Let $A = \begin{bmatrix} 1 & 3 & -2 \\ 2 & -4 & 1 \\ 3 & -1 & 2 \end{bmatrix}$, which may be viewed as a linear operator on $\mathbf{R}^3$. Find the matrix $B$ that represents $A$ relative to the basis $S$.

The definition of the matrix representation of $A$ relative to the basis $S$ tells us to write each of $A(u_1)$, $A(u_2)$, $A(u_3)$ as a linear combination of the basis vectors $u_1, u_2, u_3$ of $S$. This yields

$$\begin{aligned} A(u_1) &= (-1, 3, 5) = 11u_1 - 9u_2 + 6u_3 \\ A(u_2) &= (1, 2, 9) = 21u_1 - 14u_2 + 8u_3 \\ A(u_3) &= (3, -4, 5) = 17u_1 - 8u_2 + 2u_3 \end{aligned} \qquad\text{and hence}\qquad B = \begin{bmatrix} 11 & 21 & 17 \\ -9 & -14 & -8 \\ 6 & 8 & 2 \end{bmatrix}$$

We emphasize that to find $B$, we need to solve three $3 \times 3$ systems of linear equations: one $3 \times 3$ system for each of $A(u_1)$, $A(u_2)$, $A(u_3)$.

On the other hand, because we know $P$ and $P^{-1}$, we can use Theorem 6.7. That is,

$$B = P^{-1}AP = \begin{bmatrix} -2 & -2 & 3 \\ 2 & 1 & -2 \\ -1 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 3 & -2 \\ 2 & -4 & 1 \\ 3 & -1 & 2 \end{bmatrix} \begin{bmatrix} 1 & 2 & 1 \\ 0 & 1 & 2 \\ 1 & 2 & 2 \end{bmatrix} = \begin{bmatrix} 11 & 21 & 17 \\ -9 & -14 & -8 \\ 6 & 8 & 2 \end{bmatrix}$$

This, as expected, gives the same result.
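Both parts of Example 6.7 reduce to matrix products, so they are easy to check mechanically. The following is a minimal sketch in plain Python; the helper names `matmul` and `matvec` are assumptions introduced for illustration. It also checks $PB = AP$, which says column by column that the $j$th column of $B$ holds the coordinates of $A(u_j)$ relative to $S$.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def matvec(X, v):
    """Product of a matrix with a column vector stored as a flat list."""
    return [sum(a * b for a, b in zip(row, v)) for row in X]

P = [[1, 2, 1], [0, 1, 2], [1, 2, 2]]           # columns are u1, u2, u3
P_inv = [[-2, -2, 3], [2, 1, -2], [-1, 0, 1]]   # inverse found in Example 6.6
A = [[1, 3, -2], [2, -4, 1], [3, -1, 2]]

# Theorem 6.6: [v]_S = P^-1 [v]_E for v = (1, 3, 5).
v_S = matvec(P_inv, [1, 3, 5])

# Theorem 6.7: B = P^-1 A P.
B = matmul(matmul(P_inv, A), P)

# Independent check: P B = A P, i.e. column j of B reproduces A(u_j).
assert matmul(P, B) == matmul(A, P)

print(v_S)  # prints [7, -5, 4]
for row in B:
    print(row)
```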

6.4 Similarity

Suppose $A$ and $B$ are square matrices for which there exists an invertible matrix $P$ such that $B = P^{-1}AP$; then $B$ is said to be similar to $A$, or $B$ is said to be obtained from $A$ by a similarity transformation. We show (Problem 6.29) that similarity of matrices is an equivalence relation. By Theorem 6.7 and the above remark, we have the following basic result.
THEOREM 6.8:

Two matrices represent the same linear operator if and only if the matrices are similar.

That is, all the matrix representations of a linear operator $T$ form an equivalence class of similar matrices. A linear operator $T$ is said to be diagonalizable if there exists a basis $S$ of $V$ such that $T$ is represented by a diagonal matrix; the basis $S$ is then said to diagonalize $T$. The preceding theorem gives us the following result.
THEOREM 6.9:

Let $A$ be the matrix representation of a linear operator $T$. Then $T$ is diagonalizable if and only if there exists an invertible matrix $P$ such that $P^{-1}AP$ is a diagonal matrix.

That is, $T$ is diagonalizable if and only if its matrix representation can be diagonalized by a similarity transformation. We emphasize that not every operator is diagonalizable. However, we will show (Chapter 10) that every linear operator can be represented by certain "standard" matrices called its normal or canonical forms. Such a discussion will require some theory of fields, polynomials, and determinants.

Functions and Similar Matrices
Suppose $f$ is a function on square matrices that assigns the same value to similar matrices; that is, $f(A) = f(B)$ whenever $A$ is similar to $B$. Then $f$ induces a function, also denoted by $f$, on linear operators $T$ in the following natural way. We define

$$f(T) = f([T]_S)$$

where $S$ is any basis. By Theorem 6.8, the function is well defined. The determinant (Chapter 8) is perhaps the most important example of such a function. The trace (Section 2.7) is another important example of such a function.

EXAMPLE 6.8

Consider the following linear operator $F$ and bases $E$ and $S$ of $\mathbf{R}^2$:

$$F(x, y) = (2x + 3y,\ 4x - 5y); \qquad E = \{(1, 0),\ (0, 1)\}, \qquad S = \{(1, 2),\ (2, 5)\}$$

By Example 6.1, the matrix representations of $F$ relative to the bases $E$ and $S$ are, respectively,

$$A = \begin{bmatrix} 2 & 3 \\ 4 & -5 \end{bmatrix} \qquad\text{and}\qquad B = \begin{bmatrix} 52 & 129 \\ -22 & -55 \end{bmatrix}$$

Using matrix $A$, we have

(i) Determinant of $F = \det(A) = -10 - 12 = -22$;
(ii) Trace of $F = \operatorname{tr}(A) = 2 - 5 = -3$.

On the other hand, using matrix $B$, we have

(i) Determinant of $F = \det(B) = -2860 + 2838 = -22$;
(ii) Trace of $F = \operatorname{tr}(B) = 52 - 55 = -3$.

As expected, both matrices yield the same result.
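The invariance of the determinant and the trace under similarity can be checked directly for this example. The sketch below assumes the change-of-basis matrix $P$ whose columns are the vectors of $S$, together with its inverse (computed by hand here, since $\det P = 1$); the helper names are illustrative and not taken from the text.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def det2(M):
    """Determinant of a 2 x 2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def trace(M):
    """Sum of the diagonal entries of a square matrix."""
    return sum(M[i][i] for i in range(len(M)))

A = [[2, 3], [4, -5]]        # F relative to the usual basis E
P = [[1, 2], [2, 5]]         # columns are the vectors (1, 2) and (2, 5) of S
P_inv = [[5, -2], [-2, 1]]   # inverse of P (det P = 1)

# Theorem 6.7: the representation of F relative to S.
B = matmul(matmul(P_inv, A), P)

print(B)                     # prints [[52, 129], [-22, -55]]
print(det2(A), det2(B))      # prints -22 -22
print(trace(A), trace(B))    # prints -3 -3
```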

6.5 Matrices and General Linear Mappings

Last, we consider the general case of linear mappings from one vector space into another. Suppose $V$ and $U$ are vector spaces over the same field $K$ and, say, $\dim V = m$ and $\dim U = n$. Furthermore, suppose

$$S = \{v_1, v_2, \ldots, v_m\} \qquad\text{and}\qquad S' = \{u_1, u_2, \ldots, u_n\}$$

are arbitrary but fixed bases, respectively, of $V$ and $U$. Suppose $F: V \to U$ is a linear mapping. Then the vectors $F(v_1), F(v_2), \ldots, F(v_m)$ belong to $U$, and so each is a linear combination of the basis vectors in $S'$; say,

$$\begin{aligned} F(v_1) &= a_{11}u_1 + a_{12}u_2 + \cdots + a_{1n}u_n \\ F(v_2) &= a_{21}u_1 + a_{22}u_2 + \cdots + a_{2n}u_n \\ &\;\;\vdots \\ F(v_m) &= a_{m1}u_1 + a_{m2}u_2 + \cdots + a_{mn}u_n \end{aligned}$$

DEFINITION:

The transpose of the above matrix of coefficients, denoted by $m_{S,S'}(F)$ or $[F]_{S,S'}$, is called the matrix representation of $F$ relative to the bases $S$ and $S'$. [We will use the simple notation $m(F)$ and $[F]$ when the bases are understood.]

The following theorem is analogous to Theorem 6.1 for linear operators (Problem 6.67).
THEOREM 6.10:

For any vector $v \in V$, $[F]_{S,S'}[v]_S = [F(v)]_{S'}$.

That is, multiplying the coordinates of $v$ in the basis $S$ of $V$ by $[F]$, we obtain the coordinates of $F(v)$ in the basis $S'$ of $U$. Recall that for any vector spaces $V$ and $U$, the collection of all linear mappings from $V$ into $U$ is a vector space and is denoted by $\operatorname{Hom}(V, U)$. The following theorem is analogous to Theorem 6.2 for linear operators, where now we let $M = M_{m,n}$ denote the vector space of all $m \times n$ matrices (Problem 6.67).
THEOREM 6.11:

The mapping $m: \operatorname{Hom}(V, U) \to M$ defined by $m(F) = [F]$ is a vector space isomorphism. That is, for any $F, G \in \operatorname{Hom}(V, U)$ and any scalar $k$,
(i) $m(F + G) = m(F) + m(G)$ or $[F + G] = [F] + [G]$
(ii) $m(kF) = k\,m(F)$ or $[kF] = k[F]$
(iii) $m$ is bijective (one-to-one and onto).

Our next theorem is analogous to Theorem 6.3 for linear operators (Problem 6.67).
THEOREM 6.12:

Let $S, S', S''$ be bases of vector spaces $V, U, W$, respectively. Let $F: V \to U$ and $G: U \to W$ be linear mappings. Then

$$[G \circ F]_{S,S''} = [G]_{S',S''}[F]_{S,S'}$$

That is, relative to the appropriate bases, the matrix representation of the composition of two mappings is the matrix product of the matrix representations of the individual mappings. Next we show how the matrix representation of a linear mapping $F: V \to U$ is affected when new bases are selected (Problem 6.67).
THEOREM 6.13:

Let $P$ be the change-of-basis matrix from a basis $e$ to a basis $e'$ in $V$, and let $Q$ be the change-of-basis matrix from a basis $f$ to a basis $f'$ in $U$. Then, for any linear map $F: V \to U$,

$$[F]_{e',f'} = Q^{-1}[F]_{e,f}P$$

In other words, if $A$ is the matrix representation of a linear mapping $F$ relative to the bases $e$ and $f$, and $B$ is the matrix representation of $F$ relative to the bases $e'$ and $f'$, then $B = Q^{-1}AP$.

Our last theorem, proved in Problem 6.36, shows that any linear mapping from one vector space $V$ into another vector space $U$ can be represented by a very simple matrix. We note that this theorem is analogous to Theorem 3.18 for $m \times n$ matrices.
THEOREM 6.14:

Let $F: V \to U$ be linear and, say, $\operatorname{rank}(F) = r$. Then there exist bases of $V$ and $U$ such that the matrix representation of $F$ has the form

$$A = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}$$

where $I_r$ is the $r$-square identity matrix. The above matrix $A$ is called the normal or canonical form of the linear map $F$.
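To see Theorems 6.10 and 6.13 working together, here is a small sketch in Python. Everything in it is an assumption chosen for illustration rather than an example from the text: the map $F(x, y, z) = (x + y,\ y + z)$ from $\mathbf{R}^3$ to $\mathbf{R}^2$, the new bases $e'$ and $f'$, the sample vector, and the helper names. The old bases $e$ and $f$ are taken to be the usual bases, so $P$ and $Q$ simply have the new basis vectors as columns.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def matvec(X, v):
    """Product of a matrix with a column vector stored as a flat list."""
    return [sum(a * b for a, b in zip(row, v)) for row in X]

# Hypothetical map F(x, y, z) = (x + y, y + z), relative to the usual bases.
A = [[1, 1, 0],
     [0, 1, 1]]

# Hypothetical new basis e' of R^3: (1,1,1), (1,1,0), (1,0,0), written as columns.
P = [[1, 1, 1], [1, 1, 0], [1, 0, 0]]
P_inv = [[0, 0, 1], [0, 1, -1], [1, -1, 0]]

# Hypothetical new basis f' of R^2: (1,3), (1,4), written as columns.
Q = [[1, 1], [3, 4]]
Q_inv = [[4, -1], [-3, 1]]

# Theorem 6.13: B = Q^-1 A P represents F relative to e' and f'.
B = matmul(matmul(Q_inv, A), P)

# Theorem 6.10: B [v]_{e'} should equal [F(v)]_{f'} for any v.
v = [2, -1, 3]
lhs = matvec(B, matvec(P_inv, v))      # B times the e'-coordinates of v
rhs = matvec(Q_inv, matvec(A, v))      # f'-coordinates of F(v)
print(B)          # prints [[6, 7, 4], [-4, -5, -3]]
print(lhs, rhs)   # prints [2, -1] [2, -1]
```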

SOLVED PROBLEMS

Matrix Representation of Linear Operators

6.1. Consider the linear mapping $F: \mathbf{R}^2 \to \mathbf{R}^2$ defined by $F(x, y) = (3x + 4y,\ 2x - 5y)$ and the following bases of $\mathbf{R}^2$:

$$E = \{e_1, e_2\} = \{(1, 0),\ (0, 1)\} \qquad\text{and}\qquad S = \{u_1, u_2\} = \{(1, 2),\ (2, 3)\}$$

(a) Find the matrix $A$ representing $F$ relative to the basis $E$.
(b) Find the matrix $B$ representing $F$ relative to the basis $S$.

(a) Because $E$ is the usual basis, the rows of $A$ are simply the coefficients in the components of $F(x, y)$; that is, using $(a, b) = ae_1 + be_2$, we have

$$\begin{aligned} F(e_1) &= F(1, 0) = (3, 2) = 3e_1 + 2e_2 \\ F(e_2) &= F(0, 1) = (4, -5) = 4e_1 - 5e_2 \end{aligned} \qquad\text{and so}\qquad A = \begin{bmatrix} 3 & 4 \\ 2 & -5 \end{bmatrix}$$

Note that the coefficients of the basis vectors are written as columns in the matrix representation.

(b) First find $F(u_1)$ and write it as a linear combination of the basis vectors $u_1$ and $u_2$. We have

$$F(u_1) = F(1, 2) = (11, -8) = x(1, 2) + y(2, 3), \qquad\text{and so}\qquad \begin{aligned} x + 2y &= 11 \\ 2x + 3y &= -8 \end{aligned}$$

Solve the system to obtain $x = -49$, $y = 30$. Therefore,

$$F(u_1) = -49u_1 + 30u_2$$

Next find $F(u_2)$ and write it as a linear combination of the basis vectors $u_1$ and $u_2$. We have

$$F(u_2) = F(2, 3) = (18, -11) = x(1, 2) + y(2, 3), \qquad\text{and so}\qquad \begin{aligned} x + 2y &= 18 \\ 2x + 3y &= -11 \end{aligned}$$

Solve for $x$ and $y$ to obtain $x = -76$, $y = 47$. Hence,

$$F(u_2) = -76u_1 + 47u_2$$

Write the coefficients of $u_1$ and $u_2$ as columns to obtain $B = \begin{bmatrix} -49 & -76 \\ 30 & 47 \end{bmatrix}$.

(b′) Alternatively, one can first find the coordinates of an arbitrary vector $(a, b)$ in $\mathbf{R}^2$ relative to the basis $S$. We have

$$(a, b) = x(1, 2) + y(2, 3) = (x + 2y,\ 2x + 3y), \qquad\text{and so}\qquad \begin{aligned} x + 2y &= a \\ 2x + 3y &= b \end{aligned}$$

Solve for $x$ and $y$ in terms of $a$ and $b$ to get $x = -3a + 2b$, $y = 2a - b$. Thus,

$$(a, b) = (-3a + 2b)u_1 + (2a - b)u_2$$

Then use the formula for $(a, b)$ to find the coordinates of $F(u_1)$ and $F(u_2)$ relative to $S$:

$$\begin{aligned} F(u_1) &= F(1, 2) = (11, -8) = -49u_1 + 30u_2 \\ F(u_2) &= F(2, 3) = (18, -11) = -76u_1 + 47u_2 \end{aligned} \qquad\text{and so}\qquad B = \begin{bmatrix} -49 & -76 \\ 30 & 47 \end{bmatrix}$$

6.2. Consider the following linear operator $G$ on $\mathbf{R}^2$ and basis $S$:

$$G(x, y) = (2x - 7y,\ 4x + 3y) \qquad\text{and}\qquad S = \{u_1, u_2\} = \{(1, 3),\ (2, 5)\}$$

(a) Find the matrix representation $[G]_S$ of $G$ relative to $S$.
(b) Verify $[G]_S[v]_S = [G(v)]_S$ for the vector $v = (4, -3)$ in $\mathbf{R}^2$.

First find the coordinates of an arbitrary vector $v = (a, b)$ in $\mathbf{R}^2$ relative to the basis $S$. We have

$$\begin{bmatrix} a \\ b \end{bmatrix} = x\begin{bmatrix} 1 \\ 3 \end{bmatrix} + y\begin{bmatrix} 2 \\ 5 \end{bmatrix}, \qquad\text{and so}\qquad \begin{aligned} x + 2y &= a \\ 3x + 5y &= b \end{aligned}$$

Solve for $x$ and $y$ in terms of $a$ and $b$ to get $x = -5a + 2b$, $y = 3a - b$. Thus,

$$(a, b) = (-5a + 2b)u_1 + (3a - b)u_2, \qquad\text{and so}\qquad [v] = [-5a + 2b,\ 3a - b]^T$$

(a) Using the formula for $(a, b)$ and $G(x, y) = (2x - 7y,\ 4x + 3y)$, we have

$$\begin{aligned} G(u_1) &= G(1, 3) = (-19, 13) = 121u_1 - 70u_2 \\ G(u_2) &= G(2, 5) = (-31, 23) = 201u_1 - 116u_2 \end{aligned} \qquad\text{and so}\qquad [G]_S = \begin{bmatrix} 121 & 201 \\ -70 & -116 \end{bmatrix}$$

(We emphasize that the coefficients of $u_1$ and $u_2$ are written as columns, not rows, in the matrix representation.)

(b) Use the formula $(a, b) = (-5a + 2b)u_1 + (3a - b)u_2$ to get

$$\begin{aligned} v &= (4, -3) = -26u_1 + 15u_2 \\ G(v) &= G(4, -3) = (29, 7) = -131u_1 + 80u_2 \end{aligned}$$

Then $[v]_S = [-26, 15]^T$ and $[G(v)]_S = [-131, 80]^T$. Accordingly,

$$[G]_S[v]_S = \begin{bmatrix} 121 & 201 \\ -70 & -116 \end{bmatrix} \begin{bmatrix} -26 \\ 15 \end{bmatrix} = \begin{bmatrix} -131 \\ 80 \end{bmatrix} = [G(v)]_S$$

(This is expected from Theorem 6.1.)
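The procedure used in Problems 6.1 and 6.2 (find a coordinate formula for the basis $S$, apply the operator to each basis vector, and write the resulting coordinates as columns) can be sketched as follows. The function names `coords`, `G`, and `matvec` are illustrative assumptions; the coordinate formula is the one derived in Problem 6.2.

```python
def coords(a, b):
    """Coordinates of (a, b) relative to S = {(1, 3), (2, 5)}."""
    return [-5 * a + 2 * b, 3 * a - b]

def G(x, y):
    """The operator G(x, y) = (2x - 7y, 4x + 3y)."""
    return (2 * x - 7 * y, 4 * x + 3 * y)

def matvec(X, v):
    """Product of a matrix with a column vector stored as a flat list."""
    return [sum(a * b for a, b in zip(row, v)) for row in X]

S = [(1, 3), (2, 5)]

# Each column of [G]_S holds the S-coordinates of G(u_j).
columns = [coords(*G(*u)) for u in S]
G_S = [list(row) for row in zip(*columns)]   # transpose columns into rows

# Theorem 6.1: [G]_S [v]_S = [G(v)]_S, checked for v = (4, -3).
v = (4, -3)
lhs = matvec(G_S, coords(*v))
rhs = coords(*G(*v))
print(G_S)        # prints [[121, 201], [-70, -116]]
print(lhs, rhs)   # prints [-131, 80] [-131, 80]
```

Note the transpose step: the coordinate vectors are computed one basis vector at a time, so they arrive as rows of `columns` and must be turned into the columns of the matrix.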

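6.3. Consider the following $2 \times 2$ matrix $A$ and basis $S$ of $\mathbf{R}^2$:

$$A = \begin{bmatrix} 2 & 4 \\ 5 & 6 \end{bmatrix} \qquad\text{and}\qquad S = \{u_1, u_2\} = \left\{ \begin{bmatrix} 1 \\ -2 \end{bmatrix},\ \begin{bmatrix} 3 \\ -7 \end{bmatrix} \right\}$$

The matrix $A$ defines a linear operator on $\mathbf{R}^2$. Find the matrix $B$ that represents the mapping $A$ relative to the basis $S$.

First find the coordinates of an arbitrary vector $(a, b)^T$ with respect to the basis $S$. We have

$$\begin{bmatrix} a \\ b \end{bmatrix} = x\begin{bmatrix} 1 \\ -2 \end{bmatrix} + y\begin{bmatrix} 3 \\ -7 \end{bmatrix} \qquad\text{or}\qquad \begin{aligned} x + 3y &= a \\ -2x - 7y &= b \end{aligned}$$

Solve for $x$ and $y$ in terms of $a$ and $b$ to obtain $x = 7a + 3b$, $y = -2a - b$. Thus,

$$(a, b)^T = (7a + 3b)u_1 + (-2a - b)u_2$$

Then use the formula for $(a, b)^T$ to find the coordinates of $Au_1$ and $Au_2$ relative to the basis $S$:

$$\begin{aligned} Au_1 &= \begin{bmatrix} 2 & 4 \\ 5 & 6 \end{bmatrix} \begin{bmatrix} 1 \\ -2 \end{bmatrix} = \begin{bmatrix} -6 \\ -7 \end{bmatrix} = -63u_1 + 19u_2 \\ Au_2 &= \begin{bmatrix} 2 & 4 \\ 5 & 6 \end{bmatrix} \begin{bmatrix} 3 \\ -7 \end{bmatrix} = \begin{bmatrix} -22 \\ -27 \end{bmatrix} = -235u_1 + 71u_2 \end{aligned}$$

Writing the coordinates as columns yields

$$B = \begin{bmatrix} -63 & -235 \\ 19 & 71 \end{bmatrix}$$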

6.4. Find the matrix representation of each of the following linear operators $F$ on $\mathbf{R}^3$ relative to the usual basis $E = \{e_1, e_2, e_3\}$ of $\mathbf{R}^3$; that is, find $[F] = [F]_E$:

(a) $F$ defined by $F(x, y, z) = (x + 2y - 3z,\ 4x - 5y - 6z,\ 7x + 8y + 9z)$.
(b) $F$ defined by the $3 \times 3$ matrix $A = \begin{bmatrix} 1 & 1 & 1 \\ 2 & 3 & 4 \\ 5 & 5 & 5 \end{bmatrix}$.
(c) $F$ defined by $F(e_1) = (1, 3, 5)$, $F(e_2) = (2, 4, 6)$, $F(e_3) = (7, 7, 7)$. (Theorem 5.2 states that a linear map is completely defined by its action on the vectors in a basis.)

(a) Because $E$ is the usual basis, simply write the coefficients of the components of $F(x, y, z)$ as rows:

$$[F] = \begin{bmatrix} 1 & 2 & -3 \\ 4 & -5 & -6 \\ 7 & 8 & 9 \end{bmatrix}$$

(b) Because $E$ is the usual basis, $[F] = A$, the matrix $A$ itself.

(c) Here

$$\begin{aligned} F(e_1) &= (1, 3, 5) = e_1 + 3e_2 + 5e_3 \\ F(e_2) &= (2, 4, 6) = 2e_1 + 4e_2 + 6e_3 \\ F(e_3) &= (7, 7, 7) = 7e_1 + 7e_2 + 7e_3 \end{aligned} \qquad\text{and so}\qquad [F] = \begin{bmatrix} 1 & 2 & 7 \\ 3 & 4 & 7 \\ 5 & 6 & 7 \end{bmatrix}$$

That is, the columns of $[F]$ are the images of the usual basis vectors.

6.5. Let $G$ be the linear operator on $\mathbf{R}^3$ defined by $G(x, y, z) = (2y + z,\ x - 4y,\ 3x)$.

(a) Find the matrix representation of $G$ relative to the basis

$$S = \{w_1, w_2, w_3\} = \{(1, 1, 1),\ (1, 1, 0),\ (1, 0, 0)\}$$

(b) Verify that $[G][v] = [G(v)]$ for any vector $v$ in $\mathbf{R}^3$.

First find the coordinates of an arbitrary vector $(a, b, c) \in \mathbf{R}^3$ with respect to the basis $S$. Write $(a, b, c)$ as a linear combination of $w_1, w_2, w_3$ using unknown scalars $x$, $y$, and $z$:

$$(a, b, c) = x(1, 1, 1) + y(1, 1, 0) + z(1, 0, 0) = (x + y + z,\ x + y,\ x)$$

Set corresponding components equal to each other to obtain the system of equations

$$x + y + z = a, \qquad x + y = b, \qquad x = c$$

Solve the system for $x, y, z$ in terms of $a, b, c$ to find $x = c$, $y = b - c$, $z = a - b$. Thus,

$$(a, b, c) = cw_1 + (b - c)w_2 + (a - b)w_3, \qquad\text{or equivalently,}\qquad [(a, b, c)] = [c,\ b - c,\ a - b]^T$$

(a) Because $G(x, y, z) = (2y + z,\ x - 4y,\ 3x)$,

$$\begin{aligned} G(w_1) &= G(1, 1, 1) = (3, -3, 3) = 3w_1 - 6w_2 + 6w_3 \\ G(w_2) &= G(1, 1, 0) = (2, -3, 3) = 3w_1 - 6w_2 + 5w_3 \\ G(w_3) &= G(1, 0, 0) = (0, 1, 3) = 3w_1 - 2w_2 - w_3 \end{aligned}$$

Write the coordinates of $G(w_1)$, $G(w_2)$, $G(w_3)$ as columns to get

$$[G] = \begin{bmatrix} 3 & 3 & 3 \\ -6 & -6 & -2 \\ 6 & 5 & -1 \end{bmatrix}$$

(b) Write $G(v)$ as a linear combination of $w_1, w_2, w_3$, where $v = (a, b, c)$ is an arbitrary vector in $\mathbf{R}^3$:

$$G(v) = G(a, b, c) = (2b + c,\ a - 4b,\ 3a) = 3aw_1 + (-2a - 4b)w_2 + (-a + 6b + c)w_3$$

or equivalently,

$$[G(v)] = [3a,\ -2a - 4b,\ -a + 6b + c]^T$$

Accordingly,

$$[G][v] = \begin{bmatrix} 3 & 3 & 3 \\ -6 & -6 & -2 \\ 6 & 5 & -1 \end{bmatrix} \begin{bmatrix} c \\ b - c \\ a - b \end{bmatrix} = \begin{bmatrix} 3a \\ -2a - 4b \\ -a + 6b + c \end{bmatrix} = [G(v)]$$

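6.6. Consider the following $3 \times 3$ matrix $A$ and basis $S$ of $\mathbf{R}^3$:

$$A = \begin{bmatrix} 1 & -2 & 1 \\ 3 & -1 & 0 \\ 1 & 4 & -2 \end{bmatrix} \qquad\text{and}\qquad S = \{u_1, u_2, u_3\} = \left\{ \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix},\ \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix},\ \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} \right\}$$

The matrix $A$ defines a linear operator on $\mathbf{R}^3$. Find the matrix $B$ that represents the mapping $A$ relative to the basis $S$. (Recall that $A$ represents itself relative to the usual basis of $\mathbf{R}^3$.)

First find the coordinates of an arbitrary vector $(a, b, c)^T$ in $\mathbf{R}^3$ with respect to the basis $S$. We have

$$\begin{bmatrix} a \\ b \\ c \end{bmatrix} = x\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + y\begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} + z\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} \qquad\text{or}\qquad \begin{aligned} x + z &= a \\ x + y + 2z &= b \\ x + y + 3z &= c \end{aligned}$$

Solve for $x, y, z$ in terms of $a, b, c$ to get

$$x = a + b - c, \qquad y = -a + 2b - c, \qquad z = c - b$$

thus,

$$(a, b, c)^T = (a + b - c)u_1 + (-a + 2b - c)u_2 + (c - b)u_3$$

Then use the formula for $(a, b, c)^T$ to find the coordinates of $Au_1$, $Au_2$, $Au_3$ relative to the basis $S$:

$$\begin{aligned} A(u_1) &= A(1, 1, 1)^T = (0, 2, 3)^T = -u_1 + u_2 + u_3 \\ A(u_2) &= A(0, 1, 1)^T = (-1, -1, 2)^T = -4u_1 - 3u_2 + 3u_3 \\ A(u_3) &= A(1, 2, 3)^T = (0, 1, 3)^T = -2u_1 - u_2 + 2u_3 \end{aligned} \qquad\text{so}\qquad B = \begin{bmatrix} -1 & -4 & -2 \\ 1 & -3 & -1 \\ 1 & 3 & 2 \end{bmatrix}$$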

6.7. For each of the following linear transformations (operators) $L$ on $\mathbf{R}^2$, find the matrix $A$ that represents $L$ (relative to the usual basis of $\mathbf{R}^2$):

(a) $L$ is defined by $L(1, 0) = (2, 4)$ and $L(0, 1) = (5, 8)$.
(b) $L$ is the rotation in $\mathbf{R}^2$ counterclockwise by $90^\circ$.
(c) $L$ is the reflection in $\mathbf{R}^2$ about the line $y = -x$.

(a) Because $\{(1, 0), (0, 1)\}$ is the usual basis of $\mathbf{R}^2$, write their images under $L$ as columns to get

$$A = \begin{bmatrix} 2 & 5 \\ 4 & 8 \end{bmatrix}$$

(b) Under the rotation $L$, we have $L(1, 0) = (0, 1)$ and $L(0, 1) = (-1, 0)$. Thus,

$$A = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}$$

(c) Under the reflection $L$, we have $L(1, 0) = (0, -1)$ and $L(0, 1) = (-1, 0)$. Thus,

$$A = \begin{bmatrix} 0 & -1 \\ -1 & 0 \end{bmatrix}$$

6.8. The set $S = \{e^{3t},\ te^{3t},\ t^2e^{3t}\}$ is a basis of a vector space $V$ of functions $f: \mathbf{R} \to \mathbf{R}$. Let $D$ be the differential operator on $V$; that is, $D(f) = df/dt$. Find the matrix representation of $D$ relative to the basis $S$.

Find the image of each basis function:

$$\begin{aligned} D(e^{3t}) &= 3e^{3t} &&= 3(e^{3t}) + 0(te^{3t}) + 0(t^2e^{3t}) \\ D(te^{3t}) &= e^{3t} + 3te^{3t} &&= 1(e^{3t}) + 3(te^{3t}) + 0(t^2e^{3t}) \\ D(t^2e^{3t}) &= 2te^{3t} + 3t^2e^{3t} &&= 0(e^{3t}) + 2(te^{3t}) + 3(t^2e^{3t}) \end{aligned}$$

and thus,

$$[D] = \begin{bmatrix} 3 & 1 & 0 \\ 0 & 3 & 2 \\ 0 & 0 & 3 \end{bmatrix}$$
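The same matrix can be built mechanically. In the sketch below, a function $p(t)e^{3t}$ with $p$ of degree at most 2 is stored as the coefficient list of $p$, which is exactly its coordinate vector relative to $S$; the product rule gives $D(p\,e^{3t}) = (p' + 3p)e^{3t}$. The function names and the sample function are assumptions made for illustration.

```python
def derivative(p):
    """Coordinates of D(p(t) e^{3t}) relative to S, given the coordinates p.

    p = [c0, c1, c2] stands for (c0 + c1 t + c2 t^2) e^{3t}.
    By the product rule the result corresponds to the polynomial p' + 3p.
    """
    n = len(p)
    # Coefficients of p'(t), padded with a zero for the top degree.
    dp = [(k + 1) * p[k + 1] for k in range(n - 1)] + [0]
    return [dp[k] + 3 * p[k] for k in range(n)]

def matvec(X, v):
    """Product of a matrix with a column vector stored as a flat list."""
    return [sum(a * b for a, b in zip(row, v)) for row in X]

# Coordinate vectors of the basis functions e^{3t}, t e^{3t}, t^2 e^{3t}.
basis = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

# Column j of [D] holds the coordinates of D applied to the j-th basis function.
columns = [derivative(p) for p in basis]
D = [list(row) for row in zip(*columns)]
print(D)   # prints [[3, 1, 0], [0, 3, 2], [0, 0, 3]]

# Hypothetical sample: f(t) = (2 - t + 4 t^2) e^{3t}.
f = [2, -1, 4]
print(matvec(D, f), derivative(f))   # prints [5, 5, 12] [5, 5, 12]
```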

6.9. Prove Theorem 6.1: Let $T: V \to V$ be a linear operator, and let $S$ be a (finite) basis of $V$. Then, for any vector $v$ in $V$, $[T]_S[v]_S = [T(v)]_S$.

Suppose $S = \{u_1, u_2, \ldots, u_n\}$, and suppose, for $i = 1, \ldots, n$,

$$T(u_i) = a_{i1}u_1 + a_{i2}u_2 + \cdots + a_{in}u_n = \sum_{j=1}^{n} a_{ij}u_j$$

Then $[T]_S$ is the $n$-square matrix whose $j$th row is

$$(a_{1j},\ a_{2j},\ \ldots,\ a_{nj}) \tag{1}$$

Now suppose

$$v = k_1u_1 + k_2u_2 + \cdots + k_nu_n = \sum_{i=1}^{n} k_iu_i$$

Writing a column vector as the transpose of a row vector, we have

$$[v]_S = [k_1,\ k_2,\ \ldots,\ k_n]^T \tag{2}$$

Furthermore, using the linearity of $T$,

$$\begin{aligned} T(v) &= T\left(\sum_{i=1}^{n} k_iu_i\right) = \sum_{i=1}^{n} k_iT(u_i) = \sum_{i=1}^{n} k_i\left(\sum_{j=1}^{n} a_{ij}u_j\right) \\ &= \sum_{j=1}^{n} \left(\sum_{i=1}^{n} a_{ij}k_i\right)u_j = \sum_{j=1}^{n} (a_{1j}k_1 + a_{2j}k_2 + \cdots + a_{nj}k_n)u_j \end{aligned}$$

Thus, $[T(v)]_S$ is the column vector whose $j$th entry is

$$a_{1j}k_1 + a_{2j}k_2 + \cdots + a_{nj}k_n \tag{3}$$

On the other hand, the $j$th entry of $[T]_S[v]_S$ is obtained by multiplying the $j$th row of $[T]_S$ by $[v]_S$, that is, (1) by (2). But the product of (1) and (2) is (3). Hence, $[T]_S[v]_S$ and $[T(v)]_S$ have the same entries. Thus, $[T]_S[v]_S = [T(v)]_S$.

6.10. Prove Theorem 6.2: Let $S = \{u_1, u_2, \ldots, u_n\}$ be a basis for $V$ over $K$, and let $M$ be the algebra of $n$-square matrices over $K$. Then the mapping $m: A(V) \to M$ defined by $m(T) = [T]_S$ is a vector space isomorphism. That is, for any $F, G \in A(V)$ and any $k \in K$, we have
(i) $[F + G] = [F] + [G]$,
(ii) $[kF] = k[F]$,
(iii) $m$ is one-to-one and onto.

(i) Suppose, for $i = 1, \ldots, n$,

$$F(u_i) = \sum_{j=1}^{n} a_{ij}u_j \qquad\text{and}\qquad G(u_i) = \sum_{j=1}^{n} b_{ij}u_j$$

Consider the matrices $A = [a_{ij}]$ and $B = [b_{ij}]$. Then $[F] = A^T$ and $[G] = B^T$. We have, for $i = 1, \ldots, n$,

$$(F + G)(u_i) = F(u_i) + G(u_i) = \sum_{j=1}^{n} (a_{ij} + b_{ij})u_j$$

Because $A + B$ is the matrix $(a_{ij} + b_{ij})$, we have

$$[F + G] = (A + B)^T = A^T + B^T = [F] + [G]$$

(ii) Also, for $i = 1, \ldots, n$,

$$(kF)(u_i) = kF(u_i) = k\sum_{j=1}^{n} a_{ij}u_j = \sum_{j=1}^{n} (ka_{ij})u_j$$

Because $kA$ is the matrix $(ka_{ij})$, we have

$$[kF] = (kA)^T = kA^T = k[F]$$

(iii) Finally, $m$ is one-to-one, because a linear mapping is completely determined by its values on a basis. Also, $m$ is onto, because matrix $A = [a_{ij}]$ in $M$ is the image of the linear operator

$$F(u_i) = \sum_{j=1}^{n} a_{ij}u_j, \qquad i = 1, \ldots, n$$

Thus, the theorem is proved.

6.11. Prove Theorem 6.3: For any linear operators $G, F \in A(V)$, $[G \circ F] = [G][F]$.

Using the notation in Problem 6.10, we have

$$\begin{aligned} (G \circ F)(u_i) &= G(F(u_i)) = G\left(\sum_{j=1}^{n} a_{ij}u_j\right) = \sum_{j=1}^{n} a_{ij}G(u_j) \\ &= \sum_{j=1}^{n} a_{ij}\left(\sum_{k=1}^{n} b_{jk}u_k\right) = \sum_{k=1}^{n} \left(\sum_{j=1}^{n} a_{ij}b_{jk}\right)u_k \end{aligned}$$

Recall that $AB$ is the matrix $AB = [c_{ik}]$, where $c_{ik} = \sum_{j=1}^{n} a_{ij}b_{jk}$. Accordingly,

$$[G \circ F] = (AB)^T = B^TA^T = [G][F]$$

The theorem is proved.

6.12. Let $A$ be the matrix representation of a linear operator $T$. Prove that, for any polynomial $f(t)$, we have that $f(A)$ is the matrix representation of $f(T)$. [Thus, $f(T) = 0$ if and only if $f(A) = 0$.]

Let $\phi$ be the mapping that sends an operator $T$ into its matrix representation $A$. We need to prove that $\phi(f(T)) = f(A)$. Suppose $f(t) = a_nt^n + \cdots + a_1t + a_0$. The proof is by induction on $n$, the degree of $f(t)$.

Suppose $n = 0$. Recall that $\phi(I') = I$, where $I'$ is the identity mapping and $I$ is the identity matrix. Thus,

$$\phi(f(T)) = \phi(a_0I') = a_0\phi(I') = a_0I = f(A)$$

and so the theorem holds for $n = 0$.

Now assume the theorem holds for polynomials of degree less than $n$. Then, because $\phi$ is an algebra isomorphism,

$$\begin{aligned} \phi(f(T)) &= \phi(a_nT^n + a_{n-1}T^{n-1} + \cdots + a_1T + a_0I') \\ &= a_n\phi(T)\phi(T^{n-1}) + \phi(a_{n-1}T^{n-1} + \cdots + a_1T + a_0I') \\ &= a_nAA^{n-1} + (a_{n-1}A^{n-1} + \cdots + a_1A + a_0I) = f(A) \end{aligned}$$

and the theorem is proved.

Change of Basis

The coordinate vector $[v]_S$ in this section will always denote a column vector; that is, $[v]_S = [a_1, a_2, \ldots, a_n]^T$.

6.13. Consider the following bases of $\mathbf{R}^2$:

$$E = \{e_1, e_2\} = \{(1, 0),\ (0, 1)\} \qquad\text{and}\qquad S = \{u_1, u_2\} = \{(1, 3),\ (1, 4)\}$$

(a) Find the change-of-basis matrix $P$ from the usual basis $E$ to $S$.
(b) Find the change-of-basis matrix $Q$ from $S$ back to $E$.
(c) Find the coordinate vector $[v]$ of $v = (5, -3)$ relative to $S$.

(a) Because $E$ is the usual basis, simply write the basis vectors in $S$ as columns: $P = \begin{bmatrix} 1 & 1 \\ 3 & 4 \end{bmatrix}$.

(b) Method 1. Use the definition of the change-of-basis matrix. That is, express each vector in $E$ as a linear combination of the vectors in $S$. We do this by first finding the coordinates of an arbitrary vector $v = (a, b)$ relative to $S$. We have

$$(a, b) = x(1, 3) + y(1, 4) = (x + y,\ 3x + 4y) \qquad\text{or}\qquad \begin{aligned} x + y &= a \\ 3x + 4y &= b \end{aligned}$$

Solve for $x$ and $y$ to obtain $x = 4a - b$, $y = -3a + b$. Thus,

$$v = (4a - b)u_1 + (-3a + b)u_2 \qquad\text{and}\qquad [v]_S = [(a, b)]_S = [4a - b,\ -3a + b]^T$$

Using the above formula for $[v]_S$ and writing the coordinates of the $e_i$ as columns yields

$$\begin{aligned} e_1 &= (1, 0) = 4u_1 - 3u_2 \\ e_2 &= (0, 1) = -u_1 + u_2 \end{aligned} \qquad\text{and}\qquad Q = \begin{bmatrix} 4 & -1 \\ -3 & 1 \end{bmatrix}$$

Method 2. Because $Q = P^{-1}$, find $P^{-1}$, say by using the formula for the inverse of a $2 \times 2$ matrix. Thus,

$$P^{-1} = \begin{bmatrix} 4 & -1 \\ -3 & 1 \end{bmatrix}$$

(c) Method 1. Write $v$ as a linear combination of the vectors in $S$, say by using the above formula for $v = (a, b)$. We have $v = (5, -3) = 23u_1 - 18u_2$, and so $[v]_S = [23, -18]^T$.

Method 2. Use, from Theorem 6.6, the fact that $[v]_S = P^{-1}[v]_E$ and the fact that $[v]_E = [5, -3]^T$:

$$[v]_S = P^{-1}[v]_E = \begin{bmatrix} 4 & -1 \\ -3 & 1 \end{bmatrix} \begin{bmatrix} 5 \\ -3 \end{bmatrix} = \begin{bmatrix} 23 \\ -18 \end{bmatrix}$$

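6.14. The vectors $u_1 = (1, 2, 0)$, $u_2 = (1, 3, 2)$, $u_3 = (0, 1, 3)$ form a basis $S$ of $\mathbf{R}^3$. Find

(a) The change-of-basis matrix $P$ from the usual basis $E = \{e_1, e_2, e_3\}$ to $S$.
(b) The change-of-basis matrix $Q$ from $S$ back to $E$.

(a) Because $E$ is the usual basis, simply write the basis vectors of $S$ as columns: $P = \begin{bmatrix} 1 & 1 & 0 \\ 2 & 3 & 1 \\ 0 & 2 & 3 \end{bmatrix}$.

(b) Method 1. Express each basis vector of $E$ as a linear combination of the basis vectors of $S$ by first finding the coordinates of an arbitrary vector $v = (a, b, c)$ relative to the basis $S$. We have

$$\begin{bmatrix} a \\ b \\ c \end{bmatrix} = x\begin{bmatrix} 1 \\ 2 \\ 0 \end{bmatrix} + y\begin{bmatrix} 1 \\ 3 \\ 2 \end{bmatrix} + z\begin{bmatrix} 0 \\ 1 \\ 3 \end{bmatrix} \qquad\text{or}\qquad \begin{aligned} x + y &= a \\ 2x + 3y + z &= b \\ 2y + 3z &= c \end{aligned}$$

Solve for $x, y, z$ to get $x = 7a - 3b + c$, $y = -6a + 3b - c$, $z = 4a - 2b + c$. Thus,

$$v = (a, b, c) = (7a - 3b + c)u_1 + (-6a + 3b - c)u_2 + (4a - 2b + c)u_3$$

or

$$[v]_S = [(a, b, c)]_S = [7a - 3b + c,\ -6a + 3b - c,\ 4a - 2b + c]^T$$

Using the above formula for $[v]_S$ and then writing the coordinates of the $e_i$ as columns yields

$$\begin{aligned} e_1 &= (1, 0, 0) = 7u_1 - 6u_2 + 4u_3 \\ e_2 &= (0, 1, 0) = -3u_1 + 3u_2 - 2u_3 \\ e_3 &= (0, 0, 1) = u_1 - u_2 + u_3 \end{aligned} \qquad\text{and}\qquad Q = \begin{bmatrix} 7 & -3 & 1 \\ -6 & 3 & -1 \\ 4 & -2 & 1 \end{bmatrix}$$

Method 2. Find $P^{-1}$ by row reducing $M = [P, I]$ to the form $[I, P^{-1}]$:

$$\begin{aligned} M &= \left[\begin{array}{ccc|ccc} 1 & 1 & 0 & 1 & 0 & 0 \\ 2 & 3 & 1 & 0 & 1 & 0 \\ 0 & 2 & 3 & 0 & 0 & 1 \end{array}\right] \sim \left[\begin{array}{ccc|ccc} 1 & 1 & 0 & 1 & 0 & 0 \\ 0 & 1 & 1 & -2 & 1 & 0 \\ 0 & 2 & 3 & 0 & 0 & 1 \end{array}\right] \\ &\sim \left[\begin{array}{ccc|ccc} 1 & 1 & 0 & 1 & 0 & 0 \\ 0 & 1 & 1 & -2 & 1 & 0 \\ 0 & 0 & 1 & 4 & -2 & 1 \end{array}\right] \sim \left[\begin{array}{ccc|ccc} 1 & 0 & 0 & 7 & -3 & 1 \\ 0 & 1 & 0 & -6 & 3 & -1 \\ 0 & 0 & 1 & 4 & -2 & 1 \end{array}\right] = [I, P^{-1}] \end{aligned}$$

Thus, $Q = P^{-1} = \begin{bmatrix} 7 & -3 & 1 \\ -6 & 3 & -1 \\ 4 & -2 & 1 \end{bmatrix}$.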

6.15. Suppose the $x$-axis and $y$-axis in the plane $\mathbf{R}^2$ are rotated counterclockwise $45^\circ$ so that the new $x'$-axis and $y'$-axis are along the line $y = x$ and the line $y = -x$, respectively.

(a) Find the change-of-basis matrix $P$.
(b) Find the coordinates of the point $A(5, 6)$ under the given rotation.

(a) The unit vectors in the direction of the new $x'$- and $y'$-axes are

$$u_1 = \left(\tfrac{1}{2}\sqrt{2},\ \tfrac{1}{2}\sqrt{2}\right) \qquad\text{and}\qquad u_2 = \left(-\tfrac{1}{2}\sqrt{2},\ \tfrac{1}{2}\sqrt{2}\right)$$

(The unit vectors in the direction of the original $x$ and $y$ axes are the usual basis of $\mathbf{R}^2$.) Thus, write the coordinates of $u_1$ and $u_2$ as columns to obtain

$$P = \begin{bmatrix} \tfrac{1}{2}\sqrt{2} & -\tfrac{1}{2}\sqrt{2} \\ \tfrac{1}{2}\sqrt{2} & \tfrac{1}{2}\sqrt{2} \end{bmatrix}$$

(b) Multiply the coordinates of the point by $P^{-1}$:

$$\begin{bmatrix} \tfrac{1}{2}\sqrt{2} & \tfrac{1}{2}\sqrt{2} \\ -\tfrac{1}{2}\sqrt{2} & \tfrac{1}{2}\sqrt{2} \end{bmatrix} \begin{bmatrix} 5 \\ 6 \end{bmatrix} = \begin{bmatrix} \tfrac{11}{2}\sqrt{2} \\ \tfrac{1}{2}\sqrt{2} \end{bmatrix}$$

(Because $P$ is orthogonal, $P^{-1}$ is simply the transpose of $P$.)
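The rotation in Problem 6.15 can be checked numerically. The sketch below uses floating-point arithmetic, so results agree with the exact values only up to rounding; the helper name `matvec` is an assumption introduced here. It relies on the fact stated above that $P^{-1} = P^T$ for an orthogonal matrix.

```python
import math

def matvec(X, v):
    """Product of a matrix with a column vector stored as a flat list."""
    return [sum(a * b for a, b in zip(row, v)) for row in X]

c = math.sqrt(2) / 2
P = [[c, -c],
     [c,  c]]                           # columns are u1 and u2

# P is orthogonal, so its inverse is its transpose.
P_inv = [list(col) for col in zip(*P)]

# Coordinates of the point (5, 6) relative to the rotated axes.
new_coords = matvec(P_inv, [5, 6])

# Converting back with P should recover the original point (up to rounding).
back = matvec(P, new_coords)
print(new_coords)   # approximately [11*sqrt(2)/2, sqrt(2)/2]
print(back)         # approximately [5, 6]
```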


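6.16. The vectors $u_1 = (1, 1, 0)$, $u_2 = (0, 1, 1)$, $u_3 = (1, 2, 2)$ form a basis $S$ of $\mathbf{R}^3$. Find the coordinates of an arbitrary vector $v = (a, b, c)$ relative to the basis $S$.

Method 1. Express $v$ as a linear combination of $u_1, u_2, u_3$ using unknowns $x, y, z$. We have

$$(a, b, c) = x(1, 1, 0) + y(0, 1, 1) + z(1, 2, 2) = (x + z,\ x + y + 2z,\ y + 2z)$$

This yields the system

$$\begin{aligned} x + z &= a \\ x + y + 2z &= b \\ y + 2z &= c \end{aligned} \qquad\text{or}\qquad \begin{aligned} x + z &= a \\ y + z &= -a + b \\ y + 2z &= c \end{aligned} \qquad\text{or}\qquad \begin{aligned} x + z &= a \\ y + z &= -a + b \\ z &= a - b + c \end{aligned}$$

Solving by back-substitution yields $x = b - c$, $y = -2a + 2b - c$, $z = a - b + c$. Thus,

$$[v]_S = [b - c,\ -2a + 2b - c,\ a - b + c]^T$$

Method 2. Find $P^{-1}$ by row reducing $M = [P, I]$ to the form $[I, P^{-1}]$, where $P$ is the change-of-basis matrix from the usual basis $E$ to $S$ or, in other words, the matrix whose columns are the basis vectors of $S$. We have

$$\begin{aligned} M &= \left[\begin{array}{ccc|ccc} 1 & 0 & 1 & 1 & 0 & 0 \\ 1 & 1 & 2 & 0 & 1 & 0 \\ 0 & 1 & 2 & 0 & 0 & 1 \end{array}\right] \sim \left[\begin{array}{ccc|ccc} 1 & 0 & 1 & 1 & 0 & 0 \\ 0 & 1 & 1 & -1 & 1 & 0 \\ 0 & 1 & 2 & 0 & 0 & 1 \end{array}\right] \\ &\sim \left[\begin{array}{ccc|ccc} 1 & 0 & 1 & 1 & 0 & 0 \\ 0 & 1 & 1 & -1 & 1 & 0 \\ 0 & 0 & 1 & 1 & -1 & 1 \end{array}\right] \sim \left[\begin{array}{ccc|ccc} 1 & 0 & 0 & 0 & 1 & -1 \\ 0 & 1 & 0 & -2 & 2 & -1 \\ 0 & 0 & 1 & 1 & -1 & 1 \end{array}\right] = [I, P^{-1}] \end{aligned}$$

Thus,

$$P^{-1} = \begin{bmatrix} 0 & 1 & -1 \\ -2 & 2 & -1 \\ 1 & -1 & 1 \end{bmatrix} \qquad\text{and}\qquad [v]_S = P^{-1}[v]_E = \begin{bmatrix} 0 & 1 & -1 \\ -2 & 2 & -1 \\ 1 & -1 & 1 \end{bmatrix} \begin{bmatrix} a \\ b \\ c \end{bmatrix} = \begin{bmatrix} b - c \\ -2a + 2b - c \\ a - b + c \end{bmatrix}$$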

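6.17. Consider the following bases of $\mathbf{R}^2$:

$$S = \{u_1, u_2\} = \{(1, -2),\ (3, -4)\} \qquad\text{and}\qquad S' = \{v_1, v_2\} = \{(1, 3),\ (3, 8)\}$$

(a) Find the coordinates of $v = (a, b)$ relative to the basis $S$.
(b) Find the change-of-basis matrix $P$ from $S$ to $S'$.
(c) Find the coordinates of $v = (a, b)$ relative to the basis $S'$.
(d) Find the change-of-basis matrix $Q$ from $S'$ back to $S$.
(e) Verify $Q = P^{-1}$.
(f) Show that, for any vector $v = (a, b)$ in $\mathbf{R}^2$, $P^{-1}[v]_S = [v]_{S'}$. (See Theorem 6.6.)

(a) Let $v = xu_1 + yu_2$ for unknowns $x$ and $y$; that is,

$$\begin{bmatrix} a \\ b \end{bmatrix} = x\begin{bmatrix} 1 \\ -2 \end{bmatrix} + y\begin{bmatrix} 3 \\ -4 \end{bmatrix} \qquad\text{or}\qquad \begin{aligned} x + 3y &= a \\ -2x - 4y &= b \end{aligned} \qquad\text{or}\qquad \begin{aligned} x + 3y &= a \\ 2y &= 2a + b \end{aligned}$$

Solve for $x$ and $y$ in terms of $a$ and $b$ to get $x = -2a - \tfrac{3}{2}b$ and $y = a + \tfrac{1}{2}b$. Thus,

$$(a, b) = \left(-2a - \tfrac{3}{2}b\right)u_1 + \left(a + \tfrac{1}{2}b\right)u_2 \qquad\text{or}\qquad [(a, b)]_S = \left[-2a - \tfrac{3}{2}b,\ a + \tfrac{1}{2}b\right]^T$$

(b) Use part (a) to write each of the basis vectors $v_1$ and $v_2$ of $S'$ as a linear combination of the basis vectors $u_1$ and $u_2$ of $S$; that is,

$$\begin{aligned} v_1 &= (1, 3) = \left(-2 - \tfrac{9}{2}\right)u_1 + \left(1 + \tfrac{3}{2}\right)u_2 = -\tfrac{13}{2}u_1 + \tfrac{5}{2}u_2 \\ v_2 &= (3, 8) = (-6 - 12)u_1 + (3 + 4)u_2 = -18u_1 + 7u_2 \end{aligned}$$

Then $P$ is the matrix whose columns are the coordinates of $v_1$ and $v_2$ relative to the basis $S$; that is,

$$P = \begin{bmatrix} -\tfrac{13}{2} & -18 \\ \tfrac{5}{2} & 7 \end{bmatrix}$$

(c) Let $v = xv_1 + yv_2$ for unknown scalars $x$ and $y$:

$$\begin{bmatrix} a \\ b \end{bmatrix} = x\begin{bmatrix} 1 \\ 3 \end{bmatrix} + y\begin{bmatrix} 3 \\ 8 \end{bmatrix} \qquad\text{or}\qquad \begin{aligned} x + 3y &= a \\ 3x + 8y &= b \end{aligned} \qquad\text{or}\qquad \begin{aligned} x + 3y &= a \\ -y &= b - 3a \end{aligned}$$

Solve for $x$ and $y$ to get $x = -8a + 3b$ and $y = 3a - b$. Thus,

$$(a, b) = (-8a + 3b)v_1 + (3a - b)v_2 \qquad\text{or}\qquad [(a, b)]_{S'} = [-8a + 3b,\ 3a - b]^T$$

(d) Use part (c) to express each of the basis vectors $u_1$ and $u_2$ of $S$ as a linear combination of the basis vectors $v_1$ and $v_2$ of $S'$:

$$\begin{aligned} u_1 &= (1, -2) = (-8 - 6)v_1 + (3 + 2)v_2 = -14v_1 + 5v_2 \\ u_2 &= (3, -4) = (-24 - 12)v_1 + (9 + 4)v_2 = -36v_1 + 13v_2 \end{aligned}$$

Write the coordinates of $u_1$ and $u_2$ relative to $S'$ as columns to obtain $Q = \begin{bmatrix} -14 & -36 \\ 5 & 13 \end{bmatrix}$.

(e) $QP = \begin{bmatrix} -14 & -36 \\ 5 & 13 \end{bmatrix} \begin{bmatrix} -\tfrac{13}{2} & -18 \\ \tfrac{5}{2} & 7 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I$

(f) Use parts (a), (c), and (d) to obtain

$$P^{-1}[v]_S = Q[v]_S = \begin{bmatrix} -14 & -36 \\ 5 & 13 \end{bmatrix} \begin{bmatrix} -2a - \tfrac{3}{2}b \\ a + \tfrac{1}{2}b \end{bmatrix} = \begin{bmatrix} -8a + 3b \\ 3a - b \end{bmatrix} = [v]_{S'}$$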
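Parts (e) and (f) of Problem 6.17 can be verified with exact fractions. This is a sketch only: the function names are assumptions, the two coordinate formulas are the ones derived in parts (a) and (c), and part (f) is checked on a handful of arbitrarily chosen sample vectors rather than symbolically.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def matvec(X, v):
    """Product of a matrix with a column vector stored as a flat list."""
    return [sum(a * b for a, b in zip(row, v)) for row in X]

def coords_S(a, b):
    """Coordinates of (a, b) relative to S = {(1, -2), (3, -4)}, from part (a)."""
    return [-2 * a - Fraction(3, 2) * b, a + Fraction(1, 2) * b]

def coords_S_prime(a, b):
    """Coordinates of (a, b) relative to S' = {(1, 3), (3, 8)}, from part (c)."""
    return [-8 * a + 3 * b, 3 * a - b]

P = [[Fraction(-13, 2), Fraction(-18)],
     [Fraction(5, 2), Fraction(7)]]      # change of basis from S to S'
Q = [[-14, -36],
     [5, 13]]                            # change of basis from S' back to S

# Part (e): Q P should be the identity matrix.
QP = matmul(Q, P)

# Part (f): Q [v]_S should equal [v]_{S'}; checked on sample vectors.
samples = [(1, 0), (0, 1), (5, -3), (-2, 7)]
agree = all(matvec(Q, coords_S(a, b)) == coords_S_prime(a, b) for a, b in samples)
print(QP == [[1, 0], [0, 1]], agree)   # prints True True
```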

6.18. Suppose $P$ is the change-of-basis matrix from a basis $\{u_i\}$ to a basis $\{w_i\}$, and suppose $Q$ is the change-of-basis matrix from the basis $\{w_i\}$ back to $\{u_i\}$. Prove that $P$ is invertible and that $Q = P^{-1}$.

Suppose, for $i = 1, 2, \ldots, n$, that

$$w_i = a_{i1}u_1 + a_{i2}u_2 + \cdots + a_{in}u_n = \sum_{j=1}^{n} a_{ij}u_j \tag{1}$$

and, for $j = 1, 2, \ldots, n$,

$$u_j = b_{j1}w_1 + b_{j2}w_2 + \cdots + b_{jn}w_n = \sum_{k=1}^{n} b_{jk}w_k \tag{2}$$

Let $A = [a_{ij}]$ and $B = [b_{jk}]$. Then $P = A^T$ and $Q = B^T$. Substituting (2) into (1) yields

$$w_i = \sum_{j=1}^{n} a_{ij}\left(\sum_{k=1}^{n} b_{jk}w_k\right) = \sum_{k=1}^{n} \left(\sum_{j=1}^{n} a_{ij}b_{jk}\right)w_k$$

Because $\{w_i\}$ is a basis, $\sum_{j} a_{ij}b_{jk} = \delta_{ik}$, where $\delta_{ik}$ is the Kronecker delta; that is, $\delta_{ik} = 1$ if $i = k$ but $\delta_{ik} = 0$ if $i \neq k$. Suppose $AB = [c_{ik}]$. Then $c_{ik} = \delta_{ik}$. Accordingly, $AB = I$, and so

$$QP = B^TA^T = (AB)^T = I^T = I$$

Thus, $Q = P^{-1}$.

6.19. Consider a finite sequence of vectors $S = \{u_1, u_2, \ldots, u_n\}$. Let $S'$ be the sequence of vectors obtained from $S$ by one of the following "elementary operations":

(1) Interchange two vectors.
(2) Multiply a vector by a nonzero scalar.
(3) Add a multiple of one vector to another vector.

Show that $S$ and $S'$ span the same subspace $W$. Also, show that $S'$ is linearly independent if and only if $S$ is linearly independent.

Observe that, for each operation, the vectors in $S'$ are linear combinations of vectors in $S$. Also, because each operation has an inverse of the same type, each vector in $S$ is a linear combination of vectors in $S'$. Thus, $S$ and $S'$ span the same subspace $W$. Moreover, $S'$ is linearly independent if and only if $\dim W = n$, and this is true if and only if $S$ is linearly independent.

6.20. Let $A = [a_{ij}]$ and $B = [b_{ij}]$ be row equivalent $m \times n$ matrices over a field $K$, and let $v_1, v_2, \ldots, v_n$ be any vectors in a vector space $V$ over $K$. For $i = 1, 2, \ldots, m$, let $u_i$ and $w_i$ be defined by

$$u_i = a_{i1}v_1 + a_{i2}v_2 + \cdots + a_{in}v_n \qquad\text{and}\qquad w_i = b_{i1}v_1 + b_{i2}v_2 + \cdots + b_{in}v_n$$

Show that $\{u_i\}$ and $\{w_i\}$ span the same subspace of $V$.

Applying an "elementary operation" of Problem 6.19 to $\{u_i\}$ is equivalent to applying an elementary row operation to the matrix $A$. Because $A$ and $B$ are row equivalent, $B$ can be obtained from $A$ by a sequence of elementary row operations. Hence, $\{w_i\}$ can be obtained from $\{u_i\}$ by the corresponding sequence of operations. Accordingly, $\{u_i\}$ and $\{w_i\}$ span the same space.

6.21. Suppose $u_1, u_2, \ldots, u_n$ belong to a vector space $V$ over a field $K$, and suppose $P = [a_{ij}]$ is an $n$-square matrix over $K$. For $i = 1, 2, \ldots, n$, let $v_i = a_{i1}u_1 + a_{i2}u_2 + \cdots + a_{in}u_n$.

(a) Suppose $P$ is invertible. Show that $\{u_i\}$ and $\{v_i\}$ span the same subspace of $V$. Hence, $\{u_i\}$ is linearly independent if and only if $\{v_i\}$ is linearly independent.
(b) Suppose $P$ is singular (not invertible). Show that $\{v_i\}$ is linearly dependent.
(c) Suppose $\{v_i\}$ is linearly independent. Show that $P$ is invertible.

(a) Because $P$ is invertible, it is row equivalent to the identity matrix $I$. Hence, by Problem 6.19, $\{v_i\}$ and $\{u_i\}$ span the same subspace of $V$. Thus, one is linearly independent if and only if the other is linearly independent.
(b) Because $P$ is not invertible, it is row equivalent to a matrix with a zero row. This means $\{v_i\}$ spans a subspace that has a spanning set with fewer than $n$ elements. Thus, $\{v_i\}$ is linearly dependent.
(c) This is the contrapositive of the statement of part (b), and so it follows from part (b).

6.22. Prove Theorem 6.6: Let $P$ be the change-of-basis matrix from a basis $S$ to a basis $S'$ in a vector space $V$. Then, for any vector $v \in V$, we have $P[v]_{S'} = [v]_S$, and hence, $P^{-1}[v]_S = [v]_{S'}$.
Suppose $S = \{u_1, \ldots, u_n\}$ and $S' = \{w_1, \ldots, w_n\}$, and suppose, for $i = 1, \ldots, n$,
$$w_i = a_{i1}u_1 + a_{i2}u_2 + \cdots + a_{in}u_n = \sum_{j=1}^{n} a_{ij}u_j$$
Then $P$ is the $n$-square matrix whose $j$th row is
$$(a_{1j}, a_{2j}, \ldots, a_{nj}) \qquad (1)$$
Also suppose $v = k_1w_1 + k_2w_2 + \cdots + k_nw_n = \sum_{i=1}^{n} k_iw_i$. Then
$$[v]_{S'} = [k_1, k_2, \ldots, k_n]^T \qquad (2)$$
Substituting for $w_i$ in the equation for $v$, we obtain
$$v = \sum_{i=1}^{n} k_iw_i = \sum_{i=1}^{n} k_i\Bigl(\sum_{j=1}^{n} a_{ij}u_j\Bigr) = \sum_{j=1}^{n}\Bigl(\sum_{i=1}^{n} a_{ij}k_i\Bigr)u_j = \sum_{j=1}^{n} (a_{1j}k_1 + a_{2j}k_2 + \cdots + a_{nj}k_n)u_j$$
Accordingly, $[v]_S$ is the column vector whose $j$th entry is
$$a_{1j}k_1 + a_{2j}k_2 + \cdots + a_{nj}k_n \qquad (3)$$
On the other hand, the $j$th entry of $P[v]_{S'}$ is obtained by multiplying the $j$th row of $P$ by $[v]_{S'}$, that is, (1) by (2). However, the product of (1) and (2) is (3). Hence, $P[v]_{S'}$ and $[v]_S$ have the same entries. Thus, $P[v]_{S'} = [v]_S$, as claimed. Furthermore, multiplying the above by $P^{-1}$ gives $P^{-1}[v]_S = P^{-1}P[v]_{S'} = [v]_{S'}$.
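The identity $P[v]_{S'} = [v]_S$ can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes $S$ is the usual basis of $\mathbf{R}^2$, takes $S' = \{w_1, w_2\} = \{(1,4), (2,7)\}$, and uses an arbitrarily chosen coordinate vector $[v]_{S'} = [3, -2]^T$.

```python
# Columns of P are the new basis vectors w1, w2 written in the old basis S = E.
w1, w2 = (1, 4), (2, 7)
P = [[w1[0], w2[0]],
     [w1[1], w2[1]]]

coords_new = [3, -2]  # [v]_{S'}: v = 3*w1 - 2*w2 (illustrative choice)

# Build v directly from its definition as a combination of w1 and w2.
v = [coords_new[0] * w1[i] + coords_new[1] * w2[i] for i in range(2)]

# Theorem 6.6: P [v]_{S'} = [v]_S. Because S = E here, [v]_S is v itself.
coords_old = [sum(p * k for p, k in zip(row, coords_new)) for row in P]

print(v, coords_old)
```

Both lists agree, as the theorem predicts for this particular choice of basis and vector.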

Linear Operators and Change of Basis

6.23. Consider the linear transformation $F$ on $\mathbf{R}^2$ defined by $F(x, y) = (5x - y,\ 2x + y)$ and the following bases of $\mathbf{R}^2$:
$$E = \{e_1, e_2\} = \{(1, 0), (0, 1)\} \quad\text{and}\quad S = \{u_1, u_2\} = \{(1, 4), (2, 7)\}$$
(a) Find the change-of-basis matrix $P$ from $E$ to $S$ and the change-of-basis matrix $Q$ from $S$ back to $E$.
(b) Find the matrix $A$ that represents $F$ in the basis $E$.
(c) Find the matrix $B$ that represents $F$ in the basis $S$.

(a) Because $E$ is the usual basis, simply write the vectors in $S$ as columns to obtain the change-of-basis matrix $P$. Recall, also, that $Q = P^{-1}$. Thus,
$$P = \begin{bmatrix} 1 & 2 \\ 4 & 7 \end{bmatrix} \quad\text{and}\quad Q = P^{-1} = \begin{bmatrix} -7 & 2 \\ 4 & -1 \end{bmatrix}$$
(b) Write the coefficients of $x$ and $y$ in $F(x, y) = (5x - y,\ 2x + y)$ as rows to get
$$A = \begin{bmatrix} 5 & -1 \\ 2 & 1 \end{bmatrix}$$
(c) Method 1. Find the coordinates of $F(u_1)$ and $F(u_2)$ relative to the basis $S$. This may be done by first finding the coordinates of an arbitrary vector $(a, b)$ in $\mathbf{R}^2$ relative to the basis $S$. We have
$$(a, b) = x(1, 4) + y(2, 7) = (x + 2y,\ 4x + 7y), \quad\text{and so}\quad x + 2y = a, \quad 4x + 7y = b$$
Solve for $x$ and $y$ in terms of $a$ and $b$ to get $x = -7a + 2b$, $y = 4a - b$. Then
$$(a, b) = (-7a + 2b)u_1 + (4a - b)u_2$$
Now use the formula for $(a, b)$ to obtain
$$F(u_1) = F(1, 4) = (1, 6) = 5u_1 - 2u_2$$
$$F(u_2) = F(2, 7) = (3, 11) = u_1 + u_2$$
and so
$$B = \begin{bmatrix} 5 & 1 \\ -2 & 1 \end{bmatrix}$$
Method 2. By Theorem 6.7, $B = P^{-1}AP$. Thus,
$$B = P^{-1}AP = \begin{bmatrix} -7 & 2 \\ 4 & -1 \end{bmatrix} \begin{bmatrix} 5 & -1 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 4 & 7 \end{bmatrix} = \begin{bmatrix} 5 & 1 \\ -2 & 1 \end{bmatrix}$$
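Method 2 is easy to mechanize. The following sketch is an illustration only: the helper names `matmul` and `inverse_2x2` are ours, and exact rational arithmetic from the standard library is used so that no rounding enters. It recomputes $B = P^{-1}AP$ for the matrices of this problem.

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def inverse_2x2(M):
    """Invert a 2x2 matrix exactly; raises ZeroDivisionError if M is singular."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[5, -1], [2, 1]]  # matrix of F in the usual basis E
P = [[1, 2], [4, 7]]   # columns are u1 = (1, 4) and u2 = (2, 7)

B = matmul(matmul(inverse_2x2(P), A), P)
print(B)
```

The entries of `B` are `Fraction` objects that compare equal to the integers found by Method 1.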

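6.24. Let $A = \begin{bmatrix} 2 & 3 \\ 4 & -1 \end{bmatrix}$. Find the matrix $B$ that represents the linear operator $A$ relative to the basis $S = \{u_1, u_2\} = \{[1, 3]^T, [2, 5]^T\}$. [Recall that $A$ defines a linear operator $A\colon \mathbf{R}^2 \to \mathbf{R}^2$ relative to the usual basis $E$ of $\mathbf{R}^2$.]
Method 1. Find the coordinates of $A(u_1)$ and $A(u_2)$ relative to the basis $S$ by first finding the coordinates of an arbitrary vector $[a, b]^T$ in $\mathbf{R}^2$ relative to the basis $S$. By Problem 6.2,
$$[a, b]^T = (-5a + 2b)u_1 + (3a - b)u_2$$
Using the formula for $[a, b]^T$, we obtain
$$A(u_1) = \begin{bmatrix} 2 & 3 \\ 4 & -1 \end{bmatrix} \begin{bmatrix} 1 \\ 3 \end{bmatrix} = \begin{bmatrix} 11 \\ 1 \end{bmatrix} = -53u_1 + 32u_2$$
and
$$A(u_2) = \begin{bmatrix} 2 & 3 \\ 4 & -1 \end{bmatrix} \begin{bmatrix} 2 \\ 5 \end{bmatrix} = \begin{bmatrix} 19 \\ 3 \end{bmatrix} = -89u_1 + 54u_2$$
Thus,
$$B = \begin{bmatrix} -53 & -89 \\ 32 & 54 \end{bmatrix}$$
Method 2. Use $B = P^{-1}AP$, where $P$ is the change-of-basis matrix from the usual basis $E$ to $S$. Thus, simply write the vectors in $S$ (as columns) to obtain the change-of-basis matrix $P$ and then use the formula for $P^{-1}$. This gives
$$P = \begin{bmatrix} 1 & 2 \\ 3 & 5 \end{bmatrix} \quad\text{and}\quad P^{-1} = \begin{bmatrix} -5 & 2 \\ 3 & -1 \end{bmatrix}$$
Then
$$B = P^{-1}AP = \begin{bmatrix} -5 & 2 \\ 3 & -1 \end{bmatrix} \begin{bmatrix} 2 & 3 \\ 4 & -1 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 3 & 5 \end{bmatrix} = \begin{bmatrix} -53 & -89 \\ 32 & 54 \end{bmatrix}$$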

6.25. Let $A = \begin{bmatrix} 1 & 3 & 1 \\ 2 & 5 & -4 \\ 1 & -2 & 2 \end{bmatrix}$. Find the matrix $B$ that represents the linear operator $A$ relative to the basis
$$S = \{u_1, u_2, u_3\} = \{[1, 1, 0]^T, [0, 1, 1]^T, [1, 2, 2]^T\}$$
[Recall that $A$ defines a linear operator $A\colon \mathbf{R}^3 \to \mathbf{R}^3$ relative to the usual basis $E$ of $\mathbf{R}^3$.]
Method 1. Find the coordinates of $A(u_1)$, $A(u_2)$, $A(u_3)$ relative to the basis $S$ by first finding the coordinates of an arbitrary vector $v = (a, b, c)$ in $\mathbf{R}^3$ relative to the basis $S$. By Problem 6.16,
$$v = (b - c)u_1 + (-2a + 2b - c)u_2 + (a - b + c)u_3$$
Using this formula for $[a, b, c]^T$, we obtain
$$A(u_1) = [4, 7, -1]^T = 8u_1 + 7u_2 - 4u_3$$
$$A(u_2) = [4, 1, 0]^T = u_1 - 6u_2 + 3u_3$$
$$A(u_3) = [9, 4, 1]^T = 3u_1 - 11u_2 + 6u_3$$
Writing the coefficients of $u_1, u_2, u_3$ as columns yields
$$B = \begin{bmatrix} 8 & 1 & 3 \\ 7 & -6 & -11 \\ -4 & 3 & 6 \end{bmatrix}$$
Method 2. Use $B = P^{-1}AP$, where $P$ is the change-of-basis matrix from the usual basis $E$ to $S$. The matrix $P$ (whose columns are simply the vectors in $S$) and $P^{-1}$ appear in Problem 6.16. Thus,
$$B = P^{-1}AP = \begin{bmatrix} 0 & 1 & -1 \\ -2 & 2 & -1 \\ 1 & -1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 3 & 1 \\ 2 & 5 & -4 \\ 1 & -2 & 2 \end{bmatrix} \begin{bmatrix} 1 & 0 & 1 \\ 1 & 1 & 2 \\ 0 & 1 & 2 \end{bmatrix} = \begin{bmatrix} 8 & 1 & 3 \\ 7 & -6 & -11 \\ -4 & 3 & 6 \end{bmatrix}$$

6.26. Prove Theorem 6.7: Let $P$ be the change-of-basis matrix from a basis $S$ to a basis $S'$ in a vector space $V$. Then, for any linear operator $T$ on $V$, $[T]_{S'} = P^{-1}[T]_SP$.
Let $v$ be a vector in $V$. Then, by Theorem 6.6, $P[v]_{S'} = [v]_S$. Therefore,
$$P^{-1}[T]_SP[v]_{S'} = P^{-1}[T]_S[v]_S = P^{-1}[T(v)]_S = [T(v)]_{S'}$$
But $[T]_{S'}[v]_{S'} = [T(v)]_{S'}$. Hence,
$$P^{-1}[T]_SP[v]_{S'} = [T]_{S'}[v]_{S'}$$
Because the mapping $v \mapsto [v]_{S'}$ is onto $K^n$, we have $P^{-1}[T]_SPX = [T]_{S'}X$ for every $X \in K^n$. Thus, $P^{-1}[T]_SP = [T]_{S'}$, as claimed.

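Similarity of Matrices

6.27. Let $A = \begin{bmatrix} 4 & -2 \\ 3 & 6 \end{bmatrix}$ and $P = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$.
(a) Find $B = P^{-1}AP$. (b) Verify $\operatorname{tr}(B) = \operatorname{tr}(A)$. (c) Verify $\det(B) = \det(A)$.

(a) First find $P^{-1}$ using the formula for the inverse of a $2 \times 2$ matrix. We have
$$P^{-1} = \begin{bmatrix} -2 & 1 \\ \tfrac{3}{2} & -\tfrac{1}{2} \end{bmatrix}$$
Then
$$B = P^{-1}AP = \begin{bmatrix} -2 & 1 \\ \tfrac{3}{2} & -\tfrac{1}{2} \end{bmatrix} \begin{bmatrix} 4 & -2 \\ 3 & 6 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} = \begin{bmatrix} 25 & 30 \\ -\tfrac{27}{2} & -15 \end{bmatrix}$$
(b) $\operatorname{tr}(A) = 4 + 6 = 10$ and $\operatorname{tr}(B) = 25 - 15 = 10$. Hence, $\operatorname{tr}(B) = \operatorname{tr}(A)$.
(c) $\det(A) = 24 + 6 = 30$ and $\det(B) = -375 + 405 = 30$. Hence, $\det(B) = \det(A)$.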
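The three parts of this problem can be reproduced with a short computation. The sketch below is illustrative only: the helper names are ours and exact fractions are used, so the entry $-\tfrac{27}{2}$ appears without rounding.

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def inverse_2x2(M):
    """Invert a 2x2 matrix exactly; raises ZeroDivisionError if M is singular."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def trace(M):
    """Sum of the diagonal entries."""
    return sum(M[i][i] for i in range(len(M)))

def det_2x2(M):
    """Determinant of a 2x2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

A = [[4, -2], [3, 6]]
P = [[1, 2], [3, 4]]
B = matmul(matmul(inverse_2x2(P), A), P)

print(B)
print(trace(A), trace(B))      # the two traces agree
print(det_2x2(A), det_2x2(B))  # the two determinants agree
```

Replacing `P` by any other invertible $2 \times 2$ matrix changes `B` but leaves its trace and determinant unchanged.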

6.28. Find the trace of each of the linear transformations $F$ on $\mathbf{R}^3$ in Problem 6.4.
Find the trace (sum of the diagonal elements) of any matrix representation of $F$, such as the matrix representation $[F] = [F]_E$ of $F$ relative to the usual basis $E$ given in Problem 6.4.
(a) $\operatorname{tr}(F) = \operatorname{tr}([F]) = 1 - 5 + 9 = 5$.
(b) $\operatorname{tr}(F) = \operatorname{tr}([F]) = 1 + 3 + 5 = 9$.
(c) $\operatorname{tr}(F) = \operatorname{tr}([F]) = 1 + 4 + 7 = 12$.

6.29. Write $A \approx B$ if $A$ is similar to $B$, that is, if there exists an invertible matrix $P$ such that $A = P^{-1}BP$. Prove that $\approx$ is an equivalence relation (on square matrices); that is,
(a) $A \approx A$, for every $A$. (b) If $A \approx B$, then $B \approx A$. (c) If $A \approx B$ and $B \approx C$, then $A \approx C$.

(a) The identity matrix $I$ is invertible, and $I^{-1} = I$. Because $A = I^{-1}AI$, we have $A \approx A$.
(b) Because $A \approx B$, there exists an invertible matrix $P$ such that $A = P^{-1}BP$. Hence, $B = PAP^{-1} = (P^{-1})^{-1}AP^{-1}$, and $P^{-1}$ is also invertible. Thus, $B \approx A$.
(c) Because $A \approx B$, there exists an invertible matrix $P$ such that $A = P^{-1}BP$, and as $B \approx C$, there exists an invertible matrix $Q$ such that $B = Q^{-1}CQ$. Thus,
$$A = P^{-1}BP = P^{-1}(Q^{-1}CQ)P = (P^{-1}Q^{-1})C(QP) = (QP)^{-1}C(QP)$$
and $QP$ is also invertible. Thus, $A \approx C$.

6.30. Suppose $B$ is similar to $A$, say $B = P^{-1}AP$. Prove
(a) $B^n = P^{-1}A^nP$, and so $B^n$ is similar to $A^n$.
(b) $f(B) = P^{-1}f(A)P$, for any polynomial $f(x)$, and so $f(B)$ is similar to $f(A)$.
(c) $B$ is a root of a polynomial $g(x)$ if and only if $A$ is a root of $g(x)$.

(a) The proof is by induction on $n$. The result holds for $n = 1$ by hypothesis. Suppose $n > 1$ and the result holds for $n - 1$. Then
$$B^n = BB^{n-1} = (P^{-1}AP)(P^{-1}A^{n-1}P) = P^{-1}A^nP$$
(b) Suppose $f(x) = a_nx^n + \cdots + a_1x + a_0$. Using the left and right distributive laws and part (a), we have
$$P^{-1}f(A)P = P^{-1}(a_nA^n + \cdots + a_1A + a_0I)P = P^{-1}(a_nA^n)P + \cdots + P^{-1}(a_1A)P + P^{-1}(a_0I)P$$
$$= a_n(P^{-1}A^nP) + \cdots + a_1(P^{-1}AP) + a_0(P^{-1}IP) = a_nB^n + \cdots + a_1B + a_0I = f(B)$$
(c) By part (b), $g(B) = 0$ if and only if $P^{-1}g(A)P = 0$ if and only if $g(A) = P0P^{-1} = 0$.
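Part (b) can be spot-checked on a concrete pair of similar matrices. The sketch below makes several assumptions that are not in the text: it reuses the $A$ and $P$ of Problem 6.27, picks the polynomial $f(x) = x^2 - 3x + 2$ arbitrarily, and evaluates $f$ at a matrix by Horner's rule in a helper we call `poly_of_matrix`.

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def inverse_2x2(M):
    """Invert a 2x2 matrix exactly; raises ZeroDivisionError if M is singular."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def poly_of_matrix(coeffs, M):
    """Evaluate a0*I + a1*M + ... + an*M^n by Horner's rule; coeffs = [a0, ..., an]."""
    n = len(M)
    result = [[Fraction(0)] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = matmul(result, M)   # multiply the running value by M
        for i in range(n):
            result[i][i] += c        # then add c times the identity
    return result

A = [[4, -2], [3, 6]]
P = [[1, 2], [3, 4]]
P_inv = inverse_2x2(P)
B = matmul(matmul(P_inv, A), P)

f = [2, -3, 1]  # f(x) = x^2 - 3x + 2, an illustrative choice
f_of_B = poly_of_matrix(f, B)
conjugated = matmul(matmul(P_inv, poly_of_matrix(f, A)), P)

print(f_of_B == conjugated)
```

A single agreeing example is of course not a proof; it only illustrates the identity established above.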

Matrix Representations of General Linear Mappings

6.31. Let $F\colon \mathbf{R}^3 \to \mathbf{R}^2$ be the linear map defined by $F(x, y, z) = (3x + 2y - 4z,\ x - 5y + 3z)$.
(a) Find the matrix of $F$ in the following bases of $\mathbf{R}^3$ and $\mathbf{R}^2$:
$$S = \{w_1, w_2, w_3\} = \{(1, 1, 1), (1, 1, 0), (1, 0, 0)\} \quad\text{and}\quad S' = \{u_1, u_2\} = \{(1, 3), (2, 5)\}$$
(b) Verify Theorem 6.10: The action of $F$ is preserved by its matrix representation; that is, for any $v$ in $\mathbf{R}^3$, we have $[F]_{S,S'}[v]_S = [F(v)]_{S'}$.

(a) From Problem 6.2, $(a, b) = (-5a + 2b)u_1 + (3a - b)u_2$. Thus,
$$F(w_1) = F(1, 1, 1) = (1, -1) = -7u_1 + 4u_2$$
$$F(w_2) = F(1, 1, 0) = (5, -4) = -33u_1 + 19u_2$$
$$F(w_3) = F(1, 0, 0) = (3, 1) = -13u_1 + 8u_2$$
Write the coordinates of $F(w_1)$, $F(w_2)$, $F(w_3)$ as columns to get
$$[F]_{S,S'} = \begin{bmatrix} -7 & -33 & -13 \\ 4 & 19 & 8 \end{bmatrix}$$
(b) If $v = (x, y, z)$, then, by Problem 6.5, $v = zw_1 + (y - z)w_2 + (x - y)w_3$. Also,
$$F(v) = (3x + 2y - 4z,\ x - 5y + 3z) = (-13x - 20y + 26z)u_1 + (8x + 11y - 15z)u_2$$
Hence,
$$[v]_S = (z,\ y - z,\ x - y)^T \quad\text{and}\quad [F(v)]_{S'} = \begin{bmatrix} -13x - 20y + 26z \\ 8x + 11y - 15z \end{bmatrix}$$
Thus,
$$[F]_{S,S'}[v]_S = \begin{bmatrix} -7 & -33 & -13 \\ 4 & 19 & 8 \end{bmatrix} \begin{bmatrix} z \\ y - z \\ x - y \end{bmatrix} = \begin{bmatrix} -13x - 20y + 26z \\ 8x + 11y - 15z \end{bmatrix} = [F(v)]_{S'}$$
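The recipe of part (a), "apply $F$ to each vector of $S$, express the result in $S'$, and write the coordinates as columns", translates directly into code. The following sketch is illustrative: the function names are ours, and the test vector $v = (2, -1, 3)$ is an arbitrary choice.

```python
def F(x, y, z):
    """The linear map of this problem."""
    return (3 * x + 2 * y - 4 * z, x - 5 * y + 3 * z)

def coords_in_S_prime(a, b):
    """Coordinates of (a, b) relative to u1 = (1, 3), u2 = (2, 5), from Problem 6.2."""
    return (-5 * a + 2 * b, 3 * a - b)

def coords_in_S(x, y, z):
    """Coordinates of (x, y, z) relative to w1, w2, w3, from Problem 6.5."""
    return (z, y - z, x - y)

S = [(1, 1, 1), (1, 1, 0), (1, 0, 0)]

# Each column of [F]_{S,S'} holds the S'-coordinates of F(w_j).
columns = [coords_in_S_prime(*F(*w)) for w in S]
matrix = [list(row) for row in zip(*columns)]  # transpose columns into rows

# Theorem 6.10 on one sample vector: [F]_{S,S'} [v]_S = [F(v)]_{S'}.
v = (2, -1, 3)
lhs = [sum(m * c for m, c in zip(row, coords_in_S(*v))) for row in matrix]
rhs = list(coords_in_S_prime(*F(*v)))

print(matrix)
print(lhs, rhs)
```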

6.32. Let $F\colon \mathbf{R}^n \to \mathbf{R}^m$ be the linear mapping defined as follows:
$$F(x_1, x_2, \ldots, x_n) = (a_{11}x_1 + \cdots + a_{1n}x_n,\ a_{21}x_1 + \cdots + a_{2n}x_n,\ \ldots,\ a_{m1}x_1 + \cdots + a_{mn}x_n)$$
(a) Show that the rows of the matrix $[F]$ representing $F$ relative to the usual bases of $\mathbf{R}^n$ and $\mathbf{R}^m$ are the coefficients of the $x_i$ in the components of $F(x_1, \ldots, x_n)$.
(b) Find the matrix representation of each of the following linear mappings relative to the usual basis of $\mathbf{R}^n$:
(i) $F\colon \mathbf{R}^2 \to \mathbf{R}^3$ defined by $F(x, y) = (3x - y,\ 2x + 4y,\ 5x - 6y)$.
(ii) $F\colon \mathbf{R}^4 \to \mathbf{R}^2$ defined by $F(x, y, s, t) = (3x - 4y + 2s - 5t,\ 5x + 7y - s - 2t)$.
(iii) $F\colon \mathbf{R}^3 \to \mathbf{R}^4$ defined by $F(x, y, z) = (2x + 3y - 8z,\ x + y + z,\ 4x - 5z,\ 6y)$.

(a) We have
$$F(1, 0, \ldots, 0) = (a_{11}, a_{21}, \ldots, a_{m1})$$
$$F(0, 1, \ldots, 0) = (a_{12}, a_{22}, \ldots, a_{m2})$$
$$\cdots$$
$$F(0, 0, \ldots, 1) = (a_{1n}, a_{2n}, \ldots, a_{mn})$$
and thus,
$$[F] = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}$$
(b) By part (a), we need only look at the coefficients of the unknowns $x, y, \ldots$ in $F(x, y, \ldots)$. Thus,
$$\text{(i)}\ [F] = \begin{bmatrix} 3 & -1 \\ 2 & 4 \\ 5 & -6 \end{bmatrix}, \qquad \text{(ii)}\ [F] = \begin{bmatrix} 3 & -4 & 2 & -5 \\ 5 & 7 & -1 & -2 \end{bmatrix}, \qquad \text{(iii)}\ [F] = \begin{bmatrix} 2 & 3 & -8 \\ 1 & 1 & 1 \\ 4 & 0 & -5 \\ 0 & 6 & 0 \end{bmatrix}$$

6.33. Let $A = \begin{bmatrix} 2 & 5 & -3 \\ 1 & -4 & 7 \end{bmatrix}$. Recall that $A$ determines a mapping $F\colon \mathbf{R}^3 \to \mathbf{R}^2$ defined by $F(v) = Av$, where vectors are written as columns. Find the matrix $[F]$ that represents the mapping relative to the following bases of $\mathbf{R}^3$ and $\mathbf{R}^2$:
(a) The usual bases of $\mathbf{R}^3$ and of $\mathbf{R}^2$.
(b) $S = \{w_1, w_2, w_3\} = \{(1, 1, 1), (1, 1, 0), (1, 0, 0)\}$ and $S' = \{u_1, u_2\} = \{(1, 3), (2, 5)\}$.

(a) Relative to the usual bases, $[F]$ is the matrix $A$.
(b) From Problem 6.2, $(a, b) = (-5a + 2b)u_1 + (3a - b)u_2$. Thus,
$$F(w_1) = \begin{bmatrix} 2 & 5 & -3 \\ 1 & -4 & 7 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 4 \\ 4 \end{bmatrix} = -12u_1 + 8u_2$$
$$F(w_2) = \begin{bmatrix} 2 & 5 & -3 \\ 1 & -4 & 7 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 7 \\ -3 \end{bmatrix} = -41u_1 + 24u_2$$
$$F(w_3) = \begin{bmatrix} 2 & 5 & -3 \\ 1 & -4 & 7 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \end{bmatrix} = -8u_1 + 5u_2$$
Writing the coefficients of $F(w_1)$, $F(w_2)$, $F(w_3)$ as columns yields
$$[F] = \begin{bmatrix} -12 & -41 & -8 \\ 8 & 24 & 5 \end{bmatrix}$$

6.34. Consider the linear transformation $T$ on $\mathbf{R}^2$ defined by $T(x, y) = (2x - 3y,\ x + 4y)$ and the following bases of $\mathbf{R}^2$:
$$E = \{e_1, e_2\} = \{(1, 0), (0, 1)\} \quad\text{and}\quad S = \{u_1, u_2\} = \{(1, 3), (2, 5)\}$$
(a) Find the matrix $A$ representing $T$ relative to the bases $E$ and $S$.
(b) Find the matrix $B$ representing $T$ relative to the bases $S$ and $E$.
(We can view $T$ as a linear mapping from one space into another, each having its own basis.)

(a) From Problem 6.2, $(a, b) = (-5a + 2b)u_1 + (3a - b)u_2$. Hence,
$$T(e_1) = T(1, 0) = (2, 1) = -8u_1 + 5u_2$$
$$T(e_2) = T(0, 1) = (-3, 4) = 23u_1 - 13u_2$$
and so
$$A = \begin{bmatrix} -8 & 23 \\ 5 & -13 \end{bmatrix}$$
(b) We have
$$T(u_1) = T(1, 3) = (-7, 13) = -7e_1 + 13e_2$$
$$T(u_2) = T(2, 5) = (-11, 22) = -11e_1 + 22e_2$$
and so
$$B = \begin{bmatrix} -7 & -11 \\ 13 & 22 \end{bmatrix}$$

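6.35. How are the matrices $A$ and $B$ in Problem 6.34 related?
By Theorem 6.12, the matrices $A$ and $B$ are equivalent to each other; that is, there exist nonsingular matrices $P$ and $Q$ such that $B = Q^{-1}AP$. Here the basis of the domain changes from $E$ to $S$ and the basis of the target space changes from $S$ to $E$, so $P$ is the change-of-basis matrix from $E$ to $S$, and $Q$ is the change-of-basis matrix from $S$ to $E$. Thus,
$$P = \begin{bmatrix} 1 & 2 \\ 3 & 5 \end{bmatrix}, \qquad Q = \begin{bmatrix} -5 & 2 \\ 3 & -1 \end{bmatrix}, \qquad Q^{-1} = \begin{bmatrix} 1 & 2 \\ 3 & 5 \end{bmatrix}$$
and
$$Q^{-1}AP = \begin{bmatrix} 1 & 2 \\ 3 & 5 \end{bmatrix} \begin{bmatrix} -8 & 23 \\ 5 & -13 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 3 & 5 \end{bmatrix} = \begin{bmatrix} -7 & -11 \\ 13 & 22 \end{bmatrix} = B$$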
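The product $Q^{-1}AP$ is quick to confirm by machine. The sketch below is an illustration with a helper name of our own choosing; it uses the matrix $A$ found in Problem 6.34(a), whose second column comes from $T(e_2) = 23u_1 - 13u_2$.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

A = [[-8, 23], [5, -13]]   # T relative to E (domain) and S (target), Problem 6.34(a)
P = [[1, 2], [3, 5]]       # columns are u1 = (1, 3) and u2 = (2, 5)
Q_inv = [[1, 2], [3, 5]]   # inverse of Q = [[-5, 2], [3, -1]]

B = matmul(matmul(Q_inv, A), P)
print(B)
```

The result is the matrix $B$ of Problem 6.34(b).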

6.36. Prove Theorem 6.14: Let $F\colon V \to U$ be linear and, say, $\operatorname{rank}(F) = r$. Then there exist bases of $V$ and of $U$ such that the matrix representation of $F$ has the following form, where $I_r$ is the $r$-square identity matrix:
$$A = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}$$
Suppose $\dim V = m$ and $\dim U = n$. Let $W$ be the kernel of $F$ and $U'$ the image of $F$. We are given that $\operatorname{rank}(F) = r$. Hence, the dimension of the kernel of $F$ is $m - r$. Let $\{w_1, \ldots, w_{m-r}\}$ be a basis of the kernel of $F$ and extend this to a basis of $V$:
$$\{v_1, \ldots, v_r, w_1, \ldots, w_{m-r}\}$$
Set
$$u_1 = F(v_1), \quad u_2 = F(v_2), \quad \ldots, \quad u_r = F(v_r)$$
Then $\{u_1, \ldots, u_r\}$ is a basis of $U'$, the image of $F$. Extend this to a basis of $U$, say
$$\{u_1, \ldots, u_r, u_{r+1}, \ldots, u_n\}$$
Observe that
$$F(v_1) = u_1 = 1u_1 + 0u_2 + \cdots + 0u_r + 0u_{r+1} + \cdots + 0u_n$$
$$F(v_2) = u_2 = 0u_1 + 1u_2 + \cdots + 0u_r + 0u_{r+1} + \cdots + 0u_n$$
$$\cdots$$
$$F(v_r) = u_r = 0u_1 + 0u_2 + \cdots + 1u_r + 0u_{r+1} + \cdots + 0u_n$$
$$F(w_1) = 0 = 0u_1 + 0u_2 + \cdots + 0u_r + 0u_{r+1} + \cdots + 0u_n$$
$$\cdots$$
$$F(w_{m-r}) = 0 = 0u_1 + 0u_2 + \cdots + 0u_r + 0u_{r+1} + \cdots + 0u_n$$
Thus, the matrix of $F$ in the above bases has the required form.

SUPPLEMENTARY PROBLEMS

Matrices and Linear Operators

6.37. Let $F\colon \mathbf{R}^2 \to \mathbf{R}^2$ be defined by $F(x, y) = (4x + 5y,\ 2x - y)$.
(a) Find the matrix $A$ representing $F$ in the usual basis $E$.
(b) Find the matrix $B$ representing $F$ in the basis $S = \{u_1, u_2\} = \{(1, 4), (2, 9)\}$.
(c) Find $P$ such that $B = P^{-1}AP$.
(d) For $v = (a, b)$, find $[v]_S$ and $[F(v)]_S$. Verify that $[F]_S[v]_S = [F(v)]_S$.

6.38. Let $A\colon \mathbf{R}^2 \to \mathbf{R}^2$ be defined by the matrix $A = \begin{bmatrix} 5 & -1 \\ 2 & 4 \end{bmatrix}$.
(a) Find the matrix $B$ representing $A$ relative to the basis $S = \{u_1, u_2\} = \{(1, 3), (2, 8)\}$. (Recall that $A$ represents the mapping $A$ relative to the usual basis $E$.)
(b) For $v = (a, b)$, find $[v]_S$ and $[A(v)]_S$.

6.39. For each linear transformation $L$ on $\mathbf{R}^2$, find the matrix $A$ representing $L$ (relative to the usual basis of $\mathbf{R}^2$):
(a) $L$ is the rotation in $\mathbf{R}^2$ counterclockwise by $45^\circ$.
(b) $L$ is the reflection in $\mathbf{R}^2$ about the line $y = x$.
(c) $L$ is defined by $L(1, 0) = (3, 5)$ and $L(0, 1) = (7, -2)$.
(d) $L$ is defined by $L(1, 1) = (3, 7)$ and $L(1, 2) = (5, -4)$.

6.40. Find the matrix representing each linear transformation $T$ on $\mathbf{R}^3$ relative to the usual basis of $\mathbf{R}^3$:
(a) $T(x, y, z) = (x, y, 0)$.
(b) $T(x, y, z) = (z,\ y + z,\ x + y + z)$.
(c) $T(x, y, z) = (2x - 7y - 4z,\ 3x + y + 4z,\ 6x - 8y + z)$.

6.41. Repeat Problem 6.40 using the basis $S = \{u_1, u_2, u_3\} = \{(1, 1, 0), (1, 2, 3), (1, 3, 5)\}$.

6.42. Let $L$ be the linear transformation on $\mathbf{R}^3$ defined by
$$L(1, 0, 0) = (1, 1, 1), \quad L(0, 1, 0) = (1, 3, 5), \quad L(0, 0, 1) = (2, 2, 2)$$
(a) Find the matrix $A$ representing $L$ relative to the usual basis of $\mathbf{R}^3$.
(b) Find the matrix $B$ representing $L$ relative to the basis $S$ in Problem 6.41.

6.43. Let $D$ denote the differential operator; that is, $D(f(t)) = df/dt$. Each of the following sets is a basis of a vector space $V$ of functions. Find the matrix representing $D$ in each basis:
(a) $\{e^t, e^{2t}, te^{2t}\}$. (b) $\{1, t, \sin 3t, \cos 3t\}$. (c) $\{e^{5t}, te^{5t}, t^2e^{5t}\}$.

6.44. Let $D$ denote the differential operator on the vector space $V$ of functions with basis $S = \{\sin\theta, \cos\theta\}$.
(a) Find the matrix $A = [D]_S$. (b) Use $A$ to show that $D$ is a zero of $f(t) = t^2 + 1$.

6.45. Let $V$ be the vector space of $2 \times 2$ matrices. Consider the following matrix $M$ and usual basis $E$ of $V$:
$$M = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \quad\text{and}\quad E = \left\{ \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \right\}$$
Find the matrix representing each of the following linear operators $T$ on $V$ relative to $E$:
(a) $T(A) = MA$. (b) $T(A) = AM$. (c) $T(A) = MA - AM$.

6.46. Let $1_V$ and $0_V$ denote the identity and zero operators, respectively, on a vector space $V$. Show that, for any basis $S$ of $V$, (a) $[1_V]_S = I$, the identity matrix; (b) $[0_V]_S = 0$, the zero matrix.

Change of Basis

6.47. Find the change-of-basis matrix $P$ from the usual basis $E$ of $\mathbf{R}^2$ to a basis $S$, the change-of-basis matrix $Q$ from $S$ back to $E$, and the coordinates of $v = (a, b)$ relative to $S$, for the following bases $S$:
(a) $S = \{(1, 2), (3, 5)\}$. (b) $S = \{(1, -3), (3, -8)\}$. (c) $S = \{(2, 5), (3, 7)\}$. (d) $S = \{(2, 3), (4, 5)\}$.

6.48. Consider the bases $S = \{(1, 2), (2, 3)\}$ and $S' = \{(1, 3), (1, 4)\}$ of $\mathbf{R}^2$. Find the change-of-basis matrix:
(a) $P$ from $S$ to $S'$. (b) $Q$ from $S'$ back to $S$.

6.49. Suppose that the $x$-axis and $y$-axis in the plane $\mathbf{R}^2$ are rotated counterclockwise $30^\circ$ to yield a new $x'$-axis and $y'$-axis for the plane. Find
(a) The unit vectors in the direction of the new $x'$-axis and $y'$-axis.
(b) The change-of-basis matrix $P$ for the new coordinate system.
(c) The new coordinates of the points $A(1, 3)$, $B(2, -5)$, $C(a, b)$.

6.50. Find the change-of-basis matrix $P$ from the usual basis $E$ of $\mathbf{R}^3$ to a basis $S$, the change-of-basis matrix $Q$ from $S$ back to $E$, and the coordinates of $v = (a, b, c)$ relative to $S$, where $S$ consists of the vectors:
(a) $u_1 = (1, 1, 0)$, $u_2 = (0, 1, 2)$, $u_3 = (0, 1, 1)$.
(b) $u_1 = (1, 0, 1)$, $u_2 = (1, 1, 2)$, $u_3 = (1, 2, 4)$.
(c) $u_1 = (1, 2, 1)$, $u_2 = (1, 3, 4)$, $u_3 = (2, 5, 6)$.

6.51. Suppose $S_1, S_2, S_3$ are bases of $V$. Let $P$ and $Q$ be the change-of-basis matrices, respectively, from $S_1$ to $S_2$ and from $S_2$ to $S_3$. Prove that $PQ$ is the change-of-basis matrix from $S_1$ to $S_3$.

Linear Operators and Change of Basis

6.52. Consider the linear operator $F$ on $\mathbf{R}^2$ defined by $F(x, y) = (5x + y,\ 3x - 2y)$ and the following bases of $\mathbf{R}^2$:
$$S = \{(1, 2), (2, 3)\} \quad\text{and}\quad S' = \{(1, 3), (1, 4)\}$$
(a) Find the matrix $A$ representing $F$ relative to the basis $S$.
(b) Find the matrix $B$ representing $F$ relative to the basis $S'$.
(c) Find the change-of-basis matrix $P$ from $S$ to $S'$.
(d) How are $A$ and $B$ related?

6.53. Let $A\colon \mathbf{R}^2 \to \mathbf{R}^2$ be defined by the matrix $A = \begin{bmatrix} 1 & -1 \\ 3 & 2 \end{bmatrix}$. Find the matrix $B$ that represents the linear operator $A$ relative to each of the following bases:
(a) $S = \{(1, 3)^T, (2, 5)^T\}$. (b) $S = \{(1, 3)^T, (2, 4)^T\}$.

6.54. Let $F\colon \mathbf{R}^2 \to \mathbf{R}^2$ be defined by $F(x, y) = (x - 3y,\ 2x - 4y)$. Find the matrix $A$ that represents $F$ relative to each of the following bases:
(a) $S = \{(2, 5), (3, 7)\}$. (b) $S = \{(2, 3), (4, 5)\}$.

6.55. Let $A\colon \mathbf{R}^3 \to \mathbf{R}^3$ be defined by the matrix $A = \begin{bmatrix} 1 & 3 & 1 \\ 2 & 7 & 4 \\ 1 & 4 & 3 \end{bmatrix}$. Find the matrix $B$ that represents the linear operator $A$ relative to the basis $S = \{(1, 1, 1)^T, (0, 1, 1)^T, (1, 2, 3)^T\}$.

Similarity of Matrices

6.56. Let $A = \begin{bmatrix} 1 & 1 \\ 2 & -3 \end{bmatrix}$ and $P = \begin{bmatrix} 1 & -2 \\ 3 & -5 \end{bmatrix}$.
(a) Find $B = P^{-1}AP$. (b) Verify that $\operatorname{tr}(B) = \operatorname{tr}(A)$. (c) Verify that $\det(B) = \det(A)$.

6.57. Find the trace and determinant of each of the following linear maps on $\mathbf{R}^2$:
(a) $F(x, y) = (2x - 3y,\ 5x + 4y)$. (b) $G(x, y) = (ax + by,\ cx + dy)$.

6.58. Find the trace and determinant of each of the following linear maps on $\mathbf{R}^3$:
(a) $F(x, y, z) = (x + 3y,\ 3x - 2z,\ x - 4y - 3z)$.
(b) $G(x, y, z) = (y + 3z,\ 2x - 4z,\ 5x + 7y)$.

6.59. Suppose $S = \{u_1, u_2\}$ is a basis of $V$, and $T\colon V \to V$ is defined by $T(u_1) = 3u_1 - 2u_2$ and $T(u_2) = u_1 + 4u_2$. Suppose $S' = \{w_1, w_2\}$ is a basis of $V$ for which $w_1 = u_1 + u_2$ and $w_2 = 2u_1 + 3u_2$.
(a) Find the matrices $A$ and $B$ representing $T$ relative to the bases $S$ and $S'$, respectively.
(b) Find the matrix $P$ such that $B = P^{-1}AP$.

6.60. Let $A$ be a $2 \times 2$ matrix such that only $A$ is similar to itself. Show that $A$ is a scalar matrix, that is, that $A = \begin{bmatrix} a & 0 \\ 0 & a \end{bmatrix}$.

6.61. Show that all matrices similar to an invertible matrix are invertible. More generally, show that similar matrices have the same rank.

Matrix Representation of General Linear Mappings

6.62. Find the matrix representation of each of the following linear maps relative to the usual basis for $\mathbf{R}^n$:
(a) $F\colon \mathbf{R}^3 \to \mathbf{R}^2$ defined by $F(x, y, z) = (2x - 4y + 9z,\ 5x + 3y - 2z)$.
(b) $F\colon \mathbf{R}^2 \to \mathbf{R}^4$ defined by $F(x, y) = (3x + 4y,\ 5x - 2y,\ x + 7y,\ 4x)$.
(c) $F\colon \mathbf{R}^4 \to \mathbf{R}$ defined by $F(x_1, x_2, x_3, x_4) = 2x_1 + x_2 - 7x_3 - x_4$.

6.63. Let $G\colon \mathbf{R}^3 \to \mathbf{R}^2$ be defined by $G(x, y, z) = (2x + 3y - z,\ 4x - y + 2z)$.
(a) Find the matrix $A$ representing $G$ relative to the bases
$$S = \{(1, 1, 0), (1, 2, 3), (1, 3, 5)\} \quad\text{and}\quad S' = \{(1, 2), (2, 3)\}$$
(b) For any $v = (a, b, c)$ in $\mathbf{R}^3$, find $[v]_S$ and $[G(v)]_{S'}$.
(c) Verify that $A[v]_S = [G(v)]_{S'}$.

6.64. Let $H\colon \mathbf{R}^2 \to \mathbf{R}^2$ be defined by $H(x, y) = (2x + 7y,\ x - 3y)$ and consider the following bases of $\mathbf{R}^2$:
$$S = \{(1, 1), (1, 2)\} \quad\text{and}\quad S' = \{(1, 4), (1, 5)\}$$
(a) Find the matrix $A$ representing $H$ relative to the bases $S$ and $S'$.
(b) Find the matrix $B$ representing $H$ relative to the bases $S'$ and $S$.

6.65. Let $F\colon \mathbf{R}^3 \to \mathbf{R}^2$ be defined by $F(x, y, z) = (2x + y - z,\ 3x - 2y + 4z)$.
(a) Find the matrix $A$ representing $F$ relative to the bases
$$S = \{(1, 1, 1), (1, 1, 0), (1, 0, 0)\} \quad\text{and}\quad S' = \{(1, 3), (1, 4)\}$$
(b) Verify that, for any $v = (a, b, c)$ in $\mathbf{R}^3$, $A[v]_S = [F(v)]_{S'}$.

6.66. Let $S$ and $S'$ be bases of $V$, and let $1_V$ be the identity mapping on $V$. Show that the matrix $A$ representing $1_V$ relative to the bases $S$ and $S'$ is the inverse of the change-of-basis matrix $P$ from $S$ to $S'$; that is, $A = P^{-1}$.

6.67. Prove (a) Theorem 6.10, (b) Theorem 6.11, (c) Theorem 6.12, (d) Theorem 6.13. [Hint: See the proofs of the analogous Theorems 6.1 (Problem 6.9), 6.2 (Problem 6.10), 6.3 (Problem 6.11), and 6.7 (Problem 6.26).]

Miscellaneous Problems

6.68. Suppose $F\colon V \to V$ is linear. A subspace $W$ of $V$ is said to be invariant under $F$ if $F(W) \subseteq W$. Suppose $W$ is invariant under $F$ and $\dim W = r$. Show that $F$ has a block triangular matrix representation $M = \begin{bmatrix} A & B \\ 0 & C \end{bmatrix}$, where $A$ is an $r \times r$ submatrix.

6.69. Suppose $V = U \oplus W$, and suppose $U$ and $W$ are each invariant under a linear operator $F\colon V \to V$. Also, suppose $\dim U = r$ and $\dim W = s$. Show that $F$ has a block diagonal matrix representation $M = \begin{bmatrix} A & 0 \\ 0 & B \end{bmatrix}$, where $A$ and $B$ are $r \times r$ and $s \times s$ submatrices.

6.70. Two linear operators $F$ and $G$ on $V$ are said to be similar if there exists an invertible linear operator $T$ on $V$ such that $G = T^{-1} \circ F \circ T$. Prove
(a) $F$ and $G$ are similar if and only if, for any basis $S$ of $V$, $[F]_S$ and $[G]_S$ are similar matrices.
(b) If $F$ is diagonalizable (similar to a diagonal matrix), then any similar operator $G$ is also diagonalizable.

ANSWERS TO SUPPLEMENTARY PROBLEMS

Notation: $M = [R_1;\ R_2;\ \ldots]$ represents a matrix $M$ with rows $R_1, R_2, \ldots$.

6.37. (a) $A = [4, 5;\ 2, -1]$; (b) $B = [220, 487;\ -98, -217]$; (c) $P = [1, 2;\ 4, 9]$; (d) $[v]_S = [9a - 2b,\ -4a + b]^T$ and $[F(v)]_S = [32a + 47b,\ -14a - 21b]^T$

6.38. (a) $B = [-6, -28;\ 4, 15]$; (b) $[v]_S = [4a - b,\ -\tfrac{3}{2}a + \tfrac{1}{2}b]^T$ and $[A(v)]_S = [18a - 8b,\ \tfrac{1}{2}(-13a + 7b)]^T$

6.39. (a) $\tfrac{1}{2}[\sqrt{2}, -\sqrt{2};\ \sqrt{2}, \sqrt{2}]$; (b) $[0, 1;\ 1, 0]$; (c) $[3, 7;\ 5, -2]$; (d) $[1, 2;\ 18, -11]$

6.40. (a) $[1, 0, 0;\ 0, 1, 0;\ 0, 0, 0]$; (b) $[0, 0, 1;\ 0, 1, 1;\ 1, 1, 1]$; (c) $[2, -7, -4;\ 3, 1, 4;\ 6, -8, 1]$

6.41. (a) $[1, 3, 5;\ 0, -5, -10;\ 0, 3, 6]$; (b) $[0, 1, 2;\ -1, 2, 3;\ 1, 0, 0]$; (c) $[15, 65, 104;\ -49, -219, -351;\ 29, 130, 208]$

6.42. (a) $[1, 1, 2;\ 1, 3, 2;\ 1, 5, 2]$; (b) $[0, 0, 0;\ 2, 14, 22;\ 0, -5, -8]$

6.43. (a) $[1, 0, 0;\ 0, 2, 1;\ 0, 0, 2]$; (b) $[0, 1, 0, 0;\ 0, 0, 0, 0;\ 0, 0, 0, -3;\ 0, 0, 3, 0]$; (c) $[5, 1, 0;\ 0, 5, 2;\ 0, 0, 5]$

6.44. (a) $A = [0, -1;\ 1, 0]$; (b) $A^2 + I = 0$

6.45. (a) $[a, 0, b, 0;\ 0, a, 0, b;\ c, 0, d, 0;\ 0, c, 0, d]$; (b) $[a, c, 0, 0;\ b, d, 0, 0;\ 0, 0, a, c;\ 0, 0, b, d]$; (c) $[0, -c, b, 0;\ -b, a - d, 0, b;\ c, 0, d - a, -c;\ 0, c, -b, 0]$

6.47. (a) $[1, 3;\ 2, 5]$, $[-5, 3;\ 2, -1]$, $[v] = [-5a + 3b,\ 2a - b]^T$;
(b) $[1, 3;\ -3, -8]$, $[-8, -3;\ 3, 1]$, $[v] = [-8a - 3b,\ 3a + b]^T$;
(c) $[2, 3;\ 5, 7]$, $[-7, 3;\ 5, -2]$, $[v] = [-7a + 3b,\ 5a - 2b]^T$;
(d) $[2, 4;\ 3, 5]$, $[-\tfrac{5}{2}, 2;\ \tfrac{3}{2}, -1]$, $[v] = [-\tfrac{5}{2}a + 2b,\ \tfrac{3}{2}a - b]^T$

6.48. (a) $P = [3, 5;\ -1, -2]$; (b) $Q = [2, 5;\ -1, -3]$

6.49. Here $K = \sqrt{3}$: (a) $\tfrac{1}{2}(K, 1)$, $\tfrac{1}{2}(-1, K)$; (b) $P = \tfrac{1}{2}[K, -1;\ 1, K]$; (c) $\tfrac{1}{2}[K + 3,\ 3K - 1]^T$, $\tfrac{1}{2}[2K - 5,\ -5K - 2]^T$, $\tfrac{1}{2}[aK + b,\ bK - a]^T$

6.50. $P$ is the matrix whose columns are $u_1, u_2, u_3$; $Q = P^{-1}$; $[v] = Q[a, b, c]^T$:
(a) $Q = [1, 0, 0;\ 1, -1, 1;\ -2, 2, -1]$, $[v] = [a,\ a - b + c,\ -2a + 2b - c]^T$;
(b) $Q = [0, -2, 1;\ 2, 3, -2;\ -1, -1, 1]$, $[v] = [-2b + c,\ 2a + 3b - 2c,\ -a - b + c]^T$;
(c) $Q = [-2, 2, -1;\ -7, 4, -1;\ 5, -3, 1]$, $[v] = [-2a + 2b - c,\ -7a + 4b - c,\ 5a - 3b + c]^T$

6.52. (a) $[-23, -39;\ 15, 26]$; (b) $[35, 41;\ -27, -32]$; (c) $[3, 5;\ -1, -2]$; (d) $B = P^{-1}AP$

6.53. (a) $[28, 47;\ -15, -25]$; (b) $[13, 18;\ -\tfrac{15}{2}, -10]$

6.54. (a) $[43, 60;\ -33, -46]$; (b) $\tfrac{1}{2}[3, 7;\ -5, -9]$

6.55. $[10, 8, 20;\ 13, 11, 28;\ -5, -4, -10]$

6.56. (a) $[-34, 57;\ -19, 32]$; (b) $\operatorname{tr}(B) = \operatorname{tr}(A) = -2$; (c) $\det(B) = \det(A) = -5$

6.57. (a) $\operatorname{tr}(F) = 6$, $\det(F) = 23$; (b) $\operatorname{tr}(G) = a + d$, $\det(G) = ad - bc$

6.58. (a) $\operatorname{tr}(F) = -2$, $\det(F) = 13$; (b) $\operatorname{tr}(G) = 0$, $\det(G) = 22$

6.59. (a) $A = [3, 1;\ -2, 4]$, $B = [8, 11;\ -2, -1]$; (b) $P = [1, 2;\ 1, 3]$

6.62. (a) $[2, -4, 9;\ 5, 3, -2]$; (b) $[3, 4;\ 5, -2;\ 1, 7;\ 4, 0]$; (c) $[2, 1, -7, -1]$

6.63. (a) $[-9, 1, 4;\ 7, 2, 1]$; (b) $[v]_S = [-a + 2b - c,\ 5a - 5b + 2c,\ -3a + 3b - c]^T$ and $[G(v)]_{S'} = [2a - 11b + 7c,\ 7b - 4c]^T$

6.64. (a) $A = [47, 85;\ -38, -69]$; (b) $B = [71, 88;\ -41, -51]$

6.65. $A = [3, 11, 5;\ -1, -8, -3]$

CHAPTER 7

Inner Product Spaces, Orthogonality
7.1 Introduction
The definition of a vector space $V$ involves an arbitrary field $K$. Here we first restrict $K$ to be the real field $\mathbf{R}$, in which case $V$ is called a real vector space; in the last sections of this chapter, we extend our results to the case where $K$ is the complex field $\mathbf{C}$, in which case $V$ is called a complex vector space. Also, we adopt the previous notation that
$u, v, w$ are vectors in $V$
$a, b, c, k$ are scalars in $K$

Furthermore, the vector spaces V in this chapter have finite dimension unless otherwise stated or implied. Recall that the concepts of ‘‘length’’ and ‘‘orthogonality’’ did not appear in the investigation of arbitrary vector spaces V (although they did appear in Section 1.4 on the spaces Rn and Cn ). Here we place an additional structure on a vector space V to obtain an inner product space, and in this context these concepts are defined.

7.2 Inner Product Spaces

We begin with a definition.
DEFINITION: Let $V$ be a real vector space. Suppose to each pair of vectors $u, v \in V$ there is assigned a real number, denoted by $\langle u, v \rangle$. This function is called a (real) inner product on $V$ if it satisfies the following axioms:
$[I_1]$ (Linear Property): $\langle au_1 + bu_2, v \rangle = a\langle u_1, v \rangle + b\langle u_2, v \rangle$.
$[I_2]$ (Symmetric Property): $\langle u, v \rangle = \langle v, u \rangle$.
$[I_3]$ (Positive Definite Property): $\langle u, u \rangle \ge 0$; and $\langle u, u \rangle = 0$ if and only if $u = 0$.
The vector space $V$ with an inner product is called a (real) inner product space.

Axiom $[I_1]$ states that an inner product function is linear in the first position. Using $[I_1]$ and the symmetry axiom $[I_2]$, we obtain
$$\langle u, cv_1 + dv_2 \rangle = \langle cv_1 + dv_2, u \rangle = c\langle v_1, u \rangle + d\langle v_2, u \rangle = c\langle u, v_1 \rangle + d\langle u, v_2 \rangle$$
That is, the inner product function is also linear in its second position. Combining these two properties and using induction yields the following general formula:
$$\Bigl\langle \sum_i a_iu_i,\ \sum_j b_jv_j \Bigr\rangle = \sum_i \sum_j a_ib_j\langle u_i, v_j \rangle$$


That is, an inner product of linear combinations of vectors is equal to a linear combination of the inner products of the vectors.
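EXAMPLE 7.1 Let $V$ be a real inner product space. Then, by linearity,
$$\langle 3u_1 - 4u_2,\ 2v_1 - 5v_2 + 6v_3 \rangle = 6\langle u_1, v_1 \rangle - 15\langle u_1, v_2 \rangle + 18\langle u_1, v_3 \rangle - 8\langle u_2, v_1 \rangle + 20\langle u_2, v_2 \rangle - 24\langle u_2, v_3 \rangle$$
$$\langle 2u - 5v,\ 4u + 6v \rangle = 8\langle u, u \rangle + 12\langle u, v \rangle - 20\langle v, u \rangle - 30\langle v, v \rangle = 8\langle u, u \rangle - 8\langle v, u \rangle - 30\langle v, v \rangle$$
Observe that in the last equation we have used the symmetry property that $\langle u, v \rangle = \langle v, u \rangle$.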
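The second expansion can be sanity-checked with a concrete inner product. The sketch below is illustrative only: it assumes the dot product on $\mathbf{R}^3$ as the inner product and two arbitrarily chosen integer vectors; the helper names are ours.

```python
def dot(x, y):
    """Dot product, used here as a concrete real inner product."""
    return sum(a * b for a, b in zip(x, y))

def comb(a, x, b, y):
    """Return the linear combination a*x + b*y."""
    return [a * xi + b * yi for xi, yi in zip(x, y)]

u = [1, -2, 3]   # arbitrary sample vectors
v = [4, 0, -1]

lhs = dot(comb(2, u, -5, v), comb(4, u, 6, v))       # <2u - 5v, 4u + 6v>
rhs = 8 * dot(u, u) - 8 * dot(v, u) - 30 * dot(v, v)  # expanded form

print(lhs, rhs)
```

Both sides evaluate to the same integer, as linearity in each position requires.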

Remark: Axiom $[I_1]$ by itself implies $\langle 0, 0 \rangle = \langle 0v, 0 \rangle = 0\langle v, 0 \rangle = 0$. Thus, $[I_1]$, $[I_2]$, $[I_3]$ are equivalent to $[I_1]$, $[I_2]$, and the following axiom:
$[I_3']$ If $u \ne 0$, then $\langle u, u \rangle$ is positive.
That is, a function satisfying $[I_1]$, $[I_2]$, $[I_3']$ is an inner product.

Norm of a Vector
By the third axiom $[I_3]$ of an inner product, $\langle u, u \rangle$ is nonnegative for any vector $u$. Thus, its nonnegative square root exists. We use the notation
$$\|u\| = \sqrt{\langle u, u \rangle}$$
This nonnegative number is called the norm or length of $u$. The relation $\|u\|^2 = \langle u, u \rangle$ will be used frequently.
Remark: If $\|u\| = 1$ or, equivalently, if $\langle u, u \rangle = 1$, then $u$ is called a unit vector and it is said to be normalized. Every nonzero vector $v$ in $V$ can be multiplied by the reciprocal of its length to obtain the unit vector
$$\hat{v} = \frac{1}{\|v\|}\,v$$
which is a positive multiple of $v$. This process is called normalizing $v$.

7.3 Examples of Inner Product Spaces

This section lists the main examples of inner product spaces used in this text.

Euclidean n-Space Rn
Consider the vector space $\mathbf{R}^n$. The dot product or scalar product in $\mathbf{R}^n$ is defined by
$$u \cdot v = a_1b_1 + a_2b_2 + \cdots + a_nb_n$$
where $u = (a_i)$ and $v = (b_i)$. This function defines an inner product on $\mathbf{R}^n$. The norm $\|u\|$ of the vector $u = (a_i)$ in this space is as follows:
$$\|u\| = \sqrt{u \cdot u} = \sqrt{a_1^2 + a_2^2 + \cdots + a_n^2}$$
On the other hand, by the Pythagorean theorem, the distance from the origin $O$ in $\mathbf{R}^3$ to a point $P(a, b, c)$ is given by $\sqrt{a^2 + b^2 + c^2}$. This is precisely the same as the above-defined norm of the vector $v = (a, b, c)$ in $\mathbf{R}^3$. Because the Pythagorean theorem is a consequence of the axioms of Euclidean geometry, the vector space $\mathbf{R}^n$ with the above inner product and norm is called Euclidean $n$-space. Although there are many ways to define an inner product on $\mathbf{R}^n$, we shall assume this inner product unless otherwise stated or implied. It is called the usual (or standard) inner product on $\mathbf{R}^n$.
Remark: Frequently the vectors in $\mathbf{R}^n$ will be represented by column vectors, that is, by $n \times 1$ column matrices. In such a case, the formula
$$\langle u, v \rangle = u^Tv$$
defines the usual inner product on $\mathbf{R}^n$.
EXAMPLE 7.2 Let $u = (1, 3, -4, 2)$, $v = (4, -2, 2, 1)$, $w = (5, -1, -2, 6)$ in $\mathbf{R}^4$.
(a) Show $\langle 3u - 2v, w \rangle = 3\langle u, w \rangle - 2\langle v, w \rangle$. By definition,
$$\langle u, w \rangle = 5 - 3 + 8 + 12 = 22 \quad\text{and}\quad \langle v, w \rangle = 20 + 2 - 4 + 6 = 24$$
Note that $3u - 2v = (-5, 13, -16, 4)$. Thus,
$$\langle 3u - 2v, w \rangle = -25 - 13 + 32 + 24 = 18$$
As expected, $3\langle u, w \rangle - 2\langle v, w \rangle = 3(22) - 2(24) = 18 = \langle 3u - 2v, w \rangle$.
(b) Normalize $u$ and $v$. By definition,
$$\|u\| = \sqrt{1 + 9 + 16 + 4} = \sqrt{30} \quad\text{and}\quad \|v\| = \sqrt{16 + 4 + 4 + 1} = 5$$
We normalize $u$ and $v$ to obtain the following unit vectors in the directions of $u$ and $v$, respectively:
$$\hat{u} = \frac{1}{\|u\|}\,u = \left( \frac{1}{\sqrt{30}}, \frac{3}{\sqrt{30}}, \frac{-4}{\sqrt{30}}, \frac{2}{\sqrt{30}} \right) \quad\text{and}\quad \hat{v} = \frac{1}{\|v\|}\,v = \left( \frac{4}{5}, \frac{-2}{5}, \frac{2}{5}, \frac{1}{5} \right)$$
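The computations of this example are short enough to replay in a few lines. The following is a minimal sketch with helper names of our own (`dot`, `norm`, `normalize`); it uses floating-point square roots, so the components of $\hat{u}$ are decimal approximations of the exact values.

```python
import math

def dot(x, y):
    """Usual inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    """Length of x in the usual inner product."""
    return math.sqrt(dot(x, x))

def normalize(x):
    """Unit vector in the direction of a nonzero vector x."""
    length = norm(x)
    return tuple(a / length for a in x)

u = (1, 3, -4, 2)
v = (4, -2, 2, 1)
w = (5, -1, -2, 6)

combo = tuple(3 * a - 2 * b for a, b in zip(u, v))   # 3u - 2v
print(dot(u, w), dot(v, w), dot(combo, w))           # part (a)
print(norm(v), normalize(v))                         # part (b) for v
print(normalize(u))                                  # part (b) for u, approximate
```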

Function Space C½a; bŠ and Polynomial Space PðtÞ
The notation C½a; bŠ is used to denote the vector space of all continuous functions on the closed interval ½a; bŠ—that is, where a t b. The following defines an inner product on C½a; bŠ, where f ðtÞ and gðtÞ are functions in C½a; bŠ: ðb h f ; gi ¼ f ðtÞgðtÞ dt a It is called the usual inner product on C½a; bŠ. The vector space PðtÞ of all polynomials is a subspace of C½a; bŠ for any interval ½a; bŠ, and hence, the above is also an inner product on PðtÞ.
EXAMPLE 7.3 Consider $f(t) = 3t - 5$ and $g(t) = t^2$ in the polynomial space $P(t)$ with inner product

\[ \langle f, g\rangle = \int_0^1 f(t)g(t)\,dt \]

(a) Find $\langle f, g\rangle$. We have $f(t)g(t) = 3t^3 - 5t^2$. Hence,

\[ \langle f, g\rangle = \int_0^1 (3t^3 - 5t^2)\,dt = \left[\tfrac{3}{4}t^4 - \tfrac{5}{3}t^3\right]_0^1 = \tfrac{3}{4} - \tfrac{5}{3} = -\tfrac{11}{12} \]

(b) Find $\|f\|$ and $\|g\|$. We have $[f(t)]^2 = f(t)f(t) = 9t^2 - 30t + 25$ and $[g(t)]^2 = t^4$. Then

\[ \|f\|^2 = \langle f, f\rangle = \int_0^1 (9t^2 - 30t + 25)\,dt = \left[3t^3 - 15t^2 + 25t\right]_0^1 = 13 \]

\[ \|g\|^2 = \langle g, g\rangle = \int_0^1 t^4\,dt = \left[\tfrac{1}{5}t^5\right]_0^1 = \tfrac{1}{5} \]

Therefore, $\|f\| = \sqrt{13}$ and $\|g\| = \sqrt{\tfrac{1}{5}} = \tfrac{1}{5}\sqrt{5}$.
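The integrals in Example 7.3 can be reproduced exactly. The sketch below assumes a polynomial is stored as a list of coefficients indexed by power (so `[-5, 3]` stands for $3t - 5$); this representation and the helper names are illustrative assumptions, not part of the text.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (index = power of t)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += Fraction(a) * Fraction(b)
    return out

def integrate(p, a, b):
    """Exact integral of the polynomial p over [a, b], term by term."""
    a, b = Fraction(a), Fraction(b)
    return sum(c * (b ** (k + 1) - a ** (k + 1)) / (k + 1) for k, c in enumerate(p))

def inner(p, q, a=0, b=1):
    """<p, q> = integral over [a, b] of p(t) q(t) dt."""
    return integrate(poly_mul(p, q), a, b)

f = [-5, 3]    # f(t) = 3t - 5
g = [0, 0, 1]  # g(t) = t^2

fg = inner(f, g)  # <f, g>
ff = inner(f, f)  # ||f||^2
gg = inner(g, g)  # ||g||^2
```

Using `Fraction` keeps the results as exact rationals, so they can be compared directly with the values obtained by hand in parts (a) and (b).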

Matrix Space M = M_{m,n}
Let $M = M_{m,n}$, the vector space of all real $m \times n$ matrices. An inner product is defined on M by

\[ \langle A, B\rangle = \operatorname{tr}(B^T A) \]

where, as usual, $\operatorname{tr}(\,)$ is the trace, the sum of the diagonal elements. If $A = [a_{ij}]$ and $B = [b_{ij}]$, then

\[ \langle A, B\rangle = \operatorname{tr}(B^T A) = \sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij}b_{ij} \quad\text{and}\quad \|A\|^2 = \langle A, A\rangle = \sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij}^2 \]

That is, $\langle A, B\rangle$ is the sum of the products of the corresponding entries in A and B and, in particular, $\langle A, A\rangle$ is the sum of the squares of the entries of A.

Hilbert Space
Let V be the vector space of all infinite sequences of real numbers $(a_1, a_2, a_3, \ldots)$ satisfying

\[ \sum_{i=1}^{\infty} a_i^2 = a_1^2 + a_2^2 + \cdots < \infty \]

that is, the sum converges. Addition and scalar multiplication are defined in V componentwise; that is, if

\[ u = (a_1, a_2, \ldots) \quad\text{and}\quad v = (b_1, b_2, \ldots) \]

then

\[ u + v = (a_1 + b_1, a_2 + b_2, \ldots) \quad\text{and}\quad ku = (ka_1, ka_2, \ldots) \]

An inner product is defined in V by

\[ \langle u, v\rangle = a_1b_1 + a_2b_2 + \cdots \]

The above sum converges absolutely for any pair of points in V. Hence, the inner product is well defined. This inner product space is called $l_2$-space or Hilbert space.

7.4 Cauchy–Schwarz Inequality, Applications

The following formula (proved in Problem 7.8) is called the Cauchy–Schwarz inequality or Schwarz inequality. It is used in many branches of mathematics.
THEOREM 7.1: (Cauchy–Schwarz) For any vectors u and v in an inner product space V,

\[ \langle u, v\rangle^2 \le \langle u, u\rangle\langle v, v\rangle \quad\text{or}\quad |\langle u, v\rangle| \le \|u\|\,\|v\| \]

Next we examine this inequality in specific cases.
EXAMPLE 7.4
(a) Consider any real numbers $a_1, \ldots, a_n$, $b_1, \ldots, b_n$. Then, by the Cauchy–Schwarz inequality,

\[ (a_1b_1 + a_2b_2 + \cdots + a_nb_n)^2 \le (a_1^2 + \cdots + a_n^2)(b_1^2 + \cdots + b_n^2) \]

That is, $(u \cdot v)^2 \le \|u\|^2\|v\|^2$, where $u = (a_i)$ and $v = (b_i)$.

(b) Let f and g be continuous functions on the unit interval $[0, 1]$. Then, by the Cauchy–Schwarz inequality,

\[ \left(\int_0^1 f(t)g(t)\,dt\right)^2 \le \int_0^1 f^2(t)\,dt \int_0^1 g^2(t)\,dt \]

That is, $(\langle f, g\rangle)^2 \le \|f\|^2\|g\|^2$. Here V is the inner product space $C[0, 1]$.

The next theorem (proved in Problem 7.9) gives the basic properties of a norm. The proof of the third property requires the Cauchy–Schwarz inequality.
THEOREM 7.2: Let V be an inner product space. Then the norm in V satisfies the following properties:
$[N_1]$ $\|v\| \ge 0$; and $\|v\| = 0$ if and only if $v = 0$.
$[N_2]$ $\|kv\| = |k|\,\|v\|$.
$[N_3]$ $\|u + v\| \le \|u\| + \|v\|$.

The property $[N_3]$ is called the triangle inequality, because if we view $u + v$ as the side of the triangle formed with sides u and v (as shown in Fig. 7-1), then $[N_3]$ states that the length of one side of a triangle cannot be greater than the sum of the lengths of the other two sides.

Figure 7-1

Angle Between Vectors
For any nonzero vectors u and v in an inner product space V, the angle between u and v is defined to be the angle $\theta$ such that $0 \le \theta \le \pi$ and

\[ \cos\theta = \frac{\langle u, v\rangle}{\|u\|\,\|v\|} \]

By the Cauchy–Schwarz inequality, $-1 \le \cos\theta \le 1$, and so the angle exists and is unique.
EXAMPLE 7.5
(a) Consider vectors $u = (2, 3, 5)$ and $v = (1, -4, 3)$ in $\mathbf{R}^3$. Then

\[ \langle u, v\rangle = 2 - 12 + 15 = 5, \qquad \|u\| = \sqrt{4 + 9 + 25} = \sqrt{38}, \qquad \|v\| = \sqrt{1 + 16 + 9} = \sqrt{26} \]

Then the angle $\theta$ between u and v is given by

\[ \cos\theta = \frac{5}{\sqrt{38}\sqrt{26}} \]

Note that $\theta$ is an acute angle, because $\cos\theta$ is positive.

(b) Let $f(t) = 3t - 5$ and $g(t) = t^2$ in the polynomial space $P(t)$ with inner product $\langle f, g\rangle = \int_0^1 f(t)g(t)\,dt$. By Example 7.3,

\[ \langle f, g\rangle = -\tfrac{11}{12}, \qquad \|f\| = \sqrt{13}, \qquad \|g\| = \tfrac{1}{5}\sqrt{5} \]

Then the "angle" $\theta$ between f and g is given by

\[ \cos\theta = \frac{-\tfrac{11}{12}}{(\sqrt{13})\left(\tfrac{1}{5}\sqrt{5}\right)} = -\frac{55}{12\sqrt{13}\sqrt{5}} \]

Note that $\theta$ is an obtuse angle, because $\cos\theta$ is negative.
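The angle formula can be evaluated directly for the vectors of Example 7.5(a). This is a minimal sketch for the usual inner product on $\mathbf{R}^n$; the function name `angle` and the clamping step (a guard against floating-point rounding) are assumptions added for illustration.

```python
import math

def inner(u, v):
    """Usual inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def angle(u, v):
    """Angle in radians, between 0 and pi, for nonzero vectors u and v."""
    cos_theta = inner(u, v) / math.sqrt(inner(u, u) * inner(v, v))
    # Cauchy-Schwarz guarantees |cos_theta| <= 1; clamp to absorb rounding error.
    return math.acos(max(-1.0, min(1.0, cos_theta)))

u = (2, 3, 5)
v = (1, -4, 3)
theta = angle(u, v)  # acute, since <u, v> = 5 is positive
```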


7.5 Orthogonality

Let V be an inner product space. The vectors $u, v \in V$ are said to be orthogonal and u is said to be orthogonal to v if

\[ \langle u, v\rangle = 0 \]

The relation is clearly symmetric: if u is orthogonal to v, then $\langle v, u\rangle = 0$, and so v is orthogonal to u. We note that $0 \in V$ is orthogonal to every $v \in V$, because

\[ \langle 0, v\rangle = \langle 0v, v\rangle = 0\langle v, v\rangle = 0 \]

Conversely, if u is orthogonal to every $v \in V$, then $\langle u, u\rangle = 0$ and hence $u = 0$ by $[I_3]$. Observe that u and v are orthogonal if and only if $\cos\theta = 0$, where $\theta$ is the angle between u and v. Also, this is true if and only if u and v are "perpendicular", that is, $\theta = \pi/2$ (or $\theta = 90^\circ$).
EXAMPLE 7.6
(a) Consider the vectors $u = (1, 1, 1)$, $v = (1, 2, -3)$, $w = (1, -4, 3)$ in $\mathbf{R}^3$. Then

\[ \langle u, v\rangle = 1 + 2 - 3 = 0, \qquad \langle u, w\rangle = 1 - 4 + 3 = 0, \qquad \langle v, w\rangle = 1 - 8 - 9 = -16 \]

Thus, u is orthogonal to v and w, but v and w are not orthogonal.

(b) Consider the functions $\sin t$ and $\cos t$ in the vector space $C[-\pi, \pi]$ of continuous functions on the closed interval $[-\pi, \pi]$. Then

\[ \langle \sin t, \cos t\rangle = \int_{-\pi}^{\pi} \sin t\cos t\,dt = \left.\tfrac{1}{2}\sin^2 t\,\right|_{-\pi}^{\pi} = 0 - 0 = 0 \]

Thus, $\sin t$ and $\cos t$ are orthogonal functions in the vector space $C[-\pi, \pi]$.

Remark: A vector $w = (x_1, x_2, \ldots, x_n)$ is orthogonal to $u = (a_1, a_2, \ldots, a_n)$ in $\mathbf{R}^n$ if

\[ \langle u, w\rangle = a_1x_1 + a_2x_2 + \cdots + a_nx_n = 0 \]

That is, w is orthogonal to u if w satisfies a homogeneous equation whose coefficients are the elements of u.
EXAMPLE 7.7 Find a nonzero vector w that is orthogonal to $u_1 = (1, 2, 1)$ and $u_2 = (2, 5, 4)$ in $\mathbf{R}^3$.
Let $w = (x, y, z)$. Then we want $\langle u_1, w\rangle = 0$ and $\langle u_2, w\rangle = 0$. This yields the homogeneous system

\[ \begin{aligned} x + 2y + z &= 0 \\ 2x + 5y + 4z &= 0 \end{aligned} \qquad\text{or}\qquad \begin{aligned} x + 2y + z &= 0 \\ y + 2z &= 0 \end{aligned} \]

Here z is the only free variable in the echelon system. Set $z = 1$ to obtain $y = -2$ and $x = 3$. Thus, $w = (3, -2, 1)$ is a desired nonzero vector orthogonal to $u_1$ and $u_2$. Any multiple of w will also be orthogonal to $u_1$ and $u_2$. Normalizing w, we obtain the following unit vector orthogonal to $u_1$ and $u_2$:

\[ \hat{w} = \frac{w}{\|w\|} = \left(\frac{3}{\sqrt{14}}, -\frac{2}{\sqrt{14}}, \frac{1}{\sqrt{14}}\right) \]
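In $\mathbf{R}^3$ specifically, a vector orthogonal to two given vectors can also be obtained from the cross product. Note that this is a different technique from the elimination used in Example 7.7, and it is offered here only as an independent check; the function names are illustrative.

```python
def inner(u, v):
    """Usual inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def cross(a, b):
    """Cross product in R^3; the result is orthogonal to both a and b."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

u1 = (1, 2, 1)
u2 = (2, 5, 4)
w = cross(u1, u2)  # a vector orthogonal to both u1 and u2
```

For these two vectors the cross product happens to coincide with the solution found by setting $z = 1$; in general it agrees only up to a scalar multiple.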

Orthogonal Complements
Let S be a subset of an inner product space V. The orthogonal complement of S, denoted by $S^\perp$ (read "S perp"), consists of those vectors in V that are orthogonal to every vector $u \in S$; that is,

\[ S^\perp = \{v \in V : \langle v, u\rangle = 0 \text{ for every } u \in S\} \]


In particular, for a given vector u in V, we have

\[ u^\perp = \{v \in V : \langle v, u\rangle = 0\} \]

that is, $u^\perp$ consists of all vectors in V that are orthogonal to the given vector u.

We show that $S^\perp$ is a subspace of V. Clearly $0 \in S^\perp$, because 0 is orthogonal to every vector in V. Now suppose $v, w \in S^\perp$. Then, for any scalars a and b and any vector $u \in S$, we have

\[ \langle av + bw, u\rangle = a\langle v, u\rangle + b\langle w, u\rangle = a \cdot 0 + b \cdot 0 = 0 \]

Thus, $av + bw \in S^\perp$, and therefore $S^\perp$ is a subspace of V. We state this result formally.
PROPOSITION 7.3: Let S be a subset of a vector space V. Then $S^\perp$ is a subspace of V.

Remark 1: Suppose u is a nonzero vector in $\mathbf{R}^3$. Then there is a geometrical description of $u^\perp$. Specifically, $u^\perp$ is the plane in $\mathbf{R}^3$ through the origin O and perpendicular to the vector u. This is shown in Fig. 7-2.

Figure 7-2

Remark 2: Let W be the solution space of an $m \times n$ homogeneous system $AX = 0$, where $A = [a_{ij}]$ and $X = [x_i]$. Recall that W may be viewed as the kernel of the linear mapping $A: \mathbf{R}^n \to \mathbf{R}^m$. Now we can give another interpretation of W using the notion of orthogonality. Specifically, each solution vector $w = (x_1, x_2, \ldots, x_n)$ is orthogonal to each row of A; hence, W is the orthogonal complement of the row space of A.
EXAMPLE 7.8 Find a basis for the subspace $u^\perp$ of $\mathbf{R}^3$, where $u = (1, 3, -4)$.
Note that $u^\perp$ consists of all vectors $w = (x, y, z)$ such that $\langle u, w\rangle = 0$, or $x + 3y - 4z = 0$. The free variables are y and z.
(1) Set $y = 1$, $z = 0$ to obtain the solution $w_1 = (-3, 1, 0)$.
(2) Set $y = 0$, $z = 1$ to obtain the solution $w_2 = (4, 0, 1)$.
The vectors $w_1$ and $w_2$ form a basis for the solution space of the equation, and hence a basis for $u^\perp$.

Suppose W is a subspace of V. Then both W and $W^\perp$ are subspaces of V. The next theorem, whose proof (Problem 7.28) requires results of later sections, is a basic result in linear algebra.

THEOREM 7.4: Let W be a subspace of V. Then V is the direct sum of W and $W^\perp$; that is, $V = W \oplus W^\perp$.


7.6 Orthogonal Sets and Bases

Consider a set $S = \{u_1, u_2, \ldots, u_r\}$ of nonzero vectors in an inner product space V. S is called orthogonal if each pair of vectors in S are orthogonal, and S is called orthonormal if S is orthogonal and each vector in S has unit length. That is,

(i) Orthogonal: $\langle u_i, u_j\rangle = 0$ for $i \ne j$

(ii) Orthonormal: $\langle u_i, u_j\rangle = \begin{cases} 0 & \text{for } i \ne j \\ 1 & \text{for } i = j \end{cases}$

Normalizing an orthogonal set S refers to the process of multiplying each vector in S by the reciprocal of its length in order to transform S into an orthonormal set of vectors. The following theorems apply.
THEOREM 7.5: Suppose S is an orthogonal set of nonzero vectors. Then S is linearly independent.
THEOREM 7.6: (Pythagoras) Suppose $\{u_1, u_2, \ldots, u_r\}$ is an orthogonal set of vectors. Then

\[ \|u_1 + u_2 + \cdots + u_r\|^2 = \|u_1\|^2 + \|u_2\|^2 + \cdots + \|u_r\|^2 \]

These theorems are proved in Problems 7.15 and 7.16, respectively. Here we prove the Pythagorean theorem in the special and familiar case for two vectors. Specifically, suppose $\langle u, v\rangle = 0$. Then

\[ \|u + v\|^2 = \langle u + v, u + v\rangle = \langle u, u\rangle + 2\langle u, v\rangle + \langle v, v\rangle = \langle u, u\rangle + \langle v, v\rangle = \|u\|^2 + \|v\|^2 \]

which gives our result.
EXAMPLE 7.9
(a) Let $E = \{e_1, e_2, e_3\} = \{(1, 0, 0), (0, 1, 0), (0, 0, 1)\}$ be the usual basis of Euclidean space $\mathbf{R}^3$. It is clear that

\[ \langle e_1, e_2\rangle = \langle e_1, e_3\rangle = \langle e_2, e_3\rangle = 0 \quad\text{and}\quad \langle e_1, e_1\rangle = \langle e_2, e_2\rangle = \langle e_3, e_3\rangle = 1 \]

Namely, E is an orthonormal basis of $\mathbf{R}^3$. More generally, the usual basis of $\mathbf{R}^n$ is orthonormal for every n.

(b) Let $V = C[-\pi, \pi]$ be the vector space of continuous functions on the interval $-\pi \le t \le \pi$ with inner product defined by $\langle f, g\rangle = \int_{-\pi}^{\pi} f(t)g(t)\,dt$. Then the following is a classical example of an orthogonal set in V:

\[ \{1, \cos t, \cos 2t, \cos 3t, \ldots, \sin t, \sin 2t, \sin 3t, \ldots\} \]

This orthogonal set plays a fundamental role in the theory of Fourier series.

Orthogonal Basis and Linear Combinations, Fourier Coefficients
Let S consist of the following three vectors in $\mathbf{R}^3$:

\[ u_1 = (1, 2, 1), \qquad u_2 = (2, 1, -4), \qquad u_3 = (3, -2, 1) \]

The reader can verify that the vectors are orthogonal; hence, they are linearly independent. Thus, S is an orthogonal basis of $\mathbf{R}^3$. Suppose we want to write $v = (7, 1, 9)$ as a linear combination of $u_1, u_2, u_3$. First we set v as a linear combination of $u_1, u_2, u_3$ using unknowns $x_1, x_2, x_3$ as follows:

\[ v = x_1u_1 + x_2u_2 + x_3u_3 \quad\text{or}\quad (7, 1, 9) = x_1(1, 2, 1) + x_2(2, 1, -4) + x_3(3, -2, 1) \qquad (*) \]

We can proceed in two ways.
METHOD 1: Expand $(*)$ (as in Chapter 3) to obtain the system

\[ x_1 + 2x_2 + 3x_3 = 7, \qquad 2x_1 + x_2 - 2x_3 = 1, \qquad x_1 - 4x_2 + x_3 = 9 \]

Solve the system by Gaussian elimination to obtain $x_1 = 3$, $x_2 = -1$, $x_3 = 2$. Thus, $v = 3u_1 - u_2 + 2u_3$.

METHOD 2: (This method uses the fact that the basis vectors are orthogonal, and the arithmetic is much simpler.) If we take the inner product of each side of $(*)$ with respect to $u_i$, we get

\[ \langle v, u_i\rangle = \langle x_1u_1 + x_2u_2 + x_3u_3, u_i\rangle \quad\text{or}\quad \langle v, u_i\rangle = x_i\langle u_i, u_i\rangle \quad\text{or}\quad x_i = \frac{\langle v, u_i\rangle}{\langle u_i, u_i\rangle} \]

Here two terms drop out, because $u_1, u_2, u_3$ are orthogonal. Accordingly,

\[ x_1 = \frac{\langle v, u_1\rangle}{\langle u_1, u_1\rangle} = \frac{7 + 2 + 9}{1 + 4 + 1} = \frac{18}{6} = 3, \qquad x_2 = \frac{\langle v, u_2\rangle}{\langle u_2, u_2\rangle} = \frac{14 + 1 - 36}{4 + 1 + 16} = \frac{-21}{21} = -1 \]

\[ x_3 = \frac{\langle v, u_3\rangle}{\langle u_3, u_3\rangle} = \frac{21 - 2 + 9}{9 + 4 + 1} = \frac{28}{14} = 2 \]

Thus, again, we get $v = 3u_1 - u_2 + 2u_3$. The procedure in Method 2 is true in general. Namely, we have the following theorem (proved in Problem 7.17).
THEOREM 7.7: Let $\{u_1, u_2, \ldots, u_n\}$ be an orthogonal basis of V. Then, for any $v \in V$,

\[ v = \frac{\langle v, u_1\rangle}{\langle u_1, u_1\rangle}u_1 + \frac{\langle v, u_2\rangle}{\langle u_2, u_2\rangle}u_2 + \cdots + \frac{\langle v, u_n\rangle}{\langle u_n, u_n\rangle}u_n \]

Remark: The scalar $k_i \equiv \dfrac{\langle v, u_i\rangle}{\langle u_i, u_i\rangle}$ is called the Fourier coefficient of v with respect to $u_i$, because it is analogous to a coefficient in the Fourier series of a function. This scalar also has a geometric interpretation, which is discussed below.
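The Fourier-coefficient formula of Theorem 7.7 translates almost directly into code. The following sketch recomputes the coefficients of $v = (7, 1, 9)$ relative to the orthogonal basis $u_1, u_2, u_3$ used above; the function name `fourier_coefficients` is an illustrative choice, and exact rational arithmetic is an assumption made to avoid rounding.

```python
from fractions import Fraction

def inner(u, v):
    """Usual inner product on R^n, computed exactly."""
    return sum(Fraction(a) * Fraction(b) for a, b in zip(u, v))

def fourier_coefficients(v, basis):
    """Coefficients <v, u_i> / <u_i, u_i>; valid only if `basis` is orthogonal."""
    return [inner(v, u) / inner(u, u) for u in basis]

basis = [(1, 2, 1), (2, 1, -4), (3, -2, 1)]
v = (7, 1, 9)

coeffs = fourier_coefficients(v, basis)

# Rebuild v from the coefficients as a check on the expansion.
rebuilt = tuple(sum(c * u[k] for c, u in zip(coeffs, basis)) for k in range(len(v)))
```

The function does not verify orthogonality itself; for a non-orthogonal basis the formula does not apply and the system must be solved as in Method 1.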

Projections
Let V be an inner product space. Suppose w is a given nonzero vector in V, and suppose v is another vector. We seek the "projection of v along w," which, as indicated in Fig. 7-3(a), will be the multiple $cw$ of w such that $v' = v - cw$ is orthogonal to w. This means

\[ \langle v - cw, w\rangle = 0 \quad\text{or}\quad \langle v, w\rangle - c\langle w, w\rangle = 0 \quad\text{or}\quad c = \frac{\langle v, w\rangle}{\langle w, w\rangle} \]

Figure 7-3

Accordingly, the projection of v along w is denoted and defined by

\[ \operatorname{proj}(v, w) = cw = \frac{\langle v, w\rangle}{\langle w, w\rangle}\,w \]

Such a scalar c is unique, and it is called the Fourier coefficient of v with respect to w or the component of v along w. The above notion is generalized as follows (see Problem 7.25).

THEOREM 7.8: Suppose $w_1, w_2, \ldots, w_r$ form an orthogonal set of nonzero vectors in V. Let v be any vector in V. Define

\[ v' = v - (c_1w_1 + c_2w_2 + \cdots + c_rw_r) \]

where

\[ c_1 = \frac{\langle v, w_1\rangle}{\langle w_1, w_1\rangle}, \quad c_2 = \frac{\langle v, w_2\rangle}{\langle w_2, w_2\rangle}, \quad \ldots, \quad c_r = \frac{\langle v, w_r\rangle}{\langle w_r, w_r\rangle} \]

Then $v'$ is orthogonal to $w_1, w_2, \ldots, w_r$.

Note that each $c_i$ in the above theorem is the component (Fourier coefficient) of v along the given $w_i$.

Remark: The notion of the projection of a vector $v \in V$ along a subspace W of V is defined as follows. By Theorem 7.4, $V = W \oplus W^\perp$. Hence, v may be expressed uniquely in the form

\[ v = w + w', \quad\text{where } w \in W \text{ and } w' \in W^\perp \]

We define w to be the projection of v along W, and denote it by $\operatorname{proj}(v, W)$, as pictured in Fig. 7-3(b). In particular, if $W = \operatorname{span}(w_1, w_2, \ldots, w_r)$, where the $w_i$ form an orthogonal set, then

\[ \operatorname{proj}(v, W) = c_1w_1 + c_2w_2 + \cdots + c_rw_r \]

Here $c_i$ is the component of v along $w_i$, as above.
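The two projection formulas above can be sketched as follows. The vectors `v`, `w1`, and `w2` are hypothetical data chosen for illustration (they do not come from the text), and the function names are likewise assumptions.

```python
from fractions import Fraction

def inner(u, v):
    """Usual inner product on R^n, computed exactly."""
    return sum(Fraction(a) * Fraction(b) for a, b in zip(u, v))

def component(v, w):
    """Component (Fourier coefficient) c = <v, w> / <w, w> of v along nonzero w."""
    return inner(v, w) / inner(w, w)

def proj_vector(v, w):
    """proj(v, w) = c w."""
    c = component(v, w)
    return tuple(c * x for x in w)

def proj_subspace(v, orthogonal_set):
    """proj(v, W) for W spanned by an orthogonal set of nonzero vectors."""
    result = [Fraction(0)] * len(v)
    for w in orthogonal_set:
        c = component(v, w)
        result = [r + c * x for r, x in zip(result, w)]
    return tuple(result)

# Hypothetical example data (not from the text); w1 and w2 are orthogonal.
v = (1, 2, 3)
w1 = (1, 1, 0)
w2 = (1, -1, 0)

p = proj_vector(v, w1)
q = proj_subspace(v, [w1, w2])
residual = tuple(a - b for a, b in zip(v, q))  # v' = v - proj(v, W)
```

As Theorem 7.8 predicts, `residual` is orthogonal to both `w1` and `w2`. The subspace formula relies on the spanning set being orthogonal; for a general spanning set one would orthogonalize first.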

7.7 Gram–Schmidt Orthogonalization Process

Suppose $\{v_1, v_2, \ldots, v_n\}$ is a basis of an inner product space V. One can use this basis to construct an orthogonal basis $\{w_1, w_2, \ldots, w_n\}$ of V as follows. Set

\[ \begin{aligned} w_1 &= v_1 \\ w_2 &= v_2 - \frac{\langle v_2, w_1\rangle}{\langle w_1, w_1\rangle}w_1 \\ w_3 &= v_3 - \frac{\langle v_3, w_1\rangle}{\langle w_1, w_1\rangle}w_1 - \frac{\langle v_3, w_2\rangle}{\langle w_2, w_2\rangle}w_2 \\ &\;\;\vdots \\ w_n &= v_n - \frac{\langle v_n, w_1\rangle}{\langle w_1, w_1\rangle}w_1 - \frac{\langle v_n, w_2\rangle}{\langle w_2, w_2\rangle}w_2 - \cdots - \frac{\langle v_n, w_{n-1}\rangle}{\langle w_{n-1}, w_{n-1}\rangle}w_{n-1} \end{aligned} \]

In other words, for $k = 2, 3, \ldots, n$, we define

\[ w_k = v_k - c_{k1}w_1 - c_{k2}w_2 - \cdots - c_{k,k-1}w_{k-1} \]

where $c_{ki} = \langle v_k, w_i\rangle/\langle w_i, w_i\rangle$ is the component of $v_k$ along $w_i$. By Theorem 7.8, each $w_k$ is orthogonal to the preceding w's. Thus, $w_1, w_2, \ldots, w_n$ form an orthogonal basis for V as claimed. Normalizing each $w_i$ will then yield an orthonormal basis for V.

The above construction is known as the Gram–Schmidt orthogonalization process. The following remarks are in order.

Remark 1: Each vector $w_k$ is a linear combination of $v_k$ and the preceding w's. Hence, one can easily show, by induction, that each $w_k$ is a linear combination of $v_1, v_2, \ldots, v_n$.

Remark 2: Because taking multiples of vectors does not affect orthogonality, it may be simpler in hand calculations to clear fractions in any new $w_k$, by multiplying $w_k$ by an appropriate scalar, before obtaining the next $w_{k+1}$.


Remark 3: Suppose $u_1, u_2, \ldots, u_r$ are linearly independent, and so they form a basis for $U = \operatorname{span}(u_i)$. Applying the Gram–Schmidt orthogonalization process to the u's yields an orthogonal basis for U.

The following theorems (proved in Problems 7.26 and 7.27) use the above algorithm and remarks.
THEOREM 7.9: Let $\{v_1, v_2, \ldots, v_n\}$ be any basis of an inner product space V. Then there exists an orthonormal basis $\{u_1, u_2, \ldots, u_n\}$ of V such that the change-of-basis matrix from $\{v_i\}$ to $\{u_i\}$ is triangular; that is, for $k = 1, \ldots, n$,

\[ u_k = a_{k1}v_1 + a_{k2}v_2 + \cdots + a_{kk}v_k \]

THEOREM 7.10: Suppose $S = \{w_1, w_2, \ldots, w_r\}$ is an orthogonal basis for a subspace W of a vector space V. Then one may extend S to an orthogonal basis for V; that is, one may find vectors $w_{r+1}, \ldots, w_n$ such that $\{w_1, w_2, \ldots, w_n\}$ is an orthogonal basis for V.
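The Gram–Schmidt process described in this section can be sketched in a few lines for the usual inner product on $\mathbf{R}^n$. The code below is a minimal illustration, assuming linearly independent input vectors and exact rational arithmetic; it does not clear fractions as suggested in Remark 2, so its third vector is a scalar multiple of the one a hand calculation would typically keep. The input is the spanning set of Example 7.10.

```python
from fractions import Fraction

def inner(u, v):
    """Usual inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Return an orthogonal list w_1, ..., w_n built from independent v_1, ..., v_n."""
    basis = []
    for v in vectors:
        w = [Fraction(x) for x in v]
        for b in basis:
            c = inner(v, b) / inner(b, b)  # component c_ki of v_k along w_i
            w = [wi - c * bi for wi, bi in zip(w, b)]
        basis.append(w)
    return basis

v1 = (1, 1, 1, 1)
v2 = (1, 2, 4, 5)
v3 = (1, -3, -4, -2)

w1, w2, w3 = gram_schmidt([v1, v2, v3])
w3_cleared = [10 * x for x in w3]  # clear fractions, as in Remark 2
```

If the input vectors were linearly dependent, some `w` would come out as the zero vector and the next division by `inner(b, b)` would fail; a production version would need to detect that case.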

EXAMPLE 7.10 Apply the Gram–Schmidt orthogonalization process to find an orthogonal basis and then an orthonormal basis for the subspace U of $\mathbf{R}^4$ spanned by

\[ v_1 = (1, 1, 1, 1), \qquad v_2 = (1, 2, 4, 5), \qquad v_3 = (1, -3, -4, -2) \]

(1) First set $w_1 = v_1 = (1, 1, 1, 1)$.

(2) Compute

\[ v_2 - \frac{\langle v_2, w_1\rangle}{\langle w_1, w_1\rangle}w_1 = v_2 - \frac{12}{4}w_1 = (-2, -1, 1, 2) \]

Set $w_2 = (-2, -1, 1, 2)$.

(3) Compute

\[ v_3 - \frac{\langle v_3, w_1\rangle}{\langle w_1, w_1\rangle}w_1 - \frac{\langle v_3, w_2\rangle}{\langle w_2, w_2\rangle}w_2 = v_3 - \frac{(-8)}{4}w_1 - \frac{(-7)}{10}w_2 = \left(\tfrac{8}{5}, -\tfrac{17}{10}, -\tfrac{13}{10}, \tfrac{7}{5}\right) \]

Clear fractions to obtain $w_3 = (16, -17, -13, 14)$.

Thus, $w_1, w_2, w_3$ form an orthogonal basis for U. Normalize these vectors to obtain an orthonormal basis $\{u_1, u_2, u_3\}$ of U. We have $\|w_1\|^2 = 4$, $\|w_2\|^2 = 10$, $\|w_3\|^2 = 910$, so

\[ u_1 = \frac{1}{2}(1, 1, 1, 1), \qquad u_2 = \frac{1}{\sqrt{10}}(-2, -1, 1, 2), \qquad u_3 = \frac{1}{\sqrt{910}}(16, -17, -13, 14) \]

EXAMPLE 7.11 Let V be the vector space of polynomials $f(t)$ with inner product $\langle f, g\rangle = \int_{-1}^{1} f(t)g(t)\,dt$. Apply the Gram–Schmidt orthogonalization process to $\{1, t, t^2, t^3\}$ to find an orthogonal basis $\{f_0, f_1, f_2, f_3\}$ with integer coefficients for $P_3(t)$.
Here we use the fact that, for $r + s = n$,

\[ \langle t^r, t^s\rangle = \int_{-1}^{1} t^n\,dt = \left.\frac{t^{n+1}}{n+1}\right|_{-1}^{1} = \begin{cases} 2/(n+1) & \text{when } n \text{ is even} \\ 0 & \text{when } n \text{ is odd} \end{cases} \]

(1) First set $f_0 = 1$.

(2) Compute $t - \dfrac{\langle t, 1\rangle}{\langle 1, 1\rangle}(1) = t - 0 = t$. Set $f_1 = t$.

(3) Compute

\[ t^2 - \frac{\langle t^2, 1\rangle}{\langle 1, 1\rangle}(1) - \frac{\langle t^2, t\rangle}{\langle t, t\rangle}(t) = t^2 - \frac{2/3}{2}(1) + 0(t) = t^2 - \tfrac{1}{3} \]

Multiply by 3 to obtain $f_2 = 3t^2 - 1$.

(4) Compute

\[ t^3 - \frac{\langle t^3, 1\rangle}{\langle 1, 1\rangle}(1) - \frac{\langle t^3, t\rangle}{\langle t, t\rangle}(t) - \frac{\langle t^3, 3t^2 - 1\rangle}{\langle 3t^2 - 1, 3t^2 - 1\rangle}(3t^2 - 1) = t^3 - 0(1) - \frac{2/5}{2/3}(t) - 0(3t^2 - 1) = t^3 - \tfrac{3}{5}t \]

Multiply by 5 to obtain $f_3 = 5t^3 - 3t$.

Thus, $\{1, t, 3t^2 - 1, 5t^3 - 3t\}$ is the required orthogonal basis.
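The computation in Example 7.11 can be mirrored in code by storing each polynomial as a list of coefficients (index = power of t) and using the closed form for $\int_{-1}^{1} t^n\,dt$ quoted above. This is a sketch under those representational assumptions; the function names are illustrative.

```python
from fractions import Fraction

def poly_inner(p, q):
    """<p, q> = integral over [-1, 1] of p(t) q(t) dt for coefficient lists p, q."""
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            n = i + j
            if n % 2 == 0:  # integral of t^n over [-1, 1] is 2/(n+1) for even n, else 0
                total += Fraction(a) * Fraction(b) * Fraction(2, n + 1)
    return total

def poly_gram_schmidt(polys):
    """Gram-Schmidt on polynomials; no rescaling to integer coefficients is done."""
    basis = []
    for p in polys:
        w = [Fraction(c) for c in p]
        for b in basis:
            c = poly_inner(p, b) / poly_inner(b, b)
            size = max(len(w), len(b))
            # Subtract c*b from w coefficient by coefficient, padding with zeros.
            w = [(w[k] if k < len(w) else 0) - c * (b[k] if k < len(b) else 0)
                 for k in range(size)]
        basis.append(w)
    return basis

monomials = [[1], [0, 1], [0, 0, 1], [0, 0, 0, 1]]  # 1, t, t^2, t^3
f0, f1, f2, f3 = poly_gram_schmidt(monomials)

f2_int = [3 * c for c in f2]  # multiply by 3, as in step (3)
f3_int = [5 * c for c in f3]  # multiply by 5, as in step (4)
```

The unscaled outputs are $t^2 - \tfrac{1}{3}$ and $t^3 - \tfrac{3}{5}t$; the final two lines apply the same scalings as the example to recover integer coefficients.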

Remark: Normalizing the polynomials in Example 7.11 so that $p(1) = 1$ yields the polynomials

\[ 1, \qquad t, \qquad \tfrac{1}{2}(3t^2 - 1), \qquad \tfrac{1}{2}(5t^3 - 3t) \]

These are the first four Legendre polynomials, which appear in the study of differential equations.

7.8 Orthogonal and Positive Definite Matrices

This section discusses two types of matrices that are closely related to real inner product spaces V. Here vectors in $\mathbf{R}^n$ will be represented by column vectors. Thus, $\langle u, v\rangle = u^T v$ denotes the inner product in Euclidean space $\mathbf{R}^n$.

Orthogonal Matrices
A real matrix P is orthogonal if P is nonsingular and $P^{-1} = P^T$, or, in other words, if $PP^T = P^TP = I$. First we recall (Theorem 2.6) an important characterization of such matrices.
THEOREM 7.11: Let P be a real matrix. Then the following are equivalent: (a) P is orthogonal; (b) the rows of P form an orthonormal set; (c) the columns of P form an orthonormal set.
(This theorem is true only using the usual inner product on $\mathbf{R}^n$. It is not true if $\mathbf{R}^n$ is given any other inner product.)
EXAMPLE 7.12
(a) Let

\[ P = \begin{bmatrix} 1/\sqrt{3} & 1/\sqrt{3} & 1/\sqrt{3} \\ 0 & 1/\sqrt{2} & -1/\sqrt{2} \\ 2/\sqrt{6} & -1/\sqrt{6} & -1/\sqrt{6} \end{bmatrix} \]

The rows of P are orthogonal to each other and are unit vectors. Thus P is an orthogonal matrix.

(b) Let P be a $2 \times 2$ orthogonal matrix. Then, for some real number $\theta$, we have

\[ P = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} \quad\text{or}\quad P = \begin{bmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{bmatrix} \]
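The defining condition $PP^T = I$ can be tested numerically. The sketch below checks the $3 \times 3$ matrix of Example 7.12(a) (with second row $(0, 1/\sqrt{2}, -1/\sqrt{2})$, the sign required for the rows to be orthogonal) and both $2 \times 2$ forms of part (b) for one arbitrarily chosen angle. The tolerance `tol` and the helper names are assumptions made for floating-point comparison.

```python
import math

def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(A, B):
    """Product of matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def is_orthogonal(P, tol=1e-12):
    """True if the square matrix P satisfies P P^T = I up to the tolerance."""
    n = len(P)
    prod = matmul(P, transpose(P))
    return all(abs(prod[i][j] - (1.0 if i == j else 0.0)) <= tol
               for i in range(n) for j in range(n))

s2, s3, s6 = math.sqrt(2), math.sqrt(3), math.sqrt(6)
P = [[1 / s3, 1 / s3, 1 / s3],
     [0, 1 / s2, -1 / s2],
     [2 / s6, -1 / s6, -1 / s6]]

theta = 0.7  # arbitrary angle chosen for illustration
c, s = math.cos(theta), math.sin(theta)
rotation = [[c, s], [-s, c]]    # first form in part (b)
reflection = [[c, s], [s, -c]]  # second form in part (b)
not_orthogonal = [[1, 1], [0, 1]]
```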

The following two theorems (proved in Problems 7.37 and 7.38) show important relationships between orthogonal matrices and orthonormal bases of a real inner product space V.
THEOREM 7.12: Suppose $E = \{e_i\}$ and $E' = \{e_i'\}$ are orthonormal bases of V. Let P be the change-of-basis matrix from the basis E to the basis $E'$. Then P is orthogonal.
THEOREM 7.13: Let $\{e_1, \ldots, e_n\}$ be an orthonormal basis of an inner product space V. Let $P = [a_{ij}]$ be an orthogonal matrix. Then the following n vectors form an orthonormal basis for V:

\[ e_i' = a_{1i}e_1 + a_{2i}e_2 + \cdots + a_{ni}e_n, \qquad i = 1, 2, \ldots, n \]

Positive Definite Matrices

Let A be a real symmetric matrix; that is, $A^T = A$. Then A is said to be positive definite if, for every nonzero vector u in $\mathbf{R}^n$,

\[ \langle u, Au\rangle = u^T Au > 0 \]

Algorithms to decide whether or not a matrix A is positive definite will be given in Chapter 12. However, for $2 \times 2$ matrices, we have simple criteria that we state formally in the following theorem (proved in Problem 7.43).
THEOREM 7.14: A $2 \times 2$ real symmetric matrix $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} a & b \\ b & d \end{bmatrix}$ is positive definite if and only if the diagonal entries a and d are positive and the determinant $|A| = ad - bc = ad - b^2$ is positive.
EXAMPLE 7.13 Consider the following symmetric matrices:

\[ A = \begin{bmatrix} 1 & 3 \\ 3 & 4 \end{bmatrix}, \qquad B = \begin{bmatrix} 1 & -2 \\ -2 & -3 \end{bmatrix}, \qquad C = \begin{bmatrix} 1 & -2 \\ -2 & 5 \end{bmatrix} \]

A is not positive definite, because $|A| = 4 - 9 = -5$ is negative. B is not positive definite, because the diagonal entry $-3$ is negative. However, C is positive definite, because the diagonal entries 1 and 5 are positive, and the determinant $|C| = 5 - 4 = 1$ is also positive.
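The criterion of Theorem 7.14 is easy to encode. The following sketch applies it to the three matrices of Example 7.13; the function name is an illustrative choice, and the test applies only to $2 \times 2$ real symmetric matrices.

```python
def is_positive_definite_2x2(M):
    """Theorem 7.14 test for a 2 x 2 real symmetric matrix [[a, b], [b, d]]."""
    (a, b), (c, d) = M
    if b != c:
        raise ValueError("matrix must be symmetric")
    # Diagonal entries positive and determinant ad - b^2 positive.
    return a > 0 and d > 0 and a * d - b * b > 0

A = [[1, 3], [3, 4]]
B = [[1, -2], [-2, -3]]
C = [[1, -2], [-2, 5]]

results = [is_positive_definite_2x2(M) for M in (A, B, C)]
```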

The following theorem (proved in Problem 7.44) holds.
THEOREM 7.15: Let A be a real positive definite matrix. Then the function $\langle u, v\rangle = u^T Av$ is an inner product on $\mathbf{R}^n$.

Matrix Representation of an Inner Product (Optional)
Theorem 7.15 says that every positive definite matrix A determines an inner product on $\mathbf{R}^n$. This subsection may be viewed as giving the converse of this result.

Let V be a real inner product space with basis $S = \{u_1, u_2, \ldots, u_n\}$. The matrix

\[ A = [a_{ij}], \quad\text{where } a_{ij} = \langle u_i, u_j\rangle \]

is called the matrix representation of the inner product on V relative to the basis S.

Observe that A is symmetric, because the inner product is symmetric; that is, $\langle u_i, u_j\rangle = \langle u_j, u_i\rangle$. Also, A depends on both the inner product on V and the basis S for V. Moreover, if S is an orthogonal basis, then A is diagonal, and if S is an orthonormal basis, then A is the identity matrix.
EXAMPLE 7.14 The vectors $u_1 = (1, 1, 0)$, $u_2 = (1, 2, 3)$, $u_3 = (1, 3, 5)$ form a basis S for Euclidean space $\mathbf{R}^3$. Find the matrix A that represents the inner product in $\mathbf{R}^3$ relative to this basis S.
First compute each $\langle u_i, u_j\rangle$ to obtain

\[ \begin{aligned} \langle u_1, u_1\rangle &= 1 + 1 + 0 = 2, & \langle u_1, u_2\rangle &= 1 + 2 + 0 = 3, & \langle u_1, u_3\rangle &= 1 + 3 + 0 = 4 \\ \langle u_2, u_2\rangle &= 1 + 4 + 9 = 14, & \langle u_2, u_3\rangle &= 1 + 6 + 15 = 22, & \langle u_3, u_3\rangle &= 1 + 9 + 25 = 35 \end{aligned} \]

Then $A = \begin{bmatrix} 2 & 3 & 4 \\ 3 & 14 & 22 \\ 4 & 22 & 35 \end{bmatrix}$. As expected, A is symmetric.
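The matrix representation of Example 7.14 can be built mechanically from the definition $a_{ij} = \langle u_i, u_j\rangle$. This is a minimal sketch for the usual inner product on $\mathbf{R}^n$; the name `gram_matrix` is a conventional label used here for illustration and does not appear in the text.

```python
def inner(u, v):
    """Usual inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def gram_matrix(basis):
    """Matrix [a_ij] with a_ij = <u_i, u_j> for the given basis vectors."""
    return [[inner(ui, uj) for uj in basis] for ui in basis]

S = [(1, 1, 0), (1, 2, 3), (1, 3, 5)]
A = gram_matrix(S)

# Symmetry check: a_ij == a_ji for all i, j.
symmetric = all(A[i][j] == A[j][i] for i in range(3) for j in range(3))
```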

The following theorems (proved in Problems 7.45 and 7.46, respectively) hold.
THEOREM 7.16: Let A be the matrix representation of an inner product relative to basis S for V. Then, for any vectors $u, v \in V$, we have

\[ \langle u, v\rangle = [u]^T A [v] \]

where $[u]$ and $[v]$ denote the (column) coordinate vectors relative to the basis S.

THEOREM 7.17: Let A be the matrix representation of any inner product on V. Then A is a positive definite matrix.

7.9 Complex Inner Product Spaces

This section considers vector spaces over the complex field $\mathbf{C}$. First we recall some properties of the complex numbers (Section 1.7), especially the relations between a complex number $z = a + bi$, where $a, b \in \mathbf{R}$, and its complex conjugate $\bar{z} = a - bi$:

\[ z\bar{z} = a^2 + b^2, \qquad |z| = \sqrt{a^2 + b^2}, \qquad \overline{z_1 + z_2} = \bar{z}_1 + \bar{z}_2, \qquad \overline{z_1z_2} = \bar{z}_1\bar{z}_2, \qquad \bar{\bar{z}} = z \]

Also, z is real if and only if $\bar{z} = z$. The following definition applies.
DEFINITION: Let V be a vector space over $\mathbf{C}$. Suppose to each pair of vectors $u, v \in V$ there is assigned a complex number, denoted by $\langle u, v\rangle$. This function is called a (complex) inner product on V if it satisfies the following axioms:
$[I_1^*]$ (Linear Property) $\langle au_1 + bu_2, v\rangle = a\langle u_1, v\rangle + b\langle u_2, v\rangle$
$[I_2^*]$ (Conjugate Symmetric Property) $\langle u, v\rangle = \overline{\langle v, u\rangle}$
$[I_3^*]$ (Positive Definite Property) $\langle u, u\rangle \ge 0$; and $\langle u, u\rangle = 0$ if and only if $u = 0$.

The vector space V over $\mathbf{C}$ with an inner product is called a (complex) inner product space.

Observe that a complex inner product differs from the real case only in the second axiom $[I_2^*]$. Axiom $[I_1^*]$ (Linear Property) is equivalent to the two conditions:

\[ \text{(a)}\ \langle u_1 + u_2, v\rangle = \langle u_1, v\rangle + \langle u_2, v\rangle, \qquad \text{(b)}\ \langle ku, v\rangle = k\langle u, v\rangle \]

On the other hand, applying $[I_1^*]$ and $[I_2^*]$, we obtain

\[ \langle u, kv\rangle = \overline{\langle kv, u\rangle} = \overline{k\langle v, u\rangle} = \bar{k}\,\overline{\langle v, u\rangle} = \bar{k}\langle u, v\rangle \]

That is, we must take the conjugate of a complex number when it is taken out of the second position of a complex inner product. In fact (Problem 7.47), the inner product is conjugate linear in the second position; that is,

\[ \langle u, av_1 + bv_2\rangle = \bar{a}\langle u, v_1\rangle + \bar{b}\langle u, v_2\rangle \]

Combining linear in the first position and conjugate linear in the second position, we obtain, by induction,

\[ \left\langle \sum_i a_iu_i, \sum_j b_jv_j \right\rangle = \sum_{i,j} a_i\bar{b}_j\langle u_i, v_j\rangle \]

The following remarks are in order.

Remark 1: Axiom $[I_1^*]$ by itself implies that $\langle 0, 0\rangle = \langle 0v, 0\rangle = 0\langle v, 0\rangle = 0$. Accordingly, $[I_1^*]$, $[I_2^*]$, and $[I_3^*]$ are equivalent to $[I_1^*]$, $[I_2^*]$, and the following axiom:
$[I_3^{*\prime}]$ If $u \ne 0$, then $\langle u, u\rangle > 0$.
That is, a function satisfying $[I_1^*]$, $[I_2^*]$, and $[I_3^{*\prime}]$ is a (complex) inner product on V.

Remark 2: By $[I_2^*]$, $\langle u, u\rangle = \overline{\langle u, u\rangle}$. Thus, $\langle u, u\rangle$ must be real. By $[I_3^*]$, $\langle u, u\rangle$ must be nonnegative, and hence, its nonnegative real square root exists. As with real inner product spaces, we define $\|u\| = \sqrt{\langle u, u\rangle}$ to be the norm or length of u.

Remark 3: In addition to the norm, we define the notions of orthogonality, orthogonal complement, and orthogonal and orthonormal sets as before. In fact, the definitions of distance and Fourier coefficient and projections are the same as in the real case.

EXAMPLE 7.15 (Complex Euclidean Space $\mathbf{C}^n$). Let $V = \mathbf{C}^n$, and let $u = (z_i)$ and $v = (w_i)$ be vectors in $\mathbf{C}^n$. Then

\[ \langle u, v\rangle = \sum_k z_k\bar{w}_k = z_1\bar{w}_1 + z_2\bar{w}_2 + \cdots + z_n\bar{w}_n \]

is an inner product on V, called the usual or standard inner product on $\mathbf{C}^n$. V with this inner product is called Complex Euclidean Space. We assume this inner product on $\mathbf{C}^n$ unless otherwise stated or implied. Assuming u and v are column vectors, the above inner product may be defined by

\[ \langle u, v\rangle = u^T\bar{v} \]

where, as with matrices, $\bar{v}$ means the conjugate of each element of v. If u and v are real, we have $\bar{w}_i = w_i$. In this case, the inner product reduces to the analogous one on $\mathbf{R}^n$.
EXAMPLE 7.16
(a) Let V be the vector space of complex continuous functions on the (real) interval $a \le t \le b$. Then the following is the usual inner product on V:

\[ \langle f, g\rangle = \int_a^b f(t)\overline{g(t)}\,dt \]

(b) Let U be the vector space of $m \times n$ matrices over $\mathbf{C}$. Suppose $A = (z_{ij})$ and $B = (w_{ij})$ are elements of U. Then the following is the usual inner product on U:

\[ \langle A, B\rangle = \operatorname{tr}(B^H A) = \sum_{i=1}^{m}\sum_{j=1}^{n} \bar{w}_{ij}z_{ij} \]

As usual, $B^H = \bar{B}^T$; that is, $B^H$ is the conjugate transpose of B.
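The usual inner product on $\mathbf{C}^n$ can be sketched with Python's built-in complex numbers. The vectors `u`, `v` and the scalar `k` below are hypothetical values chosen for illustration; they are not taken from the text. The sketch conjugates the second argument, matching $\langle u, v\rangle = \sum_k z_k\bar{w}_k$.

```python
def inner(u, v):
    """Usual inner product on C^n: sum of z_k * conj(w_k)."""
    return sum(z * complex(w).conjugate() for z, w in zip(u, v))

# Hypothetical example vectors in C^2 (not from the text).
u = (1 + 1j, 2)
v = (3, 1 - 1j)
k = 2j

uv = inner(u, v)  # <u, v>
vu = inner(v, u)  # <v, u>, expected to be the conjugate of <u, v>
uu = inner(u, u)  # <u, u>, expected to be real and nonnegative

kv = tuple(k * w for w in v)
u_kv = inner(u, kv)  # expected to equal conj(k) * <u, v>
```

The last line illustrates conjugate linearity in the second position: a scalar pulled out of the second argument appears conjugated.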

The following is a list of theorems for complex inner product spaces that are analogous to those for the real case. Here a Hermitian matrix A (i.e., one where $A^H = \bar{A}^T = A$) plays the same role that a symmetric matrix A (i.e., one where $A^T = A$) plays in the real case. (Theorem 7.18 is proved in Problem 7.50.)
THEOREM 7.18: (Cauchy–Schwarz) Let V be a complex inner product space. Then

\[ |\langle u, v\rangle| \le \|u\|\,\|v\| \]

THEOREM 7.19: Let W be a subspace of a complex inner product space V. Then $V = W \oplus W^\perp$.

THEOREM 7.20: Suppose $\{u_1, u_2, \ldots, u_n\}$ is an orthogonal basis for a complex inner product space V. Then, for any $v \in V$,

\[ v = \frac{\langle v, u_1\rangle}{\langle u_1, u_1\rangle}u_1 + \frac{\langle v, u_2\rangle}{\langle u_2, u_2\rangle}u_2 + \cdots + \frac{\langle v, u_n\rangle}{\langle u_n, u_n\rangle}u_n \]

THEOREM 7.21: Suppose $\{u_1, u_2, \ldots, u_n\}$ is a basis for a complex inner product space V. Let $A = [a_{ij}]$ be the complex matrix defined by $a_{ij} = \langle u_i, u_j\rangle$. Then, for any $u, v \in V$,

\[ \langle u, v\rangle = [u]^T A\,\overline{[v]} \]

where $[u]$ and $[v]$ are the coordinate column vectors in the given basis $\{u_i\}$. (Remark: This matrix A is said to represent the inner product on V.)

THEOREM 7.22: Let A be a Hermitian matrix (i.e., $A^H = \bar{A}^T = A$) such that $X^T A\bar{X}$ is real and positive for every nonzero vector $X \in \mathbf{C}^n$. Then $\langle u, v\rangle = u^T A\bar{v}$ is an inner product on $\mathbf{C}^n$.

THEOREM 7.23: Let A be the matrix that represents an inner product on V. Then A is Hermitian, and $X^T A\bar{X}$ is real and positive for any nonzero vector in $\mathbf{C}^n$.

7.10 Normed Vector Spaces (Optional)

We begin with a definition.
DEFINITION: Let V be a real or complex vector space. Suppose to each $v \in V$ there is assigned a real number, denoted by $\|v\|$. This function $\|\cdot\|$ is called a norm on V if it satisfies the following axioms:
$[N_1]$ $\|v\| \ge 0$; and $\|v\| = 0$ if and only if $v = 0$.
$[N_2]$ $\|kv\| = |k|\,\|v\|$.
$[N_3]$ $\|u + v\| \le \|u\| + \|v\|$.

A vector space V with a norm is called a normed vector space.

Suppose V is a normed vector space. The distance between two vectors u and v in V is denoted and defined by

\[ d(u, v) = \|u - v\| \]

The following theorem (proved in Problem 7.56) is the main reason why $d(u, v)$ is called the distance between u and v.
THEOREM 7.24: Let V be a normed vector space. Then the function $d(u, v) = \|u - v\|$ satisfies the following three axioms of a metric space:
$[M_1]$ $d(u, v) \ge 0$; and $d(u, v) = 0$ if and only if $u = v$.
$[M_2]$ $d(u, v) = d(v, u)$.
$[M_3]$ $d(u, v) \le d(u, w) + d(w, v)$.

Normed Vector Spaces and Inner Product Spaces
Suppose V is an inner product space. Recall that the norm of a vector v in V is defined by

\[ \|v\| = \sqrt{\langle v, v\rangle} \]

One can prove (Theorem 7.2) that this norm satisfies $[N_1]$, $[N_2]$, and $[N_3]$. Thus, every inner product space V is a normed vector space. On the other hand, there may be norms on a vector space V that do not come from an inner product on V, as shown below.

Norms on R^n and C^n
The following define three important norms on $\mathbf{R}^n$ and $\mathbf{C}^n$:

\[ \begin{aligned} \|(a_1, \ldots, a_n)\|_\infty &= \max(|a_i|) \\ \|(a_1, \ldots, a_n)\|_1 &= |a_1| + |a_2| + \cdots + |a_n| \\ \|(a_1, \ldots, a_n)\|_2 &= \sqrt{|a_1|^2 + |a_2|^2 + \cdots + |a_n|^2} \end{aligned} \]

(Note that subscripts are used to distinguish between the three norms.) The norms $\|\cdot\|_\infty$, $\|\cdot\|_1$, and $\|\cdot\|_2$ are called the infinity-norm, one-norm, and two-norm, respectively. Observe that $\|\cdot\|_2$ is the norm on $\mathbf{R}^n$ (respectively, $\mathbf{C}^n$) induced by the usual inner product on $\mathbf{R}^n$ (respectively, $\mathbf{C}^n$). We will let $d_\infty$, $d_1$, $d_2$ denote the corresponding distance functions.
EXAMPLE 7.17 Consider vectors $u = (1, -5, 3)$ and $v = (4, 2, -3)$ in $\mathbf{R}^3$.

(a) The infinity norm chooses the maximum of the absolute values of the components. Hence,

\[ \|u\|_\infty = 5 \quad\text{and}\quad \|v\|_\infty = 4 \]

(b) The one-norm adds the absolute values of the components. Thus,

\[ \|u\|_1 = 1 + 5 + 3 = 9 \quad\text{and}\quad \|v\|_1 = 4 + 2 + 3 = 9 \]

(c) The two-norm is equal to the square root of the sum of the squares of the components (i.e., the norm induced by the usual inner product on $\mathbf{R}^3$). Thus,

\[ \|u\|_2 = \sqrt{1 + 25 + 9} = \sqrt{35} \quad\text{and}\quad \|v\|_2 = \sqrt{16 + 4 + 9} = \sqrt{29} \]

(d) Because $u - v = (1 - 4, -5 - 2, 3 + 3) = (-3, -7, 6)$, we have

\[ d_\infty(u, v) = 7, \qquad d_1(u, v) = 3 + 7 + 6 = 16, \qquad d_2(u, v) = \sqrt{9 + 49 + 36} = \sqrt{94} \]

EXAMPLE 7.18 Consider the Cartesian plane $\mathbf{R}^2$ shown in Fig. 7-4.

(a) Let $D_1$ be the set of points $u = (x, y)$ in $\mathbf{R}^2$ such that $\|u\|_2 = 1$. Then $D_1$ consists of the points $(x, y)$ such that $\|u\|_2^2 = x^2 + y^2 = 1$. Thus, $D_1$ is the unit circle, as shown in Fig. 7-4.

Figure 7-4

(b) Let $D_2$ be the set of points $u = (x, y)$ in $\mathbf{R}^2$ such that $\|u\|_1 = 1$. Then $D_2$ consists of the points $(x, y)$ such that $\|u\|_1 = |x| + |y| = 1$. Thus, $D_2$ is the diamond inside the unit circle, as shown in Fig. 7-4.

(c) Let $D_3$ be the set of points $u = (x, y)$ in $\mathbf{R}^2$ such that $\|u\|_\infty = 1$. Then $D_3$ consists of the points $(x, y)$ such that $\|u\|_\infty = \max(|x|, |y|) = 1$. Thus, $D_3$ is the square circumscribing the unit circle, as shown in Fig. 7-4.
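The three norms and their distance functions can be checked against the values of Example 7.17. The sketch below assumes vectors are plain tuples of real or complex numbers; the function names `norm_inf`, `norm_1`, `norm_2`, and `distance` are illustrative choices.

```python
import math

def norm_inf(u):
    """Infinity-norm: largest absolute value among the components."""
    return max(abs(a) for a in u)

def norm_1(u):
    """One-norm: sum of the absolute values of the components."""
    return sum(abs(a) for a in u)

def norm_2(u):
    """Two-norm: square root of the sum of squared absolute values."""
    return math.sqrt(sum(abs(a) ** 2 for a in u))

def distance(u, v, norm):
    """d(u, v) = ||u - v|| for the supplied norm function."""
    return norm(tuple(a - b for a, b in zip(u, v)))

u = (1, -5, 3)
v = (4, 2, -3)

norms_u = (norm_inf(u), norm_1(u), norm_2(u))
norms_v = (norm_inf(v), norm_1(v), norm_2(v))
dists = (distance(u, v, norm_inf), distance(u, v, norm_1), distance(u, v, norm_2))
```

Passing the norm as a function argument keeps `distance` independent of which of the three norms is in use.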

Norms on $C[a, b]$

Consider the vector space $V = C[a, b]$ of real continuous functions on the interval $a \le t \le b$. Recall that the following defines an inner product on $V$:

$\langle f, g\rangle = \int_a^b f(t)g(t)\,dt$

Accordingly, the above inner product defines the following norm on $V = C[a, b]$ (which is analogous to the $\|\cdot\|_2$ norm on $\mathbf{R}^n$):

$\|f\|_2 = \sqrt{\int_a^b [f(t)]^2\,dt}$

The following define the other norms on $V = C[a, b]$:

$\|f\|_1 = \int_a^b |f(t)|\,dt$ and $\|f\|_\infty = \max(|f(t)|)$

There are geometrical descriptions of these two norms and their corresponding distance functions, which are described below. The first norm is pictured in Fig. 7-5. Here

$\|f\|_1$ = area between the function $|f|$ and the $t$-axis
$d_1(f, g)$ = area between the functions $f$ and $g$

Figure 7-5

This norm is analogous to the norm $\|\cdot\|_1$ on $\mathbf{R}^n$. The second norm is pictured in Fig. 7-6. Here

$\|f\|_\infty$ = maximum distance between $f$ and the $t$-axis
$d_\infty(f, g)$ = maximum distance between $f$ and $g$

This norm is analogous to the norm $\|\cdot\|_\infty$ on $\mathbf{R}^n$.

Figure 7-6

SOLVED PROBLEMS

Inner Products

7.1. Expand: (a) $\langle 5u_1 + 8u_2,\ 6v_1 - 7v_2\rangle$, (b) $\langle 3u + 5v,\ 4u - 6v\rangle$, (c) $\|2u - 3v\|^2$.

Use linearity in both positions and, when possible, symmetry, $\langle u, v\rangle = \langle v, u\rangle$.

(a) Take the inner product of each term on the left with each term on the right:

$\langle 5u_1 + 8u_2,\ 6v_1 - 7v_2\rangle = \langle 5u_1, 6v_1\rangle + \langle 5u_1, -7v_2\rangle + \langle 8u_2, 6v_1\rangle + \langle 8u_2, -7v_2\rangle$
$= 30\langle u_1, v_1\rangle - 35\langle u_1, v_2\rangle + 48\langle u_2, v_1\rangle - 56\langle u_2, v_2\rangle$

[Remark: Observe the similarity between the above expansion and the expansion of $(5a + 8b)(6c - 7d)$ in ordinary algebra.]

(b) $\langle 3u + 5v,\ 4u - 6v\rangle = 12\langle u, u\rangle - 18\langle u, v\rangle + 20\langle v, u\rangle - 30\langle v, v\rangle = 12\langle u, u\rangle + 2\langle u, v\rangle - 30\langle v, v\rangle$

(c) $\|2u - 3v\|^2 = \langle 2u - 3v,\ 2u - 3v\rangle = 4\langle u, u\rangle - 6\langle u, v\rangle - 6\langle v, u\rangle + 9\langle v, v\rangle = 4\|u\|^2 - 12\langle u, v\rangle + 9\|v\|^2$

7.2. Consider vectors $u = (1, 2, 4)$, $v = (2, -3, 5)$, $w = (4, 2, -3)$ in $\mathbf{R}^3$. Find (a) $u \cdot v$, (b) $u \cdot w$, (c) $v \cdot w$, (d) $(u + v) \cdot w$, (e) $\|u\|$, (f) $\|v\|$.

(a) Multiply corresponding components and add to get $u \cdot v = 2 - 6 + 20 = 16$.
(b) $u \cdot w = 4 + 4 - 12 = -4$.
(c) $v \cdot w = 8 - 6 - 15 = -13$.
(d) First find $u + v = (3, -1, 9)$. Then $(u + v) \cdot w = 12 - 2 - 27 = -17$. Alternatively, using $[I_1]$, $(u + v) \cdot w = u \cdot w + v \cdot w = -4 - 13 = -17$.
(e) First find $\|u\|^2$ by squaring the components of $u$ and adding: $\|u\|^2 = 1^2 + 2^2 + 4^2 = 1 + 4 + 16 = 21$, and so $\|u\| = \sqrt{21}$.
(f) $\|v\|^2 = 4 + 9 + 25 = 38$, and so $\|v\| = \sqrt{38}$.

7.3. Verify that the following defines an inner product in $\mathbf{R}^2$:

$\langle u, v\rangle = x_1y_1 - x_1y_2 - x_2y_1 + 3x_2y_2$, where $u = (x_1, x_2)$, $v = (y_1, y_2)$

We argue via matrices. We can write $\langle u, v\rangle$ in matrix notation as follows:

$\langle u, v\rangle = u^TAv = [x_1, x_2]\begin{pmatrix} 1 & -1 \\ -1 & 3 \end{pmatrix}\begin{pmatrix} y_1 \\ y_2 \end{pmatrix}$

Because $A$ is real and symmetric, we need only show that $A$ is positive definite. The diagonal elements 1 and 3 are positive, and the determinant $|A| = 3 - 1 = 2$ is positive. Thus, by Theorem 7.14, $A$ is positive definite. Accordingly, by Theorem 7.15, $\langle u, v\rangle$ is an inner product.

7.4. Consider the vectors $u = (1, 5)$ and $v = (3, 4)$ in $\mathbf{R}^2$. Find
(a) $\langle u, v\rangle$ with respect to the usual inner product in $\mathbf{R}^2$.
(b) $\langle u, v\rangle$ with respect to the inner product in $\mathbf{R}^2$ in Problem 7.3.
(c) $\|v\|$ using the usual inner product in $\mathbf{R}^2$.
(d) $\|v\|$ using the inner product in $\mathbf{R}^2$ in Problem 7.3.

(a) $\langle u, v\rangle = 3 + 20 = 23$.
(b) $\langle u, v\rangle = 1 \cdot 3 - 1 \cdot 4 - 5 \cdot 3 + 3 \cdot 5 \cdot 4 = 3 - 4 - 15 + 60 = 44$.
(c) $\|v\|^2 = \langle v, v\rangle = \langle (3, 4), (3, 4)\rangle = 9 + 16 = 25$; hence, $\|v\| = 5$.
(d) $\|v\|^2 = \langle v, v\rangle = \langle (3, 4), (3, 4)\rangle = 9 - 12 - 12 + 48 = 33$; hence, $\|v\| = \sqrt{33}$.
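Both inner products in Problem 7.4 have the form $u^TAv$, with $A = I$ for the usual inner product and $A$ the matrix of Problem 7.3 otherwise. A small sketch, assuming matrices are stored as lists of rows (the helper name `inner` is hypothetical):

```python
def inner(u, v, A):
    """Return u^T A v, where A is a square matrix given as a list of rows."""
    n = len(u)
    return sum(u[i] * A[i][j] * v[j] for i in range(n) for j in range(n))

I2 = [[1, 0], [0, 1]]     # matrix of the usual inner product on R^2
A = [[1, -1], [-1, 3]]    # matrix of the inner product in Problem 7.3

u, v = (1, 5), (3, 4)

print(inner(u, v, I2))    # part (a)
print(inner(u, v, A))     # part (b)
print(inner(v, v, I2))    # ||v||^2 in part (c)
print(inner(v, v, A))     # ||v||^2 in part (d)
```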

7.5. Consider the following polynomials in $P(t)$ with the inner product $\langle f, g\rangle = \int_0^1 f(t)g(t)\,dt$:

$f(t) = t + 2$, $g(t) = 3t - 2$, $h(t) = t^2 - 2t - 3$

(a) Find $\langle f, g\rangle$ and $\langle f, h\rangle$. (b) Find $\|f\|$ and $\|g\|$. (c) Normalize $f$ and $g$.

(a) Integrate as follows:

$\langle f, g\rangle = \int_0^1 (t + 2)(3t - 2)\,dt = \int_0^1 (3t^2 + 4t - 4)\,dt = \left[t^3 + 2t^2 - 4t\right]_0^1 = -1$

$\langle f, h\rangle = \int_0^1 (t + 2)(t^2 - 2t - 3)\,dt = \left[\frac{t^4}{4} - \frac{7t^2}{2} - 6t\right]_0^1 = -\frac{37}{4}$

(b) $\langle f, f\rangle = \int_0^1 (t + 2)(t + 2)\,dt = \frac{19}{3}$; hence, $\|f\| = \sqrt{\frac{19}{3}} = \frac{1}{3}\sqrt{57}$

$\langle g, g\rangle = \int_0^1 (3t - 2)(3t - 2)\,dt = 1$; hence, $\|g\| = \sqrt{1} = 1$

(c) Because $\|f\| = \frac{1}{3}\sqrt{57}$ and $g$ is already a unit vector, we have

$\hat{f} = \frac{1}{\|f\|} f = \frac{3}{\sqrt{57}}(t + 2)$ and $\hat{g} = g = 3t - 2$
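The integrals in Problem 7.5 can be checked exactly, because $\int_0^1 t^k\,dt = 1/(k+1)$. The sketch below assumes polynomials are stored as coefficient lists in increasing degree; the helpers `poly_mul` and `inner` are illustrative names, not part of the text.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def inner(p, q):
    """Integral of p(t)q(t) over [0, 1]; the term c*t^k integrates to c/(k+1)."""
    return sum(Fraction(c, k + 1) for k, c in enumerate(poly_mul(p, q)))

f = [2, 1]        # t + 2
g = [-2, 3]       # 3t - 2
h = [-3, -2, 1]   # t^2 - 2t - 3

print(inner(f, g), inner(f, h))   # the inner products of part (a)
print(inner(f, f), inner(g, g))   # the squared norms of part (b)
```

Exact rational arithmetic avoids rounding, so the results can be compared directly with the fractions obtained by hand.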

7.6. Find $\cos\theta$ where $\theta$ is the angle between:
(a) $u = (1, 3, -5, 4)$ and $v = (2, -3, 4, 1)$ in $\mathbf{R}^4$,
(b) $A = \begin{pmatrix} 9 & 8 & 7 \\ 6 & 5 & 4 \end{pmatrix}$ and $B = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}$, where $\langle A, B\rangle = \operatorname{tr}(B^TA)$.

Use $\cos\theta = \dfrac{\langle u, v\rangle}{\|u\|\,\|v\|}$.

(a) Compute: $\langle u, v\rangle = 2 - 9 - 20 + 4 = -23$, $\|u\|^2 = 1 + 9 + 25 + 16 = 51$, $\|v\|^2 = 4 + 9 + 16 + 1 = 30$. Thus,

$\cos\theta = \dfrac{-23}{\sqrt{51}\sqrt{30}} = \dfrac{-23}{3\sqrt{170}}$

(b) Use $\langle A, B\rangle = \operatorname{tr}(B^TA) = \sum_{i=1}^m \sum_{j=1}^n a_{ij}b_{ij}$, the sum of the products of corresponding entries:

$\langle A, B\rangle = 9 + 16 + 21 + 24 + 25 + 24 = 119$

Use $\|A\|^2 = \langle A, A\rangle = \sum_{i=1}^m \sum_{j=1}^n a_{ij}^2$, the sum of the squares of all the elements of $A$:

$\|A\|^2 = \langle A, A\rangle = 9^2 + 8^2 + 7^2 + 6^2 + 5^2 + 4^2 = 271$, and so $\|A\| = \sqrt{271}$
$\|B\|^2 = \langle B, B\rangle = 1^2 + 2^2 + 3^2 + 4^2 + 5^2 + 6^2 = 91$, and so $\|B\| = \sqrt{91}$

Thus, $\cos\theta = \dfrac{119}{\sqrt{271}\sqrt{91}}$
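Because $\operatorname{tr}(B^TA)$ is the sum of the products of corresponding entries, the matrix case reduces to the vector case once each matrix is flattened into a list. A short sketch under that observation (`cos_angle` and `flatten` are hypothetical helper names):

```python
import math

def cos_angle(x, y):
    """cos(theta) = <x, y> / (||x|| ||y||) for the usual inner product."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

def flatten(M):
    """List the entries of a matrix row by row."""
    return [entry for row in M for entry in row]

u = (1, 3, -5, 4)
v = (2, -3, 4, 1)
A = [[9, 8, 7], [6, 5, 4]]
B = [[1, 2, 3], [4, 5, 6]]

print(cos_angle(u, v))                     # part (a)
print(cos_angle(flatten(A), flatten(B)))   # part (b), using tr(B^T A)
```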

7.7. Verify each of the following:
(a) Parallelogram Law (Fig. 7-7): $\|u + v\|^2 + \|u - v\|^2 = 2\|u\|^2 + 2\|v\|^2$.
(b) Polar form for $\langle u, v\rangle$ (which shows the inner product can be obtained from the norm function): $\langle u, v\rangle = \frac{1}{4}(\|u + v\|^2 - \|u - v\|^2)$.

Expand as follows to obtain

$\|u + v\|^2 = \langle u + v, u + v\rangle = \|u\|^2 + 2\langle u, v\rangle + \|v\|^2$   (1)
$\|u - v\|^2 = \langle u - v, u - v\rangle = \|u\|^2 - 2\langle u, v\rangle + \|v\|^2$   (2)

Add (1) and (2) to get the Parallelogram Law (a). Subtract (2) from (1) to obtain

$\|u + v\|^2 - \|u - v\|^2 = 4\langle u, v\rangle$

Divide by 4 to obtain the (real) polar form (b).

Figure 7-7

7.8. Prove Theorem 7.1 (Cauchy–Schwarz): For $u$ and $v$ in a real inner product space $V$, $\langle u, v\rangle^2 \le \langle u, u\rangle\langle v, v\rangle$ or $|\langle u, v\rangle| \le \|u\|\,\|v\|$.

For any real number $t$,

$\langle tu + v, tu + v\rangle = t^2\langle u, u\rangle + 2t\langle u, v\rangle + \langle v, v\rangle = t^2\|u\|^2 + 2t\langle u, v\rangle + \|v\|^2$

Let $a = \|u\|^2$, $b = 2\langle u, v\rangle$, $c = \|v\|^2$. Because $\|tu + v\|^2 \ge 0$, we have $at^2 + bt + c \ge 0$ for every value of $t$. This means that the quadratic polynomial cannot have two distinct real roots, which implies that $b^2 - 4ac \le 0$ or $b^2 \le 4ac$. Thus,

$4\langle u, v\rangle^2 \le 4\|u\|^2\|v\|^2$

Dividing by 4 gives our result.

7.9. Prove Theorem 7.2: The norm in an inner product space $V$ satisfies
(a) $[N_1]$ $\|v\| \ge 0$; and $\|v\| = 0$ if and only if $v = 0$.
(b) $[N_2]$ $\|kv\| = |k|\,\|v\|$.
(c) $[N_3]$ $\|u + v\| \le \|u\| + \|v\|$.

(a) If $v \ne 0$, then $\langle v, v\rangle > 0$, and hence, $\|v\| = \sqrt{\langle v, v\rangle} > 0$. If $v = 0$, then $\langle 0, 0\rangle = 0$. Consequently, $\|0\| = \sqrt{0} = 0$. Thus, $[N_1]$ is true.
(b) We have $\|kv\|^2 = \langle kv, kv\rangle = k^2\langle v, v\rangle = k^2\|v\|^2$. Taking the square root of both sides gives $[N_2]$.
(c) Using the Cauchy–Schwarz inequality, we obtain

$\|u + v\|^2 = \langle u + v, u + v\rangle = \langle u, u\rangle + \langle u, v\rangle + \langle u, v\rangle + \langle v, v\rangle \le \|u\|^2 + 2\|u\|\,\|v\| + \|v\|^2 = (\|u\| + \|v\|)^2$

Taking the square root of both sides yields $[N_3]$.

Orthogonality, Orthogonal Complements, Orthogonal Sets

7.10. Find $k$ so that $u = (1, 2, k, 3)$ and $v = (3, k, 7, -5)$ in $\mathbf{R}^4$ are orthogonal.

First find

$\langle u, v\rangle = (1, 2, k, 3) \cdot (3, k, 7, -5) = 3 + 2k + 7k - 15 = 9k - 12$

Then set $\langle u, v\rangle = 9k - 12 = 0$ to obtain $k = \frac{4}{3}$.

7.11. Let $W$ be the subspace of $\mathbf{R}^5$ spanned by $u = (1, 2, 3, -1, 2)$ and $v = (2, 4, 7, 2, -1)$. Find a basis of the orthogonal complement $W^\perp$ of $W$.

We seek all vectors $w = (x, y, z, s, t)$ such that

$\langle w, u\rangle = x + 2y + 3z - s + 2t = 0$
$\langle w, v\rangle = 2x + 4y + 7z + 2s - t = 0$

Eliminating $x$ from the second equation, we find the equivalent system

$x + 2y + 3z - s + 2t = 0$
$z + 4s - 5t = 0$

The free variables are $y$, $s$, and $t$. Therefore,
(1) Set $y = -1$, $s = 0$, $t = 0$ to obtain the solution $w_1 = (2, -1, 0, 0, 0)$.
(2) Set $y = 0$, $s = 1$, $t = 0$ to find the solution $w_2 = (13, 0, -4, 1, 0)$.
(3) Set $y = 0$, $s = 0$, $t = 1$ to obtain the solution $w_3 = (-17, 0, 5, 0, 1)$.
The set $\{w_1, w_2, w_3\}$ is a basis of $W^\perp$.


7.12. Let $w = (1, 2, 3, 1)$ be a vector in $\mathbf{R}^4$. Find an orthogonal basis for $w^\perp$.

Find a nonzero solution of $x + 2y + 3z + t = 0$, say $v_1 = (0, 0, 1, -3)$. Now find a nonzero solution of the system

$x + 2y + 3z + t = 0$, $z - 3t = 0$

say $v_2 = (0, -5, 3, 1)$. Last, find a nonzero solution of the system

$x + 2y + 3z + t = 0$, $-5y + 3z + t = 0$, $z - 3t = 0$

say $v_3 = (-14, 2, 3, 1)$. Thus, $v_1$, $v_2$, $v_3$ form an orthogonal basis for $w^\perp$.

7.13. Let $S$ consist of the following vectors in $\mathbf{R}^4$:

$u_1 = (1, 1, 0, -1)$, $u_2 = (1, 2, 1, 3)$, $u_3 = (1, 1, -9, 2)$, $u_4 = (16, -13, 1, 3)$

(a) Show that $S$ is orthogonal and a basis of $\mathbf{R}^4$.
(b) Find the coordinates of an arbitrary vector $v = (a, b, c, d)$ in $\mathbf{R}^4$ relative to the basis $S$.

(a) Compute

$u_1 \cdot u_2 = 1 + 2 + 0 - 3 = 0$, $u_1 \cdot u_3 = 1 + 1 + 0 - 2 = 0$, $u_1 \cdot u_4 = 16 - 13 + 0 - 3 = 0$
$u_2 \cdot u_3 = 1 + 2 - 9 + 6 = 0$, $u_2 \cdot u_4 = 16 - 26 + 1 + 9 = 0$, $u_3 \cdot u_4 = 16 - 13 - 9 + 6 = 0$

Thus, $S$ is orthogonal, and $S$ is linearly independent. Accordingly, $S$ is a basis for $\mathbf{R}^4$ because any four linearly independent vectors form a basis of $\mathbf{R}^4$.

(b) Because $S$ is orthogonal, we need only find the Fourier coefficients of $v$ with respect to the basis vectors, as in Theorem 7.7. Thus,

$k_1 = \dfrac{\langle v, u_1\rangle}{\langle u_1, u_1\rangle} = \dfrac{a + b - d}{3}$, $k_2 = \dfrac{\langle v, u_2\rangle}{\langle u_2, u_2\rangle} = \dfrac{a + 2b + c + 3d}{15}$
$k_3 = \dfrac{\langle v, u_3\rangle}{\langle u_3, u_3\rangle} = \dfrac{a + b - 9c + 2d}{87}$, $k_4 = \dfrac{\langle v, u_4\rangle}{\langle u_4, u_4\rangle} = \dfrac{16a - 13b + c + 3d}{435}$

are the coordinates of $v$ with respect to the basis $S$.
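The coordinate formulas of Problem 7.13 can be tested on a sample vector: compute the Fourier coefficients and confirm that they rebuild the vector. This is a sketch with exact fractions; the sample $v = (1, 2, 3, 4)$ and the helper names `dot` and `coordinates` are choices made for illustration only.

```python
from fractions import Fraction

def dot(x, y):
    """Usual inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def coordinates(v, basis):
    """Fourier coefficients <v, u>/<u, u> of v relative to an orthogonal basis."""
    return [Fraction(dot(v, u), dot(u, u)) for u in basis]

S = [(1, 1, 0, -1), (1, 2, 1, 3), (1, 1, -9, 2), (16, -13, 1, 3)]
v = (1, 2, 3, 4)   # arbitrary sample vector (an assumption for the demo)

coeffs = coordinates(v, S)
# Rebuild v as k1*u1 + k2*u2 + k3*u3 + k4*u4, component by component.
rebuilt = [sum(k * u[i] for k, u in zip(coeffs, S)) for i in range(4)]

print(coeffs)
print(rebuilt)
```

The formula applies only because $S$ is orthogonal; for a general basis one would have to solve a linear system instead.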

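7.14. Suppose $S$, $S_1$, $S_2$ are subsets of $V$. Prove the following:
(a) $S \subseteq S^{\perp\perp}$.
(b) If $S_1 \subseteq S_2$, then $S_2^\perp \subseteq S_1^\perp$.
(c) $S^\perp = \operatorname{span}(S)^\perp$.

(a) Let $w \in S$. Then $\langle w, v\rangle = 0$ for every $v \in S^\perp$; hence, $w \in S^{\perp\perp}$. Accordingly, $S \subseteq S^{\perp\perp}$.
(b) Let $w \in S_2^\perp$. Then $\langle w, v\rangle = 0$ for every $v \in S_2$. Because $S_1 \subseteq S_2$, $\langle w, v\rangle = 0$ for every $v \in S_1$. Thus, $w \in S_1^\perp$, and hence, $S_2^\perp \subseteq S_1^\perp$.
(c) Because $S \subseteq \operatorname{span}(S)$, part (b) gives us $\operatorname{span}(S)^\perp \subseteq S^\perp$. Suppose $u \in S^\perp$ and $v \in \operatorname{span}(S)$. Then there exist $w_1, w_2, \ldots, w_k$ in $S$ such that $v = a_1w_1 + a_2w_2 + \cdots + a_kw_k$. Then, using $u \in S^\perp$, we have

$\langle u, v\rangle = \langle u, a_1w_1 + a_2w_2 + \cdots + a_kw_k\rangle = a_1\langle u, w_1\rangle + a_2\langle u, w_2\rangle + \cdots + a_k\langle u, w_k\rangle = a_1(0) + a_2(0) + \cdots + a_k(0) = 0$

Thus, $u \in \operatorname{span}(S)^\perp$. Accordingly, $S^\perp \subseteq \operatorname{span}(S)^\perp$. Both inclusions give $S^\perp = \operatorname{span}(S)^\perp$.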

7.15. Prove Theorem 7.5: Suppose $S$ is an orthogonal set of nonzero vectors. Then $S$ is linearly independent.

Suppose $S = \{u_1, u_2, \ldots, u_r\}$ and suppose

$a_1u_1 + a_2u_2 + \cdots + a_ru_r = 0$   (1)

Taking the inner product of (1) with $u_1$, we get

$0 = \langle 0, u_1\rangle = \langle a_1u_1 + a_2u_2 + \cdots + a_ru_r, u_1\rangle = a_1\langle u_1, u_1\rangle + a_2\langle u_2, u_1\rangle + \cdots + a_r\langle u_r, u_1\rangle = a_1\langle u_1, u_1\rangle + a_2 \cdot 0 + \cdots + a_r \cdot 0 = a_1\langle u_1, u_1\rangle$

Because $u_1 \ne 0$, we have $\langle u_1, u_1\rangle \ne 0$. Thus, $a_1 = 0$. Similarly, for $i = 2, \ldots, r$, taking the inner product of (1) with $u_i$,

$0 = \langle 0, u_i\rangle = \langle a_1u_1 + \cdots + a_ru_r, u_i\rangle = a_1\langle u_1, u_i\rangle + \cdots + a_i\langle u_i, u_i\rangle + \cdots + a_r\langle u_r, u_i\rangle = a_i\langle u_i, u_i\rangle$

But $\langle u_i, u_i\rangle \ne 0$, and hence, every $a_i = 0$. Thus, $S$ is linearly independent.

7.16. Prove Theorem 7.6 (Pythagoras): Suppose $\{u_1, u_2, \ldots, u_r\}$ is an orthogonal set of vectors. Then

$\|u_1 + u_2 + \cdots + u_r\|^2 = \|u_1\|^2 + \|u_2\|^2 + \cdots + \|u_r\|^2$

Expanding the inner product, we have

$\|u_1 + u_2 + \cdots + u_r\|^2 = \langle u_1 + u_2 + \cdots + u_r,\ u_1 + u_2 + \cdots + u_r\rangle = \langle u_1, u_1\rangle + \langle u_2, u_2\rangle + \cdots + \langle u_r, u_r\rangle + \sum_{i \ne j} \langle u_i, u_j\rangle$

The theorem follows from the fact that $\langle u_i, u_i\rangle = \|u_i\|^2$ and $\langle u_i, u_j\rangle = 0$ for $i \ne j$.

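7.17. Prove Theorem 7.7: Let $\{u_1, u_2, \ldots, u_n\}$ be an orthogonal basis of $V$. Then for any $v \in V$,

$v = \dfrac{\langle v, u_1\rangle}{\langle u_1, u_1\rangle}u_1 + \dfrac{\langle v, u_2\rangle}{\langle u_2, u_2\rangle}u_2 + \cdots + \dfrac{\langle v, u_n\rangle}{\langle u_n, u_n\rangle}u_n$

Suppose $v = k_1u_1 + k_2u_2 + \cdots + k_nu_n$. Taking the inner product of both sides with $u_1$ yields

$\langle v, u_1\rangle = \langle k_1u_1 + k_2u_2 + \cdots + k_nu_n, u_1\rangle = k_1\langle u_1, u_1\rangle + k_2\langle u_2, u_1\rangle + \cdots + k_n\langle u_n, u_1\rangle = k_1\langle u_1, u_1\rangle + k_2 \cdot 0 + \cdots + k_n \cdot 0 = k_1\langle u_1, u_1\rangle$

Thus, $k_1 = \dfrac{\langle v, u_1\rangle}{\langle u_1, u_1\rangle}$. Similarly, for $i = 2, \ldots, n$,

$\langle v, u_i\rangle = \langle k_1u_1 + k_2u_2 + \cdots + k_nu_n, u_i\rangle = k_1\langle u_1, u_i\rangle + k_2\langle u_2, u_i\rangle + \cdots + k_n\langle u_n, u_i\rangle = k_1 \cdot 0 + \cdots + k_i\langle u_i, u_i\rangle + \cdots + k_n \cdot 0 = k_i\langle u_i, u_i\rangle$

Thus, $k_i = \dfrac{\langle v, u_i\rangle}{\langle u_i, u_i\rangle}$. Substituting for $k_i$ in the equation $v = k_1u_1 + \cdots + k_nu_n$, we obtain the desired result.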

7.18. Suppose $E = \{e_1, e_2, \ldots, e_n\}$ is an orthonormal basis of $V$. Prove
(a) For any $u \in V$, we have $u = \langle u, e_1\rangle e_1 + \langle u, e_2\rangle e_2 + \cdots + \langle u, e_n\rangle e_n$.
(b) $\langle a_1e_1 + \cdots + a_ne_n,\ b_1e_1 + \cdots + b_ne_n\rangle = a_1b_1 + a_2b_2 + \cdots + a_nb_n$.
(c) For any $u, v \in V$, we have $\langle u, v\rangle = \langle u, e_1\rangle\langle v, e_1\rangle + \cdots + \langle u, e_n\rangle\langle v, e_n\rangle$.

(a) Suppose $u = k_1e_1 + k_2e_2 + \cdots + k_ne_n$. Taking the inner product of $u$ with $e_1$,

$\langle u, e_1\rangle = \langle k_1e_1 + k_2e_2 + \cdots + k_ne_n, e_1\rangle = k_1\langle e_1, e_1\rangle + k_2\langle e_2, e_1\rangle + \cdots + k_n\langle e_n, e_1\rangle = k_1(1) + k_2(0) + \cdots + k_n(0) = k_1$

Similarly, for $i = 2, \ldots, n$,

$\langle u, e_i\rangle = \langle k_1e_1 + \cdots + k_ie_i + \cdots + k_ne_n, e_i\rangle = k_1\langle e_1, e_i\rangle + \cdots + k_i\langle e_i, e_i\rangle + \cdots + k_n\langle e_n, e_i\rangle = k_1(0) + \cdots + k_i(1) + \cdots + k_n(0) = k_i$

Substituting $\langle u, e_i\rangle$ for $k_i$ in the equation $u = k_1e_1 + \cdots + k_ne_n$, we obtain the desired result.

(b) We have

$\left\langle \sum_{i=1}^n a_ie_i,\ \sum_{j=1}^n b_je_j \right\rangle = \sum_{i,j=1}^n a_ib_j\langle e_i, e_j\rangle = \sum_{i=1}^n a_ib_i\langle e_i, e_i\rangle + \sum_{i \ne j} a_ib_j\langle e_i, e_j\rangle$

But $\langle e_i, e_j\rangle = 0$ for $i \ne j$, and $\langle e_i, e_j\rangle = 1$ for $i = j$. Hence, as required,

$\left\langle \sum_{i=1}^n a_ie_i,\ \sum_{j=1}^n b_je_j \right\rangle = \sum_{i=1}^n a_ib_i = a_1b_1 + a_2b_2 + \cdots + a_nb_n$

(c) By part (a), we have

$u = \langle u, e_1\rangle e_1 + \cdots + \langle u, e_n\rangle e_n$ and $v = \langle v, e_1\rangle e_1 + \cdots + \langle v, e_n\rangle e_n$

Thus, by part (b),

$\langle u, v\rangle = \langle u, e_1\rangle\langle v, e_1\rangle + \langle u, e_2\rangle\langle v, e_2\rangle + \cdots + \langle u, e_n\rangle\langle v, e_n\rangle$

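Projections, Gram–Schmidt Algorithm, Applications

7.19. Suppose $w \ne 0$. Let $v$ be any vector in $V$. Show that

$c = \dfrac{\langle v, w\rangle}{\langle w, w\rangle} = \dfrac{\langle v, w\rangle}{\|w\|^2}$

is the unique scalar such that $v' = v - cw$ is orthogonal to $w$.

In order for $v'$ to be orthogonal to $w$ we must have

$\langle v - cw, w\rangle = 0$ or $\langle v, w\rangle - c\langle w, w\rangle = 0$ or $\langle v, w\rangle = c\langle w, w\rangle$

Thus, $c = \dfrac{\langle v, w\rangle}{\langle w, w\rangle}$. Conversely, suppose $c = \dfrac{\langle v, w\rangle}{\langle w, w\rangle}$. Then

$\langle v - cw, w\rangle = \langle v, w\rangle - c\langle w, w\rangle = \langle v, w\rangle - \dfrac{\langle v, w\rangle}{\langle w, w\rangle}\langle w, w\rangle = 0$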

7.20. Find the Fourier coefficient $c$ and the projection of $v = (1, -2, 3, -4)$ along $w = (1, 2, 1, 2)$ in $\mathbf{R}^4$.

Compute $\langle v, w\rangle = 1 - 4 + 3 - 8 = -8$ and $\|w\|^2 = 1 + 4 + 1 + 4 = 10$. Then

$c = -\frac{8}{10} = -\frac{4}{5}$ and $\operatorname{proj}(v, w) = cw = \left(-\frac{4}{5}, -\frac{8}{5}, -\frac{4}{5}, -\frac{8}{5}\right)$

7.21. Consider the subspace $U$ of $\mathbf{R}^4$ spanned by the vectors:

$v_1 = (1, 1, 1, 1)$, $v_2 = (1, 1, 2, 4)$, $v_3 = (1, 2, -4, -3)$

Find (a) an orthogonal basis of $U$; (b) an orthonormal basis of $U$.

(a) Use the Gram–Schmidt algorithm. Begin by setting $w_1 = v_1 = (1, 1, 1, 1)$. Next find

$v_2 - \dfrac{\langle v_2, w_1\rangle}{\langle w_1, w_1\rangle}w_1 = (1, 1, 2, 4) - \dfrac{8}{4}(1, 1, 1, 1) = (-1, -1, 0, 2)$

Set $w_2 = (-1, -1, 0, 2)$. Then find

$v_3 - \dfrac{\langle v_3, w_1\rangle}{\langle w_1, w_1\rangle}w_1 - \dfrac{\langle v_3, w_2\rangle}{\langle w_2, w_2\rangle}w_2 = (1, 2, -4, -3) - \dfrac{(-4)}{4}(1, 1, 1, 1) - \dfrac{(-9)}{6}(-1, -1, 0, 2) = \left(\frac{1}{2}, \frac{3}{2}, -3, 1\right)$

Clear fractions to obtain $w_3 = (1, 3, -6, 2)$. Then $w_1$, $w_2$, $w_3$ form an orthogonal basis of $U$.

(b) Normalize the orthogonal basis consisting of $w_1$, $w_2$, $w_3$. Because $\|w_1\|^2 = 4$, $\|w_2\|^2 = 6$, and $\|w_3\|^2 = 50$, the following vectors form an orthonormal basis of $U$:

$u_1 = \dfrac{1}{2}(1, 1, 1, 1)$, $u_2 = \dfrac{1}{\sqrt{6}}(-1, -1, 0, 2)$, $u_3 = \dfrac{1}{5\sqrt{2}}(1, 3, -6, 2)$
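The Gram–Schmidt steps of Problem 7.21 can be sketched in a few lines of Python. The version below uses exact fractions and does not clear fractions or normalize, so it returns the vectors before rescaling; the function name `gram_schmidt` is an illustrative choice, and the input vectors are assumed to be linearly independent.

```python
from fractions import Fraction

def dot(x, y):
    """Usual inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def gram_schmidt(vectors):
    """Return an orthogonal list w_1, ..., w_r spanning the same subspace.

    Each w_k is v_k minus its projections along w_1, ..., w_{k-1}.
    Assumes the inputs are linearly independent (otherwise some <w, w> is 0).
    """
    basis = []
    for v in vectors:
        w = [Fraction(a) for a in v]
        for b in basis:
            c = dot(v, b) / dot(b, b)   # Fourier coefficient of v along b
            w = [wi - c * bi for wi, bi in zip(w, b)]
        basis.append(w)
    return basis

V = [(1, 1, 1, 1), (1, 1, 2, 4), (1, 2, -4, -3)]
w1, w2, w3 = gram_schmidt(V)
cleared = [2 * x for x in w3]   # clear fractions, as in part (a)

print(w2)
print(w3)
print(cleared)
```

Rescaling a vector by a nonzero scalar does not affect orthogonality, which is why clearing fractions between steps is harmless.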

7.22. Consider the vector space $P(t)$ with inner product $\langle f, g\rangle = \int_0^1 f(t)g(t)\,dt$. Apply the Gram–Schmidt algorithm to the set $\{1, t, t^2\}$ to obtain an orthogonal set $\{f_0, f_1, f_2\}$ with integer coefficients.

First set $f_0 = 1$. Then find

$t - \dfrac{\langle t, 1\rangle}{\langle 1, 1\rangle} \cdot 1 = t - \dfrac{1/2}{1} \cdot 1 = t - \dfrac{1}{2}$

Clear fractions to obtain $f_1 = 2t - 1$. Then find

$t^2 - \dfrac{\langle t^2, 1\rangle}{\langle 1, 1\rangle}(1) - \dfrac{\langle t^2, 2t - 1\rangle}{\langle 2t - 1, 2t - 1\rangle}(2t - 1) = t^2 - \dfrac{1/3}{1}(1) - \dfrac{1/6}{1/3}(2t - 1) = t^2 - t + \dfrac{1}{6}$

Clear fractions to obtain $f_2 = 6t^2 - 6t + 1$. Thus, $\{1,\ 2t - 1,\ 6t^2 - 6t + 1\}$ is the required orthogonal set.

7.23. Suppose $v = (1, 3, 5, 7)$. Find the projection of $v$ onto $W$ or, in other words, find $w \in W$ that minimizes $\|v - w\|$, where $W$ is the subspace of $\mathbf{R}^4$ spanned by
(a) $u_1 = (1, 1, 1, 1)$ and $u_2 = (1, -3, 4, -2)$,
(b) $v_1 = (1, 1, 1, 1)$ and $v_2 = (1, 2, 3, 2)$.

(a) Because $u_1$ and $u_2$ are orthogonal, we need only compute the Fourier coefficients:

$c_1 = \dfrac{\langle v, u_1\rangle}{\langle u_1, u_1\rangle} = \dfrac{1 + 3 + 5 + 7}{1 + 1 + 1 + 1} = \dfrac{16}{4} = 4$

$c_2 = \dfrac{\langle v, u_2\rangle}{\langle u_2, u_2\rangle} = \dfrac{1 - 9 + 20 - 14}{1 + 9 + 16 + 4} = \dfrac{-2}{30} = -\dfrac{1}{15}$

Then $w = \operatorname{proj}(v, W) = c_1u_1 + c_2u_2 = 4(1, 1, 1, 1) - \frac{1}{15}(1, -3, 4, -2) = \left(\frac{59}{15}, \frac{63}{15}, \frac{56}{15}, \frac{62}{15}\right)$.

(b) Because $v_1$ and $v_2$ are not orthogonal, first apply the Gram–Schmidt algorithm to find an orthogonal basis for $W$. Set $w_1 = v_1 = (1, 1, 1, 1)$. Then find

$v_2 - \dfrac{\langle v_2, w_1\rangle}{\langle w_1, w_1\rangle}w_1 = (1, 2, 3, 2) - \dfrac{8}{4}(1, 1, 1, 1) = (-1, 0, 1, 0)$

Set $w_2 = (-1, 0, 1, 0)$. Now compute

$c_1 = \dfrac{\langle v, w_1\rangle}{\langle w_1, w_1\rangle} = \dfrac{1 + 3 + 5 + 7}{1 + 1 + 1 + 1} = \dfrac{16}{4} = 4$

$c_2 = \dfrac{\langle v, w_2\rangle}{\langle w_2, w_2\rangle} = \dfrac{-1 + 0 + 5 + 0}{1 + 0 + 1 + 0} = \dfrac{4}{2} = 2$

Then $w = \operatorname{proj}(v, W) = c_1w_1 + c_2w_2 = 4(1, 1, 1, 1) + 2(-1, 0, 1, 0) = (2, 4, 6, 4)$. As a check, $v - w = (-1, -1, -1, 3)$ is orthogonal to both $v_1$ and $v_2$.
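A projection onto a subspace can be verified by checking that the residual $v - w$ is orthogonal to every spanning vector. The sketch below does this for part (a), where the spanning vectors are already orthogonal; `project` is a hypothetical helper name, and the function is only valid for an orthogonal list of nonzero vectors.

```python
from fractions import Fraction

def dot(x, y):
    """Usual inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def project(v, orthogonal_basis):
    """Sum of the projections c_i * u_i with c_i = <v, u_i>/<u_i, u_i>.

    Valid only when the given vectors are nonzero and mutually orthogonal.
    """
    result = [Fraction(0)] * len(v)
    for u in orthogonal_basis:
        c = Fraction(dot(v, u), dot(u, u))
        result = [r + c * a for r, a in zip(result, u)]
    return result

v = (1, 3, 5, 7)
u1, u2 = (1, 1, 1, 1), (1, -3, 4, -2)

w = project(v, [u1, u2])
residual = [a - b for a, b in zip(v, w)]

print(w)
print(dot(residual, u1), dot(residual, u2))   # both should vanish
```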

7.24. Suppose $w_1$ and $w_2$ are nonzero orthogonal vectors. Let $v$ be any vector in $V$. Find $c_1$ and $c_2$ so that $v'$ is orthogonal to $w_1$ and $w_2$, where $v' = v - c_1w_1 - c_2w_2$.

If $v'$ is orthogonal to $w_1$, then

$0 = \langle v - c_1w_1 - c_2w_2, w_1\rangle = \langle v, w_1\rangle - c_1\langle w_1, w_1\rangle - c_2\langle w_2, w_1\rangle = \langle v, w_1\rangle - c_1\langle w_1, w_1\rangle - c_2 \cdot 0 = \langle v, w_1\rangle - c_1\langle w_1, w_1\rangle$

Thus, $c_1 = \langle v, w_1\rangle/\langle w_1, w_1\rangle$. (That is, $c_1$ is the component of $v$ along $w_1$.) Similarly, if $v'$ is orthogonal to $w_2$, then

$0 = \langle v - c_1w_1 - c_2w_2, w_2\rangle = \langle v, w_2\rangle - c_2\langle w_2, w_2\rangle$

Thus, $c_2 = \langle v, w_2\rangle/\langle w_2, w_2\rangle$. (That is, $c_2$ is the component of $v$ along $w_2$.)


7.25. Prove Theorem 7.8: Suppose $w_1, w_2, \ldots, w_r$ form an orthogonal set of nonzero vectors in $V$. Let $v \in V$. Define

$v' = v - (c_1w_1 + c_2w_2 + \cdots + c_rw_r)$, where $c_i = \dfrac{\langle v, w_i\rangle}{\langle w_i, w_i\rangle}$

Then $v'$ is orthogonal to $w_1, w_2, \ldots, w_r$.

For $i = 1, 2, \ldots, r$ and using $\langle w_i, w_j\rangle = 0$ for $i \ne j$, we have

$\langle v - c_1w_1 - c_2w_2 - \cdots - c_rw_r, w_i\rangle = \langle v, w_i\rangle - c_1\langle w_1, w_i\rangle - \cdots - c_i\langle w_i, w_i\rangle - \cdots - c_r\langle w_r, w_i\rangle$
$= \langle v, w_i\rangle - c_1 \cdot 0 - \cdots - c_i\langle w_i, w_i\rangle - \cdots - c_r \cdot 0$
$= \langle v, w_i\rangle - c_i\langle w_i, w_i\rangle = \langle v, w_i\rangle - \dfrac{\langle v, w_i\rangle}{\langle w_i, w_i\rangle}\langle w_i, w_i\rangle = 0$

The theorem is proved.

7.26. Prove Theorem 7.9: Let $\{v_1, v_2, \ldots, v_n\}$ be any basis of an inner product space $V$. Then there exists an orthonormal basis $\{u_1, u_2, \ldots, u_n\}$ of $V$ such that the change-of-basis matrix from $\{v_i\}$ to $\{u_i\}$ is triangular; that is, for $k = 1, 2, \ldots, n$,

$u_k = a_{k1}v_1 + a_{k2}v_2 + \cdots + a_{kk}v_k$

The proof uses the Gram–Schmidt algorithm and Remarks 1 and 3 of Section 7.7. That is, apply the algorithm to $\{v_i\}$ to obtain an orthogonal basis $\{w_1, \ldots, w_n\}$, and then normalize $\{w_i\}$ to obtain an orthonormal basis $\{u_i\}$ of $V$. The specific algorithm guarantees that each $w_k$ is a linear combination of $v_1, \ldots, v_k$, and hence, each $u_k$ is a linear combination of $v_1, \ldots, v_k$.

7.27. Prove Theorem 7.10: Suppose $S = \{w_1, w_2, \ldots, w_r\}$ is an orthogonal basis for a subspace $W$ of $V$. Then one may extend $S$ to an orthogonal basis for $V$; that is, one may find vectors $w_{r+1}, \ldots, w_n$ such that $\{w_1, w_2, \ldots, w_n\}$ is an orthogonal basis for $V$.

Extend $S$ to a basis $S' = \{w_1, \ldots, w_r, v_{r+1}, \ldots, v_n\}$ for $V$. Applying the Gram–Schmidt algorithm to $S'$, we first obtain $w_1, w_2, \ldots, w_r$ because $S$ is orthogonal, and then we obtain vectors $w_{r+1}, \ldots, w_n$, where $\{w_1, w_2, \ldots, w_n\}$ is an orthogonal basis for $V$. Thus, the theorem is proved.

7.28. Prove Theorem 7.4: Let $W$ be a subspace of $V$. Then $V = W \oplus W^\perp$.

By Theorem 7.9, there exists an orthogonal basis $\{u_1, \ldots, u_r\}$ of $W$, and by Theorem 7.10 we can extend it to an orthogonal basis $\{u_1, u_2, \ldots, u_n\}$ of $V$. Hence, $u_{r+1}, \ldots, u_n \in W^\perp$. If $v \in V$, then

$v = a_1u_1 + \cdots + a_nu_n$, where $a_1u_1 + \cdots + a_ru_r \in W$ and $a_{r+1}u_{r+1} + \cdots + a_nu_n \in W^\perp$

Accordingly, $V = W + W^\perp$. On the other hand, if $w \in W \cap W^\perp$, then $\langle w, w\rangle = 0$. This yields $w = 0$. Hence, $W \cap W^\perp = \{0\}$. The two conditions $V = W + W^\perp$ and $W \cap W^\perp = \{0\}$ give the desired result $V = W \oplus W^\perp$.

Remark: Note that we have proved the theorem for the case that $V$ has finite dimension. We remark that the theorem also holds for spaces of arbitrary dimension.

7.29. Suppose $W$ is a subspace of a finite-dimensional space $V$. Prove that $W = W^{\perp\perp}$.

By Theorem 7.4, $V = W \oplus W^\perp$, and also $V = W^\perp \oplus W^{\perp\perp}$. Hence,

$\dim W = \dim V - \dim W^\perp$ and $\dim W^{\perp\perp} = \dim V - \dim W^\perp$

This yields $\dim W = \dim W^{\perp\perp}$. But $W \subseteq W^{\perp\perp}$ (see Problem 7.14). Hence, $W = W^{\perp\perp}$, as required.

7.30. Prove the following: Suppose $w_1, w_2, \ldots, w_r$ form an orthogonal set of nonzero vectors in $V$. Let $v$ be any vector in $V$ and let $c_i$ be the component of $v$ along $w_i$. Then, for any scalars $a_1, \ldots, a_r$, we have

$\left\| v - \sum_{k=1}^r c_kw_k \right\| \le \left\| v - \sum_{k=1}^r a_kw_k \right\|$

That is, $\sum c_iw_i$ is the closest approximation to $v$ as a linear combination of $w_1, \ldots, w_r$.

By Theorem 7.8, $v - \sum c_kw_k$ is orthogonal to every $w_i$ and hence orthogonal to any linear combination of $w_1, w_2, \ldots, w_r$. Therefore, using the Pythagorean theorem and summing from $k = 1$ to $r$,

$\left\| v - \sum a_kw_k \right\|^2 = \left\| v - \sum c_kw_k + \sum (c_k - a_k)w_k \right\|^2 = \left\| v - \sum c_kw_k \right\|^2 + \left\| \sum (c_k - a_k)w_k \right\|^2 \ge \left\| v - \sum c_kw_k \right\|^2$

The square root of both sides gives our theorem.

7.31. Suppose $\{e_1, e_2, \ldots, e_r\}$ is an orthonormal set of vectors in $V$. Let $v$ be any vector in $V$ and let $c_i$ be the Fourier coefficient of $v$ with respect to $e_i$. Prove Bessel's inequality:

$\sum_{k=1}^r c_k^2 \le \|v\|^2$

Note that $c_i = \langle v, e_i\rangle$, because $\|e_i\| = 1$. Then, using $\langle e_i, e_j\rangle = 0$ for $i \ne j$ and summing from $k = 1$ to $r$, we get

$0 \le \left\langle v - \sum c_ke_k,\ v - \sum c_ke_k \right\rangle = \langle v, v\rangle - 2\left\langle v, \sum c_ke_k \right\rangle + \sum c_k^2 = \langle v, v\rangle - \sum 2c_k\langle v, e_k\rangle + \sum c_k^2$
$= \langle v, v\rangle - \sum 2c_k^2 + \sum c_k^2 = \langle v, v\rangle - \sum c_k^2$

This gives us our inequality.

Orthogonal Matrices

7.32. Find an orthogonal matrix $P$ whose first row is $u_1 = \left(\frac{1}{3}, \frac{2}{3}, \frac{2}{3}\right)$.

First find a nonzero vector $w_2 = (x, y, z)$ that is orthogonal to $u_1$, that is, for which

$0 = \langle u_1, w_2\rangle = \dfrac{x}{3} + \dfrac{2y}{3} + \dfrac{2z}{3}$ or $x + 2y + 2z = 0$

One such solution is $w_2 = (0, 1, -1)$. Normalize $w_2$ to obtain the second row of $P$:

$u_2 = (0, 1/\sqrt{2}, -1/\sqrt{2})$

Next find a nonzero vector $w_3 = (x, y, z)$ that is orthogonal to both $u_1$ and $u_2$, that is, for which

$0 = \langle u_1, w_3\rangle = \dfrac{x}{3} + \dfrac{2y}{3} + \dfrac{2z}{3}$ or $x + 2y + 2z = 0$
$0 = \langle u_2, w_3\rangle = \dfrac{y}{\sqrt{2}} - \dfrac{z}{\sqrt{2}}$ or $y - z = 0$

Set $z = -1$ and find the solution $w_3 = (4, -1, -1)$. Normalize $w_3$ and obtain the third row of $P$; that is,

$u_3 = (4/\sqrt{18}, -1/\sqrt{18}, -1/\sqrt{18})$

Thus,

$P = \begin{pmatrix} \frac{1}{3} & \frac{2}{3} & \frac{2}{3} \\ 0 & 1/\sqrt{2} & -1/\sqrt{2} \\ 4/(3\sqrt{2}) & -1/(3\sqrt{2}) & -1/(3\sqrt{2}) \end{pmatrix}$

We emphasize that the above matrix $P$ is not unique.

7.33. Let $A = \begin{pmatrix} 1 & 1 & -1 \\ 1 & 3 & 4 \\ 7 & -5 & 2 \end{pmatrix}$. Determine whether or not: (a) the rows of $A$ are orthogonal; (b) $A$ is an orthogonal matrix; (c) the columns of $A$ are orthogonal.

(a) Yes, because $(1, 1, -1) \cdot (1, 3, 4) = 1 + 3 - 4 = 0$, $(1, 1, -1) \cdot (7, -5, 2) = 7 - 5 - 2 = 0$, $(1, 3, 4) \cdot (7, -5, 2) = 7 - 15 + 8 = 0$.
(b) No, because the rows of $A$ are not unit vectors; for example, $\|(1, 1, -1)\|^2 = 1 + 1 + 1 = 3$.
(c) No; for example, $(1, 1, 7) \cdot (1, 3, -5) = 1 + 3 - 35 = -31 \ne 0$.

7.34. Let $B$ be the matrix obtained by normalizing each row of $A$ in Problem 7.33. (a) Find $B$. (b) Is $B$ an orthogonal matrix? (c) Are the columns of $B$ orthogonal?

(a) We have

$\|(1, 1, -1)\|^2 = 1 + 1 + 1 = 3$, $\|(1, 3, 4)\|^2 = 1 + 9 + 16 = 26$, $\|(7, -5, 2)\|^2 = 49 + 25 + 4 = 78$

Thus,

$B = \begin{pmatrix} 1/\sqrt{3} & 1/\sqrt{3} & -1/\sqrt{3} \\ 1/\sqrt{26} & 3/\sqrt{26} & 4/\sqrt{26} \\ 7/\sqrt{78} & -5/\sqrt{78} & 2/\sqrt{78} \end{pmatrix}$

(b) Yes, because the rows of $B$ are still orthogonal and are now unit vectors.
(c) Yes, because the rows of $B$ form an orthonormal set of vectors. Then, by Theorem 7.11, the columns of $B$ must automatically form an orthonormal set.
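The claim in part (c) can be checked numerically: normalize the rows of $A$, then test both the rows and the columns for orthonormality. This is a floating-point sketch, so equality is tested up to a small tolerance; the tolerance `1e-12` and the helper names are assumptions of the example.

```python
import math

def normalize_rows(A):
    """Divide each row of A by its Euclidean length."""
    return [[a / math.sqrt(sum(x * x for x in row)) for a in row] for row in A]

def transpose(A):
    """Swap rows and columns."""
    return [list(col) for col in zip(*A)]

def is_orthonormal(rows, tol=1e-12):
    """True if <r_i, r_j> is 1 when i == j and 0 otherwise, up to tol."""
    n = len(rows)
    return all(
        abs(sum(a * b for a, b in zip(rows[i], rows[j])) - (1.0 if i == j else 0.0)) < tol
        for i in range(n)
        for j in range(n)
    )

A = [[1, 1, -1], [1, 3, 4], [7, -5, 2]]
B = normalize_rows(A)

print(is_orthonormal(B))              # rows of B
print(is_orthonormal(transpose(B)))   # columns of B
print(is_orthonormal(transpose(A)))   # columns of A, for comparison
```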

7.35. Prove each of the following:
(a) $P$ is orthogonal if and only if $P^T$ is orthogonal.
(b) If $P$ is orthogonal, then $P^{-1}$ is orthogonal.
(c) If $P$ and $Q$ are orthogonal, then $PQ$ is orthogonal.

(a) We have $(P^T)^T = P$. Thus, $P$ is orthogonal if and only if $PP^T = I$ if and only if $(P^T)^TP^T = I$ if and only if $P^T$ is orthogonal.
(b) We have $P^T = P^{-1}$, because $P$ is orthogonal. Thus, by part (a), $P^{-1}$ is orthogonal.
(c) We have $P^T = P^{-1}$ and $Q^T = Q^{-1}$. Thus, $(PQ)(PQ)^T = PQQ^TP^T = PQQ^{-1}P^{-1} = I$. Therefore, $(PQ)^T = (PQ)^{-1}$, and so $PQ$ is orthogonal.

7.36. Suppose $P$ is an orthogonal matrix. Show that (a) $\langle Pu, Pv\rangle = \langle u, v\rangle$ for any $u, v \in V$; (b) $\|Pu\| = \|u\|$ for every $u \in V$.

Use $P^TP = I$ and $\langle u, v\rangle = u^Tv$.

(a) $\langle Pu, Pv\rangle = (Pu)^T(Pv) = u^TP^TPv = u^Tv = \langle u, v\rangle$.
(b) We have $\|Pu\|^2 = \langle Pu, Pu\rangle = u^TP^TPu = u^Tu = \langle u, u\rangle = \|u\|^2$. Taking the square root of both sides gives our result.

7.37. Prove Theorem 7.12: Suppose $E = \{e_i\}$ and $E' = \{e_i'\}$ are orthonormal bases of $V$. Let $P$ be the change-of-basis matrix from $E$ to $E'$. Then $P$ is orthogonal.

Suppose

$e_i' = b_{i1}e_1 + b_{i2}e_2 + \cdots + b_{in}e_n$, $i = 1, \ldots, n$   (1)

Using Problem 7.18(b) and the fact that $E'$ is orthonormal, we get

$\delta_{ij} = \langle e_i', e_j'\rangle = b_{i1}b_{j1} + b_{i2}b_{j2} + \cdots + b_{in}b_{jn}$   (2)

Let $B = [b_{ij}]$ be the matrix of the coefficients in (1). (Then $P = B^T$.) Suppose $BB^T = [c_{ij}]$. Then

$c_{ij} = b_{i1}b_{j1} + b_{i2}b_{j2} + \cdots + b_{in}b_{jn}$   (3)

By (2) and (3), we have $c_{ij} = \delta_{ij}$. Thus, $BB^T = I$. Accordingly, $B$ is orthogonal, and hence, $P = B^T$ is orthogonal.

7.38. Prove Theorem 7.13: Let $\{e_1, \ldots, e_n\}$ be an orthonormal basis of an inner product space $V$. Let $P = [a_{ij}]$ be an orthogonal matrix. Then the following $n$ vectors form an orthonormal basis for $V$:

$e_i' = a_{1i}e_1 + a_{2i}e_2 + \cdots + a_{ni}e_n$, $i = 1, 2, \ldots, n$

Because $\{e_i\}$ is orthonormal, we get, by Problem 7.18(b),

$\langle e_i', e_j'\rangle = a_{1i}a_{1j} + a_{2i}a_{2j} + \cdots + a_{ni}a_{nj} = \langle C_i, C_j\rangle$

where $C_i$ denotes the $i$th column of the orthogonal matrix $P = [a_{ij}]$. Because $P$ is orthogonal, its columns form an orthonormal set. This implies $\langle e_i', e_j'\rangle = \langle C_i, C_j\rangle = \delta_{ij}$. Thus, $\{e_i'\}$ is an orthonormal basis.

Inner Products and Positive Definite Matrices

7.39. Which of the following symmetric matrices are positive definite?

(a) $A = \begin{pmatrix} 3 & 4 \\ 4 & 5 \end{pmatrix}$, (b) $B = \begin{pmatrix} 8 & -3 \\ -3 & 2 \end{pmatrix}$, (c) $C = \begin{pmatrix} 2 & 1 \\ 1 & -3 \end{pmatrix}$, (d) $D = \begin{pmatrix} 3 & 5 \\ 5 & 9 \end{pmatrix}$

Use Theorem 7.14 that a $2 \times 2$ real symmetric matrix is positive definite if and only if its diagonal entries are positive and its determinant is positive.

(a) No, because $|A| = 15 - 16 = -1$ is negative.
(b) Yes.
(c) No, because the diagonal entry $-3$ is negative.
(d) Yes.
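The $2 \times 2$ criterion of Theorem 7.14 translates directly into a short test. The following sketch (the function name `is_positive_definite_2x2` is hypothetical) applies it to the four matrices of Problem 7.39.

```python
def is_positive_definite_2x2(M):
    """Theorem 7.14 test for a real symmetric 2x2 matrix [[a, b], [b, d]]:
    positive definite iff a > 0, d > 0, and ad - b^2 > 0."""
    (a, b), (c, d) = M
    if b != c:
        raise ValueError("matrix must be symmetric")
    return a > 0 and d > 0 and a * d - b * b > 0

A = [[3, 4], [4, 5]]
B = [[8, -3], [-3, 2]]
C = [[2, 1], [1, -3]]
D = [[3, 5], [5, 9]]

for name, M in (("A", A), ("B", B), ("C", C), ("D", D)):
    print(name, is_positive_definite_2x2(M))
```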

7.40. Find the values of $k$ that make each of the following matrices positive definite:

(a) $A = \begin{pmatrix} 2 & -4 \\ -4 & k \end{pmatrix}$, (b) $B = \begin{pmatrix} 4 & k \\ k & 9 \end{pmatrix}$, (c) $C = \begin{pmatrix} k & 5 \\ 5 & -2 \end{pmatrix}$

(a) First, $k$ must be positive. Also, $|A| = 2k - 16$ must be positive; that is, $2k - 16 > 0$. Hence, $k > 8$.
(b) We need $|B| = 36 - k^2$ positive; that is, $36 - k^2 > 0$. Hence, $k^2 < 36$ or $-6 < k < 6$.
(c) $C$ can never be positive definite, because $C$ has a negative diagonal entry $-2$.

7.41. Find the matrix $A$ that represents the usual inner product on $\mathbf{R}^2$ relative to each of the following bases of $\mathbf{R}^2$: (a) $\{v_1 = (1, 3),\ v_2 = (2, 5)\}$; (b) $\{w_1 = (1, 2),\ w_2 = (4, -2)\}$.

(a) Compute $\langle v_1, v_1\rangle = 1 + 9 = 10$, $\langle v_1, v_2\rangle = 2 + 15 = 17$, $\langle v_2, v_2\rangle = 4 + 25 = 29$. Thus, $A = \begin{pmatrix} 10 & 17 \\ 17 & 29 \end{pmatrix}$.

(b) Compute $\langle w_1, w_1\rangle = 1 + 4 = 5$, $\langle w_1, w_2\rangle = 4 - 4 = 0$, $\langle w_2, w_2\rangle = 16 + 4 = 20$. Thus, $A = \begin{pmatrix} 5 & 0 \\ 0 & 20 \end{pmatrix}$. (Because the basis vectors are orthogonal, the matrix $A$ is diagonal.)

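7.42. Consider the vector space $P_2(t)$ with inner product $\langle f, g\rangle = \int_{-1}^1 f(t)g(t)\,dt$.
(a) Find $\langle f, g\rangle$, where $f(t) = t + 2$ and $g(t) = t^2 - 3t + 4$.
(b) Find the matrix $A$ of the inner product with respect to the basis $\{1, t, t^2\}$ of $V$.
(c) Verify Theorem 7.16 by showing that $\langle f, g\rangle = [f]^TA[g]$ with respect to the basis $\{1, t, t^2\}$.

(a) $\langle f, g\rangle = \int_{-1}^1 (t + 2)(t^2 - 3t + 4)\,dt = \int_{-1}^1 (t^3 - t^2 - 2t + 8)\,dt = \left[\dfrac{t^4}{4} - \dfrac{t^3}{3} - t^2 + 8t\right]_{-1}^1 = \dfrac{46}{3}$

(b) Here we use the fact that if $r + s = n$,

$\langle t^r, t^s\rangle = \int_{-1}^1 t^n\,dt = \left[\dfrac{t^{n+1}}{n + 1}\right]_{-1}^1 = \begin{cases} 2/(n + 1) & \text{if } n \text{ is even} \\ 0 & \text{if } n \text{ is odd} \end{cases}$

Then $\langle 1, 1\rangle = 2$, $\langle 1, t\rangle = 0$, $\langle 1, t^2\rangle = \frac{2}{3}$, $\langle t, t\rangle = \frac{2}{3}$, $\langle t, t^2\rangle = 0$, $\langle t^2, t^2\rangle = \frac{2}{5}$. Thus,

$A = \begin{pmatrix} 2 & 0 & \frac{2}{3} \\ 0 & \frac{2}{3} & 0 \\ \frac{2}{3} & 0 & \frac{2}{5} \end{pmatrix}$

(c) We have $[f]^T = (2, 1, 0)$ and $[g]^T = (4, -3, 1)$ relative to the given basis. Then

$[f]^TA[g] = (2, 1, 0)\begin{pmatrix} 2 & 0 & \frac{2}{3} \\ 0 & \frac{2}{3} & 0 \\ \frac{2}{3} & 0 & \frac{2}{5} \end{pmatrix}\begin{pmatrix} 4 \\ -3 \\ 1 \end{pmatrix} = \left(4, \tfrac{2}{3}, \tfrac{4}{3}\right)\begin{pmatrix} 4 \\ -3 \\ 1 \end{pmatrix} = \dfrac{46}{3} = \langle f, g\rangle$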
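The matrix in part (b) and the check in part (c) can be reproduced with exact fractions. The sketch below builds $A$ from the rule $\langle t^r, t^s\rangle = 2/(r + s + 1)$ for even $r + s$ and $0$ otherwise; `monomial_inner` is an illustrative helper name, and coordinate vectors list coefficients in the order $1, t, t^2$.

```python
from fractions import Fraction

def monomial_inner(r, s):
    """<t^r, t^s> = integral of t^(r+s) over [-1, 1]."""
    n = r + s
    return Fraction(2, n + 1) if n % 2 == 0 else Fraction(0)

# Matrix of the inner product relative to the basis {1, t, t^2}.
A = [[monomial_inner(r, s) for s in range(3)] for r in range(3)]

f = [2, 1, 0]    # coordinates of t + 2
g = [4, -3, 1]   # coordinates of t^2 - 3t + 4

# [f]^T A [g]
value = sum(f[r] * A[r][s] * g[s] for r in range(3) for s in range(3))

print(A)
print(value)
```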


7.43. Prove Theorem 7.14: $A = \begin{pmatrix} a & b \\ b & d \end{pmatrix}$ is positive definite if and only if $a$ and $d$ are positive and $|A| = ad - b^2$ is positive.

Let $u = [x, y]^T$. Then

$f(u) = u^TAu = [x, y]\begin{pmatrix} a & b \\ b & d \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = ax^2 + 2bxy + dy^2$

Suppose $f(u) > 0$ for every $u \ne 0$. Then $f(1, 0) = a > 0$ and $f(0, 1) = d > 0$. Also, we have $f(b, -a) = a(ad - b^2) > 0$. Because $a > 0$, we get $ad - b^2 > 0$.

Conversely, suppose $a > 0$, $d > 0$, $ad - b^2 > 0$. Completing the square gives us

$f(u) = a\left(x^2 + \dfrac{2b}{a}xy + \dfrac{b^2}{a^2}y^2\right) + dy^2 - \dfrac{b^2}{a}y^2 = a\left(x + \dfrac{by}{a}\right)^2 + \dfrac{ad - b^2}{a}y^2$

Accordingly, $f(u) > 0$ for every $u \ne 0$.

7.44. Prove Theorem 7.15: Let $A$ be a real positive definite matrix. Then the function $\langle u, v\rangle = u^TAv$ is an inner product on $\mathbf{R}^n$.

For any vectors $u_1$, $u_2$, and $v$,

$\langle u_1 + u_2, v\rangle = (u_1 + u_2)^TAv = (u_1^T + u_2^T)Av = u_1^TAv + u_2^TAv = \langle u_1, v\rangle + \langle u_2, v\rangle$

and, for any scalar $k$ and vectors $u$, $v$,

$\langle ku, v\rangle = (ku)^TAv = ku^TAv = k\langle u, v\rangle$

Thus $[I_1]$ is satisfied.

Because $u^TAv$ is a scalar, $(u^TAv)^T = u^TAv$. Also, $A^T = A$ because $A$ is symmetric. Therefore,

$\langle u, v\rangle = u^TAv = (u^TAv)^T = v^TA^Tu^{TT} = v^TAu = \langle v, u\rangle$

Thus, $[I_2]$ is satisfied.

Last, because $A$ is positive definite, $X^TAX > 0$ for any nonzero $X \in \mathbf{R}^n$. Thus, for any nonzero vector $v$, $\langle v, v\rangle = v^TAv > 0$. Also, $\langle 0, 0\rangle = 0^TA0 = 0$. Thus, $[I_3]$ is satisfied. Accordingly, the function $\langle u, v\rangle = u^TAv$ is an inner product.

7.45. Prove Theorem 7.16: Let $A$ be the matrix representation of an inner product relative to a basis $S$ of $V$. Then, for any vectors $u, v \in V$, we have

$\langle u, v\rangle = [u]^TA[v]$

Suppose $S = \{w_1, w_2, \ldots, w_n\}$ and $A = [k_{ij}]$. Hence, $k_{ij} = \langle w_i, w_j\rangle$. Suppose

$u = a_1w_1 + a_2w_2 + \cdots + a_nw_n$ and $v = b_1w_1 + b_2w_2 + \cdots + b_nw_n$

Then

$\langle u, v\rangle = \sum_{i=1}^n \sum_{j=1}^n a_ib_j\langle w_i, w_j\rangle$   (1)

On the other hand,

$[u]^TA[v] = (a_1, a_2, \ldots, a_n)\begin{pmatrix} k_{11} & k_{12} & \cdots & k_{1n} \\ k_{21} & k_{22} & \cdots & k_{2n} \\ \vdots & \vdots & & \vdots \\ k_{n1} & k_{n2} & \cdots & k_{nn} \end{pmatrix}\begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix}$

$= \left(\sum_{i=1}^n a_ik_{i1},\ \sum_{i=1}^n a_ik_{i2},\ \ldots,\ \sum_{i=1}^n a_ik_{in}\right)\begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix} = \sum_{j=1}^n \sum_{i=1}^n a_ib_jk_{ij}$   (2)

Equations (1) and (2) give us our result.


7.46. Prove Theorem 7.17: Let $A$ be the matrix representation of any inner product on $V$. Then $A$ is a positive definite matrix.

Because $\langle w_i, w_j\rangle = \langle w_j, w_i\rangle$ for any basis vectors $w_i$ and $w_j$, the matrix $A$ is symmetric. Let $X$ be any nonzero vector in $\mathbf{R}^n$. Then $[u] = X$ for some nonzero vector $u \in V$. Theorem 7.16 tells us that $X^TAX = [u]^TA[u] = \langle u, u\rangle > 0$. Thus, $A$ is positive definite.

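Complex Inner Product Spaces

7.47. Let $V$ be a complex inner product space. Verify the relation

$\langle u, av_1 + bv_2\rangle = \bar{a}\langle u, v_1\rangle + \bar{b}\langle u, v_2\rangle$

Using $[I_2^*]$, $[I_1^*]$, and then $[I_2^*]$, we find

$\langle u, av_1 + bv_2\rangle = \overline{\langle av_1 + bv_2, u\rangle} = \overline{a\langle v_1, u\rangle + b\langle v_2, u\rangle} = \bar{a}\,\overline{\langle v_1, u\rangle} + \bar{b}\,\overline{\langle v_2, u\rangle} = \bar{a}\langle u, v_1\rangle + \bar{b}\langle u, v_2\rangle$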

7.48. Suppose $\langle u, v\rangle = 3 + 2i$ in a complex inner product space $V$. Find (a) $\langle (2 - 4i)u, v\rangle$; (b) $\langle u, (4 + 3i)v\rangle$; (c) $\langle (3 - 6i)u, (5 - 2i)v\rangle$.

(a) $\langle (2 - 4i)u, v\rangle = (2 - 4i)\langle u, v\rangle = (2 - 4i)(3 + 2i) = 14 - 8i$
(b) $\langle u, (4 + 3i)v\rangle = \overline{(4 + 3i)}\langle u, v\rangle = (4 - 3i)(3 + 2i) = 18 - i$
(c) $\langle (3 - 6i)u, (5 - 2i)v\rangle = (3 - 6i)\overline{(5 - 2i)}\langle u, v\rangle = (3 - 6i)(5 + 2i)(3 + 2i) = 129 - 18i$

7.49. Find the Fourier coefficient (component) $c$ and the projection $cw$ of $v = (3 + 4i, 2 - 3i)$ along $w = (5 + i, 2i)$ in $\mathbf{C}^2$.

Recall that $c = \langle v, w\rangle / \langle w, w\rangle$. Compute
$$\langle v, w\rangle = (3 + 4i)\overline{(5 + i)} + (2 - 3i)\overline{(2i)} = (3 + 4i)(5 - i) + (2 - 3i)(-2i) = 19 + 17i - 6 - 4i = 13 + 13i$$
$$\langle w, w\rangle = 25 + 1 + 4 = 30$$
Thus, $c = (13 + 13i)/30 = \frac{13}{30} + \frac{13}{30}i$. Accordingly,
$$\operatorname{proj}(v, w) = cw = \left(\tfrac{26}{15} + \tfrac{39}{15}i,\ -\tfrac{13}{15} + \tfrac{13}{15}i\right)$$
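The computation in Problem 7.49 can be checked numerically. The following is a minimal sketch using Python's built-in complex numbers; the helper name `inner` is an illustrative choice, not notation from the text. It also confirms that the residual $v - cw$ is orthogonal to $w$.

```python
def inner(u, v):
    """Usual inner product on C^n: sum of u_k times the conjugate of v_k."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

v = (3 + 4j, 2 - 3j)
w = (5 + 1j, 2j)

c = inner(v, w) / inner(w, w)       # Fourier coefficient of v along w
proj = tuple(c * wk for wk in w)    # projection cw

# v - cw should be orthogonal to w (up to rounding error).
residual = tuple(vk - pk for vk, pk in zip(v, proj))
print(c, proj, abs(inner(residual, w)))
```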

7.50. Prove Theorem 7.18 (Cauchy–Schwarz): Let $V$ be a complex inner product space. Then $|\langle u, v\rangle| \le \|u\|\|v\|$.

If $v = 0$, the inequality reduces to $0 \le 0$ and hence is valid. Now suppose $v \ne 0$. Using $z\bar z = |z|^2$ (for any complex number $z$) and $\langle v, u\rangle = \overline{\langle u, v\rangle}$, we expand $\|u - \langle u, v\rangle t v\|^2 \ge 0$, where $t$ is any real value:
$$0 \le \|u - \langle u, v\rangle tv\|^2 = \langle u - \langle u, v\rangle tv,\ u - \langle u, v\rangle tv\rangle$$
$$= \langle u, u\rangle - \overline{\langle u, v\rangle}\,t\,\langle u, v\rangle - \langle u, v\rangle\,t\,\langle v, u\rangle + \langle u, v\rangle\overline{\langle u, v\rangle}\,t^2\langle v, v\rangle$$
$$= \|u\|^2 - 2t|\langle u, v\rangle|^2 + |\langle u, v\rangle|^2 t^2\|v\|^2$$
Set $t = 1/\|v\|^2$ to find $0 \le \|u\|^2 - \dfrac{|\langle u, v\rangle|^2}{\|v\|^2}$, from which $|\langle u, v\rangle|^2 \le \|u\|^2\|v\|^2$. Taking the square root of both sides, we obtain the required inequality.

7.51. Find an orthogonal basis for $u^\perp$ in $\mathbf{C}^3$ where $u = (1, i, 1 + i)$.

Here $u^\perp$ consists of all vectors $w = (x, y, z)$ such that
$$\langle w, u\rangle = x - iy + (1 - i)z = 0$$
Find one solution, say $w_1 = (0, 1 - i, i)$. Then find a solution of the system
$$x - iy + (1 - i)z = 0, \qquad (1 + i)y - iz = 0$$
Here $z$ is a free variable. Set $z = 1$ to obtain $y = i/(1 + i) = (1 + i)/2$ and $x = (3i - 3)/2$. Multiplying by 2 yields the solution $w_2 = (3i - 3, 1 + i, 2)$. The vectors $w_1$ and $w_2$ form an orthogonal basis for $u^\perp$.

7.52. Find an orthonormal basis of the subspace $W$ of $\mathbf{C}^3$ spanned by $v_1 = (1, i, 0)$ and $v_2 = (1, 2, 1 - i)$.

Apply the Gram–Schmidt algorithm. Set $w_1 = v_1 = (1, i, 0)$. Compute
$$v_2 - \frac{\langle v_2, w_1\rangle}{\langle w_1, w_1\rangle} w_1 = (1, 2, 1 - i) - \frac{1 - 2i}{2}(1, i, 0) = \left(\tfrac12 + i,\ 1 - \tfrac12 i,\ 1 - i\right)$$
Multiply by 2 to clear fractions, obtaining $w_2 = (1 + 2i, 2 - i, 2 - 2i)$. Next find $\|w_1\| = \sqrt{2}$ and then $\|w_2\| = \sqrt{18}$. Normalizing $\{w_1, w_2\}$, we obtain the following orthonormal basis of $W$:
$$\left\{ u_1 = \left(\frac{1}{\sqrt2}, \frac{i}{\sqrt2}, 0\right),\quad u_2 = \left(\frac{1 + 2i}{\sqrt{18}}, \frac{2 - i}{\sqrt{18}}, \frac{2 - 2i}{\sqrt{18}}\right) \right\}$$
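The Gram–Schmidt step of Problem 7.52 can be reproduced in a few lines. This is a minimal sketch in plain Python, assuming the usual inner product on $\mathbf{C}^n$; the function names `inner` and `gram_schmidt` are illustrative and not from the text.

```python
from math import sqrt

def inner(u, v):
    """Usual inner product on C^n: sum of u_k times the conjugate of v_k."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Return an orthogonal basis for the span of linearly independent vectors."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = inner(v, b) / inner(b, b)   # Fourier coefficient of v along b
            w = [wk - c * bk for wk, bk in zip(w, b)]
        basis.append(w)
    return basis

v1 = [1 + 0j, 1j, 0j]
v2 = [1 + 0j, 2 + 0j, 1 - 1j]
w1, w2 = gram_schmidt([v1, v2])

w2_scaled = [2 * z for z in w2]             # clear fractions, as in the text
u1 = [z / sqrt(inner(w1, w1).real) for z in w1]
u2 = [z / sqrt(inner(w2, w2).real) for z in w2]
print(w2_scaled, inner(u1, u2))
```

Scaling $w_2$ by 2 does not affect orthogonality, so the normalized vectors agree with those obtained from $(1 + 2i, 2 - i, 2 - 2i)$.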

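7.53. Find the matrix $P$ that represents the usual inner product on $\mathbf{C}^3$ relative to the basis $\{1, i, 1 - i\}$.

Compute the following six inner products:
$$\langle 1, 1\rangle = 1, \qquad \langle 1, i\rangle = \bar i = -i, \qquad \langle 1, 1 - i\rangle = \overline{1 - i} = 1 + i$$
$$\langle i, i\rangle = i\bar i = 1, \qquad \langle i, 1 - i\rangle = i\,\overline{(1 - i)} = -1 + i, \qquad \langle 1 - i, 1 - i\rangle = 2$$
Then, using $\langle u, v\rangle = \overline{\langle v, u\rangle}$, we obtain
$$P = \begin{bmatrix} 1 & -i & 1 + i \\ i & 1 & -1 + i \\ 1 - i & -1 - i & 2 \end{bmatrix}$$
(As expected, $P$ is Hermitian; that is, $P^H = P$.)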

Normed Vector Spaces

7.54. Consider vectors $u = (1, 3, -6, 4)$ and $v = (3, -5, 1, -2)$ in $\mathbf{R}^4$. Find (a) $\|u\|_\infty$ and $\|v\|_\infty$, (b) $\|u\|_1$ and $\|v\|_1$, (c) $\|u\|_2$ and $\|v\|_2$, (d) $d_\infty(u, v)$, $d_1(u, v)$, $d_2(u, v)$.

(a) The infinity norm chooses the maximum of the absolute values of the components. Hence, $\|u\|_\infty = 6$ and $\|v\|_\infty = 5$.

(b) The one-norm adds the absolute values of the components. Thus, $\|u\|_1 = 1 + 3 + 6 + 4 = 14$ and $\|v\|_1 = 3 + 5 + 1 + 2 = 11$.

(c) The two-norm is equal to the square root of the sum of the squares of the components (i.e., the norm induced by the usual inner product on $\mathbf{R}^4$). Thus,
$$\|u\|_2 = \sqrt{1 + 9 + 36 + 16} = \sqrt{62} \quad\text{and}\quad \|v\|_2 = \sqrt{9 + 25 + 1 + 4} = \sqrt{39}$$

(d) First find $u - v = (-2, 8, -7, 6)$. Then
$$d_\infty(u, v) = \|u - v\|_\infty = 8$$
$$d_1(u, v) = \|u - v\|_1 = 2 + 8 + 7 + 6 = 23$$
$$d_2(u, v) = \|u - v\|_2 = \sqrt{4 + 64 + 49 + 36} = \sqrt{153}$$
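The three norms and the distances they induce are straightforward to compute directly. The sketch below uses only the Python standard library; the function names `norm_inf`, `norm_1`, and `norm_2` are illustrative choices.

```python
from math import sqrt

def norm_inf(x):
    """Infinity norm: largest absolute value of a component."""
    return max(abs(t) for t in x)

def norm_1(x):
    """One-norm: sum of the absolute values of the components."""
    return sum(abs(t) for t in x)

def norm_2(x):
    """Two-norm: square root of the sum of squared absolute values."""
    return sqrt(sum(abs(t) ** 2 for t in x))

u = (1, 3, -6, 4)
v = (3, -5, 1, -2)
diff = tuple(a - b for a, b in zip(u, v))   # u - v

# Each distance is the corresponding norm of u - v.
for name, norm in (("inf", norm_inf), ("1", norm_1), ("2", norm_2)):
    print(name, norm(u), norm(v), norm(diff))
```

Because `abs` also handles complex numbers, the same functions apply to vectors in $\mathbf{C}^n$.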

7.55. Consider the function $f(t) = t^2 - 4t$ in $C[0, 3]$. (a) Find $\|f\|_\infty$, (b) Plot $f(t)$ in the plane $\mathbf{R}^2$, (c) Find $\|f\|_1$, (d) Find $\|f\|_2$.

(a) We seek $\|f\|_\infty = \max(|f(t)|)$. Because $f(t)$ is differentiable on $[0, 3]$, $|f(t)|$ has a maximum at a critical point of $f(t)$ (i.e., when the derivative $f'(t) = 0$), or at an endpoint of $[0, 3]$. Because $f'(t) = 2t - 4$, we set $2t - 4 = 0$ and obtain $t = 2$ as a critical point. Compute
$$f(2) = 4 - 8 = -4, \qquad f(0) = 0 - 0 = 0, \qquad f(3) = 9 - 12 = -3$$
Thus, $\|f\|_\infty = |f(2)| = |-4| = 4$.

(b) Compute $f(t)$ for various values of $t$ in $[0, 3]$, for example,

t      0    1    2    3
f(t)   0   -3   -4   -3

Plot the points in $\mathbf{R}^2$ and then draw a continuous curve through the points, as shown in Fig. 7-8.

(c) We seek $\|f\|_1 = \int_0^3 |f(t)|\,dt$. As indicated in Fig. 7-8, $f(t) \le 0$ on $[0, 3]$; hence, $|f(t)| = -(t^2 - 4t) = 4t - t^2$. Thus,
$$\|f\|_1 = \int_0^3 (4t - t^2)\,dt = \left[2t^2 - \frac{t^3}{3}\right]_0^3 = 18 - 9 = 9$$

(d) $\|f\|_2^2 = \displaystyle\int_0^3 f(t)^2\,dt = \int_0^3 (t^4 - 8t^3 + 16t^2)\,dt = \left[\frac{t^5}{5} - 2t^4 + \frac{16t^3}{3}\right]_0^3 = \frac{153}{5}$. Thus, $\|f\|_2 = \sqrt{\dfrac{153}{5}}$.

Figure 7-8

7.56. Prove Theorem 7.24: Let $V$ be a normed vector space. Then the function $d(u, v) = \|u - v\|$ satisfies the following three axioms of a metric space:
$[M_1]$ $d(u, v) \ge 0$; and $d(u, v) = 0$ iff $u = v$.
$[M_2]$ $d(u, v) = d(v, u)$.
$[M_3]$ $d(u, v) \le d(u, w) + d(w, v)$.

If $u \ne v$, then $u - v \ne 0$, and hence, $d(u, v) = \|u - v\| > 0$. Also, $d(u, u) = \|u - u\| = \|0\| = 0$. Thus, $[M_1]$ is satisfied. We also have
$$d(u, v) = \|u - v\| = \|-1(v - u)\| = |-1|\,\|v - u\| = \|v - u\| = d(v, u)$$
and
$$d(u, v) = \|u - v\| = \|(u - w) + (w - v)\| \le \|u - w\| + \|w - v\| = d(u, w) + d(w, v)$$
Thus, $[M_2]$ and $[M_3]$ are satisfied.

SUPPLEMENTARY PROBLEMS

Inner Products

7.57. Verify that the following is an inner product on $\mathbf{R}^2$, where $u = (x_1, x_2)$ and $v = (y_1, y_2)$:
$$f(u, v) = x_1y_1 - 2x_1y_2 - 2x_2y_1 + 5x_2y_2$$

7.58. Find the values of $k$ so that the following is an inner product on $\mathbf{R}^2$, where $u = (x_1, x_2)$ and $v = (y_1, y_2)$:
$$f(u, v) = x_1y_1 - 3x_1y_2 - 3x_2y_1 + kx_2y_2$$

7.59. Consider the vectors $u = (1, -3)$ and $v = (2, 5)$ in $\mathbf{R}^2$. Find
(a) $\langle u, v\rangle$ with respect to the usual inner product in $\mathbf{R}^2$.
(b) $\langle u, v\rangle$ with respect to the inner product in $\mathbf{R}^2$ in Problem 7.57.
(c) $\|v\|$ using the usual inner product in $\mathbf{R}^2$.
(d) $\|v\|$ using the inner product in $\mathbf{R}^2$ in Problem 7.57.

7.60. Show that each of the following is not an inner product on $\mathbf{R}^3$, where $u = (x_1, x_2, x_3)$ and $v = (y_1, y_2, y_3)$:
(a) $\langle u, v\rangle = x_1y_1 + x_2y_2$; (b) $\langle u, v\rangle = x_1y_2x_3 + y_1x_2y_3$.

7.61. Let $V$ be the vector space of $m \times n$ matrices over $\mathbf{R}$. Show that $\langle A, B\rangle = \operatorname{tr}(B^TA)$ defines an inner product in $V$.

7.62. Suppose $|\langle u, v\rangle| = \|u\|\|v\|$. (That is, the Cauchy–Schwarz inequality reduces to an equality.) Show that $u$ and $v$ are linearly dependent.

7.63. Suppose $f(u, v)$ and $g(u, v)$ are inner products on a vector space $V$ over $\mathbf{R}$. Prove
(a) The sum $f + g$ is an inner product on $V$, where $(f + g)(u, v) = f(u, v) + g(u, v)$.
(b) The scalar product $kf$, for $k > 0$, is an inner product on $V$, where $(kf)(u, v) = kf(u, v)$.

Orthogonality, Orthogonal Complements, Orthogonal Sets

7.64. Let $V$ be the vector space of polynomials over $\mathbf{R}$ of degree $\le 2$ with inner product defined by $\langle f, g\rangle = \int_0^1 f(t)g(t)\,dt$. Find a basis of the subspace $W$ orthogonal to $h(t) = 2t + 1$.

7.65. Find a basis of the subspace $W$ of $\mathbf{R}^4$ orthogonal to $u_1 = (1, -2, 3, 4)$ and $u_2 = (3, -5, 7, 8)$.

7.66. Find a basis for the subspace $W$ of $\mathbf{R}^5$ orthogonal to the vectors $u_1 = (1, 1, 3, 4, 1)$ and $u_2 = (1, 2, 1, 2, 1)$.

7.67. Let $w = (1, -2, -1, 3)$ be a vector in $\mathbf{R}^4$. Find (a) an orthogonal basis for $w^\perp$; (b) an orthonormal basis for $w^\perp$.

7.68. Let $W$ be the subspace of $\mathbf{R}^4$ orthogonal to $u_1 = (1, 1, 2, 2)$ and $u_2 = (0, 1, 2, -1)$. Find (a) an orthogonal basis for $W$; (b) an orthonormal basis for $W$. (Compare with Problem 7.65.)

7.69. Let $S$ consist of the following vectors in $\mathbf{R}^4$:
$$u_1 = (1, 1, 1, 1), \quad u_2 = (1, 1, -1, -1), \quad u_3 = (1, -1, 1, -1), \quad u_4 = (1, -1, -1, 1)$$
(a) Show that $S$ is orthogonal and a basis of $\mathbf{R}^4$.
(b) Write $v = (1, 3, -5, 6)$ as a linear combination of $u_1, u_2, u_3, u_4$.
(c) Find the coordinates of an arbitrary vector $v = (a, b, c, d)$ in $\mathbf{R}^4$ relative to the basis $S$.
(d) Normalize $S$ to obtain an orthonormal basis of $\mathbf{R}^4$.

7.70. Let $M = M_{2,2}$ with inner product $\langle A, B\rangle = \operatorname{tr}(B^TA)$. Show that the following is an orthonormal basis for $M$:
$$\left\{ \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix},\ \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix},\ \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix},\ \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \right\}$$

7.71. Let $M = M_{2,2}$ with inner product $\langle A, B\rangle = \operatorname{tr}(B^TA)$. Find an orthogonal basis for the orthogonal complement of (a) diagonal matrices, (b) symmetric matrices.

7.72. Suppose $\{u_1, u_2, \ldots, u_r\}$ is an orthogonal set of vectors. Show that $\{k_1u_1, k_2u_2, \ldots, k_ru_r\}$ is an orthogonal set for any scalars $k_1, k_2, \ldots, k_r$.

7.73. Let $U$ and $W$ be subspaces of a finite-dimensional inner product space $V$. Show that (a) $(U + W)^\perp = U^\perp \cap W^\perp$; (b) $(U \cap W)^\perp = U^\perp + W^\perp$.

Projections, Gram–Schmidt Algorithm, Applications

7.74. Find the Fourier coefficient $c$ and projection $cw$ of $v$ along $w$, where
(a) $v = (2, 3, -5)$ and $w = (1, -5, 2)$ in $\mathbf{R}^3$.
(b) $v = (1, 3, 1, 2)$ and $w = (1, -2, 7, 4)$ in $\mathbf{R}^4$.
(c) $v = t^2$ and $w = t + 3$ in $P(t)$, with inner product $\langle f, g\rangle = \int_0^1 f(t)g(t)\,dt$.
(d) $v = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$ and $w = \begin{bmatrix} 1 & 1 \\ 5 & 5 \end{bmatrix}$ in $M = M_{2,2}$, with inner product $\langle A, B\rangle = \operatorname{tr}(B^TA)$.

7.75. Let $U$ be the subspace of $\mathbf{R}^4$ spanned by
$$v_1 = (1, 1, 1, 1), \quad v_2 = (1, -1, 2, 2), \quad v_3 = (1, 2, -3, -4)$$
(a) Apply the Gram–Schmidt algorithm to find an orthogonal and an orthonormal basis for $U$.
(b) Find the projection of $v = (1, 2, -3, 4)$ onto $U$.

7.76. Suppose $v = (1, 2, 3, 4, 6)$. Find the projection of $v$ onto $W$, or, in other words, find $w \in W$ that minimizes $\|v - w\|$, where $W$ is the subspace of $\mathbf{R}^5$ spanned by
(a) $u_1 = (1, 2, 1, 2, 1)$ and $u_2 = (1, -1, 2, -1, 1)$,
(b) $v_1 = (1, 2, 1, 2, 1)$ and $v_2 = (1, 0, 1, 5, -1)$.

7.77. Consider the subspace $W = P_2(t)$ of $P(t)$ with inner product $\langle f, g\rangle = \int_0^1 f(t)g(t)\,dt$. Find the projection of $f(t) = t^3$ onto $W$. (Hint: Use the orthogonal polynomials $1$, $2t - 1$, $6t^2 - 6t + 1$ obtained in Problem 7.22.)

7.78. Consider $P(t)$ with inner product $\langle f, g\rangle = \int_{-1}^{1} f(t)g(t)\,dt$ and the subspace $W = P_3(t)$:
(a) Find an orthogonal basis for $W$ by applying the Gram–Schmidt algorithm to $\{1, t, t^2, t^3\}$.
(b) Find the projection of $f(t) = t^5$ onto $W$.

Orthogonal Matrices

7.79. Find the number and exhibit all $2 \times 2$ orthogonal matrices of the form $\begin{bmatrix} \frac13 & x \\ y & z \end{bmatrix}$.

7.80. Find a $3 \times 3$ orthogonal matrix $P$ whose first two rows are multiples of $u = (1, 1, 1)$ and $v = (1, -2, 3)$, respectively.

7.81. Find a symmetric orthogonal matrix $P$ whose first row is $(\frac13, \frac23, \frac23)$. (Compare with Problem 7.32.)

7.82. Real matrices $A$ and $B$ are said to be orthogonally equivalent if there exists an orthogonal matrix $P$ such that $B = P^TAP$. Show that this relation is an equivalence relation.

Positive Definite Matrices and Inner Products

7.83. Find the matrix $A$ that represents the usual inner product on $\mathbf{R}^2$ relative to each of the following bases: (a) $\{v_1 = (1, 4),\ v_2 = (2, -3)\}$, (b) $\{w_1 = (1, -3),\ w_2 = (6, 2)\}$.

7.84. Consider the following inner product on $\mathbf{R}^2$:
$$f(u, v) = x_1y_1 - 2x_1y_2 - 2x_2y_1 + 5x_2y_2, \quad\text{where}\quad u = (x_1, x_2),\ v = (y_1, y_2)$$
Find the matrix $B$ that represents this inner product on $\mathbf{R}^2$ relative to each basis in Problem 7.83.

7.85. Find the matrix $C$ that represents the usual inner product on $\mathbf{R}^3$ relative to the basis $S$ of $\mathbf{R}^3$ consisting of the vectors $u_1 = (1, 1, 1)$, $u_2 = (1, 2, 1)$, $u_3 = (1, -1, 3)$.

7.86. Let $V = P_2(t)$ with inner product $\langle f, g\rangle = \int_0^1 f(t)g(t)\,dt$.
(a) Find $\langle f, g\rangle$, where $f(t) = t + 2$ and $g(t) = t^2 - 3t + 4$.
(b) Find the matrix $A$ of the inner product with respect to the basis $\{1, t, t^2\}$ of $V$.
(c) Verify Theorem 7.16 that $\langle f, g\rangle = [f]^TA[g]$ with respect to the basis $\{1, t, t^2\}$.

7.87. Determine which of the following matrices are positive definite:
(a) $\begin{bmatrix} 1 & 3 \\ 3 & 5 \end{bmatrix}$, (b) $\begin{bmatrix} 3 & 4 \\ 4 & 7 \end{bmatrix}$, (c) $\begin{bmatrix} 4 & 2 \\ 2 & 1 \end{bmatrix}$, (d) $\begin{bmatrix} 6 & -7 \\ -7 & 9 \end{bmatrix}$.

7.88. Suppose $A$ and $B$ are positive definite matrices. Show that: (a) $A + B$ is positive definite and (b) $kA$ is positive definite for $k > 0$.

7.89. Suppose $B$ is a real nonsingular matrix. Show that: (a) $B^TB$ is symmetric and (b) $B^TB$ is positive definite.

Complex Inner Product Spaces

7.90. Verify that
$$\langle a_1u_1 + a_2u_2,\ b_1v_1 + b_2v_2\rangle = a_1\bar b_1\langle u_1, v_1\rangle + a_1\bar b_2\langle u_1, v_2\rangle + a_2\bar b_1\langle u_2, v_1\rangle + a_2\bar b_2\langle u_2, v_2\rangle$$
More generally, prove that $\left\langle \sum_{i=1}^{m} a_iu_i,\ \sum_{j=1}^{n} b_jv_j \right\rangle = \sum_{i,j} a_i\bar b_j\langle u_i, v_j\rangle$.

7.91. Consider $u = (1 + i, 3, 4 - i)$ and $v = (3 - 4i, 1 + i, 2i)$ in $\mathbf{C}^3$. Find (a) $\langle u, v\rangle$, (b) $\langle v, u\rangle$, (c) $\|u\|$, (d) $\|v\|$, (e) $d(u, v)$.

7.92. Find the Fourier coefficient $c$ and the projection $cw$ of (a) $u = (3 + i, 5 - 2i)$ along $w = (5 + i, 1 + i)$ in $\mathbf{C}^2$, (b) $u = (1 - i, 3i, 1 + i)$ along $w = (1, 2 - i, 3 + 2i)$ in $\mathbf{C}^3$.

7.93. Let $u = (z_1, z_2)$ and $v = (w_1, w_2)$ belong to $\mathbf{C}^2$. Verify that the following is an inner product of $\mathbf{C}^2$:
$$f(u, v) = z_1\bar w_1 + (1 + i)z_1\bar w_2 + (1 - i)z_2\bar w_1 + 3z_2\bar w_2$$

7.94. Find an orthogonal basis and an orthonormal basis for the subspace $W$ of $\mathbf{C}^3$ spanned by $u_1 = (1, i, 1)$ and $u_2 = (1 + i, 0, 2)$.

7.95. Let $u = (z_1, z_2)$ and $v = (w_1, w_2)$ belong to $\mathbf{C}^2$. For what values of $a, b, c, d \in \mathbf{C}$ is the following an inner product on $\mathbf{C}^2$?
$$f(u, v) = az_1\bar w_1 + bz_1\bar w_2 + cz_2\bar w_1 + dz_2\bar w_2$$

7.96. Prove the following form for an inner product in a complex space $V$:
$$\langle u, v\rangle = \tfrac14\|u + v\|^2 - \tfrac14\|u - v\|^2 + \tfrac{i}{4}\|u + iv\|^2 - \tfrac{i}{4}\|u - iv\|^2$$
[Compare with Problem 7.7(b).]

7.97. Let $V$ be a real inner product space. Show that (i) $\|u\| = \|v\|$ if and only if $\langle u + v, u - v\rangle = 0$; (ii) $\|u + v\|^2 = \|u\|^2 + \|v\|^2$ if and only if $\langle u, v\rangle = 0$. Show by counterexamples that the above statements are not true for, say, $\mathbf{C}^2$.

7.98. Find the matrix $P$ that represents the usual inner product on $\mathbf{C}^3$ relative to the basis $\{1, 1 + i, 1 - 2i\}$.

7.99. A complex matrix $A$ is unitary if it is invertible and $A^{-1} = A^H$. Alternatively, $A$ is unitary if its rows (columns) form an orthonormal set of vectors (relative to the usual inner product of $\mathbf{C}^n$). Find a unitary matrix whose first row is: (a) a multiple of $(1, 1 - i)$; (b) a multiple of $(\frac12, \frac12 i, \frac12 - \frac12 i)$.

Normed Vector Spaces

7.100. Consider vectors $u = (1, -3, 4, 1, -2)$ and $v = (3, 1, -2, -3, 1)$ in $\mathbf{R}^5$. Find (a) $\|u\|_\infty$ and $\|v\|_\infty$, (b) $\|u\|_1$ and $\|v\|_1$, (c) $\|u\|_2$ and $\|v\|_2$, (d) $d_\infty(u, v)$, $d_1(u, v)$, $d_2(u, v)$.

7.101. Repeat Problem 7.100 for $u = (1 + i, 2 - 4i)$ and $v = (1 - i, 2 + 3i)$ in $\mathbf{C}^2$.

7.102. Consider the functions $f(t) = 5t - t^2$ and $g(t) = 3t - t^2$ in $C[0, 4]$. Find (a) $d_\infty(f, g)$, (b) $d_1(f, g)$, (c) $d_2(f, g)$.

7.103. Prove (a) $\|\cdot\|_\infty$ is a norm on $\mathbf{R}^n$. (b) $\|\cdot\|_1$ is a norm on $\mathbf{R}^n$.

7.104. Prove (a) $\|\cdot\|_\infty$ is a norm on $C[a, b]$. (b) $\|\cdot\|_1$ is a norm on $C[a, b]$.

ANSWERS TO SUPPLEMENTARY PROBLEMS

Notation: $M = [R_1;\ R_2;\ \ldots]$ denotes a matrix $M$ with rows $R_1, R_2, \ldots$. Also, basis need not be unique.

7.58. $k > 9$

7.59. (a) $-13$, (b) $-71$, (c) $\sqrt{29}$, (d) $\sqrt{89}$

7.60. Let $u = (0, 0, 1)$; then $\langle u, u\rangle = 0$ in both cases

7.64. $\{7t^2 - 5t,\ 12t^2 - 5\}$

7.65. $\{(1, 2, 1, 0),\ (4, 4, 0, 1)\}$

7.66. $(-1, 0, 0, 0, 1)$, $(-6, 2, 0, 1, 0)$, $(-5, 2, 1, 0, 0)$

7.67. (a) $u_1 = (0, 0, 3, 1)$, $u_2 = (0, 5, -1, 3)$, $u_3 = (-14, -2, -1, 3)$; (b) $u_1/\sqrt{10}$, $u_2/\sqrt{35}$, $u_3/\sqrt{210}$

7.68. (a) $(0, 2, -1, 0)$, $(-15, 1, 2, 5)$; (b) $(0, 2, -1, 0)/\sqrt5$, $(-15, 1, 2, 5)/\sqrt{255}$

7.69. (b) $v = \frac14(5u_1 + 3u_2 - 13u_3 + 9u_4)$, (c) $[v] = \frac14[a + b + c + d,\ a + b - c - d,\ a - b + c - d,\ a - b - c + d]$

7.71. (a) $[0, 1;\ 0, 0]$, $[0, 0;\ 1, 0]$, (b) $[0, -1;\ 1, 0]$

7.74. (a) $c = -\frac{23}{30}$, (b) $c = \frac17$, (c) $c = \frac{15}{148}$, (d) $c = \frac{19}{26}$

7.75. (a) $w_1 = (1, 1, 1, 1)$, $w_2 = (0, -2, 1, 1)$, $w_3 = (12, -4, -1, -7)$, (b) $\operatorname{proj}(v, U) = \frac15(-1, 12, 3, 6)$

7.76. (a) $\operatorname{proj}(v, W) = \frac18(23, 25, 30, 25, 23)$, (b) First find an orthogonal basis for $W$; say, $w_1 = (1, 2, 1, 2, 1)$ and $w_2 = (0, 2, 0, -3, 2)$. Then $\operatorname{proj}(v, W) = \frac{1}{17}(34, 76, 34, 56, 42)$

7.77. $\operatorname{proj}(f, W) = \frac32t^2 - \frac35t + \frac{1}{20}$

7.78. (a) $\{1,\ t,\ 3t^2 - 1,\ 5t^3 - 3t\}$, (b) $\operatorname{proj}(f, W) = \frac{10}{9}t^3 - \frac{5}{21}t$

7.79. Four: $[a, b;\ b, -a]$, $[a, b;\ -b, a]$, $[a, -b;\ b, a]$, $[a, -b;\ -b, -a]$, where $a = \frac13$ and $b = \frac13\sqrt8$

7.80. $P = [1/a, 1/a, 1/a;\ 1/b, -2/b, 3/b;\ 5/c, -2/c, -3/c]$, where $a = \sqrt3$, $b = \sqrt{14}$, $c = \sqrt{38}$

7.81. $\frac13[1, 2, 2;\ 2, -2, 1;\ 2, 1, -2]$

7.83. (a) $[17, -10;\ -10, 13]$, (b) $[10, 0;\ 0, 40]$

7.84. (a) $[65, -68;\ -68, 73]$, (b) $[58, 8;\ 8, 8]$

7.85. $[3, 4, 3;\ 4, 6, 2;\ 3, 2, 11]$

7.86. (a) $\frac{83}{12}$, (b) $[1, a, b;\ a, b, c;\ b, c, d]$, where $a = \frac12$, $b = \frac13$, $c = \frac14$, $d = \frac15$

7.87. (a) No, (b) Yes, (c) No, (d) Yes

7.91. (a) $-4i$, (b) $4i$, (c) $\sqrt{28}$, (d) $\sqrt{31}$, (e) $\sqrt{59}$

7.92. (a) $c = \frac{1}{28}(19 - 5i)$, (b) $c = \frac{1}{19}(3 + 6i)$

7.94. $\{v_1 = (1, i, 1)/\sqrt3,\ v_2 = (2i, 1 - 3i, 3 - i)/\sqrt{24}\}$

7.95. $a$ and $d$ real and positive, $c = \bar b$, and $ad - bc$ positive.

7.97. $u = (1, 2)$, $v = (i, 2i)$

7.98. $P = [1, 1 - i, 1 + 2i;\ 1 + i, 2, -1 + 3i;\ 1 - 2i, -1 - 3i, 5]$

7.99. (a) $(1/\sqrt3)[1, 1 - i;\ 1 + i, -1]$, (b) $[a, ai, a - ai;\ bi, b, 0;\ a, ai, -a + ai]$, where $a = \frac12$ and $b = 1/\sqrt2$.

7.100. (a) 4 and 3, (b) 11 and 10, (c) $\sqrt{31}$ and $\sqrt{24}$, (d) 6, 19, 9

7.101. (a) $\sqrt{20}$ and $\sqrt{13}$, (b) $\sqrt2 + \sqrt{20}$ and $\sqrt2 + \sqrt{13}$, (c) $\sqrt{22}$ and $\sqrt{15}$, (d) 7, 9, $\sqrt{53}$

7.102. (a) 8, (b) 16, (c) $16/\sqrt3$

CHAPTER 8

Determinants
8.1 Introduction
Each $n$-square matrix $A = [a_{ij}]$ is assigned a special scalar called the determinant of $A$, denoted by $\det(A)$ or $|A|$ or
$$\begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix}$$
We emphasize that an $n \times n$ array of scalars enclosed by straight lines, called a determinant of order $n$, is not a matrix but denotes the determinant of the enclosed array of scalars (i.e., the enclosed matrix).

The determinant function was first discovered during the investigation of systems of linear equations. We shall see that the determinant is an indispensable tool in investigating and obtaining properties of square matrices. The definition of the determinant and most of its properties also apply in the case where the entries of a matrix come from a commutative ring.

We begin with a special case of determinants of orders 1, 2, and 3. Then we define a determinant of arbitrary order. This general definition is preceded by a discussion of permutations, which is necessary for our general definition of the determinant.

8.2 Determinants of Orders 1 and 2

Determinants of orders 1 and 2 are defined as follows:
$$|a_{11}| = a_{11} \quad\text{and}\quad \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{12}a_{21}$$
Thus, the determinant of a $1 \times 1$ matrix $A = [a_{11}]$ is the scalar $a_{11}$; that is, $\det(A) = |a_{11}| = a_{11}$. The determinant of order two may easily be remembered by using the following diagram:

[Diagram: a plus-labeled arrow runs from $a_{11}$ to $a_{22}$, and a minus-labeled arrow runs from $a_{12}$ to $a_{21}$.]

That is, the determinant is equal to the product of the elements along the plus-labeled arrow minus the product of the elements along the minus-labeled arrow. (There is an analogous diagram for determinants of order 3, but not for higher-order determinants.)

EXAMPLE 8.1

(a) Because the determinant of order 1 is the scalar itself, we have:
$$\det(27) = 27, \qquad \det(-7) = -7, \qquad \det(t - 3) = t - 3$$

(b) $\begin{vmatrix} 5 & 3 \\ 4 & 6 \end{vmatrix} = 5(6) - 3(4) = 30 - 12 = 18, \qquad \begin{vmatrix} 3 & 2 \\ -5 & 7 \end{vmatrix} = 21 + 10 = 31$

Application to Linear Equations

Consider two linear equations in two unknowns, say
$$a_1x + b_1y = c_1$$
$$a_2x + b_2y = c_2$$
Let $D = a_1b_2 - a_2b_1$, the determinant of the matrix of coefficients. Then the system has a unique solution if and only if $D \ne 0$. In such a case, the unique solution may be expressed completely in terms of determinants as follows:
$$x = \frac{N_x}{D} = \frac{b_2c_1 - b_1c_2}{a_1b_2 - a_2b_1} = \frac{\begin{vmatrix} c_1 & b_1 \\ c_2 & b_2 \end{vmatrix}}{\begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix}}, \qquad y = \frac{N_y}{D} = \frac{a_1c_2 - a_2c_1}{a_1b_2 - a_2b_1} = \frac{\begin{vmatrix} a_1 & c_1 \\ a_2 & c_2 \end{vmatrix}}{\begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix}}$$
Here $D$ appears in the denominator of both quotients. The numerators $N_x$ and $N_y$ of the quotients for $x$ and $y$, respectively, can be obtained by substituting the column of constant terms in place of the column of coefficients of the given unknown in the matrix of coefficients. On the other hand, if $D = 0$, then the system may have no solution or more than one solution.

EXAMPLE 8.2 Solve by determinants the system
$$4x - 3y = 15$$
$$2x + 5y = 1$$
First find the determinant $D$ of the matrix of coefficients:
$$D = \begin{vmatrix} 4 & -3 \\ 2 & 5 \end{vmatrix} = 4(5) - (-3)(2) = 20 + 6 = 26$$
Because $D \ne 0$, the system has a unique solution. To obtain the numerators $N_x$ and $N_y$, simply replace, in the matrix of coefficients, the coefficients of $x$ and $y$, respectively, by the constant terms, and then take their determinants:
$$N_x = \begin{vmatrix} 15 & -3 \\ 1 & 5 \end{vmatrix} = 75 + 3 = 78, \qquad N_y = \begin{vmatrix} 4 & 15 \\ 2 & 1 \end{vmatrix} = 4 - 30 = -26$$
Then the unique solution of the system is
$$x = \frac{N_x}{D} = \frac{78}{26} = 3, \qquad y = \frac{N_y}{D} = \frac{-26}{26} = -1$$
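The determinant solution of a two-by-two system translates directly into code. The following minimal sketch uses exact rational arithmetic from the Python standard library; the function names `det2` and `cramer_2x2` are illustrative and not part of the text. It returns `None` when $D = 0$, in which case the system has no solution or more than one.

```python
from fractions import Fraction

def det2(a, b, c, d):
    """Determinant of the 2 x 2 array with rows (a, b) and (c, d)."""
    return a * d - b * c

def cramer_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1, a2*x + b2*y = c2 by determinants."""
    D = det2(a1, b1, a2, b2)
    if D == 0:
        return None                      # no unique solution
    Nx = det2(c1, b1, c2, b2)            # constants replace the x-column
    Ny = det2(a1, c1, a2, c2)            # constants replace the y-column
    return Fraction(Nx, D), Fraction(Ny, D)

# The system of Example 8.2: 4x - 3y = 15, 2x + 5y = 1.
print(cramer_2x2(4, -3, 15, 2, 5, 1))
```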

8.3 Determinants of Order 3

Consider an arbitrary $3 \times 3$ matrix $A = [a_{ij}]$. The determinant of $A$ is defined as follows:
$$\det(A) = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31} - a_{12}a_{21}a_{33} - a_{11}a_{23}a_{32}$$
Observe that there are six products, each product consisting of three elements of the original matrix. Three of the products are plus-labeled (keep their sign) and three of the products are minus-labeled (change their sign).

The diagrams in Fig. 8-1 may help us to remember the above six products in $\det(A)$. That is, the determinant is equal to the sum of the products of the elements along the three plus-labeled arrows in Fig. 8-1 plus the sum of the negatives of the products of the elements along the three minus-labeled arrows. We emphasize that there are no such diagrammatic devices with which to remember determinants of higher order.

Figure 8-1

EXAMPLE 8.3 Let $A = \begin{bmatrix} 2 & 1 & 1 \\ 0 & 5 & -2 \\ 1 & -3 & 4 \end{bmatrix}$ and $B = \begin{bmatrix} 3 & 2 & 1 \\ -4 & 5 & -1 \\ 2 & -3 & 4 \end{bmatrix}$. Find $\det(A)$ and $\det(B)$.

Use the diagrams in Fig. 8-1:
$$\det(A) = 2(5)(4) + 1(-2)(1) + 1(-3)(0) - 1(5)(1) - (-3)(-2)(2) - 4(1)(0) = 40 - 2 + 0 - 5 - 12 - 0 = 21$$
$$\det(B) = 60 - 4 + 12 - 10 - 9 + 32 = 81$$

Alternative Form for a Determinant of Order 3

The determinant of the $3 \times 3$ matrix $A = [a_{ij}]$ may be rewritten as follows:
$$\det(A) = a_{11}(a_{22}a_{33} - a_{23}a_{32}) - a_{12}(a_{21}a_{33} - a_{23}a_{31}) + a_{13}(a_{21}a_{32} - a_{22}a_{31})$$
$$= a_{11}\begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} - a_{12}\begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} + a_{13}\begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}$$
which is a linear combination of three determinants of order 2 whose coefficients (with alternating signs) form the first row of the given matrix. Note that each $2 \times 2$ matrix can be obtained by deleting, in the original matrix, the row and column containing its coefficient.

EXAMPLE 8.4
$$\begin{vmatrix} 1 & 2 & 3 \\ 4 & -2 & 3 \\ 0 & 5 & -1 \end{vmatrix} = 1\begin{vmatrix} -2 & 3 \\ 5 & -1 \end{vmatrix} - 2\begin{vmatrix} 4 & 3 \\ 0 & -1 \end{vmatrix} + 3\begin{vmatrix} 4 & -2 \\ 0 & 5 \end{vmatrix}$$
$$= 1(2 - 15) - 2(-4 + 0) + 3(20 + 0) = -13 + 8 + 60 = 55$$

8.4 Permutations

A permutation $\sigma$ of the set $\{1, 2, \ldots, n\}$ is a one-to-one mapping of the set onto itself or, equivalently, a rearrangement of the numbers $1, 2, \ldots, n$. Such a permutation $\sigma$ is denoted by
$$\sigma = \begin{pmatrix} 1 & 2 & \ldots & n \\ j_1 & j_2 & \ldots & j_n \end{pmatrix} \quad\text{or}\quad \sigma = j_1j_2\cdots j_n, \quad\text{where } j_i = \sigma(i)$$
The set of all such permutations is denoted by $S_n$, and the number of such permutations is $n!$. If $\sigma \in S_n$, then the inverse mapping $\sigma^{-1} \in S_n$; and if $\sigma, \tau \in S_n$, then the composition mapping $\sigma \circ \tau \in S_n$. Also, the identity mapping $\varepsilon = \sigma \circ \sigma^{-1} \in S_n$. (In fact, $\varepsilon = 123\ldots n$.)

EXAMPLE 8.5

(a) There are $2! = 2 \cdot 1 = 2$ permutations in $S_2$; they are 12 and 21.
(b) There are $3! = 3 \cdot 2 \cdot 1 = 6$ permutations in $S_3$; they are 123, 132, 213, 231, 312, 321.

Sign (Parity) of a Permutation

Consider an arbitrary permutation $\sigma$ in $S_n$, say $\sigma = j_1j_2\cdots j_n$. We say $\sigma$ is an even or odd permutation according to whether there is an even or odd number of inversions in $\sigma$. By an inversion in $\sigma$ we mean a pair of integers $(i, k)$ such that $i > k$, but $i$ precedes $k$ in $\sigma$. We then define the sign or parity of $\sigma$, written $\operatorname{sgn} \sigma$, by
$$\operatorname{sgn} \sigma = \begin{cases} 1 & \text{if } \sigma \text{ is even} \\ -1 & \text{if } \sigma \text{ is odd} \end{cases}$$

EXAMPLE 8.6

(a) Find the sign of $\sigma = 35142$ in $S_5$. For each element $k$, we count the number of elements $i$ such that $i > k$ and $i$ precedes $k$ in $\sigma$. There are
2 numbers (3 and 5) greater than and preceding 1;
3 numbers (3, 5, and 4) greater than and preceding 2;
1 number (5) greater than and preceding 4.
(There are no numbers greater than and preceding either 3 or 5.) Because there are, in all, six inversions, $\sigma$ is even and $\operatorname{sgn} \sigma = 1$.

(b) The identity permutation $\varepsilon = 123\ldots n$ is even because there are no inversions in $\varepsilon$.

(c) In $S_2$, the permutation 12 is even and 21 is odd. In $S_3$, the permutations 123, 231, 312 are even and the permutations 132, 213, 321 are odd.

(d) Let $\tau$ be the permutation that interchanges two numbers $i$ and $j$ and leaves the other numbers fixed. That is,
$$\tau(i) = j, \qquad \tau(j) = i, \qquad \tau(k) = k, \quad\text{where } k \ne i, j$$
We call $\tau$ a transposition. If $i < j$, then there are $2(j - i) - 1$ inversions in $\tau$, and hence, the transposition $\tau$ is odd.

Remark: One can show that, for any $n$, half of the permutations in $S_n$ are even and half of them are odd. For example, 3 of the 6 permutations in $S_3$ are even, and 3 are odd.
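Counting inversions is easy to mechanize. The following minimal sketch represents a permutation as a tuple of its images $j_1, j_2, \ldots, j_n$; the function names `inversions` and `sgn` are illustrative choices. It uses a direct pairwise count, which is adequate for small $n$.

```python
from itertools import permutations

def inversions(perm):
    """Number of pairs (i, k) with i > k and i preceding k in perm."""
    n = len(perm)
    return sum(1 for a in range(n) for b in range(a + 1, n)
               if perm[a] > perm[b])

def sgn(perm):
    """Sign of a permutation: 1 if it is even, -1 if it is odd."""
    return 1 if inversions(perm) % 2 == 0 else -1

sigma = (3, 5, 1, 4, 2)                  # the permutation 35142 of Example 8.6
print(inversions(sigma), sgn(sigma))

# Split S_3 into its even and odd permutations.
even = [p for p in permutations((1, 2, 3)) if sgn(p) == 1]
odd = [p for p in permutations((1, 2, 3)) if sgn(p) == -1]
print(even, odd)
```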

8.5 Determinants of Arbitrary Order

Let $A = [a_{ij}]$ be a square matrix of order $n$ over a field $K$. Consider a product of $n$ elements of $A$ such that one and only one element comes from each row and one and only one element comes from each column. Such a product can be written in the form
$$a_{1j_1}a_{2j_2}\cdots a_{nj_n}$$
that is, where the factors come from successive rows, and so the first subscripts are in the natural order $1, 2, \ldots, n$. Now because the factors come from different columns, the sequence of second subscripts forms a permutation $\sigma = j_1j_2\cdots j_n$ in $S_n$. Conversely, each permutation in $S_n$ determines a product of the above form. Thus, the matrix $A$ contains $n!$ such products.

DEFINITION: The determinant of $A = [a_{ij}]$, denoted by $\det(A)$ or $|A|$, is the sum of all the above $n!$ products, where each such product is multiplied by $\operatorname{sgn} \sigma$. That is,
$$|A| = \sum_{\sigma} (\operatorname{sgn} \sigma)\,a_{1j_1}a_{2j_2}\cdots a_{nj_n} \quad\text{or}\quad |A| = \sum_{\sigma \in S_n} (\operatorname{sgn} \sigma)\,a_{1\sigma(1)}a_{2\sigma(2)}\cdots a_{n\sigma(n)}$$
The determinant of the $n$-square matrix $A$ is said to be of order $n$.

The next example shows that the above definition agrees with the previous definition of determinants of orders 1, 2, and 3.

EXAMPLE 8.7

(a) Let $A = [a_{11}]$ be a $1 \times 1$ matrix. Because $S_1$ has only one permutation, which is even, $\det(A) = a_{11}$, the number itself.

(b) Let $A = [a_{ij}]$ be a $2 \times 2$ matrix. In $S_2$, the permutation 12 is even and the permutation 21 is odd. Hence,
$$\det(A) = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{12}a_{21}$$

(c) Let $A = [a_{ij}]$ be a $3 \times 3$ matrix. In $S_3$, the permutations 123, 231, 312 are even, and the permutations 321, 213, 132 are odd. Hence,
$$\det(A) = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31} - a_{12}a_{21}a_{33} - a_{11}a_{23}a_{32}$$

Remark: As $n$ increases, the number of terms in the determinant becomes astronomical. Accordingly, we use indirect methods to evaluate determinants rather than the definition of the determinant. In fact, we prove a number of properties about determinants that will permit us to shorten the computation considerably. In particular, we show that a determinant of order $n$ is equal to a linear combination of determinants of order $n - 1$, as in the case $n = 3$ above.
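The permutation definition can be coded almost verbatim, which also makes the $n!$ growth in the number of terms concrete. The sketch below is for illustration on small matrices only; the names `sgn` and `det_by_definition` are hypothetical, and indices are 0-based as is usual in Python.

```python
from itertools import permutations
from math import prod

def sgn(perm):
    """Sign of a permutation, computed from its number of inversions."""
    n = len(perm)
    inv = sum(1 for a in range(n) for b in range(a + 1, n)
              if perm[a] > perm[b])
    return 1 if inv % 2 == 0 else -1

def det_by_definition(A):
    """Sum over all permutations s of sgn(s) * a[0][s(0)] * ... * a[n-1][s(n-1)]."""
    n = len(A)
    return sum(sgn(p) * prod(A[i][p[i]] for i in range(n))
               for p in permutations(range(n)))

A = [[2, 1, 1], [0, 5, -2], [1, -3, 4]]      # matrix A of Example 8.3
B = [[3, 2, 1], [-4, 5, -1], [2, -3, 4]]     # matrix B of Example 8.3
print(det_by_definition(A), det_by_definition(B))
```

For a matrix of order 10 this sum already has 3,628,800 terms, which is why the indirect methods described in the text are preferred.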

8.6 Properties of Determinants

We now list basic properties of the determinant.

THEOREM 8.1: The determinant of a matrix $A$ and its transpose $A^T$ are equal; that is, $|A| = |A^T|$.

By this theorem (proved in Problem 8.22), any theorem about the determinant of a matrix $A$ that concerns the rows of $A$ will have an analogous theorem concerning the columns of $A$.

The next theorem (proved in Problem 8.24) gives certain cases for which the determinant can be obtained immediately.

THEOREM 8.2: Let $A$ be a square matrix.
(i) If $A$ has a row (column) of zeros, then $|A| = 0$.
(ii) If $A$ has two identical rows (columns), then $|A| = 0$.
(iii) If $A$ is triangular (i.e., $A$ has zeros above or below the diagonal), then $|A|$ = product of diagonal elements. Thus, in particular, $|I| = 1$, where $I$ is the identity matrix.

The next theorem (proved in Problems 8.23 and 8.25) shows how the determinant of a matrix is affected by the elementary row and column operations.

THEOREM 8.3: Suppose $B$ is obtained from $A$ by an elementary row (column) operation.
(i) If two rows (columns) of $A$ were interchanged, then $|B| = -|A|$.
(ii) If a row (column) of $A$ were multiplied by a scalar $k$, then $|B| = k|A|$.
(iii) If a multiple of a row (column) of $A$ were added to another row (column) of $A$, then $|B| = |A|$.

Major Properties of Determinants

We now state two of the most important and useful theorems on determinants.

THEOREM 8.4: The determinant of a product of two matrices $A$ and $B$ is the product of their determinants; that is,
$$\det(AB) = \det(A)\det(B)$$

The above theorem says that the determinant is a multiplicative function.

THEOREM 8.5: Let $A$ be a square matrix. Then the following are equivalent:
(i) $A$ is invertible; that is, $A$ has an inverse $A^{-1}$.
(ii) $AX = 0$ has only the zero solution.
(iii) The determinant of $A$ is not zero; that is, $\det(A) \ne 0$.

Remark: Depending on the author and the text, a nonsingular matrix $A$ is defined to be an invertible matrix $A$, or a matrix $A$ for which $|A| \ne 0$, or a matrix $A$ for which $AX = 0$ has only the zero solution. The above theorem shows that all such definitions are equivalent.

We will prove Theorems 8.4 and 8.5 (in Problems 8.29 and 8.28, respectively) using the theory of elementary matrices and the following lemma (proved in Problem 8.26), which is a special case of Theorem 8.4.

LEMMA 8.6: Let $E$ be an elementary matrix. Then, for any matrix $A$, $|EA| = |E||A|$.

Recall that matrices $A$ and $B$ are similar if there exists a nonsingular matrix $P$ such that $B = P^{-1}AP$. Using the multiplicative property of the determinant (Theorem 8.4), one can easily prove (Problem 8.31) the following theorem.

THEOREM 8.7: Suppose $A$ and $B$ are similar matrices. Then $|A| = |B|$.

8.7 Minors and Cofactors

Consider an $n$-square matrix $A = [a_{ij}]$. Let $M_{ij}$ denote the $(n - 1)$-square submatrix of $A$ obtained by deleting its $i$th row and $j$th column. The determinant $|M_{ij}|$ is called the minor of the element $a_{ij}$ of $A$, and we define the cofactor of $a_{ij}$, denoted by $A_{ij}$, to be the "signed" minor:
$$A_{ij} = (-1)^{i+j}|M_{ij}|$$
Note that the "signs" $(-1)^{i+j}$ accompanying the minors form a chessboard pattern with $+$'s on the main diagonal:
$$\begin{bmatrix} + & - & + & - & \cdots \\ - & + & - & + & \cdots \\ + & - & + & - & \cdots \\ \vdots & \vdots & \vdots & \vdots & \end{bmatrix}$$
We emphasize that $M_{ij}$ denotes a matrix, whereas $A_{ij}$ denotes a scalar.

Remark: The sign $(-1)^{i+j}$ of the cofactor $A_{ij}$ is frequently obtained using the checkerboard pattern. Specifically, beginning with $+$ and alternating signs: $+, -, +, -, \ldots$, count from the main diagonal to the appropriate square.

EXAMPLE 8.8 Let $A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}$. Find the following minors and cofactors: (a) $|M_{23}|$ and $A_{23}$, (b) $|M_{31}|$ and $A_{31}$.

(a) Deleting row 2 and column 3 gives $|M_{23}| = \begin{vmatrix} 1 & 2 \\ 7 & 8 \end{vmatrix} = 8 - 14 = -6$, and so $A_{23} = (-1)^{2+3}|M_{23}| = -(-6) = 6$.

(b) Deleting row 3 and column 1 gives $|M_{31}| = \begin{vmatrix} 2 & 3 \\ 5 & 6 \end{vmatrix} = 12 - 15 = -3$, and so $A_{31} = (-1)^{3+1}|M_{31}| = +(-3) = -3$.

Laplace Expansion

The following theorem (proved in Problem 8.32) holds.

THEOREM 8.8: (Laplace) The determinant of a square matrix $A = [a_{ij}]$ is equal to the sum of the products obtained by multiplying the elements of any row (column) by their respective cofactors:
$$|A| = a_{i1}A_{i1} + a_{i2}A_{i2} + \cdots + a_{in}A_{in} = \sum_{j=1}^{n} a_{ij}A_{ij}$$
$$|A| = a_{1j}A_{1j} + a_{2j}A_{2j} + \cdots + a_{nj}A_{nj} = \sum_{i=1}^{n} a_{ij}A_{ij}$$

The above formulas for $|A|$ are called the Laplace expansions of the determinant of $A$ by the $i$th row and the $j$th column. Together with the elementary row (column) operations, they offer a method of simplifying the computation of $|A|$, as described below.
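The Laplace expansion by a row leads naturally to a recursive procedure. The following is a minimal sketch, assuming matrices stored as lists of rows with 0-based indices (so the cofactor $A_{23}$ of the text is `cofactor(A, 1, 2)`); the function names are illustrative. The parity of $i + j$ is unchanged by the shift to 0-based indexing, so the signs agree with the text.

```python
def minor_matrix(A, i, j):
    """Submatrix M_ij: delete row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def cofactor(A, i, j):
    """Signed minor (-1)^(i+j) * |M_ij|."""
    return (-1) ** (i + j) * det(minor_matrix(A, i, j))

def det(A, row=0):
    """Determinant by Laplace expansion along the given row."""
    if len(A) == 1:
        return A[0][0]
    return sum(A[row][j] * cofactor(A, row, j) for j in range(len(A)))

A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]        # the matrix of Example 8.8
print(cofactor(A, 1, 2), cofactor(A, 2, 0))  # A_23 and A_31 of the text
print(det(A, row=0), det(A, row=1), det(A, row=2))
```

Expanding along different rows gives the same value, as Theorem 8.8 asserts. The recursion performs on the order of $n!$ multiplications, so it is practical only for small matrices.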

8.8 Evaluation of Determinants

The following algorithm reduces the evaluation of a determinant of order $n$ to the evaluation of a determinant of order $n - 1$.

ALGORITHM 8.1: (Reduction of the order of a determinant) The input is a nonzero $n$-square matrix $A = [a_{ij}]$ with $n > 1$.

Step 1. Choose an element $a_{ij} = 1$ or, if lacking, $a_{ij} \neq 0$.
Step 2. Using $a_{ij}$ as a pivot, apply elementary row (column) operations to put 0's in all the other positions in the column (row) containing $a_{ij}$.
Step 3. Expand the determinant by the column (row) containing $a_{ij}$.

The following remarks are in order.

Remark 1: Algorithm 8.1 is usually used for determinants of order 4 or more. With determinants of order less than 4, one uses the specific formulas for the determinant.

Remark 2: Gaussian elimination or, equivalently, repeated use of Algorithm 8.1 together with row interchanges can be used to transform a matrix $A$ into an upper triangular matrix whose determinant is the product of its diagonal entries. However, one must keep track of the number of row interchanges, because each row interchange changes the sign of the determinant.

EXAMPLE 8.9 Use Algorithm 8.1 to find the determinant of $A = \begin{bmatrix} 5 & 4 & 2 & 1 \\ 2 & 3 & 1 & -2 \\ -5 & -7 & -3 & 9 \\ 1 & -2 & -1 & 4 \end{bmatrix}$.
Use $a_{23} = 1$ as a pivot to put 0's in the other positions of the third column; that is, apply the row operations “Replace $R_1$ by $-2R_2 + R_1$,” “Replace $R_3$ by $3R_2 + R_3$,” and “Replace $R_4$ by $R_2 + R_4$.” By Theorem 8.3(iii), the value of the determinant does not change under these operations. Thus,
\[ |A| = \begin{vmatrix} 5 & 4 & 2 & 1 \\ 2 & 3 & 1 & -2 \\ -5 & -7 & -3 & 9 \\ 1 & -2 & -1 & 4 \end{vmatrix} = \begin{vmatrix} 1 & -2 & 0 & 5 \\ 2 & 3 & 1 & -2 \\ 1 & 2 & 0 & 3 \\ 3 & 1 & 0 & 2 \end{vmatrix} \]
Now expand by the third column. Specifically, neglect all terms that contain 0 and use the fact that the sign of the minor $M_{23}$ is $(-1)^{2+3} = -1$. Thus,
\[ |A| = -\begin{vmatrix} 1 & -2 & 5 \\ 1 & 2 & 3 \\ 3 & 1 & 2 \end{vmatrix} = -(4 - 18 + 5 - 30 - 3 + 4) = -(-38) = 38 \]
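Remark 2 can be sketched in code as follows. This is one possible implementation, assuming exact rational arithmetic via Python's `fractions.Fraction`; the name `det_by_elimination` is illustrative. It reduces the matrix to upper triangular form, flips a sign for each row interchange, and multiplies the diagonal entries.

```python
from fractions import Fraction


def det_by_elimination(rows):
    """Determinant via Gaussian elimination, tracking row interchanges."""
    a = [[Fraction(x) for x in row] for row in rows]
    n = len(a)
    sign = 1
    for c in range(n):
        # Find a row at or below the diagonal with a nonzero entry in column c.
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)  # no pivot in this column: the determinant is 0
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            sign = -sign  # each row interchange changes the sign
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            # Adding a multiple of one row to another leaves the determinant unchanged.
            a[r] = [x - factor * y for x, y in zip(a[r], a[c])]
    result = Fraction(sign)
    for i in range(n):
        result *= a[i][i]
    return result


A = [[5, 4, 2, 1], [2, 3, 1, -2], [-5, -7, -3, 9], [1, -2, -1, 4]]
print(det_by_elimination(A))  # 38
```

The result agrees with Example 8.9. Using `Fraction` avoids the rounding that floating-point division would introduce; for large numerical matrices one would normally use a floating-point routine with partial pivoting instead.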

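8.9 Classical Adjoint

Let $A = [a_{ij}]$ be an $n \times n$ matrix over a field $K$ and let $A_{ij}$ denote the cofactor of $a_{ij}$. The classical adjoint of $A$, denoted by $\operatorname{adj} A$, is the transpose of the matrix of cofactors of $A$. Namely,
\[ \operatorname{adj} A = [A_{ij}]^T \]
We say “classical adjoint” instead of simply “adjoint” because the term “adjoint” is currently used for an entirely different concept.

EXAMPLE 8.10 Let $A = \begin{bmatrix} 2 & 3 & -4 \\ 0 & -4 & 2 \\ 1 & -1 & 5 \end{bmatrix}$. The cofactors of the nine elements of $A$ follow:
\[ A_{11} = +\begin{vmatrix} -4 & 2 \\ -1 & 5 \end{vmatrix} = -18, \quad A_{12} = -\begin{vmatrix} 0 & 2 \\ 1 & 5 \end{vmatrix} = 2, \quad A_{13} = +\begin{vmatrix} 0 & -4 \\ 1 & -1 \end{vmatrix} = 4 \]
\[ A_{21} = -\begin{vmatrix} 3 & -4 \\ -1 & 5 \end{vmatrix} = -11, \quad A_{22} = +\begin{vmatrix} 2 & -4 \\ 1 & 5 \end{vmatrix} = 14, \quad A_{23} = -\begin{vmatrix} 2 & 3 \\ 1 & -1 \end{vmatrix} = 5 \]
\[ A_{31} = +\begin{vmatrix} 3 & -4 \\ -4 & 2 \end{vmatrix} = -10, \quad A_{32} = -\begin{vmatrix} 2 & -4 \\ 0 & 2 \end{vmatrix} = -4, \quad A_{33} = +\begin{vmatrix} 2 & 3 \\ 0 & -4 \end{vmatrix} = -8 \]
The transpose of the above matrix of cofactors yields the classical adjoint of $A$; that is,
\[ \operatorname{adj} A = \begin{bmatrix} -18 & -11 & -10 \\ 2 & 14 & -4 \\ 4 & 5 & -8 \end{bmatrix} \]
The following theorem (proved in Problem 8.34) holds.

THEOREM 8.9: Let $A$ be any square matrix. Then
\[ A(\operatorname{adj} A) = (\operatorname{adj} A)A = |A| I \]
where $I$ is the identity matrix. Thus, if $|A| \neq 0$,
\[ A^{-1} = \frac{1}{|A|} (\operatorname{adj} A) \]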

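EXAMPLE 8.11 Let $A$ be the matrix in Example 8.10. We have
\[ \det(A) = -40 + 6 + 0 - 16 + 4 + 0 = -46 \]
Thus, $A$ does have an inverse, and, by Theorem 8.9,
\[ A^{-1} = \frac{1}{|A|} (\operatorname{adj} A) = -\frac{1}{46} \begin{bmatrix} -18 & -11 & -10 \\ 2 & 14 & -4 \\ 4 & 5 & -8 \end{bmatrix} = \begin{bmatrix} \frac{9}{23} & \frac{11}{46} & \frac{5}{23} \\ -\frac{1}{23} & -\frac{7}{23} & \frac{2}{23} \\ -\frac{2}{23} & -\frac{5}{46} & \frac{4}{23} \end{bmatrix} \]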
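The computation of the classical adjoint and of the inverse through Theorem 8.9 can be sketched as follows. The function names `adjugate` and `inverse` are illustrative, and exact fractions are assumed so that the output can be compared with Example 8.11. The cofactor method is shown for clarity; it is far slower than elimination for large matrices.

```python
from fractions import Fraction


def submatrix(m, i, j):
    """Delete row i and column j (0-based) from the square matrix m."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]


def det(m):
    """Determinant by first-row expansion; the empty matrix has determinant 1."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det(submatrix(m, 0, j)) for j in range(len(m)))


def adjugate(m):
    """Classical adjoint: the transpose of the matrix of cofactors."""
    n = len(m)
    cofactors = [[(-1) ** (i + j) * det(submatrix(m, i, j)) for j in range(n)]
                 for i in range(n)]
    return [list(col) for col in zip(*cofactors)]


def inverse(m):
    """Inverse as (adj m) / |m|; only defined when |m| is nonzero."""
    d = det(m)
    if d == 0:
        raise ValueError("matrix is singular")
    return [[Fraction(x, d) for x in row] for row in adjugate(m)]


A = [[2, 3, -4], [0, -4, 2], [1, -1, 5]]
print(adjugate(A))    # [[-18, -11, -10], [2, 14, -4], [4, 5, -8]]
print(inverse(A)[0])  # [Fraction(9, 23), Fraction(11, 46), Fraction(5, 23)]
```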

8.10 Applications to Linear Equations, Cramer's Rule

Consider a system $AX = B$ of $n$ linear equations in $n$ unknowns. Here $A = [a_{ij}]$ is the (square) matrix of coefficients and $B = [b_i]$ is the column vector of constants. Let $A_i$ be the matrix obtained from $A$ by replacing the $i$th column of $A$ by the column vector $B$. Furthermore, let
\[ D = \det(A), \quad N_1 = \det(A_1), \quad N_2 = \det(A_2), \quad \ldots, \quad N_n = \det(A_n) \]

The fundamental relationship between determinants and the solution of the system $AX = B$ follows.

THEOREM 8.10: The (square) system $AX = B$ has a unique solution if and only if $D \neq 0$. In this case, the unique solution is given by
\[ x_1 = \frac{N_1}{D}, \quad x_2 = \frac{N_2}{D}, \quad \ldots, \quad x_n = \frac{N_n}{D} \]

The above theorem (proved in Problem 8.10) is known as Cramer's rule for solving systems of linear equations. We emphasize that the theorem only refers to a system with the same number of equations as unknowns, and that it only gives the solution when $D \neq 0$. In fact, if $D = 0$, the theorem does not tell us whether or not the system has a solution. However, in the case of a homogeneous system, we have the following useful result (to be proved in Problem 8.54).

THEOREM 8.11: A square homogeneous system $AX = 0$ has a nonzero solution if and only if $D = |A| = 0$.

EXAMPLE 8.12 Solve the system using determinants:
\[ \begin{cases} x + y + z = 5 \\ x - 2y - 3z = -1 \\ 2x + y - z = 3 \end{cases} \]
First compute the determinant $D$ of the matrix of coefficients:
\[ D = \begin{vmatrix} 1 & 1 & 1 \\ 1 & -2 & -3 \\ 2 & 1 & -1 \end{vmatrix} = 2 - 6 + 1 + 4 + 3 + 1 = 5 \]
Because $D \neq 0$, the system has a unique solution. To compute $N_x$, $N_y$, $N_z$, we replace, respectively, the coefficients of $x, y, z$ in the matrix of coefficients by the constant terms. This yields
\[ N_x = \begin{vmatrix} 5 & 1 & 1 \\ -1 & -2 & -3 \\ 3 & 1 & -1 \end{vmatrix} = 20, \quad N_y = \begin{vmatrix} 1 & 5 & 1 \\ 1 & -1 & -3 \\ 2 & 3 & -1 \end{vmatrix} = -10, \quad N_z = \begin{vmatrix} 1 & 1 & 5 \\ 1 & -2 & -1 \\ 2 & 1 & 3 \end{vmatrix} = 15 \]
Thus, the unique solution of the system is $x = N_x/D = 4$, $y = N_y/D = -2$, $z = N_z/D = 3$; that is, the vector $u = (4, -2, 3)$.
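Cramer's rule can be sketched in a few lines. The function name `cramer` is illustrative, exact fractions are assumed, and the routine deliberately refuses to answer when $D = 0$, because in that case the rule gives no information about the system.

```python
from fractions import Fraction


def det(m):
    """Determinant by first-row expansion; the empty matrix has determinant 1."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def cramer(a, b):
    """Solve the square system a x = b by Cramer's rule (requires D != 0)."""
    d = det(a)
    if d == 0:
        raise ValueError("D = 0: Cramer's rule does not apply")
    solution = []
    for i in range(len(a)):
        # A_i: replace column i of the coefficient matrix by the constants.
        a_i = [row[:i] + [b[k]] + row[i + 1:] for k, row in enumerate(a)]
        solution.append(Fraction(det(a_i), d))
    return solution


coefficients = [[1, 1, 1], [1, -2, -3], [2, 1, -1]]
constants = [5, -1, 3]
print(cramer(coefficients, constants))  # [Fraction(4, 1), Fraction(-2, 1), Fraction(3, 1)]
```

The output reproduces the solution $(4, -2, 3)$ of Example 8.12. Each unknown costs one extra determinant, so for larger systems Gaussian elimination is the more practical choice.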

8.11 Submatrices, Minors, Principal Minors

Let $A = [a_{ij}]$ be a square matrix of order $n$. Consider any $r$ rows and $r$ columns of $A$. That is, consider any set $I = (i_1, i_2, \ldots, i_r)$ of $r$ row indices and any set $J = (j_1, j_2, \ldots, j_r)$ of $r$ column indices. Then $I$ and $J$ define an $r \times r$ submatrix of $A$, denoted by $A(I; J)$, obtained by deleting the rows and columns of $A$ whose subscripts do not belong to $I$ or $J$, respectively. That is,
\[ A(I; J) = [a_{st} : s \in I, \ t \in J] \]
The determinant $|A(I; J)|$ is called a minor of $A$ of order $r$ and
\[ (-1)^{i_1 + i_2 + \cdots + i_r + j_1 + j_2 + \cdots + j_r} |A(I; J)| \]
is the corresponding signed minor. (Note that a minor of order $n - 1$ is a minor in the sense of Section 8.7, and the corresponding signed minor is a cofactor.) Furthermore, if $I'$ and $J'$ denote, respectively, the remaining row and column indices, then $|A(I'; J')|$ denotes the complementary minor, and its sign (Problem 8.74) is the same sign as the minor.

EXAMPLE 8.13 Let $A = [a_{ij}]$ be a 5-square matrix, and let $I = \{1, 2, 4\}$ and $J = \{2, 3, 5\}$. Then $I' = \{3, 5\}$ and $J' = \{1, 4\}$, and the corresponding minor $|M|$ and complementary minor $|M'|$ are as follows:
\[ |M| = |A(I; J)| = \begin{vmatrix} a_{12} & a_{13} & a_{15} \\ a_{22} & a_{23} & a_{25} \\ a_{42} & a_{43} & a_{45} \end{vmatrix} \quad \text{and} \quad |M'| = |A(I'; J')| = \begin{vmatrix} a_{31} & a_{34} \\ a_{51} & a_{54} \end{vmatrix} \]
Because $1 + 2 + 4 + 2 + 3 + 5 = 17$ is odd, $-|M|$ is the signed minor, and $-|M'|$ is the signed complementary minor.

Principal Minors
A minor is principal if the row and column indices are the same, or equivalently, if the diagonal elements of the minor come from the diagonal of the matrix. We note that the sign of a principal minor is always $+1$, because the sum of the row and identical column subscripts must always be even.

EXAMPLE 8.14 Let $A = \begin{bmatrix} 1 & 2 & -1 \\ 3 & 5 & 4 \\ -3 & 1 & -2 \end{bmatrix}$. Find the sums $C_1$, $C_2$, and $C_3$ of the principal minors of $A$ of orders 1, 2, and 3, respectively.

(a) There are three principal minors of order 1. These are
\[ |1| = 1, \quad |5| = 5, \quad |-2| = -2, \quad \text{and so} \quad C_1 = 1 + 5 - 2 = 4 \]
Note that $C_1$ is simply the trace of $A$. Namely, $C_1 = \operatorname{tr}(A)$.

(b) There are three ways to choose two of the three diagonal elements, and each choice gives a minor of order 2. These are
\[ \begin{vmatrix} 1 & 2 \\ 3 & 5 \end{vmatrix} = -1, \quad \begin{vmatrix} 1 & -1 \\ -3 & -2 \end{vmatrix} = -2 - 3 = -5, \quad \begin{vmatrix} 5 & 4 \\ 1 & -2 \end{vmatrix} = -14 \]
(Note that these minors of order 2 are the cofactors $A_{33}$, $A_{22}$, and $A_{11}$ of $A$, respectively.) Thus,
\[ C_2 = -1 - 5 - 14 = -20 \]

(c) There is only one way to choose three of the three diagonal elements. Thus, the only minor of order 3 is the determinant of $A$ itself. Thus,
\[ C_3 = |A| = -10 - 24 - 3 - 15 - 4 + 12 = -44 \]
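The sum of all principal minors of a given order can be computed by enumerating index subsets. The sketch below assumes the illustrative name `principal_minor_sum` and uses `itertools.combinations` to choose the same indices for rows and columns; the two test matrices are the ones that appear later in Problems 8.7 and 8.8.

```python
from itertools import combinations


def det(m):
    """Determinant by first-row expansion; the empty matrix has determinant 1."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def principal_minor_sum(m, k):
    """Sum of the principal minors of order k (same row and column indices)."""
    n = len(m)
    return sum(det([[m[i][j] for j in idx] for i in idx])
               for idx in combinations(range(n), k))


A7 = [[1, 2, 3], [4, 5, 6], [0, 7, 8]]
A8 = [[1, 3, 0, -1], [-4, 2, 5, 1], [1, 0, 3, -2], [3, -2, 1, 4]]
print([principal_minor_sum(A7, k) for k in (1, 2, 3)])     # [14, 3, 18]
print([principal_minor_sum(A8, k) for k in (1, 2, 3, 4)])  # [10, 54, 198, 378]
```

The order-1 sum is the trace and the order-$n$ sum is the determinant, which gives two quick sanity checks on any such routine.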

8.12 Block Matrices and Determinants

The following theorem (proved in Problem 8.36) is the main result of this section.

THEOREM 8.12: Suppose $M$ is an upper (lower) triangular block matrix with the diagonal blocks $A_1, A_2, \ldots, A_n$. Then
\[ \det(M) = \det(A_1) \det(A_2) \cdots \det(A_n) \]

EXAMPLE 8.15 Find $|M|$ where $M = \begin{bmatrix} 2 & 3 & 4 & 7 & 8 \\ -1 & 5 & 3 & 2 & 1 \\ 0 & 0 & 2 & 1 & 5 \\ 0 & 0 & 3 & -1 & 4 \\ 0 & 0 & 5 & 2 & 6 \end{bmatrix}$.

Note that $M$ is an upper triangular block matrix. Evaluate the determinant of each diagonal block:
\[ \begin{vmatrix} 2 & 3 \\ -1 & 5 \end{vmatrix} = 10 + 3 = 13, \qquad \begin{vmatrix} 2 & 1 & 5 \\ 3 & -1 & 4 \\ 5 & 2 & 6 \end{vmatrix} = -12 + 20 + 30 + 25 - 16 - 18 = 29 \]
Then $|M| = 13(29) = 377$.

Remark: Suppose $M = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$, where $A, B, C, D$ are square matrices. Then it is not generally true that $|M| = |A||D| - |B||C|$. (See Problem 8.68.)
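The block computation of Example 8.15 can be checked against a direct evaluation of the full $5 \times 5$ determinant. This is a small sketch; the slicing that extracts the two diagonal blocks assumes the block sizes 2 and 3 read off from the example.

```python
def det(m):
    """Determinant by first-row expansion; the empty matrix has determinant 1."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


M = [[2, 3, 4, 7, 8],
     [-1, 5, 3, 2, 1],
     [0, 0, 2, 1, 5],
     [0, 0, 3, -1, 4],
     [0, 0, 5, 2, 6]]

A1 = [row[:2] for row in M[:2]]  # upper-left 2 x 2 diagonal block
A2 = [row[2:] for row in M[2:]]  # lower-right 3 x 3 diagonal block

print(det(A1), det(A2), det(A1) * det(A2), det(M))  # 13 29 377 377
```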

8.13 Determinants and Volume

Determinants are related to the notions of area and volume as follows. Let $u_1, u_2, \ldots, u_n$ be vectors in $\mathbf{R}^n$. Let $S$ be the (solid) parallelepiped determined by the vectors; that is,
\[ S = \{a_1u_1 + a_2u_2 + \cdots + a_nu_n : 0 \le a_i \le 1 \text{ for } i = 1, \ldots, n\} \]
(When $n = 2$, $S$ is a parallelogram.) Let $V(S)$ denote the volume of $S$ (or area of $S$ when $n = 2$). Then
\[ V(S) = \text{absolute value of } \det(A) \]
where $A$ is the matrix with rows $u_1, u_2, \ldots, u_n$. In general, $V(S) = 0$ if and only if the vectors $u_1, \ldots, u_n$ do not form a coordinate system for $\mathbf{R}^n$ (i.e., if and only if the vectors are linearly dependent).
EXAMPLE 8.16 Let $u_1 = (1, 1, 0)$, $u_2 = (1, 1, 1)$, $u_3 = (0, 2, 3)$. Find the volume $V(S)$ of the parallelepiped $S$ in $\mathbf{R}^3$ (Fig. 8-2) determined by the three vectors.

[Figure 8-2: the parallelepiped determined by $u_1$, $u_2$, $u_3$, drawn on $x$, $y$, $z$ axes with origin 0]

Evaluate the determinant of the matrix whose rows are $u_1, u_2, u_3$:
\[ \begin{vmatrix} 1 & 1 & 0 \\ 1 & 1 & 1 \\ 0 & 2 & 3 \end{vmatrix} = 3 + 0 + 0 - 0 - 2 - 3 = -2 \]
Hence, $V(S) = |-2| = 2$.
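The volume formula is a one-line wrapper around the determinant. In the sketch below the function name `volume` is illustrative; the second call uses the vectors of Problem 8.11(b), which are linearly dependent and therefore span a degenerate (flat) parallelepiped.

```python
def det(m):
    """Determinant by first-row expansion; the empty matrix has determinant 1."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def volume(vectors):
    """Volume of the parallelepiped whose edges are the given n vectors in R^n."""
    return abs(det([list(v) for v in vectors]))


print(volume([(1, 1, 0), (1, 1, 1), (0, 2, 3)]))   # 2
print(volume([(1, 2, 4), (2, 1, -3), (5, 7, 9)]))  # 0
```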

8.14 Determinant of a Linear Operator

Let $F$ be a linear operator on a vector space $V$ with finite dimension. Let $A$ be the matrix representation of $F$ relative to some basis $S$ of $V$. Then we define the determinant of $F$, written $\det(F)$, by
\[ \det(F) = |A| \]
If $B$ were another matrix representation of $F$ relative to another basis $S'$ of $V$, then $A$ and $B$ are similar matrices (Theorem 6.7) and $|B| = |A|$ (Theorem 8.7). In other words, the above definition of $\det(F)$ is independent of the particular basis $S$ of $V$. (We say that the definition is well defined.)

The next theorem (to be proved in Problem 8.62) follows from analogous theorems on matrices.

THEOREM 8.13: Let $F$ and $G$ be linear operators on a vector space $V$. Then
(i) $\det(F \circ G) = \det(F) \det(G)$.
(ii) $F$ is invertible if and only if $\det(F) \neq 0$.

EXAMPLE 8.17 Let $F$ be the following linear operator on $\mathbf{R}^3$ and let $A$ be the matrix that represents $F$ relative to the usual basis of $\mathbf{R}^3$:
\[ F(x, y, z) = (2x - 4y + z, \ x - 2y + 3z, \ 5x + y - z) \quad \text{and} \quad A = \begin{bmatrix} 2 & -4 & 1 \\ 1 & -2 & 3 \\ 5 & 1 & -1 \end{bmatrix} \]
Then
\[ \det(F) = |A| = 4 - 60 + 1 + 10 - 6 - 4 = -55 \]
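One way to obtain the matrix of an operator in code is to apply the operator to the usual basis vectors and use the images as columns. The sketch below does this for the operator of Example 8.17; the helper name `matrix_of` is hypothetical, and the operator is assumed to be given as an ordinary Python function of its coordinates.

```python
def det(m):
    """Determinant by first-row expansion; the empty matrix has determinant 1."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def matrix_of(op, n):
    """Matrix of a linear operator on R^n relative to the usual basis."""
    # Column j of the matrix is the image of the j-th usual basis vector.
    columns = [op(*[1 if i == j else 0 for i in range(n)]) for j in range(n)]
    return [list(row) for row in zip(*columns)]


def F(x, y, z):
    return (2 * x - 4 * y + z, x - 2 * y + 3 * z, 5 * x + y - z)


A = matrix_of(F, 3)
print(A)       # [[2, -4, 1], [1, -2, 3], [5, 1, -1]]
print(det(A))  # -55
```

Because similar matrices have equal determinants, building the matrix from any other basis would give the same value of $\det(F)$.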

8.15 Multilinearity and Determinants

Let $V$ be a vector space over a field $K$. Let $\mathcal{A} = V^n$; that is, $\mathcal{A}$ consists of all the $n$-tuples
\[ A = (A_1, A_2, \ldots, A_n) \]
where the $A_i$ are vectors in $V$. The following definitions apply.

DEFINITION: A function $D: \mathcal{A} \to K$ is said to be multilinear if it is linear in each component:
(i) If $A_i = B + C$, then $D(A) = D(\ldots, B + C, \ldots) = D(\ldots, B, \ldots) + D(\ldots, C, \ldots)$
(ii) If $A_i = kB$, where $k \in K$, then $D(A) = D(\ldots, kB, \ldots) = kD(\ldots, B, \ldots)$
We also say $n$-linear for multilinear if there are $n$ components.

DEFINITION: A function $D: \mathcal{A} \to K$ is said to be alternating if $D(A) = 0$ whenever $A$ has two identical elements:
\[ D(A_1, A_2, \ldots, A_n) = 0 \quad \text{whenever} \quad A_i = A_j, \ i \neq j \]

Now let $\mathbf{M}$ denote the set of all $n$-square matrices $A$ over a field $K$. We may view $A$ as an $n$-tuple consisting of its row vectors $A_1, A_2, \ldots, A_n$; that is, we may view $A$ in the form $A = (A_1, A_2, \ldots, A_n)$.

The following theorem (proved in Problem 8.37) characterizes the determinant function.

THEOREM 8.14: There exists a unique function $D: \mathbf{M} \to K$ such that
(i) $D$ is multilinear, (ii) $D$ is alternating, (iii) $D(I) = 1$.
This function $D$ is the determinant function; that is, $D(A) = |A|$ for any matrix $A \in \mathbf{M}$.
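As a small worked check of the three properties in Theorem 8.14, consider the case $n = 2$, where $D(A) = a_{11}a_{22} - a_{12}a_{21}$ for a matrix with rows $A_1 = (a_{11}, a_{12})$ and $A_2 = (a_{21}, a_{22})$. The row vectors $B = (b_1, b_2)$ and $C = (c_1, c_2)$ below are illustrative symbols introduced only for this verification.

```latex
% (i) Additivity in the first row (the second row is handled in the same way):
\begin{align*}
D(B + C,\, A_2) &= (b_1 + c_1)a_{22} - (b_2 + c_2)a_{21} \\
                &= (b_1 a_{22} - b_2 a_{21}) + (c_1 a_{22} - c_2 a_{21})
                 = D(B, A_2) + D(C, A_2) \\
% (i) Homogeneity in the first row:
D(kB,\, A_2)    &= k b_1 a_{22} - k b_2 a_{21} = k\,D(B, A_2) \\
% (ii) Alternating: two identical rows give zero.
D(B,\, B)       &= b_1 b_2 - b_2 b_1 = 0 \\
% (iii) Normalization on the identity matrix:
D(I)            &= 1 \cdot 1 - 0 \cdot 0 = 1
\end{align*}
```

The theorem asserts that these three properties leave no freedom: any function satisfying them must coincide with the determinant.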

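SOLVED PROBLEMS

Computation of Determinants

8.1. Evaluate the determinant of each of the following matrices:
(a) $A = \begin{bmatrix} 6 & 5 \\ 2 & 3 \end{bmatrix}$, (b) $B = \begin{bmatrix} 2 & -3 \\ 4 & 7 \end{bmatrix}$, (c) $C = \begin{bmatrix} 4 & -5 \\ -1 & -2 \end{bmatrix}$, (d) $D = \begin{bmatrix} t - 5 & 6 \\ 3 & t + 2 \end{bmatrix}$

Use the formula $\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc$:
(a) $|A| = 6(3) - 5(2) = 18 - 10 = 8$
(b) $|B| = 14 + 12 = 26$
(c) $|C| = -8 - 5 = -13$
(d) $|D| = (t - 5)(t + 2) - 18 = t^2 - 3t - 10 - 18 = t^2 - 3t - 28$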

8.2. Evaluate the determinant of each of the following matrices:
(a) $A = \begin{bmatrix} 2 & 3 & 4 \\ 5 & 4 & 3 \\ 1 & 2 & 1 \end{bmatrix}$, (b) $B = \begin{bmatrix} 1 & -2 & 3 \\ 2 & 4 & -1 \\ 1 & 5 & -2 \end{bmatrix}$, (c) $C = \begin{bmatrix} 1 & 3 & -5 \\ 3 & -1 & 2 \\ 1 & -2 & 1 \end{bmatrix}$

Use the diagram in Fig. 8-1 to obtain the six products:
(a) $|A| = 2(4)(1) + 3(3)(1) + 4(2)(5) - 1(4)(4) - 2(3)(2) - 1(3)(5) = 8 + 9 + 40 - 16 - 12 - 15 = 14$
(b) $|B| = -8 + 2 + 30 - 12 + 5 - 8 = 9$
(c) $|C| = -1 + 6 + 30 - 5 + 4 - 9 = 25$

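8.3. Compute the determinant of each of the following matrices:
(a) $A = \begin{bmatrix} 2 & 3 & 4 \\ 5 & 6 & 7 \\ 8 & 9 & 1 \end{bmatrix}$, (b) $B = \begin{bmatrix} 4 & -6 & 8 & 9 \\ 0 & -2 & 7 & -3 \\ 0 & 0 & 5 & 6 \\ 0 & 0 & 0 & 3 \end{bmatrix}$, (c) $C = \begin{bmatrix} \frac{1}{2} & -1 & -\frac{1}{3} \\ \frac{3}{4} & \frac{1}{2} & -1 \\ 1 & -4 & 1 \end{bmatrix}$

(a) One can simplify the entries by first subtracting twice the first row from the second row; that is, by applying the row operation “Replace $R_2$ by $-2R_1 + R_2$.” Then
\[ |A| = \begin{vmatrix} 2 & 3 & 4 \\ 5 & 6 & 7 \\ 8 & 9 & 1 \end{vmatrix} = \begin{vmatrix} 2 & 3 & 4 \\ 1 & 0 & -1 \\ 8 & 9 & 1 \end{vmatrix} = 0 - 24 + 36 - 0 + 18 - 3 = 27 \]

(b) $B$ is triangular, so $|B|$ = product of the diagonal entries $= -120$.

(c) The arithmetic is simpler if fractions are first eliminated. Hence, multiply the first row $R_1$ by 6 and the second row $R_2$ by 4. Each multiplication scales the determinant by the same factor, so
\[ 24|C| = \begin{vmatrix} 3 & -6 & -2 \\ 3 & 2 & -4 \\ 1 & -4 & 1 \end{vmatrix} = 6 + 24 + 24 + 4 - 48 + 18 = 28, \quad \text{so} \quad |C| = \frac{28}{24} = \frac{7}{6} \]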

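8.4. Compute the determinant of each of the following matrices:
(a) $A = \begin{bmatrix} 2 & 5 & -3 & -2 \\ -2 & -3 & 2 & -5 \\ 1 & 3 & -2 & 2 \\ -1 & -6 & 4 & 3 \end{bmatrix}$, (b) $B = \begin{bmatrix} 6 & 2 & 1 & 0 & 5 \\ 2 & 1 & 1 & -2 & 1 \\ 1 & 1 & 2 & -2 & 3 \\ 3 & 0 & 2 & 3 & -1 \\ -1 & -1 & -3 & 4 & 2 \end{bmatrix}$

(a) Use $a_{31} = 1$ as a pivot to put 0's in the first column, by applying the row operations “Replace $R_1$ by $-2R_3 + R_1$,” “Replace $R_2$ by $2R_3 + R_2$,” and “Replace $R_4$ by $R_3 + R_4$.” Then
\[ |A| = \begin{vmatrix} 2 & 5 & -3 & -2 \\ -2 & -3 & 2 & -5 \\ 1 & 3 & -2 & 2 \\ -1 & -6 & 4 & 3 \end{vmatrix} = \begin{vmatrix} 0 & -1 & 1 & -6 \\ 0 & 3 & -2 & -1 \\ 1 & 3 & -2 & 2 \\ 0 & -3 & 2 & 5 \end{vmatrix} = \begin{vmatrix} -1 & 1 & -6 \\ 3 & -2 & -1 \\ -3 & 2 & 5 \end{vmatrix} = 10 + 3 - 36 + 36 - 2 - 15 = -4 \]

(b) First reduce $|B|$ to a determinant of order 4, and then to a determinant of order 3, for which we can use Fig. 8-1. First use $b_{22} = 1$ as a pivot to put 0's in the second column, by applying the row operations “Replace $R_1$ by $-2R_2 + R_1$,” “Replace $R_3$ by $-R_2 + R_3$,” and “Replace $R_5$ by $R_2 + R_5$.” Then
\[ |B| = \begin{vmatrix} 2 & 0 & -1 & 4 & 3 \\ 2 & 1 & 1 & -2 & 1 \\ -1 & 0 & 1 & 0 & 2 \\ 3 & 0 & 2 & 3 & -1 \\ 1 & 0 & -2 & 2 & 3 \end{vmatrix} = \begin{vmatrix} 2 & -1 & 4 & 3 \\ -1 & 1 & 0 & 2 \\ 3 & 2 & 3 & -1 \\ 1 & -2 & 2 & 3 \end{vmatrix} = \begin{vmatrix} 1 & -1 & 4 & 5 \\ 0 & 1 & 0 & 0 \\ 5 & 2 & 3 & -5 \\ -1 & -2 & 2 & 7 \end{vmatrix} \]
where the last step uses the entry 1 in the second row and second column as a pivot and applies the column operations “Replace $C_1$ by $C_2 + C_1$” and “Replace $C_4$ by $-2C_2 + C_4$.” Expanding by the second row gives
\[ |B| = \begin{vmatrix} 1 & 4 & 5 \\ 5 & 3 & -5 \\ -1 & 2 & 7 \end{vmatrix} = 21 + 20 + 50 + 15 + 10 - 140 = -34 \]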

Cofactors, Classical Adjoints, Minors, Principal Minors

8.5. Let $A = \begin{bmatrix} 2 & 1 & -3 & 4 \\ 5 & -4 & 7 & -2 \\ 4 & 0 & 6 & -3 \\ 3 & -2 & 5 & 2 \end{bmatrix}$.
(a) Find $A_{23}$, the cofactor (signed minor) of 7 in $A$.
(b) Find the minor and the signed minor of the submatrix $M = A(2, 4; 2, 3)$.
(c) Find the principal minor determined by the first and third diagonal entries; that is, by $M = A(1, 3; 1, 3)$.

(a) Take the determinant of the submatrix of $A$ obtained by deleting row 2 and column 3 (those which contain the 7), and multiply the determinant by $(-1)^{2+3}$:
\[ A_{23} = -\begin{vmatrix} 2 & 1 & 4 \\ 4 & 0 & -3 \\ 3 & -2 & 2 \end{vmatrix} = -(-61) = 61 \]
The exponent $2 + 3$ comes from the subscripts of $A_{23}$; that is, from the fact that 7 appears in row 2 and column 3.

(b) The row subscripts are 2 and 4 and the column subscripts are 2 and 3. Hence, the minor is the determinant
\[ |M| = \begin{vmatrix} a_{22} & a_{23} \\ a_{42} & a_{43} \end{vmatrix} = \begin{vmatrix} -4 & 7 \\ -2 & 5 \end{vmatrix} = -20 + 14 = -6 \]
and the signed minor is $(-1)^{2+4+2+3} |M| = -|M| = -(-6) = 6$.

(c) The principal minor is the determinant
\[ |M| = \begin{vmatrix} a_{11} & a_{13} \\ a_{31} & a_{33} \end{vmatrix} = \begin{vmatrix} 2 & -3 \\ 4 & 6 \end{vmatrix} = 12 + 12 = 24 \]
Note that now the diagonal entries of the submatrix are diagonal entries of the original matrix. Also, the sign of the principal minor is positive.

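8.6. Let $B = \begin{bmatrix} 1 & 1 & 1 \\ 2 & 3 & 4 \\ 5 & 8 & 9 \end{bmatrix}$. Find: (a) $|B|$, (b) $\operatorname{adj} B$, (c) $B^{-1}$ using $\operatorname{adj} B$.

(a) $|B| = 27 + 20 + 16 - 15 - 32 - 18 = -2$

(b) Take the transpose of the matrix of cofactors:
\[ \operatorname{adj} B = \begin{bmatrix} \begin{vmatrix} 3 & 4 \\ 8 & 9 \end{vmatrix} & -\begin{vmatrix} 2 & 4 \\ 5 & 9 \end{vmatrix} & \begin{vmatrix} 2 & 3 \\ 5 & 8 \end{vmatrix} \\ -\begin{vmatrix} 1 & 1 \\ 8 & 9 \end{vmatrix} & \begin{vmatrix} 1 & 1 \\ 5 & 9 \end{vmatrix} & -\begin{vmatrix} 1 & 1 \\ 5 & 8 \end{vmatrix} \\ \begin{vmatrix} 1 & 1 \\ 3 & 4 \end{vmatrix} & -\begin{vmatrix} 1 & 1 \\ 2 & 4 \end{vmatrix} & \begin{vmatrix} 1 & 1 \\ 2 & 3 \end{vmatrix} \end{bmatrix}^T = \begin{bmatrix} -5 & 2 & 1 \\ -1 & 4 & -3 \\ 1 & -2 & 1 \end{bmatrix}^T = \begin{bmatrix} -5 & -1 & 1 \\ 2 & 4 & -2 \\ 1 & -3 & 1 \end{bmatrix} \]

(c) Because $|B| \neq 0$,
\[ B^{-1} = \frac{1}{|B|} (\operatorname{adj} B) = \frac{1}{-2} \begin{bmatrix} -5 & -1 & 1 \\ 2 & 4 & -2 \\ 1 & -3 & 1 \end{bmatrix} = \begin{bmatrix} \frac{5}{2} & \frac{1}{2} & -\frac{1}{2} \\ -1 & -2 & 1 \\ -\frac{1}{2} & \frac{3}{2} & -\frac{1}{2} \end{bmatrix} \]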

8.7. Let $A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 0 & 7 & 8 \end{bmatrix}$, and let $S_k$ denote the sum of its principal minors of order $k$. Find $S_k$ for (a) $k = 1$, (b) $k = 2$, (c) $k = 3$.

(a) The principal minors of order 1 are the diagonal elements. Thus, $S_1$ is the trace of $A$; that is,
\[ S_1 = \operatorname{tr}(A) = 1 + 5 + 8 = 14 \]

(b) The principal minors of order 2 are the cofactors of the diagonal elements. Thus,
\[ S_2 = A_{11} + A_{22} + A_{33} = \begin{vmatrix} 5 & 6 \\ 7 & 8 \end{vmatrix} + \begin{vmatrix} 1 & 3 \\ 0 & 8 \end{vmatrix} + \begin{vmatrix} 1 & 2 \\ 4 & 5 \end{vmatrix} = -2 + 8 - 3 = 3 \]

(c) There is only one principal minor of order 3, the determinant of $A$. Then
\[ S_3 = |A| = 40 + 0 + 84 - 0 - 42 - 64 = 18 \]

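8.8. Let $A = \begin{bmatrix} 1 & 3 & 0 & -1 \\ -4 & 2 & 5 & 1 \\ 1 & 0 & 3 & -2 \\ 3 & -2 & 1 & 4 \end{bmatrix}$. Find the number $N_k$ and sum $S_k$ of principal minors of order: (a) $k = 1$, (b) $k = 2$, (c) $k = 3$, (d) $k = 4$.

Each (nonempty) subset of the diagonal (or equivalently, each nonempty subset of $\{1, 2, 3, 4\}$) determines a principal minor of $A$, and $N_k = \binom{n}{k} = \dfrac{n!}{k!(n-k)!}$ of them are of order $k$. Thus,
\[ N_1 = \binom{4}{1} = 4, \quad N_2 = \binom{4}{2} = 6, \quad N_3 = \binom{4}{3} = 4, \quad N_4 = \binom{4}{4} = 1 \]

(a) $S_1 = |1| + |2| + |3| + |4| = 1 + 2 + 3 + 4 = 10$

(b) $S_2 = \begin{vmatrix} 1 & 3 \\ -4 & 2 \end{vmatrix} + \begin{vmatrix} 1 & 0 \\ 1 & 3 \end{vmatrix} + \begin{vmatrix} 1 & -1 \\ 3 & 4 \end{vmatrix} + \begin{vmatrix} 2 & 5 \\ 0 & 3 \end{vmatrix} + \begin{vmatrix} 2 & 1 \\ -2 & 4 \end{vmatrix} + \begin{vmatrix} 3 & -2 \\ 1 & 4 \end{vmatrix} = 14 + 3 + 7 + 6 + 10 + 14 = 54$

(c) $S_3 = \begin{vmatrix} 1 & 3 & 0 \\ -4 & 2 & 5 \\ 1 & 0 & 3 \end{vmatrix} + \begin{vmatrix} 1 & 3 & -1 \\ -4 & 2 & 1 \\ 3 & -2 & 4 \end{vmatrix} + \begin{vmatrix} 1 & 0 & -1 \\ 1 & 3 & -2 \\ 3 & 1 & 4 \end{vmatrix} + \begin{vmatrix} 2 & 5 & 1 \\ 0 & 3 & -2 \\ -2 & 1 & 4 \end{vmatrix} = 57 + 65 + 22 + 54 = 198$

(d) $S_4 = \det(A) = 378$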

Determinants and Systems of Linear Equations

8.9. Use determinants to solve the system
\[ \begin{cases} 3y + 2x = z + 1 \\ 3x + 2z = 8 - 5y \\ 3z - 1 = x - 2y \end{cases} \]
First arrange the equations in standard form, then compute the determinant $D$ of the matrix of coefficients:
\[ \begin{cases} 2x + 3y - z = 1 \\ 3x + 5y + 2z = 8 \\ x - 2y - 3z = -1 \end{cases} \quad \text{and} \quad D = \begin{vmatrix} 2 & 3 & -1 \\ 3 & 5 & 2 \\ 1 & -2 & -3 \end{vmatrix} = -30 + 6 + 6 + 5 + 8 + 27 = 22 \]
Because $D \neq 0$, the system has a unique solution. To compute $N_x$, $N_y$, $N_z$, we replace, respectively, the coefficients of $x, y, z$ in the matrix of coefficients by the constant terms. Then
\[ N_x = \begin{vmatrix} 1 & 3 & -1 \\ 8 & 5 & 2 \\ -1 & -2 & -3 \end{vmatrix} = 66, \quad N_y = \begin{vmatrix} 2 & 1 & -1 \\ 3 & 8 & 2 \\ 1 & -1 & -3 \end{vmatrix} = -22, \quad N_z = \begin{vmatrix} 2 & 3 & 1 \\ 3 & 5 & 8 \\ 1 & -2 & -1 \end{vmatrix} = 44 \]
Thus,
\[ x = \frac{N_x}{D} = \frac{66}{22} = 3, \quad y = \frac{N_y}{D} = \frac{-22}{22} = -1, \quad z = \frac{N_z}{D} = \frac{44}{22} = 2 \]

8.10. Consider the system
\[ \begin{cases} kx + y + z = 1 \\ x + ky + z = 1 \\ x + y + kz = 1 \end{cases} \]
Use determinants to find those values of $k$ for which the system has (a) a unique solution, (b) more than one solution, (c) no solution.

(a) The system has a unique solution when $D \neq 0$, where $D$ is the determinant of the matrix of coefficients. Compute
\[ D = \begin{vmatrix} k & 1 & 1 \\ 1 & k & 1 \\ 1 & 1 & k \end{vmatrix} = k^3 + 1 + 1 - k - k - k = k^3 - 3k + 2 = (k - 1)^2 (k + 2) \]
Thus, the system has a unique solution when $(k - 1)^2 (k + 2) \neq 0$; that is, when $k \neq 1$ and $k \neq -2$.

(b and c) Gaussian elimination shows that the system has more than one solution when $k = 1$, and the system has no solution when $k = -2$.

Miscellaneous Problems

8.11. Find the volume $V(S)$ of the parallelepiped $S$ in $\mathbf{R}^3$ determined by the vectors:
(a) $u_1 = (1, 1, 1)$, $u_2 = (1, 3, -4)$, $u_3 = (1, 2, -5)$.
(b) $u_1 = (1, 2, 4)$, $u_2 = (2, 1, -3)$, $u_3 = (5, 7, 9)$.

$V(S)$ is the absolute value of the determinant of the matrix $M$ whose rows are the given vectors. Thus,

(a) $|M| = \begin{vmatrix} 1 & 1 & 1 \\ 1 & 3 & -4 \\ 1 & 2 & -5 \end{vmatrix} = -15 - 4 + 2 - 3 + 8 + 5 = -7$. Hence, $V(S) = |-7| = 7$.

(b) $|M| = \begin{vmatrix} 1 & 2 & 4 \\ 2 & 1 & -3 \\ 5 & 7 & 9 \end{vmatrix} = 9 - 30 + 56 - 20 + 21 - 36 = 0$. Thus, $V(S) = 0$, or, in other words, $u_1, u_2, u_3$ lie in a plane and are linearly dependent.

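8.12. Find $\det(M)$ where
\[ M = \begin{bmatrix} 3 & 4 & 0 & 0 & 0 \\ 2 & 5 & 0 & 0 & 0 \\ 0 & 9 & 2 & 0 & 0 \\ 0 & 5 & 0 & 6 & 7 \\ 0 & 0 & 4 & 3 & 4 \end{bmatrix} = \left[ \begin{array}{cc|c|cc} 3 & 4 & 0 & 0 & 0 \\ 2 & 5 & 0 & 0 & 0 \\ \hline 0 & 9 & 2 & 0 & 0 \\ \hline 0 & 5 & 0 & 6 & 7 \\ 0 & 0 & 4 & 3 & 4 \end{array} \right] \]

$M$ is a (lower) triangular block matrix; hence, evaluate the determinant of each diagonal block:
\[ \begin{vmatrix} 3 & 4 \\ 2 & 5 \end{vmatrix} = 15 - 8 = 7, \quad |2| = 2, \quad \begin{vmatrix} 6 & 7 \\ 3 & 4 \end{vmatrix} = 24 - 21 = 3 \]
Thus, $|M| = 7(2)(3) = 42$.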

8.13. Find the determinant of $F: \mathbf{R}^3 \to \mathbf{R}^3$ defined by
\[ F(x, y, z) = (x + 3y - 4z, \ 2y + 7z, \ x + 5y - 3z) \]

The determinant of a linear operator $F$ is equal to the determinant of any matrix that represents $F$. Thus first find the matrix $A$ representing $F$ in the usual basis (whose rows, respectively, consist of the coefficients of $x, y, z$). Then
\[ A = \begin{bmatrix} 1 & 3 & -4 \\ 0 & 2 & 7 \\ 1 & 5 & -3 \end{bmatrix}, \quad \text{and so} \quad \det(F) = |A| = -6 + 21 + 0 + 8 - 35 - 0 = -12 \]

8.14. Write out $g = g(x_1, x_2, x_3, x_4)$ explicitly where
\[ g(x_1, x_2, \ldots, x_n) = \prod_{i<j} (x_i - x_j) \]

. . . $k$ but $i$ precedes $k$ in $\sigma$, there is a pair $(i^*, k^*)$ such that
\[ i^* < k^* \quad \text{and} \quad \sigma(i^*) > \sigma(k^*) \qquad (1) \]
and vice versa. Thus, $\sigma$ is even or odd according to whether there is an even or an odd number of pairs satisfying (1).

Choose $i^*$ and $k^*$ so that $\sigma(i^*) = i$ and $\sigma(k^*) = k$. Then $i > k$ if and only if $\sigma(i^*) > \sigma(k^*)$, and $i$ precedes $k$ in $\sigma$ if and only if $i^* < k^*$.

8.19. Consider the polynomials $g = g(x_1, \ldots, x_n)$ and $\sigma(g)$, defined by
\[ g = g(x_1, \ldots, x_n) = \prod_{i<j} (x_i - x_j) \quad \text{and} \quad \sigma(g) = \prod_{i<j} (x_{\sigma(i)} - x_{\sigma(j)}) \]

. . . $r$ and $j \le r$, then $m_{ij} = 0$. Thus, we need only consider those permutations $\sigma$ such that
\[ \sigma\{r + 1, r + 2, \ldots, r + s\} = \{r + 1, r + 2, \ldots, r + s\} \quad \text{and} \quad \sigma\{1, 2, \ldots, r\} = \{1, 2, \ldots, r\} \]
Let $\sigma_1(k) = \sigma(k)$ for $k \le r$, and let $\sigma_2(k) = \sigma(r + k) - r$ for $k \le s$. Then
\[ (\operatorname{sgn} \sigma)\, m_{1\sigma(1)} m_{2\sigma(2)} \cdots m_{n\sigma(n)} = (\operatorname{sgn} \sigma_1)\, a_{1\sigma_1(1)} a_{2\sigma_1(2)} \cdots a_{r\sigma_1(r)} \, (\operatorname{sgn} \sigma_2)\, b_{1\sigma_2(1)} b_{2\sigma_2(2)} \cdots b_{s\sigma_2(s)} \]
which implies $\det(M) = \det(A) \det(B)$.

8.37. Prove Theorem 8.14: There exists a unique function $D: \mathbf{M} \to K$ such that (i) $D$ is multilinear, (ii) $D$ is alternating, (iii) $D(I) = 1$. This function $D$ is the determinant function; that is, $D(A) = |A|$.

Let $D$ be the determinant function, $D(A) = |A|$. We must show that $D$ satisfies (i), (ii), and (iii), and that $D$ is the only function satisfying (i), (ii), and (iii).

By Theorem 8.2, $D$ satisfies (ii) and (iii). Hence, we show that it is multilinear. Suppose the $i$th row of $A = [a_{ij}]$ has the form $(b_{i1} + c_{i1}, b_{i2} + c_{i2}, \ldots, b_{in} + c_{in})$. Then
\begin{align*}
D(A) &= D(A_1, \ldots, B_i + C_i, \ldots, A_n) \\
&= \sum_{S_n} (\operatorname{sgn} \sigma)\, a_{1\sigma(1)} \cdots a_{i-1,\sigma(i-1)} (b_{i\sigma(i)} + c_{i\sigma(i)}) \cdots a_{n\sigma(n)} \\
&= \sum_{S_n} (\operatorname{sgn} \sigma)\, a_{1\sigma(1)} \cdots b_{i\sigma(i)} \cdots a_{n\sigma(n)} + \sum_{S_n} (\operatorname{sgn} \sigma)\, a_{1\sigma(1)} \cdots c_{i\sigma(i)} \cdots a_{n\sigma(n)} \\
&= D(A_1, \ldots, B_i, \ldots, A_n) + D(A_1, \ldots, C_i, \ldots, A_n)
\end{align*}
Also, by Theorem 8.3(ii),
\[ D(A_1, \ldots, kA_i, \ldots, A_n) = kD(A_1, \ldots, A_i, \ldots, A_n) \]
Thus, $D$ is multilinear; that is, $D$ satisfies (i).

We next must prove the uniqueness of $D$. Suppose $D$ satisfies (i), (ii), and (iii). If $\{e_1, \ldots, e_n\}$ is the usual basis of $K^n$, then, by (iii), $D(e_1, e_2, \ldots, e_n) = D(I) = 1$. Using (ii), we also have that
\[ D(e_{i_1}, e_{i_2}, \ldots, e_{i_n}) = \operatorname{sgn} \sigma, \quad \text{where} \quad \sigma = i_1 i_2 \cdots i_n \qquad (1) \]
Now suppose $A = [a_{ij}]$. Observe that the $k$th row $A_k$ of $A$ is
\[ A_k = (a_{k1}, a_{k2}, \ldots, a_{kn}) = a_{k1}e_1 + a_{k2}e_2 + \cdots + a_{kn}e_n \]
Thus,
\[ D(A) = D(a_{11}e_1 + \cdots + a_{1n}e_n, \ a_{21}e_1 + \cdots + a_{2n}e_n, \ \ldots, \ a_{n1}e_1 + \cdots + a_{nn}e_n) \]
Using the multilinearity of $D$, we can write $D(A)$ as a sum of terms of the form
\[ D(A) = \sum D(a_{1i_1}e_{i_1}, a_{2i_2}e_{i_2}, \ldots, a_{ni_n}e_{i_n}) = \sum (a_{1i_1} a_{2i_2} \cdots a_{ni_n}) D(e_{i_1}, e_{i_2}, \ldots, e_{i_n}) \qquad (2) \]
where the sum is summed over all sequences $i_1 i_2 \ldots i_n$, where $i_k \in \{1, \ldots, n\}$. If two of the indices are equal, say $i_j = i_k$ but $j \neq k$, then, by (ii),
\[ D(e_{i_1}, e_{i_2}, \ldots, e_{i_n}) = 0 \]
Accordingly, the sum in (2) need only be summed over all permutations $\sigma = i_1 i_2 \cdots i_n$. Using (1), we finally have that
\[ D(A) = \sum_{\sigma} (a_{1i_1} a_{2i_2} \cdots a_{ni_n}) D(e_{i_1}, e_{i_2}, \ldots, e_{i_n}) = \sum_{\sigma} (\operatorname{sgn} \sigma)\, a_{1i_1} a_{2i_2} \cdots a_{ni_n}, \quad \text{where} \quad \sigma = i_1 i_2 \cdots i_n \]
Hence, $D$ is the determinant function, and so the theorem is proved.

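SUPPLEMENTARY PROBLEMS

Computation of Determinants

8.38. Evaluate:
(a) $\begin{vmatrix} 2 & 6 \\ 4 & 1 \end{vmatrix}$, (b) $\begin{vmatrix} 5 & 1 \\ 3 & -2 \end{vmatrix}$, (c) $\begin{vmatrix} -2 & 8 \\ -5 & -3 \end{vmatrix}$, (d) $\begin{vmatrix} 4 & 9 \\ 1 & -3 \end{vmatrix}$, (e) $\begin{vmatrix} a + b & a \\ b & a + b \end{vmatrix}$

8.39. Find all $t$ such that (a) $\begin{vmatrix} t - 4 & 3 \\ 2 & t - 9 \end{vmatrix} = 0$, (b) $\begin{vmatrix} t - 1 & 4 \\ 3 & t - 2 \end{vmatrix} = 0$

8.40. Compute the determinant of each of the following matrices:
(a) $\begin{bmatrix} 2 & 1 & 1 \\ 0 & 5 & -2 \\ 1 & -3 & 4 \end{bmatrix}$, (b) $\begin{bmatrix} 3 & -2 & -4 \\ 2 & 5 & -1 \\ 0 & 6 & 1 \end{bmatrix}$, (c) $\begin{bmatrix} -2 & -1 & 4 \\ 6 & -3 & -2 \\ 4 & 1 & 2 \end{bmatrix}$, (d) $\begin{bmatrix} 7 & 6 & 5 \\ 1 & 2 & 1 \\ 3 & -2 & 1 \end{bmatrix}$

8.41. Find the determinant of each of the following matrices:
(a) $\begin{bmatrix} 1 & 2 & 2 & 3 \\ 1 & 0 & -2 & 0 \\ 3 & -1 & 1 & -2 \\ 4 & -3 & 0 & 2 \end{bmatrix}$, (b) $\begin{bmatrix} 2 & 1 & 3 & 2 \\ 3 & 0 & 1 & -2 \\ 1 & -1 & 4 & 3 \\ 2 & 2 & -1 & 1 \end{bmatrix}$

8.42. Evaluate:
(a) $\begin{vmatrix} 2 & -1 & 3 & -4 \\ 2 & 1 & -2 & 1 \\ 3 & 3 & -5 & 4 \\ 5 & 2 & -1 & 4 \end{vmatrix}$, (b) $\begin{vmatrix} 2 & -1 & 4 & -3 \\ -1 & 1 & 0 & 2 \\ 3 & 2 & 3 & -1 \\ 1 & -2 & 2 & -3 \end{vmatrix}$, (c) $\begin{vmatrix} 1 & -2 & 3 & -1 \\ 1 & 1 & -2 & 0 \\ 2 & 0 & 4 & -5 \\ 1 & 4 & 4 & -6 \end{vmatrix}$

8.43. Evaluate each of the following determinants:
(a) $\begin{vmatrix} 1 & 2 & -1 & 3 & 1 \\ 2 & -1 & 1 & -2 & 3 \\ 3 & 1 & 0 & 2 & -1 \\ 5 & 1 & 2 & -3 & 4 \\ -2 & 3 & -1 & 1 & -2 \end{vmatrix}$, (b) $\begin{vmatrix} 1 & 3 & 5 & 7 & 9 \\ 2 & 4 & 2 & 4 & 2 \\ 0 & 0 & 1 & 2 & 3 \\ 0 & 0 & 5 & 6 & 2 \\ 0 & 0 & 2 & 3 & 1 \end{vmatrix}$, (c) $\begin{vmatrix} 1 & 2 & 3 & 4 & 5 \\ 5 & 4 & 3 & 2 & 1 \\ 0 & 0 & 6 & 5 & 1 \\ 0 & 0 & 0 & 7 & 4 \\ 0 & 0 & 0 & 2 & 3 \end{vmatrix}$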

Cofactors, Classical Adjoints, Inverses

8.44. Find $\det(A)$, $\operatorname{adj} A$, and $A^{-1}$, where
(a) $A = \begin{bmatrix} 1 & 1 & 0 \\ 1 & 1 & 1 \\ 0 & 2 & 1 \end{bmatrix}$, (b) $A = \begin{bmatrix} 1 & 2 & 2 \\ 3 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix}$

8.45. Find the classical adjoint of each matrix in Problem 8.41.

8.46. Let $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$. (a) Find $\operatorname{adj} A$, (b) Show that $\operatorname{adj}(\operatorname{adj} A) = A$, (c) When does $A = \operatorname{adj} A$?

8.47. Show that if $A$ is diagonal (triangular) then $\operatorname{adj} A$ is diagonal (triangular).

8.48. Suppose $A = [a_{ij}]$ is triangular. Show that
(a) $A$ is invertible if and only if each diagonal element $a_{ii} \neq 0$.
(b) The diagonal elements of $A^{-1}$ (if it exists) are $a_{ii}^{-1}$, the reciprocals of the diagonal elements of $A$.

Minors, Principal Minors

8.49. Let $A = \begin{bmatrix} 1 & 2 & 3 & 2 \\ 1 & 0 & -2 & 3 \\ 3 & -1 & 2 & 5 \\ 4 & -3 & 0 & -1 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 3 & -1 & 5 \\ 2 & -3 & 1 & 4 \\ 0 & -5 & 2 & 1 \\ 3 & 0 & 5 & -2 \end{bmatrix}$. Find the minor and the signed minor corresponding to the following submatrices:
(a) $A(1, 4; 3, 4)$, (b) $B(1, 4; 3, 4)$, (c) $A(2, 3; 2, 4)$, (d) $B(2, 3; 2, 4)$.

8.50. For $k = 1, 2, 3$, find the sum $S_k$ of all principal minors of order $k$ for
(a) $A = \begin{bmatrix} 1 & 3 & 2 \\ 2 & -4 & 3 \\ 5 & -2 & 1 \end{bmatrix}$, (b) $B = \begin{bmatrix} 1 & 5 & -4 \\ 2 & 6 & 1 \\ 3 & -2 & 0 \end{bmatrix}$, (c) $C = \begin{bmatrix} 1 & -4 & 3 \\ 2 & 1 & 5 \\ 4 & -7 & 11 \end{bmatrix}$

8.51. For $k = 1, 2, 3, 4$, find the sum $S_k$ of all principal minors of order $k$ for
(a) $A = \begin{bmatrix} 1 & 2 & 3 & -1 \\ 1 & -2 & 0 & 5 \\ 0 & 1 & -2 & 2 \\ 4 & 0 & -1 & -3 \end{bmatrix}$, (b) $B = \begin{bmatrix} 1 & 2 & 1 & 2 \\ 0 & 1 & 2 & 3 \\ 1 & 3 & 0 & 4 \\ 2 & 7 & 4 & 5 \end{bmatrix}$

Determinants and Linear Equations

8.52. Solve the following systems by determinants:
(a) $\begin{cases} 3x + 5y = 8 \\ 4x - 2y = 1 \end{cases}$, (b) $\begin{cases} 2x - 3y = -1 \\ 4x + 7y = -1 \end{cases}$, (c) $\begin{cases} ax - 2by = c \\ 3ax - 5by = 2c \end{cases}$ $(ab \neq 0)$

8.53. Solve the following systems by determinants:
(a) $\begin{cases} 2x - 5y + 2z = 2 \\ x + 2y - 4z = 5 \\ 3x - 4y - 6z = 1 \end{cases}$, (b) $\begin{cases} 2z + 3 = y + 3x \\ x - 3z = 2y + 1 \\ 3y + z = 2 - 2x \end{cases}$

8.54. Prove Theorem 8.11: The system $AX = 0$ has a nonzero solution if and only if $D = |A| = 0$.

Permutations

8.55. Find the parity of the permutations $\sigma = 32154$, $\tau = 13524$, $\pi = 42531$ in $S_5$.

8.56. For the permutations in Problem 8.55, find (a) $\tau \circ \sigma$, (b) $\pi \circ \sigma$, (c) $\sigma^{-1}$, (d) $\tau^{-1}$.

8.57. Let $\tau \in S_n$. Show that $\tau \circ \sigma$ runs through $S_n$ as $\sigma$ runs through $S_n$; that is, $S_n = \{\tau \circ \sigma : \sigma \in S_n\}$.

8.58. Let $\sigma \in S_n$ have the property that $\sigma(n) = n$. Let $\sigma^* \in S_{n-1}$ be defined by $\sigma^*(x) = \sigma(x)$. (a) Show that $\operatorname{sgn} \sigma^* = \operatorname{sgn} \sigma$, (b) Show that as $\sigma$ runs through $S_n$, where $\sigma(n) = n$, $\sigma^*$ runs through $S_{n-1}$; that is, $S_{n-1} = \{\sigma^* : \sigma \in S_n, \ \sigma(n) = n\}$.

8.59. Consider a permutation $\sigma = j_1 j_2 \ldots j_n$. Let $\{e_i\}$ be the usual basis of $K^n$, and let $A$ be the matrix whose $i$th row is $e_{j_i}$ [i.e., $A = (e_{j_1}, e_{j_2}, \ldots, e_{j_n})$]. Show that $|A| = \operatorname{sgn} \sigma$.

Determinant of Linear Operators

8.60. Find the determinant of each of the following linear transformations:
(a) $T: \mathbf{R}^2 \to \mathbf{R}^2$ defined by $T(x, y) = (2x - 9y, 3x - 5y)$,
(b) $T: \mathbf{R}^3 \to \mathbf{R}^3$ defined by $T(x, y, z) = (3x - 2z, 5y + 7z, x + y + z)$,
(c) $T: \mathbf{R}^3 \to \mathbf{R}^2$ defined by $T(x, y, z) = (2x + 7y - 4z, 4x - 6y + 2z)$.

8.61. Let $D: V \to V$ be the differential operator; that is, $D(f(t)) = df/dt$. Find $\det(D)$ if $V$ is the vector space of functions with the following bases: (a) $\{1, t, \ldots, t^5\}$, (b) $\{e^t, e^{2t}, e^{3t}\}$, (c) $\{\sin t, \cos t\}$.

8.62. Prove Theorem 8.13: Let $F$ and $G$ be linear operators on a vector space $V$. Then (i) $\det(F \circ G) = \det(F) \det(G)$, (ii) $F$ is invertible if and only if $\det(F) \neq 0$.

8.63. Prove (a) $\det(\mathbf{1}_V) = 1$, where $\mathbf{1}_V$ is the identity operator, (b) $\det(T^{-1}) = \det(T)^{-1}$ when $T$ is invertible.

Miscellaneous Problems

8.64. Find the volume $V(S)$ of the parallelepiped $S$ in $\mathbf{R}^3$ determined by the following vectors:
(a) $u_1 = (1, 2, -3)$, $u_2 = (3, 4, -1)$, $u_3 = (2, -1, 5)$,
(b) $u_1 = (1, 1, 3)$, $u_2 = (1, -2, -4)$, $u_3 = (4, 1, 5)$.

8.65. Find the volume $V(S)$ of the parallelepiped $S$ in $\mathbf{R}^4$ determined by the following vectors:
\[ u_1 = (1, -2, 5, -1), \quad u_2 = (2, 1, -2, 1), \quad u_3 = (3, 0, 1, -2), \quad u_4 = (1, -1, 4, -1) \]

8.66. Let $V$ be the space of $2 \times 2$ matrices $M = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ over $\mathbf{R}$. Determine whether $D: V \to \mathbf{R}$ is 2-linear (with respect to the rows), where
(a) $D(M) = a + d$, (b) $D(M) = ad$, (c) $D(M) = ac - bd$, (d) $D(M) = ab - cd$, (e) $D(M) = 0$, (f) $D(M) = 1$

8.67. Let $A$ be an $n$-square matrix. Prove $|kA| = k^n |A|$.

8.68. Let $A, B, C, D$ be commuting $n$-square matrices. Consider the $2n$-square block matrix $M = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$. Prove that $|M| = |A||D| - |B||C|$. Show that the result may not be true if the matrices do not commute.

8.69. Suppose $A$ is orthogonal; that is, $A^T A = I$. Show that $\det(A) = \pm 1$.

8.70. Let $V$ be the space of $m$-square matrices viewed as $m$-tuples of row vectors. Suppose $D: V \to K$ is $m$-linear and alternating. Show that
(a) $D(\ldots, A, \ldots, B, \ldots) = -D(\ldots, B, \ldots, A, \ldots)$; that is, the sign changes when two rows are interchanged.
(b) If $A_1, A_2, \ldots, A_m$ are linearly dependent, then $D(A_1, A_2, \ldots, A_m) = 0$.

8.71. Let $V$ be the space of $m$-square matrices (as above), and suppose $D: V \to K$. Show that the following weaker statement is equivalent to $D$ being alternating:
\[ D(A_1, A_2, \ldots, A_n) = 0 \quad \text{whenever} \quad A_i = A_{i+1} \text{ for some } i \]

Let $V$ be the space of $n$-square matrices over $K$. Suppose $B \in V$ is invertible and so $\det(B) \neq 0$. Define $D: V \to K$ by $D(A) = \det(AB)/\det(B)$, where $A \in V$. Hence,
\[ D(A_1, A_2, \ldots, A_n) = \det(A_1B, A_2B, \ldots, A_nB)/\det(B) \]
where $A_i$ is the $i$th row of $A$, and so $A_iB$ is the $i$th row of $AB$. Show that $D$ is multilinear and alternating, and that $D(I) = 1$. (This method is used by some texts to prove that $|AB| = |A||B|$.)

8.72. Show that $g = g(x_1, \ldots, x_n) = (-1)^n V_{n-1}(x)$ where $g = g(x_i)$ is the difference product in Problem 8.19, $x = x_n$, and $V_{n-1}$ is the Vandermonde determinant defined by
\[ V_{n-1}(x) \equiv \begin{vmatrix} 1 & 1 & \cdots & 1 & 1 \\ x_1 & x_2 & \cdots & x_{n-1} & x \\ x_1^2 & x_2^2 & \cdots & x_{n-1}^2 & x^2 \\ \vdots & \vdots & & \vdots & \vdots \\ x_1^{n-1} & x_2^{n-1} & \cdots & x_{n-1}^{n-1} & x^{n-1} \end{vmatrix} \]

8.73. Let $A$ be any matrix. Show that the signs of a minor $A[I, J]$ and its complementary minor $A[I', J']$ are equal.

8.74. Let $A$ be an $n$-square matrix. The determinantal rank of $A$ is the order of the largest square submatrix of $A$ (obtained by deleting rows and columns of $A$) whose determinant is not zero. Show that the determinantal rank of $A$ is equal to its rank; that is, the maximum number of linearly independent rows (or columns).

ANSWERS TO SUPPLEMENTARY PROBLEMS

Notation: $M = [R_1; R_2; \ldots]$ denotes a matrix with rows $R_1, R_2, \ldots$

8.38. (a) $-22$, (b) $-13$, (c) $46$, (d) $-21$, (e) $a^2 + ab + b^2$
8.39. (a) $3, 10$; (b) $5, -2$
8.40. (a) $21$, (b) $-11$, (c) $100$, (d) $0$
8.41. (a) $-131$, (b) $-55$
8.42. (a) $33$, (b) $0$, (c) $45$
8.43. (a) $-32$, (b) $-14$, (c) $-468$
8.44. (a) $|A| = -2$, $\operatorname{adj} A = [-1, -1, 1; \ -1, 1, -1; \ 2, -2, 0]$,
(b) $|A| = -1$, $\operatorname{adj} A = [1, 0, -2; \ -3, -1, 6; \ 2, 1, -5]$. Also, $A^{-1} = (\operatorname{adj} A)/|A|$
8.45. (a) $[-16, -29, -26, -2; \ -30, -38, -16, 29; \ -8, 51, -13, -1; \ -13, 1, 28, -18]$,
(b) $[21, -14, -17, -19; \ -44, 11, 33, 11; \ -29, 1, 13, 21; \ 17, 7, -19, -18]$
8.46. (a) $\operatorname{adj} A = [d, -b; \ -c, a]$, (c) $A = kI$
8.49. (a) $-3, -3$; (b) $-23, -23$; (c) $3, -3$; (d) $17, -17$
8.50. (a) $-2, -17, 73$; (b) $7, 10, 105$; (c) $13, 54, 0$
8.51. (a) $-6, 13, 62, -219$; (b) $7, -37, 30, 20$
8.52. (a) $x = \frac{21}{26}$, $y = \frac{29}{26}$; (b) $x = -\frac{5}{13}$, $y = \frac{1}{13}$; (c) $x = -\frac{c}{a}$, $y = -\frac{c}{b}$
8.53. (a) $x = 5$, $y = 2$, $z = 1$; (b) Because $D = 0$, the system cannot be solved by determinants.
8.55. $\operatorname{sgn} \sigma = 1$, $\operatorname{sgn} \tau = -1$, $\operatorname{sgn} \pi = -1$
8.56. (a) $\tau \circ \sigma = 53142$, (b) $\pi \circ \sigma = 52413$, (c) $\sigma^{-1} = 32154$, (d) $\tau^{-1} = 14253$
8.60. (a) $\det(T) = 17$, (b) $\det(T) = 4$, (c) not defined
8.61. (a) $0$, (b) $6$, (c) $1$
8.64. (a) $18$, (b) $0$
8.65. $17$
8.66. (a) no, (b) yes, (c) yes, (d) no, (e) yes, (f) no

CHAPTER 9

Diagonalization: Eigenvalues and Eigenvectors
9.1 Introduction
The ideas in this chapter can be discussed from two points of view.

Matrix Point of View
Suppose an $n$-square matrix $A$ is given. The matrix $A$ is said to be diagonalizable if there exists a nonsingular matrix $P$ such that $B = P^{-1}AP$ is diagonal. This chapter discusses the diagonalization of a matrix $A$. In particular, an algorithm is given to find the matrix $P$ when it exists.

Linear Operator Point of View
Suppose a linear operator $T: V \to V$ is given. The linear operator $T$ is said to be diagonalizable if there exists a basis $S$ of $V$ such that the matrix representation of $T$ relative to the basis $S$ is a diagonal matrix $D$. This chapter discusses conditions under which the linear operator $T$ is diagonalizable.

Equivalence of the Two Points of View
The above two concepts are essentially the same. Specifically, a square matrix $A$ may be viewed as a linear operator $F$ defined by $F(X) = AX$, where $X$ is a column vector, and $B = P^{-1}AP$ represents $F$ relative to a new coordinate system (basis) $S$ whose elements are the columns of $P$. On the other hand, any linear operator $T$ can be represented by a matrix $A$ relative to one basis and, when a second basis is chosen, $T$ is represented by the matrix $B = P^{-1}AP$, where $P$ is the change-of-basis matrix. Most theorems will be stated in two ways: once in terms of matrices $A$ and again in terms of linear mappings $T$.

Role of Underlying Field K
The underlying number field $K$ did not play any special role in our previous discussions on vector spaces and linear mappings. However, the diagonalization of a matrix $A$ or a linear operator $T$ will depend on the roots of a polynomial $\Delta(t)$ over $K$, and these roots do depend on $K$. For example, suppose $\Delta(t) = t^2 + 1$. Then $\Delta(t)$ has no roots if $K = \mathbf{R}$, the real field; but $\Delta(t)$ has roots $\pm i$ if $K = \mathbf{C}$, the complex field. Furthermore, finding the roots of a polynomial with degree greater than two is a subject unto itself (frequently discussed in numerical analysis courses). Accordingly, our examples will usually lead to those polynomials $\Delta(t)$ whose roots can be easily determined.

9.2 Polynomials of Matrices

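Consider a polynomial $f(t) = a_n t^n + \cdots + a_1 t + a_0$ over a field $K$. Recall (Section 2.8) that if $A$ is any square matrix, then we define
\[ f(A) = a_n A^n + \cdots + a_1 A + a_0 I \]
where $I$ is the identity matrix. In particular, we say that $A$ is a root of $f(t)$ if $f(A) = 0$, the zero matrix.

EXAMPLE 9.1

Let $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$. Then $A^2 = \begin{bmatrix} 7 & 10 \\ 15 & 22 \end{bmatrix}$. Let
\[ f(t) = 2t^2 - 3t + 5 \quad \text{and} \quad g(t) = t^2 - 5t - 2 \]
Then
\[ f(A) = 2A^2 - 3A + 5I = \begin{bmatrix} 14 & 20 \\ 30 & 44 \end{bmatrix} + \begin{bmatrix} -3 & -6 \\ -9 & -12 \end{bmatrix} + \begin{bmatrix} 5 & 0 \\ 0 & 5 \end{bmatrix} = \begin{bmatrix} 16 & 14 \\ 21 & 37 \end{bmatrix} \]
and
\[ g(A) = A^2 - 5A - 2I = \begin{bmatrix} 7 & 10 \\ 15 & 22 \end{bmatrix} + \begin{bmatrix} -5 & -10 \\ -15 & -20 \end{bmatrix} + \begin{bmatrix} -2 & 0 \\ 0 & -2 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \]
Thus, $A$ is a zero of $g(t)$.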
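The computation in Example 9.1 can be checked with a short script. The following is a minimal sketch in plain Python; the helper names `mat_mul` and `poly_at_matrix` are illustrative choices, not notation from the text, and the evaluation uses Horner's scheme rather than forming each power of $A$ separately.

```python
def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def poly_at_matrix(coeffs, A):
    """Evaluate f(A) for f(t) = a_n t^n + ... + a_1 t + a_0.

    coeffs lists the coefficients from a_n down to a_0. Horner's scheme:
    start from the zero matrix, then repeatedly multiply by A and add a_k I.
    """
    n = len(A)
    result = [[0] * n for _ in range(n)]
    for coeff in coeffs:
        result = mat_mul(result, A)
        for i in range(n):
            result[i][i] += coeff  # adds coeff * I
    return result


A = [[1, 2], [3, 4]]
f_A = poly_at_matrix([2, -3, 5], A)   # f(t) = 2t^2 - 3t + 5
g_A = poly_at_matrix([1, -5, -2], A)  # g(t) = t^2 - 5t - 2
print(f_A)  # [[16, 14], [21, 37]]
print(g_A)  # [[0, 0], [0, 0]]
```

The zero result for `g_A` is exactly the statement that $A$ is a zero of $g(t)$.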

The following theorem (proved in Problem 9.7) applies.
THEOREM 9.1:

Let $f$ and $g$ be polynomials. For any square matrix $A$ and scalar $k$,
(i) $(f + g)(A) = f(A) + g(A)$
(ii) $(fg)(A) = f(A)g(A)$
(iii) $(kf)(A) = kf(A)$
(iv) $f(A)g(A) = g(A)f(A)$.

Observe that (iv) tells us that any two polynomials in A commute.

Matrices and Linear Operators
Now suppose that $T: V \to V$ is a linear operator on a vector space $V$. Powers of $T$ are defined by the composition operation:
\[ T^2 = T \circ T, \quad T^3 = T^2 \circ T, \quad \ldots \]
Also, for any polynomial $f(t) = a_n t^n + \cdots + a_1 t + a_0$, we define $f(T)$ in the same way as we did for matrices:
\[ f(T) = a_n T^n + \cdots + a_1 T + a_0 I \]
where $I$ is now the identity mapping. We also say that $T$ is a zero or root of $f(t)$ if $f(T) = 0$, the zero mapping. We note that the relations in Theorem 9.1 hold for linear operators as they do for matrices.

Remark: Suppose $A$ is a matrix representation of a linear operator $T$. Then $f(A)$ is the matrix representation of $f(T)$, and, in particular, $f(T) = 0$ if and only if $f(A) = 0$.

9.3 Characteristic Polynomial, Cayley–Hamilton Theorem

Let $A = [a_{ij}]$ be an $n$-square matrix. The matrix $M = A - tI_n$, where $I_n$ is the $n$-square identity matrix and $t$ is an indeterminate, may be obtained by subtracting $t$ down the diagonal of $A$. The negative of $M$ is the matrix $tI_n - A$, and its determinant
\[ \Delta(t) = \det(tI_n - A) = (-1)^n \det(A - tI_n) \]
is a polynomial in $t$ of degree $n$, called the characteristic polynomial of $A$. We state an important theorem in linear algebra (proved in Problem 9.8).
THEOREM 9.2:

(Cayley–Hamilton) Every matrix A is a root of its characteristic polynomial.

Remark: Suppose $A = [a_{ij}]$ is a triangular matrix. Then $tI - A$ is a triangular matrix with diagonal entries $t - a_{ii}$; hence,
\[ \Delta(t) = \det(tI - A) = (t - a_{11})(t - a_{22}) \cdots (t - a_{nn}) \]
Observe that the roots of $\Delta(t)$ are the diagonal elements of $A$.
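EXAMPLE 9.2

Let $A = \begin{bmatrix} 1 & 3 \\ 4 & 5 \end{bmatrix}$. Its characteristic polynomial is
\[ \Delta(t) = |tI - A| = \begin{vmatrix} t - 1 & -3 \\ -4 & t - 5 \end{vmatrix} = (t - 1)(t - 5) - 12 = t^2 - 6t - 7 \]
As expected from the Cayley–Hamilton theorem, $A$ is a root of $\Delta(t)$; that is,
\[ \Delta(A) = A^2 - 6A - 7I = \begin{bmatrix} 13 & 18 \\ 24 & 37 \end{bmatrix} + \begin{bmatrix} -6 & -18 \\ -24 & -30 \end{bmatrix} + \begin{bmatrix} -7 & 0 \\ 0 & -7 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \]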
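As a quick numerical check of the Cayley–Hamilton theorem in the 2-square case, the sketch below expands $|tI - A| = (t - a)(t - d) - bc$ for a generic matrix with rows $(a, b)$ and $(c, d)$ and then substitutes $A$ into the result. The function names are illustrative assumptions, and the check covers only 2-square matrices.

```python
def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def char_poly_2x2(A):
    """Coefficients [1, p, q] of |tI - A| = t^2 + p t + q for a 2-square A."""
    (a, b), (c, d) = A
    # (t - a)(t - d) - bc = t^2 - (a + d) t + (ad - bc)
    return [1, -(a + d), a * d - b * c]


def evaluate_at_matrix(coeffs, A):
    """Horner evaluation of a polynomial (highest coefficient first) at A."""
    n = len(A)
    result = [[0] * n for _ in range(n)]
    for coeff in coeffs:
        result = mat_mul(result, A)
        for i in range(n):
            result[i][i] += coeff
    return result


A = [[1, 3], [4, 5]]
delta = char_poly_2x2(A)
print(delta)                         # [1, -6, -7]
print(evaluate_at_matrix(delta, A))  # [[0, 0], [0, 0]]
```

Trying other integer matrices in place of `A` gives the zero matrix each time, as the theorem predicts.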

Now suppose $A$ and $B$ are similar matrices, say $B = P^{-1}AP$, where $P$ is invertible. We show that $A$ and $B$ have the same characteristic polynomial. Using $tI = P^{-1}tIP$, we have
\[ \Delta_B(t) = \det(tI - B) = \det(tI - P^{-1}AP) = \det(P^{-1}tIP - P^{-1}AP) = \det[P^{-1}(tI - A)P] = \det(P^{-1}) \det(tI - A) \det(P) \]
Using the fact that determinants are scalars and commute and that $\det(P^{-1}) \det(P) = 1$, we finally obtain
\[ \Delta_B(t) = \det(tI - A) = \Delta_A(t) \]
Thus, we have proved the following theorem.
THEOREM 9.3:

Similar matrices have the same characteristic polynomial.

Characteristic Polynomials of Degrees 2 and 3
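There are simple formulas for the characteristic polynomials of matrices of orders 2 and 3.

(a) Suppose $A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$. Then
\[ \Delta(t) = t^2 - (a_{11} + a_{22})t + \det(A) = t^2 - \operatorname{tr}(A)\,t + \det(A) \]
Here $\operatorname{tr}(A)$ denotes the trace of $A$, that is, the sum of the diagonal elements of $A$.

(b) Suppose $A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$. Then
\[ \Delta(t) = t^3 - \operatorname{tr}(A)\,t^2 + (A_{11} + A_{22} + A_{33})t - \det(A) \]
(Here $A_{11}$, $A_{22}$, $A_{33}$ denote, respectively, the cofactors of $a_{11}$, $a_{22}$, $a_{33}$.)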

EXAMPLE 9.3

Find the characteristic polynomial of each of the following matrices:

(a) $A = \begin{bmatrix} 5 & 3 \\ 2 & 10 \end{bmatrix}$, (b) $B = \begin{bmatrix} 7 & -1 \\ 6 & 2 \end{bmatrix}$, (c) $C = \begin{bmatrix} 5 & -2 \\ 4 & -4 \end{bmatrix}$.

(a) We have $\operatorname{tr}(A) = 5 + 10 = 15$ and $|A| = 50 - 6 = 44$; hence, $\Delta(t) = t^2 - 15t + 44$.
(b) We have $\operatorname{tr}(B) = 7 + 2 = 9$ and $|B| = 14 + 6 = 20$; hence, $\Delta(t) = t^2 - 9t + 20$.
(c) We have $\operatorname{tr}(C) = 5 - 4 = 1$ and $|C| = -20 + 8 = -12$; hence, $\Delta(t) = t^2 - t - 12$.

EXAMPLE 9.4

Find the characteristic polynomial of $A = \begin{bmatrix} 1 & 1 & 2 \\ 0 & 3 & 2 \\ 1 & 3 & 9 \end{bmatrix}$.

We have $\operatorname{tr}(A) = 1 + 3 + 9 = 13$. The cofactors of the diagonal elements are as follows:
\[ A_{11} = \begin{vmatrix} 3 & 2 \\ 3 & 9 \end{vmatrix} = 21, \quad A_{22} = \begin{vmatrix} 1 & 2 \\ 1 & 9 \end{vmatrix} = 7, \quad A_{33} = \begin{vmatrix} 1 & 1 \\ 0 & 3 \end{vmatrix} = 3 \]
Thus, $A_{11} + A_{22} + A_{33} = 31$. Also, $|A| = 27 + 2 + 0 - 6 - 6 - 0 = 17$. Accordingly,
\[ \Delta(t) = t^3 - 13t^2 + 31t - 17 \]

Remark: The coefficients of the characteristic polynomial $\Delta(t)$ of the 3-square matrix $A$ are, with alternating signs, as follows:
\[ S_1 = \operatorname{tr}(A), \quad S_2 = A_{11} + A_{22} + A_{33}, \quad S_3 = \det(A) \]

We note that each Sk is the sum of all principal minors of A of order k. The next theorem, whose proof lies beyond the scope of this text, tells us that this result is true in general.
THEOREM 9.4:

Let $A$ be an $n$-square matrix. Then its characteristic polynomial is
\[ \Delta(t) = t^n - S_1 t^{n-1} + S_2 t^{n-2} + \cdots + (-1)^n S_n \]
where $S_k$ is the sum of the principal minors of order $k$.
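Theorem 9.4 translates directly into a small program: for each order $k$, sum the determinants of the submatrices obtained by keeping the same $k$ rows and columns. The sketch below is one possible rendering in plain Python; the names `det` and `char_poly` are illustrative, and the cofactor-expansion determinant is an assumption made for brevity that is suitable only for small matrices.

```python
from itertools import combinations


def det(M):
    """Determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def char_poly(A):
    """Coefficients of det(tI - A), highest power first, via Theorem 9.4."""
    n = len(A)
    coeffs = [1]
    for k in range(1, n + 1):
        # S_k: sum of all principal minors of order k
        # (same index set used for the rows and for the columns).
        s_k = sum(
            det([[A[i][j] for j in idx] for i in idx])
            for idx in combinations(range(n), k)
        )
        coeffs.append((-1) ** k * s_k)
    return coeffs


A = [[1, 1, 2], [0, 3, 2], [1, 3, 9]]
print(char_poly(A))  # [1, -13, 31, -17]
```

The output reproduces the polynomial $t^3 - 13t^2 + 31t - 17$ found by hand in Example 9.4.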

Characteristic Polynomial of a Linear Operator
Now suppose $T: V \to V$ is a linear operator on a vector space $V$ of finite dimension. We define the characteristic polynomial $\Delta(t)$ of $T$ to be the characteristic polynomial of any matrix representation of $T$. Recall that if $A$ and $B$ are matrix representations of $T$, then $B = P^{-1}AP$, where $P$ is a change-of-basis matrix. Thus, $A$ and $B$ are similar, and by Theorem 9.3, $A$ and $B$ have the same characteristic polynomial. Accordingly, the characteristic polynomial of $T$ is independent of the particular basis in which the matrix representation of $T$ is computed. Because $f(T) = 0$ if and only if $f(A) = 0$, where $f(t)$ is any polynomial and $A$ is any matrix representation of $T$, we have the following analogous theorem for linear operators.
THEOREM 9.2′:

(Cayley–Hamilton) A linear operator $T$ is a zero of its characteristic polynomial.

9.4 Diagonalization, Eigenvalues and Eigenvectors

Let $A$ be any $n$-square matrix. Then $A$ can be represented by (or is similar to) a diagonal matrix $D = \operatorname{diag}(k_1, k_2, \ldots, k_n)$ if and only if there exists a basis $S$ consisting of (column) vectors $u_1, u_2, \ldots, u_n$ such that
\[ Au_1 = k_1u_1, \quad Au_2 = k_2u_2, \quad \ldots, \quad Au_n = k_nu_n \]
In such a case, $A$ is said to be diagonalizable. Furthermore, $D = P^{-1}AP$, where $P$ is the nonsingular matrix whose columns are, respectively, the basis vectors $u_1, u_2, \ldots, u_n$. The above observation leads us to the following definition.
DEFINITION:

Let $A$ be any square matrix. A scalar $\lambda$ is called an eigenvalue of $A$ if there exists a nonzero (column) vector $v$ such that
\[ Av = \lambda v \]
Any vector satisfying this relation is called an eigenvector of $A$ belonging to the eigenvalue $\lambda$.

We note that each scalar multiple $kv$ of an eigenvector $v$ belonging to $\lambda$ is also such an eigenvector, because
\[ A(kv) = k(Av) = k(\lambda v) = \lambda(kv) \]
The set $E_\lambda$ of all such eigenvectors is a subspace of $V$ (Problem 9.19), called the eigenspace of $\lambda$. (If $\dim E_\lambda = 1$, then $E_\lambda$ is called an eigenline and $\lambda$ is called a scaling factor.) The terms characteristic value and characteristic vector (or proper value and proper vector) are sometimes used instead of eigenvalue and eigenvector. The above observation and definitions give us the following theorem.
THEOREM 9.5:

An $n$-square matrix $A$ is similar to a diagonal matrix $D$ if and only if $A$ has $n$ linearly independent eigenvectors. In this case, the diagonal elements of $D$ are the corresponding eigenvalues and $D = P^{-1}AP$, where $P$ is the matrix whose columns are the eigenvectors.

Suppose a matrix $A$ can be diagonalized as above, say $P^{-1}AP = D$, where $D$ is diagonal. Then $A$ has the extremely useful diagonal factorization:
\[ A = PDP^{-1} \]
Using this factorization, the algebra of $A$ reduces to the algebra of the diagonal matrix $D$, which can be easily calculated. Specifically, suppose $D = \operatorname{diag}(k_1, k_2, \ldots, k_n)$. Then
\[ A^m = (PDP^{-1})^m = PD^mP^{-1} = P \operatorname{diag}(k_1^m, \ldots, k_n^m) P^{-1} \]
More generally, for any polynomial $f(t)$,
\[ f(A) = f(PDP^{-1}) = Pf(D)P^{-1} = P \operatorname{diag}(f(k_1), f(k_2), \ldots, f(k_n)) P^{-1} \]
Furthermore, if the diagonal entries of $D$ are nonnegative, let
\[ B = P \operatorname{diag}(\sqrt{k_1}, \sqrt{k_2}, \ldots, \sqrt{k_n}) P^{-1} \]
Then $B$ is a nonnegative square root of $A$; that is, $B^2 = A$ and the eigenvalues of $B$ are nonnegative.

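EXAMPLE 9.5

Let $A = \begin{bmatrix} 3 & 1 \\ 2 & 2 \end{bmatrix}$ and let $v_1 = \begin{bmatrix} 1 \\ -2 \end{bmatrix}$ and $v_2 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$. Then
\[ Av_1 = \begin{bmatrix} 3 & 1 \\ 2 & 2 \end{bmatrix} \begin{bmatrix} 1 \\ -2 \end{bmatrix} = \begin{bmatrix} 1 \\ -2 \end{bmatrix} = v_1 \quad \text{and} \quad Av_2 = \begin{bmatrix} 3 & 1 \\ 2 & 2 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 4 \\ 4 \end{bmatrix} = 4v_2 \]
Thus, $v_1$ and $v_2$ are eigenvectors of $A$ belonging, respectively, to the eigenvalues $\lambda_1 = 1$ and $\lambda_2 = 4$. Observe that $v_1$ and $v_2$ are linearly independent and hence form a basis of $\mathbf{R}^2$. Accordingly, $A$ is diagonalizable. Furthermore, let $P$ be the matrix whose columns are the eigenvectors $v_1$ and $v_2$. That is, let
\[ P = \begin{bmatrix} 1 & 1 \\ -2 & 1 \end{bmatrix}, \quad \text{and so} \quad P^{-1} = \begin{bmatrix} \tfrac{1}{3} & -\tfrac{1}{3} \\ \tfrac{2}{3} & \tfrac{1}{3} \end{bmatrix} \]
Then $A$ is similar to the diagonal matrix
\[ D = P^{-1}AP = \begin{bmatrix} \tfrac{1}{3} & -\tfrac{1}{3} \\ \tfrac{2}{3} & \tfrac{1}{3} \end{bmatrix} \begin{bmatrix} 3 & 1 \\ 2 & 2 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ -2 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 4 \end{bmatrix} \]
As expected, the diagonal elements 1 and 4 in $D$ are the eigenvalues corresponding, respectively, to the eigenvectors $v_1$ and $v_2$, which are the columns of $P$. In particular, $A$ has the factorization
\[ A = PDP^{-1} = \begin{bmatrix} 1 & 1 \\ -2 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 4 \end{bmatrix} \begin{bmatrix} \tfrac{1}{3} & -\tfrac{1}{3} \\ \tfrac{2}{3} & \tfrac{1}{3} \end{bmatrix} \]
Accordingly,
\[ A^4 = \begin{bmatrix} 1 & 1 \\ -2 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 256 \end{bmatrix} \begin{bmatrix} \tfrac{1}{3} & -\tfrac{1}{3} \\ \tfrac{2}{3} & \tfrac{1}{3} \end{bmatrix} = \begin{bmatrix} 171 & 85 \\ 170 & 86 \end{bmatrix} \]
Moreover, suppose $f(t) = t^3 - 5t^2 + 3t + 6$; hence, $f(1) = 5$ and $f(4) = 2$. Then
\[ f(A) = Pf(D)P^{-1} = \begin{bmatrix} 1 & 1 \\ -2 & 1 \end{bmatrix} \begin{bmatrix} 5 & 0 \\ 0 & 2 \end{bmatrix} \begin{bmatrix} \tfrac{1}{3} & -\tfrac{1}{3} \\ \tfrac{2}{3} & \tfrac{1}{3} \end{bmatrix} = \begin{bmatrix} 3 & -1 \\ -2 & 4 \end{bmatrix} \]
Last, we obtain a "positive square root" of $A$. Specifically, using $\sqrt{1} = 1$ and $\sqrt{4} = 2$, we obtain the matrix
\[ B = P\sqrt{D}P^{-1} = \begin{bmatrix} 1 & 1 \\ -2 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix} \begin{bmatrix} \tfrac{1}{3} & -\tfrac{1}{3} \\ \tfrac{2}{3} & \tfrac{1}{3} \end{bmatrix} = \begin{bmatrix} \tfrac{5}{3} & \tfrac{1}{3} \\ \tfrac{2}{3} & \tfrac{4}{3} \end{bmatrix} \]
where $B^2 = A$ and where $B$ has positive eigenvalues 1 and 2.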
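The three computations in Example 9.5 share one pattern: apply a scalar function to the diagonal entries and conjugate by $P$. The following sketch, which assumes exact rational arithmetic through Python's `fractions` module and uses the illustrative helper name `apply_to_diagonal`, reproduces $A^4$, $f(A)$, and the square root $B$ for the 2-square case.

```python
from fractions import Fraction
from math import isqrt


def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def inverse_2x2(P):
    """Inverse of a 2-square matrix: swap the diagonal, negate the rest, divide by |P|."""
    (a, b), (c, d) = P
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]


def apply_to_diagonal(P, eigenvalues, func):
    """Return P diag(func(k1), func(k2)) P^{-1}."""
    k1, k2 = eigenvalues
    D = [[func(k1), 0], [0, func(k2)]]
    return mat_mul(mat_mul(P, D), inverse_2x2(P))


P = [[1, 1], [-2, 1]]   # columns are the eigenvectors v1 and v2
eigenvalues = [1, 4]    # eigenvalues belonging to v1 and v2

A4 = apply_to_diagonal(P, eigenvalues, lambda k: k ** 4)
fA = apply_to_diagonal(P, eigenvalues, lambda k: k ** 3 - 5 * k ** 2 + 3 * k + 6)
B = apply_to_diagonal(P, eigenvalues, isqrt)  # exact here: 1 and 4 are perfect squares

print(A4 == [[171, 85], [170, 86]])       # True
print(fA == [[3, -1], [-2, 4]])           # True
print(mat_mul(B, B) == [[3, 1], [2, 2]])  # True
```

Using `isqrt` is a shortcut that works only because both eigenvalues are perfect squares; a general square root would need floating-point or symbolic arithmetic.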

Remark: Throughout this chapter, we use the following fact:
\[ \text{If } P = \begin{bmatrix} a & b \\ c & d \end{bmatrix}, \text{ then } P^{-1} = \begin{bmatrix} d/|P| & -b/|P| \\ -c/|P| & a/|P| \end{bmatrix}. \]
That is, $P^{-1}$ is obtained by interchanging the diagonal elements $a$ and $d$ of $P$, taking the negatives of the nondiagonal elements $b$ and $c$, and dividing each element by the determinant $|P|$.

Properties of Eigenvalues and Eigenvectors
Example 9.5 indicates the advantages of a diagonal representation (factorization) of a square matrix. In the following theorem (proved in Problem 9.20), we list properties that help us to find such a representation.

THEOREM 9.6:

Let $A$ be a square matrix. Then the following are equivalent.
(i) A scalar $\lambda$ is an eigenvalue of $A$.
(ii) The matrix $M = A - \lambda I$ is singular.
(iii) The scalar $\lambda$ is a root of the characteristic polynomial $\Delta(t)$ of $A$.

The eigenspace $E_\lambda$ of an eigenvalue $\lambda$ is the solution space of the homogeneous system $MX = 0$, where $M = A - \lambda I$; that is, $M$ is obtained by subtracting $\lambda$ down the diagonal of $A$. Some matrices have no eigenvalues and hence no eigenvectors. However, using Theorem 9.6 and the Fundamental Theorem of Algebra (every nonconstant polynomial over the complex field $\mathbf{C}$ has a root), we obtain the following result.
THEOREM 9.7:

Let A be a square matrix over the complex field C. Then A has at least one eigenvalue.

The following theorems will be used subsequently. (The theorem equivalent to Theorem 9.8 for linear operators is proved in Problem 9.21, and Theorem 9.9 is proved in Problem 9.22.)
THEOREM 9.8:

Suppose $v_1, v_2, \ldots, v_n$ are nonzero eigenvectors of a matrix $A$ belonging to distinct eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$. Then $v_1, v_2, \ldots, v_n$ are linearly independent.

THEOREM 9.9:

Suppose the characteristic polynomial $\Delta(t)$ of an $n$-square matrix $A$ is a product of $n$ distinct factors, say, $\Delta(t) = (t - a_1)(t - a_2) \cdots (t - a_n)$. Then $A$ is similar to the diagonal matrix $D = \operatorname{diag}(a_1, a_2, \ldots, a_n)$.

If $\lambda$ is an eigenvalue of a matrix $A$, then the algebraic multiplicity of $\lambda$ is defined to be the multiplicity of $\lambda$ as a root of the characteristic polynomial of $A$, and the geometric multiplicity of $\lambda$ is defined to be the dimension of its eigenspace, $\dim E_\lambda$. The following theorem (whose equivalent for linear operators is proved in Problem 9.23) holds.
THEOREM 9.10:

The geometric multiplicity of an eigenvalue $\lambda$ of a matrix $A$ does not exceed its algebraic multiplicity.

Diagonalization of Linear Operators
Consider a linear operator $T: V \to V$. Then $T$ is said to be diagonalizable if it can be represented by a diagonal matrix $D$. Thus, $T$ is diagonalizable if and only if there exists a basis $S = \{u_1, u_2, \ldots, u_n\}$ of $V$ for which
\[ T(u_1) = k_1u_1, \quad T(u_2) = k_2u_2, \quad \ldots, \quad T(u_n) = k_nu_n \]
In such a case, $T$ is represented by the diagonal matrix $D = \operatorname{diag}(k_1, k_2, \ldots, k_n)$ relative to the basis $S$. The above observation leads us to the following definitions and theorems, which are analogous to the definitions and theorems for matrices discussed above.
DEFINITION:

Let $T$ be a linear operator. A scalar $\lambda$ is called an eigenvalue of $T$ if there exists a nonzero vector $v$ such that $T(v) = \lambda v$. Every vector satisfying this relation is called an eigenvector of $T$ belonging to the eigenvalue $\lambda$.

The set $E_\lambda$ of all eigenvectors belonging to an eigenvalue $\lambda$ is a subspace of $V$, called the eigenspace of $\lambda$. (Alternatively, $\lambda$ is an eigenvalue of $T$ if $\lambda I - T$ is singular, and, in this case, $E_\lambda$ is the kernel of $\lambda I - T$.) The algebraic and geometric multiplicities of an eigenvalue $\lambda$ of a linear operator $T$ are defined in the same way as those of an eigenvalue of a matrix $A$. The following theorems apply to a linear operator $T$ on a vector space $V$ of finite dimension.
THEOREM 9.5′:

$T$ can be represented by a diagonal matrix $D$ if and only if there exists a basis $S$ of $V$ consisting of eigenvectors of $T$. In this case, the diagonal elements of $D$ are the corresponding eigenvalues.

THEOREM 9.6′:

Let $T$ be a linear operator. Then the following are equivalent:
(i) A scalar $\lambda$ is an eigenvalue of $T$.
(ii) The linear operator $\lambda I - T$ is singular.
(iii) The scalar $\lambda$ is a root of the characteristic polynomial $\Delta(t)$ of $T$.

THEOREM 9.7′:

Suppose $V$ is a complex vector space. Then $T$ has at least one eigenvalue.

THEOREM 9.8′:

Suppose $v_1, v_2, \ldots, v_n$ are nonzero eigenvectors of a linear operator $T$ belonging to distinct eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$. Then $v_1, v_2, \ldots, v_n$ are linearly independent.

THEOREM 9.9′:

Suppose the characteristic polynomial $\Delta(t)$ of $T$ is a product of $n$ distinct factors, say, $\Delta(t) = (t - a_1)(t - a_2) \cdots (t - a_n)$. Then $T$ can be represented by the diagonal matrix $D = \operatorname{diag}(a_1, a_2, \ldots, a_n)$.

THEOREM 9.10′:

The geometric multiplicity of an eigenvalue $\lambda$ of $T$ does not exceed its algebraic multiplicity.

Remark: The following theorem reduces the investigation of the diagonalization of a linear operator T to the diagonalization of a matrix A.
THEOREM 9.11:

Suppose A is a matrix representation of T . Then T is diagonalizable if and only if A is diagonalizable.

9.5 Computing Eigenvalues and Eigenvectors, Diagonalizing Matrices

This section gives an algorithm for computing eigenvalues and eigenvectors for a given square matrix $A$ and for determining whether or not a nonsingular matrix $P$ exists such that $P^{-1}AP$ is diagonal.
ALGORITHM 9.1:

(Diagonalization Algorithm) The input is an $n$-square matrix $A$.

Step 1. Find the characteristic polynomial $\Delta(t)$ of $A$.
Step 2. Find the roots of $\Delta(t)$ to obtain the eigenvalues of $A$.
Step 3. Repeat (a) and (b) for each eigenvalue $\lambda$ of $A$.
(a) Form the matrix $M = A - \lambda I$ by subtracting $\lambda$ down the diagonal of $A$.
(b) Find a basis for the solution space of the homogeneous system $MX = 0$. (These basis vectors are linearly independent eigenvectors of $A$ belonging to $\lambda$.)
Step 4. Consider the collection $S = \{v_1, v_2, \ldots, v_m\}$ of all eigenvectors obtained in Step 3.
(a) If $m \neq n$, then $A$ is not diagonalizable.
(b) If $m = n$, then $A$ is diagonalizable. Specifically, let $P$ be the matrix whose columns are the eigenvectors $v_1, v_2, \ldots, v_n$. Then
\[ D = P^{-1}AP = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n) \]
where $\lambda_i$ is the eigenvalue corresponding to the eigenvector $v_i$.
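The four steps of Algorithm 9.1 can be sketched in code for the smallest interesting case. The version below is a deliberately restricted illustration, not a general implementation: it assumes an integer 2-square matrix whose characteristic polynomial has rational roots, and the names `eigenpairs_2x2` and `is_diagonalizable` are hypothetical.

```python
from fractions import Fraction
from math import isqrt


def eigenpairs_2x2(A):
    """Steps 1-3 for an integer 2-square matrix with rational eigenvalues.

    Returns (eigenvalue, eigenvector) pairs, one per basis vector of each
    eigenspace. Raises ValueError when the roots are not rational.
    """
    (a, b), (c, d) = A
    trace, det = a + d, a * d - b * c        # Step 1: t^2 - trace*t + det
    disc = trace * trace - 4 * det
    if disc < 0 or isqrt(disc) ** 2 != disc:
        raise ValueError("eigenvalues are not rational; outside this sketch")
    root = isqrt(disc)
    eigenvalues = sorted(                    # Step 2: roots by the quadratic formula
        {Fraction(trace + root, 2), Fraction(trace - root, 2)}, reverse=True
    )
    pairs = []
    for lam in eigenvalues:                  # Step 3: solve (A - lam I)X = 0
        m11, m12, m21, m22 = a - lam, b, c, d - lam
        if (m11, m12) != (0, 0):
            pairs.append((lam, (-m12, m11)))  # annihilated by the first row of M
        elif (m21, m22) != (0, 0):
            pairs.append((lam, (-m22, m21)))  # annihilated by the second row of M
        else:
            pairs += [(lam, (1, 0)), (lam, (0, 1))]  # M = 0: eigenspace is everything
    return pairs


def is_diagonalizable(A):
    """Step 4: diagonalizable exactly when m = n = 2 eigenvectors were found."""
    return len(eigenpairs_2x2(A)) == 2


print(is_diagonalizable([[4, 2], [3, -1]]))  # True
print(is_diagonalizable([[5, -1], [1, 3]]))  # False
```

Because $M$ is singular for each eigenvalue, its two rows are proportional, so a vector annihilated by one nonzero row is annihilated by both; that is why a single row suffices in Step 3.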
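EXAMPLE 9.6

The diagonalization algorithm is applied to $A = \begin{bmatrix} 4 & 2 \\ 3 & -1 \end{bmatrix}$.

(1) The characteristic polynomial $\Delta(t)$ of $A$ is computed. We have
\[ \operatorname{tr}(A) = 4 - 1 = 3, \quad |A| = -4 - 6 = -10; \quad \text{hence,} \quad \Delta(t) = t^2 - 3t - 10 = (t - 5)(t + 2) \]

(2) Set $\Delta(t) = (t - 5)(t + 2) = 0$. The roots $\lambda_1 = 5$ and $\lambda_2 = -2$ are the eigenvalues of $A$.

(3) (i) We find an eigenvector $v_1$ of $A$ belonging to the eigenvalue $\lambda_1 = 5$. Subtract $\lambda_1 = 5$ down the diagonal of $A$ to obtain the matrix $M = \begin{bmatrix} -1 & 2 \\ 3 & -6 \end{bmatrix}$. The eigenvectors belonging to $\lambda_1 = 5$ form the solution of the homogeneous system $MX = 0$; that is,
\[ \begin{bmatrix} -1 & 2 \\ 3 & -6 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \quad \text{or} \quad \begin{array}{r} -x + 2y = 0 \\ 3x - 6y = 0 \end{array} \quad \text{or} \quad -x + 2y = 0 \]
The system has only one free variable. Thus, a nonzero solution, for example, $v_1 = (2, 1)$, is an eigenvector that spans the eigenspace of $\lambda_1 = 5$.

(ii) We find an eigenvector $v_2$ of $A$ belonging to the eigenvalue $\lambda_2 = -2$. Subtract $-2$ (or add 2) down the diagonal of $A$ to obtain the matrix
\[ M = \begin{bmatrix} 6 & 2 \\ 3 & 1 \end{bmatrix} \quad \text{and the homogeneous system} \quad \begin{array}{r} 6x + 2y = 0 \\ 3x + y = 0 \end{array} \quad \text{or} \quad 3x + y = 0. \]
The system has only one independent solution. Thus, a nonzero solution, say $v_2 = (-1, 3)$, is an eigenvector that spans the eigenspace of $\lambda_2 = -2$.

(4) Let $P$ be the matrix whose columns are the eigenvectors $v_1$ and $v_2$. Then
\[ P = \begin{bmatrix} 2 & -1 \\ 1 & 3 \end{bmatrix}, \quad \text{and so} \quad P^{-1} = \begin{bmatrix} \tfrac{3}{7} & \tfrac{1}{7} \\ -\tfrac{1}{7} & \tfrac{2}{7} \end{bmatrix} \]
Accordingly, $D = P^{-1}AP$ is the diagonal matrix whose diagonal entries are the corresponding eigenvalues; that is,
\[ D = P^{-1}AP = \begin{bmatrix} \tfrac{3}{7} & \tfrac{1}{7} \\ -\tfrac{1}{7} & \tfrac{2}{7} \end{bmatrix} \begin{bmatrix} 4 & 2 \\ 3 & -1 \end{bmatrix} \begin{bmatrix} 2 & -1 \\ 1 & 3 \end{bmatrix} = \begin{bmatrix} 5 & 0 \\ 0 & -2 \end{bmatrix} \]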

EXAMPLE 9.7

Consider the matrix $B = \begin{bmatrix} 5 & -1 \\ 1 & 3 \end{bmatrix}$. We have
\[ \operatorname{tr}(B) = 5 + 3 = 8, \quad |B| = 15 + 1 = 16; \quad \text{so} \quad \Delta(t) = t^2 - 8t + 16 = (t - 4)^2 \]
Accordingly, $\lambda = 4$ is the only eigenvalue of $B$.
Subtract $\lambda = 4$ down the diagonal of $B$ to obtain the matrix
\[ M = \begin{bmatrix} 1 & -1 \\ 1 & -1 \end{bmatrix} \quad \text{and the homogeneous system} \quad \begin{array}{r} x - y = 0 \\ x - y = 0 \end{array} \quad \text{or} \quad x - y = 0 \]
The system has only one independent solution; for example, $x = 1$, $y = 1$. Thus, $v = (1, 1)$ and its multiples are the only eigenvectors of $B$. Accordingly, $B$ is not diagonalizable, because there does not exist a basis consisting of eigenvectors of $B$.

EXAMPLE 9.8

Consider the matrix $A = \begin{bmatrix} 3 & -5 \\ 2 & -3 \end{bmatrix}$. Here $\operatorname{tr}(A) = 3 - 3 = 0$ and $|A| = -9 + 10 = 1$. Thus, $\Delta(t) = t^2 + 1$ is the characteristic polynomial of $A$. We consider two cases:

(a) $A$ is a matrix over the real field $\mathbf{R}$. Then $\Delta(t)$ has no (real) roots. Thus, $A$ has no eigenvalues and no eigenvectors, and so $A$ is not diagonalizable.

(b) $A$ is a matrix over the complex field $\mathbf{C}$. Then $\Delta(t) = (t - i)(t + i)$ has two roots, $i$ and $-i$. Thus, $A$ has two distinct eigenvalues $i$ and $-i$, and hence, $A$ has two independent eigenvectors. Accordingly, there exists a nonsingular matrix $P$ over the complex field $\mathbf{C}$ for which
\[ P^{-1}AP = \begin{bmatrix} i & 0 \\ 0 & -i \end{bmatrix} \]
Therefore, $A$ is diagonalizable (over $\mathbf{C}$).

9.6 Diagonalizing Real Symmetric Matrices and Quadratic Forms

There are many real matrices A that are not diagonalizable. In fact, some real matrices may not have any (real) eigenvalues. However, if A is a real symmetric matrix, then these problems do not exist. Namely, we have the following theorems.
THEOREM 9.12:

Let $A$ be a real symmetric matrix. Then each root $\lambda$ of its characteristic polynomial is real.

THEOREM 9.13:

Let $A$ be a real symmetric matrix. Suppose $u$ and $v$ are eigenvectors of $A$ belonging to distinct eigenvalues $\lambda_1$ and $\lambda_2$. Then $u$ and $v$ are orthogonal; that is, $\langle u, v \rangle = 0$.

The above two theorems give us the following fundamental result.
THEOREM 9.14:

Let $A$ be a real symmetric matrix. Then there exists an orthogonal matrix $P$ such that $D = P^{-1}AP$ is diagonal.

The orthogonal matrix P is obtained by normalizing a basis of orthogonal eigenvectors of A as illustrated below. In such a case, we say that A is ‘‘orthogonally diagonalizable.’’
EXAMPLE 9.9

Let $A = \begin{bmatrix} 2 & -2 \\ -2 & 5 \end{bmatrix}$, a real symmetric matrix. Find an orthogonal matrix $P$ such that $P^{-1}AP$ is diagonal.

First we find the characteristic polynomial $\Delta(t)$ of $A$. We have
\[ \operatorname{tr}(A) = 2 + 5 = 7, \quad |A| = 10 - 4 = 6; \quad \text{so} \quad \Delta(t) = t^2 - 7t + 6 = (t - 6)(t - 1) \]
Accordingly, $\lambda_1 = 6$ and $\lambda_2 = 1$ are the eigenvalues of $A$.

(a) Subtracting $\lambda_1 = 6$ down the diagonal of $A$ yields the matrix
\[ M = \begin{bmatrix} -4 & -2 \\ -2 & -1 \end{bmatrix} \quad \text{and the homogeneous system} \quad \begin{array}{r} -4x - 2y = 0 \\ -2x - y = 0 \end{array} \quad \text{or} \quad 2x + y = 0 \]
A nonzero solution is $u_1 = (1, -2)$.

(b) Subtracting $\lambda_2 = 1$ down the diagonal of $A$ yields the matrix
\[ M = \begin{bmatrix} 1 & -2 \\ -2 & 4 \end{bmatrix} \quad \text{and the homogeneous system} \quad x - 2y = 0 \]
(The second equation drops out, because it is a multiple of the first equation.) A nonzero solution is $u_2 = (2, 1)$.

As expected from Theorem 9.13, $u_1$ and $u_2$ are orthogonal. Normalizing $u_1$ and $u_2$ yields the orthonormal vectors
\[ \hat{u}_1 = (1/\sqrt{5}, -2/\sqrt{5}) \quad \text{and} \quad \hat{u}_2 = (2/\sqrt{5}, 1/\sqrt{5}) \]
Finally, let $P$ be the matrix whose columns are $\hat{u}_1$ and $\hat{u}_2$, respectively. Then
\[ P = \begin{bmatrix} 1/\sqrt{5} & 2/\sqrt{5} \\ -2/\sqrt{5} & 1/\sqrt{5} \end{bmatrix} \quad \text{and} \quad P^{-1}AP = \begin{bmatrix} 6 & 0 \\ 0 & 1 \end{bmatrix} \]
As expected, the diagonal entries of $P^{-1}AP$ are the eigenvalues corresponding to the columns of $P$.

The procedure in Example 9.9 is formalized in the following algorithm, which finds an orthogonal matrix $P$ such that $P^{-1}AP$ is diagonal.
ALGORITHM 9.2:

(Orthogonal Diagonalization Algorithm) The input is a real symmetric matrix $A$.

Step 1. Find the characteristic polynomial $\Delta(t)$ of $A$.
Step 2. Find the eigenvalues of $A$, which are the roots of $\Delta(t)$.
Step 3. For each eigenvalue $\lambda$ of $A$ in Step 2, find an orthogonal basis of its eigenspace.
Step 4. Normalize all eigenvectors in Step 3; they then form an orthonormal basis of $\mathbf{R}^n$.
Step 5. Let $P$ be the matrix whose columns are the normalized eigenvectors in Step 4.
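Steps 4 and 5 can be illustrated numerically with the matrix and eigenvectors of Example 9.9. The sketch below assumes floating-point arithmetic, so the comparison uses a small tolerance, and it relies on the fact that an orthogonal matrix satisfies $P^{-1} = P^T$; the helper names are illustrative.

```python
from math import isclose, sqrt


def mat_mul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def transpose(M):
    return [list(col) for col in zip(*M)]


def normalize(v):
    """Step 4: scale a vector to unit length."""
    length = sqrt(sum(x * x for x in v))
    return [x / length for x in v]


A = [[2, -2], [-2, 5]]
u1, u2 = [1, -2], [2, 1]  # eigenvectors for the eigenvalues 6 and 1

# Step 5: the normalized eigenvectors become the columns of P.
P = transpose([normalize(u1), normalize(u2)])

# For an orthogonal P, the inverse is the transpose, so P^{-1} A P = P^T A P.
D = mat_mul(mat_mul(transpose(P), A), P)

expected = [[6, 0], [0, 1]]
print(all(
    isclose(D[i][j], expected[i][j], abs_tol=1e-12)
    for i in range(2) for j in range(2)
))  # True
```

Replacing the inverse by the transpose is the practical payoff of orthogonal diagonalization: no matrix inversion is needed.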

Application to Quadratic Forms
Let $q$ be a real polynomial in variables $x_1, x_2, \ldots, x_n$ such that every term in $q$ has degree two; that is, $q(x_1, x_2, \ldots, x_n) = \sum_i c_i x_i^2 + \cdots$

[…] $k > 0$. (See Fig. 9-1.) (a) Show that $v_1 = (1, k)$ and $v_2 = (k, -1)$ are eigenvectors of $L$. (b) Show that $L$ is diagonalizable, and find a diagonal representation $D$.

[Figure 9-1: the reflection $L$ across the line $y = kx$, showing a point $P$ and its image $L(P)$, and the vector $v_2$ and its image $L(v_2)$.]

(a) The vector $v_1 = (1, k)$ lies on the line $y = kx$, and hence is left fixed by $L$; that is, $L(v_1) = v_1$. Thus, $v_1$ is an eigenvector of $L$ belonging to the eigenvalue $\lambda_1 = 1$. The vector $v_2 = (k, -1)$ is perpendicular to the line $y = kx$, because $v_1 \cdot v_2 = k - k = 0$, and hence, $L$ reflects $v_2$ into its negative; that is, $L(v_2) = -v_2$. Thus, $v_2$ is an eigenvector of $L$ belonging to the eigenvalue $\lambda_2 = -1$.

(b) Here $S = \{v_1, v_2\}$ is a basis of $\mathbf{R}^2$ consisting of eigenvectors of $L$. Thus, $L$ is diagonalizable, with the diagonal representation $D = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$ (relative to the basis $S$).

Eigenvalues and Eigenvectors

9.14. Let $A = \begin{bmatrix} 4 & 1 & -1 \\ 2 & 5 & -2 \\ 1 & 1 & 2 \end{bmatrix}$. (a) Find all eigenvalues of $A$. (b) Find a maximum set $S$ of linearly independent eigenvectors of $A$. (c) Is $A$ diagonalizable? If yes, find $P$ such that $D = P^{-1}AP$ is diagonal.

(a) First find the characteristic polynomial $\Delta(t)$ of $A$. We have
\[ \operatorname{tr}(A) = 4 + 5 + 2 = 11 \quad \text{and} \quad |A| = 40 - 2 - 2 + 5 + 8 - 4 = 45 \]
Also, find each cofactor $A_{ii}$ of $a_{ii}$ in $A$:
\[ A_{11} = \begin{vmatrix} 5 & -2 \\ 1 & 2 \end{vmatrix} = 12, \quad A_{22} = \begin{vmatrix} 4 & -1 \\ 1 & 2 \end{vmatrix} = 9, \quad A_{33} = \begin{vmatrix} 4 & 1 \\ 2 & 5 \end{vmatrix} = 18 \]
Hence,
\[ \Delta(t) = t^3 - \operatorname{tr}(A)\,t^2 + (A_{11} + A_{22} + A_{33})t - |A| = t^3 - 11t^2 + 39t - 45 \]
Assuming $\Delta(t)$ has a rational root, it must be among $\pm 1$, $\pm 3$, $\pm 5$, $\pm 9$, $\pm 15$, $\pm 45$. Testing, by synthetic division, we get
\[ \begin{array}{r|rrrr} 3 & 1 & -11 & 39 & -45 \\ & & 3 & -24 & 45 \\ \hline & 1 & -8 & 15 & 0 \end{array} \]
Thus, $t = 3$ is a root of $\Delta(t)$. Also, $t - 3$ is a factor and $t^2 - 8t + 15$ is a factor. Hence,
\[ \Delta(t) = (t - 3)(t^2 - 8t + 15) = (t - 3)(t - 5)(t - 3) = (t - 3)^2(t - 5) \]
Accordingly, $\lambda = 3$ and $\lambda = 5$ are eigenvalues of $A$.

(b) Find linearly independent eigenvectors for each eigenvalue of $A$.
(i) Subtract $\lambda = 3$ down the diagonal of $A$ to obtain the matrix
\[ M = \begin{bmatrix} 1 & 1 & -1 \\ 2 & 2 & -2 \\ 1 & 1 & -1 \end{bmatrix}, \quad \text{corresponding to} \quad x + y - z = 0 \]
Here $u = (1, -1, 0)$ and $v = (1, 0, 1)$ are linearly independent solutions.
(ii) Subtract $\lambda = 5$ down the diagonal of $A$ to obtain the matrix
\[ M = \begin{bmatrix} -1 & 1 & -1 \\ 2 & 0 & -2 \\ 1 & 1 & -3 \end{bmatrix}, \quad \text{corresponding to} \quad \begin{array}{r} -x + y - z = 0 \\ 2x - 2z = 0 \\ x + y - 3z = 0 \end{array} \quad \text{or} \quad \begin{array}{r} x - z = 0 \\ y - 2z = 0 \end{array} \]
Only $z$ is a free variable. Here $w = (1, 2, 1)$ is a solution.

Thus, $S = \{u, v, w\} = \{(1, -1, 0), (1, 0, 1), (1, 2, 1)\}$ is a maximal set of linearly independent eigenvectors of $A$.

Remark: The vectors $u$ and $v$ were chosen so that they were independent solutions of the system $x + y - z = 0$. On the other hand, $w$ is automatically independent of $u$ and $v$ because $w$ belongs to a different eigenvalue of $A$. Thus, the three vectors are linearly independent.

(c) $A$ is diagonalizable, because it has three linearly independent eigenvectors. Let $P$ be the matrix with columns $u$, $v$, $w$. Then
\[ P = \begin{bmatrix} 1 & 1 & 1 \\ -1 & 0 & 2 \\ 0 & 1 & 1 \end{bmatrix} \quad \text{and} \quad D = P^{-1}AP = \begin{bmatrix} 3 & & \\ & 3 & \\ & & 5 \end{bmatrix} \]
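The result of Problem 9.14 can be confirmed without computing $P^{-1}$: when $P$ is invertible, $D = P^{-1}AP$ is equivalent to $AP = PD$. The following sketch checks that identity with integer arithmetic; the helper names are illustrative, and the determinant uses a simple cofactor expansion.

```python
def mat_mul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def det(M):
    """Determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    return sum(
        (-1) ** j * entry * det([row[:j] + row[j + 1:] for row in M[1:]])
        for j, entry in enumerate(M[0])
    )


A = [[4, 1, -1], [2, 5, -2], [1, 1, 2]]
P = [[1, 1, 1], [-1, 0, 2], [0, 1, 1]]  # columns u, v, w
D = [[3, 0, 0], [0, 3, 0], [0, 0, 5]]

print(det(P) != 0)                    # True: P is invertible
print(mat_mul(A, P) == mat_mul(P, D))  # True: AP = PD, so P^{-1}AP = D
```

Each column of $AP$ is $A$ applied to an eigenvector, and each column of $PD$ is that eigenvector scaled by its eigenvalue, so the equality is a column-by-column restatement of $Au = 3u$, $Av = 3v$, $Aw = 5w$.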

9.15. Repeat Problem 9.14 for the matrix $B = \begin{bmatrix} 3 & -1 & 1 \\ 7 & -5 & 1 \\ 6 & -6 & 2 \end{bmatrix}$.

(a) First find the characteristic polynomial $\Delta(t)$ of $B$. We have
\[ \operatorname{tr}(B) = 0, \quad |B| = -16, \quad B_{11} = -4, \quad B_{22} = 0, \quad B_{33} = -8, \quad \text{so} \quad \sum_i B_{ii} = -12 \]
Therefore, $\Delta(t) = t^3 - 12t + 16 = (t - 2)^2(t + 4)$. Thus, $\lambda_1 = 2$ and $\lambda_2 = -4$ are the eigenvalues of $B$.

(b) Find a basis for the eigenspace of each eigenvalue of $B$.
(i) Subtract $\lambda_1 = 2$ down the diagonal of $B$ to obtain
\[ M = \begin{bmatrix} 1 & -1 & 1 \\ 7 & -7 & 1 \\ 6 & -6 & 0 \end{bmatrix}, \quad \text{corresponding to} \quad \begin{array}{r} x - y + z = 0 \\ 7x - 7y + z = 0 \\ 6x - 6y = 0 \end{array} \quad \text{or} \quad \begin{array}{r} x - y + z = 0 \\ z = 0 \end{array} \]
The system has only one independent solution; for example, $x = 1$, $y = 1$, $z = 0$. Thus, $u = (1, 1, 0)$ forms a basis for the eigenspace of $\lambda_1 = 2$.
(ii) Subtract $\lambda_2 = -4$ (or add 4) down the diagonal of $B$ to obtain
\[ M = \begin{bmatrix} 7 & -1 & 1 \\ 7 & -1 & 1 \\ 6 & -6 & 6 \end{bmatrix}, \quad \text{corresponding to} \quad \begin{array}{r} 7x - y + z = 0 \\ 7x - y + z = 0 \\ 6x - 6y + 6z = 0 \end{array} \quad \text{or} \quad \begin{array}{r} x - y + z = 0 \\ 6y - 6z = 0 \end{array} \]
The system has only one independent solution; for example, $x = 0$, $y = 1$, $z = 1$. Thus, $v = (0, 1, 1)$ forms a basis for the eigenspace of $\lambda_2 = -4$.

Thus, $S = \{u, v\}$ is a maximal set of linearly independent eigenvectors of $B$.

(c) Because $B$ has at most two linearly independent eigenvectors, $B$ is not similar to a diagonal matrix; that is, $B$ is not diagonalizable.

9.16. Find the algebraic and geometric multiplicities of the eigenvalue $\lambda_1 = 2$ of the matrix $B$ in Problem 9.15.
The algebraic multiplicity of $\lambda_1 = 2$ is 2, because $t - 2$ appears with exponent 2 in $\Delta(t)$. However, the geometric multiplicity of $\lambda_1 = 2$ is 1, because $\dim E_{\lambda_1} = 1$ (where $E_{\lambda_1}$ is the eigenspace of $\lambda_1$).
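The geometric multiplicity can also be computed mechanically: since $E_\lambda$ is the solution space of $(A - \lambda I)X = 0$, its dimension is $n - \operatorname{rank}(A - \lambda I)$. The sketch below assumes exact rational arithmetic and a plain Gaussian elimination for the rank; the function names are illustrative.

```python
from fractions import Fraction


def rank(M):
    """Rank by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in M]
    r = 0  # number of pivots found so far
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def geometric_multiplicity(A, lam):
    """dim E_lam = n - rank(A - lam I)."""
    n = len(A)
    M = [[A[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]
    return n - rank(M)


B = [[3, -1, 1], [7, -5, 1], [6, -6, 2]]
print(geometric_multiplicity(B, 2))   # 1
print(geometric_multiplicity(B, -4))  # 1
```

The two values sum to 2, which is less than $n = 3$, matching the conclusion of Problem 9.15 that $B$ is not diagonalizable.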

9.17. Let $T: \mathbf{R}^3 \to \mathbf{R}^3$ be defined by $T(x, y, z) = (2x + y - 2z,\ 2x + 3y - 4z,\ x + y - z)$. Find all eigenvalues of $T$, and find a basis of each eigenspace. Is $T$ diagonalizable? If so, find the basis $S$ of $\mathbf{R}^3$ that diagonalizes $T$, and find its diagonal representation $D$.
First find the matrix $A$ that represents $T$ relative to the usual basis of $\mathbf{R}^3$ by writing down the coefficients of $x, y, z$ as rows, and then find the characteristic polynomial of $A$ (and $T$). We have
\[ A = [T] = \begin{bmatrix} 2 & 1 & -2 \\ 2 & 3 & -4 \\ 1 & 1 & -1 \end{bmatrix} \quad \text{and} \quad \begin{array}{l} \operatorname{tr}(A) = 4, \quad |A| = 2 \\ A_{11} = 1, \quad A_{22} = 0, \quad A_{33} = 4 \\ \sum_i A_{ii} = 5 \end{array} \]
Therefore, $\Delta(t) = t^3 - 4t^2 + 5t - 2 = (t - 1)^2(t - 2)$, and so $\lambda = 1$ and $\lambda = 2$ are the eigenvalues of $A$ (and $T$). We next find linearly independent eigenvectors for each eigenvalue of $A$.

(i) Subtract $\lambda = 1$ down the diagonal of $A$ to obtain the matrix
\[ M = \begin{bmatrix} 1 & 1 & -2 \\ 2 & 2 & -4 \\ 1 & 1 & -2 \end{bmatrix}, \quad \text{corresponding to} \quad x + y - 2z = 0 \]
Here $y$ and $z$ are free variables, and so there are two linearly independent eigenvectors belonging to $\lambda = 1$. For example, $u = (1, -1, 0)$ and $v = (2, 0, 1)$ are two such eigenvectors.

(ii) Subtract $\lambda = 2$ down the diagonal of $A$ to obtain
\[ M = \begin{bmatrix} 0 & 1 & -2 \\ 2 & 1 & -4 \\ 1 & 1 & -3 \end{bmatrix}, \quad \text{corresponding to} \quad \begin{array}{r} y - 2z = 0 \\ 2x + y - 4z = 0 \\ x + y - 3z = 0 \end{array} \quad \text{or} \quad \begin{array}{r} x + y - 3z = 0 \\ y - 2z = 0 \end{array} \]
Only $z$ is a free variable. Here $w = (1, 2, 1)$ is a solution.

Thus, $T$ is diagonalizable, because it has three independent eigenvectors. Specifically, choosing
\[ S = \{u, v, w\} = \{(1, -1, 0), (2, 0, 1), (1, 2, 1)\} \]
as a basis, $T$ is represented by the diagonal matrix $D = \operatorname{diag}(1, 1, 2)$.

9.18. Prove the following for a linear operator (matrix) $T$: (a) The scalar 0 is an eigenvalue of $T$ if and only if $T$ is singular. (b) If $\lambda$ is an eigenvalue of $T$, where $T$ is invertible, then $\lambda^{-1}$ is an eigenvalue of $T^{-1}$.
(a) We have that 0 is an eigenvalue of $T$ if and only if there is a vector $v \neq 0$ such that $T(v) = 0v$; that is, if and only if $T$ is singular.
(b) Because $T$ is invertible, it is nonsingular; hence, by (a), $\lambda \neq 0$. By definition of an eigenvalue, there exists $v \neq 0$ such that $T(v) = \lambda v$. Applying $T^{-1}$ to both sides, we obtain
\[ v = T^{-1}(\lambda v) = \lambda T^{-1}(v), \quad \text{and so} \quad T^{-1}(v) = \lambda^{-1}v \]
Therefore, $\lambda^{-1}$ is an eigenvalue of $T^{-1}$.

9.19. Let $\lambda$ be an eigenvalue of a linear operator $T: V \to V$, and let $E_\lambda$ consist of all the eigenvectors belonging to $\lambda$ (called the eigenspace of $\lambda$). Prove that $E_\lambda$ is a subspace of $V$. That is, prove (a) If $u \in E_\lambda$, then $ku \in E_\lambda$ for any scalar $k$. (b) If $u, v \in E_\lambda$, then $u + v \in E_\lambda$.
(a) Because $u \in E_\lambda$, we have $T(u) = \lambda u$. Then $T(ku) = kT(u) = k(\lambda u) = \lambda(ku)$, and so $ku \in E_\lambda$. (We view the zero vector $0 \in V$ as an "eigenvector" of $\lambda$ in order for $E_\lambda$ to be a subspace of $V$.)
(b) As $u, v \in E_\lambda$, we have $T(u) = \lambda u$ and $T(v) = \lambda v$. Then $T(u + v) = T(u) + T(v) = \lambda u + \lambda v = \lambda(u + v)$, and so $u + v \in E_\lambda$.

9.20. Prove Theorem 9.6: The following are equivalent:
(i) The scalar $\lambda$ is an eigenvalue of $A$.
(ii) The matrix $\lambda I - A$ is singular.
(iii) The scalar $\lambda$ is a root of the characteristic polynomial $\Delta(t)$ of $A$.

The scalar $\lambda$ is an eigenvalue of $A$ if and only if there exists a nonzero vector $v$ such that
\[
Av = \lambda v \quad\text{or}\quad (\lambda I)v - Av = 0 \quad\text{or}\quad (\lambda I - A)v = 0
\]
or $\lambda I - A$ is singular. In such a case, $\lambda$ is a root of $\Delta(t) = |tI - A|$. Also, $v$ is in the eigenspace $E_\lambda$ of $\lambda$ if and only if the above relations hold. Hence, $v$ is a solution of $(\lambda I - A)X = 0$.

9.21. Prove Theorem 9.8′: Suppose $v_1, v_2, \ldots, v_n$ are nonzero eigenvectors of $T$ belonging to distinct eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$. Then $v_1, v_2, \ldots, v_n$ are linearly independent.


Suppose the theorem is not true. Let $v_1, v_2, \ldots, v_s$ be a minimal set of vectors for which the theorem is not true. We have $s > 1$, because $v_1 \neq 0$. Also, by the minimality condition, $v_2, \ldots, v_s$ are linearly independent. Thus, $v_1$ is a linear combination of $v_2, \ldots, v_s$, say,
\[
v_1 = a_2 v_2 + a_3 v_3 + \cdots + a_s v_s \tag{1}
\]
(where some $a_k \neq 0$). Applying $T$ to (1) and using the linearity of $T$ yields
\[
T(v_1) = T(a_2 v_2 + a_3 v_3 + \cdots + a_s v_s) = a_2 T(v_2) + a_3 T(v_3) + \cdots + a_s T(v_s) \tag{2}
\]
Because $v_j$ is an eigenvector of $T$ belonging to $\lambda_j$, we have $T(v_j) = \lambda_j v_j$. Substituting in (2) yields
\[
\lambda_1 v_1 = a_2 \lambda_2 v_2 + a_3 \lambda_3 v_3 + \cdots + a_s \lambda_s v_s \tag{3}
\]
Multiplying (1) by $\lambda_1$ yields
\[
\lambda_1 v_1 = a_2 \lambda_1 v_2 + a_3 \lambda_1 v_3 + \cdots + a_s \lambda_1 v_s \tag{4}
\]
Setting the right-hand sides of (3) and (4) equal to each other, or subtracting (3) from (4), yields
\[
a_2(\lambda_1 - \lambda_2) v_2 + a_3(\lambda_1 - \lambda_3) v_3 + \cdots + a_s(\lambda_1 - \lambda_s) v_s = 0 \tag{5}
\]
Because $v_2, v_3, \ldots, v_s$ are linearly independent, the coefficients in (5) must all be zero. That is,
\[
a_2(\lambda_1 - \lambda_2) = 0, \quad a_3(\lambda_1 - \lambda_3) = 0, \quad \ldots, \quad a_s(\lambda_1 - \lambda_s) = 0
\]
However, the $\lambda_i$ are distinct. Hence $\lambda_1 - \lambda_j \neq 0$ for $j > 1$. Hence, $a_2 = 0$, $a_3 = 0, \ldots, a_s = 0$. This contradicts the fact that some $a_k \neq 0$. The theorem is proved.

9.22. Prove Theorem 9.9. Suppose $\Delta(t) = (t - a_1)(t - a_2) \cdots (t - a_n)$ is the characteristic polynomial of an $n$-square matrix $A$, and suppose the $n$ roots $a_i$ are distinct. Then $A$ is similar to the diagonal matrix $D = \operatorname{diag}(a_1, a_2, \ldots, a_n)$.

Let $v_1, v_2, \ldots, v_n$ be (nonzero) eigenvectors corresponding to the eigenvalues $a_i$. Then the $n$ eigenvectors $v_i$ are linearly independent (Theorem 9.8), and hence form a basis of $K^n$. Accordingly, $A$ is diagonalizable (i.e., $A$ is similar to a diagonal matrix $D$), and the diagonal elements of $D$ are the eigenvalues $a_i$.

9.23. Prove Theorem 9.10′: The geometric multiplicity of an eigenvalue $\lambda$ of $T$ does not exceed its algebraic multiplicity.

Suppose the geometric multiplicity of $\lambda$ is $r$. Then its eigenspace $E_\lambda$ contains $r$ linearly independent eigenvectors $v_1, \ldots, v_r$. Extend the set $\{v_i\}$ to a basis of $V$, say, $\{v_1, \ldots, v_r, w_1, \ldots, w_s\}$. We have
\[
\begin{aligned}
T(v_1) &= \lambda v_1, \quad T(v_2) = \lambda v_2, \quad \ldots, \quad T(v_r) = \lambda v_r, \\
T(w_1) &= a_{11} v_1 + \cdots + a_{1r} v_r + b_{11} w_1 + \cdots + b_{1s} w_s \\
T(w_2) &= a_{21} v_1 + \cdots + a_{2r} v_r + b_{21} w_1 + \cdots + b_{2s} w_s \\
&\;\;\vdots \\
T(w_s) &= a_{s1} v_1 + \cdots + a_{sr} v_r + b_{s1} w_1 + \cdots + b_{ss} w_s
\end{aligned}
\]
Then $M = \begin{bmatrix} \lambda I_r & A \\ 0 & B \end{bmatrix}$ is the matrix of $T$ in the above basis, where $A = [a_{ij}]^T$ and $B = [b_{ij}]^T$.

Because $M$ is block triangular, the characteristic polynomial $(t - \lambda)^r$ of the block $\lambda I_r$ must divide the characteristic polynomial of $M$ and hence of $T$. Thus, the algebraic multiplicity of $\lambda$ for $T$ is at least $r$, as required.

Diagonalizing Real Symmetric Matrices and Quadratic Forms

9.24. Let $A = \begin{bmatrix} 7 & 3 \\ 3 & -1 \end{bmatrix}$. Find an orthogonal matrix $P$ such that $D = P^{-1}AP$ is diagonal.

First find the characteristic polynomial $\Delta(t)$ of $A$. We have
\[
\Delta(t) = t^2 - \operatorname{tr}(A)\,t + |A| = t^2 - 6t - 16 = (t - 8)(t + 2)
\]
Thus, the eigenvalues of $A$ are $\lambda = 8$ and $\lambda = -2$. We next find corresponding eigenvectors.

Subtract $\lambda = 8$ down the diagonal of $A$ to obtain the matrix
\[
M = \begin{bmatrix} -1 & 3 \\ 3 & -9 \end{bmatrix}, \quad\text{corresponding to}\quad
\begin{aligned} -x + 3y &= 0 \\ 3x - 9y &= 0 \end{aligned}
\quad\text{or}\quad x - 3y = 0
\]
A nonzero solution is $u_1 = (3, 1)$.

Subtract $\lambda = -2$ (or add 2) down the diagonal of $A$ to obtain the matrix
\[
M = \begin{bmatrix} 9 & 3 \\ 3 & 1 \end{bmatrix}, \quad\text{corresponding to}\quad
\begin{aligned} 9x + 3y &= 0 \\ 3x + y &= 0 \end{aligned}
\quad\text{or}\quad 3x + y = 0
\]
A nonzero solution is $u_2 = (1, -3)$.

As expected, because $A$ is symmetric, the eigenvectors $u_1$ and $u_2$ are orthogonal. Normalize $u_1$ and $u_2$ to obtain, respectively, the unit vectors
\[
\hat{u}_1 = (3/\sqrt{10},\ 1/\sqrt{10}) \quad\text{and}\quad \hat{u}_2 = (1/\sqrt{10},\ -3/\sqrt{10})
\]
Finally, let $P$ be the matrix whose columns are the unit vectors $\hat{u}_1$ and $\hat{u}_2$, respectively. Then
\[
P = \begin{bmatrix} 3/\sqrt{10} & 1/\sqrt{10} \\ 1/\sqrt{10} & -3/\sqrt{10} \end{bmatrix}
\quad\text{and}\quad
D = P^{-1}AP = \begin{bmatrix} 8 & 0 \\ 0 & -2 \end{bmatrix}
\]
As expected, the diagonal entries in $D$ are the eigenvalues of $A$.
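As a numerical cross-check of this result, here is a minimal sketch in plain Python (the helper names `matmul` and `transpose` are illustrative assumptions, not from the text). Because $P$ is orthogonal, $P^{-1} = P^T$, so the sketch forms $P^T A P$ and $P^T P$ in floating point; the results should agree with $\operatorname{diag}(8, -2)$ and the identity up to rounding.

```python
import math

A = [[7, 3],
     [3, -1]]
r = math.sqrt(10)
# Columns of P are the normalized eigenvectors (3, 1)/sqrt(10) and (1, -3)/sqrt(10).
P = [[3 / r, 1 / r],
     [1 / r, -3 / r]]

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*X)]

D = matmul(transpose(P), matmul(A, P))   # should be close to diag(8, -2)
PtP = matmul(transpose(P), P)            # should be close to the identity
print(D)
print(PtP)
```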

9.25. Let $B = \begin{bmatrix} 11 & -8 & 4 \\ -8 & -1 & -2 \\ 4 & -2 & -4 \end{bmatrix}$.
(a) Find all eigenvalues of $B$.
(b) Find a maximal set $S$ of nonzero orthogonal eigenvectors of $B$.
(c) Find an orthogonal matrix $P$ such that $D = P^{-1}BP$ is diagonal.
(a) First find the characteristic polynomial of $B$. We have
\[
\operatorname{tr}(B) = 6, \quad |B| = 400, \quad B_{11} = 0, \quad B_{22} = -60, \quad B_{33} = -75, \quad\text{so}\quad \textstyle\sum_i B_{ii} = -135
\]
Hence, $\Delta(t) = t^3 - 6t^2 - 135t - 400$. If $\Delta(t)$ has an integer root it must divide 400. Testing $t = -5$, by synthetic division, yields
\[
\begin{array}{r|rrrr}
-5 & 1 & -6 & -135 & -400 \\
   &   & -5 & +55 & +400 \\ \hline
   & 1 & -11 & -80 & 0
\end{array}
\]
Thus, $t + 5$ is a factor of $\Delta(t)$, and $t^2 - 11t - 80$ is a factor. Thus,
\[
\Delta(t) = (t + 5)(t^2 - 11t - 80) = (t + 5)^2 (t - 16)
\]
The eigenvalues of $B$ are $\lambda = -5$ (multiplicity 2) and $\lambda = 16$ (multiplicity 1).

(b) Find an orthogonal basis for each eigenspace. Subtract $\lambda = -5$ (or add 5) down the diagonal of $B$ to obtain the homogeneous system
\[
16x - 8y + 4z = 0, \quad -8x + 4y - 2z = 0, \quad 4x - 2y + z = 0
\]
That is, $4x - 2y + z = 0$. The system has two independent solutions. One solution is $v_1 = (0, 1, 2)$. We seek a second solution $v_2 = (a, b, c)$, which is orthogonal to $v_1$, such that
\[
4a - 2b + c = 0, \quad\text{and also}\quad b + 2c = 0
\]

One such solution is $v_2 = (-5, -8, 4)$. Subtract $\lambda = 16$ down the diagonal of $B$ to obtain the homogeneous system
\[
-5x - 8y + 4z = 0, \quad -8x - 17y - 2z = 0, \quad 4x - 2y - 20z = 0
\]
This system yields a nonzero solution $v_3 = (4, -2, 1)$. (As expected from Theorem 9.13, the eigenvector $v_3$ is orthogonal to $v_1$ and $v_2$.) Then $v_1, v_2, v_3$ form a maximal set of nonzero orthogonal eigenvectors of $B$.

(c) Normalize $v_1, v_2, v_3$ to obtain the orthonormal basis:
\[
\hat{v}_1 = v_1/\sqrt{5}, \quad \hat{v}_2 = v_2/\sqrt{105}, \quad \hat{v}_3 = v_3/\sqrt{21}
\]
Then $P$ is the matrix whose columns are $\hat{v}_1, \hat{v}_2, \hat{v}_3$. Thus,
\[
P = \begin{bmatrix}
0 & -5/\sqrt{105} & 4/\sqrt{21} \\
1/\sqrt{5} & -8/\sqrt{105} & -2/\sqrt{21} \\
2/\sqrt{5} & 4/\sqrt{105} & 1/\sqrt{21}
\end{bmatrix}
\quad\text{and}\quad
D = P^{-1}BP = \begin{bmatrix} -5 & & \\ & -5 & \\ & & 16 \end{bmatrix}
\]
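The second eigenvector $v_2$ for $\lambda = -5$ was found above by solving two linear conditions. An alternative route, sketched here in plain Python, is a single Gram-Schmidt step; this is a swapped-in technique rather than the method used in the solution, and the starting vector `s = (1, 2, 0)` is an assumed second solution of $4x - 2y + z = 0$ chosen for illustration. Removing from `s` its component along $v_1 = (0, 1, 2)$ gives a vector in the eigenspace orthogonal to $v_1$, and rescaling recovers $(-5, -8, 4)$.

```python
from fractions import Fraction

def dot(a, b):
    """Usual dot product of two vectors."""
    return sum(x * y for x, y in zip(a, b))

def orthogonalize(first, second):
    """One Gram-Schmidt step: remove from `second` its component along `first`."""
    coeff = Fraction(dot(second, first), dot(first, first))
    return tuple(y - coeff * x for x, y in zip(first, second))

v1 = (0, 1, 2)   # eigenvector for lambda = -5 taken from the solution
s = (1, 2, 0)    # assumed second solution of 4x - 2y + z = 0 (4 - 4 + 0 = 0)

v2 = orthogonalize(v1, s)             # still satisfies 4x - 2y + z = 0
scaled = tuple(-5 * c for c in v2)    # clear denominators and fix the sign
print(v2, scaled)
```

Exact rational arithmetic (`fractions.Fraction`) keeps the orthogonality test exact; any nonzero multiple of `v2` would serve equally well as the second basis vector.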

9.26. Let $q(x, y) = x^2 + 6xy - 7y^2$. Find an orthogonal substitution that diagonalizes $q$.

Find the symmetric matrix $A$ that represents $q$ and its characteristic polynomial $\Delta(t)$. We have
\[
A = \begin{bmatrix} 1 & 3 \\ 3 & -7 \end{bmatrix} \quad\text{and}\quad \Delta(t) = t^2 + 6t - 16 = (t - 2)(t + 8)
\]
The eigenvalues of $A$ are $\lambda = 2$ and $\lambda = -8$. Thus, using $s$ and $t$ as new variables, a diagonal form of $q$ is
\[
q(s, t) = 2s^2 - 8t^2
\]
The corresponding orthogonal substitution is obtained by finding an orthogonal set of eigenvectors of $A$.

(i) Subtract $\lambda = 2$ down the diagonal of $A$ to obtain the matrix
\[
M = \begin{bmatrix} -1 & 3 \\ 3 & -9 \end{bmatrix}, \quad\text{corresponding to}\quad
\begin{aligned} -x + 3y &= 0 \\ 3x - 9y &= 0 \end{aligned}
\quad\text{or}\quad -x + 3y = 0
\]
A nonzero solution is $u_1 = (3, 1)$.

(ii) Subtract $\lambda = -8$ (or add 8) down the diagonal of $A$ to obtain the matrix
\[
M = \begin{bmatrix} 9 & 3 \\ 3 & 1 \end{bmatrix}, \quad\text{corresponding to}\quad
\begin{aligned} 9x + 3y &= 0 \\ 3x + y &= 0 \end{aligned}
\quad\text{or}\quad 3x + y = 0
\]
A nonzero solution is $u_2 = (-1, 3)$.

As expected, because $A$ is symmetric, the eigenvectors $u_1$ and $u_2$ are orthogonal. Now normalize $u_1$ and $u_2$ to obtain, respectively, the unit vectors
\[
\hat{u}_1 = (3/\sqrt{10},\ 1/\sqrt{10}) \quad\text{and}\quad \hat{u}_2 = (-1/\sqrt{10},\ 3/\sqrt{10})
\]
Finally, let $P$ be the matrix whose columns are the unit vectors $\hat{u}_1$ and $\hat{u}_2$, respectively, and then $[x, y]^T = P[s, t]^T$ is the required orthogonal change of coordinates. That is,
\[
P = \begin{bmatrix} 3/\sqrt{10} & -1/\sqrt{10} \\ 1/\sqrt{10} & 3/\sqrt{10} \end{bmatrix}
\quad\text{and}\quad
x = \frac{3s - t}{\sqrt{10}}, \quad y = \frac{s + 3t}{\sqrt{10}}
\]
One can also express $s$ and $t$ in terms of $x$ and $y$ by using $P^{-1} = P^T$. That is,
\[
s = \frac{3x + y}{\sqrt{10}}, \quad t = \frac{-x + 3y}{\sqrt{10}}
\]

Minimal Polynomial

9.27. Let $A = \begin{bmatrix} 4 & -2 & 2 \\ 6 & -3 & 4 \\ 3 & -2 & 3 \end{bmatrix}$ and $B = \begin{bmatrix} 3 & -2 & 2 \\ 4 & -4 & 6 \\ 2 & -3 & 5 \end{bmatrix}$. The characteristic polynomial of both matrices is $\Delta(t) = (t - 2)(t - 1)^2$. Find the minimal polynomial $m(t)$ of each matrix.

The minimal polynomial $m(t)$ must divide $\Delta(t)$. Also, each factor of $\Delta(t)$ (i.e., $t - 2$ and $t - 1$) must also be a factor of $m(t)$. Thus, $m(t)$ must be exactly one of the following:
\[
f(t) = (t - 2)(t - 1) \quad\text{or}\quad g(t) = (t - 2)(t - 1)^2
\]
(a) By the Cayley–Hamilton theorem, $g(A) = \Delta(A) = 0$, so we need only test $f(t)$. We have
\[
f(A) = (A - 2I)(A - I) =
\begin{bmatrix} 2 & -2 & 2 \\ 6 & -5 & 4 \\ 3 & -2 & 1 \end{bmatrix}
\begin{bmatrix} 3 & -2 & 2 \\ 6 & -4 & 4 \\ 3 & -2 & 2 \end{bmatrix}
= \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}
\]
Thus, $m(t) = f(t) = (t - 2)(t - 1) = t^2 - 3t + 2$ is the minimal polynomial of $A$.

(b) Again $g(B) = \Delta(B) = 0$, so we need only test $f(t)$. We get
\[
f(B) = (B - 2I)(B - I) =
\begin{bmatrix} 1 & -2 & 2 \\ 4 & -6 & 6 \\ 2 & -3 & 3 \end{bmatrix}
\begin{bmatrix} 2 & -2 & 2 \\ 4 & -5 & 6 \\ 2 & -3 & 4 \end{bmatrix}
= \begin{bmatrix} -2 & 2 & -2 \\ -4 & 4 & -4 \\ -2 & 2 & -2 \end{bmatrix} \neq 0
\]
Thus, $m(t) \neq f(t)$. Accordingly, $m(t) = g(t) = (t - 2)(t - 1)^2$ is the minimal polynomial of $B$. [We emphasize that we do not need to compute $g(B)$; we know $g(B) = 0$ from the Cayley–Hamilton theorem.]
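The test used in this problem, evaluating the candidate $f(t) = (t - 2)(t - 1)$ at each matrix, is easy to reproduce. The following is a minimal sketch in plain Python (helper names `matmul` and `shift` are illustrative assumptions) that forms $(X - 2I)(X - I)$ for $X = A$ and $X = B$ with exact integer arithmetic.

```python
def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def shift(M, c):
    """Return M - c*I."""
    return [[x - (c if i == j else 0) for j, x in enumerate(row)]
            for i, row in enumerate(M)]

A = [[4, -2, 2], [6, -3, 4], [3, -2, 3]]
B = [[3, -2, 2], [4, -4, 6], [2, -3, 5]]

fA = matmul(shift(A, 2), shift(A, 1))   # zero matrix, so m_A(t) = (t-2)(t-1)
fB = matmul(shift(B, 2), shift(B, 1))   # nonzero, so m_B(t) = (t-2)(t-1)^2
print(fA)
print(fB)
```

A zero result means the candidate already annihilates the matrix; a nonzero result forces the higher power, exactly as argued in parts (a) and (b).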

9.28. Find the minimal polynomial $m(t)$ of each of the following matrices:
\[
\text{(a) } A = \begin{bmatrix} 5 & 1 \\ 3 & 7 \end{bmatrix}, \quad
\text{(b) } B = \begin{bmatrix} 1 & 2 & 3 \\ 0 & 2 & 3 \\ 0 & 0 & 3 \end{bmatrix}, \quad
\text{(c) } C = \begin{bmatrix} 4 & -1 \\ 1 & 2 \end{bmatrix}
\]
(a) The characteristic polynomial of $A$ is $\Delta(t) = t^2 - 12t + 32 = (t - 4)(t - 8)$. Because $\Delta(t)$ has distinct factors, the minimal polynomial $m(t) = \Delta(t) = t^2 - 12t + 32$.
(b) Because $B$ is triangular, its eigenvalues are the diagonal elements 1, 2, 3; and so its characteristic polynomial is $\Delta(t) = (t - 1)(t - 2)(t - 3)$. Because $\Delta(t)$ has distinct factors, $m(t) = \Delta(t)$.
(c) The characteristic polynomial of $C$ is $\Delta(t) = t^2 - 6t + 9 = (t - 3)^2$. Hence the minimal polynomial of $C$ is $f(t) = t - 3$ or $g(t) = (t - 3)^2$. However, $f(C) \neq 0$; that is, $C - 3I \neq 0$. Hence, $m(t) = g(t) = \Delta(t) = (t - 3)^2$.

9.29. Suppose $S = \{u_1, u_2, \ldots, u_n\}$ is a basis of $V$, and suppose $F$ and $G$ are linear operators on $V$ such that $[F]$ has 0's on and below the diagonal, and $[G]$ has $a \neq 0$ on the superdiagonal and 0's elsewhere. That is,
\[
[F] = \begin{bmatrix}
0 & a_{21} & a_{31} & \cdots & a_{n1} \\
0 & 0 & a_{32} & \cdots & a_{n2} \\
\vdots & \vdots & \vdots & & \vdots \\
0 & 0 & 0 & \cdots & a_{n,n-1} \\
0 & 0 & 0 & \cdots & 0
\end{bmatrix}, \qquad
[G] = \begin{bmatrix}
0 & a & 0 & \cdots & 0 \\
0 & 0 & a & \cdots & 0 \\
\vdots & \vdots & \vdots & & \vdots \\
0 & 0 & 0 & \cdots & a \\
0 & 0 & 0 & \cdots & 0
\end{bmatrix}
\]
Show that (a) $F^n = 0$, (b) $G^{n-1} \neq 0$, but $G^n = 0$. (These conditions also hold for $[F]$ and $[G]$.)

(a) We have $F(u_1) = 0$ and, for $r > 1$, $F(u_r)$ is a linear combination of vectors preceding $u_r$ in $S$. That is,
\[
F(u_r) = a_{r1} u_1 + a_{r2} u_2 + \cdots + a_{r,r-1} u_{r-1}
\]
Hence, $F^2(u_r) = F(F(u_r))$ is a linear combination of vectors preceding $u_{r-1}$, and so on. Hence, $F^r(u_r) = 0$ for each $r$. Thus, for each $r$, $F^n(u_r) = F^{n-r}(0) = 0$, and so $F^n = 0$, as claimed.
(b) We have $G(u_1) = 0$ and, for each $k > 1$, $G(u_k) = a u_{k-1}$. Hence, $G^r(u_k) = a^r u_{k-r}$ for $r < k$. Because $a \neq 0$, $a^{n-1} \neq 0$. Therefore, $G^{n-1}(u_n) = a^{n-1} u_1 \neq 0$, and so $G^{n-1} \neq 0$. On the other hand, by (a), $G^n = 0$.
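Part (b) can be illustrated on a small instance. The sketch below, in plain Python, takes the assumed sample values $n = 4$ and $a = 2$ (chosen only for illustration) and builds the matrix with $a$ on the superdiagonal; the helper names `matmul` and `power` are likewise illustrative. It shows that the $(n-1)$st power still has the single nonzero entry $a^{n-1}$ while the $n$th power vanishes.

```python
def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def power(M, k):
    """Return M**k for an integer k >= 0 by repeated multiplication."""
    n = len(M)
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = matmul(result, M)
    return result

n, a = 4, 2   # assumed sample size and superdiagonal entry
G = [[a if j == i + 1 else 0 for j in range(n)] for i in range(n)]

G_pow_n_minus_1 = power(G, n - 1)   # only the top-right entry a**(n-1) survives
G_pow_n = power(G, n)               # zero matrix
print(G_pow_n_minus_1)
print(G_pow_n)
```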

9.30. Let $B$ be the matrix in Example 9.12(a) that has $\lambda$'s on the diagonal, $a$'s on the superdiagonal, where $a \neq 0$, and 0's elsewhere. Show that $f(t) = (t - \lambda)^n$ is both the characteristic polynomial $\Delta(t)$ and the minimal polynomial $m(t)$ of $B$.

Because $B$ is triangular with $\lambda$'s on the diagonal, $\Delta(t) = f(t) = (t - \lambda)^n$ is its characteristic polynomial. Thus, $m(t)$ is a power of $t - \lambda$. By Problem 9.29, $(B - \lambda I)^r \neq 0$ for $r < n$. Hence, $m(t) = \Delta(t) = (t - \lambda)^n$.

9.31. Find the characteristic polynomial $\Delta(t)$ and minimal polynomial $m(t)$ of each matrix:
\[
\text{(a) } M = \begin{bmatrix}
4 & 1 & 0 & 0 & 0 \\
0 & 4 & 1 & 0 & 0 \\
0 & 0 & 4 & 0 & 0 \\
0 & 0 & 0 & 4 & 1 \\
0 & 0 & 0 & 0 & 4
\end{bmatrix}, \qquad
\text{(b) } M' = \begin{bmatrix}
2 & 7 & 0 & 0 \\
0 & 2 & 0 & 0 \\
0 & 0 & 1 & 1 \\
0 & 0 & -2 & 4
\end{bmatrix}
\]
(a) $M$ is block diagonal with diagonal blocks
\[
A = \begin{bmatrix} 4 & 1 & 0 \\ 0 & 4 & 1 \\ 0 & 0 & 4 \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} 4 & 1 \\ 0 & 4 \end{bmatrix}
\]
The characteristic and minimal polynomial of $A$ is $f(t) = (t - 4)^3$ and the characteristic and minimal polynomial of $B$ is $g(t) = (t - 4)^2$. Then
\[
\Delta(t) = f(t)g(t) = (t - 4)^5 \quad\text{but}\quad m(t) = \operatorname{LCM}[f(t), g(t)] = (t - 4)^3
\]
(where LCM means least common multiple). We emphasize that the exponent in $m(t)$ is the size of the largest block.

(b) Here $M'$ is block diagonal with diagonal blocks $A' = \begin{bmatrix} 2 & 7 \\ 0 & 2 \end{bmatrix}$ and $B' = \begin{bmatrix} 1 & 1 \\ -2 & 4 \end{bmatrix}$. The characteristic and minimal polynomial of $A'$ is $f(t) = (t - 2)^2$. The characteristic polynomial of $B'$ is $g(t) = t^2 - 5t + 6 = (t - 2)(t - 3)$, which has distinct factors. Hence, $g(t)$ is also the minimal polynomial of $B'$. Accordingly,
\[
\Delta(t) = f(t)g(t) = (t - 2)^3 (t - 3) \quad\text{but}\quad m(t) = \operatorname{LCM}[f(t), g(t)] = (t - 2)^2 (t - 3)
\]

9.32. Find a matrix $A$ whose minimal polynomial is $f(t) = t^3 - 8t^2 + 5t + 7$.

Simply let $A = \begin{bmatrix} 0 & 0 & -7 \\ 1 & 0 & -5 \\ 0 & 1 & 8 \end{bmatrix}$, the companion matrix of $f(t)$ [defined in Example 9.12(b)].
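The companion-matrix construction can be sketched in a few lines of plain Python. The function names `companion` and `poly_at` and the coefficient ordering `[a_0, ..., a_{n-1}]` for the monic polynomial $t^n + a_{n-1}t^{n-1} + \cdots + a_0$ are conventions assumed for this illustration. The sketch builds the matrix for $f(t) = t^3 - 8t^2 + 5t + 7$ and checks by Horner's scheme that $f(A) = 0$; this confirms only that $A$ is a zero of $f$, while minimality is the fact quoted from Example 9.12(b).

```python
def companion(coeffs):
    """Companion matrix of t^n + a_{n-1} t^{n-1} + ... + a_0, coeffs = [a_0, ..., a_{n-1}].

    1's sit on the subdiagonal and the last column holds -a_0, ..., -a_{n-1}.
    """
    n = len(coeffs)
    return [[(1 if i == j + 1 else 0) if j < n - 1 else -coeffs[i]
             for j in range(n)] for i in range(n)]

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_at(coeffs, M):
    """Evaluate the monic polynomial with the given lower coefficients at M (Horner)."""
    n = len(M)
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for a in reversed(coeffs):
        result = matmul(result, M)
        result = [[result[i][j] + (a if i == j else 0) for j in range(n)]
                  for i in range(n)]
    return result

coeffs = [7, 5, -8]          # f(t) = t^3 - 8t^2 + 5t + 7
A = companion(coeffs)
fA = poly_at(coeffs, A)      # expected to be the zero matrix
print(A)
print(fA)
```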

9.33. Prove Theorem 9.15: The minimal polynomial $m(t)$ of a matrix (linear operator) $A$ divides every polynomial that has $A$ as a zero. In particular (by the Cayley–Hamilton theorem), $m(t)$ divides the characteristic polynomial $\Delta(t)$ of $A$.

Suppose $f(t)$ is a polynomial for which $f(A) = 0$. By the division algorithm, there exist polynomials $q(t)$ and $r(t)$ for which $f(t) = m(t)q(t) + r(t)$ and $r(t) = 0$ or $\deg r(t) < \deg m(t)$. Substituting $t = A$ in this equation, and using that $f(A) = 0$ and $m(A) = 0$, we obtain $r(A) = 0$. If $r(t) \neq 0$, then $r(t)$ is a polynomial of degree less than $m(t)$ that has $A$ as a zero. This contradicts the definition of the minimal polynomial. Thus, $r(t) = 0$, and so $f(t) = m(t)q(t)$; that is, $m(t)$ divides $f(t)$.


9.34. Let $m(t)$ be the minimal polynomial of an $n$-square matrix $A$. Prove that the characteristic polynomial $\Delta(t)$ of $A$ divides $[m(t)]^n$.

Suppose $m(t) = t^r + c_1 t^{r-1} + \cdots + c_{r-1} t + c_r$. Define matrices $B_j$ as follows:
\[
\begin{aligned}
B_0 &= I & &\text{so} & I &= B_0 \\
B_1 &= A + c_1 I & &\text{so} & c_1 I &= B_1 - A = B_1 - AB_0 \\
B_2 &= A^2 + c_1 A + c_2 I & &\text{so} & c_2 I &= B_2 - A(A + c_1 I) = B_2 - AB_1 \\
&\;\;\vdots \\
B_{r-1} &= A^{r-1} + c_1 A^{r-2} + \cdots + c_{r-1} I & &\text{so} & c_{r-1} I &= B_{r-1} - AB_{r-2}
\end{aligned}
\]
Then
\[
-AB_{r-1} = c_r I - (A^r + c_1 A^{r-1} + \cdots + c_{r-1} A + c_r I) = c_r I - m(A) = c_r I
\]
Set
\[
B(t) = t^{r-1} B_0 + t^{r-2} B_1 + \cdots + t B_{r-2} + B_{r-1}
\]
Then
\[
\begin{aligned}
(tI - A)B(t) &= (t^r B_0 + t^{r-1} B_1 + \cdots + t B_{r-1}) - (t^{r-1} AB_0 + t^{r-2} AB_1 + \cdots + AB_{r-1}) \\
&= t^r B_0 + t^{r-1}(B_1 - AB_0) + t^{r-2}(B_2 - AB_1) + \cdots + t(B_{r-1} - AB_{r-2}) - AB_{r-1} \\
&= t^r I + c_1 t^{r-1} I + c_2 t^{r-2} I + \cdots + c_{r-1} t I + c_r I = m(t) I
\end{aligned}
\]
Taking the determinant of both sides gives $|tI - A|\,|B(t)| = |m(t) I| = [m(t)]^n$. Because $|B(t)|$ is a polynomial, $|tI - A|$ divides $[m(t)]^n$; that is, the characteristic polynomial of $A$ divides $[m(t)]^n$.

9.35. Prove Theorem 9.16: The characteristic polynomial $\Delta(t)$ and the minimal polynomial $m(t)$ of $A$ have the same irreducible factors.

Suppose $f(t)$ is an irreducible polynomial. If $f(t)$ divides $m(t)$, then $f(t)$ also divides $\Delta(t)$ [because $m(t)$ divides $\Delta(t)$]. On the other hand, if $f(t)$ divides $\Delta(t)$, then by Problem 9.34, $f(t)$ also divides $[m(t)]^n$. But $f(t)$ is irreducible; hence, $f(t)$ also divides $m(t)$. Thus, $m(t)$ and $\Delta(t)$ have the same irreducible factors.

9.36. Prove Theorem 9.19: The minimal polynomial $m(t)$ of a block diagonal matrix $M$ with diagonal blocks $A_i$ is equal to the least common multiple (LCM) of the minimal polynomials of the diagonal blocks $A_i$.

We prove the theorem for the case $r = 2$. The general theorem follows easily by induction. Suppose $M = \begin{bmatrix} A & 0 \\ 0 & B \end{bmatrix}$, where $A$ and $B$ are square matrices. We need to show that the minimal polynomial $m(t)$ of $M$ is the LCM of the minimal polynomials $g(t)$ and $h(t)$ of $A$ and $B$, respectively.

Because $m(t)$ is the minimal polynomial of $M$, $m(M) = \begin{bmatrix} m(A) & 0 \\ 0 & m(B) \end{bmatrix} = 0$, and so $m(A) = 0$ and $m(B) = 0$. Because $g(t)$ is the minimal polynomial of $A$, $g(t)$ divides $m(t)$. Similarly, $h(t)$ divides $m(t)$. Thus $m(t)$ is a multiple of $g(t)$ and $h(t)$.

Now let $f(t)$ be another multiple of $g(t)$ and $h(t)$. Then $f(M) = \begin{bmatrix} f(A) & 0 \\ 0 & f(B) \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} = 0$. But $m(t)$ is the minimal polynomial of $M$; hence, $m(t)$ divides $f(t)$. Thus, $m(t)$ is the LCM of $g(t)$ and $h(t)$.

9.37. Suppose $m(t) = t^r + a_{r-1} t^{r-1} + \cdots + a_1 t + a_0$ is the minimal polynomial of an $n$-square matrix $A$. Prove the following:
(a) $A$ is nonsingular if and only if the constant term $a_0 \neq 0$.
(b) If $A$ is nonsingular, then $A^{-1}$ is a polynomial in $A$ of degree $r - 1 < n$.

(a) The following are equivalent: (i) $A$ is nonsingular, (ii) 0 is not a root of $m(t)$, (iii) $a_0 \neq 0$. Thus, the statement is true.
(b) Because $A$ is nonsingular, $a_0 \neq 0$ by (a). We have
\[
m(A) = A^r + a_{r-1} A^{r-1} + \cdots + a_1 A + a_0 I = 0
\]
Thus,
\[
-\frac{1}{a_0}\left(A^{r-1} + a_{r-1} A^{r-2} + \cdots + a_1 I\right) A = I
\]
Accordingly,
\[
A^{-1} = -\frac{1}{a_0}\left(A^{r-1} + a_{r-1} A^{r-2} + \cdots + a_1 I\right)
\]
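The formula in part (b) translates directly into a computation. The following is a minimal sketch in plain Python; the function name `inverse_from_minimal_polynomial` and the coefficient ordering `[a_0, a_1, ..., a_{r-1}]` are assumptions made for this illustration. As sample data it uses the matrix $A$ of Problem 9.27, whose minimal polynomial was found there to be $t^2 - 3t + 2$, so the formula reduces to $A^{-1} = -\tfrac{1}{2}(A - 3I)$.

```python
from fractions import Fraction

def identity(n):
    """The n x n identity matrix."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inverse_from_minimal_polynomial(A, coeffs):
    """Inverse of A from m(t) = t^r + a_{r-1} t^{r-1} + ... + a_0.

    coeffs = [a_0, a_1, ..., a_{r-1}]; requires a_0 != 0.
    Computes -(1/a_0) (A^{r-1} + a_{r-1} A^{r-2} + ... + a_1 I) by Horner's scheme.
    """
    if coeffs[0] == 0:
        raise ValueError("a_0 = 0, so the matrix is singular")
    n = len(A)
    result = identity(n)
    for a in reversed(coeffs[1:]):
        result = matmul(result, A)
        result = [[result[i][j] + (a if i == j else 0) for j in range(n)]
                  for i in range(n)]
    scale = Fraction(-1, coeffs[0])
    return [[scale * x for x in row] for row in result]

A = [[4, -2, 2], [6, -3, 4], [3, -2, 3]]
A_inv = inverse_from_minimal_polynomial(A, [2, -3])   # m(t) = t^2 - 3t + 2
print(A_inv)
print(matmul(A, A_inv))
```

The point of the design is that no elimination is needed: once the minimal polynomial is known, the inverse costs only $r - 2$ matrix multiplications.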

SUPPLEMENTARY PROBLEMS

Polynomials of Matrices

9.38. Let $A = \begin{bmatrix} 2 & -3 \\ 5 & 1 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 2 \\ 0 & 3 \end{bmatrix}$. Find $f(A)$, $g(A)$, $f(B)$, $g(B)$, where $f(t) = 2t^2 - 5t + 6$ and $g(t) = t^3 - 2t^2 + t + 3$.

9.39. Let $A = \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}$. Find $A^2$, $A^3$, $A^n$, where $n > 3$, and $A^{-1}$.

9.40. Let $B = \begin{bmatrix} 8 & 12 & 0 \\ 0 & 8 & 12 \\ 0 & 0 & 8 \end{bmatrix}$. Find a real matrix $A$ such that $B = A^3$.

9.41. For each matrix, find a polynomial having the following matrix as a root:
\[
\text{(a) } A = \begin{bmatrix} 2 & 5 \\ 1 & -3 \end{bmatrix}, \quad
\text{(b) } B = \begin{bmatrix} 2 & -3 \\ 7 & -4 \end{bmatrix}, \quad
\text{(c) } C = \begin{bmatrix} 1 & 1 & 2 \\ 1 & 2 & 3 \\ 2 & 1 & 4 \end{bmatrix}
\]

9.42. Let $A$ be any square matrix and let $f(t)$ be any polynomial. Prove (a) $(P^{-1}AP)^n = P^{-1}A^nP$. (b) $f(P^{-1}AP) = P^{-1}f(A)P$. (c) $f(A^T) = [f(A)]^T$. (d) If $A$ is symmetric, then $f(A)$ is symmetric.

9.43. Let $M = \operatorname{diag}[A_1, \ldots, A_r]$ be a block diagonal matrix, and let $f(t)$ be any polynomial. Show that $f(M)$ is block diagonal and $f(M) = \operatorname{diag}[f(A_1), \ldots, f(A_r)]$.

9.44. Let $M$ be a block triangular matrix with diagonal blocks $A_1, \ldots, A_r$, and let $f(t)$ be any polynomial. Show that $f(M)$ is also a block triangular matrix, with diagonal blocks $f(A_1), \ldots, f(A_r)$.

Eigenvalues and Eigenvectors
9.45. For each of the following matrices, find all eigenvalues and corresponding linearly independent eigenvectors:
\[
\text{(a) } A = \begin{bmatrix} 2 & -3 \\ 2 & -5 \end{bmatrix}, \quad
\text{(b) } B = \begin{bmatrix} 2 & 4 \\ -1 & 6 \end{bmatrix}, \quad
\text{(c) } C = \begin{bmatrix} 1 & -4 \\ 3 & -7 \end{bmatrix}
\]
When possible, find the nonsingular matrix $P$ that diagonalizes the matrix.

9.46. Let $A = \begin{bmatrix} 2 & -1 \\ -2 & 3 \end{bmatrix}$.
(a) Find eigenvalues and corresponding eigenvectors.
(b) Find a nonsingular matrix $P$ such that $D = P^{-1}AP$ is diagonal.
(c) Find $A^8$ and $f(A)$ where $f(t) = t^4 - 5t^3 + 7t^2 - 2t + 5$.
(d) Find a matrix $B$ such that $B^2 = A$.

9.47. Repeat Problem 9.46 for $A = \begin{bmatrix} 5 & 6 \\ -2 & -2 \end{bmatrix}$.

9.48. For each of the following matrices, find all eigenvalues and a maximum set $S$ of linearly independent eigenvectors:
\[
\text{(a) } A = \begin{bmatrix} 1 & -3 & 3 \\ 3 & -5 & 3 \\ 6 & -6 & 4 \end{bmatrix}, \quad
\text{(b) } B = \begin{bmatrix} 3 & -1 & 1 \\ 7 & -5 & 1 \\ 6 & -6 & 2 \end{bmatrix}, \quad
\text{(c) } C = \begin{bmatrix} 1 & 2 & 2 \\ 1 & 2 & -1 \\ -1 & 1 & 4 \end{bmatrix}
\]
Which matrices can be diagonalized, and why?

9.49. For each of the following linear operators $T: \mathbf{R}^2 \to \mathbf{R}^2$, find all eigenvalues and a basis for each eigenspace: (a) $T(x, y) = (3x + 3y,\ x + 5y)$, (b) $T(x, y) = (3x - 13y,\ x - 3y)$.

9.50. Let $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ be a real matrix. Find necessary and sufficient conditions on $a, b, c, d$ so that $A$ is diagonalizable; that is, so that $A$ has two (real) linearly independent eigenvectors.

9.51. Show that matrices $A$ and $A^T$ have the same eigenvalues. Give an example of a $2 \times 2$ matrix $A$ where $A$ and $A^T$ have different eigenvectors.

9.52. Suppose $v$ is an eigenvector of linear operators $F$ and $G$. Show that $v$ is also an eigenvector of the linear operator $kF + k'G$, where $k$ and $k'$ are scalars.

9.53. Suppose $v$ is an eigenvector of a linear operator $T$ belonging to the eigenvalue $\lambda$. Prove (a) For $n > 0$, $v$ is an eigenvector of $T^n$ belonging to $\lambda^n$. (b) $f(\lambda)$ is an eigenvalue of $f(T)$ for any polynomial $f(t)$.

9.54. Suppose $\lambda \neq 0$ is an eigenvalue of the composition $F \circ G$ of linear operators $F$ and $G$. Show that $\lambda$ is also an eigenvalue of the composition $G \circ F$. [Hint: Show that $G(v)$ is an eigenvector of $G \circ F$.]

9.55. Let $E: V \to V$ be a projection mapping; that is, $E^2 = E$. Show that $E$ is diagonalizable and, in fact, can be represented by the diagonal matrix $M = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}$, where $r$ is the rank of $E$.

Diagonalizing Real Symmetric Matrices and Quadratic Forms
9.56. For each of the following symmetric matrices $A$, find an orthogonal matrix $P$ and a diagonal matrix $D$ such that $D = P^{-1}AP$:
\[
\text{(a) } A = \begin{bmatrix} 5 & 4 \\ 4 & -1 \end{bmatrix}, \quad
\text{(b) } A = \begin{bmatrix} 4 & -1 \\ -1 & 4 \end{bmatrix}, \quad
\text{(c) } A = \begin{bmatrix} 7 & 3 \\ 3 & -1 \end{bmatrix}
\]

9.57. For each of the following symmetric matrices $B$, find its eigenvalues, a maximal orthogonal set $S$ of eigenvectors, and an orthogonal matrix $P$ such that $D = P^{-1}BP$ is diagonal:
\[
\text{(a) } B = \begin{bmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{bmatrix}, \quad
\text{(b) } B = \begin{bmatrix} 2 & 2 & 4 \\ 2 & 5 & 8 \\ 4 & 8 & 17 \end{bmatrix}
\]

9.58. Using variables $s$ and $t$, find an orthogonal substitution that diagonalizes each of the following quadratic forms: (a) $q(x, y) = 4x^2 + 8xy - 11y^2$, (b) $q(x, y) = 2x^2 - 6xy + 10y^2$.

9.59. For each of the following quadratic forms $q(x, y, z)$, find an orthogonal substitution expressing $x, y, z$ in terms of variables $r, s, t$, and find $q(r, s, t)$: (a) $q(x, y, z) = 5x^2 + 3y^2 + 12xz$, (b) $q(x, y, z) = 3x^2 - 4xy + 6y^2 + 2xz - 4yz + 3z^2$.

9.60. Find a real $2 \times 2$ symmetric matrix $A$ with eigenvalues: (a) $\lambda = 1$ and $\lambda = 4$ and eigenvector $u = (1, 1)$ belonging to $\lambda = 1$; (b) $\lambda = 2$ and $\lambda = 3$ and eigenvector $u = (1, 2)$ belonging to $\lambda = 2$. In each case, find a matrix $B$ for which $B^2 = A$.

Characteristic and Minimal Polynomials
9.61. Find the characteristic and minimal polynomials of each of the following matrices:
\[
\text{(a) } A = \begin{bmatrix} 3 & 1 & -1 \\ 2 & 4 & -2 \\ -1 & -1 & 3 \end{bmatrix}, \quad
\text{(b) } B = \begin{bmatrix} 3 & 2 & -1 \\ 3 & 8 & -3 \\ 3 & 6 & -1 \end{bmatrix}
\]

9.62. Find the characteristic and minimal polynomials of each of the following matrices:
\[
\text{(a) } A = \begin{bmatrix}
2 & 5 & 0 & 0 & 0 \\
0 & 2 & 0 & 0 & 0 \\
0 & 0 & 4 & 2 & 0 \\
0 & 0 & 3 & 5 & 0 \\
0 & 0 & 0 & 0 & 7
\end{bmatrix}, \quad
\text{(b) } B = \begin{bmatrix}
4 & -1 & 0 & 0 & 0 \\
1 & 2 & 0 & 0 & 0 \\
0 & 0 & 3 & 1 & 0 \\
0 & 0 & 0 & 3 & 1 \\
0 & 0 & 0 & 0 & 3
\end{bmatrix}, \quad
\text{(c) } C = \begin{bmatrix}
3 & 2 & 0 & 0 & 0 \\
1 & 4 & 0 & 0 & 0 \\
0 & 0 & 3 & 1 & 0 \\
0 & 0 & 1 & 3 & 0 \\
0 & 0 & 0 & 0 & 4
\end{bmatrix}
\]

9.63. Let $A = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{bmatrix}$ and $B = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 2 \\ 0 & 0 & 1 \end{bmatrix}$. Show that $A$ and $B$ have different characteristic polynomials (and so are not similar) but have the same minimal polynomial. Thus, nonsimilar matrices may have the same minimal polynomial.

9.64. Let $A$ be an $n$-square matrix for which $A^k = 0$ for some $k > n$. Show that $A^n = 0$.

9.65. Show that a matrix $A$ and its transpose $A^T$ have the same minimal polynomial.

9.66. Suppose $f(t)$ is an irreducible monic polynomial for which $f(A) = 0$ for a matrix $A$. Show that $f(t)$ is the minimal polynomial of $A$.

9.67. Show that $A$ is a scalar matrix $kI$ if and only if the minimal polynomial of $A$ is $m(t) = t - k$.

9.68. Find a matrix $A$ whose minimal polynomial is (a) $t^3 - 5t^2 + 6t + 8$, (b) $t^4 - 5t^3 - 2t^2 + 7t + 4$.

9.69. Let $f(t)$ and $g(t)$ be monic polynomials (leading coefficient one) of minimal degree for which $A$ is a root. Show $f(t) = g(t)$. [Thus, the minimal polynomial of $A$ is unique.]

ANSWERS TO SUPPLEMENTARY PROBLEMS
Notation: $M = [R_1;\ R_2;\ \ldots]$ denotes a matrix $M$ with rows $R_1, R_2, \ldots$.

9.38. $f(A) = [-26, -3;\ 5, -27]$, $g(A) = [-40, 39;\ -65, -27]$, $f(B) = [3, 6;\ 0, 9]$, $g(B) = [3, 12;\ 0, 15]$

9.39. $A^2 = [1, 4;\ 0, 1]$, $A^3 = [1, 6;\ 0, 1]$, $A^n = [1, 2n;\ 0, 1]$, $A^{-1} = [1, -2;\ 0, 1]$

9.40. Let $A = [2, a, b;\ 0, 2, c;\ 0, 0, 2]$. Set $B = A^3$ and then $a = 1$, $b = -\tfrac{1}{2}$, $c = 1$

9.41. Find $\Delta(t)$: (a) $t^2 + t - 11$, (b) $t^2 + 2t + 13$, (c) $t^3 - 7t^2 + 6t - 1$

9.45. (a) $\lambda = 1$, $u = (3, 1)$; $\lambda = -4$, $v = (1, 2)$, (b) $\lambda = 4$, $u = (2, 1)$, (c) $\lambda = -1$, $u = (2, 1)$; $\lambda = -5$, $v = (2, 3)$. Only $A$ and $C$ can be diagonalized; use $P = [u, v]$.

9.46. (a) $\lambda = 1$, $u = (1, 1)$; $\lambda = 4$, $v = (1, -2)$, (b) $P = [u, v]$, (c) $f(A) = [19, -13;\ -26, 32]$; $A^8 = [21\,846, -21\,845;\ -43\,690, 43\,691]$, (d) $B = \left[\tfrac{4}{3}, -\tfrac{1}{3};\ -\tfrac{2}{3}, \tfrac{5}{3}\right]$

9.47. (a) $\lambda = 1$, $u = (3, -2)$; $\lambda = 2$, $v = (2, -1)$, (b) $P = [u, v]$, (c) $f(A) = [2, -6;\ 2, 9]$; $A^8 = [1021, 1530;\ -510, -764]$, (d) $B = [-3 + 4\sqrt{2},\ -6 + 6\sqrt{2};\ 2 - 2\sqrt{2},\ 4 - 3\sqrt{2}]$

9.48. (a) $\lambda = -2$, $u = (1, 1, 0)$, $v = (1, 0, -1)$; $\lambda = 4$, $w = (1, 1, 2)$, (b) $\lambda = 2$, $u = (1, 1, 0)$; $\lambda = -4$, $v = (0, 1, 1)$, (c) $\lambda = 3$, $u = (1, 1, 0)$, $v = (1, 0, 1)$; $\lambda = 1$, $w = (2, -1, 1)$. Only $A$ and $C$ can be diagonalized; use $P = [u, v, w]$.

9.49. (a) $\lambda = 2$, $u = (3, -1)$; $\lambda = 6$, $v = (1, 1)$, (b) No real eigenvalues

9.50. We need $[-\operatorname{tr}(A)]^2 - 4[\det(A)] > 0$, that is, $(a - d)^2 + 4bc > 0$, or else $A$ must already be a scalar matrix ($b = c = 0$ and $a = d$).

9.51. $A = [1, 1;\ 0, 1]$

9.56. (a) $P = [2, -1;\ 1, 2]/\sqrt{5}$, $D = [7, 0;\ 0, -3]$, (b) $P = [1, 1;\ 1, -1]/\sqrt{2}$, $D = [3, 0;\ 0, 5]$, (c) $P = [3, -1;\ 1, 3]/\sqrt{10}$, $D = [8, 0;\ 0, -2]$

9.57. (a) $\lambda = -1$, $u = (1, -1, 0)$, $v = (1, 1, -2)$; $\lambda = 2$, $w = (1, 1, 1)$, (b) $\lambda = 1$, $u = (2, 1, -1)$, $v = (2, -3, 1)$; $\lambda = 22$, $w = (1, 2, 4)$. Normalize $u, v, w$, obtaining $\hat{u}, \hat{v}, \hat{w}$, and set $P = [\hat{u}, \hat{v}, \hat{w}]$. (Remark: $u$ and $v$ are not unique.)

9.58. (a) $x = (4s - t)/\sqrt{17}$, $y = (s + 4t)/\sqrt{17}$, $q(s, t) = 5s^2 - 12t^2$, (b) $x = (3s - t)/\sqrt{10}$, $y = (s + 3t)/\sqrt{10}$, $q(s, t) = s^2 + 11t^2$

9.59. (a) $x = (3s + 2t)/\sqrt{13}$, $y = r$, $z = (2s - 3t)/\sqrt{13}$; $q(r, s, t) = 3r^2 + 9s^2 - 4t^2$, (b) $x = 5Ks + Lt$, $y = Jr + 2Ks - 2Lt$, $z = 2Jr - Ks + Lt$, where $J = 1/\sqrt{5}$, $K = 1/\sqrt{30}$, $L = 1/\sqrt{6}$; $q(r, s, t) = 2r^2 + 2s^2 + 8t^2$

9.60. (a) $A = \tfrac{1}{2}[5, -3;\ -3, 5]$, $B = \tfrac{1}{2}[3, -1;\ -1, 3]$, (b) $A = \tfrac{1}{5}[14, -2;\ -2, 11]$, $B = \tfrac{1}{5}[\sqrt{2} + 4\sqrt{3},\ 2\sqrt{2} - 2\sqrt{3};\ 2\sqrt{2} - 2\sqrt{3},\ 4\sqrt{2} + \sqrt{3}]$

9.61. (a) $\Delta(t) = m(t) = (t - 2)^2(t - 6)$, (b) $\Delta(t) = (t - 2)^2(t - 6)$, $m(t) = (t - 2)(t - 6)$

9.62. (a) $\Delta(t) = (t - 2)^3(t - 7)^2$, $m(t) = (t - 2)^2(t - 7)$, (b) $\Delta(t) = (t - 3)^5$, $m(t) = (t - 3)^3$, (c) $\Delta(t) = (t - 2)^2(t - 4)^2(t - 5)$, $m(t) = (t - 2)(t - 4)(t - 5)$

9.68. Let $A$ be the companion matrix [Example 9.12(b)] with last column: (a) $[-8, -6, 5]^T$, (b) $[-4, -7, 2, 5]^T$

9.69. Hint: $A$ is a root of $h(t) = f(t) - g(t)$, where $h(t) \equiv 0$ or the degree of $h(t)$ is less than the degree of $f(t)$.

CHAPTER 10

Canonical Forms
10.1 Introduction
Let $T$ be a linear operator on a vector space of finite dimension. As seen in Chapter 6, $T$ may not have a diagonal matrix representation. However, it is still possible to "simplify" the matrix representation of $T$ in a number of ways. This is the main topic of this chapter. In particular, we obtain the primary decomposition theorem, and the triangular, Jordan, and rational canonical forms.

We comment that the triangular and Jordan canonical forms exist for $T$ if and only if the characteristic polynomial $\Delta(t)$ of $T$ has all its roots in the base field $K$. This is always true if $K$ is the complex field $\mathbf{C}$ but may not be true if $K$ is the real field $\mathbf{R}$.

We also introduce the idea of a quotient space. This is a very powerful tool, and it will be used in the proof of the existence of the triangular and rational canonical forms.

10.2 Triangular Form

Let $T$ be a linear operator on an $n$-dimensional vector space $V$. Suppose $T$ can be represented by the triangular matrix
\[
A = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
 & a_{22} & \cdots & a_{2n} \\
 & & \ddots & \vdots \\
 & & & a_{nn}
\end{bmatrix}
\]
Then the characteristic polynomial $\Delta(t)$ of $T$ is a product of linear factors; that is,
\[
\Delta(t) = \det(tI - A) = (t - a_{11})(t - a_{22}) \cdots (t - a_{nn})
\]
The converse is also true and is an important theorem (proved in Problem 10.28).

THEOREM 10.1: Let $T: V \to V$ be a linear operator whose characteristic polynomial factors into linear polynomials. Then there exists a basis of $V$ in which $T$ is represented by a triangular matrix.

THEOREM 10.1: (Alternative Form) Let $A$ be a square matrix whose characteristic polynomial factors into linear polynomials. Then $A$ is similar to a triangular matrix; that is, there exists an invertible matrix $P$ such that $P^{-1}AP$ is triangular.

We say that an operator $T$ can be brought into triangular form if it can be represented by a triangular matrix. Note that in this case, the eigenvalues of $T$ are precisely those entries appearing on the main diagonal. We give an application of this remark.


EXAMPLE 10.1 Let $A$ be a square matrix over the complex field $\mathbf{C}$. Suppose $\lambda$ is an eigenvalue of $A^2$. Show that $\sqrt{\lambda}$ or $-\sqrt{\lambda}$ is an eigenvalue of $A$.

By Theorem 10.1, $A$ and $A^2$ are similar, respectively, to triangular matrices of the form
\[
B = \begin{bmatrix}
\mu_1 & * & \cdots & * \\
 & \mu_2 & \cdots & * \\
 & & \ddots & \vdots \\
 & & & \mu_n
\end{bmatrix}
\quad\text{and}\quad
B^2 = \begin{bmatrix}
\mu_1^2 & * & \cdots & * \\
 & \mu_2^2 & \cdots & * \\
 & & \ddots & \vdots \\
 & & & \mu_n^2
\end{bmatrix}
\]
Because similar matrices have the same eigenvalues, $\lambda = \mu_i^2$ for some $i$. Hence, $\mu_i = \sqrt{\lambda}$ or $\mu_i = -\sqrt{\lambda}$ is an eigenvalue of $A$.

10.3 Invariance

Let $T: V \to V$ be linear. A subspace $W$ of $V$ is said to be invariant under $T$ or $T$-invariant if $T$ maps $W$ into itself; that is, if $v \in W$ implies $T(v) \in W$. In this case, $T$ restricted to $W$ defines a linear operator on $W$; that is, $T$ induces a linear operator $\hat{T}: W \to W$ defined by $\hat{T}(w) = T(w)$ for every $w \in W$.
EXAMPLE 10.2

(a) Let $T: \mathbf{R}^3 \to \mathbf{R}^3$ be the following linear operator, which rotates each vector $v$ about the $z$-axis by an angle $\theta$ (shown in Fig. 10-1):
\[
T(x, y, z) = (x\cos\theta - y\sin\theta,\ x\sin\theta + y\cos\theta,\ z)
\]
[Figure 10-1: the vectors $v$ and $w$, their images $T(v)$ and $T(w)$ under rotation by $\theta$, the $xy$-plane $W$, and the $z$-axis $U$]

Observe that each vector $w = (a, b, 0)$ in the $xy$-plane $W$ remains in $W$ under the mapping $T$; hence, $W$ is $T$-invariant. Observe also that the $z$-axis $U$ is invariant under $T$. Furthermore, the restriction of $T$ to $W$ rotates each vector about the origin $O$, and the restriction of $T$ to $U$ is the identity mapping of $U$.

(b) Nonzero eigenvectors of a linear operator $T: V \to V$ may be characterized as generators of $T$-invariant one-dimensional subspaces. Suppose $T(v) = \lambda v$, $v \neq 0$. Then $W = \{kv : k \in K\}$, the one-dimensional subspace generated by $v$, is invariant under $T$ because
\[
T(kv) = kT(v) = k(\lambda v) = k\lambda v \in W
\]
Conversely, suppose $\dim U = 1$ and $u \neq 0$ spans $U$, and $U$ is invariant under $T$. Then $T(u) \in U$ and so $T(u)$ is a multiple of $u$; that is, $T(u) = \mu u$. Hence, $u$ is an eigenvector of $T$.
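Part (a) of this example can be illustrated numerically. The following is a minimal sketch in plain Python; the angle $\theta = \pi/3$ and the sample vectors are arbitrary choices made for illustration, and the function name `rotate_about_z` is likewise an assumption. It shows that a vector in the $xy$-plane $W$ keeps a zero third coordinate, and that a vector on the $z$-axis $U$ is left fixed.

```python
import math

def rotate_about_z(v, theta):
    """Apply T(x, y, z) = (x cos t - y sin t, x sin t + y cos t, z)."""
    x, y, z = v
    return (x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta),
            z)

def length(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(c * c for c in v))

theta = math.pi / 3       # arbitrary illustrative angle
w = (2.0, -1.0, 0.0)      # a vector in the xy-plane W
u = (0.0, 0.0, 5.0)       # a vector on the z-axis U

Tw = rotate_about_z(w, theta)   # third coordinate stays 0, so Tw is in W
Tu = rotate_about_z(u, theta)   # equals u: the restriction to U is the identity
print(Tw, Tu)
```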

The next theorem (proved in Problem 10.3) gives us an important class of invariant subspaces.

THEOREM 10.2: Let $T: V \to V$ be any linear operator, and let $f(t)$ be any polynomial. Then the kernel of $f(T)$ is invariant under $T$.

The notion of invariance is related to matrix representations (Problem 10.5) as follows.

THEOREM 10.3: Suppose $W$ is an invariant subspace of $T: V \to V$. Then $T$ has a block matrix representation $\begin{bmatrix} A & B \\ 0 & C \end{bmatrix}$, where $A$ is a matrix representation of the restriction $\hat{T}$ of $T$ to $W$.

10.4 Invariant Direct-Sum Decompositions

A vector space $V$ is termed the direct sum of subspaces $W_1, \ldots, W_r$, written

$$V = W_1 \oplus W_2 \oplus \cdots \oplus W_r$$

if every vector $v \in V$ can be written uniquely in the form

$$v = w_1 + w_2 + \cdots + w_r, \quad \text{with } w_i \in W_i$$

The following theorem (proved in Problem 10.7) holds.
THEOREM 10.4:

Suppose $W_1, W_2, \ldots, W_r$ are subspaces of $V$, and suppose

$$B_1 = \{w_{11}, w_{12}, \ldots, w_{1n_1}\}, \quad \ldots, \quad B_r = \{w_{r1}, w_{r2}, \ldots, w_{rn_r}\}$$

are bases of $W_1, W_2, \ldots, W_r$, respectively. Then $V$ is the direct sum of the $W_i$ if and only if the union $B = B_1 \cup \cdots \cup B_r$ is a basis of $V$.

Now suppose $T: V \to V$ is linear and $V$ is the direct sum of (nonzero) $T$-invariant subspaces $W_1, W_2, \ldots, W_r$; that is,

$$V = W_1 \oplus \cdots \oplus W_r \quad \text{and} \quad T(W_i) \subseteq W_i, \quad i = 1, \ldots, r$$

Let $T_i$ denote the restriction of $T$ to $W_i$. Then $T$ is said to be decomposable into the operators $T_i$, or $T$ is said to be the direct sum of the $T_i$, written $T = T_1 \oplus \cdots \oplus T_r$. Also, the subspaces $W_1, \ldots, W_r$ are said to reduce $T$ or to form a $T$-invariant direct-sum decomposition of $V$.

Consider the special case where two subspaces $U$ and $W$ reduce an operator $T: V \to V$; say $\dim U = 2$ and $\dim W = 3$, and suppose $\{u_1, u_2\}$ and $\{w_1, w_2, w_3\}$ are bases of $U$ and $W$, respectively. If $T_1$ and $T_2$ denote the restrictions of $T$ to $U$ and $W$, respectively, then

$$\begin{aligned}
T_1(u_1) &= a_{11}u_1 + a_{12}u_2 \\
T_1(u_2) &= a_{21}u_1 + a_{22}u_2
\end{aligned}
\qquad
\begin{aligned}
T_2(w_1) &= b_{11}w_1 + b_{12}w_2 + b_{13}w_3 \\
T_2(w_2) &= b_{21}w_1 + b_{22}w_2 + b_{23}w_3 \\
T_2(w_3) &= b_{31}w_1 + b_{32}w_2 + b_{33}w_3
\end{aligned}$$

Accordingly, the following matrices $A$, $B$, $M$ are the matrix representations of $T_1$, $T_2$, $T$, respectively:

$$A = \begin{bmatrix} a_{11} & a_{21} \\ a_{12} & a_{22} \end{bmatrix}, \quad
B = \begin{bmatrix} b_{11} & b_{21} & b_{31} \\ b_{12} & b_{22} & b_{32} \\ b_{13} & b_{23} & b_{33} \end{bmatrix}, \quad
M = \begin{bmatrix} A & 0 \\ 0 & B \end{bmatrix}$$

The block diagonal matrix $M$ results from the fact that $\{u_1, u_2, w_1, w_2, w_3\}$ is a basis of $V$ (Theorem 10.4), and that $T(u_i) = T_1(u_i)$ and $T(w_j) = T_2(w_j)$.

A generalization of the above argument gives us the following theorem.
THEOREM 10.5:

Suppose $T: V \to V$ is linear and suppose $V$ is the direct sum of $T$-invariant subspaces, say, $W_1, \ldots, W_r$. If $A_i$ is a matrix representation of the restriction of $T$ to $W_i$, then $T$ can be represented by the block diagonal matrix

$$M = \operatorname{diag}(A_1, A_2, \ldots, A_r)$$
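As a small illustration of the block diagonal representation, the sketch below assembles $M = \operatorname{diag}(A, B)$ from a 2-square and a 3-square block and checks that coordinate vectors supported on the first two (respectively last three) positions are mapped to vectors of the same kind. The helper names `block_diag` and `apply` and the numeric entries are hypothetical, chosen only for the example.

```python
def block_diag(*blocks):
    """Assemble square blocks into one block diagonal matrix (list of rows)."""
    n = sum(len(b) for b in blocks)
    M = [[0] * n for _ in range(n)]
    offset = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, entry in enumerate(row):
                M[offset + i][offset + j] = entry
        offset += len(b)
    return M

def apply(M, v):
    """Matrix-vector product M v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

A = [[1, 2], [3, 4]]                    # illustrative matrix of T restricted to U
B = [[0, 1, 0], [0, 0, 1], [5, 6, 7]]   # illustrative matrix of T restricted to W
M = block_diag(A, B)

u_image = apply(M, [1, -1, 0, 0, 0])    # coordinates of a vector in U
w_image = apply(M, [0, 0, 2, 0, 1])     # coordinates of a vector in W
```

The image of the $U$-vector again has zeros in its last three coordinates, and the image of the $W$-vector has zeros in its first two, which is the invariance of $U$ and $W$ read off in coordinates.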

10.5 Primary Decomposition

The following theorem shows that any operator $T: V \to V$ is decomposable into operators whose minimum polynomials are powers of irreducible polynomials. This is the first step in obtaining a canonical form for $T$.
THEOREM 10.6:

(Primary Decomposition Theorem) Let $T: V \to V$ be a linear operator with minimal polynomial

$$m(t) = f_1(t)^{n_1} f_2(t)^{n_2} \cdots f_r(t)^{n_r}$$

where the $f_i(t)$ are distinct monic irreducible polynomials. Then $V$ is the direct sum of $T$-invariant subspaces $W_1, \ldots, W_r$, where $W_i$ is the kernel of $f_i(T)^{n_i}$. Moreover, $f_i(t)^{n_i}$ is the minimal polynomial of the restriction of $T$ to $W_i$.

The above polynomials $f_i(t)^{n_i}$ are relatively prime. Therefore, the above fundamental theorem follows (Problem 10.11) from the next two theorems (proved in Problems 10.9 and 10.10, respectively).
THEOREM 10.7:

Suppose $T: V \to V$ is linear, and suppose $f(t) = g(t)h(t)$ are polynomials such that $f(T) = 0$ and $g(t)$ and $h(t)$ are relatively prime. Then $V$ is the direct sum of the $T$-invariant subspaces $U$ and $W$, where $U = \operatorname{Ker} g(T)$ and $W = \operatorname{Ker} h(T)$.

THEOREM 10.8:

In Theorem 10.7, if $f(t)$ is the minimal polynomial of $T$ [and $g(t)$ and $h(t)$ are monic], then $g(t)$ and $h(t)$ are the minimal polynomials of the restrictions of $T$ to $U$ and $W$, respectively.

We will also use the primary decomposition theorem to prove the following useful characterization of diagonalizable operators (see Problem 10.12 for the proof).
THEOREM 10.9:

A linear operator $T: V \to V$ is diagonalizable if and only if its minimal polynomial $m(t)$ is a product of distinct linear polynomials.

THEOREM 10.9:

(Alternative Form) A matrix $A$ is similar to a diagonal matrix if and only if its minimal polynomial is a product of distinct linear polynomials.

EXAMPLE 10.3 Suppose $A \neq I$ is a square matrix for which $A^3 = I$. Determine whether or not $A$ is similar to a diagonal matrix if $A$ is a matrix over: (i) the real field $\mathbf{R}$, (ii) the complex field $\mathbf{C}$.

Because $A^3 = I$, $A$ is a zero of the polynomial $f(t) = t^3 - 1 = (t - 1)(t^2 + t + 1)$. The minimal polynomial $m(t)$ of $A$ cannot be $t - 1$, because $A \neq I$. Hence,

$$m(t) = t^2 + t + 1 \quad \text{or} \quad m(t) = t^3 - 1$$

Because neither polynomial is a product of linear polynomials over $\mathbf{R}$, $A$ is not diagonalizable over $\mathbf{R}$. On the other hand, each of the polynomials is a product of distinct linear polynomials over $\mathbf{C}$. Hence, $A$ is diagonalizable over $\mathbf{C}$.
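A concrete instance may help. The sketch below uses one particular matrix with $A^3 = I$ and $A \neq I$, namely the companion matrix of $t^2 + t + 1$; this choice of $A$ is an assumption made for illustration and is not part of the example. It checks that $A$ is a zero of $t^2 + t + 1$ and that this polynomial has two distinct nonreal roots.

```python
import cmath

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][t] * Y[t][j] for t in range(n)) for j in range(n)] for i in range(n)]

A = [[0, -1], [1, -1]]   # illustrative choice: companion matrix of t^2 + t + 1
I2 = [[1, 0], [0, 1]]

A2 = matmul(A, A)
A3 = matmul(A2, A)

# m(A) = A^2 + A + I for m(t) = t^2 + t + 1
m_of_A = [[A2[i][j] + A[i][j] + I2[i][j] for j in range(2)] for i in range(2)]

# Roots of t^2 + t + 1: the discriminant is negative, so there are no real roots,
# but there are two distinct complex roots.
discriminant = 1 ** 2 - 4 * 1 * 1
roots = [(-1 + cmath.sqrt(discriminant)) / 2, (-1 - cmath.sqrt(discriminant)) / 2]
```

For this $A$ the minimal polynomial is $t^2 + t + 1$, which has no linear factors over $\mathbf{R}$ but splits into distinct linear factors over $\mathbf{C}$, in line with the conclusion of the example.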

10.6 Nilpotent Operators

A linear operator $T: V \to V$ is termed nilpotent if $T^n = 0$ for some positive integer $n$; we call $k$ the index of nilpotency of $T$ if $T^k = 0$ but $T^{k-1} \neq 0$. Analogously, a square matrix $A$ is termed nilpotent if $A^n = 0$ for some positive integer $n$, and of index $k$ if $A^k = 0$ but $A^{k-1} \neq 0$. Clearly the minimum polynomial of a nilpotent operator (matrix) of index $k$ is $m(t) = t^k$; hence, 0 is its only eigenvalue.
EXAMPLE 10.4 The following two $r$-square matrices will be used throughout the chapter:

$$N = N(r) = \begin{bmatrix}
0 & 1 & 0 & \cdots & 0 & 0 \\
0 & 0 & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 0 & 1 \\
0 & 0 & 0 & \cdots & 0 & 0
\end{bmatrix}
\quad \text{and} \quad
J(\lambda) = \begin{bmatrix}
\lambda & 1 & 0 & \cdots & 0 & 0 \\
0 & \lambda & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
0 & 0 & 0 & \cdots & \lambda & 1 \\
0 & 0 & 0 & \cdots & 0 & \lambda
\end{bmatrix}$$

The first matrix $N$, called a Jordan nilpotent block, consists of 1's above the diagonal (called the superdiagonal), and 0's elsewhere. It is a nilpotent matrix of index $r$. (The matrix $N$ of order 1 is just the $1 \times 1$ zero matrix $[0]$.) The second matrix $J(\lambda)$, called a Jordan block belonging to the eigenvalue $\lambda$, consists of $\lambda$'s on the diagonal, 1's on the superdiagonal, and 0's elsewhere. Observe that

$$J(\lambda) = \lambda I + N$$

In fact, we will prove that any linear operator $T$ can be decomposed into operators, each of which is the sum of a scalar operator and a nilpotent operator.
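The two matrices of this example are easy to generate and test. The following is a minimal sketch; the function names `nilpotent_block`, `jordan_block`, and `nilpotency_index` are hypothetical helpers introduced here, not notation from the text.

```python
def nilpotent_block(r):
    """r-square matrix N(r): 1's on the superdiagonal, 0's elsewhere."""
    return [[1 if j == i + 1 else 0 for j in range(r)] for i in range(r)]

def jordan_block(lam, r):
    """J(lam) = lam*I + N(r)."""
    N = nilpotent_block(r)
    return [[lam + N[i][j] if i == j else N[i][j] for j in range(r)] for i in range(r)]

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][t] * Y[t][j] for t in range(n)) for j in range(n)] for i in range(n)]

def nilpotency_index(M):
    """Smallest k >= 1 with M^k = 0, or None if no power up to the matrix size vanishes."""
    power = M
    for k in range(1, len(M) + 1):
        if all(entry == 0 for row in power for entry in row):
            return k
        power = matmul(power, M)
    return None

N4 = nilpotent_block(4)
J = jordan_block(5, 3)
```

With these definitions, `nilpotency_index(N4)` confirms that $N(4)$ has index 4, and subtracting $5I$ from `J` recovers $N(3)$.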

The following (proved in Problem 10.16) is a fundamental result on nilpotent operators.
THEOREM 10.10:

Let $T: V \to V$ be a nilpotent operator of index $k$. Then $T$ has a block diagonal matrix representation in which each diagonal entry is a Jordan nilpotent block $N$. There is at least one $N$ of order $k$, and all other $N$ are of orders $\le k$. The number of $N$ of each possible order is uniquely determined by $T$. The total number of $N$ of all orders is equal to the nullity of $T$.

The proof of Theorem 10.10 shows that the number of $N$ of order $i$ is equal to $2m_i - m_{i+1} - m_{i-1}$, where $m_i$ is the nullity of $T^i$.
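The counting formula can be turned into a short computation. The sketch below is one possible implementation under stated assumptions: ranks are found by exact Gaussian elimination over the rationals, $m_0 = 0$ is the nullity of $T^0 = I$, and the test matrix is an illustrative nilpotent matrix already in canonical form with one block of order 3 and one of order 2. The names `rank` and `nilpotent_block_counts` are hypothetical.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][t] * Y[t][j] for t in range(n)) for j in range(n)] for i in range(n)]

def rank(M):
    """Rank by Gaussian elimination over the rationals (exact arithmetic)."""
    rows = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def nilpotent_block_counts(T):
    """Map each order i to the number of Jordan nilpotent blocks of order i."""
    n = len(T)
    nullities = [0]                      # m_0 = nullity of T^0 = I
    power = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(n):
        power = matmul(power, T)
        nullities.append(n - rank(power))
        if nullities[-1] == n:           # T^k = 0 reached
            break
    else:
        raise ValueError("matrix is not nilpotent")
    k = len(nullities) - 1               # index of nilpotency
    nullities.append(n)                  # m_{k+1} = n because T^{k+1} = 0
    return {i: 2 * nullities[i] - nullities[i + 1] - nullities[i - 1]
            for i in range(1, k + 1)}

# Illustrative matrix: diag(N(3), N(2))
T = [[0, 1, 0, 0, 0],
     [0, 0, 1, 0, 0],
     [0, 0, 0, 0, 0],
     [0, 0, 0, 0, 1],
     [0, 0, 0, 0, 0]]
counts = nilpotent_block_counts(T)
```

For this matrix the nullities are $m_1 = 2$, $m_2 = 4$, $m_3 = 5$, so the formula gives no blocks of order 1, one of order 2, and one of order 3; the total, 2, equals the nullity of $T$.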

10.7 Jordan Canonical Form

An operator T can be put into Jordan canonical form if its characteristic and minimal polynomials factor into linear polynomials. This is always true if K is the complex field C. In any case, we can always extend the base field K to a field in which the characteristic and minimal polynomials do factor into linear factors; thus, in a broad sense, every operator has a Jordan canonical form. Analogously, every matrix is similar to a matrix in Jordan canonical form. The following theorem (proved in Problem 10.18) describes the Jordan canonical form J of a linear operator T.
THEOREM 10.11:

Let $T: V \to V$ be a linear operator whose characteristic and minimal polynomials are, respectively,

$$\Delta(t) = (t - \lambda_1)^{n_1} \cdots (t - \lambda_r)^{n_r} \quad \text{and} \quad m(t) = (t - \lambda_1)^{m_1} \cdots (t - \lambda_r)^{m_r}$$

where the $\lambda_i$ are distinct scalars. Then $T$ has a block diagonal matrix representation $J$ in which each diagonal entry is a Jordan block $J_{ij} = J(\lambda_i)$. For each $\lambda_i$, the corresponding $J_{ij}$ have the following properties:
(i) There is at least one $J_{ij}$ of order $m_i$; all other $J_{ij}$ are of order $\le m_i$.
(ii) The sum of the orders of the $J_{ij}$ is $n_i$.
(iii) The number of $J_{ij}$ equals the geometric multiplicity of $\lambda_i$.
(iv) The number of $J_{ij}$ of each possible order is uniquely determined by $T$.
EXAMPLE 10.5 Suppose the characteristic and minimal polynomials of an operator $T$ are, respectively,

$$\Delta(t) = (t - 2)^4 (t - 5)^3 \quad \text{and} \quad m(t) = (t - 2)^2 (t - 5)^3$$

Then the Jordan canonical form of $T$ is one of the following block diagonal matrices:

$$\operatorname{diag}\left( \begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix},\ \begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix},\ \begin{bmatrix} 5 & 1 & 0 \\ 0 & 5 & 1 \\ 0 & 0 & 5 \end{bmatrix} \right)
\quad \text{or} \quad
\operatorname{diag}\left( \begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix},\ [2],\ [2],\ \begin{bmatrix} 5 & 1 & 0 \\ 0 & 5 & 1 \\ 0 & 0 & 5 \end{bmatrix} \right)$$

The first matrix occurs if $T$ has two independent eigenvectors belonging to the eigenvalue 2; and the second matrix occurs if $T$ has three independent eigenvectors belonging to the eigenvalue 2.
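Properties (i) and (ii) of Theorem 10.11 reduce the search for possible Jordan forms to a counting problem: for each eigenvalue, list the non-increasing block orders that sum to the algebraic multiplicity and whose largest entry is the exponent in the minimal polynomial. A minimal sketch follows; the function name `block_orders` is a hypothetical helper.

```python
def block_orders(n, m):
    """All non-increasing lists of positive integers with sum n and largest entry exactly m.

    n: exponent of (t - lambda) in the characteristic polynomial.
    m: exponent of (t - lambda) in the minimal polynomial.
    """
    def parts(remaining, largest):
        # Generate non-increasing lists summing to `remaining` with entries at most `largest`.
        if remaining == 0:
            yield []
            return
        for p in range(min(remaining, largest), 0, -1):
            for rest in parts(remaining - p, p):
                yield [p] + rest
    return [[m] + rest for rest in parts(n - m, m)]

orders_for_2 = block_orders(4, 2)   # eigenvalue 2: (t - 2)^4 and (t - 2)^2
orders_for_5 = block_orders(3, 3)   # eigenvalue 5: (t - 5)^3 and (t - 5)^3
```

For the eigenvalue 2 this yields the block orders `[2, 2]` and `[2, 1, 1]`, and for the eigenvalue 5 only `[3]`, matching the two matrices above. The length of each list is the number of independent eigenvectors for that eigenvalue (property (iii)).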

10.8 Cyclic Subspaces

Let $T$ be a linear operator on a vector space $V$ of finite dimension over $K$. Suppose $v \in V$ and $v \neq 0$. The set of all vectors of the form $f(T)(v)$, where $f(t)$ ranges over all polynomials over $K$, is a $T$-invariant subspace of $V$ called the $T$-cyclic subspace of $V$ generated by $v$; we denote it by $Z(v, T)$ and denote the restriction of $T$ to $Z(v, T)$ by $T_v$. By Problem 10.56, we could equivalently define $Z(v, T)$ as the intersection of all $T$-invariant subspaces of $V$ containing $v$.

Now consider the sequence

$$v, \ T(v), \ T^2(v), \ T^3(v), \ \ldots$$

of powers of $T$ acting on $v$. Let $k$ be the least integer such that $T^k(v)$ is a linear combination of those vectors that precede it in the sequence, say,

$$T^k(v) = -a_{k-1}T^{k-1}(v) - \cdots - a_1 T(v) - a_0 v$$

Then

$$m_v(t) = t^k + a_{k-1}t^{k-1} + \cdots + a_1 t + a_0$$

is the unique monic polynomial of lowest degree for which $m_v(T)(v) = 0$. We call $m_v(t)$ the $T$-annihilator of $v$ and $Z(v, T)$. The following theorem (proved in Problem 10.29) holds.
THEOREM 10.12:

Let $Z(v, T)$, $T_v$, $m_v(t)$ be defined as above. Then
(i) The set $\{v, T(v), \ldots, T^{k-1}(v)\}$ is a basis of $Z(v, T)$; hence, $\dim Z(v, T) = k$.
(ii) The minimal polynomial of $T_v$ is $m_v(t)$.
(iii) The matrix representation of $T_v$ in the above basis is just the companion matrix $C(m_v)$ of $m_v(t)$; that is,

$$C(m_v) = \begin{bmatrix}
0 & 0 & 0 & \cdots & 0 & -a_0 \\
1 & 0 & 0 & \cdots & 0 & -a_1 \\
0 & 1 & 0 & \cdots & 0 & -a_2 \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 0 & -a_{k-2} \\
0 & 0 & 0 & \cdots & 1 & -a_{k-1}
\end{bmatrix}$$
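The companion matrix in part (iii) can be built directly from the coefficients $a_0, \ldots, a_{k-1}$. The sketch below constructs it and checks, by Horner's rule, that the matrix is a zero of its own polynomial. The function names `companion` and `evaluate_monic` and the sample polynomial $t^2 - 6t + 9$ are illustrative assumptions.

```python
def companion(coeffs):
    """Companion matrix of t^k + a_{k-1} t^{k-1} + ... + a_1 t + a_0.

    coeffs lists the non-leading coefficients in ascending order: [a_0, a_1, ..., a_{k-1}].
    """
    k = len(coeffs)
    C = [[0] * k for _ in range(k)]
    for i in range(1, k):
        C[i][i - 1] = 1            # 1's on the subdiagonal
    for i in range(k):
        C[i][k - 1] = -coeffs[i]   # last column: -a_0, ..., -a_{k-1}
    return C

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][t] * Y[t][j] for t in range(n)) for j in range(n)] for i in range(n)]

def evaluate_monic(coeffs, C):
    """Evaluate t^k + a_{k-1} t^{k-1} + ... + a_0 at the matrix C by Horner's rule."""
    k = len(C)
    result = [[0] * k for _ in range(k)]
    for a in [1] + coeffs[::-1]:   # leading coefficient first
        result = matmul(result, C)
        for i in range(k):
            result[i][i] += a
    return result

C = companion([9, -6])             # t^2 - 6t + 9 = (t - 3)^2
value = evaluate_monic([9, -6], C)
```

Here `C` is the matrix with rows `[0, -9]` and `[1, 6]`, and `value` is the zero matrix.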

10.9 Rational Canonical Form

In this section, we present the rational canonical form for a linear operator $T: V \to V$. We emphasize that this form exists even when the minimal polynomial cannot be factored into linear polynomials. (Recall that this is not the case for the Jordan canonical form.)

LEMMA 10.13:

Let $T: V \to V$ be a linear operator whose minimal polynomial is $f(t)^n$, where $f(t)$ is a monic irreducible polynomial. Then $V$ is the direct sum

$$V = Z(v_1, T) \oplus \cdots \oplus Z(v_r, T)$$

of $T$-cyclic subspaces $Z(v_i, T)$ with corresponding $T$-annihilators

$$f(t)^{n_1}, \ f(t)^{n_2}, \ \ldots, \ f(t)^{n_r}, \quad n = n_1 \ge n_2 \ge \cdots \ge n_r$$

Any other decomposition of $V$ into $T$-cyclic subspaces has the same number of components and the same set of $T$-annihilators.

We emphasize that the above lemma (proved in Problem 10.31) does not say that the vectors $v_i$ or other $T$-cyclic subspaces $Z(v_i, T)$ are uniquely determined by $T$, but it does say that the set of $T$-annihilators is uniquely determined by $T$. Thus, $T$ has a unique block diagonal matrix representation

$$M = \operatorname{diag}(C_1, C_2, \ldots, C_r)$$

where the $C_i$ are companion matrices. In fact, the $C_i$ are the companion matrices of the polynomials $f(t)^{n_i}$.

Using the Primary Decomposition Theorem and Lemma 10.13, we obtain the following result.
THEOREM 10.14:

Let $T: V \to V$ be a linear operator with minimal polynomial

$$m(t) = f_1(t)^{m_1} f_2(t)^{m_2} \cdots f_s(t)^{m_s}$$

where the $f_i(t)$ are distinct monic irreducible polynomials. Then $T$ has a unique block diagonal matrix representation

$$M = \operatorname{diag}(C_{11}, C_{12}, \ldots, C_{1r_1}, \ \ldots, \ C_{s1}, C_{s2}, \ldots, C_{sr_s})$$

where the $C_{ij}$ are companion matrices. In particular, the $C_{ij}$ are the companion matrices of the polynomials $f_i(t)^{n_{ij}}$, where

$$m_1 = n_{11} \ge n_{12} \ge \cdots \ge n_{1r_1}, \quad \ldots, \quad m_s = n_{s1} \ge n_{s2} \ge \cdots \ge n_{sr_s}$$

The above matrix representation of $T$ is called its rational canonical form. The polynomials $f_i(t)^{n_{ij}}$ are called the elementary divisors of $T$.
EXAMPLE 10.6 Let $V$ be a vector space of dimension 8 over the rational field $\mathbf{Q}$, and let $T$ be a linear operator on $V$ whose minimal polynomial is

$$m(t) = f_1(t) f_2(t)^2 = (t^4 - 4t^3 + 6t^2 - 4t - 7)(t - 3)^2$$

Thus, because $\dim V = 8$, the characteristic polynomial is $\Delta(t) = f_1(t) f_2(t)^4$. Also, the rational canonical form $M$ of $T$ must have one block the companion matrix of $f_1(t)$ and one block the companion matrix of $f_2(t)^2$. There are two possibilities:
(a) $\operatorname{diag}[C(t^4 - 4t^3 + 6t^2 - 4t - 7),\ C((t - 3)^2),\ C((t - 3)^2)]$
(b) $\operatorname{diag}[C(t^4 - 4t^3 + 6t^2 - 4t - 7),\ C((t - 3)^2),\ C(t - 3),\ C(t - 3)]$
That is,

$$\text{(a)} \ \operatorname{diag}\left( \begin{bmatrix} 0 & 0 & 0 & 7 \\ 1 & 0 & 0 & 4 \\ 0 & 1 & 0 & -6 \\ 0 & 0 & 1 & 4 \end{bmatrix},\ \begin{bmatrix} 0 & -9 \\ 1 & 6 \end{bmatrix},\ \begin{bmatrix} 0 & -9 \\ 1 & 6 \end{bmatrix} \right); \qquad
\text{(b)} \ \operatorname{diag}\left( \begin{bmatrix} 0 & 0 & 0 & 7 \\ 1 & 0 & 0 & 4 \\ 0 & 1 & 0 & -6 \\ 0 & 0 & 1 & 4 \end{bmatrix},\ \begin{bmatrix} 0 & -9 \\ 1 & 6 \end{bmatrix},\ [3],\ [3] \right)$$

10.10 Quotient Spaces

Let $V$ be a vector space over a field $K$ and let $W$ be a subspace of $V$. If $v$ is any vector in $V$, we write $v + W$ for the set of sums $v + w$ with $w \in W$; that is,

$$v + W = \{v + w : w \in W\}$$

These sets are called the cosets of W in V. We show (Problem 10.22) that these cosets partition V into mutually disjoint subsets.
EXAMPLE 10.7 Let $W$ be the subspace of $\mathbf{R}^2$ defined by

$$W = \{(a, b) : a = b\};$$

that is, $W$ is the line given by the equation $x - y = 0$. We can view $v + W$ as a translation of the line obtained by adding the vector $v$ to each point in $W$. As shown in Fig. 10-2, the coset $v + W$ is also a line, and it is parallel to $W$. Thus, the cosets of $W$ in $\mathbf{R}^2$ are precisely all the lines parallel to $W$.

Figure 10-2

In the following theorem, we use the cosets of a subspace $W$ of a vector space $V$ to define a new vector space; it is called the quotient space of $V$ by $W$ and is denoted by $V/W$.
THEOREM 10.15:

Let $W$ be a subspace of a vector space $V$ over a field $K$. Then the cosets of $W$ in $V$ form a vector space over $K$ with the following operations of addition and scalar multiplication:
(i) $(u + W) + (v + W) = (u + v) + W$,
(ii) $k(u + W) = ku + W$, where $k \in K$.

We note that, in the proof of Theorem 10.15 (Problem 10.24), it is first necessary to show that the operations are well defined; that is, whenever $u + W = u' + W$ and $v + W = v' + W$, then

(i) $(u + v) + W = (u' + v') + W$ and (ii) $ku + W = ku' + W$ for any $k \in K$

In the case of an invariant subspace, we have the following useful result (proved in Problem 10.27).
THEOREM 10.16:

Suppose $W$ is a subspace invariant under a linear operator $T: V \to V$. Then $T$ induces a linear operator $\bar{T}$ on $V/W$ defined by $\bar{T}(v + W) = T(v) + W$. Moreover, if $T$ is a zero of any polynomial, then so is $\bar{T}$. Thus, the minimal polynomial of $\bar{T}$ divides the minimal polynomial of $T$.
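The coset operations and the induced operator can be tried out on the line $W = \{(a, a)\}$ of Example 10.7. In the sketch below each coset is labeled by the point where it meets the $x$-axis; this labeling, the helper names, and the operator $T(x, y) = (2x + y, x + 2y)$ (which maps $W$ into itself) are all assumptions introduced for illustration.

```python
def coset_label(v):
    """Canonical representative of v + W for W = {(a, a)}: the coset's point on the x-axis."""
    x, y = v
    return (x - y, 0)

def add_cosets(u, v):
    """(u + W) + (v + W) = (u + v) + W, returned as a canonical representative."""
    return coset_label((u[0] + v[0], u[1] + v[1]))

def scale_coset(k, u):
    """k(u + W) = ku + W, returned as a canonical representative."""
    return coset_label((k * u[0], k * u[1]))

def T(v):
    """Illustrative operator on R^2; T(a, a) = (3a, 3a), so W is T-invariant."""
    x, y = v
    return (2 * x + y, x + 2 * y)

def induced_T(v):
    """The induced operator on V/W: v + W maps to T(v) + W."""
    return coset_label(T(v))

# (3, 1) and (5, 3) lie in the same coset, as do (0, 4) and (1, 5).
same_coset = coset_label((3, 1)) == coset_label((5, 3))
sum_1 = add_cosets((3, 1), (0, 4))
sum_2 = add_cosets((5, 3), (1, 5))
```

Choosing different representatives of the same cosets gives the same sum (`sum_1 == sum_2`), which is the well-definedness required in Theorem 10.15. For this particular $T$, the induced operator fixes every coset, so its minimal polynomial $t - 1$ divides the minimal polynomial $(t - 1)(t - 3)$ of $T$.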

SOLVED PROBLEMS

Invariant Subspaces

10.1. Suppose $T: V \to V$ is linear. Show that each of the following is invariant under $T$: (a) $\{0\}$, (b) $V$, (c) kernel of $T$, (d) image of $T$.
(a) We have $T(0) = 0 \in \{0\}$; hence, $\{0\}$ is invariant under $T$.
(b) For every $v \in V$, $T(v) \in V$; hence, $V$ is invariant under $T$.
(c) Let $u \in \operatorname{Ker} T$. Then $T(u) = 0 \in \operatorname{Ker} T$ because the kernel of $T$ is a subspace of $V$. Thus, $\operatorname{Ker} T$ is invariant under $T$.
(d) Because $T(v) \in \operatorname{Im} T$ for every $v \in V$, it is certainly true when $v \in \operatorname{Im} T$. Hence, the image of $T$ is invariant under $T$.

10.2. Suppose $\{W_i\}$ is a collection of $T$-invariant subspaces of a vector space $V$. Show that the intersection $W = \bigcap_i W_i$ is also $T$-invariant.
Suppose $v \in W$; then $v \in W_i$ for every $i$. Because $W_i$ is $T$-invariant, $T(v) \in W_i$ for every $i$. Thus, $T(v) \in W$ and so $W$ is $T$-invariant.


10.3. Prove Theorem 10.2: Let $T: V \to V$ be linear. For any polynomial $f(t)$, the kernel of $f(T)$ is invariant under $T$.
Suppose $v \in \operatorname{Ker} f(T)$; that is, $f(T)(v) = 0$. We need to show that $T(v)$ also belongs to the kernel of $f(T)$; that is, $f(T)(T(v)) = (f(T) \circ T)(v) = 0$. Because $f(t)t = tf(t)$, we have $f(T) \circ T = T \circ f(T)$. Thus, as required,

$$(f(T) \circ T)(v) = (T \circ f(T))(v) = T(f(T)(v)) = T(0) = 0$$

10.4. Find all invariant subspaces of $A = \begin{bmatrix} 2 & -5 \\ 1 & -2 \end{bmatrix}$ viewed as an operator on $\mathbf{R}^2$.
By Problem 10.1, $\mathbf{R}^2$ and $\{0\}$ are invariant under $A$. Now if $A$ has any other invariant subspace, it must be one-dimensional. However, the characteristic polynomial of $A$ is

$$\Delta(t) = t^2 - \operatorname{tr}(A)\, t + |A| = t^2 + 1$$

Hence, $A$ has no eigenvalues (in $\mathbf{R}$) and so $A$ has no eigenvectors. But the one-dimensional invariant subspaces correspond to the eigenvectors; thus, $\mathbf{R}^2$ and $\{0\}$ are the only subspaces invariant under $A$.
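The arithmetic behind this solution is short enough to check directly. The sketch below computes the trace, determinant, and discriminant of the characteristic polynomial for the matrix of this problem; the variable names are illustrative.

```python
A = [[2, -5], [1, -2]]

trace = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]

# Characteristic polynomial of a 2-square matrix: t^2 - trace*t + det.
# It has a real root exactly when its discriminant is nonnegative.
discriminant = trace ** 2 - 4 * det
has_real_eigenvalue = discriminant >= 0

# A is a zero of its characteristic polynomial t^2 + 1, so A^2 = -I.
A_squared = [[sum(A[i][t] * A[t][j] for t in range(2)) for j in range(2)] for i in range(2)]
```

The trace is 0 and the determinant is 1, so the characteristic polynomial is $t^2 + 1$ with negative discriminant, and no real eigenvalue exists.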

10.5. Prove Theorem 10.3: Suppose $W$ is $T$-invariant. Then $T$ has a triangular block representation $\begin{bmatrix} A & B \\ 0 & C \end{bmatrix}$, where $A$ is the matrix representation of the restriction $\hat{T}$ of $T$ to $W$.
We choose a basis $\{w_1, \ldots, w_r\}$ of $W$ and extend it to a basis $\{w_1, \ldots, w_r, v_1, \ldots, v_s\}$ of $V$. We have

$$\begin{aligned}
\hat{T}(w_1) = T(w_1) &= a_{11}w_1 + \cdots + a_{1r}w_r \\
\hat{T}(w_2) = T(w_2) &= a_{21}w_1 + \cdots + a_{2r}w_r \\
&\ \ \vdots \\
\hat{T}(w_r) = T(w_r) &= a_{r1}w_1 + \cdots + a_{rr}w_r \\
T(v_1) &= b_{11}w_1 + \cdots + b_{1r}w_r + c_{11}v_1 + \cdots + c_{1s}v_s \\
T(v_2) &= b_{21}w_1 + \cdots + b_{2r}w_r + c_{21}v_1 + \cdots + c_{2s}v_s \\
&\ \ \vdots \\
T(v_s) &= b_{s1}w_1 + \cdots + b_{sr}w_r + c_{s1}v_1 + \cdots + c_{ss}v_s
\end{aligned}$$

But the matrix of $T$ in this basis is the transpose of the matrix of coefficients in the above system of equations (Section 6.2). Therefore, it has the form $\begin{bmatrix} A & B \\ 0 & C \end{bmatrix}$, where $A$ is the transpose of the matrix of coefficients for the obvious subsystem. By the same argument, $A$ is the matrix of $\hat{T}$ relative to the basis $\{w_i\}$ of $W$.

10.6. Let $\hat{T}$ denote the restriction of an operator $T$ to an invariant subspace $W$. Prove
(a) For any polynomial $f(t)$, $f(\hat{T})(w) = f(T)(w)$.
(b) The minimal polynomial of $\hat{T}$ divides the minimal polynomial of $T$.
(a) If $f(t) = 0$ or if $f(t)$ is a constant or of degree 1, then the result clearly holds. Assume $\deg f = n > 1$ and that the result holds for polynomials of degree less than $n$. Suppose that

$$f(t) = a_n t^n + a_{n-1}t^{n-1} + \cdots + a_1 t + a_0$$

Then

$$\begin{aligned}
f(\hat{T})(w) &= (a_n \hat{T}^n + a_{n-1}\hat{T}^{n-1} + \cdots + a_0 I)(w) \\
&= (a_n \hat{T}^{n-1})(\hat{T}(w)) + (a_{n-1}\hat{T}^{n-1} + \cdots + a_0 I)(w) \\
&= (a_n T^{n-1})(T(w)) + (a_{n-1}T^{n-1} + \cdots + a_0 I)(w) = f(T)(w)
\end{aligned}$$

(b) Let $m(t)$ denote the minimal polynomial of $T$. Then by (a), $m(\hat{T})(w) = m(T)(w) = 0(w) = 0$ for every $w \in W$; that is, $\hat{T}$ is a zero of the polynomial $m(t)$. Hence, the minimal polynomial of $\hat{T}$ divides $m(t)$.

Invariant Direct-Sum Decompositions

10.7. Prove Theorem 10.4: Suppose $W_1, W_2, \ldots, W_r$ are subspaces of $V$ with respective bases

$$B_1 = \{w_{11}, w_{12}, \ldots, w_{1n_1}\}, \quad \ldots, \quad B_r = \{w_{r1}, w_{r2}, \ldots, w_{rn_r}\}$$

Then $V$ is the direct sum of the $W_i$ if and only if the union $B = \bigcup_i B_i$ is a basis of $V$.
Suppose $B$ is a basis of $V$. Then, for any $v \in V$,

$$v = a_{11}w_{11} + \cdots + a_{1n_1}w_{1n_1} + \cdots + a_{r1}w_{r1} + \cdots + a_{rn_r}w_{rn_r} = w_1 + w_2 + \cdots + w_r$$

where $w_i = a_{i1}w_{i1} + \cdots + a_{in_i}w_{in_i} \in W_i$. We next show that such a sum is unique. Suppose

$$v = w_1' + w_2' + \cdots + w_r', \quad \text{where } w_i' \in W_i$$

Because $\{w_{i1}, \ldots, w_{in_i}\}$ is a basis of $W_i$, $w_i' = b_{i1}w_{i1} + \cdots + b_{in_i}w_{in_i}$, and so

$$v = b_{11}w_{11} + \cdots + b_{1n_1}w_{1n_1} + \cdots + b_{r1}w_{r1} + \cdots + b_{rn_r}w_{rn_r}$$

Because $B$ is a basis of $V$, $a_{ij} = b_{ij}$ for each $i$ and each $j$. Hence, $w_i = w_i'$, and so the sum for $v$ is unique. Accordingly, $V$ is the direct sum of the $W_i$.

Conversely, suppose $V$ is the direct sum of the $W_i$. Then for any $v \in V$, $v = w_1 + \cdots + w_r$, where $w_i \in W_i$. Because $\{w_{i1}, \ldots, w_{in_i}\}$ is a basis of $W_i$, each $w_i$ is a linear combination of the $w_{ij}$, and so $v$ is a linear combination of the elements of $B$. Thus, $B$ spans $V$. We now show that $B$ is linearly independent. Suppose

$$a_{11}w_{11} + \cdots + a_{1n_1}w_{1n_1} + \cdots + a_{r1}w_{r1} + \cdots + a_{rn_r}w_{rn_r} = 0$$

Note that $a_{i1}w_{i1} + \cdots + a_{in_i}w_{in_i} \in W_i$. We also have that $0 = 0 + 0 + \cdots + 0$, where $0 \in W_i$. Because such a sum for 0 is unique,

$$a_{i1}w_{i1} + \cdots + a_{in_i}w_{in_i} = 0 \quad \text{for } i = 1, \ldots, r$$

The independence of the bases $\{w_{i1}, \ldots, w_{in_i}\}$ implies that all the $a$'s are 0. Thus, $B$ is linearly independent and is a basis of $V$.

10.8. Suppose $T: V \to V$ is linear and suppose $T = T_1 \oplus T_2$ with respect to a $T$-invariant direct-sum decomposition $V = U \oplus W$. Show that
(a) $m(t)$ is the least common multiple of $m_1(t)$ and $m_2(t)$, where $m(t)$, $m_1(t)$, $m_2(t)$ are the minimum polynomials of $T$, $T_1$, $T_2$, respectively.
(b) $\Delta(t) = \Delta_1(t)\Delta_2(t)$, where $\Delta(t)$, $\Delta_1(t)$, $\Delta_2(t)$ are the characteristic polynomials of $T$, $T_1$, $T_2$, respectively.
(a) By Problem 10.6, each of $m_1(t)$ and $m_2(t)$ divides $m(t)$. Now suppose $f(t)$ is a multiple of both $m_1(t)$ and $m_2(t)$; then $f(T_1)(U) = 0$ and $f(T_2)(W) = 0$. Let $v \in V$; then $v = u + w$ with $u \in U$ and $w \in W$. Now

$$f(T)v = f(T)u + f(T)w = f(T_1)u + f(T_2)w = 0 + 0 = 0$$

That is, $T$ is a zero of $f(t)$. Hence, $m(t)$ divides $f(t)$, and so $m(t)$ is the least common multiple of $m_1(t)$ and $m_2(t)$.
(b) By Theorem 10.5, $T$ has a matrix representation $M = \begin{bmatrix} A & 0 \\ 0 & B \end{bmatrix}$, where $A$ and $B$ are matrix representations of $T_1$ and $T_2$, respectively. Then, as required,

$$\Delta(t) = |tI - M| = \begin{vmatrix} tI - A & 0 \\ 0 & tI - B \end{vmatrix} = |tI - A|\,|tI - B| = \Delta_1(t)\Delta_2(t)$$

10.9. Prove Theorem 10.7: Suppose $T: V \to V$ is linear, and suppose $f(t) = g(t)h(t)$ are polynomials such that $f(T) = 0$ and $g(t)$ and $h(t)$ are relatively prime. Then $V$ is the direct sum of the $T$-invariant subspaces $U$ and $W$ where $U = \operatorname{Ker} g(T)$ and $W = \operatorname{Ker} h(T)$.
Note first that $U$ and $W$ are $T$-invariant by Theorem 10.2. Now, because $g(t)$ and $h(t)$ are relatively prime, there exist polynomials $r(t)$ and $s(t)$ such that

$$r(t)g(t) + s(t)h(t) = 1$$

Hence, for the operator $T$,

$$r(T)g(T) + s(T)h(T) = I \qquad (*)$$

Let $v \in V$; then, by $(*)$,

$$v = r(T)g(T)v + s(T)h(T)v$$

But the first term in this sum belongs to $W = \operatorname{Ker} h(T)$, because

$$h(T)r(T)g(T)v = r(T)g(T)h(T)v = r(T)f(T)v = r(T)0v = 0$$

Similarly, the second term belongs to $U$. Hence, $V$ is the sum of $U$ and $W$.

To prove that $V = U \oplus W$, we must show that a sum $v = u + w$ with $u \in U$, $w \in W$, is uniquely determined by $v$. Applying the operator $r(T)g(T)$ to $v = u + w$ and using $g(T)u = 0$, we obtain

$$r(T)g(T)v = r(T)g(T)u + r(T)g(T)w = r(T)g(T)w$$

Also, applying $(*)$ to $w$ alone and using $h(T)w = 0$, we obtain

$$w = r(T)g(T)w + s(T)h(T)w = r(T)g(T)w$$

Both of the above formulas give us $w = r(T)g(T)v$, and so $w$ is uniquely determined by $v$. Similarly $u$ is uniquely determined by $v$. Hence, $V = U \oplus W$, as required.
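The proof is constructive: the operators $r(T)g(T)$ and $s(T)h(T)$ pick out the components of $v$ in $W$ and $U$. The sketch below works this out for one small case. The matrix $T$, the factors $g(t) = t - 1$ and $h(t) = t - 2$, and the Bezout coefficients $r(t) = 1$, $s(t) = -1$ (so that $(t - 1) - (t - 2) = 1$) are all illustrative assumptions.

```python
def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][t] * Y[t][j] for t in range(n)) for j in range(n)] for i in range(n)]

def matvec(M, v):
    """Matrix-vector product M v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def shift(T, c):
    """Return T - c*I."""
    n = len(T)
    return [[T[i][j] - (c if i == j else 0) for j in range(n)] for i in range(n)]

T = [[1, 1], [0, 2]]               # illustrative operator; f(t) = (t - 1)(t - 2)
g_of_T = shift(T, 1)               # g(T) = T - I
h_of_T = shift(T, 2)               # h(T) = T - 2I
f_of_T = matmul(g_of_T, h_of_T)    # f(T) = g(T)h(T), expected to be zero

# With r(t) = 1 and s(t) = -1:
to_W = g_of_T                                   # r(T)g(T), maps V into W = Ker h(T)
to_U = [[-x for x in row] for row in h_of_T]    # s(T)h(T), maps V into U = Ker g(T)

v = [3, 5]
w = matvec(to_W, v)   # component of v in W
u = matvec(to_U, v)   # component of v in U
```

For $v = (3, 5)$ this gives $w = (5, 5)$ and $u = (-2, 0)$, with $u + w = v$, $h(T)w = 0$, and $g(T)u = 0$.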

10.10. Prove Theorem 10.8: In Theorem 10.7 (Problem 10.9), if $f(t)$ is the minimal polynomial of $T$ (and $g(t)$ and $h(t)$ are monic), then $g(t)$ is the minimal polynomial of the restriction $T_1$ of $T$ to $U$ and $h(t)$ is the minimal polynomial of the restriction $T_2$ of $T$ to $W$.
Let $m_1(t)$ and $m_2(t)$ be the minimal polynomials of $T_1$ and $T_2$, respectively. Note that $g(T_1) = 0$ and $h(T_2) = 0$ because $U = \operatorname{Ker} g(T)$ and $W = \operatorname{Ker} h(T)$. Thus,

$$m_1(t) \text{ divides } g(t) \quad \text{and} \quad m_2(t) \text{ divides } h(t) \qquad (1)$$

By Problem 10.8, $f(t)$ is the least common multiple of $m_1(t)$ and $m_2(t)$. But $m_1(t)$ and $m_2(t)$ are relatively prime because $g(t)$ and $h(t)$ are relatively prime. Accordingly, $f(t) = m_1(t)m_2(t)$. We also have that $f(t) = g(t)h(t)$. These two equations together with (1) and the fact that all the polynomials are monic imply that $g(t) = m_1(t)$ and $h(t) = m_2(t)$, as required.

10.11. Prove the Primary Decomposition Theorem 10.6: Let $T: V \to V$ be a linear operator with minimal polynomial

$$m(t) = f_1(t)^{n_1} f_2(t)^{n_2} \cdots f_r(t)^{n_r}$$

where the $f_i(t)$ are distinct monic irreducible polynomials. Then $V$ is the direct sum of $T$-invariant subspaces $W_1, \ldots, W_r$ where $W_i$ is the kernel of $f_i(T)^{n_i}$. Moreover, $f_i(t)^{n_i}$ is the minimal polynomial of the restriction of $T$ to $W_i$.
The proof is by induction on $r$. The case $r = 1$ is trivial. Suppose that the theorem has been proved for $r - 1$. By Theorem 10.7, we can write $V$ as the direct sum of $T$-invariant subspaces $W_1$ and $V_1$, where $W_1$ is the kernel of $f_1(T)^{n_1}$ and where $V_1$ is the kernel of $f_2(T)^{n_2} \cdots f_r(T)^{n_r}$. By Theorem 10.8, the minimal polynomials of the restrictions of $T$ to $W_1$ and $V_1$ are $f_1(t)^{n_1}$ and $f_2(t)^{n_2} \cdots f_r(t)^{n_r}$, respectively.

Denote the restriction of $T$ to $V_1$ by $\hat{T}_1$. By the inductive hypothesis, $V_1$ is the direct sum of subspaces $W_2, \ldots, W_r$ such that $W_i$ is the kernel of $f_i(\hat{T}_1)^{n_i}$ and such that $f_i(t)^{n_i}$ is the minimal polynomial for the restriction of $\hat{T}_1$ to $W_i$. But the kernel of $f_i(T)^{n_i}$, for $i = 2, \ldots, r$, is necessarily contained in $V_1$, because $f_i(t)^{n_i}$ divides $f_2(t)^{n_2} \cdots f_r(t)^{n_r}$. Thus, the kernel of $f_i(T)^{n_i}$ is the same as the kernel of $f_i(\hat{T}_1)^{n_i}$, which is $W_i$. Also, the restriction of $T$ to $W_i$ is the same as the restriction of $\hat{T}_1$ to $W_i$ (for $i = 2, \ldots, r$); hence, $f_i(t)^{n_i}$ is also the minimal polynomial for the restriction of $T$ to $W_i$. Thus, $V = W_1 \oplus W_2 \oplus \cdots \oplus W_r$ is the desired decomposition of $T$.

10.12. Prove Theorem 10.9: A linear operator $T: V \to V$ has a diagonal matrix representation if and only if its minimal polynomial $m(t)$ is a product of distinct linear polynomials.
Suppose $m(t)$ is a product of distinct linear polynomials, say,

$$m(t) = (t - \lambda_1)(t - \lambda_2) \cdots (t - \lambda_r)$$

where the $\lambda_i$ are distinct scalars. By the Primary Decomposition Theorem, $V$ is the direct sum of subspaces $W_1, \ldots, W_r$, where $W_i = \operatorname{Ker}(T - \lambda_i I)$. Thus, if $v \in W_i$, then $(T - \lambda_i I)(v) = 0$ or $T(v) = \lambda_i v$. In other words, every vector in $W_i$ is an eigenvector belonging to the eigenvalue $\lambda_i$. By Theorem 10.4, the union of bases for $W_1, \ldots, W_r$ is a basis of $V$. This basis consists of eigenvectors, and so $T$ is diagonalizable.

Conversely, suppose $T$ is diagonalizable (i.e., $V$ has a basis consisting of eigenvectors of $T$). Let $\lambda_1, \ldots, \lambda_s$ be the distinct eigenvalues of $T$. Then the operator

$$f(T) = (T - \lambda_1 I)(T - \lambda_2 I) \cdots (T - \lambda_s I)$$

maps each basis vector into 0. Thus, $f(T) = 0$, and hence, the minimal polynomial $m(t)$ of $T$ divides the polynomial

$$f(t) = (t - \lambda_1)(t - \lambda_2) \cdots (t - \lambda_s)$$

Accordingly, $m(t)$ is a product of distinct linear polynomials.

Nilpotent Operators, Jordan Canonical Form

10.13. Let $T: V \to V$ be linear. Suppose, for $v \in V$, $T^k(v) = 0$ but $T^{k-1}(v) \neq 0$. Prove
(a) The set $S = \{v, T(v), \ldots, T^{k-1}(v)\}$ is linearly independent.
(b) The subspace $W$ generated by $S$ is $T$-invariant.
(c) The restriction $\hat{T}$ of $T$ to $W$ is nilpotent of index $k$.
(d) Relative to the basis $\{T^{k-1}(v), \ldots, T(v), v\}$ of $W$, the matrix of $T$ is the $k$-square Jordan nilpotent block $N_k$ of index $k$ (see Example 10.4).
(a) Suppose

$$av + a_1 T(v) + a_2 T^2(v) + \cdots + a_{k-1}T^{k-1}(v) = 0 \qquad (*)$$

Applying $T^{k-1}$ to $(*)$ and using $T^k(v) = 0$, we obtain $aT^{k-1}(v) = 0$; because $T^{k-1}(v) \neq 0$, $a = 0$. Now applying $T^{k-2}$ to $(*)$ and using $T^k(v) = 0$ and $a = 0$, we find $a_1 T^{k-1}(v) = 0$; hence, $a_1 = 0$. Next applying $T^{k-3}$ to $(*)$ and using $T^k(v) = 0$ and $a = a_1 = 0$, we obtain $a_2 T^{k-1}(v) = 0$; hence, $a_2 = 0$. Continuing this process, we find that all the $a$'s are 0; hence, $S$ is independent.
(b) Let $w \in W$. Then

$$w = bv + b_1 T(v) + b_2 T^2(v) + \cdots + b_{k-1}T^{k-1}(v)$$

Using $T^k(v) = 0$, we have

$$T(w) = bT(v) + b_1 T^2(v) + \cdots + b_{k-2}T^{k-1}(v) \in W$$

Thus, $W$ is $T$-invariant.
(c) By hypothesis, $T^k(v) = 0$. Hence, for $i = 0, \ldots, k - 1$,

$$\hat{T}^k(T^i(v)) = T^{k+i}(v) = 0$$

That is, applying $\hat{T}^k$ to each generator of $W$, we obtain 0; hence, $\hat{T}^k = 0$ and so $\hat{T}$ is nilpotent of index at most $k$. On the other hand, $\hat{T}^{k-1}(v) = T^{k-1}(v) \neq 0$; hence, $\hat{T}$ is nilpotent of index exactly $k$.
(d) For the basis $\{T^{k-1}(v), T^{k-2}(v), \ldots, T(v), v\}$ of $W$,

$$\begin{aligned}
\hat{T}(T^{k-1}(v)) &= T^k(v) = 0 \\
\hat{T}(T^{k-2}(v)) &= T^{k-1}(v) \\
\hat{T}(T^{k-3}(v)) &= T^{k-2}(v) \\
&\ \ \vdots \\
\hat{T}(T(v)) &= T^2(v) \\
\hat{T}(v) &= T(v)
\end{aligned}$$

Hence, as required, the matrix of $T$ in this basis is the $k$-square Jordan nilpotent block $N_k$.

10.14. Let $T: V \to V$ be linear. Let $U = \operatorname{Ker} T^i$ and $W = \operatorname{Ker} T^{i+1}$. Show that (a) $U \subseteq W$, (b) $T(W) \subseteq U$.
(a) Suppose $u \in U = \operatorname{Ker} T^i$. Then $T^i(u) = 0$ and so $T^{i+1}(u) = T(T^i(u)) = T(0) = 0$. Thus, $u \in \operatorname{Ker} T^{i+1} = W$. But this is true for every $u \in U$; hence, $U \subseteq W$.
(b) Similarly, if $w \in W = \operatorname{Ker} T^{i+1}$, then $T^{i+1}(w) = 0$. Thus, $T^i(T(w)) = T^{i+1}(w) = 0$, so $T(w) \in \operatorname{Ker} T^i = U$; hence, $T(W) \subseteq U$.

10.15. Let $T: V \to V$ be linear. Let $X = \operatorname{Ker} T^{i-2}$, $Y = \operatorname{Ker} T^{i-1}$, $Z = \operatorname{Ker} T^i$. Therefore (Problem 10.14), $X \subseteq Y \subseteq Z$. Suppose

$$\{u_1, \ldots, u_r\}, \quad \{u_1, \ldots, u_r, v_1, \ldots, v_s\}, \quad \{u_1, \ldots, u_r, v_1, \ldots, v_s, w_1, \ldots, w_t\}$$

are bases of $X$, $Y$, $Z$, respectively. Show that

$$S = \{u_1, \ldots, u_r, T(w_1), \ldots, T(w_t)\}$$

is contained in $Y$ and is linearly independent.
By Problem 10.14, $T(Z) \subseteq Y$, and hence $S \subseteq Y$. Now suppose $S$ is linearly dependent. Then there exists a relation

$$a_1 u_1 + \cdots + a_r u_r + b_1 T(w_1) + \cdots + b_t T(w_t) = 0$$

where at least one coefficient is not zero. Furthermore, because $\{u_i\}$ is independent, at least one of the $b_k$ must be nonzero. Transposing, we find

$$b_1 T(w_1) + \cdots + b_t T(w_t) = -a_1 u_1 - \cdots - a_r u_r \in X = \operatorname{Ker} T^{i-2}$$

Hence,

$$T^{i-2}(b_1 T(w_1) + \cdots + b_t T(w_t)) = 0$$

Thus,

$$T^{i-1}(b_1 w_1 + \cdots + b_t w_t) = 0, \quad \text{and so} \quad b_1 w_1 + \cdots + b_t w_t \in Y = \operatorname{Ker} T^{i-1}$$

Because $\{u_i, v_j\}$ generates $Y$, we obtain a relation among the $u_i$, $v_j$, $w_k$ where one of the coefficients (i.e., one of the $b_k$) is not zero. This contradicts the fact that $\{u_i, v_j, w_k\}$ is independent. Hence, $S$ must also be independent.

10.16. Prove Theorem 10.10: Let $T: V \to V$ be a nilpotent operator of index $k$. Then $T$ has a unique block diagonal matrix representation consisting of Jordan nilpotent blocks $N$. There is at least one $N$ of order $k$, and all other $N$ are of orders $\le k$. The total number of $N$ of all orders is equal to the nullity of $T$.
Suppose $\dim V = n$. Let $W_1 = \operatorname{Ker} T$, $W_2 = \operatorname{Ker} T^2$, $\ldots$, $W_k = \operatorname{Ker} T^k$. Let us set $m_i = \dim W_i$, for $i = 1, \ldots, k$. Because $T$ is of index $k$, $W_k = V$ and $W_{k-1} \neq V$, and so $m_k = n > m_{k-1}$. By Problem 10.14,

$$W_1 \subseteq W_2 \subseteq \cdots \subseteq W_k = V$$

Thus, by induction, we can choose a basis $\{u_1, \ldots, u_n\}$ of $V$ such that $\{u_1, \ldots, u_{m_i}\}$ is a basis of $W_i$.

We now choose a new basis for $V$ with respect to which $T$ has the desired form. It will be convenient to label the members of this new basis by pairs of indices. We begin by setting

$$v(1, k) = u_{m_{k-1}+1}, \quad v(2, k) = u_{m_{k-1}+2}, \quad \ldots, \quad v(m_k - m_{k-1}, k) = u_{m_k}$$

and setting

$$v(1, k-1) = Tv(1, k), \quad v(2, k-1) = Tv(2, k), \quad \ldots, \quad v(m_k - m_{k-1}, k-1) = Tv(m_k - m_{k-1}, k)$$

By the preceding problem,

$$S_1 = \{u_1, \ldots, u_{m_{k-2}}, v(1, k-1), \ldots, v(m_k - m_{k-1}, k-1)\}$$

is a linearly independent subset of $W_{k-1}$. We extend $S_1$ to a basis of $W_{k-1}$ by adjoining new elements (if necessary), which we denote by

$$v(m_k - m_{k-1} + 1, k-1), \quad v(m_k - m_{k-1} + 2, k-1), \quad \ldots, \quad v(m_{k-1} - m_{k-2}, k-1)$$

Next we set

$$v(1, k-2) = Tv(1, k-1), \quad v(2, k-2) = Tv(2, k-1), \quad \ldots, \quad v(m_{k-1} - m_{k-2}, k-2) = Tv(m_{k-1} - m_{k-2}, k-1)$$

Again by the preceding problem,

$$S_2 = \{u_1, \ldots, u_{m_{k-3}}, v(1, k-2), \ldots, v(m_{k-1} - m_{k-2}, k-2)\}$$

is a linearly independent subset of $W_{k-2}$, which we can extend to a basis of $W_{k-2}$ by adjoining elements

$$v(m_{k-1} - m_{k-2} + 1, k-2), \quad v(m_{k-1} - m_{k-2} + 2, k-2), \quad \ldots, \quad v(m_{k-2} - m_{k-3}, k-2)$$

Continuing in this manner, we get a new basis for $V$, which for convenient reference we arrange as follows:

$$\begin{array}{l}
v(1, k), \ \ldots, \ v(m_k - m_{k-1}, k) \\
v(1, k-1), \ \ldots, \ v(m_k - m_{k-1}, k-1), \ \ldots, \ v(m_{k-1} - m_{k-2}, k-1) \\
\qquad \vdots \\
v(1, 2), \ \ldots, \ v(m_k - m_{k-1}, 2), \ \ldots, \ v(m_{k-1} - m_{k-2}, 2), \ \ldots, \ v(m_2 - m_1, 2) \\
v(1, 1), \ \ldots, \ v(m_k - m_{k-1}, 1), \ \ldots, \ v(m_{k-1} - m_{k-2}, 1), \ \ldots, \ v(m_2 - m_1, 1), \ \ldots, \ v(m_1, 1)
\end{array}$$

The bottom row forms a basis of $W_1$, the bottom two rows form a basis of $W_2$, and so forth. But what is important for us is that $T$ maps each vector into the vector immediately below it in the table or into 0 if the vector is in the bottom row. That is,

$$Tv(i, j) = \begin{cases} v(i, j-1) & \text{for } j > 1 \\ 0 & \text{for } j = 1 \end{cases}$$

Now it is clear [see Problem 10.13(d)] that $T$ will have the desired form if the $v(i, j)$ are ordered lexicographically: beginning with $v(1, 1)$ and moving up the first column to $v(1, k)$, then jumping to $v(2, 1)$ and moving up the second column as far as possible.

Moreover, there will be exactly

$$\begin{array}{ll}
m_k - m_{k-1} & \text{diagonal entries of order } k \\
(m_{k-1} - m_{k-2}) - (m_k - m_{k-1}) = 2m_{k-1} - m_k - m_{k-2} & \text{diagonal entries of order } k - 1 \\
\qquad \vdots & \\
2m_2 - m_1 - m_3 & \text{diagonal entries of order } 2 \\
2m_1 - m_2 & \text{diagonal entries of order } 1
\end{array}$$

as can be read off directly from the table. In particular, because the numbers $m_1, \ldots, m_k$ are uniquely determined by $T$, the number of diagonal entries of each order is uniquely determined by $T$. Finally, the identity

$$m_1 = (m_k - m_{k-1}) + (2m_{k-1} - m_k - m_{k-2}) + \cdots + (2m_2 - m_1 - m_3) + (2m_1 - m_2)$$

shows that the nullity $m_1$ of $T$ is the total number of diagonal entries of $T$.

10.17. Let
\[
A = \begin{bmatrix} 0 & 1 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}
\quad \text{and} \quad
B = \begin{bmatrix} 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}
\]
The reader can verify that $A$ and $B$ are both nilpotent of index 3; that is, $A^3 = 0$ but $A^2 \neq 0$, and $B^3 = 0$ but $B^2 \neq 0$. Find the nilpotent matrices $M_A$ and $M_B$ in canonical form that are similar to $A$ and $B$, respectively.
Because $A$ and $B$ are nilpotent of index 3, $M_A$ and $M_B$ must each contain a Jordan nilpotent block of order 3, and none greater than 3. Note that $\operatorname{rank}(A) = 2$ and $\operatorname{rank}(B) = 3$, so $\operatorname{nullity}(A) = 5 - 2 = 3$ and $\operatorname{nullity}(B) = 5 - 3 = 2$. Thus, $M_A$ must contain three diagonal blocks, which must be one of order 3 and two of order 1; and $M_B$ must contain two diagonal blocks, which must be one of order 3 and one of order 2. Namely,
\[
M_A = \begin{bmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}
\quad \text{and} \quad
M_B = \begin{bmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}
\]
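The block-counting argument can also be checked numerically. The following is a minimal sketch, not part of the original text: it computes the nullities $m_j$ of the powers of a nilpotent matrix and applies the count $2m_j - m_{j+1} - m_{j-1}$ from the proof of Theorem 10.10 (with $m_0 = 0$ and $m_{k+1} = m_k$). The helper names `rank`, `matmul`, and `nilpotent_block_counts` are illustrative assumptions, and the example uses the matrix $A$ of Problem 10.17.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a matrix, by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0  # number of pivots found so far
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def nilpotent_block_counts(matrix):
    """Return {order: number of Jordan nilpotent blocks of that order}."""
    n = len(matrix)
    nullities = [0]  # m_0 = 0
    power = matrix
    while nullities[-1] < n:
        nullities.append(n - rank(power))  # m_j = nullity of matrix^j
        power = matmul(power, matrix)
        if len(nullities) > n + 1:
            raise ValueError("matrix is not nilpotent")
    k = len(nullities) - 1   # index of nilpotency
    nullities.append(n)      # m_{k+1} = m_k = n
    return {j: 2 * nullities[j] - nullities[j - 1] - nullities[j + 1]
            for j in range(1, k + 1)}


A = [[0, 1, 1, 0, 1],
     [0, 0, 1, 1, 1],
     [0, 0, 0, 0, 0],
     [0, 0, 0, 0, 0],
     [0, 0, 0, 0, 0]]

print(nilpotent_block_counts(A))
```

For $A$ the nullities are $m_1 = 3$, $m_2 = 4$, $m_3 = 5$, so the sketch reports one block of order 3, none of order 2, and two of order 1, in agreement with $M_A$.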

10.18. Prove Theorem 10.11 on the Jordan canonical form for an operator T.

By the primary decomposition theorem, $T$ is decomposable into operators $T_1, \ldots, T_r$; that is, $T = T_1 \oplus \cdots \oplus T_r$, where $(t - \lambda_i)^{m_i}$ is the minimal polynomial of $T_i$. Thus, in particular,
\[ (T_1 - \lambda_1 I)^{m_1} = 0, \; \ldots, \; (T_r - \lambda_r I)^{m_r} = 0 \]
Set $N_i = T_i - \lambda_i I$. Then, for $i = 1, \ldots, r$,
\[ T_i = N_i + \lambda_i I, \quad \text{where} \quad N_i^{m_i} = 0 \]
That is, $T_i$ is the sum of the scalar operator $\lambda_i I$ and a nilpotent operator $N_i$, which is of index $m_i$ because $(t - \lambda_i)^{m_i}$ is the minimal polynomial of $T_i$.
Now, by Theorem 10.10 on nilpotent operators, we can choose a basis so that $N_i$ is in canonical form. In this basis, $T_i = N_i + \lambda_i I$ is represented by a block diagonal matrix $M_i$ whose diagonal entries are the matrices $J_{ij}$. The direct sum $J$ of the matrices $M_i$ is in Jordan canonical form and, by Theorem 10.5, is a matrix representation of $T$.
Last, we must show that the blocks $J_{ij}$ satisfy the required properties. Property (i) follows from the fact that $N_i$ is of index $m_i$. Property (ii) is true because $T$ and $J$ have the same characteristic polynomial. Property (iii) is true because the nullity of $N_i = T_i - \lambda_i I$ is equal to the geometric multiplicity of the eigenvalue $\lambda_i$. Property (iv) follows from the fact that the $T_i$, and hence the $N_i$, are uniquely determined by $T$.

10.19. Determine all possible Jordan canonical forms $J$ for a linear operator $T: V \to V$ whose characteristic polynomial is $\Delta(t) = (t - 2)^5$ and whose minimal polynomial is $m(t) = (t - 2)^2$.
$J$ must be a $5 \times 5$ matrix, because $\Delta(t)$ has degree 5, and all diagonal elements must be 2, because 2 is the only eigenvalue. Moreover, because the exponent of $t - 2$ in $m(t)$ is 2, $J$ must have one Jordan block of order 2, and the others must be of order 2 or 1. Thus, there are only two possibilities:
\[
J = \operatorname{diag}\left( \begin{bmatrix} 2 & 1 \\ & 2 \end{bmatrix}, \begin{bmatrix} 2 & 1 \\ & 2 \end{bmatrix}, [2] \right)
\quad \text{or} \quad
J = \operatorname{diag}\left( \begin{bmatrix} 2 & 1 \\ & 2 \end{bmatrix}, [2], [2], [2] \right)
\]

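10.20. Determine all possible Jordan canonical forms for a linear operator $T: V \to V$ whose characteristic polynomial is $\Delta(t) = (t - 2)^3 (t - 5)^2$. In each case, find the minimal polynomial $m(t)$.
Because $t - 2$ has exponent 3 in $\Delta(t)$, 2 must appear three times on the diagonal. Similarly, 5 must appear twice. Thus, there are six possibilities:
(a) $\operatorname{diag}\left( \begin{bmatrix} 2 & 1 & \\ & 2 & 1 \\ & & 2 \end{bmatrix}, \begin{bmatrix} 5 & 1 \\ & 5 \end{bmatrix} \right)$,
(b) $\operatorname{diag}\left( \begin{bmatrix} 2 & 1 & \\ & 2 & 1 \\ & & 2 \end{bmatrix}, [5], [5] \right)$,
(c) $\operatorname{diag}\left( \begin{bmatrix} 2 & 1 \\ & 2 \end{bmatrix}, [2], \begin{bmatrix} 5 & 1 \\ & 5 \end{bmatrix} \right)$,
(d) $\operatorname{diag}\left( \begin{bmatrix} 2 & 1 \\ & 2 \end{bmatrix}, [2], [5], [5] \right)$,
(e) $\operatorname{diag}\left( [2], [2], [2], \begin{bmatrix} 5 & 1 \\ & 5 \end{bmatrix} \right)$,
(f) $\operatorname{diag}([2], [2], [2], [5], [5])$
The exponent of each factor in the minimal polynomial $m(t)$ is equal to the size of the largest block for the corresponding eigenvalue. Thus,
(a) $m(t) = (t - 2)^3 (t - 5)^2$, (b) $m(t) = (t - 2)^3 (t - 5)$, (c) $m(t) = (t - 2)^2 (t - 5)^2$,
(d) $m(t) = (t - 2)^2 (t - 5)$, (e) $m(t) = (t - 2)(t - 5)^2$, (f) $m(t) = (t - 2)(t - 5)$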
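The case analysis in Problems 10.19 and 10.20 amounts to listing partitions: for each eigenvalue, the block sizes form a partition of its algebraic multiplicity, and the largest part is its exponent in the minimal polynomial. The sketch below illustrates that bookkeeping; the function names `partitions` and `jordan_forms` and the dictionary encoding of the characteristic polynomial are assumptions made for this example only.

```python
from itertools import product


def partitions(n, largest=None):
    """Yield the partitions of n as non-increasing tuples with parts <= largest."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def jordan_forms(char_poly):
    """char_poly maps each eigenvalue to its algebraic multiplicity.

    Yields pairs (blocks, min_poly): blocks maps each eigenvalue to its
    Jordan block sizes, and min_poly maps each eigenvalue to its exponent
    in the minimal polynomial (the size of its largest block).
    """
    eigenvalues = sorted(char_poly)
    choices = [list(partitions(char_poly[lam])) for lam in eigenvalues]
    for combo in product(*choices):
        blocks = dict(zip(eigenvalues, combo))
        min_poly = {lam: sizes[0] for lam, sizes in blocks.items()}
        yield blocks, min_poly


# Problem 10.20: characteristic polynomial (t - 2)^3 (t - 5)^2
forms = list(jordan_forms({2: 3, 5: 2}))
for blocks, min_poly in forms:
    print(blocks, min_poly)

# Problem 10.19: block sizes for (t - 2)^5 with largest block of order exactly 2
forms_10_19 = [p for p in partitions(5, 2) if p[0] == 2]
print(forms_10_19)
```

The first loop lists six forms in the same order as (a) through (f), and the second computation gives the two block patterns (2, 2, 1) and (2, 1, 1, 1) of Problem 10.19.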

Quotient Space and Triangular Form
10.21. Let $W$ be a subspace of a vector space $V$. Show that the following are equivalent: (i) $u \in v + W$, (ii) $u - v \in W$, (iii) $v \in u + W$.
Suppose $u \in v + W$. Then there exists $w_0 \in W$ such that $u = v + w_0$. Hence, $u - v = w_0 \in W$. Conversely, suppose $u - v \in W$. Then $u - v = w_0$ where $w_0 \in W$. Hence, $u = v + w_0 \in v + W$. Thus, (i) and (ii) are equivalent. We also have $u - v \in W$ iff $-(u - v) = v - u \in W$ iff $v \in u + W$. Thus, (ii) and (iii) are also equivalent.

10.22. Prove the following: The cosets of $W$ in $V$ partition $V$ into mutually disjoint sets. That is, (a) Any two cosets $u + W$ and $v + W$ are either identical or disjoint. (b) Each $v \in V$ belongs to a coset; in fact, $v \in v + W$. Furthermore, $u + W = v + W$ if and only if $u - v \in W$, and so $(v + w) + W = v + W$ for any $w \in W$.
Let $v \in V$. Because $0 \in W$, we have $v = v + 0 \in v + W$, which proves (b). Now suppose the cosets $u + W$ and $v + W$ are not disjoint; say, the vector $x$ belongs to both $u + W$ and $v + W$. Then $u - x \in W$ and $x - v \in W$. The proof of (a) is complete if we show that $u + W = v + W$. Let $u + w_0$ be any element in the coset $u + W$. Because $u - x$, $x - v$, and $w_0$ belong to $W$,
\[ (u + w_0) - v = (u - x) + (x - v) + w_0 \in W \]
Thus, $u + w_0 \in v + W$, and hence the coset $u + W$ is contained in the coset $v + W$. Similarly, $v + W$ is contained in $u + W$, and so $u + W = v + W$. The last statement follows from the fact that $u + W = v + W$ if and only if $u \in v + W$, and, by Problem 10.21, this is equivalent to $u - v \in W$.

10.23. Let $W$ be the solution space of the homogeneous equation $2x + 3y + 4z = 0$. Describe the cosets of $W$ in $\mathbf{R}^3$.
$W$ is a plane through the origin $O = (0, 0, 0)$, and the cosets of $W$ are the planes parallel to $W$. Equivalently, the cosets of $W$ are the solution sets of the family of equations
\[ 2x + 3y + 4z = k, \quad k \in \mathbf{R} \]
In fact, the coset $v + W$, where $v = (a, b, c)$, is the solution set of the linear equation
\[ 2x + 3y + 4z = 2a + 3b + 4c \quad \text{or} \quad 2(x - a) + 3(y - b) + 4(z - c) = 0 \]
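A small sketch of this description, assuming integer coordinates for simplicity: each coset $v + W$ is labelled by the constant $k = 2a + 3b + 4c$, and two vectors lie in the same coset exactly when their difference lies in $W$ (Problem 10.22). The names `NORMAL`, `coset_label`, and `same_coset` are illustrative and do not come from the text.

```python
NORMAL = (2, 3, 4)  # coefficients of 2x + 3y + 4z = 0, which defines W


def coset_label(v):
    """Return k such that the coset v + W is the plane 2x + 3y + 4z = k."""
    return sum(a * x for a, x in zip(NORMAL, v))


def same_coset(u, v):
    """u + W equals v + W exactly when u - v lies in W."""
    difference = tuple(a - b for a, b in zip(u, v))
    return coset_label(difference) == 0


u = (1, 0, 0)    # lies on the plane 2x + 3y + 4z = 2
v = (0, 2, -1)   # also lies on the plane 2x + 3y + 4z = 2
w = (1, 1, 1)    # lies on the plane 2x + 3y + 4z = 9

print(coset_label(u), coset_label(v), coset_label(w))
print(same_coset(u, v), same_coset(u, w))
```

Here $u$ and $v$ determine the same coset, while $w$ determines a different, parallel plane.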

10.24. Suppose $W$ is a subspace of a vector space $V$. Show that the operations in Theorem 10.15 are well defined; namely, show that if $u + W = u' + W$ and $v + W = v' + W$, then
(a) $(u + v) + W = (u' + v') + W$ and (b) $ku + W = ku' + W$ for any $k \in K$

(a) Because $u + W = u' + W$ and $v + W = v' + W$, both $u - u'$ and $v - v'$ belong to $W$. But then $(u + v) - (u' + v') = (u - u') + (v - v') \in W$. Hence, $(u + v) + W = (u' + v') + W$.
(b) Also, because $u - u' \in W$ implies $k(u - u') \in W$, we have $ku - ku' = k(u - u') \in W$; accordingly, $ku + W = ku' + W$.

10.25. Let $V$ be a vector space and $W$ a subspace of $V$. Show that the natural map $\eta: V \to V/W$, defined by $\eta(v) = v + W$, is linear.
For any $u, v \in V$ and any $k \in K$, we have
\[ \eta(u + v) = u + v + W = u + W + v + W = \eta(u) + \eta(v) \]
and
\[ \eta(kv) = kv + W = k(v + W) = k\eta(v) \]
Accordingly, $\eta$ is linear.

10.26. Let $W$ be a subspace of a vector space $V$. Suppose $\{w_1, \ldots, w_r\}$ is a basis of $W$ and the set of cosets $\{\bar{v}_1, \ldots, \bar{v}_s\}$, where $\bar{v}_j = v_j + W$, is a basis of the quotient space. Show that the set of vectors $B = \{v_1, \ldots, v_s, w_1, \ldots, w_r\}$ is a basis of $V$. Thus, $\dim V = \dim W + \dim(V/W)$.
Suppose $u \in V$. Because $\{\bar{v}_j\}$ is a basis of $V/W$,
\[ \bar{u} = u + W = a_1 \bar{v}_1 + a_2 \bar{v}_2 + \cdots + a_s \bar{v}_s \]
Hence, $u = a_1 v_1 + \cdots + a_s v_s + w$, where $w \in W$. Since $\{w_i\}$ is a basis of $W$,
\[ u = a_1 v_1 + \cdots + a_s v_s + b_1 w_1 + \cdots + b_r w_r \]
Accordingly, $B$ spans $V$. We now show that $B$ is linearly independent. Suppose
\[ c_1 v_1 + \cdots + c_s v_s + d_1 w_1 + \cdots + d_r w_r = 0 \qquad (1) \]
Then
\[ c_1 \bar{v}_1 + \cdots + c_s \bar{v}_s = \bar{0} = W \]
Because $\{\bar{v}_j\}$ is independent, the $c$'s are all 0. Substituting into (1), we find $d_1 w_1 + \cdots + d_r w_r = 0$. Because $\{w_i\}$ is independent, the $d$'s are all 0. Thus, $B$ is linearly independent and therefore a basis of $V$.

10.27. Prove Theorem 10.16: Suppose $W$ is a subspace invariant under a linear operator $T: V \to V$. Then $T$ induces a linear operator $\bar{T}$ on $V/W$ defined by $\bar{T}(v + W) = T(v) + W$. Moreover, if $T$ is a zero of any polynomial, then so is $\bar{T}$. Thus, the minimal polynomial of $\bar{T}$ divides the minimal polynomial of $T$.
We first show that $\bar{T}$ is well defined; that is, if $u + W = v + W$, then $\bar{T}(u + W) = \bar{T}(v + W)$. If $u + W = v + W$, then $u - v \in W$, and, as $W$ is $T$-invariant, $T(u - v) = T(u) - T(v) \in W$. Accordingly,
\[ \bar{T}(u + W) = T(u) + W = T(v) + W = \bar{T}(v + W) \]
as required.
We next show that $\bar{T}$ is linear. We have
\[ \bar{T}((u + W) + (v + W)) = \bar{T}(u + v + W) = T(u + v) + W = T(u) + T(v) + W = T(u) + W + T(v) + W = \bar{T}(u + W) + \bar{T}(v + W) \]
Furthermore,
\[ \bar{T}(k(u + W)) = \bar{T}(ku + W) = T(ku) + W = kT(u) + W = k(T(u) + W) = k\bar{T}(u + W) \]
Thus, $\bar{T}$ is linear.
Now, for any coset $u + W$ in $V/W$, writing $\overline{T^2}$ for the operator induced by $T^2$,
\[ \overline{T^2}(u + W) = T^2(u) + W = T(T(u)) + W = \bar{T}(T(u) + W) = \bar{T}(\bar{T}(u + W)) = \bar{T}^2(u + W) \]
Hence, $\overline{T^2} = \bar{T}^2$. Similarly, $\overline{T^n} = \bar{T}^n$ for any $n$. Thus, for any polynomial $f(t) = a_n t^n + \cdots + a_0 = \sum a_i t^i$,
\[
\begin{aligned}
\overline{f(T)}(u + W) &= f(T)(u) + W = \sum a_i T^i(u) + W = \sum a_i (T^i(u) + W) \\
&= \sum a_i \overline{T^i}(u + W) = \sum a_i \bar{T}^i(u + W) = \Big( \sum a_i \bar{T}^i \Big)(u + W) = f(\bar{T})(u + W)
\end{aligned}
\]
and so $\overline{f(T)} = f(\bar{T})$. Accordingly, if $T$ is a root of $f(t)$, then $\overline{f(T)} = \bar{0} = W = f(\bar{T})$; that is, $\bar{T}$ is also a root of $f(t)$. The theorem is proved.

10.28. Prove Theorem 10.1: Let $T: V \to V$ be a linear operator whose characteristic polynomial factors into linear polynomials. Then $V$ has a basis in which $T$ is represented by a triangular matrix.
The proof is by induction on the dimension of $V$. If $\dim V = 1$, then every matrix representation of $T$ is a $1 \times 1$ matrix, which is triangular.
Now suppose $\dim V = n > 1$ and that the theorem holds for spaces of dimension less than $n$. Because the characteristic polynomial of $T$ factors into linear polynomials, $T$ has at least one eigenvalue and so at least one nonzero eigenvector $v$, say $T(v) = a_{11} v$. Let $W$ be the one-dimensional subspace spanned by $v$. Set $\bar{V} = V/W$. Then (Problem 10.26) $\dim \bar{V} = \dim V - \dim W = n - 1$. Note also that $W$ is invariant under $T$. By Theorem 10.16, $T$ induces a linear operator $\bar{T}$ on $\bar{V}$ whose minimal polynomial divides the minimal polynomial of $T$. Because the characteristic polynomial of $T$ is a product of linear polynomials, so is its minimal polynomial, and hence, so are the minimal and characteristic polynomials of $\bar{T}$. Thus, $\bar{V}$ and $\bar{T}$ satisfy the hypothesis of the theorem. Hence, by induction, there exists a basis $\{\bar{v}_2, \ldots, \bar{v}_n\}$ of $\bar{V}$ such that
\[
\begin{aligned}
\bar{T}(\bar{v}_2) &= a_{22} \bar{v}_2 \\
\bar{T}(\bar{v}_3) &= a_{32} \bar{v}_2 + a_{33} \bar{v}_3 \\
&\;\;\vdots \\
\bar{T}(\bar{v}_n) &= a_{n2} \bar{v}_2 + a_{n3} \bar{v}_3 + \cdots + a_{nn} \bar{v}_n
\end{aligned}
\]
Now let $v_2, \ldots, v_n$ be elements of $V$ that belong to the cosets $\bar{v}_2, \ldots, \bar{v}_n$, respectively. Then $\{v, v_2, \ldots, v_n\}$ is a basis of $V$ (Problem 10.26). Because $\bar{T}(\bar{v}_2) = a_{22} \bar{v}_2$, we have
\[ \bar{T}(\bar{v}_2) - a_{22} \bar{v}_2 = \bar{0}, \quad \text{and so} \quad T(v_2) - a_{22} v_2 \in W \]
But $W$ is spanned by $v$; hence, $T(v_2) - a_{22} v_2$ is a multiple of $v$, say,
\[ T(v_2) - a_{22} v_2 = a_{21} v, \quad \text{and so} \quad T(v_2) = a_{21} v + a_{22} v_2 \]
Similarly, for $i = 3, \ldots, n$,
\[ T(v_i) - a_{i2} v_2 - a_{i3} v_3 - \cdots - a_{ii} v_i \in W, \quad \text{and so} \quad T(v_i) = a_{i1} v + a_{i2} v_2 + \cdots + a_{ii} v_i \]
Thus,
\[
\begin{aligned}
T(v) &= a_{11} v \\
T(v_2) &= a_{21} v + a_{22} v_2 \\
&\;\;\vdots \\
T(v_n) &= a_{n1} v + a_{n2} v_2 + \cdots + a_{nn} v_n
\end{aligned}
\]
and hence the matrix of $T$ in this basis is triangular.

Cyclic Subspaces, Rational Canonical Form
10.29. Prove Theorem 10.12: Let $Z(v, T)$ be a $T$-cyclic subspace, $T_v$ the restriction of $T$ to $Z(v, T)$, and $m_v(t) = t^k + a_{k-1} t^{k-1} + \cdots + a_0$ the $T$-annihilator of $v$. Then,
(i) The set $\{v, T(v), \ldots, T^{k-1}(v)\}$ is a basis of $Z(v, T)$; hence, $\dim Z(v, T) = k$.
(ii) The minimal polynomial of $T_v$ is $m_v(t)$.
(iii) The matrix of $T_v$ in the above basis is the companion matrix $C = C(m_v)$ of $m_v(t)$ [which has 1's below the diagonal, the negative of the coefficients $a_0, a_1, \ldots, a_{k-1}$ of $m_v(t)$ in the last column, and 0's elsewhere].
(i) By definition of $m_v(t)$, $T^k(v)$ is the first vector in the sequence $v, T(v), T^2(v), \ldots$ that is a linear combination of those vectors that precede it in the sequence; hence, the set $B = \{v, T(v), \ldots, T^{k-1}(v)\}$ is linearly independent. We now only have to show that $Z(v, T) = L(B)$, the linear span of $B$. By the above, $T^k(v) \in L(B)$. We prove by induction that $T^n(v) \in L(B)$ for every $n$. Suppose $n > k$ and $T^{n-1}(v) \in L(B)$; that is, $T^{n-1}(v)$ is a linear combination of $v, \ldots, T^{k-1}(v)$. Then $T^n(v) = T(T^{n-1}(v))$ is a linear combination of $T(v), \ldots, T^k(v)$. But $T^k(v) \in L(B)$; hence, $T^n(v) \in L(B)$ for every $n$. Consequently, $f(T)(v) \in L(B)$ for any polynomial $f(t)$. Thus, $Z(v, T) = L(B)$, and so $B$ is a basis, as claimed.
(ii) Suppose $m(t) = t^s + b_{s-1} t^{s-1} + \cdots + b_0$ is the minimal polynomial of $T_v$. Then, because $v \in Z(v, T)$,
\[ 0 = m(T_v)(v) = m(T)(v) = T^s(v) + b_{s-1} T^{s-1}(v) + \cdots + b_0 v \]
Thus, $T^s(v)$ is a linear combination of $v, T(v), \ldots, T^{s-1}(v)$, and therefore $k \le s$. However, $m_v(T) = 0$ and so $m_v(T_v) = 0$. Then $m(t)$ divides $m_v(t)$, and so $s \le k$. Accordingly, $k = s$ and hence $m_v(t) = m(t)$.
(iii) We have
\[
\begin{aligned}
T_v(v) &= T(v) \\
T_v(T(v)) &= T^2(v) \\
&\;\;\vdots \\
T_v(T^{k-2}(v)) &= T^{k-1}(v) \\
T_v(T^{k-1}(v)) &= T^k(v) = -a_0 v - a_1 T(v) - a_2 T^2(v) - \cdots - a_{k-1} T^{k-1}(v)
\end{aligned}
\]
By definition, the matrix of $T_v$ in this basis is the transpose of the matrix of coefficients of the above system of equations; hence, it is $C$, as required.

10.30. Let $T: V \to V$ be linear. Let $W$ be a $T$-invariant subspace of $V$ and $\bar{T}$ the induced operator on $V/W$. Prove (a) The $T$-annihilator of $v \in V$ divides the minimal polynomial of $T$. (b) The $\bar{T}$-annihilator of $\bar{v} \in V/W$ divides the minimal polynomial of $T$.
(a) The $T$-annihilator of $v \in V$ is the minimal polynomial of the restriction of $T$ to $Z(v, T)$; therefore, by Problem 10.6, it divides the minimal polynomial of $T$.
(b) The $\bar{T}$-annihilator of $\bar{v} \in V/W$ divides the minimal polynomial of $\bar{T}$, which divides the minimal polynomial of $T$ by Theorem 10.16.
Remark: In the case where the minimal polynomial of $T$ is $f(t)^n$, where $f(t)$ is a monic irreducible polynomial, the $T$-annihilator of $v \in V$ and the $\bar{T}$-annihilator of $\bar{v} \in V/W$ are of the form $f(t)^m$, where $m \le n$.

10.31. Prove Lemma 10.13: Let $T: V \to V$ be a linear operator whose minimal polynomial is $f(t)^n$, where $f(t)$ is a monic irreducible polynomial. Then $V$ is the direct sum of $T$-cyclic subspaces $Z_i = Z(v_i, T)$, $i = 1, \ldots, r$, with corresponding $T$-annihilators
\[ f(t)^{n_1}, f(t)^{n_2}, \ldots, f(t)^{n_r}, \quad n = n_1 \ge n_2 \ge \cdots \ge n_r \]
Any other decomposition of $V$ into the direct sum of $T$-cyclic subspaces has the same number of components and the same set of $T$-annihilators.
The proof is by induction on the dimension of $V$. If $\dim V = 1$, then $V$ is $T$-cyclic and the lemma holds. Now suppose $\dim V > 1$ and that the lemma holds for those vector spaces of dimension less than that of $V$.
Because the minimal polynomial of $T$ is $f(t)^n$, there exists $v_1 \in V$ such that $f(T)^{n-1}(v_1) \ne 0$; hence, the $T$-annihilator of $v_1$ is $f(t)^n$. Let $Z_1 = Z(v_1, T)$ and recall that $Z_1$ is $T$-invariant. Let $\bar{V} = V/Z_1$ and let $\bar{T}$ be the linear operator on $\bar{V}$ induced by $T$. By Theorem 10.16, the minimal polynomial of $\bar{T}$ divides $f(t)^n$; hence, the hypothesis holds for $\bar{V}$ and $\bar{T}$. Consequently, by induction, $\bar{V}$ is the direct sum of $\bar{T}$-cyclic subspaces; say,
\[ \bar{V} = Z(\bar{v}_2, \bar{T}) \oplus \cdots \oplus Z(\bar{v}_r, \bar{T}) \]
where the corresponding $\bar{T}$-annihilators are $f(t)^{n_2}, \ldots, f(t)^{n_r}$, $n \ge n_2 \ge \cdots \ge n_r$.
We claim that there is a vector $v_2$ in the coset $\bar{v}_2$ whose $T$-annihilator is $f(t)^{n_2}$, the $\bar{T}$-annihilator of $\bar{v}_2$. Let $w$ be any vector in $\bar{v}_2$. Then $f(T)^{n_2}(w) \in Z_1$. Hence, there exists a polynomial $g(t)$ for which
\[ f(T)^{n_2}(w) = g(T)(v_1) \qquad (1) \]
Because $f(t)^n$ is the minimal polynomial of $T$, we have, by (1),
\[ 0 = f(T)^n(w) = f(T)^{n - n_2} g(T)(v_1) \]
But $f(t)^n$ is the $T$-annihilator of $v_1$; hence, $f(t)^n$ divides $f(t)^{n - n_2} g(t)$, and so $g(t) = f(t)^{n_2} h(t)$ for some polynomial $h(t)$. We set
\[ v_2 = w - h(T)(v_1) \]
Because $w - v_2 = h(T)(v_1) \in Z_1$, $v_2$ also belongs to the coset $\bar{v}_2$. Thus, the $T$-annihilator of $v_2$ is a multiple of the $\bar{T}$-annihilator of $\bar{v}_2$. On the other hand, by (1),
\[ f(T)^{n_2}(v_2) = f(T)^{n_2}(w - h(T)(v_1)) = f(T)^{n_2}(w) - g(T)(v_1) = 0 \]
Consequently, the $T$-annihilator of $v_2$ is $f(t)^{n_2}$, as claimed.
Similarly, there exist vectors $v_3, \ldots, v_r \in V$ such that $v_i \in \bar{v}_i$ and that the $T$-annihilator of $v_i$ is $f(t)^{n_i}$, the $\bar{T}$-annihilator of $\bar{v}_i$. We set
\[ Z_2 = Z(v_2, T), \; \ldots, \; Z_r = Z(v_r, T) \]
Let $d$ denote the degree of $f(t)$, so that $f(t)^{n_i}$ has degree $dn_i$. Then, because $f(t)^{n_i}$ is both the $T$-annihilator of $v_i$ and the $\bar{T}$-annihilator of $\bar{v}_i$, we know that
\[ \{v_i, T(v_i), \ldots, T^{dn_i - 1}(v_i)\} \quad \text{and} \quad \{\bar{v}_i, \bar{T}(\bar{v}_i), \ldots, \bar{T}^{dn_i - 1}(\bar{v}_i)\} \]
are bases for $Z(v_i, T)$ and $Z(\bar{v}_i, \bar{T})$, respectively, for $i = 2, \ldots, r$. But $\bar{V} = Z(\bar{v}_2, \bar{T}) \oplus \cdots \oplus Z(\bar{v}_r, \bar{T})$; hence,
\[ \{\bar{v}_2, \ldots, \bar{T}^{dn_2 - 1}(\bar{v}_2), \; \ldots, \; \bar{v}_r, \ldots, \bar{T}^{dn_r - 1}(\bar{v}_r)\} \]
is a basis for $\bar{V}$. Therefore, by Problem 10.26 and the relation $\bar{T}^i(\bar{v}) = \overline{T^i(v)}$ (see Problem 10.27),
\[ \{v_1, \ldots, T^{dn_1 - 1}(v_1), \; v_2, \ldots, T^{dn_2 - 1}(v_2), \; \ldots, \; v_r, \ldots, T^{dn_r - 1}(v_r)\} \]
is a basis for $V$. Thus, by Theorem 10.4, $V = Z(v_1, T) \oplus \cdots \oplus Z(v_r, T)$, as required.
It remains to show that the exponents $n_1, \ldots, n_r$ are uniquely determined by $T$. Because $d$ is the degree of $f(t)$,
\[ \dim V = d(n_1 + \cdots + n_r) \quad \text{and} \quad \dim Z_i = dn_i, \quad i = 1, \ldots, r \]
Also, if $s$ is any positive integer, then (Problem 10.59) $f(T)^s(Z_i)$ is a cyclic subspace generated by $f(T)^s(v_i)$, and it has dimension $d(n_i - s)$ if $n_i > s$ and dimension 0 if $n_i \le s$.
Now any vector $v \in V$ can be written uniquely in the form $v = w_1 + \cdots + w_r$, where $w_i \in Z_i$. Hence, any vector in $f(T)^s(V)$ can be written uniquely in the form
\[ f(T)^s(v) = f(T)^s(w_1) + \cdots + f(T)^s(w_r) \]
where $f(T)^s(w_i) \in f(T)^s(Z_i)$. Let $t$ be the integer, dependent on $s$, for which
\[ n_1 > s, \; \ldots, \; n_t > s, \; n_{t+1} \le s \]
Then $f(T)^s(V) = f(T)^s(Z_1) \oplus \cdots \oplus f(T)^s(Z_t)$, and so
\[ \dim[f(T)^s(V)] = d[(n_1 - s) + \cdots + (n_t - s)] \qquad (2) \]
The numbers on the left of (2) are uniquely determined by $T$. Set $s = n - 1$, and (2) determines the number of $n_i$ equal to $n$. Next set $s = n - 2$, and (2) determines the number of $n_i$ (if any) equal to $n - 1$. We repeat the process until we set $s = 0$ and determine the number of $n_i$ equal to 1. Thus, the $n_i$ are uniquely determined by $T$ and $V$, and the lemma is proved.

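10.32. Let $V$ be a seven-dimensional vector space over $\mathbf{R}$, and let $T: V \to V$ be a linear operator with minimal polynomial $m(t) = (t^2 - 2t + 5)(t - 3)^3$. Find all possible rational canonical forms $M$ of $T$.
Because $\dim V = 7$, there are only two possible characteristic polynomials, $\Delta_1(t) = (t^2 - 2t + 5)^2 (t - 3)^3$ or $\Delta_2(t) = (t^2 - 2t + 5)(t - 3)^5$. Moreover, the orders of the companion matrices must add up to 7. Also, one companion matrix must be $C(t^2 - 2t + 5)$ and one must be $C((t - 3)^3) = C(t^3 - 9t^2 + 27t - 27)$. Thus, $M$ must be one of the following block diagonal matrices:
(a) $\operatorname{diag}\left( \begin{bmatrix} 0 & -5 \\ 1 & 2 \end{bmatrix}, \begin{bmatrix} 0 & -5 \\ 1 & 2 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 27 \\ 1 & 0 & -27 \\ 0 & 1 & 9 \end{bmatrix} \right)$,
(b) $\operatorname{diag}\left( \begin{bmatrix} 0 & -5 \\ 1 & 2 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 27 \\ 1 & 0 & -27 \\ 0 & 1 & 9 \end{bmatrix}, \begin{bmatrix} 0 & -9 \\ 1 & 6 \end{bmatrix} \right)$,
(c) $\operatorname{diag}\left( \begin{bmatrix} 0 & -5 \\ 1 & 2 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 27 \\ 1 & 0 & -27 \\ 0 & 1 & 9 \end{bmatrix}, [3], [3] \right)$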
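The companion matrices used above follow the convention of Theorem 10.12: 1's below the diagonal and the negatives of the coefficients $a_0, \ldots, a_{k-1}$ in the last column. The following sketch builds such a matrix from a coefficient list and checks that the polynomial annihilates its own companion matrix. The function names `companion`, `matmul`, and `evaluate`, and the choice to pass coefficients as `[a_0, ..., a_{k-1}]`, are assumptions of this example.

```python
def companion(coeffs):
    """Companion matrix of t^k + a_{k-1} t^{k-1} + ... + a_0.

    coeffs lists [a_0, a_1, ..., a_{k-1}]; the leading coefficient 1 is implied.
    """
    k = len(coeffs)
    c = [[0] * k for _ in range(k)]
    for i in range(1, k):
        c[i][i - 1] = 1          # 1's just below the diagonal
    for i in range(k):
        c[i][k - 1] = -coeffs[i]  # negated coefficients in the last column
    return c


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def evaluate(coeffs, m):
    """Evaluate the monic polynomial at the square matrix m (Horner's rule)."""
    k = len(m)
    identity = [[int(i == j) for j in range(k)] for i in range(k)]
    result = identity  # accounts for the leading coefficient 1
    for a in reversed(coeffs):
        result = matmul(result, m)
        result = [[result[i][j] + a * identity[i][j] for j in range(k)]
                  for i in range(k)]
    return result


quadratic = [5, -2]        # t^2 - 2t + 5
cubic = [-27, 27, -9]      # (t - 3)^3 = t^3 - 9t^2 + 27t - 27

print(companion(quadratic))
print(companion(cubic))
print(evaluate(cubic, companion(cubic)))
```

The two matrices printed first agree with the blocks $C(t^2 - 2t + 5)$ and $C((t - 3)^3)$ in the forms above, and the final evaluation gives the zero matrix.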

Projections
10.33. Suppose $V = W_1 \oplus \cdots \oplus W_r$. The projection of $V$ into its subspace $W_k$ is the mapping $E: V \to V$ defined by $E(v) = w_k$, where $v = w_1 + \cdots + w_r$, $w_i \in W_i$. Show that (a) $E$ is linear, (b) $E^2 = E$.
(a) Because the sum $v = w_1 + \cdots + w_r$, $w_i \in W_i$, is uniquely determined by $v$, the mapping $E$ is well defined. Suppose, for $u \in V$, $u = w'_1 + \cdots + w'_r$, $w'_i \in W_i$. Then
\[ v + u = (w_1 + w'_1) + \cdots + (w_r + w'_r) \quad \text{and} \quad kv = kw_1 + \cdots + kw_r, \quad kw_i, \; w_i + w'_i \in W_i \]
are the unique sums corresponding to $v + u$ and $kv$. Hence,
\[ E(v + u) = w_k + w'_k = E(v) + E(u) \quad \text{and} \quad E(kv) = kw_k = kE(v) \]
and therefore $E$ is linear.
(b) We have that
\[ w_k = 0 + \cdots + 0 + w_k + 0 + \cdots + 0 \]
is the unique sum corresponding to $w_k \in W_k$; hence, $E(w_k) = w_k$. Then, for any $v \in V$,
\[ E^2(v) = E(E(v)) = E(w_k) = w_k = E(v) \]
Thus, $E^2 = E$, as required.

10.34. Suppose $E: V \to V$ is linear and $E^2 = E$. Show that (a) $E(u) = u$ for any $u \in \operatorname{Im} E$ (i.e., the restriction of $E$ to its image is the identity mapping); (b) $V$ is the direct sum of the image and kernel of $E$: $V = \operatorname{Im} E \oplus \operatorname{Ker} E$; (c) $E$ is the projection of $V$ into $\operatorname{Im} E$, its image. Thus, by the preceding problem, a linear mapping $T: V \to V$ is a projection if and only if $T^2 = T$; this characterization of a projection is frequently used as its definition.
(a) If $u \in \operatorname{Im} E$, then there exists $v \in V$ for which $E(v) = u$; hence, as required,
\[ E(u) = E(E(v)) = E^2(v) = E(v) = u \]
(b) Let $v \in V$. We can write $v$ in the form $v = E(v) + (v - E(v))$. Now $E(v) \in \operatorname{Im} E$ and, because
\[ E(v - E(v)) = E(v) - E^2(v) = E(v) - E(v) = 0 \]
$v - E(v) \in \operatorname{Ker} E$. Accordingly, $V = \operatorname{Im} E + \operatorname{Ker} E$.
Now suppose $w \in \operatorname{Im} E \cap \operatorname{Ker} E$. By (a), $E(w) = w$ because $w \in \operatorname{Im} E$. On the other hand, $E(w) = 0$ because $w \in \operatorname{Ker} E$. Thus, $w = 0$, and so $\operatorname{Im} E \cap \operatorname{Ker} E = \{0\}$. These two conditions imply that $V$ is the direct sum of the image and kernel of $E$.
(c) Let $v \in V$ and suppose $v = u + w$, where $u \in \operatorname{Im} E$ and $w \in \operatorname{Ker} E$. Note that $E(u) = u$ by (a), and $E(w) = 0$ because $w \in \operatorname{Ker} E$. Hence,
\[ E(v) = E(u + w) = E(u) + E(w) = u + 0 = u \]
That is, $E$ is the projection of $V$ into its image.
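A concrete instance may help fix the ideas of Problem 10.34. The sketch below uses a hypothetical operator on $\mathbf{R}^2$, $E(x, y) = (y, y)$, chosen only for illustration: it checks that $E^2 = E$ and splits a vector into its image part $E(v)$ and kernel part $v - E(v)$. The helper names `matmul`, `apply`, and `decompose` are assumptions of this example.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def apply(m, v):
    """Apply the matrix m to the vector v."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]


def decompose(e, v):
    """Split v as (image part, kernel part) = (E(v), v - E(v))."""
    image_part = apply(e, v)
    kernel_part = [x - y for x, y in zip(v, image_part)]
    return image_part, kernel_part


# Hypothetical example: E(x, y) = (y, y), with image span{(1, 1)}
# and kernel span{(1, 0)}.
E = [[0, 1],
     [0, 1]]

print(matmul(E, E) == E)            # E is idempotent
u, w = decompose(E, [3, 5])
print(u, w)                         # image part and kernel part of (3, 5)
print(apply(E, u), apply(E, w))     # E fixes u and annihilates w
```

For $v = (3, 5)$ the decomposition is $(5, 5) + (-2, 0)$, with the first summand fixed by $E$ and the second sent to 0, as parts (a) and (b) predict.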

10.35. Suppose $V = U \oplus W$ and suppose $T: V \to V$ is linear. Show that $U$ and $W$ are both $T$-invariant if and only if $TE = ET$, where $E$ is the projection of $V$ into $U$.
Observe that $E(v) \in U$ for every $v \in V$, and that (i) $E(v) = v$ iff $v \in U$, (ii) $E(v) = 0$ iff $v \in W$. Suppose $ET = TE$. Let $u \in U$. Because $E(u) = u$,
\[ T(u) = T(E(u)) = (TE)(u) = (ET)(u) = E(T(u)) \in U \]
Hence, $U$ is $T$-invariant. Now let $w \in W$. Because $E(w) = 0$,
\[ E(T(w)) = (ET)(w) = (TE)(w) = T(E(w)) = T(0) = 0, \quad \text{and so} \quad T(w) \in W \]
Hence, $W$ is also $T$-invariant.
Conversely, suppose $U$ and $W$ are both $T$-invariant. Let $v \in V$ and suppose $v = u + w$, where $u \in U$ and $w \in W$. Then $T(u) \in U$ and $T(w) \in W$; hence, $E(T(u)) = T(u)$ and $E(T(w)) = 0$. Thus,
\[ (ET)(v) = (ET)(u + w) = (ET)(u) + (ET)(w) = E(T(u)) + E(T(w)) = T(u) \]
and
\[ (TE)(v) = (TE)(u + w) = T(E(u + w)) = T(u) \]
That is, $(ET)(v) = (TE)(v)$ for every $v \in V$; therefore, $ET = TE$, as required.

SUPPLEMENTARY PROBLEMS

Invariant Subspaces
10.36. Suppose $W$ is invariant under $T: V \to V$. Show that $W$ is invariant under $f(T)$ for any polynomial $f(t)$.
10.37. Show that every subspace of $V$ is invariant under $I$ and $0$, the identity and zero operators.
10.38. Let $W$ be invariant under $T_1: V \to V$ and $T_2: V \to V$. Prove $W$ is also invariant under $T_1 + T_2$ and $T_1 T_2$.
10.39. Let $T: V \to V$ be linear. Prove that any eigenspace $E_\lambda$ is $T$-invariant.
10.40. Let $V$ be a vector space of odd dimension (greater than 1) over the real field $\mathbf{R}$. Show that any linear operator on $V$ has an invariant subspace other than $V$ or $\{0\}$.
10.41. Determine the invariant subspaces of $A = \begin{bmatrix} 2 & -4 \\ 5 & -2 \end{bmatrix}$ viewed as a linear operator on (a) $\mathbf{R}^2$, (b) $\mathbf{C}^2$.
10.42. Suppose $\dim V = n$. Show that $T: V \to V$ has a triangular matrix representation if and only if there exist $T$-invariant subspaces $W_1 \subset W_2 \subset \cdots \subset W_n = V$ for which $\dim W_k = k$, $k = 1, \ldots, n$.

Invariant Direct Sums
10.43. The subspaces $W_1, \ldots, W_r$ are said to be independent if $w_1 + \cdots + w_r = 0$, $w_i \in W_i$, implies that each $w_i = 0$. Show that $\operatorname{span}(W_i) = W_1 \oplus \cdots \oplus W_r$ if and only if the $W_i$ are independent. [Here $\operatorname{span}(W_i)$ denotes the linear span of the $W_i$.]
10.44. Show that $V = W_1 \oplus \cdots \oplus W_r$ if and only if (i) $V = \operatorname{span}(W_i)$ and (ii) for $k = 1, 2, \ldots, r$, $W_k \cap \operatorname{span}(W_1, \ldots, W_{k-1}, W_{k+1}, \ldots, W_r) = \{0\}$.
10.45. Show that $\operatorname{span}(W_i) = W_1 \oplus \cdots \oplus W_r$ if and only if $\dim[\operatorname{span}(W_i)] = \dim W_1 + \cdots + \dim W_r$.
10.46. Suppose the characteristic polynomial of $T: V \to V$ is $\Delta(t) = f_1(t)^{n_1} f_2(t)^{n_2} \cdots f_r(t)^{n_r}$, where the $f_i(t)$ are distinct monic irreducible polynomials. Let $V = W_1 \oplus \cdots \oplus W_r$ be the primary decomposition of $V$ into $T$-invariant subspaces. Show that $f_i(t)^{n_i}$ is the characteristic polynomial of the restriction of $T$ to $W_i$.

Nilpotent Operators
10.47. Suppose $T_1$ and $T_2$ are nilpotent operators that commute (i.e., $T_1 T_2 = T_2 T_1$). Show that $T_1 + T_2$ and $T_1 T_2$ are also nilpotent.
10.48. Suppose $A$ is a supertriangular matrix (i.e., all entries on and below the main diagonal are 0). Show that $A$ is nilpotent.
10.49. Let $V$ be the vector space of polynomials of degree $\le n$. Show that the derivative operator on $V$ is nilpotent of index $n + 1$.
10.50. Show that any Jordan nilpotent block matrix $N$ is similar to its transpose $N^T$ (the matrix with 1's below the diagonal and 0's elsewhere).
10.51. Show that two nilpotent matrices of order 3 are similar if and only if they have the same index of nilpotency. Show by example that the statement is not true for nilpotent matrices of order 4.

Jordan Canonical Form
10.52. Find all possible Jordan canonical forms for those matrices whose characteristic polynomial $\Delta(t)$ and minimal polynomial $m(t)$ are as follows:
(a) $\Delta(t) = (t - 2)^4 (t - 3)^2$, $m(t) = (t - 2)^2 (t - 3)^2$,
(b) $\Delta(t) = (t - 7)^5$, $m(t) = (t - 7)^2$,
(c) $\Delta(t) = (t - 2)^7$, $m(t) = (t - 2)^3$
10.53. Show that every complex matrix is similar to its transpose. (Hint: Use its Jordan canonical form.)
10.54. Show that all $n \times n$ complex matrices $A$ for which $A^n = I$ but $A^k \ne I$ for $k < n$ are similar.
10.55. Suppose $A$ is a complex matrix with only real eigenvalues. Show that $A$ is similar to a matrix with only real entries.

Cyclic Subspaces
10.56. Suppose $T: V \to V$ is linear. Prove that $Z(v, T)$ is the intersection of all $T$-invariant subspaces containing $v$.
10.57. Let $f(t)$ and $g(t)$ be the $T$-annihilators of $u$ and $v$, respectively. Show that if $f(t)$ and $g(t)$ are relatively prime, then $f(t)g(t)$ is the $T$-annihilator of $u + v$.
10.58. Prove that $Z(u, T) = Z(v, T)$ if and only if $g(T)(u) = v$ where $g(t)$ is relatively prime to the $T$-annihilator of $u$.
10.59. Let $W = Z(v, T)$, and suppose the $T$-annihilator of $v$ is $f(t)^n$, where $f(t)$ is a monic irreducible polynomial of degree $d$. Show that $f(T)^s(W)$ is a cyclic subspace generated by $f(T)^s(v)$ and that it has dimension $d(n - s)$ if $n > s$ and dimension 0 if $n \le s$.

Rational Canonical Form
10.60. Find all possible rational forms for a $6 \times 6$ matrix over $\mathbf{R}$ with minimal polynomial: (a) $m(t) = (t^2 - 2t + 3)(t + 1)^2$, (b) $m(t) = (t - 2)^3$.
10.61. Let $A$ be a $4 \times 4$ matrix with minimal polynomial $m(t) = (t^2 + 1)(t^2 - 3)$. Find the rational canonical form for $A$ if $A$ is a matrix over (a) the rational field $\mathbf{Q}$, (b) the real field $\mathbf{R}$, (c) the complex field $\mathbf{C}$.
10.62. Find the rational canonical form for the four-square Jordan block with $\lambda$'s on the diagonal.
10.63. Prove that the characteristic polynomial of an operator $T: V \to V$ is a product of its elementary divisors.
10.64. Prove that two $3 \times 3$ matrices with the same minimal and characteristic polynomials are similar.
10.65. Let $C(f(t))$ denote the companion matrix to an arbitrary polynomial $f(t)$. Show that $f(t)$ is the characteristic polynomial of $C(f(t))$.

Projections
10.66. Suppose $V = W_1 \oplus \cdots \oplus W_r$. Let $E_i$ denote the projection of $V$ into $W_i$. Prove (i) $E_i E_j = 0$, $i \ne j$; (ii) $I = E_1 + \cdots + E_r$.
10.67. Let $E_1, \ldots, E_r$ be linear operators on $V$ such that (i) $E_i^2 = E_i$ (i.e., the $E_i$ are projections); (ii) $E_i E_j = 0$, $i \ne j$; (iii) $I = E_1 + \cdots + E_r$. Prove that $V = \operatorname{Im} E_1 \oplus \cdots \oplus \operatorname{Im} E_r$.
10.68. Suppose $E: V \to V$ is a projection (i.e., $E^2 = E$). Prove that $E$ has a matrix representation of the form $\begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}$, where $r$ is the rank of $E$ and $I_r$ is the $r$-square identity matrix.
10.69. Prove that any two projections of the same rank are similar. (Hint: Use the result of Problem 10.68.)
10.70. Suppose $E: V \to V$ is a projection. Prove (i) $I - E$ is a projection and $V = \operatorname{Im} E \oplus \operatorname{Im}(I - E)$, (ii) $I + E$ is invertible (if $1 + 1 \ne 0$).

Quotient Spaces
10.71. Let $W$ be a subspace of $V$. Suppose the set of cosets $\{v_1 + W, v_2 + W, \ldots, v_n + W\}$ in $V/W$ is linearly independent. Show that the set of vectors $\{v_1, v_2, \ldots, v_n\}$ in $V$ is also linearly independent.
10.72. Let $W$ be a subspace of $V$. Suppose the set of vectors $\{u_1, u_2, \ldots, u_n\}$ in $V$ is linearly independent, and that $L(u_i) \cap W = \{0\}$. Show that the set of cosets $\{u_1 + W, \ldots, u_n + W\}$ in $V/W$ is also linearly independent.
10.73. Suppose $V = U \oplus W$ and that $\{u_1, \ldots, u_n\}$ is a basis of $U$. Show that $\{u_1 + W, \ldots, u_n + W\}$ is a basis of the quotient space $V/W$. (Observe that no condition is placed on the dimensionality of $V$ or $W$.)
10.74. Let $W$ be the solution space of the linear equation
\[ a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = 0, \quad a_i \in K \]
and let $v = (b_1, b_2, \ldots, b_n) \in K^n$. Prove that the coset $v + W$ of $W$ in $K^n$ is the solution set of the linear equation
\[ a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = b, \quad \text{where} \quad b = a_1 b_1 + \cdots + a_n b_n \]
10.75. Let $V$ be the vector space of polynomials over $\mathbf{R}$ and let $W$ be the subspace of polynomials divisible by $t^4$ (i.e., of the form $a_0 t^4 + a_1 t^5 + \cdots + a_{n-4} t^n$). Show that the quotient space $V/W$ has dimension 4.
10.76. Let $U$ and $W$ be subspaces of $V$ such that $W \subset U \subset V$. Note that any coset $u + W$ of $W$ in $U$ may also be viewed as a coset of $W$ in $V$, because $u \in U$ implies $u \in V$; hence, $U/W$ is a subset of $V/W$. Prove that (i) $U/W$ is a subspace of $V/W$, (ii) $\dim(V/W) - \dim(U/W) = \dim(V/U)$.
10.77. Let $U$ and $W$ be subspaces of $V$. Show that the cosets of $U \cap W$ in $V$ can be obtained by intersecting each of the cosets of $U$ in $V$ by each of the cosets of $W$ in $V$:
\[ V/(U \cap W) = \{(v + U) \cap (v' + W) : v, v' \in V\} \]
10.78. Let $T: V \to V'$ be linear with kernel $W$ and image $U$. Show that the quotient space $V/W$ is isomorphic to $U$ under the mapping $\theta: V/W \to U$ defined by $\theta(v + W) = T(v)$. Furthermore, show that $T = i \circ \theta \circ \eta$, where $\eta: V \to V/W$ is the natural mapping of $V$ into $V/W$ (i.e., $\eta(v) = v + W$), and $i: U \hookrightarrow V'$ is the inclusion mapping (i.e., $i(u) = u$). (See diagram.)

ANSWERS TO SUPPLEMENTARY PROBLEMS
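10.41. (a) $\mathbf{R}^2$ and $\{0\}$, (b) $\mathbf{C}^2$, $\{0\}$, $W_1 = \operatorname{span}(2, 1 - 2i)$, $W_2 = \operatorname{span}(2, 1 + 2i)$

10.52. (a) $\operatorname{diag}\left( \begin{bmatrix} 2 & 1 \\ & 2 \end{bmatrix}, \begin{bmatrix} 2 & 1 \\ & 2 \end{bmatrix}, \begin{bmatrix} 3 & 1 \\ & 3 \end{bmatrix} \right)$, $\operatorname{diag}\left( \begin{bmatrix} 2 & 1 \\ & 2 \end{bmatrix}, [2], [2], \begin{bmatrix} 3 & 1 \\ & 3 \end{bmatrix} \right)$;
(b) $\operatorname{diag}\left( \begin{bmatrix} 7 & 1 \\ & 7 \end{bmatrix}, \begin{bmatrix} 7 & 1 \\ & 7 \end{bmatrix}, [7] \right)$, $\operatorname{diag}\left( \begin{bmatrix} 7 & 1 \\ & 7 \end{bmatrix}, [7], [7], [7] \right)$;
(c) Let $M_k$ denote a Jordan block with $\lambda = 2$ and order $k$. Then $\operatorname{diag}(M_3, M_3, M_1)$, $\operatorname{diag}(M_3, M_2, M_2)$, $\operatorname{diag}(M_3, M_2, M_1, M_1)$, $\operatorname{diag}(M_3, M_1, M_1, M_1, M_1)$

10.60. Let $A = \begin{bmatrix} 0 & -3 \\ 1 & 2 \end{bmatrix}$, $B = \begin{bmatrix} 0 & -1 \\ 1 & -2 \end{bmatrix}$, $C = \begin{bmatrix} 0 & 0 & 8 \\ 1 & 0 & -12 \\ 0 & 1 & 6 \end{bmatrix}$, $D = \begin{bmatrix} 0 & -4 \\ 1 & 4 \end{bmatrix}$.
(a) $\operatorname{diag}(A, A, B)$, $\operatorname{diag}(A, B, B)$, $\operatorname{diag}(A, B, -1, -1)$; (b) $\operatorname{diag}(C, C)$, $\operatorname{diag}(C, D, 2)$, $\operatorname{diag}(C, 2, 2, 2)$

10.61. Let $A = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}$, $B = \begin{bmatrix} 0 & 3 \\ 1 & 0 \end{bmatrix}$.
(a) $\operatorname{diag}(A, B)$, (b) $\operatorname{diag}(A, \sqrt{3}, -\sqrt{3})$, (c) $\operatorname{diag}(i, -i, \sqrt{3}, -\sqrt{3})$

10.62. Companion matrix with the last column $[-\lambda^4, 4\lambda^3, -6\lambda^2, 4\lambda]^T$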

CHAPTER 11

Linear Functionals and the Dual Space
11.1 Introduction
In this chapter, we study linear mappings from a vector space V into its field K of scalars. (Unless otherwise stated or implied, we view K as a vector space over itself.) Naturally all the theorems and results for arbitrary mappings on V hold for this special case. However, we treat these mappings separately because of their fundamental importance and because the special relationship of V to K gives rise to new notions and results that do not apply in the general case.

11.2 Linear Functionals and the Dual Space
Let $V$ be a vector space over a field $K$. A mapping $\phi: V \to K$ is termed a linear functional (or linear form) if, for every $u, v \in V$ and every $a, b \in K$,
\[ \phi(au + bv) = a\phi(u) + b\phi(v) \]
In other words, a linear functional on $V$ is a linear mapping from $V$ into $K$.
EXAMPLE 11.1

(a) Let $\pi_i: K^n \to K$ be the $i$th projection mapping; that is, $\pi_i(a_1, a_2, \ldots, a_n) = a_i$. Then $\pi_i$ is linear and so it is a linear functional on $K^n$.
(b) Let $V$ be the vector space of polynomials in $t$ over $\mathbf{R}$. Let $J: V \to \mathbf{R}$ be the integral operator defined by $J(p(t)) = \int_0^1 p(t)\,dt$. Recall that $J$ is linear; hence, it is a linear functional on $V$.
(c) Let $V$ be the vector space of $n$-square matrices over $K$. Let $T: V \to K$ be the trace mapping
\[ T(A) = a_{11} + a_{22} + \cdots + a_{nn}, \quad \text{where} \quad A = [a_{ij}] \]
That is, $T$ assigns to a matrix $A$ the sum of its diagonal elements. This map is linear (Problem 11.24), and so it is a linear functional on $V$.

By Theorem 5.10, the set of linear functionals on a vector space $V$ over a field $K$ is also a vector space over $K$, with addition and scalar multiplication defined by
\[ (\phi + \sigma)(v) = \phi(v) + \sigma(v) \quad \text{and} \quad (k\phi)(v) = k\phi(v) \]
where $\phi$ and $\sigma$ are linear functionals on $V$ and $k \in K$. This space is called the dual space of $V$ and is denoted by $V^*$.
EXAMPLE 11.2 Let $V = K^n$, the vector space of $n$-tuples, which we write as column vectors. Then the dual space $V^*$ can be identified with the space of row vectors. In particular, any linear functional $\phi = (a_1, \ldots, a_n)$ in $V^*$ has the representation
\[ \phi(x_1, x_2, \ldots, x_n) = [a_1, a_2, \ldots, a_n][x_1, x_2, \ldots, x_n]^T = a_1 x_1 + a_2 x_2 + \cdots + a_n x_n \]
Historically, the formal expression on the right was termed a linear form.

11.3 Dual Basis

Suppose V is a vector space of dimension n over K. By Theorem 5.11, the dimension of the dual space $V^*$ is also n (because K is of dimension 1 over itself). In fact, each basis of V determines a basis of $V^*$ as follows (see Problem 11.3 for the proof).
THEOREM 11.1: Suppose $\{v_1, \ldots, v_n\}$ is a basis of V over K. Let $\phi_1, \ldots, \phi_n \in V^*$ be the linear functionals defined by

$$\phi_i(v_j) = \delta_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases}$$

Then $\{\phi_1, \ldots, \phi_n\}$ is a basis of $V^*$.

The above basis $\{\phi_i\}$ is termed the basis dual to $\{v_i\}$ or the dual basis. The above formula, which uses the Kronecker delta $\delta_{ij}$, is a short way of writing

$$\begin{aligned}
\phi_1(v_1) &= 1, & \phi_1(v_2) &= 0, & \phi_1(v_3) &= 0, & \ldots, & & \phi_1(v_n) &= 0 \\
\phi_2(v_1) &= 0, & \phi_2(v_2) &= 1, & \phi_2(v_3) &= 0, & \ldots, & & \phi_2(v_n) &= 0 \\
& \vdots \\
\phi_n(v_1) &= 0, & \phi_n(v_2) &= 0, & \ldots, & & \phi_n(v_{n-1}) &= 0, & \phi_n(v_n) &= 1
\end{aligned}$$

By Theorem 5.2, these linear mappings $\phi_i$ are unique and well defined.
EXAMPLE 11.3 Consider the basis $\{v_1 = (2, 1),\ v_2 = (3, 1)\}$ of $\mathbf{R}^2$. Find the dual basis $\{\phi_1, \phi_2\}$.

We seek linear functionals $\phi_1(x, y) = ax + by$ and $\phi_2(x, y) = cx + dy$ such that

$$\phi_1(v_1) = 1, \quad \phi_1(v_2) = 0, \quad \phi_2(v_1) = 0, \quad \phi_2(v_2) = 1$$

These four conditions lead to the following two systems of linear equations:

$$\begin{aligned} \phi_1(v_1) &= \phi_1(2, 1) = 2a + b = 1 \\ \phi_1(v_2) &= \phi_1(3, 1) = 3a + b = 0 \end{aligned} \quad \text{and} \quad \begin{aligned} \phi_2(v_1) &= \phi_2(2, 1) = 2c + d = 0 \\ \phi_2(v_2) &= \phi_2(3, 1) = 3c + d = 1 \end{aligned}$$

The solutions yield $a = -1$, $b = 3$ and $c = 1$, $d = -2$. Hence, $\phi_1(x, y) = -x + 3y$ and $\phi_2(x, y) = x - 2y$ form the dual basis.
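The computation in Example 11.3 can also be organized as a matrix inversion: if the basis vectors are placed as the columns of a matrix B, then row i of $B^{-1}$ holds the coefficients of $\phi_i$, because $\phi_i(v_j) = \delta_{ij}$ says exactly that the coefficient rows times B give the identity. The following is a minimal sketch under that observation; the helper names `invert` and `dual_basis` are illustrative and do not come from the text.

```python
from fractions import Fraction


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination over exact fractions.

    Assumes the matrix is nonsingular (a basis always gives one); a singular
    input makes the pivot search raise StopIteration.
    """
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it up.
        pivot_row = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot_row] = aug[pivot_row], aug[col]
        pivot = aug[col][col]
        aug[col] = [x / pivot for x in aug[col]]
        # Clear the rest of the column.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def dual_basis(basis):
    """Return the coefficient rows of the functionals dual to `basis`."""
    n = len(basis)
    # Column j of B is the basis vector v_j.
    B = [[basis[j][i] for j in range(n)] for i in range(n)]
    return invert(B)


phis = dual_basis([(2, 1), (3, 1)])
print(phis)  # coefficient rows of phi_1 and phi_2
```

Each returned row lists the coefficients of one dual functional in the variables x, y (or x, y, z in dimension 3), so the same function can be used to check the answer to Problem 11.1.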

The next two theorems (proved in Problems 11.4 and 11.5, respectively) give relationships between bases and their duals.
THEOREM 11.2: Let $\{v_1, \ldots, v_n\}$ be a basis of V and let $\{\phi_1, \ldots, \phi_n\}$ be the dual basis in $V^*$. Then
(i) For any vector $u \in V$, $u = \phi_1(u)v_1 + \phi_2(u)v_2 + \cdots + \phi_n(u)v_n$.
(ii) For any linear functional $\sigma \in V^*$, $\sigma = \sigma(v_1)\phi_1 + \sigma(v_2)\phi_2 + \cdots + \sigma(v_n)\phi_n$.
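As a quick numerical illustration of Theorem 11.2, the sketch below reuses the basis and dual basis of Example 11.3. The functional `sigma` and the test vector are hypothetical choices made only for this illustration.

```python
# Basis of R^2 and its dual basis, taken from Example 11.3.
v1, v2 = (2, 1), (3, 1)


def phi1(u):
    return -u[0] + 3 * u[1]


def phi2(u):
    return u[0] - 2 * u[1]


def expand_vector(u):
    """Rebuild u as phi1(u) v1 + phi2(u) v2, as in Theorem 11.2(i)."""
    c1, c2 = phi1(u), phi2(u)
    return (c1 * v1[0] + c2 * v2[0], c1 * v1[1] + c2 * v2[1])


def expand_functional(sigma):
    """Rebuild sigma as sigma(v1) phi1 + sigma(v2) phi2, as in Theorem 11.2(ii)."""
    s1, s2 = sigma(v1), sigma(v2)
    return lambda u: s1 * phi1(u) + s2 * phi2(u)


def sigma(u):
    # Hypothetical functional, chosen only for illustration.
    return 4 * u[0] - 5 * u[1]


u = (5, 7)
rebuilt = expand_functional(sigma)
print(expand_vector(u), sigma(u), rebuilt(u))
```

The values $\phi_i(u)$ are the coordinates of u in the basis $\{v_i\}$, and the values $\sigma(v_i)$ are the coordinates of $\sigma$ in the dual basis.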

THEOREM 11.3: Let $\{v_1, \ldots, v_n\}$ and $\{w_1, \ldots, w_n\}$ be bases of V and let $\{\phi_1, \ldots, \phi_n\}$ and $\{\sigma_1, \ldots, \sigma_n\}$ be the bases of $V^*$ dual to $\{v_i\}$ and $\{w_i\}$, respectively. Suppose P is the change-of-basis matrix from $\{v_i\}$ to $\{w_i\}$. Then $(P^{-1})^T$ is the change-of-basis matrix from $\{\phi_i\}$ to $\{\sigma_i\}$.

11.4 Second Dual Space

We repeat: Every vector space V has a dual space $V^*$, which consists of all the linear functionals on V. Thus, $V^*$ has a dual space $V^{**}$, called the second dual of V, which consists of all the linear functionals on $V^*$.
We now show that each $v \in V$ determines a specific element $\hat{v} \in V^{**}$. First, for any $\phi \in V^*$, we define

$$\hat{v}(\phi) = \phi(v)$$

It remains to be shown that this map $\hat{v}: V^* \to K$ is linear. For any scalars $a, b \in K$ and any linear functionals $\phi, \sigma \in V^*$, we have

$$\hat{v}(a\phi + b\sigma) = (a\phi + b\sigma)(v) = a\phi(v) + b\sigma(v) = a\hat{v}(\phi) + b\hat{v}(\sigma)$$

That is, $\hat{v}$ is linear and so $\hat{v} \in V^{**}$. The following theorem (proved in Problem 11.7) holds.
THEOREM 11.4: If V has finite dimension, then the mapping $v \mapsto \hat{v}$ is an isomorphism of V onto $V^{**}$.

The above mapping $v \mapsto \hat{v}$ is called the natural mapping of V into $V^{**}$. We emphasize that this mapping is never onto $V^{**}$ if V is not finite-dimensional. However, it is always linear, and moreover, it is always one-to-one.
Now suppose V does have finite dimension. By Theorem 11.4, the natural mapping determines an isomorphism between V and $V^{**}$. Unless otherwise stated, we will identify V with $V^{**}$ by this mapping. Accordingly, we will view V as the space of linear functionals on $V^*$ and write $V = V^{**}$. We remark that if $\{\phi_i\}$ is the basis of $V^*$ dual to a basis $\{v_i\}$ of V, then $\{v_i\}$ is the basis of $V^{**} = V$ that is dual to $\{\phi_i\}$.

11.5 Annihilators

Let W be a subset (not necessarily a subspace) of a vector space V. A linear functional $\phi \in V^*$ is called an annihilator of W if $\phi(w) = 0$ for every $w \in W$, that is, if $\phi(W) = \{0\}$. We show that the set of all such mappings, denoted by $W^0$ and called the annihilator of W, is a subspace of $V^*$. Clearly, $0 \in W^0$. Now suppose $\phi, \sigma \in W^0$. Then, for any scalars $a, b \in K$ and for any $w \in W$,

$$(a\phi + b\sigma)(w) = a\phi(w) + b\sigma(w) = a0 + b0 = 0$$

Thus, $a\phi + b\sigma \in W^0$, and so $W^0$ is a subspace of $V^*$.
In the case that W is a subspace of V, we have the following relationship between W and its annihilator $W^0$ (see Problem 11.11 for the proof).
THEOREM 11.5: Suppose V has finite dimension and W is a subspace of V. Then

(i) $\dim W + \dim W^0 = \dim V$ and (ii) $W^{00} = W$

Here $W^{00} = \{v \in V : \phi(v) = 0 \text{ for every } \phi \in W^0\}$ or, equivalently, $W^{00} = (W^0)^0$, where $W^{00}$ is viewed as a subspace of V under the identification of V and $V^{**}$.

11.6 Transpose of a Linear Mapping

Let $T: V \to U$ be an arbitrary linear mapping from a vector space V into a vector space U. Now for any linear functional $\phi \in U^*$, the composition $\phi \circ T$ is a linear mapping from V into K. That is, $\phi \circ T \in V^*$. Thus, the correspondence $\phi \mapsto \phi \circ T$ is a mapping from $U^*$ into $V^*$; we denote it by $T^t$ and call it the transpose of T. In other words, $T^t: U^* \to V^*$ is defined by

$$T^t(\phi) = \phi \circ T$$

Thus, $(T^t(\phi))(v) = \phi(T(v))$ for every $v \in V$.

THEOREM 11.6: The transpose mapping $T^t$ defined above is linear.

Proof. For any scalars $a, b \in K$ and any linear functionals $\phi, \sigma \in U^*$,

$$T^t(a\phi + b\sigma) = (a\phi + b\sigma) \circ T = a(\phi \circ T) + b(\sigma \circ T) = aT^t(\phi) + bT^t(\sigma)$$

That is, $T^t$ is linear, as claimed.
We emphasize that if T is a linear mapping from V into U, then $T^t$ is a linear mapping from $U^*$ into $V^*$. The name "transpose" for the mapping $T^t$ no doubt derives from the following theorem (proved in Problem 11.16).
THEOREM 11.7: Let $T: V \to U$ be linear, and let A be the matrix representation of T relative to bases $\{v_i\}$ of V and $\{u_i\}$ of U. Then the transpose matrix $A^T$ is the matrix representation of $T^t: U^* \to V^*$ relative to the bases dual to $\{u_i\}$ and $\{v_i\}$.

SOLVED PROBLEMS

Dual Spaces and Dual Bases
11.1. Find the basis $\{\phi_1, \phi_2, \phi_3\}$ that is dual to the following basis of $\mathbf{R}^3$:

$$\{v_1 = (1, -1, 3),\ v_2 = (0, 1, -1),\ v_3 = (0, 3, -2)\}$$

The linear functionals may be expressed in the form

$$\phi_1(x, y, z) = a_1x + a_2y + a_3z, \quad \phi_2(x, y, z) = b_1x + b_2y + b_3z, \quad \phi_3(x, y, z) = c_1x + c_2y + c_3z$$

By definition of the dual basis, $\phi_i(v_j) = 0$ for $i \neq j$, but $\phi_i(v_j) = 1$ for $i = j$.
We find $\phi_1$ by setting $\phi_1(v_1) = 1$, $\phi_1(v_2) = 0$, $\phi_1(v_3) = 0$. This yields

$$\phi_1(1, -1, 3) = a_1 - a_2 + 3a_3 = 1, \quad \phi_1(0, 1, -1) = a_2 - a_3 = 0, \quad \phi_1(0, 3, -2) = 3a_2 - 2a_3 = 0$$

Solving the system of equations yields $a_1 = 1$, $a_2 = 0$, $a_3 = 0$. Thus, $\phi_1(x, y, z) = x$.
We find $\phi_2$ by setting $\phi_2(v_1) = 0$, $\phi_2(v_2) = 1$, $\phi_2(v_3) = 0$. This yields

$$\phi_2(1, -1, 3) = b_1 - b_2 + 3b_3 = 0, \quad \phi_2(0, 1, -1) = b_2 - b_3 = 1, \quad \phi_2(0, 3, -2) = 3b_2 - 2b_3 = 0$$

Solving the system of equations yields $b_1 = 7$, $b_2 = -2$, $b_3 = -3$. Thus, $\phi_2(x, y, z) = 7x - 2y - 3z$.
We find $\phi_3$ by setting $\phi_3(v_1) = 0$, $\phi_3(v_2) = 0$, $\phi_3(v_3) = 1$. This yields

$$\phi_3(1, -1, 3) = c_1 - c_2 + 3c_3 = 0, \quad \phi_3(0, 1, -1) = c_2 - c_3 = 0, \quad \phi_3(0, 3, -2) = 3c_2 - 2c_3 = 1$$

Solving the system of equations yields $c_1 = -2$, $c_2 = 1$, $c_3 = 1$. Thus, $\phi_3(x, y, z) = -2x + y + z$.

11.2. Let $V = \{a + bt : a, b \in \mathbf{R}\}$, the vector space of real polynomials of degree $\le 1$. Find the basis $\{v_1, v_2\}$ of V that is dual to the basis $\{\phi_1, \phi_2\}$ of $V^*$ defined by

$$\phi_1(f(t)) = \int_0^1 f(t)\,dt \quad \text{and} \quad \phi_2(f(t)) = \int_0^2 f(t)\,dt$$

Let $v_1 = a + bt$ and $v_2 = c + dt$. By definition of the dual basis,

$$\phi_1(v_1) = 1, \quad \phi_2(v_1) = 0 \quad \text{and} \quad \phi_1(v_2) = 0, \quad \phi_2(v_2) = 1$$

Thus,

$$\begin{aligned} \phi_1(v_1) &= \int_0^1 (a + bt)\,dt = a + \tfrac{1}{2}b = 1 \\ \phi_2(v_1) &= \int_0^2 (a + bt)\,dt = 2a + 2b = 0 \end{aligned} \quad \text{and} \quad \begin{aligned} \phi_1(v_2) &= \int_0^1 (c + dt)\,dt = c + \tfrac{1}{2}d = 0 \\ \phi_2(v_2) &= \int_0^2 (c + dt)\,dt = 2c + 2d = 1 \end{aligned}$$

Solving each system yields $a = 2$, $b = -2$ and $c = -\tfrac{1}{2}$, $d = 1$. Thus, $\{v_1 = 2 - 2t,\ v_2 = -\tfrac{1}{2} + t\}$ is the basis of V that is dual to $\{\phi_1, \phi_2\}$.

11.3. Prove Theorem 11.1: Suppose $\{v_1, \ldots, v_n\}$ is a basis of V over K. Let $\phi_1, \ldots, \phi_n \in V^*$ be defined by $\phi_i(v_j) = 0$ for $i \neq j$, but $\phi_i(v_j) = 1$ for $i = j$. Then $\{\phi_1, \ldots, \phi_n\}$ is a basis of $V^*$.
We first show that $\{\phi_1, \ldots, \phi_n\}$ spans $V^*$. Let $\phi$ be an arbitrary element of $V^*$, and suppose

$$\phi(v_1) = k_1, \quad \phi(v_2) = k_2, \quad \ldots, \quad \phi(v_n) = k_n$$

Set $\sigma = k_1\phi_1 + \cdots + k_n\phi_n$. Then

$$\sigma(v_1) = (k_1\phi_1 + \cdots + k_n\phi_n)(v_1) = k_1\phi_1(v_1) + k_2\phi_2(v_1) + \cdots + k_n\phi_n(v_1) = k_1 \cdot 1 + k_2 \cdot 0 + \cdots + k_n \cdot 0 = k_1$$

Similarly, for $i = 2, \ldots, n$,

$$\sigma(v_i) = (k_1\phi_1 + \cdots + k_n\phi_n)(v_i) = k_1\phi_1(v_i) + \cdots + k_i\phi_i(v_i) + \cdots + k_n\phi_n(v_i) = k_i$$

Thus, $\phi(v_i) = \sigma(v_i)$ for $i = 1, \ldots, n$. Because $\phi$ and $\sigma$ agree on the basis vectors, $\phi = \sigma = k_1\phi_1 + \cdots + k_n\phi_n$. Accordingly, $\{\phi_1, \ldots, \phi_n\}$ spans $V^*$.
It remains to be shown that $\{\phi_1, \ldots, \phi_n\}$ is linearly independent. Suppose

$$a_1\phi_1 + a_2\phi_2 + \cdots + a_n\phi_n = 0$$

Applying both sides to $v_1$, we obtain

$$0 = 0(v_1) = (a_1\phi_1 + \cdots + a_n\phi_n)(v_1) = a_1\phi_1(v_1) + a_2\phi_2(v_1) + \cdots + a_n\phi_n(v_1) = a_1 \cdot 1 + a_2 \cdot 0 + \cdots + a_n \cdot 0 = a_1$$

Similarly, for $i = 2, \ldots, n$,

$$0 = 0(v_i) = (a_1\phi_1 + \cdots + a_n\phi_n)(v_i) = a_1\phi_1(v_i) + \cdots + a_i\phi_i(v_i) + \cdots + a_n\phi_n(v_i) = a_i$$

That is, $a_1 = 0, \ldots, a_n = 0$. Hence, $\{\phi_1, \ldots, \phi_n\}$ is linearly independent, and so it is a basis of $V^*$.

11.4. Prove Theorem 11.2: Let $\{v_1, \ldots, v_n\}$ be a basis of V and let $\{\phi_1, \ldots, \phi_n\}$ be the dual basis in $V^*$. For any $u \in V$ and any $\sigma \in V^*$, (i) $u = \sum_i \phi_i(u)v_i$. (ii) $\sigma = \sum_i \sigma(v_i)\phi_i$.
Suppose

$$u = a_1v_1 + a_2v_2 + \cdots + a_nv_n \qquad (1)$$

Then

$$\phi_1(u) = a_1\phi_1(v_1) + a_2\phi_1(v_2) + \cdots + a_n\phi_1(v_n) = a_1 \cdot 1 + a_2 \cdot 0 + \cdots + a_n \cdot 0 = a_1$$

Similarly, for $i = 2, \ldots, n$,

$$\phi_i(u) = a_1\phi_i(v_1) + \cdots + a_i\phi_i(v_i) + \cdots + a_n\phi_i(v_n) = a_i$$

That is, $\phi_1(u) = a_1$, $\phi_2(u) = a_2$, ..., $\phi_n(u) = a_n$. Substituting these results into (1), we obtain (i).
Next we prove (ii). Applying the linear functional $\sigma$ to both sides of (i),

$$\begin{aligned} \sigma(u) &= \phi_1(u)\sigma(v_1) + \phi_2(u)\sigma(v_2) + \cdots + \phi_n(u)\sigma(v_n) \\ &= \sigma(v_1)\phi_1(u) + \sigma(v_2)\phi_2(u) + \cdots + \sigma(v_n)\phi_n(u) \\ &= (\sigma(v_1)\phi_1 + \sigma(v_2)\phi_2 + \cdots + \sigma(v_n)\phi_n)(u) \end{aligned}$$

Because the above holds for every $u \in V$, $\sigma = \sigma(v_1)\phi_1 + \sigma(v_2)\phi_2 + \cdots + \sigma(v_n)\phi_n$, as claimed.

11.5. Prove Theorem 11.3: Let $\{v_i\}$ and $\{w_i\}$ be bases of V and let $\{\phi_i\}$ and $\{\sigma_i\}$ be the respective dual bases in $V^*$. Let P be the change-of-basis matrix from $\{v_i\}$ to $\{w_i\}$. Then $(P^{-1})^T$ is the change-of-basis matrix from $\{\phi_i\}$ to $\{\sigma_i\}$.
Suppose, for $i = 1, \ldots, n$,

$$w_i = a_{i1}v_1 + a_{i2}v_2 + \cdots + a_{in}v_n \quad \text{and} \quad \sigma_i = b_{i1}\phi_1 + b_{i2}\phi_2 + \cdots + b_{in}\phi_n$$

Then $P = [a_{ij}]$ and $Q = [b_{ij}]$. We seek to prove that $Q = (P^{-1})^T$.
Let $R_i$ denote the ith row of Q and let $C_j$ denote the jth column of $P^T$. Then

$$R_i = (b_{i1}, b_{i2}, \ldots, b_{in}) \quad \text{and} \quad C_j = (a_{j1}, a_{j2}, \ldots, a_{jn})^T$$

By definition of the dual basis,

$$\sigma_i(w_j) = (b_{i1}\phi_1 + b_{i2}\phi_2 + \cdots + b_{in}\phi_n)(a_{j1}v_1 + a_{j2}v_2 + \cdots + a_{jn}v_n) = b_{i1}a_{j1} + b_{i2}a_{j2} + \cdots + b_{in}a_{jn} = R_iC_j = \delta_{ij}$$

where $\delta_{ij}$ is the Kronecker delta. Thus,

$$QP^T = [R_iC_j] = [\delta_{ij}] = I$$

Therefore, $Q = (P^T)^{-1} = (P^{-1})^T$, as claimed.

11.6. Suppose $v \in V$, $v \neq 0$, and $\dim V = n$. Show that there exists $\phi \in V^*$ such that $\phi(v) \neq 0$.
We extend $\{v\}$ to a basis $\{v, v_2, \ldots, v_n\}$ of V. By Theorem 5.2, there exists a unique linear mapping $\phi: V \to K$ such that $\phi(v) = 1$ and $\phi(v_i) = 0$, $i = 2, \ldots, n$. Hence, $\phi$ has the desired property.

11.7. Prove Theorem 11.4: Suppose $\dim V = n$. Then the natural mapping $v \mapsto \hat{v}$ is an isomorphism of V onto $V^{**}$.
We first prove that the map $v \mapsto \hat{v}$ is linear, that is, for any vectors $v, w \in V$ and any scalars $a, b \in K$, $\widehat{av + bw} = a\hat{v} + b\hat{w}$. For any linear functional $\phi \in V^*$,

$$\widehat{av + bw}(\phi) = \phi(av + bw) = a\phi(v) + b\phi(w) = a\hat{v}(\phi) + b\hat{w}(\phi) = (a\hat{v} + b\hat{w})(\phi)$$

Because $\widehat{av + bw}(\phi) = (a\hat{v} + b\hat{w})(\phi)$ for every $\phi \in V^*$, we have $\widehat{av + bw} = a\hat{v} + b\hat{w}$. Thus, the map $v \mapsto \hat{v}$ is linear.
Now suppose $v \in V$, $v \neq 0$. Then, by Problem 11.6, there exists $\phi \in V^*$ for which $\phi(v) \neq 0$. Hence, $\hat{v}(\phi) = \phi(v) \neq 0$, and thus $\hat{v} \neq 0$. Because $v \neq 0$ implies $\hat{v} \neq 0$, the map $v \mapsto \hat{v}$ is nonsingular and hence an isomorphism (Theorem 5.64).
Now $\dim V = \dim V^* = \dim V^{**}$, because V has finite dimension. Accordingly, the mapping $v \mapsto \hat{v}$ is an isomorphism of V onto $V^{**}$.

Annihilators
11.8. Show that if $\phi \in V^*$ annihilates a subset S of V, then $\phi$ annihilates the linear span $\operatorname{span}(S)$ of S. Hence, $S^0 = [\operatorname{span}(S)]^0$.
Suppose $v \in \operatorname{span}(S)$. Then there exist $w_1, \ldots, w_r \in S$ for which $v = a_1w_1 + a_2w_2 + \cdots + a_rw_r$. Then

$$\phi(v) = a_1\phi(w_1) + a_2\phi(w_2) + \cdots + a_r\phi(w_r) = a_1 0 + a_2 0 + \cdots + a_r 0 = 0$$

Because v was an arbitrary element of $\operatorname{span}(S)$, $\phi$ annihilates $\operatorname{span}(S)$, as claimed.

11.9. Find a basis of the annihilator $W^0$ of the subspace W of $\mathbf{R}^4$ spanned by

$$v_1 = (1, 2, -3, 4) \quad \text{and} \quad v_2 = (0, 1, 4, -1)$$

By Problem 11.8, it suffices to find a basis of the set of linear functionals $\phi$ such that $\phi(v_1) = 0$ and $\phi(v_2) = 0$, where $\phi(x_1, x_2, x_3, x_4) = ax_1 + bx_2 + cx_3 + dx_4$. Thus,

$$\phi(1, 2, -3, 4) = a + 2b - 3c + 4d = 0 \quad \text{and} \quad \phi(0, 1, 4, -1) = b + 4c - d = 0$$

The system of two equations in the unknowns a, b, c, d is in echelon form with free variables c and d.
(1) Set $c = 1$, $d = 0$ to obtain the solution $a = 11$, $b = -4$, $c = 1$, $d = 0$.
(2) Set $c = 0$, $d = 1$ to obtain the solution $a = -6$, $b = 1$, $c = 0$, $d = 1$.
The linear functionals $\phi_1(x_i) = 11x_1 - 4x_2 + x_3$ and $\phi_2(x_i) = -6x_1 + x_2 + x_4$ form a basis of $W^0$.
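The back-substitution in Problem 11.9 is easy to check mechanically. The sketch below solves the echelon system for a and b in terms of the free variables c and d, and then confirms that each resulting functional vanishes on both spanning vectors. The function names are illustrative only.

```python
def functional_from_free(c, d):
    """Solve the echelon system of Problem 11.9 for given free variables c, d."""
    b = -4 * c + d              # from b + 4c - d = 0
    a = -2 * b + 3 * c - 4 * d  # from a + 2b - 3c + 4d = 0
    return (a, b, c, d)


def apply(phi, v):
    """Evaluate the functional with coefficient tuple phi on the vector v."""
    return sum(p * x for p, x in zip(phi, v))


v1 = (1, 2, -3, 4)
v2 = (0, 1, 4, -1)

# One basis functional per free variable, obtained by setting it to 1
# and the other free variable to 0.
phi1 = functional_from_free(1, 0)
phi2 = functional_from_free(0, 1)

for phi in (phi1, phi2):
    print(phi, apply(phi, v1), apply(phi, v2))
```

Both functionals evaluate to 0 on $v_1$ and $v_2$, and they are linearly independent because their last two coordinates are (1, 0) and (0, 1). This agrees with $\dim W^0 = 4 - 2 = 2$ from Theorem 11.5.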

11.10. Show that (a) For any subset S of V, $S \subseteq S^{00}$. (b) If $S_1 \subseteq S_2$, then $S_2^0 \subseteq S_1^0$.

(a) Let $v \in S$. Then for every linear functional $\phi \in S^0$, $\hat{v}(\phi) = \phi(v) = 0$. Hence, $\hat{v} \in (S^0)^0$. Therefore, under the identification of V and $V^{**}$, $v \in S^{00}$. Accordingly, $S \subseteq S^{00}$.
(b) Let $\phi \in S_2^0$. Then $\phi(v) = 0$ for every $v \in S_2$. But $S_1 \subseteq S_2$; hence, $\phi$ annihilates every element of $S_1$ (i.e., $\phi \in S_1^0$). Therefore, $S_2^0 \subseteq S_1^0$.

11.11. Prove Theorem 11.5: Suppose V has finite dimension and W is a subspace of V. Then (i) $\dim W + \dim W^0 = \dim V$, (ii) $W^{00} = W$.

(i) Suppose $\dim V = n$ and $\dim W = r \le n$. We want to show that $\dim W^0 = n - r$. We choose a basis $\{w_1, \ldots, w_r\}$ of W and extend it to a basis of V, say $\{w_1, \ldots, w_r, v_1, \ldots, v_{n-r}\}$. Consider the dual basis

$$\{\phi_1, \ldots, \phi_r, \sigma_1, \ldots, \sigma_{n-r}\}$$

By definition of the dual basis, each of the above $\sigma$'s annihilates each $w_i$; hence, $\sigma_1, \ldots, \sigma_{n-r} \in W^0$. We claim that $\{\sigma_j\}$ is a basis of $W^0$. Now $\{\sigma_j\}$ is part of a basis of $V^*$, and so it is linearly independent.
We next show that $\{\sigma_j\}$ spans $W^0$. Let $\sigma \in W^0$. By Theorem 11.2,

$$\begin{aligned} \sigma &= \sigma(w_1)\phi_1 + \cdots + \sigma(w_r)\phi_r + \sigma(v_1)\sigma_1 + \cdots + \sigma(v_{n-r})\sigma_{n-r} \\ &= 0\phi_1 + \cdots + 0\phi_r + \sigma(v_1)\sigma_1 + \cdots + \sigma(v_{n-r})\sigma_{n-r} \\ &= \sigma(v_1)\sigma_1 + \cdots + \sigma(v_{n-r})\sigma_{n-r} \end{aligned}$$

Consequently, $\{\sigma_1, \ldots, \sigma_{n-r}\}$ spans $W^0$ and so it is a basis of $W^0$. Accordingly, as required,

$$\dim W^0 = n - r = \dim V - \dim W$$

(ii) Suppose $\dim V = n$ and $\dim W = r$. Then $\dim V^* = n$ and, by (i), $\dim W^0 = n - r$. Thus, by (i) applied to $W^0$ as a subspace of $V^*$, $\dim W^{00} = n - (n - r) = r$; therefore, $\dim W = \dim W^{00}$. By Problem 11.10, $W \subseteq W^{00}$. Accordingly, $W = W^{00}$.

11.12. Let U and W be subspaces of V. Prove that $(U + W)^0 = U^0 \cap W^0$.
Let $\phi \in (U + W)^0$. Then $\phi$ annihilates $U + W$; and so, in particular, $\phi$ annihilates U and W. That is, $\phi \in U^0$ and $\phi \in W^0$; hence, $\phi \in U^0 \cap W^0$. Thus, $(U + W)^0 \subseteq U^0 \cap W^0$.
On the other hand, suppose $\sigma \in U^0 \cap W^0$. Then $\sigma$ annihilates U and also W. If $v \in U + W$, then $v = u + w$, where $u \in U$ and $w \in W$. Hence, $\sigma(v) = \sigma(u) + \sigma(w) = 0 + 0 = 0$. Thus, $\sigma$ annihilates $U + W$; that is, $\sigma \in (U + W)^0$. Accordingly, $U^0 \cap W^0 \subseteq (U + W)^0$.
The two inclusion relations together give us the desired equality.
Remark: Observe that no dimension argument is employed in the proof; hence, the result holds for spaces of finite or infinite dimension.

Transpose of a Linear Mapping
11.13. Let $\phi$ be the linear functional on $\mathbf{R}^2$ defined by $\phi(x, y) = x - 2y$. For each of the following linear operators T on $\mathbf{R}^2$, find $(T^t(\phi))(x, y)$:
(a) $T(x, y) = (x, 0)$, (b) $T(x, y) = (y, x + y)$, (c) $T(x, y) = (2x - 3y, 5x + 2y)$

By definition, $T^t(\phi) = \phi \circ T$; that is, $(T^t(\phi))(v) = \phi(T(v))$ for every v. Hence,
(a) $(T^t(\phi))(x, y) = \phi(T(x, y)) = \phi(x, 0) = x$
(b) $(T^t(\phi))(x, y) = \phi(T(x, y)) = \phi(y, x + y) = y - 2(x + y) = -2x - y$
(c) $(T^t(\phi))(x, y) = \phi(T(x, y)) = \phi(2x - 3y, 5x + 2y) = (2x - 3y) - 2(5x + 2y) = -8x - 7y$
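The definition $T^t(\phi) = \phi \circ T$ translates directly into function composition. The sketch below treats part (c) of Problem 11.13 this way and, as a cross-check in the spirit of Theorem 11.7, also obtains the same coefficients by applying the transposed matrix of T to the coefficients of $\phi$. The helper names are illustrative.

```python
def transpose_map(T):
    """Return T^t, which sends a functional phi to the composition phi o T."""
    def T_t(phi):
        return lambda v: phi(T(v))
    return T_t


def phi(v):
    return v[0] - 2 * v[1]


def T(v):
    # Operator from Problem 11.13(c).
    return (2 * v[0] - 3 * v[1], 5 * v[0] + 2 * v[1])


pulled_back = transpose_map(T)(phi)

# Evaluating on the usual basis vectors recovers the coefficients of T^t(phi).
coeffs = (pulled_back((1, 0)), pulled_back((0, 1)))

# Matrix version: A represents T in the usual basis, phi has coefficients (1, -2),
# and A^T applied to those coefficients gives the coefficients of T^t(phi).
A = [[2, -3], [5, 2]]
phi_coeffs = (1, -2)
matrix_coeffs = tuple(sum(A[i][j] * phi_coeffs[i] for i in range(2)) for j in range(2))

print(coeffs, matrix_coeffs)
```

Both routes give the coefficients of $-8x - 7y$, matching the answer to part (c).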

11.14. Let $T: V \to U$ be linear and let $T^t: U^* \to V^*$ be its transpose. Show that the kernel of $T^t$ is the annihilator of the image of T, that is, $\operatorname{Ker} T^t = (\operatorname{Im} T)^0$.
Suppose $\phi \in \operatorname{Ker} T^t$; that is, $T^t(\phi) = \phi \circ T = 0$. If $u \in \operatorname{Im} T$, then $u = T(v)$ for some $v \in V$; hence,

$$\phi(u) = \phi(T(v)) = (\phi \circ T)(v) = 0(v) = 0$$

We have that $\phi(u) = 0$ for every $u \in \operatorname{Im} T$; hence, $\phi \in (\operatorname{Im} T)^0$. Thus, $\operatorname{Ker} T^t \subseteq (\operatorname{Im} T)^0$.
On the other hand, suppose $\sigma \in (\operatorname{Im} T)^0$; that is, $\sigma(\operatorname{Im} T) = \{0\}$. Then, for every $v \in V$,

$$(T^t(\sigma))(v) = (\sigma \circ T)(v) = \sigma(T(v)) = 0 = 0(v)$$

We have $(T^t(\sigma))(v) = 0(v)$ for every $v \in V$; hence, $T^t(\sigma) = 0$. Thus, $\sigma \in \operatorname{Ker} T^t$, and so $(\operatorname{Im} T)^0 \subseteq \operatorname{Ker} T^t$.
The two inclusion relations together give us the required equality.

11.15. Suppose V and U have finite dimension and $T: V \to U$ is linear. Prove $\operatorname{rank}(T) = \operatorname{rank}(T^t)$.
Suppose $\dim V = n$ and $\dim U = m$, and suppose $\operatorname{rank}(T) = r$. By Theorem 11.5,

$$\dim(\operatorname{Im} T)^0 = \dim U - \dim(\operatorname{Im} T) = m - \operatorname{rank}(T) = m - r$$

By Problem 11.14, $\operatorname{Ker} T^t = (\operatorname{Im} T)^0$. Hence, $\operatorname{nullity}(T^t) = m - r$. It then follows that, as claimed,

$$\operatorname{rank}(T^t) = \dim U^* - \operatorname{nullity}(T^t) = m - (m - r) = r = \operatorname{rank}(T)$$

11.16. Prove Theorem 11.7: Let $T: V \to U$ be linear and let A be the matrix representation of T in the bases $\{v_j\}$ of V and $\{u_i\}$ of U. Then the transpose matrix $A^T$ is the matrix representation of $T^t: U^* \to V^*$ in the bases dual to $\{u_i\}$ and $\{v_j\}$.
Suppose, for $j = 1, \ldots, m$,

$$T(v_j) = a_{j1}u_1 + a_{j2}u_2 + \cdots + a_{jn}u_n \qquad (1)$$

We want to prove that, for $i = 1, \ldots, n$,

$$T^t(\sigma_i) = a_{1i}\phi_1 + a_{2i}\phi_2 + \cdots + a_{mi}\phi_m \qquad (2)$$

where $\{\sigma_i\}$ and $\{\phi_j\}$ are the bases dual to $\{u_i\}$ and $\{v_j\}$, respectively.
Let $v \in V$ and suppose $v = k_1v_1 + k_2v_2 + \cdots + k_mv_m$. Then, by (1),

$$\begin{aligned} T(v) &= k_1T(v_1) + k_2T(v_2) + \cdots + k_mT(v_m) \\ &= k_1(a_{11}u_1 + \cdots + a_{1n}u_n) + k_2(a_{21}u_1 + \cdots + a_{2n}u_n) + \cdots + k_m(a_{m1}u_1 + \cdots + a_{mn}u_n) \\ &= (k_1a_{11} + k_2a_{21} + \cdots + k_ma_{m1})u_1 + \cdots + (k_1a_{1n} + k_2a_{2n} + \cdots + k_ma_{mn})u_n \\ &= \sum_{i=1}^{n} (k_1a_{1i} + k_2a_{2i} + \cdots + k_ma_{mi})u_i \end{aligned}$$

Hence, for $j = 1, \ldots, n$,

$$(T^t(\sigma_j))(v) = \sigma_j(T(v)) = \sigma_j\left(\sum_{i=1}^{n} (k_1a_{1i} + k_2a_{2i} + \cdots + k_ma_{mi})u_i\right) = k_1a_{1j} + k_2a_{2j} + \cdots + k_ma_{mj} \qquad (3)$$

On the other hand, for $j = 1, \ldots, n$,

$$(a_{1j}\phi_1 + a_{2j}\phi_2 + \cdots + a_{mj}\phi_m)(v) = (a_{1j}\phi_1 + a_{2j}\phi_2 + \cdots + a_{mj}\phi_m)(k_1v_1 + k_2v_2 + \cdots + k_mv_m) = k_1a_{1j} + k_2a_{2j} + \cdots + k_ma_{mj} \qquad (4)$$

Because $v \in V$ was arbitrary, (3) and (4) imply that

$$T^t(\sigma_j) = a_{1j}\phi_1 + a_{2j}\phi_2 + \cdots + a_{mj}\phi_m, \quad j = 1, \ldots, n$$

which is (2). Thus, the theorem is proved.

SUPPLEMENTARY PROBLEMS

Dual Spaces and Dual Bases
11.17. Find (a) $\phi + \sigma$, (b) $3\phi$, (c) $2\phi - 5\sigma$, where $\phi: \mathbf{R}^3 \to \mathbf{R}$ and $\sigma: \mathbf{R}^3 \to \mathbf{R}$ are defined by

$$\phi(x, y, z) = 2x - 3y + z \quad \text{and} \quad \sigma(x, y, z) = 4x - 2y + 3z$$

11.18. Find the dual basis of each of the following bases of $\mathbf{R}^3$: (a) $\{(1, 0, 0), (0, 1, 0), (0, 0, 1)\}$, (b) $\{(1, -2, 3), (1, -1, 1), (2, -4, 7)\}$.

11.19. Let V be the vector space of polynomials over R of degree $\le 2$. Let $\phi_1, \phi_2, \phi_3$ be the linear functionals on V defined by

$$\phi_1(f(t)) = \int_0^1 f(t)\,dt, \quad \phi_2(f(t)) = f'(1), \quad \phi_3(f(t)) = f(0)$$

Here $f(t) = a + bt + ct^2 \in V$ and $f'(t)$ denotes the derivative of $f(t)$. Find the basis $\{f_1(t), f_2(t), f_3(t)\}$ of V that is dual to $\{\phi_1, \phi_2, \phi_3\}$.

11.20. Suppose $u, v \in V$ and that $\phi(u) = 0$ implies $\phi(v) = 0$ for all $\phi \in V^*$. Show that $v = ku$ for some scalar k.
11.21. Suppose $\phi, \sigma \in V^*$ and that $\phi(v) = 0$ implies $\sigma(v) = 0$ for all $v \in V$. Show that $\sigma = k\phi$ for some scalar k.
11.22. Let V be the vector space of polynomials over K. For $a \in K$, define $\phi_a: V \to K$ by $\phi_a(f(t)) = f(a)$. Show that (a) $\phi_a$ is linear; (b) if $a \neq b$, then $\phi_a \neq \phi_b$.
11.23. Let V be the vector space of polynomials of degree $\le 2$. Let $a, b, c \in K$ be distinct scalars. Let $\phi_a, \phi_b, \phi_c$ be the linear functionals defined by $\phi_a(f(t)) = f(a)$, $\phi_b(f(t)) = f(b)$, $\phi_c(f(t)) = f(c)$. Show that $\{\phi_a, \phi_b, \phi_c\}$ is linearly independent, and find the basis $\{f_1(t), f_2(t), f_3(t)\}$ of V that is its dual.
11.24. Let V be the vector space of square matrices of order n. Let $T: V \to K$ be the trace mapping; that is, $T(A) = a_{11} + a_{22} + \cdots + a_{nn}$, where $A = (a_{ij})$. Show that T is linear.
11.25. Let W be a subspace of V. For any linear functional $\phi$ on W, show that there is a linear functional $\sigma$ on V such that $\sigma(w) = \phi(w)$ for any $w \in W$; that is, $\phi$ is the restriction of $\sigma$ to W.
11.26. Let $\{e_1, \ldots, e_n\}$ be the usual basis of $K^n$. Show that the dual basis is $\{\pi_1, \ldots, \pi_n\}$ where $\pi_i$ is the ith projection mapping; that is, $\pi_i(a_1, \ldots, a_n) = a_i$.
11.27. Let V be a vector space over R. Let $\phi_1, \phi_2 \in V^*$ and suppose $\sigma: V \to \mathbf{R}$, defined by $\sigma(v) = \phi_1(v)\phi_2(v)$, also belongs to $V^*$. Show that either $\phi_1 = 0$ or $\phi_2 = 0$.

Annihilators
11.28. Let W be the subspace of $\mathbf{R}^4$ spanned by $(1, 2, -3, 4)$, $(1, 3, -2, 6)$, $(1, 4, -1, 8)$. Find a basis of the annihilator of W.
11.29. Let W be the subspace of $\mathbf{R}^3$ spanned by $(1, 1, 0)$ and $(0, 1, 1)$. Find a basis of the annihilator of W.
11.30. Show that, for any subset S of V, $\operatorname{span}(S) = S^{00}$, where $\operatorname{span}(S)$ is the linear span of S.
11.31. Let U and W be subspaces of a vector space V of finite dimension. Prove that $(U \cap W)^0 = U^0 + W^0$.
11.32. Suppose $V = U \oplus W$. Prove that $V^* = U^0 \oplus W^0$.

Transpose of a Linear Mapping
11.33. Let $\phi$ be the linear functional on $\mathbf{R}^2$ defined by $\phi(x, y) = 3x - 2y$. For each of the following linear mappings $T: \mathbf{R}^3 \to \mathbf{R}^2$, find $(T^t(\phi))(x, y, z)$: (a) $T(x, y, z) = (x + y, y + z)$, (b) $T(x, y, z) = (x + y + z, 2x - y)$
11.34. Suppose $T_1: U \to V$ and $T_2: V \to W$ are linear. Prove that $(T_2 \circ T_1)^t = T_1^t \circ T_2^t$.
11.35. Suppose $T: V \to U$ is linear and V has finite dimension. Prove that $\operatorname{Im} T^t = (\operatorname{Ker} T)^0$.
11.36. Suppose $T: V \to U$ is linear and $u \in U$. Prove that $u \in \operatorname{Im} T$ or there exists $\phi \in U^*$ such that $T^t(\phi) = 0$ and $\phi(u) = 1$.
11.37. Let V be of finite dimension. Show that the mapping $T \mapsto T^t$ is an isomorphism from $\operatorname{Hom}(V, V)$ onto $\operatorname{Hom}(V^*, V^*)$. (Here T is any linear operator on V.)

Miscellaneous Problems
11.38. Let V be a vector space over R. The line segment $\overline{uv}$ joining points $u, v \in V$ is defined by $\overline{uv} = \{tu + (1 - t)v : 0 \le t \le 1\}$. A subset S of V is convex if $u, v \in S$ implies $\overline{uv} \subseteq S$. Let $\phi \in V^*$. Define

$$W^+ = \{v \in V : \phi(v) > 0\}, \quad W = \{v \in V : \phi(v) = 0\}, \quad W^- = \{v \in V : \phi(v) < 0\}$$

Prove that $W^+$, $W$, and $W^-$ are convex.
11.39. Let V be a vector space of finite dimension. A hyperplane H of V may be defined as the kernel of a nonzero linear functional $\phi$ on V. Show that every subspace of V is the intersection of a finite number of hyperplanes.

ANSWERS TO SUPPLEMENTARY PROBLEMS
11.17. (a) $6x - 5y + 4z$, (b) $6x - 9y + 3z$, (c) $-16x + 4y - 13z$
11.18. (a) $\phi_1 = x$, $\phi_2 = y$, $\phi_3 = z$; (b) $\phi_1 = -3x - 5y - 2z$, $\phi_2 = 2x + y$, $\phi_3 = x + 2y + z$
11.19. $f_1(t) = 3t - \tfrac{3}{2}t^2$, $f_2(t) = -\tfrac{1}{2}t + \tfrac{3}{4}t^2$, $f_3(t) = 1 - 3t + \tfrac{3}{2}t^2$
11.22. (b) Let $f(t) = t$. Then $\phi_a(f(t)) = a \neq b = \phi_b(f(t))$; and therefore, $\phi_a \neq \phi_b$
11.23. $\left\{ f_1(t) = \dfrac{t^2 - (b + c)t + bc}{(a - b)(a - c)},\ f_2(t) = \dfrac{t^2 - (a + c)t + ac}{(b - a)(b - c)},\ f_3(t) = \dfrac{t^2 - (a + b)t + ab}{(c - a)(c - b)} \right\}$
11.28. $\{\phi_1(x, y, z, t) = 5x - y + z,\ \phi_2(x, y, z, t) = 2y - t\}$
11.29. $\{\phi(x, y, z) = x - y + z\}$
11.33. (a) $(T^t(\phi))(x, y, z) = 3x + y - 2z$, (b) $(T^t(\phi))(x, y, z) = -x + 5y + 3z$

CHAPTER 12

Bilinear, Quadratic, and Hermitian Forms
12.1 Introduction
This chapter generalizes the notions of linear mappings and linear functionals. Specifically, we introduce the notion of a bilinear form. These bilinear maps also give rise to quadratic and Hermitian forms. Although quadratic forms were discussed previously, this chapter is treated independently of the previous results. Although the field K is arbitrary, we will later specialize to the cases $K = \mathbf{R}$ and $K = \mathbf{C}$. Furthermore, we may sometimes need to divide by 2. In such cases, we must assume that $1 + 1 \neq 0$, which is true when $K = \mathbf{R}$ or $K = \mathbf{C}$.

12.2 Bilinear Forms

Let V be a vector space of finite dimension over a field K. A bilinear form on V is a mapping $f: V \times V \to K$ such that, for all $a, b \in K$ and all $u_i, v_i \in V$:
(i) $f(au_1 + bu_2, v) = af(u_1, v) + bf(u_2, v)$,
(ii) $f(u, av_1 + bv_2) = af(u, v_1) + bf(u, v_2)$
We express condition (i) by saying f is linear in the first variable, and condition (ii) by saying f is linear in the second variable.
EXAMPLE 12.1

(a) Let f be the dot product on $\mathbf{R}^n$; that is, for $u = (a_i)$ and $v = (b_i)$,

$$f(u, v) = u \cdot v = a_1b_1 + a_2b_2 + \cdots + a_nb_n$$

Then f is a bilinear form on $\mathbf{R}^n$. (In fact, any inner product on a real vector space V is a bilinear form on V.)
(b) Let $\phi$ and $\sigma$ be arbitrary linear functionals on V. Let $f: V \times V \to K$ be defined by $f(u, v) = \phi(u)\sigma(v)$. Then f is a bilinear form, because $\phi$ and $\sigma$ are each linear.
(c) Let $A = [a_{ij}]$ be any $n \times n$ matrix over a field K. Then A may be identified with the following bilinear form f on $K^n$, where $X = [x_i]$ and $Y = [y_i]$ are column vectors of variables:

$$f(X, Y) = X^TAY = \sum_{i,j} a_{ij}x_iy_j = a_{11}x_1y_1 + a_{12}x_1y_2 + \cdots + a_{nn}x_ny_n$$

The above formal expression in the variables $x_i, y_i$ is termed the bilinear polynomial corresponding to the matrix A. Equation (12.1) shows that, in a certain sense, every bilinear form is of this type.

Space of Bilinear Forms

Let $B(V)$ denote the set of all bilinear forms on V. A vector space structure is placed on $B(V)$, where for any $f, g \in B(V)$ and any $k \in K$, we define $f + g$ and $kf$ as follows:

$$(f + g)(u, v) = f(u, v) + g(u, v) \quad \text{and} \quad (kf)(u, v) = kf(u, v)$$

The following theorem (proved in Problem 12.4) applies.
THEOREM 12.1: Let V be a vector space of dimension n over K. Let $\{\phi_1, \ldots, \phi_n\}$ be any basis of the dual space $V^*$. Then $\{f_{ij} : i, j = 1, \ldots, n\}$ is a basis of $B(V)$, where $f_{ij}$ is defined by $f_{ij}(u, v) = \phi_i(u)\phi_j(v)$. Thus, in particular, $\dim B(V) = n^2$.

12.3 Bilinear Forms and Matrices

Let f be a bilinear form on V and let $S = \{u_1, \ldots, u_n\}$ be a basis of V. Suppose $u, v \in V$ and

$$u = a_1u_1 + \cdots + a_nu_n \quad \text{and} \quad v = b_1u_1 + \cdots + b_nu_n$$

Then

$$f(u, v) = f(a_1u_1 + \cdots + a_nu_n,\ b_1u_1 + \cdots + b_nu_n) = \sum_{i,j} a_ib_jf(u_i, u_j)$$

Thus, f is completely determined by the $n^2$ values $f(u_i, u_j)$.
The matrix $A = [a_{ij}]$ where $a_{ij} = f(u_i, u_j)$ is called the matrix representation of f relative to the basis S or, simply, the "matrix of f in S." It "represents" f in the sense that, for all $u, v \in V$,

$$f(u, v) = \sum_{i,j} a_ib_jf(u_i, u_j) = [u]_S^T A [v]_S \qquad (12.1)$$

[As usual, $[u]_S$ denotes the coordinate (column) vector of u in the basis S.]
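Equation (12.1) can be tried out on a small case. The sketch below builds the matrix of a bilinear form on $\mathbf{R}^2$ relative to a basis and compares $[u]_S^T A [v]_S$ with a direct evaluation. The form `f`, the basis, and the coordinate vectors are hypothetical choices made only for this illustration.

```python
def f(x, y):
    # Hypothetical bilinear form on R^2, chosen only for illustration.
    return x[0] * y[0] + 2 * x[0] * y[1] - 3 * x[1] * y[1]


def matrix_of_form(form, basis):
    """Return A = [a_ij] with a_ij = form(u_i, u_j)."""
    return [[form(ui, uj) for uj in basis] for ui in basis]


def evaluate_with_matrix(A, a, b):
    """Compute [u]^T A [v] from the coordinate vectors a = [u]_S and b = [v]_S."""
    n = len(A)
    return sum(a[i] * A[i][j] * b[j] for i in range(n) for j in range(n))


def combine(coords, basis):
    """Return the vector with the given coordinates relative to the basis."""
    return tuple(sum(c * vec[k] for c, vec in zip(coords, basis))
                 for k in range(len(basis[0])))


S = [(1, 1), (1, -1)]
A = matrix_of_form(f, S)

a, b = (2, 3), (1, -1)           # coordinates of u and v relative to S
u, v = combine(a, S), combine(b, S)

print(A, f(u, v), evaluate_with_matrix(A, a, b))
```

The direct value $f(u, v)$ and the matrix expression agree, as (12.1) asserts. Note that A need not be symmetric, since this f is not a symmetric form.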

Change of Basis, Congruent Matrices
We now ask, how does a matrix representing a bilinear form transform when a new basis is selected? The answer is given in the following theorem (proved in Problem 12.5).
THEOREM 12.2: Let P be a change-of-basis matrix from one basis S to another basis $S'$. If A is the matrix representing a bilinear form f in the original basis S, then $B = P^TAP$ is the matrix representing f in the new basis $S'$.
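A small numerical check of Theorem 12.2 follows. It assumes the convention that the columns of P are the coordinate vectors of the new basis vectors relative to the old basis; the matrix A and the new basis are hypothetical examples, and the helpers `matmul` and `transpose` are illustrative.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(X):
    return [list(col) for col in zip(*X)]


# Matrix of a (hypothetical) bilinear form f in the usual basis S of R^2.
A = [[1, 2], [0, -3]]


def f(x, y):
    # The form represented by A in the usual basis: f(x, y) = x^T A y.
    return sum(x[i] * A[i][j] * y[j] for i in range(2) for j in range(2))


# Columns of P are the new basis vectors (1, 1) and (1, -1).
P = [[1, 1], [1, -1]]
new_basis = transpose(P)

B = matmul(matmul(transpose(P), A), P)                          # B = P^T A P
B_direct = [[f(ui, uj) for uj in new_basis] for ui in new_basis]  # b_ij = f(u'_i, u'_j)

print(B, B_direct)
```

Computing $P^TAP$ and evaluating f on pairs of new basis vectors give the same matrix, which is the content of the theorem for this example.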

The above theorem motivates the following definition.
DEFINITION: A matrix B is congruent to a matrix A, written $B \simeq A$, if there exists a nonsingular matrix P such that $B = P^TAP$.

Thus, by Theorem 12.2, matrices representing the same bilinear form are congruent. We remark that congruent matrices have the same rank, because P and $P^T$ are nonsingular; hence, the following definition is well defined.
DEFINITION: The rank of a bilinear form f on V, written $\operatorname{rank}(f)$, is the rank of any matrix representation of f. We say f is degenerate or nondegenerate according to whether $\operatorname{rank}(f) < \dim V$ or $\operatorname{rank}(f) = \dim V$.

12.4 Alternating Bilinear Forms

Let f be a bilinear form on V. Then f is called
(i) alternating if $f(v, v) = 0$ for every $v \in V$;
(ii) skew-symmetric if $f(u, v) = -f(v, u)$ for every $u, v \in V$.

Now suppose (i) is true. Then (ii) is true, because, for any $u, v \in V$,

$$0 = f(u + v, u + v) = f(u, u) + f(u, v) + f(v, u) + f(v, v) = f(u, v) + f(v, u)$$

On the other hand, suppose (ii) is true and also $1 + 1 \neq 0$. Then (i) is true, because, for every $v \in V$, we have $f(v, v) = -f(v, v)$. In other words, alternating and skew-symmetric are equivalent when $1 + 1 \neq 0$.
The main structure theorem of alternating bilinear forms (proved in Problem 12.23) is as follows.
THEOREM 12.3: Let f be an alternating bilinear form on V. Then there exists a basis of V in which f is represented by a block diagonal matrix M of the form

$$M = \operatorname{diag}\left( \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}, \ldots, \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}, [0], [0], \ldots, [0] \right)$$

Moreover, the number of nonzero blocks is uniquely determined by f [because it is equal to $\tfrac{1}{2}\operatorname{rank}(f)$].

In particular, the above theorem shows that any alternating bilinear form must have even rank.

12.5 Symmetric Bilinear Forms, Quadratic Forms

This section investigates the important notions of symmetric bilinear forms and quadratic forms and their representation by means of symmetric matrices. The only restriction on the field K is that $1 + 1 \neq 0$. In Section 12.6, we will restrict K to be the real field R, which yields important special results.

Symmetric Bilinear Forms
Let f be a bilinear form on V. Then f is said to be symmetric if, for every $u, v \in V$,

$$f(u, v) = f(v, u)$$

One can easily show that f is symmetric if and only if any matrix representation A of f is a symmetric matrix.
The main result for symmetric bilinear forms (proved in Problem 12.10) is as follows. (We emphasize that we are assuming that $1 + 1 \neq 0$.)
THEOREM 12.4: Let f be a symmetric bilinear form on V. Then V has a basis $\{v_1, \ldots, v_n\}$ in which f is represented by a diagonal matrix, that is, where $f(v_i, v_j) = 0$ for $i \neq j$.

THEOREM 12.4: (Alternative Form) Let A be a symmetric matrix over K. Then A is congruent to a diagonal matrix; that is, there exists a nonsingular matrix P such that $P^TAP$ is diagonal.

Diagonalization Algorithm
Recall that a nonsingular matrix P is a product of elementary matrices. Accordingly, one way of obtaining the diagonal form $D = P^TAP$ is by a sequence of elementary row operations and the same sequence of elementary column operations. This same sequence of elementary row operations on the identity matrix I will yield $P^T$. This algorithm is formalized below.
ALGORITHM 12.1: (Congruence Diagonalization of a Symmetric Matrix) The input is a symmetric matrix $A = [a_{ij}]$ of order n.

Step 1. Form the $n \times 2n$ (block) matrix $M = [A_1, I]$, where $A_1 = A$ is the left half of M and the identity matrix I is the right half of M.
Step 2. Examine the entry $a_{11}$. There are three cases.

Case I: $a_{11} \neq 0$. (Use $a_{11}$ as a pivot to put 0's below $a_{11}$ in M and to the right of $a_{11}$ in $A_1$.) For $i = 2, \ldots, n$:
(a) Apply the row operation "Replace $R_i$ by $-a_{i1}R_1 + a_{11}R_i$."
(b) Apply the corresponding column operation "Replace $C_i$ by $-a_{i1}C_1 + a_{11}C_i$."
These operations reduce the matrix M to the form

$$M \sim \begin{bmatrix} a_{11} & 0 & * & * \\ 0 & A_2 & * & * \end{bmatrix} \qquad (*)$$

Case II: $a_{11} = 0$ but $a_{kk} \neq 0$, for some $k > 1$.
(a) Apply the row operation "Interchange $R_1$ and $R_k$."
(b) Apply the corresponding column operation "Interchange $C_1$ and $C_k$."
(These operations bring $a_{kk}$ into the first diagonal position, which reduces the matrix to Case I.)
Case III: All diagonal entries $a_{ii} = 0$ but some $a_{ij} \neq 0$.
(a) Apply the row operation "Replace $R_i$ by $R_j + R_i$."
(b) Apply the corresponding column operation "Replace $C_i$ by $C_j + C_i$."
(These operations bring $2a_{ij}$ into the ith diagonal position, which reduces the matrix to Case II.)
Thus, M is finally reduced to the form $(*)$, where $A_2$ is a symmetric matrix of order less than A.

Step 3. Repeat Step 2 with each new matrix $A_k$ (by neglecting the first row and column of the preceding matrix) until A is diagonalized. Then M is transformed into the form $M' = [D, Q]$, where D is diagonal.
Step 4. Set $P = Q^T$. Then $D = P^TAP$.

Remark 1: We emphasize that in Step 2, the row operations will change both sides of M, but the column operations will only change the left half of M.
Remark 2: The condition $1 + 1 \neq 0$ is used in Case III, where we assume that $2a_{ij} \neq 0$ when $a_{ij} \neq 0$.
The justification for the above algorithm appears in Problem 12.9.
EXAMPLE 12.2 Let $A = \begin{bmatrix} 1 & 2 & -3 \\ 2 & 5 & -4 \\ -3 & -4 & 8 \end{bmatrix}$. Apply Algorithm 12.1 to find a nonsingular matrix P such that $D = P^TAP$ is diagonal.
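First form the block matrix $M = [A, I]$; that is, let
\[ M = [A, I] = \left[\begin{array}{rrr|rrr} 1 & 2 & -3 & 1 & 0 & 0 \\ 2 & 5 & -4 & 0 & 1 & 0 \\ -3 & -4 & 8 & 0 & 0 & 1 \end{array}\right] \]
Apply the row operations "Replace $R_2$ by $-2R_1 + R_2$" and "Replace $R_3$ by $3R_1 + R_3$" to $M$, and then apply the corresponding column operations "Replace $C_2$ by $-2C_1 + C_2$" and "Replace $C_3$ by $3C_1 + C_3$" to obtain
\[ \left[\begin{array}{rrr|rrr} 1 & 2 & -3 & 1 & 0 & 0 \\ 0 & 1 & 2 & -2 & 1 & 0 \\ 0 & 2 & -1 & 3 & 0 & 1 \end{array}\right] \quad\text{and then}\quad \left[\begin{array}{rrr|rrr} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 2 & -2 & 1 & 0 \\ 0 & 2 & -1 & 3 & 0 & 1 \end{array}\right] \]
Next apply the row operation "Replace $R_3$ by $-2R_2 + R_3$" and then the corresponding column operation "Replace $C_3$ by $-2C_2 + C_3$" to obtain
\[ \left[\begin{array}{rrr|rrr} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 2 & -2 & 1 & 0 \\ 0 & 0 & -5 & 7 & -2 & 1 \end{array}\right] \quad\text{and then}\quad \left[\begin{array}{rrr|rrr} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & -2 & 1 & 0 \\ 0 & 0 & -5 & 7 & -2 & 1 \end{array}\right] \]
Now $A$ has been diagonalized. Set
\[ P = \begin{bmatrix} 1 & -2 & 7 \\ 0 & 1 & -2 \\ 0 & 0 & 1 \end{bmatrix} \quad\text{and then}\quad D = P^TAP = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -5 \end{bmatrix} \]
We emphasize that $P$ is the transpose of the right half of the final matrix.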
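The steps of Algorithm 12.1 can be sketched in code. The following is a minimal illustration, not a definitive implementation: the function name `congruence_diagonalize` and its helpers are hypothetical, exact rational arithmetic (`fractions.Fraction`) is assumed so that $1 + 1 \ne 0$ holds, and, as a small deviation from Case I as stated, a row whose entry $a_{ik}$ is already zero is skipped rather than rescaled.

```python
from fractions import Fraction


def congruence_diagonalize(A):
    """Return (D, P) with D = P^T A P diagonal, in the spirit of Algorithm 12.1.

    A is a symmetric matrix given as a list of rows with rational entries.
    """
    n = len(A)
    # Step 1: block matrix M = [A, I]; columns 0..n-1 hold A, n..2n-1 hold I.
    M = [[Fraction(A[i][j]) for j in range(n)] +
         [Fraction(int(i == j)) for j in range(n)] for i in range(n)]

    def add_row(dst, src, c_src, c_dst):
        # Replace R_dst by c_src*R_src + c_dst*R_dst (acts on both halves).
        M[dst] = [c_src * s + c_dst * d for s, d in zip(M[src], M[dst])]

    def add_col(dst, src, c_src, c_dst):
        # Replace C_dst by c_src*C_src + c_dst*C_dst (left half only).
        for row in M:
            row[dst] = c_src * row[src] + c_dst * row[dst]

    def swap(i, j):
        # Interchange R_i, R_j, then C_i, C_j (columns in the left half only).
        M[i], M[j] = M[j], M[i]
        for row in M:
            row[i], row[j] = row[j], row[i]

    for k in range(n):
        if M[k][k] == 0:
            # Case II: bring a nonzero diagonal entry into position k.
            pivot = next((j for j in range(k + 1, n) if M[j][j] != 0), None)
            if pivot is not None:
                swap(k, pivot)
            else:
                # Case III: use some a_ij != 0 to put 2*a_ij on the diagonal.
                pair = next(((i, j) for i in range(k, n)
                             for j in range(i + 1, n) if M[i][j] != 0), None)
                if pair is None:
                    break  # the remaining block is zero, so we are done
                i, j = pair
                add_row(i, j, Fraction(1), Fraction(1))
                add_col(i, j, Fraction(1), Fraction(1))
                if i != k:
                    swap(k, i)
        # Case I: clear column k below the pivot and row k to its right.
        a_kk = M[k][k]
        for i in range(k + 1, n):
            a_ik = M[i][k]
            if a_ik != 0:  # deviation: skip rows that need no elimination
                add_row(i, k, -a_ik, a_kk)
                add_col(i, k, -a_ik, a_kk)

    D = [row[:n] for row in M]
    Q = [row[n:] for row in M]
    P = [[Q[j][i] for j in range(n)] for i in range(n)]  # Step 4: P = Q^T
    return D, P


A = [[1, 2, -3], [2, 5, -4], [-3, -4, 8]]  # the matrix of Example 12.2
D, P = congruence_diagonalize(A)
```

On the matrix of Example 12.2 this sketch performs the same operations as the worked example, so `D` is diag(1, 1, -5) and `P` agrees with the matrix found there.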

Quadratic Forms
We begin with a definition.
DEFINITION A: A mapping $q: V \to K$ is a quadratic form if $q(v) = f(v, v)$ for some symmetric bilinear form $f$ on $V$.

If $1 + 1 \ne 0$ in $K$, then the bilinear form $f$ can be obtained from the quadratic form $q$ by the following polar form of $f$:
\[ f(u, v) = \tfrac{1}{2}[q(u + v) - q(u) - q(v)] \]
Now suppose $f$ is represented by a symmetric matrix $A = [a_{ij}]$, and $1 + 1 \ne 0$. Letting $X = [x_i]$ denote a column vector of variables, $q$ can be represented in the form
\[ q(X) = f(X, X) = X^TAX = \sum_{i,j} a_{ij}x_ix_j = \sum_i a_{ii}x_i^2 + 2\sum_{i<j} a_{ij}x_ix_j \]

Section 12.5 and Chapter 13 tell us how to diagonalize a real quadratic form $q$ or, equivalently, a real symmetric matrix $A$ by means of an orthogonal transition matrix $P$. If $P$ is merely nonsingular, then $q$ can be represented in diagonal form with only 1's and $-1$'s as nonzero coefficients. Namely, we have the following corollary.
COROLLARY 12.6: Any real quadratic form $q$ has a unique representation in the form
\[ q(x_1, x_2, \ldots, x_n) = x_1^2 + \cdots + x_p^2 - x_{p+1}^2 - \cdots - x_r^2 \]
where $r = p + n$ is the rank of the form.

COROLLARY 12.6: (Alternative Form) Any real symmetric matrix $A$ is congruent to the unique diagonal matrix
\[ D = \operatorname{diag}(I_p, -I_n, 0) \]
where $r = p + n$ is the rank of $A$.

EXAMPLE 12.3 Let $f$ be the dot product on $\mathbf{R}^n$. Recall that $f$ is a symmetric bilinear form on $\mathbf{R}^n$. We note

12.7 Hermitian Forms

Let $V$ be a vector space of finite dimension over the complex field $\mathbf{C}$. A Hermitian form on $V$ is a mapping $f: V \times V \to \mathbf{C}$ such that, for all $a, b \in \mathbf{C}$ and all $u_i, v \in V$,
(i) $f(au_1 + bu_2, v) = af(u_1, v) + bf(u_2, v)$,
(ii) $f(u, v) = \overline{f(v, u)}$.
(As usual, $\bar{k}$ denotes the complex conjugate of $k \in \mathbf{C}$.) Using (i) and (ii), we get
\[ f(u, av_1 + bv_2) = \overline{f(av_1 + bv_2, u)} = \overline{af(v_1, u) + bf(v_2, u)} = \bar{a}\,\overline{f(v_1, u)} + \bar{b}\,\overline{f(v_2, u)} = \bar{a}f(u, v_1) + \bar{b}f(u, v_2) \]
That is,
(iii) $f(u, av_1 + bv_2) = \bar{a}f(u, v_1) + \bar{b}f(u, v_2)$.
As before, we express condition (i) by saying $f$ is linear in the first variable. On the other hand, we express condition (iii) by saying $f$ is "conjugate linear" in the second variable. Moreover, condition (ii) tells us that $f(v, v) = \overline{f(v, v)}$, and hence, $f(v, v)$ is real for every $v \in V$.

The results of Sections 12.5 and 12.6 for symmetric forms have their analogues for Hermitian forms. Thus, the mapping $q: V \to \mathbf{R}$, defined by $q(v) = f(v, v)$, is called the Hermitian quadratic form or complex quadratic form associated with the Hermitian form $f$. We can obtain $f$ from $q$ by the polar form
\[ f(u, v) = \tfrac{1}{4}[q(u + v) - q(u - v)] + \tfrac{i}{4}[q(u + iv) - q(u - iv)] \]


Now suppose $S = \{u_1, \ldots, u_n\}$ is a basis of $V$. The matrix $H = [h_{ij}]$ where $h_{ij} = f(u_i, u_j)$ is called the matrix representation of $f$ in the basis $S$. By (ii), $f(u_i, u_j) = \overline{f(u_j, u_i)}$; hence, $H$ is Hermitian and, in particular, the diagonal entries of $H$ are real. Thus, any diagonal representation of $f$ contains only real entries. The next theorem (to be proved in Problem 12.47) is the complex analog of Theorem 12.5 on real symmetric bilinear forms.
THEOREM 12.7:

Let f be a Hermitian form on V over C. Then there exists a basis of V in which f is represented by a diagonal matrix. Every other diagonal matrix representation of f has the same number p of positive entries and the same number n of negative entries.

Again the rank and signature of the Hermitian form $f$ are denoted and defined by
\[ \operatorname{rank}(f) = p + n \quad\text{and}\quad \operatorname{sig}(f) = p - n \]
These are uniquely defined by Theorem 12.7. Analogously, a Hermitian form $f$ is said to be
(i) positive definite if $q(v) = f(v, v) > 0$ for every $v \ne 0$,
(ii) nonnegative semidefinite if $q(v) = f(v, v) \ge 0$ for every $v$.

EXAMPLE 12.4 Let $f$ be the dot product on $\mathbf{C}^n$; that is, for any $u = (z_i)$ and $v = (w_i)$ in $\mathbf{C}^n$,
\[ f(u, v) = u \cdot v = z_1\bar{w}_1 + z_2\bar{w}_2 + \cdots + z_n\bar{w}_n \]
Then $f$ is a Hermitian form on $\mathbf{C}^n$. Moreover, $f$ is also positive definite, because, for any $u = (z_i) \ne 0$ in $\mathbf{C}^n$,
\[ f(u, u) = z_1\bar{z}_1 + z_2\bar{z}_2 + \cdots + z_n\bar{z}_n = |z_1|^2 + |z_2|^2 + \cdots + |z_n|^2 > 0 \]
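The two defining properties used in Example 12.4 can be checked numerically. The sketch below is only an illustration: the function name `hermitian_dot` and the sample vectors are assumptions, and Python's built-in complex type stands in for $\mathbf{C}$.

```python
def hermitian_dot(u, v):
    """Dot product on C^n: the sum of z_i times the conjugate of w_i."""
    return sum(z * w.conjugate() for z, w in zip(u, v))


# Sample vectors in C^2 (arbitrary illustrative choices).
u = [1 + 2j, 3 - 1j]
v = [2 - 1j, 1j]

f_uv = hermitian_dot(u, v)
f_vu = hermitian_dot(v, u)
f_uu = hermitian_dot(u, u)  # equals |z_1|^2 + |z_2|^2, a positive real
```

Here `f_uv` and `f_vu` are complex conjugates of each other, which is condition (ii), and `f_uu` has zero imaginary part.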

SOLVED PROBLEMS

Bilinear Forms

12.1. Let $u = (x_1, x_2, x_3)$ and $v = (y_1, y_2, y_3)$. Express $f$ in matrix notation, where
\[ f(u, v) = 3x_1y_1 - 2x_1y_3 + 5x_2y_1 + 7x_2y_2 - 8x_2y_3 + 4x_3y_2 - 6x_3y_3 \]
Let $A = [a_{ij}]$, where $a_{ij}$ is the coefficient of $x_iy_j$. Then
\[ f(u, v) = X^TAY = [x_1, x_2, x_3] \begin{bmatrix} 3 & 0 & -2 \\ 5 & 7 & -8 \\ 0 & 4 & -6 \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} \]

12.2. Let $A$ be an $n \times n$ matrix over $K$. Show that the mapping $f$ defined by $f(X, Y) = X^TAY$ is a bilinear form on $K^n$.
For any $a, b \in K$ and any $X_i, Y_i \in K^n$,
\[ f(aX_1 + bX_2, Y) = (aX_1 + bX_2)^TAY = (aX_1^T + bX_2^T)AY = aX_1^TAY + bX_2^TAY = af(X_1, Y) + bf(X_2, Y) \]
Hence, $f$ is linear in the first variable. Also,
\[ f(X, aY_1 + bY_2) = X^TA(aY_1 + bY_2) = aX^TAY_1 + bX^TAY_2 = af(X, Y_1) + bf(X, Y_2) \]
Hence, $f$ is linear in the second variable, and so $f$ is a bilinear form on $K^n$.

12.3. Let $f$ be the bilinear form on $\mathbf{R}^2$ defined by
\[ f[(x_1, x_2), (y_1, y_2)] = 2x_1y_1 - 3x_1y_2 + 4x_2y_2 \]
(a) Find the matrix $A$ of $f$ in the basis $\{u_1 = (1, 0),\ u_2 = (1, 1)\}$.
(b) Find the matrix $B$ of $f$ in the basis $\{v_1 = (2, 1),\ v_2 = (1, -1)\}$.
(c) Find the change-of-basis matrix $P$ from the basis $\{u_i\}$ to the basis $\{v_i\}$, and verify that $B = P^TAP$.

(a) Set $A = [a_{ij}]$, where $a_{ij} = f(u_i, u_j)$. This yields
$a_{11} = f[(1, 0), (1, 0)] = 2 - 0 - 0 = 2$, $a_{21} = f[(1, 1), (1, 0)] = 2 - 0 + 0 = 2$,
$a_{12} = f[(1, 0), (1, 1)] = 2 - 3 - 0 = -1$, $a_{22} = f[(1, 1), (1, 1)] = 2 - 3 + 4 = 3$.
Thus, $A = \begin{bmatrix} 2 & -1 \\ 2 & 3 \end{bmatrix}$ is the matrix of $f$ in the basis $\{u_1, u_2\}$.

(b) Set $B = [b_{ij}]$, where $b_{ij} = f(v_i, v_j)$. This yields
$b_{11} = f[(2, 1), (2, 1)] = 8 - 6 + 4 = 6$, $b_{21} = f[(1, -1), (2, 1)] = 4 - 3 - 4 = -3$,
$b_{12} = f[(2, 1), (1, -1)] = 4 + 6 - 4 = 6$, $b_{22} = f[(1, -1), (1, -1)] = 2 + 3 + 4 = 9$.
Thus, $B = \begin{bmatrix} 6 & 6 \\ -3 & 9 \end{bmatrix}$ is the matrix of $f$ in the basis $\{v_1, v_2\}$.

(c) Writing $v_1$ and $v_2$ in terms of the $u_i$ yields $v_1 = u_1 + u_2$ and $v_2 = 2u_1 - u_2$. Then
\[ P = \begin{bmatrix} 1 & 2 \\ 1 & -1 \end{bmatrix}, \qquad P^T = \begin{bmatrix} 1 & 1 \\ 2 & -1 \end{bmatrix} \]
and
\[ P^TAP = \begin{bmatrix} 1 & 1 \\ 2 & -1 \end{bmatrix} \begin{bmatrix} 2 & -1 \\ 2 & 3 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 1 & -1 \end{bmatrix} = \begin{bmatrix} 6 & 6 \\ -3 & 9 \end{bmatrix} = B \]
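The computation in Problem 12.3 can be replayed mechanically. The sketch below assumes nothing beyond the data of the problem; the helper names `gram`, `matmul`, and `transpose` are hypothetical and written out so the block is self-contained.

```python
def f(u, v):
    """The bilinear form of Problem 12.3 on R^2."""
    (x1, x2), (y1, y2) = u, v
    return 2 * x1 * y1 - 3 * x1 * y2 + 4 * x2 * y2


def gram(form, basis):
    """Matrix [form(b_i, b_j)] of a bilinear form in the given basis."""
    return [[form(bi, bj) for bj in basis] for bi in basis]


def transpose(X):
    return [list(col) for col in zip(*X)]


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


A = gram(f, [(1, 0), (1, 1)])    # matrix in the basis {u1, u2}
B = gram(f, [(2, 1), (1, -1)])   # matrix in the basis {v1, v2}
P = [[1, 2], [1, -1]]            # columns: coordinates of v1, v2 in {u1, u2}
PtAP = matmul(matmul(transpose(P), A), P)
```

Comparing `PtAP` with `B` confirms the congruence relation of part (c).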

12.4. Prove Theorem 12.1: Let $V$ be an $n$-dimensional vector space over $K$. Let $\{\phi_1, \ldots, \phi_n\}$ be any basis of the dual space $V^*$. Then $\{f_{ij} : i, j = 1, \ldots, n\}$ is a basis of $B(V)$, where $f_{ij}$ is defined by $f_{ij}(u, v) = \phi_i(u)\phi_j(v)$. Thus, $\dim B(V) = n^2$.
Let $\{u_1, \ldots, u_n\}$ be the basis of $V$ dual to $\{\phi_i\}$. We first show that $\{f_{ij}\}$ spans $B(V)$. Let $f \in B(V)$ and suppose $f(u_i, u_j) = a_{ij}$. We claim that $f = \sum_{i,j} a_{ij}f_{ij}$. It suffices to show that
\[ f(u_s, u_t) = \Big(\sum a_{ij}f_{ij}\Big)(u_s, u_t) \quad\text{for } s, t = 1, \ldots, n \]
We have
\[ \Big(\sum a_{ij}f_{ij}\Big)(u_s, u_t) = \sum a_{ij}f_{ij}(u_s, u_t) = \sum a_{ij}\phi_i(u_s)\phi_j(u_t) = \sum a_{ij}\delta_{is}\delta_{jt} = a_{st} = f(u_s, u_t) \]
as required. Hence, $\{f_{ij}\}$ spans $B(V)$. Next, suppose $\sum a_{ij}f_{ij} = 0$. Then for $s, t = 1, \ldots, n$,
\[ 0 = 0(u_s, u_t) = \Big(\sum a_{ij}f_{ij}\Big)(u_s, u_t) = a_{st} \]
The last step follows as above. Thus, $\{f_{ij}\}$ is independent, and hence is a basis of $B(V)$.

12.5. Prove Theorem 12.2. Let $P$ be the change-of-basis matrix from a basis $S$ to a basis $S'$. Let $A$ be the matrix representing a bilinear form $f$ in the basis $S$. Then $B = P^TAP$ is the matrix representing $f$ in the basis $S'$.
Let $u, v \in V$. Because $P$ is the change-of-basis matrix from $S$ to $S'$, we have $P[u]_{S'} = [u]_S$ and also $P[v]_{S'} = [v]_S$; hence, $[u]_S^T = [u]_{S'}^TP^T$. Thus,
\[ f(u, v) = [u]_S^TA[v]_S = [u]_{S'}^TP^TAP[v]_{S'} \]
Because $u$ and $v$ are arbitrary elements of $V$, $P^TAP$ is the matrix of $f$ in the basis $S'$.

Symmetric Bilinear Forms, Quadratic Forms

12.6. Find the symmetric matrix that corresponds to each of the following quadratic forms:
(a) $q(x, y, z) = 3x^2 + 4xy - y^2 + 8xz - 6yz + z^2$,
(b) $q'(x, y, z) = 3x^2 + xz - 2yz$,
(c) $q''(x, y, z) = 2x^2 - 5y^2 - 7z^2$

The symmetric matrix $A = [a_{ij}]$ that represents $q(x_1, \ldots, x_n)$ has the diagonal entry $a_{ii}$ equal to the coefficient of the square term $x_i^2$ and the nondiagonal entries $a_{ij}$ and $a_{ji}$ each equal to half of the coefficient of the cross-product term $x_ix_j$. Thus,
\[ \text{(a) } A = \begin{bmatrix} 3 & 2 & 4 \\ 2 & -1 & -3 \\ 4 & -3 & 1 \end{bmatrix}, \quad \text{(b) } A' = \begin{bmatrix} 3 & 0 & \tfrac12 \\ 0 & 0 & -1 \\ \tfrac12 & -1 & 0 \end{bmatrix}, \quad \text{(c) } A'' = \begin{bmatrix} 2 & 0 & 0 \\ 0 & -5 & 0 \\ 0 & 0 & -7 \end{bmatrix} \]
The third matrix $A''$ is diagonal, because the quadratic form $q''$ is diagonal; that is, $q''$ has no cross-product terms.

12.7. Find the quadratic form $q(X)$ that corresponds to each of the following symmetric matrices:
\[ \text{(a) } A = \begin{bmatrix} 5 & -3 \\ -3 & 8 \end{bmatrix}, \quad \text{(b) } B = \begin{bmatrix} 4 & -5 & 7 \\ -5 & -6 & 8 \\ 7 & 8 & -9 \end{bmatrix}, \quad \text{(c) } C = \begin{bmatrix} 2 & 4 & -1 & 5 \\ 4 & -7 & -6 & 8 \\ -1 & -6 & 3 & 9 \\ 5 & 8 & 9 & 1 \end{bmatrix} \]
The quadratic form $q(X)$ that corresponds to a symmetric matrix $M$ is defined by $q(X) = X^TMX$, where $X = [x_i]$ is the column vector of unknowns.

(a) Compute as follows:
\[ q(x, y) = X^TAX = [x, y] \begin{bmatrix} 5 & -3 \\ -3 & 8 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = [5x - 3y,\ -3x + 8y] \begin{bmatrix} x \\ y \end{bmatrix} = 5x^2 - 3xy - 3xy + 8y^2 = 5x^2 - 6xy + 8y^2 \]
As expected, the coefficient 5 of the square term $x^2$ and the coefficient 8 of the square term $y^2$ are the diagonal elements of $A$, and the coefficient $-6$ of the cross-product term $xy$ is the sum of the nondiagonal elements $-3$ and $-3$ of $A$ (or twice the nondiagonal element $-3$, because $A$ is symmetric).

(b) Because $B$ is a three-square matrix, there are three unknowns, say $x, y, z$ or $x_1, x_2, x_3$. Then
\[ q(x, y, z) = 4x^2 - 10xy - 6y^2 + 14xz + 16yz - 9z^2 \]
or
\[ q(x_1, x_2, x_3) = 4x_1^2 - 10x_1x_2 - 6x_2^2 + 14x_1x_3 + 16x_2x_3 - 9x_3^2 \]
Here we use the fact that the coefficients of the square terms $x_1^2, x_2^2, x_3^2$ (or $x^2, y^2, z^2$) are the respective diagonal elements $4, -6, -9$ of $B$, and the coefficient of the cross-product term $x_ix_j$ is the sum of the nondiagonal elements $b_{ij}$ and $b_{ji}$ (or twice $b_{ij}$, because $b_{ij} = b_{ji}$).

(c) Because $C$ is a four-square matrix, there are four unknowns. Hence,
\[ q(x_1, x_2, x_3, x_4) = 2x_1^2 - 7x_2^2 + 3x_3^2 + x_4^2 + 8x_1x_2 - 2x_1x_3 + 10x_1x_4 - 12x_2x_3 + 16x_2x_4 + 18x_3x_4 \]
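The rule $q(X) = X^TMX$ is easy to test numerically. The following sketch, with the hypothetical helper `quadratic_form`, evaluates $X^TBX$ for the matrix $B$ of part (b) and compares it with the expanded polynomial at a few sample points; the sample points are arbitrary choices.

```python
def quadratic_form(M, x):
    """Evaluate q(X) = X^T M X as the double sum of m_ij * x_i * x_j."""
    n = len(M)
    return sum(M[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


B = [[4, -5, 7], [-5, -6, 8], [7, 8, -9]]  # matrix of Problem 12.7(b)


def q_expanded(x, y, z):
    """The polynomial obtained in part (b)."""
    return (4 * x * x - 10 * x * y - 6 * y * y
            + 14 * x * z + 16 * y * z - 9 * z * z)


samples = [(1, 0, 0), (1, 2, 3), (-2, 5, 1), (3, -1, 4)]
agree = all(quadratic_form(B, p) == q_expanded(*p) for p in samples)
```

Agreement at a handful of points is a sanity check rather than a proof, but it catches sign and coefficient slips in the expansion.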

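12.8. Let $A = \begin{bmatrix} 1 & -3 & 2 \\ -3 & 7 & -5 \\ 2 & -5 & 8 \end{bmatrix}$. Apply Algorithm 12.1 to find a nonsingular matrix $P$ such that $D = P^TAP$ is diagonal, and find $\operatorname{sig}(A)$, the signature of $A$.

First form the block matrix $M = [A, I]$:
\[ M = [A, I] = \left[\begin{array}{rrr|rrr} 1 & -3 & 2 & 1 & 0 & 0 \\ -3 & 7 & -5 & 0 & 1 & 0 \\ 2 & -5 & 8 & 0 & 0 & 1 \end{array}\right] \]
Using $a_{11} = 1$ as a pivot, apply the row operations "Replace $R_2$ by $3R_1 + R_2$" and "Replace $R_3$ by $-2R_1 + R_3$" to $M$ and then apply the corresponding column operations "Replace $C_2$ by $3C_1 + C_2$" and "Replace $C_3$ by $-2C_1 + C_3$" to $A$ to obtain
\[ \left[\begin{array}{rrr|rrr} 1 & -3 & 2 & 1 & 0 & 0 \\ 0 & -2 & 1 & 3 & 1 & 0 \\ 0 & 1 & 4 & -2 & 0 & 1 \end{array}\right] \quad\text{and then}\quad \left[\begin{array}{rrr|rrr} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & -2 & 1 & 3 & 1 & 0 \\ 0 & 1 & 4 & -2 & 0 & 1 \end{array}\right] \]
Next apply the row operation "Replace $R_3$ by $R_2 + 2R_3$" and then the corresponding column operation "Replace $C_3$ by $C_2 + 2C_3$" to obtain
\[ \left[\begin{array}{rrr|rrr} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & -2 & 1 & 3 & 1 & 0 \\ 0 & 0 & 9 & -1 & 1 & 2 \end{array}\right] \quad\text{and then}\quad \left[\begin{array}{rrr|rrr} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & -2 & 0 & 3 & 1 & 0 \\ 0 & 0 & 18 & -1 & 1 & 2 \end{array}\right] \]
Now $A$ has been diagonalized and the transpose of $P$ is in the right half of $M$. Thus, set
\[ P = \begin{bmatrix} 1 & 3 & -1 \\ 0 & 1 & 1 \\ 0 & 0 & 2 \end{bmatrix} \quad\text{and then}\quad D = P^TAP = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & 18 \end{bmatrix} \]
Note $D$ has $p = 2$ positive and $n = 1$ negative diagonal elements. Thus, the signature of $A$ is $\operatorname{sig}(A) = p - n = 2 - 1 = 1$.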

12.9. Justify Algorithm 12.1, which diagonalizes (under congruence) a symmetric matrix $A$.
Consider the block matrix $M = [A, I]$. The algorithm applies a sequence of elementary row operations and the corresponding column operations to the left side of $M$, which is the matrix $A$. This is equivalent to premultiplying $A$ by a sequence of elementary matrices, say, $E_1, E_2, \ldots, E_r$, and postmultiplying $A$ by the transposes of the $E_i$. Thus, when the algorithm ends, the diagonal matrix $D$ on the left side of $M$ is equal to
\[ D = E_r \cdots E_2E_1AE_1^TE_2^T \cdots E_r^T = QAQ^T, \quad\text{where}\quad Q = E_r \cdots E_2E_1 \]
On the other hand, the algorithm only applies the elementary row operations to the identity matrix $I$ on the right side of $M$. Thus, when the algorithm ends, the matrix on the right side of $M$ is equal to
\[ E_r \cdots E_2E_1I = E_r \cdots E_2E_1 = Q \]
Setting $P = Q^T$, we get $D = P^TAP$, which is a diagonalization of $A$ under congruence.

12.10. Prove Theorem 12.4: Let $f$ be a symmetric bilinear form on $V$ over $K$ (where $1 + 1 \ne 0$). Then $V$ has a basis in which $f$ is represented by a diagonal matrix.
Algorithm 12.1 shows that every symmetric matrix over $K$ is congruent to a diagonal matrix. This is equivalent to the statement that $f$ has a diagonal representation.

12.11. Let $q$ be the quadratic form associated with the symmetric bilinear form $f$. Verify the polar identity $f(u, v) = \tfrac12[q(u + v) - q(u) - q(v)]$. (Assume that $1 + 1 \ne 0$.)
We have
\[ q(u + v) - q(u) - q(v) = f(u + v, u + v) - f(u, u) - f(v, v) = f(u, u) + f(u, v) + f(v, u) + f(v, v) - f(u, u) - f(v, v) = 2f(u, v) \]
If $1 + 1 \ne 0$, we can divide by 2 to obtain the required identity.
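A numerical spot check of the polar identity may help fix the idea. In the sketch below the symmetric matrix `A` and the vectors `u`, `v` are arbitrary sample data, and `bilinear` and `q` are hypothetical helper names; the check compares $2f(u, v)$ with $q(u + v) - q(u) - q(v)$ in integer arithmetic.

```python
def bilinear(A, u, v):
    """f(u, v) = u^T A v for a square matrix A."""
    return sum(A[i][j] * u[i] * v[j]
               for i in range(len(u)) for j in range(len(v)))


def q(A, v):
    """Quadratic form associated with f: q(v) = f(v, v)."""
    return bilinear(A, v, v)


A = [[3, 1], [1, -1]]   # a sample symmetric matrix
u = (2, -1)
v = (5, 3)
u_plus_v = tuple(a + b for a, b in zip(u, v))

lhs = 2 * bilinear(A, u, v)
rhs = q(A, u_plus_v) - q(A, u) - q(A, v)
```

The symmetry of `A` is what makes the two cross terms $f(u, v)$ and $f(v, u)$ coincide; with a nonsymmetric matrix the two sides would compare $f(u, v) + f(v, u)$ against $2f(u, v)$ and could differ.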

12.12. Consider the quadratic form $q(x, y) = 3x^2 + 2xy - y^2$ and the linear substitution
\[ x = s - 3t, \qquad y = 2s + t \]
(a) Rewrite $q(x, y)$ in matrix notation, and find the matrix $A$ representing $q(x, y)$.
(b) Rewrite the linear substitution using matrix notation, and find the matrix $P$ corresponding to the substitution.
(c) Find $q(s, t)$ using direct substitution.
(d) Find $q(s, t)$ using matrix notation.

(a) Here $q(x, y) = [x, y] \begin{bmatrix} 3 & 1 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$. Thus, $A = \begin{bmatrix} 3 & 1 \\ 1 & -1 \end{bmatrix}$; and $q(X) = X^TAX$, where $X = [x, y]^T$.

(b) Here $\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 1 & -3 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} s \\ t \end{bmatrix}$. Thus, $P = \begin{bmatrix} 1 & -3 \\ 2 & 1 \end{bmatrix}$; and $X = \begin{bmatrix} x \\ y \end{bmatrix}$, $Y = \begin{bmatrix} s \\ t \end{bmatrix}$ and $X = PY$.

(c) Substitute for $x$ and $y$ in $q$ to obtain
\[ q(s, t) = 3(s - 3t)^2 + 2(s - 3t)(2s + t) - (2s + t)^2 = 3(s^2 - 6st + 9t^2) + 2(2s^2 - 5st - 3t^2) - (4s^2 + 4st + t^2) = 3s^2 - 32st + 20t^2 \]

(d) Here $q(X) = X^TAX$ and $X = PY$. Thus, $X^T = Y^TP^T$. Therefore,
\[ q(s, t) = q(Y) = Y^TP^TAPY = [s, t] \begin{bmatrix} 1 & 2 \\ -3 & 1 \end{bmatrix} \begin{bmatrix} 3 & 1 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} 1 & -3 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} s \\ t \end{bmatrix} = [s, t] \begin{bmatrix} 3 & -16 \\ -16 & 20 \end{bmatrix} \begin{bmatrix} s \\ t \end{bmatrix} = 3s^2 - 32st + 20t^2 \]
[As expected, the results in parts (c) and (d) are equal.]

12.13. Consider any diagonal matrix $A = \operatorname{diag}(a_1, \ldots, a_n)$ over $K$. Show that for any nonzero scalars $k_1, \ldots, k_n \in K$, $A$ is congruent to a diagonal matrix $D$ with diagonal entries $a_1k_1^2, \ldots, a_nk_n^2$. Furthermore, show that
(a) If $K = \mathbf{C}$, then we can choose $D$ so that its diagonal entries are only 1's and 0's.
(b) If $K = \mathbf{R}$, then we can choose $D$ so that its diagonal entries are only 1's, $-1$'s, and 0's.

Let $P = \operatorname{diag}(k_1, \ldots, k_n)$. Then, as required,
\[ D = P^TAP = \operatorname{diag}(k_i)\operatorname{diag}(a_i)\operatorname{diag}(k_i) = \operatorname{diag}(a_1k_1^2, \ldots, a_nk_n^2) \]
(a) Let $P = \operatorname{diag}(b_i)$, where $b_i = 1/\sqrt{a_i}$ if $a_i \ne 0$ and $b_i = 1$ if $a_i = 0$. Then $P^TAP$ has the required form.
(b) Let $P = \operatorname{diag}(b_i)$, where $b_i = 1/\sqrt{|a_i|}$ if $a_i \ne 0$ and $b_i = 1$ if $a_i = 0$. Then $P^TAP$ has the required form.

Remark: We emphasize that (a) is no longer true if "congruence" is replaced by "Hermitian congruence," because replacing $P^TAP$ by $P^T A \bar{P}$ multiplies each $a_i$ by the positive real number $|b_i|^2$.

12.14. Prove Theorem 12.5: Let $f$ be a symmetric bilinear form on $V$ over $\mathbf{R}$. Then there exists a basis of $V$ in which $f$ is represented by a diagonal matrix. Every other diagonal matrix representation of $f$ has the same number $p$ of positive entries and the same number $n$ of negative entries.
By Theorem 12.4, there is a basis $\{u_1, \ldots, u_n\}$ of $V$ in which $f$ is represented by a diagonal matrix with, say, $p$ positive and $n$ negative entries. Now suppose $\{w_1, \ldots, w_n\}$ is another basis of $V$, in which $f$ is represented by a diagonal matrix with $p'$ positive and $n'$ negative entries. We can assume without loss of generality that the positive entries in each matrix appear first. Because $\operatorname{rank}(f) = p + n = p' + n'$, it suffices to prove that $p = p'$.

Let $U$ be the linear span of $u_1, \ldots, u_p$ and let $W$ be the linear span of $w_{p'+1}, \ldots, w_n$. Then $f(v, v) > 0$ for every nonzero $v \in U$, and $f(v, v) \le 0$ for every nonzero $v \in W$. Hence, $U \cap W = \{0\}$. Note that $\dim U = p$ and $\dim W = n - p'$. Thus,
\[ \dim(U + W) = \dim U + \dim W - \dim(U \cap W) = p + (n - p') - 0 = p - p' + n \]
But $\dim(U + W) \le \dim V = n$; hence, $p - p' + n \le n$ or $p \le p'$. Similarly, $p' \le p$ and therefore $p = p'$, as required.

Remark: The above theorem and proof depend only on the concept of positivity. Thus, the theorem is true for any subfield K of the real field R such as the rational field Q.

Positive Definite Real Quadratic Forms

12.15. Prove that the following definitions of a positive definite quadratic form $q$ are equivalent:
(a) The diagonal entries are all positive in any diagonal representation of $q$.
(b) $q(Y) > 0$, for any nonzero vector $Y$ in $\mathbf{R}^n$.

Suppose $q(Y) = a_1y_1^2 + a_2y_2^2 + \cdots + a_ny_n^2$. If all the coefficients are positive, then clearly $q(Y) > 0$ whenever $Y \ne 0$. Thus, (a) implies (b). Conversely, suppose (a) is not true; that is, suppose some diagonal entry $a_k \le 0$. Let $e_k = (0, \ldots, 1, \ldots, 0)$ be the vector whose entries are all 0 except 1 in the $k$th position. Then $q(e_k) = a_k$ is not positive, and so (b) is not true. That is, (b) implies (a). Accordingly, (a) and (b) are equivalent.

12.16. Determine whether each of the following quadratic forms $q$ is positive definite:
(a) $q(x, y, z) = x^2 + 2y^2 - 4xz - 4yz + 7z^2$
(b) $q(x, y, z) = x^2 + y^2 + 2xz + 4yz + 3z^2$

Diagonalize (under congruence) the symmetric matrix $A$ corresponding to $q$.

(a) Apply the operations "Replace $R_3$ by $2R_1 + R_3$" and "Replace $C_3$ by $2C_1 + C_3$," and then "Replace $R_3$ by $R_2 + R_3$" and "Replace $C_3$ by $C_2 + C_3$." These yield
\[ A = \begin{bmatrix} 1 & 0 & -2 \\ 0 & 2 & -2 \\ -2 & -2 & 7 \end{bmatrix} \simeq \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & -2 \\ 0 & -2 & 3 \end{bmatrix} \simeq \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]
The diagonal representation of $q$ only contains positive entries, $1, 2, 1$, on the diagonal. Thus, $q$ is positive definite.

(b) We have
\[ A = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 2 \\ 1 & 2 & 3 \end{bmatrix} \simeq \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 2 \\ 0 & 2 & 2 \end{bmatrix} \simeq \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -2 \end{bmatrix} \]
There is a negative entry $-2$ on the diagonal representation of $q$. Thus, $q$ is not positive definite.

12.17. Show that $q(x, y) = ax^2 + bxy + cy^2$ is positive definite if and only if $a > 0$ and the discriminant $D = b^2 - 4ac < 0$.
Suppose $v = (x, y) \ne 0$. Then either $x \ne 0$ or $y \ne 0$; say, $y \ne 0$. Let $t = x/y$. Then
\[ q(v) = y^2[a(x/y)^2 + b(x/y) + c] = y^2(at^2 + bt + c) \]
However, the following are equivalent:
(i) $s = at^2 + bt + c$ is positive for every value of $t$.
(ii) $s = at^2 + bt + c$ lies above the $t$-axis.
(iii) $a > 0$ and $D = b^2 - 4ac < 0$.
Thus, $q$ is positive definite if and only if $a > 0$ and $D < 0$. [Remark: $D < 0$ is the same as $\det(A) > 0$, where $A$ is the symmetric matrix corresponding to $q$.]

12.18. Determine whether or not each of the following quadratic forms $q$ is positive definite:
(a) $q(x, y) = x^2 - 4xy + 7y^2$, (b) $q(x, y) = x^2 + 8xy + 5y^2$, (c) $q(x, y) = 3x^2 + 2xy + y^2$

Compute the discriminant $D = b^2 - 4ac$, and then use Problem 12.17.
(a) $D = 16 - 28 = -12$. Because $a = 1 > 0$ and $D < 0$, $q$ is positive definite.
(b) $D = 64 - 20 = 44$. Because $D > 0$, $q$ is not positive definite.
(c) $D = 4 - 12 = -8$. Because $a = 3 > 0$ and $D < 0$, $q$ is positive definite.
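The criterion of Problem 12.17 translates directly into a one-line test. The function name below is hypothetical, and the sketch covers only binary forms $ax^2 + bxy + cy^2$ with real coefficients.

```python
def is_positive_definite_2d(a, b, c):
    """Test q(x, y) = a*x^2 + b*x*y + c*y^2 via a > 0 and b^2 - 4ac < 0."""
    return a > 0 and b * b - 4 * a * c < 0


# The three forms of Problem 12.18, given as coefficient triples (a, b, c).
results = [is_positive_definite_2d(1, -4, 7),
           is_positive_definite_2d(1, 8, 5),
           is_positive_definite_2d(3, 2, 1)]
```

For forms in three or more variables this shortcut does not apply; diagonalizing under congruence, as in Problem 12.16, is the general method used in this chapter.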

Hermitian Forms

12.19. Determine whether the following matrices are Hermitian:
\[ \text{(a) } \begin{bmatrix} 2 & 2 + 3i & 4 - 5i \\ 2 - 3i & 5 & 6 + 2i \\ 4 + 5i & 6 - 2i & -7 \end{bmatrix}, \quad \text{(b) } \begin{bmatrix} 3 & 2 - i & 4 + i \\ 2 - i & 6 & i \\ 4 + i & i & 7 \end{bmatrix}, \quad \text{(c) } \begin{bmatrix} 4 & -3 & 5 \\ -3 & 2 & 1 \\ 5 & 1 & -6 \end{bmatrix} \]
A complex matrix $A = [a_{ij}]$ is Hermitian if $A^* = A$, that is, if $a_{ij} = \bar{a}_{ji}$.
(a) Yes, because it is equal to its conjugate transpose.
(b) No, even though it is symmetric.
(c) Yes. In fact, a real matrix is Hermitian if and only if it is symmetric.

12.20. Let $A$ be a Hermitian matrix. Show that $f$ is a Hermitian form on $\mathbf{C}^n$ where $f$ is defined by $f(X, Y) = X^TA\bar{Y}$.
For all $a, b \in \mathbf{C}$ and all $X_1, X_2, Y \in \mathbf{C}^n$,
\[ f(aX_1 + bX_2, Y) = (aX_1 + bX_2)^TA\bar{Y} = (aX_1^T + bX_2^T)A\bar{Y} = aX_1^TA\bar{Y} + bX_2^TA\bar{Y} = af(X_1, Y) + bf(X_2, Y) \]
Hence, $f$ is linear in the first variable. Also,
\[ \overline{f(X, Y)} = \overline{X^TA\bar{Y}} = \overline{(X^TA\bar{Y})^T} = \overline{\bar{Y}^TA^TX} = Y^TA^*\bar{X} = Y^TA\bar{X} = f(Y, X) \]
Hence, $f$ is a Hermitian form on $\mathbf{C}^n$.

Remark: We use the fact that $X^TA\bar{Y}$ is a scalar and so it is equal to its transpose.

12.21. Let $f$ be a Hermitian form on $V$. Let $H$ be the matrix of $f$ in a basis $S = \{u_i\}$ of $V$. Prove the following:
(a) $f(u, v) = [u]_S^TH\overline{[v]_S}$ for all $u, v \in V$.
(b) If $P$ is the change-of-basis matrix from $S$ to a new basis $S'$ of $V$, then $B = P^TH\bar{P}$ (or $B = Q^*HQ$, where $Q = \bar{P}$) is the matrix of $f$ in the new basis $S'$.

Note that (b) is the complex analog of Theorem 12.2.
(a) Let $u, v \in V$ and suppose $u = a_1u_1 + \cdots + a_nu_n$ and $v = b_1u_1 + \cdots + b_nu_n$. Then, as required,
\[ f(u, v) = f(a_1u_1 + \cdots + a_nu_n,\ b_1u_1 + \cdots + b_nu_n) = \sum_{i,j} a_i\bar{b}_jf(u_i, u_j) = [a_1, \ldots, a_n]H[\bar{b}_1, \ldots, \bar{b}_n]^T = [u]_S^TH\overline{[v]_S} \]
(b) Because $P$ is the change-of-basis matrix from $S$ to $S'$, we have $P[u]_{S'} = [u]_S$ and $P[v]_{S'} = [v]_S$; hence, $[u]_S^T = [u]_{S'}^TP^T$ and $\overline{[v]_S} = \bar{P}\,\overline{[v]_{S'}}$. Thus, by (a),
\[ f(u, v) = [u]_S^TH\overline{[v]_S} = [u]_{S'}^TP^TH\bar{P}\,\overline{[v]_{S'}} \]
But $u$ and $v$ are arbitrary elements of $V$; hence, $P^TH\bar{P}$ is the matrix of $f$ in the basis $S'$.

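12.22. Let $H = \begin{bmatrix} 1 & 1 + i & 2i \\ 1 - i & 4 & 2 - 3i \\ -2i & 2 + 3i & 7 \end{bmatrix}$, a Hermitian matrix. Find a nonsingular matrix $P$ such that $D = P^TH\bar{P}$ is diagonal. Also, find the signature of $H$.

Use the modified Algorithm 12.1 that applies the same row operations but the corresponding conjugate column operations. Thus, first form the block matrix $M = [H, I]$:
\[ M = \left[\begin{array}{ccc|ccc} 1 & 1 + i & 2i & 1 & 0 & 0 \\ 1 - i & 4 & 2 - 3i & 0 & 1 & 0 \\ -2i & 2 + 3i & 7 & 0 & 0 & 1 \end{array}\right] \]
Apply the row operations "Replace $R_2$ by $(-1 + i)R_1 + R_2$" and "Replace $R_3$ by $2iR_1 + R_3$" and then the corresponding conjugate column operations "Replace $C_2$ by $(-1 - i)C_1 + C_2$" and "Replace $C_3$ by $-2iC_1 + C_3$" to obtain
\[ \left[\begin{array}{ccc|ccc} 1 & 1 + i & 2i & 1 & 0 & 0 \\ 0 & 2 & -5i & -1 + i & 1 & 0 \\ 0 & 5i & 3 & 2i & 0 & 1 \end{array}\right] \quad\text{and then}\quad \left[\begin{array}{ccc|ccc} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 2 & -5i & -1 + i & 1 & 0 \\ 0 & 5i & 3 & 2i & 0 & 1 \end{array}\right] \]
Next apply the row operation "Replace $R_3$ by $-5iR_2 + 2R_3$" and the corresponding conjugate column operation "Replace $C_3$ by $5iC_2 + 2C_3$" to obtain
\[ \left[\begin{array}{ccc|ccc} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 2 & -5i & -1 + i & 1 & 0 \\ 0 & 0 & -19 & 5 + 9i & -5i & 2 \end{array}\right] \quad\text{and then}\quad \left[\begin{array}{ccc|ccc} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 2 & 0 & -1 + i & 1 & 0 \\ 0 & 0 & -38 & 5 + 9i & -5i & 2 \end{array}\right] \]
Now $H$ has been diagonalized, and the transpose of the right half of $M$ is $P$. Thus, set
\[ P = \begin{bmatrix} 1 & -1 + i & 5 + 9i \\ 0 & 1 & -5i \\ 0 & 0 & 2 \end{bmatrix}, \quad\text{and then}\quad D = P^TH\bar{P} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & -38 \end{bmatrix} \]
Note $D$ has $p = 2$ positive elements and $n = 1$ negative elements. Thus, the signature of $H$ is $\operatorname{sig}(H) = 2 - 1 = 1$.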
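The result of Problem 12.22 can be confirmed by multiplying out $P^TH\bar{P}$. The sketch below uses Python's built-in complex numbers; the helper `matmul` is a hypothetical name, and the matrices are copied from the problem.

```python
def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


H = [[1, 1 + 1j, 2j],
     [1 - 1j, 4, 2 - 3j],
     [-2j, 2 + 3j, 7]]
P = [[1, -1 + 1j, 5 + 9j],
     [0, 1, -5j],
     [0, 0, 2]]

P_T = [list(col) for col in zip(*P)]                         # transpose of P
P_bar = [[complex(z).conjugate() for z in row] for row in P]  # entrywise conjugate
D = matmul(matmul(P_T, H), P_bar)
```

All entries involved are Gaussian integers of small size, so the floating-point products are exact and `D` can be compared entry by entry with diag(1, 2, -38).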

Miscellaneous Problems

12.23. Prove Theorem 12.3: Let $f$ be an alternating form on $V$. Then there exists a basis of $V$ in which $f$ is represented by a block diagonal matrix $M$ with blocks of the form $\begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$ or 0. The number of nonzero blocks is uniquely determined by $f$ [because it is equal to $\tfrac12\operatorname{rank}(f)$].

If $f = 0$, then the theorem is obviously true. Also, if $\dim V = 1$, then $f(k_1u, k_2u) = k_1k_2f(u, u) = 0$ and so $f = 0$. Accordingly, we can assume that $\dim V > 1$ and $f \ne 0$.

Because $f \ne 0$, there exist (nonzero) $u_1, u_2 \in V$ such that $f(u_1, u_2) \ne 0$. In fact, multiplying $u_1$ by an appropriate factor, we can assume that $f(u_1, u_2) = 1$ and so $f(u_2, u_1) = -1$. Now $u_1$ and $u_2$ are linearly independent; because if, say, $u_2 = ku_1$, then $f(u_1, u_2) = f(u_1, ku_1) = kf(u_1, u_1) = 0$. Let $U = \operatorname{span}(u_1, u_2)$; then,
(i) The matrix representation of the restriction of $f$ to $U$ in the basis $\{u_1, u_2\}$ is $\begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$,
(ii) If $u \in U$, say $u = au_1 + bu_2$, then
\[ f(u, u_1) = f(au_1 + bu_2, u_1) = -b \quad\text{and}\quad f(u, u_2) = f(au_1 + bu_2, u_2) = a \]
Let $W$ consist of those vectors $w \in V$ such that $f(w, u_1) = 0$ and $f(w, u_2) = 0$. Equivalently,
\[ W = \{w \in V : f(w, u) = 0 \text{ for every } u \in U\} \]
We claim that $V = U \oplus W$. It is clear that $U \cap W = \{0\}$, and so it remains to show that $V = U + W$. Let $v \in V$. Set
\[ u = f(v, u_2)u_1 - f(v, u_1)u_2 \quad\text{and}\quad w = v - u \qquad (1) \]
Because $u$ is a linear combination of $u_1$ and $u_2$, $u \in U$. We show next that $w \in W$. By (1) and (ii), $f(u, u_1) = f(v, u_1)$; hence,
\[ f(w, u_1) = f(v - u, u_1) = f(v, u_1) - f(u, u_1) = 0 \]
Similarly, $f(u, u_2) = f(v, u_2)$ and so
\[ f(w, u_2) = f(v - u, u_2) = f(v, u_2) - f(u, u_2) = 0 \]
Then $w \in W$ and so, by (1), $v = u + w$, where $u \in U$ and $w \in W$. This shows that $V = U + W$; therefore, $V = U \oplus W$.

Now the restriction of $f$ to $W$ is an alternating bilinear form on $W$. By induction, there exists a basis $u_3, \ldots, u_n$ of $W$ in which the matrix representing $f$ restricted to $W$ has the desired form. Accordingly, $u_1, u_2, u_3, \ldots, u_n$ is a basis of $V$ in which the matrix representing $f$ has the desired form.

SUPPLEMENTARY PROBLEMS

Bilinear Forms

12.24. Let $u = (x_1, x_2)$ and $v = (y_1, y_2)$. Determine which of the following are bilinear forms on $\mathbf{R}^2$:
(a) $f(u, v) = 2x_1y_2 - 3x_2y_1$, (b) $f(u, v) = x_1 + y_2$, (c) $f(u, v) = 3x_2y_2$, (d) $f(u, v) = x_1x_2 + y_1y_2$, (e) $f(u, v) = 1$, (f) $f(u, v) = 0$

12.25. Let $f$ be the bilinear form on $\mathbf{R}^2$ defined by
\[ f[(x_1, x_2), (y_1, y_2)] = 3x_1y_1 - 2x_1y_2 + 4x_2y_1 - x_2y_2 \]
(a) Find the matrix $A$ of $f$ in the basis $\{u_1 = (1, 1),\ u_2 = (1, 2)\}$.
(b) Find the matrix $B$ of $f$ in the basis $\{v_1 = (1, -1),\ v_2 = (3, 1)\}$.
(c) Find the change-of-basis matrix $P$ from $\{u_i\}$ to $\{v_i\}$, and verify that $B = P^TAP$.

12.26. Let $V$ be the vector space of two-square matrices over $\mathbf{R}$. Let $M = \begin{bmatrix} 1 & 2 \\ 3 & 5 \end{bmatrix}$, and let $f(A, B) = \operatorname{tr}(A^TMB)$, where $A, B \in V$ and "tr" denotes trace.
(a) Show that $f$ is a bilinear form on $V$.
(b) Find the matrix of $f$ in the basis
\[ \left\{ \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \right\} \]

12.27. Let $B(V)$ be the set of bilinear forms on $V$ over $K$. Prove the following:
(a) If $f, g \in B(V)$, then $f + g$, $kg \in B(V)$ for any $k \in K$.
(b) If $\phi$ and $\sigma$ are linear functions on $V$, then $f(u, v) = \phi(u)\sigma(v)$ belongs to $B(V)$.

12.28. Let $[f]$ denote the matrix representation of a bilinear form $f$ on $V$ relative to a basis $\{u_i\}$. Show that the mapping $f \mapsto [f]$ is an isomorphism of $B(V)$ onto the vector space $V$ of $n$-square matrices.

12.29. Let $f$ be a bilinear form on $V$. For any subset $S$ of $V$, let
\[ S^{\perp} = \{v \in V : f(u, v) = 0 \text{ for every } u \in S\} \quad\text{and}\quad S^{\top} = \{v \in V : f(v, u) = 0 \text{ for every } u \in S\} \]
Show that: (a) $S^{\perp}$ and $S^{\top}$ are subspaces of $V$; (b) $S_1 \subseteq S_2$ implies $S_2^{\perp} \subseteq S_1^{\perp}$ and $S_2^{\top} \subseteq S_1^{\top}$; (c) $\{0\}^{\perp} = \{0\}^{\top} = V$.

12.30. Suppose $f$ is a bilinear form on $V$. Prove that: $\operatorname{rank}(f) = \dim V - \dim V^{\perp} = \dim V - \dim V^{\top}$, and hence, $\dim V^{\perp} = \dim V^{\top}$.

12.31. Let $f$ be a bilinear form on $V$. For each $u \in V$, let $\hat{u}: V \to K$ and $\tilde{u}: V \to K$ be defined by $\hat{u}(x) = f(x, u)$ and $\tilde{u}(x) = f(u, x)$. Prove the following:
(a) $\hat{u}$ and $\tilde{u}$ are each linear; i.e., $\hat{u}, \tilde{u} \in V^*$,
(b) $u \mapsto \hat{u}$ and $u \mapsto \tilde{u}$ are each linear mappings from $V$ into $V^*$,
(c) $\operatorname{rank}(f) = \operatorname{rank}(u \mapsto \hat{u}) = \operatorname{rank}(u \mapsto \tilde{u})$.

12.32. Show that congruence of matrices (denoted by $\simeq$) is an equivalence relation; that is, (i) $A \simeq A$; (ii) If $A \simeq B$, then $B \simeq A$; (iii) If $A \simeq B$ and $B \simeq C$, then $A \simeq C$.

Symmetric Bilinear Forms, Quadratic Forms

12.33. Find the symmetric matrix $A$ belonging to each of the following quadratic forms:
(a) $q(x, y, z) = 2x^2 - 8xy + y^2 - 16xz + 14yz + 5z^2$, (b) $q(x, y, z) = x^2 - xz + y^2$, (c) $q(x, y, z) = xy + y^2 + 4xz + z^2$, (d) $q(x, y, z) = xy + yz$

12.34. For each of the following symmetric matrices $A$, find a nonsingular matrix $P$ such that $D = P^TAP$ is diagonal:
\[ \text{(a) } A = \begin{bmatrix} 1 & 0 & 2 \\ 0 & 3 & 6 \\ 2 & 6 & 7 \end{bmatrix}, \quad \text{(b) } A = \begin{bmatrix} 1 & -2 & 1 \\ -2 & 5 & 3 \\ 1 & 3 & -2 \end{bmatrix}, \quad \text{(c) } A = \begin{bmatrix} 1 & -1 & 0 & 2 \\ -1 & 2 & 1 & 0 \\ 0 & 1 & 1 & 2 \\ 2 & 0 & 2 & -1 \end{bmatrix} \]

12.35. Let $q(x, y) = 2x^2 - 6xy - 3y^2$ and $x = s + 2t$, $y = 3s - t$.
(a) Rewrite $q(x, y)$ in matrix notation, and find the matrix $A$ representing the quadratic form.
(b) Rewrite the linear substitution using matrix notation, and find the matrix $P$ corresponding to the substitution.
(c) Find $q(s, t)$ using (i) direct substitution, (ii) matrix notation.

12.36. For each of the following quadratic forms $q(x, y, z)$, find a nonsingular linear substitution expressing the variables $x, y, z$ in terms of variables $r, s, t$ such that $q(r, s, t)$ is diagonal:
(a) $q(x, y, z) = x^2 + 6xy + 8y^2 - 4xz + 2yz - 9z^2$,
(b) $q(x, y, z) = 2x^2 - 3y^2 + 8xz + 12yz + 25z^2$,
(c) $q(x, y, z) = x^2 + 2xy + 3y^2 + 4xz + 8yz + 6z^2$.
In each case, find the rank and signature.

12.37. Give an example of a quadratic form $q(x, y)$ such that $q(u) = 0$ and $q(v) = 0$ but $q(u + v) \ne 0$.

12.38. Let $S(V)$ denote all symmetric bilinear forms on $V$. Show that (a) $S(V)$ is a subspace of $B(V)$; (b) If $\dim V = n$, then $\dim S(V) = \tfrac12 n(n + 1)$.

12.39. Consider a real quadratic polynomial $q(x_1, \ldots, x_n) = \sum_{i,j=1}^{n} a_{ij}x_ix_j$, where $a_{ij} = a_{ji}$.
(a) If $a_{11} \ne 0$, show that the substitution
\[ x_1 = y_1 - \frac{1}{a_{11}}(a_{12}y_2 + \cdots + a_{1n}y_n), \quad x_2 = y_2, \quad \ldots, \quad x_n = y_n \]
yields the equation $q(x_1, \ldots, x_n) = a_{11}y_1^2 + q'(y_2, \ldots, y_n)$, where $q'$ is also a quadratic polynomial.
(b) If $a_{11} = 0$ but, say, $a_{12} \ne 0$, show that the substitution
\[ x_1 = y_1 + y_2, \quad x_2 = y_1 - y_2, \quad x_3 = y_3, \quad \ldots, \quad x_n = y_n \]
yields the equation $q(x_1, \ldots, x_n) = \sum b_{ij}y_iy_j$, where $b_{11} \ne 0$, which reduces this case to case (a).

Remark: This method of diagonalizing $q$ is known as completing the square.

Positive Definite Quadratic Forms

12.40. Determine whether or not each of the following quadratic forms is positive definite:
(a) $q(x, y) = 4x^2 + 5xy + 7y^2$, (b) $q(x, y) = 2x^2 - 3xy - y^2$, (c) $q(x, y, z) = x^2 + 4xy + 5y^2 + 6xz + 2yz + 4z^2$, (d) $q(x, y, z) = x^2 + 2xy + 2y^2 + 4xz + 6yz + 7z^2$

12.41. Find those values of $k$ such that the given quadratic form is positive definite:
(a) $q(x, y) = 2x^2 - 5xy + ky^2$, (b) $q(x, y) = 3x^2 - kxy + 12y^2$, (c) $q(x, y, z) = x^2 + 2xy + 2y^2 + 2xz + 6yz + kz^2$

12.42. Suppose $A$ is a real symmetric positive definite matrix. Show that $A = P^TP$ for some nonsingular matrix $P$.

Hermitian Forms

12.43. Modify Algorithm 12.1 so that, for a given Hermitian matrix $H$, it finds a nonsingular matrix $P$ for which $D = P^TH\bar{P}$ is diagonal.

12.44. For each Hermitian matrix $H$, find a nonsingular matrix $P$ such that $D = P^TH\bar{P}$ is diagonal:
\[ \text{(a) } H = \begin{bmatrix} 1 & i \\ -i & 2 \end{bmatrix}, \quad \text{(b) } H = \begin{bmatrix} 1 & 2 + 3i \\ 2 - 3i & -1 \end{bmatrix}, \quad \text{(c) } H = \begin{bmatrix} 1 & i & 2 + i \\ -i & 2 & 1 - i \\ 2 - i & 1 + i & 2 \end{bmatrix} \]
Find the rank and signature in each case.

12.45. Let $A$ be a complex nonsingular matrix. Show that $H = A^*A$ is Hermitian and positive definite.

12.46. We say that $B$ is Hermitian congruent to $A$ if there exists a nonsingular matrix $P$ such that $B = P^TA\bar{P}$ or, equivalently, if there exists a nonsingular matrix $Q$ such that $B = Q^*AQ$. Show that Hermitian congruence is an equivalence relation. (Note: If $Q = \bar{P}$, then $P^TA\bar{P} = Q^*AQ$.)

12.47. Prove Theorem 12.7: Let $f$ be a Hermitian form on $V$. Then there is a basis $S$ of $V$ in which $f$ is represented by a diagonal matrix, and every such diagonal representation has the same number $p$ of positive entries and the same number $n$ of negative entries.

Miscellaneous Problems
12.48. Let e denote an elementary row operation, and let f * denote the corresponding conjugate column operation  (where each scalar k in e is replaced by k in f *). Show that the elementary matrix corresponding to f * is the conjugate transpose of the elementary matrix corresponding to e. 12.49. Let V and W be vector spaces over K. A mapping f :V Â W ! K is called a bilinear form on V and W if (i) f ðav 1 þ bv 2 ; wÞ ¼ af ðv 1 ; wÞ þ bf ðv 2 ; wÞ, (ii) f ðv; aw1 þ bw2 Þ ¼ af ðv; w1 Þ þ bf ðv; w2 Þ for every a; b 2 K; v i 2 V ; wj 2 W. Prove the following:

(a) The set $B(V, W)$ of bilinear forms on $V$ and $W$ is a subspace of the vector space of functions from $V \times W$ into $K$.
(b) If $\{\phi_1, \ldots, \phi_m\}$ is a basis of $V^*$ and $\{\sigma_1, \ldots, \sigma_n\}$ is a basis of $W^*$, then $\{f_{ij} : i = 1, \ldots, m,\ j = 1, \ldots, n\}$ is a basis of $B(V, W)$, where $f_{ij}$ is defined by $f_{ij}(v, w) = \phi_i(v)\sigma_j(w)$. Thus, $\dim B(V, W) = \dim V \dim W$.
[Note that if $V = W$, then we obtain the space $B(V)$ investigated in this chapter.]

12.50. Let $V$ be a vector space over $K$. A mapping $f: \underbrace{V \times V \times \cdots \times V}_{m \text{ times}} \to K$ is called a multilinear (or $m$-linear) form on $V$ if $f$ is linear in each variable; that is, for $i = 1, \ldots, m$,
$$f(\ldots, \widehat{au + bv}, \ldots) = af(\ldots, \hat{u}, \ldots) + bf(\ldots, \hat{v}, \ldots)$$
where $\widehat{\ \cdot\ }$ denotes the $i$th element, and other elements are held fixed. An $m$-linear form $f$ is said to be alternating if $f(v_1, \ldots, v_m) = 0$ whenever $v_i = v_j$ for $i \neq j$. Prove the following:
(a) The set $B_m(V)$ of $m$-linear forms on $V$ is a subspace of the vector space of functions from $V \times V \times \cdots \times V$ into $K$.
(b) The set $A_m(V)$ of alternating $m$-linear forms on $V$ is a subspace of $B_m(V)$.
Remark 1: If $m = 2$, then we obtain the space $B(V)$ investigated in this chapter.
Remark 2: If $V = K^m$, then the determinant function is an alternating $m$-linear form on $V$.

ANSWERS TO SUPPLEMENTARY PROBLEMS
Notation: $M = [R_1; R_2; \ldots]$ denotes a matrix $M$ with rows $R_1, R_2, \ldots$.

12.24. (a) yes, (b) no, (c) yes, (d) no, (e) no, (f) yes

12.25. (a) $A = [4, 1; 7, 3]$, (b) $B = [0, -4; 20, 32]$, (c) $P = [3, 5; -2, -2]$

12.26. (b) $[1, 0, 2, 0; 0, 1, 0, 2; 3, 0, 5, 0; 0, 3, 0, 5]$

12.33. (a) $[2, -4, -8; -4, 1, 7; -8, 7, 5]$, (b) $[1, 0, -\tfrac{1}{2}; 0, 1, 0; -\tfrac{1}{2}, 0, 0]$, (c) $[0, \tfrac{1}{2}, 2; \tfrac{1}{2}, 1, 0; 2, 0, 1]$, (d) $[0, \tfrac{1}{2}, 0, \tfrac{1}{2}, 0, 1, \tfrac{1}{2}, 0, \tfrac{1}{2}, 0, \tfrac{1}{2}, 0]$

12.34. (a) $P = [1, 0, -2; 0, 1, -2; 0, 0, 1]$, $D = \operatorname{diag}(1, 3, -9)$; (b) $P = [1, 2, -11; 0, 1, -5; 0, 0, 1]$, $D = \operatorname{diag}(1, 1, -28)$; (c) $P = [1, 1, -1, -4; 0, 1, -1, -2; 0, 0, 1, 0; 0, 0, 0, 1]$, $D = \operatorname{diag}(1, 1, 0, -9)$

12.35. $A = [2, -3; -3, -3]$, $P = [1, 2; 3, -1]$, $q(s, t) = -43s^2 - 4st + 17t^2$

12.36. (a) $x = r - 3s - 19t$, $y = s + 7t$, $z = t$; $q(r, s, t) = r^2 - s^2 + 36t^2$; (b) $x = r - 2t$, $y = s + 2t$, $z = t$; $q(r, s, t) = 2r^2 - 3s^2 + 29t^2$; (c) $x = r - s - t$, $y = s - t$, $z = t$; $q(r, s, t) = r^2 - 2s^2$

12.37. $q(x, y) = x^2 - y^2$, $u = (1, 1)$, $v = (1, -1)$

12.40. (a) yes, (b) no, (c) no, (d) yes

12.41. (a) $k > \tfrac{25}{8}$, (b) $-12 < k < 12$, (c) $k > 5$

12.44. (a) $P = [1, i; 0, 1]$, $D = I$, $s = 2$; (b) $P = [1, -2 + 3i; 0, 1]$, $D = \operatorname{diag}(1, -14)$, $s = 0$; (c) $P = [1, i, -3 + i; 0, 1, i; 0, 0, 1]$, $D = \operatorname{diag}(1, 1, -4)$, $s = 1$

CHAPTER 13

Linear Operators on Inner Product Spaces
13.1 Introduction
This chapter investigates the space $A(V)$ of linear operators $T$ on an inner product space $V$. (See Chapter 7.) Thus, the base field $K$ is either the real numbers $\mathbf{R}$ or the complex numbers $\mathbf{C}$. In fact, different terminologies will be used for the real case and the complex case. We also use the fact that the inner products on real Euclidean space $\mathbf{R}^n$ and complex Euclidean space $\mathbf{C}^n$ may be defined, respectively, by $\langle u, v\rangle = u^T v$ and $\langle u, v\rangle = u^T \bar{v}$, where $u$ and $v$ are column vectors. The reader should review the material in Chapter 7 and be very familiar with the notions of norm (length), orthogonality, and orthonormal bases. We also note that Chapter 7 mainly dealt with real inner product spaces, whereas here we assume that $V$ is a complex inner product space unless otherwise stated or implied. Lastly, we note that in Chapter 2, we used $A^H$ to denote the conjugate transpose of a complex matrix $A$; that is, $A^H = \bar{A}^T$. This notation is not standard. Many texts, especially advanced texts, use $A^*$ to denote such a matrix; we will use that notation in this chapter. That is, now $A^* = \bar{A}^T$.

13.2

Adjoint Operators

We begin with the following basic definition.
DEFINITION: A linear operator $T$ on an inner product space $V$ is said to have an adjoint operator $T^*$ on $V$ if $\langle T(u), v\rangle = \langle u, T^*(v)\rangle$ for every $u, v \in V$.

The following example shows that the adjoint operator has a simple description within the context of matrix mappings.
EXAMPLE 13.1

(a) Let $A$ be a real $n$-square matrix viewed as a linear operator on $\mathbf{R}^n$. Then, for every $u, v \in \mathbf{R}^n$,
$$\langle Au, v\rangle = (Au)^T v = u^T A^T v = \langle u, A^T v\rangle$$
Thus, the transpose $A^T$ of $A$ is the adjoint of $A$.

(b) Let $B$ be a complex $n$-square matrix viewed as a linear operator on $\mathbf{C}^n$. Then, for every $u, v \in \mathbf{C}^n$,
$$\langle Bu, v\rangle = (Bu)^T \bar{v} = u^T B^T \bar{v} = u^T \overline{B^* v} = \langle u, B^* v\rangle$$
Thus, the conjugate transpose $B^*$ of $B$ is the adjoint of $B$.
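The identity in part (b) can be checked numerically. The following is a minimal sketch in plain Python; the matrix `B`, the vectors `u` and `v`, and the helper names are illustrative choices, not taken from the text.

```python
# Standard inner product on C^n used in this chapter: <u, v> = u^T conj(v).
def inner(u, v):
    return sum(a * b.conjugate() for a, b in zip(u, v))

def matvec(M, x):
    # Matrix-vector product with M stored as a list of rows.
    return [sum(m * xj for m, xj in zip(row, x)) for row in M]

def conj_transpose(M):
    # Entry (j, i) of the result is the conjugate of entry (i, j) of M.
    return [[M[i][j].conjugate() for i in range(len(M))]
            for j in range(len(M[0]))]

# Illustrative data (hypothetical, chosen only for the check).
B = [[1 + 2j, 3j], [2 - 1j, 4 + 0j]]
u = [1 + 1j, 2 - 3j]
v = [-2j, 5 + 1j]

lhs = inner(matvec(B, u), v)                  # <Bu, v>
rhs = inner(u, matvec(conj_transpose(B), v))  # <u, B*v>
print(lhs, rhs)
```

Both sides agree for this choice of data, as the example predicts for every $u, v \in \mathbf{C}^n$.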


Remark: B* may mean either the adjoint of B as a linear operator or the conjugate transpose of B as a matrix. By Example 13.1(b), the ambiguity makes no difference, because they denote the same object. The following theorem (proved in Problem 13.4) is the main result in this section.
THEOREM 13.1: Let $T$ be a linear operator on a finite-dimensional inner product space $V$ over $K$. Then
(i) There exists a unique linear operator $T^*$ on $V$ such that $\langle T(u), v\rangle = \langle u, T^*(v)\rangle$ for every $u, v \in V$. (That is, $T$ has an adjoint $T^*$.)
(ii) If $A$ is the matrix representation of $T$ with respect to any orthonormal basis $S = \{u_i\}$ of $V$, then the matrix representation of $T^*$ in the basis $S$ is the conjugate transpose $A^*$ of $A$ (or the transpose $A^T$ of $A$ when $K$ is real).

We emphasize that no such simple relationship exists between the matrices representing T and T * if the basis is not orthonormal. Thus, we see one useful property of orthonormal bases. We also emphasize that this theorem is not valid if V has infinite dimension (Problem 13.31). The following theorem (proved in Problem 13.5) summarizes some of the properties of the adjoint.
THEOREM 13.2: Let $T, T_1, T_2$ be linear operators on $V$ and let $k \in K$. Then
(i) $(T_1 + T_2)^* = T_1^* + T_2^*$,
(ii) $(kT)^* = \bar{k}T^*$,
(iii) $(T_1 T_2)^* = T_2^* T_1^*$,
(iv) $(T^*)^* = T$.

Observe the similarity between the above theorem and Theorem 2.3 on properties of the transpose operation on matrices.

Linear Functionals and Inner Product Spaces
Recall (Chapter 11) that a linear functional $f$ on a vector space $V$ is a linear mapping $f: V \to K$. This subsection contains an important result (Theorem 13.3) that is used in the proof of the above basic Theorem 13.1.

Let $V$ be an inner product space. Each $u \in V$ determines a mapping $\hat{u}: V \to K$ defined by
$$\hat{u}(v) = \langle v, u\rangle$$
Now, for any $a, b \in K$ and any $v_1, v_2 \in V$,
$$\hat{u}(av_1 + bv_2) = \langle av_1 + bv_2, u\rangle = a\langle v_1, u\rangle + b\langle v_2, u\rangle = a\hat{u}(v_1) + b\hat{u}(v_2)$$
That is, $\hat{u}$ is a linear functional on $V$. The converse is also true for spaces of finite dimension, and it is contained in the following important theorem (proved in Problem 13.3).

THEOREM 13.3: Let $f$ be a linear functional on a finite-dimensional inner product space $V$. Then there exists a unique vector $u \in V$ such that $f(v) = \langle v, u\rangle$ for every $v \in V$.

We remark that the above theorem is not valid for spaces of infinite dimension (Problem 13.24).

13.3

Analogy Between $A(V)$ and $\mathbf{C}$, Special Linear Operators

Let $A(V)$ denote the algebra of all linear operators on a finite-dimensional inner product space $V$. The adjoint mapping $T \mapsto T^*$ on $A(V)$ is quite analogous to the conjugation mapping $z \mapsto \bar{z}$ on the complex field $\mathbf{C}$. To illustrate this analogy we identify in Table 13-1 certain classes of operators $T \in A(V)$ whose behavior under the adjoint map imitates the behavior under conjugation of familiar classes of complex numbers. The analogy between these operators $T$ and complex numbers $z$ is reflected in the next theorem.

Table 13-1

| Class of complex numbers | Behavior under conjugation | Class of operators in $A(V)$ | Behavior under the adjoint map |
| Unit circle ($|z| = 1$) | $\bar{z} = 1/z$ | Orthogonal operators (real case); unitary operators (complex case) | $T^* = T^{-1}$ |
| Real axis | $\bar{z} = z$ | Self-adjoint operators; also called symmetric (real case), Hermitian (complex case) | $T^* = T$ |
| Imaginary axis | $\bar{z} = -z$ | Skew-adjoint operators; also called skew-symmetric (real case), skew-Hermitian (complex case) | $T^* = -T$ |
| Positive real axis $(0, \infty)$ | $z = \bar{w}w$, $w \neq 0$ | Positive definite operators | $T = S^*S$ with $S$ nonsingular |

THEOREM 13.4: Let $\lambda$ be an eigenvalue of a linear operator $T$ on $V$.
(i) If $T^* = T^{-1}$ (i.e., $T$ is orthogonal or unitary), then $|\lambda| = 1$.
(ii) If $T^* = T$ (i.e., $T$ is self-adjoint), then $\lambda$ is real.
(iii) If $T^* = -T$ (i.e., $T$ is skew-adjoint), then $\lambda$ is pure imaginary.
(iv) If $T = S^*S$ with $S$ nonsingular (i.e., $T$ is positive definite), then $\lambda$ is real and positive.

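Proof. In each case let $v$ be a nonzero eigenvector of $T$ belonging to $\lambda$; that is, $T(v) = \lambda v$ with $v \neq 0$. Hence, $\langle v, v\rangle$ is positive.

Proof of (i). We show that $\lambda\bar{\lambda}\langle v, v\rangle = \langle v, v\rangle$:
$$\lambda\bar{\lambda}\langle v, v\rangle = \langle \lambda v, \lambda v\rangle = \langle T(v), T(v)\rangle = \langle v, T^*T(v)\rangle = \langle v, I(v)\rangle = \langle v, v\rangle$$
But $\langle v, v\rangle \neq 0$; hence, $\lambda\bar{\lambda} = 1$ and so $|\lambda| = 1$.

Proof of (ii). We show that $\lambda\langle v, v\rangle = \bar{\lambda}\langle v, v\rangle$:
$$\lambda\langle v, v\rangle = \langle \lambda v, v\rangle = \langle T(v), v\rangle = \langle v, T^*(v)\rangle = \langle v, T(v)\rangle = \langle v, \lambda v\rangle = \bar{\lambda}\langle v, v\rangle$$
But $\langle v, v\rangle \neq 0$; hence, $\lambda = \bar{\lambda}$ and so $\lambda$ is real.

Proof of (iii). We show that $\lambda\langle v, v\rangle = -\bar{\lambda}\langle v, v\rangle$:
$$\lambda\langle v, v\rangle = \langle \lambda v, v\rangle = \langle T(v), v\rangle = \langle v, T^*(v)\rangle = \langle v, -T(v)\rangle = \langle v, -\lambda v\rangle = -\bar{\lambda}\langle v, v\rangle$$
But $\langle v, v\rangle \neq 0$; hence, $\lambda = -\bar{\lambda}$ or $\bar{\lambda} = -\lambda$, and so $\lambda$ is pure imaginary.

Proof of (iv). Note first that $S(v) \neq 0$ because $S$ is nonsingular; hence, $\langle S(v), S(v)\rangle$ is positive. We show that $\lambda\langle v, v\rangle = \langle S(v), S(v)\rangle$:
$$\lambda\langle v, v\rangle = \langle \lambda v, v\rangle = \langle T(v), v\rangle = \langle S^*S(v), v\rangle = \langle S(v), S(v)\rangle$$
But $\langle v, v\rangle$ and $\langle S(v), S(v)\rangle$ are positive; hence, $\lambda$ is positive.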
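The four cases of Theorem 13.4 can be illustrated on small matrices. The sketch below assumes 2-by-2 matrices and finds eigenvalues as roots of $t^2 - \operatorname{tr}(M)t + \det(M)$; the three sample matrices and the helper name `eigenvalues_2x2` are illustrative assumptions, not part of the text.

```python
import cmath

def eigenvalues_2x2(M):
    """Roots of t^2 - tr(M) t + det(M) for a 2x2 matrix M."""
    (a, b), (c, d) = M
    tr, det = a + d, a * d - b * c
    root = cmath.sqrt(tr * tr - 4 * det)
    return (tr + root) / 2, (tr - root) / 2

hermitian = [[2, 1 - 1j], [1 + 1j, 3]]  # T* = T
rotation = [[0, -1], [1, 0]]            # T* = T^{-1}, and also T* = -T
pos_def = [[1, 1], [1, 2]]              # S*S with S = [[1, 1], [0, 1]]

eig_h = eigenvalues_2x2(hermitian)      # expected: real
eig_r = eigenvalues_2x2(rotation)       # expected: modulus 1, pure imaginary
eig_p = eigenvalues_2x2(pos_def)        # expected: real and positive
print(eig_h, eig_r, eig_p)
```

The rotation by a right angle is both orthogonal and skew-symmetric, so its eigenvalues $\pm i$ satisfy cases (i) and (iii) at once.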


Remark: Each of the above operators $T$ commutes with its adjoint; that is, $TT^* = T^*T$. Such operators are called normal operators.

13.4

Self-Adjoint Operators

Let $T$ be a self-adjoint operator on an inner product space $V$; that is, suppose $T^* = T$. (If $T$ is defined by a matrix $A$, then $A$ is symmetric or Hermitian according as $A$ is real or complex.) By Theorem 13.4, the eigenvalues of $T$ are real. The following is another important property of $T$.

THEOREM 13.5: Let $T$ be a self-adjoint operator on $V$. Suppose $u$ and $v$ are eigenvectors of $T$ belonging to distinct eigenvalues. Then $u$ and $v$ are orthogonal; that is, $\langle u, v\rangle = 0$.

Proof. Suppose $T(u) = \lambda_1 u$ and $T(v) = \lambda_2 v$, where $\lambda_1 \neq \lambda_2$. We show that $\lambda_1\langle u, v\rangle = \lambda_2\langle u, v\rangle$:
$$\lambda_1\langle u, v\rangle = \langle \lambda_1 u, v\rangle = \langle T(u), v\rangle = \langle u, T^*(v)\rangle = \langle u, T(v)\rangle = \langle u, \lambda_2 v\rangle = \bar{\lambda}_2\langle u, v\rangle = \lambda_2\langle u, v\rangle$$
(The fourth equality uses the fact that $T^* = T$, and the last equality uses the fact that the eigenvalue $\lambda_2$ is real.) Because $\lambda_1 \neq \lambda_2$, we get $\langle u, v\rangle = 0$. Thus, the theorem is proved.

13.5

Orthogonal and Unitary Operators
Let $U$ be a linear operator on a finite-dimensional inner product space $V$. Suppose $U^* = U^{-1}$ or, equivalently,
$$UU^* = U^*U = I$$
Recall that $U$ is said to be orthogonal or unitary according as the underlying field is real or complex. The next theorem (proved in Problem 13.10) gives alternative characterizations of these operators.

THEOREM 13.6: The following conditions on an operator $U$ are equivalent:
(i) $U^* = U^{-1}$; that is, $UU^* = U^*U = I$. [$U$ is unitary (orthogonal).]
(ii) $U$ preserves inner products; that is, for every $v, w \in V$, $\langle U(v), U(w)\rangle = \langle v, w\rangle$.
(iii) $U$ preserves lengths; that is, for every $v \in V$, $\|U(v)\| = \|v\|$.

EXAMPLE 13.2

(a) Let $T: \mathbf{R}^3 \to \mathbf{R}^3$ be the linear operator that rotates each vector $v$ about the $z$-axis by a fixed angle $\theta$ as shown in Fig. 10-1 (Section 10.3). That is, $T$ is defined by
$$T(x, y, z) = (x\cos\theta - y\sin\theta,\ x\sin\theta + y\cos\theta,\ z)$$
We note that lengths (distances from the origin) are preserved under $T$. Thus, $T$ is an orthogonal operator.

(b) Let $V$ be $\ell_2$-space (Hilbert space), defined in Section 7.3. Let $T: V \to V$ be the linear operator defined by
$$T(a_1, a_2, a_3, \ldots) = (0, a_1, a_2, a_3, \ldots)$$
Clearly, $T$ preserves inner products and lengths. However, $T$ is not surjective, because, for example, $(1, 0, 0, \ldots)$ does not belong to the image of $T$; hence, $T$ is not invertible. Thus, we see that Theorem 13.6 is not valid for spaces of infinite dimension.
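Both parts of the example can be explored with a short computation. This is only a sketch: the angle `0.7`, the sample vectors, and the function names are illustrative assumptions, and the shift is applied to a finitely supported sequence stored as a list.

```python
import math

def rotate_z(v, theta):
    # Part (a): rotation about the z-axis by the angle theta.
    x, y, z = v
    return (x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta),
            z)

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def right_shift(a):
    # Part (b): (a1, a2, ...) -> (0, a1, a2, ...) on a finitely supported sequence.
    return [0] + list(a)

v = (1.0, -2.0, 3.0)
w = rotate_z(v, 0.7)
a = [3, 4]
print(norm(v), norm(w))                # equal up to rounding
print(norm(a), norm(right_shift(a)))   # the shift also preserves length
```

Every shifted sequence starts with 0, so $(1, 0, 0, \ldots)$ is never reached, which is the failure of surjectivity noted in part (b).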

An isomorphism from one inner product space into another is a bijective mapping that preserves the three basic operations of an inner product space: vector addition, scalar multiplication, and inner products. Thus, the above mappings (orthogonal and unitary) may also be characterized as the isomorphisms of $V$ into itself. Note that such a mapping $U$ also preserves distances, because
$$\|U(v) - U(w)\| = \|U(v - w)\| = \|v - w\|$$
Hence, $U$ is called an isometry.

13.6

Orthogonal and Unitary Matrices

Let U be a linear operator on an inner product space V. By Theorem 13.1, we obtain the following results.
THEOREM 13.7A: A complex matrix $A$ represents a unitary operator $U$ (relative to an orthonormal basis) if and only if $A^* = A^{-1}$.

THEOREM 13.7B: A real matrix $A$ represents an orthogonal operator $U$ (relative to an orthonormal basis) if and only if $A^T = A^{-1}$.

The above theorems motivate the following definitions (which appeared in Sections 2.10 and 2.11).

DEFINITION: A complex matrix $A$ for which $A^* = A^{-1}$ is called a unitary matrix.

DEFINITION: A real matrix $A$ for which $A^T = A^{-1}$ is called an orthogonal matrix.

We repeat Theorem 2.6, which characterizes the above matrices.
THEOREM 13.8:

The following conditions on a matrix A are equivalent: (i) A is unitary (orthogonal). (ii) The rows of A form an orthonormal set. (iii) The columns of A form an orthonormal set.
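As a quick illustration of conditions (ii) and (iii), the sketch below tests whether the rows and the columns of a matrix form orthonormal sets under $\langle u, v\rangle = u^T\bar{v}$. The sample matrix `A` and the helper `is_orthonormal` are hypothetical, chosen only for this check.

```python
import math

def inner(u, v):
    # <u, v> = sum of u_k * conj(v_k)
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))

def is_orthonormal(vectors, tol=1e-12):
    # True when <v_i, v_j> is 1 for i == j and 0 otherwise (up to tol).
    for i, u in enumerate(vectors):
        for j, v in enumerate(vectors):
            expected = 1 if i == j else 0
            if abs(inner(u, v) - expected) > tol:
                return False
    return True

s = 1 / math.sqrt(2)
A = [[s, s], [s * 1j, -s * 1j]]        # illustrative unitary matrix
rows = A
cols = [list(c) for c in zip(*A)]

rows_ok = is_orthonormal(rows)
cols_ok = is_orthonormal(cols)
not_unitary = is_orthonormal([[1, 1], [0, 1]])
print(rows_ok, cols_ok, not_unitary)
```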

13.7

Change of Orthonormal Basis

Orthonormal bases play a special role in the theory of inner product spaces V. Thus, we are naturally interested in the properties of the change-of-basis matrix from one such basis to another. The following theorem (proved in Problem 13.12) holds.
THEOREM 13.9: Let $\{u_1, \ldots, u_n\}$ be an orthonormal basis of an inner product space $V$. Then the change-of-basis matrix from $\{u_i\}$ into another orthonormal basis is unitary (orthogonal). Conversely, if $P = [a_{ij}]$ is a unitary (orthogonal) matrix, then the following is an orthonormal basis:
$$\{u'_i = a_{1i}u_1 + a_{2i}u_2 + \cdots + a_{ni}u_n : i = 1, \ldots, n\}$$

Recall that matrices $A$ and $B$ representing the same linear operator $T$ are similar; that is, $B = P^{-1}AP$, where $P$ is the (nonsingular) change-of-basis matrix. On the other hand, if $V$ is an inner product space, we are usually interested in the case when $P$ is unitary (or orthogonal) as suggested by Theorem 13.9. (Recall that $P$ is unitary if the conjugate transpose $P^* = P^{-1}$, and $P$ is orthogonal if the transpose $P^T = P^{-1}$.) This leads to the following definition.

DEFINITION: Complex matrices $A$ and $B$ are unitarily equivalent if there exists a unitary matrix $P$ for which $B = P^*AP$. Analogously, real matrices $A$ and $B$ are orthogonally equivalent if there exists an orthogonal matrix $P$ for which $B = P^TAP$.

Note that orthogonally equivalent matrices are necessarily congruent.

13.8

Positive Definite and Positive Operators

Let $P$ be a linear operator on an inner product space $V$. Then
(i) $P$ is said to be positive definite if $P = S^*S$ for some nonsingular operator $S$.
(ii) $P$ is said to be positive (or nonnegative or semidefinite) if $P = S^*S$ for some operator $S$.
The following theorems give alternative characterizations of these operators.

THEOREM 13.10A: The following conditions on an operator $P$ are equivalent:
(i) $P = T^2$ for some nonsingular self-adjoint operator $T$.
(ii) $P$ is positive definite.
(iii) $P$ is self-adjoint and $\langle P(u), u\rangle > 0$ for every $u \neq 0$ in $V$.

The corresponding theorem for positive operators (proved in Problem 13.21) follows.

THEOREM 13.10B: The following conditions on an operator $P$ are equivalent:
(i) $P = T^2$ for some self-adjoint operator $T$.
(ii) $P$ is positive; that is, $P = S^*S$.
(iii) $P$ is self-adjoint and $\langle P(u), u\rangle \geq 0$ for every $u \in V$.
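The difference between the two theorems shows up as soon as $S$ is singular. The following sketch, with a hypothetical real matrix `S` and sample vectors chosen for illustration, forms $P = S^TS$ and evaluates $\langle P(u), u\rangle$, which is nonnegative for every sample and vanishes at a nonzero vector.

```python
def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def matvec(M, x):
    return [sum(m * xj for m, xj in zip(row, x)) for row in M]

def inner(u, v):
    # Real inner product <u, v> = u^T v.
    return sum(a * b for a, b in zip(u, v))

S = [[1, 1], [1, 1]]            # singular, so P is positive but not positive definite
P = matmul(transpose(S), S)     # P = S^T S

samples = [(1, 0), (0, 1), (1, -1), (2, 3), (-1, 4)]
values = [inner(matvec(P, u), u) for u in samples]   # <P(u), u>
norms_sq = [inner(matvec(S, u), matvec(S, u)) for u in samples]  # ||S(u)||^2
print(P, values)
```

The two lists coincide because $\langle S^*S(u), u\rangle = \langle S(u), S(u)\rangle$, the same step used in the proof of Theorem 13.4(iv).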

13.9

Diagonalization and Canonical Forms in Inner Product Spaces

Let $T$ be a linear operator on a finite-dimensional inner product space $V$ over $K$. Representing $T$ by a diagonal matrix depends upon the eigenvectors and eigenvalues of $T$, and hence, upon the roots of the characteristic polynomial $\Delta(t)$ of $T$. Now $\Delta(t)$ always factors into linear polynomials over the complex field $\mathbf{C}$ but may not have any linear factors over the real field $\mathbf{R}$. Thus, the situation for real inner product spaces (sometimes called Euclidean spaces) is inherently different from the situation for complex inner product spaces (sometimes called unitary spaces), and we treat them separately.

Real Inner Product Spaces, Symmetric and Orthogonal Operators
The following theorem (proved in Problem 13.14) holds.
THEOREM 13.11: Let $T$ be a symmetric (self-adjoint) operator on a real finite-dimensional inner product space $V$. Then there exists an orthonormal basis of $V$ consisting of eigenvectors of $T$; that is, $T$ can be represented by a diagonal matrix relative to an orthonormal basis.

We give the corresponding statement for matrices.

THEOREM 13.11: (Alternative Form) Let $A$ be a real symmetric matrix. Then there exists an orthogonal matrix $P$ such that $B = P^{-1}AP = P^TAP$ is diagonal.

We can choose the columns of the above matrix $P$ to be normalized orthogonal eigenvectors of $A$; then the diagonal entries of $B$ are the corresponding eigenvalues. On the other hand, an orthogonal operator $T$ need not be symmetric, and so it may not be represented by a diagonal matrix relative to an orthonormal basis. However, such an operator $T$ does have a simple canonical representation, as described in the following theorem (proved in Problem 13.16).

THEOREM 13.12: Let $T$ be an orthogonal operator on a real inner product space $V$. Then there exists an orthonormal basis of $V$ in which $T$ is represented by a block diagonal matrix $M$ of the form
$$M = \operatorname{diag}\left(I_s,\ -I_t,\ \begin{bmatrix} \cos\theta_1 & -\sin\theta_1 \\ \sin\theta_1 & \cos\theta_1 \end{bmatrix},\ \ldots,\ \begin{bmatrix} \cos\theta_r & -\sin\theta_r \\ \sin\theta_r & \cos\theta_r \end{bmatrix}\right)$$

The reader may recognize that each of the $2 \times 2$ diagonal blocks represents a rotation in the corresponding two-dimensional subspace, and each diagonal entry $-1$ represents a reflection in the corresponding one-dimensional subspace.

Complex Inner Product Spaces, Normal and Triangular Operators
A linear operator $T$ is said to be normal if it commutes with its adjoint, that is, if $TT^* = T^*T$. We note that normal operators include both self-adjoint and unitary operators. Analogously, a complex matrix $A$ is said to be normal if it commutes with its conjugate transpose, that is, if $AA^* = A^*A$.

EXAMPLE 13.3 Let $A = \begin{bmatrix} 1 & 1 \\ i & 3 + 2i \end{bmatrix}$. Then $A^* = \begin{bmatrix} 1 & -i \\ 1 & 3 - 2i \end{bmatrix}$.
Also $AA^* = \begin{bmatrix} 2 & 3 - 3i \\ 3 + 3i & 14 \end{bmatrix} = A^*A$. Thus, $A$ is normal.
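The products in Example 13.3 can be reproduced directly. This is a minimal sketch using plain Python lists and built-in complex numbers; the helper names `matmul` and `conj_transpose` are illustrative.

```python
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def conj_transpose(M):
    # Entry (j, i) of the result is the conjugate of entry (i, j) of M.
    return [[M[i][j].conjugate() for i in range(len(M))]
            for j in range(len(M[0]))]

A = [[1, 1], [1j, 3 + 2j]]      # the matrix A of Example 13.3
A_star = conj_transpose(A)

left = matmul(A, A_star)        # A A*
right = matmul(A_star, A)       # A* A
print(left)
print(right)
```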

The following theorem (proved in Problem 13.19) holds.
THEOREM 13.13:

Let T be a normal operator on a complex finite-dimensional inner product space V. Then there exists an orthonormal basis of V consisting of eigenvectors of T ; that is, T can be represented by a diagonal matrix relative to an orthonormal basis.

We give the corresponding statement for matrices.
THEOREM 13.13: (Alternative Form) Let $A$ be a normal matrix. Then there exists a unitary matrix $P$ such that $B = P^{-1}AP = P^*AP$ is diagonal.

The following theorem (proved in Problem 13.20) shows that even nonnormal operators on unitary spaces have a relatively simple form.

THEOREM 13.14: Let $T$ be an arbitrary operator on a complex finite-dimensional inner product space $V$. Then $T$ can be represented by a triangular matrix relative to an orthonormal basis of $V$.

THEOREM 13.14: (Alternative Form) Let $A$ be an arbitrary complex matrix. Then there exists a unitary matrix $P$ such that $B = P^{-1}AP = P^*AP$ is triangular.

13.10

Spectral Theorem

The Spectral Theorem is a reformulation of the diagonalization Theorems 13.11 and 13.13.
THEOREM 13.15: (Spectral Theorem) Let $T$ be a normal (symmetric) operator on a complex (real) finite-dimensional inner product space $V$. Then there exist linear operators $E_1, \ldots, E_r$ on $V$ and scalars $\lambda_1, \ldots, \lambda_r$ such that
(i) $T = \lambda_1 E_1 + \lambda_2 E_2 + \cdots + \lambda_r E_r$,
(ii) $E_1 + E_2 + \cdots + E_r = I$,
(iii) $E_1^2 = E_1,\ E_2^2 = E_2,\ \ldots,\ E_r^2 = E_r$,
(iv) $E_i E_j = 0$ for $i \neq j$.


The above linear operators $E_1, \ldots, E_r$ are projections in the sense that $E_i^2 = E_i$. Moreover, they are said to be orthogonal projections because they have the additional property that $E_i E_j = 0$ for $i \neq j$. The following example shows the relationship between a diagonal matrix representation and the corresponding orthogonal projections.

EXAMPLE 13.4 Consider the following diagonal matrices $A, E_1, E_2, E_3$:
$$A = \operatorname{diag}(2, 3, 3, 5), \quad E_1 = \operatorname{diag}(1, 0, 0, 0), \quad E_2 = \operatorname{diag}(0, 1, 1, 0), \quad E_3 = \operatorname{diag}(0, 0, 0, 1)$$
The reader can verify that (i) $A = 2E_1 + 3E_2 + 5E_3$, (ii) $E_1 + E_2 + E_3 = I$, (iii) $E_i^2 = E_i$, (iv) $E_i E_j = 0$ for $i \neq j$.
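The four properties in Example 13.4 can be verified mechanically. The sketch below stores each diagonal matrix as a full list of rows; the helper names `diag`, `matmul`, and `lincomb` are illustrative assumptions.

```python
def diag(entries):
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def lincomb(coeffs, mats):
    # Entrywise sum of c * M over the given coefficients and matrices.
    n = len(mats[0])
    return [[sum(c * M[i][j] for c, M in zip(coeffs, mats))
             for j in range(n)] for i in range(n)]

A = diag([2, 3, 3, 5])
E = [diag([1, 0, 0, 0]), diag([0, 1, 1, 0]), diag([0, 0, 0, 1])]
I = diag([1, 1, 1, 1])
Z = diag([0, 0, 0, 0])

check_i = lincomb([2, 3, 5], E) == A                       # A = 2E1 + 3E2 + 5E3
check_ii = lincomb([1, 1, 1], E) == I                      # E1 + E2 + E3 = I
check_iii = all(matmul(Ei, Ei) == Ei for Ei in E)          # Ei^2 = Ei
check_iv = all(matmul(E[i], E[j]) == Z
               for i in range(3) for j in range(3) if i != j)  # EiEj = 0
print(check_i, check_ii, check_iii, check_iv)
```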

SOLVED PROBLEMS
Adjoints
13.1. Find the adjoint of $F: \mathbf{R}^3 \to \mathbf{R}^3$ defined by
$$F(x, y, z) = (3x + 4y - 5z,\ 2x - 6y + 7z,\ 5x - 9y + z)$$
First find the matrix $A$ that represents $F$ in the usual basis of $\mathbf{R}^3$ (that is, the matrix $A$ whose rows are the coefficients of $x, y, z$) and then form the transpose $A^T$ of $A$. This yields
$$A = \begin{bmatrix} 3 & 4 & -5 \\ 2 & -6 & 7 \\ 5 & -9 & 1 \end{bmatrix} \quad\text{and then}\quad A^T = \begin{bmatrix} 3 & 2 & 5 \\ 4 & -6 & -9 \\ -5 & 7 & 1 \end{bmatrix}$$
The adjoint $F^*$ is represented by the transpose of $A$; hence,
$$F^*(x, y, z) = (3x + 2y + 5z,\ 4x - 6y - 9z,\ -5x + 7y + z)$$
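The answer to Problem 13.1 can be sanity-checked against the defining identity $\langle F(u), v\rangle = \langle u, F^*(v)\rangle$. The test vectors `u` and `v` below are arbitrary illustrative choices; a single pair does not prove the identity, it only guards against arithmetic slips.

```python
def F(vec):
    x, y, z = vec
    return (3*x + 4*y - 5*z, 2*x - 6*y + 7*z, 5*x - 9*y + z)

def F_star(vec):
    # Adjoint read off from the transpose of the matrix of F.
    x, y, z = vec
    return (3*x + 2*y + 5*z, 4*x - 6*y - 9*z, -5*x + 7*y + z)

def inner(u, v):
    # Real inner product <u, v> = u^T v.
    return sum(a * b for a, b in zip(u, v))

u = (1, -2, 3)
v = (4, 0, -1)
lhs = inner(F(u), v)        # <F(u), v>
rhs = inner(u, F_star(v))   # <u, F*(v)>
print(lhs, rhs)
```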

13.2.

Find the adjoint of $G: \mathbf{C}^3 \to \mathbf{C}^3$ defined by
$$G(x, y, z) = [2x + (1 - i)y,\ (3 + 2i)x - 4iz,\ 2ix + (4 - 3i)y - 3z]$$
First find the matrix $B$ that represents $G$ in the usual basis of $\mathbf{C}^3$, and then form the conjugate transpose $B^*$ of $B$. This yields
$$B = \begin{bmatrix} 2 & 1 - i & 0 \\ 3 + 2i & 0 & -4i \\ 2i & 4 - 3i & -3 \end{bmatrix} \quad\text{and then}\quad B^* = \begin{bmatrix} 2 & 3 - 2i & -2i \\ 1 + i & 0 & 4 + 3i \\ 0 & 4i & -3 \end{bmatrix}$$
Then $G^*(x, y, z) = [2x + (3 - 2i)y - 2iz,\ (1 + i)x + (4 + 3i)z,\ 4iy - 3z]$.

13.3.

Prove Theorem 13.3: Let $f$ be a linear functional on an $n$-dimensional inner product space $V$. Then there exists a unique vector $u \in V$ such that $f(v) = \langle v, u\rangle$ for every $v \in V$.
Let $\{w_1, \ldots, w_n\}$ be an orthonormal basis of $V$. Set
$$u = \overline{f(w_1)}\,w_1 + \overline{f(w_2)}\,w_2 + \cdots + \overline{f(w_n)}\,w_n$$
Let $\hat{u}$ be the linear functional on $V$ defined by $\hat{u}(v) = \langle v, u\rangle$ for every $v \in V$. Then, for $i = 1, \ldots, n$,
$$\hat{u}(w_i) = \langle w_i, u\rangle = \langle w_i,\ \overline{f(w_1)}\,w_1 + \cdots + \overline{f(w_n)}\,w_n\rangle = f(w_i)$$
Because $\hat{u}$ and $f$ agree on each basis vector, $\hat{u} = f$.
Now suppose $u'$ is another vector in $V$ for which $f(v) = \langle v, u'\rangle$ for every $v \in V$. Then $\langle v, u\rangle = \langle v, u'\rangle$ or $\langle v, u - u'\rangle = 0$. In particular, this is true for $v = u - u'$, and so $\langle u - u', u - u'\rangle = 0$. This yields $u - u' = 0$ and $u = u'$. Thus, such a vector $u$ is unique, as claimed.
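The construction of $u$ in this proof is easy to carry out in coordinates. The sketch below assumes $V = \mathbf{C}^2$ with the usual basis and a made-up functional `phi`; both the functional and the helper name `riesz_vector` are illustrative, not from the text.

```python
def inner(u, v):
    # <u, v> = sum of u_k * conj(v_k)
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))

def riesz_vector(phi, basis):
    """u = sum of conj(phi(w_i)) * w_i over an orthonormal basis {w_i}."""
    u = [0j] * len(basis[0])
    for w in basis:
        c = complex(phi(w)).conjugate()
        u = [ui + c * wi for ui, wi in zip(u, w)]
    return u

def phi(v):
    # Hypothetical linear functional on C^2.
    return (2 + 1j) * v[0] - 3j * v[1]

basis = [(1, 0), (0, 1)]           # usual orthonormal basis of C^2
u = riesz_vector(phi, basis)

v = [1 - 2j, 4 + 1j]
print(phi(v), inner(v, u))         # the two values should agree
```

The conjugates on the coefficients are what make $\langle w_i, u\rangle$ come out as $f(w_i)$, since the inner product is conjugate linear in its second argument.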

13.4.

Prove Theorem 13.1: Let $T$ be a linear operator on an $n$-dimensional inner product space $V$. Then
(a) There exists a unique linear operator $T^*$ on $V$ such that $\langle T(u), v\rangle = \langle u, T^*(v)\rangle$ for all $u, v \in V$.
(b) Let $A$ be the matrix that represents $T$ relative to an orthonormal basis $S = \{u_i\}$. Then the conjugate transpose $A^*$ of $A$ represents $T^*$ in the basis $S$.

(a) We first define the mapping $T^*$. Let $v$ be an arbitrary but fixed element of $V$. The map $u \mapsto \langle T(u), v\rangle$ is a linear functional on $V$. Hence, by Theorem 13.3, there exists a unique element $v' \in V$ such that $\langle T(u), v\rangle = \langle u, v'\rangle$ for every $u \in V$. We define $T^*: V \to V$ by $T^*(v) = v'$. Then $\langle T(u), v\rangle = \langle u, T^*(v)\rangle$ for every $u, v \in V$.
We next show that $T^*$ is linear. For any $u, v_i \in V$, and any $a, b \in K$,
$$\langle u, T^*(av_1 + bv_2)\rangle = \langle T(u), av_1 + bv_2\rangle = \bar{a}\langle T(u), v_1\rangle + \bar{b}\langle T(u), v_2\rangle = \bar{a}\langle u, T^*(v_1)\rangle + \bar{b}\langle u, T^*(v_2)\rangle = \langle u, aT^*(v_1) + bT^*(v_2)\rangle$$
But this is true for every $u \in V$; hence, $T^*(av_1 + bv_2) = aT^*(v_1) + bT^*(v_2)$. Thus, $T^*$ is linear.
(b) The matrices $A = [a_{ij}]$ and $B = [b_{ij}]$ that represent $T$ and $T^*$, respectively, relative to the orthonormal basis $S$ are given by $a_{ij} = \langle T(u_j), u_i\rangle$ and $b_{ij} = \langle T^*(u_j), u_i\rangle$ (Problem 13.67). Hence,
$$b_{ij} = \langle T^*(u_j), u_i\rangle = \overline{\langle u_i, T^*(u_j)\rangle} = \overline{\langle T(u_i), u_j\rangle} = \overline{a_{ji}}$$
Thus, $B = A^*$, as claimed.

13.5.

Prove Theorem 13.2: (i) $(T_1 + T_2)^* = T_1^* + T_2^*$, (ii) $(kT)^* = \bar{k}T^*$, (iii) $(T_1 T_2)^* = T_2^* T_1^*$, (iv) $(T^*)^* = T$.
(i) For any $u, v \in V$,
$$\langle (T_1 + T_2)(u), v\rangle = \langle T_1(u) + T_2(u), v\rangle = \langle T_1(u), v\rangle + \langle T_2(u), v\rangle = \langle u, T_1^*(v)\rangle + \langle u, T_2^*(v)\rangle = \langle u, T_1^*(v) + T_2^*(v)\rangle = \langle u, (T_1^* + T_2^*)(v)\rangle$$
The uniqueness of the adjoint implies $(T_1 + T_2)^* = T_1^* + T_2^*$.
(ii) For any $u, v \in V$,
$$\langle (kT)(u), v\rangle = \langle kT(u), v\rangle = k\langle T(u), v\rangle = k\langle u, T^*(v)\rangle = \langle u, \bar{k}T^*(v)\rangle = \langle u, (\bar{k}T^*)(v)\rangle$$
The uniqueness of the adjoint implies $(kT)^* = \bar{k}T^*$.
(iii) For any $u, v \in V$,
$$\langle (T_1 T_2)(u), v\rangle = \langle T_1(T_2(u)), v\rangle = \langle T_2(u), T_1^*(v)\rangle = \langle u, T_2^*(T_1^*(v))\rangle = \langle u, (T_2^* T_1^*)(v)\rangle$$
The uniqueness of the adjoint implies $(T_1 T_2)^* = T_2^* T_1^*$.
(iv) For any $u, v \in V$,
$$\langle T^*(u), v\rangle = \overline{\langle v, T^*(u)\rangle} = \overline{\langle T(v), u\rangle} = \langle u, T(v)\rangle$$
The uniqueness of the adjoint implies $(T^*)^* = T$.

13.6.

Show that (a) $I^* = I$, and (b) $0^* = 0$.
(a) For every $u, v \in V$, $\langle I(u), v\rangle = \langle u, v\rangle = \langle u, I(v)\rangle$; hence, $I^* = I$.
(b) For every $u, v \in V$, $\langle 0(u), v\rangle = \langle 0, v\rangle = 0 = \langle u, 0\rangle = \langle u, 0(v)\rangle$; hence, $0^* = 0$.

13.7.

Suppose $T$ is invertible. Show that $(T^{-1})^* = (T^*)^{-1}$.
$I = I^* = (TT^{-1})^* = (T^{-1})^*T^*$; hence, $(T^{-1})^* = (T^*)^{-1}$.

13.8.

Let $T$ be a linear operator on $V$, and let $W$ be a $T$-invariant subspace of $V$. Show that $W^\perp$ is invariant under $T^*$.
Let $u \in W^\perp$. If $w \in W$, then $T(w) \in W$ and so $\langle w, T^*(u)\rangle = \langle T(w), u\rangle = 0$. Thus, $T^*(u) \in W^\perp$ because it is orthogonal to every $w \in W$. Hence, $W^\perp$ is invariant under $T^*$.

13.9.

Let $T$ be a linear operator on $V$. Show that each of the following conditions implies $T = 0$:
(i) $\langle T(u), v\rangle = 0$ for every $u, v \in V$.
(ii) $V$ is a complex space, and $\langle T(u), u\rangle = 0$ for every $u \in V$.
(iii) $T$ is self-adjoint and $\langle T(u), u\rangle = 0$ for every $u \in V$.
Give an example of an operator $T$ on a real space $V$ for which $\langle T(u), u\rangle = 0$ for every $u \in V$ but $T \neq 0$. [Thus, (ii) need not hold for a real space $V$.]
(i) Set $v = T(u)$. Then $\langle T(u), T(u)\rangle = 0$, and hence, $T(u) = 0$, for every $u \in V$. Accordingly, $T = 0$.
(ii) By hypothesis, $\langle T(v + w), v + w\rangle = 0$ for any $v, w \in V$. Expanding and setting $\langle T(v), v\rangle = 0$ and $\langle T(w), w\rangle = 0$, we find
$$\langle T(v), w\rangle + \langle T(w), v\rangle = 0 \qquad (1)$$
Note $w$ is arbitrary in (1). Substituting $iw$ for $w$, and using $\langle T(v), iw\rangle = \bar{i}\langle T(v), w\rangle = -i\langle T(v), w\rangle$ and $\langle T(iw), v\rangle = \langle iT(w), v\rangle = i\langle T(w), v\rangle$, we find
$$-i\langle T(v), w\rangle + i\langle T(w), v\rangle = 0$$
Dividing through by $i$ and adding to (1), we obtain $\langle T(w), v\rangle = 0$ for any $v, w \in V$. By (i), $T = 0$.
(iii) By (ii), the result holds for the complex case; hence we need only consider the real case. Expanding $\langle T(v + w), v + w\rangle = 0$, we again obtain (1). Because $T$ is self-adjoint and as it is a real space, we have $\langle T(w), v\rangle = \langle w, T(v)\rangle = \langle T(v), w\rangle$. Substituting this into (1), we obtain $\langle T(v), w\rangle = 0$ for any $v, w \in V$. By (i), $T = 0$.
For an example, consider the linear operator $T$ on $\mathbf{R}^2$ defined by $T(x, y) = (y, -x)$. Then $\langle T(u), u\rangle = 0$ for every $u \in V$, but $T \neq 0$.
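The counterexample $T(x, y) = (y, -x)$ can be probed numerically. As a sketch, the code below evaluates $\langle T(u), u\rangle$ on a few real vectors and then on one complex vector, using the same formula for $T$ on $\mathbf{C}^2$; the sample vectors are illustrative choices.

```python
def inner(u, v):
    # <u, v> = sum of u_k * conj(v_k); reduces to u^T v for real vectors.
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))

def T(u):
    x, y = u
    return (y, -x)

real_samples = [(1, 0), (2, 5), (-3, 4)]
real_values = [inner(T(u), u) for u in real_samples]   # all zero on R^2

u_complex = (1, 1j)
complex_value = inner(T(u_complex), u_complex)         # nonzero on C^2
print(real_values, complex_value)
```

The nonzero value on the complex vector is consistent with part (ii): on a complex space the condition $\langle T(u), u\rangle = 0$ for every $u$ would force $T = 0$, so it must fail for this $T$.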

Orthogonal and Unitary Operators and Matrices
13.10. Prove Theorem 13.6: The following conditions on an operator $U$ are equivalent: (i) $U^* = U^{-1}$; that is, $U$ is unitary. (ii) $\langle U(v), U(w)\rangle = \langle v, w\rangle$. (iii) $\|U(v)\| = \|v\|$.
Suppose (i) holds. Then, for every $v, w \in V$,
$$\langle U(v), U(w)\rangle = \langle v, U^*U(w)\rangle = \langle v, I(w)\rangle = \langle v, w\rangle$$
Thus, (i) implies (ii). Now if (ii) holds, then
$$\|U(v)\| = \sqrt{\langle U(v), U(v)\rangle} = \sqrt{\langle v, v\rangle} = \|v\|$$
Hence, (ii) implies (iii). It remains to show that (iii) implies (i).
Suppose (iii) holds. Then for every $v \in V$,
$$\langle U^*U(v), v\rangle = \langle U(v), U(v)\rangle = \langle v, v\rangle = \langle I(v), v\rangle$$
Hence, $\langle (U^*U - I)(v), v\rangle = 0$ for every $v \in V$. But $U^*U - I$ is self-adjoint (Prove!); then, by Problem 13.9, we have $U^*U - I = 0$ and so $U^*U = I$. Thus, $U^* = U^{-1}$, as claimed.


13.11. Let $U$ be a unitary (orthogonal) operator on $V$, and let $W$ be a subspace invariant under $U$. Show that $W^\perp$ is also invariant under $U$.
Because $U$ is nonsingular, $U(W) = W$; that is, for any $w \in W$, there exists $w' \in W$ such that $U(w') = w$. Now let $v \in W^\perp$. Then, for any $w \in W$,
$$\langle U(v), w\rangle = \langle U(v), U(w')\rangle = \langle v, w'\rangle = 0$$
Thus, $U(v)$ belongs to $W^\perp$. Therefore, $W^\perp$ is invariant under $U$.

13.12. Prove Theorem 13.9: The change-of-basis matrix from an orthonormal basis $\{u_1, \ldots, u_n\}$ into another orthonormal basis is unitary (orthogonal). Conversely, if $P = [a_{ij}]$ is a unitary (orthogonal) matrix, then the vectors $u'_i = \sum_j a_{ji}u_j$ form an orthonormal basis.
Suppose $\{v_i\}$ is another orthonormal basis and suppose
$$v_i = b_{i1}u_1 + b_{i2}u_2 + \cdots + b_{in}u_n, \quad i = 1, \ldots, n \qquad (1)$$
Because $\{v_i\}$ is orthonormal,
$$\delta_{ij} = \langle v_i, v_j\rangle = b_{i1}\overline{b_{j1}} + b_{i2}\overline{b_{j2}} + \cdots + b_{in}\overline{b_{jn}} \qquad (2)$$
Let $B = [b_{ij}]$ be the matrix of coefficients in (1). (Then $B^T$ is the change-of-basis matrix from $\{u_i\}$ to $\{v_i\}$.) Then $BB^* = [c_{ij}]$, where $c_{ij} = b_{i1}\overline{b_{j1}} + b_{i2}\overline{b_{j2}} + \cdots + b_{in}\overline{b_{jn}}$. By (2), $c_{ij} = \delta_{ij}$, and therefore $BB^* = I$. Accordingly, $B$, and hence, $B^T$, is unitary.
It remains to prove that $\{u'_i\}$ is orthonormal. By Problem 13.67,
$$\langle u'_i, u'_j\rangle = a_{1i}\overline{a_{1j}} + a_{2i}\overline{a_{2j}} + \cdots + a_{ni}\overline{a_{nj}} = \langle C_i, C_j\rangle$$
where $C_i$ denotes the $i$th column of the unitary (orthogonal) matrix $P = [a_{ij}]$. Because $P$ is unitary (orthogonal), its columns are orthonormal; hence, $\langle u'_i, u'_j\rangle = \langle C_i, C_j\rangle = \delta_{ij}$. Thus, $\{u'_i\}$ is an orthonormal basis.

Symmetric Operators and Canonical Forms in Euclidean Spaces
13.13. Let $T$ be a symmetric operator. Show that (a) the characteristic polynomial $\Delta(t)$ of $T$ is a product of linear polynomials (over $\mathbf{R}$); (b) $T$ has a nonzero eigenvector.
(a) Let $A$ be a matrix representing $T$ relative to an orthonormal basis of $V$; then $A = A^T$. Let $\Delta(t)$ be the characteristic polynomial of $A$. Viewing $A$ as a complex self-adjoint operator, $A$ has only real eigenvalues by Theorem 13.4. Thus,
$$\Delta(t) = (t - \lambda_1)(t - \lambda_2) \cdots (t - \lambda_n)$$
where the $\lambda_i$ are all real. In other words, $\Delta(t)$ is a product of linear polynomials over $\mathbf{R}$.
(b) By (a), $T$ has at least one (real) eigenvalue. Hence, $T$ has a nonzero eigenvector.

13.14. Prove Theorem 13.11: Let $T$ be a symmetric operator on a real $n$-dimensional inner product space $V$. Then there exists an orthonormal basis of $V$ consisting of eigenvectors of $T$. (Hence, $T$ can be represented by a diagonal matrix relative to an orthonormal basis.)
The proof is by induction on the dimension of $V$. If $\dim V = 1$, the theorem trivially holds. Now suppose $\dim V = n > 1$. By Problem 13.13, there exists a nonzero eigenvector $v_1$ of $T$. Let $W$ be the space spanned by $v_1$, and let $u_1$ be a unit vector in $W$, e.g., let $u_1 = v_1/\|v_1\|$.
Because $v_1$ is an eigenvector of $T$, the subspace $W$ of $V$ is invariant under $T$. By Problem 13.8, $W^\perp$ is invariant under $T^* = T$. Thus, the restriction $\hat{T}$ of $T$ to $W^\perp$ is a symmetric operator. By Theorem 7.4, $V = W \oplus W^\perp$. Hence, $\dim W^\perp = n - 1$, because $\dim W = 1$. By induction, there exists an orthonormal basis $\{u_2, \ldots, u_n\}$ of $W^\perp$ consisting of eigenvectors of $\hat{T}$ and hence of $T$. But $\langle u_1, u_i\rangle = 0$ for $i = 2, \ldots, n$ because $u_i \in W^\perp$. Accordingly, $\{u_1, u_2, \ldots, u_n\}$ is an orthonormal set and consists of eigenvectors of $T$. Thus, the theorem is proved.


13.15. Let $q(x, y) = 3x^2 - 6xy + 11y^2$. Find an orthonormal change of coordinates (linear substitution) that diagonalizes the quadratic form $q$.
Find the symmetric matrix $A$ representing $q$ and its characteristic polynomial $\Delta(t)$. We have
$$A = \begin{bmatrix} 3 & -3 \\ -3 & 11 \end{bmatrix} \quad\text{and}\quad \Delta(t) = t^2 - \operatorname{tr}(A)\,t + |A| = t^2 - 14t + 24 = (t - 2)(t - 12)$$
The eigenvalues are $\lambda = 2$ and $\lambda = 12$. Hence, a diagonal form of $q$ is
$$q(s, t) = 2s^2 + 12t^2$$
(where we use $s$ and $t$ as new variables). The corresponding orthogonal change of coordinates is obtained by finding an orthogonal set of eigenvectors of $A$.
Subtract $\lambda = 2$ down the diagonal of $A$ to obtain the matrix
$$M = \begin{bmatrix} 1 & -3 \\ -3 & 9 \end{bmatrix} \quad\text{corresponding to}\quad \begin{aligned} x - 3y &= 0 \\ -3x + 9y &= 0 \end{aligned} \quad\text{or}\quad x - 3y = 0$$
A nonzero solution is $u_1 = (3, 1)$. Next subtract $\lambda = 12$ down the diagonal of $A$ to obtain the matrix
$$M = \begin{bmatrix} -9 & -3 \\ -3 & -1 \end{bmatrix} \quad\text{corresponding to}\quad \begin{aligned} -9x - 3y &= 0 \\ -3x - y &= 0 \end{aligned} \quad\text{or}\quad -3x - y = 0$$
A nonzero solution is $u_2 = (-1, 3)$. Normalize $u_1$ and $u_2$ to obtain the orthonormal basis
$$\hat{u}_1 = (3/\sqrt{10},\ 1/\sqrt{10}), \quad \hat{u}_2 = (-1/\sqrt{10},\ 3/\sqrt{10})$$
Now let $P$ be the matrix whose columns are $\hat{u}_1$ and $\hat{u}_2$. Then
$$P = \begin{bmatrix} 3/\sqrt{10} & -1/\sqrt{10} \\ 1/\sqrt{10} & 3/\sqrt{10} \end{bmatrix} \quad\text{and}\quad D = P^{-1}AP = P^TAP = \begin{bmatrix} 2 & 0 \\ 0 & 12 \end{bmatrix}$$
Thus, the required orthogonal change of coordinates is
$$\begin{bmatrix} x \\ y \end{bmatrix} = P\begin{bmatrix} s \\ t \end{bmatrix} \quad\text{or}\quad x = \frac{3s - t}{\sqrt{10}}, \quad y = \frac{s + 3t}{\sqrt{10}}$$
One can also express $s$ and $t$ in terms of $x$ and $y$ by using $P^{-1} = P^T$; that is,
$$s = \frac{3x + y}{\sqrt{10}}, \quad t = \frac{-x + 3y}{\sqrt{10}}$$
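The diagonalization in Problem 13.15 can be confirmed numerically. The following sketch multiplies out $P^TAP$ in floating point and also compares $q(x, y)$ with $2s^2 + 12t^2$ at one sample point; the point `(s, t) = (1.5, -0.5)` and the helper names are illustrative.

```python
import math

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A = [[3, -3], [-3, 11]]
r = math.sqrt(10)
P = [[3 / r, -1 / r], [1 / r, 3 / r]]     # columns are the normalized eigenvectors

D = matmul(matmul(transpose(P), A), P)    # expected: diag(2, 12) up to rounding
PtP = matmul(transpose(P), P)             # expected: identity, so P is orthogonal

def q(x, y):
    return 3 * x * x - 6 * x * y + 11 * y * y

s, t = 1.5, -0.5                          # arbitrary sample point
x, y = (3 * s - t) / r, (s + 3 * t) / r   # the change of coordinates
print(D, q(x, y), 2 * s * s + 12 * t * t)
```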

13.16. Prove Theorem 13.12: Let $T$ be an orthogonal operator on a real inner product space $V$. Then there exists an orthonormal basis of $V$ in which $T$ is represented by a block diagonal matrix $M$ of the form
$$M = \operatorname{diag}\left(1, \ldots, 1,\ -1, \ldots, -1,\ \begin{bmatrix} \cos\theta_1 & -\sin\theta_1 \\ \sin\theta_1 & \cos\theta_1 \end{bmatrix},\ \ldots,\ \begin{bmatrix} \cos\theta_r & -\sin\theta_r \\ \sin\theta_r & \cos\theta_r \end{bmatrix}\right)$$

Let $S = T + T^{-1} = T + T^*$. Then $S^* = (T + T^*)^* = T^* + T = S$. Thus, $S$ is a symmetric operator on $V$. By Theorem 13.11, there exists an orthonormal basis of $V$ consisting of eigenvectors of $S$. If $\lambda_1, \ldots, \lambda_m$ denote the distinct eigenvalues of $S$, then $V$ can be decomposed into the direct sum $V = V_1 \oplus V_2 \oplus \cdots \oplus V_m$ where the $V_i$ consists of the eigenvectors of $S$ belonging to $\lambda_i$. We claim that each $V_i$ is invariant under $T$. For suppose $v \in V_i$; then $S(v) = \lambda_i v$ and
$$S(T(v)) = (T + T^{-1})T(v) = T(T + T^{-1})(v) = TS(v) = T(\lambda_i v) = \lambda_i T(v)$$
That is, $T(v) \in V_i$. Hence, $V_i$ is invariant under $T$. Because the $V_i$ are orthogonal to each other, we can restrict our investigation to the way that $T$ acts on e