Question 1.4.6 of Cherney, Denton, and Waldron (CDW) asks how matrix multiplication works. In particular, part (b) asks you to infer the mechanics of the operation from the mechanics of matrix-vector multiplication. Without giving away the answer completely (which of course you could spoil by asking the Oracle of Google Delphi), here are some hints.
First, let’s think about matrix multiplication and function composition as asked in part (a). What is meant by the composition of matrices? To arrive at a reasonable answer, consider what a function composition is: for functions $f$ and $g$ of $x$, $(f \circ g)(x) = f(g(x))$. Note that I’m being a bit cavalier with the notation to focus on the core concepts. Now, how does this relate to matrices? Recall that we are considering matrices to be linear operators. Hence, a matrix in this sense is a function that operates on a vector argument. Note that you can verify linearity without actually knowing how to perform matrix multiplication.
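To make the composition idea concrete, here is a minimal Python sketch. It uses only the matrix-vector rule, never a matrix-matrix product. The matrices `M` and `N`, the vectors `u` and `v`, the scalar `c`, and the helper names `matvec` and `compose` are all made up for illustration.

```python
def matvec(M, v):
    """Apply matrix M (a list of rows) to vector v using the matrix-vector rule."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]

def compose(f, g):
    """Return the composition f after g, i.e. the function x -> f(g(x))."""
    return lambda x: f(g(x))

# Hypothetical example matrices, chosen only for illustration.
M = [[1, 2], [3, 4]]
N = [[0, 1], [1, 0]]

# Treat each matrix as a function on vectors, then compose the two functions.
M_after_N = compose(lambda v: matvec(M, v), lambda v: matvec(N, v))

# Check linearity of the composition on one example: f(c*u + v) == c*f(u) + f(v).
u, v, c = [1, 0], [2, 5], 3
lhs = M_after_N([c * ui + vi for ui, vi in zip(u, v)])
rhs = [c * a + b for a, b in zip(M_after_N(u), M_after_N(v))]
print(lhs == rhs)
```

A single numerical check is not a proof, but it shows that the composition behaves like a linear operator without anyone having to know what the product of `M` and `N` looks like.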
Now to the question at hand, which is the mechanics of matrix multiplication. CDW claims that you can deduce this from matrix-vector multiplication. Really? Let’s return to first principles and think about a matrix as an operator. Say I have a matrix $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ and a vector $x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$. We know that $Ax = \begin{pmatrix} a x_1 + b x_2 \\ c x_1 + d x_2 \end{pmatrix}$. What about if we operate on another vector, say $y = \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}$? The operation is the same, so all we’re doing is plugging in a different set of variables. Notice that there isn’t any dependence on $x$.
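As a small sketch of this point, the same operator can be applied to two unrelated vectors. The numbers below are arbitrary and `matvec` is a hypothetical helper name, not something from CDW.

```python
def matvec(A, v):
    """Apply matrix A (a list of rows) to vector v using the matrix-vector rule."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in A]

# Arbitrary illustrative values.
A = [[1, 2], [3, 4]]
x = [5, 6]
y = [7, 8]

Ax = matvec(A, x)  # [1*5 + 2*6, 3*5 + 4*6] = [17, 39]
Ay = matvec(A, y)  # [1*7 + 2*8, 3*7 + 4*8] = [23, 53]
print(Ax, Ay)
```

Computing `Ay` never touches `x`: each result comes from the operator and its own argument alone.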
Going from here to matrix multiplication isn’t such a large leap. What you need to remember is that mathematical entities are ultimately abstract representations. How we interpret them is up to us. Matrices can represent numerous things. In the first case, we interpret our matrix as a linear operator. A matrix can also be interpreted as an ordered set of vectors. This may seem strange, but consider that a matrix can be deconstructed into a collection of column vectors. After all, we’ve seen how matrices can represent systems of linear equations, polynomials, and other mathematical structures. So why not sets? Returning to our old friend, the matrix $A$, suppose that instead of operating on a single vector, we want to operate on a collection of vectors. What would that look like? The answer is matrix multiplication.
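If you would rather experiment than stare at symbols, here is one way to play with the idea in Python. It is a sketch under my own assumptions: the values are arbitrary, the helper names `matvec` and `apply_to_columns` are invented, and the collection is stored as a plain list of column vectors. Working out the general entry-by-entry rule from the output is still left to you.

```python
def matvec(A, v):
    """Apply matrix A (a list of rows) to vector v using the matrix-vector rule."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in A]

def apply_to_columns(A, columns):
    """Apply the operator A to each vector in an ordered collection."""
    return [matvec(A, col) for col in columns]

# Arbitrary illustrative values; `columns` is the ordered set of vectors.
A = [[1, 2], [3, 4]]
columns = [[5, 6], [7, 8]]

results = apply_to_columns(A, columns)  # one output vector per input vector

# Place the output vectors side by side as the columns of a new matrix.
product = [list(row) for row in zip(*results)]
print(product)
```

The order of the outputs follows the order of the inputs, which is why thinking of the second matrix as an ordered set of columns is enough.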