TransWikia.com

Should the formula for the inverse of a 2x2 matrix be obvious?

MathOverflow Asked by Frank Thorne on November 21, 2020

As every MO user knows, and can easily prove, the inverse of the matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is $\dfrac{1}{ad - bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$. This can be proved, for example, by writing the inverse as $\begin{pmatrix} r & s \\ t & u \end{pmatrix}$ and solving the resulting system of four equations in four variables.
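The formula is easy to confirm symbolically; here is a quick sanity check using SymPy:

```python
# Symbolic check of the 2x2 inverse formula.
from sympy import Matrix, eye, simplify, symbols

a, b, c, d = symbols("a b c d")
A = Matrix([[a, b], [c, d]])
inv = Matrix([[d, -b], [-c, a]]) / (a * d - b * c)

# A times its claimed inverse should simplify to the identity.
assert simplify(A * inv - eye(2)) == Matrix.zeros(2, 2)
print("formula verified")
```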

As a grad student, when studying the theory of modular forms, I repeatedly forgot this formula (do you switch the $a$ and $d$ and invert the sign of $b$ and $c$ … or was it the other way around?) and continually had to rederive it. Much later, it occurred to me that it was better to remember the formula was obvious in a couple of special cases such as $\begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix}$, and diagonal matrices, for which the geometric intuition is simple. One can also remember this as a special case of the adjugate matrix.

Is there some way to just write down $\dfrac{1}{ad - bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$, even in the case where $ad - bc = 1$, by pure thought—without having to compute? In particular, is there some geometric intuition, in terms of a linear transformation on a two-dimensional vector space, that renders this fact crystal clear?

Or might I just as well be asking how to remember that $43 \times 87$ equals $3741$ and not $3731$?

9 Answers

EDIT (8/14/2020): A couple people have suggested that this answer should come with a warning -- this is a pretty fancy approach to an elementary question, motivated by the fact that I know the OP's interests. Some of the other answers below are probably better if you just want to invert some matrices :). I've also fixed a couple of minor typos.


My favorite way to remember this is to think of $SL_2(\mathbb{R})$ as a circle bundle over the upper half-plane, where $SL_2(\mathbb{R})$ acts on the upper half-plane via fractional linear transformations; then the map sends an element of $SL_2(\mathbb{R})$ to the image of $i$ under the corresponding fractional linear transformation. The fiber over a point is the corresponding coset of the stabilizer of $i$.

This naturally gives the Iwasawa decomposition of $SL_2(\mathbb{R})$ as $$SL_2(\mathbb{R})=NAK$$ where

$$K=\left\{\begin{pmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{pmatrix}, ~0\leq\theta<2\pi \right\}$$

$$A=\left\{\begin{pmatrix} r & 0\\ 0 & 1/r\end{pmatrix},~ r\in \mathbb{R}\setminus\{0\}\right\}$$

$$N=\left\{\begin{pmatrix} 1 & x \\ 0 & 1\end{pmatrix},~ x\in \mathbb{R}\right\}$$

Here $K$ is the stabilizer of $i$ in the upper half-plane picture; viewed as acting on the plane via the usual action of $SL_2(\mathbb{R})$ on $\mathbb{R}^2$ it is just rotation by $\theta$ (and likewise if we view the upper half plane as the unit disk, sending $i$ to $0$ via a fractional linear transformation). $A$ is just scaling by $r^2$, in the upper half-plane picture, and is stretching in the $\mathbb{R}^2$ picture. $N$ is translation by $x$ in the upper half-plane picture, and is a skew transformation in the $\mathbb{R}^2$ picture.

In each case, the inverse is geometrically obvious: for $K$, replace $\theta$ with $-\theta$; for $A$, replace $r$ with $1/r$; and for $N$, replace $x$ with $-x$. Since $$SL_2(\mathbb{R})=NAK$$ this lets us invert every $2\times 2$ matrix by "pure thought", at least if you remember the Iwasawa decomposition (which is easy from the geometric picture, I think). Of course this easily extends to $GL_2$; if $A$ has determinant $d$, then $A^{-1}$ had better have determinant $d^{-1}$.

If you'd like to derive the formula you've written down by "pure thought" it suffices to look at any one of these cases if you remember the general form of the inverse; or you can simply put them all together to give a rigorous derivation.
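To see the decomposition-based inversion concretely, here is a small numerical sketch (the specific values of $x$, $r$, $\theta$ are arbitrary choices for the demo):

```python
# Invert an SL(2,R) matrix via its Iwasawa factors N*A*K, using only the
# "obvious" per-factor inverses (x -> -x, r -> 1/r, theta -> -theta).
import numpy as np

x, r, theta = 0.7, 1.3, 0.4
N = np.array([[1.0, x], [0.0, 1.0]])
A = np.array([[r, 0.0], [0.0, 1.0 / r]])
K = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

M = N @ A @ K  # an element of SL(2,R)

# Invert each factor geometrically, then reverse the order.
N_inv = np.array([[1.0, -x], [0.0, 1.0]])
A_inv = np.array([[1.0 / r, 0.0], [0.0, r]])
K_inv = np.array([[np.cos(-theta), -np.sin(-theta)],
                  [np.sin(-theta),  np.cos(-theta)]])
M_inv = K_inv @ A_inv @ N_inv

assert np.allclose(M @ M_inv, np.eye(2))
print("NAK inversion checks out")
```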

Correct answer by Daniel Litt on November 21, 2020

This question is probably too old to answer, but I couldn't resist. Consider $M_2(\mathbb{R}) = \{a+b\iota : a,b\in \mathbb{C}\}$. The additional relations $\iota^2=1$ and $\iota b=\bar b\iota$ hold, which is enough to multiply $2\times 2$ matrices as split quaternions. For the reader's convenience, $\iota=\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$.

Now the adjugate of $a+b\iota$ is $\bar a-b\iota$. Let's calculate: $(a+b\iota)(\bar a-b\iota)=a\bar a-b\bar b+(-ab+ba)\iota=a\bar a-b\bar b$, because multiplication of complex numbers is commutative.

The determinant of the matrix $a+b\iota$ is $a\bar a-b\bar b$.
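This split-quaternion bookkeeping can be checked numerically; the sketch below embeds complex coefficients as $2\times 2$ real blocks (the values of $a$, $b$ are arbitrary):

```python
# Write a 2x2 real matrix as a + b*iota with a, b complex, iota = diag(1, -1),
# and verify (a + b*iota)(conj(a) - b*iota) = (a*conj(a) - b*conj(b)) * I,
# with that scalar equal to the determinant.
import numpy as np

def embed(z):
    """Real 2x2 matrix of the complex number z (regular representation)."""
    return np.array([[z.real, -z.imag], [z.imag, z.real]])

iota = np.array([[1.0, 0.0], [0.0, -1.0]])
a, b = 2 + 3j, 1 - 2j  # arbitrary complex coefficients for the demo

M = embed(a) + embed(b) @ iota                    # a + b*iota
M_adj = embed(a.conjugate()) - embed(b) @ iota    # conj(a) - b*iota

norm = (a * a.conjugate() - b * b.conjugate()).real
assert np.allclose(M @ M_adj, norm * np.eye(2))
assert np.isclose(np.linalg.det(M), norm)
print("split-quaternion adjugate verified")
```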

Answered by user21230 on November 21, 2020

This is essentially the same as Tobias Hagge's answer and Jonny Evans's comment, but I thought that writing it up in this way would make things clearer.

Think about the product $$ \begin{bmatrix} a & b\\ c & d \end{bmatrix} \begin{bmatrix} ? & ?\\ ? & ? \end{bmatrix} =\begin{bmatrix} ad-bc & 0\\ 0 & ad-bc \end{bmatrix}. $$ Focus on the zero in position $(2,1)$ in the RHS. In order to get it with the row-by-column rule, the first column of the unknown matrix must be $\begin{bmatrix}d\\ -c\end{bmatrix}$.

(Well, apart from the sign --- you could still get it wrong. But you can check that it is correct by computing the $(1,1)$ entry of the product.)

Now focus on the other zero entry, in position $(1,2)$ of the RHS, and you'll see that the second column must be $\begin{bmatrix}-b\\ a\end{bmatrix}$. Again, if you're confused about the sign, just check the $(2,2)$ entry.
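The column-by-column recipe amounts to solving $Ax = \det(A)\,e_k$ for each $k$; a quick numerical check (the matrix is an arbitrary invertible example):

```python
# Each column of the adjugate solves A x = det(A) * e_k.
import numpy as np

A = np.array([[3.0, 5.0], [2.0, 4.0]])  # arbitrary invertible example
det = A[0, 0] * A[1, 1] - A[0, 1] * A[1, 0]

col1 = np.linalg.solve(A, np.array([det, 0.0]))  # forces the (2,1) zero
col2 = np.linalg.solve(A, np.array([0.0, det]))  # forces the (1,2) zero

a, b, c, d = A.ravel()
assert np.allclose(col1, [d, -c])
assert np.allclose(col2, [-b, a])
print("columns match [d, -c] and [-b, a]")
```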

Answered by Federico Poloni on November 21, 2020

I remember the inverse by looking at the corresponding linear fractional transformation. It sends $\frac{-d}{c}$ to $\infty$ and $\infty$ to $\frac{a}{c}$, so the inverse had better reverse this; it follows that the $c$ should stay put and the $a$ and $d$ should switch, and so the $b$ and $c$ get negated.
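Since scaling a matrix doesn't change its fractional linear transformation, the adjugate $\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$ already gives the inverse map, without dividing by the determinant. A small numerical check (the matrix and test points are arbitrary):

```python
# Fractional linear transformations: f(z) = (a z + b)/(c z + d). The
# adjugate [[d, -b], [-c, a]] gives the inverse map, so g(f(z)) = z.
def mobius(m, z):
    (a, b), (c, d) = m
    return (a * z + b) / (c * z + d)

m = ((2.0, 3.0), (1.0, 4.0))        # arbitrary matrix, det = 5
m_adj = ((4.0, -3.0), (-1.0, 2.0))  # [[d, -b], [-c, a]]

for z in (0.5, -2.0, 1 + 1j):
    assert abs(mobius(m_adj, mobius(m, z)) - z) < 1e-12
print("inverse Mobius map verified")
```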

Answered by Grant Lakeland on November 21, 2020

$\bullet$ The sign switch is familiar from complex numbers:

The regular representation of $\mathbb{C}$ over $\mathbb{R}$ is the embedding of $\mathbb{R}$-algebras $\mathbb{C} \to M_2(\mathbb{R})$ defined by $a+ib \mapsto \begin{pmatrix} a & -b \\ b & a \end{pmatrix}$. The inverse of $a+ib$ is the conjugate $a-ib$ divided by the norm $a^2+b^2$, thus the inverse of $\begin{pmatrix} a & -b \\ b & a \end{pmatrix}$ is the adjugate $\begin{pmatrix} a & b \\ -b & a \end{pmatrix}$ divided by the determinant $a^2+b^2$.

$\bullet$ Both the sign switch and the swap of the diagonal entries can be illustrated with quaternions:

The regular representation of $\mathbb{H}$ over $\mathbb{C}$ is the embedding $\mathbb{H} \to M_2(\mathbb{C})$ mapping $u+jv \mapsto \begin{pmatrix} u & v \\ -\overline{v} & \overline{u} \end{pmatrix}$. The inverse of $u + jv$ is the conjugate $\overline{u} - j \overline{v}$ divided by the norm $|u|^2+|v|^2$. Thus, the inverse of $\begin{pmatrix} u & v \\ -\overline{v} & \overline{u} \end{pmatrix}$ is the adjugate $\begin{pmatrix} \overline{u} & -v \\ \overline{v} & u \end{pmatrix}$ divided by the determinant $|u|^2+|v|^2$.
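The quaternion picture, which exhibits both the swap and the sign switch, is easy to verify numerically (the components $u$, $v$ below are arbitrary):

```python
# Quaternions as 2x2 complex matrices: u + j v <-> [[u, v], [-conj(v), conj(u)]].
# The inverse is the adjugate [[conj(u), -v], [conj(v), u]] over |u|^2 + |v|^2.
import numpy as np

u, v = 1 + 2j, 3 - 1j  # arbitrary complex components for the demo
M = np.array([[u, v], [-np.conj(v), np.conj(u)]])
adj = np.array([[np.conj(u), -v], [np.conj(v), u]])
norm = abs(u) ** 2 + abs(v) ** 2

assert np.allclose(M @ adj, norm * np.eye(2))
assert np.allclose(np.linalg.inv(M), adj / norm)
print("quaternion inverse verified")
```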

Answered by Martin Brandenburg on November 21, 2020

Mnemonic: make the product diagonals the determinant, then scale.

The off-diagonals are zero because the area of a parallelogram with planar edge vectors $c_1,c_2$ is the length of the scaled projection $|c_1 \cdot i c_2| = |c_2 \cdot i c_1|$, and the mnemonic sets row $r_k$ in the inverse to $(ic_{3-k})^T$.

Answered by Tobias Hagge on November 21, 2020

My answer is not very highfaluting, but it is what I use to remember. Switch the diagonals, change the signs of the off-diagonals and divide by the determinant. Since the inverse of a diagonal matrix is easy, the switch should be easy to remember. On the other hand such mnemonics are dangerous. The critical points of a cubic $Ax^3+Bx^2+Cx+D$ are at $\frac{-B\pm\sqrt{B^2-3AC}}{3A}$, or so I remember.
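That cubic mnemonic does check out: the critical points are the roots of the derivative $3Ax^2+2Bx+C$. A quick numerical sketch (the coefficients are an arbitrary cubic with real critical points):

```python
# Critical points of A x^3 + B x^2 + C x + D are at
# (-B +/- sqrt(B^2 - 3 A C)) / (3 A), i.e. roots of 3 A x^2 + 2 B x + C.
import math

A, B, C, D = 1.0, -6.0, 9.0, 2.0  # arbitrary cubic with real critical points

disc = B * B - 3 * A * C
for sign in (+1, -1):
    x = (-B + sign * math.sqrt(disc)) / (3 * A)
    deriv = 3 * A * x * x + 2 * B * x + C
    assert abs(deriv) < 1e-9
print("critical-point formula verified")
```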

Answered by Scott Carter on November 21, 2020

Think about $\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$ as $tI - A$ where $t=a+d$ is the trace of $A$. Since $A$ satisfies its own characteristic equation (Cayley-Hamilton), we have $A^2 - t A + \Delta \cdot I = 0$ where $\Delta = ad-bc$ is the determinant. Thus $\Delta \cdot I = t A - A^2$. Now divide both sides by $\Delta \cdot A$ to get $A^{-1} = \Delta^{-1}(tI-A)$, QED.
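The Cayley-Hamilton route can be checked in a couple of lines (the matrix is an arbitrary invertible example):

```python
# Cayley-Hamilton in action: A^2 - t A + Delta I = 0 with t = tr(A),
# Delta = det(A), hence A^{-1} = (t I - A) / Delta.
import numpy as np

A = np.array([[2.0, 5.0], [1.0, 3.0]])  # arbitrary invertible example
t, Delta = np.trace(A), np.linalg.det(A)

assert np.allclose(A @ A - t * A + Delta * np.eye(2), 0)
assert np.allclose(np.linalg.inv(A), (t * np.eye(2) - A) / Delta)
print("Cayley-Hamilton inverse verified")
```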

Answered by Noam D. Elkies on November 21, 2020

Recall that the adjugate $\text{adj}(A)$ of a square matrix is a matrix that satisfies $$A \cdot \text{adj}(A) = \text{adj}(A) \cdot A = \det(A) \cdot I.$$
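The defining identity holds in any dimension; here is a sketch checking it for a $3\times 3$ matrix, with the adjugate built from cofactors (the matrix is an arbitrary example):

```python
# Check A * adj(A) = adj(A) * A = det(A) * I for a 3x3 matrix,
# building the adjugate as the transpose of the cofactor matrix.
import numpy as np

def adjugate(A):
    n = A.shape[0]
    cof = np.empty_like(A)
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            cof[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return cof.T  # adjugate = transpose of the cofactor matrix

A = np.array([[1.0, 2.0, 0.0], [3.0, 1.0, 4.0], [0.0, 2.0, 1.0]])
assert np.allclose(A @ adjugate(A), np.linalg.det(A) * np.eye(3))
assert np.allclose(adjugate(A) @ A, np.linalg.det(A) * np.eye(3))
print("adjugate identity verified")
```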

Like the determinant, the adjugate is multiplicative. Categorically, the reason the determinant is multiplicative is that it comes from a functor (the exterior power), so one might expect that the adjugate also comes from a functor, and indeed it does (the same functor!).

More precisely, let $T : V \to V$ be a linear transformation on a finite-dimensional vector space with basis $e_1, \ldots, e_n$. Then the adjugate of the matrix of $T$ with respect to the basis $e_i$ is the matrix of $\Lambda^{n-1}(T) : \Lambda^{n-1}(V) \to \Lambda^{n-1}(V)$ with respect to an appropriate "dual basis" $$(-1)^{i-1} \bigwedge_{j \neq i} e_j$$ of $\Lambda^{n-1}(V)$ (it becomes an actual dual basis if you identify $\Lambda^n(V)$ with the underlying field $k$ by sending $e_1 \wedge \cdots \wedge e_n$ to $1$). The exterior product $V \times \Lambda^{n-1}(V) \to \Lambda^n(V)$ can then be identified with the dual pairing $V \times V^{\ast} \to k$, and the action of the exterior product on endomorphisms of $V$ and $\Lambda^{n-1}(V)$ can be identified with the composition of endomorphisms of $V$ (remembering that $\text{End}(V)$ is canonically isomorphic to $\text{End}(V^{\ast})$). This categorifies the above statement.

When $n = 2$, the dual basis is $e_2, -e_1$, but $\Lambda^1$ is the identity functor, and the formula follows. The geometric intuition comes from thinking about the exterior product in terms of oriented areas of parallelograms in $\mathbb{R}^2$.

Answered by Qiaochu Yuan on November 21, 2020

