Question

Why is the inverse of an invertible upper triangular matrix also upper triangular?

Answer
Let us examine in detail why the inverse of an invertible upper triangular matrix must itself be an upper triangular matrix.

First, some basic definitions:

Upper triangular matrix: a square matrix whose entries below the main diagonal are all zero; the remaining entries may take any value.
For example:
$$
A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ 0 & a_{22} & a_{23} \\ 0 & 0 & a_{33} \end{pmatrix}
$$

Invertible matrix: a square matrix is invertible if there exists another square matrix whose product with it is the identity matrix. Writing the inverse of $A$ as $A^{-1}$, we have $A A^{-1} = A^{-1} A = I$ (the identity matrix).
For an upper triangular matrix to be invertible, all of its diagonal entries must be non-zero: the determinant of a triangular matrix is the product of its diagonal entries, so if any diagonal entry is zero, the determinant vanishes and the matrix is not invertible.

Identity matrix:
$$
I = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
$$
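As a quick numerical sanity check (not part of the proof), here is a small NumPy snippet illustrating both definitions; the matrix entries are arbitrary example values:

```python
import numpy as np

A = np.array([[2.0, 1.0, 3.0],
              [0.0, 4.0, 5.0],
              [0.0, 0.0, 6.0]])   # upper triangular, non-zero diagonal

# The determinant of a triangular matrix is the product of its diagonal
# entries, so A is invertible exactly when no diagonal entry is zero.
assert np.isclose(np.linalg.det(A), 2.0 * 4.0 * 6.0)

B = np.linalg.inv(A)
print(np.round(B, 4))   # observe: all entries below the diagonal are 0
```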

Proof outline:

We will establish the result from two angles:

1. Via matrix multiplication and the definition of the inverse (an algebraic method)
2. Via invariant subspaces and the linear-transformation viewpoint (a geometric or abstract method)



Method 1: matrix multiplication and the definition of the inverse (algebraic method)

Suppose $A$ is an $n \times n$ invertible upper triangular matrix. There exists a matrix $B$ with $AB = I$; we must show that this $B$ is also upper triangular.

Write the entries of $A$ and $B$ as $a_{ij}$ and $b_{ij}$. Since $A$ is upper triangular, $a_{ij} = 0$ whenever $i > j$:
$$
A = \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
0 & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & a_{nn}
\end{pmatrix}
$$
Because $A$ is invertible, its diagonal entries satisfy $a_{ii} \neq 0$.

We must show that $B$ is upper triangular, which means showing that $b_{ij} = 0$ for all $i > j$.

The equation $AB = I$ says, entry by entry:
$$
(AB)_{ik} = \sum_{j=1}^{n} a_{ij} b_{jk} = \delta_{ik}
$$
where $\delta_{ik}$ is the Kronecker delta: $1$ when $i = k$ and $0$ otherwise.

We focus on the entries of $B$ below the diagonal, i.e. the values $b_{ik}$ with $i > k$.

Consider the $(i, k)$ entry of $AB = I$ with $i > k$:
$$
\sum_{j=1}^{n} a_{ij} b_{jk} = 0 \quad (\text{since } i \neq k)
$$

Expanding the sum:
$$
a_{i1} b_{1k} + a_{i2} b_{2k} + \cdots + a_{ii} b_{ik} + \cdots + a_{in} b_{nk} = 0
$$

Since $A$ is upper triangular, $a_{ij} = 0$ for $j < i$, so every term with $j < i$ drops out and the sum simplifies to:
$$
\sum_{j=i}^{n} a_{ij} b_{jk} = 0
$$
Expanded:
$$
a_{ii} b_{ik} + a_{i, i+1} b_{i+1, k} + \cdots + a_{in} b_{nk} = 0
$$

We now determine the entries of $B$ column by column and show that its lower triangular part vanishes; the computation is a back substitution that can be made rigorous by induction.

It is convenient to view the problem as a family of linear systems. Let $x_k$ denote the $k$-th column of $B$, i.e. $x_k = \begin{pmatrix} b_{1k} \\ b_{2k} \\ \vdots \\ b_{nk} \end{pmatrix}$.
Then the matrix equation $AB = I$ is equivalent to $A x_k = e_k$ for $k = 1, \dots, n$, where $e_k$ is the $k$-th column of the identity matrix (the vector whose $k$-th entry is $1$ and whose other entries are $0$).

For each fixed $k$ ($1 \le k \le n$), we solve the linear system $A x_k = e_k$ to obtain the $k$-th column of $B$, as the snippet below illustrates.
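Before doing the algebra by hand, the column-by-column viewpoint is easy to test numerically; a minimal sketch using a generic solver (the random test matrix is just an assumption for the demo, any invertible upper triangular matrix works):

```python
import numpy as np

n = 4
A = np.triu(np.random.rand(n, n)) + np.eye(n)   # invertible upper triangular
I = np.eye(n)

# Solve A x_k = e_k for each k; the x_k are the columns of B = A^{-1}
B = np.column_stack([np.linalg.solve(A, I[:, k]) for k in range(n)])

assert np.allclose(A @ B, I)          # B is indeed the inverse
assert np.allclose(B, np.triu(B))     # every column k vanishes below row k
```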

Row $i$ of the system $A x_k = e_k$ reads:
$$
\sum_{j=1}^{n} a_{ij} b_{jk} = \delta_{ik}
$$

We want to show that $b_{jk} = 0$ whenever $j > k$.
We start from the last row and substitute back upward.

Consider the last row ($i = n$):
$$
\sum_{j=1}^{n} a_{nj} b_{jk} = \delta_{nk}
$$
Since $A$ is upper triangular, every $a_{nj}$ with $j < n$ is zero, so the equation becomes:
$$
a_{nn} b_{nk} = \delta_{nk}
$$
If $k = n$, then $a_{nn} b_{nn} = 1$; since $a_{nn} \neq 0$, we get $b_{nn} = 1/a_{nn}$. This is a diagonal entry.
If $k < n$, then $a_{nn} b_{nk} = 0$; since $a_{nn} \neq 0$, we get $b_{nk} = 0$. Thus the last row of $B$ is zero except for its diagonal entry $b_{nn}$.

Next consider the second-to-last row ($i = n-1$):
$$
\sum_{j=1}^{n} a_{n-1, j} b_{jk} = \delta_{n-1, k}
$$
Since $A$ is upper triangular, every $a_{n-1, j}$ with $j < n-1$ is zero, so the equation becomes:
$$
a_{n-1, n-1} b_{n-1, k} + a_{n-1, n} b_{nk} = \delta_{n-1, k}
$$
Suppose $n-1 > k$. Then $\delta_{n-1, k} = 0$, and from the previous step $b_{nk} = 0$ (because $n > n-1 > k$). The equation collapses to $a_{n-1, n-1} b_{n-1, k} = 0$, and since $a_{n-1, n-1} \neq 0$, we get $b_{n-1, k} = 0$. Thus the second-to-last row of $B$ is also zero to the left of its diagonal entry.

The same pattern continues as we move upward, and we can formalize it as a downward induction on the row index.

Claim: $b_{ij} = 0$ whenever $i > j$.

Base case ($i = n$): proved above from the last row of the system.

Inductive hypothesis: for some row index $i < n$, suppose we have already shown $b_{pj} = 0$ for every row $p$ with $p > i$ and every column $j$ with $p > j$.

Inductive step: fix a column $j$ with $i > j$ and take row $i$ of $A x_j = e_j$. Since $a_{ik} = 0$ for $k < i$:
$$
\sum_{k=i}^{n} a_{ik} b_{kj} = \delta_{ij} = 0 \quad (\text{since } i > j),
$$
that is,
$$
a_{ii} b_{ij} + \sum_{k=i+1}^{n} a_{ik} b_{kj} = 0.
$$
Every term $b_{kj}$ in the sum has row index $k \ge i+1 > i > j$, so the inductive hypothesis gives $b_{kj} = 0$. The equation reduces to $a_{ii} b_{ij} = 0$, and since $a_{ii} \neq 0$, we conclude $b_{ij} = 0$.

Running this induction from row $n$ up to row $1$ proves $b_{ij} = 0$ for all $i > j$.
Therefore, the inverse matrix $B$ is also an upper triangular matrix.

As a by-product, row $i$ of the system can be solved explicitly for $b_{ij}$:
$$
b_{ij} = \frac{1}{a_{ii}} \left( \delta_{ij} - \sum_{k=i+1}^{n} a_{ik} b_{kj} \right).
$$
This is exactly back substitution: it determines the rows of $B$ from the bottom up. In particular, taking $i = j$ and using $b_{kj} = 0$ for $k > j$ gives $b_{ii} = 1/a_{ii}$, so the diagonal of $A^{-1}$ consists of the reciprocals of the diagonal entries of $A$.
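The recurrence above is precisely an algorithm. Here is a minimal NumPy sketch of it (the function name inv_upper_triangular is my own, not a library routine), assuming a generic invertible upper triangular input:

```python
import numpy as np

def inv_upper_triangular(A):
    """Invert an upper triangular matrix via the back-substitution
    recurrence b_ij = (delta_ij - sum_{k>i} a_ik * b_kj) / a_ii,
    filling in B from the bottom row upward."""
    n = A.shape[0]
    B = np.zeros_like(A, dtype=float)
    for i in range(n - 1, -1, -1):           # rows n, n-1, ..., 1
        for j in range(n):
            delta = 1.0 if i == j else 0.0
            s = A[i, i + 1:] @ B[i + 1:, j]  # sum over k = i+1, ..., n
            B[i, j] = (delta - s) / A[i, i]
    return B

A = np.triu(np.random.rand(5, 5)) + np.eye(5)   # invertible: diagonal > 0
B = inv_upper_triangular(A)
assert np.allclose(A @ B, np.eye(5))            # B really is A^{-1}
assert np.allclose(B, np.triu(B))               # and B is upper triangular
```

Working from the bottom row upward mirrors the induction: by the time row $i$ is processed, every $b_{kj}$ with $k > i$ that the recurrence needs is already known.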



Method 2: invariant subspaces and the linear-transformation viewpoint

This proof is more abstract, but arguably more illuminating.

1. Invertibility and eigenvalues:
A matrix is invertible if and only if all of its eigenvalues are non-zero. For an upper triangular matrix $A$, the eigenvalues are exactly the diagonal entries $a_{ii}$. So invertibility again means $a_{ii} \neq 0$ for every $i$.

2. Subspace invariance:
An upper triangular matrix preserves a natural chain of coordinate subspaces.
For an upper triangular matrix $A = (a_{ij})$, observe:
$A e_1 = (a_{11}, 0, \dots, 0)^T$, which lies in the subspace spanned by $e_1$.
$A e_2 = (a_{12}, a_{22}, 0, \dots, 0)^T$, which lies in the subspace spanned by $e_1, e_2$.
More generally, consider the subspace $V_k = \operatorname{span}\{e_1, \dots, e_k\}$ spanned by the first $k$ standard basis vectors.
For $x \in V_k$, write $x = \sum_{i=1}^k x_i e_i$, so that $Ax = \sum_{i=1}^k x_i (A e_i)$.
Here $A e_i$ is the $i$-th column of $A$; since $A$ is upper triangular, its entries satisfy $a_{ji} = 0$ for $j > i$, so $A e_i = (a_{1i}, a_{2i}, \dots, a_{ii}, 0, \dots, 0)^T$ has non-zero entries only in positions $1, \dots, i$.
Hence for any component index $j > k$, the $j$-th component of $Ax$ is $\sum_{i=1}^k x_i (A e_i)_j = 0$, because $(A e_i)_j = 0$ whenever $j > k \ge i$.
In other words, $A$ maps each subspace $V_k$ into itself: $A(V_k) \subseteq V_k$.

Now consider $A^{-1}$. The property "$M(V_k) \subseteq V_k$ for all $k$" in fact characterizes upper triangular matrices: it holds if and only if $M e_j \in V_j$ for every $j$, i.e. if and only if the $j$-th column of $M$ has zeros below position $j$, which says precisely that $m_{ij} = 0$ for $i > j$. So to prove that $A^{-1}$ is upper triangular, it suffices to show that $A^{-1}$ also maps each $V_k$ into itself.

The rigorous subspace argument:
Let $A$ be an invertible upper triangular matrix, let $V_k = \operatorname{span}(e_1, \dots, e_k)$, and recall that $A(V_k) \subseteq V_k$ for all $k = 1, \dots, n$. Set $B = A^{-1}$.
Fix $k$ and take any $y \in V_k$; we must show that $z = B y$ also lies in $V_k$.
Since $A z = A(B y) = (AB) y = I y = y$ and $y \in V_k$, every component of $A z$ with index greater than $k$ vanishes: $(A z)_i = y_i = 0$ for $i > k$.
Writing out $(A z)_i$ and using $a_{il} = 0$ for $l < i$, we get, for every $i > k$:
$$
(A z)_i = \sum_{l=i}^{n} a_{il} z_l = a_{ii} z_i + a_{i, i+1} z_{i+1} + \cdots + a_{in} z_n = 0.
$$
Now read these equations from the bottom up. For $i = n$, the equation is $a_{nn} z_n = 0$, so $z_n = 0$. For $i = n-1$ (when $n-1 > k$), it becomes $a_{n-1, n-1} z_{n-1} + a_{n-1, n} z_n = a_{n-1, n-1} z_{n-1} = 0$, so $z_{n-1} = 0$. Proceeding in the same way for $i = n-2, \dots, k+1$ (the same back substitution as in Method 1), each equation reduces to $a_{ii} z_i = 0$ once the later components are known to vanish, and since $a_{ii} \neq 0$, this forces $z_i = 0$ for every $i > k$.
Hence $z \in V_k$, i.e. $B(V_k) \subseteq V_k$ for every $k$, and by the characterization above, $B = A^{-1}$ is upper triangular.

The key insight is that the property $A(V_k) \subseteq V_k$ for all $k$ encodes the upper triangular structure itself: a matrix whose blocks below the (block) diagonal are zero forces the same block structure on its inverse.

This can be made precise with a block decomposition. Partition an $n \times n$ upper triangular matrix as
$$
A = \begin{pmatrix} A_{11} & A_{12} \\ 0 & A_{22} \end{pmatrix}
$$
where $A_{11}$ is $k \times k$ and $A_{22}$ is $(n-k) \times (n-k)$; both $A_{11}$ and $A_{22}$ are themselves upper triangular.
Let $A^{-1} = B = \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix}$.
Then
$$
AB = \begin{pmatrix} A_{11}B_{11} + A_{12}B_{21} & A_{11}B_{12} + A_{12}B_{22} \\ A_{22}B_{21} & A_{22}B_{22} \end{pmatrix} = \begin{pmatrix} I_k & 0 \\ 0 & I_{n-k} \end{pmatrix}.
$$

The bottom-left block gives $A_{22}B_{21} = 0$.
Since $A$ is upper triangular and invertible, its diagonal entries are non-zero; the diagonal entries of $A_{22}$ are among them, so $A_{22}$ is also invertible.
Multiplying $A_{22}B_{21} = 0$ on the left by $A_{22}^{-1}$ gives $B_{21} = 0$.

The matrix $B$ therefore has the block form
$$
B = \begin{pmatrix} B_{11} & B_{12} \\ 0 & B_{22} \end{pmatrix}
$$
and the remaining block equations give:
$A_{11}B_{11} = I_k$, so $B_{11} = A_{11}^{-1}$;
$A_{22}B_{22} = I_{n-k}$, so $B_{22} = A_{22}^{-1}$;
$A_{11}B_{12} + A_{12}B_{22} = 0$, so $B_{12} = -A_{11}^{-1}A_{12}A_{22}^{-1}$.

By induction on the matrix size, $B_{11} = A_{11}^{-1}$ and $B_{22} = A_{22}^{-1}$ are upper triangular, being inverses of smaller invertible upper triangular matrices (the base case $1 \times 1$ is trivial: $b_{11} = 1/a_{11}$). The block $B_{12}$ sits entirely above the block diagonal, so it imposes no constraint. Hence $B$ is upper triangular.
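The block relations are also easy to confirm numerically; a short sketch (the partition size $k$ is an arbitrary choice):

```python
import numpy as np

n, k = 6, 2
A = np.triu(np.random.rand(n, n)) + np.eye(n)   # invertible upper triangular
A11, A12, A22 = A[:k, :k], A[:k, k:], A[k:, k:]

B = np.linalg.inv(A)
assert np.allclose(B[k:, :k], 0)                    # B21 = 0
assert np.allclose(B[:k, :k], np.linalg.inv(A11))   # B11 = A11^{-1}
assert np.allclose(B[k:, k:], np.linalg.inv(A22))   # B22 = A22^{-1}
assert np.allclose(B[:k, k:],
                   -np.linalg.inv(A11) @ A12 @ np.linalg.inv(A22))  # B12
```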
As a side lemma (not actually needed above, but worth recording): the product of two upper triangular matrices is upper triangular.
Let $U, V$ be upper triangular and consider
$$
(UV)_{ij} = \sum_{k=1}^{n} u_{ik} v_{kj}.
$$
A term $u_{ik} v_{kj}$ can be non-zero only if both factors are: $u_{ik} \neq 0$ forces $k \ge i$ (the entries of $U$ below the diagonal vanish), and $v_{kj} \neq 0$ forces $k \le j$. So a non-zero term requires $i \le k \le j$. When $i > j$, no index $k$ satisfies this, hence every term vanishes and $(UV)_{ij} = 0$, i.e. $UV$ is upper triangular.
Combined with the main result of this answer, this shows that the invertible upper triangular matrices are closed under both products and inverses; they form a group under matrix multiplication.
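And the corresponding numerical spot-check for the lemma:

```python
import numpy as np

U = np.triu(np.random.rand(4, 4))
V = np.triu(np.random.rand(4, 4))
P = U @ V
assert np.allclose(P, np.triu(P))   # the product is again upper triangular
```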

Conclusion of Method 2: an upper triangular matrix $A$ maps each subspace $V_k = \operatorname{span}(e_1, \dots, e_k)$ into itself, $A(V_k) \subseteq V_k$, and this invariance is inherited by the inverse transformation: $A^{-1}(V_k) \subseteq V_k$ for all $k$. The latter property is equivalent to $A^{-1}$ being an upper triangular matrix.
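The subspace invariance, for both $A$ and $A^{-1}$, can likewise be checked directly (a sketch under the same random-matrix assumption as before):

```python
import numpy as np

n = 5
A = np.triu(np.random.rand(n, n)) + np.eye(n)   # invertible upper triangular
Ainv = np.linalg.inv(A)

for k in range(1, n + 1):
    v = np.zeros(n)
    v[:k] = np.random.rand(k)                   # an arbitrary vector in V_k
    assert np.allclose((A @ v)[k:], 0)          # A(V_k) is contained in V_k
    assert np.allclose((Ainv @ v)[k:], 0)       # and so is A^{-1}(V_k)
```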



To summarize the two methods:

Algebraic method (matrix multiplication): analyze the equations given by the definition $AB = I$ and show, working column by column from the bottom row upward, that the entries of $B$ below the diagonal must vanish; back substitution and induction make this rigorous.
Invariant-subspace method: an upper triangular matrix preserves the standard subspaces $V_k$ ($A(V_k) \subseteq V_k$), the inverse inherits the same invariance ($A^{-1}(V_k) \subseteq V_k$), and this invariance is equivalent to the inverse being upper triangular.

The two methods reach the same conclusion by different routes: the algebraic method manipulates matrix entries directly, while the subspace method focuses on the global behaviour of the matrix as a linear transformation.

Community comments


This question can in fact be restated as:

Let … be an invertible upper triangular matrix of order …; then the order-… minor corresponding to the matrix entry … is zero.

We prove this by induction.

The case … is obvious; when …,

by the induction hypothesis …, so it suffices to show

that the algebraic cofactor corresponding to each entry of … is 0.

Indeed,

this is obvious.
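This answer appears to outline the adjugate route: since $(A^{-1})_{ij} = C_{ji}/\det A$, where $C_{ji}$ is a cofactor of $A$, the inverse is upper triangular exactly when every minor obtained by deleting row $j$ and column $i$ of $A$ with $j < i$ vanishes. A small NumPy check of that claim (the helper minor is mine, purely illustrative):

```python
import numpy as np

def minor(A, i, j):
    """Determinant of A with row i and column j deleted (0-indexed)."""
    return np.linalg.det(np.delete(np.delete(A, i, axis=0), j, axis=1))

n = 4
A = np.triu(np.random.rand(n, n)) + np.eye(n)   # invertible upper triangular

# (A^{-1})_{ij} = C_{ji} / det A with C_{ji} = (-1)^{i+j} * minor(A, j, i),
# so the entries below the diagonal of A^{-1} vanish iff these minors vanish.
for i in range(n):
    for j in range(n):
        if i > j:
            assert np.isclose(minor(A, j, i), 0.0)
```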
