Difference between revisions of "Strassen algorithm"
Borishaase (talk | contribs) (Strassen algorithm) |
Borishaase (talk | contribs) (Strassen algorithm) |
||
Line 1: | Line 1: | ||
'''Strassen algorithm for a symmetric matrix:''' | '''Strassen algorithm for a symmetric matrix:''' | ||
− | For a [[w:Symmetric matrix|<span class="wikipedia">symmetric matrix</span>]] <math>A \in \mathbb{C}^{n \times n}</math> where <math>n \in \mathbb{N}^*</math>, the [[w:Runtime (program lifecycle phase)|<span class="wikipedia">runtime</span>]] <math>T_s(n)</math> of the [[w:Strassen algorithm|<span class="wikipedia">Strassen algorithm</span>]] for the [[w:Matrix multiplication|<span class="wikipedia">matrix product</span>]] <math>A^2</math> is about half that of the original algorithm in <math>\mathcal{O}(n^{(_2 7)})</math>. | + | For a [[w:Symmetric matrix|<span class="wikipedia">symmetric matrix</span>]] <math>A \in \mathbb{C}^{n \times n}</math> where <math>2^k := n, k \in \mathbb{N}^*</math>, the [[w:Runtime (program lifecycle phase)|<span class="wikipedia">runtime</span>]] <math>T_s(n)</math> of the [[w:Strassen algorithm|<span class="wikipedia">Strassen algorithm</span>]] for the [[w:Matrix multiplication|<span class="wikipedia">matrix product</span>]] <math>A^2</math> is about half that of the original algorithm in <math>\mathcal{O}(n^{(_2 7)})</math>. |
'''Proof:''' For <math>A := | '''Proof:''' For <math>A := | ||
Line 17: | Line 17: | ||
'''Strassen algorithm for a square matrix:''' | '''Strassen algorithm for a square matrix:''' | ||
− | For a square [[w:Matrix (mathematics)|<span class="wikipedia">matrix</span>]] <math>A \in \mathbb{C}^{n \times n}</math> where <math>n \in \mathbb{N}^*</math>, the runtime <math>T_q(n)</math> of the Strassen algorithm is for the matrix product <math>A^TA</math> about <math>4/7</math> that of the original algorithm in <math>\mathcal{O}(n^{(_2 7)})</math>. | + | For a square [[w:Matrix (mathematics)|<span class="wikipedia">matrix</span>]] <math>A \in \mathbb{C}^{n \times n}</math> where <math>2^k := n, k \in \mathbb{N}^*</math>, the runtime <math>T_q(n)</math> of the Strassen algorithm is for the matrix product <math>A^TA</math> about <math>4/7</math> that of the original algorithm in <math>\mathcal{O}(n^{(_2 7)})</math>. |
'''Proof:''' For <math>A := | '''Proof:''' For <math>A := | ||
Line 29: | Line 29: | ||
\end{pmatrix}</math> such that <math>T_q(2n) = 4T_s(n) + 2n^{(_2 7)}</math> and <math>T_q(n) = 4T_s(n/2) + 2/7n^{(_2 7)} = 2/3n^{(_2 3)} + 4/7n^{(_2 7)}</math>.<math>\square</math> | \end{pmatrix}</math> such that <math>T_q(2n) = 4T_s(n) + 2n^{(_2 7)}</math> and <math>T_q(n) = 4T_s(n/2) + 2/7n^{(_2 7)} = 2/3n^{(_2 3)} + 4/7n^{(_2 7)}</math>.<math>\square</math> | ||
− | ''' | + | '''Theorem (decomposition method):''' |
− | For | + | For <math>A, B, D, E \in \mathbb{C}^{n \times n}</math> where <math>2^k := n, k \in \mathbb{N}^*</math>, the runtime is <math>T_z(n) = \mathcal{O}(n^2)</math> for the matrix product <math>QR = (A + D)(B + E) = C + F + G + H</math> if <math>A</math> has also the following form of <math>B</math>: |
'''Proof:''' For <math>B := | '''Proof:''' For <math>B := | ||
Line 57: | Line 57: | ||
: <math>C_{22} = M_{2} - M_{6} .</math> | : <math>C_{22} = M_{2} - M_{6} .</math> | ||
− | + | Let have <math>D</math> the following form of <math>E</math>: | |
− | For <math> | + | For <math>E := |
\begin{pmatrix} | \begin{pmatrix} | ||
− | B_{11} & B_{12} \\ | + | E_{11} & E_{12} \\ |
− | B_{21} | + | E_{21} & E_{11} |
− | + | \end{pmatrix}</math>, it holds that <math>DE = | |
+ | \begin{pmatrix} | ||
+ | D_{11}E_{11}+D_{12}E_{21} & D_{11}E_{12}+D_{12}E_{11} \\ | ||
+ | D_{21}E_{11}+D_{11}E_{21} & D_{21}E_{12}+D_{11}E_{11} | ||
+ | \end{pmatrix} =: F</math>. Putting | ||
+ | |||
+ | : <math>N_{1} := D_{11} \cdot (E_{11} + E_{12})</math> | ||
+ | : <math>N_{2} := D_{21} \cdot (E_{11} + E_{12})</math> | ||
+ | : <math>N_{3} := D_{12} \cdot (E_{11} - E_{21})</math> | ||
+ | : <math>N_{4} := (D_{11} - D_{12})\cdot E_{11}</math> | ||
+ | : <math>N_{5} := (D_{11} - D_{21})\cdot E_{11}</math> | ||
+ | : <math>N_{6} := (D_{11} + D_{12})\cdot E_{21}</math> | ||
+ | |||
+ | implies | ||
+ | |||
+ | : <math>F_{11} = N_{3} + N_{4}</math> | ||
+ | : <math>F_{12} = N_{1} - N_{4}</math> | ||
+ | : <math>F_{21} = F_{11} - N_{5} + N_{6}</math> | ||
+ | : <math>F_{22} = N_{2} + N_{5} .</math> | ||
+ | |||
+ | Then it holds for <math>D</math> and <math>B</math>: | ||
+ | |||
+ | <math>DB = | ||
+ | \begin{pmatrix} | ||
+ | D_{11}B_{11}+D_{12}B_{12} & D_{11}B_{12}+D_{12}B_{22} \\ | ||
+ | D_{21}B_{11}+D_{11}B_{12} & D_{21}B_{12}+D_{11}B_{22} | ||
+ | \end{pmatrix} =: G</math>. Putting | ||
+ | |||
+ | : <math>O_{1} := D_{12} \cdot (B_{12} + B_{22})</math> | ||
+ | : <math>O_{2} := D_{11} \cdot (B_{12} + B_{22})</math> | ||
+ | : <math>O_{3} := D_{21} \cdot (B_{11} + B_{12})</math> | ||
+ | : <math>O_{4} := (D_{11} - D_{12})\cdot B_{22}</math> | ||
+ | : <math>O_{5} := (D_{11} - D_{21})\cdot B_{12}</math> | ||
+ | : <math>O_{6} := (D_{11} - D_{21})\cdot B_{11}</math> | ||
+ | |||
+ | implies | ||
+ | |||
+ | : <math>G_{11} = O_{1} - G_{12} + G_{21} + O_{6}</math> | ||
+ | : <math>G_{12} = O_{2} - O_{4}</math> | ||
+ | : <math>G_{21} = O_{3} + O_{5}</math> | ||
+ | : <math>G_{22} = O_{2} - O_{5} .</math> | ||
+ | |||
+ | Then it holds for <math>A</math> and <math>E</math>: | ||
+ | |||
+ | <math>AE = | ||
\begin{pmatrix} | \begin{pmatrix} | ||
− | A_{11} | + | A_{11}E_{11}+A_{12}E_{21} & A_{11}E_{12}+A_{12}E_{11} \\ |
− | A_{ | + | A_{12}E_{11}+A_{22}E_{21} & A_{12}E_{12}+A_{22}E_{11} |
− | \end{pmatrix} =: | + | \end{pmatrix} =: H</math>. Putting |
− | : <math> | + | : <math>P_{1} := A_{12} \cdot (E_{11} - E_{21})</math> |
− | : <math> | + | : <math>P_{2} := A_{11} \cdot (E_{11} - E_{12})</math> |
− | : <math> | + | : <math>P_{3} := A_{22} \cdot (E_{11} - E_{21})</math> |
− | : <math> | + | : <math>P_{4} := (A_{11} + A_{12})\cdot E_{11}</math> |
− | : <math> | + | : <math>P_{5} := (A_{11} + A_{12})\cdot E_{12}</math> |
− | : <math> | + | : <math>P_{6} := (A_{12} + A_{22})\cdot E_{21}</math> |
implies | implies | ||
− | : <math> | + | : <math>H_{11} = P_{4} - P_{1}</math> |
− | : <math> | + | : <math>H_{12} = P_{4} - P_{2}</math> |
− | : <math> | + | : <math>H_{21} = P_{1} + P_{6}</math> |
− | : <math> | + | : <math>H_{22} = H_{21} - H_{12} + P_{3} + P_{5} .</math> |
− | + | Let <math>A_{12} = B_{12} = D_{11} = E_{11} := 0</math>, from which the claim follows, since every square matrix can be decomposed as described.<math>\square</math> | |
== See also == | == See also == |
Revision as of 03:35, 23 March 2022
Strassen algorithm for a symmetric matrix:
For a symmetric matrix [math]\displaystyle{ A \in \mathbb{C}^{n \times n} }[/math] where [math]\displaystyle{ n = 2^k, k \in \mathbb{N}^* }[/math], the runtime [math]\displaystyle{ T_s(n) }[/math] of the Strassen algorithm for the matrix product [math]\displaystyle{ A^2 }[/math] is about half that of the original algorithm, which lies in [math]\displaystyle{ \mathcal{O}(n^{(_2 7)}) }[/math]. Here the exponent [math]\displaystyle{ (_2 7) }[/math] denotes [math]\displaystyle{ \log_2 7 }[/math].
Proof: For [math]\displaystyle{ A := \begin{pmatrix} A_{11} & A_{12} \\ A_{12}^T & A_{22} \end{pmatrix} }[/math], it holds that [math]\displaystyle{ A^TA = \begin{pmatrix} A_{11}A_{11}+A_{12}A_{12}^T & A_{11}A_{12}+A_{12}A_{22} \\ A_{12}^TA_{11}+A_{22}A_{12}^T & A_{12}^TA_{12}+A_{22}A_{22} \end{pmatrix} }[/math] and [math]\displaystyle{ T_s(2n) = 3T_s(n) + 2n^{(_2 7)} }[/math]. Thus [math]\displaystyle{ T_s(n) = 3T_s(n/2) + 2(n/2)^{(_2 7)} }[/math] and [math]\displaystyle{ T_s(n/2) = 3T_s(n/4) + 2(n/4)^{(_2 7)} }[/math].
Because [math]\displaystyle{ T_s(1) = 1 }[/math], unrolling the recurrence and summing the geometric series yields: [math]\displaystyle{ T_s(n) = 27T_s(n/8) + 2/7n^{(_2 7)}(1+3/7 + (3/7)^2) = ... = 3^{(_2n)} + 2/7n^{(_2 7)} (1-(3/7)^{(_2n)})/(1-3/7) }[/math] [math]\displaystyle{ = n^{(_2 3)} + \hat{2}(n^{(_2 7)}-n^{(_2 3)}) = \hat{2} (n^{(_2 3)} + n^{(_2 7)}) }[/math], where [math]\displaystyle{ \hat{2} }[/math] denotes [math]\displaystyle{ 1/2 }[/math].[math]\displaystyle{ \square }[/math]
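The closed form can be checked against the recurrence numerically. The following is a minimal sketch, not part of the original proof; the function names are illustrative, and it assumes [math]\displaystyle{ n = 2^k }[/math] so that [math]\displaystyle{ (n/2)^{(_2 7)} = 7^{k-1} }[/math] and integer arithmetic suffices.

```python
def t_s(k: int) -> int:
    """T_s(2**k) from T_s(2n) = 3*T_s(n) + 2*n**log2(7), with T_s(1) = 1.

    For n = 2**(k-1) the term n**log2(7) equals 7**(k-1), so integers suffice.
    """
    if k == 0:
        return 1
    return 3 * t_s(k - 1) + 2 * 7 ** (k - 1)


def t_s_closed(k: int) -> int:
    """Closed form (n**log2(3) + n**log2(7)) / 2 for n = 2**k."""
    return (3 ** k + 7 ** k) // 2  # 3**k + 7**k is a sum of two odd numbers


if __name__ == "__main__":
    for k in range(6):
        # Columns: n, recurrence value, closed-form value
        print(2 ** k, t_s(k), t_s_closed(k))
```

Both columns agree for every tested exponent, which is what the geometric-series argument predicts.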
Strassen algorithm for a square matrix:
For a square matrix [math]\displaystyle{ A \in \mathbb{C}^{n \times n} }[/math] where [math]\displaystyle{ n = 2^k, k \in \mathbb{N}^* }[/math], the runtime [math]\displaystyle{ T_q(n) }[/math] of the Strassen algorithm for the matrix product [math]\displaystyle{ A^TA }[/math] is about [math]\displaystyle{ 4/7 }[/math] that of the original algorithm in [math]\displaystyle{ \mathcal{O}(n^{(_2 7)}) }[/math].
Proof: For [math]\displaystyle{ A := \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} }[/math], it holds that [math]\displaystyle{ A^TA = \begin{pmatrix} A_{11}^TA_{11}+A_{21}^TA_{21} & A_{11}^TA_{12}+A_{21}^TA_{22} \\ A_{12}^TA_{11}+A_{22}^TA_{21} & A_{12}^TA_{12}+A_{22}^TA_{22} \end{pmatrix} }[/math] such that [math]\displaystyle{ T_q(2n) = 4T_s(n) + 2n^{(_2 7)} }[/math] and [math]\displaystyle{ T_q(n) = 4T_s(n/2) + 2/7n^{(_2 7)} = 2/3n^{(_2 3)} + 4/7n^{(_2 7)} }[/math].[math]\displaystyle{ \square }[/math]
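The factor [math]\displaystyle{ 4/7 }[/math] can be made visible by evaluating the recurrence for growing [math]\displaystyle{ n = 2^k }[/math]. This is a small sketch under the assumption that [math]\displaystyle{ T_s }[/math] has the closed form derived for the symmetric case; the function names are illustrative only.

```python
def t_s_closed(k: int) -> int:
    """T_s(2**k) = (3**k + 7**k) / 2, the closed form from the symmetric case."""
    return (3 ** k + 7 ** k) // 2


def t_q(k: int) -> int:
    """T_q(2**k) = 4*T_s(2**(k-1)) + 2*7**(k-1), valid for k >= 1."""
    return 4 * t_s_closed(k - 1) + 2 * 7 ** (k - 1)


def t_q_closed(k: int) -> int:
    """(2/3)*n**log2(3) + (4/7)*n**log2(7) for n = 2**k, written with integers."""
    return 2 * 3 ** (k - 1) + 4 * 7 ** (k - 1)


if __name__ == "__main__":
    for k in range(1, 11):
        # The last column is T_q(n) divided by n**log2(7) = 7**k
        print(2 ** k, t_q(k), t_q(k) / 7 ** k)
```

The ratio in the last column decreases towards [math]\displaystyle{ 4/7 }[/math], because the remaining term [math]\displaystyle{ 2/3n^{(_2 3)} }[/math] grows more slowly than [math]\displaystyle{ n^{(_2 7)} }[/math].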
Theorem (decomposition method):
For [math]\displaystyle{ A, B, D, E \in \mathbb{C}^{n \times n} }[/math] where [math]\displaystyle{ n = 2^k, k \in \mathbb{N}^* }[/math], the runtime is [math]\displaystyle{ T_z(n) = \mathcal{O}(n^2) }[/math] for the matrix product [math]\displaystyle{ QR = (A + D)(B + E) = C + F + G + H }[/math] with [math]\displaystyle{ Q := A + D }[/math] and [math]\displaystyle{ R := B + E }[/math], if [math]\displaystyle{ A }[/math] also has the following form of [math]\displaystyle{ B }[/math]:
Proof: For [math]\displaystyle{ B := \begin{pmatrix} B_{11} & B_{12} \\ B_{12} & B_{22} \end{pmatrix} }[/math], it holds that [math]\displaystyle{ AB = \begin{pmatrix} A_{11}B_{11}+A_{12}B_{12} & A_{11}B_{12}+A_{12}B_{22} \\ A_{12}B_{11}+A_{22}B_{12} & A_{12}B_{12}+A_{22}B_{22} \end{pmatrix} =: C }[/math]. Putting
- [math]\displaystyle{ M_{1} := A_{12} \cdot (B_{11} + B_{12}) }[/math]
- [math]\displaystyle{ M_{2} := A_{12} \cdot (B_{12} + B_{22}) }[/math]
- [math]\displaystyle{ M_{3} := A_{22} \cdot (B_{12} + B_{22}) }[/math]
- [math]\displaystyle{ M_{4} := (A_{11} - A_{12})\cdot B_{11} }[/math]
- [math]\displaystyle{ M_{5} := (A_{11} - A_{12})\cdot B_{12} }[/math]
- [math]\displaystyle{ M_{6} := (A_{12} - A_{22})\cdot B_{22} }[/math]
implies
- [math]\displaystyle{ C_{11} = M_{1} + M_{4} }[/math]
- [math]\displaystyle{ C_{12} = M_{2} + M_{5} }[/math]
- [math]\displaystyle{ C_{21} = M_{1} - C_{22} + M_{3} }[/math]
- [math]\displaystyle{ C_{22} = M_{2} - M_{6} . }[/math]
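The six products above can be checked against the direct block product. The following is a minimal sketch with small integer blocks held as lists of rows; the helper names are illustrative and not part of the original text.

```python
import random


def mat_mul(x, y):
    """Product of two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_add(x, y):
    return [[a + b for a, b in zip(r, s)] for r, s in zip(x, y)]


def mat_sub(x, y):
    return [[a - b for a, b in zip(r, s)] for r, s in zip(x, y)]


def rand_block(rng, size=2):
    """Random integer block; integers keep the comparison exact."""
    return [[rng.randint(-9, 9) for _ in range(size)] for _ in range(size)]


def blocks_six_products(a11, a12, a22, b11, b12, b22):
    """Blocks of C = AB for A = [[A11, A12], [A12, A22]] and
    B = [[B11, B12], [B12, B22]], using the products M1..M6."""
    m1 = mat_mul(a12, mat_add(b11, b12))
    m2 = mat_mul(a12, mat_add(b12, b22))
    m3 = mat_mul(a22, mat_add(b12, b22))
    m4 = mat_mul(mat_sub(a11, a12), b11)
    m5 = mat_mul(mat_sub(a11, a12), b12)
    m6 = mat_mul(mat_sub(a12, a22), b22)
    c11 = mat_add(m1, m4)
    c12 = mat_add(m2, m5)
    c22 = mat_sub(m2, m6)
    c21 = mat_add(mat_sub(m1, c22), m3)
    return c11, c12, c21, c22


def blocks_direct(a11, a12, a22, b11, b12, b22):
    """The same four blocks computed with eight block products."""
    c11 = mat_add(mat_mul(a11, b11), mat_mul(a12, b12))
    c12 = mat_add(mat_mul(a11, b12), mat_mul(a12, b22))
    c21 = mat_add(mat_mul(a12, b11), mat_mul(a22, b12))
    c22 = mat_add(mat_mul(a12, b12), mat_mul(a22, b22))
    return c11, c12, c21, c22


if __name__ == "__main__":
    rng = random.Random(0)
    blocks = [rand_block(rng) for _ in range(6)]
    print(blocks_six_products(*blocks) == blocks_direct(*blocks))
```

The blocks are deliberately non-commuting matrices rather than scalars, so the check also confirms that the order of the factors in each product is respected.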
Let [math]\displaystyle{ D }[/math] have the following form of [math]\displaystyle{ E }[/math]:
For [math]\displaystyle{ E := \begin{pmatrix} E_{11} & E_{12} \\ E_{21} & E_{11} \end{pmatrix} }[/math], it holds that [math]\displaystyle{ DE = \begin{pmatrix} D_{11}E_{11}+D_{12}E_{21} & D_{11}E_{12}+D_{12}E_{11} \\ D_{21}E_{11}+D_{11}E_{21} & D_{21}E_{12}+D_{11}E_{11} \end{pmatrix} =: F }[/math]. Putting
- [math]\displaystyle{ N_{1} := D_{11} \cdot (E_{11} + E_{12}) }[/math]
- [math]\displaystyle{ N_{2} := D_{21} \cdot (E_{11} + E_{12}) }[/math]
- [math]\displaystyle{ N_{3} := D_{12} \cdot (E_{11} + E_{21}) }[/math]
- [math]\displaystyle{ N_{4} := (D_{11} - D_{12})\cdot E_{11} }[/math]
- [math]\displaystyle{ N_{5} := (D_{11} - D_{21})\cdot E_{11} }[/math]
- [math]\displaystyle{ N_{6} := (D_{11} - D_{12})\cdot E_{21} }[/math]
implies
- [math]\displaystyle{ F_{11} = N_{3} + N_{4} }[/math]
- [math]\displaystyle{ F_{12} = N_{1} - N_{4} }[/math]
- [math]\displaystyle{ F_{21} = F_{11} - N_{5} + N_{6} }[/math]
- [math]\displaystyle{ F_{22} = N_{2} + N_{5} . }[/math]
Then it holds for [math]\displaystyle{ D }[/math] and [math]\displaystyle{ B }[/math]:
[math]\displaystyle{ DB = \begin{pmatrix} D_{11}B_{11}+D_{12}B_{12} & D_{11}B_{12}+D_{12}B_{22} \\ D_{21}B_{11}+D_{11}B_{12} & D_{21}B_{12}+D_{11}B_{22} \end{pmatrix} =: G }[/math]. Putting
- [math]\displaystyle{ O_{1} := D_{12} \cdot (B_{12} + B_{22}) }[/math]
- [math]\displaystyle{ O_{2} := D_{11} \cdot (B_{12} + B_{22}) }[/math]
- [math]\displaystyle{ O_{3} := D_{21} \cdot (B_{11} + B_{12}) }[/math]
- [math]\displaystyle{ O_{4} := (D_{11} - D_{12})\cdot B_{22} }[/math]
- [math]\displaystyle{ O_{5} := (D_{11} - D_{21})\cdot B_{12} }[/math]
- [math]\displaystyle{ O_{6} := (D_{11} - D_{21})\cdot B_{11} }[/math]
implies
- [math]\displaystyle{ G_{11} = O_{1} - G_{12} + G_{21} + O_{6} }[/math]
- [math]\displaystyle{ G_{12} = O_{2} - O_{4} }[/math]
- [math]\displaystyle{ G_{21} = O_{3} + O_{5} }[/math]
- [math]\displaystyle{ G_{22} = O_{2} - O_{5} . }[/math]
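As a worked check of the least obvious of these identities, expanding the right-hand side of the formula for [math]\displaystyle{ G_{11} }[/math] with the definitions above gives

[math]\displaystyle{ O_{1} - G_{12} + G_{21} + O_{6} = (D_{12}B_{12} + D_{12}B_{22}) - (D_{11}B_{12} + D_{12}B_{22}) + (D_{21}B_{11} + D_{11}B_{12}) + (D_{11}B_{11} - D_{21}B_{11}) = D_{11}B_{11} + D_{12}B_{12} , }[/math]

which is the upper left block of [math]\displaystyle{ DB }[/math].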
Then it holds for [math]\displaystyle{ A }[/math] and [math]\displaystyle{ E }[/math]:
[math]\displaystyle{ AE = \begin{pmatrix} A_{11}E_{11}+A_{12}E_{21} & A_{11}E_{12}+A_{12}E_{11} \\ A_{12}E_{11}+A_{22}E_{21} & A_{12}E_{12}+A_{22}E_{11} \end{pmatrix} =: H }[/math]. Putting
- [math]\displaystyle{ P_{1} := A_{12} \cdot (E_{11} - E_{21}) }[/math]
- [math]\displaystyle{ P_{2} := A_{11} \cdot (E_{11} - E_{12}) }[/math]
- [math]\displaystyle{ P_{3} := A_{22} \cdot (E_{11} - E_{21}) }[/math]
- [math]\displaystyle{ P_{4} := (A_{11} + A_{12})\cdot E_{11} }[/math]
- [math]\displaystyle{ P_{5} := (A_{11} + A_{12})\cdot E_{12} }[/math]
- [math]\displaystyle{ P_{6} := (A_{12} + A_{22})\cdot E_{21} }[/math]
implies
- [math]\displaystyle{ H_{11} = P_{4} - P_{1} }[/math]
- [math]\displaystyle{ H_{12} = P_{4} - P_{2} }[/math]
- [math]\displaystyle{ H_{21} = P_{1} + P_{6} }[/math]
- [math]\displaystyle{ H_{22} = H_{21} - H_{12} + P_{3} + P_{5} . }[/math]
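The formula for [math]\displaystyle{ H_{22} }[/math] can be verified in the same way by expanding its right-hand side:

[math]\displaystyle{ H_{21} - H_{12} + P_{3} + P_{5} = (A_{12}E_{11} + A_{22}E_{21}) - (A_{11}E_{12} + A_{12}E_{11}) + (A_{22}E_{11} - A_{22}E_{21}) + (A_{11}E_{12} + A_{12}E_{12}) = A_{12}E_{12} + A_{22}E_{11} , }[/math]

which is the lower right block of [math]\displaystyle{ AE }[/math].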
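Let [math]\displaystyle{ A_{12} = B_{12} = D_{11} = E_{11} := 0 }[/math], so that [math]\displaystyle{ A }[/math] and [math]\displaystyle{ B }[/math] contain only the diagonal blocks and [math]\displaystyle{ D }[/math] and [math]\displaystyle{ E }[/math] only the off-diagonal blocks. The claim follows from this, since every square matrix can be decomposed as described.[math]\displaystyle{ \square }[/math]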
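The decomposition used in the last step can be illustrated as follows. This is only a sketch of the splitting into diagonal and off-diagonal blocks and of the identity [math]\displaystyle{ QR = C + F + G + H }[/math]; it uses ordinary matrix products, makes no statement about runtime, and all function names are illustrative.

```python
def mat_mul(x, y):
    """Product of two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_add(x, y):
    return [[a + b for a, b in zip(r, s)] for r, s in zip(x, y)]


def split(q):
    """Return (A, D) with Q = A + D for an even-sized square matrix Q.

    A keeps the two diagonal blocks of Q (so A12 = 0) and D keeps the two
    off-diagonal blocks (so D11 = 0).
    """
    n = len(q)
    h = n // 2
    a = [[q[i][j] if (i < h) == (j < h) else 0 for j in range(n)]
         for i in range(n)]
    d = [[0 if (i < h) == (j < h) else q[i][j] for j in range(n)]
         for i in range(n)]
    return a, d


def product_via_parts(q, r):
    """Compute QR as C + F + G + H with C = AB, F = DE, G = DB, H = AE."""
    a, d = split(q)
    b, e = split(r)
    c, f = mat_mul(a, b), mat_mul(d, e)
    g, h = mat_mul(d, b), mat_mul(a, e)
    return mat_add(mat_add(c, f), mat_add(g, h))


if __name__ == "__main__":
    q = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
    r = [[2, 0, 1, 3], [1, 1, 0, 2], [4, 5, 6, 0], [0, 7, 8, 9]]
    print(product_via_parts(q, r) == mat_mul(q, r))
```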