Ticket #174: HermiteNormalForm_1.tex

File HermiteNormalForm_1.tex, 5.0 KB (added by was, 15 years ago)

a student paper on fast HNF algorithms (and there are other papers out there)

Line 
1\documentclass[11pt]{article}
2
3\usepackage{amsmath}
4\usepackage{amssymb}
5\usepackage{eucal}
6\usepackage{fancyhdr}
7\usepackage{enumerate}
8
9\oddsidemargin-.5cm
10\evensidemargin-.5cm
11\topmargin-2cm     %I recommend adding these three lines to increase the
12\textwidth16cm     %amount of usable space on the page (and save trees)
13\textheight23.5cm
14
15\newcommand{\Z} {\mathbb{Z}}
16\newcommand{\amat}[4] {\left(\begin{matrix}#1&#2\\#3&#4\end{matrix}\right)}
17
18\begin{document}
19
20\noindent Fast Method for Computing Hermite Normal Form \\
21\noindent Andrew Crites and Michael Goff \\
22\noindent The results of this paper are taken from notes written by Allan Steel.\\
23
24The notion of echelon form of a matrix need not be restricted to matrices whose entries come from a field.  For a matrix with entries from any Euclidean domain, we define it do be in echelon form (or Hermite Normal Form) if the following conditions hold:
25\begin{enumerate}[(i)]
26  \item $a_{i,j}=0$, for $i>j$
27  \item $a_{ii}\ge0$
28  \item $a_{ij}<a_{ii}$, for all $j<i$ (unless $a_{ii}=0$ in which case all $a_{ij}=0$.)
29\end{enumerate}
30
31We first assume that $X$ is an $n\times n$ matrix with entries in $\Z$.  Decompose $X$ as:
32$$X=\left(\begin{array}{ccc|c}&&&\\&A&&c\\&&&\\\hline&v&&v_c\\\hline&w&&w_c\end{array}\right),$$
33where $A\in M_{(n-2)\times(n-1)}(\Z)$, $c\in M_{(n-2)\times1}(\Z)$, $v,w\in M_{1\times(n-1)}(\Z)$, and $v_c,w_c\in\Z.$  Next define
34\begin{align*}
35  d_1=\det\left(\begin{array}{c}A\\\hline v\end{array}\right),d_2=\det\left(\begin{array}{c}A\\\hline w\end{array}\right)\text{, and }\\
36  g=\text{GCD}(d_1,d_2)\text{, where }g=a_1d_1+a_2d_2.
37\end{align*}
38We will then have, by row linearity of the determinant, $$\det\left(\begin{array}{c}A\\\hline a_1v+a_2w\end{array}\right)=a_1\det\left(\begin{array}{c}A\\\hline v\end{array}\right) + a_2\det\left(\begin{array}{c}A\\\hline w\end{array}\right) = g.$$
39If we now multiply the bottom two rows of $X$ by the matrix $$T=\left(\begin{matrix}a_1&a_2\\-\frac{d_2}{g}&\frac{d_1}{g}\end{matrix}\right)$$ we will get a new matrix $$X_1=\left(\begin{array}{ccc|c}&&&\\&B&&d\\&&&\\\hline&u&&u_c\end{array}\right)$$ where the last row of $B$ is $a_1v+a_2w.$  Moreover, the determinant of $B$ will also be $g$.
40
41We can now compute the Herminte Normal form $H$ of $B$ quickly since $g$ is small.
42The elements on the diagonal in $H$ multiply to $g$, and hence they no greater than $g$ in absolute value.  It follows that the elements on the diagonal are less than $g$ in absolute value, so we only need to calculate $H$ mod $2g$, which is fast when $g$ is small.  The most naive algorithm is $O(n^3)$.
43
44Since calculating a small value of $g$ is crucial to calculating the Hermite Normal form quickly, we undertook experiments to see what $g$ turns out to be in practice.  We generated matrices with random entries, varying the size of the matrices as high as $50 \times 50$, and varying the size of the entries of the matrices to as many as $4$ digits.  There is no evidence to suggest that either of these factors affects the expected value of $g$.  For the random matrices, $g$ was less than $100$ over $90$ percent of the time, and in the tens of thousands of trials, $g$ never had more than $5$ digits.  In appears that, in these trials, the expected value of $g$ was approximately the expected value of the GCD of two random large numbers.  Also, the evidence suggests that the value of $g$ is not increased when $v$ and $w$ have common entries, as long as they are not the same.  However, the evidence does suggest that $g$ is often very large on randomly generated sparse matrices.
45
46Suppose that $U$ is the unique unimodular matrix with $H=UB$.  Then we want a column vector $e$ so that $$U\cdot\left(B|d\right)=\left(H|e\right).$$  Then $$e=Ud=HB^{-1}d,$$ so that $$BH^{-1}e=d.$$  Since $H$ has small entries we can quickly compute the inverse of $H$.  We can do this computation over $\mathbb{Q}$ since we know $BH^{-1}$ will still have integer entries since it equals $U$.  Finding $d$ then amounts to solving the system of equations $$\left(BH^{-1}\right)\cdot e=d.$$  This is often done using the $p$-adic nullspace algorithm.  We now have the matrix $$X_2=\left(\begin{array}{ccc|c}&&&\\&H&&e\\&&&\\\hline&u&&u_c\end{array}\right).$$  We can perform elementary row operations to reduce the last row by the rows above it to finally put the matrix into echelon form.
47
48While Steel's algorithm is fast in practice, bad luck or a maliciously chosen matrix can cause $g$ to be very large, which ruins the efficiency.  Another potential problem is partitioning the matrix in the right way so that $B$ is nonsingular.
49
50If $g$ is large, then the algorithm can be applied recursively when calculating the Hermite Normal form of $B$.  Another possibility is to multiply $X$ by a random unitary matrix, which does not affect the Hermite Normal form.  This idea is not yet tested.  Finally, since sparse matrices present a particularly difficult case, an alternative Hermite Normal form algorithm for sparse matrices might be better.
51
52\end{document}