Opened 4 years ago

Closed 4 years ago

#19340 closed enhancement (fixed)

Better interface for hadamard_matrix

Reported by: ncohen Owned by:
Priority: major Milestone: sage-6.9
Component: combinatorial designs Keywords:
Cc: knsam, dimpase, vdelecroix Merged in:
Authors: Nathann Cohen Reviewers: Dima Pasechnik
Report Upstream: N/A Work issues:
Branch: 2553467 (Commits) Commit: 2553467c572dc462c4b514ba8e405fb928017928
Dependencies: Stopgaps:

Description

This branch improves the interface of hadamard_matrix.

  • It adds it in matrix.<tab>
  • It checks the returned results ('check' flag)
  • It can test for the existence of a construction in Sage (useful for SRG code)

Nathann

Change History (30)

comment:1 Changed 4 years ago by ncohen

  • Branch set to u/ncohen/19340
  • Commit set to f465c96099ab3681ba1162eaf739c7768d8218ab
  • Status changed from new to needs_review

New commits:

f465c96trac #19340: Better interface for hadamard_matrix

comment:2 follow-up: Changed 4 years ago by dimpase

IMHO hadamard_matrix() should accept options, such as RSHCD and skew (the latter not implemented yet). (A skew-Hadamard matrix is a Hadamard matrix of the form H=S+I, with S.T==-S and I the identity matrix; I have some half-written patch for them...)

As well, is_hadamard_matrix() ought to check for RSHCD-ness and skew-ness.

comment:3 in reply to: ↑ 2 Changed 4 years ago by ncohen

IMHO hadamard_matrix() should accept options, such as RSHCD and skew (the latter not implemented yet).

HMmmmm.. I don't know for 'skew', but 'rshcd' sounds very very specific to have in such a general function. Having a function of its own in a submodule sounds sufficient to me, for the moment.

As well, is_hadamard_matrix() ought to check for RSHCD-ness and skew-ness.

I don't know... The rshcd function already tests the rshcd-specific property anyway.

Nathann

comment:4 Changed 4 years ago by git

  • Commit changed from f465c96099ab3681ba1162eaf739c7768d8218ab to e02902ae0c073a13700295ce75ec1db2548a7b6d

Branch pushed to git repo; I updated commit sha1. New commits:

e02902atrac #19340: Pre-existing bug that hid several paley constructions

comment:5 Changed 4 years ago by git

  • Commit changed from e02902ae0c073a13700295ce75ec1db2548a7b6d to 247a2bb4dcec180227cd9fb9bcd331a27a289016

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

247a2bbtrac #19340: Pre-existing bug that hid several paley constructions

comment:6 follow-ups: Changed 4 years ago by fredrik.johansson

You might consider using the flint functions fmpz_mat_hadamard and fmpz_mat_is_hadamard, which should be faster.

Checking by default seems kind of useless since this is much slower than generating the matrix (generating a size 1000 Hadamard matrix takes 0.01 seconds and checking it takes 0.5 seconds), and unit testing should provide reasonable certainty that the algorithm is correct. Flint's test suite already checks for every size up to 300 that fmpz_mat_hadamard either constructs a correct Hadamard matrix or reports that the size is unsupported.

comment:7 in reply to: ↑ 6 ; follow-up: Changed 4 years ago by ncohen

You might consider using the flint functions fmpz_mat_hadamard and fmpz_mat_is_hadamard, which should be faster.

How easily can I call them on a Sage matrix?

Checking by default seems kind of useless since this is much slower than generating the matrix (generating a size 1000 Hadamard matrix takes 0.01 seconds and checking it takes 0.5 seconds), and unit testing should provide reasonable certainty that the algorithm is correct.

Sorry to disappoint, but I have thousands of line behind me (in combinat/designs/) that implement old combinatorial designs from lost papers. I know the value of this kind of testing.

Flint's test suite already checks for every size up to 300 that fmpz_mat_hadamard either constructs a correct Hadamard matrix or reports that the size is unsupported.

Are you saying that flint can be easily accessed from Sage *and* that it contains a database of hadamard matrices? If so, I woulld be glad to call it from the general constructor of hadamard matrices in Sage.

Nathann

comment:8 in reply to: ↑ 7 ; follow-up: Changed 4 years ago by fredrik.johansson

Replying to ncohen:

You might consider using the flint functions fmpz_mat_hadamard and fmpz_mat_is_hadamard, which should be faster.

How easily can I call them on a Sage matrix?

It should be enough (untested) to add the following declarations in libs/flint/fmpz_mat.pxd

    int fmpz_mat_hadamard(fmpz_mat_t A)
    int fmpz_mat_is_hadamard(const fmpz_mat_t A)

and the following methods in matrix/matrix_integer_dense.pyx

    def _hadamard(self):
        # in-place, returns whether successful
        return fmpz_mat_hadamard(self._matrix)

    def is_hadamard(self):
        return fmpz_mat_is_hadamard(self._matrix)

Then you can do something like this in hadamard_matrix()

    if not(n % 4 == 0) and (n > 2):
        raise ValueError
    mat = Matrix(ZZ, n, n)
    if mat._hadamard():
        return mat
    else:
        # use database here
        raise ValueError

Checking by default seems kind of useless since this is much slower than generating the matrix (generating a size 1000 Hadamard matrix takes 0.01 seconds and checking it takes 0.5 seconds), and unit testing should provide reasonable certainty that the algorithm is correct.

Sorry to disappoint, but I have thousands of line behind me (in combinat/designs/) that implement old combinatorial designs from lost papers. I know the value of this kind of testing.

You may well be right about combinatorial designs in general. But, having written the flint code, and considering that it implements a fairly simple algorithm, when it's already exhaustively tested up to size 300, I think it's highly unlikely to give an incorrect matrix for larger sizes. So I can't agree with slowing it down many orders of magnitude by default.

Flint's test suite already checks for every size up to 300 that fmpz_mat_hadamard either constructs a correct Hadamard matrix or reports that the size is unsupported.

Are you saying that flint can be easily accessed from Sage *and* that it contains a database of hadamard matrices? If so, I woulld be glad to call it from the general constructor of hadamard matrices in Sage.

It doesn't have a database; it uses the Paley construction. As noted above, you can easily use a fallback database/algorithm if it fails.

BTW, fmpz_mat_hadamard succeeds for every multiple of 4 below 92; current code for the Paley construction in Sage doesn't succeed to construct a size 20 Hadamard matrix.

Last edited 4 years ago by fredrik.johansson (previous) (diff)

comment:9 in reply to: ↑ 8 ; follow-up: Changed 4 years ago by ncohen

It should be enough (untested) to add the following declarations in libs/flint/fmpz_mat.pxd

HMmmm.. I don't feel much at ease with adding matrix code, sorry. Especially if it is only implemented for a specific data structure and such.

You may well be right about combinatorial designs in general. But, having written the flint code, and considering that it implements a fairly simple algorithm, when it's already exhaustively tested up to size 300, I think it's highly unlikely to give an incorrect matrix for larger sizes. So I can't agree with slowing it down many orders of magnitude by default.

Unless you do intend to use this code, couldn't you trust the opinion of those who do and develop it?

I know how many hours, night and weeks I wasted because of a typo in data or in code. You wouldn't mind slowing by "several orders of magnitude" a function like .__repr__ because it is never a critical. Do you generate Hadamard matrices intensively? Furthermore you can disable the check with 'check=False'. Please don't turn this ticket into a war just because of a check you can disable, which saved my skin innumerable times in similar code. It's not worth it. I know it, and I need the flag. Besides, checking the constructions is already the default for all other design constructions.

It doesn't have a database; it uses the Paley construction. As noted above, you can easily use a fallback database/algorithm if it fails.

More constructions of hadamard matrices may come. Others may be pure data. Other may be recursive constructions on both.

BTW, fmpz_mat_hadamard succeeds for every multiple of 4 below 92; current code for the Paley construction in Sage doesn't succeed to construct a size 20 Hadamard matrix.

Then there must be another bug somewhere. This code clearly needs some testing.

Nathann

comment:10 in reply to: ↑ 9 Changed 4 years ago by ncohen

BTW, fmpz_mat_hadamard succeeds for every multiple of 4 below 92; current code for the Paley construction in Sage doesn't succeed to construct a size 20 Hadamard matrix.

Two updates:

1) With this branch, Sage can build a matrix for n=20. It's one of the bugs I fixed 2) There is no Paley construction for 52 on [1]. Either you get the data somewhere else or there is something wrong in that code.

Nathann

[1] http://neilsloane.com/hadamard/

Last edited 4 years ago by ncohen (previous) (diff)

comment:11 Changed 4 years ago by fredrik.johansson

52 = 2 * (5^2 + 1) is covered by the "type 2" Paley construction.

The list you linked to is evidently incomplete.

comment:12 Changed 4 years ago by ncohen

Then that's another bug of the current code:

sage: hadamard_matrix_paleyII(52)
...
ValueError: The order 52 is not covered by the Paley type II construction.

comment:13 follow-up: Changed 4 years ago by fredrik.johansson

Well, it's up to you whether you want to fix the broken code or just insert a few lines to call the code in flint that is faster and known to work :-)

comment:14 in reply to: ↑ 13 Changed 4 years ago by ncohen

Well, it's up to you whether you want to fix the broken code or just insert a few lines to call the code in flint that is faster and known to work :-)

Thank you very much. Then I will do something else. I am trying to improve the database of strongly regular graphs and I don't want to interrupt that to write interface code.

Nathann

comment:15 Changed 4 years ago by git

  • Commit changed from 247a2bb4dcec180227cd9fb9bcd331a27a289016 to a5a4bbff80f2aed2a5b549d6b64f215967bc5ec7

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

a5a4bbftrac #19340: Pre-existing bug that hid several paley constructions

comment:16 follow-up: Changed 4 years ago by vdelecroix

Hello,

Thanks Frederik for your suggestion (and your work in flint!). I pushed a branch at u/vdelecroix/19340 with one commit which creates a function hadamard_matrix_flint. It could be called from the current hadamard_matrix code. It is faster. It is useful to have several implementations to check (as it was pointed out on a concrete example by Frederik)...

Vincent

comment:17 in reply to: ↑ 16 Changed 4 years ago by ncohen

Yo,

It is useful to have several implementations to check (as it was pointed out on a concrete example by Frederik)...

Indeed. Please do that in #19341, though.

Nathann

comment:18 in reply to: ↑ 6 ; follow-up: Changed 4 years ago by dimpase

Replying to fredrik.johansson:

You might consider using the flint functions fmpz_mat_hadamard and fmpz_mat_is_hadamard, which should be faster.

Checking by default seems kind of useless since this is much slower than generating the matrix (generating a size 1000 Hadamard matrix takes 0.01 seconds and checking it takes 0.5 seconds),

IMHO it says that your fmpz_mat_is_hadamard is far from optimal. Indeed, you neither need to take the transpose nor to do full matrix multiplication.

As you fisrt check that the matrix is +/-1, you don't need to check that the diagonal entries of HH.T are all n. So you only need to do scalar products of n(n-1)/2 rows...

(I understand that Hadamard matrices in FLINT don't do any serious job, they are there for testing or so; if one was after real speed there, you'd be using BLAS...)

comment:19 in reply to: ↑ 18 ; follow-up: Changed 4 years ago by fredrik.johansson

Replying to dimpase:

Replying to fredrik.johansson:

You might consider using the flint functions fmpz_mat_hadamard and fmpz_mat_is_hadamard, which should be faster.

Checking by default seems kind of useless since this is much slower than generating the matrix (generating a size 1000 Hadamard matrix takes 0.01 seconds and checking it takes 0.5 seconds),

IMHO it says that your fmpz_mat_is_hadamard is far from optimal. Indeed, you neither need to take the transpose nor to do full matrix multiplication.

As you fisrt check that the matrix is +/-1, you don't need to check that the diagonal entries of HH.T are all n. So you only need to do scalar products of n(n-1)/2 rows...

I didn't follow this. Are you saying that you can do the test in O(n^2) instead of O(n^omega)?

(I understand that Hadamard matrices in FLINT don't do any serious job, they are there for testing or so; if one was after real speed there, you'd be using BLAS...)

comment:20 in reply to: ↑ 19 Changed 4 years ago by dimpase

Replying to fredrik.johansson:

Replying to dimpase:

Replying to fredrik.johansson:

You might consider using the flint functions fmpz_mat_hadamard and fmpz_mat_is_hadamard, which should be faster.

Checking by default seems kind of useless since this is much slower than generating the matrix (generating a size 1000 Hadamard matrix takes 0.01 seconds and checking it takes 0.5 seconds),

IMHO it says that your fmpz_mat_is_hadamard is far from optimal. Indeed, you neither need to take the transpose nor to do full matrix multiplication.

As you fisrt check that the matrix is +/-1, you don't need to check that the diagonal entries of HH.T are all n. So you only need to do scalar products of n(n-1)/2 rows...

I didn't follow this. Are you saying that you can do the test in O(n^2) instead of O(n^omega)?

No, I am not saying that you can do the scalar product of two matrix rows in constant time. :-)

But at least you can replace

   for (i = 0; i < n && result; i++)
        for (j = 0; j < n && result; j++)
            result = (*fmpz_mat_entry(C, i, j) == n * (i == j));

with

   for (i = 0; i < n && result; i++)
        for (j = i+1; j < n && result; j++)
            result = (*fmpz_mat_entry(C, i, j) == 0);

another possible optimisation would be to have a dedicated function for matrix multiplication with the symmetric result (for HH^T is symmetric; other possible use would be to compute product of commuting symmetric matrices).

comment:21 follow-up: Changed 4 years ago by fredrik.johansson

That part of the tests amounts to approximately 0.000% of the total running time.

In any case, a constant factor anywhere in the code doesn't fundamentally change the imbalance between testing and generating, as you suggested above.

The reason testing is far slower is not a factor 2 inefficiency (or even a factor 6 or whatever from not using BLAS) in the matrix multiplication, but the fact that constructing a Hadamard matrix is only O(n^2) while testing is O(n^omega).

comment:22 in reply to: ↑ 21 Changed 4 years ago by dimpase

Replying to fredrik.johansson:

That part of the tests amounts to approximately 0.000% of the total running time.

In any case, a constant factor anywhere in the code doesn't fundamentally change the imbalance between testing and generating, as you suggested above.

as you imagine, I didn't profile this change. Still, it removes O(n^2) multiplications and comparisons. If you prefer to keep them in your code, it is up to you.

The reason testing is far slower is not a factor 2 inefficiency (or even a factor 6 or whatever from not using BLAS) in the matrix multiplication, but the fact that constructing a Hadamard matrix is only O(n^2) while testing is O(n^omega).

On a good BLAS, the speed-up is a function of the number of cores available... Then, as we know, the devil is in the constants; if matrices were optimised for the task at hand, you would only have started to notice on 10000x10000 matrices, not on 1000x1000.

Anyhow, this is an academic discussion.

comment:23 Changed 4 years ago by dimpase

Hadamard matrix, i.e. has its first row/column filles with +1.

should be filled

comment:24 Changed 4 years ago by git

  • Commit changed from a5a4bbff80f2aed2a5b549d6b64f215967bc5ec7 to 2553467c572dc462c4b514ba8e405fb928017928

Branch pushed to git repo; I updated commit sha1. New commits:

2553467trac #19340: Typo

comment:25 Changed 4 years ago by dimpase

my problem with this patch that it uses is_prime instead of is_prime_power to build Paley matrices. While the present implementation of Paley matrices needs primes, it's entirely unnecessary. Should I add a patch that fixes this here? (or else, surely it's not a big deal to process errors from Paley constructions here rather than limit their inputs...)

comment:26 follow-up: Changed 4 years ago by ncohen

see #19341.

comment:27 in reply to: ↑ 26 Changed 4 years ago by dimpase

  • Status changed from needs_review to positive_review

Replying to ncohen:

see #19341.

OK!

comment:28 Changed 4 years ago by ncohen

Thanks,

Nathann

P.S.: What about the... You know... Reviewer field?

comment:29 Changed 4 years ago by dimpase

  • Reviewers set to Dima Pasechnik

comment:30 Changed 4 years ago by vbraun

  • Branch changed from u/ncohen/19340 to 2553467c572dc462c4b514ba8e405fb928017928
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.