Opened 9 months ago

Closed 9 months ago

#32561 closed enhancement (fixed)

Speed up random bounded tolerance graph

Reported by: dcoudert Owned by:
Priority: major Milestone: sage-9.5
Component: graph theory Keywords:
Cc: gh-kliem Merged in:
Authors: David Coudert Reviewers: Jonathan Kliem
Report Upstream: N/A Work issues:
Branch: b0a4794 (Commits, GitHub, GitLab) Commit: b0a4794fae8dc44c5a8ec38e3213099295205296
Dependencies: Stopgaps:

Status badges

Description

Since #32186, we use Combinations in RandomBoundedToleranceGraph and the method is rather slow

sage: %timeit g = graphs.RandomBoundedToleranceGraph(8)
17.9 s ± 449 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Here we avoid the use of Combinations and we get a much faster generator

sage: %timeit g = graphs.RandomBoundedToleranceGraph(8)
102 µs ± 2.38 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

Change History (15)

comment:1 Changed 9 months ago by dcoudert

  • Branch set to public/graphs/32561_random_tolgraph
  • Commit set to 9eacbdc5821da99f97ee5aa56a39aae25cdb9b89
  • Status changed from new to needs_review

New commits:

9eacbdctrac #32561: faster RandomBoundedToleranceGraph

comment:2 Changed 9 months ago by git

  • Commit changed from 9eacbdc5821da99f97ee5aa56a39aae25cdb9b89 to 04e0547947d1ac23f42a4790b7e36716faf0b2ff

Branch pushed to git repo; I updated commit sha1. New commits:

4b050d6trac #32561: randint is imported at module level
04e0547trac #32561: also modifies RandomToleranceGraph

comment:3 Changed 9 months ago by dcoudert

I took the opportunity to remove multiple imports of randint and give the same shape to method RandomToleranceGraph.

comment:4 Changed 9 months ago by gh-kliem

While I agree with the need to change this, I don't agree with the implementation.

However, I don't really understand tolerance graphs and what kind of values give you interesting tolerance graphs.

With this little knowledge that I have, your implementation seems very strange distributed. Half of the intervals start on the right half. I think it should be much less.

So I propose the following changes:

     tolrep = []
     for _ in range(n):
-        l = randint(0, W - 1)
+        l = randint(0, W)
-        r = randint(l + 1, W)
+        r = randint(0, W - 1)
+        if r >= l:
+            l, r = r + 1, l
         tolrep.append((l, r, randint(1, r - l)))

The obtained random intervals will be symmtric. It is equivalent to

    tolrep = []
    for _ in range(n):
        l = randint(0, W)
        r = l
        while r == l:
            r = randint(0, W)
        ...

but much nicer as it avoids the loop.

The timings of Combinations(W, 2).random_element() are absolutely awful. I wasn't aware of this. The complexity appears to grow quadratic to W (and only about twice as fast as picking a random element from itertools.combinations(range(W), 2)). Combinations considers a multi-set though, which is a much more complex problem.

comment:5 Changed 9 months ago by gh-kliem

See also #32584, which improves Combinations(n, k).random_element() and gives the following timings:

sage: %timeit g = graphs.RandomBoundedToleranceGraph(8)
85.3 ms ± 3.78 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

Not even close to the new timings, but the intervals uniformly distributed (from the set of all intervals). I don't know if this is desirable or not.

Edit: Well both distributions are the same. One problem is that random_element of Combinations is linear in the length of the list. Kind of nice, but not when obtaining combinations Combinations(16000, 10) or similar.

Last edited 9 months ago by gh-kliem (previous) (diff)

comment:6 Changed 9 months ago by git

  • Commit changed from 04e0547947d1ac23f42a4790b7e36716faf0b2ff to b0a4794fae8dc44c5a8ec38e3213099295205296

Branch pushed to git repo; I updated commit sha1. New commits:

d919666trac #32561: merged with 9.5.beta2
b0a4794trac #32561: review comments

comment:7 follow-up: Changed 9 months ago by dcoudert

I slightly changes the values since we want 0 <= l < r <= W.

We could also try with a single randint in range [1, W*(W-1)/2] and then retrieve l and r, but I'm not sure it's really better.

comment:8 Changed 9 months ago by gh-kliem

I guess I could open a new ticket and implement the following:

def random_combination(a, b, k):
    """
    Return a list of ``k`` distinct sorted random integers ``i`` with ``a <= i <= b``.
    """
    l = [randint(a, b - i) for i in range(k)]
    for i in range(k):
        for j in range(i):
            if l[i] >= l[j]:
                l[i] += 1
    return sorted(l)

Then Combinations_setk could make use of this, if k is small relative to n (the above function is quadratic in k, which Combinations(n,k).random_element() appears to be linear in n.

comment:9 in reply to: ↑ 7 Changed 9 months ago by gh-kliem

Replying to dcoudert:

I slightly changes the values since we want 0 <= l < r <= W.

We could also try with a single randint in range [1, W*(W-1)/2] and then retrieve l and r, but I'm not sure it's really better.

The improvement is good as it is. As I noted later, this is uniformly distributed, as what we are doing is equivalent to the following:

  1. Pick l, r in with randint(0, W).
  2. Restart if l == r.
  3. Sort l and `r.

comment:10 Changed 9 months ago by gh-kliem

  • Reviewers set to Jonathan Kliem

comment:11 Changed 9 months ago by dcoudert

I don't understand the relationship between combinations and the method you propose in #comment:8. Furthermore, I can get

sage: random_combination(2, 6, 5)
[2, 3, 3, 5, 5]

At least, we have improved method RandomBoundedToleranceGraph.

comment:12 Changed 9 months ago by gh-kliem

Yes, indeed this doesn't work. This is the correct code:

sage: def random_combination(a, b, k):
....:     l = [randint(a, b - i) for i in range(k)]
....:     for i in range(k):
....:         for j in range(i):
....:             if l[i] >= l[j]:
....:                 l[i] += 1
....:         l[:i+1] = sorted(l[:i+1])
....:     return l

This is exactly what we do for the case k=2: random_combination(0, W(n), 2).

comment:13 Changed 9 months ago by gh-kliem

  • Status changed from needs_review to positive_review

Anyway, LGTM.

comment:14 Changed 9 months ago by dcoudert

Thanks for the review and all the comments.

comment:15 Changed 9 months ago by vbraun

  • Branch changed from public/graphs/32561_random_tolgraph to b0a4794fae8dc44c5a8ec38e3213099295205296
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.