Opened 9 months ago
Closed 9 months ago
#32561 closed enhancement (fixed)
Speed up random bounded tolerance graph
Reported by: | dcoudert | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | sage-9.5 |
Component: | graph theory | Keywords: | |
Cc: | gh-kliem | Merged in: | |
Authors: | David Coudert | Reviewers: | Jonathan Kliem |
Report Upstream: | N/A | Work issues: | |
Branch: | b0a4794 (Commits, GitHub, GitLab) | Commit: | b0a4794fae8dc44c5a8ec38e3213099295205296 |
Dependencies: | Stopgaps: |
Description
Since #32186, we use Combinations
in RandomBoundedToleranceGraph
and the method is rather slow
sage: %timeit g = graphs.RandomBoundedToleranceGraph(8) 17.9 s ± 449 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Here we avoid the use of Combinations
and we get a much faster generator
sage: %timeit g = graphs.RandomBoundedToleranceGraph(8) 102 µs ± 2.38 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
Change History (15)
comment:1 Changed 9 months ago by
- Branch set to public/graphs/32561_random_tolgraph
- Commit set to 9eacbdc5821da99f97ee5aa56a39aae25cdb9b89
- Status changed from new to needs_review
comment:2 Changed 9 months ago by
- Commit changed from 9eacbdc5821da99f97ee5aa56a39aae25cdb9b89 to 04e0547947d1ac23f42a4790b7e36716faf0b2ff
comment:3 Changed 9 months ago by
I took the opportunity to remove multiple imports of randint
and give the same shape to method RandomToleranceGraph
.
comment:4 Changed 9 months ago by
While I agree with the need to change this, I don't agree with the implementation.
However, I don't really understand tolerance graphs and what kind of values give you interesting tolerance graphs.
With this little knowledge that I have, your implementation seems very strange distributed. Half of the intervals start on the right half. I think it should be much less.
So I propose the following changes:
tolrep = [] for _ in range(n): - l = randint(0, W - 1) + l = randint(0, W) - r = randint(l + 1, W) + r = randint(0, W - 1) + if r >= l: + l, r = r + 1, l tolrep.append((l, r, randint(1, r - l)))
The obtained random intervals will be symmtric. It is equivalent to
tolrep = [] for _ in range(n): l = randint(0, W) r = l while r == l: r = randint(0, W) ...
but much nicer as it avoids the loop.
The timings of Combinations(W, 2).random_element()
are absolutely awful. I wasn't aware of this. The complexity appears to grow quadratic to W
(and only about twice as fast as picking a random element from itertools.combinations(range(W), 2)
). Combinations
considers a multi-set though, which is a much more complex problem.
comment:5 Changed 9 months ago by
See also #32584, which improves Combinations(n, k).random_element()
and gives the following timings:
sage: %timeit g = graphs.RandomBoundedToleranceGraph(8) 85.3 ms ± 3.78 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
Not even close to the new timings, but the intervals uniformly distributed (from the set of all intervals). I don't know if this is desirable or not.
Edit: Well both distributions are the same. One problem is that random_element
of Combinations
is linear in the length of the list. Kind of nice, but not when obtaining combinations Combinations(16000, 10)
or similar.
comment:6 Changed 9 months ago by
- Commit changed from 04e0547947d1ac23f42a4790b7e36716faf0b2ff to b0a4794fae8dc44c5a8ec38e3213099295205296
comment:7 follow-up: ↓ 9 Changed 9 months ago by
I slightly changes the values since we want 0 <= l < r <= W.
We could also try with a single randint
in range [1, W*(W-1)/2]
and then retrieve l and r, but I'm not sure it's really better.
comment:8 Changed 9 months ago by
I guess I could open a new ticket and implement the following:
def random_combination(a, b, k): """ Return a list of ``k`` distinct sorted random integers ``i`` with ``a <= i <= b``. """ l = [randint(a, b - i) for i in range(k)] for i in range(k): for j in range(i): if l[i] >= l[j]: l[i] += 1 return sorted(l)
Then Combinations_setk
could make use of this, if k
is small relative to n
(the above function is quadratic in k
, which Combinations(n,k).random_element()
appears to be linear in n
.
comment:9 in reply to: ↑ 7 Changed 9 months ago by
Replying to dcoudert:
I slightly changes the values since we want 0 <= l < r <= W.
We could also try with a single
randint
in range[1, W*(W-1)/2]
and then retrieve l and r, but I'm not sure it's really better.
The improvement is good as it is. As I noted later, this is uniformly distributed, as what we are doing is equivalent to the following:
- Pick
l
,r
in withrandint(0, W)
. - Restart if
l == r
. - Sort
l
and `r.
comment:10 Changed 9 months ago by
- Reviewers set to Jonathan Kliem
comment:11 Changed 9 months ago by
I don't understand the relationship between combinations and the method you propose in #comment:8. Furthermore, I can get
sage: random_combination(2, 6, 5) [2, 3, 3, 5, 5]
At least, we have improved method RandomBoundedToleranceGraph
.
comment:12 Changed 9 months ago by
Yes, indeed this doesn't work. This is the correct code:
sage: def random_combination(a, b, k): ....: l = [randint(a, b - i) for i in range(k)] ....: for i in range(k): ....: for j in range(i): ....: if l[i] >= l[j]: ....: l[i] += 1 ....: l[:i+1] = sorted(l[:i+1]) ....: return l
This is exactly what we do for the case k=2
: random_combination(0, W(n), 2)
.
comment:13 Changed 9 months ago by
- Status changed from needs_review to positive_review
Anyway, LGTM.
comment:14 Changed 9 months ago by
Thanks for the review and all the comments.
comment:15 Changed 9 months ago by
- Branch changed from public/graphs/32561_random_tolgraph to b0a4794fae8dc44c5a8ec38e3213099295205296
- Resolution set to fixed
- Status changed from positive_review to closed
New commits:
trac #32561: faster RandomBoundedToleranceGraph