Opened 9 years ago

Closed 8 years ago

Last modified 6 years ago

#12923 closed defect (fixed)

Comparison of vectors is O(n) even in the simple cases

Reported by: hivert Owned by: jason, was
Priority: major Milestone: sage-5.3
Component: linear algebra Keywords: vector equality days38
Cc: mguaypaq Merged in: sage-5.3.beta0
Authors: Florent Hivert Reviewers: Travis Scrimshaw
Report Upstream: N/A Work issues:
Branch: Commit:
Dependencies: Stopgaps:

Description

Comparison of large vectors in Sage is slow in a surprising way: even if all the entries of the vectors are different, the cost is proportional to the length of the vector instead of having constant cost !

sage: l = 1000;   m0 = vector(ZZ, [0]*l); m1 = vector(ZZ, [1]*l)
sage: %timeit m0 == m1
625 loops, best of 3: 656 盜 per loop
sage: l = 10000;   m0 = vector(ZZ, [0]*l); m1 = vector(ZZ, [1]*l)
sage: %timeit m0 == m1
125 loops, best of 3: 5.66 ms per loop
sage: l = 100000;   m0 = vector(ZZ, [0]*l); m1 = vector(ZZ, [1]*l)
sage: %timeit m0 == m1
5 loops, best of 3: 59.1 ms per loop

This seems to affect all the dense vectors independently from the base ring.

Attachments (1)

trac_12923-free_mod_cmp_fix-fh.patch (7.7 KB) - added by hivert 9 years ago.

Download all attachments as: .zip

Change History (12)

comment:1 Changed 9 years ago by hivert

  • Keywords days38 added

Changed 9 years ago by hivert

comment:2 Changed 9 years ago by hivert

  • Cc mguaypaq added
  • Status changed from new to needs_review

comment:3 follow-up: Changed 9 years ago by tscrim

  • Status changed from needs_review to needs_work

It works fine for integers, but observe the following:

sage: l = 1000; v1 = vector(ZZ, [0]*l); v2 = vector(QQ, [1]*l)                 
sage: %timeit v1 == v2
5 loops, best of 3: 2.25 ms per loop
sage: %timeit v1 == v2
125 loops, best of 3: 3.24 ms per loop
sage: %timeit v1 == v2
125 loops, best of 3: 3.27 ms per loop
sage: %timeit v1 == v2
125 loops, best of 3: 3.35 ms per loop
sage: l = 2000; v1 = vector(ZZ, [0]*l); v2 = vector(QQ, [1]*l)
sage: %timeit v1 == v2
5 loops, best of 3: 4.72 ms per loop
sage: %timeit v1 == v2
125 loops, best of 3: 6.81 ms per loop
sage: %timeit v1 == v2
125 loops, best of 3: 6.64 ms per loop

I can't test any higher because running these tests completely kills my memory (the first comparison will allocate ~150MB RAM, the second ~450MB RAM). It only allocates when I call the comparison. Also my memory usage did not drop when I reassign v1 and v2.

comment:4 in reply to: ↑ 3 Changed 9 years ago by hivert

  • Status changed from needs_work to needs_review

Replying to tscrim:

It works fine for integers, but observe the following:

sage: l = 1000; v1 = vector(ZZ, [0]*l); v2 = vector(QQ, [1]*l)

I can't test any higher because running these tests completely kills my memory (the first comparison will allocate ~150MB RAM, the second ~450MB RAM). It only allocates when I call the comparison. Also my memory usage did not drop when I reassign v1 and v2.

I'd rather to see that as a different problem. The goal of the code here is to remove a bunch bug whose result was that the specialized code already written wasn't called. I'm not optimizing anything as I don't have a serious use case for those specific case, do you ? Note that I didn't made any serious benchmark either. Comparing vectors with different base rings, which implies different internal data structure could be certainly optimized, but I'd rather leaving it to a different ticket.

Cheers,

Florent

comment:5 Changed 9 years ago by tscrim

  • Status changed from needs_review to positive_review

Works for comparisons between vectors in the same data structure as you intended.

comment:6 Changed 9 years ago by jdemeyer

Please fill in your real name as Reviewer.

comment:7 Changed 9 years ago by tscrim

  • Reviewers set to Travis Scrimshaw

comment:8 Changed 8 years ago by jdemeyer

  • Merged in set to sage-5.3.beta0
  • Resolution set to fixed
  • Status changed from positive_review to closed

comment:9 follow-up: Changed 6 years ago by jdemeyer

In this patch, you added in several places

+    # __hash__ is not properly inherited if comparison is changed
+    def __hash__(self):
+        """
+        TEST::
+
+            sage: w = vector(ZZ, [-1,0,0,0])
+            sage: w.set_immutable()
+            sage: isinstance(hash(w), int)
+            True
+        """
+        return free_module_element.FreeModuleElement.__hash__(self)

What's the reason for this? Removing these __hash__ methods doesn't do any harm.

comment:10 in reply to: ↑ 9 Changed 6 years ago by hivert

Replying to jdemeyer:

In this patch, you added in several places What's the reason for this? Removing these __hash__ methods doesn't do any harm.

In Cython when you overload comparison __hash__ is not inherited anymore. See the following discussion on cython-user

https://groups.google.com/forum/#!searchin/cython-users/$20hash%28el%29$20does$20not$20work$20though/cython-users/6Jqb0v8g_Vo/f5jXtw4KpmYJ

comment:11 Changed 6 years ago by jdemeyer

Yes, sorry, you are right.

I got confused because

v.__hash__()

works but

hash(v)

doesn't.

Note: See TracTickets for help on using tickets.