Opened 4 years ago

Closed 4 years ago

#25948 closed defect (fixed)

py3: a few more miscellaneous dict iterator (dict.keys, dict.values) fixes

Reported by: embray Owned by:
Priority: major Milestone: sage-8.4
Component: python3 Keywords: sagedays@icerm
Cc: Merged in:
Authors: Erik Bray Reviewers: Travis Scrimshaw
Report Upstream: N/A Work issues:
Branch: 69f8e03 (Commits, GitHub, GitLab) Commit: 69f8e03fa62eb8ed4ddc0c113486d0aab7553a12
Dependencies: Stopgaps:

Status badges

Description

As in #25947, found by grepping through the test output. A couple of these are far-reaching.

Change History (14)

comment:1 Changed 4 years ago by embray

  • Status changed from new to needs_review

comment:2 Changed 4 years ago by tscrim

  • Keywords sagedays@icerm added
  • Reviewers set to Travis Scrimshaw

I think it would be better to do list(d.values()) than list(itervalues(d)), both because it means less six dependence (which means it looks like Python3 in terms of the code) and for speed in Python2:

sage: from six import itervalues
sage: d = {i:i for i in range(100)}
sage: %timeit list(itervalues(d))
The slowest run took 33.55 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 1.34 µs per loop
sage: %timeit list(d.values())
The slowest run took 13.27 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 899 ns per loop

Given that these lists are likely small (less than 100, probably averaging between 5-10), I think any memory duplication is negligible.

Otherwise LGTM.

comment:3 Changed 4 years ago by git

  • Commit changed from ce5c10b4ea36729f4fa3d916e982e72df394cc4b to 75e0d2c3ac82d895d9153830490fd7f162173dc1

Branch pushed to git repo; I updated commit sha1. New commits:

5973ae6py3: a couple more minor dict iterator fixes
fa50082py3: return lists instead of dict iterators for these methods
664a969py3: more minor fixes involving dict iterators and dict sorting
75e0d2cpy3: one more dict.keys -> list fix

comment:4 Changed 4 years ago by embray

Added a few more; still all trivial one-liners I think.

comment:5 Changed 4 years ago by git

  • Commit changed from 75e0d2c3ac82d895d9153830490fd7f162173dc1 to af8b427e4baa8dab4d61e9f5e94a121b2ecac389

Branch pushed to git repo; I updated commit sha1. New commits:

af8b427py3: fix a couple AttributeErrors on dict.iteritems

comment:6 Changed 4 years ago by embray

We've had this discussion before, and IMO unless you can show that it has a noticeable performance impact in this specific case I don't care enough to change it.

comment:7 Changed 4 years ago by tscrim

  • Branch changed from u/embray/python3/misc/dict-iterators to u/tscrim/python3_misc_dict_iters-25948
  • Commit changed from af8b427e4baa8dab4d61e9f5e94a121b2ecac389 to 91503870b530276d48bc43b0589abe0f8593f834

I care a little bit more. :) If my change is good, then positive review.


New commits:

9150387Remove itervalues.

comment:8 Changed 4 years ago by embray

It looks like you changed it in some places but not others. Was that a result of more specific benchmarks?

The only reason I don't care as that the benchmark you cite could be very significant in some contexts, or very insignificant or even wrong in other contexts. I certainly care about about more specific benchmarks.

comment:9 Changed 4 years ago by git

  • Commit changed from 91503870b530276d48bc43b0589abe0f8593f834 to 69f8e03fa62eb8ed4ddc0c113486d0aab7553a12

Branch pushed to git repo; I updated commit sha1. New commits:

64b21f1Merge branch 'u/tscrim/python3_misc_dict_iters-25948' of git://trac.sagemath.org/sage into u/tscrim/python3_misc_dict_iters-25948
69f8e03Making the output machine-independent.

comment:10 Changed 4 years ago by tscrim

My branch

sage: A = ClusterAlgebra(['A',[1,2],1])
sage: A.current_seed()
The initial seed of a Cluster Algebra with cluster variables x0, x1, x2 and no coefficients over Integer Ring
sage: for k in [0,1,2]*4:
....:     A.current_seed().mutate(k)
....:     
sage: %timeit A.F_polynomials_so_far()
The slowest run took 11.30 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 443 ns per loop

vs your branch

sage: %timeit A.F_polynomials_so_far()
The slowest run took 64.77 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 806 ns per loop

So while this is not a big shift in timings (despite the 2x speedup), running this method repeatedly can start taking time:

sage: A = ClusterAlgebra(['A',[1,2],1])
sage: %%timeit
....: A.reset_current_seed()
....: for k in [0,1,2]*6:
....:     A.current_seed().mutate(k)
....:     A.F_polynomials_so_far()
....: 
The slowest run took 12.77 times longer than the fastest. This could mean that an intermediate result is being cached.
1000 loops, best of 3: 1.49 ms per loop

vs your branch

The slowest run took 12.22 times longer than the fastest. This could mean that an intermediate result is being cached.
1000 loops, best of 3: 1.58 ms per loop

I also changed some tests as they were failing on my machine and the patchbot to make them machine-independent (I am assuming they were passing on your machine).

comment:11 Changed 4 years ago by embray

I also changed some tests as they were failing on my machine and the patchbot to make them machine-independent (I am assuming they were passing on your machine).

I'm not sure I understand this. Why would the polynomials be sorted differently on different machines? The sorting I had looks like it makes sense. If it was giving different results on Python 2 that still might be a real issue worth looking into instead of just doing key=str.

comment:12 Changed 4 years ago by embray

  • Status changed from needs_review to needs_work

I see, here are a couple examples:

sage -t --long src/sage/algebras/cluster_algebra.py
**********************************************************************
File "src/sage/algebras/cluster_algebra.py", line 101, in sage.algebras.cluster_algebra
Failed example:
    sorted(A.cluster_variables_so_far())
Expected:
    [x0, x1, (x0 + 1)/x1, (x1 + 1)/x0, (x0 + x1 + 1)/(x0*x1)]
Got:
    [x1, (x0 + 1)/x1, x0, (x1 + 1)/x0, (x0 + x1 + 1)/(x0*x1)]
**********************************************************************
File "src/sage/algebras/cluster_algebra.py", line 1678, in sage.algebras.cluster_algebra.ClusterAlgebra.cluster_variables_so_far
Failed example:
    sorted(A.cluster_variables_so_far())
Expected:
    [x0, x1, (x1 + 1)/x0]
Got:
    [x1, x0, (x1 + 1)/x0]

I'm not sure what's up with that. Maybe there's another change on my Python 3 branch this is dependent on...

comment:13 Changed 4 years ago by embray

  • Status changed from needs_work to positive_review

I see now--ClusterAlgebraElement obtains its comparison operators from ElementWrapper, and defines things such that both < and > always return False. I'm not sure how I feel about that, but nevertheless it means from Python's perspective they can be "sorted", but the sorting is meaningless.

I'm not 100% sure I like str as a sort key in these examples, but it doesn't really matter either way, as long as it gives consistent results.

comment:14 Changed 4 years ago by vbraun

  • Branch changed from u/tscrim/python3_misc_dict_iters-25948 to 69f8e03fa62eb8ed4ddc0c113486d0aab7553a12
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.