Opened 6 years ago

Closed 4 months ago

#16073 closed enhancement (fixed)

Python 3 preparation: The semantic of map() function is changed

Reported by: wluebbe Owned by:
Priority: major Milestone: sage-duplicate/invalid/wontfix
Component: python3 Keywords: python3
Cc: embray, jdemeyer, fbissey, tscrim, jhpalmieri, vklein Merged in:
Authors: Wilfried Luebbe, André Apitzsch Reviewers:
Report Upstream: N/A Work issues:
Branch: Commit:
Dependencies: Stopgaps:

Description (last modified by wluebbe)

In Py2 map() returns a list, while in Py3 map() returns an iterator (as itertools.imap() does in Py2).

The tool 2to3 wraps map() usages with a call to list().
An alternative approach is to add from future_builtins import map and to check the code.

This ticket is tracked as a dependency of meta-ticket ticket:16052.

Change History (70)

comment:1 Changed 5 years ago by vbraun_spam

  • Milestone changed from sage-6.2 to sage-6.3

comment:2 Changed 5 years ago by wluebbe

  • Branch set to u/wluebbe/ticket/16073
  • Commit set to 8c8e5dc6783c29333679e28d38ea596f520e3258

This first commit was generated by 2to3. It has failing doctests! These will be fixed with a separate commit.


New commits:

8c8e5dctrac #16073: changes generated by "2to3 -w -f map src/sage"

comment:3 Changed 5 years ago by git

  • Commit changed from 8c8e5dc6783c29333679e28d38ea596f520e3258 to b786139346f3c79ef980824ca607b8c0d40f2c22

Branch pushed to git repo; I updated commit sha1. New commits:

b786139trac #16073: fix an error exposed by the generated code

comment:4 Changed 5 years ago by wluebbe

The problem was that in one module the standard library function list had been overwritten. Now all doctests pass.

comment:5 Changed 5 years ago by wluebbe

The code would now certainly benefit from changing expressions like list(map(...)) into a list comprehension.

This is more pythonic, more readable (especially in the case list(map(lambda ...)) ) and it avoids redundant list creation in Python 2. But it's a lot of stuff to check and change ...

comment:6 Changed 5 years ago by wluebbe

This patch will cause a lot of merge conflicts with #16067 filter. It seems advisable to wait until #16067 is merged and then regenerate this patch.

Somebody to review #16067 ... ;-)

comment:7 Changed 5 years ago by wluebbe

  • Status changed from new to needs_review

comment:8 Changed 5 years ago by wluebbe

  • Authors set to Wilfried Luebbe
  • Status changed from needs_review to needs_work

comment:9 Changed 5 years ago by vbraun_spam

  • Milestone changed from sage-6.3 to sage-6.4

comment:10 Changed 5 years ago by aapitzsch

  • Branch changed from u/wluebbe/ticket/16073 to public/16073
  • Commit changed from b786139346f3c79ef980824ca607b8c0d40f2c22 to 9ebf5fd033b4a6340c817c52ee7b257484cb8ebb

Rebased.


New commits:

9ebf5fdMerge remote-tracking branch 'origin/develop' into 16073

comment:11 Changed 5 years ago by git

  • Commit changed from 9ebf5fd033b4a6340c817c52ee7b257484cb8ebb to dc803c154c077bb345715e40a662fa14beb38c99

Branch pushed to git repo; I updated commit sha1. New commits:

dc803c1remove some unnecessary conversions to list

comment:12 Changed 5 years ago by aapitzsch

  • Status changed from needs_work to needs_review

comment:13 Changed 5 years ago by darij

The patch is red.

Either way, shouldn't it be only wrapping some of the uses, rather than all of them?

comment:14 Changed 5 years ago by git

  • Commit changed from dc803c154c077bb345715e40a662fa14beb38c99 to 6ddc5fc6e2673e0e0edd31e11b1c37350230ca48

Branch pushed to git repo; I updated commit sha1. New commits:

6ddc5fcMerge remote-tracking branch 'origin/develop' into 16073

comment:15 follow-up: Changed 5 years ago by darij

OK, but as I expected there are some questionable things here.

- r = sum ( map( lambda x: x[2], cycle ) )
+ r = sum ( [x[2] for x in cycle] )

Wouldn't keeping the old line be better? Summing an iterator should be faster than summing a list which has been constructed merely to be summed.

comment:16 Changed 5 years ago by git

  • Commit changed from 6ddc5fc6e2673e0e0edd31e11b1c37350230ca48 to 8331e848b1ab06e87ed5114926d941083ff0ae7f

Branch pushed to git repo; I updated commit sha1. New commits:

8331e84remove more unnecessary conversions to list

comment:17 in reply to: ↑ 15 Changed 5 years ago by aapitzsch

Replying to darij:

OK, but as I expected there are some questionable things here.

- r = sum ( map( lambda x: x[2], cycle ) )
+ r = sum ( [x[2] for x in cycle] )

Wouldn't keeping the old line be better? Summing an iterator should be faster than summing a list which has been constructed merely to be summed.

You are right.

In this case we should write

r = sum ( (x[2] for x in cycle) )

Feel free to fix it (and others), currently I have no time to check the necessity of list() wrapping.

comment:18 follow-up: Changed 5 years ago by darij

- return "".join(map(lambda x: self._character_to_code[x], string))
+ return "".join([self._character_to_code[x] for x in string])

Wondering if this is a good step forward, given that you can join an iterator to a string:

>>> map(lambda x: x+x, ['a','b','c'])
<map object at 0x7fd747ddee10>
>>> "blah".join(map(lambda x: x+x, ['a','b','c']))
'aablahbbblahcc'

(Python 3).


- map(
- lambda x: chr(ord(x) + 49),
- list(str(num[0]))),
+ [chr(ord(x) + 49) for x in list(str(num[0]))],

Can't we get rid of the list?


- a,b,c,d,e,f = map(G,R)
+ a,b,c,d,e,f = list(map(G,R))

This seems to not be necessary:

>>> a,b,c = iter([1,2,3])
>>> a
1
>>> b
2
>>> c
3

This is unnecessary:

- if all(map(lambda s: s.is_final, state.label())):
+ if all([s.is_final for s in state.label()]):

If anything then all definitely works on iterators.


Here is the thing that worries me the most: There are tools for automatic translation of python2 into 3. I am not sure if we should, or are going to, release them on the Sage library; but if we are, I would rather not do any manual work unless we can make sure that it will not interfere with the automated changes. I have a hunches that a well-written bot could do better than some of the changes in this patch (for instance, it should be a rule that all(map(...)) can stay as it is rather than being replaced by all(list(map(...)))), thus leading to better speed, whereas with the changes already done it would not simplify them back to what they should be. This does not speak against edits such as replacing list variables by list_ and likewise, which are the right thing to do either way. But I would be wary of making changes that possibly complicate the situation without knowing that it will become untangled when the time comes to actually switch to python 3.

Last edited 5 years ago by darij (previous) (diff)

comment:19 Changed 5 years ago by git

  • Commit changed from 8331e848b1ab06e87ed5114926d941083ff0ae7f to 99e57e20ca0f39d11c00a263f82a9de7d4f836ae

Branch pushed to git repo; I updated commit sha1. New commits:

99e57e2remove furthr unnecessary conversions to list

comment:20 in reply to: ↑ 18 Changed 5 years ago by aapitzsch

Darij thanks. I hopefully made all the changes you suggested.

Replying to darij:

Here is the thing that worries me the most: There are tools for automatic translation of python2 into 3. I am not sure if we should, or are going to, release them on the Sage library; but if we are, I would rather not do any manual work unless we can make sure that it will not interfere with the automated changes. I have a hunches that a well-written bot could do better than some of the changes in this patch (for instance, it should be a rule that all(map(...)) can stay as it is rather than being replaced by all(list(map(...)))), thus leading to better speed, whereas with the changes already done it would not simplify them back to what they should be. This does not speak against edits such as replacing list variables by list_ and likewise, which are the right thing to do either way. But I would be wary of making changes that possibly complicate the situation without knowing that it will become untangled when the time comes to actually switch to python 3.

These tools don't work well in all cases and usually one can disable special fixes. So manual work won't interfere with automated changes.

The target of these changes is to simplify the switch to python 3 without complicating the situation. :)

comment:21 follow-up: Changed 5 years ago by wluebbe

I got

1 item had failures:
   3 of 211 in sage.tests.cmdline.test_executable
    [210 tests, 3 failures, 29.22 s]
----------------------------------------------------------------------
sage -t --long src/sage/tests/cmdline.py  # 3 doctests failed

while there were no failures with plain sage 6.4. I have not yet found the cause.

comment:22 in reply to: ↑ 21 Changed 5 years ago by aapitzsch

Replying to wluebbe:

I got

1 item had failures:
   3 of 211 in sage.tests.cmdline.test_executable
    [210 tests, 3 failures, 29.22 s]
----------------------------------------------------------------------
sage -t --long src/sage/tests/cmdline.py  # 3 doctests failed

while there were no failures with plain sage 6.4. I have not yet found the cause.

I have no problems with cmdline.py. Results keep the same with and without merging origin/develop into 16073.

sage -t --long src/sage/tests/cmdline.py
    [210 tests, 32.88 s]
----------------------------------------------------------------------
All tests passed!

comment:23 Changed 5 years ago by wluebbe

Strange! And Volker just released sage-6.5.beta0 and auto-merge failed again ... :-(

By the way, back in June I was working to replace map with list comprehension or generator expressions (mostly). It was not quite complete when I took some time off.

I want to give that approach a fresh start on sage-6.5.beta0. I will describe the reasoning tomorrow (or so ...).

comment:24 Changed 5 years ago by git

  • Commit changed from 99e57e20ca0f39d11c00a263f82a9de7d4f836ae to f47ac2b0aeef8cd6172206c1f6ef36c326dafdc9

Branch pushed to git repo; I updated commit sha1. New commits:

f47ac2bMerge remote-tracking branch 'origin/develop' into 16073

comment:25 Changed 5 years ago by wluebbe

I uploaded a new branch u/wluebbe/ticket/16073-2.

My goal is to replace all map calls with list comprehension or generator expressions. Currently there are still about 40 list(map(... in the code (somehow my regexp did not find them :-/ ) and a number of maps in the doctests. So there is still some work to do.

The drawback of wrapping map with list is that in Python2 this is redundant. If a context accepts iterators then in Python2 map cannot exploit this (as it returns a list). Remapping map in Python2 on its iterator version itertools.imap requires extra code.

The current branch does change list comprehensions to generator expressions in all possible cases. I would rather leave this refactoring to a new ticket since there are lots of applicable cases separate from 16073.

I am very much interested in feedback to u/wluebbe/ticket/16073-2.

comment:26 Changed 5 years ago by darij

wluebbe: sorry for the stupid question, but what is the precise difference between the two branches? I cannot tell it from the post.

comment:27 Changed 5 years ago by wluebbe

The 2 branches (the current public/16073 and u/wluebbe/ticket/16073-2) are alternative approaches. I did not want to be so impolite and just swap the branches :-) And I did not think it is appropriate to open a separate ticket.

comment:28 Changed 5 years ago by darij

I have added your branch to #15379 (an accidental duplicate ticket I made a year ago) to be able to see it on the trac viewer.

I have to say I have no idea which of your two branches is the better one :/

comment:29 Changed 5 years ago by wluebbe

My branch gets rid of (mostly) all map(...) and list(map(...)) calls replacing them with list comprehension or generator expressions. There are currently a few (about 40) list(map(...)) calls left over (as e.g. in src/sage/graphs/bipartite_graph.py lines 643ff).

comment:30 Changed 5 years ago by git

  • Commit changed from f47ac2b0aeef8cd6172206c1f6ef36c326dafdc9 to cf901f987a14edbad31608be8f5cf88b9d099748

Branch pushed to git repo; I updated commit sha1. New commits:

cf901f9Merge remote-tracking branch 'origin/develop' into 16073

comment:31 Changed 4 years ago by chapoton

  • Status changed from needs_review to needs_work

does not apply

comment:32 Changed 4 years ago by git

  • Commit changed from cf901f987a14edbad31608be8f5cf88b9d099748 to 2c1de15add5af1438c4be1bab3b21accff978e9a

Branch pushed to git repo; I updated commit sha1. New commits:

2c1de15Merge remote-tracking branch 'origin/develop' into 16073

comment:33 Changed 4 years ago by aapitzsch

  • Authors changed from Wilfried Luebbe to Wilfried Luebbe, André Apitzsch
  • Status changed from needs_work to needs_review

comment:34 follow-up: Changed 4 years ago by jdemeyer

I find [func(_) for _ in L] more readable than list(map(func, L)).

comment:35 in reply to: ↑ 34 Changed 4 years ago by aapitzsch

  • Status changed from needs_review to needs_work

Replying to jdemeyer:

I find [func(_) for _ in L] more readable than list(map(func, L)).

I will change it.

comment:36 Changed 4 years ago by git

  • Commit changed from 2c1de15add5af1438c4be1bab3b21accff978e9a to 3ce7033d5eb0377908986f18fa0b2e0681b303af

Branch pushed to git repo; I updated commit sha1. New commits:

3ce7033make 'list(map(..))' more readable

comment:37 Changed 4 years ago by aapitzsch

  • Status changed from needs_work to needs_review

comment:38 Changed 4 years ago by git

  • Commit changed from 3ce7033d5eb0377908986f18fa0b2e0681b303af to 02ceae747267fe0610ebbcfb126712742cca189d

Branch pushed to git repo; I updated commit sha1. New commits:

02ceae7revert changes in random_variable.py

comment:39 Changed 4 years ago by jdemeyer

I'm sorry but this is much too boring to review.

comment:40 follow-up: Changed 4 years ago by jdemeyer

Are the changes scripted? If so, can you just tell me which script you used to generate them, then I'll review the script instead.

comment:41 in reply to: ↑ 40 ; follow-up: Changed 4 years ago by aapitzsch

Replying to jdemeyer:

Are the changes scripted? If so, can you just tell me which script you used to generate them, then I'll review the script instead.

There is no script. But if necessary we can split this ticket.

comment:42 Changed 4 years ago by git

  • Commit changed from 02ceae747267fe0610ebbcfb126712742cca189d to ce34bea5d35fbba9d7c3d4a642c51cbe87d18e22

Branch pushed to git repo; I updated commit sha1. New commits:

ce34beaMerge remote-tracking branch 'origin/develop' into 16073

comment:43 Changed 4 years ago by aapitzsch

  • Milestone changed from sage-6.4 to sage-6.7

comment:44 in reply to: ↑ 41 Changed 4 years ago by jdemeyer

Replying to aapitzsch:

There is no script.

That actually worries me. If one has to manually review each individual change, that is a lot of changes to review and a lot of potential for mistakes also.

comment:45 Changed 4 years ago by aapitzsch

I still can split this ticket. Would it be sufficient to move the changes in src/sage/combinat/ to a new ticket?

comment:46 Changed 4 years ago by jdemeyer

Well, my preferred solution would be if the majority of the changes would be scripted.

Splitting would help also but maybe 2 pieces are not enough.

comment:47 Changed 4 years ago by git

  • Commit changed from ce34bea5d35fbba9d7c3d4a642c51cbe87d18e22 to 2916b218b07d9ae20dd030737392df9a1f55419b

Branch pushed to git repo; I updated commit sha1. New commits:

2916b21revert changes in src/sage/combinat

comment:48 Changed 4 years ago by git

  • Commit changed from 2916b218b07d9ae20dd030737392df9a1f55419b to 038b9475df4bbda26bdba92b20f2318bbbca5d10

Branch pushed to git repo; I updated commit sha1. New commits:

038b947revert more changes

comment:49 Changed 4 years ago by aapitzsch

I added three additional tickets (#18472, #18473, #18474).

comment:50 Changed 4 years ago by darij

Can this now be set to closed?

comment:51 Changed 4 years ago by wluebbe

  • Dependencies set to #18472, #18473, #18474, #18531
  • Status changed from needs_review to needs_work

No, unfortunately there are still a significant number of maps in the code base.

@André: Can you please move the branch public/16073 from this ticket into a new, separate ticket. Then I will review the change. - Reason: I will create one (maybe two) extra tickets/changes (this is now #18531) for the leftover maps that I found during the review of #18473 and #18474. And I would prefer that this tickets only refers to the tickets containing the changes.

comment:52 Changed 4 years ago by git

  • Commit changed from 038b9475df4bbda26bdba92b20f2318bbbca5d10 to 2df6d8488ef2496ff22cdf45027fc08ab40fe913

Branch pushed to git repo; I updated commit sha1. New commits:

2df6d84Merge remote-tracking branch 'origin/develop' into 16073

comment:53 Changed 4 years ago by aapitzsch

  • Branch public/16073 deleted
  • Commit 2df6d8488ef2496ff22cdf45027fc08ab40fe913 deleted
  • Dependencies changed from #18472, #18473, #18474, #18531 to #18472, #18473, #18474, #18531, #18532

comment:54 Changed 4 years ago by wluebbe

There are still a lot of map() call in the code.

But not all occurrences must be removed: there are a number of places where a Py3 map()-iterator leads to the same result as the Py2 map().

Here is a long (but incomplete) list:

all(here_is_an_iterable_allowed)
any(here_is_an_iterable_allowed)
bytearray(here_is_an_iterable_allowed)
dict(here_is_an_iterable_of_key_value_pairs_allowed)
filter(function, here_is_an_iterable_allowed)
frozenset(here_is_an_iterable_allowed)
list(here_is_an_iterable_allowed)
map(function, here_is_an_iterable_allowed)
max(here_is_an_iterable_allowed)
min(here_is_an_iterable_allowed)
reduce(function, here_is_an_iterable_allowed)
set(here_is_an_iterable_allowed)
sorted(here_is_an_iterable_allowed)
sum(here_is_an_iterable_allowed)
tuple(here_is_an_iterable_allowed)
zip(here_is_an_iterable_allowed)

''.join(here_is_an_iterable_allowed)
a,b,c* = here_is_an_iterable_allowed
x[i:j] = here_is_an_iterable_allowed
a_dict.update(here_is_an_iterable_of_key_value_pairs_allowed)
file.writelines(here_is_an_iterable_allowed)
math.fsum(here_is_an_iterable_allowed)

and MANY functions in module itertools.

for i in here_is_an_iterable_allowed:
    pass
[2*x for x in here_is_an_iterable_allowed]
(2*x for x in here_is_an_iterable_allowed)

But if the use of map includes a lambda then replacing it with a list comprehension improves the readability!

comment:55 Changed 4 years ago by wluebbe

  • Dependencies changed from #18472, #18473, #18474, #18531, #18532 to #18472, #18473, #18474, #18531, #18532, #18553
  • Description modified (diff)

comment:56 Changed 3 years ago by chapoton

  • Component changed from distribution to python3
  • Milestone changed from sage-6.7 to sage-7.2

comment:57 Changed 3 years ago by jdemeyer

  • Dependencies #18472, #18473, #18474, #18531, #18532, #18553 deleted

comment:58 Changed 3 years ago by chapoton

see #22121 and #22127 for steps in this direction

Last edited 3 years ago by chapoton (previous) (diff)

comment:59 Changed 2 years ago by chapoton

  • Milestone changed from sage-7.2 to sage-8.0

comment:60 Changed 2 years ago by chapoton

another step in #23084

comment:61 Changed 2 years ago by chapoton

another step in #23130

comment:62 Changed 22 months ago by chapoton

next #24382

comment:63 Changed 21 months ago by chapoton

next in #24572

comment:64 Changed 18 months ago by embray

Some more in #25231

comment:65 Changed 17 months ago by embray

Some more in #25418

comment:66 Changed 15 months ago by vklein

Some more in #25925

comment:67 Changed 4 months ago by chapoton

  • Cc embray jdemeyer fbissey tscrim jhpalmieri vklein added
  • Milestone changed from sage-8.0 to sage-duplicate/invalid/wontfix

Probably one coud close this one too ? Agreed ?

comment:68 Changed 4 months ago by chapoton

  • Status changed from needs_work to needs_review

comment:69 Changed 4 months ago by vklein

  • Status changed from needs_review to positive_review

Sure.

comment:70 Changed 4 months ago by chapoton

  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.