Opened 3 years ago
Closed 3 years ago
#27542 closed defect (fixed)
FindStat, unicode and rational numbers
Reported by:  mantepse  Owned by:  

Priority:  blocker  Milestone:  sage8.8 
Component:  python3  Keywords:  FindStat 
Cc:  chapoton  Merged in:  
Authors:  Martin Rubey  Reviewers:  Christian Stump 
Report Upstream:  N/A  Work issues:  
Branch:  1a363bd (Commits, GitHub, GitLab)  Commit:  1a363bd8d8eacf6db4d741ce2323503e5d2fc2a0 
Dependencies:  Stopgaps: 
Description (last modified by )
As it turns out, #27346 actually broke the FindStat? interface. We now have:
sage: findstat(914) # optional  internet <repr(<__main__.FindStatStatistic at 0x7fbc3dc1ec10>) failed: UnicodeEncodeError: 'ascii' codec can't encode character u'\xf6' in position 30: ordinal not in range(128)>
The reason is the following change, which was made for py3 compatibility.
@ 966,9 +967,9 @@ class FindStatStatistic(SageObject): else: raise  self._description = self._raw[FINDSTAT_STATISTIC_DESCRIPTION].encode("utf8")  self._name = self._raw[FINDSTAT_STATISTIC_NAME].encode("utf8")  self._references = self._raw[FINDSTAT_STATISTIC_REFERENCES].encode("utf8") + self._description = self._raw[FINDSTAT_STATISTIC_DESCRIPTION] + self._name = self._raw[FINDSTAT_STATISTIC_NAME] + self._references = self._raw[FINDSTAT_STATISTIC_REFERENCES] self._collection = FindStatCollection(self._raw[FINDSTAT_STATISTIC_COLLECTION]) self._code = self._raw[FINDSTAT_STATISTIC_CODE] self._sage_code = self._raw[FINDSTAT_STATISTIC_SAGE_CODE] @@ 1044,7 +1045,7 @@ class FindStatStatistic(SageObject): stat_str = "\n".join(["\n".join(keys) + "\n====> " + values for (keys, values) in stat]) verbose("Sending the following data to FindStat\r\n %s" % stat_str, caller_name='FindStat')  values = urlencode({"freedata": stat_str, "depth": str(self._depth), "caller": "Sage"}) + values = urlencode({"freedata": stat_str, "depth": str(self._depth), "caller": "Sage"}).encode("utf8") verbose("Fetching URL %s with encoded data %s" % (url, values), caller_name='FindStat')
It went unnoticed, because the doctests were marked # random
...
Another problem is that rational numbers can be interpreted as lists, which confuses the interface. For example,
sage: findstat([(la, la[0]/1) for la in Partitions(10)], depth=0)
fails with the cryptic message that there are more values than elements.
Change History (15)
comment:1 Changed 3 years ago by
 Branch set to u/mantepse/findstat_and_unicode
comment:2 Changed 3 years ago by
 Commit set to 3e43bdbb8bc956fa67e185bea3f1a8795351e634
comment:3 Changed 3 years ago by
 Commit changed from 3e43bdbb8bc956fa67e185bea3f1a8795351e634 to 669d3f587642b488aad2cd162cc0289936773eea
Branch pushed to git repo; I updated commit sha1. New commits:
669d3f5  one encoding for python 3

comment:4 Changed 3 years ago by
 Status changed from new to needs_review
I realise that it is not optimal, but I haven't got a clue how to do it properly.
comment:5 Changed 3 years ago by
 Priority changed from major to blocker
Making this a blocker because it turns out that the interface is almost unusable for serious work on python2 now, and there is a fix.
comment:6 followup: ↓ 7 Changed 3 years ago by
 I'm wondering how stable the findstat output is. My impression is that it's not and that's the reason why they were marked
# random
in the first place. If you remove# random
, then you might get failures in the future whenever the output from findstat changes.
 Wouldn't it make sense to move the encoding to
__repr__
? In other words, useunicode
everywhere except at the very end when you're returning from__repr__
.
comment:7 in reply to: ↑ 6 ; followup: ↓ 9 Changed 3 years ago by
Replying to jdemeyer:
 I'm wondering how stable the findstat output is. My impression is that it's not and that's the reason why they were marked
# random
in the first place. If you remove# random
, then you might get failures in the future whenever the output from findstat changes.
This is correct. However, unfortunately, adding # random
also led to the current situation, where I have a completely disfunctional interface in a release. Since I plan to work a fair bit on the interface in the coming three months, I think it is better to have failing doctests which I can make pass using judicious use of ...
, rather than having all doctests pass because they are not really tested.
 Wouldn't it make sense to move the encoding to
__repr__
? In other words, useunicode
everywhere except at the very end when you're returning from__repr__
.
I have no idea. I'll try, thank you for the suggestion!
comment:8 followup: ↓ 11 Changed 3 years ago by
Alternatively, if you just want to test that repr()
works, it suffices to add a doctest
x = repr(findstat(...)) # check that repr() doesn't raise an exception
comment:9 in reply to: ↑ 7 Changed 3 years ago by
Replying to mantepse:
 Wouldn't it make sense to move the encoding to
__repr__
? In other words, useunicode
everywhere except at the very end when you're returning from__repr__
.I have no idea. I'll try, thank you for the suggestion!
I just tried. This does work for FindStatStatistic.__repr__
. However, I then have to put an addtional check into FindStatStatistic.references
, because this returns a FancyTuple
, whose __repr__
doesn't like unicode either.
It seems to me that having the check in multiple places makes it more brittle, but I am easy to convince.
(apparently, encoding FindStatStatistic._description
is unnecessary, though)
comment:10 Changed 3 years ago by
 Commit changed from 669d3f587642b488aad2cd162cc0289936773eea to d94fbc0b46c0b6067cac4332c7df3146ac3ef322
Branch pushed to git repo; I updated commit sha1. New commits:
d94fbc0  remove an unnecessary encoding

comment:11 in reply to: ↑ 8 Changed 3 years ago by
Replying to jdemeyer:
Alternatively, if you just want to test that
repr()
works, it suffices to add a doctestx = repr(findstat(...)) # check that repr() doesn't raise an exception
OK. I'll keep this in mind. For the moment, I think I prefer to keep the doctests, and make them sufficiently stable.
comment:12 Changed 3 years ago by
 Commit changed from d94fbc0b46c0b6067cac4332c7df3146ac3ef322 to 1a363bd8d8eacf6db4d741ce2323503e5d2fc2a0
Branch pushed to git repo; I updated commit sha1. New commits:
1a363bd  fix problem with rational values being interpreted as lists

comment:13 Changed 3 years ago by
 Description modified (diff)
 Summary changed from FindStat and unicode to FindStat, unicode and rational numbers
comment:14 Changed 3 years ago by
 Reviewers set to Christian Stump
 Status changed from needs_review to positive_review
Lgtm, everything seems to work and bot is green.
comment:15 Changed 3 years ago by
 Branch changed from u/mantepse/findstat_and_unicode to 1a363bd8d8eacf6db4d741ce2323503e5d2fc2a0
 Resolution set to fixed
 Status changed from positive_review to closed
Branch pushed to git repo; I updated commit sha1. New commits:
remove many random tags and make test pass, distinguish between python 2 and python 3