Opened 8 years ago
Closed 8 years ago
#13161 closed defect (fixed)
Fix unicode issue in axes labels
Reported by:  kcrisman  Owned by:  jason, was 

Priority:  major  Milestone:  sage5.3 
Component:  graphics  Keywords:  
Cc:  ddrake, shahuwang  Merged in:  sage5.3.rc0 
Authors:  Dan Drake, 王瑞期  Reviewers:  Sébastien Labbé 
Report Upstream:  N/A  Work issues:  
Branch:  Commit:  
Dependencies:  Stopgaps: 
Description (last modified by )
See this ask.sagemath.org post. The proposed fix there is to replace this line by
self.__axes_labels = (str(l[0]), str(l[1])) +self.__axes_labels = (l[0], l[1])
Considering that l
is already supposed to be a tuple or list of two strings, this seems pretty plausible.
Apply trac_13161.patch and trac_13161reviewsl.patch
Attachments (2)
Change History (26)
comment:1 Changed 8 years ago by
comment:2 followup: ↓ 3 Changed 8 years ago by
 Cc ddrake added
There is an update at the ask.sagemath.org post. Apparently it's actually a fair amount of work to get matplotlib to do certain unicode fonts properly... that doesn't mean this isn't a bug.
I'm wondering what the Sagemobile people at Sunkyunkwan U. in Korea do for this, if they do anything? Dan, do you know?
comment:3 in reply to: ↑ 2 ; followup: ↓ 5 Changed 8 years ago by
Replying to kcrisman:
I'm wondering what the Sagemobile people at Sunkyunkwan U. in Korea do for this, if they do anything? Dan, do you know?
In my experience, Korean developers are ignorant of encoding issues; it seems like they test it on their computer, and if it works, they ship it. This works fine when all your users are using the same version of Windows XP as you, but I still get lots of emails and see webpages in euckr or other encodings that don't declare their encoding and hence don't work when things default to utf8.
So, short answer: my guess is they don't do anything. But I'll ask them.
comment:4 Changed 8 years ago by
I came up with this bug yesterday using sage5.0. Here are the notes I took :
The following works:
sage: c = circle((0,0), 1) sage: c.axes_labels(['X', 'Y']) sage: c
Accent in normal string gets a ValueError:
sage: c.axes_labels(['an accent : é', 'Y']) sage: c Traceback (most recent call last): ... ValueError: matplotlib display text must have all code points < 128 or use Unicode strings
Accent in unicode string gets translated to str which yields a UnicodeError :
sage: c.axes_labels([u'an accent : é', 'Y']) Traceback (most recent call last): ... > 634 self.__axes_labels = (str(l[0]), str(l[1])) ... UnicodeEncodeError: 'ascii' codec can't encode characters in position 1213: ordinal not in range(128)
Not declaring the unicode for command text
also gives a ValueError:
sage: text('an accent : é', (0,0)) Traceback (most recent call last): ... ValueError: matplotlib display text must have all code points < 128 or use Unicode strings
But declaring unicode works for command text
. Although, no error are raised, the result is not good : the "é" gets printed as the two characters "~A commercial c" :
sage: text(u'an accent : é', (0,0))
Strangely it gets printed correctly if the command is written in a file (without declaring encoding) and then attached :
# file.sage t = text(u'an accent : é', (0,0))
and then
sage: attach file.sage sage: t
is perfect! This shows that somehow matplotlib is able to work with unicode strings. We just have to make it work properly.
comment:5 in reply to: ↑ 3 Changed 8 years ago by
Replying to ddrake:
So, short answer: my guess is they don't do anything. But I'll ask them.
To followup: their response is, they didn't do anything special and do run into encoding issues here and there.
comment:6 Changed 8 years ago by
 Status changed from new to needs_review
Patch up. I removed the str()
calls, and also just wrote tuple(l)
, since the code guarantees that we have a tuple or list of length 2.
I see the problem reported by Sébastian, though; using Unicode strings, the generated plots have the wrong characters. But I think that is a font problem and could be fixed on another ticket; for now, let's get our own code working.
comment:7 followup: ↓ 8 Changed 8 years ago by
Hi, on sage5.0, I get the following reject :
10 slabbe@pol ~/Applications/sage5.0/devel/sage/sage/plot $ cat graphics.py.rej  graphics.py +++ graphics.py @@ 637,7 +656,7 @@ raise TypeError, "l must be a list or tuple" if len(l) != 2: raise ValueError, "l must have length 2"  self._axes_labels = (str(l[0]), str(l[1])) + self._axes_labels = tuple(l) def axes_label_color(self, c=None): r"""
I need to install a newer version of Sage to do the review...
Sébastien
comment:8 in reply to: ↑ 7 Changed 8 years ago by
comment:9 followup: ↓ 10 Changed 8 years ago by
 Reviewers set to Sébastien Labbé
 Status changed from needs_review to positive_review
Problem is fixed (tested on sage5.2.rc0). All tests passed on sage/plot/graphics.py
. Documention builds fine and looks good from html and command line. I will open a new ticket for the other problem mentionned above about how matplolib handles unicode.
Positive review.
comment:10 in reply to: ↑ 9 Changed 8 years ago by
I will open a new ticket for the other problem mentionned above about how matplolib handles unicode.
This is now #13296
comment:11 Changed 8 years ago by
I'm just temporarily adding the original author by his ask.sagemath handle. Hopefully we'll get a complete name soon.
comment:12 Changed 8 years ago by
 Milestone changed from sage5.2 to sage5.3
comment:13 Changed 8 years ago by
 Status changed from positive_review to needs_work
This breaks when building the PDF documentation:
! Package ucs Error: Unknown Unicode character 45812 = U+B2F4, (ucs) possibly declared in uni178.def. (ucs) Type H to see if it is available with options. See the ucs package documentation for explanation. Type H <return> for immediate help. ... l.39867 ...s}{<EB><8B><B4>}\PYG{l+s}{'}\PYG{p}{]}\PYG{p}{)} ? ! Emergency stop. ... l.39867 ...s}{<EB><8B><B4>}\PYG{l+s}{'}\PYG{p}{]}\PYG{p}{)} ! ==> Fatal error occurred, no output PDF file produced!
comment:14 followup: ↓ 15 Changed 8 years ago by
I never builded the documentation as pdf before. I just did it and get the same error.
Could this be related to the first line added by the patch:
# * encoding: utf8 *
which should be
# * coding: utf8 *
?
Another possibility would be to write the test as
sage: c.axes_labels(['z',u'\xeb\x8b\xb4']) sage: c._axes_labels ('z', u'\xeb\x8b\xb4')
instead of
sage: c.axes_labels(['z',u'ëŽ']) sage: c._axes_labels ('z', u'\xeb\x8b\xb4')
I am testing those two options right now.
comment:15 in reply to: ↑ 14 Changed 8 years ago by
I am testing those two options right now.
I get the same error for both options...
comment:16 Changed 8 years ago by
 Status changed from needs_work to needs_review
My guess is that the Asian (?) character used in the doctest is not recognized by utf8 encoding, or at least by PDFLaTeX's unicode implementation. So we should try a different character instead, for example with this change:

sage/plot/graphics.py
diff git a/sage/plot/graphics.py b/sage/plot/graphics.py
a b class Graphics(SageObject): 637 637 :: 638 638 639 639 sage: c = circle((0,0), 1) 640 sage: c.axes_labels(['z',u' 담'])640 sage: c.axes_labels(['z',u'é']) 641 641 sage: c._axes_labels 642 ('z', u'\x eb\x8b\xb4')642 ('z', u'\xc3\xa9') 643 643 644 644 sage: c.axes_labels([u'an accent : é', 'Y']) 645 645 sage: c._axes_labels
I'm attaching a new patch which does this.
Changed 8 years ago by
comment:17 Changed 8 years ago by
 Description modified (diff)
comment:18 followup: ↓ 21 Changed 8 years ago by
Just for interestingness, couldn't you use an actual word instead of one character?
For example, "intérêt" (French for interest, a word which might plausably appear as label).
comment:19 followup: ↓ 20 Changed 8 years ago by
 Cc shahuwang added
Hi shahuwang  is this a good way to have your name? Although usually we use moreorless Roman characters for our stuff (including for e.g. Korean developers), maybe this is a first time...
comment:20 in reply to: ↑ 19 Changed 8 years ago by
Replying to kcrisman:
Hi shahuwang  is this a good way to have your name? Although usually we use moreorless Roman characters for our stuff (including for e.g. Korean developers), maybe this is a first time...
Hi, it seems that there will be some diffcult if I use my name in Chinese,sorry for that,and I will change it into Roman characters.Thanks very much.
Also,I want to report another bug of sage to deal with CJK(Chinese,Japanese,Korean),I open a ticket for it ,this is the ticket .As I am not familiar with the process,maybe I have made some mistake here.
Changed 8 years ago by
comment:21 in reply to: ↑ 18 Changed 8 years ago by
Replying to jdemeyer:
Just for interestingness, couldn't you use an actual word instead of one character?
For example, "intérêt" (French for interest, a word which might plausably appear as label).
I just added a patch which applies on top of the previous one and which adds real life example of french word which can occur in axis label and which contain an accent :
 Axes des abscisses (axe des X)
 Axes des ordonnées (axe des Y)
Needs review!
comment:22 Changed 8 years ago by
 Description modified (diff)
comment:23 Changed 8 years ago by
 Status changed from needs_review to positive_review
Looks good to me.
comment:24 Changed 8 years ago by
 Merged in set to sage5.3.rc0
 Resolution set to fixed
 Status changed from positive_review to closed
In fact, I get a kind of weird error trying to do things like
anyway, so we could have better support for this in general. One shouldn't have to set a matplotlib option.