Ticket #4975 (closed defect: fixed)

Opened 20 months ago

Last modified 20 months ago

[with patch, positive review] Sage 3.2.2 chokes on utf-8 encoded files

Reported by: mkasperski Owned by: mhansen
Priority: critical Milestone: sage-3.3
Component: user interface Keywords:
Cc: Author(s):
Report Upstream: Reviewer(s):
Merged in: Work issues:

Description

I just downloaded sage 3.2.2 and found it to choke on utf-8 encoded source files (for comparison, sage 3.0 handles them properly). My bet is that some problem happened during integration of issue 2637. Tested using sage-3.2.2-ubuntu32bit-intel-i686-Linux binary distribution.

Take the file like:

# -*- coding: utf-8 -*-

"""
Zażółć gęślą jaźń
"""

print 1

and try:

$ sage test.sage

Result:

  File "test.py", line 6
SyntaxError: Non-ASCII character '\xc5' in file test.py on line 7, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details

Produced file:

# This file was *autogenerated* from the file test.sage.
from sage.all_cmdline import *   # import sage library
_sage_const_1 = Integer(Integer(1))# -*- coding: utf-8 -*-

"""
Zażółć gęślą jaźń
"""

print _sage_const_1

Attachments

trac_4975.patch Download (1.9 KB) - added by mhansen 20 months ago.
trac_4975-2.patch Download (1.0 KB) - added by mhansen 20 months ago.

Change History

Changed 20 months ago by mabshoff

  • priority changed from minor to critical
  • milestone set to sage-3.3

Changed 20 months ago by mhansen

Changed 20 months ago by mhansen

  • owner changed from was to mhansen
  • status changed from new to assigned
  • summary changed from Sage 3.2.2 chokes on utf-8 encoded files to [with patch, needs review] Sage 3.2.2 chokes on utf-8 encoded files

I've attached a patch for the scripts directory which fixes this.

Changed 20 months ago by mabshoff

  • summary changed from [with patch, needs review] Sage 3.2.2 chokes on utf-8 encoded files to [with patch, positive review] Sage 3.2.2 chokes on utf-8 encoded files

Nice fix. Positive review.

We might want to think about writing some tests for the various sage scripts, but this is not subject to this ticket.

Cheers,

Michael

Changed 20 months ago by mabshoff

  • status changed from assigned to closed
  • resolution set to fixed

Merged in Sage 3.3.alpha0

Changed 20 months ago by mkasperski

  • status changed from closed to reopened
  • resolution fixed deleted

I can confirm that applying this patch on 3.2.2 resolves the problem I initially reported (cd ..../sage-3.2.2-ubuntu32bit-intel-i686-Linux/local/bin;

patch -p1 < .../trac_4975.patch)

Still, something is wrong. The following file is handled properly:

# -*- coding: utf-8 -*-

"""
Zażółć gęślą jaźń
"""

print 1
print u"Zażółć gęślą jaźń"

but if you comment out print 1:

# -*- coding: utf-8 -*-

"""
Zażółć gęślą jaźń"
"""

print u"Zażółć gęślą jaźń"

it fails again:

sage test2.sage
  File "test2.py", line 3
SyntaxError: Non-ASCII character '\xc5' in file test2.py on line 4, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details

What is interesting, removing any of elements (either a """ string, or print) makes the problem to disappear.

Here is .py generated from the file above:

"""
Zażółć gęślą jaźń"
"""
# -*- coding: utf-8 -*-
# This file was *autogenerated* from the file test2.sage.
from sage.all_cmdline import *   # import sage library

print u"Zażółć gęślą jaźń"

Changed 20 months ago by mkasperski

Having taken a look at the source, it seems that the problem happens here:

    insert = '%s%s%s.\nfrom sage.all_cmdline import *   # import sage library\n'%(coding, AUTOGEN_MSG, f)
    i = find_position_right_after_module_docstring(G)
    G = G[:i] + insert + G[i:]

as you see, if sage detects module docstring, it adds coding AFTER IT

Changed 20 months ago by mkasperski

And here is how I would rewrite those lines:

    if coding:
       G = coding + G
    insert = '%s%s.\nfrom sage.all_cmdline import *   # import sage library\n'%(AUTOGEN_MSG, f)
    i = find_position_right_after_module_docstring(G)
    G = G[:i] + insert + G[i:]

(sorry for no formal patch but I am not sure against what to make it)

Changed 20 months ago by mhansen

Changed 20 months ago by mhansen

Heh, I didn't even notice that line :-) I put up a patch which applies on top of the last one.

Changed 20 months ago by mabshoff

  • status changed from reopened to closed
  • resolution set to fixed

Please do not reopen merged tickets. The issue you reported has been fixed. If you found more problems these will be addressed via the new ticket.

Cheers,

Michael

Changed 20 months ago by mabshoff

Oops, I missed a couple of the last messages and I will apply Mike's second patch. But my remark still applies.

Cheers,

Michael

Changed 20 months ago by mabshoff

Merged trac_4975-2.patch in Sage 3.3.alpha0.

Cheers,

Michael

Note: See TracTickets for help on using tickets.