Opened 7 months ago
Closed 6 months ago
#30106 closed defect (fixed)
sage.libs.ecl: Fix unicode handling
Reported by: | mkoeppe | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | sage-9.2 |
Component: | symbolics | Keywords: | |
Cc: | gh-mwageringel, nbruin, dimpase, gh-spaghettisalat | Merged in: | |
Authors: | Matthias Koeppe | Reviewers: | Markus Wageringel |
Report Upstream: | N/A | Work issues: | |
Branch: | 59dd62b (Commits) | Commit: | 59dd62b301bb487452c99d97fabb2f4b180a7c1b |
Dependencies: | #22191 | Stopgaps: |
Description
As a follow-up to #29278, #29280: If we use Unicode variable names in SR, declaring a domain gives an error:
sage: SR.var('Ο', domain='real')
RuntimeError: ECL says: THROW: The catch MACSYMA-QUIT is undefined.
SystemError: <built-in method var of sage.symbolic.ring.SymbolicRing object at 0x334506908> returned a result with an error set
This comes from our ECL interface:
sage: from sage.libs.ecl import *
sage: u_symbol = EclObject('🔥')
sage: u_symbol
<repr(<sage.libs.ecl.EclObject at 0x337e7b3c8>) failed: UnicodeDecodeError: 'utf-8' codec can't decode byte 0x94 in position 2: invalid start byte>
sage: u_symbol.python()
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x94 in position 2: invalid start byte
Also note:
sage: b_symbol = EclObject(bytes([166]))
sage: b_symbol
<repr(<sage.libs.ecl.EclObject at 0x337e7b058>) failed: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa6 in position 0: invalid start byte>
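The decoding failure in the last example can be reproduced without ECL. The following is a minimal sketch in plain Python, using only built-in functionality: the lone byte 166 (0xa6) is a UTF-8 continuation byte, so it cannot start a character, whereas the same byte is a valid Latin-1 character. The variable names are illustrative and are not taken from sage.libs.ecl.

```python
# A single byte 0xa6 cannot begin a UTF-8 sequence, so strict decoding fails.
raw = bytes([166])
try:
    raw.decode("utf-8")
    message = None
except UnicodeDecodeError as exc:
    message = str(exc)

print(message)

# The same byte is a valid character under Latin-1 (U+00A6).
latin1_text = raw.decode("latin-1")

# A non-ASCII character such as the Greek letter xi (U+03BE)
# occupies two bytes in UTF-8.
xi_bytes = "\u03be".encode("utf-8")
print(xi_bytes)
```

This suggests why any interface that passes raw bytes between Python and Lisp has to agree on an encoding in both directions.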
Change History (32)
comment:1 Changed 7 months ago by
- Description modified (diff)
comment:2 Changed 7 months ago by
- Cc gh-mwageringel added; mwageringel removed
comment:3 Changed 7 months ago by
comment:4 Changed 7 months ago by
- Description modified (diff)
- Milestone changed from sage-wishlist to sage-9.2
comment:5 Changed 7 months ago by
- Branch set to u/mkoeppe/sage_libs_ecl__fix_unicode_handling
comment:6 Changed 7 months ago by
- Commit set to 6ce8f9e2cb46b5fc58c5c329201d6f74301946c3
Branch pushed to git repo; I updated commit sha1. New commits:
b6ebb52 | src/sage/cpython/*string*: Update documentation
6ce8f9e | python_to_ecl: Handle unicode strings
comment:7 Changed 7 months ago by
Now
sage: SR.var('Ο', domain='real')
Ο
but more work is needed.
comment:8 Changed 7 months ago by
- Commit changed from 6ce8f9e2cb46b5fc58c5c329201d6f74301946c3 to c98352349e1cf8989e3f4ba758788e5004c63abe
Branch pushed to git repo; I updated commit sha1. New commits:
c983523 | EclObject.str: Handle unicode
comment:9 Changed 7 months ago by
sage: from sage.libs.ecl import *
sage: u_symbol = EclObject('🔥')
sage: u_symbol
<ECL: 🔥>
comment:10 Changed 7 months ago by
- Commit changed from c98352349e1cf8989e3f4ba758788e5004c63abe to f2379353686eeecfaa0ae1b49936a212e4df146a
Branch pushed to git repo; I updated commit sha1. New commits:
f237935 | ecl_to_python: Handle unicode strings
comment:11 Changed 7 months ago by
sage: u_string = EclObject('"π"')
sage: u_string
<ECL: "π">
sage: _.python()
'"π"'
comment:12 Changed 7 months ago by
- Commit changed from f2379353686eeecfaa0ae1b49936a212e4df146a to 6f740eb71d017088d90d891379cc826a0f4a3577
Branch pushed to git repo; I updated commit sha1. New commits:
6f740eb | ecl_eval: Handle unicode strings
comment:13 Changed 7 months ago by
sage: clock = ecl_eval('''(defun clock (h) (string (elt "ππππππππππππ" (mod h 12))))''')
sage: clock(3).python()
'"π"'
comment:14 Changed 7 months ago by
- Status changed from new to needs_review
comment:15 follow-up: ↓ 17 Changed 7 months ago by
Thank you for solving this.
Should our Maxima interface support unicode with this branch now, or do you know what else is needed to make this work? The following used to give an error, but results in a crash now:
sage: var('ξ')._maxima_()
Condition of type: SIMPLE-ERROR
Cannot coerce string Cannot coerce string $_SAGE_VAR_ξ to a base-string to a base-string
No restarts available.
...
Excessive debugger depth! Probable infinite recursion!
Quitting.
Also, please add a doctest for the new functionality.
comment:16 follow-up: ↓ 18 Changed 7 months ago by
Are you on ECL 20.4?
comment:17 in reply to: ↑ 15 Changed 7 months ago by
- Status changed from needs_review to needs_work
Replying to gh-mwageringel:
The following used to give an error, but results in a crash now:
sage: var('ξ')._maxima_()
Condition of type: SIMPLE-ERROR
Cannot coerce string Cannot coerce string $_SAGE_VAR_ξ to a base-string to a base-string
No restarts available.
...
Excessive debugger depth! Probable infinite recursion!
Quitting.
Yes, I can confirm this here. I'll look into this.
Also, please add a doctest for the new functionality.
Sure, will do.
comment:18 in reply to: ↑ 16 Changed 7 months ago by
comment:19 Changed 7 months ago by
- Commit changed from 6f740eb71d017088d90d891379cc826a0f4a3577 to ee97a6d8617974f8ed2538d3d85c9ede91fa89db
Branch pushed to git repo; I updated commit sha1. New commits:
ee97a6d | Fix infinite recursion when error messages are not base-strings
comment:20 Changed 7 months ago by
I suggest we work on ECL 20.4, for uniformity.
comment:21 follow-up: ↓ 24 Changed 7 months ago by
The above error is due to a bug in ECL 16.1.2:
> (princ-to-string 'ξ)
"Ξ"
> (setf local-table (copy-readtable nil))
> (setf (readtable-case local-table) :invert)
> :INVERT
> (let ((*readtable* local-table) (*print-case* :upcase)) (princ-to-string 'ξ))
Cannot coerce string Ξ to a base-string
I will try with the ECL upgrade now.
comment:22 Changed 7 months ago by
- Commit changed from ee97a6d8617974f8ed2538d3d85c9ede91fa89db to 2d87ee362ed1ea6c7833e1e8509456927c82c23b
Branch pushed to git repo; I updated commit sha1. Last 10 new commits:
0076bc3 | backport Maxima fix for bug #3629
1d074e8 | doctest fixes
266d8c1 | backport ECL PR #210
12447bc | reject old makeinfo
a3e0eca | add upstream fix from MR 215
89b006b | add the patch from upstream MR 214
0b77737 | add upstream MR 216 (to fix cygwin fork)
8ca1c0e | Merge tag '9.2.beta2' into t/22191/public/packages/ecl20
f82c716 | Commit 75877dd8 from upstream
2d87ee3 | Merge commit 'f82c716fdf9c6e91a07166d36b6329a15ecfb41d' of git://trac.sagemath.org/sage into t/30106/sage_libs_ecl__fix_unicode_handling
comment:23 Changed 7 months ago by
- Commit changed from 2d87ee362ed1ea6c7833e1e8509456927c82c23b to 828a727915e1920964c7bdbdd52e50b84e731a62
Branch pushed to git repo; I updated commit sha1. New commits:
828a727 | Add doctests
comment:24 in reply to: ↑ 21 Changed 7 months ago by
Replying to mkoeppe:
The above error is due to a bug in ECL 16.1.2:
> (princ-to-string 'ξ)
"Ξ"
> (setf local-table (copy-readtable nil))
> (setf (readtable-case local-table) :invert)
> :INVERT
> (let ((*readtable* local-table) (*print-case* :upcase)) (princ-to-string 'ξ))
Cannot coerce string Ξ to a base-string
I will try with the ECL upgrade now.
This bug is still present in ECL 20.4.24. The above code is from Maxima's PRINT-INVERT-CASE function in commac.lisp.
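The error message refers to Lisp's base-string type, which can hold only base-char characters. As a rough illustration in Python (not ECL's implementation), the sketch below checks whether every code point of a string falls below a given limit. The function name fits_base_string and the default limit of 256 are assumptions made for illustration; the actual base-char range depends on how ECL was built.

```python
def fits_base_string(text, base_char_code_limit=256):
    """Return True if every character's code point is below the limit.

    Hypothetical helper: the default limit of 256 is an assumption,
    not a value read from ECL.
    """
    return all(ord(ch) < base_char_code_limit for ch in text)


ascii_name = "$_SAGE_VAR_x"
greek_name = "$_SAGE_VAR_\u03be"  # ends with the Greek letter xi (U+03BE)

print(fits_base_string(ascii_name))
print(fits_base_string(greek_name))
```

Under this assumption, a name containing ξ (code point 958) cannot be coerced to a base-string, which is consistent with the message quoted above.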
comment:25 Changed 7 months ago by
- Status changed from needs_work to needs_review
comment:26 Changed 7 months ago by
- Dependencies set to #22191
comment:27 Changed 7 months ago by
Note that after commit ee97a6d, the test case from comment 17 now gives a proper error message, no longer a crash:
sage: var('ξ')._maxima_()
....: <repr(<sage.interfaces.maxima_lib.MaximaLibElement at 0x32c1d4828>) failed: RuntimeError: ECL says: Cannot coerce string $_SAGE_VAR_ξ to a base-string>
Fixing this ECL bug or working around it is outside of the scope of this ticket.
comment:28 Changed 7 months ago by
- Reviewers set to Markus Wageringel
Ok, thanks for the fix. There is one more place where ecl_string_to_python should be used, namely in print_objects:
sage: from sage.libs.ecl import *
sage: u_symbol = EclObject('🔥')
sage: print_objects() # crashes
Moreover, the docstring of ecl_eval should be changed into a raw string, as the examples contain backslashes.
Once that is fixed, you can set a positive review on my behalf.
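The point about raw docstrings can be illustrated with a small Python sketch. The two functions below are hypothetical and are not the actual ecl_eval; they only show that a backslash sequence in an ordinary docstring is interpreted as an escape, while a raw docstring keeps it literally.

```python
def plain_doc():
    """Evaluate (princ "a\nb")."""


def raw_doc():
    r"""Evaluate (princ "a\nb")."""


# In the ordinary docstring, \n has become a real newline character.
print(repr(plain_doc.__doc__))

# In the raw docstring, the backslash and the letter n are both preserved,
# so a doctest example would be shown and run exactly as written.
print(repr(raw_doc.__doc__))
```

The raw version is one character longer, because the two-character sequence backslash-n is not collapsed into a single newline.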
comment:29 Changed 7 months ago by
- Commit changed from 828a727915e1920964c7bdbdd52e50b84e731a62 to 59dd62b301bb487452c99d97fabb2f4b180a7c1b
Branch pushed to git repo; I updated commit sha1. New commits:
99b894a | print_objects: Handle unicode
59dd62b | ecl_eval: Make docstring raw
comment:31 Changed 7 months ago by
Follow-up ticket to keep track of the ECL issue: #30122
comment:32 Changed 6 months ago by
- Branch changed from u/mkoeppe/sage_libs_ecl__fix_unicode_handling to 59dd62b301bb487452c99d97fabb2f4b180a7c1b
- Resolution set to fixed
- Status changed from positive_review to closed
Not sure if there is a function in the ECL C API that constructs a Unicode Lisp string. So I would just be going through CODE-CHAR as in