Opened 2 years ago
Closed 17 months ago
#30473 closed enhancement (fixed)
Unicode operators for sage.manifolds
Reported by:  Matthias Köppe  Owned by:  

Priority:  major  Milestone:  sage9.4 
Component:  user interface  Keywords:  
Cc:  Eric Gourgoulhon, Travis Scrimshaw, Markus Wageringel, Frédéric Chapoton  Merged in:  
Authors:  Eric Gourgoulhon  Reviewers:  Matthias Koeppe 
Report Upstream:  N/A  Work issues:  
Branch:  f2ae50e (Commits, GitHub, GitLab)  Commit:  f2ae50e68687b59af27e35a4ec02b0f667ea828c 
Dependencies:  Stopgaps: 
Description (last modified by )
Replacing
 /\ by ∧ (U+2227)
 * by ⊗ (U+2297) for tensor products
 --> by → (U+2192)
 |--> by ↦ (U+21A6)
 d/dx by ∂/∂x, etc. (U+2202)
 R (real field) by ℝ (U+211D)
 C (complex field) by ℂ (U+2102)
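As a quick check of the code points listed above, the following minimal Python sketch prints each replacement character with its official Unicode name. The dictionary keys are illustrative labels only; they are not identifiers from Sage.

```python
import unicodedata

# Replacement characters from the ticket description, keyed by an
# illustrative label (the labels are not Sage identifiers).
REPLACEMENTS = {
    "wedge product": "\u2227",
    "tensor product": "\u2297",
    "arrow": "\u2192",
    "maps to": "\u21a6",
    "partial derivative": "\u2202",
    "real field": "\u211d",
    "complex field": "\u2102",
}


def describe(char):
    """Return 'U+XXXX <char> <Unicode name>' for a single character."""
    return f"U+{ord(char):04X} {char} {unicodedata.name(char)}"


for label, char in REPLACEMENTS.items():
    print(f"{label:20s} {describe(char)}")
```

Note that U+2227 is formally named LOGICAL AND in the Unicode standard; it is the glyph commonly used for the wedge product.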
Some references:
 https://docs.python.org/3/howto/unicode.html
 https://en.wikipedia.org/wiki/List_of_mathematical_symbols_by_subject
 https://en.wikipedia.org/wiki/Mathematical_operators_and_symbols_in_Unicode
 https://unicode.org/charts/PDF/U2200.pdf
 https://unicode.org/charts/PDF/U2100.pdf
With the code introduced in this ticket, we have
sage: M = Manifold(2, 'M')
sage: X.<x,y> = M.chart()
sage: M.identity_map().display()
Id_M: M → M
   (x, y) ↦ (x, y)
sage: M.zero_scalar_field().display()
zero: M → ℝ
   (x, y) ↦ 0
sage: v = M.vector_field(-y, x, name='v')
sage: v.display()
v = -y ∂/∂x + x ∂/∂y
sage: X.frame()
Coordinate frame (M, (∂/∂x,∂/∂y))
sage: v.wedge(X.frame()[0]).display()
v∧∂/∂x = -x ∂/∂x∧∂/∂y
sage: f = M.scalar_field(x^2 + y^2, name='f')
sage: f.display()
f: M → ℝ
   (x, y) ↦ x^2 + y^2
sage: diff(f).display()
df = 2*x dx + 2*y dy
sage: (v*diff(f)).display()
v⊗df = -2*x*y ∂/∂x⊗dx - 2*y^2 ∂/∂x⊗dy + 2*x^2 ∂/∂y⊗dx + 2*x*y ∂/∂y⊗dy
Attachments (3)
Change History (70)
comment:1 Changed 2 years ago by
comment:2 Changed 2 years ago by
Milestone:  sage9.2 → sage9.3 

comment:3 Changed 2 years ago by
Cc:  Frédéric Chapoton added 

comment:4 followup: 5 Changed 23 months ago by
That's a really nice idea! Is it also possible to entail the first two characters as operators in Python code?
comment:5 followup: 6 Changed 23 months ago by
Replying to ghmjungmath:
Is it also possible to entail the first two characters as operators in Python code?
That is not possible in Python, but it could be added to the Sage preparser, as has been done for the backslash operator. Some people at sd109 voiced an interest in such custom unicode operators.
There is also a decorator that turns functions into infix operators which works similarly to the backslash operator, but, as unicode operators are not valid function names, that does not really help.
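To illustrate the idea of turning a function into an infix operator, here is a minimal pure-Python sketch based on operator overloading. It is not Sage's decorator; the class name Infix and the |otimes| spelling are hypothetical. It also shows why this approach does not yield Unicode operators: the name between the bars must still be an ordinary Python identifier, and ⊗ is not one.

```python
class Infix:
    """Wrap a two-argument function so it can be written as  a |op| b."""

    def __init__(self, func):
        self.func = func

    def __ror__(self, left):
        # Evaluated for  left | op : remember the left operand.
        return Infix(lambda right: self.func(left, right))

    def __or__(self, right):
        # Evaluated for  (left | op) | right : apply the stored function.
        return self.func(right)


@Infix
def otimes(a, b):
    # Toy "tensor product" of two labels, joined with U+2297.
    return f"{a}\u2297{b}"


print("dx" |otimes| "dy")
```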
comment:6 followup: 7 Changed 22 months ago by
Replying to ghmwageringel:
Replying to ghmjungmath:
Is it also possible to entail the first two characters as operators in Python code?
That is not possible in Python, but it could be added to the Sage preparser, as has been done for the backslash operator. Some people at sd109 voiced an interest in such custom unicode operators.
I'd like that very much, too. Is the preparser also capable of something like
type "\otimes" -> press TAB -> unicode character pops up
similarly to Greek letters in Py3 right now?
comment:7 followup: 53 Changed 22 months ago by
Replying to ghmjungmath:
Is the preparser also capable of something like
type "\otimes" -> press TAB -> unicode character pops up
similarly to Greek letters in Py3 right now?
That is an IPython feature, which does not seem to be implemented for operators like \otimes. The preparser is not capable of this.
comment:8 Changed 22 months ago by
Milestone:  sage9.3 → sage9.4 

Setting new milestone based on a cursory review of ticket status, priority, and last modification date.
comment:9 Changed 17 months ago by
Description:  modified (diff) 

comment:10 Changed 17 months ago by
I gave it a try, by changing line 736 of src/sage/tensor/modules/free_module_tensor.py:
- basis_term_txt = "*".join(bases_txt)
+ basis_term_txt = "\u2297".join(bases_txt)
and changing doctests accordingly. Everything seems OK with doctests and the html documentation. However, generating the pdf documentation by
sage --docbuild reference/tensor_free_modules pdf
failed, with the error:
! Package inputenc Error: Unicode character ⊗ (U+2297) (inputenc) not set up for use with LaTeX.
I had to add the following line in src/sage/docs/conf.py (after line 407):
\DeclareUnicodeCharacter{2297}{\ensuremath{\otimes}}
to make the pdf documentation build without any error. The pdf output is fine (the tensor product operator is correctly displayed). However, before proceeding, I would like to know whether this is the correct way to fix the pdf doc build.
comment:11 followup: 12 Changed 17 months ago by
There is already a tensor product somewhere, with no pdf problem:
sage: A=algebras.FreeDendriform(QQ,'a')
sage: %display unicode_art
sage: a=A.an_element()
sage: A.coproduct(a)
1 ⨂ B + 2*1 ⨂ B + 2*1 ⨂ B + B ⨂ 1 + 4*B ⨂ B + 2*B ⨂ 1 + 2*B ⨂ 1 a a a a a a a a ╲ ╱ ╲ ╱ a a a a
*EDIT*
git grep "⨂" src/sage
src/sage/categories/signed_tensor.py: unicode_symbol = " ⨂ "
src/sage/categories/tensor.py: unicode_symbol = " ⨂ "
src/sage/combinat/free_module.py: R ⨂ R
*EDIT* : But in fact, it does not appear in pdf doc, because it's only in hidden methods.
comment:12 followup: 17 Changed 17 months ago by
Replying to chapoton:
There is already a tensor product somewhere, with no pdf problem:
*EDIT* : But in fact, it does not appear in pdf doc, because it's only in hidden methods.
Thank you for pointing out the existence of ⨂ in src/sage/categories/tensor.py. It is U+2A02 (Unicode name: n-ary circled times operator), while U+2297 is ⊗ (Unicode name: circled times). In terms of readability, U+2A02 is much better than U+2297, so we should definitely use U+2A02.
comment:13 Changed 17 months ago by
With U+2A02, I had to perform the following change to src/sage/docs/conf.py to get the pdf doc built:
- \DeclareUnicodeCharacter{2A02}{\otimes}
+ \DeclareUnicodeCharacter{2A02}{\ensuremath{\otimes}}
The output pdf looks OK.
comment:14 Changed 17 months ago by
Description:  modified (diff) 

comment:15 Changed 17 months ago by
Description:  modified (diff) 

comment:16 followup: 19 Changed 17 months ago by
You may want to make this change on top of the branch of #31880, which touches the same file, to avoid merge conflicts
comment:17 followup: 18 Changed 17 months ago by
Replying to egourgoulhon:
In terms of readability, U+2A02 is much better than U+2297, so we should definitely use U+2A02.
I am in favor of U+2297, as it is used as a binary operator. The n-ary operator would correspond to \bigotimes in TeX instead. At least with the DejaVu Sans Mono font, the binary operator looks good to me, whereas the n-ary operator U+2A02 is too wide for a monospaced font (which could lead to misaligned unicode art).
Replying to egourgoulhon:
However, before proceeding, I would like to know whether this is the correct way to fix the pdf doc build.
Yes, this is the correct way to add support for unicode characters to the pdf documentation.
Changed 17 months ago by
Attachment:  tensor_product2A02.png added 

Changed 17 months ago by
Attachment:  tensor_product2297.png added 

Changed 17 months ago by
Attachment:  tensor_product_Ubuntu_Mono_13.png added 

comment:18 Changed 17 months ago by
Replying to ghmwageringel:
Replying to egourgoulhon:
In terms of readability, U+2A02 is much better than U+2297, so we should definitely use U+2A02.
I am in favor of U+2297, as it is used as a binary operator. The n-ary operator would correspond to \bigotimes in TeX instead. At least with the DejaVu Sans Mono font, the binary operator looks good to me, whereas the n-ary operator U+2A02 is too wide for a monospaced font (which could lead to misaligned unicode art).
My initial preference for U+2A02 came from the appearance in Sage's console in the Ubuntu terminal (font: Monospace Regular 13):
However, when cutting and pasting to gedit (font: Ubuntu Mono 13), we get:
As you said, U+2A02 appears too large. So, unless someone argues against it, I am going to revert to U+2297. It's also more natural, since U+2297 is the standard symbol for the tensor product as a binary operator, as you pointed out. Thanks for your advice!
Replying to egourgoulhon:
However, before proceeding, I would like to know whether this is the correct way to fix the pdf doc build.
Yes, this is the correct way to add support for unicode characters to the pdf documentation.
Thanks for your answer!
comment:19 Changed 17 months ago by
Replying to mkoeppe:
You may want to make this change on top of the branch of #31880, which touches the same file, to avoid merge conflicts
Thanks for the advice; however, this ticket has no logical connection with #31880 and the merge conflict in src/sage/docs/conf.py will be a trivial one. Moreover, since I am touching almost all files in src/sage/manifolds and src/sage/tensor/modules, there will be merge conflicts with the next beta anyway.
comment:20 Changed 17 months ago by
Authors:  → Eric Gourgoulhon 

Branch:  → public/manifolds/unicode_art 
Commit:  → f6bcc9d7e94ee770679c58d9e06ed0ed6f29f94b 
Here is a preliminary version, which implements ⊗ and ∧. It remains to implement →, ↦ and ∂ (I shall do it tomorrow).
New commits:
33aec3c  Use unicode symbol 2A02 for tensor product on finite rank free modules

f8dac45  Use Unicode symbol 2227 for exterior product on finite rank free modules

4763886  Used Unicode symbol 2297 for tensor product on finite rank free modules

8fb91b4  WIP: Unicode symbols for exterior and tensor products of tensor fields

f6bcc9d  Unicode symbols for exterior and tensor products of tensor fields

comment:21 followups: 22 23 Changed 17 months ago by
Perhaps it might be a good idea to refactor the code in such a way that the symbols can be changed easily? Something like global variables such as wedge_symbol?
This might have the following benefits:
 Unification throughout Sage, i.e. tensor products are used not only for free modules.
 It is easier to change if we are not quite satisfied with the result at a later point.
comment:22 Changed 17 months ago by
Replying to ghmjungmath:
Perhaps it might be a good idea to refactor the code in such a way that the symbols can be changed easily? Something like global variables such as wedge_symbol?
+1
comment:23 Changed 17 months ago by
Replying to ghmjungmath:
Perhaps it might be a good idea to refactor the code in such a way that the symbols can be changed easily? Something like global variables such as wedge_symbol?
I don't think it's worth cluttering Sage's global variables for such a thing. There aren't actually that many Unicode options. For instance, for the tensor product of tensor fields, there are two of them, U+2297 and U+2A02, and the discussion in comment:17 and comment:18 shows that U+2A02 is not appropriate. Moreover, it is quite easy to spot where the symbols are implemented in Sage's code: from src/sage,
git grep 'u2297'
yields
tensor/modules/free_module_tensor.py: basis_term_txt = "\u2297".join(bases_txt)
tensor/modules/free_module_tensor.py: result._name = format_mul_txt(self._name, '\u2297', other._name)
By the way, this is the reason why I used basis_term_txt = "\u2297".join(bases_txt) instead of basis_term_txt = "⊗".join(bases_txt), which would have been equivalent in terms of output.
In any case, I would vote for the discussion about global variables to be deferred to another ticket. To minimize merge conflicts, it would be helpful to have the current ticket merged not too late.
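The equivalence between the escape sequence and the literal character can be checked in plain Python. This is only an illustration; the list bases_txt below is made up and is not the value used in free_module_tensor.py.

```python
# A made-up list of basis element names, for illustration only.
bases_txt = ["e_0", "e^1"]

escaped = "\u2297".join(bases_txt)  # escape sequence, easy to find with git grep
literal = "⊗".join(bases_txt)       # literal character, same string at run time

print(escaped)
assert escaped == literal
assert len("\u2297") == 1 and ord("⊗") == 0x2297
```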
comment:24 followup: 35 Changed 17 months ago by
Maybe add a comment? And use single quotes both times?
- basis_term_txt = "\u2297".join(bases_txt)
+ # Unicode character '\u2297' is '⊗'; see ticket #30473
+ basis_term_txt = '\u2297'.join(bases_txt)

+ # Unicode character '\u2297' is '⊗'; see ticket #30473
  result._name = format_mul_txt(self._name, '\u2297', other._name)
comment:25 followup: 31 Changed 17 months ago by
I just noticed that all three display methods basically contain the same code. I think some common parts should be factored out, for example into sage.tensor.format_utilities.
This might also be useful in view of mixed forms, which have their own display method as well.
comment:26 followup: 27 Changed 17 months ago by
@Eric: Could you change the wedge symbol in mixed_form.py as well, please?
comment:27 followup: 29 Changed 17 months ago by
Replying to ghmjungmath:
@Eric: Could you change the wedge symbol in mixed_form.py as well, please?
This is already done, see https://git.sagemath.org/sage.git/diff/src/sage/manifolds/differentiable/mixed_form.py?id=f6bcc9d7e94ee770679c58d9e06ed0ed6f29f94b&id2=a60179ab6b642246ee54120e43fdf9663afe5638
comment:28 followup: 30 Changed 17 months ago by
I wasn't thinking of adding this to the global namespace, but having a file where people can import the symbols to use in their string outputs.
comment:29 Changed 17 months ago by
Replying to egourgoulhon:
Replying to ghmjungmath:
@Eric: Could you change the wedge symbol in mixed_form.py as well, please?
This is already done, see https://git.sagemath.org/sage.git/diff/src/sage/manifolds/differentiable/mixed_form.py?id=f6bcc9d7e94ee770679c58d9e06ed0ed6f29f94b&id2=a60179ab6b642246ee54120e43fdf9663afe5638
Ah perfect. Sorry, haven't noticed. Thanks! :)
comment:30 followup: 38 Changed 17 months ago by
Replying to tscrim:
I wasn't thinking of adding this to the global namespace, but having a file where people can import the symbols to use in their string outputs.
Yes, that's what I understood, but IMHO this implies some discussion:
 where to put this file?
 shall we homogenize between various parts of Sage? (for instance, as pointed out in comment:11, tensor products of modules use U+2A02 (\bigotimes), while here we are using U+2297 (\otimes) for tensor products of elements)
 how to name the Python variables? (I would suggest a naming convention based on the LaTeX names of the symbols, e.g. something like unicode_otimes, unicode_mapsto, etc.)
 how to articulate this file with src/sage/docs/conf.py? (i.e. ensure that each time a new variable is added to this file, it will be taken into account in conf.py)
For this reason, I would prefer this to be done in another ticket.
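One possible way to address the last point is to generate the LaTeX declarations from a single table. The sketch below is hypothetical: the table name UNICODE_TO_LATEX, the helper latex_declarations and the chosen LaTeX macros are assumptions, not code from Sage; only the \DeclareUnicodeCharacter{...}{\ensuremath{...}} form is taken from the discussion in this ticket. In a Sphinx conf.py, the resulting text could be appended to the 'preamble' entry of latex_elements.

```python
# Hypothetical table: Unicode character -> LaTeX macro (math mode).
UNICODE_TO_LATEX = {
    "\u2297": r"\otimes",
    "\u2227": r"\wedge",
    "\u2192": r"\rightarrow",
    "\u21a6": r"\mapsto",
    "\u2202": r"\partial",
}


def latex_declarations(table):
    """Build one \\DeclareUnicodeCharacter line per entry of the table."""
    lines = []
    for char, macro in table.items():
        # inputenc expects the code point as hexadecimal digits.
        lines.append(
            r"\DeclareUnicodeCharacter{%04X}{\ensuremath{%s}}" % (ord(char), macro)
        )
    return "\n".join(lines)


print(latex_declarations(UNICODE_TO_LATEX))
```

Keeping the table next to the Python identifiers would mean that a newly added character is declared for the pdf build without a separate manual step.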
comment:31 Changed 17 months ago by
Replying to ghmjungmath:
I just noticed that all three display methods basically contain the same code. I think some common parts should be factored out, for example into sage.tensor.format_utilities.
Yes certainly, but in another ticket.
comment:32 Changed 17 months ago by
Commit:  f6bcc9d7e94ee770679c58d9e06ed0ed6f29f94b → afa0e72feba703b762508c907795a6d787ac5b00 

Branch pushed to git repo; I updated commit sha1. New commits:
7e91c14  #30473: Add some comments about unicode symbols

9091e7d  Unicode symbol 21A6 for text display of continuous maps and chart functions

db2f163  Unicode symbol 2192 for text display of continuous maps and scalar fields

afa0e72  Unicode symbols 211D and 2102 for text display of the codomains of scalar fields

comment:33 Changed 17 months ago by
Here is a new provisional version, implementing → and ↦, as well as ℝ and ℂ for the codomains of scalar fields on real and complex manifolds, respectively.
It remains to implement ∂ for coordinate frames.
comment:34 Changed 17 months ago by
Description:  modified (diff) 

comment:35 Changed 17 months ago by
Replying to slelievre:
Maybe add a comment? And use single quotes both times?
- basis_term_txt = "\u2297".join(bases_txt)
+ # Unicode character '\u2297' is '⊗'; see ticket #30473
+ basis_term_txt = '\u2297'.join(bases_txt)
Thanks for the suggestion. This is done in the latest version (comment:32).
comment:36 Changed 17 months ago by
Description:  modified (diff) 

comment:37 Changed 17 months ago by
Commit:  afa0e72feba703b762508c907795a6d787ac5b00 → 4e13ffa05615e91d6b266660481516c72c39d25e 

Branch pushed to git repo; I updated commit sha1. New commits:
4e13ffa  Unicode symbols defined in new file src/sage/typeset/unicode_characters.py

comment:38 Changed 17 months ago by
Replying to egourgoulhon:
Replying to tscrim:
I wasn't thinking of adding this to the global namespace, but having a file where people can import the symbols to use in their string outputs.
Yes, that's what I understood, but IMHO this implies some discussion [...]
For this reason, I would prefer this to be done in another ticket.
Having slept on it, I've finally included this in the current ticket ;)
I've put the file defining Python identifiers for Unicode characters in src/sage/typeset and named it unicode_characters.py, cf.
https://git.sagemath.org/sage.git/tree/src/sage/typeset/unicode_characters.py?id=4e13ffa05615e91d6b266660481516c72c39d25e
I've also included it in the reference manual, in the section "Programming > Utilities > Formatted Output > Unicode characters".
Regarding the last point of comment:30, there is no (automatic) articulation with src/sage/docs/conf.py yet.
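As an illustration of how such a module of named characters can be used, here is a minimal sketch. The names unicode_otimes, unicode_bigotimes and unicode_mapsto follow the convention discussed in this ticket, but the definitions and the two helper functions below are assumptions and do not reproduce the actual contents of unicode_characters.py.

```python
# Named characters, following the unicode_<LaTeX name> convention.
unicode_otimes = "\u2297"     # ⊗ binary tensor product
unicode_bigotimes = "\u2a02"  # ⨂ n-ary tensor product
unicode_mapsto = "\u21a6"     # ↦ maps to


def tensor_name(*factors):
    """Join factor names with the binary tensor product symbol."""
    return unicode_otimes.join(factors)


def map_rule(args, value):
    """Format 'args ↦ value', as in the text display of a map."""
    return f"{args} {unicode_mapsto} {value}"


print(tensor_name("v", "df"))
print(map_rule("(x, y)", "x^2 + y^2"))
```

Code that builds display strings then imports a name instead of repeating a code point, so a later change of symbol touches a single definition.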
comment:39 Changed 17 months ago by
Commit:  4e13ffa05615e91d6b266660481516c72c39d25e → da8893f5c68353345ba8a18a07d371705084a9e7 

Branch pushed to git repo; I updated commit sha1. New commits:
da8893f  Use sage.typeset.unicode_characters in TensorProductFunctor and SignedTensorProductFunctor

comment:40 Changed 17 months ago by
In the above commit, I used the new file unicode_characters.py to deal with the symbol ⨂ (bigotimes) pointed out in comment:11. Frédéric, Travis, do you agree?
comment:41 followup: 43 Changed 17 months ago by
Have you considered formatted strings instead of additions?
- unicode_symbol = " " + unicode_bigotimes + " "
+ unicode_symbol = f" {unicode_bigotimes} "
comment:42 Changed 17 months ago by
Commit:  da8893f5c68353345ba8a18a07d371705084a9e7 → 5167e6cfc4130eb88b78ddf9eb5e130c089f39f9 

Branch pushed to git repo; I updated commit sha1. New commits:
2a23cb5  Unicode symbol 2202 (partial) for the text display of coordinate frames

5d096f1  fstring for unicode_symbol in TensorProductFunctor and SignedTensorProductFunctor

76c2fd5  Use Unicode symbol for the Riemann sphere example

5167e6c  Use Unicode symbol for default text display of RealLine

comment:43 Changed 17 months ago by
Replying to slelievre:
Have you considered formatted strings instead of additions?
- unicode_symbol = " " + unicode_bigotimes + " "
+ unicode_symbol = f" {unicode_bigotimes} "
Thanks for the tip; this is done in the latest version.
comment:44 Changed 17 months ago by
Status:  new → needs_review 

I've added ∂/∂... for the vector fields of coordinate frames and ℝ for the real line. So I think this is ready for review now.
comment:45 followup: 46 Changed 17 months ago by
Shouldn't unicode output only be used if %display unicode_art is set? It is not set by default, which means that it would not be necessary to touch all the doctests in this ticket.
As unicode output requires the presence of suitable fonts, I think it might be best if it stays optional.
comment:46 followup: 48 Changed 17 months ago by
Replying to ghmwageringel:
Shouldn't unicode output only be used if %display unicode_art is set? It is not set by default, which means that it would not be necessary to touch all the doctests in this ticket.
No, I don't think so. "Unicode art", as an extension of "ASCII art", means to use multiline character art. Merely using non-ASCII characters does not make something unicode art.
comment:47 Changed 17 months ago by
Summary:  Unicode art for sage.manifolds → Unicode operators for sage.manifolds 

(The title of the ticket should probably be changed.)
comment:48 followup: 50 Changed 17 months ago by
Replying to mkoeppe:
No, I don't think so. "Unicode art", as an extension of "ASCII art", means to use multiline character art. Merely using non-ASCII characters does not make something unicode art.
Ok, thanks for clarifying.
comment:49 Changed 17 months ago by
Description:  modified (diff) 

comment:50 Changed 17 months ago by
Replying to ghmwageringel:
Replying to mkoeppe:
No, I don't think so. "Unicode art", as an extension of "ASCII art", means to use multiline character art. Merely using non-ASCII characters does not make something unicode art.
Ok, thanks for clarifying.
I've updated the ticket description to show the purpose of this ticket via some examples.
comment:51 Changed 17 months ago by
Description:  modified (diff) 

comment:52 followup: 55 Changed 17 months ago by
I think in *TensorProductFunctor, we should use U+2297 (i.e., unicode_otimes), since it is a binary operator.
comment:53 Changed 17 months ago by
Replying to ghmwageringel:
Replying to ghmjungmath:
Is the preparser also capable of something like
type "\otimes" -> press TAB -> unicode character pops up
similarly to Greek letters in Py3 right now?
That is an IPython feature, which does not seem to be implemented for operators like \otimes. The preparser is not capable of this.
The IPython feature could be hooked into. The relevant dictionaries are IPython.core.latex_symbols.latex_symbols and IPython.core.latex_symbols.reverse_latex_symbols. You could install new translations by updating those dictionaries. It would probably be worth trialing it a bit to see whether there are unwanted side effects (the dictionary may be used for other purposes; the source mentions that it is a list borrowed from Julia, with entries removed that do not yield valid Python identifiers).
comment:54 Changed 17 months ago by
Commit:  5167e6cfc4130eb88b78ddf9eb5e130c089f39f9 → 332410b486ea3e180c073042980947d820824674 

Branch pushed to git repo; I updated commit sha1. New commits:
332410b  Use unicode_otimes in TensorProductFunctor and SignedTensorProductFunctor

comment:55 Changed 17 months ago by
Replying to tscrim:
I think in *TensorProductFunctor, we should use U+2297 (i.e., unicode_otimes), since it is a binary operator.
Done in the latest commit. Incidentally, note that this is doctested by only one test, in src/sage/combinat/free_module.py, and never shows up in the reference manual, as pointed out in comment:11.
comment:56 Changed 17 months ago by
What's wrong with the patchbot? I'm looking forward to a positive review due to #30272.
comment:58 followup: 60 Changed 17 months ago by
Status:  needs_review → needs_work 

sage -t --random-seed=0 src/sage/manifolds/differentiable/diff_map.py
**********************************************************************
File "src/sage/manifolds/differentiable/diff_map.py", line 933, in sage.manifolds.differentiable.diff_map.DiffMap.pullback
Failed example:
    gM.display()
Expected:
    (2*cos(t) + 2) dt*dt
Got:
    (2*cos(t) + 2) dt⊗dt
comment:59 Changed 17 months ago by
Commit:  332410b486ea3e180c073042980947d820824674 → d87d09b7e101ac462c04372b2f8cd691cdb774a1 

comment:60 Changed 17 months ago by
Status:  needs_work → needs_review 

Replying to mkoeppe:
sage -t --random-seed=0 src/sage/manifolds/differentiable/diff_map.py
**********************************************************************
File "src/sage/manifolds/differentiable/diff_map.py", line 933, in sage.manifolds.differentiable.diff_map.DiffMap.pullback
Failed example:
    gM.display()
Expected:
    (2*cos(t) + 2) dt*dt
Got:
    (2*cos(t) + 2) dt⊗dt
Thanks for pointing this out (this was due to the merge of #31904 in 9.4.beta4).
comment:61 followup: 62 Changed 17 months ago by
Reviewers:  → Matthias Koeppe 

Status:  needs_review → positive_review 
comment:63 followup: 65 Changed 17 months ago by
Status:  positive_review → needs_work 

I'm getting a lot of test failures of the form
**********************************************************************
File "src/sage/calculus/functional.py", line 145, in sage.calculus.functional.derivative
Failed example:
    derivative(a).display()
Expected:
    da = 2 dx/\dy
Got:
    da = 2 dx∧dy
**********************************************************************
Can you please do a whole make ptestlong?
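Failures of this kind come from expected outputs that still contain the old ASCII operators. As a rough illustration, the hypothetical helper below rewrites some of them; it is not a tool from Sage. The /\ replacement matches the failure reported in this comment, while the --> and |--> forms are assumed to be the old ASCII arrows. The * operator is deliberately left alone, since in these outputs it also denotes ordinary multiplication.

```python
# Ordered so that the longer arrow "|-->" is handled before "-->".
ASCII_TO_UNICODE = [
    ("/\\", "\u2227"),   # wedge product
    ("|-->", "\u21a6"),  # maps to (assumed old ASCII form)
    ("-->", "\u2192"),   # arrow (assumed old ASCII form)
]


def to_unicode_operators(text):
    """Replace the ASCII operators listed above by Unicode characters."""
    for old, new in ASCII_TO_UNICODE:
        text = text.replace(old, new)
    return text


print(to_unicode_operators("da = 2 dx/\\dy"))
```

Such a blind textual replacement would still need a manual review of each doctest, which is why running the full test suite remains the real check.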
comment:64 Changed 17 months ago by
Commit:  d87d09b7e101ac462c04372b2f8cd691cdb774a1 → f2ae50e68687b59af27e35a4ec02b0f667ea828c 

Branch pushed to git repo; I updated commit sha1. New commits:
f2ae50e  #30473: fix doctests outside sage/manifolds and sage/tensor/modules

comment:65 Changed 17 months ago by
Status:  needs_work → needs_review 

Replying to vbraun:
I'm getting a lot of test failures of the form [...] can you please do a whole make ptestlong
Thanks for pointing this out. This is fixed in the latest commit.
comment:66 Changed 17 months ago by
Status:  needs_review → positive_review 

comment:67 Changed 17 months ago by
Branch:  public/manifolds/unicode_art → f2ae50e68687b59af27e35a4ec02b0f667ea828c 

Resolution:  → fixed 
Status:  positive_review → closed 
Sounds like a good idea!