Opened 12 years ago
Closed 12 years ago
#8127 closed enhancement (fixed)
Wraps string features into WordDatatypes
Reported by: | vdelecroix | Owned by: | vdelecroix |
---|---|---|---|
Priority: | major | Milestone: | sage-4.3.3 |
Component: | combinatorics | Keywords: | string, word |
Cc: | slabbe, sage-combinat | Merged in: | sage-4.3.3.alpha0 |
Authors: | Vincent Delecroix | Reviewers: | Sébastien Labbé |
Report Upstream: | N/A | Work issues: | |
Branch: | Commit: | ||
Dependencies: | Stopgaps: |
Description
Python has low-level operations for strings and we should allow them for words. For example
sage: sage: s = "ma maman est magique" sage: s.split(' ') ['ma', 'maman', 'est', 'magique'] sage: s.split('ma') ['', ' ', '', 'n est ', 'gique']
The patch implements split and partition for words
sage: w = Word("ma maman est magique") sage: w.split(' ') [word: ma, word: maman, word: est, word: magique] sage: w.split('ma') [word: , word: , word: , word: n est , word: gique]
Attachments (2)
Change History (10)
comment:1 Changed 12 years ago by
comment:2 Changed 12 years ago by
- Status changed from new to needs_review
comment:3 Changed 12 years ago by
- Cc saliola added; fsaliola removed
- Reviewers set to Sébastien Labbé
- Status changed from needs_review to needs_work
I just tested the patch. All test passed. Not tested the sphinx docbuild because the affected file is not part of the doc (maybe it should?). But I wonder if no blank line before the INPUT: line could raise a error in the docbuild process. I think the patch needs work for the reasons explained below.
The string_rep()
output depends on the current state of the truncate length :
sage: u = Word('abcdbruitbruit01234...ababbruitbruit01234...abababab') sage: u.string_rep() 'abcdbruitbruit01234...ababbruitbruit0123...'
which leads to the following bug :
sage: w = Word(range(20)) sage: u.split(w) [word: abcdbruitbruit01234...ababbruitbruit0123...] sage: WordOptions(truncate_length=5) sage: u.split(w) [word: abcdb..., word: ababb..., word: ababa...]
To correct this, I suggest that the function split
be supported only for str and WordDatatype_str
object.
Same suggestion for partition function:
sage: u = Word('abcdbruitbruit01234...ababbruitbruit01234...abababab') sage: w = Word(range(20)) sage: u.partition(w) [word: abcdbruitbruit01234...ababbruitbruit0123..., word: , word: ] sage: WordOptions(truncate_length=5) sage: u.partition(w) [word: abcdb..., word: 01234..., word: ababb...]
I think split should follow the behavior of the python split for str, i.e works for no input given :
sage: s = 'absdg asdfas asdfa' sage: s.split() ['absdg', 'asdfas', 'asdfa'] sage: s.split(' ') ['absdg', 'asdfas', 'asdfa'] sage: s.split(None) ['absdg', 'asdfas', 'asdfa'] sage: w = Word(s) sage: w.split() Traceback (most recent call last): ... AttributeError: 'NoneType' object has no attribute 'string_rep' sage: w.split(' ') [word: absdg, word: asdfas, word: asdfa] sage: w.split(None) Traceback (most recent call last): ... AttributeError: 'NoneType' object has no attribute 'string_rep'
The INPUT block of split
should mention the part (optional, default: None)
for both arguments.
For both functions, I suggest to move the ..notes
block after the OUTPUT block.
comment:4 Changed 12 years ago by
- Component changed from algebra to combinatorics
comment:5 Changed 12 years ago by
- Cc sage-combinat added; saliola removed
- Status changed from needs_work to needs_review
This now needed to have a string (or a WordDatatype_str) as separator. A ValueError? is raised when it's not the case.
comment:6 Changed 12 years ago by
I just added a patch which changes split function to follow python behavior for when no arguments are given and changed some doc mise en forme.
All tests passed. The above problem is solve. Sphinx builds fine on both functions. I am giving a positive review to Vincent's patch.
Vincent, if you agree with my patch, you can change the status of this ticket for positive review.
comment:7 Changed 12 years ago by
- Status changed from needs_review to positive_review
It's all right ;-)
comment:8 Changed 12 years ago by
- Merged in set to sage-4.3.3.alpha0
- Resolution set to fixed
- Status changed from positive_review to closed
With the patch applied, I obtain some doctest failures :