Opened 11 years ago
Closed 3 years ago
#5415 closed defect (fixed)
problems with multifactorial?
Reported by:  cwitty  Owned by:  iconjack 

Priority:  major  Milestone:  sage8.0 
Component:  basic arithmetic  Keywords:  
Cc:  Merged in:  
Authors:  Jack Kennedy  Reviewers:  Vincent Delecroix, Frédéric Chapoton 
Report Upstream:  N/A  Work issues:  
Branch:  0204235 (Commits)  Commit:  020423572f0b240cd857b26fc7b750080ca3ed39 
Dependencies:  Stopgaps:  todo 
Description (last modified by )
The multifactorial method on integers is different than the one at http://mathworld.wolfram.com/Multifactorial.html and http://www.research.att.com/~njas/sequences/A007661; unless there are multiple competing definitions of multifactorial, this is a bug.
(The references give, for example, (5).multifactorial(3) == 10, whereas Sage currently returns 5.)
There are further problems: k is assumed to be a Cint. Hence,
sage: a=2^64 sage: a.multifactorial(a)
doesn't work, whereas the appropriate value, a, would be no problem to represent.
How badly do people need these numbers? for which ranges should the computation be optimized? The current implementation can easily be adjusted to give answers that agree with the definition elsewhere, fairly efficiently, for ranges where k fits easily in a "long".
Indeed, the current implementation refuses to compute the multifactorial if n itself doesn't fit in a long. The claim in the current error message, "This is probably OK, since the answer would have billions of digits." is not quite true: if the number of digits is about (n/k) times the number of digits in n, so as long as n/k isn't too large, there isn't really an issue. It may well be that applications of multifactorial never venture in those ranges, though.
Change History (45)
comment:1 Changed 11 years ago by
 Milestone changed from sage3.4 to sage3.4.1
comment:2 Changed 11 years ago by
I think the problem is that not enough factors are included. Actually, there are two problems: in the base case, it should return n, not 1; that is, make this change:
# base case if 0 < n < k:  return ONE + return n
After making this change, I'm still getting the wrong answers for a.multifactorial(3)
whenever a is congruent to 2 mod 3 (except for a=2), and for a.multifactorial(4)
whenever a is congruent to 2 or 3 mod 4 (except for a=2,3). It seems that not enough factors are used; for example, 10.multifactorial(4) should be 10 x 6 x 2 = 120, but Sage computes it as 10 x 6 = 60.
If we fix this, we can put in doctests like the following:
sage: L = sloane_sequence(6882)[2] # optional  internet Searching Sloane's online database... sage: all([Integer(a).multifactorial(2) == L[a] for a in range(1,20)]) # optional  internet True
comment:3 Changed 11 years ago by
A careful look at Sloane's functions reveals that the base case is indeed =1, according to his definitions. I'm not sure if this is standard, but Mathworld doesn't seem to talk about those cases carefully. On the other hand, Wikipedia http://en.wikipedia.org/wiki/Factorial#Multifactorials suggests that this is somewhat standard.
comment:4 Changed 11 years ago by
 Cc robertwb added
Or an even more careful look reveals that Sloane starts at n=0  my bad. So there is an inconsistency in the definitions. Sloane's sequences agree with jhpalmieri, while Sage agrees with Wikipedia.
I'm ccing: the author of multifactorial in Sage, who will hopefully weigh in on what definition is okay.
comment:5 followup: ↓ 8 Changed 11 years ago by
I needed the double factorial for something (I can't even remember what now) so I just wrote this. It sounds like there's several competing definitions, but wikipedia is not necessarily the most authoritative.
I would ask on sagecombinat what the "right" definition is, they're more likely to know.
comment:6 Changed 11 years ago by
Can you do that? I'm not subscribed to them. Thanks.
comment:7 Changed 11 years ago by
I would certainly want (5).multifactorial(3) to be 10 and not 5.
comment:8 in reply to: ↑ 5 Changed 11 years ago by
Replying to robertwb:
I needed the double factorial for something (I can't even remember what now) so I just wrote this. It sounds like there's several competing definitions, but wikipedia is not necessarily the most authoritative.
(I think it was for making sure gamma(3/2) etc. were correct.)
I would ask on sagecombinat what the "right" definition is, they're more likely to know.
But one should definitely document that there isn't a universally agreedupon definition.
Can you indicate what one would change in integer.pyx? Changing things that seem right yield horrible allocation errors.
comment:9 Changed 7 years ago by
 Milestone changed from sage5.11 to sage5.12
comment:10 Changed 6 years ago by
 Milestone changed from sage6.1 to sage6.2
comment:11 Changed 6 years ago by
 Milestone changed from sage6.2 to sage6.3
comment:12 Changed 6 years ago by
 Milestone changed from sage6.3 to sage6.4
comment:13 Changed 5 years ago by
 Report Upstream set to N/A
 Stopgaps set to todo
comment:14 followup: ↓ 15 Changed 4 years ago by
I have tried to redefine multifactorial function. Please review. https://github.com/sagemath/sage/pull/52
comment:15 in reply to: ↑ 14 Changed 4 years ago by
Replying to prateek.cs14:
I have tried to redefine multifactorial function. Please review. https://github.com/sagemath/sage/pull/52
A few notes:
 Python is notoriously (and if you believe some of Guido's commentary basicaly intentionally so) bad at recursion, so you should avoid it unless your really need it. Here you don't (and the current implementation carefully avoids deep recursion)
 The current implementation does quite a bit of balancing of factors to ensure good multiplication performance. From what I see on your github branch, you've tacked on a new recursion case to what is called "reflection" there, essentially making the main body (everything below it) unreachable. That code looks like it's pretty carefully written. If you want to throw it out you should do so (and not just leave it in there as unreachable code) but then you should back up your proposal with some serious timings that show your code performs better under all circumstances.
 Whenever you make a change like this you should adjust the documentation as well, ensuring that the new behaviour is reflected in the documentation and is properly tested (probably adding some new tests that differentiate between old and new behaviour.
 Github presently really only is a mirror of the sage repository. It isn't really used for development. So before a change can be considered for inclusion, your change should be uploaded as a git branch here (using the
gittrac
command, probably).
I suspect that the appropriate change to make is in line 3967:
 for i from 1 <= i <= n//k: + for i from 0 <= i <= n//k:
but I didn't check in detail.
comment:16 Changed 4 years ago by
 Description modified (diff)
 Milestone changed from sage6.4 to sage6.10
 Status changed from new to needs_info
 Summary changed from wrong definition of multifactorial? to problems with multifactorial?
comment:17 followup: ↓ 18 Changed 4 years ago by
Thanks nbruin. Please review the changes made. https://github.com/prateekcs14/sage/commit/2cb944378c97bdad76a053e37d579d492f68d44c
comment:18 in reply to: ↑ 17 Changed 4 years ago by
Replying to prateek.cs14:
Thanks nbruin. Please review the changes made. https://github.com/prateekcs14/sage/commit/2cb944378c97bdad76a053e37d579d492f68d44c
Currently, the code is nicely interruptable with CTRLC if a particularly long computation was being done. Does your code have that property? (you stripped out the sig_on/sig_off
. On the other hand you revert to using (python?) integers instead of using gmp directly, so perhaps interrupts get enabled in that code)
Currently, the code is taking efforts to balance the size of the factors it was multiplying. This is a wellknown technique to improve performance, because it reduces the number of multiplications where a particularly big number is involved (currently there is a 32/64 bit bug in that code, by the way) You strip that out. Do you have data to confirm this is not a problem?
You may want to try things like
sage: %timeit 10000001.multifactorial(2) 1 loops, best of 3: 8.47 s per loop
and possibly with larger numbers too. You should compare the current implementation with the new one.
comment:19 Changed 4 years ago by
Please review the changes made.I increased the sub_products by 1 and set i in k*i+residue from 0. https://github.com/sagemath/sage/commit/8447058f8bb7f017bdefd7086300c4fc3b46ff21
comment:20 Changed 4 years ago by
 Branch set to u/nbruin/problems_with_multifactorial_
comment:21 Changed 4 years ago by
 Commit set to 098b62d0331feb8a92156695ca16c6ac305f9d73
 Owner changed from robertwb to prateek.cs14
comment:22 Changed 4 years ago by
 Owner changed from prateek.cs14 to (none)
comment:23 Changed 4 years ago by
 Branch changed from u/nbruin/problems_with_multifactorial_ to u/prateek.cs14/problems_with_multifactorial_
comment:24 Changed 4 years ago by
 Commit changed from 098b62d0331feb8a92156695ca16c6ac305f9d73 to 68eb8e801c53facebb83482138937cdb066c1546
comment:25 Changed 4 years ago by
 Status changed from needs_info to needs_work
Congratulations on getting a branch attached. Todo list:
 make sure documentation is changed to reflect new behaviour and include doctests that confirm it
 clean up your branch: as you can see in the diffs:
52 files changed, 7057 insertions, 5 deletions
. That's due to your "Merge" commit.
Since the intended change only touches 4 lines, it's probably easier to start with a fresh copy of current "develop", make the edits. and commit that. Then you need no merging at all.
comment:26 Changed 4 years ago by
 Commit changed from 68eb8e801c53facebb83482138937cdb066c1546 to 173f0eaf83f3f894bcbd98b381f891c668da2030
comment:27 Changed 4 years ago by
Your indentation is not really working well in your doctests  follow the developer guide and other code. (Ask if you need help!) I also agree with Nils' comments.
comment:28 Changed 4 years ago by
Also note that, as was remarked previously, the original behaviour of multifactorial was consistent with the description in its documentation. So, by changing the behaviour but not the documentation you are basically introducing a bug. Ideally you'd include a reference to a literature source to document which convention you are following, along with a description of what the routine in sage does.
comment:29 Changed 4 years ago by
Can I change the definition to
" Computes the kth factorial n!^{(k)}
of self. For k=1
this is the standard factorial, and for k greater than one it is the product of every kth terms down from self to the smallest possible positive integer. The recursive definition is used to extend this function to the negative integers."
Is this look fine ?
comment:30 Changed 4 years ago by
Some people have complained about using "self" in the docstring, and they have a point: in the call
10.multifactorial(3)
there is never any mention of "self". I think it can be avoided here. Your description in words is pretty good, but requires careful reading to pry out the base cases. That might be difficult for nonnative speakers. Perhaps
Returns the kth multifactorial. The kth multifactorial n, denoted by n!^{(k)}, as implemented in Sage is defined by n!^{(k)} = n for 1 <= n < k and n!^{(k)} = n * ( (nk)^{(k)} for n >= k The recursive definition is used to extend this function to the negative integers.
You'd have to figure out how to ensure that the formulas are rendered acceptably in all output formats of the sage documentation, though.
That said, your proposal is in the style of the current docstring, so I don't think a positive review would be held back by it.
Note that one of the doctests illustrates the behaviour:
sage: 23.multifactorial(2) 316234143225 sage: prod([1..23, step=2]) 316234143225
It would be very instructive if you would change the parameters there to a case that distinguishes between the old and the new behaviour.
comment:31 Changed 4 years ago by
Looks like this ticket has been abandoned again.
comment:32 Changed 4 years ago by
 Branch u/prateek.cs14/problems_with_multifactorial_ deleted
 Commit 173f0eaf83f3f894bcbd98b381f891c668da2030 deleted
 Milestone changed from sage6.10 to sage7.4
 Owner changed from (none) to iconjack
comment:33 Changed 4 years ago by
 Branch set to tbd
 Status changed from needs_work to needs_review
The multifactorial implementation is optimized to limit large multiplications by computing e.g. ((a*b)*(c*d))*((e*f)*(g*h)) instead of (((((((a*b)*c)*d)*e)*f)*g)*h). It's a little tricky, using bit patterns of the index to fold the multiplications, rather than doing it recursively. As others noted, there was an offbyone error (two actually), which caused 5.factorial(3) to eval to 5 rather than 10.
A different bug was that for example 11.multifactorial(17) was returning 1 instead of 11.
WStein suggested using the alreadyexisting prod function that already implements this balanced multiplication. However, prod is more general, not just integers. The current multifactorial uses GMP's integer functions, and may be faster.
In the long run, I think that multifactorial, prod, balanced_list_prod, and interator_prod should be refactored, but since this is my first commit, I've decided to simply fix the bugs addressed in this ticket. There's another bug, related to noninteger values being passed in, but I'll make another ticket for that.
Also added a test case.
comment:34 Changed 4 years ago by
 Branch changed from tbd to /u/iconjack/5415multifactorial
comment:35 Changed 4 years ago by
 Branch changed from /u/iconjack/5415multifactorial to u/iconjack/5415multifactorial
comment:36 Changed 4 years ago by
It seems your branch does not exist. Don't you use gittrac as in the documentation?
comment:37 Changed 4 years ago by
 Commit set to 502452f75850f2c4f72fb0b25b7a6a84e4e33110
Branch pushed to git repo; I updated commit sha1. New commits:
502452f  Trac #5415: problems with multifactorial?

comment:38 followup: ↓ 39 Changed 4 years ago by
Do you know why residue
is not declared as cdef int
?
comment:39 in reply to: ↑ 38 Changed 4 years ago by
Replying to vdelecroix:
Do you know why
residue
is not declared ascdef int
?
No, I didn't change that line. I'll change it to cdef int and test but for now I'd like to get this change reviewed and committed.
comment:40 Changed 4 years ago by
 Reviewers set to Vincent Delecroix
 Status changed from needs_review to needs_work
Your changes look good. Two small things
 add your name in the "Authors" field of the ticket
 Add the following doctest in a
TESTS
blocksage: for a in range(1,20): ....: for b in range(1,20): ....: assert ZZ(a).multifactorial(b) == prod(x for x in range(a,0,b))
comment:41 Changed 4 years ago by
comment:42 Changed 3 years ago by
 Branch changed from u/iconjack/5415multifactorial to public/5415
 Commit changed from 502452f75850f2c4f72fb0b25b7a6a84e4e33110 to 020423572f0b240cd857b26fc7b750080ca3ed39
 Status changed from needs_work to needs_review
comment:43 Changed 3 years ago by
 Cc robertwb removed
 Milestone changed from sage7.4 to sage8.0
comment:44 Changed 3 years ago by
 Reviewers changed from Vincent Delecroix to Vincent Delecroix, Frédéric Chapoton
 Status changed from needs_review to positive_review
ok, this passes all tests. I am setting to positive.
comment:45 Changed 3 years ago by
 Branch changed from public/5415 to 020423572f0b240cd857b26fc7b750080ca3ed39
 Resolution set to fixed
 Status changed from positive_review to closed
Better luck in Sage 3.4.1.
Cheers,
Michael