Opened 6 years ago

Closed 6 years ago

#19296 closed enhancement (fixed)

Generating function in FindStat interface

Reported by: stumpc5 Owned by:
Priority: major Milestone: sage-6.10
Component: packages: optional Keywords: FindStat, OEIS
Cc: mantepse Merged in:
Authors: Christian Stump, Martin Rubey Reviewers: Martin Rubey, Christian Stump
Report Upstream: N/A Work issues: document
Branch: 15ec356 (Commits, GitHub, GitLab) Commit: 15ec3561001bbb9f6372e4d245959f6a2adb470a
Dependencies: Stopgaps:

Status badges

Description (last modified by mantepse)

This ticket implements the generating function pull from a FindStat? statistic and its search in the OEIS.

It also fixes some issues recently discovered, in particular #19543

Change History (115)

comment:1 Changed 6 years ago by stumpc5

  • Branch set to u/stumpc5/generating_function_in_findstat_interface

comment:2 Changed 6 years ago by stumpc5

  • Authors set to Christian Stump
  • Cc mantepse added
  • Commit set to eefeaae4deb2df18fb97cc8458090139b9eb2c31
  • Description modified (diff)
  • Keywords FindStat OEIS added
  • Type changed from PLEASE CHANGE to enhancement
  • Work issues set to document

New commits:

eefeaaefirst version for generating function and automated oeis search, still to be improved

comment:3 Changed 6 years ago by git

  • Commit changed from eefeaae4deb2df18fb97cc8458090139b9eb2c31 to 0a69285394e85831ae4de65caa5e10267621001d

Branch pushed to git repo; I updated commit sha1. New commits:

0a69285cosmetic

comment:4 Changed 6 years ago by stumpc5

@mantepse, do I need to add the utf8 encoding in

self._generating_function = self._raw[FINDSTAT_STATISTIC_GENERATING_FUNCTION]

as in some of the other similar lines before?

comment:5 Changed 6 years ago by git

  • Commit changed from 0a69285394e85831ae4de65caa5e10267621001d to b0eda2842a96ea3438d255a369da48f4c83066c9

Branch pushed to git repo; I updated commit sha1. New commits:

b0eda28remved a leftover print command

comment:6 Changed 6 years ago by git

  • Commit changed from b0eda2842a96ea3438d255a369da48f4c83066c9 to 6640df6c8d5e2fbfc4ced4fb8f1ba4545e23fa69

Branch pushed to git repo; I updated commit sha1. New commits:

6640df6added name_plural method

comment:7 Changed 6 years ago by git

  • Commit changed from 6640df6c8d5e2fbfc4ced4fb8f1ba4545e23fa69 to e1b0c89914a9b2cc75d16a5d04f7b68bbc942f61

Branch pushed to git repo; I updated commit sha1. New commits:

e1b0c89added verbose and improved the oeis search

comment:8 Changed 6 years ago by git

  • Commit changed from e1b0c89914a9b2cc75d16a5d04f7b68bbc942f61 to 9926f7f0431ccac0fc6b5fe4947422c3984ff65c

Branch pushed to git repo; I updated commit sha1. New commits:

9926f7fadded max length to oeis search

comment:9 Changed 6 years ago by git

  • Commit changed from 9926f7f0431ccac0fc6b5fe4947422c3984ff65c to 83d5332fac33521569e7882e4ee17aded984d559

Branch pushed to git repo; I updated commit sha1. New commits:

83d5332improved max length to oeis search

comment:10 Changed 6 years ago by git

  • Commit changed from 83d5332fac33521569e7882e4ee17aded984d559 to f544c28f4b0c0c4707c017f60d45a62b58501c76

Branch pushed to git repo; I updated commit sha1. New commits:

ea11793added list of coefficient option to generating functions
595e289added some documentation
f544c28added name to json and pull it from there

comment:11 follow-up: Changed 6 years ago by mantepse

I think it would be better to have the commits added name to json and pull it from there and added name_plural method in a separate ticket. Actually, I even have a bug to fix...

comment:12 follow-up: Changed 6 years ago by stumpc5

@mantepse If I now push the new collections.json without the CollectionLevelsPrecomputed field. Is then the current version of the interface braking? I guess so, right?

comment:13 in reply to: ↑ 11 ; follow-up: Changed 6 years ago by stumpc5

Replying to mantepse:

I think it would be better to have the commits added name to json and pull it from there and added name_plural method in a separate ticket. Actually, I even have a bug to fix...

We can do that if you want...

comment:14 in reply to: ↑ 13 Changed 6 years ago by stumpc5

Replying to stumpc5:

Replying to mantepse:

I think it would be better to have the commits added name to json and pull it from there and added name_plural method in a separate ticket. Actually, I even have a bug to fix...

We can do that if you want...

This is now #19307.

comment:15 Changed 6 years ago by git

  • Commit changed from f544c28f4b0c0c4707c017f60d45a62b58501c76 to cde7b5bdce31462f4046451bbff30026ece0a920

Branch pushed to git repo; I updated commit sha1. New commits:

44da776cleaned the statistic name method
836f205improved statistic name, added collection name plural, some cosmetic changes as well
b95a704Merge branch 't/19307/improve_statistic_name_and_add_collection_name_plural_to_findstat_interface' into t/19296/generating_function_in_findstat_interface
cde7b5bremoved everything that is now in #19307

comment:16 in reply to: ↑ 12 Changed 6 years ago by mantepse

Replying to stumpc5:

@mantepse If I now push the new collections.json without the CollectionLevelsPrecomputed field

Yes, if you do this then there is a breakage, although trivial to fix. I'm afraid you can get rid of CollectionLevelsPrecomputed only after the trivial fix is merged, which will very likely take some time.

That's great that you thought of this!

comment:17 Changed 6 years ago by git

  • Commit changed from cde7b5bdce31462f4046451bbff30026ece0a920 to 8f093a0014bad991b957738f97207abdccd877de

Branch pushed to git repo; I updated commit sha1. New commits:

f79ed35improved the oeis search to now allow too short searches
8f093a0stupid mistake fix

comment:18 Changed 6 years ago by git

  • Commit changed from 8f093a0014bad991b957738f97207abdccd877de to ce7214ac8a76d5a2cd6339eff21a2b5be126d186

Branch pushed to git repo; I updated commit sha1. New commits:

ce7214aadded two lines that were missing from the diverging

comment:19 Changed 6 years ago by git

  • Commit changed from ce7214ac8a76d5a2cd6339eff21a2b5be126d186 to 8df3e6390fb74de9a8d300b40032a464cd139992

Branch pushed to git repo; I updated commit sha1. New commits:

356612cchanges search_oeis to oeis_search
8df3e63added doctests, ready for review

comment:20 Changed 6 years ago by stumpc5

  • Component changed from PLEASE CHANGE to packages: optional
  • Status changed from new to needs_review

comment:21 Changed 6 years ago by mantepse

  • Branch changed from u/stumpc5/generating_function_in_findstat_interface to u/mantepse/generating_function_in_findstat_interface

comment:22 Changed 6 years ago by mantepse

I fixed two trivial doctesting problems, and I have a few questions and suggestions:

1) the docstring of oeis_search

Returns the OEIS search for the generating function of ``self``.

could be improved: what's returned is actually the result of an OEIS search. So perhaps:

Search the OEIS for the generating function of the statistic.

2) also, there should be an OUTPUT section, perhaps

OUTPUT:

- a tuple of OEIS sequences, see :meth:`OEIS.find_by_description` for more information.

3) not sure, but I think the blank-line print are unconventional.

4) the EXAMPLES section does not display correctly in html.

comment:23 Changed 6 years ago by git

  • Commit changed from 8df3e6390fb74de9a8d300b40032a464cd139992 to 54051ea405a43a33531668ba002a38414c0098df

Branch pushed to git repo; I updated commit sha1. New commits:

54051eafix two docstring problems

comment:24 Changed 6 years ago by stumpc5

  • Branch changed from u/mantepse/generating_function_in_findstat_interface to u/stumpc5/generating_function_in_findstat_interface

comment:25 Changed 6 years ago by git

  • Commit changed from 54051ea405a43a33531668ba002a38414c0098df to 304ceca132e718ea355eb279966091f230b3532f

Branch pushed to git repo; I updated commit sha1. New commits:

304cecatook care of Martin's suggestions (all -- I didn't check the html output)

comment:26 Changed 6 years ago by git

  • Commit changed from 304ceca132e718ea355eb279966091f230b3532f to 8582fb12c8f3f7d0a03b794018745abc08167b99

Branch pushed to git repo; I updated commit sha1. New commits:

8582fb1forgot to resolve a merge conflict

comment:27 Changed 6 years ago by git

  • Commit changed from 8582fb12c8f3f7d0a03b794018745abc08167b99 to 5558be983f10469e5be31e45652364437a0edb03

Branch pushed to git repo; I updated commit sha1. New commits:

5558be9I still had some problems with the merging, hope that's fixed now

comment:28 Changed 6 years ago by stumpc5

  • Dependencies set to #19307

New commits:

5558be9I still had some problems with the merging, hope that's fixed now

comment:29 follow-up: Changed 6 years ago by mantepse

The html now looks almost fine, with the exception of the link to OEIS.find_by_description(), which does not work.

There is another, less minor thing: generating_functions currently only works for queries that yield statistics, i.e., of the form findstat(number):

sage: s=findstat("Permutations", number_of_alignments); s
0: (St000222: The number of alignments of a permutation, [], 200)
sage: s.generating_functions()
...
AttributeError: 'FindStatStatistic' object has no attribute '_generating_function'

sage: s[0][0].generating_functions()
{2: 2,
 3: 3*q + 3,
 4: 12*q^2 + 8*q + 4,
 5: 20*q^4 + 45*q^3 + 35*q^2 + 15*q + 5,
 6: 100*q^6 + 180*q^5 + 198*q^4 + 140*q^3 + 72*q^2 + 24*q + 6}

While this is as expected, I think it shouldn't be that way. In particular, if possible it would be nice to be able to use oeis_search even for new statistics.

Of course, this requires that the generating function is computed by the sage interface, and not retrieved from findstat.org.

I'll try to propose a patch shortly.

comment:30 Changed 6 years ago by stumpc5

  • Status changed from needs_review to needs_work

The generating function as list does not work for negative statistic values.

comment:31 in reply to: ↑ 29 Changed 6 years ago by stumpc5

Replying to mantepse:

The html now looks almost fine, with the exception of the link to OEIS.find_by_description(), which does not work.

would you mind fixing this? I have built the documentation at the moment for a different ticket, so that would take me forever.

There is another, less minor thing: generating_functions currently only works for queries that yield statistics, i.e., of the form findstat(number):

Good catch! Looking forward to see your solution!

See also the other issue I just posted... I will provide a fix for that!

comment:32 Changed 6 years ago by stumpc5

Here is another question concerning negative statistic values: so far, I had considered the generating function 2q + q^2 as the sequence 0,2,1 where the initial 0 corresponds to the constant term 0. But if I consider negative statistic values, I might end up with the generating function q^{-1} + 2q, which I have to move to the sequence 1,0,2.

So, should I always shift the sequence of a generating function such that it starts with the first nonzero term?

[I find this counter intuitive for nonnegative statistics, but it is unavoidable for partially negative statistics.]

Finally, it seems special wired if my generating functions are

f_1(q) = q^{-1} + q
f_2(q) = 2q + q^2

and then shift the first, but not the second, to finally get 1,0,1,0,2,1.

comment:33 Changed 6 years ago by mantepse

I don't think I understand. Where do you convert sequences to generating functions (or the other way round)?

comment:34 Changed 6 years ago by stumpc5

The generating function is well-defined for any integer valued statistic (though it may be a Laurent polynomial). If I want to search the OEIS for the generating function of a statistic, I have to turn the generating function into a sequence, or indeed a linearized table T(n,k) where T(n,k) is the coefficient of q^k in the generating function of the n-th piece of the statistic (say the permutations of n or the Dyck paths of semilength n,...).

But since k \geq 0, this does not work for Laurent polynomials.

comment:35 Changed 6 years ago by mantepse

OK, so that's really a question on how to fine-tune oeis searches. Don't you separate the individual generating functions with blanks? If so, I would start with the first non-zero coefficient.

comment:36 Changed 6 years ago by stumpc5

OK, so that's really a question on how to fine-tune oeis searches. Don't you separate the individual generating functions with blanks? If so, I would start with the first non-zero coefficient.

Indeed I agree, and that's what I do already. But I would also like to use this functionality to provide a linear sequence and submit it to the OEIS. And for that I do not separate the individual generating functions.

comment:37 Changed 6 years ago by stumpc5

But since this is not really an issue for the Sage interface to FindStat, I should maybe not consider it here...

comment:38 follow-up: Changed 6 years ago by mantepse

I discovered an interesting user-interface problem: when doing a search for distribution FindStat? does not expect that the objects within one block of values are all on the same level. Eg, the following is a valid findstat query:

[1,2]
[2,1]
[1,2,3]
[1,3,2]
[2,1,3]
[2,3,1]
[3,1,2]
[3,2,1]
====> 1,1,2,2,2,2,2,2

But of course, in this case there is no way to compute the generating function...

comment:39 Changed 6 years ago by mantepse

  • Branch changed from u/stumpc5/generating_function_in_findstat_interface to u/mantepse/generating_function_in_findstat_interface

comment:40 in reply to: ↑ 38 Changed 6 years ago by stumpc5

  • Commit changed from 5558be983f10469e5be31e45652364437a0edb03 to 2d85a2ec9a47908f6f707c714281f57595e093ff

Replying to mantepse:

I discovered an interesting user-interface problem: when doing a search for distribution FindStat? does not expect that the objects within one block of values are all on the same level.

What is the problem with that? I implemented this as a feature!


New commits:

54051eafix two docstring problems
1652db2Merge branch 'u/stumpc5/generating_function_in_findstat_interface' of git://trac.sagemath.org/sage into u/mantepse/generating_function_in_findstat_interface
2d85a2edraft of generating function from data

comment:41 Changed 6 years ago by stumpc5

Are you planning to remove all dependencies on CollectionLevelsPrecomputed in this ticket? I thought to have that in an indiviual ticket, but it's also fine this way...

comment:42 Changed 6 years ago by git

  • Commit changed from 2d85a2ec9a47908f6f707c714281f57595e093ff to 167d39b44d87fc41d901dcfbe97a6cacf9f5522d

Branch pushed to git repo; I updated commit sha1. New commits:

167d39bintegrate new generating functions code

comment:43 Changed 6 years ago by mantepse

Could you please have a look? The current solution uses the http://FindStat.org delivered data for findstat(number), and computes the gf's otherwise.

comment:44 Changed 6 years ago by stumpc5

  • Branch changed from u/mantepse/generating_function_in_findstat_interface to u/stumpc5/generating_function_in_findstat_interface

comment:45 Changed 6 years ago by git

  • Commit changed from 167d39b44d87fc41d901dcfbe97a6cacf9f5522d to ccd121c9f00b669f4106358549dd00a0ceecc6dd

Branch pushed to git repo; I updated commit sha1. New commits:

ccd121cfixed a little bug in the generating function code

comment:46 Changed 6 years ago by git

  • Commit changed from ccd121c9f00b669f4106358549dd00a0ceecc6dd to 78e9a6087a56e8b1bfb808b49b9e0bdec4d5092e

Branch pushed to git repo; I updated commit sha1. New commits:

78e9a60shoot, typo

comment:47 Changed 6 years ago by stumpc5

sage: Pi = Permutations(3).list()*2
sage: st = findstat([Pi,[pi.major_index() for pi in Pi]])
hi
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
...
ValueError: FindStat delivers more statistic values than the level has elements. This should not happen.  Please send and email to the developers.

Hm, that's too bad, isn't it?

Last edited 6 years ago by stumpc5 (previous) (diff)

comment:48 Changed 6 years ago by stumpc5

Unsurprisingly, the same happens for

sage: st = findstat([(pi,pi.major_index()) for pi in Pi])

and even for

sage: st = findstat([(pi,pi.major_index()  ) for pi in Permutations(3)] +
                    [(pi,pi.major_index()+1) for pi in Permutations(3)]
                   )

The later is indeed not a statistic at all!

comment:49 Changed 6 years ago by mantepse

excellent, that's a bug in the preparation of _data

comment:50 Changed 6 years ago by git

  • Commit changed from 78e9a6087a56e8b1bfb808b49b9e0bdec4d5092e to 4adcd928d977788bd4a50ccbc648e0ae89c9b15e

Branch pushed to git repo; I updated commit sha1. New commits:

4adcd92fixed two bugs (one of mine and one of Martin) in the generating function code

comment:51 Changed 6 years ago by stumpc5

Why did you add if gfs[key] to the loop for key in sorted(gfs.keys())? That seems to produce wrong outputs...

comment:52 Changed 6 years ago by mantepse

Because I made compute_generating_functions use None if there are values for the level, but not enough. I guess that's stupid, so feel free to remove it.

comment:53 Changed 6 years ago by mantepse

  • Branch changed from u/stumpc5/generating_function_in_findstat_interface to u/mantepse/generating_function_in_findstat_interface

comment:54 Changed 6 years ago by stumpc5

  • Commit changed from 4adcd928d977788bd4a50ccbc648e0ae89c9b15e to 36d52b37ea2f71b6e3f186f83b7033fa57ba56ad

what is the point of having the lines

if len(set(elements)) != len(elements):
    raise ValueError("FindStat expects that every object appears only once!")

twice? Making the first if len(set(elements)) != len(elements) into an if len(set(all_elements)) != len(all_elements) should make the second obsolete. Or am I overseeing sth. or do you not like that?


New commits:

36d52b3check that user does not provide an object twice

comment:55 Changed 6 years ago by mantepse

Well, you can change it if you like. I didn't like to turn all_elements into a set on every iteration, that's all. So, currently in every iteration we check that the elements in one block are distinct, and then we check that there is no element appearing in two different blocks.

We could actually simply delete the first check, that's probably clearest.

comment:56 Changed 6 years ago by stumpc5

  • Branch changed from u/mantepse/generating_function_in_findstat_interface to u/stumpc5/generating_function_in_findstat_interface

comment:57 Changed 6 years ago by stumpc5

  • Branch changed from u/stumpc5/generating_function_in_findstat_interface to u/mantepse/generating_function_in_findstat_interface

We could actually simply delete the first check, that's probably clearest.

agreed and pushed.

comment:58 Changed 6 years ago by stumpc5

  • Branch changed from u/mantepse/generating_function_in_findstat_interface to u/stumpc5/generating_function_in_findstat_interface

comment:59 Changed 6 years ago by stumpc5

  • Commit changed from 36d52b37ea2f71b6e3f186f83b7033fa57ba56ad to 1798c37a7f0e962f5e72bb60c47722e8fbc249e4

I still don't know how to properly interact with the branches, but it seems that there is nothing better than constantly merging and changing the branch on trac...


New commits:

b23dac2modified the list generating function so that it can handle negative statistic values
347c026merged Martin any my changes
ccd121cfixed a little bug in the generating function code
78e9a60shoot, typo
4adcd92fixed two bugs (one of mine and one of Martin) in the generating function code
bf2c4b6Merge remote-tracking branch 'trac/u/mantepse/generating_function_in_findstat_interface' into t/19296/generating_function_in_findstat_interface
1798c37cleaned two lines of code as discussed

comment:60 Changed 6 years ago by mantepse

  • Branch changed from u/stumpc5/generating_function_in_findstat_interface to u/mantepse/generating_function_in_findstat_interface

comment:61 Changed 6 years ago by mantepse

  • Branch changed from u/mantepse/generating_function_in_findstat_interface to u/stumpc5/generating_function_in_findstat_interface

I have now removed name_plural and the change to _name, because they have a separate ticket and induced a doctest failure.

Do we need to open another ticket for the bugfix barking on duplicate objects? (I do not care) Otherwise, I think this is ready to go. probably we need to doctest compute_generating_functions to make the patchbot happy.

Last edited 6 years ago by mantepse (previous) (diff)

comment:62 Changed 6 years ago by mantepse

  • Branch changed from u/stumpc5/generating_function_in_findstat_interface to u/mantepse/generating_function_in_findstat_interface

comment:63 Changed 6 years ago by stumpc5

  • Commit changed from 1798c37a7f0e962f5e72bb60c47722e8fbc249e4 to 5b42946bef35abc01c15c56f09ae19ee2030686c

Okay, if two tickets A and B do two different things, but to develop what I want in B I need A. How am I supposed to be doing that without merging A into B and having A as a dependency in B?


New commits:

5b42946remove name_plural and change to name, because these belong to ticket #19307

comment:64 Changed 6 years ago by mantepse

  • Dependencies #19307 deleted

comment:65 Changed 6 years ago by git

  • Commit changed from 5b42946bef35abc01c15c56f09ae19ee2030686c to 1fe70f7eaa757c9c2cf1bdde415bce9e22d3fb15

Branch pushed to git repo; I updated commit sha1. New commits:

1fe70f7do not produce None in _compute_generating_functions, add doctest, fix hyperlinks

comment:66 Changed 6 years ago by mantepse

  • Reviewers set to Martin Rubey
  • Status changed from needs_work to positive_review

New commits:

1fe70f7do not produce None in _compute_generating_functions, add doctest, fix hyperlinks

comment:67 Changed 6 years ago by stumpc5

  • Status changed from positive_review to needs_info

Sorry to get back to this, but if len(set(all_elements)) != len(all_elements) assumes the elements to be hashable.

FindStat? internally this is totally fine, but that makes the following, very natural, findstat search fail:

sage: findstat( { g:g.wiener_index() for g in Graphs(4) if g.is_connected() } )
---------------------------------------------------------------------------
...
TypeError: This graph is mutable, and thus not hashable. Create an immutable copy by `g.copy(immutable=True)`

It will obviously take much longer to do the uniqueness test in a list, but I guess it's worth it. What do you think?

comment:68 Changed 6 years ago by mantepse

The line you sent won't even work if you correct the typo (Graphs -> graphs) and omit findstat.

But otherwise you are right. Possibly we should normalize the keys at the beginning, maybe even store the string representations.

comment:69 Changed 6 years ago by stumpc5

I don't know how I got that line wrong, it should read

findstat([ (g,g.wiener_index()) for g in graphs(4) if g.is_connected() ])

and now only the findstat hashing causes the problem (and it was working without this ticket).

Possibly we should normalize the keys at the beginning, maybe even store the string representations.

+1. Are you going to work on this and the addition of the set_first_terms, or should I do one/both of these?

comment:70 Changed 6 years ago by git

  • Commit changed from 1fe70f7eaa757c9c2cf1bdde415bce9e22d3fb15 to 4d6e9eb840d59a6b2b3b0446b1162d19eb6dc893

Branch pushed to git repo; I updated commit sha1. New commits:

4d6e9ebbugfix

comment:71 Changed 6 years ago by stumpc5

Wow, that's a whole lot of code for a bug I would have fixed by changing one line (but I am not saying my solution would have been the better)!

I wonder:

Derzeit werden alle normalisiert, auch wenn's 17 Millionen sind

  • Are really more normalized than used? A search should be restricted to FINDSTAT_MAX_??? and a new statistic to FINDSTAT_MAX_***. Do you now normalize also all the others?
  • is it really better (implementation-wise and time-wise) to store also all string representations than to check whether some list is duplicate-free? Or are you / do you plan using that string representation also somewhere else?

comment:72 Changed 6 years ago by mantepse

Wow, that's a whole lot of code for a bug I would have fixed by changing one line (but I am not saying my solution would have been the better)!

I'd be interested in seeing your solution.

I wonder:

Derzeit werden alle normalisiert, auch wenn's 17 Millionen sind

Are really more normalized than used? A search should be restricted to FINDSTAT_MAX_??? and a new statistic to FINDSTAT_MAX_***. Do you now normalize also all the others?

I'm afraid so. However, if the normalization really takes too long, I would move the code in _find_by_values that restricts the amount of data to FINDSTAT_MAX_??? to FindStat.__call__ and only normalize those data which are sent to FindStat?. If there is no problem, I certainly do not want to do this.

is it really better (implementation-wise and time-wise) to store also all string representations than to check whether some list is duplicate-free? Or are you / do you plan using that string representation also somewhere else?

How could I check without normalization that the list contains no duplicates?

comment:73 follow-up: Changed 6 years ago by stumpc5

Just a quick issue along the way:

sage: S = Permutations()
sage: a,b,c = S([1,2,3]),S([1,3,2]),S([2,1,3])
sage: findstat( [([a],[1]),([b],[1])] )
a new statistic on Cc0001: Permutations

Shouldn't that return an error as too little values are given?

comment:74 Changed 6 years ago by stumpc5

Okay, the following is really nice as you do it:

sage: g = Graph([[1,2],[2,3]])
sage: h = Graph([[1,3],[2,3]])
sage: i = Graph([[1,3],[2,3],[2,4]])
sage: findstat( [([g,h,i],[1,1,1])] )
---------------------------------------------------------------------------
...
ValueError: FindStat expects that every object occurs at most once.

There is no change to do that without your normalization procedure. I was suggesting only to replace the len(set(all_entries)) == len(all_entries) by a short method to check that all_entries is a duplicate-free list. That is a one-liner and does what we did before slower but for non-hashable objects. But it does also not achieve the normalization.

As long as the timings are okay (and I haven't seen any issues so far, but sending a zillion might be a problem) I am happy with your solution...

comment:75 in reply to: ↑ 73 Changed 6 years ago by mantepse

Replying to stumpc5:

Just a quick issue along the way:

sage: S = Permutations()
sage: a,b,c = S([1,2,3]),S([1,3,2]),S([2,1,3])
sage: findstat( [([a],[1]),([b],[1])] )
a new statistic on Cc0001: Permutations

Shouldn't that return an error as too little values are given?

Yes, and if you look closely you'll see already a doctest that doesn't know how to fail yet :-)

comment:76 Changed 6 years ago by stumpc5

Replying to stumpc5:

Shouldn't that return an error as too little values are given?

I just checked: FindStat internally, it is allowed to send less than 3 entries, but it does not capture anything since only hits with 3 or more coincidences are returned. I think that's okay to leave. Anyway, I will add a check that not too little information is sent...

Am I right that you are not touching the file currently?

comment:77 Changed 6 years ago by stumpc5

Before I do the change: is

# this might go wrong:
try:
    assert data != []
except:
    raise ValueError("after discarding elements not in the range, and keeping less than %s values, nothing remained to send to FindStat." %FINDSTAT_MAX_VALUES)

any better than

if data == []:
    raise ValueError("after discarding elements not in the range, and keeping less than %s values, nothing remained to send to FindStat." %FINDSTAT_MAX_VALUES)

I would otherwise do

if len(data) < FINDSTAT_MIN_VALUES:
    raise ValueError("After discarding elements not in the range, too little (=%s) values remained to send to FindStat." %FINDSTAT_MIN_VALUES)

comment:78 Changed 6 years ago by mantepse

ad 0: no, I'm not going to touch the file.

ad 1: no, I like the second version (yours) much better.

ad 2: OK, except that your test is not correct, you need the sum of the lengths of the lists in data.

comment:79 Changed 6 years ago by stumpc5

  • Branch changed from u/mantepse/generating_function_in_findstat_interface to u/stumpc5/generating_function_in_findstat_interface

comment:80 Changed 6 years ago by git

  • Commit changed from 4d6e9eb840d59a6b2b3b0446b1162d19eb6dc893 to c41472f6e47c8b215079da04dfe85b52f91c1222

Branch pushed to git repo; I updated commit sha1. New commits:

c41472fadded a min values counter doctest

comment:81 Changed 6 years ago by git

  • Commit changed from c41472f6e47c8b215079da04dfe85b52f91c1222 to edc3e89bacf23558c6f1e443439536618dff8518

Branch pushed to git repo; I updated commit sha1. New commits:

edc3e89fixed some doctests, all tests pass now

comment:82 Changed 6 years ago by stumpc5

Seems ready for me for now. I did a few tests on graphs and the standardization doesn't seem to influence things too much. Let's wait for a user case complaining about that timing...

comment:83 Changed 6 years ago by mantepse

I have a few more docfixes, can I merge and push?

comment:84 Changed 6 years ago by mantepse

  • Branch changed from u/stumpc5/generating_function_in_findstat_interface to u/mantepse/generating_function_in_findstat_interface

comment:85 Changed 6 years ago by mantepse

  • Commit changed from edc3e89bacf23558c6f1e443439536618dff8518 to 025e77400992c076fd2e978b9759ba2411cfc063
  • Status changed from needs_info to needs_review

New commits:

025e774a few fixes to error messages

comment:86 Changed 6 years ago by stumpc5

  • Branch changed from u/mantepse/generating_function_in_findstat_interface to u/stumpc5/generating_function_in_findstat_interface

comment:87 Changed 6 years ago by mantepse

  • Commit changed from 025e77400992c076fd2e978b9759ba2411cfc063 to 65fb91274c7cd520bf1fa352da6701528fd64414
  • Status changed from needs_review to positive_review

New commits:

edc3e89fixed some doctests, all tests pass now
65fb912trivial merge

comment:88 Changed 6 years ago by vbraun

  • Status changed from positive_review to needs_work

merge conflict, wait for the next beta

comment:89 Changed 6 years ago by mantepse

Am I supposed to see the merge conflict somewhere?

comment:90 Changed 6 years ago by stumpc5

It only touches findstat.py so I guess the conflict is with #19314. I will pull 6.9 in a bit and then see if I can find the conflict. Let me know if you have solved it before...

comment:91 Changed 6 years ago by mantepse

OK, I won't touch it. Thanks!

comment:92 Changed 6 years ago by git

  • Commit changed from 65fb91274c7cd520bf1fa352da6701528fd64414 to 6c7099a5e0a4c036fd5c4fa8060a70898ef0b45e

Branch pushed to git repo; I updated commit sha1. New commits:

6c7099aMerge branch 'develop' into t/19296/generating_function_in_findstat_interface

comment:93 Changed 6 years ago by stumpc5

I cannot see any merge conflicts either when merging 6.9 into this branch. I might have to do with all our take-stuff-out-of-ticket-and-move-it-into-another-ticket-and-get-it-back-here thing, but I wouldn't know where it goes wrong.

For the record: the only file changed from 6.9 when merging this ticket is src/sage/databases/findstat.py and the merge went through cleanly.

comment:94 Changed 6 years ago by git

  • Commit changed from 6c7099a5e0a4c036fd5c4fa8060a70898ef0b45e to 3bb57edb4bdf1cfe0951136781ab786185f0c20a

Branch pushed to git repo; I updated commit sha1. New commits:

3bb57edmerged 6.10.rc0 and fixed conflicts

comment:95 Changed 6 years ago by stumpc5

  • Status changed from needs_work to needs_review

comment:96 Changed 6 years ago by stumpc5

I merged the the newest develop and fixed the indeed occurring merge conflict. I usually do mistakes doing that (even though I try not to), so please have a brief look again before giving it a positive review.

comment:97 Changed 6 years ago by stumpc5

Martin, could you have a quick look here and give it a positive review?

comment:98 Changed 6 years ago by stumpc5

ping

comment:99 Changed 6 years ago by jdemeyer

  • Milestone changed from sage-6.9 to sage-6.10
  • Status changed from needs_review to needs_work

Don't use except:, catch specific exceptions or use except Exception: to catch all standard exceptions.

comment:100 Changed 6 years ago by mantepse

  • Branch changed from u/stumpc5/generating_function_in_findstat_interface to u/mantepse/generating_function_in_findstat_interface

comment:101 Changed 6 years ago by mantepse

  • Commit changed from 3bb57edb4bdf1cfe0951136781ab786185f0c20a to 72877fad8284b604978163ef10a54fa6421c17f1
  • Description modified (diff)
  • Status changed from needs_work to needs_review

comment:102 Changed 6 years ago by git

  • Commit changed from 72877fad8284b604978163ef10a54fa6421c17f1 to 1c83fe3962b1ef7dd698b7e1bab86561473d9d27

Branch pushed to git repo; I updated commit sha1. New commits:

1c83fe3tiny fixes

comment:103 Changed 6 years ago by mantepse

I'm giving the parts Christian wrote positive review.

comment:104 Changed 6 years ago by mantepse

  • Authors changed from Christian Stump to Christian Stump, Martin Rubey
  • Reviewers changed from Martin Rubey to Martin Rubey, Christian Stump

comment:105 Changed 6 years ago by stumpc5

  • Status changed from needs_review to positive_review

looks good to me, and the patchbot is happy...

comment:106 Changed 6 years ago by jdemeyer

  • Status changed from positive_review to needs_work

This is silly:

try:
    assert foo
except AssertionError:
    raise ValueError(...)

it would be much better to write this is

if not foo:
    raise ValueError(...)

but really, I think you also want to catch TypeError from the depth = int(depth) call.

You should also add doctests for these bad inputs.

comment:107 Changed 6 years ago by git

  • Commit changed from 1c83fe3962b1ef7dd698b7e1bab86561473d9d27 to fd221b7260bd2f9fdd5864619a8346daa71b3d59

Branch pushed to git repo; I updated commit sha1. New commits:

4620683Merge branch 'develop' of git://github.com/sagemath/sage into t/19296/generating_function_in_findstat_interface
fd221b7better argument checking due to jdemeyer

comment:108 Changed 6 years ago by mantepse

Do you like this version better? (I want to keep the argument checking of depth short and as one entity, that's why I did it this way.)

comment:109 Changed 6 years ago by mantepse

  • Status changed from needs_work to positive_review

comment:110 Changed 6 years ago by jdemeyer

Fine for me.

Please remember that you should never use except:, as it is usually the wrong thing to do (it also catches KeyboardInterrupt for example).

comment:111 Changed 6 years ago by vbraun

  • Status changed from positive_review to needs_work

Doctests must run without network access

sage -t --long src/sage/databases/findstat.py
**********************************************************************
File "src/sage/databases/findstat.py", line 486, in sage.databases.findstat.FindStat.__call__
Failed example:
    findstat("Permutations", 1)
Expected:
    Traceback (most recent call last):
    ...
    ValueError: The given arguments, Permutations and 1, cannot be used for a FindStat search.
Got:
    <BLANKLINE>
    Traceback (most recent call last):
      File "/mnt/highperf/buildbot/slave/sage_git/build/local/lib/python2.7/site-packages/sage/doctest/forker.py", line 496, in _run
        self.compile_and_execute(example, compiler, test.globs)
      File "/mnt/highperf/buildbot/slave/sage_git/build/local/lib/python2.7/site-packages/sage/doctest/forker.py", line 858, in compile_and_execute
        exec(compiled, globs)
      File "<doctest sage.databases.findstat.FindStat.__call__[3]>", line 1, in <module>
        findstat("Permutations", Integer(1))
      File "sage/misc/lazy_import.pyx", line 383, in sage.misc.lazy_import.LazyImport.__call__ (/mnt/highperf/buildbot/slave/sage_git/build/src/build/cythonized/sage/misc/lazy_import.c:3593)
        return self._get_object()(*args, **kwds)
      File "/mnt/highperf/buildbot/slave/sage_git/build/local/lib/python2.7/site-packages/sage/databases/findstat.py", line 613, in __call__
        (collection, to_str) = get_collection(None, query_1)
      File "/mnt/highperf/buildbot/slave/sage_git/build/local/lib/python2.7/site-packages/sage/databases/findstat.py", line 515, in get_collection
        collection = FindStatCollection(element)
      File "sage/misc/classcall_metaclass.pyx", line 326, in sage.misc.classcall_metaclass.ClasscallMetaclass.__call__ (/mnt/highperf/buildbot/slave/sage_git/build/src/build/cythonized/sage/misc/classcall_metaclass.c:1239)
        return cls.classcall(cls, *args, **kwds)
      File "/mnt/highperf/buildbot/slave/sage_git/build/local/lib/python2.7/site-packages/sage/databases/findstat.py", line 1861, in __classcall_private__
        return FindStatCollections()(entry)
      File "sage/misc/classcall_metaclass.pyx", line 326, in sage.misc.classcall_metaclass.ClasscallMetaclass.__call__ (/mnt/highperf/buildbot/slave/sage_git/build/src/build/cythonized/sage/misc/classcall_metaclass.c:1239)
        return cls.classcall(cls, *args, **kwds)
      File "sage/misc/cachefunc.pyx", line 1304, in sage.misc.cachefunc.WeakCachedFunction.__call__ (/mnt/highperf/buildbot/slave/sage_git/build/src/build/cythonized/sage/misc/cachefunc.c:8470)
        w = self.f(*args, **kwds)
      File "/mnt/highperf/buildbot/slave/sage_git/build/local/lib/python2.7/site-packages/sage/structure/unique_representation.py", line 1021, in __classcall__
        instance = typecall(cls, *args, **options)
      File "sage/misc/classcall_metaclass.pyx", line 493, in sage.misc.classcall_metaclass.typecall (/mnt/highperf/buildbot/slave/sage_git/build/src/build/cythonized/sage/misc/classcall_metaclass.c:1665)
        return (<PyTypeObject*>type).tp_call(cls, args, kwds)
      File "/mnt/highperf/buildbot/slave/sage_git/build/local/lib/python2.7/site-packages/sage/databases/findstat.py", line 2357, in __init__
        for j in json.load(urlopen(FINDSTAT_URL_DOWNLOADS_COLLECTIONS)):
      File "/mnt/highperf/buildbot/slave/sage_git/build/local/lib/python/urllib2.py", line 154, in urlopen
        return opener.open(url, data, timeout)
      File "/mnt/highperf/buildbot/slave/sage_git/build/local/lib/python/urllib2.py", line 431, in open
        response = self._open(req, data)
      File "/mnt/highperf/buildbot/slave/sage_git/build/local/lib/python/urllib2.py", line 449, in _open
        '_open', req)
      File "/mnt/highperf/buildbot/slave/sage_git/build/local/lib/python/urllib2.py", line 409, in _call_chain
        result = func(*args)
      File "/mnt/highperf/buildbot/slave/sage_git/build/local/lib/python/urllib2.py", line 1227, in http_open
        return self.do_open(httplib.HTTPConnection, req)
      File "/mnt/highperf/buildbot/slave/sage_git/build/local/lib/python/urllib2.py", line 1197, in do_open
        raise URLError(err)
    URLError: <urlopen error [Errno 110] Connection timed out>
**********************************************************************
1 item had failures:
   1 of   6 in sage.databases.findstat.FindStat.__call__
    [54 tests, 1 failure, 127.29 s]

comment:112 Changed 6 years ago by git

  • Commit changed from fd221b7260bd2f9fdd5864619a8346daa71b3d59 to 00dc9e56087a8e79f72d3848422878cef667264c

Branch pushed to git repo; I updated commit sha1. New commits:

04162f7Merge branch 'develop' of git://github.com/sagemath/sage into t/19296/generating_function_in_findstat_interface
00dc9e5Merge branch 'u/mantepse/generating_function_in_findstat_interface' of git://trac.sagemath.org/sage into t/19296/generating_function_in_findstat_interface

comment:113 Changed 6 years ago by git

  • Commit changed from 00dc9e56087a8e79f72d3848422878cef667264c to 15ec3561001bbb9f6372e4d245959f6a2adb470a

Branch pushed to git repo; I updated commit sha1. New commits:

15ec356fix test thanks to vbraun

comment:114 Changed 6 years ago by mantepse

  • Status changed from needs_work to positive_review

Sorry about that, fixed.

comment:115 Changed 6 years ago by vbraun

  • Branch changed from u/mantepse/generating_function_in_findstat_interface to 15ec3561001bbb9f6372e4d245959f6a2adb470a
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.