Opened 6 years ago

Closed 6 years ago

#18239 closed defect (fixed)

Constructing Cayley graphs is slow

Reported by: azi Owned by:
Priority: major Milestone: sage-6.7
Component: group theory Keywords:
Cc: ncohen, vdelecroix Merged in:
Authors: Vincent Delecroix Reviewers: Nathann Cohen
Report Upstream: N/A Work issues:
Branch: d0ec65b (Commits) Commit: d0ec65bb221ff76dd0c63b606659ccfd3e4015e6
Dependencies: Stopgaps:

Description (last modified by vdelecroix)

Computing the Cayley graph of a group and generator set is extremely slow in Sage. The only reasonable solution was to use GRAPE directly.

As an example, let p and q be two permutations

p=((2,9),(7,28),(10,32),(11,33),(12,34),(13,47),(14,36),(18,58),(22,66),(26,73),(30,77),(35,45),(37,81),(38,82),(39,98),(40,84),(41,85),(42,86),(43,106),(44,88),(46,89),(48,112),(49,91),(52,111),(56,125),(60,129),(64,136),(68,140),(71,144),(75,148),(79,152),(83,96),(87,104),(90,109),(92,156),(93,157),(94,177),(95,159),(97,160),(99,182),(100,162),(101,163),(102,187),(103,165),(105,166),(107,193),(108,168),(110,169),(113,199),(114,171),(117,192),(120,198),(123,211),(127,215),(131,219),(134,223),(138,227),(142,231),(146,235),(150,239),(154,243),(158,175),(161,180),(164,185),(167,190),(170,196),(172,246),(173,266),(174,248),(176,249),(178,271),(179,251),(181,252),(183,276),(184,254),(186,255),(188,282),(189,257),(191,258),(194,288),(195,260),(197,261),(200,294),(201,263),(203,281),(206,287),(209,293),(213,305),(217,309),(221,313),(225,316),(229,320),(233,324),(237,327),(241,331),(245,334),(247,264),(250,269),(253,274),(256,279),(259,285),(262,291),(265,335),(267,354),(268,337),(270,338),(272,359),(273,340),(275,341),(277,364),(278,343),(280,344),(283,369),(284,346),(286,347),(289,375),(290,349),(292,350),(295,379),(297,368),(300,374),(303,378),(307,388),(311,392),(315,395),(318,396),(322,400),(326,403),(329,404),(333,407),(336,352),(339,357),(342,362),(345,366),(348,372),(351,377),(353,408),(355,423),(356,410),(358,411),(360,428),(361,413),(363,414),(365,431),(367,416),(370,435),(371,418),(373,419),(376,439),(381,434),(384,438),(387,440),(390,446),(394,449),(398,450),(402,453),(406,454),(409,421),(412,426),(415,430),(417,432),(420,437),(422,455),(424,464),(425,457),(427,458),(429,467),(433,460),(436,470),(442,469),(445,471),(448,474),(452,475),(456,462),(459,466),(461,468),(463,476),(465,479),(473,480),(477,478))
q=((1,107,306,184,333,101,19,272,328,201,56,265,31,289,447,10,75,280,132,429,70,14,217,422,155,43,212,100,241,433,3,178,236,114,394,172,8,194,389,278,26,186,61,360,405,2,127,353,80,376,122,40,150,367,222,94,145,49,311,463),(4,167,380,254,402,163,53,339,397,263,117,335,69,348,472,32,138,344,210,459,133,36,300,455,234,87,296,162,322,460,15,250,317,171,445,246,23,259,441,343,64,255,121,412,451,9,206,408,143,420,202,84,229,416,304,158,224,91,384,476),(5,188,130,273,406,12,57,355,153,290,123,97,76,370,314,37,146,110,218,465,6,44,307,275,242,102,59,179,329,292,16,267,78,195,448,38,27,283,220,361,71,46,128,424,244,11,213,181,151,436,17,95,237,197,312,173,29,108,390,363),(7,105,214,277,332,41,60,270,238,295,55,174,79,286,391,39,74,189,221,427,24,48,216,356,245,42,124,183,240,371,18,176,147,200,393,92,30,191,308,365,25,103,131,358,330,13,126,268,154,373,54,99,149,284,315,93,72,113,310,425),(20,256,208,340,452,34,118,409,232,349,203,160,139,417,386,81,225,169,301,477,21,88,381,341,323,164,119,251,398,350,50,336,141,260,473,82,65,345,302,413,134,89,207,456,325,33,297,252,230,461,51,159,318,261,385,247,67,168,442,414),(22,166,298,342,401,85,120,338,319,351,116,248,142,347,443,83,137,257,303,458,62,90,299,410,326,86,204,253,321,418,52,249,226,262,444,156,68,258,382,415,63,165,209,411,399,35,205,337,233,419,115,161,228,346,387,157,135,170,383,457),(28,190,388,362,407,185,129,357,404,377,125,352,152,372,474,96,148,366,313,466,144,109,309,462,334,104,305,274,331,468,58,269,327,291,449,264,77,285,446,430,73,279,219,426,454,45,215,421,243,437,211,180,239,432,395,175,235,196,392,478),(47,287,423,324,439,281,182,320,435,440,177,316,199,438,479,66,193,434,364,453,187,198,359,450,379,192,354,231,375,480,98,227,369,378,467,223,112,374,464,403,106,368,276,400,470,111,271,396,294,471,266,140,288,469,431,136,282,293,428,475),)
A = PermutationGroup([p,q])
p,q = A(p),A(q)

Then, the following code

G = A.cayley_graph(generators=[p, q, p.inverse()], simple=True)

takes 26 seconds to finish and it returns a DiGraph that still to be converted to a Graph.

With the branch applied

sage: %time G = A.cayley_graph(generators=[p, q, p.inverse()], simple=True)
CPU times: user 192 ms, sys: 4 ms, total: 196 ms
Wall time: 193 ms

Crazy speed up: x130 :-)

A previous workaround was to do

gout = gap('UndirectedEdges(CayleyGraph((Group(' + str(x) + ',' + str(g)+ ' )), [' + str(g) + ',' + str(x) + ',' + str(g.inverse()) + ']));') 
ast.literal_eval(str(gout))

which takes 0.4s. Note that the last example assumed ast is imported and also that

gap.LoadPackage('"grape"')

was loaded.

Change History (39)

comment:1 Changed 6 years ago by azi

  • Description modified (diff)

comment:2 Changed 6 years ago by ncohen

Do you know what the bottleneck is?

comment:3 follow-up: Changed 6 years ago by azi

As far as I've profiled it seems that most of the time is spent multiplying PermutationGroup? elements.

Also I am not sure but perhaps GRAPE is perhaps implemented in C and is already more efficient for this reason alone.

comment:4 in reply to: ↑ 3 Changed 6 years ago by ncohen

As far as I've profiled it seems that most of the time is spent multiplying PermutationGroup? elements.

Also I am not sure but perhaps GRAPE is perhaps implemented in C and is already more efficient for this reason alone.

Well, being a C code may explain why it could be faster but now why it would be SO much faster.

It turns out [1] that there are several implementations of this cayley_graph function. There is one in sage.categories.semigroups and another in groups/group.pyx. Do you know which one is run by your code ?

It would help if you could share this group A.

Nathann

[1] See? I don't scream anymore. Of course I want to punch a lot of people, but I say "It turns out"

comment:5 Changed 6 years ago by azi

Oh, I forgot to write that - the group A is the one generated by x and g.

A = PermutationGroup([x,g])

As far as I've researched into this, the sage.categories.semigroups cayley_graph function is called.

comment:6 Changed 6 years ago by ncohen

Okay. The solution is the same, again and again. Some people should get a license to write code.

comment:7 Changed 6 years ago by azi

Solution?

comment:8 Changed 6 years ago by ncohen

2sc

comment:9 Changed 6 years ago by ncohen

Well actually it's not the cayley_graph function. It is list(A).

comment:10 Changed 6 years ago by vdelecroix

  • Cc vdelecroix added

just cc'ing me

comment:11 Changed 6 years ago by ncohen

  • Description modified (diff)

comment:12 Changed 6 years ago by vdelecroix

Indeed! Modifying the __iter__ function in PermutationGroup_generic I got

sage: timeit("for x in A: pass")
5 loops, best of 3: 116 ms per loop

Instead of

sage: timeit("for x in A: pass")
... still waiting

Do you know why the list method in the same class is cached?

Vincent

comment:13 Changed 6 years ago by vdelecroix

  • Authors set to Vincent Delecroix
  • Branch set to public/18239
  • Commit set to 7dbf867236b6c9c88acd25009fdf8453571b9d75
  • Status changed from new to needs_review

New commits:

7dbf867Trac 18239: fix PermutationGroup_generic.__iter__

comment:14 Changed 6 years ago by vdelecroix

  • Description modified (diff)

In the description ticket I assumed that x meant p and g meant q. If I did wrong, please modify.

comment:15 Changed 6 years ago by ncohen

O_O

Okay.. Well, it works :-P

comment:16 Changed 6 years ago by vdelecroix

Perhaps we should add a long time to the doctest? It takes 2sec for me. I just want to check that it is reasonably doable in the new version (it was more than 2 min in the previous state).

comment:17 Changed 6 years ago by git

  • Commit changed from 7dbf867236b6c9c88acd25009fdf8453571b9d75 to 1e75b5bd931cf025ef7fa64617021a02474508c8

Branch pushed to git repo; I updated commit sha1. New commits:

1e75b5bTrac 18239: fix doctests

comment:18 Changed 6 years ago by git

  • Commit changed from 1e75b5bd931cf025ef7fa64617021a02474508c8 to 86f696f0dd36cc683ce5a67c3986f55c846592e1

Branch pushed to git repo; I updated commit sha1. New commits:

86f696ftrac #18239: Small additional speedup

comment:19 Changed 6 years ago by ncohen

I've got a good news and a bad news. Good news: we can shave 200ms off again.

Bad news: we don't have two but three implementations of cayley_graph. One of them in old.py, which is used by default by PermutationGroup >_<

Nathann

Last edited 6 years ago by ncohen (previous) (diff)

comment:20 follow-up: Changed 6 years ago by azi

Is there a reason why not wipe out all three and add single function under graphs?

comment:21 Changed 6 years ago by ncohen

Oh yeah.. And another bad news:

FiniteGroup.cayley_graph() and categories.semigroups.cayley_graph do not take the same input >_<

Nathann

comment:22 in reply to: ↑ 20 Changed 6 years ago by ncohen

Is there a reason why not wipe out all three and add single function under graphs?

A reason to not wipe out two among the three: None.

A reason to have a single function in the graph folder: I'd say none. Because Nicolas decided that whatever applied to all groups should be in categories/.

Nathann

comment:23 Changed 6 years ago by ncohen

Okay... So what would you think of removing the two cayley_graph functions from group.pyx and old.pyx ?

Nathann

comment:24 Changed 6 years ago by git

  • Commit changed from 86f696f0dd36cc683ce5a67c3986f55c846592e1 to d6494fd229e40573b2a86ba64c8cde241ab795b3

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

d6494fdtrac #18239: Small simplification and removal of unused cayley_graph functions

comment:25 Changed 6 years ago by git

  • Commit changed from d6494fd229e40573b2a86ba64c8cde241ab795b3 to 07e79d5ac60305075ac196f144252d32f0962078

Branch pushed to git repo; I updated commit sha1. New commits:

ccd450fTrac 18239: hashing is important!
07e79d5Trac 18239: fix doctests

comment:26 Changed 6 years ago by vdelecroix

  • Description modified (diff)

Got a x5 speedup ;-PPP

comment:27 Changed 6 years ago by git

  • Commit changed from 07e79d5ac60305075ac196f144252d32f0962078 to 69aa0939c24ad4f0a2c02b2a5f296aa04f3a498e

Branch pushed to git repo; I updated commit sha1. New commits:

69aa093Trac 18239: typo

comment:28 follow-up: Changed 6 years ago by ncohen

What are those constants again?.... It's like you are obfuscating the code.

comment:29 in reply to: ↑ 28 Changed 6 years ago by vdelecroix

Replying to ncohen:

What are those constants again?.... It's like you are obfuscating the code.

These are the constants from the hasing of tuples in Python...

comment:30 Changed 6 years ago by ncohen

Could you say so in the code?

comment:31 Changed 6 years ago by git

  • Commit changed from 69aa0939c24ad4f0a2c02b2a5f296aa04f3a498e to 763ca62c664f5f6c35d821f2fc908c479feafcdc

Branch pushed to git repo; I updated commit sha1. New commits:

763ca62Trac 18239: some precision about __hash__

comment:32 Changed 6 years ago by vdelecroix

  • Description modified (diff)

comment:33 follow-up: Changed 6 years ago by ncohen

This is your code:

mult += <long> (i + self.n)

This is from Python's latest release

mult += (long)(82520L + len + len);

comment:34 in reply to: ↑ 33 ; follow-up: Changed 6 years ago by vdelecroix

Replying to ncohen:

This is your code:

mult += <long> (i + self.n)

This is from Python's latest release

mult += (long)(82520L + len + len);

Right, this is not exactly the same as the Python implementation. Do you want me that it coincides with tuples as it was the case before? Note that in Python implementation len goes from n-1 to 0. Would it be better with some salt like

mult += <long>(82520L + i + self.n)

comment:35 in reply to: ↑ 34 Changed 6 years ago by ncohen

Right, this is not exactly the same as the Python implementation. Do you want me that it coincides with tuples as it was the case before? Note that in Python implementation len goes from n-1 to 0. Would it be better with some salt like

mult += <long>(82520L + i + self.n)

Honestly I do not care at all. I see two ways out:

1) You want it to coincide with the Python implementation, and so let's do it right

2) You don't need it to coincide, and then I don't see why it has to be this complicated.

Nathann

comment:36 Changed 6 years ago by git

  • Commit changed from 763ca62c664f5f6c35d821f2fc908c479feafcdc to 762b3e02dbffa614c937965978be65e0374c58c2

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

ab75258Trac 18239: faster and simpler __hash__
762b3e0Trac 18239: fix doctests

comment:37 Changed 6 years ago by git

  • Commit changed from 762b3e02dbffa614c937965978be65e0374c58c2 to d0ec65bb221ff76dd0c63b606659ccfd3e4015e6

Branch pushed to git repo; I updated commit sha1. New commits:

d0ec65bTrac 18239: last doctest failure

comment:38 Changed 6 years ago by ncohen

  • Reviewers set to Nathann Cohen
  • Status changed from needs_review to positive_review

comment:39 Changed 6 years ago by vbraun

  • Branch changed from public/18239 to d0ec65bb221ff76dd0c63b606659ccfd3e4015e6
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.