Opened 5 years ago

Closed 5 years ago

#16005 closed enhancement (fixed)

Waste of time in iterator_edges 2

Reported by: ncohen Owned by:
Priority: major Milestone: sage-6.3
Component: graph theory Keywords:
Cc: azi, mmezzarobba Merged in:
Authors: Nathann Cohen Reviewers: Vincent Delecroix
Report Upstream: N/A Work issues:
Branch: 567600b (Commits) Commit: 567600b943c81837a5bc525136b55c6f7878327b
Dependencies: #15978 Stopgaps:

Description

As reported by Marc in #15978, iterator_edges wastes a lot of time. Computationally, returning all edges of a graph currently works this way:

for each edge E of the graph: # Okay, linear time
    <Something equivalent to checking that E is an edge> # Totally useless
    return E

This can be solved by creating a function that is somehow between

  • Returning all neighbors of a vertex (which is not sufficient to find the edges incident to this vertex)
  • Work directly with the data structure, i.e. a binary tree

This function is named out_neighbors_BTNode_unsafe and does what some part of the code already did, before throwing the information away. It builds the list of BTNodes describing the [neighbors of/edges incident to] a vertex, from which all that we need can be deduced.

As a result, the useless computations reported above can be removed.

Nathann

Change History (15)

comment:1 Changed 5 years ago by ncohen

  • Branch set to u/ncohen/16005
  • Component changed from PLEASE CHANGE to graph theory
  • Status changed from new to needs_review

Getting better ...

sage: %time _=g.to_dictionary()     
CPU times: user 1.07 s, sys: 0 ns, total: 1.07 s
Wall time: 1.08 s
sage: %time _=g.edges()        
CPU times: user 2.94 s, sys: 4 ms, total: 2.95 s
Wall time: 2.96 s
sage: %time _=g.edges(labels=False)
CPU times: user 2.7 s, sys: 0 ns, total: 2.7 s
Wall time: 2.73 s

Nathann

comment:2 Changed 5 years ago by git

  • Commit set to 56ec117271ec89184091d9bda49b48e0c72d65ab

Branch pushed to git repo; I updated commit sha1. New commits:

768a63etrac #15978: Waste of time in g.edges()
56ec117trac 16005: Waste of time in iterator_edges 2

comment:3 Changed 5 years ago by vdelecroix

  • Status changed from needs_review to needs_work

on 6.2.beta8:

CONFLICT (content): Merge conflict in src/sage/graphs/base/sparse_graph.pyx

comment:4 Changed 5 years ago by git

  • Commit changed from 56ec117271ec89184091d9bda49b48e0c72d65ab to 567600b943c81837a5bc525136b55c6f7878327b

Branch pushed to git repo; I updated commit sha1. New commits:

567600btrac #16005: merge into 6.2.beta8

comment:5 Changed 5 years ago by ncohen

  • Status changed from needs_work to needs_review

Done !

Nathann

comment:6 Changed 5 years ago by vdelecroix

  • Reviewers set to Vincent Delecroix

Hi Nathann,

Some old timings

sage: g=graphs.RandomGNP(1500,.4)
sage: timeit('for e in g.edge_iterator(): pass')
5 loops, best of 3: 684 ms per loop

Some new timings

sage: g=graphs.RandomGNP(1500,.4)
sage: timeit('for e in g.edge_iterator(): pass')
5 loops, best of 3: 317 ms per loop

This is definitely cool!

Question for further optimization: why in the SparseGraphBackend the methods iterator_edges, iterator_in_edges and iterator_out_edges do not use the out_neighbors_unsafe/in_neighbors_unsafe of SparseGraph with an array int * neighbors allocated once for all to the maximum of the degrees? That would avoid as many "malloc"+"list creation" as there are number of vertices in the call.

comment:7 follow-up: Changed 5 years ago by ncohen

No reason that I can see. It will be a bit faster, it will make the code sligthly harder to read.. Comparing with an iterator could be nice, too. Do you feel like giving it a try ?

Nathann

comment:8 in reply to: ↑ 7 Changed 5 years ago by vdelecroix

Replying to ncohen:

No reason that I can see. It will be a bit faster, it will make the code sligthly harder to read.. Comparing with an iterator could be nice, too. Do you feel like giving it a try ?

Yeeeeeeeeeees. Let me try.

comment:9 Changed 5 years ago by vdelecroix

  • Branch changed from u/ncohen/16005 to public/16005
  • Commit changed from 567600b943c81837a5bc525136b55c6f7878327b to 98286f45fc077f9778708ca3f064a2f5a13853ad

Hi Nathann,

I update the branch to sage.6.2.rc0 and implement what I said. There is still one sage_malloc that can be avoided in out_neighbors_BTNode_unsafe (it is documented in a TODO bloc). The timings are a bit better with this new implementation. Please have a look.

Vincent


New commits:

9623fd7Merge branch 'u/ncohen/16005' of trac.sagemath.org:sage into 16005-waste_of_times_graph2
98286f4optimization of edge_iteration in SparseGraphBackend + documentation

comment:10 follow-up: Changed 5 years ago by ncohen

Do you have a problem with getting this patch merged and dealing with your optimisations in a different ticket ? You wrote a lot of code which will take time to review, and there will probably be some design choice to discuss ?...

Nathann

comment:11 in reply to: ↑ 10 ; follow-up: Changed 5 years ago by vdelecroix

  • Status changed from needs_review to positive_review

Replying to ncohen:

Do you have a problem with getting this patch merged and dealing with your optimisations in a different ticket ? You wrote a lot of code which will take time to review, and there will probably be some design choice to discuss ?...

My commit is mainly documentation, and the code is copy/paste of the very same 10 lines... so I do not agree with "You wrote a lot of code". Did you read it or ran

if [ $(git log -p -1|grep ^[\+-]|wc -l) -ge 300 ];
   then echo "This f******g patch is toooooooo long";
fi

But for sure I agree that there are some design discussions to have. The follow up is in #16220.

More seriously, there are many malformations in the documentation (especially the INPUT/OUTPUT blocks). They concern cdef functions so as soon as only the reference manual is concerned everything is fine.

Good enough for positive review.

comment:12 Changed 5 years ago by vdelecroix

  • Branch changed from public/16005 to u/ncohen/16005
  • Commit changed from 98286f45fc077f9778708ca3f064a2f5a13853ad to 567600b943c81837a5bc525136b55c6f7878327b

comment:13 in reply to: ↑ 11 Changed 5 years ago by ncohen

My commit is mainly documentation, and the code is copy/paste of the very same 10 lines... so I do not agree with "You wrote a lot of code". Did you read it or ran

Well, by just reading it I did not get immediately that it was only copy/paste. And it is code that should be looked at carefully. Don't worry, in the Graph realm it is not because something is settled "in a different ticket" that it is buried forever.

I just remembered today that the conversation died on https://groups.google.com/forum/#!topic/sage-devel/q5uy_lI11jg, and this is the kind of stuff that needs attention. Burying these things is criminal.

But for sure I agree that there are some design discussions to have. The follow up is in #16220.

Yepyep. I'll go there next, though I believe I still have some answers to give you on other tickets.

More seriously, there are many malformations in the documentation (especially the INPUT/OUTPUT blocks). They concern cdef functions so as soon as only the reference manual is concerned everything is fine.

Well, it's old code I guess. We'll fix that on the way.

Good enough for positive review.

Thanks ! If you are looking for things to review, #12867 is getting old too :-P

Nathann

comment:14 Changed 5 years ago by vbraun_spam

  • Milestone changed from sage-6.2 to sage-6.3

comment:15 Changed 5 years ago by vbraun

  • Branch changed from u/ncohen/16005 to 567600b943c81837a5bc525136b55c6f7878327b
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.