Opened 8 years ago

Closed 8 years ago

#12371 closed defect (fixed)

The graph_decompositions/ code seems to have bounds issues

Reported by: Snark Owned by: jason, ncohen, rlm
Priority: major Milestone: sage-5.0
Component: graph theory Keywords:
Cc: Merged in: sage-5.0.beta11
Authors: Julien Puydt Reviewers: Nathann Cohen, Jeroen Demeyer
Report Upstream: N/A Work issues:
Branch: Commit:
Dependencies: Stopgaps:

Description (last modified by ncohen)

That makes the code non-portable ; the patch I provide adds either signed or unsigned in each place, which makes the ARM platform pass the tests happily.

Notice that it does fix the current problem but there are still two problems with the code:

  1. in vertex_separation.pyx, we still have "minimums[i] = n" which is an unsigned char getting an unsigned int ;
  2. in vertex_separation.pyx, we still have "tmp_count = <unsigned char> popcount(i)", where popcount returns an "int".

APPLY:

Attachments (5)

graph_decompositions_char.patch (2.7 KB) - added by Snark 8 years ago.
Patch to fix the unsigned/signed char issue
trac_12371.patch (3.3 KB) - added by ncohen 8 years ago.
graph_decompositions_typings.patch (7.6 KB) - added by Snark 8 years ago.
Patch to change the types in the graph_decompositions/ code
trac_popcount.patch (3.5 KB) - added by ncohen 8 years ago.
test_popcount.patch (968 bytes) - added by ncohen 8 years ago.

Download all attachments as: .zip

Change History (42)

Changed 8 years ago by Snark

Patch to fix the unsigned/signed char issue

comment:1 Changed 8 years ago by Snark

  • Status changed from new to needs_review

Now that I think of it, this "int" is the size of the graph, and assumed to be small, so perhaps that should be an "unsigned char" too?

comment:2 Changed 8 years ago by ncohen

Helloooo !!!

Actually, what Volker said on [1] makes sense. What about using u_int8_t and int8_t as data types ? It would also be visually shorter and make the code much more readable O_o

[1] https://groups.google.com/d/msg/sage-devel/2HwMYI1CY98/QNWOwanWp4kJ

comment:3 Changed 8 years ago by Snark

  1. I think the standard names for the types are uint8_t and int8_t, and are found in stdint.h.
  2. The purpose of that ticket is the char issue, so I would rather see another ticket on changing the implementation to use (u)int8_t.

comment:4 Changed 8 years ago by ncohen

Let's no create ten patches to address the same changes. I will write the patch in a couple hours if it is ok with you.

Nathann

comment:5 Changed 8 years ago by Snark

It's ok for me ; if you have something to test, I'll give it a try on my ARM box too.

comment:6 Changed 8 years ago by ncohen

Hellooooooo !!

Here is a patch that does the same, but with the correct types :-)

Nathann

Changed 8 years ago by ncohen

comment:7 Changed 8 years ago by Snark

  • Description modified (diff)
  • Status changed from needs_review to needs_work
  • Summary changed from The graph_decompositions/ code uses "char" without the unsigned/signed keyword to The graph_decompositions/ code seems to have bounds issues

Eh, this patch is basically mine with s/unsigned char/uint8_t/g and s/signed char/int8_t/g, so it's no surprise it passes the tests.

But it also means it has all the problems of the original code, plus the ones of my patch.

When I read the patched code, I see the following (potential or real) problems :

  1. there are loops on "int" counters from 0 to a cardinal (they should be the same type as the cardinal);
  2. the implementation of pop_count uses magical values which seem to assume a definite sizeof(int);
  3. there are "int" which are in fact bit arrays ; it should probably be "unsigned int" ;
  4. there are arrays which are indexed by above, so I'm not sure there is no risk there (though there is a test to bail out in case of problems, 8*sizeof(int)).

If I remember well, the current doctests just use graphs of cardinal up to 8, so we can only detect signed/unsigned issues, but not other ones.

I'll try to work on a bigger patch ; I'm updating the summary of that ticket to reflect that much more ambitious scope.

Changed 8 years ago by Snark

Patch to change the types in the graph_decompositions/ code

comment:8 Changed 8 years ago by ncohen

Don't you think it is going a bit too far ? You are refusing to compare unsigned char with integers, for some reason bit arrays "should be" unsigned int and not "int"...

And even if the ansers of some methods do not require more than 8 bits that is no reason why we should force the user to deal with int8_t instead of regular integers...

I agree that the argument of popcount should be redefined to int32_t though.

Nathann

comment:9 follow-up: Changed 8 years ago by Snark

Well, you seemed eager to use "correct types", and signed/unsigned char didn't seem good enough, so I changed the rest of the code in that direction :-)

And for bit arrays, I didn't write they "should be" but "should probably be", which is even weaker: I'm not convinced.

Whatever gets applied, as long as it works, is ok for me.

(PS: you didn't comment on point 2, which might be an issue if sizeof(int) isn't what you expect)

comment:10 in reply to: ↑ 9 Changed 8 years ago by ncohen

Well, you seemed eager to use "correct types", and signed/unsigned char didn't seem good enough, so I changed the rest of the code in that direction :-)

Well, the thing with "char" is that it has a "meaning", it represents characters, and that's why there is a difference on your architecture. What I mean by using char is actually "a 8-bits integer", which can be unsigned at times. But what is the point of saving 3 bytes on an integer counter ?

Whatever gets applied, as long as it works, is ok for me.

+1

(PS: you didn't comment on point 2, which might be an issue if sizeof(int) isn't what you expect)

Well, I said it would make sense to move its argument to int32_t. Isn't that what you meant ?

Nathann

comment:11 Changed 8 years ago by Snark

Well, if int32_t is used for popcount, we'd better be sure "int" doesn't have more bits than that, because that's what is used elsewhere, and especially the whole code is guarded against 8*sizeof(int), not againt 32.

comment:12 Changed 8 years ago by ncohen

Hmmmm... You are right ! But then it makes much more sense to adapt popcount than to adapt all code to popcount.

I'll give this a look, it is also used in the bitset code.

Nathann

comment:13 Changed 8 years ago by ncohen

Ahahah. Turns out the stupid way is also the best. It looks like there is no preprocessor way to check the integer size, but as we compile with O3 gcc does the job anyway and removes the condition itself :-)

And I finally did not touch the bitset code, as they deal with long type.

Nathann

comment:14 follow-up: Changed 8 years ago by Snark

Nathann, what is the status of this ticket?

comment:15 in reply to: ↑ 14 Changed 8 years ago by ncohen

Hellooooo !!

Nathann, what is the status of this ticket?

Well.... You sent a patch while setting the ticket to "needs_work", so you tell me ^^;

I tried to answer the points you made in your 4 items list. I think (a very personal advice) that points 1, 3 and 4 have more to do with theology than with programming (there is nothing that makes me worry in any of them, except that any patch fixing them would made the way harder to read) and the remark you made in 2 is right for me, and can be fixed with the trick I gave two comments above. That is : it can be fixed the stupid way with a if(sizeof(int) == 4), at no cost in efficiency as this kind of things is solver at compilation time) :-)

Nathann

comment:16 follow-up: Changed 8 years ago by Snark

Both your patches look good, and I would gladly see them go in :-)

comment:17 in reply to: ↑ 16 Changed 8 years ago by ncohen

  • Description modified (diff)
  • Status changed from needs_work to needs_review

Both your patches look good, and I would gladly see them go in :-)

Oh. Then if you think those ones should go, then all that is left to do is to review them :-)

And the "is solver" in my previous message was a "is solved", naturally :-p

Nathann

comment:18 Changed 8 years ago by davidloeffler

Apply trac_12371.patch, trac_popcount.patch

(for the patchbot, which is trying to apply all four patches at once)

comment:19 Changed 8 years ago by ncohen

Ping ? This ticket still needs a review :-)

Nathann

comment:20 follow-up: Changed 8 years ago by Snark

  • Status changed from needs_review to positive_review

Well, I said 'yes' four weeks ago, so I guess I should put "positive review".

I thought telling "ok" to the code maintainer was good enough...

comment:21 in reply to: ↑ 20 Changed 8 years ago by ncohen

Well, I said 'yes' four weeks ago, so I guess I should put "positive review".

I thought telling "ok" to the code maintainer was good enough...

oops, sorry :-)

and thank you for the review !

Nathann

comment:22 Changed 8 years ago by jdemeyer

  • Status changed from positive_review to needs_info

1) This isn't a proper commit message for the reviewer patch:

[mq]: ffff

2) Could you please explain the logic behind the reviewer patch? It seems to me like you're potentially changing the behaviour, and it's not clear to me why. The ticket description isn't clear on this either.

comment:23 follow-up: Changed 8 years ago by ncohen

  • Status changed from needs_info to needs_review

Hellooo Jeroen !!

Sorry about that patch, the commit message is now updated !

I do not get your second question, though.... The popcount function as it is implemented is made for 32 bits integers, and I thought it was better to write a popcount function for "int" than to write a popcount function for int32_fast_t, so here it is. This function just tests the size of an int before returning its result, which has 0 cost as these things are simplified by the compiler anyway. When do you think the behaviour may change ?

Nathann

comment:24 Changed 8 years ago by ncohen

  • Status changed from needs_review to positive_review

(so that Jeroen sees it)

comment:25 in reply to: ↑ 23 ; follow-up: Changed 8 years ago by jdemeyer

Replying to ncohen:

The popcount function as it is implemented is made for 32 bits integers

That's not clear from looking at the code.

and I thought it was better to write a popcount function for "int" than to write a popcount function for int32_fast_t, so here it is. This function just tests the size of an int before returning its result.

That's my question: why do the test? Why not just do

cdef inline int popcount(int i): 
    return __builtin_popcount(i) 

Also: there are portability issues: __builtin_popcount() is a GCC extension, certainly not supported by all compilers.

Changed 8 years ago by ncohen

comment:26 in reply to: ↑ 25 ; follow-up: Changed 8 years ago by ncohen

Helloooo Jeroen !!!

That's not clear from looking at the code.

It is clearer now. The function is renamed to popcount32 and I added a short docstring.

That's my question: why do the test? Why not just do

cdef inline int popcount(int i): 
    return __builtin_popcount(i) 

Oh. Because this code is much faster.

Also: there are portability issues: __builtin_popcount() is a GCC extension, certainly not supported by all compilers.

Arggggg ! O_O;;;

Well... I uploaded a new patch but I would like to have your advice on that. I added a new function whose purpose is only to test popcount32, but the test is *LONG* (around 5 minutes) so I flagged it with "not tested". I thought it was too long even for a "long" flag. It tries "all 32bits integers" and computes popcount in two different ways, to check for correction. Considering how this 3-lines function is written I do not expect it to cause problem on a different architecture (as long as we are dealing with >=32bits integers... There is no 16bits left, right ? O_o;;;).

Nathann

comment:27 Changed 8 years ago by davidloeffler

  • Status changed from positive_review to needs_info

comment:28 Changed 8 years ago by ncohen

(the problem with needs_info is that Jeroen does not see it anymore ;)

comment:29 in reply to: ↑ 26 ; follow-up: Changed 8 years ago by jdemeyer

Replying to ncohen:

Oh. Because this code is much faster.

Really? I highly doubt that you can write something which is faster than __builtin_popcount().

comment:30 in reply to: ↑ 29 Changed 8 years ago by ncohen

Really? I highly doubt that you can write something which is faster than __builtin_popcount().

Well, it is not like I have an explanation for that, it is too low-level for me. If I were to try to find an answer I would say that the function implemented is inline while builtin_popcount is necessarily a function to be called.

Of course, nothing beats using builtin_popcount with a popcount instruction enabled on the processor, as it is directly translated to a function call.

Ok, now I made these tests a long time ago and it is actually very easy to check. I will post that in a second.

Nathann

comment:31 Changed 8 years ago by ncohen

Yooooooo !!!

Well, I just did the tests, and you can find attached the patch I used to do that. I actually first did the tests by replacing "return popcount32" in the code by "return \_\_builtin_popcount", and there was no difference that I could see in the runtimes of path_decomposition. I was about to say "ok you are right, there is no difference after all", but short of my natural dislike of acknowledging my own mistakes, I was really convinced that there was a real diference between the two.

And it turns out that if I did not see the difference anymore, it is because of some change I made late in the path_decomposition algorithm (a previous version was *MUCH* slower than the current one), and this popcount is not the critical function anymore. The "stupid test" (try many integers, compute what's left afterwards) is clear enough though :

sage: from sage.graphs.graph_decompositions.vertex_separation import test_handmade, test_builtin
sage: time test_builtin()                                                                       
-805306368
Time: CPU 5.94 s, Wall: 5.95 s
sage: time test_handmade()                                                                      
-805306368
Time: CPU 1.85 s, Wall: 1.85 s

Of course, these tests are immediately destroyed if you are lucky enough to compile the code on a machine with the popcnt instruction enabled... But so far I found no way to do that in a way that would not make the code crash on a machine that does *not* have it enabled.

Nathann

Changed 8 years ago by ncohen

comment:32 Changed 8 years ago by jdemeyer

  • Authors set to Julien Puydt
  • Reviewers set to Nathann Cohen, Jeroen Demeyer

comment:33 follow-up: Changed 8 years ago by Snark

I wrote some initial patches, but after that, Nathann is the one who did the authoring, so I'm not sure the changes of comment:32 are correct.

comment:34 in reply to: ↑ 33 Changed 8 years ago by jdemeyer

Replying to Snark:

I wrote some initial patches, but after that, Nathann is the one who did the authoring, so I'm not sure the changes of comment:32 are correct.

Then change it!

comment:35 Changed 8 years ago by jdemeyer

  • Status changed from needs_info to needs_review

Positive review since the patch works.

comment:36 Changed 8 years ago by jdemeyer

  • Status changed from needs_review to positive_review

comment:37 Changed 8 years ago by jdemeyer

  • Merged in set to sage-5.0.beta11
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.