Opened 5 years ago
Last modified 5 years ago
#19337 needs_work enhancement
Improve asteroidal triples code
Reported by: | dcoudert | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | sage-6.9 |
Component: | graph theory | Keywords: | |
Cc: | ncohen | Merged in: | |
Authors: | David Coudert | Reviewers: | |
Report Upstream: | N/A | Work issues: | |
Branch: | u/dcoudert/asteroid (Commits) | Commit: | e32015d13649c7df56362fc9fd2b57937f465705 |
Dependencies: | #19334 | Stopgaps: |
Description (last modified by )
Fix some types problems raised in #19334 for asteroidal_triples.pyx
.
In particular, we get rid of uint32_t
whenever possible.
Change History (27)
comment:1 Changed 5 years ago by
- Branch set to u/dcoudert/asteroid
comment:2 Changed 5 years ago by
- Commit set to aa2f9b32409dcad26fdcf75f69cd686b7952f513
comment:3 Changed 5 years ago by
- Cc ncohen added
- Description modified (diff)
- Status changed from new to needs_review
Using int should be enough for the size of graphs we can process.
comment:4 Changed 5 years ago by
- Dependencies #19334 deleted
comment:5 follow-up: ↓ 6 Changed 5 years ago by
Err... int are usually larget than 32bits ?...
comment:6 in reply to: ↑ 5 ; follow-up: ↓ 7 Changed 5 years ago by
Replying to ncohen:
Err... int are usually larget than 32bits ?...
In fact, no. On all systems I know, int
s are exactly 32 bits.
comment:7 in reply to: ↑ 6 ; follow-up: ↓ 9 Changed 5 years ago by
In fact, no. On all systems I know,
int
s are exactly 32 bits.
Funny. I had somewhere in my head that int and long were the same size.
Anyway, why should we downgrade those uint32 to int exactly? O_o
Nathann
comment:8 Changed 5 years ago by
It seems from the comments on the other ticket that you would only need to replace "-1" with "-1L"?..
comment:9 in reply to: ↑ 7 Changed 5 years ago by
Replying to ncohen:
In fact, no. On all systems I know,
int
s are exactly 32 bits.Funny. I had somewhere in my head that int and long were the same size.
on 32-bit systems that's true :-)
Note that the C standard doesn't say much about this. On Windows for example, long
is always 32-bit (even on 64-bit systems). There might be systems where int
is 64-bit.
comment:10 Changed 5 years ago by
- Dependencies set to #19334
comment:11 follow-up: ↓ 14 Changed 5 years ago by
The only guarantee by the C spec is that int
is signed and at least 16 bit.
I'd recommend the C99 int_fast32_t
type which will be guaranteed to be at least 32 bit and the fastest on the target system. IMHO you should never use the old int
/ long
except to match legacy libraries, its bad style and very hard to write 100% standards-compliant code.
comment:12 Changed 5 years ago by
- Commit changed from aa2f9b32409dcad26fdcf75f69cd686b7952f513 to e32015d13649c7df56362fc9fd2b57937f465705
comment:13 Changed 5 years ago by
So let's try with int_fast32_t
if it is better.
comment:14 in reply to: ↑ 11 ; follow-ups: ↓ 15 ↓ 20 Changed 5 years ago by
Replying to vbraun:
I'd recommend the C99
int_fast32_t
type
-1.
Why should this graph code use a type of at least 32 bits? Where does the arbitrary number 32 come from? Why not int_fast16_t
or int_fast64_t
then?
If you don't want int
, I'd go with long
then which is also guaranteed to be at least 32 bits and which matches the actual return type of bitset_first_in_complement()
.
you should never use the old
int
/long
except to match legacy libraries, its bad style and very hard to write 100% standards-compliant code.
Really, why that? I would say that you should always use types like long
unless you need a specific bit-length.
comment:15 in reply to: ↑ 14 Changed 5 years ago by
Why should this graph code use a type of at least 32 bits? Where does the arbitrary number 32 come from? Why not
int_fast16_t
orint_fast64_t
then?
I did not check this specific code but uint32_t probably comes from the data structure "static_sparse_graph", which uses them.
Then, as David said, this algorithm does not need uint32_t -- it would not run. The 31 bits of int32_t would do, and 16 bits would probably be sufficient.
Really, why that? I would say that you should always use types like
long
unless you need a specific bit-length.
Here we want to specify the bit length, because the smallest the type is the better it will be cached. We will read memory very very often in that code.
Nathann
comment:16 follow-up: ↓ 18 Changed 5 years ago by
About the "fastest on the target system": that very much depends on the application. On 64-bit systems, it could easily be a 64-bit type. If you're making arrays of those (it seems that you do), you can end up slower than int32_t
because of the larger memory accesses.
PS: I realize that we're bikeshedding here. I'm not really against int_fast32_t
, I just wouldn't recommend it in this case. I would advice: if you're certain(!) that 32 bits is enough, use int32_t
. Otherwise, use long
.
comment:17 Changed 5 years ago by
- Status changed from needs_review to needs_work
comment:18 in reply to: ↑ 16 Changed 5 years ago by
About the "fastest on the target system": that very much depends on the application. On 64-bit systems, it could easily be a 64-bit type. If you're making arrays of those (it seems that you do), you can end up slower than
int32_t
because of the larger memory accesses.
Yes yes, I agree.
PS: I realize that we're bikeshedding here. I'm not really against
int_fast32_t
, I just wouldn't recommend it in this case. I would advice: if you're certain(!) that 32 bits is enough, useint32_t
. Otherwise, uselong
.
Sooo... Would you say that there is actually *anything* that needs be changed in this file? We use fixed-size types, they are unsigned because that's what the data structure expects, and the change in Cython has been fixed by Volker with a cast.
Nathann
comment:19 Changed 5 years ago by
Hello,
the algorithm has time complexity O(n^3)
and space complexity O(n^2)
. So we will certainly not use it on graphs with 100,000 nodes (I can't on my laptop).
I was using uint32_t
since static_sparse_graph
uses it, but I agree that long
is more appropriate with bitset_first_in_complement()
. So I can use long
for all indexes, or a more appropriate type if any.
For the arrays, all values stored in that array are in range 1..n
. So either I use a 16 bits type and add a test at the beginning of the method in case n
needs more than that, or I use any type with at least 32 bits. Well, uint32_t
is good for that.
Let me know.
David.
comment:20 in reply to: ↑ 14 ; follow-up: ↓ 21 Changed 5 years ago by
If you need only 16 bits then there is of course int_fast16_t
. Its definition is analogous to plain int
except that the latter has no promise of being a particularly fast choice. I don't care about the actual bitwidth, it should just be reflected (and easily readable) in the code.
Replying to jdemeyer:
If you don't want
int
, I'd go withlong
then which is also guaranteed to be at least 32 bits
But not guaranteed to be particularly fast.
and which matches the actual return type of
bitset_first_in_complement()
.
I.e. the "legacy libraries" case.
you should never use the old
int
/long
except to match legacy libraries, its bad style and very hard to write 100% standards-compliant code.Really, why that? I would say that you should always use types like
long
unless you need a specific bit-length.
But by using long
you are already asking for a particular bit width, except that it does not document the intent (how many Sage contributors know the specs for long
?). Usually, you see long
because
- the author mistakenly believes that it is the longest int type
- the author mistakenly believes that it can hold a pointer
- the author mistakenly believes that it is at least 64-bit
- the author mistakenly believes that it is 64-bit on a 64-bit machine
- cargo cult, i.e., to match legacy code
With the C99 data types you just need to answer
- what is the minimum bit-width required
- do I need to optimize for space or speed
and the resulting program inherently documents it. Its so much better, a total slam-dunk.
comment:21 in reply to: ↑ 20 ; follow-ups: ↓ 22 ↓ 23 Changed 5 years ago by
Replying to vbraun:
Usually, you see
long
because
- the author mistakenly believes that it is the longest int type
- the author mistakenly believes that it can hold a pointer
- the author mistakenly believes that it is at least 64-bit
- the author mistakenly believes that it is 64-bit on a 64-bit machine
- cargo cult, i.e., to match legacy code
I can play that game too:
Usually, you see int_fastN_t
because
- the author mistakenly believes that his code will never need more than N bits (think "640K should be enough for everybody")
- the author mistakenly believes that
int_fastN_t
will yield the fastest possible code (not true if the code is memory-bound)
what is the minimum bit-width required
I don't think that one should always answer this question. There are valid use cases for types which have different sizes on 32-bit and 64-bit systems.
comment:22 in reply to: ↑ 21 Changed 5 years ago by
I can play that game too:
Go play it somewhere else. This thing is a trac ticket. Review it or exchange private emails.
Nathann
comment:23 in reply to: ↑ 21 ; follow-up: ↓ 24 Changed 5 years ago by
Replying to jdemeyer:
what is the minimum bit-width required
I don't think that one should always answer this question. There are valid use cases for types which have different sizes on 32-bit and 64-bit systems.
I agree that there are valid use cases, but
- int/long are also terrible for that; They encode mostly encode historical accidents, e.g. long is different on 64-bit linux vs 64-bit windows on the same hardware
- in Mathematics, correctness and reproducibility (including on different OS/hardware) are much more important than elsewhere. If you can't trust that the result is correct then neither speed nor memory usage matters.
comment:24 in reply to: ↑ 23 Changed 5 years ago by
- in Mathematics, correctness and reproducibility (including on different OS/hardware) are much more important than elsewhere. If you can't trust that the result is correct then neither speed nor memory usage matters.
This, right after you decided to not include a stopgap in a code that returns wrong answers? That made me laugh.
comment:25 Changed 5 years ago by
Today, with sage 6.9.rc3 and without this patch, I have a segfault!
With this patch it's working.
I don't know which is the best type to use and above discussion confuses me.
What I do know is that this patch solves a segfault and so is needed.
Best, David.
comment:26 Changed 5 years ago by
Do you know if the segfault is deterministic? Does it happen consistently or may it be related to something else? I do not see how changing the type may fix anything that is not already fixed by #19334.
Nathann
Branch pushed to git repo; I updated commit sha1. New commits:
trac #19337: better comments