#24852 closed defect (fixed)

Fix bytes/str in sage.data_structures.bitset

Reported by: jdemeyer Owned by: embray
Priority: major Milestone: sage-8.4
Component: python3 Keywords:
Cc: embray Merged in:
Authors: Erik Bray Reviewers: Frédéric Chapoton
Report Upstream: N/A Work issues:
Branch: b6c068a (Commits) Commit: b6c068a087581db0cecf2956b3041156edc7ac36
Dependencies: #26397 Stopgaps:

Description


Change History (26)

comment:1 Changed 22 months ago by jdemeyer

  • Branch set to u/jdemeyer/fix_bytes_str_in_sage_data_structures_bitset

comment:2 Changed 22 months ago by jdemeyer

  • Commit set to 59ad1fb956faade593e7dfe30b080ddae9456fe0

It looks like bitset_from_str is meant to be a low-level C function. Wouldn't it be better to fix calls to bitset_from_str instead?


New commits:

59ad1fbpy3: Fix bytes/str in sage.data_structures.bitset

comment:3 Changed 22 months ago by jdemeyer

  • Status changed from new to needs_review

comment:4 Changed 22 months ago by jdemeyer

  • Status changed from needs_review to needs_work

comment:5 Changed 22 months ago by embray

Yes, I see your point. Let me see how many of those there actually are...

comment:6 Changed 22 months ago by embray

No, it's only used in bitset.pyx, and the way I use str_to_bytes here makes sense.

comment:7 follow-up: Changed 22 months ago by embray

There are one or two cases in bitset.pyx where it's explicitly being used on bytes (e.g. in __setstate__, and other cases where it's being used on str (in the 2/3-compatibility sense) where using str_to_bytes in bitset_from_str makes sense.

I think what might make sense, for symmetry's sake, might be to rename the original bitset_from_str to bitset_from_char (as in char *) and then make bitset_from_str a wrapper around that which uses str_to_bytes.

comment:8 Changed 22 months ago by embray

On second thought, __setstate__ is fine since __getstate__ also returns str. Though I don't know why it would dump/load as strings when it could just use binary.

comment:9 Changed 19 months ago by chapoton

  • Milestone changed from sage-8.2 to sage-8.3

comment:10 Changed 19 months ago by embray

  • Owner changed from (none) to embray

comment:11 Changed 18 months ago by chapoton

what is the status of this one ?

comment:12 Changed 18 months ago by jdemeyer

I still think that it needs to be fixed.

comment:13 in reply to: ↑ 7 ; follow-up: Changed 18 months ago by jdemeyer

Replying to embray:

rename the original bitset_from_str to bitset_from_char (as in char *)

Bikeshed: I find it a bit strange to use char to refer to char *. Better would be cstr or charptr or chararray or whatever.

comment:14 in reply to: ↑ 13 Changed 18 months ago by embray

Replying to jdemeyer:

Replying to embray:

rename the original bitset_from_str to bitset_from_char (as in char *)

Bikeshed: I find it a bit strange to use char to refer to char *. Better would be cstr or charptr or chararray or whatever.

I have the same uneasiness about it but couldn't think of a better name at the time. I think for now I'll keep just "char" for consistency's sake, unless you also want to rename char_to_str.

I still believe just "char" is clear enough in context, though I do like "cstr".

comment:15 Changed 16 months ago by vdelecroix

  • Milestone changed from sage-8.3 to sage-8.4

update milestone 8.3 -> 8.4

comment:16 Changed 15 months ago by chapoton

What's the status here ? can we move on ?

comment:17 Changed 15 months ago by embray

I think this is still worth fixing:

I think what might make sense, for symmetry's sake, might be to rename the original bitset_from_str to bitset_from_char (as in char *) and then make bitset_from_str a wrapper around that which uses str_to_bytes.

There's also the matter of Jeroen's bikeshed about the name, but I think I addressed that above.

comment:18 Changed 14 months ago by chapoton

many doctests failures with python2... essentially totally broken

comment:19 Changed 14 months ago by embray

Okay. Would you like to fix it, taking into account my last comment? I can still do it too, it just hasn't been high priority.

comment:20 Changed 14 months ago by chapoton

I have no idea what to do here.. So I am not able to do it myself, sorry.

comment:21 Changed 14 months ago by embray

Ok, but I think you underestimate yourself. It's mainly just a simple str/bytes conversion issue.

The existing function maed bitset_from_str originally took a char * as its argument, which won't make sense for Python 3 str, so I changed it to accept a generic object instead and use str_to_bytes on it.

Jeroen's suggestion was that we keep the original implementation of bitset_from_str, but rename it to something like bitset_from_char, keeping the original char * interface, then add a new bitset_from_str which on Python 2 can easily just be an easily for bitset_from_str, and on Python 3 would call str_to_bytes and pass that to bitset_from_char. Just for example.

comment:22 Changed 14 months ago by embray

Well, I've got 30 minutes to kill. So I'll take a look at it now.

comment:23 Changed 14 months ago by embray

  • Branch changed from u/jdemeyer/fix_bytes_str_in_sage_data_structures_bitset to u/embray/python3/ticket-24852
  • Commit changed from 59ad1fb956faade593e7dfe30b080ddae9456fe0 to b6c068a087581db0cecf2956b3041156edc7ac36
  • Status changed from needs_work to needs_review

Reworked this a bit more. In addition to the original changes from this branch, I also updated __getstate__ and __setstate__ to work on bytes instead of str, and I renamed the original bitset_string function to bitset_bytes, but also added a new bitset_string that returns a str on Python 3.

I think everything should work on Python 2 as well. I tested all the modules that use this class directly, at least.


New commits:

b6c068apy3: Fix bytes/str in sage.data_structures.bitset

comment:24 Changed 14 months ago by chapoton

  • Reviewers set to Frédéric Chapoton
  • Status changed from needs_review to positive_review

ok, then. Thanks a lot.

comment:25 Changed 14 months ago by chapoton

  • Dependencies set to #26397

comment:26 Changed 14 months ago by vbraun

  • Branch changed from u/embray/python3/ticket-24852 to b6c068a087581db0cecf2956b3041156edc7ac36
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.