Opened 6 months ago

Closed 3 months ago

#26026 closed defect (fixed)

Updates to Huffman codings

Reported by: embray Owned by:
Priority: major Milestone: sage-8.5
Component: coding theory Keywords:
Cc: tscrim Merged in:
Authors: Erik Bray Reviewers: Frédéric Chapoton, Travis Scrimshaw
Report Upstream: N/A Work issues:
Branch: e913f5d (Commits) Commit: e913f5df63139d7e1000adf3dcf7aa51deddeb1f
Dependencies: Stopgaps:

Description

The main goal of these changes were to make this class work on Python 3, which previously it didn't due to comparisons failing while manipulating the heap queue data structures when building the tree. The other problem was that the use of dict iterators meant that the codings weren't quite predictable; sometimes the codes for two characters could be swapped, leading to different test results (granted, not in a way that affects the length of the encodings).

This reimplements building the tree in a way that's deterministic and works the same on Python 2 and 3, and fixes a few other minor issues I noticed.

Change History (13)

comment:1 Changed 6 months ago by git

  • Commit set to d41dbe25f3480c7853556c012bde80c15741334a

Branch pushed to git repo; I updated commit sha1. New commits:

9135406improve determinism in Huffman codings
9e3f7bcdifferent algorithm without using heap queues
7c2de4fremove stray line in test
d41dbe2explicitly raise an error when trying to make a Huffman code for fewer than 2 symbols (for which the code in this class breaks)

comment:2 Changed 6 months ago by embray

  • Status changed from new to needs_review

comment:3 Changed 6 months ago by gh-bryangingechen

The ticket #18315 also makes extensive changes to the Huffman coding algorithm.

comment:4 follow-up: Changed 6 months ago by embray

Oops, I didn't see that one; it is quite old. Are you trying to revive it?

comment:5 Changed 6 months ago by embray

Looking at the implementation in #18315, it would suffer from the same problems this is trying to fix, so I think this ticket is still relevant, albeit obviously not directly compatible with #18315.

I haven't looked closely at the other coding algorithms added in that ticket but they might have some of the same kinds of problems too. I have a few other nitpicks with that ticket but it looks good on the whole.

comment:6 in reply to: ↑ 4 Changed 6 months ago by gh-bryangingechen

Replying to embray:

Oops, I didn't see that one; it is quite old. Are you trying to revive it?

I haven't looked at it in any real detail yet, and I'm also not knowledgeable on the subject. It did seem worthy of a little more attention though.

comment:7 Changed 4 months ago by embray

  • Milestone changed from sage-8.4 to sage-8.5

comment:8 Changed 3 months ago by chapoton

  • Status changed from needs_review to needs_work

one trivial failing doctest

comment:9 Changed 3 months ago by chapoton

  • Branch changed from u/embray/python3/sage-coding-source_coding-huffman to public/ticket/26026
  • Commit changed from d41dbe25f3480c7853556c012bde80c15741334a to e913f5df63139d7e1000adf3dcf7aa51deddeb1f
  • Status changed from needs_work to needs_review

I fixed the doctest


New commits:

1c94d3cMerge branch 'u/embray/python3/sage-coding-source_coding-huffman' in 8.5.b2
e913f5dtrac 26026 fix trivial failing doctest

comment:10 Changed 3 months ago by chapoton

  • Cc tscrim added

green bot, please review

comment:11 Changed 3 months ago by chapoton

please review (this is about a file with many failing doctest in python 3)

comment:12 Changed 3 months ago by tscrim

  • Reviewers set to Frédéric Chapoton, Travis Scrimshaw
  • Status changed from needs_review to positive_review

LGTM.

comment:13 Changed 3 months ago by vbraun

  • Branch changed from public/ticket/26026 to e913f5df63139d7e1000adf3dcf7aa51deddeb1f
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.